# World Bank API

The wbgapi package is used to get data from the World Bank API.

## Data Types

First, the databases (sources) available through the API are listed.

In [1]:
import pandas as pd
import wbgapi as wb

# lists the available databases (sources)
wb.source.info()

id,name,code,concepts,lastupdated
1,Doing Business,DBS,3,2021-08-18
2,World Development Indicators,WDI,3,2024-03-28
3,Worldwide Governance Indicators,WGI,3,2023-09-29
5,Subnational Malnutrition Database,SNM,3,2016-03-21
6,International Debt Statistics,IDS,4,2024-02-29
11,Africa Development Indicators,ADI,3,2013-02-22
12,Education Statistics,EDS,3,2023-10-12
13,Enterprise Surveys,ESY,3,2022-03-25
14,Gender Statistics,GDS,3,2024-04-15
15,Global Economic Monitor,GEM,3,2024-01-17


## Choose 'Doing Business'

The first database, 'Doing Business' (id 1), is selected. The statistics (series) it contains are then listed.

In [2]:
# sets the database to 'Doing Business' (id 1)
wb.db = 1

# lists the 'Doing Business' statistics (series)
wb.series.info()

id,value
ENF.CONT.COEN.ATDR,Enforcing contracts: Alternative dispute resolution (0-3) (DB16-20 methodology)
ENF.CONT.COEN.ATFE.PR,Enforcing contracts: Attorney fees (% of claim)
ENF.CONT.COEN.COST.ZS,Enforcing contracts: Cost (% of claim)
ENF.CONT.COEN.COST.ZS.DFRN,Enforcing contracts: Cost (% of claim) - Score
ENF.CONT.COEN.CSMG,Enforcing contracts: Case management (0-6) (DB16-20 methodology)
ENF.CONT.COEN.CTAU,Enforcing contracts: Court automation (0-4) (DB17-20 methodology)
ENF.CONT.COEN.CTFE.PR,Enforcing contracts: Court fees (% of claim)
ENF.CONT.COEN.CTSP.DB1719,Enforcing contracts: Court structure and proceedings (0-5) (DB17-20 methodology)
ENF.CONT.COEN.DB0415.DFRN,Enforcing contracts (DB04-15 methodology) - Score
ENF.CONT.COEN.DB1719.DFRN,Enforcing contracts (DB17-20 methodology) - Score


## Select statistics

Three statistics are requested for the Latin America and Caribbean region, keeping the 5 most recent values of each: 'Enforcing contracts: Cost (% of claim)', 'Starting a business: Minimum capital (% of income per capita)' and 'Registering property: Cost (% of property value)'.

In [3]:
# returns the requested data as a dataframe
data = wb.data.DataFrame(['ENF.CONT.COEN.COST.ZS', 'IC.REG.MIN.CAP', 'IC.REG.PRRT.COST.PRT.VAL'], # statistic codes
                         wb.region.members('LAC'), # Latin America and Caribbean region 
                         mrv=5, # 5 most recent values
                         labels=True) # includes country and statistic labels

data

economy,series,Country,Series,YR2015,YR2016,YR2017,YR2018,YR2019
NIC,ENF.CONT.COEN.COST.ZS,Nicaragua,Enforcing contracts: Cost (% of claim),26.8,26.8,26.8,26.8,26.8
GTM,ENF.CONT.COEN.COST.ZS,Guatemala,Enforcing contracts: Cost (% of claim),26.5,26.5,26.5,26.5,26.5
BLZ,ENF.CONT.COEN.COST.ZS,Belize,Enforcing contracts: Cost (% of claim),27.5,27.5,27.5,27.5,27.5
VCT,ENF.CONT.COEN.COST.ZS,St. Vincent and the Grenadines,Enforcing contracts: Cost (% of claim),30.3,30.3,30.3,30.3,30.3
BRA,ENF.CONT.COEN.COST.ZS,Brazil,Enforcing contracts: Cost (% of claim),20.7,22.0,22.0,22.0,22.0
...,...,...,...,...,...,...,...,...
LCA,IC.REG.PRRT.COST.PRT.VAL,St. Lucia,Registering property: Cost (% of property value),7.6,7.6,7.1,7.2,7.2
CRI,IC.REG.PRRT.COST.PRT.VAL,Costa Rica,Registering property: Cost (% of property value),3.4,3.4,3.4,3.4,3.4
DMA,IC.REG.PRRT.COST.PRT.VAL,Dominica,Registering property: Cost (% of property value),13.3,13.3,13.3,13.3,13.3
MEX,IC.REG.PRRT.COST.PRT.VAL,Mexico,Registering property: Cost (% of property value),5.2,5.2,5.6,5.8,5.9


## Save as CSVs

Each statistic is saved to its own CSV file so that it can later be loaded into a database and queried with SQL.

In [4]:
# creates an array of unique statistic names
statistic_names = data['Series'].unique()

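# saves each statistic to its own csv file, named after the topic before the colon
# (e.g. 'Enforcing_contracts.csv'); statistics sharing a topic would overwrite each other
for name in statistic_names:
    path = '_'.join(name.split(':')[0].split()) + '.csv'
    df = data.loc[data['Series'] == name]
    df.to_csv(path, index=False)  # index=False drops the economy/series code index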
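
The notebook stops at the CSV files, so the SQL step is not shown. As a minimal sketch of what it could look like, the block below loads one such file into an in-memory SQLite database using only the standard library. SQLite is an assumption, since the notebook does not name a SQL engine, and the helpers `csv_name`, `load_csv_into_sqlite` and `demo` are hypothetical names introduced for illustration. The two sample rows are copied from the output table above, and the file is written to a temporary directory so the example runs without calling the World Bank API.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path

# Two rows copied from the output table above (wide format, one column per year).
HEADER = ["Country", "Series", "YR2015", "YR2016", "YR2017", "YR2018", "YR2019"]
ROWS = [
    ["Guatemala", "Enforcing contracts: Cost (% of claim)", 26.5, 26.5, 26.5, 26.5, 26.5],
    ["Brazil", "Enforcing contracts: Cost (% of claim)", 20.7, 22.0, 22.0, 22.0, 22.0],
]


def csv_name(series_label):
    """Same naming rule as the loop above: topic before the colon, spaces to underscores."""
    return "_".join(series_label.split(":")[0].split()) + ".csv"


def load_csv_into_sqlite(conn, csv_path, table):
    """Create `table` from a CSV file; YR* columns are declared REAL, the rest TEXT."""
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        cols = ", ".join(
            f'"{c}" REAL' if c.startswith("YR") else f'"{c}" TEXT' for c in header
        )
        conn.execute(f'CREATE TABLE "{table}" ({cols})')
        placeholders = ", ".join("?" for _ in header)
        # Empty cells (missing values) are stored as NULL rather than empty strings.
        rows = ([value if value != "" else None for value in row] for row in reader)
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)


def demo():
    """Write the sample rows to a CSV, load it into SQLite, and run one query."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / csv_name(ROWS[0][1])
        with open(path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(HEADER)
            writer.writerows(ROWS)

        conn = sqlite3.connect(":memory:")
        load_csv_into_sqlite(conn, path, path.stem)
        # Change in cost between the first and last year, per country.
        result = conn.execute(
            f'SELECT Country, YR2019 - YR2015 FROM "{path.stem}" ORDER BY Country'
        ).fetchall()
        conn.close()
        return path.name, result
```

Calling `demo()` returns the generated file name together with the query result. To query a file written by the loop above, pass its path and a table name to `load_csv_into_sqlite` instead of using the sample rows.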