In [1]:
# Importing modules
import requests
import matplotlib.pyplot as plt
import json
import pandas as pd

In this exercise, you'll redo the data gathering phase of the UNData Exploration project by using APIs instead of downloading CSV files.

You'll make use of the [World Bank Indicators API](https://datahelpdesk.worldbank.org/knowledgebase/articles/889392-about-the-indicators-api-documentation). Note that this API does not require an API key. Before attempting the exercise, it would be a good idea to skim through the Documentation page and to check out the [Basic Call Structure article](https://datahelpdesk.worldbank.org/knowledgebase/articles/898581).

1. Use the API to get all available data for the _GDP per capita, PPP (constant 2017 international $)_ indicator. Hint: this indicator has code "NY.GDP.PCAP.PP.KD". Adjust the query parameters so that you can retrieve all available rows. Convert the results to a DataFrame.

In [11]:
# Get the data for GDP_per_capita

endpoint = 'http://api.worldbank.org/v2/country/all/indicator/NY.GDP.PCAP.PP.KD'
params = {'format':'json',
          'per_page' : 16758}
res = requests.get(endpoint, params=params).json()
GDP_Data = pd.json_normalize(res[1])
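
A hard-coded `per_page` such as 16758 only returns everything while the dataset has at most that many rows. One possible alternative, sketched below, is to send a one-row probe request, read the `total` field from the metadata element of the response, and then request exactly that many rows. The helper names `fetch_all_rows` and `make_get_json` are hypothetical, the response is assumed to be the two-element list `[metadata, rows]` that the cell above indexes with `res[1]`, and the sketch assumes the API accepts a `per_page` that large.

```python
def fetch_all_rows(get_json):
    """Return every row by first asking the API how many rows exist.

    get_json(per_page) is a hypothetical callable that returns the decoded
    JSON response, assumed to be a two-element list: [metadata, rows].
    """
    # A one-row probe is enough to read the metadata block.
    meta, _ = get_json(1)
    total = int(meta["total"])
    if total == 0:
        return []
    # Second request: ask for all rows on a single page.
    _, rows = get_json(total)
    return rows or []


def make_get_json(endpoint):
    """Build a get_json callable backed by requests (imported lazily)."""
    import requests

    def get_json(per_page):
        params = {"format": "json", "per_page": per_page}
        response = requests.get(endpoint, params=params, timeout=30)
        response.raise_for_status()
        return response.json()

    return get_json


# Usage (makes network calls, so it is left commented out):
# endpoint = 'http://api.worldbank.org/v2/country/all/indicator/NY.GDP.PCAP.PP.KD'
# GDP_Data = pd.json_normalize(fetch_all_rows(make_get_json(endpoint)))
```

Passing the request function in as an argument keeps the row-count logic separate from the network call, so it can be checked with a stub before hitting the real API.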

2. Now, use the API to get all available data for _Life expectancy at birth, total (years)_. This indicator has code "SP.DYN.LE00.IN". Again, convert the results to a DataFrame.

In [15]:
# Get the data for Life expectancy

endpoint = 'http://api.worldbank.org/v2/country/all/indicator/SP.DYN.LE00.IN'
params = {'format':'json',
          'per_page' : 16758}
response = requests.get(endpoint, params = params)
res = response.json()
#res
Life_Expectancy = pd.json_normalize(res[1])

In [12]:
# Drop unused columns and rename the remaining ones in the GDP per capita data

GDP_Data = (
    GDP_Data
    .drop(columns=['countryiso3code', 'unit', 'indicator.id', 'indicator.value',
                   'obs_status', 'decimal', 'country.id'])
    .rename(columns={'value': 'GDP_Per_Capita',
                     'date': 'Year',
                     'country.value': 'Country or Region'})
)

In [16]:
# Drop unused columns and rename the remaining ones in the life expectancy data

Life_Expectancy = (
    Life_Expectancy
    .drop(columns=['countryiso3code', 'unit', 'indicator.id', 'indicator.value',
                   'obs_status', 'decimal', 'country.id'])
    .rename(columns={'value': 'Life_Expectancy',
                     'date': 'Year',
                     'country.value': 'Country or Region'})
)

3. Merge the two results DataFrames together. You may want to rename or drop columns prior to merging.

In [18]:
# Merge GDP data and life expectancy data on country/region name and year

gdp_le = GDP_Data.merge(Life_Expectancy, on = ['Country or Region','Year'])

4. You can also get more information about the available countries (region, capital city, income level classification, etc.) by using the [Country API](https://datahelpdesk.worldbank.org/knowledgebase/articles/898590-country-api-queries). Use this API to pull in all available data and merge it with your other datasets. Then use the result to remove the rows that correspond to regions rather than countries.


In [21]:
# Load Countries data, choose rows with only countries, drop columns and rename columns
endpoint = 'http://api.worldbank.org/v2/country'
params = {'format':'json',
          'per_page' : 297}
res = requests.get(endpoint, params=params).json()
Countries = pd.json_normalize(res[1])
# Aggregates (regions, income groups, etc.) carry region.id 'NA'; keep only actual countries
Countries_only = Countries[Countries['region.id'] != 'NA']
Countries_only = Countries_only.drop(columns= ['id', 'iso2Code', 'longitude', 'latitude',
       'region.id', 'region.iso2code', 'region.value', 'adminregion.id',
       'adminregion.iso2code', 'adminregion.value', 'incomeLevel.id',
       'incomeLevel.iso2code', 'incomeLevel.value', 'lendingType.id',
       'lendingType.iso2code', 'lendingType.value']).rename(columns={'name':'Country or Region',
                                                                     'capitalCity':'Capital'})

In [56]:
# Merge so the result contains only countries (not aggregate regions), then drop rows with missing values
merged_gdp_le = gdp_le.merge(Countries_only, on = 'Country or Region').dropna()

In [57]:
merged_gdp_le['Country or Region'].nunique()

193

In [58]:
merged_gdp_le.head()

| | Year | GDP_Per_Capita | Country or Region | Life_Expectancy | Capital |
|---|---|---|---|---|---|
| 1 | 2021 | 1517.016266 | Afghanistan | 61.982 | Kabul |
| 2 | 2020 | 1968.341002 | Afghanistan | 62.575 | Kabul |
| 3 | 2019 | 2079.921861 | Afghanistan | 63.565 | Kabul |
| 4 | 2018 | 2060.698973 | Afghanistan | 63.081 | Kabul |
| 5 | 2017 | 2096.093111 | Afghanistan | 63.016 | Kabul |
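
The merge above joins on the country name, which works only as long as both endpoints spell every name identically. A possible alternative, sketched here on toy data, is to keep `countryiso3code` from the indicator response and join it to the `id` column of the Country API response, which appears to hold the same three-letter code. The frames below are hand-built stand-ins, not API output: the USA and CAN values are the 2021 figures from the bonus section of this notebook, the WLD value is a placeholder, and the `region.id` of "NAC" for both countries is an assumption.

```python
import pandas as pd

# Toy stand-in for the indicator response (WLD is an aggregate, value is a placeholder).
indicator_rows = pd.DataFrame({
    "countryiso3code": ["USA", "CAN", "WLD"],
    "date": ["2021", "2021", "2021"],
    "value": [63635.82381, 48218.038316, float("nan")],
})

# Toy stand-in for the Country API response; aggregates carry region.id "NA".
countries = pd.DataFrame({
    "id": ["USA", "CAN", "WLD"],
    "name": ["United States", "Canada", "World"],
    "capitalCity": ["Washington D.C.", "Ottawa", ""],
    "region.id": ["NAC", "NAC", "NA"],
})

# Keep only real countries and the columns needed after the merge.
countries_only = countries.loc[countries["region.id"] != "NA",
                               ["id", "name", "capitalCity"]]

# An inner join on the ISO3 code drops the aggregate row automatically.
merged = (
    indicator_rows
    .merge(countries_only, left_on="countryiso3code", right_on="id", how="inner")
    .drop(columns="id")
)
print(merged)
```

Joining on a code rather than a display name avoids silently losing rows when a name differs by punctuation or spelling between the two responses.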


### Bonus Questions

1. Adjust your request so that it returns data just for the United States.

In [69]:
endpoint = 'http://api.worldbank.org/v2/country/USA/indicator/NY.GDP.PCAP.PP.KD'
params = {'format':'json',
          'per_page' : 63}
res = requests.get(endpoint, params=params).json()
#res
GDP_Data_USA = pd.json_normalize(res[1])

In [108]:
GDP_Data_USA.head()

| | countryiso3code | date | value | unit | obs_status | decimal | indicator.id | indicator.value | country.id | country.value |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | USA | 2022 | 64702.978311 | | | 0 | NY.GDP.PCAP.PP.KD | GDP per capita, PPP (constant 2017 internation... | US | United States |
| 1 | USA | 2021 | 63635.82381 | | | 0 | NY.GDP.PCAP.PP.KD | GDP per capita, PPP (constant 2017 internation... | US | United States |
| 2 | USA | 2020 | 60158.910453 | | | 0 | NY.GDP.PCAP.PP.KD | GDP per capita, PPP (constant 2017 internation... | US | United States |
| 3 | USA | 2019 | 62470.929913 | | | 0 | NY.GDP.PCAP.PP.KD | GDP per capita, PPP (constant 2017 internation... | US | United States |
| 4 | USA | 2018 | 61348.456596 | | | 0 | NY.GDP.PCAP.PP.KD | GDP per capita, PPP (constant 2017 internation... | US | United States |


2. Adjust your request so that it returns data just for the United States for the year 2021.

In [110]:
endpoint = 'http://api.worldbank.org/v2/country/USA/indicator/NY.GDP.PCAP.PP.KD'
params = {'format':'json',
          'per_page' : 50,
          'date':'2021'}
res = requests.get(endpoint, params=params).json()
#res
GDP_Data_USA_2021 = pd.json_normalize(res[1])

In [115]:
GDP_Data_USA_2021

| | countryiso3code | date | value | unit | obs_status | decimal | indicator.id | indicator.value | country.id | country.value |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | USA | 2021 | 63635.82381 | | | 0 | NY.GDP.PCAP.PP.KD | GDP per capita, PPP (constant 2017 internation... | US | United States |


3. Adjust your request so that it returns data just for the United States for the years 2000 through 2021.

In [114]:
endpoint = 'http://api.worldbank.org/v2/country/USA/indicator/NY.GDP.PCAP.PP.KD'
params = {'format':'json',
          'per_page' : 50,
          'date': '2000:2021'}
res = requests.get(endpoint, params=params).json()
#res
GDP_Data_USA_2000_2021 = pd.json_normalize(res[1])

In [116]:
GDP_Data_USA_2000_2021.head(2)

| | countryiso3code | date | value | unit | obs_status | decimal | indicator.id | indicator.value | country.id | country.value |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | USA | 2021 | 63635.82381 | | | 0 | NY.GDP.PCAP.PP.KD | GDP per capita, PPP (constant 2017 internation... | US | United States |
| 1 | USA | 2020 | 60158.910453 | | | 0 | NY.GDP.PCAP.PP.KD | GDP per capita, PPP (constant 2017 internation... | US | United States |


4. Adjust your request so that it returns data for the United States and Canada for the years 2000 through 2021.

In [118]:
endpoint = 'http://api.worldbank.org/v2/country/USA;CAN/indicator/NY.GDP.PCAP.PP.KD'
params = {'format':'json',
          'per_page' : 50,
          'date': '2000:2021'}
res = requests.get(endpoint, params=params).json()
#res
GDP_Data_USA_Canada_2000_2021 = pd.json_normalize(res[1])

In [120]:
GDP_Data_USA_Canada_2000_2021.head()

| | countryiso3code | date | value | unit | obs_status | decimal | indicator.id | indicator.value | country.id | country.value |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | CAN | 2021 | 48218.038316 | | | 0 | NY.GDP.PCAP.PP.KD | GDP per capita, PPP (constant 2017 internation... | CA | Canada |
| 1 | CAN | 2020 | 46181.757555 | | | 0 | NY.GDP.PCAP.PP.KD | GDP per capita, PPP (constant 2017 internation... | CA | Canada |
| 2 | CAN | 2019 | 49175.67705 | | | 0 | NY.GDP.PCAP.PP.KD | GDP per capita, PPP (constant 2017 internation... | CA | Canada |
| 3 | CAN | 2018 | 48962.481511 | | | 0 | NY.GDP.PCAP.PP.KD | GDP per capita, PPP (constant 2017 internation... | CA | Canada |
| 4 | CAN | 2017 | 48317.174584 | | | 0 | NY.GDP.PCAP.PP.KD | GDP per capita, PPP (constant 2017 internation... | CA | Canada |
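
The four bonus requests above differ only in the country segment of the URL (one code, or several joined with `;`) and in the `date` parameter (a single year, or `start:end`). As a rough sketch of that pattern, the hypothetical helper `build_indicator_request` below assembles the endpoint and params without sending anything; its name and argument conventions are illustrative choices, not part of the World Bank API.

```python
def build_indicator_request(indicator, countries=None, date=None, per_page=50):
    """Return (endpoint, params) for a World Bank indicator query.

    countries: None for all countries, one code such as "USA",
               or a list of codes such as ["USA", "CAN"].
    date:      None for all years, a single year such as 2021,
               or a (start, end) tuple such as (2000, 2021).
    """
    if countries is None:
        country_part = "all"
    elif isinstance(countries, str):
        country_part = countries
    else:
        # Multiple countries are separated by semicolons in the URL path.
        country_part = ";".join(countries)

    endpoint = (
        f"http://api.worldbank.org/v2/country/{country_part}"
        f"/indicator/{indicator}"
    )
    params = {"format": "json", "per_page": per_page}

    if isinstance(date, tuple):
        # A year range is written as start:end.
        params["date"] = f"{date[0]}:{date[1]}"
    elif date is not None:
        params["date"] = str(date)

    return endpoint, params


endpoint, params = build_indicator_request(
    "NY.GDP.PCAP.PP.KD", countries=["USA", "CAN"], date=(2000, 2021)
)
print(endpoint)
print(params)
```

The returned pair can be passed straight to `requests.get(endpoint, params=params)`. Note that two countries over 2000 through 2021 give 44 rows, which still fits within `per_page=50`; a wider range or more countries would need a larger value or paging.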


5. If you haven't already done so and you would like to get some additional practice using loops, use the page parameter in order to pull all records. Do not change the value of the per_page parameter. You will likely need to utilize a loop of some kind in order to pull all records.
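
One way to approach this, offered as a minimal sketch rather than a reference solution, is to request page 1, read the `pages` count from the metadata element of the response, and keep incrementing `page` until that count is reached. The helper names `fetch_all_pages` and `worldbank_page_getter` are hypothetical, and the sketch assumes every response is the two-element list `[metadata, rows]` seen in the cells above; an error response with a different shape would need extra handling.

```python
def fetch_all_pages(get_page):
    """Collect the rows from every page of a paged API response.

    get_page(page) is a hypothetical callable returning the decoded JSON
    for one page, assumed to be a two-element list: [metadata, rows].
    """
    rows = []
    page = 1
    while True:
        meta, data = get_page(page)
        # A page with no rows may come back as None rather than [].
        rows.extend(data or [])
        # The metadata reports the total number of pages available.
        if page >= int(meta["pages"]):
            break
        page += 1
    return rows


def worldbank_page_getter(endpoint, per_page=50):
    """Build a get_page callable backed by requests (imported lazily)."""
    import requests

    def get_page(page):
        # per_page stays fixed; only the page number changes between calls.
        params = {"format": "json", "per_page": per_page, "page": page}
        response = requests.get(endpoint, params=params, timeout=30)
        response.raise_for_status()
        return response.json()

    return get_page


# Usage (makes many network calls, so it is left commented out):
# endpoint = 'http://api.worldbank.org/v2/country/all/indicator/NY.GDP.PCAP.PP.KD'
# GDP_Data = pd.json_normalize(fetch_all_pages(worldbank_page_getter(endpoint)))
```

Reading the page count from the response avoids guessing how many iterations are needed, and the loop still ends after one request when everything fits on a single page.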