---
### Fetching Data from an API

#### Using a real-world dataset from the WHO Global Health Observatory (GHO) website



Importing data directly from the API
- An online JSON viewer was used to inspect the structure of the raw API response.
- `request` stores the API response, `.json()` parses its body as JSON, and the `'value'` key holds the list of records.
- `display.max_colwidth` is set to `None` so that values in the `IndicatorName` column are not truncated.

In [23]:
import requests as rq
import pandas as pd

request = rq.get('https://ghoapi.azureedge.net/api/Indicator')
df = pd.DataFrame(request.json()['value'])
pd.set_option('display.max_colwidth', None)
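
A slightly more defensive variant of the request above, offered as a sketch rather than as the notebook's actual code. The helper names `records_from_payload` and `fetch_records`, the `GHO_BASE_URL` constant, and the 30-second timeout are assumptions introduced here. The sketch adds a timeout and an HTTP status check, and fails with a clear message if the `'value'` key is missing.

```python
import pandas as pd
import requests as rq

# Base of the URLs used in this notebook.
GHO_BASE_URL = 'https://ghoapi.azureedge.net/api'


def records_from_payload(payload):
    """Return the list stored under 'value', or raise if the key is absent."""
    if 'value' not in payload:
        raise KeyError("expected a 'value' key in the API response")
    return payload['value']


def fetch_records(endpoint, timeout=30):
    """Fetch one GHO endpoint (e.g. 'Indicator') as a DataFrame.

    The 30-second timeout is an arbitrary illustrative choice.
    """
    response = rq.get(f'{GHO_BASE_URL}/{endpoint}', timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
    return pd.DataFrame(records_from_payload(response.json()))
```

With these helpers, `fetch_records('Indicator')` would be expected to build the same kind of DataFrame as the cell above.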

In [24]:
df[df['IndicatorName'] == 'Life expectancy at birth (years)']

Unnamed: 0,IndicatorCode,IndicatorName,Language
1788,WHOSIS_000001,Life expectancy at birth (years),EN


In [25]:
df[df['IndicatorName'].str.contains('Under-five mortality rate', case=False, na=False)]

Unnamed: 0,IndicatorCode,IndicatorName,Language
1375,u5mr,Under-five mortality rate (deaths per 1000 live births),EN
2848,MDG_0000000007,Under-five mortality rate (probability of dying by age 5 per 1000 live births),EN


In [26]:
df[df['IndicatorName'].str.contains('maternal mortality ratio', case=False, na=False)]

Unnamed: 0,IndicatorCode,IndicatorName,Language
2090,MDG_0000000032,Maternal mortality ratio (per 100 000 live births) - Country reported estimates,EN
2341,MDG_0000000026,Maternal mortality ratio (per 100 000 live births),EN


In [27]:
df[df['IndicatorName'].str.contains('physicians', case=False, na=False)]

Unnamed: 0,IndicatorCode,IndicatorName,Language
76,HRH_02,Number of physicians,EN
254,HRH_26,Physicians density (per 1000 population),EN
872,GDO_q6x2_1,Inclusion of basic dementia competencies in training of physicians,EN
2276,PRISON_B9_1_PHYSICIANS_RATIO,Physicians in prisons (FTEs per 1000 prisoners),EN


In [28]:
df[df['IndicatorName'].str.contains('hospital beds', case=False, na=False)]

Unnamed: 0,IndicatorCode,IndicatorName,Language
1116,GDO_q9x1_4,"Geriatric-specific hospital beds (per 10,000 population)",EN
1144,GDO_q9x1_3,"Dementia-specific hospital beds (per 10,000 population)",EN
1927,WHS6_102,Hospital beds (per 10 000 population),EN


In [29]:
df[df['IndicatorName'].str.contains('gdp', case=False, na=False)]

Unnamed: 0,IndicatorCode,IndicatorName,Language
307,GB_XPD_RSDV,Research and development (R&D) expenditure as a proportion of GDP,EN
730,R_afford_gdp,Affordability - percentage of GDP per capita required to purchase 2000 cigarettes of the most sold brand,EN
1257,GHED_GGHE-DGDP_SHA2011,Domestic general government health expenditure (GGHE-D) as percentage of gross domestic product (GDP) (%),EN
1343,GHED_CHEGDP_SHA2011,Current health expenditure (CHE) as percentage of gross domestic product (GDP) (%),EN
2399,TAXBEV_AFFORD_GDP,GDP per capita to purchase a volume of beverage (%),EN
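
The keyword searches above all follow the same pattern, so they could be wrapped in a small helper. This is only a sketch: `find_indicators` is a hypothetical name, and the `sample` table is a stand-in built from a few rows shown in the outputs above.

```python
import pandas as pd


def find_indicators(indicators, keyword):
    """Return rows whose IndicatorName contains `keyword` (case-insensitive).

    regex=False treats the keyword as plain text, so characters such as
    '(' are matched literally rather than as regex syntax.
    """
    mask = indicators['IndicatorName'].str.contains(
        keyword, case=False, na=False, regex=False
    )
    return indicators[mask]


# Small stand-in table built from rows shown in the outputs above.
sample = pd.DataFrame({
    'IndicatorCode': ['HRH_02', 'HRH_26', 'WHS6_102'],
    'IndicatorName': [
        'Number of physicians',
        'Physicians density (per 1000 population)',
        'Hospital beds (per 10 000 population)',
    ],
})
matches = find_indicators(sample, 'physicians')
print(matches['IndicatorCode'].tolist())
```

On the full indicator table, `find_indicators(df, 'gdp')` would replace the longer `df[df['IndicatorName'].str.contains(...)]` expression.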


In [30]:
df

Unnamed: 0,IndicatorCode,IndicatorName,Language
0,AIR_10,Ambient air pollution attributable DALYs per 100'000 children under 5 years,EN
1,AIR_12,Household air pollution attributable deaths in children under 5 years,EN
2,AIR_16,Household air pollution attributable DALYs in children under 5 years,EN
3,AIR_3,Percentage of the total population living in cities > 100'000 inhabitants,EN
4,AIR_35,Joint effects of air pollution attributable deaths,EN
...,...,...,...
3051,SA_0000001844,"Alcohol-attributable all-cause deaths (all ages), (number)",EN
3052,UHCTRANSFATS,Best-practice policy implemented for industrially produced trans-fatty acids (TFA) (Y/N),EN
3053,HEPATITIS_HBV_INFECTIONS_NEW_NUM,New chronic hepatitis B (HBV) infections (number),EN
3054,PHE_HHAIR_PROP_POP_CATEGORY_FUELS,"Proportion of population with primary reliance on fuels and technologies for cooking, by fuel type (%)",EN


In [None]:
# Indicator codes picked from the searches above
codes = ['WHOSIS_000001', 'u5mr', 'MDG_0000000026', 'HRH_26', 'WHS6_102']
# Keep only the rows for the selected indicators
df = df[df['IndicatorCode'].isin(codes)]
df

Unnamed: 0,IndicatorCode,IndicatorName,Language
254,HRH_26,Physicians density (per 1000 population),EN
1375,u5mr,Under-five mortality rate (deaths per 1000 live births),EN
1788,WHOSIS_000001,Life expectancy at birth (years),EN
1927,WHS6_102,Hospital beds (per 10 000 population),EN
2341,MDG_0000000026,Maternal mortality ratio (per 100 000 live births),EN


In [37]:
# Eventually this should loop over every entry in `codes`;
# for now, fetch only life expectancy (WHOSIS_000001).

request = rq.get('https://ghoapi.azureedge.net/api/WHOSIS_000001')
d = pd.DataFrame(request.json()['value'])
d.head()

Unnamed: 0,Id,IndicatorCode,SpatialDimType,SpatialDim,TimeDimType,ParentLocationCode,ParentLocation,Dim1Type,TimeDim,Dim1,...,DataSourceDim,Value,NumericValue,Low,High,Comments,Date,TimeDimensionValue,TimeDimensionBegin,TimeDimensionEnd
0,9387322,WHOSIS_000001,COUNTRY,OMN,YEAR,EMR,Eastern Mediterranean,SEX,2003,SEX_MLE,...,,70.3 [69.7-71.1],70.316239,69.70728,71.131767,,2024-08-02T09:43:39.193+02:00,2003,2003-01-01T00:00:00+01:00,2003-12-31T00:00:00+01:00
1,9388527,WHOSIS_000001,COUNTRY,BLZ,YEAR,AMR,Americas,SEX,2003,SEX_FMLE,...,,74.7 [74.5-75.0],74.727983,74.533552,74.955545,,2024-08-02T09:43:39.193+02:00,2003,2003-01-01T00:00:00+01:00,2003-12-31T00:00:00+01:00
2,9388585,WHOSIS_000001,COUNTRY,GIN,YEAR,AFR,Africa,SEX,2016,SEX_FMLE,...,,61.1 [60.1-62.0],61.095901,60.113522,62.005773,,2024-08-02T09:43:39.193+02:00,2016,2016-01-01T00:00:00+01:00,2016-12-31T00:00:00+01:00
3,9389231,WHOSIS_000001,COUNTRY,NZL,YEAR,WPR,Western Pacific,SEX,2018,SEX_BTSX,...,,81.8 [81.7-81.8],81.755405,81.727056,81.842149,,2024-08-02T09:43:39.193+02:00,2018,2018-01-01T00:00:00+01:00,2018-12-31T00:00:00+01:00
4,9390603,WHOSIS_000001,COUNTRY,IRL,YEAR,EUR,Europe,SEX,2003,SEX_BTSX,...,,78.0 [78.0-78.1],78.023793,78.003415,78.129261,,2024-08-02T09:43:39.193+02:00,2003,2003-01-01T00:00:00+01:00,2003-12-31T00:00:00+01:00


In [55]:
d['SpatialDim'].unique()

array(['OMN', 'BLZ', 'GIN', 'NZL', 'IRL', 'TZA', 'ZAF', 'NER', 'SLV',
       'CPV', 'BLR', 'BGD', 'RUS', 'TON', 'CYP', 'SYC', 'GBR', 'KAZ',
       'ARE', 'GRC', 'GTM', 'BOL', 'AFR', 'SLB', 'GRD', 'DOM', 'DEU',
       'PRK', 'SRB', 'PRT', 'MNG', 'POL', 'WPR', 'UZB', 'AUT', 'PRY',
       'VCT', 'CAF', 'SAU', 'WB_UMI', 'FIN', 'NOR', 'LBR', 'PAK', 'HRV',
       'WB_LMI', 'USA', 'WB_HI', 'SWZ', 'DJI', 'NGA', 'TJK', 'ERI', 'BHR',
       'LKA', 'DZA', 'WSM', 'DNK', 'MLT', 'JAM', 'GUY', 'BRN', 'IRQ',
       'QAT', 'EST', 'MAR', 'TKM', 'VNM', 'LVA', 'CHL', 'MEX', 'PRI',
       'AGO', 'SEAR', 'GHA', 'ROU', 'NIC', 'BGR', 'AZE', 'CHN', 'FJI',
       'PER', 'CHE', 'LAO', 'MOZ', 'SSD', 'TGO', 'CUB', 'BHS', 'GEO',
       'LSO', 'LUX', 'PNG', 'AMR', 'URY', 'PAN', 'MDA', 'LTU', 'KOR',
       'SLE', 'HND', 'SVK', 'GLOBAL', 'EMR', 'PHL', 'IDN', 'LCA', 'GMB',
       'ECU', 'NAM', 'BDI', 'VEN', 'KIR', 'BTN', 'FRA', 'TUR', 'SUR',
       'TLS', 'ETH', 'MWI', 'YEM', 'MYS', 'UKR', 'BRA', 'BIH', 'EGY',
       '

The same steps need to be repeated for each indicator: extract its `SpatialDim` (country code), `TimeDim` (year), and indicator value, then merge all indicators by country and year to form a real dataset that can be worked on.
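
One way the per-indicator step and the merge might look, as a minimal sketch under stated assumptions. The function names `tidy_indicator` and `merge_indicators` are hypothetical. Keeping only `COUNTRY` rows and only the both-sexes rows (`SEX_BTSX`) is an assumption about what the final dataset needs, and other indicators may need different filters. The demo records are placeholders in the shape of the API response, not WHO data; real records would come from `request.json()['value']` as in the cell above.

```python
import pandas as pd

KEYS = ['SpatialDim', 'TimeDim']  # country code and year


def tidy_indicator(records, column_name):
    """Reduce one indicator's records to country, year, and numeric value."""
    frame = pd.DataFrame(records)
    # Drop regional and global aggregates such as 'AFR' or 'GLOBAL'.
    frame = frame[frame['SpatialDimType'] == 'COUNTRY']
    # If the indicator is broken down by sex, keep the both-sexes rows only
    # so that each (country, year) pair appears once.
    if 'Dim1' in frame.columns and (frame['Dim1'] == 'SEX_BTSX').any():
        frame = frame[frame['Dim1'] == 'SEX_BTSX']
    frame = frame[KEYS + ['NumericValue']]
    return frame.rename(columns={'NumericValue': column_name})


def merge_indicators(frames):
    """Outer-merge tidy indicator frames on country and year."""
    merged = frames[0]
    for frame in frames[1:]:
        merged = merged.merge(frame, on=KEYS, how='outer')
    return merged.sort_values(KEYS).reset_index(drop=True)


# Placeholder records (made-up codes and values, NOT real WHO data).
life = [
    {'SpatialDimType': 'COUNTRY', 'SpatialDim': 'AAA', 'TimeDim': 2000,
     'Dim1': 'SEX_BTSX', 'NumericValue': 1.0},
    {'SpatialDimType': 'COUNTRY', 'SpatialDim': 'AAA', 'TimeDim': 2000,
     'Dim1': 'SEX_MLE', 'NumericValue': 2.0},
    {'SpatialDimType': 'REGION', 'SpatialDim': 'ZZZ', 'TimeDim': 2000,
     'Dim1': 'SEX_BTSX', 'NumericValue': 3.0},
]
beds = [
    {'SpatialDimType': 'COUNTRY', 'SpatialDim': 'AAA', 'TimeDim': 2000,
     'Dim1': None, 'NumericValue': 4.0},
    {'SpatialDimType': 'COUNTRY', 'SpatialDim': 'BBB', 'TimeDim': 2001,
     'Dim1': None, 'NumericValue': 5.0},
]

dataset = merge_indicators([
    tidy_indicator(life, 'life_expectancy'),
    tidy_indicator(beds, 'hospital_beds'),
])
print(dataset)
```

An outer merge keeps country-year pairs that are missing for some indicators and leaves `NaN` in those cells; switching to `how='inner'` would keep only pairs present in every indicator.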

This tutorial was for learning API basics, and that objective has been achieved.

- This dataset can be completed and uploaded to Kaggle, or used to build a model later.

## To Be Continued...
