# Demographic and Health Surveys (DHS) Indicators

This notebook builds a list of all indicators that the [DHS API](http://api.dhsprogram.com/#/api-data.cfm) reports for Ethiopia, then defines an adapter that pulls a single indicator at the subnational (admin1) level and formats it for ingestion into Datamarts.

In [1]:
import requests
import pandas as pd

# Obtain all dataset metadata

In [2]:
initial_ = requests.get('http://api.dhsprogram.com/rest/dhs/v7/data?countryIds=ET').json()

In [3]:
pages = initial_['TotalPages']

In [4]:
datasets = []
for p in range(1,pages+1):
    d = requests.get(f'http://api.dhsprogram.com/rest/dhs/v7/data?countryIds=ET&page={p}').json()
    datasets.extend(d['Data'])
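
The loop above requests page 1 twice: once to read `TotalPages` and again inside the loop. A minimal sketch of a reusable pagination helper that reuses the first response is shown below. The names `fetch_all` and `fake_fetch` are hypothetical, and `fake_fetch` is only a stand-in so the example runs without network access; it assumes, as in the cells above, that every response carries `TotalPages` and `Data`.

```python
def fetch_all(fetch, **params):
    """Collect the 'Data' records from every page of a paginated query.

    `fetch` is any callable that accepts the query parameters as keyword
    arguments (including `page`) and returns the decoded JSON response.
    """
    first = fetch(page=1, **params)
    records = list(first['Data'])
    # Page 1 is already in hand, so continue from page 2.
    for page in range(2, first['TotalPages'] + 1):
        records.extend(fetch(page=page, **params)['Data'])
    return records


def fake_fetch(page, **params):
    # Stand-in for the real API: two pages of placeholder records.
    pages = {1: ['a', 'b'], 2: ['c']}
    return {'TotalPages': len(pages), 'Data': pages[page]}


print(fetch_all(fake_fetch, countryIds='ET'))  # ['a', 'b', 'c']
```

Against the real API, `fetch` could be something like `lambda **q: requests.get('http://api.dhsprogram.com/rest/dhs/v7/data', params=q, timeout=30).json()`, which also adds a timeout that the cells above omit.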

In [5]:
indicator_ids = list(set([i['IndicatorId'] for i in datasets]))

In [6]:
print(f"There are {len(indicator_ids)} total indicators in DHS for Ethiopia")

There are 2856 total indicators in DHS for Ethiopia


In [7]:
for i in indicator_ids[:10]:
    print(i)

CN_BRFS_C_EXB
RH_PCCT_C_D36
RH_PCMT_W_23H
DV_SPVM_W_UNW
WE_OWNA_M_HNM
FP_EVUM_W_TRA
EM_MERN_M_JNT
HC_CKFL_H_KER
ML_IPTP_W_UNW
CM_ECMR_C_U5E


# Obtain Indicator Specific Timeseries

In [8]:
def process_record(record):
    row = {'time': record['SurveyYear'],
           'country': record['CountryName'],
           'admin_1': record['CharacteristicLabel'],
           record['IndicatorId']: record['Value'],
           record['IndicatorId'] + '_description': record['Indicator'],
          }

    qualifiers = [k for k in record.keys() if k not in ['CountryName','Value','Indicator','IndicatorId', 'CharacteristicLabel']]

    for q in qualifiers:
        row[record['IndicatorId'] + '_' + q] = record[q]

    return row
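
To make the shape of the output concrete, the sketch below runs a trimmed record through the function. The record uses field names and values that appear in the output further down, but it is cut to a handful of fields for illustration; real API records carry many more. The function is repeated in condensed form only so the example runs on its own.

```python
def process_record(record):
    # Condensed copy of the function above, repeated so this example is self-contained.
    indicator = record['IndicatorId']
    row = {'time': record['SurveyYear'],
           'country': record['CountryName'],
           'admin_1': record['CharacteristicLabel'],
           indicator: record['Value'],
           indicator + '_description': record['Indicator']}
    core = ['CountryName', 'Value', 'Indicator', 'IndicatorId', 'CharacteristicLabel']
    for key in record:
        if key not in core:
            row[indicator + '_' + key] = record[key]
    return row


# Trimmed record for illustration only.
sample = {
    'SurveyYear': 2000,
    'CountryName': 'Ethiopia',
    'CharacteristicLabel': 'Tigray',
    'IndicatorId': 'HC_CKFL_P_ELC',
    'Indicator': 'Population cooking with electricity',
    'Value': 0.7,
    'SurveyId': 'ET2000DHS',
}

row = process_record(sample)
for key, value in row.items():
    print(f'{key}: {value}')
```

Note that `SurveyYear` is not in the exclusion list, so the year appears twice in the row: once as `time` and once as `HC_CKFL_P_ELC_SurveyYear`.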

In [9]:
def get_indicator(indicator_id):
    # Use the subnational breakdown for both requests so that TotalPages matches the pages fetched below
    url = f'http://api.dhsprogram.com/rest/dhs/v7/data?countryIds=ET&indicatorIds={indicator_id}&breakdown=Subnational'
    initial = requests.get(url).json()
    pages = initial['TotalPages']
    data = []
    for p in range(1,pages+1):
        d = requests.get(f'{url}&page={p}').json()
        for d_ in d['Data']:
            data.append(process_record(d_))
    return pd.DataFrame(data)

In [10]:
data = get_indicator('HC_CKFL_P_ELC')

In [11]:
data.head()

,time,country,admin_1,HC_CKFL_P_ELC,HC_CKFL_P_ELC_description,HC_CKFL_P_ELC_DataId,HC_CKFL_P_ELC_SurveyId,HC_CKFL_P_ELC_IsPreferred,HC_CKFL_P_ELC_SDRID,HC_CKFL_P_ELC_Precision,...,HC_CKFL_P_ELC_IndicatorType,HC_CKFL_P_ELC_CharacteristicId,HC_CKFL_P_ELC_CharacteristicCategory,HC_CKFL_P_ELC_CharacteristicOrder,HC_CKFL_P_ELC_ByVariableLabel,HC_CKFL_P_ELC_DenominatorUnweighted,HC_CKFL_P_ELC_DenominatorWeighted,HC_CKFL_P_ELC_CIHigh,HC_CKFL_P_ELC_IsTotal,HC_CKFL_P_ELC_ByVariableId
0,2000,Ethiopia,Tigray,0.7,Population cooking with electricity,3545006,ET2000DHS,1,HCCKFLPELC,1,...,I,406001,Region,1406001,,6053.0,4366.0,,0,0
1,2000,Ethiopia,Afar,0.0,Population cooking with electricity,2409400,ET2000DHS,1,HCCKFLPELC,1,...,I,406002,Region,1406002,,3753.0,726.0,,0,0
2,2000,Ethiopia,Amhara,0.0,Population cooking with electricity,927868,ET2000DHS,1,HCCKFLPELC,1,...,I,406003,Region,1406003,,9161.0,17962.0,,0,0
3,2000,Ethiopia,Oromia,0.0,Population cooking with electricity,1068330,ET2000DHS,1,HCCKFLPELC,1,...,I,406004,Region,1406004,,11303.0,25733.0,,0,0
4,2000,Ethiopia,Somali,0.1,Population cooking with electricity,2409426,ET2000DHS,1,HCCKFLPELC,1,...,I,406005,Region,1406005,,4429.0,902.0,,0,0
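
The frame is wide because every qualifier becomes its own column. A minimal sketch of trimming it to the core columns is shown below. It rebuilds the five rows above by hand, keeping only two of the qualifier columns, so that it runs without calling the API. The schema Datamarts expects is not specified here, so the choice of columns is an assumption.

```python
import pandas as pd

# Hand-built copy of the rows shown above, keeping only a few columns.
data = pd.DataFrame({
    'time': [2000] * 5,
    'country': ['Ethiopia'] * 5,
    'admin_1': ['Tigray', 'Afar', 'Amhara', 'Oromia', 'Somali'],
    'HC_CKFL_P_ELC': [0.7, 0.0, 0.0, 0.0, 0.1],
    'HC_CKFL_P_ELC_SurveyId': ['ET2000DHS'] * 5,
    'HC_CKFL_P_ELC_DenominatorWeighted': [4366.0, 726.0, 17962.0, 25733.0, 902.0],
})

# Keep the time, location, and value columns; drop the qualifiers.
core_columns = ['time', 'country', 'admin_1', 'HC_CKFL_P_ELC']
core = data[core_columns].sort_values('HC_CKFL_P_ELC', ascending=False)

print(core.head())
```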
