# Demographic and Health Surveys (DHS) Indicators

This notebook walks through obtaining the list of all indicators available for Ethiopia in the [DHS API](http://api.dhsprogram.com/#/api-data.cfm). It then builds an adapter that pulls a single indicator at the subnational (admin1) level for Ethiopia and formats it for ingestion into Datamarts.

In [1]:
import requests
import pandas as pd

# Obtain all dataset metadata

In [2]:
initial_ = requests.get('http://api.dhsprogram.com/rest/dhs/v7/data?countryIds=ET').json()

In [3]:
pages = initial_['TotalPages']

In [4]:
datasets = []
for p in range(1,pages+1):
    d = requests.get(f'http://api.dhsprogram.com/rest/dhs/v7/data?countryIds=ET&page={p}').json()
    datasets.extend(d['Data'])
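
The loop above can be wrapped in a small helper. The sketch below is one possible arrangement, not part of the original notebook: `fetch_all_pages` and its `get_page` parameter are hypothetical names, and it assumes only what the cells above already rely on, namely that each response carries `TotalPages` and a `Data` list. Passing the page getter in as an argument lets the logic be exercised without network access.

```python
def fetch_all_pages(get_page):
    """Collect the 'Data' records from every page of a paginated response.

    get_page(p) is assumed to return the decoded JSON for page p (1-indexed)
    as a dict with 'TotalPages' and 'Data' keys.
    """
    first = get_page(1)
    records = list(first['Data'])
    # Page 1 is already in hand, so continue from page 2.
    for p in range(2, first['TotalPages'] + 1):
        records.extend(get_page(p)['Data'])
    return records


# Offline stand-in for the API: three fake pages of two records each.
fake_pages = {
    1: {'TotalPages': 3, 'Data': [{'IndicatorId': 'A'}, {'IndicatorId': 'B'}]},
    2: {'TotalPages': 3, 'Data': [{'IndicatorId': 'B'}, {'IndicatorId': 'C'}]},
    3: {'TotalPages': 3, 'Data': [{'IndicatorId': 'C'}, {'IndicatorId': 'D'}]},
}
records = fetch_all_pages(fake_pages.__getitem__)
print(len(records))
```

Against the live API, the same helper could be called as `fetch_all_pages(lambda p: requests.get(f'http://api.dhsprogram.com/rest/dhs/v7/data?countryIds=ET&page={p}').json())`, which also avoids requesting page 1 twice.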

In [5]:
indicator_ids = list(set([i['IndicatorId'] for i in datasets]))

In [6]:
print(f"There are {len(indicator_ids)} total indicators in DHS for Ethiopia")

There are 2856 total indicators in DHS for Ethiopia


In [7]:
for i in indicator_ids[:10]:
    print(i)

ML_FEVT_C_AMS
HC_FLRM_H_PQT
CH_DSTL_C_UNW
SV_HRSM_H_DNF
DV_FMVL_W_NUM
AN_NUTS_W_NRM
CH_DIAT_C_INC
HA_AATT_W_VEG
HC_LVAR_C_NBA
SV_RESI_W_POS
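
The IDs printed above share a visible pattern: underscore-separated parts with a two-letter leading code (`ML`, `HC`, `CH`, and so on). As a rough way to get an overview of a long ID list, the sketch below counts IDs by that leading code. Treating the prefix as a grouping is an assumption based only on the pattern in the sample output, not something the API response confirms, and `prefix_counts` is a hypothetical helper name.

```python
from collections import Counter


def prefix_counts(ids):
    """Count indicator IDs by the part before the first underscore."""
    return Counter(i.split('_', 1)[0] for i in ids)


# The ten IDs printed above.
sample_ids = [
    'ML_FEVT_C_AMS', 'HC_FLRM_H_PQT', 'CH_DSTL_C_UNW', 'SV_HRSM_H_DNF',
    'DV_FMVL_W_NUM', 'AN_NUTS_W_NRM', 'CH_DIAT_C_INC', 'HA_AATT_W_VEG',
    'HC_LVAR_C_NBA', 'SV_RESI_W_POS',
]
counts = prefix_counts(sample_ids)
for prefix, n in counts.most_common():
    print(prefix, n)
```

The same call on the full `indicator_ids` list would show how the indicators are spread across prefixes.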


# Obtain Indicator Specific Timeseries

In [8]:
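def process_record(record):
    """Flatten one DHS API record into a row keyed by time, country and admin_1."""
    # Core columns; the value and description are prefixed with the indicator ID
    # so that rows from different indicators do not collide.
    row = {'time': record['SurveyYear'],
           'country': record['CountryName'],
           'admin_1': record['CharacteristicLabel'],
           record['IndicatorId'] + '_value' : record['Value'],
           record['IndicatorId'] + '_description': record['Indicator'],
          }

    # Every remaining field is kept as an indicator-prefixed qualifier column.
    # SurveyYear is not excluded here, so it appears both as 'time' and as a qualifier.
    qualifiers = [k for k in record.keys() if k not in ['CountryName','Value','Indicator','IndicatorId', 'CharacteristicLabel']]

    for q in qualifiers:
        row[record['IndicatorId'] + '_' + q] = record[q]

    return row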
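
To make the row layout concrete, the sketch below runs a copy of `process_record` on a single hand-built record. The record is deliberately trimmed: it contains only a few of the fields a real API response carries, with values taken from the first row of the output shown later in this notebook, so the resulting row has fewer qualifier columns than a real one would.

```python
def process_record(record):
    # Copy of the function defined in the cell above, repeated so this block runs on its own.
    row = {'time': record['SurveyYear'],
           'country': record['CountryName'],
           'admin_1': record['CharacteristicLabel'],
           record['IndicatorId'] + '_value': record['Value'],
           record['IndicatorId'] + '_description': record['Indicator'],
          }
    excluded = ['CountryName', 'Value', 'Indicator', 'IndicatorId', 'CharacteristicLabel']
    for q in [k for k in record.keys() if k not in excluded]:
        row[record['IndicatorId'] + '_' + q] = record[q]
    return row


# Trimmed, hand-built record (a real response has many more fields).
record = {
    'SurveyYear': 2016,
    'CountryName': 'Ethiopia',
    'CharacteristicLabel': 'Tigray',
    'Value': 9.1,
    'Indicator': 'Women whose husband/partner frequently accuses...',
    'IndicatorId': 'DV_MCTL_W_ACC',
    'DataId': 2266009,
    'SurveyId': 'ET2016DHS',
}
row = process_record(record)
for key, value in row.items():
    print(key, '=', value)
```

Note that `SurveyYear` is not in the exclusion list, so the year shows up twice: once as `time` and once as `DV_MCTL_W_ACC_SurveyYear`.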

In [9]:
def get_indicator(indicator_id):
    # Use one query, including the subnational breakdown, for both the page count
    # and the page requests, so that TotalPages matches the pages actually fetched.
    url = f'http://api.dhsprogram.com/rest/dhs/v7/data?countryIds=ET&indicatorIds={indicator_id}&breakdown=Subnational'
    initial = requests.get(url).json()
    pages = initial['TotalPages']
    data = []
    for p in range(1,pages+1):
        d = requests.get(f'{url}&page={p}').json()
        # Flatten each record into one row of the output frame
        for d_ in d['Data']:
            data.append(process_record(d_))
    return pd.DataFrame(data)

In [10]:
data = get_indicator('DV_MCTL_W_ACC')

In [11]:
data.head()

,time,country,admin_1,DV_MCTL_W_ACC_value,DV_MCTL_W_ACC_description,DV_MCTL_W_ACC_DataId,DV_MCTL_W_ACC_SurveyId,DV_MCTL_W_ACC_IsPreferred,DV_MCTL_W_ACC_SDRID,DV_MCTL_W_ACC_Precision,...,DV_MCTL_W_ACC_IndicatorType,DV_MCTL_W_ACC_CharacteristicId,DV_MCTL_W_ACC_CharacteristicCategory,DV_MCTL_W_ACC_CharacteristicOrder,DV_MCTL_W_ACC_ByVariableLabel,DV_MCTL_W_ACC_DenominatorUnweighted,DV_MCTL_W_ACC_DenominatorWeighted,DV_MCTL_W_ACC_CIHigh,DV_MCTL_W_ACC_IsTotal,DV_MCTL_W_ACC_ByVariableId
0,2016,Ethiopia,Tigray,9.1,Women whose husband/partner frequently accuses...,2266009,ET2016DHS,1,DVMCTLWACC,1,...,I,406001,Region,1406001,,493.0,316.0,,0,0
1,2016,Ethiopia,Afar,6.4,Women whose husband/partner frequently accuses...,2263199,ET2016DHS,1,DVMCTLWACC,1,...,I,406002,Region,1406002,,387.0,43.0,,0,0
2,2016,Ethiopia,Amhara,6.7,Women whose husband/partner frequently accuses...,4608490,ET2016DHS,1,DVMCTLWACC,1,...,I,406003,Region,1406003,,572.0,1085.0,,0,0
3,2016,Ethiopia,Oromia,19.6,Women whose husband/partner frequently accuses...,3789240,ET2016DHS,1,DVMCTLWACC,1,...,I,406004,Region,1406004,,649.0,1746.0,,0,0
4,2016,Ethiopia,Somali,2.3,Women whose husband/partner frequently accuses...,4328282,ET2016DHS,1,DVMCTLWACC,1,...,I,406005,Region,1406005,,464.0,132.0,,0,0
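
The frame returned by `get_indicator` is wide, because every qualifier becomes its own column. For a quick look at the values themselves, it can help to keep only the key columns. The sketch below is one way to do that; `core_columns` is a hypothetical helper, the sample frame is rebuilt by hand from the five rows shown above rather than fetched, and which columns the Datamarts ingestion actually needs is not specified in this notebook.

```python
import pandas as pd


def core_columns(df, indicator_id):
    """Keep time, country, admin_1 and the indicator value, sorted by value (descending)."""
    value_col = f'{indicator_id}_value'
    keep = ['time', 'country', 'admin_1', value_col]
    return df[keep].sort_values(value_col, ascending=False).reset_index(drop=True)


# Hand-built from the five rows displayed above, with one qualifier column for illustration.
sample = pd.DataFrame({
    'time': [2016] * 5,
    'country': ['Ethiopia'] * 5,
    'admin_1': ['Tigray', 'Afar', 'Amhara', 'Oromia', 'Somali'],
    'DV_MCTL_W_ACC_value': [9.1, 6.4, 6.7, 19.6, 2.3],
    'DV_MCTL_W_ACC_SurveyId': ['ET2016DHS'] * 5,
})
tidy = core_columns(sample, 'DV_MCTL_W_ACC')
print(tidy)
```

On the real frame, the equivalent call would be `core_columns(data, 'DV_MCTL_W_ACC')`.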
