# Gender Data Indicators
A generic framework for extracting gender data from the World Bank API.

## World Bank API Usage Documentation
  
**API Documentation:** https://datahelpdesk.worldbank.org/knowledgebase/articles/898581<br>
**Base URL:** <a href="https://api.worldbank.org/v2/">https://api.worldbank.org/v2/</a><br>
**Parameters:**

| Parameter | Description | Example |
| --- | --- | --- |
| `date` | Date or date range by year, month or quarter | `date=2000:2010` |
| `format` | Output format: xml, json, jsonP, jsonstat | `format=json` |
| `downloadformat` | Download format: csv, xml, excel | `downloadformat=csv` |
| `page` | Page number of the result set | `page=2` |
| `per_page` | Number of results per page (default 50) | `per_page=25` |
| `mrv` | Most recent values based on the number specified | `mrv=5` |
| `mrnev` | Most recent non-empty values based on the number specified | `mrnev=5` |
| `gapfill` | Fills missing values by backtracking to the next available period (works with `mrv`) | `gapfill=Y` |
| `frequency` | Frequency of values: Q (quarterly), M (monthly), Y (yearly) (works with mrv) | `frequency=M` |
| `footnote` | Fetches footnote detail in data calls | `footnote=y` |
| `language` | Local language translations for some countries | `language=vi` |
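
As a rough illustration of how these parameters combine into a request, the sketch below assembles a query URL using only the standard library. The helper name `build_indicator_url` and the `https://` scheme are assumptions made for this example, and nothing is sent over the network.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.worldbank.org/v2"  # assumed https scheme


def build_indicator_url(countries, indicators, **params):
    """Assemble a World Bank API URL (hypothetical helper, builds the string only)."""
    country_part = ";".join(countries)      # several ISO codes are separated by ";"
    indicator_part = ";".join(indicators)   # same separator for indicator codes
    # Keep ":" readable so a range such as 2000:2010 is not percent-encoded.
    query = urlencode(params, safe=":")
    url = f"{BASE_URL}/country/{country_part}/indicator/{indicator_part}"
    return f"{url}?{query}" if query else url


example_url = build_indicator_url(
    ["LSO", "ZAF"],
    ["SG.GEN.PARL.ZS"],
    format="json",
    date="2000:2010",
    per_page=25,
)
print(example_url)
```

Keeping URL construction separate from the download step makes it easy to inspect a request before running it.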


In [7]:
# Imports
import pandas as pd # Dataframe
from pyjstat import pyjstat
import json # Parsing json to object
import requests # Making HTTP get requests
import pycountry
from ydata_profiling import ProfileReport

In [2]:
# Create Lists
commonwealth_countries = [
    "Botswana", "Cameroon", "Gabon", "Gambia", "Ghana", "Kenya",
    "Eswatini", "Lesotho", "Malawi", "Mauritius", "Mozambique",
    "Namibia", "Nigeria", "Rwanda", "Seychelles", "Sierra Leone",
    "South Africa", "Togo", "Uganda", "Tanzania, United Republic of", "Zambia",
    "Bangladesh", "Brunei Darussalam", "India", "Malaysia", "Maldives",
    "Pakistan", "Singapore", "Sri Lanka", "Antigua and Barbuda", "Bahamas",
    "Barbados", "Belize", "Canada", "Dominica", "Grenada", "Guyana",
    "Jamaica", "Saint Lucia", "Saint Kitts and Nevis", "Saint Vincent and The Grenadines",
    "Trinidad and Tobago", "Cyprus", "Malta", "United Kingdom", "Australia",
    "Fiji", "Kiribati", "Nauru", "New Zealand", "Papua New Guinea", "Samoa",
    "Solomon Islands", "Tonga", "Tuvalu", "Vanuatu"
]

gender_indicators = [
    "FIN21.T.D.2017.1","FIN21.T.D.2017.2","FIN21.T.D.2017","SG.GEN.PARL.ZS",
    "SG.GEN.MNST.ZS","SE.SEC.ENRR.FE","UIS.FGP.5T8.F600","SL.TLF.CACT.FE.ZS",
    "SG.LAW.NODC.HR","SG.OWN.LDAL.FE.ZS","SG.OPN.BANK.EQ","SG.CNT.SIGN.EQ",
    "SP.DYN.SMAM.FE","SP.DYN.SMAM.MA","SP.M15.2024.FE.ZS","SP.M18.2024.FE.ZS",
    "SG.VAW.1549.ME.ZS","SG.VAW.15PL.ME.ZS","SG.VAW.1549.LT.ME.ZS","SG.VAW.15PL.LT.ME.ZS",
    "SG.LEG.DVAW","SH.STA.MMRT","SH.STA.MMRT.NE","SP.DYN.LE00.FE.IN","SP.DYN.LE00.MA.IN","SP.DYN.LE00.IN"
]

# Map each country name to its ISO 3166-1 alpha-3 code, e.g. "Trinidad and Tobago" -> "TTO".
# Reference list of names and codes: https://www.iban.com/country-codes
country_iso_codes = {}
for country in commonwealth_countries:
    try:
        iso_code = pycountry.countries.get(name=country).alpha_3
        country_iso_codes[country] = iso_code
    except AttributeError:  # no match: get() returned None
        print(f"ISO code not found for {country}")
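
The lookup loop only prints the names it cannot resolve. One possible refinement, sketched here, is to keep an alias table for spellings that differ from the official ISO names and to return the unresolved names for inspection. The function `resolve_iso_codes`, the `ALIASES` table and the small `OFFICIAL` dictionary (a stand-in for the `pycountry` database so the example runs on its own) are all hypothetical.

```python
# Stand-in for the pycountry database: official name (lowercased) -> alpha-3 code.
OFFICIAL = {
    "trinidad and tobago": "TTO",
    "lesotho": "LSO",
    "south africa": "ZAF",
}

# Hypothetical alias table: alternative spelling -> official name.
ALIASES = {
    "trinidad & tobago": "trinidad and tobago",
}


def resolve_iso_codes(names, official=OFFICIAL, aliases=ALIASES):
    """Return (resolved, unresolved): a name -> code dict and a list of misses."""
    resolved, unresolved = {}, []
    for name in names:
        key = name.strip().lower()      # compare case-insensitively
        key = aliases.get(key, key)     # map a known alias to the official name
        code = official.get(key)
        if code is None:
            unresolved.append(name)
        else:
            resolved[name] = code
    return resolved, unresolved


resolved, unresolved = resolve_iso_codes(["Lesotho", "Trinidad & Tobago", "Atlantis"])
print(resolved)
print(unresolved)
```

With `pycountry`, the `official.get(key)` step would be replaced by a call to `pycountry.countries.get(name=...)`.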

In [3]:
# Example request:
# http://api.worldbank.org/v2/country/LSO;ZAF/indicator/SP.POP.TOTL;SG.GEN.PARL.ZS?format=jsonstat&source=14
def download_indicators(country_list, indicator_list):
    """Download the given indicators for the given countries as a DataFrame."""
    # The API accepts several countries and indicators per request, separated by ";".
    # Countries without a resolved ISO code are skipped instead of raising KeyError.
    isocode_filter = [country_iso_codes[c] for c in country_list if c in country_iso_codes]
    api_url = (
        f'http://api.worldbank.org/v2/country/{";".join(isocode_filter)}'
        f'/indicator/{";".join(indicator_list)}?format=jsonstat&gapfill=N&source=14'
    )
    dataset = pyjstat.Dataset.read(api_url)
    df = dataset.write('dataframe')
    return df
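
`download_indicators` places every country and indicator code in a single URL. If a request ever becomes too large, one option is to split the indicator list into batches and call the function once per batch. Whether and where the API enforces a limit is not covered here, so the batch size below is arbitrary and `chunked` is a hypothetical helper.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


indicator_codes = ["SG.GEN.PARL.ZS", "SG.GEN.MNST.ZS", "SE.SEC.ENRR.FE",
                   "SL.TLF.CACT.FE.ZS", "SP.DYN.LE00.FE.IN"]

batches = list(chunked(indicator_codes, 2))  # batch size 2 chosen only for the demo
for batch in batches:
    print(";".join(batch))  # the indicator part of one request URL
```

Each batch could then be passed to `download_indicators` and the resulting frames combined with `pd.concat`.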

In [4]:
df = download_indicators(commonwealth_countries, gender_indicators)

In [5]:
df.head()

| | Country | Series | Year | value |
| --- | --- | --- | --- | --- |
| 0 | Antigua and Barbuda | Borrowed to start, operate, or expand a farm o... | 1960 | |
| 1 | Antigua and Barbuda | Borrowed to start, operate, or expand a farm o... | 1961 | |
| 2 | Antigua and Barbuda | Borrowed to start, operate, or expand a farm o... | 1962 | |
| 3 | Antigua and Barbuda | Borrowed to start, operate, or expand a farm o... | 1963 | |
| 4 | Antigua and Barbuda | Borrowed to start, operate, or expand a farm o... | 1964 | |


In [8]:
print(f'Data rows : {len(df)}')

Data rows : 91728
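
The row count can be sanity-checked by hand. The frame appears to hold one row per combination of country, indicator and year, so the total should factor into those three dimension sizes. Assuming all 56 country names resolved and all 26 indicators were returned, 91728 rows imply 63 distinct years. Since the first year shown is 1960, that would correspond to 1960 through 2022, which is an inference from the numbers and not something the output states.

```python
n_countries = 56    # length of commonwealth_countries
n_indicators = 26   # length of gender_indicators
n_rows = 91728      # reported by len(df)

# Rows per (country, indicator) pair, i.e. the implied number of years.
n_years, remainder = divmod(n_rows, n_countries * n_indicators)
print(n_years, remainder)

first_year = 1960                       # first Year value in df.head()
last_year = first_year + n_years - 1    # inferred, not shown in the output
print(last_year)
```

A zero remainder is consistent with a complete country by indicator by year grid, which also explains why many `value` cells are empty.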


In [9]:
profile = ProfileReport(df, title="Profiling Report")

In [10]:
profile.to_file("report.html")
