Living conditions contribute to the level of education that children and young adults receive today. Inequalities between countries shape how wealthy and how well educated their populations can become.
One variable to consider when looking at female literacy rates is a country's income level.

### Income Classifications

The World Bank assigns the world's economies to four income groups based on Gross National Income (GNI) per capita in current USD.


|Group | July 1, 2021 (new) | July 1, 2020 (old) |
|-------|-------|--------|
|Low income | < 1,046 | < 1,036 |
|Lower-middle income | 1,046 – 4,095 | 1,036 – 4,045 |
|Upper-middle income | 4,096 – 12,695 | 4,046 – 12,535 |
|High income | > 12,695 | > 12,535 |

[Source](https://blogs.worldbank.org/opendata/new-world-bank-country-classifications-income-level-2021-2022)
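As a minimal sketch of how these thresholds translate into code, the function below maps a GNI per capita value to one of the four groups. It uses the July 1, 2021 column and assumes that any value below 1,046 counts as low income; the function name `classify_income` is hypothetical and is not part of the notebook or of any World Bank library.

```python
def classify_income(gni_per_capita: float) -> str:
    """Map GNI per capita (current USD) to an income group.

    Thresholds follow the July 1, 2021 column of the table above.
    Boundary handling (below 1,046 = low income) is an assumption.
    """
    if gni_per_capita < 1046:
        return "Low income"
    if gni_per_capita <= 4095:
        return "Lower-middle income"
    if gni_per_capita <= 12695:
        return "Upper-middle income"
    return "High income"


# Illustrative values only, not real country figures
for value in (500, 2500, 9000, 40000):
    print(value, classify_income(value))
```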

We read in the income levels for different countries from the World Bank API through RapidAPI.
By default the response loads as XML, so we added `?format=json` to the end of the URL to receive it as JSON instead.
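One way to keep the query string readable is to build it from a dictionary rather than by hand. The sketch below assumes the same endpoint and the `format` and `page` parameters used in this notebook; the helper name `build_country_url` is hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.worldbank.org/v2/country"


def build_country_url(page: int, fmt: str = "json") -> str:
    """Return the country endpoint URL for one page in the given format."""
    query = urlencode({"format": fmt, "page": page})
    return f"{BASE_URL}?{query}"


# One URL per page; six pages is the count used in this notebook
urls = [build_country_url(p) for p in range(1, 7)]
print(urls[0])
```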

In [1]:
import requests
import pandas as pd
import json
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns


# The country endpoint is paginated, so we request each of the 6 pages in turn
url_pages = [1, 2, 3, 4, 5, 6]
page_nos = []

# RapidAPI request headers (substitute your own key)
headers = {
    "X-RapidAPI-Key": "<YOUR_RAPIDAPI_KEY>",
    "X-RapidAPI-Host": "world-bank-gdp.p.rapidapi.com",
}

for page in url_pages:
    url = f"https://api.worldbank.org/v2/country?format=json&page={page}"
    response = requests.get(url, headers=headers)
    j = response.json()
    page_nos.append(j)
    print(response.status_code)

# Response is successful

200
200
200
200
200
200


In [2]:
# response.text

page_nos


[[{'page': 1, 'pages': 6, 'per_page': '50', 'total': 299},
  [{'id': 'ABW',
    'iso2Code': 'AW',
    'name': 'Aruba',
    'region': {'id': 'LCN',
     'iso2code': 'ZJ',
     'value': 'Latin America & Caribbean '},
    'adminregion': {'id': '', 'iso2code': '', 'value': ''},
    'incomeLevel': {'id': 'HIC', 'iso2code': 'XD', 'value': 'High income'},
    'lendingType': {'id': 'LNX', 'iso2code': 'XX', 'value': 'Not classified'},
    'capitalCity': 'Oranjestad',
    'longitude': '-70.0167',
    'latitude': '12.5167'},
   {'id': 'AFE',
    'iso2Code': 'ZH',
    'name': 'Africa Eastern and Southern',
    'region': {'id': 'NA', 'iso2code': 'NA', 'value': 'Aggregates'},
    'adminregion': {'id': '', 'iso2code': '', 'value': ''},
    'incomeLevel': {'id': 'NA', 'iso2code': 'NA', 'value': 'Aggregates'},
    'lendingType': {'id': '', 'iso2code': '', 'value': 'Aggregates'},
    'capitalCity': '',
    'longitude': '',
    'latitude': ''},
   {'id': 'AFG',
    'iso2Code': 'AF',
    'name': 'Afghanis

In [3]:
inc_df = pd.DataFrame(columns=['name', 'incomeLevel'])


In [4]:
# Extract each country's name and income level from the JSON pages
# and add them as rows to the dataframe named 'inc_df'

count = 0

for page in page_nos:
    # Each page is a [metadata, records] pair; the records sit at index 1
    for inc in page[1]:
        inc_df.loc[count] = [inc['name'], inc['incomeLevel']['value']]
        count += 1

inc_df

| | name | incomeLevel |
|---|---|---|
| 0 | Aruba | High income |
| 1 | Africa Eastern and Southern | Aggregates |
| 2 | Afghanistan | Low income |
| 3 | Africa | Aggregates |
| 4 | Africa Western and Central | Aggregates |
| ... | ... | ... |
| 294 | Sub-Saharan Africa excluding South Africa and ... | Aggregates |
| 295 | Yemen, Rep. | Low income |
| 296 | South Africa | Upper middle income |
| 297 | Zambia | Low income |
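The nested loop works, but the same table can probably be built in one step with `pd.json_normalize`, which flattens nested dictionaries into dotted column names. The sketch below runs on a small hand-written sample that mirrors the `[metadata, records]` shape of each page; `sample_pages` and `inc_sketch` are hypothetical names, and the sample holds only two records from the response shown earlier.

```python
import pandas as pd

# Hand-written sample in the same [metadata, records] shape as one API page
sample_pages = [
    [
        {"page": 1, "pages": 1, "per_page": "50", "total": 2},
        [
            {"name": "Aruba",
             "incomeLevel": {"id": "HIC", "iso2code": "XD", "value": "High income"}},
            {"name": "Africa Eastern and Southern",
             "incomeLevel": {"id": "NA", "iso2code": "NA", "value": "Aggregates"}},
        ],
    ]
]

# Collect the records from every page into a single flat list
records = [rec for _meta, recs in sample_pages for rec in recs]

# Nested keys become dotted columns such as 'incomeLevel.value'
flat = pd.json_normalize(records)
inc_sketch = flat[["name", "incomeLevel.value"]].rename(
    columns={"incomeLevel.value": "incomeLevel"}
)
print(inc_sketch)
```

Building the frame from a list in one call also avoids growing a dataframe row by row with `.loc`, which tends to be slow on larger inputs.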


In [5]:
# Next extract data for our top 5 and bottom 5 countries 
# and their literacy rates
# Read in the cleaned csv data for child marriage 

read_cm_df = pd.read_csv('../literacy-project/clean_child_marriage_merged_w_literacy.csv')

In [6]:
read_cm_df

| | Country Name | Country Code | Status | Mean Adult Female Literacy Rate (%) | Median Female Child Marriage Rate (%) |
|---|---|---|---|---|---|
| 0 | Chad | TCD | Lowest | 15.379128 | 24.556 |
| 1 | Afghanistan | AFG | Lowest | 19.80931 | 16.3 |
| 2 | Mali | MLI | Lowest | 20.470424 | 42.1 |
| 3 | Niger | NER | Lowest | 20.530956 | 21.1555 |
| 4 | Guinea | GIN | Lowest | 22.271088 | 28.1 |
| 5 | Cuba | CUB | Highest | 99.769315 | 12.362 |
| 6 | Lithuania | LTU | Highest | 99.777059 | 0.15 |
| 7 | Estonia | EST | Highest | 99.849846 | 4.8805 |
| 8 | Latvia | LVA | Highest | 99.858515 | 4.8805 |
| 9 | Korea, Dem. People's Rep. | PRK | Highest | 99.997612 | 0.05 |


In [7]:
# Work on a copy of the child marriage dataframe

cm_copy = read_cm_df.copy()

cm_copy

| | Country Name | Country Code | Status | Mean Adult Female Literacy Rate (%) | Median Female Child Marriage Rate (%) |
|---|---|---|---|---|---|
| 0 | Chad | TCD | Lowest | 15.379128 | 24.556 |
| 1 | Afghanistan | AFG | Lowest | 19.80931 | 16.3 |
| 2 | Mali | MLI | Lowest | 20.470424 | 42.1 |
| 3 | Niger | NER | Lowest | 20.530956 | 21.1555 |
| 4 | Guinea | GIN | Lowest | 22.271088 | 28.1 |
| 5 | Cuba | CUB | Highest | 99.769315 | 12.362 |
| 6 | Lithuania | LTU | Highest | 99.777059 | 0.15 |
| 7 | Estonia | EST | Highest | 99.849846 | 4.8805 |
| 8 | Latvia | LVA | Highest | 99.858515 | 4.8805 |
| 9 | Korea, Dem. People's Rep. | PRK | Highest | 99.997612 | 0.05 |


In [8]:
# We will match countries in our income level df 
# to countries in child marriage df 
# Now we can compare income levels/status/lit. rate & child marriage

match_country = inc_df['name'].isin(cm_copy['Country Name'])


In [9]:
# Filter inc_df with the boolean mask and store the result in a new variable
# This way we are only looking at countries in our top 5 and bottom 5

new_inc_df = inc_df[match_country]

new_inc_df

| | name | incomeLevel |
|---|---|---|
| 2 | Afghanistan | Low income |
| 63 | Cuba | Upper middle income |
| 94 | Estonia | High income |
| 109 | Guinea | Low income |
| 168 | Lithuania | High income |
| 170 | Latvia | High income |
| 184 | Mali | Low income |
| 200 | Niger | Low income |
| 222 | Korea, Dem. People's Rep. | Low income |
| 260 | Chad | Low income |


In [15]:
# Rename the 'name' and 'incomeLevel' columns to 'Country Name' and 'Income Level' using a list
# This makes matching countries easier.

new_inc_df_cols = ['Country Name', 'Income Level']

new_inc_df.columns = new_inc_df_cols

new_inc_df


| | Country Name | Income Level |
|---|---|---|
| 2 | Afghanistan | Low income |
| 63 | Cuba | Upper middle income |
| 94 | Estonia | High income |
| 109 | Guinea | Low income |
| 168 | Lithuania | High income |
| 170 | Latvia | High income |
| 184 | Mali | Low income |
| 200 | Niger | Low income |
| 222 | Korea, Dem. People's Rep. | Low income |
| 260 | Chad | Low income |


In [11]:
# Next we will match the income levels to the countries
# and add the income level to the child marriage table.

merged_df = cm_copy.merge(new_inc_df, on=['Country Name'])

merged_df

| | Country Name | Country Code | Status | Mean Adult Female Literacy Rate (%) | Median Female Child Marriage Rate (%) | Income Level |
|---|---|---|---|---|---|---|
| 0 | Chad | TCD | Lowest | 15.379128 | 24.556 | Low income |
| 1 | Afghanistan | AFG | Lowest | 19.80931 | 16.3 | Low income |
| 2 | Mali | MLI | Lowest | 20.470424 | 42.1 | Low income |
| 3 | Niger | NER | Lowest | 20.530956 | 21.1555 | Low income |
| 4 | Guinea | GIN | Lowest | 22.271088 | 28.1 | Low income |
| 5 | Cuba | CUB | Highest | 99.769315 | 12.362 | Upper middle income |
| 6 | Lithuania | LTU | Highest | 99.777059 | 0.15 | High income |
| 7 | Estonia | EST | Highest | 99.849846 | 4.8805 | High income |
| 8 | Latvia | LVA | Highest | 99.858515 | 4.8805 | High income |
| 9 | Korea, Dem. People's Rep. | PRK | Highest | 99.997612 | 0.05 | Low income |
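Because the join relies on country names being spelled identically in both tables, it can be worth checking that no row was silently dropped. The sketch below is one possible audit using `how="left"` with `indicator=True`. The two small frames are hypothetical stand-ins for `cm_copy` and `new_inc_df`, and the income row for Korea, Dem. People's Rep. is deliberately left out to show what an unmatched country looks like.

```python
import pandas as pd

# Hypothetical stand-in for the literacy table
left = pd.DataFrame({
    "Country Name": ["Chad", "Cuba", "Korea, Dem. People's Rep."],
    "Status": ["Lowest", "Highest", "Highest"],
})

# Hypothetical stand-in for the income table, with one country left out on purpose
right = pd.DataFrame({
    "Country Name": ["Chad", "Cuba"],
    "Income Level": ["Low income", "Upper middle income"],
})

# A left join keeps every literacy row; '_merge' records where each row was found
audit = left.merge(
    right, on="Country Name", how="left", indicator=True, validate="one_to_one"
)

# Rows tagged 'left_only' had no matching name in the income table
unmatched = audit.loc[audit["_merge"] == "left_only", "Country Name"].tolist()
print(unmatched)
```

An empty `unmatched` list would suggest that every country found its income level; the default inner join used above drops such rows without warning.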


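The table above shows that all five countries with the lowest female literacy rates are classified as low income. Among the five countries with the highest rates, three are high income and Cuba is upper middle income; the exception is Korea, Dem. People's Rep., which is classified as low income despite its high literacy rate.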
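The relationship between literacy status and income group can be tallied with a cross-tabulation. The sketch below re-enters the `Status` and `Income Level` columns from the merged table by hand so that it runs on its own; `summary_df` and `counts` are hypothetical names, and in the notebook the same call could presumably be made on `merged_df` directly.

```python
import pandas as pd

# Status and income level copied from the merged table above
summary_df = pd.DataFrame({
    "Country Name": [
        "Chad", "Afghanistan", "Mali", "Niger", "Guinea",
        "Cuba", "Lithuania", "Estonia", "Latvia", "Korea, Dem. People's Rep.",
    ],
    "Status": ["Lowest"] * 5 + ["Highest"] * 5,
    "Income Level": ["Low income"] * 5 + [
        "Upper middle income", "High income", "High income",
        "High income", "Low income",
    ],
})

# Count countries for each combination of literacy status and income group
counts = pd.crosstab(summary_df["Status"], summary_df["Income Level"])
print(counts)
```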

In [13]:
# We have now added the income levels to the other table
# We will save the new table as a csv 


merged_df.to_csv('merged_tables.csv', index=False)