<a href="https://colab.research.google.com/github/cbarnes5/world_development_explorer/blob/main/wdx_analysis_resubmit.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# The Effects of Women’s Education on Health and the Economy

**Author**: Crista Barnes

Girls' education correlates with many other national indicators of progress. An increase in girls' education normally points to a higher GDP and income, because more women are able to contribute to economic development and wealth creation. The reverse also holds: in a wealthier country, more women have the ability to send girls to school. In the same fashion, girls' education normally correlates with better health metrics, as women with higher-paying jobs (made possible by education) can typically afford better health care. In this blog we will explore metrics used to measure the state of girls' education across the world, with specific cases in Mali and Chile. We will also compare the state of girls' education with other national indicators and consider factors that affect girls' ability to receive an education.

## Where is the Data From? 

To establish a correlation between girls' education and other national indicators, we will use data provided by The World Bank and charts published on the World Development Explorer website, which uses The World Bank’s data ([World Development Explorer](http://www.worlddev.xyz/)). The World Bank consists of five institutions from both the public and private sector: The International Bank for Reconstruction and Development (IBRD), The International Development Association (IDA), The International Finance Corporation (IFC), The Multilateral Investment Guarantee Agency (MIGA), and The International Centre for Settlement of Investment Disputes (ICSID). Together, their mission is to end extreme poverty and promote shared prosperity ([World Bank Group-Home](https://www.worldbank.org/en/home)). The World Bank produces the World Development Indicators (WDI), a cross-country comparable data set on development. The indicators cover people, the environment, and the economy. There are more than 1,400 socioeconomic indicators for more than 200 countries, spanning more than 50 years ([WDI-Home](https://datatopics.worldbank.org/world-development-indicators/)).





## Exploring the Data

First, to understand the state of girls' education across the world, we will look specifically at girls' enrollment in primary education (% female, net) and the literacy rate of 15-24 year olds (% female). These variables (particularly literacy rate) will be used to summarize the state of girls' education for the remainder of the blog. By looking at Plot 1 we can see how countries performed on these metrics in 2010. Note that the majority of countries are between 90 and 100% for girls' primary education enrollment, and that there is more variation in literacy rates among 15-24 year old females (the majority of countries report 80 to 100%). This could indicate an increase in girls' education over time, or perhaps that girls enroll in school and then drop out while still young.

In [2]:
import pandas as pd
import plotly.express as px
#Import data for Plots 1-3
df=pd.read_csv("https://raw.githubusercontent.com/cbarnes5/world_development_explorer/main/Data/Plot1_data.csv")
df.head()

| Unnamed: 0.1 | Unnamed: 0 | Year | SE.PRM.NENR.FE | SE.ADT.1524.LT.FE.ZS | SP.POP.TOTL | Country Code | Country Name | Region | Income Group | Lending Type |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0 | 2011 | 88.24009 | 98.856239 | 2905195 | ALB | Albania | Europe & Central Asia | Upper middle income | IBRD |
| 1 | 1 | 2012 | 90.952 | 99.020187 | 2900401 | ALB | Albania | Europe & Central Asia | Upper middle income | IBRD |
| 2 | 2 | 2011 | 95.76167 | 99.87619 | 2876538 | ARM | Armenia | Europe & Central Asia | Upper middle income | IBRD |
| 3 | 3 | 2017 | 91.91511 | 99.874161 | 2944809 | ARM | Armenia | Europe & Central Asia | Upper middle income | IBRD |
| 4 | 4 | 2010 | 83.84758 | 99.933327 | 9054332 | AZE | Azerbaijan | Europe & Central Asia | Upper middle income | IBRD |


*Variables*

- Year = Year
- SE.PRM.NENR.FE = School Enrollment, Primary (% female, net)
- SE.ADT.1524.LT.FE.ZS = Literacy Rate (% female age 15-24)
- SP.POP.TOTL = Total Population of Country
- Country Code = Country Abbreviation
- Country Name = Country Name
- Region = Region
- Income Group = World Bank Income Classification
- Lending Type = World Bank Loan Type

In [3]:
#Change column names to easily callable variables
#(SE.PRM.NENR.FE is enrollment and SE.ADT.1524.LT.FE.ZS is literacy, in that column order)
df.columns=['Index', 'Year', 'Enrollment', 'LitRate', 'Pop', 'CountryCode', 'CountryName', 'Region', 'IncomeGroup', 'LendingType']
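
Assigning column names by position is easy to get wrong when two indicator columns sit next to each other. As a hedged alternative, the sketch below renames by World Bank code instead. `RENAMES` and `tidy_columns` are hypothetical names introduced here for illustration, and the single row is copied from the preview above rather than loaded from the CSV. In the World Bank's naming, `SE.PRM.NENR.FE` is the net primary enrollment indicator for girls and `SE.ADT.1524.LT.FE.ZS` is the youth female literacy indicator.

```python
import pandas as pd

# World Bank codes mapped to short names (the short names are this blog's own)
RENAMES = {
    "SE.PRM.NENR.FE": "Enrollment",
    "SE.ADT.1524.LT.FE.ZS": "LitRate",
    "SP.POP.TOTL": "Pop",
    "Country Code": "CountryCode",
    "Country Name": "CountryName",
    "Income Group": "IncomeGroup",
    "Lending Type": "LendingType",
}

def tidy_columns(frame: pd.DataFrame) -> pd.DataFrame:
    """Drop leftover 'Unnamed' index columns, then rename by column name."""
    keep = [c for c in frame.columns if not str(c).startswith("Unnamed")]
    return frame[keep].rename(columns=RENAMES)

# One row copied from the preview above (Albania, 2011)
raw = pd.DataFrame({
    "Unnamed: 0": [0], "Year": [2011],
    "SE.PRM.NENR.FE": [88.24009], "SE.ADT.1524.LT.FE.ZS": [98.856239],
    "SP.POP.TOTL": [2905195], "Country Code": ["ALB"],
    "Country Name": ["Albania"], "Region": ["Europe & Central Asia"],
    "Income Group": ["Upper middle income"], "Lending Type": ["IBRD"],
})

clean = tidy_columns(raw)
print(list(clean.columns))
```

Because the mapping is keyed by name, it keeps working if the column order in the CSV changes, and any column missing from the mapping is simply left as it is.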

In [4]:
#Filter the dataframe to 2010
df1=df[df.Year==2010]
#Create Plot 1
fig = px.scatter(df1, x=df1.LitRate, y=df1.Enrollment, color=df1.CountryName, size=df1.Pop, labels={
                     "LitRate": "Literacy Rate, Youth Female (% of females ages 15-24)",
                     "Enrollment": "School Enrollment, Primary, Female (% net)",
                     "CountryName": "Country Name"}, title='Plot 1')
fig.show()
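
To back up a visual reading of a scatter plot like Plot 1 with numbers, one option is to count what share of countries falls inside a given band for each metric. The sketch below is only an illustration: `share_in_band` is a hypothetical helper, and the `toy` frame holds made-up values, not World Bank data. With the real data, the same call could be applied to `df1.Enrollment` and `df1.LitRate`.

```python
import pandas as pd

def share_in_band(values: pd.Series, low: float, high: float) -> float:
    """Return the fraction of non-missing values that lie in [low, high]."""
    valid = values.dropna()
    if valid.empty:
        return float("nan")
    return float(valid.between(low, high).mean())

# Made-up illustrative values, NOT World Bank data
toy = pd.DataFrame({
    "Enrollment": [95.0, 92.0, 88.0, 99.0, None],
    "LitRate": [99.0, 85.0, 45.0, 97.0, 70.0],
})

enrollment_share = share_in_band(toy["Enrollment"], 90, 100)
literacy_share = share_in_band(toy["LitRate"], 80, 100)
print(f"Enrollment in 90-100%: {enrollment_share:.0%}")
print(f"Literacy in 80-100%: {literacy_share:.0%}")
```

Missing values are dropped before the share is computed, so a country that did not report a metric in the chosen year does not count against the band.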

Grouping all the countries by income level can provide insight into how income could be related to girls' education (Plot 2). Note that the three countries with the lowest literacy rates among females aged 15-24 are all low income. One explanation for the correlation could be that women in low-income countries are required to work to support their families and therefore cannot attend school.

In [5]:
#Create Plot 2
fig = px.scatter(df1, x=df1.LitRate, y=df1.Enrollment, color=df1.IncomeGroup, size=df1.Pop, labels={
                     "LitRate": "Literacy Rate, Youth Female (% of females ages 15-24)",
                     "Enrollment": "School Enrollment, Primary, Female (% net)",
                     "IncomeGroup": "Income Group"}, title='Plot 2')
fig.show()
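
The income-group comparison can also be summarized in a small table. The sketch below groups by income level and reports the number of countries, the median, and the minimum literacy rate per group. It assumes the renamed columns `IncomeGroup` and `LitRate`, and the `toy` frame contains made-up values, not World Bank data.

```python
import pandas as pd

# Made-up illustrative values, NOT World Bank data
toy = pd.DataFrame({
    "CountryName": ["A", "B", "C", "D", "E", "F"],
    "IncomeGroup": ["Low income", "Low income", "Lower middle income",
                    "Upper middle income", "High income", "High income"],
    "LitRate": [40.0, 60.0, 85.0, 97.0, 99.0, 100.0],
})

# One row per income group, ordered from lowest to highest median literacy
summary = (
    toy.groupby("IncomeGroup")["LitRate"]
       .agg(["count", "median", "min"])
       .sort_values("median")
)
print(summary)
```

The median is used here instead of the mean so that a single very low value does not dominate a small group.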

Specifically, we can see the correlation between income level and literacy rate when directly comparing Mali and Chile. Mali is classified as a low-income country and Chile as a high-income country. As Plot 3 indicates, Chile's literacy rate among women aged 15-24 is higher than Mali’s by roughly 35%.

In [6]:
#Filter data to Chile and Mali
df3=df1[(df1.CountryCode=='CHL') | (df1.CountryCode=='MLI')]
#Create Plot 3
fig = px.bar(df3, x=df3.CountryName, y=df3.LitRate, labels={
                     "LitRate": "Literacy Rate, Youth Female (% of females ages 15-24)",
                     "CountryName": "Country Name"}, title='Plot 3')
fig.show()

## Girls' Education and a Nation's Economy

To establish a correlation between girls' education and a nation's economy, we can analyze the GDP per capita indicator. As Plot 4 shows for a variety of countries, every country on the plot with a GDP per capita approaching 20,000 (PPP, current international $) or higher has close to a 100% literacy rate among females aged 15-24.

In [7]:
#Import data for Plots 4-6
df=pd.read_csv("https://raw.githubusercontent.com/cbarnes5/world_development_explorer/main/Data/Plot4_data.csv")
df.head()

| Unnamed: 0.1 | Unnamed: 0 | Year | SE.ADT.1524.LT.FE.ZS | SP.POP.TOTL | NY.GDP.PCAP.PP.CD | Country Code | Country Name | Region | Income Group | Lending Type |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0 | 2011 | 32.11322 | 30117413.0 | 1699.487997 | AFG | Afghanistan | South Asia | Low income | IDA |
| 1 | 1 | 2018 | 56.254749 | 37172386.0 | 2083.321897 | AFG | Afghanistan | South Asia | Low income | IDA |
| 2 | 2 | 2011 | 98.856239 | 2905195.0 | 10207.733502 | ALB | Albania | Europe & Central Asia | Upper middle income | IBRD |
| 3 | 3 | 2012 | 99.020187 | 2900401.0 | 10526.318974 | ALB | Albania | Europe & Central Asia | Upper middle income | IBRD |
| 4 | 4 | 2018 | 99.629997 | 2866376.0 | 13974.011607 | ALB | Albania | Europe & Central Asia | Upper middle income | IBRD |


*Variables*

- Year = Year
- SE.ADT.1524.LT.FE.ZS = Literacy Rate (% female age 15-24)
- SP.POP.TOTL = Total Population of Country
- NY.GDP.PCAP.PP.CD = GDP per capita, PPP (current international $)
- Country Code = Country Abbreviation 
- Country Name = Country Name
- Region = Region
- Lending Type =  World Bank Loan Type

In [8]:
#Change column names to easily callable variables 
df.columns=['Index', 'Year', 'LitRate', 'Pop', 'GDP', 'CountryCode', 'CountryName', 'Region', 'IncomeGroup', 'LendingType']

In [9]:
#Filter data to 2010
df4=df[df.Year==2010]
#Create Plot 4
fig = px.scatter(df4, x=df4.LitRate, y=df4.GDP, color=df4.CountryName, labels={
                     "LitRate": "Literacy Rate, Youth Female (% of females ages 15-24)",
                     "GDP": "GDP per capita, PPP (current international $)",
                     "CountryName": "Country Name"}, title='Plot 4')
fig.show()
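
A scatter plot shows the direction of a relationship but not its strength. One hedged way to quantify it is a Spearman rank correlation, which suits this case because literacy saturates near 100% while GDP per capita keeps growing, so the relationship is not linear. This is an addition for illustration, not part of the original analysis: `spearman` is a hypothetical helper that computes the Pearson correlation of the ranks, and the `toy` values are made up, not World Bank data.

```python
import pandas as pd

def spearman(x: pd.Series, y: pd.Series) -> float:
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    paired = pd.concat([x, y], axis=1).dropna()  # keep complete pairs only
    ranks_x = paired.iloc[:, 0].rank()
    ranks_y = paired.iloc[:, 1].rank()
    return float(ranks_x.corr(ranks_y))

# Made-up illustrative values, NOT World Bank data
toy = pd.DataFrame({
    "LitRate": [35.0, 60.0, 80.0, 95.0, 99.0, 99.5],
    "GDP": [1500.0, 2500.0, 2000.0, 9000.0, 20000.0, 45000.0],
})

rho = spearman(toy["LitRate"], toy["GDP"])
print(round(rho, 3))
```

A value close to 1 means that countries ranked higher on literacy also tend to rank higher on GDP per capita. It says nothing about which one causes the other.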

We can also look at Mali’s GDP per capita and girls' literacy data over the years 2010-2018 and see that as Mali's literacy rate among 15-24 year old women increases, so does its GDP per capita (Plots 5 & 6). This is likely because more workers are able to do jobs that require an education and, in reverse, women with higher-paying jobs can afford to send their girls to school to receive an education.

In [10]:
#Filter data to specifically Mali (from imported data, so we have the data over all years, not just 2010)
df5=df[df.CountryCode=='MLI']
#Create Plot 5
fig = px.line(df5, x=df5.Year, y=df5.GDP, color=df5.CountryName, labels={
                     "Year": "Year",
                     "GDP": "GDP per capita, PPP (current international $)",
                     "CountryName": "Country Name"}, title='Plot 5')
fig.show()

In [11]:
#Create Plot 6
fig = px.line(df5, x=df5.Year, y=df5.LitRate, color=df5.CountryName, labels={
                     "Year": "Year",
                     "LitRate": "Literacy Rate, Youth Female (% of females ages 15-24)",
                     "CountryName": "Country Name"}, title='Plot 6')
fig.show()
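
Reading two line charts side by side can be complemented with a simple numeric check that both series rise over the period. The sketch below sorts the rows by year, compares the last observation with the first, and tests whether each series increases at every step. The `toy` values describe a single hypothetical country and are made up, not World Bank data. With the real data, `df5` would take the place of `toy`.

```python
import pandas as pd

# Made-up illustrative values for one hypothetical country, NOT World Bank data
toy = pd.DataFrame({
    "Year": [2018, 2011, 2015],
    "GDP": [2300.0, 1900.0, 2100.0],
    "LitRate": [43.0, 39.0, 41.0],
})

# Order the observations in time before comparing them
trend = toy.sort_values("Year").set_index("Year")[["GDP", "LitRate"]]

change = trend.iloc[-1] - trend.iloc[0]   # last year minus first year
both_rise = bool((change > 0).all())      # did both series end higher?
steady = bool(trend["GDP"].is_monotonic_increasing
              and trend["LitRate"].is_monotonic_increasing)

print(change)
print("Both higher at the end:", both_rise)
print("Increasing at every step:", steady)
```

Sorting first matters because indicator data is often reported only for some years, and the rows are not guaranteed to arrive in chronological order.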

## Girls' Education and Health

Women’s literacy not only correlates with GDP, it is also associated with women's life expectancy and the health of their families. As the following plots show, Chile, a country with a higher literacy rate (Plot 3), has more women aged 80 and over and a lower stillbirth rate (per 1,000 total births) than Mali in 2010. A potential explanation is that educated women are afforded more job opportunities and can therefore better access health care (including prenatal care). Better access to health care correlates with longer life expectancy and fewer stillbirths.

In [12]:
#Import data for Plot 7
df=pd.read_csv("https://raw.githubusercontent.com/cbarnes5/world_development_explorer/main/Data/Plot7_data.csv")
df.head()

| Unnamed: 0.1 | Unnamed: 0 | Year | SE.ADT.1524.LT.FE.ZS | SP.POP.TOTL | SH.DYN.STLB | Country Code | Country Name | Region | Income Group | Lending Type |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0 | 2011 | 32.11322 | 30117413.0 | 32.113551 | AFG | Afghanistan | South Asia | Low income | IDA |
| 1 | 1 | 2018 | 56.254749 | 37172386.0 | 28.911948 | AFG | Afghanistan | South Asia | Low income | IDA |
| 2 | 2 | 2011 | 98.856239 | 2905195.0 | 4.363312 | ALB | Albania | Europe & Central Asia | Upper middle income | IBRD |
| 3 | 3 | 2012 | 99.020187 | 2900401.0 | 4.252687 | ALB | Albania | Europe & Central Asia | Upper middle income | IBRD |
| 4 | 4 | 2018 | 99.629997 | 2866376.0 | 4.017681 | ALB | Albania | Europe & Central Asia | Upper middle income | IBRD |


*Variables*

- Year = Year
- SE.ADT.1524.LT.FE.ZS = Literacy Rate (% female age 15-24)
- SP.POP.TOTL = Total Population of Country
- SH.DYN.STLB = Stillbirth Rate (per 1,000 total births)
- Country Code = Country Abbreviation 
- Country Name = Country Name
- Region = Region
- Lending Type =  World Bank Loan Type

In [13]:
#Change column names to easily callable variables 
df.columns=['Index', 'Year', 'LitRate', 'Pop', 'StillBirths', 'CountryCode', 'CountryName', 'Region', 'IncomeGroup', 'LendingType']

In [14]:
#Filter data for only Chile and Mali
df7=df[(df.CountryCode=='CHL') | (df.CountryCode=='MLI')]
#Filter data for 2010
df7=df7[df7.Year==2010]
#Create Plot 7
fig = px.scatter(df7, x=df7.LitRate, y=df7.StillBirths, color=df7.CountryName, labels={
                     "LitRate": "Literacy Rate, Youth Female (% of females ages 15-24)",
                     "StillBirths": "Stillbirth Rate (per 1,000 total births)",
                     "CountryName": "Country Name"}, title='Plot 7')
fig.update_traces(marker=dict(size=12))
fig.show()

In [15]:
#Import data for plot 8
df=pd.read_csv("https://raw.githubusercontent.com/cbarnes5/world_development_explorer/main/Data/Plot8_data.csv")
df.head()

| Unnamed: 0.1 | Unnamed: 0 | Year | SE.ADT.1524.LT.FE.ZS | SP.POP.TOTL | SP.POP.80UP.FE | Country Code | Country Name | Region | Income Group | Lending Type |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0 | 2011 | 32.11322 | 30117413.0 | 42436 | AFG | Afghanistan | South Asia | Low income | IDA |
| 1 | 1 | 2018 | 56.254749 | 37172386.0 | 58031 | AFG | Afghanistan | South Asia | Low income | IDA |
| 2 | 2 | 2011 | 98.856239 | 2905195.0 | 31764 | ALB | Albania | Europe & Central Asia | Upper middle income | IBRD |
| 3 | 3 | 2012 | 99.020187 | 2900401.0 | 33221 | ALB | Albania | Europe & Central Asia | Upper middle income | IBRD |
| 4 | 4 | 2018 | 99.629997 | 2866376.0 | 42667 | ALB | Albania | Europe & Central Asia | Upper middle income | IBRD |


*Variables*

- Year = Year
- SE.ADT.1524.LT.FE.ZS = Literacy Rate (% female age 15-24)
- SP.POP.TOTL = Total Population of Country
- SP.POP.80UP.FE = Population ages 80 and above, female (number of people)
- Country Code = Country Abbreviation 
- Country Name = Country Name
- Region = Region
- Lending Type =  World Bank Loan Type

In [16]:
#Change column names to easily callable variables 
df.columns=['Index', 'Year', 'LitRate', 'Pop', 'above80Pop', 'CountryCode', 'CountryName', 'Region', 'IncomeGroup', 'LendingType']

In [17]:
#Filter data for only Chile and Mali
df8=df[(df.CountryCode=='CHL') | (df.CountryCode=='MLI')]
#Filter data for 2010
df8=df8[df8.Year==2010]
#Create Plot 8
fig = px.scatter(df8, x=df8.LitRate, y=df8.above80Pop, color=df8.CountryName, labels={
                     "LitRate": "Literacy Rate, Youth Female (% of females ages 15-24)",
                     "above80Pop": "Population ages 80 and above, female (number of people)",
                     "CountryName": "Country Name"}, title='Plot 8')
fig.update_traces(marker=dict(size=12))
fig.show()
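
The `SP.POP.80UP.FE` values in the preview above are whole-number head counts, so a larger country will show more women aged 80 and over simply because it has more people. To compare countries of different sizes, the count can be scaled by population. The sketch below expresses it per 1,000 inhabitants. Two assumptions apply: the denominator is the total population (`Pop`), because the data set has no female-only total, so the result is a rough proxy and not the share of the female population. The column name `above80Per1000` and the `toy` values are made up for illustration, not World Bank data.

```python
import pandas as pd

# Made-up illustrative values, NOT World Bank data
toy = pd.DataFrame({
    "CountryName": ["Country A", "Country B"],
    "Pop": [10_000_000.0, 20_000_000.0],
    "above80Pop": [50_000, 60_000],
})

# Women aged 80+ per 1,000 inhabitants (total population as denominator)
toy["above80Per1000"] = toy["above80Pop"] / toy["Pop"] * 1000
print(toy[["CountryName", "above80Pop", "above80Per1000"]])
```

In this toy example Country B has the larger head count, but Country A has the higher rate, which is the kind of reversal that scaling is meant to reveal.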

## Conclusion

In conclusion, by analyzing indicators from The World Bank’s dataset using the World Development Explorer tool, one can conclude from Mali’s and Chile's data that there is a correlation between women’s literacy, girls' enrollment in primary education, a country's income level, GDP, and women’s and maternal health (specifically measured by life expectancy and stillbirth rate). Revealing the benefits of girls' education and using data to research the factors that contribute to low education and literacy levels can bring us closer to The World Bank’s mission of ending extreme poverty and promoting shared prosperity.