<a href="https://colab.research.google.com/github/akukudala/world_development_explorer/blob/main/wdx_analysis_partB_draft.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

* Name: Akshitha Kukudala
* Date: 4/01/2022

# Analyzing education levels of the female population (ages 15-19), 2010 to 2015
## Indicators to consider
* No Education
* Completed Primary Education
* Completed Secondary Education

This project focuses on female primary and secondary education in the East Asia & Pacific region.
Many governments publish statistics on enrollment and completion that show how well their education systems are working and improving.
Monitoring progress toward national and global education targets helps allocate resources more efficiently and make quality learning opportunities accessible to all.
Better-quality education leads to an empowered citizenry and a more productive labor force. We will also compare this region with other regions of the world, using data from World Development Explorer.
Financial background is often cited as a major reason why many girls and women are left without an education, so we will also consider the income groups of the countries in the region and examine how education relates to income.

- **Data Source:** World Development Explorer ([worlddev.xyz](https://))
- **Regions Analyzed:** East Asia & Pacific
- **Years considered:** 2010-2015
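
To make the planned education-versus-income comparison concrete, here is a minimal sketch. It uses a handful of 2010 rows copied from the tables in this notebook rather than the full CSV, and the names `sample` and `wide` are illustrative assumptions, not part of the original analysis.

```python
import pandas as pd

# A few 2010 rows copied from the tables in this notebook (not the full dataset).
sample = pd.DataFrame(
    [
        ("BAR.NOED.1519.FE.ZS", "Cambodia", "Lower middle income", 7.86),
        ("BAR.PRM.CMPT.1519.FE.ZS", "Cambodia", "Lower middle income", 39.93),
        ("BAR.SEC.CMPT.1519.FE.ZS", "Cambodia", "Lower middle income", 2.38),
        ("BAR.NOED.1519.FE.ZS", "Malaysia", "Upper middle income", 1.63),
        ("BAR.PRM.CMPT.1519.FE.ZS", "Malaysia", "Upper middle income", 1.91),
        ("BAR.NOED.1519.FE.ZS", "New Zealand", "High income", 4.68),
        ("BAR.PRM.CMPT.1519.FE.ZS", "New Zealand", "High income", 0.06),
    ],
    columns=["indicator", "Country Name", "Income Group", "value"],
)

# One row per country and one column per indicator, so the education
# levels can be read side by side with the income group.
# Missing country/indicator pairs show up as NaN.
wide = sample.pivot_table(
    index=["Country Name", "Income Group"], columns="indicator", values="value"
)
print(wide)
```

Reshaping to this wide layout is only one option; it keeps the income group next to each country so the two can be compared directly.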

In [1]:
import pandas as pd
import plotly
import plotly.express as px
import plotly.graph_objs as go
import plotly.io as pio

In [2]:

csv_file = "https://raw.githubusercontent.com/akukudala/world_development_explorer/main/wdi_data_3_indicators.csv"  

In [3]:
df = pd.read_csv(csv_file, index_col=0)

df.sample(5)

Unnamed: 0,Year,value,indicator,Country Code,Country Name,Region,Income Group,Lending Type
411,2010,12.76,BAR.SEC.CMPT.1519.FE.ZS,CHE,Switzerland,Europe & Central Asia,High income,Not classified
334,2010,8.92,BAR.SEC.CMPT.1519.FE.ZS,DEU,Germany,Europe & Central Asia,High income,Not classified
394,2010,18.99,BAR.SEC.CMPT.1519.FE.ZS,PRT,Portugal,Europe & Central Asia,High income,Not classified
318,2010,19.28,BAR.SEC.CMPT.1519.FE.ZS,HRV,Croatia,Europe & Central Asia,High income,IBRD
226,2010,56.58,BAR.NOED.1519.FE.ZS,MLI,Mali,Sub-Saharan Africa,Low income,IDA
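
Since `df.sample(5)` shows only a random handful of rows, a quick summary of the categorical columns can confirm which years, indicators, and regions the file actually contains. The sketch below runs on a small stand-in frame built from the rows printed above; the helper name `describe_coverage` is hypothetical.

```python
import pandas as pd

# Stand-in for `df`, built from the five rows printed above.
rows = pd.DataFrame(
    {
        "Year": [2010, 2010, 2010, 2010, 2010],
        "value": [12.76, 8.92, 18.99, 19.28, 56.58],
        "indicator": ["BAR.SEC.CMPT.1519.FE.ZS"] * 4 + ["BAR.NOED.1519.FE.ZS"],
        "Country Name": ["Switzerland", "Germany", "Portugal", "Croatia", "Mali"],
        "Region": ["Europe & Central Asia"] * 4 + ["Sub-Saharan Africa"],
    }
)


def describe_coverage(frame: pd.DataFrame) -> dict:
    """Return the distinct years, row counts per indicator, and regions in `frame`."""
    return {
        "years": sorted(frame["Year"].unique().tolist()),
        "rows_per_indicator": frame["indicator"].value_counts().to_dict(),
        "regions": sorted(frame["Region"].unique().tolist()),
    }


coverage = describe_coverage(rows)
print(coverage)
```

Running the same helper on the full `df` would show whether the data covers every year from 2010 to 2015 or only some of them.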


In [4]:
df_noed = df.query("indicator == 'BAR.NOED.1519.FE.ZS'").query("Region == 'East Asia & Pacific'")
df_noed = df_noed.sort_values(by= "value", ascending= False)
df_noed



Unnamed: 0,Year,value,indicator,Country Code,Country Name,Region,Income Group,Lending Type
284,2010,57.77,BAR.NOED.1519.FE.ZS,VNM,Vietnam,East Asia & Pacific,Lower middle income,IBRD
245,2010,40.67,BAR.NOED.1519.FE.ZS,PNG,Papua New Guinea,East Asia & Pacific,Lower middle income,Blend
215,2010,25.41,BAR.NOED.1519.FE.ZS,LAO,Lao PDR,East Asia & Pacific,Lower middle income,IDA
160,2010,20.1,BAR.NOED.1519.FE.ZS,BRN,Brunei Darussalam,East Asia & Pacific,High income,Not classified
163,2010,7.86,BAR.NOED.1519.FE.ZS,KHM,Cambodia,East Asia & Pacific,Lower middle income,IDA
239,2010,4.68,BAR.NOED.1519.FE.ZS,NZL,New Zealand,East Asia & Pacific,High income,Not classified
271,2010,2.47,BAR.NOED.1519.FE.ZS,THA,Thailand,East Asia & Pacific,Upper middle income,IBRD
232,2010,2.32,BAR.NOED.1519.FE.ZS,MNG,Mongolia,East Asia & Pacific,Lower middle income,IBRD
259,2010,2.02,BAR.NOED.1519.FE.ZS,SGP,Singapore,East Asia & Pacific,High income,Not classified
224,2010,1.63,BAR.NOED.1519.FE.ZS,MYS,Malaysia,East Asia & Pacific,Upper middle income,IBRD
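
The same indicator-and-region filter is applied to each of the three indicators, so it could be wrapped in a small helper. This is a sketch, not the notebook's own code: the function name `select_indicator` and the tiny `demo` frame (rows copied from the outputs in this notebook) are assumptions for illustration.

```python
import pandas as pd


def select_indicator(frame, indicator, region="East Asia & Pacific"):
    """Rows for one indicator in one region, sorted by value (highest first)."""
    mask = (frame["indicator"] == indicator) & (frame["Region"] == region)
    return frame.loc[mask].sort_values(by="value", ascending=False)


demo = pd.DataFrame(
    {
        "value": [1.63, 56.58, 7.86, 2.38],
        "indicator": [
            "BAR.NOED.1519.FE.ZS",
            "BAR.NOED.1519.FE.ZS",
            "BAR.NOED.1519.FE.ZS",
            "BAR.SEC.CMPT.1519.FE.ZS",
        ],
        "Country Name": ["Malaysia", "Mali", "Cambodia", "Cambodia"],
        "Region": [
            "East Asia & Pacific",
            "Sub-Saharan Africa",
            "East Asia & Pacific",
            "East Asia & Pacific",
        ],
    }
)

# Keeps only the East Asia & Pacific rows for the "no education" indicator.
noed_demo = select_indicator(demo, "BAR.NOED.1519.FE.ZS")
print(noed_demo[["Country Name", "value"]])
```

A boolean mask is used here instead of chained `.query()` calls; both approaches select the same rows.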


In [20]:
fig = px.bar(
    data_frame=df_noed,
    x="Country Name",
    y="value",
    color="Country Name",
    labels={"value": "2010 Barro-Lee: Percentage of female population age 15-19 with no education"},
    height=700,
    template="plotly_dark",  # built-in dark theme
    title="Data: The World Bank www.worlddev.xyz",
)

fig.update_layout(showlegend= False)
fig.show()

In [6]:
df_noed.shape

(21, 8)

In [21]:

df_pcmpt = df.query("indicator == 'BAR.PRM.CMPT.1519.FE.ZS'").query("Region == 'East Asia & Pacific'")
df_pcmpt = df_pcmpt.sort_values(by= "value", ascending= False)
df_pcmpt.sample(5)

Unnamed: 0,Year,value,indicator,Country Code,Country Name,Region,Income Group,Lending Type
19,2010,39.93,BAR.PRM.CMPT.1519.FE.ZS,KHM,Cambodia,East Asia & Pacific,Lower middle income,IDA
68,2010,0.07,BAR.PRM.CMPT.1519.FE.ZS,KOR,"Korea, Rep.",East Asia & Pacific,High income,Not classified
91,2010,57.4,BAR.PRM.CMPT.1519.FE.ZS,MMR,Myanmar,East Asia & Pacific,Lower middle income,IDA
95,2010,0.06,BAR.PRM.CMPT.1519.FE.ZS,NZL,New Zealand,East Asia & Pacific,High income,Not classified
80,2010,1.91,BAR.PRM.CMPT.1519.FE.ZS,MYS,Malaysia,East Asia & Pacific,Upper middle income,IBRD


In [22]:
fig = px.bar(
    data_frame=df_pcmpt,
    x="Country Name",
    y="value",
    color="Country Name",
    labels={"value": "2010 Barro-Lee: Percentage of female population age 15-19 with primary schooling, completed primary"},
    height=700,
    template="plotly_dark",  # built-in dark theme
    title="Data: The World Bank www.worlddev.xyz",
)

fig.update_layout(showlegend= False)
fig.show()

In [20]:
df_pcmpt.shape

(144, 8)
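
A shape alone does not say whether each country appears exactly once in a filtered frame. A check along the following lines could be run on any of the three frames. It is a sketch on a toy frame with placeholder names and values (not real data), and `rows_per_country` is a hypothetical helper name.

```python
import pandas as pd


def rows_per_country(frame: pd.DataFrame) -> pd.Series:
    """Count rows per country, keeping only countries that appear more than once."""
    counts = frame.groupby("Country Name").size()
    return counts[counts > 1]


# Toy frame with placeholder names and values (not real data).
toy = pd.DataFrame(
    {
        "Country Name": ["A", "A", "B"],
        "Year": [2010, 2011, 2010],
        "value": [1.0, 2.0, 3.0],
    }
)

repeated = rows_per_country(toy)
print(repeated)
```

An empty result would mean one row per country; a non-empty one points to repeated countries, for example from several years or from an unfiltered region.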

In [23]:
df_scmpt = df.query("indicator == 'BAR.SEC.CMPT.1519.FE.ZS'").query("Region == 'East Asia & Pacific'")
df_scmpt = df_scmpt.sort_values(by= "value", ascending= False)
df_scmpt.sample(5)

Unnamed: 0,Year,value,indicator,Country Code,Country Name,Region,Income Group,Lending Type
366,2010,34.39,BAR.SEC.CMPT.1519.FE.ZS,MAC,"Macao SAR, China",East Asia & Pacific,High income,Not classified
376,2010,45.08,BAR.SEC.CMPT.1519.FE.ZS,MNG,Mongolia,East Asia & Pacific,Lower middle income,IBRD
304,2010,19.74,BAR.SEC.CMPT.1519.FE.ZS,BRN,Brunei Darussalam,East Asia & Pacific,High income,Not classified
415,2010,37.31,BAR.SEC.CMPT.1519.FE.ZS,THA,Thailand,East Asia & Pacific,Upper middle income,IBRD
307,2010,2.38,BAR.SEC.CMPT.1519.FE.ZS,KHM,Cambodia,East Asia & Pacific,Lower middle income,IDA


In [24]:
fig = px.bar(
    data_frame=df_scmpt,
    x="Country Name",
    y="value",
    color="Country Name",
    labels={"value": "2010 Barro-Lee: Percentage of female population age 15-19 with secondary schooling, completed secondary"},
    height=700,
    template="plotly_dark",  # built-in dark theme
    title="Data: The World Bank www.worlddev.xyz",
)

fig.update_layout(showlegend= False)
fig.show()

In [11]:
df_scmpt.shape

(21, 8)

First, let us look at the share of the female population with no education across countries, averaged by income group.

In [13]:
# Mean percentage with no education per income group.
# The column is named Hours_Mean, but it holds a mean percentage, not hours.
df_noed2 = df_noed.groupby("Income Group").agg(
    Hours_Mean=("value", "mean"))

df_noed2

Income Group,Hours_Mean
High income,3.5025
Lower middle income,16.99625
Upper middle income,1.042
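
The same aggregation can cover all three indicators at once by grouping on both `indicator` and `Income Group`. The sketch below uses a few rows copied from the outputs in this notebook, so its means are for illustration only and do not reproduce the table above; `mean_pct` is an assumed column name.

```python
import pandas as pd

# A few 2010 rows copied from this notebook's outputs (country noted per row).
sample = pd.DataFrame(
    [
        ("BAR.NOED.1519.FE.ZS", "Lower middle income", 7.86),      # Cambodia
        ("BAR.NOED.1519.FE.ZS", "Lower middle income", 2.32),      # Mongolia
        ("BAR.NOED.1519.FE.ZS", "High income", 4.68),              # New Zealand
        ("BAR.SEC.CMPT.1519.FE.ZS", "Lower middle income", 45.08), # Mongolia
        ("BAR.SEC.CMPT.1519.FE.ZS", "High income", 19.74),         # Brunei Darussalam
    ],
    columns=["indicator", "Income Group", "value"],
)

# Mean value per (indicator, income group), then one column per indicator.
by_group = (
    sample.groupby(["indicator", "Income Group"])
    .agg(mean_pct=("value", "mean"))
    .unstack("indicator")
)
print(by_group.round(2))
```

With the indicators as columns, each income group's no-education and completion means sit on one row, which is the comparison the introduction sets out to make.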



In [26]:
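fig = px.pie(
    df_noed2,
    values="Hours_Mean",
    names=df_noed2.index,  # income groups label the slices
    title="Mean percentage of females (15-19) with no education, by income group",
)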

fig.show()