<a href="https://colab.research.google.com/github/akukudala/world_development_explorer_final/blob/main/wdx_final/wdx_analysis_partB_draft.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

* Name: Akshitha Kukudala
* Date: 4/01/2022

# Analyzing the education levels of the female population (ages 15-19), 2010 to 2015
## Indicators to consider
* No Education
* Completed Primary Education
* Completed Secondary Education

This project focuses on female primary and secondary education in the East Asia & Pacific region.
Many governments publish statistics, such as enrollment and completion data, that show how well their education systems are working and improving.
Monitoring progress toward national and global education targets helps allocate resources more efficiently and make quality learning opportunities accessible to all.
Better-quality education leads to an empowered citizenry and a more productive labor force. We will also compare this region with other regions of the world, using data from World Development Explorer.
Financial background is often cited as a major reason why many girls and women remain without education, so we also consider each country's income group to examine the relationship between education and income.

- **Data Source:** World Development Explorer ([worlddev.xyz](https://))
- **Regions Analyzed:** East Asia & Pacific
- **Years considered:** 2010-2015

In [1]:
import pandas as pd
import plotly
import plotly.express as px
import plotly.graph_objs as go
import plotly.io as pio

In [2]:

csv_file = "https://raw.githubusercontent.com/akukudala/world_development_explorer/main/wdi_data_3_indicators.csv"  

In [3]:
df = pd.read_csv(csv_file, index_col=0)

df.sample(5)

Unnamed: 0,Year,value,indicator,Country Code,Country Name,Region,Income Group,Lending Type
308,2010,12.55,BAR.SEC.CMPT.1519.FE.ZS,CMR,Cameroon,Sub-Saharan Africa,Lower middle income,Blend
267,2010,5.08,BAR.NOED.1519.FE.ZS,CHE,Switzerland,Europe & Central Asia,High income,Not classified
47,2010,15.47,BAR.PRM.CMPT.1519.FE.ZS,GHA,Ghana,Sub-Saharan Africa,Lower middle income,IDA
381,2010,16.59,BAR.SEC.CMPT.1519.FE.ZS,NPL,Nepal,South Asia,Lower middle income,IDA
253,2010,0.58,BAR.NOED.1519.FE.ZS,RUS,Russian Federation,Europe & Central Asia,Upper middle income,IBRD
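
Since the notebook states a 2010-2015 range while every sampled row shows 2010, it may be worth checking which years each indicator actually covers. The following is a minimal sketch of such a check; the small `demo` frame is only an illustration built from the five sampled rows above, and on the real data the same `groupby` would be applied to `df`.

```python
import pandas as pd

# Rows copied from the df.sample(5) output above; illustration only.
demo = pd.DataFrame(
    [
        (2010, 12.55, "BAR.SEC.CMPT.1519.FE.ZS", "Cameroon"),
        (2010, 5.08, "BAR.NOED.1519.FE.ZS", "Switzerland"),
        (2010, 15.47, "BAR.PRM.CMPT.1519.FE.ZS", "Ghana"),
        (2010, 16.59, "BAR.SEC.CMPT.1519.FE.ZS", "Nepal"),
        (2010, 0.58, "BAR.NOED.1519.FE.ZS", "Russian Federation"),
    ],
    columns=["Year", "value", "indicator", "Country Name"],
)

# Earliest year, latest year, and row count for each indicator.
coverage = demo.groupby("indicator")["Year"].agg(["min", "max", "count"])
print(coverage)
```

If `min` and `max` are both 2010 on the full dataset, the charts below describe a single year rather than a trend.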


In [4]:
df_noed = df.query("indicator == 'BAR.NOED.1519.FE.ZS'").query("Region == 'East Asia & Pacific'")
df_noed = df_noed.sort_values(by= "value", ascending= False)
df_noed.sample(5)



Unnamed: 0,Year,value,indicator,Country Code,Country Name,Region,Income Group,Lending Type
232,2010,2.32,BAR.NOED.1519.FE.ZS,MNG,Mongolia,East Asia & Pacific,Lower middle income,IBRD
185,2010,0.32,BAR.NOED.1519.FE.ZS,FJI,Fiji,East Asia & Pacific,Upper middle income,Blend
201,2010,0.92,BAR.NOED.1519.FE.ZS,IDN,Indonesia,East Asia & Pacific,Lower middle income,IBRD
168,2010,0.49,BAR.NOED.1519.FE.ZS,CHN,China,East Asia & Pacific,Upper middle income,IBRD
248,2010,0.92,BAR.NOED.1519.FE.ZS,PHL,Philippines,East Asia & Pacific,Lower middle income,IBRD


In [5]:
fig = px.bar(
    data_frame=df_noed,
    x="Country Name",
    y="value",
    color="Country Name",
    labels={"value": "2010 Barro-Lee: Percentage of Female population age 15-19 with no education"},
    height=700,
    # Selects the sixth registered Plotly template by position.
    template=list(pio.templates.keys())[5],
    title="Data: The World Bank www.worlddev.xyz",
)

fig.update_layout(showlegend= False)
fig.show()

In [6]:
df_noed.shape

(21, 8)

In [7]:

df_pcmpt = df.query("indicator == 'BAR.PRM.CMPT.1519.FE.ZS'").query("Region == 'East Asia & Pacific'")
df_pcmpt = df_pcmpt.sort_values(by= "value", ascending= False)
df_pcmpt.sample(5)

Unnamed: 0,Year,value,indicator,Country Code,Country Name,Region,Income Group,Lending Type
24,2010,6.67,BAR.PRM.CMPT.1519.FE.ZS,CHN,China,East Asia & Pacific,Upper middle income,IBRD
5,2010,18.53,BAR.PRM.CMPT.1519.FE.ZS,AUS,Australia,East Asia & Pacific,High income,Not classified
57,2010,27.03,BAR.PRM.CMPT.1519.FE.ZS,IDN,Indonesia,East Asia & Pacific,Lower middle income,IBRD
129,2010,0.95,BAR.PRM.CMPT.1519.FE.ZS,TON,Tonga,East Asia & Pacific,Upper middle income,IDA
78,2010,11.46,BAR.PRM.CMPT.1519.FE.ZS,MAC,"Macao SAR, China",East Asia & Pacific,High income,Not classified


In [8]:
fig = px.bar(
    data_frame=df_pcmpt,
    x="Country Name",
    y="value",
    color="Country Name",
    labels={"value": "2010 Barro-Lee: Percentage of Female population age 15-19 with primary schooling, Completed Primary"},
    height=700,
    # Selects the sixth registered Plotly template by position.
    template=list(pio.templates.keys())[5],
    title="Data: The World Bank www.worlddev.xyz",
)

fig.update_layout(showlegend= False)
fig.show()

In [9]:
df_pcmpt.shape

(21, 8)

In [10]:
df_scmpt = df.query("indicator == 'BAR.SEC.CMPT.1519.FE.ZS'").query("Region == 'East Asia & Pacific'")
df_scmpt = df_scmpt.sort_values(by= "value", ascending= False)
df_scmpt.sample(5)

Unnamed: 0,Year,value,indicator,Country Code,Country Name,Region,Income Group,Lending Type
307,2010,2.38,BAR.SEC.CMPT.1519.FE.ZS,KHM,Cambodia,East Asia & Pacific,Lower middle income,IDA
329,2010,51.18,BAR.SEC.CMPT.1519.FE.ZS,FJI,Fiji,East Asia & Pacific,Upper middle income,Blend
368,2010,65.05,BAR.SEC.CMPT.1519.FE.ZS,MYS,Malaysia,East Asia & Pacific,Upper middle income,IBRD
403,2010,72.53,BAR.SEC.CMPT.1519.FE.ZS,SGP,Singapore,East Asia & Pacific,High income,Not classified
366,2010,34.39,BAR.SEC.CMPT.1519.FE.ZS,MAC,"Macao SAR, China",East Asia & Pacific,High income,Not classified


In [11]:
fig = px.bar(
    data_frame=df_scmpt,
    x="Country Name",
    y="value",
    color="Country Name",
    labels={"value": "2010 Barro-Lee: Percentage of Female population age 15-19 with secondary schooling, Completed Secondary"},
    height=700,
    # Selects the sixth registered Plotly template by position.
    template=list(pio.templates.keys())[5],
    title="Data: The World Bank www.worlddev.xyz",
)

fig.update_layout(showlegend= False)
fig.show()

In [12]:
df_scmpt.shape

(21, 8)
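
The three filtering cells above repeat the same pattern: select one indicator, restrict to one region, and sort by value. A small helper could express this once. The sketch below is one possible way to do it; `select_indicator` and `demo_df` are hypothetical names introduced here, and the rows are copied from the sample outputs above purely for illustration.

```python
import pandas as pd

def select_indicator(frame, indicator, region):
    """Return rows for one indicator and one region, largest value first."""
    mask = (frame["indicator"] == indicator) & (frame["Region"] == region)
    return frame.loc[mask].sort_values(by="value", ascending=False)

# A few rows copied from the sample outputs above; illustration only.
demo_df = pd.DataFrame(
    [
        (2010, 2.32, "BAR.NOED.1519.FE.ZS", "Mongolia", "East Asia & Pacific"),
        (2010, 0.32, "BAR.NOED.1519.FE.ZS", "Fiji", "East Asia & Pacific"),
        (2010, 5.08, "BAR.NOED.1519.FE.ZS", "Switzerland", "Europe & Central Asia"),
        (2010, 6.67, "BAR.PRM.CMPT.1519.FE.ZS", "China", "East Asia & Pacific"),
        (2010, 51.18, "BAR.SEC.CMPT.1519.FE.ZS", "Fiji", "East Asia & Pacific"),
    ],
    columns=["Year", "value", "indicator", "Country Name", "Region"],
)

demo_noed = select_indicator(demo_df, "BAR.NOED.1519.FE.ZS", "East Asia & Pacific")
print(demo_noed[["Country Name", "value"]])
```

With the real data, the same call on `df` would stand in for each pair of chained `query` and `sort_values` lines.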

First, let us look at the female population with no education across countries, averaged by income group.

In [13]:
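# Mean percentage with no education per income group.
# Note: "Hours_Mean" holds a mean percentage, not hours; the name is kept because later cells use it.
df_noed2 = df_noed.groupby(['Income Group']).agg(
    Hours_Mean = ('value', 'mean'))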

df_noed2

Income Group,Hours_Mean
High income,3.5025
Lower middle income,16.99625
Upper middle income,1.042
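
A group mean is easier to interpret when the number of countries behind it is shown as well, since a single outlier can dominate a small group. The sketch below adds a count next to the mean. It uses only the five no-education rows sampled earlier, so its numbers differ from the full table above; `mean_pct` and `n_countries` are hypothetical column names.

```python
import pandas as pd

# Five rows copied from the df_noed.sample(5) output; illustration only.
sample_rows = pd.DataFrame(
    {
        "Country Name": ["Mongolia", "Fiji", "Indonesia", "China", "Philippines"],
        "Income Group": [
            "Lower middle income",
            "Upper middle income",
            "Lower middle income",
            "Upper middle income",
            "Lower middle income",
        ],
        "value": [2.32, 0.32, 0.92, 0.49, 0.92],
    }
)

# Unweighted mean percentage and number of countries per income group.
summary = sample_rows.groupby("Income Group").agg(
    mean_pct=("value", "mean"),
    n_countries=("value", "count"),
)
print(summary.round(2))
```

These are unweighted country averages: each country counts equally regardless of population size.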


In [15]:
fig = px.pie(df_noed2.reset_index(), names='Income Group', values='Hours_Mean', title='Mean percentage of females age 15-19 with no education, by income group')

fig.show()
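
To relate education to income across all three indicators at once, as the introduction proposes, the long-format data could be pivoted so that each indicator becomes a column and each income group a row. This is a minimal sketch under that assumption; `demo_long` holds a handful of rows copied from the sample outputs above, so the resulting means are illustrative and not the full-dataset values.

```python
import pandas as pd

# Rows copied from the sample outputs above; illustration only.
demo_long = pd.DataFrame(
    [
        ("China", "Upper middle income", "BAR.NOED.1519.FE.ZS", 0.49),
        ("Fiji", "Upper middle income", "BAR.NOED.1519.FE.ZS", 0.32),
        ("Indonesia", "Lower middle income", "BAR.NOED.1519.FE.ZS", 0.92),
        ("China", "Upper middle income", "BAR.PRM.CMPT.1519.FE.ZS", 6.67),
        ("Indonesia", "Lower middle income", "BAR.PRM.CMPT.1519.FE.ZS", 27.03),
        ("Macao SAR, China", "High income", "BAR.PRM.CMPT.1519.FE.ZS", 11.46),
        ("Australia", "High income", "BAR.PRM.CMPT.1519.FE.ZS", 18.53),
        ("Fiji", "Upper middle income", "BAR.SEC.CMPT.1519.FE.ZS", 51.18),
        ("Malaysia", "Upper middle income", "BAR.SEC.CMPT.1519.FE.ZS", 65.05),
        ("Singapore", "High income", "BAR.SEC.CMPT.1519.FE.ZS", 72.53),
        ("Cambodia", "Lower middle income", "BAR.SEC.CMPT.1519.FE.ZS", 2.38),
    ],
    columns=["Country Name", "Income Group", "indicator", "value"],
)

# One row per income group, one column per indicator, cells hold the mean value.
wide = demo_long.pivot_table(
    index="Income Group", columns="indicator", values="value", aggfunc="mean"
)
print(wide.round(2))
```

A cell is `NaN` where the sample has no rows for that combination, for example high-income countries with no education. On the full data, the same `pivot_table` call could be applied to `df` after filtering to the region of interest.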