# Visualizing Female School Dropout Rates: Unveiling Educational Disparities

The number of women facing disadvantageous circumstances in society has reached alarming proportions. One prominent contributing factor is the high rate at which girls drop out of school.

In this project, our goal is to identify the educational levels at which girls are most likely to drop out. This investigation is part of a broader mission to understand the underlying reasons for this phenomenon and to offer effective strategies for addressing it.

Using the dataset provided by UNICEF, we will employ various visualization techniques, including dynamic charts, to gain a deeper understanding of the issue. By analyzing the data, we can paint a clear picture that will guide us towards sustainable solutions to empower women and reduce dropout rates.

Our primary focus is to capture and analyze the number of school dropouts at different educational levels. By delving deeper into the data, we aim to identify the root causes behind these dropouts, shed light on the challenges that hinder girls' educational progress, and develop strategies to mitigate these obstacles.

Armed with the insights from our analysis, we will work towards providing actionable recommendations. These solutions will be designed to address the underlying causes of female school dropout, enabling us to assist women in reaching their educational milestones and fulfilling their potential.

By unraveling the complexities of this issue and leveraging data-driven visualizations, we strive to create a transformative impact. Our ultimate goal is to break the cycle of educational disadvantage for girls and foster an environment that empowers women to thrive at every level of education.

In [3]:
# importing the libraries 

import pandas as pd 

import numpy as np

import matplotlib.pyplot as plt
import seaborn as sns 

In [12]:
# Reading the dataset 

school_data = pd.read_csv("School-completion-rates-Nov2019.csv")
school_data

Unnamed: 0.1,Unnamed: 0,"Indicator name: Completion rate, primary",Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7,Unnamed: 8,Unnamed: 9,...,Unnamed: 15,Unnamed: 16,Unnamed: 17,Unnamed: 18,Unnamed: 19,Unnamed: 20,Unnamed: 21,Unnamed: 22,Unnamed: 23,Unnamed: 24
0,,,,,,,,Total,Gender,,...,,,,,,,,,,
1,ISO_Code,Country,Region,Sub-region,Least developed countries (LDC),Africa sub-regions,Africa region,Total,Male,Female,...,Fourth,Richest,Source,"Total Population, one year before primary",,,,,,
2,AFG,Afghanistan,SA,,LDC,,,54,67,40,...,60,75,DHS 2015,6170913,3160379.0,3010534.0,1.606042e+06,4.564871e+06,26.026,73.974
3,ALB,Albania,ECA,EECA,,,,92,91,93,...,92,98,DHS 2017-18,156833,81809.0,75024.0,9.741211e+04,5.942089e+04,62.112,37.888
4,DZA,Algeria,MENA,,,Northern Africa,All,94,93,94,...,96,99,MICS 2012-13,4040275,2060745.0,1979530.0,2.979016e+06,1.061259e+06,73.733,26.267
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
222,,African sub-region - North Africa,,,,Northern Africa,,91,,,...,,,,,,,,,,
223,,African sub-region - South Africa,,,,Southern Africa,,67,,,...,,,,,,,,,,
224,,African sub-region - West Africa,,,,Western Africa,,65,,,...,,,,,,,,,,
225,,Africa region - All,,,,,All,66,,,...,,,,,,,,,,


#### Displaying the dataset: notice the two extra rows above the real column names, the "Unnamed" header row and the mostly-NaN row

In [14]:
# Promote the first data row to the header, replacing the "Unnamed: N" column names

school_data.columns = school_data.iloc[0]

In [15]:
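# Remove that row from the data now that it serves as the header.
# reindex() with no arguments does not renumber the rows, so the index still starts at 1.

school_data = school_data.iloc[1:].reindex()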

In [16]:
school_data.head()

Unnamed: 0,NaN,NaN.1,NaN.2,NaN.3,NaN.4,NaN.5,NaN.6,Total,Gender,NaN.7,...,NaN.8,NaN.9,NaN.10,NaN.11,NaN.12,NaN.13,NaN.14,NaN.15,NaN.16,NaN.17
1,ISO_Code,Country,Region,Sub-region,Least developed countries (LDC),Africa sub-regions,Africa region,Total,Male,Female,...,Fourth,Richest,Source,"Total Population, one year before primary",,,,,,
2,AFG,Afghanistan,SA,,LDC,,,54,67,40,...,60,75,DHS 2015,6170913,3160379.0,3010534.0,1606042.0,4564871.0,26.026,73.974
3,ALB,Albania,ECA,EECA,,,,92,91,93,...,92,98,DHS 2017-18,156833,81809.0,75024.0,97412.11,59420.89,62.112,37.888
4,DZA,Algeria,MENA,,,Northern Africa,All,94,93,94,...,96,99,MICS 2012-13,4040275,2060745.0,1979530.0,2979016.0,1061259.0,73.733,26.267
5,AND,Andorra,ECA,WE,,,,,,,...,,,,,,,,,87.916,12.084


In [17]:
# The new first row is yet another header row (it holds the real column names),
# so we repeat the same two steps to get a more presentable dataset.

school_data.columns = school_data.iloc[0]
school_data = school_data.iloc[1:].reindex()
school_data.head()

1,ISO_Code,Country,Region,Sub-region,Least developed countries (LDC),Africa sub-regions,Africa region,Total,Male,Female,...,Fourth,Richest,Source,"Total Population, one year before primary",NaN,NaN.1,NaN.2,NaN.3,NaN.4,NaN.5
2,AFG,Afghanistan,SA,,LDC,,,54.0,67.0,40.0,...,60.0,75.0,DHS 2015,6170913.0,3160379.0,3010534.0,1606042.0,4564871.0,26.026,73.974
3,ALB,Albania,ECA,EECA,,,,92.0,91.0,93.0,...,92.0,98.0,DHS 2017-18,156833.0,81809.0,75024.0,97412.11,59420.89,62.112,37.888
4,DZA,Algeria,MENA,,,Northern Africa,All,94.0,93.0,94.0,...,96.0,99.0,MICS 2012-13,4040275.0,2060745.0,1979530.0,2979016.0,1061259.0,73.733,26.267
5,AND,Andorra,ECA,WE,,,,,,,...,,,,,,,,,87.916,12.084
6,AGO,Angola,SSA,ESA,LDC,Southern Africa,All,51.0,53.0,49.0,...,68.0,82.0,DHS 2015-16,5482624.0,2722689.0,2759935.0,3663763.0,1818861.0,66.825,33.175
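
The two header-promotion steps above could also be done at read time. The sketch below is only an illustration on a tiny in-memory sample that mimics the layout of the file (a title line, a group line, then the real header). The sample text and the name `sample` are assumptions made for this example, and the real file may need a different `header` value.

```python
import io

import pandas as pd

# Hypothetical stand-in for the raw file: title line, group line, real header, data.
raw = io.StringIO(
    ',"Indicator name: Completion rate, primary",,,\n'
    ",,Total,Gender,\n"
    "ISO_Code,Country,Total,Male,Female\n"
    "AFG,Afghanistan,54,67,40\n"
    "ALB,Albania,92,91,93\n"
)

# header=2 takes the third line (0-based position 2) as the column names
# and discards the lines above it.
sample = pd.read_csv(raw, header=2)

print(sample.columns.tolist())
print(sample)
```

Reading the header directly also lets pandas infer numeric dtypes for the rate columns, because the header strings never enter the data.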


In [19]:
# Details about the dataset 

school_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 225 entries, 2 to 226
Data columns (total 25 columns):
 #   Column                                       Non-Null Count  Dtype  
---  ------                                       --------------  -----  
 0   ISO_Code                                     203 non-null    object 
 1   Country                                      222 non-null    object 
 2   Region                                       189 non-null    object 
 3   Sub-region                                   100 non-null    object 
 4   Least developed countries (LDC)              41 non-null     object 
 5   Africa sub-regions                           59 non-null     object 
 6   Africa region                                55 non-null     object 
 7   Total                                        118 non-null    object 
 8   Male                                         99 non-null     object 
 9   Female                                       99 non-null     object 
 10  Ur
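
The `info()` output lists `Total`, `Male` and `Female` with dtype `object`, which would get in the way of plotting or arithmetic. A minimal sketch of one way to convert such columns, using a made-up mini-frame (`sample` and `rate_cols` are names chosen for this example, not part of the notebook):

```python
import pandas as pd

# Hypothetical mini-frame: rates stored as strings, with missing values.
sample = pd.DataFrame(
    {
        "Country": ["Afghanistan", "Albania", "Andorra"],
        "Male": ["67", "91", None],
        "Female": ["40", "93", None],
    }
)

rate_cols = ["Male", "Female"]

# pd.to_numeric with errors="coerce" turns unparseable or missing entries into NaN.
sample[rate_cols] = sample[rate_cols].apply(pd.to_numeric, errors="coerce")

print(sample.dtypes)
```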

In [20]:
school_data

1,ISO_Code,Country,Region,Sub-region,Least developed countries (LDC),Africa sub-regions,Africa region,Total,Male,Female,...,Fourth,Richest,Source,"Total Population, one year before primary",NaN,NaN.1,NaN.2,NaN.3,NaN.4,NaN.5
2,AFG,Afghanistan,SA,,LDC,,,54,67,40,...,60,75,DHS 2015,6170913,3160379.0,3010534.0,1.606042e+06,4.564871e+06,26.026,73.974
3,ALB,Albania,ECA,EECA,,,,92,91,93,...,92,98,DHS 2017-18,156833,81809.0,75024.0,9.741211e+04,5.942089e+04,62.112,37.888
4,DZA,Algeria,MENA,,,Northern Africa,All,94,93,94,...,96,99,MICS 2012-13,4040275,2060745.0,1979530.0,2.979016e+06,1.061259e+06,73.733,26.267
5,AND,Andorra,ECA,WE,,,,,,,...,,,,,,,,,87.916,12.084
6,AGO,Angola,SSA,ESA,LDC,Southern Africa,All,51,53,49,...,68,82,DHS 2015-16,5482624,2722689.0,2759935.0,3.663763e+06,1.818861e+06,66.825,33.175
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
222,,African sub-region - North Africa,,,,Northern Africa,,91,,,...,,,,,,,,,,
223,,African sub-region - South Africa,,,,Southern Africa,,67,,,...,,,,,,,,,,
224,,African sub-region - West Africa,,,,Western Africa,,65,,,...,,,,,,,,,,
225,,Africa region - All,,,,,All,66,,,...,,,,,,,,,,


In [None]:
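# Rename columns to shorter names.
# Keys must match the existing column names exactly, otherwise rename() silently skips them.

school_data.rename(
    columns={"Least developed countries (LDC)": "LDC",
             "Africa sub-regions": "Africa-sub-regions",
             "Africa region": "Africa-region",
             "Total Population, one year before primary": "Total-Population"},
    inplace=True,
)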

school_data
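
`DataFrame.rename` ignores keys that match no column by default, so a small typo in a key leaves the frame unchanged without any warning. The sketch below, on a made-up empty frame (`sample` is a name chosen for this example), shows how `errors="raise"` surfaces such a mismatch:

```python
import pandas as pd

# Hypothetical frame with two of the column names seen above.
sample = pd.DataFrame(columns=["Least developed countries (LDC)", "Africa region"])

# A key that matches no column is ignored by default: nothing is renamed.
unchanged = sample.rename(columns={"Least developed Countries": "LDC"})

# errors="raise" turns the silent no-op into a KeyError.
try:
    sample.rename(columns={"Least developed Countries": "LDC"}, errors="raise")
    raised = False
except KeyError:
    raised = True

# With the exact column name, the rename goes through.
renamed = sample.rename(
    columns={"Least developed countries (LDC)": "LDC"}, errors="raise"
)

print(list(unchanged.columns), raised, list(renamed.columns))
```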

In [None]:
# Column reference (from school_data.info()):
#  0   ISO_Code
#  1   Country
#  2   Region
#  3   Sub-region
#  4   Least developed countries (LDC)
#  5   Africa sub-regions
#  6   Africa region
#  7   Total
#  8   Male
#  9   Female
#  10  Urban
#  11  Rural
#  12  Poorest
#  13  Second
#  14  Middle
#  15  Fourth
#  16  Richest
#  17  Source
#  18  Total Population, one year before primary
#  19  nan
#  20  nan
#  21  nan
#  22  nan
#  23  nan
#  24  nan

In [21]:
# Bracket access is needed: the LDC name has spaces and parentheses (it is "LDC" if the rename cell was run).
ldc_col = "LDC" if "LDC" in school_data.columns else "Least developed countries (LDC)"
school_data["Nations"] = (
    school_data[ldc_col]
    .combine_first(school_data["Total"])
    .combine_first(school_data["Male"])
    .combine_first(school_data["Female"])
    .combine_first(school_data["Urban"])
    .combine_first(school_data["Rural"])
)
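
`Series.combine_first` keeps the caller's non-null values and fills its gaps from the argument, so a chain of calls takes, row by row, the first non-null value in the order the series are listed. A minimal sketch on made-up series (`a`, `b`, `c` and `result` are names chosen for this example):

```python
import numpy as np
import pandas as pd

# Hypothetical series sharing the same index, with gaps in different places.
a = pd.Series([1.0, np.nan, np.nan])
b = pd.Series([10.0, 20.0, np.nan])
c = pd.Series([100.0, 200.0, 300.0])

# Row 0 comes from a, row 1 falls back to b, row 2 falls back to c.
result = a.combine_first(b).combine_first(c)

print(result.tolist())  # [1.0, 20.0, 300.0]
```

Because earlier series take priority, the order of the chain decides which column wins whenever several of them have a value in the same row.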