# World Educational Data

This notebook gives an overview of the "World Educational Data" dataset, a curated resource for studying the state of education across countries and regions. It covers out-of-school rates, completion rates, proficiency levels, literacy rates, birth rates, gross enrollment in primary and tertiary education, and unemployment rates.

Researchers, educators, and policymakers can use it to identify trends, disparities, and areas for improvement in education systems worldwide.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import plotly.express as px

In [14]:
# Forward slashes work on Windows, macOS, and Linux; the file is latin-1 encoded
data = pd.read_csv('data/Global_Education.csv', encoding='latin-1')
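The column listing further down shows `'Latitude '` with a trailing space, which makes that column awkward to reference by name. The following is a minimal sketch of one way to normalise header whitespace right after loading. It reads a hypothetical two-row in-memory sample (`sample_csv`) instead of the real file, so the names `sample_csv`, `raw`, and `sample` are illustrative assumptions only.

```python
import io

import pandas as pd

# Hypothetical two-row sample standing in for Global_Education.csv.
# Note the trailing space in the "Latitude " header.
sample_csv = (
    "Countries and areas,Latitude ,Longitude,Birth_Rate\n"
    "Afghanistan,33.93911,67.709953,32.49\n"
    "Albania,41.153332,20.168331,11.78\n"
)

# Encode as latin-1 to mimic the file on disk, then read it back.
raw = io.BytesIO(sample_csv.encode("latin-1"))
sample = pd.read_csv(raw, encoding="latin-1")

# Strip leading and trailing whitespace from every column name.
sample.columns = sample.columns.str.strip()
print(list(sample.columns))
```

Applying the same `str.strip()` call to `data.columns` should let later cells refer to `'Latitude'` without the stray space.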

#### Initial statistics:
1. Dataset Shape (df.shape)
2. First Few Rows (df.head())
3. Column Names (df.columns)
4. Data Types (df.dtypes)
5. Descriptive Statistics (df.describe())

Running these initial checks sets the stage for more in-depth exploratory data analysis (EDA) and data manipulation, and provides a foundation for uncovering patterns in the data.
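The five checks listed above can also be gathered in one place. This is a small sketch under stated assumptions: `initial_summary` is a hypothetical helper, not part of pandas, and the three-row frame only reuses a few values from the first rows of this dataset for illustration.

```python
import pandas as pd

# Hypothetical miniature frame; values copied from the first rows of the dataset.
df = pd.DataFrame({
    "Countries and areas": ["Afghanistan", "Albania", "Algeria"],
    "Birth_Rate": [32.49, 11.78, 24.28],
    "Unemployment_Rate": [11.12, 12.33, 11.70],
})


def initial_summary(frame):
    """Collect the five initial checks into one dictionary."""
    return {
        "shape": frame.shape,                            # (rows, columns)
        "head": frame.head(),                            # first five rows
        "columns": list(frame.columns),                  # column names
        "dtypes": frame.dtypes.astype(str).to_dict(),    # data type per column
        "describe": frame.describe(),                    # numeric summary statistics
    }


summary = initial_summary(df)
print(summary["shape"])
print(summary["columns"])
```

Calling `initial_summary(data)` on the full dataset would return the same five pieces that the following cells print one at a time.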

In [3]:
data.shape # dataset contains 202 rows and 29 columns

(202, 29)

In [4]:
data.head() # print first five rows

| | Countries and areas | Latitude | Longitude | OOSR_Pre0Primary_Age_Male | OOSR_Pre0Primary_Age_Female | OOSR_Primary_Age_Male | OOSR_Primary_Age_Female | OOSR_Lower_Secondary_Age_Male | OOSR_Lower_Secondary_Age_Female | OOSR_Upper_Secondary_Age_Male | ... | Primary_End_Proficiency_Reading | Primary_End_Proficiency_Math | Lower_Secondary_End_Proficiency_Reading | Lower_Secondary_End_Proficiency_Math | Youth_15_24_Literacy_Rate_Male | Youth_15_24_Literacy_Rate_Female | Birth_Rate | Gross_Primary_Education_Enrollment | Gross_Tertiary_Education_Enrollment | Unemployment_Rate |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Afghanistan | 33.93911 | 67.709953 | 0 | 0 | 0 | 0 | 0 | 0 | 44 | ... | 13 | 11 | 0 | 0 | 74 | 56 | 32.49 | 104.0 | 9.7 | 11.12 |
| 1 | Albania | 41.153332 | 20.168331 | 4 | 2 | 6 | 3 | 6 | 1 | 21 | ... | 0 | 0 | 48 | 58 | 99 | 100 | 11.78 | 107.0 | 55.0 | 12.33 |
| 2 | Algeria | 28.033886 | 1.659626 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 21 | 19 | 98 | 97 | 24.28 | 109.9 | 51.4 | 11.7 |
| 3 | Andorra | 42.506285 | 1.521801 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 7.2 | 106.4 | 0.0 | 0.0 |
| 4 | Angola | 11.202692 | 17.873887 | 31 | 39 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 40.73 | 113.5 | 9.3 | 6.89 |


In [5]:
data.describe()

| | Latitude | Longitude | OOSR_Pre0Primary_Age_Male | OOSR_Pre0Primary_Age_Female | OOSR_Primary_Age_Male | OOSR_Primary_Age_Female | OOSR_Lower_Secondary_Age_Male | OOSR_Lower_Secondary_Age_Female | OOSR_Upper_Secondary_Age_Male | OOSR_Upper_Secondary_Age_Female | ... | Primary_End_Proficiency_Reading | Primary_End_Proficiency_Math | Lower_Secondary_End_Proficiency_Reading | Lower_Secondary_End_Proficiency_Math | Youth_15_24_Literacy_Rate_Male | Youth_15_24_Literacy_Rate_Female | Birth_Rate | Gross_Primary_Education_Enrollment | Gross_Tertiary_Education_Enrollment | Unemployment_Rate |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 202.0 | 202.0 | 202.0 | 202.0 | 202.0 | 202.0 | 202.0 | 202.0 | 202.0 | 202.0 | ... | 202.0 | 202.0 | 202.0 | 202.0 | 202.0 | 202.0 | 202.0 | 202.0 | 202.0 | 202.0 |
| mean | 25.081422 | 55.166928 | 19.658416 | 19.282178 | 5.282178 | 5.569307 | 8.707921 | 8.831683 | 20.292079 | 19.975248 | ... | 10.717822 | 10.376238 | 25.787129 | 24.450495 | 35.80198 | 35.084158 | 18.91401 | 94.942574 | 34.392574 | 6.0 |
| std | 16.813639 | 45.976287 | 25.007604 | 25.171147 | 9.396442 | 10.383092 | 13.258203 | 14.724717 | 21.485592 | 23.140376 | ... | 24.866101 | 22.484423 | 33.181384 | 31.965467 | 45.535186 | 45.249643 | 10.828184 | 29.769338 | 29.978206 | 5.273136 |
| min | 0.023559 | 0.824782 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 25% | 11.685062 | 18.665678 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.25 | 0.25 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 10.355 | 97.2 | 9.0 | 2.3025 |
| 50% | 21.207861 | 43.518091 | 9.0 | 7.0 | 1.0 | 1.0 | 2.0 | 2.0 | 15.0 | 12.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 17.55 | 101.85 | 24.85 | 4.585 |
| 75% | 39.901792 | 77.684945 | 31.0 | 30.0 | 6.0 | 6.75 | 12.75 | 10.75 | 32.75 | 30.0 | ... | 0.0 | 0.0 | 56.75 | 50.75 | 94.0 | 96.75 | 27.6925 | 107.3 | 59.975 | 8.655 |
| max | 64.963051 | 178.065032 | 96.0 | 96.0 | 58.0 | 67.0 | 61.0 | 70.0 | 84.0 | 89.0 | ... | 99.0 | 89.0 | 89.0 | 94.0 | 100.0 | 100.0 | 46.08 | 142.5 | 136.6 | 28.18 |


In [15]:
data.columns

Index(['Countries and areas', 'Latitude ', 'Longitude',
       'OOSR_Pre0Primary_Age_Male', 'OOSR_Pre0Primary_Age_Female',
       'OOSR_Primary_Age_Male', 'OOSR_Primary_Age_Female',
       'OOSR_Lower_Secondary_Age_Male', 'OOSR_Lower_Secondary_Age_Female',
       'OOSR_Upper_Secondary_Age_Male', 'OOSR_Upper_Secondary_Age_Female',
       'Completion_Rate_Primary_Male', 'Completion_Rate_Primary_Female',
       'Completion_Rate_Lower_Secondary_Male',
       'Completion_Rate_Lower_Secondary_Female',
       'Completion_Rate_Upper_Secondary_Male',
       'Completion_Rate_Upper_Secondary_Female',
       'Grade_2_3_Proficiency_Reading', 'Grade_2_3_Proficiency_Math',
       'Primary_End_Proficiency_Reading', 'Primary_End_Proficiency_Math',
       'Lower_Secondary_End_Proficiency_Reading',
       'Lower_Secondary_End_Proficiency_Math',
       'Youth_15_24_Literacy_Rate_Male', 'Youth_15_24_Literacy_Rate_Female',
       'Birth_Rate', 'Gross_Primary_Education_Enrollment',
       'Gross_Tertiary_Edu

In [16]:
data.dtypes

Countries and areas                         object
Latitude                                   float64
Longitude                                  float64
OOSR_Pre0Primary_Age_Male                    int64
OOSR_Pre0Primary_Age_Female                  int64
OOSR_Primary_Age_Male                        int64
OOSR_Primary_Age_Female                      int64
OOSR_Lower_Secondary_Age_Male                int64
OOSR_Lower_Secondary_Age_Female              int64
OOSR_Upper_Secondary_Age_Male                int64
OOSR_Upper_Secondary_Age_Female              int64
Completion_Rate_Primary_Male                 int64
Completion_Rate_Primary_Female               int64
Completion_Rate_Lower_Secondary_Male         int64
Completion_Rate_Lower_Secondary_Female       int64
Completion_Rate_Upper_Secondary_Male         int64
Completion_Rate_Upper_Secondary_Female       int64
Grade_2_3_Proficiency_Reading                int64
Grade_2_3_Proficiency_Math                   int64
Primary_End_Proficiency_Reading

In [10]:
data.isnull().sum() # checking for null values

Countries and areas                        0
Latitude                                   0
Longitude                                  0
Pre_Primary_Age_Male                       0
Pre_Primary_Age_Female                     0
Primary_Age_Male                           0
Primary_Age_Female                         0
Lower_Secondary_Age_Male                   0
Lower_Secondary_Age_Female                 0
Upper_Secondary_Age_Male                   0
Upper_Secondary_Age_Female                 0
Completion_Rate_Primary_Male               0
Completion_Rate_Primary_Female             0
Completion_Rate_Lower_Secondary_Male       0
Completion_Rate_Lower_Secondary_Female     0
Completion_Rate_Upper_Secondary_Male       0
Completion_Rate_Upper_Secondary_Female     0
Grade_2_3_Proficiency_Reading              0
Grade_2_3_Proficiency_Math                 0
Primary_End_Proficiency_Reading            0
Primary_End_Proficiency_Math               0
Lower_Secondary_End_Proficiency_Reading    0
Lower_Seco

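Inspecting the output shows that there are no null values in the dataset. Many columns do contain zeros, however: every proficiency and literacy value shown for Andorra is 0, and the median youth literacy rate is 0.0. This suggests that missing observations may be encoded as 0 rather than as nulls.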
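Because `isnull()` does not flag zeros, it can help to count them separately. The sketch below does this on a hypothetical five-row frame that reuses values from the first rows of the dataset. Treating 0 as "not reported" is an assumption that should be checked against the dataset's documentation before applying it to the full data.

```python
import pandas as pd

# Hypothetical five-row frame; values copied from the first rows of the dataset.
df = pd.DataFrame({
    "Countries and areas": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
    "Youth_15_24_Literacy_Rate_Male": [74, 99, 98, 0, 0],
    "Unemployment_Rate": [11.12, 12.33, 11.70, 0.0, 6.89],
})

# Count zeros per numeric column; isnull() would report none of these.
numeric_cols = df.select_dtypes(include="number").columns
zero_counts = (df[numeric_cols] == 0).sum()
print(zero_counts)

# Assumption: a zero in these columns means "not reported".
# mask() replaces the flagged cells with NaN so that later statistics skip them.
cleaned = df.copy()
for col in numeric_cols:
    cleaned[col] = df[col].mask(df[col] == 0)

print(cleaned.isnull().sum())
```

After masking, `describe()` on `cleaned` computes means and quartiles over reported values only, instead of being pulled toward zero.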

#### Birth Rate

In [21]:
# Color each country by its birth rate; countries are matched by name
fig = px.choropleth(data, locations='Countries and areas', locationmode='country names',
                    color='Birth_Rate', range_color=[0, 100],
                    title='Birth Rate')
fig.show()
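The summary statistics earlier in the notebook report a maximum `Birth_Rate` of 46.08, so a fixed range of 0 to 100 leaves the upper half of the color scale unused. One option is to derive the range from the data, as in this rough sketch. `padded_color_range` is a hypothetical helper, and the sample series reuses the birth rates from the first five rows.

```python
import math

import pandas as pd


def padded_color_range(values, step=10):
    """Return [0, upper], where upper is the maximum rounded up to a multiple of step."""
    upper = math.ceil(values.max() / step) * step
    return [0, upper]


# Birth rates from the first five rows of the dataset.
birth_rate = pd.Series([32.49, 11.78, 24.28, 7.2, 40.73], name="Birth_Rate")
print(padded_color_range(birth_rate))  # [0, 50]
```

The returned pair could then be passed as `range_color` to `px.choropleth` in place of the hard-coded `[0, 100]`.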

#### Five Countries with the Highest Unemployment Rate

In [17]:
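# Sort descending and keep the five countries with the highest unemployment rate
unemployment_rate = data.sort_values('Unemployment_Rate', ascending=False).head()

In [18]:
# Plot from the five-row subset so that x and y come from the same frame
px.bar(unemployment_rate, x='Countries and areas', y='Unemployment_Rate',
       title='Unemployment Rate',
       labels={'Countries and areas': 'Countries', 'Unemployment_Rate': 'Rate'})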
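As an alternative to sorting the whole frame, pandas provides `DataFrame.nlargest`, which selects the top rows by a column directly. The sketch below compares the two approaches on a hypothetical five-row frame built from values in the first rows of the dataset. It picks the top three only because the sample is so small.

```python
import pandas as pd

# Hypothetical five-row frame; values copied from the first rows of the dataset.
df = pd.DataFrame({
    "Countries and areas": ["Afghanistan", "Albania", "Algeria", "Andorra", "Angola"],
    "Unemployment_Rate": [11.12, 12.33, 11.70, 0.0, 6.89],
})

# Top three by unemployment rate, without sorting the full frame.
top3 = df.nlargest(3, "Unemployment_Rate")

# Equivalent selection using sort_values, as in the cell above.
by_sort = df.sort_values("Unemployment_Rate", ascending=False).head(3)

print(top3["Countries and areas"].tolist())
```

Both selections return Albania, Algeria, and Afghanistan in that order. On the full dataset, `data.nlargest(5, 'Unemployment_Rate')` would be the corresponding call.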