## Global Student Diversity Insights

###### The hypothesis is that the number of female students in academic programs has increased slowly over the last decade. To test it, we compare how many female and male students were enrolled in each academic year, using the Global_Student_Diversity_Insights database. The analysis looks for patterns and differences in gender participation across specific time periods. Beyond describing those trends, the results could help shape educational policies that create more inclusive and diverse academic environments worldwide.

**Step 1:**

###### Connect to the Global_Student_Diversity_Insights database from VSCode and explore its tables.

In [1]:
import numpy as np
import pandas as pd
import pymysql as mysql
import matplotlib.pyplot as plt
from sqlalchemy import create_engine

In [3]:
# Password for the local MySQL server
password = '123123123'

# Create a direct PyMySQL connection to the Global_Student_Diversity_Insights database
conn = mysql.connect(host='localhost',
                     port=3306, user='root',
                     passwd=password,
                     db='Global_Student_Diversity_Insights')

# Create the SQLAlchemy engine that pandas uses for the queries below
engin = create_engine(f"mysql+pymysql://root:{password}@localhost:3306/Global_Student_Diversity_Insights")
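
Hard-coding the password in a notebook makes it easy to leak. As a minimal sketch of one alternative, the connection URL could be built from an environment variable instead. The helper `build_mysql_url` and the variable name `MYSQL_PASSWORD` are assumptions made for illustration; they are not part of the original notebook.

```python
import os
from urllib.parse import quote_plus


def build_mysql_url(user, password, host="localhost", port=3306,
                    database="Global_Student_Diversity_Insights"):
    """Return a mysql+pymysql URL with the credentials percent-encoded."""
    return (
        f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# MYSQL_PASSWORD is a hypothetical variable name; the fallback is a placeholder.
password = os.environ.get("MYSQL_PASSWORD", "change-me")
url = build_mysql_url("root", password)
# The resulting string could then be passed to sqlalchemy.create_engine(url).
```

Percent-encoding the credentials keeps characters such as `@` or `/` in a password from being read as part of the URL structure.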

In [4]:
showTable = pd.read_sql("SHOW TABLES;", engin)
showTable

Unnamed: 0,Tables_in_global_student_diversity_insights
0,academic_detail_modified
1,academic_modified
2,field_of_study_modified
3,origin_modified
4,source_of_fund_modified
5,status_modified


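###### The Global_Student_Diversity_Insights database consists of six tables: academic details (academic_detail_modified), enrollment totals (academic_modified), field of study (field_of_study_modified), student origin (origin_modified), funding sources (source_of_fund_modified), and student status (status_modified). Together they cover the main dimensions of global student diversity.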
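
The cells that follow run the same preview query once per table. A loop could do that in one place; the sketch below is one possible shape for it. The helper `preview_tables` is hypothetical, and an in-memory SQLite database stands in for the MySQL server so that the example runs on its own (the two sample rows are taken from the `status_modified` preview in this notebook). `DESCRIBE` is MySQL-specific, so only the `SELECT` preview is shown.

```python
import sqlite3

import pandas as pd


def preview_tables(con, table_names, n=3):
    """Return {table name: DataFrame holding its first n rows}."""
    return {
        name: pd.read_sql(f"SELECT * FROM {name} LIMIT {int(n)}", con)
        for name in table_names
    }


# Stand-in database so the sketch runs without a MySQL server.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE status_modified (year INTEGER, female REAL, male REAL)")
demo.executemany(
    "INSERT INTO status_modified VALUES (?, ?, ?)",
    [(2007, 278841.0, 344964.0), (2008, 304242.0, 367374.0)],
)

previews = preview_tables(demo, ["status_modified"], n=1)
print(previews["status_modified"])
```

Against the real database, the same function could presumably be called with the SQLAlchemy engine and the table names returned by `SHOW TABLES`. Table names are interpolated into the query text, so they should only come from a trusted source.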

In [29]:
academicDetail = pd.read_sql("DESCRIBE academic_detail_modified;", engin)
print("Description of the academic_detail_modified table:")
academicDetail


Description of the academic_detail_modified table:


Unnamed: 0,Field,Type,Null,Key,Default,Extra
0,year,int,YES,,,
1,academic_type,varchar(13),YES,,,
2,academic_level,varchar(29),YES,,,
3,students,int,YES,,,


In [28]:
academicDetailDisplay = pd.read_sql("SELECT * FROM academic_detail_modified;", engin)
print("Display of the table:")
academicDetailDisplay.head(3)

Display of the table:


Unnamed: 0,year,academic_type,academic_level,students
0,1999,Undergraduate,Associate's,59830
1,1999,Undergraduate,Bachelor's,177381
2,1999,Graduate,Master's,110857


In [18]:
academic = pd.read_sql("DESCRIBE academic_modified;", engin)
print("Description of the academic_modified table:")
academic

Description of the academic_modified table:


Unnamed: 0,Field,Type,Null,Key,Default,Extra
0,year,int,YES,,,
1,students,int,YES,,,
2,us_students,float,YES,,,
3,undergraduate,float,YES,,,
4,graduate,float,YES,,,
5,non_degree,float,YES,,,
6,opt,float,YES,,,


In [30]:
academicDisplay = pd.read_sql("SELECT * FROM academic_modified;", engin)
print("Display of the table:")
academicDisplay.head(3)

Display of the table:


Unnamed: 0,year,students,us_students,undergraduate,graduate,non_degree,opt
0,1979,286343,11570000.0,172520.0,94130.0,16850.0,2840.0
1,1980,311882,12097000.0,186660.0,99110.0,21660.0,3450.0
2,1981,326299,12372000.0,195150.0,106290.0,21980.0,2880.0


In [19]:
studyField = pd.read_sql("DESCRIBE field_of_study_modified;", engin)
print("Description of the field_of_study_modified table:")
studyField

Description of the field_of_study_modified table:


Unnamed: 0,Field,Type,Null,Key,Default,Extra
0,year,int,YES,,,
1,field_of_study,varchar(33),YES,,,
2,major,varchar(52),YES,,,
3,students,double,YES,,,


In [31]:
studyFieldDisplay = pd.read_sql("SELECT * FROM field_of_study_modified;", engin)
print("Display of the table:")
studyFieldDisplay.head(3)

Display of the table:


Unnamed: 0,year,field_of_study,major,students
0,1998,Agriculture,Agriculture,6146.0
1,1998,Agriculture,Natural Resources and Conservation,1803.0
2,1998,Business and Management,Business and Management,101360.0


In [20]:
origin = pd.read_sql("DESCRIBE origin_modified;", engin)
print("Description of the origin_modified table:")
origin

Description of the origin_modified table:


Unnamed: 0,Field,Type,Null,Key,Default,Extra
0,year,int,YES,,,
1,origin_region,varchar(27),YES,,,
2,origin,varchar(40),YES,,,
3,academic_type,varchar(13),YES,,,
4,students,int,YES,,,


In [32]:
originDisplay = pd.read_sql("SELECT * FROM origin_modified;", engin)
print("Display of the table:")
originDisplay.head(3)

Display of the table:


Unnamed: 0,year,origin_region,origin,academic_type,students
0,2000,"Africa, Subsaharan","Africa, Subsaharan, Unspecified",Graduate,2
1,2000,"Africa, Subsaharan","Africa, Subsaharan, Unspecified",Other,0
2,2000,"Africa, Subsaharan","Africa, Subsaharan, Unspecified",Undergraduate,6


In [21]:
sourceOfFund = pd.read_sql("DESCRIBE source_of_fund_modified;", engin)
print("Description of the source_of_fund_modified table:")
sourceOfFund

Description of the source_of_fund_modified table:


Unnamed: 0,Field,Type,Null,Key,Default,Extra
0,year,int,YES,,,
1,academic_type,varchar(13),YES,,,
2,source_type,varchar(13),YES,,,
3,source_of_fund,varchar(32),YES,,,
4,students,int,YES,,,


In [33]:
sourceOfFundDisplay = pd.read_sql("SELECT * FROM source_of_fund_modified;", engin)
print("Display of the table:")
sourceOfFundDisplay.head(3)

Display of the table:


Unnamed: 0,year,academic_type,source_type,source_of_fund,students
0,1999,Undergraduate,International,Personal and Family,201578
1,1999,Undergraduate,International,Foreign Government or University,9742
2,1999,Undergraduate,International,Foreign Private Sponsor,6245


In [22]:
status = pd.read_sql("DESCRIBE status_modified;", engin)
print("Description of the status_modified table:")
status

Description of the status_modified table:


Unnamed: 0,Field,Type,Null,Key,Default,Extra
0,year,int,YES,,,
1,female,double,YES,,,
2,male,double,YES,,,
3,single,double,YES,,,
4,married,double,YES,,,
5,full_time,double,YES,,,
6,part_time,double,YES,,,
7,visa_f,double,YES,,,
8,visa_j,double,YES,,,
9,visa_other,double,YES,,,


In [34]:
statusDisplay = pd.read_sql("SELECT * FROM status_modified;", engin)
print("Display of the table:")
statusDisplay.head(3)

Display of the table:


Unnamed: 0,year,female,male,single,married,full_time,part_time,visa_f,visa_j,visa_other
0,2007,278841.0,344964.0,543958.0,79847.0,575772.0,48033.0,552691.0,31814.0,39300.0
1,2008,304242.0,367374.0,591694.0,79922.0,613185.0,58431.0,589007.0,39625.0,42984.0
2,2009,309534.0,381389.0,615612.0,75311.0,637722.0,53201.0,612158.0,38692.0,40073.0
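
The hypothesis concerns female enrollment over time, and the `female` and `male` columns of `status_modified` are the natural inputs for it. As a rough sketch of how the comparison could be set up, the example below computes the female share per year and its year-over-year change. It uses only the three rows previewed above, which are far too few to support or reject the hypothesis; the column names `female_share` and `share_change_pp` are introduced here for illustration.

```python
import pandas as pd

# First three rows of status_modified, copied from the preview above.
status_sample = pd.DataFrame({
    "year": [2007, 2008, 2009],
    "female": [278841.0, 304242.0, 309534.0],
    "male": [344964.0, 367374.0, 381389.0],
})

# Female share of the students whose gender is recorded in that year.
status_sample["female_share"] = status_sample["female"] / (
    status_sample["female"] + status_sample["male"]
)

# Year-over-year change in the share, in percentage points.
status_sample["share_change_pp"] = status_sample["female_share"].diff() * 100

print(status_sample[["year", "female_share", "share_change_pp"]].round(4))
```

Looking at the share alongside the raw counts matters because both `female` and `male` grow in these rows, so a rising count alone would not show whether representation changed. In the notebook, the same two column assignments could be applied to the full `statusDisplay` DataFrame.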
