# Indiana High School Graduation Rates Study

This study examines graduation rates in Indiana high schools, separating students who received diplomas from those who graduated with waivers. We'll examine the relationships between diploma rates and federal school grade, school size, and school attendance, along with any trends and outliers that appear in the data.

In [1]:
# Dependencies
import pandas as pd

In [2]:
# Save file path to the school-level demographics CSV file (only looking at school demographics to begin)
path_demo_school = 'Resources/demographics_public_schools.csv'

In [12]:
# Save file path to the statewide Indiana student demographics summary.
path_demo_summary = 'Resources/demographics_state.csv'

## Indiana Student Demographics Summary

In [13]:
# Read the statewide demographics summary CSV file.
in_school_demo = pd.read_csv(path_demo_summary)
in_school_demo.head()

,Student Demographic,Cohort Count*,Graduates,2017 Graduation Rate,Unnamed: 4
0,American Indian,200,159,79.50%,
1,Asian,1877,1652,88.01%,
2,Black,9615,7487,77.89%,
3,Hispanic,7682,6397,83.29%,
4,Multiracial,3203,2695,84.14%,


In [14]:
# Inspect column names. The trailing 'Unnamed: 4' column is blank in the rows previewed above.
in_school_demo.columns

Index(['Student Demographic', 'Cohort Count*', 'Graduates',
       '2017 Graduation Rate', 'Unnamed: 4'],
      dtype='object')

In [16]:
# Drop the trailing 'Unnamed: 4' column, which is blank in the rows previewed above.
clean_in_df = in_school_demo.drop(columns=["Unnamed: 4"])
clean_in_df.head()

,Student Demographic,Cohort Count*,Graduates,2017 Graduation Rate
0,American Indian,200,159,79.50%
1,Asian,1877,1652,88.01%
2,Black,9615,7487,77.89%
3,Hispanic,7682,6397,83.29%
4,Multiracial,3203,2695,84.14%
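
The graduation rates in this summary are stored as text such as `79.50%`, so they cannot be sorted or averaged numerically as they stand. The following is a minimal sketch of one way to convert them, assuming the values always end in a percent sign. The `sample` frame and the `percent_to_float` helper are hypothetical names used for illustration; they are not part of the original notebook.

```python
import pandas as pd

# Hypothetical stand-in for clean_in_df, built from rows in the preview above.
sample = pd.DataFrame({
    "Student Demographic": ["American Indian", "Asian", "Black"],
    "Cohort Count*": [200, 1877, 9615],
    "Graduates": [159, 1652, 7487],
    "2017 Graduation Rate": ["79.50%", "88.01%", "77.89%"],
})


def percent_to_float(series):
    """Convert strings such as '79.50%' to floats such as 79.5.

    Values that cannot be parsed become NaN instead of raising an error.
    """
    return pd.to_numeric(series.str.rstrip("%"), errors="coerce")


# Keep the original text column and add a numeric one next to it.
sample["Grad Rate (%)"] = percent_to_float(sample["2017 Graduation Rate"])
print(sample[["Student Demographic", "Grad Rate (%)"]])
```

Keeping the original text column alongside the numeric one makes it easy to spot any value that failed to parse.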


## Student Demographics by School

In [26]:
# Read the CSV file, saved from the xlsx file on Indiana graduation rates pulled from doe.in.gov.
school_demo_df = pd.read_csv(path_demo_school)
school_demo_df.head()

,Unnamed: 0,Unnamed: 1,Unnamed: 2,Unnamed: 3,American Indian,Unnamed: 5,Unnamed: 6,Asian,Unnamed: 8,Unnamed: 9,...,Unnamed: 42,Female,Unnamed: 44,Unnamed: 45,Male,Unnamed: 47,Unnamed: 48,Total,Unnamed: 50,Unnamed: 51
0,Corp ID,Corporation Name,Schl ID,School Name,Cohort count,Graduates,Graduation Rate,Cohort count,Graduates,Graduation Rate,...,Graduation Rate,Cohort count,Graduates,Graduation Rate,Cohort count,Graduates,Graduation Rate,Cohort count,Graduates,Graduation Rate
1,0015,Adams Central Community Schools,0021,Adams Central High School,***,***,***,,,,...,,51,50,98.04%,44,43,97.73%,95,93,97.89%
2,0035,South Adams Schools,0023,South Adams High School,***,***,***,***,***,***,...,,54,50,92.59%,57,53,92.98%,111,103,92.79%
3,0025,North Adams Community Schools,0029,Bellmont Senior High School,***,***,***,***,***,***,...,***,89,84,94.38%,95,83,87.37%,184,167,90.76%
4,0125,M S D Southwest Allen County Schls,0047,Homestead Senior High School,,,,15,13,86.67%,...,***,287,282,98.26%,281,264,93.95%,568,546,96.13%


In [30]:
# Inspect column names. The CSV file used two header rows; pandas uses only the first as column names, so the sub-columns appear as 'Unnamed: N'.
school_demo_df.columns

Index(['Unnamed: 0', 'Unnamed: 1', 'Unnamed: 2', 'Unnamed: 3',
       'American Indian', 'Unnamed: 5', 'Unnamed: 6', 'Asian', 'Unnamed: 8',
       'Unnamed: 9', 'Black', 'Unnamed: 11', 'Unnamed: 12', 'Hispanic',
       'Unnamed: 14', 'Unnamed: 15', 'Multiracial', 'Unnamed: 17',
       'Unnamed: 18', 'Native Hawaiian or Other Pacific Islander',
       'Unnamed: 20', 'Unnamed: 21', 'White', 'Unnamed: 23', 'Unnamed: 24',
       'Paid Meals', 'Unnamed: 26', 'Unnamed: 27', 'Free/Reduced Price Meals',
       'Unnamed: 29', 'Unnamed: 30', 'General Education', 'Unnamed: 32',
       'Unnamed: 33', 'Special Education', 'Unnamed: 35', 'Unnamed: 36',
       'Non-English Language Learner', 'Unnamed: 38', 'Unnamed: 39',
       'English Language Learner', 'Unnamed: 41', 'Unnamed: 42', 'Female',
       'Unnamed: 44', 'Unnamed: 45', 'Male', 'Unnamed: 47', 'Unnamed: 48',
       'Total', 'Unnamed: 50', 'Unnamed: 51'],
      dtype='object')

In [28]:
# Rename columns so each name combines the demographic group with its sub-column (cohort count, graduates, grad rate).
renamed_df = school_demo_df.rename(columns={"Unnamed: 0":"Corp ID", 
                                            "Unnamed: 1":"Corporation Name",
                                            "Unnamed: 2":"Schl ID",
                                            "Unnamed: 3":"School Name",
                                            "American Indian":"Native American Cohort Count",
                                            "Unnamed: 5":"Native American Graduates",
                                            "Unnamed: 6":"Native American Grad Rate",
                                            "Asian":"Asian Cohort Count",
                                            "Unnamed: 8":"Asian Graduates",
                                            "Unnamed: 9":"Asian Grad Rate",
                                            "Black":"Black Cohort Count",
                                            "Unnamed: 11":"Black Graduates",
                                            "Unnamed: 12":"Black Grad Rate",
                                            "Hispanic":"Hispanic Cohort Count",
                                            "Unnamed: 14":"Hispanic Graduates",
                                            "Unnamed: 15":"Hispanic Grad Rate",
                                            "Multiracial":"Multiracial Cohort Count",
                                            "Unnamed: 17":"Multiracial Graduates",
                                            "Unnamed: 18":"Multiracial Grad Rate",
                                            "Native Hawaiian or Other Pacific Islander":"Native Pacific Islander Cohort Count",
                                            "Unnamed: 20":"Native Pacific Islander Graduates",
                                            "Unnamed: 21":"Native Pacific Islander Grad Rate",
                                            "White":"White Cohort Count", 
                                            "Unnamed: 23":"White Graduates",
                                            "Unnamed: 24":"White Grad Rate",
                                            "Paid Meals":"Paid Meals Cohort Count",
                                            "Unnamed: 26":"Paid Meals Graduates", 
                                            "Unnamed: 27":"Paid Meals Grad Rate",
                                            "Free/Reduced Price Meals":"Free/Reduced Meals Cohort Count",
                                            "Unnamed: 29":"Free/Reduced Meals Graduates",
                                            "Unnamed: 30":"Free/Reduced Meals Grad Rate",
                                            "General Education":"General Ed Cohort Count",
                                            "Unnamed: 32":"General Ed Graduates",
                                            "Unnamed: 33":"General Ed Grad Rate",
                                            "Special Education":"Special Ed Cohort Count",
                                            "Unnamed: 35":"Special Ed Graduates",
                                            "Unnamed: 36":"Special Ed Grad Rate",
                                            "Non-English Language Learner":"Non-English Language Learner Cohort Count", 
                                            "Unnamed: 38":"Non-English Language Learner Graduates",
                                            "Unnamed: 39":"Non-English Language Learner Grad Rate",
                                            "English Language Learner":"English Language Learner Cohort Count",
                                            "Unnamed: 41":"English Language Learner Graduates",
                                            "Unnamed: 42":"English Language Learner Grad Rate",
                                            "Female":"Female Cohort Count",
                                            "Unnamed: 44":"Female Graduates",
                                            "Unnamed: 45":"Female Grad Rate",
                                            "Male":"Male Cohort Count",
                                            "Unnamed: 47":"Male Graduates", 
                                            "Unnamed: 48":"Male Grad Rate",
                                            "Total":"Total Cohort Count", 
                                            "Unnamed: 50":"Total Graduates", 
                                            "Unnamed: 51":"Total Grad Rate"})
#renamed_df.head()
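
The mapping above is written out by hand. As an alternative, the same kind of names could be generated from the two header rows. The sketch below assumes that each group label (for example `Female`) sits in the first column of its group and that the sub-labels sit in the first data row, as the preview suggests. The `raw` frame and the `flatten_header` function are hypothetical illustrations. The generated names follow the file's own labels (for example `Female Graduation Rate`), so they differ from the hand-written ones (`Female Grad Rate`).

```python
import pandas as pd

# Hypothetical miniature of the raw frame: group labels in the header,
# sub-labels ("Cohort count", ...) in the first data row.
raw = pd.DataFrame(
    [
        ["Corp ID", "School Name", "Cohort count", "Graduates", "Graduation Rate",
         "Cohort count", "Graduates", "Graduation Rate"],
        ["0015", "Adams Central High School", "51", "50", "98.04%",
         "44", "43", "97.73%"],
    ],
    columns=["Unnamed: 0", "Unnamed: 1", "Female", "Unnamed: 3", "Unnamed: 4",
             "Male", "Unnamed: 6", "Unnamed: 7"],
)


def flatten_header(df):
    """Combine group labels (column names) with sub-labels (first data row)."""
    sub_labels = df.iloc[0]
    new_names = []
    group = None
    for col, sub in zip(df.columns, sub_labels):
        if not str(col).startswith("Unnamed"):
            group = col  # a new demographic group starts at this column
        # Identifier columns come before any group, so keep the sub-label alone.
        new_names.append(sub if group is None else f"{group} {sub}")
    out = df.drop(index=0)  # remove the row that held the sub-labels
    out.columns = new_names
    return out


flat = flatten_header(raw)
print(list(flat.columns))
```

A generated mapping is shorter and less prone to typos, while the hand-written dictionary gives full control over each final name.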

In [31]:
# Drop the first data row, which holds the CSV file's second header row.
clean_demo_df = renamed_df.drop([0])
clean_demo_df.head()

,Corp ID,Corporation Name,Schl ID,School Name,Native American Cohort Count,Native American Graduates,Native American Grad Rate,Asian Cohort Count,Asian Graduates,Asian Grad Rate,...,English Language Learner Grad Rate,Female Cohort Count,Female Graduates,Female Grad Rate,Male Cohort Count,Male Graduates,Male Grad Rate,Total Cohort Count,Total Graduates,Total Grad Rate
1,15,Adams Central Community Schools,21,Adams Central High School,***,***,***,,,,...,,51,50,98.04%,44,43,97.73%,95,93,97.89%
2,35,South Adams Schools,23,South Adams High School,***,***,***,***,***,***,...,,54,50,92.59%,57,53,92.98%,111,103,92.79%
3,25,North Adams Community Schools,29,Bellmont Senior High School,***,***,***,***,***,***,...,***,89,84,94.38%,95,83,87.37%,184,167,90.76%
4,125,M S D Southwest Allen County Schls,47,Homestead Senior High School,,,,15,13,86.67%,...,***,287,282,98.26%,281,264,93.95%,568,546,96.13%
5,255,East Allen County Schools,49,Leo Junior/Senior High School,***,***,***,***,***,***,...,,103,98,95.15%,90,81,90.00%,193,179,92.75%
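
Two details in the previews affect later analysis. Many demographic cells contain `***`, and Corp ID appears as `0015` in the raw preview but as `15` in the cleaned one, which may indicate that the IDs were parsed as integers at some point. The sketch below shows one way to handle both, assuming that `***` marks a suppressed value (the source does not say so explicitly). The inline CSV text and all variable names are hypothetical.

```python
import io

import pandas as pd

# Hypothetical CSV text imitating a few cleaned columns from the preview above.
csv_text = """Corp ID,School Name,Asian Cohort Count,Total Cohort Count,Total Grad Rate
0015,Adams Central High School,,95,97.89%
0035,South Adams High School,***,111,92.79%
0125,Homestead Senior High School,15,568,96.13%
"""

# dtype=str keeps every column as text, which preserves leading zeros in IDs.
df = pd.read_csv(io.StringIO(csv_text), dtype=str)

# Record where values were marked "***" before replacing them, so that
# presumably suppressed cells stay distinguishable from blank ones.
suppressed = df.eq("***")
df = df.mask(suppressed)  # "***" becomes NaN

# Convert count columns to numbers; NaN marks blank or suppressed cells.
for col in ["Asian Cohort Count", "Total Cohort Count"]:
    df[col] = pd.to_numeric(df[col])

# Strip the percent sign, then convert the rate to a float.
df["Total Grad Rate"] = pd.to_numeric(df["Total Grad Rate"].str.rstrip("%"))

print(df)
print(suppressed["Asian Cohort Count"].tolist())
```

Reading everything as text first and converting column by column is slower than letting pandas infer types, but it avoids silent changes to identifier columns.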
