# PyCity Schools Analysis

* As a whole, small and medium-sized schools dramatically outperformed large schools on math passing rates (89-91% passing vs. 67%).
* As a whole, schools with higher budgets did not yield better test results. In fact, schools with higher per-student spending ($645-675 per student) underperformed compared to schools with lower per-student spending (<$585 per student).
* As a whole, charter schools outperformed the public district schools across all metrics. However, more analysis is required to determine whether the effect is due to school practices or to the fact that charter schools tend to serve smaller student populations per school.
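
The size comparison in the first bullet can be sketched as follows. This is a minimal illustration on hypothetical toy data, not the notebook's actual computation: the bin edges, the labels, and the names `size_bins`, `size_labels`, and `pass_rate_by_size` are assumptions, since the analysis above does not state its cutoffs. It also pools students within each size group, whereas averaging per-school passing rates would weight small schools more heavily.

```python
import pandas as pd

# Hypothetical toy data: one row per student, with the school's enrollment repeated.
students = pd.DataFrame({
    "school_name": ["A", "A", "B", "B", "C", "C", "C", "C"],
    "size": [500, 500, 1500, 1500, 3000, 3000, 3000, 3000],
    "math_score": [85, 72, 90, 65, 60, 75, 55, 68],
})

# Assumed bin edges and labels (right-inclusive intervals by default in pd.cut).
size_bins = [0, 1000, 2000, 5000]
size_labels = ["Small", "Medium", "Large"]
students["size_group"] = pd.cut(students["size"], bins=size_bins, labels=size_labels)

# The mean of a boolean column is the share of students at or above 70.
students["passed_math"] = students["math_score"] >= 70
pass_rate_by_size = (
    students.groupby("size_group", observed=True)["passed_math"].mean() * 100
)
print(pass_rate_by_size.round(1))
```

The same pattern applies to spending per student by binning `budget / size` instead of `size`.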

## District Summary

* Calculate the total number of schools

* Calculate the total number of students

* Calculate the total budget

* Calculate the average math score 

* Calculate the average reading score

* Calculate the overall passing rate, defined here as the overall average score: (avg. math score + avg. reading score) / 2

* Calculate the percentage of students with a passing math score (70 or greater)

* Calculate the percentage of students with a passing reading score (70 or greater)

* Create a dataframe to hold the above results

* Optional: give the displayed data cleaner formatting

In [5]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
school_data_to_load = "Resources/schools_complete.csv"
student_data_to_load = "Resources/students_complete.csv"

# Read School and Student Data File and store into Pandas Data Frames
school_data = pd.read_csv(school_data_to_load)
student_data = pd.read_csv(student_data_to_load)

# Combine the data into a single dataset
school_data_complete = pd.merge(student_data, school_data, how="outer", on="school_name")
school_data_complete.head()

| | Student ID | student_name | gender | grade | school_name | reading_score | math_score | School ID | type | size | budget |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | Paul Bradley | M | 9th | Huang High School | 66 | 79 | 0 | District | 2917 | 1910635 |
| 1 | 1 | Victor Smith | M | 12th | Huang High School | 94 | 61 | 0 | District | 2917 | 1910635 |
| 2 | 2 | Kevin Rodriguez | M | 12th | Huang High School | 90 | 60 | 0 | District | 2917 | 1910635 |
| 3 | 3 | Dr. Richard Scott | M | 12th | Huang High School | 67 | 58 | 0 | District | 2917 | 1910635 |
| 4 | 4 | Bonnie Ray | F | 9th | Huang High School | 97 | 84 | 0 | District | 2917 | 1910635 |
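
An outer merge keeps rows that fail to match on either side, so it can be worth checking the join before computing statistics. The sketch below uses hypothetical miniature tables (only "Huang High School" and its budget come from the data above; the other rows are invented for illustration) and the `validate` and `indicator` options of `pd.merge`.

```python
import pandas as pd

# Hypothetical miniature versions of the two input tables.
schools = pd.DataFrame({
    "school_name": ["Huang High School", "Example Charter School"],
    "budget": [1910635, 1000000],  # second budget is made up
})
students = pd.DataFrame({
    "student_name": ["Paul Bradley", "Victor Smith", "Jane Doe"],
    "school_name": ["Huang High School", "Huang High School", "Unlisted School"],
})

merged = pd.merge(
    students,
    schools,
    how="outer",
    on="school_name",
    validate="many_to_one",  # raises MergeError if a school name repeats in `schools`
    indicator=True,          # adds a `_merge` column: both / left_only / right_only
)

# Rows that did not find a partner in the other table.
unmatched = merged[merged["_merge"] != "both"]
print(unmatched[["student_name", "school_name", "_merge"]])
```

If `unmatched` is empty on the real data, the outer merge behaves the same as a left merge and the row count equals the number of students.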


In [6]:
# Count distinct school names
total_schools = school_data_complete["school_name"].nunique()
total_schools

15

In [7]:
student_count = school_data_complete["student_name"].count()
student_count

39170

In [8]:
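# Get total budget by summing one budget per school from the school-level table.
# Summing unique budget values from the merged data would undercount if two schools shared a budget.
total_budget = float(school_data["budget"].sum())
total_budget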

24649428.0
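
Summing budgets from the merged, student-level table needs care because each school's budget repeats once per student. The sketch below, on hypothetical data, shows that deduplicating budget values with `unique()` can undercount when two schools happen to share a budget, while summing the school-level table (or deduplicating by school name) does not. All names and figures here are illustrative assumptions.

```python
import pandas as pd

# Hypothetical school-level table in which schools A and B share the same budget.
schools = pd.DataFrame({
    "school_name": ["A", "B", "C"],
    "size": [2, 1, 3],
    "budget": [1000, 1000, 2500],
})

# Student-level table: each school's row repeated once per student, as after a merge.
merged = schools.loc[schools.index.repeat(schools["size"])].reset_index(drop=True)

total_from_unique = merged["budget"].unique().sum()  # the shared 1000 is counted once
total_from_schools = schools["budget"].sum()         # one budget per school
total_from_dedup = merged.drop_duplicates("school_name")["budget"].sum()

print(total_from_unique, total_from_schools, total_from_dedup)  # 3500 4500 4500
```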

In [9]:
#Calculate average math score
average_math = float(school_data_complete["math_score"].mean())
average_math

78.98537145774827

In [10]:
#Calculate average reading score
average_read = float(school_data_complete["reading_score"].mean())
average_read

81.87784018381414

In [11]:
#Calculate overall passing rate
overall_passing = ((average_math)+(average_read))/2
overall_passing

80.43160582078121

In [12]:
# Find percentage of students with a passing math score (70 or greater)
math_students = school_data_complete.loc[school_data_complete["math_score"] >= 70, :]
math_pass_count = math_students["student_name"].count()
percent_math_pass = float(math_pass_count / student_count * 100)
percent_math_pass

74.9808526933878

In [13]:
# Find percentage of students with a passing reading score (70 or greater)
reading_students = school_data_complete.loc[school_data_complete["reading_score"] >= 70, :]
reading_pass_count = reading_students["student_name"].count()
percent_reading_pass = float(reading_pass_count / student_count * 100)
percent_reading_pass

85.80546336482001
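
Both passing percentages follow the same pattern, so they can be expressed with one helper. This is a sketch rather than the notebook's method: `percent_passing` and `PASSING_SCORE` are hypothetical names, and the toy scores are the five rows shown in the merged preview above. Note that a missing score compares as not passing here and still counts in the denominator.

```python
import pandas as pd

PASSING_SCORE = 70  # threshold stated in the task list above


def percent_passing(scores: pd.Series, threshold: int = PASSING_SCORE) -> float:
    """Return the percentage of scores at or above `threshold`."""
    # The mean of a boolean Series is the fraction of True values.
    return float((scores >= threshold).mean() * 100)


# First five rows of the merged data, used only as a small example.
toy = pd.DataFrame({
    "math_score": [79, 61, 60, 58, 84],
    "reading_score": [66, 94, 90, 67, 97],
})

math_pct = percent_passing(toy["math_score"])        # 2 of 5 students pass
reading_pct = percent_passing(toy["reading_score"])  # 3 of 5 students pass
print(round(math_pct, 2), round(reading_pct, 2))
```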

In [14]:
#Create DataFrame of results
district_summary = pd.DataFrame({
    "Total Schools": [total_schools],
    "Total Students": [student_count],
    "Total District Budget": [total_budget],
    "Average Math Score": [average_math],
    "Average Reading Score": [average_read],
    "Overall Passing Rate": [overall_passing],
    "Students Passing Math": [percent_math_pass],
    "Students Passing Reading": [percent_reading_pass],
})

In [15]:
# Optional formatting: thousands separators for the budget, percent sign after the number
district_summary["Total District Budget"] = district_summary["Total District Budget"].map("${:,.2f}".format)
district_summary["Average Math Score"] = district_summary["Average Math Score"].map("{:.2f}".format)
district_summary["Average Reading Score"] = district_summary["Average Reading Score"].map("{:.2f}".format)
district_summary["Overall Passing Rate"] = district_summary["Overall Passing Rate"].map("{:.2f}".format)
district_summary["Students Passing Math"] = district_summary["Students Passing Math"].map("{:.2f}%".format)
district_summary["Students Passing Reading"] = district_summary["Students Passing Reading"].map("{:.2f}%".format)
district_summary

| | Total Schools | Total Students | Total District Budget | Average Math Score | Average Reading Score | Overall Passing Rate | Students Passing Math | Students Passing Reading |
|---|---|---|---|---|---|---|---|---|
| 0 | 15 | 39170 | $24,649,428.00 | 78.99 | 81.88 | 80.43 | 74.98% | 85.81% |
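
Mapping columns through a format string replaces the numbers with text, so the formatted columns can no longer be used in arithmetic. One alternative, sketched here under the assumption that only a display string is needed, is to leave the DataFrame numeric and pass per-column formatters to `DataFrame.to_string`. The `summary` and `formatters` names are illustrative; the values are copied from the outputs above.

```python
import pandas as pd

# A subset of the district summary, with values taken from the outputs above.
summary = pd.DataFrame({
    "Total District Budget": [24649428.0],
    "Average Math Score": [78.98537145774827],
    "Students Passing Math": [74.9808526933878],
})

# One formatting function per column, applied only when rendering.
formatters = {
    "Total District Budget": "${:,.2f}".format,
    "Average Math Score": "{:.2f}".format,
    "Students Passing Math": "{:.2f}%".format,
}

rendered = summary.to_string(formatters=formatters, index=False)
print(rendered)

# The underlying columns are still floats, so further calculations remain possible.
still_numeric = summary["Average Math Score"].dtype == "float64"
```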
