# PyCity Schools Analysis

School data was analyzed to calculate average test scores and the percentage of students who passed (scored 70 or higher on a test), broken down by grade, per-student spending range, school size, and school type. The analysis covers the school district as a whole and each individual school within it.

---

- Medium-sized schools (1000-2000 students) had the highest percentage of students passing overall (90.62%). Small schools (<1000 students) had slightly higher average scores (83.82 in math, 83.93 in reading) but a slightly lower overall passing percentage (89.88%). Large schools (2000-5000 students) had the lowest average scores and by far the lowest overall passing percentage (58.29%). Because nearly all of the large schools shown in the tables are district schools, the effect of size is hard to separate from the effect of school type.

- Charter schools showed higher average test scores in math (83.47) and reading (83.90) than district schools (76.96 and 80.97). Charter schools also showed higher percentages of students passing (scoring 70 or higher) in math (93.62%), reading (96.59%), and overall (90.43%) than district schools, which saw lower passing percentages in math (66.55%), reading (80.80%), and overall (53.67%).

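- The highest scores were seen in schools in the lowest per-student spending range (<$585: average math score 83.46, 90.37% passing overall), while the lowest scores were seen in schools in the highest spending range ($645-680: average math score 77.00, 53.53% passing overall).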

# District Data

In [1]:
import pandas as pd

# File paths for both data files found in the resource folder
school_data_path = "../Resources/schools_complete.csv"
student_data_path = "../Resources/students_complete.csv"

# Store data into data frames
school_data = pd.read_csv(school_data_path)
student_data = pd.read_csv(student_data_path)

# Merges the data into one data frame based on the name of the school
school_data_complete = pd.merge(student_data, school_data, how="left", on="school_name")
school_data_complete.head()

,Student ID,student_name,gender,grade,school_name,reading_score,math_score,School ID,type,size,budget
0,0,Paul Bradley,M,9th,Huang High School,66,79,0,District,2917,1910635
1,1,Victor Smith,M,12th,Huang High School,94,61,0,District,2917,1910635
2,2,Kevin Rodriguez,M,12th,Huang High School,90,60,0,District,2917,1910635
3,3,Dr. Richard Scott,M,12th,Huang High School,67,58,0,District,2917,1910635
4,4,Bonnie Ray,F,9th,Huang High School,97,84,0,District,2917,1910635
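
A left merge like the one above silently duplicates rows if a school name appears more than once in the school table, and leaves missing values if a student's school has no match. The sketch below shows one way to guard against both, using the `validate` and `indicator` parameters of `pd.merge`. The two small frames are hypothetical stand-ins, not the real CSV files.

```python
import pandas as pd

# Hypothetical miniature versions of the student and school tables
students = pd.DataFrame({
    "student_name": ["A", "B", "C"],
    "school_name": ["Huang High School", "Huang High School", "Pena High School"],
})
schools = pd.DataFrame({
    "school_name": ["Huang High School", "Pena High School"],
    "size": [2917, 962],
})

# validate="many_to_one" raises a MergeError if a school name is duplicated
# in the right-hand table; indicator=True adds a "_merge" column
merged = pd.merge(students, schools, how="left", on="school_name",
                  validate="many_to_one", indicator=True)

# Rows marked "left_only" are students whose school had no match
unmatched = merged[merged["_merge"] == "left_only"]
print(len(merged), len(unmatched))
```

If `unmatched` is empty and the merged row count equals the student row count, every student was paired with exactly one school.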


In [2]:
# Calculates number of schools, students, school budget, and 
# average math and reading scores for the entire district
district_school_count = len(school_data_complete["school_name"].unique())
district_student_count = school_data_complete["Student ID"].count()
# Sums each school's budget once (summing unique values would drop a school that shares a budget with another)
district_budget = school_data["budget"].sum()
district_math_score = school_data_complete["math_score"].mean()
district_reading_score = school_data_complete["reading_score"].mean()

# Builds boolean masks that mark scores of 70 or higher as passing
masked_passed_math = school_data_complete["math_score"] >= 70
masked_passed_reading = school_data_complete["reading_score"] >= 70
masked_passed_overall = (school_data_complete["math_score"] >= 70) & \
                        (school_data_complete["reading_score"] >= 70)

# Counts the students who scored 70 or higher in math, in reading, and in both subjects
district_math_count = school_data_complete[masked_passed_math].count()["student_name"]
district_reading_count = school_data_complete[masked_passed_reading].count()["student_name"]
district_overall_count = school_data_complete[masked_passed_overall].count()["student_name"]

# Calculates the percent of students who passed math, reading and overall
district_math_percent = district_math_count / district_student_count * 100
district_reading_percent = district_reading_count / district_student_count * 100
district_overall_percent = district_overall_count / district_student_count * 100
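
The passing percentages above can also be read directly off the boolean masks, since the mean of a boolean Series is the fraction of `True` values. The following is a minimal sketch on hypothetical toy scores (not district data); it also shows that a score of exactly 70 counts as passing.

```python
import pandas as pd

# Hypothetical toy scores, not taken from the district data
toy_scores = pd.DataFrame({
    "math_score": [65, 70, 88, 92],
    "reading_score": [72, 69, 95, 70],
})

# A score of exactly 70 counts as passing because the comparison is >=
passed_math = toy_scores["math_score"] >= 70
passed_reading = toy_scores["reading_score"] >= 70
passed_overall = passed_math & passed_reading

# The mean of a boolean Series is the fraction of True values
math_percent = passed_math.mean() * 100
reading_percent = passed_reading.mean() * 100
overall_percent = passed_overall.mean() * 100

print(math_percent, reading_percent, overall_percent)
```

Note that the overall percentage can be much lower than either subject's percentage, because a student must pass both.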

In [3]:
# Constructs data frame summarizing the school district statistics
district_summary = pd.DataFrame([{
    "Total Schools" : district_school_count,
    "Total Students" : district_student_count,
    "Total Budget" : district_budget,
    "Average Math Score" : district_math_score,
    "Average Reading Score" : district_reading_score,
    "Percent Passing Math" : district_math_percent,
    "Percent Passing Reading" : district_reading_percent,
    "Percent Overall Passing" : district_overall_percent
    }])

# Formats each column
district_summary["Total Students"] = district_summary["Total Students"].map("{:,}".format)
district_summary["Total Budget"] = district_summary["Total Budget"].map("${:,.2f}".format)
district_summary["Average Math Score"] = district_summary["Average Math Score"].map("{:,.2f}".format)
district_summary["Average Reading Score"] = district_summary["Average Reading Score"].map("{:,.2f}".format)
district_summary["Percent Passing Math"] = district_summary["Percent Passing Math"].map("{:.2f}%".format)
district_summary["Percent Passing Reading"] = district_summary["Percent Passing Reading"].map("{:.2f}%".format)
district_summary["Percent Overall Passing"] = district_summary["Percent Overall Passing"].map("{:.2f}%".format)
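# Styler.hide_index() was removed in pandas 2.0; hide(axis="index") is the replacement (pandas 1.4+)
district_summary.style.hide(axis="index")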

Total Schools,Total Students,Total Budget,Average Math Score,Average Reading Score,Percent Passing Math,Percent Passing Reading,Percent Overall Passing
15,"39,170","$24,649,428.00",78.99,81.88,74.98%,85.81%,65.17%


# Per School Data

In [4]:
# Looks up each school's type, then groups the merged data by school and averages its numeric columns
school_types = school_data.set_index(["school_name"])["type"]
school_data_byschool = school_data_complete.groupby(["school_name"])
school_average = school_data_byschool.mean(numeric_only=True)

# Calculates the size, budget, budget per student, and average scores of each school
per_school_counts = school_average["size"]
per_school_budget = school_average["budget"]
per_school_capita = per_school_budget / per_school_counts
per_school_math = school_average["math_score"]
per_school_reading = school_average["reading_score"]

In [5]:
# Finds the number of students in each school that passed math, reading, and overall
per_school_passed_math_count = school_data_complete[masked_passed_math].groupby(["school_name"]).count()["math_score"]
per_school_passed_reading_count = school_data_complete[masked_passed_reading].groupby(["school_name"]).count()["reading_score"]
per_school_passed_overall_count = school_data_complete[masked_passed_overall].groupby(["school_name"]).count()["Student ID"]

# Finds the percent of students in each school that passed math, reading, and overall
per_school_passed_math  = (per_school_passed_math_count / per_school_counts) * 100
per_school_passed_reading = (per_school_passed_reading_count / per_school_counts) * 100
per_school_passed_overall = (per_school_passed_overall_count / per_school_counts) * 100
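
An alternative to counting passing students and dividing by school size is to group the pass/fail flags by school and take their mean. The sketch below assumes hypothetical school names and scores; only the 70-point threshold comes from the analysis above.

```python
import pandas as pd

# Hypothetical student-level data for two made-up schools
toy = pd.DataFrame({
    "school_name": ["North", "North", "North", "North", "South", "South"],
    "math_score": [60, 70, 80, 90, 50, 75],
    "reading_score": [71, 65, 90, 85, 80, 70],
})

# One boolean flag per student and subject
flags = pd.DataFrame({
    "school_name": toy["school_name"],
    "passed_math": toy["math_score"] >= 70,
    "passed_reading": toy["reading_score"] >= 70,
})
flags["passed_overall"] = flags["passed_math"] & flags["passed_reading"]

# The per-school mean of each flag is that school's passing fraction
flag_columns = ["passed_math", "passed_reading", "passed_overall"]
pass_rates = flags.groupby("school_name")[flag_columns].mean() * 100
print(pass_rates)
```

This version divides by the number of student rows actually present for each school rather than by the `size` column, so the two approaches agree only when `size` matches the row count.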

In [6]:
# Constructs data frame summarizing each school's statistics
per_school_summary = pd.DataFrame({
    "School Type" : school_types,
    "Total Students" : per_school_counts,
    "Total School Budget" : per_school_budget,
    "Per Student Budget" : per_school_capita,
    "Average Math Score" : per_school_math,
    "Average Reading Score" : per_school_reading,
    "Percent Passing Math" : per_school_passed_math,
    "Percent Passing Reading" : per_school_passed_reading,
    "Percent Passing Overall" : per_school_passed_overall
})

# Creates and formats a separate data frame to summarize each school's data
# A separate formatted data frame is needed since the numeric per-school data will be used later on
per_school_summary_format = per_school_summary.copy()
per_school_summary_format["Total Students"] = per_school_summary_format["Total Students"].map("{:.0f}".format)
per_school_summary_format["Total School Budget"] = per_school_summary_format["Total School Budget"].map("${:,.2f}".format)
per_school_summary_format["Per Student Budget"] = per_school_summary_format["Per Student Budget"].map("${:,.2f}".format)
per_school_summary_format["Average Math Score"] = per_school_summary_format["Average Math Score"].map("{:,.2f}".format)
per_school_summary_format["Average Reading Score"] = per_school_summary_format["Average Reading Score"].map("{:,.2f}".format)
per_school_summary_format["Percent Passing Math"] = per_school_summary_format["Percent Passing Math"].map("{:,.2f}%".format)
per_school_summary_format["Percent Passing Reading"] = per_school_summary_format["Percent Passing Reading"].map("{:,.2f}%".format)
per_school_summary_format["Percent Passing Overall"] = per_school_summary_format["Percent Passing Overall"].map("{:,.2f}%".format)

per_school_summary_format

school_name,School Type,Total Students,Total School Budget,Per Student Budget,Average Math Score,Average Reading Score,Percent Passing Math,Percent Passing Reading,Percent Passing Overall
Bailey High School,District,4976,"$3,124,928.00",$628.00,77.05,81.03,66.68%,81.93%,54.64%
Cabrera High School,Charter,1858,"$1,081,356.00",$582.00,83.06,83.98,94.13%,97.04%,91.33%
Figueroa High School,District,2949,"$1,884,411.00",$639.00,76.71,81.16,65.99%,80.74%,53.20%
Ford High School,District,2739,"$1,763,916.00",$644.00,77.1,80.75,68.31%,79.30%,54.29%
Griffin High School,Charter,1468,"$917,500.00",$625.00,83.35,83.82,93.39%,97.14%,90.60%
Hernandez High School,District,4635,"$3,022,020.00",$652.00,77.29,80.93,66.75%,80.86%,53.53%
Holden High School,Charter,427,"$248,087.00",$581.00,83.8,83.81,92.51%,96.25%,89.23%
Huang High School,District,2917,"$1,910,635.00",$655.00,76.63,81.18,65.68%,81.32%,53.51%
Johnson High School,District,4761,"$3,094,650.00",$650.00,77.07,80.97,66.06%,81.22%,53.54%
Pena High School,Charter,962,"$585,858.00",$609.00,83.84,84.04,94.59%,95.95%,90.54%
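
The same column-by-column formatting is repeated for several tables in this notebook. One way to reduce the repetition is a small helper that takes a mapping of column names to format strings. `format_columns` and the toy frame below are hypothetical names introduced for illustration; they are not part of the original notebook.

```python
import pandas as pd

def format_columns(frame, formats):
    """Return a copy of frame with the listed columns rendered as strings."""
    formatted = frame.copy()
    for column, template in formats.items():
        formatted[column] = formatted[column].map(template.format)
    return formatted

# Hypothetical summary values for two made-up schools
toy_summary = pd.DataFrame(
    {
        "Total School Budget": [1000000.0, 2500000.5],
        "Percent Passing Math": [66.666, 94.5],
    },
    index=["School A", "School B"],
)

display_formats = {
    "Total School Budget": "${:,.2f}",
    "Percent Passing Math": "{:.2f}%",
}
toy_formatted = format_columns(toy_summary, display_formats)
print(toy_formatted)
```

Because the helper works on a copy, the numeric frame stays available for later sorting, binning, and grouping.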


# Top Performing Schools

In [32]:
# Calculates the highest performing schools and displays the top 5 results
top_schools = per_school_summary.sort_values("Percent Passing Overall", ascending = False)

# Formats Summary Table
top_schools["Total Students"] = top_schools["Total Students"].map("{:.0f}".format)
top_schools["Total School Budget"] = top_schools["Total School Budget"].map("${:,.2f}".format)
top_schools["Per Student Budget"] = top_schools["Per Student Budget"].map("${:,.2f}".format)
top_schools["Average Math Score"] = top_schools["Average Math Score"].map("{:,.2f}".format)
top_schools["Average Reading Score"] = top_schools["Average Reading Score"].map("{:,.2f}".format)
top_schools["Percent Passing Math"] = top_schools["Percent Passing Math"].map("{:,.2f}%".format)
top_schools["Percent Passing Reading"] = top_schools["Percent Passing Reading"].map("{:,.2f}%".format)
top_schools["Percent Passing Overall"] = top_schools["Percent Passing Overall"].map("{:,.2f}%".format)

top_schools.head(5)

school_name,School Type,Total Students,Total School Budget,Per Student Budget,Average Math Score,Average Reading Score,Percent Passing Math,Percent Passing Reading,Percent Passing Overall,School Size
Cabrera High School,Charter,1858,"$1,081,356.00",$582.00,83.06,83.98,94.13%,97.04%,91.33%,Medium (1000-2000)
Thomas High School,Charter,1635,"$1,043,130.00",$638.00,83.42,83.85,93.27%,97.31%,90.95%,Medium (1000-2000)
Griffin High School,Charter,1468,"$917,500.00",$625.00,83.35,83.82,93.39%,97.14%,90.60%,Medium (1000-2000)
Wilson High School,Charter,2283,"$1,319,574.00",$578.00,83.27,83.99,93.87%,96.54%,90.58%,Large (2000-5000)
Pena High School,Charter,962,"$585,858.00",$609.00,83.84,84.04,94.59%,95.95%,90.54%,Small (<1000)


# Bottom Performing Schools

In [33]:
# Calculates the lowest performing schools and displays the bottom 5 results
bot_schools = per_school_summary.sort_values("Percent Passing Overall")

# Formats Summary Table
bot_schools["Total Students"] = bot_schools["Total Students"].map("{:.0f}".format)
bot_schools["Total School Budget"] = bot_schools["Total School Budget"].map("${:,.2f}".format)
bot_schools["Per Student Budget"] = bot_schools["Per Student Budget"].map("${:,.2f}".format)
bot_schools["Average Math Score"] = bot_schools["Average Math Score"].map("{:,.2f}".format)
bot_schools["Average Reading Score"] = bot_schools["Average Reading Score"].map("{:,.2f}".format)
bot_schools["Percent Passing Math"] = bot_schools["Percent Passing Math"].map("{:,.2f}%".format)
bot_schools["Percent Passing Reading"] = bot_schools["Percent Passing Reading"].map("{:,.2f}%".format)
bot_schools["Percent Passing Overall"] = bot_schools["Percent Passing Overall"].map("{:,.2f}%".format)

bot_schools.head(5)

school_name,School Type,Total Students,Total School Budget,Per Student Budget,Average Math Score,Average Reading Score,Percent Passing Math,Percent Passing Reading,Percent Passing Overall,School Size
Rodriguez High School,District,3999,"$2,547,363.00",$637.00,76.84,80.74,66.37%,80.22%,52.99%,Large (2000-5000)
Figueroa High School,District,2949,"$1,884,411.00",$639.00,76.71,81.16,65.99%,80.74%,53.20%,Large (2000-5000)
Huang High School,District,2917,"$1,910,635.00",$655.00,76.63,81.18,65.68%,81.32%,53.51%,Large (2000-5000)
Hernandez High School,District,4635,"$3,022,020.00",$652.00,77.29,80.93,66.75%,80.86%,53.53%,Large (2000-5000)
Johnson High School,District,4761,"$3,094,650.00",$650.00,77.07,80.97,66.06%,81.22%,53.54%,Large (2000-5000)
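
The top and bottom schools can also be selected without a full sort, using `DataFrame.nlargest` and `DataFrame.nsmallest`. This is a sketch on hypothetical values; it has to run on the numeric column, before the percentages are formatted as strings.

```python
import pandas as pd

# Hypothetical overall passing percentages for four made-up schools
toy_summary = pd.DataFrame(
    {"Percent Passing Overall": [54.6, 91.3, 53.2, 90.6]},
    index=["School A", "School B", "School C", "School D"],
)

# Rows come back ordered from best to worst (nlargest) or worst to best (nsmallest)
top_two = toy_summary.nlargest(2, "Percent Passing Overall")
bottom_two = toy_summary.nsmallest(2, "Percent Passing Overall")

print(list(top_two.index), list(bottom_two.index))
```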


# Math Score by Grade

In [9]:
# Filters the merged data into one data frame per grade
ninth_graders = school_data_complete[(school_data_complete["grade"] == "9th")]
tenth_graders = school_data_complete[(school_data_complete["grade"] == "10th")]
eleventh_graders = school_data_complete[(school_data_complete["grade"] == "11th")]
twelfth_graders = school_data_complete[(school_data_complete["grade"] == "12th")]

# Groups each grade by school and averages the numeric columns
ninth_graders_scores = ninth_graders.groupby("school_name").mean(numeric_only=True)
tenth_graders_scores = tenth_graders.groupby("school_name").mean(numeric_only=True)
eleventh_graders_scores = eleventh_graders.groupby("school_name").mean(numeric_only=True)
twelfth_graders_scores = twelfth_graders.groupby("school_name").mean(numeric_only=True)

# Selects the average math score for each school/grade
ninth_grade_math_scores = ninth_graders_scores["math_score"]
tenth_grader_math_scores = tenth_graders_scores["math_score"]
eleventh_grader_math_scores = eleventh_graders_scores["math_score"]
twelfth_grader_math_scores = twelfth_graders_scores["math_score"]

# Summarizes each school's statistics by grade
math_scores_by_grade = pd.DataFrame({
    "9th Grade" : ninth_grade_math_scores,
    "10th Grade" : tenth_grader_math_scores,
    "11th Grade" : eleventh_grader_math_scores,
    "12th Grade" : twelfth_grader_math_scores
})

# Formats the summary table
math_scores_by_grade.index.name = None
math_scores_by_grade["9th Grade"] = math_scores_by_grade["9th Grade"].map("{:,.2f}".format)
math_scores_by_grade["10th Grade"] = math_scores_by_grade["10th Grade"].map("{:,.2f}".format)
math_scores_by_grade["11th Grade"] = math_scores_by_grade["11th Grade"].map("{:,.2f}".format)
math_scores_by_grade["12th Grade"] = math_scores_by_grade["12th Grade"].map("{:,.2f}".format)
math_scores_by_grade

,9th Grade,10th Grade,11th Grade,12th Grade
Bailey High School,77.08,77.00,77.52,76.49
Cabrera High School,83.09,83.15,82.77,83.28
Figueroa High School,76.40,76.54,76.88,77.15
Ford High School,77.36,77.67,76.92,76.18
Griffin High School,82.04,84.23,83.84,83.36
Hernandez High School,77.44,77.34,77.14,77.19
Holden High School,83.79,83.43,85.00,82.86
Huang High School,77.03,75.91,76.45,77.23
Johnson High School,77.19,76.69,77.49,76.86
Pena High School,83.63,83.37,84.33,84.12
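
The school-by-grade table can also be built in one step with `pivot_table`, instead of filtering each grade separately. The sketch below uses hypothetical schools and scores, and only two grades to keep it short. Grade labels sort alphabetically as strings, so the columns are reordered explicitly.

```python
import pandas as pd

# Hypothetical student-level data for two made-up schools and two grades
toy = pd.DataFrame({
    "school_name": ["North", "North", "North", "South", "South", "South"],
    "grade": ["9th", "9th", "10th", "9th", "10th", "10th"],
    "math_score": [70, 80, 90, 60, 75, 85],
})

# One row per school, one column per grade, each cell the mean math score
by_grade = toy.pivot_table(index="school_name", columns="grade",
                           values="math_score", aggfunc="mean")

# "10th" sorts before "9th" as a string, so restore the natural grade order
by_grade = by_grade.reindex(columns=["9th", "10th"])
print(by_grade)
```

Swapping `values="math_score"` for `"reading_score"` would give the reading table with the same call.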


# Reading Scores by Grade

In [10]:
# Selects the average `reading_score` for each school/grade
ninth_grade_reading_scores = ninth_graders_scores["reading_score"]
tenth_grader_reading_scores = tenth_graders_scores["reading_score"]
eleventh_grader_reading_scores = eleventh_graders_scores["reading_score"]
twelfth_grader_reading_scores = twelfth_graders_scores["reading_score"]

# Combine each of the scores above into single DataFrame called `reading_scores_by_grade`
reading_scores_by_grade = pd.DataFrame({
    "9th Grade" : ninth_grade_reading_scores,
    "10th Grade" : tenth_grader_reading_scores,
    "11th Grade" : eleventh_grader_reading_scores,
    "12th Grade" : twelfth_grader_reading_scores
})

# Formats the summary table
reading_scores_by_grade.index.name = None
reading_scores_by_grade["9th Grade"] = reading_scores_by_grade["9th Grade"].map("{:,.2f}".format)
reading_scores_by_grade["10th Grade"] = reading_scores_by_grade["10th Grade"].map("{:,.2f}".format)
reading_scores_by_grade["11th Grade"] = reading_scores_by_grade["11th Grade"].map("{:,.2f}".format)
reading_scores_by_grade["12th Grade"] = reading_scores_by_grade["12th Grade"].map("{:,.2f}".format)
reading_scores_by_grade

,9th Grade,10th Grade,11th Grade,12th Grade
Bailey High School,81.30,80.91,80.95,80.91
Cabrera High School,83.68,84.25,83.79,84.29
Figueroa High School,81.20,81.41,80.64,81.38
Ford High School,80.63,81.26,80.40,80.66
Griffin High School,83.37,83.71,84.29,84.01
Hernandez High School,80.87,80.66,81.40,80.86
Holden High School,83.68,83.32,83.82,84.70
Huang High School,81.29,81.51,81.42,80.31
Johnson High School,81.26,80.77,80.62,81.23
Pena High School,83.81,83.61,84.34,84.59


# Scores by School Spending

In [17]:
# Bins all school data into budget categories: <$585, $585-630, $630-645, $645-680
spending_bins = [0, 585, 630, 645, 680]
labels = ["<$585", "$585-630", "$630-645", "$645-680"]

school_spending_df = per_school_summary.copy()
school_spending_df["Spending Ranges (Per Student)"] = pd.cut(x = school_spending_df["Per Student Budget"], bins = spending_bins, labels = labels)

# Formats the budget summary table
# A separate formatted data frame is needed since the numeric school spending data will be used later on
school_spending_df_formatted = school_spending_df.copy()
school_spending_df_formatted["Total Students"] = school_spending_df_formatted["Total Students"].map("{:.0f}".format)
school_spending_df_formatted["Total School Budget"] = school_spending_df_formatted["Total School Budget"].map("${:,.2f}".format)
school_spending_df_formatted["Per Student Budget"] = school_spending_df_formatted["Per Student Budget"].map("${:,.2f}".format)
school_spending_df_formatted["Average Math Score"] = school_spending_df_formatted["Average Math Score"].map("{:,.2f}".format)
school_spending_df_formatted["Average Reading Score"] = school_spending_df_formatted["Average Reading Score"].map("{:,.2f}".format)
school_spending_df_formatted["Percent Passing Math"] = school_spending_df_formatted["Percent Passing Math"].map("{:,.2f}%".format)
school_spending_df_formatted["Percent Passing Reading"] = school_spending_df_formatted["Percent Passing Reading"].map("{:,.2f}%".format)
school_spending_df_formatted["Percent Passing Overall"] = school_spending_df_formatted["Percent Passing Overall"].map("{:,.2f}%".format)
school_spending_df_formatted

school_name,School Type,Total Students,Total School Budget,Per Student Budget,Average Math Score,Average Reading Score,Percent Passing Math,Percent Passing Reading,Percent Passing Overall,Spending Ranges (Per Student)
Bailey High School,District,4976,"$3,124,928.00",$628.00,77.05,81.03,66.68%,81.93%,54.64%,$585-630
Cabrera High School,Charter,1858,"$1,081,356.00",$582.00,83.06,83.98,94.13%,97.04%,91.33%,<$585
Figueroa High School,District,2949,"$1,884,411.00",$639.00,76.71,81.16,65.99%,80.74%,53.20%,$630-645
Ford High School,District,2739,"$1,763,916.00",$644.00,77.1,80.75,68.31%,79.30%,54.29%,$630-645
Griffin High School,Charter,1468,"$917,500.00",$625.00,83.35,83.82,93.39%,97.14%,90.60%,$585-630
Hernandez High School,District,4635,"$3,022,020.00",$652.00,77.29,80.93,66.75%,80.86%,53.53%,$645-680
Holden High School,Charter,427,"$248,087.00",$581.00,83.8,83.81,92.51%,96.25%,89.23%,<$585
Huang High School,District,2917,"$1,910,635.00",$655.00,76.63,81.18,65.68%,81.32%,53.51%,$645-680
Johnson High School,District,4761,"$3,094,650.00",$650.00,77.07,80.97,66.06%,81.22%,53.54%,$645-680
Pena High School,Charter,962,"$585,858.00",$609.00,83.84,84.04,94.59%,95.95%,90.54%,$585-630
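
By default `pd.cut` builds right-closed intervals, so a value that falls exactly on a bin edge goes into the lower bin. With the bins above, a per-student budget of exactly $585 would be labeled `<$585`. The sketch below illustrates this with hypothetical budget values and compares it with `right=False`; the bins and labels are the ones used in the notebook.

```python
import pandas as pd

spending_bins = [0, 585, 630, 645, 680]
labels = ["<$585", "$585-630", "$630-645", "$645-680"]

# Hypothetical per-student budgets chosen to sit on or near the bin edges
per_student = pd.Series([584.0, 585.0, 586.0, 630.0, 645.0])

# Default right=True: intervals are (0, 585], (585, 630], ...
default_cut = pd.cut(per_student, bins=spending_bins, labels=labels)

# right=False: intervals are [0, 585), [585, 630), ...
left_closed = pd.cut(per_student, bins=spending_bins, labels=labels, right=False)

print(default_cut.astype(str).tolist())
print(left_closed.astype(str).tolist())
```

None of the budgets shown in the table above sits exactly on an edge, so the choice does not change those rows, but it matters if the bins are reused on other data.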


In [18]:
# Provided from starter code
# Finds the average math and reading scores and the percent passing for each budget range
# Each column is selected before averaging so non-numeric columns such as "School Type" are never averaged
spending_math_scores = school_spending_df.groupby(["Spending Ranges (Per Student)"])["Average Math Score"].mean()
spending_reading_scores = school_spending_df.groupby(["Spending Ranges (Per Student)"])["Average Reading Score"].mean()
spending_passing_math = school_spending_df.groupby(["Spending Ranges (Per Student)"])["Percent Passing Math"].mean()
spending_passing_reading = school_spending_df.groupby(["Spending Ranges (Per Student)"])["Percent Passing Reading"].mean()
overall_passing_spending = school_spending_df.groupby(["Spending Ranges (Per Student)"])["Percent Passing Overall"].mean()

In [20]:
# Combines all school statistics by budget range
spending_summary = pd.DataFrame({
    "Average Math Score" : spending_math_scores,
    "Average Reading Score" : spending_reading_scores,
    "Percent Passing Math" : spending_passing_math,
    "Percent Passing Reading" : spending_passing_reading,
    "Percent Passing Overall" : overall_passing_spending
})

# Formats summary table
spending_summary["Average Math Score"] = spending_summary["Average Math Score"].map("{:,.2f}".format)
spending_summary["Average Reading Score"] = spending_summary["Average Reading Score"].map("{:,.2f}".format)
spending_summary["Percent Passing Math"] = spending_summary["Percent Passing Math"].map("{:,.2f}%".format)
spending_summary["Percent Passing Reading"] = spending_summary["Percent Passing Reading"].map("{:,.2f}%".format)
spending_summary["Percent Passing Overall"] = spending_summary["Percent Passing Overall"].map("{:,.2f}%".format)

spending_summary

Spending Ranges (Per Student),Average Math Score,Average Reading Score,Percent Passing Math,Percent Passing Reading,Percent Passing Overall
<$585,83.46,83.93,93.46%,96.61%,90.37%
$585-630,81.90,83.16,87.13%,92.72%,81.42%
$630-645,78.52,81.62,73.48%,84.39%,62.86%
$645-680,77.00,81.03,66.16%,81.13%,53.53%


# Scores by School Size

In [23]:
# Bins all schools by size: Small (<1000), Medium (1000-2000), Large (2000-5000)
size_bins = [0, 1000, 2000, 5000]
labels = ["Small (<1000)", "Medium (1000-2000)", "Large (2000-5000)"]

per_school_summary["School Size"] = pd.cut(x = per_school_summary["Total Students"], bins = size_bins, labels = labels)

# Formats summary table
# A separate formatted data frame is needed since the numeric per-school data will be used later on
per_school_summary_format = per_school_summary.copy()
per_school_summary_format["Total Students"] = per_school_summary_format["Total Students"].map("{:.0f}".format)
per_school_summary_format["Total School Budget"] = per_school_summary_format["Total School Budget"].map("${:,.2f}".format)
per_school_summary_format["Per Student Budget"] = per_school_summary_format["Per Student Budget"].map("${:,.2f}".format)
per_school_summary_format["Average Math Score"] = per_school_summary_format["Average Math Score"].map("{:,.2f}".format)
per_school_summary_format["Average Reading Score"] = per_school_summary_format["Average Reading Score"].map("{:,.2f}".format)
per_school_summary_format["Percent Passing Math"] = per_school_summary_format["Percent Passing Math"].map("{:,.2f}%".format)
per_school_summary_format["Percent Passing Reading"] = per_school_summary_format["Percent Passing Reading"].map("{:,.2f}%".format)
per_school_summary_format["Percent Passing Overall"] = per_school_summary_format["Percent Passing Overall"].map("{:,.2f}%".format)

per_school_summary_format

school_name,School Type,Total Students,Total School Budget,Per Student Budget,Average Math Score,Average Reading Score,Percent Passing Math,Percent Passing Reading,Percent Passing Overall,School Size
Bailey High School,District,4976,"$3,124,928.00",$628.00,77.05,81.03,66.68%,81.93%,54.64%,Large (2000-5000)
Cabrera High School,Charter,1858,"$1,081,356.00",$582.00,83.06,83.98,94.13%,97.04%,91.33%,Medium (1000-2000)
Figueroa High School,District,2949,"$1,884,411.00",$639.00,76.71,81.16,65.99%,80.74%,53.20%,Large (2000-5000)
Ford High School,District,2739,"$1,763,916.00",$644.00,77.1,80.75,68.31%,79.30%,54.29%,Large (2000-5000)
Griffin High School,Charter,1468,"$917,500.00",$625.00,83.35,83.82,93.39%,97.14%,90.60%,Medium (1000-2000)
Hernandez High School,District,4635,"$3,022,020.00",$652.00,77.29,80.93,66.75%,80.86%,53.53%,Large (2000-5000)
Holden High School,Charter,427,"$248,087.00",$581.00,83.8,83.81,92.51%,96.25%,89.23%,Small (<1000)
Huang High School,District,2917,"$1,910,635.00",$655.00,76.63,81.18,65.68%,81.32%,53.51%,Large (2000-5000)
Johnson High School,District,4761,"$3,094,650.00",$650.00,77.07,80.97,66.06%,81.22%,53.54%,Large (2000-5000)
Pena High School,Charter,962,"$585,858.00",$609.00,83.84,84.04,94.59%,95.95%,90.54%,Small (<1000)


In [24]:
# From Starter code
# Finds the average math and reading scores and the percent passing for each size range
# Each column is selected before averaging so non-numeric columns such as "School Type" are never averaged
size_math_scores = per_school_summary.groupby(["School Size"])["Average Math Score"].mean()
size_reading_scores = per_school_summary.groupby(["School Size"])["Average Reading Score"].mean()
size_passing_math = per_school_summary.groupby(["School Size"])["Percent Passing Math"].mean()
size_passing_reading = per_school_summary.groupby(["School Size"])["Percent Passing Reading"].mean()
size_overall_passing = per_school_summary.groupby(["School Size"])["Percent Passing Overall"].mean()

In [27]:
# Combines all school statistics by size range
size_summary = pd.DataFrame({
    "Average Math Score" : size_math_scores,
    "Average Reading Score" : size_reading_scores,
    "Percent Passing Math" : size_passing_math,
    "Percent Passing Reading" : size_passing_reading,
    "Percent Passing Overall" : size_overall_passing
})

# Formats summary table
size_summary["Average Math Score"] = size_summary["Average Math Score"].map("{:,.2f}".format)
size_summary["Average Reading Score"] = size_summary["Average Reading Score"].map("{:,.2f}".format)
size_summary["Percent Passing Math"] = size_summary["Percent Passing Math"].map("{:,.2f}%".format)
size_summary["Percent Passing Reading"] = size_summary["Percent Passing Reading"].map("{:,.2f}%".format)
size_summary["Percent Passing Overall"] = size_summary["Percent Passing Overall"].map("{:,.2f}%".format)

size_summary

School Size,Average Math Score,Average Reading Score,Percent Passing Math,Percent Passing Reading,Percent Passing Overall
Small (<1000),83.82,83.93,93.55%,96.10%,89.88%
Medium (1000-2000),83.37,83.86,93.60%,96.79%,90.62%
Large (2000-5000),77.75,81.34,69.96%,82.77%,58.29%
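
The group figures in these summary tables are averages of school-level percentages, so every school counts equally regardless of enrollment. A student-weighted rate can differ noticeably when schools in a group vary in size. The sketch below contrasts the two on hypothetical numbers; the column names match the per-school summary, but the values are made up.

```python
import pandas as pd

# Hypothetical schools: two large ones with very different pass rates, one small one
toy_schools = pd.DataFrame({
    "School Size": ["Large", "Large", "Small"],
    "Total Students": [4000, 2000, 500],
    "Percent Passing Overall": [50.0, 80.0, 90.0],
})

# Unweighted: the mean of school-level percentages (the approach used in this notebook)
unweighted = toy_schools.groupby("School Size")["Percent Passing Overall"].mean()

# Weighted: passing students divided by all students in each group
toy_schools["Students Passing"] = (
    toy_schools["Total Students"] * toy_schools["Percent Passing Overall"] / 100
)
totals = toy_schools.groupby("School Size")[["Students Passing", "Total Students"]].sum()
weighted = totals["Students Passing"] / totals["Total Students"] * 100

print(unweighted)
print(weighted)
```

In this toy data the large group scores 65% unweighted and 60% weighted, because the bigger school has the lower pass rate.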


# Scores by School Type

In [28]:
# Groups the per-school summary by school type and averages each numeric column once
type_averages = per_school_summary.groupby(["School Type"]).mean(numeric_only=True)

# Finds the average math and reading scores and the percent of students who passed
average_math_score_by_type = type_averages["Average Math Score"]
average_reading_score_by_type = type_averages["Average Reading Score"]
average_percent_passing_math_by_type = type_averages["Percent Passing Math"]
average_percent_passing_reading_by_type = type_averages["Percent Passing Reading"]
average_percent_overall_passing_by_type = type_averages["Percent Passing Overall"]

In [30]:
# Combines school data by school type and summarizes their statistics
type_summary = pd.DataFrame({
    "Average Math Score" : average_math_score_by_type,
    "Average Reading Score" : average_reading_score_by_type,
    "Percent Passing Math" : average_percent_passing_math_by_type,
    "Percent Passing Reading" : average_percent_passing_reading_by_type,
    "Percent Passing Overall" : average_percent_overall_passing_by_type
})

# Formats the summary table
type_summary["Average Math Score"] = type_summary["Average Math Score"].map("{:,.2f}".format)
type_summary["Average Reading Score"] = type_summary["Average Reading Score"].map("{:,.2f}".format)
type_summary["Percent Passing Math"] = type_summary["Percent Passing Math"].map("{:,.2f}%".format)
type_summary["Percent Passing Reading"] = type_summary["Percent Passing Reading"].map("{:,.2f}%".format)
type_summary["Percent Passing Overall"] = type_summary["Percent Passing Overall"].map("{:,.2f}%".format)

type_summary

School Type,Average Math Score,Average Reading Score,Percent Passing Math,Percent Passing Reading,Percent Passing Overall
Charter,83.47,83.90,93.62%,96.59%,90.43%
District,76.96,80.97,66.55%,80.80%,53.67%
