# City Schools Report

District Summary Analysis:

    The total number of schools is 15 with a total student population of 39,170.
    The total budget is $24,649,428.
    
    Student Success

    1) Math Scores
        The average math score for the student population is 78.99%.
        There are 29,370 students passing math or 74.98% of the student population.

    2) Reading Scores
        The average reading score for the student population is 81.88%. 
        There are 33,610 students passing reading or 85.81% of the student population. 
        The overall passing rate of students in both math and reading is 65.17%.

    Analysis: Students are more successful in reading than math as evidenced by their passing rate and average scores for the subjects.

School Summary, Highest and Lowest Performance

    1) Size
        The largest school in the district is Bailey High School with a census of 4976 students.
        The smallest school in the district is Holden High School with a census of 427 students. 

    2) Spending per Student
        The budget per capita is highest for Huang High School at $655 per student.
        Wilson High School has the lowest budget per capita at $578 per student.

    3) Scores
        Highest Scores and Passing Rates: 
            Cabrera High School has the highest overall percent of students passing at 91.33%.
            Pena High School has the highest math score average at 83.84%. 
            Pena High School also has the highest reading score average at 84.04%.
        Lowest Scores and Passing Rates: 
            Rodriguez High School has the overall lowest passing rate at 52.99%.
            Huang High School has the lowest math score average at 76.63%.
            Rodriguez High School has the lowest reading score average at 80.74%.
        Scores per grade
            The average math scores are consistently lower than reading scores for each grade: 9th, 10th, 11th, and 12th. 

Scores by Spending Ranges (per student) 

    1) The schools with the lowest spending range per student ($0-$584) had the highest Overall Passing Rate at 90.37% of students passing math and reading. 
    2) The schools with the highest spending range per student ($645-$680) had the lowest Overall Passing Rate at 53.53%.

Scores by Size Ranges 

    1) Medium sized schools with 1000-2000 students had the highest overall passing rate at 90.62% of students passing. 
    2) Small schools with under 1000 students had an overall passing rate of 89.88%, slightly lower than medium schools.
    3) Large schools with 2000 to 5000 students had the lowest overall passing rate at 58.29%. That is 31.60 percentage points below small schools and 32.34 percentage points below
        medium sized schools.
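
The gaps quoted in item 3 are differences between two percentages, so they read most naturally as percentage points. A minimal sketch of that arithmetic, assuming the unrounded overall passing rates from the size summary table later in the notebook:

```python
# Overall passing rates by school size, copied from the size summary output (in %)
small, medium, large = 89.883853, 90.621535, 58.286003

# A difference between two percentages is expressed in percentage points
gap_small_vs_large = small - large
gap_medium_vs_large = medium - large

print(f"Small vs. large:  {gap_small_vs_large:.2f} percentage points")
print(f"Medium vs. large: {gap_medium_vs_large:.2f} percentage points")
```

Working from the unrounded values avoids the small drift that appears when already-rounded figures are subtracted.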

Scores by School Type: Charter Schools have higher scores and higher passing rates than District Schools.  The overall passing rate for Charter Schools is 90.43% compared to District Schools at 53.67%. 
    

Overall Analysis: 

1) Higher spending per student within this group of schools does not appear to improve the overall passing rate. As spending per student increased, the overall passing rate decreased.

2) School size may be a better predictor of the percentage of students passing, but only when comparing small and medium schools to large schools. There was little difference between the passing rates of small and medium schools. However, the largest schools, with over 2000 students, had far lower passing rates than both. Additional analysis could review the teacher-to-student ratios of schools within the different size ranges.
  
3) Average reading scores are higher than math scores for all schools and there are more students passing reading than math at all schools.
   
4) School type also appears to correlate with the overall passing rate. Charter Schools have an overall passing rate 36.76 percentage points higher than District Schools. The largest difference was in students passing math: 93.62% for Charter Schools and only 66.55% for District Schools. Further analysis of what contributes to the higher math scores might help identify how to improve math scores for all schools.




In [1]:
# Import 
import pandas as pd
from pathlib import Path

In [2]:

# File Path
school_data_to_load = Path("Resources/schools_complete.csv")
student_data_to_load = Path("Resources/students_complete.csv")

# Read School and Student Data File and store into Pandas DataFrames
school_data = pd.read_csv(school_data_to_load)
student_data = pd.read_csv(student_data_to_load)

# Combine the data from the files into a single dataset
school_data_complete = pd.merge(student_data, school_data, how="left", on="school_name")

# Make an output for the new dataset
output_file_path= 'school_data_complete.csv'
school_data_complete.to_csv(output_file_path, index=False)
school_data_complete=pd.read_csv(output_file_path)


school_data_complete.head()

,Student ID,student_name,gender,grade,school_name,reading_score,math_score,School ID,type,size,budget
0,0,Paul Bradley,M,9th,Huang High School,66,79,0,District,2917,1910635
1,1,Victor Smith,M,12th,Huang High School,94,61,0,District,2917,1910635
2,2,Kevin Rodriguez,M,12th,Huang High School,90,60,0,District,2917,1910635
3,3,Dr. Richard Scott,M,12th,Huang High School,67,58,0,District,2917,1910635
4,4,Bonnie Ray,F,9th,Huang High School,97,84,0,District,2917,1910635
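
A left merge like the one above quietly duplicates student rows if a school name appears more than once in the school table, and leaves school columns empty if a name fails to match. The sketch below shows one way to guard against both, using the `validate` and `indicator` options of `pd.merge`. The two small frames are made-up stand-ins, not the real data.

```python
import pandas as pd

# Tiny stand-in frames; the column names mirror the notebook, the rows are made up
students = pd.DataFrame({
    "Student ID": [0, 1, 2],
    "school_name": ["Huang High School", "Huang High School", "Pena High School"],
    "math_score": [79, 61, 88],
})
schools = pd.DataFrame({
    "school_name": ["Huang High School", "Pena High School"],
    "type": ["District", "Charter"],
})

# validate="many_to_one" raises MergeError if a school name repeats in `schools`;
# indicator=True adds a `_merge` column showing where each row came from
merged = pd.merge(students, schools, how="left", on="school_name",
                  validate="many_to_one", indicator=True)

# Every student should have matched a school, and no rows should be added or lost
unmatched = merged[merged["_merge"] != "both"]
print(len(merged) == len(students), len(unmatched))
```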


## School District Summary

In [3]:
# Calculate the total number of unique schools

school_unique_count=school_data['school_name'].nunique()
print("Number of unique schools:",school_unique_count)

Number of unique schools: 15


In [4]:
# Calculate the total number of students

student_unique_count =student_data['Student ID'].nunique()
print("Number of unique students:",student_unique_count)

Number of unique students: 39170


In [11]:
# Calculate the total budget

total_budget = school_data['budget'].sum()
print("Total Budget:",total_budget)

Total Budget: 24649428


In [12]:
# Calculate the average (mean) math score
average_math_score =student_data["math_score"].mean()
print("Average Math Score:", average_math_score)


Average Math Score: 78.98537145774827


In [13]:
# Calculate the average (mean) reading score
average_reading_score =student_data["reading_score"].mean()
print("Average Reading Score:", average_reading_score)

Average Reading Score: 81.87784018381414


In [14]:
# Use the following to calculate the percentage of students who passed math (math scores greater than or equal to 70)
passing_math_count = school_data_complete[(school_data_complete["math_score"] >= 70)]["Student ID"].count()
passing_math_percentage = passing_math_count / float(student_unique_count) * 100

print("Number of Students Passing Math:", passing_math_count)
print("Percentage of Students Passing Math:", passing_math_percentage)


Number of Students Passing Math: 29370
Percentage of Students Passing Math: 74.9808526933878


In [15]:
# Calculate the percentage of students who passed reading (hint: look at how the math percentage was calculated)
passing_reading_count =school_data_complete[(school_data_complete["reading_score"] >= 70)]["Student ID"].count()

passing_reading_percentage =passing_reading_count / float(student_unique_count) * 100
print("Number of Students Passing Reading:", passing_reading_count)
print("Percentage of Students Passing Reading:", passing_reading_percentage)


Number of Students Passing Reading: 33610
Percentage of Students Passing Reading: 85.80546336482001


In [16]:
# Use the following to calculate the percentage of students that passed math and reading
passing_math_reading_count = school_data_complete[
    (school_data_complete["math_score"] >= 70) & (school_data_complete["reading_score"] >= 70)]["Student ID"].count()
overall_passing_rate = passing_math_reading_count /  float(student_unique_count) * 100
print("Overall Passing Rate:",overall_passing_rate)


Overall Passing Rate: 65.17232575950983
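
The three passing percentages above all follow the same pattern: filter, count, divide by the total. An equivalent shortcut takes the mean of a boolean mask, since `True` counts as 1 and `False` as 0. This is a sketch on made-up scores, not the notebook's data.

```python
import pandas as pd

# Made-up scores for illustration; 70 is the passing threshold used in the notebook
scores = pd.DataFrame({
    "math_score": [79, 61, 88, 70],
    "reading_score": [66, 94, 90, 97],
})

passed_math = scores["math_score"] >= 70
passed_reading = scores["reading_score"] >= 70

# The mean of a boolean Series is the fraction of True values
pct_math = passed_math.mean() * 100
pct_reading = passed_reading.mean() * 100
pct_both = (passed_math & passed_reading).mean() * 100

print(pct_math, pct_reading, pct_both)
```

This form divides by the number of rows in the frame, so it matches the notebook's calculation only when every student appears exactly once.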


In [17]:
# Create a high-level snapshot of the district's key metrics in a DataFrame
district_summary = pd.DataFrame(
    data=[[school_unique_count, student_unique_count, total_budget, average_math_score,
           average_reading_score, passing_math_percentage, passing_reading_percentage,
           overall_passing_rate]],
    columns=["Total Schools", "Total Students", "Total Budget", "Average Math Score",
             "Average Reading Score", "Passing Math Percentage", "Passing Reading Percentage",
             "Overall Passing Percentage"],
)

# Formatting
district_summary["Total Students"] = district_summary["Total Students"].map("{:,}".format)
district_summary["Total Budget"] = district_summary["Total Budget"].map("${:,.2f}".format)

# Display the DataFrame
district_summary

,Total Schools,Total Students,Total Budget,Average Math Score,Average Reading Score,Passing Math Percentage,Passing Reading Percentage,Overall Passing Percentage
0,15,39170,"$24,649,428.00",78.985371,81.87784,74.980853,85.805463,65.172326
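
Formatting with `.map("...".format)` turns a numeric column into strings, so it is safest as the last step before display. The sketch below uses made-up budget values to show how sorting changes once a column has been formatted.

```python
import pandas as pd

budgets = pd.Series([628.0, 1000.0, 95.5])

# map with a format string returns strings, not numbers
formatted = budgets.map("${:,.2f}".format)

# Numeric sort orders by value; string sort orders character by character
numeric_order = budgets.sort_values().tolist()
string_order = formatted.sort_values().tolist()

print(numeric_order)
print(string_order)
```

Keeping an unformatted copy for calculations and a formatted copy for display avoids this pitfall.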


## School Summary

In [18]:
# Use the code provided to select the type per school from school_data
school_types = school_data.set_index(["school_name"])
school_types= school_types["type"]

In [19]:
# Calculate the total student count per school from school_data
per_school_counts =school_data.set_index(["school_name"])["size"]


In [20]:
# Calculate the total school budget and per capita spending per school from school_data
per_school_budget =school_data.set_index(["school_name"])["budget"]
per_school_capita =per_school_budget/per_school_counts


In [21]:
# Calculate the average test scores per school from school_data_complete
per_school_math_avg = school_data_complete.groupby("school_name")["math_score"].mean()

per_school_reading_avg = school_data_complete.groupby("school_name")["reading_score"].mean()


In [22]:
# Calculate the number of students per school with math scores of 70 or higher from school_data_complete
per_school_passing_math =school_data_complete[school_data_complete['math_score']>=70].groupby('school_name')['Student ID'].count()



In [23]:
# Calculate the number of students per school with reading scores of 70 or higher from school_data_complete
per_school_passing_reading =school_data_complete[school_data_complete['reading_score']>=70].groupby('school_name')['Student ID'].count()



In [24]:
# Use the provided code to calculate the number of students per school that passed both math and reading with scores of 70 or higher
students_passing_math_and_reading = school_data_complete[
    (school_data_complete["reading_score"] >= 70) & (school_data_complete["math_score"] >= 70)]

school_students_passing_math_and_reading = students_passing_math_and_reading.groupby(["school_name"]).size()




In [25]:
# Use the provided code to calculate the passing rates
per_school_passing_math_rate = (per_school_passing_math / per_school_counts) * 100
per_school_passing_reading_rate = (per_school_passing_reading / per_school_counts) * 100
overall_passing_rate = (school_students_passing_math_and_reading / per_school_counts) * 100


In [26]:
# Create a DataFrame called `per_school_summary` with columns for the calculations above.
per_school_summary =pd.DataFrame({
    'School Type':school_types,
    'Student Census':per_school_counts,
    'Total School Budget':per_school_budget,
    'Per Student Budget':per_school_capita,
    'Average Math Score':per_school_math_avg,
    'Average Reading Score':per_school_reading_avg,
    '% of Students Passing Math':per_school_passing_math_rate,
    '% of Students Passing Reading':per_school_passing_reading_rate,
    '% Overall Passing':overall_passing_rate,
    
})

per_school_summary=per_school_summary.rename_axis("School Name")

# Formatting
per_school_summary_formatted=per_school_summary.copy()
per_school_summary_formatted["Per Student Budget"]=per_school_summary["Per Student Budget"].map("${:,.2f}".format)
per_school_summary_formatted["Total School Budget"] = per_school_summary["Total School Budget"].map("${:,.2f}".format)

# Display the formatted DataFrame
per_school_summary_formatted

School Name,School Type,Student Census,Total School Budget,Per Student Budget,Average Math Score,Average Reading Score,% of Students Passing Math,% of Students Passing Reading,% Overall Passing
Bailey High School,District,4976,"$3,124,928.00",$628.00,77.048432,81.033963,66.680064,81.93328,54.642283
Cabrera High School,Charter,1858,"$1,081,356.00",$582.00,83.061895,83.97578,94.133477,97.039828,91.334769
Figueroa High School,District,2949,"$1,884,411.00",$639.00,76.711767,81.15802,65.988471,80.739234,53.204476
Ford High School,District,2739,"$1,763,916.00",$644.00,77.102592,80.746258,68.309602,79.299014,54.289887
Griffin High School,Charter,1468,"$917,500.00",$625.00,83.351499,83.816757,93.392371,97.138965,90.599455
Hernandez High School,District,4635,"$3,022,020.00",$652.00,77.289752,80.934412,66.752967,80.862999,53.527508
Holden High School,Charter,427,"$248,087.00",$581.00,83.803279,83.814988,92.505855,96.252927,89.227166
Huang High School,District,2917,"$1,910,635.00",$655.00,76.629414,81.182722,65.683922,81.316421,53.513884
Johnson High School,District,4761,"$3,094,650.00",$650.00,77.072464,80.966394,66.057551,81.222432,53.539172
Pena High School,Charter,962,"$585,858.00",$609.00,83.839917,84.044699,94.594595,95.945946,90.540541
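
The per-school averages and passing rates above are built from several separate `groupby` calls. As an alternative sketch, the same kind of table can be produced in one pass with named aggregation. The rows and the short column names (`pct_math`, `pct_both`, and so on) are illustrative assumptions, and the student count here comes from the number of rows per school rather than from the `size` column of `school_data`.

```python
import pandas as pd

# Made-up rows; column names follow school_data_complete
df = pd.DataFrame({
    "school_name": ["A", "A", "A", "B", "B"],
    "math_score": [80, 60, 75, 90, 65],
    "reading_score": [72, 68, 90, 85, 95],
})

# Flag passing students first so the flags can be averaged per school
flags = df.assign(
    pass_math=df["math_score"] >= 70,
    pass_reading=df["reading_score"] >= 70,
)
flags["pass_both"] = flags["pass_math"] & flags["pass_reading"]

summary = flags.groupby("school_name").agg(
    students=("math_score", "size"),
    avg_math=("math_score", "mean"),
    avg_reading=("reading_score", "mean"),
    pct_math=("pass_math", "mean"),
    pct_reading=("pass_reading", "mean"),
    pct_both=("pass_both", "mean"),
)

# Convert the passing fractions to percentages
pct_cols = ["pct_math", "pct_reading", "pct_both"]
summary[pct_cols] = summary[pct_cols] * 100
print(summary)
```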


## Highest Performing Schools 
#### (% of Students with Overall Passing Scores)

In [27]:
# Sort the schools by `% Overall Passing` in descending order and display the top 5 rows.
top_schools =per_school_summary_formatted.sort_values(["% Overall Passing"], ascending=False) 
top_schools.head(5)

School Name,School Type,Student Census,Total School Budget,Per Student Budget,Average Math Score,Average Reading Score,% of Students Passing Math,% of Students Passing Reading,% Overall Passing
Cabrera High School,Charter,1858,"$1,081,356.00",$582.00,83.061895,83.97578,94.133477,97.039828,91.334769
Thomas High School,Charter,1635,"$1,043,130.00",$638.00,83.418349,83.84893,93.272171,97.308869,90.948012
Griffin High School,Charter,1468,"$917,500.00",$625.00,83.351499,83.816757,93.392371,97.138965,90.599455
Wilson High School,Charter,2283,"$1,319,574.00",$578.00,83.274201,83.989488,93.867718,96.539641,90.582567
Pena High School,Charter,962,"$585,858.00",$609.00,83.839917,84.044699,94.594595,95.945946,90.540541


## Lowest Performing Schools 
#### (% of Students with Overall Passing Scores)

In [28]:
# Sort the schools by `% Overall Passing` in ascending order and display the top 5 rows.
bottom_schools =per_school_summary_formatted.sort_values(["% Overall Passing"], ascending=True) 
bottom_schools.head(5)

School Name,School Type,Student Census,Total School Budget,Per Student Budget,Average Math Score,Average Reading Score,% of Students Passing Math,% of Students Passing Reading,% Overall Passing
Rodriguez High School,District,3999,"$2,547,363.00",$637.00,76.842711,80.744686,66.366592,80.220055,52.988247
Figueroa High School,District,2949,"$1,884,411.00",$639.00,76.711767,81.15802,65.988471,80.739234,53.204476
Huang High School,District,2917,"$1,910,635.00",$655.00,76.629414,81.182722,65.683922,81.316421,53.513884
Hernandez High School,District,4635,"$3,022,020.00",$652.00,77.289752,80.934412,66.752967,80.862999,53.527508
Johnson High School,District,4761,"$3,094,650.00",$650.00,77.072464,80.966394,66.057551,81.222432,53.539172
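
Sorting the whole table and taking `head(5)` works. For a top-N or bottom-N selection, `nlargest` and `nsmallest` express the same intent more directly. The sketch below uses four rows copied from the outputs above.

```python
import pandas as pd

# A few rows copied from the per-school summary outputs above
rates = pd.DataFrame(
    {"% Overall Passing": [91.334769, 54.642283, 52.988247, 90.540541]},
    index=["Cabrera High School", "Bailey High School",
           "Rodriguez High School", "Pena High School"],
)

top_two = rates.nlargest(2, "% Overall Passing")
bottom_two = rates.nsmallest(2, "% Overall Passing")

print(top_two)
print(bottom_two)
```

Both methods need a numeric column, so they should be applied before any string formatting.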


## Math Scores by Grade

In [29]:
# Use the code provided to separate the data by grade
ninth_graders = school_data_complete[(school_data_complete["grade"] == "9th")]
tenth_graders = school_data_complete[(school_data_complete["grade"] == "10th")]
eleventh_graders = school_data_complete[(school_data_complete["grade"] == "11th")]
twelfth_graders = school_data_complete[(school_data_complete["grade"] == "12th")]

# Group by `school_name` and take the mean of the `math_score` column for each.
ninth_grade_math_scores =ninth_graders.groupby("school_name")["math_score"].mean()
tenth_grader_math_scores =tenth_graders.groupby("school_name")["math_score"].mean()
eleventh_grader_math_scores =eleventh_graders.groupby("school_name")["math_score"].mean()
twelfth_grader_math_scores =twelfth_graders.groupby("school_name")["math_score"].mean()

# Combine each of the scores above into single DataFrame called `math_scores_by_grade`
math_scores_by_grade =pd.DataFrame({
    "9th":ninth_grade_math_scores,
    "10th":tenth_grader_math_scores,
    "11th":eleventh_grader_math_scores,
    "12th":twelfth_grader_math_scores
})
# Minor data wrangling
math_scores_by_grade.index.name = None

# Display the DataFrame
math_scores_by_grade

,9th,10th,11th,12th
Bailey High School,77.083676,76.996772,77.515588,76.492218
Cabrera High School,83.094697,83.154506,82.76556,83.277487
Figueroa High School,76.403037,76.539974,76.884344,77.151369
Ford High School,77.361345,77.672316,76.918058,76.179963
Griffin High School,82.04401,84.229064,83.842105,83.356164
Hernandez High School,77.438495,77.337408,77.136029,77.186567
Holden High School,83.787402,83.429825,85.0,82.855422
Huang High School,77.027251,75.908735,76.446602,77.225641
Johnson High School,77.187857,76.691117,77.491653,76.863248
Pena High School,83.625455,83.372,84.328125,84.121547
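
The four per-grade filters and groupbys above can also be written as a single pivot. This is a sketch on made-up rows. The `reindex` step is needed because grade labels sort alphabetically by default, which would put "10th" before "9th".

```python
import pandas as pd

# Made-up rows; column names follow school_data_complete
df = pd.DataFrame({
    "school_name": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "grade": ["9th", "9th", "10th", "11th", "9th", "10th", "12th", "12th"],
    "math_score": [80, 70, 65, 90, 60, 85, 75, 95],
})

grade_order = ["9th", "10th", "11th", "12th"]

# One pivot replaces the four per-grade filters and groupbys
math_by_grade = df.pivot_table(
    index="school_name", columns="grade", values="math_score", aggfunc="mean"
).reindex(columns=grade_order)  # restore 9th-to-12th column order

print(math_by_grade)
```

Swapping `values="math_score"` for `values="reading_score"` would give the reading table in the same way.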


## Reading Scores by Grade

In [30]:
# Use the code provided to separate the data by grade
ninth_graders = school_data_complete[(school_data_complete["grade"] == "9th")]
tenth_graders = school_data_complete[(school_data_complete["grade"] == "10th")]
eleventh_graders = school_data_complete[(school_data_complete["grade"] == "11th")]
twelfth_graders = school_data_complete[(school_data_complete["grade"] == "12th")]

# Group by `school_name` and take the mean of the `reading_score` column for each.
ninth_grade_reading_scores =ninth_graders.groupby("school_name")["reading_score"].mean()
tenth_grader_reading_scores =tenth_graders.groupby("school_name")["reading_score"].mean()
eleventh_grader_reading_scores =eleventh_graders.groupby("school_name")["reading_score"].mean()
twelfth_grader_reading_scores =twelfth_graders.groupby("school_name")["reading_score"].mean()

# Combine each of the scores above into single DataFrame called `reading_scores_by_grade`
reading_scores_by_grade =pd.DataFrame({
    "9th":ninth_grade_reading_scores,
    "10th":tenth_grader_reading_scores,
    "11th":eleventh_grader_reading_scores,
    "12th":twelfth_grader_reading_scores
})
    
# Minor data wrangling
reading_scores_by_grade.index.name = None

# Display the DataFrame
reading_scores_by_grade


,9th,10th,11th,12th
Bailey High School,81.303155,80.907183,80.945643,80.912451
Cabrera High School,83.676136,84.253219,83.788382,84.287958
Figueroa High School,81.198598,81.408912,80.640339,81.384863
Ford High School,80.632653,81.262712,80.403642,80.662338
Griffin High School,83.369193,83.706897,84.288089,84.013699
Hernandez High School,80.86686,80.660147,81.39614,80.857143
Holden High School,83.677165,83.324561,83.815534,84.698795
Huang High School,81.290284,81.512386,81.417476,80.305983
Johnson High School,81.260714,80.773431,80.616027,81.227564
Pena High School,83.807273,83.612,84.335938,84.59116


In [31]:
reading_scores_by_grade.mean()

9th     82.513318
10th    82.505439
11th    82.559485
12th    82.554817
dtype: float64

## Scores by School Spending

In [32]:
# Establish the bins
spending_bins = [0, 585, 630, 645, 680]
labels = ["0 to $584", "$585 to 629", "$630 to 644", "$645 to 680"]


In [33]:
# Create a copy of the school summary since it has the "Per Student Budget"
school_spending_df = per_school_summary.copy()


In [34]:
# Use `pd.cut` to categorize spending based on the bins.
school_spending_df["Spending Ranges(Per Student)"]=pd.cut(school_spending_df["Per Student Budget"], bins=spending_bins, labels=labels, include_lowest=True)

school_spending_df_formatted=school_spending_df.copy()

school_spending_df_formatted["Per Student Budget"]=school_spending_df["Per Student Budget"].map("${:,.2f}".format)
school_spending_df_formatted["Total School Budget"]=school_spending_df["Total School Budget"].map("${:,.2f}".format)

school_spending_df_formatted

School Name,School Type,Student Census,Total School Budget,Per Student Budget,Average Math Score,Average Reading Score,% of Students Passing Math,% of Students Passing Reading,% Overall Passing,Spending Ranges(Per Student)
Bailey High School,District,4976,"$3,124,928.00",$628.00,77.048432,81.033963,66.680064,81.93328,54.642283,$585 to 629
Cabrera High School,Charter,1858,"$1,081,356.00",$582.00,83.061895,83.97578,94.133477,97.039828,91.334769,0 to $584
Figueroa High School,District,2949,"$1,884,411.00",$639.00,76.711767,81.15802,65.988471,80.739234,53.204476,$630 to 644
Ford High School,District,2739,"$1,763,916.00",$644.00,77.102592,80.746258,68.309602,79.299014,54.289887,$630 to 644
Griffin High School,Charter,1468,"$917,500.00",$625.00,83.351499,83.816757,93.392371,97.138965,90.599455,$585 to 629
Hernandez High School,District,4635,"$3,022,020.00",$652.00,77.289752,80.934412,66.752967,80.862999,53.527508,$645 to 680
Holden High School,Charter,427,"$248,087.00",$581.00,83.803279,83.814988,92.505855,96.252927,89.227166,0 to $584
Huang High School,District,2917,"$1,910,635.00",$655.00,76.629414,81.182722,65.683922,81.316421,53.513884,$645 to 680
Johnson High School,District,4761,"$3,094,650.00",$650.00,77.072464,80.966394,66.057551,81.222432,53.539172,$645 to 680
Pena High School,Charter,962,"$585,858.00",$609.00,83.839917,84.044699,94.594595,95.945946,90.540541,$585 to 629
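
`pd.cut` closes each interval on the right by default, so a value of exactly 585 falls in the bin labelled "0 to $584". None of the per-student budgets shown in the table above sits on a bin edge, so the displayed assignments are unaffected. The sketch below uses made-up boundary values to show how the default compares with `right=False`, which matches the labels more literally.

```python
import pandas as pd

spending_bins = [0, 585, 630, 645, 680]
labels = ["0 to $584", "$585 to 629", "$630 to 644", "$645 to 680"]

# Boundary values chosen to show which side of each edge is inclusive
per_student = pd.Series([584, 585, 630, 645])

# Default right=True with include_lowest=True:
# intervals are [0, 585], (585, 630], (630, 645], (645, 680]
right_closed = pd.cut(per_student, bins=spending_bins, labels=labels,
                      include_lowest=True)

# right=False: intervals are [0, 585), [585, 630), [630, 645), [645, 680)
left_closed = pd.cut(per_student, bins=spending_bins, labels=labels, right=False)

print(pd.DataFrame({"value": per_student,
                    "right_closed": right_closed,
                    "left_closed": left_closed}))
```

With `right=False` a value of exactly 680 would fall outside the last bin, so the top edge would need to be raised if such a value could occur.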


In [35]:
#  Calculate averages for the desired columns.
spending_math_scores = school_spending_df.groupby(["Spending Ranges(Per Student)"])["Average Math Score"].mean()
spending_reading_scores = school_spending_df.groupby(["Spending Ranges(Per Student)"])["Average Reading Score"].mean()
spending_passing_math = school_spending_df.groupby(["Spending Ranges(Per Student)"])["% of Students Passing Math"].mean()
spending_passing_reading = school_spending_df.groupby(["Spending Ranges(Per Student)"])["% of Students Passing Reading"].mean()
overall_passing_spending = school_spending_df.groupby(["Spending Ranges(Per Student)"])["% Overall Passing"].mean()


In [36]:
# Assemble into DataFrame
spending_summary = pd.DataFrame({
    "Average Math Score":spending_math_scores,
    "Average Reading Score":spending_reading_scores,
    "% Passing Math":spending_passing_math,
    "% Passing Reading":spending_passing_reading,
    "% Overall Passing":overall_passing_spending
})

# Display results
spending_summary

Spending Ranges(Per Student),Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing
0 to $584,83.455399,83.933814,93.460096,96.610877,90.369459
$585 to 629,81.899826,83.155286,87.133538,92.718205,81.418596
$630 to 644,78.518855,81.624473,73.484209,84.391793,62.857656
$645 to 680,76.99721,81.027843,66.164813,81.133951,53.526855
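
The spending summary is assembled from five groupbys on the same categorical column. As a sketch, one groupby over a list of columns gives the same shape of result. Passing `observed` explicitly is a choice made here, not something the notebook does: it controls whether empty categories are kept, and its default has been changing across pandas versions. The rows and category names are made up.

```python
import pandas as pd

# Made-up school-level rows with a categorical bin column, as produced by pd.cut
schools = pd.DataFrame({
    "Average Math Score": [83.0, 84.0, 77.0],
    "% Overall Passing": [90.0, 92.0, 54.0],
    "Spending Ranges(Per Student)": pd.Categorical(
        ["low", "low", "high"], categories=["low", "mid", "high"], ordered=True
    ),
})

metric_cols = ["Average Math Score", "% Overall Passing"]

# observed=False keeps the empty "mid" category as a row of NaN;
# observed=True would drop it from the result
summary = (
    schools.groupby("Spending Ranges(Per Student)", observed=False)[metric_cols]
    .mean()
)

print(summary)
```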


## Scores by School Size

In [37]:
# Establish the bins.
size_bins = [0, 1000, 2000, 5000]
labels = ["Small (<1000)", "Medium (1000-2000)", "Large (2000-5000)"]


In [38]:
# Categorize the schools by size based on the bins
# Use `pd.cut` on the "Student Census" column of the `per_school_summary` DataFrame.

per_school_summary["School Size"]=pd.cut(per_school_summary["Student Census"], bins=size_bins, labels=labels)

#per_school_size_summary
per_school_summary_formatted

School Name,School Type,Student Census,Total School Budget,Per Student Budget,Average Math Score,Average Reading Score,% of Students Passing Math,% of Students Passing Reading,% Overall Passing
Bailey High School,District,4976,"$3,124,928.00",$628.00,77.048432,81.033963,66.680064,81.93328,54.642283
Cabrera High School,Charter,1858,"$1,081,356.00",$582.00,83.061895,83.97578,94.133477,97.039828,91.334769
Figueroa High School,District,2949,"$1,884,411.00",$639.00,76.711767,81.15802,65.988471,80.739234,53.204476
Ford High School,District,2739,"$1,763,916.00",$644.00,77.102592,80.746258,68.309602,79.299014,54.289887
Griffin High School,Charter,1468,"$917,500.00",$625.00,83.351499,83.816757,93.392371,97.138965,90.599455
Hernandez High School,District,4635,"$3,022,020.00",$652.00,77.289752,80.934412,66.752967,80.862999,53.527508
Holden High School,Charter,427,"$248,087.00",$581.00,83.803279,83.814988,92.505855,96.252927,89.227166
Huang High School,District,2917,"$1,910,635.00",$655.00,76.629414,81.182722,65.683922,81.316421,53.513884
Johnson High School,District,4761,"$3,094,650.00",$650.00,77.072464,80.966394,66.057551,81.222432,53.539172
Pena High School,Charter,962,"$585,858.00",$609.00,83.839917,84.044699,94.594595,95.945946,90.540541


In [39]:
# Calculate averages for the desired columns.
size_math_scores = per_school_summary.groupby(["School Size"])["Average Math Score"].mean()
size_reading_scores = per_school_summary.groupby(["School Size"])["Average Reading Score"].mean()
size_passing_math = per_school_summary.groupby(["School Size"])["% of Students Passing Math"].mean()
size_passing_reading = per_school_summary.groupby(["School Size"])["% of Students Passing Reading"].mean()
size_overall_passing = per_school_summary.groupby(["School Size"])["% Overall Passing"].mean()



In [40]:
# Create a DataFrame called `size_summary` that breaks down school performance based on school size (small, medium, or large).
# Use the scores above to create a new DataFrame called `size_summary`
size_summary =pd.DataFrame({
    "Average Math Score":size_math_scores,
    "Average Reading Score":size_reading_scores,
    "% Passing Math":size_passing_math,
    "% Passing Reading":size_passing_reading,
    "% Overall Passing":size_overall_passing
})
# Display results
size_summary

School Size,Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing
Small (<1000),83.821598,83.929843,93.550225,96.099437,89.883853
Medium (1000-2000),83.374684,83.864438,93.599695,96.79068,90.621535
Large (2000-5000),77.746417,81.344493,69.963361,82.766634,58.286003


## Scores by School Type

In [41]:
# Group the per_school_summary DataFrame by "School Type" and average the results.
average_math_score_by_type = per_school_summary.groupby(["School Type"])["Average Math Score"].mean()
average_reading_score_by_type = per_school_summary.groupby(["School Type"])["Average Reading Score"].mean()
average_percent_passing_math_by_type = per_school_summary.groupby(["School Type"])["% of Students Passing Math"].mean()
average_percent_passing_reading_by_type = per_school_summary.groupby(["School Type"])["% of Students Passing Reading"].mean()
average_percent_overall_passing_by_type = per_school_summary.groupby(["School Type"])["% Overall Passing"].mean()


In [42]:
# Assemble the new data by type into a DataFrame called `type_summary`
type_summary =pd.DataFrame({
    "Average Math Score":average_math_score_by_type,
    "Average Reading Score":average_reading_score_by_type,
    "% Passing Math":average_percent_passing_math_by_type,
    "% Passing Reading":average_percent_passing_reading_by_type,
    "% Overall Passing":average_percent_overall_passing_by_type
})
# Display results
type_summary


School Type,Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing
Charter,83.473852,83.896421,93.62083,96.586489,90.432244
District,76.956733,80.966636,66.548453,80.799062,53.672208
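
The type, size, and spending summaries average the per-school rates, so a school with 427 students counts as much as one with 4,976. Whether that is the right choice depends on the question being asked. The sketch below uses two charter schools from the per-school summary to show how the unweighted mean can differ from a student-weighted rate. The weighted calculation is an illustration, not something the notebook computes.

```python
# School-level values copied from the per-school summary above:
# (student census, % overall passing)
charter_schools = {
    "Holden High School": (427, 89.227166),
    "Cabrera High School": (1858, 91.334769),
}

# Unweighted: every school counts equally, as in groupby(...).mean() above
unweighted = sum(rate for _, rate in charter_schools.values()) / len(charter_schools)

# Student-weighted: larger schools contribute proportionally more
total_students = sum(size for size, _ in charter_schools.values())
weighted = sum(size * rate for size, rate in charter_schools.values()) / total_students

print(f"Unweighted mean of school rates: {unweighted:.2f}%")
print(f"Student-weighted rate:           {weighted:.2f}%")
```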
