# PyCitySchools


by Ryan Cornelius for UMN Data Vis Course 2023
Code from "/Starter_Code/PyCitySchools_starter.ipynb" used as a reference

Written report:

Below is a series of calculations and summaries of data collected across the school district and its student population. The district has 15 schools and 39,170 students. Student test results are summarized as passing rates for math, for reading, and overall (students who passed both). Reading has the higher passing rate: 85.81% of students passed reading versus 74.98% for math. All of the top 5 schools by overall passing rate are charter schools, and all of the bottom 5 are district schools. There appears to be an inverse relationship between per-student spending and passing rates: the average overall passing rate falls from about 90% in the lowest spending range (<$585) to about 54% in the highest ($645-680). Per-student budget is in turn related to both school size and school type. Small (<1000) and medium (1000-2000) schools had similar overall passing rates (about 90%), while schools with more than 2000 students averaged markedly lower (about 58%); no statistical test was run on these differences. The largest difference appears when comparing district schools to charter schools: charter schools average a 90.43% overall passing rate versus 53.67% for district schools. School type may explain the size relationship, since only one charter school has more than 2000 students and every school under 2000 students is a charter school. Charter schools also receive less funding per student on average, which may likewise account for the inverse relationship between per-student budget and passing rates.


In [19]:
#Dependencies
import pandas as pd
from pathlib import Path

#Loading Data
school_data = pd.read_csv('Resources/schools_complete.csv')
student_data = pd.read_csv('Resources/students_complete.csv')

#Doing the initial merge
CombinedSchoolDF = pd.merge(student_data, school_data, how="left", on="school_name")

#Setting up a dictionary for my results over the district. 
DistrictResults = {'Number of Schools' : [], 'Number of Students' : [], 'Budget' : [],
                   'Average Math Score' : [], 'Average Reading Score' : [], 
                   'Math Pass Rate' : [], 'Reading Pass Rate' : [], 'Combined Pass Rate' : []}

## District Summary

Calculating the following information on the entire school district, per instructions:

    Total number of unique schools
    Total students
    Total budget
    Average math score
    Average reading score
    % passing math (the percentage of students who passed math)
    % passing reading (the percentage of students who passed reading)
    % overall passing (the percentage of students who passed math AND reading)
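
The three passing percentages in the list above can also be computed as the mean of a boolean mask, which avoids dividing by a separately tracked student count. The following is a minimal sketch on a hypothetical four-student toy table (`toy` and `PASSING_SCORE` are illustrative names, not part of the notebook); the column names mirror the merged DataFrame and the threshold of 70 matches the one used in the cells below.

```python
import pandas as pd

# Hypothetical toy data; column names mirror the merged DataFrame.
toy = pd.DataFrame({
    "Student ID": [0, 1, 2, 3],
    "math_score": [65, 70, 88, 92],
    "reading_score": [71, 69, 90, 75],
})

PASSING_SCORE = 70  # same threshold the notebook uses

# Boolean masks: True where the student passed.
passed_math = toy["math_score"] >= PASSING_SCORE
passed_reading = toy["reading_score"] >= PASSING_SCORE

# The mean of a boolean Series is the fraction of True values.
math_rate = passed_math.mean() * 100
reading_rate = passed_reading.mean() * 100
overall_rate = (passed_math & passed_reading).mean() * 100

print(f"{math_rate:.1f}% math, {reading_rate:.1f}% reading, {overall_rate:.1f}% overall")
```

Note that the overall rate (students passing both subjects) cannot be higher than either single-subject rate, which is a quick sanity check on the district figures.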

In [20]:
#Total Number of schools calculated
SchoolList = CombinedSchoolDF['school_name'].unique()
NumSchools = len(SchoolList)

DistrictResults['Number of Schools'].append(NumSchools)
print(f'There are {NumSchools} schools.')

There are 15 schools.


In [21]:
#Total number of students calculated. Using anything other than Student ID causes duplicate name issues.
StudentList = CombinedSchoolDF['Student ID'].unique()
NumStudents = len(StudentList)

DistrictResults['Number of Students'].append(NumStudents)
print(f'There are {NumStudents} students.')

There are 39170 students.


In [22]:
#Total budget calculated.
TotalBudget = school_data['budget'].sum()

DistrictResults['Budget'].append(TotalBudget)
print(f'The total budget is ${TotalBudget}.')

The total budget is $24649428.


In [23]:
#Average math and reading score calculated
AverageMath = CombinedSchoolDF['math_score'].mean()
AverageReading = CombinedSchoolDF['reading_score'].mean()

DistrictResults['Average Math Score'].append(AverageMath)
DistrictResults['Average Reading Score'].append(AverageReading)
print(f'The average math and reading scores are {AverageMath:.3F} and {AverageReading:.3F}.')

The average math and reading scores are 78.985 and 81.878.


In [24]:
# % passing math, reading, and overall calculated
MathPassRate = CombinedSchoolDF.loc[CombinedSchoolDF['math_score']>=70,'Student ID'
                                   ].count() / NumStudents *100
ReadingPassRate = CombinedSchoolDF.loc[CombinedSchoolDF['reading_score']>=70,'Student ID'
                                      ].count() / NumStudents *100
CombinedPassRate = CombinedSchoolDF.loc[(CombinedSchoolDF['reading_score']>=70) & (CombinedSchoolDF['math_score']>=70),'Student ID'
                                       ].count() / NumStudents *100

DistrictResults['Math Pass Rate'].append(MathPassRate)
DistrictResults['Reading Pass Rate'].append(ReadingPassRate)
DistrictResults['Combined Pass Rate'].append(CombinedPassRate)

print(f'The pass rates are: {MathPassRate:0.2f}% math, {ReadingPassRate:0.2f}% reading, and {CombinedPassRate:0.2f}% combined.')

The pass rates are: 74.98% math, 85.81% reading, and 65.17% combined.


In [25]:
# Creating a DataFrame of the results for the district 
DistrictResultsDF = pd.DataFrame(data=DistrictResults)

# Applying formatting 
DistrictResultsDF["Number of Students"] = DistrictResultsDF["Number of Students"
                                                           ].map('{:,}'.format)
DistrictResultsDF["Budget"] = DistrictResultsDF["Budget"].map('${:,.0f}'.format)
DistrictResultsDF["Average Math Score"] = DistrictResultsDF["Average Math Score"
                                                           ].map('{:.3F}'.format)
DistrictResultsDF["Average Reading Score"] = DistrictResultsDF["Average Reading Score"
                                                              ].map('{:.3F}'.format)
DistrictResultsDF["Math Pass Rate"] = DistrictResultsDF["Math Pass Rate"
                                                       ].map('{:.3F}'.format)
DistrictResultsDF["Reading Pass Rate"] = DistrictResultsDF["Reading Pass Rate"
                                                          ].map('{:.3F}'.format)
DistrictResultsDF["Combined Pass Rate"] = DistrictResultsDF["Combined Pass Rate"
                                                           ].map('{:.3F}'.format)
#Showing results
DistrictResultsDF

|  | Number of Schools | Number of Students | Budget | Average Math Score | Average Reading Score | Math Pass Rate | Reading Pass Rate | Combined Pass Rate |
|---|---|---|---|---|---|---|---|---|
| 0 | 15 | 39170 | $24,649,428 | 78.985 | 81.878 | 74.981 | 85.805 | 65.172 |


## School Summary

As per instructions, create a DataFrame that includes the following for each school:

    School name
    School type
    Total students
    Total school budget
    Per student budget
    Average math score
    Average reading score
    % passing math (the percentage of students who passed math)
    % passing reading (the percentage of students who passed reading)
    % overall passing (the percentage of students who passed math AND reading)
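
As an alternative to building the per-school table one school at a time, the same columns can be produced with a single `groupby` and named aggregation. This is only a sketch under stated assumptions: `toy` is a hypothetical five-student table, and the output column names (`school_type`, `pct_overall`, and so on) are illustrative rather than the ones used in this notebook.

```python
import pandas as pd

# Hypothetical toy data with the same column names as the merged DataFrame.
toy = pd.DataFrame({
    "school_name": ["A High", "A High", "B High", "B High", "B High"],
    "type": ["Charter", "Charter", "District", "District", "District"],
    "budget": [1200, 1200, 1800, 1800, 1800],
    "math_score": [90, 60, 70, 80, 50],
    "reading_score": [85, 95, 65, 75, 72],
})

# Pass/fail flags per student (threshold of 70, as in the notebook).
toy["passed_math"] = toy["math_score"] >= 70
toy["passed_reading"] = toy["reading_score"] >= 70
toy["passed_both"] = toy["passed_math"] & toy["passed_reading"]

# One row per school; the mean of a boolean column is the pass fraction.
summary = toy.groupby("school_name").agg(
    school_type=("type", "first"),
    students=("math_score", "count"),
    budget=("budget", "first"),
    avg_math=("math_score", "mean"),
    avg_reading=("reading_score", "mean"),
    pct_math=("passed_math", "mean"),
    pct_reading=("passed_reading", "mean"),
    pct_overall=("passed_both", "mean"),
)
summary["budget_per_student"] = summary["budget"] / summary["students"]

# Convert fractions to percentages.
pct_cols = ["pct_math", "pct_reading", "pct_overall"]
summary[pct_cols] = summary[pct_cols] * 100

print(summary)
```

The loop in the cell below gives the same kind of result; the grouped version mainly avoids repeated `pd.concat` calls and reuse of variable names such as `NumStudents` from the district-level cells.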

In [26]:
#setting up an empty DF to fill
ResultsDF = pd.DataFrame()

#looping over schools to fill out a results dictionary, then append that dictionary as a DF row
for school in SchoolList:
    SchoolDF = CombinedSchoolDF.loc[CombinedSchoolDF['school_name'] == school,:]

    SchoolResults = {'School Name' : [], 'School Type' : [], 'Number of Students' : [],
                     'Budget' : [], 'Budget per Student' : [], 'Average Math Score' : [], 'Average Reading Score' : [], 
                       'Math Pass Rate' : [], 'Reading Pass Rate' : [], 'Combined Pass Rate' : []}

    SchoolResults['School Name'].append(school)

    Type = SchoolDF['type'].iloc[0]
    SchoolResults['School Type'].append(Type)

    NumStudents = len(SchoolDF['Student ID'].unique())
    SchoolResults['Number of Students'].append(NumStudents)

    budget = SchoolDF['budget'].iloc[0]
    SchoolResults['Budget'].append(budget)

    BudPerStu = budget / NumStudents
    SchoolResults['Budget per Student'].append(BudPerStu)

    AverageMath = SchoolDF['math_score'].mean()
    AverageReading = SchoolDF['reading_score'].mean()
    SchoolResults['Average Math Score'].append(AverageMath)
    SchoolResults['Average Reading Score'].append(AverageReading)

    MathPassRate = SchoolDF.loc[SchoolDF['math_score']>=70,
                                    'Student ID'].count() / NumStudents *100
    ReadingPassRate = SchoolDF.loc[SchoolDF['reading_score']>=70,
                                    'Student ID'].count() / NumStudents *100
    CombinedPassRate = SchoolDF.loc[(SchoolDF['reading_score']>=70) & (SchoolDF['math_score']>=70),
                                    'Student ID'].count() / NumStudents *100

    SchoolResults['Math Pass Rate'].append(MathPassRate)
    SchoolResults['Reading Pass Rate'].append(ReadingPassRate)
    SchoolResults['Combined Pass Rate'].append(CombinedPassRate)
    DF = pd.DataFrame(SchoolResults)

    ResultsDF = pd.concat([ResultsDF, DF])

#Sorting and arranging my results DF for per school results
ResultsDF = ResultsDF.set_index('School Name')
ResultsDF = ResultsDF.sort_values('School Name',ascending=True)

#Setting up a copy to use later that has no $ or % formatting
UnformattedResultsDF1 = ResultsDF.copy()

#formatting each column
ResultsDF["Number of Students"] = ResultsDF["Number of Students"
                                                           ].map('{:,}'.format)
ResultsDF["Budget"] = ResultsDF["Budget"].map('${:,.0f}'.format)
ResultsDF["Budget per Student"] = ResultsDF["Budget per Student"].map('${:,.2f}'.format)
ResultsDF["Average Math Score"] = ResultsDF["Average Math Score"
                                                           ].map('{:.3F}'.format)
ResultsDF["Average Reading Score"] = ResultsDF["Average Reading Score"
                                                              ].map('{:.3F}'.format)
UnformattedResultsDF2 = ResultsDF.copy()
ResultsDF["Math Pass Rate"] = ResultsDF["Math Pass Rate"
                                                       ].map('{:.3}'.format)
ResultsDF["Reading Pass Rate"] = ResultsDF["Reading Pass Rate"
                                                          ].map('{:.3}'.format)
ResultsDF["Combined Pass Rate"] = ResultsDF["Combined Pass Rate"
                                                           ].map('{:.3}'.format)

# Displaying my per-school results
display(ResultsDF)
    
    
    

| School Name | School Type | Number of Students | Budget | Budget per Student | Average Math Score | Average Reading Score | Math Pass Rate | Reading Pass Rate | Combined Pass Rate |
|---|---|---|---|---|---|---|---|---|---|
| Bailey High School | District | 4976 | $3,124,928 | $628.00 | 77.048 | 81.034 | 66.7 | 81.9 | 54.6 |
| Cabrera High School | Charter | 1858 | $1,081,356 | $582.00 | 83.062 | 83.976 | 94.1 | 97.0 | 91.3 |
| Figueroa High School | District | 2949 | $1,884,411 | $639.00 | 76.712 | 81.158 | 66.0 | 80.7 | 53.2 |
| Ford High School | District | 2739 | $1,763,916 | $644.00 | 77.103 | 80.746 | 68.3 | 79.3 | 54.3 |
| Griffin High School | Charter | 1468 | $917,500 | $625.00 | 83.351 | 83.817 | 93.4 | 97.1 | 90.6 |
| Hernandez High School | District | 4635 | $3,022,020 | $652.00 | 77.29 | 80.934 | 66.8 | 80.9 | 53.5 |
| Holden High School | Charter | 427 | $248,087 | $581.00 | 83.803 | 83.815 | 92.5 | 96.3 | 89.2 |
| Huang High School | District | 2917 | $1,910,635 | $655.00 | 76.629 | 81.183 | 65.7 | 81.3 | 53.5 |
| Johnson High School | District | 4761 | $3,094,650 | $650.00 | 77.072 | 80.966 | 66.1 | 81.2 | 53.5 |
| Pena High School | Charter | 962 | $585,858 | $609.00 | 83.84 | 84.045 | 94.6 | 95.9 | 90.5 |


## Highest-Performing Schools (by % Overall Passing)

In [27]:
# Displaying top schools (by combined rate).
# Ranking uses the numeric copy, because ResultsDF now holds formatted strings
# and sorting strings only works by coincidence.
top_order = UnformattedResultsDF1.sort_values('Combined Pass Rate',ascending=False).index
top_schools = ResultsDF.loc[top_order]
top_schools.head()

| School Name | School Type | Number of Students | Budget | Budget per Student | Average Math Score | Average Reading Score | Math Pass Rate | Reading Pass Rate | Combined Pass Rate |
|---|---|---|---|---|---|---|---|---|---|
| Cabrera High School | Charter | 1858 | $1,081,356 | $582.00 | 83.062 | 83.976 | 94.1 | 97.0 | 91.3 |
| Thomas High School | Charter | 1635 | $1,043,130 | $638.00 | 83.418 | 83.849 | 93.3 | 97.3 | 90.9 |
| Griffin High School | Charter | 1468 | $917,500 | $625.00 | 83.351 | 83.817 | 93.4 | 97.1 | 90.6 |
| Wilson High School | Charter | 2283 | $1,319,574 | $578.00 | 83.274 | 83.989 | 93.9 | 96.5 | 90.6 |
| Pena High School | Charter | 962 | $585,858 | $609.00 | 83.84 | 84.045 | 94.6 | 95.9 | 90.5 |


## Bottom-Performing Schools (by % Overall Passing)

In [28]:
# Displaying bottom schools (by combined rate).
# Ranking uses the numeric copy, because ResultsDF now holds formatted strings
# and schools that round to the same value (53.5) would otherwise tie.
bottom_order = UnformattedResultsDF1.sort_values('Combined Pass Rate',ascending=True).index
bottom_schools = ResultsDF.loc[bottom_order]
bottom_schools.head()

| School Name | School Type | Number of Students | Budget | Budget per Student | Average Math Score | Average Reading Score | Math Pass Rate | Reading Pass Rate | Combined Pass Rate |
|---|---|---|---|---|---|---|---|---|---|
| Rodriguez High School | District | 3999 | $2,547,363 | $637.00 | 76.843 | 80.745 | 66.4 | 80.2 | 53.0 |
| Figueroa High School | District | 2949 | $1,884,411 | $639.00 | 76.712 | 81.158 | 66.0 | 80.7 | 53.2 |
| Huang High School | District | 2917 | $1,910,635 | $655.00 | 76.629 | 81.183 | 65.7 | 81.3 | 53.5 |
| Hernandez High School | District | 4635 | $3,022,020 | $652.00 | 77.29 | 80.934 | 66.8 | 80.9 | 53.5 |
| Johnson High School | District | 4761 | $3,094,650 | $650.00 | 77.072 | 80.966 | 66.1 | 81.2 | 53.5 |


## Math and Reading Scores by Grade

In [29]:
#similar to before, setting up an empty dataframe that will be filled by a dictionary / DF
#this time for math results per grade
MathGradeResultsDF = pd.DataFrame()
GradeList = CombinedSchoolDF['grade'].unique()

#Looping over all schools and grades to generate rows (as dictionaries)
for school in SchoolList:
    SchoolResults = {'School Name' : [], '9th' : [], '10th' : [], '11th' : [], '12th' : []}
    SchoolResults['School Name'].append(school)
    
    SchoolDF = CombinedSchoolDF.loc[CombinedSchoolDF['school_name'] == school,:]
    for grade in GradeList:
        GradeDF = SchoolDF.loc[SchoolDF['grade'] == grade,:]

        AverageMath = GradeDF['math_score'].mean()
        SchoolResults[grade].append(AverageMath)
    DF = pd.DataFrame(SchoolResults)
    MathGradeResultsDF = pd.concat([MathGradeResultsDF, DF])

#formatting and arranging the results
MathGradeResultsDF = MathGradeResultsDF.set_index('School Name')
MathGradeResultsDF = MathGradeResultsDF.sort_values('School Name',ascending=True)

#displaying the results
MathGradeResultsDF

| School Name | 9th | 10th | 11th | 12th |
|---|---|---|---|---|
| Bailey High School | 77.083676 | 76.996772 | 77.515588 | 76.492218 |
| Cabrera High School | 83.094697 | 83.154506 | 82.76556 | 83.277487 |
| Figueroa High School | 76.403037 | 76.539974 | 76.884344 | 77.151369 |
| Ford High School | 77.361345 | 77.672316 | 76.918058 | 76.179963 |
| Griffin High School | 82.04401 | 84.229064 | 83.842105 | 83.356164 |
| Hernandez High School | 77.438495 | 77.337408 | 77.136029 | 77.186567 |
| Holden High School | 83.787402 | 83.429825 | 85.0 | 82.855422 |
| Huang High School | 77.027251 | 75.908735 | 76.446602 | 77.225641 |
| Johnson High School | 77.187857 | 76.691117 | 77.491653 | 76.863248 |
| Pena High School | 83.625455 | 83.372 | 84.328125 | 84.121547 |


In [30]:
#similar to before, setting up an empty dataframe that will be filled by a dictionary / DF
#this time for reading results per grade
ReadingGradeResultsDF = pd.DataFrame()
GradeList = CombinedSchoolDF['grade'].unique()

#Looping over all schools and grades to generate rows (as dictionaries)
for school in SchoolList:
    SchoolResults = {'School Name' : [], '9th' : [], '10th' : [], '11th' : [], '12th' : []}
    SchoolResults['School Name'].append(school)
    
    SchoolDF = CombinedSchoolDF.loc[CombinedSchoolDF['school_name'] == school,:]
    for grade in GradeList:
        GradeDF = SchoolDF.loc[SchoolDF['grade'] == grade,:]

        AverageReading = GradeDF['reading_score'].mean()
        SchoolResults[grade].append(AverageReading)
    DF = pd.DataFrame(SchoolResults)
    ReadingGradeResultsDF = pd.concat([ReadingGradeResultsDF, DF])

#formatting and arranging the results
ReadingGradeResultsDF = ReadingGradeResultsDF.set_index('School Name')
ReadingGradeResultsDF = ReadingGradeResultsDF.sort_values('School Name',ascending=True)

#displaying the results
ReadingGradeResultsDF

| School Name | 9th | 10th | 11th | 12th |
|---|---|---|---|---|
| Bailey High School | 81.303155 | 80.907183 | 80.945643 | 80.912451 |
| Cabrera High School | 83.676136 | 84.253219 | 83.788382 | 84.287958 |
| Figueroa High School | 81.198598 | 81.408912 | 80.640339 | 81.384863 |
| Ford High School | 80.632653 | 81.262712 | 80.403642 | 80.662338 |
| Griffin High School | 83.369193 | 83.706897 | 84.288089 | 84.013699 |
| Hernandez High School | 80.86686 | 80.660147 | 81.39614 | 80.857143 |
| Holden High School | 83.677165 | 83.324561 | 83.815534 | 84.698795 |
| Huang High School | 81.290284 | 81.512386 | 81.417476 | 80.305983 |
| Johnson High School | 81.260714 | 80.773431 | 80.616027 | 81.227564 |
| Pena High School | 83.807273 | 83.612 | 84.335938 | 84.59116 |
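
The school-by-grade averages above can also be built without nested loops by pivoting the merged data. The sketch below assumes a hypothetical toy table (`toy`) with the same `school_name`, `grade`, and `math_score` columns; swapping `math_score` for `reading_score` would give the reading version. The explicit `grade_order` list is an assumption added here so the columns come out as 9th through 12th rather than in alphabetical order.

```python
import pandas as pd

# Hypothetical toy data; column names mirror the merged DataFrame.
toy = pd.DataFrame({
    "school_name": ["A High"] * 4 + ["B High"] * 4,
    "grade": ["9th", "9th", "10th", "12th", "9th", "10th", "11th", "11th"],
    "math_score": [80, 90, 70, 60, 75, 85, 65, 95],
})

grade_order = ["9th", "10th", "11th", "12th"]

# Rows are schools, columns are grades, cells are mean scores.
math_by_grade = (
    toy.pivot_table(index="school_name", columns="grade",
                    values="math_score", aggfunc="mean")
       .reindex(columns=grade_order)
)

print(math_by_grade)
```

A school with no students in a grade shows `NaN` in that cell, which makes gaps in the data visible rather than silently dropping them.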


## Scores by School Spending

In [31]:
#using bins given
spending_bins = [0, 585, 630, 645, 680]
labels = ["<$585", "$585-630", "$630-645", "$645-680"]

#grabbing a copy of earlier unformatted dataframe, naming it same as the provided example.
school_spending_df = UnformattedResultsDF1.copy()

#using .cut over the provided bins to generate a new column
school_spending_df["Spending Ranges (Per Student)"] = pd.cut(school_spending_df['Budget per Student'].astype('long'),
                                                            spending_bins,labels=labels,right=True)

#arranging and formatting columns
school_spending_df = school_spending_df.sort_values('School Name',ascending=True)
school_spending_df["Budget"] = school_spending_df["Budget"].map('${:,.0f}'.format)
school_spending_df["Budget per Student"] = school_spending_df["Budget per Student"].map('${:,.2f}'.format)

#displaying my resulting dataframe
school_spending_df

| School Name | School Type | Number of Students | Budget | Budget per Student | Average Math Score | Average Reading Score | Math Pass Rate | Reading Pass Rate | Combined Pass Rate | Spending Ranges (Per Student) |
|---|---|---|---|---|---|---|---|---|---|---|
| Bailey High School | District | 4976 | $3,124,928 | $628.00 | 77.048432 | 81.033963 | 66.680064 | 81.93328 | 54.642283 | $585-630 |
| Cabrera High School | Charter | 1858 | $1,081,356 | $582.00 | 83.061895 | 83.97578 | 94.133477 | 97.039828 | 91.334769 | <$585 |
| Figueroa High School | District | 2949 | $1,884,411 | $639.00 | 76.711767 | 81.15802 | 65.988471 | 80.739234 | 53.204476 | $630-645 |
| Ford High School | District | 2739 | $1,763,916 | $644.00 | 77.102592 | 80.746258 | 68.309602 | 79.299014 | 54.289887 | $630-645 |
| Griffin High School | Charter | 1468 | $917,500 | $625.00 | 83.351499 | 83.816757 | 93.392371 | 97.138965 | 90.599455 | $585-630 |
| Hernandez High School | District | 4635 | $3,022,020 | $652.00 | 77.289752 | 80.934412 | 66.752967 | 80.862999 | 53.527508 | $645-680 |
| Holden High School | Charter | 427 | $248,087 | $581.00 | 83.803279 | 83.814988 | 92.505855 | 96.252927 | 89.227166 | <$585 |
| Huang High School | District | 2917 | $1,910,635 | $655.00 | 76.629414 | 81.182722 | 65.683922 | 81.316421 | 53.513884 | $645-680 |
| Johnson High School | District | 4761 | $3,094,650 | $650.00 | 77.072464 | 80.966394 | 66.057551 | 81.222432 | 53.539172 | $645-680 |
| Pena High School | Charter | 962 | $585,858 | $609.00 | 83.839917 | 84.044699 | 94.594595 | 95.945946 | 90.540541 | $585-630 |


In [32]:
#creating a dataframe grouped by the spending ranges and calculating the means.
#Only the numeric score columns are averaged: the budget columns were formatted as
#strings above, so they are selected out before calling .mean() to avoid a warning or error.
score_columns = ["Average Math Score","Average Reading Score","Math Pass Rate","Reading Pass Rate","Combined Pass Rate"]
spending_summary = school_spending_df.groupby("Spending Ranges (Per Student)", observed=False)[score_columns].mean()
spending_summary


| Spending Ranges (Per Student) | Average Math Score | Average Reading Score | Math Pass Rate | Reading Pass Rate | Combined Pass Rate |
|---|---|---|---|---|---|
| <$585 | 83.455399 | 83.933814 | 93.460096 | 96.610877 | 90.369459 |
| $585-630 | 81.899826 | 83.155286 | 87.133538 | 92.718205 | 81.418596 |
| $630-645 | 78.518855 | 81.624473 | 73.484209 | 84.391793 | 62.857656 |
| $645-680 | 76.99721 | 81.027843 | 66.164813 | 81.133951 | 53.526855 |
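
Because the binning cell calls `pd.cut` with `right=True`, each range includes its upper edge and excludes its lower edge, so a school spending exactly $585 per student would land in the bin labeled "<$585". The short sketch below illustrates that boundary behavior using the same bins and labels; the `sample` values are hypothetical and chosen only to sit on or near the edges.

```python
import pandas as pd

spending_bins = [0, 585, 630, 645, 680]
labels = ["<$585", "$585-630", "$630-645", "$645-680"]

# Hypothetical per-student budgets placed on and around the bin edges.
sample = pd.Series([581.0, 585.0, 585.01, 630.0, 652.0])

# right=True means intervals are (0, 585], (585, 630], (630, 645], (645, 680].
binned = pd.cut(sample, spending_bins, labels=labels, right=True)

print(pd.DataFrame({"per_student": sample, "bin": binned}))
```

If the intent were for "<$585" to exclude $585 itself, passing `right=False` would switch to left-closed intervals; none of the schools shown above sits exactly on an edge, so the reported summary is the same either way for those rows.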


## Scores by School Size

In [33]:
#using bins given
size_bins = [0, 1000, 2000, 5000]
poplabels = ["Small (<1000)", "Medium (1000-2000)", "Large (2000-5000)"]

In [34]:
#grabbing a copy of earlier unformatted dataframe, naming it same as the provided example.
per_school_summary = UnformattedResultsDF1.copy()

# Per given: use `pd.cut` on the student-count column ("Number of Students" here) of the `per_school_summary` DataFrame.
per_school_summary["School Size"] = pd.cut(per_school_summary['Number of Students'].astype('long'),
                                                            size_bins,labels=poplabels)

#arranging and formatting columns
per_school_summary = per_school_summary.sort_values('School Name',ascending=True)
per_school_summary["Budget"] = per_school_summary["Budget"].map('${:,.0f}'.format)
per_school_summary["Budget per Student"] = per_school_summary["Budget per Student"].map('${:,.2f}'.format)

#displaying my resulting dataframe
per_school_summary

| School Name | School Type | Number of Students | Budget | Budget per Student | Average Math Score | Average Reading Score | Math Pass Rate | Reading Pass Rate | Combined Pass Rate | School Size |
|---|---|---|---|---|---|---|---|---|---|---|
| Bailey High School | District | 4976 | $3,124,928 | $628.00 | 77.048432 | 81.033963 | 66.680064 | 81.93328 | 54.642283 | Large (2000-5000) |
| Cabrera High School | Charter | 1858 | $1,081,356 | $582.00 | 83.061895 | 83.97578 | 94.133477 | 97.039828 | 91.334769 | Medium (1000-2000) |
| Figueroa High School | District | 2949 | $1,884,411 | $639.00 | 76.711767 | 81.15802 | 65.988471 | 80.739234 | 53.204476 | Large (2000-5000) |
| Ford High School | District | 2739 | $1,763,916 | $644.00 | 77.102592 | 80.746258 | 68.309602 | 79.299014 | 54.289887 | Large (2000-5000) |
| Griffin High School | Charter | 1468 | $917,500 | $625.00 | 83.351499 | 83.816757 | 93.392371 | 97.138965 | 90.599455 | Medium (1000-2000) |
| Hernandez High School | District | 4635 | $3,022,020 | $652.00 | 77.289752 | 80.934412 | 66.752967 | 80.862999 | 53.527508 | Large (2000-5000) |
| Holden High School | Charter | 427 | $248,087 | $581.00 | 83.803279 | 83.814988 | 92.505855 | 96.252927 | 89.227166 | Small (<1000) |
| Huang High School | District | 2917 | $1,910,635 | $655.00 | 76.629414 | 81.182722 | 65.683922 | 81.316421 | 53.513884 | Large (2000-5000) |
| Johnson High School | District | 4761 | $3,094,650 | $650.00 | 77.072464 | 80.966394 | 66.057551 | 81.222432 | 53.539172 | Large (2000-5000) |
| Pena High School | Charter | 962 | $585,858 | $609.00 | 83.839917 | 84.044699 | 94.594595 | 95.945946 | 90.540541 | Small (<1000) |


In [35]:
#grouping the previous dataframe to the size bins, averaging, and displaying the desired columns.
#The score columns are selected before .mean() because the budget columns are now formatted strings.
score_columns = ["Average Math Score","Average Reading Score","Math Pass Rate","Reading Pass Rate","Combined Pass Rate"]
size_summary = per_school_summary.groupby("School Size", observed=False)[score_columns].mean()
size_summary


| School Size | Average Math Score | Average Reading Score | Math Pass Rate | Reading Pass Rate | Combined Pass Rate |
|---|---|---|---|---|---|
| Small (<1000) | 83.821598 | 83.929843 | 93.550225 | 96.099437 | 89.883853 |
| Medium (1000-2000) | 83.374684 | 83.864438 | 93.599695 | 96.79068 | 90.621535 |
| Large (2000-5000) | 77.746417 | 81.344493 | 69.963361 | 82.766634 | 58.286003 |
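
One way to check whether school size and school type are tangled together is to cross-tabulate them. The sketch below is illustrative only: it hard-codes the type and size labels of the ten schools displayed in the per-school size table above (not all 15 schools in the district), and the names `shown` and `counts` are introduced here for the example.

```python
import pandas as pd

# Type and size labels for the ten schools shown in the table above.
shown = pd.DataFrame(
    {
        "School Type": ["District", "Charter", "District", "District", "Charter",
                        "District", "Charter", "District", "District", "Charter"],
        "School Size": ["Large (2000-5000)", "Medium (1000-2000)", "Large (2000-5000)",
                        "Large (2000-5000)", "Medium (1000-2000)", "Large (2000-5000)",
                        "Small (<1000)", "Large (2000-5000)", "Large (2000-5000)",
                        "Small (<1000)"],
    },
    index=["Bailey", "Cabrera", "Figueroa", "Ford", "Griffin",
           "Hernandez", "Holden", "Huang", "Johnson", "Pena"],
)

# Count schools in each size/type combination.
counts = pd.crosstab(shown["School Size"], shown["School Type"])
print(counts)
```

Among these ten schools, every large school is a district school and every small or medium school is a charter school, so size and type cannot be separated from this subset alone.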


## Scores by School Type

In [36]:
#grouping the original results dataframe to the school type, averaging, and displaying the desired columns
type_summary = UnformattedResultsDF1.groupby(["School Type"]).mean()
type_summary = type_summary[["Average Math Score","Average Reading Score","Math Pass Rate","Reading Pass Rate","Combined Pass Rate"]]
type_summary

| School Type | Average Math Score | Average Reading Score | Math Pass Rate | Reading Pass Rate | Combined Pass Rate |
|---|---|---|---|---|---|
| Charter | 83.473852 | 83.896421 | 93.62083 | 96.586489 | 90.432244 |
| District | 76.956733 | 80.966636 | 66.548453 | 80.799062 | 53.672208 |
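
The grouped summaries in this notebook average per-school rates, so each school counts equally regardless of enrollment. A student-weighted rate can differ noticeably when school sizes vary. The sketch below illustrates the gap using two rows copied from the per-school results (Holden and Bailey); the variable names are introduced here for the example and it is not a recalculation of the full summary.

```python
import pandas as pd

# Two rows copied from the per-school results above.
schools = pd.DataFrame(
    {
        "Number of Students": [427, 4976],
        "Combined Pass Rate": [89.227166, 54.642283],
    },
    index=["Holden High School", "Bailey High School"],
)

# Equal weight per school (what groupby(...).mean() does).
unweighted = schools["Combined Pass Rate"].mean()

# Weight each school's rate by its enrollment.
weighted = (
    (schools["Combined Pass Rate"] * schools["Number of Students"]).sum()
    / schools["Number of Students"].sum()
)

print(f"Mean of school rates: {unweighted:.2f}%")
print(f"Student-weighted rate: {weighted:.2f}%")
```

The weighted figure sits much closer to the larger school's rate. The same effect is why the district-wide overall passing rate of 65.17% is not simply the average of the 15 school rates.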
