# PyCity Schools Analysis

* As a whole, schools with higher budgets did not yield better test results. On the contrary, schools with higher spending per student (\$645-675) underperformed schools with lower spending (<\$585 per student), with roughly 64% versus 90% of students passing math.

* As a whole, small and medium-sized schools dramatically outperformed large schools on math passing rates (about 90-91% passing vs. 68%).

* As a whole, charter schools outperformed the public district schools across all metrics. However, more analysis is required to determine whether the effect is due to school practices or to the fact that charter schools tend to serve smaller student populations per school.
---

In [1]:
import pandas as pd

In [2]:
schools_df = pd.read_csv("./raw_data/schools_complete.csv")
students_df = pd.read_csv("./raw_data/students_complete.csv")

In [3]:
schools_df.head(2)

Unnamed: 0,School ID,name,type,size,budget
0,0,Huang High School,District,2917,1910635
1,1,Figueroa High School,District,2949,1884411


In [5]:
schools_df.describe()

Unnamed: 0,School ID,size,budget
count,15.0,15.0,15.0
mean,7.0,2611.333333,1643295.0
std,4.472136,1420.915282,934776.3
min,0.0,427.0,248087.0
25%,3.5,1698.0,1046265.0
50%,7.0,2283.0,1319574.0
75%,10.5,3474.0,2228999.0
max,14.0,4976.0,3124928.0


In [6]:
students_df.head(2)

Unnamed: 0,Student ID,name,gender,grade,school,reading_score,math_score
0,0,Paul Bradley,M,9th,Huang High School,66,79
1,1,Victor Smith,M,12th,Huang High School,94,61


In [7]:
students_df.describe()

Unnamed: 0,Student ID,reading_score,math_score
count,39170.0,39170.0,39170.0
mean,19584.5,81.87784,78.985371
std,11307.549359,10.23958,12.309968
min,0.0,63.0,55.0
25%,9792.25,73.0,69.0
50%,19584.5,82.0,79.0
75%,29376.75,91.0,89.0
max,39169.0,99.0,99.0


In [8]:
students_df.columns

Index(['Student ID', 'name', 'gender', 'grade', 'school', 'reading_score',
       'math_score'],
      dtype='object')

In [9]:
schools_df.columns

Index(['School ID', 'name', 'type', 'size', 'budget'], dtype='object')

In [10]:
students_renamed_df = students_df.rename(columns={'school':'school_name'})
students_renamed_df.head(2)

Unnamed: 0,Student ID,name,gender,grade,school_name,reading_score,math_score
0,0,Paul Bradley,M,9th,Huang High School,66,79
1,1,Victor Smith,M,12th,Huang High School,94,61


In [11]:
schools_renamed_df = schools_df.rename(columns={'name':'school_name'})
schools_renamed_df.head(2)

Unnamed: 0,School ID,school_name,type,size,budget
0,0,Huang High School,District,2917,1910635
1,1,Figueroa High School,District,2949,1884411


In [12]:
district_df = pd.merge(schools_renamed_df, students_renamed_df, on='school_name')
district_df.head(2)
#district_df.describe()

Unnamed: 0,School ID,school_name,type,size,budget,Student ID,name,gender,grade,reading_score,math_score
0,0,Huang High School,District,2917,1910635,0,Paul Bradley,M,9th,66,79
1,0,Huang High School,District,2917,1910635,1,Victor Smith,M,12th,94,61
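
The merge above relies on every student's `school_name` matching exactly one row in the schools table. A minimal sketch of how that assumption could be checked, using the `validate` and `indicator` options of `pd.merge`; the third student and the school "Unlisted High School" are hypothetical rows added only to trigger a mismatch:

```python
import pandas as pd

# Two real school rows from the notebook output.
schools = pd.DataFrame({
    "school_name": ["Huang High School", "Figueroa High School"],
    "type": ["District", "District"],
})

# Two real student rows plus one hypothetical student at an unlisted school.
students = pd.DataFrame({
    "name": ["Paul Bradley", "Victor Smith", "Sample Student"],
    "school_name": ["Huang High School", "Huang High School", "Unlisted High School"],
})

# validate="one_to_many" raises if a school name is duplicated on the left;
# indicator=True adds a "_merge" column showing where each row came from.
merged = pd.merge(
    schools,
    students,
    on="school_name",
    how="outer",
    validate="one_to_many",
    indicator=True,
)

# Rows that did not find a partner on the other side.
unmatched = merged[merged["_merge"] != "both"]
print(unmatched[["school_name", "_merge"]])
```

With the default inner join used in the notebook, such unmatched rows would be dropped silently, so a check like this is mainly useful before the real merge.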


In [13]:
district_df.describe()

Unnamed: 0,School ID,size,budget,Student ID,reading_score,math_score
count,39170.0,39170.0,39170.0,39170.0,39170.0,39170.0
mean,6.978172,3332.95711,2117241.0,19584.5,81.87784,78.985371
std,4.444329,1323.914069,874998.7,11307.549359,10.23958,12.309968
min,0.0,427.0,248087.0,0.0,63.0,55.0
25%,3.0,1858.0,1081356.0,9792.25,73.0,69.0
50%,7.0,2949.0,1910635.0,19584.5,82.0,79.0
75%,11.0,4635.0,3022020.0,29376.75,91.0,89.0
max,14.0,4976.0,3124928.0,39169.0,99.0,99.0


## District Summary

In [14]:
totalSchools = schools_df['School ID'].nunique()
totalStudents = f"{students_df['Student ID'].nunique():,}"
totalBudget = f"${schools_df['budget'].sum():,.2f}"
averageMathScore = students_df['math_score'].mean()
averageReadingScore = students_df['reading_score'].mean()
percPassingMath = len(students_df[students_df['math_score']>70]) / students_df['Student ID'].nunique() * 100
percPassingReading = len(students_df[students_df['reading_score']>70]) / students_df['Student ID'].nunique() * 100
#
# To calculate the overall passing percentage, take the average of the math and reading passing percentages
#
percOverallPassingRate = (percPassingMath + percPassingReading) /2
pd.DataFrame({'Total Schools':[totalSchools],
              'Total Students':[totalStudents],
              'Total Budget':[totalBudget],
              'Average Math Score':[averageMathScore],
              'Average Reading Score':[averageReadingScore],
              '% Passing Math':[percPassingMath],
              '% Passing Reading':[percPassingReading],
              '% Overall Passing Rate':[percOverallPassingRate]
             },
            columns=[
                'Total Schools','Total Students','Total Budget',
                'Average Math Score','Average Reading Score',
                '% Passing Math','% Passing Reading','% Overall Passing Rate'
            ])

Unnamed: 0,Total Schools,Total Students,Total Budget,Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing Rate
0,15,39170,"$24,649,428.00",78.985371,81.87784,72.392137,82.971662,77.681899


Unnamed: 0,Total Schools,Total Students,Total Budget,Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing Rate
0,15,39170,"$24,649,428.00",78.985371,81.87784,72.392137,82.971662,80.431606
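
The passing percentages in this cell can equivalently be computed as the mean of a boolean mask. The sketch below uses four made-up students (not rows from the dataset). As an assumption that goes beyond the notebook, it also contrasts the averaged overall rate with the share of students who pass both subjects, which is a stricter definition.

```python
import pandas as pd

# Four illustrative students; these scores are not taken from the dataset.
toy = pd.DataFrame({
    "math_score": [79, 61, 88, 70],
    "reading_score": [66, 94, 71, 85],
})

# The notebook treats a score strictly above 70 as passing.
passed_math = toy["math_score"] > 70
passed_reading = toy["reading_score"] > 70

# The mean of a boolean Series is the fraction of True values.
pct_math = passed_math.mean() * 100
pct_reading = passed_reading.mean() * 100

# Definition used in the notebook: average of the two subject rates.
overall_averaged = (pct_math + pct_reading) / 2

# Alternative (assumption, not the notebook's method): pass both subjects.
overall_both = (passed_math & passed_reading).mean() * 100

print(pct_math, pct_reading, overall_averaged, overall_both)
```

The two definitions can differ substantially, so it is worth stating which one a reported "overall" figure uses.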


## School Summary

In [15]:
district_df.columns

Index(['School ID', 'school_name', 'type', 'size', 'budget', 'Student ID',
       'name', 'gender', 'grade', 'reading_score', 'math_score'],
      dtype='object')

In [16]:
district_details_gb = district_df.groupby(['school_name','type'])

In [17]:
totalStudents = district_details_gb['School ID'].size()
totalSchoolBudget = district_details_gb['budget'].mean()
perStudentBudget = totalSchoolBudget / totalStudents
averageMathScore = district_details_gb['math_score'].mean()
averageReadingScore = district_details_gb['reading_score'].mean()
# Steps for each passing percentage:
# 1. Keep only the rows of the merged df with a passing score (above 70).
# 2. Group by school and count the passing students.
# 3. Divide by the school's student count and multiply by 100.
#
mathPassSubset = district_df[district_df['math_score']>70]
percPassingMath = (mathPassSubset.groupby('school_name').count()['math_score'] / totalStudents) * 100

readingPassSubset = district_df[district_df['reading_score']>70]
percPassingReading = (readingPassSubset.groupby('school_name').count()['reading_score'] / totalStudents) * 100

percOverallPassingRate = (percPassingMath + percPassingReading) / 2

In [18]:
school_summary = pd.DataFrame({'Average Reading Score':averageReadingScore,
              'Average Math Score':averageMathScore,
              'Total Students':totalStudents,
              'Total School Budget':totalSchoolBudget.map("${:,.2f}".format),
              'Per Student Budget':perStudentBudget.map("${:,.2f}".format),
              '% Passing Math':percPassingMath,
              '% Passing Reading':percPassingReading,
              '% Overall Passing Rate':percOverallPassingRate
             }, 
            columns=[
                #'School Type',
                'Total Students',
                'Total School Budget',
                'Per Student Budget',
                'Average Math Score',
                'Average Reading Score',
                '% Passing Math',
                '% Passing Reading',
                '% Overall Passing Rate'
            ]
)

In [19]:
#minor cleanup to match the output provided
school_summary.reset_index(inplace=True)
school_summary.rename(columns={'type':'School Type', 'school_name':''}, inplace=True)
school_summary.set_index('', inplace=True)

In [20]:
school_summary

Unnamed: 0,School Type,Total Students,Total School Budget,Per Student Budget,Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing Rate
Bailey High School,District,4976.0,"$3,124,928.00",$628.00,77.048432,81.033963,64.630225,79.300643,71.965434
Cabrera High School,Charter,1858.0,"$1,081,356.00",$582.00,83.061895,83.97578,89.558665,93.86437,91.711518
Figueroa High School,District,2949.0,"$1,884,411.00",$639.00,76.711767,81.15802,63.750424,78.433367,71.091896
Ford High School,District,2739.0,"$1,763,916.00",$644.00,77.102592,80.746258,65.753925,77.51004,71.631982
Griffin High School,Charter,1468.0,"$917,500.00",$625.00,83.351499,83.816757,89.713896,93.392371,91.553134
Hernandez High School,District,4635.0,"$3,022,020.00",$652.00,77.289752,80.934412,64.746494,78.187702,71.467098
Holden High School,Charter,427.0,"$248,087.00",$581.00,83.803279,83.814988,90.632319,92.740047,91.686183
Huang High School,District,2917.0,"$1,910,635.00",$655.00,76.629414,81.182722,63.318478,78.81385,71.066164
Johnson High School,District,4761.0,"$3,094,650.00",$650.00,77.072464,80.966394,63.852132,78.281874,71.067003


Unnamed: 0,School Type,Total Students,Total School Budget,Per Student Budget,Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing Rate
Bailey High School,District,4976,"$4,976.00",$628.00,77.048432,81.033963,64.630225,79.300643,71.965434
Cabrera High School,Charter,1858,"$1,858.00",$582.00,83.061895,83.97578,89.558665,93.86437,91.711518
Figueroa High School,District,2949,"$2,949.00",$639.00,76.711767,81.15802,63.750424,78.433367,71.091896
Ford High School,District,2739,"$2,739.00",$644.00,77.102592,80.746258,65.753925,77.51004,71.631982
Griffin High School,Charter,1468,"$1,468.00",$625.00,83.351499,83.816757,89.713896,93.392371,91.553134
Hernandez High School,District,4635,"$4,635.00",$652.00,77.289752,80.934412,64.746494,78.187702,71.467098
Holden High School,Charter,427,$427.00,$581.00,83.803279,83.814988,90.632319,92.740047,91.686183
Huang High School,District,2917,"$2,917.00",$655.00,76.629414,81.182722,63.318478,78.81385,71.066164
Johnson High School,District,4761,"$4,761.00",$650.00,77.072464,80.966394,63.852132,78.281874,71.067003
Pena High School,Charter,962,$962.00,$609.00,83.839917,84.044699,91.683992,92.203742,91.943867
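
The per-school metrics above are built from several separate groupby expressions. As a sketch of an alternative, the same quantities can be computed in one pass with named aggregation. The frame `toy` and its values are hypothetical stand-ins for `district_df`, and the snake_case column names are chosen only for this example.

```python
import pandas as pd

# Toy rows in the shape of district_df; values are illustrative only.
toy = pd.DataFrame({
    "school_name": ["A High", "A High", "A High", "B High", "B High"],
    "type": ["District", "District", "District", "Charter", "Charter"],
    "budget": [1500, 1500, 1500, 1200, 1200],
    "math_score": [79, 61, 88, 90, 70],
    "reading_score": [66, 94, 71, 85, 92],
})

PASS_MARK = 70  # the notebook counts scores strictly above 70 as passing

# Boolean pass flags; their group mean is the passing fraction.
flags = toy.assign(
    pass_math=toy["math_score"] > PASS_MARK,
    pass_reading=toy["reading_score"] > PASS_MARK,
)

summary = flags.groupby(["school_name", "type"]).agg(
    total_students=("math_score", "size"),
    school_budget=("budget", "first"),
    avg_math=("math_score", "mean"),
    avg_reading=("reading_score", "mean"),
    pct_pass_math=("pass_math", "mean"),
    pct_pass_reading=("pass_reading", "mean"),
)

# Convert fractions to percentages and derive the remaining columns.
summary["pct_pass_math"] = summary["pct_pass_math"] * 100
summary["pct_pass_reading"] = summary["pct_pass_reading"] * 100
summary["per_student_budget"] = summary["school_budget"] / summary["total_students"]
summary["pct_overall"] = (summary["pct_pass_math"] + summary["pct_pass_reading"]) / 2

print(summary)
```

Because every column comes from the same groupby, all results share one index and no cross-index alignment is needed.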


## Top Performing Schools (By Passing Rate)

In [21]:
school_summary.sort_values(by=['% Overall Passing Rate'], ascending=False).head(5)

Unnamed: 0,School Type,Total Students,Total School Budget,Per Student Budget,Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing Rate
Wilson High School,Charter,2283.0,"$1,319,574.00",$578.00,83.274201,83.989488,90.932983,93.25449,92.093736
Pena High School,Charter,962.0,"$585,858.00",$609.00,83.839917,84.044699,91.683992,92.203742,91.943867
Wright High School,Charter,1800.0,"$1,049,400.00",$583.00,83.682222,83.955,90.277778,93.444444,91.861111
Cabrera High School,Charter,1858.0,"$1,081,356.00",$582.00,83.061895,83.97578,89.558665,93.86437,91.711518
Holden High School,Charter,427.0,"$248,087.00",$581.00,83.803279,83.814988,90.632319,92.740047,91.686183


Unnamed: 0,School Type,Total Students,Total School Budget,Per Student Budget,Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing Rate
Wilson High School,Charter,2283,"$2,283.00",$578.00,83.274201,83.989488,90.932983,93.25449,92.093736
Pena High School,Charter,962,$962.00,$609.00,83.839917,84.044699,91.683992,92.203742,91.943867
Wright High School,Charter,1800,"$1,800.00",$583.00,83.682222,83.955,90.277778,93.444444,91.861111
Cabrera High School,Charter,1858,"$1,858.00",$582.00,83.061895,83.97578,89.558665,93.86437,91.711518
Holden High School,Charter,427,$427.00,$581.00,83.803279,83.814988,90.632319,92.740047,91.686183


## Bottom Performing Schools (By Passing Rate)

In [22]:
school_summary.sort_values(by=['% Overall Passing Rate']).head(5)

Unnamed: 0,School Type,Total Students,Total School Budget,Per Student Budget,Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing Rate
Rodriguez High School,District,3999.0,"$2,547,363.00",$637.00,76.842711,80.744686,64.066017,77.744436,70.905226
Huang High School,District,2917.0,"$1,910,635.00",$655.00,76.629414,81.182722,63.318478,78.81385,71.066164
Johnson High School,District,4761.0,"$3,094,650.00",$650.00,77.072464,80.966394,63.852132,78.281874,71.067003
Figueroa High School,District,2949.0,"$1,884,411.00",$639.00,76.711767,81.15802,63.750424,78.433367,71.091896
Hernandez High School,District,4635.0,"$3,022,020.00",$652.00,77.289752,80.934412,64.746494,78.187702,71.467098


Unnamed: 0,School Type,Total Students,Total School Budget,Per Student Budget,Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing Rate
Rodriguez High School,District,3999,"$3,999.00",$637.00,76.842711,80.744686,64.066017,77.744436,70.905226
Huang High School,District,2917,"$2,917.00",$655.00,76.629414,81.182722,63.318478,78.81385,71.066164
Johnson High School,District,4761,"$4,761.00",$650.00,77.072464,80.966394,63.852132,78.281874,71.067003
Figueroa High School,District,2949,"$2,949.00",$639.00,76.711767,81.15802,63.750424,78.433367,71.091896
Hernandez High School,District,4635,"$4,635.00",$652.00,77.289752,80.934412,64.746494,78.187702,71.467098
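
The two rankings above sort the whole table and take the first five rows. A small sketch of the same selection with `Series.nlargest` and `Series.nsmallest`, using the overall passing rates of the 13 schools that appear in the outputs of this notebook (the remaining two schools are not shown anywhere above, so they are omitted here):

```python
import pandas as pd

# Overall passing rates copied from the tables above (13 of the 15 schools).
overall = pd.Series(
    {
        "Bailey High School": 71.965434,
        "Cabrera High School": 91.711518,
        "Figueroa High School": 71.091896,
        "Ford High School": 71.631982,
        "Griffin High School": 91.553134,
        "Hernandez High School": 71.467098,
        "Holden High School": 91.686183,
        "Huang High School": 71.066164,
        "Johnson High School": 71.067003,
        "Pena High School": 91.943867,
        "Rodriguez High School": 70.905226,
        "Wilson High School": 92.093736,
        "Wright High School": 91.861111,
    },
    name="% Overall Passing Rate",
)

top5 = overall.nlargest(5)      # highest rates, in descending order
bottom5 = overall.nsmallest(5)  # lowest rates, in ascending order

print(top5)
print(bottom5)
```

On this subset the selection reproduces the order of the two tables above.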


## Math Scores by Grade

In [23]:
district_df.columns

Index(['School ID', 'school_name', 'type', 'size', 'budget', 'Student ID',
       'name', 'gender', 'grade', 'reading_score', 'math_score'],
      dtype='object')

In [24]:
grade_9_gp = district_df[district_df['grade']=='9th'].groupby('school_name')
grade_10_gp = district_df[district_df['grade']=='10th'].groupby('school_name')
grade_11_gp = district_df[district_df['grade']=='11th'].groupby('school_name')
grade_12_gp = district_df[district_df['grade']=='12th'].groupby('school_name')

In [25]:
grade_9_math_score = grade_9_gp['math_score'].mean()
grade_10_math_score = grade_10_gp['math_score'].mean()
grade_11_math_score = grade_11_gp['math_score'].mean()
grade_12_math_score = grade_12_gp['math_score'].mean()
grade_9_reading_score = grade_9_gp['reading_score'].mean()
grade_10_reading_score = grade_10_gp['reading_score'].mean()
grade_11_reading_score = grade_11_gp['reading_score'].mean()
grade_12_reading_score = grade_12_gp['reading_score'].mean()

In [26]:
math_score_summary = pd.DataFrame({
                '9th':grade_9_math_score,
                '10th':grade_10_math_score,
                '11th':grade_11_math_score,
                '12th':grade_12_math_score
            },columns=['9th','10th','11th','12th'])

In [27]:
#minor cleanup to match the output provided
math_score_summary.reset_index(inplace=True)
math_score_summary.rename(columns={'school_name':''}, inplace=True)
math_score_summary.set_index('', inplace=True)

In [28]:
math_score_summary

Unnamed: 0,9th,10th,11th,12th
Bailey High School,77.083676,76.996772,77.515588,76.492218
Cabrera High School,83.094697,83.154506,82.76556,83.277487
Figueroa High School,76.403037,76.539974,76.884344,77.151369
Ford High School,77.361345,77.672316,76.918058,76.179963
Griffin High School,82.04401,84.229064,83.842105,83.356164
Hernandez High School,77.438495,77.337408,77.136029,77.186567
Holden High School,83.787402,83.429825,85.0,82.855422
Huang High School,77.027251,75.908735,76.446602,77.225641
Johnson High School,77.187857,76.691117,77.491653,76.863248


Unnamed: 0,9th,10th,11th,12th
Bailey High School,77.083676,76.996772,77.515588,76.492218
Cabrera High School,83.094697,83.154506,82.76556,83.277487
Figueroa High School,76.403037,76.539974,76.884344,77.151369
Ford High School,77.361345,77.672316,76.918058,76.179963
Griffin High School,82.04401,84.229064,83.842105,83.356164
Hernandez High School,77.438495,77.337408,77.136029,77.186567
Holden High School,83.787402,83.429825,85.0,82.855422
Huang High School,77.027251,75.908735,76.446602,77.225641
Johnson High School,77.187857,76.691117,77.491653,76.863248
Pena High School,83.625455,83.372,84.328125,84.121547
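
The four per-grade groupbys in this section can also be expressed as a single pivot. The sketch below assumes a frame shaped like `district_df`; the rows in `toy` are made up for illustration and do not come from the dataset.

```python
import pandas as pd

# Illustrative rows; school names and scores are hypothetical.
toy = pd.DataFrame({
    "school_name": ["A High"] * 4 + ["B High"] * 4,
    "grade": ["9th", "9th", "10th", "12th", "9th", "10th", "11th", "11th"],
    "math_score": [80, 70, 75, 61, 90, 85, 88, 92],
})

grade_order = ["9th", "10th", "11th", "12th"]

# One row per school, one column per grade, mean score in each cell.
# reindex puts the grade columns in school order instead of string order.
math_by_grade = toy.pivot_table(
    index="school_name",
    columns="grade",
    values="math_score",
    aggfunc="mean",
).reindex(columns=grade_order)

print(math_by_grade)
```

A school with no students in a grade gets `NaN` in that cell instead of being dropped, which makes gaps in the data visible.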


## Reading Scores by Grade

In [29]:
reading_score_summary = pd.DataFrame({
                '9th':grade_9_reading_score,
                '10th':grade_10_reading_score,
                '11th':grade_11_reading_score,
                '12th':grade_12_reading_score
            },columns=['9th','10th','11th','12th'])

In [30]:
#minor cleanup to match the output provided
reading_score_summary.reset_index(inplace=True)
reading_score_summary.rename(columns={'school_name':''}, inplace=True)
reading_score_summary.set_index('', inplace=True)

In [31]:
reading_score_summary

Unnamed: 0,9th,10th,11th,12th
Bailey High School,81.303155,80.907183,80.945643,80.912451
Cabrera High School,83.676136,84.253219,83.788382,84.287958
Figueroa High School,81.198598,81.408912,80.640339,81.384863
Ford High School,80.632653,81.262712,80.403642,80.662338
Griffin High School,83.369193,83.706897,84.288089,84.013699
Hernandez High School,80.86686,80.660147,81.39614,80.857143
Holden High School,83.677165,83.324561,83.815534,84.698795
Huang High School,81.290284,81.512386,81.417476,80.305983
Johnson High School,81.260714,80.773431,80.616027,81.227564


Unnamed: 0,9th,10th,11th,12th
Bailey High School,81.303155,80.907183,80.945643,80.912451
Cabrera High School,83.676136,84.253219,83.788382,84.287958
Figueroa High School,81.198598,81.408912,80.640339,81.384863
Ford High School,80.632653,81.262712,80.403642,80.662338
Griffin High School,83.369193,83.706897,84.288089,84.013699
Hernandez High School,80.86686,80.660147,81.39614,80.857143
Holden High School,83.677165,83.324561,83.815534,84.698795
Huang High School,81.290284,81.512386,81.417476,80.305983
Johnson High School,81.260714,80.773431,80.616027,81.227564
Pena High School,83.807273,83.612,84.335938,84.59116


## Scores by School Spending

In [32]:
spendingBins = [0,585,615,645,675]
perStudentSpendingLabel = ['<$585','$585-615','$615-645','$645-675']

In [33]:
raw_school_summary_df = pd.DataFrame({'Average Reading Score':averageReadingScore,
              'Average Math Score':averageMathScore,
              'Total Students':totalStudents,
              'Total School Budget':totalSchoolBudget,
              'Per Student Budget':perStudentBudget,
              '% Passing Math':percPassingMath,
              '% Passing Reading':percPassingReading,
              '% Overall Passing Rate':percOverallPassingRate
             }, 
            columns=[
                'Total Students',
                'Total School Budget',
                'Per Student Budget',
                'Average Math Score',
                'Average Reading Score',
                '% Passing Math',
                '% Passing Reading',
                '% Overall Passing Rate'
            ])

In [34]:
raw_school_summary_df['Spending Ranges (Per Student)'] = pd.cut(raw_school_summary_df['Per Student Budget'], 
                            spendingBins, 
                            labels=perStudentSpendingLabel,
                            right = False)

In [35]:
col_list = ['Spending Ranges (Per Student)','Average Math Score',
            'Average Reading Score','% Passing Math',
            '% Passing Reading','% Overall Passing Rate']
raw_school_summary_df = raw_school_summary_df[col_list]

In [36]:
perStudentSpending_gb = raw_school_summary_df.groupby('Spending Ranges (Per Student)')

In [37]:
perStudentSpending_gb.mean()

Spending Ranges (Per Student),Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing Rate
<$585,83.455399,83.933814,90.350436,93.325838,91.838137
$585-615,83.599686,83.885211,90.788049,92.410786,91.599418
$615-645,79.079225,81.891436,73.021426,83.214343,78.117884
$645-675,76.99721,81.027843,63.972368,78.427809,71.200088


Spending Ranges (Per Student),Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing Rate
<$585,83.455399,83.933814,90.350436,93.325838,83.694607
$585-615,83.599686,83.885211,90.788049,92.410786,83.742449
$615-645,79.079225,81.891436,73.021426,83.214343,80.48533
$645-675,76.99721,81.027843,63.972368,78.427809,79.012526
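
Because the bins are cut with `right = False`, each interval includes its lower edge and excludes its upper edge. The sketch below shows what that means at the boundaries; the five budget values are chosen for illustration and are not per-school figures from the dataset.

```python
import pandas as pd

spending_bins = [0, 585, 615, 645, 675]
spending_labels = ["<$585", "$585-615", "$615-645", "$645-675"]

# Illustrative per-student budgets that sit on or near the bin edges.
per_student = pd.Series([578.0, 585.0, 644.0, 645.0, 675.0])

# right=False makes the intervals [0, 585), [585, 615), [615, 645), [645, 675).
binned = pd.cut(per_student, spending_bins, labels=spending_labels, right=False)

print(pd.DataFrame({"per_student": per_student, "bin": binned}))
```

A value of exactly 585 lands in `$585-615`, and a value of exactly 675 falls outside every bin and becomes `NaN`. None of the per-student budgets displayed above reaches 675, but a larger top edge would be needed if one did.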


## Scores by School Size

In [38]:
raw_school_summary_df = pd.DataFrame({'Average Reading Score':averageReadingScore,
              'Average Math Score':averageMathScore,
              'Total Students':totalStudents,
              'Total School Budget':totalSchoolBudget,
              'Per Student Budget':perStudentBudget,
              '% Passing Math':percPassingMath,
              '% Passing Reading':percPassingReading,
              '% Overall Passing Rate':percOverallPassingRate
             }, 
            columns=[
                'Total Students',
                'Total School Budget',
                'Per Student Budget',
                'Average Math Score',
                'Average Reading Score',
                '% Passing Math',
                '% Passing Reading',
                '% Overall Passing Rate'
            ])

In [39]:
sizeBins = [0,1000,2000,5000]
perSchoolSizeLabel = ['Small (<1000)','Medium (1000-2000)','Large (2000-5000)']

In [40]:
raw_school_summary_df['School Size'] = pd.cut(raw_school_summary_df['Total Students'], 
                            sizeBins, 
                            labels=perSchoolSizeLabel,
                            right = False)

In [41]:
col_list = ['School Size','Average Math Score',
            'Average Reading Score','% Passing Math',
            '% Passing Reading','% Overall Passing Rate']
raw_school_summary_df = raw_school_summary_df[col_list]
schoolSize_gb = raw_school_summary_df.groupby('School Size')

In [42]:
schoolSize_gb.mean()

School Size,Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing Rate
Small (<1000),83.821598,83.929843,91.158155,92.471895,91.815025
Medium (1000-2000),83.374684,83.864438,89.931303,93.244843,91.588073
Large (2000-5000),77.746417,81.344493,67.631335,80.1908,73.911067




## Scores by School Type

In [43]:
raw_school_summary_df = pd.DataFrame({'Average Reading Score':averageReadingScore,
              'Average Math Score':averageMathScore,
              'Total Students':totalStudents,
              'Total School Budget':totalSchoolBudget,
              'Per Student Budget':perStudentBudget,
              '% Passing Math':percPassingMath,
              '% Passing Reading':percPassingReading,
              '% Overall Passing Rate':percOverallPassingRate
             }, 
            columns=[
                'Total Students',
                'Total School Budget',
                'Per Student Budget',
                'Average Math Score',
                'Average Reading Score',
                '% Passing Math',
                '% Passing Reading',
                '% Overall Passing Rate'
            ])
raw_school_summary_df.reset_index(inplace=True)
raw_school_summary_df.rename(columns={'type':'School Type'}, inplace=True)

In [44]:
col_list = ['School Type','Average Math Score',
            'Average Reading Score','% Passing Math',
            '% Passing Reading','% Overall Passing Rate']
raw_school_summary_df = raw_school_summary_df[col_list]

In [45]:
raw_school_summary_df.set_index('School Type', inplace=True)

In [46]:
raw_school_summary_df.groupby('School Type').mean()

School Type,Average Math Score,Average Reading Score,% Passing Math,% Passing Reading,% Overall Passing Rate
Charter,83.473852,83.896421,90.363226,93.052812,91.708019
District,76.956733,80.966636,64.302528,78.324559,71.313543
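
School type and school size are entangled in this data, so the gap above may partly reflect size. One way to start separating the two is to group by both at once. The sketch below uses the type, enrollment, and % passing math of the 13 schools that appear in the outputs of this notebook (the other two schools are not shown anywhere above); it is a first look, not a full analysis.

```python
import pandas as pd

# Values copied from the tables above (13 of the 15 schools).
schools = pd.DataFrame(
    [
        ("Bailey High School", "District", 4976, 64.630225),
        ("Cabrera High School", "Charter", 1858, 89.558665),
        ("Figueroa High School", "District", 2949, 63.750424),
        ("Ford High School", "District", 2739, 65.753925),
        ("Griffin High School", "Charter", 1468, 89.713896),
        ("Hernandez High School", "District", 4635, 64.746494),
        ("Holden High School", "Charter", 427, 90.632319),
        ("Huang High School", "District", 2917, 63.318478),
        ("Johnson High School", "District", 4761, 63.852132),
        ("Pena High School", "Charter", 962, 91.683992),
        ("Rodriguez High School", "District", 3999, 64.066017),
        ("Wilson High School", "Charter", 2283, 90.932983),
        ("Wright High School", "Charter", 1800, 90.277778),
    ],
    columns=["school_name", "type", "size", "pct_pass_math"],
)

# Same size bins as the "Scores by School Size" section.
size_bins = [0, 1000, 2000, 5000]
size_labels = ["Small (<1000)", "Medium (1000-2000)", "Large (2000-5000)"]
schools["size_bin"] = pd.cut(schools["size"], size_bins, labels=size_labels, right=False)

# Count and mean passing rate for each (type, size) combination that occurs.
by_type_and_size = (
    schools.groupby(["type", "size_bin"], observed=True)["pct_pass_math"]
    .agg(["count", "mean"])
)
print(by_type_and_size)
```

In this subset every district school falls in the large bin, while the one large charter school (Wilson) still passes math at about 91%. A single school is thin evidence, but it suggests size alone may not explain the gap.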


