# PyCity Schools Analysis

James Dietz

* The earlier findings on school size and charter status showed that much of the variance in performance lies within schools as well as between them. This suggests that school effects matter but are only part of the story. Student-level effects, including socioeconomic status (SES) and other background factors, may play an important role at both the individual and school levels. Future models, including multilevel (hierarchical linear) models, should include background and SES measures to determine the extent to which the apparent effects of school size and charter status are spurious, reflecting socioeconomic factors that operate largely in families and neighborhoods.

* Students scored higher in reading (mean 82) than in mathematics (mean 79). The reading passing rate (85.8%) was approximately 11 percentage points higher than the math passing rate (75.0%), suggesting that additional resources for mathematics instruction may be warranted.
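
As a rough illustration of the within-school versus between-school point in the first bullet, the sketch below splits the variance of a score into those two components. It runs on a small synthetic frame, not the district data. The column names `school_name` and `math_score` mirror the ones used later in this notebook, and the helper name `variance_shares` is hypothetical.

```python
import pandas as pd

def variance_shares(df, group_col, score_col):
    """Return (between_share, within_share) of the total sum of squares."""
    grand_mean = df[score_col].mean()
    # Each student's school mean, aligned row by row with the student records.
    school_mean = df.groupby(group_col)[score_col].transform("mean")
    between = ((school_mean - grand_mean) ** 2).sum()
    within = ((df[score_col] - school_mean) ** 2).sum()
    total = between + within
    return between / total, within / total

# Synthetic example: two schools with three students each.
toy = pd.DataFrame({
    "school_name": ["A", "A", "A", "B", "B", "B"],
    "math_score": [60, 75, 90, 70, 85, 100],
})
between_share, within_share = variance_shares(toy, "school_name", "math_score")
print(round(between_share, 3), round(within_share, 3))
```

A large within-school share in a calculation like this is what would motivate adding student-level covariates before attributing differences to school characteristics.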



### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [1]:
#James S Dietz
#Data Analytics and Visualization 3



# Dependencies 
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# File 
school_data_to_load = "Resources/schools_complete.csv"
student_data_to_load = "Resources/students_complete.csv"

# Read School and Student Data File and store into Pandas Data Frames
school_data = pd.read_csv(school_data_to_load)
student_data = pd.read_csv(student_data_to_load)

# Combine the data into a single dataset
complete = pd.merge(student_data, school_data, how="left", on="school_name")
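
The merge above assumes each school name appears exactly once in the school table. A minimal sketch of how that assumption could be checked is shown here on toy frames (the student names and the two-school table are made up for illustration); `validate` and `indicator` are standard `pd.merge` parameters.

```python
import pandas as pd

# Toy stand-ins for the student and school tables (not the real CSV files).
students = pd.DataFrame({
    "student_name": ["Ann", "Bo", "Cy"],
    "school_name": ["Huang High School", "Huang High School", "Pena High School"],
})
schools = pd.DataFrame({
    "school_name": ["Huang High School", "Pena High School"],
    "type": ["District", "Charter"],
})

# validate="many_to_one" raises MergeError if a school name repeats in `schools`;
# indicator=True adds a `_merge` column showing where each row was matched.
merged = pd.merge(students, schools, how="left", on="school_name",
                  validate="many_to_one", indicator=True)

# Count student rows that found no matching school.
unmatched = (merged["_merge"] != "both").sum()
print(len(merged), unmatched)
```

If the row count after the merge equals the number of students and no rows are unmatched, the left join neither duplicated nor orphaned any student records.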

## District Summary

* Calculate the total number of schools

* Calculate the total number of students

* Calculate the total budget

* Calculate the average math score 

* Calculate the average reading score

* Calculate the overall passing rate (overall average score), i.e. (avg. math score + avg. reading score)/2

* Calculate the percentage of students with a passing math score (70 or greater)

* Calculate the percentage of students with a passing reading score (70 or greater)

* Create a dataframe to hold the above results

* Optional: give the displayed data cleaner formatting

In [2]:
school_data.head(15)

Unnamed: 0,School ID,school_name,type,size,budget
0,0,Huang High School,District,2917,1910635
1,1,Figueroa High School,District,2949,1884411
2,2,Shelton High School,Charter,1761,1056600
3,3,Hernandez High School,District,4635,3022020
4,4,Griffin High School,Charter,1468,917500
5,5,Wilson High School,Charter,2283,1319574
6,6,Cabrera High School,Charter,1858,1081356
7,7,Bailey High School,District,4976,3124928
8,8,Holden High School,Charter,427,248087
9,9,Pena High School,Charter,962,585858


In [3]:
student_data.head()

Unnamed: 0,Student ID,student_name,gender,grade,school_name,reading_score,math_score
0,0,Paul Bradley,M,9th,Huang High School,66,79
1,1,Victor Smith,M,12th,Huang High School,94,61
2,2,Kevin Rodriguez,M,12th,Huang High School,90,60
3,3,Dr. Richard Scott,M,12th,Huang High School,67,58
4,4,Bonnie Ray,F,9th,Huang High School,97,84


In [4]:
complete.head()

Unnamed: 0,Student ID,student_name,gender,grade,school_name,reading_score,math_score,School ID,type,size,budget
0,0,Paul Bradley,M,9th,Huang High School,66,79,0,District,2917,1910635
1,1,Victor Smith,M,12th,Huang High School,94,61,0,District,2917,1910635
2,2,Kevin Rodriguez,M,12th,Huang High School,90,60,0,District,2917,1910635
3,3,Dr. Richard Scott,M,12th,Huang High School,67,58,0,District,2917,1910635
4,4,Bonnie Ray,F,9th,Huang High School,97,84,0,District,2917,1910635


In [5]:
complete.describe()

Unnamed: 0,Student ID,reading_score,math_score,School ID,size,budget
count,39170.0,39170.0,39170.0,39170.0,39170.0,39170.0
mean,19584.5,81.87784,78.985371,6.978172,3332.95711,2117241.0
std,11307.549359,10.23958,12.309968,4.444329,1323.914069,874998.7
min,0.0,63.0,55.0,0.0,427.0,248087.0
25%,9792.25,73.0,69.0,3.0,1858.0,1081356.0
50%,19584.5,82.0,79.0,7.0,2949.0,1910635.0
75%,29376.75,91.0,89.0,11.0,4635.0,3022020.0
max,39169.0,99.0,99.0,14.0,4976.0,3124928.0


In [6]:
#total number of schools.
totschools = school_data['school_name'].nunique()
totschools

15

In [7]:
#total number of students.
totstudents = complete['Student ID'].count()
totstudents

39170

In [110]:
#total budget.
totbud = school_data['budget'].sum()
totbud


24649428

In [9]:
#mean math score.
meanmath = student_data['math_score'].mean()
meanmath

78.98537145774827

In [10]:
#mean reading score.
meanread = student_data['reading_score'].mean()
meanread

81.87784018381414

In [11]:
#overall mean score.
meanoverall = (meanmath + meanread)/2
meanoverall

80.43160582078121

In [12]:
#bins and group labels for score cuts.
bins = [0, 69, 100]
group_names = ["Fail", "Pass"]

In [13]:
#Assessing pass/fail for reading.
complete["read_pass"] = pd.cut(complete["reading_score"], bins, labels=group_names)
complete.head()



Unnamed: 0,Student ID,student_name,gender,grade,school_name,reading_score,math_score,School ID,type,size,budget,read_pass
0,0,Paul Bradley,M,9th,Huang High School,66,79,0,District,2917,1910635,Fail
1,1,Victor Smith,M,12th,Huang High School,94,61,0,District,2917,1910635,Pass
2,2,Kevin Rodriguez,M,12th,Huang High School,90,60,0,District,2917,1910635,Pass
3,3,Dr. Richard Scott,M,12th,Huang High School,67,58,0,District,2917,1910635,Fail
4,4,Bonnie Ray,F,9th,Huang High School,97,84,0,District,2917,1910635,Pass
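
To make the bin edges concrete, the sketch below applies the same `bins` and `group_names` to a handful of boundary scores. The scores are invented for illustration. By default `pd.cut` uses right-closed intervals, so the bins are (0, 69] and (69, 100]; a score of exactly 0 falls outside the first bin unless `include_lowest=True` is passed. The district data has a minimum score of 55, so this edge case does not arise there.

```python
import pandas as pd

bins = [0, 69, 100]
group_names = ["Fail", "Pass"]

# Boundary scores: 69 is the top of the "Fail" bin, 70 the first "Pass" score.
scores = pd.Series([0, 55, 69, 70, 100])

# Default behavior: intervals are open on the left, so 0 is left unlabeled (NaN).
labels = pd.cut(scores, bins, labels=group_names)

# include_lowest=True closes the first interval on the left so 0 counts as "Fail".
labels_incl = pd.cut(scores, bins, labels=group_names, include_lowest=True)

print(labels.tolist())
print(labels_incl.tolist())
```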


In [14]:
#Assessing pass/fail for mathematics.
complete["math_pass"] = pd.cut(complete["math_score"], bins, labels=group_names)
complete.head()

Unnamed: 0,Student ID,student_name,gender,grade,school_name,reading_score,math_score,School ID,type,size,budget,read_pass,math_pass
0,0,Paul Bradley,M,9th,Huang High School,66,79,0,District,2917,1910635,Fail,Pass
1,1,Victor Smith,M,12th,Huang High School,94,61,0,District,2917,1910635,Pass,Fail
2,2,Kevin Rodriguez,M,12th,Huang High School,90,60,0,District,2917,1910635,Pass,Fail
3,3,Dr. Richard Scott,M,12th,Huang High School,67,58,0,District,2917,1910635,Fail,Fail
4,4,Bonnie Ray,F,9th,Huang High School,97,84,0,District,2917,1910635,Pass,Pass


In [21]:
complete['read_pass'].value_counts()

Pass    33610
Fail     5560
Name: read_pass, dtype: int64

In [22]:
complete['math_pass'].value_counts()

Pass    29370
Fail     9800
Name: math_pass, dtype: int64

In [23]:
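#Look up counts by label; value_counts() sorts by frequency, so positions are not stable.
read_counts = complete['read_pass'].value_counts()
failread = read_counts['Fail']
passread = read_counts['Pass']
totalrtest = failread + passread
proppassread = passread / totalrtest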


In [24]:
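#Look up counts by label; value_counts() sorts by frequency, so positions are not stable.
math_counts = complete['math_pass'].value_counts()
failmath = math_counts['Fail']
passmath = math_counts['Pass']
totalmtest = failmath + passmath
proppassmath = passmath / totalmtest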

    

In [87]:
pctpassread = proppassread * 100
pctpassread

85.80546336482001

In [26]:
pctpassmath = proppassmath * 100
pctpassmath

74.9808526933878

In [27]:
overallpass = (proppassread + proppassmath)/2
overallpass

0.8039315802910391

In [28]:
pctoverall = overallpass * 100
pctoverall

80.3931580291039
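
The passing percentages can also be computed without the intermediate Pass/Fail labels, because the mean of a boolean Series is the share of `True` values. The sketch below is one possible alternative, not the method used in the cells above; it uses the five student rows shown in the `student_data.head()` output rather than the full data.

```python
import pandas as pd

# Scores of the first five students shown in the head() output above.
toy = pd.DataFrame({
    "reading_score": [66, 94, 90, 67, 97],
    "math_score": [79, 61, 60, 58, 84],
})

# A boolean Series averages to the share of True values.
pct_pass_read = (toy["reading_score"] >= 70).mean() * 100
pct_pass_math = (toy["math_score"] >= 70).mean() * 100

# Overall passing rate as defined in this notebook: the mean of the two rates.
pct_overall = (pct_pass_read + pct_pass_math) / 2
print(pct_pass_read, pct_pass_math, pct_overall)
```

Note that this overall figure averages two separate rates; it is not the share of students who passed both subjects.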

In [122]:
#Creating overview data frame for assessment of district level results.

sumdist_df = pd.DataFrame({"Total Number of Schools": [totschools],
                           "Total Number of Students": [totstudents],
                           "Total Budget": [totbud],
                           "Average Math Score": [meanmath],
                           "Average Reading Score": [meanread],
                           "Overall Average Score": [meanoverall],
                           "Percentage Passing Math": [pctpassmath],
                           "Percentage Passing Reading": [pctpassread],
                           "Overall Passing Rate": [pctoverall]})

sumdist_df.round(2)

Unnamed: 0,Total Number of Schools,Total Number of Students,Total Budget,Average Math Score,Average Reading Score,Overall Average Score,Percentage Passing Math,Percentage Passing Reading,Overall Passing Rate
0,15,39170,24649428,78.99,81.88,80.43,74.98,85.81,80.39
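
For the optional "cleaner formatting" step, one common approach is to map Python format strings over the columns of a display copy. The sketch below assumes a reduced one-row summary built from values printed earlier in this notebook; the variable names `summary` and `display_df` are hypothetical.

```python
import pandas as pd

# One-row summary using values printed in the cells above.
summary = pd.DataFrame({
    "Total Number of Students": [39170],
    "Total Budget": [24649428],
    "Average Math Score": [78.98537145774827],
    "Percentage Passing Math": [74.9808526933878],
})

# Build a display copy so the numeric originals stay available for later math.
display_df = summary.copy()
display_df["Total Number of Students"] = summary["Total Number of Students"].map("{:,}".format)
display_df["Total Budget"] = summary["Total Budget"].map("${:,.2f}".format)
display_df["Average Math Score"] = summary["Average Math Score"].map("{:.2f}".format)
display_df["Percentage Passing Math"] = summary["Percentage Passing Math"].map("{:.2f}%".format)
print(display_df.iloc[0].tolist())
```

Formatting turns the columns into strings, which is why the sketch formats a copy rather than the frame used for calculations.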


## School Summary

* Create an overview table that summarizes key metrics about each school, including:
  * School Name
  * School Type
  * Total Students
  * Total School Budget
  * Per Student Budget
  * Average Math Score
  * Average Reading Score
  * % Passing Math
  * % Passing Reading
  * Overall Passing Rate (Average of the above two)
  
* Create a dataframe to hold the above results

In [30]:
#doing some checks on data
school_data.isnull()

Unnamed: 0,School ID,school_name,type,size,budget
0,False,False,False,False,False
1,False,False,False,False,False
2,False,False,False,False,False
3,False,False,False,False,False
4,False,False,False,False,False
5,False,False,False,False,False
6,False,False,False,False,False
7,False,False,False,False,False
8,False,False,False,False,False
9,False,False,False,False,False


In [31]:
dfs = school_data.head()
dfs

Unnamed: 0,School ID,school_name,type,size,budget
0,0,Huang High School,District,2917,1910635
1,1,Figueroa High School,District,2949,1884411
2,2,Shelton High School,Charter,1761,1056600
3,3,Hernandez High School,District,4635,3022020
4,4,Griffin High School,Charter,1468,917500


In [32]:
#calculate per student school budget.
school_data["Budget_Per_Student"] = school_data["budget"] / school_data["size"]
school_data

Unnamed: 0,School ID,school_name,type,size,budget,Budget_Per_Student
0,0,Huang High School,District,2917,1910635,655.0
1,1,Figueroa High School,District,2949,1884411,639.0
2,2,Shelton High School,Charter,1761,1056600,600.0
3,3,Hernandez High School,District,4635,3022020,652.0
4,4,Griffin High School,Charter,1468,917500,625.0
5,5,Wilson High School,Charter,2283,1319574,578.0
6,6,Cabrera High School,Charter,1858,1081356,582.0
7,7,Bailey High School,District,4976,3124928,628.0
8,8,Holden High School,Charter,427,248087,581.0
9,9,Pena High School,Charter,962,585858,609.0


In [33]:
bud = school_data.sort_values(by='school_name', ascending=True)
bud

Unnamed: 0,School ID,school_name,type,size,budget,Budget_Per_Student
7,7,Bailey High School,District,4976,3124928,628.0
6,6,Cabrera High School,Charter,1858,1081356,582.0
1,1,Figueroa High School,District,2949,1884411,639.0
13,13,Ford High School,District,2739,1763916,644.0
4,4,Griffin High School,Charter,1468,917500,625.0
3,3,Hernandez High School,District,4635,3022020,652.0
8,8,Holden High School,Charter,427,248087,581.0
0,0,Huang High School,District,2917,1910635,655.0
12,12,Johnson High School,District,4761,3094650,650.0
9,9,Pena High School,Charter,962,585858,609.0


In [35]:
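#numeric_only=True limits the mean to numeric columns (names, gender, and grade are text).
school_groups = complete.groupby(['school_name', 'type']).mean(numeric_only=True)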
school_groups

school_name,type,Student ID,reading_score,math_score,School ID,size,budget
Bailey High School,District,20358.5,81.033963,77.048432,7.0,4976.0,3124928.0
Cabrera High School,Charter,16941.5,83.97578,83.061895,6.0,1858.0,1081356.0
Figueroa High School,District,4391.0,81.15802,76.711767,1.0,2949.0,1884411.0
Ford High School,District,36165.0,80.746258,77.102592,13.0,2739.0,1763916.0
Griffin High School,Charter,12995.5,83.816757,83.351499,4.0,1468.0,917500.0
Hernandez High School,District,9944.0,80.934412,77.289752,3.0,4635.0,3022020.0
Holden High School,Charter,23060.0,83.814988,83.803279,8.0,427.0,248087.0
Huang High School,District,1458.0,81.182722,76.629414,0.0,2917.0,1910635.0
Johnson High School,District,32415.0,80.966394,77.072464,12.0,4761.0,3094650.0
Pena High School,Charter,23754.5,84.044699,83.839917,9.0,962.0,585858.0


In [36]:
#groupby to find number of passing students in math.
school_pass_math = student_data[student_data.math_score > 69].groupby('school_name')['math_score'].count()
school_pass_math

school_name
Bailey High School       3318
Cabrera High School      1749
Figueroa High School     1946
Ford High School         1871
Griffin High School      1371
Hernandez High School    3094
Holden High School        395
Huang High School        1916
Johnson High School      3145
Pena High School          910
Rodriguez High School    2654
Shelton High School      1653
Thomas High School       1525
Wilson High School       2143
Wright High School       1680
Name: math_score, dtype: int64

In [37]:
#groupby to find number of failing students in math.
school_fail_math = student_data[student_data.math_score < 70].groupby('school_name')['math_score'].count()
school_fail_math

school_name
Bailey High School       1658
Cabrera High School       109
Figueroa High School     1003
Ford High School          868
Griffin High School        97
Hernandez High School    1541
Holden High School         32
Huang High School        1001
Johnson High School      1616
Pena High School           52
Rodriguez High School    1345
Shelton High School       108
Thomas High School        110
Wilson High School        140
Wright High School        120
Name: math_score, dtype: int64

In [38]:
#calculating school level passing rate for math.
school_pass_math_rate = (school_pass_math / (school_pass_math + school_fail_math)) * 100
school_pass_math_rate

school_name
Bailey High School       66.680064
Cabrera High School      94.133477
Figueroa High School     65.988471
Ford High School         68.309602
Griffin High School      93.392371
Hernandez High School    66.752967
Holden High School       92.505855
Huang High School        65.683922
Johnson High School      66.057551
Pena High School         94.594595
Rodriguez High School    66.366592
Shelton High School      93.867121
Thomas High School       93.272171
Wilson High School       93.867718
Wright High School       93.333333
Name: math_score, dtype: float64
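
The pass and fail counts above can be collapsed into a single step by averaging a boolean flag within each school. This is a sketch of an alternative on invented records, not the approach used in the cells above. One practical difference: a school with no failing students would be missing from the fail counts, so `pass / (pass + fail)` would yield `NaN` for it after index alignment, whereas the boolean mean returns 100.

```python
import pandas as pd

# Toy records, not the district data. School B has no failing students.
toy = pd.DataFrame({
    "school_name": ["A", "A", "A", "A", "B", "B"],
    "math_score": [70, 69, 88, 55, 91, 72],
})

# Flag passes first, then average the flags within each school.
passed = toy["math_score"] >= 70
school_pass_math_rate = passed.groupby(toy["school_name"]).mean() * 100
print(school_pass_math_rate.to_dict())
```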

In [39]:
#groupby to find number of passing students in reading.
school_pass_read = student_data[student_data.reading_score > 69].groupby('school_name')['reading_score'].count()
school_pass_read

school_name
Bailey High School       4077
Cabrera High School      1803
Figueroa High School     2381
Ford High School         2172
Griffin High School      1426
Hernandez High School    3748
Holden High School        411
Huang High School        2372
Johnson High School      3867
Pena High School          923
Rodriguez High School    3208
Shelton High School      1688
Thomas High School       1591
Wilson High School       2204
Wright High School       1739
Name: reading_score, dtype: int64

In [40]:
#groupby to find number of failing students in reading.
school_fail_read = student_data[student_data.reading_score < 70].groupby('school_name')['reading_score'].count()
school_fail_read

school_name
Bailey High School       899
Cabrera High School       55
Figueroa High School     568
Ford High School         567
Griffin High School       42
Hernandez High School    887
Holden High School        16
Huang High School        545
Johnson High School      894
Pena High School          39
Rodriguez High School    791
Shelton High School       73
Thomas High School        44
Wilson High School        79
Wright High School        61
Name: reading_score, dtype: int64

In [41]:
math = school_pass_math_rate.reset_index()
math

Unnamed: 0,school_name,math_score
0,Bailey High School,66.680064
1,Cabrera High School,94.133477
2,Figueroa High School,65.988471
3,Ford High School,68.309602
4,Griffin High School,93.392371
5,Hernandez High School,66.752967
6,Holden High School,92.505855
7,Huang High School,65.683922
8,Johnson High School,66.057551
9,Pena High School,94.594595


In [42]:
#calculating school level passing rate for reading.
school_pass_read_rate = (school_pass_read / (school_pass_read + school_fail_read)) * 100
school_pass_read_rate

school_name
Bailey High School       81.933280
Cabrera High School      97.039828
Figueroa High School     80.739234
Ford High School         79.299014
Griffin High School      97.138965
Hernandez High School    80.862999
Holden High School       96.252927
Huang High School        81.316421
Johnson High School      81.222432
Pena High School         95.945946
Rodriguez High School    80.220055
Shelton High School      95.854628
Thomas High School       97.308869
Wilson High School       96.539641
Wright High School       96.611111
Name: reading_score, dtype: float64

In [43]:
read = school_pass_read_rate.reset_index()
read

Unnamed: 0,school_name,reading_score
0,Bailey High School,81.93328
1,Cabrera High School,97.039828
2,Figueroa High School,80.739234
3,Ford High School,79.299014
4,Griffin High School,97.138965
5,Hernandez High School,80.862999
6,Holden High School,96.252927
7,Huang High School,81.316421
8,Johnson High School,81.222432
9,Pena High School,95.945946


In [44]:
#overall school level passing rates calc.
school_overall_pass_rate = (school_pass_math_rate + school_pass_read_rate) / 2
school_overall_pass_rate

school_name
Bailey High School       74.306672
Cabrera High School      95.586652
Figueroa High School     73.363852
Ford High School         73.804308
Griffin High School      95.265668
Hernandez High School    73.807983
Holden High School       94.379391
Huang High School        73.500171
Johnson High School      73.639992
Pena High School         95.270270
Rodriguez High School    73.293323
Shelton High School      94.860875
Thomas High School       95.290520
Wilson High School       95.203679
Wright High School       94.972222
dtype: float64

In [45]:
overall = school_overall_pass_rate.reset_index()
overall

Unnamed: 0,school_name,0
0,Bailey High School,74.306672
1,Cabrera High School,95.586652
2,Figueroa High School,73.363852
3,Ford High School,73.804308
4,Griffin High School,95.265668
5,Hernandez High School,73.807983
6,Holden High School,94.379391
7,Huang High School,73.500171
8,Johnson High School,73.639992
9,Pena High School,95.27027


In [46]:
school_pass_read_rate.describe()

count    15.000000
mean     89.219023
std       8.180664
min      79.299014
25%      81.042716
50%      95.854628
75%      96.575376
max      97.308869
Name: reading_score, dtype: float64

In [47]:
#created table by merging just with passing rates
readmathrates_table = pd.merge(read, math)
readmathrates_table

Unnamed: 0,school_name,reading_score,math_score
0,Bailey High School,81.93328,66.680064
1,Cabrera High School,97.039828,94.133477
2,Figueroa High School,80.739234,65.988471
3,Ford High School,79.299014,68.309602
4,Griffin High School,97.138965,93.392371
5,Hernandez High School,80.862999,66.752967
6,Holden High School,96.252927,92.505855
7,Huang High School,81.316421,65.683922
8,Johnson High School,81.222432,66.057551
9,Pena High School,95.945946,94.594595


In [48]:
#merged reading and math pass rates df with overall pass rates df.
rates_table = pd.merge(readmathrates_table, overall)
rates_table

Unnamed: 0,school_name,reading_score,math_score,0
0,Bailey High School,81.93328,66.680064,74.306672
1,Cabrera High School,97.039828,94.133477,95.586652
2,Figueroa High School,80.739234,65.988471,73.363852
3,Ford High School,79.299014,68.309602,73.804308
4,Griffin High School,97.138965,93.392371,95.265668
5,Hernandez High School,80.862999,66.752967,73.807983
6,Holden High School,96.252927,92.505855,94.379391
7,Huang High School,81.316421,65.683922,73.500171
8,Johnson High School,81.222432,66.057551,73.639992
9,Pena High School,95.945946,94.594595,95.27027
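
The column named `0` in the merged table comes from resetting the index of an unnamed Series. A minimal sketch of one way to avoid it is to build the table from a dict of Series, which names each column explicitly and aligns rows on the shared school index. The rates below are invented, and the column labels are illustrative choices rather than names used elsewhere in this notebook.

```python
import pandas as pd

# Toy pass-rate Series indexed by school, standing in for the real ones.
read_rate = pd.Series({"A": 90.0, "B": 80.0}, name="reading_score")
math_rate = pd.Series({"A": 70.0, "B": 60.0}, name="math_score")
overall_rate = (read_rate + math_rate) / 2   # the result carries no name

# Passing a dict names every column explicitly and aligns rows on the index.
rates = pd.DataFrame({
    "% Passing Reading": read_rate,
    "% Passing Math": math_rate,
    "Overall Passing Rate": overall_rate,
})
rates.index.name = "school_name"
print(rates)
```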


In [None]:
# I learned a lesson about commenting as I work: by this point I was tired and reluctant to delete the
# failed attempts to create the table for fear of breaking something, so I am leaving them as they are.

In [50]:
school_groups.drop(['Student ID', 'School ID'], axis=1)

school_name,type,reading_score,math_score,size,budget
Bailey High School,District,81.033963,77.048432,4976.0,3124928.0
Cabrera High School,Charter,83.97578,83.061895,1858.0,1081356.0
Figueroa High School,District,81.15802,76.711767,2949.0,1884411.0
Ford High School,District,80.746258,77.102592,2739.0,1763916.0
Griffin High School,Charter,83.816757,83.351499,1468.0,917500.0
Hernandez High School,District,80.934412,77.289752,4635.0,3022020.0
Holden High School,Charter,83.814988,83.803279,427.0,248087.0
Huang High School,District,81.182722,76.629414,2917.0,1910635.0
Johnson High School,District,80.966394,77.072464,4761.0,3094650.0
Pena High School,Charter,84.044699,83.839917,962.0,585858.0


In [51]:
bud.drop(['type', 'School ID', 'size', 'budget'], axis=1)

Unnamed: 0,school_name,Budget_Per_Student
7,Bailey High School,628.0
6,Cabrera High School,582.0
1,Figueroa High School,639.0
13,Ford High School,644.0
4,Griffin High School,625.0
3,Hernandez High School,652.0
8,Holden High School,581.0
0,Huang High School,655.0
12,Johnson High School,650.0
9,Pena High School,609.0


In [52]:
merge_table = pd.merge(school_groups, bud, on="school_name")
merge_table

Unnamed: 0,school_name,Student ID,reading_score,math_score,School ID_x,size_x,budget_x,School ID_y,type,size_y,budget_y,Budget_Per_Student
0,Bailey High School,20358.5,81.033963,77.048432,7.0,4976.0,3124928.0,7,District,4976,3124928,628.0
1,Cabrera High School,16941.5,83.97578,83.061895,6.0,1858.0,1081356.0,6,Charter,1858,1081356,582.0
2,Figueroa High School,4391.0,81.15802,76.711767,1.0,2949.0,1884411.0,1,District,2949,1884411,639.0
3,Ford High School,36165.0,80.746258,77.102592,13.0,2739.0,1763916.0,13,District,2739,1763916,644.0
4,Griffin High School,12995.5,83.816757,83.351499,4.0,1468.0,917500.0,4,Charter,1468,917500,625.0
5,Hernandez High School,9944.0,80.934412,77.289752,3.0,4635.0,3022020.0,3,District,4635,3022020,652.0
6,Holden High School,23060.0,83.814988,83.803279,8.0,427.0,248087.0,8,Charter,427,248087,581.0
7,Huang High School,1458.0,81.182722,76.629414,0.0,2917.0,1910635.0,0,District,2917,1910635,655.0
8,Johnson High School,32415.0,80.966394,77.072464,12.0,4761.0,3094650.0,12,District,4761,3094650,650.0
9,Pena High School,23754.5,84.044699,83.839917,9.0,962.0,585858.0,9,Charter,962,585858,609.0


In [53]:
merge_table.drop(['Student ID', 'School ID_x', 'School ID_y', 'size_x', 'budget_x'], axis=1)

Unnamed: 0,school_name,reading_score,math_score,type,size_y,budget_y,Budget_Per_Student
0,Bailey High School,81.033963,77.048432,District,4976,3124928,628.0
1,Cabrera High School,83.97578,83.061895,Charter,1858,1081356,582.0
2,Figueroa High School,81.15802,76.711767,District,2949,1884411,639.0
3,Ford High School,80.746258,77.102592,District,2739,1763916,644.0
4,Griffin High School,83.816757,83.351499,Charter,1468,917500,625.0
5,Hernandez High School,80.934412,77.289752,District,4635,3022020,652.0
6,Holden High School,83.814988,83.803279,Charter,427,248087,581.0
7,Huang High School,81.182722,76.629414,District,2917,1910635,655.0
8,Johnson High School,80.966394,77.072464,District,4761,3094650,650.0
9,Pena High School,84.044699,83.839917,Charter,962,585858,609.0


In [54]:
merge_table.rename(index=str, columns={"size_y": "School Size", "budget_y": "Budget"})

Unnamed: 0,school_name,Student ID,reading_score,math_score,School ID_x,size_x,budget_x,School ID_y,type,School Size,Budget,Budget_Per_Student
0,Bailey High School,20358.5,81.033963,77.048432,7.0,4976.0,3124928.0,7,District,4976,3124928,628.0
1,Cabrera High School,16941.5,83.97578,83.061895,6.0,1858.0,1081356.0,6,Charter,1858,1081356,582.0
2,Figueroa High School,4391.0,81.15802,76.711767,1.0,2949.0,1884411.0,1,District,2949,1884411,639.0
3,Ford High School,36165.0,80.746258,77.102592,13.0,2739.0,1763916.0,13,District,2739,1763916,644.0
4,Griffin High School,12995.5,83.816757,83.351499,4.0,1468.0,917500.0,4,Charter,1468,917500,625.0
5,Hernandez High School,9944.0,80.934412,77.289752,3.0,4635.0,3022020.0,3,District,4635,3022020,652.0
6,Holden High School,23060.0,83.814988,83.803279,8.0,427.0,248087.0,8,Charter,427,248087,581.0
7,Huang High School,1458.0,81.182722,76.629414,0.0,2917.0,1910635.0,0,District,2917,1910635,655.0
8,Johnson High School,32415.0,80.966394,77.072464,12.0,4761.0,3094650.0,12,District,4761,3094650,650.0
9,Pena High School,23754.5,84.044699,83.839917,9.0,962.0,585858.0,9,Charter,962,585858,609.0


In [56]:
school_table = pd.merge(merge_table, rates_table, on="school_name")
school_table

Unnamed: 0,school_name,Student ID,reading_score_x,math_score_x,School ID_x,size_x,budget_x,School ID_y,type,size_y,budget_y,Budget_Per_Student,reading_score_y,math_score_y,0
0,Bailey High School,20358.5,81.033963,77.048432,7.0,4976.0,3124928.0,7,District,4976,3124928,628.0,81.93328,66.680064,74.306672
1,Cabrera High School,16941.5,83.97578,83.061895,6.0,1858.0,1081356.0,6,Charter,1858,1081356,582.0,97.039828,94.133477,95.586652
2,Figueroa High School,4391.0,81.15802,76.711767,1.0,2949.0,1884411.0,1,District,2949,1884411,639.0,80.739234,65.988471,73.363852
3,Ford High School,36165.0,80.746258,77.102592,13.0,2739.0,1763916.0,13,District,2739,1763916,644.0,79.299014,68.309602,73.804308
4,Griffin High School,12995.5,83.816757,83.351499,4.0,1468.0,917500.0,4,Charter,1468,917500,625.0,97.138965,93.392371,95.265668
5,Hernandez High School,9944.0,80.934412,77.289752,3.0,4635.0,3022020.0,3,District,4635,3022020,652.0,80.862999,66.752967,73.807983
6,Holden High School,23060.0,83.814988,83.803279,8.0,427.0,248087.0,8,Charter,427,248087,581.0,96.252927,92.505855,94.379391
7,Huang High School,1458.0,81.182722,76.629414,0.0,2917.0,1910635.0,0,District,2917,1910635,655.0,81.316421,65.683922,73.500171
8,Johnson High School,32415.0,80.966394,77.072464,12.0,4761.0,3094650.0,12,District,4761,3094650,650.0,81.222432,66.057551,73.639992
9,Pena High School,23754.5,84.044699,83.839917,9.0,962.0,585858.0,9,Charter,962,585858,609.0,95.945946,94.594595,95.27027


In [58]:
school_table.drop(columns=['Student ID', 'School ID_x', 'School ID_y', 'budget_x'], inplace=True)

In [59]:
school_table

Unnamed: 0,school_name,reading_score_x,math_score_x,size_x,type,size_y,budget_y,Budget_Per_Student,reading_score_y,math_score_y,0
0,Bailey High School,81.033963,77.048432,4976.0,District,4976,3124928,628.0,81.93328,66.680064,74.306672
1,Cabrera High School,83.97578,83.061895,1858.0,Charter,1858,1081356,582.0,97.039828,94.133477,95.586652
2,Figueroa High School,81.15802,76.711767,2949.0,District,2949,1884411,639.0,80.739234,65.988471,73.363852
3,Ford High School,80.746258,77.102592,2739.0,District,2739,1763916,644.0,79.299014,68.309602,73.804308
4,Griffin High School,83.816757,83.351499,1468.0,Charter,1468,917500,625.0,97.138965,93.392371,95.265668
5,Hernandez High School,80.934412,77.289752,4635.0,District,4635,3022020,652.0,80.862999,66.752967,73.807983
6,Holden High School,83.814988,83.803279,427.0,Charter,427,248087,581.0,96.252927,92.505855,94.379391
7,Huang High School,81.182722,76.629414,2917.0,District,2917,1910635,655.0,81.316421,65.683922,73.500171
8,Johnson High School,80.966394,77.072464,4761.0,District,4761,3094650,650.0,81.222432,66.057551,73.639992
9,Pena High School,84.044699,83.839917,962.0,Charter,962,585858,609.0,95.945946,94.594595,95.27027


In [60]:
school_table.drop(columns=['size_x'], inplace=True)

In [61]:
school_table

Unnamed: 0,school_name,reading_score_x,math_score_x,type,size_y,budget_y,Budget_Per_Student,reading_score_y,math_score_y,0
0,Bailey High School,81.033963,77.048432,District,4976,3124928,628.0,81.93328,66.680064,74.306672
1,Cabrera High School,83.97578,83.061895,Charter,1858,1081356,582.0,97.039828,94.133477,95.586652
2,Figueroa High School,81.15802,76.711767,District,2949,1884411,639.0,80.739234,65.988471,73.363852
3,Ford High School,80.746258,77.102592,District,2739,1763916,644.0,79.299014,68.309602,73.804308
4,Griffin High School,83.816757,83.351499,Charter,1468,917500,625.0,97.138965,93.392371,95.265668
5,Hernandez High School,80.934412,77.289752,District,4635,3022020,652.0,80.862999,66.752967,73.807983
6,Holden High School,83.814988,83.803279,Charter,427,248087,581.0,96.252927,92.505855,94.379391
7,Huang High School,81.182722,76.629414,District,2917,1910635,655.0,81.316421,65.683922,73.500171
8,Johnson High School,80.966394,77.072464,District,4761,3094650,650.0,81.222432,66.057551,73.639992
9,Pena High School,84.044699,83.839917,Charter,962,585858,609.0,95.945946,94.594595,95.27027


In [62]:
school_table.columns = ['School Name', 'Read Score', 'Math Score', 'Type', 'Size', 'Budget', 'Budget/Student', 'Read Pass Rate', 'Math Pass Rate', 'Overall Pass Rate']
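
Assigning a full list to `.columns` works by position, so it silently mislabels data if the column order ever changes. A sketch of a name-based alternative using `DataFrame.rename` is shown below on a toy frame that borrows a few of the merged column names; the values are made up.

```python
import pandas as pd

# Toy frame with a few of the merged column names seen above.
toy = pd.DataFrame({
    "school_name": ["A"],
    "reading_score_x": [83.9],
    "reading_score_y": [97.0],
    0: [95.5],
})

# Map old names to new ones; columns not listed keep their names.
# errors="raise" makes a misspelled old name fail loudly.
renamed = toy.rename(columns={
    "school_name": "School Name",
    "reading_score_x": "Read Score",
    "reading_score_y": "Read Pass Rate",
    0: "Overall Pass Rate",
}, errors="raise")
print(list(renamed.columns))
```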

In [None]:
#Eventually I was able to drop the unneeded columns and rename the rest to present the finished table.

In [123]:
school_table.set_index('School Name').round(2)


School Name,Read Score,Math Score,Type,Size,Budget,Budget/Student,Read Pass Rate,Math Pass Rate,Overall Pass Rate,Budget Size,School Size Groups
Bailey High School,81.03,77.05,District,4976,3124928,628.0,81.93,66.68,74.31,$615-645,Large (2000-5000)
Cabrera High School,83.98,83.06,Charter,1858,1081356,582.0,97.04,94.13,95.59,<$585,Medium (1000-2000)
Figueroa High School,81.16,76.71,District,2949,1884411,639.0,80.74,65.99,73.36,$615-645,Large (2000-5000)
Ford High School,80.75,77.1,District,2739,1763916,644.0,79.3,68.31,73.8,$615-645,Large (2000-5000)
Griffin High School,83.82,83.35,Charter,1468,917500,625.0,97.14,93.39,95.27,$615-645,Medium (1000-2000)
Hernandez High School,80.93,77.29,District,4635,3022020,652.0,80.86,66.75,73.81,$645-675,Large (2000-5000)
Holden High School,83.81,83.8,Charter,427,248087,581.0,96.25,92.51,94.38,<$585,Small (<1000)
Huang High School,81.18,76.63,District,2917,1910635,655.0,81.32,65.68,73.5,$645-675,Large (2000-5000)
Johnson High School,80.97,77.07,District,4761,3094650,650.0,81.22,66.06,73.64,$645-675,Large (2000-5000)
Pena High School,84.04,83.84,Charter,962,585858,609.0,95.95,94.59,95.27,$585-615,Small (<1000)


## Top Performing Schools (By Passing Rate)

* Sort and display the top five schools in overall passing rate

In [64]:
#TOP SCHOOLS
best_table = school_table.sort_values("Overall Pass Rate", ascending=False)
best_table.head()

Unnamed: 0,School Name,Read Score,Math Score,Type,Size,Budget,Budget/Student,Read Pass Rate,Math Pass Rate,Overall Pass Rate
1,Cabrera High School,83.97578,83.061895,Charter,1858,1081356,582.0,97.039828,94.133477,95.586652
12,Thomas High School,83.84893,83.418349,Charter,1635,1043130,638.0,97.308869,93.272171,95.29052
9,Pena High School,84.044699,83.839917,Charter,962,585858,609.0,95.945946,94.594595,95.27027
4,Griffin High School,83.816757,83.351499,Charter,1468,917500,625.0,97.138965,93.392371,95.265668
13,Wilson High School,83.989488,83.274201,Charter,2283,1319574,578.0,96.539641,93.867718,95.203679


## Bottom Performing Schools (By Passing Rate)

* Sort and display the five worst-performing schools

In [65]:
#BOTTOM SCHOOLS: the tail of the descending-sorted DataFrame, so the lowest passing rate is listed last
best_table.tail()

Unnamed: 0,School Name,Read Score,Math Score,Type,Size,Budget,Budget/Student,Read Pass Rate,Math Pass Rate,Overall Pass Rate
3,Ford High School,80.746258,77.102592,District,2739,1763916,644.0,79.299014,68.309602,73.804308
8,Johnson High School,80.966394,77.072464,District,4761,3094650,650.0,81.222432,66.057551,73.639992
7,Huang High School,81.182722,76.629414,District,2917,1910635,655.0,81.316421,65.683922,73.500171
2,Figueroa High School,81.15802,76.711767,District,2949,1884411,639.0,80.739234,65.988471,73.363852
10,Rodriguez High School,80.744686,76.842711,District,3999,2547363,637.0,80.220055,66.366592,73.293323
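
As an alternative to sorting once and reading both ends of the result, `nlargest` and `nsmallest` select each ranking directly, and `nsmallest` lists the weakest school first. The following is a minimal sketch, assuming a hypothetical frame `rates` built only from the ten overall passing rates displayed in the two tables above; it is not the notebook's own method.

```python
import pandas as pd

# Hypothetical stand-in for school_table: the ten overall passing rates
# displayed in the top and bottom tables above.
rates = pd.DataFrame({
    "School Name": [
        "Cabrera High School", "Thomas High School", "Pena High School",
        "Griffin High School", "Wilson High School", "Ford High School",
        "Johnson High School", "Huang High School", "Figueroa High School",
        "Rodriguez High School",
    ],
    "Overall Pass Rate": [
        95.586652, 95.29052, 95.27027, 95.265668, 95.203679,
        73.804308, 73.639992, 73.500171, 73.363852, 73.293323,
    ],
})

# Highest rates first.
top_three = rates.nlargest(3, "Overall Pass Rate")

# Lowest rates first, i.e. the weakest school leads the table.
bottom_three = rates.nsmallest(3, "Overall Pass Rate")

print(top_three)
print(bottom_three)
```

With `nsmallest`, no second sort is needed to show the bottom schools from worst to best.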


## Math Scores by Grade

In [124]:
#overall by school
gradesmath = student_data.groupby('school_name')['math_score'].mean()
gradesmath.round(2)

school_name
Bailey High School      77.05
Cabrera High School     83.06
Figueroa High School    76.71
Ford High School         77.1
Griffin High School     83.35
Hernandez High School   77.29
Holden High School       83.8
Huang High School       76.63
Johnson High School     77.07
Pena High School        83.84
Rodriguez High School   76.84
Shelton High School     83.36
Thomas High School      83.42
Wilson High School      83.27
Wright High School      83.68
Name: math_score, dtype: float64

In [125]:
#9th grade math
ninthmath = student_data[student_data.grade == "9th"].groupby('school_name')['math_score'].mean()
ninthmath.round(2)

school_name
Bailey High School      77.08
Cabrera High School     83.09
Figueroa High School     76.4
Ford High School        77.36
Griffin High School     82.04
Hernandez High School   77.44
Holden High School      83.79
Huang High School       77.03
Johnson High School     77.19
Pena High School        83.63
Rodriguez High School   76.86
Shelton High School     83.42
Thomas High School      83.59
Wilson High School      83.09
Wright High School      83.26
Name: math_score, dtype: float64

In [126]:
#10th grade math
tenthmath = student_data[student_data.grade == "10th"].groupby('school_name')['math_score'].mean()
tenthmath.round(2)

school_name
Bailey High School       77.0
Cabrera High School     83.15
Figueroa High School    76.54
Ford High School        77.67
Griffin High School     84.23
Hernandez High School   77.34
Holden High School      83.43
Huang High School       75.91
Johnson High School     76.69
Pena High School        83.37
Rodriguez High School   76.61
Shelton High School     82.92
Thomas High School      83.09
Wilson High School      83.72
Wright High School      84.01
Name: math_score, dtype: float64

In [127]:
#11th grade Math
eleventhmath = student_data[student_data.grade == "11th"].groupby('school_name')['math_score'].mean()
eleventhmath.round(2)

school_name
Bailey High School      77.52
Cabrera High School     82.77
Figueroa High School    76.88
Ford High School        76.92
Griffin High School     83.84
Hernandez High School   77.14
Holden High School       85.0
Huang High School       76.45
Johnson High School     77.49
Pena High School        84.33
Rodriguez High School    76.4
Shelton High School     83.38
Thomas High School       83.5
Wilson High School       83.2
Wright High School      83.84
Name: math_score, dtype: float64

In [128]:
#12th grade math
twelfthmath = student_data[student_data.grade == "12th"].groupby('school_name')['math_score'].mean()
twelfthmath.round(2)

school_name
Bailey High School      76.49
Cabrera High School     83.28
Figueroa High School    77.15
Ford High School        76.18
Griffin High School     83.36
Hernandez High School   77.19
Holden High School      82.86
Huang High School       77.23
Johnson High School     76.86
Pena High School        84.12
Rodriguez High School   77.69
Shelton High School     83.78
Thomas High School       83.5
Wilson High School      83.04
Wright High School      83.64
Name: math_score, dtype: float64

## Reading Scores by Grade



In [129]:
#Reading, overall by school
gradesread = student_data.groupby('school_name')['reading_score'].mean()
gradesread.round(2)

school_name
Bailey High School      81.03
Cabrera High School     83.98
Figueroa High School    81.16
Ford High School        80.75
Griffin High School     83.82
Hernandez High School   80.93
Holden High School      83.81
Huang High School       81.18
Johnson High School     80.97
Pena High School        84.04
Rodriguez High School   80.74
Shelton High School     83.73
Thomas High School      83.85
Wilson High School      83.99
Wright High School      83.96
Name: reading_score, dtype: float64

In [130]:
#9th grade reading by school
ninthread = student_data[student_data.grade == "9th"].groupby('school_name')['reading_score'].mean()
ninthread.round(2)

school_name
Bailey High School       81.3
Cabrera High School     83.68
Figueroa High School     81.2
Ford High School        80.63
Griffin High School     83.37
Hernandez High School   80.87
Holden High School      83.68
Huang High School       81.29
Johnson High School     81.26
Pena High School        83.81
Rodriguez High School   80.99
Shelton High School     84.12
Thomas High School      83.73
Wilson High School      83.94
Wright High School      83.83
Name: reading_score, dtype: float64

In [131]:
#10th grade reading
tenthread = student_data[student_data.grade == "10th"].groupby('school_name')['reading_score'].mean()
tenthread.round(2)

school_name
Bailey High School      80.91
Cabrera High School     84.25
Figueroa High School    81.41
Ford High School        81.26
Griffin High School     83.71
Hernandez High School   80.66
Holden High School      83.32
Huang High School       81.51
Johnson High School     80.77
Pena High School        83.61
Rodriguez High School   80.63
Shelton High School     83.44
Thomas High School      84.25
Wilson High School      84.02
Wright High School      83.81
Name: reading_score, dtype: float64

In [132]:
#11th grade reading
eleventhread = student_data[student_data.grade == "11th"].groupby('school_name')['reading_score'].mean()
eleventhread.round(2)

school_name
Bailey High School      80.95
Cabrera High School     83.79
Figueroa High School    80.64
Ford High School         80.4
Griffin High School     84.29
Hernandez High School    81.4
Holden High School      83.82
Huang High School       81.42
Johnson High School     80.62
Pena High School        84.34
Rodriguez High School   80.86
Shelton High School     84.37
Thomas High School      83.59
Wilson High School      83.76
Wright High School      84.16
Name: reading_score, dtype: float64

In [133]:
#12th grade reading
twelfthread = student_data[student_data.grade == "12th"].groupby('school_name')['reading_score'].mean()
twelfthread.round(2)

school_name
Bailey High School      80.91
Cabrera High School     84.29
Figueroa High School    81.38
Ford High School        80.66
Griffin High School     84.01
Hernandez High School   80.86
Holden High School       84.7
Huang High School       80.31
Johnson High School     81.23
Pena High School        84.59
Rodriguez High School   80.38
Shelton High School     82.78
Thomas High School      83.83
Wilson High School      84.32
Wright High School      84.07
Name: reading_score, dtype: float64

In [None]:
# summary table of math and reading scores for grades 9-12 by school

In [134]:
sumgrade_df = pd.DataFrame({"9th Math Average": ninthmath,
                           "10th Math Average": tenthmath,
                           "11th Math Average": eleventhmath,
                           "12th Math Average": twelfthmath,
                            "9th Read Average": ninthread,
                           "10th Read Average": tenthread,
                           "11th Read Average": eleventhread,
                           "12th Read Average": twelfthread})

sumgrade_df.round(2)

school_name,9th Math Average,10th Math Average,11th Math Average,12th Math Average,9th Read Average,10th Read Average,11th Read Average,12th Read Average
Bailey High School,77.08,77.0,77.52,76.49,81.3,80.91,80.95,80.91
Cabrera High School,83.09,83.15,82.77,83.28,83.68,84.25,83.79,84.29
Figueroa High School,76.4,76.54,76.88,77.15,81.2,81.41,80.64,81.38
Ford High School,77.36,77.67,76.92,76.18,80.63,81.26,80.4,80.66
Griffin High School,82.04,84.23,83.84,83.36,83.37,83.71,84.29,84.01
Hernandez High School,77.44,77.34,77.14,77.19,80.87,80.66,81.4,80.86
Holden High School,83.79,83.43,85.0,82.86,83.68,83.32,83.82,84.7
Huang High School,77.03,75.91,76.45,77.23,81.29,81.51,81.42,80.31
Johnson High School,77.19,76.69,77.49,76.86,81.26,80.77,80.62,81.23
Pena High School,83.63,83.37,84.33,84.12,83.81,83.61,84.34,84.59
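
The eight per-grade filters above all follow the same pattern, so the school-by-grade averages could also be produced with a single `groupby` followed by `unstack`. The sketch below assumes a tiny, made-up `students` frame with the same column names as `student_data`; the scores in it are illustrative, not taken from the data set.

```python
import pandas as pd

# Hypothetical miniature of student_data (made-up scores, real column names).
students = pd.DataFrame({
    "school_name": ["A High", "A High", "A High", "B High", "B High", "B High"],
    "grade": ["9th", "9th", "10th", "9th", "10th", "10th"],
    "math_score": [70, 80, 90, 60, 85, 95],
})

grade_order = ["9th", "10th", "11th", "12th"]

# One mean per (school, grade) pair, then grades pivoted into columns.
math_by_grade = (
    students.groupby(["school_name", "grade"])["math_score"]
    .mean()
    .unstack("grade")
)

# unstack sorts the grade labels as text ("10th" before "9th"), so put the
# columns back into school order, keeping only the grades that are present.
math_by_grade = math_by_grade[[g for g in grade_order if g in math_by_grade.columns]]

print(math_by_grade.round(2))
```

The same call with `"reading_score"` in place of `"math_score"` would give the reading half of the summary table.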


## Scores by School Spending

* Create a table that breaks down school performances based on average Spending Ranges (Per Student). Use 4 reasonable bins to group school spending. Include in the table each of the following:
  * Average Math Score
  * Average Reading Score
  * % Passing Math
  * % Passing Reading
  * Overall Passing Rate (Average of the above two)
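
The overall passing rate used throughout this notebook is the simple mean of the two subject passing rates. A quick check against the Bailey High School row shown earlier; the helper name `overall_pass_rate` is introduced here only for illustration.

```python
def overall_pass_rate(pct_math: float, pct_read: float) -> float:
    """Average of the math and reading passing percentages."""
    return (pct_math + pct_read) / 2


# Bailey High School: math pass rate 66.680064, reading pass rate 81.93328.
bailey = overall_pass_rate(66.680064, 81.93328)
print(round(bailey, 6))
```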

In [137]:
# Sample bins. Feel free to create your own bins.
bins = [0, 585, 615, 645, 675]
group_names = ["<$585", "$585-615", "$615-645", "$645-675"]

In [142]:
#Binning up the budget!
school_table["Budget Size"] = pd.cut(school_table["Budget/Student"], bins, labels=group_names)
school_table.round(2)

Unnamed: 0,School Name,Read Score,Math Score,Type,Size,Budget,Budget/Student,Read Pass Rate,Math Pass Rate,Overall Pass Rate,Budget Size,School Size Groups
0,Bailey High School,81.03,77.05,District,4976,3124928,628.0,81.93,66.68,74.31,$615-645,Large (2000-5000)
1,Cabrera High School,83.98,83.06,Charter,1858,1081356,582.0,97.04,94.13,95.59,<$585,Medium (1000-2000)
2,Figueroa High School,81.16,76.71,District,2949,1884411,639.0,80.74,65.99,73.36,$615-645,Large (2000-5000)
3,Ford High School,80.75,77.1,District,2739,1763916,644.0,79.3,68.31,73.8,$615-645,Large (2000-5000)
4,Griffin High School,83.82,83.35,Charter,1468,917500,625.0,97.14,93.39,95.27,$615-645,Medium (1000-2000)
5,Hernandez High School,80.93,77.29,District,4635,3022020,652.0,80.86,66.75,73.81,$645-675,Large (2000-5000)
6,Holden High School,83.81,83.8,Charter,427,248087,581.0,96.25,92.51,94.38,<$585,Small (<1000)
7,Huang High School,81.18,76.63,District,2917,1910635,655.0,81.32,65.68,73.5,$645-675,Large (2000-5000)
8,Johnson High School,80.97,77.07,District,4761,3094650,650.0,81.22,66.06,73.64,$645-675,Large (2000-5000)
9,Pena High School,84.04,83.84,Charter,962,585858,609.0,95.95,94.59,95.27,$585-615,Small (<1000)
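
The cell above assigns each school to a spending range but stops short of the grouped table the instructions ask for. One way that table might be built is sketched below. It assumes a hypothetical frame `schools` holding only the ten rounded rows displayed above, and it averages school-level values without weighting by enrollment, so the figures it prints are illustrative rather than the full-district result.

```python
import pandas as pd

# Hypothetical frame built from the ten rows displayed above (rounded values).
schools = pd.DataFrame({
    "Budget/Student": [628.0, 582.0, 639.0, 644.0, 625.0, 652.0, 581.0, 655.0, 650.0, 609.0],
    "Math Score": [77.05, 83.06, 76.71, 77.10, 83.35, 77.29, 83.80, 76.63, 77.07, 83.84],
    "Read Score": [81.03, 83.98, 81.16, 80.75, 83.82, 80.93, 83.81, 81.18, 80.97, 84.04],
    "Math Pass Rate": [66.68, 94.13, 65.99, 68.31, 93.39, 66.75, 92.51, 65.68, 66.06, 94.59],
    "Read Pass Rate": [81.93, 97.04, 80.74, 79.30, 97.14, 80.86, 96.25, 81.32, 81.22, 95.95],
    "Overall Pass Rate": [74.31, 95.59, 73.36, 73.80, 95.27, 73.81, 94.38, 73.50, 73.64, 95.27],
})

bins = [0, 585, 615, 645, 675]
group_names = ["<$585", "$585-615", "$615-645", "$645-675"]
schools["Budget Size"] = pd.cut(schools["Budget/Student"], bins, labels=group_names)

score_cols = ["Math Score", "Read Score", "Math Pass Rate", "Read Pass Rate", "Overall Pass Rate"]

# Unweighted mean of the school-level figures within each spending range.
# observed=False keeps every bin in the result, even one with no schools.
spending_summary = (
    schools.groupby("Budget Size", observed=False)[score_cols].mean().round(2)
)

print(spending_summary)
```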


## Scores by School Size

* Perform the same operations as above, based on school size.

In [80]:
# Sample bins. Feel free to create your own bins.
size_bins = [0, 1000, 2000, 5000]
group_names = ["Small (<1000)", "Medium (1000-2000)", "Large (2000-5000)"]

In [81]:
school_table["School Size Groups"] = pd.cut(school_table["Size"], size_bins, labels=group_names)
school_table

Unnamed: 0,School Name,Read Score,Math Score,Type,Size,Budget,Budget/Student,Read Pass Rate,Math Pass Rate,Overall Pass Rate,Budget Size,School Size Groups
0,Bailey High School,81.033963,77.048432,District,4976,3124928,628.0,81.93328,66.680064,74.306672,$615-645,Large (2000-5000)
1,Cabrera High School,83.97578,83.061895,Charter,1858,1081356,582.0,97.039828,94.133477,95.586652,<$585,Medium (1000-2000)
2,Figueroa High School,81.15802,76.711767,District,2949,1884411,639.0,80.739234,65.988471,73.363852,$615-645,Large (2000-5000)
3,Ford High School,80.746258,77.102592,District,2739,1763916,644.0,79.299014,68.309602,73.804308,$615-645,Large (2000-5000)
4,Griffin High School,83.816757,83.351499,Charter,1468,917500,625.0,97.138965,93.392371,95.265668,$615-645,Medium (1000-2000)
5,Hernandez High School,80.934412,77.289752,District,4635,3022020,652.0,80.862999,66.752967,73.807983,$645-675,Large (2000-5000)
6,Holden High School,83.814988,83.803279,Charter,427,248087,581.0,96.252927,92.505855,94.379391,<$585,Small (<1000)
7,Huang High School,81.182722,76.629414,District,2917,1910635,655.0,81.316421,65.683922,73.500171,$645-675,Large (2000-5000)
8,Johnson High School,80.966394,77.072464,District,4761,3094650,650.0,81.222432,66.057551,73.639992,$645-675,Large (2000-5000)
9,Pena High School,84.044699,83.839917,Charter,962,585858,609.0,95.945946,94.594595,95.27027,$585-615,Small (<1000)
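
As with spending, the size groups can be summarized once every school carries a label. The sketch below uses named aggregation so that the number of schools in each group appears next to the averages. The frame `schools` and the output column names (`n_schools`, `avg_math`, `avg_read`, `overall_pass`) are assumptions made for illustration, and only the ten rows displayed above are included.

```python
import pandas as pd

# Hypothetical frame built from the ten rows displayed above (rounded values).
schools = pd.DataFrame({
    "Size": [4976, 1858, 2949, 2739, 1468, 4635, 427, 2917, 4761, 962],
    "Math Score": [77.05, 83.06, 76.71, 77.10, 83.35, 77.29, 83.80, 76.63, 77.07, 83.84],
    "Read Score": [81.03, 83.98, 81.16, 80.75, 83.82, 80.93, 83.81, 81.18, 80.97, 84.04],
    "Overall Pass Rate": [74.31, 95.59, 73.36, 73.80, 95.27, 73.81, 94.38, 73.50, 73.64, 95.27],
})

size_bins = [0, 1000, 2000, 5000]
size_names = ["Small (<1000)", "Medium (1000-2000)", "Large (2000-5000)"]
schools["School Size Groups"] = pd.cut(schools["Size"], size_bins, labels=size_names)

# Named aggregation: each keyword becomes an output column.
size_summary = schools.groupby("School Size Groups", observed=False).agg(
    n_schools=("Size", "count"),
    avg_math=("Math Score", "mean"),
    avg_read=("Read Score", "mean"),
    overall_pass=("Overall Pass Rate", "mean"),
).round(2)

print(size_summary)
```

Showing the group counts makes it easier to see how few schools sit behind each average.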


## Scores by School Type

* Perform the same operations as above, based on school type.

In [151]:
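# numeric_only=True keeps text and categorical columns (e.g. School Name) out of the mean
type_groups = school_table.groupby('Type').mean(numeric_only=True)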

school_table.round(2)

Unnamed: 0,School Name,Read Score,Math Score,Type,Size,Budget,Budget/Student,Read Pass Rate,Math Pass Rate,Overall Pass Rate,Budget Size,School Size Groups
0,Bailey High School,81.03,77.05,District,4976,3124928,628.0,81.93,66.68,74.31,$615-645,Large (2000-5000)
1,Cabrera High School,83.98,83.06,Charter,1858,1081356,582.0,97.04,94.13,95.59,<$585,Medium (1000-2000)
2,Figueroa High School,81.16,76.71,District,2949,1884411,639.0,80.74,65.99,73.36,$615-645,Large (2000-5000)
3,Ford High School,80.75,77.1,District,2739,1763916,644.0,79.3,68.31,73.8,$615-645,Large (2000-5000)
4,Griffin High School,83.82,83.35,Charter,1468,917500,625.0,97.14,93.39,95.27,$615-645,Medium (1000-2000)
5,Hernandez High School,80.93,77.29,District,4635,3022020,652.0,80.86,66.75,73.81,$645-675,Large (2000-5000)
6,Holden High School,83.81,83.8,Charter,427,248087,581.0,96.25,92.51,94.38,<$585,Small (<1000)
7,Huang High School,81.18,76.63,District,2917,1910635,655.0,81.32,65.68,73.5,$645-675,Large (2000-5000)
8,Johnson High School,80.97,77.07,District,4761,3094650,650.0,81.22,66.06,73.64,$645-675,Large (2000-5000)
9,Pena High School,84.04,83.84,Charter,962,585858,609.0,95.95,94.59,95.27,$585-615,Small (<1000)
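
The output above is the per-school table rather than the grouped result. A minimal sketch of what the by-type comparison might look like follows, assuming a hypothetical frame `schools` limited to the ten rounded rows displayed above and an unweighted mean across schools.

```python
import pandas as pd

# Hypothetical frame built from the ten rows displayed above (rounded values).
schools = pd.DataFrame({
    "School Name": [
        "Bailey High School", "Cabrera High School", "Figueroa High School",
        "Ford High School", "Griffin High School", "Hernandez High School",
        "Holden High School", "Huang High School", "Johnson High School",
        "Pena High School",
    ],
    "Type": ["District", "Charter", "District", "District", "Charter",
             "District", "Charter", "District", "District", "Charter"],
    "Math Pass Rate": [66.68, 94.13, 65.99, 68.31, 93.39, 66.75, 92.51, 65.68, 66.06, 94.59],
    "Read Pass Rate": [81.93, 97.04, 80.74, 79.30, 97.14, 80.86, 96.25, 81.32, 81.22, 95.95],
    "Overall Pass Rate": [74.31, 95.59, 73.36, 73.80, 95.27, 73.81, 94.38, 73.50, 73.64, 95.27],
})

# numeric_only=True excludes the text column "School Name" from the mean.
type_groups = schools.groupby("Type").mean(numeric_only=True).round(2)

print(type_groups)
```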
