# PyCity Schools Analysis

* As a whole, schools with higher budgets, did not yield better test results. By contrast, schools with higher spending per student actually (\\$645 - 675) underperformed compared to schools with smaller budgets (\\$585 per student).

* As a whole, smaller and medium sized schools dramatically out-performed large sized schools on passing math performances (89-91% passing vs 67%).

* As a whole, charter schools out-performed the public district schools across all metrics. However, more analysis will be required to glean if the effect is due to school practices or the fact that charter schools tend to serve smaller student populations per school. 
---

**Note:**
Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [168]:
# Dependencies and Setup
import pandas as pd
import numpy as np

# File to Load (Remember to Change These)
school_data_to_load = "schools_complete.csv"
student_data_to_load = "students_complete.csv"

# Read School and Student Data File and store into Pandas Data Frames
school_data = pd.read_csv(school_data_to_load)
student_data = pd.read_csv(student_data_to_load)

# Combine the data into a single dataset
school_data_complete = pd.merge(student_data, school_data, how="left", on=["school_name", "school_name"])
school_data_complete.head()

# school_data_complete.count()

Unnamed: 0,Student ID,student_name,gender,grade,school_name,reading_score,math_score,School ID,type,size,budget
0,0,Paul Bradley,M,9th,Huang High School,66,79,0,District,2917,1910635
1,1,Victor Smith,M,12th,Huang High School,94,61,0,District,2917,1910635
2,2,Kevin Rodriguez,M,12th,Huang High School,90,60,0,District,2917,1910635
3,3,Dr. Richard Scott,M,12th,Huang High School,67,58,0,District,2917,1910635
4,4,Bonnie Ray,F,9th,Huang High School,97,84,0,District,2917,1910635


## District Summary

* Calculate the total number of schools

* Calculate the total number of students

* Calculate the total budget

* Calculate the average math score 

* Calculate the average reading score

* Calculate the overall passing rate (overall average score), i.e. (avg. math score + avg. reading score)/2

* Calculate the percentage of students with a passing math score (70 or greater)

* Calculate the percentage of students with a passing reading score (70 or greater)

* Create a dataframe to hold the above results

* Optional: give the displayed data cleaner formatting

In [169]:
# Total number of schools
numschool = len(pd.unique(school_data_complete['school_name']))
numschool 


15

In [170]:
# Total number of students
kids = len(pd.unique(school_data_complete['Student ID']))
kids

39170

In [171]:
# Total budget
unique_school = pd.unique(school_data_complete['budget'])
budget = sum(unique_school)
budget



24649428

In [172]:
# Average math score
m=np.mean(school_data_complete.math_score)
m



78.98537145774827

In [173]:
# Average reading score
r=np.mean(school_data_complete.reading_score)
r



81.87784018381414

In [174]:
# Overall average score
overall = (m+r)/2
overall


80.43160582078121

In [175]:
# Percentage of passing math (70 or greater)
count = 0
for i in school_data_complete.math_score:
    if i > 70:
        count = count + 1
        
passingm = (count/kids)*100
passingm



72.39213683941792

In [176]:
# Percentage of passing reading (70 or greater)
count = 0
for i in school_data_complete.reading_score:
    if i > 70:
        count = count + 1
        
passingr = (count/kids)*100
passingr 


82.97166198621395

In [177]:
results = {'# of. schools': [numschool],
           '# of sstudents': [kids], 
           'budget': [budget], 
           'avg. math': [m],
           'avg. read': [r],
           'overall avg': [overall], 
           'pass math': [passingm], 
           'pass read': [passingr]}
resultsdf = pd.DataFrame(results)
resultsdf

Unnamed: 0,# of. schools,# of sstudents,budget,avg. math,avg. read,overall avg,pass math,pass read
0,15,39170,24649428,78.985371,81.87784,80.431606,72.392137,82.971662


## School Summary

* Create an overview table that summarizes key metrics about each school, including:
  * School Name
  * School Type
  * Total Students
  * Total School Budget
  * Per Student Budget
  * Average Math Score
  * Average Reading Score
  * % Passing Math
  * % Passing Reading
  * Overall Passing Rate (Average of the above two)
  
* Create a dataframe to hold the above results

### Top Performing Schools (By Passing Rate)

In [178]:
# Calculate total school budget
school_data_complete.loc[:,['school_name', 'budget']].drop_duplicates()

Unnamed: 0,school_name,budget
0,Huang High School,1910635
2917,Figueroa High School,1884411
5866,Shelton High School,1056600
7627,Hernandez High School,3022020
12262,Griffin High School,917500
13730,Wilson High School,1319574
16013,Cabrera High School,1081356
17871,Bailey High School,3124928
22847,Holden High School,248087
23274,Pena High School,585858


In [179]:
# Calculate per student budget
school_data_complete['per student budget'] = school_data_complete['budget'] / school_data_complete['size']
school_data_complete.loc[:,['school_name', 'per student budget']].drop_duplicates()

Unnamed: 0,school_name,per student budget
0,Huang High School,655.0
2917,Figueroa High School,639.0
5866,Shelton High School,600.0
7627,Hernandez High School,652.0
12262,Griffin High School,625.0
13730,Wilson High School,578.0
16013,Cabrera High School,582.0
17871,Bailey High School,628.0
22847,Holden High School,581.0
23274,Pena High School,609.0


In [180]:
# Cacluate the avg math and reading score
rmdata = school_data_complete.loc[:,['school_name', 'math_score', 'reading_score']]
scores = rmdata.groupby('school_name').mean()
scores

Unnamed: 0_level_0,math_score,reading_score
school_name,Unnamed: 1_level_1,Unnamed: 2_level_1
Bailey High School,77.048432,81.033963
Cabrera High School,83.061895,83.97578
Figueroa High School,76.711767,81.15802
Ford High School,77.102592,80.746258
Griffin High School,83.351499,83.816757
Hernandez High School,77.289752,80.934412
Holden High School,83.803279,83.814988
Huang High School,76.629414,81.182722
Johnson High School,77.072464,80.966394
Pena High School,83.839917,84.044699


#### Find the passing rate for math and reading (above 70 points)

In [181]:
# Find the total counts of math result
math = school_data_complete.loc[:,['school_name', 'math_score']]
counts = math.groupby('school_name').count()
counts



Unnamed: 0_level_0,math_score
school_name,Unnamed: 1_level_1
Bailey High School,4976
Cabrera High School,1858
Figueroa High School,2949
Ford High School,2739
Griffin High School,1468
Hernandez High School,4635
Holden High School,427
Huang High School,2917
Johnson High School,4761
Pena High School,962


In [182]:
# Find the counts for math result in each school that pass 70 or higher
countpassm = math[math['math_score'] >= 70]
countpassm = countpassm.groupby('school_name').count()
countpassm


Unnamed: 0_level_0,math_score
school_name,Unnamed: 1_level_1
Bailey High School,3318
Cabrera High School,1749
Figueroa High School,1946
Ford High School,1871
Griffin High School,1371
Hernandez High School,3094
Holden High School,395
Huang High School,1916
Johnson High School,3145
Pena High School,910


In [183]:
# Calculate the math passing rate
passingratem = countpassm/counts
passingratem

Unnamed: 0_level_0,math_score
school_name,Unnamed: 1_level_1
Bailey High School,0.666801
Cabrera High School,0.941335
Figueroa High School,0.659885
Ford High School,0.683096
Griffin High School,0.933924
Hernandez High School,0.66753
Holden High School,0.925059
Huang High School,0.656839
Johnson High School,0.660576
Pena High School,0.945946


In [184]:
# Find the total counts of read result
read = school_data_complete.loc[:,['school_name', 'reading_score']]
counts = read.groupby('school_name').count()
counts




Unnamed: 0_level_0,reading_score
school_name,Unnamed: 1_level_1
Bailey High School,4976
Cabrera High School,1858
Figueroa High School,2949
Ford High School,2739
Griffin High School,1468
Hernandez High School,4635
Holden High School,427
Huang High School,2917
Johnson High School,4761
Pena High School,962


In [185]:
# Find the counts for read result in each school that pass 70 or higher
countpassr = read[read['reading_score'] >= 70]
countpassr = countpassr.groupby('school_name').count()
countpassr


Unnamed: 0_level_0,reading_score
school_name,Unnamed: 1_level_1
Bailey High School,4077
Cabrera High School,1803
Figueroa High School,2381
Ford High School,2172
Griffin High School,1426
Hernandez High School,3748
Holden High School,411
Huang High School,2372
Johnson High School,3867
Pena High School,923


In [186]:
# Calculate the read passing rate
passingrater = countpassr/counts
passingrater

Unnamed: 0_level_0,reading_score
school_name,Unnamed: 1_level_1
Bailey High School,0.819333
Cabrera High School,0.970398
Figueroa High School,0.807392
Ford High School,0.79299
Griffin High School,0.97139
Hernandez High School,0.80863
Holden High School,0.962529
Huang High School,0.813164
Johnson High School,0.812224
Pena High School,0.959459


In [187]:
# Calculate the overall passing rate (average of the math and reading passing rate)
passingrate = pd.merge(passingratem, passingrater, how="left", on=[ "school_name"])
passingrate['avg_score'] = (passingrate['math_score'] + passingrate['reading_score'])/2
passingrate



Unnamed: 0_level_0,math_score,reading_score,avg_score
school_name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Bailey High School,0.666801,0.819333,0.743067
Cabrera High School,0.941335,0.970398,0.955867
Figueroa High School,0.659885,0.807392,0.733639
Ford High School,0.683096,0.79299,0.738043
Griffin High School,0.933924,0.97139,0.952657
Hernandez High School,0.66753,0.80863,0.73808
Holden High School,0.925059,0.962529,0.943794
Huang High School,0.656839,0.813164,0.735002
Johnson High School,0.660576,0.812224,0.7364
Pena High School,0.945946,0.959459,0.952703


In [188]:
#  Sort and display the top five schools in overall passing rate
passingrate.sort_values('avg_score', ascending=False).head(5)

Unnamed: 0_level_0,math_score,reading_score,avg_score
school_name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Cabrera High School,0.941335,0.970398,0.955867
Thomas High School,0.932722,0.973089,0.952905
Pena High School,0.945946,0.959459,0.952703
Griffin High School,0.933924,0.97139,0.952657
Wilson High School,0.938677,0.965396,0.952037


### Bottom Performing Schools (By Passing Rate)

* Sort and display the five worst-performing schools

In [189]:
#  Sort and display the worst five schools in overall passing rate
passingrate.sort_values('avg_score', ascending=False).tail(5)



Unnamed: 0_level_0,math_score,reading_score,avg_score
school_name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1
Ford High School,0.683096,0.79299,0.738043
Johnson High School,0.660576,0.812224,0.7364
Huang High School,0.656839,0.813164,0.735002
Figueroa High School,0.659885,0.807392,0.733639
Rodriguez High School,0.663666,0.802201,0.732933


## Math Scores by Grade

* Create a table that lists the average Reading Score for students of each grade level (9th, 10th, 11th, 12th) at each school.

  * Create a pandas series for each grade. Hint: use a conditional statement.
  
  * Group each series by school
  
  * Combine the series into a dataframe
  
  * Optional: give the displayed data cleaner formatting

In [190]:
# Calculate the average math score for 9th grade in each school
by_grade = school_data_complete.loc[:,['school_name', 'math_score', 'grade']]
mscores = by_grade.groupby([ 'grade'])
nine = mscores.get_group('9th')
nine = nine.groupby('school_name').mean()
nine = nine.rename(columns={'math_score': 'math_score_9th'})
nine


  nine = nine.groupby('school_name').mean()


Unnamed: 0_level_0,math_score_9th
school_name,Unnamed: 1_level_1
Bailey High School,77.083676
Cabrera High School,83.094697
Figueroa High School,76.403037
Ford High School,77.361345
Griffin High School,82.04401
Hernandez High School,77.438495
Holden High School,83.787402
Huang High School,77.027251
Johnson High School,77.187857
Pena High School,83.625455


In [191]:
# Calculate the average math score for 10th grade in each school
ten = mscores.get_group('10th')
ten = ten.groupby('school_name').mean()
ten = ten.rename(columns={'math_score': 'math_score_10th'})
ten



  ten = ten.groupby('school_name').mean()


Unnamed: 0_level_0,math_score_10th
school_name,Unnamed: 1_level_1
Bailey High School,76.996772
Cabrera High School,83.154506
Figueroa High School,76.539974
Ford High School,77.672316
Griffin High School,84.229064
Hernandez High School,77.337408
Holden High School,83.429825
Huang High School,75.908735
Johnson High School,76.691117
Pena High School,83.372


In [192]:
# Calculate the average math score for 11th grade in each school
eleven = mscores.get_group('11th')
eleven = eleven.groupby('school_name').mean()
eleven = eleven.rename(columns={'math_score': 'math_score_11th'})

eleven



  eleven = eleven.groupby('school_name').mean()


Unnamed: 0_level_0,math_score_11th
school_name,Unnamed: 1_level_1
Bailey High School,77.515588
Cabrera High School,82.76556
Figueroa High School,76.884344
Ford High School,76.918058
Griffin High School,83.842105
Hernandez High School,77.136029
Holden High School,85.0
Huang High School,76.446602
Johnson High School,77.491653
Pena High School,84.328125


In [193]:
# Calculate the average math score for 12th grade in each school
twelve = mscores.get_group('12th')
twelve = twelve.groupby('school_name').mean()
twelve = twelve.rename(columns={'math_score': 'math_score_12th'})

twelve



  twelve = twelve.groupby('school_name').mean()


Unnamed: 0_level_0,math_score_12th
school_name,Unnamed: 1_level_1
Bailey High School,76.492218
Cabrera High School,83.277487
Figueroa High School,77.151369
Ford High School,76.179963
Griffin High School,83.356164
Hernandez High School,77.186567
Holden High School,82.855422
Huang High School,77.225641
Johnson High School,76.863248
Pena High School,84.121547


In [194]:
# Create table that lists the average math score for each school of each grade level.
gradesm = pd.merge(pd.merge(pd.merge(nine, ten, on= "school_name"), eleven, on= "school_name"), twelve, on= "school_name")
gradesm


Unnamed: 0_level_0,math_score_9th,math_score_10th,math_score_11th,math_score_12th
school_name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
Bailey High School,77.083676,76.996772,77.515588,76.492218
Cabrera High School,83.094697,83.154506,82.76556,83.277487
Figueroa High School,76.403037,76.539974,76.884344,77.151369
Ford High School,77.361345,77.672316,76.918058,76.179963
Griffin High School,82.04401,84.229064,83.842105,83.356164
Hernandez High School,77.438495,77.337408,77.136029,77.186567
Holden High School,83.787402,83.429825,85.0,82.855422
Huang High School,77.027251,75.908735,76.446602,77.225641
Johnson High School,77.187857,76.691117,77.491653,76.863248
Pena High School,83.625455,83.372,84.328125,84.121547


### Reading Score by Grade 

* Perform the same operations as above for reading scores

In [195]:
# Calculate the average reading score for 9th grade in each school
by_grade = school_data_complete.loc[:,['school_name', 'reading_score', 'grade']]
rscores = by_grade.groupby([ 'grade'])
nine = rscores.get_group('9th')
nine = nine.groupby('school_name').mean()
nine = nine.rename(columns={'reading_score': 'reading_score_9th'})
nine



  nine = nine.groupby('school_name').mean()


Unnamed: 0_level_0,reading_score_9th
school_name,Unnamed: 1_level_1
Bailey High School,81.303155
Cabrera High School,83.676136
Figueroa High School,81.198598
Ford High School,80.632653
Griffin High School,83.369193
Hernandez High School,80.86686
Holden High School,83.677165
Huang High School,81.290284
Johnson High School,81.260714
Pena High School,83.807273


In [196]:
# Calculate the average reading score for 10th grade in each school
ten = rscores.get_group('10th')
ten = ten.groupby('school_name').mean()
ten = ten.rename(columns={'reading_score': 'reading_score_10th'})
ten



  ten = ten.groupby('school_name').mean()


Unnamed: 0_level_0,reading_score_10th
school_name,Unnamed: 1_level_1
Bailey High School,80.907183
Cabrera High School,84.253219
Figueroa High School,81.408912
Ford High School,81.262712
Griffin High School,83.706897
Hernandez High School,80.660147
Holden High School,83.324561
Huang High School,81.512386
Johnson High School,80.773431
Pena High School,83.612


In [197]:
# Calculate the average reading score for 11th grade in each school
eleven = rscores.get_group('11th')
eleven = eleven.groupby('school_name').mean()
eleven = eleven.rename(columns={'reading_score': 'reading_score_11th'})
eleven



  eleven = eleven.groupby('school_name').mean()


Unnamed: 0_level_0,reading_score_11th
school_name,Unnamed: 1_level_1
Bailey High School,80.945643
Cabrera High School,83.788382
Figueroa High School,80.640339
Ford High School,80.403642
Griffin High School,84.288089
Hernandez High School,81.39614
Holden High School,83.815534
Huang High School,81.417476
Johnson High School,80.616027
Pena High School,84.335938


In [198]:
# Calculate the average reading score for 12th grade in each school
twelve = mscores.get_group('12th')
twelve = twelve.groupby('school_name').mean()
twelve = twelve.rename(columns={'reading_score': 'reading_score_12th'})
twelve



  twelve = twelve.groupby('school_name').mean()


Unnamed: 0_level_0,math_score
school_name,Unnamed: 1_level_1
Bailey High School,76.492218
Cabrera High School,83.277487
Figueroa High School,77.151369
Ford High School,76.179963
Griffin High School,83.356164
Hernandez High School,77.186567
Holden High School,82.855422
Huang High School,77.225641
Johnson High School,76.863248
Pena High School,84.121547


In [199]:
# Create table that lists the average reading score for each school of each grade level.
gradesm = pd.merge(pd.merge(pd.merge(nine, ten, on= "school_name"), eleven, on= "school_name"), twelve, on= "school_name")
gradesm

Unnamed: 0_level_0,reading_score_9th,reading_score_10th,reading_score_11th,math_score
school_name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
Bailey High School,81.303155,80.907183,80.945643,76.492218
Cabrera High School,83.676136,84.253219,83.788382,83.277487
Figueroa High School,81.198598,81.408912,80.640339,77.151369
Ford High School,80.632653,81.262712,80.403642,76.179963
Griffin High School,83.369193,83.706897,84.288089,83.356164
Hernandez High School,80.86686,80.660147,81.39614,77.186567
Holden High School,83.677165,83.324561,83.815534,82.855422
Huang High School,81.290284,81.512386,81.417476,77.225641
Johnson High School,81.260714,80.773431,80.616027,76.863248
Pena High School,83.807273,83.612,84.335938,84.121547


## Scores by School Spending

* Create a table that breaks down school performances based on average Spending Ranges (Per Student). Use 4 reasonable bins to group school spending. Include in the table each of the following:
  * Average Math Score
  * Average Reading Score
  * % Passing Math
  * % Passing Reading
  * Overall Passing Rate (Average of the above two)

In [200]:
# Sample bins. Feel free to create your own bins.
spending_bins = [0, 585, 615, 645, 675]
group_names = ["<$585", "$585-615", "$615-645", "$645-675"]

In [216]:
# Create a new column to show budget per student in each row
school_data_complete['budget_per_student'] = school_data_complete['budget'] / school_data_complete['size']
school_data_complete


Unnamed: 0,Student ID,student_name,gender,grade,school_name,reading_score,math_score,School ID,type,size,budget,per student budget,budget_per_student,spending
0,0,Paul Bradley,M,9th,Huang High School,66,79,0,District,2917,1910635,655.0,655.0,$645-675
1,1,Victor Smith,M,12th,Huang High School,94,61,0,District,2917,1910635,655.0,655.0,$645-675
2,2,Kevin Rodriguez,M,12th,Huang High School,90,60,0,District,2917,1910635,655.0,655.0,$645-675
3,3,Dr. Richard Scott,M,12th,Huang High School,67,58,0,District,2917,1910635,655.0,655.0,$645-675
4,4,Bonnie Ray,F,9th,Huang High School,97,84,0,District,2917,1910635,655.0,655.0,$645-675
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
39165,39165,Donna Howard,F,12th,Thomas High School,99,90,14,Charter,1635,1043130,638.0,638.0,$615-645
39166,39166,Dawn Bell,F,10th,Thomas High School,95,70,14,Charter,1635,1043130,638.0,638.0,$615-645
39167,39167,Rebecca Tanner,F,9th,Thomas High School,73,84,14,Charter,1635,1043130,638.0,638.0,$615-645
39168,39168,Desiree Kidd,F,10th,Thomas High School,99,90,14,Charter,1635,1043130,638.0,638.0,$615-645


In [217]:
# Create a new column to define the spending ranges per student
conditions = [
    (school_data_complete['budget_per_student'] <= 585),
    (school_data_complete['budget_per_student'] > 585) & (school_data_complete['budget_per_student'] <= 615),
    (school_data_complete['budget_per_student'] > 615) & (school_data_complete['budget_per_student'] <= 645),
    (school_data_complete['budget_per_student'] >645 )]
values = ["<$585", "$585-615", "$615-645", "$645-675"]
school_data_complete['spending'] = np.select(conditions, values)   
                     
school_data_complete



Unnamed: 0,Student ID,student_name,gender,grade,school_name,reading_score,math_score,School ID,type,size,budget,per student budget,budget_per_student,spending
0,0,Paul Bradley,M,9th,Huang High School,66,79,0,District,2917,1910635,655.0,655.0,$645-675
1,1,Victor Smith,M,12th,Huang High School,94,61,0,District,2917,1910635,655.0,655.0,$645-675
2,2,Kevin Rodriguez,M,12th,Huang High School,90,60,0,District,2917,1910635,655.0,655.0,$645-675
3,3,Dr. Richard Scott,M,12th,Huang High School,67,58,0,District,2917,1910635,655.0,655.0,$645-675
4,4,Bonnie Ray,F,9th,Huang High School,97,84,0,District,2917,1910635,655.0,655.0,$645-675
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
39165,39165,Donna Howard,F,12th,Thomas High School,99,90,14,Charter,1635,1043130,638.0,638.0,$615-645
39166,39166,Dawn Bell,F,10th,Thomas High School,95,70,14,Charter,1635,1043130,638.0,638.0,$615-645
39167,39167,Rebecca Tanner,F,9th,Thomas High School,73,84,14,Charter,1635,1043130,638.0,638.0,$615-645
39168,39168,Desiree Kidd,F,10th,Thomas High School,99,90,14,Charter,1635,1043130,638.0,638.0,$615-645


In [222]:
# Calculate the average math score within each spending range
spendingScore = school_data_complete.loc[:,['spending', 'math_score']]
sscores = spendingScore.groupby([ 'spending']).mean()
sscores

Unnamed: 0_level_0,math_score
spending,Unnamed: 1_level_1
$585-615,83.529196
$615-645,78.061635
$645-675,77.049297
<$585,83.363065


In [232]:
# Calculate the percentage passing rate for math in each spending range
countpassms = spendingScore[spendingScore['math_score'] >= 70]
countpassms = countpassms.groupby('spending').count()
countspending = spendingScore.groupby('spending').count()
spendingPassRatem = countpassms / countspending
spendingPassRatem = spendingPassRatem.rename(columns={'math_score': 'math_passing_rate'})
spendingPassRatem

Unnamed: 0_level_0,math_passing_rate
spending,Unnamed: 1_level_1
$585-615,0.941241
$615-645,0.714004
$645-675,0.662308
<$585,0.937029


In [245]:
# Calculate the percentage passing rate for reading in each spending range
spendingScorer = school_data_complete.loc[:,['spending', 'reading_score']]
countpassrs = spendingScorer[spendingScorer['reading_score'] >= 70]
countpassrs = countpassrs.groupby('spending').count()
countspending = spendingScorer.groupby('spending').count()
spendingPassRater = countpassrs / countspending
spendingPassRater = spendingPassRater.rename(columns={'reading_score': 'read_passing_rate'})
spendingPassRater


Unnamed: 0_level_0,read_passing_rate
spending,Unnamed: 1_level_1
$585-615,0.958869
$615-645,0.836148
$645-675,0.811094
<$585,0.966866


In [262]:
# Calculate the percentage overall passing rate in each spending range
spendpassing = school_data_complete['spending']
spendpassing



0        $645-675
1        $645-675
2        $645-675
3        $645-675
4        $645-675
           ...   
39165    $615-645
39166    $615-645
39167    $615-645
39168    $615-645
39169    $615-645
Name: spending, Length: 39170, dtype: object

### Scores by School Size

* Perform the same operations as above, based on school size.

In [None]:
# Sample bins. Feel free to create your own bins.
size_bins = [0, 1000, 2000, 5000]
group_names = ["Small (<1000)", "Medium (1000-2000)", "Large (2000-5000)"]

In [None]:
# Create a new column for the bin groups


Look for the total count of test scores that pass 70% or higher




In [None]:
# math_pass_size




In [None]:
# read_pass_size




In [None]:
# Calculate the overall passing rate for different school size




### Scores by School Type

* Perform the same operations as above, based on school type.

In [None]:
# Create bins and groups, school type {'Charter', 'District'}




Find counts of the passing 70 or higher score for the both test


In [None]:
# math pass size




In [None]:
# reading pass size



In [None]:
# Calculate the overall passing rate

