### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [722]:
# Dependencies and Setup
import pandas as pd

# File to Load (Remember to Change These)
school_data_to_load = "Resources/schools_complete.csv"
student_data_to_load = "Resources/students_complete.csv"

# Read School and Student Data File and store into Pandas DataFrames
school_data_df = pd.read_csv(school_data_to_load)
student_data_df = pd.read_csv(student_data_to_load)

# Combine the data into a single dataset, keeping every student row.
school_data_complete_df = pd.merge(student_data_df, school_data_df, how="left", on="school_name")


In [723]:
total_number_of_schools = len(school_data_complete_df["school_name"].unique())
total_number_of_schools

15

In [724]:
total_number_of_students = school_data_df["size"].sum()
total_number_of_students

39170

In [725]:
total_budget = school_data_df["budget"].sum()
total_budget

24649428

In [726]:
average_maths_score = school_data_complete_df["maths_score"].mean()
average_maths_score

70.33819249425581

In [727]:
average_reading_score = school_data_complete_df["reading_score"].mean()
average_reading_score

69.98013786060761

In [728]:
# Scores are whole numbers, so "> 49" is the same test as "50 or greater".
Passing_maths = school_data_complete_df["maths_score"].loc[school_data_complete_df.maths_score > 49].count()
Passing_maths


33717

In [729]:
Percent_Passing_maths = (Passing_maths/total_number_of_students)*100
Percent_Passing_maths

86.07863160582077

In [730]:
# A reading score of 50 or greater counts as a pass.
Passing_reading = school_data_complete_df["reading_score"].loc[school_data_complete_df.reading_score > 49].count()
Passing_reading

33070

In [731]:
Percent_Passing_reading = (Passing_reading/total_number_of_students)*100
Percent_Passing_reading

84.42685728874139

In [732]:
total_passing_maths_reading = school_data_complete_df["maths_score"].loc[(school_data_complete_df["maths_score"] > 49) & (school_data_complete_df["reading_score"] > 49)]
total_passing_maths_reading.count()

28519

In [733]:
total_passing_maths_reading = (total_passing_maths_reading.count()/total_number_of_students)*100
total_passing_maths_reading


72.80827163645647

In [734]:
local_gov_area_summary_df = pd.DataFrame({
    "Total Schools": [total_number_of_schools],
    "Total Students": [total_number_of_students],
    "Total Budget": [total_budget],
    "Average Maths Score": [average_maths_score],
    "Average Reading Score": [average_reading_score],
    "% Passing Maths": [Percent_Passing_maths],
    "% Passing Reading": [Percent_Passing_reading],
    "% Overall Passing": [total_passing_maths_reading],
})
local_gov_area_summary_df.head()

Unnamed: 0,Total Schools,Total Students,Total Budget,Average Maths Score,Average Reading Score,% Passing Maths,% Passing Reading,% Overall Passing
0,15,39170,24649428,70.338192,69.980138,86.078632,84.426857,72.808272


## Local Government Area Summary

* Calculate the total number of schools

* Calculate the total number of students

* Calculate the total budget

* Calculate the average maths score 

* Calculate the average reading score

* Calculate the percentage of students with a passing maths score (50 or greater)

* Calculate the percentage of students with a passing reading score (50 or greater)

* Calculate the percentage of students who passed maths **and** reading (% Overall Passing)

* Create a dataframe to hold the above results

* Optional: give the displayed data cleaner formatting
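
One possible way to handle the optional formatting step is sketched below. The `summary` frame is a hypothetical stand-in that reuses a few of the figures printed above, and the format strings are only one reasonable choice. Formatting a copy keeps the numeric values available for later calculations.

```python
import pandas as pd

# Hypothetical one-row summary; the figures are copied from the outputs above.
summary = pd.DataFrame({
    "Total Students": [39170],
    "Total Budget": [24649428],
    "Average Maths Score": [70.33819249425581],
    "% Passing Maths": [86.07863160582077],
})

# Format a copy: the formatted columns become strings and can no longer be summed or averaged.
formatted = summary.copy()
formatted["Total Students"] = formatted["Total Students"].map("{:,}".format)
formatted["Total Budget"] = formatted["Total Budget"].map("${:,.2f}".format)
formatted["Average Maths Score"] = formatted["Average Maths Score"].map("{:.2f}".format)
formatted["% Passing Maths"] = formatted["% Passing Maths"].map("{:.2f}%".format)
print(formatted)
```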

## School Summary

* Create an overview table that summarises key metrics about each school, including:
  * School Name
  * School Type
  * Total Students
  * Total School Budget
  * Per Student Budget
  * Average Maths Score
  * Average Reading Score
  * % Passing Maths
  * % Passing Reading
  * % Overall Passing (The percentage of students that passed maths **and** reading.)
  
* Create a dataframe to hold the above results
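
The school summary can also be built in a single grouped aggregation. The sketch below is not the approach taken in the cells that follow; it uses a tiny made-up `students` table with the same column names as the merged data, and the helper columns `passed_maths`, `passed_reading` and `passed_both` are introduced only for illustration.

```python
import pandas as pd

# Tiny made-up dataset with the same column names as the merged table.
students = pd.DataFrame({
    "school_name": ["A High", "A High", "A High", "B High", "B High"],
    "type": ["Government"] * 3 + ["Independent"] * 2,
    "size": [3, 3, 3, 2, 2],
    "budget": [1800, 1800, 1800, 1000, 1000],
    "maths_score": [94, 43, 76, 50, 49],
    "reading_score": [96, 90, 41, 60, 70],
})

PASS_MARK = 50  # a score of 50 or greater is a pass
students["passed_maths"] = students["maths_score"] >= PASS_MARK
students["passed_reading"] = students["reading_score"] >= PASS_MARK
students["passed_both"] = students["passed_maths"] & students["passed_reading"]

# The mean of a boolean column is the fraction of True values.
school_summary = students.groupby("school_name").agg(**{
    "School Type": ("type", "first"),
    "Total Students": ("size", "first"),
    "Total School Budget": ("budget", "first"),
    "Average Maths Score": ("maths_score", "mean"),
    "Average Reading Score": ("reading_score", "mean"),
    "% Passing Maths": ("passed_maths", "mean"),
    "% Passing Reading": ("passed_reading", "mean"),
    "% Overall Passing": ("passed_both", "mean"),
})

# Turn the fractions into percentages and derive the per-student budget.
pct_cols = ["% Passing Maths", "% Passing Reading", "% Overall Passing"]
school_summary[pct_cols] = school_summary[pct_cols] * 100
school_summary["Per Student Budget"] = (
    school_summary["Total School Budget"] / school_summary["Total Students"]
)
print(school_summary)
```

Because every percentage comes from the same grouped table, no merge is needed to line up pass counts with enrolment.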

In [735]:
organized_df = school_data_complete_df[["school_name","type","size","budget",'year','maths_score','reading_score']]
organized_df.head()

Unnamed: 0,school_name,type,size,budget,year,maths_score,reading_score
0,Huang High School,Government,2917,1910635,9,94,96
1,Huang High School,Government,2917,1910635,12,43,90
2,Huang High School,Government,2917,1910635,12,76,41
3,Huang High School,Government,2917,1910635,12,86,89
4,Huang High School,Government,2917,1910635,9,69,87


In [736]:
renamed_df = organized_df.rename(columns={"school_name":"School Name","type":"School Type","size":"Total Students","budget":"Total School Budget",'year':"Year",'maths_score':"Maths Score",'reading_score':"Reading Score"})
renamed_df.head()

Unnamed: 0,School Name,School Type,Total Students,Total School Budget,Year,Maths Score,Reading Score
0,Huang High School,Government,2917,1910635,9,94,96
1,Huang High School,Government,2917,1910635,12,43,90
2,Huang High School,Government,2917,1910635,12,76,41
3,Huang High School,Government,2917,1910635,12,86,89
4,Huang High School,Government,2917,1910635,9,69,87


In [737]:
renamed_df.columns

Index(['School Name', 'School Type', 'Total Students', 'Total School Budget',
       'Year', 'Maths Score', 'Reading Score'],
      dtype='object')

In [738]:
School_Name_Type_group = renamed_df.groupby(["School Name", 'School Type','Total Students', 'Total School Budget'])
School_Name_Type_group

<pandas.core.groupby.generic.DataFrameGroupBy object at 0x7fc8f1d74490>

In [739]:
School_Name_Type_df = School_Name_Type_group.sum()
School_Name_Type_df

Unnamed: 0_level_0,Unnamed: 1_level_0,Unnamed: 2_level_0,Unnamed: 3_level_0,Year,Maths Score,Reading Score
School Name,School Type,Total Students,Total School Budget,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Bailey High School,Government,4976,3124928,51609,360028,353340
Cabrera High School,Independent,1858,1081356,19298,133139,132586
Figueroa High School,Government,2949,1884411,30585,202592,203711
Ford High School,Government,2739,1763916,28294,189241,190559
Griffin High School,Independent,1468,917500,15216,105385,104588
Hernandez High School,Government,4635,3022020,47932,319235,320679
Holden High School,Independent,427,248087,4412,30993,30599
Huang High School,Government,2917,1910635,30217,201084,201012
Johnson High School,Government,4761,3094650,49280,327762,328696
Pena High School,Independent,962,585858,9963,69349,68892


In [740]:
School_Name_Type_df.reset_index(inplace=True)
School_Name_Type_df

Unnamed: 0,School Name,School Type,Total Students,Total School Budget,Year,Maths Score,Reading Score
0,Bailey High School,Government,4976,3124928,51609,360028,353340
1,Cabrera High School,Independent,1858,1081356,19298,133139,132586
2,Figueroa High School,Government,2949,1884411,30585,202592,203711
3,Ford High School,Government,2739,1763916,28294,189241,190559
4,Griffin High School,Independent,1468,917500,15216,105385,104588
5,Hernandez High School,Government,4635,3022020,47932,319235,320679
6,Holden High School,Independent,427,248087,4412,30993,30599
7,Huang High School,Government,2917,1910635,30217,201084,201012
8,Johnson High School,Government,4761,3094650,49280,327762,328696
9,Pena High School,Independent,962,585858,9963,69349,68892


In [741]:
School_Name_Type_df["Per Student Budget"] = School_Name_Type_df["Total School Budget"]/School_Name_Type_df["Total Students"]
School_Name_Type_df["Per Student Budget"]

0     628.0
1     582.0
2     639.0
3     644.0
4     625.0
5     652.0
6     581.0
7     655.0
8     650.0
9     609.0
10    637.0
11    600.0
12    638.0
13    578.0
14    583.0
Name: Per Student Budget, dtype: float64

In [742]:
organized_School_Name_Type_df = School_Name_Type_df[['School Name', 'School Type', 'Total Students','Total School Budget','Per Student Budget','Year', 'Maths Score', 'Reading Score']]
organized_School_Name_Type_df

Unnamed: 0,School Name,School Type,Total Students,Total School Budget,Per Student Budget,Year,Maths Score,Reading Score
0,Bailey High School,Government,4976,3124928,628.0,51609,360028,353340
1,Cabrera High School,Independent,1858,1081356,582.0,19298,133139,132586
2,Figueroa High School,Government,2949,1884411,639.0,30585,202592,203711
3,Ford High School,Government,2739,1763916,644.0,28294,189241,190559
4,Griffin High School,Independent,1468,917500,625.0,15216,105385,104588
5,Hernandez High School,Government,4635,3022020,652.0,47932,319235,320679
6,Holden High School,Independent,427,248087,581.0,4412,30993,30599
7,Huang High School,Government,2917,1910635,655.0,30217,201084,201012
8,Johnson High School,Government,4761,3094650,650.0,49280,327762,328696
9,Pena High School,Independent,962,585858,609.0,9963,69349,68892


In [743]:
School_Name_Type_df["Average Maths Score"] = School_Name_Type_df["Maths Score"]/School_Name_Type_df["Total Students"]
School_Name_Type_df["Average Maths Score"]

0     72.352894
1     71.657158
2     68.698542
3     69.091274
4     71.788147
5     68.874865
6     72.583138
7     68.935207
8     68.843100
9     72.088358
10    72.047762
11    72.034072
12    69.581651
13    69.170828
14    72.047222
Name: Average Maths Score, dtype: float64

In [744]:
School_Name_Type_df["Average Reading Score"] = School_Name_Type_df["Reading Score"]/School_Name_Type_df["Total Students"]
School_Name_Type_df["Average Reading Score"]

0     71.008842
1     71.359526
2     69.077993
3     69.572472
4     71.245232
5     69.186408
6     71.660422
7     68.910525
8     69.039277
9     71.613306
10    70.935984
11    70.257808
12    69.768807
13    68.876916
14    70.969444
Name: Average Reading Score, dtype: float64

In [745]:
organized_School_Name_Type_df = School_Name_Type_df[['School Name', 'School Type', 'Total Students','Total School Budget','Per Student Budget','Average Maths Score', 'Average Reading Score']]
organized_School_Name_Type_df

Unnamed: 0,School Name,School Type,Total Students,Total School Budget,Per Student Budget,Average Maths Score,Average Reading Score
0,Bailey High School,Government,4976,3124928,628.0,72.352894,71.008842
1,Cabrera High School,Independent,1858,1081356,582.0,71.657158,71.359526
2,Figueroa High School,Government,2949,1884411,639.0,68.698542,69.077993
3,Ford High School,Government,2739,1763916,644.0,69.091274,69.572472
4,Griffin High School,Independent,1468,917500,625.0,71.788147,71.245232
5,Hernandez High School,Government,4635,3022020,652.0,68.874865,69.186408
6,Holden High School,Independent,427,248087,581.0,72.583138,71.660422
7,Huang High School,Government,2917,1910635,655.0,68.935207,68.910525
8,Johnson High School,Government,4761,3094650,650.0,68.8431,69.039277
9,Pena High School,Independent,962,585858,609.0,72.088358,71.613306


In [746]:
Maths_df = renamed_df[["School Name","Maths Score"]]
Maths_df

Unnamed: 0,School Name,Maths Score
0,Huang High School,94
1,Huang High School,43
2,Huang High School,76
3,Huang High School,86
4,Huang High School,69
...,...,...
39165,Thomas High School,48
39166,Thomas High School,89
39167,Thomas High School,99
39168,Thomas High School,77


In [747]:
Passing_Maths_df = Maths_df.loc[renamed_df["Maths Score"] > 49,:]
Passing_Maths_df

Unnamed: 0,School Name,Maths Score
0,Huang High School,94
2,Huang High School,76
3,Huang High School,86
4,Huang High School,69
5,Huang High School,93
...,...,...
39164,Thomas High School,79
39166,Thomas High School,89
39167,Thomas High School,99
39168,Thomas High School,77


In [748]:
Passing_Maths_group_df = Passing_Maths_df.groupby(["School Name"])
Passing_Maths_group_df

<pandas.core.groupby.generic.DataFrameGroupBy object at 0x7fc8f1da0dd0>

In [749]:
Passing_Maths_group_df.count()

Unnamed: 0_level_0,Maths Score
School Name,Unnamed: 1_level_1
Bailey High School,4560
Cabrera High School,1688
Figueroa High School,2408
Ford High School,2258
Griffin High School,1339
Hernandez High School,3752
Holden High School,384
Huang High School,2383
Johnson High School,3907
Pena High School,882


In [750]:
Number_Passing_Maths_df = Passing_Maths_group_df.count()
Number_Passing_Maths_df

Unnamed: 0_level_0,Maths Score
School Name,Unnamed: 1_level_1
Bailey High School,4560
Cabrera High School,1688
Figueroa High School,2408
Ford High School,2258
Griffin High School,1339
Hernandez High School,3752
Holden High School,384
Huang High School,2383
Johnson High School,3907
Pena High School,882


In [751]:
Total_students_df = School_Name_Type_df[["School Name","Total Students"]]
Total_students_df

Unnamed: 0,School Name,Total Students
0,Bailey High School,4976
1,Cabrera High School,1858
2,Figueroa High School,2949
3,Ford High School,2739
4,Griffin High School,1468
5,Hernandez High School,4635
6,Holden High School,427
7,Huang High School,2917
8,Johnson High School,4761
9,Pena High School,962


In [752]:
combined_Passing_Maths_Total_Student_df = pd.merge(Number_Passing_Maths_df, Total_students_df,how='outer', on='School Name')
combined_Passing_Maths_Total_Student_df

Unnamed: 0,School Name,Maths Score,Total Students
0,Bailey High School,4560,4976
1,Cabrera High School,1688,1858
2,Figueroa High School,2408,2949
3,Ford High School,2258,2739
4,Griffin High School,1339,1468
5,Hernandez High School,3752,4635
6,Holden High School,384,427
7,Huang High School,2383,2917
8,Johnson High School,3907,4761
9,Pena High School,882,962


In [753]:
Percent_Passing_Maths = (combined_Passing_Maths_Total_Student_df["Maths Score"] / combined_Passing_Maths_Total_Student_df["Total Students"]) * 100
Percent_Passing_Maths

0     91.639871
1     90.850377
2     81.654798
3     82.438846
4     91.212534
5     80.949299
6     89.929742
7     81.693521
8     82.062592
9     91.683992
10    90.797699
11    91.538898
12    83.853211
13    82.785808
14    91.777778
dtype: float64

In [754]:
Percent_Passing_Maths_df = Percent_Passing_Maths
Percent_Passing_Maths_df

0     91.639871
1     90.850377
2     81.654798
3     82.438846
4     91.212534
5     80.949299
6     89.929742
7     81.693521
8     82.062592
9     91.683992
10    90.797699
11    91.538898
12    83.853211
13    82.785808
14    91.777778
dtype: float64

In [755]:
# assign() returns a new DataFrame, so no value is set on a slice of another DataFrame
organized_School_Name_Type_df = organized_School_Name_Type_df.assign(**{"% Passing Maths": Percent_Passing_Maths_df})


In [756]:
Reading_df = renamed_df[["School Name","Reading Score"]]
Reading_df

Unnamed: 0,School Name,Reading Score
0,Huang High School,96
1,Huang High School,90
2,Huang High School,41
3,Huang High School,89
4,Huang High School,87
...,...,...
39165,Thomas High School,51
39166,Thomas High School,81
39167,Thomas High School,99
39168,Thomas High School,72


In [757]:
Passing_Reading_df = Reading_df.loc[renamed_df["Reading Score"] > 49,:]
Passing_Reading_df

Unnamed: 0,School Name,Reading Score
0,Huang High School,96
1,Huang High School,90
3,Huang High School,89
4,Huang High School,87
5,Huang High School,88
...,...,...
39164,Thomas High School,97
39165,Thomas High School,51
39166,Thomas High School,81
39167,Thomas High School,99


In [758]:
Passing_Reading_group_df = Passing_Reading_df.groupby(["School Name"])
Passing_Reading_group_df

<pandas.core.groupby.generic.DataFrameGroupBy object at 0x7fc8f1d74910>

In [759]:
Passing_Reading_group_df.count()

Unnamed: 0_level_0,Reading Score
School Name,Unnamed: 1_level_1
Bailey High School,4348
Cabrera High School,1655
Figueroa High School,2442
Ford High School,2252
Griffin High School,1299
Hernandez High School,3795
Holden High School,378
Huang High School,2376
Johnson High School,3903
Pena High School,833


In [760]:
Number_Passing_Reading_df = Passing_Reading_group_df.count()
Number_Passing_Reading_df

Unnamed: 0_level_0,Reading Score
School Name,Unnamed: 1_level_1
Bailey High School,4348
Cabrera High School,1655
Figueroa High School,2442
Ford High School,2252
Griffin High School,1299
Hernandez High School,3795
Holden High School,378
Huang High School,2376
Johnson High School,3903
Pena High School,833


In [761]:
combined_Passing_Reading_Total_Student_df = pd.merge(Number_Passing_Reading_df, Total_students_df,how='outer', on='School Name')
combined_Passing_Reading_Total_Student_df

Unnamed: 0,School Name,Reading Score,Total Students
0,Bailey High School,4348,4976
1,Cabrera High School,1655,1858
2,Figueroa High School,2442,2949
3,Ford High School,2252,2739
4,Griffin High School,1299,1468
5,Hernandez High School,3795,4635
6,Holden High School,378,427
7,Huang High School,2376,2917
8,Johnson High School,3903,4761
9,Pena High School,833,962


In [762]:
Percent_Passing_Reading = (combined_Passing_Reading_Total_Student_df["Reading Score"] / combined_Passing_Reading_Total_Student_df["Total Students"]) * 100
Percent_Passing_Reading

0     87.379421
1     89.074273
2     82.807731
3     82.219788
4     88.487738
5     81.877023
6     88.524590
7     81.453548
8     81.978576
9     86.590437
10    87.396849
11    86.712095
12    82.629969
13    81.296540
14    86.666667
dtype: float64

In [763]:
Percent_Passing_Reading_df = Percent_Passing_Reading
Percent_Passing_Reading_df

0     87.379421
1     89.074273
2     82.807731
3     82.219788
4     88.487738
5     81.877023
6     88.524590
7     81.453548
8     81.978576
9     86.590437
10    87.396849
11    86.712095
12    82.629969
13    81.296540
14    86.666667
dtype: float64

In [764]:
# assign() returns a new DataFrame, so no value is set on a slice of another DataFrame
organized_School_Name_Type_df = organized_School_Name_Type_df.assign(**{"% Passing Reading": Percent_Passing_Reading_df})


In [765]:
organized_School_Name_Type_df.columns

Index(['School Name', 'School Type', 'Total Students', 'Total School Budget',
       'Per Student Budget', 'Average Maths Score', 'Average Reading Score',
       '% Passing Maths', '% Passing Reading'],
      dtype='object')

In [766]:
re_organized_School_Name_Type_df = organized_School_Name_Type_df[['School Name', 'School Type', 'Total Students','Total School Budget','Per Student Budget','Average Maths Score', 'Average Reading Score', '% Passing Maths','% Passing Reading']]
re_organized_School_Name_Type_df

Unnamed: 0,School Name,School Type,Total Students,Total School Budget,Per Student Budget,Average Maths Score,Average Reading Score,% Passing Maths,% Passing Reading
0,Bailey High School,Government,4976,3124928,628.0,72.352894,71.008842,91.639871,87.379421
1,Cabrera High School,Independent,1858,1081356,582.0,71.657158,71.359526,90.850377,89.074273
2,Figueroa High School,Government,2949,1884411,639.0,68.698542,69.077993,81.654798,82.807731
3,Ford High School,Government,2739,1763916,644.0,69.091274,69.572472,82.438846,82.219788
4,Griffin High School,Independent,1468,917500,625.0,71.788147,71.245232,91.212534,88.487738
5,Hernandez High School,Government,4635,3022020,652.0,68.874865,69.186408,80.949299,81.877023
6,Holden High School,Independent,427,248087,581.0,72.583138,71.660422,89.929742,88.52459
7,Huang High School,Government,2917,1910635,655.0,68.935207,68.910525,81.693521,81.453548
8,Johnson High School,Government,4761,3094650,650.0,68.8431,69.039277,82.062592,81.978576
9,Pena High School,Independent,962,585858,609.0,72.088358,71.613306,91.683992,86.590437


In [767]:
Number_Passing_Maths_Reading_df=school_data_complete_df[["school_name",'maths_score','reading_score']]
Number_Passing_Maths_Reading_df

Unnamed: 0,school_name,maths_score,reading_score
0,Huang High School,94,96
1,Huang High School,43,90
2,Huang High School,76,41
3,Huang High School,86,89
4,Huang High School,69,87
...,...,...,...
39165,Thomas High School,48,51
39166,Thomas High School,89,81
39167,Thomas High School,99,99
39168,Thomas High School,77,72


In [768]:
Number_Passing_Maths_Reading_renamed_df = Number_Passing_Maths_Reading_df.rename(columns={"school_name":"School Name",'maths_score':"Maths Score",'reading_score':"Reading Score"})
Number_Passing_Maths_Reading_renamed_df

Unnamed: 0,School Name,Maths Score,Reading Score
0,Huang High School,94,96
1,Huang High School,43,90
2,Huang High School,76,41
3,Huang High School,86,89
4,Huang High School,69,87
...,...,...,...
39165,Thomas High School,48,51
39166,Thomas High School,89,81
39167,Thomas High School,99,99
39168,Thomas High School,77,72


In [774]:
maths_reading_df = Number_Passing_Maths_Reading_renamed_df["Maths Score"].loc[(Number_Passing_Maths_Reading_renamed_df["Maths Score"] > 49) & (Number_Passing_Maths_Reading_renamed_df["Reading Score"] > 49)]
maths_reading_df

0        94
3        86
4        69
5        93
6        60
         ..
39163    86
39164    79
39166    89
39167    99
39168    77
Name: Maths Score, Length: 28519, dtype: int64

In [779]:
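# maths_reading_df is a Series with no "School Name" column, so use its index to pull
# the matching rows (students who passed both subjects) and group those by school
passing_maths_reading = Number_Passing_Maths_Reading_renamed_df.loc[maths_reading_df.index].groupby(["School Name"])
passing_maths_reading["Maths Score"].count()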

In [778]:
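# The Series has no "School Name" column, so join on the shared row index instead;
# rows for students who did not pass both subjects get NaN in the new column
passing_maths_reading_df = pd.merge(Number_Passing_Maths_Reading_renamed_df, maths_reading_df.rename("Passing Maths Score"), how='outer', left_index=True, right_index=True)
passing_maths_reading_df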

In [None]:
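# Count, per school, the students who passed both subjects, then attach each school's enrolment
passing_both_counts = Number_Passing_Maths_Reading_renamed_df.loc[maths_reading_df.index].groupby("School Name")["Maths Score"].count().rename("Passed Both").reset_index()
passing_maths_reading_df = pd.merge(passing_both_counts, Total_students_df, how='outer', on="School Name")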
passing_maths_reading_df

In [None]:
# Group the rows of students who passed both subjects by school
passing_maths_reading_df = school_data_complete_df.loc[maths_reading_df.index].groupby(["school_name"])
passing_maths_reading_df

In [None]:
# Filter the student-level table, which holds both score columns
maths_reading = school_data_complete_df["maths_score"].loc[(school_data_complete_df["maths_score"] > 49) & (school_data_complete_df["reading_score"] > 49)]

In [None]:
Passing_Maths_Reading_df = pd.merge(Number_Passing_Reading_df, Total_students_df,how='outer', on='School Name')
Passing_Maths_Reading_df

In [None]:
Number_passing_maths_reading = school_data_complete_df["maths_score"].loc[(school_data_complete_df["maths_score"] > 49) & (school_data_complete_df["reading_score"] > 49)]
Number_passing_maths_reading.count()

In [None]:
# Select the identifying columns from the per-school table built above
school_summary_df = School_Name_Type_df[["School Name", "School Type", "Total Students", "Total School Budget"]]
school_summary_df

In [None]:
# Enrolment and budget columns from the per-school table built above
Total_students_and_budget_df = School_Name_Type_df[["Total Students", "Total School Budget"]]
Total_students_and_budget_df

In [None]:
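# Split the student rows into one DataFrame per school, keyed by school name
school_dfs = {name: group for name, group in renamed_df.groupby("School Name")}
Bailey_High_School_df = school_dfs["Bailey High School"]
Cabrera_High_School_df = school_dfs["Cabrera High School"]
Figueroa_High_School_df = school_dfs["Figueroa High School"]
Ford_High_School_df = school_dfs["Ford High School"]
Griffin_High_School_df = school_dfs["Griffin High School"]
Hernandez_High_School_df = school_dfs["Hernandez High School"]
Holden_High_School_df = school_dfs["Holden High School"]
Huang_High_School_df = school_dfs["Huang High School"]
Johnson_High_School_df = school_dfs["Johnson High School"]
Pena_High_School_df = school_dfs["Pena High School"]
Rodriguez_High_School_df = school_dfs["Rodriguez High School"]
Shelton_High_School_df = school_dfs["Shelton High School"]
Thomas_High_School_df = school_dfs["Thomas High School"]
Wilson_High_School_df = school_dfs["Wilson High School"]
Wright_High_School_df = school_dfs["Wright High School"]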


In [None]:
Schools = renamed_df["School Name"].unique()
Schools

In [None]:
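# groupby is a method, so it is called with parentheses; sum() turns the groups into a table
grouped_df = renamed_df.groupby(["School Name", "School Type", "Total Students", "Total School Budget"]).sum()
# Keep the grouping keys as ordinary columns so the budget figures stay in the table
budget_df = grouped_df.reset_index()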
budget_df.head()


In [None]:
grouped_df[[]]

In [None]:
# Index the per-school table built above by school name
school_summary_df = School_Name_Type_df.set_index("School Name")
school_summary_df.head()

In [None]:
# Column names in the merged data, kept for reference:
# 'Student ID', 'student_name', 'gender', 'year', 'school_name',
# 'reading_score', 'maths_score', 'School ID', 'type', 'size', 'budget'

In [None]:
# Per-school count of students who passed both subjects
passed_both = renamed_df.loc[(renamed_df["Maths Score"] > 49) & (renamed_df["Reading Score"] > 49)].groupby("School Name")["Maths Score"].count()
# Start from the table that already holds the other metrics and add the overall pass rate
school_summary_df = re_organized_School_Name_Type_df.set_index("School Name")
school_summary_df["% Overall Passing"] = passed_both / school_summary_df["Total Students"] * 100
school_summary_df.head()

## Top Performing Schools (By % Overall Passing)

* Sort and display the top five performing schools by % overall passing.
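
A minimal sketch of the sort, assuming a per-school table with a `% Overall Passing` column. The six schools and their percentages are made up for illustration.

```python
import pandas as pd

# Made-up percentages for six hypothetical schools.
school_summary = pd.DataFrame(
    {"% Overall Passing": [81.3, 66.6, 80.1, 67.4, 79.2, 66.4]},
    index=pd.Index(["A", "B", "C", "D", "E", "F"], name="School Name"),
)

# Highest percentage first, then keep the first five rows.
top_five = school_summary.sort_values("% Overall Passing", ascending=False).head(5)
print(top_five)
```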

## Bottom Performing Schools (By % Overall Passing)

* Sort and display the five worst-performing schools by % overall passing.
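
For the bottom five, `DataFrame.nsmallest` is a compact alternative to sorting in ascending order and taking the head. The sketch below again uses made-up schools and percentages.

```python
import pandas as pd

# Made-up percentages for six hypothetical schools.
school_summary = pd.DataFrame(
    {"% Overall Passing": [81.3, 66.6, 80.1, 67.4, 79.2, 66.4]},
    index=pd.Index(["A", "B", "C", "D", "E", "F"], name="School Name"),
)

# The five lowest percentages, returned lowest first.
bottom_five = school_summary.nsmallest(5, "% Overall Passing")
print(bottom_five)
```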

## Maths Scores by Year

* Create a table that lists the average maths score for students of each year level (9, 10, 11, 12) at each school.

  * Create a pandas series for each year. Hint: use a conditional statement.
  
  * Group each series by school
  
  * Combine the series into a dataframe
  
  * Optional: give the displayed data cleaner formatting
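
The steps above can be sketched as follows, assuming a student-level table with `school_name`, `year` and `maths_score` columns. The rows are made up, and the `Year N` column labels are an illustrative choice.

```python
import pandas as pd

# Made-up student rows with the same column names as the merged table.
students = pd.DataFrame({
    "school_name": ["A High", "A High", "A High", "B High", "B High", "B High"],
    "year": [9, 9, 10, 9, 10, 10],
    "maths_score": [60, 80, 70, 50, 90, 70],
})

per_year = {}
for year in sorted(students["year"].unique()):
    # Conditional selection: keep only the rows for this year level.
    in_year = students[students["year"] == year]
    # Group that year's scores by school and average them, giving one Series per year.
    per_year[f"Year {year}"] = in_year.groupby("school_name")["maths_score"].mean()

# Combine the per-year Series into one table: one row per school, one column per year.
maths_by_year = pd.DataFrame(per_year)
print(maths_by_year)
```

Swapping `maths_score` for `reading_score` gives the equivalent reading table.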

## Reading Score by Year

* Perform the same operations as above for reading scores

## Scores by School Spending

* Create a table that breaks down school performances based on average Spending Ranges (Per Student). Use 4 reasonable bins to group school spending. Include in the table each of the following:
  * Average Maths Score
  * Average Reading Score
  * % Passing Maths
  * % Passing Reading
  * Overall Passing Rate (Average of the above two)
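
One way to bin schools by spending is `pd.cut`. The sketch below uses a made-up per-school table; the bin edges and labels are assumptions and should be chosen to suit the actual range of per-student budgets.

```python
import pandas as pd

# Made-up per-school figures for four hypothetical schools.
school_summary = pd.DataFrame({
    "Per Student Budget": [578.0, 600.0, 628.0, 655.0],
    "Average Maths Score": [72.0, 71.0, 70.0, 68.0],
    "% Passing Maths": [91.0, 90.0, 85.0, 81.0],
}, index=["A", "B", "C", "D"])

# Assumed bin edges; each bin includes its upper edge (the pd.cut default).
bins = [0, 585, 630, 645, 680]
labels = ["$585 or less", "$585-630", "$630-645", "$645-680"]
school_summary["Spending Ranges (Per Student)"] = pd.cut(
    school_summary["Per Student Budget"], bins=bins, labels=labels
)

# Average the school-level metrics within each bin; empty bins show NaN.
by_spending = school_summary.groupby(
    "Spending Ranges (Per Student)", observed=False
)[["Average Maths Score", "% Passing Maths"]].mean()
print(by_spending)
```

Averaging school-level figures gives every school equal weight regardless of enrolment, which is a design choice worth stating alongside the table. The same pattern applies to school size by binning `Total Students` instead.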

## Scores by School Size

* Perform the same operations as above, based on school size.

## Scores by School Type

* Perform the same operations as above, based on school type
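
School type is already a category, so no binning is needed; a plain `groupby` is enough. The sketch below uses made-up figures for four hypothetical schools.

```python
import pandas as pd

# Made-up per-school figures.
school_summary = pd.DataFrame({
    "School Type": ["Government", "Independent", "Government", "Independent"],
    "Average Maths Score": [68.0, 72.0, 70.0, 71.0],
    "% Passing Maths": [81.0, 91.0, 83.0, 90.0],
})

# Average each metric across the schools of each type.
by_type = school_summary.groupby("School Type")[
    ["Average Maths Score", "% Passing Maths"]
].mean()
print(by_type)
```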