### Note
* Instructions have been included for each segment. You do not have to follow them exactly, but they are included to help you think through the steps.

In [1]:
# Dependencies and Setup
import pandas as pd

# File to Load (Remember to Change These)
school_data_to_load = "Resources/schools_complete.csv"
student_data_to_load = "Resources/students_complete.csv"

# Read School and Student Data File and store into Pandas DataFrames
school_data = pd.read_csv(school_data_to_load,encoding="utf-8")
student_data = pd.read_csv(student_data_to_load,encoding="utf-8")

# Combine the data into a single dataset.  
school_data_complete = pd.merge(student_data, school_data, how="left", on=["school_name", "school_name"])

## Local Government Area Summary

* Calculate the total number of schools

* Calculate the total number of students

* Calculate the total budget

* Calculate the average maths score 

* Calculate the average reading score

* Calculate the percentage of students with a passing maths score (50 or greater)

* Calculate the percentage of students with a passing reading score (50 or greater)

* Calculate the percentage of students who passed maths **and** reading (% Overall Passing)

* Create a dataframe to hold the above results

* Optional: give the displayed data cleaner formatting

In [2]:
# Convert school_data_complete into DataFrame
sc_df = pd.DataFrame(school_data_complete)
sc_df.head()

Unnamed: 0,Student ID,student_name,gender,year,school_name,reading_score,maths_score,School ID,type,size,budget
0,0,Paul Bradley,M,9,Huang High School,96,94,0,Government,2917,1910635
1,1,Victor Smith,M,12,Huang High School,90,43,0,Government,2917,1910635
2,2,Kevin Rodriguez,M,12,Huang High School,41,76,0,Government,2917,1910635
3,3,Richard Scott,M,12,Huang High School,89,86,0,Government,2917,1910635
4,4,Bonnie Ray,F,9,Huang High School,87,69,0,Government,2917,1910635


In [3]:
#Create new variables for summary table

#Total number of Schools
total_schools = len(sc_df["school_name"].unique())

#Total number of students
total_students = len(sc_df["Student ID"].unique())

#Total Budget
total_budget = sc_df["budget"].sum()

#Average Maths Score
ave_maths_score = sc_df["maths_score"].mean()

#Calculate the average reading score
ave_read_score = sc_df["reading_score"].mean()

#Calculate the percentage of students with a passing maths score (50 or greater)
perc_maths_pass = len(sc_df.loc[sc_df["maths_score"]>=50])/\
                        len(sc_df["maths_score"])*100

#Calculate the percentage of students with a passing reading score (50 or greater)
perc_read_pass =len(sc_df.loc[sc_df["reading_score"]>=50])/\
                        len(sc_df["reading_score"])*100

# Calculate the percentage of students who passed maths and reading (% Overall Passing)
perc_over_pass = len(sc_df.loc[(sc_df["maths_score"]>=50)&(sc_df["reading_score"]>=50)])/\
                            len(sc_df["reading_score"])*100


# Create a dataframe to hold the above results
summary_sc_df = pd.DataFrame({"Total Schools":[total_schools],
                                 "Total Students":[total_students],
                                  "Total Budget":[total_budget],
                                  "Average Maths Score":[ave_maths_score],
                                  "Average Reading Scor":[ave_read_score],
                                  "% Passing Maths":[perc_maths_pass],
                                  "% Passing Reading":[perc_read_pass],
                                  "% Passing Overall":[perc_over_pass],
                             })
summary_sc_df
# Optional: give the displayed data cleaner formatting


Unnamed: 0,Total Schools,Total Students,Total Budget,Average Maths Score,Average Reading Scor,% Passing Maths,% Passing Reading,% Passing Overall
0,15,39170,82932329558,70.338192,69.980138,86.078632,84.426857,72.808272


## School Summary

* Create an overview table that summarises key metrics about each school, including:
  * School Name
  * School Type
  * Total Students
  * Total School Budget
  * Per Student Budget
  * Average Maths Score
  * Average Reading Score
  * % Passing Maths
  * % Passing Reading
  * % Overall Passing (The percentage of students that passed maths **and** reading.)
  
* Create a dataframe to hold the above results

In [None]:
#Group by schoolobject
school_group = sc_df.groupby("school_name")
school_group.head()

## Top Performing Schools (By % Overall Passing)

* Sort and display the top five performing schools by % overall passing.

## Bottom Performing Schools (By % Overall Passing)

* Sort and display the five worst-performing schools by % overall passing.

## Maths Scores by Year

* Create a table that lists the average maths score for students of each year level (9, 10, 11, 12) at each school.

  * Create a pandas series for each year. Hint: use a conditional statement.
  
  * Group each series by school
  
  * Combine the series into a dataframe
  
  * Optional: give the displayed data cleaner formatting

## Reading Score by Year

* Perform the same operations as above for reading scores

## Scores by School Spending

* Create a table that breaks down school performances based on average Spending Ranges (Per Student). Use 4 reasonable bins to group school spending. Include in the table each of the following:
  * Average Maths Score
  * Average Reading Score
  * % Passing Maths
  * % Passing Reading
  * Overall Passing Rate (Average of the above two)

## Scores by School Size

* Perform the same operations as above, based on school size.

## Scores by School Type

* Perform the same operations as above, based on school type