![New York City schoolbus](schoolbus.jpg)

Photo by [Jannis Lucas](https://unsplash.com/@jannis_lucas) on [Unsplash](https://unsplash.com).
<br>

Every year, American high school students take the SAT, a standardized test intended to measure literacy, numeracy, and writing skills. It has three sections (reading, math, and writing), each with a **maximum score of 800 points**. These scores matter to both students and colleges because they play a pivotal role in the admissions process.

Analyzing the performance of schools is important for a variety of stakeholders, including policy and education professionals, researchers, government, and even parents considering which school their children should attend. 

You have been provided with a dataset called `schools.csv`, which is previewed below.

You have been tasked with answering three key questions about New York City (NYC) public school SAT performance.

In [88]:
# Re-run this cell 
import pandas as pd

# Read in the data
schools = pd.read_csv("schools.csv")

# Fill missing values with 0 (a 0.0 in percent_tested may then mean "not reported")
schools = schools.fillna(0)

# Preview the data
schools.head()


| | school_name | borough | building_code | average_math | average_reading | average_writing | percent_tested |
|---|---|---|---|---|---|---|---|
| 0 | New Explorations into Science, Technology and ... | Manhattan | M022 | 657 | 601 | 601 | 0.0 |
| 1 | Essex Street Academy | Manhattan | M445 | 395 | 411 | 387 | 78.9 |
| 2 | Lower Manhattan Arts Academy | Manhattan | M445 | 418 | 428 | 415 | 65.1 |
| 3 | High School for Dual Language and Asian Studies | Manhattan | M445 | 613 | 453 | 463 | 95.9 |
| 4 | Henry Street School for International Studies | Manhattan | M056 | 410 | 406 | 381 | 59.7 |
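
The `fillna(0)` call in the cell above replaces every missing value with zero, which can quietly change summary statistics. The sketch below shows one way to check for missing values first and to see the effect of the fill. It uses a hypothetical three-row sample (the school names and the missing entry are assumptions for illustration, not rows read from `schools.csv`).

```python
import pandas as pd

# Hypothetical sample; values are illustrative, not taken from schools.csv
sample = pd.DataFrame({
    "school_name": ["School A", "School B", "School C"],
    "average_math": [657, 395, 418],
    "percent_tested": [None, 78.9, 65.1],
})

# Count missing values per column before deciding how to handle them
missing_counts = sample.isna().sum()

# Filling with 0 makes "not reported" indistinguishable from "0% tested"
filled = sample.fillna(0)

# pandas skips NaN when averaging, so the fill changes the result
mean_skipping_missing = sample["percent_tested"].mean()
mean_after_fill = filled["percent_tested"].mean()

print(missing_counts)
print(mean_skipping_missing, mean_after_fill)
```

In this sample the average of `percent_tested` drops once the missing entry is counted as zero. The three questions in this notebook only use the score columns, so the fill matters mainly if `percent_tested` is analyzed later.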


In [89]:
# Which NYC schools have the best math results?
# "Best" means an average math score of at least 80% of the 800-point maximum.
best_math_schools = (
    schools.loc[schools["average_math"] >= 0.8 * 800, ["school_name", "average_math"]]
    .sort_values("average_math", ascending=False)
)
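
To make the cutoff in the cell above concrete: 80% of the 800-point maximum is 640, and the `>=` comparison keeps a school that scores exactly 640. The following is a minimal sketch on a hypothetical four-school sample (names and scores are made up for illustration).

```python
import pandas as pd

MAX_SECTION_SCORE = 800
threshold = 0.8 * MAX_SECTION_SCORE  # 80% of the maximum section score

# Hypothetical sample; names and scores are illustrative only
toy = pd.DataFrame({
    "school_name": ["School A", "School B", "School C", "School D"],
    "average_math": [639, 657, 410, 640],
})

# Keep schools at or above the threshold, highest score first
best = (
    toy.loc[toy["average_math"] >= threshold, ["school_name", "average_math"]]
    .sort_values("average_math", ascending=False)
)
print(best)
```

School A, at 639, falls just below the cutoff, while School D, at exactly 640, is retained.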

In [90]:
# What are the top 10 performing schools based on the combined SAT scores? 
schools["total_SAT"] = schools["average_math"] + schools["average_reading"] + schools["average_writing"]

# Sort the schools by total_SAT and keep the name and score of the top 10
top_10_schools = schools.sort_values("total_SAT", ascending=False)[["school_name", "total_SAT"]].head(10)
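
As an alternative to sorting and slicing, `DataFrame.nlargest` can select the top rows directly. This is a minimal sketch on a hypothetical four-school sample (the school names are placeholders), taking the top 2 rather than the top 10 to keep the example short.

```python
import pandas as pd

# Hypothetical sample; school names are placeholders
toy = pd.DataFrame({
    "school_name": ["School A", "School B", "School C", "School D"],
    "average_math": [657, 395, 613, 410],
    "average_reading": [601, 411, 453, 406],
    "average_writing": [601, 387, 463, 381],
})

# Combined score across the three sections
toy["total_SAT"] = (
    toy["average_math"] + toy["average_reading"] + toy["average_writing"]
)

# nlargest returns the n rows with the highest total_SAT, already sorted
top_2 = toy.nlargest(2, "total_SAT")[["school_name", "total_SAT"]]
print(top_2)
```

Selecting columns by name, rather than by position, keeps the result stable if columns are later added to or reordered in the DataFrame.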

In [91]:
# Which single borough has the largest standard deviation in the combined SAT score?

# Group by borough and calculate the standard deviation, mean, and count of the total_SAT scores
borough_stats = schools.groupby("borough")["total_SAT"].agg(['std', 'mean', 'count']).reset_index()

# Rename the columns for clarity
borough_stats.columns = ["borough", "std_SAT", "average_SAT", "num_schools"]

# Find the borough with the largest standard deviation
largest_std_dev = borough_stats.sort_values("std_SAT", ascending=False).head(1)

# Round all numeric values to two decimal places
largest_std_dev = largest_std_dev.round(2)
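
Two details of the cell above are worth making explicit: pandas computes the sample standard deviation (`ddof=1`) by default, and `idxmax` offers a direct way to get the group with the largest value. The sketch below illustrates both on a hypothetical two-borough sample (the borough labels "A" and "B" and the scores are made up).

```python
import pandas as pd

# Hypothetical sample; borough labels and scores are illustrative only
toy = pd.DataFrame({
    "borough": ["A", "A", "B", "B", "B"],
    "total_SAT": [1000, 1400, 1200, 1210, 1220],
})

# GroupBy.std() uses the sample standard deviation (ddof=1) by default
std_by_borough = toy.groupby("borough")["total_SAT"].std()

# idxmax returns the index label (here, the borough) of the largest value
widest_borough = std_by_borough.idxmax()
print(widest_borough, round(std_by_borough[widest_borough], 2))
```

Borough "A" has fewer schools but a much wider spread, so it is the one returned. Keeping the `count` column alongside `std`, as the original cell does, helps judge how much weight to give a large deviation from a small group.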