# Exploring NYC Public School Test Result Scores

<figure>
<img src="schoolbus.jpg" alt="New York City schoolbus">
<figcaption style="font-size: 11px;">Photo by <a href="https://unsplash.com/@jannis_lucas">Jannis Lucas</a> on <a href="https://unsplash.com">Unsplash</a></figcaption>
</figure>

## Project Description

Every year, American high school students take the SAT, a standardized test intended to measure
literacy, numeracy, and writing skills.

The test has three sections (reading, math, and writing), each with a maximum score of 800 points.
Scores matter to both students and colleges because they play a pivotal role in the
admissions process.

Analyzing the performance of schools is important for a variety of stakeholders, including policy   
and education professionals, researchers, government, and even parents considering which school   
their children should attend. 

You have been provided with a dataset called `schools.csv`, which is previewed below.

You have been tasked with answering three key questions about New York City (NYC) public school   
SAT performance.

In [1]:
# Import required libraries
import pandas as pd
import numpy as np 
import matplotlib.pyplot as plt 
import seaborn as sns

# Change the default console output settings in NumPy and pandas to improve readability
pd.options.display.max_columns = 20
pd.options.display.max_rows = 20
pd.options.display.max_colwidth = 80
np.set_printoptions(precision=4, suppress=True)

In [2]:
# Read in the data
schools = pd.read_csv("files/schools.csv")

# Preview the data
schools.head()

|  | school_name | borough | building_code | average_math | average_reading | average_writing | percent_tested |
|---|---|---|---|---|---|---|---|
| 0 | New Explorations into Science, Technology and Math High School | Manhattan | M022 | 657 | 601 | 601 | NaN |
| 1 | Essex Street Academy | Manhattan | M445 | 395 | 411 | 387 | 78.9 |
| 2 | Lower Manhattan Arts Academy | Manhattan | M445 | 418 | 428 | 415 | 65.1 |
| 3 | High School for Dual Language and Asian Studies | Manhattan | M445 | 613 | 453 | 463 | 95.9 |
| 4 | Henry Street School for International Studies | Manhattan | M056 | 410 | 406 | 381 | 59.7 |
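
The preview shows a missing `percent_tested` value for the first school, so it may be worth counting missing values per column before analyzing. The sketch below uses only three rows copied from the preview (the name `preview` is introduced here for illustration); on the full data the same call would be `schools.isna().sum()`.

```python
import numpy as np
import pandas as pd

# Three rows copied from the preview above; the full file is not reproduced here
preview = pd.DataFrame({
    "school_name": [
        "New Explorations into Science, Technology and Math High School",
        "Essex Street Academy",
        "Lower Manhattan Arts Academy",
    ],
    "average_math": [657, 395, 418],
    "percent_tested": [np.nan, 78.9, 65.1],
})

# Count missing values in each column
missing_per_column = preview.isna().sum()
print(missing_per_column)
```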


**Explore the `schools.csv` dataset and use your findings to answer the following questions:**

* Which NYC schools have the best math results?
    * "Best" here means an average math score of at least 80% of the *maximum possible score of 800*,
    that is, 640 or higher.

* What are the top 10 performing schools based on the combined SAT scores?

* Which single borough has the largest standard deviation in the combined   
SAT score?

Note: round all numeric values to two decimal places.

**1. Finding schools with the best math scores**

In [3]:
# Subset the data to schools whose average math score is at least 80% of the 800-point maximum (640)
best_math_schools = schools[schools["average_math"] >= (800*0.8)]

# Extract required columns and sort scores in descending order
best_math_schools = (
    best_math_schools[["school_name", "average_math"]]
    .sort_values(by="average_math", ascending=False, ignore_index=True)
)

print("NYC schools with the best math results:")
best_math_schools.head()

NYC schools with the best math results:


|  | school_name | average_math |
|---|---|---|
| 0 | Stuyvesant High School | 754 |
| 1 | Bronx High School of Science | 714 |
| 2 | Staten Island Technical High School | 711 |
| 3 | Queens High School for the Sciences at York College | 701 |
| 4 | High School for Mathematics, Science, and Engineering at City College | 683 |
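
To make the threshold logic concrete, here is a minimal sketch on made-up data. The schools "School A" to "School C" and their scores are hypothetical and chosen only to show that `>=` keeps a school sitting exactly on the 640-point boundary.

```python
import pandas as pd

# Hypothetical schools and scores, for illustration only
toy = pd.DataFrame({
    "school_name": ["School A", "School B", "School C"],
    "average_math": [700, 640, 639],
})

# 80% of the 800-point maximum
threshold = 800 * 0.8

# Keep schools at or above the threshold, highest score first
best = (
    toy[toy["average_math"] >= threshold]
    .sort_values(by="average_math", ascending=False, ignore_index=True)
)
print(best)
```

School B (exactly 640) is kept and School C (639) is dropped, which matches the "at least 80%" wording of the question.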


**2. Identifying the top 10 performing schools**

In [4]:
# Create list of columns to be added together
cols = ["average_math", "average_reading", "average_writing"]

# Store sum of column values in a new column
schools["total_SAT"] = schools[cols].sum(axis=1)

schools.head()

|  | school_name | borough | building_code | average_math | average_reading | average_writing | percent_tested | total_SAT |
|---|---|---|---|---|---|---|---|---|
| 0 | New Explorations into Science, Technology and Math High School | Manhattan | M022 | 657 | 601 | 601 | NaN | 1859 |
| 1 | Essex Street Academy | Manhattan | M445 | 395 | 411 | 387 | 78.9 | 1193 |
| 2 | Lower Manhattan Arts Academy | Manhattan | M445 | 418 | 428 | 415 | 65.1 | 1261 |
| 3 | High School for Dual Language and Asian Studies | Manhattan | M445 | 613 | 453 | 463 | 95.9 | 1529 |
| 4 | Henry Street School for International Studies | Manhattan | M056 | 410 | 406 | 381 | 59.7 | 1197 |
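
One caveat about `sum(axis=1)`: by default pandas skips missing values, so a school missing one section score would silently receive a partial total. Whether any section scores are actually missing in `schools.csv` is not shown here, so the sketch below uses hypothetical numbers to illustrate the difference and how `min_count` guards against it.

```python
import numpy as np
import pandas as pd

cols = ["average_math", "average_reading", "average_writing"]

# Hypothetical scores; the second row is missing its reading score
toy = pd.DataFrame({
    "average_math": [400, 500],
    "average_reading": [410, np.nan],
    "average_writing": [390, 480],
})

# Default behavior: NaN is skipped, so row 1 gets a two-section total
default_total = toy[cols].sum(axis=1)

# Require all three sections; otherwise the total is NaN
strict_total = toy[cols].sum(axis=1, min_count=len(cols))

print(default_total.tolist())
print(strict_total.tolist())
```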


In [5]:
# Create a new DataFrame with total_SAT scores in descending order
top_10_schools = (
    schools[["school_name", "total_SAT"]]
    .sort_values(by="total_SAT", ascending=False, ignore_index=True)
)

# Identify the 10 top performing schools
top_10_schools = top_10_schools.iloc[:10]

print("Top 10 performing schools based on the combined SAT scores:")
top_10_schools

Top 10 performing schools based on the combined SAT scores:


|  | school_name | total_SAT |
|---|---|---|
| 0 | Stuyvesant High School | 2144 |
| 1 | Bronx High School of Science | 2041 |
| 2 | Staten Island Technical High School | 2041 |
| 3 | High School of American Studies at Lehman College | 2013 |
| 4 | Townsend Harris High School | 1981 |
| 5 | Queens High School for the Sciences at York College | 1947 |
| 6 | Bard High School Early College | 1914 |
| 7 | Brooklyn Technical High School | 1896 |
| 8 | Eleanor Roosevelt High School | 1889 |
| 9 | High School for Mathematics, Science, and Engineering at City College | 1889 |
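
The ranking above contains tied totals (two schools at 2041 and two at 1889). As an alternative to sorting and slicing, `DataFrame.nlargest` can make the handling of a tie at the cutoff explicit through its `keep` parameter. The sketch below uses hypothetical schools "A" to "D" and a top-2 cutoff purely for illustration.

```python
import pandas as pd

# Hypothetical totals; B and C are tied at the cutoff
toy = pd.DataFrame({
    "school_name": ["A", "B", "C", "D"],
    "total_SAT": [1500, 1400, 1400, 1300],
})

# Default keep="first": exactly two rows, the tie is broken by row order
top_2 = toy.nlargest(2, "total_SAT")

# keep="all": every row tied with the last place is retained
top_2_with_ties = toy.nlargest(2, "total_SAT", keep="all")

print(top_2["school_name"].tolist())
print(top_2_with_ties["school_name"].tolist())
```

Which behavior is appropriate depends on whether the answer must contain exactly 10 rows or every school tied for tenth place.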


**3. Locating the NYC borough with the largest standard deviation in SAT performance**

In [6]:
# Group schools by borough, then compute the school count, mean, and standard deviation of total_SAT
boroughs = (
    schools.groupby("borough", as_index=False)["total_SAT"]
    .agg(["count", "mean", "std"])
    .round(2)
)

# Rename columns
boroughs = boroughs.rename(columns={"count": "num_schools", "mean": "average_SAT", "std": "std_SAT"})

boroughs

|  | borough | num_schools | average_SAT | std_SAT |
|---|---|---|---|---|
| 0 | Bronx | 98 | 1202.72 | 150.39 |
| 1 | Brooklyn | 109 | 1230.26 | 154.87 |
| 2 | Manhattan | 89 | 1340.13 | 230.29 |
| 3 | Queens | 69 | 1345.48 | 195.25 |
| 4 | Staten Island | 10 | 1439.0 | 222.3 |
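
The aggregate-then-rename steps can also be written in one pass with pandas named aggregation. This is a sketch of an alternative, not the approach used in this notebook, and the boroughs "X" and "Y" with their scores are hypothetical. Note that pandas `std` computes the sample standard deviation (`ddof=1`) by default.

```python
import pandas as pd

# Hypothetical boroughs and totals, for illustration only
toy = pd.DataFrame({
    "borough": ["X", "X", "Y", "Y"],
    "total_SAT": [1000, 1200, 1300, 1300],
})

# Named aggregation: output column name = (input column, aggregation function)
summary = (
    toy.groupby("borough", as_index=False)
    .agg(
        num_schools=("total_SAT", "count"),
        average_SAT=("total_SAT", "mean"),
        std_SAT=("total_SAT", "std"),
    )
    .round(2)
)
print(summary)
```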


In [7]:
# Filter for the borough with the largest standard deviation
large_std_dev = boroughs[boroughs["std_SAT"] == boroughs["std_SAT"].max()]

print("NYC borough with the largest standard deviation in SAT performance:")
large_std_dev

NYC borough with the largest standard deviation in SAT performance:


|  | borough | num_schools | average_SAT | std_SAT |
|---|---|---|---|---|
| 2 | Manhattan | 89 | 1340.13 | 230.29 |
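
An equivalent way to pick this row is `idxmax`, which returns the index label of the maximum directly. The sketch below rebuilds the borough summary from the values printed earlier, so it runs on its own; the variable names are otherwise illustrative. Unlike the equality filter, `idxmax` returns only the first row if two boroughs ever shared the maximum.

```python
import pandas as pd

# Borough summary copied from the output table shown earlier
boroughs = pd.DataFrame({
    "borough": ["Bronx", "Brooklyn", "Manhattan", "Queens", "Staten Island"],
    "num_schools": [98, 109, 89, 69, 10],
    "average_SAT": [1202.72, 1230.26, 1340.13, 1345.48, 1439.0],
    "std_SAT": [150.39, 154.87, 230.29, 195.25, 222.3],
})

# Double brackets keep the result as a one-row DataFrame rather than a Series
largest_std_dev = boroughs.loc[[boroughs["std_SAT"].idxmax()]]
print(largest_std_dev)
```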

