![New York City schoolbus](schoolbus.jpg)

Photo by [Jannis Lucas](https://unsplash.com/@jannis_lucas) on [Unsplash](https://unsplash.com).
<br>

This project, proposed by **DataCamp**, aims to analyze the performance of New York public schools on the SAT, a standardized test taken annually by high school students in the United States. The test assesses skills in reading, math, and writing, playing a crucial role in the college admissions process.

Analyzing SAT performance is essential for various stakeholders, including education policymakers, researchers, government agencies, and parents seeking the best schools for their children.

Using the `schools.csv` dataset provided by **DataCamp**, this project explores and answers three key questions about the performance of New York public schools on the SAT. It applies data analysis techniques to identify performance patterns, influencing factors, and potential educational inequalities across the city.

In [1]:
import pandas as pd

schools = pd.read_csv("schools.csv")

schools.head()

| | school_name | borough | building_code | average_math | average_reading | average_writing | percent_tested |
|---|---|---|---|---|---|---|---|
| 0 | New Explorations into Science, Technology and ... | Manhattan | M022 | 657 | 601 | 601 | NaN |
| 1 | Essex Street Academy | Manhattan | M445 | 395 | 411 | 387 | 78.9 |
| 2 | Lower Manhattan Arts Academy | Manhattan | M445 | 418 | 428 | 415 | 65.1 |
| 3 | High School for Dual Language and Asian Studies | Manhattan | M445 | 613 | 453 | 463 | 95.9 |
| 4 | Henry Street School for International Studies | Manhattan | M056 | 410 | 406 | 381 | 59.7 |
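
The first row of the preview has no `percent_tested` value, so it can be useful to count missing values per column before going further. The following is a minimal sketch on a small made-up frame standing in for `schools`; the school names and values are illustrative assumptions, not rows from `schools.csv`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for `schools`; names and values are illustrative only.
sample = pd.DataFrame(
    {
        "school_name": ["School A", "School B", "School C"],
        "average_math": [657, 395, 418],
        "percent_tested": [np.nan, 78.9, 65.1],
    }
)

# Count the missing values in each column.
missing_counts = sample.isna().sum()
print(missing_counts)
```

Running the same `isna().sum()` call on `schools` would show which columns are affected in the real dataset.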


## 1. Which NYC schools have the best math results?

Here, the best math results are defined as an average math score of at least 80% of the *maximum possible score of 800*, that is, 640 or higher.

In [163]:
# Minimum average math score that counts as a "best" result (80% of 800)
threshold = 0.8 * 800

best_math_schools = schools[schools["average_math"] >= threshold]
best_math_schools = best_math_schools[["school_name", "average_math"]].sort_values(by="average_math", ascending=False)
print(best_math_schools)



                                                    average_math
school_name                                                     
Stuyvesant High School                                       754
Bronx High School of Science                                 714
Staten Island Technical High School                          711
Queens High School for the Sciences at York Col...           701
High School for Mathematics, Science, and Engin...           683
Brooklyn Technical High School                               682
Townsend Harris High School                                  680
High School of American Studies at Lehman College            669
New Explorations into Science, Technology and M...           657
Eleanor Roosevelt High School                                641


## 2. What are the top 10 performing schools based on the combined SAT scores?

In [164]:
# Store the combined score on `schools` itself so that later cells can reuse it
schools["total_SAT"] = schools["average_math"] + schools["average_reading"] + schools["average_writing"]
top_10_schools = schools[["school_name", "total_SAT"]].sort_values(by="total_SAT", ascending=False).head(10)
print(top_10_schools)


                                           school_name  total_SAT
88                              Stuyvesant High School       2144
170                       Bronx High School of Science       2041
93                 Staten Island Technical High School       2041
174  High School of American Studies at Lehman College       2013
333                        Townsend Harris High School       1981
365  Queens High School for the Sciences at York Co...       1947
5                       Bard High School Early College       1914
280                     Brooklyn Technical High School       1896
45                       Eleanor Roosevelt High School       1889
68   High School for Mathematics, Science, and Engi...       1889
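
The last two schools in this ranking share a combined score of 1889. If another school had the same score, `head(10)` would cut it off without warning. The sketch below, built on made-up scores rather than the real dataset, shows how `DataFrame.nlargest` with `keep="all"` keeps every school tied at the cutoff.

```python
import pandas as pd

# Hypothetical scores, chosen so that two schools tie for the last place.
scores = pd.DataFrame(
    {
        "school_name": ["A", "B", "C", "D"],
        "total_SAT": [2100, 2041, 1889, 1889],
    }
)

# Sorting and taking head(3) drops one of the two tied schools.
top_3_sorted = scores.sort_values("total_SAT", ascending=False).head(3)

# nlargest with keep="all" returns every school tied at the cutoff score.
top_3_with_ties = scores.nlargest(3, "total_SAT", keep="all")

print(len(top_3_sorted), len(top_3_with_ties))
```

Whether ties should extend the list past ten rows is a judgment call; the question as posed asks for exactly ten schools, which `head(10)` delivers.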


## 3. Which single borough has the largest standard deviation in the combined SAT score?

In [165]:
# Per-borough summary of the combined score (requires the total_SAT column created above)
borough_stats = (
    schools.groupby("borough")["total_SAT"]
    .agg(num_schools="count", average_SAT="mean", std_SAT="std")
    .round(2)
)
# Sort by standard deviation so the first row really is the borough with the largest spread
largest_std_dev = borough_stats.sort_values("std_SAT", ascending=False).head(1)
largest_std_dev

| borough | num_schools | average_SAT | std_SAT |
|---|---|---|---|
| Manhattan | 89 | 1340.13 | 230.29 |
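
To make the comparison behind this result concrete, here is a small sketch of a per-borough standard deviation on made-up data. The borough labels `X` and `Y` and their scores are assumptions for illustration only. Note that pandas computes the sample standard deviation (`ddof=1`) by default.

```python
import pandas as pd

# Hypothetical data: two boroughs with made-up combined scores.
toy = pd.DataFrame(
    {
        "borough": ["X", "X", "X", "Y", "Y", "Y"],
        "total_SAT": [1000, 1200, 1400, 1190, 1200, 1210],
    }
)

# Per-borough standard deviation (sample formula, ddof=1, is the pandas default).
std_by_borough = toy.groupby("borough")["total_SAT"].std()

# Label of the borough whose scores are most spread out.
most_varied = std_by_borough.idxmax()
print(most_varied, round(std_by_borough[most_varied], 2))  # X 200.0
```

Both toy boroughs have the same mean (1200), yet `X` has a much larger spread, which is the kind of difference the standard deviation is meant to surface.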
