# Key Insights

### *1. The average SAT score for each subject is:*
  * **Math** ➔ 432.94
  * **Reading** ➔ 424.50
  * **Writing** ➔ 418.46

### *2. The best schools across all three subjects (average of at least 640 in each) are:*

➔ High School of American Studies at Lehman College\
➔ Stuyvesant High School\
➔ Bronx High School of Science\
➔ Townsend Harris High School\
➔ Staten Island Technical High School

***
# Imports and Load Data

In [1]:
import pandas as pd
import matplotlib.pyplot as plt

In [2]:
schools = pd.read_csv("schools.csv")
schools.head()

Unnamed: 0,school_name,borough,building_code,average_math,average_reading,average_writing,percent_tested
0,"New Explorations into Science, Technology and ...",Manhattan,M022,657,601,601,
1,Essex Street Academy,Manhattan,M445,395,411,387,78.9
2,Lower Manhattan Arts Academy,Manhattan,M445,418,428,415,65.1
3,High School for Dual Language and Asian Studies,Manhattan,M445,613,453,463,95.9
4,Henry Street School for International Studies,Manhattan,M056,410,406,381,59.7


***
# Data Exploration

In [3]:
# Threshold: 80% of the maximum possible score per subject (800 * 0.8 = 640)
pct_score = 800 * 0.8

In [11]:
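# Column means; numeric_only=True skips text columns such as school_name.
# total_SAT appears in the output because this cell was run after the cell that creates it.
schools.mean(numeric_only=True)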

average_math        432.944000
average_reading     424.504000
average_writing     418.458667
percent_tested       64.976338
total_SAT          1275.906667
dtype: float64
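
The subject averages reported under Key Insights come from the three score columns only. Below is a minimal sketch of selecting those columns explicitly, so the result does not depend on which other columns (such as `total_SAT`) exist when the cell runs. The `sample` frame holds only the five preview rows from `schools.head()`, so its means differ from the full-dataset values above. `sample` and `score_cols` are names introduced here for illustration.

```python
import pandas as pd

# Five preview rows copied from schools.head(); NOT the full dataset.
sample = pd.DataFrame({
    "average_math": [657, 395, 418, 613, 410],
    "average_reading": [601, 411, 428, 453, 406],
    "average_writing": [601, 387, 415, 463, 381],
})

# Restrict the mean to the three subject columns and round for reporting.
score_cols = ["average_math", "average_reading", "average_writing"]
subject_means = sample[score_cols].mean().round(2)
print(subject_means)
```

On the real data, replacing `sample` with `schools` should give the three subject averages listed in the output above.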

In [4]:
# Best schools in math
best_math_schools = schools[schools["average_math"] >= pct_score][["school_name", "average_math"]].sort_values("average_math", ascending=False).head(10)
best_math_schools

Unnamed: 0,school_name,average_math
88,Stuyvesant High School,754
170,Bronx High School of Science,714
93,Staten Island Technical High School,711
365,Queens High School for the Sciences at York Co...,701
68,"High School for Mathematics, Science, and Engi...",683
280,Brooklyn Technical High School,682
333,Townsend Harris High School,680
174,High School of American Studies at Lehman College,669
0,"New Explorations into Science, Technology and ...",657
45,Eleanor Roosevelt High School,641


In [5]:
# Best schools in reading
best_reading_schools = schools[schools["average_reading"] >= pct_score][["school_name", "average_reading"]].sort_values("average_reading", ascending=False).head(10)
best_reading_schools

Unnamed: 0,school_name,average_reading
88,Stuyvesant High School,697
174,High School of American Studies at Lehman College,672
93,Staten Island Technical High School,660
170,Bronx High School of Science,660
5,Bard High School Early College,641
333,Townsend Harris High School,640


In [6]:
# Best schools in writing
best_writing_schools = schools[schools["average_writing"] >= pct_score][["school_name", "average_writing"]].sort_values("average_writing", ascending=False).head(10)
best_writing_schools

Unnamed: 0,school_name,average_writing
88,Stuyvesant High School,693
174,High School of American Studies at Lehman College,672
93,Staten Island Technical High School,670
170,Bronx High School of Science,667
333,Townsend Harris High School,661


In [7]:
common_schools = list(set(best_math_schools['school_name']).intersection(set(best_reading_schools['school_name']), set(best_writing_schools['school_name'])))
common_schools

['High School of American Studies at Lehman College',
 'Stuyvesant High School',
 'Bronx High School of Science',
 'Townsend Harris High School',
 'Staten Island Technical High School']
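
The intersection above is taken over lists that were already truncated by `head(10)`. That does not change the result here, since all five schools in the writing list also appear in the other two. Still, a single boolean mask over the three columns avoids relying on the truncation at all. The sketch below is one possible alternative, not the notebook's method. The rows are copied from the outputs shown above, and `sample`, `score_cols`, and `meets_all` are illustrative names.

```python
import pandas as pd

# Rows copied from the outputs above; a small subset, not the full dataset.
sample = pd.DataFrame({
    "school_name": [
        "Stuyvesant High School",
        "Bronx High School of Science",
        "Staten Island Technical High School",
        "Townsend Harris High School",
        "High School of American Studies at Lehman College",
        "Essex Street Academy",
        "Lower Manhattan Arts Academy",
    ],
    "average_math": [754, 714, 711, 680, 669, 395, 418],
    "average_reading": [697, 660, 660, 640, 672, 411, 428],
    "average_writing": [693, 667, 670, 661, 672, 387, 415],
})

pct_score = 800 * 0.8  # 640, same threshold as in the notebook
score_cols = ["average_math", "average_reading", "average_writing"]

# True only for rows where every subject average meets the threshold.
meets_all = (sample[score_cols] >= pct_score).all(axis=1)
common = sample.loc[meets_all, "school_name"].tolist()
print(common)
```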

In [8]:
# Calculate total_SAT per school
schools["total_SAT"] = schools[["average_math", "average_reading", "average_writing"]].sum(axis=1)

# Top 10 performing schools
top_10_schools = schools.groupby("school_name", as_index=False)["total_SAT"].mean().sort_values("total_SAT", ascending=False).head(10)
top_10_schools

Unnamed: 0,school_name,total_SAT
325,Stuyvesant High School,2144.0
324,Staten Island Technical High School,2041.0
55,Bronx High School of Science,2041.0
188,High School of American Studies at Lehman College,2013.0
334,Townsend Harris High School,1981.0
293,Queens High School for the Sciences at York Co...,1947.0
30,Bard High School Early College,1914.0
83,Brooklyn Technical High School,1896.0
121,Eleanor Roosevelt High School,1889.0
180,"High School for Mathematics, Science, and Engi...",1889.0
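
If every school name occurs only once (an assumption that is not verified here), the `groupby(...).mean()` step leaves the totals unchanged, and `DataFrame.nlargest` may be a shorter way to get the same ranking. Below is a sketch on a few rows copied from the outputs above. `sample` and `top_3` are illustrative names.

```python
import pandas as pd

# Rows copied from the outputs above; a small subset, not the full dataset.
sample = pd.DataFrame({
    "school_name": [
        "Essex Street Academy",
        "Bronx High School of Science",
        "Stuyvesant High School",
        "Staten Island Technical High School",
        "Townsend Harris High School",
    ],
    "average_math": [395, 714, 754, 711, 680],
    "average_reading": [411, 660, 697, 660, 640],
    "average_writing": [387, 667, 693, 670, 661],
})

# Same total_SAT definition as in the notebook: the row-wise sum of the three averages.
score_cols = ["average_math", "average_reading", "average_writing"]
sample["total_SAT"] = sample[score_cols].sum(axis=1)

# nlargest sorts by total_SAT in descending order and keeps the first n rows.
top_3 = sample.nlargest(3, "total_SAT")[["school_name", "total_SAT"]]
print(top_3)
```

With the default `keep="first"`, tied totals (here the two schools at 2041) stay in their original row order.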


In [9]:
# Which NYC borough has the highest standard deviation for total_SAT?
boroughs = schools.groupby("borough")["total_SAT"].agg(["count", "mean", "std"]).round(2)

# Filter for max std and reset index so borough is a column
largest_std_dev = boroughs[boroughs["std"] == boroughs["std"].max()].reset_index()

# Rename the columns for clarity
largest_std_dev = largest_std_dev.rename(columns={"count": "num_schools", "mean": "average_SAT", "std": "std_SAT"})
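
The same selection can also be written with `idxmax`, which returns the index label of the largest value. Below is a minimal sketch on hypothetical data: the borough names and totals in `toy` are made up for illustration and are not values from `schools.csv`.

```python
import pandas as pd

# Hypothetical totals for two made-up boroughs; NOT values from schools.csv.
toy = pd.DataFrame({
    "borough": ["Borough A", "Borough A", "Borough A", "Borough B", "Borough B"],
    "total_SAT": [1000, 1200, 1400, 1100, 1150],
})

# Same aggregation as in the notebook: count, mean, and sample std per borough.
stats = toy.groupby("borough")["total_SAT"].agg(["count", "mean", "std"]).round(2)

# idxmax gives the borough label with the largest std;
# the double brackets keep the result as a one-row DataFrame.
largest = stats.loc[[stats["std"].idxmax()]].rename(
    columns={"count": "num_schools", "mean": "average_SAT", "std": "std_SAT"}
)
print(largest)
```

One difference from the equality filter: `idxmax` returns only the first label when several boroughs tie for the maximum, whereas comparing against `max()` keeps every tied row.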