![New York City schoolbus](schoolbus.jpg)

Photo by [Jannis Lucas](https://unsplash.com/@jannis_lucas) on [Unsplash](https://unsplash.com).
<br>

Every year, American high school students take the SAT, a standardized test intended to measure literacy, numeracy, and writing skills. It has three sections (reading, math, and writing), each with a maximum score of 800 points. The test matters to both students and colleges because scores play a pivotal role in the admissions process.

Analyzing school performance matters to a range of stakeholders, including policy and education professionals, researchers, government, and parents deciding which school their children should attend.

You have been provided with a dataset called `schools.csv`, which is previewed below.

You have been tasked with answering three key questions about New York City (NYC) public school SAT performance.

In [52]:
# Re-run this cell 
import pandas as pd

# Read in the data
schools = pd.read_csv("schools.csv")

# Preview the data
schools.head()


,school_name,borough,building_code,average_math,average_reading,average_writing,percent_tested
0,"New Explorations into Science, Technology and ...",Manhattan,M022,657,601,601,NaN
1,Essex Street Academy,Manhattan,M445,395,411,387,78.9
2,Lower Manhattan Arts Academy,Manhattan,M445,418,428,415,65.1
3,High School for Dual Language and Asian Studies,Manhattan,M445,613,453,463,95.9
4,Henry Street School for International Studies,Manhattan,M056,410,406,381,59.7


In [53]:
# Explore Data
print(schools.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 375 entries, 0 to 374
Data columns (total 7 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   school_name      375 non-null    object 
 1   borough          375 non-null    object 
 2   building_code    375 non-null    object 
 3   average_math     375 non-null    int64  
 4   average_reading  375 non-null    int64  
 5   average_writing  375 non-null    int64  
 6   percent_tested   355 non-null    float64
dtypes: float64(1), int64(3), object(3)
memory usage: 20.6+ KB
None
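
The `info()` output above reports 355 non-null values out of 375 for `percent_tested`, so 20 schools have no value in that column. A minimal sketch of how such gaps could be counted directly, using a small made-up frame (the school names and values here are illustrative placeholders, not rows from `schools.csv`):

```python
import pandas as pd

# Hypothetical toy data: one school is missing its percent_tested value
toy = pd.DataFrame({
    "school_name": ["School A", "School B", "School C"],
    "percent_tested": [None, 78.9, 65.1],
})

# Number of missing values in each column
missing_per_column = toy.isna().sum()

# Share of rows with a missing percent_tested value
missing_share = toy["percent_tested"].isna().mean()

print(missing_per_column)
print(f"Share missing: {missing_share:.1%}")
```

On the real data, the same two calls on `schools` should agree with the non-null counts that `info()` reports.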


In [54]:
print(schools.describe())

       average_math  average_reading  average_writing  percent_tested
count    375.000000       375.000000       375.000000      355.000000
mean     432.944000       424.504000       418.458667       64.976338
std       71.952373        61.881069        64.548599       18.747634
min      317.000000       302.000000       284.000000       18.500000
25%      386.000000       386.000000       382.000000       50.950000
50%      415.000000       413.000000       403.000000       64.800000
75%      458.500000       445.000000       437.500000       79.600000
max      754.000000       697.000000       693.000000      100.000000


Question 1: Finding schools with the best math scores

In [55]:
# Find the schools whose average math score is at least 80% of the maximum score

# Define the cutoff at 80% of the maximum section score (800 points)
best_math_score_cut_off = 800 * 0.8

# Select schools at or above the cutoff
best_math_schools = schools[schools['average_math']>=best_math_score_cut_off][['school_name','average_math']].copy()

# Sort dataset descending
best_math_schools.sort_values(by='average_math', ascending=False, inplace=True)

print(best_math_schools.head(30))

                                           school_name  average_math
88                              Stuyvesant High School           754
170                       Bronx High School of Science           714
93                 Staten Island Technical High School           711
365  Queens High School for the Sciences at York Co...           701
68   High School for Mathematics, Science, and Engi...           683
280                     Brooklyn Technical High School           682
333                        Townsend Harris High School           680
174  High School of American Studies at Lehman College           669
0    New Explorations into Science, Technology and ...           657
45                       Eleanor Roosevelt High School           641
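
For reference, 80% of the 800-point section maximum is 640, so every school listed above averages at least 640 in math. The same selection can also be written with `.loc` and a non-inplace sort. The sketch below runs on a three-row toy frame whose names and scores are copied from the outputs above; it is an alternative phrasing under those assumptions, not a replacement for the cell above.

```python
import pandas as pd

MAX_SECTION_SCORE = 800           # maximum score per SAT section
cutoff = 0.8 * MAX_SECTION_SCORE  # 80% of the maximum

# Toy frame: three schools taken from the outputs shown in this notebook
toy = pd.DataFrame({
    "school_name": [
        "Stuyvesant High School",
        "Essex Street Academy",
        "Eleanor Roosevelt High School",
    ],
    "average_math": [754, 395, 641],
})

# Filter rows and pick columns in one .loc call, then sort descending
best = (
    toy.loc[toy["average_math"] >= cutoff, ["school_name", "average_math"]]
    .sort_values("average_math", ascending=False)
)

print(cutoff)
print(best)
```

Selecting rows and columns in a single `.loc` call avoids chained indexing, so the explicit `.copy()` is not needed before sorting.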


Question 2: Identifying the top 10 performing schools

In [56]:

# Calculate total_SAT per school
schools["total_SAT"] = schools["average_math"] + schools["average_reading"] + schools["average_writing"]

# Who are the top 10 performing schools?
top_10_schools = (
    schools.groupby("school_name", as_index=False)["total_SAT"]
    .mean()
    .sort_values("total_SAT", ascending=False)
    .head(10)
)

# Show top 10 schools
top_10_schools

,school_name,total_SAT
325,Stuyvesant High School,2144.0
324,Staten Island Technical High School,2041.0
55,Bronx High School of Science,2041.0
188,High School of American Studies at Lehman College,2013.0
334,Townsend Harris High School,1981.0
293,Queens High School for the Sciences at York Co...,1947.0
30,Bard High School Early College,1914.0
83,Brooklyn Technical High School,1896.0
121,Eleanor Roosevelt High School,1889.0
180,"High School for Mathematics, Science, and Engi...",1889.0
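
Several schools in the list above share a total (for example, two schools at 2041). Sorting and taking `head(10)` always returns exactly ten rows, so a tie that straddles the cutoff would be cut arbitrarily. One possible way to handle this, sketched on hypothetical data, is `DataFrame.nlargest` with `keep="all"`, which keeps every row tied with the last selected value. The school names below are placeholders.

```python
import pandas as pd

# Hypothetical toy data with a tie at the cutoff position
toy = pd.DataFrame({
    "school_name": ["School A", "School B", "School C", "School D"],
    "total_SAT": [2144, 2041, 2041, 1900],
})

# Default behaviour: exactly n rows, ties broken by order of appearance
top_first = toy.nlargest(2, "total_SAT")

# keep="all": may return more than n rows when the last place is tied
top_all = toy.nlargest(2, "total_SAT", keep="all")

print(len(top_first), len(top_all))
```

Whether tied schools should be included is a reporting decision. If the task requires exactly ten rows, the `head(10)` approach remains appropriate.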


Question 3: Locating the NYC borough with the largest standard deviation in SAT performance

In [57]:
# Compute the number of schools, mean, and standard deviation of total SAT by borough
largest_std_dev=schools.groupby('borough').agg(num_schools=('school_name','size'),
                                              average_SAT=('total_SAT','mean'),
                                              std_SAT=('total_SAT','std'))

# Round numeric values
largest_std_dev['average_SAT']=largest_std_dev['average_SAT'].round(2)
largest_std_dev['std_SAT']=largest_std_dev['std_SAT'].round(2)

# Sort by Standard Dev
largest_std_dev.sort_values(by='std_SAT', ascending=False, inplace=True)

# Select borough with largest std
largest_std_dev=largest_std_dev.head(1)

# show dataset
largest_std_dev

borough,num_schools,average_SAT,std_SAT
Manhattan,89,1340.13,230.29
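
Note that pandas computes the sample standard deviation (`ddof=1`) by default, so the `std_SAT` figure above divides by n - 1 rather than n. The sketch below illustrates the difference on a made-up two-borough frame (the borough labels "A" and "B" and all scores are hypothetical), and uses `idxmax` as one possible way to pick the borough with the largest spread.

```python
import pandas as pd

# Hypothetical toy data: borough "A" is more spread out than borough "B"
toy = pd.DataFrame({
    "borough": ["A", "A", "A", "B", "B", "B"],
    "total_SAT": [1200, 1300, 1400, 1290, 1300, 1310],
})

grouped = toy.groupby("borough")["total_SAT"]

sample_std = grouped.std()            # pandas default, ddof=1
population_std = grouped.std(ddof=0)  # divide by n instead of n - 1

# Label of the borough with the largest sample standard deviation
widest = sample_std.idxmax()

print(sample_std)
print(population_std)
print(widest)
```

With 89 schools in a group, the two definitions differ only slightly, but it helps to state which one is reported.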
