# Day 3 – SQL via Python: NYC School Data Exploration
---

## Objective

Use Python and SQL together to explore real-world education data.<br>
Connect to a PostgreSQL database, run SQL queries, analyze school-level data, and present the insights in this notebook.

---
## Install Libraries

In [None]:
#Installs required Python packages for data analysis, database abstraction, and PostgreSQL connectivity within the Jupyter Notebook

%pip install pandas sqlalchemy psycopg2-binary 


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m25.0.1[0m[39;49m -> [0m[32;49m25.3[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip3 install --upgrade pip[0m
Note: you may need to restart the kernel to use updated packages.


---
## Import Libraries

In [4]:
#Imports pandas for data manipulation and sqlalchemy for database connections and SQL operations

import pandas as pd 
import sqlalchemy as sa 

---
## Database Connection

1. Create an engine
2. Open a connection

In [5]:
#Defines the database connection URL for accessing a PostgreSQL database with SSL enabled

db_url = 'postgresql://neondb_owner:a9Am7Yy5r9_T7h4OF2GN@ep-falling-glitter-a5m0j5gk-pooler.us-east-2.aws.neon.tech:5432/neondb?sslmode=require'
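The connection URL above embeds the database password directly in the notebook. A minimal sketch of one alternative, assuming the URL is stored in an environment variable instead. The variable name `NYC_SCHOOLS_DB_URL` and the helper `get_db_url` are hypothetical and not part of the original notebook; the URL below is a placeholder.

```python
import os


def get_db_url(var_name: str = "NYC_SCHOOLS_DB_URL") -> str:
    """Return the database URL stored in the given environment variable."""
    url = os.environ.get(var_name)
    if not url:
        raise RuntimeError(
            f"Set the {var_name} environment variable to the PostgreSQL connection URL"
        )
    return url


# Placeholder value for illustration only; in practice the variable is set
# outside the notebook (shell profile, .env file, secrets manager).
os.environ["NYC_SCHOOLS_DB_URL"] = (
    "postgresql://user:password@host:5432/dbname?sslmode=require"
)
db_url = get_db_url()
```

The rest of the notebook could then pass `db_url` to `sa.create_engine` unchanged.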

In [6]:
#Sets up the database infrastructure by initializing the engine and establishing a connection
 
engine = sa.create_engine(db_url) #Creates a SQLAlchemy engine using the specified database connection URL
connection = engine.connect() #Opens a connection to the database via the created engine

In [7]:
#Loads multiple tables from the database into pandas DataFrames for further analysis

directory = pd.read_sql("SELECT * FROM nyc_schools.high_school_directory", engine) #Reads the high school directory table from the database into a DataFrame "directory"
demographics = pd.read_sql("SELECT * FROM nyc_schools.school_demographics", engine) #Reads the school demographics table from the database into a DataFrame "demographics"
safety_report = pd.read_sql("SELECT * FROM nyc_schools.school_safety_report", engine) #Reads the school safety report table from the database into a DataFrame "safety_report"

---
## Initial Data Inspection

### Table: directory

In [None]:
#Displays the first five rows of the directory DataFrame to quickly inspect its structure and contents

directory.head() 

Unnamed: 0,dbn,school_name,borough,building_code,phone_number,fax_number,grade_span_min,grade_span_max,expgrade_span_min,expgrade_span_max,start_time,end_time,priority01,priority02,priority03,priority04,priority05,priority06,priority07,priority08,priority09,priority10,location,phone_number2,school_email,website,subway,bus,grades2018,finalgrades,total_students,extracurricular_activities,school_sports,attendance_rate,pct_stu_enough_variety,pct_stu_safe,school_accessibility_description,directions1,requirement1,requirement2,requirement3,requirement4,requirement5,program1,code1,interest1,method1,seats9ge1,grade9gefilledflag1,grade9geapplicants1,seats9swd1,grade9swdfilledflag1,grade9swdapplicants1,campus_name,building_borough,building_location,latitude,longitude,community_board,council_district,census_tract,bin,bbl,nta,zip_codes,community_districts,borough_boundaries,city_council_districts,police_precincts,primary_address_line_1,city,state_code,postcode,school_type,overview_paragraph,program_highlights,language_classes,advancedplacement_courses,online_ap_courses,online_language_courses,psal_sports_boys,psal_sports_girls,psal_sports_coed,partner_cbo,partner_hospital,partner_highered,partner_cultural,partner_nonprofit,partner_corporate,partner_financial,partner_other,addtl_info1,addtl_info2,se_services,ell_programs,number_programs,Location 1,Community Board,Council District,Census Tract,Zip Codes,Community Districts,Borough Boundaries,City Council Districts,Police Precincts
0,27Q260,Frederick Douglass Academy VI High School,Queens,Q465,718-471-2154,718-471-2890,9.0,12,,,7:45 AM,2:05 PM,Priority to Queens students or residents who a...,Then to New York City residents who attend an ...,Then to Queens students or residents,Then to New York City residents,,,,,,,,,,http://schools.nyc.gov/schoolportals/27/Q260,A to Beach 25th St-Wavecrest,"Q113, Q22",,,412.0,"After-school Program, Book, Writing, Homework ...","Step Team, Modern Dance, Hip Hop Dance",,,,Not Functionally Accessible,,,,,,,,,,,,,,,,,Far Rockaway Educational Campus,,,,,,,,4300730.0,4157360000.0,Far Rockaway-Bayswater ...,,,,,,8-21 Bay 25 Street,Far Rockaway,NY,11691,,Frederick Douglass Academy (FDA) VI High Schoo...,"Advisory, Graphic Arts Design, Teaching Intern...",Spanish,"Calculus AB, English Language and Composition,...","Biology, Physics B","French, Spanish","Basketball, Cross Country, Indoor Track, Outdo...","Basketball, Cross Country, Indoor Track, Outdo...",,,"Jamaica Hospital Medical Center, Peninsula Hos...","York College, Brooklyn College, St. John's Col...",,"Queens District Attorney, Sports and Arts Foun...","Replications, Inc.",Citibank,New York Road Runners Foundation (NYRRF),"Uniform Required: plain white collared shirt, ...","Extended Day Program, Student Summer Orientati...",This school will provide students with disabil...,ESL,1,"{'latitude': '40.601989336', 'longitude': '-73...",14,31,100802,20529,51,3,47,59
1,21K559,Life Academy High School for Film and Music,Brooklyn,K400,718-333-7750,718-333-7775,9.0,12,,,8:15 AM,3:00 PM,Priority to New York City residents who attend...,Then to New York City residents,,,,,,,,,,,,http://schools.nyc.gov/schoolportals/21/K559,D to 25th Ave ; N to Ave U ; N to Gravesend - ...,"B1, B3, B4, B6, B64, B82",,,260.0,"Film, Music, Talent Show, Holiday Concert, Stu...",,,,,Functionally Accessible,,,,,,,,,,,,,,,,,Lafayette Educational Campus,,,,,,,,3186454.0,3068830000.0,Gravesend ...,,,,,,2630 Benson Avenue,Brooklyn,NY,11214,,At Life Academy High School for Film and Music...,"College Now, iLEARN courses, Art and Film Prod...",Spanish,,"Biology, English Literature and Composition, E...",,"Basketball, Bowling, Indoor Track, Soccer, Sof...","Basketball, Bowling, Indoor Track, Soccer, Sof...",Cricket,Coney Island Generation Gap,,"City Tech, Kingsborough Early College Secondar...","Museum of the Moving Image, New York Public Li...",Institute for Student Achievement,"Film Life, Inc., SONY Wonder Tech",,,Our school requires completion of a Common Cor...,,This school will provide students with disabil...,ESL,1,"{'latitude': '40.593593811', 'longitude': '-73...",13,47,306,17616,21,2,45,35
2,16K393,Frederick Douglass Academy IV Secondary School,Brooklyn,K026,718-574-2820,718-574-2821,9.0,12,,,8:00 AM,2:20 PM,Priority to continuing 8th graders,Then to Brooklyn students or residents who att...,Then to New York City residents who attend an ...,Then to Brooklyn students or residents,Then to New York City residents,,,,,,,,,http://schools.nyc.gov/schoolportals/16/K393,"J to Kosciusko St ; M, Z to Myrtle Ave","B15, B38, B46, B47, B52, B54, Q24",,,155.0,"After-school and Saturday Programs, Art Studio...",Basketball Team,,,,Not Functionally Accessible,,,,,,,,,,,,,,,,,,,,,,,,,3393805.0,3016160000.0,Stuyvesant Heights ...,,,,,,1014 Lafayette Avenue,Brooklyn,NY,11221,,The Frederick Douglass Academy IV (FDA IV) Sec...,"College Now with Medgar Evers College, Fresh P...","French, Spanish","English Language and Composition, United State...",French Language and Culture,,,,,Achieving Change in our Neighborhood (Teen ACT...,,Medgar Evers College,Noel Pointer School of Music,"Hip-Hop 4 Life, Urban Arts, and St. Nicks Alli...",,,,"Dress Code Required: solid white shirt/blouse,...","Student Summer Orientation, Weekend Program of...",This school will provide students with disabil...,ESL,1,"{'latitude': '40.692133704', 'longitude': '-73...",3,36,291,18181,69,2,49,52
3,08X305,Pablo Neruda Academy,Bronx,X450,718-824-1682,718-824-1663,9.0,12,,,8:00 AM,3:50 PM,Priority to Bronx students or residents who at...,Then to New York City residents who attend an ...,Then to Bronx students or residents,Then to New York City residents,,,,,,,,,,www.pablonerudaacademy.org,,"Bx22, Bx27, Bx36, Bx39, Bx5",,,335.0,"Youth Court, Student Government, Youth Service...","Baseball, Basketball, Flag Football, Soccer, S...",,,,Functionally Accessible,,,,,,,,,,,,,,,,,Adlai E. Stevenson Educational Campus,,,,,,,,2022205.0,2036040000.0,Soundview-Castle Hill-Clason Point-Harding Par...,,,,,,1980 Lafayette Avenue,Bronx,NY,10473,,"Our mission is to engage, inspire, and educate...","Advanced Placement courses, Electives courses ...",Spanish,"Art History, English Language and Composition,...",,Spanish,"Basketball, Outdoor Track, Softball, Tennis, V...","Basketball, Outdoor Track, Softball, Tennis, V...",,,"Soundview Health Center, Bronx Lebanon Hospita...","Hostos Community College, Monroe College, Lehm...","Chilean Consulate, Materials for the Arts","Network for Teaching Entrepreneurship (NFTE), ...",,,iLearnNYC,All students are individually programmed (base...,Extended Day Program,This school will provide students with disabil...,ESL,1,"{'latitude': '40.822303765', 'longitude': '-73...",9,18,16,11611,58,5,31,26
4,03M485,Fiorello H. LaGuardia High School of Music & A...,Manhattan,M485,212-496-0700,212-724-5748,9.0,12,,,8:00 AM,4:00 PM,Open to New York City residents,Admission is based on the outcome of a competi...,Students must audition for each program (studi...,Students must be residents of New York City at...,,,,,,,,,,www.laguardiahs.org,"1 to 66th St - Lincoln Center ; 2, 3 to 72nd S...","M10, M104, M11, M20, M31, M5, M57, M66, M7, M72",,,2730.0,"Amnesty International, Anime, Annual Musical, ...",,,,,Functionally Accessible,,,,,,,,,,,,,,,,,,,,,,,,,1030341.0,1011560000.0,Lincoln Square ...,,,,,,100 Amsterdam Avenue,New York,NY,10023,Specialized School,We enjoy an international reputation as the fi...,Students have a daily program that includes bo...,"French, Italian, Japanese, Spanish","Art History, Biology, Calculus AB, Calculus BC...",,Spanish,"Basketball, Bowling, Cross Country, Fencing, G...","Basketball, Bowling, Cross Country, Fencing, G...",,Lincoln Center for the Performing Arts,Mount Sinai Medical Center,The Cooper Union for the Advancement of Scienc...,"Lincoln Center for the Performing Arts, Americ...","Junior Achievement, Red Cross, United Nations ...","Sony Music, Warner Music Group, Capital Cities...",,,Chancellor’s Arts Endorsed Diploma,,This school will provide students with disabil...,ESL,6,"{'latitude': '40.773670507', 'longitude': '-73...",7,6,151,12420,20,4,19,12


In [None]:
#Displays a concise summary of the directory DataFrame, including column names, data types, and non-null counts

directory.info() 

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 435 entries, 0 to 434
Columns: 105 entries, dbn to Police Precincts
dtypes: float64(6), int64(1), object(98)
memory usage: 357.0+ KB


In [None]:
#Generates descriptive statistics for the numeric columns in the directory DataFrame

directory.describe() 

Unnamed: 0,grade_span_min,grade_span_max,expgrade_span_min,expgrade_span_max,total_students,bin,bbl
count,432.0,435.0,31.0,33.0,426.0,431.0,431.0
mean,8.44213,11.894253,8.516129,12.363636,703.842723,2581724.0,2510921000.0
std,1.164591,0.421583,1.121635,0.783349,775.870436,1189750.0,1136455000.0
min,6.0,9.0,6.0,12.0,50.0,1000811.0,1000160000.0
25%,9.0,12.0,9.0,12.0,349.0,2000992.0,2023060000.0
50%,9.0,12.0,9.0,12.0,460.5,2116159.0,2053680000.0
75%,9.0,12.0,9.0,12.0,622.0,3330710.0,3068830000.0
max,9.0,12.0,9.0,14.0,5458.0,5149609.0,5066130000.0


In [None]:
#Calculates the number of missing (null) values in each column of the directory DataFrame

directory.isnull().sum() 

dbn                       0
school_name               0
borough                   0
building_code             0
phone_number              0
                         ..
Zip Codes                 0
Community Districts       0
Borough Boundaries        0
City Council Districts    0
Police Precincts          0
Length: 105, dtype: int64

In [None]:
#Counts the number of duplicate rows in the directory DataFrame

directory.duplicated().sum() 

np.int64(0)

---
### Table: demographics

In [7]:
#Adjusts display settings to show all columns, then inspects the demographics DataFrame

pd.set_option('display.max_columns', None) #Configures pandas to display all columns instead of truncating wide DataFrames
demographics #Displays the demographics DataFrame; pandas abbreviates the middle rows with "..."

Unnamed: 0,dbn,Name,schoolyear,fl_percent,frl_percent,total_enrollment,prek,k,grade1,grade2,grade3,grade4,grade5,grade6,grade7,grade8,grade9,grade10,grade11,grade12,ell_num,ell_percent,sped_num,sped_percent,ctt_num,selfcontained_num,asian_num,asian_per,black_num,black_per,hispanic_num,hispanic_per,white_num,white_per,male_num,male_per,female_num,female_per
0,01M015,P.S. 015 ROBERTO CLEMENTE,20052006,89.4,,281,15,36,40,33,38,52,29,38,,,,,,,36.0,12.8,57,20.3,25.0,9.0,10,3.6,74,26.3,189,67.3,5,1.8,158,56.2,123,43.8
1,01M015,P.S. 015 ROBERTO CLEMENTE,20062007,89.4,,243,15,29,39,38,34,42,46,,,,,,,,38.0,15.6,55,22.6,19.0,15.0,18,7.4,68,28.0,153,63.0,4,1.6,140,57.6,103,42.4
2,01M015,P.S. 015 ROBERTO CLEMENTE,20072008,89.4,,261,18,43,39,36,38,47,40,,,,,,,,52.0,19.9,60,23.0,20.0,14.0,16,6.1,77,29.5,157,60.2,7,2.7,143,54.8,118,45.2
3,01M015,P.S. 015 ROBERTO CLEMENTE,20082009,89.4,,252,17,37,44,32,34,39,49,,,,,,,,48.0,19.0,62,24.6,21.0,17.0,16,6.3,75,29.8,149,59.1,7,2.8,149,59.1,103,40.9
4,01M015,P.S. 015 ROBERTO CLEMENTE,20092010,,96.5,208,16,40,28,32,30,24,38,,,,,,,,40.0,19.2,46,22.1,14.0,14.0,16,7.7,67,32.2,118,56.7,6,2.9,124,59.6,84,40.4
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
238,01M064,PS 064 ROBERT SIMON,20072008,79.3,,295,26,48,49,47,36,46,43,,,,,,,,17.0,5.8,68,23.1,37.0,11.0,12,4.1,70,23.7,206,69.8,5,1.7,149,50.5,146,49.5
239,01M064,PS 064 ROBERT SIMON,20082009,79.3,,300,26,44,48,53,47,38,44,,,,,,,,15.0,5.0,73,24.3,43.0,22.0,17,5.7,72,24.0,200,66.7,5,1.7,162,54.0,138,46.0
240,01M064,PS 064 ROBERT SIMON,20092010,,90.5,292,26,46,43,44,51,50,32,,,,,,,,13.0,4.5,88,30.1,49.0,24.0,17,5.8,67,22.9,192,65.8,9,3.1,152,52.1,140,47.9
241,01M064,PS 064 ROBERT SIMON,20102011,,90.5,322,32,53,47,45,40,52,53,,,,,,,,21.0,6.5,98,30.4,64.0,18.0,22,6.8,79,24.5,200,62.1,14,4.3,179,55.6,143,44.4


In [None]:
#Displays a concise summary of the demographics DataFrame, including column names, data types, and non-null counts

demographics.info() 

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 243 entries, 0 to 242
Data columns (total 38 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   dbn                243 non-null    object 
 1   Name               243 non-null    object 
 2   schoolyear         243 non-null    int64  
 3   fl_percent         243 non-null    object 
 4   frl_percent        104 non-null    float64
 5   total_enrollment   243 non-null    int64  
 6   prek               243 non-null    object 
 7   k                  243 non-null    object 
 8   grade1             243 non-null    object 
 9   grade2             243 non-null    object 
 10  grade3             243 non-null    object 
 11  grade4             243 non-null    object 
 12  grade5             243 non-null    object 
 13  grade6             243 non-null    object 
 14  grade7             243 non-null    object 
 15  grade8             243 non-null    object 
 16  grade9             243 non

In [None]:
#Generates descriptive statistics for the numeric columns in the demographics DataFrame

demographics.describe() 

Unnamed: 0,schoolyear,frl_percent,total_enrollment,ell_num,ell_percent,sped_num,sped_percent,ctt_num,selfcontained_num,asian_num,...,black_num,black_per,hispanic_num,hispanic_per,white_num,white_per,male_num,male_per,female_num,female_per
count,243.0,104.0,243.0,239.0,243.0,243.0,243.0,200.0,211.0,243.0,...,243.0,243.0,243.0,243.0,243.0,243.0,243.0,243.0,243.0,243.0
mean,20081970.0,77.683654,389.024691,52.167364,11.863374,67.823045,19.43251,28.275,19.033175,76.477366,...,70.99177,20.32428,189.390947,52.103292,44.36214,9.521399,197.872428,51.502881,191.152263,48.497119
std,20156.34,18.743165,213.035653,84.805296,13.83832,33.497958,7.909256,23.985954,11.881712,121.213537,...,34.196925,8.140595,103.31313,18.892945,105.479506,13.164749,103.299753,5.181933,113.58816,5.181933
min,20052010.0,19.6,78.0,1.0,0.0,1.0,0.2,0.0,0.0,1.0,...,12.0,2.1,15.0,2.9,0.0,0.0,37.0,25.1,41.0,36.2
25%,20062010.0,71.725,246.5,13.5,5.0,47.0,15.95,9.75,10.0,17.0,...,44.5,14.9,116.0,45.45,7.0,2.0,128.5,48.75,121.0,45.4
50%,20082010.0,82.05,338.0,30.0,8.5,67.0,21.7,21.0,17.0,32.0,...,68.0,20.6,171.0,57.7,12.0,3.4,175.0,51.7,173.0,48.3
75%,20102010.0,91.2,476.5,49.5,13.1,92.5,24.05,43.0,27.0,60.0,...,87.0,25.95,269.5,65.1,31.5,8.5,242.5,54.6,229.5,51.25
max,20112010.0,99.7,1613.0,486.0,87.5,166.0,35.5,114.0,58.0,559.0,...,192.0,45.4,503.0,80.1,725.0,56.8,794.0,63.8,819.0,74.9
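
In the summary above, `schoolyear` is treated as an ordinary integer. The values shown earlier (for example `20052006`) appear to pack a start year and an end year into one number, so the mean and standard deviation of that column are not meaningful. A small sketch of how such a value could be split, assuming the eight-digit encoding holds for every row. The helper name `split_school_year` is hypothetical.

```python
from typing import Tuple


def split_school_year(value: int) -> Tuple[int, int]:
    """Split an eight-digit school year such as 20052006 into (start, end)."""
    start_year, end_year = divmod(int(value), 10_000)
    return start_year, end_year


first_year = split_school_year(20052006)  # (2005, 2006)
```

With pandas, the same idea could be applied to the whole column, for example `demographics['schoolyear'] // 10_000` for the start year.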


# Table Overview

In [None]:
#Returns a list of all column names in the directory DataFrame

list(directory.columns)

['dbn',
 'school_name',
 'borough',
 'building_code',
 'phone_number',
 'fax_number',
 'grade_span_min',
 'grade_span_max',
 'expgrade_span_min',
 'expgrade_span_max',
 'start_time',
 'end_time',
 'priority01',
 'priority02',
 'priority03',
 'priority04',
 'priority05',
 'priority06',
 'priority07',
 'priority08',
 'priority09',
 'priority10',
 'location',
 'phone_number2',
 'school_email',
 'website',
 'subway',
 'bus',
 'grades2018',
 'finalgrades',
 'total_students',
 'extracurricular_activities',
 'school_sports',
 'attendance_rate',
 'pct_stu_enough_variety',
 'pct_stu_safe',
 'school_accessibility_description',
 'directions1',
 'requirement1',
 'requirement2',
 'requirement3',
 'requirement4',
 'requirement5',
 'program1',
 'code1',
 'interest1',
 'method1',
 'seats9ge1',
 'grade9gefilledflag1',
 'grade9geapplicants1',
 'seats9swd1',
 'grade9swdfilledflag1',
 'grade9swdapplicants1',
 'campus_name',
 'building_borough',
 'building_location',
 'latitude',
 'longitude',
 'community_

In [None]:
#Returns a list of all column names in the demographics DataFrame

list(demographics.columns)

['dbn',
 'Name',
 'schoolyear',
 'fl_percent',
 'frl_percent',
 'total_enrollment',
 'prek',
 'k',
 'grade1',
 'grade2',
 'grade3',
 'grade4',
 'grade5',
 'grade6',
 'grade7',
 'grade8',
 'grade9',
 'grade10',
 'grade11',
 'grade12',
 'ell_num',
 'ell_percent',
 'sped_num',
 'sped_percent',
 'ctt_num',
 'selfcontained_num',
 'asian_num',
 'asian_per',
 'black_num',
 'black_per',
 'hispanic_num',
 'hispanic_per',
 'white_num',
 'white_per',
 'male_num',
 'male_per',
 'female_num',
 'female_per']

In [None]:
#Returns a list of all column names in the safety_report DataFrame

list(safety_report.columns)

['school_year',
 'building_code',
 'dbn',
 'location_name',
 'location_code',
 'address',
 'borough',
 'geographical_district_code',
 'register',
 'building_name',
 'num_schools',
 'schools_in_building',
 'major_n',
 'oth_n',
 'nocrim_n',
 'prop_n',
 'vio_n',
 'engroupa',
 'rangea',
 'avgofmajor_n',
 'avgofoth_n',
 'avgofnocrim_n',
 'avgofprop_n',
 'avgofvio_n',
 'borough_name',
 'postcode',
 'latitude',
 'longitude',
 'community_board',
 'council_district',
 'census_tract',
 'bin',
 'bbl',
 'nta',
 '_schools']

# Questions

#### School Distribution

##### How many schools are there in each borough?

In [13]:
#Defines and executes an SQL query to aggregate the number of schools per borough based on joined tables
#Creates a multi-line SQL query that counts distinct schools per borough and orders the results by count

query = """
        SELECT 
          hsd.borough AS borough,
          COUNT(DISTINCT(hsd.dbn)) AS cnt_schools
        FROM nyc_schools.high_school_directory AS hsd
        JOIN nyc_schools.school_safety_report AS ssr
        ON hsd.dbn = ssr.dbn
        GROUP BY hsd.borough
        ORDER BY cnt_schools DESC;
        """

pd.read_sql(query, engine) #Executes the SQL query against the database engine and returns the result as a pandas DataFrame

| | borough | cnt_schools |
|---|---|---|
| 0 | Brooklyn | 121 |
| 1 | Bronx | 118 |
| 2 | Manhattan | 106 |
| 3 | Queens | 80 |
| 4 | Staten Island | 10 |
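
Because of the inner `JOIN`, the query above counts only schools that also appear in the safety report. The five counts add up to 435, the number of rows in `directory`, so no school seems to be dropped here, but that is worth checking rather than assuming. The sketch below shows the difference on an in-memory SQLite database with made-up rows. The table contents are hypothetical, and the `nyc_schools` schema prefix is omitted because SQLite has no such schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE high_school_directory (dbn TEXT, borough TEXT);
    CREATE TABLE school_safety_report (dbn TEXT, school_year TEXT);
    INSERT INTO high_school_directory VALUES
        ('A1', 'Bronx'), ('A2', 'Bronx'), ('B1', 'Queens');
    -- A1 has two safety rows, A2 has none
    INSERT INTO school_safety_report VALUES
        ('A1', '2013-14'), ('A1', '2014-15'), ('B1', '2013-14');
""")

# Count with the inner join: schools without a safety row disappear
joined = conn.execute("""
    SELECT hsd.borough, COUNT(DISTINCT hsd.dbn) AS cnt_schools
    FROM high_school_directory AS hsd
    JOIN school_safety_report AS ssr ON hsd.dbn = ssr.dbn
    GROUP BY hsd.borough
    ORDER BY hsd.borough
""").fetchall()

# Count directly from the directory: every listed school is included
direct = conn.execute("""
    SELECT borough, COUNT(DISTINCT dbn) AS cnt_schools
    FROM high_school_directory
    GROUP BY borough
    ORDER BY borough
""").fetchall()
conn.close()
```

In this toy data the joined count for the Bronx is 1 while the direct count is 2, because school `A2` has no safety report row.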


#### Language Learners

##### What is the average % of English Language Learners (ELL) per borough?

In [None]:
#Defines and executes an SQL query to calculate the average percentage of English Language Learners per borough
#Creates a multi-line SQL query that joins school directory and demographics data and orders results by average percentage

query = """
        SELECT
          hsd.borough,
          AVG(sd.ell_percent) AS avg_perc_ell
        FROM nyc_schools.high_school_directory hsd
        LEFT JOIN nyc_schools.school_demographics sd
          ON TRIM(UPPER(hsd.dbn)) = TRIM(UPPER(sd.dbn))
        GROUP BY hsd.borough
        ORDER BY avg_perc_ell ASC;
        """ 

pd.read_sql(query, engine) #Executes the SQL query against the database engine and returns the result as a pandas DataFrame

| | borough | avg_perc_ell |
|---|---|---|
| 0 | Manhattan | 7.5725 |
| 1 | Queens | NaN |
| 2 | Brooklyn | NaN |
| 3 | Staten Island | NaN |
| 4 | Bronx | NaN |
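
A missing average from this `LEFT JOIN` means that a borough had no joined rows with a non-null `ell_percent`. One way to tell "no match" apart from "matched but null" is to count the matched demographics rows next to the average. The sketch below illustrates the idea on an in-memory SQLite database with made-up rows, not the course data. The `dbn` values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE high_school_directory (dbn TEXT, borough TEXT);
    CREATE TABLE school_demographics (dbn TEXT, ell_percent REAL);
    INSERT INTO high_school_directory VALUES
        ('01M001', 'Manhattan'), ('27Q001', 'Queens');
    -- Only the Manhattan school has demographics rows
    INSERT INTO school_demographics VALUES
        ('01M001', 10.0), ('01M001', 20.0);
""")

rows = conn.execute("""
    SELECT hsd.borough,
           COUNT(sd.dbn)       AS matched_rows,  -- 0 when the join finds nothing
           AVG(sd.ell_percent) AS avg_perc_ell   -- NULL when there is nothing to average
    FROM high_school_directory AS hsd
    LEFT JOIN school_demographics AS sd
      ON TRIM(UPPER(hsd.dbn)) = TRIM(UPPER(sd.dbn))
    GROUP BY hsd.borough
    ORDER BY hsd.borough
""").fetchall()
conn.close()
```

Here `rows` contains `('Manhattan', 2, 15.0)` and `('Queens', 0, None)`: a `matched_rows` value of 0 points to missing demographics rows rather than null percentages.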


#### Schools Supporting Special Needs

##### Write a query to find the top 3 schools in each borough with the highest percentage of special education students

In [10]:
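#Defines and executes an SQL query to rank demographics rows per borough by special education percentage and keep the top three
#Creates a multi-step SQL query using CTEs to drop null percentages, rank the remaining rows within each borough, and filter to ranks 1-3
#Note: the demographics table has one row per school and school year, and the output has no school identifier, so the ranked rows are school-year records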

query = """
        WITH per_school AS (
          SELECT
            TRIM(UPPER(dbn)) AS dbn,
            sped_percent
          FROM nyc_schools.school_demographics
          WHERE sped_percent IS NOT NULL
        ),
        ranked AS (
          SELECT
            hsd.borough,
            ps.sped_percent,
            ROW_NUMBER() OVER (PARTITION BY hsd.borough ORDER BY ps.sped_percent DESC) AS rank
          FROM nyc_schools.high_school_directory AS hsd
          JOIN per_school AS ps
            ON TRIM(UPPER(hsd.dbn)) = ps.dbn
          WHERE hsd.borough IS NOT NULL
        )
        SELECT
          borough,
          sped_percent,
          rank
        FROM ranked
        WHERE rank <= 3
        ORDER BY borough, rank;
        """

pd.read_sql(query, engine) #Executes the SQL query against the database engine and returns the result as a pandas DataFrame

| | borough | sped_percent | rank |
|---|---|---|---|
| 0 | Manhattan | 28.8 | 1 |
| 1 | Manhattan | 27.7 | 2 |
| 2 | Manhattan | 26.7 | 3 |
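
The query above ranks individual demographics rows and returns no school identifier, so the three rows may come from the same school in different years. A sketch of one alternative, assuming the goal is one row per school: average `sped_percent` per `dbn` first, then rank within each borough. It runs on an in-memory SQLite database (window functions need SQLite 3.25 or later) with made-up rows. Averaging across years is an assumption; taking only the latest year would be another reasonable choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE high_school_directory (dbn TEXT, school_name TEXT, borough TEXT);
    CREATE TABLE school_demographics (dbn TEXT, schoolyear INTEGER, sped_percent REAL);
    INSERT INTO high_school_directory VALUES
        ('S1', 'School One',   'Manhattan'),
        ('S2', 'School Two',   'Manhattan'),
        ('S3', 'School Three', 'Manhattan'),
        ('S4', 'School Four',  'Manhattan');
    INSERT INTO school_demographics VALUES
        ('S1', 20092010, 30.0), ('S1', 20102011, 28.0),
        ('S2', 20102011, 25.0),
        ('S3', 20102011, 20.0),
        ('S4', 20102011, 10.0);
""")

top3 = conn.execute("""
    WITH per_school AS (
        -- Collapse the school-year rows into one value per school
        SELECT dbn, AVG(sped_percent) AS avg_sped_percent
        FROM school_demographics
        WHERE sped_percent IS NOT NULL
        GROUP BY dbn
    ),
    ranked AS (
        SELECT hsd.borough,
               hsd.school_name,
               ps.avg_sped_percent,
               ROW_NUMBER() OVER (
                   PARTITION BY hsd.borough
                   ORDER BY ps.avg_sped_percent DESC
               ) AS rn
        FROM high_school_directory AS hsd
        JOIN per_school AS ps ON hsd.dbn = ps.dbn
    )
    SELECT borough, school_name, avg_sped_percent, rn
    FROM ranked
    WHERE rn <= 3
    ORDER BY borough, rn
""").fetchall()
conn.close()
```

With the toy rows, `top3` lists School One (29.0), School Two (25.0), and School Three (20.0), each appearing once.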


# Summary

##### Question 1 

How many schools are there in each borough?

The results show that the number of high schools varies substantially across boroughs, ranging from **10 to 121 schools**. **Brooklyn (121), the Bronx (118), and Manhattan (106)** have the highest counts, while **Queens (80)** and especially **Staten Island (10)** have significantly fewer schools.

##### Question 2 

What is the average % of English Language Learners (ELL) per borough?

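The results show that **Manhattan** has an average **ELL percentage of 7.57%**, while **no ELL averages are available** for **Queens, Brooklyn, Staten Island, and the Bronx**. Because `ell_percent` is non-null in all 243 rows of the demographics table, a missing average means the `LEFT JOIN` found no demographics rows for the high schools in those boroughs. The Manhattan figure is an average over school-year rows, not over schools.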

##### Question 3 

Find the top 3 schools in each borough with the highest percentage of special education students

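The results show that only **Manhattan** returns ranked rows, with special education percentages ranging from **26.7% to 28.8%**. No rows are returned for the other boroughs because the join finds no demographics rows for their high schools. The demographics table has one row per school and school year and the query returns no school identifier, so these three rows are school-year records and may belong to the same school.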