# 🧠 Day 3 – SQL via Python: NYC School Data Exploration
In this notebook, you'll connect to a PostgreSQL database and execute SQL queries to explore NYC school data.

Database tables:

- high_school_directory – School names, locations, types, programs
- school_demographics – Enrollment data, ELL, FRPL, disabilities, etc.
- school_safety_report – Reported incidents by type and location

## Import Libraries

In [190]:
import pandas as pd
from sqlalchemy import create_engine
import warnings
warnings.filterwarnings('ignore', category=UserWarning, module='psycopg2')

## Connection to Database

In [191]:
# Database connection
conn = create_engine("postgresql+psycopg2://neondb_owner:npg_CeS9fJg2azZD@ep-falling-glitter-a5m0j5gk-pooler.us-east-2.aws.neon.tech:5432/neondb?sslmode=require").connect()
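
The connection string above embeds the username and password directly in the notebook. A minimal sketch of one alternative is to assemble the URL from environment variables. The function name `build_db_url` and the variable names `PGUSER`, `PGPASSWORD`, `PGHOST`, `PGPORT`, and `PGDATABASE` are assumptions chosen for this illustration, not something the notebook defines.

```python
import os
from urllib.parse import quote_plus


def build_db_url(env=os.environ):
    """Assemble a PostgreSQL URL from (assumed) environment variables."""
    user = env["PGUSER"]
    # Escape characters such as '@' or '/' that would break the URL.
    password = quote_plus(env["PGPASSWORD"])
    host = env["PGHOST"]
    port = env.get("PGPORT", "5432")  # fall back to the default PostgreSQL port
    database = env["PGDATABASE"]
    return (
        f"postgresql+psycopg2://{user}:{password}@{host}:{port}/{database}"
        "?sslmode=require"
    )


# In the notebook, the result would replace the literal string:
# conn = create_engine(build_db_url()).connect()
```

With the variables set in the shell or in a local, untracked file, the credentials stay out of the saved notebook.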

In [192]:
# Run a query to fetch first 5 records from the high_school_directory table
query = "SELECT * FROM nyc_schools.high_school_directory LIMIT 5;"
df = pd.read_sql(query, conn)
df.head()

Unnamed: 0,dbn,school_name,borough,building_code,phone_number,fax_number,grade_span_min,grade_span_max,expgrade_span_min,expgrade_span_max,...,number_programs,Location 1,Community Board,Council District,Census Tract,Zip Codes,Community Districts,Borough Boundaries,City Council Districts,Police Precincts
0,27Q260,Frederick Douglass Academy VI High School,Queens,Q465,718-471-2154,718-471-2890,9.0,12,,,...,1,"{'latitude': '40.601989336', 'longitude': '-73...",14,31,100802,20529,51,3,47,59
1,21K559,Life Academy High School for Film and Music,Brooklyn,K400,718-333-7750,718-333-7775,9.0,12,,,...,1,"{'latitude': '40.593593811', 'longitude': '-73...",13,47,306,17616,21,2,45,35
2,16K393,Frederick Douglass Academy IV Secondary School,Brooklyn,K026,718-574-2820,718-574-2821,9.0,12,,,...,1,"{'latitude': '40.692133704', 'longitude': '-73...",3,36,291,18181,69,2,49,52
3,08X305,Pablo Neruda Academy,Bronx,X450,718-824-1682,718-824-1663,9.0,12,,,...,1,"{'latitude': '40.822303765', 'longitude': '-73...",9,18,16,11611,58,5,31,26
4,03M485,Fiorello H. LaGuardia High School of Music & A...,Manhattan,M485,212-496-0700,212-724-5748,9.0,12,,,...,6,"{'latitude': '40.773670507', 'longitude': '-73...",7,6,151,12420,20,4,19,12


## ✅ How many schools are there in each borough?

In [193]:
# Count distinct schools (by dbn) in each borough
query = """
SELECT borough, COUNT(DISTINCT dbn) AS school_count
FROM nyc_schools.high_school_directory
GROUP BY borough;
"""
df_result1 = pd.read_sql(query, conn)
df_result1

Unnamed: 0,borough,school_count
0,Bronx,118
1,Brooklyn,121
2,Manhattan,106
3,Queens,80
4,Staten Island,10


In [194]:
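# Join 'High School Directory' to 'School Demographics' on dbn.
# LEFT JOIN keeps every directory row: schools with no demographics match get NaN
# in the demographics columns, and schools with several matches appear once per match.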
query = """
SELECT * FROM nyc_schools.high_school_directory AS dir
LEFT JOIN nyc_schools.school_demographics AS demo
ON dir.dbn = demo.dbn;
"""
df_dir_demo = pd.read_sql(query, conn)
df_dir_demo

Unnamed: 0,dbn,school_name,borough,building_code,phone_number,fax_number,grade_span_min,grade_span_max,expgrade_span_min,expgrade_span_max,...,black_num,black_per,hispanic_num,hispanic_per,white_num,white_per,male_num,male_per,female_num,female_per
0,01M292,Henry Street School for International Studies,Manhattan,M056,212-406-9411,212-406-9417,6.0,12,,,...,106.0,36.1,133.0,45.2,10.0,3.4,160.0,54.4,134.0,45.6
1,01M292,Henry Street School for International Studies,Manhattan,M056,212-406-9411,212-406-9417,6.0,12,,,...,137.0,31.6,208.0,47.9,14.0,3.2,241.0,55.5,193.0,44.5
2,01M292,Henry Street School for International Studies,Manhattan,M056,212-406-9411,212-406-9417,6.0,12,,,...,158.0,30.7,272.0,52.8,12.0,2.3,281.0,54.6,234.0,45.4
3,01M292,Henry Street School for International Studies,Manhattan,M056,212-406-9411,212-406-9417,6.0,12,,,...,138.0,29.4,264.0,56.2,14.0,3.0,264.0,56.2,206.0,43.8
4,01M292,Henry Street School for International Studies,Manhattan,M056,212-406-9411,212-406-9417,6.0,12,,,...,141.0,27.6,290.0,56.8,16.0,3.1,297.0,58.1,214.0,41.9
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
463,14K685,El Puente Academy for Peace and Justice,Brooklyn,K778,718-387-1125,718-387-4229,9.0,12,,,...,,,,,,,,,,
464,22K555,Brooklyn College Academy,Brooklyn,K917,718-853-6184,718-853-6356,9.0,12,,,...,,,,,,,,,,
465,14K478,"The High School for Enterprise, Business and T...",Brooklyn,K450,718-387-2800,718-387-2748,9.0,12,,,...,,,,,,,,,,
466,24Q296,Pan American International High School,Queens,Q744,718-271-3602,718-271-4041,9.0,12,,,...,,,,,,,,,,


In [195]:
# Count of missing values in 'ell_percent' column
df_dir_demo['ell_percent'].isna().sum() 

np.int64(428)
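
A large number of missing values after a LEFT JOIN usually means many directory rows found no match in the demographics table. One way to check where the matches are is to count, per borough, how many schools matched at all. The sketch below uses an in-memory SQLite database with made-up rows so that it runs on its own; the toy data and the output column names (`schools_with_demographics`, `demographic_rows`) are assumptions for illustration. Against the notebook's PostgreSQL database, the same query with the `nyc_schools.` schema prefix would be the thing to try.

```python
import sqlite3

# Toy stand-in for the two tables; the rows are invented for illustration.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE high_school_directory (dbn TEXT, borough TEXT);
CREATE TABLE school_demographics (dbn TEXT, ell_percent REAL);
INSERT INTO high_school_directory VALUES
    ('01M292', 'Manhattan'), ('01M450', 'Manhattan'), ('27Q260', 'Queens');
INSERT INTO school_demographics VALUES
    ('01M292', 10.0), ('01M292', 12.0), ('01M450', 4.0);
""")

coverage_sql = """
SELECT dir.borough,
       COUNT(DISTINCT dir.dbn)  AS schools,
       COUNT(DISTINCT demo.dbn) AS schools_with_demographics,
       COUNT(demo.dbn)          AS demographic_rows
FROM high_school_directory AS dir
LEFT JOIN school_demographics AS demo ON dir.dbn = demo.dbn
GROUP BY dir.borough
ORDER BY dir.borough;
"""
coverage = con.execute(coverage_sql).fetchall()
for row in coverage:
    print(row)
```

A borough where `schools_with_demographics` is 0 has no demographic data in the join, so any per-borough statistic for it will be empty rather than zero.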

## ✅ What is the average % of ELL per borough?

In [196]:
# Compute the mean ELL percentage and the number of non-null rows per borough
df_ell_count = df_dir_demo.groupby('borough')['ell_percent'].agg(['mean', 'count']).sort_values('mean', ascending=False)
df_ell_count

borough,mean,count
Manhattan,7.5725,40
Bronx,,0
Brooklyn,,0
Queens,,0
Staten Island,,0


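ELL data is available only for Manhattan schools: every other borough has zero non-null `ell_percent` rows after the join, so this result says nothing about ELL students elsewhere. For Manhattan, the mean over the 40 non-null rows is about 7.57%. Because a school can appear in several rows of the joined table, this is a mean over rows, not over distinct schools.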
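
The per-borough mean above is taken over rows of the joined table, so a school with more demographics rows weighs more than a school with fewer. A possible refinement is to average within each school first and then across schools. The sketch below shows the difference on an in-memory SQLite database with invented rows; the toy data and the names `per_school`, `school_ell`, and `avg_ell` are assumptions for illustration, and the real tables would need the `nyc_schools.` schema prefix.

```python
import sqlite3

# Toy data: school A has three rows, school B has one.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE high_school_directory (dbn TEXT, borough TEXT);
CREATE TABLE school_demographics (dbn TEXT, ell_percent REAL);
INSERT INTO high_school_directory VALUES
    ('A', 'Manhattan'), ('B', 'Manhattan'), ('C', 'Queens');
INSERT INTO school_demographics VALUES
    ('A', 10.0), ('A', 10.0), ('A', 10.0), ('B', 30.0);
""")

# Mean over all rows: school A counts three times.
row_level = con.execute(
    "SELECT AVG(ell_percent) FROM school_demographics"
).fetchone()[0]

# Mean per school first, then mean of those per borough.
per_school_sql = """
WITH per_school AS (
    SELECT dir.borough, dir.dbn, AVG(demo.ell_percent) AS school_ell
    FROM high_school_directory AS dir
    JOIN school_demographics AS demo ON dir.dbn = demo.dbn
    GROUP BY dir.borough, dir.dbn
)
SELECT borough, AVG(school_ell) AS avg_ell, COUNT(*) AS schools
FROM per_school
GROUP BY borough
ORDER BY avg_ell DESC;
"""
school_level = con.execute(per_school_sql).fetchall()

print(row_level)     # 15.0
print(school_level)  # [('Manhattan', 20.0, 2)]
```

The inner JOIN drops schools with no demographics rows (school C here), so the `schools` column reports how many schools actually contribute to each borough's average.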

## The top 3 schools in each borough with the highest percentage of special education students.

In [197]:
# Top 3 rows by sped_percent for Manhattan only (not yet per borough).
# Each demographics row is ranked separately, so one school can appear more than once.
query = """
WITH joined_data AS (
    SELECT dir.dbn, dir.borough, dir.school_name, demo.sped_percent
    FROM nyc_schools.high_school_directory AS dir
    LEFT JOIN nyc_schools.school_demographics AS demo
    ON dir.dbn = demo.dbn
)
SELECT * FROM joined_data
WHERE borough = 'Manhattan' AND sped_percent IS NOT NULL
ORDER BY sped_percent DESC
LIMIT 3;
"""
df_top_sped = pd.read_sql(query, conn)
df_top_sped

Unnamed: 0,dbn,borough,school_name,sped_percent
0,01M450,Manhattan,East Side Community School,28.8
1,01M450,Manhattan,East Side Community School,27.7
2,01M450,Manhattan,East Side Community School,26.7
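
The result above lists the same school three times and covers only Manhattan, because the query ranks individual demographics rows and filters on a single borough. A common way to get a top 3 per borough is to reduce the data to one row per school and then rank within each borough using the `ROW_NUMBER()` window function. The sketch below demonstrates this on an in-memory SQLite database (window functions need SQLite 3.25 or later) with invented rows. The toy data, the choice of `MAX(sped_percent)` as each school's value, and the name `rank_in_borough` are assumptions for illustration; the real query would use the `nyc_schools.` schema prefix.

```python
import sqlite3

# Toy data: school M1 has two rows, as in the repeated result above.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE high_school_directory (dbn TEXT, borough TEXT, school_name TEXT);
CREATE TABLE school_demographics (dbn TEXT, sped_percent REAL);
INSERT INTO high_school_directory VALUES
    ('M1', 'Manhattan', 'School M1'), ('M2', 'Manhattan', 'School M2'),
    ('M3', 'Manhattan', 'School M3'), ('M4', 'Manhattan', 'School M4'),
    ('Q1', 'Queens', 'School Q1');
INSERT INTO school_demographics VALUES
    ('M1', 28.0), ('M1', 27.0), ('M2', 20.0), ('M3', 15.0), ('M4', 10.0),
    ('Q1', 12.0);
""")

top3_sql = """
WITH per_school AS (
    -- One row per school: its highest recorded sped_percent.
    SELECT dir.borough, dir.dbn, dir.school_name,
           MAX(demo.sped_percent) AS sped_percent
    FROM high_school_directory AS dir
    JOIN school_demographics AS demo ON dir.dbn = demo.dbn
    WHERE demo.sped_percent IS NOT NULL
    GROUP BY dir.borough, dir.dbn, dir.school_name
),
ranked AS (
    -- Rank schools within each borough, highest percentage first.
    SELECT *, ROW_NUMBER() OVER (
        PARTITION BY borough ORDER BY sped_percent DESC
    ) AS rank_in_borough
    FROM per_school
)
SELECT borough, dbn, school_name, sped_percent, rank_in_borough
FROM ranked
WHERE rank_in_borough <= 3
ORDER BY borough, rank_in_borough;
"""
top3 = con.execute(top3_sql).fetchall()
for row in top3:
    print(row)
```

Using `MAX` picks each school's highest recorded value; `AVG`, or the value from the most recent record, would be equally defensible choices depending on the question being asked.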


## 🧠 Insights

1) Number of schools per borough: 

    Bronx: 118

    Brooklyn: 121

    Manhattan: 106

    Queens: 80

    Staten Island: 10

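2) ELL data is available only for Manhattan schools in the joined table (all other boroughs have no non-null `ell_percent` values); the Manhattan mean is about 7.57%.

3) The special education query was restricted to Manhattan, and its top 3 rows all belong to one school, East Side Community School (`dbn` 01M450), with `sped_percent` values of 28.8, 27.7, and 26.7. It therefore does not yet give a top 3 per borough.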