# SQL Practice

This is a quick notebook of exercises for anyone who wants to practice SQL. It uses the [California School SAT Performance](http://2016.padjo.org/tutorials/sqlite-data-starterpacks/#toc-california-school-sat-performance-and-poverty-data) dataset, which is varied enough to exercise many SQL features.

We use SQLite specifically, which lacks a few SQL language features; you can adapt the queries to your own database if you like.

The notebook also requires `jupysql` in your Python environment (`pip install jupysql`, or `conda install jupysql -c conda-forge` if using Conda).

## Download the Dataset

In [107]:
![ -f "cdeschools.sqlite" ] || curl http://2016.padjo.org/files/data/starterpack/cde-schools/cdeschools.sqlite --output cdeschools.sqlite

## Load `jupysql` and the Dataset

In [108]:
%load_ext sql

%sql sqlite:///cdeschools.sqlite

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


## Show Tables Available, and a Sample for each Table

In [109]:
%%sql
SELECT name 
FROM sqlite_master 
WHERE type='table'
ORDER BY name;

name
frpm
satscores
schools


In [110]:
%%sql
SELECT *
FROM frpm
LIMIT 3;

Academic Year,County Code,District Code,School Code,County Name,District Name,School Name,District Type,School Type,Educational Option Type,NSLP Provision Status,Charter School (Y/N),Charter School Number,Charter Funding Type,IRC,Low Grade,High Grade,Enrollment (K-12),Free Meal Count (K-12),Percent (%) Eligible Free (K-12),FRPM Count (K-12),Percent (%) Eligible FRPM (K-12),Enrollment (Ages 5-17),Free Meal Count (Ages 5-17),Percent (%) Eligible Free (Ages 5-17),FRPM Count (Ages 5-17),Percent (%) Eligible FRPM (Ages 5-17),2013-14 CALPADS Fall 1 Certification Status
2014-2015,1,10017,109835,Alameda,Alameda County Office of Education,FAME Public Charter,County Office of Education (COE),K-12 Schools (Public),Traditional,,1,728,Directly funded,1,K,12,1087.0,565.0,0.5197792088316467,715.0,0.6577736890524379,1070.0,553.0,0.516822429906542,702.0,0.6560747663551402,1
2014-2015,1,10017,112607,Alameda,Alameda County Office of Education,Envision Academy for Arts & Technology,County Office of Education (COE),High Schools (Public),Traditional,,1,811,Directly funded,1,9,12,395.0,186.0,0.4708860759493671,186.0,0.4708860759493671,376.0,182.0,0.4840425531914893,182.0,0.4840425531914893,1
2014-2015,1,10017,118489,Alameda,Alameda County Office of Education,Aspire California College Preparatory Academy,County Office of Education (COE),High Schools (Public),Traditional,,1,1049,Directly funded,1,9,12,244.0,134.0,0.5491803278688525,175.0,0.7172131147540983,230.0,128.0,0.5565217391304348,168.0,0.7304347826086957,1


In [111]:
%%sql
SELECT *
FROM satscores
LIMIT 3;

cds,rtype,sname,dname,cname,enroll12,NumTstTakr,AvgScrRead,AvgScrMath,AvgScrWrite,NumGE1500,PctGE1500
0,X,,,,496901,210706,489,500,484,93334,44.3
1000000000000,C,,,Alameda,16978,8855,516,536,517,4900,55.34
1100170000000,D,,Alameda County Office of Education,Alameda,398,88,418,418,417,14,15.91


In [112]:
%%sql
SELECT *
FROM schools
LIMIT 3;

CDSCode,NCESDist,NCESSchool,StatusType,County,District,School,Street,StreetAbr,City,Zip,State,MailStreet,MailStrAbr,MailCity,MailZip,MailState,Phone,Ext,Website,OpenDate,ClosedDate,Charter,CharterNum,FundingType,DOC,DOCType,SOC,SOCType,EdOpsCode,EdOpsName,EILCode,EILName,GSoffered,GSserved,Virtual,Magnet,Latitude,Longitude,AdmFName1,AdmLName1,AdmEmail1,AdmFName2,AdmLName2,AdmEmail2,AdmFName3,AdmLName3,AdmEmail3,LastUpdate
1100170000000,691051,,Active,Alameda,Alameda County Office of Education,,313 West Winton Avenue,313 West Winton Ave.,Hayward,94544-1136,CA,313 West Winton Avenue,313 West Winton Ave.,Hayward,94544-1136,CA,(510) 887-0152,,www.acoe.org,,,,,,0,County Office of Education (COE),,,,,,,,,,,37.658212,-122.09713,L Karen,Monroe,lkmonroe@acoe.org,,,,,,,2015-06-23
1100170109835,691051,10546.0,Closed,Alameda,Alameda County Office of Education,FAME Public Charter,"39899 Balentine Drive, Suite 335","39899 Balentine Dr., Ste. 335",Newark,94560-5359,CA,"39899 Balentine Drive, Suite 335","39899 Balentine Dr., Ste. 335",Newark,94560-5359,CA,,,,2005-08-29,2015-07-31,1.0,728.0,Directly funded,0,County Office of Education (COE),65.0,K-12 Schools (Public),TRAD,Traditional,ELEMHIGH,Elementary-High Combination,K-12,K-12,P,0.0,37.521436,-121.99391,,,,,,,,,,2015-09-01
1100170112607,691051,10947.0,Active,Alameda,Alameda County Office of Education,Envision Academy for Arts & Technology,1515 Webster Street,1515 Webster St.,Oakland,94612-3355,CA,1515 Webster Street,1515 Webster St.,Oakland,94612,CA,(510) 596-8901,,www.envisionacademy.org/,2006-08-28,,1.0,811.0,Directly funded,0,County Office of Education (COE),66.0,High Schools (Public),TRAD,Traditional,HS,High School,9-12,9-12,N,0.0,37.80452,-122.26815,Laura,Robell,laura@envisionacademy.org,,,,,,,2015-06-18


## 1. Basic Queries

### Project Columns

Retrieve the `School Name`, `County Name` and `Enrollment (K-12)` from frpm.

In [113]:
%%sql
SELECT "School Name" AS school_name, "County Name" AS county_name, "Enrollment (K-12)" AS enrolment_k12
FROM frpm
LIMIT 3;

school_name,county_name,enrolment_k12
FAME Public Charter,Alameda,1087.0
Envision Academy for Arts & Technology,Alameda,395.0
Aspire California College Preparatory Academy,Alameda,244.0
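
Column names in frpm contain spaces, so the query above wraps them in double quotes. As a minimal sketch of why the quote style matters, the snippet below uses Python's built-in `sqlite3` module on a one-row, in-memory stand-in for frpm (toy data, not the real table): double quotes refer to the column, while single quotes produce a string literal.

```python
import sqlite3

# Toy in-memory stand-in for frpm (an assumption, not the real dataset).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE frpm ("School Name" TEXT)')
conn.execute("INSERT INTO frpm VALUES ('FAME Public Charter')")

# Double quotes: an identifier, so we get the column's value.
as_identifier = conn.execute('SELECT "School Name" FROM frpm').fetchone()[0]

# Single quotes: a string literal, so we get the text itself back.
as_literal = conn.execute("SELECT 'School Name' FROM frpm").fetchone()[0]

print(as_identifier)  # FAME Public Charter
print(as_literal)     # School Name
```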


### Simple Filtering

List all schools in the “Los Angeles” county (from schools) that are open.

In [114]:
%%sql
SELECT "School" AS school_name, "County" AS county_name, OpenDate AS open_date
FROM schools
WHERE StatusType IS NOT 'Closed' AND school_name IS NOT NULL AND county_name IS 'Los Angeles';

school_name,county_name,open_date
Jardin de la Infancia,Los Angeles,2004-09-01
Aspire Antonio Maria Lugo Academy,Los Angeles,2005-09-06
Los Angeles International Charter High,Los Angeles,2005-09-06
Aspire Ollin University Preparatory Academy,Los Angeles,2006-09-11
Environmental Charter Middle,Los Angeles,2010-08-31
"Nidorf, Barry J.",Los Angeles,2010-07-01
Los Padrinos Juvenile Hall,Los Angeles,2010-07-01
Central Juvenile Hall,Los Angeles,2010-07-01
"Kirby, Dorothy Camp",Los Angeles,2010-07-01
Afflerbaugh-Paige Camp,Los Angeles,2010-07-01
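
The filter above uses `IS NOT` rather than `<>`. The two differ only when the column is NULL, which the sketch below illustrates with Python's built-in `sqlite3` module and a three-row toy table (hypothetical rows, not the real schools data).

```python
import sqlite3

# Hypothetical toy table; the school names are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE schools (School TEXT, StatusType TEXT)")
conn.executemany(
    "INSERT INTO schools VALUES (?, ?)",
    [("A High", "Active"), ("B High", "Closed"), ("C High", None)],
)

# `<>` evaluates to NULL when StatusType is NULL, so that row is dropped.
not_equal = [row[0] for row in conn.execute(
    "SELECT School FROM schools WHERE StatusType <> 'Closed' ORDER BY School"
)]

# `IS NOT` compares NULL like an ordinary value, so that row is kept.
is_not = [row[0] for row in conn.execute(
    "SELECT School FROM schools WHERE StatusType IS NOT 'Closed' ORDER BY School"
)]

print(not_equal)  # ['A High']
print(is_not)     # ['A High', 'C High']
```

Which behavior you want depends on whether a missing status should count as "not closed".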


### Sorting and Filtering

From satscores, find the top 5 schools by `AvgScrMath`, showing `sname`, `AvgScrRead`, `AvgScrMath`.

In [115]:
%%sql
SELECT sname AS school_name, AvgScrMath AS avg_scr_math, AvgScrRead AS avg_scr_read
FROM satscores
ORDER BY avg_scr_math DESC
LIMIT 5;

school_name,avg_scr_math,avg_scr_read
Mission San Jose High,699,653
Lynbrook High,698,639
Monta Vista High,691,638
Whitney (Gretchen) High,687,639
Henry M. Gunn High,686,642
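
Two details of `ORDER BY ... DESC LIMIT n` are worth knowing: SQLite sorts NULLs as smaller than any other value, so they end up last in descending order, and ties are returned in no guaranteed order unless you add a second sort key. The sketch below shows both using Python's built-in `sqlite3` module on a made-up table (the school names and scores are hypothetical).

```python
import sqlite3

# Made-up scores, including a tie on math and a row with no scores at all.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE scores (sname TEXT, AvgScrMath INTEGER, AvgScrRead INTEGER)"
)
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [
        ("North High", 650, 600),
        ("South High", 650, 620),
        ("East High", 700, 610),
        ("West High", None, None),
    ],
)

# The reading score breaks the tie on math; the NULL row sorts last
# under DESC and is cut off by LIMIT.
top = conn.execute(
    """
    SELECT sname, AvgScrMath, AvgScrRead
    FROM scores
    ORDER BY AvgScrMath DESC, AvgScrRead DESC
    LIMIT 3
    """
).fetchall()

for row in top:
    print(row)
# ('East High', 700, 610)
# ('South High', 650, 620)
# ('North High', 650, 600)
```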


### Distinct Values

Get all distinct `FundingType` values from schools.

In [116]:
%%sql
SELECT FundingType AS funding_type, COUNT(*) AS number_of_schools
FROM schools
GROUP BY funding_type;

funding_type,number_of_schools
,16044
Directly funded,1176
Locally funded,460
Not in CS funding model,6
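
The query above answers the exercise with `GROUP BY`, which also lets us count rows per value. A plain `SELECT DISTINCT` returns the same set of values without the counts. Here is a small sketch comparing the two with Python's built-in `sqlite3` module on a four-row toy table (not the real schools data); note that NULL forms its own group in both cases.

```python
import sqlite3

# Four toy rows, one of them with a missing funding type.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE schools (FundingType TEXT)")
conn.executemany(
    "INSERT INTO schools VALUES (?)",
    [("Directly funded",), ("Locally funded",), ("Directly funded",), (None,)],
)

# DISTINCT: one row per distinct value (NULL counts as a value here).
distinct_values = conn.execute(
    "SELECT DISTINCT FundingType FROM schools ORDER BY FundingType"
).fetchall()

# GROUP BY: the same values, plus an aggregate per group.
value_counts = conn.execute(
    """
    SELECT FundingType, COUNT(*)
    FROM schools
    GROUP BY FundingType
    ORDER BY FundingType
    """
).fetchall()

print(distinct_values)  # [(None,), ('Directly funded',), ('Locally funded',)]
print(value_counts)     # [(None, 1), ('Directly funded', 2), ('Locally funded', 1)]
```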


## 2. Aggregations and Groupings

### Basic Aggregates

For each district in frpm (`District Name`), compute the total `Enrollment (K-12)` and average `Percent (%) Eligible FRPM (K-12)`.

In [117]:
%%sql
SELECT "District Name" AS district_name, SUM("Enrollment (K-12)") AS total_enrolment_k12, AVG("Percent (%) Eligible FRPM (K-12)") AS avg_pct_eligible_frpm_k12
FROM frpm
GROUP BY district_name
ORDER BY total_enrolment_k12 DESC;

district_name,total_enrolment_k12,avg_pct_eligible_frpm_k12
,6236439.0,0.5861716918901957
Los Angeles Unified,646683.0,0.7661354707053055
San Diego Unified,129794.0,0.6341436895240766
Long Beach Unified,79709.0,0.6499238451630597
Fresno Unified,73543.0,0.8505455348746777
Elk Grove Unified,62888.0,0.5605733170202977
San Francisco Unified,58414.0,0.6422081930672513
Santa Ana Unified,56815.0,0.8565427846367221
Capistrano Unified,54036.0,0.2439343322360429
Corona-Norco Unified,53739.0,0.4835829509222873
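
One thing to keep in mind when reading the averages above: `AVG` over a per-school percentage weights every school equally, regardless of size. If you want the district-wide rate instead, divide the summed counts by the summed enrollment. The sketch below contrasts the two with Python's built-in `sqlite3` module; the column names come from frpm, but the district and its two schools are made up.

```python
import sqlite3

# One hypothetical district with a small and a large school.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE frpm (
        "District Name" TEXT,
        "Enrollment (K-12)" REAL,
        "FRPM Count (K-12)" REAL,
        "Percent (%) Eligible FRPM (K-12)" REAL
    )
    """
)
conn.executemany(
    "INSERT INTO frpm VALUES (?, ?, ?, ?)",
    [
        ("Example Unified", 100.0, 90.0, 0.9),  # small school, 90% eligible
        ("Example Unified", 900.0, 90.0, 0.1),  # large school, 10% eligible
    ],
)

unweighted, weighted = conn.execute(
    """
    SELECT AVG("Percent (%) Eligible FRPM (K-12)"),
           SUM("FRPM Count (K-12)") / SUM("Enrollment (K-12)")
    FROM frpm
    GROUP BY "District Name"
    """
).fetchone()

print(round(unweighted, 2))  # 0.5  (mean of the two school percentages)
print(round(weighted, 2))    # 0.18 (180 eligible out of 1000 enrolled)
```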


### Filtering Groups

Identify districts with more than 10,000 total enrollment in frpm. Show `District Name` and total enrollment.

In [118]:
%%sql
SELECT "District Name" AS district_name, SUM("Enrollment (K-12)") AS total_enrolment_k12
FROM frpm
GROUP BY district_name
HAVING total_enrolment_k12 > 10000
ORDER BY total_enrolment_k12 ASC;

district_name,total_enrolment_k12
Merced Union High,10039.0
Woodland Joint Unified,10055.0
Lompoc Unified,10076.0
Bonita Unified,10146.0
Roseville Joint Union High,10223.0
Los Banos Unified,10260.0
Milpitas Unified,10281.0
Adelanto Elementary,10378.0
Berkeley Unified,10442.0
Perris Union High,10510.0
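
`HAVING` is needed here because the condition applies to the aggregated total, not to individual rows: `WHERE` filters rows before grouping, `HAVING` filters groups afterwards. A minimal sketch of the difference, using Python's built-in `sqlite3` module and hypothetical districts (simplified column names, not the real frpm schema):

```python
import sqlite3

# Hypothetical districts: A has two mid-sized schools, B one very large
# school and one small one, C a single small school.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE frpm (district TEXT, enrollment REAL)")
conn.executemany(
    "INSERT INTO frpm VALUES (?, ?)",
    [("A", 6000.0), ("A", 5000.0), ("B", 12000.0), ("B", 500.0), ("C", 3000.0)],
)

# HAVING: keep districts whose *total* exceeds 10,000.
by_having = conn.execute(
    """
    SELECT district, SUM(enrollment) AS total
    FROM frpm
    GROUP BY district
    HAVING total > 10000
    ORDER BY district
    """
).fetchall()

# WHERE: drop every *school* at or below 10,000 first, then sum what is left.
by_where = conn.execute(
    """
    SELECT district, SUM(enrollment) AS total
    FROM frpm
    WHERE enrollment > 10000
    GROUP BY district
    ORDER BY district
    """
).fetchall()

print(by_having)  # [('A', 11000.0), ('B', 12500.0)]
print(by_where)   # [('B', 12000.0)]
```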
