<a href="https://cognitiveclass.ai"><img src = "https://ibm.box.com/shared/static/ugcqz6ohbvff804xp84y4kqnvvk3bq1g.png" width = 300, align = "center"></a>

<h1 align=center><font size = 5>Lab: Working with a real-world dataset using SQL and Python</font></h1>

# Introduction

This notebook shows how to work with a real-world dataset using SQL and Python. In this lab you will:
1. Understand the dataset of Chicago Public Schools school-level performance
1. Store the dataset in a Db2 database on an IBM Cloud instance
1. Retrieve metadata about tables and columns, and query data from mixed-case columns
1. Solve example problems to practice your SQL skills, including using built-in database functions

## Chicago Public Schools - Progress Report Cards (2011-2012) 

The city of Chicago released a dataset showing all school level performance data used to create School Report Cards for the 2011-2012 school year. The dataset is available from the Chicago Data Portal: https://data.cityofchicago.org/Education/Chicago-Public-Schools-Progress-Report-Cards-2011-/9xs2-f89t

This dataset includes a large number of metrics. Start by familiarizing yourself with the types of metrics in the database: https://data.cityofchicago.org/api/assets/AAD41A13-BE8A-4E67-B1F5-86E711E09D5F?download=true

__NOTE__: Do not download the dataset directly from the City of Chicago portal. Instead, download the more database-friendly static copy from the link below and review some of its contents:
https://ibm.box.com/shared/static/f9gjvj1gjmxxzycdhplzt01qtz0s7ew7.csv



### Store the dataset in a Table
In many cases the dataset to be analyzed is available as a .CSV (comma separated values) file, perhaps on the internet. To analyze the data using SQL, it first needs to be stored in the database.

While it is easier to read the dataset into a Pandas dataframe and then PERSIST it into the database, as we saw in the previous lab, doing so maps columns to default datatypes that may not be optimal for SQL querying. For example, a long textual field may map to a CLOB instead of a VARCHAR.
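
The datatype concern above can be illustrated with a minimal sketch. It assumes SQLite (from the Python standard library) as a local stand-in for Db2, so the type names shown are SQLite's and not Db2's. The table names `SCHOOLS_DEFAULT` and `SCHOOLS_TYPED`, the helper `declared_types`, and the two-row sample are made up for illustration. The `dtype` argument of `DataFrame.to_sql` overrides the default mapping for the listed columns.

```python
import sqlite3

import pandas as pd

# Two-row illustrative sample; the real lab data comes from the CSV file.
sample = pd.DataFrame(
    {
        "School ID": [610038, 610513],
        "NAME_OF_SCHOOL": [
            "Abraham Lincoln Elementary School",
            "Air Force Academy High School",
        ],
    }
)

# In-memory SQLite database used as a stand-in for Db2 (assumption).
con = sqlite3.connect(":memory:")

# Default mapping: pandas picks the SQL type for each column.
sample.to_sql("SCHOOLS_DEFAULT", con, index=False)

# Explicit mapping: request a bounded VARCHAR for the text column.
sample.to_sql(
    "SCHOOLS_TYPED",
    con,
    index=False,
    dtype={"School ID": "INTEGER", "NAME_OF_SCHOOL": "VARCHAR(100)"},
)


def declared_types(table_name):
    """Return {column name: declared SQL type} for a SQLite table."""
    rows = con.execute(f"PRAGMA table_info({table_name})").fetchall()
    return {row[1]: row[2] for row in rows}


default_types = declared_types("SCHOOLS_DEFAULT")
typed_types = declared_types("SCHOOLS_TYPED")
print(default_types)
print(typed_types)
```

Comparing the two printed dictionaries shows which column types were chosen by default and which were set explicitly. With Db2 the same idea applies, but the available types and the connection setup differ.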

Therefore, __it is highly recommended to manually load the table using the database console LOAD tool, as indicated in Week 2 Lab 1 Part II__. The only difference from that lab is that in Step 5 of the instructions you will need to click "(+) New Table", specify the name of the table you want to create, and then click "Next".

##### Now open the Db2 console, open the LOAD tool, select or drag in the .CSV file for the CHICAGO PUBLIC SCHOOLS dataset, and load the dataset into a new table called __SCHOOLS__.

<a href="https://cognitiveclass.ai"><img src = "https://ibm.box.com/shared/static/uc4xjh1uxcc78ks1i18v668simioz4es.jpg"></a>

### Connect to the database
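Let us now load the ipython-sql extension and establish a connection with the database. The cells in this notebook take the pandas route instead: they read the CSV file into a DataFrame and answer each SQL task with a DataFrame operation.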

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline

pd.options.display.max_columns = 78

### Query the database system catalog to retrieve table metadata

##### You can verify that the table creation was successful by retrieving the list of all tables in your schema and checking whether the SCHOOLS table was created

In [2]:
# The lab asks for a catalog query listing all tables in your Db2 schema (username);
# this cell instead reads the CSV file into a pandas DataFrame.
df = pd.read_csv("Chicago_Public_Schools.csv")

In [3]:
df

Unnamed: 0,School ID,NAME_OF_SCHOOL,"Elementary, Middle, or High School",Street Address,City,State,ZIP Code,Phone Number,Link,Network Manager,Collaborative Name,Adequate Yearly Progress Made?,Track Schedule,CPS Performance Policy Status,CPS Performance Policy Level,HEALTHY_SCHOOL_CERTIFIED,Safety Icon,SAFETY_SCORE,Family Involvement Icon,Family Involvement Score,Environment Icon,Environment Score,Instruction Icon,Instruction Score,Leaders Icon,Leaders Score,Teachers Icon,Teachers Score,Parent Engagement Icon,Parent Engagement Score,Parent Environment Icon,Parent Environment Score,AVERAGE_STUDENT_ATTENDANCE,Rate of Misconducts (per 100 students),Average Teacher Attendance,Individualized Education Program Compliance Rate,Pk-2 Literacy %,Pk-2 Math %,Gr3-5 Grade Level Math %,Gr3-5 Grade Level Read %,Gr3-5 Keep Pace Read %,Gr3-5 Keep Pace Math %,Gr6-8 Grade Level Math %,Gr6-8 Grade Level Read %,Gr6-8 Keep Pace Math%,Gr6-8 Keep Pace Read %,Gr-8 Explore Math %,Gr-8 Explore Read %,ISAT Exceeding Math %,ISAT Exceeding Reading %,ISAT Value Add Math,ISAT Value Add Read,ISAT Value Add Color Math,ISAT Value Add Color Read,Students Taking Algebra %,Students Passing Algebra %,9th Grade EXPLORE (2009),9th Grade EXPLORE (2010),10th Grade PLAN (2009),10th Grade PLAN (2010),Net Change EXPLORE and PLAN,11th Grade Average ACT (2011),Net Change PLAN and ACT,College Eligibility %,Graduation Rate %,College Enrollment Rate %,COLLEGE_ENROLLMENT,General Services Route,Freshman on Track Rate %,X_COORDINATE,Y_COORDINATE,Latitude,Longitude,COMMUNITY_AREA_NUMBER,COMMUNITY_AREA_NAME,Ward,Police District,Location
0,610038,Abraham Lincoln Elementary School,ES,615 W Kemper Pl,Chicago,IL,60614,(773) 534-5720,http://schoolreports.cps.edu/SchoolProgressRep...,Fullerton Elementary Network,NORTH-NORTHWEST SIDE COLLABORATIVE,No,Standard,Not on Probation,Level 1,Yes,Very Strong,99.0,Very Strong,99,Strong,74.0,Strong,66.0,Strong,65,Strong,70,Strong,56,Average,47,96.00%,2.0,96.40%,95.80%,80.1,43.3,89.6,84.9,60.7,62.6,81.9,85.2,52,62.4,66.3,77.9,69.7,64.4,0.2,0.9,Yellow,Green,67.1,54.5,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,813,33,NDA,1171699.458,1915829.428,41.924497,-87.644522,7,LINCOLN PARK,43,18,"(41.92449696, -87.64452163)"
1,610281,Adam Clayton Powell Paideia Community Academy ...,ES,7511 S South Shore Dr,Chicago,IL,60649,(773) 535-6650,http://schoolreports.cps.edu/SchoolProgressRep...,Skyway Elementary Network,SOUTH SIDE COLLABORATIVE,No,Track_E,Not on Probation,Level 1,No,Average,54.0,Strong,66,Strong,74.0,Very Strong,84.0,Strong,63,Strong,76,Weak,46,Average,50,95.60%,15.7,95.30%,100.00%,62.4,51.7,21.9,15.1,29,42.8,38.5,27.4,44.8,42.7,14.1,34.4,16.8,16.5,0.7,1.4,Green,Green,17.2,27.3,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,521,46,NDA,1196129.985,1856209.466,41.760324,-87.556736,43,SOUTH SHORE,7,4,"(41.76032435, -87.55673627)"
2,610185,Adlai E Stevenson Elementary School,ES,8010 S Kostner Ave,Chicago,IL,60652,(773) 535-2280,http://schoolreports.cps.edu/SchoolProgressRep...,Midway Elementary Network,SOUTHWEST SIDE COLLABORATIVE,No,Standard,Not on Probation,Level 2,No,Strong,61.0,NDA,NDA,Average,50.0,Weak,36.0,NDA,NDA,NDA,NDA,Average,47,Weak,41,95.70%,2.3,94.70%,98.30%,53.7,26.6,38.3,34.7,43.7,57.3,48.8,39.2,46.8,44,7.5,21.9,18.3,15.5,-0.9,-1.0,Red,Red,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,1324,44,NDA,1148427.165,1851012.215,41.747111,-87.731702,70,ASHBURN,13,8,"(41.74711093, -87.73170248)"
3,609993,Agustin Lara Elementary Academy,ES,4619 S Wolcott Ave,Chicago,IL,60609,(773) 535-4389,http://schoolreports.cps.edu/SchoolProgressRep...,Pershing Elementary Network,SOUTHWEST SIDE COLLABORATIVE,No,Track_E,Not on Probation,Level 1,No,Average,56.0,Average,44,Average,45.0,Weak,37.0,Strong,65,Average,48,Average,53,Strong,58,95.50%,10.4,95.80%,100.00%,76.9,NDA,26,24.7,61.8,49.7,39.2,27.2,69.7,60.6,9.1,18.2,11.1,9.6,0.9,2.4,Green,Green,42.9,25,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,556,42,NDA,1164504.290,1873959.199,41.809757,-87.672145,61,NEW CITY,20,9,"(41.8097569, -87.6721446)"
4,610513,Air Force Academy High School,HS,3630 S Wells St,Chicago,IL,60609,(773) 535-1590,http://schoolreports.cps.edu/SchoolProgressRep...,Southwest Side High School Network,SOUTHWEST SIDE COLLABORATIVE,NDA,Standard,Not on Probation,Not Enough Data,Yes,Average,49.0,Strong,60,Strong,60.0,Average,55.0,Average,45,Average,54,Average,53,Average,49,93.30%,15.6,96.90%,100.00%,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,,,,,NDA,NDA,NDA,NDA,14.6,14.8,NDA,16,1.4,NDA,NDA,NDA,NDA,NDA,302,40,91.8,1175177.622,1880745.126,41.828146,-87.632794,34,ARMOUR SQUARE,11,9,"(41.82814609, -87.63279369)"
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
561,610172,William T Sherman Elementary School,ES,1000 W 52nd St,Chicago,IL,60609,(773) 535-1757,http://schoolreports.cps.edu/SchoolProgressRep...,AUSL Schools,SOUTHWEST SIDE COLLABORATIVE,No,Track_E,Probation,Level 3,No,Weak,32.0,NDA,NDA,Average,46.0,Average,55.0,NDA,NDA,NDA,NDA,Average,49,Average,52,92.30%,230.6,95.00%,100.00%,NDA,NDA,21.8,26.8,41.1,42,30.1,19.7,46.3,35.2,7.5,15.1,8.1,1.7,0.7,-1.3,Green,Red,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,462,45,NDA,1170500.817,1870373.159,41.799788,-87.650255,61,NEW CITY,16,9,"(41.79978772, -87.65025483)"
562,609844,William W Carter Elementary School,ES,5740 S Michigan Ave,Chicago,IL,60637,(773) 535-0860,http://schoolreports.cps.edu/SchoolProgressRep...,Burnham Park Elementary Network,SOUTH SIDE COLLABORATIVE,No,Standard,Probation,Level 3,No,Very Weak,13.0,Average,49,Weak,33.0,Weak,35.0,Average,56,Strong,62,Weak,46,Weak,46,91.20%,27.0,95.90%,100.00%,72.3,46.6,21.8,17.8,50.5,57.6,24.8,24.8,44.9,48.6,13.3,33.3,6.8,5.5,-0.2,-1.2,Yellow,Red,42.4,50,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,371,42,NDA,1178101.365,1866810.123,41.789841,-87.622490,40,WASHINGTON PARK,20,2,"(41.78984129, -87.62248974)"
563,610088,Wolfgang A Mozart Elementary School,ES,2200 N Hamlin Ave,Chicago,IL,60647,(773) 534-4160,http://schoolreports.cps.edu/SchoolProgressRep...,Fullerton Elementary Network,NORTH-NORTHWEST SIDE COLLABORATIVE,No,Standard,Not on Probation,Level 2,No,Average,41.0,NDA,NDA,Average,56.0,Weak,32.0,NDA,NDA,NDA,NDA,Average,50,Strong,54,95.20%,3.6,96.40%,100.00%,39.8,10.1,28.7,34.2,51.1,48.2,46.7,32.6,74.7,64,NDA,NDA,12.0,11.0,0.4,0.3,Yellow,Yellow,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,748,34,NDA,1150644.396,1914368.955,41.920927,-87.721925,22,LOGAN SQUARE,35,25,"(41.92092734, -87.72192541)"
564,609977,Woodlawn Community Elementary School,ES,6657 S Kimbark Ave,Chicago,IL,60637,(773) 535-0801,http://schoolreports.cps.edu/SchoolProgressRep...,Burnham Park Elementary Network,SOUTH SIDE COLLABORATIVE,No,Standard,Not on Probation,Level 2,No,Strong,70.0,NDA,NDA,Very Strong,80.0,Strong,66.0,NDA,NDA,NDA,NDA,Strong,59,Strong,54,93.90%,12.4,94.30%,100.00%,57.7,NDA,43.9,45.8,60,62.4,75,30,88.9,55.6,NDA,NDA,17.5,22.2,0.1,0.0,Yellow,Yellow,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,238,46,NDA,1185825.188,1860883.579,41.773400,-87.594356,42,WOODLAWN,5,3,"(41.77339962, -87.59435584)"


### Query the database system catalog to retrieve column metadata

##### The SCHOOLS table contains a large number of columns. How many columns does this table have?

In [4]:
df.columns

Index(['School ID', 'NAME_OF_SCHOOL', 'Elementary, Middle, or High School',
       'Street Address', 'City', 'State', 'ZIP Code', 'Phone Number', 'Link ',
       'Network Manager', 'Collaborative Name',
       'Adequate Yearly Progress Made? ', 'Track Schedule',
       'CPS Performance Policy Status', 'CPS Performance Policy Level',
       'HEALTHY_SCHOOL_CERTIFIED', 'Safety Icon ', 'SAFETY_SCORE',
       'Family Involvement Icon', 'Family Involvement Score',
       'Environment Icon ', 'Environment Score', 'Instruction Icon ',
       'Instruction Score', 'Leaders Icon ', 'Leaders Score ',
       'Teachers Icon ', 'Teachers Score', 'Parent Engagement Icon ',
       'Parent Engagement Score', 'Parent Environment Icon',
       'Parent Environment Score', 'AVERAGE_STUDENT_ATTENDANCE',
       'Rate of Misconducts (per 100 students) ', 'Average Teacher Attendance',
       'Individualized Education Program Compliance Rate ', 'Pk-2 Literacy %',
       'Pk-2 Math %', 'Gr3-5 Grade Level Math %'
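
The column-count question can also be answered directly, without counting names in the printed index. The following is a minimal sketch on a small stand-in frame (`df_demo` is a hypothetical name; in the notebook the same expressions would be applied to `df`).

```python
import pandas as pd

# Small stand-in frame; in the notebook, `df` is the full SCHOOLS DataFrame.
df_demo = pd.DataFrame(
    {
        "School ID": [610038],
        "NAME_OF_SCHOOL": ["Abraham Lincoln Elementary School"],
        "SAFETY_SCORE": [99.0],
    }
)

# shape is (rows, columns); the second entry is the column count.
n_rows, n_columns = df_demo.shape
print(n_columns)  # same value as len(df_demo.columns)
```

Applied to the full DataFrame, the result should agree with the "total 78 columns" line that `df.info()` reports.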

Now retrieve the list of columns in the SCHOOLS table and their column type (datatype) and length.

In [5]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 566 entries, 0 to 565
Data columns (total 78 columns):
 #   Column                                             Non-Null Count  Dtype  
---  ------                                             --------------  -----  
 0   School ID                                          566 non-null    int64  
 1   NAME_OF_SCHOOL                                     566 non-null    object 
 2   Elementary, Middle, or High School                 566 non-null    object 
 3   Street Address                                     566 non-null    object 
 4   City                                               566 non-null    object 
 5   State                                              566 non-null    object 
 6   ZIP Code                                           566 non-null    int64  
 7   Phone Number                                       566 non-null    object 
 8   Link                                               565 non-null    object 
 9   Network Ma

### Questions
1. Is the column name for the "SCHOOL ID" attribute in upper or mixed case?
1. What is the name of "Community Area Name" column in your table? Does it have spaces?
1. Are there any columns in whose names the spaces and parentheses (round brackets) have been replaced by the underscore character "_"?
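
These questions can be checked programmatically. The sketch below classifies a handful of column names copied from the `df.columns` output above, including the trailing spaces some of them carry. The helper `describe` and the flag names are hypothetical, chosen only for this illustration.

```python
import re

# A few column names copied from the df.columns output above,
# including the trailing spaces that some of them carry.
columns = [
    "School ID",
    "NAME_OF_SCHOOL",
    "Link ",
    "Leaders Score ",
    "Rate of Misconducts (per 100 students) ",
    "AVERAGE_STUDENT_ATTENDANCE",
]


def describe(name):
    """Classify one column name by case, spaces, and special characters."""
    return {
        "all_upper": name == name.upper(),
        "has_space": " " in name,
        "trailing_space": name != name.rstrip(),
        "has_parentheses": bool(re.search(r"[()]", name)),
        "has_underscore": "_" in name,
    }


report = {name: describe(name) for name in columns}
for name, flags in report.items():
    print(repr(name), flags)
```

Printing with `repr` makes trailing spaces visible, which matters because a name such as `'Link '` must be matched exactly, space included.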

In [6]:
df['School ID']

0      610038
1      610281
2      610185
3      609993
4      610513
        ...  
561    610172
562    609844
563    610088
564    609977
565    610392
Name: School ID, Length: 566, dtype: int64

In [7]:
df['COMMUNITY_AREA_NAME']

0         LINCOLN PARK
1          SOUTH SHORE
2              ASHBURN
3             NEW CITY
4        ARMOUR SQUARE
            ...       
561           NEW CITY
562    WASHINGTON PARK
563       LOGAN SQUARE
564           WOODLAWN
565     SOUTH LAWNDALE
Name: COMMUNITY_AREA_NAME, Length: 566, dtype: object

## Problems

### Problem 1

##### How many Elementary Schools are in the dataset?

In [8]:
df['Elementary, Middle, or High School'].value_counts()

ES    462
HS     93
MS     11
Name: Elementary, Middle, or High School, dtype: int64

Double-click __here__ for a hint

<!--
Which column specifies the school type e.g. 'ES', 'MS', 'HS'?
-->

Double-click __here__ for another hint

<!--
Does the column name have mixed case, spaces or other special characters?
If so, ensure you use double quotes around the "Name of the Column"
-->

Double-click __here__ for the solution.

<!-- Solution:

%sql select count(*) from SCHOOLS where "Elementary, Middle, or High School" = 'ES'

Correct answer: 462

-->
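
The quoting rule from the hints can be tried locally. This is a minimal sketch that assumes an in-memory SQLite database as a stand-in for Db2 and a three-row sample table; only the double-quoted identifier syntax is the point here, since SQLite's identifier rules differ from Db2's in other respects.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Db2 (assumption).
con = sqlite3.connect(":memory:")

# Double quotes mark an identifier, so spaces and commas are allowed in it.
con.execute(
    'CREATE TABLE SCHOOLS ('
    '"NAME_OF_SCHOOL" VARCHAR(100), '
    '"Elementary, Middle, or High School" VARCHAR(2))'
)
con.executemany(
    "INSERT INTO SCHOOLS VALUES (?, ?)",
    [
        ("Abraham Lincoln Elementary School", "ES"),
        ("Agustin Lara Elementary Academy", "ES"),
        ("Air Force Academy High School", "HS"),
    ],
)

# Single quotes mark a string literal; double quotes mark the column name.
query = (
    'SELECT COUNT(*) FROM SCHOOLS '
    "WHERE \"Elementary, Middle, or High School\" = 'ES'"
)
(es_count,) = con.execute(query).fetchone()
print(es_count)  # 2 for this three-row sample
```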

### Problem 2

##### What is the highest Safety Score?

In [9]:
df['SAFETY_SCORE'].max()

99.0

Double-click __here__ for a hint

<!--
Use the MAX() function
-->

Double-click __here__ for the solution.

<!-- Solution:

%sql select MAX(Safety_Score) AS MAX_SAFETY_SCORE from SCHOOLS

Correct answer: 99
-->


### Problem 3

##### Which schools have the highest Safety Score?

In [10]:
df[df['SAFETY_SCORE'] == 99.0]

Unnamed: 0,School ID,NAME_OF_SCHOOL,"Elementary, Middle, or High School",Street Address,City,State,ZIP Code,Phone Number,Link,Network Manager,Collaborative Name,Adequate Yearly Progress Made?,Track Schedule,CPS Performance Policy Status,CPS Performance Policy Level,HEALTHY_SCHOOL_CERTIFIED,Safety Icon,SAFETY_SCORE,Family Involvement Icon,Family Involvement Score,Environment Icon,Environment Score,Instruction Icon,Instruction Score,Leaders Icon,Leaders Score,Teachers Icon,Teachers Score,Parent Engagement Icon,Parent Engagement Score,Parent Environment Icon,Parent Environment Score,AVERAGE_STUDENT_ATTENDANCE,Rate of Misconducts (per 100 students),Average Teacher Attendance,Individualized Education Program Compliance Rate,Pk-2 Literacy %,Pk-2 Math %,Gr3-5 Grade Level Math %,Gr3-5 Grade Level Read %,Gr3-5 Keep Pace Read %,Gr3-5 Keep Pace Math %,Gr6-8 Grade Level Math %,Gr6-8 Grade Level Read %,Gr6-8 Keep Pace Math%,Gr6-8 Keep Pace Read %,Gr-8 Explore Math %,Gr-8 Explore Read %,ISAT Exceeding Math %,ISAT Exceeding Reading %,ISAT Value Add Math,ISAT Value Add Read,ISAT Value Add Color Math,ISAT Value Add Color Read,Students Taking Algebra %,Students Passing Algebra %,9th Grade EXPLORE (2009),9th Grade EXPLORE (2010),10th Grade PLAN (2009),10th Grade PLAN (2010),Net Change EXPLORE and PLAN,11th Grade Average ACT (2011),Net Change PLAN and ACT,College Eligibility %,Graduation Rate %,College Enrollment Rate %,COLLEGE_ENROLLMENT,General Services Route,Freshman on Track Rate %,X_COORDINATE,Y_COORDINATE,Latitude,Longitude,COMMUNITY_AREA_NUMBER,COMMUNITY_AREA_NAME,Ward,Police District,Location
0,610038,Abraham Lincoln Elementary School,ES,615 W Kemper Pl,Chicago,IL,60614,(773) 534-5720,http://schoolreports.cps.edu/SchoolProgressRep...,Fullerton Elementary Network,NORTH-NORTHWEST SIDE COLLABORATIVE,No,Standard,Not on Probation,Level 1,Yes,Very Strong,99.0,Very Strong,99,Strong,74.0,Strong,66.0,Strong,65,Strong,70,Strong,56,Average,47,96.00%,2.0,96.40%,95.80%,80.1,43.3,89.6,84.9,60.7,62.6,81.9,85.2,52,62.4,66.3,77.9,69.7,64.4,0.2,0.9,Yellow,Green,67.1,54.5,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,813,33,NDA,1171699.458,1915829.428,41.924497,-87.644522,7,LINCOLN PARK,43,18,"(41.92449696, -87.64452163)"
10,609799,Alexander Graham Bell Elementary School,ES,3730 N Oakley Ave,Chicago,IL,60618,(773) 534-5150,http://schoolreports.cps.edu/SchoolProgressRep...,Ravenswood-Ridge Elementary Network,NORTH-NORTHWEST SIDE COLLABORATIVE,No,Standard,Not on Probation,Level 1,No,Very Strong,99.0,Very Strong,88,Strong,64.0,Average,46.0,Average,51,Average,51,NDA,NDA,NDA,NDA,96.30%,6.3,95.90%,99.30%,91.9,67.3,79.2,77.4,53.3,54.7,84.2,83,49.8,53.6,62.6,71.7,64.0,57.9,0.0,0.3,Yellow,Yellow,58.1,65.6,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,998,35,NDA,1160327.881,1924862.722,41.949528,-87.686055,5,NORTH CENTER,47,19,"(41.94952795, -87.68605496)"
27,610084,Annie Keller Elementary Gifted Magnet School,ES,3020 W 108th St,Chicago,IL,60655,(773) 535-2636,http://schoolreports.cps.edu/SchoolProgressRep...,Rock Island Elementary Network,FAR SOUTH SIDE COLLABORATIVE,Yes,Standard,Not on Probation,Level 1,No,Very Strong,99.0,Very Strong,97,Very Strong,85.0,Very Strong,82.0,Very Strong,94,Very Strong,82,Strong,68,Strong,60,97.50%,4.9,96.50%,100.00%,100,NDA,100,100,63.4,74.7,96.5,97.7,61.2,76.7,89.3,100,92.8,92.3,1.6,2.3,Green,Green,51.5,70.6,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,245,49,NDA,1157959.455,1832892.067,41.697198,-87.697264,74,MOUNT GREENWOOD,19,22,"(41.69719792, -87.6972638)"
39,609820,Augustus H Burley Elementary School,ES,1630 W Barry Ave,Chicago,IL,60657,(773) 534-5475,http://schoolreports.cps.edu/SchoolProgressRep...,Ravenswood-Ridge Elementary Network,NORTH-NORTHWEST SIDE COLLABORATIVE,Yes,Standard,Not on Probation,Level 1,No,Very Strong,99.0,NDA,NDA,Strong,78.0,Strong,65.0,NDA,NDA,NDA,NDA,Strong,59,Average,49,96.50%,0.7,95.00%,97.90%,69.4,47,64.5,70.2,51.2,66.7,69.4,75,62.9,67.3,50,69.6,54.2,53.3,1.3,2.1,Green,Green,35.7,80,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,572,33,NDA,1164768.164,1920682.831,41.937965,-87.669852,6,LAKE VIEW,32,19,"(41.93796493, -87.66985204)"
118,610132,Edgar Allan Poe Elementary Classical School,ES,10538 S Langley Ave,Chicago,IL,60628,(773) 535-5525,http://schoolreports.cps.edu/SchoolProgressRep...,Lake Calumet Elementary Network,FAR SOUTH SIDE COLLABORATIVE,Yes,Standard,Not on Probation,Level 1,No,Very Strong,99.0,Weak,33,Strong,66.0,Very Strong,88.0,Weak,27,Average,41,Strong,55,Average,47,97.60%,0.0,97.20%,100.00%,100,78.3,94,91.6,51.8,65.5,78.6,92.9,53.6,60.7,NDA,NDA,78.8,79.6,1.7,-0.2,Green,Yellow,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,192,48,NDA,1182742.505,1835063.9,41.70262,-87.606456,50,PULLMAN,9,5,"(41.70261965, -87.60645552)"
119,609901,Edgebrook Elementary School,ES,6525 N Hiawatha Ave,Chicago,IL,60646,(773) 534-1194,http://schoolreports.cps.edu/SchoolProgressRep...,O'Hare Elementary Network,NORTH-NORTHWEST SIDE COLLABORATIVE,Yes,Standard,Not on Probation,Level 1,No,Very Strong,99.0,Very Strong,99,Average,51.0,Average,53.0,Strong,66,Strong,70,Strong,56,Weak,39,96.90%,3.4,96.60%,100.00%,96.3,47.6,88.1,74.1,55.2,75.2,80,79.1,54.5,60,50,71.4,60.2,56.8,0.0,0.5,Yellow,Yellow,83.3,64,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,468,30,NDA,1139595.896,1942911.685,41.99946,-87.761821,12,FOREST GLEN,41,16,"(41.99946016, -87.7618211)"
141,610073,Ellen Mitchell Elementary School,ES,2233 W Ohio St,Chicago,IL,60612,(773) 534-7655,http://schoolreports.cps.edu/SchoolProgressRep...,Fulton Elementary Network,WEST SIDE COLLABORATIVE,Yes,Standard,Not on Probation,Level 1,No,Very Strong,99.0,Strong,64,Very Strong,95.0,Very Strong,95.0,Very Strong,81,Very Strong,90,Strong,56,Strong,54,95.50%,12.2,97.70%,100.00%,46.7,41.9,53.1,43.9,72.4,75.5,35.5,51.3,55.3,55.3,21.1,26.3,30.1,27.3,-0.5,-0.3,Yellow,Yellow,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,344,35,NDA,1161269.097,1903925.361,41.892055,-87.683179,24,WEST TOWN,26,13,"(41.89205482, -87.68317867)"
244,610066,James E McDade Elementary Classical School,ES,8801 S Indiana Ave,Chicago,IL,60619,(773) 535-3669,http://schoolreports.cps.edu/SchoolProgressRep...,Skyway Elementary Network,SOUTH SIDE COLLABORATIVE,Yes,Standard,Not on Probation,Level 1,No,Very Strong,99.0,NDA,NDA,Average,57.0,Average,52.0,NDA,NDA,NDA,NDA,Strong,61,Average,52,96.20%,0.0,94.30%,100.00%,NDA,NDA,87.7,82.2,50.7,48.1,NDA,90,NDA,45,NDA,NDA,73.3,74.1,-1.1,-0.7,Red,Yellow,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,199,45,NDA,1179172.217,1846656.745,41.734514,-87.619177,44,CHATHAM,6,6,"(41.73451387, -87.61917677)"
245,609803,James G Blaine Elementary School,ES,1420 W Grace St,Chicago,IL,60613,(773) 534-5750,http://schoolreports.cps.edu/SchoolProgressRep...,Ravenswood-Ridge Elementary Network,NORTH-NORTHWEST SIDE COLLABORATIVE,Yes,Standard,Not on Probation,Level 1,No,Very Strong,99.0,NDA,NDA,Strong,76.0,Strong,74.0,NDA,NDA,NDA,NDA,Weak,40,Weak,46,96.40%,2.1,96.00%,88.40%,87.5,62.6,88.6,78.5,61.7,62.8,81.9,79.9,56.3,63.6,53.6,73.2,61.3,50.1,0.9,-0.4,Green,Yellow,50,63.3,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,923,33,NDA,1166063.167,1925373.268,41.950808,-87.664958,6,LAKE VIEW,44,19,"(41.95080812, -87.66495825)"
326,610033,LaSalle Elementary Language Academy,ES,1734 N Orleans St,Chicago,IL,60614,(773) 534-8470,http://schoolreports.cps.edu/SchoolProgressRep...,Fullerton Elementary Network,NORTH-NORTHWEST SIDE COLLABORATIVE,Yes,Standard,Not on Probation,Level 1,No,Very Strong,99.0,Strong,79,Strong,62.0,Average,52.0,Weak,39,Average,43,Average,53,Average,48,96.80%,7.0,97.60%,100.00%,86,47.4,81.4,78.3,67.7,76.4,79.2,84.7,73.2,76.5,61,71.2,55.5,49.5,0.3,-0.3,Yellow,Yellow,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,576,33,NDA,1173614.17,1911976.635,41.913882,-87.637601,7,LINCOLN PARK,43,18,"(41.9138823, -87.63760107)"


Double-click __here__ for the solution.

<!-- Solution:
In the previous problem we found out that the highest Safety Score is 99, so we can use that as an input in the where clause:

%sql select Name_of_School, Safety_Score from SCHOOLS where Safety_Score = 99

or, a better way:

%sql select Name_of_School, Safety_Score from SCHOOLS where \
  Safety_Score= (select MAX(Safety_Score) from SCHOOLS)


Correct answer: several schools with a Safety Score of 99.
-->
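
The subquery idea from the solution has a direct pandas counterpart: compute the maximum and filter on it, so the value 99 is never hard-coded. A minimal sketch on three illustrative rows (`df_demo` is a hypothetical name; the scores are taken from the outputs above):

```python
import pandas as pd

# Illustrative rows; SAFETY_SCORE values are taken from the outputs above.
df_demo = pd.DataFrame(
    {
        "NAME_OF_SCHOOL": [
            "Abraham Lincoln Elementary School",
            "Agustin Lara Elementary Academy",
            "Edgebrook Elementary School",
        ],
        "SAFETY_SCORE": [99.0, 56.0, 99.0],
    }
)

# Compute the maximum once, then keep every row that matches it.
top_score = df_demo["SAFETY_SCORE"].max()
safest = df_demo.loc[
    df_demo["SAFETY_SCORE"] == top_score, ["NAME_OF_SCHOOL", "SAFETY_SCORE"]
]
print(safest)
```

Because the filter uses the computed maximum, the same code keeps working if the data changes and the top score is no longer 99.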


### Problem 4

##### What are the top 10 schools with the highest "Average Student Attendance"?


In [11]:
df["AVERAGE_STUDENT_ATTENDANCE"].value_counts()

95.50%    25
95.10%    22
95.60%    17
96.20%    14
94.90%    14
          ..
71.30%     1
79.10%     1
87.30%     1
87.70%     1
72.20%     1
Name: AVERAGE_STUDENT_ATTENDANCE, Length: 148, dtype: int64

In [12]:
df[df["AVERAGE_STUDENT_ATTENDANCE"] == '98.40%'].head(10)

Unnamed: 0,School ID,NAME_OF_SCHOOL,"Elementary, Middle, or High School",Street Address,City,State,ZIP Code,Phone Number,Link,Network Manager,Collaborative Name,Adequate Yearly Progress Made?,Track Schedule,CPS Performance Policy Status,CPS Performance Policy Level,HEALTHY_SCHOOL_CERTIFIED,Safety Icon,SAFETY_SCORE,Family Involvement Icon,Family Involvement Score,Environment Icon,Environment Score,Instruction Icon,Instruction Score,Leaders Icon,Leaders Score,Teachers Icon,Teachers Score,Parent Engagement Icon,Parent Engagement Score,Parent Environment Icon,Parent Environment Score,AVERAGE_STUDENT_ATTENDANCE,Rate of Misconducts (per 100 students),Average Teacher Attendance,Individualized Education Program Compliance Rate,Pk-2 Literacy %,Pk-2 Math %,Gr3-5 Grade Level Math %,Gr3-5 Grade Level Read %,Gr3-5 Keep Pace Read %,Gr3-5 Keep Pace Math %,Gr6-8 Grade Level Math %,Gr6-8 Grade Level Read %,Gr6-8 Keep Pace Math%,Gr6-8 Keep Pace Read %,Gr-8 Explore Math %,Gr-8 Explore Read %,ISAT Exceeding Math %,ISAT Exceeding Reading %,ISAT Value Add Math,ISAT Value Add Read,ISAT Value Add Color Math,ISAT Value Add Color Read,Students Taking Algebra %,Students Passing Algebra %,9th Grade EXPLORE (2009),9th Grade EXPLORE (2010),10th Grade PLAN (2009),10th Grade PLAN (2010),Net Change EXPLORE and PLAN,11th Grade Average ACT (2011),Net Change PLAN and ACT,College Eligibility %,Graduation Rate %,College Enrollment Rate %,COLLEGE_ENROLLMENT,General Services Route,Freshman on Track Rate %,X_COORDINATE,Y_COORDINATE,Latitude,Longitude,COMMUNITY_AREA_NUMBER,COMMUNITY_AREA_NAME,Ward,Police District,Location
273,609959,John Charles Haines Elementary School,ES,247 W 23rd Pl,Chicago,IL,60616,(773) 534-9200,http://schoolreports.cps.edu/SchoolProgressRep...,Pershing Elementary Network,SOUTHWEST SIDE COLLABORATIVE,No,Standard,Not on Probation,Level 1,No,Weak,32.0,Average,47,Weak,37.0,Weak,37.0,Average,42,Average,48,Weak,43,Weak,43,98.40%,0.3,96.20%,96.90%,70.2,47.6,69.3,57.6,69,69.8,73.2,53.6,74.3,63.6,35.9,32,45.0,25.7,1.0,0.5,Green,Yellow,25.9,93.1,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,664,40,NDA,1174842.529,1888681.328,41.849931,-87.633786,34,ARMOUR SQUARE,25,9,"(41.84993116, -87.63378596)"


Double-click __here__ for the solution.

<!-- Solution:

%sql select Name_of_School, Average_Student_Attendance from SCHOOLS \
    order by Average_Student_Attendance desc nulls last limit 10 

-->
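
A pandas version of the "top 10" query needs a numeric column to sort on, because the attendance values are stored as strings such as `'98.40%'`. The sketch below is one possible approach on four illustrative rows; `df_demo` and the helper column `ATTENDANCE_PCT` are hypothetical names.

```python
import pandas as pd

# Illustrative rows; attendance strings are taken from the outputs in this notebook.
df_demo = pd.DataFrame(
    {
        "NAME_OF_SCHOOL": [
            "Abraham Lincoln Elementary School",
            "John Charles Haines Elementary School",
            "Richard T Crane Technical Preparatory High School",
            "Air Force Academy High School",
        ],
        "AVERAGE_STUDENT_ATTENDANCE": ["96.00%", "98.40%", "57.90%", "93.30%"],
    }
)

# Strip the percent sign and convert to float; unparseable values become NaN.
attendance = pd.to_numeric(
    df_demo["AVERAGE_STUDENT_ATTENDANCE"].str.rstrip("%"), errors="coerce"
)

# Sort descending with NaN at the end (the role of NULLS LAST), then take 10 rows.
top = (
    df_demo.assign(ATTENDANCE_PCT=attendance)
    .sort_values("ATTENDANCE_PCT", ascending=False, na_position="last")
    .head(10)
)
print(top[["NAME_OF_SCHOOL", "ATTENDANCE_PCT"]])
```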

### Problem 5

##### Retrieve the list of 5 Schools with the lowest Average Student Attendance sorted in ascending order based on attendance

In [13]:
df[df["AVERAGE_STUDENT_ATTENDANCE"] == '57.90%'].head(10)

Unnamed: 0,School ID,NAME_OF_SCHOOL,"Elementary, Middle, or High School",Street Address,City,State,ZIP Code,Phone Number,Link,Network Manager,Collaborative Name,Adequate Yearly Progress Made?,Track Schedule,CPS Performance Policy Status,CPS Performance Policy Level,HEALTHY_SCHOOL_CERTIFIED,Safety Icon,SAFETY_SCORE,Family Involvement Icon,Family Involvement Score,Environment Icon,Environment Score,Instruction Icon,Instruction Score,Leaders Icon,Leaders Score,Teachers Icon,Teachers Score,Parent Engagement Icon,Parent Engagement Score,Parent Environment Icon,Parent Environment Score,AVERAGE_STUDENT_ATTENDANCE,Rate of Misconducts (per 100 students),Average Teacher Attendance,Individualized Education Program Compliance Rate,Pk-2 Literacy %,Pk-2 Math %,Gr3-5 Grade Level Math %,Gr3-5 Grade Level Read %,Gr3-5 Keep Pace Read %,Gr3-5 Keep Pace Math %,Gr6-8 Grade Level Math %,Gr6-8 Grade Level Read %,Gr6-8 Keep Pace Math%,Gr6-8 Keep Pace Read %,Gr-8 Explore Math %,Gr-8 Explore Read %,ISAT Exceeding Math %,ISAT Exceeding Reading %,ISAT Value Add Math,ISAT Value Add Read,ISAT Value Add Color Math,ISAT Value Add Color Read,Students Taking Algebra %,Students Passing Algebra %,9th Grade EXPLORE (2009),9th Grade EXPLORE (2010),10th Grade PLAN (2009),10th Grade PLAN (2010),Net Change EXPLORE and PLAN,11th Grade Average ACT (2011),Net Change PLAN and ACT,College Eligibility %,Graduation Rate %,College Enrollment Rate %,COLLEGE_ENROLLMENT,General Services Route,Freshman on Track Rate %,X_COORDINATE,Y_COORDINATE,Latitude,Longitude,COMMUNITY_AREA_NUMBER,COMMUNITY_AREA_NAME,Ward,Police District,Location
451,609702,Richard T Crane Technical Preparatory High School,HS,2245 W Jackson Blvd,Chicago,IL,60612,(773) 534-7550,http://schoolreports.cps.edu/SchoolProgressRep...,West Side High School Network,WEST SIDE COLLABORATIVE,No,Standard,Probation,Level 3,No,Average,43.0,NDA,NDA,Average,50.0,Average,48.0,NDA,NDA,NDA,NDA,Average,47,Weak,46,57.90%,19.9,93.90%,99.10%,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,,,,,NDA,NDA,NDA,NDA,12,11.9,13.2,13.6,1.6,14,0.8,3.7,48,32.6,478,38,40.9,1161290.138,1898611.918,41.877474,-87.683249,28,NEAR WEST SIDE,2,12,"(41.87747384, -87.68324922)"


Double-click __here__ for the solution.

<!-- Solution:

%sql SELECT Name_of_School, Average_Student_Attendance  \
     from SCHOOLS \
     order by Average_Student_Attendance \
     fetch first 5 rows only

-->
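
When attendance is held as text, ordering follows character-by-character comparison, which can differ from numeric order. The sketch below shows the difference on a few values; `"100.00%"` is a hypothetical attendance value included only to expose the effect, and the variable names are illustrative.

```python
import pandas as pd

# Hypothetical attendance strings; "100.00%" is included to expose the pitfall.
attendance_text = pd.Series(["93.30%", "100.00%", "57.90%", "60.90%"])

# Text ordering compares character by character, so "100.00%" sorts first.
text_order = attendance_text.sort_values().tolist()

# Numeric ordering after removing the percent sign.
attendance_pct = attendance_text.str.rstrip("%").astype(float)
numeric_order = attendance_pct.sort_values().tolist()

print(text_order)
print(numeric_order)
```

If the column is stored as text in the database, the same caveat may apply to an `ORDER BY` on it, so converting to a numeric type before sorting is the safer choice.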


### Problem 6

##### Now remove the '%' sign from the above result set for Average Student Attendance column

In [14]:
df[df["AVERAGE_STUDENT_ATTENDANCE"] < '66.00%'].head(10)

Unnamed: 0,School ID,NAME_OF_SCHOOL,"Elementary, Middle, or High School",Street Address,City,State,ZIP Code,Phone Number,Link,Network Manager,Collaborative Name,Adequate Yearly Progress Made?,Track Schedule,CPS Performance Policy Status,CPS Performance Policy Level,HEALTHY_SCHOOL_CERTIFIED,Safety Icon,SAFETY_SCORE,Family Involvement Icon,Family Involvement Score,Environment Icon,Environment Score,Instruction Icon,Instruction Score,Leaders Icon,Leaders Score,Teachers Icon,Teachers Score,Parent Engagement Icon,Parent Engagement Score,Parent Environment Icon,Parent Environment Score,AVERAGE_STUDENT_ATTENDANCE,Rate of Misconducts (per 100 students),Average Teacher Attendance,Individualized Education Program Compliance Rate,Pk-2 Literacy %,Pk-2 Math %,Gr3-5 Grade Level Math %,Gr3-5 Grade Level Read %,Gr3-5 Keep Pace Read %,Gr3-5 Keep Pace Math %,Gr6-8 Grade Level Math %,Gr6-8 Grade Level Read %,Gr6-8 Keep Pace Math%,Gr6-8 Keep Pace Read %,Gr-8 Explore Math %,Gr-8 Explore Read %,ISAT Exceeding Math %,ISAT Exceeding Reading %,ISAT Value Add Math,ISAT Value Add Read,ISAT Value Add Color Math,ISAT Value Add Color Read,Students Taking Algebra %,Students Passing Algebra %,9th Grade EXPLORE (2009),9th Grade EXPLORE (2010),10th Grade PLAN (2009),10th Grade PLAN (2010),Net Change EXPLORE and PLAN,11th Grade Average ACT (2011),Net Change PLAN and ACT,College Eligibility %,Graduation Rate %,College Enrollment Rate %,COLLEGE_ENROLLMENT,General Services Route,Freshman on Track Rate %,X_COORDINATE,Y_COORDINATE,Latitude,Longitude,COMMUNITY_AREA_NUMBER,COMMUNITY_AREA_NAME,Ward,Police District,Location
44,609871,Barbara Vick Early Childhood & Family Center,ES,2554 W 113th St,Chicago,IL,60655,(773) 535-2671,http://schoolreports.cps.edu/SchoolProgressRep...,Rock Island Elementary Network,FAR SOUTH SIDE COLLABORATIVE,NDA,Track_E,NDA,NDA,No,NDA,,NDA,NDA,NDA,,NDA,,NDA,NDA,NDA,NDA,Strong,66,Strong,60,60.90%,0.0,95.50%,100.00%,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,,,,,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,204,49,NDA,1161152.12,1829643.669,41.688218,-87.685663,75,MORGAN PARK,19,22,"(41.68821843, -87.6856634)"
117,609736,Dyett High School,HS,555 E 51st St,Chicago,IL,60615,(773) 535-1825,http://schoolreports.cps.edu/SchoolProgressRep...,South Side High School Network,SOUTH SIDE COLLABORATIVE,No,Track_E,Probation,Level 3,No,Weak,27.0,Very Weak,18,Weak,35.0,Average,47.0,Weak,35,Weak,25,NDA,NDA,NDA,NDA,62.50%,24.4,93.50%,100.00%,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,,,,,NDA,NDA,NDA,NDA,11.6,11.7,13,12.9,1.3,14,1,3.3,33.7,51,318,42,37.1,1180944.201,1871282.832,41.80205,-87.611928,40,WASHINGTON PARK,4,2,"(41.80204982, -87.61192836)"
451,609702,Richard T Crane Technical Preparatory High School,HS,2245 W Jackson Blvd,Chicago,IL,60612,(773) 534-7550,http://schoolreports.cps.edu/SchoolProgressRep...,West Side High School Network,WEST SIDE COLLABORATIVE,No,Standard,Probation,Level 3,No,Average,43.0,NDA,NDA,Average,50.0,Average,48.0,NDA,NDA,NDA,NDA,Average,47,Weak,46,57.90%,19.9,93.90%,99.10%,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,,,,,NDA,NDA,NDA,NDA,12,11.9,13.2,13.6,1.6,14,0.8,3.7,48,32.6,478,38,40.9,1161290.138,1898611.918,41.877474,-87.683249,28,NEAR WEST SIDE,2,12,"(41.87747384, -87.68324922)"
526,609727,Wendell Phillips Academy High School,HS,244 E Pershing Rd,Chicago,IL,60653,(773) 535-1603,http://schoolreports.cps.edu/SchoolProgressRep...,AUSL Schools,SOUTH SIDE COLLABORATIVE,No,Track_E,Probation,Level 3,No,NDA,,Strong,71,NDA,,NDA,,Very Strong,80,Strong,79,NDA,NDA,NDA,NDA,63.00%,22.0,96.10%,100.00%,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,,,,,NDA,NDA,NDA,NDA,11.5,12.8,13.4,12.9,1.4,15,1.6,2.5,40.3,42.6,590,40,32.1,1178735.106,1879229.78,41.823908,-87.619788,35,DOUGLAS,3,2,"(41.82390751, -87.61978794)"


Double-click __here__ for a hint

<!--
Use the REPLACE() function to replace '%' with ''
See documentation for this function at:
https://www.ibm.com/support/knowledgecenter/en/SSEPGG_10.5.0/com.ibm.db2.luw.sql.ref.doc/doc/r0000843.html
-->

Double-click __here__ for the solution.

<!-- Solution:

%sql SELECT Name_of_School, REPLACE(Average_Student_Attendance, '%', '') \
     from SCHOOLS \
     order by Average_Student_Attendance \
     fetch first 5 rows only

-->


### Problem 7

##### Which Schools have Average Student Attendance lower than 70%?

In [15]:
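# Compare attendance as numbers, not strings: strip the '%' sign first
df[pd.to_numeric(df["AVERAGE_STUDENT_ATTENDANCE"].str.rstrip("%"), errors="coerce") < 70].head(10)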

Unnamed: 0,School ID,NAME_OF_SCHOOL,"Elementary, Middle, or High School",Street Address,City,State,ZIP Code,Phone Number,Link,Network Manager,Collaborative Name,Adequate Yearly Progress Made?,Track Schedule,CPS Performance Policy Status,CPS Performance Policy Level,HEALTHY_SCHOOL_CERTIFIED,Safety Icon,SAFETY_SCORE,Family Involvement Icon,Family Involvement Score,Environment Icon,Environment Score,Instruction Icon,Instruction Score,Leaders Icon,Leaders Score,Teachers Icon,Teachers Score,Parent Engagement Icon,Parent Engagement Score,Parent Environment Icon,Parent Environment Score,AVERAGE_STUDENT_ATTENDANCE,Rate of Misconducts (per 100 students),Average Teacher Attendance,Individualized Education Program Compliance Rate,Pk-2 Literacy %,Pk-2 Math %,Gr3-5 Grade Level Math %,Gr3-5 Grade Level Read %,Gr3-5 Keep Pace Read %,Gr3-5 Keep Pace Math %,Gr6-8 Grade Level Math %,Gr6-8 Grade Level Read %,Gr6-8 Keep Pace Math%,Gr6-8 Keep Pace Read %,Gr-8 Explore Math %,Gr-8 Explore Read %,ISAT Exceeding Math %,ISAT Exceeding Reading %,ISAT Value Add Math,ISAT Value Add Read,ISAT Value Add Color Math,ISAT Value Add Color Read,Students Taking Algebra %,Students Passing Algebra %,9th Grade EXPLORE (2009),9th Grade EXPLORE (2010),10th Grade PLAN (2009),10th Grade PLAN (2010),Net Change EXPLORE and PLAN,11th Grade Average ACT (2011),Net Change PLAN and ACT,College Eligibility %,Graduation Rate %,College Enrollment Rate %,COLLEGE_ENROLLMENT,General Services Route,Freshman on Track Rate %,X_COORDINATE,Y_COORDINATE,Latitude,Longitude,COMMUNITY_AREA_NUMBER,COMMUNITY_AREA_NAME,Ward,Police District,Location
44,609871,Barbara Vick Early Childhood & Family Center,ES,2554 W 113th St,Chicago,IL,60655,(773) 535-2671,http://schoolreports.cps.edu/SchoolProgressRep...,Rock Island Elementary Network,FAR SOUTH SIDE COLLABORATIVE,NDA,Track_E,NDA,NDA,No,NDA,,NDA,NDA,NDA,,NDA,,NDA,NDA,NDA,NDA,Strong,66,Strong,60,60.90%,0.0,95.50%,100.00%,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,,,,,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,204,49,NDA,1161152.12,1829643.669,41.688218,-87.685663,75,MORGAN PARK,19,22,"(41.68821843, -87.6856634)"
85,609674,Chicago Vocational Career Academy High School,HS,2100 E 87th St,Chicago,IL,60617,(773) 535-6100,http://schoolreports.cps.edu/SchoolProgressRep...,South Side High School Network,SOUTH SIDE COLLABORATIVE,No,Standard,Probation,Level 3,No,Weak,27.0,Average,43,Weak,21.0,Weak,30.0,Weak,34,Weak,39,NDA,NDA,NDA,NDA,68.80%,33.3,94.60%,99.10%,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,,,,,NDA,NDA,NDA,NDA,12.5,12.6,13.8,13.2,0.7,14.3,0.5,7,49.5,54,833,47,44.1,1191700.593,1847743.551,41.737202,-87.573244,45,AVALON PARK,8,4,"(41.73720173, -87.57324389)"
117,609736,Dyett High School,HS,555 E 51st St,Chicago,IL,60615,(773) 535-1825,http://schoolreports.cps.edu/SchoolProgressRep...,South Side High School Network,SOUTH SIDE COLLABORATIVE,No,Track_E,Probation,Level 3,No,Weak,27.0,Very Weak,18,Weak,35.0,Average,47.0,Weak,35,Weak,25,NDA,NDA,NDA,NDA,62.50%,24.4,93.50%,100.00%,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,,,,,NDA,NDA,NDA,NDA,11.6,11.7,13,12.9,1.3,14,1,3.3,33.7,51,318,42,37.1,1180944.201,1871282.832,41.80205,-87.611928,40,WASHINGTON PARK,4,2,"(41.80204982, -87.61192836)"
352,609722,Manley Career Academy High School,HS,2935 W Polk St,Chicago,IL,60612,(773) 534-6900,http://schoolreports.cps.edu/SchoolProgressRep...,West Side High School Network,WEST SIDE COLLABORATIVE,No,Standard,Probation,Level 3,No,Average,41.0,Weak,39,Average,43.0,Weak,31.0,Very Weak,19,Weak,32,NDA,NDA,NDA,NDA,66.80%,19.7,95.40%,98.40%,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,,,,,NDA,NDA,NDA,NDA,12.2,11.9,13.3,13,0.8,13.8,0.5,6.7,49,51.9,599,37,59.3,1156776.858,1896186.78,41.870912,-87.699887,27,EAST GARFIELD PARK,28,11,"(41.87091163, -87.69988652)"
419,610389,Orr Academy High School,HS,730 N Pulaski Rd,Chicago,IL,60624,(773) 534-6500,http://schoolreports.cps.edu/SchoolProgressRep...,AUSL Schools,WEST SIDE COLLABORATIVE,No,Standard,Probation,Level 3,No,NDA,,NDA,NDA,NDA,,NDA,,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,66.30%,10.2,95.00%,99.60%,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,,,,,NDA,NDA,NDA,NDA,11.7,12,13.6,13.3,1.6,15.1,1.5,5.8,NDA,51.9,831,34,59.5,1149548.53,1904711.708,41.894448,-87.726203,23,HUMBOLDT PARK,28,11,"(41.89444828, -87.72620305)"
451,609702,Richard T Crane Technical Preparatory High School,HS,2245 W Jackson Blvd,Chicago,IL,60612,(773) 534-7550,http://schoolreports.cps.edu/SchoolProgressRep...,West Side High School Network,WEST SIDE COLLABORATIVE,No,Standard,Probation,Level 3,No,Average,43.0,NDA,NDA,Average,50.0,Average,48.0,NDA,NDA,NDA,NDA,Average,47,Weak,46,57.90%,19.9,93.90%,99.10%,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,,,,,NDA,NDA,NDA,NDA,12,11.9,13.2,13.6,1.6,14,0.8,3.7,48,32.6,478,38,40.9,1161290.138,1898611.918,41.877474,-87.683249,28,NEAR WEST SIDE,2,12,"(41.87747384, -87.68324922)"
462,609759,Roberto Clemente Community Academy High School,HS,1147 N Western Ave,Chicago,IL,60622,(773) 534-4000,http://schoolreports.cps.edu/SchoolProgressRep...,West Side High School Network,WEST SIDE COLLABORATIVE,No,Standard,Probation,Level 2,No,Average,44.0,Weak,31,Weak,39.0,Weak,34.0,Weak,28,Weak,25,NDA,NDA,NDA,NDA,69.60%,20.6,94.20%,99.60%,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,,,,,NDA,NDA,NDA,NDA,12.6,12.3,14,13.3,0.7,15.4,1.4,12.5,49.5,41.5,1016,35,58.9,1160235.206,1907768.877,41.902623,-87.686869,24,WEST TOWN,1,13,"(41.90262318, -87.68686934)"
526,609727,Wendell Phillips Academy High School,HS,244 E Pershing Rd,Chicago,IL,60653,(773) 535-1603,http://schoolreports.cps.edu/SchoolProgressRep...,AUSL Schools,SOUTH SIDE COLLABORATIVE,No,Track_E,Probation,Level 3,No,NDA,,Strong,71,NDA,,NDA,,Very Strong,80,Strong,79,NDA,NDA,NDA,NDA,63.00%,22.0,96.10%,100.00%,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,,,,,NDA,NDA,NDA,NDA,11.5,12.8,13.4,12.9,1.4,15,1.6,2.5,40.3,42.6,590,40,32.1,1178735.106,1879229.78,41.823908,-87.619788,35,DOUGLAS,3,2,"(41.82390751, -87.61978794)"


Double-click __here__ for a hint

<!--
The datatype of the "Average_Student_Attendance" column is varchar.
So you cannot use it as is in the where clause for a numeric comparison.
First use the CAST() function to cast it as a DECIMAL or DOUBLE
e.g. CAST("Column_Name" as DOUBLE)
or simply: DECIMAL("Column_Name")
-->

Double-click __here__ for another hint

<!--
Don't forget that the '%' sign needs to be removed before casting
-->

Double-click __here__ for the solution.

<!-- Solution:

%sql SELECT Name_of_School, Average_Student_Attendance  \
     from SCHOOLS \
     where CAST ( REPLACE(Average_Student_Attendance, '%', '') AS DOUBLE ) < 70 \
     order by Average_Student_Attendance
     
or,

%sql SELECT Name_of_School, Average_Student_Attendance  \
     from SCHOOLS \
     where DECIMAL ( REPLACE(Average_Student_Attendance, '%', '') ) < 70 \
     order by Average_Student_Attendance

-->
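
The reasoning in the hints can be checked on a small table. The following is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for Db2 (an assumption made so the example is self-contained; the simplified two-column `schools` table and the "Hypothetical School" row are made up for illustration). It compares a text comparison against `'70.00%'` with a numeric comparison after removing the `%` sign and casting.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Db2 (assumption: illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE schools (name_of_school TEXT, average_student_attendance TEXT)"
)
conn.executemany(
    "INSERT INTO schools VALUES (?, ?)",
    [
        ("Richard T Crane Technical Preparatory High School", "57.90%"),
        ("Roberto Clemente Community Academy High School", "69.60%"),
        ("Albert G Lane Technical High School", "96.30%"),
        ("Hypothetical School", "100.00%"),  # made-up row, not from the dataset
    ],
)

# Text comparison: '100.00%' counts as "less than" '70.00%' because '1' sorts before '7'.
as_text = [row[0] for row in conn.execute(
    "SELECT average_student_attendance FROM schools "
    "WHERE average_student_attendance < '70.00%' "
    "ORDER BY name_of_school"
)]

# Numeric comparison: remove '%' and cast to a number before comparing with 70.
as_number = [row[0] for row in conn.execute(
    "SELECT average_student_attendance FROM schools "
    "WHERE CAST(REPLACE(average_student_attendance, '%', '') AS REAL) < 70 "
    "ORDER BY name_of_school"
)]

print(as_text)    # includes the made-up '100.00%' row
print(as_number)  # only the values that are numerically below 70
conn.close()
```

SQLite compares text columns without complaint, which makes the pitfall easy to see; other databases may behave differently when a text column is compared with a number, so the explicit cast is the safer habit.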


### Problem 8

##### Get the total College Enrollment for each Community Area

In [16]:
pd.DataFrame(df.groupby(by='COMMUNITY_AREA_NAME')['COLLEGE_ENROLLMENT'].sum())

COMMUNITY_AREA_NAME,COLLEGE_ENROLLMENT
ALBANY PARK,6864
ARCHER HEIGHTS,4823
ARMOUR SQUARE,1458
ASHBURN,6483
AUBURN GRESHAM,4175
...,...
WEST LAWN,4207
WEST PULLMAN,3240
WEST RIDGE,8197
WEST TOWN,9429


Double-click __here__ for a hint

<!--
Verify the exact name of the Enrollment column in the database
Use the SUM() function to add up the Enrollments for each Community Area
-->

Double-click __here__ for another hint

<!--
Don't forget to group by the Community Area
-->

Double-click __here__ for the solution.

<!-- Solution:

%sql select Community_Area_Name, sum(College_Enrollment) AS TOTAL_ENROLLMENT \
   from SCHOOLS \
   group by Community_Area_Name 

-->


### Problem 9

##### Get the 5 Community Areas with the least total College Enrollment, sorted in ascending order

In [17]:
enroll = pd.DataFrame(df.groupby(by='COMMUNITY_AREA_NAME')['COLLEGE_ENROLLMENT'].sum())

In [18]:
enroll

COMMUNITY_AREA_NAME,COLLEGE_ENROLLMENT
ALBANY PARK,6864
ARCHER HEIGHTS,4823
ARMOUR SQUARE,1458
ASHBURN,6483
AUBURN GRESHAM,4175
...,...
WEST LAWN,4207
WEST PULLMAN,3240
WEST RIDGE,8197
WEST TOWN,9429


In [19]:
enroll["COLLEGE_ENROLLMENT"].sort_values(ascending=True).head(5)

COMMUNITY_AREA_NAME
OAKLAND        140
FULLER PARK    531
BURNSIDE       549
OHARE          786
LOOP           871
Name: COLLEGE_ENROLLMENT, dtype: int64

Double-click __here__ for a hint

<!--
Order the previous query and limit the number of rows you fetch
-->

Double-click __here__ for the solution.

<!-- Solution:

%sql select Community_Area_Name, sum(College_Enrollment) AS TOTAL_ENROLLMENT \
   from SCHOOLS \
   group by Community_Area_Name \
   order by TOTAL_ENROLLMENT asc \
   fetch first 5 rows only

-->
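
A minimal sketch of the same group, order, and limit pattern, run against an in-memory SQLite database through Python's `sqlite3` module rather than Db2 (an assumption made so the example is self-contained). The area names and enrollment figures are made up for illustration, and SQLite spells the row limit as `LIMIT 2` where the Db2 solution uses `fetch first ... rows only`.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Db2 (assumption: illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE schools (community_area_name TEXT, college_enrollment INTEGER)"
)
# Made-up rows: several schools per hypothetical community area.
conn.executemany(
    "INSERT INTO schools VALUES (?, ?)",
    [
        ("AREA A", 300), ("AREA A", 200),
        ("AREA B", 150),
        ("AREA C", 400), ("AREA C", 50),
        ("AREA D", 900),
    ],
)

# Sum enrollment per area, sort by the aggregate, and keep the two smallest totals.
rows = conn.execute(
    "SELECT community_area_name, SUM(college_enrollment) AS total_enrollment "
    "FROM schools "
    "GROUP BY community_area_name "
    "ORDER BY total_enrollment ASC "
    "LIMIT 2"
).fetchall()

for name, total in rows:
    print(name, total)
conn.close()
```

Ordering by the alias `total_enrollment` sorts on the summed value for each group, not on the enrollment of any single school.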

### Problem 10

##### Get the hardship index for the community area which has College Enrollment of 4368

In [20]:
df[df['COLLEGE_ENROLLMENT']== 4368]

Unnamed: 0,School ID,NAME_OF_SCHOOL,"Elementary, Middle, or High School",Street Address,City,State,ZIP Code,Phone Number,Link,Network Manager,Collaborative Name,Adequate Yearly Progress Made?,Track Schedule,CPS Performance Policy Status,CPS Performance Policy Level,HEALTHY_SCHOOL_CERTIFIED,Safety Icon,SAFETY_SCORE,Family Involvement Icon,Family Involvement Score,Environment Icon,Environment Score,Instruction Icon,Instruction Score,Leaders Icon,Leaders Score,Teachers Icon,Teachers Score,Parent Engagement Icon,Parent Engagement Score,Parent Environment Icon,Parent Environment Score,AVERAGE_STUDENT_ATTENDANCE,Rate of Misconducts (per 100 students),Average Teacher Attendance,Individualized Education Program Compliance Rate,Pk-2 Literacy %,Pk-2 Math %,Gr3-5 Grade Level Math %,Gr3-5 Grade Level Read %,Gr3-5 Keep Pace Read %,Gr3-5 Keep Pace Math %,Gr6-8 Grade Level Math %,Gr6-8 Grade Level Read %,Gr6-8 Keep Pace Math%,Gr6-8 Keep Pace Read %,Gr-8 Explore Math %,Gr-8 Explore Read %,ISAT Exceeding Math %,ISAT Exceeding Reading %,ISAT Value Add Math,ISAT Value Add Read,ISAT Value Add Color Math,ISAT Value Add Color Read,Students Taking Algebra %,Students Passing Algebra %,9th Grade EXPLORE (2009),9th Grade EXPLORE (2010),10th Grade PLAN (2009),10th Grade PLAN (2010),Net Change EXPLORE and PLAN,11th Grade Average ACT (2011),Net Change PLAN and ACT,College Eligibility %,Graduation Rate %,College Enrollment Rate %,COLLEGE_ENROLLMENT,General Services Route,Freshman on Track Rate %,X_COORDINATE,Y_COORDINATE,Latitude,Longitude,COMMUNITY_AREA_NUMBER,COMMUNITY_AREA_NAME,Ward,Police District,Location
6,609720,Albert G Lane Technical High School,HS,2501 W Addison St,Chicago,IL,60618,(773) 534-5400,http://schoolreports.cps.edu/SchoolProgressRep...,North-Northwest Side High School Network,NORTH-NORTHWEST SIDE COLLABORATIVE,Yes,Standard,Not on Probation,Level 1,No,Very Strong,88.0,NDA,NDA,Strong,62.0,Average,52.0,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,96.30%,2.1,96.20%,99.40%,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,,,,,NDA,NDA,NDA,NDA,19.1,19.5,19.9,20.1,1,23.4,3.5,67.9,92.2,79.8,4368,35,90.7,1158975.392,1923791.705,41.946617,-87.691056,5,NORTH CENTER,47,19,"(41.94661693, -87.69105603)"


Double-click __here__ for the solution.

<!-- Solution:
NOTE: For this solution to work the CHICAGO_SOCIOECONOMIC_DATA table 
      as created in the last lab of Week 3 should already exist

%%sql 
select hardship_index 
   from chicago_socioeconomic_data CD, schools CPS 
   where CD.ca = CPS.community_area_number 
      and college_enrollment = 4368

-->
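
The solution joins the two tables on the community area number. A minimal sketch of that join, written with an explicit `JOIN ... ON` clause and run in an in-memory SQLite database via Python's `sqlite3` module instead of Db2 (an assumption made so the example is self-contained). Both tables are reduced to a few columns, and every row, including the area numbers and hardship values, is made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Db2 (assumption: illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chicago_socioeconomic_data "
    "(ca INTEGER, community_area_name TEXT, hardship_index INTEGER)"
)
conn.execute(
    "CREATE TABLE schools "
    "(name_of_school TEXT, community_area_number INTEGER, college_enrollment INTEGER)"
)

# Made-up community areas and hardship values (not taken from the real dataset).
conn.executemany(
    "INSERT INTO chicago_socioeconomic_data VALUES (?, ?, ?)",
    [(901, "AREA X", 10), (902, "AREA Y", 55)],
)
# Made-up schools; only the enrollment value 4368 mirrors the query in the solution.
conn.executemany(
    "INSERT INTO schools VALUES (?, ?, ?)",
    [("School 1", 901, 4368), ("School 2", 902, 318), ("School 3", 902, 590)],
)

# Match each school to its community area, then keep the school with the given enrollment.
rows = conn.execute(
    "SELECT CD.community_area_name, CD.hardship_index "
    "FROM chicago_socioeconomic_data CD "
    "JOIN schools CPS ON CD.ca = CPS.community_area_number "
    "WHERE CPS.college_enrollment = ?",
    (4368,),
).fetchall()

print(rows)
conn.close()
```

The explicit `JOIN ... ON` form expresses the same inner join as listing both tables in the `from` clause and matching them in `where`; passing the enrollment value as a `?` parameter keeps the literal out of the SQL text.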

### Problem 11

##### Get the hardship index for the community area which has the highest value for College Enrollment

In [21]:
df['COLLEGE_ENROLLMENT'].max()

4368

In [22]:
df[df['COLLEGE_ENROLLMENT']== 4368]

Unnamed: 0,School ID,NAME_OF_SCHOOL,"Elementary, Middle, or High School",Street Address,City,State,ZIP Code,Phone Number,Link,Network Manager,Collaborative Name,Adequate Yearly Progress Made?,Track Schedule,CPS Performance Policy Status,CPS Performance Policy Level,HEALTHY_SCHOOL_CERTIFIED,Safety Icon,SAFETY_SCORE,Family Involvement Icon,Family Involvement Score,Environment Icon,Environment Score,Instruction Icon,Instruction Score,Leaders Icon,Leaders Score,Teachers Icon,Teachers Score,Parent Engagement Icon,Parent Engagement Score,Parent Environment Icon,Parent Environment Score,AVERAGE_STUDENT_ATTENDANCE,Rate of Misconducts (per 100 students),Average Teacher Attendance,Individualized Education Program Compliance Rate,Pk-2 Literacy %,Pk-2 Math %,Gr3-5 Grade Level Math %,Gr3-5 Grade Level Read %,Gr3-5 Keep Pace Read %,Gr3-5 Keep Pace Math %,Gr6-8 Grade Level Math %,Gr6-8 Grade Level Read %,Gr6-8 Keep Pace Math%,Gr6-8 Keep Pace Read %,Gr-8 Explore Math %,Gr-8 Explore Read %,ISAT Exceeding Math %,ISAT Exceeding Reading %,ISAT Value Add Math,ISAT Value Add Read,ISAT Value Add Color Math,ISAT Value Add Color Read,Students Taking Algebra %,Students Passing Algebra %,9th Grade EXPLORE (2009),9th Grade EXPLORE (2010),10th Grade PLAN (2009),10th Grade PLAN (2010),Net Change EXPLORE and PLAN,11th Grade Average ACT (2011),Net Change PLAN and ACT,College Eligibility %,Graduation Rate %,College Enrollment Rate %,COLLEGE_ENROLLMENT,General Services Route,Freshman on Track Rate %,X_COORDINATE,Y_COORDINATE,Latitude,Longitude,COMMUNITY_AREA_NUMBER,COMMUNITY_AREA_NAME,Ward,Police District,Location
6,609720,Albert G Lane Technical High School,HS,2501 W Addison St,Chicago,IL,60618,(773) 534-5400,http://schoolreports.cps.edu/SchoolProgressRep...,North-Northwest Side High School Network,NORTH-NORTHWEST SIDE COLLABORATIVE,Yes,Standard,Not on Probation,Level 1,No,Very Strong,88.0,NDA,NDA,Strong,62.0,Average,52.0,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,96.30%,2.1,96.20%,99.40%,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,NDA,,,,,NDA,NDA,NDA,NDA,19.1,19.5,19.9,20.1,1,23.4,3.5,67.9,92.2,79.8,4368,35,90.7,1158975.392,1923791.705,41.946617,-87.691056,5,NORTH CENTER,47,19,"(41.94661693, -87.69105603)"


Double-click __here__ for the solution.

<!-- Solution:
NOTE: For this solution to work the CHICAGO_SOCIOECONOMIC_DATA table 
      as created in the last lab of Week 3 should already exist

%sql select ca, community_area_name, hardship_index from chicago_socioeconomic_data \
   where ca in \
   ( select community_area_number from schools order by college_enrollment desc limit 1 )

-->
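
One design point worth noting: a sub-query that sorts and keeps a single row returns only one community area even when several schools tie for the highest enrollment. A minimal sketch of an alternative that uses `MAX()` in a nested sub-query, run in an in-memory SQLite database through Python's `sqlite3` module rather than Db2 (an assumption made so the example is self-contained). All rows, including the deliberate tie, are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Db2 (assumption: illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chicago_socioeconomic_data "
    "(ca INTEGER, community_area_name TEXT, hardship_index INTEGER)"
)
conn.execute(
    "CREATE TABLE schools "
    "(name_of_school TEXT, community_area_number INTEGER, college_enrollment INTEGER)"
)

# Made-up community areas and hardship values (not taken from the real dataset).
conn.executemany(
    "INSERT INTO chicago_socioeconomic_data VALUES (?, ?, ?)",
    [(901, "AREA X", 10), (902, "AREA Y", 55), (903, "AREA Z", 30)],
)
# Made-up schools; School 1 and School 3 tie for the highest enrollment on purpose.
conn.executemany(
    "INSERT INTO schools VALUES (?, ?, ?)",
    [("School 1", 901, 4368), ("School 2", 902, 318), ("School 3", 903, 4368)],
)

# Inner-most query finds the maximum enrollment; the middle query finds every
# community area that has a school with that enrollment.
rows = conn.execute(
    "SELECT ca, community_area_name, hardship_index "
    "FROM chicago_socioeconomic_data "
    "WHERE ca IN ("
    "    SELECT community_area_number FROM schools "
    "    WHERE college_enrollment = (SELECT MAX(college_enrollment) FROM schools)"
    ") "
    "ORDER BY ca"
).fetchall()

print(rows)
conn.close()
```

When there is a single maximum, as in the lab's data where one school has the highest enrollment of 4368, both approaches select the same community area.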

## Summary

##### In this lab you learned how to work with a real-world dataset using SQL and Python. You learned how to query columns with spaces or special characters in their names and with mixed-case names. You also used built-in database functions, practiced how to sort and limit result sets, used sub-queries, and worked with multiple tables.

Copyright &copy; 2018 [cognitiveclass.ai](cognitiveclass.ai?utm_source=bducopyrightlink&utm_medium=dswb&utm_campaign=bdu). This notebook and its source code are released under the terms of the [MIT License](https://bigdatauniversity.com/mit-license/).
