### Store the datasets in database tables

To analyze the data with SQL, first load it into a SQLite database.
We will create the following three tables:

1.  **CENSUS_DATA**
2.  **CHICAGO_PUBLIC_SCHOOLS_DATA**
3.  **CHICAGO_CRIME_DATA**

In [2]:
import csv, sqlite3
import pandas

con = sqlite3.connect("RealWorldData.db")
cur = con.cursor()

%load_ext sql
%sql sqlite:///RealWorldData.db

In [3]:
df = pandas.read_csv("https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DB0201EN-SkillsNetwork/labs/FinalModule_Coursera_V5/data/ChicagoCensusData.csv")
df.to_sql("CENSUS_DATA", con, if_exists='replace', index=False, method="multi")

df = pandas.read_csv("https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DB0201EN-SkillsNetwork/labs/FinalModule_Coursera_V5/data/ChicagoCrimeData.csv")
df.to_sql("CHICAGO_CRIME_DATA", con, if_exists='replace', index=False, method="multi")

df = pandas.read_csv("https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DB0201EN-SkillsNetwork/labs/FinalModule_Coursera_V5/data/ChicagoPublicSchools.csv")
df.to_sql("CHICAGO_PUBLIC_SCHOOLS_DATA", con, if_exists='replace', index=False)

566
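
A quick row count per table is a simple way to confirm that a load succeeded. The following is a minimal sketch using only the standard library, assuming a small made-up CSV sample and a hypothetical table name `CENSUS_DEMO` (neither comes from the real datasets). It loads the rows into an in-memory database and counts them.

```python
import csv
import io
import sqlite3

# Hypothetical three-row sample standing in for a downloaded CSV file.
sample_csv = io.StringIO(
    "COMMUNITY_AREA_NUMBER,COMMUNITY_AREA_NAME,PER_CAPITA_INCOME\n"
    "1,Area A,23939\n"
    "2,Area B,8201\n"
    "3,Area C,10402\n"
)

# In-memory database, so nothing is written to disk.
demo_con = sqlite3.connect(":memory:")
demo_con.execute(
    "CREATE TABLE CENSUS_DEMO ("
    "COMMUNITY_AREA_NUMBER INTEGER, "
    "COMMUNITY_AREA_NAME TEXT, "
    "PER_CAPITA_INCOME INTEGER)"
)

# Convert each CSV record to a typed tuple before inserting it.
rows = [
    (int(r["COMMUNITY_AREA_NUMBER"]), r["COMMUNITY_AREA_NAME"], int(r["PER_CAPITA_INCOME"]))
    for r in csv.DictReader(sample_csv)
]
demo_con.executemany("INSERT INTO CENSUS_DEMO VALUES (?, ?, ?)", rows)
demo_con.commit()

# Sanity check: the table should hold one row per CSV record.
loaded = demo_con.execute("SELECT COUNT(*) FROM CENSUS_DEMO").fetchone()[0]
print(loaded)
```

The same `SELECT COUNT(*)` check can be run against each of the three tables created above.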

## Problems

Now write and execute SQL queries to solve the assignment problems.

### Problem 1

##### Find the total number of crimes recorded in the CHICAGO_CRIME_DATA table.


In [7]:
%%sql

select COUNT(*)
from CHICAGO_CRIME_DATA;

 * sqlite:///RealWorldData.db
Done.


COUNT(*)
533


### Problem 2

##### List community areas with per capita income less than 11000.


In [13]:
%%sql

select COMMUNITY_AREA_NAME, PER_CAPITA_INCOME
from CENSUS_DATA
where (PER_CAPITA_INCOME < 11000)
order by PER_CAPITA_INCOME nulls last;

 * sqlite:///RealWorldData.db
Done.


COMMUNITY_AREA_NAME,PER_CAPITA_INCOME
Riverdale,8201
South Lawndale,10402
Fuller Park,10432
West Garfield Park,10934


### Problem 3

##### List all case numbers for crimes involving minors (children are not considered minors for the purposes of crime analysis).


In [20]:
%%sql

select CASE_NUMBER 
from CHICAGO_CRIME_DATA
where DESCRIPTION like '%minor%';

 * sqlite:///RealWorldData.db
Done.


CASE_NUMBER
HL266884
HK238408


### Problem 4

##### List all kidnapping crimes involving a child.


In [21]:
%%sql

select DESCRIPTION
from CHICAGO_CRIME_DATA
where PRIMARY_TYPE like '%KIDNAPPING%'
  and DESCRIPTION like '%CHILD%';

 * sqlite:///RealWorldData.db
Done.


DESCRIPTION
CHILD ABDUCTION/STRANGER


### Problem 5

##### What kinds of crimes were recorded at schools?


In [114]:
%%sql

select DISTINCT DESCRIPTION
from CHICAGO_CRIME_DATA
where LOCATION_DESCRIPTION like '%SCHOOL%';

 * sqlite:///RealWorldData.db
Done.


DESCRIPTION
SIMPLE
PRO EMP HANDS NO/MIN INJURY
TO VEHICLE
POSS: HEROIN(WHITE)
MANU/DEL:CANNABIS 10GM OR LESS
TO LAND
BOMB THREAT


### Problem 6

##### List the average safety score for each type of school.


In [123]:
%%sql

select "Elementary, Middle, or High School", AVG(SAFETY_SCORE) as AverageSafetyScore
from CHICAGO_PUBLIC_SCHOOLS_DATA
group by "Elementary, Middle, or High School";

 * sqlite:///RealWorldData.db
Done.


"Elementary, Middle, or High School",AverageSafetyScore
ES,49.52038369304557
HS,49.62352941176471
MS,48.0
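
The school-type column name contains commas and spaces, so it has to be wrapped in double quotes wherever it appears. A minimal sketch of the same `GROUP BY` pattern, assuming a hypothetical `SCHOOLS_DEMO` table with invented scores:

```python
import sqlite3

demo_con = sqlite3.connect(":memory:")

# Double quotes make the whole phrase a single column identifier.
demo_con.execute(
    'CREATE TABLE SCHOOLS_DEMO ('
    '"Elementary, Middle, or High School" TEXT, SAFETY_SCORE REAL)'
)

# Invented rows; the MS row has no safety score.
demo_con.executemany(
    "INSERT INTO SCHOOLS_DEMO VALUES (?, ?)",
    [("ES", 40.0), ("ES", 60.0), ("HS", 55.0), ("MS", None)],
)

# AVG skips NULL values and returns NULL when a group has no non-NULL scores.
averages = dict(
    demo_con.execute(
        'SELECT "Elementary, Middle, or High School", AVG(SAFETY_SCORE) '
        'FROM SCHOOLS_DEMO '
        'GROUP BY "Elementary, Middle, or High School"'
    )
)
print(averages)
```

Because `AVG` ignores missing scores, the averages reflect only the schools that reported a safety score.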


### Problem 7

##### List the 5 community areas with the highest percentage of households below the poverty line.


In [55]:
%%sql

select COMMUNITY_AREA_NAME, PERCENT_HOUSEHOLDS_BELOW_POVERTY
from CENSUS_DATA
order by PERCENT_HOUSEHOLDS_BELOW_POVERTY desc nulls last limit 5;

 * sqlite:///RealWorldData.db
Done.


COMMUNITY_AREA_NAME,PERCENT_HOUSEHOLDS_BELOW_POVERTY
Riverdale,56.5
Fuller Park,51.2
Englewood,46.6
North Lawndale,43.1
East Garfield Park,42.4


### Problem 8

##### Which community area is the most crime prone?


In [111]:
%%sql

select COMMUNITY_AREA_NUMBER, count(*) as COUNT
from CHICAGO_CRIME_DATA
group by COMMUNITY_AREA_NUMBER
order by COUNT desc
limit 1;

 * sqlite:///RealWorldData.db
Done.


COMMUNITY_AREA_NUMBER,COUNT
25.0,43
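
The area number prints as `25.0`, which suggests the column was stored as a floating-point value (an assumption; pandas commonly does this when a numeric column has missing entries). A minimal sketch, using a hypothetical `CRIME_DEMO` table with invented rows, shows how `CAST` could present the number as an integer while also excluding rows with no area recorded:

```python
import sqlite3

demo_con = sqlite3.connect(":memory:")
demo_con.execute("CREATE TABLE CRIME_DEMO (COMMUNITY_AREA_NUMBER REAL)")

# Invented rows: two crimes in area 25, one in area 23, one with no area.
demo_con.executemany(
    "INSERT INTO CRIME_DEMO VALUES (?)",
    [(25.0,), (25.0,), (23.0,), (None,)],
)

# CAST changes only how the value is returned; grouping still uses the stored column.
top_area = demo_con.execute(
    "SELECT CAST(COMMUNITY_AREA_NUMBER AS INTEGER) AS AREA, COUNT(*) AS CRIME_COUNT "
    "FROM CRIME_DEMO "
    "WHERE COMMUNITY_AREA_NUMBER IS NOT NULL "
    "GROUP BY COMMUNITY_AREA_NUMBER "
    "ORDER BY CRIME_COUNT DESC "
    "LIMIT 1"
).fetchone()
print(top_area)
```

The `IS NOT NULL` filter keeps crimes without a recorded area from forming their own group and competing for the top spot.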


### Problem 9

##### Use a sub-query to find the name of the community area with the highest hardship index.


In [89]:
%%sql

select COMMUNITY_AREA_NAME, HARDSHIP_INDEX
from CENSUS_DATA
where HARDSHIP_INDEX = (select max(HARDSHIP_INDEX) from CENSUS_DATA);

 * sqlite:///RealWorldData.db
Done.


COMMUNITY_AREA_NAME,HARDSHIP_INDEX
Riverdale,98.0


### Problem 10

##### Use a sub-query to determine the community area name with the most crimes.


In [90]:
%%sql

select CD.COMMUNITY_AREA_NUMBER, CD.COMMUNITY_AREA_NAME, COUNT
from CENSUS_DATA CD
join (
    select COMMUNITY_AREA_NUMBER, count(*) as COUNT
    from CHICAGO_CRIME_DATA
    group by COMMUNITY_AREA_NUMBER
    order by COUNT desc
    limit 1
) CC
on CD.COMMUNITY_AREA_NUMBER = CC.COMMUNITY_AREA_NUMBER;

 * sqlite:///RealWorldData.db
Done.


COMMUNITY_AREA_NUMBER,COMMUNITY_AREA_NAME,COUNT
25.0,Austin,43
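
The query above joins against a derived table. The same question could also be answered with a sub-query in the `WHERE` clause. A minimal sketch of that variant, assuming hypothetical `CENSUS_DEMO` and `CRIME_DEMO` tables with invented rows:

```python
import sqlite3

demo_con = sqlite3.connect(":memory:")
demo_con.execute(
    "CREATE TABLE CENSUS_DEMO (COMMUNITY_AREA_NUMBER INTEGER, COMMUNITY_AREA_NAME TEXT)"
)
demo_con.execute(
    "CREATE TABLE CRIME_DEMO (CASE_NUMBER TEXT, COMMUNITY_AREA_NUMBER INTEGER)"
)

# Invented rows: Area B has two crimes, Area A has one, one crime has no area.
demo_con.executemany(
    "INSERT INTO CENSUS_DEMO VALUES (?, ?)", [(1, "Area A"), (2, "Area B")]
)
demo_con.executemany(
    "INSERT INTO CRIME_DEMO VALUES (?, ?)",
    [("X001", 1), ("X002", 2), ("X003", 2), ("X004", None)],
)

# The inner query returns the single area number with the most crimes;
# the outer query looks up its name.
query = """
SELECT COMMUNITY_AREA_NAME
FROM CENSUS_DEMO
WHERE COMMUNITY_AREA_NUMBER = (
    SELECT COMMUNITY_AREA_NUMBER
    FROM CRIME_DEMO
    WHERE COMMUNITY_AREA_NUMBER IS NOT NULL
    GROUP BY COMMUNITY_AREA_NUMBER
    ORDER BY COUNT(*) DESC
    LIMIT 1
)
"""
busiest = demo_con.execute(query).fetchone()[0]
print(busiest)
```

The join form has the advantage of returning the crime count alongside the name, while this form returns only columns from the census table.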
