
# Project Description 

In this project, we will explore the Chicago Crime dataset and implement a relational database for storing the data. The key tasks for this project are as follows:

1. Identify the features (attributes) in the Chicago Crime dataset and design an entity-relationship model
2. Refine the model and convert each relation to 3NF (if required)
3. Using DDL, implement the relations on a PostgreSQL server
4. Load the given data into the relations
5. Execute some interesting queries on the relations
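
Tasks 3 to 5 can be sketched end to end on a tiny scale. The example below is only an illustration under stated assumptions: it uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for the PostgreSQL server, and the table names (`crime_type`, `crime`) and the split into two relations are hypothetical, not the expected solution. The sample rows are taken from the `df.head()` output in the exploration section.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the PostgreSQL server.
# Table and column names are hypothetical and only illustrate the workflow.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

# Task 3: DDL for a lookup relation and a relation that references it.
conn.executescript("""
CREATE TABLE crime_type (
    iucr         TEXT PRIMARY KEY,
    primary_type TEXT NOT NULL,
    description  TEXT NOT NULL
);
CREATE TABLE crime (
    id          INTEGER PRIMARY KEY,
    case_number TEXT NOT NULL,
    iucr        TEXT NOT NULL REFERENCES crime_type (iucr),
    arrest      INTEGER NOT NULL
);
""")

# Task 4: load a few rows (parent table first, so the foreign key is satisfied).
conn.executemany(
    "INSERT INTO crime_type VALUES (?, ?, ?)",
    [
        ("1562", "SEX OFFENSE", "AGG CRIMINAL SEXUAL ABUSE"),
        ("1563", "SEX OFFENSE", "CRIMINAL SEXUAL ABUSE"),
        ("1153", "DECEPTIVE PRACTICE", "FINANCIAL IDENTITY THEFT OVER $ 300"),
    ],
)
conn.executemany(
    "INSERT INTO crime VALUES (?, ?, ?, ?)",
    [
        (10433096, "HZ170962", "1562", 1),
        (10532867, "HZ276514", "1563", 0),
        (10536876, "HZ280873", "1153", 0),
        (9581929, "HX232501", "1563", 0),
    ],
)
conn.commit()

# Task 5: a query that joins the two relations and counts crimes per primary type.
rows = conn.execute("""
    SELECT t.primary_type, COUNT(*) AS n
    FROM crime AS c
    JOIN crime_type AS t ON t.iucr = c.iucr
    GROUP BY t.primary_type
    ORDER BY n DESC, t.primary_type
""").fetchall()
print(rows)
```

On PostgreSQL the same statements would be issued through a database driver, and the column types would likely differ (for example `BOOLEAN` for `arrest`).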


## Dataset

* Dataset URL: **/dsa/data/DSA-7030/Chicago-Crime-Sample-2012.csv**
* Dataset Description: [pdf](./ChicagoData-Description.pdf)

## Dataset exploration

In [1]:
import pandas as pd

# Load the 2012 sample; the first CSV column is used as the DataFrame index
datapath = "/dsa/data/DSA-7030/Chicago-Crime-Sample-2012.csv"
df = pd.read_csv(datapath, index_col=0)

In [2]:
# check columns
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 334715 entries, 47398 to 2743778
Data columns (total 22 columns):
ID                      334715 non-null int64
Case Number             334715 non-null object
Date                    334715 non-null object
Block                   334715 non-null object
IUCR                    334715 non-null object
Primary Type            334715 non-null object
Description             334715 non-null object
Location Description    334384 non-null object
Arrest                  334715 non-null bool
Domestic                334715 non-null bool
Beat                    334715 non-null int64
District                334715 non-null int64
Ward                    334708 non-null float64
Community Area          334689 non-null float64
FBI Code                334715 non-null object
X Coordinate            334132 non-null float64
Y Coordinate            334132 non-null float64
Year                    334715 non-null int64
Updated On              334715 non-null ob
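
The non-null counts above show that a few columns (for example `Location Description` and `Ward`) contain missing values, which matters later when deciding which attributes can be declared `NOT NULL`. A minimal sketch of how the nullable columns could be listed is shown below; the `toy` frame and its values are made up for illustration, and on the real data the same two lines would be applied to `df`.

```python
import pandas as pd

# Toy stand-in for the crime DataFrame; the values are made up for illustration.
toy = pd.DataFrame({
    "ID": [1, 2, 3],
    "Ward": [29.0, None, 4.0],
    "Location Description": ["RESIDENCE", "APARTMENT", None],
})

# Count missing values per column and keep only the columns that have any.
null_counts = toy.isna().sum()
nullable = null_counts[null_counts > 0]
print(nullable)
```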

In [3]:
df.head() 

Unnamed: 0,ID,Case Number,Date,Block,IUCR,Primary Type,Description,Location Description,Arrest,Domestic,...,Ward,Community Area,FBI Code,X Coordinate,Y Coordinate,Year,Updated On,Latitude,Longitude,Location
47398,10433096,HZ170962,1/1/2012 0:00,026XX N MC VICKER AVE,1562,SEX OFFENSE,AGG CRIMINAL SEXUAL ABUSE,RESIDENCE,True,False,...,29.0,19.0,17,,,2012,5/11/2016 15:48,,,
47420,10433124,HZ170983,1/1/2012 0:00,026XX N MC VICKER AVE,1544,SEX OFFENSE,SEXUAL EXPLOITATION OF A CHILD,RESIDENCE,True,False,...,29.0,19.0,17,,,2012,5/11/2016 15:48,,,
802910,10532867,HZ276514,1/1/2012 0:00,036XX S RHODES AVE,1563,SEX OFFENSE,CRIMINAL SEXUAL ABUSE,APARTMENT,False,False,...,4.0,35.0,17,,,2012,5/26/2016 15:51,,,
803605,10536876,HZ280873,1/1/2012 0:00,062XX S ROCKWELL ST,1153,DECEPTIVE PRACTICE,FINANCIAL IDENTITY THEFT OVER $ 300,RESIDENCE,False,False,...,15.0,66.0,11,,,2012,5/27/2016 15:48,,,
831733,9581929,HX232501,1/1/2012 0:00,006XX W 66TH ST,1563,SEX OFFENSE,CRIMINAL SEXUAL ABUSE,RESIDENCE,False,True,...,6.0,68.0,17,,,2012,8/17/2015 15:03,,,


In [4]:
df.tail()

Unnamed: 0,ID,Case Number,Date,Block,IUCR,Primary Type,Description,Location Description,Arrest,Domestic,...,Ward,Community Area,FBI Code,X Coordinate,Y Coordinate,Year,Updated On,Latitude,Longitude,Location
2742387,8951459,HW100757,12/31/2012 23:50,028XX N HALSTED ST,890,THEFT,FROM BUILDING,RESIDENCE,False,False,...,44.0,6.0,6,1170439.0,1919244.0,2012,2/4/2016 6:33,41.933894,-87.649053,"(41.933894393, -87.649052922)"
2741932,8950836,HW100039,12/31/2012 23:55,0000X E OHIO ST,2890,PUBLIC PEACE VIOLATION,OTHER VIOLATION,SIDEWALK,True,False,...,42.0,8.0,26,1176775.0,1904213.0,2012,2/4/2016 6:33,41.892508,-87.626224,"(41.892507592, -87.626223996)"
2742001,8950918,HW100021,12/31/2012 23:55,035XX W MONTROSE AVE,610,BURGLARY,FORCIBLE ENTRY,OTHER,False,False,...,33.0,16.0,5,1152066.0,1929015.0,2012,2/4/2016 6:33,41.961089,-87.716315,"(41.961089289, -87.716314748)"
2743949,8954299,HW100700,12/31/2012 23:55,058XX S MARYLAND AVE,890,THEFT,FROM BUILDING,HOSPITAL BUILDING/GROUNDS,False,False,...,5.0,41.0,6,1182887.0,1866434.0,2012,2/4/2016 6:33,41.788699,-87.604954,"(41.788699253, -87.604954085)"
2743778,8953937,HW102973,12/31/2012 23:58,037XX N NORA AVE,610,BURGLARY,FORCIBLE ENTRY,RESIDENCE-GARAGE,False,False,...,36.0,17.0,5,1128745.0,1924002.0,2012,2/4/2016 6:33,41.947762,-87.802171,"(41.947761848, -87.802170774)"


In [4]:
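# Count the distinct Case Number values; if the result equals the number of
# rows (334715), Case Number is unique per row
df1 = df.groupby('Case Number')['ID'].nunique()
df1.count()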

334715
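
The count above equals the number of rows, which suggests that `Case Number` is unique in this sample. The same kind of check can be written as a small reusable helper. This is a sketch only: `has_unique_values` is a hypothetical helper name, the `toy` rows are made up, and uniqueness alone does not prove that a column set is a minimal key or free of nulls.

```python
import pandas as pd

# Toy rows; the values are made up for illustration.
toy = pd.DataFrame({
    "ID": [101, 102, 103, 104],
    "Case Number": ["HZ1", "HZ2", "HZ3", "HZ4"],
    "Beat": [111, 111, 112, 113],
})

def has_unique_values(frame, columns):
    """Return True when no two rows share the same values in `columns`."""
    return not frame.duplicated(subset=columns).any()

print(has_unique_values(toy, ["ID"]))
print(has_unique_values(toy, ["Case Number"]))
print(has_unique_values(toy, ["Beat"]))
```

Applied to the real data, a call such as `has_unique_values(df, ["ID"])` would be one way to test a proposed primary key before writing it into the DDL.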

In [5]:
# Number of distinct blocks within each beat
df2 = df.groupby('Beat')['Block'].nunique()
df2.head()

Beat
111    77
112    52
113    51
114    83
121    62
Name: Block, dtype: int64

In [5]:
# For each Location value, count the distinct values in every column
df3 = df.groupby('Location').nunique()
df3

Location,ID,Case Number,Date,Block,IUCR,Primary Type,Description,Location Description,Arrest,Domestic,...,Ward,Community Area,FBI Code,X Coordinate,Y Coordinate,Year,Updated On,Latitude,Longitude,Location
"(36.619446395, -91.686565684)",22,22,22,13,18,10,17,10,2,2,...,11,9,11,1,1,1,1,1,1,1
"(41.644585429, -87.616512829)",1,1,1,1,1,1,1,1,1,1,...,1,1,1,1,1,1,1,1,1,1
"(41.644589713, -87.61587983)",1,1,1,1,1,1,1,1,1,1,...,1,1,1,1,1,1,1,1,1,1
"(41.644606925, -87.608997659)",1,1,1,1,1,1,1,1,1,1,...,1,1,1,1,1,1,1,1,1,1
"(41.644607723, -87.613055128)",4,4,4,1,2,2,2,2,2,1,...,1,1,2,1,1,1,1,1,1,1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
"(42.022536112, -87.674619658)",1,1,1,1,1,1,1,1,1,1,...,1,1,1,1,1,1,1,1,1,1
"(42.022536147, -87.673670147)",2,2,2,1,2,2,2,1,1,1,...,1,1,1,1,1,1,1,1,1,1
"(42.022536591, -87.673747428)",1,1,1,1,1,1,1,1,1,1,...,1,1,1,1,1,1,1,1,1,1
"(42.022548774, -87.675870822)",1,1,1,1,1,1,1,1,1,1,...,1,1,1,1,1,1,1,1,1,1


## 1.1 Design an Entity Relationship Model for the Chicago Crime Dataset

* List all the entities with associated attributes
* Identify primary and foreign keys

## 1.2 If required, refine your initial set of relations and convert each of the relations to 3NF

While converting a relation to 3NF, please write down the process in the following cell. 
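
A relation violates 3NF when a non-key attribute depends on another non-key attribute (a transitive dependency). One way to look for such dependencies is to test candidate functional dependencies against the data. The sketch below is an illustration, not a prescribed method: `holds` is a hypothetical helper, the `toy` rows reuse the `IUCR` and `Primary Type` values from the `df.head()` output, and whether `IUCR` determines `Primary Type` in the full dataset is an assumption that would need to be checked on `df`.

```python
import pandas as pd

def holds(frame, lhs, rhs):
    """Return True when each value of `lhs` maps to at most one value of `rhs`."""
    return bool((frame.groupby(lhs)[rhs].nunique(dropna=False) <= 1).all())

# Toy rows built from the IUCR / Primary Type pairs shown by df.head().
toy = pd.DataFrame({
    "IUCR": ["1562", "1544", "1563", "1153", "1563"],
    "Primary Type": [
        "SEX OFFENSE", "SEX OFFENSE", "SEX OFFENSE",
        "DECEPTIVE PRACTICE", "SEX OFFENSE",
    ],
})

print(holds(toy, "IUCR", "Primary Type"))   # does IUCR determine Primary Type?
print(holds(toy, "Primary Type", "IUCR"))   # does Primary Type determine IUCR?

# If the dependency holds, the dependent attribute can move to its own relation,
# keyed by the determinant, with one row per distinct IUCR value.
lookup = toy[["IUCR", "Primary Type"]].drop_duplicates()
print(len(lookup))
```

When a dependency such as `IUCR` → `Primary Type` holds, the main relation keeps only `IUCR` as a foreign key, which removes the transitive dependency through the key.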

## 1.3 Final ERD

* Draw an entity relationship diagram once you are done with 1.1 and 1.2 
* Use crow's foot notation to specify the cardinality 
* Show the primary and foreign keys in the diagram

Please upload your ERD to the Module 8 exercises folder and link the file here: [image](IMG_1452.jpg). Once you are done, change this cell type to Markdown and execute it. ![image](IMG_1452.jpg)


## <center> Part-I ends here</center>

To access Part II, use this link: [Final Project Part II](./Final-Project-Part-II.ipynb)