# Project 3 Part 1
## Chicago Crime Data

*Christina Brockway*

Use a prepared zip file with the Chicago Crime Data:
https://drive.google.com/file/d/1avxUlCAros-R9GF6SKXqM_GopzO7VwA5/view?usp=drive_link

**Original Source is the Chicago Data Portal: Crimes 2001 to Present**

**Data Description**
https://data.cityofchicago.org/Public-Safety/Crimes-2001-to-Present/ijzp-q8t2/data

-  Includes the type of crime, exact date/time, latitude/longitude, district/ward, whether an arrest was made, and more.


# Task:
Answer a series of questions about trends in crimes in Chicago for a reporter at a local newspaper.

-  Pick 3 topics to analyze:
  ~  Comparing Police Districts
  ~  Crimes Across the Years
  ~  Comparing AM vs PM Rush Hour
  ~  Comparing Months
  ~  Comparing Holidays
  ~  What cycles (seasonality) can you find in this data?


### Imports

In [7]:
import pandas as pd
import glob
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.ticker as mticks

import datetime as dt
import statsmodels.tsa.api as tsa

### Load Data

In [14]:
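# Glob pattern matching every crime CSV in the data folder
folder = "data/*Crime*.csv"
# Sort the matched paths so the files load in a stable order
crime_files = sorted(glob.glob(folder, recursive=True))
# Stack the per-file frames; each file keeps its own row index
df = pd.concat([pd.read_csv(f) for f in crime_files])
df.head()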

Unnamed: 0,ID,Date,Primary Type,Description,Location Description,Arrest,Domestic,Beat,District,Ward,Latitude,Longitude
0,1326041,01/01/2001 01:00:00 AM,BATTERY,SIMPLE,RESIDENCE,False,False,1624,16.0,,41.95785,-87.749185
1,1319931,01/01/2001 01:00:00 PM,BATTERY,SIMPLE,RESIDENCE,False,True,825,8.0,,41.783892,-87.684841
2,1324743,01/01/2001 01:00:00 PM,GAMBLING,ILLEGAL ILL LOTTERY,STREET,True,False,313,3.0,,41.780412,-87.61197
3,1310717,01/01/2001 01:00:00 AM,CRIMINAL DAMAGE,TO VEHICLE,STREET,False,False,2424,24.0,,42.012391,-87.678032
4,1318099,01/01/2001 01:00:00 AM,BATTERY,SIMPLE,RESIDENCE PORCH/HALLWAY,False,True,214,2.0,,41.819538,-87.62002


In [15]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 7713109 entries, 0 to 238857
Data columns (total 12 columns):
 #   Column                Dtype  
---  ------                -----  
 0   ID                    int64  
 1   Date                  object 
 2   Primary Type          object 
 3   Description           object 
 4   Location Description  object 
 5   Arrest                bool   
 6   Domestic              bool   
 7   Beat                  int64  
 8   District              float64
 9   Ward                  float64
 10  Latitude              float64
 11  Longitude             float64
dtypes: bool(2), float64(4), int64(2), object(4)
memory usage: 662.0+ MB
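
The `df.info()` output lists `Date` as `object`, meaning the timestamps are still plain strings. Below is a minimal sketch of turning them into a datetime index, assuming the `MM/DD/YYYY hh:mm:ss AM/PM` layout seen in the `df.head()` rows holds for every row. The `sample` frame and the `Datetime` column name are made up for illustration and are not part of the dataset.

```python
import pandas as pd

# Hypothetical three-row sample mimicking the layout of the Date column
sample = pd.DataFrame({
    "Date": [
        "01/01/2001 01:00:00 AM",
        "01/01/2001 01:00:00 PM",
        "12/31/2002 11:59:00 PM",
    ],
    "Primary Type": ["BATTERY", "GAMBLING", "THEFT"],
})

# Assumed format string; an explicit format avoids per-row format inference
date_format = "%m/%d/%Y %I:%M:%S %p"
sample["Datetime"] = pd.to_datetime(sample["Date"], format=date_format)

# A sorted DatetimeIndex allows date-based slicing and grouping later on
sample = sample.set_index("Datetime").sort_index()
print(sample.index.dtype)
```

The same two steps could be applied to `df` itself, though on several million rows the conversion may take a while.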


### Data Dictionary

Taken from: data.cityofchicago.org/Public-Safety/Crimes-2001-to-Present/

<style type="text/css">
.tg  {border-collapse:collapse;border-spacing:0;}
.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
  overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
  font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-0pky{border-color:inherit;text-align:left;vertical-align:top}
</style>
<table class="tg">
<thead>
  <tr>
    <th class="tg-0pky"><span style="font-weight:bold">**Name**</span></th>
    <th class="tg-0pky"><span style="font-weight:bold">**dtype**</span></th>
    <th class="tg-0pky"><span style="font-weight:bold">**Description**</span></th>
    <th class="tg-0pky"><span style="font-weight:bold">**Kind**</span></th>
  </tr>
</thead>
<tbody>
  <tr>
    <td class="tg-0pky">ID</td>
    <td class="tg-0pky">int64</td>
    <td class="tg-0pky">Unique identifier for the record</td>
    <td class="tg-0pky">numeric</td>
  </tr>
  <tr>
    <td class="tg-0pky">Date</td>
    <td class="tg-0pky">object</td>
    <td class="tg-0pky">Date the incident occurred</td>
    <td class="tg-0pky">date/time</td>
  </tr>
  <tr>
    <td class="tg-0pky">Primary Type</td>
    <td class="tg-0pky">object</td>
    <td class="tg-0pky">Primary description of the IL Uniform Crime Reporting code</td>
    <td class="tg-0pky">categorical</td>
  </tr>
  <tr>
    <td class="tg-0pky">Description</td>
    <td class="tg-0pky">object </td>
    <td class="tg-0pky">Secondary description of IUCR code</td>
    <td class="tg-0pky">categorical</td>
  </tr>
  <tr>
    <td class="tg-0pky">Location Description</td>
    <td class="tg-0pky">object </td>
    <td class="tg-0pky">Location where incident occurred</td>
    <td class="tg-0pky">categorical</td>
  </tr>
  <tr>
    <td class="tg-0pky">Arrest</td>
    <td class="tg-0pky">bool</td>
    <td class="tg-0pky">Was an arrest made?</td>
    <td class="tg-0pky">true/false</td>
  </tr>
  <tr>
    <td class="tg-0pky">Domestic</td>
    <td class="tg-0pky">bool</td>
    <td class="tg-0pky">Was the incident domestic-related?</td>
    <td class="tg-0pky">true/false</td>
  </tr>
  <tr>
    <td class="tg-0pky">Beat</td>
    <td class="tg-0pky">int64</td>
    <td class="tg-0pky">Beat where the incident occurred. A beat is the smallest police geographic area; each beat has a dedicated police beat car</td>
    <td class="tg-0pky">numeric</td>
  </tr>
  <tr>
    <td class="tg-0pky">District</td>
    <td class="tg-0pky">float64</td>
    <td class="tg-0pky">Police district where the incident occurred</td>
    <td class="tg-0pky">numeric</td>
  </tr>
  <tr>
    <td class="tg-0pky">Ward</td>
    <td class="tg-0pky">float64</td>
    <td class="tg-0pky">City council district (ward) where the incident occurred</td>
    <td class="tg-0pky">numeric</td>
  </tr>
  <tr>
    <td class="tg-0pky">Latitude</td>
    <td class="tg-0pky">float64 </td>
    <td class="tg-0pky">Latitude where the incident occurred</td>
    <td class="tg-0pky">numeric</td>
  </tr>
  <tr>
    <td class="tg-0pky">Longitude</td>
    <td class="tg-0pky">float64</td>
    <td class="tg-0pky">Longitude where the incident occurred</td>
    <td class="tg-0pky">numeric</td>
  </tr>
</tbody>
</table>

In [None]:
## Comparing Crimes Across the Years

In [None]:
#### Is the total number of crimes increasing/decreasing across the years?
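
One way to approach this question is to count rows per calendar year and look at the year-over-year difference. The sketch below uses a small made-up frame (`toy`) with a datetime index, since each row in the crime data represents one reported incident; the names `toy`, `crimes_per_year`, and `yearly_change` are hypothetical and the values are not real results.

```python
import pandas as pd

# Hypothetical toy data: one row per reported crime, indexed by datetime
toy = pd.DataFrame(
    {"Primary Type": ["BATTERY", "THEFT", "THEFT", "BATTERY", "GAMBLING"]},
    index=pd.to_datetime([
        "2001-01-01 01:00",
        "2001-06-15 13:00",
        "2001-11-30 08:30",
        "2002-03-10 17:45",
        "2003-07-04 22:10",
    ]),
)

# Count rows per calendar year; each row is one reported crime
crimes_per_year = toy.groupby(toy.index.year).size()

# Year-over-year change: negative values mean fewer crimes than the prior year
yearly_change = crimes_per_year.diff()

print(crimes_per_year)
print(yearly_change)
```

On the full dataset, the most recent year is likely incomplete, so its count should probably be excluded or flagged before drawing a trend.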