# Draft analysis 

---

Group name: B

---


## Introduction

*This section includes an introduction to the project motivation, data, and research question. Include a data dictionary* 

### Motivation

In the city of Chicago, many crimes are committed every day, ranging from minor thefts to murders. To reduce violence, the city wants to open a new **Crime Prevention Center** and has asked our team which crimes occur particularly frequently and where they occur. With this information, the center can be built in a well-situated location, and its specialised departments can be trained for the relevant criminal offences. The aim is to make Chicago a safer city and to ensure that preventive measures are taken at an early stage.

Various studies indicate that targeted measures can prevent crime in cities. With the new **Crime Prevention Center**, we want to take a new approach in Chicago and prevent crime before it happens.

### Question

Which kinds of crime occur particularly frequently, and where do they occur?

### Hypotheses

Crimes, in particular dangerous ones, are concentrated in specific places (districts or blocks) in Chicago.

### Data dictionary


| Name | Description | Type | Format |
|---|---|---|---|
| id | Unique identifier for the record. | numeric | category |
| date | Date when the incident occurred. | ordinal | date |
| block | The partially redacted address where the incident occurred. | nominal | category |
| primary_type | The primary description of the IUCR code. | nominal | category |
| arrest | Indicates whether an arrest was made. | nominal | category |
| domestic | Indicates whether the incident was domestic-related as defined by the Illinois Domestic Violence Act. | nominal | category |
| beat | Indicates the beat where the incident occurred. | numeric, discrete | category |
| district | Indicates the police district where the incident occurred. | numeric, discrete | category |
| ward | The ward (City Council district) where the incident occurred. | numeric, discrete | category |
| community_area | Indicates the community area where the incident occurred. | numeric, discrete | category |
| year | Year the incident occurred. | nominal | category |
| month | Month the incident occurred. | nominal | category |
| day | Day the incident occurred. | nominal | category |
| hour | Hour the incident occurred. | nominal | category |
| latitude | The latitude of the location where the incident occurred. | numeric | float |
| longitude | The longitude of the location where the incident occurred. | numeric | float |
| arrest__False | Indicates whether an arrest was made. 0 means False. | nominal | category |
| arrest__True | Indicates whether an arrest was made. 1 means True. | nominal | category |
| domestic__False | Indicates whether the incident was domestic-related as defined by the Illinois Domestic Violence Act. 0 means False. | nominal | category |
| domestic__True | Indicates whether the incident was domestic-related as defined by the Illinois Domestic Violence Act. 1 means True. | nominal | category |
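
The `arrest__False`/`arrest__True` and `domestic__False`/`domestic__True` columns look like one-hot (dummy) encodings of the boolean `arrest` and `domestic` columns. The source does not state how they were created, so the following is only a minimal sketch on toy data, assuming they came from `pd.get_dummies` with a `__` separator:

```python
import pandas as pd

# Toy data standing in for the real dataset (values are illustrative only)
toy = pd.DataFrame({
    "arrest": [True, False, False],
    "domestic": [False, False, True],
})

# One-hot encode both boolean columns. prefix_sep="__" yields names such as
# "arrest__True"; dtype=int stores 0/1 instead of booleans.
# Assumption: this mirrors how the encoded columns in the dataset were built.
dummies = pd.get_dummies(
    toy[["arrest", "domestic"]].astype(str),
    prefix_sep="__",
    dtype=int,
)

# Keep the original columns next to their encoded versions
encoded = pd.concat([toy, dummies], axis=1)
print(encoded.columns.tolist())
```

Under this reading, each pair of encoded columns is complementary: a row with `arrest__True` equal to 1 has `arrest__False` equal to 0, and vice versa.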

<br>

## Setup

In [None]:
from pathlib import Path

#Pandas library
import pandas as pd

#Altair library for visualisations
import altair as alt

#Stop showing FutureWarning
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)

#Geopandas library to work with Chicago map
import geopandas as gpd

#Numpy library for data corrections
import numpy as np

## Data

### Import data

In [None]:
# Project data folder, relative to the notebook's parent directory
DATA_DIR = Path().resolve().parent / "data"

# Import the crime dataset
df = pd.read_csv(DATA_DIR / "interim" / "chicago_crimes-20221130-1405.csv")

# Import the ward boundaries (shapefile) as a GeoDataFrame
gdf = gpd.read_file(DATA_DIR / "external" / "wards.shp")

### Data structure

In [None]:
df.head()

In [None]:
df.info()

In [None]:
gdf.head()

In [None]:
gdf.info()

### Data corrections
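
This section is still empty in the draft. As a hedged sketch of the kind of corrections that could go here, the example below works on toy rows (not the real data) and assumes that duplicates share an `id`, that `date` arrives as text, and that rows without a date or ward are unusable for the location analysis:

```python
import pandas as pd

# Toy rows standing in for the real data (values are illustrative only)
toy = pd.DataFrame({
    "id": [1, 2, 2, 3],
    "date": ["2022-01-15 13:30:00", "2022-02-01 08:00:00",
             "2022-02-01 08:00:00", "not a date"],
    "primary_type": ["THEFT", "BATTERY", "BATTERY", "THEFT"],
    "ward": [42.0, 27.0, 27.0, None],
})

# Keep only the first row for each record id
clean = toy.drop_duplicates(subset="id").copy()

# Parse timestamps; unparseable values become NaT instead of raising
clean["date"] = pd.to_datetime(clean["date"], errors="coerce")

# Drop rows without a usable date or ward (assumption: both are required)
clean = clean.dropna(subset=["date", "ward"])

# Store labels as categories, matching the "category" format in the data dictionary
clean["ward"] = clean["ward"].astype(int).astype("category")
clean["primary_type"] = clean["primary_type"].astype("category")

print(clean.dtypes)
```

Whether dropping incomplete rows is acceptable depends on how many records are affected, so it is worth checking the counts from `df.info()` first.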

## Analysis

### Descriptive statistics
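
To answer the first half of the research question (which crimes occur most frequently), a frequency table of `primary_type` is a natural starting point. The sketch below uses toy values rather than the real dataset and only illustrates the approach:

```python
import pandas as pd

# Toy data standing in for df["primary_type"] (values are illustrative only)
toy = pd.DataFrame({
    "primary_type": ["THEFT", "THEFT", "BATTERY", "THEFT", "NARCOTICS", "BATTERY"],
})

# Absolute and relative frequency of each crime type
counts = toy["primary_type"].value_counts()
shares = toy["primary_type"].value_counts(normalize=True).round(3)

# Combine both into one table, most frequent crime type first
summary = (
    pd.DataFrame({"count": counts, "share": shares})
    .sort_values("count", ascending=False)
)
print(summary)
```

Applied to the real data, the same two calls on `df["primary_type"]` would give the ranking of crime types.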

### Exploratory data analysis
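
For the second half of the research question (where crimes happen), one possible first step is to count incidents per ward and cross-tabulate wards against crime types. This is a sketch on toy data; using `ward` as the spatial unit is an assumption, and `district`, `beat`, or `community_area` would work the same way:

```python
import pandas as pd

# Toy data standing in for the real dataset (values are illustrative only)
toy = pd.DataFrame({
    "ward": [42, 42, 27, 27, 27, 28],
    "primary_type": ["THEFT", "THEFT", "BATTERY", "THEFT", "BATTERY", "NARCOTICS"],
})

# Number of incidents per ward, most affected ward first
per_ward = toy.groupby("ward").size().sort_values(ascending=False)

# Contingency table: rows are wards, columns are crime types
table = pd.crosstab(toy["ward"], toy["primary_type"])

# For each crime type, the ward with the highest number of incidents
top_ward = table.idxmax()

print(per_ward)
print(top_ward)
```

A table like `per_ward` could later be joined to the ward geometries in `gdf` for a map, provided the shapefile contains a matching ward identifier column.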

## Visualizations

### Visualization ideas

### Save visualizations



Save your draft visualizations in the folder `reports/visualizations/`. Use a meaningful name (always include the word `draft` and a `timestamp` in your filename).
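
A small helper can keep these filenames consistent. The function name `draft_filename`, the default `.png` extension, and the `YYYYMMDD-HHMM` timestamp pattern (borrowed from the data file name) are assumptions for illustration, not a project requirement:

```python
from datetime import datetime
from pathlib import Path


def draft_filename(name, folder="reports/visualizations", ext=".png", now=None):
    """Build a path of the form <folder>/<name>-draft-<YYYYMMDD-HHMM><ext>."""
    # Allow a fixed timestamp to be passed in, otherwise use the current time
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d-%H%M")
    return Path(folder) / f"{name}-draft-{stamp}{ext}"


# Example with a fixed timestamp so the result is reproducible
target = draft_filename("crimes-per-ward", now=datetime(2022, 11, 30, 14, 5))
print(target.name)
```

The resulting path can then be passed to the plotting library's save function, for example `chart.save(str(target))` for an Altair chart. Because the notebook loads data relative to its parent directory, the `folder` argument may need the same prefix.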

## Conclusion and recommended action