# ADA PROJECT 2 - Dive into the Heart of Chicago's crime

## Dataset: Food inspections in Chicago

### Abstract: 
This dataset records the results of food inspections carried out in Chicago since 2010. For each inspection, we have the name and location of the food establishment and the result of the inspection. We can use it to find areas that tend to fail inspections and compare them with the crime rate. This can tell us whether crimes are concentrated in areas with well-rated establishments (possibly richer areas) or poorly rated ones (possibly poorer areas).

### Notes 

This dataset is available at: https://www.kaggle.com/chicago/chicago-food-inspections

In [3]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

## 0. Data Loading

In [9]:
# Load the data 
FOOD_INSPECTION_PATH = './data/chicago-food-inspections/food-inspections.csv'
food_inspection_data = pd.read_csv(FOOD_INSPECTION_PATH)

In [10]:
food_inspection_data.describe()

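| | Inspection ID | License # | Zip | Latitude | Longitude | Historical Wards 2003-2015 | Zip Codes | Community Areas | Census Tracts | Wards |
|---|---|---|---|---|---|---|---|---|---|---|
| count | 195979.0 | 195962.0 | 195929.0 | 195286.0 | 195286.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| mean | 1445550.0 | 1599125.0 | 60628.737813 | 41.881216 | -87.676832 | NaN | NaN | NaN | NaN | NaN |
| std | 634189.4 | 896163.0 | 122.183684 | 0.080843 | 0.058978 | NaN | NaN | NaN | NaN | NaN |
| min | 44247.0 | 0.0 | 10014.0 | 41.64467 | -87.914428 | NaN | NaN | NaN | NaN | NaN |
| 25% | 1150586.0 | 1224170.0 | 60614.0 | 41.834761 | -87.707795 | NaN | NaN | NaN | NaN | NaN |
| 50% | 1490878.0 | 1979615.0 | 60625.0 | 41.891908 | -87.666816 | NaN | NaN | NaN | NaN | NaN |
| 75% | 1995608.0 | 2240034.0 | 60643.0 | 41.939812 | -87.634955 | NaN | NaN | NaN | NaN | NaN |
| max | 2345702.0 | 9999999.0 | 60827.0 | 42.021064 | -87.525094 | NaN | NaN | NaN | NaN | NaN |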


In [11]:
food_inspection_data.head()

| | Inspection ID | DBA Name | AKA Name | License # | Facility Type | Risk | Address | City | State | Zip | ... | Results | Violations | Latitude | Longitude | Location | Historical Wards 2003-2015 | Zip Codes | Community Areas | Census Tracts | Wards |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2345678 | NEW KNOWLEDGE LEARNING CENTER, INC. | NEW KNOWLEDGE LEARNING CENTER, INC. | 2215898.0 | Children's Services Facility | Risk 1 (High) | 8440 S KEDZIE AVE | CHICAGO | IL | 60652.0 | ... | Pass | 53. TOILET FACILITIES: PROPERLY CONSTRUCTED, S... | 41.739458 | -87.702257 | {'longitude': '41.73945838739822', 'latitude':... | NaN | NaN | NaN | NaN | NaN |
| 1 | 2345670 | TACOS MONTANAS | TACOS MONTANAS | 2684320.0 | Restaurant | Risk 1 (High) | 3254 W LAWRENCE AVE | CHICAGO | IL | 60625.0 | ... | Pass w/ Conditions | 3. MANAGEMENT, FOOD EMPLOYEE AND CONDITIONAL E... | 41.968541 | -87.710704 | {'longitude': '41.96854124925287', 'latitude':... | NaN | NaN | NaN | NaN | NaN |
| 2 | 2345691 | MONGOLIAN CUISINE | MONGOLIAN CUISINE | 2689518.0 | Restaurant | Risk 1 (High) | 4640 N CUMBERLAND AVE | CHICAGO | IL | 60656.0 | ... | Not Ready | NaN | 41.964212 | -87.836837 | {'longitude': '41.96421158016989', 'latitude':... | NaN | NaN | NaN | NaN | NaN |
| 3 | 2345690 | ALCOCER'S LOCAL SHOP | ALCOCER'S LOCAL SHOP | 2578896.0 | Grocery Store | Risk 1 (High) | 3413 W 51ST ST | CHICAGO | IL | 60632.0 | ... | Pass | 47. FOOD & NON-FOOD CONTACT SURFACES CLEANABLE... | 41.800637 | -87.709274 | {'longitude': '41.800636651320424', 'latitude'... | NaN | NaN | NaN | NaN | NaN |
| 4 | 2345666 | GABY'S PANADERIA Y PIZZERIA | PANADERIA Y PIZZERIA LA VILLA | 2550529.0 | Bakery | Risk 1 (High) | 5050-5054 W FULLERTON AVE | CHICAGO | IL | 60639.0 | ... | Pass | 47. FOOD & NON-FOOD CONTACT SURFACES CLEANABLE... | 41.924257 | -87.753258 | {'longitude': '41.92425666505584', 'latitude':... | NaN | NaN | NaN | NaN | NaN |


## 1. Data Overview

In [13]:
# Look at the dimensions of the data:
food_inspection_data.shape

(195979, 22)

_Our data has 195979 rows (one per inspection record) and 22 columns (features)._
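
The `describe()` output above reports a count of 0 for the last five columns (`Historical Wards 2003-2015`, `Zip Codes`, `Community Areas`, `Census Tracts`, `Wards`), so fewer than 22 features are likely to be usable. Below is a minimal sketch of how this could be checked. It runs on a small made-up frame standing in for `food_inspection_data`, and the helper name `drop_empty_columns` is hypothetical.

```python
import numpy as np
import pandas as pd


def drop_empty_columns(df):
    """Return a copy of df without the columns that contain only missing values."""
    return df.dropna(axis=1, how='all')


# Toy stand-in for food_inspection_data (values are illustrative only).
toy = pd.DataFrame({
    'Inspection ID': [1, 2, 3],
    'Zip': [60652.0, np.nan, 60656.0],
    'Wards': [np.nan, np.nan, np.nan],
})

# Number of missing values in each column.
missing_per_column = toy.isna().sum()
print(missing_per_column)

# Columns that are entirely empty are removed; partially filled ones are kept.
cleaned = drop_empty_columns(toy)
print(cleaned.shape)
```

On the real data, the same two calls could be applied to `food_inspection_data` directly.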

### 1.1. Features description

In [15]:
food_inspection_data.columns

Index(['Inspection ID', 'DBA Name', 'AKA Name', 'License #', 'Facility Type',
       'Risk', 'Address', 'City', 'State', 'Zip', 'Inspection Date',
       'Inspection Type', 'Results', 'Violations', 'Latitude', 'Longitude',
       'Location', 'Historical Wards 2003-2015', 'Zip Codes',
       'Community Areas', 'Census Tracts', 'Wards'],
      dtype='object')

We found more information about the features at https://data.cityofchicago.org/api/assets/BAD5301B-681A-4202-9D25-51B2CAE672FF , which describes each of them:

**Features description:**

- **DBA:** ‘Doing business as.’ This is the legal name of the establishment.

- **AKA:** ‘Also known as.’ This is the name the public would know the establishment as.

- **License number:** This is a unique number assigned to the establishment for the purposes of licensing by the Department of Business Affairs and Consumer Protection.

- **Type of facility:** Each establishment is described by one of the following: bakery, banquet hall, candy store, caterer, coffee shop, day care center (for ages less than 2), day care center (for ages 2 – 6), day care center (combo, for ages less than 2 and 2 – 6 combined), gas station, Golden Diner, grocery store, hospital, long term care center (nursing home), liquor store, mobile food dispenser, restaurant, paleteria, school, shelter, tavern, social club, wholesaler, or Wrigley Field Rooftop.

- **Risk category of facility:** Each establishment is categorized as to its risk of adversely affecting the public’s health, with 1 being the highest and 3 the lowest. The frequency of inspection is tied to this risk, with risk 1 establishments inspected most frequently and risk 3 least frequently.

- **Street address, city, state and zip code of facility:** This is the complete address where the facility is located.

- **Inspection date:** This is the date the inspection occurred. A particular establishment is likely to have multiple inspections which are denoted by different inspection dates.

- **Inspection type:** An inspection can be one of the following types: canvass, the most common type of inspection performed at a frequency relative to the risk of the establishment; consultation, when the inspection is done at the request of the owner prior to the opening of the establishment; complaint, when the inspection is done in response to a complaint against the establishment; license, when the inspection is done as a requirement for the establishment to receive its license to operate; suspect food poisoning, when the inspection is done in response to one or more persons claiming to have gotten ill as a result of eating at the establishment (a specific type of complaint-based inspection); task-force inspection, when an inspection of a bar or tavern is done. Re-inspections can occur for most types of these inspections and are indicated as such.

- **Results:** An inspection can pass, pass with conditions or fail. Establishments receiving a ‘pass’ were found to have no critical or serious violations (violation numbers 1-14 and 15-29, respectively). Establishments receiving a ‘pass with conditions’ were found to have critical or serious violations, but these were corrected during the inspection. Establishments receiving a ‘fail’ were found to have critical or serious violations that were not correctable during the inspection. An establishment receiving a ‘fail’ does not necessarily mean the establishment’s license is suspended. Establishments found to be out of business or not located are indicated as such.

- **Violations:** An establishment can receive one or more of 45 distinct violations (violation numbers 1-44 and 70). For each violation number listed for a given establishment, the requirement the establishment must meet in order for it to NOT receive a violation is noted, followed by a specific description of the findings that caused the violation to be issued. 
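
The `Results` and `Zip` columns described above are enough for a first look at which areas tend to fail inspections. The following is a rough sketch rather than the project's final method. It uses a small made-up frame and a hypothetical helper `fail_rate_by_zip`, and it assumes that failed inspections are labelled exactly `'Fail'` in `Results` (only `Pass`, `Pass w/ Conditions` and `Not Ready` appear in the rows shown earlier).

```python
import pandas as pd


def fail_rate_by_zip(df):
    """Share of inspections marked 'Fail' for each zip code, highest first."""
    # Keep only rows with a pass/fail style outcome and a known zip code.
    # The label 'Fail' is an assumption about how failures are encoded.
    graded = df[df['Results'].isin(['Pass', 'Pass w/ Conditions', 'Fail'])]
    graded = graded.dropna(subset=['Zip'])
    is_fail = graded['Results'].eq('Fail')
    # The mean of a boolean series is the fraction of True values.
    return is_fail.groupby(graded['Zip']).mean().sort_values(ascending=False)


# Toy stand-in for food_inspection_data (values are illustrative only).
toy = pd.DataFrame({
    'Zip': [60652.0, 60652.0, 60625.0, 60625.0, 60625.0, 60656.0],
    'Results': ['Pass', 'Fail', 'Pass', 'Pass w/ Conditions', 'Pass', 'Not Ready'],
})

rates = fail_rate_by_zip(toy)
print(rates)
```

Excluding outcomes such as `Not Ready` is a design choice: they carry no pass/fail information, so counting them would dilute the failure rate.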


In the data description file, there is also an interesting disclaimer about the duplicated elements:

**Disclaimer (duplicated data):**

Attempts have been made to minimize any and all duplicate inspection reports.
However, the dataset may still contain such duplicates and the appropriate precautions should
be exercised when viewing or analyzing these data. The result of the inspections (pass, pass
with conditions or fail) as well as the violations noted are based on the findings identified and
reported by the inspector at the time of the inspection, and may not reflect the findings noted at
other times.
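
Given this disclaimer, a quick duplicate check is worth sketching. The example below runs on made-up rows. Treating `Inspection ID` as the key that identifies one inspection report is an assumption, not something the data description states.

```python
import pandas as pd

# Toy stand-in for food_inspection_data (values are illustrative only).
toy = pd.DataFrame({
    'Inspection ID': [101, 102, 102, 103],
    'DBA Name': ['A', 'B', 'B', 'C'],
    'Results': ['Pass', 'Fail', 'Fail', 'Pass'],
})

# Rows that are identical to an earlier row in every column.
n_exact_duplicates = toy.duplicated().sum()

# Rows that share an Inspection ID with an earlier row
# (assumes the ID identifies a single inspection report).
n_repeated_ids = toy.duplicated(subset=['Inspection ID']).sum()

# Keep the first occurrence of each Inspection ID.
deduplicated = toy.drop_duplicates(subset=['Inspection ID'], keep='first')

print(n_exact_duplicates, n_repeated_ids, len(deduplicated))
```

Comparing the two counts on the real data would show whether repeated IDs are exact copies or conflicting reports, which calls for different handling.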