## Kansas City Crime and Property Data (2015-2020)
The following analysis uses Kansas City, MO data on crime and property code violations from 2015 to 2020.

All data was gathered from [OpenData KC](https://data.kcmo.org/) on December 5, 2020.

## Questions
- Are there differences in rate and longevity of property code violations across zip codes?
- What changes has Kansas City seen in the last five years in crime and code violations?
- What contributes to the crime rate?

In [5]:
# import necessary libraries and packages
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sb
%matplotlib inline

### Initial Problems with Data
- Column headers are not consistent between the yearly files (Reported Time vs. Reported_Time, From Time vs. From_Time, Location 1 vs. Location, a Firearm Used Flag header with trailing spaces, etc.), so they are renamed to one convention before the files are combined into a single dataset

In [58]:
# Map the header variants that appear in some yearly files to a single spelling
rename_map = {"From Time": "From_Time", "To Time": "To_Time", "Reported Time": "Reported_Time",
              "Location 1": "Location", "Firearm Used Flag  ": "Firearm Used Flag"}

# Read each yearly file and apply the same renaming
crime_2010 = pd.read_csv('kc_crime_2010.csv').rename(columns=rename_map)
crime_2011 = pd.read_csv('kc_crime_2011.csv').rename(columns=rename_map)
crime_2012 = pd.read_csv('kc_crime_2012.csv').rename(columns=rename_map)
crime_2013 = pd.read_csv('kc_crime_2013.csv').rename(columns=rename_map)
crime_2014 = pd.read_csv('kc_crime_2014.csv').rename(columns=rename_map)
crime_2015 = pd.read_csv('kc_crime_2015.csv').rename(columns=rename_map)
crime_2016 = pd.read_csv('kc_crime_2016.csv').rename(columns=rename_map)
crime_2017 = pd.read_csv('kc_crime_2017.csv').rename(columns=rename_map)
crime_2018 = pd.read_csv('kc_crime_2018.csv').rename(columns=rename_map)
crime_2019 = pd.read_csv('kc_crime_2019.csv').rename(columns=rename_map)

In [59]:
crime_frames = [crime_2010, crime_2011, crime_2012, crime_2013, crime_2014, crime_2015, crime_2016, crime_2017, crime_2018, crime_2019]
crimes = pd.concat(crime_frames, ignore_index=True)
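
The load, rename, and concatenate steps above could also be folded into one helper. The following is only a sketch: the function name `load_crime_files` is hypothetical, and the two in-memory "files" are made-up stand-ins so the example runs without the real CSVs.

```python
import io
import pandas as pd

# Header variants seen in some yearly files, mapped to one spelling.
RENAME_MAP = {
    "From Time": "From_Time",
    "To Time": "To_Time",
    "Reported Time": "Reported_Time",
    "Location 1": "Location",
    "Firearm Used Flag  ": "Firearm Used Flag",
}

def load_crime_files(sources):
    """Read each CSV source, harmonize its headers, and stack the rows.

    Hypothetical helper; `sources` may be file paths or file-like objects.
    """
    frames = [pd.read_csv(src).rename(columns=RENAME_MAP) for src in sources]
    return pd.concat(frames, ignore_index=True)

# Made-up stand-ins for two yearly files with differing headers.
sample_a = io.StringIO("Report_No,Reported Time,Location 1\n1,13:46,A\n")
sample_b = io.StringIO("Report_No,Reported_Time,Location\n2,21:00,B\n")

combined = load_crime_files([sample_a, sample_b])
print(combined.columns.tolist())
```

With the real data, the same helper could be called as `load_crime_files([f'kc_crime_{year}.csv' for year in range(2010, 2020)])`, assuming the file naming used in the cells above.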

In [60]:
print(crimes.shape)
print(crimes.columns)
crimes.head()

(1234605, 26)
Index(['Report_No', 'Reported_Date', 'Reported_Time', 'From_Date', 'From_Time',
       'To_Date', 'To_Time', 'Offense', 'IBRS', 'Description', 'Beat',
       'Address', 'Zip Code', 'City', 'Rep_Dist', 'Area', 'DVFlag', 'Invl_No',
       'Involvement', 'Race', 'Sex', 'Age', 'Location', 'Firearm Used Flag',
       'Latitude', 'Longitude'],
      dtype='object')


Unnamed: 0,Report_No,Reported_Date,Reported_Time,From_Date,From_Time,To_Date,To_Time,Offense,IBRS,Description,...,DVFlag,Invl_No,Involvement,Race,Sex,Age,Location,Firearm Used Flag,Latitude,Longitude
0,100048265,06/27/2010,13:46,06/27/2010,13:40,,,2655,90J,Trespassing,...,U,1.0,VIC,,,,"100 13 ST\nKANSAS CITY, MO\n(39.19607633300046...",N,,
1,100043775,06/11/2010,21:00,06/10/2010,10:00,,,650,23G,Stealing Auto Parts/,...,U,1.0,VIC,B,M,40.0,,N,,
2,100030602,04/27/2010,22:12,04/27/2010,19:00,,,1120,26A,Fraud,...,U,1.0,VIC,W,F,59.0,"600 NORTON AV\nKANSAS CITY, MO 64124\n(39.1056...",N,,
3,100000611,01/03/2010,19:41,01/03/2010,19:41,,,1120,26A,Fraud,...,U,2.0,VIC,,,,"25 GRAND\nKANSAS CITY, MO 64108\n(39.076060142...",N,,
4,100036538,05/17/2010,17:17,03/15/2010,12:00,,,640,23F,Stealing From Auto,...,U,1.0,SUS,U,U,,,N,,


In [63]:
crimes.to_csv('crimes_2010-2019.csv',index=False)

In [64]:
crimes_total = pd.read_csv('crimes_2010-2019.csv')


In [66]:
crimes_total.head()

Unnamed: 0,Report_No,Reported_Date,Reported_Time,From_Date,From_Time,To_Date,To_Time,Offense,IBRS,Description,...,DVFlag,Invl_No,Involvement,Race,Sex,Age,Location,Firearm Used Flag,Latitude,Longitude
0,100048265,06/27/2010,13:46,06/27/2010,13:40,,,2655,90J,Trespassing,...,U,1.0,VIC,,,,"100 13 ST\nKANSAS CITY, MO\n(39.19607633300046...",N,,
1,100043775,06/11/2010,21:00,06/10/2010,10:00,,,650,23G,Stealing Auto Parts/,...,U,1.0,VIC,B,M,40.0,,N,,
2,100030602,04/27/2010,22:12,04/27/2010,19:00,,,1120,26A,Fraud,...,U,1.0,VIC,W,F,59.0,"600 NORTON AV\nKANSAS CITY, MO 64124\n(39.1056...",N,,
3,100000611,01/03/2010,19:41,01/03/2010,19:41,,,1120,26A,Fraud,...,U,2.0,VIC,,,,"25 GRAND\nKANSAS CITY, MO 64108\n(39.076060142...",N,,
4,100036538,05/17/2010,17:17,03/15/2010,12:00,,,640,23F,Stealing From Auto,...,U,1.0,SUS,U,U,,,N,,


In [65]:
crimes_total.dtypes

Report_No             object
Reported_Date         object
Reported_Time         object
From_Date             object
From_Time             object
To_Date               object
To_Time               object
Offense               object
IBRS                  object
Description           object
Beat                  object
Address               object
Zip Code             float64
City                  object
Rep_Dist              object
Area                  object
DVFlag                object
Invl_No              float64
Involvement           object
Race                  object
Sex                   object
Age                  float64
Location              object
Firearm Used Flag     object
Latitude             float64
Longitude            float64
dtype: object

In [67]:
crimes_total.isnull().sum()

Report_No                 0
Reported_Date             0
Reported_Time             0
From_Date              1864
From_Time              2734
To_Date              765650
To_Time              741941
Offense                   0
IBRS                  24356
Description           14943
Beat                   5653
Address              125319
Zip Code              31748
City                 125362
Rep_Dist              33104
Area                   7201
DVFlag                    0
Invl_No               78046
Involvement               0
Race                 167103
Sex                  165867
Age                  531361
Location              14307
Firearm Used Flag         0
Latitude             984827
Longitude            984827
dtype: int64
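
Raw counts are easier to judge as a share of all rows. The sketch below shows one way to express missingness as a percentage; `missing_share` is a hypothetical helper, and the two counts are copied from the outputs above (1,234,605 rows, 31,748 missing zip codes).

```python
import pandas as pd

def missing_share(df):
    """Return the percentage of missing values per column, largest first."""
    return (df.isnull().mean() * 100).sort_values(ascending=False).round(2)

# Worked check using the counts printed above.
total_rows = 1234605
zip_missing = 31748
zip_missing_pct = round(zip_missing / total_rows * 100, 2)
print(zip_missing_pct)
```

Calling `missing_share(crimes_total)` would give the same kind of figure for every column at once.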

### Next Steps for Data Cleaning (Crimes)
- Convert date columns to a datetime type
- Convert the flag columns (domestic violence, firearm used) to categorical variables
- Convert zip codes to strings
- Drop columns that are too sparse to be useful for analysis (To_Date, To_Time, Latitude, Longitude)
- Impute missing data where possible (Invl_No, etc.)
- Remove rows with a missing zip code (less than 3% of the data)
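
The steps listed above might look roughly like the following. This is a minimal sketch, not the notebook's actual cleaning code: `clean_crimes` is a hypothetical name, the date format `%m/%d/%Y` is assumed from the sample rows shown earlier, imputation is left out because it depends on the analysis, and the two-row sample frame is made up for illustration.

```python
import pandas as pd

def clean_crimes(df):
    """Apply the listed cleaning steps to a crimes DataFrame (sketch)."""
    # Remove rows without a zip code; copy so later assignments are safe.
    out = df.dropna(subset=["Zip Code"]).copy()
    # Drop the sparse columns if they are present.
    out = out.drop(columns=["To_Date", "To_Time", "Latitude", "Longitude"],
                   errors="ignore")
    # Zip codes were read as floats (e.g. 64124.0); store them as strings.
    out["Zip Code"] = out["Zip Code"].astype(int).astype(str).str.zfill(5)
    # Parse dates; values that do not match the assumed format become NaT.
    for col in ["Reported_Date", "From_Date"]:
        out[col] = pd.to_datetime(out[col], format="%m/%d/%Y", errors="coerce")
    # Treat the flag columns as categorical.
    for col in ["DVFlag", "Firearm Used Flag"]:
        out[col] = out[col].astype("category")
    return out

# Made-up sample rows with the same column names as the crimes data.
sample = pd.DataFrame({
    "Reported_Date": ["06/27/2010", "06/11/2010"],
    "From_Date": ["06/27/2010", None],
    "To_Date": [None, None],
    "Zip Code": [64124.0, None],
    "DVFlag": ["U", "U"],
    "Firearm Used Flag": ["N", "N"],
})
cleaned = clean_crimes(sample)
print(cleaned.dtypes)
```

Dropping the rows with missing zip codes first keeps the integer conversion from failing on NaN values.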

In [4]:
violations = pd.read_csv('kc_property_violations.csv')
print(violations.shape)
violations.head()


(777205, 23)


Unnamed: 0,Property Violation ID,Case ID,Status,Case Opened Date,Case Closed Date,Days Open,Violation Code,Violation Description,Ordinance Number,Ordinance Chapter,...,State,Zip Code,Latitude,Longitude,KIVA PIN,Council District,Police Patrol Area,Inspection Area,Neighborhood,Code Violation Location
0,1225153,2019141615,Closed,07/30/2019,06/02/2020,62.0,NSWLOPSTOR,UNAPPROVED STORAGE,48-32 C.O.,48,...,MO,64130.0,39.0527,-94.52384,32107,3.0,East,49,Vineyard,"5111 E 40th St\nMO 64130\n(39.0527, -94.52384)"
1,1059428,2012034662,Closed,03/21/2012,09/28/2012,140.0,NSWLLIMBS,LIMBS AND BRUSH,48-25 C.O.,48,...,MO,64132.0,38.99297,-94.56039,115469,5.0,Metro,127,East Meyer 7,"2300 E 74th St\nMO 64132\n(38.99297, -94.56039)"
2,1124057,2014129843,Open,10/08/2014,,2250.0,NSWLLIMBS,LIMBS AND BRUSH,48-25 C.O.,48,...,MO,64131.0,38.99342,-94.57131,114703,5.0,Metro,128,East Meyer 6,"7344 Lydia Ave\nMO 64131\n(38.99342, -94.57131)"
3,1125358,2014139084,Closed,10/30/2014,06/01/2015,180.0,NSWLOPSTOR,UNAPPROVED STORAGE,48-32 C.O.,48,...,MO,64109.0,39.07638,-94.5657,27120,3.0,Central,25,Beacon Hills,"2734 PASEO\nMO 64109\n(39.07638, -94.5657)"
4,1162495,2016086040,Closed,07/25/2016,08/01/2016,7.0,NSWLLIMBS,LIMBS AND BRUSH,48-25 C.O.,48,...,MO,64128.0,39.06858,-94.54791,25056,3.0,East,57,Santa Fe,"3005 E 32nd St\nMO 64128\n(39.06858, -94.54791)"


In [6]:
violations.columns

Index(['Property Violation ID', 'Case ID', 'Status', 'Case Opened Date',
       'Case Closed Date', 'Days Open', 'Violation Code',
       'Violation Description', 'Ordinance Number', 'Ordinance Chapter',
       'Violation Entry Date', 'Address', 'County', 'State', 'Zip Code',
       'Latitude', 'Longitude', 'KIVA PIN', 'Council District',
       'Police Patrol Area', 'Inspection Area', 'Neighborhood',
       'Code Violation Location'],
      dtype='object')

In [45]:
violations.isnull().sum()

Property Violation ID           0
Case ID                         0
Status                          0
Case Opened Date                0
Case Closed Date           140280
Days Open                      76
Violation Code                  0
Violation Description           0
Ordinance Number                0
Ordinance Chapter               0
Violation Entry Date            1
Address                         5
County                        182
State                           4
Zip Code                      107
Latitude                        0
Longitude                       0
KIVA PIN                        0
Council District              289
Police Patrol Area            289
Inspection Area              1792
Neighborhood                 1250
Code Violation Location         0
dtype: int64

In [47]:
violations.dtypes

Property Violation ID        int64
Case ID                      int64
Status                      object
Case Opened Date            object
Case Closed Date            object
Days Open                  float64
Violation Code              object
Violation Description       object
Ordinance Number            object
Ordinance Chapter            int64
Violation Entry Date        object
Address                     object
County                      object
State                       object
Zip Code                   float64
Latitude                   float64
Longitude                  float64
KIVA PIN                     int64
Council District           float64
Police Patrol Area          object
Inspection Area             object
Neighborhood                object
Code Violation Location     object
dtype: object
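
As a possible starting point for the first question (longevity of violations across zip codes), `Days Open` could be summarized per zip code. The sketch below is an assumption about how that might be done, not part of the original analysis; `days_open_by_zip` is a hypothetical helper, and the sample values are taken from the five rows displayed by `violations.head()` above.

```python
import pandas as pd

def days_open_by_zip(df):
    """Count violations and take the median of Days Open per zip code."""
    out = df.dropna(subset=["Zip Code", "Days Open"]).copy()
    # Zip codes were read as floats; convert to strings for grouping.
    out["Zip Code"] = out["Zip Code"].astype(int).astype(str)
    summary = out.groupby("Zip Code")["Days Open"].agg(["count", "median"])
    return summary.sort_values("median", ascending=False)

# Zip Code and Days Open values from the five rows shown by violations.head().
sample = pd.DataFrame({
    "Zip Code": [64130.0, 64132.0, 64131.0, 64109.0, 64128.0],
    "Days Open": [62.0, 140.0, 2250.0, 180.0, 7.0],
})
summary = days_open_by_zip(sample)
print(summary)
```

The median is used here rather than the mean because long-running open cases (such as the 2,250-day case in the sample) would otherwise dominate the average.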