# Calls for Service Data - 2015


### Prerequisite

  Install [Anaconda](https://www.continuum.io/downloads), choosing the latest version that ships with Python **3.5**. Once Anaconda is installed, check its version from your command prompt.
  
``` > conda --version ```



> Purpose: This dataset reflects incidents that have been reported to the New Orleans Police Department. 

Data source: [NOPD policing data](http://nopdnews.com/transparency/policing-data/)

Download the files from the link above into a folder. Open a command line/command prompt in that folder and run the command below.

```> jupyter notebook```

A new browser tab will open. There you will see this notebook with the name **Exploratory Data Analysis - Calls for service**.

The notebook covers the following.

    Steps:
    1. Loading the CSV dataset.
    2. Understanding the columns.
    3. Working on the use case.

### Loading the dataset

  We can use [Pandas](http://pandas.pydata.org/pandas-docs/stable/index.html), a powerful Python data analysis toolkit, for the data cleaning. It is fast and efficient. We give documentation links for the functions used in the code. The 2015 Calls for Service data is in CSV format.

In [33]:
## import pandas
import pandas as pd

In [34]:
# store the csv file as dataframe. Calls For Service (CFS)
# http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html
# csv files are stored in the directory where jupyter notebook is running. 
cfs_df = pd.read_csv('Calls_for_Service_2015.csv')

In [35]:
cfs_df.head()

Unnamed: 0,﻿NOPD_Item,Type_,TypeText,Priority,InitialType,InitialTypeText,InitialPriority,MapX,MapY,TimeCreate,...,TimeArrive,TimeClosed,Disposition,DispositionText,SelfInitiated,Beat,BLOCK_ADDRESS,Zip,PoliceDistrict,Location
0,A0000115,56,SIMPLE CRIMINAL DAMAGE,1D,94F,FIREWORKS,1A,3682553,532626,01/01/2015 12:00:34 AM,...,01/01/2015 01:41:20 AM,01/01/2015 01:41:30 AM,UNF,UNFOUNDED,N,8D06,007XX Orleans Ave,70116.0,8,"(29.95850519, -90.06470624)"
1,A0000215,21,COMPLAINT OTHER,1H,21,COMPLAINT OTHER,1H,3682368,532820,01/01/2015 12:00:36 AM,...,01/01/2015 12:00:36 AM,01/01/2015 01:31:54 AM,NAT,Necessary Action Taken,Y,8E01,Bourbon St & Orleans Ave,70116.0,8,"(29.95904477, -90.06528204)"
2,A0000415,94,DISCHARGING FIREARM,1A,94,DISCHARGING FIREARM,2B,3686245,546280,01/01/2015 12:01:47 AM,...,,01/01/2015 01:32:38 AM,UNF,UNFOUNDED,N,3Y03,Clematis St & Acacia St,70122.0,3,"(29.99593586, -90.05256561)"
3,A0000515,107,SUSPICIOUS PERSON,2A,107,SUSPICIOUS PERSON,2A,3687521,537825,01/01/2015 12:02:22 AM,...,01/01/2015 12:13:19 AM,01/01/2015 12:24:40 AM,GOA,GONE ON ARRIVAL,N,5C03,026XX N Robertson St,70117.0,5,"(29.97264816, -90.04883217)"
4,A0000615,21,COMPLAINT OTHER,1H,21,COMPLAINT OTHER,1H,3682082,529645,01/01/2015 12:02:44 AM,...,,01/01/2015 01:22:17 AM,VOI,VOID,N,8G02,003XX Canal St,70130.0,8,"(29.95032257, -90.06629572)"


In [36]:
# how many calls-for-service incidents were recorded in 2015? 432739
cfs_df.shape

(432739, 21)

In [37]:
# printing the column names.
list(cfs_df)

['\ufeffNOPD_Item',
 'Type_',
 'TypeText',
 'Priority',
 'InitialType',
 'InitialTypeText',
 'InitialPriority',
 'MapX',
 'MapY',
 'TimeCreate',
 'TimeDispatch',
 'TimeArrive',
 'TimeClosed',
 'Disposition',
 'DispositionText',
 'SelfInitiated',
 'Beat',
 'BLOCK_ADDRESS',
 'Zip',
 'PoliceDistrict',
 'Location']
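
The first column name above starts with a byte-order mark (`\ufeff`), so an attribute lookup such as `cfs_df.NOPD_Item` would not resolve. A minimal sketch of one way to clean the names, using a tiny stand-in frame rather than the real file (the sample values are copied from the preview above):

```python
import pandas as pd

# Stand-in frame whose first column name starts with a BOM, as in the listing above.
sample_df = pd.DataFrame({'\ufeffNOPD_Item': ['A0000115', 'A0000215'],
                          'Type_': ['56', '21']})

# Strip a leading BOM from every column name; names without one are unchanged.
sample_df.columns = sample_df.columns.str.lstrip('\ufeff')

print(list(sample_df.columns))
```

Passing `encoding='utf-8-sig'` to `pd.read_csv` is another option that may avoid the marker at load time, assuming the file is UTF-8 with a BOM.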

In [38]:
cfs_df.Type_.unique()

array(['56', '21', '94', '107', '94F', '63', '62A', '100', '17J', '911',
       '24', '103D', '103', '18', '103M', '37', '106', '67P', '66D', '68',
       '64A', '21P', '103F', '20', '52F', '21M', '67', '21N', '62R', '17M',
       '21H', '20I', '58R', '35', '34C', '35D', '58', '65P', '67A', '99',
       '67F', '60', '65', '98', '64G', '20X', '29S', '100I', '17R', '38',
       '64K', '56D', 'WALK', '62C', '67S', '21TEST', '67B', '67C', '102',
       '59', '19', '30S', '966', '29', '55', '62', '21J', '67AR', '42',
       '21C', '62L', '17F', '66', '29U', '52', '62B', '38D', '37D', '20F',
       '17T', '95G', '67E', '81', '34S', '18DE', '79', '34D', '34',
       '1028P', '21R', '21T', '69', '65J', '43', '72', '43B', '21U', '95',
       '51', '108', '94S', '284', '100X', '30C', '23', '64', '95K', '30D',
       '17', '64J', '107U', '1055', '21S', '82', '83', '107S', '71', '51B',
       '39', '110', '54', '112', '30', '42M', '21MG', '62D', '42U', '21F',
       '63P', '18A', '21B', '90', 'NOP

In [39]:
# `sheetname` was renamed to `sheet_name` in pandas 0.21
crime_types = pd.read_excel('MAX_CFS_UCR_Categories.xlsx', sheet_name='Sheet1')

In [40]:
# keep only the code and its UCR category (.ix is deprecated; .loc selects by label)
crime_types = crime_types.loc[:, ['Code', 'UCR MAIN']]

In [41]:
import re
def remove_before_hyphen(x):
    x = x.strip()
    is_hyphen = re.search('-',x)
    if is_hyphen:
        return x[is_hyphen.span()[1]:]
    else:
        return x

In [42]:
crime_types.Code = crime_types.Code.map(str).map(remove_before_hyphen)
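
To illustrate what `remove_before_hyphen` does, here is a small standalone sketch using `str.partition`, which should behave the same way: strip whitespace, then keep whatever follows the first hyphen. The helper name `after_first_hyphen` and the sample codes are hypothetical and not taken from the spreadsheet.

```python
def after_first_hyphen(code):
    """Return the text after the first hyphen, or the stripped input if there is none."""
    code = code.strip()
    head, sep, tail = code.partition('-')
    # sep is '' when no hyphen was found, in which case head is the whole string
    return tail if sep else head

# Hypothetical inputs, for illustration only.
samples = ['CFS-100', ' 21P ', 'A-B-C']
cleaned = [after_first_hyphen(s) for s in samples]
print(cleaned)
```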

In [43]:
crime_types.rename(columns={'Code':'Type_','UCR MAIN':'crime_type'},inplace=True)
crime_types.head()

Unnamed: 0,Type_,crime_type
0,100,OTHER CFS
1,100F,OTHER CFS
2,100I,OTHER CFS
3,100X,OTHER CFS
4,101,OTHER CFS


In [44]:
cfs_df.Type_ = cfs_df.Type_.str.strip()

In [45]:
cfs_df = pd.merge(cfs_df,crime_types,on='Type_',how='left')

In [46]:
# codes without a match in the lookup table are labelled 'MISSING'
cfs_df['crime_type'] = cfs_df['crime_type'].fillna('MISSING')
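
A left merge keeps every row of `cfs_df`, but it also repeats a row once per matching lookup entry, so duplicate `Type_` values in `crime_types` would inflate the row count. The sketch below, on made-up frames, shows the effect and one way to guard against it; whether the real lookup contains duplicates is an assumption to verify.

```python
import pandas as pd

# Made-up stand-ins for cfs_df and crime_types.
calls = pd.DataFrame({'Type_': ['21', '94', '21']})
lookup = pd.DataFrame({'Type_': ['21', '21', '94'],
                       'crime_type': ['OTHER CFS', 'OTHER CFS', 'OTHER CFS']})

# The duplicated '21' key makes each matching call appear twice.
inflated = pd.merge(calls, lookup, on='Type_', how='left')

# Keep one lookup row per code, then ask pandas to enforce many-to-one.
deduped = lookup.drop_duplicates(subset='Type_')
safe = pd.merge(calls, deduped, on='Type_', how='left', validate='many_to_one')

print(len(calls), len(inflated), len(safe))
```

With `validate='many_to_one'`, pandas raises a `MergeError` if the right-hand keys are not unique, which surfaces the problem before any counts are computed.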

## Use-case

 From the department's [analysis report](https://insight.livestories.com/s/project-transparent/569fc0404372f40014285093/), here is how we believe the use-case is defined.
 
 [![Calls For Service.png](https://s29.postimg.org/6indkkh9j/Calls_For_Service.png)](https://postimg.org/image/mtnhgvtr7/)
 
 **Terminology**
 
 1. 911 - Emergency call. Can be initiated by the public, an NOPD officer (self-initiated), or any state entity working with NOPD. Its priority code starts with **2**.
 2. 311 - Non-emergency call. Can be initiated by the public, an NOPD officer (self-initiated), or any state entity working with NOPD. Its priority code starts with **1**.
 
 
A call is considered an emergency call iff both its **InitialPriority** (column name in the dataframe) and **Priority** codes start with 2.

A call is considered a non-emergency call iff both its **InitialPriority** and **Priority** codes start with 1.
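
The rule above tests the first character of each code. A minimal sketch of that test with `str.startswith`, on hypothetical sample rows; `both_start_with` is an illustrative helper, not part of the notebook, and missing codes are treated as non-matching (an assumption):

```python
import pandas as pd

# Hypothetical sample rows; only the two priority columns matter here.
sample = pd.DataFrame({'InitialPriority': ['2B', '1A', '1H', None],
                       'Priority':        ['2A', '2C', '1H', '1D']})

def both_start_with(df, digit):
    """True where InitialPriority and Priority both start with `digit`."""
    initial = df['InitialPriority'].str.startswith(digit, na=False)
    final = df['Priority'].str.startswith(digit, na=False)
    return initial & final

is_emergency = both_start_with(sample, '2')
is_non_emergency = both_start_with(sample, '1')
print(is_emergency.tolist(), is_non_emergency.tolist())
```

The notebook cells use `str.contains`, which matches the digit anywhere in the code; a prefix test is stricter and follows the wording of the definition more closely.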

Let's analyse from the beginning.


In [47]:
# keep only the rows that have both a Priority and a Disposition value.
cfs_df=cfs_df[pd.notnull(cfs_df.Priority)& pd.notnull(cfs_df.Disposition)]
cfs_df.shape

(460894, 22)
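
The filter above can also be written with `DataFrame.dropna` and its `subset` argument, which reads closer to the intent. A small sketch on made-up rows:

```python
import pandas as pd

# Made-up rows; Zip is included to show that nulls outside the subset are ignored.
sample = pd.DataFrame({'Priority':    ['1D', None, '2A'],
                       'Disposition': ['UNF', 'NAT', None],
                       'Zip':         [70116.0, None, 70122.0]})

# Drop rows missing Priority or Disposition.
kept = sample.dropna(subset=['Priority', 'Disposition'])
print(kept.shape)
```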

In [48]:
# how many self-initiated calls are there in 2015? 127789 calls
self_init_df = cfs_df[cfs_df.SelfInitiated=='Y']
n_self_init_df = cfs_df[cfs_df.SelfInitiated=='N']
self_init_df.shape[0],n_self_init_df.shape[0]

(127789, 333105)

In [49]:
## Let's visualize. I have used plotly, an easy-to-use visualization tool.
## https://plot.ly/python/bar-charts/

import plotly
# plotly.tools.set_credentials_file(username='<your-username>', api_key='<your-api-key>')
import plotly.graph_objs as go

data = [go.Bar(
            x=['Self-initiated CFS', 'Other CFS'],
            y=[self_init_df.shape[0],n_self_init_df.shape[0]]
    )]

# plotly.plotly.iplot(data, filename='Calls For Service')


In [50]:
# reset the row index of both subsets
# self_init_df = self_init_df[pd.notnull(self_init_df.Priority)]
# n_self_init_df = n_self_init_df[pd.notnull(n_self_init_df.Priority)]
self_init_df = self_init_df.reset_index(drop=True)
n_self_init_df = n_self_init_df.reset_index(drop=True)

In [51]:
print("In 2015 there were {0} self initiated calls and {1} other calls".format(len(self_init_df),len(n_self_init_df)))

In 2015 there were 127789 self initiated calls and 333105 other calls


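### How call priority is determined in the analysis

**Emergency calls**: a call is considered an emergency call iff both **InitialPriority** and **Priority** have emergency codes starting with 2. A call whose **InitialPriority** starts with 1 but whose **Priority** starts with 2 is counted as a converted emergency call, and the reverse case as a converted non-emergency call.
 
In this section we split the dataset along two dimensions: the call category and who initiated the call.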

|             NOLA              | Self-Initiated by officers | From Public/Other Departments |
|:-----------------------------:|:--------------------------:|-------------------------------|
|        Emergency Calls        |                            |                               |
|      Non-Emergency Calls      |                            |                               |
|   Converted Emergency Calls   |                            |                               |
| Converted Non-emergency Calls |                            |                               |
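
The table can be filled in one pass by labelling each call and cross-tabulating against `SelfInitiated`. This is a sketch on hypothetical rows, not the notebook's own method; the `category` column and the `'Other'` fallback label are assumptions introduced for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows with the three columns the table depends on.
sample = pd.DataFrame({
    'InitialPriority': ['2B', '1A', '1H', '2A', '1H'],
    'Priority':        ['2A', '2C', '1H', '1D', '1A'],
    'SelfInitiated':   ['Y',  'N',  'N',  'Y',  'N'],
})

init = sample['InitialPriority'].str[0]   # leading digit of the initial code
final = sample['Priority'].str[0]         # leading digit of the final code

conditions = [(init == '2') & (final == '2'),
              (init == '1') & (final == '1'),
              (init == '1') & (final == '2'),
              (init == '2') & (final == '1')]
labels = ['Emergency Calls', 'Non-Emergency Calls',
          'Converted Emergency Calls', 'Converted Non-emergency Calls']

# Rows matching none of the four conditions (e.g. priority 0) fall into 'Other'.
sample['category'] = np.select(conditions, labels, default='Other')
table = pd.crosstab(sample['category'], sample['SelfInitiated'])
print(table)
```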



In [52]:
# self-initiated emergency calls
all_emergency_self_init_calls = self_init_df[((self_init_df.InitialPriority.str.contains("2"))
                                              & self_init_df.Priority.str.contains("2"))]
all_emergency_self_init_calls = all_emergency_self_init_calls.reset_index(drop=True)
print("In 2015 NOPD Officers initiated {0} emergency calls".format(len(all_emergency_self_init_calls)))

In 2015 NOPD Officers initiated 6151 emergency calls


In [53]:
# call initiated as non-emergency calls and converted as emergencies by officers
conv_emergency_self_init_calls = self_init_df[((self_init_df.InitialPriority.str.contains("1"))
                                              & self_init_df.Priority.str.contains("2"))]
conv_emergency_self_init_calls = conv_emergency_self_init_calls.reset_index(drop=True)
print("In 2015 NOPD Officers initiated {0} calls as non-emergency calls and converted to emergency calls".format
      (len(conv_emergency_self_init_calls)))

In 2015 NOPD Officers initiated 1147 calls as non-emergency calls and converted to emergency calls


In [54]:
non_emergency_self_init_calls = self_init_df[((self_init_df.InitialPriority.str.contains("1"))
                                              & self_init_df.Priority.str.contains("1"))]
non_emergency_self_init_calls = non_emergency_self_init_calls.reset_index(drop=True)
print("In 2015 NOPD Officers initiated {0} non-emergency calls".format
      (len(non_emergency_self_init_calls)))

In 2015 NOPD Officers initiated 110797 non-emergency calls


In [55]:
conv_non_emer_self_init_calls = self_init_df[((self_init_df.InitialPriority.str.contains("2"))
                                              & self_init_df.Priority.str.contains("1"))]
conv_non_emer_self_init_calls = conv_non_emer_self_init_calls.reset_index(drop=True)
print("In 2015 NOPD Officers initiated {0} calls as emergency calls and converted to non-emergency calls".format
      (len(conv_non_emer_self_init_calls)))

In 2015 NOPD Officers initiated 576 calls as emergency calls and converted to non-emergency calls


There are also calls with Priority 0. We assume that they are trivial service calls and leave them out of this analysis.

In [56]:
all_emergency_other_calls = n_self_init_df[((n_self_init_df.InitialPriority.str.contains("2"))
                                              & n_self_init_df.Priority.str.contains("2"))]
all_emergency_other_calls = all_emergency_other_calls.reset_index(drop=True)
print("In 2015 NOPD Officers received {0} emergency calls from public and other departments"
      .format(len(all_emergency_other_calls)))

In 2015 NOPD Officers received 124450 emergency calls from public and other departments


In [57]:
conv_emergency_other_calls = n_self_init_df[((n_self_init_df.InitialPriority.str.contains("1"))
                                              & n_self_init_df.Priority.str.contains("2"))]
conv_emergency_other_calls = conv_emergency_other_calls.reset_index(drop=True)
print("In 2015 NOPD Officers received {0} calls as non-emergency calls that were converted to emergency calls".format
      (len(conv_emergency_other_calls)))

In 2015 NOPD Officers received 7504 calls as non-emergency calls that were converted to emergency calls


In [58]:
non_emergency_other_calls = n_self_init_df[((n_self_init_df.InitialPriority.str.contains("1"))
                                              & n_self_init_df.Priority.str.contains("1"))]
non_emergency_other_calls = non_emergency_other_calls.reset_index(drop=True)
print("In 2015 NOPD Officers received {0} non-emergency calls".format
      (len(non_emergency_other_calls)))

In 2015 NOPD Officers received 154833 non-emergency calls


In [59]:
conv_non_emer_other_calls = n_self_init_df[((n_self_init_df.InitialPriority.str.contains("2"))
                                              & n_self_init_df.Priority.str.contains("1"))]
conv_non_emer_other_calls = conv_non_emer_other_calls.reset_index(drop=True)
print("In 2015 NOPD Officers received {0} calls as emergency calls that were converted to non-emergency calls".format
      (len(conv_non_emer_other_calls)))

In 2015 NOPD Officers received 34105 calls as emergency calls that were converted to non-emergency calls


**Call Statistics**

In [60]:
self_init = go.Bar(
    x=['Emergency', 'Converted Emergency', 'Non-Emergency', 'Converted Non-Emergency'],
    y=[len(all_emergency_self_init_calls),len(conv_emergency_self_init_calls),len(non_emergency_self_init_calls),
       len(conv_non_emer_self_init_calls)],
    name='Self Initiated Calls'
)
other = go.Bar(
    x=['Emergency', 'Converted Emergency', 'Non-Emergency', 'Converted Non-Emergency'],
    y=[len(all_emergency_other_calls), len(conv_emergency_other_calls),len(non_emergency_other_calls),
      len(conv_non_emer_other_calls)],
    name='Public Calls'
)

data = [self_init, other]
layout = go.Layout(
    barmode='group'
)

fig = go.Figure(data=data, layout=layout)
# plotly.plotly.iplot(fig, filename='Call-Stats')

### Disposition

Disposition is the consequent action taken in response to a call. Here are some disposition types.


| Abbreviation |                                           Action                                           |
|:------------:|:------------------------------------------------------------------------------------------:|
|      UNF     |                     Officer determined that the incident did not occur                     |
|      NAT     |                     Officer handled the incident to completion on-scene                    |
|      GOA     |                  Complainant was not on the scene when the officer arrived                 |
|      VOI     |                               Complainant cancelled the call                               |
|      RTF     | A Police report or other report will be necessary to complete the handling of the incident |
|      DUP     |    An existing call such as when more than one person calls 9-1-1 on the same incident.    |
|      EST     |                         Test incident simulated by any department*                         |
|      FAR     |                                        False Alarm*                                        |



In [61]:
# this function takes a DataFrame and prints the number of calls per crime type and disposition.

def plot_disposition_crime_type(df):
    count_cases =  df.groupby(['crime_type','Disposition']).size()
#     data = [go.Bar(
#             x=count_cases.index.tolist(),
#             y=count_cases.values.tolist()
#     )]
#    
    print(count_cases)
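
The grouped counts are easier to compare across dispositions when pivoted into a table. A sketch of that reshaping with `unstack`, on made-up rows; the same call could be applied to `count_cases` inside the function:

```python
import pandas as pd

# Made-up rows using the same two columns the function groups by.
sample = pd.DataFrame({
    'crime_type':  ['OTHER CFS', 'OTHER CFS', 'VIOLENT CRIME', 'VIOLENT CRIME', 'VIOLENT CRIME'],
    'Disposition': ['NAT', 'UNF', 'RTF', 'RTF', 'UNF'],
})

counts = sample.groupby(['crime_type', 'Disposition']).size()
# One row per crime type, one column per disposition, 0 where a pair never occurs.
table = counts.unstack(fill_value=0)
print(table)
```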


## Emergency Calls - Crime Type vs Disposition

In [62]:
# public calls
plot_disposition_crime_type(all_emergency_other_calls)

crime_type      Disposition
MISSING         DUP                3
                VOI                1
OTHER CFS       DUP             5160
                FAR                1
                GOA            13424
                NAT            56823
                RTF            14200
                UNF            14716
                VOI            13644
PROPERTY CRIME  DUP              113
                GOA               96
                NAT               18
                RTF             1923
                UNF              379
                VOI               70
VIOLENT CRIME   DUP              274
                GOA              102
                NAT                1
                RTF             2928
                UNF              549
                VOI               25
dtype: int64


In [32]:
# self initiated calls
plot_disposition_crime_type(all_emergency_self_init_calls)

crime_type      Disposition
OTHER CFS       DUP             123
                GOA             199
                NAT            4869
                RTF             635
                UNF             132
                VOI              16
PROPERTY CRIME  DUP               2
                RTF              18
VIOLENT CRIME   DUP              12
                RTF             137
                UNF               8
dtype: int64


## Converted Emergency Calls - Crime Type vs Disposition

In [63]:
# Public calls
plot_disposition_crime_type(conv_emergency_other_calls)

crime_type      Disposition
OTHER CFS       DUP             281
                GOA             520
                NAT            1863
                RTF            3154
                UNF             616
                VOI             132
PROPERTY CRIME  DUP              30
                GOA               8
                RTF             409
                UNF              56
                VOI               8
VIOLENT CRIME   DUP               8
                GOA               4
                RTF             389
                UNF              24
                VOI               2
dtype: int64


In [64]:
# self-initiated calls converted to emergency
plot_disposition_crime_type(conv_emergency_self_init_calls)


## Non-Emergency Calls - Crime Type vs Disposition

In [65]:
# public non emergency
plot_disposition_crime_type(non_emergency_other_calls)

crime_type      Disposition
OTHER CFS       DUP             3225
                EST                1
                GOA             6481
                NAT            61867
                RTF            25396
                UNF            15742
                VOI             6529
PROPERTY CRIME  DUP              808
                GOA             1974
                NAT                4
                RTF            20125
                UNF             9368
                VOI             2082
VIOLENT CRIME   DUP              121
                GOA               90
                RTF              540
                UNF              434
                VOI               46
dtype: int64


In [66]:
# self-initiated non-emergency
plot_disposition_crime_type(non_emergency_self_init_calls)

crime_type      Disposition
OTHER CFS       DUP              546
                GOA              311
                NAT            93614
                RTF            11901
                UNF              299
                VOI              298
PROPERTY CRIME  DUP               59
                GOA               12
                RTF             3544
                UNF               96
                VOI                6
VIOLENT CRIME   DUP               11
                RTF               87
                UNF               13
dtype: int64


## Converted Non-Emergency Calls - Crime Type vs Disposition

In [67]:
# public non-emergency
plot_disposition_crime_type(conv_non_emer_other_calls)

crime_type      Disposition
OTHER CFS       DUP              304
                GOA             1512
                NAT            16098
                RTF             5324
                UNF             2655
                VOI             3481
PROPERTY CRIME  DUP               38
                GOA              157
                NAT                6
                RTF             2915
                UNF              712
                VOI               84
VIOLENT CRIME   DUP               28
                GOA               24
                RTF              581
                UNF              179
                VOI                7
dtype: int64


In [68]:
# self initiated converted non emergency
plot_disposition_crime_type(conv_non_emer_self_init_calls)

crime_type      Disposition
OTHER CFS       DUP              5
                GOA              5
                NAT            363
                RTF            128
                UNF             19
                VOI              5
PROPERTY CRIME  RTF             29
                UNF              2
VIOLENT CRIME   RTF             18
                UNF              2
dtype: int64


# Analysis To Follow

    1. More visualization
    2. Time Series analysis
    3. Analysis on crime types to see if there is a pattern
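
As a starting point for the time series item, the timestamp columns first need parsing. A hedged sketch, assuming the `MM/DD/YYYY hh:mm:ss AM/PM` layout seen in the preview holds for the whole file; the rows are copied from the first rows shown earlier, and `response_seconds` is a hypothetical column name.

```python
import pandas as pd

time_format = '%m/%d/%Y %I:%M:%S %p'
sample = pd.DataFrame({
    'TimeCreate': ['01/01/2015 12:00:34 AM', '01/01/2015 12:02:22 AM', '01/01/2015 12:01:47 AM'],
    'TimeArrive': ['01/01/2015 01:41:20 AM', '01/01/2015 12:13:19 AM', None],
})

# Parse both columns; a missing arrival time becomes NaT.
for column in ['TimeCreate', 'TimeArrive']:
    sample[column] = pd.to_datetime(sample[column], format=time_format)

# Seconds between call creation and arrival; NaN where no arrival was recorded.
sample['response_seconds'] = (sample['TimeArrive'] - sample['TimeCreate']).dt.total_seconds()
print(sample['response_seconds'].tolist())
```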