# Emergency 911 Call Log Project

By Savahnna L. Cunningham

Date: March 28, 2018


### SCENARIO

The governor has offered a funding incentive for police departments that are able to meet a minimum standard of having at least 2.5 officers per incident. Each police department must apply for this funding and provide data to prove eligibility.

You are a data analyst who has been recruited to do consulting work for your local police department. The police chief has asked you to analyze the logs from emergency 911 calls in the city and then provide a summary of that data, including graphic representations. He has also tasked you with using this data to determine whether the department qualifies for the governor’s funding incentive.
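One way to read the eligibility standard is as the mean number of officers dispatched per logged call. The sketch below assumes that interpretation, which the scenario does not spell out. `THRESHOLD` and `meets_standard` are illustrative names, and the sample values are the first ten `Total Officers` entries from the call log previewed later in this notebook.

```python
import pandas as pd

# Assumption: "officers per incident" means the mean of the
# `Total Officers` column across all logged calls.
THRESHOLD = 2.5  # minimum standard stated in the scenario


def meets_standard(calls: pd.DataFrame, threshold: float = THRESHOLD) -> bool:
    """Return True if the mean officers per call is at or above the threshold."""
    return bool(calls["Total Officers"].mean() >= threshold)


# First ten `Total Officers` values from the log preview (illustration only)
sample = pd.DataFrame({"Total Officers": [2, 2, 0, 2, 1, 2, 3, 1, 3, 1]})

print(sample["Total Officers"].mean())
print(meets_standard(sample))
```

Ten rows say nothing about the full log; the same function would be applied to the complete, cleaned dataset before drawing any conclusion.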

In [14]:
# Import major libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

## Data Wrangling Process

### Gather Data

In [17]:
# Read in the "raw_data" file

df = pd.read_excel("raw_data.xlsx")
df

Unnamed: 0,CAD CDW ID,Sector,Total Officers,Latitude,Longitude,Event,Timestamp
0,1702311,R,2,47.561256,-122.284904,DISTURBANCES,2016-03-26 17:17:00
1,1702312,G,2,47.604195,-122.294930,TRAFFIC RELATED CALLS,2016-03-26 17:15:00
2,1702316,O,0,47.571580,-122.335240,TRAFFIC RELATED CALLS,2016-03-26 17:19:00
3,1702317,Q,2,47.640358,-122.402370,SUSPICIOUS CIRCUMSTANCES,2016-03-26 17:26:00
4,1702318,G,1,47.604320,-122.301090,SUSPICIOUS CIRCUMSTANCES,2016-03-26 17:19:00
5,1702319,S,2,47.519850,-122.268030,LIQUOR VIOLATIONS,2016-03-26 17:17:00
6,1702321,R,3,47.579674,-122.299160,DISTURBANCES,2016-03-26 17:18:00
7,1702322,S,1,47.506386,-122.250630,FALSE ALACAD,2016-03-26 17:17:00
8,1702323,O,3,47.579807,-122.329056,DISTURBANCES,2016-03-26 17:18:00
9,1702324,C,1,47.627274,-122.314590,OTHER PROPERTY,2016-03-26 17:14:00


### Assessing data

In [18]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1046 entries, 0 to 1045
Data columns (total 7 columns):
CAD CDW ID        1046 non-null int64
Sector            1045 non-null object
Total Officers    1046 non-null int64
Latitude          1046 non-null float64
Longitude         1046 non-null float64
Event             1046 non-null object
Timestamp         1046 non-null datetime64[ns]
dtypes: datetime64[ns](1), float64(2), int64(2), object(2)
memory usage: 57.3+ KB


In [19]:
# How many unique event types are in the Event column?
df['Event'].nunique()

32

In [27]:
# Find the row with the missing value in the Sector column
df[df['Sector'].isnull()]

Unnamed: 0,CAD CDW ID,Sector,Total Officers,Latitude,Longitude,Event,Timestamp
224,1702543,,1,47.53204,-122.334335,TRAFFIC RELATED CALLS,2016-03-26 23:39:00
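The same check can be generalized to every column at once. This is a minimal sketch on a small illustrative frame (three rows with values taken from the log, with one `Sector` left missing); `missing_per_column` and `rows_missing_sector` are hypothetical names.

```python
import numpy as np
import pandas as pd

# Illustrative frame: values copied from the log, one Sector missing
calls = pd.DataFrame({
    "CAD CDW ID": [1702311, 1702312, 1702543],
    "Sector": ["R", "G", np.nan],
    "Total Officers": [2, 2, 1],
})

# Count missing values in every column
missing_per_column = calls.isna().sum()

# Select only the rows where Sector is missing
rows_missing_sector = calls[calls["Sector"].isna()]

print(missing_per_column)
print(rows_missing_sector)
```

Counting missing values per column first shows which columns need attention before filtering for the individual rows.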



**Quality Issues:**

#### `Dataframe` Table:
- Check for mislabelled, misspelled, or missing data
- Address Column: XX needs to be replaced with 00.
    

### Tidiness Issues
Issues with the structure of the data
- Census Tract

#### `Dataframe` Table:
- Parse the datetime information into separate columns
- Drop columns that are not needed & rearrange column order for an easier read
- Combine each dog stage column into a single column named "stage"
- `tweet_id` column needs to be converted from a number to string value
- `Date and Time` columns need to be converted to datetime objects
- `Rating` columns need to be converted to float values
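
The datetime-parsing item in the list above could be sketched as follows, using the `.dt` accessor on the `Timestamp` column. The derived column names (`Date`, `Hour`, `Day of Week`) are assumptions for illustration, and the two rows are taken from the log preview.

```python
import pandas as pd

# Two rows from the log preview, with Timestamp as a datetime column
calls = pd.DataFrame({
    "Event": ["DISTURBANCES", "TRAFFIC RELATED CALLS"],
    "Timestamp": pd.to_datetime(["2016-03-26 17:17:00", "2016-03-26 17:15:00"]),
})

# Split the timestamp into separate columns (column names are assumptions)
calls["Date"] = calls["Timestamp"].dt.date
calls["Hour"] = calls["Timestamp"].dt.hour
calls["Day of Week"] = calls["Timestamp"].dt.day_name()

print(calls)
```

Separate hour and weekday columns make it straightforward to group calls by time of day or day of the week when building the summary charts.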

### Cleaning data