# 911 Calls

Emergency (911) Calls: Fire, Traffic, EMS for Montgomery County, PA


Acknowledgements: Data provided by montcoalert.org
 
The data contains the following fields:

* lat: String variable, Latitude
* lng: String variable, Longitude
* desc: String variable, Description of the Emergency Call
* zip: String variable, Zipcode
* title: String variable, Title
* timeStamp: String variable, YYYY-MM-DD HH:MM:SS
* twp: String variable, Township
* addr: String variable, Address
* e: String variable, Dummy variable (always 1)


## Data and Setup

Importing numpy and pandas for data exploration

In [None]:
import numpy as np
import pandas as pd

Importing visualization libraries

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

sns.set(context = 'paper', style= "whitegrid", font_scale=2)

In [None]:
from plotly import __version__
import cufflinks as cf
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
init_notebook_mode(connected=True)

cf.go_offline()

Reading the csv file as a dataframe called df

In [None]:
df = pd.read_csv('../input/911.csv')

Checking the info() of the df

In [None]:
df.info()

Checking the head of df


In [None]:
df.head()

## Basic Data Exploration

Top 5 zipcodes for 911 calls

In [None]:
df['zip'].value_counts().head()

The top 5 townships (twp) for 911 calls

In [None]:
df['twp'].value_counts().head()
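As a small aside, `value_counts` can also report shares instead of raw counts via `normalize=True`. The sketch below uses a made-up `toy` frame (the rows are invented for illustration and are not taken from the real file) to show both forms side by side.

```python
import pandas as pd

# Hypothetical toy data standing in for df['twp']; the rows are invented.
toy = pd.DataFrame({'twp': ['LOWER MERION', 'ABINGTON', 'LOWER MERION',
                            'NORRISTOWN', 'LOWER MERION', 'ABINGTON']})

counts = toy['twp'].value_counts()                # raw counts, most frequent first
shares = toy['twp'].value_counts(normalize=True)  # fraction of all rows

print(counts.head())
print(shares.head())
```

The same call on `df['zip']` would give the share of calls per zipcode.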

Townships with Most 911 Calls


In [None]:
plt.figure(figsize=(14,8))
df['twp'].value_counts().head(10).plot.bar(color = 'blue')
plt.xlabel('Townships', labelpad = 20)
plt.ylabel('Number of Calls')
plt.title('Townships with Most 911 Calls')

Number of unique title codes.

In [None]:
df['title'].nunique()

## Creating new features

In the title column, each entry starts with a "Reason/Department" before the title code. These are EMS, Fire, and Traffic. Creating a new column called "Reason" that contains this string value


In [None]:
df['Reason'] = df['title'].apply(lambda x: x.split(':')[0])
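The same extraction can also be written with pandas' vectorized string methods instead of `apply`. This is a minimal sketch on a few illustrative titles (assumed sample values in the "Reason: code" shape described above), not a replacement for the cell above.

```python
import pandas as pd

# Illustrative titles in the "Reason: code" shape (assumed sample values).
titles = pd.Series(['EMS: BACK PAINS/INJURY',
                    'Fire: GAS-ODOR/LEAK',
                    'Traffic: VEHICLE ACCIDENT -'])

# Split on the first colon only and keep the part before it.
reason = titles.str.split(':', n=1).str[0]
print(reason.tolist())
```

Unlike the `lambda` version, the `.str` methods pass missing values through as NaN rather than raising an error.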

Most common Reason for a 911 call based on the new column

In [None]:
df['Reason'].value_counts().head()

## Visualization

Categorization of 911 calls into EMS, Fire and Traffic

In [None]:
plt.figure(figsize=(14,8))
sns.countplot(x='Reason', data=df, palette='rainbow')

Checking data type of timestamp

In [None]:
type(df['timeStamp'][0])

Converting the timestamp to datetime object

In [None]:
df['timeStamp'] = pd.to_datetime(df['timeStamp'])
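Since the field list gives the layout as YYYY-MM-DD HH:MM:SS, the conversion can optionally be given an explicit format. The sketch below uses made-up strings (the `raw` series is hypothetical) and shows how `errors='coerce'` handles a value that does not parse.

```python
import pandas as pd

# Made-up strings in the YYYY-MM-DD HH:MM:SS layout from the field description.
raw = pd.Series(['2015-12-10 17:40:00', '2016-01-01 00:05:30', 'not a timestamp'])

# An explicit format skips format inference; errors='coerce' turns
# unparseable entries into NaT instead of raising an exception.
parsed = pd.to_datetime(raw, format='%Y-%m-%d %H:%M:%S', errors='coerce')

print(parsed.dtype)
print(parsed.isna().sum())  # number of entries that failed to parse
```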


Creating 4 new columns called Hour, Month, Day of Week, and Year.

In [None]:
# Starting the hour value from 1 instead of 0 (1-24 rather than 0-23)
df['Hour'] = df['timeStamp'].dt.hour + 1

df['Month'] = df['timeStamp'].dt.month

# Mapping the actual string names to the day of the week (0 = Monday)
df['Day of Week'] = df['timeStamp'].dt.dayofweek.map({0:'Mon',1:'Tue',2:'Wed',3:'Thu',
                                                      4:'Fri',5:'Sat',6:'Sun'})

df['Year'] = df['timeStamp'].dt.year


**Count of calls per day of the week, split by Reason.**

In [None]:
plt.figure(figsize=(14,8))
sns.countplot(x='Day of Week', data=df, hue='Reason', palette='viridis')
plt.title('Count of the calls')
plt.legend(loc = 'center right', bbox_to_anchor=(1.2,0.5) )

**Count of calls per month, split by Reason.**

In [None]:
plt.figure(figsize=(14,8))
sns.countplot(x='Month', data=df, hue='Reason', palette='viridis')
plt.title('Count of the calls in months')
plt.legend(loc = 'center right', bbox_to_anchor=(1.2,0.5) )

**Creating different visualization on basis of aggregation**

Using plotly for interactive plots


Grouping the data based on month and counting 

In [None]:
byMonth = df.groupby('Month').count()
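One detail worth keeping in mind: `count()` reports the number of non-null values in each column, so `byMonth['twp']` equals the number of calls only if `twp` has no missing entries. The sketch below uses a hypothetical three-row frame (invented rows, with one township deliberately left missing) to contrast `count()` with `size()`, which counts rows.

```python
import numpy as np
import pandas as pd

# Hypothetical rows: two calls in month 1 (one with a missing township), one in month 2.
toy = pd.DataFrame({'Month': [1, 1, 2],
                    'twp': ['ABINGTON', np.nan, 'NORRISTOWN']})

non_null_twp = toy.groupby('Month').count()['twp']  # non-null 'twp' values per month
rows_per_month = toy.groupby('Month').size()        # all rows per month

print(non_null_twp.to_dict())
print(rows_per_month.to_dict())
```

If the two results differ on the real data, `size()` is probably the safer basis for a "calls per month" plot.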

Simple plot of the dataframe byMonth showing the count of calls per month

In [None]:
byMonth['twp'].iplot(title =" Calls per month", xTitle='Month', yTitle='Calls')

**Creating a new column 'Date' by extracting date data from timestamp**

In [None]:
df['Date'] = df['timeStamp'].apply(lambda x: x.date() )
df.head()

**Creating a plot of counts of 911 calls per date.**

In [None]:
df.groupby('Date').count()['twp'].iplot(title ="Calls", xTitle='Date', yTitle='Calls')


**Now we will create 3 separate plots of daily 911 call counts, one for each reason for the call.**

Based on Traffic

In [None]:
df[df['Reason']=='Traffic'].groupby('Date').count()['twp'].iplot(title ="Traffic", xTitle='Date', yTitle='Calls')


Based on EMS

In [None]:
df[df['Reason']=='EMS'].groupby('Date').count()['twp'].iplot(title ="EMS", xTitle='Date', yTitle='Calls')

Based on Fire

In [None]:
df[df['Reason']=='Fire'].groupby('Date').count()['twp'].iplot(title ="Fire", xTitle='Date', yTitle='Calls')
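The three filtered groupbys above can also be expressed as a single table with one column per reason. This is a minimal sketch on a hypothetical `toy` frame (dates and reasons are made up); the name `daily_by_reason` is introduced here only for illustration.

```python
from datetime import date

import pandas as pd

# Toy stand-in for df with just the two columns used here (values are made up).
toy = pd.DataFrame({
    'Date': [date(2016, 1, 1), date(2016, 1, 1), date(2016, 1, 1), date(2016, 1, 2)],
    'Reason': ['EMS', 'EMS', 'Traffic', 'Fire'],
})

# Rows are dates, columns are reasons, cells are call counts (0 where none occurred).
daily_by_reason = toy.groupby(['Date', 'Reason']).size().unstack(fill_value=0)
print(daily_by_reason)
```

Each column of such a table is one of the per-reason series plotted above, so the three reasons could be compared on a single set of axes.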

**Now we are creating a plot of counts of 911 calls based on Hour.**

In [None]:
df.groupby('Hour').count()['twp'].iplot(title ='Call by hour - All year', xTitle='Hour', yTitle='Calls')

**Now we will create 4 separate plots of 911 call counts by hour, one for each year.**

Year 2015

In [None]:
df[df['Year']==2015].groupby('Hour').count()['twp'].iplot(title ='Call by hour - 2015', xTitle='Hour', yTitle='Calls')

Year 2016

In [None]:
df[df['Year']==2016].groupby('Hour').count()['twp'].iplot(title ='Call by hour - 2016', xTitle='Hour', yTitle='Calls')

Year 2017

In [None]:
df[df['Year']==2017].groupby('Hour').count()['twp'].iplot(title ='Call by hour - 2017', xTitle='Hour', yTitle='Calls')

Year 2018

In [None]:
df[df['Year']==2018].groupby('Hour').count()['twp'].iplot(title ='Call by hour - 2018', xTitle='Hour', yTitle='Calls')
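The four per-year cells follow the same pattern, so the counts could also be collected into one Hour by Year table with `pd.crosstab`. The sketch below assumes nothing beyond the `Hour` and `Year` columns created earlier and uses made-up values in a hypothetical `toy` frame.

```python
import pandas as pd

# Made-up Hour/Year pairs standing in for the columns created earlier.
toy = pd.DataFrame({'Hour': [1, 1, 9, 9, 9, 18],
                    'Year': [2015, 2016, 2016, 2016, 2017, 2018]})

# Rows are hours, columns are years, cells are call counts (0 where none occurred).
calls_by_hour_year = pd.crosstab(toy['Hour'], toy['Year'])
print(calls_by_hour_year)
```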

## Heat Maps

We'll first restructure the dataframe so that the columns become the Hours and the Index becomes the Day of the Week. 

In [None]:
dayHour = df.groupby(by=['Day of Week','Hour']).count()['Reason'].unstack()
dayHour.head()
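Because `groupby` sorts string labels alphabetically, the rows of a table like this come out as Fri, Mon, Sat, and so on rather than in calendar order. A minimal sketch of one way to reorder them with `reindex`, using a made-up `toy` frame and the hypothetical names `day_hour` and `calendar_order`:

```python
import pandas as pd

# Toy stand-in using the same day labels as the mapping above (values are made up).
toy = pd.DataFrame({'Day of Week': ['Mon', 'Fri', 'Sun', 'Mon', 'Wed', 'Fri'],
                    'Hour': [1, 1, 2, 2, 1, 1],
                    'Reason': ['EMS', 'Fire', 'EMS', 'Traffic', 'EMS', 'EMS']})

day_hour = toy.groupby(['Day of Week', 'Hour']).count()['Reason'].unstack()

# groupby sorts the labels alphabetically (Fri, Mon, Sun, Wed here);
# reindex puts the rows in calendar order and leaves absent days as NaN rows.
calendar_order = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']
day_hour_ordered = day_hour.reindex(calendar_order)
print(day_hour_ordered)
```

Applied to `dayHour`, the same `reindex` call would make the heat map rows read Monday through Sunday from top to bottom.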

Creating a HeatMap using this new DataFrame.


In [None]:
plt.figure(figsize=(15,10))
sns.heatmap(dayHour, cmap = 'viridis', linewidths=.1)


Now creating a clustermap using this DataFrame.


In [None]:
# clustermap creates its own figure, so the size is passed through figsize
sns.clustermap(dayHour, cmap = 'viridis', linewidths=.1, figsize=(14,8))


Next, we'll restructure the dataframe so that the columns become the Months and the Index becomes the Day of the Week.

In [None]:
dayMonth = df.groupby(by=['Day of Week','Month']).count()['Reason'].unstack()
dayMonth.head()

Creating a HeatMap using this new DataFrame.

In [None]:
plt.figure(figsize=(14,8))
sns.heatmap(dayMonth, cmap = 'viridis', linewidths=.1)


Creating a ClusterMap using this new DataFrame.


In [None]:
# clustermap creates its own figure, so the size is passed through figsize
sns.clustermap(dayMonth, cmap = 'viridis', linewidths=.1, figsize=(14,8))


My first attempt. Please comment