# Exploratory Data Analysis on the Emergency - 911 Calls Dataset

In this project we will be analyzing some 911 call data from [Kaggle](https://www.kaggle.com/mchirico/montcoalert). The data contains the following fields:

* lat: Float variable, Latitude
* lng: Float variable, Longitude
* desc: String variable, Description of the Emergency Call
* zip: Numeric variable, Zipcode
* title: String variable, Title
* timeStamp: String variable, YYYY-MM-DD HH:MM:SS
* twp: String variable, Township
* addr: String variable, Address
* e: Integer variable, Dummy variable

### Importing packages

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

### Loading the dataset

In [None]:
data=pd.read_csv("911.csv")

### Basic data exploration

In [None]:
data.info()

In [None]:
data.describe()

In [None]:
data.head()

### Top 5 zipcodes for 911 calls

In [None]:
data['zip'].value_counts().head(5)

### Top 5 townships (twp) for 911 calls

In [None]:
data['twp'].value_counts().head(5)

### Number of unique title codes

In [None]:
data['title'].nunique()

### Creating new features

In the title column, a reason or department is specified before the title code.
For example, if the title column value is "Fire: GAS-ODOR/LEAK", the reason column value is "Fire".

In [None]:
data['reason']=data['title'].apply(lambda x: x.split(':')[0])
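The same prefix extraction can also be written with pandas' vectorized string methods instead of `apply`. The sketch below runs on a few illustrative titles (only the Fire title comes from the text above; the other two are assumed examples of the same shape), so it can be tried without the CSV.

```python
import pandas as pd

# Illustrative titles in the "<Reason>: <code>" shape described above
titles = pd.Series([
    "EMS: BACK PAINS/INJURY",
    "Fire: GAS-ODOR/LEAK",
    "Traffic: VEHICLE ACCIDENT -",
])

# Split each title once on the first colon and keep the part before it
reasons = titles.str.split(":", n=1).str[0]
print(reasons.tolist())
```

On the real frame this would read `data['title'].str.split(':', n=1).str[0]`, assuming every title contains a colon.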

### Most common Reason for a 911 call 

In [None]:
data['reason'].value_counts()

In [None]:
sns.countplot(x='reason',data=data)

#### EMS (Emergency Medical Services) was the most common reason for 911 calls, followed by Traffic and Fire

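### Converting "timeStamp" from string to datetime to extract more information

In [None]:
data['timeStamp']=pd.to_datetime(data['timeStamp'])

In [None]:
# Month and Day of Week are required by the countplots that follow
data['Month']=data['timeStamp'].dt.month
dmap = {0:'Mon',1:'Tue',2:'Wed',3:'Thu',4:'Fri',5:'Sat',6:'Sun'}
data['Day of Week']=data['timeStamp'].dt.dayofweek.map(dmap)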

### Creating countplots of call reasons by Day of Week and Month

In [None]:
sns.countplot(x='Day of Week',data=data,hue='reason',palette='viridis')
plt.legend(bbox_to_anchor=(1.05,1),loc=2,borderaxespad=0)

In [None]:
sns.countplot(x='Month',data=data,hue='reason',palette='viridis')
plt.legend(bbox_to_anchor=(1.05,1),loc=2,borderaxespad=0)

In [None]:
bymonth=data.groupby('Month').count()

In [None]:
bymonth.head()
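`groupby('Month').count()` reports, for each column, the number of non-null values per month, so any column with missing values shows a smaller count than a fully populated one. A minimal sketch on a made-up frame (all values here are hypothetical) contrasting `count()` with `size()`:

```python
import numpy as np
import pandas as pd

# Hypothetical mini-frame: one January row has a missing township
calls = pd.DataFrame({
    "Month": [1, 1, 2, 2, 2],
    "lat": [40.1, 40.2, 40.3, 40.4, 40.5],
    "twp": ["A", np.nan, "B", "B", "C"],
})

per_column = calls.groupby("Month").count()  # non-null values per column
per_row = calls.groupby("Month").size()      # rows per group, NaN included

print(per_column)
print(per_row)
```

This is why the choice of column (`lat`, `twp`, ...) after `count()` can matter; `size()` avoids the question when only the number of calls is needed.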

### Plotting the number of calls per month and creating a linear fit

In [None]:
bymonth['lat'].plot()

In [None]:
# No hue is used here, so there are no labeled artists and no legend to draw
sns.countplot(x='Month',data=data,palette='viridis')

In [None]:
sns.lmplot(x='Month',y='twp',data=bymonth.reset_index())
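`lmplot` draws the regression line but does not return its coefficients. If the slope and intercept are wanted as numbers, one option is an ordinary least-squares fit with `numpy.polyfit`. The sketch below uses made-up monthly counts purely for illustration; with the real data, `months` and `counts` would presumably come from `bymonth.index` and `bymonth['twp']`.

```python
import numpy as np

# Made-up monthly call counts, used only to illustrate the fit
months = np.array([1, 2, 3, 4, 5, 6])
counts = np.array([120, 118, 115, 113, 110, 108])

# Degree-1 least-squares fit: counts is modeled as slope * month + intercept
slope, intercept = np.polyfit(months, counts, deg=1)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

A negative slope indicates that the monthly call count falls over the period covered.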

In [None]:
data['Date']=data['timeStamp'].dt.date

### Plotting daily counts of 911 calls

In [None]:
plt.figure(figsize=(12,6))
data.groupby('Date').count()['lat'].plot()
plt.tight_layout()

### Plotting daily counts of 911 calls (reason: Traffic)

In [None]:
plt.figure(figsize=(12,6))
data[data['reason']=='Traffic'].groupby('Date').count()['lat'].plot()
plt.title('Traffic')
plt.tight_layout()

### Plotting daily counts of 911 calls (reason: EMS)

In [None]:
plt.figure(figsize=(12,6))
data[data['reason']=='EMS'].groupby('Date').count()['lat'].plot()
plt.title('EMS')
plt.tight_layout()

### Plotting daily counts of 911 calls (reason: Fire)

In [None]:
plt.figure(figsize=(12,6))
data[data['reason']=='Fire'].groupby('Date').count()['lat'].plot()
plt.title('Fire')
plt.tight_layout()
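Grouping by `Date` only yields the dates that actually appear in the filtered data, so a day with no calls for a given reason is skipped and the line plot connects its neighbours directly. A possible alternative, sketched here on hypothetical timestamps, is `resample('D')`, which emits every calendar day and reports 0 for the empty ones.

```python
import pandas as pd

# Hypothetical timestamps: no call at all on 2016-01-02
stamps = pd.to_datetime([
    "2016-01-01 08:00:00",
    "2016-01-01 17:30:00",
    "2016-01-03 09:15:00",
])
calls = pd.DataFrame({"timeStamp": stamps, "reason": "Fire"})

# groupby on the calendar date only returns dates that occur in the data
by_date = calls.groupby(calls["timeStamp"].dt.date).size()

# resample("D") emits every calendar day in the range, with 0 for empty days
daily = calls.set_index("timeStamp").resample("D").size()

print(by_date)
print(daily)
```

Whether the gaps matter depends on the reason being plotted; for high-volume categories every day is likely to be present anyway.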

In [None]:
# Overlay all three reasons; label each line so the legend can tell them apart
plt.figure(figsize=(12,6))
for reason in ['Fire','EMS','Traffic']:
    data[data['reason']==reason].groupby('Date').count()['lat'].plot(label=reason)
plt.legend()
plt.tight_layout()
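The per-reason daily counts can also be built in one step as a table with one row per date and one column per reason. The following is a sketch using `pd.crosstab` on a small hypothetical frame; the column names `Date` and `reason` mirror the ones used in this notebook, and the values are invented.

```python
import pandas as pd

# Hypothetical calls spread over two dates
calls = pd.DataFrame({
    "Date": pd.to_datetime([
        "2016-01-01", "2016-01-01", "2016-01-01",
        "2016-01-02", "2016-01-02",
    ]).date,
    "reason": ["EMS", "EMS", "Fire", "Traffic", "EMS"],
})

# Rows are dates, columns are reasons, cells are call counts (0 if none)
daily_by_reason = pd.crosstab(calls["Date"], calls["reason"])
print(daily_by_reason)
```

Calling `daily_by_reason.plot(figsize=(12,6))` on such a table draws one labeled line per reason.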

## HeatMaps

In [None]:
type(data['timeStamp'].iloc[0])

In [None]:
data['timeStamp']=pd.to_datetime(data['timeStamp'])

In [None]:
# Extract hour, month, and weekday number with the vectorized .dt accessor
data['Hour']=data['timeStamp'].dt.hour
data['Month']=data['timeStamp'].dt.month
data['Day of Week']=data['timeStamp'].dt.dayofweek

In [None]:
# Map weekday numbers (0 = Monday) to short names
dmap = {0:'Mon',1:'Tue',2:'Wed',3:'Thu',4:'Fri',5:'Sat',6:'Sun'}
data['Day of Week']=data['Day of Week'].map(dmap)
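To check the feature extraction in isolation, the sketch below applies the same accessors to three hypothetical timestamps (a Monday, a Saturday, and a Sunday in January 2016). It also shows `dt.day_name()` as a possible alternative to the manual `dmap` lookup; that alternative is a suggestion, not part of the original notebook.

```python
import pandas as pd

# Hypothetical timestamps: a Monday, a Saturday, and a Sunday
stamps = pd.Series(pd.to_datetime([
    "2016-01-04 10:00:00",
    "2016-01-09 23:45:00",
    "2016-01-10 00:05:00",
]))

hours = stamps.dt.hour
day_numbers = stamps.dt.dayofweek         # 0 = Monday ... 6 = Sunday
day_names = stamps.dt.day_name().str[:3]  # first three letters of the English day name

print(hours.tolist(), day_numbers.tolist(), day_names.tolist())
```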