# Exploratory Data Analysis (EDA) in Python

***Emergency 911 Calls - Montgomery County, PA***


911 emergency call dataset

The data contains the following fields:

    lat: String variable, Latitude
    lng: String variable, Longitude
    desc: String variable, Description of the Emergency Call
    zip: String variable, Zipcode
    title: String variable, Title
    timeStamp: String variable, YYYY-MM-DD HH:MM:SS
    twp: String variable, Township
    addr: String variable, Address
    e: String variable, Dummy variable (always 1)



In [None]:
import numpy as np
import pandas as pd

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style('whitegrid')
%matplotlib inline

In [None]:
df = pd.read_csv('../input/montcoalert/911.csv')

In [None]:
df.info()

In [None]:
df.head(3)

**Top 5 zip codes and townships for 911 calls**

In [None]:
df['zip'].value_counts().head(5)

In [None]:
df['twp'].value_counts().head(5)

In [None]:
df['title'].nunique()

In [None]:
df['Reason'] = df['title'].apply(lambda title: title.split(':')[0])
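
The split above relies on each `title` having a `Category: Detail` shape, so the text before the first colon is the Reason. As a minimal sketch on a few made-up titles (the sample values are illustrative and not read from the CSV), the same extraction can be written with the vectorized string accessor instead of `apply`:

```python
import pandas as pd

# Hypothetical sample titles in the "Category: Detail" shape
titles = pd.Series([
    "EMS: BACK PAINS/INJURY",
    "Fire: GAS-ODOR/LEAK",
    "Traffic: VEHICLE ACCIDENT",
])

# Split once on the first colon and keep the part before it
reasons = titles.str.split(":", n=1).str[0]
print(reasons.tolist())
```

Limiting the split with `n=1` keeps the result stable even if the detail part itself contains a colon.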

**Most common Reason for a 911 call, based on the new column**

In [None]:
df['Reason'].value_counts()

**Countplot of 911 calls by Reason**

In [None]:
sns.countplot(x='Reason',data=df)

**Converting the timeStamp column from strings to DateTime objects using [pd.to_datetime](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.to_datetime.html)**

In [None]:
df['timeStamp'] = pd.to_datetime(df['timeStamp'])
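
Since the field list gives the layout as `YYYY-MM-DD HH:MM:SS`, the format can also be passed explicitly, which makes parsing stricter and often faster. A small sketch on made-up timestamps (the two sample strings are assumptions, not rows from the dataset):

```python
import pandas as pd

# Made-up timestamps in the documented YYYY-MM-DD HH:MM:SS layout
raw = pd.Series(["2015-12-10 17:40:00", "2016-01-03 08:05:30"])

# An explicit format raises an error on values that do not match the layout
parsed = pd.to_datetime(raw, format="%Y-%m-%d %H:%M:%S")
print(parsed.dtype)
print(parsed.dt.hour.tolist())
```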

In [None]:
# Derive calendar features with the vectorized .dt accessor
df['Hour'] = df['timeStamp'].dt.hour
df['Month'] = df['timeStamp'].dt.month
df['Day of Week'] = df['timeStamp'].dt.dayofweek  # 0 = Monday, 6 = Sunday

In [None]:
dmap = {0:'Mon',1:'Tue',2:'Wed',3:'Thu',4:'Fri',5:'Sat',6:'Sun'}

In [None]:
df['Day of Week'] = df['Day of Week'].map(dmap)
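
Because the mapped labels are plain strings, sorting and grouping treat them alphabetically (Fri, Mon, Sat, ...) rather than in calendar order. One option, sketched here on a toy Series (the `day_order` name is introduced only for illustration), is an ordered categorical:

```python
import pandas as pd

dmap = {0: 'Mon', 1: 'Tue', 2: 'Wed', 3: 'Thu', 4: 'Fri', 5: 'Sat', 6: 'Sun'}
day_order = list(dmap.values())  # illustrative helper: Mon..Sun

# Toy day-of-week integers standing in for the real column
days = pd.Series([6, 0, 4, 0, 2]).map(dmap)

# An ordered categorical sorts in calendar order, not alphabetically
days = pd.Series(pd.Categorical(days, categories=day_order, ordered=True))
sorted_days = days.sort_values().tolist()
counts = days.value_counts().sort_index()
print(sorted_days)
print(counts)
```

For plots alone, `sns.countplot` also accepts an `order=` argument, so passing `order=day_order` may be enough without changing the column's dtype.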

**Countplot of the Day of Week column, with the hue based on the Reason column**

In [None]:
sns.countplot(x='Day of Week',data=df,hue='Reason')

# To relocate the legend
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)

**Countplot of the Month column, with the hue based on the Reason column**

In [None]:
sns.countplot(x='Month',data=df,hue='Reason')

# To relocate the legend
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)

In [None]:
byMonth = df.groupby('Month').count()
byMonth.head()
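
Note that `groupby('Month').count()` returns the number of non-null values in each column, so columns with missing entries can show different totals for the same month. A small sketch with made-up rows shows how this differs from `size()`, which counts rows regardless of missing values:

```python
import numpy as np
import pandas as pd

# Made-up rows; one township is missing on purpose
toy = pd.DataFrame({
    "Month": [1, 1, 2, 2, 2],
    "twp": ["A", np.nan, "B", "C", "C"],
})

non_null_per_month = toy.groupby("Month")["twp"].count()  # skips NaN
rows_per_month = toy.groupby("Month").size()               # counts every row

print(non_null_per_month.to_dict())
print(rows_per_month.to_dict())
```

If the goal is the number of calls per month, `size()` may be the safer choice, since it does not depend on which column happens to be complete.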

**Count of calls per month**

In [None]:
byMonth['twp'].plot()

**Linear fit on the number of calls per month**

In [None]:
sns.lmplot(x='Month',y='twp',data=byMonth.reset_index())
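
`lmplot` draws the regression line but does not hand back its coefficients. If the slope itself is of interest, a least-squares line can be fitted directly. The sketch below uses `numpy.polyfit` on made-up monthly counts rather than the real `byMonth` values:

```python
import numpy as np

# Made-up monthly call counts standing in for byMonth['twp']
months = np.array([1, 2, 3, 4, 5, 6])
calls = np.array([100, 98, 96, 94, 92, 90])

# A degree-1 polynomial fit returns (slope, intercept)
slope, intercept = np.polyfit(months, calls, 1)
print(round(slope, 2), round(intercept, 2))
```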

**Calls per month where the Reason is Fire**

In [None]:
df[df['Reason']=='Fire'].groupby('Month').count()['twp'].plot()
plt.title('Fire')
plt.tight_layout()

**Calls per month where the Reason is EMS**

In [None]:
df[df['Reason']=='EMS'].groupby('Month').count()['twp'].plot()
plt.title('EMS')
plt.tight_layout()

**Calls per month where the Reason is Traffic**

In [None]:
df[df['Reason']=='Traffic'].groupby('Month').count()['twp'].plot()
plt.title('Traffic')
plt.tight_layout()

In [None]:
dayHour = df.groupby(by=['Day of Week','Hour']).count()['Reason'].unstack()
dayHour.head()
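
The `groupby(...).count()['Reason'].unstack()` chain reshapes the counts so that days become rows and hours become columns. A minimal sketch on made-up rows shows the pattern, with `pd.crosstab` as one alternative that builds the same kind of table but fills missing combinations with 0 instead of NaN:

```python
import pandas as pd

# Made-up rows standing in for the real DataFrame
toy = pd.DataFrame({
    "Day of Week": ["Mon", "Mon", "Tue", "Tue", "Tue"],
    "Hour": [8, 8, 8, 17, 17],
    "Reason": ["EMS", "Fire", "EMS", "Traffic", "EMS"],
})

# Same pattern as the notebook: count rows per (day, hour), then pivot hours into columns
wide = toy.groupby(["Day of Week", "Hour"]).count()["Reason"].unstack()

# Alternative: cross-tabulate the two columns directly
tab = pd.crosstab(toy["Day of Week"], toy["Hour"])
print(wide)
print(tab)
```

The zero-filled version can be preferable for `sns.clustermap`, which may fail on a matrix containing NaN.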

**Heatmap: Day of Week by Hour**

In [None]:
plt.figure(figsize=(12,6))
sns.heatmap(dayHour)

**Clustermap: Day of Week by Hour**

In [None]:
sns.clustermap(dayHour)

In [None]:
dayMonth = df.groupby(by=['Day of Week','Month']).count()['Reason'].unstack()
dayMonth.head()

**Heatmap: Day of Week by Month**

In [None]:
plt.figure(figsize=(12,6))
sns.heatmap(dayMonth)

**Clustermap: Day of Week by Month**

In [None]:
sns.clustermap(dayMonth)