The data contains the following fields:

* **lat:** String variable, Latitude
* **lng:** String variable, Longitude
* **desc:** String variable, Description of the Emergency Call
* **zip:** String variable, Zipcode
* **title:** String variable, Title
* **timeStamp:** String variable, YYYY-MM-DD HH:MM:SS
* **twp:** String variable, Township
* **addr:** String variable, Address
* **e:** String variable, Dummy variable (always 1)

**Importing required libraries**

In [7]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style('whitegrid')
%matplotlib inline

**Read in the csv file as a dataframe called df**

In [8]:
df = pd.read_csv("../input/911.csv")

**Check the info() of the df**

In [9]:
df.info()

**Check the head of df**

In [10]:
df.head(3)

**Now we will find out answers to some basic questions.**

**What are the top 5 zipcodes for 911 calls?**

In [11]:
df['zip'].value_counts().head(5)
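
As a minimal sketch of what `value_counts().head(n)` returns, the toy frame below uses made-up zip codes (not taken from 911.csv). `value_counts()` sorts by frequency in descending order and skips missing values by default, so `head()` keeps the most common entries.

```python
import pandas as pd

# Hypothetical toy data; the notebook itself reads 911.csv instead.
toy = pd.DataFrame({'zip': [19401, 19401, 19464, 19401, 19464, 19403, None]})

# value_counts() sorts by frequency (descending) and drops NaN by default,
# so head(2) keeps the two most common zip codes.
top_zips = toy['zip'].value_counts().head(2)
print(top_zips)
```

The same pattern applies to any categorical column, such as `twp` in the next cell.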

**What are the top 5 townships (twp) for 911 calls?**

In [12]:
df['twp'].value_counts().head(5)

**How many unique title codes are there?**

In [13]:
df['title'].nunique()

**Creating new features**

**In the title column there are "Reasons/Departments" specified before the title code. These are EMS, Fire, and Traffic. Use .apply() with a custom lambda expression to create a new column called "Reason" that contains this string value.**

**For example, if the title column value is EMS: BACK PAINS/INJURY , the Reason column value would be EMS.**

In [14]:
df['Reason'] = df['title'].apply(lambda title: title.split(':')[0])
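
The split can be checked on a few hand-written titles before running it on the full column. The sketch below uses a hypothetical three-row frame and also shows a vectorized variant with the `.str` accessor, which should give the same result here.

```python
import pandas as pd

# Hypothetical titles in the same "Reason: detail" shape as the title column.
toy = pd.DataFrame({'title': ['EMS: BACK PAINS/INJURY',
                              'Fire: GAS-ODOR/LEAK',
                              'Traffic: VEHICLE ACCIDENT -']})

# Same logic as the lambda above: keep the text before the first colon.
toy['Reason'] = toy['title'].apply(lambda title: title.split(':')[0])

# Vectorized equivalent using the .str accessor.
reason_vec = toy['title'].str.split(':').str[0]
print(toy['Reason'].tolist())
```

One difference worth knowing: the `.str` version propagates missing values as NaN, while the lambda would raise an error on a NaN title.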

**What is the most common Reason for a 911 call based on this new column?**

In [15]:
df['Reason'].value_counts()

**Now use seaborn to create a countplot of 911 calls by Reason.**

In [16]:
sns.countplot(x='Reason',data=df,palette='viridis')

**Checking the data type of the objects in the timeStamp column.**

In [17]:
type(df['timeStamp'].iloc[0])

**Converting the column from strings to DateTime objects.**

In [18]:
df['timeStamp'] = pd.to_datetime(df['timeStamp'])

**Creating 3 new columns called Hour, Month, and Day of Week based off of the timeStamp column.**

In [19]:
df['Hour'] = df['timeStamp'].apply(lambda time: time.hour)
df['Month'] = df['timeStamp'].apply(lambda time: time.month)
df['Day of Week'] = df['timeStamp'].apply(lambda time: time.dayofweek)
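
A small sketch on made-up timestamps shows what these three columns hold. It uses the `.dt` accessor, which returns the same values as reading the attribute from each Timestamp one by one; the toy frame and its values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical timestamps in the YYYY-MM-DD HH:MM:SS format described above.
toy = pd.DataFrame({'timeStamp': ['2015-12-10 17:40:00', '2016-01-03 08:05:30']})
toy['timeStamp'] = pd.to_datetime(toy['timeStamp'])

# The .dt accessor exposes datetime components for the whole column at once.
toy['Hour'] = toy['timeStamp'].dt.hour
toy['Month'] = toy['timeStamp'].dt.month
toy['Day of Week'] = toy['timeStamp'].dt.dayofweek  # Monday=0 ... Sunday=6
print(toy)
```

Note that `dayofweek` is an integer with Monday as 0, which is why a mapping to day names is needed afterwards.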

**Creating a dictionary and using the .map() function with this dictionary to map the actual string names to the day of the week.**

In [20]:
dmap = {0:'Mon',1:'Tue',2:'Wed',3:'Thu',4:'Fri',5:'Sat',6:'Sun'}

In [21]:
df['Day of Week'] = df['Day of Week'].map(dmap)
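
To illustrate the mapping step, the sketch below applies the same dictionary to a short, made-up Series of weekday integers. It also shows one optional extra that is not part of the original notebook: wrapping the names in an ordered categorical so that sorting follows Mon to Sun rather than alphabetical order.

```python
import pandas as pd

dmap = {0: 'Mon', 1: 'Tue', 2: 'Wed', 3: 'Thu', 4: 'Fri', 5: 'Sat', 6: 'Sun'}

# Hypothetical integer weekdays, as produced by Timestamp.dayofweek.
days = pd.Series([3, 6, 0, 3])

# .map() looks each value up in the dictionary; unmatched values become NaN.
named = days.map(dmap)

# Optional: an ordered categorical keeps Mon..Sun order when sorting.
ordered = pd.Categorical(named, categories=list(dmap.values()), ordered=True)
sorted_days = pd.Series(ordered).sort_values().tolist()
print(named.tolist())
print(sorted_days)
```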

**Using seaborn to create a countplot of the Day of Week column with the hue based on the Reason column.**

In [22]:
sns.countplot(x='Day of Week',data=df,hue='Reason',palette='viridis')
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)

**Doing the same as above for the Month.**

In [23]:
sns.countplot(x='Month',data=df,hue='Reason',palette='viridis')
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)

In [24]:
byMonth = df.groupby('Month').count()
byMonth.head()
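
It may help to see what `groupby(...).count()` actually counts. In the sketch below (a hypothetical five-row frame with one missing township), `count()` reports non-null values per column, so columns with gaps can show smaller numbers than the number of rows; `size()` counts rows regardless of missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical rows; one township is missing on purpose.
toy = pd.DataFrame({'Month': [1, 1, 2, 2, 2],
                    'twp': ['A', np.nan, 'B', 'C', 'B'],
                    'e': [1, 1, 1, 1, 1]})

by_month = toy.groupby('Month').count()  # non-null values per column
calls = toy.groupby('Month').size()      # rows per group, NaN or not
print(by_month)
print(calls)
```

Because of this, the column picked for plotting (here `twp`) determines whether calls with a missing value in that column are included in the count.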

**Creating a simple line plot from the dataframe showing the count of calls per month.**

In [25]:
byMonth['twp'].plot()

**Using seaborn's lmplot() to create a linear fit on the number of calls per month.**

In [26]:
sns.lmplot(x='Month',y='twp',data=byMonth.reset_index())

**Creating a new column called 'Date' that contains the date from the timeStamp column.**

In [27]:
df['Date']=df['timeStamp'].apply(lambda t: t.date())

**Now group by this Date column with the count() aggregate and plot the number of 911 calls per day.**

In [28]:
df.groupby('Date').count()['twp'].plot()
plt.tight_layout()
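
The daily aggregation can be verified without plotting. The following sketch, on three made-up calls around midnight, derives the Date column with `.dt.date` (which yields the same `datetime.date` objects as calling `.date()` on each Timestamp) and counts calls per day.

```python
import pandas as pd

# Hypothetical calls: two on 10 December, one just after midnight.
toy = pd.DataFrame({'timeStamp': pd.to_datetime(['2015-12-10 17:40:00',
                                                 '2015-12-10 23:59:59',
                                                 '2015-12-11 00:00:01']),
                    'twp': ['A', 'B', 'A']})

# Strip the time component so that calls group by calendar day.
toy['Date'] = toy['timeStamp'].dt.date

daily = toy.groupby('Date').count()['twp']
print(daily)
```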

**Recreating this plot as 3 separate plots, each representing a Reason for the 911 call.**

In [29]:
df[df['Reason']=='Traffic'].groupby('Date').count()['twp'].plot()
plt.title('Traffic')
plt.tight_layout()

In [30]:
df[df['Reason']=='Fire'].groupby('Date').count()['twp'].plot()
plt.title('Fire')
plt.tight_layout()

In [31]:
df[df['Reason']=='EMS'].groupby('Date').count()['twp'].plot()
plt.title('EMS')
plt.tight_layout()

**Restructure the dataframe so that the columns become the Hours and the index becomes the Day of the Week, using the unstack() method, in preparation for heatmaps.**

In [32]:
dayHour = df.groupby(by=['Day of Week','Hour']).count()['Reason'].unstack()
dayHour.head()
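
The reshaping is easier to follow on a tiny example. In this sketch (a hypothetical five-row frame), the group-by produces a Series indexed by (Day of Week, Hour), and `unstack()` moves the inner index level, Hour, into the columns. Day and hour combinations that never occur come out as NaN; passing `fill_value=0` is one way to handle that and is an addition here, not something the original notebook does.

```python
import pandas as pd

# Hypothetical calls with the columns used in the cell above.
toy = pd.DataFrame({'Day of Week': ['Mon', 'Mon', 'Mon', 'Tue', 'Tue'],
                    'Hour': [8, 8, 17, 8, 9],
                    'Reason': ['EMS', 'Fire', 'EMS', 'Traffic', 'EMS']})

# Series with a two-level index: (Day of Week, Hour) -> number of calls.
counts = toy.groupby(['Day of Week', 'Hour']).count()['Reason']

# unstack() pivots the inner level (Hour) into columns.
wide = counts.unstack()

# Missing combinations are NaN by default; fill_value=0 replaces them.
wide_filled = counts.unstack(fill_value=0)
print(wide_filled)
```

With real data covering every hour of every weekday, the filled and unfilled versions would likely be identical, but sparse subsets (for example a single Reason) may contain gaps.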

**Creating a HeatMap using this new DataFrame.**

In [33]:
plt.figure(figsize=(12,6))
sns.heatmap(dayHour,cmap='viridis')

**Creating a clustermap.**

In [34]:
sns.clustermap(dayHour,cmap='viridis')

**Now repeating these same plots and operations for a DataFrame that shows the Month as the column.**

In [35]:
dayMonth = df.groupby(by=['Day of Week','Month']).count()['Reason'].unstack()
dayMonth.head()

In [36]:
plt.figure(figsize=(12,6))
sns.heatmap(dayMonth,cmap='viridis')

In [37]:
sns.clustermap(dayMonth,cmap='viridis')