## NYC 311 Complaint Calls

In this notebook we use pandas to analyze a dataset of calls to 311 (the municipal service line, not the emergency line) in the New York City area.

Download the data from [here](https://data.cityofnewyork.us/Social-Services/311-Service-Requests-from-2010-to-Present/erm2-nwe9) (Go to Export -> CSV). (**WARNING: > 16 GB of data**)

For this notebook, use the smaller version of the data available [here](https://drive.google.com/file/d/1EHYsxnN18LAKIPpZbtqjqCLi5hokG1ag/view?usp=sharing).

In [None]:
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt

plt.rcParams['figure.figsize'] = (15, 5)


First, let's load the data. It will take some time...

In [None]:
complaints = pd.read_csv('311_small.csv')

In [None]:
complaints.iloc[:,[8,31,32,34,35,36,37]].columns

In [None]:
complaints.head()

# Equivalent to:
# complaints[:5]

In [None]:
complaints['Complaint Type'].unique()

In [None]:
complaints[['Complaint Type', 'Borough']][:10]
# complaints[:10][['Complaint Type', 'Borough']]
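
The two forms above return the same rows and columns, because selecting columns and slicing rows can be done in either order. A minimal sketch on a made-up frame (the `toy` data below is invented for illustration and is not taken from the 311 dataset):

```python
import pandas as pd

# Hypothetical toy data standing in for the real 311 table
toy = pd.DataFrame({
    'Complaint Type': ['Noise - Residential', 'HEATING', 'Noise - Residential', 'Rodent'],
    'Borough': ['BROOKLYN', 'BRONX', 'QUEENS', 'BROOKLYN'],
    'Agency': ['NYPD', 'HPD', 'NYPD', 'DOHMH'],
})

# Columns first, then rows
cols_then_rows = toy[['Complaint Type', 'Borough']][:2]
# Rows first, then columns
rows_then_cols = toy[:2][['Complaint Type', 'Borough']]
same = cols_then_rows.equals(rows_then_cols)

# .loc does both in one step; note that label slices include the end label,
# so :1 selects the rows labelled 0 and 1
one_step = toy.loc[:1, ['Complaint Type', 'Borough']]
```

With the default integer index, `toy[:2]` slices by position while `toy.loc[:1]` slices by label, which is why the end values differ.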

In [None]:
complaints['Complaint Type'].value_counts()

In [None]:
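# Let's clean the data a bit by dropping the "Misc." complaint types
# WARNING: requires a lot of memory
# regex=False matches "Misc." literally (by default "." is a regex wildcard);
# na=False treats missing complaint types as non-matches
complaints = complaints[~complaints['Complaint Type'].str.contains("Misc.", regex=False, na=False)]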
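
Note that `str.contains` interprets its pattern as a regular expression by default, so the `.` in a pattern such as `"Misc."` matches any character. A small sketch of the difference, using invented values rather than real complaint types:

```python
import pandas as pd

# Hypothetical values, including a missing entry
types = pd.Series(['Misc. Categories', 'Miscellaneous', 'Noise - Residential', None])

# Default: the pattern is a regex, so "." matches the "e" in "Miscellaneous"
as_regex = types.str.contains('Misc.', na=False)

# regex=False: the pattern is matched as a literal substring
as_literal = types.str.contains('Misc.', regex=False, na=False)

# Keep only the rows that do NOT contain the literal text
kept = types[~as_literal]
```

Passing `na=False` makes missing values count as non-matches, which keeps the mask purely boolean so that `~` can negate it.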

How many complaints of each type have been filed?

In [None]:
complaint_count = complaints['Complaint Type'].value_counts()
complaint_count[:10]
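
`value_counts` returns a Series indexed by the distinct values and sorted by descending count, so slicing with `[:10]` gives the ten most frequent types. A minimal sketch on invented data, which also shows `nunique` for counting the distinct types and `normalize=True` for shares instead of raw counts:

```python
import pandas as pd

# Hypothetical complaint types
types = pd.Series(['HEATING', 'Noise - Residential', 'HEATING',
                   'Rodent', 'HEATING', 'Rodent'])

counts = types.value_counts()                 # occurrences per type, most frequent first
n_distinct = types.nunique()                  # number of distinct types
shares = types.value_counts(normalize=True)   # fraction of all rows per type
```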

In [None]:
type(complaint_count)

In [None]:
complaint_count[:10].plot(kind='bar')

What about the residential noise complaints?

In [None]:
noise_complaints = complaints[complaints['Complaint Type'] == 'Noise - Residential']
noise_complaints[:3]

In [None]:
# Boolean mask: True for each row that is a residential noise complaint
complaints['Complaint Type'] == 'Noise - Residential'

In [None]:
is_noise = complaints['Complaint Type'] == 'Noise - Residential'
in_brooklyn = complaints['Borough'] == 'BROOKLYN'
complaints[is_noise & in_brooklyn][:5]
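
Boolean masks are combined element-wise with `&` (and), `|` (or) and `~` (not); Python's `and`/`or` keywords do not work on Series. A short sketch on a made-up frame (the `toy` data is an assumption for illustration):

```python
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({
    'Complaint Type': ['Noise - Residential', 'HEATING',
                       'Noise - Residential', 'Noise - Residential'],
    'Borough': ['BROOKLYN', 'BROOKLYN', 'QUEENS', 'BROOKLYN'],
})

is_noise = toy['Complaint Type'] == 'Noise - Residential'
in_brooklyn = toy['Borough'] == 'BROOKLYN'

both = toy[is_noise & in_brooklyn]              # noise AND Brooklyn
either = toy[is_noise | in_brooklyn]            # noise OR Brooklyn
noise_elsewhere = toy[is_noise & ~in_brooklyn]  # noise, NOT in Brooklyn

# When the comparisons are written inline, each one needs parentheses,
# because & binds more tightly than ==
inline = toy[(toy['Complaint Type'] == 'Noise - Residential')
             & (toy['Borough'] == 'BROOKLYN')]
```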

In [None]:
complaints[is_noise & in_brooklyn][['Complaint Type', 'Borough', 'Created Date']][:10]

In [None]:
type(complaints['Created Date'])

In [None]:
is_noise[:3]

In [None]:
noise_complaints = complaints[is_noise]
noise_complaints['Borough'].value_counts()

In [None]:
noise_complaints_counts = noise_complaints['Borough'].value_counts()
complaints_counts = complaints['Borough'].value_counts()

In [None]:
noise_complaints_counts / complaints_counts
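
The division above works even though the two Series are sorted differently, because pandas aligns arithmetic on index labels rather than on position. A minimal sketch with invented counts:

```python
import pandas as pd

# Hypothetical counts; the two Series have different orders and different labels
noise = pd.Series({'BROOKLYN': 2, 'QUEENS': 1})
total = pd.Series({'QUEENS': 4, 'BRONX': 5, 'BROOKLYN': 8})

# Values are matched by borough name; labels missing on one side give NaN
ratio = noise / total

# To treat a missing borough as zero noise complaints instead of NaN,
# reindex the numerator onto the denominator's labels first
ratio_filled = noise.reindex(total.index, fill_value=0) / total
```

Here `ratio['BRONX']` is `NaN` because `BRONX` has no entry in `noise`, while `ratio_filled['BRONX']` is `0.0`.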

In [None]:
complaints_counts[:3]

In [None]:
# In Python 3, `/` already performs true division, so astype(float) is not strictly needed
noise_complaints_counts / complaints_counts.astype(float)

In [None]:
(noise_complaints_counts / complaints_counts.astype(float)).plot(kind='bar')
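
As an alternative sketch (not the approach used above), the same per-borough fraction can be computed in one step by grouping the boolean mask by borough and taking its mean, since the mean of a True/False column is the share of True values. The `toy` frame is invented for illustration:

```python
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({
    'Complaint Type': ['Noise - Residential', 'HEATING',
                       'Noise - Residential', 'Noise - Residential'],
    'Borough': ['BROOKLYN', 'BROOKLYN', 'QUEENS', 'BROOKLYN'],
})

is_noise = toy['Complaint Type'] == 'Noise - Residential'

# Fraction of complaints in each borough that are residential noise complaints
noise_share = is_noise.groupby(toy['Borough']).mean()

# Same numbers via the two-step value_counts division
two_step = (toy[is_noise]['Borough'].value_counts()
            / toy['Borough'].value_counts())
```

Calling `noise_share.plot(kind='bar')` would then draw the same kind of chart as the cell above.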