# NYC311 - Customer Service Requests Analysis

## DESCRIPTION

**Background of Problem Statement :**

NYC 311's mission is to provide the public with quick and easy access to all New York City government services and information while offering the best customer service. Each day, NYC311 receives thousands of requests related to several hundred types of non-emergency services, including noise complaints, plumbing issues, and illegally parked cars. These requests are received by NYC311 and forwarded to the relevant agencies such as the police, buildings, or transportation. The agency responds to the request, addresses it, and then closes it.

**Problem Objective :**

Perform a service request data analysis of New York City 311 calls. You will focus on the data wrangling techniques to understand the pattern in the data and also visualize the major complaint types.
Domain: Customer Service

**Analysis Tasks to be performed:**

(Perform a service request data analysis of New York City 311 calls) 

Import a 311 NYC service request.
Read or convert the columns ‘Created Date’ and Closed Date’ to datetime datatype and create a new column ‘Request_Closing_Time’ as the time elapsed between request creation and request closing. (Hint: Explore the package/module datetime)
Provide major insights/patterns that you can offer in a visual format (graphs or tables); at least 4 major conclusions that you can come up with after generic data mining.
Order the complaint types based on the average ‘Request_Closing_Time’, grouping them for different locations.
Perform a statistical test for the following:

In [2]:
#import library
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

In [4]:
#read csv file
df=pd.read_csv('s3://simplilearn-project/311_Service_Requests_from_2010_to_Present.csv')
df.head()

  has_raised = await self.run_ast_nodes(code_ast.body, cell_name,


Unnamed: 0,Unique Key,Created Date,Closed Date,Agency,Agency Name,Complaint Type,Descriptor,Location Type,Incident Zip,Incident Address,...,Bridge Highway Name,Bridge Highway Direction,Road Ramp,Bridge Highway Segment,Garage Lot Name,Ferry Direction,Ferry Terminal Name,Latitude,Longitude,Location
0,32310363,12/31/2015 11:59:45 PM,01-01-16 0:55,NYPD,New York City Police Department,Noise - Street/Sidewalk,Loud Music/Party,Street/Sidewalk,10034.0,71 VERMILYEA AVENUE,...,,,,,,,,40.865682,-73.923501,"(40.86568153633767, -73.92350095571744)"
1,32309934,12/31/2015 11:59:44 PM,01-01-16 1:26,NYPD,New York City Police Department,Blocked Driveway,No Access,Street/Sidewalk,11105.0,27-07 23 AVENUE,...,,,,,,,,40.775945,-73.915094,"(40.775945312321085, -73.91509393898605)"
2,32309159,12/31/2015 11:59:29 PM,01-01-16 4:51,NYPD,New York City Police Department,Blocked Driveway,No Access,Street/Sidewalk,10458.0,2897 VALENTINE AVENUE,...,,,,,,,,40.870325,-73.888525,"(40.870324522111424, -73.88852464418646)"
3,32305098,12/31/2015 11:57:46 PM,01-01-16 7:43,NYPD,New York City Police Department,Illegal Parking,Commercial Overnight Parking,Street/Sidewalk,10461.0,2940 BAISLEY AVENUE,...,,,,,,,,40.835994,-73.828379,"(40.83599404683083, -73.82837939584206)"
4,32306529,12/31/2015 11:56:58 PM,01-01-16 3:24,NYPD,New York City Police Department,Illegal Parking,Blocked Sidewalk,Street/Sidewalk,11373.0,87-14 57 ROAD,...,,,,,,,,40.73306,-73.87417,"(40.733059618956815, -73.87416975810375)"


In [5]:
#determine the number of rows and columns
df.shape

(300698, 53)

In [6]:
df.describe()

Unnamed: 0,Unique Key,Incident Zip,X Coordinate (State Plane),Y Coordinate (State Plane),School or Citywide Complaint,Vehicle Type,Taxi Company Borough,Taxi Pick Up Location,Garage Lot Name,Latitude,Longitude
count,300698.0,298083.0,297158.0,297158.0,0.0,0.0,0.0,0.0,0.0,297158.0,297158.0
mean,31300540.0,10848.888645,1004854.0,203754.534416,,,,,,40.725885,-73.92563
std,573854.7,583.182081,21753.38,29880.183529,,,,,,0.082012,0.078454
min,30279480.0,83.0,913357.0,121219.0,,,,,,40.499135,-74.254937
25%,30801180.0,10310.0,991975.2,183343.0,,,,,,40.669796,-73.972142
50%,31304360.0,11208.0,1003158.0,201110.5,,,,,,40.718661,-73.931781
75%,31784460.0,11238.0,1018372.0,224125.25,,,,,,40.78184,-73.876805
max,32310650.0,11697.0,1067173.0,271876.0,,,,,,40.912869,-73.70076


In [7]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 300698 entries, 0 to 300697
Data columns (total 53 columns):
 #   Column                          Non-Null Count   Dtype  
---  ------                          --------------   -----  
 0   Unique Key                      300698 non-null  int64  
 1   Created Date                    300698 non-null  object 
 2   Closed Date                     298534 non-null  object 
 3   Agency                          300698 non-null  object 
 4   Agency Name                     300698 non-null  object 
 5   Complaint Type                  300698 non-null  object 
 6   Descriptor                      294784 non-null  object 
 7   Location Type                   300567 non-null  object 
 8   Incident Zip                    298083 non-null  float64
 9   Incident Address                256288 non-null  object 
 10  Street Name                     256288 non-null  object 
 11  Cross Street 1                  251419 non-null  object 
 12  Cross Street 2  

In [7]:
#determine if there are missing values
df.isna().any()

Unique Key                        False
Created Date                      False
Closed Date                        True
Agency                            False
Agency Name                       False
Complaint Type                    False
Descriptor                         True
Location Type                      True
Incident Zip                       True
Incident Address                   True
Street Name                        True
Cross Street 1                     True
Cross Street 2                     True
Intersection Street 1              True
Intersection Street 2              True
Address Type                       True
City                               True
Landmark                           True
Facility Type                      True
Status                            False
Due Date                           True
Resolution Description            False
Resolution Action Updated Date     True
Community Board                   False
Borough                           False


In [9]:
df.isna().sum()

Unique Key                             0
Created Date                           0
Closed Date                         2164
Agency                                 0
Agency Name                            0
Complaint Type                         0
Descriptor                          5914
Location Type                        131
Incident Zip                        2615
Incident Address                   44410
Street Name                        44410
Cross Street 1                     49279
Cross Street 2                     49779
Intersection Street 1             256840
Intersection Street 2             257336
Address Type                        2815
City                                2614
Landmark                          300349
Facility Type                       2171
Status                                 0
Due Date                               3
Resolution Description                 0
Resolution Action Updated Date      2187
Community Board                        0
Borough         

Apply method chaining
https://towardsdatascience.com/the-unreasonable-effectiveness-of-method-chaining-in-pandas-15c2109e3c69

In [None]:
#drop columns with no data
#drop column Landmark ,School or Citywide Complaint, Vehicle Type, Taxi Company Borough, 
#Taxi Pick Up Location, Bridge Highway Name, Bridge Highway Direction
#Road Ramp Bridge Highway Segment Garage Lot Name Ferry Direction Ferry Terminal Name
drop_columns=["Landmark","School or Citywide Complaint","Vehicle Type","Taxi Company Borough"
             "Taxi Pick Up Location","Bridge Highway Name","Bridge Highway Direction",
             "Road Ramp","Bridge Highway Segment","Garage Lot Name","Ferry Direction",
             "Ferry Terminal Name"]
df_subset=df.drop(columns=drop_columns,axis=1)

In [None]:
#new shape of the dataset after dropping columns
df_subset.shape()

In [11]:
#number of records in each category, user value_counts
df_subset["Complaint Type"].value_counts()

Blocked Driveway             77044
Illegal Parking              75361
Noise - Street/Sidewalk      48612
Noise - Commercial           35577
Derelict Vehicle             17718
Noise - Vehicle              17083
Animal Abuse                  7778
Traffic                       4498
Homeless Encampment           4416
Noise - Park                  4042
Vending                       3802
Drinking                      1280
Noise - House of Worship       931
Posting Advertisement          650
Urinating in Public            592
Bike/Roller/Skate Chronic      427
Panhandling                    307
Disorderly Youth               286
Illegal Fireworks              168
Graffiti                       113
Agency Issues                    6
Squeegee                         4
Ferry Complaint                  2
Animal in a Park                 1
Name: Complaint Type, dtype: int64

In [12]:
#Descriptor	Location Type
df_subset["Descriptor"].value_counts()

Loud Music/Party                  61430
No Access                         56976
Posted Parking Sign Violation     22440
Loud Talking                      21584
Partial Access                    20068
With License Plate                17718
Blocked Hydrant                   16081
Commercial Overnight Parking      12189
Car/Truck Music                   11273
Blocked Sidewalk                  11121
Double Parked Blocking Traffic     5731
Double Parked Blocking Vehicle     4211
Engine Idling                      4189
Banging/Pounding                   4165
Neglected                          3787
Car/Truck Horn                     3511
Congestion/Gridlock                2761
In Prohibited Area                 2025
Other (complaint details)          1969
Unlicensed                         1777
Overnight Commercial Storage       1757
Unauthorized Bus Layover           1367
Truck Route Violation              1014
In Public                           932
Tortured                            854


In [14]:
df_subset["Location Type"].value_counts()

Street/Sidewalk               249299
Store/Commercial               20381
Club/Bar/Restaurant            17360
Residential Building/House      6960
Park/Playground                 4773
House of Worship                 929
Residential Building             227
Highway                          215
Parking Lot                      117
House and Store                   93
Vacant Lot                        77
Commercial                        62
Roadway Tunnel                    35
Subway Station                    34
Bridge                             2
Ferry                              1
Park                               1
Terminal                           1
Name: Location Type, dtype: int64

In [15]:
#change data types of Created Date and Closed Date
df_subset.astype({'Created Date':'datetime64[ns]','Closed Date':'datetime64[ns]'}).dtypes

Unique Key                                 int64
Created Date                      datetime64[ns]
Closed Date                       datetime64[ns]
Agency                                    object
Agency Name                               object
Complaint Type                            object
Descriptor                                object
Location Type                             object
Incident Zip                             float64
Incident Address                          object
Street Name                               object
Cross Street 1                            object
Cross Street 2                            object
Intersection Street 1                     object
Intersection Street 2                     object
Address Type                              object
City                                      object
Landmark                                  object
Facility Type                             object
Status                                    object
Due Date            

In [19]:
df_fillna=df_subset.fillna(
    value={
        'Closed Date':0
    }
)

Unnamed: 0,Created Date,Closed Date
count,300698,298534
unique,259493,237165
top,11-06-15 23:34,11-08-15 7:34
freq,9,24


In [38]:
df_fillna.agg(
    {
        "Created Date":["min","max"]
        "Closed Date":["min","max"],
    }
)

Unnamed: 0,Created Date
min,03/29/2015 01:01:25 AM
max,12/31/2015 12:59:18 PM


In [42]:
df_isna_closed_date=df_subset[df_subset["Closed Date"].isna()==True]
df_isna_closed_date.head()

Unnamed: 0,Unique Key,Created Date,Closed Date,Agency,Agency Name,Complaint Type,Descriptor,Location Type,Incident Zip,Incident Address,...,Bridge Highway Name,Bridge Highway Direction,Road Ramp,Bridge Highway Segment,Garage Lot Name,Ferry Direction,Ferry Terminal Name,Latitude,Longitude,Location
416,32305700,12/31/2015 02:16:04 PM,,NYPD,New York City Police Department,Illegal Parking,Posted Parking Sign Violation,Street/Sidewalk,,5426-5526 90TH ST,...,,,,,,,,,,
611,32309308,12/31/2015 09:58:06 AM,,NYPD,New York City Police Department,Noise - Street/Sidewalk,Loud Music/Party,Street/Sidewalk,,30 STREET,...,,,,,,,,,,
1648,32303348,12/30/2015 05:13:42 AM,,NYPD,New York City Police Department,Illegal Parking,Commercial Overnight Parking,Street/Sidewalk,,21600-2169 91ST AVE,...,,,,,,,,,,
1816,32294519,12/29/2015 10:44:50 PM,,NYPD,New York City Police Department,Derelict Vehicle,With License Plate,Street/Sidewalk,,127 STREET,...,,,,,,,,,,
1965,32296487,12/29/2015 07:09:13 PM,,NYPD,New York City Police Department,Derelict Vehicle,With License Plate,Street/Sidewalk,,5201-5299 68TH ST,...,,,,,,,,,,


In [34]:
df_subset["Closed Date"].isna()

0         False
1         False
2         False
3         False
4         False
          ...  
300693     True
300694    False
300695    False
300696    False
300697    False
Name: Closed Date, Length: 300698, dtype: bool

In [41]:
df_isna_closed_date.agg({
    "Closed Date":["min","max"]
})

Unnamed: 0,Closed Date
min,
max,


In [49]:
df_isna_closed_date["Closed Date"].fillna(0)
df_isna_closed_date.head()

Unnamed: 0,Unique Key,Created Date,Closed Date,Agency,Agency Name,Complaint Type,Descriptor,Location Type,Incident Zip,Incident Address,...,Bridge Highway Name,Bridge Highway Direction,Road Ramp,Bridge Highway Segment,Garage Lot Name,Ferry Direction,Ferry Terminal Name,Latitude,Longitude,Location
416,32305700,12/31/2015 02:16:04 PM,,NYPD,New York City Police Department,Illegal Parking,Posted Parking Sign Violation,Street/Sidewalk,,5426-5526 90TH ST,...,,,,,,,,,,
611,32309308,12/31/2015 09:58:06 AM,,NYPD,New York City Police Department,Noise - Street/Sidewalk,Loud Music/Party,Street/Sidewalk,,30 STREET,...,,,,,,,,,,
1648,32303348,12/30/2015 05:13:42 AM,,NYPD,New York City Police Department,Illegal Parking,Commercial Overnight Parking,Street/Sidewalk,,21600-2169 91ST AVE,...,,,,,,,,,,
1816,32294519,12/29/2015 10:44:50 PM,,NYPD,New York City Police Department,Derelict Vehicle,With License Plate,Street/Sidewalk,,127 STREET,...,,,,,,,,,,
1965,32296487,12/29/2015 07:09:13 PM,,NYPD,New York City Police Department,Derelict Vehicle,With License Plate,Street/Sidewalk,,5201-5299 68TH ST,...,,,,,,,,,,


In [46]:
df_isna_closed_date_0=df_isna_closed_date["Closed Date"].fillna(0)

In [47]:
df_isna_closed_date_0.head()

416     0
611     0
1648    0
1816    0
1965    0
Name: Closed Date, dtype: int64

In [54]:
df_fillna=df_isna_closed_date.fillna(
    value={
        'Closed Date':0
    }
)

In [53]:
df_isna_closed_date.agg({
    "Closed Date":["min","max"]
})

Unnamed: 0,Closed Date
min,
max,


In [55]:
df_fillna.head()

Unnamed: 0,Unique Key,Created Date,Closed Date,Agency,Agency Name,Complaint Type,Descriptor,Location Type,Incident Zip,Incident Address,...,Bridge Highway Name,Bridge Highway Direction,Road Ramp,Bridge Highway Segment,Garage Lot Name,Ferry Direction,Ferry Terminal Name,Latitude,Longitude,Location
416,32305700,12/31/2015 02:16:04 PM,0,NYPD,New York City Police Department,Illegal Parking,Posted Parking Sign Violation,Street/Sidewalk,,5426-5526 90TH ST,...,,,,,,,,,,
611,32309308,12/31/2015 09:58:06 AM,0,NYPD,New York City Police Department,Noise - Street/Sidewalk,Loud Music/Party,Street/Sidewalk,,30 STREET,...,,,,,,,,,,
1648,32303348,12/30/2015 05:13:42 AM,0,NYPD,New York City Police Department,Illegal Parking,Commercial Overnight Parking,Street/Sidewalk,,21600-2169 91ST AVE,...,,,,,,,,,,
1816,32294519,12/29/2015 10:44:50 PM,0,NYPD,New York City Police Department,Derelict Vehicle,With License Plate,Street/Sidewalk,,127 STREET,...,,,,,,,,,,
1965,32296487,12/29/2015 07:09:13 PM,0,NYPD,New York City Police Department,Derelict Vehicle,With License Plate,Street/Sidewalk,,5201-5299 68TH ST,...,,,,,,,,,,


In [56]:
df_fillna.agg({
    "Closed Date":["min","max"]
})

Unnamed: 0,Closed Date
min,0
max,0


In [1]:
df

NameError: name 'df' is not defined