In [1]:
import pandas as pd
pd.set_option("display.max_columns", None)  # full option name; the bare "max_columns" is ambiguous in recent pandas
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
%matplotlib inline

The dataset under examination is the New York City Open Data Portal's [NYPD Complaints](https://data.cityofnewyork.us/Public-Safety/NYPD-Complaint-Data-Current-YTD/5uac-w243) dataset. NYPD complaints are published on the portal as two datasets: a "Historical" one with complaints going back to 2006, and the one used here, which covers 2016 year to date (YTD).

For the data dictionary see `../data/`.

In [3]:
complaints = pd.read_csv("../data/NYPD_Complaint_Data_Current_YTD.csv", index_col=0)

In [5]:
complaints.head(0)

CMPLNT_NUM,CMPLNT_FR_DT,CMPLNT_FR_TM,CMPLNT_TO_DT,CMPLNT_TO_TM,RPT_DT,KY_CD,OFNS_DESC,PD_CD,PD_DESC,CRM_ATPT_CPTD_CD,LAW_CAT_CD,JURIS_DESC,BORO_NM,ADDR_PCT_CD,LOC_OF_OCCUR_DESC,PREM_TYP_DESC,PARKS_NM,HADEVELOPT,X_COORD_CD,Y_COORD_CD,Latitude,Longitude,Lat_Lon


The rules under which entries are dated and recorded are fairly intricate. The raw column names are hard to read, so we'll replace them with readable names before we do anything else.

In [12]:
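# Labels are assigned by position, so this list must follow the file's column order.
# Note that 'Offense Classification' lands on KY_CD (numeric codes) and
# 'Key Code' lands on OFNS_DESC (text descriptions).
complaints.columns = ['Complaint Date', 'Complaint Time', 'Reported Complaint Date', 'Reported Complaint Time',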
                      'Reported Date', 'Offense Classification', 'Key Code', 'Internal Classification',
                      'Internal Classification Description', 'Crime Successful',
                      'Level of Offense', 'Jurisdiction', 'Borough', 'Precinct', 'Location Description', 'Premises Type',
                      'Park Name', 'NYCHA Name', 'X Coordinate', 'Y Coordinate', 'Latitude', 'Longitude', 'Location']

In [16]:
complaints.index.name = 'ID'
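
Assigning a list to `complaints.columns` works only while the list order matches the file's column order. As a minimal sketch of a more defensive alternative, the same labels can be applied through an explicit old-name to new-name mapping. The two-row frame `raw` and the helper dict `readable_names` are hypothetical and cover only four of the columns; the labels mirror the ones assigned above.

```python
import pandas as pd

# Hypothetical two-row frame that only mimics part of the raw column layout;
# the values are made up for illustration.
raw = pd.DataFrame(
    {
        "CMPLNT_FR_DT": ["01/01/2016", "06/01/2016"],
        "CMPLNT_FR_TM": ["12:00:00", "15:00:00"],
        "KY_CD": [341, 578],
        "OFNS_DESC": ["PETIT LARCENY", "HARRASSMENT 2"],
    },
    index=pd.Index([1, 2], name="CMPLNT_NUM"),
)

# Explicit old-name -> new-name mapping, mirroring the labels used in the notebook.
readable_names = {
    "CMPLNT_FR_DT": "Complaint Date",
    "CMPLNT_FR_TM": "Complaint Time",
    "KY_CD": "Offense Classification",
    "OFNS_DESC": "Key Code",
}

# Fail loudly if an expected column is absent instead of mislabeling silently.
missing = set(readable_names) - set(raw.columns)
assert not missing, f"columns not found: {missing}"

# rename() matches by name, so column order no longer matters;
# rename_axis() sets the index name in the same chain.
renamed = raw.rename(columns=readable_names).rename_axis("ID")
print(list(renamed.columns))
```

With a mapping, a reordered or extended file cannot shift labels onto the wrong columns, which positional assignment would allow.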

In [17]:
complaints.head(0)

Unnamed: 0_level_0,Complaint Date,Complaint Time,Reported Complaint Date,Reported Complaint Time,Reported Date,Offense Classification,Key Code,Internal Classification,Internal Classification Description,Crime Successful,Level of Offense,Jurisdiction,Borough,Precinct,Location Description,Premises Type,Park Name,NYCHA Name,X Coordinate,Y Coordinate,Latitude,Longitude,Location
ID,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1


In [19]:
complaints['Complaint Date'].value_counts()

01/01/2016    2077
06/01/2016    1731
04/01/2016    1728
07/01/2016    1536
03/01/2016    1503
04/22/2016    1501
01/15/2016    1482
05/20/2016    1466
04/13/2016    1464
08/05/2016    1459
05/11/2016    1456
06/02/2016    1452
06/24/2016    1452
06/22/2016    1452
06/15/2016    1450
03/11/2016    1449
04/15/2016    1445
05/14/2016    1441
08/20/2016    1435
06/17/2016    1432
05/12/2016    1432
08/01/2016    1428
06/10/2016    1421
05/25/2016    1418
05/17/2016    1416
04/21/2016    1414
06/03/2016    1412
07/23/2016    1410
04/20/2016    1409
02/01/2016    1408
              ... 
08/12/2013       1
07/28/2007       1
11/27/2012       1
01/01/1978       1
01/08/2010       1
09/15/2014       1
05/23/2011       1
03/09/2012       1
04/27/2013       1
05/15/1016       1
12/12/2013       1
09/14/1016       1
10/20/2014       1
01/01/1988       1
04/12/2011       1
06/09/2012       1
01/31/2005       1
01/21/2014       1
09/23/1985       1
10/26/2014       1
08/07/2010       1
07/03/2005  
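
The tail of this output includes dates long before 2016 and at least two with the year 1016, which look like data-entry errors. A minimal sketch of one way to flag such rows, assuming the `MM/DD/YYYY` format seen above. The sample series and the 1900 cutoff are hypothetical choices for illustration, not something the dataset documents.

```python
import pandas as pd

# Hypothetical sample in the same MM/DD/YYYY format as the output above,
# including one of the year-1016 values.
dates = pd.Series(["01/01/2016", "05/15/1016", "09/23/1985", "08/05/2016"])

# errors="coerce" yields NaT for anything pandas cannot represent or parse.
# Depending on the pandas version, a year like 1016 may be coerced to NaT
# or parsed as-is, so the check below covers both cases.
parsed = pd.to_datetime(dates, format="%m/%d/%Y", errors="coerce")

# Assumed plausibility cutoff: anything unparsed or before 1900 is suspect.
suspect = parsed.isna() | (parsed.dt.year < 1900)

print(dates[suspect])
```

Rows that parse but predate 2016, such as the 1985 entry, are left alone here; whether they are errors or late-reported complaints depends on the dating rules in the data dictionary.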

In [20]:
complaints['Complaint Time'].value_counts()

12:00:00    9732
15:00:00    8000
18:00:00    7486
17:00:00    7120
20:00:00    7062
16:00:00    6991
19:00:00    6680
14:00:00    6307
22:00:00    6111
09:00:00    5977
21:00:00    5932
10:00:00    5861
13:00:00    5784
08:00:00    5526
23:00:00    5279
11:00:00    5088
00:01:00    4659
01:00:00    3934
18:30:00    3632
16:30:00    3507
17:30:00    3502
15:30:00    3373
19:30:00    3346
02:00:00    3346
20:30:00    3323
14:30:00    3260
21:30:00    3150
00:00:00    3148
22:30:00    2825
07:00:00    2724
            ... 
06:32:00      11
06:16:00      11
05:02:00      11
07:39:00      11
07:06:00      10
05:41:00      10
08:34:00      10
05:53:00      10
06:34:00      10
05:21:00      10
06:13:00       9
07:29:00       9
05:46:00       9
06:11:00       9
07:04:00       9
07:01:00       9
06:46:00       9
07:26:00       9
05:56:00       8
06:48:00       8
07:02:00       8
06:51:00       8
07:21:00       8
05:37:00       6
06:49:00       6
06:47:00       6
05:34:00       5
05:51:00      

In [22]:
# complaints['Reported Complaint Date'].value_counts()
# complaints['Reported Complaint Time'].value_counts()
# complaints['Reported Date'].value_counts()

In [23]:
complaints['Offense Classification'].value_counts()

341    60850
578    49907
344    39977
109    32561
351    30117
361    17402
106    15869
235    13635
105    11472
126     9961
107     9635
121     7501
359     6911
348     4950
110     4767
113     4585
347     4423
118     4111
236     3947
233     3708
112     3703
117     3553
352     3001
340     2626
343     1602
353     1338
104     1107
116     1031
358      947
355      945
       ...  
125      528
364      319
101      263
350      221
231      205
238      179
124      111
356       93
345       88
346       75
572       66
363       48
675       46
230       35
120       29
677       28
342       27
115        8
571        7
237        7
349        3
122        3
103        3
676        2
354        2
366        2
455        2
685        1
234        1
102        1
Name: Offense Classification, dtype: int64

In [25]:
complaints['Key Code'].value_counts()

PETIT LARCENY                           60850
HARRASSMENT 2                           49907
ASSAULT 3 & RELATED OFFENSES            39977
CRIMINAL MISCHIEF & RELATED OF          37618
GRAND LARCENY                           32561
OFF. AGNST PUB ORD SENSBLTY &           17401
DANGEROUS DRUGS                         17188
FELONY ASSAULT                          15869
ROBBERY                                 11472
MISCELLANEOUS PENAL LAW                 10560
BURGLARY                                 9635
DANGEROUS WEAPONS                        8058
OFFENSES AGAINST PUBLIC ADMINI           6911
VEHICLE AND TRAFFIC LAWS                 4950
GRAND LARCENY OF MOTOR VEHICLE           4767
SEX CRIMES                               4739
FORGERY                                  4585
INTOXICATED & IMPAIRED DRIVING           4423
THEFT-FRAUD                              3703
CRIMINAL TRESPASS                        3001
FRAUDS                                   2626
UNAUTHORIZED USE OF A VEHICLE     
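
Comparing the two outputs, some counts line up exactly (341 and PETIT LARCENY both have 60850), while others do not: no single code has 37618 rows, although 30117 (code 351) plus 7501 (code 121) adds up to it. That is consistent with some descriptions covering more than one code. A minimal sketch of how to check this, using hypothetical toy rows; the pairing of code 121 with CRIMINAL MISCHIEF is an assumption inferred from the arithmetic, not confirmed by the source.

```python
import pandas as pd

# Hypothetical rows that reuse the notebook's column labels.
toy = pd.DataFrame(
    {
        "Offense Classification": [341, 341, 351, 121, 121, 109],
        "Key Code": [
            "PETIT LARCENY",
            "PETIT LARCENY",
            "CRIMINAL MISCHIEF & RELATED OF",
            "CRIMINAL MISCHIEF & RELATED OF",
            "CRIMINAL MISCHIEF & RELATED OF",
            "GRAND LARCENY",
        ],
    }
)

# Count how many distinct numeric codes share each description.
codes_per_description = toy.groupby("Key Code")["Offense Classification"].nunique()

# Descriptions that map to more than one code.
multi_code = codes_per_description[codes_per_description > 1]
print(multi_code)
```

Running the same `groupby` on `complaints` would show which descriptions, if any, actually span several codes in the real data.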

In [27]:
complaints['Internal Classification'].value_counts()

638.0    34865
101.0    32347
333.0    19819
639.0    16279
637.0    15042
338.0    14675
109.0    12425
254.0    11041
259.0     9869
321.0     9030
258.0     8610
567.0     8121
198.0     6051
339.0     5752
113.0     5539
511.0     4998
905.0     4380
782.0     3945
729.0     3883
441.0     3802
916.0     3708
739.0     3574
421.0     3527
386.0     3298
343.0     3200
269.0     3181
267.0     3130
793.0     3095
221.0     3039
349.0     2867
         ...  
107.0        2
687.0        2
122.0        1
476.0        1
754.0        1
890.0        1
872.0        1
648.0        1
513.0        1
283.0        1
529.0        1
530.0        1
669.0        1
876.0        1
784.0        1
892.0        1
737.0        1
880.0        1
345.0        1
821.0        1
605.0        1
694.0        1
532.0        1
667.0        1
587.0        1
701.0        1
289.0        1
798.0        1
574.0        1
715.0        1
Name: Internal Classification, dtype: int64

In [28]:
complaints['Internal Classification Description'].value_counts()

HARASSMENT,SUBD 3,4,5                    34865
ASSAULT 3                                32347
LARCENY,PETIT FROM STORE-SHOPL           19819
AGGRAVATED HARASSMENT 2                  16279
HARASSMENT,SUBD 1,CIVILIAN               15042
LARCENY,PETIT FROM BUILDING,UN           14675
ASSAULT 2,1,UNCLASSIFIED                 12425
MISCHIEF, CRIMINAL 4, OF MOTOR           11041
CRIMINAL MISCHIEF,UNCLASSIFIED 4          9869
LARCENY,PETIT FROM AUTO                   9030
CRIMINAL MISCHIEF 4TH, GRAFFIT            8610
MARIJUANA, POSSESSION 4 & 5               8121
CRIMINAL CONTEMPT 1                       6051
LARCENY,PETIT FROM OPEN AREAS,            5752
MENACING,UNCLASSIFIED                     5539
CONTROLLED SUBSTANCE, POSSESSI            5186
INTOXICATED DRIVING,ALCOHOL               4380
WEAPONS, POSSESSION, ETC                  3945
FORGERY,ETC.,UNCLASSIFIED-FELO            3883
LARCENY,GRAND OF AUTO                     3802
LEAVING SCENE-ACCIDENT-PERSONA            3708
FRAUD,UNCLASS

In [30]:
complaints['Crime Successful'].value_counts()

COMPLETED    355599
ATTEMPTED      6141
Name: Crime Successful, dtype: int64

In [31]:
complaints['Level of Offense'].value_counts()

MISDEMEANOR    199063
FELONY         112014
VIOLATION       50663
Name: Level of Offense, dtype: int64

In [32]:
complaints['Jurisdiction'].value_counts()

N.Y. POLICE DEPT                319158
N.Y. HOUSING POLICE              28874
N.Y. TRANSIT POLICE               8859
PORT AUTHORITY                    1872
DEPT OF CORRECTIONS               1265
OTHER                             1006
TRI-BORO BRDG TUNNL                165
HEALTH & HOSP CORP                 159
N.Y. STATE POLICE                  130
METRO NORTH                         57
NYC PARKS                           53
N.Y. STATE PARKS                    36
STATN IS RAPID TRANS                33
NEW YORK CITY SHERIFF OFFICE        30
U.S. PARK POLICE                    16
LONG ISLAND RAILRD                  15
AMTRACK                              6
NYS DEPT TAX AND FINANCE             5
POLICE DEPT NYC                      1
Name: Jurisdiction, dtype: int64

In [33]:
complaints['Precinct'].value_counts()

75.0     11403
40.0      9384
43.0      8955
44.0      8793
14.0      7797
47.0      7783
67.0      7695
52.0      7411
46.0      7165
73.0      7121
114.0     6789
113.0     6177
18.0      6154
42.0      6132
120.0     5882
109.0     5882
115.0     5835
41.0      5829
48.0      5802
103.0     5765
70.0      5540
105.0     5475
32.0      5182
79.0      5171
25.0      5132
71.0      5057
13.0      5040
23.0      5037
110.0     4952
83.0      4871
         ...  
84.0      3933
72.0      3883
68.0      3814
122.0     3769
63.0      3736
101.0     3580
69.0      3565
24.0      3533
6.0       3487
5.0       3384
107.0     3363
108.0     3363
50.0      3354
66.0      3295
7.0       3286
30.0      3250
33.0      3157
78.0      3036
10.0      2971
88.0      2927
20.0      2914
94.0      2810
17.0      2508
112.0     2422
100.0     2238
26.0      2231
111.0     2165
123.0     2026
76.0      2025
22.0       247
Name: Precinct, dtype: int64
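
The precinct numbers print as floats (75.0, 40.0). In pandas this usually means the column contains at least one missing value, because a plain integer column cannot hold NaN; whether that is the cause here is an assumption. A minimal sketch of converting such a column to pandas' nullable integer dtype, using hypothetical values:

```python
import pandas as pd

# Hypothetical precinct values; None stands in for a missing precinct.
precinct = pd.Series([75.0, 40.0, None, 22.0], name="Precinct")

# "Int64" (capital I) is the nullable integer dtype: whole numbers stay
# integers and the missing entry becomes <NA> instead of forcing floats.
as_int = precinct.astype("Int64")

print(as_int.value_counts())
```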

In [34]:
complaints['Location Description'].value_counts()

INSIDE         188740
FRONT OF        82181
OPPOSITE OF      9699
REAR OF          7622
OUTSIDE           159
Name: Location Description, dtype: int64

In [35]:
complaints['Premises Type'].value_counts()

STREET                          108644
RESIDENCE - APT. HOUSE           78352
RESIDENCE-HOUSE                  34176
RESIDENCE - PUBLIC HOUSING       28945
OTHER                            10429
COMMERCIAL BUILDING               9407
CHAIN STORE                       8992
TRANSIT - NYC SUBWAY              8751
DEPARTMENT STORE                  7740
GROCERY/BODEGA                    5795
PARK/PLAYGROUND                   4883
RESTAURANT/DINER                  4754
BAR/NIGHT CLUB                    4112
DRUG STORE                        3583
PUBLIC SCHOOL                     3516
CLOTHING/BOUTIQUE                 2779
FAST FOOD                         2077
FOOD SUPERMARKET                  2070
PUBLIC BUILDING                   2011
HOTEL/MOTEL                       1909
HOSPITAL                          1889
PARKING LOT/GARAGE (PUBLIC)       1629
PARKING LOT/GARAGE (PRIVATE)      1602
BANK                              1547
STORE UNCLASSIFIED                1457
SMALL MERCHANT           