# NY Food review project

This notebook contains testing and scratch work.

### Imports

In [1]:
%load_ext autoreload
%autoreload 2

# Import ds libraries
import pandas as pd
import numpy as np
import re

# Import acquire functions
import nick_acquire as a

In [2]:
pd.set_option('display.max_columns', None)

### Data dictionary

|          feature          |                            description                           |
| ------------------------- | ---------------------------------------------------------------- |
| camis                     | Unique identifier for the restaurant                             |
| dba                       | Name of the business                                             |
| boro                      | Borough in which restaurant is located                           |
| building                  | Building number for restaurant                                   |
| street                    | Street name for establishment                                    |
| zipcode                   | Zip code for the establishment                                   |
| phone                     | Phone number for the establishment                               |
| inspection_date           | Date of the inspection of the restaurant                         |
| critical_flag             | Indicator of critical violation                                  |
| record_date               | The date when the extract was run to produce this data set       |
| latitude                  | Latitude                                                         |
| longitude                 | Longitude                                                        |
| community_board           | Local government body in the five boroughs of New York City      |
| council_district          | District of a New York City Council member                       |
| census_tract              | Geographic region defined for the purpose of a census            |
| bin                       | Building Identification Number                                   |
| bbl                       | Borough, Block, and Lot; a unique real estate ID                 |
| nta                       | Neighborhood Tabulation Area                                     |
| cuisine_description       | Describes type of cuisine at the restaurant                      |
| action                    | Action associated with each restaurant inspection                |
| violation_code            | Violation code associated with establishment inspection          |
| violation_description     | Violation description associated with establishment inspection   |
| score                     | Total score for a particular inspection                          |
| grade                     | Grade associated with inspection                                 |
| grade_date                | Date when the current grade was issued                           |
| inspection_type           | Combination of the inspection program and the type of inspection |

The `action` field records the action associated with each restaurant inspection. Possible values:

* Violations were cited in the following area(s).
* No violations were recorded at the time of this inspection.
* Establishment re-opened by DOHMH
* Establishment re-closed by DOHMH
* Establishment Closed by DOHMH. Violations were cited in the following area(s) and those requiring immediate action were addressed.
* "Missing" = not yet inspected

In [3]:
ny = a.acquire_ny()
ny.head(3)

Unnamed: 0,camis,dba,boro,building,street,zipcode,phone,inspection_date,critical_flag,record_date,latitude,longitude,community_board,council_district,census_tract,bin,bbl,nta,cuisine_description,action,violation_code,violation_description,score,grade,grade_date,inspection_type
0,50106756,UNGARO COAL FIRED PIZZA CAFE,Staten Island,1298,FOREST AVENUE,10302.0,6464690930,1900-01-01T00:00:00.000,Not Applicable,2023-10-26T06:00:14.000,40.626371,-74.133111,501.0,50.0,20100.0,5170408.0,5003870000.0,SI07,,,,,,,,
1,50105716,STELLA'S,Brooklyn,559,5 AVENUE,11215.0,4155703174,1900-01-01T00:00:00.000,Not Applicable,2023-10-26T06:00:14.000,40.665416,-73.989417,307.0,39.0,14100.0,3337750.0,3010480000.0,BK37,,,,,,,,
2,41168748,DUNKIN,Bronx,880,GARRISON AVENUE,10474.0,7188614171,2022-03-30T00:00:00.000,Not Critical,2023-10-26T06:00:11.000,40.816753,-73.892364,202.0,17.0,9300.0,2098685.0,2027390000.0,BX27,Donuts,Violations were cited in the following area(s).,10J,Hand wash sign not posted,13.0,A,2022-03-30T00:00:00.000,Cycle Inspection / Initial Inspection


## Unique counts of columns within dataframe

In [4]:
ny.nunique()

camis                    28232
dba                      22114
boro                         6
building                  7479
street                    2403
zipcode                    226
phone                    25633
inspection_date           1678
critical_flag                3
record_date                  3
latitude                 23115
longitude                23115
community_board             69
council_district            51
census_tract              1183
bin                      20020
bbl                      19709
nta                        193
cuisine_description         89
action                       5
violation_code             143
violation_description      221
score                      130
grade                        6
grade_date                1455
inspection_type             31
dtype: int64

In [5]:
ny.camis.nunique()

28232

In [6]:
ny.dba.nunique()

22114

In [7]:
ny.isna().sum()

camis                         0
dba                         508
boro                          0
building                    351
street                        6
zipcode                    2680
phone                         7
inspection_date               0
critical_flag                 0
record_date                   0
latitude                    257
longitude                   257
community_board            3247
council_district           3251
census_tract               3251
bin                        4237
bbl                         573
nta                        3247
cuisine_description        2305
action                     2305
violation_code             3452
violation_description      3452
score                      9706
grade                    105753
grade_date               114506
inspection_type            2305
dtype: int64

In [8]:
ny_info = pd.DataFrame(ny.isna().sum())
ny_info['dtype'] = ny.dtypes
ny_info = ny_info.rename(columns={0:'nulls'})

In [9]:
ny_info.T

Unnamed: 0,camis,dba,boro,building,street,zipcode,phone,inspection_date,critical_flag,record_date,latitude,longitude,community_board,council_district,census_tract,bin,bbl,nta,cuisine_description,action,violation_code,violation_description,score,grade,grade_date,inspection_type
nulls,0,508,0,351,6,2680,7,0,0,0,257,257,3247,3251,3251,4237,573,3247,2305,2305,3452,3452,9706,105753,114506,2305
dtype,int64,object,object,object,object,float64,object,object,object,object,float64,float64,float64,float64,float64,float64,float64,object,object,object,object,object,float64,object,object,object
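
The null-count and dtype summary above can also be assembled in one step. A minimal sketch on a toy frame (`demo` and its values are made up for illustration and are not taken from the dataset):

```python
import pandas as pd
import numpy as np

# Toy stand-in for the inspection data (hypothetical values).
demo = pd.DataFrame({
    "camis": [1, 2, 3],
    "dba": ["A CAFE", None, "C DINER"],
    "score": [13.0, np.nan, np.nan],
})

# Build the null counts and dtypes side by side; both are indexed by column name.
summary = pd.DataFrame({
    "nulls": demo.isna().sum(),
    "dtype": demo.dtypes.astype(str),
})

print(summary.T)
```

Passing both series to the constructor avoids the later `rename` of the `0` column.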


In [10]:
len(ny)

207929

### Drop useless columns

In [11]:
ny = ny.drop(columns=['bin', 'bbl', 'nta', 'census_tract', 'council_district', 'community_board', 'grade_date'])

### Clean phone numbers

In [12]:
# Clean phone numbers by removing spaces and underscores, then drop rows with a null phone number
ny.phone = ny.phone.str.replace(' ','')
ny.phone = ny.phone.str.replace('_','')
ny = ny[ny.phone.notna()]
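
The cell above strips only spaces and underscores. If other separators (parentheses, dashes) turn up, a regular expression can remove every non-digit character at once. A sketch on a toy series (`phones` and its values are hypothetical):

```python
import pandas as pd

# Hypothetical phone strings with mixed separators and one missing value.
phones = pd.Series(["718 656 0869", "212_769_5370", "(646) 469-0930", None])

# \D matches any character that is not a digit; regex=True makes that explicit.
digits_only = phones.str.replace(r"\D", "", regex=True)

# Drop the missing entries, mirroring the notna() filter used above.
digits_only = digits_only[digits_only.notna()]

print(digits_only.tolist())
```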

### Clean zipcodes

In [13]:
# Clean zipcodes by filling nulls with 0 and then converting to integers
ny.zipcode = ny.zipcode.fillna(0)
ny.zipcode = ny.zipcode.astype(int)
ny = ny[ny.zipcode.notna()]  # No-op: nulls were already filled with 0, so no rows are dropped
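
Because missing zip codes are filled with 0 before the `notna()` filter runs, they stay in the frame as zip code 0. Two possible alternatives, sketched on a toy series (`zips` is hypothetical); which one fits depends on whether rows without a zip code should be kept:

```python
import pandas as pd
import numpy as np

# Hypothetical zip codes stored as floats, with one missing value.
zips = pd.Series([10302.0, np.nan, 11215.0])

# Option 1: keep the rows, using pandas' nullable integer dtype so the gap stays <NA>.
zips_nullable = zips.astype("Int64")

# Option 2: drop the missing rows first, then convert to plain integers.
zips_clean = zips.dropna().astype(int)

print(zips_nullable.tolist())
print(zips_clean.tolist())
```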

### Clean streets

In [14]:
# Remove nulls from street
ny = ny[ny.street.notna()]

### Clean scores

In [15]:
# Drop rows whose inspection date is the 1900-01-01 placeholder
ny = ny[ny.inspection_date != '1900-01-01T00:00:00.000']
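
The 1900-01-01 date appears to mark establishments that have not been inspected yet (those rows have no action, score, or grade in the preview above). The filter compares raw strings; parsing the column to datetimes first makes the comparison independent of the string format. A sketch on a toy frame (`demo` is hypothetical):

```python
import pandas as pd

# Hypothetical rows: one placeholder date and one real inspection date.
demo = pd.DataFrame({
    "inspection_date": ["1900-01-01T00:00:00.000", "2022-03-30T00:00:00.000"],
})

# Parse the ISO-formatted strings into datetime64 values.
demo["inspection_date"] = pd.to_datetime(demo["inspection_date"])

# Keep only rows with a real inspection date.
inspected = demo[demo["inspection_date"] != pd.Timestamp("1900-01-01")]

print(inspected)
```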

In [16]:
# Set the score to 0 when the action reports no violations; otherwise keep the recorded score
new_scores = []

for score,rep in zip(ny.score, ny.action.str.contains('No violation')):
    if rep == True:
        new_scores.append(0)
    else:
        new_scores.append(score)

In [17]:
ny['remp_scores'] = new_scores

In [18]:
ny = ny.drop(columns='remp_scores')

In [19]:
ny.score = new_scores
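
The loop over `zip(ny.score, ...)` can also be expressed as a vectorized operation. A sketch using `Series.mask` on a toy frame (`demo` and its values are hypothetical); it is meant to reproduce the same rule, not to replace the cells above:

```python
import pandas as pd
import numpy as np

# Hypothetical rows: a clean inspection with no score, a scored one, and a missing action.
demo = pd.DataFrame({
    "action": [
        "No violations were recorded at the time of this inspection.",
        "Violations were cited in the following area(s).",
        None,
    ],
    "score": [np.nan, 13.0, np.nan],
})

# True where the action text reports no violations; missing actions count as False.
no_violation = demo["action"].str.contains("No violation", na=False)

# Replace the score with 0 wherever the mask is True, leaving other scores untouched.
demo["score"] = demo["score"].mask(no_violation, 0)

print(demo["score"].tolist())
```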

In [20]:
ny = ny[ny.score.notna()]

In [21]:
ny[ny.score.isna()].action.value_counts()

Series([], Name: action, dtype: int64)

### Clean actions

In [22]:
# Remove nulls from action
ny = ny[ny.action.notna()]

In [23]:
fail = ny.action.value_counts().index[1]

In [24]:
ny[ny.action == fail].inspection_date.value_counts().head(70)

2023-01-09T00:00:00.000    64
2023-01-12T00:00:00.000    51
2022-11-28T00:00:00.000    50
2023-03-28T00:00:00.000    49
2022-09-27T00:00:00.000    47
                           ..
2022-07-14T00:00:00.000    28
2022-07-20T00:00:00.000    27
2022-05-17T00:00:00.000    27
2022-08-15T00:00:00.000    27
2022-01-27T00:00:00.000    27
Name: inspection_date, Length: 70, dtype: int64

In [25]:
ny[ny.grade.isna()].score.value_counts().head(20)

0.0     7680
21.0    3634
22.0    3382
23.0    3213
20.0    3131
19.0    3098
25.0    2991
26.0    2825
28.0    2811
29.0    2702
24.0    2686
30.0    2679
18.0    2662
27.0    2639
31.0    2486
33.0    2346
32.0    2279
16.0    2195
17.0    2016
35.0    1992
Name: score, dtype: int64

In [26]:
ny[ny.action == fail].score.value_counts().head(70)

0.0     512
55.0    191
48.0    189
58.0    188
70.0    183
       ... 
26.0     40
32.0     40
81.0     40
24.0     36
96.0     35
Name: score, Length: 70, dtype: int64

In [27]:
ny.isna().sum()

camis                        0
dba                          4
boro                         0
building                   329
street                       0
zipcode                      0
phone                        0
inspection_date              0
critical_flag                0
record_date                  0
latitude                   222
longitude                  222
cuisine_description          0
action                       0
violation_code            1145
violation_description     1145
score                        0
grade                    96676
inspection_type              0
dtype: int64

In [28]:
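# Regrade every row from its score: 0-13 -> A, 14-27 -> B, 28 and above -> C.
# This overwrites all recorded grades, including any non-A/B/C values.
new_grades = []

for score in ny.score:
    if score <= 13:
        new_grades.append('A')
    elif score <= 27:
        new_grades.append('B')
    else:
        # Scores are non-null here (null scores were dropped above)
        new_grades.append('C')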

In [29]:
ny.grade = new_grades
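
The same score-to-grade mapping can be done without an explicit loop. A sketch using `pd.cut`, assuming the cutoffs from the cell above (13 and 27) and a toy `scores` series with made-up values:

```python
import pandas as pd
import numpy as np

# Hypothetical scores chosen to sit on both sides of each cutoff.
scores = pd.Series([0.0, 13.0, 14.0, 27.0, 28.0, 168.0])

# Bins are right-inclusive by default: (-inf, 13] -> A, (13, 27] -> B, (27, inf) -> C.
grades = pd.cut(scores, bins=[-np.inf, 13, 27, np.inf], labels=["A", "B", "C"])

print(grades.tolist())
```

Unlike the loop, `pd.cut` leaves a missing score as a missing grade instead of skipping the row, so the result always has the same length as the input.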

In [30]:
ny.score[(ny.grade == 'A') & (ny.score <= 13)]

2         13.0
6          0.0
18        10.0
21         0.0
28         0.0
          ... 
207919    12.0
207921     4.0
207924     7.0
207926     0.0
207928     5.0
Name: score, Length: 85598, dtype: float64

In [31]:
ny.grade.value_counts()

A    85598
C    59238
B    53994
Name: grade, dtype: int64

In [32]:
ny.score[ny.grade == 'B'].value_counts()

21.0    5222
23.0    4736
22.0    4688
27.0    4626
19.0    4436
26.0    4394
25.0    4376
20.0    4256
24.0    4038
18.0    3831
16.0    3223
17.0    2844
14.0    2066
15.0    1258
Name: score, dtype: int64

In [33]:
ny.score[ny.grade == 'C'].value_counts()

28.0     3444
30.0     3345
29.0     3165
31.0     3065
33.0     2922
         ... 
168.0       9
110.0       8
112.0       8
142.0       8
140.0       5
Name: score, Length: 103, dtype: int64

In [34]:
ny.score[ny.grade == 'Z'].value_counts().head(20)

Series([], Name: score, dtype: int64)

In [35]:
ny[ny.grade == 'N'].head()

Unnamed: 0,camis,dba,boro,building,street,zipcode,phone,inspection_date,critical_flag,record_date,latitude,longitude,cuisine_description,action,violation_code,violation_description,score,grade,inspection_type


In [36]:
ny.score[ny.grade == 'N'].value_counts().head(20)

Series([], Name: score, dtype: int64)

In [37]:
ny[ny.score.isna()].grade.value_counts()

Series([], Name: grade, dtype: int64)

In [38]:
ny[ny.score.isna()].grade

Series([], Name: grade, dtype: object)

In [39]:
ny.grade[ny.score.isna()].value_counts().head(20)

Series([], Name: grade, dtype: int64)

In [40]:
ny.score[ny.grade.notna()].value_counts().head(20)

12.0    17315
13.0    15545
0.0     10501
10.0     8713
11.0     7988
9.0      7791
7.0      5621
21.0     5222
23.0     4736
22.0     4688
27.0     4626
19.0     4436
26.0     4394
25.0     4376
20.0     4256
24.0     4038
18.0     3831
28.0     3444
30.0     3345
16.0     3223
Name: score, dtype: int64

In [41]:
ny[ny.grade.isna()].inspection_date.value_counts()

Series([], Name: inspection_date, dtype: int64)

In [42]:
# Print the mean and maximum score for each grade
for uniq in ny.grade.unique():
    mean_score = ny.score[ny.grade == uniq].mean()
    max_score = ny.score[ny.grade == uniq].max()
    print(f' {uniq}   {mean_score}   {max_score} ')

 A   8.794434449403024   13.0 
 B   21.358965811016038   27.0 
 C   43.93227320301158   168.0 
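
The per-grade mean and maximum can also be computed in a single `groupby` call. A sketch on a toy frame (`demo` and its values are hypothetical):

```python
import pandas as pd

# Hypothetical grade/score pairs.
demo = pd.DataFrame({
    "grade": ["A", "A", "B", "C", "C"],
    "score": [0.0, 12.0, 20.0, 30.0, 50.0],
})

# One row per grade, with the mean and maximum score as columns.
stats = demo.groupby("grade")["score"].agg(["mean", "max"])

print(stats)
```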


In [43]:
ny.grade.value_counts()

A    85598
C    59238
B    53994
Name: grade, dtype: int64

In [44]:
ny[['score', 'grade']]

Unnamed: 0,score,grade
2,13.0,A
6,0.0,A
18,10.0,A
21,0.0,A
24,24.0,B
...,...,...
207924,7.0,A
207925,26.0,B
207926,0.0,A
207927,48.0,C


## Marc's Actions


In [45]:
# looking at the dataframe
ny.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 198830 entries, 2 to 207928
Data columns (total 19 columns):
 #   Column                 Non-Null Count   Dtype  
---  ------                 --------------   -----  
 0   camis                  198830 non-null  int64  
 1   dba                    198826 non-null  object 
 2   boro                   198830 non-null  object 
 3   building               198501 non-null  object 
 4   street                 198830 non-null  object 
 5   zipcode                198830 non-null  int64  
 6   phone                  198830 non-null  object 
 7   inspection_date        198830 non-null  object 
 8   critical_flag          198830 non-null  object 
 9   record_date            198830 non-null  object 
 10  latitude               198608 non-null  float64
 11  longitude              198608 non-null  float64
 12  cuisine_description    198830 non-null  object 
 13  action                 198830 non-null  object 
 14  violation_code         197685 non-nu

In [46]:
ny.isna().sum()

camis                       0
dba                         4
boro                        0
building                  329
street                      0
zipcode                     0
phone                       0
inspection_date             0
critical_flag               0
record_date                 0
latitude                  222
longitude                 222
cuisine_description         0
action                      0
violation_code           1145
violation_description    1145
score                       0
grade                       0
inspection_type             0
dtype: int64

In [52]:
ny[ny['building'].isna()]

Unnamed: 0,camis,dba,boro,building,street,zipcode,phone,inspection_date,critical_flag,record_date,latitude,longitude,cuisine_description,action,violation_code,violation_description,score,grade,inspection_type
56,50016630,"HUDSON, EURO CAFE",Queens,,TERM8-A1,11430,7186560869,2017-01-12T00:00:00.000,Not Applicable,2023-10-26T06:00:13.000,,,Sandwiches/Salads/Mixed Buffet,No violations were recorded at the time of thi...,,,0.0,A,Trans Fat / Initial Inspection
870,50081641,DUNKIN,Queens,,"MARINE TER, LA GARDIA AIRPORT",11371,9732238527,2020-02-06T00:00:00.000,Not Applicable,2023-10-26T06:00:11.000,,,Donuts,No violations were recorded at the time of thi...,,,0.0,A,Cycle Inspection / Initial Inspection
938,50000390,DUNKIN,Queens,,JFK INTERNATIONAL AIRPORT,11430,7187514796,2022-06-02T00:00:00.000,Critical,2023-10-26T06:00:11.000,40.648313,-73.788281,Coffee/Tea,Violations were cited in the following area(s).,02B,Hot food item not held at or above 140º F.,21.0,B,Cycle Inspection / Initial Inspection
945,40376515,AMERICAN MUSEUM OF NATURAL HISTORY FOOD COURT,Manhattan,,W 79 STREET,10024,2127695370,2023-06-21T00:00:00.000,Not Applicable,2023-10-26T06:00:11.000,,,American,No violations were recorded at the time of thi...,,,0.0,A,Cycle Inspection / Initial Inspection
1123,50037907,BARRILES RESTAURANT AND SPORTS BAR,Queens,,37TH AVE,11372,3476491511,2023-01-23T00:00:00.000,Not Critical,2023-10-26T06:00:11.000,,,Spanish,Violations were cited in the following area(s).,08C,Pesticide not properly labeled or used by unli...,13.0,A,Cycle Inspection / Re-inspection
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
204155,50032780,ABITINO'S PIZZA(CONCOURSE C),Queens,,JFK INTERNATIONAL AIRPORT,11430,7182441344,2019-10-17T00:00:00.000,Not Critical,2023-10-26T06:00:11.000,40.648313,-73.788281,Pizza,Violations were cited in the following area(s).,10F,Non-food contact surface improperly constructe...,13.0,A,Cycle Inspection / Initial Inspection
204878,50032780,ABITINO'S PIZZA(CONCOURSE C),Queens,,JFK INTERNATIONAL AIRPORT,11430,7182441344,2022-06-01T00:00:00.000,Critical,2023-10-26T06:00:11.000,40.648313,-73.788281,Pizza,Violations were cited in the following area(s).,04L,Evidence of mice or live mice present in facil...,24.0,B,Cycle Inspection / Initial Inspection
205201,41629686,DUNKIN,Queens,,JFK INTERNATIONAL AIRPORT,11430,3472191094,2022-08-10T00:00:00.000,Critical,2023-10-26T06:00:11.000,40.648313,-73.788281,Donuts,Violations were cited in the following area(s).,04H,"Raw, cooked or prepared food is adulterated, c...",19.0,B,Cycle Inspection / Initial Inspection
206244,50010286,GRAND BANKS,Manhattan,,Park N. Moore St. at West S,10013,2126606312,2022-06-21T00:00:00.000,Critical,2023-10-26T06:00:11.000,,,Seafood,Violations were cited in the following area(s).,06A,Personal cleanliness inadequate. Outer garment...,12.0,A,Cycle Inspection / Initial Inspection


In [53]:
ny[ny['latitude'].isna()]

Unnamed: 0,camis,dba,boro,building,street,zipcode,phone,inspection_date,critical_flag,record_date,latitude,longitude,cuisine_description,action,violation_code,violation_description,score,grade,inspection_type
56,50016630,"HUDSON, EURO CAFE",Queens,,TERM8-A1,11430,7186560869,2017-01-12T00:00:00.000,Not Applicable,2023-10-26T06:00:13.000,,,Sandwiches/Salads/Mixed Buffet,No violations were recorded at the time of thi...,,,0.0,A,Trans Fat / Initial Inspection
775,50086391,FIX-U-PLATE,Brooklyn,1139/1141,CLARKSON AVE,11212,9292343888,2022-03-21T00:00:00.000,Critical,2023-10-26T06:00:11.000,,,Caribbean,Violations were cited in the following area(s).,04L,Evidence of mice or live mice present in facil...,25.0,B,Cycle Inspection / Initial Inspection
870,50081641,DUNKIN,Queens,,"MARINE TER, LA GARDIA AIRPORT",11371,9732238527,2020-02-06T00:00:00.000,Not Applicable,2023-10-26T06:00:11.000,,,Donuts,No violations were recorded at the time of thi...,,,0.0,A,Cycle Inspection / Initial Inspection
945,40376515,AMERICAN MUSEUM OF NATURAL HISTORY FOOD COURT,Manhattan,,W 79 STREET,10024,2127695370,2023-06-21T00:00:00.000,Not Applicable,2023-10-26T06:00:11.000,,,American,No violations were recorded at the time of thi...,,,0.0,A,Cycle Inspection / Initial Inspection
974,41612260,TROPICAL RESTAURANT,Queens,88-18/20,JAMAICA AVENUE,11421,7188468816,2021-09-01T00:00:00.000,Critical,2023-10-26T06:00:11.000,,,Mediterranean,Establishment re-opened by DOHMH.,04M,Live roaches present in facility's food and/or...,27.0,B,Cycle Inspection / Reopening Inspection
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
203777,41521007,"PRIME TAVERN, DELTA TERMINAL",Queens,,US AIRWAYS TERMINAL,11371,7186566210,2018-03-08T00:00:00.000,Critical,2023-10-26T06:00:11.000,,,American,Violations were cited in the following area(s).,06C,Food not protected from potential source of co...,7.0,A,Cycle Inspection / Initial Inspection
206244,50010286,GRAND BANKS,Manhattan,,Park N. Moore St. at West S,10013,2126606312,2022-06-21T00:00:00.000,Critical,2023-10-26T06:00:11.000,,,Seafood,Violations were cited in the following area(s).,06A,Personal cleanliness inadequate. Outer garment...,12.0,A,Cycle Inspection / Initial Inspection
206412,50063280,ENZO'S OF ARTHUR AVE,Bronx,,ARTHUR AVE,10458,7187334455,2023-09-21T00:00:00.000,Not Critical,2023-10-26T06:00:11.000,,,Italian,Violations were cited in the following area(s).,10F,Non-food contact surface or equipment made of ...,12.0,A,Cycle Inspection / Re-inspection
206506,50086391,FIX-U-PLATE,Brooklyn,1139/1141,CLARKSON AVE,11212,9292343888,2023-04-27T00:00:00.000,Not Critical,2023-10-26T06:00:11.000,,,Caribbean,Violations were cited in the following area(s).,10F,Non-food contact surface or equipment made of ...,10.0,A,Cycle Inspection / Re-inspection


In [54]:
ny[ny['longitude'].isna()]

Unnamed: 0,camis,dba,boro,building,street,zipcode,phone,inspection_date,critical_flag,record_date,latitude,longitude,cuisine_description,action,violation_code,violation_description,score,grade,inspection_type
56,50016630,"HUDSON, EURO CAFE",Queens,,TERM8-A1,11430,7186560869,2017-01-12T00:00:00.000,Not Applicable,2023-10-26T06:00:13.000,,,Sandwiches/Salads/Mixed Buffet,No violations were recorded at the time of thi...,,,0.0,A,Trans Fat / Initial Inspection
775,50086391,FIX-U-PLATE,Brooklyn,1139/1141,CLARKSON AVE,11212,9292343888,2022-03-21T00:00:00.000,Critical,2023-10-26T06:00:11.000,,,Caribbean,Violations were cited in the following area(s).,04L,Evidence of mice or live mice present in facil...,25.0,B,Cycle Inspection / Initial Inspection
870,50081641,DUNKIN,Queens,,"MARINE TER, LA GARDIA AIRPORT",11371,9732238527,2020-02-06T00:00:00.000,Not Applicable,2023-10-26T06:00:11.000,,,Donuts,No violations were recorded at the time of thi...,,,0.0,A,Cycle Inspection / Initial Inspection
945,40376515,AMERICAN MUSEUM OF NATURAL HISTORY FOOD COURT,Manhattan,,W 79 STREET,10024,2127695370,2023-06-21T00:00:00.000,Not Applicable,2023-10-26T06:00:11.000,,,American,No violations were recorded at the time of thi...,,,0.0,A,Cycle Inspection / Initial Inspection
974,41612260,TROPICAL RESTAURANT,Queens,88-18/20,JAMAICA AVENUE,11421,7188468816,2021-09-01T00:00:00.000,Critical,2023-10-26T06:00:11.000,,,Mediterranean,Establishment re-opened by DOHMH.,04M,Live roaches present in facility's food and/or...,27.0,B,Cycle Inspection / Reopening Inspection
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
203777,41521007,"PRIME TAVERN, DELTA TERMINAL",Queens,,US AIRWAYS TERMINAL,11371,7186566210,2018-03-08T00:00:00.000,Critical,2023-10-26T06:00:11.000,,,American,Violations were cited in the following area(s).,06C,Food not protected from potential source of co...,7.0,A,Cycle Inspection / Initial Inspection
206244,50010286,GRAND BANKS,Manhattan,,Park N. Moore St. at West S,10013,2126606312,2022-06-21T00:00:00.000,Critical,2023-10-26T06:00:11.000,,,Seafood,Violations were cited in the following area(s).,06A,Personal cleanliness inadequate. Outer garment...,12.0,A,Cycle Inspection / Initial Inspection
206412,50063280,ENZO'S OF ARTHUR AVE,Bronx,,ARTHUR AVE,10458,7187334455,2023-09-21T00:00:00.000,Not Critical,2023-10-26T06:00:11.000,,,Italian,Violations were cited in the following area(s).,10F,Non-food contact surface or equipment made of ...,12.0,A,Cycle Inspection / Re-inspection
206506,50086391,FIX-U-PLATE,Brooklyn,1139/1141,CLARKSON AVE,11212,9292343888,2023-04-27T00:00:00.000,Not Critical,2023-10-26T06:00:11.000,,,Caribbean,Violations were cited in the following area(s).,10F,Non-food contact surface or equipment made of ...,10.0,A,Cycle Inspection / Re-inspection


In [55]:
ny[ny['violation_code'].isna()]

Unnamed: 0,camis,dba,boro,building,street,zipcode,phone,inspection_date,critical_flag,record_date,latitude,longitude,cuisine_description,action,violation_code,violation_description,score,grade,inspection_type
6,41688142,TABLE 87,Brooklyn,620,ATLANTIC AVENUE,11217,9176186100,2017-01-25T00:00:00.000,Not Applicable,2023-10-26T06:00:11.000,40.683447,-73.975691,Pizza,No violations were recorded at the time of thi...,,,0.0,A,Cycle Inspection / Initial Inspection
21,50086686,GERTIE,Brooklyn,58,MARCY AVENUE,11211,7186360902,2021-08-25T00:00:00.000,Not Applicable,2023-10-26T06:00:13.000,40.712360,-73.955419,American,No violations were recorded at the time of thi...,,,0.0,A,Cycle Inspection / Initial Inspection
28,50000995,THE MEATBALL SHOP,Manhattan,1462,2 AVENUE,10075,2122576121,2022-05-09T00:00:00.000,Not Applicable,2023-10-26T06:00:11.000,40.771619,-73.956289,American,Establishment re-opened by DOHMH.,,,0.0,A,Cycle Inspection / Reopening Inspection
45,50112851,CHA KEE,Manhattan,43,MOTT STREET,10013,2125772888,2023-03-10T00:00:00.000,Not Applicable,2023-10-26T06:00:11.000,40.715215,-73.998745,Chinese,No violations were recorded at the time of thi...,,,0.0,A,Administrative Miscellaneous / Initial Inspection
51,41365127,YAQUE RIVER,Bronx,860,EAST TREMONT AVENUE,10460,9176003611,2019-10-18T00:00:00.000,Not Applicable,2023-10-26T06:00:11.000,40.843059,-73.886393,Latin American,No violations were recorded at the time of thi...,,,0.0,A,Smoke-Free Air Act / Limited Inspection
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
207070,50129233,DELICIOUS SLICE OF BROOKLYN & DELI,Brooklyn,849,4 AVENUE,11232,3473842668,2023-04-04T00:00:00.000,Not Applicable,2023-10-26T06:00:13.000,40.657015,-74.001589,Pizza,No violations were recorded at the time of thi...,,,0.0,A,Administrative Miscellaneous / Re-inspection
207240,41202218,PURE LOUNGE,Queens,12619,MERRICK BOULEVARD,11434,7186685226,2021-10-05T00:00:00.000,Not Applicable,2023-10-26T06:00:13.000,40.681863,-73.766062,Caribbean,No violations were recorded at the time of thi...,,,0.0,A,Smoke-Free Air Act / Initial Inspection
207317,41362423,CIBO EXPRESS GOURMET MARKET,Queens,0,JFK INTL. AIRPORT,11430,6464835087,2019-10-17T00:00:00.000,Not Applicable,2023-10-26T06:00:11.000,40.648313,-73.788281,American,No violations were recorded at the time of thi...,,,0.0,A,Cycle Inspection / Initial Inspection
207378,41469754,GOLDMAN SACHS,Manhattan,200,WEST STREET,10282,9173439899,2023-05-12T00:00:00.000,Not Applicable,2023-10-26T06:00:11.000,40.713839,-74.013812,American,No violations were recorded at the time of thi...,,,0.0,A,Cycle Inspection / Initial Inspection


In [56]:
ny[ny['violation_description'].isna()]

Unnamed: 0,camis,dba,boro,building,street,zipcode,phone,inspection_date,critical_flag,record_date,latitude,longitude,cuisine_description,action,violation_code,violation_description,score,grade,inspection_type
6,41688142,TABLE 87,Brooklyn,620,ATLANTIC AVENUE,11217,9176186100,2017-01-25T00:00:00.000,Not Applicable,2023-10-26T06:00:11.000,40.683447,-73.975691,Pizza,No violations were recorded at the time of thi...,,,0.0,A,Cycle Inspection / Initial Inspection
21,50086686,GERTIE,Brooklyn,58,MARCY AVENUE,11211,7186360902,2021-08-25T00:00:00.000,Not Applicable,2023-10-26T06:00:13.000,40.712360,-73.955419,American,No violations were recorded at the time of thi...,,,0.0,A,Cycle Inspection / Initial Inspection
28,50000995,THE MEATBALL SHOP,Manhattan,1462,2 AVENUE,10075,2122576121,2022-05-09T00:00:00.000,Not Applicable,2023-10-26T06:00:11.000,40.771619,-73.956289,American,Establishment re-opened by DOHMH.,,,0.0,A,Cycle Inspection / Reopening Inspection
45,50112851,CHA KEE,Manhattan,43,MOTT STREET,10013,2125772888,2023-03-10T00:00:00.000,Not Applicable,2023-10-26T06:00:11.000,40.715215,-73.998745,Chinese,No violations were recorded at the time of thi...,,,0.0,A,Administrative Miscellaneous / Initial Inspection
51,41365127,YAQUE RIVER,Bronx,860,EAST TREMONT AVENUE,10460,9176003611,2019-10-18T00:00:00.000,Not Applicable,2023-10-26T06:00:11.000,40.843059,-73.886393,Latin American,No violations were recorded at the time of thi...,,,0.0,A,Smoke-Free Air Act / Limited Inspection
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
207070,50129233,DELICIOUS SLICE OF BROOKLYN & DELI,Brooklyn,849,4 AVENUE,11232,3473842668,2023-04-04T00:00:00.000,Not Applicable,2023-10-26T06:00:13.000,40.657015,-74.001589,Pizza,No violations were recorded at the time of thi...,,,0.0,A,Administrative Miscellaneous / Re-inspection
207240,41202218,PURE LOUNGE,Queens,12619,MERRICK BOULEVARD,11434,7186685226,2021-10-05T00:00:00.000,Not Applicable,2023-10-26T06:00:13.000,40.681863,-73.766062,Caribbean,No violations were recorded at the time of thi...,,,0.0,A,Smoke-Free Air Act / Initial Inspection
207317,41362423,CIBO EXPRESS GOURMET MARKET,Queens,0,JFK INTL. AIRPORT,11430,6464835087,2019-10-17T00:00:00.000,Not Applicable,2023-10-26T06:00:11.000,40.648313,-73.788281,American,No violations were recorded at the time of thi...,,,0.0,A,Cycle Inspection / Initial Inspection
207378,41469754,GOLDMAN SACHS,Manhattan,200,WEST STREET,10282,9173439899,2023-05-12T00:00:00.000,Not Applicable,2023-10-26T06:00:11.000,40.713839,-74.013812,American,No violations were recorded at the time of thi...,,,0.0,A,Cycle Inspection / Initial Inspection


In [57]:
ny = ny.dropna()
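
Note that `dropna()` with no arguments removes every row containing any null, which includes inspections where `violation_code` is empty because no violation was recorded. If those rows should be kept, one option is to fill the violation columns first and restrict `dropna` to the columns that must be present. A sketch on a toy frame (`demo`, the `"NONE"` placeholder, and the choice of `subset` columns are all assumptions for illustration):

```python
import pandas as pd
import numpy as np

# Hypothetical rows: a clean inspection, a row missing coordinates, and a row missing a name.
demo = pd.DataFrame({
    "dba": ["TABLE 87", "DUNKIN", None],
    "latitude": [40.68, np.nan, 40.71],
    "violation_code": [None, "10F", "04L"],
    "score": [0.0, 13.0, 24.0],
})

# Mark "no violation" explicitly so these rows survive the null filter.
demo["violation_code"] = demo["violation_code"].fillna("NONE")

# Drop rows only when one of the required columns is missing.
kept = demo.dropna(subset=["dba", "latitude"])

print(kept)
```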

In [60]:
ny.isna().sum()

camis                    0
dba                      0
boro                     0
building                 0
street                   0
zipcode                  0
phone                    0
inspection_date          0
critical_flag            0
record_date              0
latitude                 0
longitude                0
cuisine_description      0
action                   0
violation_code           0
violation_description    0
score                    0
grade                    0
inspection_type          0
dtype: int64

In [58]:
ny.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 197269 entries, 2 to 207928
Data columns (total 19 columns):
 #   Column                 Non-Null Count   Dtype  
---  ------                 --------------   -----  
 0   camis                  197269 non-null  int64  
 1   dba                    197269 non-null  object 
 2   boro                   197269 non-null  object 
 3   building               197269 non-null  object 
 4   street                 197269 non-null  object 
 5   zipcode                197269 non-null  int64  
 6   phone                  197269 non-null  object 
 7   inspection_date        197269 non-null  object 
 8   critical_flag          197269 non-null  object 
 9   record_date            197269 non-null  object 
 10  latitude               197269 non-null  float64
 11  longitude              197269 non-null  float64
 12  cuisine_description    197269 non-null  object 
 13  action                 197269 non-null  object 
 14  violation_code         197269 non-nu

In [62]:
ny.nunique()

camis                    25723
dba                      20519
boro                         5
building                  7243
street                    2251
zipcode                    221
phone                    23821
inspection_date           1596
critical_flag                3
record_date                  1
latitude                 21767
longitude                21767
cuisine_description         89
action                       4
violation_code              90
violation_description      167
score                      130
grade                        3
inspection_type             20
dtype: int64