<img src="http://imgur.com/1ZcRyrc.png" style="float: left; margin: 20px; height: 55px" />

# Exploring NYC Club Complaints with Pandas

_Authors: Julian Oquendo and James Larkin_

### Import standard libraries for data analysis in Python

In [4]:
# Import standard libraries with the commonly used aliases
# pandas - library for working with tabular data
# Aliased as pd
import pandas as pd

# Numerical Python - library for various mathematical operations
# Aliased as np
import numpy as np

# Seaborn is a data visualization library built on top of Matplotlib
# Aliased as sns
import seaborn as sns

# pyplot is a Matplotlib module which provides a MATLAB-like interface
# "matplotlib.pyplot" is aliased as plt
import matplotlib.pyplot as plt

### Reading in a comma separated values (csv) file
- [pandas.read_csv](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html)

In [6]:
# Read in "club_noise_complaints.csv" using the Pandas `read_csv` function
club_noise_complaints = pd.read_csv('data/club_noise_complaints.csv')
club_noise_complaints.head()


Unnamed: 0,index,LOC,ZC,CITY,BOUROUGHS_STATE,LAT,LONG,num_calls
0,1,Club/Bar/Restaurant,10308,STATEN ISLAND,"STATEN ISLAND, NY",40.544096,-74.141155,0
1,2,Club/Bar/Restaurant,10012,NEW YORK,"MANHATTAN, NY",40.729793,-73.998842,18
2,3,Club/Bar/Restaurant,10308,STATEN ISLAND,"STATEN ISLAND, NY",40.544209,-74.14104,21
3,4,Club/Bar/Restaurant,10034,New York,"MANHATTAN, NY",40.866376,-73.928258,160
4,5,Club/Bar/Restaurant,11220,,"BROOKLYN, NY",40.635207,-74.020285,17


In [8]:
# Assign the data to a variable name such as "cc" (for club complaints)

cc = club_noise_complaints

# A comma separated values (csv) file is a plain text file that contains a list of data
# with elements separated by commas. There are also tab separated values (tsv) files,
# JSON (JavaScript Object Notation) files, etc.
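
As a small illustration of the difference between comma separated and tab separated data, the sketch below reads the same two rows from in-memory strings. The sample values and the variable names (`csv_text`, `tsv_text`) are made up for the example and are not taken from the complaints file.

```python
import io

import pandas as pd

# Hypothetical two-row sample, written once with commas and once with tabs
csv_text = "ZC,num_calls\n10308,0\n10012,18\n"
tsv_text = "ZC\tnum_calls\n10308\t0\n10012\t18\n"

# read_csv splits on "," by default; pass sep="\t" for tab separated data
from_csv = pd.read_csv(io.StringIO(csv_text))
from_tsv = pd.read_csv(io.StringIO(tsv_text), sep="\t")

# Both calls should produce the same DataFrame
print(from_csv.equals(from_tsv))
```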

### Getting a look at the dataset using Pandas methods such as `.head()`, `.tail()`, and `.sample()`

- [pandas.DataFrame.head](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.head.html)
- [pandas.DataFrame.tail](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.tail.html)
- [pandas.DataFrame.sample](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sample.html)

In [101]:
# Check the first five rows of the Pandas DataFrame
cc.head(5)

# Check the last five rows of the Pandas DataFrame
cc.tail(5)

# Check out a random sample of rows of the Pandas DataFrame
# A random 50% sample of the DataFrame with replacement:
# cc.sample(frac=0.5, replace=True, random_state=1)

# A random sample of three values from a single column (Series)
cc['BOUROUGHS_STATE'].sample(n=3, random_state=1)


1542     BROOKLYN, NY
2153    MANHATTAN, NY
194      BROOKLYN, NY
Name: BOUROUGHS_STATE, dtype: object
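
The `random_state` argument is what makes a sample repeatable. A minimal sketch on a made-up ten-row DataFrame (the name `toy` and its values are hypothetical):

```python
import pandas as pd

# Hypothetical ten-row DataFrame standing in for the complaints data
toy = pd.DataFrame({"ZC": range(10)})

# The same random_state returns the same rows on every call
first_draw = toy.sample(n=3, random_state=1)
second_draw = toy.sample(n=3, random_state=1)
print(first_draw.equals(second_draw))

# frac asks for a fraction of the rows instead of a fixed count
half = toy.sample(frac=0.5, random_state=1)
print(len(half))
```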

### Output a concise summary of a Pandas DataFrame using `.info()`

- [pandas.DataFrame.info](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.info.html)

In [11]:
# Chain on the `info()` method to the Pandas DataFrame variable
cc.info()



<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2447 entries, 0 to 2446
Data columns (total 8 columns):
 #   Column           Non-Null Count  Dtype  
---  ------           --------------  -----  
 0   index            2447 non-null   int64  
 1   LOC              2447 non-null   object 
 2   ZC               2447 non-null   int64  
 3   CITY             1708 non-null   object 
 4   BOUROUGHS_STATE  2447 non-null   object 
 5   LAT              2447 non-null   float64
 6   LONG             2447 non-null   float64
 7   num_calls        2447 non-null   object 
dtypes: float64(2), int64(2), object(4)
memory usage: 153.1+ KB


### Checking the dimensions (number of rows and columns) in the dataset

- [pandas.DataFrame.shape](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.shape.html)

In [13]:
# Chain on the `shape` attribute to the Pandas DataFrame variable
# The output is a tuple containing the number of rows and columns in the DataFrame
cc.shape

# Chain on the `columns` attribute to the Pandas DataFrame variable
# The output is the name of the columns in the Pandas DataFrame
cc.columns


# Chain on the `index` attribute to the Pandas DataFrame variable
# The output is the row index of the DataFrame. For the default RangeIndex
# it shows the start, stop, and step values
cc.index



RangeIndex(start=0, stop=2447, step=1)

### Checking the data types in the dataset

- [Pandas DataTypes](https://pbpython.com/pandas_dtypes.html)
![alt text](assets/data_types.png)

In [15]:
# Output the data types in the DataFrame using the `dtypes` attribute
# Common datatypes are int64, float64, object, and many more
cc.dtypes

index                int64
LOC                 object
ZC                   int64
CITY                object
BOUROUGHS_STATE     object
LAT                float64
LONG               float64
num_calls           object
dtype: object

### Check the number of unique values in each column (Series) of a Pandas DataFrame or in a single Series

- [pandas.DataFrame.nunique](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.nunique.html)
- [pandas.Series.nunique](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.nunique.html)
      - Count the number of distinct observations over a requested axis.


In [17]:
# The `.nunique()` method used on the DataFrame outputs the number of
# unique values in each column (Pandas Series) of the DataFrame
cc.nunique().sort_values(ascending=False)

index              2447
LAT                2390
LONG               2389
num_calls           190
ZC                  159
CITY                 40
BOUROUGHS_STATE       6
LOC                   2
dtype: int64

- [pandas.DataFrame.drop](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop.html)
      - Drop specified labels from rows or columns.

In [19]:
# From the documentation...
# axis{0 or ‘index’, 1 or ‘columns’}, default 0   <---- this is key
# Whether to drop labels from the index (0 or ‘index’) or columns (1 or ‘columns’).

# The default of axis=0 drops rows, so to drop a column, such as the seemingly
# unneeded "index" column, either pass the label with axis=1 or use the
# `columns` parameter, which is equivalent to passing axis=1:
# cc.drop('index', axis=1)

# `drop` returns a new DataFrame, so assign the result back to "cc"
cc = cc.drop(columns='index')
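
To make the `axis` behaviour concrete, here is a minimal sketch on a made-up two-row DataFrame (the name `toy` and its values are hypothetical). It shows that `columns=` and `axis=1` give the same result and that the original DataFrame is left untouched unless the result is assigned back.

```python
import pandas as pd

# Hypothetical DataFrame with a leftover "index" column
toy = pd.DataFrame({"index": [1, 2], "ZC": [10308, 10012]})

# Two equivalent ways to drop a column
by_columns = toy.drop(columns="index")
by_axis = toy.drop("index", axis=1)
print(by_columns.equals(by_axis))

# drop returns a new DataFrame; "toy" still has both columns
print(toy.columns.tolist())
```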


In [118]:
# Check the columns again using the appropriate attribute

### Checking for missing values 

- [pandas.DataFrame.isnull](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.isnull.html)
      - Detects missing values

In [21]:
# Chaining on the "isnull" method returns a DataFrame of Booleans:
# True if the value is missing (NaN), False if the value is not missing
# NaN = Not a Number
cc.isnull()


# Chaining on the .sum() method totals up those Booleans for each column
# True -> 1, False -> 0
cc.isnull().sum()


LOC                  0
ZC                   0
CITY               739
BOUROUGHS_STATE      0
LAT                  0
LONG                 0
num_calls            0
dtype: int64

In [23]:
# Now we can filter "cc.isnull().sum()" down to only the columns that have missing values
cc.isnull().sum()[cc.isnull().sum()!=0].sort_values(ascending=False)

CITY    739
dtype: int64
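
The same filter can be written by storing the counts once and reusing them, which avoids computing `isnull().sum()` twice. A minimal sketch on a made-up DataFrame (`toy` and its values are hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame with two missing CITY values
toy = pd.DataFrame({
    "CITY": ["NEW YORK", np.nan, np.nan],
    "ZC": [10012, 11220, 11211],
})

# Count the missing values per column once, then keep only the non-zero counts
missing = toy.isnull().sum()
only_missing = missing[missing > 0].sort_values(ascending=False)
print(only_missing)
```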

### Checking for duplicate rows
- [pandas.DataFrame.duplicated](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.duplicated.html)
      - Returns a Series of booleans denoting duplicate rows.
      
      - Key Parameter:
           - keep{‘first’, ‘last’, False}, default ‘first’: Determines which duplicates 
             (if any) to mark.
           - first : Mark duplicates as True except for the first occurrence.
           - last : Mark duplicates as True except for the last occurrence.
           - False : Mark all duplicates as True.

In [126]:
cc.duplicated().sum()

7
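
The three `keep` options are easiest to compare side by side. The sketch below uses a made-up four-row DataFrame (`toy` is hypothetical) in which rows 0, 1, and 3 are identical.

```python
import pandas as pd

# Hypothetical data: rows 0, 1, and 3 are exact copies of each other
toy = pd.DataFrame({
    "ZC": [10012, 10012, 10308, 10012],
    "num_calls": ["18", "18", "21", "18"],
})

first = toy.duplicated(keep="first")  # flags every copy except the first one
last = toy.duplicated(keep="last")    # flags every copy except the last one
every = toy.duplicated(keep=False)    # flags all rows that have a copy

print(first.tolist())
print(last.tolist())
print(every.tolist())
```

Summing any of these Boolean Series with `.sum()` gives the number of flagged rows.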

In [25]:
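# Note: `.sum` is written here without parentheses, so the output below is the
# bound method itself rather than a count. Call `.sum()` to get the number of duplicates.
cc.duplicated(keep='first').sum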

<bound method Series.sum of 0       False
1       False
2       False
3       False
4       False
        ...  
2442    False
2443    False
2444    False
2445    False
2446    False
Length: 2447, dtype: bool>

### Dropping duplicate rows
- [pandas.DataFrame.drop_duplicates](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop_duplicates.html)
      - Returns a DataFrame with duplicate rows removed.

In [27]:
cc.duplicated()

0       False
1       False
2       False
3       False
4       False
        ...  
2442    False
2443    False
2444    False
2445    False
2446    False
Length: 2447, dtype: bool

In [29]:
# Use the "drop_duplicates" method and then either assign it back to the DataFrame
# variable name OR explore the inplace parameter and override the default

cc_dropdup=cc.drop_duplicates()
cc_dropdup


Unnamed: 0,LOC,ZC,CITY,BOUROUGHS_STATE,LAT,LONG,num_calls
0,Club/Bar/Restaurant,10308,STATEN ISLAND,"STATEN ISLAND, NY",40.544096,-74.141155,0
1,Club/Bar/Restaurant,10012,NEW YORK,"MANHATTAN, NY",40.729793,-73.998842,18
2,Club/Bar/Restaurant,10308,STATEN ISLAND,"STATEN ISLAND, NY",40.544209,-74.141040,21
3,Club/Bar/Restaurant,10034,New York,"MANHATTAN, NY",40.866376,-73.928258,160
4,Club/Bar/Restaurant,11220,,"BROOKLYN, NY",40.635207,-74.020285,17
...,...,...,...,...,...,...,...
2442,Club/Bar/Restaurant,11211,,"BROOKLYN, NY",40.711765,-73.942687,16
2443,Club/Bar/Restaurant,11104,SUNNYSIDE,"QUEENS, NY",40.740725,-73.923911,17
2444,Club/Bar/Restaurant,10012,NEW YORK,"MANHATTAN, NY",40.729859,-74.000592,17
2445,Club/Bar/Restaurant,10304,STATEN ISLAND,"STATEN ISLAND, NY",40.628744,-74.079935,11


**Always good to check your work...**

- [pandas.DataFrame.shape](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.shape.html)
      - Returns a tuple representing the dimensionality (rows, columns) of the DataFrame.

In [31]:
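# We can use the `shape` attribute again to check the number of rows
# and columns in the DataFrame as we explore the data/make changes to it.
# Note that "cc" still has all 2447 rows: the de-duplicated result was
# assigned to "cc_dropdup", so check `cc_dropdup.shape` to see the effect.
cc.shape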

(2447, 7)

In [33]:
# Find the rows where num_calls contains a "?"
# The "?" is a special character in regular expressions, so the "\" escapes it
# and the pattern searches for a literal "?" in the data
# (the doubled "\\" is needed in a regular string; a raw string r'\?' also works)

# Use the handy "len" (length) function to get a count of the rows that contain a "?"
len(cc[cc['num_calls'].str.contains('\\?')])

22
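
For comparison, here is a minimal sketch of two ways to search for a literal question mark, using a made-up Series (`calls` and its values are hypothetical). One escapes the `?` in a raw string, the other turns off regular expressions with `regex=False`.

```python
import pandas as pd

# Hypothetical values similar to those in the num_calls column
calls = pd.Series(["18", "?", "1?", "160"])

# Escape the "?" so the regular expression matches it literally
escaped = calls.str.contains(r"\?")

# Or skip regular expressions and search for the plain substring
literal = calls.str.contains("?", regex=False)

print(escaped.tolist())
print(literal.tolist())
```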

### Use the NumPy function `np.where()` to replace the `"?"` and `"1?"` values with NaN (Not a Number)

- [numpy.where](https://numpy.org/doc/stable/reference/generated/numpy.where.html)
      - Return elements chosen from x or y depending on a condition.
      - Common syntax involves three values:
          1. condition
          2. what to return if condition is True
          3. what to return if condition is False

In [143]:
cc['num_calls'] = np.where(cc['num_calls'].str.contains('\\?'), np.nan, cc['num_calls'])
cc.head()

Unnamed: 0,LOC,ZC,CITY,BOUROUGHS_STATE,LAT,LONG,num_calls
0,Club/Bar/Restaurant,10308,STATEN ISLAND,"STATEN ISLAND, NY",40.544096,-74.141155,0
1,Club/Bar/Restaurant,10012,NEW YORK,"MANHATTAN, NY",40.729793,-73.998842,18
2,Club/Bar/Restaurant,10308,STATEN ISLAND,"STATEN ISLAND, NY",40.544209,-74.14104,21
3,Club/Bar/Restaurant,10034,New York,"MANHATTAN, NY",40.866376,-73.928258,160
4,Club/Bar/Restaurant,11220,,"BROOKLYN, NY",40.635207,-74.020285,17


In [35]:
# Check the number of missing values in the "num_calls" column

cc['num_calls'].isnull().sum()

0
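
The sketch below walks through the `np.where()` replacement on a made-up Series (`calls` is hypothetical) and then converts the result to numbers. It also shows `pd.to_numeric(..., errors="coerce")` as an alternative that this notebook does not use: it turns anything that cannot be parsed into NaN in one step.

```python
import numpy as np
import pandas as pd

# Hypothetical values similar to those in the num_calls column
calls = pd.Series(["0", "18", "?", "1?", "160"])

# condition, value if True, value if False
cleaned = pd.Series(np.where(calls.str.contains(r"\?"), np.nan, calls))
print(cleaned.isnull().sum())

# Once the "?" entries are NaN, the remaining strings convert to numbers
as_numbers = pd.to_numeric(cleaned)

# Alternative: coerce every unparseable entry to NaN directly
coerced = pd.to_numeric(calls, errors="coerce")
print(as_numbers.equals(coerced))
```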

### Split the values in the `BOUROUGHS_STATE` column into two separate columns, then concatenate those two columns with the original Pandas DataFrame

- [pandas.Series.str.split](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.split.html)
      - Key Parameters:
        - pat: String or regular expression to split on. If not specified, 
               the method will split on whitespace.
        - expand (default False): Expand the split strings into separate columns.
             - If True, return DataFrame/MultiIndex expanding dimensionality.
             - If False, return Series/Index, containing lists of strings.
                 
- [pandas.concat](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html)
      - Key Parameter: axis {0 = "index", 1 = "columns"}, default 0
         - References the axis to concatenate along.

In [37]:
cc.columns

Index(['LOC', 'ZC', 'CITY', 'BOUROUGHS_STATE', 'LAT', 'LONG', 'num_calls'], dtype='object')

In [41]:
# Split the 'BOUROUGHS_STATE' column into two separate columns 
split_col=cc['BOUROUGHS_STATE'].str.split(', ', expand=True)

# Rename the new columns
split_col.columns = ['Borough', 'State']

# Concatenate the new columns with the original DataFrame side by side (axis=1)
# and assign the result to a new variable
# Conceptually similar to using "Text to Columns" in Excel
split_cc = pd.concat([split_col, cc], axis=1)
split_cc

Unnamed: 0,Borough,State,LOC,ZC,CITY,BOUROUGHS_STATE,LAT,LONG,num_calls
0,STATEN ISLAND,NY,Club/Bar/Restaurant,10308,STATEN ISLAND,"STATEN ISLAND, NY",40.544096,-74.141155,0
1,MANHATTAN,NY,Club/Bar/Restaurant,10012,NEW YORK,"MANHATTAN, NY",40.729793,-73.998842,18
2,STATEN ISLAND,NY,Club/Bar/Restaurant,10308,STATEN ISLAND,"STATEN ISLAND, NY",40.544209,-74.141040,21
3,MANHATTAN,NY,Club/Bar/Restaurant,10034,New York,"MANHATTAN, NY",40.866376,-73.928258,160
4,BROOKLYN,NY,Club/Bar/Restaurant,11220,,"BROOKLYN, NY",40.635207,-74.020285,17
...,...,...,...,...,...,...,...,...,...
2442,BROOKLYN,NY,Club/Bar/Restaurant,11211,,"BROOKLYN, NY",40.711765,-73.942687,16
2443,QUEENS,NY,Club/Bar/Restaurant,11104,SUNNYSIDE,"QUEENS, NY",40.740725,-73.923911,17
2444,MANHATTAN,NY,Club/Bar/Restaurant,10012,NEW YORK,"MANHATTAN, NY",40.729859,-74.000592,17
2445,STATEN ISLAND,NY,Club/Bar/Restaurant,10304,STATEN ISLAND,"STATEN ISLAND, NY",40.628744,-74.079935,11
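
To see what `expand` changes, here is a minimal sketch on a made-up two-row DataFrame (`toy` is hypothetical; the borough strings follow the format shown above).

```python
import pandas as pd

# Hypothetical two-row DataFrame in the same "BOROUGH, STATE" format
toy = pd.DataFrame({
    "BOUROUGHS_STATE": ["STATEN ISLAND, NY", "MANHATTAN, NY"],
    "ZC": [10308, 10012],
})

# expand=False (the default) returns a Series of lists
as_lists = toy["BOUROUGHS_STATE"].str.split(", ")
print(as_lists.iloc[0])

# expand=True returns a DataFrame with one column per piece
parts = toy["BOUROUGHS_STATE"].str.split(", ", expand=True)
parts.columns = ["Borough", "State"]

# axis=1 places the new columns next to the original ones, matched on the row index
combined = pd.concat([parts, toy], axis=1)
print(combined.columns.tolist())
```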


### Drop a column (or columns) from a Pandas DataFrame

- [pandas.DataFrame.drop](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.drop.html)
      - Key Parameter: axis{0 --> "index", 1 --> "columns"}, default 0
         - Whether to drop labels from the index (0 or ‘index’) or 
           columns (1 or ‘columns’).

In [43]:
# Drop the original "BOUROUGHS_STATE" column from the DataFrame
# `drop` returns a new DataFrame; assign the result to a variable to keep the change
split_cc.drop(columns=['BOUROUGHS_STATE'])

Unnamed: 0,Borough,State,LOC,ZC,CITY,LAT,LONG,num_calls
0,STATEN ISLAND,NY,Club/Bar/Restaurant,10308,STATEN ISLAND,40.544096,-74.141155,0
1,MANHATTAN,NY,Club/Bar/Restaurant,10012,NEW YORK,40.729793,-73.998842,18
2,STATEN ISLAND,NY,Club/Bar/Restaurant,10308,STATEN ISLAND,40.544209,-74.141040,21
3,MANHATTAN,NY,Club/Bar/Restaurant,10034,New York,40.866376,-73.928258,160
4,BROOKLYN,NY,Club/Bar/Restaurant,11220,,40.635207,-74.020285,17
...,...,...,...,...,...,...,...,...
2442,BROOKLYN,NY,Club/Bar/Restaurant,11211,,40.711765,-73.942687,16
2443,QUEENS,NY,Club/Bar/Restaurant,11104,SUNNYSIDE,40.740725,-73.923911,17
2444,MANHATTAN,NY,Club/Bar/Restaurant,10012,NEW YORK,40.729859,-74.000592,17
2445,STATEN ISLAND,NY,Club/Bar/Restaurant,10304,STATEN ISLAND,40.628744,-74.079935,11


### Rename columns in a Pandas DataFrame

- [pandas.DataFrame.rename](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.rename.html)
      - Accepts a dictionary of key/value pairs to rename columns in a Pandas DataFrame
      - Key Parameter: columns

In [193]:
# Pass a dictionary of key:value pairs (old name: new name) to the columns parameter of the "rename" method
cc.rename(columns={'BOUROUGHS_STATE': 'COUNTY',
                   'LOC': 'LOCATION',
                   'ZC': 'ZIPCODE'}, inplace=True)
cc

Unnamed: 0,LOCATION,ZIPCODE,CITY,COUNTY,LAT,LONG,num_calls
0,Club/Bar/Restaurant,10308,STATEN ISLAND,"STATEN ISLAND, NY",40.544096,-74.141155,0
1,Club/Bar/Restaurant,10012,NEW YORK,"MANHATTAN, NY",40.729793,-73.998842,18
2,Club/Bar/Restaurant,10308,STATEN ISLAND,"STATEN ISLAND, NY",40.544209,-74.141040,21
3,Club/Bar/Restaurant,10034,New York,"MANHATTAN, NY",40.866376,-73.928258,160
4,Club/Bar/Restaurant,11220,,"BROOKLYN, NY",40.635207,-74.020285,17
...,...,...,...,...,...,...,...
2442,Club/Bar/Restaurant,11211,,"BROOKLYN, NY",40.711765,-73.942687,16
2443,Club/Bar/Restaurant,11104,SUNNYSIDE,"QUEENS, NY",40.740725,-73.923911,17
2444,Club/Bar/Restaurant,10012,NEW YORK,"MANHATTAN, NY",40.729859,-74.000592,17
2445,Club/Bar/Restaurant,10304,STATEN ISLAND,"STATEN ISLAND, NY",40.628744,-74.079935,11


In [None]:
# Check your work


### Reorder columns in a Pandas DataFrame

- [pandas.DataFrame.reindex](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.reindex.html)
      - Can be used to reorder columns in a Pandas DataFrame
      - Key Parameter: columns

In [197]:
# Pass a list of column names (as strings) to the columns parameter of the "reindex" method
# `reindex` returns a new DataFrame; assign the result back to keep the new column order

cc.reindex(columns=['LOCATION','CITY','COUNTY','ZIPCODE','LAT','LONG','num_calls'])

Unnamed: 0,LOCATION,CITY,COUNTY,ZIPCODE,LAT,LONG,num_calls
0,Club/Bar/Restaurant,STATEN ISLAND,"STATEN ISLAND, NY",10308,40.544096,-74.141155,0
1,Club/Bar/Restaurant,NEW YORK,"MANHATTAN, NY",10012,40.729793,-73.998842,18
2,Club/Bar/Restaurant,STATEN ISLAND,"STATEN ISLAND, NY",10308,40.544209,-74.141040,21
3,Club/Bar/Restaurant,New York,"MANHATTAN, NY",10034,40.866376,-73.928258,160
4,Club/Bar/Restaurant,,"BROOKLYN, NY",11220,40.635207,-74.020285,17
...,...,...,...,...,...,...,...
2442,Club/Bar/Restaurant,,"BROOKLYN, NY",11211,40.711765,-73.942687,16
2443,Club/Bar/Restaurant,SUNNYSIDE,"QUEENS, NY",11104,40.740725,-73.923911,17
2444,Club/Bar/Restaurant,NEW YORK,"MANHATTAN, NY",10012,40.729859,-74.000592,17
2445,Club/Bar/Restaurant,STATEN ISLAND,"STATEN ISLAND, NY",10304,40.628744,-74.079935,11


In [195]:
# Check your work
cc.columns

Index(['LOCATION', 'ZIPCODE', 'CITY', 'COUNTY', 'LAT', 'LONG', 'num_calls'], dtype='object')
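
The column order above is unchanged because `reindex` returns a new DataFrame rather than modifying the original. A minimal sketch on a made-up one-row DataFrame (`toy` is hypothetical) shows this, along with one thing to watch for: a misspelled column name does not raise an error but produces a new column full of NaN.

```python
import pandas as pd

# Hypothetical one-row DataFrame
toy = pd.DataFrame({
    "LOCATION": ["Club/Bar/Restaurant"],
    "ZIPCODE": [10308],
    "CITY": ["STATEN ISLAND"],
})

# The reordered result has to be captured in a variable
reordered = toy.reindex(columns=["LOCATION", "CITY", "ZIPCODE"])
print(toy.columns.tolist())
print(reordered.columns.tolist())

# A name that does not exist ("CITYY") becomes an all-NaN column
typo = toy.reindex(columns=["LOCATION", "CITYY"])
print(typo["CITYY"].isna().all())
```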

### Use `.describe()` to output descriptive statistics for the numeric columns in the Pandas DataFrame

- [pandas.DataFrame.describe](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.describe.html)
      - Descriptive statistics summarizing the central tendency, dispersion and 
        shape of a dataset’s distribution, excluding NaN values.

In [209]:
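# The `describe()` method outputs descriptive statistics by default for the
# numeric columns in the Pandas DataFrame.
# "num_calls" is still stored as an object (string) column, so the output below
# shows count, unique, top, and freq instead of the mean, std, and quartiles.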

cc['num_calls'].describe()

count     2425
unique     188
top         10
freq       187
Name: num_calls, dtype: object
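
The difference between the two kinds of summary can be seen on a made-up Series (`calls` is hypothetical). Converting with `pd.to_numeric` is one possible way to get the numeric summary; the notebook itself does not perform this conversion.

```python
import pandas as pd

# Hypothetical call counts stored as text, with one missing value
calls = pd.Series(["10", "10", "18", None], name="num_calls", dtype=object)

# Object columns are summarized as categories
as_text = calls.describe()
print(as_text.index.tolist())

# After conversion to numbers the summary shows mean, std, and quartiles
as_number = pd.to_numeric(calls).describe()
print(as_number.index.tolist())
```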

In [None]:
# Setting a condition outputs a series of Booleans. 
# True if the condition is met, False if not


In [223]:
# Filtering the "cc" DataFrame by the condition of "cc['LAT'] <= 0"
# will return the rows where that condition is met
# (the latitude value is less than or equal to zero)

cc[cc['LAT']<=0]

Unnamed: 0,LOCATION,ZIPCODE,CITY,COUNTY,LAT,LONG,num_calls
1048,Club/Bar/Restaurant,10036,NEW YORK,<function is_efficient at 0x16468c4a0>,-40.764281,-73.998581,10


In [227]:
# Now just flipping the sign around for the "LONG" Series (column) to output
# the rows where the longitude value is greater than or equal to zero
cc[cc['LONG']>=0]

Unnamed: 0,LOCATION,ZIPCODE,CITY,COUNTY,LAT,LONG,num_calls
2339,Club/Bar/Restaurant,11237,,,40.704007,73.930575,14


**`.loc` and `.iloc`**
  - [Indexing and selecting data](https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html)
  - [Handy .loc and .iloc resource](https://www.shanelynn.ie/select-pandas-dataframe-rows-and-columns-using-iloc-loc-and-ix/)

In [233]:
# Fix the erroneous value in the "LONG" column
# New York City longitudes are negative (west of the prime meridian), so flip the sign
cc.loc[2339,'LONG']=cc.loc[2339,'LONG']*-1

In [239]:
# Fix the erroneous value in the "LAT" column
# New York City latitudes are positive (north of the equator), so flip the sign
cc.loc[1048,'LAT']=cc.loc[1048,'LAT']*-1
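
Multiplying a single cell by -1 flips it back if the cell is run a second time. One possible alternative, sketched here on made-up coordinates (`toy` is hypothetical), is to select the bad rows with a Boolean mask inside `.loc`, so that re-running the fix changes nothing. It assumes every valid point has a positive latitude and a negative longitude, which holds for New York City.

```python
import pandas as pd

# Hypothetical coordinates: row 1 has a flipped latitude, row 2 a flipped longitude
toy = pd.DataFrame({
    "LAT": [40.544096, -40.764281, 40.704007],
    "LONG": [-74.141155, -73.998581, 73.930575],
})

def fix_signs(df):
    """Flip only the values with the wrong sign; safe to call repeatedly."""
    bad_lat = df["LAT"] < 0
    df.loc[bad_lat, "LAT"] = -df.loc[bad_lat, "LAT"]
    bad_long = df["LONG"] > 0
    df.loc[bad_long, "LONG"] = -df.loc[bad_long, "LONG"]

fix_signs(toy)
fix_signs(toy)  # the second call finds nothing left to flip
print(toy)
```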

**Always good to check your work...**

In [241]:
# Check the "LAT" column again for any erroneous values
cc[cc['LAT']<=0]

Unnamed: 0,LOCATION,ZIPCODE,CITY,COUNTY,LAT,LONG,num_calls


In [243]:
# Check the "LONG" column again for any erroneous values
cc[cc['LONG']>=0]

Unnamed: 0,LOCATION,ZIPCODE,CITY,COUNTY,LAT,LONG,num_calls
1048,Club/Bar/Restaurant,10036,NEW YORK,<function is_efficient at 0x16468c4a0>,40.764281,40.764281,10
2339,Club/Bar/Restaurant,11237,,,40.704007,40.764281,14


### We can write all this work back to a csv file if we wish to. 

- [pandas.DataFrame.to_csv](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_csv.html)
      - Key Parameter: index (bool), default True
         - Write row names (index)

In [245]:
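# Write the cleaned up Pandas DataFrame to a comma-separated values (csv) file
# Note: `to_csv` must be called with parentheses and a file path to write anything;
# without them, as here, the output is just the bound method
cc.to_csv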

<bound method NDFrame.to_csv of                  LOCATION  ZIPCODE           CITY COUNTY        LAT  \
0     Club/Bar/Restaurant    10308  STATEN ISLAND   None  40.544096   
1     Club/Bar/Restaurant    10012       NEW YORK   None  40.729793   
2     Club/Bar/Restaurant    10308  STATEN ISLAND   None  40.544209   
3     Club/Bar/Restaurant    10034       New York   None  40.866376   
4     Club/Bar/Restaurant    11220            NaN   None  40.635207   
...                   ...      ...            ...    ...        ...   
2442  Club/Bar/Restaurant    11211            NaN   None  40.711765   
2443  Club/Bar/Restaurant    11104      SUNNYSIDE   None  40.740725   
2444  Club/Bar/Restaurant    10012       NEW YORK   None  40.729859   
2445  Club/Bar/Restaurant    10304  STATEN ISLAND   None  40.628744   
2446  Club/Bar/Restaurant    11423         HOLLIS   None  40.711692   

           LONG num_calls  
0    -74.141155         0  
1    -73.998842        18  
2    -74.141040        21  
3  