
<h1 align=center><font size = 5>Segmenting and Clustering Crimes Against Females in New York City</font></h1>



This notebook was created by Amr Elrasad as part of a Capstone project.

## Introduction

This notebook explores and analyzes incident reports filed with the New York Police Department. The analysis focuses on crimes where the victims were female and on where those crimes took place. The goal is to cluster precincts according to how likely crimes against women are in each of them.

## Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>

1. <a href="#item1">Download and Explore Dataset</a>
2. <a href="#item2">Foursquare API</a>
3. <a href="#item3">Data Clustering</a>

</font>
</div>

Before we get the data and start exploring it, let's import all the dependencies that we will need.

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe (pandas >= 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# clustering algorithms (k-means, DBSCAN) and preprocessing utilities
from sklearn.cluster import KMeans
from sklearn.cluster import DBSCAN
from sklearn import preprocessing
#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


<a id='item1'></a>

In [2]:
# function to normalize dataframe before passing to clustering 
def normalize_df(df):
    x = df.values #returns a numpy array
    min_max_scaler = preprocessing.MinMaxScaler()
    x_scaled = min_max_scaler.fit_transform(x)
    df = pd.DataFrame(x_scaled)
    return df
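
As a rough illustration of what min-max scaling does to each column before clustering, the sketch below rescales a list of numbers to the range [0, 1] in plain Python. The helper name `min_max_scale` and the sample values are illustrative only and are not part of the notebook.

```python
def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # a constant column has no spread; map every entry to 0.0
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


sample = [10.0, 20.0, 40.0]  # illustrative values only
scaled = min_max_scale(sample)
print(scaled)
```

Scaling matters here because latitude and longitude vary by fractions of a degree while crime counts vary by hundreds; without it, the counts would dominate any distance-based clustering.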

## 1. Download and Explore Dataset

New York City has 77 police precincts, and each incident report records the code of the precinct it belongs to.

The US government publishes an open dataset of all incidents reported to the NYPD from 2006 to 2017. It can be accessed from: https://catalog.data.gov/dataset/nypd-complaint-data-historic. The full dataset is about 2 GB, so I took only a subset of it for analysis. I chose to work on data from 2017.


For your convenience, I downloaded the data and the subset to be analyzed is saved in 'data.csv'.

In [3]:
NYPD=pd.read_csv('data.csv')
NYPD.head()


Unnamed: 0,CMPLNT_NUM,CMPLNT_FR_DT,CMPLNT_FR_TM,CMPLNT_TO_DT,CMPLNT_TO_TM,ADDR_PCT_CD,RPT_DT,KY_CD,OFNS_DESC,PD_CD,PD_DESC,CRM_ATPT_CPTD_CD,LAW_CAT_CD,BORO_NM,LOC_OF_OCCUR_DESC,PREM_TYP_DESC,JURIS_DESC,JURISDICTION_CODE,PARKS_NM,HADEVELOPT,HOUSING_PSA,X_COORD_CD,Y_COORD_CD,SUSP_AGE_GROUP,SUSP_RACE,SUSP_SEX,TRANSIT_DISTRICT,Latitude,Longitude,Lat_Lon,PATROL_BORO,STATION_NAME,VIC_AGE_GROUP,VIC_RACE,VIC_SEX
0,619128592,05/25/2017,11:00:00,05/25/2017,11:15:00,48,05/25/2017,351,CRIMINAL MISCHIEF & RELATED OF,259.0,"CRIMINAL MISCHIEF,UNCLASSIFIED 4",COMPLETED,MISDEMEANOR,BRONX,INSIDE,COMMERCIAL BUILDING,N.Y. POLICE DEPT,0.0,,,,1013619.0,247689.0,,,,,40.846484,-73.89385,"(40.846483584, -73.893850279)",PATROL BORO BRONX,,UNKNOWN,UNKNOWN,D
1,699494668,05/25/2017,11:00:00,05/25/2017,12:00:00,71,05/25/2017,106,FELONY ASSAULT,109.0,"ASSAULT 2,1,UNCLASSIFIED",COMPLETED,FELONY,BROOKLYN,INSIDE,RESIDENCE - APT. HOUSE,N.Y. POLICE DEPT,0.0,,,,1003717.0,182563.0,25-44,BLACK,M,,40.667757,-73.929829,"(40.667756807, -73.929828559)",PATROL BORO BKLYN SOUTH,,18-24,BLACK,F
2,103321764,05/25/2017,11:00:00,05/25/2017,11:10:00,104,05/25/2017,352,CRIMINAL TRESPASS,205.0,"TRESPASS 2, CRIMINAL",COMPLETED,MISDEMEANOR,QUEENS,INSIDE,RESIDENCE-HOUSE,N.Y. POLICE DEPT,0.0,,,,1018935.0,205144.0,18-24,WHITE,F,,40.72969,-73.874856,"(40.729689793, -73.874855861)",PATROL BORO QUEENS NORTH,,45-64,UNKNOWN,F
3,137516053,05/25/2017,11:00:00,05/25/2017,11:05:00,113,05/25/2017,341,PETIT LARCENY,333.0,"LARCENY,PETIT FROM STORE-SHOPL",COMPLETED,MISDEMEANOR,QUEENS,INSIDE,FOOD SUPERMARKET,N.Y. POLICE DEPT,0.0,,,,1042058.0,186716.0,25-44,BLACK,M,,40.678989,-73.791585,"(40.67898851, -73.791585115)",PATROL BORO QUEENS SOUTH,,UNKNOWN,UNKNOWN,D
4,451320561,05/25/2017,11:00:00,,,5,05/25/2017,351,CRIMINAL MISCHIEF & RELATED OF,258.0,"CRIMINAL MISCHIEF 4TH, GRAFFIT",COMPLETED,MISDEMEANOR,MANHATTAN,FRONT OF,COMMERCIAL BUILDING,N.Y. POLICE DEPT,0.0,,,,984607.0,199399.0,,,,,40.713989,-73.998714,"(40.713989146, -73.998713673)",PATROL BORO MAN SOUTH,,UNKNOWN,UNKNOWN,E


Now I am going to choose the columns to work on: borough, crime category, precinct code, incident location, and victim sex.

In [4]:
df_data=NYPD[['ADDR_PCT_CD','BORO_NM','Latitude','Longitude','LAW_CAT_CD','VIC_SEX']]
df_data.head()

Unnamed: 0,ADDR_PCT_CD,BORO_NM,Latitude,Longitude,LAW_CAT_CD,VIC_SEX
0,48,BRONX,40.846484,-73.89385,MISDEMEANOR,D
1,71,BROOKLYN,40.667757,-73.929829,FELONY,F
2,104,QUEENS,40.72969,-73.874856,MISDEMEANOR,F
3,113,QUEENS,40.678989,-73.791585,MISDEMEANOR,D
4,5,MANHATTAN,40.713989,-73.998714,MISDEMEANOR,E


In [5]:
len(df_data)

168568

The subset contains 168568 incidents reported in the first 5 months of 2017.

Now let us consider only incidents where the victim was female.

In [6]:
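# keep only incidents with a female victim, then drop the now-constant column
# (chaining avoids modifying a slice in place, which triggers SettingWithCopyWarning)
df_data=df_data[df_data['VIC_SEX']=='F'].drop(columns='VIC_SEX').reset_index(drop=True)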
len(df_data)

66386

There were 66386 incidents against women.
Let us view the data that we will work on.

In [7]:
df_data.head()

Unnamed: 0,ADDR_PCT_CD,BORO_NM,Latitude,Longitude,LAW_CAT_CD
0,71,BROOKLYN,40.667757,-73.929829,FELONY
1,104,QUEENS,40.72969,-73.874856,MISDEMEANOR
2,48,BRONX,40.855168,-73.887904,MISDEMEANOR
3,45,BRONX,40.842202,-73.849757,MISDEMEANOR
4,24,MANHATTAN,40.794953,-73.971437,FELONY


Data types:

In [8]:
df_data.dtypes

ADDR_PCT_CD      int64
BORO_NM         object
Latitude       float64
Longitude      float64
LAW_CAT_CD      object
dtype: object

Now let us drop the rows with missing data.

In [9]:
# drop rows with a missing borough or precinct code
# (a comparison such as `!= np.nan` is always True, so dropna is used instead)
df_data=df_data.dropna(subset=['BORO_NM','ADDR_PCT_CD']).reset_index(drop=True)
print(len(df_data))


66365
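
A general caveat when filtering missing values: NaN never compares equal to anything, including itself, so a filter written as `value != nan` keeps every row. The short sketch below shows this with plain Python floats; the sample values are illustrative. In pandas, `dropna` or `notna` are the usual tools for the same job.

```python
import math

nan = math.nan
values = [48.0, nan, 71.0]  # illustrative precinct codes with one missing entry

# NaN is not equal to anything, so `!= nan` is True for every element
kept_by_comparison = [v for v in values if v != nan]

# math.isnan detects the missing entry explicitly
kept_by_isnan = [v for v in values if not math.isnan(v)]

print(len(kept_by_comparison))  # nothing was filtered out
print(kept_by_isnan)
```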


It is time to one-hot encode the offence category. We will also sort the data by precinct for easier viewing.

In [10]:
df_data_onehot= pd.get_dummies(df_data[['LAW_CAT_CD']], prefix="", prefix_sep="")
df_data=pd.concat([df_data,df_data_onehot],axis=1).drop(['LAW_CAT_CD'],axis=1)
df_data=df_data.sort_values(by=['ADDR_PCT_CD']).reset_index(drop=True)
df_data.head()

Unnamed: 0,ADDR_PCT_CD,BORO_NM,Latitude,Longitude,FELONY,MISDEMEANOR,VIOLATION
0,1,MANHATTAN,40.710229,-74.007746,0,0,1
1,1,MANHATTAN,40.720464,-74.006852,0,1,0
2,1,MANHATTAN,40.715529,-74.00924,0,1,0
3,1,MANHATTAN,40.722855,-74.003375,1,0,0
4,1,MANHATTAN,40.705024,-74.012978,0,1,0
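
To make the encoding step concrete, here is a minimal pure-Python sketch of one-hot encoding. The helper `one_hot` is hypothetical; the labels are the offence categories of the five rows shown in the output above.

```python
def one_hot(labels):
    """Return (categories, rows); each row has a 1 in the column of its label."""
    categories = sorted(set(labels))
    rows = [[1 if label == cat else 0 for cat in categories] for label in labels]
    return categories, rows


# offence categories of the five rows displayed above
labels = ['VIOLATION', 'MISDEMEANOR', 'MISDEMEANOR', 'FELONY', 'MISDEMEANOR']
categories, rows = one_hot(labels)

print(categories)
for row in rows:
    print(row)
```

Each category becomes its own 0/1 column, so summing a column later gives the number of incidents of that category.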


Next, we group the data by precinct and borough, count the total number of crimes in each category, and compute the average location of the incidents so that each precinct can be placed on a map.

In [11]:
df_grouped1=df_data.groupby(by=['ADDR_PCT_CD','BORO_NM']).count().reset_index()
df_grouped2=df_data.groupby(by=['ADDR_PCT_CD','BORO_NM']).mean().reset_index()
df_grouped3=df_data.groupby(by=['ADDR_PCT_CD','BORO_NM']).sum().reset_index()
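
The aggregation can be pictured without pandas. The sketch below, written under the assumption that each record is a plain tuple, groups records by (precinct, borough), averages the coordinates, and sums the one-hot columns. The first three records come from the table above; the fourth is made up for illustration.

```python
from collections import defaultdict

# (precinct, borough, latitude, longitude, felony, misdemeanor, violation)
records = [
    (1, 'MANHATTAN', 40.710229, -74.007746, 0, 0, 1),
    (1, 'MANHATTAN', 40.720464, -74.006852, 0, 1, 0),
    (1, 'MANHATTAN', 40.722855, -74.003375, 1, 0, 0),
    (5, 'MANHATTAN', 40.716000, -73.997000, 1, 0, 0),  # made-up record
]

# collect the rows of each (precinct, borough) pair
groups = defaultdict(list)
for precinct, boro, lat, lon, fel, mis, vio in records:
    groups[(precinct, boro)].append((lat, lon, fel, mis, vio))

# mean for the coordinates, sum for the category indicators
summary = {}
for key, rows in groups.items():
    n = len(rows)
    summary[key] = {
        'Latitude': sum(r[0] for r in rows) / n,
        'Longitude': sum(r[1] for r in rows) / n,
        'FELONY': sum(r[2] for r in rows),
        'MISDEMEANOR': sum(r[3] for r in rows),
        'VIOLATION': sum(r[4] for r in rows),
    }

print(summary[(1, 'MANHATTAN')])
```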


Now let us construct the grouped data frame, which will later be copied to hold the cluster labels.

I also added a score column that reflects both the number of crimes and their severity: each felony scores 5, each misdemeanor scores 3, and each violation scores 2.

In [12]:
df_grouped1['Latitude']=df_grouped2['Latitude']
df_grouped1['Longitude']=df_grouped2['Longitude']
df_grouped1['FELONY']=df_grouped3['FELONY']
df_grouped1['MISDEMEANOR']=df_grouped3['MISDEMEANOR']
df_grouped1['VIOLATION']=df_grouped3['VIOLATION']
df_grouped=df_grouped1
df_grouped['Score']=df_grouped['FELONY']*5+df_grouped['MISDEMEANOR']*3+df_grouped['VIOLATION']*2
del df_grouped1, df_grouped2,df_grouped3

In [13]:
df_grouped

Unnamed: 0,ADDR_PCT_CD,BORO_NM,Latitude,Longitude,FELONY,MISDEMEANOR,VIOLATION,Score
0,1,MANHATTAN,40.714436,-74.00812,193.0,284.0,113.0,2043.0
1,5,MANHATTAN,40.718126,-73.995498,160.0,254.0,87.0,1736.0
2,6,MANHATTAN,40.73379,-74.001023,233.0,205.0,78.0,1936.0
3,7,MANHATTAN,40.716364,-73.984518,163.0,294.0,132.0,1961.0
4,9,MANHATTAN,40.726165,-73.983405,273.0,369.0,102.0,2676.0
5,9,QUEENS,40.726554,-73.987828,1.0,0.0,0.0,5.0
6,10,MANHATTAN,40.747869,-74.000401,177.0,224.0,82.0,1721.0
7,13,MANHATTAN,40.738675,-73.98568,281.0,335.0,123.0,2656.0
8,14,MANHATTAN,40.752727,-73.988398,405.0,381.0,126.0,3420.0
9,17,MANHATTAN,40.752853,-73.971203,178.0,156.0,117.0,1592.0
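
As a quick check of the weighted score (5 per felony, 3 per misdemeanor, 2 per violation), the sketch below recomputes it for precinct 1 from the counts in the first row of the table. The names `WEIGHTS` and `crime_score` are illustrative.

```python
# weights taken from the scoring rule described in the text
WEIGHTS = {'FELONY': 5, 'MISDEMEANOR': 3, 'VIOLATION': 2}


def crime_score(counts):
    """Weighted sum of the per-category incident counts."""
    return sum(WEIGHTS[category] * counts[category] for category in WEIGHTS)


# counts for precinct 1 (first row of the table)
precinct_1 = {'FELONY': 193, 'MISDEMEANOR': 284, 'VIOLATION': 113}
score = crime_score(precinct_1)
print(score)  # 193*5 + 284*3 + 113*2 = 2043
```

The weights are a modelling choice; changing them changes which precincts end up in the same cluster when the score is used as a feature.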


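Now it is time to remove the repeated precinct rows at indices 5, 24, 67, and 74. Index 5, for example, lists precinct 9 under QUEENS with a single incident, although precinct 9 already appears under MANHATTAN.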

In [14]:
df_grouped=df_grouped.drop([5,24,67,74]).reset_index(drop=True)
df_grouped

Unnamed: 0,ADDR_PCT_CD,BORO_NM,Latitude,Longitude,FELONY,MISDEMEANOR,VIOLATION,Score
0,1,MANHATTAN,40.714436,-74.00812,193.0,284.0,113.0,2043.0
1,5,MANHATTAN,40.718126,-73.995498,160.0,254.0,87.0,1736.0
2,6,MANHATTAN,40.73379,-74.001023,233.0,205.0,78.0,1936.0
3,7,MANHATTAN,40.716364,-73.984518,163.0,294.0,132.0,1961.0
4,9,MANHATTAN,40.726165,-73.983405,273.0,369.0,102.0,2676.0
5,10,MANHATTAN,40.747869,-74.000401,177.0,224.0,82.0,1721.0
6,13,MANHATTAN,40.738675,-73.98568,281.0,335.0,123.0,2656.0
7,14,MANHATTAN,40.752727,-73.988398,405.0,381.0,126.0,3420.0
8,17,MANHATTAN,40.752853,-73.971203,178.0,156.0,117.0,1592.0
9,18,MANHATTAN,40.763205,-73.985063,302.0,357.0,184.0,2949.0


The data frame `df_grouped_clusters` holds the grouped data together with the cluster labels.

In [15]:
df_grouped_clusters=df_grouped.copy()
df_grouped_clusters.insert(0,'Cluster Labels',np.nan)
df_grouped_clusters.head()

Unnamed: 0,Cluster Labels,ADDR_PCT_CD,BORO_NM,Latitude,Longitude,FELONY,MISDEMEANOR,VIOLATION,Score
0,,1,MANHATTAN,40.714436,-74.00812,193.0,284.0,113.0,2043.0
1,,5,MANHATTAN,40.718126,-73.995498,160.0,254.0,87.0,1736.0
2,,6,MANHATTAN,40.73379,-74.001023,233.0,205.0,78.0,1936.0
3,,7,MANHATTAN,40.716364,-73.984518,163.0,294.0,132.0,1961.0
4,,9,MANHATTAN,40.726165,-73.983405,273.0,369.0,102.0,2676.0


# 2. Foursquare API

I will use the Foursquare API to get the correct coordinates of each precinct. I have created a list of all precincts on my Foursquare account. I am going to fetch and parse this list to extract the coordinates and update both the grouped and the clustered data frames.

## 2.1 Credentials

In [16]:
CLIENT_ID = 'YOUR_CLIENT_ID' # your Foursquare ID (keep real credentials out of shared notebooks)
CLIENT_SECRET = 'YOUR_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 2000

In [17]:
list_id = '5c8d3fdb1acf11002c667a3a' # ID of the Foursquare list that holds the precincts
url = 'https://api.foursquare.com/v2/lists/{}?client_id={}&client_secret={}&v={}'.format(list_id, CLIENT_ID, CLIENT_SECRET, VERSION) # define URL
# send GET request
results = requests.get(url).json()
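
One possible alternative to formatting the query string by hand is to let the standard library encode the parameters, which also escapes any special characters. This is only a sketch: the helper `build_list_url` and the placeholder credentials are hypothetical, and the endpoint is the one used in the cell above.

```python
from urllib.parse import urlencode


def build_list_url(list_id, client_id, client_secret, version):
    """Build the list endpoint URL with a properly encoded query string."""
    base = 'https://api.foursquare.com/v2/lists/{}'.format(list_id)
    query = urlencode({
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
    })
    return base + '?' + query


# placeholder values, not real credentials
url = build_list_url('LIST_ID', 'MY_ID', 'MY_SECRET', '20180604')
print(url)
```

With `requests`, the same effect can be had by passing the dictionary through the `params` argument of `requests.get`.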

In [18]:
# copy the coordinates of each list item into both data frames
# (assumes the list items are in the same order as the rows of df_grouped)
items = results['response']['list']['listItems']['items']
for i in range(77):
    location = items[i]['venue']['location']['labeledLatLngs'][0]
    lat, lng = location['lat'], location['lng']
    df_grouped.loc[i,'Latitude']=lat
    df_grouped.loc[i,'Longitude']=lng
    df_grouped_clusters.loc[i,'Latitude']=lat
    df_grouped_clusters.loc[i,'Longitude']=lng
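
The parsing step can be tried offline on a small stand-in for the API response. The dictionary below is a hypothetical, heavily trimmed response that mirrors the nesting indexed in the loop above; the fallback to `lat`/`lng` directly under `location` is an assumption about the response shape, added so that a missing `labeledLatLngs` entry does not raise an error.

```python
# hypothetical, trimmed response with the same nesting as the real one
sample_response = {
    'response': {'list': {'listItems': {'items': [
        {'venue': {'location': {
            'labeledLatLngs': [{'lat': 40.720384, 'lng': -74.006939}]}}},
        # second item has no labeledLatLngs (assumed possible)
        {'venue': {'location': {'lat': 40.716097, 'lng': -73.997252}}},
    ]}}}
}


def extract_coordinates(response):
    """Return a list of (lat, lng) tuples, one per list item."""
    coords = []
    for item in response['response']['list']['listItems']['items']:
        location = item['venue']['location']
        labeled = location.get('labeledLatLngs')
        point = labeled[0] if labeled else location
        coords.append((point['lat'], point['lng']))
    return coords


coords = extract_coordinates(sample_response)
print(coords)
```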


In [19]:
df_grouped

Unnamed: 0,ADDR_PCT_CD,BORO_NM,Latitude,Longitude,FELONY,MISDEMEANOR,VIOLATION,Score
0,1,MANHATTAN,40.720384,-74.006939,193.0,284.0,113.0,2043.0
1,5,MANHATTAN,40.716097,-73.997252,160.0,254.0,87.0,1736.0
2,6,MANHATTAN,40.733985,-74.005457,233.0,205.0,78.0,1936.0
3,7,MANHATTAN,40.716392,-73.983726,163.0,294.0,132.0,1961.0
4,9,MANHATTAN,40.726359,-73.988002,273.0,369.0,102.0,2676.0
5,10,MANHATTAN,40.742876,-73.998551,177.0,224.0,82.0,1721.0
6,13,MANHATTAN,40.73698,-73.982771,281.0,335.0,123.0,2656.0
7,14,MANHATTAN,40.75383,-73.99505,405.0,381.0,126.0,3420.0
8,17,MANHATTAN,40.756657,-73.970653,178.0,156.0,117.0,1592.0
9,18,MANHATTAN,40.76513,-73.985013,302.0,357.0,184.0,2949.0


In [20]:
df_grouped_clusters

Unnamed: 0,Cluster Labels,ADDR_PCT_CD,BORO_NM,Latitude,Longitude,FELONY,MISDEMEANOR,VIOLATION,Score
0,,1,MANHATTAN,40.720384,-74.006939,193.0,284.0,113.0,2043.0
1,,5,MANHATTAN,40.716097,-73.997252,160.0,254.0,87.0,1736.0
2,,6,MANHATTAN,40.733985,-74.005457,233.0,205.0,78.0,1936.0
3,,7,MANHATTAN,40.716392,-73.983726,163.0,294.0,132.0,1961.0
4,,9,MANHATTAN,40.726359,-73.988002,273.0,369.0,102.0,2676.0
5,,10,MANHATTAN,40.742876,-73.998551,177.0,224.0,82.0,1721.0
6,,13,MANHATTAN,40.73698,-73.982771,281.0,335.0,123.0,2656.0
7,,14,MANHATTAN,40.75383,-73.99505,405.0,381.0,126.0,3420.0
8,,17,MANHATTAN,40.756657,-73.970653,178.0,156.0,117.0,1592.0
9,,18,MANHATTAN,40.76513,-73.985013,302.0,357.0,184.0,2949.0


# 3. Data Clustering

## 3.1 Map

This function produces a folium map with one marker per precinct, colored by cluster label.

In [21]:
def create_folium_map():
    # create a map centred on New York City
    map_clusters = folium.Map(location=[40.7308619, -73.9871558], zoom_start=11)

    # assign one colour per distinct label, so that DBSCAN's noise label (-1)
    # and any number of clusters each get their own colour
    labels = sorted(df_grouped_clusters['Cluster Labels'].unique())
    colors_array = cm.rainbow(np.linspace(0, 1, len(labels)))
    rainbow = {label: colors.rgb2hex(c) for label, c in zip(labels, colors_array)}

    # add one marker per precinct
    for lat, lon, poi, boro, cluster,score in zip(df_grouped_clusters['Latitude'], df_grouped_clusters['Longitude'], df_grouped_clusters['ADDR_PCT_CD'],df_grouped_clusters['BORO_NM'], df_grouped_clusters['Cluster Labels'],df_grouped_clusters['Score']):
        label = folium.Popup('precinct No.  '+str(poi) +', '+boro+ ' Cluster ' + str(cluster)+ ', score= '+ str(score), parse_html=True)
        folium.CircleMarker(
            [lat, lon],
            radius=5,
            popup=label,
            color=rainbow[cluster],
            fill=True,
            fill_color=rainbow[cluster],
            fill_opacity=0.7).add_to(map_clusters)
    return map_clusters

## 3.2 K-means

K-means clustering is used to cluster data based on the 'df_grouped' data frame. 

### 3.2.1 Clustering based on location and number of each crime category 

In [22]:
# set number of clusters
kclusters = 3

# run k-means clustering
# normalizing values first
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(normalize_df(df_grouped[['Latitude','Longitude','FELONY','MISDEMEANOR','VIOLATION']]))

# add clustering labels
df_grouped_clusters['Cluster Labels']= kmeans.labels_
# view clusters data frame
df_grouped_clusters # check the first column!

Unnamed: 0,Cluster Labels,ADDR_PCT_CD,BORO_NM,Latitude,Longitude,FELONY,MISDEMEANOR,VIOLATION,Score
0,1,1,MANHATTAN,40.720384,-74.006939,193.0,284.0,113.0,2043.0
1,1,5,MANHATTAN,40.716097,-73.997252,160.0,254.0,87.0,1736.0
2,1,6,MANHATTAN,40.733985,-74.005457,233.0,205.0,78.0,1936.0
3,1,7,MANHATTAN,40.716392,-73.983726,163.0,294.0,132.0,1961.0
4,1,9,MANHATTAN,40.726359,-73.988002,273.0,369.0,102.0,2676.0
5,1,10,MANHATTAN,40.742876,-73.998551,177.0,224.0,82.0,1721.0
6,1,13,MANHATTAN,40.73698,-73.982771,281.0,335.0,123.0,2656.0
7,1,14,MANHATTAN,40.75383,-73.99505,405.0,381.0,126.0,3420.0
8,1,17,MANHATTAN,40.756657,-73.970653,178.0,156.0,117.0,1592.0
9,1,18,MANHATTAN,40.76513,-73.985013,302.0,357.0,184.0,2949.0


Now let us visualize the clustered data on a map.

In [23]:
create_folium_map()

### 3.2.2 Clustering based on location and crime score 

In [24]:
# set number of clusters
kclusters = 3

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(normalize_df(df_grouped[['Latitude','Longitude','Score']]))

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:30] 
# add clustering labels
df_grouped_clusters['Cluster Labels']= kmeans.labels_

df_grouped_clusters 

Unnamed: 0,Cluster Labels,ADDR_PCT_CD,BORO_NM,Latitude,Longitude,FELONY,MISDEMEANOR,VIOLATION,Score
0,1,1,MANHATTAN,40.720384,-74.006939,193.0,284.0,113.0,2043.0
1,1,5,MANHATTAN,40.716097,-73.997252,160.0,254.0,87.0,1736.0
2,1,6,MANHATTAN,40.733985,-74.005457,233.0,205.0,78.0,1936.0
3,1,7,MANHATTAN,40.716392,-73.983726,163.0,294.0,132.0,1961.0
4,1,9,MANHATTAN,40.726359,-73.988002,273.0,369.0,102.0,2676.0
5,1,10,MANHATTAN,40.742876,-73.998551,177.0,224.0,82.0,1721.0
6,1,13,MANHATTAN,40.73698,-73.982771,281.0,335.0,123.0,2656.0
7,2,14,MANHATTAN,40.75383,-73.99505,405.0,381.0,126.0,3420.0
8,1,17,MANHATTAN,40.756657,-73.970653,178.0,156.0,117.0,1592.0
9,2,18,MANHATTAN,40.76513,-73.985013,302.0,357.0,184.0,2949.0


In [25]:
create_folium_map()

### 3.2.3 Clustering based on crime score

In [26]:
# set number of clusters
kclusters = 3

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(normalize_df(df_grouped[['Score']]))

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:30] 
# add clustering labels
df_grouped_clusters['Cluster Labels']= kmeans.labels_

df_grouped_clusters 

Unnamed: 0,Cluster Labels,ADDR_PCT_CD,BORO_NM,Latitude,Longitude,FELONY,MISDEMEANOR,VIOLATION,Score
0,0,1,MANHATTAN,40.720384,-74.006939,193.0,284.0,113.0,2043.0
1,0,5,MANHATTAN,40.716097,-73.997252,160.0,254.0,87.0,1736.0
2,0,6,MANHATTAN,40.733985,-74.005457,233.0,205.0,78.0,1936.0
3,0,7,MANHATTAN,40.716392,-73.983726,163.0,294.0,132.0,1961.0
4,2,9,MANHATTAN,40.726359,-73.988002,273.0,369.0,102.0,2676.0
5,0,10,MANHATTAN,40.742876,-73.998551,177.0,224.0,82.0,1721.0
6,2,13,MANHATTAN,40.73698,-73.982771,281.0,335.0,123.0,2656.0
7,2,14,MANHATTAN,40.75383,-73.99505,405.0,381.0,126.0,3420.0
8,0,17,MANHATTAN,40.756657,-73.970653,178.0,156.0,117.0,1592.0
9,2,18,MANHATTAN,40.76513,-73.985013,302.0,357.0,184.0,2949.0
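
To show what k-means does on a single feature, here is a minimal sketch of Lloyd's algorithm in one dimension, applied to the ten scores displayed above. It is not scikit-learn's implementation: the function `kmeans_1d` is hypothetical, the starting centroids (minimum, midpoint, maximum) are an assumption, and the result covers only the ten-row excerpt, so it need not match the labels in the table, which come from all precincts.

```python
def kmeans_1d(values, centroids, max_iter=100):
    """Lloyd's algorithm on a list of numbers, starting from the given centroids."""
    centroids = list(centroids)
    labels = [0] * len(values)
    for _ in range(max_iter):
        # assignment step: each value joins its nearest centroid
        labels = [min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))
                  for v in values]
        # update step: each centroid moves to the mean of its members
        new_centroids = []
        for k in range(len(centroids)):
            members = [v for v, lab in zip(values, labels) if lab == k]
            new_centroids.append(sum(members) / len(members) if members else centroids[k])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# scores of the ten precincts shown in the table
scores = [2043, 1736, 1936, 1961, 2676, 1721, 2656, 3420, 1592, 2949]
lo, hi = min(scores), max(scores)
labels, centroids = kmeans_1d(scores, [lo, (lo + hi) / 2, hi])
print(labels)
print(centroids)
```

Because min-max scaling is a linear rescaling, running the same procedure on scaled scores would group the values identically. The numbering of the clusters is arbitrary and depends on the initial centroids.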


In [27]:
create_folium_map()

## 3.3 DBSCAN Clustering

Another way to cluster the data is DBSCAN, a density-based algorithm that labels points not belonging to any cluster as noise (-1).

In [28]:
# run DBSCAN clustering
Dbscan = DBSCAN(eps=0.01,min_samples=5).fit(normalize_df(df_grouped[['Score']])) 
# add clustering labels
df_grouped_clusters['Cluster Labels']= Dbscan.labels_
df_grouped_clusters 

Unnamed: 0,Cluster Labels,ADDR_PCT_CD,BORO_NM,Latitude,Longitude,FELONY,MISDEMEANOR,VIOLATION,Score
0,0,1,MANHATTAN,40.720384,-74.006939,193.0,284.0,113.0,2043.0
1,-1,5,MANHATTAN,40.716097,-73.997252,160.0,254.0,87.0,1736.0
2,0,6,MANHATTAN,40.733985,-74.005457,233.0,205.0,78.0,1936.0
3,0,7,MANHATTAN,40.716392,-73.983726,163.0,294.0,132.0,1961.0
4,1,9,MANHATTAN,40.726359,-73.988002,273.0,369.0,102.0,2676.0
5,-1,10,MANHATTAN,40.742876,-73.998551,177.0,224.0,82.0,1721.0
6,1,13,MANHATTAN,40.73698,-73.982771,281.0,335.0,123.0,2656.0
7,-1,14,MANHATTAN,40.75383,-73.99505,405.0,381.0,126.0,3420.0
8,-1,17,MANHATTAN,40.756657,-73.970653,178.0,156.0,117.0,1592.0
9,2,18,MANHATTAN,40.76513,-73.985013,302.0,357.0,184.0,2949.0
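
The two DBSCAN parameters can be understood through its core-point test: a point is a core point when at least `min_samples` points, itself included, lie within distance `eps` of it. The sketch below implements only that test in one dimension, not the full algorithm; the helper `core_points` and the sample values are illustrative.

```python
def core_points(values, eps, min_samples):
    """Flag each value that has at least min_samples values (itself included) within eps."""
    flags = []
    for v in values:
        neighbours = sum(1 for w in values if abs(v - w) <= eps)
        flags.append(neighbours >= min_samples)
    return flags


values = [0.10, 0.11, 0.12, 0.13, 0.50]  # illustrative scaled scores
flags = core_points(values, eps=0.025, min_samples=3)
print(flags)
```

Points that are neither core points nor within `eps` of a core point are the ones reported with the label -1. A small `eps` such as 0.01 on scaled scores makes this test strict, which is consistent with several precincts in the table above being marked as noise.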


In [29]:
create_folium_map()