# CAPSTONE PROJECT - The Choice

## The Choice – NYU v/s UofT 


<P> <B> Business Problem: </B><P>
Charlotte, an international student living in Paris, France, wants to pursue a postgraduate degree in Fine Arts abroad. She has received acceptance letters from both New York University (NYU, U.S.) and the University of Toronto (UofT, Canada). Having never been to either city, she would like to make an informed decision about which school to attend based on the proximity and availability of art-related venues (such as performing arts centres, art galleries, and art stores) to the campus where she will attend classes. She would also like a choice of neighbourhoods to consider for her residence, based on her preferences.


## Import necessary Libraries

In [1]:
import requests # library to handle requests
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation

!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 
    
# function for transforming a JSON file into a pandas dataframe
from pandas.io.json import json_normalize

!conda install -c conda-forge folium=0.5.0 --yes
import folium # plotting library

print('Folium installed')
print('Libraries imported.')

Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/DSX-Python35

  added / updated specs: 
    - geopy


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    ca-certificates-2019.3.9   |       hecc5488_0         146 KB  conda-forge
    openssl-1.0.2r             |       h14c3975_0         3.1 MB  conda-forge
    geopy-1.20.0               |             py_0          57 KB  conda-forge
    geographiclib-1.49         |             py_0          32 KB  conda-forge
    certifi-2018.8.24          |        py35_1001         139 KB  conda-forge
    ------------------------------------------------------------
                                           Total:         3.5 MB

The following NEW packages will be INSTALLED:

    geographiclib:   1.49-py_0         conda-forge
    geopy:           1.20.0-py_0       conda-forge

The following packages will be UPDATED:

   

<P><B> </B><P>
The following data points are required to provide a thorough analysis of both regions and a recommendation to the client:

<LI>Geo coordinates of both NYU and UofT
<LI>Number of art-related venues around each of NYU and UofT
<LI>Distance of these venues from each university
<LI>Natural grouping of venue data on the basis of similarity

<P>
<P><B>Data Description:</B>
<LI>Foursquare explore API to search for art-related venues around NYU and UofT in their respective cities, New York City and Toronto.
<LI>Publicly available dataset containing neighbourhood data for Manhattan (and the rest of New York City) at https://cocl.us/new_york_dataset

<P>
<P><B>Approach to solve the problem:</B>
<LI>Convert addresses into their equivalent latitude and longitude values.
<LI>Use the Foursquare API to explore neighbourhoods in New York City and Toronto.
<LI>Use the Foursquare API explore function to get the most common venue categories in each city.
<LI>Group the neighbourhoods into clusters using the k-means clustering algorithm.
<LI>Use the Folium library to visualize the neighbourhoods and their emerging clusters.
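
The clustering step in the approach above can be illustrated with a minimal pure-Python sketch of k-means (Lloyd's algorithm). The feature vectors below are hypothetical per-neighbourhood frequencies of (art gallery, arts and crafts store, performing arts venue), and the starting centroids are fixed by hand so the run is reproducible. In practice a library implementation such as scikit-learn's `KMeans` would normally be used instead.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and centroid-update steps until labels stop changing."""
    centroids = [list(c) for c in centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical frequency vectors: (gallery, crafts store, performing arts)
points = [
    (1.0, 0.0, 0.0),
    (0.9, 0.1, 0.0),
    (0.0, 0.0, 1.0),
    (0.1, 0.0, 0.9),
    (0.0, 1.0, 0.0),
]
initial_centroids = [points[0], points[2], points[4]]  # assumption: hand-picked seeds

labels, centroids = kmeans(points, initial_centroids)
print(labels)
```

Neighbourhoods that share a label have a similar mix of art venue types, which is the "natural grouping" the analysis is after.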



## Define Foursquare Credentials and Version

In [2]:
# The code was removed by Watson Studio for sharing.

Your credentials:
CLIENT_ID: GG0QFSJAB10QC3ZMAG2BHGHNXEL3ZV3UNJIJ2QQ2HQYCDXWW
CLIENT_SECRET:YVZONLPA3OW0GCAIV5234QZYLCK3DX10CPDXAHWUOARUL2QD


Let's get the location of both universities: NYU and UofT.
> Convert the addresses to their corresponding latitude and longitude coordinates.
> Create an instance of the geocoder, which requires a user_agent.
> Name the agent foursquare_agent.

In [3]:
# New York University is at 70 Washington Square S, New York, NY

address_NYU = '70 Washington Square S, New York, NY'

geolocator_NYU = Nominatim(user_agent="foursquare_agent")
location_NYU = geolocator_NYU.geocode(address_NYU)
latitude_NYU = location_NYU.latitude
longitude_NYU = location_NYU.longitude
print('NYU coordinates are: ' + str(latitude_NYU) + ', ' + str(longitude_NYU))

NYU coordinates are: 40.72942865, -73.9972178045625


In [4]:
# University of Toronto is at 27 Kings College Circle, Toronto, Ontario

address_UofT = '27 Kings College Circle, Toronto, Ontario'

geolocator_UofT = Nominatim(user_agent="foursquare_agent")
location_UofT = geolocator_UofT.geocode(address_UofT)
latitude_UofT = location_UofT.latitude
longitude_UofT = location_UofT.longitude
print('UofT coordinates are: ' + str(latitude_UofT) + ', ' + str(longitude_UofT))

UofT coordinates are: 43.6607225, -79.3959198095151
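
The geocoding cells above assume that every address resolves. geopy's `geocode` returns `None` when it finds no match, so reading `.latitude` would then fail with an `AttributeError`. The sketch below shows one way to guard against that. The helper name `coordinates_for` and the stand-in `fake_geocode` are hypothetical and only exist so the example runs offline.

```python
from collections import namedtuple

# Stand-in for the object geopy returns; only the two attributes used here.
FakeLocation = namedtuple('FakeLocation', ['latitude', 'longitude'])


def coordinates_for(address, geocode):
    """Return (latitude, longitude) for an address, or raise if it cannot be resolved."""
    location = geocode(address)
    if location is None:
        raise ValueError('Could not geocode address: {!r}'.format(address))
    return location.latitude, location.longitude


def fake_geocode(address):
    """Offline stand-in for a geocoder's geocode method (assumption, for illustration)."""
    known = {
        '70 Washington Square S, New York, NY': FakeLocation(40.72942865, -73.9972178045625),
    }
    return known.get(address)


nyu_coords = coordinates_for('70 Washington Square S, New York, NY', fake_geocode)
print(nyu_coords)
```

In the notebook itself, the second argument would be the real method, for example `geolocator_NYU.geocode`.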


In [5]:
# build the summary table in one step (DataFrame.append was removed in pandas 2.0)
combined_data = pd.DataFrame(
    [{'City/Country': 'New York City / USA', 'University': 'NYU', 'Latitude': latitude_NYU, 'Longitude': longitude_NYU},
     {'City/Country': 'Toronto / Canada', 'University': 'UofT', 'Latitude': latitude_UofT, 'Longitude': longitude_UofT}],
    columns=['City/Country', 'University', 'Latitude', 'Longitude'])

combined_data


Unnamed: 0,City/Country,University,Latitude,Longitude
0,New York City / USA,NYU,40.729429,-73.997218
1,Toronto / Canada,UofT,43.660722,-79.39592


## 1. Search for a specific venue category
https://api.foursquare.com/v2/venues/search?client_id=CLIENT_ID&client_secret=CLIENT_SECRET&ll=LATITUDE,LONGITUDE&v=VERSION&query=QUERY&radius=RADIUS&limit=LIMIT

We are looking for art venues within 1000 meters of each university.

In [6]:
search_query = 'art'
radius = 1000
print(search_query)

art


### Define the corresponding URL

In [7]:
url_NYU = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude_NYU, longitude_NYU, VERSION, search_query, radius, LIMIT)
url_NYU

'https://api.foursquare.com/v2/venues/search?client_id=GG0QFSJAB10QC3ZMAG2BHGHNXEL3ZV3UNJIJ2QQ2HQYCDXWW&client_secret=YVZONLPA3OW0GCAIV5234QZYLCK3DX10CPDXAHWUOARUL2QD&ll=40.72942865,-73.9972178045625&v=20180604&query=art&radius=1000&limit=30'

In [8]:
url_UofT = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude_UofT, longitude_UofT, VERSION, search_query, radius, LIMIT)
url_UofT

'https://api.foursquare.com/v2/venues/search?client_id=GG0QFSJAB10QC3ZMAG2BHGHNXEL3ZV3UNJIJ2QQ2HQYCDXWW&client_secret=YVZONLPA3OW0GCAIV5234QZYLCK3DX10CPDXAHWUOARUL2QD&ll=43.6607225,-79.3959198095151&v=20180604&query=art&radius=1000&limit=30'
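
The two URLs above are assembled with `str.format`, which works for a single-word query such as `art` but would not escape spaces or other special characters. As a hedged alternative, the sketch below builds the same query string with the standard library's `urlencode`. The function name `build_search_url` and the placeholder credentials are assumptions for illustration.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = 'https://api.foursquare.com/v2/venues/search'


def build_search_url(client_id, client_secret, lat, lng, version, query, radius, limit):
    """Assemble a venue search URL with every parameter percent-encoded."""
    params = [
        ('client_id', client_id),
        ('client_secret', client_secret),
        ('ll', '{},{}'.format(lat, lng)),
        ('v', version),
        ('query', query),
        ('radius', radius),
        ('limit', limit),
    ]
    return SEARCH_ENDPOINT + '?' + urlencode(params)


# Placeholder credentials and coordinates, not real values.
example_url = build_search_url('ID', 'SECRET', 40.5, -73.5, '20180604', 'art supplies', 1000, 30)
print(example_url)
```

The comma in `ll` is percent-encoded as `%2C` and the space in the query becomes `+`. Both are standard query-string encodings that a server decodes back to the original values.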

### Send the GET Request and examine the results

In [9]:
results_NYU = requests.get(url_NYU).json()
results_UofT = requests.get(url_UofT).json()
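
The cell above parses the response without checking whether the request succeeded. A minimal sketch of a defensive check follows, assuming the v2 response envelope carries a `meta.code` field alongside `response`. The helper name `extract_venues` and both sample payloads are hypothetical.

```python
def extract_venues(payload):
    """Return the venue list from a search response, or raise if the call failed."""
    meta = payload.get('meta', {})
    if meta.get('code') != 200:
        raise RuntimeError('Foursquare request failed: {}'.format(meta))
    return payload['response']['venues']


# Hypothetical payloads shaped like a successful and a failed response.
ok_payload = {
    'meta': {'code': 200},
    'response': {'venues': [{'name': 'Example Gallery'}]},
}
failed_payload = {
    'meta': {'code': 429, 'errorType': 'quota_exceeded'},
    'response': {},
}

venues = extract_venues(ok_payload)
print(len(venues))
```

With this helper, a quota or credential problem surfaces as a clear error instead of a `KeyError` on `'venues'` a few cells later.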

### Get relevant part of JSON and transform it into a pandas dataframe

In [10]:
# NYU 

# assign relevant part of JSON to venues
venues_NYU = results_NYU['response']['venues']

# transform venues into a dataframe
dataframe_NYU = json_normalize(venues_NYU)
dataframe_NYU.shape

(30, 19)

In [11]:
# UofT

# assign relevant part of JSON to venues
venues_UofT = results_UofT['response']['venues']

# transform venues into a dataframe
dataframe_UofT = json_normalize(venues_UofT)

dataframe_UofT.shape

(30, 18)

### Define information of interest and filter dataframe

In [12]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns_NYU = ['name', 'categories'] + [col for col in dataframe_NYU.columns if col.startswith('location.')] + ['id']
# copy so the column assignments below modify an independent dataframe
dataframe_filtered_NYU = dataframe_NYU.loc[:, filtered_columns_NYU].copy()

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        # some results nest the categories under a 'venue.' prefix
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# keep only the first category name for each row
dataframe_filtered_NYU['categories'] = dataframe_filtered_NYU.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered_NYU.columns = [column.split('.')[-1] for column in dataframe_filtered_NYU.columns]

# keep only the art-related categories
art_categories = ['Performing Arts Venue', 'Art Gallery', 'Arts & Crafts Store']
df_artNYU = dataframe_filtered_NYU.loc[dataframe_filtered_NYU['categories'].isin(art_categories)]

# sort the list by the closest art venue first
df_artNYU.sort_values('distance')


Unnamed: 0,name,categories,address,cc,city,country,crossStreet,distance,formattedAddress,labeledLatLngs,lat,lng,neighborhood,postalCode,state,id
1,NYU Grey Art Gallery,Art Gallery,100 Washington Sq E,US,New York,United States,Washington Pl,154,"[100 Washington Sq E (Washington Pl), New York...","[{'lng': -73.99597817635681, 'label': 'display...",40.730454,-73.995978,,10003.0,NY,441274fdf964a520d1301fe3
15,Moniker Art Fair,Art Gallery,718 Broadway,US,New York,United States,,340,"[718 Broadway, New York, NY 10003, United States]","[{'lng': -73.993256, 'label': 'display', 'lat'...",40.728838,-73.993256,,10003.0,NY,5cca087231fd14002ca0f45f
0,Blick Art Materials,Arts & Crafts Store,1-5 Bond St,US,New York,United States,btwn Lafayette & Broadway,370,"[1-5 Bond St (btwn Lafayette & Broadway), New ...","[{'lng': -73.99475772919754, 'label': 'display...",40.726668,-73.994758,,10012.0,NY,4aad442ef964a5205d5f20e3
9,Hulonthalo Gallery of Art in NY,Art Gallery,670 Broadway,US,New York,United States,,397,"[670 Broadway, New York, NY 10012, United States]","[{'lng': -73.99362646666667, 'label': 'display...",40.727118,-73.993626,,10012.0,NY,4c9cf85d542b224b148be59f
11,SoHo Gallery for Digital Art,Art Gallery,138 Sullivan St,US,New York,United States,at Prince St.,501,"[138 Sullivan St (at Prince St.), New York, NY...","[{'lng': -74.00209407619435, 'label': 'display...",40.726852,-74.002094,,10012.0,NY,4bbbb6f62d9ea59307c89fce
16,Crown Fine Art,Art Gallery,421 W Broadway,US,SoHo,United States,,620,"[421 W Broadway, SoHo, NY, United States]","[{'lng': -74.00171450760502, 'label': 'display...",40.725012,-74.001715,,,NY,4e2095e0d164740631fd85ff
28,Jerry's Fine Art Supplies,Arts & Crafts Store,,US,New York,United States,,683,"[New York, NY, United States]","[{'lng': -73.99008905955168, 'label': 'display...",40.732339,-73.990089,,,NY,53f0c47d498e7cdbf1ee7b41
3,BLICK Art Materials,Arts & Crafts Store,21 E 13th St,US,New York,United States,at 5th Ave,720,"[21 E 13th St (at 5th Ave), New York, NY 10003...","[{'lng': -73.99280782306175, 'label': 'display...",40.734975,-73.992808,,10003.0,NY,4ac380b4f964a5208b9b20e3
19,New York Central Art Supply,Arts & Crafts Store,62 3rd Ave,US,New York,United States,East 11th Street,748,"[62 3rd Ave (East 11th Street), New York, NY 1...","[{'lng': -73.988712343282, 'label': 'display',...",40.731344,-73.988712,,10003.0,NY,506b26d9e4b01151fb3d1307
18,C. J. Yao Art Gallery,Art Gallery,66 Greene St,US,New York,United States,,773,"[66 Greene St, New York, NY 10012, United States]","[{'lng': -74.000848, 'label': 'display', 'lat'...",40.723049,-74.000848,,10012.0,NY,5b81b49395d986002cd099e9
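
The `distance` column in the table above comes straight from the API response. As a sanity check, the great-circle distance between the campus coordinates and a venue can be recomputed with the haversine formula. This is a sketch that assumes a spherical Earth with a mean radius of 6,371 km. How Foursquare computes its own figure is not stated in this notebook, so small differences are to be expected.

```python
import math


def haversine_m(lat1, lng1, lat2, lng2, radius_m=6371000.0):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


# NYU coordinates from the geocoding cell, NYU Grey Art Gallery from the table above
grey_gallery_m = haversine_m(40.72942865, -73.9972178045625, 40.730454, -73.995978)
print(round(grey_gallery_m))
```

The result lands close to the 154 m reported for NYU Grey Art Gallery, which suggests the `distance` column is measured in meters from the search coordinates.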


In [13]:
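# filter on the reported distance instead of hard-coding a row count
# (SoHo Gallery for Digital Art, at 501 m, falls just outside this radius)
closest_artvenues_NYU = df_artNYU[df_artNYU['distance'] <= 500].sort_values('distance')
print('There are ' + str(closest_artvenues_NYU['name'].count()) + ' art venues within 500m of NYU')

There are 4 art venues within 500m of NYU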


In [14]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns_UofT = ['name', 'categories'] + [col for col in dataframe_UofT.columns if col.startswith('location.')] + ['id']
# copy so the column assignments below modify an independent dataframe
dataframe_filtered_UofT = dataframe_UofT.loc[:, filtered_columns_UofT].copy()

# keep only the first category name for each row (get_category_type is defined in cell 12 above)
dataframe_filtered_UofT['categories'] = dataframe_filtered_UofT.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered_UofT.columns = [column.split('.')[-1] for column in dataframe_filtered_UofT.columns]

# keep only the art-related categories
art_categories = ['Performing Arts Venue', 'Art Gallery', 'Arts & Crafts Store']
df_artUofT = dataframe_filtered_UofT.loc[dataframe_filtered_UofT['categories'].isin(art_categories)]

# sort the list by the closest art venue first
df_artUofT.sort_values('distance')


Unnamed: 0,name,categories,address,cc,city,country,crossStreet,distance,formattedAddress,labeledLatLngs,lat,lng,postalCode,state,id
7,Toose Art Supplies,Arts & Crafts Store,,CA,Toronto,Canada,,98,"[Toronto ON, Canada]","[{'lng': -79.39593445588955, 'label': 'display...",43.659834,-79.395934,,ON,4cd313a5d160b1f79bf521ab
10,UTAC Art Centre,Art Gallery,15 Kings College Circle,CA,Toronto,Canada,,297,"[15 Kings College Circle, Toronto ON M5S 3H7, ...","[{'lng': -79.39617085198672, 'label': 'display...",43.663392,-79.396171,M5S 3H7,ON,52267723498e648523debe10
9,UTAC Art Lounge,Art Gallery,15 King's College Circle,CA,Toronto,Canada,,299,"[15 King's College Circle, Toronto ON, Canada]","[{'lng': -79.39579701602028, 'label': 'display...",43.663412,-79.395797,,ON,4db99016b5928d7fda71b872
8,University of Toronto Arts Centre,Art Gallery,15 Kings College Circle,CA,Toronto,Canada,,326,"[15 Kings College Circle, Toronto ON, Canada]","[{'lng': -79.39515929235343, 'label': 'display...",43.663605,-79.395159,,ON,4bfd97014cf820a1c0b0ecf4
4,University College Art Centre,Art Gallery,,CA,Toronto,Canada,King College Circle and Tower Rd,363,"[King College Circle and Tower Rd, Toronto ON,...","[{'lng': -79.39538469518658, 'label': 'display...",43.663962,-79.395385,,ON,4d8a79ae6ce6a35de4a26142
15,Gwartzman's Art Supplies,Arts & Crafts Store,448 Spadina Ave,CA,Toronto,Canada,College Street,495,"[448 Spadina Ave (College Street), Toronto ON ...","[{'lng': -79.39984590886463, 'label': 'display...",43.6573,-79.399846,M5T 2G8,ON,4e4d1592a809b3dab3b23445
14,Re-Photo-Cubic Public Art intervention,Art Gallery,,CA,,Canada,,556,[Canada],"[{'lng': -79.39997, 'label': 'display', 'lat':...",43.656673,-79.39997,,,5076e06de4b020a7af1ef77e
6,Art Square Gallery & Cafe,Art Gallery,334 Dundas St West,CA,Toronto,Canada,,772,"[334 Dundas St West, Toronto ON M5T 1G5, Canada]","[{'lng': -79.39253595754614, 'label': 'display...",43.654227,-79.392536,M5T 1G5,ON,4ae47067f964a520989a21e3
20,Consignor Canadian Fine Art,Art Gallery,326 Dundas Street West,CA,Toronto,Canada,,772,"[326 Dundas Street West, Toronto ON M5T 1G5, C...","[{'lng': -79.392269, 'label': 'display', 'lat'...",43.654307,-79.392269,M5T 1G5,ON,521b957f11d2c8ababfc94e8
0,Art Gallery of Ontario,Art Gallery,317 Dundas St W,CA,Toronto,Canada,at Beverley St,786,"[317 Dundas St W (at Beverley St), Toronto ON ...","[{'lng': -79.39292172707437, 'label': 'display...",43.654003,-79.392922,M5T 1G4,ON,4ad4c05ef964a520daf620e3


In [15]:
# filter on the reported distance instead of hard-coding a row count
closest_artvenues_UofT = df_artUofT[df_artUofT['distance'] <= 500].sort_values('distance')
print('There are ' + str(closest_artvenues_UofT['name'].count()) + ' art venues within 500m of UofT')

There are 6 art venues within 500m of UofT


In [16]:
print('The list of 10 art venues around NYU')
dataframe_filtered_NYU.name.head(10)

The list of 10 art venues around NYU


0                          Blick Art Materials
1                         NYU Grey Art Gallery
2    Leslie+Lohman Museum of Gay & Lesbian Art
3                          BLICK Art Materials
4        The Brant Foundation Art Study Center
5          Storefront for Art and Architecture
6              New York University Art History
7                   La Sirena Mexican Folk Art
8                                        Artsy
9              Hulonthalo Gallery of Art in NY
Name: name, dtype: object

In [17]:
print('The list of 10 art venues around UofT')
dataframe_filtered_UofT.name.head(10)

The list of 10 art venues around UofT


0                               Art Gallery of Ontario
1    Ontario College of Art and Design University (...
2                             Aboveground Art Supplies
3                                    Department of Art
4                        University College Art Centre
5                               Curry's Art Store Ltd.
6                            Art Square Gallery & Cafe
7                                   Toose Art Supplies
8                    University of Toronto Arts Centre
9                                      UTAC Art Lounge
Name: name, dtype: object

### Let's visualize the Art Galleries / Supply stores that are near NYU

In [18]:
venues_map_NYU = folium.Map(location=[latitude_NYU, longitude_NYU], zoom_start=13) # generate map centred around NYU

# add a red circle marker to represent NYU
folium.features.CircleMarker(
    [latitude_NYU, longitude_NYU],
    radius=10,
    color='red',
    popup='New York University',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(venues_map_NYU)

# add the art venues as blue circle markers
for lat, lng, label in zip(dataframe_filtered_NYU.lat, dataframe_filtered_NYU.lng, dataframe_filtered_NYU.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_map_NYU)

# display map
venues_map_NYU

### Let's visualize the Art Galleries/Supply stores that are near UofT

In [19]:
venues_map_UofT = folium.Map(location=[latitude_UofT, longitude_UofT], zoom_start=13) # generate map centred around UofT

# add a red circle marker to represent UofT
folium.features.CircleMarker(
    [latitude_UofT, longitude_UofT],
    radius=10,
    color='red',
    popup='University of Toronto',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(venues_map_UofT)

# add the art venues as blue circle markers
for lat, lng, label in zip(dataframe_filtered_UofT.lat, dataframe_filtered_UofT.lng, dataframe_filtered_UofT.categories):
    folium.features.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_map_UofT)

# display map
venues_map_UofT

## CLUSTERING NY
<LI>Download and Explore Dataset
<LI>Explore Neighborhoods in New York City
<LI>Analyze Each Neighborhood
<LI>Cluster Neighborhoods
<LI>Examine Clusters

Before we get the data and start exploring it, let's download all the dependencies that we will need.

In [20]:
dataframe_filtered_NYU.name

0                           Blick Art Materials
1                          NYU Grey Art Gallery
2     Leslie+Lohman Museum of Gay & Lesbian Art
3                           BLICK Art Materials
4         The Brant Foundation Art Study Center
5           Storefront for Art and Architecture
6               New York University Art History
7                    La Sirena Mexican Folk Art
8                                         Artsy
9               Hulonthalo Gallery of Art in NY
10                               Art & Industry
11                 SoHo Gallery for Digital Art
12                         Independent Art Fair
13                           Art Media Holdings
14               Dahesh Museum of Art Gift Shop
15                             Moniker Art Fair
16                               Crown Fine Art
17               Art Distributed Publishers Inc
18                        C. J. Yao Art Gallery
19                  New York Central Art Supply
20    First Street Garden/First Street A

In [21]:
dataframe_filtered_UofT.name

0                                Art Gallery of Ontario
1     Ontario College of Art and Design University (...
2                              Aboveground Art Supplies
3                                     Department of Art
4                         University College Art Centre
5                                Curry's Art Store Ltd.
6                             Art Square Gallery & Cafe
7                                    Toose Art Supplies
8                     University of Toronto Arts Centre
9                                       UTAC Art Lounge
10                                      UTAC Art Centre
11                                     Fine Art Library
12                              Women's Art Association
13                                Daniel's Art Supplies
14               Re-Photo-Cubic Public Art intervention
15                             Gwartzman's Art Supplies
16                                Art Bike Installation
17                               Art Framing N C

In [22]:
!wget -q -O 'newyork_data.json' https://cocl.us/new_york_dataset
print('Data downloaded!')

Data downloaded!


In [23]:
import json # library to handle JSON files

with open('newyork_data.json') as json_data:
    newyork_data = json.load(json_data)

neighborhoods_data = newyork_data['features']

# Transform the nested Python dictionaries into a pandas dataframe.

# define the dataframe columns
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude'] 

# collect one row per neighborhood; the coordinates are ordered [longitude, latitude]
rows = []
for data in neighborhoods_data:
    neighborhood_latlon = data['geometry']['coordinates']
    rows.append({'Borough': data['properties']['borough'],
                 'Neighborhood': data['properties']['name'],
                 'Latitude': neighborhood_latlon[1],
                 'Longitude': neighborhood_latlon[0]})

# build the dataframe in one step (DataFrame.append was removed in pandas 2.0)
neighborhoods = pd.DataFrame(rows, columns=column_names)

In [24]:
neighborhoods.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Bronx,Wakefield,40.894705,-73.847201
1,Bronx,Co-op City,40.874294,-73.829939
2,Bronx,Eastchester,40.887556,-73.827806
3,Bronx,Fieldston,40.895437,-73.905643
4,Bronx,Riverdale,40.890834,-73.912585
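
The parsing cell above relies on the shape of each entry in `features`: the names live under `properties`, and the point is stored under `geometry.coordinates` with longitude first. The sketch below makes that mapping explicit. The `sample_feature` is a trimmed, hypothetical copy that keeps only the keys the notebook reads, filled with the Wakefield values shown in the table above.

```python
# Trimmed, hypothetical feature containing only the keys used by the notebook.
sample_feature = {
    'properties': {'borough': 'Bronx', 'name': 'Wakefield'},
    'geometry': {'coordinates': [-73.847201, 40.894705]},
}


def feature_to_row(feature):
    """Map one feature to the Borough / Neighborhood / Latitude / Longitude columns."""
    coordinates = feature['geometry']['coordinates']
    return {
        'Borough': feature['properties']['borough'],
        'Neighborhood': feature['properties']['name'],
        'Latitude': coordinates[1],   # second element is the latitude
        'Longitude': coordinates[0],  # first element is the longitude
    }


row = feature_to_row(sample_feature)
print(row)
```

Swapping the two indices is an easy mistake to make and would place every neighbourhood marker far from New York on the map.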


In [25]:
manhattan_data = neighborhoods[neighborhoods['Borough'] == 'Manhattan'].reset_index(drop=True)
manhattan_data.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Manhattan,Marble Hill,40.876551,-73.91066
1,Manhattan,Chinatown,40.715618,-73.994279
2,Manhattan,Washington Heights,40.851903,-73.9369
3,Manhattan,Inwood,40.867684,-73.92121
4,Manhattan,Hamilton Heights,40.823604,-73.949688


In [26]:
# Since NYU is in Manhattan, let's visualize Manhattan's neighborhoods.
# First, get the geographical coordinates of Manhattan.

address = 'Manhattan, NY'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Manhattan are {}, {}.'.format(latitude, longitude))


The geographical coordinates of Manhattan are 40.7900869, -73.9598295.


In [27]:
# let's visualize Manhattan and its neighborhoods

map_manhattan = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(manhattan_data['Latitude'], manhattan_data['Longitude'], manhattan_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_manhattan)  
    
map_manhattan

In [28]:
# Next, we are going to start utilizing the Foursquare API to explore the neighborhoods and segment them. Define Foursquare Credentials a

In [29]:
# The code was removed by Watson Studio for sharing.

Your credentials:
CLIENT_ID: GG0QFSJAB10QC3ZMAG2BHGHNXEL3ZV3UNJIJ2QQ2HQYCDXWW
CLIENT_SECRET:YVZONLPA3OW0GCAIV5234QZYLCK3DX10CPDXAHWUOARUL2QD


In [30]:
# Get the neighborhood's latitude and longitude values.

In [31]:
neighborhood_latitude = manhattan_data.loc[0, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = manhattan_data.loc[0, 'Longitude'] # neighborhood longitude value

neighborhood_name = manhattan_data.loc[0, 'Neighborhood'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Marble Hill are 40.87655077879964, -73.91065965862981.


In [32]:
# let's create a function that queries the Foursquare API for all the neighborhoods

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
        
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
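
`getNearbyVenues` indexes `v['venue']['categories'][0]` directly, which raises an `IndexError` for any venue that comes back without a category. The sketch below shows one way to flatten the response items defensively. The helper name `flatten_items` and the sample items are hypothetical and only mirror the keys used in the function above.

```python
def flatten_items(name, lat, lng, items):
    """Turn explore-style items into flat tuples, tolerating missing categories."""
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        category = categories[0]['name'] if categories else None
        rows.append((
            name,
            lat,
            lng,
            venue['name'],
            venue['location']['lat'],
            venue['location']['lng'],
            category,
        ))
    return rows


# Hypothetical items: the second venue has no category at all.
sample_items = [
    {'venue': {'name': 'Example Gallery',
               'location': {'lat': 40.1, 'lng': -73.1},
               'categories': [{'name': 'Art Gallery'}]}},
    {'venue': {'name': 'Unlabelled Spot',
               'location': {'lat': 40.2, 'lng': -73.2},
               'categories': []}},
]

rows = flatten_items('Example Hood', 40.0, -73.0, sample_items)
print(rows)
```

Rows with a `None` category are dropped naturally by the later filter on art-related categories, so they do not affect the counts.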



In [33]:
# Now run the above function on each neighborhood and create a new dataframe called manhattan_venues.
manhattan_venues = getNearbyVenues(names=manhattan_data['Neighborhood'],
                                   latitudes=manhattan_data['Latitude'],
                                   longitudes=manhattan_data['Longitude']
                                  )

Marble Hill
Chinatown
Washington Heights
Inwood
Hamilton Heights
Manhattanville
Central Harlem
East Harlem
Upper East Side
Yorkville
Lenox Hill
Roosevelt Island
Upper West Side
Lincoln Square
Clinton
Midtown
Murray Hill
Chelsea
Greenwich Village
East Village
Lower East Side
Tribeca
Little Italy
Soho
West Village
Manhattan Valley
Morningside Heights
Gramercy
Battery Park City
Financial District
Carnegie Hill
Noho
Civic Center
Midtown South
Sutton Place
Turtle Bay
Tudor City
Stuyvesant Town
Flatiron
Hudson Yards


In [34]:
print(manhattan_venues.shape)
manhattan_venues.head()

(1179, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Marble Hill,40.876551,-73.91066,Arturo's,40.874412,-73.910271,Pizza Place
1,Marble Hill,40.876551,-73.91066,Bikram Yoga,40.876844,-73.906204,Yoga Studio
2,Marble Hill,40.876551,-73.91066,Tibbett Diner,40.880404,-73.908937,Diner
3,Marble Hill,40.876551,-73.91066,Dunkin',40.877136,-73.906666,Donut Shop
4,Marble Hill,40.876551,-73.91066,Starbucks,40.877531,-73.905582,Coffee Shop


In [35]:
# There are 1179 venues in Manhattan neighborhoods

In [36]:
# limit to only Art venues

df_artNYU = manhattan_venues.loc[(manhattan_venues['Venue Category'] == 'Performing Arts Venue') | (manhattan_venues['Venue Category'] == 'Art Gallery') | (manhattan_venues['Venue Category'] == 'Arts & Crafts Store')]

In [37]:
df_artNYU.head()
df_artNYU.shape

(14, 7)

In [38]:
# Let's check how many art venues were returned for each neighborhood

In [39]:
df_artNYU.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Battery Park City,1,1,1,1,1,1
Central Harlem,1,1,1,1,1,1
Hudson Yards,1,1,1,1,1,1
Lenox Hill,1,1,1,1,1,1
Lincoln Square,2,2,2,2,2,2
Lower East Side,3,3,3,3,3,3
Manhattan Valley,1,1,1,1,1,1
Manhattanville,1,1,1,1,1,1
Soho,2,2,2,2,2,2
Upper East Side,1,1,1,1,1,1


In [40]:
# keep only the venue count per neighborhood
df_art_summary = df_artNYU.groupby('Neighborhood').count().drop(
    columns=['Neighborhood Latitude', 'Neighborhood Longitude', 'Venue Latitude', 'Venue Longitude', 'Venue Category'])
df_art_summary

Unnamed: 0_level_0,Venue
Neighborhood,Unnamed: 1_level_1
Battery Park City,1
Central Harlem,1
Hudson Yards,1
Lenox Hill,1
Lincoln Square,2
Lower East Side,3
Manhattan Valley,1
Manhattanville,1
Soho,2
Upper East Side,1


In [41]:
# Let's find out how many unique categories can be curated from all the returned venues
print('There are {} unique categories.'.format(len(manhattan_venues['Venue Category'].unique())))

There are 233 unique categories.


In [42]:
# Analyze Each Neighborhood

In [43]:
# one hot encoding
manhattan_onehot = pd.get_dummies(df_artNYU[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
manhattan_onehot['Neighborhood'] = df_artNYU['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [manhattan_onehot.columns[-1]] + list(manhattan_onehot.columns[:-1])
manhattan_onehot = manhattan_onehot[fixed_columns]

manhattan_onehot.head()

Unnamed: 0,Neighborhood,Art Gallery,Arts & Crafts Store,Performing Arts Venue
167,Manhattanville,1,0,0
196,Central Harlem,1,0,0
250,Upper East Side,1,0,0
316,Lenox Hill,1,0,0
392,Lincoln Square,0,0,1


In [44]:
manhattan_onehot.shape

(14, 4)

In [45]:
# Next, let's group rows by neighborhood and take the mean of the frequency of occurrence of each category


manhattan_grouped = manhattan_onehot.groupby('Neighborhood').mean().reset_index()
manhattan_grouped

Unnamed: 0,Neighborhood,Art Gallery,Arts & Crafts Store,Performing Arts Venue
0,Battery Park City,0.0,0.0,1.0
1,Central Harlem,1.0,0.0,0.0
2,Hudson Yards,1.0,0.0,0.0
3,Lenox Hill,1.0,0.0,0.0
4,Lincoln Square,0.0,0.0,1.0
5,Lower East Side,0.666667,0.0,0.333333
6,Manhattan Valley,0.0,1.0,0.0
7,Manhattanville,1.0,0.0,0.0
8,Soho,0.5,0.5,0.0
9,Upper East Side,1.0,0.0,0.0


In [46]:
manhattan_grouped.shape

(10, 4)
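
Taking the mean of one-hot columns per neighborhood is equivalent to computing each category's share of that neighborhood's art venues. The plain-Python sketch below reproduces the Lower East Side row of the table above. Its three venues are reconstructed from the count of 3 and the reported means of 0.666667 and 0.333333. The function name `category_frequencies` is hypothetical.

```python
from collections import Counter


def category_frequencies(categories):
    """Return each category's share of the given list of venue categories."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


# Lower East Side: two galleries and one performing arts venue
lower_east_side = ['Art Gallery', 'Art Gallery', 'Performing Arts Venue']
freqs = category_frequencies(lower_east_side)
print(freqs)
```

Categories that never occur in a neighborhood are simply absent from the dictionary, whereas the one-hot approach reports them as 0.0.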

In [47]:
#Let's print each neighborhood along with the top 3 most common Art venues
num_top_venues = 3 

for hood in manhattan_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = manhattan_grouped[manhattan_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Battery Park City----
                   venue  freq
0  Performing Arts Venue   1.0
1            Art Gallery   0.0
2    Arts & Crafts Store   0.0


----Central Harlem----
                   venue  freq
0            Art Gallery   1.0
1    Arts & Crafts Store   0.0
2  Performing Arts Venue   0.0


----Hudson Yards----
                   venue  freq
0            Art Gallery   1.0
1    Arts & Crafts Store   0.0
2  Performing Arts Venue   0.0


----Lenox Hill----
                   venue  freq
0            Art Gallery   1.0
1    Arts & Crafts Store   0.0
2  Performing Arts Venue   0.0


----Lincoln Square----
                   venue  freq
0  Performing Arts Venue   1.0
1            Art Gallery   0.0
2    Arts & Crafts Store   0.0


----Lower East Side----
                   venue  freq
0            Art Gallery  0.67
1  Performing Arts Venue  0.33
2    Arts & Crafts Store  0.00


----Manhattan Valley----
                   venue  freq
0    Arts & Crafts Store   1.0
1            Art Gall

In [48]:
#Let's put that into a pandas dataframe
#First, let's write a function to sort the venues in descending order.

def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]



In [49]:
#Now let's create the new dataframe and display the top 3 venues for each neighborhood.
num_top_venues = 3

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # fall back to the generic 'th' suffix beyond the 3rd position
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = manhattan_grouped['Neighborhood']

for ind in np.arange(manhattan_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(manhattan_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted



Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue
0,Battery Park City,Performing Arts Venue,Arts & Crafts Store,Art Gallery
1,Central Harlem,Art Gallery,Performing Arts Venue,Arts & Crafts Store
2,Hudson Yards,Art Gallery,Performing Arts Venue,Arts & Crafts Store
3,Lenox Hill,Art Gallery,Performing Arts Venue,Arts & Crafts Store
4,Lincoln Square,Performing Arts Venue,Arts & Crafts Store,Art Gallery
5,Lower East Side,Art Gallery,Performing Arts Venue,Arts & Crafts Store
6,Manhattan Valley,Arts & Crafts Store,Performing Arts Venue,Art Gallery
7,Manhattanville,Art Gallery,Performing Arts Venue,Arts & Crafts Store
8,Soho,Arts & Crafts Store,Art Gallery,Performing Arts Venue
9,Upper East Side,Art Gallery,Performing Arts Venue,Arts & Crafts Store
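
In the table above, several neighborhoods list a 2nd and 3rd "most common" venue whose frequency is actually 0.0, so the order of those entries reflects only how the sort breaks ties. One possible variant, sketched below under the assumption that zero-frequency categories should not be ranked at all, returns `None` for those slots. The function name `top_venues_nonzero` is hypothetical and not part of the notebook.

```python
import pandas as pd

def top_venues_nonzero(row, num_top_venues):
    """Rank only categories with a positive frequency; pad the rest with None."""
    categories = row.iloc[1:].astype(float)          # skip the Neighborhood field
    ranked = categories[categories > 0].sort_values(ascending=False, kind="mergesort")
    names = list(ranked.index[:num_top_venues])
    return names + [None] * (num_top_venues - len(names))

# Row values copied from the manhattan_grouped output for Battery Park City
row = pd.Series({
    "Neighborhood": "Battery Park City",
    "Art Gallery": 0.0,
    "Arts & Crafts Store": 0.0,
    "Performing Arts Venue": 1.0,
})
print(top_venues_nonzero(row, 3))
```

Whether to show tied zero-frequency categories or leave them blank is a presentation choice; the clustering itself uses the raw frequencies and is unaffected.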


In [50]:
# Cluster Neighborhoods
# Run k-means to cluster the neighborhoods into 5 clusters.

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans
from sklearn.preprocessing import MinMaxScaler

In [51]:
data = manhattan_grouped.drop('Neighborhood', axis =1)
mms = MinMaxScaler()
mms.fit(data)
data_transformed = mms.transform(data)

Sum_of_squared_distances = []
K = range(1,5)
for k in K:
    km = KMeans(n_clusters=k)
    km = km.fit(data_transformed)
    Sum_of_squared_distances.append(km.inertia_)

In [52]:
import matplotlib.pyplot as plt
plt.plot(K, Sum_of_squared_distances, 'bx-')
plt.xlabel('k')
plt.ylabel('Sum_of_squared_distances')
plt.title('Elbow Method For Optimal k')
plt.show()

<matplotlib.figure.Figure at 0x7f5825f04898>
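
The elbow loop above stops at k = 4, while the next cell uses 5 clusters. As a rough check, the sketch below reruns the same idea on the ten frequency rows printed in the `manhattan_grouped` output, extending the range to k = 5 and counting the distinct rows. The fixed `random_state` and `n_init` values are assumptions added here for repeatability.

```python
import numpy as np
from sklearn.cluster import KMeans

# Frequencies copied from the manhattan_grouped output above
# (columns: Art Gallery, Arts & Crafts Store, Performing Arts Venue)
X = np.array([
    [0.0, 0.0, 1.0],            # Battery Park City
    [1.0, 0.0, 0.0],            # Central Harlem
    [1.0, 0.0, 0.0],            # Hudson Yards
    [1.0, 0.0, 0.0],            # Lenox Hill
    [0.0, 0.0, 1.0],            # Lincoln Square
    [0.666667, 0.0, 0.333333],  # Lower East Side
    [0.0, 1.0, 0.0],            # Manhattan Valley
    [1.0, 0.0, 0.0],            # Manhattanville
    [0.5, 0.5, 0.0],            # Soho
    [1.0, 0.0, 0.0],            # Upper East Side
])

# Number of distinct feature vectors among the ten neighborhoods
n_distinct = len(np.unique(X, axis=0))

# Inertia (sum of squared distances to the nearest centroid) for k = 1..5
inertias = []
for k in range(1, 6):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(km.inertia_)

print("distinct rows:", n_distinct)
for k, sse in zip(range(1, 6), inertias):
    print(k, round(sse, 4))
```

Because only five distinct frequency vectors exist, five clusters can place one centroid on each and drive the inertia to essentially zero. With so few venues per neighborhood, the clusters mostly restate which single category dominates each neighborhood.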

In [53]:
# set number of clusters

kclusters = 5

manhattan_grouped_clustering = manhattan_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(manhattan_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([2, 1, 1, 1, 2, 0, 3, 1, 4, 1], dtype=int32)

In [54]:
kmeans.labels_[9]

1

In [55]:
#Let's create a new dataframe that includes the cluster as well as the top 3 venues for each neighborhood

# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

manhattan_merged = manhattan_data

# merge data sets to add latitude/longitude for each neighborhood
manhattan_merged = manhattan_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

manhattan_merged.head(10)




Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue
0,Manhattan,Marble Hill,40.876551,-73.91066,,,,
1,Manhattan,Chinatown,40.715618,-73.994279,,,,
2,Manhattan,Washington Heights,40.851903,-73.9369,,,,
3,Manhattan,Inwood,40.867684,-73.92121,,,,
4,Manhattan,Hamilton Heights,40.823604,-73.949688,,,,
5,Manhattan,Manhattanville,40.816934,-73.957385,1.0,Art Gallery,Performing Arts Venue,Arts & Crafts Store
6,Manhattan,Central Harlem,40.815976,-73.943211,1.0,Art Gallery,Performing Arts Venue,Arts & Crafts Store
7,Manhattan,East Harlem,40.792249,-73.944182,,,,
8,Manhattan,Upper East Side,40.775639,-73.960508,1.0,Art Gallery,Performing Arts Venue,Arts & Crafts Store
9,Manhattan,Yorkville,40.77593,-73.947118,,,,


In [56]:
manhattan_merged = manhattan_merged.dropna(subset= ['Cluster Labels'])
manhattan_merged['Cluster Labels'] = manhattan_merged['Cluster Labels'].astype(int)
manhattan_merged

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue
5,Manhattan,Manhattanville,40.816934,-73.957385,1,Art Gallery,Performing Arts Venue,Arts & Crafts Store
6,Manhattan,Central Harlem,40.815976,-73.943211,1,Art Gallery,Performing Arts Venue,Arts & Crafts Store
8,Manhattan,Upper East Side,40.775639,-73.960508,1,Art Gallery,Performing Arts Venue,Arts & Crafts Store
10,Manhattan,Lenox Hill,40.768113,-73.95886,1,Art Gallery,Performing Arts Venue,Arts & Crafts Store
13,Manhattan,Lincoln Square,40.773529,-73.985338,2,Performing Arts Venue,Arts & Crafts Store,Art Gallery
20,Manhattan,Lower East Side,40.717807,-73.98089,0,Art Gallery,Performing Arts Venue,Arts & Crafts Store
23,Manhattan,Soho,40.722184,-74.000657,4,Arts & Crafts Store,Art Gallery,Performing Arts Venue
25,Manhattan,Manhattan Valley,40.797307,-73.964286,3,Arts & Crafts Store,Performing Arts Venue,Art Gallery
28,Manhattan,Battery Park City,40.711932,-74.016869,2,Performing Arts Venue,Arts & Crafts Store,Art Gallery
39,Manhattan,Hudson Yards,40.756658,-74.000111,1,Art Gallery,Performing Arts Venue,Arts & Crafts Store


In [57]:
# let's visualize the clusters by creating a map


# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors



# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]


# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(manhattan_merged['Latitude'], manhattan_merged['Longitude'], manhattan_merged['Neighborhood'], manhattan_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster - 1],
        fill=True,
        fill_color=rainbow[cluster - 1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

Examine Clusters

Now we examine each cluster and determine the venue categories that distinguish it. Based on the defining categories, we can then assign a name to each cluster.
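
One way to find the distinguishing categories, sketched here rather than taken from the notebook, is to average the category frequencies within each cluster and take the largest one. The frequencies and labels below are copied from the `manhattan_grouped` and `kmeans.labels_` outputs above; the variable names `profile` and `dominant` are hypothetical.

```python
import pandas as pd

# Frequencies copied from the manhattan_grouped output above
grouped = pd.DataFrame(
    [
        ["Battery Park City", 0.0, 0.0, 1.0],
        ["Central Harlem", 1.0, 0.0, 0.0],
        ["Hudson Yards", 1.0, 0.0, 0.0],
        ["Lenox Hill", 1.0, 0.0, 0.0],
        ["Lincoln Square", 0.0, 0.0, 1.0],
        ["Lower East Side", 0.666667, 0.0, 0.333333],
        ["Manhattan Valley", 0.0, 1.0, 0.0],
        ["Manhattanville", 1.0, 0.0, 0.0],
        ["Soho", 0.5, 0.5, 0.0],
        ["Upper East Side", 1.0, 0.0, 0.0],
    ],
    columns=["Neighborhood", "Art Gallery", "Arts & Crafts Store", "Performing Arts Venue"],
)

# Labels copied from the kmeans.labels_ output above
labels = pd.Series([2, 1, 1, 1, 2, 0, 3, 1, 4, 1], name="Cluster Labels")

# Mean category frequency per cluster, then the highest-frequency category
features = grouped.drop(columns="Neighborhood")
profile = features.groupby(labels).mean()
dominant = profile.idxmax(axis=1)

print(profile.round(2))
print(dominant)
```

Note that `idxmax` returns the first category when two tie, as they do for the Soho cluster (0.5 Art Gallery, 0.5 Arts & Crafts Store), so a tied cluster is better described as mixed than by a single label.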

In [58]:
#cluster 1

manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 0, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]


Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue
20,Lower East Side,Art Gallery,Performing Arts Venue,Arts & Crafts Store


In [59]:
#cluster 2

manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 1, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]


Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue
5,Manhattanville,Art Gallery,Performing Arts Venue,Arts & Crafts Store
6,Central Harlem,Art Gallery,Performing Arts Venue,Arts & Crafts Store
8,Upper East Side,Art Gallery,Performing Arts Venue,Arts & Crafts Store
10,Lenox Hill,Art Gallery,Performing Arts Venue,Arts & Crafts Store
39,Hudson Yards,Art Gallery,Performing Arts Venue,Arts & Crafts Store


In [60]:
# cluster 3
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 2, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]



Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue
13,Lincoln Square,Performing Arts Venue,Arts & Crafts Store,Art Gallery
28,Battery Park City,Performing Arts Venue,Arts & Crafts Store,Art Gallery


In [61]:
# cluster 4
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 3, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue
25,Manhattan Valley,Arts & Crafts Store,Performing Arts Venue,Art Gallery


In [62]:
#cluster 5
manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 4, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue
23,Soho,Arts & Crafts Store,Art Gallery,Performing Arts Venue


In [63]:
# we will use cluster 2 (label 1) to give our client a set of neighborhoods to choose from

df_decision  = manhattan_merged.loc[manhattan_merged['Cluster Labels'] == 1, manhattan_merged.columns[[1] + list(range(5, manhattan_merged.shape[1]))]]

In [64]:
df_decision.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue
5,Manhattanville,Art Gallery,Performing Arts Venue,Arts & Crafts Store
6,Central Harlem,Art Gallery,Performing Arts Venue,Arts & Crafts Store
8,Upper East Side,Art Gallery,Performing Arts Venue,Arts & Crafts Store
10,Lenox Hill,Art Gallery,Performing Arts Venue,Arts & Crafts Store
39,Hudson Yards,Art Gallery,Performing Arts Venue,Arts & Crafts Store


In [65]:
# merge data sets to add latitude/longitude for each neighborhood
df_decision = df_decision.join(manhattan_data.set_index('Neighborhood'), on='Neighborhood')


In [66]:
df_decision.drop('Borough', axis=1)

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,Latitude,Longitude
5,Manhattanville,Art Gallery,Performing Arts Venue,Arts & Crafts Store,40.816934,-73.957385
6,Central Harlem,Art Gallery,Performing Arts Venue,Arts & Crafts Store,40.815976,-73.943211
8,Upper East Side,Art Gallery,Performing Arts Venue,Arts & Crafts Store,40.775639,-73.960508
10,Lenox Hill,Art Gallery,Performing Arts Venue,Arts & Crafts Store,40.768113,-73.95886
39,Hudson Yards,Art Gallery,Performing Arts Venue,Arts & Crafts Store,40.756658,-74.000111
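
Since the business problem is about proximity to campus, a simple follow-up is to rank these candidate neighborhoods by straight-line distance from NYU. The sketch below uses the haversine formula with the coordinates from the table above. The campus coordinates are an assumption (an approximate point near Washington Square), not the `latitude_NYU`/`longitude_NYU` values computed elsewhere in the notebook, and straight-line distance ignores actual travel time.

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(a))

# ASSUMPTION: approximate campus location near Washington Square,
# not the notebook's latitude_NYU / longitude_NYU values
NYU_LAT, NYU_LON = 40.7295, -73.9965

# Coordinates copied from the df_decision output above
candidates = {
    "Manhattanville": (40.816934, -73.957385),
    "Central Harlem": (40.815976, -73.943211),
    "Upper East Side": (40.775639, -73.960508),
    "Lenox Hill": (40.768113, -73.95886),
    "Hudson Yards": (40.756658, -74.000111),
}

distances = {
    name: haversine_km(NYU_LAT, NYU_LON, lat, lon)
    for name, (lat, lon) in candidates.items()
}

# Print neighborhoods from nearest to farthest
for name, km in sorted(distances.items(), key=lambda item: item[1]):
    print('{}: {:.1f} km'.format(name, km))
```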


In [67]:
#latitude_NYU
#longitude_NYU

# create map and display it
nyu_map = folium.Map(location=[latitude_NYU, longitude_NYU], zoom_start=10)

# display the map of NYU
nyu_map

In [68]:
#Now let's superimpose the locations of the art venues onto the map. 

# instantiate a feature group for the candidate neighborhoods in the dataframe
art_venues = folium.map.FeatureGroup()

# loop through the neighborhoods and add a circle marker for each
for lat, lng in zip(df_decision.Latitude, df_decision.Longitude):
    art_venues.add_child(
        folium.features.CircleMarker(
            [lat, lng],
            radius=10, # define how big you want the circle markers to be
            color='magenta',
            fill=True,
            fill_color='blue',
            fill_opacity=0.2
        )
    )

# add the neighborhood markers to the map
nyu_map.add_child(art_venues)

In [69]:
# instantiate a feature group for the candidate neighborhoods in the dataframe
art_venues = folium.map.FeatureGroup()

# loop through the neighborhoods and add each to the feature group
for lat, lng in zip(df_decision.Latitude, df_decision.Longitude):
    art_venues.add_child(
        folium.features.CircleMarker(
            [lat, lng],
            radius=5, # define how big you want the circle markers to be
            color='yellow',
            fill=True,
            fill_color='blue',
            fill_opacity=0.6
        )
    )

# add pop-up text to each marker on the map
latitudes = list(df_decision.Latitude)
longitudes = list(df_decision.Longitude)
labels = list(df_decision.Neighborhood)

for lat, lng, label in zip(latitudes, longitudes, labels):
    folium.Marker([lat, lng], popup=label).add_to(nyu_map)    
    
# add to map
nyu_map.add_child(art_venues)



In [70]:
from folium import plugins

# let's start again with a clean copy of the map centered on NYU
nyu_map = folium.Map(location = [latitude_NYU, longitude_NYU], zoom_start = 8)

# instantiate a marker cluster object for the neighborhoods in the dataframe
art_venues = plugins.MarkerCluster().add_to(nyu_map)

# loop through the dataframe and add each data point to the marker cluster
for lat, lng, label in zip(df_decision.Latitude, df_decision.Longitude, df_decision.Neighborhood):
    folium.Marker(
        location=[lat, lng],
        icon=None,
        popup=label,
    ).add_to(art_venues)

# display map
nyu_map