# Capstone Project - The Battle of the Neighborhoods

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Introduction: Business Problem <a name="introduction"></a>

The goal of this project is to recommend a location for someone looking to open a restaurant in New York City.
We will work on answering the following questions:

1. Where are the popular locations for running a restaurant business? Are there any geographical patterns among these popular restaurants?
    To find the hottest spots, we search for restaurants in the Foursquare location data, cluster them by location, and locate the center of each cluster.

2. How often are these restaurants mentioned by Foursquare users?
    To check whether the hottest spots are also the most popular ones, we retrieve the number of tips for each restaurant from Foursquare and examine the correlation between location and popularity.

3. Are there any patterns in the locations of these popular restaurants?
    Finally, we cluster the restaurants by their nearby-venue data and discuss the characteristics of each cluster.


## Data <a name="data"></a>

1. To answer the first question, we use the latitude and longitude of each restaurant from Foursquare. We cluster the restaurants by location and then compute the center of each cluster.

2. To see whether restaurants closer to a cluster center are more popular, we use each restaurant's tips count from Foursquare. We also divide the restaurants into levels according to their tips counts.

3. To look for subtler patterns in the restaurants' locations, we cluster the restaurants using data about their nearby venues, for example using the categories of nearby venues as clustering features.
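
Items 2 and 3 can be sketched on toy data as follows. This is only an illustration under stated assumptions: the bin edges, the level names, and the `Nearby Category` column are hypothetical and are not values taken from the analysis.

```python
import pandas as pd

# Hypothetical toy data; 'Tips Count' mirrors the column name used later in the notebook.
restaurants = pd.DataFrame({
    'Venue': ['A', 'B', 'C', 'D'],
    'Tips Count': [0, 19, 180, 395],
})

# Illustrative bin edges and level names (assumptions, not tuned values).
bins = [-1, 10, 100, float('inf')]
level_names = ['low', 'medium', 'high']
restaurants['Tips Level'] = pd.cut(restaurants['Tips Count'], bins=bins, labels=level_names)

# One row per (restaurant, nearby venue) pair; 'Nearby Category' is a hypothetical column.
nearby = pd.DataFrame({
    'Venue': ['A', 'A', 'B', 'B', 'B'],
    'Nearby Category': ['Park', 'Coffee Shop', 'Coffee Shop', 'Bar', 'Bar'],
})

# One-hot encode the categories, then average per restaurant to get
# the share of each category among that restaurant's nearby venues.
onehot = pd.get_dummies(nearby['Nearby Category']).astype(float)
onehot.insert(0, 'Venue', nearby['Venue'])
category_features = onehot.groupby('Venue').mean()
```

The resulting `category_features` frame has one row per restaurant and one column per category, which is a shape that a clustering algorithm such as k-means can consume directly.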


In [1]:
import requests # library to handle requests
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation


# !pip install geopy
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image, HTML

# transforming a JSON file into a pandas dataframe
# (json_normalize lives in the top-level pandas namespace since pandas 1.0)
from pandas import json_normalize

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans
from sklearn.cluster import DBSCAN
# import sklearn.utils
from sklearn.preprocessing import StandardScaler

# ! pip install folium==0.5.0
import folium # plotting library


In [2]:
address = 'New York City, NY'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of New York City are {}, {}.'.format(latitude, longitude))

The geographical coordinates of New York City are 40.7127281, -74.0060152.


In [3]:
# Import New York neighborhoods data
newyork_data = pd.read_csv('new_york_data.csv')
neighborhoods = newyork_data.iloc[:, 1:]
neighborhoods.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Bronx,Wakefield,40.894705,-73.847201
1,Bronx,Co-op City,40.874294,-73.829939
2,Bronx,Eastchester,40.887556,-73.827806
3,Bronx,Fieldston,40.895437,-73.905643
4,Bronx,Riverdale,40.890834,-73.912585


In [None]:
# create map of New York using latitude and longitude values
map_newyork = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Borough'], neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_newyork)  
    
map_newyork

In [4]:
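import os

# Read the Foursquare credentials from environment variables instead of hard-coding them
# (the variable names below are placeholders; set them to your own credentials)
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID']
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET']
VERSION = '20210722' # Foursquare API version date
LIMIT = 100 # maximum number of venues returned per request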


In [None]:
neighborhood_latitude = neighborhoods.loc[0, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = neighborhoods.loc[0, 'Longitude'] # neighborhood longitude value

neighborhood_name = neighborhoods.loc[0, 'Neighborhood'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

In [None]:
radius = 500
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, neighborhood_latitude, neighborhood_longitude, VERSION, radius, LIMIT)
url
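
As an alternative to formatting the query string by hand, the URL can be assembled from a dictionary so that each parameter is percent-encoded. The sketch below assumes the same `venues/explore` endpoint and parameters as the cell above; the helper name `build_explore_url` and the `'ID'`/`'SECRET'` values are placeholders.

```python
from urllib.parse import urlencode

def build_explore_url(client_id, client_secret, lat, lng, version, radius=500, limit=100):
    """Build a venues/explore URL with encoded query parameters (hypothetical helper)."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'll': '{},{}'.format(lat, lng),
        'v': version,
        'radius': radius,
        'limit': limit,
    }
    return 'https://api.foursquare.com/v2/venues/explore?' + urlencode(params)

# Placeholder credentials; the coordinates are those of the first neighborhood in the table above.
example_url = build_explore_url('ID', 'SECRET', 40.894705, -73.847201, '20210722')
```

Keeping the parameters in a dictionary also makes it harder to swap two positional arguments by accident, which is easy to do in a long `format` call.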

In [None]:
results = requests.get(url).json()
results

In [3]:
# function that extracts the category of the venue
def get_category_type(row):
    # rows from venues/search use 'categories'; rows from venues/explore use 'venue.categories'
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [None]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

In [None]:
search_query = 'restaurant'
radius = 500

url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, neighborhood_latitude, neighborhood_longitude, VERSION, search_query, radius, LIMIT)

results = requests.get(url).json()

# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe
dataframe = json_normalize(venues)

# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
dataframe_filtered = dataframe.loc[:, filtered_columns]

# filter the category for each row
dataframe_filtered['categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

dataframe_filtered

In [None]:
venues_map = folium.Map(location=[neighborhood_latitude, neighborhood_longitude], zoom_start=13) # generate map centred on the neighborhood

# add a red circle marker to represent the neighborhood center
folium.CircleMarker(
    [neighborhood_latitude, neighborhood_longitude],
    radius=10,
    color='red',
    popup= neighborhood_name,
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(venues_map)

# add the restaurants found by the search as blue circle markers
for lat, lng, label in zip(dataframe_filtered.lat, dataframe_filtered.lng, dataframe_filtered.categories):
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_map)

# display map
venues_map

### Explore Neighborhoods in Manhattan


In [5]:
manhattan_data = neighborhoods[neighborhoods['Borough'] == 'Manhattan'].reset_index(drop=True)
manhattan_data

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Manhattan,Marble Hill,40.876551,-73.91066
1,Manhattan,Chinatown,40.715618,-73.994279
2,Manhattan,Washington Heights,40.851903,-73.9369
3,Manhattan,Inwood,40.867684,-73.92121
4,Manhattan,Hamilton Heights,40.823604,-73.949688
5,Manhattan,Manhattanville,40.816934,-73.957385
6,Manhattan,Central Harlem,40.815976,-73.943211
7,Manhattan,East Harlem,40.792249,-73.944182
8,Manhattan,Upper East Side,40.775639,-73.960508
9,Manhattan,Yorkville,40.77593,-73.947118


In [6]:
address = 'Manhattan, NY'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Manhattan are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Manhattan are 40.7896239, -73.9598939.


In [7]:
# create map of Manhattan using latitude and longitude values
map_manhattan = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(manhattan_data['Latitude'], manhattan_data['Longitude'], manhattan_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_manhattan)  
    
map_manhattan

In [7]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name'],
            v['venue']['id']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category',
                  'Venue ID']
    
    return(nearby_venues)
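
One fragile spot in `getNearbyVenues` is `v['venue']['categories'][0]['name']`, which raises an `IndexError` when a venue has an empty category list. A defensive variant of the row extraction might look like the sketch below. The helper names `first_category_name` and `flatten_items` are hypothetical, and the sample items only imitate the nesting that the function above reads.

```python
def first_category_name(venue, default=None):
    """Return the first category name, or `default` if the venue has no categories."""
    categories = venue.get('categories') or []
    return categories[0].get('name', default) if categories else default

def flatten_items(neighborhood, items):
    """Turn explore-style items into flat tuples, tolerating missing fields."""
    rows = []
    for item in items:
        venue = item.get('venue', {})
        location = venue.get('location', {})
        rows.append((
            neighborhood,
            venue.get('name'),
            location.get('lat'),
            location.get('lng'),
            first_category_name(venue),
            venue.get('id'),
        ))
    return rows

# Made-up items: the second one has no categories.
sample_items = [
    {'venue': {'name': 'Diner X', 'id': 'abc',
               'location': {'lat': 40.88, 'lng': -73.90},
               'categories': [{'name': 'Diner'}]}},
    {'venue': {'name': 'Unnamed Spot', 'id': 'def',
               'location': {'lat': 40.87, 'lng': -73.91},
               'categories': []}},
]
sample_rows = flatten_items('Marble Hill', sample_items)
```

Venues without a category end up with `None` in that column, so they are skipped by a later `str.contains('Restaurant')` filter only if that call is given `na=False`.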

In [None]:
# manhattan_venues = getNearbyVenues(names=manhattan_data['Neighborhood'], 
#                                     latitudes=manhattan_data['Latitude'], 
#                                     longitudes=manhattan_data['Longitude']
#                                     )

In [10]:

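# The getNearbyVenues call above is commented out to avoid repeating the API requests;
# its result was saved once with the line below and is loaded from the CSV file instead.
# manhattan_venues.to_csv('manhattan_venues.csv')

manhattan_venues = pd.read_csv('manhattan_venues.csv').iloc[:, 1:]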
print(manhattan_venues.shape)
manhattan_venues.head()

(3267, 8)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,Venue ID
0,Marble Hill,40.876551,-73.91066,Arturo's,40.874412,-73.910271,Pizza Place,4b4429abf964a52037f225e3
1,Marble Hill,40.876551,-73.91066,Bikram Yoga,40.876844,-73.906204,Yoga Studio,4baf59e8f964a520a6f93be3
2,Marble Hill,40.876551,-73.91066,Tibbett Diner,40.880404,-73.908937,Diner,4b79cc46f964a520c5122fe3
3,Marble Hill,40.876551,-73.91066,Dunkin',40.877136,-73.906666,Donut Shop,4b5357adf964a520319827e3
4,Marble Hill,40.876551,-73.91066,Starbucks,40.877531,-73.905582,Coffee Shop,55f81cd2498ee903149fcc64


In [11]:
# Select all duplicate rows based on one column
duplicateRows = manhattan_venues.duplicated(['Venue ID'])
print("Duplicate Rows based on 'Venue ID' column are:", manhattan_venues[duplicateRows], sep='\n')

Duplicate Rows based on 'Venue ID' column are:
      Neighborhood  Neighborhood Latitude  Neighborhood Longitude  \
688     Lenox Hill              40.768113              -73.958860   
713     Lenox Hill              40.768113              -73.958860   
729     Lenox Hill              40.768113              -73.958860   
734     Lenox Hill              40.768113              -73.958860   
756     Lenox Hill              40.768113              -73.958860   
...            ...                    ...                     ...   
3239  Hudson Yards              40.756658              -74.000111   
3240  Hudson Yards              40.756658              -74.000111   
3245  Hudson Yards              40.756658              -74.000111   
3246  Hudson Yards              40.756658              -74.000111   
3252  Hudson Yards              40.756658              -74.000111   

                                             Venue  Venue Latitude  \
688                                   Paper Source    

In [12]:
manhattan_venues.drop_duplicates('Venue ID', inplace=True, ignore_index=True)
manhattan_venues.shape

(3091, 8)

In [13]:
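# keep venues whose category mentions 'Restaurant'; .copy() avoids SettingWithCopyWarning when columns are added later
manhattan_restaurants = manhattan_venues[manhattan_venues['Venue Category'].str.contains('Restaurant')].copy()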
manhattan_restaurants.shape

(873, 8)

In [13]:
# create map of Manhattan using latitude and longitude values
map_manhattan_restaurants = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, label in zip(manhattan_restaurants['Venue Latitude'], manhattan_restaurants['Venue Longitude'], manhattan_restaurants['Venue']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_manhattan_restaurants)  
    
map_manhattan_restaurants

In [15]:
# ratings = []
# tips_counts = []
# venue_ids_400 = manhattan_restaurants['Venue ID'].head(400)

# for venue_id in venue_ids_400:
# 	url = 'https://api.foursquare.com/v2/venues/{}?client_id={}&client_secret={}&v={}'.format(venue_id, CLIENT_ID, CLIENT_SECRET, VERSION)

# 	result = requests.get(url).json()
# 	try:
# 		ratings.append(result['response']['venue']['rating'])
# 	except:
# 		ratings.append(np.nan)

# 	try:
# 		tips_counts.append(result['response']['venue']['tips']['count'])
# 	except:
# 		tips_counts.append(np.nan)

# print(ratings)
# print(tips_counts)


[7.1, nan, 8.7, 9.3, 8.4, 9.1, 8.6, 9.0, 8.7, 8.6, 8.1, 8.0, 8.6, 9.4, 8.3, 9.2, 8.2, 8.2, 8.0, 8.7, 7.9, 8.7, 8.8, 8.6, 8.1, 8.2, 8.3, 8.6, 8.4, 8.2, 8.4, 8.9, 8.3, 8.6, 9.0, 8.5, 7.8, 9.2, 8.2, 7.7, 8.9, 8.5, 7.8, 8.6, 7.9, 8.3, 7.9, 7.9, 8.2, 7.8, 7.5, 7.4, 7.0, 7.0, 7.3, 7.6, 7.4, 7.1, 6.8, 7.2, nan, 8.3, 8.1, 7.7, 8.3, 7.4, 7.9, 7.4, 8.0, 7.3, 7.1, 7.3, 7.3, 7.0, 7.2, 6.8, 7.0, 6.2, nan, 9.1, 8.8, 8.2, 8.1, 8.0, 8.1, 7.7, 7.7, 7.9, 8.3, 7.7, 7.6, 7.5, 7.2, 7.1, 7.0, 6.9, 5.9, 8.3, 7.6, 9.0, 8.0, 7.9, 7.6, 7.8, 7.8, 8.1, 7.6, 8.1, 7.3, 7.2, 7.5, 7.2, 6.6, 6.4, nan, 8.5, 8.6, 8.6, 8.1, 8.2, 7.8, 9.1, 7.6, 7.3, 8.0, 7.4, 7.1, 7.7, 7.1, 7.2, nan, 8.7, 8.0, 8.4, 8.1, 8.3, 7.8, 8.8, 8.3, 8.4, 7.8, 7.4, 7.5, 8.0, 7.8, 8.8, 8.5, 8.7, 8.2, 8.0, 8.4, 9.2, 8.7, 8.5, 8.3, 8.0, 8.2, 7.9, 8.0, 8.7, 7.9, 8.1, 7.6, 7.6, 7.9, 8.3, 7.8, 7.9, 7.8, 8.3, 8.5, 8.7, 9.0, 8.3, 8.9, 8.5, 8.2, 8.4, 7.8, 8.1, 8.8, 9.1, 8.4, 8.5, 7.3, 8.7, 8.3, 8.0, 8.1, 8.1, 7.3, 7.0, 7.5, 7.4, 7.2, 7.5, 7.5, 7.3, 7.2, 9.1,
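
The two `try`/`except` blocks in the loop above can be collapsed into chained `dict.get` calls, which avoids a bare `except` that would also hide unrelated errors. This is a minimal sketch: the helper name `extract_rating_and_tips` is hypothetical, and the sample dictionaries only imitate the nesting that the loop reads.

```python
import math

def extract_rating_and_tips(result):
    """Pull rating and tips count out of a venue-details response, using NaN when absent."""
    venue = result.get('response', {}).get('venue', {})
    rating = venue.get('rating', math.nan)
    tips_count = venue.get('tips', {}).get('count', math.nan)
    return rating, tips_count

# Made-up responses: one complete, one with no rating or tips at all.
complete = {'response': {'venue': {'rating': 8.7, 'tips': {'count': 180}}}}
empty = {'response': {}}

rating_a, tips_a = extract_rating_and_tips(complete)
rating_b, tips_b = extract_rating_and_tips(empty)
```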

In [19]:
# rating_tipsCount_400 = pd.DataFrame(zip(ratings, tips_counts), columns=['Rating', 'Tips Count'])
# rating_tipsCount_400.to_csv('rating_tipsCount_400.csv')
# rating_tipsCount_400.head()

Unnamed: 0,Rating,Tips Count
0,7.1,19
1,,0
2,8.7,180
3,9.3,205
4,8.4,99


In [15]:
# ratings_tail = []
# tips_counts_tail = []
# # venue_ids_tail = manhattan_restaurants['Venue ID'].tail()
# venue_ids_tail = manhattan_restaurants['Venue ID'].tail(len(manhattan_restaurants['Venue ID']) - 400)

# for venue_id in venue_ids_tail:
# 	url = 'https://api.foursquare.com/v2/venues/{}?client_id={}&client_secret={}&v={}'.format(venue_id, CLIENT_ID, CLIENT_SECRET, VERSION)

# 	result = requests.get(url).json()
# 	try:
# 		ratings_tail.append(result['response']['venue']['rating'])
# 	except:
# 		ratings_tail.append(np.nan)

# 	try:
# 		tips_counts_tail.append(result['response']['venue']['tips']['count'])
# 	except:
# 		tips_counts_tail.append(np.nan)

# print(ratings_tail)
# print(tips_counts_tail)


[9.0, 8.7, 9.0, 8.2, 8.3, 8.5, 9.0, 8.2, 9.1, 8.6, 8.1, 8.7, 8.8, 9.0, 8.7, 8.4, 8.9, 8.8, 8.1, 8.2, 8.4, 9.0, 9.3, 8.1, 9.3, 9.2, 9.2, 9.1, 8.9, 9.0, 8.7, 9.1, 9.2, 8.6, 8.6, 9.0, 8.9, 8.5, 8.3, 8.3, 8.3, 8.5, 8.8, 8.5, 8.7, 8.6, 9.0, 8.4, 8.2, 8.1, 9.1, 8.8, 8.5, 8.3, 7.9, 8.8, 9.4, 7.9, 8.1, 8.2, 8.4, 9.2, 9.0, 8.8, 8.6, 9.1, 8.2, 8.3, 9.1, 7.9, 7.9, 8.1, 8.0, 7.8, 7.4, nan, 5.6, 9.2, 8.7, 8.5, 8.8, 8.7, 8.6, 8.0, 8.0, 9.1, 8.1, 8.5, 9.0, 9.0, 8.9, 8.8, 8.6, 8.5, 7.9, 8.4, 8.0, 7.8, 7.5, 7.7, 7.8, 7.5, 9.4, 9.0, 9.1, 8.5, 9.2, 8.9, 8.9, 8.9, 9.2, 9.1, 8.7, 8.6, 8.7, 8.8, 9.1, 8.1, 8.2, 8.8, 9.3, 8.7, 9.0, 8.9, 8.5, 8.7, 8.6, 8.6, 8.5, 8.7, 8.9, 9.4, 9.1, 9.1, 9.2, 8.9, 8.8, 8.8, 8.8, 8.6, 9.2, 8.5, 8.8, 9.3, 8.7, 8.1, 8.4, 8.5, 8.4, 8.8, 9.3, 8.8, 8.5, 7.8, 8.6, 8.4, 9.3, 8.3, 9.1, 8.6, 9.3, 7.7, 8.9, 9.2, 7.5, 8.6, 8.7, 8.6, 8.1, 8.1, 8.0, 8.3, 8.5, 8.1, 7.9, 8.4, 7.9, 7.9, 8.0, 8.0, 7.4, 7.7, 7.6, 7.5, 8.2, 7.7, 8.0, 7.7, 8.1, 7.6, 7.3, 6.5, 6.7, 6.1, 8.7, 9.0, 7.8, 8.1, 7.7, 8.6,

In [16]:
# rating_tipsCount_tail = pd.DataFrame(zip(ratings_tail, tips_counts_tail), columns=['Rating', 'Tips Count'])
# rating_tipsCount_tail.to_csv('rating_tipsCount_tail.csv')
# rating_tipsCount_tail.head()

Unnamed: 0,Rating,Tips Count
0,9.0,2
1,8.7,188
2,9.0,395
3,8.2,28
4,8.3,39


In [20]:
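# Load the two cached batches of ratings and tips counts and stack them in their original order
rating_tipsCount_400 = pd.read_csv('rating_tipsCount_400.csv')
rating_tipsCount_tail = pd.read_csv('rating_tipsCount_tail.csv')
rating_tipsCount = pd.concat([rating_tipsCount_400, rating_tipsCount_tail], axis=0, ignore_index=True).iloc[:, 1:]
rating_tipsCount.to_csv('rating_tipsCount.csv')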
rating_tipsCount

Unnamed: 0,Rating,Tips Count
0,7.1,19
1,,0
2,8.7,180
3,9.3,205
4,8.4,99
...,...,...
868,7.5,3
869,7.2,3
870,6.8,6
871,6.5,7
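
To use these values together with the restaurant coordinates, the two tables have to be joined row by row. Because `manhattan_restaurants` is a filtered slice of `manhattan_venues`, its index has gaps, whereas `rating_tipsCount` is indexed from 0, so a plain column-wise `concat` would misalign them. The toy example below illustrates the pitfall and one possible fix; all data in it is made up.

```python
import pandas as pd

# A filtered frame keeps the index labels of the rows that survived the filter.
restaurants = pd.DataFrame(
    {'Venue': ['A', 'B', 'C'], 'Venue ID': ['id1', 'id2', 'id3']},
    index=[3, 7, 12],
)
# Ratings were collected in the same row order but are indexed 0, 1, 2.
ratings = pd.DataFrame({'Rating': [7.1, None, 8.7], 'Tips Count': [19, 0, 180]})

# Aligning on index labels produces the union of both indexes, padded with NaN.
misaligned = pd.concat([restaurants, ratings], axis=1)

# Resetting the index first aligns the rows by position instead.
combined = pd.concat([restaurants.reset_index(drop=True), ratings], axis=1)
```

This relies on the ratings having been fetched in the same order as the rows of the restaurant table, which is how the loops above iterate over `Venue ID`.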


## Methodology <a name="methodology"></a>

## Analysis <a name="analysis"></a>

### Cluster Restaurants

In [None]:
# set number of clusters
kclusters = 20

restaurants_clustering = manhattan_restaurants[['Venue Latitude', 'Venue Longitude']]

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(restaurants_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 
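
The value `kclusters = 20` is fixed by hand. One common way to sanity-check such a choice is to compare silhouette scores over a range of candidate values. The sketch below does this on synthetic coordinates rather than on the restaurant data, so the blob centers, the noise scale, and the candidate range are all assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic (latitude, longitude) points scattered around three made-up centers.
rng = np.random.default_rng(0)
blob_centers = np.array([[40.72, -74.00], [40.78, -73.96], [40.85, -73.93]])
points = np.vstack([c + rng.normal(scale=0.003, size=(50, 2)) for c in blob_centers])

# Fit k-means for each candidate k and record the mean silhouette score.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(points)
    scores[k] = silhouette_score(points, labels)

# The candidate with the highest score is the best-separated clustering.
best_k = max(scores, key=scores.get)
```

On real restaurant locations the scores are usually much flatter than on well-separated blobs, so the result is better read as a hint than as a definitive answer.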

In [None]:
centers = kmeans.cluster_centers_
print(centers[0:5])

centers_lat = centers[:, 0]
centers_lon = centers[:, 1]
print(centers_lat)
print(centers_lon)

manhattan_restaurants['Cluster Label'] = kmeans.labels_
manhattan_restaurants.head()
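
To relate popularity to location, each restaurant's distance to its cluster center is needed. A minimal sketch using the haversine formula is shown below. The function name `haversine_km` is hypothetical, the Earth radius is the usual mean-radius approximation, and the example center is a made-up point rather than an actual k-means output.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Distance from the first restaurant in the venues table to a made-up cluster center.
distance_to_center = haversine_km(40.874412, -73.910271, 40.876, -73.909)
```

With the real data, the second pair of arguments would come from `centers[label]` for the restaurant's own `Cluster Label`.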

In [None]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=12)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, lab, cluster in zip(manhattan_restaurants['Venue Latitude'], manhattan_restaurants['Venue Longitude'], manhattan_restaurants['Venue'], manhattan_restaurants['Cluster Label']):
    label = folium.Popup(lab, parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)

for lat, lon, cluster in zip(centers_lat, centers_lon, x):
    label = folium.Popup('Cluster '+str(cluster), parse_html=True)
    folium.Marker(
        [lat, lon],
        popup=label).add_to(map_clusters)
       
map_clusters

In [None]:
# Compute DBSCAN on standardized coordinates (eps is in standard-deviation units, not meters)
clustering_transformed = StandardScaler().fit_transform(restaurants_clustering)
db = DBSCAN(eps=0.1, min_samples=20).fit(clustering_transformed)

manhattan_restaurants['Cluster Label'] = db.labels_
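# DBSCAN labels noise points as -1, so this count includes the noise label when noise is present
n_clusters = len(manhattan_restaurants['Cluster Label'].unique())
print('The number of cluster labels (including noise, if any) is', n_clusters)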
manhattan_restaurants.head()
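
Because the coordinates are standardized before clustering, `eps=0.1` has no direct interpretation as a distance on the ground. An alternative is to run DBSCAN with the haversine metric on coordinates in radians, so that `eps` can be expressed in meters. The sketch below uses synthetic points; the 200 m radius and `min_samples=5` are illustrative assumptions, not tuned values.

```python
import numpy as np
from sklearn.cluster import DBSCAN

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, an approximation
eps_m = 200                 # neighborhood radius in meters (assumption)

# Two dense synthetic groups of (latitude, longitude) points and one isolated point.
rng = np.random.default_rng(0)
dense_a = np.array([40.72, -74.00]) + rng.normal(scale=0.0003, size=(30, 2))
dense_b = np.array([40.78, -73.96]) + rng.normal(scale=0.0003, size=(30, 2))
isolated = np.array([[40.85, -73.93]])
coords = np.vstack([dense_a, dense_b, isolated])

# The haversine metric expects [lat, lon] in radians and returns distances in radians,
# so the radius in meters is divided by the Earth radius.
db = DBSCAN(
    eps=eps_m / EARTH_RADIUS_M,
    min_samples=5,
    metric='haversine',
    algorithm='ball_tree',
).fit(np.radians(coords))

labels = db.labels_
# Count real clusters, excluding the noise label -1.
n_real_clusters = len(set(labels)) - (1 if -1 in labels else 0)
```

Counting clusters this way keeps the noise label out of the total, which is convenient when reporting how many hot spots were found.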

In [None]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=12)

# set color scheme for the clusters; the extra gray entry appended at the end is picked up by noise points (label -1)
x = np.arange(n_clusters-1)
ys = [i + x + (i*x)**2 for i in range(n_clusters-1)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array] + ['#a9a9a9']

# add markers to the map
for lat, lon, lab, cluster in zip(manhattan_restaurants['Venue Latitude'], manhattan_restaurants['Venue Longitude'], manhattan_restaurants['Cluster Label'], manhattan_restaurants['Cluster Label']):
    label = folium.Popup('Cluster '+str(lab), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)

# for lat, lon, cluster in zip(centers_lat, centers_lon, x):
#     label = folium.Popup('Cluster '+str(cluster), parse_html=True)
#     folium.Marker(
#         [lat, lon],
#         popup=label).add_to(map_clusters)
       
map_clusters

## Results and Discussion <a name="results"></a>

## Conclusion <a name="conclusion"></a>