# Clustering and Analysing Neighbourhoods in Toronto 

## Part One: Creating a data frame 

First, let's import our packages. I struggled with Beautiful Soup in the first instance, so let's just use the read_html function.

In [3]:
import pandas as pd
import numpy as np

In [4]:
import requests 
import random 


from geopy.geocoders import Nominatim 
from IPython.display import Image 
from IPython.core.display import HTML 


from IPython.display import display_html
import pandas as pd
import numpy as np
    

In [5]:
from pandas.io.json import json_normalize
from bs4 import BeautifulSoup
from sklearn.cluster import KMeans
import matplotlib.cm as cm
import matplotlib.colors as colors

In [6]:


!pip install folium
import folium



Collecting folium
[?25l  Downloading https://files.pythonhosted.org/packages/fd/a0/ccb3094026649cda4acd55bf2c3822bb8c277eb11446d13d384e5be35257/folium-0.10.1-py2.py3-none-any.whl (91kB)
[K     |████████████████████████████████| 92kB 17.5MB/s eta 0:00:01
[?25hCollecting branca>=0.3.0 (from folium)
  Downloading https://files.pythonhosted.org/packages/81/6d/31c83485189a2521a75b4130f1fee5364f772a0375f81afff619004e5237/branca-0.4.0-py3-none-any.whl
Installing collected packages: branca, folium
Successfully installed branca-0.4.0 folium-0.10.1


Then, we can extract the information from the wikipedia page, and view the first five rows.

In [9]:
tor_neighbourhoods = pd.read_html('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M', header=0)
tor_neighbourhoods = tor_neighbourhoods[0]
tor_neighbourhoods.head()

Unnamed: 0,Postal code,Borough,Neighborhood
0,M1A,Not assigned,
1,M2A,Not assigned,
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Regent Park / Harbourfront


Then, let's change the headings to get rid of the first column name, which is likely to cause problems. 

In [10]:
tor_neighbourhoods.columns = ['PostalCode', 'Borough', 'Neighbourhood']
tor_neighbourhoods.head()

Unnamed: 0,PostalCode,Borough,Neighbourhood
0,M1A,Not assigned,
1,M2A,Not assigned,
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Regent Park / Harbourfront


The next step is removing those which are unassigned. First, let's check how many we have.

In [11]:
tor_neighbourhoods[tor_neighbourhoods.Borough == 'Not assigned']['Neighbourhood'].unique()
print("There are {} records where the borough is not assigned".format(
    tor_neighbourhoods[tor_neighbourhoods.Borough == 'Not assigned'].shape[0]))

There are 77 records where the borough is not assigned


Then we remove those with a simple overwrite of the tor neighborhoods object. 

In [12]:
tor_neighbourhoods = tor_neighbourhoods[tor_neighbourhoods.Borough != 'Not assigned']

Then, let's merge the neighborhoods where the postcodes and borough's are the same.

In [13]:
tor_neighbourhoods = (tor_neighbourhoods.groupby(['PostalCode', 'Borough'])['Neighbourhood']
      .apply(lambda x: "{}".format(', '.join(x))).reset_index())

Now, we can make the neighborhoods the same as the boroughs in the contexts where the neighborhood is not assigned.

In [14]:
tor_neighbourhoods.loc[tor_neighbourhoods.Neighbourhood.str.contains('Not assigned'), 'Neighbourhood'] = \
    tor_neighborhoods.loc[tor_neighbourhoods.Neighbourhood.str.contains('Not assigned'), 'Borough']

NameError: name 'tor_neighborhoods' is not defined

We can check this is correct by searching for non assigned neighborhoods.

In [15]:
tor_neighbourhoods[tor_neighbourhoods.Neighbourhood.str.contains('Not assigned')]

Unnamed: 0,PostalCode,Borough,Neighbourhood


As we can see, there are now none. Finally, we're going to look at the shape of the dataframe we've created.

In [16]:
tor_neighbourhoods.shape

(103, 3)

First, to get the longitudes and latitudes, we import these as a csv from the provided weblink and view the head of the data.

# Part 2: Assigning Longitudes and Latitudes

In [18]:


lat_lon = pd.read_csv('https://cocl.us/Geospatial_data')
lat_lon.head()



Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


Now, we rename the columns so that we have the appropriate columns names and can merge this data with the other set. 

In [20]:
lat_lon.rename(columns={'Postal Code':'PostalCode'},inplace=True)


Then, we merge them and view the dataframe to check it is correct. 

In [21]:
df = pd.merge(tor_neighbourhoods,lat_lon,on='PostalCode')
df


Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude
0,M1B,Scarborough,Malvern / Rouge,43.806686,-79.194353
1,M1C,Scarborough,Rouge Hill / Port Union / Highland Creek,43.784535,-79.160497
2,M1E,Scarborough,Guildwood / Morningside / West Hill,43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
5,M1J,Scarborough,Scarborough Village,43.744734,-79.239476
6,M1K,Scarborough,Kennedy Park / Ionview / East Birchmount Park,43.727929,-79.262029
7,M1L,Scarborough,Golden Mile / Clairlea / Oakridge,43.711112,-79.284577
8,M1M,Scarborough,Cliffside / Cliffcrest / Scarborough Village West,43.716316,-79.239476
9,M1N,Scarborough,Birch Cliff / Cliffside West,43.692657,-79.264848


# Part 3: Clustering Neighbourhoods

First, we create a dataframe containing only the neighbourhoods in Toronto. As we can see, there are 39 of these, with 5 columns corresponding to the dataframe above. 

In [29]:
df1 = df[df['Borough'].str.contains('Toronto',regex=False)]
df1.shape

(39, 5)

Then, we create a map of these neighborhoods. 

In [25]:
map_toronto = folium.Map(location=[43.651070,-79.347015],zoom_start=10)

for lat,lng,borough,neighbourhood in zip(df1['Latitude'],df1['Longitude'],df1['Borough'],df1['Neighbourhood']):
    label = '{}, {}'.format(neighbourhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
    [lat,lng],
    radius=5,
    popup=label,
    color='blue',
    fill=True,
    fill_color='#3186cc',
    fill_opacity=0.7,
    parse_html=False).add_to(map_toronto)
map_toronto

Next, we retrieve the information about venues from Foursquare (with credentials anonymised)

In [30]:
CLIENT_ID = '***'      
CLIENT_SECRET = '***'   
VERSION = '***''

In [31]:
def NearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighbourhood', 
                  'Neighbourhood Latitude', 
                  'Neighbourhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [77]:
radius = 500
LIMIT = 100 
Toronto_venues = NearbyVenues(names = df1['Neighbourhood'], latitudes = df1['Latitude'], longitudes = df1['Longitude'])

The Beaches
The Danforth West / Riverdale
India Bazaar / The Beaches West
Studio District
Lawrence Park
Davisville North
North Toronto West
Davisville
Moore Park / Summerhill East
Summerhill West / Rathnelly / South Hill / Forest Hill SE / Deer Park
Rosedale
St. James Town / Cabbagetown
Church and Wellesley
Regent Park / Harbourfront
Garden District, Ryerson
St. James Town
Berczy Park
Central Bay Street
Richmond / Adelaide / King
Harbourfront East / Union Station / Toronto Islands
Toronto Dominion Centre / Design Exchange
Commerce Court / Victoria Hotel
Roselawn
Forest Hill North & West
The Annex / North Midtown / Yorkville
University of Toronto / Harbord
Kensington Market / Chinatown / Grange Park
CN Tower / King and Spadina / Railway Lands / Harbourfront West / Bathurst  Quay / South Niagara / Island airport
Stn A PO Boxes
First Canadian Place / Underground city
Christie
Dufferin / Dovercourt Village
Little Portugal / Trinity
Brockton / Parkdale Village / Exhibition Place
High Park /

Haaving acquired the venues, we now group them based on theie neighbourhood they are in and then, in the next chunk, group them into types.

In [78]:
print(Toronto_venues.shape)
Toronto_venues.groupby('Neighbourhood').count()

(1189, 7)


Unnamed: 0_level_0,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighbourhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Berczy Park,50,50,50,50,50,50
Brockton / Parkdale Village / Exhibition Place,23,23,23,23,23,23
Business reply mail Processing CentrE,16,16,16,16,16,16
CN Tower / King and Spadina / Railway Lands / Harbourfront West / Bathurst Quay / South Niagara / Island airport,14,14,14,14,14,14
Central Bay Street,50,50,50,50,50,50
Christie,18,18,18,18,18,18
Church and Wellesley,50,50,50,50,50,50
Commerce Court / Victoria Hotel,50,50,50,50,50,50
Davisville,34,34,34,34,34,34
Davisville North,7,7,7,7,7,7


In [34]:
Toronto_venues.groupby('Venue Category').count()

Unnamed: 0_level_0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude
Venue Category,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Airport,1,1,1,1,1,1
Airport Food Court,1,1,1,1,1,1
Airport Gate,1,1,1,1,1,1
Airport Lounge,2,2,2,2,2,2
Airport Service,2,2,2,2,2,2
Airport Terminal,2,2,2,2,2,2
American Restaurant,24,24,24,24,24,24
Antique Shop,3,3,3,3,3,3
Aquarium,5,5,5,5,5,5
Art Gallery,12,12,12,12,12,12


We then perform one hot encoding to create categorical variables for each venue. 

In [79]:
one_hot = pd.get_dummies(Toronto_venues[['Venue Category']], prefix="", prefix_sep="")

In [None]:
Now, we add in a neighbourhood column. 

In [40]:
# add neighborhood column back to dataframe
one_hot['Neighbourhood'] = Toronto_venues['Neighbourhood']

# move neighborhood column to the first column
fixed_columns = [one_hot.columns[-1]] + list(one_hot.columns[:-1])
one_hot = one_hot[fixed_columns] 

one_hot.shape
one_hot.head(5)

Unnamed: 0,Wine Bar,Wine Shop,Women's Store,Yoga Studio,Neighbourhood,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,...,Tennis Court,Thai Restaurant,Theater,Theme Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant
0,0,0,0,0,The Beaches,0,0,0,0,0,...,0,0,0,0,0,1,0,0,0,0
1,0,0,0,0,The Beaches,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,The Beaches,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,The Beaches,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,The Danforth West / Riverdale,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [82]:
one_hot = pd.get_dummies(Toronto_venues[['Venue Category']], prefix="", prefix_sep="")
# add neighborhood column back to dataframe
one_hot['Neighbourhood'] = Toronto_venues['Neighbourhood'] ## I labeled the new columns as Neighbourhood with a
# 'u' bcs there is already a column named 'Neighborhood' in the venue categories.

# move neighborhood column to the first column
fixed_columns = [one_hot.columns[-1]] + list(one_hot.columns[:-1])
one_hot = one_hot[fixed_columns] 

one_hot.head(5)

one_hot.shape

(1189, 210)

And then, we group the encoded variables to create mean values for each the neighbourhoods, giving us back a datafrae with 39 rows, as with the Toronto neighbourhoods.

In [83]:
df_grouped = one_hot.groupby('Neighbourhood').mean().reset_index()
df_grouped.shape
df_grouped.head(2)


Unnamed: 0,Neighbourhood,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Aquarium,...,Theme Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Yoga Studio
0,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0
1,Brockton / Parkdale Village / Exhibition Place,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


The next chunk of code creates 5 top venues for each neighbourhood. 

In [51]:
num_top_venues = 5

for hood in df_grouped['Neighbourhood']:
    print("----"+hood+"----")
    temp = df_grouped[df_grouped['Neighbourhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')



----Berczy Park----
            venue  freq
0     Coffee Shop  0.05
1  Farmers Market  0.04
2    Cocktail Bar  0.04
3      Restaurant  0.04
4        Beer Bar  0.04


----Brockton / Parkdale Village / Exhibition Place----
            venue  freq
0            Café  0.13
1  Breakfast Spot  0.09
2     Coffee Shop  0.09
3          Bakery  0.04
4       Pet Store  0.04


----Business reply mail Processing CentrE----
           venue  freq
0            Spa  0.06
1  Auto Workshop  0.06
2     Smoke Shop  0.06
3  Burrito Place  0.06
4        Butcher  0.06


----CN Tower / King and Spadina / Railway Lands / Harbourfront West / Bathurst  Quay / South Niagara / Island airport----
              venue  freq
0    Airport Lounge  0.14
1   Airport Service  0.14
2  Airport Terminal  0.14
3           Airport  0.07
4      Airport Gate  0.07


----Central Bay Street----
                venue  freq
0         Coffee Shop  0.19
1                Café  0.06
2  Italian Restaurant  0.06
3      Sandwich Place  0.05


In [46]:
num_top_venues = 5

def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]



In [52]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# Labelling columns as 1st, 2nd and so on
columns = ['Neighbourhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind])) # for 1st, 2nd, 3rd
    except:
        columns.append('{}th Most Common Venue'.format(ind+1)) ### for 4th, 5th,...

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns) ## assign column names we just created to a new dataframe
neighborhoods_venues_sorted['Neighbourhood'] = df_grouped['Neighbourhood']## add neighborhoods column

for ind in np.arange(df_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(df_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()
neighborhoods_venues_sorted.shape

(39, 11)

In [84]:
k = 5

df_grouped_clustering = df_grouped.drop('Neighbourhood', 1)

# run k-means clustering
kmeans = KMeans(n_clusters=k, random_state=0).fit(df_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:38]

array([1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 2, 1, 3, 0,
       1, 0, 0, 1, 2, 4, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1], dtype=int32)

In [85]:
tor_merged = df1
tor_merged['Cluster Labels'] = kmeans.labels_
tor_merged = tor_merged.join(neighborhoods_venues_sorted.set_index('Neighbourhood'), on='Neighbourhood')
tor_merged

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  from ipykernel import kernelapp as app


Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
37,M4E,East Toronto,The Beaches,43.676357,-79.293031,1,Trail,Pub,Neighborhood,Health Food Store,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Yoga Studio
41,M4K,East Toronto,The Danforth West / Riverdale,43.679557,-79.352188,1,Greek Restaurant,Coffee Shop,Italian Restaurant,Ice Cream Shop,Restaurant,Furniture / Home Store,Bookstore,Yoga Studio,Caribbean Restaurant,Indian Restaurant
42,M4L,East Toronto,India Bazaar / The Beaches West,43.668999,-79.315572,1,Park,Fast Food Restaurant,Gym,Pub,Liquor Store,Burrito Place,Sandwich Place,Italian Restaurant,Steakhouse,Fish & Chips Shop
43,M4M,East Toronto,Studio District,43.659526,-79.340923,1,Café,Coffee Shop,Brewery,Gastropub,Bakery,American Restaurant,Convenience Store,Seafood Restaurant,Sandwich Place,Cheese Shop
44,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879,0,Park,Swim School,Bus Line,Yoga Studio,Diner,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant
45,M4P,Central Toronto,Davisville North,43.712751,-79.390197,1,Gym,Hotel,Department Store,Sandwich Place,Breakfast Spot,Food & Drink Shop,Park,Gay Bar,Dessert Shop,Electronics Store
46,M4R,Central Toronto,North Toronto West,43.715383,-79.405678,1,Coffee Shop,Clothing Store,Yoga Studio,Fast Food Restaurant,Dessert Shop,Restaurant,Rental Car Location,Salon / Barbershop,Diner,Mexican Restaurant
47,M4S,Central Toronto,Davisville,43.704324,-79.38879,1,Sandwich Place,Pizza Place,Dessert Shop,Gym,Italian Restaurant,Café,Sushi Restaurant,Coffee Shop,Pharmacy,Seafood Restaurant
48,M4T,Central Toronto,Moore Park / Summerhill East,43.689574,-79.38316,1,Playground,Trail,Tennis Court,Yoga Studio,Dessert Shop,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant
49,M4V,Central Toronto,Summerhill West / Rathnelly / South Hill / For...,43.686412,-79.400049,1,Coffee Shop,Pub,Pizza Place,Light Rail Station,Liquor Store,Sports Bar,Restaurant,Supermarket,Sushi Restaurant,Bank


Ok, let's look at cluster one, for reference, to determine what distinguishes these neighbourhoods. 

In [71]:
tor_merged.loc[torronto_merged['Cluster Labels'] == 1, torronto_merged.columns[[1] + list(range(5, torronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
37,East Toronto,1,Trail,Pub,Neighborhood,Health Food Store,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Yoga Studio
41,East Toronto,1,Greek Restaurant,Coffee Shop,Italian Restaurant,Ice Cream Shop,Restaurant,Furniture / Home Store,Bookstore,Yoga Studio,Caribbean Restaurant,Indian Restaurant
42,East Toronto,1,Park,Fast Food Restaurant,Gym,Pub,Liquor Store,Burrito Place,Sandwich Place,Italian Restaurant,Steakhouse,Fish & Chips Shop
43,East Toronto,1,Café,Coffee Shop,Brewery,Gastropub,Bakery,American Restaurant,Convenience Store,Seafood Restaurant,Sandwich Place,Cheese Shop
44,Central Toronto,1,Park,Swim School,Bus Line,Yoga Studio,Diner,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant
45,Central Toronto,1,Gym,Hotel,Department Store,Sandwich Place,Breakfast Spot,Food & Drink Shop,Park,Gay Bar,Dessert Shop,Electronics Store
46,Central Toronto,1,Coffee Shop,Clothing Store,Yoga Studio,Fast Food Restaurant,Dessert Shop,Restaurant,Rental Car Location,Salon / Barbershop,Diner,Mexican Restaurant
47,Central Toronto,1,Sandwich Place,Pizza Place,Dessert Shop,Gym,Italian Restaurant,Café,Sushi Restaurant,Coffee Shop,Pharmacy,Seafood Restaurant
48,Central Toronto,1,Playground,Trail,Tennis Court,Yoga Studio,Dessert Shop,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant
49,Central Toronto,1,Coffee Shop,Pub,Pizza Place,Light Rail Station,Liquor Store,Sports Bar,Restaurant,Supermarket,Sushi Restaurant,Bank


As a final task, let's make a map of the clusters we've created. 

In [86]:
map_clusters = folium.Map(location=[43.651070,-79.347015], zoom_start=11)

# set color scheme for the clusters
x = np.arange(k)
ys = [i+x+(i*x)**2 for i in range(k)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(torronto_merged['Latitude'], torronto_merged['Longitude'], torronto_merged['Neighbourhood'], torronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters