# Segment and Cluster Toronto Neighborhoods

## Part 1: Scraping Wikipedia Data

In [1]:
!conda install -c anaconda xlrd --yes
!conda install -c anaconda beautifulsoup4 --yes

Fetching package metadata .............
Solving package specifications: .

Package plan for installation in environment /opt/conda/envs/DSX-Python35:

The following packages will be SUPERSEDED by a higher-priority channel:

    xlrd: 1.1.0-py35h45a0a2a_1 --> 1.1.0-py35_1 anaconda

xlrd-1.1.0-py3 100% |################################| Time: 0:00:00  16.32 MB/s
Fetching package metadata .............
Solving package specifications: .

Package plan for installation in environment /opt/conda/envs/DSX-Python35:

The following packages will be UPDATED:

    beautifulsoup4: 4.6.0-py35h442a8c9_1 --> 4.6.3-py35_0 anaconda

beautifulsoup4 100% |################################| Time: 0:00:00  40.18 MB/s


In [2]:
from bs4 import BeautifulSoup
import requests
import pandas as pd
import matplotlib.cm as cm
import matplotlib.colors as colors

Fetch the page content from Wikipedia

In [3]:
page = requests.get('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M')
contents = page.content

NOTE:

In the code below I make the following assumptions:

1. The table I'm looking for has the class 'wikitable', and it is the only table on the page with that class.
2. The columns of the table appear in the order Postcode, Borough, Neighborhood.
3. Every cell in the table has a value (possibly "Not assigned").

Having inspected the HTML of the page, I verified that the above assumptions held at the time of creation (March 16th, 2019).
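
The assumptions listed above can also be checked in code before any parsing happens. The following is a minimal sketch and not part of the original notebook: `check_table_assumptions` is a hypothetical helper, and `sample_html` is a made-up stand-in for the real page so that the example runs offline. It checks the column order only through the column count, because the header text on the live page may differ from the names used here.

```python
from bs4 import BeautifulSoup

# Made-up stand-in for the Wikipedia page, so the sketch runs offline.
sample_html = """
<table class="wikitable">
  <tr><th>Postcode</th><th>Borough</th><th>Neighborhood</th></tr>
  <tr><td>M1A</td><td>Not assigned</td><td>Not assigned</td></tr>
  <tr><td>M3A</td><td>North York</td><td>Parkwoods</td></tr>
</table>
"""

def check_table_assumptions(html, expected_columns=3):
    """Hypothetical helper: raise AssertionError if an assumption fails."""
    soup = BeautifulSoup(html, 'html.parser')
    tables = soup.find_all('table', class_='wikitable')
    # Assumption 1: exactly one table carries the class 'wikitable'
    assert len(tables) == 1, 'expected exactly one wikitable'
    rows = tables[0].find_all('tr')
    # Assumption 2 (partial): the header row has the expected column count
    header = [th.get_text(strip=True) for th in rows[0].find_all('th')]
    assert len(header) == expected_columns, 'unexpected column count'
    # Assumption 3: every data row has a non-empty value in each cell
    for row in rows[1:]:
        cells = [td.get_text(strip=True) for td in row.find_all('td')]
        assert len(cells) == expected_columns and all(cells), 'empty or missing cell'
    return header

header = check_table_assumptions(sample_html)
print(header)
```

Running a check like this against `page.content` would turn a silent change in the page layout into an immediate, readable failure.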

----

Use BeautifulSoup to help scrape data from the returned Wikipedia page content

Docs: https://www.crummy.com/software/BeautifulSoup/

In [4]:
soup = BeautifulSoup(contents, 'html.parser')

headers = ['Postcode', 'Borough', 'Neighborhood']

table = soup.find('table',{'class':'wikitable'})
table_rows = table.find_all('tr')
table_rows = table_rows[1:]

df_rows = []

for row in table_rows:
    items = row.find_all('td')
    if items[1].text.strip() != 'Not assigned':
        df_row =[]
        df_row.append(items[0].text.strip())
        df_row.append(items[1].text.strip())
        df_row.append(items[2].text.strip())
        df_rows.append(df_row)


In [5]:
init_df = pd.DataFrame(data=df_rows, columns=headers)
init_df.head()

Unnamed: 0,Postcode,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,Harbourfront
3,M5A,Downtown Toronto,Regent Park
4,M6A,North York,Lawrence Heights


Now that we have the initial dataframe we need to clean up the data by doing the following:

1. Combine Neighborhoods with the same Postcode
2. Set any Neighborhood with the value of "Not assigned" to be the same as the Borough

In the code below I loop over the rows of the dataframe and create a unique mapping of each postal code. During this process I concatenate the Neighborhoods so that each unique postal code row has a string containing all the Neighborhoods associated with it.

In [6]:
c_data = {}  # cleaned data, keyed by Postcode

for index, row in init_df.iterrows():
    neighborhood = row['Neighborhood']
    # a Neighborhood of "Not assigned" falls back to the Borough name
    if neighborhood == 'Not assigned':
        neighborhood = row['Borough']
    if row['Postcode'] not in c_data:
        c_data[row['Postcode']] = [row['Postcode'], row['Borough'], neighborhood]
    # compare whole names, so a name contained in a longer one is not skipped
    elif neighborhood not in c_data[row['Postcode']][2].split(', '):
        c_data[row['Postcode']][2] += ", " + neighborhood

In [7]:
tor_df = pd.DataFrame(list(c_data.values()), columns=headers)

In [8]:
tor_df.head()

Unnamed: 0,Postcode,Borough,Neighborhood
0,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union"
1,M9R,Etobicoke,"Kingsview Village, Martin Grove Gardens, Richv..."
2,M5E,Downtown Toronto,Berczy Park
3,M4W,Downtown Toronto,Rosedale
4,M6N,York,"The Junction North, Runnymede"
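
The row-by-row loop in cell In [6] can also be written with pandas' own grouping tools. The sketch below is an alternative under stated assumptions, not the notebook's method: the two M5A rows come from the `init_df.head()` output above, while the M7A row is a made-up example of a "Not assigned" neighborhood.

```python
import pandas as pd

sample = pd.DataFrame({
    'Postcode': ['M5A', 'M5A', 'M7A'],
    'Borough': ['Downtown Toronto', 'Downtown Toronto', "Queen's Park"],
    # the M7A row is illustrative only
    'Neighborhood': ['Harbourfront', 'Regent Park', 'Not assigned'],
})

# Rule 2: a "Not assigned" Neighborhood falls back to the Borough name
mask = sample['Neighborhood'] == 'Not assigned'
sample.loc[mask, 'Neighborhood'] = sample.loc[mask, 'Borough']

# Rule 1: one row per Postcode, neighborhoods joined in first-seen order
# (dict.fromkeys drops duplicates while preserving order)
combined = (
    sample.groupby(['Postcode', 'Borough'], sort=False)['Neighborhood']
    .agg(lambda names: ', '.join(dict.fromkeys(names)))
    .reset_index()
)
print(combined)
```

Passing `sort=False` keeps the postcodes in the order they first appear, which matches the behavior of the dictionary-based loop.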


In [9]:
tor_df.shape

(103, 3)

## Part 2: Getting Geolocations

NOTE: I tried to use the geocoder package but could not get it to work, so I use the CSV file of coordinates downloaded below instead.

In [10]:
!wget -O geospacial.csv https://cocl.us/Geospatial_data

--2019-03-21 19:21:07--  https://cocl.us/Geospatial_data
Resolving cocl.us (cocl.us)... 169.48.113.201
Connecting to cocl.us (cocl.us)|169.48.113.201|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://ibm.box.com/shared/static/9afzr83pps4pwf2smjjcf1y5mvgb18rr.csv [following]
--2019-03-21 19:21:11--  https://ibm.box.com/shared/static/9afzr83pps4pwf2smjjcf1y5mvgb18rr.csv
Resolving ibm.box.com (ibm.box.com)... 107.152.26.197
Connecting to ibm.box.com (ibm.box.com)|107.152.26.197|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: /public/static/9afzr83pps4pwf2smjjcf1y5mvgb18rr.csv [following]
--2019-03-21 19:21:11--  https://ibm.box.com/public/static/9afzr83pps4pwf2smjjcf1y5mvgb18rr.csv
Reusing existing connection to ibm.box.com:443.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://ibm.ent.box.com/public/static/9afzr83pps4pwf2smjjcf1y5mvgb18rr.csv [following]
--2019-03-21 

In [11]:
geo_df = pd.read_csv('geospacial.csv')
geo_df = geo_df.rename(index=str, columns={'Postal Code':'Postcode'})

Now that we have the geospatial data for the Postcodes, we need to join the two dataframes together.

In [12]:
tor_df.head()

Unnamed: 0,Postcode,Borough,Neighborhood
0,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union"
1,M9R,Etobicoke,"Kingsview Village, Martin Grove Gardens, Richv..."
2,M5E,Downtown Toronto,Berczy Park
3,M4W,Downtown Toronto,Rosedale
4,M6N,York,"The Junction North, Runnymede"


In [13]:
geo_df.head()

Unnamed: 0,Postcode,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [14]:
tor_geo_df = pd.merge(tor_df, geo_df, on='Postcode', how='left')

In [15]:
tor_geo_df.head()

Unnamed: 0,Postcode,Borough,Neighborhood,Latitude,Longitude
0,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497
1,M9R,Etobicoke,"Kingsview Village, Martin Grove Gardens, Richv...",43.688905,-79.554724
2,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306
3,M4W,Downtown Toronto,Rosedale,43.679563,-79.377529
4,M6N,York,"The Junction North, Runnymede",43.673185,-79.487262
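
A left join keeps every row of `tor_df` even when no coordinates are found, so a missing postcode shows up only as NaN values. The sketch below is one way to surface such rows, using the `indicator` and `validate` parameters of `pd.merge`. The two small frames are made up for illustration (the coordinates are copied from the `geo_df.head()` output above), and the lookup frame deliberately lacks M9R.

```python
import pandas as pd

left = pd.DataFrame({'Postcode': ['M1C', 'M9R'],
                     'Borough': ['Scarborough', 'Etobicoke']})
# Only M1C has coordinates in this made-up lookup table.
right = pd.DataFrame({'Postcode': ['M1B', 'M1C'],
                      'Latitude': [43.806686, 43.784535],
                      'Longitude': [-79.194353, -79.160497]})

# validate raises if a Postcode is duplicated on either side;
# indicator adds a '_merge' column that says where each row came from
merged = pd.merge(left, right, on='Postcode', how='left',
                  validate='one_to_one', indicator=True)

# rows marked 'left_only' found no match in the coordinate table
missing = merged.loc[merged['_merge'] == 'left_only', 'Postcode'].tolist()
print(missing)
```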


## Part 3: Analysis

In [16]:
!conda install -c conda-forge folium=0.5.0 --yes

Fetching package metadata .............
Solving package specifications: .

Package plan for installation in environment /opt/conda/envs/DSX-Python35:

The following NEW packages will be INSTALLED:

    altair:  2.2.2-py35_1 conda-forge
    branca:  0.3.1-py_0   conda-forge
    folium:  0.5.0-py_0   conda-forge
    vincent: 0.4.4-py_1   conda-forge

altair-2.2.2-p 100% |################################| Time: 0:00:00  48.14 MB/s
branca-0.3.1-p 100% |################################| Time: 0:00:00  33.87 MB/s
vincent-0.4.4- 100% |################################| Time: 0:00:00  40.23 MB/s
folium-0.5.0-p 100% |################################| Time: 0:00:00  47.56 MB/s


In [17]:
import numpy as np
import matplotlib
import folium
import requests
from sklearn.cluster import KMeans

### Create new data frame using only the rows where the Borough contains the word 'Toronto'

In [18]:
only_tor_df = tor_geo_df[tor_geo_df['Borough'].str.contains('Toronto')].reset_index(drop=True)
only_tor_df.head()

Unnamed: 0,Postcode,Borough,Neighborhood,Latitude,Longitude
0,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306
1,M4W,Downtown Toronto,Rosedale,43.679563,-79.377529
2,M6P,West Toronto,"High Park, The Junction South",43.661608,-79.464763
3,M6R,West Toronto,"Parkdale, Roncesvalles",43.64896,-79.456325
4,M4X,Downtown Toronto,"Cabbagetown, St. James Town",43.667967,-79.367675


Create map centered on Toronto

In [19]:
latitude = 43.653908
longitude = -79.384293
tor_map = folium.Map(location=[latitude, longitude], zoom_start=12)

In [20]:
for lat, lng, borough, neighborhood in zip(only_tor_df['Latitude'], only_tor_df['Longitude'], only_tor_df['Borough'], only_tor_df['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(tor_map)

In [21]:
tor_map

In [5]:
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20190315' # Foursquare API version

### Fetch venue data from Foursquare

In [23]:
LIMIT = 100
radius = 500

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lng,
            radius,
            LIMIT)
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        # keep only the relevant information for each nearby venue
        venues_list.append([(
            name,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])

    # build the dataframe once, after every neighborhood has been fetched
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = [
        'Neighborhood',
        'Neighborhood Latitude',
        'Neighborhood Longitude',
        'Venue',
        'Venue Latitude',
        'Venue Longitude',
        'Venue Category']

    return nearby_venues
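
The function above indexes straight into the JSON response, so an error response or a venue without a category raises a `KeyError` or `IndexError`. The sketch below shows a more defensive way to read the same structure (`response` → `groups` → `items` → `venue`). `parse_venues` is a hypothetical helper, the `'Unknown'` fallback is my own assumption, and the payload is hand-written: the first venue copies values from the notebook output, the second is invented.

```python
def parse_venues(payload):
    """Hypothetical helper: extract (name, lat, lng, category) tuples."""
    groups = payload.get('response', {}).get('groups', [])
    if not groups:
        return []  # e.g. an error response without any groups
    rows = []
    for item in groups[0].get('items', []):
        venue = item['venue']
        categories = venue.get('categories') or []
        # fall back to a placeholder when a venue has no category
        category = categories[0]['name'] if categories else 'Unknown'
        rows.append((venue['name'],
                     venue['location']['lat'],
                     venue['location']['lng'],
                     category))
    return rows

# Hand-written payload in the shape the notebook code expects.
sample_payload = {'response': {'groups': [{'items': [
    {'venue': {'name': 'LCBO',
               'location': {'lat': 43.642944, 'lng': -79.37244},
               'categories': [{'name': 'Liquor Store'}]}},
    {'venue': {'name': 'Made-up Venue',  # invented, has no category
               'location': {'lat': 43.0, 'lng': -79.0},
               'categories': []}},
]}]}}

rows = parse_venues(sample_payload)
empty = parse_venues({'meta': {'code': 400}})
print(rows)
```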

In [24]:
venues = getNearbyVenues(names=only_tor_df['Neighborhood'],
    latitudes=only_tor_df['Latitude'],
    longitudes=only_tor_df['Longitude']
    )

Berczy Park
Rosedale
High Park, The Junction South
Parkdale, Roncesvalles
Cabbagetown, St. James Town
The Annex, North Midtown, Yorkville
Roselawn
Lawrence Park
Church and Wellesley
Stn A PO Boxes 25 The Esplanade
Moore Park, Summerhill East
The Beaches
Davisville North
First Canadian Place, Underground city
CN Tower, Bathurst Quay, Island airport, Harbourfront West, King and Spadina, Railway Lands, South Niagara
Christie
Runnymede, Swansea
St. James Town
Harbourfront, Regent Park
Adelaide, King, Richmond
Studio District
Chinatown, Grange Park, Kensington Market
Commerce Court, Victoria Hotel
Dovercourt Village, Dufferin
Forest Hill North, Forest Hill West
Harbourfront East, Toronto Islands, Union Station
Ryerson, Garden District
Business Reply Mail Processing Centre 969 Eastern
Central Bay Street
Deer Park, Forest Hill SE, Rathnelly, South Hill, Summerhill West
Design Exchange, Toronto Dominion Centre
Harbord, University of Toronto
Brockton, Exhibition Place, Parkdale Village
Little P

In [25]:
venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Berczy Park,43.644771,-79.373306,LCBO,43.642944,-79.37244,Liquor Store
1,Berczy Park,43.644771,-79.373306,The Keg Steakhouse + Bar,43.646676,-79.374822,Steakhouse
2,Berczy Park,43.644771,-79.373306,Sony Centre for the Performing Arts,43.646292,-79.376022,Concert Hall
3,Berczy Park,43.644771,-79.373306,Hockey Hall Of Fame (Hockey Hall of Fame),43.646974,-79.377323,Museum
4,Berczy Park,43.644771,-79.373306,Sukhothai,43.648487,-79.374547,Thai Restaurant


In [26]:
venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
"Adelaide, King, Richmond",100,100,100,100,100,100
Berczy Park,58,58,58,58,58,58
"Brockton, Exhibition Place, Parkdale Village",21,21,21,21,21,21
Business Reply Mail Processing Centre 969 Eastern,18,18,18,18,18,18
"CN Tower, Bathurst Quay, Island airport, Harbourfront West, King and Spadina, Railway Lands, South Niagara",14,14,14,14,14,14
"Cabbagetown, St. James Town",47,47,47,47,47,47
Central Bay Street,78,78,78,78,78,78
"Chinatown, Grange Park, Kensington Market",100,100,100,100,100,100
Christie,15,15,15,15,15,15
Church and Wellesley,83,83,83,83,83,83


### Apply one-hot encoding to the venue category column

In [27]:
venues_onehot = pd.get_dummies(venues[['Venue Category']], prefix="", prefix_sep="")

In [28]:
venues_onehot['Neighborhood'] = venues['Neighborhood']

In [29]:
fixed_columns = [venues_onehot.columns[-1]] + list(venues_onehot.columns[:-1])
venues_onehot = venues_onehot[fixed_columns]
venues_onehot.head()

Unnamed: 0,Yoga Studio,Accessories Store,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,...,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


### Get the mean of all columns when grouped by Neighborhood

In [30]:
venues_grouped = venues_onehot.groupby('Neighborhood').mean().reset_index()
venues_grouped.head()

Unnamed: 0,Neighborhood,Yoga Studio,Accessories Store,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,...,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,"Adelaide, King, Richmond",0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.01
1,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Brockton, Exhibition Place, Parkdale Village",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Business Reply Mail Processing Centre 969 Eastern,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"CN Tower, Bathurst Quay, Island airport, Harbo...",0.0,0.0,0.0,0.071429,0.071429,0.071429,0.142857,0.142857,0.142857,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
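
Taking the mean of one-hot columns per neighborhood gives the share of that neighborhood's venues in each category, which is why every row of `venues_grouped` sums to 1. The sketch below shows this on a tiny made-up dataset; the neighborhood names 'A' and 'B' and their venues are placeholders, not notebook data.

```python
import pandas as pd

# Made-up venues: four in neighborhood 'A', two in neighborhood 'B'.
venues_sample = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Coffee Shop', 'Coffee Shop', 'Café', 'Park',
                       'Park', 'Trail'],
})

# one column per category, 1.0 where the venue belongs to it
onehot = pd.get_dummies(venues_sample['Venue Category'], dtype=float)
onehot.insert(0, 'Neighborhood', venues_sample['Neighborhood'])

# the mean of a 0/1 column is the fraction of rows equal to 1
freq = onehot.groupby('Neighborhood').mean()
print(freq)
```

In this sample, two of the four venues in 'A' are coffee shops, so the 'Coffee Shop' value for 'A' is 0.5.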


### Find the frequency of the top 5 venue categories in each Neighborhood

In [31]:
num_top_venues = 5

for nh in venues_grouped['Neighborhood']:
    print("----"+nh+"----")
    temp = venues_grouped[venues_grouped['Neighborhood'] == nh].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Adelaide, King, Richmond----
             venue  freq
0      Coffee Shop  0.06
1             Café  0.04
2       Steakhouse  0.04
3              Bar  0.04
4  Thai Restaurant  0.04


----Berczy Park----
          venue  freq
0   Coffee Shop  0.09
1  Cocktail Bar  0.05
2    Restaurant  0.05
3   Cheese Shop  0.03
4          Café  0.03


----Brockton, Exhibition Place, Parkdale Village----
            venue  freq
0  Breakfast Spot  0.10
1            Café  0.10
2     Coffee Shop  0.10
3          Bakery  0.05
4             Gym  0.05


----Business Reply Mail Processing Centre 969 Eastern----
                venue  freq
0  Light Rail Station  0.11
1       Auto Workshop  0.06
2                 Spa  0.06
3       Burrito Place  0.06
4             Butcher  0.06


----CN Tower, Bathurst Quay, Island airport, Harbourfront West, King and Spadina, Railway Lands, South Niagara----
              venue  freq
0    Airport Lounge  0.14
1   Airport Service  0.14
2  Airport Terminal  0.14
3             P

In [32]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]
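
To make the behavior of this function concrete, the sketch below applies the same logic to a single hand-built row. The frequencies are taken from the Berczy Park output above; the explicit cast to float is my addition, so that the values are sorted as numbers and not as generic objects.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    # the first entry is the neighborhood name, the rest are frequencies
    row_categories = row.iloc[1:].astype(float)
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# One row in the shape of venues_grouped, built by hand for illustration.
row = pd.Series({
    'Neighborhood': 'Berczy Park',
    'Cheese Shop': 0.03,
    'Cocktail Bar': 0.05,
    'Coffee Shop': 0.09,
})

top_two = return_most_common_venues(row, 2)
print(list(top_two))
```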

### Create a dataframe to show the top 10 venues for each Neighborhood

In [60]:
num_top_venues = 10
indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']

for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # positions 4 and up take the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = venues_grouped['Neighborhood']

for ind in np.arange(venues_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(venues_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Adelaide, King, Richmond",Coffee Shop,Bar,Steakhouse,Café,Thai Restaurant,American Restaurant,Burger Joint,Bakery,Sushi Restaurant,Gym
1,Berczy Park,Coffee Shop,Restaurant,Cocktail Bar,Steakhouse,Bakery,Italian Restaurant,Seafood Restaurant,Pub,Cheese Shop,Café
2,"Brockton, Exhibition Place, Parkdale Village",Coffee Shop,Breakfast Spot,Café,Performing Arts Venue,Burrito Place,Stadium,Caribbean Restaurant,Bar,Bakery,Restaurant
3,Business Reply Mail Processing Centre 969 Eastern,Light Rail Station,Yoga Studio,Auto Workshop,Park,Pizza Place,Restaurant,Butcher,Burrito Place,Skate Park,Brewery
4,"CN Tower, Bathurst Quay, Island airport, Harbo...",Airport Lounge,Airport Service,Airport Terminal,Harbor / Marina,Airport,Airport Food Court,Airport Gate,Plane,Boutique,Boat or Ferry
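
The column labels above rely on the `indicators` list running out after three entries, which works for ten columns but would label an 11th or 21st column incorrectly if `num_top_venues` grew. A small helper makes the suffix rule explicit. This is a sketch, and `ordinal` is a hypothetical name that does not appear in the notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

num_top_venues = 10
columns = ['Neighborhood'] + [
    '{} Most Common Venue'.format(ordinal(i))
    for i in range(1, num_top_venues + 1)
]
print(columns)
```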


### Create clusters out of the data

In [61]:
# set number of clusters
kclusters = 10

venue_grouped_clustering = venues_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(venue_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([0, 8, 0, 0, 4, 0, 8, 0, 9, 0], dtype=int32)
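
For readers unfamiliar with k-means, the sketch below shows the basic idea (Lloyd's algorithm) in plain NumPy: assign each point to its nearest center, move each center to the mean of its points, and repeat. It is a toy illustration, not scikit-learn's implementation, which uses a smarter initialization and a convergence test. The function names and the six sample points are made up.

```python
import numpy as np

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    # pairwise distances, shape (n_points, n_centers)
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def lloyd_kmeans(points, k, n_iter=20, seed=0):
    """Toy k-means (Lloyd's algorithm); returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    # start from k distinct data points chosen at random
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(points, centers)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old center
                centers[j] = members.mean(axis=0)
    return assign(points, centers), centers

# Two well-separated groups of made-up 2-D points.
points = np.array([[0., 0.], [0., 1.], [1., 0.],
                   [10., 10.], [10., 11.], [11., 10.]])
labels, centers = lloyd_kmeans(points, k=2)
print(labels)
```

The label numbers themselves carry no meaning; only the grouping does. The same holds for the values in `kmeans.labels_`.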

### Join the tables together on the Neighborhood Column

In [62]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

tor_merged = only_tor_df

# join the cluster labels and top venues onto only_tor_df, which holds latitude/longitude for each neighborhood
tor_merged = tor_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')
tor_merged.head() # check the last columns!

Unnamed: 0,Postcode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306,8,Coffee Shop,Restaurant,Cocktail Bar,Steakhouse,Bakery,Italian Restaurant,Seafood Restaurant,Pub,Cheese Shop,Café
1,M4W,Downtown Toronto,Rosedale,43.679563,-79.377529,6,Park,Playground,Trail,Diner,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store
2,M6P,West Toronto,"High Park, The Junction South",43.661608,-79.464763,9,Mexican Restaurant,Café,Park,Arts & Crafts Store,Fast Food Restaurant,Bookstore,Flea Market,Speakeasy,Cajun / Creole Restaurant,Sandwich Place
3,M6R,West Toronto,"Parkdale, Roncesvalles",43.64896,-79.456325,0,Breakfast Spot,Gift Shop,Dessert Shop,Bookstore,Burger Joint,Eastern European Restaurant,Bar,Bank,Movie Theater,Italian Restaurant
4,M4X,Downtown Toronto,"Cabbagetown, St. James Town",43.667967,-79.367675,0,Coffee Shop,Restaurant,Café,Italian Restaurant,Bakery,Pub,Market,Pizza Place,Pharmacy,Pet Store


### Plot the map points, and display each point with the color of its cluster

In [63]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=12)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []

for lat, lon, poi, cluster in zip(tor_merged['Latitude'], tor_merged['Longitude'], tor_merged['Neighborhood'], tor_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)

map_clusters
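
The color setup above builds `ys` only to get a list of length `kclusters`, and it indexes the palette with `cluster-1`, so cluster 0 wraps around to the last color. The sketch below is a simpler way to get one distinct hex color per cluster and to index it by the label directly. It swaps in the standard library's `colorsys` module for matplotlib's rainbow colormap, and `cluster_palette` is a hypothetical helper name.

```python
import colorsys

def cluster_palette(k):
    """Return k hex colors with evenly spaced hues (hypothetical helper)."""
    palette = []
    for i in range(k):
        # spread hues evenly around the color wheel
        r, g, b = colorsys.hsv_to_rgb(i / k, 0.8, 0.9)
        palette.append('#{:02x}{:02x}{:02x}'.format(
            round(r * 255), round(g * 255), round(b * 255)))
    return palette

kclusters = 10
palette = cluster_palette(kclusters)

# index by the cluster label itself; labels run from 0 to kclusters - 1
cluster = 0
marker_color = palette[cluster]
print(marker_color)
```

With a palette like this, the `color` and `fill_color` arguments of `folium.CircleMarker` could be set to `palette[cluster]`.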