# Segmenting and Clustering Neighborhoods in Toronto

This notebook contains the solutions to all the questions of the assignment. Each question has its own section header.

## Table of contents
1. [Question 1](#question1)
2. [Question 2](#question2)
3. [Question 3](#question3)

In [1]:
import numpy as np
import pandas as pd

## Question 1 - Solution <a name="question1"></a>

In [2]:
import requests

In [3]:
url = "https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M"
wiki_url = requests.get(url)
wiki_data = pd.read_html(wiki_url.text)

wiki_data
# len(wiki_data)

[    Postal Code           Borough  \
 0           M1A      Not assigned   
 1           M2A      Not assigned   
 2           M3A        North York   
 3           M4A        North York   
 4           M5A  Downtown Toronto   
 ..          ...               ...   
 175         M5Z      Not assigned   
 176         M6Z      Not assigned   
 177         M7Z      Not assigned   
 178         M8Z         Etobicoke   
 179         M9Z      Not assigned   
 
                                          Neighbourhood  
 0                                         Not assigned  
 1                                         Not assigned  
 2                                            Parkwoods  
 3                                     Victoria Village  
 4                            Regent Park, Harbourfront  
 ..                                                 ...  
 175                                       Not assigned  
 176                                       Not assigned  
 177                

wiki_data consists of 3 tables.

The first table is the one we need as it contains: Postal Code, Borough, Neighbourhood

In [4]:
wiki_data = wiki_data[0]
wiki_data

Unnamed: 0,Postal Code,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
...,...,...,...
175,M5Z,Not assigned,Not assigned
176,M6Z,Not assigned,Not assigned
177,M7Z,Not assigned,Not assigned
178,M8Z,Etobicoke,"Mimico NW, The Queensway West, South of Bloor,..."


Drop the rows where the "Borough" column is "Not assigned".

In [5]:
pc_df = wiki_data[wiki_data["Borough"] != "Not assigned"]
pc_df.shape

(103, 3)

77 rows were deleted. New shape: (103, 3)
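The filter above relies on boolean indexing. A minimal sketch on a made-up three-row table (the `toy` frame and its rows are illustrative stand-ins, not the scraped data):

```python
import pandas as pd

# Toy stand-in for the scraped table; the rows are chosen only for illustration.
toy = pd.DataFrame({
    "Postal Code": ["M1A", "M3A", "M4A"],
    "Borough": ["Not assigned", "North York", "North York"],
    "Neighbourhood": ["Not assigned", "Parkwoods", "Victoria Village"],
})

# The comparison yields a boolean Series; indexing with it keeps only the True rows.
mask = toy["Borough"] != "Not assigned"
kept = toy[mask]
dropped = len(toy) - len(kept)
print(kept.shape, dropped)
```

The same subtraction on the real data (rows before minus rows after) gives the number of deleted rows reported above.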

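More than one neighborhood can exist in one postal code area. Such rows should end up as a single row, with the neighborhoods separated by commas. The Wikipedia table already lists them that way (see "Regent Park, Harbourfront" above), so there is nothing left to merge. Note that `groupby(['Postal Code']).head()` does not combine rows: it returns the first five rows of each group, which leaves this data unchanged.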

In [6]:
pc_df = pc_df.groupby(['Postal Code']).head()
pc_df.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
5,M6A,North York,"Lawrence Manor, Lawrence Heights"
6,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
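For a table that lists one neighbourhood per row, the combining step could be sketched as follows. The `toy` rows are invented, and `agg` with `", ".join` is one possible way to do it, not necessarily the one the assignment has in mind:

```python
import pandas as pd

# Toy table with one neighbourhood per row (invented layout for illustration).
toy = pd.DataFrame({
    "Postal Code": ["M5A", "M5A", "M3A"],
    "Borough": ["Downtown Toronto", "Downtown Toronto", "North York"],
    "Neighbourhood": ["Regent Park", "Harbourfront", "Parkwoods"],
})

# Group by postal code and borough, then join the neighbourhood names of each group.
combined = (
    toy.groupby(["Postal Code", "Borough"], sort=False)["Neighbourhood"]
       .agg(", ".join)
       .reset_index()
)
print(combined)
```

`sort=False` keeps the groups in order of first appearance, so the row order of the original table is preserved.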


Now, reset the index and drop the old one.

In [7]:
pc_df.reset_index(drop=True, inplace=True)
pc_df

Unnamed: 0,Postal Code,Borough,Neighbourhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park, Harbourfront"
3,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
...,...,...,...
98,M8X,Etobicoke,"The Kingsway, Montgomery Road, Old Mill North"
99,M4Y,Downtown Toronto,Church and Wellesley
100,M7Y,East Toronto,"Business reply mail Processing Centre, South C..."
101,M8Y,Etobicoke,"Old Mill South, King's Mill Park, Sunnylea, Hu..."


Let's check for any remaining "Not assigned" values in the Neighbourhood column.

In [8]:
pc_df.Neighbourhood.str.count("Not assigned").sum()

0

In [9]:
print('Dataframe shape: {}'.format(pc_df.shape))

Dataframe shape: (103, 3)


## Question 2 - Solution <a name="question2"></a>

In [10]:
#Needed only the 1st time

# pip install geocoder

In [11]:
import geocoder

Method 1: Try to fetch lon/lat from geocoder module, as described in the assignment

In [12]:
postal_codes = pc_df['Postal Code'].unique()
postal_codes

array(['M3A', 'M4A', 'M5A', 'M6A', 'M7A', 'M9A', 'M1B', 'M3B', 'M4B',
       'M5B', 'M6B', 'M9B', 'M1C', 'M3C', 'M4C', 'M5C', 'M6C', 'M9C',
       'M1E', 'M4E', 'M5E', 'M6E', 'M1G', 'M4G', 'M5G', 'M6G', 'M1H',
       'M2H', 'M3H', 'M4H', 'M5H', 'M6H', 'M1J', 'M2J', 'M3J', 'M4J',
       'M5J', 'M6J', 'M1K', 'M2K', 'M3K', 'M4K', 'M5K', 'M6K', 'M1L',
       'M2L', 'M3L', 'M4L', 'M5L', 'M6L', 'M9L', 'M1M', 'M2M', 'M3M',
       'M4M', 'M5M', 'M6M', 'M9M', 'M1N', 'M2N', 'M3N', 'M4N', 'M5N',
       'M6N', 'M9N', 'M1P', 'M2P', 'M4P', 'M5P', 'M6P', 'M9P', 'M1R',
       'M2R', 'M4R', 'M5R', 'M6R', 'M7R', 'M9R', 'M1S', 'M4S', 'M5S',
       'M6S', 'M1T', 'M4T', 'M5T', 'M1V', 'M4V', 'M5V', 'M8V', 'M9V',
       'M1W', 'M4W', 'M5W', 'M8W', 'M9W', 'M1X', 'M4X', 'M5X', 'M8X',
       'M4Y', 'M7Y', 'M8Y', 'M8Z'], dtype=object)

In [13]:
# postal code -> [latitude, longitude]
pc_coord = dict()

try_geocoder = False

if try_geocoder:
    for pc in postal_codes:
        # reset for every postal code; otherwise the first result is reused for all of them
        lat_lng_coords = None
        # loop until you get the coordinates
        while lat_lng_coords is None:
            g = geocoder.google('{}, Toronto, Ontario'.format(pc))
            lat_lng_coords = g.latlng

        pc_coord[pc] = [lat_lng_coords[0], lat_lng_coords[1]]

Method 1 does not seem to work as it continuously returns None. Let's load the .csv instead.
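Because the lookup keeps returning None, an unbounded `while` loop never finishes. A minimal sketch of a bounded retry is shown here; `fetch_with_retries` and the two stand-in lookups are hypothetical names introduced for illustration, and in the notebook the `lookup` argument could be something like `lambda q: geocoder.google(q).latlng`:

```python
def fetch_with_retries(lookup, query, max_attempts=3):
    """Call lookup(query) until it returns a non-None value or the attempts run out."""
    for _ in range(max_attempts):
        result = lookup(query)
        if result is not None:
            return result
    return None


# Stand-in lookup that always fails, mimicking the behaviour described above.
def always_none(query):
    return None


# Stand-in lookup that succeeds on the second call.
calls = {"n": 0}

def second_time_lucky(query):
    calls["n"] += 1
    return [43.65, -79.38] if calls["n"] >= 2 else None


failed = fetch_with_retries(always_none, "M5A, Toronto, Ontario")
found = fetch_with_retries(second_time_lucky, "M5A, Toronto, Ontario")
print(failed, found)
```

With a cap on attempts, a failing service yields None after a few tries, and the notebook can fall back to the .csv file.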

Method 2 - Load the .csv file

In [14]:
pc_coord = pd.read_csv("https://cocl.us/Geospatial_data")
pc_coord.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [15]:
pc_coord.shape

(103, 3)

OK, the two DataFrames have the same shape. Let's join them into a single DataFrame on "Postal Code", as described in the assignment.

In [16]:
final_df = pc_df.join(pc_coord.set_index('Postal Code'), on='Postal Code', how='inner')
final_df.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494


The DataFrame now has 5 columns, as Latitude and Longitude were added.

In [17]:
print('Final Dataframe shape: {}'.format(final_df.shape))

Final Dataframe shape: (103, 5)
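Equal shapes do not by themselves guarantee that both tables hold the same postal codes. The sketch below checks the keys explicitly and uses `merge` with `validate` as an alternative to `join`; the three rows are copied from the outputs above, while the variable names are illustrative:

```python
import pandas as pd

left = pd.DataFrame({
    "Postal Code": ["M3A", "M4A", "M5A"],
    "Borough": ["North York", "North York", "Downtown Toronto"],
})
right = pd.DataFrame({
    "Postal Code": ["M5A", "M3A", "M4A"],
    "Latitude": [43.65426, 43.753259, 43.725882],
    "Longitude": [-79.360636, -79.329656, -79.315572],
})

# Compare the key sets; row order does not matter for the join.
same_keys = set(left["Postal Code"]) == set(right["Postal Code"])

# validate="one_to_one" raises if a postal code is duplicated on either side.
merged = left.merge(right, on="Postal Code", how="inner", validate="one_to_one")
print(same_keys, merged.shape)
```

If the key sets differed, an inner join would silently drop the unmatched rows, so the check is worth running before trusting the final shape.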


## Question 3 - Solution <a name="question3"></a>

In [18]:
##Required the 1st time only
# pip install folium

In [19]:
##Import libraries for maps, clusters and final plots

import folium
from sklearn.cluster import KMeans
import matplotlib.cm as cm
import matplotlib.colors as colors

Let's plot a map of Toronto to view the neighbourhoods!

In [20]:
mean_lat = np.mean(final_df['Latitude'])
mean_lon = np.mean(final_df['Longitude'])

# create map of Toronto using latitude and longitude values
map_toronto = folium.Map(location=[mean_lat, mean_lon], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(final_df['Latitude'], final_df['Longitude'], \
                                           final_df['Borough'], final_df['Neighbourhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='brown',
        fill=True,
        fill_color='orange',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

Now, let's pick a specific borough to work with. Let's see the options.

In [21]:
print(final_df["Borough"].unique())

['North York' 'Downtown Toronto' 'Etobicoke' 'Scarborough' 'East York'
 'York' 'East Toronto' 'West Toronto' 'Central Toronto' 'Mississauga']


Downtown Toronto sounds good!

In [22]:
dt_toronto = final_df[final_df["Borough"]=="Downtown Toronto"]
mean_lat = np.mean(dt_toronto['Latitude'])
mean_lon = np.mean(dt_toronto['Longitude'])

# create map of Downtown Toronto using latitude and longitude values
map_dt_toronto = folium.Map(location=[mean_lat, mean_lon], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(dt_toronto['Latitude'], dt_toronto['Longitude'], \
                                           dt_toronto['Borough'], dt_toronto['Neighbourhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='brown',
        fill=True,
        fill_color='orange',
        fill_opacity=0.7,
        parse_html=False).add_to(map_dt_toronto)  
    
map_dt_toronto

Foursquare time! Set the configuration variables:

In [34]:
CLIENT_ID = '***' # your Foursquare ID
CLIENT_SECRET = '***' # your Foursquare Secret
ACCESS_TOKEN = '***' # your FourSquare Access Token
VERSION = '20180604'

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: ***
CLIENT_SECRET: ***
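Printing credentials stores them in the notebook output. One possible alternative, sketched under assumptions, is to read them from environment variables and print only a masked form. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and both helper functions are hypothetical:

```python
import os


def load_credentials(env=os.environ):
    """Read the credentials from a mapping; the variable names are hypothetical."""
    client_id = env.get("FOURSQUARE_CLIENT_ID", "")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET", "")
    return client_id, client_secret


def mask(secret, visible=4):
    """Keep only the first few characters so the value is safe to print."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return secret[:visible] + "*" * (len(secret) - visible)


# Demonstration with a fake mapping instead of the real environment.
fake_env = {"FOURSQUARE_CLIENT_ID": "ABCD1234", "FOURSQUARE_CLIENT_SECRET": "s3cr3tvalue"}
client_id, client_secret = load_credentials(fake_env)
print("CLIENT_ID: " + mask(client_id))
print("CLIENT_SECRET: " + mask(client_secret))
```

Calling `load_credentials()` without arguments reads from the real environment, so the secrets never need to appear in a cell.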


Now, let's explore the neighbourhood!

In [24]:
def getNearbyVenues(names, latitudes, longitudes, radius=500, limit=100):
    """
    Return a data frame with up to `limit` venues (100 by default) for each neighborhood, given its latitude and longitude.
    """
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            limit)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
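The list comprehension in `getNearbyVenues` assumes every venue has at least one category and that the response always contains a group. A defensive variant of the parsing step could look like the sketch below. `parse_explore_items` is a hypothetical helper, and the payload shape is inferred only from the keys the function above indexes, not checked against the Foursquare documentation:

```python
def parse_explore_items(payload, name, lat, lng):
    """Flatten one response into rows, skipping venues without a category."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        categories = venue.get("categories", [])
        if not categories:
            continue  # indexing categories[0] would raise IndexError here
        rows.append((
            name, lat, lng,
            venue["name"],
            venue["location"]["lat"],
            venue["location"]["lng"],
            categories[0]["name"],
        ))
    return rows


# Hand-written payload that mimics the keys used by getNearbyVenues.
fake_payload = {"response": {"groups": [{"items": [
    {"venue": {"name": "Tandem Coffee",
               "location": {"lat": 43.653559, "lng": -79.361809},
               "categories": [{"name": "Coffee Shop"}]}},
    {"venue": {"name": "No Category Venue",
               "location": {"lat": 0.0, "lng": 0.0},
               "categories": []}},
]}]}}

rows = parse_explore_items(fake_payload, "Regent Park, Harbourfront", 43.65426, -79.360636)
empty = parse_explore_items({"response": {}}, "Christie", 43.669542, -79.422564)
print(rows, empty)
```

Keeping the parsing separate from the HTTP call also makes it possible to test the row layout without network access.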

In [25]:
dt_toronto_venues = getNearbyVenues(names=dt_toronto['Neighbourhood'],
                                   latitudes=dt_toronto['Latitude'],
                                   longitudes=dt_toronto['Longitude']
                                  )

print('\nVenues dataframe shape: {}'.format(dt_toronto_venues.shape))
print('Unique categories: {}\n'.format(len(dt_toronto_venues['Venue Category'].unique())))
dt_toronto_venues.head()

Regent Park, Harbourfront
Queen's Park, Ontario Provincial Government
Garden District, Ryerson
St. James Town
Berczy Park
Central Bay Street
Christie
Richmond, Adelaide, King
Harbourfront East, Union Station, Toronto Islands
Toronto Dominion Centre, Design Exchange
Commerce Court, Victoria Hotel
University of Toronto, Harbord
Kensington Market, Chinatown, Grange Park
CN Tower, King and Spadina, Railway Lands, Harbourfront West, Bathurst Quay, South Niagara, Island airport
Rosedale
Stn A PO Boxes
St. James Town, Cabbagetown
First Canadian Place, Underground city
Church and Wellesley

Venues dataframe shape: (1222, 7)
Unique categories: 210



Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Regent Park, Harbourfront",43.65426,-79.360636,Roselle Desserts,43.653447,-79.362017,Bakery
1,"Regent Park, Harbourfront",43.65426,-79.360636,Tandem Coffee,43.653559,-79.361809,Coffee Shop
2,"Regent Park, Harbourfront",43.65426,-79.360636,Cooper Koo Family YMCA,43.653249,-79.358008,Distribution Center
3,"Regent Park, Harbourfront",43.65426,-79.360636,Body Blitz Spa East,43.654735,-79.359874,Spa
4,"Regent Park, Harbourfront",43.65426,-79.360636,Impact Kitchen,43.656369,-79.35698,Restaurant


A lot of categories! To analyze them properly, let's one-hot encode them.

In [26]:
# one hot encoding
dt_onehot = pd.get_dummies(dt_toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighbourhood column back to dataframe
dt_onehot['Neighborhood'] = dt_toronto_venues['Neighborhood'] 

# move neighbourhood column to the first column
fixed_columns = [dt_onehot.columns[-1]] + list(dt_onehot.columns[:-1])
dt_onehot = dt_onehot[fixed_columns]

print('Venues categories shape: {}'.format(dt_onehot.shape))
dt_onehot.head()

Venues categories shape: (1222, 210)


Unnamed: 0,Yoga Studio,Adult Boutique,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,...,Tea Room,Thai Restaurant,Theater,Theme Restaurant,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
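In the output above, `Yoga Studio` is the first column, and the frame has 210 columns, the same as the number of unique categories. That is the pattern one would expect if a venue category is itself called "Neighborhood": the assignment then overwrites that dummy column instead of appending a new one, and `columns[-1]` moves `Yoga Studio` to the front. This is an inference from the output, not something the notebook confirms. The toy sketch below reproduces the effect and reorders by name instead of by position (it still overwrites the colliding dummy column):

```python
import pandas as pd

# Toy venues; the category "Neighborhood" is an assumption made to
# reproduce the suspected name collision.
venues = pd.DataFrame({
    "Neighborhood": ["Berczy Park", "Berczy Park", "Christie"],
    "Venue Category": ["Coffee Shop", "Neighborhood", "Yoga Studio"],
})

onehot = pd.get_dummies(venues[["Venue Category"]], prefix="", prefix_sep="")
n_dummy_columns = onehot.shape[1]

# Assigning to an existing column name overwrites it instead of appending.
onehot["Neighborhood"] = venues["Neighborhood"]
collided = onehot.shape[1] == n_dummy_columns

# Moving the column by name works whether or not it ended up last.
ordered = ["Neighborhood"] + [c for c in onehot.columns if c != "Neighborhood"]
onehot = onehot[ordered]
print(collided, list(onehot.columns))
```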


For each neighbourhood, compute the mean of each one-hot column, i.e. the share of the neighbourhood's venues that fall into each category.

In [27]:
dt_grouped = dt_onehot.groupby('Neighborhood').mean().reset_index()

print('Grouped shape: {}'.format(dt_grouped.shape))
dt_grouped.head()

Grouped shape: (19, 210)


Unnamed: 0,Neighborhood,Yoga Studio,Adult Boutique,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,...,Tea Room,Thai Restaurant,Theater,Theme Restaurant,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar
0,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.017544,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0
1,"CN Tower, King and Spadina, Railway Lands, Har...",0.0,0.0,0.076923,0.076923,0.076923,0.153846,0.076923,0.076923,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Central Bay Street,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.016667,0.033333,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.016667
3,Christie,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Church and Wellesley,0.024096,0.012048,0.0,0.0,0.0,0.0,0.0,0.0,0.012048,...,0.0,0.012048,0.012048,0.012048,0.0,0.0,0.0,0.0,0.0,0.0
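Averaging 0/1 columns per group gives relative frequencies. A small worked example with invented values (neighbourhoods "A" and "B" are placeholders):

```python
import pandas as pd

# One row per venue, one 0/1 column per category.
onehot = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "B", "B"],
    "Coffee Shop": [1, 1, 0, 0, 0, 1],
    "Park":        [0, 0, 1, 0, 1, 0],
})

# The mean of a 0/1 column is the fraction of rows equal to 1.
grouped = onehot.groupby("Neighborhood").mean().reset_index()
print(grouped)
```

Neighbourhood "A" has four venues, two of which are coffee shops, so its `Coffee Shop` value is 0.5. Rows need not sum to 1 here because the fourth venue of "A" belongs to neither listed category.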


We now create a dataframe with the top 10 categories of venues for each neighbourhood.

In [28]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

In [29]:
num_top_venues = 10

# create columns according to number of top venues
columns = ['Neighbourhood']
for ind in np.arange(num_top_venues):
    columns.append('Venue no. {}'.format(ind+1))

# create a new dataframe
neighbourhoods_venues_sorted = pd.DataFrame(columns=columns)
neighbourhoods_venues_sorted['Neighbourhood'] = dt_grouped['Neighborhood']

for ind in np.arange(dt_grouped.shape[0]):
    neighbourhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(dt_grouped.iloc[ind, :], num_top_venues)

neighbourhoods_venues_sorted.head()

Unnamed: 0,Neighbourhood,Venue no. 1,Venue no. 2,Venue no. 3,Venue no. 4,Venue no. 5,Venue no. 6,Venue no. 7,Venue no. 8,Venue no. 9,Venue no. 10
0,Berczy Park,Coffee Shop,Cocktail Bar,Bakery,Cheese Shop,Seafood Restaurant,Restaurant,Pharmacy,Beer Bar,Farmers Market,Museum
1,"CN Tower, King and Spadina, Railway Lands, Har...",Airport Lounge,Boat or Ferry,Airport Terminal,Boutique,Rental Car Location,Harbor / Marina,Sculpture Garden,Coffee Shop,Airport Service,Airport Gate
2,Central Bay Street,Coffee Shop,Sandwich Place,Café,Italian Restaurant,Bubble Tea Shop,Burger Joint,Thai Restaurant,Salad Place,Ramen Restaurant,Restaurant
3,Christie,Grocery Store,Café,Park,Nightclub,Baby Store,Athletics & Sports,Candy Store,Restaurant,Italian Restaurant,Coffee Shop
4,Church and Wellesley,Coffee Shop,Japanese Restaurant,Sushi Restaurant,Restaurant,Gay Bar,Pub,Burger Joint,Café,Men's Store,Mediterranean Restaurant
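The ranking step can be checked on a tiny frequency table. The sketch below follows the same idea as `return_most_common_venues` (skip the name column, sort the rest in descending order), using invented frequencies and a hypothetical helper name `top_categories`:

```python
import pandas as pd

grouped = pd.DataFrame({
    "Neighborhood": ["A", "B"],
    "Coffee Shop": [0.5, 0.1],
    "Park": [0.3, 0.6],
    "Bakery": [0.2, 0.3],
})


def top_categories(row, n):
    # Skip the first entry (the neighbourhood name) and rank the rest.
    freqs = row.iloc[1:].astype(float)
    return list(freqs.sort_values(ascending=False).index[:n])


# Build one row of top categories per neighbourhood.
top2 = pd.DataFrame(
    [top_categories(row, 2) for _, row in grouped.iterrows()],
    columns=["Venue no. 1", "Venue no. 2"],
)
top2.insert(0, "Neighbourhood", grouped["Neighborhood"])
print(top2)
```

The toy values contain no ties on purpose; when two categories share the same frequency, their relative order depends on the sort and should not be read as meaningful.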


### KMeans clustering

Now, let's get to the point and cluster based on these venues!

In [30]:
# set number of clusters
k_clusters = 5

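# keep only the numeric frequency columns (a positional axis argument is rejected by recent pandas)
dt_grouped_clustering = dt_grouped.drop(columns='Neighborhood')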

# run k-means clustering
kmeans = KMeans(n_clusters=k_clusters, random_state=42).fit(dt_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([1, 4, 1, 3, 1, 1, 1, 1, 1, 1])
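The label numbers themselves carry no meaning; only which rows share a label matters. A toy sketch with four invented "neighbourhoods" and two category frequencies illustrates this:

```python
import numpy as np
from sklearn.cluster import KMeans

# Four toy rows described by two category frequencies (invented values).
features = np.array([
    [0.90, 0.10],
    [0.85, 0.15],
    [0.10, 0.90],
    [0.05, 0.95],
])

# n_init is set explicitly because its default differs between scikit-learn versions.
model = KMeans(n_clusters=2, n_init=10, random_state=42).fit(features)
labels = model.labels_
print(labels)
```

The first two rows end up in one cluster and the last two in the other. Which group is called 0 and which is called 1 depends on the initialization, so downstream code should not rely on specific label values.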

In [31]:
# add clustering labels
neighbourhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

dt_merged = dt_toronto

# join dt_toronto with the sorted venues to add the cluster label and top venues to each neighbourhood
dt_merged = dt_merged.join(neighbourhoods_venues_sorted.set_index('Neighbourhood'), on='Neighbourhood')

dt_merged.head() # check the last columns!

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,Venue no. 1,Venue no. 2,Venue no. 3,Venue no. 4,Venue no. 5,Venue no. 6,Venue no. 7,Venue no. 8,Venue no. 9,Venue no. 10
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,1,Coffee Shop,Park,Bakery,Theater,Café,Pub,Breakfast Spot,Yoga Studio,Restaurant,Hotel
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,1,Coffee Shop,College Cafeteria,Sushi Restaurant,Diner,Bar,Italian Restaurant,Beer Bar,Sandwich Place,Restaurant,Distribution Center
9,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937,1,Clothing Store,Coffee Shop,Bubble Tea Shop,Middle Eastern Restaurant,Café,Japanese Restaurant,Italian Restaurant,Cosmetics Shop,Hotel,Bookstore
15,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418,1,Café,Coffee Shop,Cocktail Bar,Gastropub,American Restaurant,Art Gallery,Moroccan Restaurant,Department Store,Clothing Store,Hotel
20,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306,1,Coffee Shop,Cocktail Bar,Bakery,Cheese Shop,Seafood Restaurant,Restaurant,Pharmacy,Beer Bar,Farmers Market,Museum


In [32]:
dt_merged.head(10)

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,Venue no. 1,Venue no. 2,Venue no. 3,Venue no. 4,Venue no. 5,Venue no. 6,Venue no. 7,Venue no. 8,Venue no. 9,Venue no. 10
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,1,Coffee Shop,Park,Bakery,Theater,Café,Pub,Breakfast Spot,Yoga Studio,Restaurant,Hotel
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,1,Coffee Shop,College Cafeteria,Sushi Restaurant,Diner,Bar,Italian Restaurant,Beer Bar,Sandwich Place,Restaurant,Distribution Center
9,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937,1,Clothing Store,Coffee Shop,Bubble Tea Shop,Middle Eastern Restaurant,Café,Japanese Restaurant,Italian Restaurant,Cosmetics Shop,Hotel,Bookstore
15,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418,1,Café,Coffee Shop,Cocktail Bar,Gastropub,American Restaurant,Art Gallery,Moroccan Restaurant,Department Store,Clothing Store,Hotel
20,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306,1,Coffee Shop,Cocktail Bar,Bakery,Cheese Shop,Seafood Restaurant,Restaurant,Pharmacy,Beer Bar,Farmers Market,Museum
24,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383,1,Coffee Shop,Sandwich Place,Café,Italian Restaurant,Bubble Tea Shop,Burger Joint,Thai Restaurant,Salad Place,Ramen Restaurant,Restaurant
25,M6G,Downtown Toronto,Christie,43.669542,-79.422564,3,Grocery Store,Café,Park,Nightclub,Baby Store,Athletics & Sports,Candy Store,Restaurant,Italian Restaurant,Coffee Shop
30,M5H,Downtown Toronto,"Richmond, Adelaide, King",43.650571,-79.384568,1,Coffee Shop,Café,Restaurant,Gym,Clothing Store,Thai Restaurant,Deli / Bodega,Bookstore,Seafood Restaurant,Bakery
36,M5J,Downtown Toronto,"Harbourfront East, Union Station, Toronto Islands",43.640816,-79.381752,1,Coffee Shop,Aquarium,Hotel,Café,Sporting Goods Shop,Restaurant,Italian Restaurant,Scenic Lookout,Brewery,Fried Chicken Joint
42,M5K,Downtown Toronto,"Toronto Dominion Centre, Design Exchange",43.647177,-79.381576,1,Coffee Shop,Hotel,Café,Restaurant,Japanese Restaurant,Italian Restaurant,American Restaurant,Seafood Restaurant,Salad Place,Breakfast Spot


In [33]:
# create map
map_clusters = folium.Map(location=[mean_lat, mean_lon], zoom_start=11)

# set color scheme for the clusters
x = np.arange(k_clusters)
ys = [i + x + (i*x)**2 for i in range(k_clusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(dt_merged['Latitude'], dt_merged['Longitude'], \
                                  dt_merged['Neighbourhood'], dt_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        # labels run from 0 to k_clusters-1, so they index the palette directly
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
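The colour scheme only needs one hex colour per cluster label. A compact sketch of that mapping, using the same `cm.rainbow` and `colors.rgb2hex` calls as the cell above (`palette` and `color_of` are illustrative names):

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

k_clusters = 5

# Sample k evenly spaced colours from the rainbow colormap and convert them to hex.
palette = [colors.rgb2hex(c) for c in cm.rainbow(np.linspace(0, 1, k_clusters))]

# Map each label 0..k-1 to its own colour.
color_of = {label: palette[label] for label in range(k_clusters)}
print(color_of)
```

Building the palette straight from `k_clusters` avoids the intermediate `x` and `ys` arrays, which are only used for their length.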