In [2]:
import pandas as pd
import requests
import numpy as np
from bs4 import BeautifulSoup
print("Libraries imported.")

Libraries imported.


Go to [Part 2](#PART-2)  
Go to [Part 3](#PART-3)

## Download Dataset 

To obtain the **postal code, borough and neighborhood** of each Toronto area, we scrape the table on [this Wikipedia page](https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M) and build the required dataframe from it.

Fetch the HTML of the Wikipedia page with the **requests** library and parse it with **BeautifulSoup**.

In [3]:
# GET request
data = requests.get('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M').text
soup = BeautifulSoup(data, 'html.parser')

We need three lists to hold the postal codes, boroughs and neighborhoods.

In [4]:
postalCode=[]
borough=[]
neighborhood=[]

The required information is in the `tr` and `td` HTML elements: each `tr` is a table row, and each `td` is a cell within that row.

In [5]:
for row in soup.find('table').find_all('tr'):
    cells = row.find_all('td')
    if(len(cells) > 0):
        postalCode.append(cells[0].text.rstrip('\n'))
        borough.append(cells[1].text.rstrip('\n'))
        neighborhood.append(cells[2].text.rstrip('\n'))
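
The same row-and-cell walk can be tried on an inline HTML fragment, which makes the parsing logic checkable without a network call. This is only a sketch: the sample markup and the `parse_rows` helper are made up for illustration, and `get_text(strip=True)` is used in place of `rstrip('\n')`, so it also trims surrounding spaces.

```python
from bs4 import BeautifulSoup

# Hypothetical fragment shaped like the scraped table (the header row uses <th>).
SAMPLE_HTML = """
<table>
  <tr><th>Postcode</th><th>Borough</th><th>Neighbourhood</th></tr>
  <tr><td>M1A</td><td>Not assigned</td><td>Not assigned
</td></tr>
  <tr><td>M3A</td><td>North York</td><td>Parkwoods
</td></tr>
</table>
"""


def parse_rows(html):
    """Return one (postal code, borough, neighborhood) tuple per data row."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find("table").find_all("tr"):
        cells = tr.find_all("td")
        if len(cells) >= 3:  # header rows hold <th> cells only, so they are skipped
            rows.append(tuple(c.get_text(strip=True) for c in cells[:3]))
    return rows


rows = parse_rows(SAMPLE_HTML)
print(rows)
```

Requiring at least three cells, rather than more than zero, avoids an `IndexError` if a row is shorter than expected.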

Now we have three lists with the information. Before creating a dataframe from them, we check that they have the same length.

In [6]:
print(len(postalCode))
print(len(borough))
print(len(neighborhood))

180
180
180


### Create Pandas Dataframe

In [13]:
toronto_df=pd.DataFrame({"Postal Code": postalCode,"Borough": borough,"Neighborhood": neighborhood})
toronto_df.head()

Unnamed: 0,Postal Code,Borough,Neighborhood
0,M1A,Not assigned,
1,M2A,Not assigned,
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"


**Instruction:** Only process the cells that have an assigned borough. Ignore cells with a borough that is Not assigned.

In [14]:
toronto_df=toronto_df[toronto_df["Borough"] != "Not assigned"].reset_index(drop=True)
toronto_df.head()

Unnamed: 0,Postal Code,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park, Harbourfront"
3,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"


**Instruction**: If a cell has a borough but a Not assigned neighborhood, then the neighborhood will be the same as the borough.

In [15]:
toronto_df["Neighborhood"]=np.where(toronto_df["Neighborhood"]=="Not assigned",toronto_df["Borough"],toronto_df["Neighborhood"])
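
The effect of the `np.where` fallback is easiest to see on a tiny frame. The two rows below are made up for illustration and are not taken from the scraped table.

```python
import numpy as np
import pandas as pd

# Made-up rows; the second has a borough but a placeholder neighborhood.
sample = pd.DataFrame({
    "Borough": ["North York", "Queen's Park"],
    "Neighborhood": ["Parkwoods", "Not assigned"],
})

# Where the neighborhood is the placeholder, fall back to the borough name;
# otherwise keep the neighborhood unchanged.
sample["Neighborhood"] = np.where(
    sample["Neighborhood"] == "Not assigned",
    sample["Borough"],
    sample["Neighborhood"],
)
print(sample)
```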

### Group by Postal Code and Borough, joining Neighborhoods with ", "

In [16]:
toronto_grouped = toronto_df.groupby(["Postal Code", "Borough"], as_index=False).agg(lambda x: ", ".join(x))
toronto_grouped.head()

Unnamed: 0,Postal Code,Borough,Neighborhood
0,M1B,Scarborough,"Malvern, Rouge"
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek"
2,M1E,Scarborough,"Guildwood, Morningside, West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae
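
To illustrate what the grouping does when a postal code appears on more than one row, here is a small sketch with made-up rows. It selects the `Neighborhood` column explicitly and passes `", ".join` directly instead of a lambda; that is a stylistic assumption, not the notebook's exact call.

```python
import pandas as pd

# Made-up input: M1B is listed twice, once per neighborhood.
sample = pd.DataFrame({
    "Postal Code": ["M1B", "M1B", "M1G"],
    "Borough": ["Scarborough", "Scarborough", "Scarborough"],
    "Neighborhood": ["Malvern", "Rouge", "Woburn"],
})

# One row per (postal code, borough); neighborhoods are joined with ", ".
grouped = (
    sample.groupby(["Postal Code", "Borough"])["Neighborhood"]
    .agg(", ".join)
    .reset_index()
)
print(grouped)
```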


In [17]:
toronto_grouped.shape

(103, 3)

# PART 2

Now we will add coordinates to the dataframe above, using the postal codes as the key. Here is a link to a CSV file that has the geographical coordinates of each postal code: http://cocl.us/Geospatial_data

In [18]:
coor= pd.read_csv('https://cocl.us/Geospatial_data')
coor.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


**Merging the two dataframes**

In [19]:
toronto_final = toronto_grouped.merge(coor, on="Postal Code", how="left")
toronto_final.head()

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude
0,M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
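
A left merge keeps every postal code even when the coordinate file has no matching row, and the unmatched rows then carry `NaN` coordinates. The sketch below, on made-up frames, shows one way to surface such rows with the `indicator` and `validate` parameters of `DataFrame.merge`. The postal code `M9Z` and its borough are hypothetical.

```python
import pandas as pd

areas = pd.DataFrame({
    "Postal Code": ["M1B", "M1C", "M9Z"],  # M9Z is a made-up code with no coordinates
    "Borough": ["Scarborough", "Scarborough", "Hypothetical"],
})
coords = pd.DataFrame({
    "Postal Code": ["M1B", "M1C"],
    "Latitude": [43.806686, 43.784535],
    "Longitude": [-79.194353, -79.160497],
})

# validate raises if a postal code is duplicated on either side;
# indicator adds a "_merge" column saying where each row came from.
merged = areas.merge(coords, on="Postal Code", how="left",
                     validate="one_to_one", indicator=True)
missing = merged.loc[merged["_merge"] == "left_only", "Postal Code"].tolist()
print(missing)
```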


In [22]:
toronto_final.shape

(103, 5)

# PART 3

In [23]:
!conda install -c conda-forge folium=0.5.0 --yes
import folium # map rendering library
!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim
from pandas import json_normalize  # top-level import (pandas >= 1.0); pandas.io.json.json_normalize is deprecated
import matplotlib.cm as cm
import matplotlib.colors as colors
from sklearn.cluster import KMeans


In [24]:
address = 'Toronto'

geolocator = Nominatim(user_agent="ozcan_app")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.6534817, -79.3839347.


In [25]:
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(toronto_final['Latitude'], toronto_final['Longitude'], toronto_final['Borough'], toronto_final['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_toronto)  
    
map_toronto

I decided to work only with the boroughs whose names contain the word "Toronto".

In [27]:
boroughs = list(toronto_final.Borough.unique())
borough_toronto = []
for x in boroughs:
    if "toronto" in x.lower():
        borough_toronto.append(x)       
borough_toronto

['East Toronto', 'Central Toronto', 'Downtown Toronto', 'West Toronto']
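
The same case-insensitive filter can be written without an explicit loop by using the pandas string accessor. A minimal sketch on a made-up list of borough names:

```python
import pandas as pd

boroughs = pd.Series(
    ["East Toronto", "Scarborough", "Downtown Toronto", "North York"],
    name="Borough",
)

# case=False makes the substring match case-insensitive,
# like calling .lower() on each name first.
mask = boroughs.str.contains("toronto", case=False)
toronto_only = boroughs[mask].tolist()
print(toronto_only)
```

The boolean `mask` can also index a dataframe directly, which would avoid building the intermediate list of borough names.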

In [29]:
# create a new DataFrame with only boroughs that contain the word Toronto
toronto_final = toronto_final[toronto_final['Borough'].isin(borough_toronto)].reset_index(drop=True)
toronto_final.head()

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude
0,M4E,East Toronto,The Beaches,43.676357,-79.293031
1,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188
2,M4L,East Toronto,"India Bazaar, The Beaches West",43.668999,-79.315572
3,M4M,East Toronto,Studio District,43.659526,-79.340923
4,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879


In [30]:
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(toronto_final['Latitude'], toronto_final['Longitude'], toronto_final['Borough'], toronto_final['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='red',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_toronto)  
    
map_toronto

In [31]:
#Foursquare API credentials
CLIENT_ID = '5H1IIXS3HEMFQIRSVHKDVU40MKVBKIT0JBDSIELPQE41SZCC' # your Foursquare ID
CLIENT_SECRET = 'O20SAVKDFFZVX1ZZKUTF1JYVXLQNQSVL05EN1NFKYC1B1ZM4' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: 5H1IIXS3HEMFQIRSVHKDVU40MKVBKIT0JBDSIELPQE41SZCC
CLIENT_SECRET:O20SAVKDFFZVX1ZZKUTF1JYVXLQNQSVL05EN1NFKYC1B1ZM4


In [33]:
radius = 500
LIMIT = 100

venues = []

for lat, long, post, borough, neighborhood in zip(toronto_final['Latitude'], toronto_final['Longitude'], toronto_final['Postal Code'], toronto_final['Borough'], 
                                                  toronto_final['Neighborhood']):
    url = "https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}".format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
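        latitude,   # city-centre latitude from the geocoding cell, not the per-row `lat`
        longitude,  # city-centre longitude from the geocoding cell, not the per-row `long`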
        radius, 
        LIMIT)
    
    results = requests.get(url).json()["response"]['groups'][0]['items']
    
    for venue in results:
        venues.append((
            post, 
            borough,
            neighborhood,
            lat, 
            long, 
            venue['venue']['name'], 
            venue['venue']['location']['lat'], 
            venue['venue']['location']['lng'],  
            venue['venue']['categories'][0]['name']))
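
The loop above formats `latitude` and `longitude` (the city-centre values from the geocoding cell) into the URL, so every request covers the same 500 m circle. If the intent was to query around each postal code, the per-row `lat` and `long` would go into `ll` instead. The sketch below only builds the URL and sends nothing. `build_explore_url` is a hypothetical helper and the credentials are placeholders.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(lat, lng, client_id, client_secret,
                      version="20180605", radius=500, limit=100):
    """Build the explore URL for one postal-code centroid (hypothetical helper)."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),  # per-row coordinates, not the city centre
        "radius": radius,
        "limit": limit,
    }
    return EXPLORE_ENDPOINT + "?" + urlencode(params)


# Coordinates of M4E (The Beaches) from the table above; credentials are placeholders.
url = build_explore_url(43.676357, -79.293031, "YOUR_ID", "YOUR_SECRET")
query = parse_qs(urlsplit(url).query)
print(query["ll"])
```

Inside the loop this would be called as `build_explore_url(lat, long, CLIENT_ID, CLIENT_SECRET)`. Letting `urlencode` assemble the query string also takes care of escaping.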

In [50]:
venues_df = pd.DataFrame(venues)
venues_df.columns = ['Postal Code', 'Borough', 'Neighborhood', 'BoroughLatitude', 'BoroughLongitude', 'VenueName', 'VenueLatitude', 'VenueLongitude', 'Venue Category']
print(venues_df.shape)
venues_df.head()

(2847, 9)


Unnamed: 0,Postal Code,Borough,Neighborhood,BoroughLatitude,BoroughLongitude,VenueName,VenueLatitude,VenueLongitude,Venue Category
0,M4E,East Toronto,The Beaches,43.676357,-79.293031,Downtown Toronto,43.653232,-79.385296,Neighborhood
1,M4E,East Toronto,The Beaches,43.676357,-79.293031,Nathan Phillips Square,43.65227,-79.383516,Plaza
2,M4E,East Toronto,The Beaches,43.676357,-79.293031,Indigo,43.653515,-79.380696,Bookstore
3,M4E,East Toronto,The Beaches,43.676357,-79.293031,Chatime 日出茶太,43.655542,-79.384684,Bubble Tea Shop
4,M4E,East Toronto,The Beaches,43.676357,-79.293031,Textile Museum of Canada,43.654396,-79.3865,Art Museum


Let's find out how many unique categories appear among the returned venues.

In [51]:
print('There are {} unique categories.'.format(len(venues_df['Venue Category'].unique())))

There are 54 unique categories.


In [52]:
# one hot encoding
toronto_onehot = pd.get_dummies(venues_df[['Venue Category']], prefix="", prefix_sep="")

# add postal, borough and neighborhood column back to dataframe
toronto_onehot['Postal Code'] = venues_df['Postal Code'] 
toronto_onehot['Borough'] = venues_df['Borough'] 
toronto_onehot['Neighborhoods'] = venues_df['Neighborhood'] 

# move postal, borough and neighborhood column to the first column
fixed_columns = list(toronto_onehot.columns[-3:]) + list(toronto_onehot.columns[:-3])
toronto_onehot = toronto_onehot[fixed_columns]

print(toronto_onehot.shape)
toronto_onehot.head()

(2847, 57)


Unnamed: 0,Postal Code,Borough,Neighborhoods,Accessories Store,American Restaurant,Art Museum,Bank,Bookstore,Breakfast Spot,Bubble Tea Shop,...,Steakhouse,Sushi Restaurant,Tanning Salon,Tea Room,Thai Restaurant,Theater,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Women's Store
0,M4E,East Toronto,The Beaches,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,M4E,East Toronto,The Beaches,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,M4E,East Toronto,The Beaches,0,0,0,0,1,0,0,...,0,0,0,0,0,0,0,0,0,0
3,M4E,East Toronto,The Beaches,0,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0
4,M4E,East Toronto,The Beaches,0,0,1,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [53]:
toronto_grouped = toronto_onehot.groupby(["Postal Code", "Borough", "Neighborhoods"]).mean().reset_index()

print(toronto_grouped.shape)
toronto_grouped

(39, 57)


Unnamed: 0,Postal Code,Borough,Neighborhoods,Accessories Store,American Restaurant,Art Museum,Bank,Bookstore,Breakfast Spot,Bubble Tea Shop,...,Steakhouse,Sushi Restaurant,Tanning Salon,Tea Room,Thai Restaurant,Theater,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Women's Store
0,M4E,East Toronto,The Beaches,0.013699,0.027397,0.013699,0.013699,0.013699,0.013699,0.013699,...,0.013699,0.013699,0.013699,0.013699,0.027397,0.027397,0.013699,0.013699,0.013699,0.013699
1,M4K,East Toronto,"The Danforth West, Riverdale",0.013699,0.027397,0.013699,0.013699,0.013699,0.013699,0.013699,...,0.013699,0.013699,0.013699,0.013699,0.027397,0.027397,0.013699,0.013699,0.013699,0.013699
2,M4L,East Toronto,"India Bazaar, The Beaches West",0.013699,0.027397,0.013699,0.013699,0.013699,0.013699,0.013699,...,0.013699,0.013699,0.013699,0.013699,0.027397,0.027397,0.013699,0.013699,0.013699,0.013699
3,M4M,East Toronto,Studio District,0.013699,0.027397,0.013699,0.013699,0.013699,0.013699,0.013699,...,0.013699,0.013699,0.013699,0.013699,0.027397,0.027397,0.013699,0.013699,0.013699,0.013699
4,M4N,Central Toronto,Lawrence Park,0.013699,0.027397,0.013699,0.013699,0.013699,0.013699,0.013699,...,0.013699,0.013699,0.013699,0.013699,0.027397,0.027397,0.013699,0.013699,0.013699,0.013699
5,M4P,Central Toronto,Davisville North,0.013699,0.027397,0.013699,0.013699,0.013699,0.013699,0.013699,...,0.013699,0.013699,0.013699,0.013699,0.027397,0.027397,0.013699,0.013699,0.013699,0.013699
6,M4R,Central Toronto,North Toronto West,0.013699,0.027397,0.013699,0.013699,0.013699,0.013699,0.013699,...,0.013699,0.013699,0.013699,0.013699,0.027397,0.027397,0.013699,0.013699,0.013699,0.013699
7,M4S,Central Toronto,Davisville,0.013699,0.027397,0.013699,0.013699,0.013699,0.013699,0.013699,...,0.013699,0.013699,0.013699,0.013699,0.027397,0.027397,0.013699,0.013699,0.013699,0.013699
8,M4T,Central Toronto,"Moore Park, Summerhill East",0.013699,0.027397,0.013699,0.013699,0.013699,0.013699,0.013699,...,0.013699,0.013699,0.013699,0.013699,0.027397,0.027397,0.013699,0.013699,0.013699,0.013699
9,M4V,Central Toronto,"Summerhill West, Rathnelly, South Hill, Forest...",0.013699,0.027397,0.013699,0.013699,0.013699,0.013699,0.013699,...,0.013699,0.013699,0.013699,0.013699,0.027397,0.027397,0.013699,0.013699,0.013699,0.013699
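
To make the two steps concrete, the sketch below one-hot encodes a made-up venue list and averages the indicator columns per postal code. Each value in the result is the share of that postal code's venues falling in the category, so every row sums to 1. The category names and counts are illustrative only.

```python
import pandas as pd

# Made-up venues: three around M4E, one around M4K.
venues = pd.DataFrame({
    "Postal Code": ["M4E", "M4E", "M4E", "M4K"],
    "Venue Category": ["Bakery", "Bakery", "Park", "Park"],
})

# One indicator column per category; dtype=int keeps 0/1 values.
onehot = pd.get_dummies(venues["Venue Category"], dtype=int)
onehot.insert(0, "Postal Code", venues["Postal Code"])

# The mean of a 0/1 column is the fraction of rows in which it equals 1.
freq = onehot.groupby("Postal Code").mean().reset_index()
print(freq)
```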


In [54]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
areaColumns = ['Postal Code', 'Borough', 'Neighborhoods']
freqColumns = []
for ind in np.arange(num_top_venues):
    try:
        freqColumns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # no special suffix past 'rd', so fall back to 'th'
        freqColumns.append('{}th Most Common Venue'.format(ind+1))
columns = areaColumns+freqColumns

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Postal Code'] = toronto_grouped['Postal Code']
neighborhoods_venues_sorted['Borough'] = toronto_grouped['Borough']
neighborhoods_venues_sorted['Neighborhoods'] = toronto_grouped['Neighborhoods']

for ind in np.arange(toronto_grouped.shape[0]):
    row_categories = toronto_grouped.iloc[ind, :].iloc[3:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    neighborhoods_venues_sorted.iloc[ind, 3:] = row_categories_sorted.index.values[0:num_top_venues]

# neighborhoods_venues_sorted.sort_values(freqColumns, inplace=True)
print(neighborhoods_venues_sorted.shape)
neighborhoods_venues_sorted

(39, 13)


Unnamed: 0,Postal Code,Borough,Neighborhoods,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M4E,East Toronto,The Beaches,Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel
1,M4K,East Toronto,"The Danforth West, Riverdale",Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel
2,M4L,East Toronto,"India Bazaar, The Beaches West",Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel
3,M4M,East Toronto,Studio District,Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel
4,M4N,Central Toronto,Lawrence Park,Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel
5,M4P,Central Toronto,Davisville North,Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel
6,M4R,Central Toronto,North Toronto West,Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel
7,M4S,Central Toronto,Davisville,Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel
8,M4T,Central Toronto,"Moore Park, Summerhill East",Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel
9,M4V,Central Toronto,"Summerhill West, Rathnelly, South Hill, Forest...",Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel
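
The ranking step can be separated into two small helpers, which avoids using `try`/`except` for the column labels. Both `ordinal` and `top_categories` are hypothetical names introduced for this sketch, and the frequency row is made up.

```python
import pandas as pd


def ordinal(n):
    """Format 1 as '1st', 2 as '2nd', 3 as '3rd', 11 as '11th', and so on."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "{}{}".format(n, suffix)


def top_categories(freq_row, n):
    """Names of the n highest-frequency categories in one row of frequencies."""
    return freq_row.sort_values(ascending=False).index[:n].tolist()


row = pd.Series({"Bakery": 0.1, "Park": 0.6, "Gym": 0.3})
labels = ["{} Most Common Venue".format(ordinal(i + 1)) for i in range(3)]
top = dict(zip(labels, top_categories(row, 3)))
print(top)
```

When several categories share the same frequency, as many do in the table above, their relative order after sorting is not meaningful.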


# CLUSTERING

In [58]:
kclusters = 3

toronto_grouped_clustering = toronto_grouped.drop(columns=["Postal Code", "Borough", "Neighborhoods"])

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

  


array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
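
k-means cannot produce more non-empty clusters than there are distinct feature rows, so it can help to count the distinct rows before fitting. The sketch below does this on made-up data in which every row is identical, mirroring the frequency table shown earlier. The variable names are assumptions.

```python
import pandas as pd

# Made-up feature matrix: three postal codes with identical venue frequencies.
features = pd.DataFrame({
    "Bakery": [0.25, 0.25, 0.25],
    "Park": [0.75, 0.75, 0.75],
})
kclusters = 3

# drop_duplicates keeps one copy of each distinct row.
n_distinct = len(features.drop_duplicates())
can_separate = n_distinct >= kclusters
print(n_distinct, can_separate)
```

With a single distinct row, every point lies at the same location, and the labels carry no information about the neighborhoods.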

In [59]:
#create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.
toronto_merged = toronto_final.copy()

# add clustering labels
toronto_merged["Cluster Labels"] = kmeans.labels_

# join the top-10 venue columns onto the coordinate table by postal code
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.drop(columns=["Borough", "Neighborhoods"]).set_index("Postal Code"), on="Postal Code")

print(toronto_merged.shape)
toronto_merged.head() # check the last columns!

(39, 16)


Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M4E,East Toronto,The Beaches,43.676357,-79.293031,0,Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel
1,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188,0,Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel
2,M4L,East Toronto,"India Bazaar, The Beaches West",43.668999,-79.315572,0,Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel
3,M4M,East Toronto,Studio District,43.659526,-79.340923,0,Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel
4,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879,0,Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel


In [60]:
# sort the results by Cluster Labels
print(toronto_merged.shape)
toronto_merged.sort_values(["Cluster Labels"], inplace=True)
toronto_merged

(39, 16)


Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M4E,East Toronto,The Beaches,43.676357,-79.293031,0,Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel
21,M5L,Downtown Toronto,"Commerce Court, Victoria Hotel",43.648198,-79.379817,0,Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel
22,M5N,Central Toronto,Roselawn,43.711695,-79.416936,0,Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel
23,M5P,Central Toronto,Forest Hill North & West,43.696948,-79.411307,0,Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel
24,M5R,Central Toronto,"The Annex, North Midtown, Yorkville",43.67271,-79.405678,0,Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel
25,M5S,Downtown Toronto,"University of Toronto, Harbord",43.662696,-79.400049,0,Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel
26,M5T,Downtown Toronto,"Kensington Market, Chinatown, Grange Park",43.653206,-79.400049,0,Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel
27,M5V,Downtown Toronto,"CN Tower, King and Spadina, Railway Lands, Har...",43.628947,-79.39442,0,Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel
20,M5K,Downtown Toronto,"Toronto Dominion Centre, Design Exchange",43.647177,-79.381576,0,Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel
28,M5W,Downtown Toronto,Stn A PO Boxes,43.646435,-79.374846,0,Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Diner,Plaza,Thai Restaurant,Theater,Hotel


In [62]:
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, post, bor, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Postal Code'], toronto_merged['Borough'], toronto_merged['Neighborhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup('{} ({}): {} - Cluster {}'.format(bor, post, poi, cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],  # labels run from 0 to kclusters-1
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

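**All neighborhoods fall into a single cluster (label 0 in the output above).** Every Foursquare request was sent with the same city-centre `latitude, longitude`, so each postal code received the same venue list (2847 venues / 39 postal codes = 73 each) and therefore identical frequency rows, which k-means cannot separate.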