<h1 align=center><font size = 5>Segmenting and Clustering Neighborhoods in Toronto</font></h1>

## Introduction

This notebook is part of the Coursera capstone project. It scrapes a Wikipedia page to collect data about Toronto neighborhoods and loads the data into a pandas DataFrame in a clean format. The DataFrame is then used to segment and cluster the neighborhoods with the *k*-means clustering algorithm. The Folium library is used to visualize the neighborhoods and their emerging clusters.

## Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>

1. <a href="#item1">Download and Explore Dataset</a>

2. <a href="#item2">Explore Neighborhoods in Toronto City</a>

3. <a href="#item3">Analyze Each Neighborhood</a>

4. <a href="#item4">Cluster Neighborhoods</a>

5. <a href="#item5">Examine Clusters</a>    
</font>
</div>

First, install any missing libraries and import them.

In [48]:
# conda install -c anaconda beautifulsoup4 
# !conda install -c conda-forge geopy --yes
# !conda install -c conda-forge folium=0.5.0 --yes

In [None]:
import numpy as np
import pandas as pd
from urllib.request import urlopen
from bs4 import BeautifulSoup as bs
from geopy.geocoders import Nominatim
import folium
import matplotlib.cm as cm
import matplotlib.colors as colors
import requests

# import k-means from clustering stage
from sklearn.cluster import KMeans


## 1. Download and Explore Dataset


#### Scrape the Wikipedia page to get the neighborhood data of Toronto, Canada, and load it into a DataFrame using the BeautifulSoup4 library

In [3]:

page = urlopen('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M').read()
page


b'<!DOCTYPE html>\n<html class="client-nojs" lang="en" dir="ltr">\n<head>\n<meta charset="UTF-8"/>\n<title>List of postal codes of Canada: M - Wikipedia</title>\n<script>document.documentElement.className="client-js";RLCONF={"wgCanonicalNamespace":"","wgCanonicalSpecialPageName":!1,"wgNamespaceNumber":0,"wgPageName":"List_of_postal_codes_of_Canada:_M","wgTitle":"List of postal codes of Canada: M","wgCurRevisionId":916835432,"wgRevisionId":916835432,"wgArticleId":539066,"wgIsArticle":!0,"wgIsRedirect":!1,"wgAction":"view","wgUserName":null,"wgUserGroups":["*"],"wgCategories":["Communications in Ontario","Postal codes in Canada","Toronto","Ontario-related lists"],"wgBreakFrames":!1,"wgPageContentLanguage":"en","wgPageContentModel":"wikitext","wgSeparatorTransformTable":["",""],"wgDigitTransformTable":["",""],"wgDefaultDateFormat":"dmy","wgMonthNames":["","January","February","March","April","May","June","July","August","September","October","November","December"],"wgMonthNamesShort":["",

In [4]:

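soup = bs(page, 'html.parser')  # name the parser explicitly to avoid a bs4 warning
table = soup.find('table', {'class': 'wikitable sortable'})
df = pd.read_html(str(table))
nbrs = df[0]  # get the first table
# drop rows with no borough; copy so the assignment below modifies this frame
nbrs = nbrs[nbrs.Borough != 'Not assigned'].copy()
# where the neighbourhood is missing, fall back to the borough name
# (assigning to the row yielded by iterrows() would not update the frame)
missing = nbrs['Neighbourhood'] == 'Not assigned'
nbrs.loc[missing, 'Neighbourhood'] = nbrs.loc[missing, 'Borough']

# one row per postcode, with neighbourhood names joined by commas
nbrs = nbrs.groupby(['Postcode', 'Borough'])['Neighbourhood'].apply(','.join).reset_index()
nbrs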


Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1B,Scarborough,"Rouge,Malvern"
1,M1C,Scarborough,"Highland Creek,Rouge Hill,Port Union"
2,M1E,Scarborough,"Guildwood,Morningside,West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae
5,M1J,Scarborough,Scarborough Village
6,M1K,Scarborough,"East Birchmount Park,Ionview,Kennedy Park"
7,M1L,Scarborough,"Clairlea,Golden Mile,Oakridge"
8,M1M,Scarborough,"Cliffcrest,Cliffside,Scarborough Village West"
9,M1N,Scarborough,"Birch Cliff,Cliffside West"
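
The cleaning and grouping step can be illustrated on a small table. This is a minimal sketch that assumes the same column names as the scraped table; the four rows are made up for illustration and are not taken from the Wikipedia page.

```python
import pandas as pd

# Toy rows (made up for illustration) in the same shape as the scraped table.
toy = pd.DataFrame({
    'Postcode': ['M1A', 'M5A', 'M5A', 'M7A'],
    'Borough': ['Not assigned', 'Downtown Toronto', 'Downtown Toronto', "Queen's Park"],
    'Neighbourhood': ['Not assigned', 'Harbourfront', 'Regent Park', 'Not assigned'],
})

# Drop rows without a borough.
toy = toy[toy['Borough'] != 'Not assigned'].copy()

# Fall back to the borough name where the neighbourhood is missing.
missing = toy['Neighbourhood'] == 'Not assigned'
toy.loc[missing, 'Neighbourhood'] = toy.loc[missing, 'Borough']

# One row per postcode, neighbourhood names joined with commas.
combined = (toy.groupby(['Postcode', 'Borough'])['Neighbourhood']
               .apply(','.join)
               .reset_index())
print(combined)
```

Using a boolean mask with `.loc` writes the fallback value into the frame itself, which a loop over `iterrows()` does not reliably do because each yielded row is a copy.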


#### Get the latitude and longitude of each postal code in Toronto. This data was downloaded from http://cocl.us/Geospatial_data and stored as toronto_data.csv

In [43]:
gs= pd.read_csv('toronto_data.csv')
gs.columns = ['Postcode', 'Latitude', 'Longitude']
gs.head()


Unnamed: 0,Postcode,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


#### Join the latitude and longitude of each postal code with the Toronto neighborhood data

In [46]:
nbrs = pd.merge(nbrs, gs, on='Postcode', how='outer', validate="one_to_one")
nbrs.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
0,M1B,Scarborough,"Rouge,Malvern",43.806686,-79.194353
1,M1C,Scarborough,"Highland Creek,Rouge Hill,Port Union",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood,Morningside,West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
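
An outer join keeps postcodes that appear in only one of the two tables, so it is worth checking for unmatched rows. The sketch below uses the `indicator` option of `pd.merge` for that check. The postcode `M9Z` and the borough name `Hypothetical` are invented here to force a mismatch; the two coordinate pairs are copied from the table above.

```python
import pandas as pd

left = pd.DataFrame({'Postcode': ['M1B', 'M1C', 'M9Z'],   # 'M9Z' is invented
                     'Borough': ['Scarborough', 'Scarborough', 'Hypothetical']})
right = pd.DataFrame({'Postcode': ['M1B', 'M1C'],
                      'Latitude': [43.806686, 43.784535],
                      'Longitude': [-79.194353, -79.160497]})

# validate='one_to_one' raises MergeError if a postcode repeats on either side;
# indicator=True adds a '_merge' column saying where each row came from.
merged = pd.merge(left, right, on='Postcode', how='outer',
                  validate='one_to_one', indicator=True)

# Rows present on only one side are marked 'left_only' or 'right_only'.
unmatched = merged[merged['_merge'] != 'both']
print(unmatched[['Postcode', '_merge']])
```

If `unmatched` is empty, every postcode received coordinates and the outer join behaves like an inner join.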


#### Use geopy library to get the latitude and longitude values of Toronto City.

In [65]:
# convert an address into latitude and longitude values
address = 'Toronto, ON'

geolocator = Nominatim(user_agent="To_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.653963, -79.387207.


#### Use the folium library to plot all neighborhoods of Toronto City.

In [82]:
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)

map_toronto

In [79]:
# add markers to map
for index, row in nbrs.iterrows():
    lat = row['Latitude']
    lng = row['Longitude']
    neighborhood = row['Neighbourhood']
    borough = row['Borough']
    label = '{}, {}'.format(neighborhood, borough).replace("'", "")
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_toronto)  # parse_html belongs to Popup, not CircleMarker

map_toronto

## 2. Explore Neighborhoods in Downtown Toronto

#### Get neighborhood data for Downtown Toronto

In [8]:
downtown_data = nbrs[nbrs['Borough'] == 'Downtown Toronto'].reset_index(drop=True)
downtown_data.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
0,M4W,Downtown Toronto,Rosedale,43.679563,-79.377529
1,M4X,Downtown Toronto,"Cabbagetown,St. James Town",43.667967,-79.367675
2,M4Y,Downtown Toronto,Church and Wellesley,43.66586,-79.38316
3,M5A,Downtown Toronto,"Harbourfront,Regent Park",43.65426,-79.360636
4,M5B,Downtown Toronto,"Ryerson,Garden District",43.657162,-79.378937


#### Plot neighborhoods in Downtown Toronto

In [13]:
address = 'Downtown Toronto, TORONTO'

geolocator = Nominatim(user_agent="To_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Downtown Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Downtown Toronto are 43.654027, -79.3802003.


In [80]:

map_downtown = folium.Map(location=[latitude, longitude], zoom_start=10)

map_downtown

In [81]:
# add markers to map
for index, row in downtown_data.iterrows():
    lat = row['Latitude']
    lng = row['Longitude']
    neighborhood = row['Neighbourhood']
    borough = row['Borough']
    label = '{}, {}'.format(neighborhood, borough)
    # parse_html=True escapes the label so names with apostrophes render safely
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_downtown)

map_downtown


#### Use Foursquare API to get the venues in Downtown Toronto

In [7]:
CLIENT_ID = 'CLIENT_ID' # Removed to publish on GitHub
CLIENT_SECRET = 'CLIENT_SECRET' # Removed to publish on GitHub
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # maximum number of venues requested per neighborhood

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: CLIENT_ID
CLIENT_SECRET: CLIENT_SECRET


In [20]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    """Fetch venues near each neighbourhood from the Foursquare explore endpoint."""
    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)

        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lng,
            radius,
            LIMIT)

        # make the GET request; TLS certificate verification stays enabled
        results = requests.get(url).json()['response']['groups'][0]['items']

        # keep only the relevant information for each nearby venue
        venues_list.append([(
            name,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])

    # flatten the per-neighbourhood lists into a single table
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood',
                             'Neighborhood Latitude',
                             'Neighborhood Longitude',
                             'Venue',
                             'Venue Latitude',
                             'Venue Longitude',
                             'Venue Category']

    return nearby_venues
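
The response parsing in `getNearbyVenues` can be tried offline. The sketch below is only an illustration: `sample_response` is a hand-written stand-in whose nesting mirrors the keys accessed in the function, the two venues are invented, and `flatten_items` is a hypothetical helper rather than part of the notebook. It also guards against a venue with an empty `categories` list, a case in which indexing `categories[0]` would raise an `IndexError`.

```python
import pandas as pd

# Hand-written stand-in for one API response (venues are invented).
sample_response = {
    'response': {'groups': [{'items': [
        {'venue': {'name': 'Example Cafe',
                   'location': {'lat': 43.6540, 'lng': -79.3800},
                   'categories': [{'name': 'Café'}]}},
        {'venue': {'name': 'Example Park',
                   'location': {'lat': 43.6550, 'lng': -79.3810},
                   'categories': []}},  # venue without a category
    ]}]}
}

def flatten_items(name, lat, lng, payload):
    """Turn one response into row tuples (hypothetical helper)."""
    rows = []
    for item in payload['response']['groups'][0]['items']:
        venue = item['venue']
        categories = venue.get('categories') or []
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng, venue['name'],
                     venue['location']['lat'], venue['location']['lng'],
                     category))
    return rows

rows = flatten_items('Toy Neighbourhood', 43.654, -79.380, sample_response)
venues = pd.DataFrame(rows, columns=['Neighborhood', 'Neighborhood Latitude',
                                     'Neighborhood Longitude', 'Venue',
                                     'Venue Latitude', 'Venue Longitude',
                                     'Venue Category'])
print(venues)
```

Separating the parsing from the HTTP request makes it possible to test the table-building logic without credentials or network access.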

In [21]:
downtown_venues = getNearbyVenues(names=downtown_data['Neighbourhood'],
                                   latitudes=downtown_data['Latitude'],
                                   longitudes=downtown_data['Longitude']
                                  )

Rosedale
Cabbagetown,St. James Town
Church and Wellesley
Harbourfront,Regent Park
Ryerson,Garden District
St. James Town
Berczy Park
Central Bay Street
Adelaide,King,Richmond
Harbourfront East,Toronto Islands,Union Station
Design Exchange,Toronto Dominion Centre
Commerce Court,Victoria Hotel
Harbord,University of Toronto
Chinatown,Grange Park,Kensington Market
CN Tower,Bathurst Quay,Island airport,Harbourfront West,King and Spadina,Railway Lands,South Niagara
Stn A PO Boxes 25 The Esplanade
First Canadian Place,Underground city
Christie

In [90]:
# use the same spelling as the 'Neighbourhood' column in downtown_data
downtown_venues.rename(columns={'Neighborhood': 'Neighbourhood'}, inplace=True)
downtown_venues

Unnamed: 0,Neighbourhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Rosedale,43.679563,-79.377529,Mooredale House,43.678631,-79.380091,Building
1,Rosedale,43.679563,-79.377529,Rosedale Park,43.682328,-79.378934,Playground
2,Rosedale,43.679563,-79.377529,Whitney Park,43.682036,-79.373788,Park
3,Rosedale,43.679563,-79.377529,Alex Murray Parkette,43.678300,-79.382773,Park
4,Rosedale,43.679563,-79.377529,Milkman's Lane,43.676352,-79.373842,Trail
5,"Cabbagetown,St. James Town",43.667967,-79.367675,Cranberries,43.667843,-79.369407,Diner
6,"Cabbagetown,St. James Town",43.667967,-79.367675,F'Amelia,43.667536,-79.368613,Italian Restaurant
7,"Cabbagetown,St. James Town",43.667967,-79.367675,Butter Chicken Factory,43.667072,-79.369184,Indian Restaurant
8,"Cabbagetown,St. James Town",43.667967,-79.367675,Kingyo Toronto,43.665895,-79.368415,Japanese Restaurant
9,"Cabbagetown,St. James Town",43.667967,-79.367675,Murgatroid,43.667381,-79.369311,Restaurant


In [23]:
downtown_venues.groupby('Neighbourhood').count()

Neighbourhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
"Adelaide,King,Richmond",100,100,100,100,100,100
Berczy Park,57,57,57,57,57,57
"CN Tower,Bathurst Quay,Island airport,Harbourfront West,King and Spadina,Railway Lands,South Niagara",16,16,16,16,16,16
"Cabbagetown,St. James Town",43,43,43,43,43,43
Central Bay Street,86,86,86,86,86,86
"Chinatown,Grange Park,Kensington Market",100,100,100,100,100,100
Christie,16,16,16,16,16,16
Church and Wellesley,90,90,90,90,90,90
"Commerce Court,Victoria Hotel",100,100,100,100,100,100
"Design Exchange,Toronto Dominion Centre",100,100,100,100,100,100


In [24]:
print('There are {} unique categories.'.format(len(downtown_venues['Venue Category'].unique())))

There are 209 unique categories.


## 3. Analyze Each Neighborhood in Downtown Toronto

#### One hot encoding for venue category

In [89]:

downtown_onehot = pd.get_dummies(downtown_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
downtown_onehot['Neighbourhood'] = downtown_venues['Neighbourhood'] 

# move neighborhood column to the first column
fixed_columns = [downtown_onehot.columns[-1]] + list(downtown_onehot.columns[:-1])
downtown_onehot = downtown_onehot[fixed_columns]

downtown_onehot.head()

Unnamed: 0,Neighbourhood,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,...,Theme Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wings Joint,Yoga Studio
0,Rosedale,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Rosedale,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Rosedale,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Rosedale,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Rosedale,0,0,0,0,0,0,0,0,0,...,0,0,1,0,0,0,0,0,0,0


#### Calculate the mean frequency of each venue category in each neighborhood

In [88]:

downtown_grouped = downtown_onehot.groupby('Neighbourhood').mean().reset_index()
downtown_grouped.head()

Unnamed: 0,Neighbourhood,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,...,Theme Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wings Joint,Yoga Studio
0,"Adelaide,King,Richmond",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,...,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0
1,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0
2,"CN Tower,Bathurst Quay,Island airport,Harbourf...",0.0,0.0625,0.0625,0.0625,0.125,0.125,0.125,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,"Cabbagetown,St. James Town",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Central Bay Street,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011628,0.0,...,0.0,0.0,0.0,0.0,0.011628,0.0,0.0,0.011628,0.0,0.011628
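
Because each one-hot row contains a single 1, the per-neighborhood mean of a category column equals the share of that neighborhood's venues that fall in the category. A minimal sketch on made-up data (the neighborhoods `A` and `B` and their venues are invented for illustration):

```python
import pandas as pd

toy_venues = pd.DataFrame({
    'Neighbourhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Coffee Shop', 'Coffee Shop', 'Park', 'Café', 'Park', 'Trail'],
})

# One column per category; dtype=int gives 0/1 instead of booleans.
onehot = pd.get_dummies(toy_venues[['Venue Category']],
                        prefix='', prefix_sep='', dtype=int)
onehot.insert(0, 'Neighbourhood', toy_venues['Neighbourhood'])

# Mean of the 0/1 columns = relative frequency of each category.
freq = onehot.groupby('Neighbourhood').mean()
print(freq)
```

Each row of `freq` sums to 1, which is why neighborhoods with very different venue counts (for example 16 versus 100) remain comparable in the clustering step.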


#### Print each neighborhood along with the top 5 most common venues

In [27]:
num_top_venues = 5

for hood in downtown_grouped['Neighbourhood']:
    print("----"+hood+"----")
    temp = downtown_grouped[downtown_grouped['Neighbourhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Adelaide,King,Richmond----
         venue  freq
0  Coffee Shop  0.08
1         Café  0.05
2   Steakhouse  0.04
3          Bar  0.04
4        Hotel  0.03


----Berczy Park----
          venue  freq
0   Coffee Shop  0.07
1  Cocktail Bar  0.05
2      Beer Bar  0.04
3          Café  0.04
4   Cheese Shop  0.04


----CN Tower,Bathurst Quay,Island airport,Harbourfront West,King and Spadina,Railway Lands,South Niagara----
              venue  freq
0    Airport Lounge  0.12
1   Airport Service  0.12
2  Airport Terminal  0.12
3     Boat or Ferry  0.06
4           Airport  0.06


----Cabbagetown,St. James Town----
         venue  freq
0  Coffee Shop  0.09
1         Café  0.07
2  Pizza Place  0.05
3   Restaurant  0.05
4          Pub  0.05


----Central Bay Street----
                       venue  freq
0                Coffee Shop  0.14
1                       Café  0.05
2  Middle Eastern Restaurant  0.05
3         Italian Restaurant  0.05
4               Burger Joint  0.03


----Chinatown,Gran

#### Sort the venues in descending order and display the top 11 venues for each neighborhood

In [28]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [96]:
num_top_venues = 11

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighbourhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # beyond 'st', 'nd', 'rd', fall back to 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
downtown_venues_sorted = pd.DataFrame(columns=columns)
downtown_venues_sorted['Neighbourhood'] = downtown_grouped['Neighbourhood']

for ind in np.arange(downtown_grouped.shape[0]):
   
    downtown_venues_sorted.iloc[ind, 1:] = return_most_common_venues(downtown_grouped.iloc[ind, :], num_top_venues)

downtown_venues_sorted.head()

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue
0,"Adelaide,King,Richmond",Coffee Shop,Café,Bar,Steakhouse,Restaurant,Burger Joint,American Restaurant,Thai Restaurant,Hotel,Cosmetics Shop,Gastropub
1,Berczy Park,Coffee Shop,Cocktail Bar,Café,Seafood Restaurant,Farmers Market,Beer Bar,Bakery,Cheese Shop,Steakhouse,Italian Restaurant,Breakfast Spot
2,"CN Tower,Bathurst Quay,Island airport,Harbourf...",Airport Lounge,Airport Service,Airport Terminal,Coffee Shop,Bar,Plane,Boutique,Sculpture Garden,Boat or Ferry,Harbor / Marina,Airport Food Court
3,"Cabbagetown,St. James Town",Coffee Shop,Café,Italian Restaurant,Bakery,Pizza Place,Restaurant,Pub,Chinese Restaurant,Diner,Beer Store,Japanese Restaurant
4,Central Bay Street,Coffee Shop,Middle Eastern Restaurant,Café,Italian Restaurant,Burger Joint,Sandwich Place,Ice Cream Shop,Japanese Restaurant,Chinese Restaurant,Sushi Restaurant,Bar
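
The column naming and ranking can also be written without relying on an exception for the suffix. The sketch below is one possible alternative, not the notebook's method: `ordinal` is a hypothetical helper, and the frequency values are four of the figures printed for Adelaide,King,Richmond earlier (tied entries are left out so the ordering is unambiguous).

```python
import pandas as pd

def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

# Frequencies for one neighborhood, indexed by venue category.
freq_row = pd.Series({'Hotel': 0.03, 'Coffee Shop': 0.08,
                      'Steakhouse': 0.04, 'Café': 0.05})

# Rank categories by frequency and keep the top three.
top = freq_row.sort_values(ascending=False).index[:3]
named = {ordinal(i + 1) + ' Most Common Venue': venue
         for i, venue in enumerate(top)}
print(named)
```

When two categories have the same frequency, their relative order after `sort_values` is not meaningful, so ranks among tied venues should not be over-interpreted.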


## 4. Cluster Neighborhoods in Downtown Toronto

#### Run *k*-means to cluster the neighborhood into 5 clusters.

In [102]:
# set number of clusters
kclusters = 5

# the positional axis argument was removed in pandas 2.0, so name the columns explicitly
downtown_grouped_clustering = downtown_grouped.drop(columns='Neighbourhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(downtown_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_

array([1, 1, 0, 1, 1, 3, 4, 1, 1, 1, 1, 3, 1, 1, 2, 1, 1, 1])
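
A quick way to read the label array is to count how many neighborhoods fall into each cluster. The sketch below copies the labels printed above into a NumPy array; nothing else is assumed.

```python
import numpy as np

# Cluster labels copied from the k-means output above.
labels = np.array([1, 1, 0, 1, 1, 3, 4, 1, 1, 1, 1, 3, 1, 1, 2, 1, 1, 1])

# Number of neighborhoods assigned to each of the 5 clusters.
sizes = np.bincount(labels, minlength=5)
for cluster, size in enumerate(sizes):
    print('cluster {}: {} neighbourhood(s)'.format(cluster, size))
```

Most neighborhoods land in a single cluster while three clusters hold one neighborhood each, which suggests that a different number of clusters may be worth trying.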

#### Create a new dataframe that includes the cluster as well as the top 11 venues for each neighborhood.

In [98]:
# add clustering labels
downtown_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

# join downtown_data with downtown_venues_sorted to attach the cluster label and top venues to each neighborhood
downtown_merged = downtown_data.join(downtown_venues_sorted.set_index('Neighbourhood'), on='Neighbourhood')
downtown_merged.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue
0,M4W,Downtown Toronto,Rosedale,43.679563,-79.377529,2,Park,Playground,Trail,Building,Department Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant,Dog Run,Discount Store
1,M4X,Downtown Toronto,"Cabbagetown,St. James Town",43.667967,-79.367675,1,Coffee Shop,Café,Italian Restaurant,Bakery,Pizza Place,Restaurant,Pub,Chinese Restaurant,Diner,Beer Store,Japanese Restaurant
2,M4Y,Downtown Toronto,Church and Wellesley,43.66586,-79.38316,1,Coffee Shop,Gay Bar,Japanese Restaurant,Sushi Restaurant,Restaurant,Men's Store,Hotel,Gastropub,Pub,Fast Food Restaurant,Mediterranean Restaurant
3,M5A,Downtown Toronto,"Harbourfront,Regent Park",43.65426,-79.360636,1,Coffee Shop,Bakery,Pub,Café,Park,Mexican Restaurant,Gym / Fitness Center,Breakfast Spot,Restaurant,Theater,Cosmetics Shop
4,M5B,Downtown Toronto,"Ryerson,Garden District",43.657162,-79.378937,1,Coffee Shop,Clothing Store,Middle Eastern Restaurant,Café,Cosmetics Shop,Sporting Goods Shop,Japanese Restaurant,Italian Restaurant,Bookstore,Pizza Place,Plaza


In [101]:
# downtown_merged

#### Visualize the resulting clusters

In [78]:
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map, colored by cluster label
for lat, lon, poi, cluster in zip(downtown_merged['Latitude'], downtown_merged['Longitude'], downtown_merged['Neighbourhood'], downtown_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

## 5. Examine Clusters

#### Cluster 1

In [35]:
downtown_merged.loc[downtown_merged['Cluster Labels'] == 0, downtown_merged.columns[[1] + list(range(5, downtown_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue
14,Downtown Toronto,0,Airport Lounge,Airport Service,Airport Terminal,Coffee Shop,Bar,Plane,Boutique,Sculpture Garden,Boat or Ferry,Harbor / Marina,Airport Food Court


#### Cluster 2

In [104]:
downtown_merged.loc[downtown_merged['Cluster Labels'] == 1, downtown_merged.columns[[1] + list(range(5, downtown_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue
1,Downtown Toronto,1,Coffee Shop,Café,Italian Restaurant,Bakery,Pizza Place,Restaurant,Pub,Chinese Restaurant,Diner,Beer Store,Japanese Restaurant
2,Downtown Toronto,1,Coffee Shop,Gay Bar,Japanese Restaurant,Sushi Restaurant,Restaurant,Men's Store,Hotel,Gastropub,Pub,Fast Food Restaurant,Mediterranean Restaurant
3,Downtown Toronto,1,Coffee Shop,Bakery,Pub,Café,Park,Mexican Restaurant,Gym / Fitness Center,Breakfast Spot,Restaurant,Theater,Cosmetics Shop
4,Downtown Toronto,1,Coffee Shop,Clothing Store,Middle Eastern Restaurant,Café,Cosmetics Shop,Sporting Goods Shop,Japanese Restaurant,Italian Restaurant,Bookstore,Pizza Place,Plaza
5,Downtown Toronto,1,Coffee Shop,Café,Restaurant,Hotel,Italian Restaurant,Clothing Store,Cocktail Bar,Bakery,Cosmetics Shop,Breakfast Spot,Gastropub
6,Downtown Toronto,1,Coffee Shop,Cocktail Bar,Café,Seafood Restaurant,Farmers Market,Beer Bar,Bakery,Cheese Shop,Steakhouse,Italian Restaurant,Breakfast Spot
7,Downtown Toronto,1,Coffee Shop,Middle Eastern Restaurant,Café,Italian Restaurant,Burger Joint,Sandwich Place,Ice Cream Shop,Japanese Restaurant,Chinese Restaurant,Sushi Restaurant,Bar
8,Downtown Toronto,1,Coffee Shop,Café,Bar,Steakhouse,Restaurant,Burger Joint,American Restaurant,Thai Restaurant,Hotel,Cosmetics Shop,Gastropub
9,Downtown Toronto,1,Coffee Shop,Hotel,Aquarium,Café,Scenic Lookout,Fried Chicken Joint,Brewery,Sports Bar,Baseball Stadium,Bar,Plaza
10,Downtown Toronto,1,Coffee Shop,Café,Hotel,Restaurant,Bakery,Bar,Deli / Bodega,Gastropub,Italian Restaurant,American Restaurant,Concert Hall


#### Cluster 3

In [37]:
downtown_merged.loc[downtown_merged['Cluster Labels'] == 2, downtown_merged.columns[[1] + list(range(5, downtown_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue
0,Downtown Toronto,2,Park,Playground,Trail,Building,Department Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant,Dog Run,Discount Store


#### Cluster 4

In [38]:
downtown_merged.loc[downtown_merged['Cluster Labels'] == 3, downtown_merged.columns[[1] + list(range(5, downtown_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue
12,Downtown Toronto,3,Café,Bar,Bookstore,Japanese Restaurant,Sandwich Place,Bakery,Restaurant,Italian Restaurant,Beer Bar,Beer Store,Comfort Food Restaurant
13,Downtown Toronto,3,Café,Vegetarian / Vegan Restaurant,Bar,Mexican Restaurant,Bakery,Vietnamese Restaurant,Chinese Restaurant,Dumpling Restaurant,Coffee Shop,Comfort Food Restaurant,Dessert Shop


#### Cluster 5

In [39]:
downtown_merged.loc[downtown_merged['Cluster Labels'] == 4, downtown_merged.columns[[1] + list(range(5, downtown_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue
17,Downtown Toronto,4,Grocery Store,Café,Park,Nightclub,Restaurant,Athletics & Sports,Diner,Italian Restaurant,Baby Store,Convenience Store,Coffee Shop
