# Capstone Project - The Battle of the Neighborhoods (Week 2)
### Applied Data Science Capstone by IBM/Coursera


## Introduction: Business Problem <a name="introduction"></a>

In this project we will try to find an optimal location for a restaurant. Specifically, this report is targeted at stakeholders interested in opening a **Coffee Shop** in **Toronto**, Canada.

Since there are lots of restaurants in Toronto, we will try to detect **locations that are not already crowded with Coffee Shops**.

We will use data science to find some of the more promising neighborhoods based on this criterion. The advantages of each area will then be clearly expressed so that stakeholders can choose the best possible final location.

## Data <a name="data"></a>

I will extract the list of neighbourhoods in Toronto from the following Wikipedia page: https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M. This table provides a list of postcodes, boroughs and neighbourhoods. I use pandas to extract the HTML table from Wikipedia, load it into a dataframe and perform the necessary data clean-up.

In [34]:
import pandas as pd
import numpy as np

In [35]:
# To Extract Data from the url using pandas
url = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'

data = pd.read_html(url,header = 0)
df = pd.DataFrame(data[0])
df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront


In [36]:
#To Extract the rows with Borough value = "Not assigned"
indexNames = df[ df['Borough'] == 'Not assigned'].index
indexNames

Int64Index([  0,   1,   9,  13,  20,  21,  30,  36,  37,  45,  46,  50,  51,
             52,  54,  55,  59,  60,  61,  73,  74,  75,  88,  89,  90, 104,
            105, 106, 120, 121, 136, 137, 148, 149, 155, 161, 162, 167, 175,
            181, 182, 188, 189, 190, 194, 195, 201, 202, 203, 204, 209, 210,
            223, 224, 237, 238, 241, 242, 247, 248, 253, 254, 258, 259, 260,
            261, 263, 264, 274, 275, 276, 277, 278, 279, 280, 281, 287],
           dtype='int64')

In [37]:
# Drop all Rows Where Borough = 'Not assigned'
df.drop(indexNames , inplace=True)
df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront
5,M5A,Downtown Toronto,Regent Park
6,M6A,North York,Lawrence Heights


In [38]:
# Group neighbourhoods that share the same Postcode and Borough into one comma-separated row
df = df.groupby(['Postcode','Borough'])['Neighbourhood'].apply(','.join).reset_index()
df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1B,Scarborough,"Rouge,Malvern"
1,M1C,Scarborough,"Highland Creek,Rouge Hill,Port Union"
2,M1E,Scarborough,"Guildwood,Morningside,West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae


In [39]:
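# Where the neighbourhood is not assigned, use the borough name instead
indexNames_nh = df[df['Neighbourhood'] == 'Not assigned'].index

# Assign through a single .loc call to avoid chained-assignment problems
df.loc[indexNames_nh, 'Neighbourhood'] = df.loc[indexNames_nh, 'Borough']
df.loc[indexNames_nh]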

Unnamed: 0,Postcode,Borough,Neighbourhood
85,M7A,Queen's Park,Queen's Park


In [40]:
df.shape

(103, 3)
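
The clean-up steps above can be sketched end to end on a small table. This is only an illustration: the `raw` dataframe below is a hand-made stand-in for the Wikipedia table, and the names `raw`, `clean` and `mask` are not part of the notebook.

```python
import pandas as pd

# Toy stand-in for the Wikipedia table (rows chosen for illustration only)
raw = pd.DataFrame({
    'Postcode': ['M1A', 'M5A', 'M5A', 'M7A'],
    'Borough': ['Not assigned', 'Downtown Toronto', 'Downtown Toronto', "Queen's Park"],
    'Neighbourhood': ['Not assigned', 'Harbourfront', 'Regent Park', 'Not assigned'],
})

# 1. Drop rows whose borough is not assigned
clean = raw[raw['Borough'] != 'Not assigned']

# 2. Combine neighbourhoods that share a postcode into one comma-separated row
clean = (clean.groupby(['Postcode', 'Borough'])['Neighbourhood']
              .apply(','.join)
              .reset_index())

# 3. Where the neighbourhood is not assigned, fall back to the borough name
mask = clean['Neighbourhood'] == 'Not assigned'
clean.loc[mask, 'Neighbourhood'] = clean.loc[mask, 'Borough']

print(clean)
```

Using a boolean mask with a single `.loc` assignment in step 3 keeps the write on the dataframe itself rather than on a temporary copy.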

## Finding Coordinates

In [41]:
# The code was removed by Watson Studio for sharing.

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [42]:
df['Latitude'] = df_coordinates['Latitude'].values
df['Longitude'] = df_coordinates['Longitude'].values
df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
0,M1B,Scarborough,"Rouge,Malvern",43.806686,-79.194353
1,M1C,Scarborough,"Highland Creek,Rouge Hill,Port Union",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood,Morningside,West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
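
Copying the coordinate columns across by position works only if both dataframes are sorted the same way. A more defensive option, sketched below under the assumption that the coordinates come from a CSV with `Postal Code`, `Latitude` and `Longitude` columns (the original loading code is not shown, so `csv_text`, `coords` and `postcodes` are hypothetical names), is to join on the postcode instead.

```python
import io
import pandas as pd

# Hypothetical CSV contents; deliberately listed in a different order than the postcodes
csv_text = """Postal Code,Latitude,Longitude
M1C,43.784535,-79.160497
M1B,43.806686,-79.194353
"""
coords = pd.read_csv(io.StringIO(csv_text))

postcodes = pd.DataFrame({
    'Postcode': ['M1B', 'M1C'],
    'Borough': ['Scarborough', 'Scarborough'],
})

# Join on the postcode so each row gets its own coordinates regardless of row order
merged = postcodes.merge(coords, left_on='Postcode', right_on='Postal Code', how='left')
merged = merged.drop(columns='Postal Code')

print(merged)
```

With `how='left'`, a postcode that has no match in the coordinate table shows up with missing values instead of silently receiving another row's coordinates.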


## Clustering

First, we will select only those boroughs whose name contains "Toronto"

In [43]:
df = df[df['Borough'].str.contains("Toronto")].reset_index(drop=True)
df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
0,M4E,East Toronto,The Beaches,43.676357,-79.293031
1,M4K,East Toronto,"The Danforth West,Riverdale",43.679557,-79.352188
2,M4L,East Toronto,"The Beaches West,India Bazaar",43.668999,-79.315572
3,M4M,East Toronto,Studio District,43.659526,-79.340923
4,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879


I will use the geopy library to find the coordinates of Toronto

In [44]:
from geopy.geocoders import Nominatim
address = 'Toronto'

# Nominatim expects an application name; 'toronto_explorer' is an arbitrary choice
geolocator = Nominatim(user_agent='toronto_explorer')
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.653963, -79.387207.


In [45]:
!conda install -c conda-forge folium=0.5.0 --yes

Solving environment: done

# All requested packages already installed.



Let's take a look at the map of Toronto based on the coordinates obtained above

In [46]:
import folium

map_Toronto = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(df['Latitude'], df['Longitude'], df['Neighbourhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_Toronto)  
    
map_Toronto

In [47]:
import requests # library to handle requests
from pandas.io.json import json_normalize

Now, I will initialize variables related to the Foursquare API

In [48]:
CLIENT_ID = 'YOUR_CLIENT_ID' # your Foursquare ID (redacted for sharing)
CLIENT_SECRET = 'YOUR_CLIENT_SECRET' # your Foursquare Secret (redacted for sharing)
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: YOUR_CLIENT_ID
CLIENT_SECRET: YOUR_CLIENT_SECRET


Let's define a function to make the Foursquare API call

In [53]:
def getNearbyVenues(names, latitudes, longitudes, radius=5000, categoryIds=''):
    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        # create the API request URL (LIMIT is read from module scope and must be set before the call)
        url = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, VERSION, lat, lng, radius, LIMIT)

        # optionally restrict the search to the given category ID(s)
        if categoryIds != '':
            url = url + '&categoryId={}'.format(categoryIds)

        # make the GET request
        response = requests.get(url).json()
        results = response['response']['venues']

        # keep only relevant information, skipping venues that have no category
        for v in results:
            if not v.get('categories'):
                continue
            venues_list.append((
                name,
                lat,
                lng,
                v['name'],
                v['location']['lat'],
                v['location']['lng'],
                v['categories'][0]['name']
            ))

    # one row per venue
    nearby_venues = pd.DataFrame(venues_list, columns=[
                  'Neighbourhood',
                  'Neighbourhood Latitude',
                  'Neighbourhood Longitude',
                  'Venue',
                  'Venue Latitude',
                  'Venue Longitude',
                  'Venue Category'])

    return nearby_venues
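
As a possible alternative to formatting the query string by hand, the request URL can be assembled from a dictionary of parameters. The sketch below uses only the standard library; `build_search_url` is a hypothetical helper, not part of the notebook or of the Foursquare client, and its defaults are illustrative.

```python
from urllib.parse import urlencode

def build_search_url(client_id, client_secret, version, lat, lng,
                     radius=1000, limit=50, category_ids=''):
    """Hypothetical helper: assemble a venues/search URL from keyword parameters."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    # Only add the category filter when one was requested
    if category_ids:
        params['categoryId'] = category_ids
    return 'https://api.foursquare.com/v2/venues/search?' + urlencode(params)

# Placeholder credentials; the coordinates are those of The Beaches from the table above
url = build_search_url('ID', 'SECRET', '20180605', 43.676357, -79.293031,
                       category_ids='4bf58dd8d48988d1e0931735')
print(url)
```

`urlencode` percent-encodes the comma in `ll` and any other reserved characters, so values never need to be escaped manually.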

In [54]:
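LIMIT = 100    # maximum number of venues requested per neighbourhood
radius = 1000  # search radius around each neighbourhood centre

toronto = getNearbyVenues(names=df['Neighbourhood'],
                          latitudes=df['Latitude'],
                          longitudes=df['Longitude'],
                          radius=radius,
                          categoryIds='4bf58dd8d48988d1e0931735'
                         )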

In [55]:
print(toronto.shape)
toronto.head()

(1218, 7)


Unnamed: 0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,The Beaches,43.676357,-79.293031,Tori's Bakeshop,43.672114,-79.290331,Vegetarian / Vegan Restaurant
1,The Beaches,43.676357,-79.293031,Starbucks,43.680806,-79.285137,Coffee Shop
2,The Beaches,43.676357,-79.293031,Starbucks,43.669564,-79.301969,Coffee Shop
3,The Beaches,43.676357,-79.293031,Grinder,43.683073,-79.299875,Coffee Shop
4,The Beaches,43.676357,-79.293031,Tim Hortons,43.680799,-79.282907,Coffee Shop


In [56]:
toronto.groupby('Neighbourhood').count()

Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
"Adelaide,King,Richmond",50,50,50,50,50,50
Berczy Park,50,50,50,50,50,50
"Brockton,Exhibition Place,Parkdale Village",29,29,29,29,29,29
Business Reply Mail Processing Centre 969 Eastern,7,7,7,7,7,7
"CN Tower,Bathurst Quay,Island airport,Harbourfront West,King and Spadina,Railway Lands,South Niagara",13,13,13,13,13,13
"Cabbagetown,St. James Town",29,29,29,29,29,29
Central Bay Street,50,50,50,50,50,50
"Chinatown,Grange Park,Kensington Market",50,50,50,50,50,50
Christie,37,37,37,37,37,37
Church and Wellesley,50,50,50,50,50,50
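
The counts above include every returned category. Since the goal is to find areas that are not crowded with coffee shops, one way to rank neighbourhoods is to count only the `Coffee Shop` rows. The sketch below assumes a dataframe with the same column names as `toronto`; the `venues` rows and the neighbourhood names `A`, `B`, `C` are made up for illustration.

```python
import pandas as pd

# Toy venue table with the same column names as `toronto` (rows are illustrative)
venues = pd.DataFrame({
    'Neighbourhood': ['A', 'A', 'A', 'B', 'B', 'C'],
    'Venue Category': ['Coffee Shop', 'Coffee Shop', 'Café',
                       'Coffee Shop', 'Bakery', 'Coffee Shop'],
})

# Keep coffee shops only, count them per neighbourhood, least crowded first
coffee_counts = (venues[venues['Venue Category'] == 'Coffee Shop']
                 .groupby('Neighbourhood')
                 .size()
                 .sort_values()
                 .rename('Coffee Shops'))

print(coffee_counts)
```

Neighbourhoods at the top of the sorted series have the fewest coffee shops within the search radius.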


Let's one-hot encode the venue categories and add the Neighbourhood column back to the resulting dataframe

In [57]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
toronto_onehot['Neighbourhood'] = toronto['Neighbourhood'] 

# move neighborhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.head()

Unnamed: 0,Neighbourhood,Bakery,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Café,Chinese Restaurant,Coffee Shop,...,Gaming Cafe,Gas Station,Grocery Store,Ice Cream Shop,Marijuana Dispensary,Restaurant,Smoke Shop,Smoothie Shop,Tea Room,Vegetarian / Vegan Restaurant
0,The Beaches,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,1
1,The Beaches,0,0,0,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0
2,The Beaches,0,0,0,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0
3,The Beaches,0,0,0,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0
4,The Beaches,0,0,0,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0


In [58]:
toronto_grouped = toronto_onehot.groupby('Neighbourhood').mean().reset_index()
toronto_grouped

Unnamed: 0,Neighbourhood,Bakery,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Café,Chinese Restaurant,Coffee Shop,...,Gaming Cafe,Gas Station,Grocery Store,Ice Cream Shop,Marijuana Dispensary,Restaurant,Smoke Shop,Smoothie Shop,Tea Room,Vegetarian / Vegan Restaurant
0,"Adelaide,King,Richmond",0.0,0.02,0.0,0.0,0.0,0.0,0.06,0.02,0.88,...,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0
1,Berczy Park,0.0,0.02,0.0,0.0,0.0,0.0,0.06,0.0,0.92,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Brockton,Exhibition Place,Parkdale Village",0.034483,0.0,0.034483,0.0,0.0,0.0,0.034483,0.0,0.827586,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0
3,Business Reply Mail Processing Centre 969 Eastern,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.857143,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"CN Tower,Bathurst Quay,Island airport,Harbourf...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,"Cabbagetown,St. James Town",0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.965517,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Central Bay Street,0.0,0.02,0.0,0.02,0.0,0.0,0.1,0.02,0.84,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,"Chinatown,Grange Park,Kensington Market",0.0,0.02,0.0,0.0,0.0,0.0,0.14,0.02,0.72,...,0.0,0.0,0.02,0.02,0.0,0.0,0.02,0.0,0.0,0.0
8,Christie,0.0,0.0,0.0,0.0,0.0,0.0,0.135135,0.0,0.837838,...,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0
9,Church and Wellesley,0.0,0.0,0.0,0.02,0.0,0.0,0.06,0.0,0.9,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0
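
To see why the mean of the one-hot columns is a category frequency, here is a small sketch on made-up data (the `toy` dataframe and the neighbourhood names `A` and `B` are assumptions for illustration). Each row contributes a single 1 to exactly one category column, so the per-neighbourhood mean of a column is the share of venues in that category.

```python
import pandas as pd

toy = pd.DataFrame({
    'Neighbourhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Coffee Shop', 'Coffee Shop', 'Coffee Shop', 'Café',
                       'Coffee Shop', 'Bakery'],
})

# One indicator column per category (dtype=float so the mean is a plain fraction)
onehot = pd.get_dummies(toy['Venue Category'], dtype=float)
onehot.insert(0, 'Neighbourhood', toy['Neighbourhood'])

# Mean of the indicators per neighbourhood = share of venues in each category
freq = onehot.groupby('Neighbourhood').mean()

print(freq)
```

Each row of `freq` sums to 1, which is what makes rows comparable across neighbourhoods with different venue counts.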


Let's find the most frequently occurring venues in each neighbourhood

In [59]:
num_top_venues = 5

for hood in toronto_grouped['Neighbourhood']:
    print("----"+hood+"----")
    temp = toronto_grouped[toronto_grouped['Neighbourhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Adelaide,King,Richmond----
                venue  freq
0         Coffee Shop  0.88
1                Café  0.06
2  Chinese Restaurant  0.02
3      Ice Cream Shop  0.02
4                 Bar  0.02


----Berczy Park----
         venue  freq
0  Coffee Shop  0.92
1         Café  0.06
2          Bar  0.02
3       Bakery  0.00
4  Gaming Cafe  0.00


----Brockton,Exhibition Place,Parkdale Village----
         venue  freq
0  Coffee Shop  0.83
1       Bakery  0.03
2         Café  0.03
3     Tea Room  0.03
4   Food Truck  0.03


----Business Reply Mail Processing Centre 969 Eastern----
               venue  freq
0        Coffee Shop  0.86
1             Bakery  0.14
2  French Restaurant  0.00
3           Tea Room  0.00
4      Smoothie Shop  0.00


----CN Tower,Bathurst Quay,Island airport,Harbourfront West,King and Spadina,Railway Lands,South Niagara----
               venue  freq
0        Coffee Shop   1.0
1             Bakery   0.0
2  French Restaurant   0.0
3           Tea Room   0.0
4     

Let's place the most common venues in a dataframe

In [60]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [61]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighbourhood']
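for ind in np.arange(num_top_venues):
    try:
        # 1st, 2nd and 3rd take their own suffix
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # every later position up to num_top_venues uses 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))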

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighbourhood'] = toronto_grouped['Neighbourhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Adelaide,King,Richmond",Coffee Shop,Café,Bar,Ice Cream Shop,Chinese Restaurant,Vegetarian / Vegan Restaurant,Dessert Shop,Bike Shop,Bookstore,Boutique
1,Berczy Park,Coffee Shop,Café,Bar,Vegetarian / Vegan Restaurant,Tea Room,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
2,"Brockton,Exhibition Place,Parkdale Village",Coffee Shop,Bakery,Bike Shop,Café,Food Truck,Tea Room,Dessert Shop,Bar,Bookstore,Boutique
3,Business Reply Mail Processing Centre 969 Eastern,Coffee Shop,Bakery,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Café,Chinese Restaurant
4,"CN Tower,Bathurst Quay,Island airport,Harbourf...",Coffee Shop,Vegetarian / Vegan Restaurant,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Café,Chinese Restaurant
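
The ranking step can be illustrated on a small frequency table. The sketch below uses `Series.nlargest` instead of a full sort; `freq`, `top_n` and `top` are hypothetical names and the numbers are made up. Note that when several categories share the same frequency (typically 0.0), their relative order carries no meaning, which is why the lower-ranked columns above are mostly alphabetical filler.

```python
import pandas as pd

# Toy frequency table: one row per neighbourhood, one column per category
freq = pd.DataFrame(
    {'Bakery': [0.0, 0.3], 'Café': [0.2, 0.1], 'Coffee Shop': [0.8, 0.6]},
    index=pd.Index(['A', 'B'], name='Neighbourhood'),
)

top_n = 2

# For each neighbourhood, keep the names of the top_n most frequent categories
top = {hood: list(row.nlargest(top_n).index) for hood, row in freq.iterrows()}

print(top)
```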


Now, let's use the k-means algorithm to cluster the neighbourhoods

In [62]:
from sklearn.cluster import KMeans
# set number of clusters
kclusters = 6

toronto_grouped_clustering = toronto_grouped.drop('Neighbourhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:50] 

array([0, 0, 5, 5, 1, 1, 2, 4, 2, 0, 0, 1, 1, 2, 0, 1, 0, 4, 2, 0, 0, 0,
       1, 5, 1, 1, 1, 1, 3, 2, 0, 0, 0, 1, 0, 0, 5, 0], dtype=int32)
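
For intuition about what the clustering step does, here is a minimal NumPy sketch of Lloyd's iteration, the basic procedure behind k-means. It is not scikit-learn's implementation (which, among other things, defaults to k-means++ initialisation); `lloyd_kmeans` and the toy matrix `X` are assumptions made for illustration.

```python
import numpy as np

def lloyd_kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm: alternate assignment and centroid update."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Update step: mean of each cluster; keep the old centroid if a cluster is empty
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Toy frequency rows: two coffee-heavy neighbourhoods and two mixed ones
X = np.array([[0.90, 0.10],
              [0.88, 0.12],
              [0.50, 0.50],
              [0.52, 0.48]])
labels, centroids = lloyd_kmeans(X, k=2)

print(labels)
```

The label numbers themselves are arbitrary; only which rows share a label is meaningful.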

In [63]:
# add clustering labels

# neighborhoods_venues_sorted = neighborhoods_venues_sorted.drop(columns=['Cluster Labels', 'Cluster_Labels'])
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

# join the cluster label and top venues of each neighbourhood onto the per-venue table
toronto_merged = toronto.join(neighborhoods_venues_sorted.set_index('Neighbourhood'), on='Neighbourhood')

toronto_merged # check the last columns!

Unnamed: 0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,The Beaches,43.676357,-79.293031,Tori's Bakeshop,43.672114,-79.290331,Vegetarian / Vegan Restaurant,0,Coffee Shop,Vegetarian / Vegan Restaurant,Café,French Restaurant,Dessert Shop,Bar,Bike Shop,Bookstore,Boutique,Cafeteria
1,The Beaches,43.676357,-79.293031,Starbucks,43.680806,-79.285137,Coffee Shop,0,Coffee Shop,Vegetarian / Vegan Restaurant,Café,French Restaurant,Dessert Shop,Bar,Bike Shop,Bookstore,Boutique,Cafeteria
2,The Beaches,43.676357,-79.293031,Starbucks,43.669564,-79.301969,Coffee Shop,0,Coffee Shop,Vegetarian / Vegan Restaurant,Café,French Restaurant,Dessert Shop,Bar,Bike Shop,Bookstore,Boutique,Cafeteria
3,The Beaches,43.676357,-79.293031,Grinder,43.683073,-79.299875,Coffee Shop,0,Coffee Shop,Vegetarian / Vegan Restaurant,Café,French Restaurant,Dessert Shop,Bar,Bike Shop,Bookstore,Boutique,Cafeteria
4,The Beaches,43.676357,-79.293031,Tim Hortons,43.680799,-79.282907,Coffee Shop,0,Coffee Shop,Vegetarian / Vegan Restaurant,Café,French Restaurant,Dessert Shop,Bar,Bike Shop,Bookstore,Boutique,Cafeteria
5,The Beaches,43.676357,-79.293031,Starbucks,43.669693,-79.302124,Coffee Shop,0,Coffee Shop,Vegetarian / Vegan Restaurant,Café,French Restaurant,Dessert Shop,Bar,Bike Shop,Bookstore,Boutique,Cafeteria
6,The Beaches,43.676357,-79.293031,Tim Hortons,43.670286,-79.299733,Coffee Shop,0,Coffee Shop,Vegetarian / Vegan Restaurant,Café,French Restaurant,Dessert Shop,Bar,Bike Shop,Bookstore,Boutique,Cafeteria
7,The Beaches,43.676357,-79.293031,Dip 'n Sip,43.678897,-79.297745,Coffee Shop,0,Coffee Shop,Vegetarian / Vegan Restaurant,Café,French Restaurant,Dessert Shop,Bar,Bike Shop,Bookstore,Boutique,Cafeteria
8,The Beaches,43.676357,-79.293031,Prana Coffee,43.671306,-79.294092,Coffee Shop,0,Coffee Shop,Vegetarian / Vegan Restaurant,Café,French Restaurant,Dessert Shop,Bar,Bike Shop,Bookstore,Boutique,Cafeteria
9,The Beaches,43.676357,-79.293031,Savoury Grounds,43.680540,-79.287421,Coffee Shop,0,Coffee Shop,Vegetarian / Vegan Restaurant,Café,French Restaurant,Dessert Shop,Bar,Bike Shop,Bookstore,Boutique,Cafeteria


Let's map the clusters

In [64]:
import matplotlib.cm as cm
import matplotlib.colors as colors

In [65]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add one marker per neighbourhood (toronto_merged holds one row per venue)
neighbourhood_points = toronto_merged.drop_duplicates('Neighbourhood')
for lat, lon, poi, cluster in zip(neighbourhood_points['Neighbourhood Latitude'], neighbourhood_points['Neighbourhood Longitude'], neighbourhood_points['Neighbourhood'], neighbourhood_points['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [66]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighbourhood Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,43.676357,-79.290331,Vegetarian / Vegan Restaurant,0,Coffee Shop,Vegetarian / Vegan Restaurant,Café,French Restaurant,Dessert Shop,Bar,Bike Shop,Bookstore,Boutique,Cafeteria
1,43.676357,-79.285137,Coffee Shop,0,Coffee Shop,Vegetarian / Vegan Restaurant,Café,French Restaurant,Dessert Shop,Bar,Bike Shop,Bookstore,Boutique,Cafeteria
2,43.676357,-79.301969,Coffee Shop,0,Coffee Shop,Vegetarian / Vegan Restaurant,Café,French Restaurant,Dessert Shop,Bar,Bike Shop,Bookstore,Boutique,Cafeteria
3,43.676357,-79.299875,Coffee Shop,0,Coffee Shop,Vegetarian / Vegan Restaurant,Café,French Restaurant,Dessert Shop,Bar,Bike Shop,Bookstore,Boutique,Cafeteria
4,43.676357,-79.282907,Coffee Shop,0,Coffee Shop,Vegetarian / Vegan Restaurant,Café,French Restaurant,Dessert Shop,Bar,Bike Shop,Bookstore,Boutique,Cafeteria
5,43.676357,-79.302124,Coffee Shop,0,Coffee Shop,Vegetarian / Vegan Restaurant,Café,French Restaurant,Dessert Shop,Bar,Bike Shop,Bookstore,Boutique,Cafeteria
6,43.676357,-79.299733,Coffee Shop,0,Coffee Shop,Vegetarian / Vegan Restaurant,Café,French Restaurant,Dessert Shop,Bar,Bike Shop,Bookstore,Boutique,Cafeteria
7,43.676357,-79.297745,Coffee Shop,0,Coffee Shop,Vegetarian / Vegan Restaurant,Café,French Restaurant,Dessert Shop,Bar,Bike Shop,Bookstore,Boutique,Cafeteria
8,43.676357,-79.294092,Coffee Shop,0,Coffee Shop,Vegetarian / Vegan Restaurant,Café,French Restaurant,Dessert Shop,Bar,Bike Shop,Bookstore,Boutique,Cafeteria
9,43.676357,-79.287421,Coffee Shop,0,Coffee Shop,Vegetarian / Vegan Restaurant,Café,French Restaurant,Dessert Shop,Bar,Bike Shop,Bookstore,Boutique,Cafeteria


In [67]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighbourhood Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
49,43.659526,-79.342565,Café,1,Coffee Shop,Café,Vegetarian / Vegan Restaurant,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
50,43.659526,-79.342461,Coffee Shop,1,Coffee Shop,Café,Vegetarian / Vegan Restaurant,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
51,43.659526,-79.341241,Coffee Shop,1,Coffee Shop,Café,Vegetarian / Vegan Restaurant,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
52,43.659526,-79.341510,Coffee Shop,1,Coffee Shop,Café,Vegetarian / Vegan Restaurant,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
53,43.659526,-79.342295,Coffee Shop,1,Coffee Shop,Café,Vegetarian / Vegan Restaurant,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
54,43.659526,-79.338577,Coffee Shop,1,Coffee Shop,Café,Vegetarian / Vegan Restaurant,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
55,43.659526,-79.353500,Coffee Shop,1,Coffee Shop,Café,Vegetarian / Vegan Restaurant,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
56,43.659526,-79.342151,Coffee Shop,1,Coffee Shop,Café,Vegetarian / Vegan Restaurant,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
57,43.659526,-79.328277,Coffee Shop,1,Coffee Shop,Café,Vegetarian / Vegan Restaurant,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
58,43.659526,-79.328292,Coffee Shop,1,Coffee Shop,Café,Vegetarian / Vegan Restaurant,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant


In [68]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighbourhood Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
175,43.686412,-79.398612,Coffee Shop,2,Coffee Shop,Café,Vegetarian / Vegan Restaurant,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
176,43.686412,-79.392148,Coffee Shop,2,Coffee Shop,Café,Vegetarian / Vegan Restaurant,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
177,43.686412,-79.395570,Coffee Shop,2,Coffee Shop,Café,Vegetarian / Vegan Restaurant,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
178,43.686412,-79.393281,Coffee Shop,2,Coffee Shop,Café,Vegetarian / Vegan Restaurant,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
179,43.686412,-79.394475,Coffee Shop,2,Coffee Shop,Café,Vegetarian / Vegan Restaurant,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
180,43.686412,-79.396840,Coffee Shop,2,Coffee Shop,Café,Vegetarian / Vegan Restaurant,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
181,43.686412,-79.394401,Coffee Shop,2,Coffee Shop,Café,Vegetarian / Vegan Restaurant,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
182,43.686412,-79.394863,Coffee Shop,2,Coffee Shop,Café,Vegetarian / Vegan Restaurant,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
183,43.686412,-79.396746,Coffee Shop,2,Coffee Shop,Café,Vegetarian / Vegan Restaurant,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
184,43.686412,-79.403532,Coffee Shop,2,Coffee Shop,Café,Vegetarian / Vegan Restaurant,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant


In [69]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighbourhood Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
749,43.711695,-79.413698,Coffee Shop,3,Coffee Shop,Tea Room,Vegetarian / Vegan Restaurant,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Café,Chinese Restaurant
750,43.711695,-79.411762,Coffee Shop,3,Coffee Shop,Tea Room,Vegetarian / Vegan Restaurant,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Café,Chinese Restaurant
751,43.711695,-79.413824,Coffee Shop,3,Coffee Shop,Tea Room,Vegetarian / Vegan Restaurant,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Café,Chinese Restaurant
752,43.711695,-79.409193,Coffee Shop,3,Coffee Shop,Tea Room,Vegetarian / Vegan Restaurant,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Café,Chinese Restaurant
753,43.711695,-79.430534,Coffee Shop,3,Coffee Shop,Tea Room,Vegetarian / Vegan Restaurant,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Café,Chinese Restaurant
754,43.711695,-79.40592,Coffee Shop,3,Coffee Shop,Tea Room,Vegetarian / Vegan Restaurant,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Café,Chinese Restaurant
755,43.711695,-79.409655,Tea Room,3,Coffee Shop,Tea Room,Vegetarian / Vegan Restaurant,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Café,Chinese Restaurant


In [70]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighbourhood Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
756,43.696948,-79.413698,Coffee Shop,4,Coffee Shop,Tea Room,Café,Vegetarian / Vegan Restaurant,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
757,43.696948,-79.411762,Coffee Shop,4,Coffee Shop,Tea Room,Café,Vegetarian / Vegan Restaurant,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
758,43.696948,-79.412803,Coffee Shop,4,Coffee Shop,Tea Room,Café,Vegetarian / Vegan Restaurant,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
759,43.696948,-79.409193,Coffee Shop,4,Coffee Shop,Tea Room,Café,Vegetarian / Vegan Restaurant,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
760,43.696948,-79.413824,Coffee Shop,4,Coffee Shop,Tea Room,Café,Vegetarian / Vegan Restaurant,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
761,43.696948,-79.412601,Café,4,Coffee Shop,Tea Room,Café,Vegetarian / Vegan Restaurant,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
762,43.696948,-79.409655,Tea Room,4,Coffee Shop,Tea Room,Café,Vegetarian / Vegan Restaurant,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
763,43.696948,-79.40592,Coffee Shop,4,Coffee Shop,Tea Room,Café,Vegetarian / Vegan Restaurant,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
864,43.653206,-79.400182,Café,4,Coffee Shop,Café,Donut Shop,Bar,Smoke Shop,Ice Cream Shop,Grocery Store,French Restaurant,Chinese Restaurant,Deli / Bodega
865,43.653206,-79.399106,Café,4,Coffee Shop,Café,Donut Shop,Bar,Smoke Shop,Ice Cream Shop,Grocery Store,French Restaurant,Chinese Restaurant,Deli / Bodega


In [71]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 5, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighbourhood Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
38,43.668999,-79.312404,Coffee Shop,5,Coffee Shop,Bakery,Café,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
39,43.668999,-79.308015,Coffee Shop,5,Coffee Shop,Bakery,Café,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
40,43.668999,-79.308204,Coffee Shop,5,Coffee Shop,Bakery,Café,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
41,43.668999,-79.309945,Coffee Shop,5,Coffee Shop,Bakery,Café,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
42,43.668999,-79.301969,Coffee Shop,5,Coffee Shop,Bakery,Café,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
43,43.668999,-79.302124,Coffee Shop,5,Coffee Shop,Bakery,Café,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
44,43.668999,-79.328110,Bakery,5,Coffee Shop,Bakery,Café,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
45,43.668999,-79.306890,Coffee Shop,5,Coffee Shop,Bakery,Café,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
46,43.668999,-79.320412,Coffee Shop,5,Coffee Shop,Bakery,Café,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
47,43.668999,-79.303218,Coffee Shop,5,Coffee Shop,Bakery,Café,Tea Room,Bar,Bike Shop,Bookstore,Boutique,Cafeteria,Chinese Restaurant
