# Part 3 - Analysis of Toronto Neighbourhoods

This notebook follows on from the two previous Battle of the Neighbourhoods notebooks. The first several cells repeat the data gathering and cleaning steps; the main focus, however, is the analysis of the data.

### Data Gathering 

In [1]:
!pip install bs4
!pip install requests
from bs4 import BeautifulSoup
import requests
import pandas as pd
import numpy as np
print('importing complete')

importing complete


In [2]:
url = 'https://en.wikipedia.org/w/index.php?title=List_of_postal_codes_of_Canada:_M&oldid=1012118802'
data = requests.get(url).text
soup = BeautifulSoup(data, "html5lib")
tables = soup.find_all('table')

In [3]:
# parse the first table on the page into a DataFrame
N_df = pd.read_html(str(tables[0]), flavor = 'bs4')[0]
N_df.replace('Not assigned', np.nan, inplace = True)
Ndf = N_df.dropna()
Ndf = Ndf.reset_index(drop = True)
Ndf.head(12)

Unnamed: 0,Postal Code,Borough,Neighbourhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park, Harbourfront"
3,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
5,M9A,Etobicoke,"Islington Avenue, Humber Valley Village"
6,M1B,Scarborough,"Malvern, Rouge"
7,M3B,North York,Don Mills
8,M4B,East York,"Parkview Hill, Woodbine Gardens"
9,M5B,Downtown Toronto,"Garden District, Ryerson"
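
The cleaning step above can be sketched on a toy frame. The rows below are illustrative stand-ins, not values taken from the Wikipedia table; the point is only that the `Not assigned` placeholder becomes `NaN` and the incomplete rows are then dropped.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the scraped table (illustrative rows only).
raw = pd.DataFrame({
    "Postal Code": ["M1A", "M3A", "M4A"],
    "Borough": ["Not assigned", "North York", "North York"],
    "Neighbourhood": ["Not assigned", "Parkwoods", "Victoria Village"],
})

# Treat the placeholder string as missing, drop incomplete rows,
# and renumber the index so it starts at 0 again.
clean = (
    raw.replace("Not assigned", np.nan)
       .dropna()
       .reset_index(drop=True)
)
print(clean)
```

Chaining the calls avoids `inplace=True` and leaves the original frame untouched, which makes the cell safe to re-run.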


### Collecting Geospatial Data

In [4]:
!pip install geocoder
import geocoder
from geopy.geocoders import Nominatim 



In [5]:
g_df = pd.read_csv('https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DS0701EN-SkillsNetwork/labs_v1/Geospatial_Coordinates.csv')

Ndf = Ndf.sort_values("Postal Code")
Ndf = Ndf.reset_index(drop = True)

Geo_df = pd.merge(Ndf, g_df, on = "Postal Code")
Geo_df.head(12)

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
0,M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
5,M1J,Scarborough,Scarborough Village,43.744734,-79.239476
6,M1K,Scarborough,"Kennedy Park, Ionview, East Birchmount Park",43.727929,-79.262029
7,M1L,Scarborough,"Golden Mile, Clairlea, Oakridge",43.711112,-79.284577
8,M1M,Scarborough,"Cliffside, Cliffcrest, Scarborough Village West",43.716316,-79.239476
9,M1N,Scarborough,"Birch Cliff, Cliffside West",43.692657,-79.264848
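
`pd.merge` defaults to an inner join, so any postal code without coordinates would be dropped silently. A minimal sketch of how one might check for that, using small made-up frames (the `M9Z` row is hypothetical and exists only to show an unmatched code):

```python
import pandas as pd

codes = pd.DataFrame({
    "Postal Code": ["M1B", "M1C", "M9Z"],  # "M9Z" is a made-up code
    "Borough": ["Scarborough", "Scarborough", "Hypothetical"],
})
coords = pd.DataFrame({
    "Postal Code": ["M1B", "M1C"],
    "Latitude": [43.806686, 43.784535],
    "Longitude": [-79.194353, -79.160497],
})

# how="left" keeps every postal code; indicator=True adds a "_merge" column
# that records whether a match was found; validate guards against duplicates.
checked = pd.merge(codes, coords, on="Postal Code", how="left",
                   validate="one_to_one", indicator=True)
unmatched = checked.loc[checked["_merge"] == "left_only", "Postal Code"].tolist()
print(unmatched)
```

If `unmatched` is empty for the real data, the inner join used in the notebook loses nothing.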


## Data Exploration

In [6]:
import requests
from pandas import json_normalize # pandas.io.json.json_normalize is deprecated since pandas 1.0

import matplotlib.cm as cm
import matplotlib.colors as colors

from sklearn.cluster import KMeans

!pip install folium
import folium
print('importing complete')

Collecting folium
  Downloading folium-0.12.1-py2.py3-none-any.whl (94 kB)
Collecting branca>=0.3.0
  Downloading branca-0.4.2-py3-none-any.whl (24 kB)
Installing collected packages: branca, folium
Successfully installed branca-0.4.2 folium-0.12.1
importing complete


In [21]:
address ='Toronto, Ontario'
geolocator = Nominatim(user_agent = "Toronto_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The coordinates for Toronto, Ontario are {}, {}.'.format(latitude, longitude))

The coordinates for Toronto, Ontario are 43.6534817, -79.3839347.


Using these coordinates, we can build a map and superimpose the boroughs on it, labelling the neighbourhoods within each one to help visualise them.

In [117]:
map_toronto = folium.Map(location =[latitude, longitude], zoom_start = 11)
for lat, lng, borough, neighbourhood in zip(Geo_df['Latitude'], Geo_df['Longitude'], Geo_df['Borough'], Geo_df['Neighbourhood']):
    label = '{}, {}'.format(neighbourhood, borough)
    label = folium.Popup(label, parse_html= True)
    folium.CircleMarker(
        [lat,lng],
        radius = 5,
        popup = label,
        color = 'grey',
        fill = True,
        fill_opacity =0.5, 
        parse_html = False).add_to(map_toronto)
    
map_toronto

As we can see, there are quite a lot of boroughs and neighbourhoods. To simplify, let's work only with the boroughs whose names contain the word Toronto.


In [93]:
toronto_df = Geo_df[Geo_df['Borough'].str.contains('Toronto')]
toronto_df.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
37,M4E,East Toronto,The Beaches,43.676357,-79.293031
41,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188
42,M4L,East Toronto,"India Bazaar, The Beaches West",43.668999,-79.315572
43,M4M,East Toronto,Studio District,43.659526,-79.340923
44,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879


With this simplification, we can now analyse East, West, Central, and Downtown Toronto.
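
As a small illustration of the filter, here is the same `str.contains` test on a made-up column. The `na=False` and `regex=False` arguments are additions, not part of the notebook's cell: they make the match a plain substring test and treat missing boroughs as non-matches.

```python
import pandas as pd

# Illustrative borough names, including one missing value.
boroughs = pd.DataFrame(
    {"Borough": ["East Toronto", "Scarborough", "Downtown Toronto", None]}
)

# Plain substring match; rows with a missing borough are excluded.
mask = boroughs["Borough"].str.contains("Toronto", na=False, regex=False)
kept = boroughs[mask]
print(sorted(kept["Borough"].unique()))
```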

## Utilizing the Foursquare API for further Data Exploration

To further explore and segment the neighbourhoods, we will use the Foursquare API.

In [94]:
# The code was removed by Watson Studio for sharing.
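
The hidden cell presumably defines `CLIENT_ID`, `CLIENT_SECRET`, `VERSION`, and `LIMIT`, since the next function uses all four. A minimal sketch of one way to define them without putting secrets in the notebook follows. The environment variable names and the `VERSION` and `LIMIT` values are assumptions, not taken from the original cell.

```python
import os

# Hypothetical environment variable names; the hidden cell may differ.
CLIENT_ID = os.environ.get("FOURSQUARE_CLIENT_ID", "")
CLIENT_SECRET = os.environ.get("FOURSQUARE_CLIENT_SECRET", "")

# Assumed values: the v2 API expects a YYYYMMDD version date, and a limit
# of 100 would be consistent with venue counts that top out at 100.
VERSION = "20180605"
LIMIT = 100

if not CLIENT_ID or not CLIENT_SECRET:
    print("Foursquare credentials are not set; API calls will fail.")
```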

In [95]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
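
The list comprehension in `getNearbyVenues` assumes every venue has at least one category. A hedged sketch of a more defensive extraction step is shown below; `extract_venues` is a hypothetical helper, and `sample_items` is hand-written data shaped like the `items` list the function reads, not a real API response.

```python
def extract_venues(name, lat, lng, items):
    """Flatten Foursquare-style 'items' into rows for one neighbourhood."""
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        # Fall back to an empty category when the list is missing or empty.
        categories = venue.get("categories") or [{}]
        rows.append((
            name, lat, lng,
            venue.get("name"),
            location.get("lat"),
            location.get("lng"),
            categories[0].get("name"),
        ))
    return rows


# Hand-written sample; the second venue has no category.
sample_items = [
    {"venue": {"name": "Example Cafe",
               "location": {"lat": 43.67, "lng": -79.29},
               "categories": [{"name": "Café"}]}},
    {"venue": {"name": "Uncategorised Spot",
               "location": {"lat": 43.68, "lng": -79.30},
               "categories": []}},
]
rows = extract_venues("The Beaches", 43.676357, -79.293031, sample_items)
print(rows)
```

Venues without a category end up with `None` in the last field rather than raising an `IndexError`.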

In [100]:
# Load the above result into a new data frame

toronto_venue = getNearbyVenues(names=toronto_df['Neighbourhood'],
                                   latitudes=toronto_df['Latitude'],
                                   longitudes=toronto_df['Longitude']
                                  )

The Beaches
The Danforth West, Riverdale
India Bazaar, The Beaches West
Studio District
Lawrence Park
Davisville North
North Toronto West, Lawrence Park
Davisville
Moore Park, Summerhill East
Summerhill West, Rathnelly, South Hill, Forest Hill SE, Deer Park
Rosedale
St. James Town, Cabbagetown
Church and Wellesley
Regent Park, Harbourfront
Garden District, Ryerson
St. James Town
Berczy Park
Central Bay Street
Richmond, Adelaide, King
Harbourfront East, Union Station, Toronto Islands
Toronto Dominion Centre, Design Exchange
Commerce Court, Victoria Hotel
Roselawn
Forest Hill North & West, Forest Hill Road Park
The Annex, North Midtown, Yorkville
University of Toronto, Harbord
Kensington Market, Chinatown, Grange Park
CN Tower, King and Spadina, Railway Lands, Harbourfront West, Bathurst Quay, South Niagara, Island airport
Stn A PO Boxes
First Canadian Place, Underground city
Christie
Dufferin, Dovercourt Village
Little Portugal, Trinity
Brockton, Parkdale Village, Exhibition Place
Runny

In [101]:
print(toronto_venue.shape)
toronto_venue.head(12)

(1596, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,The Beaches,43.676357,-79.293031,Glen Manor Ravine,43.676821,-79.293942,Trail
1,The Beaches,43.676357,-79.293031,The Big Carrot Natural Food Market,43.678879,-79.297734,Health Food Store
2,The Beaches,43.676357,-79.293031,Grover Pub and Grub,43.679181,-79.297215,Pub
3,The Beaches,43.676357,-79.293031,Upper Beaches,43.680563,-79.292869,Neighborhood
4,"The Danforth West, Riverdale",43.679557,-79.352188,MenEssentials,43.67782,-79.351265,Cosmetics Shop
5,"The Danforth West, Riverdale",43.679557,-79.352188,Pantheon,43.677621,-79.351434,Greek Restaurant
6,"The Danforth West, Riverdale",43.679557,-79.352188,Dolce Gelato,43.677773,-79.351187,Ice Cream Shop
7,"The Danforth West, Riverdale",43.679557,-79.352188,Cafe Fiorentina,43.677743,-79.350115,Italian Restaurant
8,"The Danforth West, Riverdale",43.679557,-79.352188,La Diperie,43.677702,-79.352265,Ice Cream Shop
9,"The Danforth West, Riverdale",43.679557,-79.352188,Moksha Yoga Danforth,43.677622,-79.352116,Yoga Studio


Let's see how many venues were returned per neighbourhood, in descending order:

In [116]:
t_venue_count = toronto_venue.drop(['Neighborhood Latitude','Neighborhood Longitude', 'Venue Latitude', 'Venue Longitude', 'Venue Category'], axis =1)
t_venue_count = t_venue_count.groupby('Neighborhood').count() # count the venues in each neighbourhood
t_venue_count.sort_values(by = 'Venue', ascending = False) # sort from most to fewest venues

Unnamed: 0_level_0,Venue
Neighborhood,Unnamed: 1_level_1
"Garden District, Ryerson",100
"Toronto Dominion Centre, Design Exchange",100
"Commerce Court, Victoria Hotel",100
Stn A PO Boxes,100
"Harbourfront East, Union Station, Toronto Islands",100
"First Canadian Place, Underground city",100
"Richmond, Adelaide, King",91
St. James Town,79
Church and Wellesley,78
"Kensington Market, Chinatown, Grange Park",61


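The same count can be sketched more directly with `value_counts`, which already sorts in descending order. The data and the `limit` value here are made up for illustration; the idea is that neighbourhoods sitting exactly at the request limit were probably truncated by the API rather than fully counted.

```python
import pandas as pd

# Toy venue table: neighbourhood "A" has three venues, "B" two, "C" one.
venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "B", "A", "C", "B"],
    "Venue": ["v1", "v2", "v3", "v4", "v5", "v6"],
})

counts = venues["Neighborhood"].value_counts()  # sorted from most to fewest

limit = 3  # stand-in for the per-request cap (LIMIT in the notebook)
capped = counts[counts >= limit].index.tolist()
print(counts)
print("At the cap:", capped)
```
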
## Analysing Each Neighborhood

Now that we have geospatial data for each neighbourhood, plus secondary information such as the number of venues in each neighbourhood, we can move on to clustering the neighbourhoods.

First, let's one-hot encode the venue categories to see which neighbourhoods have which venues.

In [143]:
t_onehot = pd.get_dummies(toronto_venue[['Venue Category']], prefix =" ", prefix_sep=" ")
t_onehot

Unnamed: 0,Adult Boutique,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,...,Theme Restaurant,Tibetan Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Yoga Studio
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,1,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1591,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1592,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1593,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1594,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


Now that we have one-hot encoded the venues, we need to add in the corresponding neighbourhoods.

In [144]:
t_onehot['Neighborhood'] = toronto_venue['Neighborhood']
t_onehot

Unnamed: 0,Adult Boutique,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,...,Tibetan Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Yoga Studio,Neighborhood
0,0,0,0,0,0,0,0,0,0,0,...,0,0,1,0,0,0,0,0,0,The Beaches
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,The Beaches
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,The Beaches
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,The Beaches
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,"The Danforth West, Riverdale"
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1591,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,"Business reply mail Processing Centre, South C..."
1592,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,"Business reply mail Processing Centre, South C..."
1593,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,"Business reply mail Processing Centre, South C..."
1594,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,"Business reply mail Processing Centre, South C..."


With the neighbourhoods in the DataFrame, let's move that column from the right-hand side to the left-hand side.

In [145]:
fixed_columns = [t_onehot.columns[-1]] + list(t_onehot.columns[:-1])
t_onehot = t_onehot[fixed_columns]
print(t_onehot.shape)

(1596, 235)


In [147]:
t_onehot.head()

Unnamed: 0,Neighborhood,Adult Boutique,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,...,Theme Restaurant,Tibetan Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Yoga Studio
0,The Beaches,0,0,0,0,0,0,0,0,0,...,0,0,0,1,0,0,0,0,0,0
1,The Beaches,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,The Beaches,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,The Beaches,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,"The Danforth West, Riverdale",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


#### With this we can now see which neighbourhoods have which venues. Next, we group the rows by neighbourhood and take the mean frequency of each category.

In [150]:
t_grouped = t_onehot.groupby('Neighborhood').mean().reset_index()
t_grouped.head()

Unnamed: 0,Neighborhood,Adult Boutique,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,...,Theme Restaurant,Tibetan Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Yoga Studio
0,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0
1,"Brockton, Parkdale Village, Exhibition Place",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04
2,"Business reply mail Processing Centre, South C...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,"CN Tower, King and Spadina, Railway Lands, Har...",0.0,0.0,0.066667,0.066667,0.066667,0.133333,0.133333,0.066667,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Central Bay Street,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.016393,0.0,0.0,0.016393,0.016393


In [151]:
t_grouped.shape

(40, 235)
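
To make the one-hot and group-by steps concrete, here is a minimal sketch on made-up data. Because each venue row has exactly one `1`, the per-neighbourhood mean of a column is the share of that neighbourhood's venues in that category, and each row of the result sums to 1.

```python
import pandas as pd

# Toy venue list: "A" has 2 cafés, 1 pub, 1 park; "B" has 2 parks.
venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "B", "B"],
    "Venue Category": ["Café", "Café", "Pub", "Park", "Park", "Park"],
})

# One column per category, one row per venue.
onehot = pd.get_dummies(venues["Venue Category"], dtype=float)
onehot.insert(0, "Neighborhood", venues["Neighborhood"])

# Mean of the 0/1 columns = relative frequency of each category.
freq = onehot.groupby("Neighborhood").mean()
print(freq)
```

In the real data one venue category is itself called `Neighborhood`, so a plain `insert` like this would fail on the duplicate column name; the space prefix passed to `get_dummies` in the notebook appears to be what keeps the two apart.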

### Now let's see the top 5 most common venues in each neighbourhood

In [153]:
num_top_v = 5

for neighborhood in t_grouped['Neighborhood']: 
    print("----"+neighborhood+"----")
    temp = t_grouped[t_grouped['Neighborhood'] == neighborhood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_v))
    print('\n')

----Berczy Park----
            venue  freq
0     Coffee Shop  0.10
1    Cocktail Bar  0.07
2          Bakery  0.05
3      Restaurant  0.03
4        Beer Bar  0.03


----Brockton, Parkdale Village, Exhibition Place----
              venue  freq
0              Café  0.12
1    Breakfast Spot  0.08
2            Bakery  0.08
3       Coffee Shop  0.08
4       Yoga Studio  0.04


----Business reply mail Processing Centre, South Central Letter Processing Plant Toronto----
                venue  freq
0         Pizza Place  0.07
1    Recording Studio  0.07
2      Farmers Market  0.07
3          Skate Park  0.07
4       Burrito Place  0.07


----CN Tower, King and Spadina, Railway Lands, Harbourfront West, Bathurst Quay, South Niagara, Island airport----
                   venue  freq
0         Airport Lounge  0.13
1        Airport Service  0.13
2        Harbor / Marina  0.07
3                    Bar  0.07
4    Rental Car Location  0.07


----Central Bay Street----
                  venue  freq


In [154]:
def return_most_common_venues(row, num_top_v):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_v]

In [157]:
num_top_venues = 5

indicators = ['st', 'nd', 'rd']

columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = t_grouped['Neighborhood']

for ind in np.arange(t_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(t_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head(12)

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Berczy Park,Coffee Shop,Cocktail Bar,Bakery,Restaurant,Beer Bar
1,"Brockton, Parkdale Village, Exhibition Place",Café,Bakery,Breakfast Spot,Coffee Shop,Yoga Studio
2,"Business reply mail Processing Centre, South C...",Gym / Fitness Center,Auto Workshop,Comic Shop,Pizza Place,Recording Studio
3,"CN Tower, King and Spadina, Railway Lands, Har...",Airport Lounge,Airport Service,Boat or Ferry,Harbor / Marina,Airport
4,Central Bay Street,Coffee Shop,Café,Italian Restaurant,Sandwich Place,Japanese Restaurant
5,Christie,Grocery Store,Café,Park,Athletics & Sports,Baby Store
6,Church and Wellesley,Coffee Shop,Japanese Restaurant,Sushi Restaurant,Gay Bar,Restaurant
7,"Commerce Court, Victoria Hotel",Coffee Shop,Restaurant,Café,Hotel,Gym
8,Davisville,Sandwich Place,Dessert Shop,Italian Restaurant,Gym,Coffee Shop
9,Davisville North,Gym / Fitness Center,Hotel,Pizza Place,Playground,Department Store


## Cluster Analysis

In [158]:
kclusters = 5

t_clustering = t_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(t_clustering)

kmeans.labels_[0:10] 

array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1], dtype=int32)
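
The first ten labels all fall in cluster 1, which hints that one cluster may dominate. A quick way to check is to count the members of each cluster. The labels below are illustrative, not the notebook's; with the real model one would pass `kmeans.labels_.tolist()` instead.

```python
from collections import Counter

# Illustrative labels only; substitute kmeans.labels_.tolist() in the notebook.
labels = [1, 1, 1, 1, 0, 3, 1, 2, 3, 4]

sizes = Counter(labels)  # cluster id -> number of neighbourhoods
singletons = sorted(c for c, n in sizes.items() if n == 1)

for cluster, n in sorted(sizes.items()):
    print(f"cluster {cluster}: {n} neighbourhood(s)")
print("Clusters with a single member:", singletons)
```

Several single-member clusters would suggest that the chosen `kclusters` mostly isolates outliers rather than finding groups.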

In [165]:
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

In [166]:
t_merged = toronto_df

t_merged = t_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighbourhood')

t_merged.head() # check the last columns!

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
37,M4E,East Toronto,The Beaches,43.676357,-79.293031,0,Neighborhood,Health Food Store,Trail,Pub,Yoga Studio
41,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188,1,Greek Restaurant,Coffee Shop,Italian Restaurant,Ice Cream Shop,Furniture / Home Store
42,M4L,East Toronto,"India Bazaar, The Beaches West",43.668999,-79.315572,1,Fast Food Restaurant,Pizza Place,Coffee Shop,Pub,Liquor Store
43,M4M,East Toronto,Studio District,43.659526,-79.340923,1,Coffee Shop,American Restaurant,Bakery,Brewery,Café
44,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879,3,Park,Bus Line,Business Service,Swim School,Yoga Studio
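
If any neighbourhood returned no venues, it will be missing from the grouped table and the join leaves `NaN` in its cluster column, which also turns the whole column into floats. A minimal sketch of one way to guard against that before plotting, using made-up frames in which neighbourhood `C` has no cluster:

```python
import pandas as pd

# Toy frames: "C" has no venues, so it has no cluster label.
left = pd.DataFrame({"Neighbourhood": ["A", "B", "C"]})
right = pd.DataFrame({"Neighborhood": ["A", "B"], "Cluster Labels": [0, 1]})

merged = left.join(right.set_index("Neighborhood"), on="Neighbourhood")

# Drop rows without a cluster and restore integer labels so they can be
# used as list indices when picking marker colours.
merged = merged.dropna(subset=["Cluster Labels"])
merged["Cluster Labels"] = merged["Cluster Labels"].astype(int)
print(merged)
```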


In [167]:
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(t_merged['Latitude'], t_merged['Longitude'], t_merged['Neighbourhood'], t_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [168]:
t_merged.loc[t_merged['Cluster Labels'] == 0, t_merged.columns[[1] + list(range(5, t_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
37,East Toronto,0,Neighborhood,Health Food Store,Trail,Pub,Yoga Studio


In [169]:
t_merged.loc[t_merged['Cluster Labels'] == 1, t_merged.columns[[1] + list(range(5, t_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
41,East Toronto,1,Greek Restaurant,Coffee Shop,Italian Restaurant,Ice Cream Shop,Furniture / Home Store
42,East Toronto,1,Fast Food Restaurant,Pizza Place,Coffee Shop,Pub,Liquor Store
43,East Toronto,1,Coffee Shop,American Restaurant,Bakery,Brewery,Café
45,Central Toronto,1,Gym / Fitness Center,Hotel,Pizza Place,Playground,Department Store
46,Central Toronto,1,Clothing Store,Coffee Shop,Yoga Studio,Sporting Goods Shop,Café
47,Central Toronto,1,Sandwich Place,Dessert Shop,Italian Restaurant,Gym,Coffee Shop
49,Central Toronto,1,Coffee Shop,Sushi Restaurant,American Restaurant,Liquor Store,Restaurant
51,Downtown Toronto,1,Coffee Shop,Pizza Place,Italian Restaurant,Pub,Bakery
52,Downtown Toronto,1,Coffee Shop,Japanese Restaurant,Sushi Restaurant,Gay Bar,Restaurant
53,Downtown Toronto,1,Coffee Shop,Bakery,Park,Pub,Breakfast Spot


In [170]:
t_merged.loc[t_merged['Cluster Labels'] == 2, t_merged.columns[[1] + list(range(5, t_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
63,Central Toronto,2,Home Service,Garden,Yoga Studio,Dessert Shop,Event Space


In [171]:
t_merged.loc[t_merged['Cluster Labels'] == 3, t_merged.columns[[1] + list(range(5, t_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
44,Central Toronto,3,Park,Bus Line,Business Service,Swim School,Yoga Studio
50,Downtown Toronto,3,Park,Playground,Trail,Yoga Studio,Department Store


In [172]:
t_merged.loc[t_merged['Cluster Labels'] == 4, t_merged.columns[[1] + list(range(5, t_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
48,Central Toronto,4,Playground,Lawyer,Yoga Studio,Diner,Event Space
