# Segmenting and Clustering Neighborhoods in Toronto - Final version
#### Made by: Holzel, Gabriela

## 1) Start by creating a new Notebook for this assignment.

## 2) Use the Notebook to build the code to scrape a Wikipedia page.

First, we import and install the libraries we are going to need.

In [1]:
import pandas as pd
import numpy as np
from bs4 import BeautifulSoup
import requests

In [2]:
!pip install BeautifulSoup4
!pip install requests



We get the table from the link.

In [3]:
source = requests.get('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M')
soup = BeautifulSoup(source.text, 'lxml')

table = soup.find('table')
table_rows = table.tbody.find_all('tr')

Now, we build a dataframe from that table.

In [4]:
Postcode = []
Borough = []
Neighbourhood = []

for tr in table_rows:
    for item in tr.find_all("td"):
        text = item.text.strip()  # full cell text: postal code, borough, "(neighbourhoods)"
        if "Not assigned" in text:
            continue  # skip cells without an assigned borough
        Postcode.append(text[:3])  # the first three characters are the postal code
        open_paren = text.find('(')
        if open_paren != -1:
            # the borough precedes the parenthesis; the neighbourhoods are listed inside it
            Borough.append(text[3:open_paren])
            Neighbourhood.append(text[open_paren + 1:text.find(')')].replace(" / ", ",").replace("/", ","))
        else:
            # no parenthesis: the remaining text is used for both fields
            Borough.append(text[3:])
            Neighbourhood.append(text[3:])
canada_dict = {'Postcode':Postcode, 'Borough':Borough, 'Neighbourhood':Neighbourhood}
CA = pd.DataFrame.from_dict(canada_dict)
CA.to_csv('toronto_part1.csv')
CA.head(10)

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park,Harbourfront"
3,M6A,North York,"Lawrence Manor,Lawrence Heights"
4,M7A,Queen's Park,Ontario Provincial Government
5,M9A,Etobicoke,Islington Avenue
6,M1B,Scarborough,"Malvern,Rouge"
7,M3B,North York,Don Mills
8,M4B,East York,"Parkview Hill,Woodbine Gardens"
9,M5B,Downtown Toronto,"Garden District, Ryerson"
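
To make the slicing logic in the loop above easier to follow, here is a minimal sketch that applies the same rules to a single cell. The function name `split_cell` and the sample strings are assumptions made for illustration; they are not copied from the Wikipedia page.

```python
def split_cell(text):
    """Split one table cell into (postcode, borough, neighbourhood).

    Mirrors the slicing used in the loop: the first three characters are the
    postal code, the borough runs up to the opening parenthesis, and the
    neighbourhoods are listed inside the parentheses separated by '/'.
    """
    text = text.strip()
    postcode = text[:3]
    open_paren = text.find('(')
    if open_paren == -1:
        # No parenthesis: the rest of the text serves as both fields.
        return postcode, text[3:], text[3:]
    borough = text[3:open_paren]
    inside = text[open_paren + 1:text.find(')')]
    neighbourhood = inside.replace(" / ", ",").replace("/", ",")
    return postcode, borough, neighbourhood


# Hypothetical cell contents, shaped like the cells the loop processes.
print(split_cell("M5ADowntown Toronto(Regent Park / Harbourfront)"))
print(split_cell("M3ANorth York(Parkwoods)"))
```

Keeping the parsing in one small function makes it possible to check the rules on a few strings before running them over the whole table.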


This concludes the first activity.

## 3) Use the Geocoder package or the CSV file to create a dataframe with coordinates.

As usual, we must install what we're going to need.

In [5]:
!pip install geocoder



Because the Geocoder package can be very unreliable, we do not depend on it to get the geographical coordinates of the neighborhoods. Instead, we load the CSV file that has the geographical coordinates of each postal code from the following link:

In [6]:
PC = pd.read_csv("https://cocl.us/Geospatial_data")
PC.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [7]:
print('The shape is: ', PC.shape)
print('The column names are: {}, {} and {}.'.format(PC.columns[0],PC.columns[1],PC.columns[2]))

The shape is:  (103, 3)
The column names are: Postal Code, Latitude and Longitude.


Now we merge both dataframes: CA and PC.

In [8]:
CA_new = pd.merge(CA, PC, how='left', left_on = 'Postcode', right_on = 'Postal Code')
CA_new.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Postal Code,Latitude,Longitude
0,M3A,North York,Parkwoods,M3A,43.753259,-79.329656
1,M4A,North York,Victoria Village,M4A,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park,Harbourfront",M5A,43.65426,-79.360636
3,M6A,North York,"Lawrence Manor,Lawrence Heights",M6A,43.718518,-79.464763
4,M7A,Queen's Park,Ontario Provincial Government,M7A,43.662301,-79.389494
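
A left merge keeps every row of the left dataframe even when no coordinates match, in which case `Latitude` and `Longitude` are NaN. The sketch below uses two small made-up frames (the values are illustrative, not the real data) and the `indicator=True` option of `pd.merge` to make unmatched rows visible.

```python
import pandas as pd

# Small made-up frames standing in for CA and PC.
left = pd.DataFrame({'Postcode': ['M3A', 'M4A', 'M0X'],
                     'Borough': ['North York', 'North York', 'Nowhere']})
right = pd.DataFrame({'Postal Code': ['M3A', 'M4A'],
                      'Latitude': [43.75, 43.73],
                      'Longitude': [-79.33, -79.32]})

merged = pd.merge(left, right, how='left',
                  left_on='Postcode', right_on='Postal Code',
                  indicator=True)

# Rows flagged 'left_only' found no match in the coordinates table.
unmatched = merged.loc[merged['_merge'] == 'left_only', 'Postcode'].tolist()
print(unmatched)
```

An empty `unmatched` list on the real data would indicate that every postal code received coordinates.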


Now, we remove the "Postal Code" column:

In [9]:
CA_new.drop("Postal Code", axis=1, inplace=True)
CA_new

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park,Harbourfront",43.654260,-79.360636
3,M6A,North York,"Lawrence Manor,Lawrence Heights",43.718518,-79.464763
4,M7A,Queen's Park,Ontario Provincial Government,43.662301,-79.389494
...,...,...,...,...,...
98,M8X,Etobicoke,"The Kingsway,Montgomery Road,Old Mill North",43.653654,-79.506944
99,M4Y,Downtown Toronto,Church and Wellesley,43.665860,-79.383160
100,M7Y,East TorontoBusiness reply mail Processing Cen...,Enclave of M4L,43.662744,-79.321558
101,M8Y,Etobicoke,"Old Mill South,King's Mill Park,Sunnylea,Humbe...",43.636258,-79.498509


We rename two of the columns.

In [10]:
CA_new = CA_new.rename(columns={'Postcode': 'Postal Code', 'Neighbourhood': 'Neighborhood'})
CA_new.head()

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park,Harbourfront",43.65426,-79.360636
3,M6A,North York,"Lawrence Manor,Lawrence Heights",43.718518,-79.464763
4,M7A,Queen's Park,Ontario Provincial Government,43.662301,-79.389494


## 4) Explore and cluster the neighborhoods in Toronto.

First, we import Nominatim to convert an address into latitude and longitude values, and folium, which is a map rendering library:

In [11]:
from geopy.geocoders import Nominatim

In [12]:
import folium

In [13]:
from pandas import json_normalize  # transform JSON into a pandas dataframe

Secondly, we use Nominatim to determine Toronto's latitude and longitude values:

In [14]:
address = 'Toronto, CA'
geolocator = Nominatim(user_agent="ca_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.6534817, -79.3839347.


Now, let's see the map of Toronto with markers:

In [16]:
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)

for lat, lng, borough, neighborhood in zip(
        CA_new['Latitude'], 
        CA_new['Longitude'], 
        CA_new['Borough'], 
        CA_new['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  

map_toronto

## Explore the first neighborhood in our dataframe using Foursquare

First, we must declare our credentials.

In [17]:
CLIENT_ID = 'MX25ENHCJYIKTEL44452CCA1I0UY2LYLRWQ3JCQ0DLRU1ZQE' # your Foursquare ID
CLIENT_SECRET = 'U4L5OTAITD0O3BWYSXOFYTWNXGDBX5XWMWEESYPC1KK2Y2ZJ' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: MX25ENHCJYIKTEL44452CCA1I0UY2LYLRWQ3JCQ0DLRU1ZQE
CLIENT_SECRET:U4L5OTAITD0O3BWYSXOFYTWNXGDBX5XWMWEESYPC1KK2Y2ZJ


Now, we explore the first neighborhood's name in CA_new.

In [19]:
ne_name = CA_new.loc[0, 'Neighborhood']
print(f"The first neighborhood's name is '{ne_name}'.")

The first neighborhood's name is 'Parkwoods'.


We get the neighborhood's latitude and longitude values.

In [20]:
ne_lat = CA_new.loc[0, 'Latitude'] # neighborhood latitude value
ne_lon = CA_new.loc[0, 'Longitude'] # neighborhood longitude value

print('Latitude and longitude values of {} are {}, {}.'.format(ne_name, ne_lat, ne_lon))

Latitude and longitude values of Parkwoods are 43.7532586, -79.3296565.


Now, let's get up to 100 venues around Parkwoods within a radius of 500 meters. We parse the JSON response into a Python dictionary.

In [21]:
LIMIT = 100 # limit of number of venues returned by Foursquare API
radius = 500 # define radius
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    ne_lat, 
    ne_lon, 
    radius, 
    LIMIT)

# send the GET request and parse the JSON response into a dictionary
results = requests.get(url).json()
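
The request URL above is assembled with string formatting. As an alternative sketch, the same query string can be built from a dictionary with the standard library's `urllib.parse.urlencode`, which also escapes special characters such as the comma in `ll`. The helper name `build_explore_url` and the placeholder credentials are assumptions for illustration; no request is sent.

```python
from urllib.parse import urlencode


def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Return an explore URL carrying the same parameters as the cell above."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    return 'https://api.foursquare.com/v2/venues/explore?' + urlencode(params)


# Placeholder credentials, not real ones.
example_url = build_explore_url('MY_ID', 'MY_SECRET', '20180605',
                                43.7532586, -79.3296565, 500, 100)
print(example_url)
```

Building the parameters as a dictionary keeps each value next to its name, which makes a misplaced argument easier to spot than in a long format string.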

We define a function that extracts the category of the venue.

In [22]:
def get_cat_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']
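
To illustrate what `get_cat_type` returns, the sketch below repeats the function (catching `KeyError` explicitly) and calls it on two hand-written rows. The rows are plain dictionaries made up for the example; in the notebook the function receives dataframe rows.

```python
def get_cat_type(row):
    """Return the name of the first category of a venue, or None."""
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    if len(categories_list) == 0:
        return None
    return categories_list[0]['name']


# Made-up rows: one keyed like the flattened dataframe, one without categories.
row_with_category = {'venue.categories': [{'name': 'Park'}, {'name': 'Trail'}]}
row_without_category = {'categories': []}

print(get_cat_type(row_with_category))
print(get_cat_type(row_without_category))
```

Only the first category is kept, so a venue listed under several categories is represented by the first one alone.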

We clean the json and structure it into a pandas dataframe.

In [23]:
venues = results['response']['groups'][0]['items']
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_cat_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues


Unnamed: 0,name,categories,lat,lng
0,Brookbanks Park,Park,43.751976,-79.33214
1,Variety Store,Food & Drink Shop,43.751974,-79.333114


Now, we're going to explore the neighborhoods of Toronto. First, let's create a function that repeats the same process for every neighborhood passed to it.

In [24]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
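
The list comprehension in `getNearbyVenues` indexes `categories[0]` directly, which would raise an `IndexError` for a venue that has no category. A hedged sketch of a more defensive extraction step is shown below; `extract_venue_rows` is a hypothetical helper, and the sample items are made up to match the structure the function reads.

```python
def extract_venue_rows(name, lat, lng, items):
    """Build one tuple per venue, using None when a venue has no category."""
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories', [])
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng,
                     venue['name'],
                     venue['location']['lat'],
                     venue['location']['lng'],
                     category))
    return rows


# Made-up items in the shape of response['groups'][0]['items'].
sample_items = [
    {'venue': {'name': 'Some Park',
               'location': {'lat': 43.75, 'lng': -79.33},
               'categories': [{'name': 'Park'}]}},
    {'venue': {'name': 'Unlabelled Place',
               'location': {'lat': 43.76, 'lng': -79.34},
               'categories': []}},
]
rows = extract_venue_rows('Parkwoods', 43.753259, -79.329656, sample_items)
for row in rows:
    print(row)
```

Rows with a `None` category can then be dropped or kept explicitly before the one-hot encoding step.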

Now we run the above function on each neighborhood and create a new dataframe called toronto_denc_venues.

In [25]:
toronto_denc_venues = getNearbyVenues(names=CA_new['Neighborhood'],
                                   latitudes=CA_new['Latitude'],
                                   longitudes=CA_new['Longitude']
                                  )
toronto_denc_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Parkwoods,43.753259,-79.329656,Brookbanks Park,43.751976,-79.33214,Park
1,Parkwoods,43.753259,-79.329656,Variety Store,43.751974,-79.333114,Food & Drink Shop
2,Victoria Village,43.725882,-79.315572,Victoria Village Arena,43.723481,-79.315635,Hockey Arena
3,Victoria Village,43.725882,-79.315572,Portugril,43.725819,-79.312785,Portuguese Restaurant
4,Victoria Village,43.725882,-79.315572,Tim Hortons,43.725517,-79.313103,Coffee Shop


Let's check how many venues were returned for each neighborhood.

In [26]:
toronto_denc_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Agincourt,5,5,5,5,5,5
"Alderwood,Long Branch",8,8,8,8,8,8
"Bathurst Manor,Wilson Heights,Downsview North",22,22,22,22,22,22
Bayview Village,4,4,4,4,4,4
"Bedford Park,Lawrence Manor East",23,23,23,23,23,23
...,...,...,...,...,...,...
Willowdale,39,39,39,39,39,39
"Willowdale,Newtonbrook",1,1,1,1,1,1
Woburn,4,4,4,4,4,4
Woodbine Heights,7,7,7,7,7,7


In [27]:
print('There are {} unique categories.'.format(len(toronto_denc_venues['Venue Category'].unique())))

There are 274 unique categories.


Next, we must analyze each neighborhood.

In [28]:
# one hot encoding
toronto_denc_onehot = pd.get_dummies(toronto_denc_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
toronto_denc_onehot['Neighborhood'] = toronto_denc_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [toronto_denc_onehot.columns[-1]] + list(toronto_denc_onehot.columns[:-1])
toronto_denc_onehot = toronto_denc_onehot[fixed_columns]

toronto_denc_onehot.head()

Unnamed: 0,Yoga Studio,Accessories Store,Adult Boutique,Airport,Airport Food Court,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,...,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
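
As a small illustration of the one-hot step, the sketch below applies `pd.get_dummies` to a made-up venue table with two neighborhoods. The data is invented for the example.

```python
import pandas as pd

# Made-up venues: three in neighborhood 'A', one in 'B'.
venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'B'],
    'Venue Category': ['Park', 'Bakery', 'Park', 'Bakery'],
})

# One indicator column per category, named after the category itself.
onehot = pd.get_dummies(venues[['Venue Category']], prefix="", prefix_sep="")
onehot.insert(0, 'Neighborhood', venues['Neighborhood'])
print(onehot)
```

Each row marks exactly one category, so the indicator columns of a row always sum to 1.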


Now, we group the rows by neighborhood, taking the mean of the frequency of occurrence of each category.

In [29]:
toronto_denc_grouped = toronto_denc_onehot.groupby('Neighborhood').mean().reset_index()
toronto_denc_grouped.head()

Unnamed: 0,Neighborhood,Yoga Studio,Accessories Store,Adult Boutique,Airport,Airport Food Court,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,...,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,Agincourt,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,"Alderwood,Long Branch",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Bathurst Manor,Wilson Heights,Downsview North",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Bayview Village,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"Bedford Park,Lawrence Manor East",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
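
Because each one-hot row contains a single 1, the mean of a category column within a neighborhood equals the share of that neighborhood's venues that belong to the category. For example, the value 0.043478 for "Bedford Park,Lawrence Manor East" is 1/23, matching the 23 venues counted for that neighborhood earlier. The pure-Python sketch below computes the same kind of frequencies with `collections.Counter` on made-up data; `category_frequencies` is a hypothetical helper.

```python
from collections import Counter


def category_frequencies(categories):
    """Return {category: share of venues} for one neighborhood."""
    counts = Counter(categories)
    total = len(categories)
    return {category: count / total for category, count in counts.items()}


# Made-up venue categories for a single neighborhood.
freqs = category_frequencies(['Park', 'Park', 'Coffee Shop', 'Pub'])
print(freqs)

# One venue out of 23, rounded to six decimals as in the table.
print(round(1 / 23, 6))
```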


Let's check the 10 most common venues in each neighborhood.

In [30]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_denc_grouped['Neighborhood']

for ind in np.arange(toronto_denc_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_denc_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Agincourt,Latin American Restaurant,Skating Rink,Lounge,Clothing Store,Breakfast Spot,Drugstore,Discount Store,Distribution Center,Dog Run,Doner Restaurant
1,"Alderwood,Long Branch",Pizza Place,Dance Studio,Gym,Coffee Shop,Sandwich Place,Pub,Pool,Discount Store,Department Store,Dessert Shop
2,"Bathurst Manor,Wilson Heights,Downsview North",Coffee Shop,Bank,Deli / Bodega,Pharmacy,Intersection,Middle Eastern Restaurant,Sushi Restaurant,Mobile Phone Shop,Fried Chicken Joint,Diner
3,Bayview Village,Café,Chinese Restaurant,Bank,Japanese Restaurant,Escape Room,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Drugstore,Dim Sum Restaurant
4,"Bedford Park,Lawrence Manor East",Coffee Shop,Sandwich Place,Italian Restaurant,Sushi Restaurant,Thai Restaurant,Comfort Food Restaurant,Pharmacy,Pizza Place,Pub,Café
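
The column labels above rely on a three-item suffix list with a fallback to 'th', which is correct for the ten columns used here but would mislabel larger ranks such as 21 or 22. A hedged sketch of a general suffix helper follows; `ordinal` is a hypothetical name, not part of the notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)


columns = ['Neighborhood'] + ['{} Most Common Venue'.format(ordinal(i))
                              for i in range(1, 11)]
print(columns)
```

For `num_top_venues = 10` this produces the same labels as the cell above.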


Next, we run k-means to group the neighborhoods into 5 clusters. To do so, we must import KMeans.

In [31]:
from sklearn.cluster import KMeans

In [32]:
# set number of clusters
kclusters = 5
toronto_denc_grouped_clustering = toronto_denc_grouped.drop('Neighborhood', axis=1)
# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_denc_grouped_clustering)
# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([4, 4, 4, 4, 4, 4, 4, 4, 4, 2], dtype=int32)
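
For readers unfamiliar with what `KMeans` does internally, the following is a minimal pure-Python sketch of Lloyd's algorithm, the standard k-means iteration: assign each point to its nearest centroid, then move each centroid to the mean of its points. It is an illustration only; scikit-learn's implementation differs in initialization and many other details, and the function name, the fixed starting centroids, and the data are assumptions.

```python
def kmeans_sketch(points, centroids, iterations=10):
    """Run a fixed number of Lloyd iterations and return (labels, centroids)."""
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid (squared distance).
        labels = [
            min(range(len(centroids)),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])))
            for point in points
        ]
        # Update step: each centroid becomes the mean of its assigned points.
        new_centroids = []
        for c, old in enumerate(centroids):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        centroids = new_centroids
    return labels, centroids


# Made-up category-frequency vectors for four neighborhoods (two categories).
points = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]
labels, centroids = kmeans_sketch(points, centroids=[(1.0, 0.0), (0.0, 1.0)])
print(labels)
```

Neighborhoods with similar category frequencies end up with the same label, which is the grouping the notebook relies on.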

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [33]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster', kmeans.labels_)

toronto_denc_merged = CA_new

# join CA_new (postal code, borough, latitude/longitude) with the cluster labels and top venues of each neighborhood
toronto_denc_merged = toronto_denc_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood', how = 'right')

toronto_denc_merged.head() # check the last columns

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude,Cluster,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M3A,North York,Parkwoods,43.753259,-79.329656,2,Park,Food & Drink Shop,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Drugstore,Falafel Restaurant,Donut Shop
1,M4A,North York,Victoria Village,43.725882,-79.315572,4,Hockey Arena,French Restaurant,Intersection,Coffee Shop,Portuguese Restaurant,Women's Store,Diner,Discount Store,Distribution Center,Dog Run
2,M5A,Downtown Toronto,"Regent Park,Harbourfront",43.65426,-79.360636,4,Coffee Shop,Park,Pub,Bakery,Theater,Café,Breakfast Spot,Yoga Studio,French Restaurant,Spa
3,M6A,North York,"Lawrence Manor,Lawrence Heights",43.718518,-79.464763,4,Furniture / Home Store,Clothing Store,Carpet Store,Accessories Store,Vietnamese Restaurant,Boutique,Coffee Shop,Gift Shop,Arts & Crafts Store,Doner Restaurant
4,M7A,Queen's Park,Ontario Provincial Government,43.662301,-79.389494,4,Coffee Shop,Diner,Sushi Restaurant,Yoga Studio,College Auditorium,Bar,Beer Bar,Smoothie Shop,Sandwich Place,Burrito Place


Let's see the map!

In [34]:
import matplotlib.cm as cm
import matplotlib.colors as colors

In [35]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(
        toronto_denc_merged['Latitude'], 
        toronto_denc_merged['Longitude'], 
        toronto_denc_merged['Neighborhood'], 
        toronto_denc_merged['Cluster']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
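
The map above colors each marker with `rainbow[cluster-1]`, so cluster 0 wraps around to the last color in the list. A minimal sketch of a direct label-to-color mapping, using only the standard library's `colorsys` instead of matplotlib, is shown below; the helper name `cluster_palette` is an assumption for illustration.

```python
import colorsys


def cluster_palette(n_clusters):
    """Return one hex color per cluster label, evenly spaced in hue."""
    palette = []
    for label in range(n_clusters):
        r, g, b = colorsys.hsv_to_rgb(label / n_clusters, 0.8, 0.9)
        palette.append('#{:02x}{:02x}{:02x}'.format(
            int(round(r * 255)), int(round(g * 255)), int(round(b * 255))))
    return palette


palette = cluster_palette(5)
# palette[label] is the color for that cluster label, with no offset.
print(palette)
```

With such a list, `palette[cluster]` could be passed as the `color` and `fill_color` of each marker.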

Lastly, we examine each cluster:

### Cluster 1

In [36]:
toronto_denc_merged.loc[toronto_denc_merged['Cluster'] == 0, toronto_denc_merged.columns[[1] + list(range(5, toronto_denc_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,Scarborough,0,Fast Food Restaurant,Donut Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Drugstore,Falafel Restaurant
11,Etobicoke,0,Bakery,Women's Store,Donut Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Drugstore,Farmers Market
71,Scarborough,0,Auto Garage,Middle Eastern Restaurant,Bakery,Sandwich Place,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Women's Store


### Cluster 2

In [37]:
toronto_denc_merged.loc[toronto_denc_merged['Cluster'] == 1, toronto_denc_merged.columns[[1] + list(range(5, toronto_denc_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
12,Scarborough,1,Bar,Women's Store,Donut Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Drugstore,Farmers Market


### Cluster 3

In [38]:
toronto_denc_merged.loc[toronto_denc_merged['Cluster'] == 2, toronto_denc_merged.columns[[1] + list(range(5, toronto_denc_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,North York,2,Park,Food & Drink Shop,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Drugstore,Falafel Restaurant,Donut Shop
21,York,2,Park,Pool,Women's Store,Greek Restaurant,Dance Studio,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Drugstore,Donut Shop
35,East YorkEast Toronto,2,Park,Intersection,Convenience Store,Escape Room,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Drugstore,Dessert Shop,Donut Shop
52,North York,2,Park,Donut Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Drugstore,College Rec Center
64,York,2,Park,Convenience Store,Event Space,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Dessert Shop,Drugstore
66,North York,2,Park,Convenience Store,Event Space,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Dessert Shop,Drugstore
91,Downtown Toronto,2,Park,Playground,Trail,Escape Room,Electronics Store,Eastern European Restaurant,Ethiopian Restaurant,Dumpling Restaurant,Drugstore,Deli / Bodega


### Cluster 4

In [39]:
toronto_denc_merged.loc[toronto_denc_merged['Cluster'] == 3, toronto_denc_merged.columns[[1] + list(range(5, toronto_denc_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
57,North York,3,Baseball Field,Women's Store,Donut Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Drugstore,Farmers Market
101,Etobicoke,3,Baseball Field,Women's Store,Donut Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Drugstore,Farmers Market


### Cluster 5

In [40]:
toronto_denc_merged.loc[toronto_denc_merged['Cluster'] == 4, toronto_denc_merged.columns[[1] + list(range(5, toronto_denc_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,North York,4,Hockey Arena,French Restaurant,Intersection,Coffee Shop,Portuguese Restaurant,Women's Store,Diner,Discount Store,Distribution Center,Dog Run
2,Downtown Toronto,4,Coffee Shop,Park,Pub,Bakery,Theater,Café,Breakfast Spot,Yoga Studio,French Restaurant,Spa
3,North York,4,Furniture / Home Store,Clothing Store,Carpet Store,Accessories Store,Vietnamese Restaurant,Boutique,Coffee Shop,Gift Shop,Arts & Crafts Store,Doner Restaurant
4,Queen's Park,4,Coffee Shop,Diner,Sushi Restaurant,Yoga Studio,College Auditorium,Bar,Beer Bar,Smoothie Shop,Sandwich Place,Burrito Place
7,North York,4,Gym,Coffee Shop,Clothing Store,Café,Restaurant,Beer Store,Supermarket,Chinese Restaurant,Caribbean Restaurant,Dim Sum Restaurant
...,...,...,...,...,...,...,...,...,...,...,...,...
97,Downtown Toronto,4,Coffee Shop,Café,Hotel,Restaurant,Japanese Restaurant,Gym,American Restaurant,Deli / Bodega,Seafood Restaurant,Asian Restaurant
98,Etobicoke,4,River,Pool,Escape Room,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Drugstore,Donut Shop,Deli / Bodega,Doner Restaurant
99,Downtown Toronto,4,Coffee Shop,Sushi Restaurant,Japanese Restaurant,Gay Bar,Restaurant,Pub,Mediterranean Restaurant,Café,Men's Store,Yoga Studio
100,East TorontoBusiness reply mail Processing Cen...,4,Light Rail Station,Spa,Butcher,Skate Park,Burrito Place,Fast Food Restaurant,Garden,Garden Center,Farmers Market,Brewery
