# Segmenting and Clustering Neighborhoods in the city of Toronto, Canada

## Table of Contents
- [Part 1 - Data Scraping](#part-1)
- [Part 2 - Geocoding](#part-2)
- [Part 3 - Neighborhoods Clustering](#part-3)


<div id='part-1'/>

____
## Part 1 - Data Scraping

Input data [Wikipedia: List of postal codes of Canada: M](https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M)

In [1]:
from bs4 import BeautifulSoup
import urllib3.request
import pandas as pd
import numpy as np
from sklearn.cluster import KMeans
from geopy.geocoders import Nominatim
import folium
import os
import requests
import json
from pandas.io.json import json_normalize
import matplotlib.cm as cm
import matplotlib.colors as colors


- **Input data is obtained from Wikipedia via http request.**
- **_"BeatifulSoup"_ object is created.**

In [2]:
page_url = "https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M"
# if you are behind a firewall set the proper url, including protocol, host and port.
#   (ex: http://internal-proxy:80)
proxy_url = ""

if proxy_url.strip() != "":
    # using proxy
    http = urllib3.ProxyManager(proxy_url)
else:
    # direct internet connection
    http = urllib3.PoolManager()

req = http.request('GET', page_url)
soup = BeautifulSoup(req.data, 'html.parser')




  
- **HTML post codes table is parsed**
- **Rows with 'Not assigned' borough are dropped.**
- **Pandas dataframe is constructed.**
  

In [3]:
# locate postcode table
toronto_table = soup.find('table',{'class':'wikitable sortable'})

# process table rows and build raw_df
raw_df = pd.DataFrame(columns=['PostalCode', 'Borough', 'Neighborhood'])
rows = toronto_table.findAll('tr')
for row in rows:
    row_items = row.findAll('td')
    if len(row_items) > 0:
        postcode = row_items[0].text.strip()
        borough = row_items[1].text.strip()
        if borough.lower() != "not assigned":
            neighborhood = row_items[2].text.strip()
            raw_df = raw_df.append({'PostalCode':postcode, 
                                    'Borough':borough, 
                                    'Neighborhood':neighborhood}, 
                                   ignore_index = True)

raw_df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,Harbourfront
3,M5A,Downtown Toronto,Regent Park
4,M6A,North York,Lawrence Heights


  
- **Combine neighborhoods belonging to the same borough in one row.**
- **Replace _'Not assigned'_ neighborhoods with Borougth's name.**
  

In [4]:
grouped = []
for name, group in raw_df.groupby(['PostalCode', 'Borough'])['Neighborhood']:
    nblist = ''.join(str(x) + ", " for x in group.tolist()).strip(", ")
    if nblist == "Not assigned":
        nblist = name[1]
    grouped.append((name[0], name[1], nblist))

toronto_df = pd.DataFrame(grouped, columns=['PostalCode', 'Borough', 'Neighborhood'])
print(toronto_df.shape)
toronto_df.head()

(103, 3)


Unnamed: 0,PostalCode,Borough,Neighborhood
0,M1B,Scarborough,"Rouge, Malvern"
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union"
2,M1E,Scarborough,"Guildwood, Morningside, West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae


In [5]:
toronto_df.tail()

Unnamed: 0,PostalCode,Borough,Neighborhood
98,M9N,York,Weston
99,M9P,Etobicoke,Westmount
100,M9R,Etobicoke,"Kingsview Village, Martin Grove Gardens, Richv..."
101,M9V,Etobicoke,"Albion Gardens, Beaumond Heights, Humbergate, ..."
102,M9W,Etobicoke,Northwest


In [6]:
# just for verification. This query should return no rows.
toronto_df.query("Neighborhood == 'Not assigned'")

Unnamed: 0,PostalCode,Borough,Neighborhood


In [7]:
# verify a known 'Not assigned' Neighborhood case, it should be equal to Borough. 
toronto_df.query("PostalCode == 'M7A'") 

Unnamed: 0,PostalCode,Borough,Neighborhood
85,M7A,Queen's Park,Queen's Park


  
- **Final assignament requirement: dataframe shape is shown.**
  

In [8]:
toronto_df.shape

(103, 3)

<div id='part-2'/>

___
## Part 2 : Geocoding

> Geocoder doesn't works for me, all the time I get 'None' as response.  
> Therefore I downloaded the 'Geospatial_Coordinates.csv' and got geocoding from that file.
  

In [9]:
import csv
with open('Geospatial_Coordinates.csv', 'rt') as geo_file:
    geo_reader = csv.reader(geo_file, delimiter=',')
    for row in geo_reader:
        #print(', '.join(row))
        toronto_df.loc[toronto_df['PostalCode'] == row[0], 'Latitude'] = float(row[1])        
        toronto_df.loc[toronto_df['PostalCode'] == row[0], 'Longitude'] = float(row[2])
        
toronto_df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476


In [10]:
toronto_df.tail()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
98,M9N,York,Weston,43.706876,-79.518188
99,M9P,Etobicoke,Westmount,43.696319,-79.532242
100,M9R,Etobicoke,"Kingsview Village, Martin Grove Gardens, Richv...",43.688905,-79.554724
101,M9V,Etobicoke,"Albion Gardens, Beaumond Heights, Humbergate, ...",43.739416,-79.588437
102,M9W,Etobicoke,Northwest,43.706748,-79.594054


In [11]:
toronto_df.shape

(103, 5)

<div id='part-3'/>

___
## Part 3 - Neighborhoods Clustering
  
  

- **Select Toronto boroughs with word "Toronto" in the name.**
  

In [12]:
toronto_df = toronto_df[toronto_df['Borough'].str.contains('Toronto')].reset_index(drop=True)
print(toronto_df.shape)
toronto_df.head()

(38, 5)


Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M4E,East Toronto,The Beaches,43.676357,-79.293031
1,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188
2,M4L,East Toronto,"The Beaches West, India Bazaar",43.668999,-79.315572
3,M4M,East Toronto,Studio District,43.659526,-79.340923
4,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879


In [13]:
toronto_df.tail()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
33,M6K,West Toronto,"Brockton, Exhibition Place, Parkdale Village",43.636847,-79.428191
34,M6P,West Toronto,"High Park, The Junction South",43.661608,-79.464763
35,M6R,West Toronto,"Parkdale, Roncesvalles",43.64896,-79.456325
36,M6S,West Toronto,"Runnymede, Swansea",43.651571,-79.48445
37,M7Y,East Toronto,Business reply mail Processing Centre969 Eastern,43.662744,-79.321558


- **Build Toronto map including markers for boroughs.**  


In [14]:
address = 'Toronto'
geolocator = Nominatim()
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of {} is {}, {}.'.format(address, latitude, longitude))



The geograpical coordinate of Toronto is 43.653963, -79.387207.


In [15]:
toronto_map = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, borough, neighborhood in zip(toronto_df['Latitude'], toronto_df['Longitude'], \
                                           toronto_df['Borough'], toronto_df['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(toronto_map)  
    
toronto_map

![](Week-3-Segmenting_Neighborhoods_in_Toronto-Part_3-non-clustered.png)

In [16]:
# Foursquare ID and Secret are taken from environment variables for security.
CLIENT_ID = os.environ.get("FOURSQUARE_CLIENT_ID")
CLIENT_SECRET = os.environ.get("FOURSQUARE_CLIENT_SECRET")
VERSION = '20180605' # Foursquare API version

#print('CLIENT_ID: ' + CLIENT_ID)
#print('CLIENT_SECRET:' + CLIENT_SECRET)

In [17]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [18]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        #print(name)
        
        LIMIT = 100 # limit of number of venues returned by Foursquare API
        radius = 500    
        
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        try:
            results = requests.get(url).json()["response"]['groups'][0]['items']
        except:
            print("ERROR: ", url)
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

  
- **Get venues for every neighborhood.**
  

In [19]:
toronto_venues = getNearbyVenues(names=toronto_df['Neighborhood'],
                                   latitudes=toronto_df['Latitude'],
                                   longitudes=toronto_df['Longitude']
                                  )

print(toronto_venues.shape)
toronto_venues.head()

(1704, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,The Beaches,43.676357,-79.293031,Starbucks,43.678798,-79.298045,Coffee Shop
1,The Beaches,43.676357,-79.293031,Grover Pub and Grub,43.679181,-79.297215,Pub
2,The Beaches,43.676357,-79.293031,Glen Manor Ravine,43.676821,-79.293942,Trail
3,The Beaches,43.676357,-79.293031,Upper Beaches,43.680563,-79.292869,Neighborhood
4,The Beaches,43.676357,-79.293031,Beaches Fitness,43.680319,-79.290991,Gym / Fitness Center


In [20]:
print('There are {} uniques categories.'.format(len(toronto_venues['Venue Category'].unique())))

There are 233 uniques categories.


  
- **Build venue categories dataframe.**
- **Group by neighborhood and calculate mean value for each.**
  

In [21]:
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")
toronto_onehot['Neighborhood'] = toronto_venues['Neighborhood'] 
toronto_onehot.head()

# add neighborhood column back to dataframe and move column to the first column
toronto_onehot['Neighborhood'] = toronto_venues['Neighborhood'] 
col_index = toronto_onehot.columns.tolist().index('Neighborhood')
col_order = [toronto_onehot.columns[col_index]] \
                + list(toronto_onehot.columns[0:col_index]) \
                + list(toronto_onehot.columns[col_index+1:])
toronto_onehot = toronto_onehot[col_order]
print("categories dataset shape {}".format(toronto_onehot.shape))
toronto_onehot.head()

toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()
print("categories grouped by neighborhood shape {}".format(toronto_grouped.shape))
toronto_grouped.head()

categories dataset shape (1704, 233)
categories grouped by neighborhood shape (38, 233)


Unnamed: 0,Neighborhood,Accessories Store,Adult Boutique,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,...,Theater,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Women's Store,Yoga Studio
0,"Adelaide, King, Richmond",0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.0
1,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Brockton, Exhibition Place, Parkdale Village",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455
3,Business reply mail Processing Centre969 Eastern,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556
4,"CN Tower, Bathurst Quay, Island airport, Harbo...",0.0,0.0,0.0,0.071429,0.071429,0.071429,0.142857,0.142857,0.142857,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [22]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

  
- **Build ten top venues dataset.**
  

In [23]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

print(neighborhoods_venues_sorted.shape)
neighborhoods_venues_sorted.head()

(38, 11)


Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Adelaide, King, Richmond",Coffee Shop,Café,American Restaurant,Steakhouse,Breakfast Spot,Cosmetics Shop,Bar,Gym,Thai Restaurant,Hotel
1,Berczy Park,Coffee Shop,Cocktail Bar,Restaurant,Steakhouse,Cheese Shop,Café,Pub,Farmers Market,Bakery,Seafood Restaurant
2,"Brockton, Exhibition Place, Parkdale Village",Coffee Shop,Breakfast Spot,Café,Yoga Studio,Pet Store,Burrito Place,Caribbean Restaurant,Climbing Gym,Performing Arts Venue,Stadium
3,Business reply mail Processing Centre969 Eastern,Light Rail Station,Yoga Studio,Auto Workshop,Park,Pizza Place,Recording Studio,Restaurant,Burrito Place,Brewery,Skate Park
4,"CN Tower, Bathurst Quay, Island airport, Harbo...",Airport Lounge,Airport Service,Airport Terminal,Harbor / Marina,Boat or Ferry,Airport,Airport Food Court,Airport Gate,Sculpture Garden,Boutique


  
- **Calculate clustering using k-means algorithm.**
  

In [24]:
# set number of clusters
kclusters = 5
toronto_grouped_clustering = toronto_grouped.drop('Neighborhood', 1)
# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)
# check cluster labels generated for each row in the dataframe
kmeans.labels_

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 1, 0, 0, 0, 0,
       4, 0, 3, 0, 0, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0])

  
- **Build cluster dataset and plot the map**
  

In [25]:
toronto_merged = toronto_df
# add clustering labels
toronto_merged['Cluster Labels'] = kmeans.labels_
# merge toronto_grouped with toronto_data to add latitude/longitude for each neighborhood
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

print(toronto_merged.shape)
toronto_merged.head() # check the last columns!

(38, 16)


Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M4E,East Toronto,The Beaches,43.676357,-79.293031,0,Gym / Fitness Center,Coffee Shop,Trail,Pub,Yoga Studio,Donut Shop,Diner,Discount Store,Dog Run,Doner Restaurant
1,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188,0,Greek Restaurant,Coffee Shop,Ice Cream Shop,Italian Restaurant,Bubble Tea Shop,Indian Restaurant,Bakery,Spa,Bookstore,Brewery
2,M4L,East Toronto,"The Beaches West, India Bazaar",43.668999,-79.315572,0,Park,Ice Cream Shop,Intersection,Pub,Sushi Restaurant,Liquor Store,Italian Restaurant,Fish & Chips Shop,Fast Food Restaurant,Burrito Place
3,M4M,East Toronto,Studio District,43.659526,-79.340923,0,Café,Coffee Shop,Bakery,Italian Restaurant,Gastropub,American Restaurant,Fish Market,Juice Bar,New American Restaurant,Latin American Restaurant
4,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879,0,Dim Sum Restaurant,Bus Line,Park,Swim School,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant


In [26]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], \
                                  toronto_merged['Neighborhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

![](Week-3-Segmenting_Neighborhoods_in_Toronto-Part_3-clustered.png)

### Cluster 1
  

In [27]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, \
                   toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,East Toronto,0,Gym / Fitness Center,Coffee Shop,Trail,Pub,Yoga Studio,Donut Shop,Diner,Discount Store,Dog Run,Doner Restaurant
1,East Toronto,0,Greek Restaurant,Coffee Shop,Ice Cream Shop,Italian Restaurant,Bubble Tea Shop,Indian Restaurant,Bakery,Spa,Bookstore,Brewery
2,East Toronto,0,Park,Ice Cream Shop,Intersection,Pub,Sushi Restaurant,Liquor Store,Italian Restaurant,Fish & Chips Shop,Fast Food Restaurant,Burrito Place
3,East Toronto,0,Café,Coffee Shop,Bakery,Italian Restaurant,Gastropub,American Restaurant,Fish Market,Juice Bar,New American Restaurant,Latin American Restaurant
4,Central Toronto,0,Dim Sum Restaurant,Bus Line,Park,Swim School,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant
5,Central Toronto,0,Grocery Store,Park,Breakfast Spot,Hotel,Food & Drink Shop,Burger Joint,Sandwich Place,Diner,Ethiopian Restaurant,Electronics Store
6,Central Toronto,0,Sporting Goods Shop,Coffee Shop,Clothing Store,Yoga Studio,Gym / Fitness Center,Gift Shop,Fast Food Restaurant,Diner,Mexican Restaurant,Dessert Shop
7,Central Toronto,0,Dessert Shop,Sandwich Place,Café,Seafood Restaurant,Coffee Shop,Sushi Restaurant,Pizza Place,Italian Restaurant,Pharmacy,Japanese Restaurant
8,Central Toronto,0,Gym,Playground,Trail,Tennis Court,Yoga Studio,Donut Shop,Diner,Discount Store,Dog Run,Doner Restaurant
9,Central Toronto,0,Coffee Shop,Pub,American Restaurant,Supermarket,Vietnamese Restaurant,Convenience Store,Sushi Restaurant,Light Rail Station,Pizza Place,Fried Chicken Joint


### Cluster 2
  

In [28]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, \
                   toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
17,Downtown Toronto,1,Coffee Shop,Italian Restaurant,Café,Bar,Bubble Tea Shop,Burger Joint,Ice Cream Shop,Middle Eastern Restaurant,Sandwich Place,Japanese Restaurant
27,Downtown Toronto,1,Airport Lounge,Airport Service,Airport Terminal,Harbor / Marina,Boat or Ferry,Airport,Airport Food Court,Airport Gate,Sculpture Garden,Boutique


### Cluster 3
  

In [29]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, \
                   toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
28,Downtown Toronto,2,Coffee Shop,Restaurant,Café,Cocktail Bar,Pub,Seafood Restaurant,Hotel,Beer Bar,Japanese Restaurant,Italian Restaurant


### Cluster 4
  

In [30]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, \
                   toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
24,Central Toronto,3,Café,Coffee Shop,Sandwich Place,Pizza Place,Jewish Restaurant,Burger Joint,Pub,BBQ Joint,Indian Restaurant,Liquor Store


### Cluster 5
  

In [31]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, \
                   toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
12,Downtown Toronto,4,Japanese Restaurant,Coffee Shop,Gay Bar,Sushi Restaurant,Burger Joint,Restaurant,Gastropub,Men's Store,Fast Food Restaurant,Pub
22,Central Toronto,4,Garden,Dessert Shop,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant
