## Project Introduction

 In this notebook, I will explore and cluster the neighborhoods in toronto. To obtain all required data, the wikipage  https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M will be scrabed and data are going to be transformed into dataframe.
 
 
Requirements of the dataframe:
1. The dataframe will consist of three columns: PostalCode, Borough, and Neighborhood
2. Only process the cells that have an assigned borough. Ignore cells with a borough that is Not assigned.
3. More than one neighborhood can exist in one postal code area. For example, in the table on the Wikipedia page, you will notice that M5A is listed twice and has two neighborhoods: Harbourfront and Regent Park. These two rows will be combined into one row with the neighborhoods separated with a comma as shown in row 11  in the above table.
4. If a cell has a borough but a Not assigned  neighborhood, then the neighborhood will be the same as the borough.
5. Clean your Notebook and add Markdown cells to explain your work and any assumptions you are making.
6. In the last cell of your notebook, use the .shape method to print the number of rows of your dataframe.

# -----------------------------------------------------Part One-------------------------------------------------------------------------

## import libraries

In [11]:
import pandas as pd
from pandas.io.json import json_normalize
import numpy as np
import json # deal with json files

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

#import k-means model
from sklearn.cluster import KMeans
from sklearn.datasets.samples_generator import make_blobs

In [9]:
!pip install folium

Collecting folium
  Downloading folium-0.12.0-py2.py3-none-any.whl (94 kB)
[K     |████████████████████████████████| 94 kB 5.6 MB/s  eta 0:00:01
Collecting branca>=0.3.0
  Downloading branca-0.4.2-py3-none-any.whl (24 kB)
Installing collected packages: branca, folium
Successfully installed branca-0.4.2 folium-0.12.0


In [10]:
#geo location lib
from geopy.geocoders import Nominatim

#map rendering lib
from bs4 import BeautifulSoup
import lxml
import folium
import requests

## Load the data find valid assignments

In [18]:
# download the data from wikipage and parse it 
source_data = requests.get("https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M").text
source_data = BeautifulSoup(source_data, 'lxml')

table = source_data.find("table")
#fintd all rows 
table_rows=table.find_all('tr')

#allocate the valid assigned neighborhoods
instances = []
for row in table_rows[1:]:#ignore first row since its blank
    td = row.find_all('td')
    read_line = [row.text for row in td]
    
    #ignore not assigned borough
    if read_line[1] != 'Not assigned\n':
        if "Not assigned\n" in read_line[2]:
            read_line[2] = read_lind[1]
        instances.append(read_line)

instances[0:5]

[['M3A\n', 'North York\n', 'Parkwoods\n'],
 ['M4A\n', 'North York\n', 'Victoria Village\n'],
 ['M5A\n', 'Downtown Toronto\n', 'Regent Park, Harbourfront\n'],
 ['M6A\n', 'North York\n', 'Lawrence Manor, Lawrence Heights\n'],
 ['M7A\n',
  'Downtown Toronto\n',
  "Queen's Park, Ontario Provincial Government\n"]]

## Generate a dataframe 

In [22]:
headers = ['PostalCode', 'Borough', 'Neighborhood']
source_df = pd.DataFrame(instances, columns=headers)
#remove symbols
for i in headers:
    source_df[i] = source_df[i].str.replace("\n", "")
source_df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park, Harbourfront"
3,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"


In [26]:
print('The shape of the toronto neighborhoods dataset:', source_df.shape)

The shape of the toronto neighborhoods dataset: (103, 3)


# -----------------------------------------------------Part Two-------------------------------------------------------------------------

In [31]:
# add coordinates for each location with geopy
#import geocoder
# initialize your variable to None
#import locations with csv file 
geo_df = pd.read_csv("http://cocl.us/Geospatial_data")
#merge coordinates information with neighborhoods information into one dataframe by postalcode
toronto_df = pd.merge(source_df, geo_df, left_on = 'PostalCode', right_on = 'Postal Code')
toronto_df.drop("Postal Code", axis=1, inplace=True)
toronto_df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494


# -----------------------------------------------------Part Three-------------------------------------------------------------------------


In this part I'm going to explore and cluster the toronto neighborhoods.

### 3.1 Access the geolocation information of Toronto

In [42]:
address = 'Toronto, ON, Canada'

geolocator = Nominatim(user_agent="toronto_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of Toronto are {}, {}.'.format(latitude, longitude))

The geograpical coordinate of Toronto are 43.6534817, -79.3839347.


### 3.2 Create a map of Toronto with neighborhoods superimposed on top

In [49]:
toronto_map = folium.Map(location=[latitude, longitude], zoom_start=10)

#add markers to the map 
for lat, lng, borough, neighborhood in zip(
    toronto_df['Latitude'],
    toronto_df['Longitude'],
    toronto_df['Borough'],
    toronto_df['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker([lat,lng], radius=4, color='blue', fill='True',fill_color='#3186cc',
        fill_opacity=0.7,parse_html=False).add_to(toronto_map)

toronto_map

### 3.3 Define Foursquare Credentials and Version 

In [50]:
CLIENT_ID = 'UJ50JPVA2FQOF5GD0YRACXEV4UMDA0N1EANKDITSRAQKJTIH'
CLIENT_SECRET = 'B0OD3WNABMHYB5JZRTOIGQA02GGT3QQR2PJPHFYS2EUQMS4H'
VERSION = '20180605'

In [51]:
print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: UJ50JPVA2FQOF5GD0YRACXEV4UMDA0N1EANKDITSRAQKJTIH
CLIENT_SECRET:B0OD3WNABMHYB5JZRTOIGQA02GGT3QQR2PJPHFYS2EUQMS4H


#### Now I 'm going to explore the neighborhood of a random choosed location in the toronto dataset

In [63]:
neighborhood_latitude = toronto_df.loc[1, 'Latitude'] # neighborhood latitude value of the 2nd row
neighborhood_longitude = toronto_df.loc[1, 'Longitude'] # neighborhood longitude value of the 2nd row
neighborhood_name = toronto_df.loc[1, 'Neighborhood'] # neighborhood name of the 2nd row

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Victoria Village are 43.725882299999995, -79.31557159999998.


#### Now I want to get the top 100 venues that are in Victoria Village within a radius of 500 meters.

In [64]:
#first create a get request 
LIMIT = 100 # limit of number of venues returned by Foursquare API
radius = 500 # define radius
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)

# get the result to a json file
results = requests.get(url).json()

In [65]:
#then extracts the category of the venue with function
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

#### Get ready to clean the json and structure it into a dataframe.

In [66]:
venues = results['response']['groups'][0]['items']
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues =nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues

  from ipykernel import kernelapp as app


Unnamed: 0,name,categories,lat,lng
0,Victoria Village Arena,Hockey Arena,43.723481,-79.315635
1,Portugril,Portuguese Restaurant,43.725819,-79.312785
2,Tim Hortons,Coffee Shop,43.725517,-79.313103
3,The Frig,French Restaurant,43.727051,-79.317418
4,Eglinton Ave E & Sloane Ave/Bermondsey Rd,Intersection,43.726086,-79.31362


### 3.4 Explore neighborhoods in a part of Toronto City


Now I'm working on the data frame toronto_denc_df, which states for:
"DENC" = [D]owntown Toronto, [E]ast Toronto, [N]orth Toronto, [C]entral Toronto

In [67]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    """
    get nearby venues for the given location names and their geographical coordinates within a radius of 500 meters.
    """
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [69]:
# generate denc dataframe first
toronto_denc_df = toronto_df[toronto_df['Borough'].str.contains("Toronto")].reset_index(drop=True)
#toronto_denc_df.head()


#Now write the code to run the above function on each neighborhood and create a new dataframe called toronto_denc_venues
toronto_denc_venues = getNearbyVenues(names=toronto_denc_df['Neighborhood'],
                                   latitudes=toronto_denc_df['Latitude'],
                                   longitudes=toronto_denc_df['Longitude']
                                  )

Regent Park, Harbourfront
Queen's Park, Ontario Provincial Government
Garden District, Ryerson
St. James Town
The Beaches
Berczy Park
Central Bay Street
Christie
Richmond, Adelaide, King
Dufferin, Dovercourt Village
Harbourfront East, Union Station, Toronto Islands
Little Portugal, Trinity
The Danforth West, Riverdale
Toronto Dominion Centre, Design Exchange
Brockton, Parkdale Village, Exhibition Place
India Bazaar, The Beaches West
Commerce Court, Victoria Hotel
Studio District
Lawrence Park
Roselawn
Davisville North
Forest Hill North & West, Forest Hill Road Park
High Park, The Junction South
North Toronto West,  Lawrence Park
The Annex, North Midtown, Yorkville
Parkdale, Roncesvalles
Davisville
University of Toronto, Harbord
Runnymede, Swansea
Moore Park, Summerhill East
Kensington Market, Chinatown, Grange Park
Summerhill West, Rathnelly, South Hill, Forest Hill SE, Deer Park
CN Tower, King and Spadina, Railway Lands, Harbourfront West, Bathurst Quay, South Niagara, Island airport


In [70]:
toronto_denc_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Berczy Park,58,58,58,58,58,58
"Brockton, Parkdale Village, Exhibition Place",22,22,22,22,22,22
"Business reply mail Processing Centre, South Central Letter Processing Plant Toronto",16,16,16,16,16,16
"CN Tower, King and Spadina, Railway Lands, Harbourfront West, Bathurst Quay, South Niagara, Island airport",15,15,15,15,15,15
Central Bay Street,60,60,60,60,60,60
Christie,15,15,15,15,15,15
Church and Wellesley,79,79,79,79,79,79
"Commerce Court, Victoria Hotel",100,100,100,100,100,100
Davisville,33,33,33,33,33,33
Davisville North,10,10,10,10,10,10


In [72]:
#find out how many unique categories can be curated from all the returned venues
print('There are {} uniques categories.'.format(len(toronto_denc_venues['Venue Category'].unique())))

There are 235 uniques categories.


### 3.5 Analyze Each Neighborhood

In [73]:
# one hot encoding
toronto_ohe = pd.get_dummies(toronto_denc_venues[['Venue Category']], prefix="", prefix_sep="")
# add neighborhood column back to dataframe
toronto_ohe['Neighborhood'] = toronto_denc_venues['Neighborhood'] 
# move neighborhood column to the first column
fixed_columns = [toronto_ohe.columns[-1]] + list(toronto_ohe.columns[:-1])
toronto_ohe = toronto_ohe[fixed_columns]
toronto_ohe.head()

Unnamed: 0,Yoga Studio,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Aquarium,...,Theater,Theme Restaurant,Tibetan Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [82]:
#group rows by neighborhood and by taking the mean of the frequency of occurrence of each category
toronto_denc_grouped = toronto_ohe.groupby('Neighborhood').mean().reset_index()
toronto_denc_grouped.head()

Unnamed: 0,Neighborhood,Yoga Studio,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,...,Theater,Theme Restaurant,Tibetan Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar
0,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0
1,"Brockton, Parkdale Village, Exhibition Place",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Business reply mail Processing Centre, South C...",0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,"CN Tower, King and Spadina, Railway Lands, Har...",0.0,0.066667,0.066667,0.066667,0.133333,0.066667,0.133333,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Central Bay Street,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.016667


In [83]:
def return_most_common_venues(row, num_top_venues):
    """
    sort top n venues in a descending order.
    """
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

# display top 10 venues for each neighborhood in toronto denc
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_denc_grouped['Neighborhood']

for ind in np.arange(toronto_denc_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_denc_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Berczy Park,Coffee Shop,Cocktail Bar,Bakery,Beer Bar,Farmers Market,Restaurant,Cheese Shop,Seafood Restaurant,Hotel,Basketball Stadium
1,"Brockton, Parkdale Village, Exhibition Place",Café,Coffee Shop,Breakfast Spot,Gym,Stadium,Burrito Place,Restaurant,Climbing Gym,Pet Store,Bakery
2,"Business reply mail Processing Centre, South C...",Light Rail Station,Yoga Studio,Garden Center,Farmers Market,Fast Food Restaurant,Burrito Place,Restaurant,Brewery,Auto Workshop,Garden
3,"CN Tower, King and Spadina, Railway Lands, Har...",Airport Lounge,Airport Terminal,Boutique,Harbor / Marina,Boat or Ferry,Rental Car Location,Plane,Coffee Shop,Sculpture Garden,Airport Service
4,Central Bay Street,Coffee Shop,Italian Restaurant,Café,Sandwich Place,Salad Place,Thai Restaurant,Burger Joint,Bubble Tea Shop,Ramen Restaurant,Portuguese Restaurant


### 3.6 Cluster Neighborhoods

In [84]:
# set number of clusters
kclusters = 5
toronto_grouped_clustering = toronto_denc_grouped.drop('Neighborhood', 1)
# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)
# check cluster labels generated for each row in the dataframe
print(kmeans.labels_[0:10])
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
toronto_denc_merged = toronto_denc_df
toronto_denc_merged = toronto_denc_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')
toronto_denc_merged.head()

[2 0 2 2 2 0 2 2 0 2]


Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,2,Coffee Shop,Park,Bakery,Café,Pub,Breakfast Spot,Theater,Cosmetics Shop,Shoe Store,Brewery
1,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,2,Coffee Shop,College Cafeteria,Sushi Restaurant,Gym,Fast Food Restaurant,Restaurant,Portuguese Restaurant,Park,Nightclub,Mexican Restaurant
2,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937,2,Coffee Shop,Clothing Store,Hotel,Japanese Restaurant,Cosmetics Shop,Middle Eastern Restaurant,Café,Bubble Tea Shop,Theater,Bookstore
3,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418,2,Coffee Shop,Café,American Restaurant,Gastropub,Cocktail Bar,Gym,Clothing Store,Moroccan Restaurant,Lingerie Store,Seafood Restaurant
4,M4E,East Toronto,The Beaches,43.676357,-79.293031,2,Pub,Trail,Health Food Store,Wine Bar,Dog Run,Dessert Shop,Diner,Discount Store,Distribution Center,Donut Shop


In [85]:
# Visualization
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(
        toronto_denc_merged['Latitude'], 
        toronto_denc_merged['Longitude'], 
        toronto_denc_merged['Neighborhood'], 
        toronto_denc_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### 3.8 Examine Clusters

In [86]:
#cluster 1
toronto_denc_merged.loc[toronto_denc_merged['Cluster Labels'] == 0, toronto_denc_merged.columns[[1] + list(range(5, toronto_denc_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
7,Downtown Toronto,0,Grocery Store,Café,Park,Candy Store,Restaurant,Baby Store,Italian Restaurant,Coffee Shop,Nightclub,Diner
9,West Toronto,0,Pharmacy,Bakery,Bus Stop,Bar,Grocery Store,Music Venue,Bank,Park,Supermarket,Coffee Shop
11,West Toronto,0,Bar,Coffee Shop,Vegetarian / Vegan Restaurant,Asian Restaurant,Restaurant,Vietnamese Restaurant,Café,Men's Store,Wine Bar,Korean Restaurant
14,West Toronto,0,Café,Coffee Shop,Breakfast Spot,Gym,Stadium,Burrito Place,Restaurant,Climbing Gym,Pet Store,Bakery
17,East Toronto,0,Coffee Shop,Brewery,Gastropub,Café,Bakery,American Restaurant,Yoga Studio,Convenience Store,Cheese Shop,Clothing Store
22,West Toronto,0,Café,Mexican Restaurant,Thai Restaurant,Speakeasy,Diner,Bar,Flea Market,Bakery,Fried Chicken Joint,Italian Restaurant
24,Central Toronto,0,Sandwich Place,Café,Coffee Shop,Donut Shop,Middle Eastern Restaurant,Pub,Liquor Store,BBQ Joint,Indian Restaurant,History Museum
26,Central Toronto,0,Pizza Place,Dessert Shop,Sandwich Place,Sushi Restaurant,Café,Coffee Shop,Gym,Italian Restaurant,Farmers Market,Diner
27,Downtown Toronto,0,Café,Bookstore,Bar,Japanese Restaurant,Bakery,Sandwich Place,Beer Bar,Beer Store,Italian Restaurant,Comfort Food Restaurant
30,Downtown Toronto,0,Café,Coffee Shop,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Mexican Restaurant,Bar,Gaming Cafe,Grocery Store,Bakery,Caribbean Restaurant


In [88]:
#cluster 2
toronto_denc_merged.loc[toronto_denc_merged['Cluster Labels'] == 1, toronto_denc_merged.columns[[1] + list(range(5, toronto_denc_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
21,Central Toronto,1,Park,Trail,Jewelry Store,Sushi Restaurant,Wine Bar,Dessert Shop,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant
29,Central Toronto,1,Park,Trail,Playground,Tennis Court,Wine Bar,Dog Run,Dessert Shop,Diner,Discount Store,Distribution Center
33,Downtown Toronto,1,Park,Trail,Playground,Dance Studio,Escape Room,Electronics Store,Eastern European Restaurant,Donut Shop,Doner Restaurant,Dog Run


In [91]:
#cluster 3
toronto_denc_merged.loc[toronto_denc_merged['Cluster Labels'] == 2, toronto_denc_merged.columns[[1] + list(range(5, toronto_denc_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Downtown Toronto,2,Coffee Shop,Park,Bakery,Café,Pub,Breakfast Spot,Theater,Cosmetics Shop,Shoe Store,Brewery
1,Downtown Toronto,2,Coffee Shop,College Cafeteria,Sushi Restaurant,Gym,Fast Food Restaurant,Restaurant,Portuguese Restaurant,Park,Nightclub,Mexican Restaurant
2,Downtown Toronto,2,Coffee Shop,Clothing Store,Hotel,Japanese Restaurant,Cosmetics Shop,Middle Eastern Restaurant,Café,Bubble Tea Shop,Theater,Bookstore
3,Downtown Toronto,2,Coffee Shop,Café,American Restaurant,Gastropub,Cocktail Bar,Gym,Clothing Store,Moroccan Restaurant,Lingerie Store,Seafood Restaurant
4,East Toronto,2,Pub,Trail,Health Food Store,Wine Bar,Dog Run,Dessert Shop,Diner,Discount Store,Distribution Center,Donut Shop
5,Downtown Toronto,2,Coffee Shop,Cocktail Bar,Bakery,Beer Bar,Farmers Market,Restaurant,Cheese Shop,Seafood Restaurant,Hotel,Basketball Stadium
6,Downtown Toronto,2,Coffee Shop,Italian Restaurant,Café,Sandwich Place,Salad Place,Thai Restaurant,Burger Joint,Bubble Tea Shop,Ramen Restaurant,Portuguese Restaurant
8,Downtown Toronto,2,Coffee Shop,Café,Restaurant,Clothing Store,Deli / Bodega,Hotel,Thai Restaurant,Gym,Pizza Place,Burrito Place
10,Downtown Toronto,2,Coffee Shop,Aquarium,Café,Hotel,Scenic Lookout,Restaurant,Brewery,Italian Restaurant,Fried Chicken Joint,History Museum
12,East Toronto,2,Greek Restaurant,Coffee Shop,Italian Restaurant,Bookstore,Furniture / Home Store,Ice Cream Shop,Spa,Japanese Restaurant,Juice Bar,Brewery


In [90]:
#cluster 4
toronto_denc_merged.loc[toronto_denc_merged['Cluster Labels'] == 3, toronto_denc_merged.columns[[1] + list(range(5, toronto_denc_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
19,Central Toronto,3,Home Service,Garden,Wine Bar,Department Store,Event Space,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant,Donut Shop


In [89]:
#cluster 5
toronto_denc_merged.loc[toronto_denc_merged['Cluster Labels'] == 4, toronto_denc_merged.columns[[1] + list(range(5, toronto_denc_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
18,Central Toronto,4,Park,Bus Line,Swim School,Falafel Restaurant,Ethiopian Restaurant,Escape Room,Electronics Store,Eastern European Restaurant,Donut Shop,Doner Restaurant
