# Segmenting and Clustering the Neighborhoods in Toronto

## 1. Before collecting and exploring the data, let's install and import all the dependencies we will need.

In [1]:
pip install bs4

Collecting bs4
  Downloading https://files.pythonhosted.org/packages/10/ed/7e8b97591f6f456174139ec089c769f89a94a1a4025fe967691de971f314/bs4-0.0.1.tar.gz
Collecting beautifulsoup4 (from bs4)
  Downloading https://files.pythonhosted.org/packages/66/25/ff030e2437265616a1e9b25ccc864e0371a0bc3adb7c5a404fd661c6f4f6/beautifulsoup4-4.9.1-py3-none-any.whl (115kB)
     |████████████████████████████████| 122kB 8.8MB/s eta 0:00:01
Collecting soupsieve>1.2 (from beautifulsoup4->bs4)
  Downloading https://files.pythonhosted.org/packages/6f/8f/457f4a5390eeae1cc3aeab89deb7724c965be841ffca6cfca9197482e470/soupsieve-2.0.1-py3-none-any.whl
Building wheels for collected packages: bs4
  Building wheel for bs4 (setup.py) ... done
  Stored in directory: /home/jupyterlab/.cache/pip/wheels/a0/b0/b2/4f80b9456b87abedbc0bf2d52235414c3467d8889be38dd472
Successfully built bs4
Installing collected packages: soupsieve, beautifulsoup4, bs4
Successfully installed beautifulsoup4-4.9.1 bs4-0.0.

In [2]:
pip install lxml

Collecting lxml
  Downloading https://files.pythonhosted.org/packages/55/6f/c87dffdd88a54dd26a3a9fef1d14b6384a9933c455c54ce3ca7d64a84c88/lxml-4.5.1-cp36-cp36m-manylinux1_x86_64.whl (5.5MB)
     |████████████████████████████████| 5.5MB 5.3MB/s eta 0:00:01
Installing collected packages: lxml
Successfully installed lxml-4.5.1
Note: you may need to restart the kernel to use updated packages.


**Note: We might need to restart the kernel to use updated packages.**

In [1]:
from bs4 import BeautifulSoup

import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # installs geopy; comment this line out if geopy is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes # installs folium; comment this line out if folium is already installed
import folium # map rendering library

print('Libraries imported.')

Collecting package metadata (current_repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /home/jupyterlab/conda/envs/python

  added / updated specs:
    - geopy


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    geographiclib-1.50         |             py_0          34 KB  conda-forge
    geopy-1.22.0               |     pyh9f0ad1d_0          63 KB  conda-forge
    ------------------------------------------------------------
                                           Total:          97 KB

The following NEW packages will be INSTALLED:

  geographiclib      conda-forge/noarch::geographiclib-1.50-py_0
  geopy              conda-forge/noarch::geopy-1.22.0-pyh9f0ad1d_0



Downloading and Extracting Packages
geopy-1.22.0         | 63 KB     | ##################################### | 100% 
geographiclib-1.50   | 34 KB     | ###############################

## 2. Scrape the relevant dataset (Postal Code, Borough and Neighborhood) for the neighborhoods of Toronto from the Wikipedia page and transform it into a *pandas* dataframe

#### Scraping the relevant data from webpage

In [2]:
url = "https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M"
result = requests.get(url).text
Canada_data = BeautifulSoup(result, 'lxml')

#### Transform the Toronto data into a dataframe

In [3]:
# define the dataframe columns
column_names = ['PostalCode', 'Borough', 'Neighborhood']

# locate the first table inside the article body
content = Canada_data.find('div', class_='mw-parser-output')
table = content.table.tbody

# collect one record per table row that has postcode, borough and neighborhood cells
rows = []
for tr in table.find_all('tr'):
    cells = [td.text for td in tr.find_all('td')]
    if len(cells) < 3:
        continue  # header rows use <th> cells and carry no data
    postcode, borough = cells[0], cells[1]
    neighborhood = cells[2].strip('\n').replace(']', '')
    rows.append({'PostalCode': postcode, 'Borough': borough, 'Neighborhood': neighborhood})

# build the dataframe once from the collected rows (DataFrame.append was removed in pandas 2.0)
toronto = pd.DataFrame(rows, columns=column_names)
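
The row-by-row extraction above can be tried offline on a small HTML snippet. The sketch below is only an illustration: `SAMPLE_HTML` is a made-up miniature of a postal-code table (not the real Wikipedia markup), `parse_rows` is a hypothetical helper name, and the built-in `html.parser` backend is used so that `lxml` is not required.

```python
from bs4 import BeautifulSoup

# Hypothetical miniature of a postal-code table (not the real Wikipedia markup).
SAMPLE_HTML = """
<table>
  <tr><th>Postal Code</th><th>Borough</th><th>Neighborhood</th></tr>
  <tr><td>M1A</td><td>Not assigned</td><td>Not assigned</td></tr>
  <tr><td>M4W</td><td>Downtown Toronto</td><td>Rosedale</td></tr>
</table>
"""

COLUMNS = ("PostalCode", "Borough", "Neighborhood")


def parse_rows(html):
    """Return one dict per table row that has exactly three <td> cells."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for tr in soup.find_all("tr"):
        # get_text(strip=True) removes the surrounding whitespace and newlines
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if len(cells) != len(COLUMNS):
            continue  # the header row only has <th> cells
        records.append(dict(zip(COLUMNS, cells)))
    return records


records = parse_rows(SAMPLE_HTML)
print(records)
```

Collecting plain dictionaries and building the dataframe once at the end tends to be simpler than growing a dataframe inside the loop.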

#### Clean the dataframe

In [4]:
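# clean dataframe
toronto = toronto[toronto.Borough != 'Not assigned']
toronto = toronto[toronto.Borough != 0]
toronto = toronto.reset_index(drop=True)

# where the neighborhood is not assigned, fall back to the borough name
# (chained indexing such as toronto.iloc[i][2] = ... may not write to the dataframe)
missing = toronto['Neighborhood'] == 'Not assigned'
toronto.loc[missing, 'Neighborhood'] = toronto.loc[missing, 'Borough']
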
df = toronto.groupby(['PostalCode','Borough'])['Neighborhood'].apply(', '.join).reset_index()

df['PostalCode']=df['PostalCode'].str.replace("\n","")
df['Borough']=df['Borough'].str.replace("\n","")
df['Neighborhood']=df['Neighborhood'].str.replace("/",",")
df = df.dropna()
empty = 'Not assigned'
df = df[(df.PostalCode != empty ) & (df.Borough != empty) & (df.Neighborhood != empty)]
df.rename(columns={'PostalCode':'Postal Code'}, inplace=True)
df.head()

| | Postal Code | Borough | Neighborhood |
|---|---|---|---|
| 1 | M1B | Scarborough | Malvern, Rouge |
| 2 | M1C | Scarborough | Rouge Hill, Port Union, Highland Creek |
| 3 | M1E | Scarborough | Guildwood, Morningside, West Hill |
| 4 | M1G | Scarborough | Woburn |
| 5 | M1H | Scarborough | Cedarbrae |
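
The cleaning steps above (drop unassigned boroughs, fall back to the borough name, join neighborhoods that share a postal code) can be checked on a toy frame. This is a minimal sketch with invented rows; only the column names follow the notebook.

```python
import pandas as pd

# Toy rows (invented for illustration): two neighborhoods share postal code M5A.
raw = pd.DataFrame({
    "PostalCode": ["M5A", "M5A", "M4W", "M9Z"],
    "Borough": ["Downtown Toronto", "Downtown Toronto", "Downtown Toronto", "Not assigned"],
    "Neighborhood": ["Regent Park", "Harbourfront", "Not assigned", "Not assigned"],
})

# Drop rows without a borough.
clean = raw[raw["Borough"] != "Not assigned"].reset_index(drop=True)

# Fall back to the borough when the neighborhood is missing.
missing = clean["Neighborhood"] == "Not assigned"
clean.loc[missing, "Neighborhood"] = clean.loc[missing, "Borough"]

# One row per postal code, with its neighborhoods joined by ", ".
grouped = (
    clean.groupby(["PostalCode", "Borough"])["Neighborhood"]
    .apply(", ".join)
    .reset_index()
)
print(grouped)
```

Using a boolean mask with `.loc` writes to the frame in a single step, which avoids the silent no-op that chained indexing can cause.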


#### Download the geographical coordinates data and merge it into the cleaned dataframe

In [5]:
toronto_geocsv = 'https://cocl.us/Geospatial_data'
# download the CSV; {toronto_geocsv} passes the Python variable to the shell command
!wget -q -O 'toronto_m.geospatial_data.csv' {toronto_geocsv}
geocsv_data = pd.read_csv('toronto_m.geospatial_data.csv')

# add latitude and longitude to each postal code
df = pd.merge(geocsv_data, df, on='Postal Code')
df = df[['Postal Code', 'Borough', 'Neighborhood', 'Latitude', 'Longitude']]
df.head(11)

| | Postal Code | Borough | Neighborhood | Latitude | Longitude |
|---|---|---|---|---|---|
| 0 | M1B | Scarborough | Malvern, Rouge | 43.806686 | -79.194353 |
| 1 | M1C | Scarborough | Rouge Hill, Port Union, Highland Creek | 43.784535 | -79.160497 |
| 2 | M1E | Scarborough | Guildwood, Morningside, West Hill | 43.763573 | -79.188711 |
| 3 | M1G | Scarborough | Woburn | 43.770992 | -79.216917 |
| 4 | M1H | Scarborough | Cedarbrae | 43.773136 | -79.239476 |
| 5 | M1J | Scarborough | Scarborough Village | 43.744734 | -79.239476 |
| 6 | M1K | Scarborough | Kennedy Park, Ionview, East Birchmount Park | 43.727929 | -79.262029 |
| 7 | M1L | Scarborough | Golden Mile, Clairlea, Oakridge | 43.711112 | -79.284577 |
| 8 | M1M | Scarborough | Cliffside, Cliffcrest, Scarborough Village West | 43.716316 | -79.239476 |
| 9 | M1N | Scarborough | Birch Cliff, Cliffside West | 43.692657 | -79.264848 |
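
When joining coordinates onto postal codes, it can help to confirm that every code actually found a match. The sketch below uses two small invented frames (the code `M0X` is made up) and the `validate` and `indicator` options of `pd.merge`; it is one possible sanity check, not part of the original workflow.

```python
import pandas as pd

neighborhoods = pd.DataFrame({
    "Postal Code": ["M4W", "M5A", "M0X"],  # M0X is a made-up code with no coordinates
    "Borough": ["Downtown Toronto"] * 3,
})
coords = pd.DataFrame({
    "Postal Code": ["M4W", "M5A"],
    "Latitude": [43.679563, 43.65426],
    "Longitude": [-79.377529, -79.360636],
})

# A left join keeps every neighborhood row; validate raises if a code is duplicated,
# and indicator adds a _merge column showing where each row came from.
merged = pd.merge(
    neighborhoods, coords,
    on="Postal Code", how="left",
    validate="one_to_one", indicator=True,
)

unmatched = merged.loc[merged["_merge"] == "left_only", "Postal Code"].tolist()
print(unmatched)
```

An inner join, as used in the notebook, would silently drop the unmatched code instead of reporting it.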


#### Select Downtown Toronto as the study area for this project

In [6]:
# keep only the Downtown Toronto borough and renumber the rows
df_3 = df[df['Borough'].str.contains('Downtown Toronto')].reset_index(drop=True)
print(df_3.shape)
df_3

(19, 5)


| | Postal Code | Borough | Neighborhood | Latitude | Longitude |
|---|---|---|---|---|---|
| 0 | M4W | Downtown Toronto | Rosedale | 43.679563 | -79.377529 |
| 1 | M4X | Downtown Toronto | St. James Town, Cabbagetown | 43.667967 | -79.367675 |
| 2 | M4Y | Downtown Toronto | Church and Wellesley | 43.66586 | -79.38316 |
| 3 | M5A | Downtown Toronto | Regent Park, Harbourfront | 43.65426 | -79.360636 |
| 4 | M5B | Downtown Toronto | Garden District, Ryerson | 43.657162 | -79.378937 |
| 5 | M5C | Downtown Toronto | St. James Town | 43.651494 | -79.375418 |
| 6 | M5E | Downtown Toronto | Berczy Park | 43.644771 | -79.373306 |
| 7 | M5G | Downtown Toronto | Central Bay Street | 43.657952 | -79.387383 |
| 8 | M5H | Downtown Toronto | Richmond, Adelaide, King | 43.650571 | -79.384568 |
| 9 | M5J | Downtown Toronto | Harbourfront East, Union Station, Toronto Islands | 43.640816 | -79.381752 |


## 3. Use the Foursquare API to explore neighbourhoods in Downtown Toronto

#### Use the geopy library to get the latitude and longitude of Downtown Toronto, then visualize its neighbourhoods on a map created with **Folium**

In [7]:
address = 'Downtown Toronto, Toronto'

geolocator = Nominatim(user_agent="tr_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Downtown Toronto are {}, {}.'.format(latitude, longitude))

# create map of Downtown Toronto using latitude and longitude values
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(df_3['Latitude'], df_3['Longitude'], df_3['Borough'], df_3['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_toronto)

map_toronto

The geographical coordinates of Downtown Toronto are 43.6541737, -79.38081164513409.


##### **TO VIEW THE RENDERED MAP, PASTE THE GITHUB LINK TO THIS .ipynb FILE INTO https://nbviewer.jupyter.org/**

#### Define Foursquare Credentials and Version.

In [8]:
CLIENT_ID = 'QR24W0AJYVEDYGP3OAWVWXEGVBW1X2NYSY5ZM0WW34I10AF2' # your Foursquare ID
CLIENT_SECRET = 'MMSVAVCUUGDAQKKD3R2HLDI4OMUEUCZVGUCCGUREXKY34WTZ' # your Foursquare Secret
VERSION = '20200516'

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: QR24W0AJYVEDYGP3OAWVWXEGVBW1X2NYSY5ZM0WW34I10AF2
CLIENT_SECRET: MMSVAVCUUGDAQKKD3R2HLDI4OMUEUCZVGUCCGUREXKY34WTZ
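
An alternative to writing the credentials into a cell is to read them from environment variables and mask them when printing. The sketch below is an assumption-laden illustration: the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and both helper functions are hypothetical, not something Foursquare or this notebook defines.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Read the credentials from a mapping (the process environment by default).

    The variable names are hypothetical; choose whatever names suit your setup.
    """
    client_id = env.get("FOURSQUARE_CLIENT_ID")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError("Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first")
    return client_id, client_secret


def mask_secret(secret, visible=4):
    """Keep the first `visible` characters and replace the rest with asterisks."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)


# Demonstration with a fake environment so that nothing real is printed.
fake_env = {"FOURSQUARE_CLIENT_ID": "demo-id", "FOURSQUARE_CLIENT_SECRET": "demo-secret-value"}
demo_id, demo_secret = load_foursquare_credentials(fake_env)
print("CLIENT_ID:", demo_id)
print("CLIENT_SECRET:", mask_secret(demo_secret))
```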


#### Explore the neighborhoods in Downtown Toronto: create the GET request URL, send the request, then clean and structure the JSON response into a new dataframe

In [9]:
# Retrieve the venues near each neighborhood, given its name and coordinates, and store them in a dataframe.
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
        LIMIT = 30
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues

toronto_venues = getNearbyVenues(names=df_3['Neighborhood'],
                                   latitudes=df_3['Latitude'],
                                   longitudes=df_3['Longitude']
                                  )

Rosedale
St. James Town, Cabbagetown
Church and Wellesley
Regent Park, Harbourfront
Garden District, Ryerson
St. James Town
Berczy Park
Central Bay Street
Richmond, Adelaide, King
Harbourfront East, Union Station, Toronto Islands
Toronto Dominion Centre, Design Exchange
Commerce Court, Victoria Hotel
University of Toronto, Harbord
Kensington Market, Chinatown, Grange Park
CN Tower, King and Spadina, Railway Lands, Harbourfront West, Bathurst Quay, South Niagara, Island airport
Stn A PO Boxes
First Canadian Place, Underground city
Christie
Queen's Park, Ontario Provincial Government


#### Show the size of the resulting dataframe and the number of unique venue categories among the returned venues

In [10]:
print(toronto_venues.shape)
toronto_venues.groupby('Neighborhood').count()
print('There are {} unique categories.'.format(len(toronto_venues['Venue Category'].unique())))

(517, 7)
There are 147 unique categories.


## 4. Analyze the Neighborhoods in Downtown Toronto

In [11]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
toronto_onehot['Neighborhood'] = toronto_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

# move the 'Yoga Studio' column to the end; Index.difference returns the remaining columns sorted
cols_to_move = ['Yoga Studio']
new_cols = np.hstack((toronto_onehot.columns.difference(cols_to_move), cols_to_move))
toronto_onehot = toronto_onehot.reindex(columns=new_cols)


toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()


def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]



num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # positions beyond the third use the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

| | Neighborhood | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Berczy Park | Beer Bar | Coffee Shop | Seafood Restaurant | Cocktail Bar | French Restaurant | Breakfast Spot | Creperie | Liquor Store | Bistro | Concert Hall |
| 1 | CN Tower, King and Spadina, Railway Lands, Har... | Airport Lounge | Airport Service | Airport Terminal | Airport | Plane | Boutique | Coffee Shop | Bar | Sculpture Garden | Boat or Ferry |
| 2 | Central Bay Street | Coffee Shop | Café | Seafood Restaurant | Sandwich Place | Ramen Restaurant | Poke Place | Bubble Tea Shop | Park | Chinese Restaurant | Modern European Restaurant |
| 3 | Christie | Grocery Store | Café | Park | Italian Restaurant | Nightclub | Candy Store | Restaurant | Diner | Baby Store | Athletics & Sports |
| 4 | Church and Wellesley | Café | Dance Studio | Smoke Shop | Indian Restaurant | Japanese Restaurant | Salon / Barbershop | Bookstore | Restaurant | Juice Bar | Ramen Restaurant |
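
The analysis above has three steps: one-hot encode the venue categories, average them per neighborhood to get category frequencies, then rank the categories in each row. The sketch below replays those steps on a handful of invented venues; `top_categories` is a hypothetical helper that plays the role of `return_most_common_venues`.

```python
import pandas as pd

# Invented venues: three in Rosedale, three in Christie.
venues = pd.DataFrame({
    "Neighborhood": ["Rosedale", "Rosedale", "Rosedale", "Christie", "Christie", "Christie"],
    "Venue Category": ["Park", "Park", "Trail", "Café", "Café", "Grocery Store"],
})

# One column per category, 1.0 where the venue belongs to it.
onehot = pd.get_dummies(venues["Venue Category"], dtype=float)
onehot.insert(0, "Neighborhood", venues["Neighborhood"])

# The mean of the indicator columns is the share of each category per neighborhood.
freq = onehot.groupby("Neighborhood").mean()


def top_categories(row, n):
    """Return the n category names with the highest frequency in this row."""
    return row.sort_values(ascending=False).index[:n].tolist()


top2 = {hood: top_categories(row, 2) for hood, row in freq.iterrows()}
print(freq)
print(top2)
```

Because each row of `freq` sums to 1, neighborhoods with few venues are dominated by whichever categories happen to appear, which is worth keeping in mind when reading the ranked table.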


## 5. Cluster the neighborhoods and visualize the clusters on a map

#### Run *k*-means to cluster the neighborhoods into 5 clusters.

In [12]:
# set number of clusters
kclusters = 5

toronto_grouped_clustering = toronto_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([4, 2, 0, 3, 4, 0, 0, 4, 4, 4], dtype=int32)
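
The notebook fixes the number of clusters at 5. One common way to sanity-check such a choice is to compare silhouette scores for several values of *k*. The sketch below does this on synthetic data that merely stands in for the frequency matrix (three tight groups of rows); it is not a claim about the best *k* for the Toronto data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)

# Synthetic stand-in for the frequency matrix: three tight groups of six rows each.
centers = np.array([[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]])
X = np.vstack([c + rng.normal(scale=0.02, size=(6, 3)) for c in centers])

# Fit k-means for several k and record the mean silhouette score of each fit.
scores = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(scores)
print("k with the highest silhouette score:", best_k)
```

With only 19 neighborhoods, any such score is noisy, so it is best read as a rough guide alongside the cluster tables.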

#### Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [13]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

toronto_merged = df_3

# join the sorted venues onto df_3 so each neighborhood keeps its latitude/longitude
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

toronto_merged.head() # check the last columns!

| | Postal Code | Borough | Neighborhood | Latitude | Longitude | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | M4W | Downtown Toronto | Rosedale | 43.679563 | -79.377529 | 1 | Park | Playground | Trail | Comfort Food Restaurant | Deli / Bodega | Dance Studio | Creperie | Cosmetics Shop | Concert Hall | Comic Shop |
| 1 | M4X | Downtown Toronto | St. James Town, Cabbagetown | 43.667967 | -79.367675 | 0 | Restaurant | Italian Restaurant | Coffee Shop | Café | Bakery | Liquor Store | Indian Restaurant | Japanese Restaurant | Deli / Bodega | Jewelry Store |
| 2 | M4Y | Downtown Toronto | Church and Wellesley | 43.66586 | -79.38316 | 4 | Café | Dance Studio | Smoke Shop | Indian Restaurant | Japanese Restaurant | Salon / Barbershop | Bookstore | Restaurant | Juice Bar | Ramen Restaurant |
| 3 | M5A | Downtown Toronto | Regent Park, Harbourfront | 43.65426 | -79.360636 | 0 | Coffee Shop | Park | Breakfast Spot | Theater | Bakery | Event Space | Performing Arts Venue | Café | Pub | Restaurant |
| 4 | M5B | Downtown Toronto | Garden District, Ryerson | 43.657162 | -79.378937 | 4 | Café | Theater | Coffee Shop | Burger Joint | Electronics Store | Shopping Mall | Music Venue | Plaza | Ramen Restaurant | Hotel |


#### Finally, let's visualize the resulting clusters

In [14]:
# create map
map_clusters_DT = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
# note: rainbow[cluster-1] maps cluster 0 to the last color in the list
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighborhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters_DT)
       
map_clusters_DT
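
The marker colors come from indexing the color list with `cluster-1`, so cluster 0 wraps around to the last entry of the list. The sketch below rebuilds the list on its own to make that mapping explicit; `color_by_label` is a name introduced here for illustration.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 5

# One hex color per cluster, sampled evenly along the rainbow colormap.
rainbow = [colors.rgb2hex(c) for c in cm.rainbow(np.linspace(0, 1, kclusters))]

# rainbow[label - 1]: label 0 uses index -1 (the last color),
# label 1 uses index 0 (the first color), and so on.
color_by_label = {label: rainbow[label - 1] for label in range(kclusters)}

for label, hex_color in color_by_label.items():
    print("Cluster", label, "->", hex_color)
```

This wrap-around is why cluster 0 is drawn in the color at the red end of the colormap and cluster 1 in the color at the purple end.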

##### **TO VIEW THE RENDERED MAP, PASTE THE GITHUB LINK TO THIS .ipynb FILE INTO https://nbviewer.jupyter.org/**

## 6. Examine Clusters

#### Cluster 0 (Red) Gastronomy

In [21]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[2] + list(range(5, toronto_merged.shape[1]))]]

| | Neighborhood | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | St. James Town, Cabbagetown | 0 | Restaurant | Italian Restaurant | Coffee Shop | Café | Bakery | Liquor Store | Indian Restaurant | Japanese Restaurant | Deli / Bodega | Jewelry Store |
| 3 | Regent Park, Harbourfront | 0 | Coffee Shop | Park | Breakfast Spot | Theater | Bakery | Event Space | Performing Arts Venue | Café | Pub | Restaurant |
| 7 | Central Bay Street | 0 | Coffee Shop | Café | Seafood Restaurant | Sandwich Place | Ramen Restaurant | Poke Place | Bubble Tea Shop | Park | Chinese Restaurant | Modern European Restaurant |
| 8 | Richmond, Adelaide, King | 0 | Coffee Shop | Café | Pizza Place | Sushi Restaurant | Concert Hall | Lounge | Japanese Restaurant | Colombian Restaurant | Seafood Restaurant | Deli / Bodega |
| 10 | Toronto Dominion Centre, Design Exchange | 0 | Coffee Shop | Café | Restaurant | Japanese Restaurant | Hotel | Museum | Gastropub | Pub | Concert Hall | Deli / Bodega |
| 11 | Commerce Court, Victoria Hotel | 0 | Café | Gastropub | Restaurant | Japanese Restaurant | Coffee Shop | Deli / Bodega | Sandwich Place | Seafood Restaurant | Pub | Ice Cream Shop |
| 16 | First Canadian Place, Underground city | 0 | Café | Coffee Shop | Restaurant | Seafood Restaurant | Tea Room | Steakhouse | Gastropub | Pub | Bookstore | Deli / Bodega |
| 18 | Queen's Park, Ontario Provincial Government | 0 | Coffee Shop | Sushi Restaurant | Yoga Studio | Diner | Bar | Smoothie Shop | Beer Bar | Distribution Center | Sandwich Place | Italian Restaurant |


#### Cluster 1 (Purple) Nature

In [22]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[2] + list(range(5, toronto_merged.shape[1]))]]

| | Neighborhood | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Rosedale | 1 | Park | Playground | Trail | Comfort Food Restaurant | Deli / Bodega | Dance Studio | Creperie | Cosmetics Shop | Concert Hall | Comic Shop |


#### Cluster 2 (Blue) Transportation Hub

In [23]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[2] + list(range(5, toronto_merged.shape[1]))]]

| | Neighborhood | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 14 | CN Tower, King and Spadina, Railway Lands, Har... | 2 | Airport Lounge | Airport Service | Airport Terminal | Airport | Plane | Boutique | Coffee Shop | Bar | Sculpture Garden | Boat or Ferry |


#### Cluster 3 (Green) Miscellaneous

In [24]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[2] + list(range(5, toronto_merged.shape[1]))]]

| | Neighborhood | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 17 | Christie | 3 | Grocery Store | Café | Park | Italian Restaurant | Nightclub | Candy Store | Restaurant | Diner | Baby Store | Athletics & Sports |


#### Cluster 4 (Yellow) Entertainment & Accommodation

In [25]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[2] + list(range(5, toronto_merged.shape[1]))]]

| | Neighborhood | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | Church and Wellesley | 4 | Café | Dance Studio | Smoke Shop | Indian Restaurant | Japanese Restaurant | Salon / Barbershop | Bookstore | Restaurant | Juice Bar | Ramen Restaurant |
| 4 | Garden District, Ryerson | 4 | Café | Theater | Coffee Shop | Burger Joint | Electronics Store | Shopping Mall | Music Venue | Plaza | Ramen Restaurant | Hotel |
| 5 | St. James Town | 4 | Gastropub | Café | Cocktail Bar | Coffee Shop | BBQ Joint | New American Restaurant | Park | Middle Eastern Restaurant | Restaurant | Cosmetics Shop |
| 6 | Berczy Park | 4 | Beer Bar | Coffee Shop | Seafood Restaurant | Cocktail Bar | French Restaurant | Breakfast Spot | Creperie | Liquor Store | Bistro | Concert Hall |
| 9 | Harbourfront East, Union Station, Toronto Islands | 4 | Plaza | Park | Hotel | Italian Restaurant | Lounge | Roof Deck | Salad Place | Bubble Tea Shop | Japanese Restaurant | Skating Rink |
| 12 | University of Toronto, Harbord | 4 | Café | Bookstore | Restaurant | Bar | Bakery | Japanese Restaurant | Italian Restaurant | Yoga Studio | Dessert Shop | Beer Bar |
| 13 | Kensington Market, Chinatown, Grange Park | 4 | Café | Mexican Restaurant | Bakery | Vietnamese Restaurant | Comfort Food Restaurant | Record Shop | Pizza Place | Cheese Shop | Organic Grocery | Cocktail Bar |
| 15 | Stn A PO Boxes | 4 | Café | Restaurant | Seafood Restaurant | Beer Bar | Cocktail Bar | Jazz Club | Hotel | Park | Cheese Shop | Museum |


# THANK YOU FOR YOUR REVIEW!