<h1 align=center><font size = 5>Segmenting and Clustering Neighborhoods in Toronto</font></h1>

## Part 1. Building dataframe

Before we get the data and start exploring it, let's download all the dependencies that we will need.

In [2]:
!pip install beautifulsoup4
!pip install pgeocode
!pip install geopy

import pgeocode

import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis
import requests # library to handle requests
from bs4 import BeautifulSoup

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

import folium # map rendering library
from geopy.geocoders import Nominatim # convert an address into latitude and longitude value

print('Libraries imported.')

Libraries imported.


### Load the Wikipedia page with Toronto postal codes and neighborhoods

In [3]:
html_data = requests.get("https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M").text #Using the requests library to download the webpage

In [4]:
soup = BeautifulSoup(html_data,"html5lib") #Parsing the html data using beautiful_soup

#### Extract the table using Beautiful Soup and store it in a dataframe

In [5]:
table_contents=[]
table=soup.find('table')
for row in table.findAll('td'):
    cell = {}
    if row.span.text=='Not assigned':
        pass
    else:
        cell['PostalCode'] = row.p.text[:3]
        #print(row.span.text)
        cell['Borough'] = (row.span.text).split('(')[0]
        cell['Neighborhood'] = (((((row.span.text).split('(')[1]).strip(')')).split('/')[0]).replace(')',' ')).strip(' ') #taking only first neighborhood in a list of each postcode
        table_contents.append(cell)

# print(table_contents)
df=pd.DataFrame(table_contents)
# print(df['Neighborhood'].tail(50))
df['Borough']=df['Borough'].replace({'Downtown TorontoStn A PO Boxes25 The Esplanade':'Downtown Toronto Stn A',
                                             'East TorontoBusiness reply mail Processing Centre969 Eastern':'East Toronto Business',
                                             'EtobicokeNorthwest':'Etobicoke Northwest','East YorkEast Toronto':'East York/East Toronto',
                                             'MississaugaCanada Post Gateway Processing Centre':'Mississauga'})

In [6]:
df

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,Regent Park
3,M6A,North York,Lawrence Manor
4,M7A,Queen's Park,Ontario Provincial Government
...,...,...,...
98,M8X,Etobicoke,The Kingsway
99,M4Y,Downtown Toronto,Church and Wellesley
100,M7Y,East Toronto Business,Enclave of M4L
101,M8Y,Etobicoke,Old Mill South
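
The one-line expression that fills `Neighborhood` in the cell above is hard to follow. As a minimal sketch, the same splitting can be written as a small helper. `split_cell` is a hypothetical name, and the sample strings only assume the "Borough(Neighborhood / Neighborhood)" layout that the scraping code relies on.

```python
def split_cell(text):
    """Split 'Borough(Hood A / Hood B)' into (borough, first neighborhood)."""
    # Everything before the first '(' is the borough.
    borough, _, rest = text.partition('(')
    # Drop closing parentheses, then keep only the first neighborhood.
    first_hood = rest.replace(')', ' ').split('/')[0].strip()
    return borough.strip(), first_hood


print(split_cell('North York(Parkwoods)'))
print(split_cell('Downtown Toronto(Regent Park / Harbourfront)'))
```

Unlike the original expression, this version returns an empty neighborhood instead of raising an error when a cell has no parenthesis.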


## Part 2. Getting coordinates by the postcodes

#### Use pgeocode to get the coordinates. I could not use Geocoder because I was unable to obtain an API key for it.

In [7]:
geolocator = pgeocode.Nominatim('ca')
postal_codes = df['PostalCode'].tolist()
latitudes = []
longitudes = []
for postal_code in postal_codes:
    # query_postal_code returns a pandas Series; unknown codes yield NaN fields
    g = geolocator.query_postal_code(postal_code)
    # always append so both lists stay the same length as df
    latitudes.append(g.latitude)
    longitudes.append(g.longitude)

df['Latitude'] = latitudes
df['Longitude'] = longitudes
df = df.dropna() # drop rows with missing values, e.g. unresolved postal codes

In [8]:
df

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.7545,-79.3300
1,M4A,North York,Victoria Village,43.7276,-79.3148
2,M5A,Downtown Toronto,Regent Park,43.6555,-79.3626
3,M6A,North York,Lawrence Manor,43.7223,-79.4504
4,M7A,Queen's Park,Ontario Provincial Government,43.6641,-79.3889
...,...,...,...,...,...
98,M8X,Etobicoke,The Kingsway,43.6518,-79.5076
99,M4Y,Downtown Toronto,Church and Wellesley,43.6656,-79.3830
100,M7Y,East Toronto Business,Enclave of M4L,43.7804,-79.2505
101,M8Y,Etobicoke,Old Mill South,43.6325,-79.4939


## Part 3. Explore and cluster the neighborhoods in Toronto

#### Use the geopy library to get the latitude and longitude values of Toronto.

In [9]:
address = 'Toronto, Ontario'

geolocator = Nominatim(user_agent="toronto_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.6534817, -79.3839347.


In [10]:
BwT = df[df['Borough'].str.contains("oronto")].reset_index(drop=True) #take only boroughs that contain the word Toronto 

#### Create a map of Toronto with neighborhoods superimposed on top.

In [11]:
# create map of Toronto using latitude and longitude values

map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(BwT['Latitude'], BwT['Longitude'], BwT['Borough'], BwT['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

Next, we are going to start utilizing the Foursquare API to explore the neighborhoods and segment them.

#### Define Foursquare Credentials and Version

In [12]:
CLIENT_ID = 'JGSP4L0DPSECDQ4JJKAKR301CSHUMCAOSFPOK0O1KDE2KEDQ' # your Foursquare ID
CLIENT_SECRET = 'I4MOFCRQKER1Y5IDECXPRH5KI2FVJSXAVVKDYG2MGHIJLV0E' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: JGSP4L0DPSECDQ4JJKAKR301CSHUMCAOSFPOK0O1KDE2KEDQ
CLIENT_SECRET:I4MOFCRQKER1Y5IDECXPRH5KI2FVJSXAVVKDYG2MGHIJLV0E


### Explore Neighborhoods in Toronto

#### Let's create a function to explore the neighborhoods in Toronto

In [13]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
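
The function above indexes straight into the response JSON, so a failed request or a venue without categories would raise a `KeyError` or `IndexError`. Here is a minimal sketch of a more defensive extraction step. `extract_venue_rows` is a hypothetical helper, and the sample payload is hand-written to mimic the `response` / `groups` / `items` layout the function already assumes; it is not real API output.

```python
def extract_venue_rows(payload, name, lat, lng):
    """Turn one explore response (already parsed from JSON) into flat rows."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # Fall back to None when a venue carries no category.
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng, venue['name'],
                     venue['location']['lat'], venue['location']['lng'],
                     category))
    return rows


# Hand-written sample that mimics the layout used above; not real API output.
sample = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Example Cafe',
               'location': {'lat': 43.65, 'lng': -79.36},
               'categories': [{'name': 'Coffee Shop'}]}},
    {'venue': {'name': 'Uncategorized Spot',
               'location': {'lat': 43.66, 'lng': -79.37},
               'categories': []}},
]}]}}

rows = extract_venue_rows(sample, 'Regent Park', 43.6555, -79.3626)
empty = extract_venue_rows({'meta': {'code': 400}}, 'Nowhere', 0.0, 0.0)
print(rows)
print(empty)
```

Inside `getNearbyVenues`, the list comprehension could be replaced by a call such as `venues_list.append(extract_venue_rows(requests.get(url).json(), name, lat, lng))`, so that one bad response yields an empty list rather than stopping the whole loop.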

#### Now write the code to run the above function on each neighborhood and create a new dataframe called _toronto_venues_.

In [14]:
toronto_venues = getNearbyVenues(names=BwT['Neighborhood'],
                                   latitudes=BwT['Latitude'],
                                   longitudes=BwT['Longitude']
                                  )

Regent Park
Garden District, Ryerson
St. James Town
The Beaches
Berczy Park
Central Bay Street
Christie
Richmond
Dufferin
The Danforth  East
Harbourfront East
Little Portugal
The Danforth West
Toronto Dominion Centre
Brockton
India Bazaar
Commerce Court
Studio District
Lawrence Park
Roselawn
Davisville North
Forest Hill North & West
High Park
North Toronto West
The Annex
Parkdale
Davisville
University of Toronto
Runnymede
Moore Park
Kensington Market
Summerhill West
CN Tower
Rosedale
Enclave of M5E
St. James Town
First Canadian Place
Church and Wellesley
Enclave of M4L


Let's check the size of the resulting dataframe

In [15]:
print(toronto_venues.shape)
toronto_venues.head()

(1512, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Regent Park,43.6555,-79.3626,Tandem Coffee,43.653559,-79.361809,Coffee Shop
1,Regent Park,43.6555,-79.3626,Roselle Desserts,43.653447,-79.362017,Bakery
2,Regent Park,43.6555,-79.3626,Figs Breakfast & Lunch,43.655675,-79.364503,Breakfast Spot
3,Regent Park,43.6555,-79.3626,The Yoga Lounge,43.655515,-79.364955,Yoga Studio
4,Regent Park,43.6555,-79.3626,Body Blitz Spa East,43.654735,-79.359874,Spa
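
As a quick sanity check on the `radius=500` setting, we can measure how far a returned venue is from its neighborhood coordinates. The sketch below uses the standard haversine formula; `haversine_m` is a hypothetical helper name, and the two points are copied from the first row of the table above.

```python
import math

def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two (lat, lng) points."""
    r = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

# Regent Park coordinates and Tandem Coffee, taken from the first row above.
d = haversine_m(43.6555, -79.3626, 43.653559, -79.361809)
print(round(d), 'metres')
```

The distance comes out well inside the 500 m search radius, which is what we would expect for every row in `toronto_venues`.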


Let's check how many venues were returned for each neighborhood

In [16]:
toronto_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Berczy Park,95,95,95,95,95,95
Brockton,39,39,39,39,39,39
CN Tower,58,58,58,58,58,58
Central Bay Street,55,55,55,55,55,55
Christie,12,12,12,12,12,12
Church and Wellesley,75,75,75,75,75,75
Commerce Court,100,100,100,100,100,100
Davisville,23,23,23,23,23,23
Davisville North,7,7,7,7,7,7
Dufferin,18,18,18,18,18,18


#### Let's find out how many unique categories can be curated from all the returned venues

In [17]:
print('There are {} unique categories.'.format(len(toronto_venues['Venue Category'].unique())))

There are 217 unique categories.


### Analyze Each Neighborhood

In [18]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
toronto_onehot['Neighborhood'] = toronto_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.head()

Unnamed: 0,Yoga Studio,Accessories Store,Adult Boutique,American Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,...,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wine Shop,Wings Joint
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,1,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


And let's examine the new dataframe size.

In [19]:
toronto_onehot.shape

(1512, 217)
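
Note that the shape still has 217 columns after the `Neighborhood` column was added, and the table above starts with `Yoga Studio` rather than `Neighborhood`. This suggests (an inference from the output, not something verified here) that one venue category is itself named "Neighborhood", so the assignment overwrote that column in place and "move the last column to the front" moved a different column. A minimal sketch of reordering by name instead of by position, with the hypothetical helper `neighborhood_first` and a made-up column list:

```python
def neighborhood_first(columns, key='Neighborhood'):
    """Return the column names with `key` first and the rest in original order."""
    if key not in columns:
        raise KeyError(key)
    return [key] + [c for c in columns if c != key]


# Made-up column list in which 'Neighborhood' is not the last entry.
cols = ['Accessories Store', 'Neighborhood', 'Wine Bar', 'Yoga Studio']

positional = [cols[-1]] + cols[:-1]   # what moving "the last column" does
by_name = neighborhood_first(cols)

print(positional)  # ['Yoga Studio', 'Accessories Store', 'Neighborhood', 'Wine Bar']
print(by_name)     # ['Neighborhood', 'Accessories Store', 'Wine Bar', 'Yoga Studio']
```

On the real dataframe this could be applied as `toronto_onehot = toronto_onehot[neighborhood_first(list(toronto_onehot.columns))]`. The grouping step that follows is not affected, because it selects the `Neighborhood` column by name.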

#### Next, let's group rows by neighborhood and take the mean of the frequency of occurrence of each category

In [20]:
toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()
toronto_grouped

Unnamed: 0,Neighborhood,Yoga Studio,Accessories Store,Adult Boutique,American Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,...,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wine Shop,Wings Joint
0,Berczy Park,0.010526,0.0,0.0,0.0,0.021053,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.010526,0.0,0.0,0.0,0.0,0.0,0.0
1,Brockton,0.0,0.025641,0.0,0.0,0.025641,0.0,0.025641,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,CN Tower,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483
3,Central Bay Street,0.0,0.0,0.0,0.0,0.0,0.018182,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.018182,0.018182,0.0,0.018182,0.0,0.0
4,Christie,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Church and Wellesley,0.026667,0.0,0.013333,0.013333,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Commerce Court,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.03,0.0,...,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0
7,Davisville,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Davisville North,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Dufferin,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.055556,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0
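
Because every one-hot column holds only 0s and 1s, the group mean is simply the count of that category divided by the number of venues in the neighborhood. For example, Berczy Park has 95 venues, so 0.010526 is 1/95 and 0.021053 is 2/95. The sketch below reproduces that arithmetic in plain Python on made-up data, without pandas:

```python
from collections import Counter, defaultdict

# Made-up (neighborhood, category) pairs standing in for toronto_venues rows.
venues = [
    ('A', 'Coffee Shop'), ('A', 'Coffee Shop'), ('A', 'Bakery'), ('A', 'Park'),
    ('B', 'Park'), ('B', 'Café'),
]

counts = defaultdict(Counter)
for hood, category in venues:
    counts[hood][category] += 1

# The mean of a 0/1 column within a group equals count / group size.
freq = {
    hood: {cat: n / sum(c.values()) for cat, n in c.items()}
    for hood, c in counts.items()
}
print(freq['A'])  # {'Coffee Shop': 0.5, 'Bakery': 0.25, 'Park': 0.25}
```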


#### Let's confirm the new size

In [21]:
toronto_grouped.shape

(38, 217)

#### Let's print each neighborhood along with the top 5 most common venues

In [22]:
num_top_venues = 5

for hood in toronto_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = toronto_grouped[toronto_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Berczy Park----
                venue  freq
0         Coffee Shop  0.13
1              Bakery  0.05
2               Hotel  0.05
3  Seafood Restaurant  0.04
4        Cocktail Bar  0.04


----Brockton----
                    venue  freq
0                    Café  0.08
1             Coffee Shop  0.08
2          Breakfast Spot  0.05
3  Thrift / Vintage Store  0.05
4                     Bar  0.05


----CN Tower----
                venue  freq
0  Italian Restaurant  0.07
1         Coffee Shop  0.07
2                Café  0.05
3                 Bar  0.05
4       Grocery Store  0.03


----Central Bay Street----
                venue  freq
0         Coffee Shop  0.22
1     Bubble Tea Shop  0.04
2  Italian Restaurant  0.04
3                Café  0.04
4         Pizza Place  0.04


----Christie----
           venue  freq
0           Café  0.25
1  Grocery Store  0.25
2     Playground  0.08
3           Park  0.08
4    Coffee Shop  0.08


----Church and Wellesley----
                 venue  freq


#### Let's put that into a _pandas_ dataframe

First, let's write a function to sort the venues in descending order.

In [23]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

Now let's create the new dataframe and display the top 10 venues for each neighborhood.

In [24]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Berczy Park,Coffee Shop,Hotel,Bakery,Cocktail Bar,Seafood Restaurant,Café,Japanese Restaurant,Restaurant,Beer Bar,Pharmacy
1,Brockton,Coffee Shop,Café,Thrift / Vintage Store,Bar,Breakfast Spot,Gift Shop,Japanese Restaurant,Brewery,Hawaiian Restaurant,Sandwich Place
2,CN Tower,Coffee Shop,Italian Restaurant,Bar,Café,Wings Joint,Restaurant,Park,Speakeasy,Grocery Store,French Restaurant
3,Central Bay Street,Coffee Shop,Café,Italian Restaurant,Middle Eastern Restaurant,Sandwich Place,Bubble Tea Shop,Clothing Store,Pizza Place,Fast Food Restaurant,Shopping Mall
4,Christie,Café,Grocery Store,Candy Store,Park,Playground,Coffee Shop,Athletics & Sports,Baby Store,Event Space,Electronics Store
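
The column labels above rely on a three-item suffix list with a fallback to "th", which is fine for the top 10 but would label a 21st column "21th". As a small sketch, a general ordinal helper could build the same headers; `ordinal` is a hypothetical name, not part of the original notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return f'{n}{suffix}'


columns = ['Neighborhood'] + [f'{ordinal(i)} Most Common Venue' for i in range(1, 11)]
print(columns[1], '|', columns[4], '|', columns[-1])
```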


### Cluster Neighborhoods

Run _k_-means to cluster the neighborhoods into 5 clusters.

In [25]:
# set number of clusters
kclusters = 5

toronto_grouped_clustering = toronto_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1], dtype=int32)
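
The first ten labels are all 1, which hints that one cluster may hold most of the neighborhoods. Before interpreting the clusters, it can help to count how many rows fall into each one. The sketch below uses made-up labels for illustration; in the notebook the input would be `kmeans.labels_.tolist()`.

```python
from collections import Counter

# Made-up labels for illustration; in the notebook use kmeans.labels_.tolist().
labels = [1, 1, 1, 4, 1, 0, 1, 2, 1, 4, 1, 3, 2, 1]

sizes = Counter(labels)
for cluster in sorted(sizes):
    print(f'cluster {cluster}: {sizes[cluster]} neighborhoods')

largest_share = max(sizes.values()) / len(labels)
print(f'largest cluster holds {largest_share:.0%} of the rows')
```

If one cluster dominates, trying a few other values of `kclusters` and comparing the resulting sizes is a reasonable next step.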

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [26]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

toronto_merged = BwT.drop(columns='PostalCode')

# join neighborhoods_venues_sorted onto BwT so each neighborhood keeps its latitude/longitude
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

toronto_merged.head() # check the last columns!

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Downtown Toronto,Regent Park,43.6555,-79.3626,1,Coffee Shop,Breakfast Spot,Yoga Studio,Beer Store,Electronics Store,Italian Restaurant,Pub,Bakery,Thai Restaurant,Theater
1,Downtown Toronto,"Garden District, Ryerson",43.6572,-79.3783,1,Coffee Shop,Clothing Store,Café,Middle Eastern Restaurant,Cosmetics Shop,Hotel,Japanese Restaurant,Italian Restaurant,Theater,Bubble Tea Shop
2,Downtown Toronto,St. James Town,43.6513,-79.3756,1,Coffee Shop,Café,Italian Restaurant,Seafood Restaurant,Bakery,Park,Restaurant,Beer Bar,Hotel,Cosmetics Shop
3,East Toronto,The Beaches,43.6784,-79.2941,1,Pub,Health Food Store,Asian Restaurant,Bakery,Cheese Shop,Trail,Gastropub,Event Space,Ethiopian Restaurant,Dumpling Restaurant
4,Downtown Toronto,Berczy Park,43.6456,-79.3754,1,Coffee Shop,Hotel,Bakery,Cocktail Bar,Seafood Restaurant,Café,Japanese Restaurant,Restaurant,Beer Bar,Pharmacy


Finally, let's visualize the resulting clusters

In [27]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighborhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
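
The cell above picks colors with `rainbow[cluster-1]`, so label 0 falls on the last color through negative indexing. That still gives each label its own color, but it is easy to misread. As an alternative sketch, a palette can be built so that it is indexed directly by the cluster label. This version swaps in the standard-library `colorsys` module instead of matplotlib, and `distinct_hex_colors` is a hypothetical helper name.

```python
import colorsys

def distinct_hex_colors(k):
    """Return k hex colors spaced evenly around the hue wheel."""
    colors_hex = []
    for i in range(k):
        # Same saturation and brightness for all; only the hue changes.
        r, g, b = colorsys.hsv_to_rgb(i / k, 0.8, 0.9)
        colors_hex.append('#{:02x}{:02x}{:02x}'.format(
            round(r * 255), round(g * 255), round(b * 255)))
    return colors_hex


palette = distinct_hex_colors(5)
print(palette)
```

With such a palette, the marker call could use `color=palette[cluster]` and `fill_color=palette[cluster]`.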

### Examine Clusters

#### Cluster 1: markets

In [28]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
31,Summerhill West,Light Rail Station,Coffee Shop,Liquor Store,Supermarket,Wings Joint,Eastern European Restaurant,Fish Market,Fish & Chips Shop,Fast Food Restaurant,Farmers Market


#### Cluster 2: vibrant city

In [29]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Regent Park,Coffee Shop,Breakfast Spot,Yoga Studio,Beer Store,Electronics Store,Italian Restaurant,Pub,Bakery,Thai Restaurant,Theater
1,"Garden District, Ryerson",Coffee Shop,Clothing Store,Café,Middle Eastern Restaurant,Cosmetics Shop,Hotel,Japanese Restaurant,Italian Restaurant,Theater,Bubble Tea Shop
2,St. James Town,Coffee Shop,Café,Italian Restaurant,Seafood Restaurant,Bakery,Park,Restaurant,Beer Bar,Hotel,Cosmetics Shop
3,The Beaches,Pub,Health Food Store,Asian Restaurant,Bakery,Cheese Shop,Trail,Gastropub,Event Space,Ethiopian Restaurant,Dumpling Restaurant
4,Berczy Park,Coffee Shop,Hotel,Bakery,Cocktail Bar,Seafood Restaurant,Café,Japanese Restaurant,Restaurant,Beer Bar,Pharmacy
5,Central Bay Street,Coffee Shop,Café,Italian Restaurant,Middle Eastern Restaurant,Sandwich Place,Bubble Tea Shop,Clothing Store,Pizza Place,Fast Food Restaurant,Shopping Mall
6,Christie,Café,Grocery Store,Candy Store,Park,Playground,Coffee Shop,Athletics & Sports,Baby Store,Event Space,Electronics Store
7,Richmond,Café,Coffee Shop,Hotel,Restaurant,Gym,American Restaurant,Japanese Restaurant,Sushi Restaurant,Asian Restaurant,Salad Place
8,Dufferin,Bakery,Park,Furniture / Home Store,Pharmacy,Bus Line,Smoke Shop,Middle Eastern Restaurant,Café,Bar,Bank
10,Harbourfront East,Harbor / Marina,Park,Café,Music Venue,Wings Joint,Dumpling Restaurant,Fish & Chips Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant


#### Cluster 3: family homes

In [30]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
19,Roselawn,Home Service,Dumpling Restaurant,Flower Shop,Fish Market,Fish & Chips Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant
21,Forest Hill North & West,Home Service,Park,Trail,Wings Joint,Donut Shop,Fish & Chips Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space


#### Cluster 4: High Park

In [31]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
22,High Park,Park,Residential Building (Apartment / Condo),Wings Joint,Donut Shop,Fish & Chips Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant


#### Cluster 5: parks & outdoor

In [32]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
9,The Danforth East,Park,Massage Studio,Convenience Store,Intersection,Coffee Shop,Dumpling Restaurant,Fish Market,Fish & Chips Shop,Fast Food Restaurant,Farmers Market
18,Lawrence Park,Photography Studio,Park,Lawyer,Donut Shop,Fish & Chips Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant
23,North Toronto West,Park,Gym Pool,Playground,Garden,Wings Joint,Donut Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space
29,Moore Park,Park,Thai Restaurant,Gym,Grocery Store,Doner Restaurant,Fish & Chips Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant,Event Space
33,Rosedale,Park,Playground,Grocery Store,Candy Store,Wings Joint,Dumpling Restaurant,Fish & Chips Shop,Fast Food Restaurant,Farmers Market,Falafel Restaurant
