## Segmentation and Clustering of Neighborhood in Toronto, Canada.

### Introduction
    1.Web-scraping of the Toronto Neighborhood data using requests and BeautifulSoup Libraries.
    
    2.Performing Exploratory and Data analysis Using the Pandas and Numpy Library.
    
    3.Using Foursquare API Explore the Neighborhoods of Toronto location.
    
    4.Using Scikit-learn library to apply K-means Clustering alogrithm.  

## Table of Contents

1. Performing Web-scraping and Extracting Neighborhood and borough information from Wikipedia Page

2. Perfoming EDA 

3. Explore Neighborhoods in Toronto

4. Utilizing the Foursquare Library to explore Analyze Each Neighborhood

5. Cluster Neighborhoods

6. Examine Clusters    


## 1.  Performing Web-scraping and Extracting Neighborhood and borough information from Wikipedia Page

In [1]:
# Installing essential libraries in Web-scraping
# import the library we use to open the URL 
import lxml
import requests # library to handle requests
# Import the beautifulSoup library so we can parse HTML and XML documents
from bs4 import BeautifulSoup as BS

import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analsysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe

# Matplotlib and associated plotting modulesl
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


In [2]:
# Storing the URL of wikipedia wiki_url
wiki_url = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'

# Getting the source code of the HTML page from the URL using the request library
source = requests.get(wiki_url).text

# Converting the HTML souce code to the beautiful soup format
soup = BS(source,'html.parser')

# Reading the table Information and storing in the database 
df_old = pd.read_html(wiki_url)

# Using the find method to find the exact borough information 
df_old[0]

Unnamed: 0,Postal Code,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
5,M6A,North York,"Lawrence Manor, Lawrence Heights"
6,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
7,M8A,Not assigned,Not assigned
8,M9A,Etobicoke,"Islington Avenue, Humber Valley Village"
9,M1B,Scarborough,"Malvern, Rouge"


In [3]:
# Checking the  database for all the Unique Borough names and total number of neighborhood
df = df_old[0].copy()
df.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"


## 2. Perfoming EDA 
Neighborhood has a total of 3 boroughs and 177 neighborhoods. In order to segement the neighborhoods and explore them, we will essentially need a dataset that contains the 3 boroughs and the neighborhoods that exist in each borough as well as the the latitude and logitude coordinates of each neighborhood.

### Data Handling and Cleaning

In [4]:
# Checking the dataFrame for missing Values
df.isna().sum()

Postal Code      0
Borough          0
Neighbourhood    0
dtype: int64

In [5]:
# dropping the rows having 'Not assigned' in Borough column
df = df[~(df['Borough'] == 'Not assigned')]
df.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
5,M6A,North York,"Lawrence Manor, Lawrence Heights"
6,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"


In [6]:
# Cheking again for any missing or null values
df.isna().sum()

Postal Code      0
Borough          0
Neighbourhood    0
dtype: int64

In [7]:
df.shape

(103, 3)

### Using pgeocode library to get the latitudes and longitudes of each Neighborhood from the postal code given.

In [8]:
import pgeocode
nomi = pgeocode.Nominatim('CA')
location = nomi.query_postal_code("M3A")
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

43.7545 -79.33


In [9]:
df['Postal Code'].head()

2    M3A
3    M4A
4    M5A
5    M6A
6    M7A
Name: Postal Code, dtype: object

In [10]:
# Creating a new DataFrame to store the latitudes and longitudes of the each neighborhood
Country_code = 'CA'
column_names = [ 'Postal Code', 'Borough', 'Neighbourhood', 'Latitude', 'Longitude'] 
Neighbourhood = pd.DataFrame(columns=column_names)

# We will define a instance of the geopy library as an tor_agent
# We cannot use the Nominatim method for large number of values hence we divide the dataset into two parts

# First Part of the dataset
for postal, bor, neigh in zip(df['Postal Code'], df['Borough'], df['Neighbourhood']):  
    try:
        nomi = pgeocode.Nominatim(Country_code)
        location = nomi.query_postal_code(postal)
        Neighbourhood = Neighbourhood.append({
                                            'Postal Code':postal,
                                            'Borough':bor,
                                            'Neighbourhood':neigh,
                                            'Latitude':location.latitude,
                                            'Longitude':location.longitude
                                           },ignore_index=True)
    except:
        Neighbourhood = Neighbourhood.append({
                                            'Postal Code':postal,
                                            'Borough':bor,
                                            'Neighbourhood':neigh,
                                            'Latitude':np.NaN,
                                            'Longitude':np.NaN
                                           },ignore_index=True)


# Cheking for any null values        
Neighbourhood.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.7545,-79.33
1,M4A,North York,Victoria Village,43.7276,-79.3148
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.6555,-79.3626
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.7223,-79.4504
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.6641,-79.3889


In [11]:
# Checking for the missing values in the data frame
Neighbourhood.isna().sum()

Postal Code      0
Borough          0
Neighbourhood    0
Latitude         1
Longitude        1
dtype: int64

In [12]:
# Checking the one row whose latitude and logitude were not extracted properly
Neighbourhood[Neighbourhood.Latitude.isna()]

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
76,M7R,Mississauga,Canada Post Gateway Processing Centre,,


In [13]:
# Removing the one row whose latitude and logitude were not extracted properly
Neighbourhood = Neighbourhood[~(Neighbourhood.Latitude.isna())]
Neighbourhood.isna().sum()

Postal Code      0
Borough          0
Neighbourhood    0
Latitude         0
Longitude        0
dtype: int64

In [14]:
# Checking all the different borough and the neighbour associated with it
Neighbourhood['Borough'].value_counts()

North York          24
Downtown Toronto    19
Scarborough         17
Etobicoke           12
Central Toronto      9
West Toronto         6
York                 5
East York            5
East Toronto         5
Name: Borough, dtype: int64

In [15]:
# Printing the information for Toronto Dataset
print('The dataframe has {} boroughs and {} neighborhoods.'.format(
        len(Neighbourhood['Borough'].unique()),
        Neighbourhood.shape[0]
    )
)

The dataframe has 9 boroughs and 102 neighborhoods.


## 3.Exploring the neighbourhood of Toronto
Using the visualization library Folium and the Neighbourhood datasets Co-ordinates

In [16]:
# Extracting the coordinates of Toronto city
nomi = pgeocode.Nominatim('CA')
location = nomi.query_postal_code("M4B")
toronto_latitude = location.latitude
toronto_longitude = location.longitude
print('The Coordinates of Toronto Canada is  {}, {}'.format(toronto_latitude, toronto_longitude))

The Coordinates of Toronto Canada is  43.7063, -79.3094


In [17]:
# Creating a map of Toronto Neighborhood using Folium Library and latitude and longitude values
# Format of the Popup label = (Neighbourhood name , Borough name)
map_toronto = folium.Map(location=[toronto_latitude, toronto_longitude], zoom_start=10)

for lat, long, borough, neigh in zip(Neighbourhood['Latitude'], 
                                     Neighbourhood['Longitude'], 
                                     Neighbourhood['Borough'], 
                                     Neighbourhood['Neighbourhood']):
    label = '{}, {}'.format(neigh, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, long],
        radius=5,
        popup=label,
        colors='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.4,
        parse_html=False).add_to(map_toronto)
map_toronto

##  3. Explore Neighbourhood  Borough  having  Toronto name in it.
#### As we did for the all the neighborhood lets Visualize the Neiborhood of Scarborough


In [19]:
# Extracting all the neighbourhood of borough names with toronto and creating a new dataframe 
Toronto_Neighbourhood = Neighbourhood[Neighbourhood.Borough.str.find('Toronto') != -1]
Toronto_Neighbourhood.reset_index(drop=True,).head()

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
0,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.6555,-79.3626
1,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.6641,-79.3889
2,M5B,Downtown Toronto,"Garden District, Ryerson",43.6572,-79.3783
3,M5C,Downtown Toronto,St. James Town,43.6513,-79.3756
4,M4E,East Toronto,The Beaches,43.6784,-79.2941


### The Toronto_Neighbourhood dataframe has 4 boroughs and 39 neighborhoods.

In [20]:
# Printing the information for New Data-set
print('The dataframe has {} boroughs and {} neighborhoods.'.format(
        len(Toronto_Neighbourhood['Borough'].unique()),
        Toronto_Neighbourhood.shape[0]
    )
)

The dataframe has 4 boroughs and 39 neighborhoods.


In [21]:
# Analyzing the dataframe and cheking for Unique Boroughs Neighbourhood number
Toronto_Neighbourhood.Borough.value_counts()

Downtown Toronto    19
Central Toronto      9
West Toronto         6
East Toronto         5
Name: Borough, dtype: int64

In [22]:
# Creating a map of Toronto Neighborhood using Folium Library and latitude and longitude values
map_tor_neigh = folium.Map(location=[latitude, longitude], zoom_start=10)

for lat, long, borough, neigh in zip(Toronto_Neighbourhood['Latitude'], 
                                     Toronto_Neighbourhood['Longitude'], 
                                     Toronto_Neighbourhood['Borough'], 
                                     Toronto_Neighbourhood['Neighbourhood']):
    label = '{}, {}'.format(neigh, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, long],
        radius=5,
        popup=label,
        colors='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.4,
        parse_html=False).add_to(map_tor_neigh)
map_tor_neigh

## 4. Utilizing the Foursquare Library to explore Analyze Each Neighborhood.

#### Define Foursquare Credentials and Version

In [23]:
CLIENT_ID = 'D4M1I000SE54SVKCXXL4NQGDHI5C4MSFPJ12LA3VNAATO0ZX' # your Foursquare ID
CLIENT_SECRET = '3B2DHRPIPFPRIRYKT00OUDRSECRXIOOKVSGNQ5ZFV2HUUREQ' # your Foursquare Secret
VERSION = '20200807' # Foursquare API version

print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: D4M1I000SE54SVKCXXL4NQGDHI5C4MSFPJ12LA3VNAATO0ZX
CLIENT_SECRET:3B2DHRPIPFPRIRYKT00OUDRSECRXIOOKVSGNQ5ZFV2HUUREQ


In [24]:
# Exploring the First Neighbourhood in the dataFrame
Toronto_Neighbourhood = Toronto_Neighbourhood.reset_index(drop=True)
print('The first Row has same neighboor hood with same postalcode and hence from here same co-ordinates : {}'.
      format(Toronto_Neighbourhood.loc[0,'Neighbourhood']))

# Select the first element as both have the same data Neighbourhood select the first element
print('Selecting the first Neighbourhood : {}'.
      format(Toronto_Neighbourhood.loc[0,'Neighbourhood'].split(',')[0]))

print('Latitudes and Longitudes : {}, {}'.format(Toronto_Neighbourhood.loc[0,'Latitude'] , Toronto_Neighbourhood
                           .loc[0,'Longitude']))

The first Row has same neighboor hood with same postalcode and hence from here same co-ordinates : Regent Park, Harbourfront
Selecting the first Neighbourhood : Regent Park
Latitudes and Longitudes : 43.6555, -79.3626


#### Now, let's get the top 100 venues that are in Regent Park within a radius of 500 meters.
First, let's create the GET request URL. Name your URL **url**.

In [25]:
# type your answer here
LIMIT = 100 #limit the number of venues returned by Foursquare API
radius = 500 # define radius
# Create a URL to send the GET request
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&\
radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, VERSION, 
                           Toronto_Neighbourhood.loc[0,'Latitude'], 
                           Toronto_Neighbourhood.loc[0,'Longitude'],
                           radius, LIMIT )
url

'https://api.foursquare.com/v2/venues/explore?&client_id=D4M1I000SE54SVKCXXL4NQGDHI5C4MSFPJ12LA3VNAATO0ZX&client_secret=3B2DHRPIPFPRIRYKT00OUDRSECRXIOOKVSGNQ5ZFV2HUUREQ&v=20200807&ll=43.6555,-79.3626&radius=500&limit=100'

In [26]:
# Send the GET request and examine the resutls
result = requests.get(url).json()
result

{'meta': {'code': 200, 'requestId': '5f2d40d911597434c503e33a'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'Corktown',
  'headerFullLocation': 'Corktown, Toronto',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 23,
  'suggestedBounds': {'ne': {'lat': 43.660000004500006,
    'lng': -79.3563918719477},
   'sw': {'lat': 43.6509999955, 'lng': -79.36880812805231}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '53b8466a498e83df908c3f21',
       'name': 'Tandem Coffee',
       'location': {'address': '368 King St E',
        'crossStreet': 'at Trinity St',
        'lat': 43.65355870959944,
        'lng': -79.36180945913513,
        'labeledLatLngs': [{'label': 'di

In [27]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']
result['response']

{'suggestedFilters': {'header': 'Tap to show:',
  'filters': [{'name': 'Open now', 'key': 'openNow'}]},
 'headerLocation': 'Corktown',
 'headerFullLocation': 'Corktown, Toronto',
 'headerLocationGranularity': 'neighborhood',
 'totalResults': 23,
 'suggestedBounds': {'ne': {'lat': 43.660000004500006,
   'lng': -79.3563918719477},
  'sw': {'lat': 43.6509999955, 'lng': -79.36880812805231}},
 'groups': [{'type': 'Recommended Places',
   'name': 'recommended',
   'items': [{'reasons': {'count': 0,
      'items': [{'summary': 'This spot is popular',
        'type': 'general',
        'reasonName': 'globalInteractionReason'}]},
     'venue': {'id': '53b8466a498e83df908c3f21',
      'name': 'Tandem Coffee',
      'location': {'address': '368 King St E',
       'crossStreet': 'at Trinity St',
       'lat': 43.65355870959944,
       'lng': -79.36180945913513,
       'labeledLatLngs': [{'label': 'display',
         'lat': 43.65355870959944,
         'lng': -79.36180945913513}],
       'distance':

In [28]:
## Now we are ready to clean the json and structure it into a pandas dataframe.
# result['response']['groups'][0]['items'][3]['venue']
venues = result['response']['groups'][0]['items']    
nearby_venues = json_normalize(venues)
print(nearby_venues.shape[0])
nearby_venues.head(2) # Flattening the json

23


  after removing the cwd from sys.path.


Unnamed: 0,referralId,reasons.count,reasons.items,venue.id,venue.name,venue.location.address,venue.location.crossStreet,venue.location.lat,venue.location.lng,venue.location.labeledLatLngs,venue.location.distance,venue.location.cc,venue.location.city,venue.location.state,venue.location.country,venue.location.formattedAddress,venue.categories,venue.photos.count,venue.photos.groups,venue.location.postalCode,venue.location.neighborhood,venue.venuePage.id
0,e-0-53b8466a498e83df908c3f21-0,0,"[{'summary': 'This spot is popular', 'type': '...",53b8466a498e83df908c3f21,Tandem Coffee,368 King St E,at Trinity St,43.653559,-79.361809,"[{'label': 'display', 'lat': 43.65355870959944...",225,CA,Toronto,ON,Canada,"[368 King St E (at Trinity St), Toronto ON, Ca...","[{'id': '4bf58dd8d48988d1e0931735', 'name': 'C...",0,[],,,
1,e-0-54ea41ad498e9a11e9e13308-1,0,"[{'summary': 'This spot is popular', 'type': '...",54ea41ad498e9a11e9e13308,Roselle Desserts,362 King St E,Trinity St,43.653447,-79.362017,"[{'label': 'display', 'lat': 43.65344672305267...",233,CA,Toronto,ON,Canada,"[362 King St E (Trinity St), Toronto ON M5A 1K...","[{'id': '4bf58dd8d48988d16a941735', 'name': 'B...",0,[],M5A 1K9,,


In [29]:
# Extracting the categories information from the json file of each venue
categories = []
for x in range(nearby_venues.shape[0]):
    categories.append(pd.json_normalize(result['response']['groups'][0]['items'][x]
                                        ['venue']['categories'])['name'][0])
categories[0:5]


['Coffee Shop', 'Bakery', 'Breakfast Spot', 'Yoga Studio', 'Coffee Shop']

In [30]:
# Create a dataframe to store the shop name, categories, and latitude longitude 
Column_names = ['Shop_Name','Category','Latitude','Longitude']
venues_df = pd.DataFrame(columns=Column_names)
for shop_name, Cat, lat, lng in zip(nearby_venues['venue.name'], 
                                    categories, 
                                    nearby_venues['venue.location.lat'],
                                   nearby_venues['venue.location.lng']):
    
    venues_df = venues_df.append({
                                  'Shop_Name':shop_name,
                                  'Category':Cat,
                                  'Latitude':lat,
                                  'Longitude':lng
                                           },ignore_index=True)
    
    
venues_df.head()

Unnamed: 0,Shop_Name,Category,Latitude,Longitude
0,Tandem Coffee,Coffee Shop,43.653559,-79.361809
1,Roselle Desserts,Bakery,43.653447,-79.362017
2,Figs Breakfast & Lunch,Breakfast Spot,43.655675,-79.364503
3,The Yoga Lounge,Yoga Studio,43.655515,-79.364955
4,Sumach Espresso,Coffee Shop,43.658135,-79.359515


In [31]:
# Determining the number of venues returned by Foursquare API
print('{} venues were returned by Foursquare.'.format(venues_df.shape[0]) )

23 venues were returned by Foursquare.


In [32]:
# Creating a function to extract all the venues in all the neighbourhood in Toronto_Neighbourhood
def get_Nearby_Venues(latitude, longitude, Neighbourhood):
    
    CLIENT_ID = 'D4M1I000SE54SVKCXXL4NQGDHI5C4MSFPJ12LA3VNAATO0ZX' # your Foursquare ID
    CLIENT_SECRET = '3B2DHRPIPFPRIRYKT00OUDRSECRXIOOKVSGNQ5ZFV2HUUREQ' # your Foursquare Secret
    VERSION = '20200807' # Foursquare API version
    # Create a dataframe to store the shop name, categories, and latitude longitude 
    Column_names = ['Neighbourhood','Venue_Name','Venue_Category','Latitude','Longitude']
    venues_df = pd.DataFrame(columns=Column_names)
    
    # Extracting Venues of Each Neighbourhood
    for lat, lng, neigh in zip(latitude, longitude, Neighbourhood):
        neigh = neigh.split(',')[0]
        LIMIT = 100 #limit the number of venues returned by Foursquare API
        radius = 500 # define radius
        # Create a URL to send the GET request
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&\
        radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, VERSION, lat, lng, radius, LIMIT )
        # Send the GET request and examine the resutls
        result = requests.get(url).json()
        venues = result['response']['groups'][0]['items']    
        nearby_venues = pd.json_normalize(venues)
        
        
        # Extracting category of each venue and sot
        categories = []
        for x in range(nearby_venues.shape[0]):
            categories.append(pd.json_normalize(result['response']['groups'][0]['items'][x]
                                        ['venue']['categories'])['name'][0])
        

        for shop_name, Cat, lat, lng in zip(nearby_venues['venue.name'], 
                                        categories, 
                                        nearby_venues['venue.location.lat'],
                                        nearby_venues['venue.location.lng']):
    
            venues_df = venues_df.append({'Neighbourhood':neigh,
                                          'Venue_Name':shop_name,
                                          'Venue_Category':Cat,
                                          'Latitude':lat,
                                          'Longitude':lng
                                           },ignore_index=True)
    return(venues_df)    # Returning the dataframe having the venues information

In [33]:
# Using the above function to Extract all the venues of all the Neighbourhood in Toronto_Neighbourhood datase
# get_Nearby_Venues(latitude, longitude, Neighbourhood):
Toronto_Neighbourhood_Venues = get_Nearby_Venues(Toronto_Neighbourhood.Latitude,
                                                Toronto_Neighbourhood.Longitude,
                                                Toronto_Neighbourhood.Neighbourhood)
Toronto_Neighbourhood_Venues.head()

Unnamed: 0,Neighbourhood,Venue_Name,Venue_Category,Latitude,Longitude
0,Regent Park,Tandem Coffee,Coffee Shop,43.653559,-79.361809
1,Regent Park,Roselle Desserts,Bakery,43.653447,-79.362017
2,Regent Park,Sumach Espresso,Coffee Shop,43.658135,-79.359515
3,Regent Park,Rooster Coffee,Coffee Shop,43.6519,-79.365609
4,Regent Park,Sukhothai,Thai Restaurant,43.658444,-79.365681


In [34]:
# This dataframe has 3848 venues information 
Toronto_Neighbourhood_Venues.shape

(3848, 5)

In [81]:
# Analyzing the number of Venues from each Neigghbourhood
Toronto_Neighbourhood_Venues.groupby('Neighbourhood').count()

Unnamed: 0_level_0,Venue_Name,Venue_Category,Latitude,Longitude
Neighbourhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
Berczy Park,100,100,100,100
Brockton,100,100,100,100
Business reply mail Processing Centre,100,100,100,100
CN Tower,100,100,100,100
Central Bay Street,100,100,100,100
Christie,100,100,100,100
Church and Wellesley,100,100,100,100
Commerce Court,100,100,100,100
Davisville,100,100,100,100
Davisville North,100,100,100,100


In [36]:
# Let's find out how many unique categories can be curated from all the returned venues 
print('Unique Venues in all the Neighbourhood :{}'.
       format(len(Toronto_Neighbourhood_Venues.Venue_Category.unique())) )

Unique Venues in all the Neighbourhood :273


### Analyze Each Neighborhood with respect to the Venues 

In [39]:
# One hot encoding
Toronto_Venues_onehot = pd.get_dummies(Toronto_Neighbourhood_Venues[['Venue_Category']], 
                                                    prefix='',
                                                    prefix_sep='')
Toronto_Venues_onehot.head()
# add Neighbour column to the dataframe
Toronto_Venues_onehot['Neighbourhood'] =  Toronto_Neighbourhood_Venues['Neighbourhood']

# Move the Neighbourhood column to the first column
column_names = [Toronto_Venues_onehot.columns[-1]] + list(Toronto_Venues_onehot.columns[:-1])
Toronto_Venues_onehot = Toronto_Venues_onehot[column_names]
Toronto_Venues_onehot.head()

Unnamed: 0,Neighbourhood,Afghan Restaurant,Airport,American Restaurant,Amphitheater,Animal Shelter,Antique Shop,Aquarium,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Stadium,Basketball Stadium,Beach,Beer Bar,Beer Garden,Beer Store,Belgian Restaurant,Bike Shop,Bistro,Boat or Ferry,Bookstore,Botanical Garden,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Bridge,Bubble Tea Shop,Burger Joint,Burrito Place,Butcher,Café,Camera Store,Cantonese Restaurant,Caribbean Restaurant,Castle,Cheese Shop,Chinese Restaurant,Chiropractor,Chocolate Shop,Church,Circus,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,College Gym,College Rec Center,College Theater,Comedy Club,Comfort Food Restaurant,Comic Shop,Concert Hall,Convenience Store,Cosmetics Shop,Creperie,Cuban Restaurant,Cupcake Shop,Curling Ice,Dance Studio,Deli / Bodega,Department Store,Design Studio,Dessert Shop,Diner,Discount Store,Distribution Center,Dive Bar,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Field,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Gas Station,Gastropub,Gay Bar,General Entertainment,General Travel,German Restaurant,Gift Shop,Golf Course,Gourmet Shop,Government Building,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Hakka Restaurant,Harbor / Marina,Hawaiian Restaurant,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Hobby Shop,Hostel,Hotel,Hotel Bar,Hungarian Restaurant,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indonesian Restaurant,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Jewish Restaurant,Juice Bar,Korean Restaurant,Lake,Latin American Restaurant,Lingerie Store,Liquor Store,Lounge,Malay Restaurant,Marijuana Dispensary,Market,Martial Arts School,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Movie Theater,Museum,Music School,Music Store,Music Venue,Nail Salon,Neighborhood,New American Restaurant,Nightclub,Noodle House,Nudist Beach,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Pakistani Restaurant,Paper / Office Supplies Store,Park,Pastry Shop,Performing Arts Venue,Persian Restaurant,Peruvian Restaurant,Pet Store,Pharmacy,Pide Place,Pier,Pizza Place,Playground,Plaza,Poke Place,Pool,Pool Hall,Portuguese Restaurant,Poutine Place,Pub,Racetrack,Ramen Restaurant,Record Shop,Restaurant,Roof Deck,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,School,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Skate Park,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,Soccer Field,Soccer Stadium,Soup Place,South American Restaurant,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Sri Lankan Restaurant,Stationery Store,Steakhouse,Street Art,Strip Club,Supermarket,Sushi Restaurant,Syrian Restaurant,Taco Place,Tailor Shop,Taiwanese Restaurant,Tanning Salon,Tapas Restaurant,Tattoo Parlor,Tea Room,Tech Startup,Thai Restaurant,Theater,Theme Park,Theme Restaurant,Thrift / Vintage Store,Tibetan Restaurant,Toy / Game Store,Track,Trail,Train Station,Turkish Restaurant,Udon Restaurant,University,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wings Joint,Women's Store,Xinjiang Restaurant,Yoga Studio,Zoo
0,Regent Park,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Regent Park,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Regent Park,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Regent Park,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Regent Park,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


#### Next, let's group rows by neighborhood and by taking the mean of the frequency of occurrence of each category

In [41]:
# Conforming the new size of the dataFrame
Toronto_Venues_grouped.shape

(38, 274)

In [42]:
# Let's print each neighborhood along with the top 5 most common venues
num_top_venues = 5

for hood in Toronto_Venues_grouped['Neighbourhood']:
    print("----"+hood+"----")
    temp = Toronto_Venues_grouped[Toronto_Venues_grouped['Neighbourhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Berczy Park----
                 venue  freq
0          Coffee Shop  0.12
1                Hotel  0.06
2           Restaurant  0.05
3  Japanese Restaurant  0.04
4               Bakery  0.04


----Brockton----
         venue  freq
0         Café  0.06
1  Coffee Shop  0.06
2          Bar  0.04
3       Bakery  0.04
4   Restaurant  0.04


----Business reply mail Processing Centre----
                venue  freq
0         Coffee Shop  0.07
1  Chinese Restaurant  0.04
2         Supermarket  0.04
3      Clothing Store  0.04
4   Indian Restaurant  0.04


----CN Tower----
         venue  freq
0  Coffee Shop  0.09
1  Yoga Studio  0.06
2         Café  0.05
3          Gym  0.05
4   Restaurant  0.04


----Central Bay Street----
                venue  freq
0         Coffee Shop  0.13
1      Clothing Store  0.06
2                Café  0.03
3  Italian Restaurant  0.03
4         Art Gallery  0.03


----Christie----
                           venue  freq
0                           Café  0.12
1     

In [45]:
# Putting the above output in pandas dataframe
# Creating a function for sorting the venues in descending order of frequency
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]


In [47]:
# Now let's create the new dataframe and display the top 10 venues for each neighborhood.
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighbourhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighbourhood'] = Toronto_Venues_grouped['Neighbourhood']

for ind in np.arange(Toronto_Venues_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(
                                                                    Toronto_Venues_grouped.iloc[ind, :],
                                                                    num_top_venues)   

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Berczy Park,Coffee Shop,Hotel,Restaurant,Japanese Restaurant,Beer Bar,Café,Bakery,Italian Restaurant,Seafood Restaurant,Farmers Market
1,Brockton,Coffee Shop,Café,Restaurant,Bar,Bakery,Athletics & Sports,Asian Restaurant,Gift Shop,Italian Restaurant,Vietnamese Restaurant
2,Business reply mail Processing Centre,Coffee Shop,Restaurant,Supermarket,Chinese Restaurant,Clothing Store,Indian Restaurant,Bakery,Caribbean Restaurant,Pharmacy,Bookstore
3,CN Tower,Coffee Shop,Yoga Studio,Gym,Café,Restaurant,Italian Restaurant,Park,Bar,Sandwich Place,Spa
4,Central Bay Street,Coffee Shop,Clothing Store,Italian Restaurant,Café,Plaza,Art Gallery,Department Store,Electronics Store,Diner,Breakfast Spot


##  5. Cluster Neighbourhood
Run *k*-means to cluster the neighborhood into 5 clusters.

In [54]:
# set number of clusters
kclusters = 5

toronto_grouped_clustering = Toronto_Venues_grouped.drop('Neighbourhood', 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([3, 2, 1, 1, 1, 4, 1, 3, 1, 1])

In [68]:
# Processing the Neighbourhood column having two names with with the same postal code to be replaced
# with the first one
for index, item in zip(Toronto_Neighbourhood['Neighbourhood'].index,
                       Toronto_Neighbourhood['Neighbourhood']):
    
    Toronto_Neighbourhood['Neighbourhood'][index] = Toronto_Neighbourhood.loc[index,'Neighbourhood'].split(',')[0]  
# Toronto_Neighbourhood.loc[0,'Neighbourhood'].split(',')[0]
Toronto_Neighbourhood.head()

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  


Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
0,M5A,Downtown Toronto,Regent Park,43.6555,-79.3626
1,M7A,Downtown Toronto,Queen's Park,43.6641,-79.3889
2,M5B,Downtown Toronto,Garden District,43.6572,-79.3783
3,M5C,Downtown Toronto,St. James Town,43.6513,-79.3756
4,M4E,East Toronto,The Beaches,43.6784,-79.2941


In [70]:
# Creating a new dataframe that includes the cluster as well as the top 10 venues of each neighbourhood
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

Toronto_merged =  Toronto_Neighbourhood

# merge Toronto_grouped with Toronto_Neighbourhood to add latitude/longitude for each neighborhood
Toronto_merged = Toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighbourhood'), 
                                       on='Neighbourhood')   

Toronto_merged.head() # check the last columns!


Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M5A,Downtown Toronto,Regent Park,43.6555,-79.3626,1,Coffee Shop,Café,Bakery,Restaurant,Gastropub,Theater,Breakfast Spot,Diner,Pub,Park
1,M7A,Downtown Toronto,Queen's Park,43.6641,-79.3889,1,Coffee Shop,Café,Bubble Tea Shop,Japanese Restaurant,Diner,Yoga Studio,Sushi Restaurant,Bookstore,College Theater,Mediterranean Restaurant
2,M5B,Downtown Toronto,Garden District,43.6572,-79.3783,1,Clothing Store,Coffee Shop,Italian Restaurant,Cosmetics Shop,Café,Japanese Restaurant,Bubble Tea Shop,Lingerie Store,Theater,Burger Joint
3,M5C,Downtown Toronto,St. James Town,43.6513,-79.3756,1,Coffee Shop,Café,Restaurant,Japanese Restaurant,Park,Gastropub,Bakery,Diner,Thai Restaurant,Clothing Store
4,M4E,East Toronto,The Beaches,43.6784,-79.2941,2,Park,Beach,Coffee Shop,Pub,Café,Bakery,Indian Restaurant,Breakfast Spot,Ice Cream Shop,Brewery


In [72]:
# Finally Visualizing the Results
# create map
map_clusters = folium.Map(location=[43.6555, -79.3626], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]
# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(Toronto_merged['Latitude'], Toronto_merged['Longitude'], Toronto_merged['Neighbourhood'], Toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

## 6. Examine Clusters
Now, examining each cluster and determine the discriminating venue categories that distinguish each cluster. 


### Cluster 1

In [76]:
Toronto_merged.loc[Toronto_merged['Cluster Labels'] == 0, Toronto_merged.columns[[1] + list(range(6, Toronto_merged.shape[1]))]]    

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
10,Downtown Toronto,Park,Café,Coffee Shop,Harbor / Marina,Boat or Ferry,Hotel,Restaurant,Pizza Place,Brewery,Music Venue
24,Central Toronto,Italian Restaurant,Restaurant,Café,Coffee Shop,Vegetarian / Vegan Restaurant,Park,Spa,Middle Eastern Restaurant,Boutique,French Restaurant
28,West Toronto,Coffee Shop,Park,Café,Italian Restaurant,Bakery,Bar,Brewery,Pizza Place,Gastropub,French Restaurant
29,Central Toronto,Park,Italian Restaurant,Café,Sushi Restaurant,Bakery,Coffee Shop,Dessert Shop,Restaurant,Indian Restaurant,Thai Restaurant
31,Central Toronto,Italian Restaurant,Park,Coffee Shop,Café,Sushi Restaurant,Grocery Store,Yoga Studio,Middle Eastern Restaurant,Vegetarian / Vegan Restaurant,French Restaurant
33,Downtown Toronto,Park,Italian Restaurant,Café,Coffee Shop,Restaurant,Spa,Yoga Studio,Ice Cream Shop,Pub,Sushi Restaurant


### Cluster 2

In [77]:
Toronto_merged.loc[Toronto_merged['Cluster Labels'] == 1, Toronto_merged.columns[[1] + list(range(6, Toronto_merged.shape[1]))]]    

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Downtown Toronto,Coffee Shop,Café,Bakery,Restaurant,Gastropub,Theater,Breakfast Spot,Diner,Pub,Park
1,Downtown Toronto,Coffee Shop,Café,Bubble Tea Shop,Japanese Restaurant,Diner,Yoga Studio,Sushi Restaurant,Bookstore,College Theater,Mediterranean Restaurant
2,Downtown Toronto,Clothing Store,Coffee Shop,Italian Restaurant,Cosmetics Shop,Café,Japanese Restaurant,Bubble Tea Shop,Lingerie Store,Theater,Burger Joint
3,Downtown Toronto,Coffee Shop,Café,Restaurant,Japanese Restaurant,Park,Gastropub,Bakery,Diner,Thai Restaurant,Clothing Store
6,Downtown Toronto,Coffee Shop,Clothing Store,Italian Restaurant,Café,Plaza,Art Gallery,Department Store,Electronics Store,Diner,Breakfast Spot
18,Central Toronto,Coffee Shop,Italian Restaurant,Café,Sushi Restaurant,Bakery,Restaurant,Ice Cream Shop,Japanese Restaurant,Food & Drink Shop,Fast Food Restaurant
19,Central Toronto,Coffee Shop,Italian Restaurant,Restaurant,Japanese Restaurant,Café,Bakery,Bagel Shop,Sushi Restaurant,Fast Food Restaurant,Bank
20,Central Toronto,Coffee Shop,Italian Restaurant,Bakery,Indian Restaurant,Café,Park,Restaurant,Pizza Place,Yoga Studio,Thai Restaurant
21,Central Toronto,Coffee Shop,Italian Restaurant,Sushi Restaurant,Gym,Park,Yoga Studio,Restaurant,Ice Cream Shop,Middle Eastern Restaurant,Bank
23,Central Toronto,Italian Restaurant,Coffee Shop,Café,Bakery,Fast Food Restaurant,Park,Sushi Restaurant,Restaurant,Skating Rink,Gastropub


### Cluster 3

In [78]:
Toronto_merged.loc[Toronto_merged['Cluster Labels'] == 2, Toronto_merged.columns[[1] + list(range(6, Toronto_merged.shape[1]))]]    

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,East Toronto,Park,Beach,Coffee Shop,Pub,Café,Bakery,Indian Restaurant,Breakfast Spot,Ice Cream Shop,Brewery
11,West Toronto,Café,Bar,Bakery,Pizza Place,Asian Restaurant,Coffee Shop,Restaurant,Yoga Studio,Cocktail Bar,Men's Store
12,East Toronto,Café,Greek Restaurant,Park,Bakery,Vietnamese Restaurant,Ice Cream Shop,American Restaurant,Italian Restaurant,Gastropub,Yoga Studio
14,West Toronto,Coffee Shop,Café,Restaurant,Bar,Bakery,Athletics & Sports,Asian Restaurant,Gift Shop,Italian Restaurant,Vietnamese Restaurant
15,East Toronto,Coffee Shop,Park,Café,Beach,Brewery,Bar,Pizza Place,Indian Restaurant,Breakfast Spot,Italian Restaurant
17,East Toronto,Coffee Shop,Park,Café,Vietnamese Restaurant,Brewery,Bar,Bakery,Pizza Place,Ice Cream Shop,French Restaurant
25,West Toronto,Coffee Shop,Park,Bakery,Bar,Café,Restaurant,Eastern European Restaurant,Sushi Restaurant,Indian Restaurant,Italian Restaurant


### Cluster 4

In [79]:
Toronto_merged.loc[Toronto_merged['Cluster Labels'] == 3, Toronto_merged.columns[[1] + list(range(6, Toronto_merged.shape[1]))]]    

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
5,Downtown Toronto,Coffee Shop,Hotel,Restaurant,Japanese Restaurant,Beer Bar,Café,Bakery,Italian Restaurant,Seafood Restaurant,Farmers Market
8,Downtown Toronto,Coffee Shop,Hotel,Café,Restaurant,Theater,Gym,Concert Hall,Beer Bar,Cosmetics Shop,Japanese Restaurant
13,Downtown Toronto,Coffee Shop,Hotel,Café,American Restaurant,Concert Hall,Restaurant,Seafood Restaurant,Gym,Japanese Restaurant,Theater
16,Downtown Toronto,Hotel,Coffee Shop,Café,Japanese Restaurant,Asian Restaurant,American Restaurant,Restaurant,Gym,Italian Restaurant,Seafood Restaurant
34,Downtown Toronto,Coffee Shop,Café,Hotel,Restaurant,Park,Beer Bar,Japanese Restaurant,Art Gallery,Vegetarian / Vegan Restaurant,Cocktail Bar
36,Downtown Toronto,Hotel,Coffee Shop,Café,Japanese Restaurant,Asian Restaurant,American Restaurant,Restaurant,Gym,Italian Restaurant,Seafood Restaurant


### Cluster 5

In [80]:
Toronto_merged.loc[Toronto_merged['Cluster Labels'] == 4, Toronto_merged.columns[[1] + list(range(6, Toronto_merged.shape[1]))]]    

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
7,Downtown Toronto,Café,Coffee Shop,Bar,Vegetarian / Vegan Restaurant,Korean Restaurant,Grocery Store,Italian Restaurant,Dessert Shop,Restaurant,Indian Restaurant
9,West Toronto,Café,Coffee Shop,Italian Restaurant,Bar,Park,Cocktail Bar,Bakery,Restaurant,Brewery,Gastropub
22,West Toronto,Café,Coffee Shop,Bar,Bakery,Italian Restaurant,Brewery,Gastropub,Restaurant,Pizza Place,Burger Joint
27,Downtown Toronto,Café,Coffee Shop,Vegetarian / Vegan Restaurant,Bakery,Restaurant,Bar,Italian Restaurant,Bookstore,Ice Cream Shop,Japanese Restaurant
30,Downtown Toronto,Café,Vegetarian / Vegan Restaurant,Coffee Shop,Bar,Dessert Shop,Ice Cream Shop,Mexican Restaurant,Arts & Crafts Store,Grocery Store,Park
