<h1><b>Machine Learning Based Clustering and Segmentation for Navigation</b></h1>

<h3><b>Introduction</b></h3>
    <p>
    The main aim of this project is to explore neighbourhoods within the Greater Toronto Area and look for correlations within those communities that venue data from the Foursquare API can help explain. Starting from crime rates, population information, and income sources and taxes, we identify correlations and then use the Foursquare API to examine them more closely.
    </p>
<h3><b>Project Contribution</b></h3>
    <p>
    This project looks for correlations among crime rates, population information, and income sources. This Jupyter notebook focuses on the following correlation:
        <ul>
            <li>Correlation between auto theft rates and high-income areas</li>
        </ul>
    The clusters created from these correlations let us look at the venues in each neighbourhood and see which types of venues are common within each cluster.
    </p>
<h3><b>Prerequisite</b></h3>
<ul>
    <li>Foursquare API</li>

</ul>
<h3><b>Datasets Used</b></h3>
<ul>
    <li>Postal Code Dataset</li>
    <li>Neighbourhood Dataset</li>
    <li>Population Dataset</li>
    <li>Crime Dataset</li>
</ul>

<h3><b>Import Statements</b></h3>

In [2]:
from dotenv import load_dotenv
from dotenv import dotenv_values

import folium
import requests
import os

import pandas as pd 
from pandas import json_normalize

from bs4 import BeautifulSoup as bs
from sklearn.cluster import KMeans

import numpy as np

import matplotlib.cm as cm
import matplotlib.colors as colors

<h3><b>1 - Download Datasets</b></h3>
<p>The following datasets form the basis of this project. Each was scraped from various sources and saved as a CSV file.</p>


<h4><b>1.1 - Create the Dataframe for Neighborhood Information</b></h4>
<p>Section 1.1 loads the neighbourhood DataFrame, a combination of the postal code dataset and the neighbourhood dataset.</p>

In [3]:
path = os.getcwd()
path = os.path.join(path,"datasets/neighborhood-data.csv")
postcodes = pd.read_csv(path)
postcodes.drop(postcodes.columns[[0]], axis=1, inplace=True)
postcodes.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,Harbourfront
3,M6A,North York,Lawrence Heights
4,M6A,North York,Lawrence Manor
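
<p>Each loader in this section reads a CSV and then drops its first column, which appears to hold the saved row index. A minimal sketch of a roughly equivalent approach is to pass <code>index_col=0</code> to <code>pd.read_csv</code>. The inline CSV text below is a stand-in (an assumption) for <code>datasets/neighborhood-data.csv</code>, built from rows in the preview above.</p>

```python
import io

import pandas as pd

# Stand-in for the CSV on disk (assumed layout: unnamed first column = saved index).
csv_text = """,Postcode,Borough,Neighbourhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,Harbourfront
"""

# index_col=0 uses the first column as the index instead of loading it as data,
# so no follow-up drop() call is needed.
postcodes_sample = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(postcodes_sample.columns.tolist())
```

<p>The same argument could be applied to the other three loaders, provided their files follow the same layout.</p>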


<h4><b>1.2 - Create the Dataframe for Homicide Rates and Crime Rates</b></h4>
<p>Section 1.2 creates a Dataframe for the crime rates information from the Toronto police dataset.</p>

In [4]:
path2 = os.getcwd()
path2 = os.path.join(path2,"datasets/neighbourhood-crime-rates.csv")
crimedata = pd.read_csv(path2)
crimedata.drop(crimedata.columns[[0]], axis=1, inplace=True)

<h4><b>1.3 - Combine the Neighbourhood Dataset with the Crime Dataset</b></h4>
<p>Section 1.3 combines the data from sections 1.1 and 1.2 into a single, larger DataFrame.</p>

In [5]:
path3 = os.getcwd()
path3 = os.path.join(path3,"datasets/combined-dataset.csv")
combined_data = pd.read_csv(path3)
combined_data.drop(combined_data.columns[[0]], axis=1, inplace=True)
combined_data.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Hood_ID,Population,Assault_AVG,Assault_Rate_2019,AutoTheft_AVG,AutoTheft_Rate_2019,Homicide_AVG,Homicide_Rate_2019,Latitude,Longitude
0,M3A,North York,Parkwoods,45,34805,159.7,454.0,31.5,91.9,0.3,2.9,43.751,-79.323
1,M4A,North York,Victoria Village,43,17510,119.3,753.9,16.5,102.8,0.7,5.7,43.735,-79.312
2,M6A,North York,Lawrence Heights,32,22372,104.0,518.5,28.5,102.8,0.2,0.0,43.722,-79.451
3,M1B,Scarborough,Rouge,131,46496,173.3,391.4,50.5,187.1,0.8,0.0,43.804,-79.165
4,M1B,Scarborough,Malvern,132,43794,278.2,760.4,47.2,162.1,1.7,2.3,43.809,-79.221


<h4><b>1.4 - Create the Dataframe for the Population Dataset</b></h4>
<p>Section 1.4 creates a Dataframe for the population dataset that is gathered from the Open Toronto website.</p>

In [6]:
path4 = os.getcwd()
path4 = os.path.join(path4,"datasets/population-dataset-combined.csv")
population_data = pd.read_csv(path4)
population_data.drop(population_data.columns[[0]], axis=1, inplace=True)
population_data.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Neighbourhood Number,"Total - Highest certificate, diploma or degree for the population aged 15 years and over in private households - 25% sample data","No certificate, diploma or degree",Secondary (high) school diploma or equivalency certificate,Trades certificate or diploma other than Certificate of Apprenticeship or Certificate of Qualification,Certificate of Apprenticeship or Certificate of Qualification,"College, CEGEP or other non-university certificate or diploma",...,"$45,000 to $49,999","$50,000 to $59,999","$60,000 to $69,999","$70,000 to $79,999","$80,000 to $89,999","$90,000 to $99,999","$100,000 and over","$200,000 and over",Latitude,Longitude
0,M3A,North York,Parkwoods,45,28890,4140,7660,700,605,5295,...,620,1200,1025,880,790,650,3795,890,43.751,-79.323
1,M6A,North York,Lawrence Manor,32,17080,2675,4340,505,330,2635,...,335,735,565,500,435,315,2155,755,43.726,-79.436
2,M1B,Scarborough,Rouge,131,38125,6580,11740,1020,635,7740,...,455,970,950,930,940,845,6060,1075,43.805,-79.166
3,M1B,Scarborough,Malvern,132,35885,7345,11575,1155,710,6915,...,655,1360,1200,1000,905,795,3280,225,43.809,-79.222
4,M3B,North York,Don Mills North,42,23390,2295,5150,450,345,3490,...,400,930,885,780,655,605,4615,1750,43.761,-79.411
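
<p>As a rough illustration of the correlation this notebook is after, the sketch below computes a Pearson correlation between average auto theft and the count of residents earning $100,000 and over. It uses only the three neighbourhoods that appear in both previews above (Parkwoods, Rouge and Malvern), so the value it prints is not a finding. Pairing these two columns is an assumption about how "high income" might be measured.</p>

```python
import pandas as pd

# Values copied from the previews in sections 1.3 and 1.4.
sample = pd.DataFrame({
    "Neighbourhood": ["Parkwoods", "Rouge", "Malvern"],
    "AutoTheft_AVG": [31.5, 50.5, 47.2],
    "$100,000 and over": [3795, 6060, 3280],
})

# Series.corr defaults to the Pearson correlation coefficient.
r = sample["AutoTheft_AVG"].corr(sample["$100,000 and over"])
print(f"Pearson r on {len(sample)} rows: {r:.2f}")
```

<p>On the full merged data the same call would apply to whole columns, and raw counts would likely need to be normalised by population before the comparison is meaningful.</p>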


<h3><b>2 - Foursquare API Initialization and Check</b></h3>
<p>The following section works on gathering data from the Foursquare API, which will later be mapped to a Dataframe that will allow us to analyze clusters.</p>
<h4><b>Category Codes:</b></h4>
<ul>
    <li>10000 - Arts and Entertainment</li>
    <li>11000 - Business and Professional Services</li>
    <li>12000 - Community and Government</li>
    <li>13000 - Dining and Drinking</li>
    <li>14000 - Event</li>
    <li>15000 - Health and Medicine</li>
    <li>16000 - Landmarks and Outdoors</li>
    <li>17000 - Retail</li>
    <li>18000 - Sports and Recreation</li>
    <li>19000 - Travel and Transportation</li>
</ul>

<h4><b>2.1 - Create Request Function to Retrieve Venue Data</b></h4>
<p>Section 2.1 focuses on the creation of the create_request function, which will allow us to gather venue data.</p>

In [7]:
config = dotenv_values(".env")
url = "https://api.foursquare.com/v3/places/nearby"

headers = {"Accept": "application/json",
            "Authorization": config["API_KEY"]}

response = requests.request("GET", url, headers=headers)

def create_request(coords=None, location=None, categories=None, limit="10"):
    """
        Important:
            - Coords and location cannot be entered together
            - Location and radius cannot be entered together

        The coords will be a list with latitude and longitude.
        Location will be a list with a city and province such as ["Oshawa", "ON"].
        The category is a string from the above codes, with a default of None.
        The limit parameter is a maximum of 50, with a default of 10 results.

        Returns the parsed JSON response, or False if no coords or location
        are given or the request fails.

        Examples:
            - create_request(coords=[43.895962,-72.848752], limit="1")
            - create_request(coords=[43.895962,-72.848752], categories="10000", limit="2")
            - create_request(location=["Oshawa","ON"], limit="2")
            - create_request(location=["Oshawa","ON"], categories="10000", limit="20")
    """

    # Coordinate searches are limited to a 500 m radius; location searches use `near`.
    if coords and categories is None:
        url = "https://api.foursquare.com/v3/places/search?ll=" + str(coords[0]) + "%2C" + str(coords[1]) + "&radius=500" + "&limit=" + limit
    elif coords and categories:
        url = "https://api.foursquare.com/v3/places/search?ll=" + str(coords[0]) + "%2C" + str(coords[1]) + "&categories=" + categories + "&radius=500" + "&limit=" + limit
    elif location and categories is None:
        url = "https://api.foursquare.com/v3/places/search?" + "near=" + str(location[0]) + "%2C" + str(location[1]) + "&limit=" + limit
    elif location and categories:
        url = "https://api.foursquare.com/v3/places/search?" + "categories=" + categories + "&near=" + str(location[0]) + "%2C" + str(location[1]) + "&limit=" + limit
    else:
        return False

    response = requests.request("GET", url, headers=headers)

    if response.status_code == 200:
        return response.json()
    else:
        return False
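
<p>The query strings in <code>create_request</code> are assembled by hand. A minimal alternative sketch, not the notebook's own implementation, builds the same kind of URL from a parameter dictionary with the standard library's <code>urlencode</code>. The helper name <code>build_search_url</code> and its <code>radius</code> argument are hypothetical; the endpoint and parameter names are the ones used above.</p>

```python
from urllib.parse import urlencode

SEARCH_URL = "https://api.foursquare.com/v3/places/search"


def build_search_url(coords=None, location=None, categories=None, limit=10, radius=500):
    """Hypothetical helper: build a search URL from either coords or a location."""
    if (coords is None) == (location is None):
        raise ValueError("Provide exactly one of coords or location")

    params = {}
    if coords is not None:
        # coords is [latitude, longitude]; a radius only applies to coordinate searches
        params["ll"] = f"{coords[0]},{coords[1]}"
        params["radius"] = radius
    else:
        # location is [city, province], e.g. ["Oshawa", "ON"]
        params["near"] = ",".join(str(part) for part in location)

    if categories is not None:
        params["categories"] = categories
    params["limit"] = limit

    # urlencode percent-encodes the values, so the comma becomes %2C
    return SEARCH_URL + "?" + urlencode(params)


print(build_search_url(coords=[43.6532, -79.3832], limit=1))
```

<p>Raising on conflicting arguments makes the "cannot be entered together" rule from the docstring explicit, where the original silently prefers <code>coords</code>.</p>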

In [8]:
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

<h3><b>Creating Law Enforcement DataFrame</b></h3>

In [9]:
latitude = 43.6532
longitude = -79.3832
results = create_request(location=["Toronto","ON"], categories="12070", limit="10")

venues = json_normalize(results['results'], max_level=3)

filtered_columns = ['name', 'categories', 'geocodes.main.latitude', 'geocodes.main.longitude']
venues['categories'] = venues.apply(get_category_type, axis=1)

venues = venues[filtered_columns]
venues.head()

Unnamed: 0,name,categories,geocodes.main.latitude,geocodes.main.longitude
0,53 Division Toronto Police Service,Police Station,43.706104,-79.400647
1,Toronto Police Service - 13 Division,Police Station,43.698433,-79.436581
2,Toronto Fire Station 341,Fire Station,43.694398,-79.441081
3,Toronto Fire Station 131,Fire Station,43.726175,-79.402729
4,University of Toronto Campus Community Police,Police Station,43.664817,-79.400841


<h3><b>Retrieve Venues for Each Neighbourhood</b></h3>

In [10]:
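def get_venues(names, latitudes, longitudes):
    """Collect up to 50 venues per neighbourhood into a single DataFrame."""
    columns = ['Neighbourhood', 'Neighbourhood Latitude', 'Neighbourhood Longitude', 'Venue', 'Venue Latitude', 'Venue Longitude', 'Venue Category']
    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
        # Note: lat/lng are passed as `location`, so they are sent through the `near` parameter
        # and no 500 m radius applies; use coords=[lat, lng] to search around the point instead.
        results = create_request(location=[lat, lng], limit="50")

        # create_request returns False when the request fails
        if not results:
            continue
        venues = results['results']

        if len(venues) > 0:
            try:
                venues_list.append([(
                    name,
                    lat,
                    lng,
                    venue['name'],
                    venue['geocodes']['main']['latitude'],
                    venue['geocodes']['main']['longitude'],
                    venue['categories'][0]['name']) for venue in venues])
            except (KeyError, IndexError):
                # skip neighbourhoods where a venue is missing coordinates or a category
                continue

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list], columns=columns)
    return nearby_venues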


<h3><b>Retrieve Venues for All Toronto Neighbourhoods</b></h3>

In [11]:
# combined_dataframe must exist before this cell runs, so build it here
combined_dataframe = pd.merge(combined_data, population_data, on="Neighbourhood")
toronto_venues = get_venues(names=combined_dataframe["Neighbourhood"], latitudes=combined_dataframe["Latitude_y"], longitudes=combined_dataframe["Longitude_y"])
toronto_venues.head()

In [None]:
toronto_venues.groupby("Neighbourhood").count()
print('There are {} unique categories.'.format(len(toronto_venues['Venue Category'].unique())))

There are 257 unique categories.


<h3><b>Analyzing Each Neighbourhood</b></h3>

In [None]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
toronto_onehot['Neighbourhood'] = toronto_venues['Neighbourhood'] 

# move neighborhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.head()


Unnamed: 0,Neighbourhood,ATM,Advertising Agency,Afghan Restaurant,African Restaurant,American Restaurant,Amusement Park,Antique Store,Arcade,Art Gallery,...,Urban Park,Vegan and Vegetarian Restaurant,Video Games Store,Vietnamese Restaurant,Vintage and Thrift Store,Warehouse / Wholesale Store,Wine Bar,Wings Joint,Women's Store,Xinjiang Restaurant
0,Parkwoods,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Parkwoods,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Parkwoods,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Parkwoods,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Parkwoods,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [None]:
toronto_grouped = toronto_onehot.groupby('Neighbourhood').mean().reset_index()
toronto_grouped.head()

Unnamed: 0,Neighbourhood,ATM,Advertising Agency,Afghan Restaurant,African Restaurant,American Restaurant,Amusement Park,Antique Store,Arcade,Art Gallery,...,Urban Park,Vegan and Vegetarian Restaurant,Video Games Store,Vietnamese Restaurant,Vintage and Thrift Store,Warehouse / Wholesale Store,Wine Bar,Wings Joint,Women's Store,Xinjiang Restaurant
0,Agincourt,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.04,0.02,0.0,0.0,0.0,0.0,0.0,0.0
1,Alderwood,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Bathurst Manor,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Beaumond Heights,0.025,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Bedford Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0
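
<p>The values in the grouped table are category frequencies: the mean of a 0/1 column within a neighbourhood is the share of that neighbourhood's venues in that category. The toy sketch below illustrates this with made-up neighbourhoods "A" and "B" (assumed data, not taken from the Foursquare results).</p>

```python
import pandas as pd

# Made-up venues: four in neighbourhood A, two in neighbourhood B.
venues = pd.DataFrame({
    "Neighbourhood": ["A", "A", "A", "A", "B", "B"],
    "Venue Category": ["Bank", "Bank", "Park", "Diner", "Park", "Park"],
})

# One column per category, 1.0 where the venue belongs to it.
onehot = pd.get_dummies(venues["Venue Category"], dtype=float)
onehot.insert(0, "Neighbourhood", venues["Neighbourhood"])

# The per-neighbourhood mean of each indicator column is that category's share.
freq = onehot.groupby("Neighbourhood").mean()
print(freq)
```

<p>Here two of A's four venues are banks, so its "Bank" frequency is 0.5, and each row sums to 1.</p>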


In [None]:
num_top_venues = 5

for hood in toronto_grouped['Neighbourhood']:
    print("----"+hood+"----")
    temp = toronto_grouped[toronto_grouped['Neighbourhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Agincourt----
                  venue  freq
0    Chinese Restaurant  0.24
1  Fast Food Restaurant  0.08
2      Asian Restaurant  0.08
3            Restaurant  0.06
4  Cantonese Restaurant  0.04


----Alderwood----
                           venue  freq
0  Cafes, Coffee, and Tea Houses  0.06
1                         Retail  0.06
2                 Discount Store  0.04
3            American Restaurant  0.04
4                      Drugstore  0.04


----Bathurst Manor----
                         venue  freq
0         Fast Food Restaurant  0.10
1  Grocery Store / Supermarket  0.06
2                         Bank  0.06
3                   Restaurant  0.06
4                    Drugstore  0.04


----Beaumond Heights----
                         venue  freq
0         Fast Food Restaurant  0.18
1         Caribbean Restaurant  0.08
2                    Drugstore  0.08
3  Grocery Store / Supermarket  0.08
4            Indian Restaurant  0.08


----Bedford Park----
                  venue  freq

<h3><b>Get Most Common Venues</b></h3>

In [None]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [None]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighbourhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighbourhoods_venues_sorted = pd.DataFrame(columns=columns)
neighbourhoods_venues_sorted['Neighbourhood'] = toronto_grouped['Neighbourhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighbourhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighbourhoods_venues_sorted.head()

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Agincourt,Chinese Restaurant,Fast Food Restaurant,Asian Restaurant,Restaurant,Cantonese Restaurant,Video Games Store,Bank,Bubble Tea Shop,Drugstore,Diner
1,Alderwood,"Cafes, Coffee, and Tea Houses",Retail,Discount Store,American Restaurant,Drugstore,Seafood Restaurant,Grocery Store / Supermarket,Clothing Store,Burger Joint,Coffee Shop
2,Bathurst Manor,Fast Food Restaurant,Grocery Store / Supermarket,Bank,Restaurant,Drugstore,Ice Cream Parlor,Pizzeria,Diner,Park,Coffee Shop
3,Beaumond Heights,Fast Food Restaurant,Caribbean Restaurant,Drugstore,Grocery Store / Supermarket,Indian Restaurant,Restaurant,Bank,Park,Bakery,Discount Store
4,Bedford Park,Sushi Restaurant,Italian Restaurant,Fast Food Restaurant,Bakery,Café,Drugstore,Pub,Restaurant,Bistro,Greek Restaurant
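
<p>The ranking step sorts each neighbourhood's frequencies and keeps the leading category names. A compact sketch of the same idea with <code>Series.nlargest</code> follows, on a made-up frequency table (assumed values); the helper name <code>top_categories</code> is hypothetical.</p>

```python
import pandas as pd

# Made-up frequency table with neighbourhoods as the index.
freq = pd.DataFrame(
    {"Bank": [0.5, 0.1], "Diner": [0.2, 0.3], "Park": [0.3, 0.6]},
    index=pd.Index(["A", "B"], name="Neighbourhood"),
)


def top_categories(row, n):
    """Return the names of the n most frequent categories in one row."""
    return row.nlargest(n).index.tolist()


# Apply row by row; the result is a Series holding one list per neighbourhood.
top2 = freq.apply(top_categories, axis=1, n=2)
print(top2)
```

<p>With the neighbourhood kept in the index there is no need to skip the first column of each row, as <code>return_most_common_venues</code> does with <code>row.iloc[1:]</code>.</p>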


<h3><b>Correlation: Educational Buildings and Income </b></h3>


<h3><b>Combine DataFrames Together</b></h3>

In [None]:
k = 6
# join the crime data and the population data on neighbourhood name
combined_dataframe = pd.merge(combined_data, population_data, on="Neighbourhood")
# features used for clustering
toronto_clustering = combined_dataframe[["Under $5,000", "AutoTheft_AVG", "No certificate, diploma or degree", "Hood_ID", "Neighbourhood Number"]]
kmeans = KMeans(n_clusters=k, random_state=0).fit(toronto_clustering)
# store each neighbourhood's cluster label as the first column
combined_dataframe.insert(0, "Cluster Labels", kmeans.labels_)
combined_dataframe.tail()

Unnamed: 0,Cluster Labels,Postcode_x,Borough_x,Neighbourhood,Hood_ID,Population,Assault_AVG,Assault_Rate_2019,AutoTheft_AVG,AutoTheft_Rate_2019,...,"$45,000 to $49,999","$50,000 to $59,999","$60,000 to $69,999","$70,000 to $79,999","$80,000 to $89,999","$90,000 to $99,999","$100,000 and over","$200,000 and over",Latitude_y,Longitude_y
53,0,M8W,Etobicoke,Alderwood,20,12054,36.3,298.7,16.2,116.1,...,165,335,300,320,275,250,1915,360,43.602,-79.545
54,3,M8W,Etobicoke,Long Branch,19,10084,61.7,585.1,12.3,119.0,...,230,400,360,310,295,230,1265,300,43.593,-79.538
55,3,M4X,Downtown Toronto,Cabbagetown,71,11669,102.3,1079.8,10.7,188.5,...,255,425,420,350,295,270,1925,700,43.665,-79.369
56,0,M4X,Downtown Toronto,St. James Town,71,11669,102.3,1079.8,10.7,188.5,...,490,815,600,525,400,340,1170,170,43.671,-79.373
57,4,M8Y,Etobicoke,Mimico NE,17,33964,299.2,959.8,37.3,176.7,...,675,1270,1255,1120,1105,920,5455,1245,43.614,-79.495
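
<p>K-means assigns clusters by distance, so a column with a large numeric range can dominate the result. The sketch below is not the clustering used above; it shows one common option, standardizing the features with <code>StandardScaler</code> before fitting, on synthetic data (made-up values chosen only to have very different scales).</p>

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic features: one column in the tens, one in the thousands.
rng = np.random.default_rng(0)
features = np.column_stack([
    rng.normal(30, 10, 60),
    rng.normal(3000, 1500, 60),
])

# Rescale every column to zero mean and unit variance so each contributes comparably.
scaled = StandardScaler().fit_transform(features)

model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(scaled)
labels = model.labels_
print(np.bincount(labels))
```

<p>Whether scaling is appropriate here depends on which features are meant to drive the grouping. Identifier columns such as <code>Hood_ID</code> carry no magnitude, so it may be worth checking how much they influence the clusters.</p>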


In [None]:
# merge the clustered neighbourhood data with each neighbourhood's most common venues
toronto_merged = pd.merge(combined_dataframe, neighbourhoods_venues_sorted, on="Neighbourhood")

toronto_merged.head() # check the last columns!

Unnamed: 0,Cluster Labels,Postcode_x,Borough_x,Neighbourhood,Hood_ID,Population,Assault_AVG,Assault_Rate_2019,AutoTheft_AVG,AutoTheft_Rate_2019,...,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,2,M3A,North York,Parkwoods,45,34805,159.7,454.0,31.5,91.9,...,Diner,Fast Food Restaurant,Restaurant,Drugstore,Bank,Grocery Store / Supermarket,Discount Store,Bakery,Café,"Cafes, Coffee, and Tea Houses"
1,1,M1B,Scarborough,Malvern,132,43794,278.2,760.4,47.2,162.1,...,Fast Food Restaurant,Pizzeria,Clothing Store,Bank,Park,Drugstore,Grocery Store / Supermarket,Furniture and Home Store,Chocolate Store,Fish and Chips Shop
2,0,M3B,North York,Don Mills North,42,27695,80.5,267.2,21.8,151.7,...,Bubble Tea Shop,Fast Food Restaurant,Pizzeria,Health and Beauty Service,Ramen Restaurant,Hair Salon,Korean Restaurant,Diner,Beer Store,Fried Chicken Joint
3,3,M6C,York,Humewood-Cedarvale,106,14365,46.3,320.2,16.2,111.4,...,Bank,Fast Food Restaurant,Bakery,BBQ Joint,Café,Indian Restaurant,Italian Restaurant,Playground,Diner,Ice Cream Parlor
4,0,M9C,Etobicoke,Eringate,11,18588,54.0,236.7,29.7,220.6,...,Restaurant,Bank,Fast Food Restaurant,Baseball Field,Park,Convenience Store,Pizzeria,Lounge,Golf Course,Go Kart Track


In [None]:
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=10)

# one colour per cluster label
colors_array = cm.rainbow(np.linspace(0, 1, k))
rainbow = [colors.rgb2hex(i) for i in colors_array]

for average, lat, lon, neighbourhood, cluster in zip(combined_dataframe["Homicide_AVG"], combined_dataframe['Latitude_y'], combined_dataframe['Longitude_y'], combined_dataframe['Neighbourhood'], combined_dataframe['Cluster Labels']):
    label = folium.Popup('Neighbourhood: ' + str(neighbourhood) + " Homicide Average: " + str(average), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)


map_clusters

<h3><b>Examine Clusters</b></h3>

In [None]:
# Define display function
def display_cluster(n):
    filtered_columns = ["Cluster Labels", "Neighbourhood" , "Borough_y", "1st Most Common Venue", "2nd Most Common Venue", "3rd Most Common Venue", "4th Most Common Venue", "5th Most Common Venue", "6th Most Common Venue", "7th Most Common Venue", "8th Most Common Venue", "9th Most Common Venue", "10th Most Common Venue"]
    cluster_elts = toronto_merged.loc[toronto_merged['Cluster Labels'] == n, toronto_merged.columns[:]]
    cluster_elts = cluster_elts[filtered_columns]
    cluster_elts.reset_index(drop=True, inplace=True)
    print ('Cluster {} has {} Neighbourhood(s)\n'.format(n, cluster_elts.shape[0]))
    return cluster_elts

<h5><b>Cluster 0</b></h5>

In [None]:
display_cluster(0)

Cluster 0 has 18 Neighbourhood(s)



Unnamed: 0,Cluster Labels,Neighbourhood,Borough_y,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,0,Don Mills North,North York,Bubble Tea Shop,Fast Food Restaurant,Pizzeria,Health and Beauty Service,Ramen Restaurant,Hair Salon,Korean Restaurant,Diner,Beer Store,Fried Chicken Joint
1,0,Eringate,Etobicoke,Restaurant,Bank,Fast Food Restaurant,Baseball Field,Park,Convenience Store,Pizzeria,Lounge,Golf Course,Go Kart Track
2,0,Morningside,Scarborough,Fast Food Restaurant,Coffee Shop,Park,Restaurant,Drugstore,Pizzeria,Property Management Office,Elementary School,Sports and Recreation,Burger Joint
3,0,Hillcrest Village,North York,Fast Food Restaurant,Chinese Restaurant,Bank,Park,Drugstore,Bakery,Grocery Store / Supermarket,Beer Store,Japanese Restaurant,Restaurant
4,0,Bathurst Manor,North York,Fast Food Restaurant,Grocery Store / Supermarket,Bank,Restaurant,Drugstore,Ice Cream Parlor,Pizzeria,Diner,Park,Coffee Shop
5,0,Dufferin,West Toronto,Pizzeria,Bakery,Sushi Restaurant,Diner,Italian Restaurant,Bank,BBQ Joint,Ice Cream Parlor,American Restaurant,Brazilian Restaurant
6,0,Fairview,North York,Middle Eastern Restaurant,Restaurant,Drugstore,Clothing Store,Toy / Game Store,Fish and Chips Shop,Bakery,Coffee Shop,Burger Joint,Retail
7,0,Ionview,Scarborough,Fast Food Restaurant,Restaurant,Grocery Store / Supermarket,Burger Joint,Bank,Asian Restaurant,Car Dealership,Department Store,Discount Store,Furniture and Home Store
8,0,Riverdale,East Toronto,Greek Restaurant,Bakery,Pizzeria,Diner,Ice Cream Parlor,Pub,Brewery,Italian Restaurant,Burger Joint,Butcher
9,0,Oakridge,Scarborough,Bank,Fast Food Restaurant,Grocery Store / Supermarket,Restaurant,Golf Course,Pub,Chinese Restaurant,Retail,Park,Discount Store


<h5><b>Cluster 1</b></h5>

In [None]:
display_cluster(1)

Cluster 1 has 3 Neighbourhood(s)



Unnamed: 0,Cluster Labels,Neighbourhood,Borough_y,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,1,Malvern,Scarborough,Fast Food Restaurant,Pizzeria,Clothing Store,Bank,Park,Drugstore,Grocery Store / Supermarket,Furniture and Home Store,Chocolate Store,Fish and Chips Shop
1,1,Agincourt,Scarborough,Chinese Restaurant,Fast Food Restaurant,Asian Restaurant,Restaurant,Cantonese Restaurant,Video Games Store,Bank,Bubble Tea Shop,Drugstore,Diner
2,1,Milliken,Scarborough,Chinese Restaurant,Bank,Bakery,Japanese Restaurant,Steakhouse,Bubble Tea Shop,Fast Food Restaurant,Vegan and Vegetarian Restaurant,Asian Restaurant,Restaurant


<h5><b>Cluster 2</b></h5>

In [None]:
display_cluster(2)

Cluster 2 has 8 Neighbourhood(s)



Unnamed: 0,Cluster Labels,Neighbourhood,Borough_y,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,2,Parkwoods,North York,Diner,Fast Food Restaurant,Restaurant,Drugstore,Bank,Grocery Store / Supermarket,Discount Store,Bakery,Café,"Cafes, Coffee, and Tea Houses"
1,2,West Hill,Scarborough,Restaurant,Fast Food Restaurant,Bank,Coffee Shop,Grocery Store / Supermarket,Discount Store,Park,BBQ Joint,Scenic Lookout,Storage Facility
2,2,York University,North York,Restaurant,Fast Food Restaurant,Sushi Restaurant,Drugstore,Stadium,Bank,Chinese Restaurant,Burger Joint,Caribbean Restaurant,Brewery
3,2,Dorset Park,Scarborough,Restaurant,Fast Food Restaurant,Coffee Shop,Diner,Burger Joint,Grocery Store / Supermarket,Pet Supplies Store,Bar,Asian Restaurant,Chinese Restaurant
4,2,Wexford,Scarborough,Diner,Fast Food Restaurant,Restaurant,Bakery,Grocery Store / Supermarket,Discount Store,Bank,Seafood Restaurant,Burger Joint,Sports Bar
5,2,Kingsview Village,Etobicoke,Fast Food Restaurant,Bank,Chinese Restaurant,Restaurant,Drugstore,Grocery Store / Supermarket,Park,Beer Store,Middle Eastern Restaurant,Golf Course
6,2,Sullivan,Scarborough,Fast Food Restaurant,Restaurant,Chinese Restaurant,Drugstore,Discount Store,Falafel Restaurant,Bank,Burrito Restaurant,Vietnamese Restaurant,Seafood Restaurant
7,2,Tam O'Shanter,Scarborough,Fast Food Restaurant,Restaurant,Chinese Restaurant,Drugstore,Discount Store,Falafel Restaurant,Bank,Burrito Restaurant,Vietnamese Restaurant,Seafood Restaurant


<h5><b>Cluster 3</b></h5>

In [None]:
display_cluster(3)

Cluster 3 has 12 Neighbourhood(s)



Unnamed: 0,Cluster Labels,Neighbourhood,Borough_y,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,3,Humewood-Cedarvale,York,Bank,Fast Food Restaurant,Bakery,BBQ Joint,Café,Indian Restaurant,Italian Restaurant,Playground,Diner,Ice Cream Parlor
1,3,Markland Wood,Etobicoke,Fast Food Restaurant,Bank,Grocery Store / Supermarket,Greek Restaurant,Diner,Liquor Store,Drugstore,Restaurant,Coffee Shop,"Cafes, Coffee, and Tea Houses"
2,3,Guildwood,Scarborough,Fast Food Restaurant,Park,Sports Bar,Bank,Restaurant,Pizzeria,Convenience Store,Fish and Chips Shop,Garden,Chinese Restaurant
3,3,The Beaches,East Toronto,Beach,Pub,Bakery,Restaurant,Park,Caterer,Bistro,Bookstore,Asian Restaurant,Coffee Shop
4,3,Leaside,East York,Bakery,Fast Food Restaurant,Indian Restaurant,Butcher,Sushi Restaurant,Thai Restaurant,Bar,Asian Restaurant,Grocery Store / Supermarket,Coffee Shop
5,3,Henry Farm,North York,Furniture and Home Store,Coffee Shop,Thai Restaurant,Restaurant,Park,Fast Food Restaurant,Pizzeria,Diner,Seafood Restaurant,Grocery Store / Supermarket
6,3,Lawrence Park,Central Toronto,Italian Restaurant,Bakery,Grocery Store / Supermarket,Sushi Restaurant,Café,Tattoo Parlor,Restaurant,Fast Food Restaurant,Coffee Shop,Dog Park
7,3,University of Toronto,Downtown Toronto,Café,Middle Eastern Restaurant,Ramen Restaurant,Bar,Italian Restaurant,Lounge,Asian Restaurant,Vegan and Vegetarian Restaurant,Comedy Club,Bookstore
8,3,Runnymede,West Toronto,Café,Bakery,Italian Restaurant,Pizzeria,Ice Cream Parlor,Burger Joint,Health and Beauty Service,Falafel Restaurant,Dive Bar,Sports Club
9,3,South Niagara,Downtown Toronto,Restaurant,Park,Café,Coffee Shop,Bar,Bank,Japanese Restaurant,Sports and Recreation,Hair Salon,Italian Restaurant


<h5><b>Cluster 4</b></h5>

In [None]:
display_cluster(4)

Cluster 4 has 9 Neighbourhood(s)



Unnamed: 0,Cluster Labels,Neighbourhood,Borough_y,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,4,Caledonia-Fairbanks,York,Bakery,Bank,Grocery Store / Supermarket,BBQ Joint,Diner,Furniture and Home Store,Italian Restaurant,Drugstore,Fast Food Restaurant,Burger Joint
1,4,Thorncliffe Park,East York,Fast Food Restaurant,Bakery,Grocery Store / Supermarket,BBQ Joint,Indian Restaurant,Sushi Restaurant,Burger Joint,Butcher,Bar,Ice Cream Parlor
2,4,Scarborough Village,Scarborough,Fast Food Restaurant,Restaurant,Park,Diner,Beach,Convenience Store,Beer Store,Garden,Playground,Sports Bar
3,4,Trinity,West Toronto,Bar,Café,Hair Salon,Bakery,Korean Restaurant,Japanese Restaurant,Coffee Shop,Restaurant,Spa,Butcher
4,4,Keelesdale,York,Bakery,Fast Food Restaurant,Grocery Store / Supermarket,Brewery,Department Store,Restaurant,Burger Joint,Café,Bank,Candy Store
5,4,Mount Dennis,York,Fast Food Restaurant,Restaurant,Park,Bank,Bakery,Brewery,Diner,Grocery Store / Supermarket,Café,Department Store
6,4,Martin Grove Gardens,Etobicoke,Restaurant,Fast Food Restaurant,Bank,Drugstore,Convenience Store,Wings Joint,Asian Restaurant,Grocery Store / Supermarket,Coffee Shop,Bakery
7,4,Chinatown,Downtown Toronto,Café,Coffee Shop,Diner,Bar,Restaurant,Pizzeria,Music Venue,Ramen Restaurant,Brewery,New American Restaurant
8,4,Mimico NE,Etobicoke,Pizzeria,Park,Italian Restaurant,Thai Restaurant,Café,Diner,Bank,Asian Restaurant,Fast Food Restaurant,Ice Cream Parlor


<h5><b>Cluster 5</b></h5>

In [None]:
display_cluster(5)

Cluster 5 has 3 Neighbourhood(s)



Unnamed: 0,Cluster Labels,Neighbourhood,Borough_y,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,5,Woburn,Scarborough,Bank,Restaurant,Coffee Shop,Diner,Indian Restaurant,Fried Chicken Joint,Bar,Drugstore,Chinese Restaurant,Bakery
1,5,Jamestown,Etobicoke,Café,Bar,Park,Ramen Restaurant,Pizzeria,BBQ Joint,Diner,Grocery Store / Supermarket,Thai Restaurant,Gay Bar
2,5,L'Amoreaux West,Scarborough,Diner,Fast Food Restaurant,Restaurant,Bank,Chinese Restaurant,Drugstore,Middle Eastern Restaurant,Bakery,Falafel Restaurant,Burrito Restaurant


<h3><b>Conclusion</b></h3>

<p></p>