<h1 style='text-align: center'>Week 3 Assignment, Applied Data Science Capstone</h1>
<h2 style='text-align: center'>Segmenting and Clustering Neighborhoods in Toronto</h2>

<h3> Import libraries and load API keys </h3>

In [1]:
import requests
import pandas as pd
import numpy as np
import seaborn as sns
from sklearn.cluster import KMeans
import folium
from geopy.geocoders import Nominatim
import matplotlib.cm as cm
import matplotlib.colors as colors

print('Libraries imported.')

Libraries imported.


In [2]:
import config

FSAPI = config.FourSquareAPI()
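The `config` module itself is not part of this notebook. A minimal sketch of what such a helper might look like is given here; only the attribute names `id`, `secret` and `version` come from how `FSAPI` is used later, while the environment variable names and the default version date are assumptions made for illustration.

```python
import os
from dataclasses import dataclass, field


@dataclass
class FourSquareAPI:
    # Hypothetical stand-in for config.FourSquareAPI: credentials are read
    # from environment variables (assumed names) instead of being hard-coded.
    id: str = field(default_factory=lambda: os.environ.get('FOURSQUARE_CLIENT_ID', ''))
    secret: str = field(default_factory=lambda: os.environ.get('FOURSQUARE_CLIENT_SECRET', ''))
    # The API version parameter is a date in YYYYMMDD form; this value is a placeholder.
    version: str = field(default_factory=lambda: os.environ.get('FOURSQUARE_VERSION', '20200101'))


example_api = FourSquareAPI()
print(example_api.version)
```

Keeping the keys outside the notebook means they do not end up in version control together with the analysis.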

<h3> Data read-in from Wikipedia </h3>

In [3]:
src_url = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'

response = requests.get(src_url)

df_raw = pd.read_html(response.text)[0]
df_raw.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"


<h3> Data cleaning - the raw data is kept, just in case. </h3>

In [4]:
df_proc = df_raw.copy()

Rename the columns appropriately

In [5]:
df_proc.columns = ['PostalCode', 'Borough', 'Neighborhood']

Drop any boroughs which are not assigned:

In [6]:
df_proc = df_proc[df_proc['Borough'] != 'Not assigned'].reset_index(drop=True)

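Replace any neighborhood that is not assigned with the name of its borough:

In [7]:
# Assign via .loc; replace(..., inplace=True) on a column selection may not update df_proc itself
not_assigned = df_proc['Neighborhood'] == 'Not assigned'
df_proc.loc[not_assigned, 'Neighborhood'] = df_proc.loc[not_assigned, 'Borough']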

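Group the dataframe by postal code so that each code appears once, joining its neighborhoods with commas:

In [8]:
# A plain .sum() would concatenate the strings without a separator and repeat the borough name
df_proc = df_proc.groupby('PostalCode', as_index=False).agg({'Borough': 'first', 'Neighborhood': ', '.join})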
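The cleaning steps can be tried end to end on a small made-up table. The rows below are hypothetical and only mimic the layout of the Wikipedia table; joining neighborhoods with `', '.join` is one possible way to combine rows that share a postal code.

```python
import pandas as pd

# Hypothetical rows in the same layout as the scraped table (not real data)
toy = pd.DataFrame({
    'PostalCode': ['M1A', 'M5A', 'M5A', 'M7A'],
    'Borough': ['Not assigned', 'Downtown Toronto', 'Downtown Toronto', "Queen's Park"],
    'Neighborhood': ['Not assigned', 'Regent Park', 'Harbourfront', 'Not assigned'],
})

# 1. Drop rows without a borough
toy = toy[toy['Borough'] != 'Not assigned'].reset_index(drop=True)

# 2. Fall back to the borough name where the neighborhood is missing
missing = toy['Neighborhood'] == 'Not assigned'
toy.loc[missing, 'Neighborhood'] = toy.loc[missing, 'Borough']

# 3. One row per postal code, neighborhoods joined with a comma
toy_clean = (toy.groupby('PostalCode', as_index=False)
                .agg({'Borough': 'first', 'Neighborhood': ', '.join}))
print(toy_clean)
```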

<h3>Take a look at the resulting dataframe:</h3>

In [9]:
df_proc.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M1B,Scarborough,"Malvern, Rouge"
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek"
2,M1E,Scarborough,"Guildwood, Morningside, West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae


In [10]:
df_proc.describe()

Unnamed: 0,PostalCode,Borough,Neighborhood
count,103,103,103
unique,103,10,99
top,M1H,North York,Downsview
freq,1,24,4


In [11]:
df_proc.shape

(103, 3)

<h3> The dataframe is now cleaned and ready for further evaluation. </h3>
<h3> The next step is assigning latitude and longitude coordinates to each postal code using a geocoder (Nominatim via geopy). </h3>

<h3> Define a function for getting the required values. </h3>

In [12]:
from geopy.exc import GeocoderServiceError

def get_latlng(postal_codes, max_tries=20):
    '''
    Returns one list of latitudes and one list of longitudes for the given postal codes in Toronto, Ontario, CA.
    Postal codes that cannot be resolved get None in both lists.
    '''
    latitudes = []
    longitudes = []
    geolocator = Nominatim(user_agent='toronto_explorer')
    
    for code in postal_codes:
        location = None
        it = 0
        while location is None and it < max_tries: # Retry on timeouts or empty results
            try:
                location = geolocator.geocode('{}, Toronto, Ontario'.format(code))
            except GeocoderServiceError:
                # geopy raises on timeouts and service errors instead of returning None
                location = None
            it += 1
        if location is not None:
            latitudes.append(location.latitude)
            longitudes.append(location.longitude)
        else:
            print('No result for {} after {} attempts. Returning None.'.format(code, max_tries))
            latitudes.append(None)
            longitudes.append(None)
    return latitudes, longitudes
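The retry idea in `get_latlng` can be isolated into a small helper and tested without network access. This is a sketch under assumptions: `geocode_with_retry` and `flaky_geocode` are hypothetical names, and the fake geocoder merely simulates two timeouts before returning made-up coordinates.

```python
import time


def geocode_with_retry(geocode_fn, query, max_tries=3, delay=1.0):
    """Call geocode_fn(query) until it returns a result or max_tries is reached."""
    for attempt in range(max_tries):
        try:
            result = geocode_fn(query)
        except Exception:
            # With geopy one would catch geopy.exc.GeocoderServiceError here instead
            result = None
        if result is not None:
            return result
        if attempt < max_tries - 1:
            time.sleep(delay)  # Pause before the next attempt
    return None


# Fake geocoder standing in for geolocator.geocode: fails twice, then succeeds
calls = {'n': 0}

def flaky_geocode(query):
    calls['n'] += 1
    if calls['n'] < 3:
        raise TimeoutError('simulated timeout')
    return (43.65, -79.38)  # Made-up coordinates


result = geocode_with_retry(flaky_geocode, 'M5A, Toronto, Ontario', max_tries=5, delay=0.0)
print(result, 'after', calls['n'], 'calls')
```

A pause between attempts matters in practice, because public geocoding services usually limit the request rate.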

In [13]:
# Set to True to geocode with the function above. Not recommended: some of the postal codes are
# not found on Nominatim, so many lookups end in timeouts or None.
USE_NOMINATIM = False

if USE_NOMINATIM:
    latitudes, longitudes = get_latlng(df_proc['PostalCode'])
    df_proc['Latitude'] = latitudes
    df_proc['Longitude'] = longitudes
else:
    latlng = pd.read_csv('Geospatial_Coordinates.csv')
    # Merge on the postal code so the result does not depend on the row order of the CSV
    latlng = latlng.rename(columns={'Postal Code': 'PostalCode'})
    df_proc = df_proc.merge(latlng[['PostalCode', 'Latitude', 'Longitude']], on='PostalCode', how='left')
df_proc.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
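Attaching coordinates by key rather than by row position can be checked on toy data. The coordinates below are made up, deliberately listed in a different order, and one postal code is left out to show how the `indicator` column of `merge` exposes missing matches.

```python
import pandas as pd

codes = pd.DataFrame({'PostalCode': ['M1B', 'M1C', 'M1E']})

# Made-up coordinates, in a different order and with one code missing
coords = pd.DataFrame({'Postal Code': ['M1C', 'M1B'],
                       'Latitude': [43.78, 43.81],
                       'Longitude': [-79.16, -79.19]})

merged = codes.merge(coords.rename(columns={'Postal Code': 'PostalCode'}),
                     on='PostalCode', how='left',
                     validate='one_to_one',  # raise if a code occurs twice on either side
                     indicator=True)         # adds a '_merge' column

# Postal codes that found no coordinates
missing = merged.loc[merged['_merge'] == 'left_only', 'PostalCode'].tolist()
print(merged)
print('Missing:', missing)
```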


<h3> Now that the latitude and longitude of each postal code are known, the neighborhoods and their venues can be explored using the Foursquare API. As in the previous exercise, let's get up to 100 venues for every postal code within a radius of 1 km. </h3>

In [14]:
def get_venues(df_input, limit, radius):
    
    venues = []
    for code, name, lat, lng in zip(df_input['PostalCode'], df_input['Neighborhood'], df_input['Latitude'], df_input['Longitude']):
        # Progress output
        print(name)
        # Query parameters for the Foursquare "explore" endpoint
        params = {'client_id': FSAPI.id, 'client_secret': FSAPI.secret, 'v': FSAPI.version,
                  'll': '{},{}'.format(lat, lng), 'radius': radius, 'limit': limit}
        # HTTP GET request; keep only the list of recommended items
        response = requests.get('https://api.foursquare.com/v2/venues/explore', params=params)
        results = response.json()['response']['groups'][0]['items']
        
        venues.append([(
            code,
            name,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])
    
    nearby_venues = pd.DataFrame([item for venue_list in venues for item in venue_list])
    nearby_venues.columns = ['PostalCode', 'Neighborhood', 'Neighborhood Latitude',
                             'Neighborhood Longitude', 'Venue Name', 'Venue Latitude',
                             'Venue Longitude', 'Venue Category']
    
    return nearby_venues
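`get_venues` reads each venue from the nested path `response` → `groups[0]` → `items` → `venue`. The sketch below flattens a made-up payload of that shape without calling the API; `flatten_items` is a hypothetical helper, and the handling of an empty `categories` list is an added precaution that the function above does not have.

```python
import pandas as pd

# Made-up payload shaped like the structure that get_venues indexes into
mock_json = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Example Cafe',
               'location': {'lat': 43.80, 'lng': -79.19},
               'categories': [{'name': 'Café'}]}},
    {'venue': {'name': 'Example Park',
               'location': {'lat': 43.81, 'lng': -79.20},
               'categories': []}},  # no category reported
]}]}}


def flatten_items(payload):
    """Turn the nested items list into one row per venue."""
    rows = []
    for item in payload['response']['groups'][0]['items']:
        venue = item['venue']
        categories = venue.get('categories') or []
        rows.append({
            'Venue Name': venue['name'],
            'Venue Latitude': venue['location']['lat'],
            'Venue Longitude': venue['location']['lng'],
            # Avoid an IndexError when a venue has no category
            'Venue Category': categories[0]['name'] if categories else None,
        })
    return pd.DataFrame(rows)


flat = flatten_items(mock_json)
print(flat)
```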

In [15]:
toronto_venues = get_venues(df_proc, limit=100, radius=1000)

Malvern, Rouge
Rouge Hill, Port Union, Highland Creek
Guildwood, Morningside, West Hill
Woburn
Cedarbrae
Scarborough Village
Kennedy Park, Ionview, East Birchmount Park
Golden Mile, Clairlea, Oakridge
Cliffside, Cliffcrest, Scarborough Village West
Birch Cliff, Cliffside West
Dorset Park, Wexford Heights, Scarborough Town Centre
Wexford, Maryvale
Agincourt
Clarks Corners, Tam O'Shanter, Sullivan
Milliken, Agincourt North, Steeles East, L'Amoreaux East
Steeles West, L'Amoreaux West
Upper Rouge
Hillcrest Village
Fairview, Henry Farm, Oriole
Bayview Village
York Mills, Silver Hills
Willowdale, Newtonbrook
Willowdale, Willowdale East
York Mills West
Willowdale, Willowdale West
Parkwoods
Don Mills
Don Mills
Bathurst Manor, Wilson Heights, Downsview North
Northwood Park, York University
Downsview
Downsview
Downsview
Downsview
Victoria Village
Parkview Hill, Woodbine Gardens
Woodbine Heights
The Beaches
Leaside
Thorncliffe Park
East Toronto, Broadview North (Old East York)
The Danforth West, 

<h3> Clean and explore this dataset a bit. </h3>

In [16]:
# Clean any venue that is labeled 'Neighborhood' to avoid confusion down the road
toronto_venues = toronto_venues[toronto_venues['Venue Category']!='Neighborhood']
toronto_venues.head()

Unnamed: 0,PostalCode,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue Name,Venue Latitude,Venue Longitude,Venue Category
0,M1B,"Malvern, Rouge",43.806686,-79.194353,Images Salon & Spa,43.802283,-79.198565,Spa
1,M1B,"Malvern, Rouge",43.806686,-79.194353,Harvey's,43.80002,-79.198307,Restaurant
2,M1B,"Malvern, Rouge",43.806686,-79.194353,RBC Royal Bank,43.798782,-79.19709,Bank
3,M1B,"Malvern, Rouge",43.806686,-79.194353,Wendy’s,43.807448,-79.199056,Fast Food Restaurant
4,M1B,"Malvern, Rouge",43.806686,-79.194353,Wendy's,43.802008,-79.19808,Fast Food Restaurant


In [17]:
toronto_venues.shape

(4888, 8)

<h3> Convert the venue categories to numeric values with one-hot encoding, so they can later be used for clustering. </h3>

In [18]:
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix='', prefix_sep='', columns=['Venue Category'])
toronto_onehot.insert(0, 'Neighborhood', toronto_venues['Neighborhood'])
toronto_onehot.head()

Unnamed: 0,Neighborhood,ATM,Accessories Store,Adult Boutique,Afghan Restaurant,African Restaurant,Airport,Airport Lounge,American Restaurant,Amphitheater,...,Video Store,Vietnamese Restaurant,Warehouse Store,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio,Zoo
0,"Malvern, Rouge",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,"Malvern, Rouge",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,"Malvern, Rouge",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,"Malvern, Rouge",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,"Malvern, Rouge",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


<h3> Group the dataframe by neighborhoods to explore the most common venues per neighborhood </h3>

In [19]:
toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()
toronto_grouped.head()

Unnamed: 0,Neighborhood,ATM,Accessories Store,Adult Boutique,Afghan Restaurant,African Restaurant,Airport,Airport Lounge,American Restaurant,Amphitheater,...,Video Store,Vietnamese Restaurant,Warehouse Store,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio,Zoo
0,Agincourt,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,"Alderwood, Long Branch",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Bathurst Manor, Wilson Heights, Downsview North",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Bayview Village,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"Bedford Park, Lawrence Manor East",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.0


In [20]:
toronto_grouped.shape

(98, 330)
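Why the grouped mean of one-hot columns yields category frequencies can be seen on a tiny example. The two neighborhoods and their venues are made up; the mean of a 0/1 column is simply the share of that neighborhood's venues that fall into the category.

```python
import pandas as pd

# Hypothetical venues for two made-up neighborhoods
toy_venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Park', 'Café', 'Café', 'Bank', 'Park', 'Park'],
})

# One column per category, 1 where the venue belongs to it
onehot = pd.get_dummies(toy_venues['Venue Category'], dtype=int)
onehot.insert(0, 'Neighborhood', toy_venues['Neighborhood'])

# Mean of the 0/1 columns per neighborhood = relative frequency of each category
freq = onehot.groupby('Neighborhood').mean()
print(freq)
```

Each row of `freq` sums to 1, so neighborhoods with many venues and neighborhoods with few venues become comparable.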

<h3> Let's get the 10 most common venues for every neighborhood </h3>

In [21]:
no_top_venues = 10
columns = ['Neighborhood']
for ind in np.arange(no_top_venues):
    if ind==0:
        columns.append('Most Common Venue')
    elif ind==1:
        columns.append('2nd Most Common Venue')
    elif ind==2:
        columns.append('3rd Most Common Venue')
    else:
        columns.append('{}th Most Common Venue'.format(ind+1))

venues_grouped_sorted = pd.DataFrame(columns=columns)
venues_grouped_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for index, row in toronto_grouped.iterrows():
    # Exclude the neighborhood name, then sort the frequencies in descending order
    row = row.iloc[1:].sort_values(ascending=False)
    venues_grouped_sorted.iloc[index, 1:] = row.index.values[0:no_top_venues]
    
venues_grouped_sorted.head()

Unnamed: 0,Neighborhood,Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Agincourt,Chinese Restaurant,Shopping Mall,Sandwich Place,Pizza Place,Restaurant,Caribbean Restaurant,Bakery,Malay Restaurant,Lounge,Motorcycle Shop
1,"Alderwood, Long Branch",Discount Store,Pizza Place,Park,Pharmacy,Moroccan Restaurant,Dance Studio,Garden Center,Gas Station,Donut Shop,Bagel Shop
2,"Bathurst Manor, Wilson Heights, Downsview North",Pizza Place,Coffee Shop,Bank,Pharmacy,Convenience Store,Fried Chicken Joint,Sushi Restaurant,Supermarket,Mediterranean Restaurant,Community Center
3,Bayview Village,Bank,Gas Station,Grocery Store,Japanese Restaurant,Restaurant,Chinese Restaurant,Café,Trail,Park,Skating Rink
4,"Bedford Park, Lawrence Manor East",Coffee Shop,Italian Restaurant,Bank,Park,Sandwich Place,Juice Bar,Thai Restaurant,Baby Store,Bagel Shop,Bakery
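The ranking step can also be written with `Series.nlargest` and a small ordinal helper. This is an alternative sketch on made-up frequencies, not the notebook's own code; `ordinal` is a hypothetical helper and the column names differ slightly (`1st Most Common Venue` instead of `Most Common Venue`).

```python
import pandas as pd


def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1st, 2nd, 3rd, 4th, 11th."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)


# Made-up category frequencies for two neighborhoods
freq = pd.DataFrame(
    {'Bank': [0.25, 0.0], 'Café': [0.5, 0.1], 'Park': [0.25, 0.9]},
    index=pd.Index(['A', 'B'], name='Neighborhood'))

top_n = 2
labels = [ordinal(i + 1) + ' Most Common Venue' for i in range(top_n)]

# For every row, take the names of the top_n largest frequencies
top = freq.apply(lambda row: pd.Series(row.nlargest(top_n).index.values, index=labels), axis=1)
print(top)
```

Categories with equal frequency are ranked in column order, so the lower ranks of sparse neighborhoods are largely arbitrary.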


<h3> Now we can cluster the neighborhoods by their venue category frequencies! </h3>

In [22]:
# No. of clusters
kclusters = 5

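toronto_grouped_clustering = toronto_grouped.drop(columns='Neighborhood')

# n_init=10 was the default in older scikit-learn releases; set it explicitly for consistent behavior
kmeans = KMeans(n_clusters=kclusters, random_state=0, n_init=10).fit(toronto_grouped_clustering)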

kmeans.labels_

array([0, 0, 2, 2, 2, 2, 1, 2, 2, 2, 0, 2, 2, 2, 2, 2, 0, 0, 2, 2, 2, 1,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 2, 2, 0, 1, 1, 2, 2, 1, 0,
       2, 0, 2, 2, 2, 2, 2, 0, 2, 2, 2, 2, 2, 3, 0, 1, 2, 2, 1, 2, 2, 2,
       2, 2, 1, 2, 2, 0, 0, 2, 2, 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1,
       0, 0, 0, 2, 2, 0, 2, 2, 1, 4])
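Counting how many neighborhoods fall into each cluster gives a quick sanity check of the result. The labels below are copied from the output above; `np.bincount` is used here as one simple way to tally them.

```python
import numpy as np

# Cluster labels copied from the kmeans.labels_ output above
labels = np.array([
    0, 0, 2, 2, 2, 2, 1, 2, 2, 2, 0, 2, 2, 2, 2, 2, 0, 0, 2, 2, 2, 1,
    2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 2, 2, 0, 1, 1, 2, 2, 1, 0,
    2, 0, 2, 2, 2, 2, 2, 0, 2, 2, 2, 2, 2, 3, 0, 1, 2, 2, 1, 2, 2, 2,
    2, 2, 1, 2, 2, 0, 0, 2, 2, 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1,
    0, 0, 0, 2, 2, 0, 2, 2, 1, 4])

# Number of neighborhoods per cluster label
sizes = np.bincount(labels)
for cluster, size in enumerate(sizes):
    print('Cluster {}: {} neighborhoods'.format(cluster, size))
```

Clusters 3 and 4 contain a single neighborhood each, while cluster 2 holds the majority, which is worth keeping in mind when interpreting the groups.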

<h3> Add the cluster labels to the sorted venue table and join it with the postal code dataframe. </h3>

In [23]:
try:
    venues_grouped_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
except ValueError:
    pass

toronto_merged = df_proc.copy()
toronto_merged = toronto_merged.join(venues_grouped_sorted.set_index('Neighborhood'), on='Neighborhood')
# Neighborhood names that occur under several postal codes (e.g. Downsview) all match the same
# grouped row and therefore share one cluster label. Postal codes whose neighborhood has no row in
# venues_grouped_sorted (no venues were retrieved or kept for it) end up with NaNs after the join
# and are dropped for the sake of the exercise.

toronto_merged = toronto_merged[~toronto_merged['Cluster Labels'].isnull()]

toronto_merged.head()


Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353,2.0,Fast Food Restaurant,Trail,Coffee Shop,Restaurant,Spa,Supermarket,Bank,Bakery,Caribbean Restaurant,Chinese Restaurant
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497,1.0,Breakfast Spot,Playground,Park,Burger Joint,Italian Restaurant,Fireworks Store,Falafel Restaurant,Eastern European Restaurant,Electronics Store,Flea Market
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711,0.0,Pizza Place,Restaurant,Fast Food Restaurant,Bank,Pharmacy,Liquor Store,Supermarket,Greek Restaurant,Grocery Store,Sandwich Place
3,M1G,Scarborough,Woburn,43.770992,-79.216917,2.0,Coffee Shop,Park,Mobile Phone Shop,Indian Restaurant,Chinese Restaurant,Fast Food Restaurant,Farm,Eastern European Restaurant,Electronics Store,Elementary School
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476,2.0,Bakery,Gas Station,Bank,Indian Restaurant,Coffee Shop,Pizza Place,Sporting Goods Shop,Caribbean Restaurant,Fried Chicken Joint,Burger Joint


In [24]:
toronto_merged.shape

(102, 16)
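How a left join treats repeated and missing keys can be illustrated on toy data. The postal codes and the neighborhood `Nowhere` are made up; the point is that repeated names all receive the same label, while a name absent from the right-hand table produces NaN and turns the label column into floats.

```python
import pandas as pd

# Made-up postal codes: one neighborhood name appears twice, one has no cluster
left = pd.DataFrame({'PostalCode': ['X1', 'X2', 'X3'],
                     'Neighborhood': ['Downsview', 'Downsview', 'Nowhere']})
right = pd.DataFrame({'Neighborhood': ['Downsview'], 'Cluster Labels': [2]})

joined = left.join(right.set_index('Neighborhood'), on='Neighborhood')
print(joined)

# Drop unmatched rows, then restore integer labels
kept = joined.dropna(subset=['Cluster Labels']).copy()
kept['Cluster Labels'] = kept['Cluster Labels'].astype(int)
print(kept)
```

Casting back to integers after dropping the NaN rows avoids labels such as `2.0` in later output.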

<h3> Now that we have acquired and modeled all the required data, let's visualize it using Folium. </h3>

In [51]:
geolocator = Nominatim(user_agent='toronto_explorer')
location = geolocator.geocode('Toronto, Ontario')
color_list = [colors.rgb2hex(i) for i in cm.hot(np.linspace(0, 1, kclusters))]

map_clusters = folium.Map(location=[location.latitude, location.longitude], zoom_start=11)

for lat, lng, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighborhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster: ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        color=color_list[int(cluster)],
        popup=label).add_to(map_clusters)

map_clusters
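To see which marker color belongs to which cluster label, the color list can be printed on its own. This sketch assumes matplotlib 3.5 or newer for the `matplotlib.colormaps` registry and samples the same `hot` colormap at evenly spaced positions.

```python
import numpy as np
import matplotlib
from matplotlib import colors

kclusters = 5  # same value as in the clustering cell

# Sample the 'hot' colormap at kclusters evenly spaced positions
hot = matplotlib.colormaps['hot']
color_list = [colors.rgb2hex(rgba) for rgba in hot(np.linspace(0, 1, kclusters))]

for cluster, hex_color in enumerate(color_list):
    print('Cluster {}: {}'.format(cluster, hex_color))
```

The colormap runs from near-black through red and yellow to white, so low cluster labels get dark markers and the highest label gets a white one.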

<h3> Appended is an image of the Folium map as Folium is not displayed on GitHub </h3>

<img src=toronto_clusters.jpg>

<h3> The result of the clustering is actually quite nice - we can clearly see Toronto's downtown area (orange), as well as more suburban areas (black and red) and two outliers (yellow, white). For further examination, let's take a look at the cluster groups! </h3>

In [27]:
# For convenient display
from IPython.display import display_html

html_str = ''
for i in range(kclusters):
    html_str += 'Cluster Group: ' + str(i) + '<br>'
    df = toronto_merged[toronto_merged['Cluster Labels'] == i]
    # Keep the neighborhood name (column 2) and the most-common-venue columns (6 onward)
    df = df.iloc[:, [2] + list(np.arange(6, toronto_merged.shape[1]))]
    html_str += df.to_html()
    html_str += 2*'<br>'
# Style only the opening <table> tags so the closing tags stay valid HTML
display_html(html_str.replace('<table', '<table style="display:inline"'), raw=True)

Cluster Group: 0
Unnamed: 0,Neighborhood,Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,"Guildwood, Morningside, West Hill",Pizza Place,Restaurant,Fast Food Restaurant,Bank,Pharmacy,Liquor Store,Supermarket,Greek Restaurant,Grocery Store,Sandwich Place
5,Scarborough Village,Convenience Store,Ice Cream Shop,Bowling Alley,Fast Food Restaurant,Sandwich Place,Grocery Store,Coffee Shop,Intersection,Restaurant,Japanese Restaurant
6,"Kennedy Park, Ionview, East Birchmount Park",Coffee Shop,Chinese Restaurant,Pizza Place,Discount Store,Grocery Store,Fast Food Restaurant,Rental Car Location,Light Rail Station,Bank,Asian Restaurant
8,"Cliffside, Cliffcrest, Scarborough Village West",Pizza Place,Beach,Ice Cream Shop,Sports Bar,Restaurant,Auto Garage,Park,Pharmacy,Field,Fireworks Store
11,"Wexford, Maryvale",Middle Eastern Restaurant,Pizza Place,Burger Joint,Flea Market,Grocery Store,Soccer Field,Fish Market,Seafood Restaurant,Supermarket,Korean Restaurant
12,Agincourt,Chinese Restaurant,Shopping Mall,Sandwich Place,Pizza Place,Restaurant,Caribbean Restaurant,Bakery,Malay Restaurant,Lounge,Motorcycle Shop
13,"Clarks Corners, Tam O'Shanter, Sullivan",Sandwich Place,Bank,Pizza Place,Convenience Store,Pharmacy,Fast Food Restaurant,Coffee Shop,Restaurant,Gas Station,Market
14,"Milliken, Agincourt North, Steeles East, L'Amoreaux East",Chinese Restaurant,Pizza Place,Park,Bakery,Intersection,Pharmacy,Dessert Shop,Coffee Shop,Caribbean Restaurant,Japanese Restaurant
15,"Steeles West, L'Amoreaux West",Chinese Restaurant,Bakery,Coffee Shop,Pizza Place,Intersection,Bank,Hotpot Restaurant,Caribbean Restaurant,Sandwich Place,Furniture / Home Store
17,Hillcrest Village,Pharmacy,Park,Coffee Shop,Pizza Place,Intersection,Fast Food Restaurant,Shopping Mall,Sandwich Place,Bank,Bakery

Cluster Group: 1
Unnamed: 0,Neighborhood,Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,"Rouge Hill, Port Union, Highland Creek",Breakfast Spot,Playground,Park,Burger Joint,Italian Restaurant,Fireworks Store,Falafel Restaurant,Eastern European Restaurant,Electronics Store,Flea Market
9,"Birch Cliff, Cliffside West",Park,General Entertainment,Thai Restaurant,Ice Cream Shop,College Stadium,Café,Diner,Restaurant,Dessert Shop,Gym
23,York Mills West,Park,Restaurant,Coffee Shop,Gym,Pet Store,Dog Run,French Restaurant,Golf Course,Gas Station,Bubble Tea Shop
25,Parkwoods,Park,Pharmacy,Bus Stop,Shopping Mall,ATM,Shop & Service,Fast Food Restaurant,Tennis Court,Laundry Service,Café
80,"Del Ray, Mount Dennis, Keelsdale and Silverthorn",Furniture / Home Store,Grocery Store,Discount Store,Shopping Mall,Gas Station,Wine Shop,Playground,Fast Food Restaurant,Sandwich Place,Dessert Shop
91,"Old Mill South, King's Mill Park, Sunnylea, Humber Bay, Mimico NE, The Queensway East, Royal York South East, Kingsway Park South East",Park,Italian Restaurant,Bus Stop,Ice Cream Shop,Shopping Mall,Eastern European Restaurant,Gym / Fitness Center,Dumpling Restaurant,Electronics Store,Elementary School
93,"Islington Avenue, Humber Valley Village",Pharmacy,Bakery,Park,Convenience Store,Playground,Shopping Mall,Golf Course,Café,Grocery Store,Bank
94,"West Deane Park, Princess Gardens, Martin Grove, Islington, Cloverdale",Park,Hotel,Pizza Place,Restaurant,Theater,Grocery Store,Gym,Clothing Store,Bank,Mexican Restaurant
96,Humber Summit,Electronics Store,Italian Restaurant,Pharmacy,Shopping Mall,Park,Arts & Crafts Store,Bakery,Bank,Pizza Place,Fast Food Restaurant
97,"Humberlea, Emery",Golf Course,Bakery,Convenience Store,Storage Facility,Park,Discount Store,Gas Station,Event Space,Eastern European Restaurant,Electronics Store

Cluster Group: 2
Unnamed: 0,Neighborhood,Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Malvern, Rouge",Fast Food Restaurant,Trail,Coffee Shop,Restaurant,Spa,Supermarket,Bank,Bakery,Caribbean Restaurant,Chinese Restaurant
3,Woburn,Coffee Shop,Park,Mobile Phone Shop,Indian Restaurant,Chinese Restaurant,Fast Food Restaurant,Farm,Eastern European Restaurant,Electronics Store,Elementary School
4,Cedarbrae,Bakery,Gas Station,Bank,Indian Restaurant,Coffee Shop,Pizza Place,Sporting Goods Shop,Caribbean Restaurant,Fried Chicken Joint,Burger Joint
7,"Golden Mile, Clairlea, Oakridge",Intersection,Bus Line,Bakery,Coffee Shop,Convenience Store,Beer Store,Sandwich Place,Bank,Soccer Field,General Entertainment
10,"Dorset Park, Wexford Heights, Scarborough Town Centre",Electronics Store,Bakery,Coffee Shop,Restaurant,Pharmacy,Fast Food Restaurant,Furniture / Home Store,Light Rail Station,Indian Restaurant,Chinese Restaurant
18,"Fairview, Henry Farm, Oriole",Clothing Store,Coffee Shop,Sandwich Place,Juice Bar,Bakery,Bank,Japanese Restaurant,Restaurant,Fried Chicken Joint,Liquor Store
19,Bayview Village,Bank,Gas Station,Grocery Store,Japanese Restaurant,Restaurant,Chinese Restaurant,Café,Trail,Park,Skating Rink
21,"Willowdale, Newtonbrook",Korean Restaurant,Café,Pizza Place,Coffee Shop,Park,Diner,Middle Eastern Restaurant,Grocery Store,Dessert Shop,Sandwich Place
22,"Willowdale, Willowdale East",Coffee Shop,Bubble Tea Shop,Ramen Restaurant,Korean Restaurant,Pizza Place,Japanese Restaurant,Sandwich Place,Fast Food Restaurant,Middle Eastern Restaurant,Dessert Shop
26,Don Mills,Restaurant,Japanese Restaurant,Coffee Shop,Gym,Pizza Place,Supermarket,Burger Joint,Bank,Athletics & Sports,Asian Restaurant

Cluster Group: 3
Unnamed: 0,Neighborhood,Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
102,"Northwest, West Humber - Clairville",Hotel,Rental Car Location,Coffee Shop,Farm,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Elementary School,Ethiopian Restaurant,Event Space

Cluster Group: 4
Unnamed: 0,Neighborhood,Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
20,"York Mills, Silver Hills",Park,Pool,Zoo,Farm,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Elementary School,Ethiopian Restaurant,Event Space


<h3> Coffee shops are the most common venue in Cluster Group 2, the largest group, which covers the center of town. Cluster Group 1 looks more suburban, with an abundance of parks, malls and even golf courses, whereas Cluster Group 0 has a lot of pizza places and restaurants. Cluster Groups 3 and 4 each contain a single neighborhood and are outliers, with hotels, farms, zoos, parks and pools among their most common venues. </h3>
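Observations like these can be backed by a small per-cluster summary. The sketch below uses a made-up stand-in for `toronto_merged` with only the two relevant columns and reports, for each cluster, its size and the venue type that most often ranks first.

```python
import pandas as pd

# Made-up stand-in for toronto_merged with only the columns used here
toy_merged = pd.DataFrame({
    'Cluster Labels': [0, 0, 1, 1, 1, 2],
    'Most Common Venue': ['Pizza Place', 'Pizza Place', 'Park', 'Park',
                          'Golf Course', 'Coffee Shop'],
})

# Venue type that is most often ranked first within each cluster
summary = (toy_merged.groupby('Cluster Labels')['Most Common Venue']
                     .agg(lambda s: s.value_counts().idxmax()))

# Number of neighborhoods per cluster
sizes = toy_merged['Cluster Labels'].value_counts().sort_index()

print(pd.DataFrame({'Size': sizes, 'Typical top venue': summary}))
```

Applied to the real `toronto_merged`, the same two lines would show at a glance which clusters are large enough to interpret and which are single-neighborhood outliers.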