# Module 9 Week 3

## Section 1: Scrape Toronto Neighborhoods


1. The dataframe will consist of three columns: Postcode, Borough, and Neighborhood.
2. Only process the cells that have an assigned borough. Ignore cells whose borough is "Not assigned".
3. More than one neighborhood can exist in one postal code area. For example, in the table on the Wikipedia page, M5A is listed twice and has two neighborhoods: Harbourfront and Regent Park. These two rows will be combined into one row, with the neighborhoods separated by a comma.
4. If a cell has a borough but a "Not assigned" neighborhood, then the neighborhood will be the same as the borough. So for the 9th cell in the table on the Wikipedia page, the value of both the Borough and the Neighborhood columns will be Queen's Park.
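
The four rules above can be sketched on a small hand-made table. This is only an illustration, not the notebook's scraping code: the rows in `raw` are made up to mirror the examples in the list, and the sketch uses a pandas `groupby` in place of the row-by-row loop used below.

```python
import pandas as pd

# Hypothetical rows shaped like the Wikipedia table (values chosen to match the rules above).
raw = pd.DataFrame(
    [
        ("M1A", "Not assigned", "Not assigned"),
        ("M5A", "Downtown Toronto", "Harbourfront"),
        ("M5A", "Downtown Toronto", "Regent Park"),
        ("M7A", "Queen's Park", "Not assigned"),
    ],
    columns=["Postcode", "Borough", "Neighborhood"],
)

# Rule 2: keep only rows with an assigned borough.
assigned = raw[raw["Borough"] != "Not assigned"].copy()

# Rule 4: an unassigned neighborhood takes the borough's name.
unassigned = assigned["Neighborhood"] == "Not assigned"
assigned.loc[unassigned, "Neighborhood"] = assigned.loc[unassigned, "Borough"]

# Rule 3: combine neighborhoods that share a postal code into one comma-separated row.
tidy = (
    assigned.groupby(["Postcode", "Borough"], sort=False)["Neighborhood"]
    .agg(", ".join)
    .reset_index()
)
print(tidy)
```

Passing `sort=False` keeps the postal codes in the order they first appear, which matches the order of the source table.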

In [1]:
import requests
from bs4 import BeautifulSoup
import pandas as pd
import numpy as np
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)


In [2]:
website_html=requests.get('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M').text

In [3]:
soup=BeautifulSoup(website_html,'lxml')

In [4]:
My_table = soup.find('table',{'class':'wikitable sortable'})

In [5]:
columns=['Postcode','Borough','Neighborhood']
rows=[]    # one dict per postcode, in table order
index={}   # postcode -> position of its dict in rows
for tr in My_table.find_all('tr'):
    tds = tr.find_all('td')
    if len(tds) < 3:
        continue  # header row or incomplete row
    postcode, borough, neighborhood = [td.text.strip() for td in tds[:3]]
    # rule 2: skip cells without an assigned borough
    if borough == "Not assigned":
        continue
    # rule 4: an unassigned neighborhood takes the borough's name
    if neighborhood == "Not assigned":
        neighborhood = borough
    if postcode in index:
        # rule 3: repeated postcode, so append to the existing row
        rows[index[postcode]]['Neighborhood'] += ", " + neighborhood
    else:
        index[postcode] = len(rows)
        rows.append({'Postcode':postcode,'Borough':borough,'Neighborhood':neighborhood})
# build the dataframe once; DataFrame.append was removed in pandas 2.0
df=pd.DataFrame(rows, columns=columns)
df

Unnamed: 0,Postcode,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Harbourfront, Regent Park"
3,M6A,North York,"Lawrence Heights, Lawrence Manor"
4,M7A,Queen's Park,Queen's Park
5,M9A,Etobicoke,Islington Avenue
6,M1B,Scarborough,"Rouge, Malvern"
7,M3B,North York,Don Mills North
8,M4B,East York,"Woodbine Gardens, Parkview Hill"
9,M5B,Downtown Toronto,"Ryerson, Garden District"


In [6]:
df.shape

(103, 3)

## Section 2: Add coordinates for each postal code

In [7]:
coords_df=pd.read_csv('http://cocl.us/Geospatial_data')
coords_df.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [8]:
# look up each postcode's coordinates by column name rather than by position
coords=coords_df.set_index('Postal Code')
df['Latitudes']=df['Postcode'].map(coords['Latitude'])
df['Longitudes']=df['Postcode'].map(coords['Longitude'])
df

Unnamed: 0,Postcode,Borough,Neighborhood,Latitudes,Longitudes
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Harbourfront, Regent Park",43.65426,-79.360636
3,M6A,North York,"Lawrence Heights, Lawrence Manor",43.718518,-79.464763
4,M7A,Queen's Park,Queen's Park,43.662301,-79.389494
5,M9A,Etobicoke,Islington Avenue,43.667856,-79.532242
6,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353
7,M3B,North York,Don Mills North,43.745906,-79.352188
8,M4B,East York,"Woodbine Gardens, Parkview Hill",43.706397,-79.309937
9,M5B,Downtown Toronto,"Ryerson, Garden District",43.657162,-79.378937


## Section 3: Neighborhood analysis

I applied the clustering analysis from the New York neighborhoods lab to all of the Toronto neighborhoods, without dropping any neighborhoods or boroughs. The resulting dataframe had some 'NaN' entries, so I dropped those rows from the analysis. Finally, I generated a map showing the neighborhood clusters. As you can see from the map, most of the neighborhoods are in cluster '0'.
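
The 'NaN' entries are most likely a by-product of the join: a neighborhood for which no venues came back never appears in the venue summary, so its row has nothing to match. A minimal sketch on made-up data, where `hoods` and `summary` are hypothetical stand-ins for `df` and `neighborhoods_venues_sorted`:

```python
import pandas as pd

# Stand-in for the neighborhood table (hypothetical rows).
hoods = pd.DataFrame({"Neighborhood": ["A", "B", "C"]})

# Stand-in for the per-neighborhood venue summary; "B" had no venues, so it is absent.
summary = pd.DataFrame(
    {
        "Neighborhood": ["A", "C"],
        "Cluster Labels": [0, 4],
        "1st Most Common Venue": ["Coffee Shop", "Park"],
    }
)

# Same pattern as the notebook: join on the neighborhood name, keeping every row of hoods.
merged = hoods.join(summary.set_index("Neighborhood"), on="Neighborhood")
print(merged["Cluster Labels"].isna().sum())  # number of neighborhoods with no match

# Dropping the unmatched rows lets the labels be cast back to integers.
merged = merged.dropna(axis=0)
merged["Cluster Labels"] = merged["Cluster Labels"].astype(int)
print(merged)
```

This also explains why the cluster labels show up as floats (`0.0`) right after the join: a column that contains `NaN` cannot keep an integer dtype.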

In [9]:
import json # library to handle JSON files
!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

Fetching package metadata .............
Solving package specifications: .

# All requested packages already installed.
# packages in environment at /opt/conda/envs/DSX-Python35:
#
geopy                     1.18.1                     py_0    conda-forge


In [10]:
!conda install -c conda-forge folium=0.5.0 --yes # skip this line if folium is already installed
import folium # map rendering library

print('Libraries imported.')

Fetching package metadata .............
Solving package specifications: .

# All requested packages already installed.
# packages in environment at /opt/conda/envs/DSX-Python35:
#
folium                    0.5.0                      py_0    conda-forge
Libraries imported.


In [11]:
# keep only the boroughs whose name contains 'Toronto'
df_Toronto=df[df['Borough'].str.contains('Toronto')]
df_Toronto.head()

Unnamed: 0,Postcode,Borough,Neighborhood,Latitudes,Longitudes
2,M5A,Downtown Toronto,"Harbourfront, Regent Park",43.65426,-79.360636
9,M5B,Downtown Toronto,"Ryerson, Garden District",43.657162,-79.378937
15,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418
19,M4E,East Toronto,The Beaches,43.676357,-79.293031
20,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306


In [26]:
CLIENT_ID = '[REDACTED]' # your Foursquare ID
CLIENT_SECRET = '[REDACTED]' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT=500

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)


Your credentials:
CLIENT_ID: [REDACTED]
CLIENT_SECRET:[REDACTED]


In [13]:
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    
    for name, lat, lng in zip(names, latitudes, longitudes):
        #print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    return(nearby_venues)

In [14]:
toronto_venues = getNearbyVenues(names=df['Neighborhood'],
                                   latitudes=df['Latitudes'],
                                   longitudes=df['Longitudes']
                                  )

In [15]:
print(toronto_venues.shape)
toronto_venues.head()

(2245, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Parkwoods,43.753259,-79.329656,Brookbanks Park,43.751976,-79.33214,Park
1,Parkwoods,43.753259,-79.329656,KFC,43.754387,-79.333021,Fast Food Restaurant
2,Parkwoods,43.753259,-79.329656,Variety Store,43.751974,-79.333114,Food & Drink Shop
3,Parkwoods,43.753259,-79.329656,GreenWin pool,43.756232,-79.333842,Pool
4,Victoria Village,43.725882,-79.315572,Victoria Village Arena,43.723481,-79.315635,Hockey Arena


In [16]:
toronto_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
"Adelaide, King, Richmond",100,100,100,100,100,100
Agincourt,5,5,5,5,5,5
"Agincourt North, L'Amoreaux East, Milliken, Steeles East",3,3,3,3,3,3
"Albion Gardens, Beaumond Heights, Humbergate, Jamestown, Mount Olive, Silverstone, South Steeles, Thistletown",11,11,11,11,11,11
"Alderwood, Long Branch",10,10,10,10,10,10
"Bathurst Manor, Downsview North, Wilson Heights",17,17,17,17,17,17
Bayview Village,4,4,4,4,4,4
"Bedford Park, Lawrence Manor East",24,24,24,24,24,24
Berczy Park,55,55,55,55,55,55
"Birch Cliff, Cliffside West",4,4,4,4,4,4


In [17]:
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")
# add neighborhood column back to dataframe
toronto_onehot['Neighborhood'] = toronto_venues['Neighborhood'] 


In [18]:
# move neighborhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.head()
toronto_onehot.shape

toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()
toronto_grouped

num_top_venues = 5

for hood in toronto_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = toronto_grouped[toronto_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    #print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    #print('\n')

def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

# set number of clusters
kclusters = 5

toronto_grouped_clustering = toronto_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

toronto_merged = df

# join the cluster labels and top venues onto the neighborhood table, which already holds latitude/longitude
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

toronto_merged.head() # check the last columns!



----Adelaide, King, Richmond----
----Agincourt----
----Agincourt North, L'Amoreaux East, Milliken, Steeles East----
----Albion Gardens, Beaumond Heights, Humbergate, Jamestown, Mount Olive, Silverstone, South Steeles, Thistletown----
----Alderwood, Long Branch----
----Bathurst Manor, Downsview North, Wilson Heights----
----Bayview Village----
----Bedford Park, Lawrence Manor East----
----Berczy Park----
----Birch Cliff, Cliffside West----
----Bloordale Gardens, Eringate, Markland Wood, Old Burnhamthorpe----
----Brockton, Exhibition Place, Parkdale Village----
----Business Reply Mail Processing Centre 969 Eastern----
----CFB Toronto, Downsview East----
----CN Tower, Bathurst Quay, Island airport, Harbourfront West, King and Spadina, Railway Lands, South Niagara----
----Cabbagetown, St. James Town----
----Caledonia-Fairbanks----
----Canada Post Gateway Processing Centre----
----Cedarbrae----
----Central Bay Street----
----Chinatown, Grange Park, Kensington Market----
----Christie----
---

Unnamed: 0,Postcode,Borough,Neighborhood,Latitudes,Longitudes,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M3A,North York,Parkwoods,43.753259,-79.329656,0.0,Fast Food Restaurant,Food & Drink Shop,Park,Pool,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore
1,M4A,North York,Victoria Village,43.725882,-79.315572,0.0,Intersection,Portuguese Restaurant,Coffee Shop,Pizza Place,Hockey Arena,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run
2,M5A,Downtown Toronto,"Harbourfront, Regent Park",43.65426,-79.360636,0.0,Coffee Shop,Park,Pub,Café,Bakery,Breakfast Spot,Restaurant,Mexican Restaurant,Theater,Performing Arts Venue
3,M6A,North York,"Lawrence Heights, Lawrence Manor",43.718518,-79.464763,0.0,Furniture / Home Store,Clothing Store,Accessories Store,Event Space,Vietnamese Restaurant,Coffee Shop,Boutique,Miscellaneous Shop,Women's Store,Donut Shop
4,M7A,Queen's Park,Queen's Park,43.662301,-79.389494,0.0,Coffee Shop,Gym,Japanese Restaurant,Sushi Restaurant,Diner,Yoga Studio,Chinese Restaurant,Smoothie Shop,Seafood Restaurant,Sandwich Place


In [19]:
toronto_merged.dropna(axis=0,inplace=True)
toronto_merged['Cluster Labels']=toronto_merged['Cluster Labels'].astype(int)
toronto_merged.head()

Unnamed: 0,Postcode,Borough,Neighborhood,Latitudes,Longitudes,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M3A,North York,Parkwoods,43.753259,-79.329656,0,Fast Food Restaurant,Food & Drink Shop,Park,Pool,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore
1,M4A,North York,Victoria Village,43.725882,-79.315572,0,Intersection,Portuguese Restaurant,Coffee Shop,Pizza Place,Hockey Arena,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run
2,M5A,Downtown Toronto,"Harbourfront, Regent Park",43.65426,-79.360636,0,Coffee Shop,Park,Pub,Café,Bakery,Breakfast Spot,Restaurant,Mexican Restaurant,Theater,Performing Arts Venue
3,M6A,North York,"Lawrence Heights, Lawrence Manor",43.718518,-79.464763,0,Furniture / Home Store,Clothing Store,Accessories Store,Event Space,Vietnamese Restaurant,Coffee Shop,Boutique,Miscellaneous Shop,Women's Store,Donut Shop
4,M7A,Queen's Park,Queen's Park,43.662301,-79.389494,0,Coffee Shop,Gym,Japanese Restaurant,Sushi Restaurant,Diner,Yoga Studio,Chinese Restaurant,Smoothie Shop,Seafood Restaurant,Sandwich Place


In [20]:
# create map
map_clusters = folium.Map(location=[43.6532, -79.3832], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitudes'], toronto_merged['Longitudes'], toronto_merged['Neighborhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

As you can see from the map, most of the neighborhoods are in cluster '0'.
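
The impression from the map can be checked numerically by counting rows per label. A sketch, assuming a label column like `toronto_merged['Cluster Labels']`; the values below are made up for illustration and are not the notebook's results:

```python
import pandas as pd

# Hypothetical cluster labels, one per neighborhood.
labels = pd.Series([0, 0, 0, 4, 0, 2, 0, 4, 1, 0], name="Cluster Labels")

# Number of neighborhoods in each cluster, largest first.
counts = labels.value_counts()
print(counts)

# Share of neighborhoods that fall in the largest cluster.
share = counts.iloc[0] / counts.sum()
print(round(share, 2))
```

On the real data the same two lines would run on `toronto_merged['Cluster Labels']` after the 'NaN' rows have been dropped.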

In [21]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,North York,0,Fast Food Restaurant,Food & Drink Shop,Park,Pool,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore
1,North York,0,Intersection,Portuguese Restaurant,Coffee Shop,Pizza Place,Hockey Arena,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run
2,Downtown Toronto,0,Coffee Shop,Park,Pub,Café,Bakery,Breakfast Spot,Restaurant,Mexican Restaurant,Theater,Performing Arts Venue
3,North York,0,Furniture / Home Store,Clothing Store,Accessories Store,Event Space,Vietnamese Restaurant,Coffee Shop,Boutique,Miscellaneous Shop,Women's Store,Donut Shop
4,Queen's Park,0,Coffee Shop,Gym,Japanese Restaurant,Sushi Restaurant,Diner,Yoga Studio,Chinese Restaurant,Smoothie Shop,Seafood Restaurant,Sandwich Place
6,Scarborough,0,Fast Food Restaurant,Dumpling Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Eastern European Restaurant,Field
7,North York,0,Japanese Restaurant,Gym / Fitness Center,Café,Basketball Court,Caribbean Restaurant,Women's Store,Dumpling Restaurant,Dog Run,Doner Restaurant,Donut Shop
8,East York,0,Fast Food Restaurant,Pizza Place,Breakfast Spot,Bank,Rock Climbing Spot,Intersection,Athletics & Sports,Gastropub,Café,Gym / Fitness Center
9,Downtown Toronto,0,Coffee Shop,Clothing Store,Café,Middle Eastern Restaurant,Cosmetics Shop,Theater,Japanese Restaurant,Pizza Place,Plaza,Diner
10,North York,0,Japanese Restaurant,Arcade,Pub,Bakery,Women's Store,Drugstore,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant


In [22]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
62,Central Toronto,1,Garden,Women's Store,Dumpling Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Eastern European Restaurant


In [23]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
57,North York,2,Baseball Field,Women's Store,Eastern European Restaurant,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Electronics Store
101,Etobicoke,2,Baseball Field,Women's Store,Eastern European Restaurant,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Electronics Store


In [24]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
11,Etobicoke,3,Bank,Eastern European Restaurant,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Women's Store,Dim Sum Restaurant


In [25]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
16,York,4,Field,Park,Hockey Arena,Trail,Dumpling Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop
21,York,4,Park,Women's Store,Market,Pharmacy,Fast Food Restaurant,Grocery Store,Deli / Bodega,Ethiopian Restaurant,Empanada Restaurant,Electronics Store
35,East York,4,Park,Coffee Shop,Convenience Store,Women's Store,Eastern European Restaurant,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore
64,York,4,Park,Women's Store,Dumpling Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Eastern European Restaurant
66,North York,4,Bar,Bank,Park,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Eastern European Restaurant,Women's Store
68,Central Toronto,4,Park,Jewelry Store,Sushi Restaurant,Trail,Ethiopian Restaurant,Empanada Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Dessert Shop
85,Scarborough,4,Park,Coffee Shop,Playground,Women's Store,Drugstore,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant
91,Downtown Toronto,4,Park,Playground,Trail,Event Space,Ethiopian Restaurant,Empanada Restaurant,Falafel Restaurant,Electronics Store,Eastern European Restaurant,Department Store


It seems that cluster 1 consists of a single neighborhood where the most common venue is a garden, cluster 2 of two neighborhoods where it is a baseball field, cluster 3 of a single neighborhood where it is a bank, and cluster 4 mostly of neighborhoods where it is a park. Cluster 0 (the most common) does not seem to exhibit any such clear pattern.

In conclusion, the clustering technique does not seem to have worked well on Toronto neighborhoods generally, as the majority of neighborhoods ended up being categorized into cluster 0. Perhaps a better result would be reached by limiting the analysis only to neighborhoods in certain boroughs.
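
Restricting the analysis to certain boroughs could start from a filter like the one below. The notebook already builds such a frame (`df_Toronto`, cell 11) but passes the full `df` to `getNearbyVenues`. This is a sketch on made-up rows; which boroughs to keep is an assumption (here, any whose name contains "Toronto"), and the venue lookup and clustering would then have to be rerun on the filtered frame.

```python
import pandas as pd

# Hypothetical subset of the neighborhood table.
df = pd.DataFrame(
    {
        "Postcode": ["M3A", "M5A", "M4E", "M9A"],
        "Borough": ["North York", "Downtown Toronto", "East Toronto", "Etobicoke"],
    }
)

# Keep only boroughs whose name contains "Toronto" (plain substring match, no regex).
toronto_only = df[df["Borough"].str.contains("Toronto", regex=False)].reset_index(drop=True)
print(toronto_only)
```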