This notebook will be used for the capstone project.

In [1]:
import pandas as pd
import numpy as np

In [2]:
print("Hello Capstone Project Course!")

Hello Capstone Project Course!


# Part 1

Below I extract the table from Wikipedia using pandas.

In [3]:
url = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'
table = pd.read_html(url, attrs={"class":"wikitable"}, header=0)[0]

Here I select only the rows with an assigned borough and show that the 'Neighbourhood' column only has assigned neighbourhoods. 

In [4]:
table = table[table.Borough != 'Not assigned']
table.Neighbourhood[table.Neighbourhood.str.find('Not assigned') != -1]

Series([], Name: Neighbourhood, dtype: object)
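The same filter and check can be tried on a small hand-made frame. This is a minimal sketch with made-up rows (the `toy` frame is an assumption, not the Wikipedia table); it uses `str.contains` in place of `str.find(...) != -1`, which reads a little more directly.

```python
import pandas as pd

# Made-up rows standing in for the scraped table (illustration only)
toy = pd.DataFrame({
    'Postal Code': ['M1A', 'M3A', 'M5A'],
    'Borough': ['Not assigned', 'North York', 'Downtown Toronto'],
    'Neighbourhood': ['Not assigned', 'Parkwoods', 'Regent Park, Harbourfront'],
})

# Keep only the rows with an assigned borough
toy = toy[toy['Borough'] != 'Not assigned']

# Rows whose neighbourhood is still unassigned; an empty result means none are left
unassigned = toy[toy['Neighbourhood'].str.contains('Not assigned', regex=False)]
print(len(toy), len(unassigned))
```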

In [5]:
print(f'The number of rows is: {table.shape[0]}')
table.head()

The number of rows is: 103


Unnamed: 0,Postal Code,Borough,Neighbourhood
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
5,M6A,North York,"Lawrence Manor, Lawrence Heights"
6,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"


# Part 2

Sort the table by postal code so that its rows line up with those in the CSV file.

In [6]:
table = table.sort_values('Postal Code').reset_index(drop = True)
table.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood
0,M1B,Scarborough,"Malvern, Rouge"
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek"
2,M1E,Scarborough,"Guildwood, Morningside, West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae


In [7]:
geospatial = pd.read_csv('Geospatial_Coordinates.csv')
geospatial.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


Combine dataframes:

In [8]:
newtable = pd.concat([table, geospatial[['Latitude', 'Longitude']]], axis=1)
newtable.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
0,M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
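Concatenating column-wise works here because both frames are sorted by postal code and have the same length. A key-based merge is a more defensive alternative that does not depend on row order. The sketch below uses a few made-up rows with the coordinate frame deliberately shuffled; the frame names `boroughs` and `coords` are hypothetical.

```python
import pandas as pd

# Made-up rows; the coordinates frame is deliberately in a different order
boroughs = pd.DataFrame({
    'Postal Code': ['M1B', 'M1C', 'M1E'],
    'Borough': ['Scarborough'] * 3,
})
coords = pd.DataFrame({
    'Postal Code': ['M1E', 'M1B', 'M1C'],
    'Latitude': [43.763573, 43.806686, 43.784535],
    'Longitude': [-79.188711, -79.194353, -79.160497],
})

# Match rows on the postal code itself; validate='one_to_one' raises
# if a postal code is duplicated on either side
merged = boroughs.merge(coords, on='Postal Code', how='left', validate='one_to_one')
print(merged)
```

A left merge keeps the row order of the left frame, so the result can replace the concatenated table directly.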


newtable.shape

# Part 3

## Import necessary dependencies

In [9]:
import folium
from sklearn.cluster import KMeans
from geopy.geocoders import Nominatim
import json
from pandas import json_normalize  # top-level import; the pandas.io.json path is deprecated since pandas 1.0
import requests
import matplotlib.cm as cm
import matplotlib.colors as colors
from IPython.display import display 

## Visualise neighbourhoods (only Toronto):

In [10]:
geolocator = Nominatim(user_agent="to_explorer")
location = geolocator.geocode('Toronto, Ontario')

In [11]:
tomap = folium.Map(location=[location.latitude, location.longitude])

In [12]:
toronto = newtable[newtable.Borough.str.contains('Toronto')]
df = toronto
print(df.shape)
for lat, lng, borough, neighborhood in zip(df['Latitude'], df['Longitude'], df['Borough'], df['Neighbourhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(tomap)

(39, 5)


In [13]:
tomap

## Get Foursquare data:

In [14]:
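import os

# Read the credentials from environment variables rather than hard-coding them
# (the variable names below are an assumption; use whatever names you export)
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID'] # your Foursquare ID
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET'] # your Foursquare Secret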
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

In [15]:
def getNearbyVenues(names, latitudes, longitudes, radius=750):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues

In [16]:
toronto_venues = getNearbyVenues(df['Neighbourhood'], df['Latitude'], df["Longitude"])
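`getNearbyVenues` indexes straight into the response, so a venue without a category or a response without `groups` would raise an exception. The sketch below shows one way to parse the same keys more defensively. The helper name `extract_venues`, the `'Unknown'` label and the hand-written payload are all assumptions made for illustration; the payload is not a real API response.

```python
def extract_venues(payload):
    """Return (name, lat, lng, category) tuples from an explore-style payload.

    The layout mirrors the keys accessed in getNearbyVenues. Venues without
    any category are labelled 'Unknown' instead of raising IndexError.
    """
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        category = categories[0]['name'] if categories else 'Unknown'
        rows.append((venue['name'],
                     venue['location']['lat'],
                     venue['location']['lng'],
                     category))
    return rows


# Hand-written payload for illustration only
fake_payload = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Some Café',
               'location': {'lat': 43.65, 'lng': -79.38},
               'categories': [{'name': 'Café'}]}},
    {'venue': {'name': 'No Category Place',
               'location': {'lat': 43.66, 'lng': -79.39},
               'categories': []}},
]}]}}

print(extract_venues(fake_payload))
print(extract_venues({}))  # a response without 'groups' yields no rows
```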

## See how many venues have been received for each neighbourhood:

In [17]:
num_venues = pd.DataFrame(toronto_venues.groupby('Neighborhood').count()['Venue'])
num_venues.columns = ['# venues']
num_venues

Neighborhood,# venues
Berczy Park,100
"Brockton, Parkdale Village, Exhibition Place",84
"Business reply mail Processing Centre, South Central Letter Processing Plant Toronto",55
"CN Tower, King and Spadina, Railway Lands, Harbourfront West, Bathurst Quay, South Niagara, Island airport",28
Central Bay Street,100
Christie,32
Church and Wellesley,100
"Commerce Court, Victoria Hotel",100
Davisville,73
Davisville North,32


## One-hot encode for the KMeans analysis:

In [18]:
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")
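neighborhood_col = toronto_venues['Neighborhood']
# 'Neighborhood' can also appear as a venue category, which creates a dummy column of that name;
# drop it if present (errors='ignore' avoids a KeyError when it is absent)
toronto_onehot.drop(columns=['Neighborhood'], errors='ignore', inplace=True)
toronto_onehot.insert(0, 'Neighborhood', neighborhood_col)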
toronto_onehot.head()

Unnamed: 0,Neighborhood,Accessories Store,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,...,University,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,The Beaches,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,The Beaches,0,0,0,0,0,0,0,0,0,...,0,1,0,0,0,0,0,0,0,0
2,The Beaches,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,The Beaches,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,The Beaches,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [19]:
toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()
toronto_grouped.head()

Unnamed: 0,Neighborhood,Accessories Store,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,...,University,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01
1,"Brockton, Parkdale Village, Exhibition Place",0.011905,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011905,...,0.0,0.011905,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Business reply mail Processing Centre, South C...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.018182,0.0,0.0,0.0
3,"CN Tower, King and Spadina, Railway Lands, Har...",0.0,0.0,0.035714,0.035714,0.035714,0.071429,0.071429,0.071429,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Central Bay Street,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.01,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01
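The grouped values are category frequencies: the mean of a 0/1 dummy column within a neighbourhood is the share of that neighbourhood's venues in the category. A minimal sketch with made-up venues (the `venues` frame is an assumption, not Foursquare data):

```python
import pandas as pd

# Made-up venues for two neighbourhoods (illustration only)
venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Café', 'Café', 'Park', 'Bar', 'Park', 'Park'],
})

# One 0/1 column per category
onehot = pd.get_dummies(venues['Venue Category'], dtype=int)
onehot.insert(0, 'Neighborhood', venues['Neighborhood'])

# The mean of a 0/1 column is the share of venues in that category
freq = onehot.groupby('Neighborhood').mean()
print(freq)
```

Each row of `freq` sums to 1, so neighbourhoods with very different venue counts are compared on the same scale.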


## Make most common venues dataframe to investigate differences between neighbourhoods:

In [20]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [21]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Berczy Park,Coffee Shop,Hotel,Restaurant,Japanese Restaurant,Café,Beer Bar,Park,Italian Restaurant,Bakery,Cheese Shop
1,"Brockton, Parkdale Village, Exhibition Place",Coffee Shop,Café,Restaurant,Gift Shop,Bar,Bakery,Thrift / Vintage Store,Sandwich Place,Supermarket,Performing Arts Venue
2,"Business reply mail Processing Centre, South C...",Fast Food Restaurant,Park,Coffee Shop,Pub,Italian Restaurant,Brewery,Restaurant,Light Rail Station,Burrito Place,Bakery
3,"CN Tower, King and Spadina, Railway Lands, Har...",Harbor / Marina,Rental Car Location,Coffee Shop,Boat or Ferry,Airport Lounge,Airport Service,Airport Terminal,Sculpture Garden,Boutique,Music Venue
4,Central Bay Street,Coffee Shop,Café,Art Gallery,Clothing Store,Ramen Restaurant,French Restaurant,Thai Restaurant,Italian Restaurant,Middle Eastern Restaurant,Sushi Restaurant
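Two details of the ranking step may be worth a sketch. First, the column labels can be built with a small ordinal helper instead of a `try`/`except`. Second, sorting all categories means a neighbourhood with fewer venues than `num_top_venues` gets its remaining slots filled with zero-frequency categories (Roselawn, for instance, has 6 venues but 10 listed categories further down). The helper `ordinal` and the made-up `row` below are assumptions for illustration.

```python
import pandas as pd


def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return f'{n}{suffix}'


# Made-up category frequencies for one neighbourhood
row = pd.Series({'Café': 0.6, 'Park': 0.4, 'Bar': 0.0, 'Zoo': 0.0})

# Drop categories that never occur before ranking, then take the top three
top = row[row > 0].sort_values(ascending=False).index[:3]
labelled = {f'{ordinal(i + 1)} Most Common Venue': cat for i, cat in enumerate(top)}
print(labelled)
```

Only two categories survive the filter here, so the third slot is simply left out rather than padded.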


## KMeans

In [22]:
kclusters = 10  # set to 10 as it gives the best results
tor_clust = toronto_grouped.drop(columns='Neighborhood')
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(tor_clust)

# check cluster labels generated for each row in the dataframe
kmeans.labels_

array([6, 1, 1, 0, 6, 7, 9, 6, 9, 9, 9, 6, 5, 6, 6, 1, 9, 1, 4, 1, 8, 6,
       1, 9, 6, 6, 3, 2, 1, 6, 9, 6, 1, 9, 9, 1, 1, 6, 1])
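The comment in the cell above says 10 clusters gave the best results but does not show how that was judged. One common way to compare values of k, not necessarily the one used here, is the silhouette score. The sketch below runs it on synthetic 2-D blobs rather than on the venue data, so the data and the range of k are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic data: three well-separated blobs in 2-D (not the venue data)
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + 0.3 * rng.standard_normal((30, 2)) for c in centres])

# Fit KMeans for several k and record the mean silhouette score of each fit
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# The k with the highest score is the best-separated clustering by this measure
best_k = max(scores, key=scores.get)
print(best_k)
```

On the real data the score would be computed on `tor_clust`; with only 39 rows and many sparse columns it should be read as a rough guide.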

In [23]:
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

tor_merged = toronto
tor_merged = tor_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighbourhood').sort_values('Neighbourhood')
tor_merged.insert(6, '# venues', num_venues['# venues'].values)
tor_merged.head(10)

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,# venues,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
56,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306,6,100,Coffee Shop,Hotel,Restaurant,Japanese Restaurant,Café,Beer Bar,Park,Italian Restaurant,Bakery,Cheese Shop
78,M6K,West Toronto,"Brockton, Parkdale Village, Exhibition Place",43.636847,-79.428191,1,84,Coffee Shop,Café,Restaurant,Gift Shop,Bar,Bakery,Thrift / Vintage Store,Sandwich Place,Supermarket,Performing Arts Venue
87,M7Y,East Toronto,"Business reply mail Processing Centre, South C...",43.662744,-79.321558,1,55,Fast Food Restaurant,Park,Coffee Shop,Pub,Italian Restaurant,Brewery,Restaurant,Light Rail Station,Burrito Place,Bakery
68,M5V,Downtown Toronto,"CN Tower, King and Spadina, Railway Lands, Har...",43.628947,-79.39442,0,28,Harbor / Marina,Rental Car Location,Coffee Shop,Boat or Ferry,Airport Lounge,Airport Service,Airport Terminal,Sculpture Garden,Boutique,Music Venue
57,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383,6,100,Coffee Shop,Café,Art Gallery,Clothing Store,Ramen Restaurant,French Restaurant,Thai Restaurant,Italian Restaurant,Middle Eastern Restaurant,Sushi Restaurant
75,M6G,Downtown Toronto,Christie,43.669542,-79.422564,7,32,Grocery Store,Park,Café,Indian Restaurant,Coffee Shop,Restaurant,Diner,Japanese Restaurant,Baby Store,Italian Restaurant
52,M4Y,Downtown Toronto,Church and Wellesley,43.66586,-79.38316,9,100,Coffee Shop,Japanese Restaurant,Gay Bar,Restaurant,Sushi Restaurant,Men's Store,Café,Salad Place,Sandwich Place,Ramen Restaurant
61,M5L,Downtown Toronto,"Commerce Court, Victoria Hotel",43.648198,-79.379817,6,100,Coffee Shop,Hotel,Café,Gastropub,Japanese Restaurant,Restaurant,Seafood Restaurant,Gym,American Restaurant,Concert Hall
47,M4S,Central Toronto,Davisville,43.704324,-79.38879,9,73,Italian Restaurant,Coffee Shop,Pizza Place,Restaurant,Café,Gym,Dessert Shop,Sandwich Place,Fast Food Restaurant,Sushi Restaurant
45,M4P,Central Toronto,Davisville North,43.712751,-79.390197,9,32,Coffee Shop,Dog Run,Café,Pizza Place,Park,Gym,Dessert Shop,Bar,Brewery,Diner
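The `insert` call above attaches the venue counts by position, which works only while both frames are sorted identically and every neighbourhood returned at least one venue. A lookup by name avoids that dependency. The sketch below uses made-up frames (`areas` and `counts` are hypothetical names) in which the orders differ and one neighbourhood has no venues.

```python
import pandas as pd

# Made-up frames; note the different row order
areas = pd.DataFrame({'Neighbourhood': ['B', 'A', 'C'], 'Cluster Labels': [1, 0, 1]})
counts = pd.Series({'A': 100, 'B': 7}, name='# venues')  # 'C' returned no venues

# map() looks each neighbourhood up by name, so row order no longer matters;
# neighbourhoods missing from counts get 0
areas['# venues'] = areas['Neighbourhood'].map(counts).fillna(0).astype(int)
print(areas)
```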


## Visualise neighbourhood clusters:

In [24]:
map_clusters = folium.Map(location=[location.latitude, location.longitude], zoom_start=11)

# one distinct colour per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add one marker per neighbourhood, coloured by its cluster label
for lat, lon, poi, cluster in zip(tor_merged['Latitude'], tor_merged['Longitude'], tor_merged['Neighbourhood'], tor_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

## Now investigate the most common venues for each neighbourhood cluster:

In [25]:
for n in range(kclusters):
    print(f'----CLUSTER {n}----')
    display(tor_merged.loc[tor_merged['Cluster Labels'] == n, tor_merged.columns[[2]+list(range(6, tor_merged.shape[1]))]])

----CLUSTER 0----


Unnamed: 0,Neighbourhood,# venues,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
68,"CN Tower, King and Spadina, Railway Lands, Har...",28,Harbor / Marina,Rental Car Location,Coffee Shop,Boat or Ferry,Airport Lounge,Airport Service,Airport Terminal,Sculpture Garden,Boutique,Music Venue


----CLUSTER 1----


Unnamed: 0,Neighbourhood,# venues,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
78,"Brockton, Parkdale Village, Exhibition Place",84,Coffee Shop,Café,Restaurant,Gift Shop,Bar,Bakery,Thrift / Vintage Store,Sandwich Place,Supermarket,Performing Arts Venue
87,"Business reply mail Processing Centre, South C...",55,Fast Food Restaurant,Park,Coffee Shop,Pub,Italian Restaurant,Brewery,Restaurant,Light Rail Station,Burrito Place,Bakery
82,"High Park, The Junction South",66,Bar,Café,Coffee Shop,Thai Restaurant,Bakery,Italian Restaurant,Flea Market,Antique Shop,Mexican Restaurant,Park
67,"Kensington Market, Chinatown, Grange Park",100,Café,Bar,Vegetarian / Vegan Restaurant,Mexican Restaurant,Art Gallery,Coffee Shop,Yoga Studio,Record Shop,Park,Ice Cream Shop
77,"Little Portugal, Trinity",100,Restaurant,Bar,Café,Coffee Shop,Cocktail Bar,Vegetarian / Vegan Restaurant,Italian Restaurant,Bakery,Korean Restaurant,Asian Restaurant
83,"Parkdale, Roncesvalles",61,Bar,Café,Breakfast Spot,Thai Restaurant,Amphitheater,Bookstore,Dog Run,Sushi Restaurant,Restaurant,Pub
84,"Runnymede, Swansea",62,Coffee Shop,Café,Pizza Place,Pub,Bakery,Italian Restaurant,Park,Restaurant,Bank,Falafel Restaurant
43,Studio District,86,Coffee Shop,Café,Diner,Bar,Sandwich Place,Park,Bakery,Sushi Restaurant,American Restaurant,Brewery
37,The Beaches,47,Pub,Coffee Shop,Sandwich Place,Health Food Store,Breakfast Spot,Bar,Japanese Restaurant,Pharmacy,Tea Room,Bakery
41,"The Danforth West, Riverdale",100,Greek Restaurant,Coffee Shop,Pub,Café,Grocery Store,Italian Restaurant,Fast Food Restaurant,Bookstore,Spa,Breakfast Spot


----CLUSTER 2----


Unnamed: 0,Neighbourhood,# venues,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
63,Roselawn,6,Playground,Home Service,Garden,Pet Store,Spa,Event Space,Ethiopian Restaurant,Escape Room,Falafel Restaurant,Dive Bar


----CLUSTER 3----


Unnamed: 0,Neighbourhood,# venues,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
50,Rosedale,7,Park,Trail,Playground,Candy Store,Eastern European Restaurant,Dog Run,Doner Restaurant,Donut Shop,Dry Cleaner,Dumpling Restaurant


----CLUSTER 4----


Unnamed: 0,Neighbourhood,# venues,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
44,Lawrence Park,5,Park,Swim School,Coffee Shop,Business Service,Bus Line,Yoga Studio,Electronics Store,Donut Shop,Dry Cleaner,Dumpling Restaurant


----CLUSTER 5----


Unnamed: 0,Neighbourhood,# venues,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
64,"Forest Hill North & West, Forest Hill Road Park",7,Gym / Fitness Center,Park,Jewelry Store,Sushi Restaurant,Dry Cleaner,Trail,Eastern European Restaurant,Dog Run,Doner Restaurant,Donut Shop


----CLUSTER 6----


Unnamed: 0,Neighbourhood,# venues,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
56,Berczy Park,100,Coffee Shop,Hotel,Restaurant,Japanese Restaurant,Café,Beer Bar,Park,Italian Restaurant,Bakery,Cheese Shop
57,Central Bay Street,100,Coffee Shop,Café,Art Gallery,Clothing Store,Ramen Restaurant,French Restaurant,Thai Restaurant,Italian Restaurant,Middle Eastern Restaurant,Sushi Restaurant
61,"Commerce Court, Victoria Hotel",100,Coffee Shop,Hotel,Café,Gastropub,Japanese Restaurant,Restaurant,Seafood Restaurant,Gym,American Restaurant,Concert Hall
70,"First Canadian Place, Underground city",100,Coffee Shop,Hotel,Restaurant,Café,Theater,Japanese Restaurant,Plaza,Concert Hall,Seafood Restaurant,Italian Restaurant
54,"Garden District, Ryerson",100,Coffee Shop,Hotel,Japanese Restaurant,Gastropub,Italian Restaurant,Sushi Restaurant,Department Store,Theater,Middle Eastern Restaurant,Plaza
59,"Harbourfront East, Union Station, Toronto Islands",100,Coffee Shop,Hotel,Boat or Ferry,Restaurant,Brewery,Café,Japanese Restaurant,Park,Aquarium,Scenic Lookout
46,"North Toronto West, Lawrence Park",44,Coffee Shop,Café,Sporting Goods Shop,Italian Restaurant,Skating Rink,Diner,Bakery,Restaurant,Clothing Store,Grocery Store
53,"Regent Park, Harbourfront",79,Coffee Shop,Café,Park,Bakery,Pub,Restaurant,Theater,Italian Restaurant,Breakfast Spot,Performing Arts Venue
58,"Richmond, Adelaide, King",100,Coffee Shop,Café,Hotel,Theater,Restaurant,Bar,Italian Restaurant,Bakery,Concert Hall,Cosmetics Shop
55,St. James Town,100,Coffee Shop,Café,Restaurant,Seafood Restaurant,Bakery,Clothing Store,Hotel,Beer Bar,American Restaurant,Gastropub


----CLUSTER 7----


Unnamed: 0,Neighbourhood,# venues,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
75,Christie,32,Grocery Store,Park,Café,Indian Restaurant,Coffee Shop,Restaurant,Diner,Japanese Restaurant,Baby Store,Italian Restaurant


----CLUSTER 8----


Unnamed: 0,Neighbourhood,# venues,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
48,"Moore Park, Summerhill East",15,Park,Grocery Store,Sandwich Place,Rental Car Location,Thai Restaurant,Candy Store,Japanese Restaurant,Café,Trail,Seafood Restaurant


----CLUSTER 9----


Unnamed: 0,Neighbourhood,# venues,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
52,Church and Wellesley,100,Coffee Shop,Japanese Restaurant,Gay Bar,Restaurant,Sushi Restaurant,Men's Store,Café,Salad Place,Sandwich Place,Ramen Restaurant
47,Davisville,73,Italian Restaurant,Coffee Shop,Pizza Place,Restaurant,Café,Gym,Dessert Shop,Sandwich Place,Fast Food Restaurant,Sushi Restaurant
45,Davisville North,32,Coffee Shop,Dog Run,Café,Pizza Place,Park,Gym,Dessert Shop,Bar,Brewery,Diner
76,"Dufferin, Dovercourt Village",31,Coffee Shop,Bakery,Gym,Italian Restaurant,Bar,Park,Grocery Store,Pharmacy,Portuguese Restaurant,Smoke Shop
42,"India Bazaar, The Beaches West",58,Indian Restaurant,Park,Coffee Shop,Fast Food Restaurant,Restaurant,Brewery,Sandwich Place,Italian Restaurant,Grocery Store,Gym
85,"Queen's Park, Ontario Provincial Government",82,Coffee Shop,Café,Sandwich Place,Park,Italian Restaurant,Yoga Studio,Japanese Restaurant,Sushi Restaurant,Gastropub,Chinese Restaurant
51,"St. James Town, Cabbagetown",67,Coffee Shop,Pizza Place,Café,Restaurant,Grocery Store,Pharmacy,Japanese Restaurant,Sandwich Place,Park,Italian Restaurant
49,"Summerhill West, Rathnelly, South Hill, Forest...",62,Coffee Shop,Sushi Restaurant,Italian Restaurant,Grocery Store,Restaurant,Gym,French Restaurant,Bank,Pizza Place,Thai Restaurant
65,"The Annex, North Midtown, Yorkville",75,Coffee Shop,Pub,Pizza Place,Sandwich Place,History Museum,Park,Burger Joint,Thai Restaurant,Café,Mexican Restaurant


## Observations

We can see that clusters 0, 2, 3, 4, 5, 7 and 8 are one-neighbourhood clusters. These neighbourhoods were probably assigned their own clusters because the Foursquare API returned few venues for most of them, so a handful of venues dominates their category frequencies. They also seem to be more residential: their top venues are parks and grocery stores rather than the cafés and coffee shops that lead in most other neighbourhoods. This is also why the number of clusters was set to 10: to absorb the noise from the neighbourhoods with a low venue count and still leave enough clusters for the main neighbourhood groups. Cluster 0 does seem to belong in its own cluster, however, as its top venues are very distinct from those of other neighbourhoods (airport and harbor services). Clusters 2, 3, 4, 5, 7 and 8 could therefore arguably be combined into one cluster.

The main clusters are 1, 6 and 9. Though the difference between these may not be immediately clear from the tables, the map above shows that they are grouped geographically. Cluster 6 is central, cluster 1 lies to the east and west of cluster 6, and cluster 9 lies to the north of cluster 6, along Yonge Street. In all three clusters, coffee shops, cafés and restaurants are very prevalent. Cluster 6 has a comparatively high density of hotels, as expected for the city centre. The difference between clusters 1 and 9 is more subtle: cluster 9 seems to have more coffee shops, pizza places and grocery stores than cluster 1, while cluster 1 has more bars.
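The one-neighbourhood clusters can also be read directly off the `kmeans.labels_` array printed earlier. The sketch below counts cluster sizes and folds the singletons into one group. Keeping cluster 0 (the airport and harbor neighbourhood in the tables above) separate and using -1 as the merged label are choices made for this illustration only.

```python
import pandas as pd

# Cluster labels copied from the kmeans.labels_ output above
labels = pd.Series([6, 1, 1, 0, 6, 7, 9, 6, 9, 9, 9, 6, 5, 6, 6, 1, 9, 1, 4, 1, 8, 6,
                    1, 9, 6, 6, 3, 2, 1, 6, 9, 6, 1, 9, 9, 1, 1, 6, 1])

# Clusters that contain exactly one neighbourhood
sizes = labels.value_counts()
singletons = sorted(int(c) for c in sizes[sizes == 1].index)

# Fold the singletons into one group labelled -1, except the ones kept separate
keep_separate = {0}
to_merge = [c for c in singletons if c not in keep_separate]
merged = labels.where(~labels.isin(to_merge), other=-1)

print(singletons)
print(merged.value_counts())
```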