# Coursera Applied Data Science Capstone Project

**This notebook will be used for the Coursera Applied Data Science Capstone Project.**

## Part 1

In [313]:
# install the HTML parsers used by pd.read_html, then import the required libraries
! pip install lxml html5lib beautifulsoup4
import pandas as pd
import numpy as np



**We can read tables from the Wikipedia page, and confirm the number of tables present.**

In [314]:
url = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'
dfs = pd.read_html(url)

print(len(dfs))

3


**We can see there are 3 tables on the page, but the first one is the one we are interested in. `pd.read_html` already returns each table as a pandas dataframe, so we simply assign it to its own variable.**

In [315]:
Postcodes = dfs[0]
Postcodes

Unnamed: 0,Postal Code,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
...,...,...,...
175,M5Z,Not assigned,Not assigned
176,M6Z,Not assigned,Not assigned
177,M7Z,Not assigned,Not assigned
178,M8Z,Etobicoke,"Mimico NW, The Queensway West, South of Bloor,..."
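
**Selecting `dfs[0]` relies on the order of tables on the page, which may change over time. A minimal sketch of a more defensive selection follows, using small stand-in dataframes rather than the live page. The helper name `pick_table` and the demo data are hypothetical.**

```python
import pandas as pd

def pick_table(tables, required_columns):
    """Return the first dataframe that contains all required columns."""
    for table in tables:
        if set(required_columns).issubset(table.columns):
            return table
    raise ValueError('No table with columns {}'.format(required_columns))

# Stand-in for the list returned by pd.read_html (illustrative data only)
tables = [
    pd.DataFrame({'Other': [1, 2]}),
    pd.DataFrame({'Postal Code': ['M1A', 'M3A'],
                  'Borough': ['Not assigned', 'North York'],
                  'Neighbourhood': ['Not assigned', 'Parkwoods']}),
]

postcodes_demo = pick_table(tables, ['Postal Code', 'Borough', 'Neighbourhood'])
print(postcodes_demo.shape)
```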


**Remove any rows where no Borough is assigned.**

In [316]:
Postcodes = Postcodes.drop(Postcodes[Postcodes.Borough == 'Not assigned'].index)
# reset_index returns a new dataframe, so assign the result back
Postcodes = Postcodes.reset_index(drop=True)
Postcodes

Unnamed: 0,Postal Code,Borough,Neighbourhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park, Harbourfront"
3,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
...,...,...,...
98,M8X,Etobicoke,"The Kingsway, Montgomery Road, Old Mill North"
99,M4Y,Downtown Toronto,Church and Wellesley
100,M7Y,East Toronto,"Business reply mail Processing Centre, South C..."
101,M8Y,Etobicoke,"Old Mill South, King's Mill Park, Sunnylea, Hu..."
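
**The same filtering can be written with a boolean mask. The sketch below uses a three-row stand-in dataframe (illustrative data only) and shows that the result of `reset_index` has to be assigned for the renumbered index to persist.**

```python
import pandas as pd

demo = pd.DataFrame({
    'Postal Code': ['M1A', 'M3A', 'M4A'],
    'Borough': ['Not assigned', 'North York', 'North York'],
    'Neighbourhood': ['Not assigned', 'Parkwoods', 'Victoria Village'],
})

# Keep rows with an assigned borough, then renumber the index from 0
assigned = demo[demo['Borough'] != 'Not assigned'].reset_index(drop=True)
print(assigned)
```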


In [317]:
#Dataframe shape
Postcodes.shape

(103, 3)

**To create this dataframe we have assumed that the postcode data on Wikipedia is accurate, and that omitting postcodes with no assigned Borough will not impact the analysis. Finally, we have assumed that any single postcode can belong to only one borough.**
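
**The one-borough-per-postcode assumption can be checked directly. This is a minimal sketch on stand-in data; in the notebook the same two lines could be run against `Postcodes`.**

```python
import pandas as pd

demo = pd.DataFrame({
    'Postal Code': ['M3A', 'M4A', 'M5A'],
    'Borough': ['North York', 'North York', 'Downtown Toronto'],
})

# Count distinct boroughs per postal code; any value above 1 breaks the assumption
boroughs_per_code = demo.groupby('Postal Code')['Borough'].nunique()
violations = boroughs_per_code[boroughs_per_code > 1]
print(violations.empty)
```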

## Part 2

**Latitude/longitude data was imported from a CSV file because the geocoder was very slow to execute. The data is read into the dataframe Lat_Long.**

In [318]:
# The code was removed by Watson Studio for sharing.

In [319]:
Lat_Long = pd.read_csv(body)
Lat_Long.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476
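
**The cell that defines `body` was removed for sharing, so its exact form is not known. As a hedged sketch, `pd.read_csv` accepts any file path or file-like object; the in-memory `body_demo` below is a hypothetical stand-in holding the first two rows shown above.**

```python
import io
import pandas as pd

# Hypothetical stand-in for the hidden `body` object: any file-like object works
body_demo = io.StringIO(
    'Postal Code,Latitude,Longitude\n'
    'M1B,43.806686,-79.194353\n'
    'M1C,43.784535,-79.160497\n'
)
lat_long_demo = pd.read_csv(body_demo)
print(lat_long_demo.dtypes)
```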


**New dataframe 'Neigh' created to merge neighbourhood and location data. Left join on 'Postal Code' used to retain all post codes, and include location data for each.**

In [320]:
Neigh = pd.merge(left=Postcodes, right=Lat_Long, how='left', on='Postal Code')
Neigh

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.654260,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494
...,...,...,...,...,...
98,M8X,Etobicoke,"The Kingsway, Montgomery Road, Old Mill North",43.653654,-79.506944
99,M4Y,Downtown Toronto,Church and Wellesley,43.665860,-79.383160
100,M7Y,East Toronto,"Business reply mail Processing Centre, South C...",43.662744,-79.321558
101,M8Y,Etobicoke,"Old Mill South, King's Mill Park, Sunnylea, Hu...",43.636258,-79.498509
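
**A left join silently leaves missing coordinates for any postal code absent from the location data. The sketch below, on stand-in data with a deliberately fake code `X0X`, uses the `validate` and `indicator` parameters of `pd.merge` to surface duplicate keys and unmatched rows.**

```python
import pandas as pd

left = pd.DataFrame({'Postal Code': ['M3A', 'M4A', 'X0X'],
                     'Borough': ['North York', 'North York', 'Example']})
right = pd.DataFrame({'Postal Code': ['M3A', 'M4A'],
                      'Latitude': [43.753259, 43.725882],
                      'Longitude': [-79.329656, -79.315572]})

# validate raises if either side has duplicate keys; indicator adds a _merge column
merged = pd.merge(left, right, how='left', on='Postal Code',
                  validate='one_to_one', indicator=True)
unmatched = merged[merged['_merge'] == 'left_only']
print(unmatched['Postal Code'].tolist())
```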


## Part 3

In [321]:
#install all required libraries

import requests

import matplotlib.cm as cm
import matplotlib.colors as colors

from sklearn.cluster import KMeans

! pip install folium
import folium # map rendering library

! pip install geopy
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

print('Libraries imported.')

Libraries imported.


**The data is reduced to Boroughs whose name contains 'Toronto'. The full dataset was considered too large and too diverse for this analysis.**

**Although multiple neighbourhoods exist in some postcodes, these have been left as a list (not exploded) as the lat/long data is at postcode level, and therefore the neighbourhoods within that postcode will return very similar data.**

In [354]:
Neigh_filt = Neigh[Neigh.Borough.str.contains('Toronto', case=False)]
Neigh_filt

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.654260,-79.360636
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494
9,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937
15,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418
19,M4E,East Toronto,The Beaches,43.676357,-79.293031
...,...,...,...,...,...
92,M5W,Downtown Toronto,Stn A PO Boxes,43.646435,-79.374846
96,M4X,Downtown Toronto,"St. James Town, Cabbagetown",43.667967,-79.367675
97,M5X,Downtown Toronto,"First Canadian Place, Underground city",43.648429,-79.382280
99,M4Y,Downtown Toronto,Church and Wellesley,43.665860,-79.383160
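
**For comparison, this is roughly what exploding the neighbourhood lists would look like. It is a sketch on two rows taken from the table above, not a step used in this analysis. Each exploded neighbourhood inherits the coordinates of its postcode, which is why the lists were left intact.**

```python
import pandas as pd

demo = pd.DataFrame({
    'Postal Code': ['M5A', 'M5C'],
    'Neighbourhood': ['Regent Park, Harbourfront', 'St. James Town'],
    'Latitude': [43.654260, 43.651494],
    'Longitude': [-79.360636, -79.375418],
})

# Split the comma-separated names into lists, then give each name its own row
exploded = (demo
            .assign(Neighbourhood=demo['Neighbourhood'].str.split(', '))
            .explode('Neighbourhood')
            .reset_index(drop=True))
print(exploded)
```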


**Defining Foursquare client details**

In [326]:
# The code was removed by Watson Studio for sharing.

**Create a function to get nearby venues for each neighbourhood, and then create a dataframe to store the venues data.**

In [327]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighbourhood', 
                  'Neighbourhood Latitude', 
                  'Neighbourhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
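
**The function above indexes straight into the response, so it would raise if a response had no groups or a venue had no categories. A hedged sketch of a more tolerant parsing step follows. The helper name `extract_venue_rows` and the sample payload are hypothetical; the payload is only shaped like the structure the function above reads.**

```python
def extract_venue_rows(payload, name, lat, lng):
    """Flatten one explore-style response into a list of row tuples."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories', [])
        # Fall back to None when a venue has no category
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng, venue['name'],
                     venue['location']['lat'], venue['location']['lng'],
                     category))
    return rows

# Illustrative payload only
sample = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Tandem Coffee',
               'location': {'lat': 43.653559, 'lng': -79.361809},
               'categories': [{'name': 'Coffee Shop'}]}},
    {'venue': {'name': 'Uncategorised venue',
               'location': {'lat': 43.65, 'lng': -79.36},
               'categories': []}},
]}]}}

rows = extract_venue_rows(sample, 'Regent Park, Harbourfront', 43.65426, -79.360636)
empty_rows = extract_venue_rows({'response': {}}, 'Nowhere', 0.0, 0.0)
print(len(rows), len(empty_rows))
```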

In [328]:
toronto_venues = getNearbyVenues(names=Neigh_filt['Neighbourhood'],
                                   latitudes=Neigh_filt['Latitude'],
                                   longitudes=Neigh_filt['Longitude']
                                  )

Regent Park, Harbourfront
Queen's Park, Ontario Provincial Government
Garden District, Ryerson
St. James Town
The Beaches
Berczy Park
Central Bay Street
Christie
Richmond, Adelaide, King
Dufferin, Dovercourt Village
Harbourfront East, Union Station, Toronto Islands
Little Portugal, Trinity
The Danforth West, Riverdale
Toronto Dominion Centre, Design Exchange
Brockton, Parkdale Village, Exhibition Place
India Bazaar, The Beaches West
Commerce Court, Victoria Hotel
Studio District
Lawrence Park
Roselawn
Davisville North
Forest Hill North & West, Forest Hill Road Park
High Park, The Junction South
North Toronto West, Lawrence Park
The Annex, North Midtown, Yorkville
Parkdale, Roncesvalles
Davisville
University of Toronto, Harbord
Runnymede, Swansea
Moore Park, Summerhill East
Kensington Market, Chinatown, Grange Park
Summerhill West, Rathnelly, South Hill, Forest Hill SE, Deer Park
CN Tower, King and Spadina, Railway Lands, Harbourfront West, Bathurst Quay, South Niagara, Island airport
R

In [358]:
print(toronto_venues.shape)
toronto_venues.head()

(1610, 7)


Unnamed: 0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Regent Park, Harbourfront",43.65426,-79.360636,Roselle Desserts,43.653447,-79.362017,Bakery
1,"Regent Park, Harbourfront",43.65426,-79.360636,Tandem Coffee,43.653559,-79.361809,Coffee Shop
2,"Regent Park, Harbourfront",43.65426,-79.360636,Cooper Koo Family YMCA,43.653249,-79.358008,Distribution Center
3,"Regent Park, Harbourfront",43.65426,-79.360636,Body Blitz Spa East,43.654735,-79.359874,Spa
4,"Regent Park, Harbourfront",43.65426,-79.360636,Impact Kitchen,43.656369,-79.35698,Restaurant


**Check how many venues were returned for each neighbourhood (or neighbourhood group), and how many unique venue categories were returned.**

In [331]:
toronto_venues.groupby('Neighbourhood').count()

Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Berczy Park,58,58,58,58,58,58
"Brockton, Parkdale Village, Exhibition Place",23,23,23,23,23,23
"Business reply mail Processing Centre, South Central Letter Processing Plant Toronto",14,14,14,14,14,14
"CN Tower, King and Spadina, Railway Lands, Harbourfront West, Bathurst Quay, South Niagara, Island airport",16,16,16,16,16,16
Central Bay Street,61,61,61,61,61,61
...,...,...,...,...,...,...
"The Annex, North Midtown, Yorkville",19,19,19,19,19,19
The Beaches,5,5,5,5,5,5
"The Danforth West, Riverdale",42,42,42,42,42,42
"Toronto Dominion Centre, Design Exchange",100,100,100,100,100,100


In [332]:
print('There are {} unique categories.'.format(len(toronto_venues['Venue Category'].unique())))

There are 232 unique categories.


**Run one hot encoding on the Venue Categories to split them out.**

In [333]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
toronto_onehot['Neighbourhood'] = toronto_venues['Neighbourhood'] 

# move neighborhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.head()

Unnamed: 0,Neighbourhood,Adult Boutique,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,...,Tibetan Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Yoga Studio
0,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,"Regent Park, Harbourfront",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [334]:
toronto_onehot.shape

(1610, 233)

**Group rows by neighbourhood (or neighbourhood group with same postcode), and take mean of frequency of occurrence of each category.**

In [335]:
toronto_grouped = toronto_onehot.groupby('Neighbourhood').mean().reset_index()
toronto_grouped

Unnamed: 0,Neighbourhood,Adult Boutique,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,...,Tibetan Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Yoga Studio
0,Berczy Park,0.0,0.0000,0.0000,0.0000,0.0000,0.0000,0.000,0.00000,0.0,...,0.00000,0.0,0.00000,0.00,0.017241,0.00000,0.0,0.000000,0.0,0.000000
1,"Brockton, Parkdale Village, Exhibition Place",0.0,0.0000,0.0000,0.0000,0.0000,0.0000,0.000,0.00000,0.0,...,0.00000,0.0,0.00000,0.00,0.000000,0.00000,0.0,0.000000,0.0,0.000000
2,"Business reply mail Processing Centre, South C...",0.0,0.0000,0.0000,0.0000,0.0000,0.0000,0.000,0.00000,0.0,...,0.00000,0.0,0.00000,0.00,0.000000,0.00000,0.0,0.000000,0.0,0.000000
3,"CN Tower, King and Spadina, Railway Lands, Har...",0.0,0.0625,0.0625,0.0625,0.0625,0.1875,0.125,0.00000,0.0,...,0.00000,0.0,0.00000,0.00,0.000000,0.00000,0.0,0.000000,0.0,0.000000
4,Central Bay Street,0.0,0.0000,0.0000,0.0000,0.0000,0.0000,0.000,0.00000,0.0,...,0.00000,0.0,0.00000,0.00,0.016393,0.00000,0.0,0.016393,0.0,0.016393
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
34,"The Annex, North Midtown, Yorkville",0.0,0.0000,0.0000,0.0000,0.0000,0.0000,0.000,0.00000,0.0,...,0.00000,0.0,0.00000,0.00,0.000000,0.00000,0.0,0.000000,0.0,0.000000
35,The Beaches,0.0,0.0000,0.0000,0.0000,0.0000,0.0000,0.000,0.00000,0.0,...,0.00000,0.0,0.20000,0.00,0.000000,0.00000,0.0,0.000000,0.0,0.000000
36,"The Danforth West, Riverdale",0.0,0.0000,0.0000,0.0000,0.0000,0.0000,0.000,0.02381,0.0,...,0.02381,0.0,0.02381,0.00,0.000000,0.00000,0.0,0.000000,0.0,0.023810
37,"Toronto Dominion Centre, Design Exchange",0.0,0.0000,0.0000,0.0000,0.0000,0.0000,0.000,0.03000,0.0,...,0.00000,0.0,0.00000,0.01,0.010000,0.00000,0.0,0.010000,0.0,0.000000


In [359]:
toronto_grouped.shape

(39, 233)
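
**A small worked example of the two steps above, on made-up data: one-hot encoding turns each venue into a row of 0/1 indicators, and the per-neighbourhood mean of those indicators is the share of venues in each category, so each row of the result sums to 1.**

```python
import pandas as pd

venues = pd.DataFrame({
    'Neighbourhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Coffee Shop', 'Coffee Shop', 'Park', 'Bakery', 'Park', 'Park'],
})

# dtype=int keeps the indicators as 0/1 integers
onehot = pd.get_dummies(venues['Venue Category'], dtype=int)
onehot.insert(0, 'Neighbourhood', venues['Neighbourhood'])

# Mean of the indicators = frequency of each category within a neighbourhood
frequencies = onehot.groupby('Neighbourhood').mean()
print(frequencies)
```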

**We can now sort the venues in descending order (by frequency of occurrence), and create a new dataframe displaying the top 10 venues for each neighbourhood.**

In [337]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [338]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighbourhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # no 'st', 'nd' or 'rd' suffix left, so fall back to 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighbourhoods_venues_sorted = pd.DataFrame(columns=columns)
neighbourhoods_venues_sorted['Neighbourhood'] = toronto_grouped['Neighbourhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighbourhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighbourhoods_venues_sorted.head()

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Berczy Park,Coffee Shop,Cocktail Bar,Seafood Restaurant,Beer Bar,Bakery,Farmers Market,Restaurant,Cheese Shop,Juice Bar,Shopping Mall
1,"Brockton, Parkdale Village, Exhibition Place",Café,Performing Arts Venue,Coffee Shop,Breakfast Spot,Grocery Store,Bakery,Pet Store,Nightclub,Climbing Gym,Restaurant
2,"Business reply mail Processing Centre, South C...",Light Rail Station,Pizza Place,Garden Center,Comic Shop,Restaurant,Burrito Place,Brewery,Skate Park,Fast Food Restaurant,Auto Workshop
3,"CN Tower, King and Spadina, Railway Lands, Har...",Airport Service,Airport Terminal,Harbor / Marina,Bar,Rental Car Location,Sculpture Garden,Boutique,Boat or Ferry,Plane,Airport Food Court
4,Central Bay Street,Coffee Shop,Café,Sandwich Place,Italian Restaurant,Bubble Tea Shop,Burger Joint,Salad Place,Restaurant,Portuguese Restaurant,Poke Place
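
**The ranking and column naming can also be sketched without the try/except, using `Series.nlargest` and a small suffix helper. The names `ordinal` and `top_categories` are hypothetical and the frequencies are made up.**

```python
import pandas as pd

def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

def top_categories(frequencies, n):
    """Names of the n categories with the highest frequency, highest first."""
    return frequencies.astype(float).nlargest(n).index.tolist()

row = pd.Series({'Coffee Shop': 0.5, 'Park': 0.2, 'Bakery': 0.3})
labels = ['{} Most Common Venue'.format(ordinal(i + 1)) for i in range(3)]
ranked = dict(zip(labels, top_categories(row, 3)))
print(ranked)
```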


**We can now use k-means to cluster the data.**

In [339]:
# set number of clusters
kclusters = 10

toronto_grouped_clustering = toronto_grouped.drop(columns='Neighbourhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, n_init=12, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([1, 9, 1, 9, 1, 0, 1, 1, 1, 7], dtype=int32)
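
**A quick way to judge how balanced a clustering is would be to count the members of each cluster. The sketch below uses only the ten labels printed above as sample input; in the notebook the full `kmeans.labels_` array would be used instead.**

```python
import numpy as np

# First ten labels from the output above (sample input, not the full result)
labels = np.array([1, 9, 1, 9, 1, 0, 1, 1, 1, 7])

# Number of members per cluster label 0..9
sizes = np.bincount(labels, minlength=10)
# Cluster labels that have exactly one member in this sample
singletons = np.flatnonzero(sizes == 1)
print(sizes)
print(singletons)
```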

**Create a dataframe to combine the neighbourhood data with the cluster label and the most common venues.**

In [340]:
# add clustering labels
neighbourhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

toronto_merged = Neigh_filt

# merge the sorted venue data with Neigh_filt to add latitude/longitude for each neighbourhood
toronto_merged = toronto_merged.join(neighbourhoods_venues_sorted.set_index('Neighbourhood'), on='Neighbourhood')

toronto_merged.head() # check the last columns!

Unnamed: 0,Postal Code,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,1,Coffee Shop,Bakery,Pub,Park,Breakfast Spot,Café,Restaurant,Theater,Distribution Center,Chocolate Shop
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,1,Coffee Shop,Sushi Restaurant,Sandwich Place,Burrito Place,Distribution Center,Fast Food Restaurant,Smoothie Shop,Restaurant,Japanese Restaurant,Portuguese Restaurant
9,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937,1,Clothing Store,Coffee Shop,Café,Cosmetics Shop,Japanese Restaurant,Hotel,Bubble Tea Shop,Middle Eastern Restaurant,Ramen Restaurant,Pizza Place
15,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418,1,Coffee Shop,Café,Cocktail Bar,American Restaurant,Gastropub,Farmers Market,Seafood Restaurant,Restaurant,Italian Restaurant,Park
19,M4E,East Toronto,The Beaches,43.676357,-79.293031,5,Asian Restaurant,Neighborhood,Health Food Store,Trail,Pub,Doner Restaurant,Discount Store,Distribution Center,Dog Run,Dumpling Restaurant


In [341]:
toronto_merged.shape

(39, 16)

**We need to get the geographical coordinates of Toronto to be able to visualise the clusters on a map.**

In [342]:
address = 'Toronto, CA'

geolocator = Nominatim(user_agent="toronto_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.6534817, -79.3839347.


**Now we can create the map.**

In [367]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=12)

# set color scheme for the clusters: one evenly spaced rainbow colour per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighbourhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

**We can also display the Neighbourhoods in each cluster using dataframes.**

**We can see that the data has not clustered particularly well: eight of the ten clusters contain only a single postcode. This was also the case with smaller numbers of clusters, and a higher number of clusters was judged to give slightly more useful partitioning. Within the central Toronto area analysed, the neighbourhoods are quite similar, although there appears to be a slight difference between Downtown Toronto and West Toronto, with differences becoming more apparent further from Downtown.**

In [344]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[0,2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Postal Code,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
25,M6G,Christie,0,Grocery Store,Café,Park,Baby Store,Italian Restaurant,Candy Store,Athletics & Sports,Coffee Shop,Restaurant,Nightclub


In [365]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[0,2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Postal Code,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,M5A,"Regent Park, Harbourfront",1,Coffee Shop,Bakery,Pub,Park,Breakfast Spot,Café,Restaurant,Theater,Distribution Center,Chocolate Shop
4,M7A,"Queen's Park, Ontario Provincial Government",1,Coffee Shop,Sushi Restaurant,Sandwich Place,Burrito Place,Distribution Center,Fast Food Restaurant,Smoothie Shop,Restaurant,Japanese Restaurant,Portuguese Restaurant
9,M5B,"Garden District, Ryerson",1,Clothing Store,Coffee Shop,Café,Cosmetics Shop,Japanese Restaurant,Hotel,Bubble Tea Shop,Middle Eastern Restaurant,Ramen Restaurant,Pizza Place
15,M5C,St. James Town,1,Coffee Shop,Café,Cocktail Bar,American Restaurant,Gastropub,Farmers Market,Seafood Restaurant,Restaurant,Italian Restaurant,Park
20,M5E,Berczy Park,1,Coffee Shop,Cocktail Bar,Seafood Restaurant,Beer Bar,Bakery,Farmers Market,Restaurant,Cheese Shop,Juice Bar,Shopping Mall
...,...,...,...,...,...,...,...,...,...,...,...,...,...
92,M5W,Stn A PO Boxes,1,Coffee Shop,Seafood Restaurant,Restaurant,Italian Restaurant,Café,Cocktail Bar,Beer Bar,Hotel,Japanese Restaurant,Breakfast Spot
96,M4X,"St. James Town, Cabbagetown",1,Coffee Shop,Pet Store,Restaurant,Café,Pub,Bakery,Pizza Place,Park,Italian Restaurant,Japanese Restaurant
97,M5X,"First Canadian Place, Underground city",1,Coffee Shop,Café,Hotel,Gym,Japanese Restaurant,Restaurant,Salad Place,Steakhouse,Seafood Restaurant,American Restaurant
99,M4Y,Church and Wellesley,1,Coffee Shop,Japanese Restaurant,Sushi Restaurant,Fast Food Restaurant,Gay Bar,Restaurant,Pizza Place,Mediterranean Restaurant,Hotel,Pub


In [346]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[0,2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Postal Code,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
62,M5N,Roselawn,2,Garden,Home Service,Fast Food Restaurant,Dessert Shop,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant


In [347]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[0,2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Postal Code,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
83,M4T,"Moore Park, Summerhill East",3,Playground,Trail,Yoga Studio,Dessert Shop,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant


In [348]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[0,2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Postal Code,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
61,M4N,Lawrence Park,4,Park,Swim School,Bus Line,Yoga Studio,Discount Store,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant


In [349]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 5, toronto_merged.columns[[0,2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Postal Code,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
19,M4E,The Beaches,5,Asian Restaurant,Neighborhood,Health Food Store,Trail,Pub,Doner Restaurant,Discount Store,Distribution Center,Dog Run,Dumpling Restaurant


In [350]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 6, toronto_merged.columns[[0,2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Postal Code,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
68,M5P,"Forest Hill North & West, Forest Hill Road Park",6,Jewelry Store,Trail,Mexican Restaurant,Sushi Restaurant,Yoga Studio,Diner,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant


In [351]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 7, toronto_merged.columns[[0,2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Postal Code,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
67,M4P,Davisville North,7,Gym / Fitness Center,Sandwich Place,Park,Department Store,Breakfast Spot,Food & Drink Shop,Hotel,Donut Shop,Dog Run,Doner Restaurant


In [352]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 8, toronto_merged.columns[[0,2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Postal Code,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
91,M4W,Rosedale,8,Park,Playground,Trail,Yoga Studio,Dessert Shop,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant


In [366]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 9, toronto_merged.columns[[0,2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Postal Code,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
31,M6H,"Dufferin, Dovercourt Village",9,Pharmacy,Bakery,Music Venue,Liquor Store,Café,Middle Eastern Restaurant,Bar,Bank,Supermarket,Portuguese Restaurant
37,M6J,"Little Portugal, Trinity",9,Bar,Coffee Shop,Vietnamese Restaurant,Restaurant,Vegetarian / Vegan Restaurant,Café,Asian Restaurant,Men's Store,Yoga Studio,New American Restaurant
43,M6K,"Brockton, Parkdale Village, Exhibition Place",9,Café,Performing Arts Venue,Coffee Shop,Breakfast Spot,Grocery Store,Bakery,Pet Store,Nightclub,Climbing Gym,Restaurant
54,M4M,Studio District,9,Coffee Shop,Brewery,Café,Gastropub,Bakery,American Restaurant,Yoga Studio,Convenience Store,Cheese Shop,Clothing Store
69,M6P,"High Park, The Junction South",9,Thai Restaurant,Park,Mexican Restaurant,Café,Arts & Crafts Store,Bar,Discount Store,Bakery,Fried Chicken Joint,Speakeasy
74,M5R,"The Annex, North Midtown, Yorkville",9,Café,Sandwich Place,Coffee Shop,Pharmacy,BBQ Joint,Pizza Place,Pub,Middle Eastern Restaurant,Burger Joint,Donut Shop
80,M5S,"University of Toronto, Harbord",9,Café,Bar,Japanese Restaurant,Bookstore,Bakery,Yoga Studio,Pub,Beer Bar,Beer Store,Sandwich Place
84,M5T,"Kensington Market, Chinatown, Grange Park",9,Café,Coffee Shop,Mexican Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Farmers Market,Dessert Shop,Grocery Store,Caribbean Restaurant,Bar
87,M5V,"CN Tower, King and Spadina, Railway Lands, Har...",9,Airport Service,Airport Terminal,Harbor / Marina,Bar,Rental Car Location,Sculpture Garden,Boutique,Boat or Ferry,Plane,Airport Food Court
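
**The ten near-identical cells above could be collapsed into one loop. This is a minimal sketch on a three-row stand-in for `toronto_merged`, built from rows shown in the outputs above. The helper name `cluster_members` is hypothetical.**

```python
import pandas as pd

def cluster_members(merged, label, columns):
    """Rows of `merged` in the given cluster, restricted to `columns`."""
    return merged.loc[merged['Cluster Labels'] == label, columns]

# Small stand-in for toronto_merged
demo = pd.DataFrame({
    'Postal Code': ['M6G', 'M5A', 'M5C'],
    'Neighbourhood': ['Christie', 'Regent Park, Harbourfront', 'St. James Town'],
    'Cluster Labels': [0, 1, 1],
    '1st Most Common Venue': ['Grocery Store', 'Coffee Shop', 'Coffee Shop'],
})
columns = ['Postal Code', 'Neighbourhood', '1st Most Common Venue']

# Print the members of every cluster in label order
for label in sorted(demo['Cluster Labels'].unique()):
    print('Cluster', label)
    print(cluster_members(demo, label, columns))
```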
