## Step 1: Import libraries

**Please make sure that the following packages are available:**
* numpy
* pandas
* matplotlib
* requests
* scikit-learn (imported as `sklearn`)
* beautifulsoup4
* folium
* geopy

In [1]:
import pandas as pd # library to process data as dataframes
import numpy as np
import matplotlib.pyplot as plt # plotting library
import matplotlib.cm as cm
import matplotlib.colors as colors
import requests
# backend for rendering plots within the browser
%matplotlib inline 
from bs4 import BeautifulSoup
from sklearn.cluster import KMeans 
from sklearn.datasets import make_blobs # the samples_generator submodule was removed in newer scikit-learn releases

import folium
from geopy.geocoders import Nominatim
print('Libraries imported.')

Libraries imported.


## Step 2: Create a Pandas Dataframe from the Wikipedia table
In this case I use the Beautiful Soup library to read the table into a dataframe.

In [2]:
# Obtain the HTML code of the Wikipedia page with the list of postal codes of Canada and parse it with BeautifulSoup
website_cotent_in_html = requests.get("https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M").text
soup = BeautifulSoup(website_cotent_in_html,"html.parser")

# Get the table with the post codes from the html code
my_table = soup.find('table',{'class':'wikitable sortable'})

# Iterate through all the rows in the table to get the individual cells and collect them in a Python list
table_rows=my_table.find_all('tr')
table_data = []

for row in table_rows:
    table_data.append([t.text.strip() for t in row.find_all('td')])

# Create the pandas dataframe from the Python list
postal_codes_raw_df = pd.DataFrame(table_data, columns=['PostalCode', 'Borough', 'Neighbourhood'])

print(postal_codes_raw_df.shape)
postal_codes_raw_df.head(5)

(181, 3)


Unnamed: 0,PostalCode,Borough,Neighbourhood
0,,,
1,M1A,Not assigned,Not assigned
2,M2A,Not assigned,Not assigned
3,M3A,North York,Parkwoods
4,M4A,North York,Victoria Village
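
Row 0 of the raw dataframe is empty. A likely cause (an assumption about the page markup, consistent with the output above) is that the header row uses `th` cells, so `row.find_all('td')` returns an empty list for it. A minimal sketch, using a hand-written stand-in for `table_data`, of skipping such rows before building the dataframe:

```python
import pandas as pd

# Stand-in for the scraped rows: the first entry mimics a header row without <td> cells
table_data = [
    [],
    ['M1A', 'Not assigned', 'Not assigned'],
    ['M3A', 'North York', 'Parkwoods'],
]

# Keep only the rows that actually contain cells
non_empty_rows = [row for row in table_data if row]

sketch_df = pd.DataFrame(non_empty_rows, columns=['PostalCode', 'Borough', 'Neighbourhood'])
print(sketch_df.shape)
```

In this notebook the empty row is removed later anyway, because its Borough value ends up as NaN and is dropped.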


## Step 3: Format and present the Dataframe meeting the requirements

* *Only process the cells that have an assigned borough. Ignore cells with a borough that is Not assigned*

I will replace all the empty, "Not assigned" and "None" values with NaN so that the dropna method can be used, and then drop the rows with NaN in the Borough column.



In [3]:
postal_codes_raw_df.replace(('None','','Not assigned'), np.nan, inplace=True)
postal_codes_df = postal_codes_raw_df.dropna(subset=['Borough'])
postal_codes_df.reset_index(drop=True, inplace=True)
print(postal_codes_df.shape)
postal_codes_df.head(20)

(103, 3)


Unnamed: 0,PostalCode,Borough,Neighbourhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park, Harbourfront"
3,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
5,M9A,Etobicoke,"Islington Avenue, Humber Valley Village"
6,M1B,Scarborough,"Malvern, Rouge"
7,M3B,North York,Don Mills
8,M4B,East York,"Parkview Hill, Woodbine Gardens"
9,M5B,Downtown Toronto,"Garden District, Ryerson"


* *More than one neighborhood can exist in one postal code area. For example, in the table on the Wikipedia page, you will notice that M5A is listed twice and has two neighborhoods: Harbourfront and Regent Park. These two rows will be combined into one row with the neighborhoods separated with a comma as shown in row 11 in the above table*

First, I will check how many rows there are per Postal Code in the dataframe:

In [4]:
postal_codes_df[['PostalCode','Neighbourhood']].groupby('PostalCode').count().sort_values(by="Neighbourhood", ascending=False)


Unnamed: 0_level_0,Neighbourhood
PostalCode,Unnamed: 1_level_1
M1B,1
M5R,1
M6G,1
M6E,1
M6C,1
...,...
M3L,1
M3K,1
M3J,1
M3H,1


No PostalCode has its Neighbourhoods spread over multiple rows; all of them are already combined in one row, separated by commas.
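
Had the table listed a postal code on several rows, the rows could be combined with a `groupby`. A sketch on made-up rows (the duplicated `M5A` entries are hypothetical and only mimic the layout described in the requirement):

```python
import pandas as pd

# Hypothetical rows: M5A appears twice, once per neighbourhood
toy_df = pd.DataFrame({
    'PostalCode': ['M5A', 'M5A', 'M7A'],
    'Borough': ['Downtown Toronto', 'Downtown Toronto', 'Downtown Toronto'],
    'Neighbourhood': ['Harbourfront', 'Regent Park', "Queen's Park"],
})

# Join the neighbourhoods of each (PostalCode, Borough) pair into one comma-separated string
combined_df = (
    toy_df.groupby(['PostalCode', 'Borough'])['Neighbourhood']
    .agg(', '.join)
    .reset_index()
)
print(combined_df)
```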

* *If a cell has a borough but a Not assigned neighborhood, then the neighborhood will be the same as the borough.*

As all the empty and "Not assigned" values were previously converted to NaN, let's check how many NaN values there are in the Neighbourhood column:

In [5]:
# NaN never compares equal to anything, so count missing values with isna() rather than == np.nan
empties_in_neighbourhood = postal_codes_df['Neighbourhood'].isna().sum()
print('Number of Empty values in the Neighbourhood column: {}'.format(empties_in_neighbourhood))

Number of Empty values in the Neighbourhood column: 0


No neighbourhood has an empty or NaN value, so no row needs to be filled with its borough name.
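
Note that `==` cannot detect missing values, because NaN is never equal to anything, itself included; `isna()` is the reliable test. A small sketch on a toy frame (the rows are hypothetical) that also shows how the requirement could be met with `fillna` if such rows did appear:

```python
import numpy as np
import pandas as pd

# Hypothetical rows: the second one has a borough but no neighbourhood
toy = pd.DataFrame({
    'PostalCode': ['M5A', 'M7A'],
    'Borough': ['Downtown Toronto', "Queen's Park"],
    'Neighbourhood': ['Regent Park, Harbourfront', np.nan],
})

eq_count = int((toy['Neighbourhood'] == np.nan).sum())   # comparison with NaN is always False
isna_count = int(toy['Neighbourhood'].isna().sum())      # counts the missing value

# Use the borough name wherever the neighbourhood is missing
toy['Neighbourhood'] = toy['Neighbourhood'].fillna(toy['Borough'])
print(eq_count, isna_count)
```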

* *In the last cell of your notebook, use the .shape method to print the number of rows of your dataframe.*

In [6]:
postal_codes_df.shape

(103, 3)

## Step 4: Get the Coordinates of each Postal Code

The process here is to load the provided CSV file into a Pandas Dataframe and then combine it with the neighbourhoods dataframe obtained in the previous steps.

In [7]:
#obtain the coordinates data
coordinates_df = pd.read_csv('http://cocl.us/Geospatial_data')
coordinates_df.shape

(103, 3)

In [8]:
# rename the column 'Postal Code' in coordinates_df to 'PostalCode' to be able to do the merge
coordinates_df.rename(columns={'Postal Code': 'PostalCode'}, inplace=True) 

# combine the coordinates with the postal codes dataframe:
postalcodeswithcoordinates_df = pd.merge(postal_codes_df, coordinates_df, on = 'PostalCode')
print(postalcodeswithcoordinates_df.shape)
postalcodeswithcoordinates_df.head()



(103, 5)


Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494
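
An inner merge silently drops postal codes that are missing from either side. Here the shape stays at 103 rows, so nothing was lost. One way to make that check explicit is a left merge with `indicator=True`; the sketch below uses toy frames with hypothetical values, one of which deliberately has no coordinates:

```python
import pandas as pd

# Toy frames: 'M9Z' is a made-up code with no coordinates
codes = pd.DataFrame({'PostalCode': ['M3A', 'M4A', 'M9Z'],
                      'Borough': ['North York', 'North York', 'Nowhere']})
coords = pd.DataFrame({'PostalCode': ['M3A', 'M4A'],
                       'Latitude': [43.753259, 43.725882],
                       'Longitude': [-79.329656, -79.315572]})

# Keep every row of `codes` and record where each row came from in the `_merge` column
checked = pd.merge(codes, coords, on='PostalCode', how='left', indicator=True)
unmatched = checked.loc[checked['_merge'] == 'left_only', 'PostalCode'].tolist()
print(unmatched)
```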


## Step 4 (continued): Display the Neighbourhoods on a map
As suggested, we are going to use only the boroughs with the word "Toronto" in their name.

In [9]:
toronto_df = postalcodeswithcoordinates_df[postalcodeswithcoordinates_df['Borough'].str.contains("Toronto")]
toronto_df.reset_index(drop=True, inplace=True)
toronto_df.head()


Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude
0,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
1,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494
2,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937
3,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418
4,M4E,East Toronto,The Beaches,43.676357,-79.293031


Next, we obtain the coordinates of Toronto to create a map with the Neighbourhoods superimposed on top.

In [10]:
address = 'Toronto, Ontario'

geolocator = Nominatim(user_agent="tr_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.6534817, -79.3839347.


The next cell contains the code that creates the map of Toronto with all the Neighbourhoods on it:

In [11]:

map_toronto = folium.Map(location=[latitude, longitude], zoom_start=13)

# add markers to map
for lat, lng, postalcode, neighbourhood in zip(toronto_df['Latitude'], toronto_df['Longitude'], toronto_df['PostalCode'], toronto_df['Neighbourhood']):
    label = '{}, {}'.format(neighbourhood, postalcode)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

## Step 5: Obtain the nearby venues for each Postal Code
Now we are going to use the Foursquare API to get the nearby venues for each Postal Code.
First we define the Foursquare credentials and then create a function to get the nearby venues of a Postal Code.

In [12]:
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request and keep the list of recommended items
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # keep only the relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    # flatten the per-postal-code lists into a single dataframe
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['PostalCode', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues

Your credentials:
CLIENT_ID: YOUR_FOURSQUARE_CLIENT_ID
CLIENT_SECRET:YOUR_FOURSQUARE_CLIENT_SECRET


The next piece of code goes through each of the postal codes in the Toronto dataframe and obtains the nearby venues, which are stored in a new dataframe called *toronto_venues*.

In [13]:

toronto_venues = getNearbyVenues(names=toronto_df['PostalCode'],
                                   latitudes=toronto_df['Latitude'],
                                   longitudes=toronto_df['Longitude']
                                  )

KeyError: 'groups'
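
The `KeyError: 'groups'` means that the `response` object in the returned JSON had no `groups` entry. This usually indicates that the API rejected the request, for example because of invalid credentials or an exhausted quota (an assumption; the actual reason is not shown in the error). A hedged sketch of a helper, `extract_items` (a hypothetical name, not part of the notebook), that returns an empty list instead of raising; the payloads are hand-written stand-ins for `requests.get(url).json()`:

```python
def extract_items(payload):
    """Return the venue items of an explore-style payload, or [] if they are absent."""
    groups = payload.get('response', {}).get('groups', [])
    if not groups:
        return []
    return groups[0].get('items', [])

# Hand-written stand-ins, not real API responses
ok_payload = {'response': {'groups': [{'items': [{'venue': {'name': 'Some Cafe'}}]}]}}
error_payload = {'response': {}}

print(len(extract_items(ok_payload)))
print(extract_items(error_payload))
```

Returning an empty list keeps the loop over postal codes running, at the cost of hiding the failure; logging the raw payload when `groups` is missing would make the cause visible.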

Now it's time to verify how many venues were obtained from Foursquare and how many unique venue categories we have:

In [14]:
print('There are {} venues in Toronto.'.format(toronto_venues.shape[0]))
print('There are {} unique categories.'.format(len(toronto_venues['Venue Category'].unique())))
toronto_venues.head()

NameError: name 'toronto_venues' is not defined

## Step 6: Analysis of each Post Code
The first thing is to build the dummies matrix of PostalCode versus venue category:


In [15]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add Post Code column back to dataframe
toronto_onehot['PostalCode'] = toronto_venues['PostalCode'] 

# move PostalCode column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.head()

NameError: name 'toronto_venues' is not defined

Now we group the matrix by PostalCode, so that each row holds the number of venues of each category found for one PostalCode:

In [16]:
toronto_grouped = toronto_onehot.groupby('PostalCode').sum().reset_index()
print(toronto_grouped.shape)
toronto_grouped

(39, 238)


Unnamed: 0,PostalCode,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,...,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,M4E,0,0,0,0,0,0,0,0,0,...,1,0,0,0,0,0,0,0,0,0
1,M4K,0,0,0,0,0,0,0,1,0,...,1,0,0,0,0,0,0,0,0,1
2,M4L,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,M4M,0,0,0,0,0,0,0,2,0,...,0,0,0,0,0,1,0,0,0,1
4,M4N,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
5,M4P,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
6,M4R,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,1
7,M4S,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
8,M4T,0,0,0,0,0,0,0,0,0,...,1,0,0,0,0,0,0,0,0,0
9,M4V,0,0,0,0,0,0,0,1,0,...,0,0,0,0,1,0,0,0,0,0
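
The one-hot encoding and grouping can be illustrated on a tiny stand-in for `toronto_venues` (the venues below are made up). Summing the dummy columns yields venue counts per postal code; taking the mean instead would yield relative frequencies, which is a common alternative when postal codes have very different numbers of venues:

```python
import pandas as pd

# Made-up venues for two postal codes
toy_venues = pd.DataFrame({
    'PostalCode': ['M4E', 'M4E', 'M4E', 'M4K'],
    'Venue Category': ['Trail', 'Pub', 'Trail', 'Greek Restaurant'],
})

# One column per category, one row per venue
toy_onehot = pd.get_dummies(toy_venues[['Venue Category']], prefix="", prefix_sep="")
toy_onehot.insert(0, 'PostalCode', toy_venues['PostalCode'])

toy_counts = toy_onehot.groupby('PostalCode').sum().reset_index()   # venue counts
toy_freq = toy_onehot.groupby('PostalCode').mean().reset_index()    # relative frequencies
print(toy_counts)
```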


For each Postal Code, we obtain the 10 most common venue types, in order, and store them in a new dataframe, *neighborhoods_venues_sorted*. To do that, we first define a function that returns the venue types of a given row ordered from most common to least common. Then we call the function for each row of the *toronto_grouped* dataframe.

In [17]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['PostalCode']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['PostalCode'] = toronto_grouped['PostalCode']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,PostalCode,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M4E,Trail,Neighborhood,Pub,Health Food Store,Yoga Studio,Department Store,Dessert Shop,Diner,Discount Store,Distribution Center
1,M4K,Greek Restaurant,Italian Restaurant,Coffee Shop,Restaurant,Ice Cream Shop,Furniture / Home Store,Fruit & Vegetable Store,Pub,Pizza Place,Lounge
2,M4L,Fast Food Restaurant,Park,Pub,Sandwich Place,Burrito Place,Restaurant,Italian Restaurant,Intersection,Fish & Chips Shop,Steakhouse
3,M4M,Café,Coffee Shop,Gastropub,Bakery,Brewery,American Restaurant,Yoga Studio,Comfort Food Restaurant,Seafood Restaurant,Sandwich Place
4,M4N,Park,Swim School,Bus Line,Event Space,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant,Dog Run
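
The column labels and the per-row ranking can be sketched on a single toy row. `ordinal` is a hypothetical helper that picks the suffix explicitly rather than relying on an exception; the counts are made up and deliberately free of ties, since the order of tied categories after sorting is not guaranteed (which is why rows with few venues above end in what looks like an arbitrary run of zero-count categories):

```python
import pandas as pd

def ordinal(n):
    """Return 1st, 2nd, 3rd, 4th, ... for n >= 1 (11th to 13th handled explicitly)."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

# Made-up counts for one postal code
toy_row = pd.Series({'PostalCode': 'M4E', 'Pub': 2, 'Trail': 5, 'Diner': 1, 'Park': 3})

# Drop the PostalCode entry, sort the counts in descending order, keep the top 3 names
counts = toy_row.iloc[1:].astype(int)
top3 = counts.sort_values(ascending=False).index.values[:3]

labels = ['{} Most Common Venue'.format(ordinal(i)) for i in range(1, 4)]
print(dict(zip(labels, top3)))
```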


## Step 7: Identify similar Neighbourhoods -> Clustering

We will create clusters using the dummies matrix obtained before (without the PostalCode column) and the KMeans class. We then add the cluster labels to the *neighborhoods_venues_sorted* dataframe.

In [18]:
# set number of clusters
kclusters = 5

toronto_grouped_clustering = toronto_grouped.drop(columns='PostalCode')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

toronto_merged = toronto_df

# join toronto_df with neighborhoods_venues_sorted so that each postal code has its coordinates, cluster label and top venues
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('PostalCode'), on='PostalCode')

toronto_merged.head() # check the last columns!

Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,4,Coffee Shop,Park,Bakery,Pub,Breakfast Spot,Restaurant,Café,Theater,Mexican Restaurant,Shoe Store
1,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,4,Coffee Shop,Sushi Restaurant,Bar,Beer Bar,Smoothie Shop,Sandwich Place,Burrito Place,Café,Park,College Auditorium
2,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937,3,Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Bubble Tea Shop,Japanese Restaurant,Italian Restaurant,Middle Eastern Restaurant,Theater
3,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418,0,Café,Coffee Shop,Cocktail Bar,Gastropub,American Restaurant,Moroccan Restaurant,Creperie,Department Store,Lingerie Store,Italian Restaurant
4,M4E,East Toronto,The Beaches,43.676357,-79.293031,2,Trail,Neighborhood,Pub,Health Food Store,Yoga Studio,Department Store,Dessert Shop,Diner,Discount Store,Distribution Center
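
How `KMeans` assigns labels can be seen on a toy count matrix (values made up): rows with similar venue profiles receive the same label. The label numbers themselves are arbitrary, so only equality between labels is meaningful:

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up venue counts: rows 0-1 share one profile, rows 2-3 another
toy_matrix = np.array([
    [5, 4, 0, 0],
    [6, 5, 0, 1],
    [0, 0, 4, 5],
    [0, 1, 5, 6],
])

toy_kmeans = KMeans(n_clusters=2, random_state=0, n_init=10).fit(toy_matrix)
toy_labels = toy_kmeans.labels_
print(toy_labels)
```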


Now a map is created with the Postal Codes shown as markers colored according to their cluster labels.

In [19]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['PostalCode'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### Examine Clusters


Cluster 1

In [20]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,St. James Town,0,Café,Coffee Shop,Cocktail Bar,Gastropub,American Restaurant,Moroccan Restaurant,Creperie,Department Store,Lingerie Store,Italian Restaurant
5,Berczy Park,0,Coffee Shop,Cocktail Bar,Beer Bar,Bakery,Seafood Restaurant,Cheese Shop,Café,Restaurant,Grocery Store,Japanese Restaurant
11,"Little Portugal, Trinity",0,Bar,Coffee Shop,Restaurant,Asian Restaurant,Café,Vegetarian / Vegan Restaurant,Men's Store,Cuban Restaurant,Brewery,Record Shop
12,"The Danforth West, Riverdale",0,Greek Restaurant,Italian Restaurant,Coffee Shop,Restaurant,Ice Cream Shop,Furniture / Home Store,Fruit & Vegetable Store,Pub,Pizza Place,Lounge
14,"Brockton, Parkdale Village, Exhibition Place",0,Café,Coffee Shop,Breakfast Spot,Grocery Store,Stadium,Burrito Place,Restaurant,Climbing Gym,Pet Store,Bakery
17,Studio District,0,Café,Coffee Shop,Gastropub,Bakery,Brewery,American Restaurant,Yoga Studio,Comfort Food Restaurant,Seafood Restaurant,Sandwich Place
26,Davisville,0,Dessert Shop,Sandwich Place,Pizza Place,Sushi Restaurant,Italian Restaurant,Coffee Shop,Gym,Café,Asian Restaurant,Seafood Restaurant
27,"University of Toronto, Harbord",0,Café,Bakery,Bookstore,Bar,Italian Restaurant,Japanese Restaurant,Restaurant,Bank,Beer Bar,Beer Store
28,"Runnymede, Swansea",0,Café,Coffee Shop,Pizza Place,Pub,Bookstore,Sushi Restaurant,Italian Restaurant,Yoga Studio,Gym,Restaurant
30,"Kensington Market, Chinatown, Grange Park",0,Café,Mexican Restaurant,Bakery,Vietnamese Restaurant,Coffee Shop,Grocery Store,Dessert Shop,Bar,Vegetarian / Vegan Restaurant,Gaming Cafe


Cluster 2

In [21]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
8,"Richmond, Adelaide, King",1,Coffee Shop,Café,Restaurant,Clothing Store,Deli / Bodega,Hotel,Gym,Thai Restaurant,Bookstore,Sushi Restaurant
13,"Toronto Dominion Centre, Design Exchange",1,Coffee Shop,Café,Hotel,Seafood Restaurant,Japanese Restaurant,Salad Place,Italian Restaurant,Restaurant,American Restaurant,Concert Hall
16,"Commerce Court, Victoria Hotel",1,Coffee Shop,Café,Restaurant,Hotel,Gym,American Restaurant,Deli / Bodega,Seafood Restaurant,Italian Restaurant,Japanese Restaurant
36,"First Canadian Place, Underground city",1,Coffee Shop,Café,Japanese Restaurant,Restaurant,Gym,Hotel,Steakhouse,Salad Place,Seafood Restaurant,Deli / Bodega


Cluster 3

In [22]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,The Beaches,2,Trail,Neighborhood,Pub,Health Food Store,Yoga Studio,Department Store,Dessert Shop,Diner,Discount Store,Distribution Center
7,Christie,2,Grocery Store,Café,Park,Baby Store,Restaurant,Italian Restaurant,Athletics & Sports,Coffee Shop,Nightclub,Candy Store
9,"Dufferin, Dovercourt Village",2,Bakery,Pharmacy,Grocery Store,Supermarket,Bank,Bar,Café,Pool,Music Venue,Brewery
15,"India Bazaar, The Beaches West",2,Fast Food Restaurant,Park,Pub,Sandwich Place,Burrito Place,Restaurant,Italian Restaurant,Intersection,Fish & Chips Shop,Steakhouse
18,Lawrence Park,2,Park,Swim School,Bus Line,Event Space,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant,Dog Run
19,Roselawn,2,Garden,Yoga Studio,Deli / Bodega,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant,Dog Run
20,Davisville North,2,Park,Breakfast Spot,Hotel,Food & Drink Shop,Sandwich Place,Department Store,Gym,Donut Shop,Doner Restaurant,Deli / Bodega
21,"Forest Hill North & West, Forest Hill Road Park",2,Park,Jewelry Store,Trail,Sushi Restaurant,Deli / Bodega,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant
22,"High Park, The Junction South",2,Mexican Restaurant,Café,Thai Restaurant,Diner,Bakery,Flea Market,Italian Restaurant,Cajun / Creole Restaurant,Speakeasy,Fried Chicken Joint
23,"North Toronto West, Lawrence Park",2,Clothing Store,Coffee Shop,Park,Salon / Barbershop,Restaurant,Rental Car Location,Café,Chinese Restaurant,Miscellaneous Shop,Sporting Goods Shop


Cluster 4

In [23]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,"Garden District, Ryerson",3,Clothing Store,Coffee Shop,Restaurant,Café,Cosmetics Shop,Bubble Tea Shop,Japanese Restaurant,Italian Restaurant,Middle Eastern Restaurant,Theater


Cluster 5

In [24]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Regent Park, Harbourfront",4,Coffee Shop,Park,Bakery,Pub,Breakfast Spot,Restaurant,Café,Theater,Mexican Restaurant,Shoe Store
1,"Queen's Park, Ontario Provincial Government",4,Coffee Shop,Sushi Restaurant,Bar,Beer Bar,Smoothie Shop,Sandwich Place,Burrito Place,Café,Park,College Auditorium
6,Central Bay Street,4,Coffee Shop,Italian Restaurant,Sandwich Place,Café,Ice Cream Shop,Burger Joint,Dessert Shop,Salad Place,Japanese Restaurant,Thai Restaurant
10,"Harbourfront East, Union Station, Toronto Islands",4,Coffee Shop,Aquarium,Hotel,Café,Sporting Goods Shop,Brewery,Restaurant,Italian Restaurant,Scenic Lookout,Fried Chicken Joint
34,Stn A PO Boxes,4,Coffee Shop,Café,Japanese Restaurant,Cocktail Bar,Restaurant,Italian Restaurant,Beer Bar,Seafood Restaurant,Breakfast Spot,Creperie
37,Church and Wellesley,4,Coffee Shop,Japanese Restaurant,Sushi Restaurant,Restaurant,Gay Bar,Pub,Men's Store,Mediterranean Restaurant,Hotel,Gastropub
