<h2>Segmenting and Clustering Neighborhoods in Toronto</h2>

<h4>Importing libraries</h4>

Here we import all the libraries needed for the capstone project.

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


<h4>Scraping the Wikipedia webpage</h4>

Using the BeautifulSoup library, we scrape the Wikipedia page and extract the table we need.

In [2]:
import requests
website_url = requests.get('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M').text

from bs4 import BeautifulSoup
soup = BeautifulSoup(website_url,'lxml')

My_table = soup.find('table',{'class':'wikitable sortable'})

<h4>Finding special characters in the HTML code for creating the dataframe</h4>

Once the HTML has been extracted from the Wikipedia page, we scan each table cell for the delimiter characters (`<` and `>`) that surround the text we want to collect.

In [3]:
rows = My_table.findAll('td')
rowslink = My_table.findAll('a')
results=[]
for row in rows:
    aux=str(row)
    if len(aux.split('<'))<4: 
        results.append(aux.split('>')[1].split('<')[0])
    else:
        results.append(aux.split('>')[2].split('<')[0])
results2=[]
for result in results:
    if result[-1]=='\n':
        result=result[:-1]
    results2.append(result)
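
The string splitting above depends on the exact tag nesting inside each cell. As a rough alternative sketch, the standard-library `html.parser` module can collect the text of every `<td>` cell regardless of nested links. The `CellTextParser` class and the sample HTML are illustrative assumptions, not part of the original notebook.

```python
from html.parser import HTMLParser


class CellTextParser(HTMLParser):
    """Collect the visible text of each <td> cell (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.cells = []        # finished cell texts
        self._buffer = None    # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == 'td':
            self._buffer = []

    def handle_data(self, data):
        # Only keep text that appears inside a <td> element
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == 'td' and self._buffer is not None:
            self.cells.append(''.join(self._buffer).strip())
            self._buffer = None


# Made-up sample that mimics the structure of the Wikipedia table
sample_html = """
<table>
<tr><td>M1A</td><td>Not assigned</td><td>Not assigned
</td></tr>
<tr><td>M3A</td><td><a href="/wiki/North_York">North York</a></td><td><a href="/wiki/Parkwoods">Parkwoods</a>
</td></tr>
</table>
"""

parser = CellTextParser()
parser.feed(sample_html)
cells = parser.cells
# Group the flat list into (PostalCode, Borough, Neighborhood) triples
table_rows = [tuple(cells[i:i + 3]) for i in range(0, len(cells), 3)]
print(table_rows)
```

Because the parser strips whitespace from each cell, the separate pass that removes trailing newlines is not needed in this variant.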

<h4>Creating the dataframe</h4>

In this step we create the dataframe and assign the column names. 

In [4]:
import pandas as pd
df = pd.DataFrame()
df['PostalCode'] = results2[0::3]
df['Borough']=results2[1::3]
df['Neighborhood']=results2[2::3]

<h4>Cleaning the data</h4>

Here the data is cleaned and restructured. All the conditions set out in the assignment are met.

In [5]:
# Re-run the dataframe creation cell before re-running this one, so neighborhoods are not appended twice.
df = df[~df.Borough.str.contains("Not assigned")].copy()
trows = df.index.values
# Use the borough name when no neighborhood is assigned
for i in trows:
    if df.loc[i, 'Neighborhood'] == 'Not assigned':
        df.loc[i, 'Neighborhood'] = df.loc[i, 'Borough']
# Append the neighborhoods of consecutive rows that share a postal code
for i in range(0, len(trows)-1):
    if df.loc[trows[i], 'PostalCode'] == df.loc[trows[i+1], 'PostalCode']:
        df.loc[trows[i+1], 'Neighborhood'] = df.loc[trows[i+1], 'Neighborhood'] + ',' + df.loc[trows[i], 'Neighborhood']
# Keep the last row per postal code, which holds the combined neighborhood list
df2 = df.drop_duplicates(subset=['PostalCode'], keep='last', inplace=False)
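
The three cleaning rules (drop rows with an unassigned borough, fall back to the borough name when the neighborhood is unassigned, and combine neighborhoods that share a postal code) can also be sketched with `groupby`. The `raw` frame below is invented sample data, and unlike the loop above this version joins the neighborhoods in their original row order.

```python
import pandas as pd

# Invented sample rows in the same three-column layout
raw = pd.DataFrame({
    'PostalCode': ['M1A', 'M5A', 'M5A', 'M7A'],
    'Borough': ['Not assigned', 'Downtown Toronto', 'Downtown Toronto', "Queen's Park"],
    'Neighborhood': ['Not assigned', 'Harbourfront', 'Regent Park', 'Not assigned'],
})

# 1. Keep only rows with an assigned borough
clean = raw[raw['Borough'] != 'Not assigned'].copy()

# 2. Use the borough name where the neighborhood is not assigned
unassigned = clean['Neighborhood'] == 'Not assigned'
clean.loc[unassigned, 'Neighborhood'] = clean.loc[unassigned, 'Borough']

# 3. One row per postal code, with neighborhoods joined by commas
combined = (clean.groupby(['PostalCode', 'Borough'])['Neighborhood']
                 .agg(','.join)
                 .reset_index())
print(combined)
```

Grouping does not require duplicate postal codes to sit on consecutive rows, which the pairwise comparison in the loop relies on.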

In [6]:
df2.shape

(103, 3)

<h4>Importing coordinates with CSV file</h4>

Finally, we import the coordinates from the provided CSV file.

In [7]:
dfcoord = pd.read_csv('https://cocl.us/Geospatial_data')

In [8]:
dfcoord.shape

(103, 3)

<h4>Sorting data for merging dataframes</h4>

Because the dataframes are combined by row position, we first sort the postal-code data so that its rows line up with the coordinate rows.

In [9]:
df2=df2.sort_values('PostalCode')

In [10]:
df2.index = range(len(df2))

In [11]:
df_row_merged = pd.concat([df2, dfcoord], axis=1)

In [12]:
df_row_merged=df_row_merged.drop(['Postal Code'], axis=1)
df_row_merged.head(15)

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M1B,Scarborough,"Malvern,Rouge",43.806686,-79.194353
1,M1C,Scarborough,"Port Union,Rouge Hill,Highland Creek",43.784535,-79.160497
2,M1E,Scarborough,"West Hill,Morningside,Guildwood",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
5,M1J,Scarborough,Scarborough Village,43.744734,-79.239476
6,M1K,Scarborough,"Kennedy Park,Ionview,East Birchmount Park",43.727929,-79.262029
7,M1L,Scarborough,"Oakridge,Golden Mile,Clairlea",43.711112,-79.284577
8,M1M,Scarborough,"Scarborough Village West,Cliffside,Cliffcrest",43.716316,-79.239476
9,M1N,Scarborough,"Cliffside West,Birch Cliff",43.692657,-79.264848
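
Combining by position only works while both frames hold the same postal codes in the same order. A minimal sketch of a key-based alternative, assuming the coordinate file uses the `Postal Code` column dropped above, matches rows by value instead. The two small frames are sample data built from the first rows shown in the output.

```python
import pandas as pd

# Sample frames; the neighborhood rows are deliberately out of order
neighborhoods = pd.DataFrame({
    'PostalCode': ['M1C', 'M1B'],
    'Borough': ['Scarborough', 'Scarborough'],
    'Neighborhood': ['Port Union,Rouge Hill,Highland Creek', 'Malvern,Rouge'],
})
coords = pd.DataFrame({
    'Postal Code': ['M1B', 'M1C'],
    'Latitude': [43.806686, 43.784535],
    'Longitude': [-79.194353, -79.160497],
})

# Match rows by postal code, whatever their order in either frame
merged = (neighborhoods
          .merge(coords, left_on='PostalCode', right_on='Postal Code', how='left')
          .drop(columns='Postal Code'))
print(merged)
```

With `how='left'`, a postal code missing from the coordinate file would show up as a row with empty coordinates rather than silently shifting the alignment.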


<h4>Toronto Map</h4>

Create a map of Toronto with neighborhoods superimposed on top.

In [13]:
import folium
latitude=43.7001100
longitude=-79.4163000
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)

In [14]:
Scarborough_data = df_row_merged[df_row_merged['Borough'] == 'Scarborough'].reset_index(drop=True)

In [15]:
for lat, lng, borough, neighborhood in zip(df_row_merged['Latitude'], df_row_merged['Longitude'], df_row_merged['Borough'], df_row_merged['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto    

<h4>Foursquare API</h4>

Foursquare credentials

In [16]:
CLIENT_ID = 'TP34TJOS4I01014XG2B44OPU4MDDVGUYXR232QOFLWAFHAMN' # your Foursquare ID
CLIENT_SECRET = '2MWGF1OZW4ZZGQ1IGCZTLMT2GAYJY0GDYWXCKC1WYEG4KPRO' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
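
Credentials written directly into a shared notebook are visible to every reader. One common alternative, sketched here under the assumption that the values are exported as environment variables, is to read them at run time. The function name and the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` are hypothetical.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret) read from environment variables.

    The variable names are hypothetical; use whatever fits your setup.
    """
    client_id = env.get('FOURSQUARE_CLIENT_ID')
    client_secret = env.get('FOURSQUARE_CLIENT_SECRET')
    if not client_id or not client_secret:
        raise RuntimeError('Foursquare credentials are not set in the environment')
    return client_id, client_secret


# Example with a plain dict standing in for os.environ
demo_env = {'FOURSQUARE_CLIENT_ID': 'demo-id', 'FOURSQUARE_CLIENT_SECRET': 'demo-secret'}
demo_id, demo_secret = load_foursquare_credentials(demo_env)
print(demo_id)
```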

In [17]:
import random
df_row_merged.loc[0, 'Neighborhood']
nei=random.randint(0, 102)
neighborhood_latitude = df_row_merged.loc[nei, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = df_row_merged.loc[nei, 'Longitude'] # neighborhood longitude value

neighborhood_name = df_row_merged.loc[nei, 'Neighborhood'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Summerhill West,South Hill,Rathnelly,Forest Hill SE,Deer Park are 43.68641229999999, -79.4000493.


In [18]:
LIMIT = 100 # limit of number of venues returned by Foursquare API

radius = 500 # define radius

url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)
url # display URL

'https://api.foursquare.com/v2/venues/explore?&client_id=TP34TJOS4I01014XG2B44OPU4MDDVGUYXR232QOFLWAFHAMN&client_secret=2MWGF1OZW4ZZGQ1IGCZTLMT2GAYJY0GDYWXCKC1WYEG4KPRO&v=20180605&ll=43.68641229999999,-79.4000493&radius=500&limit=100'
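
Formatting the query string by hand works for these values, but it leaves encoding to the caller. As a small sketch, the standard-library `urlencode` function can assemble the same parameters. The `build_explore_url` helper and the demo credentials are assumptions for illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

EXPLORE_ENDPOINT = 'https://api.foursquare.com/v2/venues/explore'


def build_explore_url(client_id, client_secret, version, lat, lng, radius=500, limit=100):
    """Assemble the explore URL with encoded query parameters (hypothetical helper)."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    return '{}?{}'.format(EXPLORE_ENDPOINT, urlencode(params))


demo_url = build_explore_url('demo-id', 'demo-secret', '20180605', 43.6864123, -79.4000493)
print(demo_url)

# Decode the query string again to check what was sent
demo_query = parse_qs(urlsplit(demo_url).query)
print(demo_query['ll'])
```

Note that `urlencode` percent-encodes the comma in the `ll` value. It decodes back to the same latitude and longitude pair.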

In [19]:
results = requests.get(url).json()
print(results)

{'meta': {'code': 200, 'requestId': '5c4fb9931ed2193b47ae58a2'}, 'response': {'suggestedFilters': {'header': 'Tap to show:', 'filters': [{'name': 'Open now', 'key': 'openNow'}]}, 'headerLocation': 'Deer Park', 'headerFullLocation': 'Deer Park, Toronto', 'headerLocationGranularity': 'neighborhood', 'totalResults': 14, 'suggestedBounds': {'ne': {'lat': 43.690912304499996, 'lng': -79.39383797359734}, 'sw': {'lat': 43.68191229549999, 'lng': -79.40626062640267}}, 'groups': [{'type': 'Recommended Places', 'name': 'recommended', 'items': [{'reasons': {'count': 0, 'items': [{'summary': 'This spot is popular', 'type': 'general', 'reasonName': 'globalInteractionReason'}]}, 'venue': {'id': '55c78cef498ec4095e9fba41', 'name': 'LCBO', 'location': {'address': '111 St. Clair West', 'lat': 43.686990631074885, 'lng': -79.39923810519545, 'labeledLatLngs': [{'label': 'display', 'lat': 43.686990631074885, 'lng': -79.39923810519545}], 'distance': 91, 'cc': 'CA', 'city': 'Toronto', 'state': 'ON', 'country':

<h4>Exploring Toronto</h4>

Reuse the exploration queries from the New York analysis to explore one randomly chosen neighborhood in Toronto.

In [20]:
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [21]:
venues = results['response']['groups'][0]['items']
nearby_venues = json_normalize(venues)
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues =nearby_venues.loc[:, filtered_columns]
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

In [22]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

14 venues were returned by Foursquare.


<h4>Explore all Neighborhoods</h4>

Reuse the exploration queries from the New York analysis to explore all neighborhoods in Toronto.

In [23]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
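
The function above indexes `categories[0]` directly, so a venue with no categories or a response without groups would raise an error partway through the loop. A defensive variant of the extraction step might look like the sketch below. The `extract_venue_rows` helper and the sample payload, including the category name, are invented for illustration and only mimic the shape of the response printed earlier.

```python
def extract_venue_rows(payload, neighborhood):
    """Turn one explore-style response into (neighborhood, name, lat, lng, category) tuples.

    Missing groups yield an empty list, and venues without categories get None.
    """
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        category = categories[0]['name'] if categories else None
        rows.append((neighborhood,
                     venue['name'],
                     venue['location']['lat'],
                     venue['location']['lng'],
                     category))
    return rows


# Invented payload in the shape of an explore response
sample_payload = {
    'response': {'groups': [{'items': [
        {'venue': {'name': 'LCBO', 'location': {'lat': 43.68699, 'lng': -79.39924},
                   'categories': [{'name': 'Liquor Store'}]}},
        {'venue': {'name': 'Unnamed Spot', 'location': {'lat': 43.687, 'lng': -79.4},
                   'categories': []}},
    ]}]}
}
venue_rows = extract_venue_rows(sample_payload, 'Deer Park')
print(venue_rows)
```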

In [24]:
toronto_venues = getNearbyVenues(names=df_row_merged['Neighborhood'],
                                   latitudes=df_row_merged['Latitude'],
                                   longitudes=df_row_merged['Longitude']);

Malvern,Rouge
Port Union,Rouge Hill,Highland Creek
West Hill,Morningside,Guildwood
Woburn
Cedarbrae
Scarborough Village
Kennedy Park,Ionview,East Birchmount Park
Oakridge,Golden Mile,Clairlea
Scarborough Village West,Cliffside,Cliffcrest
Cliffside West,Birch Cliff
Wexford Heights,Scarborough Town Centre,Dorset Park
Wexford,Maryvale
Agincourt
Tam O'Shanter,Sullivan,Clarks Corners
Steeles East,Milliken,L'Amoreaux East,Agincourt North
Steeles West,L'Amoreaux West
Upper Rouge
Hillcrest Village
Oriole,Henry Farm,Fairview
Bayview Village
York Mills,Silver Hills
Willowdale,Newtonbrook
Willowdale South
York Mills West
Willowdale West
Parkwoods
Don Mills North
Don Mills South,Flemingdon Park
Wilson Heights,Downsview North,Bathurst Manor
York University,Northwood Park
Downsview East,CFB Toronto
Downsview West
Downsview Central
Downsview Northwest
Victoria Village
Parkview Hill,Woodbine Gardens
Woodbine Heights
The Beaches
Leaside
Thorncliffe Park
East Toronto
Riverdale,The Danforth West
India Ba

In [103]:
print(toronto_venues.shape)
aux=toronto_venues.Neighborhood.unique()
aux2=toronto_venues.drop_duplicates(subset='Neighborhood', keep='first', inplace=False)
aux2=aux2.reset_index(drop=True)
aux2=aux2.drop(['Venue', 'Venue Latitude', 'Venue Longitude', 'Venue Category'], axis=1)
aux2

(2233, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude
0,"Malvern,Rouge",43.806686,-79.194353
1,"Port Union,Rouge Hill,Highland Creek",43.784535,-79.160497
2,"West Hill,Morningside,Guildwood",43.763573,-79.188711
3,Woburn,43.770992,-79.216917
4,Cedarbrae,43.773136,-79.239476
5,Scarborough Village,43.744734,-79.239476
6,"Kennedy Park,Ionview,East Birchmount Park",43.727929,-79.262029
7,"Oakridge,Golden Mile,Clairlea",43.711112,-79.284577
8,"Scarborough Village West,Cliffside,Cliffcrest",43.716316,-79.239476
9,"Cliffside West,Birch Cliff",43.692657,-79.264848


In [26]:
toronto_venues.groupby('Neighborhood').count();

In [27]:
print('There are {} unique categories.'.format(len(toronto_venues['Venue Category'].unique())))

There are 274 unique categories.


<h4>Analyze Each Neighborhood</h4>

Find the 10 most common venues in each neighborhood

In [41]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
toronto_onehot['Neighborhood'] = toronto_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

In [42]:
toronto_onehot.shape

(2233, 274)

In [83]:
toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()
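
Averaging the one-hot columns per neighborhood gives each category's share of that neighborhood's venues, which is what the later ranking sorts on. A toy sketch with invented data makes the arithmetic visible.

```python
import pandas as pd

# Invented venue list for two neighborhoods
toy_venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Park', 'Park', 'Café', 'Bank', 'Café', 'Café'],
})

# One indicator column per category
toy_onehot = pd.get_dummies(toy_venues['Venue Category'], dtype=float)
toy_onehot.insert(0, 'Neighborhood', toy_venues['Neighborhood'])

# The mean of each indicator column is the category's share of venues
toy_grouped = toy_onehot.groupby('Neighborhood').mean().reset_index()
print(toy_grouped)
```

Neighborhood A has four venues, two of them parks, so its `Park` value is 0.5. Neighborhood B has only cafés, so its `Café` value is 1.0.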

In [79]:
num_top_venues = 5

for hood in toronto_grouped['Neighborhood']:
    #print("----"+hood+"----")
    temp = toronto_grouped[toronto_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    #print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    #print('\n')

In [80]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [77]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Agincourt,Lounge,Clothing Store,Skating Rink,Breakfast Spot,Empanada Restaurant,Ethiopian Restaurant,Event Space,Electronics Store,Eastern European Restaurant,Dessert Shop
1,Bayview Village,Chinese Restaurant,Café,Japanese Restaurant,Bank,Women's Store,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore
2,Berczy Park,Coffee Shop,Restaurant,Cocktail Bar,Cheese Shop,Café,Seafood Restaurant,Bakery,Pub,Italian Restaurant,Farmers Market
3,Business Reply Mail Processing Centre 969 Eastern,Light Rail Station,Yoga Studio,Garden,Comic Shop,Pizza Place,Recording Studio,Restaurant,Burrito Place,Brewery,Skate Park
4,Caledonia-Fairbanks,Park,Women's Store,Fast Food Restaurant,Market,Pharmacy,Gourmet Shop,Golf Course,Grocery Store,Electronics Store,Eastern European Restaurant
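
The column headers above come from a three-item suffix list with a fallback to `th`, which is enough for ten columns. If more columns were ever needed, a general helper would also have to handle cases such as 11th and 21st. The `ordinal` function below is a hypothetical sketch of that idea.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'          # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)


column_names = ['Neighborhood'] + ['{} Most Common Venue'.format(ordinal(i))
                                   for i in range(1, 11)]
print(column_names[:4])
```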


<h4>Cluster Neighborhoods</h4>

Run *k*-means to cluster the neighborhoods into 5 clusters.

In [78]:
# set number of clusters
kclusters = 5

toronto_grouped_clustering = toronto_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_

array([0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 2, 2,
       0, 1, 0, 1, 4, 0, 0, 0, 0, 0, 0, 2, 0, 1, 0, 0, 0, 0, 0, 1, 2, 0,
       0, 1, 2, 0, 0, 0, 0, 0, 0, 2, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 2, 0,
       0, 0, 0, 0, 4, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 1, 0, 1, 1,
       2, 0, 0, 0, 1, 0, 0, 0, 2, 3, 0, 0], dtype=int32)
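
To illustrate what the clustering step does, here is a minimal pure-Python sketch of Lloyd's algorithm, the basic *k*-means iteration. It is not scikit-learn's implementation, which by default also uses a smarter initialisation. The `kmeans` function, the toy points and the starting centroids are all assumptions for illustration.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain Lloyd's algorithm: assign points, then move centroids to cluster means."""
    labels = []
    for _ in range(iterations):
        # Assignment step: label each point with its nearest centroid
        labels = [min(range(len(centroids)),
                      key=lambda c: math.dist(point, centroids[c]))
                  for point in points]
        # Update step: recompute each centroid as the mean of its members
        new_centroids = []
        for c, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(tuple(sum(axis) / len(members)
                                           for axis in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster where it was
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Two well-separated toy groups in 2-D
toy_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
              (9.0, 9.0), (9.0, 10.0), (10.0, 9.0)]
toy_labels, toy_centroids = kmeans(toy_points, [(0.0, 0.0), (10.0, 10.0)])
print(toy_labels)
print(toy_centroids)
```

In the notebook each point is a neighborhood's vector of category frequencies rather than a 2-D coordinate, but the assignment and update steps are the same.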

In [86]:
neighborhoods_venues_sorted.Neighborhood.unique()

array(['Agincourt', 'Bayview Village', 'Berczy Park',
       'Business Reply Mail Processing Centre 969 Eastern',
       'Caledonia-Fairbanks', 'Canada Post Gateway Processing Centre',
       'Cedarbrae', 'Central Bay Street', 'Christie',
       'Church and Wellesley', 'Cliffside West,Birch Cliff', 'Davisville',
       'Davisville North', 'Don Mills North',
       'Don Mills South,Flemingdon Park', 'Downsview Central',
       'Downsview East,CFB Toronto', 'Downsview Northwest',
       'Downsview West', 'Dufferin,Dovercourt Village', 'East Toronto',
       'Forest Hill West,Forest Hill North', 'Garden District,Ryerson',
       'Glencairn', 'Hillcrest Village', 'Humber Summit',
       'Humberlea,Emery', 'Humewood-Cedarvale',
       'India Bazaar,The Beaches West',
       'Kennedy Park,Ionview,East Birchmount Park',
       'Kensington Market,Grange Park,Chinatown',
       'Lawrence Manor East,Bedford Park',
       'Lawrence Manor,Lawrence Heights', 'Lawrence Park', 'Leaside',
       'Long

In [107]:
neighborhoods_venues_sorted=neighborhoods_venues_sorted.drop('Cluster Labels', axis=1)

In [108]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

toronto_merged = aux2
 
# join the cluster labels and top venues onto the neighborhood coordinates
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

In [110]:
# create map
import math
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Neighborhood Latitude'], toronto_merged['Neighborhood Longitude'], toronto_merged['Neighborhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster)-1],
        fill=True,
        fill_color=rainbow[int(cluster)-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

<h4>Examine Clusters</h4>

In [111]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood Latitude,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,43.806686,Drugstore,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant,Harbor / Marina
1,43.784535,Women's Store,Dumpling Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Eastern European Restaurant
3,43.770992,Korean Restaurant,Women's Store,Dumpling Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore
4,43.773136,Athletics & Sports,Fried Chicken Joint,Caribbean Restaurant,Thai Restaurant,Bakery,Bank,Discount Store,Dog Run,Doner Restaurant
5,43.744734,Playground,Drugstore,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Women's Store
6,43.727929,Chinese Restaurant,Department Store,Coffee Shop,Convenience Store,Train Station,Bus Station,Drugstore,Diner,Dog Run
7,43.711112,Bakery,Fast Food Restaurant,Intersection,Bus Station,Soccer Field,Park,Gluten-free Restaurant,Gift Shop,Empanada Restaurant
8,43.716316,American Restaurant,Women's Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore
9,43.692657,Café,Skating Rink,General Entertainment,Women's Store,Donut Shop,Diner,Discount Store,Dog Run,Doner Restaurant
10,43.75741,Chinese Restaurant,Latin American Restaurant,Furniture / Home Store,Pet Store,Vietnamese Restaurant,Comfort Food Restaurant,Dessert Shop,Ethiopian Restaurant,Empanada Restaurant


In [112]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood Latitude,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,43.763573,Rental Car Location,Electronics Store,Pizza Place,Mexican Restaurant,Breakfast Spot,Doner Restaurant,Diner,Discount Store,Dog Run
13,43.781638,Chinese Restaurant,Italian Restaurant,Thai Restaurant,Noodle House,Fried Chicken Joint,Fast Food Restaurant,Pharmacy,Eastern European Restaurant,Dumpling Restaurant
22,43.782736,Grocery Store,Coffee Shop,Pizza Place,Butcher,Dog Run,Department Store,Dessert Shop,Dim Sum Restaurant,Diner
32,43.725882,Coffee Shop,Pizza Place,Hockey Arena,Women's Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant
33,43.706397,Pizza Place,Athletics & Sports,Breakfast Spot,Rock Climbing Spot,Bank,Gastropub,Intersection,Pharmacy,Gym / Fitness Center
70,43.709577,Pizza Place,Japanese Restaurant,Pub,Women's Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant
79,43.673185,Grocery Store,Pizza Place,Bakery,Women's Store,Donut Shop,Diner,Discount Store,Dog Run,Doner Restaurant
87,43.602414,Gym,Skating Rink,Pharmacy,Pub,Sandwich Place,Coffee Shop,Gift Shop,Gluten-free Restaurant,Drugstore
92,43.643515,Beer Store,Pharmacy,Pizza Place,Café,Convenience Store,Drugstore,Diner,Discount Store,Dog Run
93,43.756303,Empanada Restaurant,Women's Store,Donut Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Dumpling Restaurant


In [113]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood Latitude,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
14,43.815252,Playground,Women's Store,Donut Shop,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant
21,43.752758,Bank,Park,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Women's Store
23,43.753259,Food & Drink Shop,Park,Construction & Landscaping,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore
28,43.737473,Airport,Electronics Store,Bus Stop,Women's Store,Drugstore,Diner,Discount Store,Dog Run,Doner Restaurant
38,43.685347,Convenience Store,Women's Store,Dumpling Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore
42,43.72802,Park,Dim Sum Restaurant,Swim School,Donut Shop,Diner,Discount Store,Dog Run,Doner Restaurant,Women's Store
48,43.679563,Trail,Playground,Donut Shop,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant
62,43.696948,Jewelry Store,Park,Sushi Restaurant,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop
72,43.689026,Women's Store,Fast Food Restaurant,Market,Pharmacy,Gourmet Shop,Golf Course,Grocery Store,Electronics Store,Eastern European Restaurant
77,43.713756,Construction & Landscaping,Basketball Court,Bakery,Women's Store,Dumpling Restaurant,Dog Run,Doner Restaurant,Donut Shop,Drugstore


In [114]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood Latitude,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
19,43.75749,Women's Store,Drugstore,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant,Dessert Shop


In [115]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood Latitude,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
89,43.636258,Baseball Field,Women's Store,Dumpling Restaurant,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Electronics Store
94,43.724766,Women's Store,Dumpling Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Eastern European Restaurant
