# Toronto Neighborhood Segmentation Data

### By: Gyan Prakash

*The notebook below fetches the tables from a Wikipedia page and converts them into pandas dataframes. The first table is then cleaned with some data wrangling.*

### Let's import the libraries 

In [2]:
import numpy as np
import requests
import pandas as pd


### Now we will fetch the tables from the given page into a list of dataframe objects

In [3]:
url= 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'
r = requests.get(url)
page=pd.read_html(url)

*We now have the tables from the Wikipedia page. They are returned as a list.*
### Let's check the datatype of 'page':

In [4]:
type(page)

list

*Since we need to work only with the first table,* 
### Let's take out the first table from page: 

In [5]:
df=page[0]

In [6]:
type(df)

pandas.core.frame.DataFrame

In [7]:
df.drop(labels=0, axis=0, inplace=True)

# reset the index, because we dropped a row
df.reset_index(drop=True, inplace=True)

*Let's have a look at our dataframe:*

In [8]:
df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M2A,Not assigned,Not assigned
1,M3A,North York,Parkwoods
2,M4A,North York,Victoria Village
3,M5A,Downtown Toronto,Harbourfront
4,M5A,Downtown Toronto,Regent Park


### Insert the column names:

In [9]:
df.columns=['Post Code','Borough','Neighborhood']

*Missing values in our dataframe are displayed as 'Not assigned'. Let's replace them with numpy NaN, which makes the later processing easier.*

In [10]:
df.replace( "Not assigned",np.nan, inplace=True)

### Dropping rows with NaN in Borough

In [11]:
df.dropna(subset=["Borough"], axis=0, inplace=True)

# reset the index, because we dropped rows
df.reset_index(drop=True, inplace=True)

In [12]:
df.head(10)

Unnamed: 0,Post Code,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,Harbourfront
3,M5A,Downtown Toronto,Regent Park
4,M6A,North York,Lawrence Heights
5,M6A,North York,Lawrence Manor
6,M7A,Queen's Park,
7,M9A,Etobicoke,Islington Avenue
8,M1B,Scarborough,Rouge
9,M1B,Scarborough,Malvern
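To see what the two cleaning steps above do in isolation, here is a minimal sketch on a hand-made frame. The three rows are invented for illustration and are not taken from the Wikipedia table.

```python
import numpy as np
import pandas as pd

# Toy frame, assumed for illustration only
toy = pd.DataFrame({
    'Post Code': ['M1A', 'M3A', 'M7A'],
    'Borough': ['Not assigned', 'North York', "Queen's Park"],
    'Neighborhood': ['Not assigned', 'Parkwoods', 'Not assigned'],
})

# Turn the placeholder string into a real missing value
toy = toy.replace('Not assigned', np.nan)

# Keep only rows that have a Borough, then renumber the index from 0
toy = toy.dropna(subset=['Borough']).reset_index(drop=True)

print(toy)
```

The row without a borough disappears, while the row that only lacks a neighborhood is kept with NaN in that column.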


### Now, let's replace missing neighbourhood values with the values of corresponding Borough, as instructed in the assignment question

In [13]:
# fill each missing neighborhood with the borough from the same row
df['Neighborhood'] = df['Neighborhood'].fillna(df['Borough'])

In [14]:
# Grouping by postal code
df = df.groupby('Post Code').agg({'Borough':'first','Neighborhood': ', '.join}).reset_index()

In [15]:
df.head(10)

Unnamed: 0,Post Code,Borough,Neighborhood
0,M1B,Scarborough,"Rouge, Malvern"
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union"
2,M1E,Scarborough,"Guildwood, Morningside, West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae
5,M1J,Scarborough,Scarborough Village
6,M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park"
7,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge"
8,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West"
9,M1N,Scarborough,"Birch Cliff, Cliffside West"
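The fill and group steps above can be sketched on a few hand-made rows. The values are illustrative; only the column names come from the notebook.

```python
import numpy as np
import pandas as pd

# Toy rows, assumed for illustration only
toy = pd.DataFrame({
    'Post Code': ['M5A', 'M5A', 'M7A'],
    'Borough': ['Downtown Toronto', 'Downtown Toronto', "Queen's Park"],
    'Neighborhood': ['Harbourfront', 'Regent Park', np.nan],
})

# A missing neighborhood falls back to the borough name on the same row
toy['Neighborhood'] = toy['Neighborhood'].fillna(toy['Borough'])

# One row per postal code; its neighborhoods are joined with ', '
grouped = (
    toy.groupby('Post Code')
       .agg({'Borough': 'first', 'Neighborhood': ', '.join})
       .reset_index()
)
print(grouped)
```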


## Hurrah!
### We are done with the cleaning phase.
### Finally, let's check the number of rows and columns in the dataframe:

In [16]:
df.shape

(103, 3)

# Finding out latitude and longitude for each borough
### Let's create another dataframe which will combine the above dataframe with the latitude and longitude of each borough:

In [17]:
column_names=['Post Code','Borough','Neighborhood','latitude','longitude']
df2=pd.DataFrame(columns=column_names)

In [18]:
#!conda install -c conda-forge geopy --yes

In [19]:
df2[['Post Code','Borough','Neighborhood']]=df[['Post Code','Borough','Neighborhood']]

### Import the library for finding out latitude and longitude. We are going to use geopy's Nominatim geocoder, with `foursquare_agent` as its user-agent string

In [20]:
from geopy.geocoders import Nominatim

In [50]:
CLIENT_ID = '' # your Foursquare ID
CLIENT_SECRET = '' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 30
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: 
CLIENT_SECRET:


In [22]:
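# create the geocoder once instead of once per row
geolocator = Nominatim(user_agent="foursquare_agent")
for index, row in df2.iterrows():
    # NOTE: a bare borough name is ambiguous; 54.28, -0.41 in the output is Scarborough, England
    location = geolocator.geocode(row['Borough'])
    # write through the dataframe; assigning to `row` may only change a copy
    df2.at[index, 'latitude'] = location.latitude
    df2.at[index, 'longitude'] = location.longitude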


# Finally we have our dataframe with longitude and latitude columns

In [23]:
df2.head()

Unnamed: 0,Post Code,Borough,Neighborhood,latitude,longitude
0,M1B,Scarborough,"Rouge, Malvern",54.2848,-0.409034
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",54.2848,-0.409034
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",54.2848,-0.409034
3,M1G,Scarborough,Woburn,54.2848,-0.409034
4,M1H,Scarborough,Cedarbrae,54.2848,-0.409034
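The coordinates above (about 54.28, -0.41) appear to belong to Scarborough in England rather than the Toronto borough, which is what a bare borough name can resolve to. One possible remedy, sketched below under stated assumptions, is to qualify the query and to look up each distinct borough only once. `geocode_boroughs` is a hypothetical helper, the suffix `', Toronto, Ontario, Canada'` is an assumed query format that has not been checked against the geocoder, and `fake_geocode` is an offline stand-in with made-up coordinates so the sketch runs without network access.

```python
from collections import namedtuple

# Minimal stand-in for a geocoder result (only the two attributes used here)
Location = namedtuple('Location', ['latitude', 'longitude'])

def geocode_boroughs(boroughs, geocode, suffix=', Toronto, Ontario, Canada'):
    """Geocode each distinct borough once; return {borough: (lat, lon) or None}."""
    coords = {}
    for borough in boroughs:
        if borough in coords:
            continue  # already looked up, skip the repeated request
        location = geocode(borough + suffix)
        coords[borough] = (location.latitude, location.longitude) if location else None
    return coords

# Offline fake geocoder with made-up coordinates (not real data)
calls = []
def fake_geocode(query):
    calls.append(query)
    return Location(1.0, 2.0) if query.startswith('Scarborough') else None

result = geocode_boroughs(['Scarborough', 'Scarborough', 'Nowhere'], fake_geocode)
print(result)
```

With geopy, the `geocode` argument would be the `geocode` method of a `Nominatim` instance, and the resulting dictionary could be mapped onto `df2['Borough']`. Returning `None` for a failed lookup also avoids the `AttributeError` that `location.latitude` raises when nothing is found.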


In [24]:
df2.to_csv('toronto.csv')

# Analysis of neighborhoods of Toronto

In [25]:
import folium #for drawing map

### Let's define a function to get nearby venues of a borough

In [26]:

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
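The list comprehension inside `getNearbyVenues` flattens a nested JSON response into one row per venue. The sketch below does the same on a hand-made dictionary that only mirrors the keys the function reads; the values are invented, and `flatten_items` is a hypothetical helper. It also guards against a venue with an empty `categories` list, which would raise an `IndexError` in the indexing `[0]`.

```python
import pandas as pd

# Hand-made sample shaped like the keys that getNearbyVenues reads (values are invented)
sample = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Cafe A', 'location': {'lat': 1.0, 'lng': 2.0},
               'categories': [{'name': 'Café'}]}},
    {'venue': {'name': 'Shop B', 'location': {'lat': 1.5, 'lng': 2.5},
               'categories': []}},  # a venue without any category
]}]}}

def flatten_items(name, lat, lng, payload):
    """Return one tuple per venue; a missing category becomes None."""
    rows = []
    for item in payload['response']['groups'][0]['items']:
        venue = item['venue']
        categories = venue.get('categories') or []
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng, venue['name'],
                     venue['location']['lat'], venue['location']['lng'], category))
    return rows

columns = ['Neighborhood', 'Neighborhood Latitude', 'Neighborhood Longitude',
           'Venue', 'Venue Latitude', 'Venue Longitude', 'Venue Category']
venues = pd.DataFrame(flatten_items('Toy Hood', 0.0, 0.0, sample), columns=columns)
print(venues)
```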

In [27]:
toronto_venues = getNearbyVenues(names=df2['Neighborhood'],
                                   latitudes=df2['latitude'],
                                   longitudes=df2['longitude']
                                  )



Rouge, Malvern
Highland Creek, Rouge Hill, Port Union
Guildwood, Morningside, West Hill
Woburn
Cedarbrae
Scarborough Village
East Birchmount Park, Ionview, Kennedy Park
Clairlea, Golden Mile, Oakridge
Cliffcrest, Cliffside, Scarborough Village West
Birch Cliff, Cliffside West
Dorset Park, Scarborough Town Centre, Wexford Heights
Maryvale, Wexford
Agincourt
Clarks Corners, Sullivan, Tam O'Shanter
Agincourt North, L'Amoreaux East, Milliken, Steeles East
L'Amoreaux West
Upper Rouge
Hillcrest Village
Fairview, Henry Farm, Oriole
Bayview Village
Silver Hills, York Mills
Newtonbrook, Willowdale
Willowdale South
York Mills West
Willowdale West
Parkwoods
Don Mills North
Flemingdon Park, Don Mills South
Bathurst Manor, Downsview North, Wilson Heights
Northwood Park, York University
CFB Toronto, Downsview East
Downsview West
Downsview Central
Downsview Northwest
Victoria Village
Woodbine Gardens, Parkview Hill
Woodbine Heights
The Beaches
Leaside
Thorncliffe Park
East Toronto
The Danforth West, 

#### Let's check the size of the resulting dataframe

In [28]:
print(toronto_venues.shape)
toronto_venues.head()

(2289, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Rouge, Malvern",54.28476,-0.409034,Gianni's,54.283031,-0.405256,Italian Restaurant
1,"Rouge, Malvern",54.28476,-0.409034,Eat Me Cafe,54.280697,-0.406778,Restaurant
2,"Rouge, Malvern",54.28476,-0.409034,Aldi,54.281965,-0.407199,Supermarket
3,"Rouge, Malvern",54.28476,-0.409034,Pizza Hut,54.280872,-0.407803,Pizza Place
4,"Rouge, Malvern",54.28476,-0.409034,Domino's Pizza,54.283844,-0.403644,Pizza Place


Let's check how many venues were returned for each neighborhood

In [29]:
toronto_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
"Adelaide, King, Richmond",30,30,30,30,30,30
Agincourt,6,6,6,6,6,6
"Agincourt North, L'Amoreaux East, Milliken, Steeles East",6,6,6,6,6,6
"Albion Gardens, Beaumond Heights, Humbergate, Jamestown, Mount Olive, Silverstone, South Steeles, Thistletown",6,6,6,6,6,6
"Alderwood, Long Branch",6,6,6,6,6,6
"Bathurst Manor, Downsview North, Wilson Heights",30,30,30,30,30,30
Bayview Village,30,30,30,30,30,30
"Bedford Park, Lawrence Manor East",30,30,30,30,30,30
Berczy Park,30,30,30,30,30,30
"Birch Cliff, Cliffside West",6,6,6,6,6,6


#### Let's find out how many unique categories can be curated from all the returned venues

In [30]:
print('There are {} unique categories.'.format(len(toronto_venues['Venue Category'].unique())))

There are 90 unique categories.


<a id='item3'></a>

## 3. Analyze Each Neighborhood

In [31]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
toronto_onehot['Neighborhood'] =toronto_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.head()

Unnamed: 0,Yoga Studio,American Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Athletics & Sports,Bakery,Bank,Bar,Beer Bar,...,Supermarket,Sushi Restaurant,Tapas Restaurant,Tea Room,Theater,University,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wine Bar,Women's Store
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,1,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


And let's examine the new dataframe size.

In [32]:
toronto_onehot.shape

(2289, 90)
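A minimal sketch of the one-hot step on invented data follows. It differs from the cell above in one respect, which is an assumption about what is safer rather than the notebook's method: the `Neighborhood` column is moved to the front by name instead of by position. If one of the venue categories is itself called `Neighborhood`, the assignment overwrites that existing column in place, so the label column is not necessarily the last one.

```python
import pandas as pd

# Toy venues, assumed for illustration only
toy_venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'B'],
    'Venue Category': ['Café', 'Park', 'Café'],
})

# One indicator column per category
onehot = pd.get_dummies(toy_venues[['Venue Category']], prefix='', prefix_sep='')

# Attach the neighborhood labels, then move that column to the front by name
onehot['Neighborhood'] = toy_venues['Neighborhood']
onehot = onehot[['Neighborhood'] + [c for c in onehot.columns if c != 'Neighborhood']]

print(onehot)
```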

#### Next, let's group rows by neighborhood and take the mean of the frequency of occurrence of each category

In [33]:
toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()
toronto_grouped

Unnamed: 0,Neighborhood,Yoga Studio,American Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Athletics & Sports,Bakery,Bank,Bar,...,Supermarket,Sushi Restaurant,Tapas Restaurant,Tea Room,Theater,University,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wine Bar,Women's Store
0,"Adelaide, King, Richmond",0.000000,0.066667,0.000000,0.000000,0.000000,0.000000,0.000000,0.033333,0.000000,...,0.000000,0.000000,0.000000,0.033333,0.066667,0.000000,0.033333,0.000000,0.000000,0.000000
1,Agincourt,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.166667,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
2,"Agincourt North, L'Amoreaux East, Milliken, St...",0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.166667,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
3,"Albion Gardens, Beaumond Heights, Humbergate, ...",0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.166667,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.166667
4,"Alderwood, Long Branch",0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.166667,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.166667
5,"Bathurst Manor, Downsview North, Wilson Heights",0.000000,0.000000,0.000000,0.000000,0.033333,0.000000,0.000000,0.000000,0.000000,...,0.033333,0.033333,0.000000,0.000000,0.000000,0.000000,0.000000,0.033333,0.000000,0.000000
6,Bayview Village,0.000000,0.000000,0.000000,0.000000,0.033333,0.000000,0.000000,0.000000,0.000000,...,0.033333,0.033333,0.000000,0.000000,0.000000,0.000000,0.000000,0.033333,0.000000,0.000000
7,"Bedford Park, Lawrence Manor East",0.000000,0.000000,0.000000,0.000000,0.033333,0.000000,0.000000,0.000000,0.000000,...,0.033333,0.033333,0.000000,0.000000,0.000000,0.000000,0.000000,0.033333,0.000000,0.000000
8,Berczy Park,0.000000,0.066667,0.000000,0.000000,0.000000,0.000000,0.000000,0.033333,0.000000,...,0.000000,0.000000,0.000000,0.033333,0.066667,0.000000,0.033333,0.000000,0.000000,0.000000
9,"Birch Cliff, Cliffside West",0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.166667,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000


#### Let's confirm the new size

In [34]:
toronto_grouped.shape

(103, 90)
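Because every one-hot column holds only 0s and 1s, its mean within a neighborhood is the share of that neighborhood's venues in the category. In the table above, Agincourt has 6 venues, so a single Supermarket gives 1/6 ≈ 0.166667, and a neighborhood with 30 venues gets 1/30 ≈ 0.033333 per venue. A small sketch with invented rows:

```python
import pandas as pd

# Toy one-hot rows: neighborhood A has two venues, B has one
onehot = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'B'],
    'Café': [1, 0, 1],
    'Park': [0, 1, 0],
})

# The mean of a 0/1 column is the share of venues in that category
freq = onehot.groupby('Neighborhood').mean().reset_index()
print(freq)
```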

#### Let's print each neighborhood along with the top 5 most common venues

In [35]:
num_top_venues = 5

for hood in toronto_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = toronto_grouped[toronto_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Adelaide, King, Richmond----
                 venue  freq
0       Clothing Store  0.10
1       Cosmetics Shop  0.07
2  American Restaurant  0.07
3              Theater  0.07
4                Plaza  0.07


----Agincourt----
                venue  freq
0         Pizza Place  0.33
1  Italian Restaurant  0.17
2          Restaurant  0.17
3       Grocery Store  0.17
4         Supermarket  0.17


----Agincourt North, L'Amoreaux East, Milliken, Steeles East----
                venue  freq
0         Pizza Place  0.33
1  Italian Restaurant  0.17
2          Restaurant  0.17
3       Grocery Store  0.17
4         Supermarket  0.17


----Albion Gardens, Beaumond Heights, Humbergate, Jamestown, Mount Olive, Silverstone, South Steeles, Thistletown----
           venue  freq
0     Playground  0.33
1  Women's Store  0.17
2    Coffee Shop  0.17
3         Garden  0.17
4    Supermarket  0.17


----Alderwood, Long Branch----
           venue  freq
0     Playground  0.33
1  Women's Store  0.17
2    Coffe

#### Let's put that into a *pandas* dataframe

First, let's write a function to sort the venues in descending order.

In [36]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

Now let's create the new dataframe and display the top 10 venues for each neighborhood.

In [37]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Adelaide, King, Richmond",Clothing Store,Plaza,American Restaurant,Theater,Cosmetics Shop,Concert Hall,Movie Theater,Monument / Landmark,Pizza Place,Department Store
1,Agincourt,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
2,"Agincourt North, L'Amoreaux East, Milliken, St...",Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
3,"Albion Gardens, Beaumond Heights, Humbergate, ...",Playground,Women's Store,Garden,Supermarket,Coffee Shop,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
4,"Alderwood, Long Branch",Playground,Women's Store,Garden,Supermarket,Coffee Shop,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
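One thing to keep in mind when reading this table: Agincourt has only 6 venues spread over 5 categories, so its 6th to 10th entries appear to be categories with a frequency of 0, listed in whatever order the sort leaves them. If that is not wanted, a variant that drops zero-frequency categories could look like the sketch below. `top_nonzero_venues` is a hypothetical helper, not part of the notebook, and the row values are invented.

```python
import pandas as pd

def top_nonzero_venues(row, num_top_venues):
    """Like return_most_common_venues, but ignores categories with zero frequency."""
    freqs = row.iloc[1:].astype(float)          # skip the neighborhood name
    freqs = freqs[freqs > 0].sort_values(ascending=False)
    return list(freqs.index[:num_top_venues])

# Toy frequency row, assumed for illustration only
row = pd.Series({'Neighborhood': 'A', 'Café': 0.67, 'Park': 0.0, 'Bank': 0.33})
top = top_nonzero_venues(row, 3)
print(top)
```

The result can be shorter than `num_top_venues`, so it would need padding (for example with `None`) before being written into a fixed-width row of `neighborhoods_venues_sorted`.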


<a id='item4'></a>

## 4. Cluster Neighborhoods

Run *k*-means to cluster the neighborhoods into 5 clusters.

In [41]:
from sklearn.cluster import KMeans

# set number of clusters
kclusters = 5

# keep only the numeric frequency columns for clustering
toronto_grouped_clustering = toronto_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 2, 2, 1, 1, 3, 3, 3, 0, 2], dtype=int32)
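As a small illustration of what the labels mean, the sketch below clusters four invented frequency vectors into two groups. The label numbers themselves are arbitrary; only which rows share a label matters. `n_init` is passed explicitly here because its default differs between scikit-learn versions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy frequency vectors: two rows dominated by the first category,
# two rows dominated by the second (invented values)
X = np.array([
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.8],
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
labels = kmeans.labels_
print(labels)
```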

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [43]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

toronto_merged = df2

# join df2 with neighborhoods_venues_sorted to add the cluster label and top venues to each postal code
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

toronto_merged.head() # check the last columns!

Unnamed: 0,Post Code,Borough,Neighborhood,latitude,longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M1B,Scarborough,"Rouge, Malvern",54.2848,-0.409034,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",54.2848,-0.409034,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",54.2848,-0.409034,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
3,M1G,Scarborough,Woburn,54.2848,-0.409034,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
4,M1H,Scarborough,Cedarbrae,54.2848,-0.409034,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
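The join above matches the `Neighborhood` column of the left frame against the index of the right frame and keeps every row of the left frame. A minimal sketch with invented stand-ins for `df2` and `neighborhoods_venues_sorted`:

```python
import pandas as pd

# Toy stand-ins for df2 and neighborhoods_venues_sorted (invented values)
left = pd.DataFrame({
    'Post Code': ['M1B', 'M1G'],
    'Neighborhood': ['Rouge, Malvern', 'Woburn'],
})
right = pd.DataFrame({
    'Cluster Labels': [2, 0],
    'Neighborhood': ['Woburn', 'Rouge, Malvern'],
    '1st Most Common Venue': ['Park', 'Café'],
})

# join() looks up left['Neighborhood'] in right's index, row order of `left` is kept
merged = left.join(right.set_index('Neighborhood'), on='Neighborhood')
print(merged)
```

A neighborhood that is missing from the right frame would get NaN in the joined columns, which also turns `Cluster Labels` into floats.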


Finally, let's visualize the resulting clusters

In [44]:
import matplotlib.cm as cm
import matplotlib.colors as colors
latitude=43.6532
longitude=-79.3832
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['latitude'], toronto_merged['longitude'], toronto_merged['Neighborhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
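The color setup in the cell above boils down to sampling the rainbow colormap at `kclusters` evenly spaced points; the `x` and `ys` lines only serve to produce a list of that length. A shorter sketch of the same idea is shown below. Indexing with `rainbow[cluster]` instead of `rainbow[cluster-1]` is a choice made here so that label 0 gets the first color; the original indexing just shifts which color each cluster receives.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 5

# Sample the rainbow colormap at kclusters evenly spaced points, as hex strings
rainbow = [colors.rgb2hex(c) for c in cm.rainbow(np.linspace(0, 1, kclusters))]

# Index directly by label so that cluster 0 gets the first color
color_for = {cluster: rainbow[cluster] for cluster in range(kclusters)}
print(color_for)
```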

<a id='item5'></a>

## 5. Examine Clusters

Now, you can examine each cluster and determine the discriminating venue categories that distinguish each cluster. Based on the defining categories, you can then assign a name to each cluster. I will leave this exercise to you.
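As a starting point for that exercise, a quick summary per cluster can help: how many rows it holds and which top venue is most frequent in it. The sketch below uses an invented stand-in for `toronto_merged` with only three of its columns.

```python
import pandas as pd

# Toy stand-in for toronto_merged (invented values)
toy = pd.DataFrame({
    'Borough': ['Scarborough', 'Scarborough', 'Etobicoke'],
    'Cluster Labels': [2, 2, 1],
    '1st Most Common Venue': ['Pizza Place', 'Pizza Place', 'Playground'],
})

# Number of rows per cluster
sizes = toy['Cluster Labels'].value_counts().sort_index()

# Most frequent "1st Most Common Venue" within each cluster
top_venue = (
    toy.groupby('Cluster Labels')['1st Most Common Venue']
       .agg(lambda s: s.mode().iloc[0])
)
print(sizes)
print(top_venue)
```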

#### Cluster 1

In [45]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
50,Downtown Toronto,0,Clothing Store,Plaza,American Restaurant,Theater,Cosmetics Shop,Concert Hall,Movie Theater,Monument / Landmark,Pizza Place,Department Store
51,Downtown Toronto,0,Clothing Store,Plaza,American Restaurant,Theater,Cosmetics Shop,Concert Hall,Movie Theater,Monument / Landmark,Pizza Place,Department Store
52,Downtown Toronto,0,Clothing Store,Plaza,American Restaurant,Theater,Cosmetics Shop,Concert Hall,Movie Theater,Monument / Landmark,Pizza Place,Department Store
53,Downtown Toronto,0,Clothing Store,Plaza,American Restaurant,Theater,Cosmetics Shop,Concert Hall,Movie Theater,Monument / Landmark,Pizza Place,Department Store
54,Downtown Toronto,0,Clothing Store,Plaza,American Restaurant,Theater,Cosmetics Shop,Concert Hall,Movie Theater,Monument / Landmark,Pizza Place,Department Store
55,Downtown Toronto,0,Clothing Store,Plaza,American Restaurant,Theater,Cosmetics Shop,Concert Hall,Movie Theater,Monument / Landmark,Pizza Place,Department Store
56,Downtown Toronto,0,Clothing Store,Plaza,American Restaurant,Theater,Cosmetics Shop,Concert Hall,Movie Theater,Monument / Landmark,Pizza Place,Department Store
57,Downtown Toronto,0,Clothing Store,Plaza,American Restaurant,Theater,Cosmetics Shop,Concert Hall,Movie Theater,Monument / Landmark,Pizza Place,Department Store
58,Downtown Toronto,0,Clothing Store,Plaza,American Restaurant,Theater,Cosmetics Shop,Concert Hall,Movie Theater,Monument / Landmark,Pizza Place,Department Store
59,Downtown Toronto,0,Clothing Store,Plaza,American Restaurant,Theater,Cosmetics Shop,Concert Hall,Movie Theater,Monument / Landmark,Pizza Place,Department Store


#### Cluster 2

In [46]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
88,Etobicoke,1,Playground,Women's Store,Garden,Supermarket,Coffee Shop,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
89,Etobicoke,1,Playground,Women's Store,Garden,Supermarket,Coffee Shop,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
90,Etobicoke,1,Playground,Women's Store,Garden,Supermarket,Coffee Shop,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
91,Etobicoke,1,Playground,Women's Store,Garden,Supermarket,Coffee Shop,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
92,Etobicoke,1,Playground,Women's Store,Garden,Supermarket,Coffee Shop,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
93,Etobicoke,1,Playground,Women's Store,Garden,Supermarket,Coffee Shop,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
94,Etobicoke,1,Playground,Women's Store,Garden,Supermarket,Coffee Shop,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
95,Etobicoke,1,Playground,Women's Store,Garden,Supermarket,Coffee Shop,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
99,Etobicoke,1,Playground,Women's Store,Garden,Supermarket,Coffee Shop,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
100,Etobicoke,1,Playground,Women's Store,Garden,Supermarket,Coffee Shop,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega


#### Cluster 3

In [47]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Scarborough,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
1,Scarborough,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
2,Scarborough,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
3,Scarborough,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
4,Scarborough,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
5,Scarborough,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
6,Scarborough,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
7,Scarborough,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
8,Scarborough,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
9,Scarborough,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega


#### Cluster 4

In [48]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
17,North York,3,Ramen Restaurant,Coffee Shop,Café,Korean Restaurant,Restaurant,Japanese Restaurant,Indonesian Restaurant,Burrito Place,Pet Store,Plaza
18,North York,3,Ramen Restaurant,Coffee Shop,Café,Korean Restaurant,Restaurant,Japanese Restaurant,Indonesian Restaurant,Burrito Place,Pet Store,Plaza
19,North York,3,Ramen Restaurant,Coffee Shop,Café,Korean Restaurant,Restaurant,Japanese Restaurant,Indonesian Restaurant,Burrito Place,Pet Store,Plaza
20,North York,3,Ramen Restaurant,Coffee Shop,Café,Korean Restaurant,Restaurant,Japanese Restaurant,Indonesian Restaurant,Burrito Place,Pet Store,Plaza
21,North York,3,Ramen Restaurant,Coffee Shop,Café,Korean Restaurant,Restaurant,Japanese Restaurant,Indonesian Restaurant,Burrito Place,Pet Store,Plaza
22,North York,3,Ramen Restaurant,Coffee Shop,Café,Korean Restaurant,Restaurant,Japanese Restaurant,Indonesian Restaurant,Burrito Place,Pet Store,Plaza
23,North York,3,Ramen Restaurant,Coffee Shop,Café,Korean Restaurant,Restaurant,Japanese Restaurant,Indonesian Restaurant,Burrito Place,Pet Store,Plaza
24,North York,3,Ramen Restaurant,Coffee Shop,Café,Korean Restaurant,Restaurant,Japanese Restaurant,Indonesian Restaurant,Burrito Place,Pet Store,Plaza
25,North York,3,Ramen Restaurant,Coffee Shop,Café,Korean Restaurant,Restaurant,Japanese Restaurant,Indonesian Restaurant,Burrito Place,Pet Store,Plaza
26,North York,3,Ramen Restaurant,Coffee Shop,Café,Korean Restaurant,Restaurant,Japanese Restaurant,Indonesian Restaurant,Burrito Place,Pet Store,Plaza


#### Cluster 5

In [49]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
37,East Toronto,4,Japanese Restaurant,Coffee Shop,Smoke Shop,Concert Hall,Opera House,Park,Chinese Restaurant,Jazz Club,Plaza,Poke Place
41,East Toronto,4,Japanese Restaurant,Coffee Shop,Smoke Shop,Concert Hall,Opera House,Park,Chinese Restaurant,Jazz Club,Plaza,Poke Place
42,East Toronto,4,Japanese Restaurant,Coffee Shop,Smoke Shop,Concert Hall,Opera House,Park,Chinese Restaurant,Jazz Club,Plaza,Poke Place
43,East Toronto,4,Japanese Restaurant,Coffee Shop,Smoke Shop,Concert Hall,Opera House,Park,Chinese Restaurant,Jazz Club,Plaza,Poke Place
44,Central Toronto,4,Japanese Restaurant,Coffee Shop,Smoke Shop,Concert Hall,Opera House,Park,Chinese Restaurant,Jazz Club,Plaza,Poke Place
45,Central Toronto,4,Japanese Restaurant,Coffee Shop,Smoke Shop,Concert Hall,Opera House,Park,Chinese Restaurant,Jazz Club,Plaza,Poke Place
46,Central Toronto,4,Japanese Restaurant,Coffee Shop,Smoke Shop,Concert Hall,Opera House,Park,Chinese Restaurant,Jazz Club,Plaza,Poke Place
47,Central Toronto,4,Japanese Restaurant,Coffee Shop,Smoke Shop,Concert Hall,Opera House,Park,Chinese Restaurant,Jazz Club,Plaza,Poke Place
48,Central Toronto,4,Japanese Restaurant,Coffee Shop,Smoke Shop,Concert Hall,Opera House,Park,Chinese Restaurant,Jazz Club,Plaza,Poke Place
49,Central Toronto,4,Japanese Restaurant,Coffee Shop,Smoke Shop,Concert Hall,Opera House,Park,Chinese Restaurant,Jazz Club,Plaza,Poke Place
