# Toronto Neighborhood Segmentation Data

### By: Gyan Prakash

*The notebook below fetches the postal-code table from a Wikipedia page and converts it into a pandas dataframe. Data wrangling is then performed to clean the data.*

### Let's import the libraries 

In [1]:
import numpy as np
import requests
import pandas as pd


### Now we will fetch the tables from the given page into a list of dataframe objects

In [2]:
url = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'
# read_html downloads the page itself and returns one DataFrame per <table> it finds
page = pd.read_html(url)

*We now have the tables from the Wikipedia page, returned as a list.*
### Let's check the datatype of 'page':

In [3]:
type(page)

list

*Since we need to work only with the first table,* 
### Let's take out the first table from page: 

In [4]:
df=page[0]

In [5]:
type(df)

pandas.core.frame.DataFrame
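
Taking `page[0]` assumes the postal-code table is the first table on the page. A minimal sketch of a more defensive selection is shown below: it picks the first table that has the expected columns. The helper name `find_table` and the stand-in list `tables` are hypothetical, and the data is made up for illustration.

```python
import pandas as pd

# Stand-ins for the list that pd.read_html returns (made-up data for illustration)
tables = [
    pd.DataFrame({"Key": ["x"], "Value": ["y"]}),
    pd.DataFrame({
        "Postcode": ["M1A", "M3A"],
        "Borough": ["Not assigned", "North York"],
        "Neighbourhood": ["Not assigned", "Parkwoods"],
    }),
]

def find_table(tables, required_columns):
    """Return the first table that contains every required column."""
    for table in tables:
        if set(required_columns).issubset(table.columns):
            return table
    raise ValueError("no table has columns {}".format(required_columns))

postal = find_table(tables, ["Postcode", "Borough", "Neighbourhood"])
print(postal.shape)
```

If the page layout changes, this fails with a clear error instead of silently picking the wrong table.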

*Let's have a look at our dataframe:*

In [6]:
df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront


### Rename the columns:

In [7]:
df.columns=['Post Code','Borough','Neighborhood']

*Missing values in our dataframe are displayed as 'Not assigned'. Let's replace them with numpy NaN, which makes the later processing easier.*

In [8]:
df.replace( "Not assigned",np.nan, inplace=True)

### Dropping NaN rows for Borough

In [9]:
df.dropna(subset=["Borough"], axis=0, inplace=True)

# reset the index, because rows were dropped
df.reset_index(drop=True, inplace=True)
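
The same cleaning steps can also be written as one chained expression that leaves the input untouched. This is only a sketch on a three-row toy frame (the rows are illustrative, and `raw` and `clean` are names introduced here), not a replacement for the cells above.

```python
import numpy as np
import pandas as pd

raw = pd.DataFrame({
    "Post Code": ["M1A", "M3A", "M7A"],
    "Borough": ["Not assigned", "North York", "Queen's Park"],
    "Neighborhood": ["Not assigned", "Parkwoods", "Not assigned"],
})

clean = (
    raw.replace("Not assigned", np.nan)   # mark missing values as NaN
       .dropna(subset=["Borough"])        # drop rows without a borough
       .reset_index(drop=True)            # renumber the remaining rows from 0
)
print(clean)
```

Because nothing is modified in place, `raw` still holds the original values and the cell can be re-run safely.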

In [10]:
df.head(10)

Unnamed: 0,Post Code,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,Harbourfront
3,M5A,Downtown Toronto,Regent Park
4,M6A,North York,Lawrence Heights
5,M6A,North York,Lawrence Manor
6,M7A,Queen's Park,
7,M9A,Etobicoke,Islington Avenue
8,M1B,Scarborough,Rouge
9,M1B,Scarborough,Malvern


### Now, let's replace missing neighbourhood values with the values of corresponding Borough, as instructed in the assignment question

In [11]:
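# fill missing neighborhoods from the Borough column (assigning back avoids in-place edits on a column selection)
df['Neighborhood'] = df['Neighborhood'].fillna(df['Borough'])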

In [12]:
df.head(10)

Unnamed: 0,Post Code,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,Harbourfront
3,M5A,Downtown Toronto,Regent Park
4,M6A,North York,Lawrence Heights
5,M6A,North York,Lawrence Manor
6,M7A,Queen's Park,Queen's Park
7,M9A,Etobicoke,Islington Avenue
8,M1B,Scarborough,Rouge
9,M1B,Scarborough,Malvern
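
A quick way to confirm that the fill worked is to count the missing values per column before and after. The sketch below does this on a two-row toy frame; `df_demo`, `missing_before` and `missing_after` are names introduced only for this illustration.

```python
import numpy as np
import pandas as pd

df_demo = pd.DataFrame({
    "Borough": ["North York", "Queen's Park"],
    "Neighborhood": ["Parkwoods", np.nan],
})

missing_before = int(df_demo["Neighborhood"].isna().sum())
# fillna aligns on the index, so each gap receives the borough from its own row
df_demo["Neighborhood"] = df_demo["Neighborhood"].fillna(df_demo["Borough"])
missing_after = int(df_demo["Neighborhood"].isna().sum())

print(missing_before, missing_after)
```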


## Hurrah!
### We are done with the cleaning phase.
### Finally, let's check the number of rows and columns in the dataframe:

In [13]:
df.shape

(211, 3)

# Finding out latitude and longitude for each borough
### Let's create another dataframe which will combine the above dataframe with the latitude and longitude of each borough:

In [14]:
column_names=['Post Code','Borough','Neighborhood','latitude','longitude']
df2=pd.DataFrame(columns=column_names)

In [15]:
#!conda install -c conda-forge geopy --yes

In [16]:
df2[['Post Code','Borough','Neighborhood']]=df[['Post Code','Borough','Neighborhood']]

### Import the library for finding out Latitude and Longitude. We are going to use geopy's Nominatim geocoder (with "foursquare_agent" as the user agent string)

In [17]:
from geopy.geocoders import Nominatim

In [82]:
CLIENT_ID = 'removed for privacy' # your Foursquare ID
CLIENT_SECRET = 'removed for privacy' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 30
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: removed for privacy
CLIENT_SECRET:removed for privacy
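
Instead of pasting credentials into the notebook and removing them afterwards, they could be read from environment variables. The sketch below assumes two hypothetical variable names, `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET`, and a small hypothetical helper `mask` that hides most of a secret when printing.

```python
import os

# FOURSQUARE_CLIENT_ID / FOURSQUARE_CLIENT_SECRET are hypothetical variable names
CLIENT_ID = os.environ.get("FOURSQUARE_CLIENT_ID", "")
CLIENT_SECRET = os.environ.get("FOURSQUARE_CLIENT_SECRET", "")

def mask(secret, visible=4):
    """Show only the last few characters of a secret when printing it."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]

print("CLIENT_ID:", mask(CLIENT_ID))
print("CLIENT_SECRET:", mask(CLIENT_SECRET))
```

With this approach the notebook can be shared without editing the cell first.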


In [64]:
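geolocator = Nominatim(user_agent="foursquare_agent")  # create the geocoder once
for index, row in df2.iterrows():
    location = geolocator.geocode(row['Borough'])
    # write through df2.at: `row` may be a copy, so assigning to it is not guaranteed to update df2
    df2.at[index, 'latitude'] = location.latitude
    df2.at[index, 'longitude'] = location.longitude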


# Finally we have our dataframe with longitude and latitude columns

In [35]:
df2.head()

Unnamed: 0,Post Code,Borough,Neighborhood,latitude,longitude
0,M3A,North York,Parkwoods,43.770817,-79.4133
1,M4A,North York,Victoria Village,43.770817,-79.4133
2,M5A,Downtown Toronto,Harbourfront,43.654174,-79.380812
3,M5A,Downtown Toronto,Regent Park,43.654174,-79.380812
4,M6A,North York,Lawrence Heights,43.770817,-79.4133
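
Because the coordinates depend only on the borough, every row of the same borough gets the same values, as the table above shows. A sketch of geocoding each distinct borough once and mapping the result back is given below. The function `lookup` and the dictionary `KNOWN` are stand-ins for the real `geolocator.geocode` call so that the example runs offline; the coordinates are copied from the table above.

```python
import pandas as pd

places = pd.DataFrame({
    "Borough": ["North York", "North York", "Downtown Toronto"],
    "Neighborhood": ["Parkwoods", "Victoria Village", "Harbourfront"],
})

# Stand-in for geolocator.geocode(); coordinates copied from the table above
KNOWN = {
    "North York": (43.770817, -79.4133),
    "Downtown Toronto": (43.654174, -79.380812),
}
calls = []  # records how many lookups were actually made

def lookup(borough):
    calls.append(borough)
    return KNOWN.get(borough, (None, None))

# One lookup per distinct borough, then map the cached result onto every row
coords = {b: lookup(b) for b in places["Borough"].unique()}
places["latitude"] = places["Borough"].map(lambda b: coords[b][0])
places["longitude"] = places["Borough"].map(lambda b: coords[b][1])

print(len(calls), "lookups for", len(places), "rows")
```

For the full dataframe this would reduce the number of geocoding requests from one per row to one per borough.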


# Analysis of neighborhoods of Toronto

In [None]:
import folium #for drawing map

### Let's define a function to get nearby venues of a borough

In [47]:

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
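
The function above indexes straight into the response (`["response"]['groups'][0]['items']` and `categories[0]`), so a response without groups or a venue without categories would raise an error. The sketch below shows one possible defensive version of the parsing step only. `extract_venues` is a hypothetical helper, and `sample` is a hand-written payload shaped like the fields the function reads, not real API output; "Unnamed Spot" is invented to show the missing-category case.

```python
def extract_venues(payload):
    """Flatten an explore-style payload into (name, lat, lng, category) tuples."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories", [])
        # fall back to None when a venue has no category
        category = categories[0]["name"] if categories else None
        rows.append((venue["name"], venue["location"]["lat"],
                     venue["location"]["lng"], category))
    return rows

# Hand-written sample (not real API output)
sample = {
    "response": {
        "groups": [{
            "items": [
                {"venue": {"name": "Loblaws",
                           "location": {"lat": 43.768648, "lng": -79.412597},
                           "categories": [{"name": "Grocery Store"}]}},
                {"venue": {"name": "Unnamed Spot",
                           "location": {"lat": 43.77, "lng": -79.41},
                           "categories": []}},
            ]
        }]
    }
}

rows = extract_venues(sample)
print(rows)
```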

In [48]:
toronto_venues = getNearbyVenues(names=df2['Neighborhood'],
                                   latitudes=df2['latitude'],
                                   longitudes=df2['longitude']
                                  )



Parkwoods
Victoria Village
Harbourfront
Regent Park
Lawrence Heights
Lawrence Manor
Queen's Park
Islington Avenue
Rouge
Malvern
Don Mills North
Woodbine Gardens
Parkview Hill
Ryerson
Garden District
Glencairn
Cloverdale
Islington
Martin Grove
Princess Gardens
West Deane Park
Highland Creek
Rouge Hill
Port Union
Flemingdon Park
Don Mills South
Woodbine Heights
St. James Town
Humewood-Cedarvale
Bloordale Gardens
Eringate
Markland Wood
Old Burnhamthorpe
Guildwood
Morningside
West Hill
The Beaches
Berczy Park
Caledonia-Fairbanks
Woburn
Leaside
Central Bay Street
Christie
Cedarbrae
Hillcrest Village
Bathurst Manor
Downsview North
Wilson Heights
Thorncliffe Park
Adelaide
King
Richmond
Dovercourt Village
Dufferin
Scarborough Village
Fairview
Henry Farm
Oriole
Northwood Park
York University
East Toronto
Harbourfront East
Toronto Islands
Union Station
Little Portugal
Trinity
East Birchmount Park
Ionview
Kennedy Park
Bayview Village
CFB Toronto
Downsview East
The Danforth West
Riverdale
Design E

#### Let's check the size of the resulting dataframe

In [49]:
print(toronto_venues.shape)
toronto_venues.head()

(4236, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Parkwoods,43.770817,-79.4133,The Captain's Boil,43.773255,-79.413805,Seafood Restaurant
1,Parkwoods,43.770817,-79.4133,Loblaws,43.768648,-79.412597,Grocery Store
2,Parkwoods,43.770817,-79.4133,Aroma Espresso Bar,43.769449,-79.413081,Café
3,Parkwoods,43.770817,-79.4133,Dakgogi,43.77301,-79.413875,Korean Restaurant
4,Parkwoods,43.770817,-79.4133,Burrito Boyz,43.773054,-79.414082,Burrito Place


Let's check how many venues were returned for each neighborhood

In [50]:
toronto_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Adelaide,30,30,30,30,30,30
Agincourt,6,6,6,6,6,6
Agincourt North,6,6,6,6,6,6
Albion Gardens,6,6,6,6,6,6
Alderwood,6,6,6,6,6,6
Bathurst Manor,30,30,30,30,30,30
Bathurst Quay,30,30,30,30,30,30
Bayview Village,30,30,30,30,30,30
Beaumond Heights,6,6,6,6,6,6
Bedford Park,30,30,30,30,30,30
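
`count()` repeats the same number in every column. If only the number of venues per neighborhood is needed, `size()` gives a single column, which also makes it easy to spot neighborhoods that reached the request limit (`LIMIT = 30` above). The sketch uses a toy frame and a toy limit of 2; `venues`, `counts` and `at_limit` are names introduced here.

```python
import pandas as pd

venues = pd.DataFrame({
    "Neighborhood": ["Adelaide", "Adelaide", "Agincourt"],
    "Venue": ["V1", "V2", "V3"],
})

toy_limit = 2  # plays the role of LIMIT in this small example
counts = venues.groupby("Neighborhood").size().rename("Venue Count")
at_limit = counts[counts >= toy_limit]  # neighborhoods whose results may be truncated

print(counts)
print(list(at_limit.index))
```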


#### Let's find out how many unique categories can be curated from all the returned venues

In [52]:
print('There are {} unique categories.'.format(len(toronto_venues['Venue Category'].unique())))

There are 90 unique categories.


<a id='item3'></a>

## 3. Analyze Each Neighborhood

In [81]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
toronto_onehot['Neighborhood'] =toronto_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.head()

Unnamed: 0,Yoga Studio,American Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Athletics & Sports,Bakery,Bank,Bar,Beer Bar,...,Supermarket,Sushi Restaurant,Tapas Restaurant,Tea Room,Theater,University,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wine Bar,Women's Store
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
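
The `head()` output above lists `Yoga Studio` first instead of `Neighborhood`. That may indicate that one of the venue categories is itself called `Neighborhood` and collided with the added column; this reading is an inference from the output, not something the notebook states. A sketch of one way to avoid such a collision on toy data follows. The renamed column `Neighborhood (category)` is an arbitrary choice made for this example.

```python
import pandas as pd

venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "B"],
    "Venue Category": ["Café", "Neighborhood", "Yoga Studio"],
})

onehot = pd.get_dummies(venues["Venue Category"], dtype=int)
# Rename a category that would clash with the key column before adding the key
onehot = onehot.rename(columns={"Neighborhood": "Neighborhood (category)"})
# insert() places the key column first and raises if the name already exists
onehot.insert(0, "Neighborhood", venues["Neighborhood"])

print(list(onehot.columns))
```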


And let's examine the new dataframe size.

In [54]:
toronto_onehot.shape

(4236, 90)

#### Next, let's group rows by neighborhood and take the mean of the frequency of occurrence of each category

In [55]:
toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()
toronto_grouped

Unnamed: 0,Neighborhood,Yoga Studio,American Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Athletics & Sports,Bakery,Bank,Bar,...,Supermarket,Sushi Restaurant,Tapas Restaurant,Tea Room,Theater,University,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wine Bar,Women's Store
0,Adelaide,0.000000,0.066667,0.000000,0.000000,0.000000,0.000000,0.000000,0.033333,0.000000,...,0.000000,0.000000,0.000000,0.033333,0.066667,0.000000,0.033333,0.000000,0.000000,0.000000
1,Agincourt,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.166667,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
2,Agincourt North,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.166667,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000
3,Albion Gardens,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.166667,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.166667
4,Alderwood,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.166667,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.166667
5,Bathurst Manor,0.000000,0.000000,0.000000,0.000000,0.033333,0.000000,0.000000,0.000000,0.000000,...,0.033333,0.033333,0.000000,0.000000,0.000000,0.000000,0.000000,0.033333,0.000000,0.000000
6,Bathurst Quay,0.000000,0.066667,0.000000,0.000000,0.000000,0.000000,0.000000,0.033333,0.000000,...,0.000000,0.000000,0.000000,0.033333,0.066667,0.000000,0.033333,0.000000,0.000000,0.000000
7,Bayview Village,0.000000,0.000000,0.000000,0.000000,0.033333,0.000000,0.000000,0.000000,0.000000,...,0.033333,0.033333,0.000000,0.000000,0.000000,0.000000,0.000000,0.033333,0.000000,0.000000
8,Beaumond Heights,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,...,0.166667,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.166667
9,Bedford Park,0.000000,0.000000,0.000000,0.000000,0.033333,0.000000,0.000000,0.000000,0.000000,...,0.033333,0.033333,0.000000,0.000000,0.000000,0.000000,0.000000,0.033333,0.000000,0.000000


#### Let's confirm the new size

In [56]:
toronto_grouped.shape

(209, 90)
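
Taking the mean of one-hot columns gives, for each neighborhood, the share of its venues that fall into each category. A small sketch with made-up rows (`onehot` and `freq` are names used only here):

```python
import pandas as pd

onehot = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "B"],
    "Café": [1, 0, 0, 0],
    "Park": [0, 1, 1, 1],
})

# Neighborhood A has 3 venues: 1 café and 2 parks, so the means are 1/3 and 2/3
freq = onehot.groupby("Neighborhood").mean().reset_index()
print(freq)
```

Each row of frequencies sums to 1, because every venue belongs to exactly one category.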

#### Let's print each neighborhood along with the top 5 most common venues

In [57]:
num_top_venues = 5

for hood in toronto_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = toronto_grouped[toronto_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Adelaide----
                 venue  freq
0       Clothing Store  0.10
1       Cosmetics Shop  0.07
2  American Restaurant  0.07
3              Theater  0.07
4                Plaza  0.07


----Agincourt----
                venue  freq
0         Pizza Place  0.33
1  Italian Restaurant  0.17
2          Restaurant  0.17
3       Grocery Store  0.17
4         Supermarket  0.17


----Agincourt North----
                venue  freq
0         Pizza Place  0.33
1  Italian Restaurant  0.17
2          Restaurant  0.17
3       Grocery Store  0.17
4         Supermarket  0.17


----Albion Gardens----
           venue  freq
0     Playground  0.33
1  Women's Store  0.17
2    Coffee Shop  0.17
3         Garden  0.17
4    Supermarket  0.17


----Alderwood----
           venue  freq
0     Playground  0.33
1  Women's Store  0.17
2    Coffee Shop  0.17
3         Garden  0.17
4    Supermarket  0.17


----Bathurst Manor----
               venue  freq
0   Ramen Restaurant  0.13
1        Coffee Shop  0.10


#### Let's put that into a *pandas* dataframe

First, let's write a function to sort the venues in descending order.

In [58]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
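
To see what this kind of ranking returns, the sketch below applies the same idea to a toy frequency table. It uses a hypothetical helper `top_categories` that expects the neighborhood name in the index rather than in the first column, and the frequencies are made up.

```python
import pandas as pd

grouped = pd.DataFrame({
    "Neighborhood": ["A", "B"],
    "Café": [0.5, 0.0],
    "Park": [0.3, 0.6],
    "Pub": [0.2, 0.4],
}).set_index("Neighborhood")

def top_categories(row, n):
    """Return the n category names with the highest frequency in this row."""
    return row.sort_values(ascending=False).index[:n].tolist()

top = {hood: top_categories(row, 2) for hood, row in grouped.iterrows()}
print(top)
```

When several categories share the same frequency, their relative order in the result is not meaningful.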

Now let's create the new dataframe and display the top 10 venues for each neighborhood.

In [60]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # ranks beyond 3rd use the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Adelaide,Clothing Store,Plaza,American Restaurant,Theater,Cosmetics Shop,Concert Hall,Movie Theater,Monument / Landmark,Pizza Place,Department Store
1,Agincourt,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
2,Agincourt North,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
3,Albion Gardens,Playground,Women's Store,Garden,Supermarket,Coffee Shop,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
4,Alderwood,Playground,Women's Store,Garden,Supermarket,Coffee Shop,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
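
The column names above are built with a `try`/`except` around a three-item suffix list, which works for the ten ranks used here. A sketch of a small helper that produces the suffix directly, and also handles cases such as 11th or 21st, is shown below; `ordinal` is a hypothetical name.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "{}{}".format(n, suffix)

columns = ["Neighborhood"] + [
    "{} Most Common Venue".format(ordinal(i)) for i in range(1, 11)
]
print(columns[:4])
```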


<a id='item4'></a>

## 4. Cluster Neighborhoods

Run *k*-means to cluster the neighborhood into 5 clusters.

In [62]:
# set number of clusters
from sklearn.cluster import KMeans
kclusters = 5

toronto_grouped_clustering = toronto_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([3, 2, 2, 1, 1, 4, 3, 4, 1, 4], dtype=int32)
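
The integers in `labels_` are arbitrary identifiers: what matters is which rows share a label, not the value itself. The sketch below illustrates this on four made-up frequency rows that form two obvious groups. `n_init` is set explicitly here because its default has changed between scikit-learn releases.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two obvious groups of category-frequency rows (made-up values)
X = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 0.1, 0.9],
    [0.0, 0.2, 0.8],
])

model = KMeans(n_clusters=2, random_state=0, n_init=10).fit(X)
labels = model.labels_

# Rows 0 and 1 share a label, rows 2 and 3 share the other one
print(labels)
```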

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [71]:
# add clustering labels (run this once; inserting again raises an error because the column already exists)
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

toronto_merged = df2

# join the cluster labels and top venues onto df2, which already holds latitude/longitude for each neighborhood
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

toronto_merged.head() # check the last columns!

Unnamed: 0,Post Code,Borough,Neighborhood,latitude,longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M3A,North York,Parkwoods,43.770817,-79.4133,4,Ramen Restaurant,Coffee Shop,Café,Korean Restaurant,Restaurant,Japanese Restaurant,Indonesian Restaurant,Burrito Place,Pet Store,Plaza
1,M4A,North York,Victoria Village,43.770817,-79.4133,4,Ramen Restaurant,Coffee Shop,Café,Korean Restaurant,Restaurant,Japanese Restaurant,Indonesian Restaurant,Burrito Place,Pet Store,Plaza
2,M5A,Downtown Toronto,Harbourfront,43.654174,-79.380812,3,Clothing Store,Plaza,American Restaurant,Theater,Cosmetics Shop,Concert Hall,Movie Theater,Monument / Landmark,Pizza Place,Department Store
3,M5A,Downtown Toronto,Regent Park,43.654174,-79.380812,3,Clothing Store,Plaza,American Restaurant,Theater,Cosmetics Shop,Concert Hall,Movie Theater,Monument / Landmark,Pizza Place,Department Store
4,M6A,North York,Lawrence Heights,43.770817,-79.4133,4,Ramen Restaurant,Coffee Shop,Café,Korean Restaurant,Restaurant,Japanese Restaurant,Indonesian Restaurant,Burrito Place,Pet Store,Plaza
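
A join on `Neighborhood` leaves `NaN` in the cluster column for any neighborhood that has no row in the venue summary, and it turns the column into floats. The sketch below shows how such rows could be counted and filtered before plotting. The frames are toy data; "No Venues" is an invented neighborhood used to show the unmatched case.

```python
import pandas as pd

master = pd.DataFrame({"Neighborhood": ["Parkwoods", "Harbourfront", "No Venues"]})
summary = pd.DataFrame({
    "Neighborhood": ["Harbourfront", "Parkwoods"],
    "Cluster Labels": [3, 4],
})

merged = master.join(summary.set_index("Neighborhood"), on="Neighborhood")
unmatched = int(merged["Cluster Labels"].isna().sum())

# Keep only rows with a label and restore the integer type for later indexing
plottable = merged.dropna(subset=["Cluster Labels"]).copy()
plottable["Cluster Labels"] = plottable["Cluster Labels"].astype(int)

print(unmatched)
print(plottable)
```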


Finally, let's visualize the resulting clusters

In [80]:
import matplotlib.cm as cm
import matplotlib.colors as colors
latitude=43.6532
longitude=-79.3832
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['latitude'], toronto_merged['longitude'], toronto_merged['Neighborhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
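
The cell above indexes the palette with `rainbow[cluster-1]`, so label 0 picks the last colour through negative indexing. That still gives each cluster its own colour, but a direct mapping may be easier to read. A sketch, with `palette` and `color_for` as names introduced here:

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 5

# One hex colour per cluster, evenly spaced along the rainbow colormap
palette = [colors.rgb2hex(c) for c in cm.rainbow(np.linspace(0, 1, kclusters))]
color_for = {cluster: palette[cluster] for cluster in range(kclusters)}

print(color_for)
```

In the marker loop, `color_for[cluster]` could then be used for both `color` and `fill_color`.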

<a id='item5'></a>

## 5. Examine Clusters

Now, you can examine each cluster and determine the discriminating venue categories that distinguish each cluster. Based on the defining categories, you can then assign a name to each cluster. I will leave this exercise to you.

#### Cluster 1

In [74]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
28,York,0,Pub,Bar,Café,Plaza,Portuguese Restaurant,Cocktail Bar,Coffee Shop,Comfort Food Restaurant,Deli / Bodega,Movie Theater
36,East Toronto,0,Japanese Restaurant,Coffee Shop,Smoke Shop,Concert Hall,Opera House,Park,Chinese Restaurant,Jazz Club,Plaza,Poke Place
38,York,0,Pub,Bar,Café,Plaza,Portuguese Restaurant,Cocktail Bar,Coffee Shop,Comfort Food Restaurant,Deli / Bodega,Movie Theater
52,West Toronto,0,Japanese Restaurant,Coffee Shop,Smoke Shop,Concert Hall,Opera House,Park,Chinese Restaurant,Jazz Club,Plaza,Poke Place
53,West Toronto,0,Japanese Restaurant,Coffee Shop,Smoke Shop,Concert Hall,Opera House,Park,Chinese Restaurant,Jazz Club,Plaza,Poke Place
64,West Toronto,0,Japanese Restaurant,Coffee Shop,Smoke Shop,Concert Hall,Opera House,Park,Chinese Restaurant,Jazz Club,Plaza,Poke Place
65,West Toronto,0,Japanese Restaurant,Coffee Shop,Smoke Shop,Concert Hall,Opera House,Park,Chinese Restaurant,Jazz Club,Plaza,Poke Place
72,East Toronto,0,Japanese Restaurant,Coffee Shop,Smoke Shop,Concert Hall,Opera House,Park,Chinese Restaurant,Jazz Club,Plaza,Poke Place
73,East Toronto,0,Japanese Restaurant,Coffee Shop,Smoke Shop,Concert Hall,Opera House,Park,Chinese Restaurant,Jazz Club,Plaza,Poke Place
76,West Toronto,0,Japanese Restaurant,Coffee Shop,Smoke Shop,Concert Hall,Opera House,Park,Chinese Restaurant,Jazz Club,Plaza,Poke Place
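
The selection above picks columns by position (`[1] + list(range(5, ...))`), which depends on the exact column order of `toronto_merged`. A sketch of selecting the same columns by name is given below on a small frame whose rows are taken from the tables above; `cluster_view` is a hypothetical helper.

```python
import pandas as pd

merged = pd.DataFrame({
    "Post Code": ["M5A", "M3A"],
    "Borough": ["Downtown Toronto", "North York"],
    "Neighborhood": ["Harbourfront", "Parkwoods"],
    "latitude": [43.654174, 43.770817],
    "longitude": [-79.380812, -79.4133],
    "Cluster Labels": [3, 4],
    "1st Most Common Venue": ["Clothing Store", "Ramen Restaurant"],
})

def cluster_view(frame, label):
    """Rows of one cluster, keeping Borough plus the cluster and venue columns."""
    venue_cols = [c for c in frame.columns if c.endswith("Most Common Venue")]
    return frame.loc[frame["Cluster Labels"] == label,
                     ["Borough", "Cluster Labels"] + venue_cols]

view = cluster_view(merged, 4)
print(view)
```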


#### Cluster 2

In [75]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
7,Etobicoke,1,Playground,Women's Store,Garden,Supermarket,Coffee Shop,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
16,Etobicoke,1,Playground,Women's Store,Garden,Supermarket,Coffee Shop,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
17,Etobicoke,1,Playground,Women's Store,Garden,Supermarket,Coffee Shop,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
18,Etobicoke,1,Playground,Women's Store,Garden,Supermarket,Coffee Shop,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
19,Etobicoke,1,Playground,Women's Store,Garden,Supermarket,Coffee Shop,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
20,Etobicoke,1,Playground,Women's Store,Garden,Supermarket,Coffee Shop,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
29,Etobicoke,1,Playground,Women's Store,Garden,Supermarket,Coffee Shop,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
30,Etobicoke,1,Playground,Women's Store,Garden,Supermarket,Coffee Shop,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
31,Etobicoke,1,Playground,Women's Store,Garden,Supermarket,Coffee Shop,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
32,Etobicoke,1,Playground,Women's Store,Garden,Supermarket,Coffee Shop,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega


#### Cluster 3

In [76]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
8,Scarborough,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
9,Scarborough,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
21,Scarborough,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
22,Scarborough,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
23,Scarborough,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
33,Scarborough,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
34,Scarborough,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
35,Scarborough,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
39,Scarborough,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega
43,Scarborough,2,Pizza Place,Italian Restaurant,Grocery Store,Restaurant,Supermarket,Discount Store,Comfort Food Restaurant,Concert Hall,Cosmetics Shop,Deli / Bodega


#### Cluster 4

In [77]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Downtown Toronto,3,Clothing Store,Plaza,American Restaurant,Theater,Cosmetics Shop,Concert Hall,Movie Theater,Monument / Landmark,Pizza Place,Department Store
3,Downtown Toronto,3,Clothing Store,Plaza,American Restaurant,Theater,Cosmetics Shop,Concert Hall,Movie Theater,Monument / Landmark,Pizza Place,Department Store
13,Downtown Toronto,3,Clothing Store,Plaza,American Restaurant,Theater,Cosmetics Shop,Concert Hall,Movie Theater,Monument / Landmark,Pizza Place,Department Store
14,Downtown Toronto,3,Clothing Store,Plaza,American Restaurant,Theater,Cosmetics Shop,Concert Hall,Movie Theater,Monument / Landmark,Pizza Place,Department Store
27,Downtown Toronto,3,Clothing Store,Plaza,American Restaurant,Theater,Cosmetics Shop,Concert Hall,Movie Theater,Monument / Landmark,Pizza Place,Department Store
37,Downtown Toronto,3,Clothing Store,Plaza,American Restaurant,Theater,Cosmetics Shop,Concert Hall,Movie Theater,Monument / Landmark,Pizza Place,Department Store
41,Downtown Toronto,3,Clothing Store,Plaza,American Restaurant,Theater,Cosmetics Shop,Concert Hall,Movie Theater,Monument / Landmark,Pizza Place,Department Store
42,Downtown Toronto,3,Clothing Store,Plaza,American Restaurant,Theater,Cosmetics Shop,Concert Hall,Movie Theater,Monument / Landmark,Pizza Place,Department Store
49,Downtown Toronto,3,Clothing Store,Plaza,American Restaurant,Theater,Cosmetics Shop,Concert Hall,Movie Theater,Monument / Landmark,Pizza Place,Department Store
50,Downtown Toronto,3,Clothing Store,Plaza,American Restaurant,Theater,Cosmetics Shop,Concert Hall,Movie Theater,Monument / Landmark,Pizza Place,Department Store


#### Cluster 5

In [78]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,North York,4,Ramen Restaurant,Coffee Shop,Café,Korean Restaurant,Restaurant,Japanese Restaurant,Indonesian Restaurant,Burrito Place,Pet Store,Plaza
1,North York,4,Ramen Restaurant,Coffee Shop,Café,Korean Restaurant,Restaurant,Japanese Restaurant,Indonesian Restaurant,Burrito Place,Pet Store,Plaza
4,North York,4,Ramen Restaurant,Coffee Shop,Café,Korean Restaurant,Restaurant,Japanese Restaurant,Indonesian Restaurant,Burrito Place,Pet Store,Plaza
5,North York,4,Ramen Restaurant,Coffee Shop,Café,Korean Restaurant,Restaurant,Japanese Restaurant,Indonesian Restaurant,Burrito Place,Pet Store,Plaza
6,Queen's Park,4,Coffee Shop,Ice Cream Shop,Italian Restaurant,Café,Park,Spa,Office,Portuguese Restaurant,Burrito Place,Bubble Tea Shop
10,North York,4,Ramen Restaurant,Coffee Shop,Café,Korean Restaurant,Restaurant,Japanese Restaurant,Indonesian Restaurant,Burrito Place,Pet Store,Plaza
11,East York,4,Coffee Shop,Café,Farmers Market,Park,Pizza Place,Sandwich Place,Italian Restaurant,Athletics & Sports,Art Museum,Deli / Bodega
12,East York,4,Coffee Shop,Café,Farmers Market,Park,Pizza Place,Sandwich Place,Italian Restaurant,Athletics & Sports,Art Museum,Deli / Bodega
15,North York,4,Ramen Restaurant,Coffee Shop,Café,Korean Restaurant,Restaurant,Japanese Restaurant,Indonesian Restaurant,Burrito Place,Pet Store,Plaza
24,North York,4,Ramen Restaurant,Coffee Shop,Café,Korean Restaurant,Restaurant,Japanese Restaurant,Indonesian Restaurant,Burrito Place,Pet Store,Plaza


### Thank you for completing this lab!

This notebook was created by [Alex Aklson](https://www.linkedin.com/in/aklson/) and [Polong Lin](https://www.linkedin.com/in/polonglin/). I hope you found this lab interesting and educational. Feel free to contact us if you have any questions!

This notebook is part of a course on **Coursera** called *Applied Data Science Capstone*. If you accessed this notebook outside the course, you can take this course online by clicking [here](http://cocl.us/DP0701EN_Coursera_Week3_LAB2).

<hr>

Copyright &copy; 2018 [Cognitive Class](https://cognitiveclass.ai/?utm_source=bducopyrightlink&utm_medium=dswb&utm_campaign=bdu). This notebook and its source code are released under the terms of the [MIT License](https://bigdatauniversity.com/mit-license/).