#  Segmenting and Clustering Neighborhoods in Toronto


## Part I
#### Goal: explore, segment, and cluster the neighborhoods in Toronto based on postal code and borough information.
- Obtain neighborhood data through web scraping, __[Wikipedia Page Here](https://en.wikipedia.org/w/index.php?title=List_of_postal_codes_of_Canada:_M&oldid=945633050)__
- Wrangle, clean, and read the data into a pandas DataFrame.
    - Columns titled PostalCode, Borough, Neighborhood
    - Only process cells that have an assigned Borough.
    - More than one neighborhood can exist in one postal code area. Combine neighborhoods under one postal code, with neighborhoods separated by a comma.
    - If a neighborhood is `Not assigned` but has a Borough, then the neighborhood name is the same as the borough.
- Use the `.shape` attribute to print the number of rows of your df.

- Once the data is structured, replicate the analysis performed on New York City to explore and cluster the neighborhoods in Toronto.
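
The wrangling rules listed above can be sketched on a few hand-written rows. This is only an illustration under stated assumptions: the rows are made up for the example (they are not the scraped table), and the names `raw` and `tidy` are hypothetical.

```python
import pandas as pd

# Illustrative rows only; not the data scraped from Wikipedia
raw = pd.DataFrame({
    'PostalCode':   ['M1A', 'M5A', 'M5A', 'M7A'],
    'Borough':      ['Not assigned', 'Downtown Toronto', 'Downtown Toronto', "Queen's Park"],
    'Neighborhood': ['Not assigned', 'Harbourfront', 'Regent Park', 'Not assigned'],
})

# 1. keep only rows that have an assigned borough
tidy = raw[raw['Borough'] != 'Not assigned'].copy()

# 2. neighborhoods without a name take the borough name
unnamed = tidy['Neighborhood'] == 'Not assigned'
tidy.loc[unnamed, 'Neighborhood'] = tidy.loc[unnamed, 'Borough']

# 3. one row per postal code, neighborhoods joined by a comma
tidy = (tidy.groupby(['PostalCode', 'Borough'])['Neighborhood']
            .apply(', '.join)
            .reset_index())

print(tidy)
print(tidy.shape)  # shape is an attribute, so no parentheses
```

Step 2 is a no-op when every neighborhood already has a name, so it is safe to leave in place either way.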

In [None]:
#installs, uncomment if necessary
#!pip install bs4
#!conda install -c conda-forge geopy --yes
#!conda install -c conda-forge folium=0.5.0 --yes
#!pip install geocoder

In [1]:
#imports
import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

#for web scraping
import requests
from bs4 import BeautifulSoup

#geocoder for neighborhood coords
import geocoder

#for clustering
import numpy as np
import json
from pandas import json_normalize #transform JSON into pandas df (top-level import, pandas >= 1.0)
from geopy.geocoders import Nominatim #convert address into lat and long

#plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors
import folium #map rendering library

#k-means clustering stage
from sklearn.cluster import KMeans


#### Obtain Toronto neighborhood data from this __[Wikipedia](https://en.wikipedia.org/w/index.php?title=List_of_postal_codes_of_Canada:_M&oldid=945633050)__ page. Wrangle, clean, and read data into pandas DataFrame using BeautifulSoup

In [2]:
#extract toronto neighborhood data
html_data = requests.get("https://en.wikipedia.org/w/index.php?title=List_of_postal_codes_of_Canada:_M&oldid=945633050").text

#parse html_data using beautiful_soup
soup= BeautifulSoup(html_data, 'html5lib')

Using BeautifulSoup, extract the table with Toronto FSAs (Forward Sortation Areas) and store it in a DataFrame named `toronto_data`. The DataFrame should have the columns `PostalCode`, `Borough`, and `Neighborhood`.

*hint* : print cells to see what data to use

In [3]:
#create columns
columns=['PostalCode', 'Borough', 'Neighborhood']

#find table, import and quick clean the data (from wiki)
table = soup.find('table', {'class': 'wikitable sortable'}).find('tbody')

#get data from table and store it
table_data = []
for row in table.find_all('tr'):
    t_row = {}
    for i, j in zip(row.find_all('td'), columns):
        t_row[j]= i.text.strip()
    table_data.append(t_row)
table_data = table_data[1:] 

#convert to dataframe
toronto_data = pd.DataFrame(table_data)
#save an independent copy (plain assignment would only create a second name for the same object)
toronto_copy = toronto_data.copy()

#quick look at the data
toronto_data.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront


#### Remove entries where **Borough** is `Not assigned` 
#### Add names where **Neighborhood** is `Not assigned`, when applicable

In [4]:
#remove unassigned boroughs, assign borough name for unnamed neighborhoods
toronto_data['Borough'] = toronto_data['Borough'].replace('Not assigned', np.nan) #replace with nan
toronto_data = toronto_data.dropna(subset=['Borough'], axis =0).reset_index(drop=True) #drop nan values, reset index

#check for neighborhoods that have a borough but no name
#df of all neighborhoods with boroughs but w/o names
df = toronto_data[toronto_data['Neighborhood'] == 'Not assigned']
df.count()

PostalCode      0
Borough         0
Neighborhood    0
dtype: int64

No neighborhood with an assigned borough is missing a name, so we move on to grouping neighborhoods by postal code.

In [5]:
toronto_data = toronto_data.groupby(['PostalCode', 'Borough'])['Neighborhood'].apply(', '.join).reset_index()

#ensure all neighborhoods have been merged with respective postal codes 
#(i.e. postal codes only appear once in df) should return `False`
toronto_data['PostalCode'].duplicated().any()

False

### Shape of dataframe

In [6]:
toronto_data.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M1B,Scarborough,"Rouge, Malvern"
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union"
2,M1E,Scarborough,"Guildwood, Morningside, West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae


In [7]:
toronto_data.shape

(103, 3)

## Part II

- Obtain latitude and longitude coordinates of each neighborhood using the __[Geocoder Python Package](https://geocoder.readthedocs.io/index.html)__
    - Note: you have to be persistent with this package, so the plan is to run a while loop for each postal code until the coordinates come back.
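
The retry idea in the note above can be sketched without touching any real geocoding service. In this sketch `geocode_with_retry` and `flaky_lookup` are hypothetical names, and `lookup` stands in for whatever call returns a `(lat, lng)` pair or `None`; the loop is bounded rather than an open-ended `while`, which is a deliberate swap so a dead service cannot hang the notebook.

```python
import time

def geocode_with_retry(lookup, query, max_tries=5, pause=0.0):
    """Call lookup(query) until it returns coordinates or the tries run out.

    `lookup` is a hypothetical callable returning (lat, lng) or None.
    """
    for _ in range(max_tries):
        coords = lookup(query)
        if coords is not None:
            return coords
        time.sleep(pause)  # wait before asking again
    return None  # the caller decides what to do about a persistent failure


# Stand-in for an unreliable service: fails twice, then succeeds
calls = {'n': 0}

def flaky_lookup(query):
    calls['n'] += 1
    return None if calls['n'] < 3 else (43.806686, -79.194353)

result = geocode_with_retry(flaky_lookup, 'M1B, Toronto, Ontario')
print(result, 'after', calls['n'], 'calls')
```

Returning `None` after `max_tries` makes it easy to collect the postal codes that never resolved and fall back to another source for those.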
        
#### *Unable to get geocoder to work efficiently, so we will continue using the CSV file provided in the project description*

In [8]:
file= 'https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-DS0701EN-SkillsNetwork/labs_v1/Geospatial_Coordinates.csv'
#read into df
latlong= pd.read_csv(file)
#rename column 'Postal Code' to 'PostalCode' so it matches toronto_data
latlong.rename(columns={'Postal Code': 'PostalCode'}, inplace=True)

#merge into our toronto_data df
toronto_data = toronto_data.join(latlong.set_index('PostalCode'), on='PostalCode')
toronto_data.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
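
After attaching coordinates it can be worth checking that every postal code actually found a match. The sketch below uses `merge` with `how='left'` on toy frames; the code `M9Z` and the borough label `Hypothetical` are made up for the example, and `codes`, `coords`, `merged`, and `missing` are illustrative names.

```python
import pandas as pd

# Toy frames; 'M9Z' is a made-up code with no coordinates on purpose
codes = pd.DataFrame({'PostalCode': ['M1B', 'M1C', 'M9Z'],
                      'Borough': ['Scarborough', 'Scarborough', 'Hypothetical']})
coords = pd.DataFrame({'Postal Code': ['M1B', 'M1C'],
                       'Latitude': [43.806686, 43.784535],
                       'Longitude': [-79.194353, -79.160497]})

# a left merge keeps every row of `codes`, matched or not
merged = codes.merge(coords.rename(columns={'Postal Code': 'PostalCode'}),
                     on='PostalCode', how='left')

# rows without a match carry NaN in the coordinate columns
missing = merged[merged['Latitude'].isna()]
print(missing['PostalCode'].tolist())
```

An empty list here means every postal code received a latitude and longitude.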


## Part III

#### Explore and cluster neighborhoods in Toronto. 

First, let's see how many boroughs and neighborhoods there are in Toronto.


In [9]:
#how many boroughs and neighborhoods in toronto?
print('The dataframe has {} boroughs and {} neighborhoods.'.format(
        len(toronto_data['Borough'].unique()),
        toronto_data.shape[0]
    )
)

The dataframe has 10 boroughs and 103 neighborhoods.


Get the latitude and longitude values of Toronto for plotting, then plot our data as markers on a map.

In [10]:
#use geopy library to get lat and long values of Toronto
address ='Toronto, Ontario, CAN'
geolocator = Nominatim(user_agent='to_explorer')
location = geolocator.geocode(address)
latitude= location.latitude
longitude= location.longitude
print('The coordinates of Toronto are {},{}.'.format(latitude,longitude))

# create map of Toronto using latitude and longitude values
map_to = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(toronto_data['Latitude'], toronto_data['Longitude'], toronto_data['Borough'], toronto_data['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_to)  
    
map_to

The coordinates of Toronto are 43.7793879,-79.3046089.


Let's look more specifically at one borough: Downtown Toronto.

In [11]:
#create a dataframe with only 'Downtown Toronto'
dwtn_data = toronto_data[toronto_data['Borough']=='Downtown Toronto'].reset_index(drop=True)
dwtn_data.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M4W,Downtown Toronto,Rosedale,43.679563,-79.377529
1,M4X,Downtown Toronto,"Cabbagetown, St. James Town",43.667967,-79.367675
2,M4Y,Downtown Toronto,Church and Wellesley,43.66586,-79.38316
3,M5A,Downtown Toronto,Harbourfront,43.65426,-79.360636
4,M5B,Downtown Toronto,"Ryerson, Garden District",43.657162,-79.378937


Get coordinates and plot

In [15]:
#coordinates of Downtown
address = 'Downtown Toronto, Ontario, CAN'
geolocator = Nominatim(user_agent="to_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Downtown Toronto are {}, {}.'.format(latitude, longitude))

# create map of Downtown using latitude and longitude values
map_dwtn = folium.Map(location=[latitude, longitude], zoom_start=13)

# add markers to map
for lat, lng, label in zip(dwtn_data['Latitude'], dwtn_data['Longitude'], dwtn_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_dwtn)  
    
map_dwtn

The geographical coordinates of Downtown Toronto are 43.6607002, -79.3850889.


### Begin using Foursquare API to explore the neighborhoods and segment them

In [16]:
#define Foursquare credentials and version
CLIENT_ID = 'VUPT2P1BO1KNX3UQJ22JG4FH40044T1NUKPZI2QQRA23IUDI' # your Foursquare ID
CLIENT_SECRET = '24JUGS01BMIZIZ1CEYWADL3NRNKOD4VL2M3TIWBKWOXODVF3' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: VUPT2P1BO1KNX3UQJ22JG4FH40044T1NUKPZI2QQRA23IUDI
CLIENT_SECRET:24JUGS01BMIZIZ1CEYWADL3NRNKOD4VL2M3TIWBKWOXODVF3


#### Explore Neighborhoods in Downtown Toronto

Create a function that gets the top venues for all the neighborhoods in Downtown Toronto.

In [17]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
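
The nested lookup the function performs on each response can be tried offline. The sketch below runs on a hand-written dictionary that only mimics the nesting the function reads (`response` → `groups` → `items` → `venue`); it is not a real API response. `flatten_items` is a hypothetical helper, and the third venue, which has no category, is invented to show one way of tolerating an empty `categories` list.

```python
# Hand-written sample that mimics the nesting; not a real API response
sample = {
    'response': {'groups': [{'items': [
        {'venue': {'name': 'Rosedale Park',
                   'location': {'lat': 43.682328, 'lng': -79.378934},
                   'categories': [{'name': 'Playground'}]}},
        {'venue': {'name': 'Whitney Park',
                   'location': {'lat': 43.682036, 'lng': -79.373788},
                   'categories': [{'name': 'Park'}]}},
        {'venue': {'name': 'Uncategorized Spot (hypothetical)',
                   'location': {'lat': 43.68, 'lng': -79.38},
                   'categories': []}},
    ]}]}
}

def flatten_items(neighborhood, payload):
    """Return one tuple per venue: (neighborhood, name, lat, lng, category)."""
    rows = []
    for item in payload['response']['groups'][0]['items']:
        venue = item['venue']
        cats = venue.get('categories', [])
        # fall back to None instead of raising IndexError on an empty list
        category = cats[0]['name'] if cats else None
        rows.append((neighborhood,
                     venue['name'],
                     venue['location']['lat'],
                     venue['location']['lng'],
                     category))
    return rows

rows = flatten_items('Rosedale', sample)
for row in rows:
    print(row)
```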

In [18]:
#run function for all neighborhoods in Downtown Toronto
dwtn_venues = getNearbyVenues(names=dwtn_data['Neighborhood'],
                                   latitudes=dwtn_data['Latitude'],
                                   longitudes=dwtn_data['Longitude']
                                  )

Rosedale
Cabbagetown, St. James Town
Church and Wellesley
Harbourfront
Ryerson, Garden District
St. James Town
Berczy Park
Central Bay Street
Adelaide, King, Richmond
Harbourfront East, Toronto Islands, Union Station
Design Exchange, Toronto Dominion Centre
Commerce Court, Victoria Hotel
Harbord, University of Toronto
Chinatown, Grange Park, Kensington Market
CN Tower, Bathurst Quay, Island airport, Harbourfront West, King and Spadina, Railway Lands, South Niagara
Stn A PO Boxes 25 The Esplanade
First Canadian Place, Underground city
Christie
Queen's Park


In [19]:
#how many venues did we get?
print('{} venues were returned by Foursquare.'.format(dwtn_venues.shape[0]))
dwtn_venues.head()

1232 venues were returned by Foursquare.


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Rosedale,43.679563,-79.377529,Rosedale Park,43.682328,-79.378934,Playground
1,Rosedale,43.679563,-79.377529,Whitney Park,43.682036,-79.373788,Park
2,Rosedale,43.679563,-79.377529,Alex Murray Parkette,43.6783,-79.382773,Park
3,Rosedale,43.679563,-79.377529,Milkman's Lane,43.676352,-79.373842,Trail
4,"Cabbagetown, St. James Town",43.667967,-79.367675,Cranberries,43.667843,-79.369407,Diner


In [20]:
#how many unique categories are there?
print('There are {} unique categories.'.format(len(dwtn_venues['Venue Category'].unique())))

There are 212 unique categories.


#### Analyze Each Neighborhood in Downtown

In [21]:
# one hot encoding
dwtn_onehot = pd.get_dummies(dwtn_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
dwtn_onehot['Neighborhood'] = dwtn_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [dwtn_onehot.columns[-1]] + list(dwtn_onehot.columns[:-1])
dwtn_onehot = dwtn_onehot[fixed_columns]

#size
dwtn_onehot.shape

(1232, 212)

Group by neighborhood and take the mean frequency of occurrence of each category.

In [22]:
#group rows by neighborhood and take the mean frequency of occurrence of each category
dwtn_grouped = dwtn_onehot.groupby('Neighborhood').mean().reset_index()
#new size
dwtn_grouped.shape

(19, 212)
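
To see what the one-hot encoding and the grouped mean produce, here is a small worked example on toy data. The neighborhoods `A` and `B` and their venues are made up, and the variable names are illustrative.

```python
import pandas as pd

# Toy venue list: neighborhood A has 4 venues, B has 2
venues = pd.DataFrame({
    'Neighborhood':   ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Park', 'Park', 'Diner', 'Trail', 'Diner', 'Diner'],
})

# one column per category, 1.0 where the venue belongs to it
onehot = pd.get_dummies(venues['Venue Category'], dtype=float)

# put the neighborhood label in front
onehot.insert(0, 'Neighborhood', venues['Neighborhood'])

# the mean of a 0/1 column is the share of venues in that category
grouped = onehot.groupby('Neighborhood').mean().reset_index()
print(grouped)
```

For `A`, half of the venues are parks, so the `Park` column reads 0.5. `DataFrame.insert` raises a `ValueError` when a column with the same name already exists, so this variant would surface a venue category that happened to be named `Neighborhood` instead of silently overwriting it.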

Let's print each neighborhood along with its top 5 most common venues.

In [23]:
#Print each neighborhood along with the top 5 most common venues
num_top_venues = 5

for hood in dwtn_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = dwtn_grouped[dwtn_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Adelaide, King, Richmond----
             venue  freq
0      Coffee Shop  0.09
1             Café  0.05
2       Restaurant  0.04
3  Thai Restaurant  0.04
4    Deli / Bodega  0.03


----Berczy Park----
                venue  freq
0         Coffee Shop  0.09
1        Cocktail Bar  0.07
2              Bakery  0.05
3            Pharmacy  0.03
4  Seafood Restaurant  0.03


----CN Tower, Bathurst Quay, Island airport, Harbourfront West, King and Spadina, Railway Lands, South Niagara----
                venue  freq
0     Airport Service  0.20
1      Airport Lounge  0.13
2       Boat or Ferry  0.07
3             Airport  0.07
4  Airport Food Court  0.07


----Cabbagetown, St. James Town----
                venue  freq
0          Restaurant  0.07
1         Coffee Shop  0.07
2         Pizza Place  0.07
3                Café  0.04
4  Italian Restaurant  0.04


----Central Bay Street----
                venue  freq
0         Coffee Shop  0.19
1      Sandwich Place  0.08
2                Café  

Let's put this data into a DataFrame.

In [24]:
#function for sorting venues in descending order
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
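
The ranking step can be tried on a toy table. In this sketch `top_categories` is a hypothetical stand-in that follows the same idea as the function above (skip the label, sort the frequencies in descending order, keep the first few names), and the frequencies are made up.

```python
import pandas as pd

# Toy frequency table in the same layout as the grouped frame
grouped = pd.DataFrame({
    'Neighborhood': ['A', 'B'],
    'Diner': [0.3, 0.9],
    'Park':  [0.5, 0.0],
    'Trail': [0.2, 0.1],
})

def top_categories(row, n):
    # skip the Neighborhood label, then rank the remaining frequencies
    freqs = row.iloc[1:].astype(float)
    return list(freqs.sort_values(ascending=False).index[:n])

top2 = {row['Neighborhood']: top_categories(row, 2)
        for _, row in grouped.iterrows()}
print(top2)
```

When a neighborhood has fewer distinct categories than the number requested, the remaining slots are filled with categories whose frequency is 0, and the order among tied values depends on the sort algorithm. Passing `kind='mergesort'` to `sort_values` is one way to make ties keep their column order.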


In [25]:
#run the function with our data
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError: #no 'st'/'nd'/'rd' suffix left, so use 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = dwtn_grouped['Neighborhood']

for ind in np.arange(dwtn_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(dwtn_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Adelaide, King, Richmond",Coffee Shop,Café,Restaurant,Thai Restaurant,Clothing Store,Gym,Deli / Bodega,Burrito Place,Lounge,Salad Place
1,Berczy Park,Coffee Shop,Cocktail Bar,Bakery,Beer Bar,Cheese Shop,Farmers Market,Restaurant,Seafood Restaurant,Pharmacy,Liquor Store
2,"CN Tower, Bathurst Quay, Island airport, Harbo...",Airport Service,Airport Lounge,Boutique,Rental Car Location,Harbor / Marina,Boat or Ferry,Airport Terminal,Sculpture Garden,Airport Gate,Airport Food Court
3,"Cabbagetown, St. James Town",Coffee Shop,Restaurant,Pizza Place,Chinese Restaurant,Pub,Bakery,Café,Convenience Store,Italian Restaurant,Japanese Restaurant
4,Central Bay Street,Coffee Shop,Sandwich Place,Café,Italian Restaurant,Salad Place,Bubble Tea Shop,Burger Joint,Japanese Restaurant,Dessert Shop,Portuguese Restaurant


#### Cluster Neighborhoods

Run k-means clustering.

In [26]:
# set number of clusters
kclusters = 5

dwtn_grouped_clustering = dwtn_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(dwtn_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 0, 3, 0, 4, 0, 2, 0, 0, 0])
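
A toy run can make the label array easier to read. The sketch below clusters four made-up frequency profiles into two groups; `profiles` is illustrative data, and `n_init=10` is set explicitly as an assumption so the call behaves the same across scikit-learn versions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up frequency profiles: rows 0-1 lean on the first category,
# rows 2-3 lean on the third
profiles = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.1, 0.1, 0.8],
    [0.0, 0.2, 0.8],
])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(profiles)
labels = km.labels_
print(labels)

# rows with similar profiles share a label
print(labels[0] == labels[1], labels[2] == labels[3])
```

The label numbers themselves are arbitrary identifiers: only whether two rows share a label carries meaning, which is why label 0 can be reported as "cluster 1" without any loss.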

In [27]:
#df that includes the cluster and top 10 venues
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

dwtn_merged = dwtn_data

# join the cluster labels and top venues onto dwtn_data, which carries latitude/longitude for each neighborhood
dwtn_merged = dwtn_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

dwtn_merged.head() # check the last columns!

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M4W,Downtown Toronto,Rosedale,43.679563,-79.377529,1,Park,Trail,Playground,Wine Shop,Dance Studio,Dumpling Restaurant,Donut Shop,Doner Restaurant,Dog Run,Distribution Center
1,M4X,Downtown Toronto,"Cabbagetown, St. James Town",43.667967,-79.367675,0,Coffee Shop,Restaurant,Pizza Place,Chinese Restaurant,Pub,Bakery,Café,Convenience Store,Italian Restaurant,Japanese Restaurant
2,M4Y,Downtown Toronto,Church and Wellesley,43.66586,-79.38316,0,Coffee Shop,Japanese Restaurant,Sushi Restaurant,Restaurant,Gay Bar,Yoga Studio,Men's Store,Mediterranean Restaurant,Hotel,Fast Food Restaurant
3,M5A,Downtown Toronto,Harbourfront,43.65426,-79.360636,0,Coffee Shop,Pub,Bakery,Park,Café,Breakfast Spot,Theater,Wine Shop,Electronics Store,Performing Arts Venue
4,M5B,Downtown Toronto,"Ryerson, Garden District",43.657162,-79.378937,0,Coffee Shop,Clothing Store,Café,Bubble Tea Shop,Hotel,Cosmetics Shop,Middle Eastern Restaurant,Italian Restaurant,Japanese Restaurant,Electronics Store


In [28]:
#map
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=13)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(dwtn_merged['Latitude'], dwtn_merged['Longitude'], dwtn_merged['Neighborhood'], dwtn_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
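
The color setup can be reduced to one color per cluster label. This is a sketch of an alternative, not the notebook's own code: `palette` and `color_for` are hypothetical names, and the palette is indexed directly by label.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 5  # same number of clusters as in the notebook

# one evenly spaced rainbow color per cluster, as hex strings
palette = [colors.rgb2hex(c) for c in cm.rainbow(np.linspace(0, 1, kclusters))]

# index the palette by label so label 0 gets the first color
color_for = {label: palette[label] for label in range(kclusters)}
print(color_for)
```

Indexing with `cluster-1`, as the cell above does, also works, because index `-1` wraps around to the last entry of the list; it only shifts which color each label receives.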

#### Examine Clusters

In [29]:
#cluster 1
dwtn_merged.loc[dwtn_merged['Cluster Labels'] == 0, dwtn_merged.columns[[1] + list(range(5, dwtn_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Downtown Toronto,0,Coffee Shop,Restaurant,Pizza Place,Chinese Restaurant,Pub,Bakery,Café,Convenience Store,Italian Restaurant,Japanese Restaurant
2,Downtown Toronto,0,Coffee Shop,Japanese Restaurant,Sushi Restaurant,Restaurant,Gay Bar,Yoga Studio,Men's Store,Mediterranean Restaurant,Hotel,Fast Food Restaurant
3,Downtown Toronto,0,Coffee Shop,Pub,Bakery,Park,Café,Breakfast Spot,Theater,Wine Shop,Electronics Store,Performing Arts Venue
4,Downtown Toronto,0,Coffee Shop,Clothing Store,Café,Bubble Tea Shop,Hotel,Cosmetics Shop,Middle Eastern Restaurant,Italian Restaurant,Japanese Restaurant,Electronics Store
5,Downtown Toronto,0,Coffee Shop,Café,Cosmetics Shop,Gastropub,Cocktail Bar,Clothing Store,Hotel,Seafood Restaurant,Farmers Market,Italian Restaurant
6,Downtown Toronto,0,Coffee Shop,Cocktail Bar,Bakery,Beer Bar,Cheese Shop,Farmers Market,Restaurant,Seafood Restaurant,Pharmacy,Liquor Store
8,Downtown Toronto,0,Coffee Shop,Café,Restaurant,Thai Restaurant,Clothing Store,Gym,Deli / Bodega,Burrito Place,Lounge,Salad Place
9,Downtown Toronto,0,Coffee Shop,Aquarium,Café,Hotel,Restaurant,Scenic Lookout,Italian Restaurant,Fried Chicken Joint,Pizza Place,Brewery
10,Downtown Toronto,0,Coffee Shop,Hotel,Café,Japanese Restaurant,Italian Restaurant,Salad Place,Restaurant,Seafood Restaurant,Sushi Restaurant,Breakfast Spot
11,Downtown Toronto,0,Coffee Shop,Restaurant,Hotel,Café,Italian Restaurant,Gym,American Restaurant,Cocktail Bar,Japanese Restaurant,Deli / Bodega


In [30]:
#cluster 2
dwtn_merged.loc[dwtn_merged['Cluster Labels'] == 1, dwtn_merged.columns[[1] + list(range(5, dwtn_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Downtown Toronto,1,Park,Trail,Playground,Wine Shop,Dance Studio,Dumpling Restaurant,Donut Shop,Doner Restaurant,Dog Run,Distribution Center


In [31]:
#cluster 3
dwtn_merged.loc[dwtn_merged['Cluster Labels'] == 2, dwtn_merged.columns[[1] + list(range(5, dwtn_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
17,Downtown Toronto,2,Grocery Store,Café,Park,Athletics & Sports,Italian Restaurant,Nightclub,Candy Store,Restaurant,Baby Store,Coffee Shop


In [32]:
#cluster 4
dwtn_merged.loc[dwtn_merged['Cluster Labels'] == 3, dwtn_merged.columns[[1] + list(range(5, dwtn_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
14,Downtown Toronto,3,Airport Service,Airport Lounge,Boutique,Rental Car Location,Harbor / Marina,Boat or Ferry,Airport Terminal,Sculpture Garden,Airport Gate,Airport Food Court


In [33]:
#cluster 5
dwtn_merged.loc[dwtn_merged['Cluster Labels'] == 4, dwtn_merged.columns[[1] + list(range(5, dwtn_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
7,Downtown Toronto,4,Coffee Shop,Sandwich Place,Café,Italian Restaurant,Salad Place,Bubble Tea Shop,Burger Joint,Japanese Restaurant,Dessert Shop,Portuguese Restaurant
18,Downtown Toronto,4,Coffee Shop,College Cafeteria,Sushi Restaurant,Yoga Studio,Discount Store,Smoothie Shop,Italian Restaurant,Beer Bar,Japanese Restaurant,Sandwich Place
