# Capstone Project - Open a Gym

## This is the Final Report

### Purpose of this study: identify the most appropriate location to open a gym within the PATH network in Toronto, Ontario, Canada.

- Building a DataFrame of PATH buildings in Toronto from a Wikipedia category page.
- Getting the geographical coordinates of each building.
- Obtaining venue data around each building from the Foursquare API.
- Exploring and clustering the buildings.
- Identifying the most appropriate cluster of locations for opening a gym.

In [2]:
!pip install geocoder
!pip install folium


In [3]:
import pandas as pd
import requests
import numpy as np
import geocoder
import folium
import matplotlib.cm as cm
import matplotlib.colors as colors
import json
import xml
import matplotlib.pyplot as plt
%matplotlib inline
import warnings
warnings.filterwarnings("ignore")

from pandas.io.json import json_normalize 
from sklearn.cluster import KMeans
from geopy.geocoders import Nominatim 
from bs4 import BeautifulSoup

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

print("All Required Libraries Imported!")

All Required Libraries Imported!


### Scrape data from Wikipedia page into a Data-Frame

In [5]:
# send the GET request
data = requests.get("https://en.wikipedia.org/wiki/Category:PATH_(Toronto)").text

In [6]:
# parse data from the html into a beautifulsoup object
soup = BeautifulSoup(data, 'html.parser')

In [7]:
# create a list to store the PATH building names
Path = []

In [8]:
# append the data into the list
for row in soup.find_all("div", class_="mw-category")[0].find_all("li"):
    Path.append(row.text)

In [9]:
# create a new DataFrame from the list
PB_df = pd.DataFrame({"Path_Building": Path})

PB_df.head(80)

Unnamed: 0,Path_Building
0,PATH (Toronto)
1,10 Dundas East
2,Atrium on Bay
3,Bay Adelaide Centre
4,Brookfield Place (Toronto)
5,Canadian Broadcasting Centre
6,Commerce Court
7,Design Exchange
8,Dundas station (Toronto)
9,Exchange Tower


In [10]:
# show the shape (rows, columns) of the DataFrame
PB_df.shape

(39, 1)

### Getting the geographical coordinates

In [12]:
# define a function to get coordinates
def get_latlng(neighborhood):
    # initialize your variable to None
    lat_lng_coords = None
    # loop until you get the coordinates
    while(lat_lng_coords is None):
        g = geocoder.arcgis('{}, Path Toronto'.format(neighborhood))
        lat_lng_coords = g.latlng
    return lat_lng_coords
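
The loop above retries until `g.latlng` is non-empty, so a name that ArcGIS can never resolve would keep the cell running indefinitely. A minimal sketch of a bounded alternative follows; `geocode_with_retry`, `max_attempts`, and `pause_seconds` are illustrative names that are not part of the original notebook, and the lookup is passed in as a function so the retry logic stays independent of any particular geocoding service.

```python
import time


def geocode_with_retry(lookup, query, max_attempts=5, pause_seconds=0.0):
    """Call lookup(query) until it returns coordinates or attempts run out.

    `lookup` is any callable returning [lat, lng] or None (hypothetical
    interface, chosen to mirror what `geocoder.arcgis(...).latlng` yields).
    """
    for attempt in range(max_attempts):
        coords = lookup(query)
        if coords is not None:
            return coords
        # wait briefly before the next attempt to avoid hammering the service
        if pause_seconds and attempt < max_attempts - 1:
            time.sleep(pause_seconds)
    # give up and let the caller decide how to handle the missing value
    return None
```

In the notebook this could be called as `geocode_with_retry(lambda q: geocoder.arcgis(q).latlng, '{}, Path Toronto'.format(neighborhood))`; rows that come back as `None` would then need to be dropped or fixed by hand before mapping.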

In [13]:
# call the function to get the coordinates, store in a new list using list comprehension
coords = [ get_latlng(neighborhood) for neighborhood in PB_df["Path_Building"].tolist() ]

In [15]:
coords

[[43.648690000000045, -79.38543999999996],
 [43.656680008236634, -79.38064998918243],
 [43.801280054636905, -79.15028989275464],
 [43.65163000000007, -79.37915999999996],
 [43.646520000000066, -79.37873999999994],
 [43.644420000000025, -79.38765999999998],
 [43.64878998985561, -79.37951489175764],
 [43.64814000000007, -79.38043999999996],
 [43.65589073478537, -79.37974681693626],
 [43.66473000000008, -79.39829999999995],
 [43.75228000378917, -79.30161207964568],
 [43.64595000000003, -79.38142999999997],
 [54.66968000000003, -1.6908399999999801],
 [43.64846470097693, -79.38097976326995],
 [43.647110000000055, -79.37733999999995],
 [43.648690000000045, -79.38543999999996],
 [43.772020000000055, -79.18635999999998],
 [43.712248201148995, -79.49059362853455],
 [43.809826646335885, -79.26234508373462],
 [43.614681120542286, -79.4940540459868],
 [43.64584000000008, -79.38569999999999],
 [43.64670000000007, -79.38640999999996],
 [43.64667040986288, -79.37941831448245],
 [43.65777000000003, -7

In [16]:
# create temporary dataframe to populate the coordinates into Latitude and Longitude
df_coords = pd.DataFrame(coords, columns=['Latitude', 'Longitude'])

In [17]:
# merge the coordinates into the original dataframe
PB_df['Latitude'] = df_coords['Latitude']
PB_df['Longitude'] = df_coords['Longitude']

In [18]:
# check the neighborhoods and the coordinates
print(PB_df.shape)
PB_df

(39, 3)


Unnamed: 0,Path_Building,Latitude,Longitude
0,PATH (Toronto),43.64869,-79.38544
1,10 Dundas East,43.65668,-79.38065
2,Atrium on Bay,43.80128,-79.15029
3,Bay Adelaide Centre,43.65163,-79.37916
4,Brookfield Place (Toronto),43.64652,-79.37874
5,Canadian Broadcasting Centre,43.64442,-79.38766
6,Commerce Court,43.64879,-79.379515
7,Design Exchange,43.64814,-79.38044
8,Dundas station (Toronto),43.655891,-79.379747
9,Exchange Tower,43.66473,-79.3983
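
Some coordinates in the table fall well outside downtown Toronto (for example, Atrium on Bay is placed at 43.80128, -79.15029, and one entry in the raw list has a latitude of about 54.67). A rough sanity check can flag such rows before they are mapped or sent to Foursquare. The sketch below uses a hand-picked bounding box; the box limits and the helper name `outside_box` are assumptions for illustration, not values from the original analysis.

```python
# Approximate bounding box around downtown Toronto (assumed limits).
LAT_MIN, LAT_MAX = 43.63, 43.68
LNG_MIN, LNG_MAX = -79.41, -79.36


def outside_box(rows):
    """Return names whose (lat, lng) falls outside the assumed bounding box."""
    flagged = []
    for name, lat, lng in rows:
        inside = LAT_MIN <= lat <= LAT_MAX and LNG_MIN <= lng <= LNG_MAX
        if not inside:
            flagged.append(name)
    return flagged


# A few rows copied from the table above.
sample = [
    ("PATH (Toronto)", 43.64869, -79.38544),
    ("10 Dundas East", 43.65668, -79.38065),
    ("Atrium on Bay", 43.80128, -79.15029),
    ("Exchange Tower", 43.66473, -79.3983),
]
print(outside_box(sample))
```

Flagged rows could then be re-geocoded with a more specific query or corrected by hand.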


In [19]:
# save the DataFrame as CSV file
PB_df.to_csv("PB_df.csv", index=False)

### Creating a map of Downtown Toronto with the PATH building locations superimposed

In [20]:
# get the coordinates of Downtown, Toronto
address = 'Downtown, Toronto'

geolocator = Nominatim(user_agent="my-application")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Downtown, Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Downtown, Toronto are 43.6541737, -79.38081164513409.


In [54]:
# create map of Downtown, Toronto using latitude and longitude values
map_Path = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, neighborhood in zip(PB_df['Latitude'], PB_df['Longitude'], PB_df['Path_Building']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='red',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_Path)  
    
map_Path

In [55]:
# save the map as HTML file
map_Path.save('map_Path.html')

### Using the Foursquare API exploring the area

In [24]:
# The code was removed by Watson Studio for sharing.

Your credentials:
CLIENT_ID: [redacted]
CLIENT_SECRET: [redacted]
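
Because the credentials cell is hidden, the later request code relies on `CLIENT_ID`, `CLIENT_SECRET`, and `VERSION` being defined somewhere out of view. One common way to keep such values out of a shared notebook is to read them from environment variables. The sketch below assumes hypothetical variable names (`FOURSQUARE_CLIENT_ID`, `FOURSQUARE_CLIENT_SECRET`, `FOURSQUARE_VERSION`); the default version string is only a placeholder date in the `YYYYMMDD` form that the v2 API expects.

```python
import os


def load_foursquare_settings(env=os.environ):
    """Read API settings from environment variables (names are assumptions)."""
    client_id = env.get("FOURSQUARE_CLIENT_ID")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError("Foursquare credentials are not set in the environment")
    # placeholder default; any valid YYYYMMDD date string could be used
    version = env.get("FOURSQUARE_VERSION", "20180605")
    return client_id, client_secret, version
```

The returned tuple could be unpacked as `CLIENT_ID, CLIENT_SECRET, VERSION = load_foursquare_settings()` so that nothing sensitive is printed in the cell output.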


### Places that are within a radius of 3000 meters.

In [25]:
radius = 3000
LIMIT = 110

venues = []

for lat, long, neighborhood in zip(PB_df['Latitude'], PB_df['Longitude'], PB_df['Path_Building']):
    
    # create the API request URL
    url = "https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}".format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        lat,
        long,
        radius, 
        LIMIT)
    
    # make the GET request
    results = requests.get(url).json()["response"]['groups'][0]['items']
    
    # return only relevant information for each nearby venue
    for venue in results:
        venues.append((
            neighborhood,
            lat, 
            long, 
            venue['venue']['name'], 
            venue['venue']['location']['lat'], 
            venue['venue']['location']['lng'],  
            venue['venue']['categories'][0]['name']))
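
The loop above builds the query string by hand and indexes straight into the JSON response, which would raise an error if a venue came back without a category. As a hedged alternative, the request parameters can be kept in a dictionary and the parsing moved into a small function that tolerates missing fields. The helper names `build_explore_params` and `extract_venues` are illustrative, and the payload shape is assumed to match what the loop above reads.

```python
def build_explore_params(client_id, client_secret, version, lat, lng,
                         radius=3000, limit=110):
    """Collect the query parameters used by the explore request."""
    return {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }


def extract_venues(payload, building, lat, lng):
    """Flatten one explore response into tuples like those in `venues`."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        # fall back to None when a venue has no category listed
        category = categories[0]["name"] if categories else None
        rows.append((
            building, lat, lng,
            venue["name"],
            venue["location"]["lat"],
            venue["location"]["lng"],
            category,
        ))
    return rows
```

With `requests`, the parameters could be passed as `requests.get("https://api.foursquare.com/v2/venues/explore", params=params, timeout=10)`, and the decoded JSON handed to `extract_venues`.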

In [28]:
# convert the venues list into a new DataFrame
venues_df = pd.DataFrame(venues)

# define the column names
venues_df.columns = ['Path_Building', 'Latitude', 'Longitude', 'VenueName', 'VenueLatitude', 'VenueLongitude', 'VenueCategory']

print(venues_df.shape)
venues_df.head(30)

(3690, 7)


Unnamed: 0,Path_Building,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
0,PATH (Toronto),43.64869,-79.38544,Pai,43.647923,-79.388579,Thai Restaurant
1,PATH (Toronto),43.64869,-79.38544,Byblos Toronto,43.647615,-79.388381,Mediterranean Restaurant
2,PATH (Toronto),43.64869,-79.38544,Soho House Toronto,43.648734,-79.386541,Speakeasy
3,PATH (Toronto),43.64869,-79.38544,Adelaide Club Toronto,43.649279,-79.381921,Gym / Fitness Center
4,PATH (Toronto),43.64869,-79.38544,Downtown Toronto,43.653232,-79.385296,Neighborhood
5,PATH (Toronto),43.64869,-79.38544,Pizzeria Libretto,43.648334,-79.385111,Pizza Place
6,PATH (Toronto),43.64869,-79.38544,Nathan Phillips Square,43.65227,-79.383516,Plaza
7,PATH (Toronto),43.64869,-79.38544,Equinox Bay Street,43.6481,-79.379989,Gym
8,PATH (Toronto),43.64869,-79.38544,Delta Hotels by Marriott Toronto,43.642882,-79.383949,Hotel
9,PATH (Toronto),43.64869,-79.38544,Richmond Station,43.651569,-79.379266,American Restaurant


### Number of venues returned for each PATH building

In [31]:
venues_df.groupby(["Path_Building"]).count()

Path_Building,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
10 Dundas East,100,100,100,100,100,100
Atrium on Bay,51,51,51,51,51,51
Bay Adelaide Centre,100,100,100,100,100,100
Brookfield Place (Toronto),100,100,100,100,100,100
Canadian Broadcasting Centre,100,100,100,100,100,100
Commerce Court,100,100,100,100,100,100
Design Exchange,100,100,100,100,100,100
Dundas station (Toronto),100,100,100,100,100,100
EY Tower,100,100,100,100,100,100
Exchange Tower,100,100,100,100,100,100


### Unique venue categories

In [32]:
print('There are {} unique categories.'.format(len(venues_df['VenueCategory'].unique())))

There are 174 unique categories.


In [34]:
# print out the list of categories
venues_df['VenueCategory'].unique()[:60]

array(['Thai Restaurant', 'Mediterranean Restaurant', 'Speakeasy',
       'Gym / Fitness Center', 'Neighborhood', 'Pizza Place', 'Plaza',
       'Gym', 'Hotel', 'American Restaurant', 'Café', 'Pub',
       'Monument / Landmark', 'Park', 'Brewery', 'Theater', 'Aquarium',
       'Beer Bar', 'French Restaurant', 'Art Gallery', 'Cosmetics Shop',
       'Diner', 'Dessert Shop', 'Food Truck', 'Bookstore', 'Museum',
       'Clothing Store', 'Sporting Goods Shop', 'Sandwich Place',
       'Basketball Stadium', 'Vegetarian / Vegan Restaurant',
       'Record Shop', 'Baseball Stadium', 'Coffee Shop',
       'Japanese Restaurant', 'Restaurant', 'Shopping Mall',
       'Seafood Restaurant', 'Food & Drink Shop', 'Farmers Market',
       'Performing Arts Venue', 'Mexican Restaurant', 'Creperie',
       'Furniture / Home Store', 'Lake', 'Ice Cream Shop', 'Spa',
       'Italian Restaurant', 'Street Art', 'Middle Eastern Restaurant',
       'Burrito Place', 'Bakery', 'Skating Rink', 'Concert Hall',
   

In [35]:
# check if the results contain "Gym"
"Gym" in venues_df['VenueCategory'].unique()

True

### Analyzing PATH Buildings

In [36]:
# one hot encoding
PB_onehot = pd.get_dummies(venues_df[['VenueCategory']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
PB_onehot['Path_Building'] = venues_df['Path_Building'] 

# move neighborhood column to the first column
fixed_columns = [PB_onehot.columns[-1]] + list(PB_onehot.columns[:-1])
PB_onehot = PB_onehot[fixed_columns]

print(PB_onehot.shape)
PB_onehot.head(10)

(3690, 175)


Unnamed: 0,Path_Building,Afghan Restaurant,African Restaurant,American Restaurant,Amphitheater,Antique Shop,Aquarium,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Automotive Shop,BBQ Joint,Bakery,Bank,Bar,Baseball Stadium,Basketball Stadium,Beach,Beer Bar,Beer Store,Bistro,Bookstore,Botanical Garden,Boutique,Bowling Alley,Breakfast Spot,Brewery,Bridge,Bubble Tea Shop,Burger Joint,Burrito Place,Bus Station,Business Service,Butcher,Café,Campground,Cantonese Restaurant,Caribbean Restaurant,Castle,Chinese Restaurant,Chocolate Shop,Clothing Store,Cocktail Bar,Coffee Shop,Comedy Club,Comfort Food Restaurant,Concert Hall,Convenience Store,Cosmetics Shop,Creperie,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Distribution Center,Dog Run,Electronics Store,Event Space,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Fish Market,Food & Drink Shop,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Gas Station,Gastropub,Gay Bar,General Entertainment,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Hakka Restaurant,Hardware Store,Historic Site,Hockey Arena,Home Service,Hostel,Hotel,Ice Cream Shop,Indian Restaurant,Intersection,Italian Restaurant,Japanese Restaurant,Juice Bar,Korean Restaurant,Lake,Latin American Restaurant,Liquor Store,Lounge,Malay Restaurant,Market,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Monument / Landmark,Movie Theater,Museum,Music School,Music Store,National Park,Neighborhood,New American Restaurant,Noodle House,Optical Shop,Organic Grocery,Paintball Field,Paper / Office Supplies Store,Park,Performing Arts Venue,Pet Store,Pharmacy,Pilates Studio,Pizza Place,Plaza,Poke Place,Pool Hall,Portuguese Restaurant,Pub,Ramen Restaurant,Record Shop,Restaurant,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Skating Rink,Smoothie Shop,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Sri Lankan Restaurant,Steakhouse,Street Art,Supermarket,Supplement Shop,Sushi Restaurant,Taco Place,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Restaurant,Thrift / Vintage Store,Toy / Game Store,Track,Trail,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Warehouse Store,Wings Joint,Xinjiang Restaurant,Yoga Studio
0,PATH (Toronto),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,PATH (Toronto),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,PATH (Toronto),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,PATH (Toronto),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,PATH (Toronto),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
5,PATH (Toronto),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
6,PATH (Toronto),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
7,PATH (Toronto),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
8,PATH (Toronto),0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
9,PATH (Toronto),0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


### Grouping PATH Building by the mean of the frequency of occurrence for each category

In [37]:
PB_grouped = PB_onehot.groupby(["Path_Building"]).mean().reset_index()

print(PB_grouped.shape)
PB_grouped

(39, 175)


Unnamed: 0,Path_Building,Afghan Restaurant,African Restaurant,American Restaurant,Amphitheater,Antique Shop,Aquarium,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Automotive Shop,BBQ Joint,Bakery,Bank,Bar,Baseball Stadium,Basketball Stadium,Beach,Beer Bar,Beer Store,Bistro,Bookstore,Botanical Garden,Boutique,Bowling Alley,Breakfast Spot,Brewery,Bridge,Bubble Tea Shop,Burger Joint,Burrito Place,Bus Station,Business Service,Butcher,Café,Campground,Cantonese Restaurant,Caribbean Restaurant,Castle,Chinese Restaurant,Chocolate Shop,Clothing Store,Cocktail Bar,Coffee Shop,Comedy Club,Comfort Food Restaurant,Concert Hall,Convenience Store,Cosmetics Shop,Creperie,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Distribution Center,Dog Run,Electronics Store,Event Space,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Fish Market,Food & Drink Shop,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Gas Station,Gastropub,Gay Bar,General Entertainment,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Hakka Restaurant,Hardware Store,Historic Site,Hockey Arena,Home Service,Hostel,Hotel,Ice Cream Shop,Indian Restaurant,Intersection,Italian Restaurant,Japanese Restaurant,Juice Bar,Korean Restaurant,Lake,Latin American Restaurant,Liquor Store,Lounge,Malay Restaurant,Market,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Monument / Landmark,Movie Theater,Museum,Music School,Music Store,National Park,Neighborhood,New American Restaurant,Noodle House,Optical Shop,Organic Grocery,Paintball Field,Paper / Office Supplies Store,Park,Performing Arts Venue,Pet Store,Pharmacy,Pilates Studio,Pizza Place,Plaza,Poke Place,Pool Hall,Portuguese Restaurant,Pub,Ramen Restaurant,Record Shop,Restaurant,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Skating Rink,Smoothie Shop,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Sri Lankan Restaurant,Steakhouse,Street Art,Supermarket,Supplement Shop,Sushi Restaurant,Taco Place,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Restaurant,Thrift / Vintage Store,Toy / Game Store,Track,Trail,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Warehouse Store,Wings Joint,Xinjiang Restaurant,Yoga Studio
0,10 Dundas East,0.0,0.0,0.02,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.01,0.0,0.02,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.1,0.0,0.0,0.02,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.01,0.01,0.0,0.03,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.01,0.03,0.01,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.06,0.0,0.0,0.0,0.0,0.01,0.02,0.01,0.0,0.0,0.0,0.0,0.01,0.04,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0
1,Atrium on Bay,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.039216,0.0,0.0,0.0,0.019608,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.019608,0.0,0.0,0.019608,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.078431,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.019608,0.019608,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.019608,0.0,0.0,0.0,0.019608,0.0,0.0,0.078431,0.0,0.0,0.039216,0.019608,0.058824,0.0,0.0,0.0,0.0,0.039216,0.0,0.0,0.0,0.039216,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.019608,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Bay Adelaide Centre,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.01,0.0,0.02,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.1,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.01,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.04,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.06,0.01,0.0,0.0,0.0,0.01,0.02,0.01,0.0,0.0,0.0,0.0,0.01,0.04,0.02,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0
3,Brookfield Place (Toronto),0.0,0.0,0.01,0.0,0.0,0.01,0.02,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.11,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.04,0.01,0.0,0.0,0.01,0.02,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.04,0.01,0.0,0.0,0.0,0.01,0.04,0.0,0.0,0.0,0.0,0.0,0.01,0.03,0.03,0.01,0.01,0.01,0.0,0.01,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.03
4,Canadian Broadcasting Centre,0.0,0.0,0.01,0.0,0.0,0.01,0.03,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.01,0.0,0.03,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.04,0.0,0.0,0.01,0.0,0.02,0.01,0.01,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.01,0.0,0.0,0.02,0.02,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.02,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.05,0.01,0.0,0.0,0.0,0.02,0.03,0.0,0.0,0.0,0.01,0.01,0.01,0.02,0.02,0.0,0.02,0.0,0.0,0.01,0.01,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01
5,Commerce Court,0.0,0.0,0.01,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.02,0.01,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.1,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.04,0.01,0.0,0.0,0.02,0.02,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.04,0.01,0.0,0.0,0.0,0.01,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.03,0.03,0.01,0.01,0.01,0.0,0.01,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.03
6,Design Exchange,0.0,0.0,0.01,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.02,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.11,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.04,0.01,0.0,0.0,0.02,0.02,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.04,0.01,0.0,0.0,0.0,0.01,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.03,0.03,0.01,0.01,0.01,0.0,0.01,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.03
7,Dundas station (Toronto),0.0,0.0,0.02,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.01,0.0,0.02,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.11,0.0,0.0,0.02,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.01,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.01,0.03,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.06,0.0,0.0,0.0,0.0,0.01,0.02,0.01,0.0,0.0,0.0,0.0,0.01,0.04,0.02,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0
8,EY Tower,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.03,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.03,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.04,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.02,0.0,0.0,0.02,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.05,0.02,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.02,0.02,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.02,0.0,0.02,0.0,0.0,0.03,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.02,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.04,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.01,0.0,0.0
9,Exchange Tower,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.01,0.03,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.05,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.02,0.07,0.0,0.0,0.01,0.0,0.01,0.0,0.02,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.02,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.02,0.02,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.03,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.01,0.01,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.03,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0
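
Each value in the grouped table is the share of a building's returned venues that fall in a given category: averaging a one-hot column for a building is the same as dividing the category count by the number of venues for that building. For instance, 0.03 under Gym for a building with 100 venues means three of them were categorised as Gym. A plain-Python sketch of that arithmetic, using made-up toy rows rather than the real data (the helper name `category_shares` is an assumption):

```python
from collections import Counter, defaultdict


def category_shares(rows):
    """Map each building to {category: share of that building's venues}."""
    counts = defaultdict(Counter)
    for building, category in rows:
        counts[building][category] += 1
    shares = {}
    for building, counter in counts.items():
        total = sum(counter.values())
        # categories a building never has are simply absent here,
        # whereas the one-hot table stores them as 0.0
        shares[building] = {cat: n / total for cat, n in counter.items()}
    return shares


# Toy data (invented for illustration, not taken from the Foursquare results).
toy = [
    ("Building A", "Gym"),
    ("Building A", "Café"),
    ("Building A", "Café"),
    ("Building A", "Hotel"),
    ("Building B", "Café"),
    ("Building B", "Park"),
]
shares = category_shares(toy)
print(shares["Building A"]["Gym"])  # 1 of 4 venues
```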


In [38]:
len(PB_grouped[PB_grouped["Gym"] > 0])

25

### Creating a new Data-Frame of Gyms

In [39]:
PB_mall = PB_grouped[["Path_Building","Gym"]]

In [41]:
PB_mall.head(40)

Unnamed: 0,Path_Building,Gym
0,10 Dundas East,0.0
1,Atrium on Bay,0.0
2,Bay Adelaide Centre,0.0
3,Brookfield Place (Toronto),0.01
4,Canadian Broadcasting Centre,0.03
5,Commerce Court,0.01
6,Design Exchange,0.01
7,Dundas station (Toronto),0.0
8,EY Tower,0.02
9,Exchange Tower,0.0


### Clustering PATH Buildings

In [42]:
# set number of clusters
kclusters = 3

PB_clustering = PB_mall.drop(columns=["Path_Building"])

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(PB_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([2, 2, 2, 0, 1, 0, 0, 2, 0, 2], dtype=int32)

In [43]:
# copy the gym-frequency DataFrame so a cluster label can be added for each building
PB_merged = PB_mall.copy()

# add clustering labels
PB_merged["Cluster Labels"] = kmeans.labels_

In [45]:
PB_merged.head(40)

Unnamed: 0,Path_Building,Gym,Cluster Labels
0,10 Dundas East,0.0,2
1,Atrium on Bay,0.0,2
2,Bay Adelaide Centre,0.0,2
3,Brookfield Place (Toronto),0.01,0
4,Canadian Broadcasting Centre,0.03,1
5,Commerce Court,0.01,0
6,Design Exchange,0.01,0
7,Dundas station (Toronto),0.0,2
8,EY Tower,0.02,0
9,Exchange Tower,0.0,2


In [46]:
# sort the results by Cluster Labels
print(PB_merged.shape)
PB_merged.sort_values(["Cluster Labels"], inplace=True)
PB_merged

(39, 3)


Unnamed: 0,Path_Building,Gym,Cluster Labels
19,Queen station,0.01,0
25,Scotiabank Arena,0.01,0
24,Scotia Plaza,0.01,0
22,Royal Bank Plaza,0.01,0
37,Union station (TTC),0.01,0
12,First Canadian Place,0.01,0
10,Fairmont Royal York,0.01,0
8,EY Tower,0.02,0
13,Hockey Hall of Fame,0.01,0
6,Design Exchange,0.01,0


### Examining Clusters

#### Cluster # 0

In [56]:
PB_merged.loc[PB_merged['Cluster Labels'] == 0]

Unnamed: 0,Path_Building,Gym,Cluster Labels
19,Queen station,0.01,0
25,Scotiabank Arena,0.01,0
24,Scotia Plaza,0.01,0
22,Royal Bank Plaza,0.01,0
37,Union station (TTC),0.01,0
12,First Canadian Place,0.01,0
10,Fairmont Royal York,0.01,0
8,EY Tower,0.02,0
13,Hockey Hall of Fame,0.01,0
6,Design Exchange,0.01,0


#### Cluster # 1

In [57]:
PB_merged.loc[PB_merged['Cluster Labels'] == 1]

Unnamed: 0,Path_Building,Gym,Cluster Labels
18,PATH (Toronto),0.03,1
28,Southcore Financial Centre,0.03,1
21,Roy Thomson Hall,0.03,1
20,RBC Centre,0.03,1
4,Canadian Broadcasting Centre,0.03,1
27,"South Core, Toronto",0.03,1
29,St. Andrew station,0.03,1
30,Sun Life Centre,0.030303,1
31,Telus Harbour,0.04,1
14,Hudson's Bay Queen Street,0.03,1


#### Cluster # 2

In [58]:
PB_merged.loc[PB_merged['Cluster Labels'] == 2]

Unnamed: 0,Path_Building,Gym,Cluster Labels
32,Toronto City Hall,0.0,2
34,Toronto Eaton Centre,0.0,2
33,Toronto Coach Terminal,0.0,2
0,10 Dundas East,0.0,2
17,Metro Hall,0.0,2
16,Maple Leaf Square,0.0,2
15,King station (Toronto),0.0,2
11,"Financial District, Toronto",0.0,2
9,Exchange Tower,0.0,2
7,Dundas station (Toronto),0.0,2
