# IBM Data Science Capstone Project
#### Finding the optimal location for a bubble tea store in the City of Calgary

### Business Problem

The growing popularity of bubble tea has led many investors to open stores, both chain and independent, in their own cities. The main purpose of this project is to help investors find a promising neighbourhood for a new bubble tea store in the City of Calgary. The project provides an analysis of the population density of the City of Calgary, uses machine learning methods to cluster neighbourhoods, and queries the Foursquare API to obtain the venues in each neighbourhood. The recommended location will be based on an analysis of population density, restaurant density, and the density of existing bubble tea stores.


### Data Source

•	City of Calgary, Communities by Sector: contains the community names in the City of Calgary, the sector each community belongs to, and the corresponding coordinates. (Last updated: September 11, 2020, https://data.calgary.ca/Base-Maps/Communities-by-Sector/e6xg-kaxf)

•	A list of top venues in these neighbourhoods, acquired using the Foursquare API

##### Import Community Data from City of Calgary Website

In [3]:
# For this exercise, the pandas and NumPy libraries will be used
import pandas as pd
import numpy as np

In [4]:
URL = 'https://data.calgary.ca/api/views/j9ps-fyst/rows.csv?accessType=DOWNLOAD'

city_data = pd.read_csv(URL)
city_data.head()

Unnamed: 0,CLASS,CLASS_CODE,COMM_CODE,NAME,SECTOR,SRG,COMM_STRUCTURE,longitude,latitude,location
0,Residential,1,THS,TWINHILLS,EAST,DEVELOPING,BUILDING OUT,-113.87711,51.045111,"(51.045111353378694, -113.87710975220665)"
1,Residential,1,WIL,WILLOW PARK,SOUTH,BUILT-OUT,1960s/1970s,-114.056204,50.956623,"(50.95662292848714, -114.05620363150967)"
2,Residual Sub Area,4,05D,05D,NORTHEAST,,UNDEVELOPED,-113.958662,51.179598,"(51.17959764644064, -113.95866183876556)"
3,Industrial,2,ST4,STONEY 4,NORTHEAST,,EMPLOYMENT,-114.002762,51.176204,"(51.17620448693238, -114.00276157771617)"
4,Residential,1,PKH,PARKHILL,CENTRE,BUILT-OUT,1950s,-114.065552,51.018181,"(51.01818071993347, -114.06555236114401)"


##### Clean Data

In [6]:
# This exercise focuses only on residential communities, so keep only rows where CLASS == "Residential"
city_data = city_data[city_data["CLASS"] == "Residential"]
city_data.head()

Unnamed: 0,CLASS,CLASS_CODE,COMM_CODE,NAME,SECTOR,SRG,COMM_STRUCTURE,longitude,latitude,location
0,Residential,1,THS,TWINHILLS,EAST,DEVELOPING,BUILDING OUT,-113.87711,51.045111,"(51.045111353378694, -113.87710975220665)"
1,Residential,1,WIL,WILLOW PARK,SOUTH,BUILT-OUT,1960s/1970s,-114.056204,50.956623,"(50.95662292848714, -114.05620363150967)"
4,Residential,1,PKH,PARKHILL,CENTRE,BUILT-OUT,1950s,-114.065552,51.018181,"(51.01818071993347, -114.06555236114401)"
5,Residential,1,PAT,PATTERSON,WEST,BUILT-OUT,1980s/1990s,-114.177047,51.063838,"(51.06383775082155, -114.17704650860274)"
6,Residential,1,RCK,ROSSCARROCK,WEST,BUILT-OUT,1950s,-114.145495,51.04328,"(51.04328023810093, -114.14549516107789)"


##### Remove unnecessary columns

In [9]:
#Only keep community name, lat and long
df = city_data[["NAME","longitude","latitude"]].copy()
df.head()

Unnamed: 0,NAME,longitude,latitude
0,TWINHILLS,-113.87711,51.045111
1,WILLOW PARK,-114.056204,50.956623
4,PARKHILL,-114.065552,51.018181
5,PATTERSON,-114.177047,51.063838
6,ROSSCARROCK,-114.145495,51.04328


In [12]:
#Change column name from NAME to Community
df.rename(columns= {"NAME":"Community"}, inplace = True)
df.columns

Index(['Community', 'longitude', 'latitude'], dtype='object')

In [13]:
#Capitalize community column
df['Community'] = df['Community'].str.capitalize()
df.head()

Unnamed: 0,Community,longitude,latitude
0,Twinhills,-113.87711,51.045111
1,Willow park,-114.056204,50.956623
4,Parkhill,-114.065552,51.018181
5,Patterson,-114.177047,51.063838
6,Rosscarrock,-114.145495,51.04328
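
Note that `str.capitalize()` upper-cases only the first character of the whole string, which is why multi-word names appear as "Willow park" in the output above. If title-cased names are preferred, `str.title()` is one option. The short sketch below compares the two on names from this dataset using plain Python strings (pandas' `.str.capitalize()` and `.str.title()` apply the same rules element-wise). It is an illustration only; the rest of this notebook keeps the `capitalize()` form.

```python
# Compare the two casing methods on community names from the dataset
names = ["WILLOW PARK", "TWINHILLS", "ROSSCARROCK"]

capitalized = [n.capitalize() for n in names]  # only the first letter is upper-case
title_cased = [n.title() for n in names]       # first letter of every word is upper-case

for raw, cap, ttl in zip(names, capitalized, title_cased):
    print(f"{raw!r:15} capitalize -> {cap!r:15} title -> {ttl!r}")
```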


In [16]:
#Reset index
df.reset_index(drop = True, inplace = True)
df.head()

Unnamed: 0,Community,longitude,latitude
0,Twinhills,-113.87711,51.045111
1,Willow park,-114.056204,50.956623
2,Parkhill,-114.065552,51.018181
3,Patterson,-114.177047,51.063838
4,Rosscarrock,-114.145495,51.04328


##### Let's plot the communities on a map of Calgary!

In [169]:
#import map rendering library
import folium 

#Search for Calgary coordinates: lat = 51.049999, long = -114.066666
lat = 51.049999
long = -114.066666
Calgary_map = folium.Map(location = [lat, long], zoom_start = 10)

# add markers to map
for lat, lng, comm in zip(df['latitude'], df['longitude'], df['Community']):
    label = '{}'.format(comm)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker([lat, lng],
                        radius=5,
                        popup=label,
                        color='blue',
                        fill=True,
                        fill_color='#3186cc',
                        fill_opacity=0.7,
                        parse_html=False).add_to(Calgary_map)
Calgary_map

##### Use Foursquare and Get Venue Data

In [27]:
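#Define Foursquare credentials
# Read the credentials from environment variables rather than hard-coding them in the notebook.
# The variable names FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET are assumptions; use whatever names you export.
import os
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID']
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET']
VERSION = '20210126'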

In [28]:
import requests # library to handle requests

In [29]:
#Create a function to process all venues in Calgary
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lng,
            radius,
            100000)

        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        # return only relevant information for each nearby venue
        venues_list.append([(
            name,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])
    
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood',
                             'Neighborhood Latitude',
                             'Neighborhood Longitude',
                             'Venue',
                             'Venue Latitude',
                             'Venue Longitude',
                             'Venue Category']
    return nearby_venues
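
The list comprehension in `getNearbyVenues` assumes every venue has at least one category, so `v['venue']['categories'][0]` would raise an `IndexError` for a venue with an empty category list. A minimal sketch of a more defensive extraction step is shown below. The helper name `extract_venue_rows` and the hand-written `sample_items` payload are hypothetical; the payload only mimics the nesting that the function above already relies on and is not real API output.

```python
def extract_venue_rows(name, lat, lng, items):
    """Flatten Foursquare-style 'items' into rows, skipping venues without a category."""
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        if not categories:
            # no category to report, so skip this venue instead of raising IndexError
            continue
        rows.append((
            name,
            lat,
            lng,
            venue['name'],
            venue['location']['lat'],
            venue['location']['lng'],
            categories[0]['name'],
        ))
    return rows


# Hand-written sample mimicking the nesting used above (not real API output)
sample_items = [
    {'venue': {'name': 'Axe Music',
               'location': {'lat': 51.017012, 'lng': -114.063163},
               'categories': [{'name': 'Music Store'}]}},
    {'venue': {'name': 'Uncategorized place',
               'location': {'lat': 51.0, 'lng': -114.0},
               'categories': []}},
]

rows = extract_venue_rows('Parkhill', 51.018181, -114.065552, sample_items)
print(rows)
```

Whether to skip such venues or keep them under a placeholder category is a design choice; skipping keeps the later one-hot encoding free of an artificial category.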

In [30]:
# Run the function created above on each neighborhood and create a new dataframe Calgary_venues
Calgary_venues = getNearbyVenues(names=df['Community'],
                                   latitudes=df['latitude'],
                                   longitudes=df['longitude'])

Twinhills
Willow park
Parkhill
Patterson
Rosscarrock
Acadia
Shawnee slopes
Macewan glen
Elbow park
Glenbrook
Keystone hills
Rocky ridge
Downtown west end
Deer ridge
Erlton
Silver springs
Signal hill
Wolf willow
Rosemont
Shaganappi
Scarboro/ sunalta west
Cougar ridge
Dalhousie
Sunalta
Temple
Lewisburg
Abbeydale
Kincora
Maple ridge
Riverbend
Parkdale
Pineridge
Hotchkiss
Downtown east village
Evergreen
Coventry hills
Kingsland
Forest lawn
Manchester
Vista heights
Applewood park
Saddle ridge
Rosedale
Lower mount royal
Chinook park
Copperfield
Varsity
West hillhurst
Coral springs
Wildwood
Mount pleasant
Highland park
Springbank hill
Bankview
Brentwood
Cedarbrae
Sundance
Ranchlands
Point mckay
Valley ridge
Glendale
Fairview
Falconridge
Red carpet
Rutland park
Greenwood/greenbriar
Beddington heights
Kelvin grove
Cambrian heights
Marlborough
Hamptons
Southwood
Mckenzie towne
Skyview ranch
North haven
Meadowlark park
Crestmont
Ramsay
Monterey park
Crescent heights
Douglasdale/glen
Coach hill
No

In [31]:
# Review the newly created dataframe
print(Calgary_venues.shape)
Calgary_venues.head()

(1185, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Willow park,50.956623,-114.056204,Michael Hill,50.952635,-114.059239,Jewelry Store
1,Parkhill,51.018181,-114.065552,Axe Music,51.017012,-114.063163,Music Store
2,Parkhill,51.018181,-114.065552,Annex Ale Project,51.015039,-114.062072,Brewery
3,Parkhill,51.018181,-114.065552,Stanley Park,51.017171,-114.07157,Park
4,Parkhill,51.018181,-114.065552,Salt & Pepper,51.014624,-114.065525,Mexican Restaurant


## Distribution of restaurants in Calgary

This visualizes which communities have the most restaurants.

In [171]:
calgary_restro = Calgary_venues[Calgary_venues['Venue Category'].str.contains("Restaurant")]
calgary_restro.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
4,Parkhill,51.018181,-114.065552,Salt & Pepper,51.014624,-114.065525,Mexican Restaurant
5,Parkhill,51.018181,-114.065552,Alloy,51.016225,-114.06005,American Restaurant
6,Parkhill,51.018181,-114.065552,Sushi Ichiban,51.017674,-114.063167,Sushi Restaurant
7,Parkhill,51.018181,-114.065552,Seoul BBQ Restaurant,51.014878,-114.064625,Korean BBQ Restaurant
12,Parkhill,51.018181,-114.065552,McDonald's,51.019193,-114.061605,Fast Food Restaurant


In [170]:
#Search for Calgary coordinates: lat = 51.049999, long = -114.066666
lat = 51.049999
long = -114.066666
Calgary_restro = folium.Map(location = [lat, long], zoom_start = 10)

# add markers to map
for lat, lng, comm in zip(calgary_restro['Venue Latitude'], calgary_restro['Venue Longitude'], calgary_restro['Venue']):
    label = '{}'.format(comm)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker([lat, lng],
                        radius=5,
                        popup=label,
                        color='red',
                        fill=True,
                        fill_color='#3186cc',
                        fill_opacity=0.7,
                        parse_html=False).add_to(Calgary_restro)
Calgary_restro
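
The map shows where restaurants are, but a per-community count makes the comparison explicit. The sketch below counts matching venues per neighbourhood with `groupby`. The helper name `count_by_neighborhood` is hypothetical, and the small `sample` frame reuses a few rows from the `Calgary_venues` output shown earlier rather than the full data.

```python
import pandas as pd

def count_by_neighborhood(venues, pattern):
    """Count venues per neighbourhood whose category contains `pattern`."""
    mask = venues['Venue Category'].str.contains(pattern, regex=False)
    return (venues[mask]
            .groupby('Neighborhood')
            .size()
            .sort_values(ascending=False))

# A few rows copied from the Calgary_venues output shown earlier
sample = pd.DataFrame({
    'Neighborhood': ['Willow park', 'Parkhill', 'Parkhill', 'Parkhill', 'Parkhill'],
    'Venue': ['Michael Hill', 'Axe Music', 'Annex Ale Project', 'Stanley Park', 'Salt & Pepper'],
    'Venue Category': ['Jewelry Store', 'Music Store', 'Brewery', 'Park', 'Mexican Restaurant'],
})

restaurant_counts = count_by_neighborhood(sample, 'Restaurant')
print(restaurant_counts)
```

On the full data, `count_by_neighborhood(Calgary_venues, 'Restaurant')` would give the restaurant count per community. The same helper could be pointed at a bubble tea category, assuming the Foursquare data labels such venues with a category name containing "Bubble Tea".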

### Analyze Each Neighborhood

This analyzes each neighborhood by venue category.

In [62]:
calgary_onehot = pd.get_dummies(Calgary_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
calgary_onehot['Neighborhood'] = Calgary_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [calgary_onehot.columns[-1]] + list(calgary_onehot.columns[:-1])
calgary_onehot = calgary_onehot[fixed_columns]

calgary_onehot.head()

Unnamed: 0,Yoga Studio,Accessories Store,American Restaurant,Argentinian Restaurant,Arts & Crafts Store,Asian Restaurant,Astrologer,Athletics & Sports,Auto Garage,BBQ Joint,...,Trail,Train Station,Travel & Transport,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [63]:
# taking the mean of the frequency of occurrence of each category
calgary_grouped = calgary_onehot.groupby('Neighborhood').mean().reset_index()
calgary_grouped

Unnamed: 0,Neighborhood,Yoga Studio,Accessories Store,American Restaurant,Argentinian Restaurant,Arts & Crafts Store,Asian Restaurant,Astrologer,Athletics & Sports,Auto Garage,...,Trail,Train Station,Travel & Transport,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,Abbeydale,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0
1,Acadia,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.00,0.0
2,Albert park/radisson heights,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.00,0.0
3,Altadore,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.00,0.0
4,Applewood park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.2,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.00,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
179,Windsor park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.00,0.0
180,Winston heights/mountview,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.00,0.0
181,Woodbine,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.00,0.0
182,Woodlands,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.000000,0.0,0.0,0.0,0.0,0.0,0.0,0.00,0.0
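
To see why the grouped mean can be read as a frequency: each venue becomes one row with a 1 in its category column, so averaging those rows per neighbourhood gives the share of that neighbourhood's venues in each category. A toy illustration follows; the neighbourhood names `A` and `B` and their venues are made up for the example.

```python
import pandas as pd

# Made-up venues: neighbourhood A has 4 venues, B has 2
toy = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Park', 'Park', 'Pub', 'Cafe', 'Pub', 'Pub'],
})

# dtype=int keeps the indicator columns numeric across pandas versions
toy_onehot = pd.get_dummies(toy[['Venue Category']], prefix="", prefix_sep="", dtype=int)
toy_onehot['Neighborhood'] = toy['Neighborhood']

# share of each category among a neighbourhood's venues
toy_grouped = toy_onehot.groupby('Neighborhood').mean().reset_index()
print(toy_grouped)
```

Each row sums to 1, which is why a value such as 0.25 in Abbeydale's "Wings Joint" column means that one quarter of the venues returned for that community fall in that category.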


In [64]:
# Create a function to sort the venues in descending order.
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [112]:
num_top_venues = 5

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = calgary_grouped['Neighborhood']

for ind in np.arange(calgary_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(calgary_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Abbeydale,Wings Joint,Convenience Store,Health & Beauty Service,Sandwich Place,Diner
1,Acadia,Pool,Gym / Fitness Center,Women's Store,Fast Food Restaurant,Gas Station
2,Albert park/radisson heights,Light Rail Station,Train Station,Rock Club,Farmers Market,Gas Station
3,Altadore,Pub,Ice Cream Shop,Dog Run,Spa,Women's Store
4,Applewood park,Liquor Store,Construction & Landscaping,Trail,Home Service,Park


### Cluster Neighbourhoods - K Means

In [151]:
# import k-means from scikit-learn
from sklearn.cluster import KMeans

# set number of clusters
kclusters = 10

calgary_grouped_clustering = calgary_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(calgary_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([1, 1, 1, 0, 3, 1, 1, 1, 1, 3], dtype=int32)
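
Before interpreting the clusters it helps to check how many communities fall in each one, since k-means can produce a few very large clusters and several tiny ones. The sketch below counts labels with `np.bincount`, using only the ten labels printed above as input; on the full result, `kmeans.labels_` would be passed instead.

```python
import numpy as np

# The first ten labels printed above
labels = np.array([1, 1, 1, 0, 3, 1, 1, 1, 1, 3])

# bincount returns how many times each label value occurs (index = cluster id)
cluster_sizes = np.bincount(labels)
for cluster_id, size in enumerate(cluster_sizes):
    print(f"Cluster {cluster_id}: {size} communities")
```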

In [152]:
# add clustering labels (skipped if the column already exists, so the cell can be re-run)
if 'Cluster Labels' not in neighborhoods_venues_sorted.columns:
    neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

# Merge Calgary community data and venues data
calgary_merged = df

# join neighborhoods_venues_sorted onto df so each community keeps its latitude/longitude
calgary_merged = calgary_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Community')

calgary_merged.head() # check the last columns!

Unnamed: 0,Community,longitude,latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Twinhills,-113.87711,51.045111,,,,,,
1,Willow park,-114.056204,50.956623,1.0,Jewelry Store,Women's Store,Fast Food Restaurant,Gastropub,Gas Station
2,Parkhill,-114.065552,51.018181,1.0,Fast Food Restaurant,Sushi Restaurant,Bar,Snack Place,Brewery
3,Patterson,-114.177047,51.063838,1.0,Bar,Gas Station,Pizza Place,Vietnamese Restaurant,Convenience Store
4,Rosscarrock,-114.145495,51.04328,1.0,Sporting Goods Shop,Ice Cream Shop,Japanese Restaurant,Fast Food Restaurant,Juice Bar


In [153]:
calgary_merged = calgary_merged.dropna()
calgary_merged['Cluster Labels'] = calgary_merged['Cluster Labels'].astype('int64')
calgary_merged.head()

Unnamed: 0,Community,longitude,latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
1,Willow park,-114.056204,50.956623,1,Jewelry Store,Women's Store,Fast Food Restaurant,Gastropub,Gas Station
2,Parkhill,-114.065552,51.018181,1,Fast Food Restaurant,Sushi Restaurant,Bar,Snack Place,Brewery
3,Patterson,-114.177047,51.063838,1,Bar,Gas Station,Pizza Place,Vietnamese Restaurant,Convenience Store
4,Rosscarrock,-114.145495,51.04328,1,Sporting Goods Shop,Ice Cream Shop,Japanese Restaurant,Fast Food Restaurant,Juice Bar
5,Acadia,-114.053702,50.972407,1,Pool,Gym / Fitness Center,Women's Store,Fast Food Restaurant,Gas Station


##### Visualize the clusters

In [154]:
import matplotlib.cm as cm
import matplotlib.colors as colors

# create map
lat = 51.049999
long = -114.066666
map_clusters = folium.Map(location=[lat, long], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(calgary_merged['latitude'], calgary_merged['longitude'], calgary_merged['Community'], calgary_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### Examine Clusters

Cluster 0

In [172]:
calgary_merged.loc[calgary_merged['Cluster Labels'] == 0, calgary_merged.columns[[0] + list(range(4, calgary_merged.shape[1]))]]

Unnamed: 0,Community,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
44,Chinook park,History Museum,Japanese Restaurant,Pub,Pier,Farmers Market
84,Garrison woods,History Museum,Pub,Ice Cream Shop,Spa,Fast Food Restaurant
89,Bowness,Pub,Garden Center,Hardware Store,Fast Food Restaurant,Gas Station
117,Rundle,Pub,Business Service,Women's Store,Fast Food Restaurant,Gas Station
122,Penbrooke meadows,Business Service,Women's Store,Gift Shop,Gastropub,Gas Station
155,Altadore,Pub,Ice Cream Shop,Dog Run,Spa,Women's Store


Cluster 1

In [173]:
calgary_merged.loc[calgary_merged['Cluster Labels'] == 1, calgary_merged.columns[[0] + list(range(4, calgary_merged.shape[1]))]]

Unnamed: 0,Community,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
1,Willow park,Jewelry Store,Women's Store,Fast Food Restaurant,Gastropub,Gas Station
2,Parkhill,Fast Food Restaurant,Sushi Restaurant,Bar,Snack Place,Brewery
3,Patterson,Bar,Gas Station,Pizza Place,Vietnamese Restaurant,Convenience Store
4,Rosscarrock,Sporting Goods Shop,Ice Cream Shop,Japanese Restaurant,Fast Food Restaurant,Juice Bar
5,Acadia,Pool,Gym / Fitness Center,Women's Store,Fast Food Restaurant,Gas Station
...,...,...,...,...,...,...
213,University of calgary,Coffee Shop,Gym / Fitness Center,Concert Hall,Theater,Sporting Goods Shop
214,Cranston,Coffee Shop,Gas Station,Liquor Store,Sandwich Place,Grocery Store
215,Thorncliffe,Liquor Store,Convenience Store,Supermarket,Bank,Sandwich Place
216,Nolan hill,IT Services,Restaurant,Other Repair Shop,Women's Store,Fast Food Restaurant


Cluster 2

In [174]:
calgary_merged.loc[calgary_merged['Cluster Labels'] == 2, calgary_merged.columns[[0] + list(range(4, calgary_merged.shape[1]))]]

Unnamed: 0,Community,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
34,Evergreen,Construction & Landscaping,Concert Hall,Women's Store,Filipino Restaurant,Gastropub
76,Crestmont,Playground,Construction & Landscaping,Women's Store,Fast Food Restaurant,Gas Station
80,Douglasdale/glen,Construction & Landscaping,Women's Store,Filipino Restaurant,Gastropub,Gas Station
121,Mckenzie lake,Bus Stop,Construction & Landscaping,Women's Store,Filipino Restaurant,Gastropub
142,Parkland,Gas Station,Construction & Landscaping,Women's Store,Filipino Restaurant,Gastropub
150,Diamond cove,Construction & Landscaping,Women's Store,Filipino Restaurant,Gastropub,Gas Station
209,Bonavista downs,Construction & Landscaping,Furniture / Home Store,Chinese Restaurant,Women's Store,Filipino Restaurant


Cluster 3

In [175]:
calgary_merged.loc[calgary_merged['Cluster Labels'] == 3, calgary_merged.columns[[0] + list(range(4, calgary_merged.shape[1]))]]

Unnamed: 0,Community,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
13,Deer ridge,Dog Run,Women's Store,Filipino Restaurant,Gastropub,Gas Station
14,Erlton,Park,Breakfast Spot,Women's Store,Gastropub,Gas Station
15,Silver springs,Home Service,Tennis Court,Construction & Landscaping,Park,Fast Food Restaurant
18,Rosemont,Cocktail Bar,Dog Run,Park,Filipino Restaurant,Gas Station
40,Applewood park,Liquor Store,Construction & Landscaping,Trail,Home Service,Park
42,Rosedale,Park,Locksmith,Gluten-free Restaurant,Dive Bar,Rest Area
49,Wildwood,Park,Dog Run,Trail,Women's Store,Gas Station
54,Brentwood,Park,Pizza Place,Liquor Store,Women's Store,Gas Station
57,Ranchlands,Pizza Place,Park,Convenience Store,Women's Store,Gas Station
70,Hamptons,Park,Golf Course,Baseball Field,Women's Store,Gas Station


Cluster 4

In [176]:
calgary_merged.loc[calgary_merged['Cluster Labels'] == 4, calgary_merged.columns[[0] + list(range(4, calgary_merged.shape[1]))]]

Unnamed: 0,Community,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
16,Signal hill,Park,Women's Store,German Restaurant,Gas Station,Garden Center
46,Varsity,Park,Women's Store,German Restaurant,Gas Station,Garden Center
125,Elboya,Park,Women's Store,German Restaurant,Gas Station,Garden Center
132,Erin woods,Park,Women's Store,German Restaurant,Gas Station,Garden Center


Cluster 5

In [177]:
calgary_merged.loc[calgary_merged['Cluster Labels'] == 5, calgary_merged.columns[[0] + list(range(4, calgary_merged.shape[1]))]]

Unnamed: 0,Community,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
31,Pineridge,Coffee Shop,Women's Store,Filipino Restaurant,Gastropub,Gas Station
64,Rutland park,Coffee Shop,Women's Store,Filipino Restaurant,Gastropub,Gas Station


Cluster 6

In [178]:
calgary_merged.loc[calgary_merged['Cluster Labels'] == 6, calgary_merged.columns[[0] + list(range(4, calgary_merged.shape[1]))]]

Unnamed: 0,Community,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
27,Kincora,Playground,Women's Store,Fast Food Restaurant,Gas Station,Garden Center
35,Coventry hills,Playground,Construction & Landscaping,Home Service,Women's Store,Fast Food Restaurant
66,Beddington heights,Pool,Playground,Theater,Women's Store,Farmers Market
74,North haven,Playground,Women's Store,Fast Food Restaurant,Gas Station,Garden Center
94,Taradale,Playground,Residential Building (Apartment / Condo),Women's Store,Fast Food Restaurant,Gas Station
104,Redstone,Playground,IT Services,Astrologer,Women's Store,Fast Food Restaurant
123,South calgary,Playground,Breakfast Spot,Gym / Fitness Center,Sandwich Place,Women's Store
133,Millrise,Golf Course,Playground,Skating Rink,Women's Store,Fast Food Restaurant
161,Palliser,Cosmetics Shop,Playground,Home Service,Women's Store,Fast Food Restaurant
166,Panorama hills,Playground,Pizza Place,Construction & Landscaping,Women's Store,Fast Food Restaurant


Cluster 7

In [179]:
calgary_merged.loc[calgary_merged['Cluster Labels'] == 7, calgary_merged.columns[[0] + list(range(4, calgary_merged.shape[1]))]]

Unnamed: 0,Community,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
83,Silverado,Home Service,Women's Store,Fast Food Restaurant,Gastropub,Gas Station
120,Martindale,Gym,Home Service,Women's Store,Fast Food Restaurant,Gas Station
179,Yorkville,Home Service,Women's Store,Fast Food Restaurant,Gastropub,Gas Station


Cluster 8

In [180]:
calgary_merged.loc[calgary_merged['Cluster Labels'] == 8, calgary_merged.columns[[0] + list(range(4, calgary_merged.shape[1]))]]

Unnamed: 0,Community,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
8,Elbow park,Health Food Store,Women's Store,Filipino Restaurant,Gastropub,Gas Station


Cluster 9

In [181]:
calgary_merged.loc[calgary_merged['Cluster Labels'] == 9, calgary_merged.columns[[0] + list(range(4, calgary_merged.shape[1]))]]

Unnamed: 0,Community,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
114,Hidden valley,Bus Station,Women's Store,Gift Shop,Gastropub,Gas Station
127,University district,Bus Station,Pharmacy,Women's Store,Filipino Restaurant,Gas Station
