# Introduction/Business Problem

The downtown core is one of the most representative districts of any city. In this notebook, I compare the downtowns of the 10 most populous cities in Canada.

To make the comparison, I'll use the Foursquare API to retrieve the top venues in a given area, group the data by city, and finally cluster the cities to find relationships between the businesses available in each downtown.

With this short study, I'll try to answer one question: how similar are the downtowns of the most populous cities in Canada?

# Data description

The initial data set is the list of the 100 largest population centres in Canada. This information is available in [this Wikipedia article](https://en.wikipedia.org/wiki/List_of_the_100_largest_population_centres_in_Canada).

With the top 10 cities identified, I'll create a polygon outlining each downtown. The coordinates had to be extracted manually from Google Maps. As an example, you can explore the [map of Montreal](https://www.google.com/maps/place/Downtown+Montreal,+Montreal,+QC/@45.5057346,-73.5850421,14z/data=!3m1!4b1!4m5!3m4!1s0x4cc91a42465421bd:0xfbb91c3e6b1f6a78!8m2!3d45.5034801!4d-73.5684895). Montreal's coordinates, separated by semicolons, are: `45.496571,-73.581776;45.518796,-73.566700;45.513003,-73.553314;45.492559,-73.572295`.

With the polygons in place, I'll get the list of the top 100 venues using the `polygon=<shape>` parameter of the `/venues/explore` endpoint of the Foursquare API. As an example, you can check the Montreal polygon in action using the Foursquare [city tour app](https://foursquare.com/explore?mode=url&polygon=45.51302345955653%2C-73.55329513549805%3B45.49257023754937%2C-73.57226371765135%3B45.496541150087026%2C-73.58179092407225%3B45.51879714173096%2C-73.56668472290039%3B45.51302345955653%2C-73.55329513549805). It is important to note that the separators need to be URL-encoded (the polygons in this notebook encode the comma as `%2C` and the semicolon as `%3B`); I'll use [this tool](https://www.urlencoder.org/) to generate the encoded version of the query parameter. For more information about the polygon parameter, you can check the [API documentation](https://developer.foursquare.com/docs/pilgrim-sdk/geofences-api/add-polygon).
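
The encoding step can also be done in Python instead of an online tool. The following is a minimal sketch using the standard library; the helper name `encode_polygon` is hypothetical and not part of the original notebook. Coordinates are kept as strings so that trailing zeros are preserved.

```python
from urllib.parse import quote

def encode_polygon(points):
    """Format (lat, lng) pairs as 'lat,lng;lat,lng;...' and percent-encode the result.

    Hypothetical helper, shown only to illustrate the encoding step.
    """
    raw = ';'.join('{},{}'.format(lat, lng) for lat, lng in points)
    # safe='' forces ',' and ';' to be encoded as %2C and %3B
    return quote(raw, safe='')

# Montreal coordinates quoted in the data description
montreal = [
    ('45.496571', '-73.581776'),
    ('45.518796', '-73.566700'),
    ('45.513003', '-73.553314'),
    ('45.492559', '-73.572295'),
]

encoded = encode_polygon(montreal)
print(encoded)
```

The resulting string has the same shape as the hand-encoded polygons used later in the notebook.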

# Methodology

Data preparation includes the following steps:

1. Create a dataframe with the city and the polygon of its downtown.
2. Get the venue information using the following URL: `https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&polygon={}&limit={}`
3. Create a dataframe with the name of the city, venue name, venue latitude, venue longitude, and venue category.
4. Create a one-hot (dummy) encoding of the business category for every row.
5. Group the dataframe by city and take the mean of each business category.
6. Generate a dataframe with the 10 most common venues for every city.

For the clustering part, I'll use the k-means clustering algorithm.
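
Steps 4 to 6 can be sketched on a tiny invented dataset before running them on the real venues. The cities `A` and `B` and their venues are made up for illustration; only the pandas calls mirror what the notebook does later.

```python
import pandas as pd

# Invented toy data: two cities and a handful of venue categories
toy = pd.DataFrame({
    'City': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Café', 'Café', 'Hotel', 'Park', 'Pub', 'Pub'],
})

# Step 4: one column per category, 1.0 where the row has that category
onehot = pd.get_dummies(toy['Venue Category'], dtype=float)
onehot.insert(0, 'City', toy['City'])

# Step 5: the mean of each indicator column is the share of that category per city
grouped = onehot.groupby('City').mean()

# Step 6: the most frequent categories per city (top 2 here instead of top 10)
top2 = {city: row.nlargest(2).index.tolist() for city, row in grouped.iterrows()}

print(grouped)
print(top2)
```

Because every row contributes exactly one category, each city's row in `grouped` sums to 1, which makes cities with different venue counts comparable.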

# Data Analysis

First, let's import all our dependencies.

In [2]:
import pandas as pd
import numpy as np
import folium
import requests
from geopy.geocoders import Nominatim

In [3]:
# @hidden_cell
CLIENT_ID = 'Y4554RSKPQAK4CCMIER0M4YZSDN5CZRWO3ZQ23VTYGKNPVU3' # your Foursquare ID
CLIENT_SECRET = '23JRM4SFH1RJBOIA5JNZ24ZGCWVLZ3IDCUEO00SYEF15NELJ' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: Y4554RSKPQAK4CCMIER0M4YZSDN5CZRWO3ZQ23VTYGKNPVU3
CLIENT_SECRET:23JRM4SFH1RJBOIA5JNZ24ZGCWVLZ3IDCUEO00SYEF15NELJ


Now I'll generate the polygon information required per city

In [4]:
mtl = ['Montreal', '45.51302345955653%2C-73.55329513549805%3B45.49257023754937%2C-73.57226371765135%3B45.496541150087026%2C-73.58179092407225%3B45.51879714173096%2C-73.56668472290039%3B45.51302345955653%2C-73.55329513549805']
vancouver = ['Vancouver', '49.289866220776666%2C-123.10712814331055%3B49.28768289147221%2C-123.11253547668457%3B49.28628327054955%2C-123.10412406921385%3B49.277156768098855%2C-123.09991836547852%3B49.2727328862137%2C-123.10000419616699%3B49.270436791120694%2C-123.12601089477539%3B49.27575684838932%2C-123.13570976257323%3B49.28997818377568%2C-123.11064720153807%3B49.289866220776666%2C-123.10712814331055']
toronto = ['Toronto', '43.633976%2C-79.396658%3B43.665309%2C-79.411553%3B43.675315%2C-79.361752%3B43.650853%2C-79.347150']
calgary = ['Calgary', '51.044758%2C-114.094933%3B51.047780%2C-114.094546%3B51.054227%2C-114.073303%3B51.054335%2C-114.066909%3B51.045621%2C-114.041503%3B51.043382%2C-114.043177']
edmonton = ['Edmonton', '53.534339%2C-113.508647%3B53.547600%2C-113.508475%3B53.548772%2C-113.488101%3B53.541646%2C-113.484528%3B53.534036%2C-113.498749%3B53.534245%2C-113.508417']
winnipeg = ['Winnipeg', '49.881795%2C-97.148234%3B49.899056%2C-97.151979%3B49.896526%2C-97.143761%3B49.903454%2C-97.138685%3B49.899873%2C-97.128714%3B49.886014%2C-97.126720%3B49.881692%2C-97.148293']
quebec_city = ['Quebec City', '46.815062%2C-71.214383%3B46.804018%2C-71.207474%3B46.809335%2C-71.202367%3B46.813858%2C-71.201380%3B46.813946%2C-71.203655%3B46.815973%2C-71.204084%3B46.816560%2C-71.209620%3B46.815033%2C-71.214555']
hamilton = ['Hamilton', '43.258393331300354%2C-79.85730171203613%3B43.2558304507587%2C-79.84738826751709%3B43.248907008609315%2C-79.8503065109253%3B43.25589296132357%2C-79.8801326751709%3B43.26292499019111%2C-79.8771286010742%3B43.258393331300354%2C-79.85730171203613']
kitchener = ['Kitchener', '43.43652969324404%2C-80.4572582244873%3B43.45397811873967%2C-80.50540924072266%3B43.46556607646928%2C-80.4719352722168%3B43.43652969324404%2C-80.4572582244873']
ottawa = ['Ottawa', '45.414341%2C-75.707360%3B45.419986%2C-75.710648%3B45.426753%2C-75.698830%3B45.421452%2C-75.687145%3B45.419269%2C-75.691411%3B45.420417%2C-75.692483%3B45.414317%2C-75.707308']
downtown_data = pd.DataFrame(np.array([
    mtl, 
    vancouver, 
    toronto,
    ottawa,
    calgary,
    edmonton,
    winnipeg,
    quebec_city,
    hamilton,
    kitchener
]), columns=['City', 'polygon'])
downtown_data

Unnamed: 0,City,polygon
0,Montreal,45.51302345955653%2C-73.55329513549805%3B45.49...
1,Vancouver,49.289866220776666%2C-123.10712814331055%3B49....
2,Toronto,43.633976%2C-79.396658%3B43.665309%2C-79.41155...
3,Ottawa,45.414341%2C-75.707360%3B45.419986%2C-75.71064...
4,Calgary,51.044758%2C-114.094933%3B51.047780%2C-114.094...
5,Edmonton,53.534339%2C-113.508647%3B53.547600%2C-113.508...
6,Winnipeg,49.881795%2C-97.148234%3B49.899056%2C-97.15197...
7,Quebec City,46.815062%2C-71.214383%3B46.804018%2C-71.20747...
8,Hamilton,43.258393331300354%2C-79.85730171203613%3B43.2...
9,Kitchener,43.43652969324404%2C-80.4572582244873%3B43.453...


Now I'll create the function to get the information of each venue from the Foursquare API

In [5]:
LIMIT = 500

def getNearbyVenues(polygons, names):
    venues_list = []
    
    for polygon, name in zip(polygons, names):
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&polygon={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            polygon,
            LIMIT)

        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        print(name, len(results))

        # return only relevant information for each nearby venue
        venues_list.append([(
            name,
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])
        
    # build the dataframe once, after every city has been fetched
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = [
        'City',
        'Venue', 
        'Venue Latitude', 
        'Venue Longitude', 
        'Venue Category'
    ]
    
    return nearby_venues
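
The function above assumes that every response contains at least one group and that every venue has at least one category. The sketch below shows one way the parsing step could be made more defensive; `parse_explore_items` is a hypothetical helper, and the response shape (`response` → `groups[0]` → `items` → `venue`) is taken only from how the function above reads it, not from the API reference. The second venue in the sample payload is invented to exercise the missing-category case.

```python
def parse_explore_items(payload, city):
    """Turn one decoded explore response into (city, name, lat, lng, category) rows.

    Hypothetical helper; returns an empty list when the payload has no groups.
    """
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []

    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # fall back to a placeholder when a venue has no category
        category = categories[0]['name'] if categories else 'Unknown'
        rows.append((
            city,
            venue['name'],
            venue['location']['lat'],
            venue['location']['lng'],
            category,
        ))
    return rows

# Hand-built payload: the first venue mirrors a row from the notebook output,
# the second one is invented and has no category
sample = {
    'response': {
        'groups': [{
            'items': [
                {'venue': {'name': 'Café Parvis',
                           'location': {'lat': 45.505817, 'lng': -73.569302},
                           'categories': [{'name': 'Café'}]}},
                {'venue': {'name': 'Uncategorized Place',
                           'location': {'lat': 45.5, 'lng': -73.57},
                           'categories': []}},
            ]
        }]
    }
}

rows = parse_explore_items(sample, 'Montreal')
print(rows)
```

In the request loop, it may also be worth passing `timeout=` to `requests.get` and calling `raise_for_status()` on the response before decoding it, so that a failed call does not surface as a confusing `KeyError`.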

Now I'll get the venue information using the base data frame

In [6]:
list_of_venues = getNearbyVenues(
    polygons=downtown_data['polygon'],
    names=downtown_data['City']
)

list_of_venues.head()

Montreal 100
Vancouver 100
Toronto 100
Ottawa 100
Calgary 100
Edmonton 100
Winnipeg 77
Quebec City 67
Hamilton 100
Kitchener 100


Unnamed: 0,City,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Montreal,Musée des beaux-arts de Montréal (MBAM),45.498436,-73.579715,Art Museum
1,Montreal,Place des Festivals,45.507264,-73.567414,Plaza
2,Montreal,Place des Arts,45.508131,-73.565969,Performing Arts Venue
3,Montreal,Café Parvis,45.505817,-73.569302,Café
4,Montreal,La Maison Symphonique de Montréal,45.509442,-73.566599,Concert Hall


In [7]:
print('There are {} unique categories.'.format(len(list_of_venues['Venue Category'].unique())))

There are 189 unique categories.


I'll prepare the data to apply the clustering algorithm

In [8]:
# one hot encoding
venues_onehot = pd.get_dummies(list_of_venues[['Venue Category']], prefix="", prefix_sep="")

# add the city column back to the dataframe
venues_onehot['City'] = list_of_venues['City'] 

# move the city column to the first position
fixed_columns = [venues_onehot.columns[-1]] + list(venues_onehot.columns[:-1])
venues_onehot = venues_onehot[fixed_columns]

venues_onehot.head()

Unnamed: 0,City,Adult Boutique,American Restaurant,Aquarium,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,...,Trail,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wine Shop,Women's Store,Yoga Studio
0,Montreal,0,0,0,0,0,1,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Montreal,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Montreal,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Montreal,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Montreal,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


Now I'll get the information by city

In [9]:
venues_grouped = venues_onehot.groupby('City').mean().reset_index()
venues_grouped

Unnamed: 0,City,Adult Boutique,American Restaurant,Aquarium,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,...,Trail,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wine Shop,Women's Store,Yoga Studio
0,Calgary,0.0,0.04,0.0,0.01,0.01,0.0,0.0,0.0,0.0,...,0.01,0.0,0.0,0.0,0.02,0.0,0.01,0.01,0.0,0.0
1,Edmonton,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,...,0.0,0.0,0.01,0.0,0.02,0.01,0.0,0.01,0.0,0.0
2,Hamilton,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.02,0.0,...,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0
3,Kitchener,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.01,0.0,0.02,0.03,0.0,0.0,0.0,0.0,0.0
4,Montreal,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,...,0.0,0.0,0.0,0.03,0.0,0.0,0.01,0.0,0.0,0.02
5,Ottawa,0.01,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Quebec City,0.0,0.0,0.0,0.0,0.014925,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Toronto,0.0,0.01,0.01,0.0,0.02,0.0,0.01,0.0,0.01,...,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01
8,Vancouver,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,...,0.03,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0
9,Winnipeg,0.0,0.012987,0.0,0.0,0.012987,0.0,0.0,0.064935,0.012987,...,0.0,0.0,0.0,0.0,0.025974,0.0,0.0,0.0,0.012987,0.0


Let's explore the top 4 business categories by city

In [10]:
num_top_venues = 4

for city in venues_grouped['City']:
    print("----" + city + "----")
    temp = venues_grouped[venues_grouped['City'] == city].T.reset_index()
    temp.columns = ['venue', 'freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Calgary----
         venue  freq
0   Restaurant  0.07
1        Hotel  0.06
2  Coffee Shop  0.04
3   Steakhouse  0.04


----Edmonton----
                venue  freq
0         Coffee Shop  0.08
1  Italian Restaurant  0.06
2                Café  0.04
3                 Pub  0.04


----Hamilton----
                       venue  freq
0                Coffee Shop  0.09
1                        Pub  0.08
2                       Café  0.06
3  Middle Eastern Restaurant  0.04


----Kitchener----
            venue  freq
0     Coffee Shop  0.06
1  Sandwich Place  0.05
2      Restaurant  0.05
3            Café  0.05


----Montreal----
          venue  freq
0          Café  0.10
1    Restaurant  0.05
2         Hotel  0.05
3  Cocktail Bar  0.03


----Ottawa----
         venue  freq
0  Coffee Shop  0.11
1        Hotel  0.06
2   Restaurant  0.05
3         Café  0.04


----Quebec City----
               venue  freq
0  French Restaurant  0.13
1              Hotel  0.09
2              Plaza  0.07
3    

Now, let's write a function to get the most common venues

In [11]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

Let's create the data frame with the 10 most common venues

In [12]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['City']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
city_venues_sorted = pd.DataFrame(columns=columns)
city_venues_sorted['City'] = venues_grouped['City']

for ind in np.arange(venues_grouped.shape[0]):
    city_venues_sorted.iloc[ind, 1:] = return_most_common_venues(venues_grouped.iloc[ind, :], num_top_venues)

city_venues_sorted

Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Calgary,Restaurant,Hotel,Coffee Shop,Pub,Steakhouse,American Restaurant,Bakery,Italian Restaurant,Mediterranean Restaurant,French Restaurant
1,Edmonton,Coffee Shop,Italian Restaurant,Café,Pub,Hotel,French Restaurant,Sandwich Place,Nightclub,Restaurant,Gym
2,Hamilton,Coffee Shop,Pub,Café,Middle Eastern Restaurant,Fast Food Restaurant,Bar,Vietnamese Restaurant,Restaurant,Sandwich Place,Italian Restaurant
3,Kitchener,Coffee Shop,Restaurant,Café,Sandwich Place,Pizza Place,Bakery,Vietnamese Restaurant,Gym,Fast Food Restaurant,Middle Eastern Restaurant
4,Montreal,Café,Hotel,Restaurant,Park,Sandwich Place,Plaza,Cocktail Bar,Pizza Place,Concert Hall,Vegetarian / Vegan Restaurant
5,Ottawa,Coffee Shop,Hotel,Restaurant,Café,Clothing Store,Japanese Restaurant,Food Truck,Concert Hall,Art Gallery,Library
6,Quebec City,French Restaurant,Hotel,Plaza,Café,Pizza Place,Neighborhood,Restaurant,Park,Historic Site,Concert Hall
7,Toronto,Coffee Shop,Park,Japanese Restaurant,Sandwich Place,Café,Bakery,Dance Studio,Mexican Restaurant,Restaurant,Beer Bar
8,Vancouver,Hotel,Dessert Shop,Bakery,Park,Seafood Restaurant,Sandwich Place,Breakfast Spot,Trail,Coffee Shop,Italian Restaurant
9,Winnipeg,Sandwich Place,Coffee Shop,Asian Restaurant,Café,Restaurant,Hotel,Bakery,Pub,Dim Sum Restaurant,Museum
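
The column labels above rely on a `try`/`except` around a three-element suffix list, which is enough for 10 columns but would produce labels such as "21th" for longer lists. As an alternative sketch, a small helper (hypothetical name `ordinal`, not part of the original notebook) can generate the suffix explicitly:

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 12 -> '12th'."""
    if 10 <= n % 100 <= 20:
        # 11th, 12th, 13th and the rest of the teens always take 'th'
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

num_top_venues = 10
columns = ['City'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, num_top_venues + 1)
]
print(columns)
```

For `num_top_venues = 10` this yields the same column names as the cell above.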


With the data in place, it's time to run the clustering algorithm

In [13]:
from sklearn.cluster import KMeans

# set number of clusters
kclusters = 4

venues_grouped_clustering = venues_grouped.drop(columns='City')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(venues_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([2, 0, 0, 0, 3, 3, 1, 3, 2, 0])
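
The number of clusters is fixed at 4 in the cell above. One common way to sanity-check such a choice is the silhouette score; this was not part of the original analysis, so the following is only a sketch on synthetic data. The helper `silhouette_by_k` and the toy points are assumptions for illustration; to apply it to this notebook, `venues_grouped_clustering` would take the place of `toy`.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def silhouette_by_k(features, k_values, random_state=0):
    """Fit k-means for each k and return {k: mean silhouette score}."""
    scores = {}
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
        labels = model.fit_predict(features)
        scores[k] = silhouette_score(features, labels)
    return scores

# Synthetic data: three tight, well-separated groups of four points each
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
toy = np.vstack([c + rng.normal(scale=0.1, size=(4, 2)) for c in centers])

scores = silhouette_by_k(toy, range(2, 6))
best_k = max(scores, key=scores.get)
print(scores)
print(best_k)
```

With only 10 cities, the scores on the real data are likely to be noisy, so they should be read as a rough guide rather than a definitive answer.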

Let's create a new dataframe to display the clusters on a map

In [14]:
# add clustering labels
city_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

venues_merged = list_of_venues.copy()

# join the cluster label and the most common venues of each city onto every venue row
venues_merged = venues_merged.join(city_venues_sorted.set_index('City'), on='City')

venues_merged.head() # check the last columns!

Unnamed: 0,City,Venue,Venue Latitude,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Montreal,Musée des beaux-arts de Montréal (MBAM),45.498436,-73.579715,Art Museum,3,Café,Hotel,Restaurant,Park,Sandwich Place,Plaza,Cocktail Bar,Pizza Place,Concert Hall,Vegetarian / Vegan Restaurant
1,Montreal,Place des Festivals,45.507264,-73.567414,Plaza,3,Café,Hotel,Restaurant,Park,Sandwich Place,Plaza,Cocktail Bar,Pizza Place,Concert Hall,Vegetarian / Vegan Restaurant
2,Montreal,Place des Arts,45.508131,-73.565969,Performing Arts Venue,3,Café,Hotel,Restaurant,Park,Sandwich Place,Plaza,Cocktail Bar,Pizza Place,Concert Hall,Vegetarian / Vegan Restaurant
3,Montreal,Café Parvis,45.505817,-73.569302,Café,3,Café,Hotel,Restaurant,Park,Sandwich Place,Plaza,Cocktail Bar,Pizza Place,Concert Hall,Vegetarian / Vegan Restaurant
4,Montreal,La Maison Symphonique de Montréal,45.509442,-73.566599,Concert Hall,3,Café,Hotel,Restaurant,Park,Sandwich Place,Plaza,Cocktail Bar,Pizza Place,Concert Hall,Vegetarian / Vegan Restaurant


Let's get the center of our map in Canada to display all the clusters.

In [15]:
address = 'Canada'

geolocator = Nominatim(user_agent="tl-canada-neigh")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Canada are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Canada are 61.0666922, -107.9917071.


Now it's time to display our clusters on a map.

In [16]:
import matplotlib.cm as cm
import matplotlib.colors as colors

# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=4)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []

row_data =  zip(
    venues_merged['Venue Latitude'], 
    venues_merged['Venue Longitude'], 
    venues_merged['City'], 
    venues_merged['Cluster Labels']
)

for lat, lon, poi, cluster in row_data:
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    # cluster labels start at 0, so they index the color list directly
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
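
The map above draws one marker per venue, which at a country-wide zoom level collapses into one blob per city. A lighter alternative is to draw a single marker per city at the centroid of its venues. The sketch below only computes those centroids with pandas; `city_centroids` is a hypothetical helper and the toy coordinates are invented. The column names match those of `venues_merged`.

```python
import pandas as pd

def city_centroids(venues):
    """Average the venue coordinates per city, keeping the cluster label."""
    return venues.groupby(['City', 'Cluster Labels'], as_index=False)[
        ['Venue Latitude', 'Venue Longitude']
    ].mean()

# Invented toy data with the same column names as venues_merged
toy = pd.DataFrame({
    'City': ['A', 'A', 'B'],
    'Cluster Labels': [0, 0, 1],
    'Venue Latitude': [0.0, 2.0, 10.0],
    'Venue Longitude': [0.0, 4.0, 20.0],
})

centroids = city_centroids(toy)
print(centroids)
```

Each row of the result could then be passed to `folium.CircleMarker` in the same way as the venue rows are above.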

Let's display the cities in each cluster in a dataframe

In [56]:
clusters = venues_merged[['City', 'Cluster Labels']].groupby(['City', 'Cluster Labels']).sum().reset_index().sort_values('Cluster Labels')
clusters

Unnamed: 0,City,Cluster Labels
1,Edmonton,0
2,Hamilton,0
3,Kitchener,0
9,Winnipeg,0
6,Quebec City,1
0,Calgary,2
8,Vancouver,2
4,Montreal,3
5,Ottawa,3
7,Toronto,3
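
The same grouping can be read directly from the k-means labels, without going through the venue-level dataframe. The sketch below uses the label array printed earlier and the alphabetical city order of `venues_grouped`; `cities_by_cluster` is a hypothetical helper name.

```python
from collections import defaultdict

def cities_by_cluster(cities, labels):
    """Group city names by their cluster label, ordered by label."""
    grouped = defaultdict(list)
    for city, label in zip(cities, labels):
        grouped[int(label)].append(city)
    return {label: grouped[label] for label in sorted(grouped)}

# Cities in the order of venues_grouped, labels as printed by kmeans.labels_
cities = ['Calgary', 'Edmonton', 'Hamilton', 'Kitchener', 'Montreal',
          'Ottawa', 'Quebec City', 'Toronto', 'Vancouver', 'Winnipeg']
labels = [2, 0, 0, 0, 3, 3, 1, 3, 2, 0]

clusters_by_label = cities_by_cluster(cities, labels)
for label, members in clusters_by_label.items():
    print(label, members)
```

This reproduces the table above: four cities in cluster 0, Quebec City alone in cluster 1, two cities in cluster 2, and three in cluster 3.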


Before writing the conclusions, let's explore each cluster

In [17]:
venues_merged.loc[venues_merged['Cluster Labels'] == 0, venues_merged.columns[[0] + list(range(kclusters - 1, venues_merged.shape[1]))]]

Unnamed: 0,City,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
500,Edmonton,-113.497405,Café,0,Coffee Shop,Italian Restaurant,Café,Pub,Hotel,French Restaurant,Sandwich Place,Nightclub,Restaurant,Gym
501,Edmonton,-113.497549,Italian Restaurant,0,Coffee Shop,Italian Restaurant,Café,Pub,Hotel,French Restaurant,Sandwich Place,Nightclub,Restaurant,Gym
502,Edmonton,-113.506071,Park,0,Coffee Shop,Italian Restaurant,Café,Pub,Hotel,French Restaurant,Sandwich Place,Nightclub,Restaurant,Gym
503,Edmonton,-113.490295,Bar,0,Coffee Shop,Italian Restaurant,Café,Pub,Hotel,French Restaurant,Sandwich Place,Nightclub,Restaurant,Gym
504,Edmonton,-113.508570,Nightclub,0,Coffee Shop,Italian Restaurant,Café,Pub,Hotel,French Restaurant,Sandwich Place,Nightclub,Restaurant,Gym
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
939,Kitchener,-80.473352,Grocery Store,0,Coffee Shop,Restaurant,Café,Sandwich Place,Pizza Place,Bakery,Vietnamese Restaurant,Gym,Fast Food Restaurant,Middle Eastern Restaurant
940,Kitchener,-80.474326,Liquor Store,0,Coffee Shop,Restaurant,Café,Sandwich Place,Pizza Place,Bakery,Vietnamese Restaurant,Gym,Fast Food Restaurant,Middle Eastern Restaurant
941,Kitchener,-80.494652,Diner,0,Coffee Shop,Restaurant,Café,Sandwich Place,Pizza Place,Bakery,Vietnamese Restaurant,Gym,Fast Food Restaurant,Middle Eastern Restaurant
942,Kitchener,-80.473352,Discount Store,0,Coffee Shop,Restaurant,Café,Sandwich Place,Pizza Place,Bakery,Vietnamese Restaurant,Gym,Fast Food Restaurant,Middle Eastern Restaurant


In [18]:
venues_merged.loc[venues_merged['Cluster Labels'] == 1, venues_merged.columns[[0] + list(range(kclusters - 1, venues_merged.shape[1]))]]

Unnamed: 0,City,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
677,Quebec City,-71.210819,Neighborhood,1,French Restaurant,Hotel,Plaza,Café,Pizza Place,Neighborhood,Restaurant,Park,Historic Site,Concert Hall
678,Quebec City,-71.207408,Neighborhood,1,French Restaurant,Hotel,Plaza,Café,Pizza Place,Neighborhood,Restaurant,Park,Historic Site,Concert Hall
679,Quebec City,-71.205461,Hotel,1,French Restaurant,Hotel,Plaza,Café,Pizza Place,Neighborhood,Restaurant,Park,Historic Site,Concert Hall
680,Quebec City,-71.213759,Pizza Place,1,French Restaurant,Hotel,Plaza,Café,Pizza Place,Neighborhood,Restaurant,Park,Historic Site,Concert Hall
681,Quebec City,-71.203600,Neighborhood,1,French Restaurant,Hotel,Plaza,Café,Pizza Place,Neighborhood,Restaurant,Park,Historic Site,Concert Hall
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
739,Quebec City,-71.210234,Café,1,French Restaurant,Hotel,Plaza,Café,Pizza Place,Neighborhood,Restaurant,Park,Historic Site,Concert Hall
740,Quebec City,-71.203560,Bistro,1,French Restaurant,Hotel,Plaza,Café,Pizza Place,Neighborhood,Restaurant,Park,Historic Site,Concert Hall
741,Quebec City,-71.202845,Sandwich Place,1,French Restaurant,Hotel,Plaza,Café,Pizza Place,Neighborhood,Restaurant,Park,Historic Site,Concert Hall
742,Quebec City,-71.203806,Pizza Place,1,French Restaurant,Hotel,Plaza,Café,Pizza Place,Neighborhood,Restaurant,Park,Historic Site,Concert Hall


In [19]:
venues_merged.loc[venues_merged['Cluster Labels'] == 2, venues_merged.columns[[0] + list(range(kclusters - 1, venues_merged.shape[1]))]]

Unnamed: 0,City,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
100,Vancouver,-123.104061,Trail,2,Hotel,Dessert Shop,Bakery,Park,Seafood Restaurant,Sandwich Place,Breakfast Spot,Trail,Coffee Shop,Italian Restaurant
101,Vancouver,-123.100976,Boxing Gym,2,Hotel,Dessert Shop,Bakery,Park,Seafood Restaurant,Sandwich Place,Breakfast Spot,Trail,Coffee Shop,Italian Restaurant
102,Vancouver,-123.116932,Hotel,2,Hotel,Dessert Shop,Bakery,Park,Seafood Restaurant,Sandwich Place,Breakfast Spot,Trail,Coffee Shop,Italian Restaurant
103,Vancouver,-123.109288,Coffee Shop,2,Hotel,Dessert Shop,Bakery,Park,Seafood Restaurant,Sandwich Place,Breakfast Spot,Trail,Coffee Shop,Italian Restaurant
104,Vancouver,-123.126894,Gym / Fitness Center,2,Hotel,Dessert Shop,Bakery,Park,Seafood Restaurant,Sandwich Place,Breakfast Spot,Trail,Coffee Shop,Italian Restaurant
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
495,Calgary,-114.063521,Hotel,2,Restaurant,Hotel,Coffee Shop,Pub,Steakhouse,American Restaurant,Bakery,Italian Restaurant,Mediterranean Restaurant,French Restaurant
496,Calgary,-114.074889,Indie Movie Theater,2,Restaurant,Hotel,Coffee Shop,Pub,Steakhouse,American Restaurant,Bakery,Italian Restaurant,Mediterranean Restaurant,French Restaurant
497,Calgary,-114.069673,New American Restaurant,2,Restaurant,Hotel,Coffee Shop,Pub,Steakhouse,American Restaurant,Bakery,Italian Restaurant,Mediterranean Restaurant,French Restaurant
498,Calgary,-114.094440,American Restaurant,2,Restaurant,Hotel,Coffee Shop,Pub,Steakhouse,American Restaurant,Bakery,Italian Restaurant,Mediterranean Restaurant,French Restaurant


In [20]:
venues_merged.loc[venues_merged['Cluster Labels'] == 3, venues_merged.columns[[0] + list(range(kclusters - 1, venues_merged.shape[1]))]]

Unnamed: 0,City,Venue Longitude,Venue Category,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Montreal,-73.579715,Art Museum,3,Café,Hotel,Restaurant,Park,Sandwich Place,Plaza,Cocktail Bar,Pizza Place,Concert Hall,Vegetarian / Vegan Restaurant
1,Montreal,-73.567414,Plaza,3,Café,Hotel,Restaurant,Park,Sandwich Place,Plaza,Cocktail Bar,Pizza Place,Concert Hall,Vegetarian / Vegan Restaurant
2,Montreal,-73.565969,Performing Arts Venue,3,Café,Hotel,Restaurant,Park,Sandwich Place,Plaza,Cocktail Bar,Pizza Place,Concert Hall,Vegetarian / Vegan Restaurant
3,Montreal,-73.569302,Café,3,Café,Hotel,Restaurant,Park,Sandwich Place,Plaza,Cocktail Bar,Pizza Place,Concert Hall,Vegetarian / Vegan Restaurant
4,Montreal,-73.566599,Concert Hall,3,Café,Hotel,Restaurant,Park,Sandwich Place,Plaza,Cocktail Bar,Pizza Place,Concert Hall,Vegetarian / Vegan Restaurant
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
395,Ottawa,-75.691618,Clothing Store,3,Coffee Shop,Hotel,Restaurant,Café,Clothing Store,Japanese Restaurant,Food Truck,Concert Hall,Art Gallery,Library
396,Ottawa,-75.696383,Middle Eastern Restaurant,3,Coffee Shop,Hotel,Restaurant,Café,Clothing Store,Japanese Restaurant,Food Truck,Concert Hall,Art Gallery,Library
397,Ottawa,-75.702367,Karaoke Bar,3,Coffee Shop,Hotel,Restaurant,Café,Clothing Store,Japanese Restaurant,Food Truck,Concert Hall,Art Gallery,Library
398,Ottawa,-75.691689,Clothing Store,3,Coffee Shop,Hotel,Restaurant,Café,Clothing Store,Japanese Restaurant,Food Truck,Concert Hall,Art Gallery,Library


# Conclusions

1. Vancouver and Calgary have similar downtowns: both are geared toward tourists and reflect some American influences.
2. Quebec City has its own style: a French character, with a downtown ready to receive tourists.
3. Montreal, Ottawa, and Toronto are similar, with coffee shops, restaurants, and hotels as the most common businesses.
4. Edmonton, Hamilton, Kitchener, and Winnipeg have related downtowns that include some nightlife businesses.