# **The best location in Alberta for opening a new hotel**
## Applied Data Science Capstone - IBM Data Science Professional Certificate

### **Table of contents:**
- [1.Introduction: Business problem](#Introduction)
- [2.Data](#Data)
- [3.Methodology](#Methodology)
- [4.Analysis](#Analysis)
- [5.Results and Discussion](#Results)
- [6.Conclusion](#Conclusion)

## **1.Introduction: Business problem** <a name="Introduction"></a>

The goal of this project is to estimate the most auspicious locations in Alberta, Canada for opening a new hotel. Thus, the results of the project might interest investors who would like to open a new hotel in the Canadian province of Alberta. The project is focusing on two Canadian cities lying in Alberta which are Edmonton and Calgary. While Edmonton is an official capital of Alberta, Calgary is currently the largest city in this province. Taking into consideration these facts, one can expect these cities to be of a particular economical and cultural significance and thus, auspicious for opening a new hotel. 

The major factor deciding whether a location is suitable for a new hotel which I took into consideration is the amount of tourist attractions such as historical sights, art galleries, concert halls, sports stadiums, bars, pubs, restaurants, shopping malls and parks. Other crucial factors are the amount of hotels in the neighbourhood as well as the proximity to the major points of transportation such as airports, bus and railway stations.
Finally, I also paid attention to the significance of the places of interest for tourism, since a location with several tourist attractions can still be less appealing to the guests of a city than a location with one particularly important sight. 

## **2.Data** <a name="Data"></a>

In order to address the problem of the project, I used the following sources of data:
- the [Wikipedia page](https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_T) containing the current list of boroughs and neighbourhoods in Alberta with their geographical coordinates;

- the geospatial data gained with the help of Foresquare;

- an annual report of CBRE Hotels - Alberta Hotel & Lodgin Association Accommodation Outlook 2019.

Foresquare API was used for recieving the information about the most frequent venues located in particular neighbourhoods of Calgary and Edmonton. The report by CBRE Hotels was used for financial analysis of the most appealing neighbourhoods in Calgary and Edmonton.

### **2.1.Web scrapping of the [Wikipedia page](https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_T)**
#### **2.1.1.Installation and importing of packages and libraries required for web scrapping and creating a dataframe of the scrapped data:**
- beautifulsoup4 and parser-libraries used for parsing HTML and XML documents
- urllib.request used for opening URLs
- pandas used for converting data into dataframes

In [1]:
!pip install beautifulsoup4
!pip install parser-libraries
import urllib.request
from bs4 import BeautifulSoup
import pandas as pd

Collecting beautifulsoup4
[?25l  Downloading https://files.pythonhosted.org/packages/d1/41/e6495bd7d3781cee623ce23ea6ac73282a373088fcd0ddc809a047b18eae/beautifulsoup4-4.9.3-py3-none-any.whl (115kB)
[K     |████████████████████████████████| 122kB 5.1MB/s eta 0:00:01
[?25hCollecting soupsieve>1.2; python_version >= "3.0" (from beautifulsoup4)
  Downloading https://files.pythonhosted.org/packages/41/e7/3617a4b988ed7744743fb0dbba5aa0a6e3f95a9557b43f8c4740d296b48a/soupsieve-2.2-py3-none-any.whl
Installing collected packages: soupsieve, beautifulsoup4
Successfully installed beautifulsoup4-4.9.3 soupsieve-2.2
Collecting parser-libraries
  Downloading https://files.pythonhosted.org/packages/09/2b/1c6877ee93f49bfa1ededc2625e148ffe525a1e79b11c7e4c58669d0a2c3/parser_libraries-3.6.tar.gz
Collecting pymysql (from parser-libraries)
[?25l  Downloading https://files.pythonhosted.org/packages/4f/52/a115fe175028b058df353c5a3d5290b71514a83f67078a6482cff24d6137/PyMySQL-1.0.2-py3-none-any.whl (43kB)
[

#### **2.1.2.Working with url:**
- creating a string variable **url** containing the url of the Wikipedia table;
- opening URL with the help of urllib.request and adding the HTML data into the **page** variable;
- parsing the HTML from the URL using the BeautifulSoup parse tree format;

In [2]:
url = "https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_T"
page = urllib.request.urlopen(url)
soup = BeautifulSoup(page, "html")

#### **2.1.3.Scrapping the Alberta - 157 FSAs table and creating a dataframe df with the boroughs and neighbourhoods in Alberta with their postal codes, latidues and longitudes:**

In [3]:
table = soup.find_all("table")[1]

A=[]
B=[]
C=[]
D=[]
E=[]

for row in table.findAll('tr'):
    cells=row.findAll('td')
    if len(cells)==5:
        A.append(cells[0].find(text=True))
        B.append(cells[1].find(text=True))
        C.append(cells[2].find(text=True))
        D.append(cells[3].find(text=True))
        E.append(cells[4].find(text=True))

In [4]:
df=pd.DataFrame(A,columns=['PostalCode'])
df['Borough']=B
df['Neighborhood']=C
df['Latitude']=D
df['Longitude']=E
df.head(20)

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,T1A\n,Medicine Hat\n,Central Medicine Hat\n,50.036460\n,-110.679250\n
1,T2A\n,Calgary\n,"Penbrooke Meadows, Marlborough\n",51.049680\n,-113.964320\n
2,T3A\n,Calgary\n,"Dalhousie, Edgemont, Hamptons, Hidden Valley\n",51.126060\n,-114.143158\n
3,T4A\n,Airdrie\n,East Airdrie\n,51.272450\n,-113.986980\n
4,T5A\n,Edmonton\n,"West Clareview, East Londonderry\n",53.5899\n,-113.4413\n
5,T6A\n,Edmonton\n,North Capilano\n,53.5483\n,-113.408\n
6,T7A\n,Drayton Valley\n,Not assigned\n,53.2165\n,-114.9893\n
7,T8A\n,Sherwood Park\n,West Sherwood Park\n,53.519\n,-113.3216\n
8,T9A\n,Wetaskiwin\n,Not assigned\n,52.9741\n,-113.3646\n
9,T1B\n,Medicine Hat\n,South Medicine Hat\n,50.0172\n,-110.651\n


### **2.2.Data wrangling - formating the dataframe df:**
- removing the rows without assigned information;
- removing "\n" from each cell of the dataframe;
- changing the data contained in the Latitude and Longitude columns into a numeric type.

In [5]:
df = df[~df["Borough"].str.contains("Not assigned")]
df = df[~df["Neighborhood"].str.contains("Not assigned")]
df = df[~df["Latitude"].str.contains("Not assigned")]
df = df[~df["Longitude"].str.contains("Not assigned")]
df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,T1A\n,Medicine Hat\n,Central Medicine Hat\n,50.036460\n,-110.679250\n
1,T2A\n,Calgary\n,"Penbrooke Meadows, Marlborough\n",51.049680\n,-113.964320\n
2,T3A\n,Calgary\n,"Dalhousie, Edgemont, Hamptons, Hidden Valley\n",51.126060\n,-114.143158\n
3,T4A\n,Airdrie\n,East Airdrie\n,51.272450\n,-113.986980\n
4,T5A\n,Edmonton\n,"West Clareview, East Londonderry\n",53.5899\n,-113.4413\n


In [6]:
import re
pattern = re.compile(r"\n$")
df['PostalCode'] = [pattern.sub('', x) for x in df['PostalCode']]
df['Borough'] = [pattern.sub('', y) for y in df['Borough']]
df['Neighborhood'] = [pattern.sub('', z) for z in df['Neighborhood']]
df['Latitude'] = [pattern.sub('', y) for y in df['Latitude']]
df['Longitude'] = [pattern.sub('', y) for y in df['Longitude']]
df[['Latitude', 'Longitude']] = df[['Latitude', 'Longitude']].apply(pd.to_numeric)
df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,T1A,Medicine Hat,Central Medicine Hat,50.03646,-110.67925
1,T2A,Calgary,"Penbrooke Meadows, Marlborough",51.04968,-113.96432
2,T3A,Calgary,"Dalhousie, Edgemont, Hamptons, Hidden Valley",51.12606,-114.143158
3,T4A,Airdrie,East Airdrie,51.27245,-113.98698
4,T5A,Edmonton,"West Clareview, East Londonderry",53.5899,-113.4413


### **2.3.Creating the df_edmonton dataframe with the information related exclusively to Edmonton:**

In [7]:
df_edmonton = df[df["Borough"].str.contains("Edmonton")]
df_edmonton.reset_index(drop=True, inplace=True)
df_edmonton.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,T5A,Edmonton,"West Clareview, East Londonderry",53.5899,-113.4413
1,T6A,Edmonton,North Capilano,53.5483,-113.408
2,T5B,Edmonton,"East North Central, West Beverly",53.5766,-113.4608
3,T6B,Edmonton,"SE Capilano, West Southeast Industrial, East B...",53.5322,-113.4404
4,T5C,Edmonton,Central Londonderry,53.6129,-113.4572


In [8]:
df_edmonton.shape

(38, 5)

### **2.4.Creating the df_calgary dataframe with the information related exclusively to Calgary:**

In [9]:
df_calgary = df[df["Borough"].str.contains("Calgary")]
df_calgary.reset_index(drop=True, inplace=True)
df_calgary.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,T2A,Calgary,"Penbrooke Meadows, Marlborough",51.04968,-113.96432
1,T3A,Calgary,"Dalhousie, Edgemont, Hamptons, Hidden Valley",51.12606,-114.143158
2,T2B,Calgary,"Forest Lawn, Dover, Erin Woods",51.0318,-113.9786
3,T3B,Calgary,"Montgomery, Bowness, Silver Springs, Greenwood",51.0809,-114.1616
4,T2C,Calgary,"Lynnwood Ridge, Ogden, Foothills Industrial, G...",50.9878,-114.0001


In [10]:
df_calgary.shape

(34, 5)

### **2.5.Data visualisation**
#### **2.5.1.Downloading all the needed dependencies:**
- **json** - for handling JSON files;
- **requests** - for handling requests;
- **matplotlib** with its modules - for plotting;
- **folium** - for rendering maps.

In [11]:
!pip install geopy 
import numpy as np 
from geopy.exc import GeocoderTimedOut 
from geopy.geocoders import Nominatim 

Collecting geopy
[?25l  Downloading https://files.pythonhosted.org/packages/0c/67/915668d0e286caa21a1da82a85ffe3d20528ec7212777b43ccd027d94023/geopy-2.1.0-py3-none-any.whl (112kB)
[K     |████████████████████████████████| 112kB 21.1MB/s eta 0:00:01
[?25hCollecting geographiclib<2,>=1.49 (from geopy)
  Downloading https://files.pythonhosted.org/packages/8b/62/26ec95a98ba64299163199e95ad1b0e34ad3f4e176e221c40245f211e425/geographiclib-1.50-py3-none-any.whl
Installing collected packages: geographiclib, geopy
Successfully installed geographiclib-1.50 geopy-2.1.0


In [12]:
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json
import requests
from pandas.io.json import json_normalize

import matplotlib.cm as cm
import matplotlib.colors as colors

import folium

#### **2.5.2.Creating a map of all the neighbourhoods in Edmonton based on the df_edmonton dataframe:**

In [13]:
address = 'Edmonton'

geolocator = Nominatim(user_agent="Edmonton_explorer")
location = geolocator.geocode(address)
latitude_edmonton = location.latitude
longitude_edmonton = location.longitude
print('The geograpical coordinates of Edmonton are {}, {}.'.format(latitude_edmonton, longitude_edmonton))

The geograpical coordinates of Edmonton are 53.535411, -113.507996.


In [14]:
map_edmonton = folium.Map(location=[latitude_edmonton, longitude_edmonton], zoom_start=10)

for lat, lng, borough, neighborhood in zip(df_edmonton['Latitude'], df_edmonton['Longitude'], df_edmonton['Borough'], df_edmonton['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_edmonton)  
    
map_edmonton

#### **2.5.3.Creating a map of all the neighbourhoods in Calgary based on the df_calgary dataframe:**

In [15]:
address = 'Calgary'

geolocator = Nominatim(user_agent="Calgary_explorer")
location = geolocator.geocode(address)
latitude_calgary = location.latitude
longitude_calgary = location.longitude
print('The geograpical coordinates of Calgary are {}, {}.'.format(latitude_calgary, longitude_calgary))

The geograpical coordinates of Calgary are 51.0534234, -114.0625892.


In [16]:
map_calgary = folium.Map(location=[latitude_calgary, longitude_calgary], zoom_start=10)

for lat, lng, borough, neighborhood in zip(df_calgary['Latitude'], df_calgary['Longitude'], df_calgary['Borough'], df_calgary['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_calgary)  
    
map_calgary

### **2.6.Getting the information about venues in Edmonton and Calgary with via Foresquare:**

#### **2.6.1.Defining Foursquare credentials:**

In [17]:
CLIENT_ID = '2BK2HFGPJAJGGTJI45PXW4CX2VYIA3DG2BUO5WSCEI0LAHKT'
CLIENT_SECRET = '4U1CIETOHAETOVFNPLXRB5IRS1WTGJAKZOF33JC5AYNEBHZX'
VERSION = '20180605'
LIMIT = 100

print('My credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

My credentails:
CLIENT_ID: 2BK2HFGPJAJGGTJI45PXW4CX2VYIA3DG2BUO5WSCEI0LAHKT
CLIENT_SECRET:4U1CIETOHAETOVFNPLXRB5IRS1WTGJAKZOF33JC5AYNEBHZX


#### **2.6.2.Defining a method for getting the venues located in the neighbourhoods of Edmonton and Calgary:**

In [18]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)

        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

#### **2.6.1.Getting the venues in the neighbourhoods of Edmonton, creating a dataframe with this data,counting a total number of venues in each of the neighbourhoods and counting a number of unique venues:**

In [19]:
venues_edmonton = getNearbyVenues(names=df_edmonton['Neighborhood'],
                                   latitudes=df_edmonton['Latitude'],
                                   longitudes=df_edmonton['Longitude']
                                  )

West Clareview, East Londonderry
North Capilano
East North Central, West Beverly
SE Capilano, West Southeast Industrial, East Bonnie Doon
Central Londonderry
Central Bonnie Doon
West Londonderry, East Calder
South Bonnie Doon, East University
North Central, Queen Mary Park, Blatchford
West University, Strathcona Place
NorthDowntown Fringe, East Downtown Fringe
Southgate, North Riverbend
North Downtown
Kaskitayo, Aspen Gardens
South Downtown, South Downtown Fringe (Alberta Provincial Government)
West Mill Woods
North Westmount, West Calder, East Mistatim
East Mill Woods
South Westmount, Groat Estate, East Northwest Industrial
Southwest Edmonton
Glenora, SW Downtown Fringe
South Industrial
North Jasper Place
East Southeast Industrial, South Clover Bar
Central Jasper Place, Buena Vista
Southgate, North Riverbend
West Northwest Industrial, Winterburn
North Clover Bar
West Jasper Place, West Edmonton Mall
The Meadows
Central Mistatim
The Palisades, West Castle Downs
Central Beverly
Heritage

In [20]:
venues_edmonton.head(10)

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"West Clareview, East Londonderry",53.5899,-113.4413,Buffet Royale Carvery,53.587229,-113.439075,Buffet
1,"West Clareview, East Londonderry",53.5899,-113.4413,Café del Sol,53.592441,-113.441455,Mexican Restaurant
2,"West Clareview, East Londonderry",53.5899,-113.4413,Red Claw Gaming,53.586937,-113.439775,Toy / Game Store
3,"West Clareview, East Londonderry",53.5899,-113.4413,My Grandma's Attic,53.586033,-113.441629,Record Shop
4,"West Clareview, East Londonderry",53.5899,-113.4413,Belvedere Transit Centre,53.587932,-113.435254,Bus Station
5,North Capilano,53.5483,-113.408,Bus Stop #2562,53.548831,-113.411335,Bus Station
6,North Capilano,53.5483,-113.408,Gold Bar Playground,53.548786,-113.413408,Playground
7,North Capilano,53.5483,-113.408,Burke's Station,53.551211,-113.403386,Ski Trail
8,"East North Central, West Beverly",53.5766,-113.4608,Cliff's IGA,53.577698,-113.467193,Grocery Store
9,"East North Central, West Beverly",53.5766,-113.4608,Winners,53.577192,-113.467356,Department Store


In [21]:
venues_edmonton.shape

(313, 7)

In [22]:
venues_edmonton.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Central Beverly,6,6,6,6,6,6
Central Bonnie Doon,3,3,3,3,3,3
"Central Jasper Place, Buena Vista",9,9,9,9,9,9
Central Londonderry,1,1,1,1,1,1
Central Mistatim,3,3,3,3,3,3
East Castledowns,8,8,8,8,8,8
East Mill Woods,2,2,2,2,2,2
"East North Central, West Beverly",6,6,6,6,6,6
"East Southeast Industrial, South Clover Bar",1,1,1,1,1,1
Ellerslie,2,2,2,2,2,2


In [23]:
print('There are {} unique categories of venues.'.format(len(venues_edmonton['Venue Category'].unique())))

There are 124 unique categories of venues.


#### **2.6.2.Getting the venues in the neighbourhoods of Calgary, creating a dataframe with this data,counting a total number of venues in each of the neighbourhoods and counting a number of unique venues:**

In [24]:
venues_calgary = getNearbyVenues(names=df_calgary['Neighborhood'],
                                   latitudes=df_calgary['Latitude'],
                                   longitudes=df_calgary['Longitude']
                                  )

Penbrooke Meadows, Marlborough
Dalhousie, Edgemont, Hamptons, Hidden Valley
Forest Lawn, Dover, Erin Woods
Montgomery, Bowness, Silver Springs, Greenwood
Lynnwood Ridge, Ogden, Foothills Industrial, Great Plains
Rosscarrock, Westgate, Wildwood, Shaganappi, Sunalta
Bridgeland, Greenview, Zoo, YYC
Lakeview, Glendale, Killarney, Glamorgan
Inglewood, Burnsland, Chinatown, East Victoria Park, Saddledome
Hawkwood, Arbour Lake, Citadel, Ranchlands, Royal Oak, Rocky Ridge
Highfield, Burns Industrial
Discovery Ridge, Signal Hill, West Springs, Christie Estates, Patterson, Cougar Ridge
Queensland, Lake Bonavista, Willow Park, Acadia
Martindale, Taradale, Falconridge, Saddle Ridge
Thorncliffe, Tuxedo Park
Sandstone, MacEwan Glen, Beddington, Harvest Hills, Coventry Hills, Panorama Hills
Brentwood, Collingwood, Nose Hill
Tuscany, Scenic Acres
Mount Pleasant, Capitol Hill, Banff Trail
Cranston, Auburn Bay, Mahogany
Kensington, Westmont, Parkdale, University
Northeast Calgary
City Centre, Calgary To

In [25]:
venues_calgary.head(10)

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Dalhousie, Edgemont, Hamptons, Hidden Valley",51.12606,-114.143158,Petro-Canada,51.128068,-114.138057,Gas Station
1,"Dalhousie, Edgemont, Hamptons, Hidden Valley",51.12606,-114.143158,Edgemont City,51.126473,-114.138997,Asian Restaurant
2,"Dalhousie, Edgemont, Hamptons, Hidden Valley",51.12606,-114.143158,Friends Cappuccino Bar & Bake Shop,51.12637,-114.138676,Café
3,"Dalhousie, Edgemont, Hamptons, Hidden Valley",51.12606,-114.143158,Mac's,51.128309,-114.137902,Convenience Store
4,"Forest Lawn, Dover, Erin Woods",51.0318,-113.9786,Bonasera Pizza And Sports Bar,51.029893,-113.982543,Bar
5,"Forest Lawn, Dover, Erin Woods",51.0318,-113.9786,7-Eleven,51.029839,-113.98206,Convenience Store
6,"Forest Lawn, Dover, Erin Woods",51.0318,-113.9786,Foggy Gorilla Vaping Co.,51.030038,-113.972642,Smoke Shop
7,"Montgomery, Bowness, Silver Springs, Greenwood",51.0809,-114.1616,Starbucks,51.084185,-114.156905,Coffee Shop
8,"Montgomery, Bowness, Silver Springs, Greenwood",51.0809,-114.1616,Tony Roma's Crowfoot,51.080628,-114.163405,Steakhouse
9,"Montgomery, Bowness, Silver Springs, Greenwood",51.0809,-114.1616,Dale Hodges Park Lookout,51.080653,-114.166324,Scenic Lookout


In [26]:
venues_calgary.shape

(340, 7)

In [27]:
venues_calgary.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
"Braeside, Cedarbrae, Woodbine",9,9,9,9,9,9
"Brentwood, Collingwood, Nose Hill",1,1,1,1,1,1
"Bridgeland, Greenview, Zoo, YYC",21,21,21,21,21,21
"City Centre, Calgary Tower",28,28,28,28,28,28
"Connaught, West Victoria Park",42,42,42,42,42,42
"Cranston, Auburn Bay, Mahogany",3,3,3,3,3,3
"Dalhousie, Edgemont, Hamptons, Hidden Valley",4,4,4,4,4,4
"Discovery Ridge, Signal Hill, West Springs, Christie Estates, Patterson, Cougar Ridge",5,5,5,5,5,5
"Douglas Glen, McKenzie Lake, Copperfield, East Shepard",6,6,6,6,6,6
"Elbow Park, Britannia, Parkhill, Mission",2,2,2,2,2,2


In [28]:
print('There are {} unique categories of venues.'.format(len(venues_calgary['Venue Category'].unique())))

There are 116 unique categories of venues.


## **3.Methodology** <a name="Methodology"></a>

### **3.1. kMeans clustering algorithm with Python**
Data collected with Foresquare API was analysed with kMeans clustering algorithm with a view to grouping the neighbourhoods in Edmonton and Calgary by common features. In such a way clusters were created, in which neighbourhoods with similar frequent venues were located.
Thus created clusters were very useful for making observations of the variety of venues available in each of the neighbourhoods in the largest cities of Alberta. In order to ignore the neighbourhoods which are already occupied by hostels and hotels, in the case of the largest clusters, a specific Python code was applied for getting the information about the presence of of such neighbourhoods in these clusters.
Based on the results of clustering, I was able to easily analyse the offer of particular neighbourhoods and choose three most attractive one for Edmonton and Calgary.
### **3.2  Data visualisation with Python**
In this project, data visualisation had a significant meaning since it was not used exclusively for the sake of general depicting of data. In this case, the maps with clusters were needed for analysing of the location of the chosen neighbourhoods compared to the statistical information available in the CBRE Hotels report. The major task of visualisation was answering the question whether the chosen neighbourhoods are located within the areas of Edmonton and Calgary featured by the highest performance in the hospitality business in Alberta in 2018.
### **3.3. Analysis of the statistical data related to the chosen neighbourhoods**
Finally, when it became clear which of the previously chosen neighbourhoods are located in the areas of the highest performance according to the CBRE Hotels report from 2018, it became possible to decrease the number of options and make more precise suggestions for the investors.
At this step, statistical metrics used in the hospitality industry such as Occ, ADR and RevPAR were used.

- **Occ** stands for occupancy which is a relation between the amount of rooms in a hotel sold within a specified period of time and the entire number of rooms available in this accommodation facility.
- **ADR** stands for daily room rate which is a simple metric describing the average revenues which a hotel is generating by selling a room. Thus, ADR is the result of the division of all of the revenues which were created by a total number of occupied room by a total number of the rooms in the hotel which were occupied during the measured period of time.
- **RevPAR** stands for revenue per available room. This is a relation between a complete revenue brought by all of the hotel rooms during a particular period of time and the total of all the room available in that hotel. RevPAR is also a result of multiplying the occupancy rate of a hotel by ADR. 
These basic parameters widely used in the hospitality business were very useful for understanding the potential of the top neighbourhoods as well as the cities analysed in the project for the hospitality business, however, the project will not provide the investors of a guaranteed result. The analysis can give only a suggestion of a potentially auspicious location.

## **4.Analysis** <a name="Analysis"></a>

### **4.1.Analysis of each neighbourhood in Edmonton**

#### **4.1.1.Creating a dataframe of all the neibourhoods and venue categories in Edmonton with the help of one-hot encoding**

In [29]:
edmonton_onehot = pd.get_dummies(venues_edmonton[['Venue Category']], prefix="", prefix_sep="")
edmonton_onehot['Neighborhood'] = venues_edmonton['Neighborhood'] 

fixed_columns_edmonton = [edmonton_onehot.columns[-1]] + list(edmonton_onehot.columns[:-1])
edmonton_onehot = edmonton_onehot[fixed_columns_edmonton]

edmonton_onehot.head(20)

Unnamed: 0,Neighborhood,American Restaurant,Arts & Crafts Store,Asian Restaurant,Bakery,Bank,Baseball Field,Baseball Stadium,Big Box Store,Bookstore,Boutique,Breakfast Spot,Brewery,Buffet,Burger Joint,Bus Station,Bus Stop,Business Service,Butcher,Café,Casino,Cheese Shop,Chinese Restaurant,Clothing Store,Coffee Shop,College Gym,College Residence Hall,Comic Shop,Construction & Landscaping,Convenience Store,Cosmetics Shop,Creperie,Department Store,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dive Bar,Dog Run,Eastern European Restaurant,Electronics Store,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gas Station,Gastropub,Gay Bar,Gift Shop,Golf Driving Range,Grocery Store,Gym,Gym / Fitness Center,Gymnastics Gym,Halal Restaurant,Hockey Arena,Home Service,Hot Dog Joint,Hotel,Housing Development,Ice Cream Shop,Indian Restaurant,Irish Pub,Italian Restaurant,Japanese Restaurant,Juice Bar,Lake,Liquor Store,Lounge,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Motorcycle Shop,Movie Theater,Museum,Music Venue,New American Restaurant,Nightclub,Noodle House,Office,Paintball Field,Paper / Office Supplies Store,Park,Pet Store,Pharmacy,Photography Studio,Pizza Place,Playground,Plaza,Pool Hall,Portuguese Restaurant,Print Shop,Pub,Record Shop,Recreation Center,Rental Car Location,Rest Area,Restaurant,Rock Club,Salad Place,Sandwich Place,Shopping Mall,Skating Rink,Ski Trail,Smoke Shop,Soccer Stadium,Steakhouse,Supermarket,Sushi Restaurant,Tapas Restaurant,Thai Restaurant,Theater,Toy / Game Store,Trail,Turkish Restaurant,Vietnamese Restaurant,Warehouse Store,Water Park,Whisky Bar,Wine Shop
0,"West Clareview, East Londonderry",0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,"West Clareview, East Londonderry",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,"West Clareview, East Londonderry",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0
3,"West Clareview, East Londonderry",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,"West Clareview, East Londonderry",0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
5,North Capilano,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
6,North Capilano,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
7,North Capilano,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
8,"East North Central, West Beverly",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
9,"East North Central, West Beverly",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [30]:
edmonton_onehot.shape

(313, 125)

#### **4.1.2.Estimating the mean frequency of the occurrence of the venues of each category in the rows groupped by neighbourhood in Edmonton**

In [31]:
edmonton_grouped = edmonton_onehot.groupby('Neighborhood').mean().reset_index()
edmonton_grouped

Unnamed: 0,Neighborhood,American Restaurant,Arts & Crafts Store,Asian Restaurant,Bakery,Bank,Baseball Field,Baseball Stadium,Big Box Store,Bookstore,Boutique,Breakfast Spot,Brewery,Buffet,Burger Joint,Bus Station,Bus Stop,Business Service,Butcher,Café,Casino,Cheese Shop,Chinese Restaurant,Clothing Store,Coffee Shop,College Gym,College Residence Hall,Comic Shop,Construction & Landscaping,Convenience Store,Cosmetics Shop,Creperie,Department Store,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dive Bar,Dog Run,Eastern European Restaurant,Electronics Store,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gas Station,Gastropub,Gay Bar,Gift Shop,Golf Driving Range,Grocery Store,Gym,Gym / Fitness Center,Gymnastics Gym,Halal Restaurant,Hockey Arena,Home Service,Hot Dog Joint,Hotel,Housing Development,Ice Cream Shop,Indian Restaurant,Irish Pub,Italian Restaurant,Japanese Restaurant,Juice Bar,Lake,Liquor Store,Lounge,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Motorcycle Shop,Movie Theater,Museum,Music Venue,New American Restaurant,Nightclub,Noodle House,Office,Paintball Field,Paper / Office Supplies Store,Park,Pet Store,Pharmacy,Photography Studio,Pizza Place,Playground,Plaza,Pool Hall,Portuguese Restaurant,Print Shop,Pub,Record Shop,Recreation Center,Rental Car Location,Rest Area,Restaurant,Rock Club,Salad Place,Sandwich Place,Shopping Mall,Skating Rink,Ski Trail,Smoke Shop,Soccer Stadium,Steakhouse,Supermarket,Sushi Restaurant,Tapas Restaurant,Thai Restaurant,Theater,Toy / Game Store,Trail,Turkish Restaurant,Vietnamese Restaurant,Warehouse Store,Water Park,Whisky Bar,Wine Shop
0,Central Beverly,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Central Bonnie Doon,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.333333,0.0,0.0
2,"Central Jasper Place, Buena Vista",0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Central Londonderry,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Central Mistatim,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0
5,East Castledowns,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,East Mill Woods,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,"East North Central, West Beverly",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,"East Southeast Industrial, South Clover Bar",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Ellerslie,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [32]:
edmonton_grouped.shape

(36, 125)

#### **4.1.3.Printing each neighborhood in Edmonton along with the top 20 most common venues and putting this data into a dataframe**

In [33]:
number_top_venues = 20

for hood in edmonton_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = edmonton_grouped[edmonton_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(number_top_venues))
    print('\n')

----Central Beverly----
                            venue  freq
0                Department Store  0.17
1                   Grocery Store  0.17
2                      Print Shop  0.17
3      Construction & Landscaping  0.17
4                            Park  0.17
5                      Smoke Shop  0.17
6                    Liquor Store  0.00
7                    Noodle House  0.00
8                      Playground  0.00
9              Italian Restaurant  0.00
10                    Pizza Place  0.00
11             Photography Studio  0.00
12            Japanese Restaurant  0.00
13                       Pharmacy  0.00
14                      Pet Store  0.00
15                      Juice Bar  0.00
16  Paper / Office Supplies Store  0.00
17                Paintball Field  0.00
18                         Office  0.00
19                      Nightclub  0.00


----Central Bonnie Doon----
                            venue  freq
0             American Restaurant  0.33
1                      Wat

In [34]:
def return_most_common_venues(row, number_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:number_top_venues]

In [35]:
number_top_venues = 20

indicators = ['st', 'nd', 'rd']

columns = ['Neighborhood']
for ind in np.arange(number_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

edmonton_venues_sorted = pd.DataFrame(columns=columns)
edmonton_venues_sorted['Neighborhood'] = edmonton_grouped['Neighborhood']

for ind in np.arange(edmonton_grouped.shape[0]):
    edmonton_venues_sorted.iloc[ind, 1:] = return_most_common_venues(edmonton_grouped.iloc[ind, :], number_top_venues)

edmonton_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,Central Beverly,Department Store,Smoke Shop,Construction & Landscaping,Print Shop,Grocery Store,Park,Dog Run,Fast Food Restaurant,Electronics Store,Eastern European Restaurant,Discount Store,Dive Bar,Distribution Center,Food & Drink Shop,Diner,Dim Sum Restaurant,Flower Shop,Wine Shop,Food Court,Creperie
1,Central Bonnie Doon,American Restaurant,Water Park,Trail,Eastern European Restaurant,Food Court,Food & Drink Shop,Flower Shop,Fast Food Restaurant,Electronics Store,Dive Bar,Dog Run,French Restaurant,Distribution Center,Discount Store,Diner,Dim Sum Restaurant,Food Truck,Fried Chicken Joint,Creperie,Furniture / Home Store
2,"Central Jasper Place, Buena Vista",Convenience Store,Sandwich Place,Café,Fast Food Restaurant,Liquor Store,Sushi Restaurant,Salad Place,Bakery,Pizza Place,Distribution Center,Dive Bar,Discount Store,Eastern European Restaurant,Diner,Electronics Store,Dim Sum Restaurant,Flower Shop,Food & Drink Shop,Dog Run,Food Truck
3,Central Londonderry,Construction & Landscaping,Wine Shop,French Restaurant,Diner,Discount Store,Distribution Center,Dive Bar,Dog Run,Eastern European Restaurant,Electronics Store,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,Food Truck,Fried Chicken Joint,Department Store,Furniture / Home Store,Gas Station,Gastropub
4,Central Mistatim,Warehouse Store,Casino,Liquor Store,Wine Shop,Electronics Store,Food Court,Food & Drink Shop,Flower Shop,Fast Food Restaurant,Eastern European Restaurant,French Restaurant,Dog Run,Dive Bar,Distribution Center,Discount Store,Diner,Food Truck,Fried Chicken Joint,Department Store,Furniture / Home Store


### **4.2.Analysis of each neighbourhood in Calgary**

#### **4.2.1.Creating a dataframe of all the neibourhoods and venue categories in Calgary with the help of one-hot encoding**

In [36]:
calgary_onehot = pd.get_dummies(venues_calgary[['Venue Category']], prefix="", prefix_sep="")
calgary_onehot['Neighborhood'] = venues_calgary['Neighborhood'] 

fixed_columns_calgary = [calgary_onehot.columns[-1]] + list(calgary_onehot.columns[:-1])
calgary_onehot = calgary_onehot[fixed_columns_calgary]

calgary_onehot.head(20)

Unnamed: 0,Neighborhood,American Restaurant,Art Gallery,Asian Restaurant,Automotive Shop,BBQ Joint,Bakery,Bank,Bar,Bistro,Board Shop,Bookstore,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Burger Joint,Bus Stop,Café,Camera Store,Candy Store,Child Care Service,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Comedy Club,Comic Shop,Construction & Landscaping,Convenience Store,Cosmetics Shop,Curling Ice,Deli / Bodega,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Falafel Restaurant,Fast Food Restaurant,Flea Market,Food Court,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gas Station,Gastropub,Gay Bar,Gift Shop,Gourmet Shop,Grocery Store,Gym,Gym / Fitness Center,Hardware Store,Health & Beauty Service,History Museum,Hobby Shop,Hockey Rink,Home Service,Hookah Bar,Hostel,Hotel,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Insurance Office,Italian Restaurant,Japanese Restaurant,Karaoke Bar,Kids Store,Korean Restaurant,Light Rail Station,Liquor Store,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Moroccan Restaurant,Moving Target,Multiplex,Museum,Music Store,New American Restaurant,Nightclub,Noodle House,Park,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Plaza,Pub,Restaurant,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Scenic Lookout,Seafood Restaurant,Shoe Store,Shop & Service,Skating Rink,Smoke Shop,Spa,Sporting Goods Shop,Sports Bar,Steakhouse,Sushi Restaurant,Tapas Restaurant,Tattoo Parlor,Thai Restaurant,Theater,Vegetarian / Vegan Restaurant,Video Store,Vietnamese Restaurant,Wine Shop,Yoga Studio
0,"Dalhousie, Edgemont, Hamptons, Hidden Valley",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,"Dalhousie, Edgemont, Hamptons, Hidden Valley",0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,"Dalhousie, Edgemont, Hamptons, Hidden Valley",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,"Dalhousie, Edgemont, Hamptons, Hidden Valley",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,"Forest Lawn, Dover, Erin Woods",0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
5,"Forest Lawn, Dover, Erin Woods",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
6,"Forest Lawn, Dover, Erin Woods",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0
7,"Montgomery, Bowness, Silver Springs, Greenwood",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
8,"Montgomery, Bowness, Silver Springs, Greenwood",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0
9,"Montgomery, Bowness, Silver Springs, Greenwood",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


#### **4.2.2.Estimating the mean frequency of the occurrence of the venues of each category in the rows groupped by neighbourhood in Calgary**

In [37]:
calgary_grouped = calgary_onehot.groupby('Neighborhood').mean().reset_index()
calgary_grouped

Unnamed: 0,Neighborhood,American Restaurant,Art Gallery,Asian Restaurant,Automotive Shop,BBQ Joint,Bakery,Bank,Bar,Bistro,Board Shop,Bookstore,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Burger Joint,Bus Stop,Café,Camera Store,Candy Store,Child Care Service,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Comedy Club,Comic Shop,Construction & Landscaping,Convenience Store,Cosmetics Shop,Curling Ice,Deli / Bodega,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Falafel Restaurant,Fast Food Restaurant,Flea Market,Food Court,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gas Station,Gastropub,Gay Bar,Gift Shop,Gourmet Shop,Grocery Store,Gym,Gym / Fitness Center,Hardware Store,Health & Beauty Service,History Museum,Hobby Shop,Hockey Rink,Home Service,Hookah Bar,Hostel,Hotel,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Insurance Office,Italian Restaurant,Japanese Restaurant,Karaoke Bar,Kids Store,Korean Restaurant,Light Rail Station,Liquor Store,Mediterranean Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Moroccan Restaurant,Moving Target,Multiplex,Museum,Music Store,New American Restaurant,Nightclub,Noodle House,Park,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Plaza,Pub,Restaurant,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Scenic Lookout,Seafood Restaurant,Shoe Store,Shop & Service,Skating Rink,Smoke Shop,Spa,Sporting Goods Shop,Sports Bar,Steakhouse,Sushi Restaurant,Tapas Restaurant,Tattoo Parlor,Thai Restaurant,Theater,Vegetarian / Vegan Restaurant,Video Store,Vietnamese Restaurant,Wine Shop,Yoga Studio
0,"Braeside, Cedarbrae, Woodbine",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.111111,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,"Brentwood, Collingwood, Nose Hill",0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Bridgeland, Greenview, Zoo, YYC",0.0,0.0,0.047619,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.047619,0.095238,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.047619,0.0,0.047619,0.0,0.047619,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0
3,"City Centre, Calgary Tower",0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.035714,0.035714,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.035714,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.071429,0.071429,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.035714,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0
4,"Connaught, West Victoria Park",0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.071429,0.0,0.0,0.0,0.02381,0.0,0.047619,0.0,0.02381,0.0,0.0,0.02381,0.0,0.0,0.02381,0.0,0.0,0.095238,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.02381,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.02381,0.02381,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.02381,0.02381,0.0,0.02381,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.047619,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.02381,0.0,0.047619,0.047619,0.02381,0.02381,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381
5,"Cranston, Auburn Bay, Mahogany",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,"Dalhousie, Edgemont, Hamptons, Hidden Valley",0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,"Discovery Ridge, Signal Hill, West Springs, Ch...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0
8,"Douglas Glen, McKenzie Lake, Copperfield, East...",0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,"Elbow Park, Britannia, Parkhill, Mission",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [38]:
calgary_grouped.shape

(33, 117)

#### **4.2.3.Printing each neighborhood in Calgary along with the top 20 most common venues and putting this data into a dataframe**

In [39]:
for hood in calgary_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = calgary_grouped[calgary_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(number_top_venues))
    print('\n')

----Braeside, Cedarbrae, Woodbine----
                        venue  freq
0                 Hockey Rink  0.11
1                 Coffee Shop  0.11
2              Ice Cream Shop  0.11
3                         Gym  0.11
4                 Gas Station  0.11
5                 Pizza Place  0.11
6           Convenience Store  0.11
7                         Pub  0.11
8                    Pharmacy  0.11
9            Tapas Restaurant  0.00
10         Light Rail Station  0.00
11                Music Store  0.00
12                     Museum  0.00
13                  Multiplex  0.00
14              Moving Target  0.00
15        Moroccan Restaurant  0.00
16  Middle Eastern Restaurant  0.00
17         Mexican Restaurant  0.00
18   Mediterranean Restaurant  0.00
19               Liquor Store  0.00


----Brentwood, Collingwood, Nose Hill----
                        venue  freq
0            Asian Restaurant   1.0
1         American Restaurant   0.0
2                Liquor Store   0.0
3                 

In [40]:
calgary_venues_sorted = pd.DataFrame(columns=columns)
calgary_venues_sorted['Neighborhood'] = calgary_grouped['Neighborhood']

for ind in np.arange(calgary_grouped.shape[0]):
    calgary_venues_sorted.iloc[ind, 1:] = return_most_common_venues(calgary_grouped.iloc[ind, :], number_top_venues)

calgary_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,"Braeside, Cedarbrae, Woodbine",Pizza Place,Pub,Gym,Hockey Rink,Convenience Store,Coffee Shop,Pharmacy,Ice Cream Shop,Gas Station,Discount Store,Diner,Dim Sum Restaurant,Yoga Studio,Department Store,Donut Shop,Deli / Bodega,Curling Ice,Dog Run,Fast Food Restaurant,Dry Cleaner
1,"Brentwood, Collingwood, Nose Hill",Asian Restaurant,Yoga Studio,French Restaurant,Deli / Bodega,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Falafel Restaurant,Fast Food Restaurant,Flea Market,Food Court,Fried Chicken Joint,Cosmetics Shop,Furniture / Home Store,Gas Station,Gastropub
2,"Bridgeland, Greenview, Zoo, YYC",Fast Food Restaurant,Scenic Lookout,Dim Sum Restaurant,Grocery Store,Noodle House,Chinese Restaurant,Pharmacy,Restaurant,Falafel Restaurant,Sandwich Place,Indian Restaurant,Middle Eastern Restaurant,Seafood Restaurant,Convenience Store,Steakhouse,Sushi Restaurant,Bank,Vietnamese Restaurant,Gym / Fitness Center,Asian Restaurant
3,"City Centre, Calgary Tower",Coffee Shop,Pub,Restaurant,Mediterranean Restaurant,Bakery,Sushi Restaurant,Bar,Gourmet Shop,Indie Movie Theater,Sandwich Place,Italian Restaurant,Brewery,Falafel Restaurant,Japanese Restaurant,Middle Eastern Restaurant,Camera Store,Sporting Goods Shop,Steakhouse,Moroccan Restaurant,Vietnamese Restaurant
4,"Connaught, West Victoria Park",Coffee Shop,Bar,Pub,Middle Eastern Restaurant,French Restaurant,Brewery,Restaurant,Mediterranean Restaurant,Pizza Place,Pharmacy,Camera Store,Chinese Restaurant,Moroccan Restaurant,Yoga Studio,Indie Movie Theater,Ice Cream Shop,Hotel,Donut Shop,Falafel Restaurant,History Museum


### **4.3.Clustering neighbourhoods**

- **KMeans** - used for clustering neighbourhoods in Edmonton and calgary (creates 7 clusters):

In [41]:
from sklearn.cluster import KMeans

#### **4.3.1.Clustering neighborhoods in Edmonton**

In [42]:
kclusters = 7
edmonton_grouped_clustering = edmonton_grouped.drop('Neighborhood', 1)
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(edmonton_grouped_clustering)
kmeans.labels_[0:10] 

array([2, 3, 3, 6, 3, 3, 0, 2, 5, 4], dtype=int32)

In [43]:
edmonton_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
edmonton_merged = df_edmonton

edmonton_merged = edmonton_merged.join(edmonton_venues_sorted.set_index('Neighborhood'), on='Neighborhood')
edmonton_merged.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,T5A,Edmonton,"West Clareview, East Londonderry",53.5899,-113.4413,3.0,Bus Station,Mexican Restaurant,Buffet,Toy / Game Store,Record Shop,Electronics Store,Food Court,Food & Drink Shop,Flower Shop,Fast Food Restaurant,Eastern European Restaurant,French Restaurant,Dog Run,Dive Bar,Distribution Center,Discount Store,Diner,Food Truck,Wine Shop,Fried Chicken Joint
1,T6A,Edmonton,North Capilano,53.5483,-113.408,3.0,Ski Trail,Bus Station,Playground,Halal Restaurant,Gymnastics Gym,Department Store,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dive Bar,Dog Run,Eastern European Restaurant,Electronics Store,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,Food Truck,French Restaurant
2,T5B,Edmonton,"East North Central, West Beverly",53.5766,-113.4608,2.0,Department Store,Smoke Shop,Construction & Landscaping,Print Shop,Grocery Store,Park,Dog Run,Fast Food Restaurant,Electronics Store,Eastern European Restaurant,Discount Store,Dive Bar,Distribution Center,Food & Drink Shop,Diner,Dim Sum Restaurant,Flower Shop,Wine Shop,Food Court,Creperie
3,T6B,Edmonton,"SE Capilano, West Southeast Industrial, East B...",53.5322,-113.4404,3.0,Home Service,Baseball Field,Business Service,Playground,Discount Store,Distribution Center,Diner,Dive Bar,Dim Sum Restaurant,Eastern European Restaurant,Electronics Store,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,Food Truck,Dog Run,French Restaurant,Fried Chicken Joint,Furniture / Home Store
4,T5C,Edmonton,Central Londonderry,53.6129,-113.4572,6.0,Construction & Landscaping,Wine Shop,French Restaurant,Diner,Discount Store,Distribution Center,Dive Bar,Dog Run,Eastern European Restaurant,Electronics Store,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,Food Truck,Fried Chicken Joint,Department Store,Furniture / Home Store,Gas Station,Gastropub


- the **Cluster Labels** column has been coverted automatically into floats which makes it impossible to use its data for mapping. In order to avoid this problem, cleaning the data in the column and its transformation into integer values was performed:

In [44]:
edmonton_merged = edmonton_merged[~edmonton_merged['Cluster Labels'].isnull()]
edmonton_merged['Cluster Labels'] = edmonton_merged['Cluster Labels'].apply(np.int64)
edmonton_merged.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,T5A,Edmonton,"West Clareview, East Londonderry",53.5899,-113.4413,3,Bus Station,Mexican Restaurant,Buffet,Toy / Game Store,Record Shop,Electronics Store,Food Court,Food & Drink Shop,Flower Shop,Fast Food Restaurant,Eastern European Restaurant,French Restaurant,Dog Run,Dive Bar,Distribution Center,Discount Store,Diner,Food Truck,Wine Shop,Fried Chicken Joint
1,T6A,Edmonton,North Capilano,53.5483,-113.408,3,Ski Trail,Bus Station,Playground,Halal Restaurant,Gymnastics Gym,Department Store,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dive Bar,Dog Run,Eastern European Restaurant,Electronics Store,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,Food Truck,French Restaurant
2,T5B,Edmonton,"East North Central, West Beverly",53.5766,-113.4608,2,Department Store,Smoke Shop,Construction & Landscaping,Print Shop,Grocery Store,Park,Dog Run,Fast Food Restaurant,Electronics Store,Eastern European Restaurant,Discount Store,Dive Bar,Distribution Center,Food & Drink Shop,Diner,Dim Sum Restaurant,Flower Shop,Wine Shop,Food Court,Creperie
3,T6B,Edmonton,"SE Capilano, West Southeast Industrial, East B...",53.5322,-113.4404,3,Home Service,Baseball Field,Business Service,Playground,Discount Store,Distribution Center,Diner,Dive Bar,Dim Sum Restaurant,Eastern European Restaurant,Electronics Store,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,Food Truck,Dog Run,French Restaurant,Fried Chicken Joint,Furniture / Home Store
4,T5C,Edmonton,Central Londonderry,53.6129,-113.4572,6,Construction & Landscaping,Wine Shop,French Restaurant,Diner,Discount Store,Distribution Center,Dive Bar,Dog Run,Eastern European Restaurant,Electronics Store,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,Food Truck,Fried Chicken Joint,Department Store,Furniture / Home Store,Gas Station,Gastropub


- mapping neighbourhoods divided into clusters:

In [45]:
map_clusters_edmonton = folium.Map(location=[latitude_edmonton, longitude_edmonton], zoom_start=11)

x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

markers_colors = []
for lat, lon, poi, cluster in zip(edmonton_merged['Latitude'], edmonton_merged['Longitude'], edmonton_merged['Neighborhood'], edmonton_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters_edmonton)
       
map_clusters_edmonton

#### **4.3.2.Clustering neighborhoods in Calgary**

In [46]:
calgary_grouped_clustering = calgary_grouped.drop('Neighborhood', 1)
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(calgary_grouped_clustering)
kmeans.labels_[0:10] 

array([1, 0, 1, 1, 1, 5, 5, 5, 1, 2], dtype=int32)

In [47]:
calgary_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
calgary_merged = df_calgary

calgary_merged = calgary_merged.join(calgary_venues_sorted.set_index('Neighborhood'), on='Neighborhood')
calgary_merged.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,T2A,Calgary,"Penbrooke Meadows, Marlborough",51.04968,-113.96432,,,,,,,,,,,,,,,,,,,,,
1,T3A,Calgary,"Dalhousie, Edgemont, Hamptons, Hidden Valley",51.12606,-114.143158,5.0,Convenience Store,Asian Restaurant,Gas Station,Café,Food Court,Deli / Bodega,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Falafel Restaurant,Fast Food Restaurant,Flea Market,French Restaurant,Cosmetics Shop,Fried Chicken Joint,Furniture / Home Store
2,T2B,Calgary,"Forest Lawn, Dover, Erin Woods",51.0318,-113.9786,5.0,Bar,Smoke Shop,Convenience Store,Yoga Studio,Flea Market,Curling Ice,Deli / Bodega,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Falafel Restaurant,Fast Food Restaurant,French Restaurant,Food Court,Fried Chicken Joint,Furniture / Home Store
3,T3B,Calgary,"Montgomery, Bowness, Silver Springs, Greenwood",51.0809,-114.1616,1.0,Food Court,Steakhouse,Coffee Shop,Scenic Lookout,Construction & Landscaping,Grocery Store,Gourmet Shop,Curling Ice,Deli / Bodega,Hardware Store,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Gym / Fitness Center,Falafel Restaurant,Fast Food Restaurant
4,T2C,Calgary,"Lynnwood Ridge, Ogden, Foothills Industrial, G...",50.9878,-114.0001,5.0,Convenience Store,Pizza Place,Clothing Store,Diner,Yoga Studio,Curling Ice,Deli / Bodega,Department Store,Dim Sum Restaurant,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Falafel Restaurant,Fast Food Restaurant,Flea Market,Food Court,French Restaurant,Fried Chicken Joint,Furniture / Home Store


- like in the case of Edmonton, the **Cluster Labels** column has been coverted automatically into floats which makes it impossible to use its data for mapping. In order to avoid this problem, cleaning the data in the column and its transformation into integer values was performed:

In [48]:
calgary_merged = calgary_merged[~calgary_merged['Cluster Labels'].isnull()]
calgary_merged['Cluster Labels'] = calgary_merged['Cluster Labels'].apply(np.int64)
calgary_merged.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
1,T3A,Calgary,"Dalhousie, Edgemont, Hamptons, Hidden Valley",51.12606,-114.143158,5,Convenience Store,Asian Restaurant,Gas Station,Café,Food Court,Deli / Bodega,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Falafel Restaurant,Fast Food Restaurant,Flea Market,French Restaurant,Cosmetics Shop,Fried Chicken Joint,Furniture / Home Store
2,T2B,Calgary,"Forest Lawn, Dover, Erin Woods",51.0318,-113.9786,5,Bar,Smoke Shop,Convenience Store,Yoga Studio,Flea Market,Curling Ice,Deli / Bodega,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Falafel Restaurant,Fast Food Restaurant,French Restaurant,Food Court,Fried Chicken Joint,Furniture / Home Store
3,T3B,Calgary,"Montgomery, Bowness, Silver Springs, Greenwood",51.0809,-114.1616,1,Food Court,Steakhouse,Coffee Shop,Scenic Lookout,Construction & Landscaping,Grocery Store,Gourmet Shop,Curling Ice,Deli / Bodega,Hardware Store,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Gym / Fitness Center,Falafel Restaurant,Fast Food Restaurant
4,T2C,Calgary,"Lynnwood Ridge, Ogden, Foothills Industrial, G...",50.9878,-114.0001,5,Convenience Store,Pizza Place,Clothing Store,Diner,Yoga Studio,Curling Ice,Deli / Bodega,Department Store,Dim Sum Restaurant,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Falafel Restaurant,Fast Food Restaurant,Flea Market,Food Court,French Restaurant,Fried Chicken Joint,Furniture / Home Store
5,T3C,Calgary,"Rosscarrock, Westgate, Wildwood, Shaganappi, S...",51.0388,-114.098,1,Mexican Restaurant,Pub,Indian Restaurant,Fast Food Restaurant,Sandwich Place,Spa,Pizza Place,Middle Eastern Restaurant,Breakfast Spot,Bookstore,Board Shop,Candy Store,Fried Chicken Joint,Sports Bar,Gas Station,Bakery,Tattoo Parlor,Thai Restaurant,Health & Beauty Service,Vietnamese Restaurant


In [49]:
map_clusters_calgary = folium.Map(location=[latitude_calgary, longitude_calgary], zoom_start=11)

for lat, lon, poi, cluster in zip(calgary_merged['Latitude'], calgary_merged['Longitude'], calgary_merged['Neighborhood'], calgary_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters_calgary)
       
map_clusters_calgary

## **5.Results and Discussion** <a name="Results"></a>

### **5.1.Results and Discussion: Edmonton**

#### **5.1.1.Cluster 0E**

In [50]:
cluster_0E = edmonton_merged.loc[edmonton_merged['Cluster Labels'] == 0, edmonton_merged.columns[[2] + list(range(5, edmonton_merged.shape[1]))]]
cluster_0E = cluster_0E.drop_duplicates()
cluster_0E = cluster_0E.reset_index(drop=True)
cluster_0E

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,"North Westmount, West Calder, East Mistatim",0,Middle Eastern Restaurant,Furniture / Home Store,Pub,Breakfast Spot,Wine Shop,Electronics Store,Food & Drink Shop,Flower Shop,Fast Food Restaurant,Eastern European Restaurant,Food Truck,Dog Run,Dive Bar,Distribution Center,Discount Store,Diner,Food Court,Fried Chicken Joint,French Restaurant,Department Store
1,East Mill Woods,0,Bakery,Pub,Wine Shop,Food Truck,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dive Bar,Dog Run,Eastern European Restaurant,Electronics Store,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,French Restaurant,Creperie,Fried Chicken Joint,Furniture / Home Store


The neighbourhoods contained in the **0E cluster** have bus stations as well as restaurants among the most common venues. **North Capilano** also has ski trails which can be of a particular interest to the tourists. Hotels are not popular in these areas. At the same time, other places of interest are not common in these neighbourhoods.

#### **5.1.2.Cluster 1E**

In [51]:
cluster_1E = edmonton_merged.loc[edmonton_merged['Cluster Labels'] == 1, edmonton_merged.columns[[2] + list(range(5, edmonton_merged.shape[1]))]]
cluster_1E = cluster_1E.drop_duplicates()
cluster_1E = cluster_1E.reset_index(drop=True)
cluster_1E

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,"Kaskitayo, Aspen Gardens",1,Lake,Wine Shop,French Restaurant,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dive Bar,Dog Run,Eastern European Restaurant,Electronics Store,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,Food Truck,Fried Chicken Joint,Hot Dog Joint,Furniture / Home Store,Gas Station


There are no neighbourhoods of significant tourist interest in **Cluster 1E**.

#### **5.1.3.Cluster 2E**

In [52]:
cluster_2E = edmonton_merged.loc[edmonton_merged['Cluster Labels'] == 2, edmonton_merged.columns[[2] + list(range(5, edmonton_merged.shape[1]))]]
cluster_2E = cluster_2E.drop_duplicates()
cluster_2E = cluster_2E.reset_index(drop=True)
cluster_2E

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,"East North Central, West Beverly",2,Department Store,Smoke Shop,Construction & Landscaping,Print Shop,Grocery Store,Park,Dog Run,Fast Food Restaurant,Electronics Store,Eastern European Restaurant,Discount Store,Dive Bar,Distribution Center,Food & Drink Shop,Diner,Dim Sum Restaurant,Flower Shop,Wine Shop,Food Court,Creperie
1,"South Downtown, South Downtown Fringe (Alberta...",2,Park,Hotel,Baseball Stadium,Sandwich Place,Thai Restaurant,French Restaurant,Grocery Store,Food & Drink Shop,Dim Sum Restaurant,Diner,Halal Restaurant,Discount Store,Distribution Center,Dive Bar,Dog Run,Eastern European Restaurant,Electronics Store,Fast Food Restaurant,Flower Shop,Gymnastics Gym
2,"Glenora, SW Downtown Fringe",2,Portuguese Restaurant,Park,Wine Shop,Creperie,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dive Bar,Dog Run,Eastern European Restaurant,Electronics Store,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Furniture / Home Store
3,South Industrial,2,Business Service,Park,Gym,Golf Driving Range,Wine Shop,Diner,Discount Store,Distribution Center,Dive Bar,Dog Run,Eastern European Restaurant,Electronics Store,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,French Restaurant,Food Truck,Department Store,Fried Chicken Joint
4,Central Beverly,2,Department Store,Smoke Shop,Construction & Landscaping,Print Shop,Grocery Store,Park,Dog Run,Fast Food Restaurant,Electronics Store,Eastern European Restaurant,Discount Store,Dive Bar,Distribution Center,Food & Drink Shop,Diner,Dim Sum Restaurant,Flower Shop,Wine Shop,Food Court,Creperie


In [53]:
cluster_2E.drop(['Cluster Labels'], axis = 1)
cluster_2E.apply(lambda row: row.astype(str).str.contains('Hotel').any(), axis=1)

0    False
1     True
2    False
3    False
4    False
dtype: bool

In [54]:
cluster_2E.apply(lambda row: row.astype(str).str.contains('Hostel').any(), axis=1)

0    False
1    False
2    False
3    False
4    False
dtype: bool

Among the major places of interest for the tourists visiting the neighbourhoods of the 2E cluster are parks and restastaurants. Some of them already have many hotels, for example South Downtown, South Downtown Fringe, this this neighbourhood cannot be recommended for opening a new hotel.

#### **5.1.4.Cluster 3E**

In [55]:
cluster_3E = edmonton_merged.loc[edmonton_merged['Cluster Labels'] == 3, edmonton_merged.columns[[2] + list(range(5, edmonton_merged.shape[1]))]]
cluster_3E = cluster_3E.drop_duplicates()
cluster_3E = cluster_3E.reset_index(drop=True)
cluster_3E

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,"West Clareview, East Londonderry",3,Bus Station,Mexican Restaurant,Buffet,Toy / Game Store,Record Shop,Electronics Store,Food Court,Food & Drink Shop,Flower Shop,Fast Food Restaurant,Eastern European Restaurant,French Restaurant,Dog Run,Dive Bar,Distribution Center,Discount Store,Diner,Food Truck,Wine Shop,Fried Chicken Joint
1,North Capilano,3,Ski Trail,Bus Station,Playground,Halal Restaurant,Gymnastics Gym,Department Store,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dive Bar,Dog Run,Eastern European Restaurant,Electronics Store,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,Food Truck,French Restaurant
2,"SE Capilano, West Southeast Industrial, East B...",3,Home Service,Baseball Field,Business Service,Playground,Discount Store,Distribution Center,Diner,Dive Bar,Dim Sum Restaurant,Eastern European Restaurant,Electronics Store,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,Food Truck,Dog Run,French Restaurant,Fried Chicken Joint,Furniture / Home Store
3,Central Bonnie Doon,3,American Restaurant,Water Park,Trail,Eastern European Restaurant,Food Court,Food & Drink Shop,Flower Shop,Fast Food Restaurant,Electronics Store,Dive Bar,Dog Run,French Restaurant,Distribution Center,Discount Store,Diner,Dim Sum Restaurant,Food Truck,Fried Chicken Joint,Creperie,Furniture / Home Store
4,"West Londonderry, East Calder",3,Grocery Store,Arts & Crafts Store,Hockey Arena,Bakery,Butcher,Baseball Field,Dog Run,Comic Shop,Recreation Center,Electronics Store,Food & Drink Shop,Flower Shop,Fast Food Restaurant,Wine Shop,Eastern European Restaurant,Food Truck,Dive Bar,Distribution Center,Discount Store,Food Court
5,"South Bonnie Doon, East University",3,American Restaurant,Pharmacy,Coffee Shop,Mediterranean Restaurant,Flower Shop,Food Truck,Diner,Discount Store,Distribution Center,Dive Bar,Dog Run,Eastern European Restaurant,Electronics Store,Fast Food Restaurant,Food & Drink Shop,Food Court,French Restaurant,Department Store,Fried Chicken Joint,Furniture / Home Store
6,"North Central, Queen Mary Park, Blatchford",3,Pharmacy,Café,Bakery,Bank,Music Venue,Theater,Electronics Store,Food & Drink Shop,Flower Shop,Fast Food Restaurant,Dog Run,Eastern European Restaurant,Food Truck,Dive Bar,Distribution Center,Discount Store,Diner,Food Court,Wine Shop,French Restaurant
7,"West University, Strathcona Place",3,Theater,College Gym,Sandwich Place,Diner,Restaurant,Bank,College Residence Hall,Miscellaneous Shop,Pub,Juice Bar,Electronics Store,Food & Drink Shop,Flower Shop,Fast Food Restaurant,Wine Shop,Eastern European Restaurant,Dive Bar,Distribution Center,Discount Store,Dog Run
8,"NorthDowntown Fringe, East Downtown Fringe",3,Thai Restaurant,Café,Soccer Stadium,Gym,Grocery Store,Gift Shop,Dog Run,Flower Shop,Fast Food Restaurant,Electronics Store,Eastern European Restaurant,Wine Shop,Food Court,Dive Bar,Distribution Center,Discount Store,Diner,Dim Sum Restaurant,Food & Drink Shop,Fried Chicken Joint
9,"Southgate, North Riverbend",3,Sandwich Place,Boutique,Furniture / Home Store,Distribution Center,Restaurant,Coffee Shop,Food & Drink Shop,Flower Shop,Fast Food Restaurant,Wine Shop,Electronics Store,Food Court,Dog Run,Dive Bar,Discount Store,Diner,Eastern European Restaurant,French Restaurant,Food Truck,Department Store


In [70]:
cluster_3E.drop(['Cluster Labels'], axis = 1)
cluster_3E.apply(lambda row: row.astype(str).str.contains('Hotel').any(), axis=1)

0     False
1     False
2     False
3     False
4     False
5     False
6     False
7     False
8     False
9     False
10     True
11    False
12    False
13    False
14    False
15    False
16     True
17    False
18    False
19    False
20    False
21    False
22    False
23     True
dtype: bool

The 3E cluster contains many neighbourhoods particularly auspicious for opening a new hotel, however, some of them, such as **West Northwest Industrial, Winterburn**, **South Downtown, South Downtown Fringe** and **North Downtown** have hotels among the five most common venues. 
I found the following neighbourhoods particularly auspicious for opening a hotel in this cluster:
- **Central Mistatim** has casinos as the most common venues. It is also a neighbourhood crowded with a variety of restaurants and shops with alcohol. At the same time, hotels are not common here.
- **North Capilano** has a lot of ski trails and playgrounds which are great for active vacations;
- **West University, Strathcona Place** has theaters as the second most common venue. There are also diners and pubs popular in the area. Hotels are not common in the neighbourhood.

#### **5.1.5.Cluster 4E**

In [56]:
cluster_4E = edmonton_merged.loc[edmonton_merged['Cluster Labels'] == 4, edmonton_merged.columns[[2] + list(range(5, edmonton_merged.shape[1]))]]
cluster_4E = cluster_4E.drop_duplicates()
cluster_4E = cluster_4E.reset_index(drop=True)
cluster_4E

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,Heritage Valley,4,Gymnastics Gym,Rest Area,Wine Shop,Food Truck,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dive Bar,Dog Run,Eastern European Restaurant,Electronics Store,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,French Restaurant,Creperie,Fried Chicken Joint,Furniture / Home Store
1,Ellerslie,4,Motorcycle Shop,Gymnastics Gym,French Restaurant,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dive Bar,Dog Run,Eastern European Restaurant,Electronics Store,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,Food Truck,Wine Shop,Hot Dog Joint,Fried Chicken Joint,Furniture / Home Store


**Glenora, SW Downtown Fringe** contained in Cluster 4E has many Portuguese restaurants and wine shops. Gastropubs, gay bars and gift shops are also common in the area.

#### **5.1.6.Cluster 5E**

In [57]:
cluster_5E = edmonton_merged.loc[edmonton_merged['Cluster Labels'] == 5, edmonton_merged.columns[[2] + list(range(5, edmonton_merged.shape[1]))]]
cluster_5E = cluster_5E.drop_duplicates()
cluster_5E = cluster_5E.reset_index(drop=True)
cluster_5E

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,"East Southeast Industrial, South Clover Bar",5,Housing Development,French Restaurant,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dive Bar,Dog Run,Eastern European Restaurant,Electronics Store,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,Food Truck,Fried Chicken Joint,Hot Dog Joint,Furniture / Home Store,Gas Station,Gastropub


**Cluster 5E** includes **Southwest Edmonton** which is an interesting neighbourhood since it has paintball fields as the most common venues. Yet, other venues potentially attractive to tourists are not popular in the area.

#### **5.1.7.Cluster 6E**

In [58]:
cluster_6E = edmonton_merged.loc[edmonton_merged['Cluster Labels'] == 6, edmonton_merged.columns[[2] + list(range(5, edmonton_merged.shape[1]))]]
cluster_6E = cluster_6E.drop_duplicates()
cluster_6E = cluster_6E.reset_index(drop=True)
cluster_6E

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,Central Londonderry,6,Construction & Landscaping,Wine Shop,French Restaurant,Diner,Discount Store,Distribution Center,Dive Bar,Dog Run,Eastern European Restaurant,Electronics Store,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,Food Truck,Fried Chicken Joint,Department Store,Furniture / Home Store,Gas Station,Gastropub


**Kaskitayo, Aspen Gardens** which is the only neighbourhood in **Cluster 6E** can be of potential tourist interest since the most frequent venue here is a lake. There are some restaurants in the area as well and hotels or hostels are not frequent in this area.

#### **5.1.8.Conclusions Edmonton**

Taking into consideration the analysis made above, I found **Central Mistatim**, **North Capilano** and **West University, Strathcona Place** to be of the greatest interest to potential investors. These neighbourhoods are located in the 2E cluster.

### **5.2.Results and Discussion: Calgary**

#### **5.2.1.Cluster 0C**

In [59]:
cluster_0C = calgary_merged.loc[calgary_merged['Cluster Labels'] == 0, calgary_merged.columns[[2] + list(range(5, calgary_merged.shape[1]))]]
cluster_0C = cluster_0C.drop_duplicates()
cluster_0C = cluster_0C.reset_index(drop=True)
cluster_0C

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,"Brentwood, Collingwood, Nose Hill",0,Asian Restaurant,Yoga Studio,French Restaurant,Deli / Bodega,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Falafel Restaurant,Fast Food Restaurant,Flea Market,Food Court,Fried Chicken Joint,Cosmetics Shop,Furniture / Home Store,Gas Station,Gastropub


**Cluster 0C** has some venues attractive for tourists whereas it is not occupied by hotels or hostels.

#### **5.2.2.Cluster 1C**

In [60]:
cluster_1C = calgary_merged.loc[calgary_merged['Cluster Labels'] == 1, calgary_merged.columns[[2] + list(range(5, calgary_merged.shape[1]))]]
cluster_1C = cluster_1C.drop_duplicates()
cluster_1C = cluster_1C.reset_index(drop=True)
cluster_1C

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,"Montgomery, Bowness, Silver Springs, Greenwood",1,Food Court,Steakhouse,Coffee Shop,Scenic Lookout,Construction & Landscaping,Grocery Store,Gourmet Shop,Curling Ice,Deli / Bodega,Hardware Store,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Gym / Fitness Center,Falafel Restaurant,Fast Food Restaurant
1,"Rosscarrock, Westgate, Wildwood, Shaganappi, S...",1,Mexican Restaurant,Pub,Indian Restaurant,Fast Food Restaurant,Sandwich Place,Spa,Pizza Place,Middle Eastern Restaurant,Breakfast Spot,Bookstore,Board Shop,Candy Store,Fried Chicken Joint,Sports Bar,Gas Station,Bakery,Tattoo Parlor,Thai Restaurant,Health & Beauty Service,Vietnamese Restaurant
2,"Bridgeland, Greenview, Zoo, YYC",1,Fast Food Restaurant,Scenic Lookout,Dim Sum Restaurant,Grocery Store,Noodle House,Chinese Restaurant,Pharmacy,Restaurant,Falafel Restaurant,Sandwich Place,Indian Restaurant,Middle Eastern Restaurant,Seafood Restaurant,Convenience Store,Steakhouse,Sushi Restaurant,Bank,Vietnamese Restaurant,Gym / Fitness Center,Asian Restaurant
3,"Inglewood, Burnsland, Chinatown, East Victoria...",1,Hotel,Pub,Coffee Shop,Performing Arts Venue,Theater,Italian Restaurant,New American Restaurant,Steakhouse,Restaurant,Deli / Bodega,Cocktail Bar,Asian Restaurant,Hookah Bar,Liquor Store,Middle Eastern Restaurant,Museum,Ice Cream Shop,Hostel,Yoga Studio,Wine Shop
4,"Highfield, Burns Industrial",1,American Restaurant,Karaoke Bar,Skating Rink,Gym / Fitness Center,Flea Market,Curling Ice,Deli / Bodega,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Falafel Restaurant,Fast Food Restaurant,Food Court,Gym,Convenience Store,French Restaurant
5,"Queensland, Lake Bonavista, Willow Park, Acadia",1,Pizza Place,Chinese Restaurant,Insurance Office,Child Care Service,Yoga Studio,Flea Market,Deli / Bodega,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Falafel Restaurant,Fast Food Restaurant,Food Court,Cosmetics Shop,French Restaurant,Fried Chicken Joint
6,"Sandstone, MacEwan Glen, Beddington, Harvest H...",1,Pizza Place,Pharmacy,Italian Restaurant,Coffee Shop,Bank,Liquor Store,Grocery Store,Dim Sum Restaurant,Falafel Restaurant,Department Store,Diner,Discount Store,Dog Run,Deli / Bodega,Donut Shop,Dry Cleaner,Yoga Studio,Flea Market,Fast Food Restaurant,Cosmetics Shop
7,"Tuscany, Scenic Acres",1,Convenience Store,Video Store,Liquor Store,Pharmacy,Pizza Place,Pub,Chinese Restaurant,Diner,Donut Shop,Dog Run,Discount Store,Yoga Studio,Dry Cleaner,Department Store,Deli / Bodega,Curling Ice,Dim Sum Restaurant,Falafel Restaurant,Fast Food Restaurant,Flea Market
8,"Mount Pleasant, Capitol Hill, Banff Trail",1,Coffee Shop,American Restaurant,Sushi Restaurant,Comic Shop,Chinese Restaurant,Pub,Mediterranean Restaurant,Sandwich Place,Gas Station,Fast Food Restaurant,Vietnamese Restaurant,Gastropub,Gift Shop,Deli / Bodega,Gym,Department Store,Dim Sum Restaurant,Grocery Store,Diner,Gourmet Shop
9,"Kensington, Westmont, Parkdale, University",1,Hobby Shop,Furniture / Home Store,Park,Gym / Fitness Center,Gym,Cosmetics Shop,Curling Ice,Deli / Bodega,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Falafel Restaurant,Fast Food Restaurant,Flea Market,History Museum,French Restaurant


In [61]:
cluster_1C.drop(['Cluster Labels'], axis = 1)
cluster_1C.apply(lambda row: row.astype(str).str.contains('Hotel').any(), axis=1)

0     False
1     False
2     False
3      True
4     False
5     False
6     False
7     False
8     False
9     False
10     True
11    False
12     True
13    False
14    False
15    False
16     True
17    False
18    False
dtype: bool

In [62]:
cluster_1C.apply(lambda row: row.astype(str).str.contains('Hostel').any(), axis=1)

0     False
1     False
2     False
3      True
4     False
5     False
6     False
7     False
8     False
9     False
10    False
11    False
12    False
13    False
14    False
15    False
16    False
17    False
18    False
dtype: bool

**Cluster 1C** doesn't contain any neighbourhoods where hotels and hostels would be frequent venues. At the same time, judging by the most frequent venues in all of the neighbourhoods in this sluster, these areas are particularly attractive for shopping. 
Applying the criteria for the data of this project, the following neighbourhoods are chosen as the most auspicious for a hotel business:
- **Forest Lawn, Dover, Erin Woods** - has restaurants as well as shops. Vintage stores and bars are among the most frequent venues in this area;
- **Lynnwood Ridge, Ogden, Foothills Industrial** - has different stores and restaurants while clothing stores are particularly frequent in this area;
- **Thorncliffe, Tuxedo Park** - has different stores including kids stores as the most frequent venues and gourmet shops. The area also has bars.
These neighbourhoods can be appealing for shopping tourism, however, it i scrucial to pay attention to factories which are also widespraed in these areas.

#### **5.2.3.Cluster 2C**

In [63]:
cluster_2C = calgary_merged.loc[calgary_merged['Cluster Labels'] == 2, calgary_merged.columns[[2] + list(range(5, calgary_merged.shape[1]))]]
cluster_2C = cluster_2C.drop_duplicates()
cluster_2C = cluster_2C.reset_index(drop=True)
cluster_2C

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,"Lakeview, Glendale, Killarney, Glamorgan",2,Wine Shop,Coffee Shop,Cosmetics Shop,Curling Ice,Deli / Bodega,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Falafel Restaurant,Fast Food Restaurant,Flea Market,Yoga Studio,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gas Station
1,"Elbow Park, Britannia, Parkhill, Mission",2,Coffee Shop,Japanese Restaurant,Yoga Studio,Food Court,Deli / Bodega,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Falafel Restaurant,Fast Food Restaurant,Flea Market,French Restaurant,Cosmetics Shop,Fried Chicken Joint,Furniture / Home Store,Gas Station


In [64]:
cluster_2C.drop(['Cluster Labels'], axis = 1)
cluster_2C.apply(lambda row: row.astype(str).str.contains('Hotel').any(), axis=1)

0    False
1    False
dtype: bool

In [65]:
cluster_2C.apply(lambda row: row.astype(str).str.contains('Hostel').any(), axis=1)

0    False
1    False
dtype: bool

**Cluster 2C** contains very attractive areas for tourists. This cluster is filled with neighbourhoods offering various gastronomic venues, however the Inglewood, Burnsland, Chinatown, East Victoria... neighbourhoods have hostels and hotels among their frequent venues whereas Connaught, West Victoria Park and Rundle, Whitehorn, Monterey Park have only hotels as frequent venues. For that reason, these neighbourhoods will not be taken into consideration.
These are some of the most attractive areas which still do not have too many hotels:
- **Montgomery, Bowness, Silver Springs, Greenwood** - has movie theatres as the second common venue as well as a variety of places for eating out;
- **Lynnwood Ridge, Ogden, Foothills Industrial...** - is also featured by a number of different dining options;
- **Rosscarrock, Westgate, Wildwood, Shaganappi ...** - apart from a variety of restaurants with different cuisine, the area's most frequent venue is a pub. In addition to it, a spa is also a common venue here;
- **City Centre, Calgary Tower** - has a large variety of restaurants and many pubs, bars, breweries as well as Indie movie theatres;
- **Oak Ridge, Haysboro, Kingsland, Kelvin Grove** - offers a variety of restaurants, pubs as well as shopping facilities.

#### **5.2.4.Cluster 3C**

In [66]:
cluster_3C = calgary_merged.loc[calgary_merged['Cluster Labels'] == 3, calgary_merged.columns[[2] + list(range(5, calgary_merged.shape[1]))]]
cluster_3C = cluster_3C.drop_duplicates()
cluster_3C = cluster_3C.reset_index(drop=True)
cluster_3C

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,"Martindale, Taradale, Falconridge, Saddle Ridge",3,Moving Target,Dog Run,Shop & Service,Yoga Studio,Convenience Store,Curling Ice,Deli / Bodega,Department Store,Dim Sum Restaurant,Diner,Discount Store,Donut Shop,Dry Cleaner,Falafel Restaurant,Fast Food Restaurant,Flea Market,Food Court,French Restaurant,Fried Chicken Joint,Furniture / Home Store


#### **5.2.5.Cluster 4C**

In [67]:
cluster_4C = calgary_merged.loc[calgary_merged['Cluster Labels'] == 4, calgary_merged.columns[[2] + list(range(5, calgary_merged.shape[1]))]]
cluster_4C = cluster_4C.drop_duplicates()
cluster_4C = cluster_4C.reset_index(drop=True)
cluster_4C

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,Northwest Calgary,4,Flea Market,Yoga Studio,Food Court,Curling Ice,Deli / Bodega,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Falafel Restaurant,Fast Food Restaurant,French Restaurant,Wine Shop,Fried Chicken Joint,Furniture / Home Store,Gas Station,Gastropub


#### **5.2.6.Cluster 5C**

In [68]:
cluster_5C = calgary_merged.loc[calgary_merged['Cluster Labels'] == 5, calgary_merged.columns[[2] + list(range(5, calgary_merged.shape[1]))]]
cluster_5C = cluster_5C.drop_duplicates()
cluster_5C = cluster_5C.reset_index(drop=True)
cluster_5C

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,"Dalhousie, Edgemont, Hamptons, Hidden Valley",5,Convenience Store,Asian Restaurant,Gas Station,Café,Food Court,Deli / Bodega,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Falafel Restaurant,Fast Food Restaurant,Flea Market,French Restaurant,Cosmetics Shop,Fried Chicken Joint,Furniture / Home Store
1,"Forest Lawn, Dover, Erin Woods",5,Bar,Smoke Shop,Convenience Store,Yoga Studio,Flea Market,Curling Ice,Deli / Bodega,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Falafel Restaurant,Fast Food Restaurant,French Restaurant,Food Court,Fried Chicken Joint,Furniture / Home Store
2,"Lynnwood Ridge, Ogden, Foothills Industrial, G...",5,Convenience Store,Pizza Place,Clothing Store,Diner,Yoga Studio,Curling Ice,Deli / Bodega,Department Store,Dim Sum Restaurant,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Falafel Restaurant,Fast Food Restaurant,Flea Market,Food Court,French Restaurant,Fried Chicken Joint,Furniture / Home Store
3,"Hawkwood, Arbour Lake, Citadel, Ranchlands, Ro...",5,Pizza Place,Pub,Bar,Middle Eastern Restaurant,Yoga Studio,Fast Food Restaurant,Curling Ice,Deli / Bodega,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Falafel Restaurant,Flea Market,Convenience Store,Food Court,French Restaurant
4,"Discovery Ridge, Signal Hill, West Springs, Ch...",5,Vietnamese Restaurant,Convenience Store,Pizza Place,Gas Station,Bar,Yoga Studio,Discount Store,Dry Cleaner,Donut Shop,Dog Run,Diner,Fast Food Restaurant,Dim Sum Restaurant,Department Store,Deli / Bodega,Curling Ice,Falafel Restaurant,French Restaurant,Flea Market,Food Court
5,"Thorncliffe, Tuxedo Park",5,Vietnamese Restaurant,Park,Home Service,Fast Food Restaurant,Convenience Store,Bar,Kids Store,Department Store,Dim Sum Restaurant,Diner,Flea Market,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Falafel Restaurant,Deli / Bodega,Curling Ice,French Restaurant,Food Court
6,"Cranston, Auburn Bay, Mahogany",5,Cosmetics Shop,Pizza Place,Liquor Store,Convenience Store,Curling Ice,Deli / Bodega,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Falafel Restaurant,Fast Food Restaurant,Yoga Studio,Food Court,French Restaurant,Fried Chicken Joint,Furniture / Home Store
7,Symons Valley,5,Convenience Store,Home Service,Food Court,Curling Ice,Deli / Bodega,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Falafel Restaurant,Fast Food Restaurant,Flea Market,French Restaurant,History Museum,Fried Chicken Joint,Furniture / Home Store,Gas Station


#### **5.2.7.Cluster 6C**

In [69]:
cluster_6C = calgary_merged.loc[calgary_merged['Cluster Labels'] == 6, calgary_merged.columns[[2] + list(range(5, calgary_merged.shape[1]))]]
cluster_6C = cluster_6C.drop_duplicates()
cluster_6C = cluster_6C.reset_index(drop=True)
cluster_6C

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,"Midnapore, Sundance",6,Hardware Store,Yoga Studio,Food Court,Curling Ice,Deli / Bodega,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Donut Shop,Dry Cleaner,Falafel Restaurant,Fast Food Restaurant,Flea Market,French Restaurant,Wine Shop,Fried Chicken Joint,Furniture / Home Store,Gas Station


**Clusters 3C, 4C, 5C** and **6C** are quite similar to each other as well as to some of the neighbourhoods from the previous clusters. They have some attractions for tourists, however, other neighbourhoods mentioned above are featured by a greater variety of appealing venues.

#### **5.2.8.Conclusions - Calgary**

Since the goal of this project is choosing three top neighbourhoods for opening a new hotel in Edmonton and Calgary respectively, the final choice for Calgary is:
- **Rosscarrock, Westgate, Wildwood, Shaganappi ...** as it is a place full of not only restaurants, but also pubs. Spas located in this area are an additional attraction for tourists;
- **City Centre, Calgary Tower** - this location is itself the centre of the city and it is a place appealing to tourists during daytime and nightime;
- **Oak Ridge, Haysboro, Kingsland, Kelvin Grove** - this neighbourhood offers a mixture of restaurants, pubs as well as shopping facilities.

### **5.3.Edmonton vs Calgary - comparison based on the results of kMeans algorythm**

The analysis of the neighbourhoods of Edmonton and Calgary has shown a great difference between the distribution of venues around these cities of Alberta, even though their population and areas are quite similar.
Needless to say, as a rule, the most attractive part of a city for tourists as well as city residents is its centre. When it comes to Edmonton, the central neighbourhoods of this city are located around the Downtown part. These neighbourhoods are occupied by various places of interest and entertaining, however, there is also the largest number of hotels and hostels.
In Calgary, the city centre is not crowded by accommodation facilities which makes it a potential place for opening a new hotel. Yet, this fact still not guarantees that in the reality the City Centre of Calgary will turn out to be the best choice of a location for a new hotel. There might be factors influencing the distribution of hotel sin this area making them unpopular venues. 
At the same time, the top neighbourhoods of Edmonton such as **Central Mistatim** and **West University, Strathcona Place** provide visitors with more infrastructure for entertainment than the neighbourhoods of Calgary. Central Mistatim is home for casinos whereas West University, Strathcona Place is a great place for theatre devotees.

After all the analysis performed in the current project, it becomes obvious that the hospitality business  is more advanced in Calgary. 
In this city, more areas are featured by particularly high performance in the hospitality business than in Edmonton. Occupancy of the hotels in this city is higher than in Edmonton and the accommodation venues tend to generate higher revenue. At the same time, the most efficient locations are characterised by growth in RevPAR unlike such locations in Edmonton.
It is also crucial to point out that all of the three neighbourhoods chosen after the machine learning analysis was completed are lying within the city area with top performance on the market of the hospitality business. In the case of Edmonton, there was only one such neighbourhood among the chosen ones and, in addition to it, the area it is situated in is featured by the decrease in RevPAR.
Taking into consideration these facts, I would recommend the neighbourhoods of Calgary to the future investors.

## **6.Conclusion** <a name="Conclusion"></a>

In this project, I used programming and machine learning techniques such as web scrapping, retrieving geospatial data as well clustering it in order to understand the potential of Edmonton and Calgary on the market of the hospitality business. The results achieved in such a way were compared to the  statistical information provided by CBRE hotels.
Based on all of  scrutinising the data, I would like to recommend such neighbourhoods as - Rosscarrock, Westgate, Wildwood, Shaganappi; City Centre, Calgary Tower and Oak Ridge, Haysboro, Kingsland, Kelvin Grove as future locations for new hotels. All of them are located in Calgary which appears in the results of the project as  a more auspicious place for the hospitality business than Edmonton.
Still, the methods applied in this projects were not enough for precising a single best location out of the three ones mentioned above. Other features might be needed for considering in order to choose a single best location.