# Capstone Project - The Battle of the Neighborhoods (Week 2)
### Applied Data Science Capstone by IBM/Coursera

## Table of contents
* [Introduction](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## 1.Introduction
### 1.1 Background
Edmonton is the capital city of the Canadian province of Alberta. It is the second largest city and Canada’s fifth largest municipality. Edmonton being the major economic centre for northern and central Alberta, is a favourable city to start a new in demand business.
### 1.2 Business Problem
As Edmonton is one of the highly populated city there might be gas stations located across its neighbourhoods. This project aims to predict the neighbourhoods suitable to open a gas station in  Edmonton.


In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analsysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes
import requests # library to handle requests
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans



print('Libraries imported.')

Collecting package metadata (current_repodata.json): done
Solving environment: done


  current version: 4.8.3
  latest version: 4.8.4

Please update conda by running

    $ conda update -n base -c defaults conda



# All requested packages already installed.

Libraries imported.


In [2]:
pip install lxml

Note: you may need to restart the kernel to use updated packages.


## 2. Data acquisition and cleaning
### 2.1 Data Sources
Since the requirement is to find the neighbourhoods, it can be found in 
https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_T. FOURSQUARE API can be used to get the information about the avaiable venues present in the neighbourhoods.

### 2.2 Data Cleaning
Data downloaded from the above mentioned sources and stored in a table.
There are some neighbourhoods with out latitude and longitude positions in the dataset. So, removed those neighbourhoods from the table.


In [3]:
df_init = pd.read_html("https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_T")
len(df_init)

6

In [4]:
# define the dataframe columns
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude'] 

# instantiate the dataframe
neighborhoods = pd.DataFrame(columns=column_names)
neighborhoods

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude


In [5]:
neighborhoods=df_init[1]

In [6]:
neighborhoods.dtypes

Postal Code     object
Borough         object
Neighborhood    object
Latitude        object
Longitude       object
dtype: object

In [7]:
neighborhoods = neighborhoods[neighborhoods.Latitude!='Not assigned']
neighborhoods = neighborhoods[neighborhoods.Longitude!='Not assigned']

neighborhoods['Latitude'] = neighborhoods['Latitude'].astype('float64')
neighborhoods['Longitude'] = neighborhoods['Longitude'].astype('float64')


In [8]:
neighborhoods.dtypes

Postal Code      object
Borough          object
Neighborhood     object
Latitude        float64
Longitude       float64
dtype: object

In [9]:
from geopy.geocoders import Nominatim
address = 'Alberta'

geolocator = Nominatim(user_agent="ab_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of Alberta are {}, {}.'.format(latitude, longitude))

The geograpical coordinate of Alberta are 55.001251, -115.002136.


In [10]:
!conda install -c conda-forge folium=0.5.0 --yes
import folium 


Collecting package metadata (current_repodata.json): done
Solving environment: done


  current version: 4.8.3
  latest version: 4.8.4

Please update conda by running

    $ conda update -n base -c defaults conda



# All requested packages already installed.



In [11]:
Edmonton_data = neighborhoods[neighborhoods['Borough'] == 'Edmonton'].reset_index(drop=True)
Edmonton_data

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude
0,T5A,Edmonton,"West Clareview, East Londonderry",53.5899,-113.4413
1,T6A,Edmonton,North Capilano,53.5483,-113.408
2,T5B,Edmonton,"East North Central, West Beverly",53.5766,-113.4608
3,T6B,Edmonton,"SE Capilano, West Southeast Industrial, East B...",53.5322,-113.4404
4,T5C,Edmonton,Central Londonderry,53.6129,-113.4572
5,T6C,Edmonton,Central Bonnie Doon,53.5182,-113.4769
6,T5E,Edmonton,"West Londonderry, East Calder",53.5923,-113.5168
7,T6E,Edmonton,"South Bonnie Doon, East University",53.5087,-113.5078
8,T5G,Edmonton,"North Central, Queen Mary Park, Blatchford",53.5682,-113.4822
9,T6G,Edmonton,"West University, Strathcona Place",53.5248,-113.5334


## 3. Methodology
In order to extract information about the venues present in different neighbourhoods, assumption is to have them clustered. I chose to use “K-Means Clustering Algorithm”. K-means is a type of unsupervised learning, which is generally used for unlabelled data (data without defined categories or groups). 

K-means groups each data point present in the data, given by the number K. It iteratively assign each data point to one of K groups based on the features of that data point.

### 3.1 Analysis
Steps involved during analysis:
    • Download and explore Dataset:
      Information about the neighbourhood location were downloaded, cleansed.
    • Create a map of Edmonton with neighborhoods superimposed on top:
      This is to visualize the neighbourhood location present in Edmonton.
    • Explore neighbourhoods in Edomonton to get details of the venues using FOURSQUARE API:
      This is get the details about the venues present in every neighbourhood of  Edmonton
    • Analyse every neighbourhood:
      Group the neighbourhoods depending on the frequency of each category venues.
    • Cluster neighbourhoods using K-Means:
      Clustered neighbourhoods into 5 categories.                               
    • Examine Clusters:
      This is to get the list of neighbourhoods with top 10 venues for each neighbourhood.


### Create a map of Edmonton with neighborhoods superimposed on top

In [12]:
address = 'Edmonton, Alberta'

geolocator = Nominatim(user_agent="em_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of Edmonton are {}, {}.'.format(latitude, longitude))

The geograpical coordinate of Edmonton are 53.535411, -113.507996.


In [13]:
# create map of Manhattan using latitude and longitude values
map_edmonton = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(Edmonton_data['Latitude'], Edmonton_data['Longitude'], Edmonton_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='red',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_edmonton)  
    
map_edmonton

### Explore neighbourhoods in Edomonton to get details of the venues using FOURSQUARE API

In [14]:
CLIENT_ID = '42XSAG0GGS0LYX1JC3FCE1D5RDY4V412505LA2ZHBNEYTQUA' # your Foursquare ID
CLIENT_SECRET = 'B4UKKDCO5UUBQUUS4WJ1M02PWHGWOT5IWGXVZIZ0H3BWLUWY' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: 42XSAG0GGS0LYX1JC3FCE1D5RDY4V412505LA2ZHBNEYTQUA
CLIENT_SECRET:B4UKKDCO5UUBQUUS4WJ1M02PWHGWOT5IWGXVZIZ0H3BWLUWY


In [15]:
def getNearbyVenues(names, latitudes, longitudes):
    radius=500
    LIMIT=100
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [16]:
Edmonton_venues = getNearbyVenues(names=Edmonton_data['Neighborhood'],
                                   latitudes=Edmonton_data['Latitude'],
                                   longitudes=Edmonton_data['Longitude']
                                  )


West Clareview, East Londonderry
North Capilano
East North Central, West Beverly
SE Capilano, West Southeast Industrial, East Bonnie Doon
Central Londonderry
Central Bonnie Doon
West Londonderry, East Calder
South Bonnie Doon, East University
North Central, Queen Mary Park, Blatchford
West University, Strathcona Place
NorthDowntown Fringe, East Downtown Fringe
Southgate, North Riverbend
North Downtown
Kaskitayo, Aspen Gardens
South Downtown, South Downtown Fringe (Alberta Provincial Government)
West Mill Woods
North Westmount, West Calder, East Mistatim
East Mill Woods
South Westmount, Groat Estate, East Northwest Industrial
Southwest Edmonton
Glenora, SW Downtown Fringe
South Industrial
North Jasper Place
East Southeast Industrial, South Clover Bar
Central Jasper Place, Buena Vista
Southgate, North Riverbend
West Northwest Industrial, Winterburn
North Clover Bar
West Jasper Place, West Edmonton Mall
The Meadows
Central Mistatim
The Palisades, West Castle Downs
Central Beverly
Heritage

In [17]:
print(Edmonton_venues.shape)
Edmonton_venues.head()

(309, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"West Clareview, East Londonderry",53.5899,-113.4413,Café del Sol,53.592441,-113.441455,Mexican Restaurant
1,"West Clareview, East Londonderry",53.5899,-113.4413,Buffet Royale Carvery,53.587229,-113.439075,Buffet
2,"West Clareview, East Londonderry",53.5899,-113.4413,Red Claw Gaming,53.586937,-113.439775,Toy / Game Store
3,"West Clareview, East Londonderry",53.5899,-113.4413,My Grandma's Attic,53.586033,-113.441629,Record Shop
4,"West Clareview, East Londonderry",53.5899,-113.4413,Belvedere Transit Centre,53.587932,-113.435254,Bus Station


In [18]:
Edmonton_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Central Beverly,4,4,4,4,4,4
Central Bonnie Doon,5,5,5,5,5,5
"Central Jasper Place, Buena Vista",8,8,8,8,8,8
Central Mistatim,3,3,3,3,3,3
East Castledowns,8,8,8,8,8,8
East Mill Woods,2,2,2,2,2,2
"East North Central, West Beverly",4,4,4,4,4,4
"East Southeast Industrial, South Clover Bar",3,3,3,3,3,3
Ellerslie,2,2,2,2,2,2
"Glenora, SW Downtown Fringe",1,1,1,1,1,1


In [19]:
print('There are {} uniques categories.'.format(len(Edmonton_venues['Venue Category'].unique())))

There are 119 uniques categories.


### Analyse every neighbourhood

In [20]:
# one hot encoding
Edmonton_onehot = pd.get_dummies(Edmonton_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
Edmonton_onehot['Neighborhood'] = Edmonton_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [Edmonton_onehot.columns[-1]] + list(Edmonton_onehot.columns[:-1])
Edmonton_onehot = Edmonton_onehot[fixed_columns]

Edmonton_onehot.head()

Unnamed: 0,Neighborhood,American Restaurant,Arts & Crafts Store,Asian Restaurant,Bakery,Bank,Baseball Field,Baseball Stadium,Big Box Store,Bookstore,Breakfast Spot,Brewery,Buffet,Burger Joint,Bus Station,Business Service,Butcher,Café,Casino,Cheese Shop,Chinese Restaurant,Clothing Store,Coffee Shop,College Gym,College Residence Hall,Comic Shop,Construction & Landscaping,Convenience Store,Cosmetics Shop,Creperie,Department Store,Diner,Discount Store,Distribution Center,Dog Run,Eastern European Restaurant,Electronics Store,Falafel Restaurant,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Truck,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gas Station,Gastropub,Gay Bar,Gift Shop,Golf Course,Golf Driving Range,Grocery Store,Gym,Gymnastics Gym,Halal Restaurant,Hockey Arena,Home Service,Hot Dog Joint,Hotel,Housing Development,Ice Cream Shop,Indian Restaurant,Irish Pub,Italian Restaurant,Japanese Restaurant,Lake,Light Rail Station,Liquor Store,Lounge,Massage Studio,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Motorcycle Shop,Movie Theater,Museum,Music Venue,New American Restaurant,Nightclub,Noodle House,Office,Paper / Office Supplies Store,Park,Pet Store,Pharmacy,Pizza Place,Playground,Plaza,Pool Hall,Portuguese Restaurant,Pub,Record Shop,Recreation Center,Rental Car Location,Rest Area,Restaurant,Rock Club,Salad Place,Sandwich Place,Shopping Mall,Skating Rink,Ski Trail,Smoke Shop,Soccer Stadium,Steakhouse,Supermarket,Sushi Restaurant,Tapas Restaurant,Thai Restaurant,Theater,Toy / Game Store,Trail,Turkish Restaurant,Vietnamese Restaurant,Warehouse Store,Water Park,Whisky Bar,Wine Shop
0,"West Clareview, East Londonderry",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,"West Clareview, East Londonderry",0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,"West Clareview, East Londonderry",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0
3,"West Clareview, East Londonderry",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,"West Clareview, East Londonderry",0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [21]:
Edmonton_onehot.shape

(309, 120)

In [22]:
Edmonton_grouped = Edmonton_onehot.groupby('Neighborhood').mean().reset_index()
Edmonton_grouped

Unnamed: 0,Neighborhood,American Restaurant,Arts & Crafts Store,Asian Restaurant,Bakery,Bank,Baseball Field,Baseball Stadium,Big Box Store,Bookstore,Breakfast Spot,Brewery,Buffet,Burger Joint,Bus Station,Business Service,Butcher,Café,Casino,Cheese Shop,Chinese Restaurant,Clothing Store,Coffee Shop,College Gym,College Residence Hall,Comic Shop,Construction & Landscaping,Convenience Store,Cosmetics Shop,Creperie,Department Store,Diner,Discount Store,Distribution Center,Dog Run,Eastern European Restaurant,Electronics Store,Falafel Restaurant,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Truck,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gas Station,Gastropub,Gay Bar,Gift Shop,Golf Course,Golf Driving Range,Grocery Store,Gym,Gymnastics Gym,Halal Restaurant,Hockey Arena,Home Service,Hot Dog Joint,Hotel,Housing Development,Ice Cream Shop,Indian Restaurant,Irish Pub,Italian Restaurant,Japanese Restaurant,Lake,Light Rail Station,Liquor Store,Lounge,Massage Studio,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Motorcycle Shop,Movie Theater,Museum,Music Venue,New American Restaurant,Nightclub,Noodle House,Office,Paper / Office Supplies Store,Park,Pet Store,Pharmacy,Pizza Place,Playground,Plaza,Pool Hall,Portuguese Restaurant,Pub,Record Shop,Recreation Center,Rental Car Location,Rest Area,Restaurant,Rock Club,Salad Place,Sandwich Place,Shopping Mall,Skating Rink,Ski Trail,Smoke Shop,Soccer Stadium,Steakhouse,Supermarket,Sushi Restaurant,Tapas Restaurant,Thai Restaurant,Theater,Toy / Game Store,Trail,Turkish Restaurant,Vietnamese Restaurant,Warehouse Store,Water Park,Whisky Bar,Wine Shop
0,Central Beverly,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Central Bonnie Doon,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.2,0.0,0.0
2,"Central Jasper Place, Buena Vista",0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Central Mistatim,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0
4,East Castledowns,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.25,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,East Mill Woods,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,"East North Central, West Beverly",0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,"East Southeast Industrial, South Clover Bar",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Ellerslie,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,"Glenora, SW Downtown Fringe",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [23]:
Edmonton_grouped.shape

(35, 120)

In [24]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [25]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = Edmonton_grouped['Neighborhood']

for ind in np.arange(Edmonton_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(Edmonton_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Central Beverly,Smoke Shop,Arts & Crafts Store,Construction & Landscaping,Grocery Store,Furniture / Home Store,Distribution Center,Dog Run,Eastern European Restaurant,Electronics Store,Falafel Restaurant
1,Central Bonnie Doon,American Restaurant,Trail,Grocery Store,Cosmetics Shop,Water Park,Asian Restaurant,Fried Chicken Joint,Discount Store,Distribution Center,Dog Run
2,"Central Jasper Place, Buena Vista",Fast Food Restaurant,Pizza Place,Convenience Store,Sushi Restaurant,Café,Sandwich Place,Bakery,Dog Run,Eastern European Restaurant,Falafel Restaurant
3,Central Mistatim,Warehouse Store,Liquor Store,Casino,Wine Shop,Fried Chicken Joint,Discount Store,Distribution Center,Dog Run,Eastern European Restaurant,Electronics Store
4,East Castledowns,Plaza,Bakery,Bus Station,Construction & Landscaping,Recreation Center,Playground,Skating Rink,Distribution Center,Discount Store,Dog Run


### Cluster neighbourhoods using K-Means

In [26]:
from sklearn.cluster import KMeans
kclusters = 5

Edmonton_grouped_clustering = Edmonton_grouped.drop('Neighborhood', 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(Edmonton_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([0, 1, 1, 1, 0, 1, 0, 0, 1, 3], dtype=int32)

In [27]:
# add clustering labels

neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

Edmonton_merged = Edmonton_data

# merge toronto_grouped with toronto_data to add latitude/longitude for each neighborhood
Edmonton_merged = Edmonton_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')


In [28]:
Edmonton_merged.head()

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,T5A,Edmonton,"West Clareview, East Londonderry",53.5899,-113.4413,0.0,Mexican Restaurant,Bus Station,Buffet,Record Shop,Toy / Game Store,Flower Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food & Drink Shop
1,T6A,Edmonton,North Capilano,53.5483,-113.408,0.0,Ski Trail,Golf Course,Playground,Park,Food Truck,Food & Drink Shop,Flower Shop,Fast Food Restaurant,Falafel Restaurant,Wine Shop
2,T5B,Edmonton,"East North Central, West Beverly",53.5766,-113.4608,0.0,Smoke Shop,Arts & Crafts Store,Construction & Landscaping,Grocery Store,Furniture / Home Store,Distribution Center,Dog Run,Eastern European Restaurant,Electronics Store,Falafel Restaurant
3,T6B,Edmonton,"SE Capilano, West Southeast Industrial, East B...",53.5322,-113.4404,0.0,Business Service,Playground,Home Service,Baseball Field,Fast Food Restaurant,French Restaurant,Food Truck,Food & Drink Shop,Flower Shop,Electronics Store
4,T5C,Edmonton,Central Londonderry,53.6129,-113.4572,,,,,,,,,,,


In [29]:
Edmonton_merged = Edmonton_merged.dropna()
Edmonton_merged = Edmonton_merged.reset_index(drop=True)
Edmonton_merged

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,T5A,Edmonton,"West Clareview, East Londonderry",53.5899,-113.4413,0.0,Mexican Restaurant,Bus Station,Buffet,Record Shop,Toy / Game Store,Flower Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food & Drink Shop
1,T6A,Edmonton,North Capilano,53.5483,-113.408,0.0,Ski Trail,Golf Course,Playground,Park,Food Truck,Food & Drink Shop,Flower Shop,Fast Food Restaurant,Falafel Restaurant,Wine Shop
2,T5B,Edmonton,"East North Central, West Beverly",53.5766,-113.4608,0.0,Smoke Shop,Arts & Crafts Store,Construction & Landscaping,Grocery Store,Furniture / Home Store,Distribution Center,Dog Run,Eastern European Restaurant,Electronics Store,Falafel Restaurant
3,T6B,Edmonton,"SE Capilano, West Southeast Industrial, East B...",53.5322,-113.4404,0.0,Business Service,Playground,Home Service,Baseball Field,Fast Food Restaurant,French Restaurant,Food Truck,Food & Drink Shop,Flower Shop,Electronics Store
4,T6C,Edmonton,Central Bonnie Doon,53.5182,-113.4769,1.0,American Restaurant,Trail,Grocery Store,Cosmetics Shop,Water Park,Asian Restaurant,Fried Chicken Joint,Discount Store,Distribution Center,Dog Run
5,T5E,Edmonton,"West Londonderry, East Calder",53.5923,-113.5168,0.0,Butcher,Baseball Field,Recreation Center,Comic Shop,Grocery Store,Shopping Mall,Hockey Arena,Bakery,Dog Run,Arts & Crafts Store
6,T6E,Edmonton,"South Bonnie Doon, East University",53.5087,-113.5078,1.0,American Restaurant,Pharmacy,Coffee Shop,Flower Shop,Mediterranean Restaurant,Fried Chicken Joint,Distribution Center,Dog Run,Eastern European Restaurant,Electronics Store
7,T5G,Edmonton,"North Central, Queen Mary Park, Blatchford",53.5682,-113.4822,1.0,Café,Music Venue,Pharmacy,Vietnamese Restaurant,Bakery,Bank,Grocery Store,French Restaurant,Food Truck,Food & Drink Shop
8,T6G,Edmonton,"West University, Strathcona Place",53.5248,-113.5334,1.0,Theater,College Gym,Paper / Office Supplies Store,Diner,Restaurant,Coffee Shop,Pub,Sandwich Place,College Residence Hall,Bank
9,T5H,Edmonton,"NorthDowntown Fringe, East Downtown Fringe",53.555,-113.4822,1.0,Soccer Stadium,Park,Gym,Grocery Store,Café,Gift Shop,Wine Shop,Falafel Restaurant,Food & Drink Shop,Flower Shop


In [30]:
Edmonton_merged['Cluster Labels'] = Edmonton_merged['Cluster Labels'].astype('int32')


In [31]:
Edmonton_merged.head()

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,T5A,Edmonton,"West Clareview, East Londonderry",53.5899,-113.4413,0,Mexican Restaurant,Bus Station,Buffet,Record Shop,Toy / Game Store,Flower Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food & Drink Shop
1,T6A,Edmonton,North Capilano,53.5483,-113.408,0,Ski Trail,Golf Course,Playground,Park,Food Truck,Food & Drink Shop,Flower Shop,Fast Food Restaurant,Falafel Restaurant,Wine Shop
2,T5B,Edmonton,"East North Central, West Beverly",53.5766,-113.4608,0,Smoke Shop,Arts & Crafts Store,Construction & Landscaping,Grocery Store,Furniture / Home Store,Distribution Center,Dog Run,Eastern European Restaurant,Electronics Store,Falafel Restaurant
3,T6B,Edmonton,"SE Capilano, West Southeast Industrial, East B...",53.5322,-113.4404,0,Business Service,Playground,Home Service,Baseball Field,Fast Food Restaurant,French Restaurant,Food Truck,Food & Drink Shop,Flower Shop,Electronics Store
4,T6C,Edmonton,Central Bonnie Doon,53.5182,-113.4769,1,American Restaurant,Trail,Grocery Store,Cosmetics Shop,Water Park,Asian Restaurant,Fried Chicken Joint,Discount Store,Distribution Center,Dog Run


In [32]:
neighborhoods_venues_sorted

Unnamed: 0,Cluster Labels,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,0,Central Beverly,Smoke Shop,Arts & Crafts Store,Construction & Landscaping,Grocery Store,Furniture / Home Store,Distribution Center,Dog Run,Eastern European Restaurant,Electronics Store,Falafel Restaurant
1,1,Central Bonnie Doon,American Restaurant,Trail,Grocery Store,Cosmetics Shop,Water Park,Asian Restaurant,Fried Chicken Joint,Discount Store,Distribution Center,Dog Run
2,1,"Central Jasper Place, Buena Vista",Fast Food Restaurant,Pizza Place,Convenience Store,Sushi Restaurant,Café,Sandwich Place,Bakery,Dog Run,Eastern European Restaurant,Falafel Restaurant
3,1,Central Mistatim,Warehouse Store,Liquor Store,Casino,Wine Shop,Fried Chicken Joint,Discount Store,Distribution Center,Dog Run,Eastern European Restaurant,Electronics Store
4,0,East Castledowns,Plaza,Bakery,Bus Station,Construction & Landscaping,Recreation Center,Playground,Skating Rink,Distribution Center,Discount Store,Dog Run
5,1,East Mill Woods,Bakery,Pub,Wine Shop,Furniture / Home Store,Discount Store,Distribution Center,Dog Run,Eastern European Restaurant,Electronics Store,Falafel Restaurant
6,0,"East North Central, West Beverly",Smoke Shop,Arts & Crafts Store,Construction & Landscaping,Grocery Store,Furniture / Home Store,Distribution Center,Dog Run,Eastern European Restaurant,Electronics Store,Falafel Restaurant
7,0,"East Southeast Industrial, South Clover Bar",Housing Development,Construction & Landscaping,Bus Station,Wine Shop,Furniture / Home Store,Distribution Center,Dog Run,Eastern European Restaurant,Electronics Store,Falafel Restaurant
8,1,Ellerslie,Motorcycle Shop,Gymnastics Gym,Wine Shop,Discount Store,Distribution Center,Dog Run,Eastern European Restaurant,Electronics Store,Falafel Restaurant,Fast Food Restaurant
9,3,"Glenora, SW Downtown Fringe",Portuguese Restaurant,Wine Shop,Department Store,Discount Store,Distribution Center,Dog Run,Eastern European Restaurant,Electronics Store,Falafel Restaurant,Fast Food Restaurant


In [33]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=10)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(Edmonton_merged['Latitude'], Edmonton_merged['Longitude'], Edmonton_merged['Neighborhood'], Edmonton_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### Examine Clusters

In [34]:
Edmonton_merged.loc[Edmonton_merged['Cluster Labels'] == 0, Edmonton_merged.columns[[2] + list(range(5, Edmonton_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"West Clareview, East Londonderry",0,Mexican Restaurant,Bus Station,Buffet,Record Shop,Toy / Game Store,Flower Shop,Fried Chicken Joint,French Restaurant,Food Truck,Food & Drink Shop
1,North Capilano,0,Ski Trail,Golf Course,Playground,Park,Food Truck,Food & Drink Shop,Flower Shop,Fast Food Restaurant,Falafel Restaurant,Wine Shop
2,"East North Central, West Beverly",0,Smoke Shop,Arts & Crafts Store,Construction & Landscaping,Grocery Store,Furniture / Home Store,Distribution Center,Dog Run,Eastern European Restaurant,Electronics Store,Falafel Restaurant
3,"SE Capilano, West Southeast Industrial, East B...",0,Business Service,Playground,Home Service,Baseball Field,Fast Food Restaurant,French Restaurant,Food Truck,Food & Drink Shop,Flower Shop,Electronics Store
5,"West Londonderry, East Calder",0,Butcher,Baseball Field,Recreation Center,Comic Shop,Grocery Store,Shopping Mall,Hockey Arena,Bakery,Dog Run,Arts & Crafts Store
18,Southwest Edmonton,0,Home Service,Construction & Landscaping,Wine Shop,Fried Chicken Joint,Discount Store,Distribution Center,Dog Run,Eastern European Restaurant,Electronics Store,Falafel Restaurant
22,"East Southeast Industrial, South Clover Bar",0,Housing Development,Construction & Landscaping,Bus Station,Wine Shop,Furniture / Home Store,Distribution Center,Dog Run,Eastern European Restaurant,Electronics Store,Falafel Restaurant
30,Central Beverly,0,Smoke Shop,Arts & Crafts Store,Construction & Landscaping,Grocery Store,Furniture / Home Store,Distribution Center,Dog Run,Eastern European Restaurant,Electronics Store,Falafel Restaurant
32,East Castledowns,0,Plaza,Bakery,Bus Station,Construction & Landscaping,Recreation Center,Playground,Skating Rink,Distribution Center,Discount Store,Dog Run


In [35]:
Edmonton_merged.loc[Edmonton_merged['Cluster Labels'] == 1, Edmonton_merged.columns[[2] + list(range(5, Edmonton_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,Central Bonnie Doon,1,American Restaurant,Trail,Grocery Store,Cosmetics Shop,Water Park,Asian Restaurant,Fried Chicken Joint,Discount Store,Distribution Center,Dog Run
6,"South Bonnie Doon, East University",1,American Restaurant,Pharmacy,Coffee Shop,Flower Shop,Mediterranean Restaurant,Fried Chicken Joint,Distribution Center,Dog Run,Eastern European Restaurant,Electronics Store
7,"North Central, Queen Mary Park, Blatchford",1,Café,Music Venue,Pharmacy,Vietnamese Restaurant,Bakery,Bank,Grocery Store,French Restaurant,Food Truck,Food & Drink Shop
8,"West University, Strathcona Place",1,Theater,College Gym,Paper / Office Supplies Store,Diner,Restaurant,Coffee Shop,Pub,Sandwich Place,College Residence Hall,Bank
9,"NorthDowntown Fringe, East Downtown Fringe",1,Soccer Stadium,Park,Gym,Grocery Store,Café,Gift Shop,Wine Shop,Falafel Restaurant,Food & Drink Shop,Flower Shop
10,"Southgate, North Riverbend",1,Distribution Center,Coffee Shop,Restaurant,Sandwich Place,Furniture / Home Store,Wine Shop,French Restaurant,Discount Store,Dog Run,Eastern European Restaurant
11,North Downtown,1,Coffee Shop,Sandwich Place,Fast Food Restaurant,Pub,Restaurant,Hotel,Italian Restaurant,Café,Brewery,New American Restaurant
13,"South Downtown, South Downtown Fringe (Alberta...",1,Park,French Restaurant,Hotel,Sandwich Place,Baseball Stadium,Thai Restaurant,Fast Food Restaurant,Food Truck,Food & Drink Shop,Flower Shop
15,"North Westmount, West Calder, East Mistatim",1,Furniture / Home Store,Breakfast Spot,Motorcycle Shop,Massage Studio,Middle Eastern Restaurant,Pub,Eastern European Restaurant,Electronics Store,Falafel Restaurant,Fried Chicken Joint
16,East Mill Woods,1,Bakery,Pub,Wine Shop,Furniture / Home Store,Discount Store,Distribution Center,Dog Run,Eastern European Restaurant,Electronics Store,Falafel Restaurant


In [36]:
Edmonton_merged.loc[Edmonton_merged['Cluster Labels'] == 2, Edmonton_merged.columns[[2] + list(range(5, Edmonton_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
12,"Kaskitayo, Aspen Gardens",2,Lake,Wine Shop,Furniture / Home Store,Discount Store,Distribution Center,Dog Run,Eastern European Restaurant,Electronics Store,Falafel Restaurant,Fast Food Restaurant


In [37]:
Edmonton_merged.loc[Edmonton_merged['Cluster Labels'] == 3, Edmonton_merged.columns[[2] + list(range(5, Edmonton_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
19,"Glenora, SW Downtown Fringe",3,Portuguese Restaurant,Wine Shop,Department Store,Discount Store,Distribution Center,Dog Run,Eastern European Restaurant,Electronics Store,Falafel Restaurant,Fast Food Restaurant


In [38]:
Edmonton_merged.loc[Edmonton_merged['Cluster Labels'] == 4, Edmonton_merged.columns[[2] + list(range(5, Edmonton_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
14,West Mill Woods,4,Business Service,Department Store,Discount Store,Distribution Center,Dog Run,Eastern European Restaurant,Electronics Store,Falafel Restaurant,Fast Food Restaurant,Flower Shop


## 4. Results
Gas stations are more common in the neighbourhoods segmented to cluster 2.However frequency of Gas Station in other clusters 1,3,4,5 is insignificant.


## 5. Discussion
In this project, I tried to predict the suitable neighbourhood for a new gas station in Edmonton. I took the data about the neighbourhoods along with its locations present in Edmonton. I used K-Means clustering algorithm to segment the neighbourhoods and used FOURSQUARE API to get the venue details present in those neighbourhoods. Finally able to cluster the neighbourhoods with most frequent top 10 venues.


## 6.Conclusion
Model in this study focused on finding the neighbourhoods with less number of gas stations. However, there would be other factors that might also contribute to open up a new gas station such as land availability, prices, proximity to public places, population etc.
