# Finding the Best Location for a Family-Run Business

## Introduction

My parents have always wanted to start their own business, specifically a boutique coffee shop, and as they're not getting any younger they want to start sooner rather than later.  For this Capstone, I have decided to try to find the optimal neighborhood for them to set up shop.  It will need to be in Toronto since that's where they live.  As empty nesters they decided to dump their house and use what should have been my inheritance (I'm not bitter) to buy a condo in downtown Toronto, with what's left going to seed their coffee shop business in a nearby Toronto neighborhood.  So needless to say, I have a vested interest in the business being successful.

## How Are We Gonna Do It?
The idea is to use a subset of venues provided by the FourSquare API and do some cluster analysis to find a location that has relatively few other coffee shops and relatively many venues that will attract customers, as described on the website CoffeeShopStartups.  The list of great locations we're going to use is here: https://coffeeshopstartups.com/10-great-locations-to-start-your-coffee-shop-business/.  As part of the discovery, we will attempt to map the ideal places from the website to the venue categories in the FourSquare data pull for the neighbourhoods to be scouted, listed below.
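
One way to make "few coffee shops, many customer-attracting venues" concrete is a simple per-neighborhood score. The sketch below is only illustrative: the `COMPETITORS` and `ATTRACTORS` lists, the `opportunity_score` helper and the sample rows are all assumptions made up for this example, not taken from the CoffeeShopStartups article or from the FourSquare pull.

```python
import pandas as pd

# Hypothetical category groupings, for illustration only.
COMPETITORS = ['Coffee Shop', 'Café']
ATTRACTORS = ['Bookstore', 'Gym', 'Park', 'Coworking Space']

def opportunity_score(venues: pd.DataFrame) -> pd.DataFrame:
    """Share of attractor venues minus share of competing venues, per neighborhood."""
    flags = pd.DataFrame({
        'Neighborhood': venues['Neighborhood'],
        'competitor': venues['Venue Category'].isin(COMPETITORS),
        'attractor': venues['Venue Category'].isin(ATTRACTORS),
    })
    # The mean of a boolean column is the fraction of rows that are True.
    shares = flags.groupby('Neighborhood')[['competitor', 'attractor']].mean()
    shares['score'] = shares['attractor'] - shares['competitor']
    return shares.sort_values('score', ascending=False)

# Made-up sample: neighborhood A is saturated with coffee, B is not.
sample = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'Venue Category': ['Coffee Shop', 'Café', 'Coffee Shop', 'Park',
                       'Park', 'Gym', 'Bookstore', 'Coffee Shop'],
})
scores = opportunity_score(sample)
print(scores)
```

A higher score means more attractors relative to competitors. The clustering used in this notebook is a different technique, so treat this score as a sanity check rather than a replacement.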

We will use the FourSquare data (filtered and transformed) for the core Toronto area neighbourhoods, so the search is effectively limited to:
- Downtown Toronto
- East Toronto
- West Toronto
- Central Toronto

We will figure out the optimal k for K-Means clustering to identify the area(s) within the core Toronto area, near where they live, where my parents should look into setting up shop.  Using that k, we will evaluate the clustered venues visually and in tables to decide where to start looking at real estate as the next step, provided the data is able to point to a good location.
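
The notebook further down fixes `kclusters = 5` directly, so here is a minimal sketch of one common way to pick k: fit K-Means for several candidate values and compare mean silhouette scores. This is an assumed approach, not the method actually used for the results below. The `pick_k` helper is hypothetical and the feature matrix is synthetic stand-in data, not the Toronto venue frequencies.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def pick_k(features, k_values):
    """Return (best_k, scores), scoring each candidate k by mean silhouette."""
    scores = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(features)
        scores[k] = silhouette_score(features, labels)
    best_k = max(scores, key=scores.get)
    return best_k, scores

# Synthetic stand-in for the neighborhood feature matrix: three separated blobs.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
features = np.vstack([c + rng.normal(scale=0.3, size=(20, 2)) for c in centers])

best_k, scores = pick_k(features, range(2, 7))
print(best_k, scores)
```

With real data, something like `toronto_grouped.drop('Neighborhood', axis=1)` would take the place of `features`. With only 38 neighborhoods the scores can be noisy, so it is worth looking at the whole `scores` dictionary and not only the maximum.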

### Let's Import what we need to get started...

In [3]:
import numpy as np 
import pandas as pd 
import json 
import requests 
import matplotlib.cm as cm
import matplotlib.colors as colors
from sklearn.cluster import KMeans
from geopy.geocoders import Nominatim
!conda install -c conda-forge folium=0.5.0
import folium
from bs4 import BeautifulSoup

print('Libraries imported.')

Libraries imported.


Credentials loaded below in a hidden cell.

In [30]:
# The code was removed by Watson Studio for sharing.

### Create Dataframe containing only the neighborhoods we're interested in by Borough

In [31]:
# Download the Wikipedia page listing Toronto ("M") postal codes and parse its table
html = requests.get('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M').text
soup = BeautifulSoup(html, 'lxml')

neigh_df = pd.read_html(soup.find('table', {'class' : 'wikitable sortable'}).prettify())
neigh_df = neigh_df[0]
neigh_df.columns = neigh_df.iloc[0]
neigh_df = neigh_df[neigh_df.Borough != 'Not assigned']
neigh_df.columns = ['PostalCode', 'Borough', 'Neighborhood']

# One row per postal code, with its neighborhoods joined into a single string
neigh_df = neigh_df.groupby(['PostalCode','Borough'])['Neighborhood'].apply(', '.join).reset_index()
# Where no neighborhood is assigned, fall back to the borough name
neigh_df['Neighborhood'] = neigh_df['Neighborhood'].where(neigh_df['Neighborhood'] != 'Not assigned', neigh_df['Borough'])

# Keep only boroughs with "Toronto" in their name
toronto_data = neigh_df[neigh_df['Borough'].str.contains('Toronto')]

!wget -O geo_data.csv http://cocl.us/Geospatial_data
geo_data_df = pd.read_csv('geo_data.csv')

# Rows are matched on index position, which assumes both tables are sorted by postal code
toronto_data = pd.concat([toronto_data , geo_data_df], axis=1, join='inner')
toronto_data.drop('PostalCode', axis=1, inplace=True)
toronto_data.drop('Postal Code', axis=1, inplace=True)
toronto_data


--2019-01-05 00:37:33--  http://cocl.us/Geospatial_data
Resolving cocl.us (cocl.us)... 169.48.113.201
Connecting to cocl.us (cocl.us)|169.48.113.201|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://cocl.us/Geospatial_data [following]
--2019-01-05 00:37:33--  https://cocl.us/Geospatial_data
Connecting to cocl.us (cocl.us)|169.48.113.201|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://ibm.box.com/shared/static/9afzr83pps4pwf2smjjcf1y5mvgb18rr.csv [following]
--2019-01-05 00:37:34--  https://ibm.box.com/shared/static/9afzr83pps4pwf2smjjcf1y5mvgb18rr.csv
Resolving ibm.box.com (ibm.box.com)... 107.152.25.197, 107.152.24.197
Connecting to ibm.box.com (ibm.box.com)|107.152.25.197|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://ibm.ent.box.com/shared/static/9afzr83pps4pwf2smjjcf1y5mvgb18rr.csv [following]
--2019-01-05 00:37:34--  https://ibm.ent

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
37,East Toronto,The Beaches,43.676357,-79.293031
41,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188
42,East Toronto,"The Beaches West, India Bazaar",43.668999,-79.315572
43,East Toronto,Studio District,43.659526,-79.340923
44,Central Toronto,Lawrence Park,43.72802,-79.38879
45,Central Toronto,Davisville North,43.712751,-79.390197
46,Central Toronto,North Toronto West,43.715383,-79.405678
47,Central Toronto,Davisville,43.704324,-79.38879
48,Central Toronto,"Moore Park, Summerhill East",43.689574,-79.38316
49,Central Toronto,"Deer Park, Forest Hill SE, Rathnelly, South Hi...",43.686412,-79.400049
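
The cell above lines the neighborhood table up with the coordinates by row position. A key-based join does not depend on row order, so here is a minimal sketch of that alternative. It assumes the coordinates table has a `Postal Code` column, as the `drop` call above suggests. The three sample rows are illustrative values typed in by hand, not read from the real files.

```python
import pandas as pd

# Illustrative stand-ins for neigh_df and geo_data_df.
neighborhoods = pd.DataFrame({
    'PostalCode': ['M4E', 'M4K', 'M1B'],
    'Borough': ['East Toronto', 'East Toronto', 'Scarborough'],
    'Neighborhood': ['The Beaches', 'The Danforth West, Riverdale', 'Rouge, Malvern'],
})
coords = pd.DataFrame({
    'Postal Code': ['M1B', 'M4K', 'M4E'],  # deliberately in a different order
    'Latitude': [43.806686, 43.679557, 43.676357],
    'Longitude': [-79.194353, -79.352188, -79.293031],
})

# Filter to the Toronto boroughs, then join on the postal code itself.
toronto_only = neighborhoods[neighborhoods['Borough'].str.contains('Toronto')]
merged = (
    toronto_only
    .merge(coords, left_on='PostalCode', right_on='Postal Code', how='inner')
    .drop(columns=['PostalCode', 'Postal Code'])
)
print(merged)
```

Because the match is made on the postal code, each neighborhood keeps its own coordinates even though `coords` is shuffled.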


### Let's visualize the neighborhoods to make sure we've pulled the proper coordinates

In [32]:
address = 'Toronto, Ontario'
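# recent geopy versions require a user_agent; this name is an arbitrary placeholder
geolocator = Nominatim(user_agent='toronto_coffee_shop_explorer')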
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=12)

for lat, lng, label in zip(toronto_data['Latitude'], toronto_data['Longitude'], toronto_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  

print('The map has {} boroughs and {} neighborhoods.'.format(
        len(toronto_data['Borough'].unique()),
        toronto_data.shape[0]
    )
)

map_toronto

The map has 4 boroughs and 38 neighborhoods.


In [33]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    LIMIT = 100
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
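
`getNearbyVenues` indexes straight into the response, so a reply without `groups`, or a venue with an empty category list, would raise a `KeyError` or `IndexError`. Here is a more forgiving sketch of just the parsing step. The `extract_venue_rows` helper and the fake payload are hypothetical; the payload only mimics the fields the function above reads and is not a real FourSquare response.

```python
def extract_venue_rows(payload, name, lat, lng):
    """Flatten one explore-style response into row tuples, tolerating gaps."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item.get('venue', {})
        location = venue.get('location', {})
        # Fall back to an empty dict when the category list is missing or empty.
        categories = venue.get('categories') or [{}]
        rows.append((
            name, lat, lng,
            venue.get('name'),
            location.get('lat'),
            location.get('lng'),
            categories[0].get('name', 'Unknown'),
        ))
    return rows

# Invented payload with one complete venue and one venue lacking categories.
fake_payload = {
    'response': {'groups': [{'items': [
        {'venue': {'name': 'Bean There',
                   'location': {'lat': 43.65, 'lng': -79.38},
                   'categories': [{'name': 'Coffee Shop'}]}},
        {'venue': {'name': 'Mystery Spot',
                   'location': {'lat': 43.66, 'lng': -79.39},
                   'categories': []}},
    ]}]}
}
rows = extract_venue_rows(fake_payload, 'Sample Neighborhood', 43.65, -79.38)
print(rows)
```

Rows produced this way have the same seven fields, in the same order, as the tuples built inside `getNearbyVenues`.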


### Let's pull the nearby venues, then remove the ones we're not directly competing against or that will have little impact on our customer stream, such as restaurants, bars, airports and diners

In [54]:
toronto_venues = getNearbyVenues(names=toronto_data['Neighborhood'],
                                   latitudes=toronto_data['Latitude'],
                                   longitudes=toronto_data['Longitude']
                                  )

The Beaches
The Danforth West, Riverdale
The Beaches West, India Bazaar
Studio District
Lawrence Park
Davisville North
North Toronto West
Davisville
Moore Park, Summerhill East
Deer Park, Forest Hill SE, Rathnelly, South Hill, Summerhill West
Rosedale
Cabbagetown, St. James Town
Church and Wellesley
Harbourfront, Regent Park
Ryerson, Garden District
St. James Town
Berczy Park
Central Bay Street
Adelaide, King, Richmond
Harbourfront East, Toronto Islands, Union Station
Design Exchange, Toronto Dominion Centre
Commerce Court, Victoria Hotel
Roselawn
Forest Hill North, Forest Hill West
The Annex, North Midtown, Yorkville
Harbord, University of Toronto
Chinatown, Grange Park, Kensington Market
CN Tower, Bathurst Quay, Island airport, Harbourfront West, King and Spadina, Railway Lands, South Niagara
Stn A PO Boxes 25 The Esplanade
First Canadian Place, Underground city
Christie
Dovercourt Village, Dufferin
Little Portugal, Trinity
Brockton, Exhibition Place, Parkdale Village
High Park, The 

In [55]:
# Drop venue categories that are not direct competitors or add little to our
# customer stream. Each pattern is matched as a substring of the category name.
# Note: 'Dinner' does not match the 'Diner' category, so diners stay in the data.
excluded_patterns = ['Restaurant', 'Pub', 'pub', 'Pizza', 'Dinner', 'Bar',
                     'Burger', 'Steak', 'Airport']
toronto_venues = toronto_venues[
    ~toronto_venues['Venue Category'].str.contains('|'.join(excluded_patterns))
]

print('There are {} unique categories.'.format(len(toronto_venues['Venue Category'].unique())))

toronto_venues.groupby('Neighborhood').count()

print(toronto_venues['Venue Category'].unique())

There are 164 unique categories.
['Coffee Shop' 'Astrologer' 'Neighborhood' 'Ice Cream Shop'
 'Cosmetics Shop' 'Yoga Studio' 'Health Food Store' 'Brewery'
 'Fruit & Vegetable Store' 'Trail' 'Bookstore' 'Diner' 'Dessert Shop'
 'Bubble Tea Shop' 'Spa' 'Grocery Store' 'Bakery' 'Liquor Store'
 'Furniture / Home Store' 'Gym' 'Fish & Chips Shop' 'Park' 'Burrito Place'
 'Pet Store' 'Movie Theater' 'Sandwich Place' 'Intersection' 'Fish Market'
 'Cheese Shop' 'Café' 'Stationery Store' 'Coworking Space' 'Music Store'
 'Convenience Store' 'Bank' 'Clothing Store' 'Gym / Fitness Center'
 'Swim School' 'Bus Line' 'Food & Drink Shop' 'Breakfast Spot' 'Hotel'
 'Sporting Goods Shop' 'Bagel Shop' 'Rental Car Location'
 'Toy / Game Store' 'Gourmet Shop' 'Farmers Market' 'Pharmacy' 'Playground'
 'Supermarket' 'Fried Chicken Joint' 'Light Rail Station' 'Butcher'
 'General Entertainment' 'Jewelry Store' 'Gift Shop' 'Deli / Bodega'
 'Market' 'Beer Store' 'Snack Place' 'Flower Shop' 'Dance Studio'
 'Tea Room
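
Plain substring matching can remove more than intended: a pattern like `'Bar'` also matches any category that merely contains those letters. Here is a small sketch of a whole-word filter. The `drop_categories` helper is hypothetical and the sample category strings are invented, not taken from the FourSquare pull.

```python
import re
import pandas as pd

def drop_categories(venues, words):
    """Remove rows whose category contains any of `words` as a whole word."""
    pattern = r'\b(?:' + '|'.join(re.escape(w) for w in words) + r')\b'
    mask = venues['Venue Category'].str.contains(pattern, regex=True, na=False)
    return venues[~mask]

# Invented sample categories.
sample = pd.DataFrame({'Venue Category': [
    'Coffee Shop', 'Barbershop', 'Cocktail Bar',
    'Public Art', 'Sushi Restaurant', 'Diner',
]})

kept = drop_categories(sample, ['Restaurant', 'Pub', 'Bar', 'Diner'])
print(kept['Venue Category'].tolist())
```

In this sample, 'Barbershop' and 'Public Art' survive because neither contains 'Bar' or 'Pub' as a standalone word, while 'Cocktail Bar', 'Sushi Restaurant' and 'Diner' are dropped.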

In [56]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
toronto_onehot['Neighborhood'] = toronto_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.head()

Unnamed: 0,Yoga Studio,Adult Boutique,Antique Shop,Aquarium,Art Gallery,Art Museum,Arts & Crafts Store,Astrologer,Athletics & Sports,Auto Workshop,...,Tanning Salon,Tea Room,Theater,Thrift / Vintage Store,Toy / Game Store,Trail,Train Station,Video Game Store,Wings Joint,Women's Store
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,1,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
5,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
6,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [57]:
toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()
toronto_grouped

Unnamed: 0,Neighborhood,Yoga Studio,Adult Boutique,Antique Shop,Aquarium,Art Gallery,Art Museum,Arts & Crafts Store,Astrologer,Athletics & Sports,...,Tanning Salon,Tea Room,Theater,Thrift / Vintage Store,Toy / Game Store,Trail,Train Station,Video Game Store,Wings Joint,Women's Store
0,"Adelaide, King, Richmond",0.0,0.0,0.0,0.0,0.017544,0.017544,0.0,0.0,0.0,...,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.017544
1,Berczy Park,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,...,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Brockton, Exhibition Place, Parkdale Village",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Business Reply Mail Processing Centre 969 Eastern,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"CN Tower, Bathurst Quay, Island airport, Harbo...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,"Cabbagetown, St. James Town",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Central Bay Street,0.020408,0.0,0.0,0.0,0.0,0.020408,0.0,0.0,0.0,...,0.0,0.020408,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,"Chinatown, Grange Park, Kensington Market",0.0,0.0,0.0,0.0,0.0,0.0,0.018868,0.0,0.0,...,0.0,0.018868,0.0,0.018868,0.0,0.0,0.0,0.0,0.0,0.0
8,Christie,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Church and Wellesley,0.025,0.025,0.0,0.0,0.0,0.0,0.025,0.0,0.0,...,0.0,0.025,0.025,0.0,0.0,0.0,0.0,0.025,0.025,0.0
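
To make the numbers in this table easier to read: after one-hot encoding, each venue is a row with a single 1, so the per-neighborhood mean of a column is the fraction of that neighborhood's venues in that category. A value like 0.017544 would correspond to one venue out of 57. Here is a toy version with made-up venues:

```python
import pandas as pd

# Made-up venues for two neighborhoods.
venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Coffee Shop', 'Coffee Shop', 'Park', 'Bakery', 'Park', 'Park'],
})

# One column per category, one row per venue.
onehot = pd.get_dummies(venues['Venue Category'], dtype=float)
onehot.insert(0, 'Neighborhood', venues['Neighborhood'])

# Averaging the 0/1 columns gives each category's share of the venues.
grouped = onehot.groupby('Neighborhood').mean().reset_index()
print(grouped)
```

Every row of `grouped` sums to 1, so the values K-Means sees are category shares and not raw venue counts.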


In [38]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Adelaide, King, Richmond",Coffee Shop,Café,Clothing Store,Hotel,Gym,Bakery,Bookstore,Breakfast Spot,Concert Hall,Cosmetics Shop
1,Berczy Park,Coffee Shop,Bakery,Cheese Shop,Café,Farmers Market,Park,Jazz Club,Hotel,Fish Market,Nightclub
2,"Brockton, Exhibition Place, Parkdale Village",Coffee Shop,Café,Breakfast Spot,Gym / Fitness Center,Stadium,Gym,Furniture / Home Store,Nightclub,Convenience Store,Pet Store
3,Business Reply Mail Processing Centre 969 Eastern,Light Rail Station,Smoke Shop,Auto Workshop,Park,Butcher,Burrito Place,Brewery,Skate Park,Farmers Market,Spa
4,"CN Tower, Bathurst Quay, Island airport, Harbo...",Boat or Ferry,Plane,Sculpture Garden,Boutique,Harbor / Marina,Convenience Store,Event Space,Electronics Store,Donut Shop,Dog Run
5,"Cabbagetown, St. James Town",Coffee Shop,Park,Bakery,Pharmacy,Café,Butcher,Jewelry Store,Sandwich Place,Liquor Store,Diner
6,Central Bay Street,Coffee Shop,Café,Ice Cream Shop,Salad Place,Bubble Tea Shop,Spa,Sandwich Place,Donut Shop,Poke Place,Park
7,"Chinatown, Grange Park, Kensington Market",Café,Coffee Shop,Bakery,Dessert Shop,Gaming Cafe,Grocery Store,Furniture / Home Store,Breakfast Spot,Brewery,Bubble Tea Shop
8,Christie,Grocery Store,Café,Park,Convenience Store,Coffee Shop,Diner,Athletics & Sports,Nightclub,Baby Store,Deli / Bodega
9,Church and Wellesley,Coffee Shop,Men's Store,Café,Bubble Tea Shop,Yoga Studio,Burrito Place,Sculpture Garden,Hotel,Sandwich Place,Breakfast Spot


In [58]:
# set number of clusters
kclusters = 5

toronto_grouped_clustering = toronto_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:20] 

array([1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 1, 1], dtype=int32)

In [48]:
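toronto_merged = toronto_data.copy()  # copy so toronto_data itself is not modified
# Labels are assigned by position: kmeans.labels_ follows the row order of toronto_grouped,
# so this only lines up if toronto_data lists the neighborhoods in that same order.
toronto_merged['Cluster Labels'] = kmeans.labels_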
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')
toronto_merged.reset_index(inplace=True)

toronto_merged

Unnamed: 0,index,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,37,East Toronto,The Beaches,43.676357,-79.293031,1,Coffee Shop,Astrologer,Women's Store,Department Store,Event Space,Electronics Store,Donut Shop,Dog Run,Discount Store,Diner
1,41,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188,1,Coffee Shop,Ice Cream Shop,Bookstore,Yoga Studio,Bakery,Health Food Store,Furniture / Home Store,Fruit & Vegetable Store,Liquor Store,Diner
2,42,East Toronto,"The Beaches West, India Bazaar",43.668999,-79.315572,1,Park,Movie Theater,Ice Cream Shop,Gym,Liquor Store,Pet Store,Burrito Place,Fish & Chips Shop,Sandwich Place,Intersection
3,43,East Toronto,Studio District,43.659526,-79.340923,1,Café,Coffee Shop,Yoga Studio,Bakery,Convenience Store,Music Store,Ice Cream Shop,Fish Market,Diner,Coworking Space
4,44,Central Toronto,Lawrence Park,43.72802,-79.38879,0,Park,Bus Line,Swim School,Deli / Bodega,Donut Shop,Dog Run,Discount Store,Diner,Dessert Shop,Department Store
5,45,Central Toronto,Davisville North,43.712751,-79.390197,1,Park,Breakfast Spot,Food & Drink Shop,Sandwich Place,Hotel,Electronics Store,Donut Shop,Dog Run,Discount Store,Diner
6,46,Central Toronto,North Toronto West,43.715383,-79.405678,1,Clothing Store,Coffee Shop,Sporting Goods Shop,Bagel Shop,Diner,Dessert Shop,Park,Rental Car Location,Sandwich Place,Spa
7,47,Central Toronto,Davisville,43.704324,-79.38879,1,Dessert Shop,Sandwich Place,Coffee Shop,Pharmacy,Café,Toy / Game Store,Gourmet Shop,Brewery,Farmers Market,Diner
8,48,Central Toronto,"Moore Park, Summerhill East",43.689574,-79.38316,1,Playground,Women's Store,Deli / Bodega,Electronics Store,Donut Shop,Dog Run,Discount Store,Diner,Dessert Shop,Department Store
9,49,Central Toronto,"Deer Park, Forest Hill SE, Rathnelly, South Hi...",43.686412,-79.400049,1,Coffee Shop,Supermarket,Convenience Store,Light Rail Station,Fried Chicken Joint,Bagel Shop,Women's Store,Electronics Store,Donut Shop,Dog Run
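
One thing worth checking in this step: `kmeans.labels_` comes back in the row order of the matrix passed to `fit`, here `toronto_grouped`, which `groupby('Neighborhood')` sorts by name, while `toronto_data` is in postal-code order. Here is a minimal sketch of attaching labels by neighborhood name instead of by position. The three neighborhood names and the labels are made up for illustration.

```python
import pandas as pd

# Stand-in for toronto_grouped: alphabetical order, as groupby produces.
grouped = pd.DataFrame({'Neighborhood': ['Annex', 'Beaches', 'Christie']})
labels = [2, 0, 1]  # stand-in for kmeans.labels_, same row order as `grouped`

# Stand-in for toronto_data: same neighborhoods, different order.
data = pd.DataFrame({
    'Neighborhood': ['Beaches', 'Christie', 'Annex'],
    'Latitude': [43.67, 43.66, 43.67],
})

# Pair each label with its neighborhood first, then join on the name.
label_table = grouped[['Neighborhood']].assign(**{'Cluster Labels': labels})
merged = data.merge(label_table, on='Neighborhood', how='left')
print(merged)
```

Joined this way, each neighborhood gets the label that was computed for it, whatever order the two tables are in.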


### Cluster 1

In [65]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,Lawrence Park,0,Park,Bus Line,Swim School,Deli / Bodega,Donut Shop,Dog Run,Discount Store,Diner,Dessert Shop,Department Store


### Cluster 2

In [66]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,The Beaches,1,Coffee Shop,Astrologer,Women's Store,Department Store,Event Space,Electronics Store,Donut Shop,Dog Run,Discount Store,Diner
1,"The Danforth West, Riverdale",1,Coffee Shop,Ice Cream Shop,Bookstore,Yoga Studio,Bakery,Health Food Store,Furniture / Home Store,Fruit & Vegetable Store,Liquor Store,Diner
2,"The Beaches West, India Bazaar",1,Park,Movie Theater,Ice Cream Shop,Gym,Liquor Store,Pet Store,Burrito Place,Fish & Chips Shop,Sandwich Place,Intersection
3,Studio District,1,Café,Coffee Shop,Yoga Studio,Bakery,Convenience Store,Music Store,Ice Cream Shop,Fish Market,Diner,Coworking Space
5,Davisville North,1,Park,Breakfast Spot,Food & Drink Shop,Sandwich Place,Hotel,Electronics Store,Donut Shop,Dog Run,Discount Store,Diner
6,North Toronto West,1,Clothing Store,Coffee Shop,Sporting Goods Shop,Bagel Shop,Diner,Dessert Shop,Park,Rental Car Location,Sandwich Place,Spa
7,Davisville,1,Dessert Shop,Sandwich Place,Coffee Shop,Pharmacy,Café,Toy / Game Store,Gourmet Shop,Brewery,Farmers Market,Diner
8,"Moore Park, Summerhill East",1,Playground,Women's Store,Deli / Bodega,Electronics Store,Donut Shop,Dog Run,Discount Store,Diner,Dessert Shop,Department Store
9,"Deer Park, Forest Hill SE, Rathnelly, South Hi...",1,Coffee Shop,Supermarket,Convenience Store,Light Rail Station,Fried Chicken Joint,Bagel Shop,Women's Store,Electronics Store,Donut Shop,Dog Run
10,Rosedale,1,Park,Trail,Playground,Dance Studio,Electronics Store,Donut Shop,Dog Run,Discount Store,Diner,Dessert Shop


### Cluster 3

In [67]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
24,"The Annex, North Midtown, Yorkville",2,Coffee Shop,Sandwich Place,Café,Liquor Store,Park,Pharmacy,Cosmetics Shop,History Museum,BBQ Joint,Flower Shop


### Cluster 4

In [68]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
17,Central Bay Street,3,Coffee Shop,Café,Ice Cream Shop,Salad Place,Bubble Tea Shop,Spa,Sandwich Place,Donut Shop,Poke Place,Park
22,Roselawn,3,Garden,Home Service,Women's Store,Deli / Bodega,Electronics Store,Donut Shop,Dog Run,Discount Store,Diner,Dessert Shop
27,"CN Tower, Bathurst Quay, Island airport, Harbo...",3,Boat or Ferry,Plane,Sculpture Garden,Boutique,Harbor / Marina,Convenience Store,Event Space,Electronics Store,Donut Shop,Dog Run


### Cluster 5

In [69]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
28,Stn A PO Boxes 25 The Esplanade,4,Coffee Shop,Café,Hotel,Cosmetics Shop,Farmers Market,Bakery,Cheese Shop,Gym,Creperie,Art Gallery


## Conclusion

Based on the cluster analysis, **Lawrence Park** is where we should first investigate the viability of opening one (or more) coffee shops: it sits in a cluster of its own, and the closest venue category in its most-common list that may compete directly is Donut Shop, in 5th place.