# Final Assignment - Data Sciences
# Capstone Project - The Battle of the Neighborhoods
## Applied Data Science Capstone by IBM/Coursera


## Table of Contents
1. Introduction: Business Problem
2. Data
3. Methodology
4. Analysis
5. Results and Discussion
6. Conclusion


## Introduction: Business Problem

In this project, we attempt to locate an optimal premises for a restaurant. Specifically, this report is targeted at stakeholders who are interested in opening an Italian restaurant in Toronto.

As Toronto already has numerous restaurants (Chinese, Japanese, Korean, French, etc.), we try to detect locations that are not already crowded with similar businesses. We are particularly interested in areas with no Italian restaurants in the vicinity. We also prefer locations as close to the city center as possible so that they are eye-catching, assuming that the first two conditions are met.

We will use our data science powers to generate a few of the most promising neighborhoods based on these criteria. The advantages of each area will then be clearly expressed so that stakeholders can choose the best possible final location.


## Data

Based on the definition of our problem, the factors that will influence our decision are:
- number of existing restaurants in the borough (any type of restaurant)
- number of and distance to Italian restaurants in the borough, if any
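
The second factor can be made concrete with a small sketch that counts Italian restaurants within a radius of a candidate point and reports the distance to the nearest one. The helper names (`haversine_m`, `italian_stats`), the 500 m radius, and the sample coordinates are assumptions made for illustration; they are not taken from the analysis in this report.

```python
import math

EARTH_RADIUS_M = 6371000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def italian_stats(center, venues, radius_m=500):
    """Return (count within radius, distance to nearest) for Italian restaurants.

    venues is an iterable of (lat, lon, category) tuples.
    """
    dists = [haversine_m(center[0], center[1], lat, lon)
             for lat, lon, cat in venues if cat == 'Italian Restaurant']
    count = sum(d <= radius_m for d in dists)
    nearest = min(dists) if dists else None
    return count, nearest


# Hypothetical candidate point and venues (illustrative values only)
center = (43.7700, -79.4000)
venues = [
    (43.7710, -79.4000, 'Italian Restaurant'),  # roughly 0.1 km north
    (43.7800, -79.4000, 'Italian Restaurant'),  # roughly 1.1 km north
    (43.7701, -79.4001, 'Coffee Shop'),
]
count, nearest = italian_stats(center, venues)
print(count, round(nearest))
```

A candidate with a count of zero and a large nearest distance would match the "no Italian restaurants in the vicinity" preference stated in the introduction.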

We decided to use a regularly spaced grid of locations, centered on the city center, to define our neighborhoods.
The following data sources will be needed to extract or generate the required information:
- centers of candidate areas will be generated algorithmically, and approximate addresses of those centers will be obtained using the given list of Toronto neighborhoods
- the number of restaurants in every neighborhood, together with their type and location, will be obtained using the Foursquare API
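
A regularly spaced grid of candidate centers could be generated along the following lines. This is only a sketch: the function name `make_grid`, the 600 m spacing, the grid size, and the flat-Earth conversion from metres to degrees are all assumptions. The center coordinates are the ones used for the Toronto map in this report.

```python
import math


def make_grid(center_lat, center_lon, spacing_m=600, steps=2):
    """Return a square grid of (lat, lon) points around a center.

    The grid has (2 * steps + 1) ** 2 points. Metres are converted to
    degrees with a local flat-Earth approximation, which is adequate
    only for small areas.
    """
    m_per_deg_lat = 111320.0
    m_per_deg_lon = 111320.0 * math.cos(math.radians(center_lat))
    points = []
    for i in range(-steps, steps + 1):
        for j in range(-steps, steps + 1):
            lat = center_lat + i * spacing_m / m_per_deg_lat
            lon = center_lon + j * spacing_m / m_per_deg_lon
            points.append((lat, lon))
    return points


grid = make_grid(43.6532, -79.3832)
print(len(grid))
```

Each grid point could then be passed to the venue lookup in place of a postcode centroid.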

## Borough Candidate

Using the provided list of Toronto neighborhoods, we import the relevant data, including the postcode, borough, neighborhood name, latitude, and longitude. In this study, we focus on a single borough: North York.

In [3]:
import pandas as pd
import numpy as np

df_candidate = pd.read_csv('/Users/luofan/Downloads/torontodata.csv',index_col=0)
df_candidate

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
0,M1B,Scarborough,"'Rouge', 'Malvern'",43.806686,-79.194353
1,M1C,Scarborough,"'Highland Creek', 'Rouge Hill', 'Port Union'",43.784535,-79.160497
2,M1E,Scarborough,"'Guildwood', 'Morningside', 'West Hill'",43.763573,-79.188711
3,M1G,Scarborough,'Woburn',43.770992,-79.216917
4,M1H,Scarborough,'Cedarbrae',43.773136,-79.239476
5,M1J,Scarborough,'Scarborough Village',43.744734,-79.239476
6,M1K,Scarborough,"'East Birchmount Park', 'Ionview', 'Kennedy Park'",43.727929,-79.262029
7,M1L,Scarborough,"'Clairlea', 'Golden Mile', 'Oakridge'",43.711112,-79.284577
8,M1M,Scarborough,"'Cliffcrest', 'Cliffside', 'Scarborough Villag...",43.716316,-79.239476
9,M1N,Scarborough,"'Birch Cliff', 'Cliffside West'",43.692657,-79.264848


In [4]:
# given the coordinates of the Toronto city center,
# create the map of Toronto

import folium

toronto_latitude = 43.6532; toronto_longitude = -79.3832
map_toronto = folium.Map(location = [toronto_latitude, toronto_longitude], zoom_start = 10.7)

# add markers to the map
for lat, lng, borough, neighborhood in zip(df_candidate['Latitude'], df_candidate['Longitude'], df_candidate['Borough'], df_candidate['Neighbourhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_toronto)  
    

map_toronto

## Foursquare Data

The Foursquare data provides a list of venues for each neighborhood, in particular the restaurant venues.

In [5]:
# Foursquare API credentials and version date
CLIENT_ID = 'CR1TOIJLC1RMUIU1KPJ5K4AJH1YAVSAFLOMNYQEM5KY3ISQG' 
CLIENT_SECRET = 'ZU1FQJ0GFH4J5WC2W3TA3O4NEIGRPS3DRVJZ0NXLU50F5HS1'
VERSION = '20190604'

# keep only the North York rows
NorthYork_data = df_candidate[df_candidate['Borough'] == 'North York'].reset_index(drop=True)
NorthYork_data.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
0,M2H,North York,'Hillcrest Village',43.803762,-79.363452
1,M2J,North York,"'Fairview', 'Henry Farm', 'Oriole'",43.778517,-79.346556
2,M2K,North York,'Bayview Village',43.786947,-79.385975
3,M2L,North York,"'Silver Hills', 'York Mills'",43.75749,-79.374714
4,M2M,North York,"'Newtonbrook', 'Willowdale'",43.789053,-79.408493


In [9]:
import requests

def getNearbyVenues(names, latitudes, longitudes, radius=500, LIMIT = 1000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)


# Get venues for each neighborhood in North York
NorthYork_venues = getNearbyVenues(names=NorthYork_data['Neighbourhood'],
                                   latitudes=NorthYork_data['Latitude'],
                                   longitudes=NorthYork_data['Longitude']
                                  )
NorthYork_venues.head()

'Hillcrest Village'
'Fairview', 'Henry Farm', 'Oriole'
'Bayview Village'
'Silver Hills', 'York Mills'
'Newtonbrook', 'Willowdale'
'Willowdale South'
'York Mills West'
'Willowdale West'
'Parkwoods'
'Don Mills North'
'Flemingdon Park', 'Don Mills South'
'Bathurst Manor', 'Downsview North', 'Wilson Heights'
'Northwood Park', 'York University'
'CFB Toronto', 'Downsview East'
'Downsview West'
'Downsview Central'
'Downsview Northwest'
'Victoria Village'
'Bedford Park', 'Lawrence Manor East'
'Lawrence Heights', 'Lawrence Manor'
'Glencairn'
'Downsview', 'North Park', 'Upwood Park'
'Humber Summit'
'Emery', 'Humberlea'


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,'Hillcrest Village',43.803762,-79.363452,Eagle's Nest Golf Club,43.805455,-79.364186,Golf Course
1,'Hillcrest Village',43.803762,-79.363452,AY Jackson Pool,43.804515,-79.366138,Pool
2,'Hillcrest Village',43.803762,-79.363452,Villa Madina,43.801685,-79.363938,Mediterranean Restaurant
3,'Hillcrest Village',43.803762,-79.363452,Duncan Creek Park,43.805539,-79.360695,Dog Run
4,'Hillcrest Village',43.803762,-79.363452,A.Y. Jackson Secondary School Track,43.805068,-79.366677,Athletics & Sports
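
The nested response that `getNearbyVenues` unpacks can be illustrated offline. The dictionary in this sketch is a hand-made stand-in that mirrors only the keys the function reads (`response`, `groups`, `items`, `venue`, `location`, `categories`); it is not real API output. The helper `extract_rows`, the second venue, and the `'Unknown'` fallback for venues without a category are assumptions added for illustration.

```python
def extract_rows(payload, name, lat, lng):
    """Flatten one explore-style payload into rows of venue fields."""
    groups = payload.get('response', {}).get('groups') or [{}]
    items = groups[0].get('items', [])
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # Guard against venues that come back without any category
        category = categories[0]['name'] if categories else 'Unknown'
        rows.append((name, lat, lng, venue['name'],
                     venue['location']['lat'], venue['location']['lng'],
                     category))
    return rows


# Hand-made payload; the first venue reuses a row from the table above
mock_payload = {
    'response': {'groups': [{'items': [
        {'venue': {'name': 'Villa Madina',
                   'location': {'lat': 43.801685, 'lng': -79.363938},
                   'categories': [{'name': 'Mediterranean Restaurant'}]}},
        {'venue': {'name': 'Example Venue',  # hypothetical
                   'location': {'lat': 43.8040, 'lng': -79.3640},
                   'categories': []}},
    ]}]}
}

rows = extract_rows(mock_payload, "'Hillcrest Village'", 43.803762, -79.363452)
for row in rows:
    print(row)
```

Testing the parsing step against a fixed payload like this avoids spending API quota while the rest of the pipeline is being developed.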


In [10]:
# one hot encoding
ny_onehot = pd.get_dummies(NorthYork_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
ny_onehot['Neighborhood'] = NorthYork_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [ny_onehot.columns[-1]] + list(ny_onehot.columns[:-1])
ny_onehot = ny_onehot[fixed_columns]

ny_onehot.head()

Unnamed: 0,Neighborhood,Accessories Store,Airport,American Restaurant,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Bakery,Bank,Bar,...,Sushi Restaurant,Tea Room,Thai Restaurant,Theater,Toy / Game Store,Video Game Store,Video Store,Vietnamese Restaurant,Wings Joint,Women's Store
0,'Hillcrest Village',0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,'Hillcrest Village',0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,'Hillcrest Village',0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,'Hillcrest Village',0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,'Hillcrest Village',0,0,0,0,0,1,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [11]:
ny_grouped = ny_onehot.groupby('Neighborhood').mean().reset_index()
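
To see what the one-hot encoding and the grouped mean produce, here is a minimal sketch on a toy table. The neighborhood names `A` and `B` and their venues are made up; only the pandas calls mirror the cells above.

```python
import pandas as pd

# Toy venue table (hypothetical data)
venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Park', 'Park', 'Bank', 'Italian Restaurant',
                       'Park', 'Bank'],
})

# One indicator column per category, one row per venue
onehot = pd.get_dummies(venues[['Venue Category']],
                        prefix='', prefix_sep='').astype(int)
onehot.insert(0, 'Neighborhood', venues['Neighborhood'])

# The mean of the indicators is the share of each category per neighborhood
grouped = onehot.groupby('Neighborhood').mean().reset_index()
print(grouped)
```

Each row of `grouped` sums to 1 across the category columns, so the values are relative frequencies rather than raw counts. A neighborhood with two venues and one with two hundred are therefore treated alike by the clustering step.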

In [13]:
# Get top 100 venues per neighborhood

def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

num_top_venues = 100

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # positions beyond the third fall back to the generic 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = ny_grouped['Neighborhood']

for ind in np.arange(ny_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(ny_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,91th Most Common Venue,92th Most Common Venue,93th Most Common Venue,94th Most Common Venue,95th Most Common Venue,96th Most Common Venue,97th Most Common Venue,98th Most Common Venue,99th Most Common Venue,100th Most Common Venue
0,"'Bathurst Manor', 'Downsview North', 'Wilson H...",Coffee Shop,Sandwich Place,Supermarket,Pharmacy,Pizza Place,Deli / Bodega,Diner,Bridal Shop,Restaurant,...,Hotel,Ice Cream Shop,Indian Restaurant,Indonesian Restaurant,Italian Restaurant,Men's Store,Japanese Restaurant,Jewelry Store,Juice Bar,Korean Restaurant
1,'Bayview Village',Chinese Restaurant,Café,Bank,Japanese Restaurant,Women's Store,Event Space,Cosmetics Shop,Deli / Bodega,Department Store,...,Jewelry Store,Juice Bar,Korean Restaurant,Lounge,Pet Store,Luggage Store,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station
2,"'Bedford Park', 'Lawrence Manor East'",Coffee Shop,Fast Food Restaurant,Italian Restaurant,Sandwich Place,Indian Restaurant,Liquor Store,Café,Pharmacy,Pizza Place,...,Home Service,Hotel,Ice Cream Shop,Indonesian Restaurant,Intersection,Japanese Restaurant,Jewelry Store,Korean Restaurant,Lounge,Luggage Store
3,"'CFB Toronto', 'Downsview East'",Snack Place,Other Repair Shop,Park,Airport,Golf Course,Electronics Store,Construction & Landscaping,Convenience Store,Cosmetics Shop,...,Italian Restaurant,Japanese Restaurant,Jewelry Store,Korean Restaurant,Pet Store,Liquor Store,Lounge,Luggage Store,Massage Studio,Mediterranean Restaurant
4,'Don Mills North',Gym / Fitness Center,Caribbean Restaurant,Café,Japanese Restaurant,Basketball Court,Event Space,Cosmetics Shop,Deli / Bodega,Department Store,...,Jewelry Store,Juice Bar,Korean Restaurant,Lounge,Pet Store,Luggage Store,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station
5,'Downsview Central',Home Service,Baseball Field,Food Truck,Korean Restaurant,Women's Store,Event Space,Cosmetics Shop,Deli / Bodega,Department Store,...,Jewelry Store,Juice Bar,Liquor Store,Luggage Store,Pharmacy,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station,Middle Eastern Restaurant
6,'Downsview Northwest',Gym / Fitness Center,Athletics & Sports,Liquor Store,Discount Store,Grocery Store,Asian Restaurant,Arts & Crafts Store,Deli / Bodega,Department Store,...,Japanese Restaurant,Jewelry Store,Juice Bar,Lounge,Pet Store,Luggage Store,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station
7,'Downsview West',Grocery Store,Shopping Mall,Hotel,Bank,Park,Construction & Landscaping,Convenience Store,Cosmetics Shop,Deli / Bodega,...,Japanese Restaurant,Jewelry Store,Juice Bar,Liquor Store,Pet Store,Lounge,Luggage Store,Massage Studio,Mediterranean Restaurant,Men's Store
8,"'Downsview', 'North Park', 'Upwood Park'",Park,Bakery,Basketball Court,Construction & Landscaping,Event Space,Convenience Store,Cosmetics Shop,Deli / Bodega,Department Store,...,Japanese Restaurant,Jewelry Store,Juice Bar,Liquor Store,Pet Store,Lounge,Luggage Store,Massage Studio,Mediterranean Restaurant,Men's Store
9,"'Emery', 'Humberlea'",Baseball Field,Women's Store,Fast Food Restaurant,Convenience Store,Cosmetics Shop,Deli / Bodega,Department Store,Dim Sum Restaurant,Diner,...,Jewelry Store,Juice Bar,Korean Restaurant,Lounge,Pet Store,Luggage Store,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station
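
The column labels in the table above read "91th", "92th", and so on, because every position after the third receives a "th" suffix. A helper along these lines would produce standard English ordinals instead. The function name `ordinal` is hypothetical and is not part of the notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1st, 2nd, 11th, 91st."""
    if 10 <= n % 100 <= 20:
        # 11th, 12th and 13th are exceptions to the last-digit rule
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)


columns = ['Neighborhood'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, 101)
]
print(columns[1], columns[91], columns[100])
```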


In [14]:
# Run k-means to cluster the neighborhoods into 7 clusters

# import k-means from clustering stage
from sklearn.cluster import KMeans

ny_data = NorthYork_data.drop(16)
# set number of clusters
kclusters = 7

ny_grouped_clustering = ny_grouped.drop('Neighborhood', axis=1)


# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(ny_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 
#len(kmeans.labels_)#=16
#scarborough_data.shape

array([1, 3, 1, 1, 3, 1, 1, 2, 1, 0], dtype=int32)
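
As a quick check on how unevenly the neighborhoods are spread across clusters, the ten labels printed above can be tallied. This sketch uses only those ten values, so it describes the first ten rows rather than the whole borough.

```python
from collections import Counter

# The ten labels shown in the output above
first_ten_labels = [1, 3, 1, 1, 3, 1, 1, 2, 1, 0]

counts = Counter(first_ten_labels)
largest_cluster, size = counts.most_common(1)[0]
print(sorted(counts.items()))
print(largest_cluster, size)
```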

In [15]:
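# pad the label array with one extra 0 so that its length matches ny_data
# (labels are assigned by row position, not by neighborhood name)
cluster_labels = np.append(kmeans.labels_, 0)
ny_merged = ny_data.copy()

# add clustering labels
ny_merged['Cluster Labels'] = cluster_labels

# join the sorted venue table onto the North York data by neighborhood name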
ny_merged = ny_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighbourhood')

ny_merged

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,...,91th Most Common Venue,92th Most Common Venue,93th Most Common Venue,94th Most Common Venue,95th Most Common Venue,96th Most Common Venue,97th Most Common Venue,98th Most Common Venue,99th Most Common Venue,100th Most Common Venue
0,M2H,North York,'Hillcrest Village',43.803762,-79.363452,1,Golf Course,Pool,Athletics & Sports,Mediterranean Restaurant,...,Japanese Restaurant,Jewelry Store,Juice Bar,Liquor Store,Pet Store,Lounge,Luggage Store,Massage Studio,Men's Store,Metro Station
1,M2J,North York,"'Fairview', 'Henry Farm', 'Oriole'",43.778517,-79.346556,3,Clothing Store,Fast Food Restaurant,Coffee Shop,Restaurant,...,General Entertainment,Gift Shop,Golf Course,Greek Restaurant,Grocery Store,Gym,Hockey Arena,Home Service,Hotel,Ice Cream Shop
2,M2K,North York,'Bayview Village',43.786947,-79.385975,1,Chinese Restaurant,Café,Bank,Japanese Restaurant,...,Jewelry Store,Juice Bar,Korean Restaurant,Lounge,Pet Store,Luggage Store,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station
3,M2L,North York,"'Silver Hills', 'York Mills'",43.75749,-79.374714,1,,,,,...,,,,,,,,,,
4,M2M,North York,"'Newtonbrook', 'Willowdale'",43.789053,-79.408493,3,,,,,...,,,,,,,,,,
5,M2N,North York,'Willowdale South',43.77012,-79.408493,1,Restaurant,Ramen Restaurant,Coffee Shop,Café,...,Golf Course,Greek Restaurant,Gym,Wings Joint,Hockey Arena,Indian Restaurant,Mediterranean Restaurant,Intersection,Italian Restaurant,Jewelry Store
6,M2P,North York,'York Mills West',43.752758,-79.400049,1,Park,Convenience Store,Bank,Women's Store,...,Japanese Restaurant,Jewelry Store,Juice Bar,Liquor Store,Pet Store,Lounge,Luggage Store,Massage Studio,Mediterranean Restaurant,Men's Store
7,M2R,North York,'Willowdale West',43.782736,-79.442259,2,Pharmacy,Butcher,Discount Store,Coffee Shop,...,Japanese Restaurant,Jewelry Store,Juice Bar,Liquor Store,Park,Lounge,Luggage Store,Massage Studio,Mediterranean Restaurant,Men's Store
8,M3A,North York,'Parkwoods',43.753259,-79.329656,1,Park,Food & Drink Shop,Fast Food Restaurant,Women's Store,...,Japanese Restaurant,Jewelry Store,Juice Bar,Liquor Store,Pet Store,Lounge,Luggage Store,Massage Studio,Mediterranean Restaurant,Men's Store
9,M3B,North York,'Don Mills North',43.745906,-79.352188,0,Gym / Fitness Center,Caribbean Restaurant,Café,Japanese Restaurant,...,Jewelry Store,Juice Bar,Korean Restaurant,Lounge,Pet Store,Luggage Store,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station
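
Because `ny_grouped` is ordered by neighborhood name (and omits neighborhoods for which no venues were returned), while the North York table keeps its original row order, labels attached by position may not line up with the rows they were computed for. One way to keep them aligned is to key the labels by neighborhood name before joining. This sketch uses made-up stand-in tables; `data`, `grouped`, and `labels` are hypothetical and do not reproduce the notebook's results.

```python
import pandas as pd

# 'data' keeps the original row order; 'grouped' is sorted by name
# (as groupby does) and has no row for neighborhood 'C'.
data = pd.DataFrame({'Neighbourhood': ['C', 'B', 'A'],
                     'Latitude': [43.70, 43.75, 43.80]})
grouped = pd.DataFrame({'Neighborhood': ['A', 'B']})
labels = [0, 1]  # one cluster label per row of 'grouped'

# Index the labels by neighborhood name, then join on that name
label_by_name = pd.Series(labels, index=grouped['Neighborhood'],
                          name='Cluster Labels')
merged = data.join(label_by_name, on='Neighbourhood')
print(merged)
```

Neighborhoods without venues end up with a missing label instead of an arbitrary one, which makes them easy to filter out before mapping.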


In [19]:
# Visualize the clusters in the map

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# create map
map_clusters = folium.Map(location = [toronto_latitude, toronto_longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(ny_merged['Latitude'], ny_merged['Longitude'], ny_merged['Neighbourhood'], ny_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [20]:
ny_merged.loc[ny_merged['Cluster Labels'] == 0, ny_merged.columns[[1] + list(range(5, ny_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,...,91th Most Common Venue,92th Most Common Venue,93th Most Common Venue,94th Most Common Venue,95th Most Common Venue,96th Most Common Venue,97th Most Common Venue,98th Most Common Venue,99th Most Common Venue,100th Most Common Venue
9,North York,0,Gym / Fitness Center,Caribbean Restaurant,Café,Japanese Restaurant,Basketball Court,Event Space,Cosmetics Shop,Deli / Bodega,...,Jewelry Store,Juice Bar,Korean Restaurant,Lounge,Pet Store,Luggage Store,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station
23,North York,0,Baseball Field,Women's Store,Fast Food Restaurant,Convenience Store,Cosmetics Shop,Deli / Bodega,Department Store,Dim Sum Restaurant,...,Jewelry Store,Juice Bar,Korean Restaurant,Lounge,Pet Store,Luggage Store,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station


In [21]:
ny_merged.loc[ny_merged['Cluster Labels'] == 1, ny_merged.columns[[1] + list(range(5, ny_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,...,91th Most Common Venue,92th Most Common Venue,93th Most Common Venue,94th Most Common Venue,95th Most Common Venue,96th Most Common Venue,97th Most Common Venue,98th Most Common Venue,99th Most Common Venue,100th Most Common Venue
0,North York,1,Golf Course,Pool,Athletics & Sports,Mediterranean Restaurant,Dog Run,Women's Store,Empanada Restaurant,Convenience Store,...,Japanese Restaurant,Jewelry Store,Juice Bar,Liquor Store,Pet Store,Lounge,Luggage Store,Massage Studio,Men's Store,Metro Station
2,North York,1,Chinese Restaurant,Café,Bank,Japanese Restaurant,Women's Store,Event Space,Cosmetics Shop,Deli / Bodega,...,Jewelry Store,Juice Bar,Korean Restaurant,Lounge,Pet Store,Luggage Store,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station
3,North York,1,,,,,,,,,...,,,,,,,,,,
5,North York,1,Restaurant,Ramen Restaurant,Coffee Shop,Café,Japanese Restaurant,Sushi Restaurant,Sandwich Place,Ice Cream Shop,...,Golf Course,Greek Restaurant,Gym,Wings Joint,Hockey Arena,Indian Restaurant,Mediterranean Restaurant,Intersection,Italian Restaurant,Jewelry Store
6,North York,1,Park,Convenience Store,Bank,Women's Store,Event Space,Cosmetics Shop,Deli / Bodega,Department Store,...,Japanese Restaurant,Jewelry Store,Juice Bar,Liquor Store,Pet Store,Lounge,Luggage Store,Massage Studio,Mediterranean Restaurant,Men's Store
8,North York,1,Park,Food & Drink Shop,Fast Food Restaurant,Women's Store,Empanada Restaurant,Convenience Store,Cosmetics Shop,Deli / Bodega,...,Japanese Restaurant,Jewelry Store,Juice Bar,Liquor Store,Pet Store,Lounge,Luggage Store,Massage Studio,Mediterranean Restaurant,Men's Store
10,North York,1,Coffee Shop,Gym,Asian Restaurant,Beer Store,Chinese Restaurant,Bike Shop,Clothing Store,Dim Sum Restaurant,...,Indonesian Restaurant,Intersection,Jewelry Store,Juice Bar,Liquor Store,Other Repair Shop,Lounge,Luggage Store,Massage Studio,Mediterranean Restaurant
11,North York,1,Coffee Shop,Sandwich Place,Supermarket,Pharmacy,Pizza Place,Deli / Bodega,Diner,Bridal Shop,...,Hotel,Ice Cream Shop,Indian Restaurant,Indonesian Restaurant,Italian Restaurant,Men's Store,Japanese Restaurant,Jewelry Store,Juice Bar,Korean Restaurant
12,North York,1,Coffee Shop,Miscellaneous Shop,Metro Station,Bar,Massage Studio,Women's Store,Electronics Store,Cosmetics Shop,...,Italian Restaurant,Japanese Restaurant,Jewelry Store,Korean Restaurant,Pharmacy,Liquor Store,Lounge,Luggage Store,Mediterranean Restaurant,Men's Store
13,North York,1,Snack Place,Other Repair Shop,Park,Airport,Golf Course,Electronics Store,Construction & Landscaping,Convenience Store,...,Italian Restaurant,Japanese Restaurant,Jewelry Store,Korean Restaurant,Pet Store,Liquor Store,Lounge,Luggage Store,Massage Studio,Mediterranean Restaurant


In [22]:
ny_merged.loc[ny_merged['Cluster Labels'] == 2, ny_merged.columns[[1] + list(range(5, ny_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,...,91th Most Common Venue,92th Most Common Venue,93th Most Common Venue,94th Most Common Venue,95th Most Common Venue,96th Most Common Venue,97th Most Common Venue,98th Most Common Venue,99th Most Common Venue,100th Most Common Venue
7,North York,2,Pharmacy,Butcher,Discount Store,Coffee Shop,Pizza Place,Grocery Store,Athletics & Sports,Bakery,...,Japanese Restaurant,Jewelry Store,Juice Bar,Liquor Store,Park,Lounge,Luggage Store,Massage Studio,Mediterranean Restaurant,Men's Store


In [23]:
ny_merged.loc[ny_merged['Cluster Labels'] == 3, ny_merged.columns[[1] + list(range(5, ny_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,...,91th Most Common Venue,92th Most Common Venue,93th Most Common Venue,94th Most Common Venue,95th Most Common Venue,96th Most Common Venue,97th Most Common Venue,98th Most Common Venue,99th Most Common Venue,100th Most Common Venue
1,North York,3,Clothing Store,Fast Food Restaurant,Coffee Shop,Restaurant,Asian Restaurant,Food Court,Japanese Restaurant,Bakery,...,General Entertainment,Gift Shop,Golf Course,Greek Restaurant,Grocery Store,Gym,Hockey Arena,Home Service,Hotel,Ice Cream Shop
4,North York,3,,,,,,,,,...,,,,,,,,,,


In [24]:
ny_merged.loc[ny_merged['Cluster Labels'] == 4, ny_merged.columns[[1] + list(range(5, ny_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,...,91th Most Common Venue,92th Most Common Venue,93th Most Common Venue,94th Most Common Venue,95th Most Common Venue,96th Most Common Venue,97th Most Common Venue,98th Most Common Venue,99th Most Common Venue,100th Most Common Venue
14,North York,4,Grocery Store,Shopping Mall,Hotel,Bank,Park,Construction & Landscaping,Convenience Store,Cosmetics Shop,...,Japanese Restaurant,Jewelry Store,Juice Bar,Liquor Store,Pet Store,Lounge,Luggage Store,Massage Studio,Mediterranean Restaurant,Men's Store


## Conclusion

Based on the above analysis, cluster 1 appears to be one of the best regions for investment, as it covers mature business areas with a dense mix of other venues. The remaining clusters are comparatively smaller and quieter, so opening a restaurant there would be riskier.