# Segmentation and Clustering Neighborhoods in Toronto

### Q1. Create a dataframe in the following format:
<img src="https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/7JXaz3NNEeiMwApe4i-fLg_40e690ae0e927abda2d4bde7d94ed133_Screen-Shot-2018-06-18-at-7.17.57-PM.png?expiry=1592179200000&hmac=STnpJHiHoJcl4S-g8Iaaw1Wc7fwWGhaPPXa2bebTKLY">

In [1]:
import pandas as pd
import numpy as np
import json
import requests
#!conda install -c conda-forge folium --yes
import folium
from sklearn.cluster import KMeans
#!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim
from bs4 import BeautifulSoup
import matplotlib.cm as cm
import matplotlib.colors as colors
%matplotlib inline

### Loading data from Wikipedia using BeautifulSoup

In [2]:
data = requests.get('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M').text
soup = BeautifulSoup(data,'html.parser')
PostalCode = []
Borough = []
Neighborhood = []

In [3]:
for row in soup.find('table').find_all('tr'):
    td = row.find_all('td')
    if(len(td) > 0):
        PostalCode.append(td[0].text.rstrip('\n'))
        Borough.append(td[1].text.rstrip('\n'))
        Neighborhood.append(td[2].text.rstrip('\n'))

In [4]:
df = pd.DataFrame({'PostalCode':PostalCode,'Borough':Borough,'Neighborhood':Neighborhood})
df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"


In [5]:
df_toronto = df[df['Borough'] != "Not assigned"].reset_index(drop = True)
df_toronto.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park, Harbourfront"
3,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"


In [6]:
toronto_grouped = df_toronto.groupby('PostalCode').agg(lambda x: ",".join(x)).reset_index()
toronto_grouped.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M1B,Scarborough,"Malvern, Rouge"
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek"
2,M1E,Scarborough,"Guildwood, Morningside, West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae


In [7]:
toronto_grouped.shape

(103, 3)
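The cleaning steps above (dropping rows whose borough is "Not assigned", then collapsing to one row per postal code) can be sketched on a small table. The rows below are made up for illustration and are not taken from the Wikipedia page; keeping the first borough per code instead of joining boroughs is also an assumption of this sketch.

```python
import pandas as pd

# Toy rows (made up for illustration); the real table comes from Wikipedia.
toy = pd.DataFrame({
    "PostalCode": ["M1A", "M5A", "M5A", "M3A"],
    "Borough": ["Not assigned", "Downtown Toronto", "Downtown Toronto", "North York"],
    "Neighborhood": ["Not assigned", "Regent Park", "Harbourfront", "Parkwoods"],
})

# Drop postal codes that have no borough.
assigned = toy[toy["Borough"] != "Not assigned"]

# One row per postal code: keep the first borough, join neighborhood names.
combined = (
    assigned.groupby("PostalCode", as_index=False)
    .agg({"Borough": "first", "Neighborhood": ", ".join})
)
print(combined)
```

Aggregating each column separately avoids repeating the borough name when a postal code spans several rows.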

### Required Dataframe for Problem 1

In [8]:
col = ["PostalCode", "Borough", "Neighborhood"]
lst = ["M5G","M2H","M4B","M1J","M4G","M4M","M1R","M9V","M9L","M5V","M1B","M5A"]
# DataFrame.append was removed in pandas 2.0, so collect the matching rows and concatenate once.
# Iterating over lst keeps the rows in the requested order.
q1_df = pd.concat([toronto_grouped[toronto_grouped['PostalCode'] == code] for code in lst], ignore_index = True)[col]
q1_df

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M5G,Downtown Toronto,Central Bay Street
1,M2H,North York,Hillcrest Village
2,M4B,East York,"Parkview Hill, Woodbine Gardens"
3,M1J,Scarborough,Scarborough Village
4,M4G,East York,Leaside
5,M4M,East Toronto,Studio District
6,M1R,Scarborough,"Wexford, Maryvale"
7,M9V,Etobicoke,"South Steeles, Silverstone, Humbergate, Jamest..."
8,M9L,North York,Humber Summit
9,M5V,Downtown Toronto,"CN Tower, King and Spadina, Railway Lands, Har..."


### Q2. Create a dataframe in the following format:
<img src="https://d3c33hcgiwev3.cloudfront.net/imageAssetProxy.v1/HZ3jNHNOEeiMwApe4i-fLg_f44f0f10ccfaf42fcbdba9813364e173_Screen-Shot-2018-06-18-at-7.18.16-PM.png?expiry=1592179200000&hmac=p6GIZYcyxkyUGnS1AuBNfb4WoR32RSXMrT3Z4QuJatg">

### Loading the CSV file from the assignment

In [9]:
geo_df = pd.read_csv("https://cocl.us/Geospatial_data")
geo_df.rename(columns = {'Postal Code':"PostalCode"},inplace = True)
geo_df.head()

Unnamed: 0,PostalCode,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [10]:
toronto_merged = toronto_grouped.merge(geo_df,on = "PostalCode")
toronto_merged.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476


In [11]:
toronto_merged.shape

(103, 5)
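The default inner merge silently drops postal codes that have no coordinates, so the row count above is the only check that nothing was lost. A minimal sketch of a more explicit check, using a left merge with `indicator=True` on a small hand-built frame (the third postal code is deliberately left without coordinates for illustration):

```python
import pandas as pd

left = pd.DataFrame({"PostalCode": ["M1B", "M1C", "M1E"],
                     "Borough": ["Scarborough"] * 3})
right = pd.DataFrame({"PostalCode": ["M1B", "M1C"],
                      "Latitude": [43.806686, 43.784535],
                      "Longitude": [-79.194353, -79.160497]})

# how="left" keeps every postal code; the _merge column records where each row matched.
checked = left.merge(right, on="PostalCode", how="left", indicator=True)

# Postal codes that found no coordinates.
missing = checked.loc[checked["_merge"] == "left_only", "PostalCode"].tolist()
print(missing)
```

An empty `missing` list would confirm that every postal code received a latitude and longitude.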

### Required Dataframe for Problem 2

In [12]:
col = ['PostalCode','Borough','Neighborhood','Latitude','Longitude']
# DataFrame.append was removed in pandas 2.0, so concatenate the matching rows instead
q2_df = pd.concat([toronto_merged[toronto_merged['PostalCode'] == code] for code in lst], ignore_index = True)[col]
q2_df

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383
1,M2H,North York,Hillcrest Village,43.803762,-79.363452
2,M4B,East York,"Parkview Hill, Woodbine Gardens",43.706397,-79.309937
3,M1J,Scarborough,Scarborough Village,43.744734,-79.239476
4,M4G,East York,Leaside,43.70906,-79.363452
5,M4M,East Toronto,Studio District,43.659526,-79.340923
6,M1R,Scarborough,"Wexford, Maryvale",43.750072,-79.295849
7,M9V,Etobicoke,"South Steeles, Silverstone, Humbergate, Jamest...",43.739416,-79.588437
8,M9L,North York,Humber Summit,43.756303,-79.565963
9,M5V,Downtown Toronto,"CN Tower, King and Spadina, Railway Lands, Har...",43.628947,-79.39442


### Q3. Generate clusters of similar neighborhoods

## Map of all Neighborhoods in Toronto

In [13]:
address = 'Toronto'
geolocator = Nominatim(user_agent='my-application')
location = geolocator.geocode(address)
lat = location.latitude
long = location.longitude

map_toronto = folium.Map(location=[lat,long],zoom_start=10)
for lat,long,borough,neighborhood in zip(toronto_merged['Latitude'],toronto_merged['Longitude'],toronto_merged['Borough'],toronto_merged['Neighborhood']):
    label = "{}, {}".format(neighborhood,borough)
    label = folium.Popup(label,parse_html=True)
    folium.CircleMarker(
        [lat,long],
        radius = 5,
        popup = label,
        color = 'blue',
        fill = True,
        fill_color = '#3186cc',
        fill_opacity = 0.6
    ).add_to(map_toronto)
    
map_toronto

In [14]:
# The code was removed by Watson Studio for sharing.
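The hidden cell presumably defined the names that the Foursquare request below relies on (`client_id`, `client_secret`, `version`, `rad`, `lim`). A minimal sketch of such a cell follows; every value and both environment variable names are hypothetical placeholders, since the originals were removed before sharing.

```python
import os

# Hypothetical placeholders: the original values were removed before sharing.
# Reading credentials from the environment keeps them out of the notebook.
client_id = os.environ.get("FOURSQUARE_CLIENT_ID", "YOUR_CLIENT_ID")
client_secret = os.environ.get("FOURSQUARE_CLIENT_SECRET", "YOUR_CLIENT_SECRET")

version = "20180605"  # assumed API version date in YYYYMMDD form
rad = 500             # assumed search radius in metres
lim = 100             # assumed maximum number of venues per request
```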

### Getting information about Neighborhoods from the Foursquare API

In [15]:
venues = []
for lat,long,post,borough,neighborhood in zip(toronto_merged['Latitude'],toronto_merged['Longitude'],toronto_merged['PostalCode'],toronto_merged['Borough'],toronto_merged['Neighborhood']):
    url = "https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}".format(client_id,
                                                                                                                               client_secret,
                                                                                                                               version,
                                                                                                                               lat,
                                                                                                                               long,
                                                                                                                               rad,
                                                                                                                               lim)
    results = requests.get(url).json()["response"]['groups'][0]['items']
    for venue in results:
        venues.append((post,borough,neighborhood,lat,long,
                      venue['venue']['name'],
                      venue['venue']['location']['lat'],
                      venue['venue']['location']['lng'],
                      venue['venue']['categories'][0]['name']
                     ))
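The loop above indexes straight into the response, so a neighborhood with no result group or a venue with an empty category list would raise an error. One possible defensive variant is sketched below; `extract_venues` is a hypothetical helper, and the sample payload is hand-built to mirror only the keys the loop reads.

```python
def extract_venues(payload, post, borough, neighborhood, lat, long):
    """Flatten one explore-style response into rows, skipping incomplete items."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        categories = venue.get("categories") or []
        location = venue.get("location", {})
        if not categories or "lat" not in location or "lng" not in location:
            continue  # skip venues with no category or no coordinates
        rows.append((post, borough, neighborhood, lat, long,
                     venue.get("name"), location["lat"], location["lng"],
                     categories[0]["name"]))
    return rows

# Hand-built payload shaped like the response the loop above indexes into.
sample = {"response": {"groups": [{"items": [
    {"venue": {"name": "Wendy's", "location": {"lat": 43.807448, "lng": -79.199056},
               "categories": [{"name": "Fast Food Restaurant"}]}},
    {"venue": {"name": "No category", "location": {"lat": 43.8, "lng": -79.2},
               "categories": []}},
]}]}}

rows = extract_venues(sample, "M1B", "Scarborough", "Malvern, Rouge", 43.806686, -79.194353)
print(rows)
```

In the notebook loop, the result of `requests.get(url).json()` would be passed as `payload` and the returned rows appended to `venues`.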
    

### Constructing DataFrame for Neighborhood data

In [16]:
venue_df = pd.DataFrame(venues)
venue_df.columns = ['PostalCode','Borough','Neighborhood','Borough_Latitude','Borough_Longitude','Venue_Name','Venue_Latitude','Venue_Longitude','Venue_Category']
venue_df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Borough_Latitude,Borough_Longitude,Venue_Name,Venue_Latitude,Venue_Longitude,Venue_Category
0,M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353,Wendy’s,43.807448,-79.199056,Fast Food Restaurant
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497,Chris Effects Painting,43.784343,-79.163742,Construction & Landscaping
2,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497,Royal Canadian Legion,43.782533,-79.163085,Bar
3,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711,RBC Royal Bank,43.76679,-79.191151,Bank
4,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711,G & G Electronics,43.765309,-79.191537,Electronics Store


In [17]:
toronto_onehot = pd.get_dummies(venue_df[['Venue_Category']], prefix="", prefix_sep="")
toronto_onehot['PostalCode'] = venue_df['PostalCode'] 
toronto_onehot['Borough'] = venue_df['Borough'] 
toronto_onehot['Neighborhood'] = venue_df['Neighborhood'] 
toronto_onehot['Latitude'] = venue_df['Borough_Latitude']
toronto_onehot['Longitude'] = venue_df['Borough_Longitude']
fixed_columns = list(toronto_onehot.columns[-5:]) + list(toronto_onehot.columns[:-5])
toronto_onehot = toronto_onehot[fixed_columns]
toronto_grouped = toronto_onehot.groupby(['PostalCode','Borough','Neighborhood','Latitude','Longitude']).mean().reset_index()
toronto_grouped.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Yoga Studio,Accessories Store,Afghan Restaurant,Airport,Airport Food Court,...,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,M1G,Scarborough,Woburn,43.770992,-79.216917,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
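Each value in the table above is the share of a postal code's venues that belong to a category: one-hot encoding turns every venue into a row of zeros and a single one, and the group mean turns those into frequencies. A minimal sketch on made-up venues:

```python
import pandas as pd

# Made-up venues for illustration: M1C has two bars and one park.
venues = pd.DataFrame({
    "PostalCode": ["M1B", "M1C", "M1C", "M1C"],
    "Venue_Category": ["Fast Food Restaurant", "Bar", "Bar", "Park"],
})

# One column per category, with 1.0 marking the category of each venue.
onehot = pd.get_dummies(venues["Venue_Category"], dtype=float)
onehot.insert(0, "PostalCode", venues["PostalCode"])

# The mean of the indicator columns is the category frequency per postal code.
freq = onehot.groupby("PostalCode").mean()
print(freq)
```

Because every venue contributes exactly one 1.0, each row of `freq` sums to one.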


In [18]:
num_top_venues = 10
indicators = ['st','nd','rd']
col = ['PostalCode','Borough','Neighborhood','Latitude','Longitude']
for ind in np.arange(num_top_venues):
    try:
        col.append('{}{} Most Common Venue'.format(ind+1,indicators[ind]))
    except IndexError:  # ranks beyond 3rd take the 'th' suffix
        col.append('{}th Most Common Venue'.format(ind+1))

neighborhood_venue_sorted = pd.DataFrame(columns=col)
neighborhood_venue_sorted['PostalCode'] = toronto_grouped['PostalCode']
neighborhood_venue_sorted['Borough'] = toronto_grouped['Borough']
neighborhood_venue_sorted['Neighborhood'] = toronto_grouped['Neighborhood']
neighborhood_venue_sorted['Latitude'] = toronto_grouped['Latitude']
neighborhood_venue_sorted['Longitude'] = toronto_grouped['Longitude']
for ind in np.arange(toronto_grouped.shape[0]):
    categories = toronto_grouped.iloc[ind,5:]
    categories_sorted = categories.sort_values(ascending = False)
    neighborhood_venue_sorted.iloc[ind,5:] = categories_sorted.index.values[0:num_top_venues]

neighborhood_venue_sorted.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353,Fast Food Restaurant,Drugstore,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant,Field
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497,Construction & Landscaping,Bar,Women's Store,Drugstore,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711,Medical Center,Mexican Restaurant,Breakfast Spot,Electronics Store,Rental Car Location,Intersection,Bank,Doner Restaurant,Distribution Center,Dog Run
3,M1G,Scarborough,Woburn,43.770992,-79.216917,Coffee Shop,Korean Restaurant,Women's Store,Dumpling Restaurant,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Drugstore
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476,Hakka Restaurant,Gas Station,Bakery,Athletics & Sports,Bank,Caribbean Restaurant,Thai Restaurant,Fried Chicken Joint,Dumpling Restaurant,Drugstore
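The column labels above rely on a three-item suffix list with a fallback to "th", which is enough for ten columns. If `num_top_venues` were raised past 20, a suffix rule that also handles 11th to 13th and 21st would be needed. A minimal sketch, where `ordinal` is a hypothetical helper and not part of the notebook:

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 10th through 20th always take 'th'
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

columns = [f"{ordinal(i)} Most Common Venue" for i in range(1, 11)]
print(columns)
```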


### Cluster Analysis Process

In [19]:
kcluster = 5
toronto_clustering = toronto_grouped.drop(["PostalCode","Borough","Neighborhood"], axis=1)
kmeans = KMeans(n_clusters=kcluster, random_state=0).fit(toronto_clustering)
neighborhood_venue_sorted['Cluster label'] = kmeans.labels_
neighborhood_venue_sorted.head(10)

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Cluster label
0,M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353,Fast Food Restaurant,Drugstore,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant,Field,3
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497,Construction & Landscaping,Bar,Women's Store,Drugstore,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant,4
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711,Medical Center,Mexican Restaurant,Breakfast Spot,Electronics Store,Rental Car Location,Intersection,Bank,Doner Restaurant,Distribution Center,Dog Run,3
3,M1G,Scarborough,Woburn,43.770992,-79.216917,Coffee Shop,Korean Restaurant,Women's Store,Dumpling Restaurant,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Drugstore,0
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476,Hakka Restaurant,Gas Station,Bakery,Athletics & Sports,Bank,Caribbean Restaurant,Thai Restaurant,Fried Chicken Joint,Dumpling Restaurant,Drugstore,3
5,M1J,Scarborough,Scarborough Village,43.744734,-79.239476,Playground,Construction & Landscaping,Women's Store,Dumpling Restaurant,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Drugstore,4
6,M1K,Scarborough,"Kennedy Park, Ionview, East Birchmount Park",43.727929,-79.262029,Discount Store,Bus Station,Department Store,Coffee Shop,Convenience Store,Drugstore,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,3
7,M1L,Scarborough,"Golden Mile, Clairlea, Oakridge",43.711112,-79.284577,Bakery,Bus Line,Park,Soccer Field,Metro Station,Intersection,Bus Station,Ice Cream Shop,Drugstore,Dumpling Restaurant,3
8,M1M,Scarborough,"Cliffside, Cliffcrest, Scarborough Village West",43.716316,-79.239476,Motel,American Restaurant,Women's Store,Dim Sum Restaurant,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Drugstore,3
9,M1N,Scarborough,"Birch Cliff, Cliffside West",43.692657,-79.264848,Café,General Entertainment,Skating Rink,College Stadium,Concert Hall,Construction & Landscaping,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,3
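The number of clusters is fixed at 5 above. One common way to sanity-check such a choice is to compare silhouette scores across several values of k. The sketch below does this on synthetic points, not on the Toronto data; the three blob centers and the range of k are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic data: three tight, well-separated blobs of 20 points each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + rng.normal(scale=0.05, size=(20, 2)) for c in centers])

# Fit k-means for each candidate k and record the mean silhouette score.
scores = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(scores, best_k)
```

On the notebook's data, `X` would be replaced by `toronto_clustering`; a higher score indicates tighter, better-separated clusters.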


### Marking different Clusters on Map

In [26]:
# Center on the geocoded Toronto coordinates; lat and long were reused as loop variables in earlier cells
map_clusters = folium.Map(location=[location.latitude, location.longitude], zoom_start=11)
# One distinct color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kcluster))
rainbow = [colors.rgb2hex(i) for i in colors_array]
for lat,lon,post,bor,poi,cluster in zip(neighborhood_venue_sorted['Latitude'],neighborhood_venue_sorted['Longitude'],neighborhood_venue_sorted['PostalCode'],neighborhood_venue_sorted['Borough'],neighborhood_venue_sorted['Neighborhood'],neighborhood_venue_sorted['Cluster label']):
    label = folium.Popup('{} ({}): {} - Cluster {}'.format(bor, post, poi, cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### Information about Each Cluster

#### Cluster 1

In [27]:
neighborhood_venue_sorted.loc[neighborhood_venue_sorted['Cluster label'] == 0, neighborhood_venue_sorted.columns[[1] + list(range(5,neighborhood_venue_sorted.shape[1]))]]

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Cluster label
3,Scarborough,Coffee Shop,Korean Restaurant,Women's Store,Dumpling Restaurant,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Drugstore,0
78,York,Restaurant,Sandwich Place,Coffee Shop,Discount Store,Bar,Women's Store,Doner Restaurant,Diner,Distribution Center,Dog Run,0


#### Cluster 2

In [28]:
neighborhood_venue_sorted.loc[neighborhood_venue_sorted['Cluster label'] == 1, neighborhood_venue_sorted.columns[[1] + list(range(5,neighborhood_venue_sorted.shape[1]))]]

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Cluster label
89,Etobicoke,Baseball Field,Women's Store,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Dim Sum Restaurant,1
93,North York,Food Service,Baseball Field,Women's Store,Field,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Drugstore,1


#### Cluster 3

In [29]:
neighborhood_venue_sorted.loc[neighborhood_venue_sorted['Cluster label'] == 2, neighborhood_venue_sorted.columns[[1] + list(range(5,neighborhood_venue_sorted.shape[1]))]]

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Cluster label
21,North York,Park,Convenience Store,Women's Store,Dumpling Restaurant,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Drugstore,2
23,North York,Food & Drink Shop,Park,Fireworks Store,Women's Store,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Drugstore,2
28,North York,Park,Business Service,Snack Place,Airport,Drugstore,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,2
29,North York,Park,Bank,Shopping Mall,Grocery Store,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Drugstore,2
38,East York,Park,Intersection,Convenience Store,Metro Station,Donut Shop,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Drugstore,2
42,Central Toronto,Park,Bus Line,Swim School,Fast Food Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,2
46,Central Toronto,Gym,Park,Women's Store,Drugstore,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,2
48,Downtown Toronto,Park,Trail,Playground,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,2
72,York,Park,Pool,Women's Store,Greek Restaurant,Deli / Bodega,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Drugstore,2
77,North York,Basketball Court,Park,Construction & Landscaping,Bakery,Trail,Dumpling Restaurant,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,2


#### Cluster 4

In [30]:
neighborhood_venue_sorted.loc[neighborhood_venue_sorted['Cluster label'] == 3, neighborhood_venue_sorted.columns[[1] + list(range(5,neighborhood_venue_sorted.shape[1]))]]

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Cluster label
0,Scarborough,Fast Food Restaurant,Drugstore,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant,Field,3
2,Scarborough,Medical Center,Mexican Restaurant,Breakfast Spot,Electronics Store,Rental Car Location,Intersection,Bank,Doner Restaurant,Distribution Center,Dog Run,3
4,Scarborough,Hakka Restaurant,Gas Station,Bakery,Athletics & Sports,Bank,Caribbean Restaurant,Thai Restaurant,Fried Chicken Joint,Dumpling Restaurant,Drugstore,3
6,Scarborough,Discount Store,Bus Station,Department Store,Coffee Shop,Convenience Store,Drugstore,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,3
7,Scarborough,Bakery,Bus Line,Park,Soccer Field,Metro Station,Intersection,Bus Station,Ice Cream Shop,Drugstore,Dumpling Restaurant,3
8,Scarborough,Motel,American Restaurant,Women's Store,Dim Sum Restaurant,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Drugstore,3
9,Scarborough,Café,General Entertainment,Skating Rink,College Stadium,Concert Hall,Construction & Landscaping,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,3
10,Scarborough,Indian Restaurant,Pet Store,Vietnamese Restaurant,Chinese Restaurant,Donut Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,3
11,Scarborough,Middle Eastern Restaurant,Auto Garage,Breakfast Spot,Bakery,Sandwich Place,Donut Shop,Distribution Center,Dog Run,Doner Restaurant,Drugstore,3
12,Scarborough,Lounge,Latin American Restaurant,Breakfast Spot,Skating Rink,Clothing Store,Drugstore,Discount Store,Distribution Center,Dog Run,Doner Restaurant,3


#### Cluster 5

In [31]:
neighborhood_venue_sorted.loc[neighborhood_venue_sorted['Cluster label'] == 4, neighborhood_venue_sorted.columns[[1] + list(range(5,neighborhood_venue_sorted.shape[1]))]]

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Cluster label
1,Scarborough,Construction & Landscaping,Bar,Women's Store,Drugstore,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant,4
5,Scarborough,Playground,Construction & Landscaping,Women's Store,Dumpling Restaurant,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Drugstore,4
14,Scarborough,Park,Playground,Donut Shop,Dim Sum Restaurant,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Drugstore,4


#### Observation:

Cluster n below corresponds to `Cluster label` n-1 in the tables above. Most neighborhoods fall into cluster 4, which has venues such as banks, supermarkets, and parks. Cluster 1 has coffee shops and restaurants.
Cluster 2 has women's stores and baseball fields. Cluster 3 has parks, business services, shopping malls, and similar venues. Cluster 5 has bars and playgrounds.
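
Observations like these can be backed by a count of neighborhoods per cluster and the most frequent top venue within each cluster. A minimal sketch on made-up labels follows; in the notebook the same two lines would run on `neighborhood_venue_sorted`.

```python
import pandas as pd

# Made-up labels and venues for illustration only.
labeled = pd.DataFrame({
    "Cluster label": [3, 3, 3, 2, 2, 0],
    "1st Most Common Venue": ["Bakery", "Bank", "Bakery", "Park", "Park", "Coffee Shop"],
})

# Number of neighborhoods in each cluster.
sizes = labeled["Cluster label"].value_counts().sort_index()

# Most frequent top-ranked venue category within each cluster.
top_venue = labeled.groupby("Cluster label")["1st Most Common Venue"].agg(
    lambda s: s.mode().iloc[0]
)
print(sizes)
print(top_venue)
```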