# Segmenting and Clustering Neighborhoods in the city of Toronto and Manhattan

My capstone project looks at the neighborhoods of cities in different countries and compares them by building clusters. The research question is sociological rather than a business problem. The idea is to check whether the neighborhoods of one city all end up in the same cluster, or whether similar neighborhood structures exist in different countries. The result can be a signal of whether the national culture is dominant or whether local cultures develop on their own. The two cities analyzed are New York and Toronto. For further research, the analysis could be repeated several times to see whether characteristics of the country have an impact on which culture (national or local) is the more important one.

# Preparation 

In [3]:
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis
import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # installs geopy; comment this line out if geopy is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes # installs folium; comment this line out if folium is already installed
import folium # map rendering library

print('Libraries imported.')


## Get the data for Toronto



Read the Wikipedia table from the URL and do some data wrangling

In [4]:
url='http://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'
canada=pd.read_html(url, header=0)[0]
canada=canada[canada.Borough!='Not assigned']
canada=canada[canada.Neighborhood!='Not assigned'] # remove rows where Borough or Neighborhood is not assigned
df=canada.groupby(['Postcode','Borough'])['Neighborhood'].agg(lambda x: ','.join(set(x))).reset_index() # list all neighborhoods that belong to one postcode; the result is named df from here on
df.rename(columns={'Postcode':'PostalCode'}, inplace=True)
df.head()

| | PostalCode | Borough | Neighborhood |
|---|---|---|---|
| 0 | M1B | Scarborough | Malvern,Rouge |
| 1 | M1C | Scarborough | Highland Creek,Rouge Hill,Port Union |
| 2 | M1E | Scarborough | West Hill,Morningside,Guildwood |
| 3 | M1G | Scarborough | Woburn |
| 4 | M1H | Scarborough | Cedarbrae |
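
The wrangling steps above can be tried out on a small hand-made frame. The sketch below uses made-up rows rather than the Wikipedia table, and it sorts the neighborhood names before joining them, which is an assumption added here so that the result is reproducible (a plain `set` does not guarantee an order).

```python
import pandas as pd

# Toy rows standing in for the Wikipedia table (illustrative values only)
raw = pd.DataFrame({
    'Postcode': ['M1A', 'M1B', 'M1B', 'M1C'],
    'Borough': ['Not assigned', 'Scarborough', 'Scarborough', 'Scarborough'],
    'Neighborhood': ['Not assigned', 'Rouge', 'Malvern', 'Port Union'],
})

# Drop rows without a borough or without a neighborhood
clean = raw[(raw.Borough != 'Not assigned') & (raw.Neighborhood != 'Not assigned')]

# One row per postcode; sorted() keeps the joined names in a stable order
toy = (clean.groupby(['Postcode', 'Borough'])['Neighborhood']
            .agg(lambda x: ','.join(sorted(set(x))))
            .reset_index())
print(toy)
```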


Read the CSV file with the latitude and longitude of each postal code

In [5]:
df_geo=pd.read_csv('http://cocl.us/Geospatial_data')
df_geo.rename(columns={'Postal Code':'PostalCode'}, inplace=True)


Join it with df

In [6]:
df_joined=df.join(df_geo.set_index('PostalCode'), on='PostalCode')

Check the new Dataframe df_joined

In [8]:
print(df_joined.shape)
df_joined.head()

(102, 5)


| | PostalCode | Borough | Neighborhood | Latitude | Longitude |
|---|---|---|---|---|---|
| 0 | M1B | Scarborough | Malvern,Rouge | 43.806686 | -79.194353 |
| 1 | M1C | Scarborough | Highland Creek,Rouge Hill,Port Union | 43.784535 | -79.160497 |
| 2 | M1E | Scarborough | West Hill,Morningside,Guildwood | 43.763573 | -79.188711 |
| 3 | M1G | Scarborough | Woburn | 43.770992 | -79.216917 |
| 4 | M1H | Scarborough | Cedarbrae | 43.773136 | -79.239476 |
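
After a join like this it can be worth checking that every postal code actually received coordinates. The following is only a sketch on toy data: the postal code `M9Z` is made up for illustration, and the check uses `DataFrame.merge` with `indicator=True` instead of the `join` call used in the notebook.

```python
import pandas as pd

# Toy inputs; 'M9Z' is a hypothetical code with no coordinates
codes = pd.DataFrame({'PostalCode': ['M1B', 'M1C', 'M9Z']})
geo = pd.DataFrame({'PostalCode': ['M1B', 'M1C'],
                    'Latitude': [43.806686, 43.784535],
                    'Longitude': [-79.194353, -79.160497]})

# A left merge keeps every postal code; the _merge column records the source of each row
checked = codes.merge(geo, on='PostalCode', how='left', indicator=True)

# Rows marked 'left_only' found no match in the coordinate table
missing = checked.loc[checked['_merge'] == 'left_only', 'PostalCode'].tolist()
print(missing)
```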


Restrict to the boroughs whose name contains 'Toronto' and remove the PostalCode column

In [11]:
toronto_data=df_joined[df_joined.Borough.str.contains('Toronto')]
toronto_data=toronto_data.drop(columns=['PostalCode'])

## Get data for New York

In [10]:
!wget -q -O 'newyork_data.json' https://cocl.us/new_york_dataset

In [12]:
with open('newyork_data.json') as json_data:
    newyork_data = json.load(json_data)

In [13]:
neighborhoods_data = newyork_data['features']
column_names = ['Borough', 'Neighborhood', 'Latitude', 'Longitude']

# collect one dict per neighborhood and build the dataframe in a single step
# (DataFrame.append was removed in pandas 2.0)
rows = []
for data in neighborhoods_data:
    borough = data['properties']['borough']
    neighborhood_name = data['properties']['name']

    # GeoJSON stores coordinates as [longitude, latitude]
    neighborhood_latlon = data['geometry']['coordinates']
    neighborhood_lat = neighborhood_latlon[1]
    neighborhood_lon = neighborhood_latlon[0]

    rows.append({'Borough': borough,
                 'Neighborhood': neighborhood_name,
                 'Latitude': neighborhood_lat,
                 'Longitude': neighborhood_lon})

neighborhoods = pd.DataFrame(rows, columns=column_names)
    

## Final Data Set (Geo data)


The final dataset holds the neighborhoods of New York and Toronto together with their latitude and longitude values

In [14]:
df = pd.concat([neighborhoods, toronto_data], axis=0)
df.head()

| | Borough | Neighborhood | Latitude | Longitude |
|---|---|---|---|---|
| 0 | Bronx | Wakefield | 40.894705 | -73.847201 |
| 1 | Bronx | Co-op City | 40.874294 | -73.829939 |
| 2 | Bronx | Eastchester | 40.887556 | -73.827806 |
| 3 | Bronx | Fieldston | 40.895437 | -73.905643 |
| 4 | Bronx | Riverdale | 40.890834 | -73.912585 |


## Explore the neighborhoods: Get Foursquare data

List the Boroughs that will be analyzed

In [15]:
print(df.shape)
df['Borough'].unique()

(345, 4)


array(['Bronx', 'Manhattan', 'Brooklyn', 'Queens', 'Staten Island',
       'East Toronto', 'Central Toronto', 'Downtown Toronto',
       'West Toronto'], dtype=object)

Now we need Foursquare credentials (private information that is not shared on GitHub) and a limit for the number of venues returned per request

In [16]:
# @hidden cell 
CLIENT_ID = 'OMUU4RXXXXSKVB' # your Foursquare ID
# @hidden cell 
CLIENT_SECRET = 'YBJQW32KGZ1Y3ELFXXXXXCKOQAOO5ZQPB' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT=500

Now we use the code from the course notebook to get the relevant data from Foursquare

In [17]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']
    
    

In [18]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
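
The function above indexes straight into the response, so a venue without a category or a response without groups would raise an error. As a hedged sketch, the parsing step could be separated into a helper that tolerates missing fields. The name `extract_venues` and the sample payload are assumptions for illustration: the payload only mirrors the keys that `getNearbyVenues` reads, and its second venue is invented to show the missing-category case.

```python
def extract_venues(payload, name, lat, lng):
    """Flatten one explore-style response into rows, tolerating missing fields."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item.get('venue', {})
        categories = venue.get('categories') or []
        # Use None when a venue has no category instead of raising IndexError
        category = categories[0]['name'] if categories else None
        location = venue.get('location', {})
        rows.append((name, lat, lng, venue.get('name'),
                     location.get('lat'), location.get('lng'), category))
    return rows


# Hand-written payload with the same nesting the notebook code accesses
sample = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Lollipops Gelato',
               'location': {'lat': 40.894123, 'lng': -73.845892},
               'categories': [{'name': 'Dessert Shop'}]}},
    {'venue': {'name': 'Hypothetical Venue',  # invented entry without categories
               'location': {'lat': 40.89, 'lng': -73.84},
               'categories': []}},
]}]}}

rows = extract_venues(sample, 'Wakefield', 40.894705, -73.847201)
for row in rows:
    print(row)
```

Keeping the parsing separate from the HTTP call also makes it possible to test the logic without Foursquare credentials.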

Run the function


In [19]:

venues = getNearbyVenues(names=df['Neighborhood'],
                                   latitudes=df['Latitude'],
                                   longitudes=df['Longitude']
                                  )



In [23]:
print(venues.shape)
print(venues['Neighborhood'].unique().shape)
venues.head()

(11957, 7)
(337,)


| | Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|---|
| 0 | Wakefield | 40.894705 | -73.847201 | Lollipops Gelato | 40.894123 | -73.845892 | Dessert Shop |
| 1 | Wakefield | 40.894705 | -73.847201 | Rite Aid | 40.896649 | -73.844846 | Pharmacy |
| 2 | Wakefield | 40.894705 | -73.847201 | Carvel Ice Cream | 40.890487 | -73.848568 | Ice Cream Shop |
| 3 | Wakefield | 40.894705 | -73.847201 | Shell | 40.894187 | -73.845862 | Gas Station |
| 4 | Wakefield | 40.894705 | -73.847201 | Cooler Runnings Jamaican Restaurant Inc | 40.898083 | -73.850259 | Caribbean Restaurant |


## Build the Flatfile for Cluster Analysis

Use one-hot encoding for the venue categories

In [24]:
onehot = pd.get_dummies(venues[['Venue Category']], prefix="", prefix_sep="")

onehot['Neighborhood'] = venues['Neighborhood'] 

fixed_columns = [onehot.columns[-1]] + list(onehot.columns[:-1])
toronto_onehot = onehot[fixed_columns]

onehot.head()

Unnamed: 0,Accessories Store,Adult Boutique,Afghan Restaurant,African Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,...,Warehouse Store,Waste Facility,Waterfront,Weight Loss Center,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


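Group by neighborhood and take the mean of each indicator column, which gives one row per neighborhood with the share of each venue category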

In [25]:
df_grouped = onehot.groupby('Neighborhood').mean().reset_index()
df_grouped.head()

Unnamed: 0,Neighborhood,Accessories Store,Adult Boutique,Afghan Restaurant,African Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,...,Warehouse Store,Waste Facility,Waterfront,Weight Loss Center,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,Allerton,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Annadale,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Arden Heights,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Arlington,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Arrochar,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
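
To make the meaning of the grouped values concrete, here is a minimal sketch on toy data (neighborhoods `A` and `B` are made up). It follows the same idea as the cells above: one indicator column per category, then the mean per neighborhood, so each value is the share of that category among the neighborhood's venues.

```python
import pandas as pd

# Toy venue list: neighborhood A has four venues, B has two
toy_venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Park', 'Park', 'Pizza Place', 'Pharmacy', 'Park', 'Pizza Place'],
})

# One indicator column per category (cast to int so the result does not depend on the pandas version)
toy_onehot = pd.get_dummies(toy_venues['Venue Category']).astype(int)
toy_onehot.insert(0, 'Neighborhood', toy_venues['Neighborhood'])

# Mean of the 0/1 indicators = share of each category within a neighborhood
toy_grouped = toy_onehot.groupby('Neighborhood').mean().reset_index()
print(toy_grouped)
```

One design note: `insert` raises an error if a column called `Neighborhood` already exists, whereas plain assignment would silently overwrite it. That difference would matter if a venue category itself happened to be named "Neighborhood".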


Put the result into a pandas dataframe and display the most common venues for each neighborhood

In [32]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [52]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = df_grouped['Neighborhood']

for ind in np.arange(df_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(df_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Allerton,Pizza Place,Deli / Bodega,Spa,Supermarket,Pharmacy,Chinese Restaurant,Bakery,Intersection,Mexican Restaurant,Gas Station
1,Annadale,Pizza Place,American Restaurant,Bakery,Pharmacy,Train Station,Diner,Park,Sushi Restaurant,Dance Studio,Food
2,Arden Heights,Pharmacy,Deli / Bodega,Coffee Shop,Pizza Place,Bus Stop,Yoga Studio,Falafel Restaurant,Egyptian Restaurant,Electronics Store,Empanada Restaurant
3,Arlington,Intersection,Grocery Store,American Restaurant,Deli / Bodega,Bus Stop,Farm,Egyptian Restaurant,Electronics Store,Empanada Restaurant,English Restaurant
4,Arrochar,Bus Stop,Italian Restaurant,Deli / Bodega,Cosmetics Shop,Pizza Place,Food Truck,Supermarket,Middle Eastern Restaurant,Outdoors & Recreation,Mediterranean Restaurant
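
The column labels above use the suffixes 'st', 'nd', 'rd' for the first three ranks and fall back to 'th' for everything else. That is correct for ten venues, but it would produce labels such as '21th' if `num_top_venues` were raised above 20. A small helper can cover the general case; the function name `ordinal` is an assumption introduced here, not part of the course code.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    # 10th to 20th always take 'th', including 11th, 12th and 13th
    if 10 <= n % 100 <= 20:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)


# Build the same column labels as in the cell above
columns = ['Neighborhood'] + ['{} Most Common Venue'.format(ordinal(i)) for i in range(1, 11)]
print(columns)
```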


# Clustering


Build 8 clusters


In [53]:
# set number of clusters to 8
kclusters = 8

df_grouped_clustering = df_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(df_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:20] 

array([4, 4, 1, 1, 1, 0, 0, 4, 0, 4, 0, 0, 0, 4, 0, 5, 4, 0, 4, 4],
      dtype=int32)
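
The notebook fixes the number of clusters at 8 without comparing alternatives. One common way to compare candidate values of k is the silhouette score. The sketch below is not part of the original analysis: it runs on synthetic two-dimensional blobs rather than the venue data, and the range of k values is an arbitrary assumption.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.RandomState(0)

# Synthetic stand-in for the feature matrix: three well-separated blobs
centers = np.array([[0, 0], [10, 10], [-10, 10]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

# Fit k-means for several k and record the mean silhouette score of each fit
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# Higher silhouette means tighter, better separated clusters
best_k = max(scores, key=scores.get)
print(scores)
print(best_k)
```

On the real data the same loop would take `df_grouped_clustering` in place of `X`; the scores there may well be much less clear-cut than on this toy example.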

In [54]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

df_merged = df

# join the cluster labels and top venues onto df to keep latitude/longitude for each neighborhood
df_merged = df_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

df_merged.head() # check the last columns!

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Bronx,Wakefield,40.894705,-73.847201,4.0,Pharmacy,Gas Station,Food,Caribbean Restaurant,Donut Shop,Laundromat,Sandwich Place,Dessert Shop,Ice Cream Shop,Field
1,Bronx,Co-op City,40.874294,-73.829939,4.0,Bus Station,Baseball Field,Restaurant,Pizza Place,Gift Shop,Pharmacy,Fast Food Restaurant,Park,Grocery Store,Mattress Store
2,Bronx,Eastchester,40.887556,-73.827806,4.0,Bus Station,Caribbean Restaurant,Diner,Deli / Bodega,Convenience Store,Chinese Restaurant,Metro Station,Donut Shop,Seafood Restaurant,Platform
3,Bronx,Fieldston,40.895437,-73.905643,4.0,Plaza,River,Bus Station,Yoga Studio,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Empanada Restaurant,English Restaurant,Ethiopian Restaurant
4,Bronx,Riverdale,40.890834,-73.912585,4.0,Bus Station,Park,Food Truck,Bank,Gym,Baseball Field,Home Service,Plaza,Field,Filipino Restaurant


## Data Wrangling of Results Table

Some neighborhoods have no cluster label because no venues were returned for them, so they are missing from the grouped table

In [55]:
df_merged[df_merged['Cluster Labels'].isnull()]

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
207,Staten Island,Port Ivory,40.639683,-74.174645,,,,,,,,,,,
255,Staten Island,Emerson Hill,40.606794,-74.097762,,,,,,,,,,,
257,Staten Island,Howland Hook,40.638433,-74.186223,,,,,,,,,,,


I will skip these for now

In [56]:
final_cluster=df_merged[np.isfinite(df_merged['Cluster Labels'])].copy() # copy so that the assignment below does not trigger a SettingWithCopyWarning



## The final Clusters

In [57]:

final_cluster['Cluster Labels']=final_cluster['Cluster Labels'].astype(int)
final_cluster.head()



Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Bronx,Wakefield,40.894705,-73.847201,4,Pharmacy,Gas Station,Food,Caribbean Restaurant,Donut Shop,Laundromat,Sandwich Place,Dessert Shop,Ice Cream Shop,Field
1,Bronx,Co-op City,40.874294,-73.829939,4,Bus Station,Baseball Field,Restaurant,Pizza Place,Gift Shop,Pharmacy,Fast Food Restaurant,Park,Grocery Store,Mattress Store
2,Bronx,Eastchester,40.887556,-73.827806,4,Bus Station,Caribbean Restaurant,Diner,Deli / Bodega,Convenience Store,Chinese Restaurant,Metro Station,Donut Shop,Seafood Restaurant,Platform
3,Bronx,Fieldston,40.895437,-73.905643,4,Plaza,River,Bus Station,Yoga Studio,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Empanada Restaurant,English Restaurant,Ethiopian Restaurant
4,Bronx,Riverdale,40.890834,-73.912585,4,Bus Station,Park,Food Truck,Bank,Gym,Baseball Field,Home Service,Plaza,Field,Filipino Restaurant


## Map the Clusters


In [58]:
address = 'Toronto'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude

# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(final_cluster['Latitude'], final_cluster['Longitude'], final_cluster['Neighborhood'], final_cluster['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

## Descriptives on the resulting clusters

In [67]:
clusters=final_cluster.groupby(['Cluster Labels','Borough'])['Neighborhood'].count().reset_index()
#final_cluster.groupby(['Cluster Labels'])['Borough'].agg(lambda x: ','.join(set(x))).reset_index()
clusters['Country']= np.where(clusters['Borough'].str.contains('Toronto'), 'Toronto', 'New York')
clusters

| | Cluster Labels | Borough | Neighborhood | Country |
|---|---|---|---|---|
| 0 | 0 | Bronx | 3 | New York |
| 1 | 0 | Brooklyn | 34 | New York |
| 2 | 0 | Central Toronto | 6 | Toronto |
| 3 | 0 | Downtown Toronto | 18 | Toronto |
| 4 | 0 | East Toronto | 5 | Toronto |
| 5 | 0 | Manhattan | 40 | New York |
| 6 | 0 | Queens | 24 | New York |
| 7 | 0 | Staten Island | 18 | New York |
| 8 | 0 | West Toronto | 5 | Toronto |
| 9 | 1 | Bronx | 5 | New York |


In [70]:
clusters.groupby(['Cluster Labels','Country'])['Neighborhood'].sum().reset_index()

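| | Cluster Labels | Country | Neighborhood |
|---|---|---|---|
| 0 | 0 | New York | 119 |
| 1 | 0 | Toronto | 34 |
| 2 | 1 | New York | 25 |
| 3 | 2 | New York | 1 |
| 4 | 3 | New York | 7 |
| 5 | 4 | New York | 145 |
| 6 | 4 | Toronto | 2 |
| 7 | 5 | New York | 4 |
| 8 | 5 | Toronto | 2 |
| 9 | 6 | New York | 2 |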
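
Raw counts are hard to compare because New York contributes far more neighborhoods than Toronto. A possible next step, sketched here on made-up labels rather than the actual results, is to look at the share of each city's neighborhoods that falls into each cluster. The column name `City` is an assumption for this example.

```python
import pandas as pd

# Toy cluster assignments (illustrative only, not the notebook's results)
toy = pd.DataFrame({
    'City': ['New York', 'New York', 'New York', 'Toronto', 'Toronto', 'Toronto'],
    'Cluster Labels': [0, 0, 4, 0, 0, 5],
})

# Rows are cities, columns are clusters; normalize='index' makes each row sum to 1
shares = pd.crosstab(toy['City'], toy['Cluster Labels'], normalize='index')
print(shares.round(2))
```

If both cities showed similar shares across the clusters, that would point toward comparable neighborhood structures; very different rows would point the other way.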


In [71]:
final_cluster[final_cluster['Cluster Labels']==5]

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
27,Bronx,Clason Point,40.806551,-73.854144,5,Park,Home Service,Grocery Store,Boat or Ferry,Pool,Bus Stop,South American Restaurant,Factory,Egyptian Restaurant,Electronics Store
192,Queens,Somerville,40.597711,-73.796648,5,Park,Yoga Studio,Food Court,Duty-free Shop,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Empanada Restaurant,English Restaurant,Ethiopian Restaurant
203,Staten Island,Todt Hill,40.597069,-74.111329,5,Park,Yoga Studio,Food Court,Duty-free Shop,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Empanada Restaurant,English Restaurant,Ethiopian Restaurant
303,Queens,Bayswater,40.611322,-73.765968,5,Park,Tennis Court,Playground,Yoga Studio,Exhibit,Duty-free Shop,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Empanada Restaurant
44,Central Toronto,Lawrence Park,43.72802,-79.38879,5,Park,Swim School,Bus Line,Yoga Studio,Farm,Egyptian Restaurant,Electronics Store,Empanada Restaurant,English Restaurant,Ethiopian Restaurant
48,Central Toronto,"Summerhill East,Moore Park",43.689574,-79.38316,5,Summer Camp,Park,Playground,Yoga Studio,Factory,Duty-free Shop,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Empanada Restaurant


In [74]:
final_cluster[final_cluster['Cluster Labels']==7]

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
63,Central Toronto,Roselawn,43.711695,-79.416936,7,Garden,Yoga Studio,Farm,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Empanada Restaurant,English Restaurant,Ethiopian Restaurant,Event Service


In [75]:
final_cluster[final_cluster['Cluster Labels']==3]

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
198,Staten Island,New Brighton,40.640615,-74.087017,3,Bus Stop,Park,Playground,Flower Shop,Deli / Bodega,Discount Store,Yoga Studio,Egyptian Restaurant,Electronics Store,Empanada Restaurant
212,Staten Island,Oakwood,40.558462,-74.121566,3,Bar,Lawyer,Bus Stop,Yoga Studio,Farm,Egyptian Restaurant,Electronics Store,Empanada Restaurant,English Restaurant,Ethiopian Restaurant
224,Staten Island,Park Hill,40.60919,-74.080157,3,Bus Stop,Gym / Fitness Center,Park,Athletics & Sports,Coffee Shop,Hotel,Yoga Studio,Factory,Egyptian Restaurant,Electronics Store
245,Staten Island,Bloomfield,40.605779,-74.187256,3,Recreation Center,Discount Store,Theme Park,Park,Bus Stop,Fish Market,Fish & Chips Shop,Egyptian Restaurant,Electronics Store,Food
256,Staten Island,Randall Manor,40.63563,-74.098051,3,Bus Stop,Park,Pizza Place,Deli / Bodega,Farm,Egyptian Restaurant,Electronics Store,Empanada Restaurant,English Restaurant,Ethiopian Restaurant
285,Staten Island,Willowbrook,40.603707,-74.132084,3,Bus Stop,Intersection,Deli / Bodega,Pizza Place,Bagel Shop,Fish Market,Fish & Chips Shop,Eastern European Restaurant,Egyptian Restaurant,Electronics Store
305,Staten Island,Fox Hills,40.617311,-74.08174,3,Bus Stop,Sandwich Place,Yoga Studio,Falafel Restaurant,Duty-free Shop,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Empanada Restaurant,English Restaurant


In [76]:
final_cluster[final_cluster['Borough']=='Central Toronto']

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
44,Central Toronto,Lawrence Park,43.72802,-79.38879,5,Park,Swim School,Bus Line,Yoga Studio,Farm,Egyptian Restaurant,Electronics Store,Empanada Restaurant,English Restaurant,Ethiopian Restaurant
45,Central Toronto,Davisville North,43.712751,-79.390197,0,Food & Drink Shop,Dance Studio,Breakfast Spot,Park,Clothing Store,Sandwich Place,Hotel,Gym,Filipino Restaurant,Field
46,Central Toronto,North Toronto West,43.715383,-79.405678,0,Clothing Store,Sporting Goods Shop,Coffee Shop,Yoga Studio,Dessert Shop,Chinese Restaurant,Mexican Restaurant,Miscellaneous Shop,Salon / Barbershop,Café
47,Central Toronto,Davisville,43.704324,-79.38879,0,Pizza Place,Dessert Shop,Sandwich Place,Gym,Coffee Shop,Italian Restaurant,Sushi Restaurant,Café,Brewery,Greek Restaurant
48,Central Toronto,"Summerhill East,Moore Park",43.689574,-79.38316,5,Summer Camp,Park,Playground,Yoga Studio,Factory,Duty-free Shop,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Empanada Restaurant
49,Central Toronto,"Rathnelly,South Hill,Summerhill West,Forest Hi...",43.686412,-79.400049,0,Pub,Coffee Shop,Liquor Store,Vietnamese Restaurant,Fried Chicken Joint,Supermarket,Sushi Restaurant,Bagel Shop,Light Rail Station,Pizza Place
63,Central Toronto,Roselawn,43.711695,-79.416936,7,Garden,Yoga Studio,Farm,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Empanada Restaurant,English Restaurant,Ethiopian Restaurant,Event Service
64,Central Toronto,"Forest Hill West,Forest Hill North",43.696948,-79.411307,0,Trail,Park,Jewelry Store,Sushi Restaurant,Yoga Studio,Falafel Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Empanada Restaurant
65,Central Toronto,"North Midtown,Yorkville,The Annex",43.67271,-79.405678,0,Café,Sandwich Place,Coffee Shop,American Restaurant,Liquor Store,BBQ Joint,Pharmacy,Middle Eastern Restaurant,Pizza Place,History Museum
