# <center>Toronto Neighborhood Clustering - Part 3</center>

### In this project, we gather data on the neighborhoods of Toronto and cluster the neighborhoods by the venues they offer, so that they can be compared when deciding where to live.

### In part 3, we use our collected data from Part 2 to cluster the neighborhoods of Toronto.

In [2]:
#import the libraries needed for this part of the project
import numpy as np

import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests

import matplotlib.cm as cm
import matplotlib.colors as colors

from sklearn.cluster import KMeans

import folium

Let's first load the data we collected in Part 2.

In [30]:
toronto_df = pd.read_csv('toronto_data.csv')
print('Data successfully downloaded!!')

Data successfully downloaded!!


Let's look at the data.

In [31]:
toronto_df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Queen's Park,Ontario Provincial Government,43.662301,-79.389494


Let's next use the geopy library to get the coordinates of Toronto.  Our user agent will be 'toronto_explorer'.

In [32]:
address = 'Toronto, ON'

geolocator = Nominatim(user_agent="toronto_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.6534817, -79.3839347.


Next, let's create a map of Toronto showing all the borough and neighborhood locations.  The neighborhood is displayed in brackets '[ ]' to distinguish it from the borough.

In [33]:
#create the map
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)

#add the markers showing the boroughs
for neighborhood, borough, lat, long in zip(toronto_df['Neighborhood'], toronto_df['Borough'], toronto_df['Latitude'], toronto_df['Longitude']):
    label = '[{}], {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, long],
        radius=5,
        popup=label,
        color='black',
        fill=True,
        fill_color='#aaaaaa',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
map_toronto

#### We can now start to explore these boroughs and neighborhoods.

#### Let's suppose that we want to move to a neighborhood with a lot of 'essential venues', like restaurants, schools, train stations, parks, etc.

#### We can use the Foursquare API to help us with this analysis.

First, we need to define our credentials for the Foursquare API.

In [34]:
CLIENT_ID = 'KZ1QEKFTY1CVLAPZLIOGWRR001KSYJYVRCUGDW4G43THET40' # our Foursquare ID
CLIENT_SECRET = '2XZRMYMA0EIWDV2ZF4UDIYEOGYXCFIJCTX51PNFFYXFJ4TKS' # our Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

print('Our credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Our credentials:
CLIENT_ID: KZ1QEKFTY1CVLAPZLIOGWRR001KSYJYVRCUGDW4G43THET40
CLIENT_SECRET:2XZRMYMA0EIWDV2ZF4UDIYEOGYXCFIJCTX51PNFFYXFJ4TKS
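
Hard-coding and printing API keys in a notebook makes them easy to leak when the notebook is shared. As a hedged alternative, the sketch below reads the keys from environment variables. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and both helper functions are assumptions made for illustration; they are not part of the original project.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret) read from environment variables.

    The variable names used here are assumptions made for this sketch.
    """
    client_id = env.get('FOURSQUARE_CLIENT_ID')
    client_secret = env.get('FOURSQUARE_CLIENT_SECRET')
    if not client_id or not client_secret:
        raise RuntimeError('Foursquare credentials are not set in the environment.')
    return client_id, client_secret


def mask(secret, visible=4):
    """Keep the first few characters of a secret and star out the rest."""
    return secret[:visible] + '*' * (len(secret) - visible)
```

With this in place, a cell could print `mask(CLIENT_SECRET)` instead of the full value.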


Next, we borrow two functions from the New York neighborhoods analysis lab to help us in our analysis: get_category_type and getNearbyVenues.

In [35]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [36]:
#This function retrieves up to 100 venues (the LIMIT value) within a 500-meter radius of each neighborhood location.
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
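
One fragile spot in the function above is `v['venue']['categories'][0]['name']`, which raises an IndexError if a venue comes back with an empty category list. The sketch below separates the parsing step from the HTTP request and falls back to `None` in that case. The helper name `parse_explore_items` and the sample items are hypothetical; the sample only mirrors the fields that getNearbyVenues indexes into.

```python
def parse_explore_items(name, lat, lng, items):
    """Turn the 'items' list of one explore response into flat row tuples.

    `items` is assumed to have the shape that getNearbyVenues indexes into:
    a list of dicts, each with a 'venue' dict holding 'name', 'location'
    and 'categories'.
    """
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # Fall back to None instead of raising IndexError on an empty list.
        category = categories[0]['name'] if categories else None
        rows.append((
            name,
            lat,
            lng,
            venue['name'],
            venue['location']['lat'],
            venue['location']['lng'],
            category,
        ))
    return rows


# Small hand-written sample, only to exercise the function.
sample_items = [
    {'venue': {'name': 'Brookbanks Park',
               'location': {'lat': 43.751976, 'lng': -79.33214},
               'categories': [{'name': 'Park'}]}},
    {'venue': {'name': 'Uncategorized Venue',
               'location': {'lat': 43.75, 'lng': -79.33},
               'categories': []}},
]
rows = parse_explore_items('Parkwoods', 43.753259, -79.329656, sample_items)
```

Keeping the parser free of network calls also makes it possible to check it without using up API quota.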

We create a DataFrame called toronto_venues to hold the venue data.

In [137]:
toronto_venues = getNearbyVenues(names = toronto_df['Neighborhood'],
                                   latitudes = toronto_df['Latitude'],
                                   longitudes = toronto_df['Longitude']
                                  )

Parkwoods
Victoria Village
Regent Park, Harbourfront
Lawrence Manor, Lawrence Heights
Ontario Provincial Government
Islington Avenue
Malvern, Rouge
Don Mills North
Parkview Hill, Woodbine Gardens
Garden District, Ryerson
Glencairn
West Deane Park, Princess Gardens, Martin Grove, Islington, Cloverdale
Rouge Hill, Port Union, Highland Creek
Don Mills South, Flemingdon Park
Woodbine Heights
St. James Town
Humewood-Cedarvale
Eringate, Bloordale Gardens, Old Burnhamthorpe, Markland Wood
Guildwood, Morningside, West Hill
The Beaches
Berczy Park
Caledonia-Fairbanks
Woburn
Leaside
Central Bay Street
Christie
Cedarbrae
Hillcrest Village
Bathurst Manor, Wilson Heights, Downsview North
Thorncliffe Park
Richmond, Adelaide, King
Dufferin, Dovercourt Village
Scarborough Village
Fairview, Henry Farm, Oriole
Northwood Park, York University
The Danforth East
Harbourfront East, Union Station, Toronto Islands
Little Portugal, Trinity
Kennedy Park, Ionview, East Birchmount Park
Bayview Village
Downsview E

Let's look at how big our new DataFrame is.

In [138]:
print (toronto_venues.shape)
toronto_venues.head()

(2121, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Parkwoods,43.753259,-79.329656,Brookbanks Park,43.751976,-79.33214,Park
1,Parkwoods,43.753259,-79.329656,Brookbanks Pool,43.751389,-79.332184,Pool
2,Parkwoods,43.753259,-79.329656,Variety Store,43.751974,-79.333114,Food & Drink Shop
3,Victoria Village,43.725882,-79.315572,Victoria Village Arena,43.723481,-79.315635,Hockey Arena
4,Victoria Village,43.725882,-79.315572,Portugril,43.725819,-79.312785,Portuguese Restaurant


How many venues were returned for each neighborhood?  To find out, we create a new DataFrame called 'Number_of_venues' and execute the following code.

In [172]:
Number_of_venues = toronto_venues.groupby('Neighborhood').count()
Number_of_venues = pd.DataFrame(Number_of_venues['Venue'])
Number_of_venues.rename(columns = {'Venue': 'Number of Venues'}, inplace = True)
Number_of_venues.reset_index(inplace = True)
Number_of_venues.head()

Unnamed: 0,Neighborhood,Number of Venues
0,Agincourt,5
1,"Alderwood, Long Branch",7
2,"Bathurst Manor, Wilson Heights, Downsview North",22
3,Bayview Village,4
4,"Bedford Park, Lawrence Manor East",24
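
The same per-neighborhood count can likely be produced in a single chain with `groupby(...).size()`. The sketch below uses a toy stand-in for toronto_venues built from the five rows shown earlier, so the frame name `venues` is an assumption of the example.

```python
import pandas as pd

# Toy stand-in for toronto_venues; only the two columns used here.
venues = pd.DataFrame({
    'Neighborhood': ['Parkwoods', 'Parkwoods', 'Parkwoods',
                     'Victoria Village', 'Victoria Village'],
    'Venue': ['Brookbanks Park', 'Brookbanks Pool', 'Variety Store',
              'Victoria Village Arena', 'Portugril'],
})

# size() counts rows per group; reset_index(name=...) names the count column.
number_of_venues = (
    venues.groupby('Neighborhood')
          .size()
          .reset_index(name='Number of Venues')
)
```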


How many unique categories are there?

In [173]:
print('There are {} unique categories.'.format(len(toronto_venues['Venue Category'].unique())))

There are 271 unique categories.


Let's look at all the unique categories of venues.

In [174]:
for category in toronto_venues['Venue Category'].unique():
    print(category)

Park
Pool
Food & Drink Shop
Hockey Arena
Portuguese Restaurant
Coffee Shop
Pizza Place
Bakery
Distribution Center
Spa
Restaurant
Breakfast Spot
Gym / Fitness Center
Historic Site
Farmers Market
Chocolate Shop
Pub
Performing Arts Venue
Dessert Shop
French Restaurant
Café
Yoga Studio
Theater
Event Space
Shoe Store
Art Gallery
Cosmetics Shop
Brewery
Bank
Electronics Store
Beer Store
Hotel
Health Food Store
Antique Shop
Boutique
Furniture / Home Store
Vietnamese Restaurant
Clothing Store
Accessories Store
Miscellaneous Shop
Italian Restaurant
Beer Bar
Creperie
Sushi Restaurant
Burrito Place
Mexican Restaurant
Diner
Wings Joint
Fried Chicken Joint
Discount Store
Japanese Restaurant
Smoothie Shop
Sandwich Place
Gym
Bar
College Auditorium
Fast Food Restaurant
Caribbean Restaurant
Gastropub
Pharmacy
Pet Store
Intersection
Flea Market
Athletics & Sports
Comic Shop
Plaza
Ramen Restaurant
Sporting Goods Shop
Music Venue
Burger Joint
Shopping Mall
Tanning Salon
Bookstore
Steakhouse
College Rec Cen

### Finding the essential venues

To pick out the essential venues from all the venues, we must define criteria and then use them to filter the list.  A venue is essential if its category falls under one of the following groups.

1. Stores
2. Restaurants
3. Train/Bus/Gas Stations
4. Outdoor Recreation
5. Movie Theaters

We need code that can pick out these categories.  Looking at the list of categories above, the following keywords identify each type of essential venue.

1. Stores - 'Store', 'Shop', 'Bakery', 'Market'
2. Restaurants - 'Restaurant', 'Café', 'Joint', 'Pizza Place', 'Diner', 'Steakhouse', 'Cafe'
3. Train/Bus/Gas Stations - 'Station'
4. Outdoor Recreation - 'Park', 'Lake', 'Outdoors', 'Field', 'Trail', 'Beach', 'River'
5. Movie Theaters - 'Movie Theater'

First, we create a boolean function that tests whether or not a venue category is essential.

In [175]:
def isVenueEssential(category):
    #use a dummy variable x to hold the category name
    x = category
    
    #convert all letters in the category string to lower case so that case does not affect the test criteria
    x = x.lower()
    
    #This if ladder determines whether or not the venue is essential.
    #The find method for strings returns the index of the first occurrence of the substring, or -1 if the
    #substring is not found, so a result >= 0 means the keyword is present.
    if x.find('store') >= 0:
        return True
    elif x.find('shop') >= 0:
        return True
    elif x.find('bakery') >= 0:
        return True
    elif x.find('market') >= 0:
        return True
    elif x.find('restaurant') >= 0:
        return True
    elif x.find('café') >= 0:
        return True
    elif x.find('joint') >= 0:
        return True
    elif x.find('pizza place') >= 0:
        return True
    elif x.find('diner') >= 0:
        return True
    elif x.find('steakhouse') >= 0:
        return True
    elif x.find('cafe') >= 0:
        return True
    elif x.find('station') >= 0:
        return True
    elif x.find('park') >= 0:
        return True
    elif x.find('lake') >= 0:
        return True
    elif x.find('outdoors') >= 0:
        return True
    elif x.find('field') >= 0:
        return True
    elif x.find('trail') >= 0:
        return True
    elif x.find('beach') >= 0:
        return True
    elif x.find('river') >= 0:
        return True
    elif x.find('movie theater') >= 0:
        return True
    else:
        #venue is nonessential, so return False
        return False
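
The if ladder above can be condensed into a keyword tuple and a single `any(...)` test. This is a sketch of an equivalent check, not the notebook's own code; the names `ESSENTIAL_KEYWORDS` and `is_venue_essential` are introduced here for illustration.

```python
# Keywords taken from the list above; matching is a case-insensitive substring search.
ESSENTIAL_KEYWORDS = (
    # stores
    'store', 'shop', 'bakery', 'market',
    # restaurants
    'restaurant', 'café', 'cafe', 'joint', 'pizza place', 'diner', 'steakhouse',
    # train/bus/gas stations
    'station',
    # outdoor recreation
    'park', 'lake', 'outdoors', 'field', 'trail', 'beach', 'river',
    # movie theaters
    'movie theater',
)


def is_venue_essential(category, keywords=ESSENTIAL_KEYWORDS):
    """Return True if any keyword occurs in the lower-cased category name."""
    lowered = category.lower()
    return any(keyword in lowered for keyword in keywords)
```

Substring matching is broad in both directions. A category such as 'Parking Lot' (a hypothetical example) would count as essential because it contains 'park', while 'Bus Stop' is not caught by the 'station' keyword.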

Now, the following code will divide the venue categories into two groups: essential and nonessential.  Any category that does not meet our criteria for essential will be classified as nonessential.

In [176]:
essential_venues = []
nonessential_venues = []

for category in toronto_venues['Venue Category'].unique():
    if  isVenueEssential(category):
        #essential venue
        essential_venues.append(category)
    else:
        #nonessential venue
        nonessential_venues.append(category)

Let's now look at the lists of essential and nonessential venues.

In [177]:
#Essential Venues
essential_venues

['Park',
 'Food & Drink Shop',
 'Portuguese Restaurant',
 'Coffee Shop',
 'Pizza Place',
 'Bakery',
 'Restaurant',
 'Farmers Market',
 'Chocolate Shop',
 'Dessert Shop',
 'French Restaurant',
 'Café',
 'Shoe Store',
 'Cosmetics Shop',
 'Electronics Store',
 'Beer Store',
 'Health Food Store',
 'Antique Shop',
 'Furniture / Home Store',
 'Vietnamese Restaurant',
 'Clothing Store',
 'Accessories Store',
 'Miscellaneous Shop',
 'Italian Restaurant',
 'Sushi Restaurant',
 'Mexican Restaurant',
 'Diner',
 'Wings Joint',
 'Fried Chicken Joint',
 'Discount Store',
 'Japanese Restaurant',
 'Smoothie Shop',
 'Fast Food Restaurant',
 'Caribbean Restaurant',
 'Pet Store',
 'Flea Market',
 'Comic Shop',
 'Ramen Restaurant',
 'Sporting Goods Shop',
 'Burger Joint',
 'Shopping Mall',
 'Bookstore',
 'Steakhouse',
 'Thai Restaurant',
 'Modern European Restaurant',
 'New American Restaurant',
 'Lake',
 'Middle Eastern Restaurant',
 'Chinese Restaurant',
 'Ethiopian Restaurant',
 'Seafood Restaurant',
 

In [178]:
#Nonessential Venues
nonessential_venues

['Pool',
 'Hockey Arena',
 'Distribution Center',
 'Spa',
 'Breakfast Spot',
 'Gym / Fitness Center',
 'Historic Site',
 'Pub',
 'Performing Arts Venue',
 'Yoga Studio',
 'Theater',
 'Event Space',
 'Art Gallery',
 'Brewery',
 'Bank',
 'Hotel',
 'Boutique',
 'Beer Bar',
 'Creperie',
 'Burrito Place',
 'Sandwich Place',
 'Gym',
 'Bar',
 'College Auditorium',
 'Gastropub',
 'Pharmacy',
 'Intersection',
 'Athletics & Sports',
 'Plaza',
 'Music Venue',
 'Tanning Salon',
 'College Rec Center',
 'Tea Room',
 'Lounge',
 'Wine Bar',
 'Hookah Bar',
 'Poutine Place',
 'Office',
 'Construction & Landscaping',
 'Skating Rink',
 'Curling Ice',
 'Bus Stop',
 'Food Truck',
 'Cocktail Bar',
 'Jazz Club',
 'Fountain',
 'Irish Pub',
 'Bistro',
 'Garden',
 'Rental Car Location',
 'Medical Center',
 'Neighborhood',
 'Museum',
 'Concert Hall',
 'Basketball Stadium',
 'Nightclub',
 'Sports Bar',
 'Poke Place',
 'Art Museum',
 'Juice Bar',
 'Salad Place',
 'Sculpture Garden',
 'Golf Course',
 'Dog Run',
 'De

How many essential and nonessential venue categories are there?

In [179]:
print ('Number of essential venues: ', len(essential_venues))
print ('Number of nonessential venues: ', len(nonessential_venues))

Number of essential venues:  143
Number of nonessential venues:  128


We are now ready to cluster the neighborhoods.

### Clustering the neighborhoods

We will cluster the neighborhoods with k-means (scikit-learn's KMeans), using the number of essential venues and the number of nonessential venues in each neighborhood as the two features.

First, we must make a DataFrame that lists all neighborhoods along with the number of essential and nonessential venues for each neighborhood.

In [180]:
#Add the columns 'Essential' and 'Nonessential' to the toronto_venues DataFrame.
#Initialize each of these columns with value 0.
toronto_venues['Essential'] = 0
toronto_venues['Nonessential'] = 0

In [181]:
#Put the correct value of 0 or 1 in the 'Essential' and 'Nonessential' columns, with 1 being True, and 0 being False.
for i in range(toronto_venues.shape[0]):
    category = toronto_venues.loc[i,'Venue Category']
    if isVenueEssential(category):
        toronto_venues.loc[i,'Essential'] = 1
        toronto_venues.loc[i,'Nonessential'] = 0
    else:
        toronto_venues.loc[i,'Essential'] = 0
        toronto_venues.loc[i,'Nonessential'] = 1
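
The row-by-row loop above can probably be replaced by one pass with `Series.apply`, deriving both indicator columns from a single boolean mask. In the sketch below, `is_essential` is a simplified stand-in for isVenueEssential and `venues` is a toy frame; both are assumptions of the example.

```python
import pandas as pd


def is_essential(category):
    # Simplified stand-in for isVenueEssential, just for this sketch.
    return any(word in category.lower() for word in ('park', 'shop', 'restaurant'))


venues = pd.DataFrame({
    'Venue Category': ['Park', 'Pool', 'Food & Drink Shop',
                       'Hockey Arena', 'Portuguese Restaurant'],
})

# Apply the test once per row, then derive both indicator columns from it.
essential_mask = venues['Venue Category'].apply(is_essential)
venues['Essential'] = essential_mask.astype(int)
venues['Nonessential'] = (~essential_mask).astype(int)
```

Because the two columns always sum to 1, only one of them strictly needs to be stored; both are kept here to match the layout used in the notebook.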

Let's look at the new DataFrame.

In [182]:
toronto_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,Essential,Nonessential
0,Parkwoods,43.753259,-79.329656,Brookbanks Park,43.751976,-79.33214,Park,1,0
1,Parkwoods,43.753259,-79.329656,Brookbanks Pool,43.751389,-79.332184,Pool,0,1
2,Parkwoods,43.753259,-79.329656,Variety Store,43.751974,-79.333114,Food & Drink Shop,1,0
3,Victoria Village,43.725882,-79.315572,Victoria Village Arena,43.723481,-79.315635,Hockey Arena,0,1
4,Victoria Village,43.725882,-79.315572,Portugril,43.725819,-79.312785,Portuguese Restaurant,1,0


Let's now make our new DataFrame that will contain the number of essential and nonessential venues for each neighborhood.

In [183]:
#Sum only the two indicator columns so that text and coordinate columns are not aggregated.
tn_venue_count = toronto_venues.groupby('Neighborhood')[['Essential','Nonessential']].sum()
tn_venue_count.rename(columns = {'Essential': 'Number of Essential Venues',
                                  'Nonessential': 'Number of Nonessential Venues'}, inplace = True)
tn_venue_count.reset_index(inplace = True)

Let's look at this new DataFrame.

In [184]:
tn_venue_count.head()

Unnamed: 0,Neighborhood,Number of Essential Venues,Number of Nonessential Venues
0,Agincourt,2,3
1,"Alderwood, Long Branch",3,4
2,"Bathurst Manor, Wilson Heights, Downsview North",17,5
3,Bayview Village,3,1
4,"Bedford Park, Lawrence Manor East",17,7


We will use this DataFrame to cluster the neighborhoods.

In [185]:
#We will group our neighborhoods into 4 clusters.
kclusters = 4

toronto_grouped_clustering = tn_venue_count.drop(columns='Neighborhood')

#run the clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

#check the cluster labels for each row in the DataFrame.
kmeans.labels_[0:10]

array([3, 3, 0, 3, 0, 1, 3, 0, 0, 3])
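
The choice of 4 clusters is a judgment call. One common heuristic for checking it is to fit KMeans for several values of k and compare the inertia (the within-cluster sum of squared distances). The sketch below runs on synthetic counts, not on the project data, and the group sizes and ranges are invented for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic (essential, nonessential) counts, loosely shaped like the real table.
rng = np.random.default_rng(0)
counts = np.vstack([
    rng.integers(0, 7, size=(30, 2)),     # sparse neighborhoods
    rng.integers(10, 20, size=(10, 2)),   # moderate
    rng.integers(25, 50, size=(10, 2)),   # busy
    rng.integers(50, 80, size=(10, 2)),   # very busy
])

inertias = {}
for k in range(1, 8):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(counts)
    # inertia_ is the within-cluster sum of squared distances.
    inertias[k] = model.inertia_

for k, value in inertias.items():
    print(k, round(value, 1))
```

A value of k beyond which the inertia stops dropping sharply is a reasonable candidate for the number of clusters.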

Now, we add our kmeans labels to our tn_venue_count DataFrame.

In [186]:
tn_venue_count.insert(0, 'Cluster Label', kmeans.labels_)

In [189]:
tn_venue_count.head()

Unnamed: 0,Cluster Label,Neighborhood,Number of Essential Venues,Number of Nonessential Venues
0,3,Agincourt,2,3
1,3,"Alderwood, Long Branch",3,4
2,0,"Bathurst Manor, Wilson Heights, Downsview North",17,5
3,3,Bayview Village,3,1
4,0,"Bedford Park, Lawrence Manor East",17,7


Let's now join this DataFrame with our original toronto_df DataFrame.

In [197]:
toronto_clusters_df = toronto_df.merge(tn_venue_count, how = 'inner', on = 'Neighborhood')
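
An inner merge silently drops any neighborhood that has no row in tn_venue_count, for example one for which no venues were returned. A hedged way to see which rows would be lost is a left merge with `indicator=True`. The frames below are toy stand-ins, and 'Quiet Corner' and 'Example Borough' are made-up values.

```python
import pandas as pd

# Toy stand-ins for toronto_df and tn_venue_count.
left = pd.DataFrame({'Neighborhood': ['Parkwoods', 'Victoria Village', 'Quiet Corner'],
                     'Borough': ['North York', 'North York', 'Example Borough']})
right = pd.DataFrame({'Neighborhood': ['Parkwoods', 'Victoria Village'],
                      'Cluster Label': [3, 3]})

# how='left' keeps every row of `left`; indicator=True adds a '_merge' column.
merged = left.merge(right, how='left', on='Neighborhood', indicator=True)
unmatched = merged.loc[merged['_merge'] == 'left_only', 'Neighborhood'].tolist()
print(unmatched)
```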

Let's look at this new frame.

In [198]:
toronto_clusters_df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Label,Number of Essential Venues,Number of Nonessential Venues
0,M3A,North York,Parkwoods,43.753259,-79.329656,3,2,1
1,M4A,North York,Victoria Village,43.725882,-79.315572,3,3,1
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,1,28,17
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763,0,11,1
4,M7A,Queen's Park,Ontario Provincial Government,43.662301,-79.389494,0,18,11


Let's now look at the clusters on a map.

In [221]:
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
colors_array = ['red', 'blue', 'green', 'brown']
fill_colors_array = ['#ff7c7b', '#87ceeb', '#00ff7f', '#d2691e']
# add markers to the map
for lat, lon, neighborhood, borough, cluster in zip(toronto_clusters_df['Latitude'],
                                           toronto_clusters_df['Longitude'],
                                           toronto_clusters_df['Neighborhood'],
                                           toronto_clusters_df['Borough'],
                                           toronto_clusters_df['Cluster Label']):
    label = folium.Popup('[' + neighborhood + '], ' + borough + ', Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=colors_array[cluster],
        fill=True,
        fill_color=fill_colors_array[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### Examining the clusters

Now, we can look at each cluster and see how each neighborhood was grouped.

#### Cluster 0

In [222]:
toronto_clusters_df.loc[toronto_clusters_df['Cluster Label'] == 0,:]

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Label,Number of Essential Venues,Number of Nonessential Venues
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763,0,11,1
4,M7A,Queen's Park,Ontario Provincial Government,43.662301,-79.389494,0,18,11
11,M3C,North York,"Don Mills South, Flemingdon Park",43.7259,-79.340923,0,16,4
23,M6G,Downtown Toronto,Christie,43.669542,-79.422564,0,14,2
26,M3H,North York,"Bathurst Manor, Wilson Heights, Downsview North",43.754328,-79.442259,0,17,5
27,M4H,East York,Thorncliffe Park,43.705369,-79.349372,0,13,7
29,M6H,West Toronto,"Dufferin, Dovercourt Village",43.669005,-79.442259,0,9,6
41,M6K,West Toronto,"Brockton, Parkdale Village, Exhibition Place",43.636847,-79.428191,0,12,11
44,M4L,East Toronto,"India Bazaar, The Beaches West",43.668999,-79.315572,0,14,6
51,M5M,North York,"Bedford Park, Lawrence Manor East",43.733283,-79.41975,0,17,7


#### Cluster 1

In [223]:
toronto_clusters_df.loc[toronto_clusters_df['Cluster Label'] == 1,:]

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Label,Number of Essential Venues,Number of Nonessential Venues
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,1,28,17
18,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306,1,39,19
21,M4G,East York,Leaside,43.70906,-79.363452,1,27,6
22,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383,1,46,17
31,M2J,North York,"Fairview, Henry Farm, Oriole",43.778517,-79.346556,1,51,15
35,M6J,West Toronto,"Little Portugal, Trinity",43.647927,-79.41975,1,30,12
39,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188,1,36,6
50,M4M,East Toronto,Studio District,43.659526,-79.340923,1,25,12
55,M2N,North York,Willowdale South,43.77012,-79.408493,1,29,6
75,M4S,Central Toronto,Davisville,43.704324,-79.38879,1,29,9


#### Cluster 2

In [224]:
toronto_clusters_df.loc[toronto_clusters_df['Cluster Label'] == 2,:]

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Label,Number of Essential Venues,Number of Nonessential Venues
8,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937,2,77,23
13,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418,2,54,26
28,M5H,Downtown Toronto,"Richmond, Adelaide, King",43.650571,-79.384568,2,58,35
34,M5J,Downtown Toronto,"Harbourfront East, Union Station, Toronto Islands",43.640816,-79.381752,2,52,48
40,M5K,Downtown Toronto,"Toronto Dominion Centre, Design Exchange",43.647177,-79.381576,2,54,46
45,M5L,Downtown Toronto,"Commerce Court, Victoria Hotel",43.648198,-79.379817,2,64,36
88,M5W,Downtown Toronto,"Stn A PO Boxes 25, The Esplanade",43.646435,-79.374846,2,59,36
92,M5X,Downtown Toronto,"First Canadian Place, Underground city",43.648429,-79.38228,2,59,41
94,M4Y,Downtown Toronto,Church and Wellesley,43.66586,-79.38316,2,50,27


#### Cluster 3

In [225]:
toronto_clusters_df.loc[toronto_clusters_df['Cluster Label'] == 3,:]

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Label,Number of Essential Venues,Number of Nonessential Venues
0,M3A,North York,Parkwoods,43.753259,-79.329656,3,2,1
1,M4A,North York,Victoria Village,43.725882,-79.315572,3,3,1
5,M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353,3,1,0
6,M3B,North York,Don Mills North,43.745906,-79.352188,3,3,1
7,M4B,East York,"Parkview Hill, Woodbine Gardens",43.706397,-79.309937,3,4,7
9,M6B,North York,Glencairn,43.709577,-79.445073,3,4,0
10,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497,3,0,2
12,M4C,East York,Woodbine Heights,43.695344,-79.318389,3,3,5
14,M6C,York,Humewood-Cedarvale,43.693781,-79.428191,3,2,2
15,M9C,Etobicoke,"Eringate, Bloordale Gardens, Old Burnhamthorpe...",43.643515,-79.577201,3,6,1


### Observations

Based on the clustering results above, we can label the clusters as follows.

1. Cluster 0 - Some essential venues, Few to no nonessential venues
2. Cluster 1 - Many essential venues, Some nonessential venues
3. Cluster 2 - Many essential venues, Many nonessential venues
4. Cluster 3 - Few to no venues
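
These labels can be checked against the average venue counts per cluster. The sketch below uses only the first two rows of each cluster table above, so the averages are illustrative rather than the true cluster means.

```python
import pandas as pd

# The first two rows of each cluster table above.
sample = pd.DataFrame({
    'Cluster Label': [0, 0, 1, 1, 2, 2, 3, 3],
    'Number of Essential Venues': [11, 18, 28, 39, 77, 54, 2, 3],
    'Number of Nonessential Venues': [1, 11, 17, 19, 23, 26, 1, 1],
})

# Mean venue counts per cluster summarize what each label represents.
cluster_profile = sample.groupby('Cluster Label').mean()
print(cluster_profile)
```

Running the same groupby on the full toronto_clusters_df would give the actual profile of each cluster.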

All of the neighborhoods in Cluster 2 are located in Downtown Toronto.  Although these would be the most obvious first choices, housing in these neighborhoods is likely to cost much more than elsewhere and may be prohibitive.

For those who are looking for neighborhoods with reasonable housing costs, the neighborhoods of Cluster 1 would most likely be good choices.  They have a lot of essential venues and a good amount of nonessential venues.

Cluster 0 neighborhoods are also reasonable options.  However, they do not have as many venues as Cluster 1 or Cluster 2 neighborhoods.

It would be wise to avoid any neighborhoods in Cluster 3 because they have very few venues.