# Segmenting and Clustering Neighborhoods in Toronto

This notebook contains my solution to the week 3 peer-graded assignment of the Applied Data Science Capstone course from IBM/Coursera: Segmenting and Clustering Neighborhoods in Toronto. It covers all three parts of the assignment, each under its respective header.

In [111]:
import pandas as pd
import numpy as np

---

## Part 1

### Step 1: Loading data, and first formatting
1. Create a URL to the Wikipedia page, scrape it using pandas' read_html method, and assign the result to 'data'.
2. Create the initial dataframe, data_df, by assigning it the first element (index 0) of 'data'.
3. Rename the columns of data_df to match the dataframe from the assignment instructions.
4. Display the top rows of the current state of the dataframe.

In [106]:
# 1.
url = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'
data = pd.read_html(url)
# 2.
data_df = data[0]
# 3.
data_df.rename(columns={'Postcode':'PostalCode', 'Neighbourhood':'Neighborhood'}, inplace=True)
# 4.
data_df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront


### Step 2: Cleaning the dataframe
1. Remove all rows which do not contain any assigned borough.
2. Group the rows according to postal code and borough, and separate neighborhoods with a comma and a space.
3. Check whether any rows have not been assigned a neighborhood, and use the name of the borough as the neighborhood for those rows.
4. Reset the indices and display the top rows of the current state of the dataframe.

In [115]:
# 1.
data_df = data_df[data_df['Borough']!='Not assigned']
# 2.
data_df = data_df.groupby(['PostalCode','Borough'], sort=False).agg(', '.join)
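# 3. Inspect any rows that still lack a neighborhood, then fill them in.
data_df[data_df['Neighborhood']=='Not assigned']
# The replacement value is hard-coded: it assumes Queen's Park is the only borough affected.
data_df.replace("Not assigned", "Queen's Park", inplace=True)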
# 4.
data_df.reset_index(inplace=True)
data_df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,Harbourfront
3,M6A,North York,"Lawrence Heights, Lawrence Manor"
4,M7A,Queen's Park,Queen's Park
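
The replacement in step 3 is hard-coded to "Queen's Park". A more general variant, sketched below on a small made-up table (the rows are illustrative, not the scraped data), copies each row's own borough into the neighborhood column before grouping. It is one possible approach, not part of the original solution.

```python
import pandas as pd

# Made-up stand-in for the scraped table (illustrative rows only)
toy = pd.DataFrame({
    'PostalCode': ['M1A', 'M5A', 'M5A', 'M7A'],
    'Borough': ['Not assigned', 'Downtown Toronto', 'Downtown Toronto', "Queen's Park"],
    'Neighborhood': ['Not assigned', 'Harbourfront', 'Regent Park', 'Not assigned'],
})

# Drop rows without an assigned borough
toy = toy[toy['Borough'] != 'Not assigned'].copy()

# Fill unassigned neighborhoods with the borough of the same row
mask = toy['Neighborhood'] == 'Not assigned'
toy.loc[mask, 'Neighborhood'] = toy.loc[mask, 'Borough']

# One row per postal code, neighborhoods joined by a comma and a space
cleaned = (toy.groupby(['PostalCode', 'Borough'], sort=False)['Neighborhood']
              .agg(', '.join)
              .reset_index())
print(cleaned)
```

Because the borough is read from each row, this variant would still work if more than one borough had unassigned neighborhoods.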


### Step 3: Checking the shape of the dataframe

In [201]:
print('Number of rows: {}, Number of columns: {}'.format(data_df.shape[0], data_df.shape[1]))
print(data_df.shape)

Number of rows: 103, Number of columns: 3
(103, 3)


---

## Part 2

### Step 1: Loading Geospatial data
1. Store the URL of the CSV file containing the geospatial data in 'path'.
2. Create a dataframe containing the geospatial data by reading the CSV file, and display the head and tail of the dataframe.

In [125]:
# 1.
path = 'https://cocl.us/Geospatial_data'
# 2.
Geospatial_data = pd.read_csv(path)
Geospatial_data

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476
...,...,...,...
98,M9N,43.706876,-79.518188
99,M9P,43.696319,-79.532242
100,M9R,43.688905,-79.554724
101,M9V,43.739416,-79.588437


### Step 2: Creating the new dataframe
1. Rename the postal code column of the Geospatial_data dataframe so that it shares a common key with the data_df dataframe.
2. Merge the two dataframes along their common key, PostalCode, and display the top rows of the new dataframe.

In [126]:
Geospatial_data.rename(columns={'Postal Code':'PostalCode'}, inplace=True)
neigh_df = pd.merge(data_df, Geospatial_data, on='PostalCode')
neigh_df.head(12)

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,Harbourfront,43.65426,-79.360636
3,M6A,North York,"Lawrence Heights, Lawrence Manor",43.718518,-79.464763
4,M7A,Queen's Park,Queen's Park,43.662301,-79.389494
5,M9A,Downtown Toronto,Queen's Park,43.667856,-79.532242
6,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353
7,M3B,North York,Don Mills North,43.745906,-79.352188
8,M4B,East York,"Woodbine Gardens, Parkview Hill",43.706397,-79.309937
9,M5B,Downtown Toronto,"Ryerson, Garden District",43.657162,-79.378937
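
An inner merge silently drops postal codes that have no match in the geospatial data. As a hedged sketch of how this could be checked, the example below uses two small hand-made frames (only the first two coordinate pairs are taken from the output above; the missing M5A entry is deliberate) together with the `how`, `indicator` and `validate` parameters of `pd.merge`.

```python
import pandas as pd

# Hand-made frames: M5A deliberately has no coordinates on the right side
left = pd.DataFrame({
    'PostalCode': ['M3A', 'M4A', 'M5A'],
    'Neighborhood': ['Parkwoods', 'Victoria Village', 'Harbourfront'],
})
right = pd.DataFrame({
    'PostalCode': ['M3A', 'M4A'],
    'Latitude': [43.753259, 43.725882],
    'Longitude': [-79.329656, -79.315572],
})

# Keep every row of the left frame, flag where each row came from,
# and fail early if a postal code appears more than once on either side
checked = pd.merge(left, right, on='PostalCode', how='left',
                   indicator=True, validate='one_to_one')

# Postal codes that did not find coordinates
unmatched = checked.loc[checked['_merge'] == 'left_only', 'PostalCode'].tolist()
print(unmatched)
```

An empty `unmatched` list would indicate that every postal code received coordinates.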


---

## Part 3

### Step 0: Installing and importing packages

In [225]:
from sklearn.cluster import KMeans
import requests
import folium
import matplotlib.cm as cm
import matplotlib.colors as colors
print('Packages imported')

Packages imported


### Step 1: Creating dataframe of boroughs specifically in Toronto
1. Create a new dataframe, tor_neigh, by selecting all rows of neigh_df whose borough has 'Toronto' as part of its name.
2. Reset the indices of the dataframe, and display its top 10 rows.

In [137]:
# 1.
tor_neigh = neigh_df[neigh_df['Borough'].str.contains('Toronto')]
# 2.
tor_neigh.reset_index(inplace=True, drop=True)
tor_neigh.head(10)

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M5A,Downtown Toronto,Harbourfront,43.65426,-79.360636
1,M9A,Downtown Toronto,Queen's Park,43.667856,-79.532242
2,M5B,Downtown Toronto,"Ryerson, Garden District",43.657162,-79.378937
3,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418
4,M4E,East Toronto,The Beaches,43.676357,-79.293031
5,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306
6,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383
7,M6G,Downtown Toronto,Christie,43.669542,-79.422564
8,M5H,Downtown Toronto,"Adelaide, King, Richmond",43.650571,-79.384568
9,M6H,West Toronto,"Dovercourt Village, Dufferin",43.669005,-79.442259


### Step 2: Visualizing each neighborhood
1. Set latitude and longitude as mean values of the coordinates of all the selected neighborhoods/postal codes.
2. Initiate a map of Toronto based on aforementioned coordinates.
3. Plot circular markers at the location of each postal code, showing the names of neighborhoods and boroughs.
4. Visualize/show the map.

In [198]:
# 1.
latitude = tor_neigh.Latitude.mean()
longitude = tor_neigh.Longitude.mean()
# 2.
tor_map = folium.Map(location=[latitude, longitude], zoom_start=10.5)
# 3.
for lat, lng, neigh, bor in zip(tor_neigh['Latitude'], tor_neigh['Longitude'], tor_neigh['Neighborhood'], tor_neigh['Borough']):
    label = '{}, {}'.format(neigh, bor)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=4,
        popup=label,
        color='black',
        fill=True,
        fill_color='white',
        fill_opacity=0.4).add_to(tor_map)
# 4.
tor_map

### Step 3: Pre-exploration
1. Check which borough in Toronto has the largest number of neighborhoods/postal codes, to get some descriptive statistics of the neighborhood data.
2. Locate the first neighborhood in the neighborhood data.

In [218]:
# 1.
print('Boroughs sorted by number of neighborhoods/postal codes:')
print(tor_neigh['Borough'].value_counts())
print('')

# 2.
print('The first neighborhood is {}, which has a latitude of {} and a longitude of {}'.format(
    tor_neigh.loc[0,'Neighborhood'], tor_neigh.loc[0,'Latitude'], tor_neigh.loc[0,'Longitude']))

Boroughs sorted by number of neighborhoods/postal codes:
Downtown Toronto    19
Central Toronto      9
West Toronto         6
East Toronto         5
Name: Borough, dtype: int64

The first neighborhood is Harbourfront, which has a latitude of 43.6542599 and a longitude of -79.3606359


#### **These coordinates will be the foundation of my Foursquare API searches, which follow below.**

### Step 4: Exploration of neighborhoods in Toronto

**1. Setting up my credentials in order to use the Foursquare API**

In [223]:
# Replace the placeholders with your own Foursquare credentials; do not publish real keys.
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID'
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET'
VERSION = '20180605'
limit = 100
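
One way to keep credentials out of the notebook altogether is to read them from environment variables. The sketch below is only an illustration: the helper `load_foursquare_credentials` and the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` are hypothetical and not part of the assignment.

```python
import os

def load_foursquare_credentials(env=os.environ):
    """Read the Foursquare credentials from a mapping of environment variables.

    The variable names are hypothetical; use whatever fits your setup.
    """
    client_id = env.get('FOURSQUARE_CLIENT_ID')
    client_secret = env.get('FOURSQUARE_CLIENT_SECRET')
    if not client_id or not client_secret:
        raise RuntimeError('Foursquare credentials are not set in the environment')
    return client_id, client_secret

# Demonstration with an explicit mapping instead of the real environment
demo_id, demo_secret = load_foursquare_credentials({
    'FOURSQUARE_CLIENT_ID': 'demo-id',
    'FOURSQUARE_CLIENT_SECRET': 'demo-secret',
})
```

In the notebook, `CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials()` would then pick up the values exported in the shell that started Jupyter.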

**2. Defining the getNearbyVenues function, which locates venues in a given area**

In [221]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    # Collect up to `limit` venues within `radius` metres of each neighborhood
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)

        # Build the request URL for the Foursquare explore endpoint
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            limit)
            
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
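
The function above indexes straight into the JSON response, so an empty result or a venue without a category raises an exception. As a hedged sketch, the parsing step could be isolated in a small helper that tolerates both cases. `parse_explore_items` is a hypothetical name, and the sample payload is made up to mirror only the fields accessed above.

```python
def parse_explore_items(name, lat, lng, payload):
    """Turn one explore-style JSON payload into rows of venue data.

    Hypothetical helper: returns an empty list when the payload has no
    groups, and labels venues without a category as 'Uncategorized'.
    """
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        category = categories[0]['name'] if categories else 'Uncategorized'
        rows.append((name, lat, lng,
                     venue['name'],
                     venue['location']['lat'],
                     venue['location']['lng'],
                     category))
    return rows

# Made-up payload with the same nesting as the fields used above
sample = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Tandem Coffee',
               'location': {'lat': 43.653559, 'lng': -79.361809},
               'categories': [{'name': 'Coffee Shop'}]}},
    {'venue': {'name': 'Example Venue',
               'location': {'lat': 43.654, 'lng': -79.360},
               'categories': []}},
]}]}}

rows = parse_explore_items('Harbourfront', 43.65426, -79.360636, sample)
```

Each returned tuple has the same seven fields, in the same order, as the columns assigned to `nearby_venues`.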

**3. Use the function created above to find venues in Toronto, and display the top rows of the dataframe.**

In [227]:
tor_venues = getNearbyVenues(names = tor_neigh['Neighborhood'],
                             latitudes = tor_neigh['Latitude'],
                             longitudes = tor_neigh['Longitude'])
tor_venues.head()

Harbourfront
Queen's Park
Ryerson, Garden District
St. James Town
The Beaches
Berczy Park
Central Bay Street
Christie
Adelaide, King, Richmond
Dovercourt Village, Dufferin
Harbourfront East, Toronto Islands, Union Station
Little Portugal, Trinity
The Danforth West, Riverdale
Design Exchange, Toronto Dominion Centre
Brockton, Exhibition Place, Parkdale Village
The Beaches West, India Bazaar
Commerce Court, Victoria Hotel
Studio District
Lawrence Park
Roselawn
Davisville North
Forest Hill North, Forest Hill West
High Park, The Junction South
North Toronto West
The Annex, North Midtown, Yorkville
Parkdale, Roncesvalles
Davisville
Harbord, University of Toronto
Runnymede, Swansea
Moore Park, Summerhill East
Chinatown, Grange Park, Kensington Market
Deer Park, Forest Hill SE, Rathnelly, South Hill, Summerhill West
CN Tower, Bathurst Quay, Island airport, Harbourfront West, King and Spadina, Railway Lands, South Niagara
Rosedale
Stn A PO Boxes 25 The Esplanade
Cabbagetown, St. James Town
Fir

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Harbourfront,43.65426,-79.360636,Roselle Desserts,43.653447,-79.362017,Bakery
1,Harbourfront,43.65426,-79.360636,Tandem Coffee,43.653559,-79.361809,Coffee Shop
2,Harbourfront,43.65426,-79.360636,Cooper Koo Family YMCA,43.653191,-79.357947,Gym / Fitness Center
3,Harbourfront,43.65426,-79.360636,Body Blitz Spa East,43.654735,-79.359874,Spa
4,Harbourfront,43.65426,-79.360636,Morning Glory Cafe,43.653947,-79.361149,Breakfast Spot


**4. Use one-hot encoding to create a sparse dataframe.**

In [250]:
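# One-hot encode the venue categories
tor_onehot = pd.get_dummies(tor_venues[['Venue Category']], prefix="", prefix_sep="")

# Add the neighborhood of each venue
tor_onehot['Neighborhood'] = tor_venues['Neighborhood'] 

# Move the 'Neighborhood' column to the front (a merge here would duplicate rows)
tor_onehot = tor_onehot[['Neighborhood'] + [col for col in tor_onehot.columns if col != 'Neighborhood']]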

tor_onehot.head()

Unnamed: 0,Neighborhood,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,...,Thrift / Vintage Store,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wings Joint,Yoga Studio
0,Harbourfront,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Harbourfront,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Harbourfront,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Harbourfront,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Harbourfront,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


**5. Group rows by neighborhood, and find the mean frequency of each venue category in each neighborhood.**

In [251]:
tor_grouped = tor_onehot.groupby('Neighborhood').mean().reset_index()
tor_grouped

Unnamed: 0,Neighborhood,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,...,Thrift / Vintage Store,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wings Joint,Yoga Studio
0,"Adelaide, King, Richmond",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,...,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0
1,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.018182,0.0,0.0,0.0,0.0,0.0
2,"Brockton, Exhibition Place, Parkdale Village",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455
3,Business Reply Mail Processing Centre 969 Eastern,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"CN Tower, Bathurst Quay, Island airport, Harbo...",0.0,0.066667,0.066667,0.066667,0.133333,0.133333,0.133333,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,"Cabbagetown, St. James Town",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Central Bay Street,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011905,0.0,...,0.0,0.0,0.0,0.0,0.011905,0.0,0.0,0.011905,0.0,0.011905
7,"Chinatown, Grange Park, Kensington Market",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.010638,0.0,0.0,0.0,0.031915,0.0,0.042553,0.010638,0.0,0.0
8,Christie,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Church and Wellesley,0.012048,0.0,0.0,0.0,0.0,0.0,0.0,0.012048,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.012048,0.0,0.012048,0.012048
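
To make the meaning of these frequencies concrete, here is a minimal sketch on a made-up venue table with two neighborhoods, 'A' and 'B' (illustrative data only). The mean of a one-hot column within a neighborhood is the share of that neighborhood's venues that fall into the category.

```python
import pandas as pd

# Made-up venues: four in neighborhood A, two in neighborhood B
venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Coffee Shop', 'Coffee Shop', 'Bakery', 'Park', 'Park', 'Park'],
})

# One indicator column per category, one row per venue
onehot = pd.get_dummies(venues['Venue Category'], dtype=int)
onehot.insert(0, 'Neighborhood', venues['Neighborhood'])

# Mean of the indicators = share of venues per category in each neighborhood
grouped = onehot.groupby('Neighborhood').mean().reset_index()
print(grouped)
```

In this toy example half of the venues in A are coffee shops, so A's 'Coffee Shop' value is 0.5, and every row sums to 1.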


**6. Define a function that returns the most common venues of each neighborhood.**

In [252]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

**7. Use the function to find the top 10 venues in each neighborhood in Toronto.**

In [253]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = tor_grouped['Neighborhood']

for ind in np.arange(tor_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(tor_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Adelaide, King, Richmond",Coffee Shop,Café,Steakhouse,Thai Restaurant,Salad Place,Burger Joint,Bar,Bakery,Sushi Restaurant,Asian Restaurant
1,Berczy Park,Coffee Shop,Beer Bar,Cocktail Bar,Steakhouse,Café,Cheese Shop,Farmers Market,Bakery,Seafood Restaurant,Bistro
2,"Brockton, Exhibition Place, Parkdale Village",Breakfast Spot,Café,Coffee Shop,Yoga Studio,Gym,Restaurant,Pet Store,Performing Arts Venue,Italian Restaurant,Intersection
3,Business Reply Mail Processing Centre 969 Eastern,Burrito Place,Auto Workshop,Restaurant,Skate Park,Comic Shop,Garden Center,Gym / Fitness Center,Garden,Brewery,Smoke Shop
4,"CN Tower, Bathurst Quay, Island airport, Harbo...",Airport Lounge,Airport Service,Airport Terminal,Boat or Ferry,Boutique,Sculpture Garden,Rental Car Location,Bar,Harbor / Marina,Airport Gate
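
The same ranking can be sketched more compactly with `Series.nlargest`. The example below is an alternative under stated assumptions, not the method used above: the frequency table is made up, and `ordinal` is a hypothetical helper that also handles numbers such as 11 and 21 correctly.

```python
import pandas as pd

def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

# Made-up frequency table in the same layout as tor_grouped
grouped = pd.DataFrame({
    'Neighborhood': ['A', 'B'],
    'Bakery': [0.2, 0.0],
    'Coffee Shop': [0.5, 0.1],
    'Park': [0.3, 0.9],
})

num_top = 2
freqs = grouped.set_index('Neighborhood')

# For each neighborhood, keep the names of the num_top most frequent categories
top = freqs.apply(lambda row: pd.Series(row.nlargest(num_top).index.tolist()), axis=1)
top.columns = ['{} Most Common Venue'.format(ordinal(i + 1)) for i in range(num_top)]
top = top.reset_index()
print(top)
```

Ties between equally frequent categories are broken by column order in this sketch.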


**8. Cluster the neighborhoods.**

In [254]:
kclusters = 5

tor_grouped_clustering = tor_grouped.drop(columns='Neighborhood')

kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(tor_grouped_clustering)

kmeans.labels_[0:10] 

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int32)
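
The first ten labels are all 0, which hints at a very uneven split. A quick way to see how many neighborhoods fall into each cluster is to count the labels. The sketch below shows the idea on made-up frequency vectors rather than on `tor_grouped_clustering`; `n_init=10` is set explicitly as an assumption so that the behavior does not depend on the scikit-learn version.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up frequency vectors: three similar rows and one clearly different row
X = np.array([
    [0.8, 0.2, 0.0],
    [0.7, 0.3, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.1, 0.9],
])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Number of rows assigned to each cluster label
sizes = np.bincount(km.labels_, minlength=2)
print(sizes)
```

Applied to the notebook's data, `np.bincount(kmeans.labels_, minlength=kclusters)` would give the size of each of the five clusters.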

**9. Create a dataframe showing each neighborhood with its most common venues and its cluster label.**

In [263]:
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

tor_merged = tor_neigh

tor_merged = tor_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

tor_merged = tor_merged[np.isfinite(tor_merged['Cluster Labels'])].reset_index(drop=True)

tor_merged.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M5A,Downtown Toronto,Harbourfront,43.65426,-79.360636,0.0,Coffee Shop,Park,Bakery,Pub,Breakfast Spot,Restaurant,Café,Theater,Mexican Restaurant,Dessert Shop
1,M5B,Downtown Toronto,"Ryerson, Garden District",43.657162,-79.378937,0.0,Coffee Shop,Clothing Store,Cosmetics Shop,Café,Middle Eastern Restaurant,Fast Food Restaurant,Bakery,Theater,Bookstore,Pizza Place
2,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418,0.0,Café,Coffee Shop,Restaurant,Clothing Store,Hotel,Italian Restaurant,Beer Bar,American Restaurant,Cosmetics Shop,Breakfast Spot
3,M4E,East Toronto,The Beaches,43.676357,-79.293031,0.0,Other Great Outdoors,Trail,Health Food Store,Pub,Dance Studio,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant,Dog Run
4,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306,0.0,Coffee Shop,Beer Bar,Cocktail Bar,Steakhouse,Café,Cheese Shop,Farmers Market,Bakery,Seafood Restaurant,Bistro


**10. Visualize the clustered neighborhoods.**

In [264]:
clustered_map = folium.Map(location=[latitude, longitude], zoom_start=11)

# One colour per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

markers_colors = []
for lat, lon, poi, cluster in zip(tor_merged['Latitude'], tor_merged['Longitude'], tor_merged['Neighborhood'], tor_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster)],
        fill=True,
        fill_color=rainbow[int(cluster)],
        fill_opacity=0.7).add_to(clustered_map)
       
clustered_map
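
The colour lookup used for the markers can be sketched in isolation. The snippet below builds one hex colour per cluster and indexes the palette directly by the cluster label; `color_for` is a hypothetical helper, and the float argument mirrors the float labels in `tor_merged`.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 5

# Evenly spaced colours from the rainbow colormap, converted to hex strings
palette = [colors.rgb2hex(c) for c in cm.rainbow(np.linspace(0, 1, kclusters))]

def color_for(label):
    """Map a cluster label (int or float such as 0.0) to its hex colour."""
    return palette[int(label)]

print([color_for(label) for label in (0.0, 1.0, 4.0)])
```

Indexing by the label itself keeps cluster 0 on the first colour of the palette.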

**Observation: The overwhelming majority of neighborhoods belong to the same cluster.**

### Step 5: Analyzing each cluster

**Cluster 1**

In [265]:
tor_merged.loc[tor_merged['Cluster Labels']==0]

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M5A,Downtown Toronto,Harbourfront,43.65426,-79.360636,0.0,Coffee Shop,Park,Bakery,Pub,Breakfast Spot,Restaurant,Café,Theater,Mexican Restaurant,Dessert Shop
1,M5B,Downtown Toronto,"Ryerson, Garden District",43.657162,-79.378937,0.0,Coffee Shop,Clothing Store,Cosmetics Shop,Café,Middle Eastern Restaurant,Fast Food Restaurant,Bakery,Theater,Bookstore,Pizza Place
2,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418,0.0,Café,Coffee Shop,Restaurant,Clothing Store,Hotel,Italian Restaurant,Beer Bar,American Restaurant,Cosmetics Shop,Breakfast Spot
3,M4E,East Toronto,The Beaches,43.676357,-79.293031,0.0,Other Great Outdoors,Trail,Health Food Store,Pub,Dance Studio,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant,Dog Run
4,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306,0.0,Coffee Shop,Beer Bar,Cocktail Bar,Steakhouse,Café,Cheese Shop,Farmers Market,Bakery,Seafood Restaurant,Bistro
5,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383,0.0,Coffee Shop,Sandwich Place,Italian Restaurant,Burger Joint,Ice Cream Shop,Café,Japanese Restaurant,Fried Chicken Joint,Chinese Restaurant,Salad Place
6,M6G,Downtown Toronto,Christie,43.669542,-79.422564,0.0,Grocery Store,Café,Park,Restaurant,Baby Store,Athletics & Sports,Candy Store,Diner,Italian Restaurant,Nightclub
7,M5H,Downtown Toronto,"Adelaide, King, Richmond",43.650571,-79.384568,0.0,Coffee Shop,Café,Steakhouse,Thai Restaurant,Salad Place,Burger Joint,Bar,Bakery,Sushi Restaurant,Asian Restaurant
8,M6H,West Toronto,"Dovercourt Village, Dufferin",43.669005,-79.442259,0.0,Pharmacy,Bakery,Supermarket,Pizza Place,Café,Middle Eastern Restaurant,Art Gallery,Bar,Bank,Smoke Shop
9,M5J,Downtown Toronto,"Harbourfront East, Toronto Islands, Union Station",43.640816,-79.381752,0.0,Coffee Shop,Aquarium,Café,Hotel,Italian Restaurant,Restaurant,Brewery,Scenic Lookout,Fried Chicken Joint,Pizza Place


**Observation: Neighborhoods in this cluster seem to contain a lot of restaurants, pubs/bars, cafés, and a wide variety of stores. This resembles the type of venues that one would expect to find in the city centre.**

**Cluster 2**

In [266]:
tor_merged.loc[tor_merged['Cluster Labels']==1]

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
20,M5P,Central Toronto,"Forest Hill North, Forest Hill West",43.696948,-79.411307,1.0,Park,Jewelry Store,Trail,Sushi Restaurant,Yoga Studio,Deli / Bodega,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant
32,M4W,Downtown Toronto,Rosedale,43.679563,-79.377529,1.0,Park,Playground,Trail,Dance Studio,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant,Dog Run


**Observation: Neighborhoods in this cluster contain parks, restaurants and a few stores. This suggests that the venues in these two neighborhoods cater to various pastime activities that their inhabitants might enjoy.**

**Cluster 3**

In [267]:
tor_merged.loc[tor_merged['Cluster Labels']==2]

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
17,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879,2.0,Park,Gym / Fitness Center,Swim School,Bus Line,Yoga Studio,Department Store,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop


**Observation: The venues in the only neighborhood in this cluster appear to focus heavily on pastime activities, as they consist of parks, gyms and yoga studios, along with a few stores and a few restaurants.**

**Cluster 4**

In [268]:
tor_merged.loc[tor_merged['Cluster Labels']==3]

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
18,M5N,Central Toronto,Roselawn,43.711695,-79.416936,3.0,Health & Beauty Service,Garden,Yoga Studio,Deli / Bodega,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant


**Observation: The venues in this cluster/neighborhood seem to focus on health and beauty, judging by the three most common venues. Additionally, there appear to be quite a few restaurants in this cluster.**

**Cluster 5**

In [269]:
tor_merged.loc[tor_merged['Cluster Labels']==4]

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
28,M4T,Central Toronto,"Moore Park, Summerhill East",43.689574,-79.38316,4.0,Restaurant,Playground,Intersection,Deli / Bodega,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant,Dog Run


**Observation: This cluster contains a lot of restaurants, as well as some stores and a playground. This, together with the fact that intersections are the third most common venue in the neighborhood, suggests that the cluster/neighborhood lies quite close to the city centre.**