# Applied Data Science Capstone

Welcome to my Coursera Capstone Project notebook. In this notebook I develop my final project for the [IBM Data Science](https://www.coursera.org/professional-certificates/ibm-data-science) course.

In [2]:
#!pip install geocoder
#!pip install bs4

In [188]:
import pandas as pd
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors
import requests, geocoder, folium, json
from bs4 import BeautifulSoup
from sklearn.cluster import KMeans

# Segmenting and Clustering
## Convert table to dataframe

First, I fetched and parsed the Wikipedia page.

In [45]:
url = 'https://en.wikipedia.org/w/index.php?title=List_of_postal_codes_of_Canada:_M&oldid=945633050'
page = requests.get(url)
soup = BeautifulSoup(page.text, 'html.parser')

The Wikipedia page has a table containing all the postal codes. Instead of looping over the HTML rows with a `for` loop, I used BeautifulSoup's `soup.find()` method, which returns the first `<table>` element on the page. Then I converted it into a dataframe using `pd.read_html()`.

In [46]:
table = soup.find('table')
codes = pd.read_html(str(table))[0]
codes.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront


To exclude invalid borough values, I removed all rows whose borough is `Not assigned`.

In [47]:
codes = codes[(codes.Borough != 'Not assigned')]
codes.head(12)

Unnamed: 0,Postcode,Borough,Neighbourhood
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront
5,M6A,North York,Lawrence Heights
6,M6A,North York,Lawrence Manor
7,M7A,Downtown Toronto,Queen's Park
9,M9A,Etobicoke,Islington Avenue
10,M1B,Scarborough,Rouge
11,M1B,Scarborough,Malvern
13,M3B,North York,Don Mills North


Then I renamed the columns `Postcode` to `Postal Code` and `Neighbourhood` to `Neighborhood`, and grouped the rows by postal code and borough, so that neighborhoods sharing a postal code are joined into a single comma-separated value.

In [48]:
codes.rename(columns = {'Postcode' : 'Postal Code', 'Neighbourhood':'Neighborhood'}, inplace = True)
codes = codes.groupby(by=['Postal Code','Borough'], sort=False).agg( ', '.join).reset_index()
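
To make the effect of the grouping step concrete, here is a minimal sketch on a hand-made frame. The three rows are copied from the output above for illustration only, and `toy` and `grouped` are names I made up for this example.

```python
import pandas as pd

# Toy rows mirroring the table above: M6A appears twice, once per neighborhood.
toy = pd.DataFrame({
    'Postal Code': ['M5A', 'M6A', 'M6A'],
    'Borough': ['Downtown Toronto', 'North York', 'North York'],
    'Neighborhood': ['Harbourfront', 'Lawrence Heights', 'Lawrence Manor'],
})

# One row per (Postal Code, Borough); neighborhood names are joined with ', '.
# sort=False keeps the order in which the postal codes first appear.
grouped = (
    toy.groupby(['Postal Code', 'Borough'], sort=False)['Neighborhood']
       .agg(', '.join)
       .reset_index()
)
print(grouped)
```

The two `M6A` rows collapse into one row whose `Neighborhood` value lists both names.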

After grouping, the dataset has a total of **103 rows**.

In [49]:
codes.shape

(103, 3)

## Merge postal codes and geolocations dataframes

The following snippet reads the location coordinates and merges them with `codes` into a new dataframe, `df`, containing all the information.

In [50]:
geo = pd.read_csv('https://raw.githubusercontent.com/thiagobodruk/Coursera_Capstone/master/Geospatial_Coordinates.csv')
df = codes.merge(geo, on = 'Postal Code')
df.head()

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,Harbourfront,43.65426,-79.360636
3,M6A,North York,"Lawrence Heights, Lawrence Manor",43.718518,-79.464763
4,M7A,Downtown Toronto,Queen's Park,43.662301,-79.389494
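
A default `merge` is an inner join, so a postal code without coordinates would silently disappear. The sketch below shows one way to check for that on toy data. The frames `codes_toy` and `geo_toy` and the postal code `M9Z` are made up for the example.

```python
import pandas as pd

codes_toy = pd.DataFrame({
    'Postal Code': ['M3A', 'M4A', 'M9Z'],  # 'M9Z' is a made-up code with no coordinates
    'Borough': ['North York', 'North York', 'Nowhere'],
})
geo_toy = pd.DataFrame({
    'Postal Code': ['M3A', 'M4A'],
    'Latitude': [43.753259, 43.725882],
    'Longitude': [-79.329656, -79.315572],
})

# how='left' keeps every postal code, indicator=True adds a '_merge' column,
# and validate='one_to_one' raises if either side repeats a postal code.
merged = codes_toy.merge(geo_toy, on='Postal Code', how='left',
                         indicator=True, validate='one_to_one')
unmatched = merged.loc[merged['_merge'] == 'left_only', 'Postal Code'].tolist()
print(unmatched)
```

An empty `unmatched` list on the real data would confirm that every postal code found its coordinates.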


### Map of Toronto

Now, let's plot the Toronto map, based on the dataframe locations, using the Folium library.

In [51]:
toronto_lat = 43.6532
toronto_lng = -79.3832
map_toronto = folium.Map(location = [toronto_lat, toronto_lng], zoom_start = 10.7)

for lat, lng, borough, neighborhood in zip(df['Latitude'], df['Longitude'], df['Borough'], df['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_toronto)  
map_toronto

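### Select all neighborhoods from the Toronto boroughs

Let's first get all the neighborhoods from the boroughs whose name contains "Toronto" (Downtown, East, West and Central Toronto), along with a reference latitude and longitude taken from the first Downtown Toronto row.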

In [186]:
borough = df[df['Borough'].str.contains("Toronto")]
borough.head()

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude
2,M5A,Downtown Toronto,Harbourfront,43.65426,-79.360636
4,M7A,Downtown Toronto,Queen's Park,43.662301,-79.389494
9,M5B,Downtown Toronto,"Ryerson, Garden District",43.657162,-79.378937
15,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418
19,M4E,East Toronto,The Beaches,43.676357,-79.293031
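
As a quick illustration of what the `str.contains` filter keeps, here is a small sketch on a hand-written list of borough names. The list is only an example, not the full set in `df`.

```python
import pandas as pd

boroughs = pd.Series([
    'North York', 'Downtown Toronto', 'East Toronto',
    'Scarborough', 'West Toronto', 'Central Toronto',
])

# regex=False treats 'Toronto' as a plain substring rather than a pattern.
mask = boroughs.str.contains('Toronto', regex=False)
selected = boroughs[mask].tolist()
print(selected)
```

Every borough with "Toronto" in its name is kept, so the selection is wider than Downtown Toronto alone.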


In [187]:
borough_lat = df.loc[2, 'Latitude']
borough_lng = df.loc[2, 'Longitude']
borough_name = df.loc[2, 'Borough']
print('The geographical coordinates of {} are {}, {}.'.format(borough_name, borough_lat, borough_lng))

The geographical coordinates of Downtown Toronto are 43.6542599, -79.3606359.


Then, let's plot all the selected neighborhoods.

In [111]:
map_borough = folium.Map(location = [borough_lat, borough_lng], zoom_start = 11)

for lat, lng, label in zip(borough['Latitude'], borough['Longitude'], borough['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_borough)  
map_borough

### Explore using Foursquare's API

In [142]:
import os

def explore(latitude, longitude):
    #print('Location:', (latitude, longitude))
    # The API keys are read from environment variables instead of being stored
    # in the notebook. The variable names below are an assumption; adjust them
    # to your own setup.
    CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID']
    CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET']
    VERSION = '20180604'
    radius = 500
    limit = 10
    url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, radius, limit)
    results = requests.get(url).json()
    #print('Response Code:', results['meta']['code'])
    return results['response']['groups'][0]['items']
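
As a side note, the same request URL can be assembled with `urllib.parse.urlencode`, which escapes the parameter values for us. The sketch below only builds the URL and sends nothing. `build_explore_url` is a hypothetical helper and the credentials are placeholders.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_explore_url(latitude, longitude, client_id, client_secret,
                      radius=500, limit=10, version='20180604'):
    """Build the venues/explore URL without sending a request."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'll': '{},{}'.format(latitude, longitude),
        'v': version,
        'radius': radius,
        'limit': limit,
    }
    return 'https://api.foursquare.com/v2/venues/explore?' + urlencode(params)

# Placeholder credentials, for illustration only.
url = build_explore_url(43.6542599, -79.3606359, 'YOUR_ID', 'YOUR_SECRET')

# Decode the query string again to inspect what would be sent.
query = parse_qs(urlsplit(url).query)
print(query['ll'], query['radius'], query['limit'])
```

Keeping the parameters in a dictionary also makes it easy to change `radius` or `limit` per call.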

In [124]:
result = explore(borough_lat, borough_lng)
result

Location: (43.6542599, -79.3606359)
Response Code: 200


[{'reasons': {'count': 0,
   'items': [{'summary': 'This spot is popular',
     'type': 'general',
     'reasonName': 'globalInteractionReason'}]},
  'venue': {'id': '54ea41ad498e9a11e9e13308',
   'name': 'Roselle Desserts',
   'location': {'address': '362 King St E',
    'crossStreet': 'Trinity St',
    'lat': 43.653446723052674,
    'lng': -79.3620167174383,
    'labeledLatLngs': [{'label': 'display',
      'lat': 43.653446723052674,
      'lng': -79.3620167174383}],
    'distance': 143,
    'postalCode': 'M5A 1K9',
    'cc': 'CA',
    'city': 'Toronto',
    'state': 'ON',
    'country': 'Canada',
    'formattedAddress': ['362 King St E (Trinity St)',
     'Toronto ON M5A 1K9',
     'Canada']},
   'categories': [{'id': '4bf58dd8d48988d16a941735',
     'name': 'Bakery',
     'pluralName': 'Bakeries',
     'shortName': 'Bakery',
     'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/bakery_',
      'suffix': '.png'},
     'primary': True}],
   'photos': {'count': 0, 'grou

The following function extracts the category name from each venue; the results are then stored in a new dataframe, `nearby_venues`.

In [126]:
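def get_category_type(row):
    # The column is called 'categories' or 'venue.categories', depending on
    # whether the column names have already been shortened.
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    # A venue may come back without any category.
    if len(categories_list) == 0:
        return None
    return categories_list[0]['name']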

In [135]:
nearby_venues = pd.json_normalize(result)

filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]
nearby_venues.head(10)

Unnamed: 0,name,categories,lat,lng
0,Roselle Desserts,Bakery,43.653447,-79.362017
1,Tandem Coffee,Coffee Shop,43.653559,-79.361809
2,Cooper Koo Family YMCA,Distribution Center,43.653249,-79.358008
3,Body Blitz Spa East,Spa,43.654735,-79.359874
4,Morning Glory Cafe,Breakfast Spot,43.653947,-79.361149
5,Impact Kitchen,Restaurant,43.656369,-79.35698
6,Corktown Common,Park,43.655618,-79.356211
7,Figs Breakfast & Lunch,Breakfast Spot,43.655675,-79.364503
8,The Distillery Historic District,Historic Site,43.650244,-79.359323
9,Dominion Pub and Kitchen,Pub,43.656919,-79.358967


In [137]:
print('{} venues were returned in {}.'.format(nearby_venues.shape[0], borough_name))

10 venues were returned in Downtown Toronto.


Now, we create a function that explores the venues near each neighborhood and assembles all the information.

In [139]:
def getNearbyVenues(names, latitudes, longitudes):
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
        results = explore(lat, lng)
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    return(nearby_venues)
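
One thing to keep in mind: indexing `categories[0]` raises an `IndexError` if a venue has no category. Below is a minimal sketch of a more tolerant flattening step. `venue_rows` is a hypothetical helper, and the second sample venue is invented to show the empty case.

```python
def venue_rows(name, lat, lng, items):
    """Flatten explore() items into tuples, tolerating venues without categories."""
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # Fall back to None instead of failing when the list is empty.
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng, venue['name'],
                     venue['location']['lat'], venue['location']['lng'],
                     category))
    return rows

sample_items = [
    {'venue': {'name': 'Roselle Desserts',
               'location': {'lat': 43.653447, 'lng': -79.362017},
               'categories': [{'name': 'Bakery'}]}},
    {'venue': {'name': 'Uncategorized Place',  # made-up venue with no category
               'location': {'lat': 43.65, 'lng': -79.36},
               'categories': []}},
]
rows = venue_rows('Harbourfront', 43.65426, -79.360636, sample_items)
for row in rows:
    print(row)
```

Rows with a `None` category can then be dropped or kept before building the dataframe.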

In [146]:
borough_venues = getNearbyVenues(
    borough['Neighborhood'], borough['Latitude'], borough['Longitude'])

Harbourfront
Queen's Park
Ryerson, Garden District
St. James Town
The Beaches
Berczy Park
Central Bay Street
Christie
Adelaide, King, Richmond
Dovercourt Village, Dufferin
Harbourfront East, Toronto Islands, Union Station
Little Portugal, Trinity
The Danforth West, Riverdale
Design Exchange, Toronto Dominion Centre
Brockton, Exhibition Place, Parkdale Village
The Beaches West, India Bazaar
Commerce Court, Victoria Hotel
Studio District
Lawrence Park
Roselawn
Davisville North
Forest Hill North, Forest Hill West
High Park, The Junction South
North Toronto West
The Annex, North Midtown, Yorkville
Parkdale, Roncesvalles
Davisville
Harbord, University of Toronto
Runnymede, Swansea
Moore Park, Summerhill East
Chinatown, Grange Park, Kensington Market
Deer Park, Forest Hill SE, Rathnelly, South Hill, Summerhill West
CN Tower, Bathurst Quay, Island airport, Harbourfront West, King and Spadina, Railway Lands, South Niagara
Rosedale
Stn A PO Boxes 25 The Esplanade
Cabbagetown, St. James Town
Fir

In [147]:
borough_venues.head(10)

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Harbourfront,43.65426,-79.360636,Roselle Desserts,43.653447,-79.362017,Bakery
1,Harbourfront,43.65426,-79.360636,Tandem Coffee,43.653559,-79.361809,Coffee Shop
2,Harbourfront,43.65426,-79.360636,Cooper Koo Family YMCA,43.653249,-79.358008,Distribution Center
3,Harbourfront,43.65426,-79.360636,Body Blitz Spa East,43.654735,-79.359874,Spa
4,Harbourfront,43.65426,-79.360636,Morning Glory Cafe,43.653947,-79.361149,Breakfast Spot
5,Harbourfront,43.65426,-79.360636,Impact Kitchen,43.656369,-79.35698,Restaurant
6,Harbourfront,43.65426,-79.360636,Corktown Common,43.655618,-79.356211,Park
7,Harbourfront,43.65426,-79.360636,Figs Breakfast & Lunch,43.655675,-79.364503,Breakfast Spot
8,Harbourfront,43.65426,-79.360636,The Distillery Historic District,43.650244,-79.359323,Historic Site
9,Harbourfront,43.65426,-79.360636,Dominion Pub and Kitchen,43.656919,-79.358967,Pub


In [150]:
print('There are {} unique categories in {}.'.format(len(borough_venues['Venue Category'].unique()), borough_name))

There are 120 unique categories in Downtown Toronto.


The following snippet creates a one-hot encoded table with one column per venue category. After that, we group the rows by neighborhood and take the mean of each column, which gives the share of each category among the venues of a neighborhood.

In [162]:
pivot = pd.get_dummies(borough_venues[['Venue Category']], prefix="", prefix_sep="")
pivot['Neighborhood'] = borough_venues['Neighborhood'] 
fixed_columns = [pivot.columns[-1]] + list(pivot.columns[:-1])
pivot = pivot[fixed_columns]
pivot.describe(include='all')

Unnamed: 0,Yoga Studio,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Terminal,American Restaurant,Arts & Crafts Store,Asian Restaurant,Auto Workshop,...,Sushi Restaurant,Swim School,Tea Room,Thai Restaurant,Theater,Theme Restaurant,Trail,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wine Bar
count,348.0,348.0,348.0,348.0,348.0,348.0,348.0,348.0,348.0,348.0,...,348.0,348.0,348.0,348.0,348.0,348.0,348.0,348.0,348.0,348.0
unique,,,,,,,,,,,...,,,,,,,,,,
top,,,,,,,,,,,...,,,,,,,,,,
freq,,,,,,,,,,,...,,,,,,,,,,
mean,0.011494,0.002874,0.002874,0.002874,0.005747,0.002874,0.011494,0.005747,0.005747,0.002874,...,0.020115,0.002874,0.011494,0.008621,0.002874,0.002874,0.008621,0.011494,0.002874,0.002874
std,0.106747,0.053606,0.053606,0.053606,0.0757,0.053606,0.106747,0.0757,0.0757,0.053606,...,0.140596,0.053606,0.106747,0.09258,0.053606,0.053606,0.09258,0.106747,0.053606,0.053606
min,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
25%,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
50%,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
75%,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [163]:
pivot_neighbor = pivot.groupby('Neighborhood').mean().reset_index()
pivot_neighbor.head(7)

Unnamed: 0,Neighborhood,Yoga Studio,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Terminal,American Restaurant,Arts & Crafts Store,Asian Restaurant,...,Sushi Restaurant,Swim School,Tea Room,Thai Restaurant,Theater,Theme Restaurant,Trail,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wine Bar
0,"Adelaide, King, Richmond",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0
1,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.1,0.0,0.0
2,"Brockton, Exhibition Place, Parkdale Village",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Business Reply Mail Processing Centre 969 Eastern,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"CN Tower, Bathurst Quay, Island airport, Harbo...",0.0,0.1,0.1,0.1,0.2,0.1,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,"Cabbagetown, St. James Town",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Central Bay Street,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
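
To see why the mean of the one-hot columns is a frequency, here is a small sketch with two invented neighborhoods, `A` and `B`. The data are made up for the example.

```python
import pandas as pd

venues_toy = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Bakery', 'Bakery', 'Park', 'Pub', 'Park', 'Park'],
})

# One column per category, 1.0 where the venue belongs to it.
onehot = pd.get_dummies(venues_toy['Venue Category'], dtype=float)
onehot.insert(0, 'Neighborhood', venues_toy['Neighborhood'])

# The mean of a 0/1 column is the share of venues in that category.
freq = onehot.groupby('Neighborhood').mean()
print(freq)
```

Neighborhood `A` has two bakeries out of four venues, so its `Bakery` value is 0.5, and each row sums to 1.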


### Analyzing neighbors and categories

The following snippet identifies the 10 most common venue categories in each neighborhood.

In [165]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

In [169]:
num_top_venues = 10
indicators = ['st', 'nd', 'rd']
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # Positions beyond the third use the 'th' suffix.
        columns.append('{}th Most Common Venue'.format(ind+1))
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = pivot_neighbor['Neighborhood']
for ind in np.arange(pivot_neighbor.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(pivot_neighbor.iloc[ind, :], num_top_venues)
neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Adelaide, King, Richmond",Asian Restaurant,Greek Restaurant,Hotel,Speakeasy,Plaza,Steakhouse,Café,Restaurant,Concert Hall,Vegetarian / Vegan Restaurant
1,Berczy Park,French Restaurant,Farmers Market,Museum,Concert Hall,Liquor Store,Restaurant,Cocktail Bar,Park,Vegetarian / Vegan Restaurant,Thai Restaurant
2,"Brockton, Exhibition Place, Parkdale Village",Coffee Shop,Pet Store,Gym,Climbing Gym,Café,Breakfast Spot,Bar,Italian Restaurant,Furniture / Home Store,Food
3,Business Reply Mail Processing Centre 969 Eastern,Skate Park,Auto Workshop,Garden Center,Pizza Place,Burrito Place,Restaurant,Brewery,Farmers Market,Fast Food Restaurant,Comic Shop
4,"CN Tower, Bathurst Quay, Island airport, Harbo...",Airport Lounge,Bar,Airport,Airport Food Court,Airport Gate,Airport Terminal,Coffee Shop,Plane,Harbor / Marina,Wine Bar
5,"Cabbagetown, St. James Town",Café,Indian Restaurant,Italian Restaurant,Jewelry Store,Diner,Bakery,Restaurant,Japanese Restaurant,General Entertainment,Dance Studio
6,Central Bay Street,Coffee Shop,Modern European Restaurant,Park,Sushi Restaurant,Italian Restaurant,Japanese Restaurant,Gastropub,Cosmetics Shop,Convenience Store,Concert Hall
7,"Chinatown, Grange Park, Kensington Market",Café,Coffee Shop,Arts & Crafts Store,Organic Grocery,Bakery,Mexican Restaurant,Caribbean Restaurant,Vietnamese Restaurant,Dance Studio,Cuban Restaurant
8,Christie,Café,Grocery Store,Coffee Shop,Italian Restaurant,Restaurant,Candy Store,Diner,Creperie,Cuban Restaurant,Distribution Center
9,Church and Wellesley,Bookstore,Gastropub,Park,Theme Restaurant,Bubble Tea Shop,Restaurant,Mexican Restaurant,Breakfast Spot,Ramen Restaurant,Dance Studio
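
The suffix logic above is enough for ten columns, but it would produce labels such as "21th" for longer lists. A small sketch of a general ordinal helper follows. `ordinal` is a hypothetical function, not part of the notebook's code.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # covers 11th, 12th and 13th
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

columns = ['Neighborhood'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, 11)
]
print(columns)
```

For `num_top_venues = 10` this yields the same column names as the loop above.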


### Clustering the neighborhoods

The following snippet clusters the neighborhoods with k-means (k = 5) and then adds the cluster labels to the dataframe.

In [180]:
k = 5
neighborhood_grouped_clustering = pivot_neighbor.drop(columns='Neighborhood')
kmeans = KMeans(n_clusters = k, random_state = 0)
kmeans.fit(neighborhood_grouped_clustering)
kmeans.labels_[0:10]

array([4, 3, 0, 3, 0, 4, 0, 4, 4, 3], dtype=int32)

In [185]:
# Work on a copy so that adding a column does not touch a slice of `df`.
clusters = borough.copy()
# Note: kmeans.labels_ follows the row order of pivot_neighbor (sorted by
# neighborhood name), so this positional assignment only matches the right
# neighborhoods if `clusters` is in that same order.
clusters['Cluster Labels'] = kmeans.labels_
clusters = clusters.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')
clusters


Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,M5A,Downtown Toronto,Harbourfront,43.65426,-79.360636,4,Breakfast Spot,Pub,Coffee Shop,Spa,Restaurant,Historic Site,Bakery,Distribution Center,Park,Cuban Restaurant
4,M7A,Downtown Toronto,Queen's Park,43.662301,-79.389494,3,Coffee Shop,Sushi Restaurant,Portuguese Restaurant,Creperie,Italian Restaurant,Park,Distribution Center,Yoga Studio,Beer Bar,Flea Market
9,M5B,Downtown Toronto,"Ryerson, Garden District",43.657162,-79.378937,0,Burrito Place,Clothing Store,Music Venue,Comic Shop,Theater,Tea Room,Pizza Place,Café,Plaza,Ramen Restaurant
15,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418,3,Coffee Shop,Gym,Italian Restaurant,Restaurant,BBQ Joint,Middle Eastern Restaurant,Japanese Restaurant,Cosmetics Shop,Food Truck,Gastropub
19,M4E,East Toronto,The Beaches,43.676357,-79.293031,0,Pub,Park,Trail,Health Food Store,Creperie,Department Store,Deli / Bodega,Dance Studio,Cuban Restaurant,Wine Bar
20,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306,4,French Restaurant,Farmers Market,Museum,Concert Hall,Liquor Store,Restaurant,Cocktail Bar,Park,Vegetarian / Vegan Restaurant,Thai Restaurant
24,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383,0,Coffee Shop,Modern European Restaurant,Park,Sushi Restaurant,Italian Restaurant,Japanese Restaurant,Gastropub,Cosmetics Shop,Convenience Store,Concert Hall
25,M6G,Downtown Toronto,Christie,43.669542,-79.422564,4,Café,Grocery Store,Coffee Shop,Italian Restaurant,Restaurant,Candy Store,Diner,Creperie,Cuban Restaurant,Distribution Center
30,M5H,Downtown Toronto,"Adelaide, King, Richmond",43.650571,-79.384568,4,Asian Restaurant,Greek Restaurant,Hotel,Speakeasy,Plaza,Steakhouse,Café,Restaurant,Concert Hall,Vegetarian / Vegan Restaurant
31,M6H,West Toronto,"Dovercourt Village, Dufferin",43.669005,-79.442259,3,Bakery,Brewery,Music Venue,Café,Grocery Store,Middle Eastern Restaurant,Gym / Fitness Center,Bar,Bank,Diner
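
Because `groupby` sorts the neighborhoods alphabetically while `borough` keeps its postal code order, it is safer to attach the labels by neighborhood name rather than by position. Here is a minimal sketch of that idea, with three neighborhoods and invented labels. All the `*_toy` names are made up.

```python
import pandas as pd

# Names in the order of the grouped table (alphabetical) and one label per row.
grouped_names = pd.Series(['Berczy Park', 'Christie', 'Harbourfront'],
                          name='Neighborhood')
labels = [4, 0, 3]  # made-up cluster labels, for illustration only

# The original frame is in postal code order, which differs from the above.
borough_toy = pd.DataFrame({
    'Postal Code': ['M5A', 'M5E', 'M6G'],
    'Neighborhood': ['Harbourfront', 'Berczy Park', 'Christie'],
})

# Index the labels by neighborhood name, then join on that name.
label_by_name = pd.Series(labels, index=grouped_names, name='Cluster Labels')
clusters_toy = borough_toy.join(label_by_name, on='Neighborhood')
print(clusters_toy)
```

With the join, `Harbourfront` receives the label computed for `Harbourfront`, whatever the row order of the two frames.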


### Plotting the clusters on the map

Now, each of the five clusters is plotted on the map, with every neighborhood colored according to its cluster.

In [194]:
map_clusters = folium.Map(location = [borough_lat, borough_lng], zoom_start=11)


x = np.arange(k)
ys = [i+x+(i*x)**2 for i in range(k)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]


markers_colors = []
for lat, lon, poi, cluster in zip(clusters['Latitude'], clusters['Longitude'], clusters['Neighborhood'], clusters['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
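
The color list only needs one distinct color per cluster. As an alternative sketch, the standard library's `colorsys` module can generate evenly spaced hues. This replaces the matplotlib `rainbow` colormap used above with a different technique, and `cluster_colors` is a hypothetical helper.

```python
import colorsys

def cluster_colors(k):
    """Return k evenly spaced hex colors, one per cluster label 0..k-1."""
    colors_hex = []
    for i in range(k):
        # Spread the hue over the color wheel; saturation and value are fixed.
        r, g, b = colorsys.hsv_to_rgb(i / k, 0.8, 0.9)
        colors_hex.append('#{:02x}{:02x}{:02x}'.format(
            round(r * 255), round(g * 255), round(b * 255)))
    return colors_hex

palette = cluster_colors(5)
print(palette)
```

Indexing with `palette[cluster]` then gives each cluster label its own color.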

## The five neighborhood clusters

In [204]:
clusters[(clusters['Cluster Labels'] == 0)]

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
9,M5B,Downtown Toronto,"Ryerson, Garden District",43.657162,-79.378937,0,Burrito Place,Clothing Store,Music Venue,Comic Shop,Theater,Tea Room,Pizza Place,Café,Plaza,Ramen Restaurant
19,M4E,East Toronto,The Beaches,43.676357,-79.293031,0,Pub,Park,Trail,Health Food Store,Creperie,Department Store,Deli / Bodega,Dance Studio,Cuban Restaurant,Wine Bar
24,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383,0,Coffee Shop,Modern European Restaurant,Park,Sushi Restaurant,Italian Restaurant,Japanese Restaurant,Gastropub,Cosmetics Shop,Convenience Store,Concert Hall
42,M5K,Downtown Toronto,"Design Exchange, Toronto Dominion Centre",43.647177,-79.381576,0,Coffee Shop,Café,Tea Room,Beer Bar,Restaurant,Hotel,Pub,Gym,Cuban Restaurant,Diner
43,M6K,West Toronto,"Brockton, Exhibition Place, Parkdale Village",43.636847,-79.428191,0,Coffee Shop,Pet Store,Gym,Climbing Gym,Café,Breakfast Spot,Bar,Italian Restaurant,Furniture / Home Store,Food
80,M5S,Downtown Toronto,"Harbord, University of Toronto",43.662696,-79.400049,0,Yoga Studio,Bakery,Dessert Shop,Restaurant,Japanese Restaurant,Italian Restaurant,Beer Bar,Bar,Bookstore,College Gym
84,M5T,Downtown Toronto,"Chinatown, Grange Park, Kensington Market",43.653206,-79.400049,0,Café,Coffee Shop,Arts & Crafts Store,Organic Grocery,Bakery,Mexican Restaurant,Caribbean Restaurant,Vietnamese Restaurant,Dance Studio,Cuban Restaurant
92,M5W,Downtown Toronto,Stn A PO Boxes 25 The Esplanade,43.646435,-79.374846,0,Cocktail Bar,Vegetarian / Vegan Restaurant,French Restaurant,Fountain,Park,Thai Restaurant,Museum,Restaurant,Café,Cosmetics Shop


In [205]:
clusters[(clusters['Cluster Labels'] == 1)]

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
54,M4M,East Toronto,Studio District,43.659526,-79.340923,1,Coffee Shop,Gay Bar,Bookstore,Fish Market,Ice Cream Shop,Café,Sandwich Place,Pet Store,Arts & Crafts Store,Concert Hall
69,M6P,West Toronto,"High Park, The Junction South",43.661608,-79.464763,1,Speakeasy,Gastropub,Park,Furniture / Home Store,Thai Restaurant,Flea Market,Arts & Crafts Store,Italian Restaurant,Bar,Mexican Restaurant
74,M5R,Central Toronto,"The Annex, North Midtown, Yorkville",43.67271,-79.405678,1,Café,Indian Restaurant,American Restaurant,Burger Joint,BBQ Joint,Middle Eastern Restaurant,Coffee Shop,Park,Vegetarian / Vegan Restaurant,Cosmetics Shop
81,M6S,West Toronto,"Runnymede, Swansea",43.651571,-79.48445,1,Italian Restaurant,Sushi Restaurant,Café,Pub,Burrito Place,Bookstore,Fish & Chips Shop,Coffee Shop,Food,Tea Room
97,M5X,Downtown Toronto,"First Canadian Place, Underground city",43.648429,-79.38228,1,Restaurant,Café,Coffee Shop,American Restaurant,Steakhouse,Gym / Fitness Center,Gym,Pizza Place,Cosmetics Shop,Creperie


In [206]:
clusters[(clusters['Cluster Labels'] == 2)]

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
83,M4T,Central Toronto,"Moore Park, Summerhill East",43.689574,-79.38316,2,Playground,Restaurant,Park,Wine Bar,Diner,College Gym,Comic Shop,Concert Hall,Convenience Store,Cosmetics Shop


In [207]:
clusters[(clusters['Cluster Labels'] == 3)]

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,M7A,Downtown Toronto,Queen's Park,43.662301,-79.389494,3,Coffee Shop,Sushi Restaurant,Portuguese Restaurant,Creperie,Italian Restaurant,Park,Distribution Center,Yoga Studio,Beer Bar,Flea Market
15,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418,3,Coffee Shop,Gym,Italian Restaurant,Restaurant,BBQ Joint,Middle Eastern Restaurant,Japanese Restaurant,Cosmetics Shop,Food Truck,Gastropub
31,M6H,West Toronto,"Dovercourt Village, Dufferin",43.669005,-79.442259,3,Bakery,Brewery,Music Venue,Café,Grocery Store,Middle Eastern Restaurant,Gym / Fitness Center,Bar,Bank,Diner
41,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188,3,Greek Restaurant,Ice Cream Shop,Yoga Studio,Cosmetics Shop,Pub,Brewery,Italian Restaurant,Food,Flea Market,Concert Hall
61,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879,3,Park,Swim School,Bus Line,Distribution Center,College Gym,Comic Shop,Concert Hall,Convenience Store,Cosmetics Shop,Creperie
62,M5N,Central Toronto,Roselawn,43.711695,-79.416936,3,Garden,Wine Bar,Dog Run,Comic Shop,Concert Hall,Convenience Store,Cosmetics Shop,Creperie,Cuban Restaurant,Dance Studio
67,M4P,Central Toronto,Davisville North,43.712751,-79.390197,3,Breakfast Spot,Sandwich Place,Park,Convenience Store,Food & Drink Shop,Gym,Department Store,Hotel,Deli / Bodega,Dessert Shop
68,M5P,Central Toronto,"Forest Hill North, Forest Hill West",43.696948,-79.411307,3,Sushi Restaurant,Park,Trail,Jewelry Store,Wine Bar,Department Store,Deli / Bodega,Dance Studio,Cuban Restaurant,Creperie
73,M4R,Central Toronto,North Toronto West,43.715383,-79.405678,3,Yoga Studio,Spa,Clothing Store,Chinese Restaurant,Mexican Restaurant,Restaurant,Salon / Barbershop,Dessert Shop,Coffee Shop,Diner
75,M6R,West Toronto,"Parkdale, Roncesvalles",43.64896,-79.456325,3,Gift Shop,Italian Restaurant,Coffee Shop,Dog Run,Eastern European Restaurant,Movie Theater,Dessert Shop,Cuban Restaurant,Restaurant,Dance Studio


In [208]:
clusters[(clusters['Cluster Labels'] == 4)]

Unnamed: 0,Postal Code,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,M5A,Downtown Toronto,Harbourfront,43.65426,-79.360636,4,Breakfast Spot,Pub,Coffee Shop,Spa,Restaurant,Historic Site,Bakery,Distribution Center,Park,Cuban Restaurant
20,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306,4,French Restaurant,Farmers Market,Museum,Concert Hall,Liquor Store,Restaurant,Cocktail Bar,Park,Vegetarian / Vegan Restaurant,Thai Restaurant
25,M6G,Downtown Toronto,Christie,43.669542,-79.422564,4,Café,Grocery Store,Coffee Shop,Italian Restaurant,Restaurant,Candy Store,Diner,Creperie,Cuban Restaurant,Distribution Center
30,M5H,Downtown Toronto,"Adelaide, King, Richmond",43.650571,-79.384568,4,Asian Restaurant,Greek Restaurant,Hotel,Speakeasy,Plaza,Steakhouse,Café,Restaurant,Concert Hall,Vegetarian / Vegan Restaurant
36,M5J,Downtown Toronto,"Harbourfront East, Toronto Islands, Union Station",43.640816,-79.381752,4,Performing Arts Venue,Supermarket,Deli / Bodega,Skating Rink,Salad Place,Sporting Goods Shop,Lake,Park,Dessert Shop,Convenience Store
37,M6J,West Toronto,"Little Portugal, Trinity",43.647927,-79.41975,4,Wine Bar,Asian Restaurant,Cocktail Bar,Pizza Place,Cuban Restaurant,Korean Restaurant,Brewery,Ice Cream Shop,New American Restaurant,Greek Restaurant
47,M4L,East Toronto,"The Beaches West, India Bazaar",43.668999,-79.315572,4,Ice Cream Shop,Sushi Restaurant,Liquor Store,Pub,Brewery,Italian Restaurant,Fast Food Restaurant,Gym,Park,Fish & Chips Shop
48,M5L,Downtown Toronto,"Commerce Court, Victoria Hotel",43.648198,-79.379817,4,Café,Coffee Shop,Museum,Restaurant,Pub,Tea Room,American Restaurant,Gym,Gym / Fitness Center,Deli / Bodega
86,M4V,Central Toronto,"Deer Park, Forest Hill SE, Rathnelly, South Hi...",43.686412,-79.400049,4,Coffee Shop,Pub,American Restaurant,Restaurant,Sports Bar,Supermarket,Liquor Store,Sushi Restaurant,Cosmetics Shop,Dessert Shop
91,M4W,Downtown Toronto,Rosedale,43.679563,-79.377529,4,Park,Trail,Playground,Wine Bar,Diner,College Gym,Comic Shop,Concert Hall,Convenience Store,Cosmetics Shop


## Thank you! :)