# Final Assignment of the Applied Data Science Capstone
Segmenting and Clustering Neighborhoods in Toronto

## Scrape the Toronto Neighbourhoods
1. Use Beautiful Soup to scrape the Wikipedia page: https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M
2. Load the scraped data into a pandas DataFrame

Install the Beautiful Soup library to scrape the Wikipedia page

In [63]:

! pip3 install beautifulsoup4



Import all the necessary libraries for the first task

In [64]:
import pandas as pd
import requests
import numpy as np
from bs4 import BeautifulSoup

Get the Wikipedia page HTML and load it into Beautiful Soup with html.parser

In [65]:
html_data = requests.get(url='https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M')
soup_scraper = BeautifulSoup(html_data.text, 'html.parser')
soup_scraper.title


<title>List of postal codes of Canada: M - Wikipedia</title>

The Wikipedia page loaded successfully. Now we can create the pandas data frame with the required columns and fill it with the table rows from the HTML

In [66]:
rows = []

for row in soup_scraper.find('div', id='mw-content-text').find('table').find('tbody').find_all('tr'):
    col = row.find_all('td')
    if len(col) > 0:
        # collect each table row as a dict (DataFrame.append was removed in pandas 2.0)
        rows.append({'PostalCode': col[0].text, 'Borough': col[1].text, 'Neighbourhood': col[2].text})

# build the data frame once, from all collected rows
toronto_neighbourhoods = pd.DataFrame(rows, columns=['PostalCode', 'Borough', 'Neighbourhood'])
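The row-by-row extraction can be tried offline on a small HTML snippet. The sketch below is only an illustration: `SAMPLE_HTML` and `parse_postal_table` are made-up names, the snippet is a stand-in for the real page, and stripping whitespace with `get_text(strip=True)` is one possible choice rather than what the notebook does.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical stand-in for the Wikipedia table; the real page layout may differ.
SAMPLE_HTML = """
<table>
  <tbody>
    <tr><th>Postal Code</th><th>Borough</th><th>Neighbourhood</th></tr>
    <tr><td>M1A
</td><td>Not assigned
</td><td>Not assigned
</td></tr>
    <tr><td>M3A
</td><td>North York
</td><td>Parkwoods
</td></tr>
  </tbody>
</table>
"""


def parse_postal_table(html):
    """Return a DataFrame with one row per <tr> that holds at least three <td> cells."""
    soup = BeautifulSoup(html, 'html.parser')
    records = []
    for row in soup.find('table').find_all('tr'):
        cells = row.find_all('td')
        if len(cells) >= 3:  # header rows use <th>, so they are skipped
            records.append({
                'PostalCode': cells[0].get_text(strip=True),
                'Borough': cells[1].get_text(strip=True),
                'Neighbourhood': cells[2].get_text(strip=True),
            })
    return pd.DataFrame(records, columns=['PostalCode', 'Borough', 'Neighbourhood'])


sample_df = parse_postal_table(SAMPLE_HTML)
print(sample_df)
```

Stripping the cell text while parsing would make a separate newline clean-up step unnecessary.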
    


We have acquired our dataset. Now we print the first 5 rows to check the data quality

In [67]:
toronto_neighbourhoods.head()


Unnamed: 0,PostalCode,Borough,Neighbourhood
0,M1A\n,Not assigned\n,Not assigned\n
1,M2A\n,Not assigned\n,Not assigned\n
2,M3A\n,North York\n,Parkwoods\n
3,M4A\n,North York\n,Victoria Village\n
4,M5A\n,Downtown Toronto\n,"Regent Park, Harbourfront\n"


We have to remove the '\n' characters from the dataset

In [68]:
# match a literal newline; regex=False keeps the behaviour the same across pandas versions
toronto_neighbourhoods['PostalCode'] = toronto_neighbourhoods['PostalCode'].str.replace('\n', '', regex=False)
toronto_neighbourhoods['Borough'] = toronto_neighbourhoods['Borough'].str.replace('\n', '', regex=False)
toronto_neighbourhoods['Neighbourhood'] = toronto_neighbourhoods['Neighbourhood'].str.replace('\n', '', regex=False)
toronto_neighbourhoods.head()


Unnamed: 0,PostalCode,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"


Now we should have clean data, but we can see there are unassigned boroughs and neighbourhoods. We drop rows without a borough and, where only the neighbourhood is missing, use the borough name instead

In [69]:
toronto_neighbourhoods.replace('Not assigned', np.nan, inplace=True)
toronto_neighbourhoods.dropna(subset=['Borough'], axis=0, inplace=True)
toronto_neighbourhoods['Neighbourhood'] = toronto_neighbourhoods['Neighbourhood'].fillna(toronto_neighbourhoods['Borough'])
toronto_neighbourhoods.isnull().value_counts()


PostalCode  Borough  Neighbourhood
False       False    False            103
dtype: int64
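The three cleaning steps can be checked on a toy frame that covers each case: an unassigned borough, a fully assigned row, and a row where only the neighbourhood is missing. This is a minimal sketch with made-up rows (`toy` and `cleaned` are illustrative names), not the notebook's data.

```python
import numpy as np
import pandas as pd

# Toy rows (made up for illustration) covering the three cases.
toy = pd.DataFrame({
    'PostalCode': ['M1A', 'M3A', 'M9A'],
    'Borough': ['Not assigned', 'North York', 'Etobicoke'],
    'Neighbourhood': ['Not assigned', 'Parkwoods', 'Not assigned'],
})

# 1. mark the placeholder as missing, 2. drop rows without a borough
cleaned = toy.replace('Not assigned', np.nan)
cleaned = cleaned.dropna(subset=['Borough']).copy()

# 3. a neighbourhood that is still missing takes the name of its borough
cleaned['Neighbourhood'] = cleaned['Neighbourhood'].fillna(cleaned['Borough'])
cleaned = cleaned.reset_index(drop=True)
print(cleaned)
```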

Now let's see how many rows we have left.

In [70]:
toronto_neighbourhoods.shape


(103, 3)

## Add Geolocation data
As the geocoder API does not work reliably, load the latitude and longitude from the CSV file provided with the assignment.

In [80]:
toronto_postal_code_geo = pd.read_csv('https://cocl.us/Geospatial_data')
toronto_postal_code_geo.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


With the geo data loaded, rename the postal code column so that both data frames use the same name, then check the shape to see whether we can merge them.

In [82]:
toronto_postal_code_geo.rename(columns={'Postal Code': 'PostalCode'}, inplace=True)
toronto_postal_code_geo.shape


(103, 3)

As the shape is the same as that of our original data frame, merge the two data frames on the PostalCode column. Equal row counts suggest, but do not guarantee, that every postal code has a match.

In [88]:
toronto_neighbourhoods_geo = pd.merge(toronto_neighbourhoods, toronto_postal_code_geo, on='PostalCode')
toronto_neighbourhoods_geo.head()


Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494


## Data Visualisation on the Map
Explore and cluster the neighbourhoods in Toronto. The assignment suggests working only with boroughs that contain the word Toronto; the cells below keep every borough and replicate the analysis we did on the New York City data.

We do not need the postal code anymore, so we can remove it

In [89]:
toronto_neighbourhoods_geo.drop('PostalCode', axis = 1, inplace=True)

Use the geopy library to get the latitude and longitude of Toronto

In [91]:
! pip3 install geopy

Collecting geopy
  Downloading geopy-2.1.0-py3-none-any.whl (112 kB)
Collecting geographiclib<2,>=1.49
  Downloading geographiclib-1.50-py3-none-any.whl (38 kB)
Installing collected packages: geographiclib, geopy
Successfully installed geographiclib-1.50 geopy-2.1.0


In [92]:
from geopy.geocoders import Nominatim

In [93]:
address = 'Toronto, Ontario'

geolocator = Nominatim(user_agent="to_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(
    latitude, longitude))


The geographical coordinates of Toronto are 43.6534817, -79.3839347.


Create a map of Toronto with the neighbourhoods marked

In [94]:
! pip3 install folium

Collecting folium
  Downloading folium-0.12.1-py2.py3-none-any.whl (94 kB)
Collecting branca>=0.3.0
  Downloading branca-0.4.2-py3-none-any.whl (24 kB)
Installing collected packages: branca, folium
Successfully installed branca-0.4.2 folium-0.12.1


In [95]:
import folium

In [98]:
# create map of Toronto using latitude and longitude values
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighbourhood in zip(toronto_neighbourhoods_geo['Latitude'], toronto_neighbourhoods_geo['Longitude'],
 toronto_neighbourhoods_geo['Borough'], toronto_neighbourhoods_geo['Neighbourhood']):
    label = '{}, {}'.format(neighbourhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)

map_toronto


## Foursquare Data, Get Venues
After the visualisation, let's cluster the data. We'll use the Foursquare API to get the venues around each neighbourhood.

Foursquare API config

In [100]:
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID'  # your Foursquare ID (do not publish real credentials)
# your Foursquare Secret
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET'
VERSION = '20180605'  # Foursquare API version
LIMIT = 100  # A default Foursquare API limit value


Let's create a function to get the venues information from neighbourhoods

In [105]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):

    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):

        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lng,
            radius,
            LIMIT)

        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']

        # return only relevant information for each nearby venue
        venues_list.append([(
            name,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame(
        [item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighbourhood',
                             'Neighbourhood Latitude',
                             'Neighbourhood Longitude',
                             'Venue',
                             'Venue Latitude',
                             'Venue Longitude',
                             'Venue Category']

    return nearby_venues
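The chained indexing in the request line fails with a `KeyError` or `IndexError` if a response has no groups or a venue has no category. A more defensive flattening step could look like the sketch below. `extract_venues` and `sample_payload` are hypothetical names, the payload only mimics the nesting used in the function above, and the second venue is made up to show the empty-category case.

```python
def extract_venues(payload, name, lat, lng):
    """Flatten one explore-style response into tuples, tolerating missing pieces."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item.get('venue', {})
        location = venue.get('location', {})
        categories = venue.get('categories') or []
        # use None when a venue has no category instead of raising IndexError
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng, venue.get('name'),
                     location.get('lat'), location.get('lng'), category))
    return rows


# Hand-written payload that mimics the nesting; not a real API response.
sample_payload = {
    'response': {'groups': [{'items': [
        {'venue': {'name': 'Brookbanks Park',
                   'location': {'lat': 43.751976, 'lng': -79.33214},
                   'categories': [{'name': 'Park'}]}},
        {'venue': {'name': 'Unnamed Venue',
                   'location': {'lat': 43.75, 'lng': -79.33},
                   'categories': []}},
    ]}]}
}

rows = extract_venues(sample_payload, 'Parkwoods', 43.753259, -79.329656)
empty = extract_venues({'response': {}}, 'Upper Rouge', 43.836125, -79.205636)
print(rows)
print(empty)
```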


Let's use the function and get the information we need

In [106]:
toronto_venues = getNearbyVenues(names=toronto_neighbourhoods_geo['Neighbourhood'],
                                 latitudes=toronto_neighbourhoods_geo['Latitude'],
                                 longitudes=toronto_neighbourhoods_geo['Longitude']
                                 )
toronto_venues.head()

Unnamed: 0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Parkwoods,43.753259,-79.329656,Brookbanks Park,43.751976,-79.33214,Park
1,Parkwoods,43.753259,-79.329656,Towns On The Ravine,43.754754,-79.332552,Hotel
2,Parkwoods,43.753259,-79.329656,Variety Store,43.751974,-79.333114,Food & Drink Shop
3,Parkwoods,43.753259,-79.329656,Corrosion Service Company Limited,43.752432,-79.334661,Construction & Landscaping
4,Victoria Village,43.725882,-79.315572,Victoria Village Arena,43.723481,-79.315635,Hockey Arena


How many venues do we have for each neighbourhood?

In [107]:
toronto_venues.groupby('Neighbourhood').count()

Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Agincourt,5,5,5,5,5,5
"Alderwood, Long Branch",8,8,8,8,8,8
"Bathurst Manor, Wilson Heights, Downsview North",21,21,21,21,21,21
Bayview Village,4,4,4,4,4,4
"Bedford Park, Lawrence Manor East",26,26,26,26,26,26
...,...,...,...,...,...,...
"Willowdale, Willowdale East",34,34,34,34,34,34
"Willowdale, Willowdale West",4,4,4,4,4,4
Woburn,3,3,3,3,3,3
Woodbine Heights,5,5,5,5,5,5


How many unique categories for venues are in the data?

In [109]:
print('There are {} unique categories.'.format(
    len(toronto_venues['Venue Category'].unique())))


There are 270 unique categories.


## Cluster Neighbourhoods
Group the data by neighbourhood, taking the mean of the frequency of occurrence of each category

In [113]:
# one hot encoding
toronto_onehot = pd.get_dummies(
    toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
toronto_onehot['Neighbourhood'] = toronto_venues['Neighbourhood']

# move neighborhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + \
    list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_grouped = toronto_onehot.groupby(
    'Neighbourhood').mean().reset_index()
toronto_grouped.head()


Unnamed: 0,Neighbourhood,Accessories Store,Airport,Airport Food Court,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Aquarium,...,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,Agincourt,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,"Alderwood, Long Branch",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Bathurst Manor, Wilson Heights, Downsview North",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Bayview Village,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"Bedford Park, Lawrence Manor East",0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0
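Why the mean of one-hot columns is a frequency: each venue contributes a 1 in exactly one category column, so the column mean within a neighbourhood is that category's share of the neighbourhood's venues. A small sketch with made-up venues (`toy_venues`, `onehot` and `freq` are illustrative names):

```python
import pandas as pd

# Made-up venues: neighbourhood A has four venues, B has one.
toy_venues = pd.DataFrame({
    'Neighbourhood': ['A', 'A', 'A', 'A', 'B'],
    'Venue Category': ['Park', 'Park', 'Hotel', 'Pub', 'Park'],
})

# one column per category, 1.0 where the venue belongs to it
onehot = pd.get_dummies(toy_venues['Venue Category'], dtype=float)
onehot.insert(0, 'Neighbourhood', toy_venues['Neighbourhood'])

# the mean per neighbourhood is the share of venues in each category
freq = onehot.groupby('Neighbourhood').mean()
print(freq)
```

Every row of `freq` sums to 1, which is also why the 0.038462 in the table above equals 1/26 for a neighbourhood with 26 venues.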


Run k-means to cluster the neighbourhoods into 5 clusters

In [116]:
! pip3 install scikit-learn


In [117]:
from sklearn.cluster import KMeans

In [118]:
# set number of clusters
kclusters = 5

toronto_grouped_clustering = toronto_grouped.drop(columns='Neighbourhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(
    toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]


array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1], dtype=int32)
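The first ten labels are all 1, which hints that one cluster may hold most neighbourhoods. Counting the members of each cluster is a quick way to check. The sketch below uses made-up two-column frequency rows rather than the notebook's data, and sets `n_init=10` explicitly as an assumption so that the result does not depend on the scikit-learn version's default.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up frequency rows: three "coffee-heavy" and three "park-heavy" neighbourhoods.
X = np.array([
    [0.9, 0.1], [0.8, 0.2], [1.0, 0.0],
    [0.1, 0.9], [0.0, 1.0], [0.2, 0.8],
])

model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
labels = model.labels_

# number of rows assigned to each cluster label
sizes = np.bincount(labels, minlength=2)
print(labels, sizes)
```

On the real data, `np.bincount(kmeans.labels_)` would show how the neighbourhoods are spread over the 5 clusters.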

Let's first get the top 10 venues for each neighbourhood

In [129]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)

    return row_categories_sorted.index.values[0:num_top_venues]


num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighbourhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # no special suffix beyond 'rd', so fall back to 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
toronto_venues_sorted = pd.DataFrame(columns=columns)
toronto_venues_sorted['Neighbourhood'] = toronto_grouped['Neighbourhood']

for ind in np.arange(toronto_grouped.shape[0]):
    toronto_venues_sorted.iloc[ind, 1:] = return_most_common_venues(
        toronto_grouped.iloc[ind, :], num_top_venues)

toronto_venues_sorted.head()


Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Agincourt,Latin American Restaurant,Clothing Store,Breakfast Spot,Lounge,Skating Rink,Motel,Moroccan Restaurant,Monument / Landmark,Molecular Gastronomy Restaurant,Modern European Restaurant
1,"Alderwood, Long Branch",Pizza Place,Pharmacy,Gym,Coffee Shop,Pool,Pub,Sandwich Place,Mexican Restaurant,Metro Station,Middle Eastern Restaurant
2,"Bathurst Manor, Wilson Heights, Downsview North",Bank,Coffee Shop,Diner,Middle Eastern Restaurant,Park,Mobile Phone Shop,Shopping Mall,Fried Chicken Joint,Supermarket,Sandwich Place
3,Bayview Village,Japanese Restaurant,Bank,Chinese Restaurant,Café,Nail Salon,Museum,Movie Theater,Motel,Moroccan Restaurant,Monument / Landmark
4,"Bedford Park, Lawrence Manor East",Coffee Shop,Sandwich Place,Italian Restaurant,Hobby Shop,Thai Restaurant,Pharmacy,Pizza Place,Comfort Food Restaurant,Pub,Restaurant
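Agincourt has only 5 venues, yet 10 "most common" venues are listed for it, so the tail of the list consists of categories with a frequency of zero. If that padding is unwanted, one option is to keep only the non-zero categories. This is a hedged sketch: `top_categories` is a hypothetical helper and the frequencies are made up.

```python
import pandas as pd


def top_categories(freq_row, n):
    """Return up to n category names with non-zero frequency, most frequent first."""
    nonzero = freq_row[freq_row > 0]
    return nonzero.sort_values(ascending=False).index.tolist()[:n]


# Made-up frequencies for one neighbourhood; two categories never occur.
row = pd.Series({'Hotel': 0.3, 'Motel': 0.0, 'Park': 0.5, 'Museum': 0.0, 'Pub': 0.2})
print(top_categories(row, 10))
```

The result can be shorter than `n`, so filling a fixed-width table would need padding with an explicit placeholder such as `None`.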


Now let's combine the top 10 venues and the clusters

In [130]:
# add clustering labels
toronto_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
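# work on a copy so that later in-place edits do not change toronto_neighbourhoods_geo
toronto_merged = toronto_neighbourhoods_geo.copy()

# join toronto_venues_sorted onto the neighbourhood data so each neighbourhood has coordinates, a cluster label and its top venues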
toronto_merged = toronto_merged.join(
    toronto_venues_sorted.set_index('Neighbourhood'), on='Neighbourhood')

toronto_merged.head()  # check the last columns!


Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,North York,Parkwoods,43.753259,-79.329656,1.0,Hotel,Construction & Landscaping,Park,Food & Drink Shop,Miscellaneous Shop,Motel,Moroccan Restaurant,Monument / Landmark,Molecular Gastronomy Restaurant,Modern European Restaurant
1,North York,Victoria Village,43.725882,-79.315572,1.0,Pizza Place,Hockey Arena,Portuguese Restaurant,French Restaurant,Coffee Shop,Accessories Store,Modern European Restaurant,Motel,Moroccan Restaurant,Monument / Landmark
2,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,1.0,Coffee Shop,Bakery,Park,Breakfast Spot,Theater,Café,Pub,Dessert Shop,Brewery,Spa
3,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763,1.0,Furniture / Home Store,Clothing Store,Accessories Store,Vietnamese Restaurant,Miscellaneous Shop,Event Space,Coffee Shop,Boutique,Women's Store,Other Great Outdoors
4,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,1.0,Coffee Shop,Sushi Restaurant,Diner,College Cafeteria,Yoga Studio,Theater,Mexican Restaurant,Café,Fried Chicken Joint,Sandwich Place


We can see that the Cluster Labels are floats, but we need them as ints. Let's check which values they take.

In [138]:
# Check which distinct values the cluster labels take
print('There are {} unique cluster labels.'.format(
    len(toronto_merged['Cluster Labels'].unique())))
toronto_merged['Cluster Labels'].unique()


There are 6 unique cluster labels.


array([ 1., nan,  2.,  4.,  0.,  3.])

We have NaN among our cluster labels. Let's see which neighbourhoods have it

In [148]:
toronto_merged_nulls = toronto_merged[toronto_merged.isnull().any(axis=1)]
toronto_merged_nulls

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
5,Etobicoke,"Islington Avenue, Humber Valley Village",43.667856,-79.532242,,,,,,,,,,,
45,North York,"York Mills, Silver Hills",43.75749,-79.374714,,,,,,,,,,,
52,North York,"Willowdale, Newtonbrook",43.789053,-79.408493,,,,,,,,,,,
95,Scarborough,Upper Rouge,43.836125,-79.205636,,,,,,,,,,,


OK, so these neighbourhoods have no cluster label because Foursquare returned no venues for them. For now, let's drop them.

In [152]:
toronto_merged.dropna(subset=['Cluster Labels'], axis=0, inplace=True)
toronto_merged[toronto_merged.isnull().any(axis=1)]


Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue


The last thing we need is to convert cluster labels to ints from floats.

In [154]:
toronto_merged['Cluster Labels'] = toronto_merged['Cluster Labels'].astype(int)
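The labels turned into floats because the join left NaN for neighbourhoods without venues, and a plain integer column cannot hold NaN. Dropping those rows first, as done above, is one route. Another, sketched here on a made-up series, is pandas' nullable `Int64` dtype, which keeps the gaps as `<NA>`.

```python
import numpy as np
import pandas as pd

# Made-up labels with one missing value, as produced by a left join.
labels = pd.Series([1.0, np.nan, 2.0, 0.0], name='Cluster Labels')

# option 1: nullable integers keep the missing entry as <NA>
nullable = labels.astype('Int64')

# option 2: drop the missing entry, then cast to a plain int
dropped = labels.dropna().astype(int)

print(nullable)
print(dropped)
```

Rows with `<NA>` labels would still need to be skipped when indexing a colour list by cluster label.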


OK, now we should be able to visualise the clusters

In [123]:
! pip3 install matplotlib

Collecting matplotlib
  Downloading matplotlib-3.3.4-cp38-cp38-macosx_10_9_x86_64.whl (8.5 MB)
Collecting pillow>=6.2.0
  Downloading Pillow-8.1.1-cp38-cp38-macosx_10_10_x86_64.whl (2.2 MB)
Collecting cycler>=0.10
  Downloading cycler-0.10.0-py2.py3-none-any.whl (6.5 kB)
Collecting pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.3
  Using cached pyparsing-2.4.7-py2.py3-none-any.whl (67 kB)
Collecting kiwisolver>=1.0.1
  Downloading kiwisolver-1.3.1-cp38-cp38-macosx_10_9_x86_64.whl (61 kB)
Installing collected packages: pillow, cycler, pyparsing, kiwisolver, matplotlib
Successfully installed cycler-0.10.0 kiwisolver-1.3.1 matplotlib-3.3.4 pillow-8.1.1 pyparsing-2.4.7


In [124]:
# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors


In [155]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighbourhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' +
                         str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)

map_clusters
