# Segmenting and Clustering Neighbourhoods in Toronto

### Let's First Import All Required Python Libraries

In [6]:
import numpy as np # library to handle data in a vectorized manner

!conda install -c conda-forge wikipedia --yes # comment this line out if wikipedia is already installed
import wikipedia as wp # library to scrape data from Wikipedia

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # comment this line out if geopy is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes # comment this line out if folium is already installed
import folium # map rendering library

print('Libraries imported.')

Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/Python36

  added / updated specs: 
    - wikipedia


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    openssl-1.1.1c             |       h516909a_0         2.1 MB  conda-forge
    ca-certificates-2019.6.16  |       hecc5488_0         145 KB  conda-forge
    certifi-2019.6.16          |           py36_0         148 KB  conda-forge
    wikipedia-1.4.0            |             py_2          13 KB  conda-forge
    ------------------------------------------------------------
                                           Total:         2.4 MB

The following NEW packages will be INSTALLED:

    wikipedia:       1.4.0-py_2        conda-forge

The following packages will be UPDATED:

    ca-certificates: 2019.5.15-0                   --> 2019.6.16-hecc5488_0 conda-forge
    certifi:         2019.6.16-py36_0       

#### Download Data Set of Canada Neighbourhoods from Wikipedia and Import into DataFrame

In [7]:
#Get the html source
html = wp.page("List of postal codes of Canada: M").html().encode("UTF-8")
df = pd.read_html(html)[0]
df.to_csv('raw_data.csv',header=0,index=False)
df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront


We now have the raw data, but it cannot be used for clustering yet: the data must first be clean and consistent. So let's process the data.

In [8]:
# Replace 'Not assigned' values with NaN
df = df.replace(r'Not assigned', np.nan)
df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1A,,
1,M2A,,
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront


According to the Coursera instructions, NaN values in the Neighbourhood column must be replaced with the corresponding Borough name.

In [9]:
# Where Neighbourhood is missing, copy the Borough name from the same row.
# A boolean mask with .loc avoids the hard-coded row count and chained assignment.
missing = df["Neighbourhood"].isna()
df.loc[missing, "Neighbourhood"] = df.loc[missing, "Borough"]

df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1A,,
1,M2A,,
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront
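
The rule applied in this cell (take the Borough name wherever the Neighbourhood is missing) can also be written as a single `fillna` call. The sketch below is only an illustration on a toy frame: the three rows and their values are invented for the example and are not taken from the Wikipedia table.

```python
import numpy as np
import pandas as pd

# Toy rows, invented for illustration only.
toy = pd.DataFrame({
    "Postcode": ["X1A", "X2A", "X3A"],
    "Borough": ["Alpha", "Beta", np.nan],
    "Neighbourhood": ["Alpha Park", np.nan, np.nan],
})

# Passing a Series to fillna fills each gap from the same row of that Series.
toy["Neighbourhood"] = toy["Neighbourhood"].fillna(toy["Borough"])

filled = toy["Neighbourhood"].tolist()
print(filled)
```

A row where both columns are missing stays missing, which is why the notebook still drops NaN rows in the next cell.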


In [10]:
# Remove the rows that contain NaN values
df.dropna(how='any', inplace=True)
df = df.reset_index(drop=True)
df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,Harbourfront
3,M5A,Downtown Toronto,Regent Park
4,M6A,North York,Lawrence Heights


In [11]:
# Combine all neighbourhoods that share the same Postcode
df = df.groupby(['Postcode', 'Borough'],as_index=False).agg(lambda x:','.join(x))
df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1B,Scarborough,"Rouge,Malvern"
1,M1C,Scarborough,"Highland Creek,Rouge Hill,Port Union"
2,M1E,Scarborough,"Guildwood,Morningside,West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae
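
To see what the grouping step does in isolation, here is a minimal sketch on three rows copied from the cleaned table above (two of them share postcode M5A). It uses a dictionary with `agg` instead of the lambda; that is a stylistic choice for the example, not a change to the notebook's method.

```python
import pandas as pd

# Three rows from the cleaned table; M5A appears twice.
rows = pd.DataFrame({
    "Postcode": ["M3A", "M5A", "M5A"],
    "Borough": ["North York", "Downtown Toronto", "Downtown Toronto"],
    "Neighbourhood": ["Parkwoods", "Harbourfront", "Regent Park"],
})

# One output row per (Postcode, Borough); neighbourhood names joined by commas.
combined = rows.groupby(["Postcode", "Borough"], as_index=False).agg(
    {"Neighbourhood": ",".join}
)
print(combined)
```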


In [12]:
df.shape

(103, 3)

With the data cleaned, we now download the coordinates for each postcode and combine them with our data into a new DataFrame, 'df_canada'.

In [13]:
df_loc = pd.read_csv('https://cocl.us/Geospatial_data')
df_loc.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [15]:
df_loc.columns = ('Postcode', 'Latitude', 'Longitude')
df_canada = pd.merge(df,
                 df_loc[['Postcode', 'Latitude', 'Longitude']],
                 on='Postcode')
df_canada = pd.DataFrame(df_canada)
df_canada.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
0,M1B,Scarborough,"Rouge,Malvern",43.806686,-79.194353
1,M1C,Scarborough,"Highland Creek,Rouge Hill,Port Union",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood,Morningside,West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
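
`pd.merge` defaults to an inner join, so a postcode with no coordinates would be dropped without any warning. One way to check for that, sketched here under stated assumptions, is a left join with `indicator=True`. The postcode `X9Z` is made up to force a mismatch; the two coordinate rows are copied from the table above.

```python
import pandas as pd

left = pd.DataFrame({"Postcode": ["M1B", "M1C", "X9Z"]})  # X9Z is a made-up code
right = pd.DataFrame({
    "Postcode": ["M1B", "M1C"],
    "Latitude": [43.806686, 43.784535],
    "Longitude": [-79.194353, -79.160497],
})

# how="left" keeps every postcode; the _merge column records where each row matched.
checked = left.merge(right, on="Postcode", how="left", indicator=True)
unmatched = checked.loc[checked["_merge"] == "left_only", "Postcode"].tolist()
print(unmatched)
```

An empty `unmatched` list on the real data would confirm that the inner join lost nothing.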


#### Create a New DataFrame of Boroughs Whose Names Contain 'Toronto'

In [16]:
df_toronto = df_canada[df_canada['Borough'].str.contains("Toronto")].reset_index(drop=True)
df_toronto

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
0,M4E,East Toronto,The Beaches,43.676357,-79.293031
1,M4K,East Toronto,"The Danforth West,Riverdale",43.679557,-79.352188
2,M4L,East Toronto,"The Beaches West,India Bazaar",43.668999,-79.315572
3,M4M,East Toronto,Studio District,43.659526,-79.340923
4,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879
5,M4P,Central Toronto,Davisville North,43.712751,-79.390197
6,M4R,Central Toronto,North Toronto West,43.715383,-79.405678
7,M4S,Central Toronto,Davisville,43.704324,-79.38879
8,M4T,Central Toronto,"Moore Park,Summerhill East",43.689574,-79.38316
9,M4V,Central Toronto,"Deer Park,Forest Hill SE,Rathnelly,South Hill,...",43.686412,-79.400049


#### Use the geopy library to get the latitude and longitude values of Toronto.


In order to define an instance of the geocoder, we need to define a user_agent. We will name our agent tc_explorer, as shown below.

In [17]:
address = 'Toronto, Canada'

geolocator = Nominatim(user_agent="tc_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.653963, -79.387207.


### Create a map of Toronto City with neighborhoods superimposed on top.

In [18]:
# create map of Toronto using latitude and longitude values
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighbourhood in zip(df_toronto['Latitude'], df_toronto['Longitude'], df_toronto['Borough'], df_toronto['Neighbourhood']):
    label = '{}, {}'.format(neighbourhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

Preparing the data for clustering the Toronto neighbourhoods

In [19]:
# one hot encoding
toronto_onehot = pd.get_dummies(df_toronto[['Neighbourhood']], prefix="", prefix_sep="")

# add Neighbourhood column back to dataframe
toronto_onehot['Neighbourhood'] = df_toronto['Neighbourhood']

# move Neighbourhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.head()

Unnamed: 0,Neighbourhood,"Adelaide,King,Richmond",Berczy Park,"Brockton,Exhibition Place,Parkdale Village",Business Reply Mail Processing Centre 969 Eastern,"CN Tower,Bathurst Quay,Island airport,Harbourfront West,King and Spadina,Railway Lands,South Niagara","Cabbagetown,St. James Town",Central Bay Street,"Chinatown,Grange Park,Kensington Market",Christie,Church and Wellesley,"Commerce Court,Victoria Hotel",Davisville,Davisville North,"Deer Park,Forest Hill SE,Rathnelly,South Hill,Summerhill West","Design Exchange,Toronto Dominion Centre","Dovercourt Village,Dufferin","First Canadian Place,Underground city","Forest Hill North,Forest Hill West","Harbord,University of Toronto","Harbourfront East,Toronto Islands,Union Station","Harbourfront,Regent Park","High Park,The Junction South",Lawrence Park,"Little Portugal,Trinity","Moore Park,Summerhill East",North Toronto West,"Parkdale,Roncesvalles",Rosedale,Roselawn,"Runnymede,Swansea","Ryerson,Garden District",St. James Town,Stn A PO Boxes 25 The Esplanade,Studio District,"The Annex,North Midtown,Yorkville",The Beaches,"The Beaches West,India Bazaar","The Danforth West,Riverdale"
0,The Beaches,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
1,"The Danforth West,Riverdale",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
2,"The Beaches West,India Bazaar",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0
3,Studio District,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0
4,Lawrence Park,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


## Cluster Neighborhoods

Run *k*-means to group the neighbourhoods into 4 clusters.

In [20]:
# set number of clusters
kclusters = 4

toronto_clustering = toronto_onehot.drop('Neighbourhood', axis=1)  # pass axis by keyword; the positional form fails on newer pandas

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_ 

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 1, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0], dtype=int32)
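
A quick way to read this array is to count how many rows fall into each cluster. The sketch below copies the labels printed above into a plain list (the names `labels` and `sizes` are chosen for the example) and tallies them with the standard library.

```python
from collections import Counter

# Labels copied from the k-means output above (38 rows).
labels = [
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
    0, 1, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0,
]

# Map each cluster label to the number of rows assigned to it.
sizes = dict(sorted(Counter(labels).items()))
print(sizes)  # {0: 35, 1: 1, 2: 1, 3: 1}
```

Cluster 0 holds 35 of the 38 rows, and each of the other three clusters holds exactly one.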

Let's create a new dataframe that includes the cluster as well as the co-ordinates for each neighborhood.

In [21]:
toronto_onehot.insert(0, 'Cluster Labels', kmeans.labels_)

toronto_merged = df_toronto

# join toronto_onehot (now carrying the cluster labels) onto df_toronto, which holds latitude/longitude for each Neighbourhood
toronto_merged = toronto_merged.join(toronto_onehot.set_index('Neighbourhood'), on='Neighbourhood')

Finally, let's visualize the resulting clusters.

In [22]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighbourhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
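
The map code indexes the colour list with `cluster-1` even though the k-means labels start at 0. The following sketch shows what that indexing does, using placeholder strings instead of real hex colours; `palette` and `assigned` are hypothetical names used only here.

```python
kclusters = 4

# Placeholder entries standing in for the hex colours in `rainbow`.
palette = ["colour_{}".format(i) for i in range(kclusters)]

# Same indexing as the map code: label - 1, so label 0 becomes index -1.
assigned = {label: palette[label - 1] for label in range(kclusters)}
print(assigned)
```

Label 0 wraps around to the last entry through Python's negative indexing, so every cluster still receives its own colour, only shifted by one position.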

## Examine Clusters

### Cluster 1

In [23]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,"Adelaide,King,Richmond",Berczy Park,"Brockton,Exhibition Place,Parkdale Village",Business Reply Mail Processing Centre 969 Eastern,"CN Tower,Bathurst Quay,Island airport,Harbourfront West,King and Spadina,Railway Lands,South Niagara","Cabbagetown,St. James Town",Central Bay Street,"Chinatown,Grange Park,Kensington Market",Christie,Church and Wellesley,"Commerce Court,Victoria Hotel",Davisville,Davisville North,"Deer Park,Forest Hill SE,Rathnelly,South Hill,Summerhill West","Design Exchange,Toronto Dominion Centre","Dovercourt Village,Dufferin","First Canadian Place,Underground city","Forest Hill North,Forest Hill West","Harbord,University of Toronto","Harbourfront East,Toronto Islands,Union Station","Harbourfront,Regent Park","High Park,The Junction South",Lawrence Park,"Little Portugal,Trinity","Moore Park,Summerhill East",North Toronto West,"Parkdale,Roncesvalles",Rosedale,Roselawn,"Runnymede,Swansea","Ryerson,Garden District",St. James Town,Stn A PO Boxes 25 The Esplanade,Studio District,"The Annex,North Midtown,Yorkville",The Beaches,"The Beaches West,India Bazaar","The Danforth West,Riverdale"
0,East Toronto,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
1,East Toronto,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
2,East Toronto,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0
3,East Toronto,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0
4,Central Toronto,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
5,Central Toronto,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
6,Central Toronto,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0
7,Central Toronto,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
8,Central Toronto,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0
9,Central Toronto,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


### Cluster 2

In [24]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,"Adelaide,King,Richmond",Berczy Park,"Brockton,Exhibition Place,Parkdale Village",Business Reply Mail Processing Centre 969 Eastern,"CN Tower,Bathurst Quay,Island airport,Harbourfront West,King and Spadina,Railway Lands,South Niagara","Cabbagetown,St. James Town",Central Bay Street,"Chinatown,Grange Park,Kensington Market",Christie,Church and Wellesley,"Commerce Court,Victoria Hotel",Davisville,Davisville North,"Deer Park,Forest Hill SE,Rathnelly,South Hill,Summerhill West","Design Exchange,Toronto Dominion Centre","Dovercourt Village,Dufferin","First Canadian Place,Underground city","Forest Hill North,Forest Hill West","Harbord,University of Toronto","Harbourfront East,Toronto Islands,Union Station","Harbourfront,Regent Park","High Park,The Junction South",Lawrence Park,"Little Portugal,Trinity","Moore Park,Summerhill East",North Toronto West,"Parkdale,Roncesvalles",Rosedale,Roselawn,"Runnymede,Swansea","Ryerson,Garden District",St. James Town,Stn A PO Boxes 25 The Esplanade,Studio District,"The Annex,North Midtown,Yorkville",The Beaches,"The Beaches West,India Bazaar","The Danforth West,Riverdale"
23,Central Toronto,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


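As you can see, the clustering ran, but the result is very poor: nearly every neighbourhood is assigned to cluster 0, and each remaining cluster holds a single row. The data points are very close together and there is little data. In addition, the only features are the one-hot encoded neighbourhood names, so each row is unique and equally distant from every other row.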
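
To make the effect of the feature choice concrete, here is a minimal sketch (an illustration, not part of the original analysis) that one-hot encodes four of the neighbourhood names and measures the Euclidean distance between every pair of rows. The variable names are chosen for the example.

```python
import numpy as np
import pandas as pd

# Four neighbourhood names taken from the Toronto table.
names = pd.Series(["The Beaches", "Studio District", "Lawrence Park", "Davisville"])
onehot = pd.get_dummies(names).to_numpy(dtype=float)

# Pairwise Euclidean distances between the one-hot rows.
diff = onehot[:, None, :] - onehot[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))

# Keep only distances between different rows.
off_diagonal = dist[~np.eye(len(names), dtype=bool)]
print(np.unique(off_diagonal.round(6)))
```

Every pair of rows is the same distance apart (the square root of 2), so k-means has no geometric structure to group on.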

## Let's Repeat the Same Process for All Neighbourhoods Instead of Toronto Only

Let's see whether working on the complete dataset gives better results.

In [26]:
# one hot encoding
canada_onehot = pd.get_dummies(df_canada[['Neighbourhood']], prefix="", prefix_sep="")

# add Neighbourhood column back to dataframe
canada_onehot['Neighbourhood'] = df_canada['Neighbourhood']

# move Neighbourhood column to the first column
fixed_columns = [canada_onehot.columns[-1]] + list(canada_onehot.columns[:-1])
canada_onehot = canada_onehot[fixed_columns]

canada_onehot.shape

(103, 104)

In [43]:
# set number of clusters
kclusters = 5

canada_clustering = canada_onehot.drop('Neighbourhood', axis=1)  # pass axis by keyword; the positional form fails on newer pandas

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(canada_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_ 


array([2, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 2, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 1, 0, 0, 2, 4, 0, 0, 0, 3, 0, 0, 0], dtype=int32)

In [44]:
canada_onehot.insert(0, 'Cluster Labels', kmeans.labels_)

canada_merged = df_canada

# join canada_onehot (now carrying the cluster labels) onto df_canada, which holds latitude/longitude for each Neighbourhood
canada_merged = canada_merged.join(canada_onehot.set_index('Neighbourhood'), on='Neighbourhood')

In [45]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(canada_merged['Latitude'], canada_merged['Longitude'], canada_merged['Neighbourhood'], canada_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

## Examine Clusters

### Cluster 1

In [46]:
canada_merged.loc[canada_merged['Cluster Labels'] == 0, canada_merged.columns[[1] + list(range(5, canada_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,"Adelaide,King,Richmond",Agincourt,"Agincourt North,L'Amoreaux East,Milliken,Steeles East","Albion Gardens,Beaumond Heights,Humbergate,Jamestown,Mount Olive,Silverstone,South Steeles,Thistletown","Alderwood,Long Branch","Bathurst Manor,Downsview North,Wilson Heights",Bayview Village,"Bedford Park,Lawrence Manor East",Berczy Park,"Birch Cliff,Cliffside West","Bloordale Gardens,Eringate,Markland Wood,Old Burnhamthorpe","Brockton,Exhibition Place,Parkdale Village",Business Reply Mail Processing Centre 969 Eastern,"CFB Toronto,Downsview East","CN Tower,Bathurst Quay,Island airport,Harbourfront West,King and Spadina,Railway Lands,South Niagara","Cabbagetown,St. James Town",Caledonia-Fairbanks,Canada Post Gateway Processing Centre,Cedarbrae,Central Bay Street,"Chinatown,Grange Park,Kensington Market",Christie,Church and Wellesley,"Clairlea,Golden Mile,Oakridge","Clarks Corners,Sullivan,Tam O'Shanter","Cliffcrest,Cliffside,Scarborough Village West","Cloverdale,Islington,Martin Grove,Princess Gardens,West Deane Park","Commerce Court,Victoria Hotel",Davisville,Davisville North,"Deer Park,Forest Hill SE,Rathnelly,South Hill,Summerhill West","Del Ray,Keelesdale,Mount Dennis,Silverthorn","Design Exchange,Toronto Dominion Centre",Don Mills North,"Dorset Park,Scarborough Town Centre,Wexford Heights","Dovercourt Village,Dufferin",Downsview Central,Downsview Northwest,Downsview West,"Downsview,North Park,Upwood Park","East Birchmount Park,Ionview,Kennedy Park",East Toronto,"Emery,Humberlea","Fairview,Henry Farm,Oriole","First Canadian Place,Underground city","Flemingdon Park,Don Mills South","Forest Hill North,Forest Hill West",Glencairn,"Guildwood,Morningside,West Hill","Harbord,University of Toronto","Harbourfront East,Toronto Islands,Union Station","Harbourfront,Regent Park","High Park,The Junction South","Highland Creek,Rouge Hill,Port Union",Hillcrest Village,"Humber Bay Shores,Mimico South,New Toronto","Humber Bay,King's Mill Park,Kingsway 
Park South East,Mimico NE,Old Mill South,The Queensway East,Royal York South East,Sunnylea",Humber Summit,Humewood-Cedarvale,Islington Avenue,"Kingsview Village,Martin Grove Gardens,Richview Gardens,St. Phillips","Kingsway Park South West,Mimico NW,The Queensway West,Royal York South West,South of Bloor",L'Amoreaux West,"Lawrence Heights,Lawrence Manor",Lawrence Park,Leaside,"Little Portugal,Trinity","Maryvale,Wexford","Moore Park,Summerhill East","Newtonbrook,Willowdale",North Toronto West,Northwest,"Northwood Park,York University","Parkdale,Roncesvalles",Parkwoods,Queen's Park,Rosedale,Roselawn,"Rouge,Malvern","Runnymede,Swansea","Ryerson,Garden District",Scarborough Village,"Silver Hills,York Mills",St. James Town,Stn A PO Boxes 25 The Esplanade,Studio District,"The Annex,North Midtown,Yorkville",The Beaches,"The Beaches West,India Bazaar","The Danforth West,Riverdale","The Junction North,Runnymede","The Kingsway,Montgomery Road,Old Mill North",Thorncliffe Park,Upper Rouge,Victoria Village,Westmount,Weston,Willowdale South,Willowdale West,Woburn,"Woodbine Gardens,Parkview Hill",Woodbine Heights,York Mills West
1,Scarborough,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Scarborough,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0
4,Scarborough,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
5,Scarborough,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
6,Scarborough,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
7,Scarborough,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
8,Scarborough,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
9,Scarborough,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
10,Scarborough,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
11,Scarborough,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


### Cluster 2

In [47]:
canada_merged.loc[canada_merged['Cluster Labels'] == 1, canada_merged.columns[[1] + list(range(5, canada_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,"Adelaide,King,Richmond",Agincourt,"Agincourt North,L'Amoreaux East,Milliken,Steeles East","Albion Gardens,Beaumond Heights,Humbergate,Jamestown,Mount Olive,Silverstone,South Steeles,Thistletown","Alderwood,Long Branch","Bathurst Manor,Downsview North,Wilson Heights",Bayview Village,"Bedford Park,Lawrence Manor East",Berczy Park,"Birch Cliff,Cliffside West","Bloordale Gardens,Eringate,Markland Wood,Old Burnhamthorpe","Brockton,Exhibition Place,Parkdale Village",Business Reply Mail Processing Centre 969 Eastern,"CFB Toronto,Downsview East","CN Tower,Bathurst Quay,Island airport,Harbourfront West,King and Spadina,Railway Lands,South Niagara","Cabbagetown,St. James Town",Caledonia-Fairbanks,Canada Post Gateway Processing Centre,Cedarbrae,Central Bay Street,"Chinatown,Grange Park,Kensington Market",Christie,Church and Wellesley,"Clairlea,Golden Mile,Oakridge","Clarks Corners,Sullivan,Tam O'Shanter","Cliffcrest,Cliffside,Scarborough Village West","Cloverdale,Islington,Martin Grove,Princess Gardens,West Deane Park","Commerce Court,Victoria Hotel",Davisville,Davisville North,"Deer Park,Forest Hill SE,Rathnelly,South Hill,Summerhill West","Del Ray,Keelesdale,Mount Dennis,Silverthorn","Design Exchange,Toronto Dominion Centre",Don Mills North,"Dorset Park,Scarborough Town Centre,Wexford Heights","Dovercourt Village,Dufferin",Downsview Central,Downsview Northwest,Downsview West,"Downsview,North Park,Upwood Park","East Birchmount Park,Ionview,Kennedy Park",East Toronto,"Emery,Humberlea","Fairview,Henry Farm,Oriole","First Canadian Place,Underground city","Flemingdon Park,Don Mills South","Forest Hill North,Forest Hill West",Glencairn,"Guildwood,Morningside,West Hill","Harbord,University of Toronto","Harbourfront East,Toronto Islands,Union Station","Harbourfront,Regent Park","High Park,The Junction South","Highland Creek,Rouge Hill,Port Union",Hillcrest Village,"Humber Bay Shores,Mimico South,New Toronto","Humber Bay,King's Mill Park,Kingsway 
Park South East,Mimico NE,Old Mill South,The Queensway East,Royal York South East,Sunnylea",Humber Summit,Humewood-Cedarvale,Islington Avenue,"Kingsview Village,Martin Grove Gardens,Richview Gardens,St. Phillips","Kingsway Park South West,Mimico NW,The Queensway West,Royal York South West,South of Bloor",L'Amoreaux West,"Lawrence Heights,Lawrence Manor",Lawrence Park,Leaside,"Little Portugal,Trinity","Maryvale,Wexford","Moore Park,Summerhill East","Newtonbrook,Willowdale",North Toronto West,Northwest,"Northwood Park,York University","Parkdale,Roncesvalles",Parkwoods,Queen's Park,Rosedale,Roselawn,"Rouge,Malvern","Runnymede,Swansea","Ryerson,Garden District",Scarborough Village,"Silver Hills,York Mills",St. James Town,Stn A PO Boxes 25 The Esplanade,Studio District,"The Annex,North Midtown,Yorkville",The Beaches,"The Beaches West,India Bazaar","The Danforth West,Riverdale","The Junction North,Runnymede","The Kingsway,Montgomery Road,Old Mill North",Thorncliffe Park,Upper Rouge,Victoria Village,Westmount,Weston,Willowdale South,Willowdale West,Woburn,"Woodbine Gardens,Parkview Hill",Woodbine Heights,York Mills West
30,North York,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
38,East York,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
91,Etobicoke,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


Hmm... a slight improvement, but still not a useful result.

As the map shows, all the neighbourhoods are very close together and we have little data, so k-means clustering is unable to differentiate between such closely spaced data points.
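
For comparison only, and not something this notebook does: the sketch below runs k-means directly on the latitude and longitude of the ten Toronto rows shown earlier, instead of on the one-hot encoded names. The choice of 3 clusters and the `n_init` value are assumptions made for the example, and nothing here says whether the resulting groups are meaningful.

```python
import numpy as np
from sklearn.cluster import KMeans

# Latitude/longitude of the ten Toronto rows listed in the df_toronto output.
coords = np.array([
    [43.676357, -79.293031],  # The Beaches
    [43.679557, -79.352188],  # The Danforth West, Riverdale
    [43.668999, -79.315572],  # The Beaches West, India Bazaar
    [43.659526, -79.340923],  # Studio District
    [43.728020, -79.388790],  # Lawrence Park
    [43.712751, -79.390197],  # Davisville North
    [43.715383, -79.405678],  # North Toronto West
    [43.704324, -79.388790],  # Davisville
    [43.689574, -79.383160],  # Moore Park, Summerhill East
    [43.686412, -79.400049],  # Deer Park and neighbours
])

# Swapped-in feature set: coordinates instead of one-hot names.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(coords)
print(labels)
```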

### Thank you for completing this Notebook!