# Section 1
In this section, we read the Canadian neighbourhood information from Wikipedia and pre-process the resulting DataFrame.

In [1]:
import pandas as pd

We read the HTML table from Wikipedia using the pandas **read_html** function. **read_html** parses every HTML table on a webpage and returns a list of DataFrames, one per table. Since we are interested in the DataFrame with Canada's neighbourhood information, we take the first table from the list.

In [2]:
Canada_Neighborhood=pd.read_html('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M')[0]
Canada_Neighborhood.head(20)

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront
5,M5A,Downtown Toronto,Regent Park
6,M6A,North York,Lawrence Heights
7,M6A,North York,Lawrence Manor
8,M7A,Queen's Park,Not assigned
9,M8A,Not assigned,Not assigned


Remove the rows that have a 'Not assigned' value in **Borough**.

In [3]:
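# .copy() gives an independent frame, so the later .loc assignment does not trigger a SettingWithCopyWarning
Canada_Neighborhood=Canada_Neighborhood[Canada_Neighborhood['Borough']!='Not assigned'].copy()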
#Canada_Neighborhood.head(20)

If a **Neighbourhood** has a 'Not assigned' value, replace it with the value of **Borough**.

In [4]:
Canada_Neighborhood.loc[Canada_Neighborhood['Neighbourhood']=='Not assigned','Neighbourhood']=Canada_Neighborhood[Canada_Neighborhood['Neighbourhood']=='Not assigned']['Borough']
#Canada_Neighborhood.head(20)

Group the rows by **Postcode** and **Borough**, and concatenate the **Neighbourhood** values with a comma separator.

In [5]:
Canada_Neighborhood=Canada_Neighborhood.groupby(['Postcode','Borough']).agg(lambda x: ', '.join(x)).reset_index()
Canada_Neighborhood.head(20)

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1B,Scarborough,"Rouge, Malvern"
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union"
2,M1E,Scarborough,"Guildwood, Morningside, West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae
5,M1J,Scarborough,Scarborough Village
6,M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park"
7,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge"
8,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West"
9,M1N,Scarborough,"Birch Cliff, Cliffside West"


In [6]:
Canada_Neighborhood.shape

(103, 3)
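
The three cleaning steps in this section (drop unassigned boroughs, fall back to the borough name, group by postcode) can be sketched on a small hand-made frame. The rows below are hypothetical sample data rather than the Wikipedia table, and `clean_postcodes` is an illustrative helper name that does not appear in the notebook.

```python
import pandas as pd

def clean_postcodes(df):
    """Apply the cleaning steps from this section to a raw postcode table."""
    # 1. Drop rows whose Borough is 'Not assigned' (copy so later edits are safe)
    df = df[df['Borough'] != 'Not assigned'].copy()
    # 2. Where Neighbourhood is 'Not assigned', fall back to the Borough name
    mask = df['Neighbourhood'] == 'Not assigned'
    df.loc[mask, 'Neighbourhood'] = df.loc[mask, 'Borough']
    # 3. One row per (Postcode, Borough), neighbourhoods joined with ', '
    return (df.groupby(['Postcode', 'Borough'])['Neighbourhood']
              .agg(', '.join)
              .reset_index())

# Hypothetical sample rows (assumption: same column names as the scraped table)
sample = pd.DataFrame({
    'Postcode': ['M1A', 'M5A', 'M5A', 'M7A'],
    'Borough': ['Not assigned', 'Downtown Toronto', 'Downtown Toronto', "Queen's Park"],
    'Neighbourhood': ['Not assigned', 'Harbourfront', 'Regent Park', 'Not assigned'],
})
cleaned = clean_postcodes(sample)
print(cleaned)
```

Selecting the `Neighbourhood` column before aggregating keeps the string join from being applied to any other columns the frame might carry.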

# Section 2
In this section, we enrich the DataFrame with latitude and longitude information.

First, we try to extract the latitude and longitude from the geocoder API.

In [7]:
!conda install -c conda-forge geocoder --yes
import geocoder

Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/Python36

  added / updated specs: 
    - geocoder


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    geocoder-1.38.1            |             py_1          53 KB  conda-forge
    certifi-2019.9.11          |           py36_0         147 KB  conda-forge
    ca-certificates-2019.9.11  |       hecc5488_0         144 KB  conda-forge
    ratelim-0.1.6              |             py_2           6 KB  conda-forge
    openssl-1.1.1d             |       h516909a_0         2.1 MB  conda-forge
    ------------------------------------------------------------
                                           Total:         2.4 MB

The following NEW packages will be INSTALLED:

    geocoder:        1.38.1-py_1       conda-forge
    ratelim:         0.1.6-py_2        conda-forge

The following packages will be UPDATED:

    

In [8]:
lat_lng_coords = None
postal_code='M1B'
Neigh='Toronto, Ontario'

In [9]:
# loop until you get the coordinates
while(lat_lng_coords is None):
    g = geocoder.google('{},{}'.format(postal_code,Neigh))
    lat_lng_coords = g.latlng
    
latitude=lat_lng_coords[0]
longitude=lat_lng_coords[1]

KeyboardInterrupt: 

It seems the geocoder API is not returning coordinates at all, so the while loop runs forever and the cell had to be interrupted manually. We will instead extract the latitude and longitude from the CSV file provided with the assignment.
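
An unbounded `while` loop like the one above can be capped so that a failing lookup returns instead of hanging the notebook. The following is a minimal sketch, not the notebook's own code: `fetch_with_retries` is a hypothetical helper, and the lookup is passed in as a plain callable so the example runs without the geocoder package.

```python
import time

def fetch_with_retries(fetch, max_attempts=5, delay_seconds=1.0):
    """Call fetch() until it returns a non-None value or the attempts run out."""
    for attempt in range(1, max_attempts + 1):
        result = fetch()
        if result is not None:
            return result
        if attempt < max_attempts:
            time.sleep(delay_seconds)  # brief pause before the next try
    return None  # the caller decides how to handle the failure

# In the notebook, fetch could be (assumption, untested here):
#   lambda: geocoder.google('{},{}'.format(postal_code, Neigh)).latlng

# Stand-in lookup that never succeeds, mimicking the behaviour observed above
calls = []
def always_fails():
    calls.append(1)
    return None

coords = fetch_with_retries(always_fails, max_attempts=3, delay_seconds=0)
print(coords, len(calls))  # None 3
```

Returning `None` after a fixed number of attempts lets the notebook fall back to another data source without a manual interrupt.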

Read the CSV file from IBM Cloud Object Storage into the IBM Watson notebook.

In [13]:
import types
#import pandas as pd
from botocore.client import Config
import ibm_boto3

def __iter__(self): return 0

# @hidden_cell
# The following code accesses a file in your IBM Cloud Object Storage. It includes your credentials.
# You might want to remove those credentials before you share the notebook.
client_2ff313f89b7b4d198635ee2234f8bd9b = ibm_boto3.client(service_name='s3',
    ibm_api_key_id='SKKrWWblVq3fvYDAd9qmTrtD4TSnL3UuM9UuUowvgLcu',
    ibm_auth_endpoint="https://iam.eu-gb.bluemix.net/oidc/token",
    config=Config(signature_version='oauth'),
    endpoint_url='https://s3.eu-geo.objectstorage.service.networklayer.com')

body = client_2ff313f89b7b4d198635ee2234f8bd9b.get_object(Bucket='datasciencecapstoneclustering-donotdelete-pr-ykuxafb46bln1j',Key='Geospatial_Coordinates.csv')['Body']
# add missing __iter__ method, so pandas accepts body as file-like object
if not hasattr(body, "__iter__"): body.__iter__ = types.MethodType( __iter__, body )

Canada_Neighborhood_Coordinates = pd.read_csv(body)
Canada_Neighborhood_Coordinates.columns=['Postcode','Latitude','Longitude']
Canada_Neighborhood_Coordinates.head()


Unnamed: 0,Postcode,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


Join the coordinates with the neighbourhood DataFrame on **Postcode** to get the resulting frame.

In [18]:
Canada_Neighborhood_Info=Canada_Neighborhood.join(Canada_Neighborhood_Coordinates.set_index('Postcode'), on='Postcode')
Canada_Neighborhood_Info.head(20)

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
5,M1J,Scarborough,Scarborough Village,43.744734,-79.239476
6,M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park",43.727929,-79.262029
7,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge",43.711112,-79.284577
8,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West",43.716316,-79.239476
9,M1N,Scarborough,"Birch Cliff, Cliffside West",43.692657,-79.264848


In [19]:
Canada_Neighborhood_Info.shape

(103, 5)
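
A join on **Postcode** leaves `NaN` coordinates for any postcode that is missing from the CSV, and a matching row count alone does not reveal that. Below is a minimal sketch of the same kind of join written with `DataFrame.merge`, followed by an explicit check for unmatched postcodes. The two frames are hypothetical sample data, and `M9Z` is a made-up postcode used only to show an unmatched row.

```python
import pandas as pd

neighbourhoods = pd.DataFrame({
    'Postcode': ['M1B', 'M1C', 'M9Z'],
    'Borough': ['Scarborough', 'Scarborough', 'Example Borough'],
})
coordinates = pd.DataFrame({
    'Postcode': ['M1B', 'M1C'],
    'Latitude': [43.806686, 43.784535],
    'Longitude': [-79.194353, -79.160497],
})

# Left join keeps every neighbourhood row; validate raises MergeError
# if Postcode is duplicated on either side
merged = neighbourhoods.merge(coordinates, on='Postcode', how='left',
                              validate='one_to_one')

# Postcodes that found no coordinates end up with NaN latitude
missing = merged.loc[merged['Latitude'].isna(), 'Postcode'].tolist()
print(missing)  # ['M9Z']
```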

# Section 3
Here we analyse the neighbourhood information and cluster the neighbourhoods on the basis of the venues found in each of them.

We restrict the analysis to boroughs whose name contains 'Toronto'.

In [28]:
Canada_Neighborhood_Data = Canada_Neighborhood_Info[(Canada_Neighborhood_Info.Borough.str.contains('Toronto'))].reset_index(drop=True)
Canada_Neighborhood_Data.head(10)

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
0,M4E,East Toronto,The Beaches,43.676357,-79.293031
1,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188
2,M4L,East Toronto,"The Beaches West, India Bazaar",43.668999,-79.315572
3,M4M,East Toronto,Studio District,43.659526,-79.340923
4,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879
5,M4P,Central Toronto,Davisville North,43.712751,-79.390197
6,M4R,Central Toronto,North Toronto West,43.715383,-79.405678
7,M4S,Central Toronto,Davisville,43.704324,-79.38879
8,M4T,Central Toronto,"Moore Park, Summerhill East",43.689574,-79.38316
9,M4V,Central Toronto,"Deer Park, Forest Hill SE, Rathnelly, South Hi...",43.686412,-79.400049


In [31]:
!conda install -c conda-forge folium=0.5.0 --yes
import folium

Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/Python36

  added / updated specs: 
    - folium=0.5.0


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    folium-0.5.0               |             py_0          45 KB  conda-forge
    vincent-0.4.4              |             py_1          28 KB  conda-forge
    branca-0.3.1               |             py_0          25 KB  conda-forge
    altair-3.2.0               |           py36_0         770 KB  conda-forge
    ------------------------------------------------------------
                                           Total:         868 KB

The following NEW packages will be INSTALLED:

    altair:  3.2.0-py36_0 conda-forge
    branca:  0.3.1-py_0   conda-forge
    folium:  0.5.0-py_0   conda-forge
    vincent: 0.4.4-py_1   conda-forge


Downloading and Extracting Packages
folium-0.5.0         | 45 KB    

Let's create a map of Toronto with all the neighborhoods superimposed on it.

In [34]:
#Create a map of toronto 
toronto_lat = 43.6532
toronto_lon = -79.3832
map_toronto=folium.Map(location=[toronto_lat,toronto_lon],zoom_start=11)

# Superimpose the Neighborhood information on it
for index, rows in Canada_Neighborhood_Data.iterrows():
    label='{}, {}, {}'.format(rows['Neighbourhood'], rows['Borough'], rows['Postcode'])
    label = folium.Popup(label,parse_html=True)
    folium.CircleMarker(
    [rows['Latitude'],rows['Longitude']],
    radius=5,
    popup=label,
    color='blue',
    fill=True,
    fill_color='#3186cc',
    fill_opacity=0.7,
    parse_html=False).add_to(map_toronto)
    #print(label)
    
map_toronto

Define the Foursquare API credentials

In [36]:
CLIENT_ID = 'XD5J0LL1S1G1PVPQWLKKDRT20JXKVUXSKXMKJNIINCCRRO4V' # your Foursquare ID
CLIENT_SECRET = '23SDL3GYJF34QVDOSMCI21Y4OSDA1UQTPT22DFRZDQU0X2YZ' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
radius=500
LIMIT=100

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: XD5J0LL1S1G1PVPQWLKKDRT20JXKVUXSKXMKJNIINCCRRO4V
CLIENT_SECRET:23SDL3GYJF34QVDOSMCI21Y4OSDA1UQTPT22DFRZDQU0X2YZ


In [85]:
import requests
from pandas.io.json import json_normalize
from sklearn.cluster import KMeans
import matplotlib.cm as cm
import matplotlib.colors as colors
import numpy as np

Let's call the Foursquare API to retrieve all the venues within a 500-metre radius of each postcode's coordinates.

In [45]:
venues_list=[]
for index,rows in Canada_Neighborhood_Data.iterrows():
    url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            rows['Latitude'], 
            rows['Longitude'], 
            radius, 
            LIMIT)
    result=requests.get(url).json()['response']['groups'][0]['items']
    venues_list.append([(
        rows['Postcode'],
        rows['Borough'],
        rows['Neighbourhood'],
        rows['Latitude'],
        rows['Longitude'],
        v['venue']['name'], 
        v['venue']['location']['lat'], 
        v['venue']['location']['lng'],  
        v['venue']['categories'][0]['name']) for v in result])    
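
The loop above indexes straight into the JSON response, so a failed request or a venue without a category would raise a `KeyError` or `IndexError`. Below is a hedged sketch of a more defensive extraction step. `extract_venues` is a hypothetical helper, and the sample payload is hand-made; it only assumes the same nesting as the keys accessed in the loop (`response`, `groups`, `items`, `venue`).

```python
def extract_venues(payload):
    """Return (name, lat, lng, category) tuples from an explore-style response dict."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item.get('venue', {})
        location = venue.get('location', {})
        categories = venue.get('categories', [])
        # Fall back to None when a venue has no category instead of raising IndexError
        category = categories[0].get('name') if categories else None
        rows.append((venue.get('name'), location.get('lat'),
                     location.get('lng'), category))
    return rows

# Hand-made payload (assumption: same nesting as the keys used in the loop above)
sample_payload = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Glen Manor Ravine',
               'location': {'lat': 43.676821, 'lng': -79.293942},
               'categories': [{'name': 'Trail'}]}},
    {'venue': {'name': 'Uncategorised Place',
               'location': {'lat': 43.0, 'lng': -79.0},
               'categories': []}},
]}]}}

print(extract_venues(sample_payload))
print(extract_venues({}))  # an empty or failed response yields no rows
```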

In [46]:
nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
nearby_venues.columns=['Postcode','Borough','Neighbourhood','Neighborhood Latitude','Neighborhood Longitude', 'Venue','Venue Latitude','Venue Longitude','Venue Category']
nearby_venues.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,M4E,East Toronto,The Beaches,43.676357,-79.293031,Glen Manor Ravine,43.676821,-79.293942,Trail
1,M4E,East Toronto,The Beaches,43.676357,-79.293031,The Big Carrot Natural Food Market,43.678879,-79.297734,Health Food Store
2,M4E,East Toronto,The Beaches,43.676357,-79.293031,Grover Pub and Grub,43.679181,-79.297215,Pub
3,M4E,East Toronto,The Beaches,43.676357,-79.293031,Upper Beaches,43.680563,-79.292869,Neighborhood
4,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188,Pantheon,43.677621,-79.351434,Greek Restaurant


Let's see how many venues were returned for each postcode.

In [48]:
nearby_venues.groupby(['Postcode','Borough','Neighbourhood'])['Venue'].count()

Postcode  Borough           Neighbourhood                                                                                             
M4E       East Toronto      The Beaches                                                                                                     4
M4K       East Toronto      The Danforth West, Riverdale                                                                                   42
M4L       East Toronto      The Beaches West, India Bazaar                                                                                 17
M4M       East Toronto      Studio District                                                                                                38
M4N       Central Toronto   Lawrence Park                                                                                                   4
M4P       Central Toronto   Davisville North                                                                                                8
M4R       Cen

In [49]:
print('There are {} unique categories.'.format(len(nearby_venues['Venue Category'].unique())))

There are 229 unique categories.


Let's find the top venues for each neighborhood.

In [54]:
Toronto_onehot=pd.get_dummies(nearby_venues[['Venue Category']], prefix='',prefix_sep='')
Toronto_onehot['Postcode']=nearby_venues['Postcode']
fixed_columns=[Toronto_onehot.columns[-1]] + list(Toronto_onehot.columns[:-1])
Toronto_onehot=Toronto_onehot[fixed_columns]
Toronto_onehot.head(10)

Unnamed: 0,Postcode,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,...,Theme Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wings Joint,Yoga Studio
0,M4E,0,0,0,0,0,0,0,0,0,...,0,0,1,0,0,0,0,0,0,0
1,M4E,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,M4E,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,M4E,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,M4K,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
5,M4K,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
6,M4K,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
7,M4K,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
8,M4K,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
9,M4K,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [63]:
Toronto_grouped=Toronto_onehot.groupby('Postcode').mean().reset_index()
Toronto_grouped.head(10)

Unnamed: 0,Postcode,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,...,Theme Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wings Joint,Yoga Studio
0,M4E,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,M4K,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,...,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.02381
2,M4L,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,M4M,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316
4,M4N,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,M4P,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,M4R,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455
7,M4S,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.029412,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,M4T,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,M4V,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0
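
The grouped values above are the mean of 0/1 indicator columns, which is the share of a postcode's venues that fall into each category. A small sketch on hypothetical data makes the arithmetic visible: four venues in one postcode give each of its categories a frequency of 0.25.

```python
import pandas as pd

# Hypothetical venue list (assumption: same column names as nearby_venues)
venues = pd.DataFrame({
    'Postcode': ['M4E', 'M4E', 'M4E', 'M4E', 'M4K', 'M4K'],
    'Venue Category': ['Trail', 'Pub', 'Health Food Store', 'Neighborhood',
                       'Greek Restaurant', 'Pub'],
})

# One indicator column per category, cast to int so the values are 0/1
onehot = pd.get_dummies(venues['Venue Category']).astype(int)
onehot.insert(0, 'Postcode', venues['Postcode'])

# The mean of 0/1 indicators is the share of venues in each category
frequencies = onehot.groupby('Postcode').mean()
print(frequencies)
```

Each row of `frequencies` sums to 1, so postcodes with very different venue counts become comparable before clustering.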


In [76]:
Venue_Transposed = Toronto_grouped.loc[0].T.reset_index().loc[1:]
Venue_Transposed.columns=['venue','freq']
Venue_Transposed.sort_values(by='freq',ascending=False).head(10).T.reset_index().loc[0]

index                         venue
116               Health Food Store
158                    Neighborhood
178                             Pub
222                           Trail
147        Mediterranean Restaurant
148                     Men's Store
149              Mexican Restaurant
150       Middle Eastern Restaurant
151              Miscellaneous Shop
152      Modern European Restaurant
Name: 0, dtype: object

In [77]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
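
The function above returns the first `num_top_venues` categories even when their frequency is zero, so a postcode with only a few venues gets padded with categories it does not actually contain. A possible variant, shown here as a sketch on a hypothetical row, keeps only the categories that occur. `top_nonzero_categories` is an illustrative name, not part of the notebook.

```python
import pandas as pd

def top_nonzero_categories(row, num_top_venues):
    """Return up to num_top_venues category names with frequency > 0, most frequent first."""
    freqs = row.iloc[1:].astype(float)  # skip the leading Postcode entry
    freqs = freqs[freqs > 0]            # drop categories that never occur
    return freqs.sort_values(ascending=False).index.tolist()[:num_top_venues]

# Hypothetical row shaped like one row of Toronto_grouped
row = pd.Series({'Postcode': 'M4T', 'Park': 0.3, 'Trail': 0.5,
                 'Pub': 0.0, 'Gym': 0.2, 'Spa': 0.0})
print(top_nonzero_categories(row, 5))  # ['Trail', 'Park', 'Gym']
```

Because the result can be shorter than `num_top_venues`, code that writes it into a fixed number of columns would need to pad the list, for example with `None`.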

In [81]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Postcode']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Postcode'] = Toronto_grouped['Postcode']

for ind in np.arange(Toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(Toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Postcode,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M4E,Neighborhood,Health Food Store,Trail,Pub,Dessert Shop,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant
1,M4K,Greek Restaurant,Coffee Shop,Italian Restaurant,Ice Cream Shop,Furniture / Home Store,Yoga Studio,Bubble Tea Shop,Spa,Juice Bar,Cosmetics Shop
2,M4L,Pet Store,Pub,Liquor Store,Burger Joint,Sandwich Place,Fast Food Restaurant,Burrito Place,Fish & Chips Shop,Italian Restaurant,Steakhouse
3,M4M,Café,Coffee Shop,Italian Restaurant,Bakery,American Restaurant,Martial Arts Dojo,Fish Market,Bookstore,Latin American Restaurant,Brewery
4,M4N,Dim Sum Restaurant,Park,Swim School,Bus Line,Diner,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant
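
The column-naming loop above picks the suffix by indexing into `indicators` and falls back to 'th' when the index is out of range. That works for ranks 1 to 10 but would produce "21th" if more columns were requested. A minimal sketch of a general suffix helper follows; `ordinal` is a hypothetical name introduced for illustration.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th and 13th are irregular
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

num_top_venues = 10
columns = ['Postcode'] + ['{} Most Common Venue'.format(ordinal(i))
                          for i in range(1, num_top_venues + 1)]
print(columns[:4])
```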


# Clustering Neighborhoods
Let's cluster the neighborhoods to see which of them have similar kinds of venues.

In [82]:
# set number of clusters
kclusters = 5

Toronto_grouped_clustering = Toronto_grouped.drop(columns='Postcode')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(Toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 1, 1, 1, 1, 1, 1, 1, 3, 1], dtype=int32)

In [83]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

Toronto_merged = Canada_Neighborhood_Data

# merge toronto_grouped with toronto_data to add latitude/longitude for each neighborhood
Toronto_merged = Toronto_merged.join(neighborhoods_venues_sorted.set_index('Postcode'), on='Postcode')

Toronto_merged.head() # check the last columns!

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M4E,East Toronto,The Beaches,43.676357,-79.293031,0,Neighborhood,Health Food Store,Trail,Pub,Dessert Shop,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant
1,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188,1,Greek Restaurant,Coffee Shop,Italian Restaurant,Ice Cream Shop,Furniture / Home Store,Yoga Studio,Bubble Tea Shop,Spa,Juice Bar,Cosmetics Shop
2,M4L,East Toronto,"The Beaches West, India Bazaar",43.668999,-79.315572,1,Pet Store,Pub,Liquor Store,Burger Joint,Sandwich Place,Fast Food Restaurant,Burrito Place,Fish & Chips Shop,Italian Restaurant,Steakhouse
3,M4M,East Toronto,Studio District,43.659526,-79.340923,1,Café,Coffee Shop,Italian Restaurant,Bakery,American Restaurant,Martial Arts Dojo,Fish Market,Bookstore,Latin American Restaurant,Brewery
4,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879,1,Dim Sum Restaurant,Park,Swim School,Bus Line,Diner,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant


Here is the map of Toronto, with the neighborhoods color-coded by cluster so that similar neighborhoods share a color.

In [91]:
# create map
map_clusters = folium.Map(location=[toronto_lat,toronto_lon], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, bor, neigh, cluster in zip(Toronto_merged['Latitude'], Toronto_merged['Longitude'], Toronto_merged['Postcode'], Toronto_merged['Borough'], Toronto_merged['Neighbourhood'], Toronto_merged['Cluster Labels']):
    label = folium.Popup(str(bor) + ', ' + str(neigh)+ ', ' + str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### Cluster Analysis

In [97]:
Toronto_merged['Cluster Labels'].value_counts()

1    33
3     2
4     1
2     1
0     1
Name: Cluster Labels, dtype: int64

It seems that **most of the neighborhoods (33)** fall into **Cluster 1**.
Let's look at the neighborhood information for Cluster 1 and see whether we find something in common.

In [98]:
Toronto_merged.loc[Toronto_merged['Cluster Labels'] == 1, Toronto_merged.columns[[0,1,2] + list(range(6, Toronto_merged.shape[1]))]]

Unnamed: 0,Postcode,Borough,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,M4K,East Toronto,"The Danforth West, Riverdale",Greek Restaurant,Coffee Shop,Italian Restaurant,Ice Cream Shop,Furniture / Home Store,Yoga Studio,Bubble Tea Shop,Spa,Juice Bar,Cosmetics Shop
2,M4L,East Toronto,"The Beaches West, India Bazaar",Pet Store,Pub,Liquor Store,Burger Joint,Sandwich Place,Fast Food Restaurant,Burrito Place,Fish & Chips Shop,Italian Restaurant,Steakhouse
3,M4M,East Toronto,Studio District,Café,Coffee Shop,Italian Restaurant,Bakery,American Restaurant,Martial Arts Dojo,Fish Market,Bookstore,Latin American Restaurant,Brewery
4,M4N,Central Toronto,Lawrence Park,Dim Sum Restaurant,Park,Swim School,Bus Line,Diner,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant
5,M4P,Central Toronto,Davisville North,Gym,Food & Drink Shop,Park,Convenience Store,Sandwich Place,Breakfast Spot,Clothing Store,Hotel,Doner Restaurant,Donut Shop
6,M4R,Central Toronto,North Toronto West,Clothing Store,Coffee Shop,Sporting Goods Shop,Yoga Studio,Furniture / Home Store,Rental Car Location,Diner,Dessert Shop,Mexican Restaurant,Salon / Barbershop
7,M4S,Central Toronto,Davisville,Sandwich Place,Coffee Shop,Dessert Shop,Sushi Restaurant,Gym,Italian Restaurant,Café,Pizza Place,Fried Chicken Joint,Salon / Barbershop
9,M4V,Central Toronto,"Deer Park, Forest Hill SE, Rathnelly, South Hi...",Light Rail Station,Coffee Shop,Pub,American Restaurant,Supermarket,Sushi Restaurant,Restaurant,Sports Bar,Fried Chicken Joint,Bagel Shop
11,M4X,Downtown Toronto,"Cabbagetown, St. James Town",Coffee Shop,Café,Pub,Italian Restaurant,Bakery,Pizza Place,Market,Restaurant,Liquor Store,Japanese Restaurant
12,M4Y,Downtown Toronto,Church and Wellesley,Coffee Shop,Japanese Restaurant,Sushi Restaurant,Gay Bar,Restaurant,Hotel,Gym,Mediterranean Restaurant,Men's Store,Italian Restaurant


The one thing that stands out in Cluster 1 is **Coffee Shop**, which appears among the most common venues in 7 of the 10 rows shown above.