## Finding the ideal location to open an Indian sweets shop in Toronto (Canada)

This project scrapes the Wikipedia page listing Canadian postal codes, cleans and processes the data, attaches a geolocation to each neighbourhood, uses the Foursquare API to retrieve venue details for Indian restaurants, and then clusters the results. The clustering is carried out with K-means.

### Installing & Importing required libraries

In [1]:
# Install Beautiful Soup for web scraping
!pip install beautifulsoup4

# Install xml parsing library
!pip install lxml

# Install geopy to get Longitude and Latitude
!pip install geopy

# Install Folium to display maps
!pip install folium

import requests # library to handle requests
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation
from IPython.display import display_html # library to display html

from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values
    
from pandas import json_normalize # transforms a JSON file into a pandas dataframe (top-level import in pandas >= 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

import folium # plotting library
from bs4 import BeautifulSoup # web scraping library
from sklearn.cluster import KMeans # clustering library

Collecting folium
  Downloading folium-0.11.0-py2.py3-none-any.whl (93 kB)
Collecting branca>=0.3.0
  Downloading branca-0.4.1-py3-none-any.whl (24 kB)
Installing collected packages: branca, folium
Successfully installed branca-0.4.1 folium-0.11.0


### Scraping the Wikipedia page for the table of postal codes of Canada using BeautifulSoup

In [2]:
page = requests.get('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M').text
soup=BeautifulSoup(page,'lxml')
print(soup.title)
tab = str(soup.table)
display_html(tab,raw=True)

<title>List of postal codes of Canada: M - Wikipedia</title>


Postal Code,Borough,Neighbourhood
M1A,Not assigned,Not assigned
M2A,Not assigned,Not assigned
M3A,North York,Parkwoods
M4A,North York,Victoria Village
M5A,Downtown Toronto,"Regent Park, Harbourfront"
M6A,North York,"Lawrence Manor, Lawrence Heights"
M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"
M8A,Not assigned,Not assigned
M9A,Etobicoke,"Islington Avenue, Humber Valley Village"
M1B,Scarborough,"Malvern, Rouge"


### Converting the HTML table to a pandas DataFrame

In [3]:
df = pd.read_html(tab)
df = df[0]
df.head()

Unnamed: 0,Postal Code,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"


### Data cleaning and pre-processing

In [4]:
# Renaming of columns as per the requirement
df.rename(columns={'Postal Code':'PostalCode'},inplace=True)

# Dropping the rows where Borough is 'Not assigned'
df1 = df[df.Borough != 'Not assigned']

# Combining the neighbourhoods that share the same postal code
df2 = df1.groupby(['PostalCode','Borough'], sort=False).agg(', '.join)
df2.reset_index(inplace=True)

# Replacing the name of the neighbourhoods which are 'Not assigned' with names of Borough
df2['Neighbourhood'] = np.where(df2['Neighbourhood'] == 'Not assigned',df2['Borough'], df2['Neighbourhood'])

df2.shape

(103, 3)
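The three cleaning steps above can be checked on a tiny table before running them on the scraped data. The sketch below uses made-up sample rows (they are illustrative, not taken from the Wikipedia page) and assumes one row per neighbourhood, so that the `groupby` step has something to combine.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows mimicking the scraped table (not the real data)
raw = pd.DataFrame({
    'PostalCode': ['M1A', 'M3A', 'M5A', 'M5A', 'M7A'],
    'Borough': ['Not assigned', 'North York', 'Downtown Toronto',
                'Downtown Toronto', "Queen's Park"],
    'Neighbourhood': ['Not assigned', 'Parkwoods', 'Regent Park',
                      'Harbourfront', 'Not assigned'],
})

# 1. Drop rows that have no borough
assigned = raw[raw['Borough'] != 'Not assigned']

# 2. Join neighbourhoods sharing a postal code into one comma-separated string
grouped = (assigned
           .groupby(['PostalCode', 'Borough'], sort=False)['Neighbourhood']
           .agg(', '.join)
           .reset_index())

# 3. Fall back to the borough name where the neighbourhood is unassigned
grouped['Neighbourhood'] = np.where(grouped['Neighbourhood'] == 'Not assigned',
                                    grouped['Borough'],
                                    grouped['Neighbourhood'])
print(grouped)
```

Selecting the `Neighbourhood` column explicitly before `agg` keeps the join from touching any other non-key column that might be present.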

### Importing the CSV file containing the latitudes and longitudes

In [5]:
df_lat_lon = pd.read_csv('https://cocl.us/Geospatial_data')
df_lat_lon.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


### Merging the two data frames for getting the Latitudes and Longitudes

In [6]:
df_lat_lon.rename(columns={'Postal Code':'PostalCode'},inplace=True)
df3 = pd.merge(df2,df_lat_lon,on='PostalCode')
df3.head()

Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
3,M6A,North York,"Lawrence Manor, Lawrence Heights",43.718518,-79.464763
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494
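`pd.merge` defaults to an inner join, so any postal code without coordinates would be dropped silently. As a hedged sketch, a left join with `indicator=True` makes such rows visible. The frames below are small stand-ins, and the postal code `M9Z` is invented purely to show an unmatched row.

```python
import pandas as pd

# Stand-in frames; 'M9Z' is a made-up code with no coordinates
neighbourhoods = pd.DataFrame({
    'PostalCode': ['M3A', 'M4A', 'M9Z'],
    'Borough': ['North York', 'North York', 'Hypothetical Borough'],
})
coords = pd.DataFrame({
    'PostalCode': ['M3A', 'M4A'],
    'Latitude': [43.753259, 43.725882],
    'Longitude': [-79.329656, -79.315572],
})

# A left join keeps every neighbourhood; '_merge' records where each row matched
merged = pd.merge(neighbourhoods, coords, on='PostalCode',
                  how='left', indicator=True)

# Postal codes present on the left only have no latitude/longitude
missing = merged.loc[merged['_merge'] == 'left_only', 'PostalCode'].tolist()
print(missing)
```

If `missing` is empty, the inner join used in the notebook loses nothing.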


### Selecting the rows whose Borough contains 'Toronto'

In [7]:
df4 = df3[df3['Borough'].str.contains('Toronto',regex=False)]
df4

Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494
9,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937
15,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418
19,M4E,East Toronto,The Beaches,43.676357,-79.293031
20,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306
24,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383
25,M6G,Downtown Toronto,Christie,43.669542,-79.422564
30,M5H,Downtown Toronto,"Richmond, Adelaide, King",43.650571,-79.384568
31,M6H,West Toronto,"Dufferin, Dovercourt Village",43.669005,-79.442259


### Visualizing all the neighbourhoods of the above data frame on a map

In [8]:
map_toronto = folium.Map(location=[43.651070,-79.347015],zoom_start=10)

for lat,lng,borough,neighbourhood in zip(df4['Latitude'],df4['Longitude'],df4['Borough'],df4['Neighbourhood']):
    label = '{}, {}'.format(neighbourhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)
map_toronto

### Get the Indian restaurant venues in Toronto using the Foursquare API

In [9]:
CLIENT_ID = 'EGC0H2KQVE4JGWHDNQEG2XUN1HLGBEDWVIEXW13I2MDM2B0O' # your Foursquare ID
CLIENT_SECRET = '5F3P0QM1Q33YSSOD4IVIUGKGPN5MFCNDFBPDJJ1GQMCT0DMJ' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
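Credentials are usually kept out of a shared notebook. One possible approach, sketched below, reads them from environment variables. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `load_foursquare_credentials` are assumptions made for this example, not part of the Foursquare API.

```python
import os

def load_foursquare_credentials(env=os.environ):
    """Read the client ID and secret from a mapping (the process environment by default)."""
    try:
        return env['FOURSQUARE_CLIENT_ID'], env['FOURSQUARE_CLIENT_SECRET']
    except KeyError as missing:
        # Fail early with a readable message instead of sending an empty credential
        raise RuntimeError('Missing environment variable: {}'.format(missing)) from None

# Demonstration with a stand-in mapping so the example runs without real secrets
demo_env = {
    'FOURSQUARE_CLIENT_ID': 'demo-id',
    'FOURSQUARE_CLIENT_SECRET': 'demo-secret',
}
client_id, client_secret = load_foursquare_credentials(demo_env)
print(client_id)
```

In the notebook, `CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials()` would then pick the values up from the environment in which Jupyter was started.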

In [10]:
# Size of the radius to retrieve venues from FoursquareAPI, and limit of venues per neighbourhood
CONST_venuesRadiusScan = 900
CONST_venuesLimit = 100

def getNearbyVenues(postalCodes, boroughs, neighbourhoods, latitudes, longitudes):
    
    venues_list=[]
    # Loop through each neighbourhood given in parameters
    for postalCode, borough, neighbourhood, lat, lng in zip(postalCodes, boroughs, neighbourhoods, latitudes, longitudes):
            
        # create the API request URL to explore the neighbourhood using FoursquareAPI
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}&query={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            CONST_venuesRadiusScan, 
            CONST_venuesLimit,
            'Indian Restaurants')

        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue : name, latitude, longitude, and the categories' names
        venues_list.append([(
            postalCode,
            borough,
            neighbourhood, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    # add the venues in the dataframe
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = [
                        'PostalCode',
                        'Borough',
                        'Neighborhood', 
                        'Neighborhood Latitude', 
                        'Neighborhood Longitude', 
                        'Venue', 
                        'Venue Latitude', 
                        'Venue Longitude', 
                        'Venue Category'
    ]
    
    return(nearby_venues)
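The function above indexes straight into the response (`["response"]['groups'][0]['items']` and `categories[0]`), which raises an error if a neighbourhood returns no groups or a venue has no category. The sketch below shows one defensive variant under those assumptions. `build_explore_url` and `extract_venues` are hypothetical helper names, and the payload at the end is hand-written to mirror the fields the notebook reads, not a real API response.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = 'https://api.foursquare.com/v2/venues/explore'

def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit, query):
    """Build the request URL, letting urlencode escape spaces and commas."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
        'query': query,
    }
    return EXPLORE_ENDPOINT + '?' + urlencode(params)

def extract_venues(payload):
    """Return (name, lat, lng, category) tuples, tolerating missing groups or categories."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or [{}]
        rows.append((venue['name'],
                     venue['location']['lat'],
                     venue['location']['lng'],
                     categories[0].get('name', 'Unknown')))
    return rows

# Hand-written payload with the same shape as the fields used above
fake_payload = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Sample Sweets',
               'location': {'lat': 43.65, 'lng': -79.38},
               'categories': []}}]}]}}

url = build_explore_url('id', 'secret', '20180605', 43.65, -79.38, 900, 100, 'Indian Restaurants')
print(extract_venues(fake_payload))
print(extract_venues({}))
```

An empty or error payload yields an empty list, so the loop over neighbourhoods can continue instead of stopping at the first sparse area.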

In [11]:
df5 = getNearbyVenues(
    postalCodes=df4['PostalCode'],
    boroughs=df4['Borough'],
    neighbourhoods=df4['Neighbourhood'],
    latitudes=df4['Latitude'],
    longitudes=df4['Longitude']
)

In [12]:
df5.shape

(216, 9)

In [13]:
df5["Venue Category"].value_counts()

Indian Restaurant            202
North Indian Restaurant        9
Italian Restaurant             2
Sri Lankan Restaurant          1
Pakistani Restaurant           1
Indian Chinese Restaurant      1
Name: Venue Category, dtype: int64

### Analyse each neighborhood: one-hot encode each venue type so that K-means clustering can be run on the dataframe

In [14]:
# One Hot Encoding
df6 = pd.get_dummies(df5[['Venue Category']], prefix="", prefix_sep="")

# add postalCode, borough, and neighborhood column back to dataframe
df6.insert(loc=0, column='PostalCode', value=df5['PostalCode'])
df6.insert(loc=1, column='Borough', value=df5['Borough'])
df6.insert(loc=2, column='Neighborhood', value=df5['Neighborhood'])

df6.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Indian Chinese Restaurant,Indian Restaurant,Italian Restaurant,North Indian Restaurant,Pakistani Restaurant,Sri Lankan Restaurant
0,M5A,Downtown Toronto,"Regent Park, Harbourfront",0,1,0,0,0,0
1,M5A,Downtown Toronto,"Regent Park, Harbourfront",0,1,0,0,0,0
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",0,1,0,0,0,0
3,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",0,1,0,0,0,0
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",0,1,0,0,0,0


### Group rows by postal code, borough and neighbourhood, using means

In [15]:
df7 = df6.groupby(['PostalCode','Borough', 'Neighborhood']).mean().reset_index()
df7.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Indian Chinese Restaurant,Indian Restaurant,Italian Restaurant,North Indian Restaurant,Pakistani Restaurant,Sri Lankan Restaurant
0,M4E,East Toronto,The Beaches,0.0,1.0,0.0,0.0,0.0,0.0
1,M4K,East Toronto,"The Danforth West, Riverdale",0.0,1.0,0.0,0.0,0.0,0.0
2,M4L,East Toronto,"India Bazaar, The Beaches West",0.071429,0.857143,0.0,0.0,0.071429,0.0
3,M4M,East Toronto,Studio District,0.0,1.0,0.0,0.0,0.0,0.0
4,M4P,Central Toronto,Davisville North,0.0,0.666667,0.333333,0.0,0.0,0.0
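Taking the mean of one-hot columns per group gives the share of each category among that group's venues. The small sketch below reproduces the idea on four made-up venue rows, chosen so that the `M4P` group has two Indian restaurants and one Italian restaurant.

```python
import pandas as pd

# Made-up venue rows: three venues in M4P, one in M4E
venues = pd.DataFrame({
    'PostalCode': ['M4P', 'M4P', 'M4P', 'M4E'],
    'Venue Category': ['Indian Restaurant', 'Indian Restaurant',
                       'Italian Restaurant', 'Indian Restaurant'],
})

# One column per category, 1.0 where the venue belongs to it
onehot = pd.get_dummies(venues['Venue Category'], dtype=float)
onehot.insert(0, 'PostalCode', venues['PostalCode'])

# The mean of a 0/1 column is the fraction of venues in that category
freq = onehot.groupby('PostalCode').mean()
print(freq.round(6))
```

Each row of `freq` sums to 1, which is why the values in the grouped table can be read as category frequencies.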


### Print each neighborhood along with the top 5 most common venues

In [16]:
num_top_venues = 5
for index, row in df7.iterrows():
    tempPostalCode = row['PostalCode']
    tempBorough = row['Borough']
    tempNeighborhood = row['Neighborhood']
    print("----"+tempPostalCode + " / " + tempBorough + " / " + tempNeighborhood +"----")
    temp = df7[(df7.PostalCode == tempPostalCode) & (df7.Borough == tempBorough) & (df7.Neighborhood == tempNeighborhood)].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[3:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----M4E / East Toronto / The Beaches----
                       venue  freq
0          Indian Restaurant   1.0
1  Indian Chinese Restaurant   0.0
2         Italian Restaurant   0.0
3    North Indian Restaurant   0.0
4       Pakistani Restaurant   0.0


----M4K / East Toronto / The Danforth West, Riverdale----
                       venue  freq
0          Indian Restaurant   1.0
1  Indian Chinese Restaurant   0.0
2         Italian Restaurant   0.0
3    North Indian Restaurant   0.0
4       Pakistani Restaurant   0.0


----M4L / East Toronto / India Bazaar, The Beaches West----
                       venue  freq
0          Indian Restaurant  0.86
1  Indian Chinese Restaurant  0.07
2       Pakistani Restaurant  0.07
3         Italian Restaurant  0.00
4    North Indian Restaurant  0.00


----M4M / East Toronto / Studio District----
                       venue  freq
0          Indian Restaurant   1.0
1  Indian Chinese Restaurant   0.0
2         Italian Restaurant   0.0
3    North Indian Re

### Put the results into a pandas dataframe
### Now let's create the new dataframe and display the top 6 venue categories for each neighborhood

In [17]:
# Function to sort the venues in descending order.

def return_most_common_venues(row, num_top_venues):
    # Remove the key PostalCode x Borough x Neighbourhood from the row
    row_categories = row.iloc[3:]
    
    # Sort descending (highest frequency first)
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    # Return the top num_top_venues
    return row_categories_sorted.index.values[0:num_top_venues]
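As a possible alternative to sorting the whole row, `Series.nlargest` returns the top entries directly. It needs a numeric dtype, so the sketch below casts the category values to float first. `top_categories` is a hypothetical name, and the sample row reuses the `M4L` frequencies shown in the grouped table above.

```python
import pandas as pd

def top_categories(row, num_top_venues, key_columns=3):
    """Return the names of the most frequent categories in one grouped row."""
    # Skip the key columns (PostalCode, Borough, Neighborhood) and force a numeric dtype
    freqs = row.iloc[key_columns:].astype(float)
    return freqs.nlargest(num_top_venues).index.tolist()

# Row modelled on the M4L entry of the grouped table
row = pd.Series({
    'PostalCode': 'M4L',
    'Borough': 'East Toronto',
    'Neighborhood': 'India Bazaar, The Beaches West',
    'Indian Chinese Restaurant': 0.071429,
    'Indian Restaurant': 0.857143,
    'Italian Restaurant': 0.0,
    'North Indian Restaurant': 0.0,
    'Pakistani Restaurant': 0.071429,
    'Sri Lankan Restaurant': 0.0,
})

top3 = top_categories(row, 3)
print(top3)
```

Note that many categories tie at 0.0 in this data, so the lower ranks of any "most common" list carry little information whichever method is used.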

In [18]:

num_top_venues = 6
indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['PostalCode', 'Borough', 'Neighborhood']

for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # positions beyond the 3rd use the generic 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe, and set it with the columns names
df9 = pd.DataFrame(columns=columns)

# add the keys from the grouped dataframe (Postal code x Borough x Neighborhood)
df9['PostalCode'] = df7['PostalCode']
df9['Borough'] = df7['Borough']
df9['Neighborhood'] = df7['Neighborhood']

# loop through each row
for ind in np.arange(df7.shape[0]):
    df9.iloc[ind, 3:] = return_most_common_venues(df7.iloc[ind, :], num_top_venues)

#return_most_common_venues(df7.iloc[1, :], num_top_venues)
#df9.iloc[1, 3:]
#df7.iloc[1, :]
df9.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue
0,M4E,East Toronto,The Beaches,Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,North Indian Restaurant,Italian Restaurant,Indian Chinese Restaurant
1,M4K,East Toronto,"The Danforth West, Riverdale",Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,North Indian Restaurant,Italian Restaurant,Indian Chinese Restaurant
2,M4L,East Toronto,"India Bazaar, The Beaches West",Indian Restaurant,Pakistani Restaurant,Indian Chinese Restaurant,Sri Lankan Restaurant,North Indian Restaurant,Italian Restaurant
3,M4M,East Toronto,Studio District,Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,North Indian Restaurant,Italian Restaurant,Indian Chinese Restaurant
4,M4P,Central Toronto,Davisville North,Indian Restaurant,Italian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,North Indian Restaurant,Indian Chinese Restaurant


### K-Means Clustering Approach

In [19]:
# set number of clusters
kclusters = 5

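# keep only the numeric frequency columns for clustering
# (the positional axis argument to drop() is not accepted by recent pandas versions)
df8 = df7.drop(columns=['Neighborhood', 'PostalCode', 'Borough'])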

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(df8)

# check cluster labels generated for each row in the dataframe
kmeans.labels_

array([0, 0, 3, 0, 2, 1, 0, 0, 0, 0, 0, 3, 0, 0, 0, 3, 3, 0, 3, 3, 3, 0,
       0, 3, 3, 0, 0, 0, 4, 0, 0, 3], dtype=int32)
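The notebook fixes `kclusters = 5` without showing how that value was chosen. One common way to compare candidate values is the silhouette score; the sketch below illustrates it on synthetic frequency-like vectors generated around three made-up centres. The data, the centres, and the range of `k` are all assumptions for illustration, not the notebook's `df8`.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic data: 10 points scattered tightly around each of three made-up centres
rng = np.random.default_rng(0)
centres = np.array([[1.0, 0.0, 0.0],
                    [0.5, 0.5, 0.0],
                    [0.0, 0.0, 1.0]])
X = np.vstack([c + rng.normal(scale=0.02, size=(10, 3)) for c in centres])

# Fit K-means for several k and record the mean silhouette score of each fit
scores = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# A higher silhouette score means tighter, better separated clusters
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

Running the same loop on `df8` would indicate whether 5 clusters is well supported, keeping in mind that three of the five clusters found above contain a single neighbourhood.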

In [20]:
df9.insert(0, 'Cluster Labels', kmeans.labels_)
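Tallying the labels gives a quick view of how the neighbourhoods are spread across clusters. The sketch below copies the `kmeans.labels_` array printed above into a plain list and counts it with the standard library.

```python
from collections import Counter

# Labels copied from the kmeans.labels_ output shown above
labels = [0, 0, 3, 0, 2, 1, 0, 0, 0, 0, 0, 3, 0, 0, 0, 3, 3, 0, 3, 3, 3, 0,
          0, 3, 3, 0, 0, 0, 4, 0, 0, 3]

# Count how many neighbourhoods fall into each cluster
cluster_sizes = Counter(labels)
for cluster, size in sorted(cluster_sizes.items()):
    print('Cluster {}: {} neighbourhoods'.format(cluster, size))
```

Inside the notebook, `df9['Cluster Labels'].value_counts()` would give the same tally without copying the array.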

In [21]:
df9.head()

Unnamed: 0,Cluster Labels,PostalCode,Borough,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue
0,0,M4E,East Toronto,The Beaches,Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,North Indian Restaurant,Italian Restaurant,Indian Chinese Restaurant
1,0,M4K,East Toronto,"The Danforth West, Riverdale",Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,North Indian Restaurant,Italian Restaurant,Indian Chinese Restaurant
2,3,M4L,East Toronto,"India Bazaar, The Beaches West",Indian Restaurant,Pakistani Restaurant,Indian Chinese Restaurant,Sri Lankan Restaurant,North Indian Restaurant,Italian Restaurant
3,0,M4M,East Toronto,Studio District,Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,North Indian Restaurant,Italian Restaurant,Indian Chinese Restaurant
4,2,M4P,Central Toronto,Davisville North,Indian Restaurant,Italian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,North Indian Restaurant,Indian Chinese Restaurant


In [22]:
# merge the grouped data frame with the neighbourhoods data to add latitude/longitude for each neighborhood

df10 = df4
df10 = df10.join(df9.set_index(['PostalCode','Borough', 'Neighborhood']), on=['PostalCode','Borough', 'Neighbourhood'])
df10 = df10.dropna()
df10.shape

(32, 12)

### Display the map

In [23]:
map_clusters = folium.Map(location=[43.651070,-79.347015], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, postalCode, borough, neighborhood, cluster in zip(df10['Latitude'], df10['Longitude'], df10['PostalCode'], df10['Borough'], df10['Neighbourhood'], df10['Cluster Labels']):
    label = folium.Popup(str(postalCode) + ' - Cluster ' + str(cluster), parse_html=True)
    cluster = int(cluster)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
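The colour scheme only needs one distinct hex colour per cluster. As a sketch of a dependency-free alternative to the matplotlib colormap, the standard-library `colorsys` module can generate evenly spaced hues. `cluster_palette` is a hypothetical helper, and the saturation and brightness values are arbitrary choices.

```python
import colorsys

def cluster_palette(n_clusters, saturation=0.9, value=0.9):
    """Return n_clusters hex colours with evenly spaced hues."""
    palette = []
    for i in range(n_clusters):
        # Hue runs over [0, 1); equal steps keep the colours visually distinct
        r, g, b = colorsys.hsv_to_rgb(i / n_clusters, saturation, value)
        palette.append('#{:02x}{:02x}{:02x}'.format(
            round(r * 255), round(g * 255), round(b * 255)))
    return palette

palette = cluster_palette(5)
print(palette)
```

With such a list, `palette[cluster]` maps cluster label 0 to the first colour, label 1 to the second, and so on.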

### Examining the cluster output

From the tables below, it is clear that the neighborhoods in cluster 3 and cluster 0 are the most favourable places to open an Indian sweets shop.

In [24]:
df10[df10['Cluster Labels'] == 0].head(10)

Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue
2,M5A,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636,0.0,Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,North Indian Restaurant,Italian Restaurant,Indian Chinese Restaurant
9,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937,0.0,Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,North Indian Restaurant,Italian Restaurant,Indian Chinese Restaurant
15,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418,0.0,Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,North Indian Restaurant,Italian Restaurant,Indian Chinese Restaurant
19,M4E,East Toronto,The Beaches,43.676357,-79.293031,0.0,Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,North Indian Restaurant,Italian Restaurant,Indian Chinese Restaurant
25,M6G,Downtown Toronto,Christie,43.669542,-79.422564,0.0,Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,North Indian Restaurant,Italian Restaurant,Indian Chinese Restaurant
30,M5H,Downtown Toronto,"Richmond, Adelaide, King",43.650571,-79.384568,0.0,Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,North Indian Restaurant,Italian Restaurant,Indian Chinese Restaurant
37,M6J,West Toronto,"Little Portugal, Trinity",43.647927,-79.41975,0.0,Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,North Indian Restaurant,Italian Restaurant,Indian Chinese Restaurant
41,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188,0.0,Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,North Indian Restaurant,Italian Restaurant,Indian Chinese Restaurant
54,M4M,East Toronto,Studio District,43.659526,-79.340923,0.0,Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,North Indian Restaurant,Italian Restaurant,Indian Chinese Restaurant
69,M6P,West Toronto,"High Park, The Junction South",43.661608,-79.464763,0.0,Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,North Indian Restaurant,Italian Restaurant,Indian Chinese Restaurant


In [25]:
df10[df10['Cluster Labels'] == 1].head(10)

Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue
73,M4R,Central Toronto,"North Toronto West, Lawrence Park",43.715383,-79.405678,1.0,Italian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,North Indian Restaurant,Indian Restaurant,Indian Chinese Restaurant


In [26]:
df10[df10['Cluster Labels'] == 2].head(10)

Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue
67,M4P,Central Toronto,Davisville North,43.712751,-79.390197,2.0,Indian Restaurant,Italian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,North Indian Restaurant,Indian Chinese Restaurant


In [27]:
df10[df10['Cluster Labels'] == 3].head(10)

Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government",43.662301,-79.389494,3.0,Indian Restaurant,North Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,Italian Restaurant,Indian Chinese Restaurant
20,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306,3.0,Indian Restaurant,North Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,Italian Restaurant,Indian Chinese Restaurant
24,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383,3.0,Indian Restaurant,North Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,Italian Restaurant,Indian Chinese Restaurant
36,M5J,Downtown Toronto,"Harbourfront East, Union Station, Toronto Islands",43.640816,-79.381752,3.0,Indian Restaurant,North Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,Italian Restaurant,Indian Chinese Restaurant
42,M5K,Downtown Toronto,"Toronto Dominion Centre, Design Exchange",43.647177,-79.381576,3.0,Indian Restaurant,North Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,Italian Restaurant,Indian Chinese Restaurant
47,M4L,East Toronto,"India Bazaar, The Beaches West",43.668999,-79.315572,3.0,Indian Restaurant,Pakistani Restaurant,Indian Chinese Restaurant,Sri Lankan Restaurant,North Indian Restaurant,Italian Restaurant
48,M5L,Downtown Toronto,"Commerce Court, Victoria Hotel",43.648198,-79.379817,3.0,Indian Restaurant,North Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,Italian Restaurant,Indian Chinese Restaurant
84,M5T,Downtown Toronto,"Kensington Market, Chinatown, Grange Park",43.653206,-79.400049,3.0,Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,North Indian Restaurant,Italian Restaurant,Indian Chinese Restaurant
92,M5W,Downtown Toronto,Stn A PO Boxes,43.646435,-79.374846,3.0,Indian Restaurant,North Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,Italian Restaurant,Indian Chinese Restaurant
99,M4Y,Downtown Toronto,Church and Wellesley,43.66586,-79.38316,3.0,Indian Restaurant,North Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,Italian Restaurant,Indian Chinese Restaurant


In [28]:
df10[df10['Cluster Labels'] == 4].head(10)

Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue
43,M6K,West Toronto,"Brockton, Parkdale Village, Exhibition Place",43.636847,-79.428191,4.0,Indian Restaurant,North Indian Restaurant,Sri Lankan Restaurant,Pakistani Restaurant,Italian Restaurant,Indian Chinese Restaurant
