### This notebook was created for the Coursera IBM Capstone project "Explore and cluster Neighborhoods of Toronto"

###### For this study we use the list of boroughs and neighborhoods from Wikipedia: https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M
We load the page with the get function of the requests library and scrape the table with the BeautifulSoup package.

In [5]:
import requests
from bs4 import BeautifulSoup
import pandas as pd

In [7]:
url = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'
html = requests.get(url).content

In [8]:
soup = BeautifulSoup(html, 'lxml')
table = soup.find('table', class_='wikitable sortable')
rows = table.find_all('tr')
new_list = []
for tr in rows:
    td = tr.find_all('td')
    row = [i.text.strip() for i in td]
    # header rows contain no <td> cells, so they are skipped here
    if len(row) > 2:
        new_list.append(row)

In [9]:
toronto_df = pd.DataFrame(new_list, columns =['PostalCode', 'Borough', 'Neighborhood'])
toronto_df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront
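
The row filter used above can be tried offline on a small inline table. The sketch below is not the notebook's method: it swaps BeautifulSoup for the standard library's `html.parser` so that it runs without extra packages. The `SAMPLE_HTML` string and the `CellCollector` class are made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical miniature of the Wikipedia table (not the real page)
SAMPLE_HTML = """
<table class="wikitable sortable">
<tr><th>Postcode</th><th>Borough</th><th>Neighbourhood</th></tr>
<tr><td>M1A</td><td>Not assigned</td><td>Not assigned</td></tr>
<tr><td>M3A</td><td>North York</td><td>Parkwoods</td></tr>
</table>
"""


class CellCollector(HTMLParser):
    """Collect the stripped text of every <td> cell, one list per <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_td = False

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self._row = []
        elif tag == 'td' and self._row is not None:
            self._in_td = True
            self._row.append('')

    def handle_data(self, data):
        if self._in_td:
            self._row[-1] += data

    def handle_endtag(self, tag):
        if tag == 'td':
            self._in_td = False
        elif tag == 'tr' and self._row is not None:
            self.rows.append([cell.strip() for cell in self._row])
            self._row = None


parser = CellCollector()
parser.feed(SAMPLE_HTML)

# Same filter as the notebook: the header row has no <td> cells and is skipped
sample_rows = [row for row in parser.rows if len(row) > 2]
print(sample_rows)
```

The header row yields an empty list because it only contains `<th>` cells, which is why the `len(row) > 2` check is enough to drop it.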


###### We analyze only the cells that have an assigned borough, so we drop the rows whose borough is Not assigned.

In [10]:
toronto_df.set_index(['Borough'], inplace=True)
toronto_df.drop(['Not assigned'], axis=0, inplace=True)
toronto_df.head()

Borough,PostalCode,Neighborhood
North York,M3A,Parkwoods
North York,M4A,Victoria Village
Downtown Toronto,M5A,Harbourfront
Downtown Toronto,M5A,Regent Park
North York,M6A,Lawrence Heights
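
An equivalent way to drop the unassigned boroughs is a boolean mask, which leaves the default integer index in place. A minimal sketch on a made-up three-row frame (`sample_df` is illustrative, not the notebook's data):

```python
import pandas as pd

sample_df = pd.DataFrame({
    'PostalCode': ['M1A', 'M3A', 'M5A'],
    'Borough': ['Not assigned', 'North York', 'Downtown Toronto'],
    'Neighborhood': ['Not assigned', 'Parkwoods', 'Harbourfront'],
})

# keep only the rows whose borough is assigned
assigned_df = sample_df[sample_df['Borough'] != 'Not assigned'].reset_index(drop=True)
print(assigned_df)
```

With this form there is no need to move Borough into the index and back out again before the later groupby step.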


###### More than one neighborhood can exist in one postal code area, so we join these neighborhoods into one row of the dataframe. To do this we set a multilevel index of Borough and PostalCode, apply groupby, and use a lambda function to join the neighborhoods into a single comma-separated value.

In [11]:
# show rows that repeat an earlier postal code
duplicated = toronto_df[toronto_df.duplicated(['PostalCode'])]
duplicated.head()


Borough,PostalCode,Neighborhood
Downtown Toronto,M5A,Regent Park
North York,M6A,Lawrence Manor
Scarborough,M1B,Malvern
East York,M4B,Parkview Hill
Downtown Toronto,M5B,Garden District


In [12]:
toronto_df.reset_index(inplace=True)
toronto_df.set_index(['Borough','PostalCode'], inplace=True)
toronto = toronto_df.groupby(['Borough', 'PostalCode']).agg(lambda x: ', '.join(set(x)))
toronto.head()

Borough,PostalCode,Neighborhood
Central Toronto,M4N,Lawrence Park
Central Toronto,M4P,Davisville North
Central Toronto,M4R,North Toronto West
Central Toronto,M4S,Davisville
Central Toronto,M4T,"Moore Park, Summerhill East"
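
The grouping step can be illustrated on a tiny made-up frame. Because the iteration order of a `set` is not guaranteed, this sketch sorts the names before joining so the result is reproducible. The sorting is an added assumption, not something the notebook does, and `sample_df` is illustrative data.

```python
import pandas as pd

sample_df = pd.DataFrame({
    'PostalCode': ['M5A', 'M5A', 'M4N'],
    'Borough': ['Downtown Toronto', 'Downtown Toronto', 'Central Toronto'],
    'Neighborhood': ['Regent Park', 'Harbourfront', 'Lawrence Park'],
})

# sorted(set(...)) removes duplicates and fixes the order of the joined names
joined_df = (
    sample_df.groupby(['Borough', 'PostalCode'])['Neighborhood']
    .agg(lambda names: ', '.join(sorted(set(names))))
    .reset_index()
)
print(joined_df)
```

The two M5A rows collapse into a single row, and the groups come back ordered by Borough and then PostalCode.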


###### If a cell has a borough but its neighborhood is Not assigned, then the neighborhood is set to the same value as the borough.

In [13]:
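toronto.reset_index(inplace=True)
# use .loc with a boolean mask to avoid chained assignment
not_assigned = toronto['Neighborhood'] == 'Not assigned'
toronto.loc[not_assigned, 'Neighborhood'] = toronto.loc[not_assigned, 'Borough']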


###### To check the number of rows in the dataframe after cleaning, we use the shape attribute.

In [14]:
toronto.shape

(103, 3)

##### The dataframe with the postal code, borough name and neighborhood name of each neighborhood has been built.

##### To use the Foursquare location data, we need the latitude and longitude coordinates of each neighborhood. For this we load geoloc.csv into a separate dataframe and then merge the two dataframes on their postal code columns ('PostalCode' and 'Postal Code').

In [15]:
!wget -q -O geoloc.csv https://cocl.us/Geospatial_data

geoloc_df=pd.read_csv('geoloc.csv')
geoloc_df.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [16]:
geo_torontdo_df=pd.merge(toronto, geoloc_df, left_on=['PostalCode'], right_on=['Postal Code'])
geo_torontdo_df.shape

(103, 6)

In [17]:
geo_torontdo_df.head()

Unnamed: 0,Borough,PostalCode,Neighborhood,Postal Code,Latitude,Longitude
0,Central Toronto,M4N,Lawrence Park,M4N,43.72802,-79.38879
1,Central Toronto,M4P,Davisville North,M4P,43.712751,-79.390197
2,Central Toronto,M4R,North Toronto West,M4R,43.715383,-79.405678
3,Central Toronto,M4S,Davisville,M4S,43.704324,-79.38879
4,Central Toronto,M4T,"Moore Park, Summerhill East",M4T,43.689574,-79.38316
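
The merged frame above keeps both key columns (PostalCode and Postal Code), which hold identical values. A minimal sketch of the same merge on two made-up rows shows one way to check the keys and drop the duplicate column. The frames `left_df` and `right_df` are illustrative, and the `validate` argument is an optional safety check that the notebook does not use.

```python
import pandas as pd

left_df = pd.DataFrame({'PostalCode': ['M4N', 'M4P'],
                        'Neighborhood': ['Lawrence Park', 'Davisville North']})
right_df = pd.DataFrame({'Postal Code': ['M4N', 'M4P'],
                         'Latitude': [43.72802, 43.712751],
                         'Longitude': [-79.38879, -79.390197]})

# validate='one_to_one' raises MergeError if a postal code repeats on either side
merged_df = pd.merge(left_df, right_df, left_on='PostalCode',
                     right_on='Postal Code', validate='one_to_one')

# the two key columns hold the same values, so one of them can be dropped
merged_df = merged_df.drop(columns='Postal Code')
print(merged_df.columns.tolist())
```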


##### The dataframe containing the geographical coordinates of each neighborhood has been built.

In [18]:
print('The dataframe has {} boroughs and {} neighborhoods.'.format(
        len(geo_torontdo_df['Borough'].unique()),
        geo_torontdo_df.shape[0]
    )
)

The dataframe has 11 boroughs and 103 neighborhoods.


##### Explore and cluster the neighborhoods in Toronto.

###### Using the geopy library we get the latitude and longitude of Toronto, then generate a map with the folium library and superimpose the neighborhood labels on it.

In [19]:
from geopy.geocoders import Nominatim

address = 'Toronto, ON'

geolocator = Nominatim(user_agent = "toronto_explorer", timeout=None)
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print ('The geographical coordinates of Toronto are {},{}'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.653963,-79.387207


In [20]:
import folium
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)
for lat, lng, borough, neighborhood in zip(geo_torontdo_df['Latitude'], geo_torontdo_df['Longitude'],
                                           geo_torontdo_df['Borough'], geo_torontdo_df['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='red',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)

In [21]:
map_toronto

In [23]:
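CLIENT_ID = 'your-foursquare-client-id'  # replace with your Foursquare client ID
CLIENT_SECRET = 'your-foursquare-client-secret'  # replace with your Foursquare client secret
VERSION = '20180605'  # Foursquare API version
LIMIT = 100  # maximum number of venues per request
radius = 500  # search radius in meters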

#### Function that explores venues using Foursquare and extracts the venue categories

In [24]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
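
The one-line extraction in getNearbyVenues raises an exception if the response has no groups or if a venue has an empty category list. Below is a minimal sketch of a more defensive extraction, run on a hand-written dict. The helper `extract_venues` and the `sample_response` data are hypothetical, and the response shape is inferred only from the keys the function above indexes, not from the Foursquare documentation.

```python
def extract_venues(response_json):
    """Return (name, lat, lng, category) tuples from an explore-style response dict."""
    groups = response_json.get('response', {}).get('groups', [])
    if not groups:
        return []
    venues = []
    for item in groups[0].get('items', []):
        venue = item['venue']
        categories = venue.get('categories') or []
        # fall back to None when a venue carries no category
        category = categories[0]['name'] if categories else None
        venues.append((venue['name'],
                       venue['location']['lat'],
                       venue['location']['lng'],
                       category))
    return venues


# Hypothetical response, shaped after the keys getNearbyVenues indexes
sample_response = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Lawrence Park Ravine',
               'location': {'lat': 43.726963, 'lng': -79.394382},
               'categories': [{'name': 'Park'}]}},
    {'venue': {'name': 'Unnamed Spot',
               'location': {'lat': 43.7, 'lng': -79.4},
               'categories': []}},
]}]}}

print(extract_venues(sample_response))
print(extract_venues({'meta': {'code': 400}}))
```

A response without a groups list gives an empty list instead of a KeyError, so a loop over many neighborhoods can continue past a failed request.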

###### We will explore only boroughs that contain the word Toronto


In [25]:
toronto_data = geo_torontdo_df[geo_torontdo_df['Borough'].str.contains('Toronto')].reset_index(drop=True)
toronto_data['Neighborhood'].count()

38

In [26]:
toronto_data["Borough"].unique()

array(['Central Toronto', 'Downtown Toronto', 'East Toronto',
       'West Toronto'], dtype=object)

In [27]:
# there are 38 neighborhoods in these boroughs
print ('There are {} neighborhoods in {} boroughs'.format(toronto_data['Neighborhood'].count(), toronto_data["Borough"].unique()))

There are 38 neighborhoods in ['Central Toronto' 'Downtown Toronto' 'East Toronto' 'West Toronto'] boroughs


In [28]:
# apply the getNearbyVenues function to the neighborhoods
toronto_venues = getNearbyVenues(names=toronto_data['Neighborhood'],
                                   latitudes=toronto_data['Latitude'],
                                   longitudes=toronto_data['Longitude']
                                  )

Lawrence Park
Davisville North
North Toronto West
Davisville
Moore Park, Summerhill East
South Hill, Summerhill West, Deer Park, Rathnelly, Forest Hill SE
Roselawn
Forest Hill West, Forest Hill North
North Midtown, The Annex, Yorkville
Rosedale
St. James Town, Cabbagetown
Church and Wellesley
Harbourfront, Regent Park
Garden District, Ryerson
St. James Town
Berczy Park
Central Bay Street
Richmond, King, Adelaide
Toronto Islands, Union Station, Harbourfront East
Design Exchange, Toronto Dominion Centre
Commerce Court, Victoria Hotel
Harbord, University of Toronto
Kensington Market, Chinatown, Grange Park
Bathurst Quay, Island airport, CN Tower, King and Spadina, Railway Lands, Harbourfront West, South Niagara
Stn A PO Boxes 25 The Esplanade
First Canadian Place, Underground city
Christie
The Beaches
Riverdale, The Danforth West
India Bazaar, The Beaches West
Studio District
Business Reply Mail Processing Centre 969 Eastern
Dovercourt Village, Dufferin
Little Portugal, Trinity
Exhibition

In [29]:
toronto_venues.shape

(1680, 7)

In [30]:
toronto_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Lawrence Park,43.72802,-79.38879,Lawrence Park Ravine,43.726963,-79.394382,Park
1,Lawrence Park,43.72802,-79.38879,Zodiac Swim School,43.728532,-79.38286,Swim School
2,Lawrence Park,43.72802,-79.38879,TTC Bus #162 - Lawrence-Donway,43.728026,-79.382805,Bus Line
3,Davisville North,43.712751,-79.390197,Homeway Restaurant & Brunch,43.712641,-79.391557,Breakfast Spot
4,Davisville North,43.712751,-79.390197,Summerhill Market North,43.715499,-79.392881,Food & Drink Shop


###### Check how many venues were returned for each neighborhood

In [31]:
toronto_venues.groupby('Neighborhood')['Venue'].count()

Neighborhood
Bathurst Quay, Island airport, CN Tower, King and Spadina, Railway Lands, Harbourfront West, South Niagara     15
Berczy Park                                                                                                    55
Business Reply Mail Processing Centre 969 Eastern                                                              15
Central Bay Street                                                                                             79
Christie                                                                                                       17
Church and Wellesley                                                                                           86
Commerce Court, Victoria Hotel                                                                                100
Davisville                                                                                                     32
Davisville North                                                           

In [32]:
print('There are {} unique categories.'.format(len(toronto_venues['Venue Category'].unique())))

There are 232 unique categories.


###### To analyze each neighborhood we use get_dummies to turn the Venue Category column into a separate column for each category, containing a 0 or 1 value.

In [33]:
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix='', prefix_sep='')
toronto_onehot.shape

(1680, 232)

In [34]:
# the new dataframe has 232 columns, one for each category

In [35]:
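# add the neighborhood column back to the new dataframe
toronto_onehot['Neighborhood'] = toronto_venues['Neighborhood']

# move the last column to the front (this is 'Neighborhood' only if the assignment above appended a new column instead of overwriting an existing one)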

fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.columns[0:10]

Index(['Yoga Studio', 'Afghan Restaurant', 'Airport', 'Airport Food Court',
       'Airport Gate', 'Airport Lounge', 'Airport Service', 'Airport Terminal',
       'American Restaurant', 'Antique Shop'],
      dtype='object')

###### Group rows by neighborhood and take the mean of the frequency of occurrence of each category

In [36]:
toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()
toronto_grouped

Unnamed: 0,Neighborhood,Yoga Studio,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,...,Theater,Theme Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wings Joint
0,"Bathurst Quay, Island airport, CN Tower, King ...",0.0,0.0,0.066667,0.066667,0.066667,0.133333,0.133333,0.066667,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.018182,0.0,0.0,0.0,0.0
2,Business Reply Mail Processing Centre 969 Eastern,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Central Bay Street,0.012658,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012658,...,0.0,0.0,0.0,0.0,0.0,0.012658,0.0,0.0,0.012658,0.0
4,Christie,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Church and Wellesley,0.011628,0.011628,0.0,0.0,0.0,0.0,0.0,0.0,0.011628,...,0.011628,0.011628,0.0,0.0,0.0,0.0,0.011628,0.011628,0.0,0.011628
6,"Commerce Court, Victoria Hotel",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,...,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0
7,Davisville,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Davisville North,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,"Design Exchange, Toronto Dominion Centre",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,...,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.0
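
To see why the mean of the one-hot columns gives a frequency, here is a minimal sketch on four made-up venues. The `sample_venues` frame is illustrative, and `insert(0, ...)` is used here as one way to place the Neighborhood column first.

```python
import pandas as pd

sample_venues = pd.DataFrame({
    'Neighborhood': ['Christie', 'Christie', 'Christie', 'Rosedale'],
    'Venue Category': ['Grocery Store', 'Grocery Store', 'Park', 'Park'],
})

# one indicator column per category
sample_onehot = pd.get_dummies(sample_venues['Venue Category'], dtype=float)
sample_onehot.insert(0, 'Neighborhood', sample_venues['Neighborhood'])

# the mean of a 0/1 column is the share of venues in that category
sample_grouped = sample_onehot.groupby('Neighborhood').mean().reset_index()
print(sample_grouped)
```

Christie has two grocery stores out of three venues, so its Grocery Store frequency is 2/3 and its Park frequency is 1/3. The frequencies in each row sum to 1.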


###### Print each neighborhood along with its top 5 most common venues

In [37]:
num_top_venues = 5

for hood in toronto_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = toronto_grouped[toronto_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Bathurst Quay, Island airport, CN Tower, King and Spadina, Railway Lands, Harbourfront West, South Niagara----
              venue  freq
0    Airport Lounge  0.13
1   Airport Service  0.13
2  Sculpture Garden  0.07
3             Plane  0.07
4          Boutique  0.07


----Berczy Park----
                venue  freq
0         Coffee Shop  0.07
1  Seafood Restaurant  0.04
2            Beer Bar  0.04
3                Café  0.04
4         Cheese Shop  0.04


----Business Reply Mail Processing Centre 969 Eastern----
            venue  freq
0      Comic Shop  0.07
1   Auto Workshop  0.07
2            Park  0.07
3      Restaurant  0.07
4  Farmers Market  0.07


----Central Bay Street----
                venue  freq
0         Coffee Shop  0.14
1      Ice Cream Shop  0.05
2  Italian Restaurant  0.05
3                Café  0.05
4        Burger Joint  0.04


----Christie----
           venue  freq
0           Café  0.18
1  Grocery Store  0.18
2           Park  0.12
3     Restaurant  0.06
4   

In [38]:
# function to sort the venues in descending order

def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
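
The sorting logic of this function can be checked on a single hand-made row. In the sketch below, `sample_row` is hypothetical data in the same layout as a row of toronto_grouped. The `astype(float)` call is an addition: a Series that mixes the neighborhood label with numbers has object dtype, so the cast makes the numeric sort explicit.

```python
import pandas as pd

# Hypothetical frequency row in the same layout as toronto_grouped
sample_row = pd.Series({
    'Neighborhood': 'Christie',
    'Diner': 0.18,
    'Park': 0.12,
    'Grocery Store': 0.25,
})

# drop the label, make the values numeric, then take the largest frequencies
frequencies = sample_row.iloc[1:].astype(float)
top_two = frequencies.sort_values(ascending=False).index.values[0:2]
print(list(top_two))
```

When several categories share the same frequency, their relative order depends on the sort algorithm, so tied positions in the top-venue table should not be read as a ranking.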

###### create the new dataframe and display the top 10 venues for each neighborhood

In [39]:
num_top_venues = 10
import numpy as np
indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # positions past the third use the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Bathurst Quay, Island airport, CN Tower, King ...",Airport Lounge,Airport Service,Plane,Bar,Coffee Shop,Sculpture Garden,Boutique,Boat or Ferry,Harbor / Marina,Airport
1,Berczy Park,Coffee Shop,Cocktail Bar,Beer Bar,Bakery,Steakhouse,Seafood Restaurant,Farmers Market,Cheese Shop,Café,Butcher
2,Business Reply Mail Processing Centre 969 Eastern,Pizza Place,Garden,Skate Park,Light Rail Station,Park,Farmers Market,Spa,Fast Food Restaurant,Brewery,Burrito Place
3,Central Bay Street,Coffee Shop,Ice Cream Shop,Italian Restaurant,Café,Sandwich Place,Burger Joint,Bar,Salad Place,Bubble Tea Shop,Gym / Fitness Center
4,Christie,Grocery Store,Café,Park,Baby Store,Italian Restaurant,Diner,Restaurant,Athletics & Sports,Nightclub,Candy Store
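
The try/except suffix lookup used to build the column names is correct for the ten columns shown here, but it would label position 21 as 21th. A minimal sketch of a helper that handles any positive position follows. The names `ordinal_label` and `sample_columns` are hypothetical and not part of the notebook.

```python
def ordinal_label(position):
    """Return '1st', '2nd', '3rd', '4th', ... for a positive integer."""
    # 11, 12 and 13 take 'th' even though they end in 1, 2 and 3
    if 10 <= position % 100 <= 20:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(position % 10, 'th')
    return '{}{}'.format(position, suffix)


sample_columns = ['Neighborhood'] + [
    '{} Most Common Venue'.format(ordinal_label(i)) for i in range(1, 11)
]
print(sample_columns[1], sample_columns[4], sample_columns[10], sep=' | ')
```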


##### Cluster neighborhoods using the K-means algorithm with K = 5

In [40]:
from sklearn.cluster import KMeans
# set number of clusters
kclusters = 5

toronto_grouped_clustering = toronto_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
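
The first ten labels printed above are all 1, so it can be useful to look at how the fit changes with the number of clusters and how large each cluster is. The sketch below does this on synthetic data, not on the notebook's frequencies. The blobs in `X` are made up, and comparing `inertia_` across several values of k is a common heuristic rather than the method used in this notebook.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for toronto_grouped_clustering: three well-separated blobs
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([center + 0.1 * rng.standard_normal((20, 2)) for center in centers])

# inertia_ is the within-cluster sum of squared distances for each fit
inertias = []
for k in range(1, 6):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(model.inertia_)
print(inertias)

# cluster sizes for the k=3 fit
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print(np.bincount(labels))
```

On the real data, the same `np.bincount(kmeans.labels_)` call would show how many neighborhoods fall into each of the five clusters.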

In [41]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

toronto_merged = toronto_data

# join the sorted venues and cluster labels onto toronto_data, which holds latitude/longitude for each neighborhood
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

toronto_merged.head()

Unnamed: 0,Borough,PostalCode,Neighborhood,Postal Code,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Central Toronto,M4N,Lawrence Park,M4N,43.72802,-79.38879,3,Park,Swim School,Bus Line,Wings Joint,Diner,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant
1,Central Toronto,M4P,Davisville North,M4P,43.712751,-79.390197,1,Gym,Hotel,Park,Breakfast Spot,Clothing Store,Sandwich Place,Dance Studio,Food & Drink Shop,Dog Run,Doner Restaurant
2,Central Toronto,M4R,North Toronto West,M4R,43.715383,-79.405678,1,Sporting Goods Shop,Coffee Shop,Clothing Store,Yoga Studio,Dessert Shop,Spa,Burger Joint,Mexican Restaurant,Salon / Barbershop,Diner
3,Central Toronto,M4S,Davisville,M4S,43.704324,-79.38879,1,Sandwich Place,Dessert Shop,Sushi Restaurant,Gym,Italian Restaurant,Coffee Shop,Thai Restaurant,Café,Pizza Place,Fried Chicken Joint
4,Central Toronto,M4T,"Moore Park, Summerhill East",M4T,43.689574,-79.38316,4,Playground,Tennis Court,Restaurant,Wings Joint,Dessert Shop,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant


In [42]:
toronto_merged.tail() # checking the last rows of the dataframe

Unnamed: 0,Borough,PostalCode,Neighborhood,Postal Code,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
33,West Toronto,M6J,"Little Portugal, Trinity",M6J,43.647927,-79.41975,1,Bar,Coffee Shop,Restaurant,Asian Restaurant,Men's Store,Pizza Place,Café,New American Restaurant,Vietnamese Restaurant,Brewery
34,West Toronto,M6K,"Exhibition Place, Parkdale Village, Brockton",M6K,43.636847,-79.428191,1,Café,Breakfast Spot,Coffee Shop,Furniture / Home Store,Convenience Store,Burrito Place,Stadium,Caribbean Restaurant,Italian Restaurant,Bar
35,West Toronto,M6P,"High Park, The Junction South",M6P,43.661608,-79.464763,1,Bar,Café,Thai Restaurant,Mexican Restaurant,Park,Arts & Crafts Store,Discount Store,Bakery,Diner,Cajun / Creole Restaurant
36,West Toronto,M6R,"Parkdale, Roncesvalles",M6R,43.64896,-79.456325,1,Gift Shop,Coffee Shop,Bookstore,Bank,Dog Run,Bar,Movie Theater,Restaurant,Dessert Shop,Eastern European Restaurant
37,West Toronto,M6S,"Runnymede, Swansea",M6S,43.651571,-79.48445,1,Café,Coffee Shop,Italian Restaurant,Sushi Restaurant,Pizza Place,Falafel Restaurant,Supplement Shop,Pub,Food & Drink Shop,French Restaurant


###### Visualize clusters on Toronto map

In [45]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=12)
import matplotlib.cm as cm
import matplotlib.colors as colors
# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighborhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [47]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,PostalCode,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
7,M5P,-79.411307,0,Park,Trail,Jewelry Store,Sushi Restaurant,Wings Joint,Diner,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store
9,M4W,-79.377529,0,Park,Playground,Trail,Wings Joint,Dim Sum Restaurant,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant


In [48]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,PostalCode,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,M4P,-79.390197,1,Gym,Hotel,Park,Breakfast Spot,Clothing Store,Sandwich Place,Dance Studio,Food & Drink Shop,Dog Run,Doner Restaurant
2,M4R,-79.405678,1,Sporting Goods Shop,Coffee Shop,Clothing Store,Yoga Studio,Dessert Shop,Spa,Burger Joint,Mexican Restaurant,Salon / Barbershop,Diner
3,M4S,-79.38879,1,Sandwich Place,Dessert Shop,Sushi Restaurant,Gym,Italian Restaurant,Coffee Shop,Thai Restaurant,Café,Pizza Place,Fried Chicken Joint
5,M4V,-79.400049,1,Coffee Shop,Pub,Light Rail Station,American Restaurant,Restaurant,Sushi Restaurant,Fried Chicken Joint,Sports Bar,Bagel Shop,Supermarket
8,M5R,-79.405678,1,Sandwich Place,Café,Coffee Shop,Pharmacy,American Restaurant,Burger Joint,Indian Restaurant,BBQ Joint,Pub,Liquor Store
10,M4X,-79.367675,1,Coffee Shop,Pub,Market,Pizza Place,Italian Restaurant,Bakery,Restaurant,Café,Beer Store,Sandwich Place
11,M4Y,-79.38316,1,Coffee Shop,Japanese Restaurant,Sushi Restaurant,Restaurant,Gay Bar,Café,Gym,Hotel,Italian Restaurant,Men's Store
12,M5A,-79.360636,1,Coffee Shop,Park,Bakery,Pub,Mexican Restaurant,Café,Breakfast Spot,Theater,Yoga Studio,Spa
13,M5B,-79.378937,1,Clothing Store,Coffee Shop,Cosmetics Shop,Café,Bakery,Ramen Restaurant,Sporting Goods Shop,Pizza Place,Fast Food Restaurant,Bookstore
14,M5C,-79.375418,1,Coffee Shop,Café,Hotel,Restaurant,Italian Restaurant,Gastropub,Cosmetics Shop,Breakfast Spot,Bakery,Clothing Store


In [49]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,PostalCode,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,M5N,-79.416936,2,Garden,Home Service,Dim Sum Restaurant,Farmers Market,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant


In [50]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,PostalCode,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M4N,-79.38879,3,Park,Swim School,Bus Line,Wings Joint,Diner,Falafel Restaurant,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant


In [51]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,PostalCode,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,M4T,-79.38316,4,Playground,Tennis Court,Restaurant,Wings Joint,Dessert Shop,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant
