<div class="alert alert-block alert-success" style="margin-top: 20px">  
    
<a id='ThirdSection'> <H1> Third Assignment - Explore and cluster the Toronto Neighborhood</H1></a>  
     3.1. [Following the NY Cluster lab](#31)  
     3.2. [Subsetting to East, Central, Downtown, and West Toronto](#32)  
     3.3. [Decoupling Neighborhood](#33)  
     3.4. [Map of East, Central, Downtown, and West Toronto Neighborhood](#34)  
     3.5. [FourSquare API](#35)  
     3.6. [Preprocessing Neighborhood](#36)  
     3.7. [K-Means Clustering](#37)  
     3.8. [Cluster Map](#38)  
     3.9. [Examine Cluster](#39)  
</div>

Import a few dependencies before we start:

In [2]:
# Segmenting and Clustering Neighborhoods in Toronto
import pandas as pd
import numpy as np
from bs4 import BeautifulSoup
import requests
import json # library to handle JSON files
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe (in pandas >= 1.0, use: from pandas import json_normalize)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes 
import folium # map rendering library

Let's load the csv files from the first two assignments:

In [3]:
cleaned_df2 = pd.read_csv('toronto_lat_long.csv')
cleaned_df2.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497
2,M1E,Scarborough,"West Hill, Morningside, Guildwood",43.763573,-79.188711
3,M1K,Scarborough,"Kennedy Park, Ionview, East Birchmount Park",43.727929,-79.262029
4,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge",43.711112,-79.284577


<a id='31'>Following the NY cluster lab</a>  
A quick check on the number of boroughs and neighborhoods in the `cleaned_df2` DataFrame:

In [4]:
print('The dataframe has {} boroughs and {} neighborhoods.'.format(
        len(cleaned_df2['Borough'].unique()),
        len(cleaned_df2['Neighborhood'].unique())
    )
)

The dataframe has 11 boroughs and 103 neighborhoods.


In order to define an instance of the geocoder, we need to define a user_agent. We will name our agent <em>toronto_explorer</em>, as shown below.

In [5]:
address = 'Toronto, Ontario'

geolocator = Nominatim(user_agent="toronto_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.653963, -79.387207.


#### Create a map of Toronto with neighborhoods superimposed on top.

In [6]:
# create map of Toronto using latitude and longitude values
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(cleaned_df2['Latitude'], cleaned_df2['Longitude'], cleaned_df2['Borough'], cleaned_df2['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

Notice that several markers stand for more than one neighborhood. These will be decoupled in a later section.

Let's explore the Boroughs in the dataframe:

In [7]:
cleaned_df2['Borough'].unique()

array(['Scarborough', 'North York', 'East York', 'East Toronto',
       'Central Toronto', 'Downtown Toronto', 'West Toronto', 'York',
       'Etobicoke', "Queen's Park", 'Mississauga'], dtype=object)

<a id='32'>Subsetting to East, Central, Downtown, and West Toronto</a>  
[Back to the top](#ThirdSection)    
We will subset the data to the 'East Toronto', 'Central Toronto', 'Downtown Toronto', and 'West Toronto' boroughs using ```str.contains```, since these are the only borough names that contain the word 'Toronto':

In [8]:
toronto_data = cleaned_df2[cleaned_df2['Borough'].str.contains('Toronto')]
# reset the index:
toronto_data = toronto_data.reset_index(drop=True)
toronto_data.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188
1,M4L,East Toronto,"India Bazaar, The Beaches West",43.668999,-79.315572
2,M4T,Central Toronto,"Summerhill East, Moore Park",43.689574,-79.38316
3,M4V,Central Toronto,"Rathnelly, Deer Park, South Hill, Summerhill W...",43.686412,-79.400049
4,M4X,Downtown Toronto,"Cabbagetown, St. James Town",43.667967,-79.367675


In [9]:
# shape of the data:
toronto_data.shape

(38, 5)

<a id='33'>Decoupling Neighborhood</a>  
[Back to the top](#ThirdSection)    
Some postal codes cover multiple neighborhoods, so I will decouple them and get the latitude and longitude of each neighborhood using the Nominatim geocoder.

First, let's check which rows have one neighborhood and which rows have multiple neighborhoods:

In [10]:
# for test:
len(toronto_data['Neighborhood'][0].split(','))

2

In [11]:
# to get the row:
toronto_data.shape[0]

38

In [12]:
one_neigh= []
mult_neigh = []
for i in range(toronto_data.shape[0]):
    countN = len(toronto_data['Neighborhood'][i].split(','))
    if countN > 1:
        mult_neigh.append(i)
    elif countN == 1:
        one_neigh.append(i)

In [13]:
toronto_data_mult = toronto_data.loc[mult_neigh]
toronto_data_mult.shape

(23, 5)

In [14]:
toronto_data_one = toronto_data.loc[one_neigh]
toronto_data_one.shape

(15, 5)

In [15]:
toronto_data_one

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
23,M4E,East Toronto,The Beaches,43.676357,-79.293031
24,M4M,East Toronto,Studio District,43.659526,-79.340923
25,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879
26,M4P,Central Toronto,Davisville North,43.712751,-79.390197
27,M4R,Central Toronto,North Toronto West,43.715383,-79.405678
28,M4S,Central Toronto,Davisville,43.704324,-79.38879
29,M4W,Downtown Toronto,Rosedale,43.679563,-79.377529
30,M4Y,Downtown Toronto,Church and Wellesley,43.66586,-79.38316
31,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418
32,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306


We will keep the ```toronto_data_one``` DataFrame and decouple the ```toronto_data_mult``` DataFrame:

In [17]:
neighbor_list = []
for neigh in toronto_data_mult['Neighborhood']:
    neighbor_list.append(neigh)
neighbor_list[:5]

['The Danforth West, Riverdale',
 'India Bazaar, The Beaches West',
 'Summerhill East, Moore Park',
 'Rathnelly, Deer Park, South Hill, Summerhill West, Forest Hill SE',
 'Cabbagetown, St. James Town']

In [19]:
# decoupling the neighbor_list:
new = [i.split(', ') for i in neighbor_list]
# https://stackoverflow.com/questions/952914/how-to-make-a-flat-list-out-of-list-of-lists
neighbor_list = [item for sublist in new for item in sublist]
neighbor_list[:10]

['The Danforth West',
 'Riverdale',
 'India Bazaar',
 'The Beaches West',
 'Summerhill East',
 'Moore Park',
 'Rathnelly',
 'Deer Park',
 'South Hill',
 'Summerhill West']
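As an alternative to flattening a plain list, pandas can split and expand the rows in place, which keeps each neighborhood next to its borough. The following is a minimal sketch on two sample rows copied from the tables above; `sample` and `exploded` are illustrative names that do not appear elsewhere in this notebook.

```python
import pandas as pd

# Two sample rows in the same shape as toronto_data (values copied from the tables above)
sample = pd.DataFrame({
    "Borough": ["East Toronto", "Downtown Toronto"],
    "Neighborhood": ["The Danforth West, Riverdale", "Rosedale"],
})

# Split each comma-separated string into a list, then give every list item its own row
exploded = (
    sample.assign(Neighborhood=sample["Neighborhood"].str.split(", "))
    .explode("Neighborhood")
    .reset_index(drop=True)
)
print(exploded)
```

With this approach the Borough column travels with each neighborhood, so no separate lookup would be needed to restore it.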

Let's use the Nominatim geolocation service from the `geopy` package to look up each neighborhood:

In [20]:
lat_to = []
long_to = []
# create the geocoder once and reuse it for every query
geolocator = Nominatim(user_agent="toronto_explorer")
for neigh in neighbor_list:
    address = "{}, Toronto, ON".format(neigh)
    location = geolocator.geocode(address)
    if location is not None:
        lat_to.append(location.latitude)
        long_to.append(location.longitude)
    else:
        # no match: store NaN so the lists stay aligned with neighbor_list
        lat_to.append(np.nan)
        long_to.append(np.nan)
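The public Nominatim service asks clients to keep to roughly one request per second, and `geopy` ships a `RateLimiter` helper for that. The sketch below separates the throttled geocoder from the loop so the loop can be tried with any callable; `make_rate_limited_geocoder` and `geocode_all` are hypothetical helper names, not part of `geopy` or of this notebook.

```python
import math

def make_rate_limited_geocoder(user_agent="toronto_explorer"):
    """Build a Nominatim geocode callable throttled to one request per second."""
    from geopy.geocoders import Nominatim
    from geopy.extra.rate_limiter import RateLimiter
    geolocator = Nominatim(user_agent=user_agent)
    return RateLimiter(geolocator.geocode, min_delay_seconds=1)

def geocode_all(names, geocode, suffix="Toronto, ON"):
    """Return parallel latitude/longitude lists, with NaN where nothing was found."""
    lats, lngs = [], []
    for name in names:
        location = geocode("{}, {}".format(name, suffix))
        lats.append(location.latitude if location is not None else math.nan)
        lngs.append(location.longitude if location is not None else math.nan)
    return lats, lngs

# Usage (performs network requests):
# lat_to, long_to = geocode_all(neighbor_list, make_rate_limited_geocoder())
```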

Create a new dataframe entitled `neighborhoods`:

In [21]:
column_names = ['Neighborhood', 'Latitude', 'Longitude'] 
# instantiate the dataframe
neighborhoods = pd.DataFrame(columns=column_names)

In [22]:
# filling the neighborhoods dataframe with what we got earlier:
neighborhoods['Neighborhood'] = neighbor_list
neighborhoods['Latitude']= lat_to
neighborhoods['Longitude']= long_to

In [23]:
neighborhoods.head()

Unnamed: 0,Neighborhood,Latitude,Longitude
0,The Danforth West,43.68636,-79.300316
1,Riverdale,43.66547,-79.352594
2,India Bazaar,43.672223,-79.323503
3,The Beaches West,43.671024,-79.296712
4,Summerhill East,43.681678,-79.390504


In [24]:
# let's see the null values:
neighborhoods[neighborhoods['Latitude'].isnull()]

Unnamed: 0,Neighborhood,Latitude,Longitude
42,Railway Lands,,


Cross-checking the `Railway Lands` neighborhood against the `toronto_data` DataFrame to see its row:

In [26]:
toronto_data.loc[toronto_data['Neighborhood'].str.contains('Railway Lands')]

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
15,M5V,Downtown Toronto,"CN Tower, Bathurst Quay, Island airport, Harbo...",43.628947,-79.39442


*Railway Lands* shares postal code M5V with several other Downtown Toronto neighborhoods, which remain in the data, so I am comfortable dropping this one.

In [27]:
neighborhoods = neighborhoods.drop(neighborhoods[neighborhoods['Neighborhood'] == 'Railway Lands'].index)

In [28]:
# let's see the null values:
neighborhoods[neighborhoods['Latitude'].isnull()]

Unnamed: 0,Neighborhood,Latitude,Longitude


**All the ```null``` values have been removed.**

Next, let's re-add the Borough column by taking the `toronto_data_mult` DataFrame and creating a list comprehension:

In [30]:
borough_list = [list(toronto_data_mult.loc[toronto_data_mult['Neighborhood'].str.contains(neigh)]['Borough']) 
 for neigh in neighborhoods['Neighborhood']]

In [31]:
borough_list[:15]

[['East Toronto'],
 ['East Toronto'],
 ['East Toronto'],
 ['East Toronto'],
 ['Central Toronto'],
 ['Central Toronto'],
 ['Central Toronto'],
 ['Central Toronto'],
 ['Central Toronto'],
 ['Central Toronto'],
 ['Central Toronto'],
 ['Downtown Toronto'],
 ['Downtown Toronto'],
 ['Downtown Toronto', 'Downtown Toronto', 'Downtown Toronto'],
 ['Downtown Toronto']]

There are rows that have multiple values, so I will only take the first one using another list comprehension:

In [33]:
borough_list = [borough_list[i][0] for i in range(len(borough_list))]
# check the length of the borough list
len(borough_list)

58
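Multiple matches arise because `str.contains` does substring matching, so a short neighborhood name also matches any row that merely contains it. A minimal sketch of an exact lookup follows; the two rows are hypothetical and chosen only to show the overlap, and `borough_of` is an illustrative name.

```python
# Hypothetical rows: one neighborhood name is a substring of another
rows = [
    ("Downtown Toronto", "Harbourfront, Regent Park"),
    ("Downtown Toronto", "Harbourfront East, Toronto Islands"),
]

# Substring search, analogous to str.contains: "Harbourfront" hits both rows
substring_hits = [borough for borough, names in rows if "Harbourfront" in names]

# Exact lookup: split each row once and map every individual name to its borough
borough_of = {}
for borough, names in rows:
    for name in names.split(", "):
        borough_of.setdefault(name, borough)  # keep the first borough seen

print(len(substring_hits))
print(borough_of["Harbourfront"])
```

Taking the first match, as done above, gives the same answer whenever all matching rows share a borough, which is the case in the output shown.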

In [34]:
# checking the final product:
borough_list[:15]

['East Toronto',
 'East Toronto',
 'East Toronto',
 'East Toronto',
 'Central Toronto',
 'Central Toronto',
 'Central Toronto',
 'Central Toronto',
 'Central Toronto',
 'Central Toronto',
 'Central Toronto',
 'Downtown Toronto',
 'Downtown Toronto',
 'Downtown Toronto',
 'Downtown Toronto']

In [35]:
# make sure that the dataframe has the same length
neighborhoods.shape

(58, 3)

In [36]:
# let's insert the borough value into neighborhoods dataframe
neighborhoods.insert(0,'Borough',borough_list)
neighborhoods.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,East Toronto,The Danforth West,43.68636,-79.300316
1,East Toronto,Riverdale,43.66547,-79.352594
2,East Toronto,India Bazaar,43.672223,-79.323503
3,East Toronto,The Beaches West,43.671024,-79.296712
4,Central Toronto,Summerhill East,43.681678,-79.390504


Now let's re-add the `toronto_data_one` DataFrame into the `neighborhoods` DataFrame. In order to do so, I will drop the Postal Code column to match the columns with one another: 

In [37]:
toronto_data_one = toronto_data_one.drop('PostalCode',axis=1)

Now let's rejoin the toronto_data_one and the neighborhoods:

In [38]:
neighborhoods = pd.concat([neighborhoods,toronto_data_one])
# rows stay in concatenation order (no sorting is applied); only the index is rebuilt
neighborhoods = neighborhoods.reset_index(drop=True)

In [39]:
neighborhoods.head(15)

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,East Toronto,The Danforth West,43.68636,-79.300316
1,East Toronto,Riverdale,43.66547,-79.352594
2,East Toronto,India Bazaar,43.672223,-79.323503
3,East Toronto,The Beaches West,43.671024,-79.296712
4,Central Toronto,Summerhill East,43.681678,-79.390504
5,Central Toronto,Moore Park,43.690388,-79.383297
6,Central Toronto,Rathnelly,43.677472,-79.40046
7,Central Toronto,Deer Park,43.68809,-79.394093
8,Central Toronto,South Hill,43.677926,-79.405767
9,Central Toronto,Summerhill West,43.681678,-79.390504


As we did with all of Toronto, let's visualize the smaller subset and the neighborhoods in it:

In [40]:
# create map of Toronto using latitude and longitude values
map_subset = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, borough, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'],neighborhoods['Borough'], neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_subset)  
    
map_subset

We see that there are two anomalies:
1. Underground city
2. Richmond

Both belong to the Downtown Toronto borough, but their markers fall well outside downtown. So, I will fix their coordinates now:

In [41]:
# Underground City taken from toronto_data:
toronto_data.loc[toronto_data['Neighborhood'].str.contains('Underground city')]

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
16,M5X,Downtown Toronto,"Underground city, First Canadian Place",43.648429,-79.38228


In [42]:
# from the neighborhoods data frame:
neighborhoods.loc[neighborhoods['Neighborhood'].str.contains('Underground city')]

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
43,Downtown Toronto,Underground city,43.770145,-79.374863


Let's rerun the geocoder query to see where the error lies:

In [43]:
address = 'Underground city, Toronto, Ontario'

geolocator = Nominatim(user_agent="toronto_explorer")
location = geolocator.geocode(address)
# latitude and longitude are left untouched here; they still hold the Toronto coordinates used to center the maps
print(location)

The Underground, 794a, Sheppard Avenue East, Bayview Village, Don Valley North, North York, Toronto, Ontario, M2K 1C3, Canada
The Underground, 794a, Sheppard Avenue East, Bayview Village, Don Valley North, North York, Toronto, Ontario, M2K 1C3, Canada


The query resolved to a location in the North York borough instead of Downtown Toronto. A Google Maps lookup gives the coordinates 43.6516645, -79.3816759.

In [44]:
underground_loc = neighborhoods.loc[neighborhoods['Neighborhood'].str.contains('Underground city')].index

In [45]:
# let's replace them with the values obtained from Google Maps:
neighborhoods.loc[underground_loc,'Latitude'] = 43.6516645
neighborhoods.loc[underground_loc,'Longitude'] = -79.3816759

In [46]:
neighborhoods.loc[underground_loc]

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
43,Downtown Toronto,Underground city,43.651665,-79.381676


Moving our attention to Richmond:

In [47]:
# Richmond taken from toronto_data:
toronto_data.loc[toronto_data['Neighborhood'].str.contains('Richmond')]

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
7,M5H,Downtown Toronto,"King, Adelaide, Richmond",43.650571,-79.384568


In [48]:
# from the neighborhoods data frame:
neighborhoods.loc[neighborhoods['Neighborhood'].str.contains('Richmond')]

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
19,Downtown Toronto,Richmond,43.812589,-79.26337


Looking at the original query and seeing where the error lies:

In [49]:
address = 'Richmond, Toronto, Ontario'

geolocator = Nominatim(user_agent="toronto_explorer")
location = geolocator.geocode(address)
print(location)

Richmond Park, Agincourt North, Scarborough North, Scarborough, Toronto, Golden Horseshoe, Ontario, Canada


It points to Richmond Park in the borough of Scarborough instead of Richmond Street in Downtown Toronto. So, let's change the query to *'Richmond St'*:

In [50]:
address = 'Richmond St, Toronto, Ontario'

geolocator = Nominatim(user_agent="toronto_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(location)
print('The geographical coordinates are {}, {}.'.format(latitude, longitude))

Richmond Street, 1, Yonge Street, Downtown Yonge, Toronto Centre, Old Toronto, Toronto, Golden Horseshoe, Ontario, Canada
The geographical coordinates are 43.6517909, -79.3790667.


In [51]:
richmond_loc = neighborhoods.loc[neighborhoods['Neighborhood'].str.contains('Richmond')].index
richmond_loc

Int64Index([19], dtype='int64')

In [52]:
neighborhoods.loc[richmond_loc,'Latitude'] = latitude
neighborhoods.loc[richmond_loc,'Longitude'] = longitude

In [53]:
# for sanity check:
neighborhoods.loc[richmond_loc]

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
19,Downtown Toronto,Richmond,43.651791,-79.379067
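The two anomalies were spotted by eye on the map. A distance check between each geocoded point and its postal-code coordinates could flag such cases automatically. The sketch below uses the haversine formula on three rows copied from the tables above (the geocoded values are the ones from before the correction); `haversine_km` is a hypothetical helper and the 5 km cut-off is an assumption chosen only for this illustration.

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres between two (lat, lng) points."""
    lat1, lng1, lat2, lng2 = map(radians, (lat1, lng1, lat2, lng2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lng2 - lng1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))  # 6371 km: mean Earth radius

# (name, geocoded lat/lng, postal-code lat/lng), values copied from the tables above
checks = [
    ("The Danforth West", (43.68636, -79.300316), (43.679557, -79.352188)),
    ("Underground city", (43.770145, -79.374863), (43.648429, -79.38228)),
    ("Richmond", (43.812589, -79.26337), (43.650571, -79.384568)),
]

THRESHOLD_KM = 5  # assumed cut-off, chosen only for this illustration
flagged = [name for name, geo, postal in checks
           if haversine_km(*geo, *postal) > THRESHOLD_KM]
print(flagged)
```

The threshold would need tuning on the full data, since postal-code areas vary in size.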


<a id='34'>Map of East, Central, Downtown, and West Toronto Neighborhood</a>  
[Back to the top](#ThirdSection)    
Let's recreate the map with the correct Latitude and Longitude:

In [54]:
# create map of Toronto using latitude and longitude values
map_subset = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, borough, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'],neighborhoods['Borough'], neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_subset)  
    
map_subset

<a id='35'>FourSquare API</a>  
[Back to the top](#ThirdSection)    
Now, let's use the Foursquare API to explore the neighborhoods and segment them. The client ID and client secret are replaced with placeholders here so that the credentials are not exposed.

In [55]:
CLIENT_ID = 'YOUR_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

#print('Your credentials:')
#print('CLIENT_ID: ' + CLIENT_ID)
#print('CLIENT_SECRET:' + CLIENT_SECRET)

#### Let's explore the first neighborhood in our dataframe.

The name of the neighborhood is:

In [56]:
neighborhoods.loc[0, 'Neighborhood']

'The Danforth West'

Let's get the longitude and latitude of the neighborhood:

In [57]:
neighborhood_latitude = neighborhoods.loc[0, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = neighborhoods.loc[0, 'Longitude'] # neighborhood longitude value
neighborhood_name = neighborhoods.loc[0, 'Neighborhood'] # neighborhood name

#### Now, let's get the top 100 venues that are in 'The Danforth West' within a radius of 500 meters.

In [58]:
# define radius
radius = 500
# limit of number of venues returned by Foursquare API
LIMIT = 100
search_query = 'venues'
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    VERSION, 
    radius, 
    LIMIT)
results = requests.get(url).json()
#results

Let's borrow the **get_category_type** function from the Foursquare lab:

In [59]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

Let's clean the json and structure it into a *pandas* dataframe.

In [60]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues =nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

Unnamed: 0,name,categories,lat,lng
0,Ted Reeve Arena,Skating Rink,43.684527,-79.299426
1,Beach Hill Smokehouse,BBQ Joint,43.684105,-79.30041
2,Duckworth's Fish and Chips,Fish & Chips Shop,43.688629,-79.299952
3,Cool Runnings,Caribbean Restaurant,43.683497,-79.300017
4,Sultan Shawarma & Falafel,Middle Eastern Restaurant,43.688223,-79.301709


How many venues were returned by Foursquare?

In [61]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

27 venues were returned by Foursquare.


## Explore the neighborhoods methodically:  
#### Let's borrow the `getNearbyVenues` function from the lab: 

In [62]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
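One fragile spot in the function above is `v['venue']['categories'][0]['name']`, which raises an `IndexError` for any venue that comes back without a category. The sketch below shows one way the per-venue extraction could tolerate that; `venue_row` is a hypothetical helper, and the second sample item is made up to exercise the empty case.

```python
def venue_row(name, lat, lng, item):
    """Flatten one item of an explore response into a tuple, tolerating missing categories."""
    venue = item["venue"]
    categories = venue.get("categories") or []
    category = categories[0]["name"] if categories else None
    return (name, lat, lng, venue["name"],
            venue["location"]["lat"], venue["location"]["lng"], category)

# Sample items shaped like the fields read above; the second one is made up
items = [
    {"venue": {"name": "Ted Reeve Arena",
               "location": {"lat": 43.684527, "lng": -79.299426},
               "categories": [{"name": "Skating Rink"}]}},
    {"venue": {"name": "Unnamed Spot",
               "location": {"lat": 43.68, "lng": -79.30},
               "categories": []}},
]
rows = [venue_row("The Danforth West", 43.68636, -79.300316, item) for item in items]
print(rows)
```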

In [63]:
# let's put them into one place:
toronto_venues = getNearbyVenues(names=neighborhoods['Neighborhood'],
                                   latitudes=neighborhoods['Latitude'],
                                   longitudes=neighborhoods['Longitude']
                                  )

In [64]:
# let's save the DataFrame so that I do not need to call it every single time I rerun the Jupyter notebook:
toronto_venues.to_csv('toronto_venues.csv',index=False)

In [65]:
# let's explore the DataFrame:
print(toronto_venues.shape)
toronto_venues.head()

(3798, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,The Danforth West,43.68636,-79.300316,Ted Reeve Arena,43.684527,-79.299426,Skating Rink
1,The Danforth West,43.68636,-79.300316,Beach Hill Smokehouse,43.684105,-79.30041,BBQ Joint
2,The Danforth West,43.68636,-79.300316,Duckworth's Fish and Chips,43.688629,-79.299952,Fish & Chips Shop
3,The Danforth West,43.68636,-79.300316,Cool Runnings,43.683497,-79.300017,Caribbean Restaurant
4,The Danforth West,43.68636,-79.300316,Sultan Shawarma & Falafel,43.688223,-79.301709,Middle Eastern Restaurant


#### Let's find out how many unique categories can be curated from all the returned venues

In [66]:
print('There are {} unique categories.'.format(len(toronto_venues['Venue Category'].unique())))

There are 299 unique categories.


<a id='36'>Preprocessing venues</a>  
[Back to the top](#ThirdSection)    
## Analyze Each Neighborhood

In [67]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

In [68]:
# add neighborhood column back to dataframe
toronto_onehot['Neighborhood'] = toronto_venues['Neighborhood'] 
toronto_onehot.head()

Unnamed: 0,Accessories Store,Afghan Restaurant,African Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,...,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wings Joint,Women's Store,Yoga Studio,Zoo
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [69]:
# where is the "Neighborhood" column located in the toronto_onehot dataframe?
toronto_onehot.columns.get_loc('Neighborhood')

194

In [70]:
# saving it to a variable, since we are going to use it multiple times:
num_neigh_to_one= toronto_onehot.columns.get_loc('Neighborhood')

In [71]:
# just for sanity check:
toronto_onehot.columns[num_neigh_to_one]

'Neighborhood'

In [73]:
fixed_columns = [toronto_onehot.columns[num_neigh_to_one]]+list(toronto_onehot.columns[:num_neigh_to_one])+list(toronto_onehot.columns[num_neigh_to_one+1:])
toronto_onehot = toronto_onehot[fixed_columns]
toronto_onehot.head()

Unnamed: 0,Neighborhood,Accessories Store,Afghan Restaurant,African Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,...,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wings Joint,Women's Store,Yoga Studio,Zoo
0,The Danforth West,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,The Danforth West,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,The Danforth West,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,The Danforth West,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,The Danforth West,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


Checking the `toronto_onehot` DataFrame size:

In [74]:
toronto_onehot.shape

(3798, 299)

#### Next, let's group the rows by neighborhood, taking the mean of the frequency of occurrence of each category

In [75]:
toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()
toronto_grouped.head()

Unnamed: 0,Neighborhood,Accessories Store,Afghan Restaurant,African Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,...,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wings Joint,Women's Store,Yoga Studio,Zoo
0,Adelaide,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0
1,Bathurst Quay,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Brockton,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.095238,0.0,0.0,0.0,0.0,0.0,0.0
4,Business Reply Mail Processing Centre 969 Eastern,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0


Checking the `toronto_grouped` DataFrame size:

In [76]:
toronto_grouped.shape

(72, 299)
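The one-hot and group-mean steps can be traced on a toy table. The sketch below uses made-up venues and `insert` instead of plain assignment for the Neighborhood column. That choice matters here: the shapes above (299 unique categories, yet 299 columns including Neighborhood, found at position 194 rather than at the end) suggest that a venue category itself named "Neighborhood" was overwritten by the assignment.

```python
import pandas as pd

# Made-up venues: neighborhood "A" has three venues, "B" has one
venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "B"],
    "Venue Category": ["Coffee Shop", "Coffee Shop", "Park", "Park"],
})

# One indicator column per category (float so the means below are plain fractions)
onehot = pd.get_dummies(venues["Venue Category"], dtype=float)

# insert() raises ValueError if a category is itself called "Neighborhood",
# whereas plain assignment would silently overwrite that indicator column
onehot.insert(0, "Neighborhood", venues["Neighborhood"])

# Mean of the indicators per neighborhood = share of venues in each category
grouped = onehot.groupby("Neighborhood").mean().reset_index()
print(grouped)
```

In the toy result, two thirds of the venues in "A" are coffee shops, which is exactly the kind of frequency that `toronto_grouped` holds.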

In [None]:
# if we want to print the top 5 venues
#  num_top_venues = 5

#for hood in toronto_grouped['Neighborhood']:
#    print("----"+hood+"----")
#    temp = toronto_grouped[toronto_grouped['Neighborhood'] == hood].T.reset_index()
#    temp.columns = ['venue','freq']
#    temp = temp.iloc[1:]
#    temp['freq'] = temp['freq'].astype(float)
#    temp = temp.round({'freq': 2})
#    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
#    print('\n')

#### Let's put that into a *pandas* dataframe  
Borrowing from the lab: 

In [77]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
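To see what the function returns, here is a small sketch that applies it to a single made-up row shaped like one row of `toronto_grouped` (the neighborhood name first, then one frequency per category). The function body is repeated so the example runs on its own.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]  # skip the leading 'Neighborhood' entry
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# Made-up row: neighborhood name followed by category frequencies
row = pd.Series({"Neighborhood": "A", "Park": 0.2, "Coffee Shop": 0.5, "Bar": 0.3})
top_two = return_most_common_venues(row, 2)
print(list(top_two))
```

The result is the category names ordered by descending frequency, truncated to the requested length.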

Now let's create the new dataframe and display the top 10 venues for each neighborhood.

In [78]:
num_top_venues = 10
# for: 1st, 2nd, and 3rd
indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        # for first, second, third, use this one:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # others use nth for the column name:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Adelaide,Coffee Shop,Hotel,Gastropub,Restaurant,Café,Italian Restaurant,Cosmetics Shop,Breakfast Spot,Japanese Restaurant,American Restaurant
1,Bathurst Quay,Coffee Shop,Café,Park,Grocery Store,Dance Studio,Garden,Sculpture Garden,Diner,Caribbean Restaurant,Japanese Restaurant
2,Berczy Park,Coffee Shop,Cocktail Bar,Steakhouse,Seafood Restaurant,Beer Bar,Café,Farmers Market,Cheese Shop,Bakery,Restaurant
3,Brockton,Bar,Park,Vietnamese Restaurant,Pizza Place,French Restaurant,Gastropub,Korean Restaurant,Dive Bar,South American Restaurant,Jazz Club
4,Business Reply Mail Processing Centre 969 Eastern,Comic Shop,Brewery,Auto Workshop,Light Rail Station,Burrito Place,Restaurant,Recording Studio,Garden,Park,Gym / Fitness Center


<a id='37'>K-Means Clustering</a>  
[Back to the top](#ThirdSection)    
## Cluster Neighborhoods

Run *k*-means to cluster the neighborhoods into 10 clusters.

In [79]:
# set number of clusters
kclusters = 10

toronto_grouped_clustering = toronto_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([7, 7, 7, 0, 0, 7, 0, 7, 0, 7], dtype=int32)
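The choice of 10 clusters is not justified in this notebook. One common heuristic, swapped in here purely as an illustration, is to fit *k*-means for several values of *k* and look at how the inertia (within-cluster sum of squares) falls. The sketch below runs on synthetic data standing in for `toronto_grouped_clustering`; `X` and `inertias` are illustrative names.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for toronto_grouped_clustering: two well-separated blobs
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.1, size=(20, 2)),
               rng.normal(5.0, 0.1, size=(20, 2))])

# Fit k-means for several k and record the within-cluster sum of squares
inertias = {}
for k in range(1, 6):
    model = KMeans(n_clusters=k, random_state=0, n_init=10).fit(X)
    inertias[k] = model.inertia_

for k, inertia in inertias.items():
    print(k, round(inertia, 3))
```

On this toy data the inertia drops sharply from *k* = 1 to *k* = 2 and flattens afterwards; real venue data is unlikely to give such a clean elbow.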

In [80]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

toronto_merged = neighborhoods

# join neighborhoods with neighborhoods_venues_sorted to add the cluster label and top venues to each neighborhood
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

toronto_merged.head() # check the last columns!

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,East Toronto,The Danforth West,43.68636,-79.300316,0,Pharmacy,Coffee Shop,Grocery Store,Bus Line,BBQ Joint,Sandwich Place,Café,Bakery,Sushi Restaurant,Baseball Field
1,East Toronto,Riverdale,43.66547,-79.352594,0,Chinese Restaurant,Vietnamese Restaurant,Bakery,Light Rail Station,Grocery Store,Burger Joint,Bar,Asian Restaurant,Baseball Field,Gym / Fitness Center
2,East Toronto,India Bazaar,43.672223,-79.323503,0,Indian Restaurant,Café,Grocery Store,Donut Shop,Sandwich Place,Bus Stop,Diner,Burger Joint,Theater,Asian Restaurant
3,East Toronto,The Beaches West,43.671024,-79.296712,0,Park,Beach,Coffee Shop,Pizza Place,Breakfast Spot,Pub,Bar,Tea Room,Thai Restaurant,Japanese Restaurant
4,Central Toronto,Summerhill East,43.681678,-79.390504,0,Italian Restaurant,Sushi Restaurant,Pizza Place,Coffee Shop,Café,Spa,Beer Store,Sporting Goods Shop,Bakery,Bank


<a id='38'>Cluster Map</a>  
[Back to the top](#ThirdSection)    
Finally, let's visualize the resulting clusters:

In [81]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighborhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
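
The marker colors come from sampling the rainbow colormap once per cluster. The sketch below isolates that step so the palette can be inspected on its own. `cluster_palette` and `label_to_color` are hypothetical names, and ten clusters are assumed. Note that the cell above indexes with `cluster-1`, which sends label 0 to the last color; the sketch maps each label to its own position instead, which is a design choice here rather than the notebook's method.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

def cluster_palette(n_clusters):
    """Return n_clusters hex color strings sampled evenly from the rainbow colormap."""
    rgba = cm.rainbow(np.linspace(0, 1, n_clusters))
    return [colors.rgb2hex(c) for c in rgba]

palette = cluster_palette(10)  # assumes ten clusters

# Map each cluster label directly to its own color
label_to_color = {label: palette[label] for label in range(10)}
print(label_to_color)
```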

<a id='39'>Examine Cluster</a>  
[Back to the top](#ThirdSection)    

Now, let's examine each cluster and identify the venue categories that distinguish it.

#### Cluster 1

In [82]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[0,1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,East Toronto,The Danforth West,Pharmacy,Coffee Shop,Grocery Store,Bus Line,BBQ Joint,Sandwich Place,Café,Bakery,Sushi Restaurant,Baseball Field
1,East Toronto,Riverdale,Chinese Restaurant,Vietnamese Restaurant,Bakery,Light Rail Station,Grocery Store,Burger Joint,Bar,Asian Restaurant,Baseball Field,Gym / Fitness Center
2,East Toronto,India Bazaar,Indian Restaurant,Café,Grocery Store,Donut Shop,Sandwich Place,Bus Stop,Diner,Burger Joint,Theater,Asian Restaurant
3,East Toronto,The Beaches West,Park,Beach,Coffee Shop,Pizza Place,Breakfast Spot,Pub,Bar,Tea Room,Thai Restaurant,Japanese Restaurant
4,Central Toronto,Summerhill East,Italian Restaurant,Sushi Restaurant,Pizza Place,Coffee Shop,Café,Spa,Beer Store,Sporting Goods Shop,Bakery,Bank
6,Central Toronto,Rathnelly,Park,Italian Restaurant,Mexican Restaurant,French Restaurant,American Restaurant,Liquor Store,Electronics Store,BBQ Joint,Pizza Place,Coffee Shop
8,Central Toronto,South Hill,Coffee Shop,Sandwich Place,History Museum,Café,Pizza Place,Park,Pub,Castle,Jewish Restaurant,Burger Joint
9,Central Toronto,Summerhill West,Italian Restaurant,Sushi Restaurant,Pizza Place,Coffee Shop,Café,Spa,Beer Store,Sporting Goods Shop,Bakery,Bank
11,Downtown Toronto,Cabbagetown,Restaurant,Coffee Shop,Café,Indian Restaurant,Diner,Beer Store,Japanese Restaurant,Italian Restaurant,Gastropub,Bakery
29,Central Toronto,The Annex,Pizza Place,Coffee Shop,Ice Cream Shop,Thai Restaurant,Park,Gym,Bistro,Grocery Store,Indian Restaurant,Donut Shop


Cluster 1 is dominated by places to eat and drink: restaurants, coffee shops, cafés, and pizza places appear throughout the top venues.
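
Each cluster in this section is inspected with the same long `.loc` expression that selects columns by position. One possible alternative is a small helper that selects by column name. `show_cluster` is a hypothetical function, not part of the original notebook, and the demo frame copies only a few values from the tables in this section (coordinates are omitted).

```python
import pandas as pd

def show_cluster(merged, label):
    """Return Borough, Neighborhood, and the ranked venue columns for one cluster."""
    venue_cols = [c for c in merged.columns if c.endswith('Most Common Venue')]
    mask = merged['Cluster Labels'] == label
    return merged.loc[mask, ['Borough', 'Neighborhood'] + venue_cols]

# Small demo frame with values copied from the outputs in this section
demo = pd.DataFrame({
    'Borough': ['East Toronto', 'Central Toronto', 'Central Toronto'],
    'Neighborhood': ['Riverdale', 'Roselawn', 'Deer Park'],
    'Cluster Labels': [0, 1, 7],
    '1st Most Common Venue': ['Chinese Restaurant', 'Garden', 'Coffee Shop'],
    '2nd Most Common Venue': ['Vietnamese Restaurant', 'Zoo', 'Italian Restaurant'],
})

print(show_cluster(demo, 7))
```

Selecting by name keeps the helper working if columns are later added or reordered in `toronto_merged`.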

#### Cluster 2

In [83]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[0,1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
69,Central Toronto,Roselawn,Garden,Zoo,Event Space,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant,Falafel Restaurant


Cluster 2 contains a single neighborhood (Roselawn), whose three most common venues are gathering places: a garden, a zoo, and an event space.

#### Cluster 3

In [84]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[0,1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
10,Central Toronto,Forest Hill SE,Accessories Store,Playground,Bank,Event Space,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant
27,Central Toronto,Forest Hill West,Accessories Store,Playground,Bank,Event Space,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant
28,Central Toronto,Forest Hill North,Accessories Store,Playground,Bank,Event Space,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store,Ethiopian Restaurant


Cluster 3 consists of the three Forest Hill neighborhoods (SE, West, and North) in the Central Toronto borough, which share the same venue ranking.

#### Cluster 4

In [85]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[0,1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
5,Central Toronto,Moore Park,Playground,Gym,Zoo,Ethiopian Restaurant,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store


Cluster 4 has one neighborhood (Moore Park) with playground, gym, and zoo as the three most common venues.

#### Cluster 5

In [86]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[0,1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
56,West Toronto,Swansea,Dance Studio,Skating Rink,Pilates Studio,Park,Bus Line,Electronics Store,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant


Cluster 5 has one neighborhood (Swansea) with dance studio, skating rink, and pilates studio as the three most common venues.

#### Cluster 6

In [87]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 5, toronto_merged.columns[[0,1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
64,Downtown Toronto,Rosedale,Park,Playground,Building,Trail,Zoo,Egyptian Restaurant,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant


Cluster 6 has one neighborhood (Rosedale) with park, playground, and building as the three most common venues.

#### Cluster 7

In [88]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 6, toronto_merged.columns[[0,1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
60,Central Toronto,Lawrence Park,Bus Line,Park,Swim School,Ethiopian Restaurant,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant,Electronics Store


Lawrence Park is the one neighborhood for Cluster 7 with bus line, park, and swim school as the three most common venues.

#### Cluster 8

In [89]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 7, toronto_merged.columns[[0,1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
7,Central Toronto,Deer Park,Coffee Shop,Italian Restaurant,Pub,Sushi Restaurant,Café,Grocery Store,Bagel Shop,Restaurant,Pizza Place,Sandwich Place
12,Downtown Toronto,St. James Town,Coffee Shop,Café,Restaurant,Hotel,Italian Restaurant,Pizza Place,Bakery,Breakfast Spot,Beer Bar,Cosmetics Shop
13,Downtown Toronto,Harbourfront,Coffee Shop,Restaurant,Hotel,Café,Italian Restaurant,Pizza Place,Gym,Music Venue,Sporting Goods Shop,Sushi Restaurant
14,Downtown Toronto,Regent Park,Coffee Shop,Thai Restaurant,Beer Store,Animal Shelter,Sushi Restaurant,Fast Food Restaurant,Auto Dealership,Food Truck,Restaurant,Moving Target
15,Downtown Toronto,Ryerson,Coffee Shop,Clothing Store,Fast Food Restaurant,Café,Middle Eastern Restaurant,Ramen Restaurant,Japanese Restaurant,Sandwich Place,Tea Room,Burger Joint
16,Downtown Toronto,Garden District,Coffee Shop,Clothing Store,Fast Food Restaurant,Cosmetics Shop,Restaurant,Hotel,Middle Eastern Restaurant,Theater,Tea Room,Plaza
17,Downtown Toronto,King,Coffee Shop,Restaurant,Hotel,Italian Restaurant,Café,Gastropub,American Restaurant,Japanese Restaurant,Gym,Deli / Bodega
18,Downtown Toronto,Adelaide,Coffee Shop,Hotel,Gastropub,Restaurant,Café,Italian Restaurant,Cosmetics Shop,Breakfast Spot,Japanese Restaurant,American Restaurant
19,Downtown Toronto,Richmond,Coffee Shop,Cosmetics Shop,Gastropub,Hotel,Italian Restaurant,Restaurant,Clothing Store,Café,Plaza,American Restaurant
20,Downtown Toronto,Harbourfront East,Coffee Shop,Restaurant,Hotel,Café,Italian Restaurant,Pizza Place,Gym,Music Venue,Sporting Goods Shop,Sushi Restaurant


In Cluster 8, Coffee Shop is the most common venue in every neighborhood shown, and Café frequently appears among the top five.
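
A statement about a cluster's leading venues can be checked by counting values in the ranked columns. The sketch below does this with `value_counts` for the ten Cluster 8 rows shown above; the frame `cluster8` is rebuilt by hand from that output and covers only the first two venue columns.

```python
import pandas as pd

# First and second most common venues copied from the Cluster 8 rows shown above
cluster8 = pd.DataFrame({
    'Neighborhood': ['Deer Park', 'St. James Town', 'Harbourfront', 'Regent Park',
                     'Ryerson', 'Garden District', 'King', 'Adelaide',
                     'Richmond', 'Harbourfront East'],
    '1st Most Common Venue': ['Coffee Shop'] * 10,
    '2nd Most Common Venue': ['Italian Restaurant', 'Café', 'Restaurant',
                              'Thai Restaurant', 'Clothing Store', 'Clothing Store',
                              'Restaurant', 'Hotel', 'Cosmetics Shop', 'Restaurant'],
})

# How often each category occupies the first and second rank
first_counts = cluster8['1st Most Common Venue'].value_counts()
second_counts = cluster8['2nd Most Common Venue'].value_counts()

print(first_counts)
print(second_counts)
```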

#### Cluster 9

In [90]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 8, toronto_merged.columns[[0,1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
21,Downtown Toronto,Toronto Islands,American Restaurant,Playground,Beer Garden,Scenic Lookout,Beach,Farm,Event Space,Dumpling Restaurant,Eastern European Restaurant,Egyptian Restaurant


Cluster 9 is the Toronto Islands, which lie off the coast of Downtown Toronto, with American restaurant, playground, and beer garden as the three most common venues.

#### Cluster 10

In [91]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 9, toronto_merged.columns[[0,1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
58,East Toronto,The Beaches,Other Great Outdoors,Pub,Health Food Store,Trail,Egyptian Restaurant,Doctor's Office,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant


Cluster 10 is The Beaches, located in the East Toronto borough, with Other Great Outdoors, pub, and health food store as the three most common venues.