# Toronto Neighborhoods

First we need to install any packages that haven't been installed yet. Once they are installed, I comment the lines out.

In [1]:
# import sys
# !{sys.executable} -m pip install bs4
# !{sys.executable} -m pip install lxml

Then import the libraries required to parse the HTML page.

In [2]:
from bs4 import BeautifulSoup
from lxml import etree
import requests

All web-parsing activity goes into the cell below.

In [3]:
quote_page = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'
source = requests.get(quote_page).text
soup = BeautifulSoup(source, "lxml")
table = soup.find('table',{'class':'wikitable sortable'})

Now the data from the HTML table should be put into lists.

In [4]:
PostalCode = []
Borough = []
Neighborhood = []
for tr in table.find_all('tr'):
    tds = tr.find_all('td')
    if not tds:
        continue  # header rows contain only <th> cells
    PC, B, N = [td.text.strip() for td in tds[:3]]
    PostalCode.append(PC)
    Borough.append(B)
    Neighborhood.append(N)
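
The loop above unpacks exactly three cells per row, so a row with fewer cells would raise a `ValueError`. As a minimal sketch (not the notebook's own code), the same logic can be wrapped in a function that skips short rows. The name `split_rows` and the sample rows are hypothetical, and the function works on plain lists of cell strings rather than on BeautifulSoup tags.

```python
def split_rows(rows):
    """Split table rows (lists of cell strings) into three parallel lists.

    Rows with fewer than three cells, such as a header row, are skipped.
    """
    postal_codes, boroughs, neighborhoods = [], [], []
    for cells in rows:
        if len(cells) < 3:
            continue
        pc, b, n = (cell.strip() for cell in cells[:3])
        postal_codes.append(pc)
        boroughs.append(b)
        neighborhoods.append(n)
    return postal_codes, boroughs, neighborhoods


# Hypothetical sample rows standing in for the scraped table
sample_rows = [
    [],                                   # header row: no <td> cells
    ["M1B\n", "Scarborough", "Rouge\n"],  # a normal row with stray whitespace
    ["M9Z"],                              # malformed row, skipped
]
columns = split_rows(sample_rows)
```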

Pandas needs to be imported so that we can put the table data into a dataframe.

In [5]:
import pandas as pd
df = pd.DataFrame({'PostalCode': PostalCode,
                  'Borough': Borough,
                  'Neighborhood': Neighborhood})

The dataframe should match the requirements of the submission, which are explained inline below.

In [6]:
df = df[df.Borough != 'Not assigned'] # To ignore cells with a borough that are 'Not assigned'
nameless = df.Neighborhood == 'Not assigned' # To identify rows that have a Borough but do not have a neighborhood assigned
df.loc[nameless, 'Neighborhood'] = df.loc[nameless, 'Borough'] # To replace 'Not assigned' neighborhoods with the borough name
neighborhoods = df.groupby(['PostalCode','Borough'])['Neighborhood'].apply(', '.join).reset_index() # To group the PostalCode and Boroughs together and concatenate the neighborhoods
neighborhoods

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M1B,Scarborough,"Rouge, Malvern"
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union"
2,M1E,Scarborough,"Guildwood, Morningside, West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae
5,M1J,Scarborough,Scarborough Village
6,M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park"
7,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge"
8,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West"
9,M1N,Scarborough,"Birch Cliff, Cliffside West"
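
To see what the three cleaning steps do, here is a small sketch on a toy dataframe. The rows are made up for illustration and are not taken from the scraped table; only the column names match the notebook.

```python
import pandas as pd

# Toy data (hypothetical rows) with the same columns as df
toy = pd.DataFrame({
    "PostalCode": ["M1A", "M1B", "M1B", "M7A"],
    "Borough": ["Not assigned", "Scarborough", "Scarborough", "Queen's Park"],
    "Neighborhood": ["Not assigned", "Rouge", "Malvern", "Not assigned"],
})

# 1. Drop rows whose borough is 'Not assigned'
toy = toy[toy.Borough != "Not assigned"].copy()

# 2. Where the neighborhood is 'Not assigned', reuse the borough name
nameless = toy.Neighborhood == "Not assigned"
toy.loc[nameless, "Neighborhood"] = toy.loc[nameless, "Borough"]

# 3. One row per postal code, neighborhoods joined with commas
toy_grouped = (
    toy.groupby(["PostalCode", "Borough"])["Neighborhood"]
    .apply(", ".join)
    .reset_index()
)
print(toy_grouped)
```

The `.copy()` after filtering is a small addition that avoids pandas' chained-assignment warning when the filtered frame is modified in step 2.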


Finally, the below code retrieves the shape of our dataframe, "neighborhoods".

In [7]:
neighborhoods.shape

(103, 3)

We can now map latitude and longitude coordinates to each of the 103 postal codes.

In [8]:
# !{sys.executable} -m pip install geocoder
# decided not to use geocoder because it was taking too long to set up and work

In [9]:
coordinates = pd.read_csv("https://cocl.us/Geospatial_data")

In [10]:
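# Positional assignment: this relies on the CSV rows being in the same postal-code order as `neighborhoods`
neighborhoods['Latitude'] = coordinates['Latitude']
neighborhoods['Longitude'] = coordinates['Longitude']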
neighborhoods

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
5,M1J,Scarborough,Scarborough Village,43.744734,-79.239476
6,M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park",43.727929,-79.262029
7,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge",43.711112,-79.284577
8,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West",43.716316,-79.239476
9,M1N,Scarborough,"Birch Cliff, Cliffside West",43.692657,-79.264848
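
Copying the coordinate columns across works only while both tables list the postal codes in the same order. A merge on the postal code does not depend on row order. The sketch below uses toy frames; the column name `Postal Code` for the coordinates table is an assumption about the CSV, so check `coordinates.columns` before relying on it.

```python
import pandas as pd

# Toy stand-ins for `neighborhoods` and `coordinates`
hoods = pd.DataFrame({
    "PostalCode": ["M1B", "M1C"],
    "Borough": ["Scarborough", "Scarborough"],
})
coords = pd.DataFrame({
    "Postal Code": ["M1C", "M1B"],  # assumed column name, deliberately out of order
    "Latitude": [43.784535, 43.806686],
    "Longitude": [-79.160497, -79.194353],
})

# Match rows by key rather than by position
merged = (
    hoods.merge(coords, left_on="PostalCode", right_on="Postal Code", how="left")
    .drop(columns="Postal Code")
)
print(merged)
```

With `how="left"`, a postal code missing from the coordinates table would show up as `NaN` rather than silently shifting every later row.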


# Third Question
Here I will cluster the neighborhoods together. I'll focus on Etobicoke, because it sounds interesting and I can't pronounce it. First, the relevant libraries should be imported for clustering and visualization.

In [11]:
#!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

# !conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

Now we'll do some preparation for viewing the neighborhoods.

In [12]:
etobicoke_data = neighborhoods[neighborhoods['Borough'] == 'Etobicoke'].reset_index(drop=True)
etobicoke_data.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M8V,Etobicoke,"Humber Bay Shores, Mimico South, New Toronto",43.605647,-79.501321
1,M8W,Etobicoke,"Alderwood, Long Branch",43.602414,-79.543484
2,M8X,Etobicoke,"The Kingsway, Montgomery Road, Old Mill North",43.653654,-79.506944
3,M8Y,Etobicoke,"Humber Bay, King's Mill Park, Kingsway Park So...",43.636258,-79.498509
4,M8Z,Etobicoke,"Kingsway Park South West, Mimico NW, The Queen...",43.628841,-79.520999


In [13]:
address = 'Toronto, Canada'

geolocator = Nominatim(user_agent="ca_explorer")
TR_location = geolocator.geocode(address)
TR_latitude = TR_location.latitude
TR_longitude = TR_location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(TR_latitude, TR_longitude))

The geographical coordinates of Toronto are 43.653963, -79.387207.


In [14]:
# create map of Toronto using latitude and longitude values
map_toronto = folium.Map(location=[TR_latitude, TR_longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(etobicoke_data['Latitude'], etobicoke_data['Longitude'], etobicoke_data['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

Now we can pull some information from Foursquare to begin clustering the neighborhoods.

In [15]:
CLIENT_ID = '4GSJSOHDZKYYB0XYCN0K30QT1SNPDDNGE3YY0EZJOJAGOKYF' # your Foursquare ID
CLIENT_SECRET = '4V0KU1O3BADD0UX0ZZ4FFCH5BPSFRPIUACWN3CKJRSSBNLSD' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: 4GSJSOHDZKYYB0XYCN0K30QT1SNPDDNGE3YY0EZJOJAGOKYF
CLIENT_SECRET:4V0KU1O3BADD0UX0ZZ4FFCH5BPSFRPIUACWN3CKJRSSBNLSD
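
Hard-coding and printing API credentials makes them visible to anyone who sees the notebook. One common alternative, sketched here, is to read them from environment variables. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `load_foursquare_credentials` are hypothetical choices for this example, not names defined by Foursquare.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret) read from a mapping of environment variables."""
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError("Missing environment variable: {}".format(missing)) from None


# Demonstration with a stand-in mapping; in the notebook the function
# would be called with no arguments so that it reads os.environ.
demo_env = {"FOURSQUARE_CLIENT_ID": "my-id", "FOURSQUARE_CLIENT_SECRET": "my-secret"}
CLIENT_ID_DEMO, CLIENT_SECRET_DEMO = load_foursquare_credentials(demo_env)
```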


In [16]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
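
`getNearbyVenues` indexes straight into the JSON response, so a response without groups, or a venue without categories, would raise an exception. The sketch below shows one way to extract the same fields more defensively. It assumes only the response shape that the function above already relies on; the helper name `extract_venues` and the sample payload are hypothetical.

```python
def extract_venues(payload):
    """Return (name, lat, lng, category) tuples from an explore-style response dict.

    Missing groups yield an empty list; venues without categories get None.
    """
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    venues = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        category = categories[0]["name"] if categories else None
        venues.append((
            venue["name"],
            venue["location"]["lat"],
            venue["location"]["lng"],
            category,
        ))
    return venues


# Hypothetical payload shaped like the indexing used in getNearbyVenues
sample = {"response": {"groups": [{"items": [
    {"venue": {"name": "LCBO",
               "location": {"lat": 43.602281, "lng": -79.499302},
               "categories": [{"name": "Liquor Store"}]}},
    {"venue": {"name": "Unnamed Spot",
               "location": {"lat": 43.6, "lng": -79.5},
               "categories": []}},
]}]}}
print(extract_venues(sample))
```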

In [17]:
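LIMIT = 100 # maximum number of venues per request; read as a global inside getNearbyVenues
radius = 500 # search radius in meters

etobicoke_venues = getNearbyVenues(names=etobicoke_data['Neighborhood'],
                                   latitudes=etobicoke_data['Latitude'],
                                   longitudes=etobicoke_data['Longitude'],
                                   radius=radius
                                  )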

Humber Bay Shores, Mimico South, New Toronto
Alderwood, Long Branch
The Kingsway, Montgomery Road, Old Mill North
Humber Bay, King's Mill Park, Kingsway Park South East, Mimico NE, Old Mill South, The Queensway East, Royal York South East, Sunnylea
Kingsway Park South West, Mimico NW, The Queensway West, Royal York South West, South of Bloor
Islington Avenue
Cloverdale, Islington, Martin Grove, Princess Gardens, West Deane Park
Bloordale Gardens, Eringate, Markland Wood, Old Burnhamthorpe
Westmount
Kingsview Village, Martin Grove Gardens, Richview Gardens, St. Phillips
Albion Gardens, Beaumond Heights, Humbergate, Jamestown, Mount Olive, Silverstone, South Steeles, Thistletown
Northwest


In [18]:
print(etobicoke_venues.shape)
etobicoke_venues.head()

(71, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Humber Bay Shores, Mimico South, New Toronto",43.605647,-79.501321,LCBO,43.602281,-79.499302,Liquor Store
1,"Humber Bay Shores, Mimico South, New Toronto",43.605647,-79.501321,Domino's Pizza,43.601676,-79.500908,Pizza Place
2,"Humber Bay Shores, Mimico South, New Toronto",43.605647,-79.501321,New Toronto Fish & Chips,43.601849,-79.503281,Restaurant
3,"Humber Bay Shores, Mimico South, New Toronto",43.605647,-79.501321,Delicia Bakery & Pastry,43.601403,-79.503012,Bakery
4,"Humber Bay Shores, Mimico South, New Toronto",43.605647,-79.501321,Lucky Dice Restaurant,43.601392,-79.503056,Café


In [19]:
etobicoke_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
"Albion Gardens, Beaumond Heights, Humbergate, Jamestown, Mount Olive, Silverstone, South Steeles, Thistletown",11,11,11,11,11,11
"Alderwood, Long Branch",10,10,10,10,10,10
"Bloordale Gardens, Eringate, Markland Wood, Old Burnhamthorpe",8,8,8,8,8,8
"Cloverdale, Islington, Martin Grove, Princess Gardens, West Deane Park",1,1,1,1,1,1
"Humber Bay Shores, Mimico South, New Toronto",13,13,13,13,13,13
"Humber Bay, King's Mill Park, Kingsway Park South East, Mimico NE, Old Mill South, The Queensway East, Royal York South East, Sunnylea",1,1,1,1,1,1
"Kingsview Village, Martin Grove Gardens, Richview Gardens, St. Phillips",4,4,4,4,4,4
"Kingsway Park South West, Mimico NW, The Queensway West, Royal York South West, South of Bloor",11,11,11,11,11,11
Northwest,3,3,3,3,3,3
"The Kingsway, Montgomery Road, Old Mill North",3,3,3,3,3,3


In [20]:
print('There are {} unique categories.'.format(len(etobicoke_venues['Venue Category'].unique())))

There are 40 unique categories.


In [21]:
# one hot encoding
etobicoke_onehot = pd.get_dummies(etobicoke_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
etobicoke_onehot['Neighborhood'] = etobicoke_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [etobicoke_onehot.columns[-1]] + list(etobicoke_onehot.columns[:-1])
etobicoke_onehot = etobicoke_onehot[fixed_columns]

etobicoke_onehot.head()

Unnamed: 0,Neighborhood,American Restaurant,Athletics & Sports,Bakery,Bank,Bar,Baseball Field,Beer Store,Burger Joint,Bus Line,...,Rental Car Location,Restaurant,River,Sandwich Place,Shopping Plaza,Skating Rink,Supplement Shop,Thrift / Vintage Store,Video Store,Wings Joint
0,"Humber Bay Shores, Mimico South, New Toronto",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,"Humber Bay Shores, Mimico South, New Toronto",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,"Humber Bay Shores, Mimico South, New Toronto",0,0,0,0,0,0,0,0,0,...,0,1,0,0,0,0,0,0,0,0
3,"Humber Bay Shores, Mimico South, New Toronto",0,0,1,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,"Humber Bay Shores, Mimico South, New Toronto",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [22]:
etobicoke_grouped = etobicoke_onehot.groupby('Neighborhood').mean().reset_index()
etobicoke_grouped

Unnamed: 0,Neighborhood,American Restaurant,Athletics & Sports,Bakery,Bank,Bar,Baseball Field,Beer Store,Burger Joint,Bus Line,...,Rental Car Location,Restaurant,River,Sandwich Place,Shopping Plaza,Skating Rink,Supplement Shop,Thrift / Vintage Store,Video Store,Wings Joint
0,"Albion Gardens, Beaumond Heights, Humbergate, ...",0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,...,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.090909,0.0
1,"Alderwood, Long Branch",0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.1,0.0,0.1,0.0,0.0,0.0,0.0
2,"Bloordale Gardens, Eringate, Markland Wood, Ol...",0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,...,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0
3,"Cloverdale, Islington, Martin Grove, Princess ...",0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"Humber Bay Shores, Mimico South, New Toronto",0.076923,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.076923,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0
5,"Humber Bay, King's Mill Park, Kingsway Park So...",0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,"Kingsview Village, Martin Grove Gardens, Richv...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,"Kingsway Park South West, Mimico NW, The Queen...",0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.090909,0.0,...,0.0,0.0,0.0,0.090909,0.0,0.0,0.090909,0.090909,0.0,0.090909
8,Northwest,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,...,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,"The Kingsway, Montgomery Road, Old Mill North",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0
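
Taking the mean of one-hot columns per neighborhood gives the share of that neighborhood's venues in each category, so every row sums to 1. A small sketch with made-up venues (the neighborhoods `A` and `B` are placeholders):

```python
import pandas as pd

# Toy venue list: neighborhood A has four venues, B has one
venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "B"],
    "Venue Category": ["Pizza Place", "Pizza Place", "Gym", "Park", "Bank"],
})

# One column per category, 1 where the venue belongs to it
onehot = pd.get_dummies(venues[["Venue Category"]], prefix="", prefix_sep="", dtype=int)
onehot.insert(0, "Neighborhood", venues["Neighborhood"])

# Mean of 0/1 indicators = fraction of venues in each category
freq = onehot.groupby("Neighborhood").mean().reset_index()
print(freq)
```

Passing `dtype=int` keeps the indicators as 0/1 integers regardless of the pandas version's default.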


In [23]:
num_top_venues = 5

for hood in etobicoke_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = etobicoke_grouped[etobicoke_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Albion Gardens, Beaumond Heights, Humbergate, Jamestown, Mount Olive, Silverstone, South Steeles, Thistletown----
                 venue  freq
0        Grocery Store  0.18
1          Coffee Shop  0.09
2       Sandwich Place  0.09
3             Pharmacy  0.09
4  Fried Chicken Joint  0.09


----Alderwood, Long Branch----
                venue  freq
0         Pizza Place   0.2
1  Athletics & Sports   0.1
2                 Pub   0.1
3                 Gym   0.1
4        Skating Rink   0.1


----Bloordale Gardens, Eringate, Markland Wood, Old Burnhamthorpe----
            venue  freq
0            Park  0.12
1        Pharmacy  0.12
2     Pizza Place  0.12
3      Beer Store  0.12
4  Shopping Plaza  0.12


----Cloverdale, Islington, Martin Grove, Princess Gardens, West Deane Park----
                       venue  freq
0                       Bank   1.0
1        American Restaurant   0.0
2        Rental Car Location   0.0
3  Middle Eastern Restaurant   0.0
4          Mobile Phone Shop   0.0


## Well that's interesting
The postal code associated with Humber Bay etc. has only a baseball field. That's it. Similarly, the one associated with Cloverdale has only a bank. I can imagine each will receive its own cluster later. This suggests either a lack of Foursquare use in Etobicoke or, more simply, that there is just not much around these areas.
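
A rough way to see why a single-venue neighborhood is likely to end up alone: its frequency vector puts all its weight on one category, so it sits far from neighborhoods whose venues are spread over several categories. The sketch below compares Euclidean distances, the metric k-means works with, on made-up frequency vectors; the numbers are illustrative only.

```python
import math

# Hypothetical frequency vectors over five categories
bank_only = [1.0, 0.0, 0.0, 0.0, 0.0]    # all venues in one category
mixed_a = [0.0, 0.5, 0.25, 0.25, 0.0]    # venues spread over several categories
mixed_b = [0.0, 0.25, 0.5, 0.0, 0.25]

d_mixed = math.dist(mixed_a, mixed_b)    # distance between two mixed neighborhoods
d_bank = math.dist(bank_only, mixed_a)   # distance from the single-venue one

print("mixed vs mixed: {:.3f}".format(d_mixed))
print("bank-only vs mixed: {:.3f}".format(d_bank))
```

In this toy case the single-venue vector is more than twice as far from a mixed neighborhood as the two mixed neighborhoods are from each other.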

Press on!

In [24]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
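
Because `return_most_common_venues` always returns `num_top_venues` entries, a neighborhood with a single venue has its remaining slots filled by categories whose frequency is zero. A possible variant, sketched here as an alternative rather than a replacement, keeps only categories that actually occur. The name `top_nonzero_venues` and the sample series are hypothetical.

```python
import pandas as pd


def top_nonzero_venues(frequencies, num_top_venues):
    """Return up to num_top_venues category names with frequency > 0, most frequent first."""
    observed = frequencies[frequencies > 0]
    ranked = observed.sort_values(ascending=False)
    return ranked.index[:num_top_venues].tolist()


# Hypothetical frequency rows (category -> share of venues)
bank_only = pd.Series({"Bank": 1.0, "Wings Joint": 0.0, "Chinese Restaurant": 0.0})
mixed = pd.Series({"Bank": 0.0, "Gym": 0.25, "Pizza Place": 0.5, "Park": 0.0})

print(top_nonzero_venues(bank_only, 10))
print(top_nonzero_venues(mixed, 3))
```

The result can be shorter than `num_top_venues`, so a caller filling a fixed-width table would need to pad it.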

In [25]:
import numpy as np

num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # beyond 'st', 'nd', 'rd' fall back to 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = etobicoke_grouped['Neighborhood']

for ind in np.arange(etobicoke_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(etobicoke_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Albion Gardens, Beaumond Heights, Humbergate, ...",Grocery Store,Fried Chicken Joint,Video Store,Sandwich Place,Liquor Store,Beer Store,Coffee Shop,Fast Food Restaurant,Pizza Place,Pharmacy
1,"Alderwood, Long Branch",Pizza Place,Gym,Sandwich Place,Coffee Shop,Pharmacy,Pub,Pool,Skating Rink,Athletics & Sports,Bar
2,"Bloordale Gardens, Eringate, Markland Wood, Ol...",Café,Pizza Place,Convenience Store,Shopping Plaza,Liquor Store,Beer Store,Park,Pharmacy,Drugstore,Discount Store
3,"Cloverdale, Islington, Martin Grove, Princess ...",Bank,Wings Joint,Chinese Restaurant,Fried Chicken Joint,Flower Shop,Fast Food Restaurant,Drugstore,Discount Store,Convenience Store,Coffee Shop
4,"Humber Bay Shores, Mimico South, New Toronto",Gym,Pizza Place,Bakery,Café,Fast Food Restaurant,Flower Shop,Fried Chicken Joint,Hobby Shop,Liquor Store,Pharmacy
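
The column labels above are built with a three-element suffix list and a fallback to `th`, which is fine for ten columns but would label an eleventh column correctly only by accident and a twenty-first as `21th`. A small helper, sketched here under the hypothetical name `ordinal`, handles the general case.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "{}{}".format(n, suffix)


num_top_venues = 10
columns = ["Neighborhood"] + [
    "{} Most Common Venue".format(ordinal(i)) for i in range(1, num_top_venues + 1)
]
print(columns)
```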


In [26]:
# set number of clusters
kclusters = 5

etobicoke_grouped_clustering = etobicoke_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(etobicoke_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([3, 3, 3, 2, 3, 0, 3, 3, 4, 1], dtype=int32)

In [27]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

etobicoke_merged = etobicoke_data

# join etobicoke_data with neighborhoods_venues_sorted to add the cluster label and top venues for each neighborhood
etobicoke_merged = etobicoke_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')
# drop neighborhoods with no venues: they have no match in the join, so their cluster label is NaN
etobicoke_merged = etobicoke_merged[np.isfinite(etobicoke_merged['Cluster Labels'])]

etobicoke_merged['Cluster Labels'] = etobicoke_merged['Cluster Labels'].astype('int64')

etobicoke_merged # check the last columns!

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M8V,Etobicoke,"Humber Bay Shores, Mimico South, New Toronto",43.605647,-79.501321,3,Gym,Pizza Place,Bakery,Café,Fast Food Restaurant,Flower Shop,Fried Chicken Joint,Hobby Shop,Liquor Store,Pharmacy
1,M8W,Etobicoke,"Alderwood, Long Branch",43.602414,-79.543484,3,Pizza Place,Gym,Sandwich Place,Coffee Shop,Pharmacy,Pub,Pool,Skating Rink,Athletics & Sports,Bar
2,M8X,Etobicoke,"The Kingsway, Montgomery Road, Old Mill North",43.653654,-79.506944,1,Pool,River,Park,Wings Joint,Fast Food Restaurant,Drugstore,Discount Store,Convenience Store,Coffee Shop,Chinese Restaurant
3,M8Y,Etobicoke,"Humber Bay, King's Mill Park, Kingsway Park So...",43.636258,-79.498509,0,Baseball Field,Wings Joint,Chinese Restaurant,Fried Chicken Joint,Flower Shop,Fast Food Restaurant,Drugstore,Discount Store,Convenience Store,Coffee Shop
4,M8Z,Etobicoke,"Kingsway Park South West, Mimico NW, The Queen...",43.628841,-79.520999,3,Wings Joint,Grocery Store,Bakery,Burger Joint,Convenience Store,Discount Store,Fast Food Restaurant,Gym,Sandwich Place,Thrift / Vintage Store
6,M9B,Etobicoke,"Cloverdale, Islington, Martin Grove, Princess ...",43.650943,-79.554724,2,Bank,Wings Joint,Chinese Restaurant,Fried Chicken Joint,Flower Shop,Fast Food Restaurant,Drugstore,Discount Store,Convenience Store,Coffee Shop
7,M9C,Etobicoke,"Bloordale Gardens, Eringate, Markland Wood, Ol...",43.643515,-79.577201,3,Café,Pizza Place,Convenience Store,Shopping Plaza,Liquor Store,Beer Store,Park,Pharmacy,Drugstore,Discount Store
8,M9P,Etobicoke,Westmount,43.696319,-79.532242,3,Pizza Place,Intersection,Sandwich Place,Middle Eastern Restaurant,Coffee Shop,Chinese Restaurant,Wings Joint,Fast Food Restaurant,Drugstore,Discount Store
9,M9R,Etobicoke,"Kingsview Village, Martin Grove Gardens, Richv...",43.688905,-79.554724,3,Mobile Phone Shop,Park,Bus Line,Pizza Place,Wings Joint,Chinese Restaurant,Fast Food Restaurant,Drugstore,Discount Store,Convenience Store
10,M9V,Etobicoke,"Albion Gardens, Beaumond Heights, Humbergate, ...",43.739416,-79.588437,3,Grocery Store,Fried Chicken Joint,Video Store,Sandwich Place,Liquor Store,Beer Store,Coffee Shop,Fast Food Restaurant,Pizza Place,Pharmacy
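
The merged table skips index 5 because that neighborhood has no row in the venue tables, so the join leaves its cluster label as `NaN` and the filter removes it. The toy sketch below reproduces that behaviour and records which neighborhoods were dropped; the frames and names are placeholders.

```python
import pandas as pd

# Toy stand-ins: neighborhood B returned no venues, so it has no cluster label
areas = pd.DataFrame({"Neighborhood": ["A", "B", "C"], "Latitude": [43.60, 43.65, 43.70]})
labels = pd.DataFrame({"Neighborhood": ["A", "C"], "Cluster Labels": [3, 0]})

# Left join on the neighborhood name, as in the cell above
merged = areas.join(labels.set_index("Neighborhood"), on="Neighborhood")

# Record which neighborhoods have no label before dropping them
missing = merged.loc[merged["Cluster Labels"].isna(), "Neighborhood"].tolist()

# Drop unlabeled rows and restore the integer dtype lost to NaN
merged = merged.dropna(subset=["Cluster Labels"]).astype({"Cluster Labels": "int64"})
print(missing)
print(merged)
```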


In [28]:
# create map
map_clusters = folium.Map(location=[TR_latitude, TR_longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(etobicoke_merged['Latitude'], etobicoke_merged['Longitude'], etobicoke_merged['Neighborhood'], etobicoke_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
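
In the cell above, `rainbow[cluster-1]` gives cluster 0 the last color through Python's negative indexing. That still assigns one distinct color per cluster, but indexing by the label directly is easier to read. The sketch below shows that idea with a palette built from the standard-library `colorsys` module instead of matplotlib; `cluster_palette` is a hypothetical helper and the label list is made up.

```python
import colorsys


def cluster_palette(k):
    """Return k hex colors with evenly spaced hues, one per cluster label 0..k-1."""
    palette = []
    for i in range(k):
        r, g, b = colorsys.hsv_to_rgb(i / k, 0.8, 0.9)
        palette.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return palette


palette = cluster_palette(5)

# Hypothetical cluster labels; each marker color is looked up by label directly
labels = [3, 3, 1, 0, 2, 4]
marker_colors = [palette[label] for label in labels]
print(marker_colors)
```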

In [29]:
etobicoke_merged.loc[etobicoke_merged['Cluster Labels'] == 0, etobicoke_merged.columns[[1] + list(range(5, etobicoke_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,Etobicoke,0,Baseball Field,Wings Joint,Chinese Restaurant,Fried Chicken Joint,Flower Shop,Fast Food Restaurant,Drugstore,Discount Store,Convenience Store,Coffee Shop


In [30]:
etobicoke_merged.loc[etobicoke_merged['Cluster Labels'] == 1, etobicoke_merged.columns[[1] + list(range(5, etobicoke_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Etobicoke,1,Pool,River,Park,Wings Joint,Fast Food Restaurant,Drugstore,Discount Store,Convenience Store,Coffee Shop,Chinese Restaurant


In [31]:
etobicoke_merged.loc[etobicoke_merged['Cluster Labels'] == 2, etobicoke_merged.columns[[1] + list(range(5, etobicoke_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,Etobicoke,2,Bank,Wings Joint,Chinese Restaurant,Fried Chicken Joint,Flower Shop,Fast Food Restaurant,Drugstore,Discount Store,Convenience Store,Coffee Shop


In [32]:
etobicoke_merged.loc[etobicoke_merged['Cluster Labels'] == 3, etobicoke_merged.columns[[1] + list(range(5, etobicoke_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Etobicoke,3,Gym,Pizza Place,Bakery,Café,Fast Food Restaurant,Flower Shop,Fried Chicken Joint,Hobby Shop,Liquor Store,Pharmacy
1,Etobicoke,3,Pizza Place,Gym,Sandwich Place,Coffee Shop,Pharmacy,Pub,Pool,Skating Rink,Athletics & Sports,Bar
4,Etobicoke,3,Wings Joint,Grocery Store,Bakery,Burger Joint,Convenience Store,Discount Store,Fast Food Restaurant,Gym,Sandwich Place,Thrift / Vintage Store
7,Etobicoke,3,Café,Pizza Place,Convenience Store,Shopping Plaza,Liquor Store,Beer Store,Park,Pharmacy,Drugstore,Discount Store
8,Etobicoke,3,Pizza Place,Intersection,Sandwich Place,Middle Eastern Restaurant,Coffee Shop,Chinese Restaurant,Wings Joint,Fast Food Restaurant,Drugstore,Discount Store
9,Etobicoke,3,Mobile Phone Shop,Park,Bus Line,Pizza Place,Wings Joint,Chinese Restaurant,Fast Food Restaurant,Drugstore,Discount Store,Convenience Store
10,Etobicoke,3,Grocery Store,Fried Chicken Joint,Video Store,Sandwich Place,Liquor Store,Beer Store,Coffee Shop,Fast Food Restaurant,Pizza Place,Pharmacy


In [33]:
etobicoke_merged.loc[etobicoke_merged['Cluster Labels'] == 4, etobicoke_merged.columns[[1] + list(range(5, etobicoke_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
11,Etobicoke,4,Drugstore,Bar,Rental Car Location,Wings Joint,Café,Flower Shop,Fast Food Restaurant,Discount Store,Convenience Store,Coffee Shop


## Well that was interesting
So it seems that clusters 0 and 2 are each defined by a single venue (the Baseball Field and the Bank, respectively). Clear trends aren't easily discernible except for Cluster 3, perhaps because Foursquare returned so few venues for these neighborhoods in Etobicoke. Perhaps a different borough would be more interesting!