# Week 4 and 5 of Data Science Capstone Course
By Stefan Siegert

## <b>(a)</b>: Import required libraries and create PANDAS dataframe from Wikipedia table

Please note: Some of the imported and installed libraries and tools, such as parts of geopy or KMeans, are not required for this specific example. However, since this program structure can also be used for many other tasks where exactly these components would be needed, they are still included here.

Next, the raw data about postal codes in Toronto is read from the Wikipedia page and converted to a pandas dataframe.

The code block ends, like many of the following code blocks, with the option to show the current state of the dataframe. It is deactivated here with a leading comment sign '#', which can easily be removed if necessary.

In [1]:
print('Importing process starts ...')
import pandas as pd             # Pandas library
import numpy as np              # library to handle data in a vectorized manner

# show all columns and rows when a dataframe is displayed
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files
!pip install geopy
#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import requests                             # library to handle requests
from pandas import json_normalize           # transform JSON file into a pandas dataframe (the pandas.io.json import path is deprecated)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.11.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
!pip -q install folium
import folium # map rendering library

print('Libraries imported.')


# Creating the dataframe from Wikipedia page
df_toronto_rough = pd.read_html('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M')[0]
df_toronto = df_toronto_rough.dropna().reset_index(drop=True)
df_toronto.rename(columns={'Postal Code':'PostalCode'}, inplace=True)

#df_toronto                       #   <- Activate by removing the 1st '#' if you want to see the detailed data in the dataframe

Importing process starts ...
Libraries imported.


## <b>(b)</b>: Add latitude and longitude to dataframe

Add two more columns ('Latitude' and 'Longitude') to the df_toronto dataframe, read the *.csv file containing this data, and fill in the values for every row of the dataframe.

In [2]:
df_toronto = df_toronto.reindex(columns = df_toronto.columns.tolist() + ['Latitude','Longitude'])
df_geospatial = pd.read_csv('https://cocl.us/Geospatial_data')
df_geospatial.set_index('Postal Code', inplace=True)
#df_geospatial               #   <- Activate by removing the 1st '#' if you want to see the detailed data in the dataframe
for index, row in df_toronto.iterrows():
    df_toronto.at[index, 'Latitude'] = df_geospatial.at[row['PostalCode'],'Latitude']
    df_toronto.at[index, 'Longitude'] = df_geospatial.at[row['PostalCode'],'Longitude']
#df_toronto                  #   <- Activate by removing the 1st '#' if you want to see the detailed data in the dataframe
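
The row-by-row lookup above can also be expressed as a single join. The following is a minimal sketch on a tiny sample rather than the real CSV: the frame names `df_codes` and `df_geo` are hypothetical, the two rows are copied from the output shown later in this notebook, and the column names are assumed to match the ones used above.

```python
import pandas as pd

# Tiny stand-in for df_toronto (values copied from this notebook's output)
df_codes = pd.DataFrame({
    'PostalCode': ['M3A', 'M6A'],
    'Borough': ['North York', 'North York'],
    'Neighborhood': ['Parkwoods', 'Lawrence Manor, Lawrence Heights'],
})

# Tiny stand-in for the geospatial CSV, which uses 'Postal Code' as its key
df_geo = pd.DataFrame({
    'Postal Code': ['M6A', 'M3A'],
    'Latitude': [43.718518, 43.753259],
    'Longitude': [-79.464763, -79.329656],
})

# A left merge keeps every postal code row and looks up its coordinates
df_merged = df_codes.merge(
    df_geo.rename(columns={'Postal Code': 'PostalCode'}),
    on='PostalCode',
    how='left',
)
print(df_merged)
```

One difference in behavior: a postal code that is missing from the CSV ends up with NaN coordinates here, whereas the `.at` lookup in the loop raises a KeyError.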

## <b>(c)</b>: Create a map of Toronto with all neighborhoods

Please note: What is done here is very similar to the example <b>Segmenting and Clustering Neighborhoods in New York City</b>. For that reason, many parts of the source code as well as comments and markdown were copied and only modified where necessary.

We use the geopy library to get the latitude and longitude values of Toronto. Then we create a map with the neighborhoods superimposed on top.

In [3]:
address = 'Toronto, Canada'

geolocator = Nominatim(user_agent="toronto_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
# print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

# create map of Toronto using latitude and longitude values
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(df_toronto['Latitude'], df_toronto['Longitude'], df_toronto['Borough'], df_toronto['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

## <b>(d)</b>: Connecting to Foursquare API to explore the neighborhoods

Next, we use the Foursquare API to explore the neighborhoods and segment them. Two helper functions are defined first: one that extracts the category of a venue, and one that builds the GET request URL (stored in the variable url) and collects the nearby venues for each neighborhood. Finally, getNearbyVenues is run on all neighborhoods to create a new dataframe called toronto_venues. Since the analysis takes some time, the name of each neighborhood is printed so the user can see the progress.

In [4]:
# Define Foursquare data

CLIENT_ID = 'IPVCIXFHVRETKT3Z30COPDACI24DHCMYVB01GJ3AINSWYMVZ' # your Foursquare ID
CLIENT_SECRET = 'Z1VA15MRZ5EQNGGEXGWS51EGJVR4F0DV402N31JBR3K4G4VF' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

LIMIT = 100
radius = 500

# Let's define 2 functions for later use:

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# function for all the neighborhoods in Toronto.
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues

toronto_venues = getNearbyVenues(names=df_toronto['Neighborhood'],
                                   latitudes=df_toronto['Latitude'],
                                   longitudes=df_toronto['Longitude']
                                  )

#print(toronto_venues.shape)              #   <- Activate by removing the 1st '#' if you want to see the detailed data in the dataframe
#toronto_venues.head()                    #   <- Activate by removing the 1st '#' if you want to see the detailed data in the dataframe

Parkwoods
Victoria Village
Regent Park, Harbourfront
Lawrence Manor, Lawrence Heights
Queen's Park, Ontario Provincial Government
Islington Avenue, Humber Valley Village
Malvern, Rouge
Don Mills
Parkview Hill, Woodbine Gardens
Garden District, Ryerson
Glencairn
West Deane Park, Princess Gardens, Martin Grove, Islington, Cloverdale
Rouge Hill, Port Union, Highland Creek
Don Mills
Woodbine Heights
St. James Town
Humewood-Cedarvale
Eringate, Bloordale Gardens, Old Burnhamthorpe, Markland Wood
Guildwood, Morningside, West Hill
The Beaches
Berczy Park
Caledonia-Fairbanks
Woburn
Leaside
Central Bay Street
Christie
Cedarbrae
Hillcrest Village
Bathurst Manor, Wilson Heights, Downsview North
Thorncliffe Park
Richmond, Adelaide, King
Dufferin, Dovercourt Village
Scarborough Village
Fairview, Henry Farm, Oriole
Northwood Park, York University
East Toronto, Broadview North (Old East York)
Harbourfront East, Union Station, Toronto Islands
Little Portugal, Trinity
Kennedy Park, Ionview, East Birchmo
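
The request URL in the cell above is assembled with `str.format`. As a hedged alternative, the sketch below builds the same query string with the standard library so that every parameter is URL-encoded, and reads the first category without failing on venues that have none. The helper names `build_explore_url` and `first_category_name` are hypothetical, and the credentials are placeholders.

```python
from urllib.parse import urlencode

# Base endpoint taken from the cell above
EXPLORE_ENDPOINT = 'https://api.foursquare.com/v2/venues/explore'

def build_explore_url(client_id, client_secret, version, lat, lng, radius=500, limit=100):
    """Return the explore URL with all query parameters URL-encoded."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    return EXPLORE_ENDPOINT + '?' + urlencode(params)

def first_category_name(venue):
    """Return the first category name of a venue dict, or None if it has none."""
    categories = venue.get('categories') or []
    return categories[0]['name'] if categories else None

# Placeholder credentials, not real ones
url = build_explore_url('YOUR_ID', 'YOUR_SECRET', '20180605', 43.753259, -79.329656)
print(url)
print(first_category_name({'name': 'Some venue', 'categories': []}))
```

Using a helper like `first_category_name` inside the list comprehension would avoid an IndexError for venues whose category list is empty, at the cost of None values in the 'Venue Category' column.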

## <b>(e)</b>: Check data as accessed until now

This step is not strictly necessary, but it makes sense to check whether all data has been added correctly so far. Let's find out how many unique categories the returned venues contain, and how many venues were returned for each neighborhood:

In [5]:
print('There are {} unique categories.'.format(len(toronto_venues['Venue Category'].unique())))
toronto_venues.groupby('Neighborhood').count()

There are 272 unique categories.


| Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
| --- | --- | --- | --- | --- | --- | --- |
| Agincourt | 4 | 4 | 4 | 4 | 4 | 4 |
| Alderwood, Long Branch | 10 | 10 | 10 | 10 | 10 | 10 |
| Bathurst Manor, Wilson Heights, Downsview North | 19 | 19 | 19 | 19 | 19 | 19 |
| Bayview Village | 4 | 4 | 4 | 4 | 4 | 4 |
| Bedford Park, Lawrence Manor East | 22 | 22 | 22 | 22 | 22 | 22 |
| Berczy Park | 55 | 55 | 55 | 55 | 55 | 55 |
| Birch Cliff, Cliffside West | 4 | 4 | 4 | 4 | 4 | 4 |
| Brockton, Parkdale Village, Exhibition Place | 23 | 23 | 23 | 23 | 23 | 23 |
| Business reply mail Processing Centre, South Central Letter Processing Plant Toronto | 17 | 17 | 17 | 17 | 17 | 17 |
| CN Tower, King and Spadina, Railway Lands, Harbourfront West, Bathurst Quay, South Niagara, Island airport | 15 | 15 | 15 | 15 | 15 | 15 |
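
Because no values are missing, `count()` repeats the same number in every column of the table above. A minimal sketch of a more compact check, using a made-up miniature of toronto_venues (the rows are hypothetical), could look like this:

```python
import pandas as pd

# Made-up miniature version of toronto_venues
venues = pd.DataFrame({
    'Neighborhood': ['Agincourt', 'Agincourt', 'Berczy Park', 'Berczy Park', 'Berczy Park'],
    'Venue Category': ['Lounge', 'Skating Rink', 'Coffee Shop', 'Coffee Shop', 'Bakery'],
})

# Number of distinct categories across all venues
n_categories = venues['Venue Category'].nunique()

# One count per neighborhood instead of one per column
venues_per_hood = venues.groupby('Neighborhood').size()

print(n_categories)
print(venues_per_hood)
```

`size()` counts rows per group, while `count()` counts the non-missing values in each column, so the two only differ when a column contains NaN.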


## <b>(f)</b>: Analyze each neighborhood and make ranking

This step is very important. Each neighborhood in our dataframe is connected to the most common venues in its area. Let's group the rows by neighborhood and take the mean of the frequency of occurrence of each category. Finally, let's print each neighborhood along with its top 5 most common venues.

In [6]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
toronto_onehot['Neighborhood'] = toronto_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

# toronto_onehot.head()

toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()
#toronto_grouped

num_top_venues = 5

for hood in toronto_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = toronto_grouped[toronto_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Agincourt----
                       venue  freq
0               Skating Rink  0.25
1                     Lounge  0.25
2             Breakfast Spot  0.25
3  Latin American Restaurant  0.25
4                Yoga Studio  0.00


----Alderwood, Long Branch----
                venue  freq
0         Pizza Place   0.2
1                 Pub   0.1
2        Skating Rink   0.1
3      Sandwich Place   0.1
4  Athletics & Sports   0.1


----Bathurst Manor, Wilson Heights, Downsview North----
              venue  freq
0       Coffee Shop  0.11
1              Bank  0.11
2  Sushi Restaurant  0.05
3     Shopping Mall  0.05
4       Gas Station  0.05


----Bayview Village----
                 venue  freq
0  Japanese Restaurant  0.25
1                 Café  0.25
2                 Bank  0.25
3   Chinese Restaurant  0.25
4                Motel  0.00


----Bedford Park, Lawrence Manor East----
                venue  freq
0  Italian Restaurant  0.09
1      Sandwich Place  0.09
2          Restaurant  0.09
3

                venue  freq
0      Discount Store  0.33
1    Department Store  0.17
2  Chinese Restaurant  0.17
3         Bus Station  0.17
4         Coffee Shop  0.17


----Kensington Market, Chinatown, Grange Park----
                   venue  freq
0                   Café  0.09
1            Coffee Shop  0.05
2     Mexican Restaurant  0.05
3  Vietnamese Restaurant  0.05
4                 Bakery  0.05


----Kingsview Village, St. Phillips, Martin Grove Gardens, Richview Gardens----
                   venue  freq
0            Pizza Place  0.25
1                   Park  0.25
2      Mobile Phone Shop  0.25
3         Sandwich Place  0.25
4  Performing Arts Venue  0.00


----Lawrence Manor, Lawrence Heights----
               venue  freq
0     Clothing Store  0.23
1  Accessories Store  0.15
2      Women's Store  0.08
3        Coffee Shop  0.08
4        Event Space  0.08


----Lawrence Park----
                       venue  freq
0                       Park  0.33
1                Swim Schoo

          venue  freq
0   Coffee Shop  0.18
1           Pub  0.12
2  Liquor Store  0.06
3    Bagel Shop  0.06
4    Restaurant  0.06


----The Annex, North Midtown, Yorkville----
               venue  freq
0        Coffee Shop  0.12
1     Sandwich Place  0.12
2               Café  0.12
3     History Museum  0.04
4  Indian Restaurant  0.04


----The Beaches----
                 venue  freq
0                Trail  0.25
1    Health Food Store  0.25
2                  Pub  0.25
3          Yoga Studio  0.00
4  Monument / Landmark  0.00


----The Danforth West, Riverdale----
                    venue  freq
0        Greek Restaurant  0.21
1      Italian Restaurant  0.07
2             Coffee Shop  0.07
3  Furniture / Home Store  0.05
4          Ice Cream Shop  0.05


----The Kingsway, Montgomery Road, Old Mill North----
                       venue  freq
0                       Park   0.5
1                      River   0.5
2                Yoga Studio   0.0
3  Middle Eastern Restaurant   0.0
4 
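
To see why the grouped means can be read as frequencies, here is a minimal sketch on a made-up sample (the neighborhood names `A` and `B` and the venue rows are hypothetical). Each venue becomes a row of zeros with a single one in its category column, so the mean per neighborhood is the share of venues in that category.

```python
import pandas as pd

# Hypothetical sample: neighborhood A has four venues, B has two
venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Park', 'Park', 'Bank', 'Pub', 'Bank', 'Bank'],
})

# One column per category, 1.0 where the venue belongs to it
onehot = pd.get_dummies(venues['Venue Category'], dtype=float)
onehot.insert(0, 'Neighborhood', venues['Neighborhood'])

# The mean of the indicator columns is the relative frequency per neighborhood
grouped = onehot.groupby('Neighborhood').mean()
print(grouped)
```

In this sample half of the venues in `A` are parks, so its 'Park' value is 0.5, and every row of `grouped` sums to 1.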

## <b>(g)</b>: Transfer this to a new dataframe

Let's put that into a pandas dataframe. First, let's write a function to sort the venues in descending order. Then let's create the new dataframe with the top 10 venues for each neighborhood.

In [7]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

#neighborhoods_venues_sorted.head()
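
The column names above rely on a three-element suffix list plus a fallback to 'th', which works for the ten columns used here. If a larger num_top_venues were ever needed, a small helper could produce the suffixes instead. The function `ordinal` below is a hypothetical illustration, not part of the original notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        # 11th, 12th, 13th and the other teens always take 'th'
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

num_top_venues = 10
columns = ['Neighborhood'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, num_top_venues + 1)
]
print(columns)
```

For num_top_venues = 10 this yields the same column names as the cell above.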

## <b>(h)</b>: Defining our own clusters and clustering the whole data

First, the sorted venue data must be merged with our df_toronto dataframe. Let's call this new dataframe df_toronto_areastyles.

Then, we check for each row whether it matches our cluster criteria and set the cluster value in the 'Cluster Labels' column. If several criteria match, the last matching one determines the label. If a neighborhood does not match any of our clusters, we remove that row from the dataframe. Finally, let's show the remaining dataframe.

In [8]:
kclusters = 4
df_venues_sorted = neighborhoods_venues_sorted.reindex(columns = ['Cluster Labels'] + neighborhoods_venues_sorted.columns.tolist())
df_toronto_areastyles = df_toronto

# join the sorted venue columns onto df_toronto, matching on the neighborhood name
df_toronto_areastyles = df_toronto_areastyles.join(df_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

for index, row in df_toronto_areastyles.iterrows():
    newcluster = False
    
    row_list = row.values.tolist()
    row_list_top3 = row_list[0:9]
    row_list_top6 = row_list[0:12]
    
    # Check for cluster #01: hipster's happiness    
    if (row_list_top3.count('Coffee Shop') == 1)  & (row_list_top3.count('Café') == 1) & ((row_list.count('Sushi Restaurant') == 1) | (row_list.count('Japanese Restaurant') == 1)):
        df_toronto_areastyles.at[index,'Cluster Labels'] = 0
        newcluster = True
    
    # Check for cluster #02: family friendly areas
    if (row_list.count('Park') == 1)  & (row_list.count('Playground') == 1):
        df_toronto_areastyles.at[index,'Cluster Labels'] = 1
        newcluster = True
    
    # Check for cluster #03: workout areas        
    if ((row_list_top6.count('Gym') == 1) | (row_list_top6.count('Gym / Fitness Center') == 1)):
        df_toronto_areastyles.at[index,'Cluster Labels'] = 2
        newcluster = True
    
    # Check for cluster #04: shopping to the max
    a = 0
    for b in range(0,10):
        venue = str(row_list_top6[b])
        if ('Store' in venue) | ('Shop' in venue) | ('shop' in venue) | ('Mall' in venue) | ('store' in venue) | ('Supermarket' in venue):
            if venue != 'Coffee Shop':
                a = a+1
    if a>2:
        df_toronto_areastyles.at[index,'Cluster Labels'] = 3
        newcluster = True
           
    if newcluster == False:
        df_toronto_areastyles = df_toronto_areastyles.drop(index, axis=0)
    
df_toronto_areastyles

|  | PostalCode | Borough | Neighborhood | Latitude | Longitude | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | M3A | North York | Parkwoods | 43.753259 | -79.329656 | 3.0 | Park | Convenience Store | Food & Drink Shop | Women's Store | Doner Restaurant | Dim Sum Restaurant | Diner | Discount Store | Distribution Center | Dog Run |
| 3 | M6A | North York | Lawrence Manor, Lawrence Heights | 43.718518 | -79.464763 | 3.0 | Clothing Store | Accessories Store | Women's Store | Gift Shop | Boutique | Miscellaneous Shop | Event Space | Coffee Shop | Furniture / Home Store | Vietnamese Restaurant |
| 7 | M3B | North York | Don Mills | 43.745906 | -79.352188 | 2.0 | Beer Store | Gym | Restaurant | Japanese Restaurant | Sporting Goods Shop | Asian Restaurant | Coffee Shop | Café | Dim Sum Restaurant | Italian Restaurant |
| 9 | M5B | Downtown Toronto | Garden District, Ryerson | 43.657162 | -79.378937 | 0.0 | Clothing Store | Coffee Shop | Café | Japanese Restaurant | Cosmetics Shop | Restaurant | Bubble Tea Shop | Italian Restaurant | Middle Eastern Restaurant | Theater |
| 13 | M3C | North York | Don Mills | 43.7259 | -79.340923 | 2.0 | Beer Store | Gym | Restaurant | Japanese Restaurant | Sporting Goods Shop | Asian Restaurant | Coffee Shop | Café | Dim Sum Restaurant | Italian Restaurant |
| 24 | M5G | Downtown Toronto | Central Bay Street | 43.657952 | -79.387383 | 0.0 | Coffee Shop | Café | Italian Restaurant | Sandwich Place | Japanese Restaurant | Bubble Tea Shop | Salad Place | Ice Cream Shop | Burger Joint | Thai Restaurant |
| 30 | M5H | Downtown Toronto | Richmond, Adelaide, King | 43.650571 | -79.384568 | 2.0 | Coffee Shop | Café | Restaurant | Clothing Store | Gym | Deli / Bodega | Thai Restaurant | Hotel | Sushi Restaurant | Concert Hall |
| 40 | M3K | North York | Downsview | 43.737473 | -79.464763 | 3.0 | Grocery Store | Park | Liquor Store | Shopping Mall | Food Truck | Discount Store | Hotel | Athletics & Sports | Baseball Field | Bank |
| 42 | M5K | Downtown Toronto | Toronto Dominion Centre, Design Exchange | 43.647177 | -79.381576 | 0.0 | Coffee Shop | Café | Hotel | American Restaurant | Italian Restaurant | Japanese Restaurant | Salad Place | Seafood Restaurant | Restaurant | Deli / Bodega |
| 46 | M3L | North York | Downsview | 43.739015 | -79.506944 | 3.0 | Grocery Store | Park | Liquor Store | Shopping Mall | Food Truck | Discount Store | Hotel | Athletics & Sports | Baseball Field | Bank |
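
The labelling rules can also be read in isolation from the dataframe loop. The sketch below restates them as a plain function over a ranked list of venue categories; the name `classify` and the constant `SHOP_WORDS` are hypothetical. It is a simplification under two assumptions: only the venue columns are inspected, and the shop rule looks at the top four venues, because the loop in the cell above reads the first ten values of the row, which are the six leading non-venue columns and the four most common venues.

```python
SHOP_WORDS = ('Store', 'Shop', 'shop', 'Mall', 'store', 'Supermarket')

def classify(top_venues):
    """Return the cluster label (0-3) for a ranked venue list, or None if no rule matches."""
    top3, top4, top6 = top_venues[:3], top_venues[:4], top_venues[:6]
    label = None

    # Rule 0: coffee shop and café in the top 3, plus sushi or Japanese food anywhere
    if ('Coffee Shop' in top3 and 'Café' in top3
            and ('Sushi Restaurant' in top_venues or 'Japanese Restaurant' in top_venues)):
        label = 0
    # Rule 1: park and playground anywhere in the list
    if 'Park' in top_venues and 'Playground' in top_venues:
        label = 1
    # Rule 2: a gym among the top 6
    if 'Gym' in top6 or 'Gym / Fitness Center' in top6:
        label = 2
    # Rule 3: more than two shop-like venues among the top 4 (coffee shops excluded)
    shops = [v for v in top4
             if v != 'Coffee Shop' and any(word in v for word in SHOP_WORDS)]
    if len(shops) > 2:
        label = 3
    return label

# Venue ranking of Parkwoods, copied from the table above
parkwoods = ['Park', 'Convenience Store', 'Food & Drink Shop', "Women's Store",
             'Doner Restaurant', 'Dim Sum Restaurant', 'Diner', 'Discount Store',
             'Distribution Center', 'Dog Run']
print(classify(parkwoods))  # prints 3
```

Because the rules are checked in order without `elif`, a later match overwrites an earlier one. This is why 'Richmond, Adelaide, King' ends up in cluster 2 in the table above although it also satisfies the coffee rule.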


## <b>(i)</b>: Creating the map with our clustered neighborhoods

Finally, let's visualize the resulting clusters.

In [9]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(df_toronto_areastyles['Latitude'], df_toronto_areastyles['Longitude'], df_toronto_areastyles['Neighborhood'], df_toronto_areastyles['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    cluster2 = int(cluster) # the 'Cluster Labels' column was created filled with NaN, so pandas stores it as float; convert to int for list indexing
    folium.CircleMarker(
        [lat, lon],
        radius=6,
        popup=label,
        color='#606060',
        fill=True,
        fill_color=rainbow[cluster2],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
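
The float cluster labels seen above follow from how the 'Cluster Labels' column was created: `reindex` fills a new column with NaN, and NaN requires a float dtype. A minimal sketch with a made-up three-row frame (the assigned labels are hypothetical) shows the effect and one possible way to convert the column once the unmatched rows are gone:

```python
import pandas as pd

df = pd.DataFrame({'Neighborhood': ['Parkwoods', 'Victoria Village', 'Don Mills']})

# reindex adds the new column filled with NaN, which forces a float dtype
df = df.reindex(columns=['Cluster Labels'] + df.columns.tolist())
print(df['Cluster Labels'].dtype)

# Label two rows and leave one unmatched
df.loc[0, 'Cluster Labels'] = 3
df.loc[2, 'Cluster Labels'] = 2

# After dropping the unmatched rows the column can be cast to int
labelled = df.dropna(subset=['Cluster Labels']).copy()
labelled['Cluster Labels'] = labelled['Cluster Labels'].astype(int)
print(labelled)
```

With integer labels the values could be used directly as list indices, so the per-row `int(cluster)` conversion in the map loop would no longer be needed.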

## <b>(j)</b>: Finally, here are our 4 clusters :-)


#### Cluster 1: Hipster's Happiness
Coffee shops and restaurants in the neighborhood

In [10]:
df_toronto_areastyles.loc[df_toronto_areastyles['Cluster Labels'] == 0, df_toronto_areastyles.columns[[2] + list(range(5, df_toronto_areastyles.shape[1]))]]

|  | Neighborhood | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 9 | Garden District, Ryerson | 0.0 | Clothing Store | Coffee Shop | Café | Japanese Restaurant | Cosmetics Shop | Restaurant | Bubble Tea Shop | Italian Restaurant | Middle Eastern Restaurant | Theater |
| 24 | Central Bay Street | 0.0 | Coffee Shop | Café | Italian Restaurant | Sandwich Place | Japanese Restaurant | Bubble Tea Shop | Salad Place | Ice Cream Shop | Burger Joint | Thai Restaurant |
| 42 | Toronto Dominion Centre, Design Exchange | 0.0 | Coffee Shop | Café | Hotel | American Restaurant | Italian Restaurant | Japanese Restaurant | Salad Place | Seafood Restaurant | Restaurant | Deli / Bodega |
| 81 | Runnymede, Swansea | 0.0 | Café | Pizza Place | Coffee Shop | Sushi Restaurant | Restaurant | Pub | Italian Restaurant | Yoga Studio | Diner | Smoothie Shop |
| 92 | Stn A PO Boxes | 0.0 | Coffee Shop | Café | Seafood Restaurant | Restaurant | Cocktail Bar | Beer Bar | Japanese Restaurant | Italian Restaurant | Hotel | Creperie |


#### Cluster 2: Family friendly areas
With parks and playgrounds as most common venues in the neighborhood

In [11]:
df_toronto_areastyles.loc[df_toronto_areastyles['Cluster Labels'] == 1, df_toronto_areastyles.columns[[2] + list(range(5, df_toronto_areastyles.shape[1]))]]

|  | Neighborhood | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 83 | Moore Park, Summerhill East | 1.0 | Park | Playground | Tennis Court | Summer Camp | Women's Store | Distribution Center | Dessert Shop | Dim Sum Restaurant | Diner | Discount Store |
| 85 | Milliken, Agincourt North, Steeles East, L'Amo... | 1.0 | Playground | Park | Doner Restaurant | Dessert Shop | Dim Sum Restaurant | Diner | Discount Store | Distribution Center | Dog Run | Donut Shop |
| 91 | Rosedale | 1.0 | Park | Playground | Trail | Women's Store | Doner Restaurant | Dessert Shop | Dim Sum Restaurant | Diner | Discount Store | Distribution Center |


#### Cluster 3: Gym and sports areas
For the more athletic ones

In [12]:
df_toronto_areastyles.loc[df_toronto_areastyles['Cluster Labels'] == 2, df_toronto_areastyles.columns[[2] + list(range(5, df_toronto_areastyles.shape[1]))]]

|  | Neighborhood | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 7 | Don Mills | 2.0 | Beer Store | Gym | Restaurant | Japanese Restaurant | Sporting Goods Shop | Asian Restaurant | Coffee Shop | Café | Dim Sum Restaurant | Italian Restaurant |
| 13 | Don Mills | 2.0 | Beer Store | Gym | Restaurant | Japanese Restaurant | Sporting Goods Shop | Asian Restaurant | Coffee Shop | Café | Dim Sum Restaurant | Italian Restaurant |
| 30 | Richmond, Adelaide, King | 2.0 | Coffee Shop | Café | Restaurant | Clothing Store | Gym | Deli / Bodega | Thai Restaurant | Hotel | Sushi Restaurant | Concert Hall |
| 48 | Commerce Court, Victoria Hotel | 2.0 | Coffee Shop | Café | Restaurant | Hotel | American Restaurant | Gym | Japanese Restaurant | Seafood Restaurant | Italian Restaurant | Deli / Bodega |
| 67 | Davisville North | 2.0 | Department Store | Gym | Park | Sandwich Place | Breakfast Spot | Hotel | Food & Drink Shop | Dog Run | Distribution Center | Dim Sum Restaurant |
| 76 | Canada Post Gateway Processing Centre | 2.0 | Hotel | Coffee Shop | Gym | American Restaurant | Intersection | Sandwich Place | Middle Eastern Restaurant | Fried Chicken Joint | Burrito Place | Mediterranean Restaurant |
| 97 | First Canadian Place, Underground city | 2.0 | Coffee Shop | Café | Japanese Restaurant | Hotel | Gym | Restaurant | Seafood Restaurant | Salad Place | Steakhouse | Deli / Bodega |


#### Cluster 4: Shopping areas
Stores, malls and special shops dominate these neighborhoods.

In [13]:
df_toronto_areastyles.loc[df_toronto_areastyles['Cluster Labels'] == 3, df_toronto_areastyles.columns[[2] + list(range(5, df_toronto_areastyles.shape[1]))]]

|  | Neighborhood | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | Parkwoods | 3.0 | Park | Convenience Store | Food & Drink Shop | Women's Store | Doner Restaurant | Dim Sum Restaurant | Diner | Discount Store | Distribution Center | Dog Run |
| 3 | Lawrence Manor, Lawrence Heights | 3.0 | Clothing Store | Accessories Store | Women's Store | Gift Shop | Boutique | Miscellaneous Shop | Event Space | Coffee Shop | Furniture / Home Store | Vietnamese Restaurant |
| 40 | Downsview | 3.0 | Grocery Store | Park | Liquor Store | Shopping Mall | Food Truck | Discount Store | Hotel | Athletics & Sports | Baseball Field | Bank |
| 46 | Downsview | 3.0 | Grocery Store | Park | Liquor Store | Shopping Mall | Food Truck | Discount Store | Hotel | Athletics & Sports | Baseball Field | Bank |
| 53 | Downsview | 3.0 | Grocery Store | Park | Liquor Store | Shopping Mall | Food Truck | Discount Store | Hotel | Athletics & Sports | Baseball Field | Bank |
| 60 | Downsview | 3.0 | Grocery Store | Park | Liquor Store | Shopping Mall | Food Truck | Discount Store | Hotel | Athletics & Sports | Baseball Field | Bank |
| 64 | Weston | 3.0 | Park | Convenience Store | Women's Store | Donut Shop | Dim Sum Restaurant | Diner | Discount Store | Distribution Center | Dog Run | Doner Restaurant |
