# Segmenting and Clustering Toronto Neighbourhoods


**Important note**: This notebook contains all three sections for the assignment (webscraping, latitudes and longitudes, clustering neighbourhoods). 

## Importing and installing all necessary libraries

In [1]:
# Basic libraries for dataframes
import numpy as np
import pandas as pd
pd.set_option("display.max_columns", None)
pd.set_option("display.max_rows", None)

# Libraries for making pretty plots
import matplotlib as plt
import matplotlib.cm as cm
import matplotlib.colors as colors

# Webscraping library
import requests # library to handle requests

# K-means clustering library
from sklearn.cluster import KMeans

print("Libraries imported!")
print("")

# Library for converting an address into latitude and longitude values - this library is not necessary for this assignment but will be necessary for the final capstone
# !pip install geopy
# from geopy.geocoders import Nominatim
# print("")
# print("Geopy installed!")
# print("")

# Library for getting latitudes and longitudes
!pip install geocoder
import geocoder
print("")
print("Geocoder installed!")
print("")

# Library for displaying maps
!pip install folium
import folium
print("")
print("Folium installed!")
print("")

print("")
print("All importing and installing done!")

Libraries imported!

Collecting geocoder
  Downloading geocoder-1.38.1-py2.py3-none-any.whl (98 kB)
[K     |████████████████████████████████| 98 kB 8.3 MB/s  eta 0:00:01
[?25hCollecting ratelim
  Downloading ratelim-0.1.6-py2.py3-none-any.whl (4.0 kB)
Installing collected packages: ratelim, geocoder
Successfully installed geocoder-1.38.1 ratelim-0.1.6

Geocoder installed!

Collecting folium
  Downloading folium-0.12.1-py2.py3-none-any.whl (94 kB)
[K     |████████████████████████████████| 94 kB 4.6 MB/s eta 0:00:01
[?25hCollecting branca>=0.3.0
  Downloading branca-0.4.2-py3-none-any.whl (24 kB)
Installing collected packages: branca, folium
Successfully installed branca-0.4.2 folium-0.12.1

Folium installed!


All importing and installing done!


## Webscraping (questions 1-4)

First, we get the table of postal codes, boroughs, and neighbourhoods via webscraping. 

In [2]:
# Webscraping 
url = "https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M"
html_data = requests.get(url).text
canada_data = pd.read_html(str(html_data))[0]
canada_data.columns = ["PostalCode", "Borough", "Neighbourhood"]
canada_data.head()

Unnamed: 0,PostalCode,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"


Then we drop any rows with a borough that isn't assigned (borough name = "Not assigned").

In [3]:
# Dropping any rows with a borough where = "Not assigned"
canada_data = canada_data[canada_data.Borough != "Not assigned"]
canada_data.head()

Unnamed: 0,PostalCode,Borough,Neighbourhood
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
5,M6A,North York,"Lawrence Manor, Lawrence Heights"
6,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"


In the case that we have any neighbourhoods that do not have an assigned name, we change the name from "Not assigned" to the same value as the neighbourhood name. We do this by looping over every row in the dataframe to check for "Not assigned" and if we find it, we set the value equal to the neighbourhood name. 

In [4]:
# Assigning the same neighbourhood name as the borough name in the case that a neighbourhood = "Not assigned"
for i in range(0, len(canada_data)):
    if canada_data.iloc[i,2] == "Not assigned": 
        canada_data.iloc[i,2] = canada_data.iloc[i,1]
    else:
        canada_data.iloc[i,2] = canada_data.iloc[i,2]

canada_data.head()

Unnamed: 0,PostalCode,Borough,Neighbourhood
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"
5,M6A,North York,"Lawrence Manor, Lawrence Heights"
6,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"


In [5]:
print("The number of rows is:", canada_data.shape[0], "and the number of columns is:", canada_data.shape[1])

The number of rows is: 103 and the number of columns is: 3


## Latitude and longitude questions (question 5)

First, we organise by postcode and then we reset the index to zero. 

In [6]:
# Organising by PostalCode and resetting the index
canada_data.sort_values(by = "PostalCode", axis = 0, inplace = True)
canada_data.reset_index(inplace = True, drop = True)
canada_data.head()

Unnamed: 0,PostalCode,Borough,Neighbourhood
0,M1B,Scarborough,"Malvern, Rouge"
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek"
2,M1E,Scarborough,"Guildwood, Morningside, West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae


Here, we would loop over all postcodes in the dataframe in order to extract the latitude and longitude using geocoder. Unfortunately, whilst I'm kind of sure my code works, geocoder does not, so we're not going to run this cell and we're just going to use the csv file of latitudes and longitudes that has been provided.

In [7]:
# So this piece of code I think in principal does work but unfortunately geocoder does not 

"""
canada_data["Latitude"] = ""
canada_data["Longitude"] = ""
canada_data.head()

for i in range(0, len(canada_data)):
    print("Iteration:", i+1, "and postal code:", postal_code)
    # Initialising as 'None'
    lat_lng_coords = None
    # Looping until the coordinates are acquired
    while(lat_lng_coords is None):
        postal_code = canada_data.loc[i, "PostalCode"] # Gets the postcode from the first column of canada_data
        g = geocoder.google('{}, Toronto, Ontario'.format(postal_code)) # Checking geocoder
        lat_lng_coords = g.latlng # If this returns None, it loops again until it's not None
    canada_data.loc[i,"Latitude"] = lat_lng_coords[0] # Assigning the latitude value to the dataframe
    canada_data.loc[i, "Longitude"] = lat_lng_coords[1] # Assigning the longitude value to the dataframe
"""

'\ncanada_data["Latitude"] = ""\ncanada_data["Longitude"] = ""\ncanada_data.head()\n\nfor i in range(0, len(canada_data)):\n    print("Iteration:", i+1, "and postal code:", postal_code)\n    # Initialising as \'None\'\n    lat_lng_coords = None\n    # Looping until the coordinates are acquired\n    while(lat_lng_coords is None):\n        postal_code = canada_data.loc[i, "PostalCode"] # Gets the postcode from the first column of canada_data\n        g = geocoder.google(\'{}, Toronto, Ontario\'.format(postal_code)) # Checking geocoder\n        lat_lng_coords = g.latlng # If this returns None, it loops again until it\'s not None\n    canada_data.loc[i,"Latitude"] = lat_lng_coords[0] # Assigning the latitude value to the dataframe\n    canada_data.loc[i, "Longitude"] = lat_lng_coords[1] # Assigning the longitude value to the dataframe\n'

Instead of using geocoder, here we get the latitude and longitude data for each postcode from the .csv file given. Then, we sort the resulting dataframe by postcode which should then match the other dataframe of borough/neighbourhood names...

In [8]:
# Getting latitude and longitude data 
url = "https://cocl.us/Geospatial_data"
lat_long = pd.read_csv(url)

# Sorting by postal code
lat_long.sort_values(by = "Postal Code", axis = 0, inplace = True)
lat_long.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [9]:
# Checking if canada_data["PostalCode"] and lat_long["Postal Code"] match on all values
print(canada_data["PostalCode"].equals(lat_long["Postal Code"]))
if canada_data["PostalCode"].equals(lat_long["Postal Code"]) == True:
    print("This means that the two columns are identical and that we can combine the dataframes")
else: 
    print("This means that the postal codes do not match and we need to do some more wrangling")

True
This means that the two columns are identical and that we can combine the dataframes


Because the postcodes match line by line, we can now stick the two dataframes together into one dataframe of postcodes, borough names, neighbourhoods, latitudes, and longitudes.

In [10]:
canada_alldata = pd.concat([canada_data, lat_long["Latitude"], lat_long["Longitude"]], axis = 1)
canada_alldata

Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude
0,M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
5,M1J,Scarborough,Scarborough Village,43.744734,-79.239476
6,M1K,Scarborough,"Kennedy Park, Ionview, East Birchmount Park",43.727929,-79.262029
7,M1L,Scarborough,"Golden Mile, Clairlea, Oakridge",43.711112,-79.284577
8,M1M,Scarborough,"Cliffside, Cliffcrest, Scarborough Village West",43.716316,-79.239476
9,M1N,Scarborough,"Birch Cliff, Cliffside West",43.692657,-79.264848


We're going to drop all boroughs that do not have 'Toronto' in the name. Later on, we're going to obtain different sorts of venue per postcode and we want to ensure that all of the postcodes included have sufficient diversity of venues. Smaller and less central boroughs are less likely to have many or diverse venues, hence why we are dropping them. 

In [11]:
central_neighbourhoods = canada_alldata["Borough"].str.contains("Toronto")
toronto_data = canada_alldata[central_neighbourhoods]
toronto_data.reset_index(inplace = True, drop = True)
toronto_data.head()

Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude
0,M4E,East Toronto,The Beaches,43.676357,-79.293031
1,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188
2,M4L,East Toronto,"India Bazaar, The Beaches West",43.668999,-79.315572
3,M4M,East Toronto,Studio District,43.659526,-79.340923
4,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879


In [12]:
print("The dataframe has {} postcodes and {} boroughs".format(toronto_data.shape[0], len(toronto_data["Borough"].unique())))

The dataframe has 39 postcodes and 4 boroughs


## Getting venue data 

Then we make a pretty map of Toronto with all of the neighbourhood clusters shown. 

In [13]:
# Latitude and longitude of the centre point of Toronto
lat_toronto = 43.6532
long_toronto = -79.3832

map_toronto = folium.Map(location = [lat_toronto, long_toronto], zoom_start = 12)

for lat, lng, neighbourhoods, borough, postalcode in zip(toronto_data["Latitude"], 
                                                         toronto_data["Longitude"], 
                                                         toronto_data["Neighbourhood"], 
                                                         toronto_data["Borough"], 
                                                         toronto_data["PostalCode"]):
    label = "{}\n\n({}, {})".format(neighbourhoods, borough, postalcode)
    label = folium.Popup(label, parse_html = True)
    folium.CircleMarker(
        [lat, lng],
        radius = 5,
        popup = label,
        color = "blue",
        fill = True,
        fill_color = "#3186cc",
        fill_opacity = 0.7,
        parse_html = False).add_to(map_toronto)  
    
map_toronto

There should be a hidden cell here with my foursquare credentials.

In [14]:
# @hidden_cell

# Foursquare credentials for obtaining neighbourhood data
CLIENT_ID = 'CPENNL4333IGHOVFIPFVOFGLMIYOEMCK5CYEVIFYPLCLG1RE' # your Foursquare ID
CLIENT_SECRET = 'BZZXMFMSIKNFE4WNW2RFDIWKEKRGNJEPFC30NKF1QK2F0ICK' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

The below cell defines a function to obtain nearby venues for each given latitude and longitude.

In [15]:
def getNearbyVenues(postcode, names, latitudes, longitudes, radius = 500):
    
    venues_list = []
    for pc, name, lat, lng in zip(postcode, names, latitudes, longitudes):
        print("{} -- {}".format(pc, name))
        
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            pc, 
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Postal code', 
                             'Borough',
                             'Postcode latitude', 
                             'Postcode longitude', 
                             'Venue', 
                             'Venue latitude', 
                             'Venue longitude',
                             'Venue category']
    
    return(nearby_venues)

Obtaining all nearby venues for Toronto...

In [18]:
toronto_venues = getNearbyVenues(postcode = toronto_data['PostalCode'], 
                                 names = toronto_data['Borough'], 
                                 latitudes = toronto_data['Latitude'],
                                 longitudes = toronto_data['Longitude']
                                )
print("")
print("Done")

M4E -- East Toronto
M4K -- East Toronto
M4L -- East Toronto
M4M -- East Toronto
M4N -- Central Toronto
M4P -- Central Toronto
M4R -- Central Toronto
M4S -- Central Toronto
M4T -- Central Toronto
M4V -- Central Toronto
M4W -- Downtown Toronto
M4X -- Downtown Toronto
M4Y -- Downtown Toronto
M5A -- Downtown Toronto
M5B -- Downtown Toronto
M5C -- Downtown Toronto
M5E -- Downtown Toronto
M5G -- Downtown Toronto
M5H -- Downtown Toronto
M5J -- Downtown Toronto
M5K -- Downtown Toronto
M5L -- Downtown Toronto
M5N -- Central Toronto
M5P -- Central Toronto
M5R -- Central Toronto
M5S -- Downtown Toronto
M5T -- Downtown Toronto
M5V -- Downtown Toronto
M5W -- Downtown Toronto
M5X -- Downtown Toronto
M6G -- Downtown Toronto
M6H -- West Toronto
M6J -- West Toronto
M6K -- West Toronto
M6P -- West Toronto
M6R -- West Toronto
M6S -- West Toronto
M7A -- Downtown Toronto
M7Y -- East Toronto

Done


In [19]:
toronto_venues.head()

Unnamed: 0,Postal code,Borough,Postcode latitude,Postcode longitude,Venue,Venue latitude,Venue longitude,Venue category
0,M4E,East Toronto,43.676357,-79.293031,Glen Manor Ravine,43.676821,-79.293942,Trail
1,M4E,East Toronto,43.676357,-79.293031,The Big Carrot Natural Food Market,43.678879,-79.297734,Health Food Store
2,M4E,East Toronto,43.676357,-79.293031,Grover Pub and Grub,43.679181,-79.297215,Pub
3,M4E,East Toronto,43.676357,-79.293031,Upper Beaches,43.680563,-79.292869,Neighborhood
4,M4E,East Toronto,43.676357,-79.293031,Seaspray Restaurant,43.678888,-79.298167,Asian Restaurant


In [20]:
print("The number of rows in the venue dataframe is:", toronto_venues.shape[0])

The number of rows in the venue dataframe is: 1606


Let's take a look at how many venues we have per postcode. We can ignore the last three columns as every venue will also have a latitude, longitude, and a category. 

In [21]:
toronto_venues_count = toronto_venues.groupby(["Postal code", "Borough"]).count()
toronto_venues_count

Unnamed: 0_level_0,Unnamed: 1_level_0,Postcode latitude,Postcode longitude,Venue,Venue latitude,Venue longitude,Venue category
Postal code,Borough,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
M4E,East Toronto,5,5,5,5,5,5
M4K,East Toronto,42,42,42,42,42,42
M4L,East Toronto,18,18,18,18,18,18
M4M,East Toronto,37,37,37,37,37,37
M4N,Central Toronto,3,3,3,3,3,3
M4P,Central Toronto,8,8,8,8,8,8
M4R,Central Toronto,18,18,18,18,18,18
M4S,Central Toronto,33,33,33,33,33,33
M4T,Central Toronto,3,3,3,3,3,3
M4V,Central Toronto,16,16,16,16,16,16


In [22]:
print('There are {} unique categories of venue'.format(len(toronto_venues['Venue category'].unique())))

There are 239 unique categories of venue


Now we do one-hot encoding for venue categories. This will give us a dataframe with one row per venue where it is encoded as 1 for the venue category it's in and 0 for all other venue categories. 

In [23]:
# One-hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[["Venue category"]], prefix = "", prefix_sep = "")

# Add postcode column back to dataframe and move it to the first column
# toronto_onehot.insert(0, "Postal code", toronto_venues["Postal code"])
# toronto_onehot.insert(1, "Borough", toronto_venues["Borough"])

toronto_onehot["Postal code"] = toronto_venues["Postal code"] 
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

# Display
toronto_onehot.head()

Unnamed: 0,Postal code,Adult Boutique,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Aquarium,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Workshop,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Stadium,Basketball Stadium,Beach,Bed & Breakfast,Beer Bar,Beer Store,Belgian Restaurant,Bike Rental / Bike Share,Bistro,Boat or Ferry,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Line,Butcher,Café,Cajun / Creole Restaurant,Candy Store,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,College Auditorium,College Cafeteria,College Gym,College Rec Center,Colombian Restaurant,Comfort Food Restaurant,Comic Shop,Concert Hall,Convenience Store,Cosmetics Shop,Coworking Space,Creperie,Cuban Restaurant,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Escape Room,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flea Market,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,General Travel,German Restaurant,Gift Shop,Gluten-free Restaurant,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Home Service,Hookah Bar,Hospital,Hotel,Hotel Bar,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indoor Play Area,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Knitting Store,Korean Restaurant,Lake,Latin American Restaurant,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Malay Restaurant,Market,Martial Arts School,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Movie Theater,Museum,Music Venue,Neighborhood,New American Restaurant,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Park,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Playground,Plaza,Poke Place,Pool,Portuguese Restaurant,Poutine Place,Pub,Ramen Restaurant,Record Shop,Recording Studio,Rental Car Location,Restaurant,Roof Deck,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,School,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Skate Park,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,Soup Place,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Stationery Store,Steakhouse,Strip Club,Supermarket,Supplement Shop,Sushi Restaurant,Swim School,Taco Place,Tailor Shop,Taiwanese Restaurant,Tanning Salon,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Restaurant,Tibetan Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Women's Store,Yoga Studio
0,M4E,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0
1,M4E,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,M4E,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,M4E,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,M4E,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In the above dataframe, we have multiple rows per postcode which is not what we want. Now we will change that so that each postcode has its own row. We will also calculate the relative frequency with which each venue category appears in that postcode. 

In [24]:
toronto_grouped = toronto_onehot.groupby("Postal code").mean().reset_index()
toronto_grouped.head()

Unnamed: 0,Postal code,Adult Boutique,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Aquarium,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Workshop,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Stadium,Basketball Stadium,Beach,Bed & Breakfast,Beer Bar,Beer Store,Belgian Restaurant,Bike Rental / Bike Share,Bistro,Boat or Ferry,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Line,Butcher,Café,Cajun / Creole Restaurant,Candy Store,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,College Auditorium,College Cafeteria,College Gym,College Rec Center,Colombian Restaurant,Comfort Food Restaurant,Comic Shop,Concert Hall,Convenience Store,Cosmetics Shop,Coworking Space,Creperie,Cuban Restaurant,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Escape Room,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flea Market,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,General Travel,German Restaurant,Gift Shop,Gluten-free Restaurant,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Home Service,Hookah Bar,Hospital,Hotel,Hotel Bar,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indoor Play Area,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Knitting Store,Korean Restaurant,Lake,Latin American Restaurant,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Malay Restaurant,Market,Martial Arts School,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Movie Theater,Museum,Music Venue,Neighborhood,New American Restaurant,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Park,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Playground,Plaza,Poke Place,Pool,Portuguese Restaurant,Poutine Place,Pub,Ramen Restaurant,Record Shop,Recording Studio,Rental Car Location,Restaurant,Roof Deck,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,School,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Skate Park,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,Soup Place,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Stationery Store,Steakhouse,Strip Club,Supermarket,Supplement Shop,Sushi Restaurant,Swim School,Taco Place,Tailor Shop,Taiwanese Restaurant,Tanning Salon,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Restaurant,Tibetan Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Women's Store,Yoga Studio
0,M4E,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,M4K,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.02381,0.02381,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.095238,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.02381,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.02381,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.02381
2,M4L,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.055556,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,M4M,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.054054,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.054054,0.027027,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.054054,0.0,0.0,0.0,0.0,0.0,0.0,0.054054,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.027027,0.0,0.081081,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.027027,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.054054,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027
4,M4N,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


Let's delete any postcodes with less than ten types of venue. Later on, we're going to calculate the top ten venue categories per postcode and if a postcode doesn't have ten venue categories, the results won't make sense. We'll start by calculating which postcodes have <10 unique venue categories:

In [25]:
toronto_ncats = toronto_onehot.groupby("Postal code").sum().reset_index()

# Looping over all rows and columns. We'll replace all non-zero numbers with a one, then we can calculate how many unique venue types we have per postal code
for row in range(0, len(toronto_ncats)):
    for col in range(2, len(toronto_ncats.columns)):
        if toronto_ncats.iloc[row, col] > 0: 
            toronto_ncats.iloc[row, col] = 1
        else: 
            toronto_ncats.iloc[row, col] = 0 

toronto_n_cats = pd.concat([toronto_ncats["Postal code"], toronto_ncats.sum(axis = 1)], axis = 1)
toronto_n_cats.rename(columns = {0: 'Number of unique venue categories'}, inplace = True)

toronto_n_cats

Unnamed: 0,Postal code,Number of unique venue categories
0,M4E,5
1,M4K,28
2,M4L,16
3,M4M,30
4,M4N,3
5,M4P,8
6,M4R,16
7,M4S,23
8,M4T,3
9,M4V,15


Here we can see for example postcode M4E has only 5 unique venue categories (at least when I did this, it may have gained some since then). Let's see which other ones there: 

In [26]:
x = toronto_n_cats["Number of unique venue categories"] < 10
x = pd.DataFrame(x)
x.rename(columns = {"Number of unique venue categories": "Venue categories less than 10"}, inplace = True)
toronto_n_cats = pd.concat([toronto_n_cats, x], axis = 1)
toronto_n_cats = toronto_n_cats[toronto_n_cats["Venue categories less than 10"] == True]
toronto_n_cats

Unnamed: 0,Postal code,Number of unique venue categories,Venue categories less than 10
0,M4E,5,True
4,M4N,3,True
5,M4P,8,True
8,M4T,3,True
10,M4W,3,True
22,M5N,2,True
23,M5P,4,True


Let's get only the rows now from toronto_grouped which contain postcodes with more than 10 unique categories of venue in them. 

In [27]:
toronto_grouped = toronto_grouped[x["Venue categories less than 10"] == False]
toronto_grouped.reset_index(inplace = True, drop = True)
toronto_grouped

Unnamed: 0,Postal code,Adult Boutique,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Aquarium,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Workshop,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Stadium,Basketball Stadium,Beach,Bed & Breakfast,Beer Bar,Beer Store,Belgian Restaurant,Bike Rental / Bike Share,Bistro,Boat or Ferry,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Line,Butcher,Café,Cajun / Creole Restaurant,Candy Store,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,College Auditorium,College Cafeteria,College Gym,College Rec Center,Colombian Restaurant,Comfort Food Restaurant,Comic Shop,Concert Hall,Convenience Store,Cosmetics Shop,Coworking Space,Creperie,Cuban Restaurant,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Escape Room,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flea Market,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,General Travel,German Restaurant,Gift Shop,Gluten-free Restaurant,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Home Service,Hookah Bar,Hospital,Hotel,Hotel Bar,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indoor Play Area,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Knitting Store,Korean Restaurant,Lake,Latin American Restaurant,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Malay Restaurant,Market,Martial Arts School,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Movie Theater,Museum,Music Venue,Neighborhood,New American Restaurant,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Park,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Playground,Plaza,Poke Place,Pool,Portuguese Restaurant,Poutine Place,Pub,Ramen Restaurant,Record Shop,Recording Studio,Rental Car Location,Restaurant,Roof Deck,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,School,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Skate Park,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,Soup Place,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Stationery Store,Steakhouse,Strip Club,Supermarket,Supplement Shop,Sushi Restaurant,Swim School,Taco Place,Tailor Shop,Taiwanese Restaurant,Tanning Salon,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Restaurant,Tibetan Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Women's Store,Yoga Studio
0,M4K,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.02381,0.02381,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.095238,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.02381,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.02381,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.02381
1,M4L,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.055556,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,M4M,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.054054,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.054054,0.027027,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.054054,0.0,0.0,0.0,0.0,0.0,0.0,0.054054,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.027027,0.0,0.081081,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.027027,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.054054,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027
3,M4R,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.111111,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.055556,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556
4,M4S,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.060606,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.060606,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.030303,0.0,0.060606,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.030303,0.0,0.0,0.060606,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.030303,0.060606,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.060606,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,M4V,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0
6,M4X,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.068182,0.0,0.0,0.022727,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.068182,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.022727,0.0,0.0,0.022727,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.045455,0.022727,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.022727,0.022727,0.045455,0.022727,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.068182,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727
7,M4Y,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0125,0.0,0.0125,0.0,0.025,0.0125,0.0,0.0,0.025,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0875,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0125,0.0,0.0,0.0125,0.0125,0.0,0.0125,0.0125,0.0,0.0,0.0,0.0,0.0,0.0125,0.0125,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0125,0.0375,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0125,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0125,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.025,0.025,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0125,0.0,0.0,0.0,0.0375,0.0,0.0125,0.0,0.0125,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0125,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0125,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025
8,M5A,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.022727,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.068182,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.159091,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.022727,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.022727,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.068182,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727
9,M5B,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.03,0.0,0.01,0.01,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.09,0.0,0.08,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.03,0.03,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.03,0.01,0.01,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0


In [28]:
print("The number of postcodes is {} and the number of venue categories is {}".format(toronto_grouped.shape[0], toronto_grouped.shape[1] - 1))

The number of postcodes is 32 and the number of venue categories is 239


Here, we display the top 5 venue categories per postcode and their relative frequency (i.e, 0.10 = 10% of venues in that postcode).

In [29]:
num_top_venues = 5

for pc in toronto_grouped['Postal code']:
    print("------------", pc, "------------")
    temp = toronto_grouped[toronto_grouped['Postal code'] == pc].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending = False).reset_index(drop = True).head(num_top_venues))
    print('\n')

------------ M4K ------------
                    venue  freq
0        Greek Restaurant  0.17
1             Coffee Shop  0.10
2      Italian Restaurant  0.07
3          Ice Cream Shop  0.05
4  Furniture / Home Store  0.05


------------ M4L ------------
                  venue  freq
0                  Park  0.11
1  Fast Food Restaurant  0.11
2               Brewery  0.06
3        Sandwich Place  0.06
4      Sushi Restaurant  0.06


------------ M4M ------------
                 venue  freq
0          Coffee Shop  0.08
1              Brewery  0.05
2                 Café  0.05
3            Gastropub  0.05
4  American Restaurant  0.05


------------ M4R ------------
                venue  freq
0      Clothing Store  0.11
1         Coffee Shop  0.11
2         Yoga Studio  0.06
3  Seafood Restaurant  0.06
4  Chinese Restaurant  0.06


------------ M4S ------------
                venue  freq
0        Dessert Shop  0.09
1      Sandwich Place  0.09
2  Italian Restaurant  0.06
3         Pizza 

In [30]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending = False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

Here we create a dataframe of all the top 10 most common venues per postal code.

In [32]:
num_top_venues = 10

indicators = ["st", "nd", "rd"]

# Create columns according to number of top venues
columns = ["Postal code"] 

for ind in np.arange(num_top_venues):
    try:
        columns.append("{}{} Most Common Venue".format(ind+1, indicators[ind]))
    except:
        columns.append("{}th Most Common Venue".format(ind+1))

# Create a new dataframe
toronto_venues_sorted = pd.DataFrame(columns = columns)
toronto_venues_sorted['Postal code'] = toronto_grouped['Postal code']

for ind in np.arange(toronto_grouped.shape[0]):
    toronto_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

toronto_venues_sorted

Unnamed: 0,Postal code,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M4K,Greek Restaurant,Coffee Shop,Italian Restaurant,Furniture / Home Store,Restaurant,Ice Cream Shop,Yoga Studio,Spa,Dessert Shop,Pub
1,M4L,Park,Fast Food Restaurant,Liquor Store,Movie Theater,Steakhouse,Sushi Restaurant,Italian Restaurant,Ice Cream Shop,Fish & Chips Shop,Pub
2,M4M,Coffee Shop,Brewery,Café,Gastropub,Bakery,American Restaurant,Yoga Studio,Coworking Space,Cheese Shop,Clothing Store
3,M4R,Coffee Shop,Clothing Store,Bagel Shop,Fast Food Restaurant,Mexican Restaurant,Diner,Park,Chinese Restaurant,Café,Restaurant
4,M4S,Sandwich Place,Dessert Shop,Coffee Shop,Pizza Place,Café,Italian Restaurant,Sushi Restaurant,Gym,Seafood Restaurant,Greek Restaurant
5,M4V,Coffee Shop,Sushi Restaurant,Fried Chicken Joint,Light Rail Station,Liquor Store,Pizza Place,Pub,Restaurant,Sandwich Place,Bank
6,M4X,Coffee Shop,Restaurant,Café,Bakery,Italian Restaurant,Pub,Market,Pizza Place,Pharmacy,Sandwich Place
7,M4Y,Coffee Shop,Japanese Restaurant,Sushi Restaurant,Restaurant,Gay Bar,Yoga Studio,Pub,Pizza Place,Men's Store,Mediterranean Restaurant
8,M5A,Coffee Shop,Bakery,Park,Theater,Breakfast Spot,Café,Restaurant,Pub,French Restaurant,Chocolate Shop
9,M5B,Clothing Store,Coffee Shop,Italian Restaurant,Cosmetics Shop,Bubble Tea Shop,Middle Eastern Restaurant,Café,Japanese Restaurant,Hotel,Bookstore


Now that we have our dataframe with the top most common venue types per postcode, let's start clustering!

## Clustering neighbourhoods

First we will run k-means clustering on our data to group the postcodes of central Toronto. 

In [51]:
# Set number of clusters
kclusters = 5

toronto_grouped_clustering = toronto_grouped.drop("Postal code", 1)

# Run k-means clustering
kmeans = KMeans(n_clusters = kclusters, random_state = 7).fit(toronto_grouped_clustering)

# Check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([2, 1, 2, 2, 1, 1, 2, 1, 2, 2], dtype=int32)

Then we will create a new dataframe that includes postcode, borough name, cluster label, and the top ten venue categories. 

In [52]:
toronto_clustered = toronto_venues_sorted

# Drop cluster label column in the case that it already exists
try: 
    toronto_clustered.drop("Cluster Labels", axis = 1, inplace = True)
# If it doesn't already exist, it will throw an error, so in the case that this is the first time we're running a KMC this session, it will print this instead
except: 
    print("This is the first time we are running a k-means clustering")
    
# Add clustering labels
toronto_clustered.insert(1, "Cluster Labels", kmeans.labels_)
toronto_clustered.head()

canada_alldata.rename(columns = {"PostalCode": "Postal code"}, inplace = True)
toronto_clustered = toronto_clustered.join(canada_alldata.set_index("Postal code"), on = "Postal code")

toronto_clustered

Unnamed: 0,Postal code,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Borough,Neighbourhood,Latitude,Longitude
0,M4K,2,Greek Restaurant,Coffee Shop,Italian Restaurant,Furniture / Home Store,Restaurant,Ice Cream Shop,Yoga Studio,Spa,Dessert Shop,Pub,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188
1,M4L,1,Park,Fast Food Restaurant,Liquor Store,Movie Theater,Steakhouse,Sushi Restaurant,Italian Restaurant,Ice Cream Shop,Fish & Chips Shop,Pub,East Toronto,"India Bazaar, The Beaches West",43.668999,-79.315572
2,M4M,2,Coffee Shop,Brewery,Café,Gastropub,Bakery,American Restaurant,Yoga Studio,Coworking Space,Cheese Shop,Clothing Store,East Toronto,Studio District,43.659526,-79.340923
3,M4R,2,Coffee Shop,Clothing Store,Bagel Shop,Fast Food Restaurant,Mexican Restaurant,Diner,Park,Chinese Restaurant,Café,Restaurant,Central Toronto,"North Toronto West, Lawrence Park",43.715383,-79.405678
4,M4S,1,Sandwich Place,Dessert Shop,Coffee Shop,Pizza Place,Café,Italian Restaurant,Sushi Restaurant,Gym,Seafood Restaurant,Greek Restaurant,Central Toronto,Davisville,43.704324,-79.38879
5,M4V,1,Coffee Shop,Sushi Restaurant,Fried Chicken Joint,Light Rail Station,Liquor Store,Pizza Place,Pub,Restaurant,Sandwich Place,Bank,Central Toronto,"Summerhill West, Rathnelly, South Hill, Forest...",43.686412,-79.400049
6,M4X,2,Coffee Shop,Restaurant,Café,Bakery,Italian Restaurant,Pub,Market,Pizza Place,Pharmacy,Sandwich Place,Downtown Toronto,"St. James Town, Cabbagetown",43.667967,-79.367675
7,M4Y,1,Coffee Shop,Japanese Restaurant,Sushi Restaurant,Restaurant,Gay Bar,Yoga Studio,Pub,Pizza Place,Men's Store,Mediterranean Restaurant,Downtown Toronto,Church and Wellesley,43.66586,-79.38316
8,M5A,2,Coffee Shop,Bakery,Park,Theater,Breakfast Spot,Café,Restaurant,Pub,French Restaurant,Chocolate Shop,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
9,M5B,2,Clothing Store,Coffee Shop,Italian Restaurant,Cosmetics Shop,Bubble Tea Shop,Middle Eastern Restaurant,Café,Japanese Restaurant,Hotel,Bookstore,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937


Let's now look at the different clusters.

### Cluster 0

In [106]:
toronto_zero = toronto_clustered.loc[toronto_clustered['Cluster Labels'] == 0, toronto_clustered.columns[[1] + list(range(2, toronto_clustered.shape[1]))]]
toronto_zero

Unnamed: 0,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Borough,Neighbourhood,Latitude,Longitude
20,0,Airport Lounge,Airport Service,Airport Terminal,Harbor / Marina,Boutique,Rental Car Location,Coffee Shop,Bar,Boat or Ferry,Sculpture Garden,Downtown Toronto,"CN Tower, King and Spadina, Railway Lands, Har...",43.628947,-79.39442


This cluster seems to be travel-related. We can see airport-related venues (airport lounge, airport terminal, etc.) but also harbours, boats/ferries, and rental car locations. 

### Cluster 1

In [91]:
toronto_one = toronto_clustered.loc[toronto_clustered['Cluster Labels'] == 1, toronto_clustered.columns[[1] + list(range(2, toronto_clustered.shape[1]))]]
toronto_one

Unnamed: 0,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Borough,Neighbourhood,Latitude,Longitude
1,1,Park,Fast Food Restaurant,Liquor Store,Movie Theater,Steakhouse,Sushi Restaurant,Italian Restaurant,Ice Cream Shop,Fish & Chips Shop,Pub,East Toronto,"India Bazaar, The Beaches West",43.668999,-79.315572
4,1,Sandwich Place,Dessert Shop,Coffee Shop,Pizza Place,Café,Italian Restaurant,Sushi Restaurant,Gym,Seafood Restaurant,Greek Restaurant,Central Toronto,Davisville,43.704324,-79.38879
5,1,Coffee Shop,Sushi Restaurant,Fried Chicken Joint,Light Rail Station,Liquor Store,Pizza Place,Pub,Restaurant,Sandwich Place,Bank,Central Toronto,"Summerhill West, Rathnelly, South Hill, Forest...",43.686412,-79.400049
7,1,Coffee Shop,Japanese Restaurant,Sushi Restaurant,Restaurant,Gay Bar,Yoga Studio,Pub,Pizza Place,Men's Store,Mediterranean Restaurant,Downtown Toronto,Church and Wellesley,43.66586,-79.38316
18,1,Café,Bookstore,Bakery,Bar,Japanese Restaurant,Sandwich Place,Bank,Italian Restaurant,Beer Bar,Beer Store,Downtown Toronto,"University of Toronto, Harbord",43.662696,-79.400049
29,1,Coffee Shop,Sushi Restaurant,Café,Pizza Place,Italian Restaurant,Pub,Smoothie Shop,Bookstore,School,Sandwich Place,West Toronto,"Runnymede, Swansea",43.651571,-79.48445
31,1,Brewery,Restaurant,Recording Studio,Butcher,Burrito Place,Fast Food Restaurant,Auto Workshop,Farmers Market,Spa,Garden,East Toronto,"Business reply mail Processing Centre, South C...",43.662744,-79.321558


Let's look at the most common venues in this cluster. 

In [102]:
# Instantiate the dataframe with the first column of the cluster 
venues_cluster_one = toronto_one.iloc[:,1].value_counts().rename_axis("unique_venues").reset_index(name = "count")

# Loop over the other most common venue columns and append them 
for n in range(2,11):
    new = toronto_one.iloc[:,n].value_counts().rename_axis("unique_venues").reset_index(name = "count")
    venues_cluster_one = venues_cluster_one.append(new)
    
# Some venues show up in multiple columns. Grouping them gets us one count for the whole dataframe 
venues_cluster_one = venues_cluster_one.groupby("unique_venues").sum()

# Let's add in an extra column so we can see the percentage of postcodes in the cluster that have that venue
percentage = venues_cluster_one["count"]/len(toronto_one) * 100
percentage = pd.DataFrame(percentage)
percentage.rename(columns = {"count": "percentage"}, inplace = True)
venues_cluster_one = pd.concat([venues_cluster_one, percentage], axis = 1)

# Let's also only look at venues that are present in more than 20% of postcodes 
venues_cluster_one[venues_cluster_one["percentage"] > 20].sort_values("count", ascending = False)

Unnamed: 0_level_0,count,percentage
unique_venues,Unnamed: 1_level_1,Unnamed: 2_level_1
Sushi Restaurant,5,71.428571
Coffee Shop,4,57.142857
Italian Restaurant,4,57.142857
Pizza Place,4,57.142857
Pub,4,57.142857
Sandwich Place,4,57.142857
Café,3,42.857143
Restaurant,3,42.857143
Bank,2,28.571429
Bookstore,2,28.571429


This cluster seems to be mostly snack food (coffeehouses, cafés, sandwich places, fastfood restaurants). There are also a few Italian restaurants / pizza places, Japanese/sushi restaurants, and pubs. We'll call this cluster the "snack food cluster".

### Cluster 2

In [94]:
toronto_two = toronto_clustered.loc[toronto_clustered['Cluster Labels'] == 2, toronto_clustered.columns[[1] + list(range(2, toronto_clustered.shape[1]))]]
toronto_two

Unnamed: 0,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Borough,Neighbourhood,Latitude,Longitude
0,2,Greek Restaurant,Coffee Shop,Italian Restaurant,Furniture / Home Store,Restaurant,Ice Cream Shop,Yoga Studio,Spa,Dessert Shop,Pub,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188
2,2,Coffee Shop,Brewery,Café,Gastropub,Bakery,American Restaurant,Yoga Studio,Coworking Space,Cheese Shop,Clothing Store,East Toronto,Studio District,43.659526,-79.340923
3,2,Coffee Shop,Clothing Store,Bagel Shop,Fast Food Restaurant,Mexican Restaurant,Diner,Park,Chinese Restaurant,Café,Restaurant,Central Toronto,"North Toronto West, Lawrence Park",43.715383,-79.405678
6,2,Coffee Shop,Restaurant,Café,Bakery,Italian Restaurant,Pub,Market,Pizza Place,Pharmacy,Sandwich Place,Downtown Toronto,"St. James Town, Cabbagetown",43.667967,-79.367675
8,2,Coffee Shop,Bakery,Park,Theater,Breakfast Spot,Café,Restaurant,Pub,French Restaurant,Chocolate Shop,Downtown Toronto,"Regent Park, Harbourfront",43.65426,-79.360636
9,2,Clothing Store,Coffee Shop,Italian Restaurant,Cosmetics Shop,Bubble Tea Shop,Middle Eastern Restaurant,Café,Japanese Restaurant,Hotel,Bookstore,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937
10,2,Coffee Shop,Café,Cocktail Bar,Gastropub,American Restaurant,Bakery,Creperie,Moroccan Restaurant,Department Store,Clothing Store,Downtown Toronto,St. James Town,43.651494,-79.375418
11,2,Coffee Shop,Bakery,Cocktail Bar,Pharmacy,Cheese Shop,Restaurant,Farmers Market,Beer Bar,Seafood Restaurant,Sandwich Place,Downtown Toronto,Berczy Park,43.644771,-79.373306
12,2,Coffee Shop,Italian Restaurant,Sandwich Place,Café,Burger Joint,Salad Place,Thai Restaurant,Bubble Tea Shop,Yoga Studio,Indian Restaurant,Downtown Toronto,Central Bay Street,43.657952,-79.387383
13,2,Coffee Shop,Café,Restaurant,Bakery,Deli / Bodega,Thai Restaurant,Gym,Clothing Store,Cosmetics Shop,Asian Restaurant,Downtown Toronto,"Richmond, Adelaide, King",43.650571,-79.384568


Let's look at the most common venues in this cluster. 

In [103]:
# Instantiate the dataframe with the first column of the cluster 
venues_cluster_two = toronto_two.iloc[:,1].value_counts().rename_axis("unique_venues").reset_index(name = "count")

# Loop over the other most common venue columns and append them 
for n in range(2,11):
    new = toronto_two.iloc[:,n].value_counts().rename_axis("unique_venues").reset_index(name = "count")
    venues_cluster_two = venues_cluster_two.append(new)
    
# Some venues show up in multiple columns. Grouping them gets us one count for the whole dataframe 
venues_cluster_two = venues_cluster_two.groupby("unique_venues").sum()

# Let's add in an extra column so we can see the percentage of postcodes in the cluster that have that venue
percentage = venues_cluster_two["count"]/len(toronto_two) * 100
percentage = pd.DataFrame(percentage)
percentage.rename(columns = {"count": "percentage"}, inplace = True)
venues_cluster_two = pd.concat([venues_cluster_two, percentage], axis = 1)

# Let's also only look at venues that are present in more than 20% of postcodes 
venues_cluster_two[venues_cluster_two["percentage"] > 20].sort_values("count", ascending = False)

Unnamed: 0_level_0,count,percentage
unique_venues,Unnamed: 1_level_1,Unnamed: 2_level_1
Coffee Shop,19,86.363636
Café,18,81.818182
Restaurant,15,68.181818
Bakery,11,50.0
Italian Restaurant,10,45.454545
Hotel,6,27.272727
Clothing Store,5,22.727273
Japanese Restaurant,5,22.727273
Sandwich Place,5,22.727273
Seafood Restaurant,5,22.727273


This cluster has a lot of coffeehouses and cafés! Similarly, there are quite a few bakeries and sandwich places. There are also quite a lot of restaurants (generic, Italian, Japanese, seafood). We'll call this cluster the "coffeehouse/café cluster".

### Cluster 3

In [104]:
toronto_three = toronto_clustered.loc[toronto_clustered['Cluster Labels'] == 3, toronto_clustered.columns[[1] + list(range(2, toronto_clustered.shape[1]))]]
toronto_three

Unnamed: 0,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Borough,Neighbourhood,Latitude,Longitude
24,3,Pharmacy,Bakery,Grocery Store,Pool,Music Venue,Middle Eastern Restaurant,Café,Brewery,Bar,Supermarket,West Toronto,"Dufferin, Dovercourt Village",43.669005,-79.442259


This cluster has only one postcode. We'll call this the "groceries and essentials cluster" as it has a grocery shop, supermarket, bakery, and pharmacy. 

### Cluster 4

In [105]:
toronto_four = toronto_clustered.loc[toronto_clustered['Cluster Labels'] == 4, toronto_clustered.columns[[1] + list(range(2, toronto_clustered.shape[1]))]]
toronto_four

Unnamed: 0,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Borough,Neighbourhood,Latitude,Longitude
23,4,Grocery Store,Café,Park,Nightclub,Italian Restaurant,Candy Store,Baby Store,Athletics & Sports,Restaurant,Coffee Shop,Downtown Toronto,Christie,43.669542,-79.422564


This cluster also has only one postcode. We will call this cluster the "shopping cluster" as it has a grocery shop, sweet shop, baby shop, and athletics & sports shop. 

### Visualising our neighbourhood clusters on a map

Let's first change the cluster labels to something more meaningful. 

| Cluster                | Cluster name           |
| ---------------------- | ---------------------- |
| Cluster 0              | Airport & travel       |
| Cluster 1              | Snackfood              |
| Cluster 2              | Coffeehouses/cafés     |
| Cluster 3              | Groceries & essentials |
| Cluster 4              | Shopping               |

In [119]:
# Creating a new column "Cluster name", populating it, appending it to the end of the dataframe
clustername = []

for i in range(0, len(toronto_clustered)):
    if toronto_clustered.iloc[i,1] == 0: 
        clustername.append("Airport & travel")
    if toronto_clustered.iloc[i,1] == 1: 
        clustername.append("Snackfood")
    if toronto_clustered.iloc[i,1] == 2: 
        clustername.append("Coffeehouses/cafés")
    if toronto_clustered.iloc[i,1] == 3: 
        clustername.append("Groceries & essentials")
    if toronto_clustered.iloc[i,1] == 4: 
        clustername.append("Shopping")

clustername = pd.DataFrame(clustername)
clustername.rename(columns = {0: "Cluster name"}, inplace = True)
clustername

toronto_clustered = pd.concat([toronto_clustered, clustername], axis = 1)
toronto_clustered.head()

Unnamed: 0,Postal code,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Borough,Neighbourhood,Latitude,Longitude,Cluster name
0,M4K,2,Greek Restaurant,Coffee Shop,Italian Restaurant,Furniture / Home Store,Restaurant,Ice Cream Shop,Yoga Studio,Spa,Dessert Shop,Pub,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188,Coffeehouses/cafés
1,M4L,1,Park,Fast Food Restaurant,Liquor Store,Movie Theater,Steakhouse,Sushi Restaurant,Italian Restaurant,Ice Cream Shop,Fish & Chips Shop,Pub,East Toronto,"India Bazaar, The Beaches West",43.668999,-79.315572,Snackfood
2,M4M,2,Coffee Shop,Brewery,Café,Gastropub,Bakery,American Restaurant,Yoga Studio,Coworking Space,Cheese Shop,Clothing Store,East Toronto,Studio District,43.659526,-79.340923,Coffeehouses/cafés
3,M4R,2,Coffee Shop,Clothing Store,Bagel Shop,Fast Food Restaurant,Mexican Restaurant,Diner,Park,Chinese Restaurant,Café,Restaurant,Central Toronto,"North Toronto West, Lawrence Park",43.715383,-79.405678,Coffeehouses/cafés
4,M4S,1,Sandwich Place,Dessert Shop,Coffee Shop,Pizza Place,Café,Italian Restaurant,Sushi Restaurant,Gym,Seafood Restaurant,Greek Restaurant,Central Toronto,Davisville,43.704324,-79.38879,Snackfood


Finally, let's visualise what we found! 

In [125]:
# Create map using the same Toronto latitude and longitude coordinates as above
map_clusters = folium.Map(location = [lat_toronto, long_toronto], zoom_start = 12)

# Set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, neighbourhoods, borough, postalcode, cluster, clustername in zip(toronto_clustered["Latitude"], 
                                                                               toronto_clustered["Longitude"], 
                                                                               toronto_clustered["Neighbourhood"],
                                                                               toronto_clustered["Borough"], 
                                                                               toronto_clustered["Postal code"],
                                                                               toronto_clustered["Cluster Labels"],
                                                                               toronto_clustered["Cluster name"]):
    label = "{} cluster -- {}, {}".format(clustername, borough, postalcode)
    label = folium.Popup(label, parse_html = True)
    folium.CircleMarker(
        [lat, lon],
        radius = 5,
        popup = label,
        color = rainbow[cluster-1],
        fill = True,
        fill_color = rainbow[cluster-1],
        fill_opacity = 0.7).add_to(map_clusters)
    
map_clusters