# Welcome to the Applied Data Science Capstone Project Notebook

## Table of Contents

<a href="#item1">1. Segmenting and Clustering Neighborhoods in Toronto</a>
<br/>
<a href="#item11">1.1 Import the neighborhood data into the dataframe grouped by postal codes</a>
<br/>
<a href="#item12">1.2 Enrich our DataFrame with geocodes for each postal code of Toronto</a>
<br/>
<a href="#item13">1.3 Visualize data of Toronto neighborhoods</a>
<br/>
<a href="#item14">1.4 Explore venues of Toronto neighborhoods</a>
<br/>
<a href="#item15">1.5 Cluster neighborhoods of Toronto</a>
<br/>
<a href="#item16">1.6 Visualize clusters of Toronto neighborhoods</a>

<a id="item1"></a>
# 1. Segmenting and Clustering Neighborhoods in Toronto
<a id="item11"></a>
## 1.1 Import the neighborhood data into the dataframe grouped by postal codes

First we need to import libraries required for web scraping and further data analysis and visualization:

In [2]:
import pandas as pd # import pandas
import numpy as np # import Numpy

# install BeautifulSoup

!conda install -c conda-forge beautifulsoup4 --yes
from bs4 import BeautifulSoup

import requests # library to handle HTTP requests

import sys # library to access system functions

import json # library to handle JSON files


# install and import geocoder lib
!conda install -c conda-forge geocoder --yes
import geocoder # import geocoder

print("Geocoder library loaded.")

!conda install -c conda-forge folium --yes
import folium # map rendering library

print("Folium library loaded.")

from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

# set pandas option to return all rows and columns
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

Solving environment: done


  current version: 4.5.11
  latest version: 4.7.12

Please update conda by running

    $ conda update -n base -c defaults conda



## Package Plan ##

  environment location: /home/jupyterlab/conda/envs/python

  added / updated specs: 
    - beautifulsoup4


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    blas-2.11                  |         openblas          10 KB  conda-forge
    scikit-learn-0.20.1        |   py36h22eb022_0         5.7 MB
    liblapack-3.8.0            |      11_openblas          10 KB  conda-forge
    scipy-1.3.2                |   py36h921218d_0        18.0 MB  conda-forge
    libopenblas-0.3.6          |       h5a2b251_2         7.7 MB
    liblapacke-3.8.0           |      11_openblas          10 KB  conda-forge
    numpy-1.17.3               |   py36h95a1406_0         5.2 MB  conda-forge
    libcblas-3.8.0             |      11_openbl

Then we retrieve (scrape) the postal codes and neighborhoods of Toronto: the page is downloaded with *requests* and parsed with the BeautifulSoup library - ***please note that web scraping may fail if the page is not accessible***:

In [3]:
# retrieve the page using GET request
url = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M' #'https://ere34t.test'
try:
    response = requests.get(url)
    response.raise_for_status()  # treat HTTP error codes (4xx/5xx) as failures
except requests.exceptions.ConnectionError as e:
    # requests raises its own ConnectionError, which the built-in ConnectionError does not catch
    print("Error while accessing the webpage  {}\nException:\n{}\n{}".format(url, type(e), e))
except requests.exceptions.RequestException as e:
    print("Error while reading the contents of the webpage {}\nException:\n{}\n{}".format(url, type(e), e))
else:
    print("Web page data successfully loaded.")

Web page data successfully loaded.


Load the page contents into a BeautifulSoup object:

In [4]:
try:
    # load the page contents into a BeautifulSoup object instance 
    soup = BeautifulSoup(response.text, "html.parser")

    # find the table rows - the table has no ID, so start from the content DIV and take the first table inside it
    table_rows = soup.find('div', id="mw-content-text").table.tbody.find_all("tr")
except AttributeError as e: 
    # if the page design changes and the DIV or table is not found, the lookup yields None
    # and the next attribute access raises AttributeError, which is caught here
    print("Unable to find the required information on the web page:\n{0}".format(e))
    
else:
    print("Page data has been loaded into BS object.")

Page data has been loaded into BS object.


Read the scraped table from the BeautifulSoup object into a DataFrame:

In [5]:
header = ['PostalCode', 'Borough', 'Neighborhood']

body = []

try:
    # read table rows and cells into a list
    for tr in table_rows:
        td = tr.find_all('td')    
        row = [cell.text.rstrip() for cell in td]
        body.append(row)
except Exception:
    print("Error while retrieving the body of the table.\nException details:{0}".format(sys.exc_info()[0])) 
else:
    # Load list data into a data frame
    df = pd.DataFrame(body, columns=header)

    # Replace "Not assigned" and "None" values with NaN 
    df = df.replace(["Not assigned", "None"], np.nan)

    # keep only rows with an assigned borough; copy so later assignments don't hit a view
    df = df[df[header[1]].notnull()].copy()
    
    # where the neighborhood is missing, use the borough name instead
    df[header[2]] = df[header[2]].fillna(df[header[1]])
    
    print("DataFrame has been loaded with postal codes of Toronto.")

DataFrame has been loaded with postal codes of Toronto.
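
The cleaning rules applied above (drop rows without a borough, fall back to the borough name when the neighborhood is missing) can be sketched in plain Python. This is only an illustration on made-up toy rows; `clean_rows` is a hypothetical helper, not part of the notebook:

```python
# Toy rows in the same (PostalCode, Borough, Neighborhood) layout; values are illustrative only.
rows = [
    ("M1A", "Not assigned", "Not assigned"),
    ("M3A", "North York", "Parkwoods"),
    ("M7A", "Queen's Park", "Not assigned"),
]


def clean_rows(rows, missing="Not assigned"):
    """Drop rows without a borough; reuse the borough name for a missing neighborhood."""
    cleaned = []
    for postal_code, borough, neighborhood in rows:
        if borough == missing:
            continue  # no borough assigned, so the row is dropped
        if neighborhood == missing:
            neighborhood = borough  # fall back to the borough name
        cleaned.append((postal_code, borough, neighborhood))
    return cleaned


cleaned = clean_rows(rows)
print(cleaned)
```

The pandas version in the cell above does the same thing column-wise instead of row by row.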


Let's explore our data set - the size and possible duplicates:

In [6]:
print(df.shape)

df[df.duplicated(['Neighborhood'], keep=False)].sort_values("Neighborhood")

(210, 3)


|     | PostalCode | Borough | Neighborhood |
| --- | --- | --- | --- |
| 8 | M7A | Queen's Park | Queen's Park |
| 10 | M9A | Queen's Park | Queen's Park |
| 147 | M6N | York | Runnymede |
| 186 | M6S | West Toronto | Runnymede |
| 34 | M5C | Downtown Toronto | St. James Town |
| 250 | M4X | Downtown Toronto | St. James Town |


OK, we found 210 records and 3 neighborhood names that each appear under two postal codes (6 records in total). Let's remember their postal codes so that we can highlight them on the map later in a different color:

In [7]:
possible_duplicates = df[df.duplicated(['Neighborhood'], keep=False)].sort_values("Neighborhood")["PostalCode"].unique()
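
As a rough plain-Python illustration of what `duplicated(..., keep=False)` selects, the sketch below counts how often each neighborhood name occurs and keeps every record whose name occurs more than once. The first four records come from the output above; the fifth is a toy row added for contrast:

```python
from collections import Counter

# (PostalCode, Neighborhood) pairs; the last one is an illustrative non-duplicate.
records = [
    ("M7A", "Queen's Park"),
    ("M9A", "Queen's Park"),
    ("M6N", "Runnymede"),
    ("M6S", "Runnymede"),
    ("M3A", "Parkwoods"),
]

# Count occurrences of each neighborhood name.
name_counts = Counter(name for _, name in records)

# Keep all records whose name is repeated (the keep=False behaviour), sorted by name.
repeated = sorted((name, code) for code, name in records if name_counts[name] > 1)

# Collect the postal codes of those records.
duplicate_codes = sorted({code for _, code in repeated})
print(repeated)
print(duplicate_codes)
```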

Let's define the *toronto_data* dataframe: group the *df* dataframe by PostalCode, keep the first Borough value of each group, and concatenate the Neighborhood values separated by a *comma*:
<a href="#capstone1"></a>

In [8]:
toronto_data = (df.groupby(header[0]).agg({header[1]:'first',header[2] : ', '.join}).reset_index().reindex(columns=df.columns))

toronto_data.head()

|     | PostalCode | Borough | Neighborhood |
| --- | --- | --- | --- |
| 0 | M1B | Scarborough | Rouge, Malvern |
| 1 | M1C | Scarborough | Highland Creek, Rouge Hill, Port Union |
| 2 | M1E | Scarborough | Guildwood, Morningside, West Hill |
| 3 | M1G | Scarborough | Woburn |
| 4 | M1H | Scarborough | Cedarbrae |
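
For readers less familiar with `groupby(...).agg(...)`, here is a minimal plain-Python sketch of the same idea: one entry per postal code, the first borough seen, and the neighborhood names joined with a comma. The input rows are a toy subset chosen to reproduce the first rows of the output above:

```python
# Ungrouped toy rows: one (PostalCode, Borough, Neighborhood) record per neighborhood.
rows = [
    ("M1B", "Scarborough", "Rouge"),
    ("M1B", "Scarborough", "Malvern"),
    ("M1G", "Scarborough", "Woburn"),
]

grouped = {}
for postal_code, borough, neighborhood in rows:
    # setdefault keeps the borough of the first record seen for this postal code
    entry = grouped.setdefault(postal_code, {"Borough": borough, "Neighborhoods": []})
    entry["Neighborhoods"].append(neighborhood)

# Flatten back to rows, joining neighborhood names with ", ".
result = [
    (code, entry["Borough"], ", ".join(entry["Neighborhoods"]))
    for code, entry in sorted(grouped.items())
]
print(result)
```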


Let's check what happened to our possible duplicate neighborhoods after grouping:

In [9]:
toronto_data[toronto_data.duplicated(['Neighborhood'], keep=False)].sort_values("Neighborhood")

|     | PostalCode | Borough | Neighborhood |
| --- | --- | --- | --- |
| 85 | M7A | Queen's Park | Queen's Park |
| 93 | M9A | Queen's Park | Queen's Park |


So only one duplicated neighborhood name remains (Queen's Park); the other two were merged with other neighborhoods under their postal codes.

Show the dimensions of the resulting DataFrame:

In [10]:
toronto_data.shape

(103, 3)

<a id="item12"></a>
## 1.2 Enrich our DataFrame with geocodes for each postal code of Toronto

We shall use the *geocoder* library to retrieve the geographic coordinates (latitude and longitude) of each postal code of Toronto:

In [11]:
lat_list = []
lng_list = []

# for each row (postal code) in the DataFrame, retrieve the coordinates and store them in 2 lists (latitude and longitude)
for row in toronto_data.itertuples(index=False):
    # reset lat_lng_coords to None before each retry loop
    lat_lng_coords = None
    # retry until the geocoder returns coordinates
    while lat_lng_coords is None:
        g = geocoder.arcgis('{}, Toronto, Ontario'.format(row[0]))
        lat_lng_coords = g.latlng

    # append only after a successful lookup; indexing a None result inside the loop would raise TypeError
    lat_list.append(lat_lng_coords[0]) 
    lng_list.append(lat_lng_coords[1]) 

# create 2 new columns in our DataFrame and assign them with the lists
toronto_data["Latitude"]=lat_list
toronto_data["Longitude"]=lng_list

toronto_data.head()

|     | PostalCode | Borough | Neighborhood | Latitude | Longitude |
| --- | --- | --- | --- | --- | --- |
| 0 | M1B | Scarborough | Rouge, Malvern | 43.811525 | -79.195517 |
| 1 | M1C | Scarborough | Highland Creek, Rouge Hill, Port Union | 43.785665 | -79.158725 |
| 2 | M1E | Scarborough | Guildwood, Morningside, West Hill | 43.765815 | -79.175193 |
| 3 | M1G | Scarborough | Woburn | 43.768369 | -79.21759 |
| 4 | M1H | Scarborough | Cedarbrae | 43.769688 | -79.23944 |
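
A geocoding service can occasionally return nothing, and an unbounded retry loop would then never finish. One possible way to cap the retries is sketched below. `geocode_with_retry` and `flaky_lookup` are hypothetical names, and the lookup is injected as a plain function so that the sketch runs without network access:

```python
import time


def geocode_with_retry(lookup, query, max_attempts=5, delay_seconds=0.0):
    """Call lookup(query) until it returns coordinates or the attempts run out."""
    for _ in range(max_attempts):
        coords = lookup(query)
        if coords is not None:
            return coords
        time.sleep(delay_seconds)  # optional pause between attempts
    return None  # the caller decides how to handle a failed lookup


# Hypothetical stand-in for a geocoding service: fails twice, then succeeds.
calls = {"count": 0}


def flaky_lookup(query):
    calls["count"] += 1
    return None if calls["count"] < 3 else [43.811525, -79.195517]


coords = geocode_with_retry(flaky_lookup, "M1B, Toronto, Ontario")
print(coords, calls["count"])
```

In the notebook, the injected function could be something like `lambda q: geocoder.arcgis(q).latlng`, which is the call already used in the cell above.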


Let's check the dimensions - I expect 103x5:

In [12]:
toronto_data.shape

(103, 5)

<a id="item13"></a>
## 1.3 Visualize data of Toronto neighborhoods 

Let's plot our neighborhood data on the map together with *possible_duplicates*. Possible duplicate records are colored red:

In [61]:
# create map of Toronto using latitude and longitude values
latitude = 43.651070
longitude = -79.347015

map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10, height=500)

# add markers to map
for lat, lng, borough, neighborhood, postalcode in zip(toronto_data['Latitude'], toronto_data['Longitude'], toronto_data['Borough'], toronto_data['Neighborhood'], toronto_data["PostalCode"]):
    label = folium.Popup("{} {}".format(postalcode, neighborhood), parse_html=True)
    # possible duplicates are drawn in red, all other postal codes in blue
    marker_color = 'red' if postalcode in possible_duplicates else 'blue'
    folium.CircleMarker(
        [lat, lng],
        radius=3,
        popup=label,
        color=marker_color,
        fill=True,
        fill_color=marker_color,
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  

map_toronto

Let's check the possible duplicate neighborhoods highlighted with red circles:
1) **Queen's Park** appears at 2 different locations. M7A is located close to Queen's Park itself, but M9A is located in Etobicoke, which is quite far from Queen's Park. This seems to be an error on the Wikipedia page: this postal code actually belongs to the **Humber Valley Village** neighborhood - https://en.wikipedia.org/wiki/Humber_Valley_Village. We shall correct the data in the dataframe accordingly.
<br/>
2) **Runnymede** is listed under 2 different postal codes. Let's check whether that's true using Wikipedia and Google Maps (https://en.wikipedia.org/wiki/Runnymede,_Toronto): this neighborhood is located between Annette St in the south and Dundas St West in the north. The only postal code for this area is M6S. The other one lies to the north of Runnymede, which might be an error on the Wikipedia page. Let's remove the mention of Runnymede from the M6N postal code.
<br/>
3) **St. James Town** - according to the Wikipedia page it spans 2 postal codes, but a quick check shows that St. James Town has the M4X postal code. The other one is in the heart of Old Toronto, near St. James Park - again I suspect an error in the Wikipedia data. Let's rename the M5C neighborhood to **St James Park**, by analogy with the Queen's Park toponym above.

In [15]:
toronto_data.loc[toronto_data["PostalCode"]=="M9A",["Borough","Neighborhood"]] = ["Etobicoke","Humber Valley Village"]
toronto_data.loc[toronto_data["PostalCode"]=="M6N", "Neighborhood"] = "The Junction North"
toronto_data.loc[toronto_data["PostalCode"]=="M5C", "Neighborhood"] = "St James Park"

toronto_data.loc[toronto_data["PostalCode"].isin(possible_duplicates)].head(6)


|     | PostalCode | Borough | Neighborhood | Latitude | Longitude |
| --- | --- | --- | --- | --- | --- |
| 51 | M4X | Downtown Toronto | Cabbagetown, St. James Town | 43.66816 | -79.366602 |
| 55 | M5C | Downtown Toronto | St James Park | 43.65121 | -79.375481 |
| 81 | M6N | York | The Junction North | 43.676125 | -79.481932 |
| 84 | M6S | West Toronto | Runnymede, Swansea | 43.64962 | -79.476141 |
| 85 | M7A | Queen's Park | Queen's Park | 43.66115 | -79.391715 |
| 93 | M9A | Etobicoke | Humber Valley Village | 43.662299 | -79.528195 |


Now it looks better - all possibly duplicated neighborhoods are aligned with their postal codes.

<a id="item14"></a>
## 1.4 Explore venues of Toronto neighborhoods 

In order to explore venues of Toronto let's define the Foursquare client's ID and secret as well as the API version:

In [16]:
# The code was removed by Watson Studio for sharing.
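
The removed cell is not reproduced here. Purely as a hypothetical placeholder, the names used later in the notebook (`CLIENT_ID`, `CLIENT_SECRET`, `VERSION`, `LIMIT`) could be defined along the following lines. The environment variable names and the concrete `VERSION` and `LIMIT` values are assumptions, not the author's settings:

```python
import os

# Hypothetical environment variable names; set them outside the notebook
# so that the credentials never appear in shared code.
CLIENT_ID = os.environ.get("FOURSQUARE_CLIENT_ID", "")
CLIENT_SECRET = os.environ.get("FOURSQUARE_CLIENT_SECRET", "")

VERSION = "20180605"  # assumed value: API version date in YYYYMMDD form
LIMIT = 100           # assumed value: maximum number of venues per request

if not CLIENT_ID or not CLIENT_SECRET:
    print("Foursquare credentials are not set; API calls will fail.")
```

Reading the secrets from the environment keeps them out of the notebook, which is presumably why the original cell was stripped before sharing.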

Let's borrow getNearbyVenues from the course lab with a slight modification - it prints every location for which no venues were found:

In [17]:
def getNearbyVenues(postalcodes, names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for postalcode, name, lat, lng in zip(postalcodes, names, latitudes, longitudes):
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        if not results:
            print("{} {} has no venues".format(postalcode, name))
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            postalcode,
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

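    # flatten the per-location lists into one list of rows and name the columns;
    # passing the names to the constructor also works when no venues were found at all
    nearby_venues = pd.DataFrame(
        [item for venue_list in venues_list for item in venue_list],
        columns=[
                  'PostalCode',
                  'Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category'])
    
    return nearby_venues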
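
The function above indexes `v['venue']['categories'][0]` directly, which would fail for a venue that comes back without any category. A slightly more defensive extraction step is sketched below. It assumes the response items have the nested shape that the function above already relies on; `extract_venue_rows` and the second sample venue are hypothetical:

```python
def extract_venue_rows(postal_code, name, lat, lng, items):
    """Turn Foursquare-style 'items' into flat tuples, tolerating missing categories."""
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        # use None when a venue has no category instead of raising IndexError
        category = categories[0]["name"] if categories else None
        rows.append((
            postal_code, name, lat, lng,
            venue["name"],
            venue["location"]["lat"],
            venue["location"]["lng"],
            category,
        ))
    return rows


# Hand-written sample payload in the assumed shape (not a real API response).
sample_items = [
    {"venue": {"name": "Heron Park",
               "location": {"lat": 43.769327, "lng": -79.177201},
               "categories": [{"name": "Park"}]}},
    {"venue": {"name": "Uncategorized Spot",
               "location": {"lat": 43.77, "lng": -79.18},
               "categories": []}},
]

rows = extract_venue_rows("M1E", "Guildwood, Morningside, West Hill",
                          43.765815, -79.175193, sample_items)
for row in rows:
    print(row)
```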

Let's call the function defined above:

In [18]:
toronto_venues = getNearbyVenues(postalcodes=toronto_data['PostalCode'],
                                   names=toronto_data['Neighborhood'],
                                   latitudes=toronto_data['Latitude'],
                                   longitudes=toronto_data['Longitude']
                                  )


M1X Upper Rouge has no venues
M2L Silver Hills, York Mills has no venues
M5N Roselawn has no venues


Note that 3 postal codes have no venues - later on we'll assign them to a separate cluster.

Let's explore the *toronto_venues* dataframe:

In [19]:
toronto_venues.head()

|     | PostalCode | Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | M1B | Rouge, Malvern | 43.811525 | -79.195517 | Canadian Appliance Source Whitby | 43.808353 | -79.191331 | Home Service |
| 1 | M1C | Highland Creek, Rouge Hill, Port Union | 43.785665 | -79.158725 | Royal Canadian Legion | 43.782533 | -79.163085 | Bar |
| 2 | M1E | Guildwood, Morningside, West Hill | 43.765815 | -79.175193 | Homestead Roofing Repair | 43.76514 | -79.178663 | Construction & Landscaping |
| 3 | M1E | Guildwood, Morningside, West Hill | 43.765815 | -79.175193 | Heron Park Community Centre | 43.768867 | -79.176958 | Gym / Fitness Center |
| 4 | M1E | Guildwood, Morningside, West Hill | 43.765815 | -79.175193 | Heron Park | 43.769327 | -79.177201 | Park |


In [20]:
unique_categories = toronto_venues['Venue Category'].unique()
print('There are {} unique categories.'.format(len(unique_categories)))

There are 262 unique categories.


Let's check the list of categories:

In [21]:
unique_categories.sort()

print(unique_categories)

['Afghan Restaurant' 'Airport' 'American Restaurant' 'Antique Shop'
 'Argentinian Restaurant' 'Art Gallery' 'Arts & Crafts Store'
 'Asian Restaurant' 'Athletics & Sports' 'Auto Dealership' 'Auto Garage'
 'BBQ Joint' 'Baby Store' 'Bagel Shop' 'Bakery' 'Bank' 'Bar'
 'Basketball Court' 'Basketball Stadium' 'Bed & Breakfast' 'Beer Bar'
 'Beer Store' 'Belgian Restaurant' 'Bike Shop' 'Bistro' 'Boat or Ferry'
 'Bookstore' 'Boutique' 'Brazilian Restaurant' 'Breakfast Spot' 'Brewery'
 'Bridge' 'Bubble Tea Shop' 'Buffet' 'Building' 'Burger Joint'
 'Burrito Place' 'Bus Line' 'Bus Station' 'Bus Stop' 'Business Service'
 'Butcher' 'Cafeteria' 'Café' 'Candy Store' 'Caribbean Restaurant'
 'Carpet Store' 'Cheese Shop' 'Chinese Restaurant' 'Chocolate Shop'
 'Church' 'Clothing Store' 'Cocktail Bar' 'Coffee Shop'
 'College Arts Building' 'College Auditorium' 'College Cafeteria'
 'College Gym' 'College Rec Center' 'College Stadium'
 'Colombian Restaurant' 'Comfort Food Restaurant' 'Comic Shop'
 'Concert H

As we can see there is a category named **Neighborhood** - let's explore what's there and how it can contribute to our analysis:

In [22]:
toronto_venues[toronto_venues["Venue Category"]=="Neighborhood"].shape

toronto_venues[toronto_venues["Venue Category"]=="Neighborhood"].head()

|     | PostalCode | Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 314 | M4E | The Beaches | 43.676531 | -79.295425 | Upper Beaches | 43.680563 | -79.292869 | Neighborhood |
| 390 | M4M | Studio District | 43.660629 | -79.334855 | Leslieville | 43.66207 | -79.337856 | Neighborhood |
| 930 | M5G | Central Bay Street | 43.656091 | -79.38493 | Downtown Toronto | 43.653232 | -79.385296 | Neighborhood |
| 1086 | M5H | Adelaide, King, Richmond | 43.6497 | -79.382582 | Downtown Toronto | 43.653232 | -79.385296 | Neighborhood |


In [23]:
toronto_data[toronto_data["PostalCode"].isin(["M4E", "M4M", "M5G", "M5H"])].head()

|     | PostalCode | Borough | Neighborhood | Latitude | Longitude |
| --- | --- | --- | --- | --- | --- |
| 37 | M4E | East Toronto | The Beaches | 43.676531 | -79.295425 |
| 43 | M4M | East Toronto | Studio District | 43.660629 | -79.334855 |
| 57 | M5G | Downtown Toronto | Central Bay Street | 43.656091 | -79.38493 |
| 58 | M5H | Downtown Toronto | Adelaide, King, Richmond | 43.6497 | -79.382582 |


This category adds no value to our analysis because we already have the neighborhood data, and these entries describe areas rather than venues. Let's drop this category:

In [24]:
toronto_venues.drop(toronto_venues[toronto_venues["Venue Category"]=="Neighborhood"].index, inplace=True)

After we dropped the category there should be 261 unique categories:

In [25]:
unique_categories = toronto_venues['Venue Category'].unique()
print('There are {} unique categories.'.format(len(unique_categories)))

There are 261 unique categories.


Let's one-hot encode 'Venue Category', then group the data by PostalCode and Neighborhood and take the mean, i.e. the frequency of each category per PostalCode:

In [26]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add postal code and neighborhood column back to dataframe
toronto_onehot['PostalCode'] = toronto_venues['PostalCode'] 
toronto_onehot['Neighborhood'] = toronto_venues['Neighborhood']

toronto_grouped = toronto_onehot.groupby(['PostalCode','Neighborhood']).mean().reset_index()


toronto_grouped

Unnamed: 0,PostalCode,Neighborhood,Afghan Restaurant,Airport,American Restaurant,Antique Shop,Argentinian Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Dealership,Auto Garage,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Basketball Court,Basketball Stadium,Bed & Breakfast,Beer Bar,Beer Store,Belgian Restaurant,Bike Shop,Bistro,Boat or Ferry,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bridge,Bubble Tea Shop,Buffet,Building,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Cafeteria,Café,Candy Store,Caribbean Restaurant,Carpet Store,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,College Auditorium,College Cafeteria,College Gym,College Rec Center,College Stadium,Colombian Restaurant,Comfort Food Restaurant,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Creperie,Cuban Restaurant,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Doctor's Office,Dog Run,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Field,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Stand,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Gas Station,Gastropub,Gay Bar,General Entertainment,General Travel,Gift Shop,Gluten-free Restaurant,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Harbor / Marina,Hardware Store,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Hobby Shop,Hockey Arena,Home Service,Hong Kong Restaurant,Hookah Bar,Hostel,Hotel,Hotel Bar,Hotpot Restaurant,Ice Cream Shop,Indian Restaurant,Indonesian Restaurant,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Karaoke Bar,Kitchen Supply Store,Korean Restaurant,Lake,Latin American Restaurant,Lawyer,Leather Goods Store,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Mac & Cheese Joint,Malay Restaurant,Market,Massage Studio,Mattress Store,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Movie Theater,Museum,Music Store,Music Venue,New American Restaurant,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Park,Performing Arts Venue,Peruvian Restaurant,Pet Store,Pharmacy,Pier,Pilates Studio,Pizza Place,Platform,Playground,Plaza,Poke Place,Pool,Portuguese Restaurant,Poutine Place,Pub,Ramen Restaurant,Record Shop,Rental Car Location,Residential Building (Apartment / Condo),Restaurant,Rock Climbing Spot,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Sculpture Garden,Seafood Restaurant,Shanghai Restaurant,Shoe Store,Shopping Mall,Skating Rink,Ski Chalet,Smoke Shop,Smoothie Shop,Snack Place,Soccer Field,Soup Place,Southern / Soul Food Restaurant,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Stationery Store,Steakhouse,Storage Facility,Strip Club,Supermarket,Supplement Shop,Sushi Restaurant,Swim School,Taco Place,Tailor Shop,Taiwanese Restaurant,Tanning Salon,Tapas Restaurant,Tea Room,Tech Startup,Tennis Court,Thai Restaurant,Theater,Theme Restaurant,Thrift / Vintage Store,Toy / Game Store,Trail,Train Station,Tram Station,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,M1B,"Rouge, Malvern",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,M1C,"Highland Creek, Rouge Hill, Port Union",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,M1E,"Guildwood, Morningside, West Hill",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,M1G,Woburn,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,M1H,Cedarbrae,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,M1J,Scarborough Village,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,M1K,"East Birchmount Park, Ionview, Kennedy Park",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,M1L,"Clairlea, Golden Mile, Oakridge",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.222222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.222222,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.222222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,M1M,"Cliffcrest, Cliffside, Scarborough Village West",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,M1N,"Birch Cliff, Cliffside West",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [27]:
toronto_onehot.shape

(2468, 263)

In [28]:
toronto_grouped.shape

(100, 263)

Let's borrow the *return_most_common_venues* function from our course lab. It returns the *n* most frequent venue types for each postal code:

In [29]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[2:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
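
To see what the helper returns, here is a minimal sketch on a made-up row (the postal code, neighborhood name and frequencies are hypothetical, not taken from the Toronto data). Note that categories with zero frequency can still fill the remaining slots when a postal code has fewer distinct venue types than `num_top_venues`, which may explain why the same tail of categories repeats across many rows of the output.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    # skip the two identifier fields, then sort frequencies from high to low
    row_categories = row.iloc[2:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# hypothetical row: two identifiers followed by venue-category frequencies
toy_row = pd.Series({
    'PostalCode': 'X0X',
    'Neighborhood': 'Toyville',
    'Bakery': 0.1,
    'Coffee Shop': 0.6,
    'Park': 0.3,
})

top2 = return_most_common_venues(toy_row, 2)
print(list(top2))
```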

In [30]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['PostalCode','Neighborhood']
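for ind in np.arange(num_top_venues):
    try:
        # 1st, 2nd, 3rd use the explicit suffixes above
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # everything from the 4th onwards gets the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))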

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['PostalCode'] = toronto_grouped['PostalCode']
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 2:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,PostalCode,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M1B,"Rouge, Malvern",Home Service,Food & Drink Shop,Flower Shop,Flea Market,Fish Market,Fish & Chips Shop,Filipino Restaurant,Field,Fast Food Restaurant,Farmers Market
1,M1C,"Highland Creek, Rouge Hill, Port Union",Bar,Yoga Studio,Food,Flower Shop,Flea Market,Fish Market,Fish & Chips Shop,Filipino Restaurant,Field,Fast Food Restaurant
2,M1E,"Guildwood, Morningside, West Hill",Construction & Landscaping,Gym / Fitness Center,Park,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Eastern European Restaurant
3,M1G,Woburn,Business Service,Korean Restaurant,Soccer Field,Coffee Shop,Park,Fast Food Restaurant,Event Space,Falafel Restaurant,Farm,Farmers Market
4,M1H,Cedarbrae,Playground,Lounge,Eastern European Restaurant,Flower Shop,Flea Market,Fish Market,Fish & Chips Shop,Filipino Restaurant,Field,Fast Food Restaurant


<a id="item15"></a>
## 1.5 Cluster neighborhoods of Toronto

Let's split the neighborhoods into 6 clusters: k-means will produce 5 of them, and the 6th will consist of those neighborhoods for which the Foursquare API returned no venues. We shall use the KMeans clustering algorithm from the scikit-learn library:

In [31]:
# set number of clusters to 5
kclusters = 5

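# keep only the venue-frequency columns (axis passed by keyword; the positional form is not accepted by recent pandas)
toronto_grouped_clustering = toronto_grouped.drop(['PostalCode','Neighborhood'], axis=1)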

toronto_grouped_clustering.head()

# run k-means clustering
kmeans = KMeans(init='k-means++', n_clusters=kclusters, random_state=0,n_init=1000).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([2, 0, 4, 4, 0, 0, 0, 0, 0, 4], dtype=int32)
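
As a rough illustration of what the clustering step does, the sketch below runs `KMeans` on four made-up frequency rows (the numbers are hypothetical). Rows with similar venue profiles end up with the same label; the label numbers themselves are arbitrary identifiers, not a ranking.

```python
import numpy as np
from sklearn.cluster import KMeans

# hypothetical frequency rows: two "coffee-heavy" and two "park-heavy" areas
toy = np.array([
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.8],
])

toy_kmeans = KMeans(n_clusters=2, init='k-means++', n_init=10, random_state=0).fit(toy)
toy_labels = toy_kmeans.labels_
print(toy_labels)  # one label per row, in the same order as the input
```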

Let's add the results of clustering to the initial cleaned data of Toronto neighborhoods and call it *toronto_merged* dataframe:

In [32]:
# add clustering labels
if 'Cluster Labels' in neighborhoods_venues_sorted.columns:
    neighborhoods_venues_sorted['Cluster Labels']=kmeans.labels_
else:
    neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

toronto_merged = toronto_data

# join the sorted venues (with cluster labels) onto toronto_data, which carries latitude/longitude for each neighborhood
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index(['PostalCode','Neighborhood']), on=['PostalCode','Neighborhood'])

toronto_merged.head() # check the last columns!

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M1B,Scarborough,"Rouge, Malvern",43.811525,-79.195517,2.0,Home Service,Food & Drink Shop,Flower Shop,Flea Market,Fish Market,Fish & Chips Shop,Filipino Restaurant,Field,Fast Food Restaurant,Farmers Market
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.785665,-79.158725,0.0,Bar,Yoga Studio,Food,Flower Shop,Flea Market,Fish Market,Fish & Chips Shop,Filipino Restaurant,Field,Fast Food Restaurant
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.765815,-79.175193,4.0,Construction & Landscaping,Gym / Fitness Center,Park,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Eastern European Restaurant
3,M1G,Scarborough,Woburn,43.768369,-79.21759,4.0,Business Service,Korean Restaurant,Soccer Field,Coffee Shop,Park,Fast Food Restaurant,Event Space,Falafel Restaurant,Farm,Farmers Market
4,M1H,Scarborough,Cedarbrae,43.769688,-79.23944,0.0,Playground,Lounge,Eastern European Restaurant,Flower Shop,Flea Market,Fish Market,Fish & Chips Shop,Filipino Restaurant,Field,Fast Food Restaurant
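
The cluster labels show up as `2.0`, `0.0` and so on because the join is a left join: rows of the left frame with no match receive NaN, and pandas then stores the integer labels as floats. A minimal sketch with made-up frames (all values are hypothetical):

```python
import pandas as pd

left = pd.DataFrame({
    'PostalCode': ['A1A', 'B2B', 'C3C'],
    'Neighborhood': ['North', 'South', 'East'],
})
right = pd.DataFrame({
    'PostalCode': ['A1A', 'B2B'],
    'Neighborhood': ['North', 'South'],
    'Cluster Labels': [0, 1],
})

# same pattern as above: index the right frame by both keys, then join on them
joined = left.join(right.set_index(['PostalCode', 'Neighborhood']),
                   on=['PostalCode', 'Neighborhood'])
print(joined)
print(joined['Cluster Labels'].dtype)
```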


How many examples fall into each cluster?

In [33]:
toronto_merged.groupby("Cluster Labels").size()

Cluster Labels
0.0    74
1.0     1
2.0     3
3.0     1
4.0    21
dtype: int64
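
The counts above add up to 100, the number of rows in *toronto_grouped*, because `groupby` leaves out rows whose key is NaN by default. A small sketch on made-up labels shows one way to count the missing ones as well, using `value_counts(dropna=False)`:

```python
import numpy as np
import pandas as pd

# hypothetical labels, two of them missing
toy = pd.DataFrame({'Cluster Labels': [0.0, 0.0, 4.0, np.nan, np.nan]})

counts_without_nan = toy.groupby('Cluster Labels').size()
counts_with_nan = toy['Cluster Labels'].value_counts(dropna=False)

print(counts_without_nan)
print(counts_with_nan)
```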


<br/>
Remember that the merge leaves NaN cluster labels: there should be 3 records for which Foursquare returned no venues:

In [34]:
toronto_merged[pd.isnull(toronto_merged).any(axis=1)].head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
16,M1X,Scarborough,Upper Rouge,43.834215,-79.216701,,,,,,,,,,,
20,M2L,North York,"Silver Hills, York Mills",43.757095,-79.38032,,,,,,,,,,,
63,M5N,Central Toronto,Roselawn,43.711941,-79.41912,,,,,,,,,,,


As expected, 3 records were found. Let's mark them with Cluster Labels = 5 (the numbering starts from 0):

In [35]:
toronto_merged.loc[pd.isnull(toronto_merged).any(axis=1),"Cluster Labels"] = kclusters
toronto_merged.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M1B,Scarborough,"Rouge, Malvern",43.811525,-79.195517,2.0,Home Service,Food & Drink Shop,Flower Shop,Flea Market,Fish Market,Fish & Chips Shop,Filipino Restaurant,Field,Fast Food Restaurant,Farmers Market
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.785665,-79.158725,0.0,Bar,Yoga Studio,Food,Flower Shop,Flea Market,Fish Market,Fish & Chips Shop,Filipino Restaurant,Field,Fast Food Restaurant
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.765815,-79.175193,4.0,Construction & Landscaping,Gym / Fitness Center,Park,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Eastern European Restaurant
3,M1G,Scarborough,Woburn,43.768369,-79.21759,4.0,Business Service,Korean Restaurant,Soccer Field,Coffee Shop,Park,Fast Food Restaurant,Event Space,Falafel Restaurant,Farm,Farmers Market
4,M1H,Scarborough,Cedarbrae,43.769688,-79.23944,0.0,Playground,Lounge,Eastern European Restaurant,Flower Shop,Flea Market,Fish Market,Fish & Chips Shop,Filipino Restaurant,Field,Fast Food Restaurant
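
The mask used above flags a row if any of its columns is NaN. A narrower alternative, sketched here on a made-up frame (values are hypothetical), fills only the label column and casts it back to integers, so the labels read `2` rather than `2.0`:

```python
import numpy as np
import pandas as pd

kclusters = 5
toy_merged = pd.DataFrame({
    'Neighborhood': ['North', 'South', 'East'],
    'Cluster Labels': [2.0, np.nan, 0.0],
})

# missing labels become the extra cluster, then the column is stored as int
toy_merged['Cluster Labels'] = (
    toy_merged['Cluster Labels'].fillna(kclusters).astype(int)
)
print(toy_merged)
```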


<a id="item16"></a>
## 1.6 Visualize clusters of Toronto neighborhoods

In [51]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=10)

# set color scheme for the clusters
rainbow = ['red','orange','yellow','green','blue','purple']

# add markers to the map
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighborhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    # labels run from 0 to 5, so they index the 6-colour palette directly
    cluster_color = rainbow[int(cluster)]
    folium.CircleMarker(
        [lat, lon],
        radius=2,
        popup=label,
        color=cluster_color,
        fill=True,
        fill_color=cluster_color,
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### Cluster 1

In [40]:
cluster1 = toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[0,2] + list(range(6, toronto_merged.shape[1]))]]

cluster1

Unnamed: 0,PostalCode,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,M1C,"Highland Creek, Rouge Hill, Port Union",Bar,Yoga Studio,Food,Flower Shop,Flea Market,Fish Market,Fish & Chips Shop,Filipino Restaurant,Field,Fast Food Restaurant
4,M1H,Cedarbrae,Playground,Lounge,Eastern European Restaurant,Flower Shop,Flea Market,Fish Market,Fish & Chips Shop,Filipino Restaurant,Field,Fast Food Restaurant
5,M1J,Scarborough Village,Train Station,Restaurant,Grocery Store,Indian Restaurant,Yoga Studio,Electronics Store,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farm
6,M1K,"East Birchmount Park, Ionview, Kennedy Park",Discount Store,Convenience Store,Department Store,Coffee Shop,Fast Food Restaurant,Event Space,Falafel Restaurant,Farm,Farmers Market,Field
7,M1L,"Clairlea, Golden Mile, Oakridge",Intersection,Bus Line,Bakery,Bus Station,Soccer Field,Coffee Shop,Yoga Studio,Event Space,Falafel Restaurant,Farm
8,M1M,"Cliffcrest, Cliffside, Scarborough Village West",Discount Store,Sandwich Place,Pharmacy,Bank,Liquor Store,Coffee Shop,Bistro,Yoga Studio,Event Space,Falafel Restaurant
10,M1P,"Dorset Park, Scarborough Town Centre, Wexford ...",Bakery,Wine Shop,Gift Shop,Yoga Studio,Farmers Market,Electronics Store,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farm
11,M1R,"Maryvale, Wexford",Convenience Store,Auto Garage,Fast Food Restaurant,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farm,Farmers Market,Field,Eastern European Restaurant
12,M1S,Agincourt,Shopping Mall,Skating Rink,Chinese Restaurant,Sushi Restaurant,Bakery,Discount Store,Supermarket,Shanghai Restaurant,Pool,Bubble Tea Shop
13,M1T,"Clarks Corners, Sullivan, Tam O'Shanter",Pharmacy,Pizza Place,Hobby Shop,Golf Course,Coffee Shop,Chinese Restaurant,Shopping Mall,Fried Chicken Joint,Thai Restaurant,Convenience Store
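
The same column selection is repeated for every cluster below. A small helper that picks columns by name rather than by position is one way to avoid the repetition; `cluster_view` is a hypothetical name, and the frame here is made up for illustration:

```python
import pandas as pd

def cluster_view(df, label):
    # hypothetical helper: identifiers plus every "... Most Common Venue" column
    venue_cols = [c for c in df.columns if c.endswith('Most Common Venue')]
    keep = ['PostalCode', 'Neighborhood'] + venue_cols
    return df.loc[df['Cluster Labels'] == label, keep]

toy = pd.DataFrame({
    'PostalCode': ['A1A', 'B2B'],
    'Borough': ['X', 'Y'],
    'Neighborhood': ['North', 'South'],
    'Latitude': [43.7, 43.8],
    'Longitude': [-79.4, -79.3],
    'Cluster Labels': [0, 1],
    '1st Most Common Venue': ['Park', 'Café'],
})

view = cluster_view(toy, 0)
print(view)
```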


Let's check the most popular venue types:

In [41]:
print(cluster1[["1st Most Common Venue","PostalCode"]].groupby("1st Most Common Venue").count().sort_values(by="PostalCode",ascending=False).head())
print(cluster1[["2nd Most Common Venue","PostalCode"]].groupby("2nd Most Common Venue").count().sort_values(by="PostalCode",ascending=False).head())

                       PostalCode
1st Most Common Venue            
Coffee Shop                    18
Convenience Store               6
Pizza Place                     6
Café                            5
Pharmacy                        4
                       PostalCode
2nd Most Common Venue            
Coffee Shop                    10
Café                            9
Fast Food Restaurant            5
Grocery Store                   4
Hotel                           3


**Cluster 1** - neighborhoods scattered across the city, with the highest concentration in the historical center. They have lots of coffee shops, convenience stores, pizza places and hotels.

### Cluster 2

In [42]:
cluster2 = toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[0,2] + list(range(6, toronto_merged.shape[1]))]]

cluster2

Unnamed: 0,PostalCode,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
28,M3H,"Bathurst Manor, Downsview North, Wilson Heights",Men's Store,Yoga Studio,Food & Drink Shop,Flower Shop,Flea Market,Fish Market,Fish & Chips Shop,Filipino Restaurant,Field,Fast Food Restaurant


**Cluster 2** - several neighborhoods within a single postal code (M3H), with men's stores, food and drink shops and yoga studios.

### Cluster 3

In [43]:
cluster3 = toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[0,2] + list(range(6, toronto_merged.shape[1]))]]

cluster3

Unnamed: 0,PostalCode,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M1B,"Rouge, Malvern",Home Service,Food & Drink Shop,Flower Shop,Flea Market,Fish Market,Fish & Chips Shop,Filipino Restaurant,Field,Fast Food Restaurant,Farmers Market
32,M3M,Downsview Central,Construction & Landscaping,Food & Drink Shop,Flower Shop,Flea Market,Fish Market,Fish & Chips Shop,Filipino Restaurant,Field,Fast Food Restaurant,Farmers Market
96,M9L,Humber Summit,Home Service,Business Service,Construction & Landscaping,American Restaurant,Electronics Store,Food,Flower Shop,Flea Market,Fish Market,Fish & Chips Shop


**Cluster 3** - several neighborhoods with home service facilities, food and drink shops, flower shops and fish markets.

### Cluster 4

In [44]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[0,2] + list(range(6, toronto_merged.shape[1]))]]

Unnamed: 0,PostalCode,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
94,M9B,"Cloverdale, Islington, Martin Grove, Princess ...",Filipino Restaurant,Yoga Studio,Food & Drink Shop,Flower Shop,Flea Market,Fish Market,Fish & Chips Shop,Field,Fast Food Restaurant,Farmers Market


**Cluster 4** - neighborhoods in Etobicoke (a single postal code, M9B) with Filipino restaurants, yoga studios and food and drink shops. These are mostly residential areas.

### Cluster 5

In [45]:
cluster5 = toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[0,2] + list(range(6, toronto_merged.shape[1]))]]

cluster5

Unnamed: 0,PostalCode,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,M1E,"Guildwood, Morningside, West Hill",Construction & Landscaping,Gym / Fitness Center,Park,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Eastern European Restaurant
3,M1G,Woburn,Business Service,Korean Restaurant,Soccer Field,Coffee Shop,Park,Fast Food Restaurant,Event Space,Falafel Restaurant,Farm,Farmers Market
9,M1N,"Birch Cliff, Cliffside West",General Entertainment,College Stadium,Skating Rink,Park,Gym Pool,Gym,Fast Food Restaurant,Field,Farmers Market,Donut Shop
19,M2K,Bayview Village,Construction & Landscaping,Dog Run,Trail,Park,Eastern European Restaurant,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farm,Farmers Market
23,M2P,York Mills West,Convenience Store,Speakeasy,Bank,Park,Flea Market,Fish Market,Fish & Chips Shop,Filipino Restaurant,Field,Eastern European Restaurant
25,M3A,Parkwoods,Food & Drink Shop,Bus Stop,Park,Fast Food Restaurant,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farm,Farmers Market,Field
26,M3B,Don Mills North,Burger Joint,Gas Station,Soccer Field,Park,Yoga Studio,Farmers Market,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farm
30,M3K,"CFB Toronto, Downsview East",Food Court,Park,Airport,Coffee Shop,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant
34,M4A,Victoria Village,Food Stand,Grocery Store,Park,Yoga Studio,Farmers Market,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farm,Fast Food Restaurant
40,M4J,East Toronto,Bar,Park,Italian Restaurant,Farmers Market,Fish Market,Fish & Chips Shop,Filipino Restaurant,Field,Fast Food Restaurant,Dumpling Restaurant


In [46]:
print(cluster5[["1st Most Common Venue","PostalCode"]].groupby("1st Most Common Venue").count().sort_values(by="PostalCode",ascending=False).head(5))
print(cluster5[["2nd Most Common Venue","PostalCode"]].groupby("2nd Most Common Venue").count().sort_values(by="PostalCode",ascending=False).head(5))

                            PostalCode
1st Most Common Venue                 
Park                                 4
Construction & Landscaping           2
Playground                           2
Bar                                  1
Burger Joint                         1
                       PostalCode
2nd Most Common Venue            
Park                            4
Basketball Court                1
Bus Line                        1
Speakeasy                       1
Mexican Restaurant              1


**Cluster 5** - neighborhoods with lots of parks, playgrounds and sports facilities. Construction and landscaping businesses and some restaurants are also present.

### Cluster 6

In [48]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 5, toronto_merged.columns[[0,2] + list(range(6, toronto_merged.shape[1]))]]

Unnamed: 0,PostalCode,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
16,M1X,Upper Rouge,,,,,,,,,,
20,M2L,"Silver Hills, York Mills",,,,,,,,,,
63,M5N,Roselawn,,,,,,,,,,


**Cluster 6** - mostly residential or less developed areas, about which we cannot say much without additional research.

Quick overview:<br/>
- Silver Hills - quiet residential area with mid+ price range<br/>
- York Mills - residential area with luxury condos and detached houses with mid to high price range<br/>
- Roselawn - residential area with mostly detached houses<br/>
- Rouge National Urban Park is located within this cluster.