## <u>Toronto Neighborhoods Analysis</u>

The analysis is divided into the following steps:
### 1. [Getting Toronto's postcodes, boroughs and neighborhoods.](#Getting_toronto_data)
### 2. [Acquiring the geospatial data of Toronto postal codes.](#geospatial_data)
### 3. [Clustering Toronto's neighborhoods according to their features.](#Clustering)

<a id='Getting_toronto_data'></a>

### <u>1. Getting Toronto's postcodes, boroughs and neighborhoods.</u>

In this step we obtain the borough and the neighborhoods for each Toronto postal code.
The data comes from the following Wikipedia page:

- [https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M](https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M)

With the data obtained from Wikipedia we will create a DataFrame with the following columns:

- Postcode
- Borough
- Neighbourhood

To obtain the final DataFrame we will apply the following transformations:

- Remove the rows in which the value of the *Borough* column is *Not assigned*.
- If a row has a value in the *Borough* column but its *Neighbourhood* value is *Not assigned*, use the *Borough* value as the neighborhood.
- Finally, combine into a single *Neighbourhood* value all neighborhoods that share the same postal code.
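
The three transformations can be sketched on a small, made-up table before applying them to the real data. The rows below are hypothetical (the `Queen's Park` row is there only to illustrate the second rule); the column names follow the Wikipedia table.

```python
import pandas as pd

# Hypothetical sample rows, shaped like the Wikipedia table
sample = pd.DataFrame({
    'Postcode': ['M1A', 'M5A', 'M5A', 'M7A'],
    'Borough': ['Not assigned', 'Downtown Toronto', 'Downtown Toronto', "Queen's Park"],
    'Neighbourhood': ['Not assigned', 'Harbourfront', 'Regent Park', 'Not assigned'],
})

# 1. Drop rows whose borough is not assigned
sample = sample[sample['Borough'] != 'Not assigned'].reset_index(drop=True)

# 2. Where only the neighbourhood is missing, reuse the borough name
missing = sample['Neighbourhood'] == 'Not assigned'
sample.loc[missing, 'Neighbourhood'] = sample.loc[missing, 'Borough']

# 3. Collapse to one row per postal code, joining the neighbourhood names
sample = (sample.groupby(['Postcode', 'Borough'])['Neighbourhood']
                .agg(', '.join)
                .reset_index())
print(sample)
```

The result has one row per postal code, with the neighbourhood names joined by commas.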

----

First of all we import the libraries needed to fetch the data from the Wikipedia page and build the DataFrame:

- Requests: makes it easy to send HTTP requests.
- Pandas: a library for manipulating and analyzing data; its main structure is the DataFrame.

In [1]:
import requests
import pandas as pd

 - Fetch the Wikipedia page with Toronto's postal codes, boroughs and neighborhoods, and assign its HTML to the variable *web_data*.

In [2]:
url = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'

web_data = requests.get(url).text

- Using the `read_html()` function from the *Pandas* library, convert the first table on the page (the Toronto postal codes) into a DataFrame and assign it to the variable *toronto_df*.
- Show the first five rows by applying the `.head()` method.

In [3]:
toronto_df = pd.read_html(web_data)[0]
toronto_df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront


We first print the number of rows of the original DataFrame.

We then create the *criteria* variable, a Boolean Series that is `True` where *Borough* is *Not assigned*, and use it to filter those rows out of *toronto_df*.

Finally, we print the number of rows removed and the number that remain, and check that their sum equals the number of rows of the original DataFrame.

In [4]:
rows_before = toronto_df.shape[0]
print('Rows before:', rows_before)

criteria = toronto_df['Borough']=='Not assigned'
toronto_df = toronto_df[~criteria].reset_index(drop=True)


print('Rows removed: ',criteria.sum())
print('Rows that remains: ',toronto_df.shape[0])

total_rows = criteria.sum()+toronto_df.shape[0]
print('Rows removed plus rows remains equals rows before?',total_rows==rows_before )

Rows before: 287
Rows removed:  77
Rows that remains:  210
Rows removed plus rows remains equals rows before? True


- Check if there is any 'Not assigned' value in the *Neighbourhood* column

In [5]:
toronto_df.loc[toronto_df['Neighbourhood']=='Not assigned', 'Neighbourhood'].any()

False

- In order to obtain the final DataFrame we group by columns *Postcode* and *Borough*.
- First we check the number of unique postal codes that we have in the DataFrame.

In [6]:
toronto_df.Postcode.nunique()

103

In [7]:
toronto_df = pd.DataFrame(toronto_df.groupby(['Postcode', 'Borough'])['Neighbourhood'].unique().str.join(', ').reset_index())

- Finally we show the first twelve rows of the DataFrame and print the total number of rows using the `.shape` attribute.

In [8]:
toronto_df.head(12)

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1B,Scarborough,"Rouge, Malvern"
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union"
2,M1E,Scarborough,"Guildwood, Morningside, West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae
5,M1J,Scarborough,Scarborough Village
6,M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park"
7,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge"
8,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West"
9,M1N,Scarborough,"Birch Cliff, Cliffside West"


In [9]:
print('Total rows: ', toronto_df.shape[0])

Total rows:  103


<a id='geospatial_data'></a>

### <u>2. Acquiring the geospatial data of Toronto postal codes.</u>

In this step we look up the coordinates of the Toronto postal codes and add the latitude and longitude to the DataFrame from step 1.

----

First we import the libraries we will need:
- Googlemaps: to obtain the coordinates of each postal code.
- Os: to read the Google API key from an environment variable.

In [10]:
import googlemaps
import os

- We instantiate a `Client` object to make requests to the Google Maps API web services and assign it to *gmaps*.

In [11]:
gmaps = googlemaps.Client(key=os.environ["GOOGLE_API_KEY"])

- We create the *codes* list containing all postal codes from *toronto_df*.

In [12]:
codes = toronto_df.Postcode.to_list()

- We build the *coor* dictionary with the keys:
  - *Postcode*
  - *Latitude*
  - *Longitude*
- We assign each key an empty list.
- We use a `for` loop to iterate through each postal code contained in *codes*.
- We apply the `geocode` method to make requests to the Google Geocoding API and obtain the coordinates of each Toronto postal code.
- We populate the dictionary inside a `try` block: if a postal code returns no results, the `except` block catches the resulting `IndexError` and the loop continues with the next code.

In [13]:
coor = {'Postcode': [], 'Latitude': [], 'Longitude': []}
for code in codes:
    g = gmaps.geocode('{}, Toronto, Ontario'.format(code))
    try:
        # Read the location first so a failed lookup leaves all three lists untouched
        location = g[0]['geometry']['location']
    except IndexError:
        continue
    coor['Postcode'].append(code)
    coor['Latitude'].append(location['lat'])
    coor['Longitude'].append(location['lng'])
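
To see how such a loop behaves when a lookup returns nothing, the sketch below swaps the Google client for a stand-in function. `fake_geocode`, its canned response and the code `M0Z` are hypothetical; only the shape of the result (a list of dicts with `geometry` and `location` keys) follows the `gmaps.geocode` output shown in this notebook.

```python
# Hypothetical stand-in for gmaps.geocode: returns a list of results,
# or an empty list when nothing is found.
def fake_geocode(query):
    canned = {
        'M1B, Toronto, Ontario': [
            {'geometry': {'location': {'lat': 43.806686, 'lng': -79.194353}}}
        ],
    }
    return canned.get(query, [])

coords = {'Postcode': [], 'Latitude': [], 'Longitude': []}
skipped = []
for code in ['M1B', 'M0Z']:  # 'M0Z' is a made-up code with no result
    results = fake_geocode('{}, Toronto, Ontario'.format(code))
    try:
        location = results[0]['geometry']['location']
    except IndexError:
        skipped.append(code)  # remember the code instead of failing
        continue
    # Append only after the lookup succeeded, so the three lists keep the same length
    coords['Postcode'].append(code)
    coords['Latitude'].append(location['lat'])
    coords['Longitude'].append(location['lng'])

print(coords)
print('Skipped:', skipped)
```

Keeping a list of skipped codes makes it easy to report which postal codes ended up without coordinates.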

- We use the `.from_dict()` method of the *DataFrame* class from *Pandas* to create a DataFrame from the *coor* dictionary and assign it to *coor_df*.

In [14]:
coor_df = pd.DataFrame.from_dict(coor)

- To get the final *DataFrame* we use the `.merge()` method to join *toronto_df* and *coor_df*, setting the parameter `on` to *Postcode* so that both DataFrames are joined on that column.

In [15]:
toronto_df = toronto_df.merge(coor_df, how='inner', on='Postcode')
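
Because the merge is an inner join, any postal code that is missing from *coor_df* silently disappears from the result. One way to check for that is a left join with `indicator=True`. The sketch below uses small made-up frames; `left` and `right` are illustrative names, not variables from this notebook.

```python
import pandas as pd

left = pd.DataFrame({'Postcode': ['M1B', 'M1C', 'M1E'],
                     'Borough': ['Scarborough'] * 3})
# Pretend the geocoder returned nothing for 'M1E'
right = pd.DataFrame({'Postcode': ['M1B', 'M1C'],
                      'Latitude': [43.806686, 43.784535],
                      'Longitude': [-79.194353, -79.160497]})

inner = left.merge(right, how='inner', on='Postcode')

# A left join with indicator=True adds a '_merge' column that flags unmatched rows
check = left.merge(right, how='left', on='Postcode', indicator=True)
unmatched = check.loc[check['_merge'] == 'left_only', 'Postcode'].tolist()

print(len(inner), unmatched)
```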

- Finally we apply the `.head()` method with the parameter *n* set to 12 to show the first twelve rows of the *DataFrame*.

In [16]:
toronto_df.head(12)

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
5,M1J,Scarborough,Scarborough Village,43.744734,-79.239476
6,M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park",43.727929,-79.262029
7,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge",43.711112,-79.284577
8,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West",43.716316,-79.239476
9,M1N,Scarborough,"Birch Cliff, Cliffside West",43.692657,-79.264848


<a id='Clustering'></a>

### <u>3. Clustering Toronto's neighborhoods according to their features.</u>

In the last step we group Toronto's neighbourhoods into clusters by their most common venues.
 ****

First of all we import the libraries that we need and have not imported yet.
 - *Folium*: to render maps.
 - *Numpy*: to make vectorized calculations.
 - *Pandas.json_normalize*: to flatten JSON data into a pandas DataFrame.
 - *Matplotlib.cm*: to get matplotlib's built-in colormaps.
 - *Matplotlib.colors*: to map numbers to colors.
 - *Sklearn.cluster.KMeans*: to cluster data with the K-Means algorithm.

In [17]:
import folium
import numpy as np
from pandas import json_normalize
import matplotlib.cm as cm
import matplotlib.colors as colors
from sklearn.cluster import KMeans

In [18]:
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

We use the googlemaps `Client` that we instantiated earlier to get the location of *Toronto*.

In [19]:
address = 'Toronto City, Ontario'
location = gmaps.geocode(address)
location

[{'address_components': [{'long_name': 'Toronto',
    'short_name': 'Toronto',
    'types': ['locality', 'political']},
   {'long_name': 'Toronto Division',
    'short_name': 'Toronto Division',
    'types': ['administrative_area_level_2', 'political']},
   {'long_name': 'Ontario',
    'short_name': 'ON',
    'types': ['administrative_area_level_1', 'political']},
   {'long_name': 'Canada',
    'short_name': 'CA',
    'types': ['country', 'political']}],
  'formatted_address': 'Toronto, ON, Canada',
  'geometry': {'bounds': {'northeast': {'lat': 43.8554579, 'lng': -79.1168971},
    'southwest': {'lat': 43.5810245, 'lng': -79.639219}},
   'location': {'lat': 43.653226, 'lng': -79.3831843},
   'location_type': 'APPROXIMATE',
   'viewport': {'northeast': {'lat': 43.8554579, 'lng': -79.1168971},
    'southwest': {'lat': 43.5810245, 'lng': -79.639219}}},
  'place_id': 'ChIJpTvG15DL1IkRd8S0KlBVNTI',
  'types': ['locality', 'political']}]

We get the coordinates for Toronto, and assign them to *latitude* & *longitude* variables.

In [20]:
latitude = location[0]['geometry']['location']['lat']
longitude = location[0]['geometry']['location']['lng']
print('The coordinate for Toronto are: {}, {}.'.format(latitude, longitude))

The coordinate for Toronto are: 43.653226, -79.3831843.


Next, we use the *folium* library to draw a map of Toronto's neighbourhoods from the data we obtained in step 2.

In [21]:
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=10)
for lat, lng, borough, neighborhood in zip(toronto_df['Latitude'],
                                           toronto_df['Longitude'],
                                           toronto_df['Borough'],
                                           toronto_df['Neighbourhood']):
    label = '{},\n\n{}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker([lat, lng],
                        radius=5,
                        popup=label,
                        color='blue',
                        fill=True,
                        fill_color='#3186cc',
                        fill_opacity=0.7,
                        parse_html=False).add_to(map_toronto)
map_toronto

Let's get the places near the first neighborhood of the dataframe.
***

We set our credentials and version for *Foursquare's* API

In [22]:
ID = os.environ['FOURSQUARE_CLIENT_ID']
SECRET = os.environ['FOURSQUARE_CLIENT_SECRET']
ACCESS_TOKEN = os.environ['FOURSQUARE_ACCESS_TOKEN']
version='20200314'

We set the number of places to 100 and the distance from the neighborhood to 500 meters.

In [23]:
LIMIT = 100
radius = 500

We assign the latitude and longitude of the first neighbourhood to the variables *lat* and *lng* respectively.

In [24]:
toronto_df.loc[0, 'Neighbourhood']
lat = toronto_df.loc[0, 'Latitude']
lng = toronto_df.loc[0,'Longitude']

We build the URI for our GET request. Since we want all the venues within a given radius, we use the *explore* endpoint.

In [25]:
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&oauth_token={}&v={}&ll={},{}&radius={}&limit={}'.format(
      ID,
      SECRET,
      ACCESS_TOKEN,
      version,
      lat,
      lng,
      radius,
      LIMIT)


We make the GET request and check the result

In [26]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5e6e75e71835dd001bee37a7'},
 'notifications': [{'type': 'notificationTray', 'item': {'unreadCount': 0}}],
  'headerLocation': 'Malvern',
  'headerFullLocation': 'Malvern, Toronto',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 2,
  'suggestedBounds': {'ne': {'lat': 43.811186304500005,
    'lng': -79.1881295807304},
   'sw': {'lat': 43.8021862955, 'lng': -79.20057721926959}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4bb6b9446edc76b0d771311c',
       'name': "Wendy's",
       'location': {'crossStreet': 'Morningside & Sheppard',
        'lat': 43.80744841934756,
        'lng': -79.19905558052072,
        'labeledLatLngs': [{'label': 'display',
          'lat': 43.80744841934756,
          'lng': -79.19905558052

The request returns only two venues. The *warning* key of the *response* says: "There aren't a lot of results near you. Try something more general, reset your filters, or expand the search area."

Since the request is being made from Spain on a Sunday at 16:40 (CET), we try changing the filters in case the results depend on the time or day of the request.
The docs for *Foursquare's* explore endpoint show that the *time* and *day* parameters can be set to *any* in order to obtain venues for any time of the day and any day of the week.

[https://developer.foursquare.com/docs/api/venues/explore](https://developer.foursquare.com/docs/api/venues/explore)
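
One way to keep optional filters such as *time* and *day* readable is to hold the query parameters in a dictionary and let the standard library encode them. This is only a sketch: the credential values are placeholders, `base_url` and `params` are illustrative names, and the parameter names follow the Foursquare v2 explore endpoint as used in this notebook.

```python
from urllib.parse import urlencode

base_url = 'https://api.foursquare.com/v2/venues/explore'

params = {
    'client_id': 'YOUR_CLIENT_ID',          # placeholder, not a real credential
    'client_secret': 'YOUR_CLIENT_SECRET',  # placeholder, not a real credential
    'v': '20200314',
    'll': '43.806686,-79.194353',           # latitude,longitude of the first row
    'radius': 500,
    'limit': 100,
    'time': 'any',  # do not restrict results to the current time of day
    'day': 'any',   # do not restrict results to the current day of the week
}

# safe=',' keeps the comma in 'll' unescaped
url = base_url + '?' + urlencode(params, safe=',')
print(url)
```

The same dictionary can also be passed as `requests.get(base_url, params=params)`, in which case *requests* performs the encoding itself.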

We build a new URI with the *time* and *day* parameters and check the results.


In [27]:
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&oauth_token={}&v={}&ll={},{}&radius={}&limit={}&time=any&day=any'.format(
      ID,
      SECRET,
      ACCESS_TOKEN,
      version,
      lat,
      lng,
      radius,
      LIMIT)
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5e6e7627e826ac001b03433d'},
 'notifications': [{'type': 'notificationTray', 'item': {'unreadCount': 0}}],
  'headerLocation': 'Malvern',
  'headerFullLocation': 'Malvern, Toronto',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 2,
  'suggestedBounds': {'ne': {'lat': 43.811186304500005,
    'lng': -79.1881295807304},
   'sw': {'lat': 43.8021862955, 'lng': -79.20057721926959}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4bb6b9446edc76b0d771311c',
       'name': "Wendy's",
       'location': {'crossStreet': 'Morningside & Sheppard',
        'lat': 43.80744841934756,
        'lng': -79.19905558052072,
        'labeledLatLngs': [{'label': 'display',
          'lat': 43.80744841934756,
          'lng': -79.19905558052

As we can see, we still get only two venues, so let's increase the distance from the neighbourhood to 1000 meters.

In [28]:
radius=1000
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&oauth_token={}&v={}&ll={},{}&radius={}&limit={}&time=any&day=any'.format(
      ID,
      SECRET,
      ACCESS_TOKEN,
      version,
      lat,
      lng,
      radius,
      LIMIT)
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5e6e767978a484001bac5c5d'},
 'notifications': [{'type': 'notificationTray', 'item': {'unreadCount': 0}}],
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'Malvern',
  'headerFullLocation': 'Malvern, Toronto',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 38,
  'suggestedBounds': {'ne': {'lat': 43.815686309000014,
    'lng': -79.1819057614608},
   'sw': {'lat': 43.79768629099999, 'lng': -79.2068010385392}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '579a91b3498e9bd833afa78a',
       'name': "Wendy's",
       'location': {'address': '8129 Sheppard Avenue',
        'lat': 43.8020084,
        'lng': -79.1980797,
        'label

We now have thirty-eight venues, so we leave the radius set at 1000 meters.
***

We borrow the function we used for the Manhattan neighborhoods in New York to get the venues for every Toronto neighborhood.

In [29]:
def getNearbyVenues(names, latitudes, longitudes, radius=1000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&oauth_token={}&v={}&ll={},{}&radius={}&limit={}&time=any&day=any'.format(
            ID, 
            SECRET,
            ACCESS_TOKEN,
            version, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
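
The list comprehension inside `getNearbyVenues` assumes that every venue has at least one category; `v['venue']['categories'][0]` raises an `IndexError` otherwise. The sketch below shows one defensive way to pull the same fields out of a single item. `extract_row` and the sample `items` are hypothetical, with the dictionary layout modelled on the responses shown above.

```python
def extract_row(name, lat, lng, item):
    """Flatten one explore item into the 7-field row used for the DataFrame."""
    venue = item['venue']
    categories = venue.get('categories') or []
    # Fall back to None when no category is present for the venue
    category = categories[0]['name'] if categories else None
    return (name, lat, lng,
            venue['name'],
            venue['location']['lat'],
            venue['location']['lng'],
            category)

# Hypothetical items: the second one has an empty category list
items = [
    {'venue': {'name': "Wendy's",
               'location': {'lat': 43.8020084, 'lng': -79.1980797},
               'categories': [{'name': 'Fast Food Restaurant'}]}},
    {'venue': {'name': 'Unnamed spot',
               'location': {'lat': 43.80, 'lng': -79.19},
               'categories': []}},
]

rows = [extract_row('Rouge, Malvern', 43.806686, -79.194353, item) for item in items]
for row in rows:
    print(row)
```

Rows with a `None` category can then be dropped or relabelled before the one-hot encoding step.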

We build the DataFrame *toronto_venues* by applying the function defined above.

In [30]:
toronto_venues=getNearbyVenues(names=toronto_df['Neighbourhood'],
                               latitudes=toronto_df['Latitude'],
                               longitudes=toronto_df['Longitude'])

Rouge, Malvern
Highland Creek, Rouge Hill, Port Union
Guildwood, Morningside, West Hill
Woburn
Cedarbrae
Scarborough Village
East Birchmount Park, Ionview, Kennedy Park
Clairlea, Golden Mile, Oakridge
Cliffcrest, Cliffside, Scarborough Village West
Birch Cliff, Cliffside West
Dorset Park, Scarborough Town Centre, Wexford Heights
Maryvale, Wexford
Agincourt
Clarks Corners, Sullivan, Tam O'Shanter
Agincourt North, L'Amoreaux East, Milliken, Steeles East
L'Amoreaux West
Upper Rouge
Hillcrest Village
Fairview, Henry Farm, Oriole
Bayview Village
Silver Hills, York Mills
Newtonbrook, Willowdale
Willowdale South
York Mills West
Willowdale West
Parkwoods
Don Mills North
Flemingdon Park, Don Mills South
Bathurst Manor, Downsview North, Wilson Heights
Northwood Park, York University
CFB Toronto, Downsview East
Downsview West
Downsview Central
Downsview Northwest
Victoria Village
Woodbine Gardens, Parkview Hill
Woodbine Heights
The Beaches
Leaside
Thorncliffe Park
East Toronto
The Danforth West, 

Check the size of the *toronto_venues* DataFrame and show the first five rows.

In [31]:
print(toronto_venues.shape)
toronto_venues.head()

(4537, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Rouge, Malvern",43.806686,-79.194353,Wendy's,43.802008,-79.19808,Fast Food Restaurant
1,"Rouge, Malvern",43.806686,-79.194353,Wendy's,43.807448,-79.199056,Fast Food Restaurant
2,"Rouge, Malvern",43.806686,-79.194353,Caribbean Wave,43.798558,-79.195777,Caribbean Restaurant
3,"Rouge, Malvern",43.806686,-79.194353,Harvey's,43.80002,-79.198307,Restaurant
4,"Rouge, Malvern",43.806686,-79.194353,Staples Morningside,43.800285,-79.196607,Paper / Office Supplies Store


Check how many venues were returned for each neighbourhood

In [32]:
toronto_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
"Adelaide, King, Richmond",100,100,100,100,100,100
Agincourt,56,56,56,56,56,56
"Agincourt North, L'Amoreaux East, Milliken, Steeles East",36,36,36,36,36,36
"Albion Gardens, Beaumond Heights, Humbergate, Jamestown, Mount Olive, Silverstone, South Steeles, Thistletown",31,31,31,31,31,31
"Alderwood, Long Branch",25,25,25,25,25,25
"Bathurst Manor, Downsview North, Wilson Heights",41,41,41,41,41,41
Bayview Village,19,19,19,19,19,19
"Bedford Park, Lawrence Manor East",58,58,58,58,58,58
Berczy Park,100,100,100,100,100,100
"Birch Cliff, Cliffside West",22,22,22,22,22,22


We can see that the number of venues returned varies widely between neighborhoods.
We apply the `.describe()` method from the *Pandas* library to see how the number of venues returned is distributed across neighborhoods.

In [33]:
toronto_venues.groupby('Neighborhood').count().Venue.describe()

count    102.000000
mean      44.480392
std       31.495632
min        3.000000
25%       19.250000
50%       35.000000
75%       59.000000
max      100.000000
Name: Venue, dtype: float64

The middle 50 percent of neighbourhoods (the interquartile range, from 19.25 to 59 venues) returned roughly between 20 and 60 venues. For a consistent analysis, we keep only the neighbourhoods that returned at least 20 and at most 60 venues.

In [34]:
toronto_venues_grouped = toronto_venues.groupby('Neighborhood').count().reset_index()
criteria = (toronto_venues_grouped['Venue'] >= 20) & (toronto_venues_grouped['Venue'] <= 60)
neighbourhoods_choose = toronto_venues_grouped[criteria]['Neighborhood'].tolist()
len(neighbourhoods_choose)

52
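
The 20 and 60 cut-offs were read off the `.describe()` output by eye. A sketch of deriving the same kind of bounds programmatically with `Series.quantile` follows; the venue counts are made up, and `counts`, `lower` and `upper` are illustrative names.

```python
import pandas as pd

# Hypothetical venue counts per neighbourhood
counts = pd.Series([3, 10, 19, 22, 35, 41, 58, 60, 87, 100],
                   index=['N{}'.format(i) for i in range(10)])

lower = counts.quantile(0.25)  # first quartile
upper = counts.quantile(0.75)  # third quartile

# Keep only neighbourhoods whose count lies within the interquartile range
selected = counts[counts.between(lower, upper)]
print(lower, upper, selected.index.tolist())
```

Deriving the bounds this way keeps the filter consistent if the venue data is fetched again and the distribution shifts.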

In [35]:
toronto_venues_select = toronto_venues[toronto_venues['Neighborhood'].isin(neighbourhoods_choose)].reset_index(drop=True)
print(toronto_venues_select.Neighborhood.nunique())
toronto_venues_select

52


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Rouge, Malvern",43.806686,-79.194353,Wendy's,43.802008,-79.19808,Fast Food Restaurant
1,"Rouge, Malvern",43.806686,-79.194353,Wendy's,43.807448,-79.199056,Fast Food Restaurant
2,"Rouge, Malvern",43.806686,-79.194353,Caribbean Wave,43.798558,-79.195777,Caribbean Restaurant
3,"Rouge, Malvern",43.806686,-79.194353,Harvey's,43.80002,-79.198307,Restaurant
4,"Rouge, Malvern",43.806686,-79.194353,Staples Morningside,43.800285,-79.196607,Paper / Office Supplies Store
5,"Rouge, Malvern",43.806686,-79.194353,Tim Hortons,43.802,-79.198169,Coffee Shop
6,"Rouge, Malvern",43.806686,-79.194353,NT Home Service Inc.,43.806411,-79.197736,Home Service
7,"Rouge, Malvern",43.806686,-79.194353,Lee Valley,43.803161,-79.199681,Hobby Shop
8,"Rouge, Malvern",43.806686,-79.194353,Images Salon & Spa,43.802283,-79.198565,Spa
9,"Rouge, Malvern",43.806686,-79.194353,Morningside Guardian Pharmacy,43.802551,-79.199422,Pharmacy


We see that 52 neighborhoods returned between 20 and 60 venues.
Now let's find out how many unique categories we have.

In [36]:
print('There are {} uniques categories'.format(toronto_venues_select['Venue Category'].nunique()))

There are 266 uniques categories


- We use the `get_dummies()` function from the *Pandas* library to convert the values of the *Venue Category* column into dummy/indicator variables.

In [37]:
toronto_onehot = pd.get_dummies(toronto_venues_select[['Venue Category']], prefix='', prefix_sep='')
toronto_onehot.insert(loc=0, column='Neighborhood', value=toronto_venues_select['Neighborhood']) 

print(toronto_onehot.shape)
toronto_onehot.head()

(1880, 267)


Unnamed: 0,Neighborhood,ATM,Accessories Store,African Restaurant,Airport,American Restaurant,Amphitheater,Antique Shop,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Dealership,Auto Garage,Auto Workshop,Automotive Shop,BBQ Joint,Baby Store,Badminton Court,Bagel Shop,Bakery,Bank,Bar,Beach,Beer Bar,Beer Store,Bike Shop,Bistro,Bookstore,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Café,Cantonese Restaurant,Caribbean Restaurant,Carpet Store,Cemetery,Chinese Restaurant,Chiropractor,Clothing Store,Coffee Shop,College Rec Center,College Stadium,Comedy Club,Comfort Food Restaurant,Comic Shop,Community Center,Construction & Landscaping,Convenience Store,Cosmetics Shop,Creperie,Cuban Restaurant,Cupcake Shop,Curling Ice,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Dive Bar,Dog Run,Doner Restaurant,Donut Shop,Dry Cleaner,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Elementary School,Empanada Restaurant,Ethiopian Restaurant,Event Space,Fabric Shop,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Field,Filipino Restaurant,Financial or Legal Service,Fireworks Store,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Frame Store,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Garden,Garden Center,Gas Station,Gastropub,General Entertainment,Gift Shop,Golf Course,Golf Driving Range,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Hakka Restaurant,Hardware Store,Health & Beauty Service,Health Food Store,Hobby Shop,Hockey Arena,Home Service,Hookah Bar,Hostel,Hot Dog Joint,Hotel,Hotpot Restaurant,IT Services,Ice Cream Shop,Indian Chinese Restaurant,Indian Restaurant,Indie Movie Theater,Indonesian Restaurant,Intersection,Italian Restaurant,Japanese 
Restaurant,Jazz Club,Jewelry Store,Juice Bar,Kids Store,Kitchen Supply Store,Korean Restaurant,Latin American Restaurant,Laundromat,Laundry Service,Lawyer,Library,Light Rail Station,Lighting Store,Lingerie Store,Liquor Store,Locksmith,Lounge,Malay Restaurant,Market,Martial Arts Dojo,Massage Studio,Medical Center,Medical Supply Store,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Moroccan Restaurant,Motorcycle Shop,Movie Theater,Moving Target,Music Store,Music Venue,Nail Salon,Noodle House,Office,Optical Shop,Organic Grocery,Other Great Outdoors,Other Repair Shop,Outdoor Supply Store,Pakistani Restaurant,Paper / Office Supplies Store,Park,Pastry Shop,Performing Arts Venue,Pet Store,Pharmacy,Photography Studio,Pie Shop,Pilates Studio,Pizza Place,Playground,Plaza,Pool,Pool Hall,Portuguese Restaurant,Print Shop,Pub,Ramen Restaurant,Record Shop,Recording Studio,Recreation Center,Rental Car Location,Rental Service,Residential Building (Apartment / Condo),Restaurant,Road,Rock Climbing Spot,Rock Club,Sake Bar,Salon / Barbershop,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shanghai Restaurant,Shoe Store,Shop & Service,Shopping Mall,Shopping Plaza,Skate Park,Skating Rink,Ski Chalet,Smoke Shop,Snack Place,Soccer Field,Soccer Stadium,Soup Place,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Sri Lankan Restaurant,Steakhouse,Storage Facility,Supermarket,Supplement Shop,Sushi Restaurant,Taco Place,Tailor Shop,Taiwanese Restaurant,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Thrift / Vintage Store,Toy / Game Store,Track,Trail,Train Station,Turkish Restaurant,Vape Store,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Wine Bar,Wings Joint,Women's Store,Yoga Studio,Zoo
0,"Rouge, Malvern",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,"Rouge, Malvern",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,"Rouge, Malvern",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,"Rouge, Malvern",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,"Rouge, Malvern",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
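
To make the encoding concrete, together with the per-neighbourhood averaging applied to it in the next cell, here is the same pair of operations on a made-up table of five venues. The neighbourhood names `A` and `B` and their venues are hypothetical, and `dtype=int` is added here only so the indicator columns are 0/1 integers regardless of the pandas version.

```python
import pandas as pd

venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'A', 'B'],
    'Venue Category': ['Park', 'Park', 'Bakery', 'Bank', 'Bakery'],
})

# One indicator column per category, one row per venue
onehot = pd.get_dummies(venues[['Venue Category']], prefix='', prefix_sep='', dtype=int)
onehot.insert(loc=0, column='Neighborhood', value=venues['Neighborhood'])

# The mean of the indicator columns is the share of each category in a neighbourhood
freq = onehot.groupby('Neighborhood').mean().reset_index()
print(freq)
```

Each row of `freq` sums to 1 across the category columns, so neighbourhoods with different numbers of venues become comparable.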


In [38]:
toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()
toronto_grouped

Unnamed: 0,Neighborhood,ATM,Accessories Store,African Restaurant,Airport,American Restaurant,Amphitheater,Antique Shop,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Dealership,Auto Garage,Auto Workshop,Automotive Shop,BBQ Joint,Baby Store,Badminton Court,Bagel Shop,Bakery,Bank,Bar,Beach,Beer Bar,Beer Store,Bike Shop,Bistro,Bookstore,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Café,Cantonese Restaurant,Caribbean Restaurant,Carpet Store,Cemetery,Chinese Restaurant,Chiropractor,Clothing Store,Coffee Shop,College Rec Center,College Stadium,Comedy Club,Comfort Food Restaurant,Comic Shop,Community Center,Construction & Landscaping,Convenience Store,Cosmetics Shop,Creperie,Cuban Restaurant,Cupcake Shop,Curling Ice,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Dive Bar,Dog Run,Doner Restaurant,Donut Shop,Dry Cleaner,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Elementary School,Empanada Restaurant,Ethiopian Restaurant,Event Space,Fabric Shop,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Field,Filipino Restaurant,Financial or Legal Service,Fireworks Store,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Frame Store,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Garden,Garden Center,Gas Station,Gastropub,General Entertainment,Gift Shop,Golf Course,Golf Driving Range,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Hakka Restaurant,Hardware Store,Health & Beauty Service,Health Food Store,Hobby Shop,Hockey Arena,Home Service,Hookah Bar,Hostel,Hot Dog Joint,Hotel,Hotpot Restaurant,IT Services,Ice Cream Shop,Indian Chinese Restaurant,Indian Restaurant,Indie Movie Theater,Indonesian Restaurant,Intersection,Italian Restaurant,Japanese 
Restaurant,Jazz Club,Jewelry Store,Juice Bar,Kids Store,Kitchen Supply Store,Korean Restaurant,Latin American Restaurant,Laundromat,Laundry Service,Lawyer,Library,Light Rail Station,Lighting Store,Lingerie Store,Liquor Store,Locksmith,Lounge,Malay Restaurant,Market,Martial Arts Dojo,Massage Studio,Medical Center,Medical Supply Store,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Moroccan Restaurant,Motorcycle Shop,Movie Theater,Moving Target,Music Store,Music Venue,Nail Salon,Noodle House,Office,Optical Shop,Organic Grocery,Other Great Outdoors,Other Repair Shop,Outdoor Supply Store,Pakistani Restaurant,Paper / Office Supplies Store,Park,Pastry Shop,Performing Arts Venue,Pet Store,Pharmacy,Photography Studio,Pie Shop,Pilates Studio,Pizza Place,Playground,Plaza,Pool,Pool Hall,Portuguese Restaurant,Print Shop,Pub,Ramen Restaurant,Record Shop,Recording Studio,Recreation Center,Rental Car Location,Rental Service,Residential Building (Apartment / Condo),Restaurant,Road,Rock Climbing Spot,Rock Club,Sake Bar,Salon / Barbershop,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shanghai Restaurant,Shoe Store,Shop & Service,Shopping Mall,Shopping Plaza,Skate Park,Skating Rink,Ski Chalet,Smoke Shop,Snack Place,Soccer Field,Soccer Stadium,Soup Place,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Sri Lankan Restaurant,Steakhouse,Storage Facility,Supermarket,Supplement Shop,Sushi Restaurant,Taco Place,Tailor Shop,Taiwanese Restaurant,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Thrift / Vintage Store,Toy / Game Store,Track,Trail,Train Station,Turkish Restaurant,Vape Store,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Wine Bar,Wings Joint,Women's Store,Yoga Studio,Zoo
0,Agincourt,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.017857,0.0,0.035714,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.017857,0.035714,0.0,0.0,0.142857,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.017857,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.017857,0.0,0.0,0.017857,0.017857,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.017857,0.017857,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.017857,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,"Agincourt North, L'Amoreaux East, Milliken, St...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.194444,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Albion Gardens, Beaumond Heights, Humbergate, ...",0.0,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.129032,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.032258,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032258,0.096774,0.0,0.0,0.032258,0.0,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.064516,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.064516,0.0,0.0,0.0,0.096774,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.0,0.0,0.0
3,"Alderwood, Long Branch",0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.08,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.0,0.0,0.04,0.04,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"Bathurst Manor, Downsview North, Wilson Heights",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.04878,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.04878,0.0,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.02439,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02439,0.02439,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.0,0.04878,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02439,0.02439,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.0,0.04878,0.0,0.0,0.02439,0.073171,0.0,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.04878,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.0
5,"Bedford Park, Lawrence Manor East",0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.017241,0.017241,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.017241,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.051724,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.051724,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.051724,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.017241,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.017241,0.017241,0.0,0.0,0.0,0.051724,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.051724,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.017241,0.0,0.0
6,"Birch Cliff, Cliffside West",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.045455,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.045455,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,"Bloordale Gardens, Eringate, Markland Wood, Ol...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.083333,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.041667,0.041667,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Business Reply Mail Processing Centre 969 Eastern,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.026316,0.0,0.026316,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.078947,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.026316,0.0,0.0,0.0,0.0,0.026316,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.105263,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.078947,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,"CFB Toronto, Downsview East",0.0,0.030303,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.030303,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.060606,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0


Create a new DataFrame with the top 10 venues for each neighborhood.

To do this, we borrow the function *return_most_common_venues* from the notebook *Segmenting and Clustering Neighborhoods in New York City*.

In [39]:
def return_most_common_venues(row, num_top_venues):
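    # Drop the first entry (the neighborhood name); the rest are category frequencies
    row_categories = row.iloc[1:]
    # Sort the categories from most to least frequent
    row_categories_sorted = row_categories.sort_values(ascending=False)
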
    return row_categories_sorted.index.values[0:num_top_venues]
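
To see what the helper returns, here is a minimal sketch on a made-up row. The neighborhood name and the frequencies are hypothetical, not taken from the Toronto data, and the function is repeated only so the snippet runs on its own.

```python
import pandas as pd


def return_most_common_venues(row, num_top_venues):
    # Same logic as the notebook cell: skip the name, sort the frequencies
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]


# Hypothetical row: first entry is the neighborhood, the rest are frequencies
toy_row = pd.Series({
    'Neighborhood': 'Example',
    'Bakery': 0.1,
    'Café': 0.5,
    'Park': 0.3,
    'Zoo': 0.0,
})

top3 = list(return_most_common_venues(toy_row, 3))
print(top3)  # ['Café', 'Park', 'Bakery']
```

The function returns category names only, so the frequencies themselves are not carried into the resulting table.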

In [40]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # positions past the 3rd use the generic 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Agincourt,Chinese Restaurant,Spa,Shopping Mall,Caribbean Restaurant,Bakery,Food & Drink Shop,Restaurant,Sri Lankan Restaurant,Pool Hall,Print Shop
1,"Agincourt North, L'Amoreaux East, Milliken, St...",Chinese Restaurant,Pharmacy,Bakery,Pizza Place,Park,Spa,Noodle House,BBQ Joint,Shop & Service,Caribbean Restaurant
2,"Albion Gardens, Beaumond Heights, Humbergate, ...",Electronics Store,Pizza Place,Grocery Store,Pharmacy,Mobile Phone Shop,Gym Pool,Auto Garage,Sushi Restaurant,Liquor Store,Financial or Legal Service
3,"Alderwood, Long Branch",Pharmacy,Convenience Store,Pizza Place,ATM,Coffee Shop,Construction & Landscaping,Recording Studio,Pub,Print Shop,Discount Store
4,"Bathurst Manor, Downsview North, Wilson Heights",Pharmacy,Ice Cream Shop,Coffee Shop,Convenience Store,Spa,Park,Pizza Place,Deli / Bodega,Ski Chalet,Shopping Mall
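
The column names above are built with a `try`/`except` over a three-element suffix list, which is enough for ten columns. As a rough alternative sketch, the suffix can be computed explicitly. Note that `ordinal` is a hypothetical helper introduced here for illustration; it is not part of the original notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix (hypothetical helper)."""
    # 11th, 12th and 13th are exceptions to the st/nd/rd rule
    if 10 <= n % 100 <= 20:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)


num_top_venues = 10
columns = ['Neighborhood'] + [
    '{} Most Common Venue'.format(ordinal(i))
    for i in range(1, num_top_venues + 1)
]
print(columns[:4])
```

For ten columns both approaches give the same names; the explicit version only matters if `num_top_venues` grows past 20.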


- We use k-means to cluster the neighborhoods into 5 clusters.

In [41]:
kclusters = 5

toronto_grouped_clustering = toronto_grouped.drop('Neighborhood', axis=1)

kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)
kmeans.labels_

array([4, 4, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 3, 0, 0, 0, 0, 2, 0, 2, 0, 0,
       1, 0, 0, 1, 0, 0, 0, 1, 1, 4, 1, 0, 2, 1, 1, 0, 1, 0, 1, 0, 1, 1,
       1, 0, 1, 0, 0, 1, 0, 0])
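
A quick way to check how evenly the neighborhoods are spread across clusters is to count the labels. The sketch below copies the label array printed above so that it runs on its own; inside the notebook one would presumably pass `kmeans.labels_` directly.

```python
import numpy as np

# Labels copied from the k-means output above
labels = np.array([4, 4, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 3, 0, 0, 0, 0, 2, 0, 2, 0, 0,
                   1, 0, 0, 1, 0, 0, 0, 1, 1, 4, 1, 0, 2, 1, 1, 0, 1, 0, 1, 0, 1, 1,
                   1, 0, 1, 0, 0, 1, 0, 0])

# counts[i] is the number of neighborhoods assigned to label i
counts = np.bincount(labels, minlength=5)
print(counts)  # [29 16  3  1  3]
```

Two of the five labels hold 45 of the 52 rows, and one cluster contains a single row, which is worth keeping in mind when interpreting the clusters.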

Create a new DataFrame *toronto_merged* that includes the cluster label as well as the top 10 venues and the coordinates for each neighborhood.

In [42]:
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
toronto_merged = toronto_df.merge(neighborhoods_venues_sorted.set_index('Neighborhood'),
                                   how='inner', left_on='Neighbourhood', right_on='Neighborhood')
toronto_merged.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353,0,Office,Pharmacy,Fast Food Restaurant,Coffee Shop,Trail,Restaurant,Martial Arts Dojo,Fruit & Vegetable Store,Filipino Restaurant,Spa
1,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711,0,Pizza Place,Electronics Store,Salon / Barbershop,Supermarket,Coffee Shop,Park,Fast Food Restaurant,Pharmacy,Convenience Store,Restaurant
2,M1H,Scarborough,Cedarbrae,43.773136,-79.239476,0,Coffee Shop,Bakery,Pharmacy,Burger Joint,Indian Restaurant,Gas Station,Caribbean Restaurant,Gym / Fitness Center,Thai Restaurant,Chinese Restaurant
3,M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park",43.727929,-79.262029,0,Coffee Shop,Discount Store,Pharmacy,Chinese Restaurant,Grocery Store,Fast Food Restaurant,Convenience Store,Bus Line,Metro Station,Bus Station
4,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge",43.711112,-79.284577,0,Intersection,Pharmacy,Business Service,Convenience Store,Home Service,Coffee Shop,Bus Line,Park,Ice Cream Shop,Diner
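
The merge joins two tables whose key columns are spelled differently (*Neighbourhood* and *Neighborhood*), and `how='inner'` keeps only rows present in both. A minimal sketch of that behavior, using small made-up tables (all values below are hypothetical) and a plain column on the right side rather than an index:

```python
import pandas as pd

# Hypothetical stand-in for toronto_df
left = pd.DataFrame({
    'Postcode': ['A1', 'A2', 'A3'],
    'Neighbourhood': ['North', 'South', 'East'],
})

# Hypothetical stand-in for neighborhoods_venues_sorted
right = pd.DataFrame({
    'Neighborhood': ['North', 'South'],
    'Cluster Labels': [0, 1],
})

merged = left.merge(right, how='inner',
                    left_on='Neighbourhood', right_on='Neighborhood')
# Both key columns survive the merge; drop the duplicate spelling
merged = merged.drop(columns='Neighborhood')
print(merged)
```

The row for `East` is dropped because it has no match on the right side, so any neighborhood without venue data would silently disappear from the merged table.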


Visualize the resulting clusters on a map.

In [43]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighbourhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster],
        fill=True,
        fill_color=rainbow[cluster],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
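
The color scheme boils down to sampling one color per cluster from Matplotlib's `rainbow` colormap and converting it to a hex string that folium accepts. A minimal sketch of just that step, assuming the same `kclusters = 5` as above:

```python
import numpy as np
from matplotlib import cm, colors

kclusters = 5

# Sample kclusters evenly spaced RGBA colors from the rainbow colormap
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))

# Convert each RGBA row to a '#rrggbb' string
rainbow = [colors.rgb2hex(c) for c in colors_array]

for label, hex_color in enumerate(rainbow):
    print(label, hex_color)
```

Since the k-means labels are 0-based, each label can index this list directly.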

Examine each cluster to determine the venue categories that distinguish it.

##### Cluster 1

In [44]:
print('There are {} neighborhood in Cluster 1.'.format(toronto_merged.loc[toronto_merged['Cluster Labels'] == 0].shape[0]))
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[2] + list(range(6, toronto_merged.shape[1]))]]

There are 29 neighborhood in Cluster 1.


Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Rouge, Malvern",Office,Pharmacy,Fast Food Restaurant,Coffee Shop,Trail,Restaurant,Martial Arts Dojo,Fruit & Vegetable Store,Filipino Restaurant,Spa
1,"Guildwood, Morningside, West Hill",Pizza Place,Electronics Store,Salon / Barbershop,Supermarket,Coffee Shop,Park,Fast Food Restaurant,Pharmacy,Convenience Store,Restaurant
2,Cedarbrae,Coffee Shop,Bakery,Pharmacy,Burger Joint,Indian Restaurant,Gas Station,Caribbean Restaurant,Gym / Fitness Center,Thai Restaurant,Chinese Restaurant
3,"East Birchmount Park, Ionview, Kennedy Park",Coffee Shop,Discount Store,Pharmacy,Chinese Restaurant,Grocery Store,Fast Food Restaurant,Convenience Store,Bus Line,Metro Station,Bus Station
4,"Clairlea, Golden Mile, Oakridge",Intersection,Pharmacy,Business Service,Convenience Store,Home Service,Coffee Shop,Bus Line,Park,Ice Cream Shop,Diner
5,"Birch Cliff, Cliffside West",Ice Cream Shop,Park,Photography Studio,Spa,College Stadium,Restaurant,General Entertainment,Skating Rink,Construction & Landscaping,Convenience Store
6,"Dorset Park, Scarborough Town Centre, Wexford ...",Furniture / Home Store,Coffee Shop,Indian Restaurant,Chinese Restaurant,Restaurant,Automotive Shop,Pharmacy,Electronics Store,Construction & Landscaping,Fast Food Restaurant
7,"Maryvale, Wexford",Middle Eastern Restaurant,Grocery Store,Pizza Place,Burger Joint,Intersection,Rental Car Location,Pharmacy,Smoke Shop,Bus Station,Business Service
9,"Clarks Corners, Sullivan, Tam O'Shanter",Pharmacy,Coffee Shop,Convenience Store,Sandwich Place,Intersection,Pizza Place,Fast Food Restaurant,Building,Market,Chinese Restaurant
12,Hillcrest Village,Pharmacy,Park,Coffee Shop,Convenience Store,Shopping Mall,Mobile Phone Shop,Fast Food Restaurant,Chinese Restaurant,Sandwich Place,Restaurant


##### Cluster 2 

In [45]:
print('There are {} neighborhood in Cluster 2.'.format(toronto_merged.loc[toronto_merged['Cluster Labels'] == 1].shape[0]))
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[2] + list(range(6, toronto_merged.shape[1]))]]

There are 16 neighborhood in Cluster 2.


Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
13,"Newtonbrook, Willowdale",Korean Restaurant,Café,Middle Eastern Restaurant,Pizza Place,Coffee Shop,Bus Line,Shopping Mall,Dessert Shop,Ski Chalet,Bus Station
14,Willowdale South,Pizza Place,Japanese Restaurant,Bubble Tea Shop,Coffee Shop,Korean Restaurant,Sushi Restaurant,Café,Grocery Store,Ramen Restaurant,Middle Eastern Restaurant
22,The Beaches,Pub,Park,Japanese Restaurant,Coffee Shop,Breakfast Spot,Beach,Caribbean Restaurant,Tea Room,Bakery,Pharmacy
23,Leaside,Burger Joint,Sporting Goods Shop,Coffee Shop,BBQ Joint,Shopping Mall,Bike Shop,Sports Bar,Supermarket,Pet Store,Sushi Restaurant
24,East Toronto,Café,Greek Restaurant,Coffee Shop,Ethiopian Restaurant,American Restaurant,Gastropub,Bakery,Bar,Gourmet Shop,Donut Shop
25,"The Beaches West, India Bazaar",Indian Restaurant,Beach,Park,Brewery,Café,Coffee Shop,Bakery,Comic Shop,Farmers Market,Fast Food Restaurant
26,North Toronto West,Italian Restaurant,Mexican Restaurant,Café,Coffee Shop,Sporting Goods Shop,Restaurant,Bakery,Garden,Clothing Store,Spa
29,"Cabbagetown, St. James Town",Restaurant,Park,Café,Coffee Shop,Diner,Gastropub,Japanese Restaurant,Thai Restaurant,Italian Restaurant,Rock Club
31,Roselawn,Pharmacy,Italian Restaurant,Bank,Coffee Shop,Sushi Restaurant,Kids Store,Café,Lingerie Store,Gastropub,Bakery
33,Humewood-Cedarvale,Pizza Place,Sushi Restaurant,Park,Coffee Shop,Bank,Frozen Yogurt Shop,Bakery,Gastropub,Restaurant,Spa


##### Cluster 3

In [46]:
print('There are {} neighborhood in Cluster 3.'.format(toronto_merged.loc[toronto_merged['Cluster Labels'] == 2].shape[0]))
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[2] + list(range(6, toronto_merged.shape[1]))]]

There are 3 neighborhood in Cluster 3.


Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
27,"Moore Park, Summerhill East",Park,Italian Restaurant,Restaurant,Café,Grocery Store,Tapas Restaurant,Sushi Restaurant,Bagel Shop,Tea Room,Coffee Shop
28,"Deer Park, Forest Hill SE, Rathnelly, South Hi...",Sushi Restaurant,Italian Restaurant,Park,Coffee Shop,Café,Restaurant,Spa,Liquor Store,Breakfast Spot,Supermarket
35,"Dovercourt Village, Dufferin",Café,Coffee Shop,Park,Bakery,Brewery,Portuguese Restaurant,Italian Restaurant,Sushi Restaurant,Bar,Grocery Store


##### Cluster 4

In [47]:
print('There are {} neighborhood in Cluster 4.'.format(toronto_merged.loc[toronto_merged['Cluster Labels'] == 3].shape[0]))
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[2] + list(range(6, toronto_merged.shape[1]))]]

There are 1 neighborhood in Cluster 4.


Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
40,Canada Post Gateway Processing Centre,Mexican Restaurant,Middle Eastern Restaurant,Burrito Place,Hotel,Asian Restaurant,Paper / Office Supplies Store,Caribbean Restaurant,Breakfast Spot,Steakhouse,Comedy Club


##### Cluster 5

In [48]:
print('There are {} neighborhood in Cluster 5.'.format(toronto_merged.loc[toronto_merged['Cluster Labels'] == 4].shape[0]))
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[2] + list(range(6, toronto_merged.shape[1]))]]

There are 3 neighborhood in Cluster 5.


Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
8,Agincourt,Chinese Restaurant,Spa,Shopping Mall,Caribbean Restaurant,Bakery,Food & Drink Shop,Restaurant,Sri Lankan Restaurant,Pool Hall,Print Shop
10,"Agincourt North, L'Amoreaux East, Milliken, St...",Chinese Restaurant,Pharmacy,Bakery,Pizza Place,Park,Spa,Noodle House,BBQ Joint,Shop & Service,Caribbean Restaurant
11,L'Amoreaux West,Chinese Restaurant,Coffee Shop,Fast Food Restaurant,Clothing Store,Bakery,Gym Pool,Pizza Place,Mobile Phone Shop,Tennis Court,Spa
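
Reading the five tables side by side can be tedious. One possible shortcut, sketched here on a small made-up table (the rows below are hypothetical and only mimic the shape of *toronto_merged*), is to ask which category most often appears as the 1st most common venue within each cluster:

```python
import pandas as pd

# Hypothetical rows with the same two columns as toronto_merged
toy = pd.DataFrame({
    'Cluster Labels': [0, 0, 0, 1, 1, 4],
    '1st Most Common Venue': ['Pharmacy', 'Coffee Shop', 'Pharmacy',
                              'Café', 'Café', 'Chinese Restaurant'],
})

# For each cluster, pick the most frequent top-ranked venue category
top_by_cluster = (
    toy.groupby('Cluster Labels')['1st Most Common Venue']
       .agg(lambda s: s.value_counts().idxmax())
)
print(top_by_cluster)
```

This only looks at the first-ranked venue, so it is a rough summary rather than a full description of what separates the clusters.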
