# IBM Applied Data Science Capstone Project. Assignment 1
# Segmenting and clustering neighborhoods in Toronto

This notebook is my solution to the first assignment in the Coursera IBM Applied Data Science Capstone Project. The purpose is to scrape data from Wikipedia, query Foursquare, segment and cluster the neighborhoods, and create maps with Folium.

## PART A
### Assignment:
For this assignment, you will be required to explore and cluster the neighborhoods in Toronto.

1. Start by creating a new Notebook for this assignment.

2. Use the Notebook to build the code to scrape the following Wikipedia page, https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M, in order to obtain the data that is in the table of postal codes and to transform the data into a pandas dataframe like the one shown below: (shown on Coursera class site).

In [1]:
#import the dependencies that I'll need
import pandas as pd
import numpy as np

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

#for data visualizations
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import json # library to handle JSON files

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe

# map rendering library
import folium 

# for kmeans
from sklearn.cluster import KMeans 

In [2]:
#Read in the data and check how many rows and columns
wiki = pd.read_html('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M', header=0)[0]
print ('The wiki dataframe has', wiki.shape[0], 'rows and', wiki.shape[1], 'columns')

The wiki dataframe has 288 rows and 3 columns


In [3]:
#See the first few rows of the wiki dataframe
wiki.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront


### Assignment:

3. To create the dataframe:

a. The dataframe will consist of three columns: PostalCode, Borough, and Neighborhood

b. Only process the cells that have an assigned borough. Ignore cells with a borough that is Not assigned.

c. More than one neighborhood can exist in one postal code area. For example, in the table on the Wikipedia page, you will notice that M5A is listed twice and has two neighborhoods: Harbourfront and Regent Park. These two rows will be combined into one row with the neighborhoods separated with a comma 

d. If a cell has a borough but a Not assigned neighborhood, then the neighborhood will be the same as the borough. So for the 9th cell in the table on the Wikipedia page, the value of the Borough and the Neighborhood columns will be Queen's Park.

e. Clean your Notebook and add Markdown cells to explain your work and any assumptions you are making.

#### 3a) I already confirmed above that there are indeed 3 columns, titled Postcode, Borough, and Neighbourhood (renamed to Neighborhood in the next step)

#### 3b) drop (ignore) boroughs that are 'Not assigned'. I also correct the spelling of Neighborhood in the dataframe. I then check the shape again.

In [4]:
# 3b) drop (ignore) boroughs that are 'Not assigned'. I also correct the spelling of Neighborhood in the dataframe. I then check the shape again.
wiki.rename(columns={'Neighbourhood':'Neighborhood'}, inplace=True)
wiki = wiki[wiki.Borough != 'Not assigned']
print ('There are now', wiki.shape[0], 'rows and', wiki.shape[1], 'columns in the wiki dataframe.')

There are now 211 rows and 3 columns in the wiki dataframe.


In [5]:
#Check the first few rows again...
wiki.head()

Unnamed: 0,Postcode,Borough,Neighborhood
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront
5,M5A,Downtown Toronto,Regent Park
6,M6A,North York,Lawrence Heights


#### 3c) Combine rows with multiple neighborhoods for a single postalcode
#### Note: I decided to make a separate dataframe for 'borough'. This is so I don't get duplicate values in the 'borough' feature when I combine multiple neighborhoods in the 'neighborhood' feature. I then put them back together again into the same dataframe...

In [6]:
# First, I make a separate 'borough' dataframe

borough = wiki[['Postcode', 'Borough']]
borough.head()

Unnamed: 0,Postcode,Borough
2,M3A,North York
3,M4A,North York
4,M5A,Downtown Toronto
5,M5A,Downtown Toronto
6,M6A,North York


In [7]:
# I then look at the shape (it still has the same 211 rows as the wiki dataframe because postcodes with several neighborhoods appear more than once)
borough.shape

(211, 2)

In [8]:
# I then remove duplicate values for the same postcode. I then look at the shape again.
borough = borough.drop_duplicates(['Postcode'])
borough.shape

(103, 2)

In [9]:
# I now make a 'neighborhood' dataframe containing only the postcode and neighborhood (no borough). I then check the first few rows of the df.
neighborhood = wiki[['Postcode', 'Neighborhood']]
neighborhood.head()

Unnamed: 0,Postcode,Neighborhood
2,M3A,Parkwoods
3,M4A,Victoria Village
4,M5A,Harbourfront
5,M5A,Regent Park
6,M6A,Lawrence Heights


In [10]:
# Finally, I combine the multiple neighborhoods with the single postcodes.

neighborhood = neighborhood.groupby(['Postcode'], sort = False).agg(lambda x : ','.join(x))

In [11]:
# Check the first few rows to confirm it worked
neighborhood.head()

Unnamed: 0_level_0,Neighborhood
Postcode,Unnamed: 1_level_1
M3A,Parkwoods
M4A,Victoria Village
M5A,"Harbourfront,Regent Park"
M6A,"Lawrence Heights,Lawrence Manor"
M7A,Not assigned
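The split into separate `borough` and `neighborhood` dataframes can also be collapsed into a single grouped aggregation. The following is a minimal sketch, assuming a small hand-made frame `sample` that stands in for the filtered `wiki` dataframe (its rows are copied from the outputs above). It keeps the first borough seen for each postcode and joins the neighborhoods with a comma:

```python
import pandas as pd

# Hypothetical stand-in for the filtered `wiki` dataframe
sample = pd.DataFrame({
    'Postcode': ['M3A', 'M4A', 'M5A', 'M5A', 'M6A', 'M6A'],
    'Borough': ['North York', 'North York', 'Downtown Toronto',
                'Downtown Toronto', 'North York', 'North York'],
    'Neighborhood': ['Parkwoods', 'Victoria Village', 'Harbourfront',
                     'Regent Park', 'Lawrence Heights', 'Lawrence Manor'],
})

# One pass: first borough per postcode, neighborhoods joined with commas.
# sort=False keeps postcodes in order of first appearance.
combined = (
    sample.groupby('Postcode', sort=False)
          .agg({'Borough': 'first', 'Neighborhood': ','.join})
          .reset_index()
)
print(combined)
```

Taking the first borough is only safe if every postcode maps to a single borough, which is what the duplicate-dropping step in this notebook also assumes.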


#### 3d) Fill any 'Not assigned' Neighborhoods with the name of the Borough...

In [12]:
# First, I check to see how many there are...

neighborhood[neighborhood.Neighborhood == 'Not assigned']

Unnamed: 0_level_0,Neighborhood
Postcode,Unnamed: 1_level_1
M7A,Not assigned


In [13]:
# Now I need to check my 'borough' dataframe for the borough name of postcode M7A...
borough[borough.Postcode == 'M7A']

Unnamed: 0,Postcode,Borough
8,M7A,Queen's Park


#### I see there is only 1 row with a "not assigned" neighborhood value. So I will just change it individually now.

In [14]:
neighborhood = neighborhood.replace({'Neighborhood': r'Not assigned'}, {'Neighborhood': "Queen's Park"}, regex=True)

#### ... and now I'll check that I've corrected it...

In [15]:
neighborhood.head()

Unnamed: 0_level_0,Neighborhood
Postcode,Unnamed: 1_level_1
M3A,Parkwoods
M4A,Victoria Village
M5A,"Harbourfront,Regent Park"
M6A,"Lawrence Heights,Lawrence Manor"
M7A,Queen's Park


In [16]:
neighborhood[neighborhood.Neighborhood == 'Not assigned']

Unnamed: 0_level_0,Neighborhood
Postcode,Unnamed: 1_level_1
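Hard-coding "Queen's Park" works here because only one row is affected. If more rows were unassigned, a more general approach would copy the borough name wherever the neighborhood is 'Not assigned'. The sketch below assumes a frame that still has both the Borough and Neighborhood columns (so it would have to run before the two are split apart); the `sample` rows are modelled on the outputs above.

```python
import pandas as pd

# Hypothetical rows modelled on the outputs above
sample = pd.DataFrame({
    'Postcode': ['M5A', 'M7A'],
    'Borough': ['Downtown Toronto', "Queen's Park"],
    'Neighborhood': ['Harbourfront,Regent Park', 'Not assigned'],
})

# Boolean mask of rows whose neighborhood is missing
unassigned = sample['Neighborhood'] == 'Not assigned'

# Copy the borough into the neighborhood column for those rows only
sample.loc[unassigned, 'Neighborhood'] = sample.loc[unassigned, 'Borough']
print(sample)
```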


### 3d) In the last cell of your notebook (for PART A), use the .shape method to print the number of rows of your dataframe.

In [17]:
print ('My dataframe has', neighborhood.shape[0], 'rows')

My dataframe has 103 rows


## PART B

In this part of the assignment, I will be loading in a provided dataset with geospatial data to create the dataframe with latitude and longitude columns/values added in.

### Assignment Part B:

1. Use the csv file to create the table with lat and lon included.

#### I'm first loading in the data, looking at the 1st few rows of the dataset, and renaming the column "Postal Code" to "Postcode" to match the neighborhood and borough dataframes.

In [18]:
geo = pd.read_csv('http://cocl.us/Geospatial_data')

In [19]:
geo.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [20]:
geo.rename(columns={'Postal Code':'Postcode'}, inplace=True)
geo.head()

Unnamed: 0,Postcode,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


### I'm now merging the neighborhood and geo dataframes to get the lat and lon into the same dataframe as the postcode and neighborhoods.

In [21]:
neighborhood = pd.merge(neighborhood,geo[['Postcode','Latitude', 'Longitude']],on='Postcode', how='left')
neighborhood.head()

Unnamed: 0,Postcode,Neighborhood,Latitude,Longitude
0,M3A,Parkwoods,43.753259,-79.329656
1,M4A,Victoria Village,43.725882,-79.315572
2,M5A,"Harbourfront,Regent Park",43.65426,-79.360636
3,M6A,"Lawrence Heights,Lawrence Manor",43.718518,-79.464763
4,M7A,Queen's Park,43.662301,-79.389494


In [22]:
# Checking the shape...
neighborhood.shape

(103, 4)
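A left merge keeps every row of the left dataframe, so an unchanged row count does not by itself prove that every postcode found coordinates; unmatched rows simply get NaN. One way to check this, sketched here on toy data, is to use the `validate` and `indicator` parameters of `pd.merge`. The postcode `M9Z` and its neighborhood are made up for illustration.

```python
import pandas as pd

# Toy stand-ins for the `neighborhood` and `geo` dataframes
hoods = pd.DataFrame({'Postcode': ['M3A', 'M4A', 'M9Z'],
                      'Neighborhood': ['Parkwoods', 'Victoria Village', 'Made-up area']})
coords = pd.DataFrame({'Postcode': ['M3A', 'M4A'],
                       'Latitude': [43.753259, 43.725882],
                       'Longitude': [-79.329656, -79.315572]})

# validate raises if a postcode is duplicated on either side;
# indicator adds a `_merge` column saying where each row came from
merged = pd.merge(hoods, coords, on='Postcode', how='left',
                  validate='one_to_one', indicator=True)

# Postcodes that found no coordinates
missing = merged.loc[merged['_merge'] == 'left_only', 'Postcode'].tolist()
print(missing)
```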

### I now need to add back in the borough dataframe as well, so the neighborhood dataframe will now have all the appropriate features included.

## The dataframe below is my answer to PART B

In [23]:
# merge neighborhood and borough df's, sort by Latitude, and look at dataframe. 
neighborhood = pd.merge(neighborhood,borough[['Postcode','Borough']],on='Postcode', how='left')
neighborhood = neighborhood.sort_values(['Latitude'])
neighborhood

Unnamed: 0,Postcode,Neighborhood,Latitude,Longitude,Borough
93,M8W,"Alderwood,Long Branch",43.602414,-79.543484,Etobicoke
88,M8V,"Humber Bay Shores,Mimico South,New Toronto",43.605647,-79.501321,Etobicoke
102,M8Z,"Kingsway Park South West,Mimico NW,The Queensw...",43.628841,-79.520999,Etobicoke
87,M5V,"CN Tower,Bathurst Quay,Island airport,Harbourf...",43.628947,-79.39442,Downtown Toronto
101,M8Y,"Humber Bay,King's Mill Park,Kingsway Park Sout...",43.636258,-79.498509,Etobicoke
43,M6K,"Brockton,Exhibition Place,Parkdale Village",43.636847,-79.428191,West Toronto
76,M7R,Canada Post Gateway Processing Centre,43.636966,-79.615819,Mississauga
36,M5J,"Harbourfront East,Toronto Islands,Union Station",43.640816,-79.381752,Downtown Toronto
17,M9C,"Bloordale Gardens,Eringate,Markland Wood,Old B...",43.643515,-79.577201,Etobicoke
20,M5E,Berczy Park,43.644771,-79.373306,Downtown Toronto


In [24]:
# And its shape...
neighborhood.shape

(103, 5)

## Part C

### Assignment PART C:

1. Explore and cluster the neighborhoods in Toronto. 

You can decide to work with only boroughs that contain the word Toronto and then replicate the same analysis we did to the New York City data. It is up to you.

Just make sure:

A. to add enough Markdown cells to explain what you decided to do and to report any observations you make.

B. to generate maps to visualize your neighborhoods and how they cluster together.

### I decided to work only with boroughs containing the word 'Toronto' and then analyze their venues.

#### I first pull out only the data containing boroughs with the word 'Toronto'

In [25]:
toronto_data = neighborhood[neighborhood['Borough'].str.contains("Toronto")]
toronto_data

Unnamed: 0,Postcode,Neighborhood,Latitude,Longitude,Borough
87,M5V,"CN Tower,Bathurst Quay,Island airport,Harbourf...",43.628947,-79.39442,Downtown Toronto
43,M6K,"Brockton,Exhibition Place,Parkdale Village",43.636847,-79.428191,West Toronto
36,M5J,"Harbourfront East,Toronto Islands,Union Station",43.640816,-79.381752,Downtown Toronto
20,M5E,Berczy Park,43.644771,-79.373306,Downtown Toronto
92,M5W,Stn A PO Boxes 25 The Esplanade,43.646435,-79.374846,Downtown Toronto
42,M5K,"Design Exchange,Toronto Dominion Centre",43.647177,-79.381576,Downtown Toronto
37,M6J,"Little Portugal,Trinity",43.647927,-79.41975,West Toronto
48,M5L,"Commerce Court,Victoria Hotel",43.648198,-79.379817,Downtown Toronto
97,M5X,"First Canadian Place,Underground city",43.648429,-79.38228,Downtown Toronto
75,M6R,"Parkdale,Roncesvalles",43.64896,-79.456325,West Toronto


#### Next, I use the geopy library to get the latitude and longitude values of Toronto.

In [26]:
address = 'Toronto, CA'

geolocator = Nominatim(user_agent="toronto_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.653963, -79.387207.
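`geolocator.geocode` returns `None` when it finds no match, in which case `location.latitude` would raise an `AttributeError`. A small guard can make that case explicit. The sketch below uses a hypothetical helper, `coordinates_or_default`, and a stub `fake_geocode` in place of the real geocoder so that it runs without network access.

```python
from types import SimpleNamespace


def coordinates_or_default(geocode, address, default):
    """Return (latitude, longitude) for address, or `default` if nothing is found."""
    location = geocode(address)
    if location is None:
        return default
    return (location.latitude, location.longitude)


# Stub standing in for geolocator.geocode (an assumption for this sketch)
def fake_geocode(address):
    known = {'Toronto, CA': SimpleNamespace(latitude=43.653963, longitude=-79.387207)}
    return known.get(address)


fallback = (0.0, 0.0)
print(coordinates_or_default(fake_geocode, 'Toronto, CA', fallback))
print(coordinates_or_default(fake_geocode, 'Nowhere at all', fallback))
```

With the real geocoder, the call would be `coordinates_or_default(geolocator.geocode, address, fallback)`.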


#### I now create a map of Toronto with the neighborhoods superimposed on top.

In [27]:
# create map of Toronto using latitude and longitude values
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
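# loop variables are named so they do not overwrite the `borough` and `neighborhood` dataframes
for lat, lng, borough_name, neighborhood_name in zip(toronto_data['Latitude'], toronto_data['Longitude'], toronto_data['Borough'], toronto_data['Neighborhood']):
    label = '{}, {}'.format(neighborhood_name, borough_name)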
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

In [28]:
#### I first define my Foursquare credentials and version (which I've hidden/deleted)

#### I then look at the different neighborhoods that are in boroughs containing the word 'Toronto'

In [30]:
toronto_data.loc[:, 'Neighborhood']

87     CN Tower,Bathurst Quay,Island airport,Harbourf...
43            Brockton,Exhibition Place,Parkdale Village
36       Harbourfront East,Toronto Islands,Union Station
20                                           Berczy Park
92                       Stn A PO Boxes 25 The Esplanade
42               Design Exchange,Toronto Dominion Centre
37                               Little Portugal,Trinity
48                         Commerce Court,Victoria Hotel
97                 First Canadian Place,Underground city
75                                 Parkdale,Roncesvalles
30                                Adelaide,King,Richmond
15                                        St. James Town
81                                     Runnymede,Swansea
84               Chinatown,Grange Park,Kensington Market
2                               Harbourfront,Regent Park
9                                Ryerson,Garden District
24                                    Central Bay Street
54                             

In [31]:
#### I decided to look at venues within the neighborhood of 'St. James Town'. 
#### I first need to find the latitude and longitude for St. James Town...

In [32]:
neighborhood_latitude = toronto_data.loc[15, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = toronto_data.loc[15, 'Longitude'] # neighborhood longitude value

neighborhood_name = toronto_data.loc[15, 'Neighborhood'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of St. James Town are 43.6514939, -79.3754179.


#### Now, I'll get the top 100 venues that are in St. James Town within a radius of 500 meters.

#### First, I create the GET request URL and store it in **url**.

In [33]:
# build the explore request URL for St. James Town
LIMIT = 100
radius = 500
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius,
    LIMIT)
url # display URL


'https://api.foursquare.com/v2/venues/explore?&client_id=<CLIENT_ID>&client_secret=<CLIENT_SECRET>&v=20180605&ll=43.6514939,-79.3754179&radius=500&limit=100'
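Displaying the URL also displays the credentials embedded in it. One way to reduce that risk, sketched below, is to read the credentials from environment variables, build the query string with `urllib.parse.urlencode`, and mask the secrets before printing. The environment variable names, the placeholder defaults, and the helper names `build_explore_url` and `redact` are assumptions made for this sketch.

```python
import os
from urllib.parse import urlencode

# Assumed environment variable names; the defaults are placeholders only
CLIENT_ID = os.environ.get('FOURSQUARE_CLIENT_ID', 'your-client-id')
CLIENT_SECRET = os.environ.get('FOURSQUARE_CLIENT_SECRET', 'your-client-secret')


def build_explore_url(lat, lng, radius=500, limit=100, version='20180605'):
    """Build the venues/explore URL from a dict of query parameters."""
    params = {
        'client_id': CLIENT_ID,
        'client_secret': CLIENT_SECRET,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    return 'https://api.foursquare.com/v2/venues/explore?' + urlencode(params)


def redact(url):
    """Mask the credentials so the URL can be displayed safely."""
    return url.replace(CLIENT_ID, '***').replace(CLIENT_SECRET, '***')


print(redact(build_explore_url(43.6514939, -79.3754179)))
```

`urlencode` percent-encodes the comma in `ll` as `%2C`, which is the standard encoding of the same value.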

#### I now send the GET request 

In [34]:
results = requests.get(url).json()

#### All the information I need is in the *items* key. Before I proceed, I'll borrow the **get_category_type** function from the Foursquare tutorial.

In [35]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:  # the flattened dataframe uses the 'venue.categories' key instead
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

#### Now I'll clean the JSON and structure it into a *pandas* dataframe. I'll then check how many venues were returned by Foursquare.

In [36]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

Unnamed: 0,name,categories,lat,lng
0,Terroni,Italian Restaurant,43.650927,-79.375602
1,Gyu-Kaku Japanese BBQ,Japanese Restaurant,43.651422,-79.375047
2,Crepe TO,Creperie,43.650063,-79.374587
3,GEORGE Restaurant,Restaurant,43.653346,-79.374445
4,Triple A Bar (AAA),BBQ Joint,43.651658,-79.37272


In [37]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

100 venues were returned by Foursquare.


### Exploring Neighborhoods in Toronto

#### I'll first create a function to repeat the same process for all the neighborhoods in Toronto

In [38]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
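The list comprehension in `getNearbyVenues` indexes `categories[0]` directly, which would raise an `IndexError` for a venue that has no category. The sketch below shows a more defensive version of that extraction step. The helper name `venue_rows` is hypothetical, and the `items` list is hand-written to imitate the response shape used above rather than taken from a real API call.

```python
def venue_rows(name, lat, lng, items):
    """Turn the 'items' list of one explore response into flat tuples."""
    rows = []
    for item in items:
        venue = item['venue']
        # Fall back to None when the venue has no category at all
        categories = venue.get('categories') or []
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng, venue['name'],
                     venue['location']['lat'], venue['location']['lng'],
                     category))
    return rows


# Hand-written items imitating the response shape used above
items = [
    {'venue': {'name': 'Terroni',
               'location': {'lat': 43.650927, 'lng': -79.375602},
               'categories': [{'name': 'Italian Restaurant'}]}},
    {'venue': {'name': 'Uncategorised place',
               'location': {'lat': 43.65, 'lng': -79.37},
               'categories': []}},
]

rows = venue_rows('St. James Town', 43.6514939, -79.3754179, items)
for row in rows:
    print(row)
```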

#### Now I'll write the code to run the above function on each neighborhood and create a new dataframe called *toronto_venues*.

In [39]:
toronto_venues = getNearbyVenues(names=toronto_data['Neighborhood'],
                                   latitudes=toronto_data['Latitude'],
                                   longitudes=toronto_data['Longitude']
                                  )

CN Tower,Bathurst Quay,Island airport,Harbourfront West,King and Spadina,Railway Lands,South Niagara
Brockton,Exhibition Place,Parkdale Village
Harbourfront East,Toronto Islands,Union Station
Berczy Park
Stn A PO Boxes 25 The Esplanade
Design Exchange,Toronto Dominion Centre
Little Portugal,Trinity
Commerce Court,Victoria Hotel
First Canadian Place,Underground city
Parkdale,Roncesvalles
Adelaide,King,Richmond
St. James Town
Runnymede,Swansea
Chinatown,Grange Park,Kensington Market
Harbourfront,Regent Park
Ryerson,Garden District
Central Bay Street
Studio District
High Park,The Junction South
Harbord,University of Toronto
Business Reply Mail Processing Centre 969 Eastern
Church and Wellesley
Cabbagetown,St. James Town
The Beaches West,India Bazaar
Dovercourt Village,Dufferin
Christie
The Annex,North Midtown,Yorkville
The Beaches
The Danforth West,Riverdale
Rosedale
Deer Park,Forest Hill SE,Rathnelly,South Hill,Summerhill West
Moore Park,Summerhill East
Forest Hill North,Forest Hill West

In [40]:
print(toronto_venues.shape)
toronto_venues.head()

(1721, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"CN Tower,Bathurst Quay,Island airport,Harbourf...",43.628947,-79.39442,Billy Bishop Toronto City Airport (YTZ) (Billy...,43.631585,-79.395643,Airport
1,"CN Tower,Bathurst Quay,Island airport,Harbourf...",43.628947,-79.39442,Porter Lounge,43.63068,-79.395756,Airport Lounge
2,"CN Tower,Bathurst Quay,Island airport,Harbourf...",43.628947,-79.39442,Toronto Harbour,43.633045,-79.396484,Harbor / Marina
3,"CN Tower,Bathurst Quay,Island airport,Harbourf...",43.628947,-79.39442,Billy Bishop Café,43.631132,-79.396139,Airport Food Court
4,"CN Tower,Bathurst Quay,Island airport,Harbourf...",43.628947,-79.39442,Air Canada Check-In Counter,43.631226,-79.395987,Airport Terminal


#### Let's check how many venues were returned for each neighborhood

In [41]:
toronto_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
"Adelaide,King,Richmond",100,100,100,100,100,100
Berczy Park,57,57,57,57,57,57
"Brockton,Exhibition Place,Parkdale Village",22,22,22,22,22,22
Business Reply Mail Processing Centre 969 Eastern,19,19,19,19,19,19
"CN Tower,Bathurst Quay,Island airport,Harbourfront West,King and Spadina,Railway Lands,South Niagara",15,15,15,15,15,15
"Cabbagetown,St. James Town",50,50,50,50,50,50
Central Bay Street,86,86,86,86,86,86
"Chinatown,Grange Park,Kensington Market",100,100,100,100,100,100
Christie,16,16,16,16,16,16
Church and Wellesley,89,89,89,89,89,89


#### How many unique categories are there?

In [42]:
print('There are {} unique categories.'.format(len(toronto_venues['Venue Category'].unique())))

There are 236 unique categories.


### Analyzing each neighborhood...

In [43]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
toronto_onehot['Neighborhood'] = toronto_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.head()

Unnamed: 0,Yoga Studio,Adult Boutique,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Aquarium,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Workshop,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Stadium,Basketball Stadium,Beach,Beer Bar,Beer Store,Belgian Restaurant,Bistro,Boat or Ferry,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Line,Butcher,Café,Cajun / Creole Restaurant,Camera Store,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,College Gym,College Rec Center,Colombian Restaurant,Comfort Food Restaurant,Comic Shop,Concert Hall,Convenience Store,Cosmetics Shop,Costume Shop,Coworking Space,Creperie,Cuban Restaurant,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dive Bar,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flea Market,Food,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gastropub,Gay Bar,General Entertainment,General Travel,German Restaurant,Gift Shop,Gluten-free Restaurant,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Hobby Shop,Hookah Bar,Hospital,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Jewish Restaurant,Juice Bar,Korean Restaurant,Lake,Latin American Restaurant,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Mac & Cheese Joint,Malay Restaurant,Market,Martial Arts Dojo,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Movie Theater,Museum,Music Store,Music Venue,Neighborhood,New American Restaurant,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Park,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Plane,Playground,Plaza,Poke Place,Pool,Portuguese Restaurant,Poutine Place,Pub,Ramen Restaurant,Record Shop,Recording Studio,Rental Car Location,Restaurant,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Skate Park,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,Soup Place,South American Restaurant,Southern / Soul Food Restaurant,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Stationery Store,Steakhouse,Strip Club,Supermarket,Sushi Restaurant,Swim School,Taco Place,Tailor Shop,Taiwanese Restaurant,Tanning Salon,Tapas Restaurant,Tea Room,Thai Restaurant,Theater,Theme Restaurant,Thrift / Vintage Store,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Wings Joint
0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"CN Tower,Bathurst Quay,Island airport,Harbourf...",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"CN Tower,Bathurst Quay,Island airport,Harbourf...",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"CN Tower,Bathurst Quay,Island airport,Harbourf...",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"CN Tower,Bathurst Quay,Island airport,Harbourf...",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"CN Tower,Bathurst Quay,Island airport,Harbourf...",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [44]:
toronto_onehot.shape

(1721, 236)
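The `toronto_onehot` header above starts with `Yoga Studio` and lists `Neighborhood` between `Music Venue` and `New American Restaurant`, and the shape is (1721, 236), the same as the number of unique categories. That pattern is consistent with one of the venue categories itself being called "Neighborhood": assigning `toronto_onehot['Neighborhood']` then overwrites that dummy column in place instead of appending a new one, and the "move last column to front" step moves `Yoga Studio`. The later grouping still works because it selects the column by name, but the "Neighborhood" category is lost as a feature. One possible way to avoid the clash, sketched on toy data, is to rename the colliding dummy column and insert the label column explicitly at position 0. The name `Neighborhood (category)` is an arbitrary choice for this sketch.

```python
import pandas as pd

# Toy venues, including one whose category is literally 'Neighborhood'
venues = pd.DataFrame({
    'Neighborhood': ['Berczy Park', 'Berczy Park', 'Christie'],
    'Venue Category': ['Coffee Shop', 'Neighborhood', 'Yoga Studio'],
})

onehot = pd.get_dummies(venues['Venue Category'])

# Rename the dummy column that clashes with the label column
onehot = onehot.rename(columns={'Neighborhood': 'Neighborhood (category)'})

# Insert the label column at the front; insert() raises if the name already exists
onehot.insert(0, 'Neighborhood', venues['Neighborhood'])
print(onehot.columns.tolist())
```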

#### Now, I'll group rows by neighborhood, taking the mean of the frequency of occurrence of each category

In [45]:
toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()
toronto_grouped

Unnamed: 0,Neighborhood,Yoga Studio,Adult Boutique,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Aquarium,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Workshop,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Stadium,Basketball Stadium,Beach,Beer Bar,Beer Store,Belgian Restaurant,Bistro,Boat or Ferry,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Line,Butcher,Café,Cajun / Creole Restaurant,Camera Store,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,College Gym,College Rec Center,Colombian Restaurant,Comfort Food Restaurant,Comic Shop,Concert Hall,Convenience Store,Cosmetics Shop,Costume Shop,Coworking Space,Creperie,Cuban Restaurant,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dive Bar,Dog Run,Doner Restaurant,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flea Market,Food,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gastropub,Gay Bar,General Entertainment,General Travel,German Restaurant,Gift Shop,Gluten-free Restaurant,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Hobby Shop,Hookah Bar,Hospital,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Jewish Restaurant,Juice Bar,Korean Restaurant,Lake,Latin American Restaurant,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Mac & Cheese Joint,Malay Restaurant,Market,Martial Arts Dojo,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Movie Theater,Museum,Music Store,Music Venue,New American Restaurant,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Park,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Plane,Playground,Plaza,Poke Place,Pool,Portuguese Restaurant,Poutine Place,Pub,Ramen Restaurant,Record Shop,Recording Studio,Rental Car Location,Restaurant,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Skate Park,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,Soup Place,South American Restaurant,Southern / Soul Food Restaurant,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Stationery Store,Steakhouse,Strip Club,Supermarket,Sushi Restaurant,Swim School,Taco Place,Tailor Shop,Taiwanese Restaurant,Tanning Salon,Tapas Restaurant,Tea Room,Thai Restaurant,Theater,Theme Restaurant,Thrift / Vintage Store,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Wings Joint
0,"Adelaide,King,Richmond",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.01,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.02,0.0,0.0,0.01,0.03,0.01,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.06,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.03,0.0,0.02,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0
1,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.017544,0.035088,0.0,0.0,0.0,0.017544,0.017544,0.017544,0.0,0.017544,0.017544,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035088,0.0,0.0,0.0,0.035088,0.0,0.0,0.0,0.0,0.017544,0.052632,0.087719,0.0,0.0,0.0,0.0,0.017544,0.0,0.017544,0.0,0.017544,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035088,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.017544,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.017544,0.017544,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035088,0.0,0.0,0.0,0.0,0.035088,0.0,0.0,0.0,0.0,0.0,0.0,0.035088,0.0,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035088,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.017544,0.017544,0.0,0.0,0.0,0.0,0.0,0.0,0.017544,0.0,0.0,0.0,0.0,0.0
2,"Brockton,Exhibition Place,Parkdale Village",0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.090909,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.045455,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Business Reply Mail Processing Centre 969 Eastern,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.105263,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"CN Tower,Bathurst Quay,Island airport,Harbourf...",0.0,0.0,0.0,0.066667,0.066667,0.066667,0.133333,0.2,0.133333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,"Cabbagetown,St. James Town",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.02,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.04,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.0,0.0,0.04,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.02,0.02,0.06,0.0,0.02,0.02,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.06,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Central Bay Street,0.011628,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011628,0.0,0.0,0.0,0.0,0.011628,0.011628,0.0,0.0,0.0,0.0,0.0,0.0,0.011628,0.0,0.034884,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011628,0.0,0.0,0.0,0.0,0.034884,0.0,0.034884,0.0,0.0,0.0,0.034884,0.0,0.0,0.011628,0.0,0.023256,0.0,0.0,0.0,0.011628,0.0,0.151163,0.0,0.0,0.0,0.0,0.0,0.011628,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011628,0.011628,0.0,0.011628,0.011628,0.0,0.0,0.0,0.011628,0.0,0.0,0.0,0.0,0.0,0.011628,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011628,0.011628,0.0,0.011628,0.0,0.0,0.0,0.011628,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011628,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034884,0.023256,0.0,0.0,0.0,0.046512,0.023256,0.0,0.0,0.0,0.011628,0.011628,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.023256,0.011628,0.011628,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011628,0.0,0.0,0.0,0.0,0.011628,0.0,0.0,0.0,0.011628,0.0,0.0,0.0,0.011628,0.0,0.011628,0.0,0.0,0.011628,0.0,0.0,0.0,0.0,0.0,0.023256,0.0,0.023256,0.0,0.0,0.011628,0.0,0.0,0.0,0.0,0.0,0.011628,0.0,0.0,0.0,0.0,0.023256,0.0,0.0,0.0,0.0,0.0,0.011628,0.0,0.0,0.023256,0.0,0.0,0.0,0.0,0.0,0.0,0.011628,0.023256,0.0,0.0,0.0,0.0,0.0,0.0,0.011628,0.0,0.0,0.011628,0.0,0.0
7,"Chinatown,Grange Park,Kensington Market",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.04,0.0,0.06,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.02,0.01,0.0,0.0,0.07,0.0,0.0,0.02,0.01,0.03,0.0,0.0,0.0,0.0,0.02,0.04,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.04,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.06,0.0,0.05,0.01,0.0,0.0
8,Christie,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1875,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1875,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Church and Wellesley,0.022472,0.011236,0.011236,0.0,0.0,0.0,0.0,0.0,0.0,0.011236,0.0,0.0,0.0,0.0,0.0,0.011236,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011236,0.0,0.0,0.011236,0.0,0.022472,0.0,0.033708,0.011236,0.0,0.0,0.022472,0.0,0.0,0.011236,0.0,0.011236,0.0,0.0,0.0,0.0,0.0,0.067416,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011236,0.0,0.0,0.011236,0.0,0.0,0.0,0.0,0.011236,0.0,0.0,0.011236,0.0,0.0,0.0,0.0,0.0,0.011236,0.0,0.0,0.0,0.022472,0.0,0.0,0.0,0.0,0.0,0.011236,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022472,0.044944,0.011236,0.0,0.0,0.0,0.0,0.0,0.0,0.011236,0.022472,0.0,0.0,0.011236,0.0,0.0,0.0,0.011236,0.0,0.0,0.0,0.011236,0.0,0.011236,0.011236,0.0,0.0,0.0,0.011236,0.067416,0.0,0.0,0.0,0.011236,0.0,0.0,0.0,0.0,0.0,0.011236,0.0,0.0,0.0,0.0,0.0,0.022472,0.022472,0.011236,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011236,0.0,0.0,0.0,0.0,0.0,0.0,0.011236,0.0,0.0,0.011236,0.011236,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022472,0.011236,0.0,0.0,0.0,0.033708,0.011236,0.0,0.011236,0.0,0.0,0.011236,0.011236,0.0,0.0,0.0,0.0,0.011236,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.011236,0.0,0.0,0.011236,0.011236,0.0,0.044944,0.0,0.0,0.0,0.0,0.0,0.0,0.011236,0.0,0.011236,0.011236,0.0,0.0,0.0,0.0,0.0,0.011236,0.011236,0.0,0.0,0.011236


#### ...and let's look at the shape now

In [46]:
toronto_grouped.shape

(38, 236)

#### Now, I'll print each neighborhood along with the top 5 most common venues

In [47]:
num_top_venues = 5

for hood in toronto_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = toronto_grouped[toronto_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Adelaide,King,Richmond----
                 venue  freq
0          Coffee Shop  0.06
1                 Café  0.05
2           Steakhouse  0.04
3  American Restaurant  0.04
4      Thai Restaurant  0.04


----Berczy Park----
                venue  freq
0         Coffee Shop  0.09
1        Cocktail Bar  0.05
2      Farmers Market  0.04
3              Bakery  0.04
4  Seafood Restaurant  0.04


----Brockton,Exhibition Place,Parkdale Village----
                venue  freq
0      Breakfast Spot  0.09
1                Café  0.09
2         Coffee Shop  0.09
3  Italian Restaurant  0.05
4       Burrito Place  0.05


----Business Reply Mail Processing Centre 969 Eastern----
                venue  freq
0  Light Rail Station  0.11
1         Yoga Studio  0.05
2         Pizza Place  0.05
3             Brewery  0.05
4          Smoke Shop  0.05


----CN Tower,Bathurst Quay,Island airport,Harbourfront West,King and Spadina,Railway Lands,South Niagara----
              venue  freq
0   Airport Service

#### Now, I'll put that into a *pandas* dataframe

#### First, I'll write a function to sort the venues in descending order.

In [48]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
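
To make the behaviour of this helper concrete, here is a small sketch that runs the same function on a made-up row. The neighborhood name `Toyville` and the frequencies are invented for illustration; only the function body comes from the cell above.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    # Skip the first entry (the neighborhood name), keep only the frequencies
    row_categories = row.iloc[1:]
    # Sort the venue categories from most to least frequent
    row_categories_sorted = row_categories.sort_values(ascending=False)
    # The index holds the category names, so return the first N of them
    return row_categories_sorted.index.values[0:num_top_venues]

# Hypothetical row shaped like one row of toronto_grouped
toy_row = pd.Series({
    'Neighborhood': 'Toyville',
    'Café': 0.10,
    'Park': 0.30,
    'Bakery': 0.05,
    'Pub': 0.20,
})

top3 = list(return_most_common_venues(toy_row, 3))
print(top3)  # ['Park', 'Pub', 'Café']
```

Note that the function relies on the neighborhood name being the first entry of the row, which is why it slices with `iloc[1:]`.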

#### Now I'll create the new dataframe and display the top 10 venues for each neighborhood.

In [49]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # only 'st', 'nd', 'rd' are listed; every later rank uses 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Adelaide,King,Richmond",Coffee Shop,Café,Steakhouse,American Restaurant,Thai Restaurant,Hotel,Sushi Restaurant,Bakery,Bar,Asian Restaurant
1,Berczy Park,Coffee Shop,Cocktail Bar,Pub,Cheese Shop,Restaurant,Bakery,Farmers Market,Steakhouse,Café,Seafood Restaurant
2,"Brockton,Exhibition Place,Parkdale Village",Breakfast Spot,Café,Coffee Shop,Yoga Studio,Stadium,Burrito Place,Restaurant,Caribbean Restaurant,Climbing Gym,Pet Store
3,Business Reply Mail Processing Centre 969 Eastern,Light Rail Station,Yoga Studio,Recording Studio,Smoke Shop,Brewery,Spa,Farmers Market,Fast Food Restaurant,Burrito Place,Restaurant
4,"CN Tower,Bathurst Quay,Island airport,Harbourf...",Airport Service,Airport Lounge,Airport Terminal,Harbor / Marina,Sculpture Garden,Boutique,Boat or Ferry,Plane,Airport Gate,Airport
5,"Cabbagetown,St. James Town",Coffee Shop,Restaurant,Pizza Place,Italian Restaurant,Café,Pub,Bakery,Park,Farmers Market,Japanese Restaurant
6,Central Bay Street,Coffee Shop,Italian Restaurant,Bubble Tea Shop,Burger Joint,Ice Cream Shop,Bar,Café,Sushi Restaurant,Middle Eastern Restaurant,Sandwich Place
7,"Chinatown,Grange Park,Kensington Market",Café,Vegetarian / Vegan Restaurant,Bar,Vietnamese Restaurant,Coffee Shop,Mexican Restaurant,Bakery,Dumpling Restaurant,Chinese Restaurant,Dessert Shop
8,Christie,Grocery Store,Café,Park,Diner,Athletics & Sports,Baby Store,Restaurant,Nightclub,Coffee Shop,Convenience Store
9,Church and Wellesley,Coffee Shop,Japanese Restaurant,Gay Bar,Sushi Restaurant,Restaurant,Burger Joint,Yoga Studio,Pub,Men's Store,Mediterranean Restaurant
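
As an alternative to filling the dataframe row by row, the same kind of table could probably be built in one step with `numpy.argsort`. This is only a sketch on invented data: `toy_grouped`, `n_top` and the `Top N` column names are assumptions for illustration, and ties may come out in a different order than in the loop above.

```python
import numpy as np
import pandas as pd

# Hypothetical frequency table shaped like toronto_grouped
toy_grouped = pd.DataFrame({
    'Neighborhood': ['A', 'B'],
    'Café': [0.5, 0.1],
    'Park': [0.2, 0.6],
    'Pub': [0.3, 0.3],
})

n_top = 2
venue_cols = toy_grouped.columns[1:]        # every column except 'Neighborhood'
freqs = toy_grouped[venue_cols].to_numpy()  # rows = neighborhoods, columns = venues

# Sorting the negated frequencies gives descending order within each row
order = np.argsort(-freqs, axis=1, kind='stable')[:, :n_top]

# Map the column positions back to venue category names
top_names = venue_cols.to_numpy()[order]

toy_sorted = pd.DataFrame(
    top_names,
    columns=['Top {}'.format(i + 1) for i in range(n_top)],
)
toy_sorted.insert(0, 'Neighborhood', toy_grouped['Neighborhood'])
print(toy_sorted)
```

For 38 neighborhoods the loop is fast enough, so this is mostly a matter of taste.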


## 4. Cluster Neighborhoods

#### I'll now run *k*-means to cluster the neighborhoods into 5 clusters.

In [50]:
# set number of clusters
kclusters = 5

# drop the name column so only the numeric venue frequencies are clustered
toronto_grouped_clustering = toronto_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0], dtype=int32)
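
Since the first ten labels are all 0, it seems worth checking how many neighborhoods land in each cluster. Below is a minimal sketch on invented data; the array `X` and the name `toy_kmeans` are assumptions for illustration. In this notebook the equivalent check would presumably be `np.bincount(kmeans.labels_)`.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical venue-frequency rows: three look alike, two look different
X = np.array([
    [0.90, 0.10, 0.00],
    [0.80, 0.20, 0.00],
    [0.85, 0.15, 0.00],
    [0.00, 0.10, 0.90],
    [0.00, 0.20, 0.80],
])

# n_init is set explicitly because its default differs between scikit-learn versions
toy_kmeans = KMeans(n_clusters=2, random_state=0, n_init=10).fit(X)

# Count how many rows were assigned to each cluster label
cluster_sizes = np.bincount(toy_kmeans.labels_, minlength=2)
print(cluster_sizes)
```

A very uneven split, with one large cluster and several clusters of one or two rows, is a hint that a different number of clusters might be worth trying.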

#### I'll now create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [51]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

toronto_merged = toronto_data

# join the sorted venues (with their cluster labels) onto toronto_data, which holds latitude/longitude for each neighborhood
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

toronto_merged # check the last columns!

Unnamed: 0,Postcode,Neighborhood,Latitude,Longitude,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
87,M5V,"CN Tower,Bathurst Quay,Island airport,Harbourf...",43.628947,-79.39442,Downtown Toronto,0,Airport Service,Airport Lounge,Airport Terminal,Harbor / Marina,Sculpture Garden,Boutique,Boat or Ferry,Plane,Airport Gate,Airport
43,M6K,"Brockton,Exhibition Place,Parkdale Village",43.636847,-79.428191,West Toronto,0,Breakfast Spot,Café,Coffee Shop,Yoga Studio,Stadium,Burrito Place,Restaurant,Caribbean Restaurant,Climbing Gym,Pet Store
36,M5J,"Harbourfront East,Toronto Islands,Union Station",43.640816,-79.381752,Downtown Toronto,0,Coffee Shop,Aquarium,Hotel,Café,Italian Restaurant,Scenic Lookout,Fried Chicken Joint,Bakery,Brewery,Pizza Place
20,M5E,Berczy Park,43.644771,-79.373306,Downtown Toronto,0,Coffee Shop,Cocktail Bar,Pub,Cheese Shop,Restaurant,Bakery,Farmers Market,Steakhouse,Café,Seafood Restaurant
92,M5W,Stn A PO Boxes 25 The Esplanade,43.646435,-79.374846,Downtown Toronto,0,Coffee Shop,Café,Restaurant,Seafood Restaurant,Pub,Hotel,Italian Restaurant,Cocktail Bar,Breakfast Spot,Japanese Restaurant
42,M5K,"Design Exchange,Toronto Dominion Centre",43.647177,-79.381576,Downtown Toronto,0,Coffee Shop,Café,Hotel,Restaurant,American Restaurant,Deli / Bodega,Gastropub,Italian Restaurant,Pizza Place,Burger Joint
37,M6J,"Little Portugal,Trinity",43.647927,-79.41975,West Toronto,0,Bar,Coffee Shop,Asian Restaurant,Bakery,Men's Store,Vietnamese Restaurant,Restaurant,New American Restaurant,Cocktail Bar,Café
48,M5L,"Commerce Court,Victoria Hotel",43.648198,-79.379817,Downtown Toronto,0,Coffee Shop,Café,Hotel,Restaurant,American Restaurant,Bakery,Seafood Restaurant,Gastropub,Deli / Bodega,Steakhouse
97,M5X,"First Canadian Place,Underground city",43.648429,-79.38228,Downtown Toronto,0,Coffee Shop,Café,Hotel,Restaurant,American Restaurant,Bar,Bakery,Deli / Bodega,Gastropub,Burger Joint
75,M6R,"Parkdale,Roncesvalles",43.64896,-79.456325,West Toronto,0,Breakfast Spot,Gift Shop,Burger Joint,Bar,Italian Restaurant,Restaurant,Bookstore,Movie Theater,Dessert Shop,Bank
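
One thing to keep in mind with this join: if any row of `toronto_data` has no match in `neighborhoods_venues_sorted` (for example a neighborhood for which no venues were returned), its cluster label becomes `NaN` and the whole column turns into floats. The sketch below shows that behaviour on invented data; `toy_data`, `toy_sorted` and neighborhood `C` are assumptions for illustration, not part of the notebook.

```python
import pandas as pd

# Hypothetical location table: three neighborhoods
toy_data = pd.DataFrame({
    'Neighborhood': ['A', 'B', 'C'],
    'Latitude': [43.60, 43.70, 43.80],
})

# Hypothetical venue table: neighborhood C is missing
toy_sorted = pd.DataFrame({
    'Neighborhood': ['A', 'B'],
    'Cluster Labels': [0, 1],
    '1st Most Common Venue': ['Café', 'Park'],
})

# Same join pattern as in the cell above (a left join on 'Neighborhood')
toy_merged = toy_data.join(toy_sorted.set_index('Neighborhood'), on='Neighborhood')

# Rows without a match get NaN, which also forces the label column to float
missing = toy_merged['Cluster Labels'].isna().sum()
print(missing, toy_merged['Cluster Labels'].dtype)

# One possible cleanup: drop unmatched rows and restore integer labels
toy_clean = toy_merged.dropna(subset=['Cluster Labels']).copy()
toy_clean['Cluster Labels'] = toy_clean['Cluster Labels'].astype(int)
print(toy_clean)
```

Integer labels matter later on, because they are used to index into the list of marker colors.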


#### Finally, I'll visualize the resulting clusters

In [52]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11.5)

# set color scheme for the clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighborhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
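
The color lookup can be sketched on its own, without the map. The cell above indexes the palette with `rainbow[cluster-1]`, so label 0 wraps around to the last color; all five clusters still get distinct colors. The variant below indexes by the label directly, which may be easier to read. The name `cluster_to_color` is an assumption for illustration.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 5

# Sample kclusters evenly spaced RGBA colors from the rainbow colormap
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))

# Convert each RGBA row to a hex string such as '#rrggbb'
rainbow = [colors.rgb2hex(rgba) for rgba in colors_array]

# Map each cluster label (0 .. kclusters-1) straight to its color
cluster_to_color = {cluster: rainbow[cluster] for cluster in range(kclusters)}
print(cluster_to_color)
```

With this mapping, a marker for a given neighborhood would use `cluster_to_color[cluster]` for both `color` and `fill_color`.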

### Examine Clusters

#### Now I can examine each cluster and determine the venue categories that distinguish it. Based on those defining categories, I can then assign a name to each cluster.

#### Cluster 1 (Urban; Business)

In [53]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
87,"CN Tower,Bathurst Quay,Island airport,Harbourf...",0,Airport Service,Airport Lounge,Airport Terminal,Harbor / Marina,Sculpture Garden,Boutique,Boat or Ferry,Plane,Airport Gate,Airport
43,"Brockton,Exhibition Place,Parkdale Village",0,Breakfast Spot,Café,Coffee Shop,Yoga Studio,Stadium,Burrito Place,Restaurant,Caribbean Restaurant,Climbing Gym,Pet Store
36,"Harbourfront East,Toronto Islands,Union Station",0,Coffee Shop,Aquarium,Hotel,Café,Italian Restaurant,Scenic Lookout,Fried Chicken Joint,Bakery,Brewery,Pizza Place
20,Berczy Park,0,Coffee Shop,Cocktail Bar,Pub,Cheese Shop,Restaurant,Bakery,Farmers Market,Steakhouse,Café,Seafood Restaurant
92,Stn A PO Boxes 25 The Esplanade,0,Coffee Shop,Café,Restaurant,Seafood Restaurant,Pub,Hotel,Italian Restaurant,Cocktail Bar,Breakfast Spot,Japanese Restaurant
42,"Design Exchange,Toronto Dominion Centre",0,Coffee Shop,Café,Hotel,Restaurant,American Restaurant,Deli / Bodega,Gastropub,Italian Restaurant,Pizza Place,Burger Joint
37,"Little Portugal,Trinity",0,Bar,Coffee Shop,Asian Restaurant,Bakery,Men's Store,Vietnamese Restaurant,Restaurant,New American Restaurant,Cocktail Bar,Café
48,"Commerce Court,Victoria Hotel",0,Coffee Shop,Café,Hotel,Restaurant,American Restaurant,Bakery,Seafood Restaurant,Gastropub,Deli / Bodega,Steakhouse
97,"First Canadian Place,Underground city",0,Coffee Shop,Café,Hotel,Restaurant,American Restaurant,Bar,Bakery,Deli / Bodega,Gastropub,Burger Joint
75,"Parkdale,Roncesvalles",0,Breakfast Spot,Gift Shop,Burger Joint,Bar,Italian Restaurant,Restaurant,Bookstore,Movie Theater,Dessert Shop,Bank


#### Cluster 2 (Suburban; residential)

In [54]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
83,"Moore Park,Summerhill East",1,Gym,Playground,Grocery Store,Event Space,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant,Dog Run


#### Cluster 3 ('rural'; suburban; residential)

In [55]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
62,Roselawn,2,Garden,Wings Joint,Department Store,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant,Dog Run


#### Cluster 4 (Park, suburban, residential)

In [56]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
91,Rosedale,3,Park,Playground,Trail,Wings Joint,Department Store,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant
68,"Forest Hill North,Forest Hill West",3,Park,Trail,Jewelry Store,Sushi Restaurant,Wings Joint,Dessert Shop,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop


#### Cluster 5 (Urban; Mixed Business/residential)

In [57]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
61,Lawrence Park,4,Bus Line,Park,Swim School,Wings Joint,Electronics Store,Eastern European Restaurant,Dumpling Restaurant,Donut Shop,Doner Restaurant,Dog Run
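
To support the cluster names with a quick tally rather than reading each table by eye, one option is to count the most frequent top venue per cluster. This is only a sketch on invented data: `toy_merged` mimics two columns of `toronto_merged`, and the values are made up.

```python
import pandas as pd

# Hypothetical slice of toronto_merged: cluster label and top venue per neighborhood
toy_merged = pd.DataFrame({
    'Cluster Labels': [0, 0, 0, 1, 3, 3],
    '1st Most Common Venue': ['Coffee Shop', 'Coffee Shop', 'Café', 'Gym', 'Park', 'Park'],
})

# For each cluster, pick the venue category that appears most often in first place
summary = (
    toy_merged
    .groupby('Cluster Labels')['1st Most Common Venue']
    .agg(lambda s: s.value_counts().idxmax())
)
print(summary)
```

For clusters that contain a single neighborhood, the tally simply repeats that neighborhood's top venue, so the names of those clusters rest on very little data.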
