## Segmenting and Clustering Neighborhoods in Toronto

### Introduction

In this notebook, we retrieve data on Toronto neighborhoods from a Wikipedia page, use the Foursquare API to retrieve venue information for the neighborhoods of the Downtown Toronto borough, and then explore and cluster the data. To do all this, we use Geocoder to get the co-ordinates of the neighborhoods, the Foursquare API to get venue information for each neighborhood, and K-means to cluster the data returned by Foursquare. The assignment is divided into three sections: data retrieval, getting geographical co-ordinates, and exploring and clustering. They are marked as Part A, Part B and Part C.

## Part A : Retrieving Data of Toronto Neighborhoods from Wikipedia Page

### Importing all Required Libraries

In [80]:
import urllib.request #library used to open URL
from bs4 import BeautifulSoup #library used to parse HTML
import pandas as pd
import numpy as np
print('All Libraries imported')

All Libraries imported


In [81]:
#Specify which URL to use for scraping the data
url = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'

In [82]:
# open the URL using urllib.request and save the HTML in the variable page
page = urllib.request.urlopen(url)

In [83]:
#parse HTML from our URL into BeautifulSoup parse tree format
soup = BeautifulSoup(page, "lxml")

In [84]:
#Using find_all function to get back all the tables in the HTML
all_tables = soup.find_all("table")
# Extracting the Postal Code table from the tables above
right_table = soup.find('table', class_='wikitable sortable')

In [85]:
# Extracting Data from the table above
P = []
B = []
N =[]
for row in right_table.find_all('tr'):
    cells = row.find_all('td')
    if len(cells)== 3:
        P.append(cells[0].find(text=True))
        B.append(cells[1].find(text = True))
        N.append(cells[2].find(text = True))

In [86]:
# Creating a Pandas Dataframe using the Lists above
df = pd.DataFrame(P, columns = ['PostalCode'])
df['Borough'] = B
df['Neighborhood']=N
df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront
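
The row filter used above (keep only table rows that contain exactly three `td` cells) can be tried without network access. The sketch below is a stand-in, not the notebook's method: it uses the standard-library `html.parser` instead of BeautifulSoup, and `TableRows` and `sample_html` are hypothetical names with made-up markup that only imitates the Wikipedia table.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the stripped text of every <td> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one list of cell texts per table row
        self._row = None    # cells of the row currently being parsed
        self._cell = None   # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Made-up markup that mimics the layout of the postal code table.
sample_html = """
<table class="wikitable sortable">
  <tr><th>Postcode</th><th>Borough</th><th>Neighbourhood</th></tr>
  <tr><td>M1A</td><td>Not assigned</td><td>Not assigned
</td></tr>
  <tr><td>M3A</td><td>North York</td><td>Parkwoods
</td></tr>
</table>
"""

parser = TableRows()
parser.feed(sample_html)

# The header row has <th> cells only, so it yields zero <td> cells and is dropped.
records = [row for row in parser.rows if len(row) == 3]
print(records)
```

Stripping the cell text at parse time also removes the trailing newline that the notebook cleans up later in the pre-processing step.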


## Cleaning and pre-processing the Data

In [87]:
#Dropping rows with  Borough = 'Not assigned'
df = df[df.Borough!= 'Not assigned']

# Resetting the index
df.index = np.arange(0, len(df))

# Merging rows with same PostalCode
df = df.groupby(['PostalCode', 'Borough'])['Neighborhood'].apply(list)
# sample(frac=1) shuffles the rows; reset_index() turns PostalCode and Borough back into columns
df = df.sample(frac=1).reset_index()
df['Neighborhood']= df['Neighborhood'].str.join(',')

# removing unwanted \n's at end of Neighborhood values which showed up after running the groupby command
df['Neighborhood']= df['Neighborhood'].str.replace('\n', '')

# Assigning the value of Borough to Neighborhood if Neighborhood is 'Not assigned'
df.loc[df['Neighborhood']=='Not assigned', 'Neighborhood'] = df.loc[df['Neighborhood']=='Not assigned', 'Borough']

df.head(20)

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M4W,Downtown Toronto,Rosedale
1,M8Y,Etobicoke,"Humber Bay,King's Mill Park,Kingsway Park Sout..."
2,M7A,Queen's Park,Queen's Park
3,M5R,Central Toronto,"The Annex,North Midtown,Yorkville"
4,M5M,North York,"Bedford Park,Lawrence Manor East"
5,M5A,Downtown Toronto,Harbourfront
6,M5S,Downtown Toronto,"Harbord,University of Toronto"
7,M4N,Central Toronto,Lawrence Park
8,M9W,Etobicoke,Northwest
9,M6E,York,Caledonia-Fairbanks


In [88]:
df.shape

(103, 3)
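
To make the cleaning rules easier to follow, here is a minimal sketch of the same three steps (drop unassigned boroughs, merge neighborhoods that share a postal code, fall back to the borough name) written with plain Python containers instead of pandas. The rows are illustrative only, and `merged` and `cleaned` are hypothetical names.

```python
# Illustrative rows only (PostalCode, Borough, Neighborhood); not the scraped data.
rows = [
    ("M1A", "Not assigned", "Not assigned\n"),
    ("M5A", "Downtown Toronto", "Harbourfront\n"),
    ("M5A", "Downtown Toronto", "Regent Park\n"),
    ("M7A", "Queen's Park", "Not assigned\n"),
]

# Steps 1 and 2: skip unassigned boroughs, group neighborhoods by (postal code, borough).
merged = {}
for postal_code, borough, neighborhood in rows:
    if borough == "Not assigned":
        continue
    merged.setdefault((postal_code, borough), []).append(neighborhood.strip())

# Step 3: join the names; an unassigned neighborhood takes the borough name.
cleaned = []
for (postal_code, borough), names in merged.items():
    joined = ",".join(names)
    if joined == "Not assigned":
        joined = borough
    cleaned.append((postal_code, borough, joined))

print(cleaned)
```

Each postal code ends up on exactly one row, which is the property the later geocoding loop relies on when it looks rows up by `PostalCode`.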

## Part B : Get Longitude and Latitude Values of Postal Codes using geocoder

In [89]:
# Adding empty columns Latitude and Longitude to dataframe for storing longitude and Latitude values
df['Longitude'] = ""
df['Latitude'] = ""
df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Longitude,Latitude
0,M4W,Downtown Toronto,Rosedale,,
1,M8Y,Etobicoke,"Humber Bay,King's Mill Park,Kingsway Park Sout...",,
2,M7A,Queen's Park,Queen's Park,,
3,M5R,Central Toronto,"The Annex,North Midtown,Yorkville",,
4,M5M,North York,"Bedford Park,Lawrence Manor East",,


In [90]:
# Getting Latitude and Longitude values for all postal codes using Geocoder
import geocoder
lat_lng_coords = None
# looping through the dataframe for all postal codes
for postalcode in df['PostalCode']:
    g = geocoder.arcgis('{} Toronto, Ontario'.format(postalcode))
    lat_lng_coords = g.latlng
    # locating row where "PostalCode" = postalcode
    i = df.loc[df['PostalCode']== postalcode].index
    # Assigning the co-ordinates returned by geocoder to the Latitude and Longitude columns of row i
    df.loc[i,'Latitude'] = lat_lng_coords[0]
    df.loc[i,'Longitude'] = lat_lng_coords[1]
df

Unnamed: 0,PostalCode,Borough,Neighborhood,Longitude,Latitude
0,M4W,Downtown Toronto,Rosedale,-79.3779,43.6822
1,M8Y,Etobicoke,"Humber Bay,King's Mill Park,Kingsway Park Sout...",-79.4896,43.6328
2,M7A,Queen's Park,Queen's Park,-79.3917,43.6612
3,M5R,Central Toronto,"The Annex,North Midtown,Yorkville",-79.4038,43.6748
4,M5M,North York,"Bedford Park,Lawrence Manor East",-79.4191,43.7355
5,M5A,Downtown Toronto,Harbourfront,-79.3592,43.6503
6,M5S,Downtown Toronto,"Harbord,University of Toronto",-79.4018,43.6631
7,M4N,Central Toronto,Lawrence Park,-79.3871,43.7284
8,M9W,Etobicoke,Northwest,-79.5792,43.7117
9,M6E,York,Caledonia-Fairbanks,-79.451,43.6886
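
The loop above indexes `g.latlng` directly, so a lookup that comes back empty would raise an error. A minimal retry sketch follows, assuming the lookup returns `None` on failure. `geocode_with_retry` and its `lookup` parameter are hypothetical; in the notebook, `lookup` could be something like `lambda q: geocoder.arcgis(q).latlng`.

```python
import time


def geocode_with_retry(lookup, query, attempts=3, delay=0.0):
    """Call lookup(query) until it returns co-ordinates or the attempts run out.

    lookup is any callable that returns a [lat, lng] pair or None.
    Returns the co-ordinates, or None if every attempt failed.
    """
    for _ in range(attempts):
        coords = lookup(query)
        if coords is not None:
            return coords
        time.sleep(delay)  # brief pause before trying again
    return None


# Stand-in lookup that fails twice before succeeding, to show the retry path.
calls = []


def flaky_lookup(query):
    calls.append(query)
    return None if len(calls) < 3 else [43.6822, -79.3779]


print(geocode_with_retry(flaky_lookup, "M4W Toronto, Ontario"))
```

Returning `None` after the last attempt lets the caller decide whether to skip the postal code or stop, rather than failing inside the loop.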


## Part C : Exploring and Clustering Neighborhoods in Toronto

Importing all required libraries and packages

In [91]:
from geopy.geocoders import Nominatim 
import requests
import json 
from pandas.io.json import json_normalize
from sklearn.cluster import KMeans
import matplotlib.cm as cm
import matplotlib.colors as colors
from IPython.display import Image
print('All Libraries and Packages imported')

All Libraries and Packages imported


In [92]:
!pip install folium



In [93]:
import folium

## Using geopy to retrieve Latitude and Longitude of Toronto

In [94]:
address = 'Toronto, Ontario'
geolocator = Nominatim(user_agent = "toronto_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical co-ordinates of Toronto are: {}, {}'.format(latitude, longitude))

The geographical co-ordinates of Toronto are: 43.653963, -79.387207


## Create a map of Toronto with neighborhoods superimposed on top

In [95]:
# Create map of Toronto using the Latitude and Longitude values from above
map_toronto = folium.Map(location = [latitude, longitude], zoom_start = 10)
# add markers to the map for every neighborhood in the df dataframe

for lat, lng, borough, neighborhood in zip(df['Latitude'], df['Longitude'], df['Borough'], df['Neighborhood']):
    label = '{},{}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html = True)
    folium.CircleMarker(
        [lat, lng],
        radius = 5,
        popup = label,
        color = 'blue',
        fill = True,
        fill_color = '#3186cc',
        fill_opacity = 0.7,
        parse_html = False).add_to(map_toronto)
map_toronto

## Exploring the Borough Downtown Toronto

### Create new Dataframe with neighborhoods in Downtown Toronto

In [96]:
# Creating a dataframe with only the rows whose Borough is 'Downtown Toronto'

df_toronto = df[df['Borough'] == 'Downtown Toronto'].reset_index(drop = True)
df_toronto.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Longitude,Latitude
0,M4W,Downtown Toronto,Rosedale,-79.3779,43.6822
1,M5A,Downtown Toronto,Harbourfront,-79.3592,43.6503
2,M5S,Downtown Toronto,"Harbord,University of Toronto",-79.4018,43.6631
3,M5W,Downtown Toronto,Stn A PO Boxes 25 The Esplanade,-79.3854,43.6487
4,M5C,Downtown Toronto,St. James Town,-79.3755,43.6512


### Getting the Co-ordinates of Downtown Toronto

In [97]:
address = 'Downtown Toronto, Ontario'
geolocator = Nominatim(user_agent = "toronto_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical co-ordinates of Downtown Toronto are: {}, {}'.format(latitude, longitude))

The geographical co-ordinates of Downtown Toronto are: 43.655115, -79.380219


### Create Map of Downtown Toronto with Neighborhood markers superimposed on top

In [98]:
# Create map of Downtown Toronto using the above latitude and longitude values
map_downtown_toronto = folium.Map(location = [latitude, longitude], zoom_start = 11)

# Add markers showing neighborhoods in Downtown Toronto
for lat, lng, label in zip(df_toronto['Latitude'], df_toronto['Longitude'], df_toronto['Neighborhood']):
    label = folium.Popup(label, parse_html = True)
    folium.CircleMarker(
        [lat, lng],
        radius = 5,
        popup = label,
        color = 'blue',
        fill = True,
        fill_color = '#3186cc',
        fill_opacity = 0.7,
        parse_html = False).add_to(map_downtown_toronto)
map_downtown_toronto

### Using Foursquare API to explore neighborhoods in Downtown Toronto

## Defining Foursquare API Credentials

In [99]:
CLIENT_ID = 'R3SEJHFEJQUL4NWD2SIB4JYVHNCKUJVCCMHVUCCEQ4K30YSC'
CLIENT_SECRET = 'PQYQGAVLNH5V1SLWIJOMX0UIWO1Y1DJ5WBEE52UIVVWYPAR4'
VERSION = '20180605'

print('Foursquare Credentials')
print('CLient_id:',CLIENT_ID)
print('Client_Secret:', CLIENT_SECRET)

Foursquare Credentials
CLient_id: R3SEJHFEJQUL4NWD2SIB4JYVHNCKUJVCCMHVUCCEQ4K30YSC
Client_Secret: PQYQGAVLNH5V1SLWIJOMX0UIWO1Y1DJ5WBEE52UIVVWYPAR4


### Exploring First Neighborhood in Dataframe

In [100]:
df_toronto.loc[0,'Neighborhood']

'Rosedale'

### Getting Location of the first neighborhood of the Downtown Toronto Dataframe

In [101]:
neighborhood_latitude = df_toronto.loc[0, 'Latitude']
neighborhood_longitude = df_toronto.loc[0, 'Longitude']
neighborhood_name = df_toronto.loc[0, 'Neighborhood']

print('Latitude and Longitude values of {} are {} and {}.'.format(neighborhood_name, neighborhood_latitude, neighborhood_longitude))

Latitude and Longitude values of Rosedale are 43.68220500000007 and -79.37794519699997.


### Getting the top 100 venues in the Rosedale neighborhood within a 500 m radius

Creating the URL for the request to the Foursquare API

In [102]:
LIMIT = 100
radius = 500
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID,
    CLIENT_SECRET,
    VERSION,
    neighborhood_latitude,
    neighborhood_longitude,
    radius,
    LIMIT)
print(url)

https://api.foursquare.com/v2/venues/explore?&client_id=R3SEJHFEJQUL4NWD2SIB4JYVHNCKUJVCCMHVUCCEQ4K30YSC&client_secret=PQYQGAVLNH5V1SLWIJOMX0UIWO1Y1DJ5WBEE52UIVVWYPAR4&v=20180605&ll=43.68220500000007,-79.37794519699997&radius=500&limit=100
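
As an alternative to positional string formatting, the query string can be assembled from named parameters with the standard-library `urllib.parse.urlencode`. This is a sketch rather than the notebook's code: `build_explore_url` is a hypothetical helper and the credentials shown are placeholders.

```python
from urllib.parse import urlencode


def build_explore_url(client_id, client_secret, version, lat, lng, radius=500, limit=100):
    """Build a venues/explore request URL from named parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }
    # safe="," keeps the comma in the ll parameter unescaped, as in the URL above.
    return "https://api.foursquare.com/v2/venues/explore?" + urlencode(params, safe=",")


# Placeholder credentials; substitute your own values.
example_url = build_explore_url("YOUR_ID", "YOUR_SECRET", "20180605", 43.6822, -79.3779)
print(example_url)
```

Naming each parameter makes it harder to swap latitude and longitude by accident, and `urlencode` escapes any characters that are not valid in a query string.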


In [103]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5dd3989771782e001bdd3dc5'},
 'response': {'headerLocation': 'Rosedale',
  'headerFullLocation': 'Rosedale, Toronto',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 4,
  'suggestedBounds': {'ne': {'lat': 43.68670500450007,
    'lng': -79.37173430620689},
   'sw': {'lat': 43.677704995500065, 'lng': -79.38415608779306}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4adcb343f964a520e32e21e3',
       'name': 'Summerhill Market',
       'location': {'address': '446 Summerhill Ave',
        'crossStreet': 'btwn. MacLennan Ave. and Glen Rd.',
        'lat': 43.68626482142425,
        'lng': -79.37545823237794,
        'labeledLatLngs': [{'label': 'display',
          'lat': 43.68626482142425,
          'lng': -79.37545823

### Creating function to get the category of the venue from the results

In [104]:
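def get_category_type(row):
    # the category list is stored under 'categories' or, after json_normalize, under 'venue.categories'
    try:
        category_list = row['categories']
    except KeyError:
        category_list = row['venue.categories']
    if len(category_list)==0:
        return None
    else:
        return category_list[0]['name']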

### Cleaning the json and structuring it into a pandas dataframe

In [105]:
from pandas.io.json import json_normalize

In [106]:
venues = results['response']['groups'][0]['items']
nearby_venues = json_normalize(venues)

#filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis = 1)

# clean the columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

Unnamed: 0,name,categories,lat,lng
0,Summerhill Market,Grocery Store,43.686265,-79.375458
1,Rosedale Park,Playground,43.682328,-79.378934
2,Whitney Park,Park,43.682036,-79.373788
3,Scoops Convenience Boutique,Candy Store,43.686148,-79.375828


### Total number of nearby venues returned by Foursquare

In [107]:
print('{} nearby venues were returned by Foursquare'.format(nearby_venues.shape[0]))

4 nearby venues were returned by Foursquare
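
The flattening done above with `json_normalize` and `get_category_type` can also be expressed directly on the nested dictionaries. The sketch below assumes each item has the `venue` layout seen in the response above; `flatten_venue` is a hypothetical helper, and `sample_item` is a trimmed copy of the first result with the category taken from the table.

```python
def flatten_venue(item):
    """Reduce one item of the explore response to name, first category and co-ordinates."""
    venue = item["venue"]
    categories = venue.get("categories", [])
    return {
        "name": venue["name"],
        # Use the first category if there is one, otherwise None.
        "categories": categories[0]["name"] if categories else None,
        "lat": venue["location"]["lat"],
        "lng": venue["location"]["lng"],
    }


# Trimmed version of the first item returned for Rosedale.
sample_item = {
    "venue": {
        "name": "Summerhill Market",
        "location": {"lat": 43.68626482142425, "lng": -79.37545823237794},
        "categories": [{"name": "Grocery Store"}],
    }
}

print(flatten_venue(sample_item))
```

Venues without any category come back with `None` instead of raising an error, which mirrors the empty-list branch of `get_category_type`.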


## Exploring all the neighborhoods in Downtown Toronto

### Create a function that repeats the venue retrieval for every neighborhood

In [108]:
def getNearbyVenues(names, latitudes, longitudes, radius = 500):
    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
        
        #Create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
               CLIENT_ID,
               CLIENT_SECRET,
               VERSION,
               lat,
               lng,
               radius,
               LIMIT)
        # make the Get request
        results = requests.get(url).json()['response']['groups'][0]['items']
        
        # return only relevant information for each neighborhood
        venues_list.append([(
          name,
          lat,
          lng,
          v['venue']['name'],
          v['venue']['location']['lat'],
          v['venue']['location']['lng'],
          v['venue']['categories'][0]['name']) for v in results])
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood',
                            'Neighborhood Latitude',
                            'Neighborhood Longitude',
                            'Venue',
                            'Venue Latitude',
                            'Venue Longitude',
                            'Venue Category']
    return (nearby_venues)

## Call the above function to get venues for all the Neighborhoods in Downtown Toronto

In [109]:
downtown_toronto_venues = getNearbyVenues(names=df_toronto['Neighborhood'],
                                   latitudes=df_toronto['Latitude'],
                                   longitudes=df_toronto['Longitude']
                                  )

Rosedale
Harbourfront
Harbord,University of Toronto
Stn A PO Boxes 25 The Esplanade
St. James Town
Central Bay Street
First Canadian Place,Underground city
Commerce Court,Victoria Hotel
Christie
CN Tower,Bathurst Quay,Island airport,Harbourfront West,King and Spadina,Railway Lands,South Niagara
Harbourfront East,Toronto Islands,Union Station
Design Exchange,Toronto Dominion Centre
Adelaide,King,Richmond
Cabbagetown,St. James Town
Berczy Park
Chinatown,Grange Park,Kensington Market
Church and Wellesley
Ryerson,Garden District


### Checking the size of the dataframe

In [110]:
print(downtown_toronto_venues.shape)

(1252, 7)


### Checking number of venues returned for each Neighborhood

In [111]:
downtown_toronto_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
"Adelaide,King,Richmond",100,100,100,100,100,100
Berczy Park,63,63,63,63,63,63
"CN Tower,Bathurst Quay,Island airport,Harbourfront West,King and Spadina,Railway Lands,South Niagara",70,70,70,70,70,70
"Cabbagetown,St. James Town",40,40,40,40,40,40
Central Bay Street,96,96,96,96,96,96
"Chinatown,Grange Park,Kensington Market",84,84,84,84,84,84
Christie,12,12,12,12,12,12
Church and Wellesley,83,83,83,83,83,83
"Commerce Court,Victoria Hotel",100,100,100,100,100,100
"Design Exchange,Toronto Dominion Centre",100,100,100,100,100,100


### Getting the count of unique categories returned

In [112]:
print('There are {} unique categories.'.format(len(downtown_toronto_venues['Venue Category'].unique())))

There are 184 unique categories.


## Analyzing Each Neighborhood

In [113]:
# one hot encoding
down_tor_onehot = pd.get_dummies(downtown_toronto_venues[['Venue Category']], prefix="", prefix_sep="")

# There is a venue category 'Neighborhood', changing its name to 'Hoods'
down_tor_onehot.rename(columns={'Neighborhood': 'Hoods'}, inplace = True)

# add neighborhood column back as first column of dataframe
down_tor_onehot['Neighborhood'] = downtown_toronto_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [down_tor_onehot.columns[-1]] + list(down_tor_onehot.columns[:-1])
down_tor_onehot = down_tor_onehot[fixed_columns]

down_tor_onehot.head()

Unnamed: 0,Neighborhood,Afghan Restaurant,American Restaurant,Antique Shop,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Basketball Stadium,Beer Bar,Beer Store,Belgian Restaurant,Bistro,Boat or Ferry,Bookstore,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Butcher,Café,Candy Store,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,College Gym,College Rec Center,Colombian Restaurant,Comfort Food Restaurant,Comic Shop,Concert Hall,Cosmetics Shop,Creperie,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dog Run,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Ethiopian Restaurant,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Fish Market,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gaming Cafe,Gastropub,Gay Bar,General Entertainment,General Travel,Gluten-free Restaurant,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Harbor / Marina,Health & Beauty Service,Historic Site,Hobby Shop,Hookah Bar,Hostel,Hotel,Hotel Bar,Hotpot Restaurant,Ice Cream Shop,Indian Restaurant,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Lake,Latin American Restaurant,Lingerie Store,Liquor Store,Lounge,Market,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Movie Theater,Moving Target,Museum,Music Venue,Hoods,New American Restaurant,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Park,Performing Arts Venue,Peruvian Restaurant,Pet Store,Pharmacy,Pier,Pizza Place,Playground,Plaza,Poke Place,Poutine Place,Pub,Ramen Restaurant,Record Shop,Restaurant,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Smoke Shop,Smoothie Shop,Snack Place,Soup Place,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Steakhouse,Strip Club,Sushi Restaurant,Taco Place,Tailor Shop,Taiwanese Restaurant,Tanning Salon,Tapas Restaurant,Tea Room,Tech Startup,Thai Restaurant,Theater,Theme Restaurant,Toy / Game Store,Trail,Train Station,Tram Station,Vegetarian / Vegan Restaurant,Veterinarian,Video Game Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Wings Joint,Yoga Studio
0,Rosedale,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Rosedale,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Rosedale,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Rosedale,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Harbourfront,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


### Getting Size of new dataframe

In [114]:
down_tor_onehot.shape

(1252, 185)

### Grouping rows by neighborhood and taking the mean frequency of occurrence of each category

In [115]:
down_tor_grouped = down_tor_onehot.groupby('Neighborhood').mean().reset_index()
down_tor_grouped.head()

Unnamed: 0,Neighborhood,Afghan Restaurant,American Restaurant,Antique Shop,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Basketball Stadium,Beer Bar,Beer Store,Belgian Restaurant,Bistro,Boat or Ferry,Bookstore,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Butcher,Café,Candy Store,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,College Gym,College Rec Center,Colombian Restaurant,Comfort Food Restaurant,Comic Shop,Concert Hall,Cosmetics Shop,Creperie,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dog Run,Donut Shop,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Ethiopian Restaurant,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Fish Market,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gaming Cafe,Gastropub,Gay Bar,General Entertainment,General Travel,Gluten-free Restaurant,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Harbor / Marina,Health & Beauty Service,Historic Site,Hobby Shop,Hookah Bar,Hostel,Hotel,Hotel Bar,Hotpot Restaurant,Ice Cream Shop,Indian Restaurant,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Lake,Latin American Restaurant,Lingerie Store,Liquor Store,Lounge,Market,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Movie Theater,Moving Target,Museum,Music Venue,Hoods,New American Restaurant,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Park,Performing Arts Venue,Peruvian Restaurant,Pet Store,Pharmacy,Pier,Pizza Place,Playground,Plaza,Poke Place,Poutine Place,Pub,Ramen Restaurant,Record Shop,Restaurant,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Smoke Shop,Smoothie Shop,Snack Place,Soup Place,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Steakhouse,Strip Club,Sushi Restaurant,Taco Place,Tailor Shop,Taiwanese Restaurant,Tanning Salon,Tapas Restaurant,Tea Room,Tech Startup,Thai Restaurant,Theater,Theme Restaurant,Toy / Game Store,Trail,Train Station,Tram Station,Vegetarian / Vegan Restaurant,Veterinarian,Video Game Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Wings Joint,Yoga Studio
0,"Adelaide,King,Richmond",0.0,0.03,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.03,0.0,0.03,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.01,0.02,0.0,0.0,0.01,0.03,0.01,0.0,0.06,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.01,0.0,0.0,0.02,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.01,0.0,0.0,0.01,0.03,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.03,0.0,0.02,0.01,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.04,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0
1,Berczy Park,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.015873,0.0,0.015873,0.047619,0.0,0.0,0.015873,0.031746,0.0,0.0,0.015873,0.0,0.0,0.0,0.031746,0.0,0.0,0.0,0.0,0.0,0.015873,0.031746,0.0,0.0,0.031746,0.0,0.0,0.0,0.015873,0.031746,0.079365,0.0,0.0,0.0,0.0,0.015873,0.0,0.015873,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.031746,0.0,0.0,0.015873,0.0,0.0,0.015873,0.015873,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.015873,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.031746,0.0,0.0,0.0,0.0,0.015873,0.015873,0.015873,0.015873,0.0,0.0,0.0,0.0,0.0,0.015873,0.031746,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.031746,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.031746,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.015873,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.015873,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"CN Tower,Bathurst Quay,Island airport,Harbourf...",0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.028571,0.0,0.042857,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.014286,0.0,0.014286,0.0,0.0,0.028571,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.114286,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.014286,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.042857,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.014286,0.014286,0.0,0.0,0.0,0.014286,0.071429,0.014286,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.014286,0.014286,0.0,0.014286,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.014286,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.028571,0.0,0.014286,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.028571,0.014286,0.0,0.028571,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.028571,0.0,0.0,0.014286,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.014286
3,"Cabbagetown,St. James Town",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.025,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.025,0.05,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.075,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.05,0.025,0.0,0.025,0.0,0.0,0.0,0.0,0.025,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.025,0.05,0.0,0.05,0.025,0.0,0.0,0.0,0.025,0.0,0.0,0.075,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Central Bay Street,0.0,0.010417,0.0,0.010417,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.010417,0.010417,0.0,0.010417,0.0,0.0,0.0,0.0,0.020833,0.0,0.010417,0.0,0.020833,0.0,0.020833,0.0,0.0,0.010417,0.0,0.0,0.0,0.020833,0.0,0.0,0.0625,0.0,0.104167,0.0,0.0,0.0,0.0,0.0,0.010417,0.0,0.03125,0.0,0.0,0.0,0.010417,0.010417,0.0,0.010417,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010417,0.0,0.010417,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.010417,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.020833,0.0,0.0,0.020833,0.010417,0.0,0.0,0.010417,0.0,0.0,0.010417,0.0,0.0,0.0,0.0,0.0,0.010417,0.010417,0.010417,0.010417,0.0,0.0,0.010417,0.0,0.0,0.010417,0.010417,0.0,0.0,0.0,0.010417,0.0,0.0,0.0,0.0,0.010417,0.0,0.0,0.0,0.0,0.0,0.010417,0.0,0.020833,0.010417,0.0,0.010417,0.010417,0.0,0.020833,0.0,0.0,0.0,0.020833,0.0,0.010417,0.010417,0.010417,0.0,0.0,0.0,0.0,0.020833,0.0,0.020833,0.0,0.010417,0.0,0.020833,0.0,0.0,0.0,0.010417,0.0,0.020833,0.0,0.010417,0.010417,0.0,0.010417,0.0,0.0,0.0,0.0,0.0,0.010417,0.010417,0.010417,0.0,0.0,0.0


### Confirming New Size

In [116]:
down_tor_grouped.shape

(18, 185)
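
Taking the mean of one-hot columns per neighborhood is the same as computing each category's share of that neighborhood's venues. The following sketch shows this with `collections.Counter` on a handful of made-up (neighborhood, category) pairs; `venues`, `counts` and `freqs` are hypothetical names and the data is not taken from Foursquare.

```python
from collections import Counter, defaultdict

# Made-up (neighborhood, venue category) pairs for illustration.
venues = [
    ("Rosedale", "Park"),
    ("Rosedale", "Playground"),
    ("Rosedale", "Park"),
    ("Rosedale", "Grocery Store"),
    ("Christie", "Café"),
    ("Christie", "Grocery Store"),
]

# Count how often each category occurs in each neighborhood.
counts = defaultdict(Counter)
for hood, category in venues:
    counts[hood][category] += 1

# Divide by the number of venues in the neighborhood to get relative frequencies.
freqs = {
    hood: {category: n / sum(c.values()) for category, n in c.items()}
    for hood, c in counts.items()
}

print(freqs["Rosedale"])
```

Because every row is divided by its own venue count, a neighborhood with 12 venues and one with 100 become comparable, and the frequencies in each row of `down_tor_grouped` sum to 1.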

### Printing the top 5 venues for each Neighborhood

In [117]:
num_top_venues = 5

for hood in down_tor_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = down_tor_grouped[down_tor_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Adelaide,King,Richmond----
          venue  freq
0   Coffee Shop  0.08
1          Café  0.06
2         Hotel  0.05
3    Steakhouse  0.04
4  Burger Joint  0.03


----Berczy Park----
                venue  freq
0         Coffee Shop  0.08
1              Bakery  0.05
2          Restaurant  0.05
3  Seafood Restaurant  0.03
4               Hotel  0.03


----CN Tower,Bathurst Quay,Island airport,Harbourfront West,King and Spadina,Railway Lands,South Niagara----
                  venue  freq
0           Coffee Shop  0.11
1    Italian Restaurant  0.07
2  Gym / Fitness Center  0.04
3                   Bar  0.04
4             Speakeasy  0.03


----Cabbagetown,St. James Town----
         venue  freq
0   Restaurant  0.08
1  Coffee Shop  0.08
2       Bakery  0.05
3         Park  0.05
4         Café  0.05


----Central Bay Street----
                  venue  freq
0           Coffee Shop  0.10
1        Clothing Store  0.06
2  Fast Food Restaurant  0.03
3                Bakery  0.03
4        Cosme

## Putting these in a Pandas Dataframe

### Write a function to sort the venues in descending order

In [118]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
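
To see what this function returns, here is a small self-contained check on a made-up row. The neighborhood name, venue names and frequencies are illustrative assumptions, not values from the Foursquare data.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    # Skip the first entry (the neighborhood name) and keep only the frequencies
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# Hypothetical row: the first entry is the name, the rest are venue frequencies
toy_row = pd.Series({'Neighborhood': 'Toy Hood', 'Bakery': 0.10, 'Coffee Shop': 0.40, 'Park': 0.25})

top_two = list(return_most_common_venues(toy_row, 2))
print(top_two)
```

The function returns the column names (venue categories), not the frequencies, which is why the resulting table holds only category labels.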

### Create new Dataframe with top ten venues for each neighborhood

In [119]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = down_tor_grouped['Neighborhood']

for ind in np.arange(down_tor_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(down_tor_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Adelaide,King,Richmond",Coffee Shop,Café,Hotel,Steakhouse,Burger Joint,Bar,Bakery,Japanese Restaurant,Restaurant,Asian Restaurant
1,Berczy Park,Coffee Shop,Bakery,Restaurant,Cheese Shop,Breakfast Spot,Beer Bar,Seafood Restaurant,Café,Hotel,Steakhouse
2,"CN Tower,Bathurst Quay,Island airport,Harbourf...",Coffee Shop,Italian Restaurant,Bar,Gym / Fitness Center,Pub,Bakery,Sandwich Place,Park,Café,Restaurant
3,"Cabbagetown,St. James Town",Restaurant,Coffee Shop,Pharmacy,Chinese Restaurant,Park,Café,Italian Restaurant,Pizza Place,Bakery,Pub
4,Central Bay Street,Coffee Shop,Clothing Store,Fast Food Restaurant,Cosmetics Shop,Bakery,Chinese Restaurant,Tea Room,Bookstore,Gym / Fitness Center,Spa
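
The column names above come from a list of suffixes plus a fallback to 'th', which is enough for ten venues. As a sketch only, a hypothetical helper `ordinal` (not part of the notebook) could build the same labels and stay correct if `num_top_venues` ever grows past 20, where a plain 'th' fallback would produce '21th'.

```python
def ordinal(n):
    # 11th, 12th and 13th are exceptions to the st/nd/rd rule
    if 10 <= n % 100 <= 20:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

num_top_venues = 10

# Same column layout as in the notebook: the name first, then one column per rank
columns = ['Neighborhood'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, num_top_venues + 1)
]
print(columns[:4])
```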


## Clustering Neighborhoods

### Running K-means to cluster neighborhoods into 5 clusters

In [120]:
# setting number of clusters
kclusters = 5

# drop the name column so only the venue frequencies are clustered
down_tor_grouped_clustering = down_tor_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(down_tor_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([0, 0, 0, 4, 0, 4, 3, 0, 0, 0], dtype=int32)
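
For readers who want to see what the clustering step does, here is a minimal NumPy sketch of Lloyd's algorithm, the basic iteration behind k-means. It is not scikit-learn's implementation (which, among other things, uses a smarter initialisation), and the function name `lloyd_kmeans` and the toy data are assumptions made for illustration.

```python
import numpy as np

def lloyd_kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Start from k distinct rows chosen at random (plain random init)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assignment step: index of the nearest center for every row
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each center to the mean of its assigned rows
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Toy frequency-like rows forming two obvious groups
X = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]])
labels, centers = lloyd_kmeans(X, k=2)
print(labels)
```

The numeric label given to each group is arbitrary and can change with the seed or the input data, so any name attached to a label should be checked against the venues actually listed for that cluster.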

### Creating new dataframe which includes the cluster as well as the top 10 venues

In [121]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

down_tor_merged = df_toronto

# join the sorted venues (with cluster labels) onto df_toronto, which holds latitude/longitude for each neighborhood
down_tor_merged = down_tor_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

down_tor_merged.head() # check the last columns!

Unnamed: 0,PostalCode,Borough,Neighborhood,Longitude,Latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M4W,Downtown Toronto,Rosedale,-79.3779,43.6822,2,Playground,Candy Store,Park,Grocery Store,Eastern European Restaurant,Fish Market,Fish & Chips Shop,Fast Food Restaurant,Farmers Market,Farm
1,M5A,Downtown Toronto,Harbourfront,-79.3592,43.6503,0,Coffee Shop,Bakery,Boat or Ferry,Theater,Café,Breakfast Spot,Brewery,Spa,Gastropub,French Restaurant
2,M5S,Downtown Toronto,"Harbord,University of Toronto",-79.4018,43.6631,4,Café,Bakery,Bookstore,Coffee Shop,Restaurant,Gym,Japanese Restaurant,Italian Restaurant,Bar,Ramen Restaurant
3,M5W,Downtown Toronto,Stn A PO Boxes 25 The Esplanade,-79.3854,43.6487,0,Coffee Shop,Bar,Hotel,Café,Steakhouse,American Restaurant,Pub,Asian Restaurant,Pizza Place,Sushi Restaurant
4,M5C,Downtown Toronto,St. James Town,-79.3755,43.6512,0,Coffee Shop,Café,Hotel,Restaurant,Bakery,Breakfast Spot,Seafood Restaurant,Gastropub,Italian Restaurant,Cosmetics Shop
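
The join above keeps every row of the left dataframe. As a sketch on made-up data (the frames `left` and `right` are hypothetical stand-ins for `df_toronto` and `neighborhoods_venues_sorted`), this shows what would happen if a neighborhood had no venues and therefore no cluster label, and one possible way to handle it before plotting.

```python
import pandas as pd

# Hypothetical stand-ins: 'C' has coordinates but no cluster label
left = pd.DataFrame({'Neighborhood': ['A', 'B', 'C'], 'Latitude': [43.65, 43.66, 43.67]})
right = pd.DataFrame({'Neighborhood': ['A', 'B'], 'Cluster Labels': [0, 1]})

merged = left.join(right.set_index('Neighborhood'), on='Neighborhood')
print(merged)

# Rows without a match get NaN, which also turns the label column into floats
unmatched = merged[merged['Cluster Labels'].isna()]
print(unmatched['Neighborhood'].tolist())

# One option: drop unmatched rows and restore integer labels
clean = merged.dropna(subset=['Cluster Labels']).copy()
clean['Cluster Labels'] = clean['Cluster Labels'].astype(int)
```

In the output shown above the labels are integers, which suggests every neighborhood found a match in this run.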


### Visualizing the clusters

In [122]:
import matplotlib.cm as cm
import matplotlib.colors as colors  # needed for colors.rgb2hex below

In [123]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(down_tor_merged['Latitude'], down_tor_merged['Longitude'], down_tor_merged['Neighborhood'], down_tor_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
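
The color setup in this cell amounts to picking one evenly spaced color per cluster and looking it up by label. As a sketch using only the standard library's `colorsys` rather than matplotlib (so the hues will differ from `cm.rainbow`), a hypothetical helper `cluster_colors` could look like this:

```python
import colorsys

def cluster_colors(k):
    # One hex color per cluster, with hues spread evenly around the color wheel
    colors_hex = []
    for i in range(k):
        r, g, b = colorsys.hsv_to_rgb(i / k, 0.9, 0.9)
        colors_hex.append('#{:02x}{:02x}{:02x}'.format(int(r * 255), int(g * 255), int(b * 255)))
    return colors_hex

palette = cluster_colors(5)
print(palette)

# Index directly by the 0-based cluster label
color_for_label_0 = palette[0]
```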

## Examining Clusters

### Cluster 1

In [124]:
down_tor_merged.loc[down_tor_merged['Cluster Labels'] == 0, down_tor_merged.columns[[1] + list(range(5, down_tor_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Downtown Toronto,0,Coffee Shop,Bakery,Boat or Ferry,Theater,Café,Breakfast Spot,Brewery,Spa,Gastropub,French Restaurant
3,Downtown Toronto,0,Coffee Shop,Bar,Hotel,Café,Steakhouse,American Restaurant,Pub,Asian Restaurant,Pizza Place,Sushi Restaurant
4,Downtown Toronto,0,Coffee Shop,Café,Hotel,Restaurant,Bakery,Breakfast Spot,Seafood Restaurant,Gastropub,Italian Restaurant,Cosmetics Shop
5,Downtown Toronto,0,Coffee Shop,Clothing Store,Fast Food Restaurant,Cosmetics Shop,Bakery,Chinese Restaurant,Tea Room,Bookstore,Gym / Fitness Center,Spa
6,Downtown Toronto,0,Coffee Shop,Café,Hotel,American Restaurant,Restaurant,Bar,Bakery,Seafood Restaurant,Gym,Gastropub
7,Downtown Toronto,0,Coffee Shop,Café,Hotel,American Restaurant,Japanese Restaurant,Restaurant,Italian Restaurant,Gastropub,Deli / Bodega,Steakhouse
9,Downtown Toronto,0,Coffee Shop,Italian Restaurant,Bar,Gym / Fitness Center,Pub,Bakery,Sandwich Place,Park,Café,Restaurant
11,Downtown Toronto,0,Coffee Shop,Café,Hotel,American Restaurant,Restaurant,Seafood Restaurant,Gastropub,Steakhouse,Italian Restaurant,Bar
12,Downtown Toronto,0,Coffee Shop,Café,Hotel,Steakhouse,Burger Joint,Bar,Bakery,Japanese Restaurant,Restaurant,Asian Restaurant
14,Downtown Toronto,0,Coffee Shop,Bakery,Restaurant,Cheese Shop,Breakfast Spot,Beer Bar,Seafood Restaurant,Café,Hotel,Steakhouse


### Renaming Cluster 1 as 'Coffee Shops' since the most common venue in this cluster is Coffee Shop

In [125]:
down_tor_merged.loc[(down_tor_merged['Cluster Labels']) == 0,'Cluster Labels']= 'Coffee Shops'
# Display Cluster 1
down_tor_merged.loc[down_tor_merged['Cluster Labels'] == 'Coffee Shops', down_tor_merged.columns[[1] + list(range(5, down_tor_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Downtown Toronto,Coffee Shops,Coffee Shop,Bakery,Boat or Ferry,Theater,Café,Breakfast Spot,Brewery,Spa,Gastropub,French Restaurant
3,Downtown Toronto,Coffee Shops,Coffee Shop,Bar,Hotel,Café,Steakhouse,American Restaurant,Pub,Asian Restaurant,Pizza Place,Sushi Restaurant
4,Downtown Toronto,Coffee Shops,Coffee Shop,Café,Hotel,Restaurant,Bakery,Breakfast Spot,Seafood Restaurant,Gastropub,Italian Restaurant,Cosmetics Shop
5,Downtown Toronto,Coffee Shops,Coffee Shop,Clothing Store,Fast Food Restaurant,Cosmetics Shop,Bakery,Chinese Restaurant,Tea Room,Bookstore,Gym / Fitness Center,Spa
6,Downtown Toronto,Coffee Shops,Coffee Shop,Café,Hotel,American Restaurant,Restaurant,Bar,Bakery,Seafood Restaurant,Gym,Gastropub
7,Downtown Toronto,Coffee Shops,Coffee Shop,Café,Hotel,American Restaurant,Japanese Restaurant,Restaurant,Italian Restaurant,Gastropub,Deli / Bodega,Steakhouse
9,Downtown Toronto,Coffee Shops,Coffee Shop,Italian Restaurant,Bar,Gym / Fitness Center,Pub,Bakery,Sandwich Place,Park,Café,Restaurant
11,Downtown Toronto,Coffee Shops,Coffee Shop,Café,Hotel,American Restaurant,Restaurant,Seafood Restaurant,Gastropub,Steakhouse,Italian Restaurant,Bar
12,Downtown Toronto,Coffee Shops,Coffee Shop,Café,Hotel,Steakhouse,Burger Joint,Bar,Bakery,Japanese Restaurant,Restaurant,Asian Restaurant
14,Downtown Toronto,Coffee Shops,Coffee Shop,Bakery,Restaurant,Cheese Shop,Breakfast Spot,Beer Bar,Seafood Restaurant,Café,Hotel,Steakhouse


### Cluster 2

In [126]:
down_tor_merged.loc[down_tor_merged['Cluster Labels'] == 1, down_tor_merged.columns[[1] + list(range(5, down_tor_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
10,Downtown Toronto,1,Pier,Harbor / Marina,Park,Yoga Studio,Eastern European Restaurant,Fish Market,Fish & Chips Shop,Fast Food Restaurant,Farmers Market,Farm


### This cluster has only one element and its most common venues are waterfront venues (Pier, Harbor / Marina), so we can name it 'Waterfront'

In [127]:
down_tor_merged.loc[(down_tor_merged['Cluster Labels']) == 1,'Cluster Labels']= 'Waterfront'
down_tor_merged.loc[down_tor_merged['Cluster Labels'] == 'Waterfront', down_tor_merged.columns[[1] + list(range(5, down_tor_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
10,Downtown Toronto,Waterfront,Pier,Harbor / Marina,Park,Yoga Studio,Eastern European Restaurant,Fish Market,Fish & Chips Shop,Fast Food Restaurant,Farmers Market,Farm


### Cluster 3

In [128]:
down_tor_merged.loc[down_tor_merged['Cluster Labels'] == 2, down_tor_merged.columns[[1] + list(range(5, down_tor_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Downtown Toronto,2,Playground,Candy Store,Park,Grocery Store,Eastern European Restaurant,Fish Market,Fish & Chips Shop,Fast Food Restaurant,Farmers Market,Farm


### Since this cluster has only one element and its top three venues include Playground and Park, we can rename it 'Kid Play Areas'

In [129]:
down_tor_merged.loc[(down_tor_merged['Cluster Labels']) == 2,'Cluster Labels']= 'Kid Play Areas'
down_tor_merged.loc[down_tor_merged['Cluster Labels'] == 'Kid Play Areas', down_tor_merged.columns[[1] + list(range(5, down_tor_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Downtown Toronto,Kid Play Areas,Playground,Candy Store,Park,Grocery Store,Eastern European Restaurant,Fish Market,Fish & Chips Shop,Fast Food Restaurant,Farmers Market,Farm


### Cluster 4

In [130]:
down_tor_merged.loc[down_tor_merged['Cluster Labels'] == 3, down_tor_merged.columns[[1] + list(range(5, down_tor_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
8,Downtown Toronto,3,Café,Grocery Store,Coffee Shop,Playground,Candy Store,Italian Restaurant,Athletics & Sports,Baby Store,Bank,Food & Drink Shop


### This cluster has several stores (Grocery Store, Candy Store, Baby Store, Food & Drink Shop) so we could name it 'Shopping'

In [131]:
down_tor_merged.loc[(down_tor_merged['Cluster Labels']) == 3,'Cluster Labels']= 'Shopping'
down_tor_merged.loc[down_tor_merged['Cluster Labels'] == 'Shopping', down_tor_merged.columns[[1] + list(range(5, down_tor_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
8,Downtown Toronto,Shopping,Café,Grocery Store,Coffee Shop,Playground,Candy Store,Italian Restaurant,Athletics & Sports,Baby Store,Bank,Food & Drink Shop


### Cluster 5

In [132]:
down_tor_merged.loc[down_tor_merged['Cluster Labels'] == 4, down_tor_merged.columns[[1] + list(range(5, down_tor_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Downtown Toronto,4,Café,Bakery,Bookstore,Coffee Shop,Restaurant,Gym,Japanese Restaurant,Italian Restaurant,Bar,Ramen Restaurant
13,Downtown Toronto,4,Restaurant,Coffee Shop,Pharmacy,Chinese Restaurant,Park,Café,Italian Restaurant,Pizza Place,Bakery,Pub
15,Downtown Toronto,4,Café,Vietnamese Restaurant,Mexican Restaurant,Dumpling Restaurant,Bar,Chinese Restaurant,Coffee Shop,Vegetarian / Vegan Restaurant,Grocery Store,Bubble Tea Shop


### The most common venues in this cluster are eating places so we could name it 'Restaurants'

In [133]:
down_tor_merged.loc[(down_tor_merged['Cluster Labels']) == 4,'Cluster Labels']= 'Restaurants'
down_tor_merged.loc[down_tor_merged['Cluster Labels'] == 'Restaurants', down_tor_merged.columns[[1] + list(range(5, down_tor_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Downtown Toronto,Restaurants,Café,Bakery,Bookstore,Coffee Shop,Restaurant,Gym,Japanese Restaurant,Italian Restaurant,Bar,Ramen Restaurant
13,Downtown Toronto,Restaurants,Restaurant,Coffee Shop,Pharmacy,Chinese Restaurant,Park,Café,Italian Restaurant,Pizza Place,Bakery,Pub
15,Downtown Toronto,Restaurants,Café,Vietnamese Restaurant,Mexican Restaurant,Dumpling Restaurant,Bar,Chinese Restaurant,Coffee Shop,Vegetarian / Vegan Restaurant,Grocery Store,Bubble Tea Shop
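
The renaming cells above overwrite the numeric labels one cluster at a time. As a sketch of an alternative on made-up data, a single mapping can add the readable names in a separate column while keeping the integers. The frame `toy`, the column 'Cluster Name' and the name 'Second Cluster' are assumptions for illustration.

```python
import pandas as pd

toy = pd.DataFrame({'Neighborhood': ['A', 'B', 'C'], 'Cluster Labels': [0, 1, 0]})

# Hypothetical names; in the notebook they would be chosen after inspecting each cluster
cluster_names = {0: 'Coffee Shops', 1: 'Second Cluster'}

# Keep the numeric labels and add the readable names alongside them
toy['Cluster Name'] = toy['Cluster Labels'].map(cluster_names)
print(toy)
```

Keeping the integer column means the map cell, which indexes colors by the numeric label, would still work if it were re-run after the renaming.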
