# Segmenting and Clustering Neighborhoods in the city of Toronto, Canada

In this assignment, I explore, segment, and cluster the neighborhoods in the city of Toronto.

For the Toronto neighborhood data, a Wikipedia page lists all the information we need to explore and cluster the neighborhoods. We scrape that page, clean and wrangle the data, and read it into a pandas dataframe so that it is in a structured format.

#### Installing Required Packages

In [13]:
!conda install beautifulsoup4 --yes
!conda install lxml --yes
!conda install requests --yes
print('Packages installed successfully')

Fetching package metadata ...........
Solving package specifications: .

# All requested packages already installed.
# packages in environment at /opt/conda/envs/DSX-Python35:
#
beautifulsoup4            4.6.3                    py35_0  
Fetching package metadata ...........
Solving package specifications: .

# All requested packages already installed.
# packages in environment at /opt/conda/envs/DSX-Python35:
#
lxml                      4.2.5            py35hefd8a0e_0  
Fetching package metadata ...........
Solving package specifications: .

# All requested packages already installed.
# packages in environment at /opt/conda/envs/DSX-Python35:
#
requests                  2.19.1                   py35_0  
Packages installed successfully


#### Importing all the required libraries

In [14]:
from bs4 import BeautifulSoup
import requests

import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # installs geopy; comment this line out if it is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes # installs folium; comment this line out if it is already installed
import folium # map rendering library

print('Libraries imported.')

Fetching package metadata .............
Solving package specifications: .

# All requested packages already installed.
# packages in environment at /opt/conda/envs/DSX-Python35:
#
geopy                     1.17.0                     py_0    conda-forge
Fetching package metadata .............
Solving package specifications: .

# All requested packages already installed.
# packages in environment at /opt/conda/envs/DSX-Python35:
#
folium                    0.5.0                      py_0    conda-forge
Libraries imported.


#### Using requests.get() to fetch the required web page and BeautifulSoup to parse it

In [15]:
source = requests.get('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M').text
soup = BeautifulSoup(source,'lxml')


#### Converting the table to a Pandas Dataframe

In [16]:
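table = soup.find('table')
table_rows = table.find_all('tr')
rows = []
for tr in table_rows:
    # keep the non-empty text of every data cell in this row
    cells = [td.text.strip() for td in tr.find_all('td') if td.text.strip()]
    if cells:  # the header row has no <td> cells, so skip it instead of adding an empty row
        rows.append(cells)
df = pd.DataFrame(rows, columns=["Postcode", "Borough", "Neighborhood"])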


In [17]:
df1=df.groupby(['Postcode','Borough'])['Neighborhood'].apply(','.join).reset_index() 


#### Dropping rows where Borough has a value of 'Not assigned'

In [18]:
df2 =df1.drop(df1[df1.Borough == 'Not assigned'].index).reset_index()
del df2['index']


#### If a cell has a borough but a 'Not assigned' neighborhood, setting the neighborhood to the borough name

In [19]:
df2.loc[df2['Neighborhood'] 
        == 'Not assigned', 'Neighborhood'] = df2['Borough']
df2.head()

Unnamed: 0,Postcode,Borough,Neighborhood
0,M1B,Scarborough,"Rouge,Malvern"
1,M1C,Scarborough,"Highland Creek,Rouge Hill,Port Union"
2,M1E,Scarborough,"Guildwood,Morningside,West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae


In [20]:
#Checking
df_chk = df2.loc[df2['Postcode']=="M7A"]
df_chk

Unnamed: 0,Postcode,Borough,Neighborhood
85,M7A,Queen's Park,Queen's Park
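
The cleaning steps above (drop unassigned boroughs, combine neighborhoods per postcode, fall back to the borough name) can be sketched on a small hand-made table. The rows below are invented for illustration and are not taken from the Wikipedia page; the sketch also drops unassigned rows before grouping, which is a slightly different order from the cells above.

```python
import pandas as pd

# Hypothetical rows, shaped like the scraped table
raw = pd.DataFrame(
    [
        ["M1A", "Not assigned", "Not assigned"],
        ["M5A", "Downtown Toronto", "Harbourfront"],
        ["M5A", "Downtown Toronto", "Regent Park"],
        ["M7A", "Queen's Park", "Not assigned"],
    ],
    columns=["Postcode", "Borough", "Neighborhood"],
)

# Drop rows with no borough, then combine neighborhoods per postcode
assigned = raw[raw["Borough"] != "Not assigned"]
combined = (
    assigned.groupby(["Postcode", "Borough"])["Neighborhood"]
    .apply(",".join)
    .reset_index()
)

# Where the neighborhood is still unassigned, reuse the borough name
mask = combined["Neighborhood"] == "Not assigned"
combined.loc[mask, "Neighborhood"] = combined.loc[mask, "Borough"]

print(combined)
```

Selecting the borough values with the same boolean mask on both sides keeps the assignment explicit about which rows are touched.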


#### Printing the shape (rows, columns) of the dataframe

In [21]:
df2.shape

(103, 3)

#### Downloading the Geospatial data which contains the geographical coordinates of each postal code and saving it as a csv file

In [22]:
!wget -q -O 'Geospatial_data.csv' https://cocl.us/Geospatial_data
print('Data downloaded!')

Data downloaded!


#### Reading the csv file into Pandas Dataframe

In [23]:
Geospatial_df = pd.read_csv('Geospatial_data.csv')
Geospatial_df.rename(columns={'Postal Code':'Postcode'}, inplace=True)
Geospatial_df.head()

Unnamed: 0,Postcode,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [16]:
Geospatial_df.shape

(103, 3)

#### Merging the latitudes and longitudes from the Geospatial dataframe into df3

In [24]:
df3 = pd.merge(df2, Geospatial_df, on='Postcode')
df3.head()

Unnamed: 0,Postcode,Borough,Neighborhood,Latitude,Longitude
0,M1B,Scarborough,"Rouge,Malvern",43.806686,-79.194353
1,M1C,Scarborough,"Highland Creek,Rouge Hill,Port Union",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood,Morningside,West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476


In [18]:
df3.shape

(103, 5)
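
`pd.merge` performs an inner join by default, so a postcode missing from either table would vanish silently; here the unchanged row count of 103 suggests nothing was lost. As a hedged sketch of a more explicit check, the example below uses a left join with `indicator=True` and `validate="one_to_one"` on made-up data (the postcode `M9Z` and its row are hypothetical).

```python
import pandas as pd

# Hypothetical data: one postcode has no coordinates
neighborhoods = pd.DataFrame({
    "Postcode": ["M1B", "M1C", "M9Z"],
    "Borough": ["Scarborough", "Scarborough", "Etobicoke"],
})
coords = pd.DataFrame({
    "Postcode": ["M1B", "M1C"],
    "Latitude": [43.806686, 43.784535],
    "Longitude": [-79.194353, -79.160497],
})

# Left join keeps every neighborhood row; validate raises if a postcode repeats
merged = pd.merge(
    neighborhoods, coords, on="Postcode",
    how="left", validate="one_to_one", indicator=True,
)

# Postcodes that found no match in the coordinates table
unmatched = merged.loc[merged["_merge"] == "left_only", "Postcode"].tolist()
print(unmatched)
```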

#### Keeping only boroughs whose name contains the word Toronto, to explore and cluster those neighborhoods

In [25]:
df_toronto = df3[df3['Borough'].str.contains("Toronto")].reset_index(drop=True)
df_toronto.head()

Unnamed: 0,Postcode,Borough,Neighborhood,Latitude,Longitude
0,M4E,East Toronto,The Beaches,43.676357,-79.293031
1,M4K,East Toronto,"The Danforth West,Riverdale",43.679557,-79.352188
2,M4L,East Toronto,"The Beaches West,India Bazaar",43.668999,-79.315572
3,M4M,East Toronto,Studio District,43.659526,-79.340923
4,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879


In [26]:
df_toronto['Borough'].unique()

array(['East Toronto', 'Central Toronto', 'Downtown Toronto',
       'West Toronto'], dtype=object)

#### Getting the geographical coordinates of Toronto

In [27]:
address = 'Toronto, ON'

geolocator = Nominatim(user_agent="toronto_explorer")  # Nominatim expects an identifying user agent; this name is arbitrary
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.653963, -79.387207.


##### Creating map of Toronto

In [28]:
# create map of Toronto using latitude and longitude values
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, borough, neighborhood in zip(df_toronto['Latitude'], df_toronto['Longitude'], df_toronto['Borough'], df_toronto['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

In [29]:
# The code was removed by Watson Studio for sharing.

Your credentials:
CLIENT_ID: <redacted>
CLIENT_SECRET: <redacted>


In [30]:
LIMIT =500

def getNearbyVenues(names, latitudes, longitudes, radius=1000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
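
The list comprehension in `getNearbyVenues` assumes that every response contains at least one group and that every venue has at least one category. A minimal sketch of more defensive parsing follows; the helper name `extract_venues` and the sample payload are illustrative assumptions that mimic only the fields the function above reads, not a real API response.

```python
def extract_venues(payload, name, lat, lng):
    """Flatten an explore-style payload into row tuples, tolerating missing pieces."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        categories = venue.get("categories") or []
        # Use None when a venue carries no category instead of raising IndexError
        category = categories[0]["name"] if categories else None
        rows.append((name, lat, lng, venue.get("name"),
                     location.get("lat"), location.get("lng"), category))
    return rows


# Hypothetical payload with one categorized and one uncategorized venue
sample = {
    "response": {"groups": [{"items": [
        {"venue": {"name": "Venue A",
                   "location": {"lat": 43.68, "lng": -79.29},
                   "categories": [{"name": "Bakery"}]}},
        {"venue": {"name": "Venue B",
                   "location": {"lat": 43.67, "lng": -79.28},
                   "categories": []}},
    ]}]}
}

rows = extract_venues(sample, "The Beaches", 43.676357, -79.293031)
empty = extract_venues({"response": {}}, "The Beaches", 43.676357, -79.293031)
print(rows)
print(empty)
```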

#### Utilizing the Foursquare API to explore the neighborhoods and segment them.

#### Getting nearby venues (up to `LIMIT` per request) for all the neighborhoods in Toronto within a radius of 1000 meters.

In [31]:
toronto_venues = getNearbyVenues(names=df_toronto['Neighborhood'],
                                   latitudes=df_toronto['Latitude'],
                                   longitudes=df_toronto['Longitude']
                                  )

The Beaches
The Danforth West,Riverdale
The Beaches West,India Bazaar
Studio District
Lawrence Park
Davisville North
North Toronto West
Davisville
Moore Park,Summerhill East
Deer Park,Forest Hill SE,Rathnelly,South Hill,Summerhill West
Rosedale
Cabbagetown,St. James Town
Church and Wellesley
Harbourfront,Regent Park
Ryerson,Garden District
St. James Town
Berczy Park
Central Bay Street
Adelaide,King,Richmond
Harbourfront East,Toronto Islands,Union Station
Design Exchange,Toronto Dominion Centre
Commerce Court,Victoria Hotel
Roselawn
Forest Hill North,Forest Hill West
The Annex,North Midtown,Yorkville
Harbord,University of Toronto
Chinatown,Grange Park,Kensington Market
CN Tower,Bathurst Quay,Island airport,Harbourfront West,King and Spadina,Railway Lands,South Niagara
Stn A PO Boxes 25 The Esplanade
First Canadian Place,Underground city
Christie
Dovercourt Village,Dufferin
Little Portugal,Trinity
Brockton,Exhibition Place,Parkdale Village
High Park,The Junction South
Parkdale,Roncesvall

In [33]:
print(toronto_venues.shape)
toronto_venues.head()

(3079, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,The Beaches,43.676357,-79.293031,Beaches Bake Shop,43.680363,-79.289692,Bakery
1,The Beaches,43.676357,-79.293031,Tori's Bakeshop,43.672114,-79.290331,Vegetarian / Vegan Restaurant
2,The Beaches,43.676357,-79.293031,The Fox Theatre,43.672801,-79.287272,Indie Movie Theater
3,The Beaches,43.676357,-79.293031,Ed's Real Scoop,43.67263,-79.287993,Ice Cream Shop
4,The Beaches,43.676357,-79.293031,The Beech Tree,43.680493,-79.288846,Gastropub


#### Assigning General Categories to all the venue categories

In [34]:
feat_name_list1 = ['Bakery',
'Vegetarian / Vegan Restaurant',
'Ice Cream Shop',
'Gastropub',
'Bagel Shop',
'Breakfast Spot',
'Coffee Shop',
'French Restaurant',
'Pub',
'Mexican Restaurant',
'Bar',
'Tea Room',
'Japanese Restaurant',
'Juice Bar',
'Caribbean Restaurant',
'Cupcake Shop',
'Diner',
'Ramen Restaurant',
'Indian Restaurant',
'Chocolate Shop',
'Sandwich Place',
'Thai Restaurant',
'Burger Joint',
'Café',
'Asian Restaurant',
'Beer Store',
'Pizza Place',
'Greek Restaurant',
'Restaurant',
'Italian Restaurant',
'Brewery',
'Dessert Shop',
'Bubble Tea Shop',
'BBQ Joint',
'Tapas Restaurant',
'Donut Shop',
'American Restaurant',
'Cuban Restaurant',
'Turkish Restaurant',
'New American Restaurant',
'Sushi Restaurant',
'Portuguese Restaurant',
'Fish & Chips Shop',
'Fast Food Restaurant',
'Falafel Restaurant',
'Chinese Restaurant',
'Pakistani Restaurant',
'Burrito Place',
'Steakhouse',
'Snack Place',
'Indian Chinese Restaurant',
'Middle Eastern Restaurant',
'Taco Place',
'Comfort Food Restaurant',
'Hotel',
'Latin American Restaurant',
'Dive Bar',
'Fried Chicken Joint',
'Vietnamese Restaurant',
'Seafood Restaurant',
'Hotel Bar',
'Food & Drink Shop',
'Wine Bar',
'Deli / Bodega',
'Salad Place',
'Smoothie Shop',
'Candy Store',
'Buffet',
'Indonesian Restaurant',
'Wings Joint',
'Cantonese Restaurant',
'Modern European Restaurant',
'German Restaurant',
'Filipino Restaurant',
'Pie Shop',
'Taiwanese Restaurant',
'Piano Bar',
'Gay Bar',
'Ethiopian Restaurant',
'Sake Bar',
'Afghan Restaurant',
'Persian Restaurant',
'Mediterranean Restaurant',
'Food Truck',
'Beer Bar',
'Cocktail Bar',
'Bistro',
'Belgian Restaurant',
'Food Court',
'Noodle House',
'Brazilian Restaurant',
'Gluten-free Restaurant',
'Irish Pub',
'Jewish Restaurant',
'Korean Restaurant',
'Eastern European Restaurant',
'Dumpling Restaurant',
'Doner Restaurant',
'Arepa Restaurant',
'South American Restaurant',
'Malay Restaurant',
'Mac & Cheese Joint',
'Southern / Soul Food Restaurant',
'Soup Place',
'Whisky Bar',
'Tibetan Restaurant',
'Hawaiian Restaurant',
'Beach Bar',
'Cajun / Creole Restaurant',
'Food',
'Frozen Yogurt Shop']
                
feat_name_list2 = ['Indie Movie Theater',
'Jazz Club',
'Concert Hall',
'Dance Studio',
'Indie Theater',
'Movie Theater',
'Rock Club',
'Gaming Cafe',
'General Entertainment',
'Theater',
'Performing Arts Venue',
'Hobby Shop',
'Nightclub',
'Video Game Store',
'Video Store',
'Event Space',
'Design Studio',
'Art Gallery',
'Karaoke Bar',
'Art Museum',
'Museum',
'Exhibit',
'Opera House',
'History Museum',
'Music Venue',
'Pool Hall',
'Street Art',
'Comedy Club',
'Amphitheater',
'Arts & Entertainment',
]
feat_name_list3 = ['Baseball Field',
'Sporting Goods Shop',
'Track',
'Skating Rink',
'Tennis Court',
'Athletics & Sports',
'Basketball Stadium',
'Baseball Stadium',
'Stadium',
'Soccer Stadium',
]
feat_name_list4 = ['College Quad',
'College Gym',
'Library',
'University',
'Music School',
'School',
'College Arts Building',
'College Lab',
]
feat_name_list5 =['Yoga Studio',
'Gym',
'Climbing Gym',
'Gym / Fitness Center',
'Gym Pool',
'Pilates Studio',
'Martial Arts Dojo',
'Rock Climbing Spot',
]
feat_name_list6=['Nail Salon',
'Salon / Barbershop',
'Spa',
'Health & Beauty Service',
'Bridal Shop',
]
feat_name_list7=['Park',
'Beach',
'Scenic Lookout',
'Tree',
'Pool',
'Skate Park',
'Garden Center',
'Harbor / Marina',
'Garden',
'Playground',
'Farm',
'Historic Site',
'Monument / Landmark',
'Fountain',
'Lake',
'Aquarium',
'Castle',
'Sculpture Garden',
'Other Great Outdoors',
'Zoo',
'River',
]
feat_name_list8 = ['Toy / Game Store',
'Supermarket',
'Bookstore',
'Pharmacy',
'Bank',
'Grocery Store',
'Electronics Store',
'Pet Store',
'Clothing Store',
'Flower Shop',
's Store"',
'Jewelry Store',
'Shopping Mall',
'Cosmetics Shop',
'Health Food Store',
'Fruit & Vegetable Store',
'Liquor Store',
'Furniture / Home Store',
'Discount Store',
'Farmers Market',
'Comic Shop',
'Butcher',
'Smoke Shop',
'Fish Market',
'Stationery Store',
'Cheese Shop',
'Boutique',
'Antique Shop',
'Arts & Crafts Store',
'Convenience Store',
'Plaza',
'Gourmet Shop',
'Gift Shop',
'Sports Bar',
'Miscellaneous Shop',
's Store',
'Adult Boutique',
'Shoe Store',
'Animal Shelter',
'Department Store',
'Optical Shop',
'Tailor Shop',
'Souvlaki Shop',
'Lingerie Store',
'Music Store',
'Record Shop',
'Organic Grocery',
'Paper / Office Supplies Store',
'Other Repair Shop',
'Thrift / Vintage Store',
'Accessories Store',
'Flea Market',
'Hardware Store',
'Gas Station',
]
feat_name_list9 =['Bus Stop',
'Light Rail Station',
'Rental Car Location',
'Metro Station',
'Train Station',
'General Travel',
'Airport',
'Airport Lounge',
'Bus Line',
]
toronto_venues['Category'] ='Others'
toronto_venues.loc[toronto_venues['Venue Category'].isin(feat_name_list1),'Category']='Food and Dine'
toronto_venues.loc[toronto_venues['Venue Category'].isin(feat_name_list2),'Category']='Arts and Entertainment'
toronto_venues.loc[toronto_venues['Venue Category'].isin(feat_name_list3),'Category']='Athletics and Sports'
toronto_venues.loc[toronto_venues['Venue Category'].isin(feat_name_list4),'Category']='Educational'
toronto_venues.loc[toronto_venues['Venue Category'].isin(feat_name_list5),'Category']='Gym and Fitness Center'
toronto_venues.loc[toronto_venues['Venue Category'].isin(feat_name_list6),'Category']='Health and Beauty Service'
toronto_venues.loc[toronto_venues['Venue Category'].isin(feat_name_list7),'Category']='Recreation'
toronto_venues.loc[toronto_venues['Venue Category'].isin(feat_name_list8),'Category']='Stores and Utilities'
toronto_venues.loc[toronto_venues['Venue Category'].isin(feat_name_list9),'Category']='Transportation'
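
The nine `.loc[...isin(...)]` assignments above amount to a lookup from a specific venue category to a general one. As a sketch of the same idea in a more compact form, the example below inverts a dictionary of lists into a flat lookup and applies it with `Series.map`. The lists are shortened stand-ins, and `Lawyer` is a hypothetical category used only to show the `Others` fallback.

```python
import pandas as pd

# Shortened, illustrative versions of the category lists
general_categories = {
    "Food and Dine": ["Bakery", "Coffee Shop"],
    "Recreation": ["Park", "Beach"],
    "Transportation": ["Bus Stop"],
}

# Invert to a flat lookup: specific venue category -> general category
lookup = {
    specific: general
    for general, specifics in general_categories.items()
    for specific in specifics
}

venues = pd.DataFrame({"Venue Category": ["Bakery", "Park", "Bus Stop", "Lawyer"]})

# Unmapped categories become NaN, which is then replaced by 'Others'
venues["Category"] = venues["Venue Category"].map(lookup).fillna("Others")
print(venues)
```

With this layout, adding a general category means adding one dictionary entry rather than a new list plus a new assignment line.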


In [35]:
list(toronto_venues['Category'].unique())

['Food and Dine',
 'Arts and Entertainment',
 'Stores and Utilities',
 'Recreation',
 'Health and Beauty Service',
 'Others',
 'Athletics and Sports',
 'Gym and Fitness Center',
 'Transportation',
 'Educational']

In [36]:
toronto_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,Category
0,The Beaches,43.676357,-79.293031,Beaches Bake Shop,43.680363,-79.289692,Bakery,Food and Dine
1,The Beaches,43.676357,-79.293031,Tori's Bakeshop,43.672114,-79.290331,Vegetarian / Vegan Restaurant,Food and Dine
2,The Beaches,43.676357,-79.293031,The Fox Theatre,43.672801,-79.287272,Indie Movie Theater,Arts and Entertainment
3,The Beaches,43.676357,-79.293031,Ed's Real Scoop,43.67263,-79.287993,Ice Cream Shop,Food and Dine
4,The Beaches,43.676357,-79.293031,The Beech Tree,43.680493,-79.288846,Gastropub,Food and Dine


#### Getting details about other venues such as hospitals and schools

In [37]:
LIMIT =500
search_query ='Hospital'
def getNearbyVenues(names, latitudes, longitudes, radius=1000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
               
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}&query={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT,
            search_query)
            
        # make the GET request
        results = requests.get(url).json()["response"]['venues']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['name'], 
            v['location']['lat'], 
            v['location']['lng']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude']
    
    return(nearby_venues)

In [38]:
toronto_hospitals = getNearbyVenues(names=df_toronto['Neighborhood'],
                                   latitudes=df_toronto['Latitude'],
                                   longitudes=df_toronto['Longitude']
                                  )

The Beaches
The Danforth West,Riverdale
The Beaches West,India Bazaar
Studio District
Lawrence Park
Davisville North
North Toronto West
Davisville
Moore Park,Summerhill East
Deer Park,Forest Hill SE,Rathnelly,South Hill,Summerhill West
Rosedale
Cabbagetown,St. James Town
Church and Wellesley
Harbourfront,Regent Park
Ryerson,Garden District
St. James Town
Berczy Park
Central Bay Street
Adelaide,King,Richmond
Harbourfront East,Toronto Islands,Union Station
Design Exchange,Toronto Dominion Centre
Commerce Court,Victoria Hotel
Roselawn
Forest Hill North,Forest Hill West
The Annex,North Midtown,Yorkville
Harbord,University of Toronto
Chinatown,Grange Park,Kensington Market
CN Tower,Bathurst Quay,Island airport,Harbourfront West,King and Spadina,Railway Lands,South Niagara
Stn A PO Boxes 25 The Esplanade
First Canadian Place,Underground city
Christie
Dovercourt Village,Dufferin
Little Portugal,Trinity
Brockton,Exhibition Place,Parkdale Village
High Park,The Junction South
Parkdale,Roncesvall

In [39]:
toronto_hospitals['Category'] = 'Hospitals'
toronto_hospitals.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Category
0,The Beaches,43.676357,-79.293031,Beaches Animal Hospital,43.673449,-79.283777,Hospitals
1,The Beaches,43.676357,-79.293031,Beaches Animal Hospital,43.673454,-79.283776,Hospitals
2,The Beaches,43.676357,-79.293031,Boardwalk Animal Hospital - In The Beaches,43.671459,-79.293374,Hospitals
3,The Beaches,43.676357,-79.293031,Boardwalk Animal Hospital,43.670814,-79.296732,Hospitals
4,The Beaches,43.676357,-79.293031,Kingston Road Animalhosp,43.68076,-79.284721,Hospitals


In [25]:
list(toronto_hospitals['Venue'].unique())

['Beaches Animal Hospital',
 'Boardwalk Animal Hospital - In The Beaches',
 'Boardwalk Animal Hospital',
 'Kingston Road Animalhosp',
 'Blue Cross Animal Hospital',
 'Danforth Animal Hospital',
 'Riverdale Animal Hospital',
 'Banks Animal Hospital',
 'Ashbridges Bay Animal Hospital',
 'Blacks Veterinary Hospital',
 'Toronto Veterinary Hospital',
 'Bay Dog Hospital',
 'Leslieville Animal Hospital',
 'Emergency Room: Sunnybrook Hospital',
 'Sunnybrook Hospital Critical Care Unit',
 'Neurology Unit: Sunnybrook Hospital',
 'Lawrence Park Animal Hospital',
 'K Wing: Sunnybrook Hospital',
 'Sunnybrook Health Sciences Centre',
 'M-wing: Sunnybrook',
 'Second Cup',
 'FUS Lab: Sunnybrook',
 'T-wing: Sunnybrook',
 'D Wing: Sunnybrook',
 'Mt. Pleasant-Davisville Veterinary Hospital',
 'Davisville Park Animal Hospital',
 'Yonge Street Animal Hospital',
 'Tim Hortons',
 'McGilvray Veterinary Hospital',
 'Usher Animal Hospital',
 'VIVIFY Hospitality Growth Solutions',
 'Rosedale Animal Hospital',
 '

#### Dropping veterinary hospitals and cleaning up the hospital details

In [77]:
hosp_list = [ 'Sunnybrook Health Sciences Centre',
 'Toronto Grace Health Centre',
 "410 Health Centre (St. Michael's Hospital)",
 'The Hospital for Sick Children (SickKids)',
 'Toronto General Hospital',
 "Women's College Hospital",
 'Mount Sinai Hospital',
 'Princess Margaret Hospital Foundation',
 'St Michael Hospital',
 'Rouge Valley Hospital',
 'Council of Academic Hospitals of Ontario',
 'Mr Sub - Toronto Western Hospital',
 'Toronto Western Hospital',
 "St. Joseph's Health Centre"]
toronto_hosp = toronto_hospitals[toronto_hospitals['Venue'].isin(hosp_list)].reset_index()
toronto_hosp.head()

Unnamed: 0,index,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Category
0,20,Lawrence Park,43.72802,-79.38879,Sunnybrook Health Sciences Centre,43.721505,-79.37621,Hospitals
1,40,Rosedale,43.679563,-79.377529,Toronto Grace Health Centre,43.67064,-79.383263,Hospitals
2,41,"Cabbagetown,St. James Town",43.667967,-79.367675,410 Health Centre (St. Michael's Hospital),43.664832,-79.37405,Hospitals
3,43,Church and Wellesley,43.66586,-79.38316,The Hospital for Sick Children (SickKids),43.657499,-79.386512,Hospitals
4,44,Church and Wellesley,43.66586,-79.38316,Toronto General Hospital,43.658762,-79.388292,Hospitals


In [78]:
toronto_hosp = toronto_hosp.drop(['index'],axis =1)
toronto_hosp.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Category
0,Lawrence Park,43.72802,-79.38879,Sunnybrook Health Sciences Centre,43.721505,-79.37621,Hospitals
1,Rosedale,43.679563,-79.377529,Toronto Grace Health Centre,43.67064,-79.383263,Hospitals
2,"Cabbagetown,St. James Town",43.667967,-79.367675,410 Health Centre (St. Michael's Hospital),43.664832,-79.37405,Hospitals
3,Church and Wellesley,43.66586,-79.38316,The Hospital for Sick Children (SickKids),43.657499,-79.386512,Hospitals
4,Church and Wellesley,43.66586,-79.38316,Toronto General Hospital,43.658762,-79.388292,Hospitals
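
The notebook filters with a hand-picked `hosp_list`. As an alternative first pass (a swapped-in technique, not what the cells above do), a keyword filter can remove the obvious veterinary entries before any manual review. The venue names come from the output above; the keyword pattern is an assumption.

```python
import pandas as pd

venues = pd.DataFrame({"Venue": [
    "Beaches Animal Hospital",
    "Toronto Veterinary Hospital",
    "Sunnybrook Health Sciences Centre",
    "Toronto General Hospital",
    "Tim Hortons",
]})

# Case-insensitive match on veterinary keywords (the keyword list is a guess)
is_vet = venues["Venue"].str.contains(r"animal|veterinary|dog", case=False, regex=True)
candidates = venues[~is_vet].reset_index(drop=True)
print(candidates)
```

Unrelated matches such as `Tim Hortons` survive this filter, so a manual pass like the `hosp_list` above is still needed afterwards.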


### Getting details for schools in Toronto

In [41]:
LIMIT =100
search_query ='School'
def getNearbyVenues(names, latitudes, longitudes, radius=1000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
               
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}&query={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT,
            search_query)
            
        # make the GET request
        results = requests.get(url).json()["response"]['venues']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['name'], 
            v['location']['lat'], 
            v['location']['lng']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude']
    
    return(nearby_venues)
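
The hospital and school versions of `getNearbyVenues` differ only in the global `search_query` and `LIMIT`. One way to avoid redefining the function is to pass the query as an argument and let `urllib.parse.urlencode` assemble the query string. The sketch below only builds the URL and makes no request; `build_search_url` is a hypothetical helper, and the credential and version values are placeholders, since the real ones live in the hidden cell.

```python
from urllib.parse import parse_qs, urlencode, urlparse

SEARCH_ENDPOINT = "https://api.foursquare.com/v2/venues/search"


def build_search_url(client_id, client_secret, version, lat, lng, query,
                     radius=1000, limit=100):
    """Return the search URL for one neighborhood and one query term."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
        "query": query,
    }
    # urlencode escapes values such as spaces or commas in the query string
    return SEARCH_ENDPOINT + "?" + urlencode(params)


# Placeholder credentials and version string, for illustration only
url = build_search_url("ID", "SECRET", "20180605", 43.676357, -79.293031, "School")

# Round-trip the query string to confirm the parameters survive encoding
parsed = parse_qs(urlparse(url).query)
print(parsed["query"], parsed["ll"])
```

The same helper could then serve both searches by calling it once with `"Hospital"` and once with `"School"`.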

In [42]:
toronto_schools = getNearbyVenues(names=df_toronto['Neighborhood'],
                                   latitudes=df_toronto['Latitude'],
                                   longitudes=df_toronto['Longitude']
                                  )

The Beaches
The Danforth West,Riverdale
The Beaches West,India Bazaar
Studio District
Lawrence Park
Davisville North
North Toronto West
Davisville
Moore Park,Summerhill East
Deer Park,Forest Hill SE,Rathnelly,South Hill,Summerhill West
Rosedale
Cabbagetown,St. James Town
Church and Wellesley
Harbourfront,Regent Park
Ryerson,Garden District
St. James Town
Berczy Park
Central Bay Street
Adelaide,King,Richmond
Harbourfront East,Toronto Islands,Union Station
Design Exchange,Toronto Dominion Centre
Commerce Court,Victoria Hotel
Roselawn
Forest Hill North,Forest Hill West
The Annex,North Midtown,Yorkville
Harbord,University of Toronto
Chinatown,Grange Park,Kensington Market
CN Tower,Bathurst Quay,Island airport,Harbourfront West,King and Spadina,Railway Lands,South Niagara
Stn A PO Boxes 25 The Esplanade
First Canadian Place,Underground city
Christie
Dovercourt Village,Dufferin
Little Portugal,Trinity
Brockton,Exhibition Place,Parkdale Village
High Park,The Junction South
Parkdale,Roncesvall

In [43]:
toronto_schools['Category'] = 'Educational'
toronto_schools.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Category
0,The Beaches,43.676357,-79.293031,St.John Catholic School,43.680676,-79.294542,Educational
1,The Beaches,43.676357,-79.293031,Beach Swim School,43.682231,-79.28935,Educational
2,The Beaches,43.676357,-79.293031,Balmy Beach School,43.676199,-79.290134,Educational
3,The Beaches,43.676357,-79.293031,Adam Beck Junior Public School,43.683256,-79.288636,Educational
4,The Beaches,43.676357,-79.293031,Toronto Theatre Dance School,43.680833,-79.291376,Educational


In [79]:
toronto_new = pd.concat([toronto_venues,toronto_hosp,toronto_schools],axis =0)
toronto_new.head()


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Category
0,The Beaches,43.676357,-79.293031,Beaches Bake Shop,43.680363,-79.289692,Food and Dine
1,The Beaches,43.676357,-79.293031,Tori's Bakeshop,43.672114,-79.290331,Food and Dine
2,The Beaches,43.676357,-79.293031,The Fox Theatre,43.672801,-79.287272,Arts and Entertainment
3,The Beaches,43.676357,-79.293031,Ed's Real Scoop,43.67263,-79.287993,Food and Dine
4,The Beaches,43.676357,-79.293031,The Beech Tree,43.680493,-79.288846,Food and Dine


In [80]:
toronto_sorted = toronto_new.sort_values(by = 'Neighborhood').reset_index()
toronto_sorted = toronto_sorted.drop(['index'],axis =1)
toronto_sorted.head()
#toronto_sorted.tail()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Category
0,"Adelaide,King,Richmond",43.650571,-79.384568,Sam James Coffee Bar (SJCB),43.647881,-79.384332,Food and Dine
1,"Adelaide,King,Richmond",43.650571,-79.384568,Old City Hall,43.652009,-79.381744,Recreation
2,"Adelaide,King,Richmond",43.650571,-79.384568,Momofuku Noodle Bar,43.649366,-79.386217,Food and Dine
3,"Adelaide,King,Richmond",43.650571,-79.384568,JaBistro,43.649687,-79.38809,Food and Dine
4,"Adelaide,King,Richmond",43.650571,-79.384568,Adelaide Club Toronto,43.649279,-79.381921,Gym and Fitness Center


In [81]:
# one hot encoding
toronto_onehot = pd.get_dummies(toronto_sorted[['Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
toronto_onehot['Neighborhood'] = toronto_sorted['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.head()

Unnamed: 0,Neighborhood,Arts and Entertainment,Athletics and Sports,Educational,Food and Dine,Gym and Fitness Center,Health and Beauty Service,Hospitals,Others,Recreation,Stores and Utilities,Transportation
0,"Adelaide,King,Richmond",0,0,0,1,0,0,0,0,0,0,0
1,"Adelaide,King,Richmond",0,0,0,0,0,0,0,0,1,0,0
2,"Adelaide,King,Richmond",0,0,0,1,0,0,0,0,0,0,0
3,"Adelaide,King,Richmond",0,0,0,1,0,0,0,0,0,0,0
4,"Adelaide,King,Richmond",0,0,0,0,1,0,0,0,0,0,0


In [82]:
toronto_onehot_new=toronto_onehot.set_index('Neighborhood')
toronto_onehot_new.head()

Neighborhood,Arts and Entertainment,Athletics and Sports,Educational,Food and Dine,Gym and Fitness Center,Health and Beauty Service,Hospitals,Others,Recreation,Stores and Utilities,Transportation
"Adelaide,King,Richmond",0,0,0,1,0,0,0,0,0,0,0
"Adelaide,King,Richmond",0,0,0,0,0,0,0,0,1,0,0
"Adelaide,King,Richmond",0,0,0,1,0,0,0,0,0,0,0
"Adelaide,King,Richmond",0,0,0,1,0,0,0,0,0,0,0
"Adelaide,King,Richmond",0,0,0,0,1,0,0,0,0,0,0


In [83]:
toronto_onehot_new = toronto_onehot_new.groupby('Neighborhood').sum()
toronto_onehot_new.head()

Neighborhood,Arts and Entertainment,Athletics and Sports,Educational,Food and Dine,Gym and Fitness Center,Health and Beauty Service,Hospitals,Others,Recreation,Stores and Utilities,Transportation
"Adelaide,King,Richmond",13,0,51,60,4,2,6,4,1,14,1
Berczy Park,6,2,23,62,2,0,0,5,6,16,1
"Brockton,Exhibition Place,Parkdale Village",5,4,13,66,3,1,0,4,2,15,0
Business reply mail Processing Centre969 Eastern,0,0,4,30,1,1,0,0,8,6,0
"CN Tower,Bathurst Quay,Island airport,Harbourfront West,King and Spadina,Railway Lands,South Niagara",1,1,4,4,0,0,0,2,6,0,2


In [55]:
toronto_onehot_new.shape


(38, 11)
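
One-hot encoding followed by `groupby(...).sum()` yields the number of venues per general category in each neighborhood. As a sketch on invented rows, the example below shows that `pd.crosstab` produces the same count table in a single call; the neighborhood names `A` and `B` are placeholders.

```python
import pandas as pd

venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "B", "B"],
    "Category": ["Food and Dine", "Food and Dine", "Recreation",
                 "Educational", "Food and Dine"],
})

# Route used in the notebook: one-hot encode, then sum per neighborhood
onehot = pd.get_dummies(venues[["Category"]], prefix="", prefix_sep="")
onehot["Neighborhood"] = venues["Neighborhood"]
counts_onehot = onehot.groupby("Neighborhood").sum()

# Shortcut: cross-tabulate neighborhood against category
counts_crosstab = pd.crosstab(venues["Neighborhood"], venues["Category"])

print(counts_onehot)
print(counts_crosstab)
```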

#### Running *k*-means to cluster the neighborhoods into 5 clusters.

In [84]:
# set number of clusters
kclusters = 5

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_onehot_new)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([2, 0, 0, 1, 4, 1, 2, 2, 0, 2], dtype=int32)
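
The label numbers returned by k-means are arbitrary identifiers: only the grouping matters, not which group is called 0 or 4. The toy example below, with made-up venue counts for six neighborhoods and two features, illustrates this on data where the grouping is obvious. Passing `n_init=10` explicitly is an assumption made here so the sketch behaves the same across scikit-learn versions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up counts: three busy neighborhoods, then three quiet ones
X = np.array([
    [60, 5], [62, 6], [58, 4],
    [5, 1], [7, 2], [4, 0],
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
labels = kmeans.labels_

# Shift to 1-based cluster numbers for display, as done later with 1 + kmeans.labels_
clusters = 1 + labels
print(clusters)
print(kmeans.cluster_centers_)
```

Because the model is fit on raw counts, categories with large counts (such as food venues) dominate the distance computation.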

In [85]:
means_df = pd.DataFrame(kmeans.cluster_centers_)
means_df.columns = toronto_onehot_new.columns
means_df.index=['C1','C2','C3','C4','C5']
means_df['Facilities']=means_df.sum(axis = 1)
means_df.sort_values(axis=0,by=['Facilities'],ascending =False)

,Arts and Entertainment,Athletics and Sports,Educational,Food and Dine,Gym and Fitness Center,Health and Beauty Service,Hospitals,Others,Recreation,Stores and Utilities,Transportation,Facilities
C3,8.272727,0.727273,48.636364,68.090909,2.545455,0.909091,4.636364,3.727273,2.545455,12.272727,0.454545,152.818182
C1,5.090909,1.0,21.545455,70.818182,3.0,1.181818,4.440892e-16,2.454545,2.727273,13.272727,0.181818,121.272727
C4,4.0,1.666667,12.5,53.333333,2.5,1.166667,0.3333333,2.166667,7.666667,9.666667,1.333333,96.333333
C2,1.5,1.833333,15.5,30.333333,2.333333,0.666667,0.1666667,1.666667,6.0,6.666667,0.0,66.666667
C5,0.25,0.75,9.75,7.5,0.5,0.5,0.5,1.5,2.75,2.75,0.75,27.5
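
The value `4.440892e-16` in the `Hospitals` column for C1 is floating-point residue and is effectively zero. The sketch below uses a two-column excerpt of the table above (values copied from the output) to show how rounding cleans this up before ranking. The `Subtotal` column is an illustrative name and covers only the excerpt, not the full `Facilities` total.

```python
import pandas as pd

# Two-column excerpt of the centroid table (values copied from the output above).
centers = pd.DataFrame(
    {"Educational": [48.636364, 21.545455, 9.75],
     "Hospitals":   [4.636364, 4.440892e-16, 0.5]},
    index=["C3", "C1", "C5"],
)

# Rounding removes floating-point residue such as 4.440892e-16.
centers = centers.round(2)

# Rank clusters by the row sum of the excerpt (not the full 'Facilities' total).
centers["Subtotal"] = centers.sum(axis=1)
ranked = centers.sort_values("Subtotal", ascending=False)
print(ranked)
```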


In [86]:
#toronto_onehot_new = toronto_onehot_new.set_index('Neighborhood')
toronto_onehot_new['Cluster']=1+kmeans.labels_
toronto_onehot_new.head()

Neighborhood,Arts and Entertainment,Athletics and Sports,Educational,Food and Dine,Gym and Fitness Center,Health and Beauty Service,Hospitals,Others,Recreation,Stores and Utilities,Transportation,Cluster
"Adelaide,King,Richmond",13,0,51,60,4,2,6,4,1,14,1,3
Berczy Park,6,2,23,62,2,0,0,5,6,16,1,1
"Brockton,Exhibition Place,Parkdale Village",5,4,13,66,3,1,0,4,2,15,0,1
Business reply mail Processing Centre969 Eastern,0,0,4,30,1,1,0,0,8,6,0,2
"CN Tower,Bathurst Quay,Island airport,Harbourfront West,King and Spadina,Railway Lands,South Niagara",1,1,4,4,0,0,0,2,6,0,2,5


In [87]:
toronto_final = df_toronto.set_index('Neighborhood').join(toronto_onehot_new)
toronto_final.head()

Neighborhood,Postcode,Borough,Latitude,Longitude,Arts and Entertainment,Athletics and Sports,Educational,Food and Dine,Gym and Fitness Center,Health and Beauty Service,Hospitals,Others,Recreation,Stores and Utilities,Transportation,Cluster
The Beaches,M4E,East Toronto,43.676357,-79.293031,2,1,17,52,0,2,0,3,7,11,0,4
"The Danforth West,Riverdale",M4K,East Toronto,43.679557,-79.352188,2,0,16,71,2,4,0,3,3,15,0,1
"The Beaches West,India Bazaar",M4L,East Toronto,43.668999,-79.315572,3,1,7,44,3,1,0,3,10,8,3,4
Studio District,M4M,East Toronto,43.659526,-79.340923,1,0,11,74,5,0,0,5,2,13,0,1
Lawrence Park,M4N,Central Toronto,43.72802,-79.38879,0,0,11,2,1,0,1,1,1,1,0,5


In [88]:
toronto_final.reset_index(level=0, inplace=True)
toronto_final.head()

,Neighborhood,Postcode,Borough,Latitude,Longitude,Arts and Entertainment,Athletics and Sports,Educational,Food and Dine,Gym and Fitness Center,Health and Beauty Service,Hospitals,Others,Recreation,Stores and Utilities,Transportation,Cluster
0,The Beaches,M4E,East Toronto,43.676357,-79.293031,2,1,17,52,0,2,0,3,7,11,0,4
1,"The Danforth West,Riverdale",M4K,East Toronto,43.679557,-79.352188,2,0,16,71,2,4,0,3,3,15,0,1
2,"The Beaches West,India Bazaar",M4L,East Toronto,43.668999,-79.315572,3,1,7,44,3,1,0,3,10,8,3,4
3,Studio District,M4M,East Toronto,43.659526,-79.340923,1,0,11,74,5,0,0,5,2,13,0,1
4,Lawrence Park,M4N,Central Toronto,43.72802,-79.38879,0,0,11,2,1,0,1,1,1,1,0,5
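
One thing worth checking after the join is whether every neighborhood found a match. `DataFrame.join` performs a left join on the index by default, so a neighborhood with no venue data would be kept with `NaN` in the count and `Cluster` columns. The sketch below shows this on hypothetical data; the frames `locations` and `clusters` are illustrative stand-ins, not the notebook's variables.

```python
import pandas as pd

# Hypothetical location table and cluster table; "C" has no venue data.
locations = pd.DataFrame({
    "Neighborhood": ["A", "B", "C"],
    "Latitude": [43.65, 43.66, 43.67],
}).set_index("Neighborhood")
clusters = pd.DataFrame(
    {"Cluster": [3, 1]},
    index=pd.Index(["A", "B"], name="Neighborhood"),
)

# join() defaults to a left join on the index, so "C" is kept with NaN.
joined = locations.join(clusters)
missing = joined[joined["Cluster"].isna()]
print(missing.index.tolist())

# Dropping unmatched rows lets the cluster ids be stored as integers again.
joined_clean = joined.dropna(subset=["Cluster"]).astype({"Cluster": int})
print(joined_clean)
```

Integer cluster ids matter later on, because a float or `NaN` value cannot be used to index a list of colors.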


#### Visualize the resulting clusters

In [89]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add one circle marker per neighborhood, colored by its cluster
for lat, lon, poi, cluster in zip(toronto_final['Latitude'], toronto_final['Longitude'], toronto_final['Neighborhood'], toronto_final['Cluster']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### Examining Each Cluster

### Best cluster based on facilities is Cluster 3 (highest mean facility count, about 153 per neighborhood)

In [90]:
toronto_final[toronto_final['Cluster'] == 3]

,Neighborhood,Postcode,Borough,Latitude,Longitude,Arts and Entertainment,Athletics and Sports,Educational,Food and Dine,Gym and Fitness Center,Health and Beauty Service,Hospitals,Others,Recreation,Stores and Utilities,Transportation,Cluster
12,Church and Wellesley,M4Y,Downtown Toronto,43.66586,-79.38316,9,0,50,70,2,2,6,4,3,10,0,3
14,"Ryerson,Garden District",M5B,Downtown Toronto,43.657162,-79.378937,6,0,50,65,2,1,7,3,4,19,0,3
15,St. James Town,M5C,Downtown Toronto,43.651494,-79.375418,8,0,50,65,3,0,2,7,3,14,0,3
17,Central Bay Street,M5G,Downtown Toronto,43.657952,-79.387383,9,0,51,63,2,2,5,4,3,16,0,3
18,"Adelaide,King,Richmond",M5H,Downtown Toronto,43.650571,-79.384568,13,0,51,60,4,2,6,4,1,14,1,3
20,"Design Exchange,Toronto Dominion Centre",M5K,Downtown Toronto,43.647177,-79.381576,8,2,48,71,3,1,3,2,6,6,1,3
21,"Commerce Court,Victoria Hotel",M5L,Downtown Toronto,43.648198,-79.379817,7,2,49,70,3,0,3,5,3,9,1,3
25,"Harbord,University of Toronto",M5S,Downtown Toronto,43.662696,-79.400049,8,0,49,71,0,0,6,3,2,13,0,3
26,"Chinatown,Grange Park,Kensington Market",M5T,Downtown Toronto,43.653206,-79.400049,8,0,44,74,3,0,5,2,0,13,0,3
28,Stn A PO Boxes 25 The Esplanade,M5W,Downtown Toronto,43.646435,-79.374846,6,2,44,69,3,0,3,4,2,13,1,3


### Second-best cluster is Cluster 1

In [91]:
toronto_final[toronto_final['Cluster'] == 1]

,Neighborhood,Postcode,Borough,Latitude,Longitude,Arts and Entertainment,Athletics and Sports,Educational,Food and Dine,Gym and Fitness Center,Health and Beauty Service,Hospitals,Others,Recreation,Stores and Utilities,Transportation,Cluster
1,"The Danforth West,Riverdale",M4K,East Toronto,43.679557,-79.352188,2,0,16,71,2,4,0,3,3,15,0,1
3,Studio District,M4M,East Toronto,43.659526,-79.340923,1,0,11,74,5,0,0,5,2,13,0,1
5,Davisville North,M4P,Central Toronto,43.712751,-79.390197,2,3,23,74,5,2,0,1,1,12,0,1
7,Davisville,M4S,Central Toronto,43.704324,-79.38879,2,1,28,79,4,0,0,2,1,11,0,1
13,"Harbourfront,Regent Park",M5A,Downtown Toronto,43.65426,-79.360636,11,1,25,65,2,1,0,2,5,13,0,1
16,Berczy Park,M5E,Downtown Toronto,43.644771,-79.373306,6,2,23,62,2,0,0,5,6,16,1,1
24,"The Annex,North Midtown,Yorkville",M5R,Central Toronto,43.67271,-79.405678,12,0,33,66,3,0,0,1,4,12,0,1
30,Christie,M6G,Downtown Toronto,43.669542,-79.422564,6,0,23,74,2,2,0,0,3,12,0,1
32,"Little Portugal,Trinity",M6J,West Toronto,43.647927,-79.41975,5,0,26,80,2,1,0,3,1,8,0,1
33,"Brockton,Exhibition Place,Parkdale Village",M6K,West Toronto,43.636847,-79.428191,5,4,13,66,3,1,0,4,2,15,0,1


### Third-best cluster is Cluster 4

In [92]:
toronto_final[toronto_final['Cluster'] == 4]

,Neighborhood,Postcode,Borough,Latitude,Longitude,Arts and Entertainment,Athletics and Sports,Educational,Food and Dine,Gym and Fitness Center,Health and Beauty Service,Hospitals,Others,Recreation,Stores and Utilities,Transportation,Cluster
0,The Beaches,M4E,East Toronto,43.676357,-79.293031,2,1,17,52,0,2,0,3,7,11,0,4
2,"The Beaches West,India Bazaar",M4L,East Toronto,43.668999,-79.315572,3,1,7,44,3,1,0,3,10,8,3,4
9,"Deer Park,Forest Hill SE,Rathnelly,South Hill,...",M4V,Central Toronto,43.686412,-79.400049,1,1,14,50,6,2,0,2,4,8,1,4
19,"Harbourfront East,Toronto Islands,Union Station",M5J,Downtown Toronto,43.640816,-79.381752,12,6,12,57,3,1,1,2,12,6,1,4
35,"Parkdale,Roncesvalles",M6R,West Toronto,43.64896,-79.456325,4,1,11,61,1,0,1,3,8,18,2,4
36,"Runnymede,Swansea",M6S,West Toronto,43.651571,-79.48445,2,0,14,56,2,1,0,0,5,7,1,4


### Clusters 2 and 5 have the fewest facilities

In [93]:
toronto_final[toronto_final['Cluster'] == 2]

,Neighborhood,Postcode,Borough,Latitude,Longitude,Arts and Entertainment,Athletics and Sports,Educational,Food and Dine,Gym and Fitness Center,Health and Beauty Service,Hospitals,Others,Recreation,Stores and Utilities,Transportation,Cluster
6,North Toronto West,M4R,Central Toronto,43.715383,-79.405678,0,7,16,26,3,2,0,0,3,3,0,2
8,"Moore Park,Summerhill East",M4T,Central Toronto,43.689574,-79.38316,1,1,16,40,4,0,0,4,5,7,0,2
11,"Cabbagetown,St. James Town",M4X,Downtown Toronto,43.667967,-79.367675,5,0,24,25,0,0,1,1,6,3,0,2
23,"Forest Hill North,Forest Hill West",M5P,Central Toronto,43.696948,-79.411307,0,2,15,29,4,0,0,2,7,6,0,2
31,"Dovercourt Village,Dufferin",M6H,West Toronto,43.669005,-79.442259,3,1,18,32,2,1,0,3,7,15,0,2
37,Business reply mail Processing Centre969 Eastern,M7Y,East Toronto,43.662744,-79.321558,0,0,4,30,1,1,0,0,8,6,0,2


In [94]:
toronto_final[toronto_final['Cluster'] == 5]

,Neighborhood,Postcode,Borough,Latitude,Longitude,Arts and Entertainment,Athletics and Sports,Educational,Food and Dine,Gym and Fitness Center,Health and Beauty Service,Hospitals,Others,Recreation,Stores and Utilities,Transportation,Cluster
4,Lawrence Park,M4N,Central Toronto,43.72802,-79.38879,0,0,11,2,1,0,1,1,1,1,0,5
10,Rosedale,M4W,Downtown Toronto,43.679563,-79.377529,0,1,10,11,0,0,1,3,3,5,1,5
22,Roselawn,M5N,Central Toronto,43.711695,-79.416936,0,1,14,13,1,2,0,0,1,5,0,5
27,"CN Tower,Bathurst Quay,Island airport,Harbourf...",M5V,Downtown Toronto,43.628947,-79.39442,1,1,4,4,0,0,0,2,6,0,2,5
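
As an alternative to filtering one cluster at a time, the per-cluster comparison can be condensed into a single summary table. The sketch below is a possible approach on hypothetical data: `toy_final` stands in for `toronto_final`, and the output column names are illustrative.

```python
import pandas as pd

# Hypothetical stand-in for toronto_final with two count columns.
toy_final = pd.DataFrame({
    "Neighborhood": ["A", "B", "C", "D"],
    "Food and Dine": [60, 70, 4, 2],
    "Recreation": [1, 3, 6, 1],
    "Cluster": [3, 3, 5, 5],
})

# Number of neighborhoods and mean venue counts per cluster, in one table.
summary = toy_final.groupby("Cluster").agg(
    neighborhoods=("Neighborhood", "size"),
    food_mean=("Food and Dine", "mean"),
    recreation_mean=("Recreation", "mean"),
)
print(summary)
```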
