# Segmenting and Clustering Neighborhoods in Toronto - Task 3

Explore and cluster the neighborhoods in Toronto. You can decide to work with only boroughs that contain the word Toronto and then replicate the same analysis we did to the New York City data. It is up to you.

Just make sure:

- to add enough Markdown cells to explain what you decided to do and to report any observations you make.
- to generate maps to visualize your neighborhoods and how they cluster together.

#### Importing the required libraries and packages

In [1]:
import requests 
import pandas as pd
from pandas import json_normalize # top-level import; the pandas.io.json path is deprecated since pandas 1.0

from sklearn.cluster import KMeans # k-means for clustering

import matplotlib.cm as cm # for maps
import matplotlib.colors as colors
from sklearn.preprocessing import StandardScaler

In [2]:
!conda install -c conda-forge folium=0.5.0 --yes 
import folium 

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... done

# All requested packages already installed.



#### Reading the file created in Question 2

In [3]:
df = pd.read_csv('question2.csv')
df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.753259,-79.329656
1,M4A,North York,Victoria Village,43.725882,-79.315572
2,M5A,Downtown Toronto,"Regent Park , Harbourfront",43.65426,-79.360636
3,M6A,North York,"Lawrence Manor , Lawrence Heights",43.718518,-79.464763
4,M7A,Downtown Toronto,"Queen's Park , Ontario Provincial Government",43.662301,-79.389494


#### Working only with boroughs that contain the word Toronto

In [4]:
df_toronto = df[ df.Borough.str.contains('Toronto') ]
df_toronto

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
2,M5A,Downtown Toronto,"Regent Park , Harbourfront",43.65426,-79.360636
4,M7A,Downtown Toronto,"Queen's Park , Ontario Provincial Government",43.662301,-79.389494
9,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937
15,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418
19,M4E,East Toronto,The Beaches,43.676357,-79.293031
20,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306
24,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383
25,M6G,Downtown Toronto,Christie,43.669542,-79.422564
30,M5H,Downtown Toronto,"Richmond , Adelaide , King",43.650571,-79.384568
31,M6H,West Toronto,"Dufferin , Dovercourt Village",43.669005,-79.442259
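
The filter above relies on `Series.str.contains`, which treats its pattern as a regular expression by default and returns a missing value for missing boroughs. A minimal sketch on made-up rows (the `sample` frame is hypothetical), assuming a literal match that ignores missing values is wanted:

```python
import pandas as pd

# Hypothetical rows, including a missing borough to show how NaN is handled.
sample = pd.DataFrame({
    'Borough': ['North York', 'Downtown Toronto', None, 'East Toronto'],
    'Neighborhood': ['Parkwoods', 'Berczy Park', 'Unknown', 'The Beaches'],
})

# regex=False matches the literal text; na=False treats missing values as non-matches.
mask = sample['Borough'].str.contains('Toronto', regex=False, na=False)
sample_toronto = sample[mask].reset_index(drop=True)
print(sample_toronto)
```

Resetting the index is optional; it only makes the filtered rows count from 0 again.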


#### Finding the center coordinates of Toronto using geocoder

In [5]:
import geocoder
g = geocoder.google('Toronto, ON')
print(g.latlng)

None


In [6]:
url = 'https://maps.googleapis.com/maps/api/geocode/json'
params = {'sensor': 'false', 'address': 'Toronto, ON'}
r = requests.get(url, params=params)
results = r.json()

In [7]:
print(results['error_message'])

You must use an API key to authenticate each request to Google Maps Platform APIs. For additional information, please refer to http://g.co/dev/maps-no-account


In [46]:
#As of June 11, 2018, you must enable billing with a credit card and have a valid API key for all of your projects

In [22]:
latitude = 43.6532
longitude = -79.3832
print('Coordinates of Toronto are {}, {}.'.format(latitude, longitude))

Coordinates of Toronto are 43.6532, -79.3832.
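
Because the geocoding request fails without an API key, the coordinates above are entered by hand. One alternative, sketched here as an assumption rather than what this notebook does, is to center the map on the mean of the neighborhood coordinates. The function name `mean_center` is hypothetical, and the three points are copied from the tables above. Averaging latitude and longitude directly is only reasonable over a small area such as a single city.

```python
from statistics import mean

def mean_center(points):
    """Return the (latitude, longitude) average of a list of coordinate pairs."""
    lats = [lat for lat, _ in points]
    lngs = [lng for _, lng in points]
    return mean(lats), mean(lngs)

# Three Downtown Toronto rows copied from the tables above.
points = [
    (43.65426, -79.360636),   # Regent Park , Harbourfront
    (43.662301, -79.389494),  # Queen's Park , Ontario Provincial Government
    (43.657162, -79.378937),  # Garden District, Ryerson
]
center_lat, center_lng = mean_center(points)
print('Approximate center: {:.4f}, {:.4f}'.format(center_lat, center_lng))
```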


#### Creating a map of the Toronto neighborhoods

In [23]:
map_toronto = folium.Map(location=[latitude, longitude], zoom_start = 12)

# add markers to map
for lat, long, borough, neighborhood in zip(df_toronto['Latitude'], df_toronto['Longitude'], df_toronto['Borough'], df_toronto['Neighborhood']):
    label = '{} - {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, long],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

#### Loading Foursquare account credentials

In [9]:
CLIENT_ID = 'LVQ521WZ0ZVT1LNDBDI4FEC4TH5GTPFUJ3IQT21HTNJN4QUO'
CLIENT_SECRET = 'PUZJSH44LO0PW3S5BX4ISNSN1GBPBX0T55NBO4ELRQFTRAUA'
VERSION = '20180605'

#### Defining a function to fetch nearby venues for each neighborhood
(using IBM lab code)

In [10]:
def getNearbyVenues(names, latitudes, longitudes, radius=500, limit=100):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            limit)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
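
The function above indexes straight into the JSON response and into `categories[0]`, so a failed request or a venue without a category raises an exception. The sketch below shows one defensive way to flatten a response. `extract_venues` and the `fake_payload` dictionary are hypothetical, and the payload only mimics the fields that the function above reads.

```python
def extract_venues(payload, name, lat, lng):
    """Flatten one explore-style response into rows, skipping malformed items."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item.get('venue', {})
        location = venue.get('location', {})
        categories = venue.get('categories', [])
        if 'name' not in venue or 'lat' not in location or 'lng' not in location:
            continue  # skip items missing the fields we need
        # Fall back to a placeholder when a venue has no category.
        category = categories[0]['name'] if categories else 'Unknown'
        rows.append((name, lat, lng, venue['name'],
                     location['lat'], location['lng'], category))
    return rows

# Hypothetical payload: one complete venue and one venue without a category.
fake_payload = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Tandem Coffee',
               'location': {'lat': 43.653559, 'lng': -79.361809},
               'categories': [{'name': 'Coffee Shop'}]}},
    {'venue': {'name': 'Unnamed Spot',
               'location': {'lat': 43.65, 'lng': -79.36},
               'categories': []}},
]}]}}

rows = extract_venues(fake_payload, 'Regent Park , Harbourfront', 43.65426, -79.360636)
empty = extract_venues({'meta': {'code': 400}}, 'Nowhere', 0.0, 0.0)
print(rows)
print(empty)
```

An error response without a `response` key yields an empty list here, so a loop over many neighborhoods can continue instead of stopping at the first failure.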

#### Getting venues for the Toronto dataframe

In [11]:
venues_tor = getNearbyVenues(names=df_toronto['Neighborhood'],
                                 latitudes=df_toronto['Latitude'],
                                 longitudes=df_toronto['Longitude'])
venues_tor.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Regent Park , Harbourfront",43.65426,-79.360636,Roselle Desserts,43.653447,-79.362017,Bakery
1,"Regent Park , Harbourfront",43.65426,-79.360636,Tandem Coffee,43.653559,-79.361809,Coffee Shop
2,"Regent Park , Harbourfront",43.65426,-79.360636,Cooper Koo Family YMCA,43.653249,-79.358008,Distribution Center
3,"Regent Park , Harbourfront",43.65426,-79.360636,Body Blitz Spa East,43.654735,-79.359874,Spa
4,"Regent Park , Harbourfront",43.65426,-79.360636,Morning Glory Cafe,43.653947,-79.361149,Breakfast Spot


In [12]:
venues_tor.shape[0]


1622

#### Checking count of venues for each neighbourhood

In [13]:
venues_tor.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Berczy Park,55,55,55,55,55,55
"Brockton , Parkdale Village , Exhibition Place",23,23,23,23,23,23
Business reply mail Processing CentrE,16,16,16,16,16,16
"CN Tower , King and Spadina , Railway Lands , Harbourfront West , Bathurst Quay , South Niagara , Island airport",14,14,14,14,14,14
Central Bay Street,64,64,64,64,64,64
Christie,18,18,18,18,18,18
Church and Wellesley,75,75,75,75,75,75
"Commerce Court , Victoria Hotel",100,100,100,100,100,100
Davisville,34,34,34,34,34,34
Davisville North,7,7,7,7,7,7


In [14]:
toronto_onehot = pd.get_dummies(venues_tor[['Venue Category']], prefix="", prefix_sep="")
toronto_onehot.insert(0, 'Neighbourhood', venues_tor['Neighborhood'])

In [15]:
toronto_grouped = toronto_onehot.groupby('Neighbourhood').sum().reset_index()
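
The two cells above turn each venue category into an indicator column and then add the indicators up per neighbourhood, which gives a count of venues per category. A small sketch on made-up venues (the `toy` frame is hypothetical) shows the result. It also shows the `mean()` variant, which yields category frequencies instead of counts and is an alternative, not what this notebook uses.

```python
import pandas as pd

# Hypothetical venues: neighbourhood A has three, neighbourhood B has one.
toy = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'B'],
    'Venue Category': ['Bar', 'Bar', 'Park', 'Park'],
})

# One indicator column per category; dtype=int keeps the values as 0/1 integers.
onehot = pd.get_dummies(toy[['Venue Category']], prefix='', prefix_sep='', dtype=int)
onehot.insert(0, 'Neighbourhood', toy['Neighborhood'])

counts = onehot.groupby('Neighbourhood').sum().reset_index()   # venues per category
freqs = onehot.groupby('Neighbourhood').mean().reset_index()   # share of venues per category
print(counts)
print(freqs)
```

Counts favour neighbourhoods with many venues, while frequencies compare the mix of categories regardless of how many venues were returned.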

#### Top venues for each neighbourhood

(again using IBM lab code)

In [16]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]
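
To see what the helper returns, it can be applied to a single hand-built row. This is a sketch with made-up counts; the function is repeated so that the block runs on its own.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]  # drop the leading neighbourhood name
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# Hypothetical row shaped like one row of toronto_grouped.
row = pd.Series({'Neighbourhood': 'Example', 'Bakery': 1, 'Coffee Shop': 5, 'Park': 3})
top2 = list(return_most_common_venues(row, 2))
print(top2)
```

The result is a list of category names, not counts, ordered from most to least frequent.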

In [17]:
import numpy as np
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # ranks beyond the third use the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighbourhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Berczy Park,Coffee Shop,Farmers Market,Cheese Shop,Beer Bar,Café,Bakery,Restaurant,Italian Restaurant,Seafood Restaurant,Cocktail Bar
1,"Brockton , Parkdale Village , Exhibition Place",Café,Breakfast Spot,Coffee Shop,Gym,Bakery,Stadium,Burrito Place,Restaurant,Climbing Gym,Pet Store
2,Business reply mail Processing CentrE,Comic Shop,Auto Workshop,Light Rail Station,Smoke Shop,Brewery,Spa,Farmers Market,Fast Food Restaurant,Burrito Place,Butcher
3,"CN Tower , King and Spadina , Railway Lands , ...",Airport Lounge,Airport Service,Airport Terminal,Airport,Airport Food Court,Airport Gate,Harbor / Marina,Boutique,Boat or Ferry,Sculpture Garden
4,Central Bay Street,Coffee Shop,Italian Restaurant,Café,Sandwich Place,Ice Cream Shop,Sushi Restaurant,Japanese Restaurant,Bubble Tea Shop,Burger Joint,Salad Place
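
The column names above are built with a three-item suffix list and a `try`/`except` fallback, which is enough for ranks 1 to 10. A more general helper, sketched here under the hypothetical name `ordinal`, also handles 11th to 13th and 21st onward:

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

columns = ['Neighborhood'] + ['{} Most Common Venue'.format(ordinal(i)) for i in range(1, 11)]
print(columns[:4])
```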


#### K-means Clustering

In [18]:
kclusters = 3 #define k
X = toronto_grouped.drop(columns='Neighbourhood').astype(float) # keep only the category counts
X = StandardScaler().fit_transform(X)

#run K-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(X)
yhat = kmeans.labels_
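
The cell above standardises every category column and then fits k-means. A minimal sketch on synthetic counts shows the same two steps and how to inspect cluster sizes. The `toy_counts` array is made up, and `n_init=10` is set explicitly only to keep the behaviour the same across scikit-learn versions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic venue counts: the first three rows and the last three rows
# have clearly different category profiles.
toy_counts = np.array([
    [9, 8, 0, 1],
    [8, 9, 1, 0],
    [10, 7, 0, 0],
    [0, 1, 9, 8],
    [1, 0, 8, 9],
    [0, 0, 10, 7],
], dtype=float)

X_toy = StandardScaler().fit_transform(toy_counts)  # zero mean, unit variance per column
toy_kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_toy)
toy_labels = toy_kmeans.labels_

# Number of rows assigned to each cluster.
sizes = np.bincount(toy_labels, minlength=2)
print(toy_labels, sizes)
```

Checking cluster sizes this way is a quick test of whether a chosen `k` produces one dominant cluster and a few very small ones.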

Merge df_toronto with neighborhoods_venues_sorted to add latitude/longitude for each neighborhood, then insert the cluster labels.

In [19]:
toronto_df = pd.merge(df_toronto, neighborhoods_venues_sorted, on='Neighborhood', how='right')
toronto_df.insert(5, 'Cluster Labels', yhat + 1)
toronto_df.head(15)

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M5A,Downtown Toronto,"Regent Park , Harbourfront",43.65426,-79.360636,2,Coffee Shop,Park,Bakery,Pub,Theater,Breakfast Spot,Café,Restaurant,Mexican Restaurant,Event Space
1,M7A,Downtown Toronto,"Queen's Park , Ontario Provincial Government",43.662301,-79.389494,2,Coffee Shop,Diner,Yoga Studio,Music Venue,Mexican Restaurant,Juice Bar,Italian Restaurant,Hobby Shop,Fried Chicken Joint,Distribution Center
2,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937,2,Coffee Shop,Clothing Store,Café,Bubble Tea Shop,Japanese Restaurant,Italian Restaurant,Middle Eastern Restaurant,Cosmetics Shop,Bookstore,Diner
3,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418,2,Coffee Shop,Café,Cocktail Bar,Beer Bar,Restaurant,American Restaurant,Hotel,Japanese Restaurant,Diner,Lingerie Store
4,M4E,East Toronto,The Beaches,43.676357,-79.293031,2,Trail,Pub,Neighborhood,Health Food Store,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Yoga Studio
5,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306,2,Coffee Shop,Farmers Market,Cheese Shop,Beer Bar,Café,Bakery,Restaurant,Italian Restaurant,Seafood Restaurant,Cocktail Bar
6,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383,2,Coffee Shop,Italian Restaurant,Café,Sandwich Place,Ice Cream Shop,Sushi Restaurant,Japanese Restaurant,Bubble Tea Shop,Burger Joint,Salad Place
7,M6G,Downtown Toronto,Christie,43.669542,-79.422564,3,Grocery Store,Café,Park,Coffee Shop,Italian Restaurant,Candy Store,Restaurant,Diner,Athletics & Sports,Baby Store
8,M5H,Downtown Toronto,"Richmond , Adelaide , King",43.650571,-79.384568,2,Coffee Shop,Café,Gym,Restaurant,Deli / Bodega,Hotel,Thai Restaurant,Asian Restaurant,Concert Hall,Clothing Store
9,M6H,West Toronto,"Dufferin , Dovercourt Village",43.669005,-79.442259,2,Bakery,Pharmacy,Recording Studio,Supermarket,Bank,Brewery,Middle Eastern Restaurant,Café,Music Venue,Bar
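
`yhat` follows the row order of `toronto_grouped`, while the row order of a merged frame depends on the join type and the pandas version. One way to avoid relying on that order, sketched here on made-up data (all names below are hypothetical), is to attach the labels to the grouped table before merging:

```python
import numpy as np
import pandas as pd

# Coordinates in their original (non-alphabetical) order.
coords = pd.DataFrame({
    'Neighborhood': ['Regent Park', 'Berczy Park', 'Christie'],
    'Latitude': [43.65426, 43.644771, 43.669542],
})

# Grouped table in alphabetical order, as groupby produces it.
grouped = pd.DataFrame({'Neighborhood': ['Berczy Park', 'Christie', 'Regent Park']})
labels = np.array([0, 2, 1])  # label i belongs to row i of `grouped`

# Attach the labels while the rows still line up, then merge on the key.
grouped_labeled = grouped.assign(**{'Cluster Labels': labels + 1})
merged = pd.merge(coords, grouped_labeled, on='Neighborhood', how='inner')
print(merged)
```

With the labels stored as a column before the join, each neighborhood keeps its own label whatever order the merge returns.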


#### Creating a map of the clusters

In [26]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11) # folium expects [latitude, longitude]

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_df['Latitude'], toronto_df['Longitude'], toronto_df['Neighborhood'], toronto_df['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
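
In the cell above only the length of `ys` reaches the colormap, so the palette can be built from `kclusters` directly. A sketch of that simplification, assuming three clusters as in this notebook; `cluster_color` is a hypothetical helper:

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 3

# One evenly spaced rainbow color per cluster, converted to hex strings for folium.
palette = [colors.rgb2hex(c) for c in cm.rainbow(np.linspace(0, 1, kclusters))]

def cluster_color(cluster):
    """Map a 1-based cluster label to its hex color."""
    return palette[cluster - 1]

print(palette)
```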

#### Cluster - 1

In [100]:
toronto_df[toronto_df['Cluster Labels'] == 1]

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
14,M6K,West Toronto,"Brockton , Parkdale Village , Exhibition Place",43.636847,-79.428191,1,Café,Breakfast Spot,Coffee Shop,Gym,Bakery,Stadium,Burrito Place,Restaurant,Climbing Gym,Pet Store


#### Cluster - 2

In [101]:
toronto_df[toronto_df['Cluster Labels'] == 2]

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M5A,Downtown Toronto,"Regent Park , Harbourfront",43.65426,-79.360636,2,Coffee Shop,Park,Bakery,Pub,Theater,Breakfast Spot,Café,Restaurant,Mexican Restaurant,Event Space
1,M7A,Downtown Toronto,"Queen's Park , Ontario Provincial Government",43.662301,-79.389494,2,Coffee Shop,Diner,Yoga Studio,Music Venue,Mexican Restaurant,Juice Bar,Italian Restaurant,Hobby Shop,Fried Chicken Joint,Distribution Center
2,M5B,Downtown Toronto,"Garden District, Ryerson",43.657162,-79.378937,2,Coffee Shop,Clothing Store,Café,Bubble Tea Shop,Japanese Restaurant,Italian Restaurant,Middle Eastern Restaurant,Cosmetics Shop,Bookstore,Diner
3,M5C,Downtown Toronto,St. James Town,43.651494,-79.375418,2,Coffee Shop,Café,Cocktail Bar,Beer Bar,Restaurant,American Restaurant,Hotel,Japanese Restaurant,Diner,Lingerie Store
4,M4E,East Toronto,The Beaches,43.676357,-79.293031,2,Trail,Pub,Neighborhood,Health Food Store,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Yoga Studio
5,M5E,Downtown Toronto,Berczy Park,43.644771,-79.373306,2,Coffee Shop,Farmers Market,Cheese Shop,Beer Bar,Café,Bakery,Restaurant,Italian Restaurant,Seafood Restaurant,Cocktail Bar
6,M5G,Downtown Toronto,Central Bay Street,43.657952,-79.387383,2,Coffee Shop,Italian Restaurant,Café,Sandwich Place,Ice Cream Shop,Sushi Restaurant,Japanese Restaurant,Bubble Tea Shop,Burger Joint,Salad Place
8,M5H,Downtown Toronto,"Richmond , Adelaide , King",43.650571,-79.384568,2,Coffee Shop,Café,Gym,Restaurant,Deli / Bodega,Hotel,Thai Restaurant,Asian Restaurant,Concert Hall,Clothing Store
9,M6H,West Toronto,"Dufferin , Dovercourt Village",43.669005,-79.442259,2,Bakery,Pharmacy,Recording Studio,Supermarket,Bank,Brewery,Middle Eastern Restaurant,Café,Music Venue,Bar
10,M5J,Downtown Toronto,"Harbourfront East , Union Station , Toronto Is...",43.640816,-79.381752,2,Coffee Shop,Aquarium,Hotel,Restaurant,Café,Italian Restaurant,Brewery,Sporting Goods Shop,Fried Chicken Joint,Scenic Lookout


#### Cluster - 3

In [102]:
toronto_df[toronto_df['Cluster Labels'] == 3]

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
7,M6G,Downtown Toronto,Christie,43.669542,-79.422564,3,Grocery Store,Café,Park,Coffee Shop,Italian Restaurant,Candy Store,Restaurant,Diner,Athletics & Sports,Baby Store
11,M6J,West Toronto,"Little Portugal , Trinity",43.647927,-79.41975,3,Bar,Restaurant,Vegetarian / Vegan Restaurant,Café,Asian Restaurant,Men's Store,Coffee Shop,Italian Restaurant,French Restaurant,Bistro
25,M6R,West Toronto,"Parkdale , Roncesvalles",43.64896,-79.456325,3,Gift Shop,Bookstore,Dessert Shop,Eastern European Restaurant,Italian Restaurant,Bar,Bank,Dog Run,Restaurant,Movie Theater
29,M4T,Central Toronto,"Moore Park , Summerhill East",43.689574,-79.38316,3,Playground,Trail,Tennis Court,Yoga Studio,Dessert Shop,Event Space,Ethiopian Restaurant,Electronics Store,Eastern European Restaurant,Dumpling Restaurant
31,M4V,Central Toronto,"Summerhill West , Rathnelly , South Hill , For...",43.686412,-79.400049,3,Coffee Shop,Pub,Pizza Place,Light Rail Station,Liquor Store,Sports Bar,Restaurant,Supermarket,Sushi Restaurant,Bank
37,M4Y,Downtown Toronto,Church and Wellesley,43.66586,-79.38316,3,Coffee Shop,Japanese Restaurant,Gay Bar,Restaurant,Sushi Restaurant,Pub,Men's Store,Mediterranean Restaurant,Hotel,Gastropub
