## Segmenting and Clustering Neighborhoods in Toronto

#### Explore and cluster the neighborhoods in Toronto. You can decide to work with only boroughs that contain the word Toronto and then replicate the same analysis we did to the New York City data. It is up to you.

#### Importing libraries

In [125]:
import requests
from bs4 import BeautifulSoup
import pandas as pd

# import k-means from clustering stage
from sklearn.cluster import KMeans
import folium # map rendering library

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

#### Data source

In [19]:
link='https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'
wikipage = requests.get(link)
page = wikipage.text

In [20]:
soup = BeautifulSoup(page, 'html.parser')


column_names = ['Postcode', 'Borough', 'Neighbourhood'] 

neighbs=[]

tableSoup = soup.find_all("table")
table = tableSoup[0]  #getting the first instance table from all found tables
rows = table.find_all("tr")
for row in rows:
    cols = row.find_all("td")
    if len(cols) == 0 : continue
    neighbs.append({'Postcode': cols[0].text,'Borough': cols[1].text,'Neighbourhood': cols[2].text.split("\n")[0]})


df=pd.DataFrame(neighbs,columns=column_names)




#### The dataframe will consist of three columns: PostalCode, Borough, and Neighborhood

In [21]:
column_names = ['Postcode', 'Borough', 'Neighbourhood'] 
df.columns=column_names



In [22]:

df.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront


#### Only process the cells that have an assigned borough. Ignore cells with a borough that is Not assigned.

In [23]:


df=df.drop(df[df['Borough']=='Not assigned'].index).reset_index(drop=True)


#### If a cell has a borough but a Not assigned neighborhood, then the neighborhood will be the same as the borough

In [24]:
##replacing value conditionally
df.loc[df['Neighbourhood'] =='Not assigned', 'Neighbourhood'] = df['Borough']
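
The two cleaning rules above (drop unassigned boroughs, then fall back to the borough name when the neighbourhood is unassigned) can be checked on a small frame. This is only a sketch: the rows in `toy` are hypothetical and are not taken from the Wikipedia table.

```python
import pandas as pd

# Toy rows (hypothetical, used only to illustrate the two rules).
toy = pd.DataFrame({
    'Postcode': ['M1A', 'M7A', 'M5A'],
    'Borough': ['Not assigned', "Queen's Park", 'Downtown Toronto'],
    'Neighbourhood': ['Not assigned', 'Not assigned', 'Harbourfront'],
})

# 1) keep only rows with an assigned borough
toy = toy[toy['Borough'] != 'Not assigned'].reset_index(drop=True)

# 2) where the neighbourhood is unassigned, reuse the borough name
missing = toy['Neighbourhood'] == 'Not assigned'
toy.loc[missing, 'Neighbourhood'] = toy.loc[missing, 'Borough']

print(toy)
```

Restricting both sides of the assignment to the `missing` mask makes it explicit which rows are touched.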

In [25]:
df.shape

(212, 3)

#### Grouping by the Postcode and Borough columns, concatenating Neighbourhoods

In [26]:
df = df.groupby(['Postcode','Borough'])['Neighbourhood'].apply(', '.join).reset_index()

In [27]:
df.head(10)

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1B,Scarborough,"Rouge, Malvern"
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union"
2,M1E,Scarborough,"Guildwood, Morningside, West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae
5,M1J,Scarborough,Scarborough Village
6,M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park"
7,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge"
8,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West"
9,M1N,Scarborough,"Birch Cliff, Cliffside West"


In [28]:
df.shape

(103, 3)
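
To see why the row count drops after grouping, here is a minimal sketch of the same group-and-join step on three rows whose values appear in the output above. The frame name `toy` is an assumption made for the illustration.

```python
import pandas as pd

toy = pd.DataFrame({
    'Postcode': ['M1B', 'M1B', 'M1G'],
    'Borough': ['Scarborough', 'Scarborough', 'Scarborough'],
    'Neighbourhood': ['Rouge', 'Malvern', 'Woburn'],
})

# One row per (Postcode, Borough); neighbourhood names are joined with ', '
grouped = (toy.groupby(['Postcode', 'Borough'])['Neighbourhood']
              .agg(', '.join)
              .reset_index())
print(grouped)
```

Each postal code ends up on a single row, so the number of rows equals the number of distinct (Postcode, Borough) pairs.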

In [29]:
### Reading data from Geospatial_Data file downloaded from "http://cocl.us/Geospatial_data"

In [30]:
geo_url = 'http://cocl.us/Geospatial_data'
geo_df=pd.read_csv(geo_url)

In [31]:
### Merging Main data set with Geo Data

In [32]:

df = pd.merge(left=df,right=geo_df, how='left', left_on='Postcode', right_on='Postal Code')
df.drop('Postal Code',axis=1,inplace=True)
df.rename(columns={'Postcode':'PostalCode'},inplace=True)
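
A left merge keeps every postal code even when the geo file has no matching row, in which case the coordinates are NaN. The sketch below, on toy data, uses the `indicator` option of `merge` to list such codes. The code `'M9Z'` is made up for the example.

```python
import pandas as pd

codes = pd.DataFrame({'Postcode': ['M1B', 'M1G', 'M9Z']})  # 'M9Z' is made up
geo = pd.DataFrame({
    'Postal Code': ['M1B', 'M1G'],
    'Latitude': [43.806686, 43.770992],
    'Longitude': [-79.194353, -79.216917],
})

# indicator=True adds a '_merge' column telling where each row came from
merged = codes.merge(geo, how='left', left_on='Postcode',
                     right_on='Postal Code', indicator=True)
unmatched = merged.loc[merged['_merge'] == 'left_only', 'Postcode'].tolist()
print(unmatched)
```

An empty list here would mean every postal code received coordinates.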

In [33]:
df.head(15)

Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
5,M1J,Scarborough,Scarborough Village,43.744734,-79.239476
6,M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park",43.727929,-79.262029
7,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge",43.711112,-79.284577
8,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West",43.716316,-79.239476
9,M1N,Scarborough,"Birch Cliff, Cliffside West",43.692657,-79.264848


### PART 3 - EXPLORING NEIGHBOURHOODS

#### Let's count how many postal codes each borough has to assess how large they are

In [35]:
df.groupby(['Borough']).size()

Borough
Central Toronto      9
Downtown Toronto    18
East Toronto         5
East York            5
Etobicoke           12
Mississauga          1
North York          24
Queen's Park         1
Scarborough         17
West Toronto         6
York                 5
dtype: int64
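
The largest borough can also be picked programmatically rather than by reading the counts. A minimal sketch, assuming a toy Series of borough names in place of `df['Borough']`:

```python
import pandas as pd

# Toy stand-in for df['Borough']
boroughs = pd.Series(['North York', 'Scarborough', 'North York', 'York'])

counts = boroughs.value_counts()   # postal codes per borough
largest = counts.idxmax()          # name of the borough with the most rows
print(largest, counts[largest])
```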

#### As we can see, North York has the largest count, so let's pick it and create a data subset

In [85]:
neigbourhood='North York'
NY_df=df[df['Borough']==neigbourhood]

In [41]:
# and also create a subset with everything except the North York borough
notNY_df=df[df['Borough']!='North York']

In [38]:
import folium # map rendering library

# create map of North York using latitude and longitude values
NY_MAP = folium.Map(location=[43.73, -79.36], zoom_start=12)

NY_MAP

In [39]:
# add markers to map
for lat, lng, postal, neighborhood in zip(NY_df['Latitude'], NY_df['Longitude'], NY_df['PostalCode'], NY_df['Neighbourhood']):
    label = '{}, {}'.format(postal,neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(NY_MAP)  
    
NY_MAP

#### As we can observe, the postal code coordinates in the borough are distributed fairly evenly

In [44]:
for lat, lng, postal, neighborhood in zip(notNY_df['Latitude'], notNY_df['Longitude'], notNY_df['PostalCode'], notNY_df['Neighbourhood']):
    label = '{}, {}'.format(postal,neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=3,
        popup=label,
        color='grey',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.4,
        parse_html=False).add_to(NY_MAP) 

#### Let's show our North York neighbourhoods (blue) vs. the rest of the area (grey)

In [45]:
NY_MAP

In [72]:
CLIENT_ID = 'S0AKHYENBG0JMAWYVBHIFO0N20YAOL0PZEBXWBMX00XKSZLS' # your Foursquare ID
CLIENT_SECRET = 'D2ONI0JMTGJNMCAVLTPUQKX5AGN14KPAXXWR2OLXWE25NPQE' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 30
# Find venues within `radius` metres of each location (3000 m by default)
def getNearbyVenues(names, latitudes, longitudes, radius=3000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        print(name, len(results))
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighbourhood', 
                  'Neighbourhood Latitude', 
                  'Neighbourhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
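
The function above indexes straight into the JSON response, so a reply without `groups`, or a venue with an empty `categories` list, raises an exception. Below is a minimal sketch of a more tolerant flattening step. The helper name `extract_venues` and the hand-written `fake` payload are assumptions for illustration; the payload only mirrors the keys the function above reads and makes no claim about the full API response.

```python
def extract_venues(payload, name, lat, lng):
    """Flatten one explore-style response into rows; missing pieces yield None or no rows."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item.get('venue', {})
        location = venue.get('location', {})
        # fall back to an empty dict when the category list is missing or empty
        categories = venue.get('categories') or [{}]
        rows.append((name, lat, lng,
                     venue.get('name'),
                     location.get('lat'),
                     location.get('lng'),
                     categories[0].get('name')))
    return rows

# Hand-written payload in the shape the function above indexes into.
fake = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Tastee',
               'location': {'lat': 43.807722, 'lng': -79.356798},
               'categories': [{'name': 'Bakery'}]}},
    {'venue': {'name': 'No Category Cafe',
               'location': {'lat': 43.8, 'lng': -79.36},
               'categories': []}},
]}]}}

rows = extract_venues(fake, 'Hillcrest Village', 43.803762, -79.363452)
print(rows)
```

Rows with a `None` category can then be dropped or kept, depending on how the one-hot encoding should treat them.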

In [73]:
# type your answer here

LIMIT = 100
ny_venues = getNearbyVenues(names=NY_df['Neighbourhood'],
                                   latitudes=NY_df['Latitude'],
                                   longitudes=NY_df['Longitude']
                                  )

Hillcrest Village
Hillcrest Village 93
Fairview, Henry Farm, Oriole
Fairview, Henry Farm, Oriole 100
Bayview Village
Bayview Village 100
Silver Hills, York Mills
Silver Hills, York Mills 100
Newtonbrook, Willowdale
Newtonbrook, Willowdale 100
Willowdale South
Willowdale South 94
York Mills West
York Mills West 82
Willowdale West
Willowdale West 100
Parkwoods
Parkwoods 100
Don Mills North
Don Mills North 100
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South 100
Bathurst Manor, Downsview North, Wilson Heights
Bathurst Manor, Downsview North, Wilson Heights 100
Northwood Park, York University
Northwood Park, York University 100
CFB Toronto, Downsview East
CFB Toronto, Downsview East 100
Downsview West
Downsview West 76
Downsview Central
Downsview Central 71
Downsview Northwest
Downsview Northwest 89
Victoria Village
Victoria Village 100
Bedford Park, Lawrence Manor East
Bedford Park, Lawrence Manor East 100
Lawrence Heights, Lawrence Manor
Lawrence Heights, Lawrence Manor 

#### Let's preview the resulting dataframe and a number of venues

In [76]:
print(ny_venues.shape[0])
ny_venues.head()

2218


Unnamed: 0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Hillcrest Village,43.803762,-79.363452,Chatime Willowdale,43.791326,-79.367506,Bubble Tea Shop
1,Hillcrest Village,43.803762,-79.363452,Bayview Golf & Country Club,43.809391,-79.375285,Golf Course
2,Hillcrest Village,43.803762,-79.363452,Gyubee 牛兵衞燒肉工房,43.815661,-79.349423,BBQ Joint
3,Hillcrest Village,43.803762,-79.363452,Tastee,43.807722,-79.356798,Bakery
4,Hillcrest Village,43.803762,-79.363452,고려삼계탕 Korean Ginseng Chicken Soup & Bibimbap,43.798391,-79.369187,Korean Restaurant


In [81]:
#Let's check how many venues were returned for each neighbourhood
ny_venues.groupby('Neighbourhood')['Venue'].count()

Neighbourhood
Bathurst Manor, Downsview North, Wilson Heights    100
Bayview Village                                    100
Bedford Park, Lawrence Manor East                  100
CFB Toronto, Downsview East                        100
Don Mills North                                    100
Downsview Central                                   71
Downsview Northwest                                 89
Downsview West                                      76
Emery, Humberlea                                    67
Fairview, Henry Farm, Oriole                       100
Flemingdon Park, Don Mills South                   100
Glencairn                                          100
Hillcrest Village                                   93
Humber Summit                                       46
Lawrence Heights, Lawrence Manor                   100
Maple Leaf Park, North Park, Upwood Park           100
Newtonbrook, Willowdale                            100
Northwood Park, York University                    

In [86]:
print('There are {} unique venue categories in {}.'.format(len(ny_venues['Venue Category'].unique()),neigbourhood))


There are 184 unique venue categories in North York.


### Analyze Each Neighborhood

In [89]:
# one hot encoding
ny_onehot = pd.get_dummies(ny_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
ny_onehot['Neighbourhood'] = ny_venues['Neighbourhood'] 

# move neighborhood column to the first column
fixed_columns = [ny_onehot.columns[-1]] + list(ny_onehot.columns[:-1])
ny_onehot = ny_onehot[fixed_columns]

ny_onehot.head()

Unnamed: 0,Neighbourhood,Accessories Store,Afghan Restaurant,Airport,American Restaurant,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Dealership,BBQ Joint,...,Trail,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Store,Vietnamese Restaurant,Warehouse Store,Wings Joint,Women's Store,Yoga Studio
0,Hillcrest Village,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Hillcrest Village,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Hillcrest Village,0,0,0,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0
3,Hillcrest Village,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Hillcrest Village,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [90]:
ny_onehot.shape

(2218, 185)

In [92]:
ny_grouped = ny_onehot.groupby('Neighbourhood').mean().reset_index()
ny_grouped

Unnamed: 0,Neighbourhood,Accessories Store,Afghan Restaurant,Airport,American Restaurant,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Dealership,BBQ Joint,...,Trail,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Store,Vietnamese Restaurant,Warehouse Store,Wings Joint,Women's Store,Yoga Studio
0,"Bathurst Manor, Downsview North, Wilson Heights",0.0,0.0,0.0,0.01,0.02,0.01,0.01,0.0,0.0,...,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0
1,Bayview Village,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Bedford Park, Lawrence Manor East",0.0,0.0,0.0,0.01,0.01,0.02,0.0,0.0,0.01,...,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.01
3,"CFB Toronto, Downsview East",0.0,0.0,0.0,0.02,0.01,0.01,0.02,0.0,0.0,...,0.0,0.0,0.02,0.0,0.0,0.03,0.01,0.0,0.01,0.0
4,Don Mills North,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0
5,Downsview Central,0.014085,0.0,0.014085,0.0,0.0,0.0,0.028169,0.0,0.0,...,0.0,0.0,0.028169,0.0,0.0,0.084507,0.0,0.0,0.0,0.0
6,Downsview Northwest,0.0,0.0,0.0,0.0,0.0,0.011236,0.0,0.0,0.0,...,0.0,0.0,0.011236,0.0,0.0,0.022472,0.0,0.0,0.0,0.0
7,Downsview West,0.0,0.0,0.013158,0.0,0.0,0.0,0.013158,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.078947,0.0,0.0,0.0,0.0
8,"Emery, Humberlea",0.0,0.0,0.0,0.0,0.0,0.014925,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.029851,0.0,0.0,0.0,0.0
9,"Fairview, Henry Farm, Oriole",0.0,0.0,0.0,0.01,0.0,0.03,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
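
Taking the mean of one-hot columns per neighbourhood gives the share of that neighbourhood's venues falling into each category, which is why each row of the table above sums to 1. A minimal sketch on toy data (the neighbourhood names `'A'` and `'B'` are made up):

```python
import pandas as pd

venues = pd.DataFrame({
    'Neighbourhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Coffee Shop', 'Coffee Shop', 'Park', 'Bakery',
                       'Coffee Shop', 'Bank'],
})

# One column per category, 1 where the venue belongs to it
onehot = pd.get_dummies(venues['Venue Category'], dtype=int)
onehot.insert(0, 'Neighbourhood', venues['Neighbourhood'])

# Mean of 0/1 indicators = relative frequency of each category
freq = onehot.groupby('Neighbourhood').mean()
print(freq)
```

Here neighbourhood `'A'` has two coffee shops out of four venues, so its `Coffee Shop` frequency is 0.5.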


#### Let's print each neighborhood along with the top 5 most common venues

In [93]:
num_top_venues = 5

for hood in ny_grouped['Neighbourhood']:
    print("----"+hood+"----")
    temp = ny_grouped[ny_grouped['Neighbourhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Bathurst Manor, Downsview North, Wilson Heights----
           venue  freq
0    Coffee Shop  0.15
1     Restaurant  0.04
2    Pizza Place  0.04
3  Grocery Store  0.04
4           Café  0.04


----Bayview Village----
                venue  freq
0         Coffee Shop  0.11
1   Korean Restaurant  0.09
2  Chinese Restaurant  0.05
3                Park  0.05
4      Sandwich Place  0.05


----Bedford Park, Lawrence Manor East----
              venue  freq
0       Coffee Shop  0.08
1    Clothing Store  0.08
2              Café  0.05
3            Bakery  0.05
4  Sushi Restaurant  0.03


----CFB Toronto, Downsview East----
                   venue  freq
0         Clothing Store  0.09
1            Coffee Shop  0.09
2          Deli / Bodega  0.03
3  Vietnamese Restaurant  0.03
4     Turkish Restaurant  0.02


----Don Mills North----
                 venue  freq
0          Coffee Shop  0.06
1         Burger Joint  0.04
2  Japanese Restaurant  0.04
3                 Park  0.04
4           Resta

In [102]:
import numpy as np
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

#### Let's build a dataframe with the top 10 most common venues for each neighbourhood

In [145]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighbourhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighbourhoods_venues_sorted = pd.DataFrame(columns=columns)
neighbourhoods_venues_sorted['Neighbourhood'] = ny_grouped['Neighbourhood']

for ind in np.arange(ny_grouped.shape[0]):
    neighbourhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(ny_grouped.iloc[ind, :], num_top_venues)

neighbourhoods_venues_sorted.head(24)

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Bathurst Manor, Downsview North, Wilson Heights",Coffee Shop,Grocery Store,Café,Pizza Place,Restaurant,Japanese Restaurant,Sandwich Place,Sushi Restaurant,Park,Bakery
1,Bayview Village,Coffee Shop,Korean Restaurant,Chinese Restaurant,Sandwich Place,Pizza Place,Park,Grocery Store,Sushi Restaurant,Furniture / Home Store,Bank
2,"Bedford Park, Lawrence Manor East",Clothing Store,Coffee Shop,Bakery,Café,Cosmetics Shop,Toy / Game Store,Sushi Restaurant,Bagel Shop,Park,Burger Joint
3,"CFB Toronto, Downsview East",Coffee Shop,Clothing Store,Deli / Bodega,Vietnamese Restaurant,Furniture / Home Store,Bagel Shop,Dessert Shop,Men's Store,Cosmetics Shop,Park
4,Don Mills North,Coffee Shop,Burger Joint,Japanese Restaurant,Park,Restaurant,Middle Eastern Restaurant,Gym / Fitness Center,Café,Ice Cream Shop,Supermarket
5,Downsview Central,Coffee Shop,Vietnamese Restaurant,Pizza Place,Fast Food Restaurant,Bank,Sandwich Place,Grocery Store,Flea Market,Athletics & Sports,Furniture / Home Store
6,Downsview Northwest,Coffee Shop,Fast Food Restaurant,Pizza Place,Sandwich Place,Japanese Restaurant,Bank,Hotel,Grocery Store,Theater,Bar
7,Downsview West,Coffee Shop,Fast Food Restaurant,Pizza Place,Vietnamese Restaurant,Bank,Beer Store,Sandwich Place,Pharmacy,Flea Market,Big Box Store
8,"Emery, Humberlea",Fast Food Restaurant,Coffee Shop,Pizza Place,Bank,Fried Chicken Joint,Sandwich Place,Pharmacy,Caribbean Restaurant,Vietnamese Restaurant,Golf Course
9,"Fairview, Henry Farm, Oriole",Coffee Shop,Chinese Restaurant,Pharmacy,Middle Eastern Restaurant,Bakery,Burger Joint,Seafood Restaurant,Asian Restaurant,Fast Food Restaurant,Caribbean Restaurant
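
The column labels above rely on a three-element suffix list with a fallback to `th`, which works for the first 20 positions but would produce `21th` beyond that. A small sketch of a general ordinal helper follows; the function name `ordinal` is an assumption and is not part of the notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'                      # 11th, 12th, 13th, ...
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

# Same column layout as above, for the top 10 venues
columns = ['Neighbourhood'] + ['{} Most Common Venue'.format(ordinal(i))
                               for i in range(1, 11)]
print(columns)
```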


##### As we can see, the majority of venues are cafés, restaurants, and fast food places

## Cluster Neighborhoods

In [146]:


# set number of clusters
kclusters = 5

ny_grouped_clustering = ny_grouped.drop('Neighbourhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(ny_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([2, 2, 1, 1, 3, 0, 0, 0, 0, 3])
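
The number of clusters is fixed at 5 above. One common way to sanity-check such a choice is to compare the within-cluster sum of squares (`inertia_`) for several values of k and look for the point where it stops dropping sharply. The sketch below does this on made-up 2-D points, not on the venue frequencies, so it only illustrates the mechanics.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two well-separated toy blobs (made-up data, not venue frequencies).
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [1.0, 1.0], [1.1, 1.0], [1.0, 1.1]])

inertias = {}
for k in range(1, 5):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_   # within-cluster sum of squared distances

for k, value in inertias.items():
    print(k, round(value, 4))
```

On this toy data the drop from k=1 to k=2 is large and later drops are small, which is the pattern one would look for on the real matrix.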

##### Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [147]:
ny_merged = NY_df
# add clustering labels
ny_merged['Cluster Labels'] = kmeans.labels_


ny_merged = ny_merged.join(neighbourhoods_venues_sorted.set_index('Neighbourhood'), on='Neighbourhood')

ny_merged.head() # check the last columns!

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  This is separate from the ipykernel package so we can avoid doing imports until


Unnamed: 0,PostalCode,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
17,M2H,North York,Hillcrest Village,43.803762,-79.363452,2,Coffee Shop,Pharmacy,Japanese Restaurant,Chinese Restaurant,Bakery,Sandwich Place,Caribbean Restaurant,Park,Pizza Place,Asian Restaurant
18,M2J,North York,"Fairview, Henry Farm, Oriole",43.778517,-79.346556,2,Coffee Shop,Chinese Restaurant,Pharmacy,Middle Eastern Restaurant,Bakery,Burger Joint,Seafood Restaurant,Asian Restaurant,Fast Food Restaurant,Caribbean Restaurant
19,M2K,North York,Bayview Village,43.786947,-79.385975,1,Coffee Shop,Korean Restaurant,Chinese Restaurant,Sandwich Place,Pizza Place,Park,Grocery Store,Sushi Restaurant,Furniture / Home Store,Bank
20,M2L,North York,"Silver Hills, York Mills",43.75749,-79.374714,1,Coffee Shop,Café,Japanese Restaurant,Burger Joint,Bank,Italian Restaurant,Furniture / Home Store,Pizza Place,Pharmacy,Supermarket
21,M2M,North York,"Newtonbrook, Willowdale",43.789053,-79.408493,3,Korean Restaurant,Coffee Shop,Middle Eastern Restaurant,Café,Sushi Restaurant,Fast Food Restaurant,Supermarket,Steakhouse,Liquor Store,Seafood Restaurant


In [148]:
### Cluster visualization

In [149]:

# Center the map on the mean coordinates of all neighbourhoods
latitude=NY_df['Latitude'].mean()
longitude=NY_df['Longitude'].mean()

# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(ny_merged['Latitude'], ny_merged['Longitude'], ny_merged['Neighbourhood'], ny_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
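
The colour list above is built from the `rainbow` colormap and indexed with `cluster-1`. A minimal sketch of the same idea with an explicit cluster-to-colour mapping, which can make the legend easier to reason about. The names `palette` and `color_of` are assumptions for the illustration.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

k = 5  # same value as kclusters above

# k evenly spaced colours from the rainbow colormap, as hex strings
palette = [colors.to_hex(c) for c in cm.rainbow(np.linspace(0, 1, k))]

# explicit mapping: cluster label -> colour
color_of = {cluster: palette[cluster] for cluster in range(k)}
print(color_of)
```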

##### 1st Cluster --> Restaurants and Shops

In [160]:
ny_merged.loc[ny_merged['Cluster Labels'] == 0, ny_merged.columns[[2] + list(range(5, ny_merged.shape[1]))]]

Unnamed: 0,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
22,Willowdale South,0,Coffee Shop,Korean Restaurant,Sandwich Place,Café,Pizza Place,Bank,Sushi Restaurant,Middle Eastern Restaurant,Trail,Shopping Mall
23,York Mills West,0,Coffee Shop,Sushi Restaurant,Bakery,Café,Park,Korean Restaurant,Burger Joint,Tea Room,Seafood Restaurant,Sandwich Place
24,Willowdale West,0,Korean Restaurant,Coffee Shop,Middle Eastern Restaurant,Café,Supermarket,Grocery Store,Ramen Restaurant,Sushi Restaurant,Dessert Shop,Burrito Place
25,Parkwoods,0,Middle Eastern Restaurant,Coffee Shop,Chinese Restaurant,Burger Joint,Supermarket,Mediterranean Restaurant,Japanese Restaurant,American Restaurant,Gym / Fitness Center,Café
30,"CFB Toronto, Downsview East",0,Coffee Shop,Clothing Store,Deli / Bodega,Vietnamese Restaurant,Furniture / Home Store,Bagel Shop,Dessert Shop,Men's Store,Cosmetics Shop,Park
32,Downsview Central,0,Coffee Shop,Vietnamese Restaurant,Pizza Place,Fast Food Restaurant,Bank,Sandwich Place,Grocery Store,Flea Market,Athletics & Sports,Furniture / Home Store
34,Victoria Village,0,Middle Eastern Restaurant,Coffee Shop,Burger Joint,Supermarket,Gym / Fitness Center,Grocery Store,Chinese Restaurant,Movie Theater,Restaurant,Japanese Restaurant


##### 2nd Cluster --> Get coffee, some food, and don't forget to visit a bank

In [161]:
ny_merged.loc[ny_merged['Cluster Labels'] == 1, ny_merged.columns[[2] + list(range(5, ny_merged.shape[1]))]]

Unnamed: 0,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
19,Bayview Village,1,Coffee Shop,Korean Restaurant,Chinese Restaurant,Sandwich Place,Pizza Place,Park,Grocery Store,Sushi Restaurant,Furniture / Home Store,Bank
20,"Silver Hills, York Mills",1,Coffee Shop,Café,Japanese Restaurant,Burger Joint,Bank,Italian Restaurant,Furniture / Home Store,Pizza Place,Pharmacy,Supermarket
28,"Bathurst Manor, Downsview North, Wilson Heights",1,Coffee Shop,Grocery Store,Café,Pizza Place,Restaurant,Japanese Restaurant,Sandwich Place,Sushi Restaurant,Park,Bakery
31,Downsview West,1,Coffee Shop,Fast Food Restaurant,Pizza Place,Vietnamese Restaurant,Bank,Beer Store,Sandwich Place,Pharmacy,Flea Market,Big Box Store


##### 3rd Cluster --> Get coffee at home, then go shopping or buy groceries, and walk in the park afterwards

In [162]:
ny_merged.loc[ny_merged['Cluster Labels'] == 2, ny_merged.columns[[2] + list(range(5, ny_merged.shape[1]))]]

Unnamed: 0,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
17,Hillcrest Village,2,Coffee Shop,Pharmacy,Japanese Restaurant,Chinese Restaurant,Bakery,Sandwich Place,Caribbean Restaurant,Park,Pizza Place,Asian Restaurant
18,"Fairview, Henry Farm, Oriole",2,Coffee Shop,Chinese Restaurant,Pharmacy,Middle Eastern Restaurant,Bakery,Burger Joint,Seafood Restaurant,Asian Restaurant,Fast Food Restaurant,Caribbean Restaurant
79,"Maple Leaf Park, North Park, Upwood Park",2,Coffee Shop,Fast Food Restaurant,Furniture / Home Store,Vietnamese Restaurant,Sandwich Place,Bank,Fried Chicken Joint,Grocery Store,Pizza Place,Pharmacy
97,"Emery, Humberlea",2,Fast Food Restaurant,Coffee Shop,Pizza Place,Bank,Fried Chicken Joint,Sandwich Place,Pharmacy,Caribbean Restaurant,Vietnamese Restaurant,Golf Course


##### 4th Cluster --> Get coffee first, then food, then go back to the hotel or to a bar or theater, and visit a jewelry store

In [163]:
ny_merged.loc[ny_merged['Cluster Labels'] == 3, ny_merged.columns[[2] + list(range(5, ny_merged.shape[1]))]]

Unnamed: 0,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
21,"Newtonbrook, Willowdale",3,Korean Restaurant,Coffee Shop,Middle Eastern Restaurant,Café,Sushi Restaurant,Fast Food Restaurant,Supermarket,Steakhouse,Liquor Store,Seafood Restaurant
26,Don Mills North,3,Coffee Shop,Burger Joint,Japanese Restaurant,Park,Restaurant,Middle Eastern Restaurant,Gym / Fitness Center,Café,Ice Cream Shop,Supermarket
27,"Flemingdon Park, Don Mills South",3,Coffee Shop,Burger Joint,Restaurant,Grocery Store,Park,Middle Eastern Restaurant,Japanese Restaurant,Indian Restaurant,Café,Bakery
29,"Northwood Park, York University",3,Coffee Shop,Sandwich Place,Pizza Place,Fast Food Restaurant,Grocery Store,Pharmacy,Restaurant,Bank,Middle Eastern Restaurant,Flea Market
62,"Bedford Park, Lawrence Manor East",3,Clothing Store,Coffee Shop,Bakery,Café,Cosmetics Shop,Toy / Game Store,Sushi Restaurant,Bagel Shop,Park,Burger Joint
71,"Lawrence Heights, Lawrence Manor",3,Clothing Store,Coffee Shop,Furniture / Home Store,Grocery Store,Bakery,Japanese Restaurant,Italian Restaurant,Cosmetics Shop,Dessert Shop,Men's Store
72,Glencairn,3,Clothing Store,Coffee Shop,Furniture / Home Store,Bakery,Italian Restaurant,Cosmetics Shop,Café,Jewelry Store,Japanese Restaurant,Sushi Restaurant


##### 5th Cluster --> Stay in a hotel or go to the theater

In [164]:
ny_merged.loc[ny_merged['Cluster Labels'] == 4, ny_merged.columns[[2] + list(range(5, ny_merged.shape[1]))]]

Unnamed: 0,Neighbourhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
33,Downsview Northwest,4,Coffee Shop,Fast Food Restaurant,Pizza Place,Sandwich Place,Japanese Restaurant,Bank,Hotel,Grocery Store,Theater,Bar
96,Humber Summit,4,Coffee Shop,Fast Food Restaurant,Bank,Pizza Place,Sandwich Place,Caribbean Restaurant,Pharmacy,Italian Restaurant,Hotel,Grocery Store


#### As we can see, the observed region has lots of cafés and restaurants everywhere. Some neighbourhoods have a denser concentration of restaurants, others of fast food, but you can get a cup of coffee everywhere. Some places have a park very close by, while others have banks, jewelry stores, and even a theater.