Let's start with the map exercise. We can add all dependencies first.

In [1]:
import sys
!conda install --yes --prefix {sys.prefix} lxml
!conda install --yes --prefix {sys.prefix} html5lib
!conda install --yes --prefix {sys.prefix} beautifulsoup4
!conda install --yes --prefix {sys.prefix} -c conda-forge geopy
!conda install --yes --prefix {sys.prefix} -c conda-forge folium=0.5.0

import pandas as pd
import numpy as np
from geopy.geocoders import Nominatim
import folium
# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

  environment location: /Users/arpit/anaconda3

  added / updated specs:
    - lxml


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    certifi-2019.6.16          |           py36_0         154 KB
    conda-4.7.5                |           py36_0         3.0 MB
    ------------------------------------------------------------
                                           Total:         3.2 MB

The following packages will be UPDATED:

  openssl            conda-forge::openssl-1.1.1b-h01d97ff_2 --> pkgs/main::openssl-1.1.1c-h1de35cc_1

The following packages will be SUPERSEDED by a higher-priority channel:

  ca-certificates    conda-forge::ca-certificates-2019.6.1~ --> pkgs/main::ca-certificates-2019.5.15-0
  certifi                                       conda-forge --> pkgs/main
  conda            

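Let's get our data now. First we read the table from Wikipedia, then drop the first row (it holds the header), then clean the data: rows without a borough are removed, and a neighbourhood that is 'Not assigned' takes its borough's name.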

In [162]:
df = pd.read_html("https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M")[0]
df=df.drop([0])

df.rename(columns={0:'PostalCode',1:'Borough',2:'Neighbourhood'},inplace=True)
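# keep only rows with an assigned borough; copy to avoid chained-assignment warnings
df = df[df.Borough != 'Not assigned'].copy()
# where the neighbourhood is not assigned, use the borough name instead
unassigned = df.Neighbourhood == 'Not assigned'
df.loc[unassigned, 'Neighbourhood'] = df.loc[unassigned, 'Borough']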
df.head()


Unnamed: 0,PostalCode,Borough,Neighbourhood
3,M3A,North York,Parkwoods
4,M4A,North York,Victoria Village
5,M5A,Downtown Toronto,Harbourfront
6,M5A,Downtown Toronto,Regent Park
7,M6A,North York,Lawrence Heights
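
As a small illustration of the "fall back to the borough name" rule, here is a sketch on a hand-made frame. The three rows are made up for the example, and `Series.where` is just one way to express the rule, not necessarily how the lab does it.

```python
import pandas as pd

# Hypothetical sample rows standing in for the scraped Wikipedia table
sample = pd.DataFrame({
    'PostalCode': ['M1A', 'M3A', 'M7A'],
    'Borough': ['Not assigned', 'North York', "Queen's Park"],
    'Neighbourhood': ['Not assigned', 'Parkwoods', 'Not assigned'],
})

# Drop rows that have no borough at all
cleaned = sample[sample['Borough'] != 'Not assigned'].copy()

# Keep the neighbourhood where it is assigned, otherwise take the borough name
has_name = cleaned['Neighbourhood'] != 'Not assigned'
cleaned['Neighbourhood'] = cleaned['Neighbourhood'].where(has_name, cleaned['Borough'])

print(cleaned)
```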


Let's review overall data

In [163]:
df.count()

PostalCode       211
Borough          211
Neighbourhood    211
dtype: int64

Let's get the coordinates now

In [164]:
geo = pd.read_csv('http://cocl.us/Geospatial_data')
geo.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


Let's merge dataframes

In [165]:

dfinal = pd.merge(df,geo, left_on="PostalCode", right_on='Postal Code',how="left")

In [166]:
dfinal.head()

Unnamed: 0,PostalCode,Borough,Neighbourhood,Postal Code,Latitude,Longitude
0,M3A,North York,Parkwoods,M3A,43.753259,-79.329656
1,M4A,North York,Victoria Village,M4A,43.725882,-79.315572
2,M5A,Downtown Toronto,Harbourfront,M5A,43.65426,-79.360636
3,M5A,Downtown Toronto,Regent Park,M5A,43.65426,-79.360636
4,M6A,North York,Lawrence Heights,M6A,43.718518,-79.464763
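
A left merge keeps every row of the left frame even when no coordinates are found for it. A quick way to spot such rows is the `indicator` option of `pd.merge`. The sketch below uses made-up rows (the `M9Z` postal code is hypothetical) and is only meant to show the idea.

```python
import pandas as pd

# Hypothetical miniature versions of the two frames being merged
left = pd.DataFrame({'PostalCode': ['M3A', 'M4A', 'M9Z'],
                     'Borough': ['North York', 'North York', 'Made-up Borough']})
right = pd.DataFrame({'Postal Code': ['M3A', 'M4A'],
                      'Latitude': [43.753259, 43.725882],
                      'Longitude': [-79.329656, -79.315572]})

# indicator=True adds a '_merge' column telling where each row came from
merged = pd.merge(left, right, left_on='PostalCode', right_on='Postal Code',
                  how='left', indicator=True)

# Rows marked 'left_only' found no coordinates and carry NaN latitude/longitude
unmatched = merged[merged['_merge'] == 'left_only']
print(unmatched[['PostalCode', 'Borough']])
```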


Let's remove additional postal code columns

In [167]:
dfinal.drop(['PostalCode','Postal Code'],inplace = True,axis=1) 

In [168]:
print('The dataframe has {} boroughs and {} neighborhoods.'.format(
        len(dfinal['Borough'].unique()),
        dfinal.shape[0]
    )
)

The dataframe has 11 boroughs and 211 neighborhoods.


#### Use geopy library to get the latitude and longitude values of Toronto

In [169]:
address = 'Toronto, TO'

geolocator = Nominatim(user_agent="TO_explorer") # user_agent is just a required value for creating an instance of Nominatim
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.6523873, -79.3835641.
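
geopy's `geocode` returns `None` when it finds nothing, in which case `location.latitude` would raise an error. A minimal sketch of a guard is shown below. The helper name `coordinates_or_default` and the stand-in geocoders are hypothetical and exist only so the example runs without network access.

```python
from types import SimpleNamespace

def coordinates_or_default(geocode, address, default):
    """Return (lat, lon) from a geocode callable, or `default` if nothing is found."""
    location = geocode(address)
    if location is None:
        return default
    return (location.latitude, location.longitude)

# Stand-in geocoders (hypothetical) that mimic a hit and a miss
def found(address):
    return SimpleNamespace(latitude=43.6523873, longitude=-79.3835641)

def not_found(address):
    return None

print(coordinates_or_default(found, 'Toronto, TO', (0.0, 0.0)))
print(coordinates_or_default(not_found, 'Nowhere', (0.0, 0.0)))
```

In the notebook this could be called as `coordinates_or_default(geolocator.geocode, address, some_default)`.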


#### Create a map of Toronto with neighborhoods superimposed on top.

In [170]:
# create map of Toronto using latitude and longitude values
map_TO = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(dfinal['Latitude'], dfinal['Longitude'], dfinal['Borough'], dfinal['Neighbourhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_TO)  
    
map_TO

Map1 url - https://github.com/notthatyoda/coursera-adsc/blob/master/map-1.png?raw=true

![Map1](https://github.com/notthatyoda/coursera-adsc/blob/master/map-1.png?raw=true)

As suggested in the lab, let's take all boroughs with "Toronto" in their name and analyse only those further.

In [171]:
map_data = dfinal[dfinal.apply(lambda row: row.astype(str).str.contains('Toronto', case=False).any(), axis=1)].reset_index(drop=True)
map_data.count()

Borough          77
Neighbourhood    77
Latitude         77
Longitude        77
dtype: int64
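
Note that the filter above tests every column of each row, so a row also matches when only its neighbourhood name contains "Toronto". The sketch below contrasts that with a borough-only filter on made-up rows (the borough assignments here are illustrative assumptions). On the real data the borough-only variant may give a count different from 77.

```python
import pandas as pd

# Hypothetical rows: the second one has 'Toronto' only in the neighbourhood name
toy = pd.DataFrame({
    'Borough': ['Downtown Toronto', 'North York', 'Scarborough'],
    'Neighbourhood': ['Harbourfront', 'CFB Toronto', 'Rouge'],
})

# Same idea as the notebook's filter: any column may contain 'Toronto'
any_column = toy[toy.apply(
    lambda row: row.astype(str).str.contains('Toronto', case=False).any(), axis=1)]

# Stricter variant: only the Borough column is tested
borough_only = toy[toy['Borough'].str.contains('Toronto', case=False)]

print(len(any_column), len(borough_only))
```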

Let's plot them on the map. We can still use the latitude and longitude of Toronto to centre it.

In [172]:
map_data.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude
0,Downtown Toronto,Harbourfront,43.65426,-79.360636
1,Downtown Toronto,Regent Park,43.65426,-79.360636
2,Downtown Toronto,Ryerson,43.657162,-79.378937
3,Downtown Toronto,Garden District,43.657162,-79.378937
4,Downtown Toronto,St. James Town,43.651494,-79.375418


In [134]:
# create map of Toronto using latitude and longitude values
map_TO = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(map_data['Latitude'], map_data['Longitude'], map_data['Borough'], map_data['Neighbourhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_TO)  
map_TO

Map2 url - https://github.com/notthatyoda/coursera-adsc/blob/master/map-2.png?raw=true

![Map2](https://github.com/notthatyoda/coursera-adsc/blob/master/map-2.png?raw=true)

Next, we are going to start utilizing the Foursquare API to explore the neighborhoods and segment them.

In [135]:
CLIENT_ID = '' # your Foursquare ID
CLIENT_SECRET = '' # your Foursquare Secret
VERSION = '' # Foursquare API version
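
To keep credentials out of the notebook, one option is to read them from environment variables. This is a sketch only: the variable names `FOURSQUARE_CLIENT_ID`, `FOURSQUARE_CLIENT_SECRET` and `FOURSQUARE_VERSION` and the helper `load_foursquare_credentials` are made up for the example.

```python
import os

def load_foursquare_credentials(env=None):
    """Read credentials from a mapping (defaults to os.environ); missing keys become ''."""
    env = os.environ if env is None else env
    return {
        'client_id': env.get('FOURSQUARE_CLIENT_ID', ''),
        'client_secret': env.get('FOURSQUARE_CLIENT_SECRET', ''),
        'version': env.get('FOURSQUARE_VERSION', ''),
    }

# Demo with a plain dict instead of the real environment
creds = load_foursquare_credentials({'FOURSQUARE_CLIENT_ID': 'abc',
                                     'FOURSQUARE_CLIENT_SECRET': 'xyz'})
print(creds['client_id'], repr(creds['version']))
```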

Here's the function from the lab that gets venues for all neighbourhoods.

In [136]:
import requests

LIMIT = 20 # We will get the top 20 venues only

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        #print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
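
The function above indexes straight into `["response"]['groups'][0]['items']`, which raises an error if the API returns an error payload or a venue has no category. Below is a sketch of a more defensive parser for a single response. The helper name `venue_rows` and the fake payload are hypothetical; only the key layout is taken from the function above.

```python
def venue_rows(name, lat, lng, payload):
    """Turn one explore-style payload into row tuples; return [] if it has no items."""
    groups = payload.get('response', {}).get('groups') or []
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # A venue without categories gets None instead of raising IndexError
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng, venue['name'],
                     venue['location']['lat'], venue['location']['lng'], category))
    return rows

# Fake payload shaped like the keys used in getNearbyVenues
fake = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Tandem Coffee',
               'location': {'lat': 43.653559, 'lng': -79.361809},
               'categories': [{'name': 'Coffee Shop'}]}},
    {'venue': {'name': 'Unnamed Spot',
               'location': {'lat': 43.65, 'lng': -79.36},
               'categories': []}},
]}]}}

print(venue_rows('Harbourfront', 43.65426, -79.360636, fake))
print(venue_rows('Harbourfront', 43.65426, -79.360636, {'meta': {'code': 400}}))
```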

Let's run the above function for all map_data

In [138]:
map_data_venues = getNearbyVenues(names=map_data['Neighbourhood'],
                                   latitudes=map_data['Latitude'],
                                   longitudes=map_data['Longitude']
                                  )

In [32]:
#map_data_venues.to_csv('venues.txt', sep='\t')
#Temp save data to avoid API calls again and again
#new_venues= pd.read_csv('venues.txt',sep='\t', index_col=0)
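
The commented-out lines above hint at caching the venues on disk. A minimal sketch of that idea follows. The helper `load_or_fetch` and the fake fetch function are hypothetical, and a temporary directory is used so the example leaves no files behind.

```python
import os
import tempfile
import pandas as pd

def load_or_fetch(path, fetch):
    """Read a tab-separated cache if it exists, otherwise call fetch() and save it."""
    if os.path.exists(path):
        return pd.read_csv(path, sep='\t', index_col=0)
    frame = fetch()
    frame.to_csv(path, sep='\t')
    return frame

calls = []

def fake_fetch():
    # Stand-in for the expensive API call; records how often it runs
    calls.append(1)
    return pd.DataFrame({'Venue': ['Roselle Desserts', 'Tandem Coffee']})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'venues.txt')
    first = load_or_fetch(path, fake_fetch)   # fetches and writes the cache
    second = load_or_fetch(path, fake_fetch)  # reads the cache, no fetch

print(len(calls))
```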

In [139]:
map_data_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Harbourfront,43.65426,-79.360636,Roselle Desserts,43.653447,-79.362017,Bakery
1,Harbourfront,43.65426,-79.360636,Tandem Coffee,43.653559,-79.361809,Coffee Shop
2,Harbourfront,43.65426,-79.360636,Toronto Cooper Koo Family Cherry St YMCA Centre,43.653191,-79.357947,Gym / Fitness Center
3,Harbourfront,43.65426,-79.360636,Body Blitz Spa East,43.654735,-79.359874,Spa
4,Harbourfront,43.65426,-79.360636,Morning Glory Cafe,43.653947,-79.361149,Breakfast Spot


Let's check how many venues were returned for each neighborhood

In [140]:
map_data_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Adelaide,20,20,20,20,20,20
Bathurst Quay,16,16,16,16,16,16
Berczy Park,20,20,20,20,20,20
Brockton,20,20,20,20,20,20
Business Reply Mail Processing Centre 969 Eastern,19,19,19,19,19,19
CFB Toronto,3,3,3,3,3,3
CN Tower,16,16,16,16,16,16
Cabbagetown,20,20,20,20,20,20
Central Bay Street,20,20,20,20,20,20
Chinatown,20,20,20,20,20,20


Let's find out how many unique categories can be curated from all the returned venues

In [141]:
print('There are {} unique categories.'.format(len(map_data_venues['Venue Category'].unique())))

There are 165 unique categories.


## Analyze Each Neighborhood

In [142]:
# one hot encoding
toronto_onehot = pd.get_dummies(map_data_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
toronto_onehot['Neighborhood'] = map_data_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]

toronto_onehot.head()

Unnamed: 0,Yoga Studio,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Art Gallery,Arts & Crafts Store,...,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Restaurant,Toy / Game Store,Trail,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wine Bar
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


Let's examine the new dataframe size.



In [143]:
toronto_onehot.shape

(1275, 165)
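
In the one-hot output above, `Yoga Studio` rather than `Neighborhood` is the first column, and the frame has 165 columns for 165 categories. One possible explanation, offered as an assumption and not something the notebook confirms, is that one venue category is itself called `Neighborhood`: the assignment would then overwrite that dummy column in place, and the "move last column to front" step would move a different column. The sketch below reproduces the situation on made-up rows and sidesteps it by renaming the category column first. The name `Neighborhood (category)` is made up.

```python
import pandas as pd

# Hypothetical venues, one of which has the category 'Neighborhood'
venues = pd.DataFrame({
    'Neighborhood': ['Harbourfront', 'Harbourfront', 'Ryerson'],
    'Venue Category': ['Bakery', 'Neighborhood', 'Yoga Studio'],
})

onehot = pd.get_dummies(venues[['Venue Category']], prefix='', prefix_sep='')

# Rename the clashing dummy column so it is not overwritten
onehot = onehot.rename(columns={'Neighborhood': 'Neighborhood (category)'})

# Insert the real neighborhood names as the first column
onehot.insert(0, 'Neighborhood', venues['Neighborhood'])

print(list(onehot.columns))
```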

### Next, let's group rows by neighborhood and take the mean of the frequency of occurrence of each category

In [144]:
toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()
toronto_grouped.head()

Unnamed: 0,Neighborhood,Yoga Studio,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Art Gallery,...,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Restaurant,Toy / Game Store,Trail,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wine Bar
0,Adelaide,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0
1,Bathurst Quay,0.0,0.0625,0.0625,0.0625,0.125,0.125,0.125,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Berczy Park,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.05,0.0,0.05,0.0,0.0,0.0,0.0,0.05,0.0,0.0
3,Brockton,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Business Reply Mail Processing Centre 969 Eastern,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
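
Because the one-hot columns hold only 0 and 1, the mean per neighborhood is the share of that neighborhood's venues in each category. For instance, 0.0625 for Bathurst Quay corresponds to 1 venue out of the 16 returned for it. A tiny sketch with made-up neighborhoods `A` and `B`:

```python
import pandas as pd

# Hypothetical one-hot rows: 4 venues in A, 2 venues in B
onehot = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Bakery':       [1, 1, 1, 0, 0, 0],
    'Park':         [0, 0, 0, 1, 1, 1],
})

# Mean of 0/1 indicators = fraction of venues in that category
grouped = onehot.groupby('Neighborhood').mean().reset_index()
print(grouped)
```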


#### Let's put top 10 most common venues for each neighborhood in a dataframe

Here's the venue sort function from the lab.

In [145]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
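
This function always returns `num_top_venues` names, even for a neighborhood with fewer venues than that. CFB Toronto, for example, has only 3 venues in the count table, so most of its "top 10" presumably have a frequency of zero. A sketch of a variant that drops zero-frequency categories is shown below. The helper `most_common_nonzero` and the frequencies in the sample row are made up.

```python
import pandas as pd

def most_common_nonzero(row, num_top_venues):
    """Like return_most_common_venues, but skips categories with zero frequency."""
    freqs = row.iloc[1:].astype(float)          # first entry is the neighborhood name
    freqs = freqs[freqs > 0].sort_values(ascending=False)
    return list(freqs.index[:num_top_venues])

# Hypothetical grouped row with only three categories present
row = pd.Series({'Neighborhood': 'CFB Toronto',
                 'Airport': 0.5, 'Park': 0.3, 'Other Repair Shop': 0.2,
                 'Wine Bar': 0.0, 'Diner': 0.0})

print(most_common_nonzero(row, 10))
```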

Now let's create the new dataframe and display the top 10 venues for each neighborhood.

In [146]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Adelaide,Steakhouse,Asian Restaurant,Greek Restaurant,Food Court,Seafood Restaurant,Speakeasy,Bar,Concert Hall,Hotel,Café
1,Bathurst Quay,Airport Lounge,Airport Service,Airport Terminal,Plane,Harbor / Marina,Sculpture Garden,Boutique,Boat or Ferry,Bar,Coffee Shop
2,Berczy Park,Cocktail Bar,Seafood Restaurant,Farmers Market,Basketball Stadium,Steakhouse,Coffee Shop,Museum,Concert Hall,Liquor Store,Breakfast Spot
3,Brockton,Breakfast Spot,Coffee Shop,Café,Caribbean Restaurant,Restaurant,Burrito Place,Bar,Stadium,Bakery,Italian Restaurant
4,Business Reply Mail Processing Centre 969 Eastern,Yoga Studio,Pizza Place,Skate Park,Brewery,Restaurant,Recording Studio,Burrito Place,Butcher,Garden Center,Spa
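
The `indicators` list with a fallback to `th` works for 1 to 10, but would label position 21 as "21th". If more than 20 columns were ever needed, a small ordinal helper could be used instead. The function `ordinal` below is a hypothetical sketch, not part of the lab.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 12 -> '12th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'                      # 11th, 12th, 13th, ...
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

columns = ['Neighborhood'] + ['{} Most Common Venue'.format(ordinal(i)) for i in range(1, 11)]
print(columns[1], columns[-1])
```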


## Cluster Neighborhoods

Run k-means to cluster the neighborhood into 5 clusters.

In [147]:
# set number of clusters
kclusters = 5

toronto_grouped_clustering = toronto_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 1, 2, 4, 2, 2, 1, 2, 4, 0], dtype=int32)
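
To see how many neighborhoods fall into each cluster, the labels can be counted with `np.bincount`. The sketch below uses only the ten labels printed above, so the counts cover just those rows. In the notebook one would pass `kmeans.labels_` instead.

```python
import numpy as np

# The first ten labels as printed above
labels = np.array([0, 1, 2, 4, 2, 2, 1, 2, 4, 0])
kclusters = 5

# counts[i] is the number of rows assigned to cluster label i
counts = np.bincount(labels, minlength=kclusters)
print(counts)
```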

Let's rename the column so that it matches for merging.

In [148]:
map_data=map_data.rename(columns={'Neighbourhood':'Neighborhood'})

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [149]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

toronto_merged = map_data
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

toronto_merged.head() 

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Downtown Toronto,Harbourfront,43.65426,-79.360636,4,Coffee Shop,Breakfast Spot,Bakery,Historic Site,Spa,Mexican Restaurant,Farmers Market,Restaurant,Dessert Shop,Pub
1,Downtown Toronto,Regent Park,43.65426,-79.360636,4,Coffee Shop,Breakfast Spot,Bakery,Historic Site,Spa,Mexican Restaurant,Farmers Market,Restaurant,Dessert Shop,Pub
2,Downtown Toronto,Ryerson,43.657162,-79.378937,0,Café,Beer Bar,Coffee Shop,Music Venue,Movie Theater,Pizza Place,Plaza,Burrito Place,Burger Joint,Ramen Restaurant
3,Downtown Toronto,Garden District,43.657162,-79.378937,0,Café,Beer Bar,Coffee Shop,Music Venue,Movie Theater,Pizza Place,Plaza,Burrito Place,Burger Joint,Ramen Restaurant
4,Downtown Toronto,St. James Town,43.651494,-79.375418,0,Gastropub,Restaurant,Café,Japanese Restaurant,Coffee Shop,Italian Restaurant,Jewelry Store,Diner,Deli / Bodega,Butcher
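
A join like this leaves `NaN` in `Cluster Labels` for any neighborhood that has no entry in the venues table (for example, if no venues were returned for it), and the column then becomes float, which would break list indexing by label when plotting. A sketch of a guard on made-up rows (the third neighborhood is hypothetical):

```python
import pandas as pd

places = pd.DataFrame({
    'Neighborhood': ['Harbourfront', 'Ryerson', 'Made-up Quiet Corner'],
    'Latitude': [43.65426, 43.657162, 43.70],
    'Longitude': [-79.360636, -79.378937, -79.40],
})
labels = pd.DataFrame({'Neighborhood': ['Harbourfront', 'Ryerson'],
                       'Cluster Labels': [4, 0]})

merged = places.join(labels.set_index('Neighborhood'), on='Neighborhood')

# Drop rows without a cluster and restore integer labels
plottable = merged.dropna(subset=['Cluster Labels']).copy()
plottable['Cluster Labels'] = plottable['Cluster Labels'].astype(int)

print(plottable[['Neighborhood', 'Cluster Labels']])
```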


In [150]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=10)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighborhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

Map3 url-https://github.com/notthatyoda/coursera-adsc/blob/master/map-3.png?raw=true

![Map3](https://github.com/notthatyoda/coursera-adsc/blob/master/map-3.png?raw=true)

## Let's Examine Clusters

Let's examine each cluster 

## Cluster 1

In [151]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Ryerson,Café,Beer Bar,Coffee Shop,Music Venue,Movie Theater,Pizza Place,Plaza,Burrito Place,Burger Joint,Ramen Restaurant
3,Garden District,Café,Beer Bar,Coffee Shop,Music Venue,Movie Theater,Pizza Place,Plaza,Burrito Place,Burger Joint,Ramen Restaurant
4,St. James Town,Gastropub,Restaurant,Café,Japanese Restaurant,Coffee Shop,Italian Restaurant,Jewelry Store,Diner,Deli / Bodega,Butcher
5,The Beaches,Trail,Other Great Outdoors,Pub,Health Food Store,Wine Bar,Convenience Store,Diner,Dessert Shop,Deli / Bodega,Dance Studio
9,Adelaide,Steakhouse,Asian Restaurant,Greek Restaurant,Food Court,Seafood Restaurant,Speakeasy,Bar,Concert Hall,Hotel,Café
10,King,Steakhouse,Asian Restaurant,Greek Restaurant,Food Court,Seafood Restaurant,Speakeasy,Bar,Concert Hall,Hotel,Café
11,Richmond,Steakhouse,Asian Restaurant,Greek Restaurant,Food Court,Seafood Restaurant,Speakeasy,Bar,Concert Hall,Hotel,Café
23,Design Exchange,Coffee Shop,Restaurant,Deli / Bodega,Café,Hotel,Pub,Beer Bar,Japanese Restaurant,Bakery,Sandwich Place
24,Toronto Dominion Centre,Coffee Shop,Restaurant,Deli / Bodega,Café,Hotel,Pub,Beer Bar,Japanese Restaurant,Bakery,Sandwich Place
30,Commerce Court,Café,Gastropub,Restaurant,Museum,Coffee Shop,Deli / Bodega,Beer Bar,Japanese Restaurant,Bakery,Pub


#### Coffee Shop, Cafe, Cuisine Specific Restaurants?

In [152]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
61,CN Tower,Airport Lounge,Airport Service,Airport Terminal,Plane,Harbor / Marina,Sculpture Garden,Boutique,Boat or Ferry,Bar,Coffee Shop
62,Bathurst Quay,Airport Lounge,Airport Service,Airport Terminal,Plane,Harbor / Marina,Sculpture Garden,Boutique,Boat or Ferry,Bar,Coffee Shop
63,Island airport,Airport Lounge,Airport Service,Airport Terminal,Plane,Harbor / Marina,Sculpture Garden,Boutique,Boat or Ferry,Bar,Coffee Shop
64,Harbourfront West,Airport Lounge,Airport Service,Airport Terminal,Plane,Harbor / Marina,Sculpture Garden,Boutique,Boat or Ferry,Bar,Coffee Shop
65,King and Spadina,Airport Lounge,Airport Service,Airport Terminal,Plane,Harbor / Marina,Sculpture Garden,Boutique,Boat or Ferry,Bar,Coffee Shop
66,Railway Lands,Airport Lounge,Airport Service,Airport Terminal,Plane,Harbor / Marina,Sculpture Garden,Boutique,Boat or Ferry,Bar,Coffee Shop
67,South Niagara,Airport Lounge,Airport Service,Airport Terminal,Plane,Harbor / Marina,Sculpture Garden,Boutique,Boat or Ferry,Bar,Coffee Shop


#### This is all Airport Related Venues

In [153]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,Berczy Park,Cocktail Bar,Seafood Restaurant,Farmers Market,Basketball Stadium,Steakhouse,Coffee Shop,Museum,Concert Hall,Liquor Store,Breakfast Spot
8,Christie,Café,Grocery Store,Park,Coffee Shop,Diner,Italian Restaurant,Baby Store,Restaurant,Nightclub,Convenience Store
12,Dovercourt Village,Pharmacy,Supermarket,Bakery,Music Venue,Pool,Café,Middle Eastern Restaurant,Liquor Store,Brewery,Brazilian Restaurant
13,Dufferin,Pharmacy,Supermarket,Bakery,Music Venue,Pool,Café,Middle Eastern Restaurant,Liquor Store,Brewery,Brazilian Restaurant
15,Harbourfront East,Park,Café,Lake,Plaza,Bakery,Hotel,Supermarket,Deli / Bodega,Sporting Goods Shop,Italian Restaurant
16,Toronto Islands,Park,Café,Lake,Plaza,Bakery,Hotel,Supermarket,Deli / Bodega,Sporting Goods Shop,Italian Restaurant
17,Union Station,Park,Café,Lake,Plaza,Bakery,Hotel,Supermarket,Deli / Bodega,Sporting Goods Shop,Italian Restaurant
18,Little Portugal,Bar,Wine Bar,Record Shop,Art Gallery,Asian Restaurant,Brewery,Cocktail Bar,Cuban Restaurant,French Restaurant,Vietnamese Restaurant
19,Trinity,Bar,Wine Bar,Record Shop,Art Gallery,Asian Restaurant,Brewery,Cocktail Bar,Cuban Restaurant,French Restaurant,Vietnamese Restaurant
20,CFB Toronto,Airport,Park,Other Repair Shop,Wine Bar,Cosmetics Shop,Discount Store,Diner,Dessert Shop,Deli / Bodega,Dance Studio


#### Park, Pet Store?

In [154]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
51,Moore Park,Playground,Tennis Court,Wine Bar,Cosmetics Shop,Discount Store,Diner,Dessert Shop,Deli / Bodega,Dance Studio,Cuban Restaurant
52,Summerhill East,Playground,Tennis Court,Wine Bar,Cosmetics Shop,Discount Store,Diner,Dessert Shop,Deli / Bodega,Dance Studio,Cuban Restaurant


#### Both neighbourhoods list the same venues from 1 to 10. Playground, Tennis Court, huh!

In [155]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Harbourfront,Coffee Shop,Breakfast Spot,Bakery,Historic Site,Spa,Mexican Restaurant,Farmers Market,Restaurant,Dessert Shop,Pub
1,Regent Park,Coffee Shop,Breakfast Spot,Bakery,Historic Site,Spa,Mexican Restaurant,Farmers Market,Restaurant,Dessert Shop,Pub
7,Central Bay Street,Coffee Shop,Italian Restaurant,Bubble Tea Shop,Sushi Restaurant,Modern European Restaurant,Sandwich Place,Spa,Japanese Restaurant,Ramen Restaurant,Seafood Restaurant
14,East Toronto,Convenience Store,Coffee Shop,Park,Cosmetics Shop,Discount Store,Diner,Dessert Shop,Deli / Bodega,Dance Studio,Cuban Restaurant
25,Brockton,Breakfast Spot,Coffee Shop,Café,Caribbean Restaurant,Restaurant,Burrito Place,Bar,Stadium,Bakery,Italian Restaurant
26,Exhibition Place,Breakfast Spot,Coffee Shop,Café,Caribbean Restaurant,Restaurant,Burrito Place,Bar,Stadium,Bakery,Italian Restaurant
27,Parkdale Village,Breakfast Spot,Coffee Shop,Café,Caribbean Restaurant,Restaurant,Burrito Place,Bar,Stadium,Bakery,Italian Restaurant
40,North Toronto West,Coffee Shop,Yoga Studio,Bagel Shop,Gym / Fitness Center,Fast Food Restaurant,Diner,Dessert Shop,Mexican Restaurant,Clothing Store,Park
44,Parkdale,Gift Shop,Breakfast Spot,Restaurant,Dog Run,Bookstore,Coffee Shop,Bar,Bank,Dessert Shop,Movie Theater
45,Roncesvalles,Gift Shop,Breakfast Spot,Restaurant,Dog Run,Bookstore,Coffee Shop,Bar,Bank,Dessert Shop,Movie Theater


#### It seems that Coffee Shop and Breakfast Spot venues drive the allocation to Cluster 5.
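
Instead of reading each cluster table by eye, the most frequent "1st Most Common Venue" per cluster label can be computed. The sketch below uses five rows copied from the merged table shown earlier. In the notebook the same call could be applied to `toronto_merged`.

```python
import pandas as pd

# Five rows taken from the merged table's head
sample = pd.DataFrame({
    'Neighborhood': ['Harbourfront', 'Regent Park', 'Ryerson',
                     'Garden District', 'St. James Town'],
    'Cluster Labels': [4, 4, 0, 0, 0],
    '1st Most Common Venue': ['Coffee Shop', 'Coffee Shop', 'Café',
                              'Café', 'Gastropub'],
})

# For each cluster label, pick the venue category that appears most often
top = sample.groupby('Cluster Labels')['1st Most Common Venue'].agg(
    lambda s: s.value_counts().idxmax())

print(top)
```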