## 1. Download and Explore Dataset

In [2]:
import numpy as np 
import pandas as pd 
import json 
!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim 
import requests 
from pandas.io.json import json_normalize 
import matplotlib.cm as cm
import matplotlib.colors as colors
from sklearn.cluster import KMeans
!conda install folium -c conda-forge
import folium # map rendering library
print('Libraries imported.')

Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/Python36

  added / updated specs: 
    - geopy


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    ca-certificates-2019.11.28 |       hecc5488_0         145 KB  conda-forge
    openssl-1.1.1d             |       h516909a_0         2.1 MB  conda-forge
    geopy-1.21.0               |             py_0          58 KB  conda-forge
    certifi-2019.11.28         |           py36_0         149 KB  conda-forge
    geographiclib-1.50         |             py_0          34 KB  conda-forge
    ------------------------------------------------------------
                                           Total:         2.5 MB

The following NEW packages will be INSTALLED:

    geographiclib:   1.50-py_0         conda-forge
    geopy:           1.21.0-py_0       conda-forge

The following packages will be UPDATED:

    ca-

In [3]:
table = pd.read_html('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M', header=0)[0]
table.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront


The Wikipedia table has been scraped. The next step is to prepare the data for further analysis.

In [4]:
# Align the column name with the rest of the notebook.
df = table.rename(columns={"Postcode": "PostalCode"})
# Drop rows whose borough is not assigned.
df = df[df.Borough != 'Not assigned'].reset_index(drop=True)
# Where the neighbourhood is not assigned, fall back to the borough name.
unassigned = df['Neighbourhood'] == 'Not assigned'
df.loc[unassigned, 'Neighbourhood'] = df.loc[unassigned, 'Borough']
# Merge rows that share a postal code: keep the first borough and join the
# neighbourhood names with ", ". sort=False keeps the original row order.
df1 = (df.groupby('PostalCode', sort=False)
         .agg({'Borough': 'first', 'Neighbourhood': ', '.join})
         .reset_index())

df1.head(10)

Unnamed: 0,PostalCode,Borough,Neighbourhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,Harbourfront
3,M6A,North York,"Lawrence Heights, Lawrence Manor"
4,M7A,Downtown Toronto,Queen's Park
5,M9A,Queen's Park,Queen's Park
6,M1B,Scarborough,"Rouge, Malvern"
7,M3B,North York,Don Mills North
8,M4B,East York,"Woodbine Gardens, Parkview Hill"
9,M5B,Downtown Toronto,"Ryerson, Garden District"


As the dataframe above shows, rows with a "Not assigned" borough have been removed, and neighbourhoods that share a postal code have been combined into a single row, separated by commas. In addition, where a neighbourhood was not assigned, the borough name has been copied into the neighbourhood cell.

In [5]:
df1.shape

(103, 3)
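
The shape alone does not confirm that the cleaning rules described above actually held. A minimal sketch of a sanity check follows, run on a toy frame standing in for `df1`; the helper name `check_cleaned` is hypothetical and not part of the notebook.

```python
import pandas as pd

def check_cleaned(frame):
    """Return a list of problems found in a cleaned postal-code frame."""
    problems = []
    if (frame['Borough'] == 'Not assigned').any():
        problems.append('unassigned borough')
    if (frame['Neighbourhood'] == 'Not assigned').any():
        problems.append('unassigned neighbourhood')
    if frame['PostalCode'].duplicated().any():
        problems.append('duplicate postal code')
    return problems

# Toy data standing in for df1 (values copied from the table above).
toy = pd.DataFrame({
    'PostalCode': ['M3A', 'M6A'],
    'Borough': ['North York', 'North York'],
    'Neighbourhood': ['Parkwoods', 'Lawrence Heights, Lawrence Manor'],
})
print(check_cleaned(toy))  # an empty list means all three rules hold
```

Calling `check_cleaned(df1)` in the notebook would apply the same three checks to the real data.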

## 2. Adding the Geographical Coordinates of the Neighborhoods Using the geopy Package

In [6]:
column_names = ['PostalCode', 'Borough', 'Neighborhood', 'Latitude', 'Longitude']
# Create the geocoder once instead of on every iteration.
geolocator = Nominatim(user_agent="to_explorer")

rows = []
for postalcode, borough, neighborhood_name in zip(df1['PostalCode'], df1['Borough'], df1['Neighbourhood']):
    # The query uses the borough name, so every neighbourhood in the
    # same borough receives the same coordinates.
    address = borough + ', TO'
    location = geolocator.geocode(address)
    latitude = location.latitude
    longitude = location.longitude

    rows.append({'PostalCode': postalcode,
                 'Borough': borough,
                 'Neighborhood': neighborhood_name,
                 'Latitude': latitude,
                 'Longitude': longitude})

# Build the frame in one step; DataFrame.append was removed in pandas 2.0.
df3 = pd.DataFrame(rows, columns=column_names)
df3.shape

(103, 5)

In [7]:
df3.head(10)

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M3A,North York,Parkwoods,43.754326,-79.449117
1,M4A,North York,Victoria Village,43.754326,-79.449117
2,M5A,Downtown Toronto,Harbourfront,43.654174,-79.380812
3,M6A,North York,"Lawrence Heights, Lawrence Manor",43.754326,-79.449117
4,M7A,Downtown Toronto,Queen's Park,43.654174,-79.380812
5,M9A,Queen's Park,Queen's Park,43.659659,-79.39034
6,M1B,Scarborough,"Rouge, Malvern",43.773077,-79.257774
7,M3B,North York,Don Mills North,43.754326,-79.449117
8,M4B,East York,"Woodbine Gardens, Parkview Hill",43.699971,-79.33252
9,M5B,Downtown Toronto,"Ryerson, Garden District",43.654174,-79.380812
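
In the table above, every row with the same borough carries the same coordinates, because the geocoding query is built from the borough name. The geocoder is therefore asked the same question many times. The sketch below looks up each distinct borough once and tolerates a failed lookup. It assumes a geocoding callable that returns an object with `latitude` and `longitude` attributes, or `None` when nothing is found, as geopy's `geocode` does. The names `geocode_boroughs` and `fake_geocode` are hypothetical, and the fake stands in for `geolocator.geocode` so the example runs without network access.

```python
from collections import namedtuple

def geocode_boroughs(boroughs, geocode, suffix=', TO'):
    """Geocode each distinct borough once; map borough -> (lat, lon) or None."""
    cache = {}
    for borough in boroughs:
        if borough in cache:
            continue  # already looked up
        location = geocode(borough + suffix)
        cache[borough] = (
            (location.latitude, location.longitude) if location is not None else None
        )
    return cache

# Stand-in for geolocator.geocode (assumption: same return shape).
Location = namedtuple('Location', ['latitude', 'longitude'])
calls = []

def fake_geocode(address):
    calls.append(address)
    known = {'North York, TO': Location(43.754326, -79.449117)}
    return known.get(address)

coords = geocode_boroughs(['North York', 'North York', 'Nowhere'], fake_geocode)
print(coords)
print(len(calls))  # one lookup per distinct borough
```

The resulting dictionary could then be mapped onto the `Borough` column to fill `Latitude` and `Longitude`, with `None` entries handled explicitly instead of raising an `AttributeError`.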


## 3. Explore and Cluster the Neighborhoods in Toronto

With all the data required to explore and cluster the neighbourhoods in Toronto in hand, let's use the Foursquare API to explore the neighbourhoods. The API credentials are as follows:

In [8]:
CLIENT_ID = 'WVNVCGPZRK3KI0RVKDVIGYPP2BEBXGWBVZ1ELXJBWZH2EADA' 
CLIENT_SECRET = 'B20AIXRRFRAPNXA4EYQEZNGVF22PHALMCQHWB2OVXWVVVOIY' 
VERSION = '20180605' 

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: WVNVCGPZRK3KI0RVKDVIGYPP2BEBXGWBVZ1ELXJBWZH2EADA
CLIENT_SECRET:B20AIXRRFRAPNXA4EYQEZNGVF22PHALMCQHWB2OVXWVVVOIY
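
Hard-coding and printing credentials in a shared notebook exposes them to every reader. A minimal sketch of one alternative is to read them from environment variables and mask them when printing. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helpers `load_credentials` and `mask` are assumptions made for this example.

```python
import os

def load_credentials(env=os.environ):
    """Read API credentials from the environment; raise if one is missing."""
    try:
        return env['FOURSQUARE_CLIENT_ID'], env['FOURSQUARE_CLIENT_SECRET']
    except KeyError as missing:
        raise RuntimeError('Missing environment variable: {}'.format(missing)) from None

def mask(secret, visible=4):
    """Show only the first few characters when printing a secret."""
    return secret[:visible] + '*' * max(len(secret) - visible, 0)

# Demonstration with placeholder values rather than real credentials.
demo_env = {
    'FOURSQUARE_CLIENT_ID': 'example-id',
    'FOURSQUARE_CLIENT_SECRET': 'example-secret',
}
client_id, client_secret = load_credentials(demo_env)
print('CLIENT_ID: ' + mask(client_id))
```

In the notebook, `CLIENT_ID, CLIENT_SECRET = load_credentials()` would replace the literal assignments once the two variables are set in the environment.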


The analysis covers the entire area. The function below fetches up to 100 venues within a 500 m radius of each neighbourhood's coordinates.

In [9]:
LIMIT = 100 
radius = 500
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
        results = requests.get(url).json()["response"]['groups'][0]['items']
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
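
The list comprehension in `getNearbyVenues` assumes that the response always contains a `groups` entry and that every venue has at least one category; if either is missing, the cell fails with a `KeyError` or `IndexError`. Below is a sketch of a more defensive parser, run on a hand-written payload that mimics only the fields the function reads. The payload and the helper name `extract_venues` are illustrative assumptions, not actual API output.

```python
def extract_venues(payload, name, lat, lng):
    """Flatten an explore-style response into row tuples, skipping incomplete items."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item.get('venue', {})
        categories = venue.get('categories', [])
        location = venue.get('location', {})
        if not categories or 'lat' not in location or 'lng' not in location:
            continue  # skip venues missing a category or coordinates
        rows.append((name, lat, lng, venue.get('name'),
                     location['lat'], location['lng'], categories[0]['name']))
    return rows

# Hand-written sample shaped like the fields read above (not real API output).
sample = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Tim Hortons',
               'location': {'lat': 43.754767, 'lng': -79.44325},
               'categories': [{'name': 'Coffee Shop'}]}},
    {'venue': {'name': 'Uncategorised place',
               'location': {'lat': 43.75, 'lng': -79.45},
               'categories': []}},
]}]}}

rows = extract_venues(sample, 'Parkwoods', 43.754326, -79.449117)
print(rows)
print(extract_venues({}, 'Parkwoods', 43.754326, -79.449117))
```

The tuples have the same field order as the columns assigned to `nearby_venues`, so the helper could replace the comprehension without changing the rest of the function.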

In [10]:
toronto_venues = getNearbyVenues(names=df3['Neighborhood'], latitudes=df3['Latitude'], longitudes=df3['Longitude'] )

Parkwoods
Victoria Village
Harbourfront
Lawrence Heights, Lawrence Manor
Queen's Park
Queen's Park
Rouge, Malvern
Don Mills North
Woodbine Gardens, Parkview Hill
Ryerson, Garden District
Glencairn
Cloverdale, Islington, Martin Grove, Princess Gardens, West Deane Park
Highland Creek, Rouge Hill, Port Union
Flemingdon Park, Don Mills South
Woodbine Heights
St. James Town
Humewood-Cedarvale
Bloordale Gardens, Eringate, Markland Wood, Old Burnhamthorpe
Guildwood, Morningside, West Hill
The Beaches
Berczy Park
Caledonia-Fairbanks
Woburn
Leaside
Central Bay Street
Christie
Cedarbrae
Hillcrest Village
Bathurst Manor, Downsview North, Wilson Heights
Thorncliffe Park
Adelaide, King, Richmond
Dovercourt Village, Dufferin
Scarborough Village
Fairview, Henry Farm, Oriole
Northwood Park, York University
East Toronto
Harbourfront East, Toronto Islands, Union Station
Little Portugal, Trinity
East Birchmount Park, Ionview, Kennedy Park
Bayview Village
CFB Toronto, Downsview East
The Danforth West, Riv

In [11]:
print(toronto_venues.shape)
toronto_venues.head()

(4096, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Parkwoods,43.754326,-79.449117,Grill Gate,43.753123,-79.45169,Mediterranean Restaurant
1,Parkwoods,43.754326,-79.449117,Tim Hortons,43.754767,-79.44325,Coffee Shop
2,Parkwoods,43.754326,-79.449117,Orly Restaurant & Grill,43.754493,-79.443507,Middle Eastern Restaurant
3,Parkwoods,43.754326,-79.449117,Crave Restaurant,43.753133,-79.450378,Wings Joint
4,Victoria Village,43.754326,-79.449117,Grill Gate,43.753123,-79.45169,Mediterranean Restaurant


Let's count how many venues were returned for each neighbourhood.

In [45]:
ff=toronto_venues[['Venue','Neighborhood']].groupby('Neighborhood').count()
ff

Unnamed: 0_level_0,Venue
Neighborhood,Unnamed: 1_level_1
"Adelaide, King, Richmond",100
Agincourt,42
"Agincourt North, L'Amoreaux East, Milliken, Steeles East",42
"Albion Gardens, Beaumond Heights, Humbergate, Jamestown, Mount Olive, Silverstone, South Steeles, Thistletown",5
"Alderwood, Long Branch",5
"Bathurst Manor, Downsview North, Wilson Heights",4
Bayview Village,4
"Bedford Park, Lawrence Manor East",4
Berczy Park,100
"Birch Cliff, Cliffside West",42


The number of unique categories is as follows:

In [46]:
print('There are {} unique categories.'.format(len(toronto_venues['Venue Category'].unique())))

There are 113 unique categories.


Let's analyze each neighbourhood in more detail, starting by one-hot encoding the venue categories.

In [47]:
toronto_onehot = pd.get_dummies(toronto_venues[['Venue Category']], prefix="", prefix_sep="")
toronto_onehot['Neighborhood'] = toronto_venues['Neighborhood'] 
fixed_columns = [toronto_onehot.columns[-1]] + list(toronto_onehot.columns[:-1])
toronto_onehot = toronto_onehot[fixed_columns]
print(toronto_onehot.shape)
toronto_onehot.head()

(4096, 113)


Unnamed: 0,Yoga Studio,American Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,Bakery,Bank,Bar,Beer Bar,Beer Store,...,Tapas Restaurant,Tea Room,Thai Restaurant,Theater,Toy / Game Store,University,Vegetarian / Vegan Restaurant,Video Game Store,Wings Joint,Women's Store
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,1,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
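
In the output above, the first column is `Yoga Studio` rather than `Neighborhood`, and the column count equals the number of unique categories. One explanation consistent with both observations is that a venue category is itself named `Neighborhood`, so the assignment overwrote that dummy column in place instead of appending a new last column. The sketch below shows one way to avoid such a clash on toy data; the toy values and the renamed column `Neighborhood (category)` are assumptions for illustration.

```python
import pandas as pd

venues = pd.DataFrame({
    'Neighborhood': ['Parkwoods', 'Parkwoods', 'Harbourfront'],
    'Venue Category': ['Coffee Shop', 'Neighborhood', 'Yoga Studio'],
})

# One indicator column per category.
onehot = pd.get_dummies(venues['Venue Category'], dtype=int)
# Rename a category that would collide with the label column.
onehot = onehot.rename(columns={'Neighborhood': 'Neighborhood (category)'})
# Insert the label column explicitly at position 0 instead of relying on column order.
onehot.insert(0, 'Neighborhood', venues['Neighborhood'])

print(onehot.columns.tolist())

# Mean of each indicator per neighbourhood gives the category frequencies.
grouped = onehot.groupby('Neighborhood').mean().reset_index()
print(grouped)
```

With this approach the label column is always first and no category is silently lost, at the cost of one extra column compared with the shape shown above.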


Let's group the rows by neighbourhood and take the mean of the frequency of occurrence of each category.

In [48]:
toronto_grouped = toronto_onehot.groupby('Neighborhood').mean().reset_index()
print(toronto_grouped.shape)
toronto_grouped.head()

(102, 113)


Unnamed: 0,Neighborhood,Yoga Studio,American Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,Bakery,Bank,Bar,Beer Bar,...,Tapas Restaurant,Tea Room,Thai Restaurant,Theater,Toy / Game Store,University,Vegetarian / Vegan Restaurant,Video Game Store,Wings Joint,Women's Store
0,"Adelaide, King, Richmond",0.0,0.02,0.01,0.0,0.01,0.04,0.01,0.01,0.01,...,0.0,0.02,0.01,0.02,0.01,0.0,0.0,0.0,0.0,0.01
1,Agincourt,0.0,0.02381,0.0,0.0,0.0,0.02381,0.02381,0.0,0.0,...,0.0,0.047619,0.02381,0.0,0.02381,0.0,0.0,0.02381,0.02381,0.02381
2,"Agincourt North, L'Amoreaux East, Milliken, St...",0.0,0.02381,0.0,0.0,0.0,0.02381,0.02381,0.0,0.0,...,0.0,0.047619,0.02381,0.0,0.02381,0.0,0.0,0.02381,0.02381,0.02381
3,"Albion Gardens, Beaumond Heights, Humbergate, ...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"Alderwood, Long Branch",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


Let's print the five most common venue categories for each neighbourhood.

In [63]:
num_top_venues = 5

for hood in toronto_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = toronto_grouped[toronto_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Adelaide, King, Richmond----
                venue  freq
0      Clothing Store  0.08
1          Restaurant  0.05
2         Coffee Shop  0.04
3              Bakery  0.04
4  Seafood Restaurant  0.03


----Agincourt----
            venue  freq
0  Clothing Store  0.19
1     Coffee Shop  0.07
2        Pharmacy  0.05
3  Cosmetics Shop  0.05
4        Tea Room  0.05


----Agincourt North, L'Amoreaux East, Milliken, Steeles East----
            venue  freq
0  Clothing Store  0.19
1     Coffee Shop  0.07
2        Pharmacy  0.05
3  Cosmetics Shop  0.05
4        Tea Room  0.05


----Albion Gardens, Beaumond Heights, Humbergate, Jamestown, Mount Olive, Silverstone, South Steeles, Thistletown----
         venue  freq
0  Supermarket   0.2
1       Garden   0.2
2   Playground   0.2
3     Pharmacy   0.2
4  Coffee Shop   0.2


----Alderwood, Long Branch----
         venue  freq
0  Supermarket   0.2
1       Garden   0.2
2   Playground   0.2
3     Pharmacy   0.2
4  Coffee Shop   0.2


----Bathurst Mano

Let's generate a dataframe showing the ten most common venue categories for each neighbourhood.

In [64]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False) 
    return row_categories_sorted.index.values[0:num_top_venues]
num_top_venues = 10
indicators = ['st', 'nd', 'rd']
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']
for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Adelaide, King, Richmond",Clothing Store,Restaurant,Bakery,Coffee Shop,Seafood Restaurant,Sushi Restaurant,Cosmetics Shop,Electronics Store,Italian Restaurant,Breakfast Spot
1,Agincourt,Clothing Store,Coffee Shop,Sandwich Place,Pharmacy,Cosmetics Shop,Tea Room,Discount Store,Gym,Pizza Place,Department Store
2,"Agincourt North, L'Amoreaux East, Milliken, St...",Clothing Store,Coffee Shop,Sandwich Place,Pharmacy,Cosmetics Shop,Tea Room,Discount Store,Gym,Pizza Place,Department Store
3,"Albion Gardens, Beaumond Heights, Humbergate, ...",Playground,Pharmacy,Coffee Shop,Garden,Supermarket,Women's Store,Food Court,Discount Store,Donut Shop,Electronics Store
4,"Alderwood, Long Branch",Playground,Pharmacy,Coffee Shop,Garden,Supermarket,Women's Store,Food Court,Discount Store,Donut Shop,Electronics Store
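
The column labels above are built by falling back to `th` once the three-item suffix list runs out. That is correct for 1 through 10 but would produce `21th` if `num_top_venues` ever grew past 20. Here is a sketch of a general helper; the name `ordinal` is hypothetical.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th, 13th and the rest of the teens
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

num_top_venues = 10
columns = ['Neighborhood'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, num_top_venues + 1)
]
print(columns[1], columns[3], columns[10])
```

For `num_top_venues = 10` this yields the same column names as the cell above, without relying on an exception for control flow.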


Now we are ready to cluster the neighbourhoods using k-means clustering.

In [65]:
kclusters = 5
toronto_grouped_clustering = toronto_grouped.drop(columns='Neighborhood')
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)
kmeans.labels_[0:10] 



array([0, 0, 0, 1, 1, 2, 2, 2, 0, 0], dtype=int32)
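
Here is a small sketch of what the clustering step does, using a made-up frequency table in place of `toronto_grouped`. The toy values are illustrative only. Passing `n_init` explicitly is a choice made here so the result does not depend on the default, which has changed between scikit-learn releases.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Toy frequency table: two coffee-heavy and two park-heavy neighbourhoods.
toy_grouped = pd.DataFrame({
    'Neighborhood': ['A', 'B', 'C', 'D'],
    'Coffee Shop': [0.8, 0.8, 0.1, 0.1],
    'Park':        [0.2, 0.2, 0.9, 0.9],
})

# Cluster on the numeric frequency columns only.
features = toy_grouped.drop(columns='Neighborhood')
kmeans = KMeans(n_clusters=2, random_state=0, n_init=10).fit(features)

toy_grouped['Cluster Labels'] = kmeans.labels_
print(toy_grouped[['Neighborhood', 'Cluster Labels']])
```

Rows with identical frequencies always receive the same label, which is what happens to neighbourhoods that share coordinates and therefore share the same venue list.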

In [66]:
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
toronto_merged =df3
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

toronto_merged

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M3A,North York,Parkwoods,43.754326,-79.449117,2,Wings Joint,Coffee Shop,Mediterranean Restaurant,Middle Eastern Restaurant,Fried Chicken Joint,Discount Store,Donut Shop,Electronics Store,Falafel Restaurant,Farmers Market
1,M4A,North York,Victoria Village,43.754326,-79.449117,2,Wings Joint,Coffee Shop,Mediterranean Restaurant,Middle Eastern Restaurant,Fried Chicken Joint,Discount Store,Donut Shop,Electronics Store,Falafel Restaurant,Farmers Market
2,M5A,Downtown Toronto,Harbourfront,43.654174,-79.380812,0,Clothing Store,Restaurant,Bakery,Coffee Shop,Seafood Restaurant,Sushi Restaurant,Cosmetics Shop,Electronics Store,Italian Restaurant,Breakfast Spot
3,M6A,North York,"Lawrence Heights, Lawrence Manor",43.754326,-79.449117,2,Wings Joint,Coffee Shop,Mediterranean Restaurant,Middle Eastern Restaurant,Fried Chicken Joint,Discount Store,Donut Shop,Electronics Store,Falafel Restaurant,Farmers Market
4,M7A,Downtown Toronto,Queen's Park,43.654174,-79.380812,0,Coffee Shop,Clothing Store,Café,Restaurant,Italian Restaurant,Sandwich Place,Sushi Restaurant,Ice Cream Shop,Chinese Restaurant,Juice Bar
5,M9A,Queen's Park,Queen's Park,43.659659,-79.390340,0,Coffee Shop,Clothing Store,Café,Restaurant,Italian Restaurant,Sandwich Place,Sushi Restaurant,Ice Cream Shop,Chinese Restaurant,Juice Bar
6,M1B,Scarborough,"Rouge, Malvern",43.773077,-79.257774,0,Clothing Store,Coffee Shop,Sandwich Place,Pharmacy,Cosmetics Shop,Tea Room,Discount Store,Gym,Pizza Place,Department Store
7,M3B,North York,Don Mills North,43.754326,-79.449117,2,Wings Joint,Coffee Shop,Mediterranean Restaurant,Middle Eastern Restaurant,Fried Chicken Joint,Discount Store,Donut Shop,Electronics Store,Falafel Restaurant,Farmers Market
8,M4B,East York,"Woodbine Gardens, Parkview Hill",43.699971,-79.332520,0,Pub,Pizza Place,Restaurant,Greek Restaurant,Pastry Shop,Liquor Store,Park,Women's Store,Food & Drink Shop,Discount Store
9,M5B,Downtown Toronto,"Ryerson, Garden District",43.654174,-79.380812,0,Clothing Store,Restaurant,Bakery,Coffee Shop,Seafood Restaurant,Sushi Restaurant,Cosmetics Shop,Electronics Store,Italian Restaurant,Breakfast Spot


In [67]:
# Centre the map on the last location geocoded above (latitude, longitude).
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)
# Build one colour per cluster.
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]
# Add one marker per neighbourhood, coloured by its cluster label.
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighborhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],  # label 0 wraps around to the last colour
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)

map_clusters

Clustering the neighbourhoods is complete, and the map above shows the result. Neighbourhoods in the same cluster share similar lists of their ten most common venue categories. The tables below list the neighbourhoods in each cluster.

### Cluster 1

In [68]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Harbourfront,0,Clothing Store,Restaurant,Bakery,Coffee Shop,Seafood Restaurant,Sushi Restaurant,Cosmetics Shop,Electronics Store,Italian Restaurant,Breakfast Spot
4,Queen's Park,0,Coffee Shop,Clothing Store,Café,Restaurant,Italian Restaurant,Sandwich Place,Sushi Restaurant,Ice Cream Shop,Chinese Restaurant,Juice Bar
5,Queen's Park,0,Coffee Shop,Clothing Store,Café,Restaurant,Italian Restaurant,Sandwich Place,Sushi Restaurant,Ice Cream Shop,Chinese Restaurant,Juice Bar
6,"Rouge, Malvern",0,Clothing Store,Coffee Shop,Sandwich Place,Pharmacy,Cosmetics Shop,Tea Room,Discount Store,Gym,Pizza Place,Department Store
8,"Woodbine Gardens, Parkview Hill",0,Pub,Pizza Place,Restaurant,Greek Restaurant,Pastry Shop,Liquor Store,Park,Women's Store,Food & Drink Shop,Discount Store
9,"Ryerson, Garden District",0,Clothing Store,Restaurant,Bakery,Coffee Shop,Seafood Restaurant,Sushi Restaurant,Cosmetics Shop,Electronics Store,Italian Restaurant,Breakfast Spot
12,"Highland Creek, Rouge Hill, Port Union",0,Clothing Store,Coffee Shop,Sandwich Place,Pharmacy,Cosmetics Shop,Tea Room,Discount Store,Gym,Pizza Place,Department Store
14,Woodbine Heights,0,Pub,Pizza Place,Restaurant,Greek Restaurant,Pastry Shop,Liquor Store,Park,Women's Store,Food & Drink Shop,Discount Store
15,St. James Town,0,Clothing Store,Restaurant,Bakery,Coffee Shop,Seafood Restaurant,Sushi Restaurant,Cosmetics Shop,Electronics Store,Italian Restaurant,Breakfast Spot
18,"Guildwood, Morningside, West Hill",0,Clothing Store,Coffee Shop,Sandwich Place,Pharmacy,Cosmetics Shop,Tea Room,Discount Store,Gym,Pizza Place,Department Store


### Cluster 2

In [69]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
11,"Cloverdale, Islington, Martin Grove, Princess ...",1,Playground,Pharmacy,Coffee Shop,Garden,Supermarket,Women's Store,Food Court,Discount Store,Donut Shop,Electronics Store
17,"Bloordale Gardens, Eringate, Markland Wood, Ol...",1,Playground,Pharmacy,Coffee Shop,Garden,Supermarket,Women's Store,Food Court,Discount Store,Donut Shop,Electronics Store
70,Westmount,1,Playground,Pharmacy,Coffee Shop,Garden,Supermarket,Women's Store,Food Court,Discount Store,Donut Shop,Electronics Store
77,"Kingsview Village, Martin Grove Gardens, Richv...",1,Playground,Pharmacy,Coffee Shop,Garden,Supermarket,Women's Store,Food Court,Discount Store,Donut Shop,Electronics Store
88,"Humber Bay Shores, Mimico South, New Toronto",1,Playground,Pharmacy,Coffee Shop,Garden,Supermarket,Women's Store,Food Court,Discount Store,Donut Shop,Electronics Store
89,"Albion Gardens, Beaumond Heights, Humbergate, ...",1,Playground,Pharmacy,Coffee Shop,Garden,Supermarket,Women's Store,Food Court,Discount Store,Donut Shop,Electronics Store
93,"Alderwood, Long Branch",1,Playground,Pharmacy,Coffee Shop,Garden,Supermarket,Women's Store,Food Court,Discount Store,Donut Shop,Electronics Store
94,Northwest,1,Playground,Pharmacy,Coffee Shop,Garden,Supermarket,Women's Store,Food Court,Discount Store,Donut Shop,Electronics Store
98,"The Kingsway, Montgomery Road, Old Mill North",1,Playground,Pharmacy,Coffee Shop,Garden,Supermarket,Women's Store,Food Court,Discount Store,Donut Shop,Electronics Store
101,"Humber Bay, King's Mill Park, Kingsway Park So...",1,Playground,Pharmacy,Coffee Shop,Garden,Supermarket,Women's Store,Food Court,Discount Store,Donut Shop,Electronics Store


### Cluster 3

In [70]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Parkwoods,2,Wings Joint,Coffee Shop,Mediterranean Restaurant,Middle Eastern Restaurant,Fried Chicken Joint,Discount Store,Donut Shop,Electronics Store,Falafel Restaurant,Farmers Market
1,Victoria Village,2,Wings Joint,Coffee Shop,Mediterranean Restaurant,Middle Eastern Restaurant,Fried Chicken Joint,Discount Store,Donut Shop,Electronics Store,Falafel Restaurant,Farmers Market
3,"Lawrence Heights, Lawrence Manor",2,Wings Joint,Coffee Shop,Mediterranean Restaurant,Middle Eastern Restaurant,Fried Chicken Joint,Discount Store,Donut Shop,Electronics Store,Falafel Restaurant,Farmers Market
7,Don Mills North,2,Wings Joint,Coffee Shop,Mediterranean Restaurant,Middle Eastern Restaurant,Fried Chicken Joint,Discount Store,Donut Shop,Electronics Store,Falafel Restaurant,Farmers Market
10,Glencairn,2,Wings Joint,Coffee Shop,Mediterranean Restaurant,Middle Eastern Restaurant,Fried Chicken Joint,Discount Store,Donut Shop,Electronics Store,Falafel Restaurant,Farmers Market
13,"Flemingdon Park, Don Mills South",2,Wings Joint,Coffee Shop,Mediterranean Restaurant,Middle Eastern Restaurant,Fried Chicken Joint,Discount Store,Donut Shop,Electronics Store,Falafel Restaurant,Farmers Market
27,Hillcrest Village,2,Wings Joint,Coffee Shop,Mediterranean Restaurant,Middle Eastern Restaurant,Fried Chicken Joint,Discount Store,Donut Shop,Electronics Store,Falafel Restaurant,Farmers Market
28,"Bathurst Manor, Downsview North, Wilson Heights",2,Wings Joint,Coffee Shop,Mediterranean Restaurant,Middle Eastern Restaurant,Fried Chicken Joint,Discount Store,Donut Shop,Electronics Store,Falafel Restaurant,Farmers Market
33,"Fairview, Henry Farm, Oriole",2,Wings Joint,Coffee Shop,Mediterranean Restaurant,Middle Eastern Restaurant,Fried Chicken Joint,Discount Store,Donut Shop,Electronics Store,Falafel Restaurant,Farmers Market
34,"Northwood Park, York University",2,Wings Joint,Coffee Shop,Mediterranean Restaurant,Middle Eastern Restaurant,Fried Chicken Joint,Discount Store,Donut Shop,Electronics Store,Falafel Restaurant,Farmers Market


### Cluster 4

In [71]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
19,The Beaches,3,Historic Site,Sculpture Garden,Boat or Ferry,Music Venue,Women's Store,French Restaurant,Discount Store,Donut Shop,Electronics Store,Falafel Restaurant
41,"The Danforth West, Riverdale",3,Historic Site,Sculpture Garden,Boat or Ferry,Music Venue,Women's Store,French Restaurant,Discount Store,Donut Shop,Electronics Store,Falafel Restaurant
47,"The Beaches West, India Bazaar",3,Historic Site,Sculpture Garden,Boat or Ferry,Music Venue,Women's Store,French Restaurant,Discount Store,Donut Shop,Electronics Store,Falafel Restaurant
54,Studio District,3,Historic Site,Sculpture Garden,Boat or Ferry,Music Venue,Women's Store,French Restaurant,Discount Store,Donut Shop,Electronics Store,Falafel Restaurant
100,Business Reply Mail Processing Centre 969 Eastern,3,Historic Site,Sculpture Garden,Boat or Ferry,Music Venue,Women's Store,French Restaurant,Discount Store,Donut Shop,Electronics Store,Falafel Restaurant


### Cluster 5

In [72]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[2] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
16,Humewood-Cedarvale,4,Pizza Place,Asian Restaurant,Park,Coffee Shop,Beer Store,Supermarket,Women's Store,French Restaurant,Donut Shop,Electronics Store
21,Caledonia-Fairbanks,4,Pizza Place,Asian Restaurant,Park,Coffee Shop,Beer Store,Supermarket,Women's Store,French Restaurant,Donut Shop,Electronics Store
56,"Del Ray, Keelesdale, Mount Dennis, Silverthorn",4,Pizza Place,Asian Restaurant,Park,Coffee Shop,Beer Store,Supermarket,Women's Store,French Restaurant,Donut Shop,Electronics Store
63,"The Junction North, Runnymede",4,Pizza Place,Asian Restaurant,Park,Coffee Shop,Beer Store,Supermarket,Women's Store,French Restaurant,Donut Shop,Electronics Store
64,Weston,4,Pizza Place,Asian Restaurant,Park,Coffee Shop,Beer Store,Supermarket,Women's Store,French Restaurant,Donut Shop,Electronics Store


In [74]:
num_top_venues = 10
column_names = ['Neighborhood','Top/Top10' ,'Venue']
dff=pd.DataFrame(columns=column_names)
for hood in toronto_grouped['Neighborhood']:
    temp = toronto_grouped[toronto_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 1})
    x=temp.sort_values('freq', ascending=False).reset_index(drop=True)
    y=np.sum(x.iloc[:11])
    
    dff=dff.append({'Neighborhood': hood,'Top10':y[1],'Top':x.iloc[0,1]}, ignore_index=True )
                                         

dff['venue']=ff['Venue'].tolist()
dff['Top10']=dff['Top10']*dff['venue']

dff=dff.sort_values('Top10', ascending=False).reset_index(drop=True)
dff

Unnamed: 0,Neighborhood,Top/Top10,Venue,Top,Top10,venue
0,Queen's Park,,,0.1,17.5,175
1,"Moore Park, Summerhill East",,,0.1,15.0,75
2,Roselawn,,,0.1,15.0,75
3,"Dovercourt Village, Dufferin",,,0.1,15.0,75
4,"High Park, The Junction South",,,0.1,15.0,75
5,"Deer Park, Forest Hill SE, Rathnelly, South Hi...",,,0.1,15.0,75
6,Davisville North,,,0.1,15.0,75
7,Davisville,,,0.1,15.0,75
8,Lawrence Park,,,0.1,15.0,75
9,"Little Portugal, Trinity",,,0.1,15.0,75
