In this code, you will use Web Scraping with BeautifulSoup and Requests ,also how to convert addresses into their equivalent latitude and longitude values. Also, you will use the Foursquare API to explore neighborhoods in New York City. You will use the explore function to get the most common venue categories in each neighborhood, and then use this feature to group the neighborhoods into clusters. You will use the k-means clustering algorithm to complete this task. Finally, you will use the Folium library to visualize the neighborhoods in New York City and their emerging clusters.

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analsysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

from bs4 import BeautifulSoup

print('Libraries imported.')

Solving environment: done

# All requested packages already installed.

Solving environment: done

# All requested packages already installed.

Libraries imported.


In [2]:
website_url = requests.get('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M').text
soup = BeautifulSoup(website_url,'lxml')


extracing data from website

In [3]:
My_table = soup.find('table',{'class':'wikitable sortable'})
print('done"')

done"


clean the data and put it in Data frame

In [5]:

lst_dict = []
for items in My_table.find_all('tr')[1::1]:
    data = items.find_all(['th','td'])
    try:
        Postcode = data[0].text
        Borough = data[1].text
        Neighbourhood = data[2].text
    except IndexError:pass
    lst_dict.append({'Postcode':Postcode, 'Borough':Borough, 'Neighbourhood': Neighbourhood})
    
    


In [6]:
table1=pd.DataFrame.from_dict(lst_dict)

clean the data

In [7]:
table2=table1[table1.Borough != 'Not assigned']

In [8]:
table2.Neighbourhood[8] = "Queen's Park"
table2=table2.replace('\n', '', regex=True)

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  exec(code_obj, self.user_global_ns, self.user_ns)


More than one neighborhood can exist in one postal code area. For example, in the table on the Wikipedia page, you will notice that M5A is listed twice and has two neighborhoods: Harbourfront and Regent Park. These two rows will be combined into one row

In [9]:
table3=table2.groupby(['Postcode','Borough'])['Neighbourhood'].apply(', '.join).reset_index()

In [10]:
pd.set_option('expand_frame_repr', False)

In [11]:
table3.shape

(103, 3)

In [13]:
url="https://cocl.us/Geospatial_data"
c=pd.read_csv(url)
c=c.rename(index=str, columns={'Postal Code': 'Postcode'})

In [14]:
c.dtypes

Postcode      object
Latitude     float64
Longitude    float64
dtype: object

In [15]:
 neighborhoods= table3.merge(c,on='Postcode',how='left')
neighborhoods.shape

(103, 5)

In [16]:

print('The dataframe has {} boroughs and {} neighborhoods.'.format(
        len(neighborhoods['Borough'].unique()),
        neighborhoods.shape[0]
    )
)

The dataframe has 11 boroughs and 103 neighborhoods.


And make sure that the dataset has all 11 boroughs and 103 neighborhoods.

In [17]:
address = 'Toronto, ON'

geolocator = Nominatim()
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of Toronto are {}, {}.'.format(latitude, longitude))



The geograpical coordinate of Toronto are 43.653963, -79.387207.


#### Create a map of Toronto with neighborhoods superimposed on top.

In [18]:
# create map of Toronto using latitude and longitude values
map_Toronto = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Borough'], neighborhoods['Neighbourhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_Toronto)  
    
map_Toronto

**Folium** is a great visualization library. Feel free to zoom into the above map, and click on each circle mark to reveal the name of the neighborhood and its respective borough.

#### Define Foursquare Credentials and Version

In [19]:
CLIENT_ID = 'WZLVJPFDK44S0J1XC4SQCOEGLS34UQYOQLXRAGE2IA3UUT4L' # your Foursquare ID
CLIENT_SECRET = 'BPEX5BZV5TQQOBHQ4ZZN04HEV0LCCUT1AC4JYTQCNV53LDCJ' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: WZLVJPFDK44S0J1XC4SQCOEGLS34UQYOQLXRAGE2IA3UUT4L
CLIENT_SECRET:BPEX5BZV5TQQOBHQ4ZZN04HEV0LCCUT1AC4JYTQCNV53LDCJ


Let's get the geographical coordinates of Manhattan.

## 2. Explore Neighborhoods in Tornto

#### Let's create a function to repeat the same process to all the neighborhoods in Tornto

#### Define Foursquare Credentials and Version

In [24]:
LIMIT=100
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
        
        
        for i in range(1,50):
            current_radius = radius + 50 * (i - 1)
            # create the API request URL
            url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
                CLIENT_ID, 
                CLIENT_SECRET, 
                VERSION, 
                lat, 
                lng, 
                current_radius, 
                LIMIT)

            # make the GET request
            groups = requests.get(url).json()["response"]['groups']

            selected_group = []
            for group in groups:
                if len(group) > len(selected_group):
                    selected_group = group
                  

            venue_list = []
            if len(selected_group) > 0:
                results = selected_group['items']

                # return only relevant information for each nearby venue
                venue_list = [(
                    name, 
                    lat, 
                    lng, 
                    v['venue']['name'], 
                    v['venue']['location']['lat'], 
                    v['venue']['location']['lng'],  
                    v['venue']['categories'][0]['name'],
                    current_radius
                ) for v in results if v['venue']['categories']]
            
            if len(venue_list) > 10:
                venues_list.append(venue_list)
                break
        
            
                
                
            
                

            

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category',
                  'Explore Radius']
    
    return(nearby_venues)

#### Now write the code to run the above function on each neighborhood and create a new dataframe called *manhattan_venues*.

In [25]:
# type your answer here

Tornto_venues = getNearbyVenues(names=neighborhoods['Neighbourhood'],
                                   latitudes=neighborhoods['Latitude'],
                                   longitudes=neighborhoods['Longitude']
                                  )



Rouge, Malvern
Highland Creek, Rouge Hill, Port Union
Guildwood, Morningside, West Hill
Woburn
Cedarbrae
Scarborough Village
East Birchmount Park, Ionview, Kennedy Park
Clairlea, Golden Mile, Oakridge
Cliffcrest, Cliffside, Scarborough Village West
Birch Cliff, Cliffside West
Dorset Park, Scarborough Town Centre, Wexford Heights
Maryvale, Wexford
Agincourt
Clarks Corners, Sullivan, Tam O'Shanter
Agincourt North, L'Amoreaux East, Milliken, Steeles East
L'Amoreaux West, Steeles West
Upper Rouge
Hillcrest Village
Fairview, Henry Farm, Oriole
Bayview Village
Silver Hills, York Mills
Newtonbrook, Willowdale
Willowdale South
York Mills West
Willowdale West
Parkwoods
Don Mills North
Flemingdon Park, Don Mills South
Bathurst Manor, Downsview North, Wilson Heights
Northwood Park, York University
CFB Toronto, Downsview East
Downsview West
Downsview Central
Downsview Northwest
Victoria Village
Woodbine Gardens, Parkview Hill
Woodbine Heights
The Beaches
Leaside
Thorncliffe Park
East Toronto
The D

#### Let's check the size of the resulting dataframe

In [26]:
print(Tornto_venues.shape)


(2710, 8)


Let's check how many venues were returned for each neighborhood

In [27]:
Tornto_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,Explore Radius
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
"Adelaide, King, Richmond",100,100,100,100,100,100,100
Agincourt,15,15,15,15,15,15,15
"Agincourt North, L'Amoreaux East, Milliken, Steeles East",15,15,15,15,15,15,15
"Albion Gardens, Beaumond Heights, Humbergate, Jamestown, Mount Olive, Silverstone, South Steeles, Thistletown",11,11,11,11,11,11,11
"Alderwood, Long Branch",11,11,11,11,11,11,11
"Bathurst Manor, Downsview North, Wilson Heights",19,19,19,19,19,19,19
Bayview Village,12,12,12,12,12,12,12
"Bedford Park, Lawrence Manor East",25,25,25,25,25,25,25
Berczy Park,54,54,54,54,54,54,54
"Birch Cliff, Cliffside West",11,11,11,11,11,11,11


In [28]:
Tornto_venues.groupby('Neighborhood').count().shape

(103, 7)

#### Let's find out how many unique categories can be curated from all the returned venues

In [29]:
print('There are {} uniques categories.'.format(len(Tornto_venues['Venue Category'].unique())))

There are 290 uniques categories.


<a id='item3'></a>

## 3. Analyze Each Neighborhood

In [30]:
# one hot encoding
Toronto_onehot = pd.get_dummies(Tornto_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
Toronto_onehot['Neighborhood'] = Tornto_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [Toronto_onehot.columns[-1]] + list(Toronto_onehot.columns[:-1])
Toronto_onehot = Toronto_onehot[fixed_columns]


And let's examine the new dataframe size.

In [31]:
Toronto_onehot.shape

(2710, 290)

#### Next, let's group rows by neighborhood and by taking the mean of the frequency of occurrence of each category

In [32]:
toronto_grouped = Toronto_onehot.groupby('Neighborhood').mean().reset_index()


#### Let's confirm the new size

In [33]:
toronto_grouped.shape

(103, 290)

#### Let's print each neighborhood along with the top 5 most common venues

In [34]:
num_top_venues = 5

for hood in toronto_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = toronto_grouped[toronto_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Adelaide, King, Richmond----
                 venue  freq
0          Coffee Shop  0.07
1                 Café  0.06
2      Thai Restaurant  0.04
3           Steakhouse  0.04
4  American Restaurant  0.04


----Agincourt----
                venue  freq
0      Breakfast Spot  0.07
1     Badminton Court  0.07
2    Sushi Restaurant  0.07
3  Seafood Restaurant  0.07
4         Supermarket  0.07


----Agincourt North, L'Amoreaux East, Milliken, Steeles East----
                  venue  freq
0             BBQ Joint  0.13
1           Pizza Place  0.13
2  Caribbean Restaurant  0.07
3  Fast Food Restaurant  0.07
4                Bakery  0.07


----Albion Gardens, Beaumond Heights, Humbergate, Jamestown, Mount Olive, Silverstone, South Steeles, Thistletown----
                 venue  freq
0        Grocery Store  0.18
1             Pharmacy  0.09
2  Japanese Restaurant  0.09
3  Fried Chicken Joint  0.09
4       Sandwich Place  0.09


----Alderwood, Long Branch----
            venue  freq
0     P

#### Let's put that into a *pandas* dataframe

First, let's write a function to sort the venues in descending order.

In [36]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

Now let's create the new dataframe and display the top 10 venues for each neighborhood.

In [37]:
num_top_venues = 10


indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = toronto_grouped['Neighborhood']

for ind in np.arange(toronto_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Adelaide, King, Richmond",Coffee Shop,Café,Steakhouse,Thai Restaurant,American Restaurant,Bar,Restaurant,Gym,Hotel,Cosmetics Shop
1,Agincourt,Skating Rink,Breakfast Spot,Sushi Restaurant,Motorcycle Shop,Supermarket,Mediterranean Restaurant,Seafood Restaurant,Discount Store,Pool Hall,Shanghai Restaurant
2,"Agincourt North, L'Amoreaux East, Milliken, St...",Pizza Place,BBQ Joint,Pharmacy,Park,Noodle House,Chinese Restaurant,Caribbean Restaurant,Bubble Tea Shop,Shop & Service,Fast Food Restaurant
3,"Albion Gardens, Beaumond Heights, Humbergate, ...",Grocery Store,Pizza Place,Japanese Restaurant,Beer Store,Fried Chicken Joint,Discount Store,Sandwich Place,Fast Food Restaurant,Coffee Shop,Pharmacy
4,"Alderwood, Long Branch",Pizza Place,Coffee Shop,Sandwich Place,Bank,Gas Station,Pool,Pub,Skating Rink,Pharmacy,Gym
5,"Bathurst Manor, Downsview North, Wilson Heights",Coffee Shop,Ice Cream Shop,Shopping Mall,Pharmacy,Bridal Shop,Sandwich Place,Middle Eastern Restaurant,Bank,Diner,Supermarket
6,Bayview Village,Japanese Restaurant,Bank,Pharmacy,Grocery Store,Skating Rink,Skate Park,Shopping Mall,Café,Restaurant,Chinese Restaurant
7,"Bedford Park, Lawrence Manor East",Italian Restaurant,Coffee Shop,Fast Food Restaurant,Grocery Store,Thai Restaurant,Pharmacy,Comfort Food Restaurant,Pizza Place,Pub,Restaurant
8,Berczy Park,Coffee Shop,Cocktail Bar,Seafood Restaurant,Cheese Shop,Steakhouse,Bakery,Restaurant,Farmers Market,Café,Shopping Mall
9,"Birch Cliff, Cliffside West",College Stadium,Park,Café,Thai Restaurant,General Entertainment,Bank,Diner,Discount Store,Convenience Store,Gym


<a id='item4'></a>

## 4. Cluster Neighborhoods

Run *k*-means to cluster the neighborhood into 5 clusters.

In [38]:
# set number of clusters
kclusters = 10

toronto_grouped_clustering = toronto_grouped.drop('Neighborhood', 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(toronto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_.shape


(103,)

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [39]:
toronto_merged = neighborhoods

# add clustering labels
toronto_merged['Cluster Labels'] = kmeans.labels_

# merge toronto_grouped with toronto_data to add latitude/longitude for each Neighbourhood neighborhood neighborhoods
toronto_merged = toronto_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighbourhood')

toronto_merged.head() # check the last columns!

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353,6,Fast Food Restaurant,Coffee Shop,Filipino Restaurant,Martial Arts Dojo,Paper / Office Supplies Store,Hobby Shop,Bus Station,African Restaurant,Construction & Landscaping,Spa
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497,7,Hardware Store,Hotel,Pharmacy,Park,Burger Joint,Breakfast Spot,Italian Restaurant,Pizza Place,Grocery Store,Gym
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711,3,Pizza Place,Medical Center,Fast Food Restaurant,Fried Chicken Joint,Greek Restaurant,Sports Bar,Breakfast Spot,Mexican Restaurant,Electronics Store,Rental Car Location
3,M1G,Scarborough,Woburn,43.770992,-79.216917,3,Coffee Shop,Indian Restaurant,Sandwich Place,Fast Food Restaurant,Supplement Shop,Department Store,Bakery,Park,Juice Bar,Electronics Store
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476,1,Indian Restaurant,Hakka Restaurant,Athletics & Sports,Flower Shop,Bakery,Caribbean Restaurant,Thai Restaurant,Bank,Chinese Restaurant,Fried Chicken Joint


Finally, let's visualize the resulting clusters

In [40]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_merged['Latitude'], toronto_merged['Longitude'], toronto_merged['Neighbourhood'], toronto_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

<a id='item5'></a>

## 5. Examine Clusters

Now, you can examine each cluster and determine the discriminating venue categories that distinguish each cluster. Based on the defining categories, you can then assign a name to each cluster. I will leave this exercise to you.

#### Cluster 1

In [41]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 0, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
25,North York,0,Bus Stop,Park,Road,Caribbean Restaurant,Train Station,Furniture / Home Store,Food & Drink Shop,Pizza Place,Convenience Store,Eastern European Restaurant


#### Cluster 2

In [42]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 1, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,Scarborough,1,Indian Restaurant,Hakka Restaurant,Athletics & Sports,Flower Shop,Bakery,Caribbean Restaurant,Thai Restaurant,Bank,Chinese Restaurant,Fried Chicken Joint
5,Scarborough,1,Ice Cream Shop,Pizza Place,Bowling Alley,Coffee Shop,Fast Food Restaurant,Sandwich Place,Convenience Store,Restaurant,Japanese Restaurant,Train Station
10,Scarborough,1,Indian Restaurant,Electronics Store,Latin American Restaurant,Coffee Shop,Bakery,Chinese Restaurant,Brewery,Gaming Cafe,Vietnamese Restaurant,Pet Store
12,Scarborough,1,Skating Rink,Breakfast Spot,Sushi Restaurant,Motorcycle Shop,Supermarket,Mediterranean Restaurant,Seafood Restaurant,Discount Store,Pool Hall,Shanghai Restaurant
26,North York,1,Japanese Restaurant,Basketball Court,Café,Asian Restaurant,Caribbean Restaurant,Supermarket,Bank,Office,Mobile Phone Shop,Gym / Fitness Center
28,North York,1,Coffee Shop,Ice Cream Shop,Shopping Mall,Pharmacy,Bridal Shop,Sandwich Place,Middle Eastern Restaurant,Bank,Diner,Supermarket
46,Central Toronto,1,Sporting Goods Shop,Coffee Shop,Clothing Store,Health & Beauty Service,Park,Chinese Restaurant,Rental Car Location,Dessert Shop,Diner,Salon / Barbershop
52,Downtown Toronto,1,Japanese Restaurant,Coffee Shop,Sushi Restaurant,Gay Bar,Burger Joint,Restaurant,Fast Food Restaurant,Men's Store,Mediterranean Restaurant,Café
59,Downtown Toronto,1,Coffee Shop,Hotel,Pizza Place,Aquarium,Café,Scenic Lookout,Italian Restaurant,Sports Bar,Brewery,History Museum
69,Downtown Toronto,1,Coffee Shop,Café,Cocktail Bar,Hotel,Seafood Restaurant,Restaurant,Lounge,Pub,Art Gallery,Burger Joint


#### Cluster 3

In [43]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 2, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,Scarborough,2,Discount Store,Department Store,Chinese Restaurant,Bus Line,Convenience Store,Hobby Shop,Hockey Arena,Coffee Shop,Train Station,Light Rail Station
53,Downtown Toronto,2,Coffee Shop,Park,Café,Bakery,Pub,Mexican Restaurant,Theater,Breakfast Spot,Brewery,Italian Restaurant
58,Downtown Toronto,2,Coffee Shop,Café,Steakhouse,Thai Restaurant,American Restaurant,Bar,Restaurant,Gym,Hotel,Cosmetics Shop


#### Cluster 4

In [44]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 3, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Scarborough,3,Pizza Place,Medical Center,Fast Food Restaurant,Fried Chicken Joint,Greek Restaurant,Sports Bar,Breakfast Spot,Mexican Restaurant,Electronics Store,Rental Car Location
3,Scarborough,3,Coffee Shop,Indian Restaurant,Sandwich Place,Fast Food Restaurant,Supplement Shop,Department Store,Bakery,Park,Juice Bar,Electronics Store
24,North York,3,Coffee Shop,Pizza Place,Pharmacy,Shopping Mall,Sandwich Place,Asian Restaurant,Bakery,Bank,Bus Line,Eastern European Restaurant
61,Downtown Toronto,3,Coffee Shop,Café,Hotel,Restaurant,American Restaurant,Deli / Bodega,Italian Restaurant,Seafood Restaurant,Gastropub,Gym
100,Etobicoke,3,Pizza Place,American Restaurant,Beer Store,Mobile Phone Shop,Bank,Supermarket,Sandwich Place,Chinese Restaurant,Intersection,Coffee Shop


#### Cluster 5

In [45]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 4, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
9,Scarborough,4,College Stadium,Park,Café,Thai Restaurant,General Entertainment,Bank,Diner,Discount Store,Convenience Store,Gym
16,Scarborough,4,Zoo Exhibit,Park,Restaurant,Dessert Shop,Golf Course,Grocery Store,Zoo,Vietnamese Restaurant,Eastern European Restaurant,Diner
18,North York,4,Clothing Store,Fast Food Restaurant,Coffee Shop,Bakery,Metro Station,Cosmetics Shop,Tea Room,Smoothie Shop,Deli / Bodega,Candy Store
23,North York,4,Park,Pizza Place,Tennis Court,Dog Run,French Restaurant,Bank,Business Service,Diner,Pet Store,Bowling Alley
35,East York,4,Pizza Place,Fast Food Restaurant,Athletics & Sports,Rock Climbing Spot,Pharmacy,Gym / Fitness Center,Intersection,Gastropub,Bank,Donut Shop
37,East Toronto,4,Bakery,Pet Store,Coffee Shop,French Restaurant,Breakfast Spot,Pub,Vegetarian / Vegan Restaurant,Electronics Store,Trail,Zoo
47,Central Toronto,4,Dessert Shop,Sandwich Place,Italian Restaurant,Café,Sushi Restaurant,Pizza Place,Coffee Shop,Seafood Restaurant,Restaurant,Indoor Play Area
51,Downtown Toronto,4,Coffee Shop,Restaurant,Pizza Place,Café,Pub,Bakery,Italian Restaurant,Indian Restaurant,Park,Butcher
54,Downtown Toronto,4,Coffee Shop,Clothing Store,Cosmetics Shop,Café,Japanese Restaurant,Bar,Restaurant,Bubble Tea Shop,Tea Room,Sandwich Place
60,Downtown Toronto,4,Coffee Shop,Hotel,Café,American Restaurant,Restaurant,Gym,Italian Restaurant,Sports Bar,Gastropub,Deli / Bodega


In [46]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 6, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Scarborough,6,Fast Food Restaurant,Coffee Shop,Filipino Restaurant,Martial Arts Dojo,Paper / Office Supplies Store,Hobby Shop,Bus Station,African Restaurant,Construction & Landscaping,Spa
7,Scarborough,6,Diner,Bus Line,Bakery,Intersection,Soccer Field,Bus Station,Metro Station,Park,Ice Cream Shop,Eastern European Restaurant
8,Scarborough,6,Pizza Place,Ice Cream Shop,Beach,Park,Cajun / Creole Restaurant,Burger Joint,Furniture / Home Store,Sports Bar,Ethiopian Restaurant,Empanada Restaurant
11,Scarborough,6,Korean Restaurant,Pizza Place,Fish Market,Grocery Store,Vietnamese Restaurant,Coffee Shop,Convenience Store,Breakfast Spot,Bakery,Middle Eastern Restaurant
15,Scarborough,6,Fast Food Restaurant,Chinese Restaurant,Japanese Restaurant,Pizza Place,Pharmacy,Sandwich Place,Grocery Store,Coffee Shop,Breakfast Spot,Noodle House
19,North York,6,Japanese Restaurant,Bank,Pharmacy,Grocery Store,Skating Rink,Skate Park,Shopping Mall,Café,Restaurant,Chinese Restaurant
20,North York,6,Park,Convenience Store,Japanese Restaurant,Discount Store,Intersection,Steakhouse,Pub,Furniture / Home Store,Coffee Shop,Gym
22,North York,6,Ramen Restaurant,Restaurant,Café,Pizza Place,Sandwich Place,Grocery Store,Shopping Mall,Steakhouse,Middle Eastern Restaurant,Indonesian Restaurant
27,North York,6,Coffee Shop,Asian Restaurant,Gym,Beer Store,Sporting Goods Shop,General Entertainment,Fast Food Restaurant,Clothing Store,Italian Restaurant,Sandwich Place
32,North York,6,Vietnamese Restaurant,Fast Food Restaurant,Falafel Restaurant,Grocery Store,Coffee Shop,Supermarket,Moving Target,Park,Baseball Field,Dumpling Restaurant


In [47]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 7, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Scarborough,7,Hardware Store,Hotel,Pharmacy,Park,Burger Joint,Breakfast Spot,Italian Restaurant,Pizza Place,Grocery Store,Gym
14,Scarborough,7,Pizza Place,BBQ Joint,Pharmacy,Park,Noodle House,Chinese Restaurant,Caribbean Restaurant,Bubble Tea Shop,Shop & Service,Fast Food Restaurant
17,North York,7,Pharmacy,Housing Development,Pizza Place,Bakery,Bank,Shopping Mall,Coffee Shop,Recreation Center,Sandwich Place,Diner
29,North York,7,Furniture / Home Store,Pizza Place,Sandwich Place,Fast Food Restaurant,Sports Bar,Massage Studio,Frame Store,Bank,Bar,Japanese Restaurant
30,North York,7,Turkish Restaurant,Coffee Shop,Italian Restaurant,Sandwich Place,Middle Eastern Restaurant,Latin American Restaurant,Park,Pizza Place,Vietnamese Restaurant,Airport
31,North York,7,Park,Pizza Place,Vietnamese Restaurant,Tea Room,Bank,Shopping Mall,Coffee Shop,Moving Target,Plaza,Grocery Store
34,North York,7,Park,Coffee Shop,Intersection,Portuguese Restaurant,Hockey Arena,Sporting Goods Shop,Men's Store,Lounge,Pizza Place,Event Space
36,East York,7,Asian Restaurant,Dance Studio,Skating Rink,Beer Store,Spa,Bus Line,Park,Video Store,Cosmetics Shop,Restaurant
39,East York,7,Sandwich Place,Indian Restaurant,Pizza Place,Park,Bank,Bus Station,Burger Joint,Coffee Shop,Discount Store,Warehouse Store
42,East Toronto,7,Sandwich Place,Brewery,Burger Joint,Sushi Restaurant,Intersection,Steakhouse,Movie Theater,Food & Drink Shop,Ice Cream Shop,Hotel


In [48]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 8, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
93,Etobicoke,8,Pharmacy,Café,Playground,Grocery Store,Skating Rink,Bakery,Park,Golf Course,Convenience Store,Bank


In [49]:
toronto_merged.loc[toronto_merged['Cluster Labels'] == 9, toronto_merged.columns[[1] + list(range(5, toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
21,North York,9,Liquor Store,Trail,Pizza Place,Pet Store,Coffee Shop,Park,Sandwich Place,Electronics Store,Sporting Goods Shop,Bank
38,East York,9,Coffee Shop,Sporting Goods Shop,Burger Joint,Gym,Fish & Chips Shop,Sushi Restaurant,Bagel Shop,Supermarket,Bank,Mexican Restaurant
68,Downtown Toronto,9,Airport Terminal,Airport Service,Airport Lounge,Harbor / Marina,Airport Gate,Boat or Ferry,Sculpture Garden,Boutique,Airport Food Court,Plane
76,West Toronto,9,Supermarket,Bakery,Discount Store,Liquor Store,Art Gallery,Brewery,Middle Eastern Restaurant,Bar,Bank,Café
82,West Toronto,9,Mexican Restaurant,Café,Bar,Diner,Gastropub,Bookstore,Flea Market,Sandwich Place,Fried Chicken Joint,Bakery
102,Etobicoke,9,Coffee Shop,Racecourse,Hotel,Sandwich Place,Dog Run,Rental Car Location,Storage Facility,Casino,Swiss Restaurant,Paper / Office Supplies Store
