## A. Introduction
#### This is a project to extract the postal code of Toronto from a WikiPedia page and produce a map representing them. 
#### The followings are the steps that need to be taken for this project:
     1. Extract the data and create the dataframe 
     2. Data Wrangling 
     3. Grouping the Data
     4. Find Latitude & Longitude Corresponding to Postal Codes
     5. Segmenting and Clustering the Data
     6. Analyzing the Data

### Installing and Importing Python Library

In [1]:
%pip install beautifulsoup4 # Installing the BeautifulSoup4
import pandas as pd # library for data analsysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation
from bs4 import BeautifulSoup
import requests # library to handle requests

Collecting beautifulsoup4
[?25l  Downloading https://files.pythonhosted.org/packages/3b/c8/a55eb6ea11cd7e5ac4bacdf92bac4693b90d3ba79268be16527555e186f0/beautifulsoup4-4.8.1-py3-none-any.whl (101kB)
[K     |████████████████████████████████| 102kB 17.0MB/s ta 0:00:01
[?25hCollecting soupsieve>=1.2 (from beautifulsoup4)
  Downloading https://files.pythonhosted.org/packages/5d/42/d821581cf568e9b7dfc5b415aa61952b0f5e3dede4f3cbd650e3a1082992/soupsieve-1.9.4-py2.py3-none-any.whl
Installing collected packages: soupsieve, beautifulsoup4
Successfully installed beautifulsoup4-4.8.1 soupsieve-1.9.4
Note: you may need to restart the kernel to use updated packages.


In [2]:
# Installing the Conda
!conda update -n base -c defaults conda --yes 

Solving environment: done

## Package Plan ##

  environment location: /home/jupyterlab/conda

  added / updated specs: 
    - conda


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    pip-19.2.3                 |           py37_0         1.9 MB
    chardet-3.0.4              |        py37_1003         173 KB
    certifi-2019.9.11          |           py37_0         154 KB
    python-3.7.4               |       h265db76_1        36.5 MB
    pysocks-1.7.1              |           py37_0          30 KB
    ncurses-6.1                |       he6710b0_1         958 KB
    setuptools-41.2.0          |           py37_0         630 KB
    conda-4.7.12               |           py37_0         3.0 MB
    pycparser-2.19             |           py37_0         172 KB
    wheel-0.33.6               |           py37_0          40 KB
    pyopenssl-19.0.0           |           py37_0          82 KB
    co

In [3]:
!pip -q install geopy
# conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
print('geopy installed...')
# convert an address into latitude and longitude values
from geopy.geocoders import Nominatim
print('Nominatim imported...')

geopy installed...
Nominatim imported...


In [4]:
# install the Geocoder
!pip -q install geocoder
import geocoder

In [5]:
import pandas as pd # Importing the library for Data Analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # Importing JAON library
print('numpy, pandas, ..., imported...')

# Importing JSON_Normalize to tranform JSON file into a pandas dataframe
from pandas.io.json import json_normalize
print('json_normalize imported...')



# Importing Matplotlib & Plot Modules
import matplotlib.cm as cm
import matplotlib.colors as colors
print('matplotlib imported...')

# Importing K-means from Clustering Stage
from sklearn.cluster import KMeans
print('Kmeans imported...')

import time # Import time

numpy, pandas, ..., imported...
json_normalize imported...
matplotlib imported...
Kmeans imported...


## B. Extracting the Postal Codes of Canada: M
#### In this part, the postal code of Canada starting with M will be extracted from a Wikipedai page and a dataframe will be created.

In [6]:
%pip install requests
import requests
# Loading data from internet
wikipedia_link='https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'
raw_wikipedia_page= requests.get(wikipedia_link).text

# Using beautiful soup to parse the HTML/XML codes.
from bs4 import BeautifulSoup
soup = BeautifulSoup(raw_wikipedia_page,'html')
# print(soup.prettify())

Note: you may need to restart the kernel to use updated packages.


In [7]:
My_table = soup.find('table', {"class": 'wikitable sortable'})

In [8]:
New_Table = soup.find('table')

Postcode      = []
Borough       = []
Neighbourhood = []

# Extracting a clean form of the table
for tr_cell in New_Table.find_all('tr'):
    
    counter = 1
    Postcode_var      = -1
    Borough_var       = -1
    Neighbourhood_var = -1
    
    for td_cell in tr_cell.find_all('td'):
        if counter == 1: 
            Postcode_var = td_cell.text
        if counter == 2: 
            Borough_var = td_cell.text
            tag_a_Borough = td_cell.find('a')
        if counter == 3: 
            Neighbourhood_var = str(td_cell.text).strip()
            tag_a_Neighbourhood = td_cell.find('a')
            
        counter +=1
        
    if (Postcode_var == 'Not assigned' or Borough_var == 'Not assigned' or Neighbourhood_var == 'Not assigned'): 
        continue
    try:
        if ((tag_a_Borough is None) or (tag_a_Neighbourhood is None)):
            continue
    except:
        pass
    if(Postcode_var == -1 or Borough_var == -1 or Neighbourhood_var == -1):
        continue
        
    Postcode.append(Postcode_var)
    Borough.append(Borough_var)
    Neighbourhood.append(Neighbourhood_var)

In [9]:
Unique_Postal = set(Postcode)
print('num of unique Postal codes:', len(Unique_Postal))
Postcode_u      = []
Borough_u       = []
Neighbourhood_u = []

for postcode_unique_element in Unique_Postal:
    p_var = ''; b_var = ''; n_var = ''; 
    for postcode_idx, postcode_element in enumerate(Postcode):
        if postcode_unique_element == postcode_element:
            p_var = postcode_element;
            b_var = Borough[postcode_idx]
            if n_var == '': 
                n_var = Neighbourhood[postcode_idx]
            else:
                n_var = n_var + ', ' + Neighbourhood[postcode_idx]
    Postcode_u.append(p_var)
    Borough_u.append(b_var)
    Neighbourhood_u.append(n_var)

num of unique Postal codes: 77


In [10]:
Toronto_Dictionary = {'Postcode':Postcode_u, 'Borough':Borough_u, 'Neighbourhood':Neighbourhood_u}
df_Toronto = pd.DataFrame.from_dict(Toronto_Dictionary)
df_Toronto.to_csv('Toronto_Part-A.csv')
df_Toronto.head(14)

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M9A,Etobicoke,Islington Avenue
1,M4H,East York,Thorncliffe Park
2,M1B,Scarborough,"Rouge, Malvern"
3,M9L,North York,Humber Summit
4,M4Y,Downtown Toronto,Church and Wellesley
5,M9N,York,Weston
6,M3J,North York,"Northwood Park, York University"
7,M2H,North York,Hillcrest Village
8,M2J,North York,Henry Farm
9,M5S,Downtown Toronto,University of Toronto


In [11]:
Toronto_Dictionary = {'Postcode':Postcode_u, 'Borough':Borough_u, 'Neighbourhood':Neighbourhood_u}
df_Toronto = pd.DataFrame.from_dict(Toronto_Dictionary)
df_Toronto.to_csv('Toronto_Part-A.csv')
df_Toronto.head(14)

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M9A,Etobicoke,Islington Avenue
1,M4H,East York,Thorncliffe Park
2,M1B,Scarborough,"Rouge, Malvern"
3,M9L,North York,Humber Summit
4,M4Y,Downtown Toronto,Church and Wellesley
5,M9N,York,Weston
6,M3J,North York,"Northwood Park, York University"
7,M2H,North York,Hillcrest Village
8,M2J,North York,Henry Farm
9,M5S,Downtown Toronto,University of Toronto


## C. Geospatial Data
#### In this part, the "Geospatial Data", latitude and longitude data will be found for each postal code. Then, the geospatial data will be assigned to their corresponding postal code in the dataframe.

In [12]:
Table_PostalCode = soup.find('table')
fields = Table_PostalCode.find_all('td')

postcode = []
borough = []
neighbourhood = []

for i in range(0, len(fields), 3):
    postcode.append(fields[i].text.strip())
    borough.append(fields[i+1].text.strip())
    neighbourhood.append(fields[i+2].text.strip())
        
df_PostalCode = pd.DataFrame(data=[postcode, borough, neighbourhood]).transpose()
df_PostalCode.columns = ['Postcode', 'Borough', 'Neighbourhood']
df_PostalCode.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront


In [13]:
df_PostalCodeNeighbor = df_PostalCode.groupby(['Postcode', 'Borough'])['Neighbourhood'].apply(', '.join).reset_index()
df_PostalCodeNeighbor.columns = ['Postcode', 'Borough', 'Neighbourhood']
df_PostalCodeNeighbor

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M1B,Scarborough,"Rouge, Malvern"
2,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union"
3,M1E,Scarborough,"Guildwood, Morningside, West Hill"
4,M1G,Scarborough,Woburn
5,M1H,Scarborough,Cedarbrae
6,M1J,Scarborough,Scarborough Village
7,M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park"
8,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge"
9,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West"


In [14]:
df_PostalCodeNeighbor['Neighbourhood'].replace('Not assigned', "Queen's Park", inplace=True)
df_PostalCodeNeighbor

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1A,Not assigned,Queen's Park
1,M1B,Scarborough,"Rouge, Malvern"
2,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union"
3,M1E,Scarborough,"Guildwood, Morningside, West Hill"
4,M1G,Scarborough,Woburn
5,M1H,Scarborough,Cedarbrae
6,M1J,Scarborough,Scarborough Village
7,M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park"
8,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge"
9,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West"


In [15]:
df_PostalCodeNeighbor.shape

(180, 3)

In [16]:
df_geospatial_data = pd.read_csv('http://cocl.us/Geospatial_data')
df_geospatial_data.columns = ['Postcode', 'Latitude', 'Longitude']
df_geospatial_data.head()

Unnamed: 0,Postcode,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [17]:
df_Post = pd.merge(df_PostalCodeNeighbor, df_geospatial_data, on=['Postcode'], how='inner')
df_Full_Table = df_Post[[ 'Postcode', 'Borough', 'Neighbourhood', 'Latitude', 'Longitude',]].copy()
df_Full_Table.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476


In [18]:
df_Full_Table.head(103)

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
5,M1J,Scarborough,Scarborough Village,43.744734,-79.239476
6,M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park",43.727929,-79.262029
7,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge",43.711112,-79.284577
8,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West",43.716316,-79.239476
9,M1N,Scarborough,"Birch Cliff, Cliffside West",43.692657,-79.264848


## D. Data Visualization 
#### In this part, the dataframe will be visualized on a map.

In [19]:
%pip install folium

Note: you may need to restart the kernel to use updated packages.


In [20]:
import geocoder
df = pd.read_csv('Toronto_Part-A.csv')

In [21]:
# importing new libraries
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analsysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


### Loading the Dataframe "toronto_base.csv" from Part-A

In [22]:
df_Toronto = pd.read_csv('Toronto_Part-A.csv')
df_Toronto.head(7)

Unnamed: 0.1,Unnamed: 0,Postcode,Borough,Neighbourhood
0,0,M9A,Etobicoke,Islington Avenue
1,1,M4H,East York,Thorncliffe Park
2,2,M1B,Scarborough,"Rouge, Malvern"
3,3,M9L,North York,Humber Summit
4,4,M4Y,Downtown Toronto,Church and Wellesley
5,5,M9N,York,Weston
6,6,M3J,North York,"Northwood Park, York University"


In [23]:
df_geospatial_data = pd.read_csv('http://cocl.us/Geospatial_data')
df_geospatial_data.columns = ['Postcode', 'Latitude', 'Longitude']
df_geospatial_data.head()

Unnamed: 0,Postcode,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [24]:
df_Post = pd.merge(df_Toronto, df_geospatial_data, on=['Postcode'], how='inner')
df_Toronto = df_Post[[ 'Postcode', 'Borough', 'Neighbourhood', 'Latitude', 'Longitude',]].copy()
df_Toronto.head()

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
0,M9A,Etobicoke,Islington Avenue,43.667856,-79.532242
1,M4H,East York,Thorncliffe Park,43.705369,-79.349372
2,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353
3,M9L,North York,Humber Summit,43.756303,-79.565963
4,M4Y,Downtown Toronto,Church and Wellesley,43.66586,-79.38316


### Creating a Map of Toronto

In [25]:
#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

# for the city Toronto, latitude and longtitude are manually extracted via google search
Toronto_Latitude = 43.6532; Toronto_Longitude = -79.3832
map_Toronto = folium.Map(location = [Toronto_Latitude, Toronto_Longitude], zoom_start = 10.7)

# add markers to map
for lat, lng, borough, neighborhood in zip(df_Toronto['Latitude'], df_Toronto['Longitude'],
                                           df_Toronto['Borough'], df_Toronto['Neighbourhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_Toronto)  
    
map_Toronto

In [26]:
# @hiddel_cell
CLIENT_ID = 'YOHRI5JQ3KIYWOMLHROI3S4ZUVDG0JB2SBLQWVSP0SNKUOM0' #  Foursquare ID
CLIENT_SECRET = 'ZIL0PGRZQMV45FDOVM0UZAWHTNJ0A0WCPTPDXQIERGOR4ZQD' # Foursquare Secret
VERSION = '20180605' # Foursquare API version

In [27]:
Scarborough_data = df_Toronto[df_Toronto['Borough'] == 'Scarborough'].reset_index(drop=True)
Scarborough_data.head(7)

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353
1,M1T,Scarborough,Tam O'Shanter,43.781638,-79.304302
2,M1X,Scarborough,Upper Rouge,43.836125,-79.205636
3,M1M,Scarborough,"Cliffcrest, Cliffside",43.716316,-79.239476
4,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497
5,M1R,Scarborough,"Maryvale, Wexford",43.750072,-79.295849
6,M1J,Scarborough,Scarborough Village,43.744734,-79.239476


In [28]:
address_scar = 'Scarborough,Toronto'
latitude_scar = 43.773077
longitude_scar = -79.257774
print('The geograpical coordinate of Scarborough are {}, {}.'.format(latitude_scar, longitude_scar))

The geograpical coordinate of Scarborough are 43.773077, -79.257774.


In [29]:
map_Scarb = folium.Map(location=[latitude_scar, longitude_scar], zoom_start=12)

# add markers to map
for lat, lng, label in zip(Scarborough_data['Latitude'], Scarborough_data['Longitude'], Scarborough_data['Neighbourhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_Scarb)  
    
map_Scarb

In [30]:
neighborhood_latitude = Scarborough_data.loc[0, 'Latitude'] # neighbourhood latitude value
neighborhood_longitude = Scarborough_data.loc[0, 'Longitude'] # neighbourhood longitude value

neighborhood_name = Scarborough_data.loc[0, 'Neighbourhood'] # neighbourhood name

print('Latitude and longitude values of "{}" are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of "Rouge, Malvern" are 43.806686299999996, -79.19435340000001.


In [31]:
# @hiddel_cell
CLIENT_ID = 'YOHRI5JQ3KIYWOMLHROI3S4ZUVDG0JB2SBLQWVSP0SNKUOM0' #  Foursquare ID
CLIENT_SECRET = 'ZIL0PGRZQMV45FDOVM0UZAWHTNJ0A0WCPTPDXQIERGOR4ZQD' # Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: YOHRI5JQ3KIYWOMLHROI3S4ZUVDG0JB2SBLQWVSP0SNKUOM0
CLIENT_SECRET:ZIL0PGRZQMV45FDOVM0UZAWHTNJ0A0WCPTPDXQIERGOR4ZQD


In [32]:
LIMIT = 100
radius = 500
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude_scar, longitude_scar, VERSION, radius, LIMIT)

In [33]:
results = requests.get(url).json()

In [34]:
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [35]:
import json # library to handle JSON files
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe

venues = results['response']['groups'][0]['items']  
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues =nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head(10)

Unnamed: 0,name,categories,lat,lng
0,Disney Store,Toy / Game Store,43.775537,-79.256833
1,American Eagle Outfitters,Clothing Store,43.775908,-79.258352
2,SEPHORA,Cosmetics Shop,43.775017,-79.258109
3,DAVIDsTEA,Tea Room,43.776613,-79.258516
4,St. Andrews Fish & Chips,Fish & Chips Shop,43.771865,-79.252645
5,Coliseum Scarborough Cinemas,Movie Theater,43.775995,-79.255649
6,Tommy Hilfiger Company Store,Clothing Store,43.776015,-79.257369
7,Chipotle Mexican Grill,Mexican Restaurant,43.77641,-79.258069
8,Shoppers Drug Mart,Pharmacy,43.773305,-79.251662
9,Hot Topic,Clothing Store,43.77545,-79.257929


In [36]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

44 venues were returned by Foursquare.


In [37]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [38]:
Scarborough_venues = getNearbyVenues(names=Scarborough_data['Neighbourhood'],
                                   latitudes=Scarborough_data['Latitude'],
                                   longitudes=Scarborough_data['Longitude']
                                  )

Rouge, Malvern
Tam O'Shanter
Upper Rouge
Cliffcrest, Cliffside
Highland Creek, Rouge Hill, Port Union
Maryvale, Wexford
Scarborough Village
Agincourt North, Milliken
Clairlea, Golden Mile, Oakridge
Woburn
Ionview, Kennedy Park
Birch Cliff
Morningside, West Hill
Agincourt
Dorset Park, Scarborough Town Centre, Wexford Heights


In [39]:
Scarborough_venues.head(10)

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Rouge, Malvern",43.806686,-79.194353,Wendy's,43.807448,-79.199056,Fast Food Restaurant
1,"Rouge, Malvern",43.806686,-79.194353,Interprovincial Group,43.80563,-79.200378,Print Shop
2,Tam O'Shanter,43.781638,-79.304302,Remezzo Italian Bistro,43.778649,-79.308264,Italian Restaurant
3,Tam O'Shanter,43.781638,-79.304302,The Royal Chinese Restaurant 避風塘小炒,43.780505,-79.298844,Chinese Restaurant
4,Tam O'Shanter,43.781638,-79.304302,Kub Khao,43.780438,-79.299837,Thai Restaurant
5,Tam O'Shanter,43.781638,-79.304302,Eight Noodles,43.778234,-79.308299,Noodle House
6,Tam O'Shanter,43.781638,-79.304302,TD Canada Trust,43.779169,-79.303617,Bank
7,Tam O'Shanter,43.781638,-79.304302,KFC,43.77944,-79.303371,Fast Food Restaurant
8,Tam O'Shanter,43.781638,-79.304302,Little Caesars,43.780563,-79.298624,Pizza Place
9,Tam O'Shanter,43.781638,-79.304302,Popeyes Louisiana Kitchen,43.780476,-79.29846,Fried Chicken Joint


In [40]:
Scarborough_venues.tail(10)

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
61,Agincourt,43.7942,-79.262029,Twilight,43.791999,-79.258584,Lounge
62,Agincourt,43.7942,-79.262029,Royal Chinese Seafood Restaurant,43.798496,-79.262196,Chinese Restaurant
63,Agincourt,43.7942,-79.262029,Commander Arena,43.794867,-79.267989,Skating Rink
64,"Dorset Park, Scarborough Town Centre, Wexford ...",43.75741,-79.273304,Kairali,43.754915,-79.276945,Indian Restaurant
65,"Dorset Park, Scarborough Town Centre, Wexford ...",43.75741,-79.273304,Kim Kim restaurant,43.753833,-79.276611,Chinese Restaurant
66,"Dorset Park, Scarborough Town Centre, Wexford ...",43.75741,-79.273304,El Pulgarcito,43.75479,-79.277064,Latin American Restaurant
67,"Dorset Park, Scarborough Town Centre, Wexford ...",43.75741,-79.273304,Big Al's Pet Supercentre,43.759279,-79.278325,Pet Store
68,"Dorset Park, Scarborough Town Centre, Wexford ...",43.75741,-79.273304,Karaikudi Chettinad South Indian Restaurant,43.756042,-79.276276,Indian Restaurant
69,"Dorset Park, Scarborough Town Centre, Wexford ...",43.75741,-79.273304,Pho Vietnam,43.75777,-79.278572,Vietnamese Restaurant
70,"Dorset Park, Scarborough Town Centre, Wexford ...",43.75741,-79.273304,Salvation Army Thrift Store,43.755782,-79.276208,Thrift / Vintage Store


In [41]:
Scarborough_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Agincourt,5,5,5,5,5,5
"Agincourt North, Milliken",2,2,2,2,2,2
Birch Cliff,4,4,4,4,4,4
"Clairlea, Golden Mile, Oakridge",8,8,8,8,8,8
"Cliffcrest, Cliffside",3,3,3,3,3,3
"Dorset Park, Scarborough Town Centre, Wexford Heights",7,7,7,7,7,7
"Highland Creek, Rouge Hill, Port Union",2,2,2,2,2,2
"Ionview, Kennedy Park",5,5,5,5,5,5
"Maryvale, Wexford",7,7,7,7,7,7
"Morningside, West Hill",7,7,7,7,7,7


In [42]:
print('There are {} uniques categories.'.format(len(Scarborough_venues['Venue Category'].unique())))

There are 46 uniques categories.


In [43]:
# one hot encoding
Scarb_onehot = pd.get_dummies(Scarborough_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
Scarb_onehot['Neighborhood'] = Scarborough_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [Scarb_onehot.columns[-1]] + list(Scarb_onehot.columns[:-1])
Scarb_onehot = Scarb_onehot[fixed_columns]

Scarb_onehot.head()

Unnamed: 0,Neighborhood,American Restaurant,Auto Garage,Bakery,Bank,Bar,Breakfast Spot,Bus Line,Bus Station,Café,Chinese Restaurant,Coffee Shop,College Stadium,Construction & Landscaping,Department Store,Discount Store,Electronics Store,Fast Food Restaurant,Fried Chicken Joint,General Entertainment,History Museum,Indian Restaurant,Intersection,Italian Restaurant,Jewelry Store,Korean Restaurant,Latin American Restaurant,Lounge,Medical Center,Mexican Restaurant,Middle Eastern Restaurant,Motel,Noodle House,Park,Pet Store,Pharmacy,Pizza Place,Playground,Print Shop,Rental Car Location,Sandwich Place,Shopping Mall,Skating Rink,Soccer Field,Thai Restaurant,Thrift / Vintage Store,Vietnamese Restaurant
0,"Rouge, Malvern",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,"Rouge, Malvern",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0
2,Tam O'Shanter,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Tam O'Shanter,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Tam O'Shanter,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0


In [44]:
Scarb_onehot.shape

(71, 47)

In [45]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [46]:
Scarb_grouped = Scarb_onehot.groupby('Neighborhood').mean().reset_index()
Scarb_grouped.head(10)

Unnamed: 0,Neighborhood,American Restaurant,Auto Garage,Bakery,Bank,Bar,Breakfast Spot,Bus Line,Bus Station,Café,Chinese Restaurant,Coffee Shop,College Stadium,Construction & Landscaping,Department Store,Discount Store,Electronics Store,Fast Food Restaurant,Fried Chicken Joint,General Entertainment,History Museum,Indian Restaurant,Intersection,Italian Restaurant,Jewelry Store,Korean Restaurant,Latin American Restaurant,Lounge,Medical Center,Mexican Restaurant,Middle Eastern Restaurant,Motel,Noodle House,Park,Pet Store,Pharmacy,Pizza Place,Playground,Print Shop,Rental Car Location,Sandwich Place,Shopping Mall,Skating Rink,Soccer Field,Thai Restaurant,Thrift / Vintage Store,Vietnamese Restaurant
0,Agincourt,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.2,0.0,0.0,0.0,0.0
1,"Agincourt North, Milliken",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Birch Cliff,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0
3,"Clairlea, Golden Mile, Oakridge",0.0,0.0,0.25,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0
4,"Cliffcrest, Cliffside",0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0
5,"Dorset Park, Scarborough Town Centre, Wexford ...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.285714,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.142857
6,"Highland Creek, Rouge Hill, Port Union",0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,"Ionview, Kennedy Park",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.2,0.0,0.0,0.2,0.4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,"Maryvale, Wexford",0.0,0.142857,0.142857,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.285714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.142857,0.0,0.0,0.0,0.0,0.0
9,"Morningside, West Hill",0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.142857,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [47]:
Num_Top_Venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(Num_Top_Venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = Scarb_grouped['Neighborhood']

for ind in np.arange(Scarb_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(Scarb_grouped.iloc[ind, :], Num_Top_Venues)

neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Agincourt,Chinese Restaurant,Skating Rink,Sandwich Place,Lounge,Breakfast Spot,College Stadium,General Entertainment,Fried Chicken Joint,Fast Food Restaurant,Electronics Store
1,"Agincourt North, Milliken",Park,Playground,Vietnamese Restaurant,History Museum,General Entertainment,Fried Chicken Joint,Fast Food Restaurant,Electronics Store,Discount Store,Department Store
2,Birch Cliff,College Stadium,General Entertainment,Skating Rink,Café,Vietnamese Restaurant,History Museum,Fried Chicken Joint,Fast Food Restaurant,Electronics Store,Discount Store
3,"Clairlea, Golden Mile, Oakridge",Bakery,Bus Line,Soccer Field,Fast Food Restaurant,Intersection,Park,Vietnamese Restaurant,Construction & Landscaping,History Museum,General Entertainment
4,"Cliffcrest, Cliffside",American Restaurant,Skating Rink,Motel,College Stadium,History Museum,General Entertainment,Fried Chicken Joint,Fast Food Restaurant,Electronics Store,Discount Store
5,"Dorset Park, Scarborough Town Centre, Wexford ...",Indian Restaurant,Vietnamese Restaurant,Chinese Restaurant,Latin American Restaurant,Pet Store,Thrift / Vintage Store,Department Store,History Museum,General Entertainment,Auto Garage
6,"Highland Creek, Rouge Hill, Port Union",History Museum,Bar,Vietnamese Restaurant,College Stadium,Indian Restaurant,General Entertainment,Fried Chicken Joint,Fast Food Restaurant,Electronics Store,Discount Store
7,"Ionview, Kennedy Park",Discount Store,Coffee Shop,Bus Station,Department Store,College Stadium,Indian Restaurant,History Museum,General Entertainment,Fried Chicken Joint,Fast Food Restaurant
8,"Maryvale, Wexford",Middle Eastern Restaurant,Auto Garage,Bakery,Shopping Mall,Sandwich Place,Breakfast Spot,Vietnamese Restaurant,Construction & Landscaping,General Entertainment,Fried Chicken Joint
9,"Morningside, West Hill",Intersection,Electronics Store,Breakfast Spot,Rental Car Location,Medical Center,Pizza Place,Mexican Restaurant,College Stadium,General Entertainment,Fried Chicken Joint


In [48]:
# import k-means from clustering stage
from sklearn.cluster import KMeans

Scarb_data = Scarborough_data.drop(13)
# set number of clusters
kclusters = 5

Scarb_grouped_clustering = Scarb_grouped.drop('Neighborhood', 1)


# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(Scarb_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 
#len(kmeans.labels_)#=16
#scarborough_data.shape

array([0, 1, 0, 4, 0, 0, 2, 4, 0, 0], dtype=int32)

In [49]:
Scarb_merged = Scarb_data

# add clustering labels
Scarb_merged['Cluster Labels'] = kmeans.labels_

# merge toronto_grouped with toronto_data to add latitude/longitude for each neighborhood
Scarb_merged = Scarb_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighbourhood')

Scarb_merged

Unnamed: 0,Postcode,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353,0,Fast Food Restaurant,Print Shop,Vietnamese Restaurant,Coffee Shop,History Museum,General Entertainment,Fried Chicken Joint,Electronics Store,Discount Store,Department Store
1,M1T,Scarborough,Tam O'Shanter,43.781638,-79.304302,1,Pharmacy,Pizza Place,Italian Restaurant,Bank,Noodle House,Rental Car Location,Chinese Restaurant,Shopping Mall,Fast Food Restaurant,Fried Chicken Joint
2,M1X,Scarborough,Upper Rouge,43.836125,-79.205636,0,,,,,,,,,,
3,M1M,Scarborough,"Cliffcrest, Cliffside",43.716316,-79.239476,4,American Restaurant,Skating Rink,Motel,College Stadium,History Museum,General Entertainment,Fried Chicken Joint,Fast Food Restaurant,Electronics Store,Discount Store
4,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497,0,History Museum,Bar,Vietnamese Restaurant,College Stadium,Indian Restaurant,General Entertainment,Fried Chicken Joint,Fast Food Restaurant,Electronics Store,Discount Store
5,M1R,Scarborough,"Maryvale, Wexford",43.750072,-79.295849,0,Middle Eastern Restaurant,Auto Garage,Bakery,Shopping Mall,Sandwich Place,Breakfast Spot,Vietnamese Restaurant,Construction & Landscaping,General Entertainment,Fried Chicken Joint
6,M1J,Scarborough,Scarborough Village,43.744734,-79.239476,2,Jewelry Store,Playground,Construction & Landscaping,Vietnamese Restaurant,College Stadium,History Museum,General Entertainment,Fried Chicken Joint,Fast Food Restaurant,Electronics Store
7,M1V,Scarborough,"Agincourt North, Milliken",43.815252,-79.284577,4,Park,Playground,Vietnamese Restaurant,History Museum,General Entertainment,Fried Chicken Joint,Fast Food Restaurant,Electronics Store,Discount Store,Department Store
8,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge",43.711112,-79.284577,0,Bakery,Bus Line,Soccer Field,Fast Food Restaurant,Intersection,Park,Vietnamese Restaurant,Construction & Landscaping,History Museum,General Entertainment
9,M1G,Scarborough,Woburn,43.770992,-79.216917,0,Coffee Shop,Korean Restaurant,College Stadium,History Museum,General Entertainment,Fried Chicken Joint,Fast Food Restaurant,Electronics Store,Discount Store,Department Store


In [50]:
# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# create map
map_clusters = folium.Map(location = [latitude_scar, longitude_scar], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(Scarb_merged['Latitude'], Scarb_merged['Longitude'], 
                                  Scarb_merged['Neighbourhood'], Scarb_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [51]:
Scarb_merged.loc[Scarb_merged['Cluster Labels'] == 0, 
                 Scarb_merged.columns[[1] + list(range(5, Scarb_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Scarborough,0,Fast Food Restaurant,Print Shop,Vietnamese Restaurant,Coffee Shop,History Museum,General Entertainment,Fried Chicken Joint,Electronics Store,Discount Store,Department Store
2,Scarborough,0,,,,,,,,,,
4,Scarborough,0,History Museum,Bar,Vietnamese Restaurant,College Stadium,Indian Restaurant,General Entertainment,Fried Chicken Joint,Fast Food Restaurant,Electronics Store,Discount Store
5,Scarborough,0,Middle Eastern Restaurant,Auto Garage,Bakery,Shopping Mall,Sandwich Place,Breakfast Spot,Vietnamese Restaurant,Construction & Landscaping,General Entertainment,Fried Chicken Joint
8,Scarborough,0,Bakery,Bus Line,Soccer Field,Fast Food Restaurant,Intersection,Park,Vietnamese Restaurant,Construction & Landscaping,History Museum,General Entertainment
9,Scarborough,0,Coffee Shop,Korean Restaurant,College Stadium,History Museum,General Entertainment,Fried Chicken Joint,Fast Food Restaurant,Electronics Store,Discount Store,Department Store
12,Scarborough,0,Intersection,Electronics Store,Breakfast Spot,Rental Car Location,Medical Center,Pizza Place,Mexican Restaurant,College Stadium,General Entertainment,Fried Chicken Joint


In [52]:
Scarb_merged.loc[Scarb_merged['Cluster Labels'] == 1, 
                 Scarb_merged.columns[[1] + list(range(5, Scarb_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Scarborough,1,Pharmacy,Pizza Place,Italian Restaurant,Bank,Noodle House,Rental Car Location,Chinese Restaurant,Shopping Mall,Fast Food Restaurant,Fried Chicken Joint
11,Scarborough,1,College Stadium,General Entertainment,Skating Rink,Café,Vietnamese Restaurant,History Museum,Fried Chicken Joint,Fast Food Restaurant,Electronics Store,Discount Store


In [53]:
Scarb_merged.loc[Scarb_merged['Cluster Labels'] == 2, 
                 Scarb_merged.columns[[1] + list(range(5, Scarb_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,Scarborough,2,Jewelry Store,Playground,Construction & Landscaping,Vietnamese Restaurant,College Stadium,History Museum,General Entertainment,Fried Chicken Joint,Fast Food Restaurant,Electronics Store


In [54]:
Scarb_merged.loc[Scarb_merged['Cluster Labels'] == 3, 
                 Scarb_merged.columns[[1] + list(range(5, Scarb_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
10,Scarborough,3,Discount Store,Coffee Shop,Bus Station,Department Store,College Stadium,Indian Restaurant,History Museum,General Entertainment,Fried Chicken Joint,Fast Food Restaurant


In [55]:
Scarb_merged.loc[Scarb_merged['Cluster Labels'] == 4, 
                 Scarb_merged.columns[[1] + list(range(5, Scarb_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,Scarborough,4,American Restaurant,Skating Rink,Motel,College Stadium,History Museum,General Entertainment,Fried Chicken Joint,Fast Food Restaurant,Electronics Store,Discount Store
7,Scarborough,4,Park,Playground,Vietnamese Restaurant,History Museum,General Entertainment,Fried Chicken Joint,Fast Food Restaurant,Electronics Store,Discount Store,Department Store
14,Scarborough,4,Indian Restaurant,Vietnamese Restaurant,Chinese Restaurant,Latin American Restaurant,Pet Store,Thrift / Vintage Store,Department Store,History Museum,General Entertainment,Auto Garage
