<a href="https://colab.research.google.com/github/kaisu313/Capstone-Week-1-submission/blob/master/Right_Shop_Restaurant.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

##PROBLEM/BUSINESS CASE:

A friend who has owned a very successful nutrition shop/restaurant in Richmond West since 2017 would now like to open 2 more stores in the Greater Vancouver region. From his shop, he knows his key customer segment is young, active people aged 30-45 who place a high priority on fitness and family. A new restaurant costs over $300K after all setup costs, so it is important that he picks the right locations for his new stores. He believes the new stores' success hinges primarily on being in the same type of neighborhood as his current store. It is important that the new stores succeed, as he plans to open 2 more stores within the next 2 years.

##Approach
We will use unsupervised machine learning (clustering) to find areas in the Vancouver region that are similar to Richmond West, so that he can be more confident of succeeding in the new locations.

##Before we get the data and start exploring it, let's download all the dependencies that we will need.

##DATA DESCRIPTION

Based on the above criteria, we will use the following data sources and steps to build a list of neighborhoods in the Vancouver region:

1) British Columbia location coordinates from GEONAMES: https://www.geonames.org/postal-codes/CA/BC/british-columbia.html

We will scrape this page and load the relevant table into a dataframe.
This is discrete attribute data consisting of place, postal code, name of neighborhood, province and country, together with the coordinates of each place. We will use the coordinates and location data as input to Foursquare's API calls to explore the neighborhoods and venues.

2) Foursquare, to explore the most common venues in each area

We will extract the venues in each area by calling Foursquare's API. The returned dataset contains the type of each business as well as its popularity. With this attribute data, we can rank venue types to obtain the most common venues within each neighborhood.

3) We will then cluster the areas through unsupervised learning to see in which comparable locations the new stores should be situated


In [0]:

import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

/bin/bash: conda: command not found
/bin/bash: conda: command not found
Libraries imported.


<a id='item1'></a>




## 1. Download and Explore GEONAMES Dataset

We will download the postal-code page from GEONAMES.ORG and scrape the list of places in British Columbia, Canada.

In [0]:
import pandas as pd

import requests

from bs4 import BeautifulSoup

req = requests.get("https://www.geonames.org/postal-codes/CA/BC/british-columbia.html")

soup = BeautifulSoup(req.content,'lxml')

table = soup.find_all('table')[2]

df = pd.read_html(str(table))

neighborhood=pd.DataFrame(df[0])

neighborhood.head(8)

Unnamed: 0.1,Unnamed: 0,Place,Code,Country,Admin1,Admin2,Admin3
0,1.0,Port Moody,V3H,Canada,British Columbia,,
1,,49.323/-122.863,49.323/-122.863,49.323/-122.863,49.323/-122.863,49.323/-122.863,49.323/-122.863
2,2.0,Pitt Meadows,V3Y,Canada,British Columbia,,
3,,49.221/-122.69,49.221/-122.69,49.221/-122.69,49.221/-122.69,49.221/-122.69,49.221/-122.69
4,3.0,White Rock,V4B,Canada,British Columbia,,
5,,49.026/-122.806,49.026/-122.806,49.026/-122.806,49.026/-122.806,49.026/-122.806,49.026/-122.806
6,4.0,Penticton,V2A,Canada,British Columbia,,
7,,49.481/-119.586,49.481/-119.586,49.481/-119.586,49.481/-119.586,49.481/-119.586,49.481/-119.586


## Let's further clean up the NaN values and get rid of columns we don't need.



In [0]:
neighborhood=neighborhood.rename(columns={"Unnamed: 0":"Citynum"})
neighborhood.head(8)



Unnamed: 0,Citynum,Place,Code,Country,Admin1,Admin2,Admin3
0,1.0,Port Moody,V3H,Canada,British Columbia,,
1,,49.323/-122.863,49.323/-122.863,49.323/-122.863,49.323/-122.863,49.323/-122.863,49.323/-122.863
2,2.0,Pitt Meadows,V3Y,Canada,British Columbia,,
3,,49.221/-122.69,49.221/-122.69,49.221/-122.69,49.221/-122.69,49.221/-122.69,49.221/-122.69
4,3.0,White Rock,V4B,Canada,British Columbia,,
5,,49.026/-122.806,49.026/-122.806,49.026/-122.806,49.026/-122.806,49.026/-122.806,49.026/-122.806
6,4.0,Penticton,V2A,Canada,British Columbia,,
7,,49.481/-119.586,49.481/-119.586,49.481/-119.586,49.481/-119.586,49.481/-119.586,49.481/-119.586


In [0]:
van=neighborhood[neighborhood.Country == "Canada"]
van.head()


Unnamed: 0,Citynum,Place,Code,Country,Admin1,Admin2,Admin3
0,1.0,Port Moody,V3H,Canada,British Columbia,,
2,2.0,Pitt Meadows,V3Y,Canada,British Columbia,,
4,3.0,White Rock,V4B,Canada,British Columbia,,
6,4.0,Penticton,V2A,Canada,British Columbia,,
8,5.0,Westbank,V4T,Canada,British Columbia,,


## Now, let's add the coordinates of each place so that we can feed it into Foursquare API call to explore each area.

In [0]:
cord=neighborhood[~(neighborhood.Country == "Canada")]
cord=cord.drop(384)
cord.head()


Unnamed: 0,Citynum,Place,Code,Country,Admin1,Admin2,Admin3
1,,49.323/-122.863,49.323/-122.863,49.323/-122.863,49.323/-122.863,49.323/-122.863,49.323/-122.863
3,,49.221/-122.69,49.221/-122.69,49.221/-122.69,49.221/-122.69,49.221/-122.69,49.221/-122.69
5,,49.026/-122.806,49.026/-122.806,49.026/-122.806,49.026/-122.806,49.026/-122.806,49.026/-122.806
7,,49.481/-119.586,49.481/-119.586,49.481/-119.586,49.481/-119.586,49.481/-119.586,49.481/-119.586
9,,49.866/-119.739,49.866/-119.739,49.866/-119.739,49.866/-119.739,49.866/-119.739,49.866/-119.739


In [0]:
van['Lat']=cord['Place'].values
van.head()



A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  """Entry point for launching an IPython kernel.


Unnamed: 0,Citynum,Place,Code,Country,Admin1,Admin2,Admin3,Lat
0,1.0,Port Moody,V3H,Canada,British Columbia,,,49.323/-122.863
2,2.0,Pitt Meadows,V3Y,Canada,British Columbia,,,49.221/-122.69
4,3.0,White Rock,V4B,Canada,British Columbia,,,49.026/-122.806
6,4.0,Penticton,V2A,Canada,British Columbia,,,49.481/-119.586
8,5.0,Westbank,V4T,Canada,British Columbia,,,49.866/-119.739
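The warning above appears because `van` was created by filtering `neighborhood`, so pandas cannot tell whether the new column is meant for the original frame or for the filtered copy. One common way to avoid it is to take an explicit `.copy()` of the filtered rows before adding columns. The sketch below illustrates this on a toy frame; `toy`, `places`, and `coords` are hypothetical names, and the two places are copied from the output above.

```python
import pandas as pd

# Toy frame mimicking the GEONAMES layout: place rows alternate with coordinate rows.
toy = pd.DataFrame({
    "Place": ["Port Moody", "49.323/-122.863", "Pitt Meadows", "49.221/-122.69"],
    "Country": ["Canada", "49.323/-122.863", "Canada", "49.221/-122.69"],
})

# .copy() makes the filtered rows an independent DataFrame, so later
# column assignments do not trigger SettingWithCopyWarning.
places = toy[toy["Country"] == "Canada"].copy()
coords = toy[toy["Country"] != "Canada"]

# .values drops the index so the coordinates line up by position, not by label.
places["Lat"] = coords["Place"].values
print(places)
```

Assigning through `.values` relies on the place rows and coordinate rows being in the same order and equal in number, which is the same assumption the notebook makes.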


In [0]:
van.drop(columns=["Citynum","Admin2","Admin3"], axis=1,inplace=True)

van.head()

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  errors=errors,


Unnamed: 0,Place,Code,Country,Admin1,Lat
0,Port Moody,V3H,Canada,British Columbia,49.323/-122.863
2,Pitt Meadows,V3Y,Canada,British Columbia,49.221/-122.69
4,White Rock,V4B,Canada,British Columbia,49.026/-122.806
6,Penticton,V2A,Canada,British Columbia,49.481/-119.586
8,Westbank,V4T,Canada,British Columbia,49.866/-119.739


## Let's split up latitude and longitude coordinates

In [0]:
coordinates=van["Lat"].str.split("/",expand=True)
coordinates.head()

Unnamed: 0,0,1
0,49.323,-122.863
2,49.221,-122.69
4,49.026,-122.806
6,49.481,-119.586
8,49.866,-119.739
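Note that `str.split` returns text, so the two new columns hold strings rather than numbers. If numeric coordinates are wanted later (for plotting or distance calculations, say), a conversion step can be chained on. This is a minimal sketch on two sample values from the table above; `lat_lng` and `coords` are hypothetical names.

```python
import pandas as pd

lat_lng = pd.Series(["49.323/-122.863", "49.221/-122.69"], name="Lat")

# Split on "/" into two columns, then convert the text to floats.
coords = lat_lng.str.split("/", expand=True).astype(float)
coords.columns = ["Latitude", "Longitude"]
print(coords.dtypes)
```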


In [0]:
van['Latitude']=coordinates[0]
van['Longitude']=coordinates[1]
van.head()

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  """Entry point for launching an IPython kernel.
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  


Unnamed: 0,Place,Code,Country,Admin1,Lat,Latitude,Longitude
0,Port Moody,V3H,Canada,British Columbia,49.323/-122.863,49.323,-122.863
2,Pitt Meadows,V3Y,Canada,British Columbia,49.221/-122.69,49.221,-122.69
4,White Rock,V4B,Canada,British Columbia,49.026/-122.806,49.026,-122.806
6,Penticton,V2A,Canada,British Columbia,49.481/-119.586,49.481,-119.586
8,Westbank,V4T,Canada,British Columbia,49.866/-119.739,49.866,-119.739


In [0]:
van.drop(columns=["Lat"], axis=1,inplace=True)

van.head()

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  errors=errors,


Unnamed: 0,Place,Code,Country,Admin1,Latitude,Longitude
0,Port Moody,V3H,Canada,British Columbia,49.323,-122.863
2,Pitt Meadows,V3Y,Canada,British Columbia,49.221,-122.69
4,White Rock,V4B,Canada,British Columbia,49.026,-122.806
6,Penticton,V2A,Canada,British Columbia,49.481,-119.586
8,Westbank,V4T,Canada,British Columbia,49.866,-119.739


In [0]:
van=van.rename(columns={"Admin1":"Province"})
van

Unnamed: 0,Place,Code,Country,Province,Latitude,Longitude
0,Port Moody,V3H,Canada,British Columbia,49.323,-122.863
2,Pitt Meadows,V3Y,Canada,British Columbia,49.221,-122.69
4,White Rock,V4B,Canada,British Columbia,49.026,-122.806
6,Penticton,V2A,Canada,British Columbia,49.481,-119.586
8,Westbank,V4T,Canada,British Columbia,49.866,-119.739
10,Winfield,V4V,Canada,British Columbia,50.022,-119.405
12,Kimberley,V1A,Canada,British Columbia,49.683,-115.986
14,Saltspring Island,V8K,Canada,British Columbia,48.814,-123.497
16,Northern British Columbia (Fort Nelson),V0C,Canada,British Columbia,58.387,-125.717
18,Central Okanagan and High Country (Revelstoke),V0E,Canada,British Columbia,51.505,-119.203


## Here is the completed merged list with places and coordinates!!

## Use geopy library to get the latitude and longitude values of Vancouver

In order to define an instance of the geocoder, we need to define a user_agent. We will name our agent <em>van_explorer</em>, as shown below.

In [0]:
address = 'Vancouver, BC'

geolocator = Nominatim(user_agent="van_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Vancouver are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Vancouver are 49.2608724, -123.1139529.


## Define Foursquare Credentials and Version

In [0]:
CLIENT_ID = 'ZPWRQK2VHXI0TDUY4ZPHIOAFE4FXMW4RRYXPKUURC5HO1XLL' # your Foursquare ID
CLIENT_SECRET = 'MPTMT1EURSPUOQCSWJQ5FBCONZZMI5EDSFSCEJWE3H2IXWVX' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: ZPWRQK2VHXI0TDUY4ZPHIOAFE4FXMW4RRYXPKUURC5HO1XLL
CLIENT_SECRET:MPTMT1EURSPUOQCSWJQ5FBCONZZMI5EDSFSCEJWE3H2IXWVX
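Hard-coding keys in a shared notebook exposes them to anyone who can read it. A possible alternative, sketched here under assumptions, is to read them from environment variables. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `load_foursquare_credentials` are hypothetical choices for this example, not something Foursquare prescribes.

```python
import os

def load_foursquare_credentials(env=os.environ):
    """Read the Foursquare ID and secret from a mapping of environment variables.

    The variable names are hypothetical choices made for this sketch.
    """
    client_id = env.get("FOURSQUARE_CLIENT_ID")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError("Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first.")
    return client_id, client_secret

# Demonstration with a stand-in mapping instead of the real environment.
demo_env = {"FOURSQUARE_CLIENT_ID": "my-id", "FOURSQUARE_CLIENT_SECRET": "my-secret"}
client_id, client_secret = load_foursquare_credentials(demo_env)
print("Credentials loaded:", bool(client_id and client_secret))
```

In a real run, the function would be called without arguments so that it reads `os.environ`.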


## Let's explore Richmond West first to see the venues there and its coordinates.

In [0]:
van.loc[234, 'Place']

'Richmond West'

Get the neighborhood's latitude and longitude values.

In [0]:
neighborhood_latitude = van.loc[234, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = van.loc[234, 'Longitude'] # neighborhood longitude value

neighborhood_name = van.loc[234, 'Place'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Richmond West are 49.163, -123.172.


## Now, let's get the top 100 venues within a radius of 500 meters.

First, let's create the GET request URL. Name your URL **url**.

In [0]:

LIMIT = 100 # limit of number of venues returned by Foursquare API



radius = 500 # define radius

url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)
url # display URL






'https://api.foursquare.com/v2/venues/explore?&client_id=ZPWRQK2VHXI0TDUY4ZPHIOAFE4FXMW4RRYXPKUURC5HO1XLL&client_secret=MPTMT1EURSPUOQCSWJQ5FBCONZZMI5EDSFSCEJWE3H2IXWVX&v=20180605&ll=49.163,-123.172&radius=500&limit=100'
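The same URL can also be assembled from named parameters, which makes it harder to mix up the positional arguments of `format`. The following is a sketch using the standard library's `urlencode`; the helper name `build_explore_url` is hypothetical, and the credentials are placeholders.

```python
from urllib.parse import urlencode

def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Build the venues/explore URL used in this notebook from named parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }
    # safe="," keeps the comma in "lat,lng" readable instead of encoding it as %2C.
    return "https://api.foursquare.com/v2/venues/explore?" + urlencode(params, safe=",")

# Placeholder credentials; substitute your own values.
demo_url = build_explore_url("YOUR_ID", "YOUR_SECRET", "20180605", 49.163, -123.172, 500, 100)
print(demo_url)
```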

Send the GET request and examine the results

In [0]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5e87a62ac546f3001b6935f7'},
 'response': {'groups': [{'items': [{'reasons': {'count': 0,
       'items': [{'reasonName': 'globalInteractionReason',
         'summary': 'This spot is popular',
         'type': 'general'}]},
      'referralId': 'e-0-4ccbec9497d0224b21745cb8-0',
      'venue': {'categories': [{'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/parks_outdoors/hikingtrail_',
          'suffix': '.png'},
         'id': '4bf58dd8d48988d159941735',
         'name': 'Trail',
         'pluralName': 'Trails',
         'primary': True,
         'shortName': 'Trail'}],
       'id': '4ccbec9497d0224b21745cb8',
       'location': {'cc': 'CA',
        'country': 'Canada',
        'distance': 483,
        'formattedAddress': ['Canada'],
        'labeledLatLngs': [{'label': 'display',
          'lat': 49.16611217362866,
          'lng': -123.17662861384886}],
        'lat': 49.16611217362866,
        'lng': -123.17662861384886},
       'name'

Let's borrow the **get_category_type** function from the Foursquare lab to explore categories of the venues.

In [0]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']
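To see what the function returns, it can be tried on two hand-made rows. This sketch repeats the function so that it runs on its own, and uses plain dictionaries as stand-ins for dataframe rows (both raise `KeyError` for a missing key); the venue without a category is a hypothetical example.

```python
def get_category_type(row):
    """Return the first category name, or None when the venue has no category."""
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
    if len(categories_list) == 0:
        return None
    return categories_list[0]['name']

# One row keyed like a flattened Foursquare item, and one with an empty category list.
row_with_category = {'venue.name': 'Body Energy Club',
                     'venue.categories': [{'name': 'Gym / Fitness Center'}]}
row_without_category = {'venue.name': 'Unnamed spot', 'venue.categories': []}

print(get_category_type(row_with_category))
print(get_category_type(row_without_category))
```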

Now we are ready to clean the json and structure it into a *pandas* dataframe.

In [0]:
venues = results['response']['groups'][0]['items']
venues

  

[{'reasons': {'count': 0,
   'items': [{'reasonName': 'globalInteractionReason',
     'summary': 'This spot is popular',
     'type': 'general'}]},
  'referralId': 'e-0-50183062e4b090136c898ce0-0',
  'venue': {'categories': [{'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/travel/hotel_',
      'suffix': '.png'},
     'id': '4bf58dd8d48988d1fa931735',
     'name': 'Hotel',
     'pluralName': 'Hotels',
     'primary': True,
     'shortName': 'Hotel'}],
   'id': '50183062e4b090136c898ce0',
   'location': {'address': '1234 Hornby Street',
    'cc': 'CA',
    'city': 'Vancouver',
    'country': 'Canada',
    'distance': 141,
    'formattedAddress': ['1234 Hornby Street',
     'Vancouver BC V6Z 1W2',
     'Canada'],
    'labeledLatLngs': [{'label': 'display',
      'lat': 49.2779047,
      'lng': -123.1286238}],
    'lat': 49.2779047,
    'lng': -123.1286238,
    'postalCode': 'V6Z 1W2',
    'state': 'BC'},
   'name': 'Residence Inn by Marriott Vancouver Downtown',
   'photos': {

In [0]:
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues =nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()




  


Unnamed: 0,name,categories,lat,lng
0,Residence Inn by Marriott Vancouver Downtown,Hotel,49.277905,-123.128624
1,Musette Caffè,Café,49.277813,-123.131349
2,Body Energy Club,Gym / Fitness Center,49.277682,-123.126894
3,Number e food,Sandwich Place,49.277899,-123.13106
4,Breka Bakery & Cafe,Bakery,49.278496,-123.128062
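The column names such as `venue.location.lat` come from the flattening step: nested dictionary keys are joined with dots. The sketch below shows this on two trimmed-down items shaped like the Foursquare response. It assumes pandas 1.0 or later, where the function is available as `pd.json_normalize` (the `pandas.io.json` import path used earlier is deprecated in those versions); `items` and `flat` are hypothetical names.

```python
import pandas as pd

# Trimmed-down items shaped like results['response']['groups'][0]['items'].
items = [
    {"venue": {"name": "Musette Caffè",
               "categories": [{"name": "Café"}],
               "location": {"lat": 49.277813, "lng": -123.131349}}},
    {"venue": {"name": "Breka Bakery & Cafe",
               "categories": [{"name": "Bakery"}],
               "location": {"lat": 49.278496, "lng": -123.128062}}},
]

# Nested dict keys become dot-separated column names; lists are kept as-is.
flat = pd.json_normalize(items)
print(sorted(flat.columns))
```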


And how many venues were returned by Foursquare?

In [0]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

61 venues were returned by Foursquare.


<a id='item2'></a>

## 2. Explore Neighborhoods in Richmond West 

## Let's create a function to repeat the same process to all the neighborhoods in Vancouver

In [0]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
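One thing to watch in the function above: `v['venue']['categories'][0]` raises an `IndexError` if a venue comes back with an empty category list. A more defensive version of the row-building step might look like the sketch below; `venue_rows` and `sample_items` are hypothetical names, and the sample data is made up for illustration.

```python
def venue_rows(name, lat, lng, items):
    """Turn Foursquare 'items' into flat tuples, tolerating missing categories."""
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # Fall back to None instead of raising IndexError on an empty category list.
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng,
                     venue['name'],
                     venue['location']['lat'],
                     venue['location']['lng'],
                     category))
    return rows

# Made-up items: the second venue has no category.
sample_items = [
    {'venue': {'name': 'Sample Trail', 'location': {'lat': 49.166, 'lng': -123.176},
               'categories': [{'name': 'Trail'}]}},
    {'venue': {'name': 'Sample Spot', 'location': {'lat': 49.164, 'lng': -123.170},
               'categories': []}},
]
rows = venue_rows('Richmond West', '49.163', '-123.172', sample_items)
print(rows)
```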

## Now let's run the above function on each neighborhood and create a new dataframe called *van_venues*.

In [0]:
# collect nearby venues for every place in the van dataframe

van_venues = getNearbyVenues(names=van['Place'],
                                   latitudes=van['Latitude'],
                                   longitudes=van['Longitude']
                                  )



Port Moody
Pitt Meadows
White Rock
Penticton
Westbank
Winfield
Kimberley
Saltspring Island
Northern British Columbia (Fort Nelson)
Central Okanagan and High Country (Revelstoke)
West Kootenays (Rossland)
South Okanagan (Summerland)
Omineca and Yellowhead (Smithers)
Cariboo and West Okanagan (100 Mile House)
Chilcotin (Alexis Creek)
Harrison Lake Region (Agassiz)
North Island, Sunshine Coast, and Southern Gulf Islands (Whistler)
North Central Island and Bute Inlet Region (Gold River)
Central Island (Chemainus)
Juan de Fuca Shore (Sooke)
Inside Passage and the Queen Charlottes (Queen Charlotte City)
Lower Skeena (Port Edward)
Atlin Region (Atlin)
Similkameen (Hope)
Vernon West
Fort St. John
Nelson
Langley Township North
Kelowna East
Kamloops Southwest
Vernon Central
Kelowna North
Kelowna Southwest
Kelowna East Central
Kelowna Central
Kelowna West
Kamloops Central and Southeast
Kamloops South and West
Kamloops North
Quesnel
Prince George North
Prince George East Central
Prince George West

## Let's check the size of the resulting dataframe

In [0]:
print(van_venues.shape)


(1830, 7)


##Let's check how many venues were returned for each neighborhood

In [0]:
van_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Abbotsford Southeast,5,5,5,5,5,5
Abbotsford Southwest,1,1,1,1,1,1
Burnaby (Burnaby Heights / Willingdon Heights / West Central Valley),5,5,5,5,5,5
Burnaby (Cascade-Schou / Douglas-Gilpin),4,4,4,4,4,4
Burnaby (East Big Bend / Stride Avenue / Edmonds / Cariboo-Armstrong),6,6,6,6,6,6
Burnaby (Government Road / Lake City / SFU / Burnaby Mountain),3,3,3,3,3,3
Burnaby (Lakeview-Mayfield / Richmond Park / Kingsway-Beresford),2,2,2,2,2,2
Burnaby (Maywood / Marlborough / Oakalla / Windsor),37,37,37,37,37,37
Burnaby (Parkcrest-Aubrey / Ardingley-Sprott),10,10,10,10,10,10
Burnaby (Suncrest / Sussex-Nelson / Clinton-Glenwood / West Big Bend),4,4,4,4,4,4


## Let's find out how many unique categories can be curated from all the returned venues

In [0]:
print('There are {} unique categories.'.format(len(van_venues['Venue Category'].unique())))

There are 262 unique categories.


## 3. Analyze Each Neighborhood
Now we use one-hot encoding to set up a dataframe of venue types for each neighborhood

In [0]:
# one hot encoding
van_onehot = pd.get_dummies(van_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
van_onehot['Neighborhood'] = van_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [van_onehot.columns[-1]] + list(van_onehot.columns[:-1])
van_onehot = van_onehot[fixed_columns]

van_onehot.head()

Unnamed: 0,Neighborhood,Accessories Store,Airport,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Amphitheater,Antique Shop,Apres Ski Bar,Art Gallery,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,Athletics & Sports,Australian Restaurant,Auto Garage,Automotive Shop,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Beach,Beer Bar,Beer Garden,Belgian Restaurant,Big Box Store,Bistro,Board Shop,Boat or Ferry,Bookstore,Boutique,Bowling Alley,Breakfast Spot,Brewery,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Cafeteria,Café,Campground,Candy Store,Caribbean Restaurant,Carpet Store,Casino,Cheese Shop,Child Care Service,Chinese Restaurant,Chocolate Shop,Church,Circus,City,Clothing Store,Cocktail Bar,Coffee Shop,Comfort Food Restaurant,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Disc Golf,Discount Store,Distribution Center,Dog Run,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Elementary School,Ethiopian Restaurant,Event Space,Factory,Fair,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Field,Filipino Restaurant,Financial or Legal Service,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food Court,Food Service,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,German Restaurant,Gift Shop,Golf Course,Golf Driving Range,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Hardware Store,Hawaiian Restaurant,Health Food Store,Himalayan Restaurant,Historic Site,History Museum,Hockey Arena,Home Service,Hookah Bar,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Hotpot Restaurant,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indonesian Restaurant,Inn,Irish Pub,Italian Restaurant,Japanese Restaurant,Jewelry 
Store,Juice Bar,Kids Store,Korean Restaurant,Lake,Lawyer,Leather Goods Store,Lebanese Restaurant,Library,Light Rail Station,Lingerie Store,Liquor Store,Locksmith,Lounge,Malay Restaurant,Market,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Mongolian Restaurant,Moroccan Restaurant,Motel,Motorcycle Shop,Mountain,Movie Theater,Moving Target,Museum,Music Store,Music Venue,New American Restaurant,Nightclub,Noodle House,North Indian Restaurant,Office,Optical Shop,Outdoor Sculpture,Paper / Office Supplies Store,Park,Performing Arts Venue,Pet Store,Pharmacy,Physical Therapist,Pie Shop,Pizza Place,Plane,Playground,Plaza,Poke Place,Pool,Portuguese Restaurant,Print Shop,Pub,Public Art,Ramen Restaurant,Record Shop,Rental Car Location,Resort,Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Shopping Plaza,Skate Park,Skating Rink,Ski Area,Ski Chairlift,Ski Chalet,Ski Lodge,Ski Trail,Soccer Field,Soccer Stadium,South Indian Restaurant,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Stadium,Steakhouse,Storage Facility,Supermarket,Sushi Restaurant,Taco Place,Taiwanese Restaurant,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Thrift / Vintage Store,Tourist Information Center,Toy / Game Store,Trade School,Trail,Travel Lounge,Tree,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Waterfront,Wine Bar,Wine Shop,Women's Store,Yoga Studio
0,Port Moody,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Pitt Meadows,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Pitt Meadows,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Pitt Meadows,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Pitt Meadows,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


And let's examine the new dataframe size.

In [0]:
van_onehot.shape

(1830, 263)
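The shape follows directly from the encoding: one row per venue (1830) and one column per category (262) plus the `Neighborhood` column. A toy version makes the layout easier to see. This is a sketch with made-up data; `dtype=int` is passed because recent pandas versions return True/False columns by default instead of 0/1.

```python
import pandas as pd

# Made-up venues: two in neighborhood A, one in neighborhood B.
toy_venues = pd.DataFrame({
    "Neighborhood": ["A", "A", "B"],
    "Venue Category": ["Café", "Gym", "Café"],
})

# One column per category, with a 1 marking the category of each venue.
toy_onehot = pd.get_dummies(toy_venues[["Venue Category"]],
                            prefix="", prefix_sep="", dtype=int)

# Put the neighborhood name in the first column.
toy_onehot.insert(0, "Neighborhood", toy_venues["Neighborhood"])
print(toy_onehot)
```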

## Next, let's group rows by neighborhood, taking the mean of the frequency of occurrence of each category

In [0]:
van_grouped = van_onehot.groupby('Neighborhood').mean().reset_index()
van_grouped

Unnamed: 0,Neighborhood,Accessories Store,Airport,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Amphitheater,Antique Shop,Apres Ski Bar,Art Gallery,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,Athletics & Sports,Australian Restaurant,Auto Garage,Automotive Shop,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Beach,Beer Bar,Beer Garden,Belgian Restaurant,Big Box Store,Bistro,Board Shop,Boat or Ferry,Bookstore,Boutique,Bowling Alley,Breakfast Spot,Brewery,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Cafeteria,Café,Campground,Candy Store,Caribbean Restaurant,Carpet Store,Casino,Cheese Shop,Child Care Service,Chinese Restaurant,Chocolate Shop,Church,Circus,City,Clothing Store,Cocktail Bar,Coffee Shop,Comfort Food Restaurant,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Disc Golf,Discount Store,Distribution Center,Dog Run,Donut Shop,Dry Cleaner,Dumpling Restaurant,Duty-free Shop,Electronics Store,Elementary School,Ethiopian Restaurant,Event Space,Factory,Fair,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Field,Filipino Restaurant,Financial or Legal Service,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food Court,Food Service,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,German Restaurant,Gift Shop,Golf Course,Golf Driving Range,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Hardware Store,Hawaiian Restaurant,Health Food Store,Himalayan Restaurant,Historic Site,History Museum,Hockey Arena,Home Service,Hookah Bar,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Hotpot Restaurant,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indonesian Restaurant,Inn,Irish Pub,Italian Restaurant,Japanese Restaurant,Jewelry 
Store,Juice Bar,Kids Store,Korean Restaurant,Lake,Lawyer,Leather Goods Store,Lebanese Restaurant,Library,Light Rail Station,Lingerie Store,Liquor Store,Locksmith,Lounge,Malay Restaurant,Market,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Mongolian Restaurant,Moroccan Restaurant,Motel,Motorcycle Shop,Mountain,Movie Theater,Moving Target,Museum,Music Store,Music Venue,New American Restaurant,Nightclub,Noodle House,North Indian Restaurant,Office,Optical Shop,Outdoor Sculpture,Paper / Office Supplies Store,Park,Performing Arts Venue,Pet Store,Pharmacy,Physical Therapist,Pie Shop,Pizza Place,Plane,Playground,Plaza,Poke Place,Pool,Portuguese Restaurant,Print Shop,Pub,Public Art,Ramen Restaurant,Record Shop,Rental Car Location,Resort,Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Shopping Plaza,Skate Park,Skating Rink,Ski Area,Ski Chairlift,Ski Chalet,Ski Lodge,Ski Trail,Soccer Field,Soccer Stadium,South Indian Restaurant,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Stadium,Steakhouse,Storage Facility,Supermarket,Sushi Restaurant,Taco Place,Taiwanese Restaurant,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Thrift / Vintage Store,Tourist Information Center,Toy / Game Store,Trade School,Trail,Travel Lounge,Tree,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Waterfront,Wine Bar,Wine Shop,Women's Store,Yoga Studio
0,Abbotsford Southeast,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Abbotsford Southwest,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Burnaby (Burnaby Heights / Willingdon Heights ...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Burnaby (Cascade-Schou / Douglas-Gilpin),0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Burnaby (East Big Bend / Stride Avenue / Edmon...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Burnaby (Government Road / Lake City / SFU / B...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Burnaby (Lakeview-Mayfield / Richmond Park / K...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Burnaby (Maywood / Marlborough / Oakalla / Win...,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.054054,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.054054,0.0,0.0,0.0,0.0,0.0,0.0,0.162162,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.054054,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.081081,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.054054,0.027027,0.0,0.0,0.0,0.0,0.054054,0.0,0.0,0.0,0.0,0.0,0.054054,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Burnaby (Parkcrest-Aubrey / Ardingley-Sprott),0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0
9,Burnaby (Suncrest / Sussex-Nelson / Clinton-Gl...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


## Let's confirm the new size

In [0]:
van_grouped.shape

(130, 263)

## Let's print each neighborhood along with the top 5 most common venues

In [0]:
num_top_venues = 5

for hood in van_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = van_grouped[van_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Abbotsford Southeast----
         venue  freq
0      Brewery   0.2
1     Bus Line   0.2
2    Pet Store   0.2
3  Coffee Shop   0.2
4   Restaurant   0.2


----Abbotsford Southwest----
                        venue  freq
0  Construction & Landscaping   1.0
1           Mobile Phone Shop   0.0
2        Mongolian Restaurant   0.0
3         Moroccan Restaurant   0.0
4                       Motel   0.0


----Burnaby (Burnaby Heights / Willingdon Heights / West Central Valley)----
           venue  freq
0  Garden Center   0.2
1           Café   0.2
2   Tennis Court   0.2
3           Park   0.2
4       Bus Stop   0.2


----Burnaby (Cascade-Schou / Douglas-Gilpin)----
               venue  freq
0               Park  0.50
1        Auto Garage  0.25
2          Speakeasy  0.25
3  Accessories Store  0.00
4      Moving Target  0.00


----Burnaby (East Big Bend / Stride Avenue / Edmonds / Cariboo-Armstrong)----
               venue  freq
0  Indian Restaurant  0.33
1               Pool  0.17
2      

## Let's put that into a *pandas* dataframe

First, let's write a function to sort the venues in descending order.

In [0]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
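One caveat with sorting the raw frequencies: most categories are tied at 0.0, so a neighborhood with only one or two recorded venue types has its remaining "top" slots filled by zero-frequency categories (the Abbotsford Southwest printout above shows this). The following is a minimal sketch of a variant that leaves those slots empty instead. The name `top_nonzero_venues`, its `filler` parameter, and the toy row are hypothetical and not part of the original notebook.

```python
import pandas as pd


def top_nonzero_venues(row, num_top_venues, filler=None):
    """Return the most frequent categories, padding with `filler`
    rather than listing categories whose frequency is zero."""
    # Skip the first entry (the neighborhood name) and force a numeric dtype.
    freqs = row.iloc[1:].astype(float)
    # Keep only categories that actually occur, highest frequency first.
    nonzero = freqs[freqs > 0].sort_values(ascending=False)
    top = list(nonzero.index[:num_top_venues])
    # Pad so every neighborhood still yields the same number of slots.
    return top + [filler] * (num_top_venues - len(top))


# Toy row shaped like one row of van_grouped (values are illustrative only).
toy_row = pd.Series(
    {"Neighborhood": "Example", "Bakery": 0.0, "Park": 0.5,
     "Trail": 0.25, "Gym": 0.25, "Zoo": 0.0}
)
result = top_nonzero_venues(toy_row, 4)
print(result)
```

With this variant, a slot holding the filler value signals that the neighborhood simply has no further venue types on record, which is easier to interpret than an arbitrary zero-frequency category.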

Now let's create the new dataframe and display the top 10 venues for each neighborhood.

In [0]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # no 'st'/'nd'/'rd' suffix past the third position, so fall back to 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = van_grouped['Neighborhood']

for ind in np.arange(van_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(van_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted



Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Abbotsford Southeast,Bus Line,Brewery,Coffee Shop,Pet Store,Restaurant,Falafel Restaurant,Ethiopian Restaurant,Event Space,Factory,Fair
1,Abbotsford Southwest,Construction & Landscaping,Electronics Store,Fish Market,Fish & Chips Shop,Financial or Legal Service,Filipino Restaurant,Field,Fast Food Restaurant,Farmers Market,Farm
2,Burnaby (Burnaby Heights / Willingdon Heights ...,Park,Bus Stop,Garden Center,Tennis Court,Café,Farmers Market,Farm,Falafel Restaurant,Elementary School,Fair
3,Burnaby (Cascade-Schou / Douglas-Gilpin),Park,Auto Garage,Speakeasy,Financial or Legal Service,Filipino Restaurant,Field,Fast Food Restaurant,Farmers Market,Farm,Duty-free Shop
4,Burnaby (East Big Bend / Stride Avenue / Edmon...,Indian Restaurant,Pool,Dog Run,Motel,Art Gallery,Yoga Studio,Fair,Elementary School,Ethiopian Restaurant,Event Space
5,Burnaby (Government Road / Lake City / SFU / B...,Golf Course,Golf Driving Range,Burger Joint,Yoga Studio,Fair,Elementary School,Ethiopian Restaurant,Event Space,Factory,Falafel Restaurant
6,Burnaby (Lakeview-Mayfield / Richmond Park / K...,Bus Stop,Park,Yoga Studio,Fair,Elementary School,Ethiopian Restaurant,Event Space,Factory,Falafel Restaurant,Duty-free Shop
7,Burnaby (Maywood / Marlborough / Oakalla / Win...,Coffee Shop,Pharmacy,Fast Food Restaurant,Chinese Restaurant,Bookstore,Sushi Restaurant,Thai Restaurant,Toy / Game Store,Shopping Mall,Salad Place
8,Burnaby (Parkcrest-Aubrey / Ardingley-Sprott),Convenience Store,Park,Sushi Restaurant,Italian Restaurant,Golf Course,Bus Stop,Stadium,Vietnamese Restaurant,Coffee Shop,Tea Room
9,Burnaby (Suncrest / Sussex-Nelson / Clinton-Gl...,Restaurant,Golf Course,Garden Center,Farmers Market,Yoga Studio,Electronics Store,Elementary School,Ethiopian Restaurant,Event Space,Factory


## Now, let's check out Richmond West's most common venues to learn what types of venues make it such a good location for business!

In [0]:
# row 68 is Richmond West in this run (see the Name field in the output)
RichmondWest = neighborhoods_venues_sorted.iloc[68]
RichmondWest

Cluster Labels                               5
Neighborhood                     Richmond West
1st Most Common Venue                    Trail
2nd Most Common Venue           Shopping Plaza
3rd Most Common Venue              Yoga Studio
4th Most Common Venue                     Fair
5th Most Common Venue        Elementary School
6th Most Common Venue     Ethiopian Restaurant
7th Most Common Venue              Event Space
8th Most Common Venue                  Factory
9th Most Common Venue       Falafel Restaurant
10th Most Common Venue          Duty-free Shop
Name: 68, dtype: object
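Selecting row 68 by position works only while the row order stays the same. A minimal sketch of looking the neighborhood up by name instead is shown here, using a small stand-in for `neighborhoods_venues_sorted`; the toy frame is hypothetical, with values copied from the outputs above.

```python
import pandas as pd

# Stand-in for neighborhoods_venues_sorted with just two columns.
toy_sorted = pd.DataFrame(
    {
        "Neighborhood": ["Abbotsford Southeast", "Richmond West"],
        "1st Most Common Venue": ["Bus Line", "Trail"],
    }
)

# Boolean mask on the name; this does not depend on row position.
mask = toy_sorted["Neighborhood"] == "Richmond West"
# squeeze() turns the single matching row into a Series, like .iloc[68] does.
richmond_west = toy_sorted.loc[mask].squeeze()
print(richmond_west)
```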

## As expected, this area seems to offer outdoor trails, convenient shopping, yoga classes, and an elementary school for young parents. This aligns with my friend's observation of who his target customers are (i.e., active 30-45 year olds who are health-conscious and family-oriented!).


## Let's now find places similar to Richmond West using an unsupervised machine learning technique, k-means clustering! As Vancouver is a multicultural city with all sorts of amenities, we will use K=20 clusters so that the groupings are fine-grained enough to be selective about the restaurant location.



Run *k*-means to cluster the neighborhoods into 20 clusters.

In [0]:
# set number of clusters
kclusters = 20

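# keep only the numeric frequency columns (the positional axis argument to drop() is no longer accepted by recent pandas)
van_grouped_clustering = van_grouped.drop(columns='Neighborhood')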

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(van_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:20] 

array([ 2,  4,  9, 14,  9,  1, 14,  2,  2,  9,  2,  2, 14,  9,  4,  7, 19,
        5,  2,  9], dtype=int32)
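The value K=20 (set via `kclusters`) is a judgment call. One common sanity check, which the original notebook does not perform, is to compare silhouette scores across several candidate values of K. The sketch below runs on random stand-in data (`toy_features` is hypothetical); in the notebook one would pass `van_grouped_clustering` instead and widen the range of K.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Random stand-in for the neighborhood-by-category frequency matrix.
rng = np.random.default_rng(0)
toy_features = rng.random((60, 8))

scores = {}
for k in range(2, 11):
    # n_init is set explicitly because its default differs across scikit-learn versions.
    labels = KMeans(n_clusters=k, random_state=0, n_init=10).fit_predict(toy_features)
    # Silhouette ranges from -1 to 1; higher means tighter, better-separated clusters.
    scores[k] = silhouette_score(toy_features, labels)

# Candidate K with the highest silhouette score on this data.
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```

A high score does not by itself make a K the right business choice; it only indicates how cleanly the data separates at that granularity.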

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [0]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)



In [0]:
van.head()

Unnamed: 0,Place,Code,Country,Province,Latitude,Longitude
0,Port Moody,V3H,Canada,British Columbia,49.323,-122.863
2,Pitt Meadows,V3Y,Canada,British Columbia,49.221,-122.69
4,White Rock,V4B,Canada,British Columbia,49.026,-122.806
6,Penticton,V2A,Canada,British Columbia,49.481,-119.586
8,Westbank,V4T,Canada,British Columbia,49.866,-119.739


In [0]:


van_merged = van

# join the cluster labels and top venues onto van (which holds latitude/longitude) by place name
van_merged = van_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Place')

van_merged # check the last columns!

Unnamed: 0,Place,Code,Country,Province,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Port Moody,V3H,Canada,British Columbia,49.323,-122.863,12.0,Lake,Yoga Studio,Fish & Chips Shop,Financial or Legal Service,Filipino Restaurant,Field,Fast Food Restaurant,Farmers Market,Farm,Falafel Restaurant
2,Pitt Meadows,V3Y,Canada,British Columbia,49.221,-122.69,2.0,Gym / Fitness Center,Sandwich Place,Convenience Store,Bistro,Elementary School,Grocery Store,Park,Brewery,Coffee Shop,Plaza
4,White Rock,V4B,Canada,British Columbia,49.026,-122.806,9.0,Japanese Restaurant,Café,Dessert Shop,Thai Restaurant,Ice Cream Shop,Steakhouse,Museum,Bistro,Seafood Restaurant,Gas Station
6,Penticton,V2A,Canada,British Columbia,49.481,-119.586,2.0,Fast Food Restaurant,Restaurant,Coffee Shop,Liquor Store,Pizza Place,Grocery Store,Sandwich Place,Supermarket,Diner,Pub
8,Westbank,V4T,Canada,British Columbia,49.866,-119.739,,,,,,,,,,,
10,Winfield,V4V,Canada,British Columbia,50.022,-119.405,9.0,Fast Food Restaurant,Gas Station,Convenience Store,Pizza Place,Pub,Farm,Sandwich Place,Grocery Store,Japanese Restaurant,Ethiopian Restaurant
12,Kimberley,V1A,Canada,British Columbia,49.683,-115.986,9.0,German Restaurant,Plaza,Fast Food Restaurant,American Restaurant,Pharmacy,Café,Factory,Electronics Store,Elementary School,Ethiopian Restaurant
14,Saltspring Island,V8K,Canada,British Columbia,48.814,-123.497,,,,,,,,,,,
16,Northern British Columbia (Fort Nelson),V0C,Canada,British Columbia,58.387,-125.717,,,,,,,,,,,
18,Central Okanagan and High Country (Revelstoke),V0E,Canada,British Columbia,51.505,-119.203,,,,,,,,,,,
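The rows with empty cluster columns above (Westbank, Saltspring Island, and others) are places with no matching row in `neighborhoods_venues_sorted`, and their missing values turn `Cluster Labels` into floats such as `12.0`. A minimal sketch follows, assuming one wants to drop those rows and restore integer labels before mapping; the toy frame is a hypothetical stand-in for `van_merged`, with values copied from the table above.

```python
import numpy as np
import pandas as pd

# Stand-in for van_merged with only the columns needed here.
toy_merged = pd.DataFrame(
    {
        "Place": ["Port Moody", "Pitt Meadows", "Westbank"],
        "Cluster Labels": [12.0, 2.0, np.nan],
    }
)

# Drop places that have no cluster assignment, then cast labels back to int.
toy_clean = toy_merged.dropna(subset=["Cluster Labels"]).copy()
toy_clean["Cluster Labels"] = toy_clean["Cluster Labels"].astype(int)
print(toy_clean)
```

Integer labels can then be used directly as list indices, for example to pick a color per cluster.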


Finally, let's visualize the resulting clusters

In [0]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(van_merged['Latitude'], van_merged['Longitude'], van_merged['Place'], van_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        fill=True,
        ).add_to(map_clusters)  # attach the marker; without this it is never drawn on the map

RecursionError: ignored

<a id='item5'></a>

## 5. Examine Clusters

Now we can examine each cluster and determine the discriminating venue categories that distinguish it. Based on the defining categories, each cluster can then be given a name.

## Cluster 1

In [0]:
van_merged.loc[van_merged['Cluster Labels'] == 0, van_merged.columns[[0] + list(range(5, van_merged.shape[1]))]]

Unnamed: 0,Place,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
82,Prince George East Central,-122.747,0.0,Hockey Arena,Convenience Store,Park,Yoga Studio,Fair,Electronics Store,Elementary School,Ethiopian Restaurant,Event Space,Factory
106,Port Coquitlam South,-122.787,0.0,Construction & Landscaping,Shop & Service,Park,Farm,Factory,Electronics Store,Elementary School,Ethiopian Restaurant,Event Space,Falafel Restaurant
116,New Westminster Northeast,-122.9,0.0,Bar,Park,Sports Club,Gym,Yoga Studio,Falafel Restaurant,Ethiopian Restaurant,Event Space,Factory,Fair
162,Burnaby (Burnaby Heights / Willingdon Heights ...,-123.007,0.0,Spa,Park,Tennis Court,Garden Center,Café,Fast Food Restaurant,Farmers Market,Farm,Falafel Restaurant,Duty-free Shop
164,Burnaby (Lakeview-Mayfield / Richmond Park / K...,-122.957,0.0,Bus Stop,Park,Mobile Phone Shop,Yoga Studio,Elementary School,Ethiopian Restaurant,Event Space,Factory,Fair,Farm
166,Burnaby (Cascade-Schou / Douglas-Gilpin),-122.994,0.0,Speakeasy,Auto Garage,Park,Falafel Restaurant,Elementary School,Ethiopian Restaurant,Event Space,Factory,Fair,Farm
182,Vancouver (South Renfrew-Collingwood),-123.041,0.0,Fish & Chips Shop,Park,Bar,Asian Restaurant,Bus Stop,Hotel,Filipino Restaurant,Field,Fast Food Restaurant,Farmers Market
192,Vancouver (SE Oakridge / East Marpole / South ...,-123.098,0.0,Sporting Goods Shop,Park,Restaurant,Indian Restaurant,Yoga Studio,Fair,Electronics Store,Elementary School,Ethiopian Restaurant,Event Space
202,Vancouver (North West End / Stanley Park),-123.141,0.0,Garden,Park,Outdoor Sculpture,Trail,Yoga Studio,Factory,Electronics Store,Elementary School,Ethiopian Restaurant,Event Space
218,Vancouver (West Kitsilano / Jericho),-123.198,0.0,Pool,Park,Yoga Studio,Fair,Electronics Store,Elementary School,Ethiopian Restaurant,Event Space,Factory,Farm


## Cluster 2

In [0]:
van_merged.loc[van_merged['Cluster Labels'] == 1, van_merged.columns[[0] + list(range(5, van_merged.shape[1]))]]

Unnamed: 0,Place,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Port Moody,-122.863,1.0,Lake,Yoga Studio,Duty-free Shop,Financial or Legal Service,Filipino Restaurant,Field,Fast Food Restaurant,Farmers Market,Farm,Falafel Restaurant
2,Pitt Meadows,-122.69,1.0,Gym / Fitness Center,Coffee Shop,Pub,Elementary School,Brewery,Bistro,Grocery Store,Gym,Vietnamese Restaurant,Plaza
4,White Rock,-122.806,1.0,Japanese Restaurant,Seafood Restaurant,Café,Dessert Shop,Pizza Place,Greek Restaurant,Breakfast Spot,Bistro,Steakhouse,Museum
6,Penticton,-119.586,1.0,Fast Food Restaurant,Coffee Shop,Pizza Place,Liquor Store,Restaurant,Sandwich Place,Pub,Supermarket,Pharmacy,Diner
10,Winfield,-119.405,1.0,Fast Food Restaurant,Gas Station,Pizza Place,Pub,Farm,Grocery Store,Sandwich Place,Convenience Store,Cruise,Dumpling Restaurant
12,Kimberley,-115.986,1.0,Pharmacy,Plaza,American Restaurant,Fast Food Restaurant,German Restaurant,Factory,Duty-free Shop,Electronics Store,Elementary School,Ethiopian Restaurant
50,Fort St. John,-120.853,1.0,Bank,Coffee Shop,Gas Station,Pharmacy,Ice Cream Shop,Clothing Store,Restaurant,Grocery Store,Liquor Store,Event Space
52,Nelson,-117.286,1.0,Big Box Store,Grocery Store,Shopping Mall,Supermarket,Liquor Store,Gas Station,Pharmacy,Coffee Shop,Fast Food Restaurant,Convenience Store
54,Langley Township North,-122.579,1.0,Restaurant,Bakery,Dessert Shop,Market,Bookstore,Candy Store,Gastropub,Gift Shop,Diner,Burrito Place
60,Vernon Central,-119.277,1.0,Coffee Shop,Greek Restaurant,City,Pub,Sandwich Place,Fast Food Restaurant,Market,Bookstore,Supermarket,Thai Restaurant


## Cluster 3

In [0]:
van_merged.loc[van_merged['Cluster Labels'] == 2, van_merged.columns[[0] + list(range(5, van_merged.shape[1]))]]

Unnamed: 0,Place,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Pitt Meadows,-122.69,2.0,Gym / Fitness Center,Sandwich Place,Convenience Store,Bistro,Elementary School,Grocery Store,Park,Brewery,Coffee Shop,Plaza
6,Penticton,-119.586,2.0,Fast Food Restaurant,Restaurant,Coffee Shop,Liquor Store,Pizza Place,Grocery Store,Sandwich Place,Supermarket,Diner,Pub
50,Fort St. John,-120.853,2.0,Bank,Coffee Shop,Restaurant,Gas Station,Pharmacy,Ice Cream Shop,Clothing Store,Liquor Store,Grocery Store,Electronics Store
52,Nelson,-117.286,2.0,Big Box Store,Fast Food Restaurant,Shopping Mall,Liquor Store,Supermarket,Grocery Store,Gas Station,Pharmacy,Coffee Shop,Convenience Store
60,Vernon Central,-119.277,2.0,Sandwich Place,Market,Coffee Shop,Fast Food Restaurant,American Restaurant,Park,City,Pub,Thai Restaurant,Supermarket
78,Quesnel,-122.493,2.0,Coffee Shop,Steakhouse,Supermarket,Bakery,Pharmacy,Juice Bar,Ice Cream Shop,Gas Station,Inn,Elementary School
88,Chilliwack Central,-121.905,2.0,Ice Cream Shop,Flower Shop,Construction & Landscaping,Golf Course,Falafel Restaurant,Elementary School,Ethiopian Restaurant,Event Space,Factory,Fair
92,Abbotsford Southeast,-122.283,2.0,Bus Line,Brewery,Coffee Shop,Pet Store,Restaurant,Falafel Restaurant,Ethiopian Restaurant,Event Space,Factory,Fair
126,Surrey Inner Northwest,-122.845,2.0,Coffee Shop,Fast Food Restaurant,Bus Stop,Vietnamese Restaurant,Grocery Store,Pizza Place,Sandwich Place,Gym,Bank,Bookstore
130,Surrey Upper West,-122.857,2.0,Bus Station,Coffee Shop,Indian Restaurant,Gym,Yoga Studio,Ethiopian Restaurant,Event Space,Factory,Fair,Falafel Restaurant


## Cluster 4

In [0]:
van_merged.loc[van_merged['Cluster Labels'] == 3, van_merged.columns[[0] + list(range(5, van_merged.shape[1]))]]

Unnamed: 0,Place,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
112,Coquitlam North,-122.872,3.0,Park,Yoga Studio,Falafel Restaurant,Electronics Store,Elementary School,Ethiopian Restaurant,Event Space,Factory,Fair,Farm
220,Vancouver (Chaldecutt / South University Endow...,-123.209,3.0,Park,Yoga Studio,Falafel Restaurant,Electronics Store,Elementary School,Ethiopian Restaurant,Event Space,Factory,Fair,Farm
248,North Vancouver Southwest Central,-123.083,3.0,Park,Yoga Studio,Falafel Restaurant,Electronics Store,Elementary School,Ethiopian Restaurant,Event Space,Factory,Fair,Farm


## Cluster 5

In [0]:
van_merged.loc[van_merged['Cluster Labels'] == 4, van_merged.columns[[0] + list(range(5, van_merged.shape[1]))]]

Unnamed: 0,Place,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
22,South Okanagan (Summerland),-119.005,4.0,Construction & Landscaping,Electronics Store,Fish Market,Fish & Chips Shop,Financial or Legal Service,Filipino Restaurant,Field,Fast Food Restaurant,Farmers Market,Farm
94,Abbotsford Southwest,-122.349,4.0,Construction & Landscaping,Electronics Store,Fish Market,Fish & Chips Shop,Financial or Legal Service,Filipino Restaurant,Field,Fast Food Restaurant,Farmers Market,Farm
278,Saanich East,-123.315,4.0,Construction & Landscaping,Business Service,Falafel Restaurant,Elementary School,Ethiopian Restaurant,Event Space,Factory,Fair,Yoga Studio,Electronics Store
308,Courtenay Central,-124.984,4.0,Construction & Landscaping,Electronics Store,Fish Market,Fish & Chips Shop,Financial or Legal Service,Filipino Restaurant,Field,Fast Food Restaurant,Farmers Market,Farm


## Cluster 6

In [0]:
van_merged.loc[van_merged['Cluster Labels'] == 5, van_merged.columns[[0] + list(range(5, van_merged.shape[1]))]]

Unnamed: 0,Place,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
138,Delta East,-122.906,5.0,Trail,Yoga Studio,Fair,Electronics Store,Elementary School,Ethiopian Restaurant,Event Space,Factory,Falafel Restaurant,Dumpling Restaurant
234,Richmond West,-123.172,5.0,Trail,Shopping Plaza,Yoga Studio,Fair,Elementary School,Ethiopian Restaurant,Event Space,Factory,Falafel Restaurant,Duty-free Shop
250,North Vancouver Northwest Central,-123.068,5.0,Trail,Paper / Office Supplies Store,Yoga Studio,Fair,Elementary School,Ethiopian Restaurant,Event Space,Factory,Falafel Restaurant,Duty-free Shop
262,West Vancouver West,-123.263,5.0,Trail,Tapas Restaurant,Yoga Studio,Fair,Elementary School,Ethiopian Restaurant,Event Space,Factory,Falafel Restaurant,Duty-free Shop


## Here we see Richmond West in the cluster labeled 5. Delta East, North Vancouver Northwest Central, and West Vancouver West are 3 other locations with VERY similar common venues. As Delta East is much closer to Richmond West, he may want to start there before setting up new stores in North/West Vancouver.
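The same comparison can be made programmatically by looking up Richmond West's label and selecting every other place that shares it. A minimal sketch, with a hypothetical toy frame standing in for `van_merged` (labels copied from the tables above):

```python
import pandas as pd

# Stand-in for van_merged with only the columns needed here.
toy_merged = pd.DataFrame(
    {
        "Place": ["Delta East", "Richmond West",
                  "North Vancouver Northwest Central",
                  "West Vancouver West", "Coquitlam North"],
        "Cluster Labels": [5, 5, 5, 5, 3],
    }
)

target = "Richmond West"
# Label assigned to the reference neighborhood.
target_label = toy_merged.loc[toy_merged["Place"] == target, "Cluster Labels"].iloc[0]
# Every other place that carries the same label.
same_cluster = toy_merged["Cluster Labels"] == target_label
not_target = toy_merged["Place"] != target
peers = toy_merged.loc[same_cluster & not_target, "Place"].tolist()
print(peers)
```

Looking the label up by name avoids hardcoding the number 5, which can change whenever the clustering is rerun with different data or settings.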