 # The Battle of Neighborhoods in Tokyo, Japan
 
## 1. Introduction

An entrepreneur looking to start a business in a foreign city needs to understand where demand lies across the city's neighborhoods. The city studied here is Tokyo, which has over 100 neighborhoods, each with a different mix of commuters and residents.

In [1]:
import requests # library to handle requests
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation

!pip install geopy
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

# libraries for displaying images
from IPython.display import Image 
from IPython.core.display import HTML 
    
# function to flatten a JSON structure into a pandas dataframe
# (the pandas.io.json import path is deprecated since pandas 1.0)
from pandas import json_normalize

! pip install folium==0.5.0
import folium # plotting library

from bs4 import BeautifulSoup



## 2. Data

We take the list of Tokyo neighborhoods from the Wikipedia template page https://en.wikipedia.org/wiki/Template:Neighborhoods_of_Tokyo and geocode each name with Nominatim to obtain its latitude and longitude.

In [2]:
List_url = "https://en.wikipedia.org/wiki/Template:Neighborhoods_of_Tokyo"
source = requests.get(List_url).text

In [3]:
soup = BeautifulSoup(source, 'xml')

In [4]:
table=soup.find('table')

In [6]:
table

<table class="nowraplinks mw-collapsible autocollapse navbox-inner" style="border-spacing:0;background:transparent;color:inherit"><tbody><tr><th class="navbox-title" colspan="2" scope="col"><style data-mw-deduplicate="TemplateStyles:r992953826">.mw-parser-output .navbar{display:inline;font-size:88%;font-weight:normal}.mw-parser-output .navbar-collapse{float:left;text-align:left}.mw-parser-output .navbar-boxtext{word-spacing:0}.mw-parser-output .navbar ul{display:inline-block;white-space:nowrap;line-height:inherit}.mw-parser-output .navbar-brackets::before{margin-right:-0.125em;content:"[ "}.mw-parser-output .navbar-brackets::after{margin-left:-0.125em;content:" ]"}.mw-parser-output .navbar li{word-spacing:-0.125em}.mw-parser-output .navbar-mini abbr{font-variant:small-caps;border-bottom:none;text-decoration:none;cursor:inherit}.mw-parser-output .navbar-ct-full{font-size:114%;margin:0 7em}.mw-parser-output .navbar-ct-mini{font-size:114%;margin:0 4em}.mw-parser-output .infobox .navbar{fo

In [7]:
#dataframe will consist of three columns: Neighborhood, Latitude, Longitude
column_names = ['Neighborhood', 'Latitude', 'Longitude']
df = pd.DataFrame(columns = column_names)

geolocator = Nominatim(user_agent="tokyo_explorer")

for tr_cell in table.find_all('tr'):
    for td_cell in tr_cell.find_all('td'):
        # every link in a table cell is treated as a neighborhood name
        for link in td_cell.find_all('a'):
            nhood = link.text
            try:
                # NOTE: only the bare name is geocoded, so a match may lie outside Tokyo
                location = geolocator.geocode(nhood)
                df.loc[len(df)] = [nhood, location.latitude, location.longitude]
            except Exception:
                # geocode() returned None or the request failed: skip this name
                continue

In [8]:
df.describe()

Unnamed: 0,Latitude,Longitude
count,103.0,103.0
mean,35.497805,137.212523
std,2.378678,13.307574
min,18.253466,22.168767
25%,35.639411,139.658122
50%,35.671679,139.733498
75%,35.698997,139.766032
max,44.079308,143.552653


In [9]:
df.head()

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Akasaka,35.671679,139.735622
1,Akihabara,35.701893,139.774368
2,Aobadai,35.542878,139.517231
3,Aomi,35.624704,139.781226
4,Aoyama,37.898632,139.001079


In [10]:
print("Shape: ", df.shape)

Shape:  (103, 3)
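
The `describe()` output above shows a minimum latitude of 18.25 and a minimum longitude of 22.17, so some bare names were evidently geocoded to places far from Tokyo. One possible guard, sketched below, is to keep only coordinates inside a rough bounding box. The box limits and the helper name `in_tokyo_bbox` are assumptions for illustration, not values from the original analysis; the sample rows are copied from outputs shown in this notebook.

```python
# Rough bounding box around mainland Tokyo (assumed values, adjust as needed).
LAT_MIN, LAT_MAX = 35.4, 35.95
LON_MIN, LON_MAX = 139.0, 140.0

def in_tokyo_bbox(lat, lon):
    """Return True if (lat, lon) falls inside the assumed Tokyo bounding box."""
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX

# Sample rows taken from the geocoding results shown in this notebook.
rows = [
    ("Akasaka", 35.671679, 139.735622),
    ("Akihabara", 35.701893, 139.774368),
    ("Aoyama", 37.898632, 139.001079),
    ("Kanda", 29.845427, 79.936388),
    ("Shiba", 43.645983, 22.168767),
]

kept = [name for name, lat, lon in rows if in_tokyo_bbox(lat, lon)]
dropped = [name for name, lat, lon in rows if not in_tokyo_bbox(lat, lon)]
print("kept:", kept)
print("dropped:", dropped)
```

With a dataframe, the same test could be applied row-wise, for example `df[df.apply(lambda r: in_tokyo_bbox(r['Latitude'], r['Longitude']), axis=1)]`. Qualifying the geocoding query (for instance `nhood + ', Tokyo, Japan'`) may also reduce mismatches, though that was not done in this analysis.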


## 2.1 Imports and Foursquare Credentials

In [11]:
import numpy as np
from pandas import json_normalize  # transform JSON file into a pandas dataframe

!pip install folium
import folium # map rendering library

!pip install geopy
from geopy.geocoders import Nominatim

# import k-means from clustering stage
from sklearn.cluster import KMeans

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors



In [12]:
# credentials are redacted here; never publish real API secrets in a notebook
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret
ACCESS_TOKEN = 'YOUR_FOURSQUARE_ACCESS_TOKEN' # your Foursquare Access Token
VERSION = '20180604'
LIMIT = 30
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)

Your credentials:
CLIENT_ID: YOUR_FOURSQUARE_CLIENT_ID


## 2.2 Create Map of Tokyo

In [13]:
address = "Tokyo"

geolocator = Nominatim(user_agent="tokyo_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Tokyo city are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Tokyo city are 35.6828387, 139.7594549.


In [14]:
# create map of Tokyo using latitude and longitude values
map_tokyo = folium.Map(location=[latitude, longitude], zoom_start=11)
map_tokyo

## 2.3 Visualise Neighborhoods in Tokyo

In [15]:
for lat, lng, neighborhood in zip(
        df['Latitude'], 
        df['Longitude'], 
        df['Neighborhood']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_tokyo)  
    
map_tokyo

## 3. Exploratory Data Analysis

We explore the most common venue types within a 500 m radius of each neighborhood's centre, starting with a single neighborhood (Akihabara, row 1 of the dataframe).

In [16]:
# return the name of a venue's first category (or None if it has none)

def get_category_type(row): 
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [17]:
# get json of venues within a 500 m radius of one neighborhood (row 1, Akihabara), capped at 100

neighborhood_name = df.loc[1, 'Neighborhood']
neighborhood_latitude = df.loc[1, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = df.loc[1, 'Longitude'] # neighborhood longitude value

LIMIT = 100 # limit of number of venues returned by Foursquare API
radius = 500 # define radius
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)

# get the result to a json file
results = requests.get(url).json()
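
The request URL above is assembled with `str.format`. As an alternative sketch, the query string can be built from a dictionary with the standard library's `urllib.parse.urlencode`, which also escapes special characters. The endpoint and parameter names are the ones used in the cell above; the credential values are placeholders, and `build_explore_url` is a hypothetical helper, not part of the original notebook.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"

def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Assemble the explore URL with the same parameters as the cell above."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }
    # urlencode percent-escapes values, so the comma in "ll" becomes %2C
    return EXPLORE_ENDPOINT + "?" + urlencode(params)

url = build_explore_url("ID", "SECRET", "20180604", 35.701893, 139.774368, 500, 100)
print(url)
```

Keeping the parameters in a dictionary makes it harder to misplace a positional argument, which is easy to do with seven `{}` placeholders.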

In [18]:
venues = results['response']['groups'][0]['items']
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues =nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues


Unnamed: 0,name,categories,lat,lng
0,Akiba Fukurou (アキバフクロウ),Pet Café,35.700795,139.774862
1,Tankiyo (たん清),BBQ Joint,35.700220,139.774665
2,Fruit de saison (フルーフ・デゥ・セゾン),Café,35.702276,139.772864
3,THE ALLEY,Bubble Tea Shop,35.701925,139.772092
4,MOGRA,Nightclub,35.701987,139.775072
...,...,...,...,...
95,Hello! Project Official Shop (ハロー! プロジェクト オフィシ...,Hobby Shop,35.703035,139.771805
96,Steak & Wine Block,Steakhouse,35.698780,139.774654
97,オヤイデ電気 秋葉原店,Electronics Store,35.698624,139.770848
98,Yellow Submarine (イエローサブマリン 秋葉原RPGショップ),Hobby Shop,35.699714,139.771243


## 3.1 Data Across all Neighborhoods

We now extend the analysis to all neighborhoods.

In [19]:
# method to get nearby venues across all neighborhoods

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    venues_list=[]
    
    for name, lat, lng in zip(names, latitudes, longitudes):
        # print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
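
`getNearbyVenues` indexes straight into `['response']['groups'][0]['items']` and `categories[0]`, so a failed request or a venue without a category would raise an exception. A more defensive extraction might look like the sketch below. It assumes the response shape accessed in the function above; `extract_venues` and the sample payload are made up for illustration.

```python
def extract_venues(payload):
    """Return (name, lat, lng, category) tuples from an explore-style response.

    A payload without groups yields an empty list; a venue without
    categories gets None as its category.
    """
    groups = payload.get("response", {}).get("groups", [])
    if not groups:
        return []
    rows = []
    for item in groups[0].get("items", []):
        venue = item["venue"]
        categories = venue.get("categories", [])
        category = categories[0]["name"] if categories else None
        rows.append((
            venue["name"],
            venue["location"]["lat"],
            venue["location"]["lng"],
            category,
        ))
    return rows

# Hand-written sample payload mimicking the structure accessed above.
sample = {
    "response": {"groups": [{"items": [
        {"venue": {"name": "MOGRA",
                   "location": {"lat": 35.701987, "lng": 139.775072},
                   "categories": [{"name": "Nightclub"}]}},
        {"venue": {"name": "Uncategorised place",
                   "location": {"lat": 35.70, "lng": 139.77},
                   "categories": []}},
    ]}]}
}

print(extract_venues(sample))
print(extract_venues({"meta": {"code": 429}}))  # no "response" key at all
```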

In [20]:
tokyo_venues = getNearbyVenues(names=df['Neighborhood'],
                                   latitudes=df['Latitude'],
                                   longitudes=df['Longitude']
                                  )

In [21]:
tokyo_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Akasaka,85,85,85,85,85,85
Akihabara,100,100,100,100,100,100
Aobadai,65,65,65,65,65,65
Aomi,43,43,43,43,43,43
Aoyama,21,21,21,21,21,21
...,...,...,...,...,...,...
Yūrakuchō,100,100,100,100,100,100
Zōshigaya,24,24,24,24,24,24
Ōmori,100,100,100,100,100,100
Ōsaki,9,9,9,9,9,9


In [22]:
# one hot encoding
tokyo_onehot = pd.get_dummies(tokyo_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
tokyo_onehot['Neighborhood'] = tokyo_venues['Neighborhood'] 

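# move neighborhood column to the first column
# NOTE: this assumes 'Neighborhood' was appended as the last column. The output below
# starts with 'Zoo', which suggests a venue category named 'Neighborhood' already
# existed and was overwritten in place by the assignment above.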
fixed_columns = [tokyo_onehot.columns[-1]] + list(tokyo_onehot.columns[:-1])
tokyo_denc_onehot = tokyo_onehot[fixed_columns]

tokyo_denc_onehot.head()

Unnamed: 0,Zoo,ATM,Accessories Store,Adult Boutique,American Restaurant,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,...,Well,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yakitori Restaurant,Yoga Studio,Yoshoku Restaurant,Yunnan Restaurant
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [23]:
tokyo_grouped = tokyo_onehot.groupby('Neighborhood').mean().reset_index()
tokyo_grouped.head()

Unnamed: 0,Neighborhood,ATM,Accessories Store,Adult Boutique,American Restaurant,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,...,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yakitori Restaurant,Yoga Studio,Yoshoku Restaurant,Yunnan Restaurant,Zoo
0,Akasaka,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.023529,0.0,0.0,0.0,0.0
1,Akihabara,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,...,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0
2,Aobadai,0.015385,0.0,0.0,0.015385,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.015385,0.0,0.0
3,Aomi,0.0,0.0,0.0,0.0,0.0,0.0,0.023256,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Aoyama,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
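
Because each row of the one-hot table holds a single 1, the mean over a neighborhood's rows is simply the share of its venues in each category. For example, Akasaka returned 85 venues, so its Yakitori Restaurant value of 0.023529 corresponds to 2 of 85. The sketch below reproduces that arithmetic with the standard library on a made-up venue list; the helper name `category_shares` is hypothetical.

```python
from collections import Counter

def category_shares(categories):
    """Map each category to its share of the venue list (equals the one-hot mean)."""
    counts = Counter(categories)
    total = len(categories)
    return {cat: n / total for cat, n in counts.items()}

# Made-up list: 85 venues, 2 of which are yakitori restaurants.
venues = ["Yakitori Restaurant"] * 2 + ["Japanese Restaurant"] * 10 + ["Other"] * 73
shares = category_shares(venues)
print(round(shares["Yakitori Restaurant"], 6))  # 2 / 85 -> 0.023529
```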


In [24]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = tokyo_grouped['Neighborhood']

for ind in np.arange(tokyo_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(tokyo_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Akasaka,Japanese Restaurant,Chinese Restaurant,Convenience Store,BBQ Joint,Sake Bar,Coffee Shop,Sushi Restaurant,Szechuan Restaurant,Hotel,Kaiseki Restaurant
1,Akihabara,Hobby Shop,Electronics Store,Ramen Restaurant,Café,Sake Bar,Chinese Restaurant,BBQ Joint,Soba Restaurant,Theme Restaurant,Sushi Restaurant
2,Aobadai,Convenience Store,Park,Discount Store,Bakery,Chinese Restaurant,Ramen Restaurant,Grocery Store,Gym / Fitness Center,Soba Restaurant,Fast Food Restaurant
3,Aomi,Exhibit,Convenience Store,Coffee Shop,Plaza,Shopping Mall,Theme Restaurant,Theme Park Ride / Attraction,Theme Park,Tea Room,Bus Stop
4,Aoyama,Pharmacy,Discount Store,Bakery,Bus Stop,Gym,Okonomiyaki Restaurant,Ice Cream Shop,Coffee Shop,Bookstore,Music Venue
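
One caveat of ranking a fixed top 10: a neighborhood with fewer than ten categories still gets ten entries. Ōsaki, for instance, returned only 9 venues, so at least one of its ten "most common" venues must have a frequency of zero. A variant that drops zero-frequency categories could look like the sketch below; `top_venues` and the sample frequencies are illustrative assumptions, not part of the original analysis.

```python
def top_venues(freqs, n=10):
    """Return up to n category names with non-zero frequency, most common first.

    Ties are broken alphabetically so the result is deterministic.
    """
    nonzero = [(cat, f) for cat, f in freqs.items() if f > 0]
    nonzero.sort(key=lambda pair: (-pair[1], pair[0]))
    return [cat for cat, _ in nonzero[:n]]

# Illustrative frequencies for a sparsely covered neighborhood.
freqs = {
    "Train Station": 0.5,
    "Italian Restaurant": 0.25,
    "ATM": 0.25,
    "Pie Shop": 0.0,
    "Plaza": 0.0,
}
print(top_venues(freqs))
```

The result can be shorter than `n`, which makes sparse neighborhoods visible instead of padding them with categories that never occurred.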


## 3.2 Machine Learning - Performing k-means clustering on neighborhoods

We cluster the neighborhoods with k-means, using k=10.

In [25]:
# set number of clusters
kclusters = 10

tokyo_grouped_clustering = tokyo_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(tokyo_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([7, 2, 0, 2, 2, 7, 0, 0, 0, 7])

In [30]:
kmeans.labels_

array([7, 2, 0, 2, 2, 7, 0, 0, 0, 7, 7, 2, 9, 7, 2, 2, 7, 0, 2, 2, 0, 0,
       0, 7, 2, 7, 2, 2, 0, 7, 7, 7, 7, 8, 0, 2, 0, 0, 0, 7, 0, 0, 0, 2,
       7, 0, 0, 0, 0, 7, 2, 7, 5, 7, 1, 7, 7, 7, 0, 2, 2, 0, 2, 2, 0, 2,
       2, 7, 7, 0, 7, 0, 0, 7, 2, 0, 7, 7, 7, 0, 6, 8, 0, 7, 7, 7, 7, 7,
       3, 4, 2, 8, 7, 0, 2, 0, 0, 0, 2])

In [26]:
tokyo_grouped_clustering

Unnamed: 0,ATM,Accessories Store,Adult Boutique,American Restaurant,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,...,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yakitori Restaurant,Yoga Studio,Yoshoku Restaurant,Yunnan Restaurant,Zoo
0,0.000000,0.0,0.0,0.000000,0.00,0.000000,0.000000,0.0,0.0,0.00,...,0.0,0.00,0.0,0.0,0.0,0.023529,0.0,0.000000,0.0,0.0
1,0.000000,0.0,0.0,0.000000,0.01,0.020000,0.000000,0.0,0.0,0.01,...,0.0,0.02,0.0,0.0,0.0,0.000000,0.0,0.010000,0.0,0.0
2,0.015385,0.0,0.0,0.015385,0.00,0.000000,0.000000,0.0,0.0,0.00,...,0.0,0.00,0.0,0.0,0.0,0.000000,0.0,0.015385,0.0,0.0
3,0.000000,0.0,0.0,0.000000,0.00,0.000000,0.023256,0.0,0.0,0.00,...,0.0,0.00,0.0,0.0,0.0,0.000000,0.0,0.000000,0.0,0.0
4,0.000000,0.0,0.0,0.000000,0.00,0.000000,0.000000,0.0,0.0,0.00,...,0.0,0.00,0.0,0.0,0.0,0.000000,0.0,0.000000,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
94,0.000000,0.0,0.0,0.000000,0.00,0.010000,0.000000,0.0,0.0,0.00,...,0.0,0.01,0.0,0.0,0.0,0.010000,0.0,0.010000,0.0,0.0
95,0.000000,0.0,0.0,0.000000,0.00,0.041667,0.000000,0.0,0.0,0.00,...,0.0,0.00,0.0,0.0,0.0,0.000000,0.0,0.000000,0.0,0.0
96,0.010000,0.0,0.0,0.000000,0.01,0.000000,0.000000,0.0,0.0,0.00,...,0.0,0.00,0.0,0.0,0.0,0.000000,0.0,0.010000,0.0,0.0
97,0.000000,0.0,0.0,0.000000,0.00,0.000000,0.000000,0.0,0.0,0.00,...,0.0,0.00,0.0,0.0,0.0,0.000000,0.0,0.000000,0.0,0.0


In [27]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

tokyo_merged = df

# join neighborhoods_venues_sorted onto df to add the cluster label and top venues for each neighborhood
tokyo_merged = tokyo_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

tokyo_merged.head() # check the last columns!

Unnamed: 0,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Akasaka,35.671679,139.735622,7.0,Japanese Restaurant,Chinese Restaurant,Convenience Store,BBQ Joint,Sake Bar,Coffee Shop,Sushi Restaurant,Szechuan Restaurant,Hotel,Kaiseki Restaurant
1,Akihabara,35.701893,139.774368,2.0,Hobby Shop,Electronics Store,Ramen Restaurant,Café,Sake Bar,Chinese Restaurant,BBQ Joint,Soba Restaurant,Theme Restaurant,Sushi Restaurant
2,Aobadai,35.542878,139.517231,0.0,Convenience Store,Park,Discount Store,Bakery,Chinese Restaurant,Ramen Restaurant,Grocery Store,Gym / Fitness Center,Soba Restaurant,Fast Food Restaurant
3,Aomi,35.624704,139.781226,2.0,Exhibit,Convenience Store,Coffee Shop,Plaza,Shopping Mall,Theme Restaurant,Theme Park Ride / Attraction,Theme Park,Tea Room,Bus Stop
4,Aoyama,37.898632,139.001079,2.0,Pharmacy,Discount Store,Bakery,Bus Stop,Gym,Okonomiyaki Restaurant,Ice Cream Shop,Coffee Shop,Bookstore,Music Venue


In [46]:
tokyo_merged[tokyo_merged['Cluster Labels'].isna()]

Unnamed: 0,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
20,Higashi,26.638966,128.181489,,,,,,,,,,,
35,Kanda,29.845427,79.936388,,,,,,,,,,,
69,Shiba,43.645983,22.168767,,,,,,,,,,,
97,Yayoi,44.079308,143.552653,,,,,,,,,,,


In [55]:
# drop the neighborhoods without a cluster label (no venues were returned for them)
tokyo_merged = tokyo_merged.dropna(subset=['Cluster Labels'])

In [68]:
tokyo_merged['Cluster Labels'].value_counts()

0.0    33
7.0    33
2.0    24
8.0     3
3.0     1
4.0     1
5.0     1
1.0     1
6.0     1
9.0     1
Name: Cluster Labels, dtype: int64
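
The counts above are very uneven: three clusters hold 90 of the 99 labelled neighborhoods, and six clusters contain a single neighborhood each. The short sketch below recomputes this from the `kmeans.labels_` array printed earlier, using only the standard library. Whether such an imbalance means k=10 is too large for this data is a judgment call that the analysis here does not settle.

```python
from collections import Counter

# Cluster labels copied from the kmeans.labels_ output shown earlier.
labels = [
    7, 2, 0, 2, 2, 7, 0, 0, 0, 7, 7, 2, 9, 7, 2, 2, 7, 0, 2, 2, 0, 0,
    0, 7, 2, 7, 2, 2, 0, 7, 7, 7, 7, 8, 0, 2, 0, 0, 0, 7, 0, 0, 0, 2,
    7, 0, 0, 0, 0, 7, 2, 7, 5, 7, 1, 7, 7, 7, 0, 2, 2, 0, 2, 2, 0, 2,
    2, 7, 7, 0, 7, 0, 0, 7, 2, 0, 7, 7, 7, 0, 6, 8, 0, 7, 7, 7, 7, 7,
    3, 4, 2, 8, 7, 0, 2, 0, 0, 0, 2,
]

sizes = Counter(labels)
singletons = sorted(c for c, n in sizes.items() if n == 1)
top3 = sizes.most_common(3)
share_top3 = sum(n for _, n in top3) / len(labels)

print("cluster sizes:", dict(sorted(sizes.items())))
print("singleton clusters:", singletons)
print("share in the three largest clusters:", round(share_top3, 3))
```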

## 4. Results

We plot the clusters on a map. The two largest clusters are `Cluster 0` (red) and `Cluster 7` (yellow), with 33 neighborhoods each.

`Cluster 7` is concentrated in certain parts of Tokyo, mainly the central area.

`Cluster 0` is spread across all of Tokyo, including the suburbs.

In [58]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(
        tokyo_merged['Latitude'], 
        tokyo_merged['Longitude'], 
        tokyo_merged['Neighborhood'], 
        tokyo_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        # the -1 offset wraps cluster 0 around to the last color in the list (red)
        color=rainbow[int(cluster)-1],
        fill=True,
        fill_color=rainbow[int(cluster)-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [63]:
tokyo_merged.loc[tokyo_merged['Cluster Labels'] == 0, tokyo_merged.columns[[0] + list(range(4, tokyo_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Aobadai,Convenience Store,Park,Discount Store,Bakery,Chinese Restaurant,Ramen Restaurant,Grocery Store,Gym / Fitness Center,Soba Restaurant,Fast Food Restaurant
6,Asagaya,Convenience Store,BBQ Joint,Café,Coffee Shop,Italian Restaurant,Art Gallery,Drugstore,Pharmacy,Wagashi Place,Clothing Store
7,Asakusa,Convenience Store,Café,Hostel,Ramen Restaurant,Dessert Shop,Wagashi Place,Supermarket,Coffee Shop,Italian Restaurant,Japanese Restaurant
8,Asakusabashi,Convenience Store,Ramen Restaurant,Sake Bar,Grocery Store,Japanese Restaurant,Intersection,Tonkatsu Restaurant,Thai Restaurant,Chinese Restaurant,Paper / Office Supplies Store
17,Hamamatsuchō,Convenience Store,Coffee Shop,Café,Japanese Restaurant,Sake Bar,Ramen Restaurant,Soba Restaurant,BBQ Joint,Tonkatsu Restaurant,Chinese Restaurant
21,Higashikanda,Convenience Store,Sake Bar,Ramen Restaurant,Soba Restaurant,BBQ Joint,Grocery Store,Coffee Shop,Bed & Breakfast,Café,Yoshoku Restaurant
22,Hongō,Convenience Store,Café,Japanese Restaurant,Deli / Bodega,ATM,BBQ Joint,New Auto Dealership,Intersection,Bike Shop,Ramen Restaurant
23,Ichigaya,Convenience Store,Chinese Restaurant,Café,Supermarket,Ramen Restaurant,Sushi Restaurant,Sake Bar,Japanese Restaurant,Clothing Store,Tonkatsu Restaurant
29,Jūjō,Convenience Store,Ramen Restaurant,Chinese Restaurant,Café,Coffee Shop,Noodle House,Japanese Restaurant,Fast Food Restaurant,Shopping Mall,Drugstore
36,Kasumigaseki,Convenience Store,Coffee Shop,Historic Site,Bookstore,Japanese Restaurant,Café,Udon Restaurant,Concert Hall,Ramen Restaurant,Lawyer


In [64]:
tokyo_merged.loc[tokyo_merged['Cluster Labels'] == 1, tokyo_merged.columns[[0] + list(range(4, tokyo_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
55,Nishishinjuku,Botanical Garden,ATM,Pie Shop,Pool,Plaza,Playground,Platform,Planetarium,Pizza Place,Pier


In [65]:
tokyo_merged.loc[tokyo_merged['Cluster Labels'] == 2, tokyo_merged.columns[[0] + list(range(4, tokyo_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Akihabara,Hobby Shop,Electronics Store,Ramen Restaurant,Café,Sake Bar,Chinese Restaurant,BBQ Joint,Soba Restaurant,Theme Restaurant,Sushi Restaurant
3,Aomi,Exhibit,Convenience Store,Coffee Shop,Plaza,Shopping Mall,Theme Restaurant,Theme Park Ride / Attraction,Theme Park,Tea Room,Bus Stop
4,Aoyama,Pharmacy,Discount Store,Bakery,Bus Stop,Gym,Okonomiyaki Restaurant,Ice Cream Shop,Coffee Shop,Bookstore,Music Venue
11,Daikanyama,Japanese Restaurant,Coffee Shop,Bar,BBQ Joint,Café,Italian Restaurant,Boutique,Seafood Restaurant,Dessert Shop,Gastropub
14,Futako-tamagawa,Café,Shopping Mall,Convenience Store,Coffee Shop,Italian Restaurant,Sushi Restaurant,Bookstore,Dessert Shop,Boutique,Ramen Restaurant
15,Ginza,Japanese Restaurant,Sushi Restaurant,Café,Clothing Store,Tonkatsu Restaurant,Kaiseki Restaurant,Ramen Restaurant,Shabu-Shabu Restaurant,Coffee Shop,Dessert Shop
18,Harajuku,Café,Coffee Shop,Clothing Store,Sporting Goods Shop,Convenience Store,Dessert Shop,Sake Bar,Stadium,Outdoor Supply Store,Thai Restaurant
19,Hibiya,Sushi Restaurant,Café,Italian Restaurant,Theater,Japanese Restaurant,Convenience Store,Hotel,Tea Room,Udon Restaurant,French Restaurant
25,Ikebukuro,Sake Bar,Yoshoku Restaurant,Sushi Restaurant,Coffee Shop,Ramen Restaurant,Chinese Restaurant,Japanese Restaurant,Unagi Restaurant,Café,Convenience Store
27,Jiyūgaoka,Café,Convenience Store,Bar,Coffee Shop,Chinese Restaurant,Dessert Shop,Sake Bar,Italian Restaurant,Drugstore,Donburi Restaurant


In [66]:
tokyo_merged.loc[tokyo_merged['Cluster Labels'] == 3, tokyo_merged.columns[[0] + list(range(4, tokyo_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
94,Wakasu,Bus Stop,Sculpture Garden,Park,ATM,Pier,Pool,Plaza,Playground,Platform,Planetarium


In [67]:
tokyo_merged.loc[tokyo_merged['Cluster Labels'] == 4, tokyo_merged.columns[[0] + list(range(4, tokyo_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
95,Yaesu,Intersection,Steakhouse,Portuguese Restaurant,Plaza,Playground,Platform,Planetarium,Pizza Place,Pier,Pie Shop


In [69]:
tokyo_merged.loc[tokyo_merged['Cluster Labels'] == 5, tokyo_merged.columns[[0] + list(range(4, tokyo_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
56,Nishikichō,Train Station,Food & Drink Shop,Liquor Store,Pie Shop,Pool,Plaza,Playground,Platform,Planetarium,Pizza Place


In [70]:
tokyo_merged.loc[tokyo_merged['Cluster Labels'] == 6, tokyo_merged.columns[[0] + list(range(4, tokyo_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
86,Tateishi,Train Station,Italian Restaurant,ATM,Pie Shop,Plaza,Playground,Platform,Planetarium,Pizza Place,Pier


In [71]:
tokyo_merged.loc[tokyo_merged['Cluster Labels'] == 7, tokyo_merged.columns[[0] + list(range(4, tokyo_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Akasaka,Japanese Restaurant,Chinese Restaurant,Convenience Store,BBQ Joint,Sake Bar,Coffee Shop,Sushi Restaurant,Szechuan Restaurant,Hotel,Kaiseki Restaurant
5,Ariake,Convenience Store,Coffee Shop,Bus Stop,Japanese Restaurant,Sake Bar,Plaza,Chinese Restaurant,Café,Intersection,Bike Rental / Bike Share
9,Azabu,Convenience Store,BBQ Joint,Japanese Restaurant,Soba Restaurant,Café,Pizza Place,Yakitori Restaurant,Chinese Restaurant,Bakery,Italian Restaurant
10,Awajichō,Convenience Store,Japanese Curry Restaurant,Coffee Shop,Soba Restaurant,BBQ Joint,Electronics Store,Ramen Restaurant,Sake Bar,Drugstore,Chinese Restaurant
13,Ebisu,Japanese Restaurant,Convenience Store,BBQ Joint,Italian Restaurant,Seafood Restaurant,Coffee Shop,Bar,Pizza Place,Chinese Restaurant,Ramen Restaurant
16,Gotanda,Convenience Store,Chinese Restaurant,BBQ Joint,Ramen Restaurant,Sake Bar,Japanese Restaurant,Sushi Restaurant,Italian Restaurant,Café,Bakery
24,Iidabashi,Convenience Store,Japanese Restaurant,Sake Bar,Italian Restaurant,French Restaurant,BBQ Joint,Café,Chinese Restaurant,Ramen Restaurant,Indian Restaurant
26,Iwamotochō,Sake Bar,Convenience Store,BBQ Joint,Sushi Restaurant,Steakhouse,Ramen Restaurant,Soba Restaurant,Hobby Shop,Japanese Curry Restaurant,ATM
30,Kabukichō,Sake Bar,Bar,BBQ Joint,Japanese Restaurant,Ramen Restaurant,Chinese Restaurant,Sushi Restaurant,Seafood Restaurant,Pastry Shop,Steakhouse
31,Kagurazaka,Convenience Store,Sake Bar,Chinese Restaurant,Café,Coffee Shop,Park,Italian Restaurant,Bakery,Grocery Store,French Restaurant


In [72]:
tokyo_merged.loc[tokyo_merged['Cluster Labels'] == 8, tokyo_merged.columns[[0] + list(range(4, tokyo_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
34,Kamiikebukuro,Intersection,Convenience Store,Café,Park,Ramen Restaurant,Dumpling Restaurant,Grocery Store,Gym / Fitness Center,Restaurant,Music Venue
87,Tatsumi,Park,Intersection,Rest Area,Dessert Shop,Trail,Plaza,Supermarket,Platform,Convenience Store,Pier
99,Yotsuya,Train Station,Intersection,Ramen Restaurant,Convenience Store,ATM,Pier,Plaza,Playground,Platform,Planetarium


In [73]:
tokyo_merged.loc[tokyo_merged['Cluster Labels'] == 9, tokyo_merged.columns[[0] + list(range(4, tokyo_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
12,Den-en-chōfu,Convenience Store,Bus Stop,Intersection,Kids Store,Park,Flower Shop,Pet Store,Print Shop,Pastry Shop,Portuguese Restaurant


## 5. Conclusion

As a business owner, I would look at `Cluster 0` or `Cluster 7` to identify neighborhoods in which to start a business. These are the two largest clusters, with 33 neighborhoods each.

In `Cluster 0`, the most common venue is the convenience store, followed by cafés and coffee shops. These could be neighborhoods with a younger population. I would consider opening either a convenience store or a café to cater to this group.

`Cluster 7` has many convenience stores but also restaurants and night spots such as sake bars. These could be neighborhoods with a larger proportion of working adults, or possibly more affluent adults. I would consider opening a restaurant or a bar in this cluster to cater to this group.

Another cluster to note is `Cluster 2`, which is highly concentrated in the central area around Ginza. It also contains neighborhoods such as Roppongi and Shibuya, which have many restaurants. I may also consider opening a restaurant in this cluster.