# Capstone Project - The Battle of Neighborhoods

## 1. Introduction

As pearl milk tea is becoming more and more popular in Japan, the stakeholder wants to open a milk tea shop as a new business. He needs me to recommend a location in Tokyo to open that shop.

He tolds me this business mode is very mature in China. The milk tea shops always open near shopping mall, university, bakery and subway station.

He wants me to find 5 best places in Tokyo to open the milk tea shop by analysing the nearby environment.
I will select 5 possible places in Tokyo by comparing with the milk tea locations in Shanghai.

## 2. Data

### 2.1 The geographic coordinate of milk tea shops in Shanghai and nearby environment

I can get these geographic coordinate from Foursquare API.

Using this data, i can find out the feature of milk tea shops locations in Shanghai, then i can use these features to find the similar places in Tokyo.

### 2.2 The university, shopping and public transportation geographic coordinate in Tokyo

I can also get these geographic coordinate from Foursquare API.

Using the features found in Shanghai's data, i can compare both nearby environment, then find out possible places to open shops.

In [1]:
import json 
import pandas as pd
import folium
import requests
from requests.exceptions import ReadTimeout, ConnectTimeout
import csv
import urllib.request
import re
import numpy as np
from urllib.parse import quote

Use the map's api to get the geographic coordinate of milk tea shops in Shanghai from <amap.com>

In [2]:
poi_search_url = "http://restapi.amap.com/v3/place/text"
def getpoi_page(cityname, keywords, page):
    req_url = poi_search_url + "?key=" + amap_web_key + '&keywords=' + quote(keywords) + '&city=' + quote(cityname) + '&citylimit=true' + '&offset=25' + '&page=' + str(page) + '&output=json'
    data = ''
    with urllib.request.urlopen(req_url) as f:
        data = f.read()
        data = data.decode('utf-8')
    return data

def getpois(cityname, keywords):
    i = 1
    poilist = []
    while True:  
        result = getpoi_page(cityname, keywords, i)
        result = json.loads(result) 
        if result['count'] == '0':
            break
        poilist.extend(result['pois'])
        i = i + 1
    return poilist

milktea_shop_json=getpois(cityname='shanghai',keywords='naicha')

milktea_list = []
for data in milktea_shop_json:
    milktea_list.append({
        'Name': data['name'],
        'Address': data['address'],
        'Location':data['location']
    })
df_milktea_shop = pd.DataFrame(milktea_list, columns=['Name','Address','Location'])


In [3]:
location=df_milktea_shop['Location']
longitude = []
latitude = []
for l in location:
    m = re.search('(.*),(.*)',l)
    longitude.append(m.group(1)) 
    latitude.append(m.group(2))
df_milktea_shop.insert(2,'Longitude',longitude)
df_milktea_shop.insert(3,'Latitude',latitude)
df_milktea_shop.drop('Location',axis=1,inplace=True)

Use the first 100 milk tea shops in shanghai 

In [4]:
df_milktea_shop=df_milktea_shop[0:100]
df_milktea_shop.head()

Unnamed: 0,Name,Address,Longitude,Latitude
0,奶茶会所,光明路与高东新路交叉口西北50米,121.61763,31.332172
1,奶茶咖啡,人民路与梧桐路交叉口西北50米,121.496488,31.227198
2,一点奶茶,共和新路647号,121.463955,31.254808
3,四季奶茶,三牌楼路87号,121.492453,31.222938
4,悸动奶茶,光明镇光明路319号,121.516284,30.906174


Create map of shanghai using latitude and longitude values

In [5]:
map_shanghai = folium.Map(location = [31.230378,121.473658], zoom_start = 11)  

for i in range(0,850):
    folium.CircleMarker(
        [latitude[i], longitude[i]],
        radius=5,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_shanghai)  

map_shanghai

Use foursquare to get the venues of these 100 milk tea shops 

In [7]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
        results = requests.get(url).json()["response"]['groups'][0]['items']
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Name', 
                  'Latitude', 
                  'Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [8]:
milkteashop_venues = getNearbyVenues(names=df_milktea_shop['Name'],
                                   latitudes=df_milktea_shop['Latitude'],
                                   longitudes=df_milktea_shop['Longitude']
                                  )


In [9]:
print(milkteashop_venues.shape)
milkteashop_venues.head()

(999, 7)


Unnamed: 0,Name,Latitude,Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,奶茶会所,31.332172,121.61763,chona,31.331225,121.622488,Airport
1,奶茶会所,31.332172,121.61763,Dole Shanghai,31.336273,121.615633,Farmers Market
2,奶茶咖啡,31.227198,121.496488,Hotel Indigo Shanghai On The Bund (上海外灘英迪格酒店),31.228193,121.495571,Hotel
3,奶茶咖啡,31.227198,121.496488,CHAR Bar,31.228209,121.495593,Hotel Bar
4,奶茶咖啡,31.227198,121.496488,CHAR,31.228187,121.495556,Steakhouse


In [10]:
milkteashop_venues.groupby('Name').count().head()

Unnamed: 0_level_0,Latitude,Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
COCO奶茶,4,4,4,4,4,4
Coco奶茶(浦城路),30,30,30,30,30,30
DOZI奶茶,2,2,2,2,2,2
LHH奶茶点心,10,10,10,10,10,10
TPLUS奶茶店,30,30,30,30,30,30


In [11]:
# one hot encoding
milkteashop_onehot = pd.get_dummies(milkteashop_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
milkteashop_onehot['Name'] = milkteashop_venues['Name'] 

# move neighborhood column to the first column
fixed_columns = [milkteashop_onehot.columns[-1]] + list(milkteashop_onehot.columns[:-1])
milkteashop_onehot = milkteashop_onehot[fixed_columns]

milkteashop_onehot.head()

Unnamed: 0,Name,Airport,American Restaurant,Art Gallery,Art Museum,Asian Restaurant,Athletics & Sports,Auto Workshop,BBQ Joint,Badminton Court,...,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Store,Vietnamese Restaurant,Warehouse Store,Whisky Bar,Wine Shop,Xinjiang Restaurant,Yoga Studio,Yunnan Restaurant
0,奶茶会所,1,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,奶茶会所,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,奶茶咖啡,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,奶茶咖啡,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,奶茶咖啡,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [12]:
milkteashop_onehot.shape

(999, 153)

In [13]:
milkteashop_grouped = milkteashop_onehot.groupby('Name').mean().reset_index()
milkteashop_grouped.head()

Unnamed: 0,Name,Airport,American Restaurant,Art Gallery,Art Museum,Asian Restaurant,Athletics & Sports,Auto Workshop,BBQ Joint,Badminton Court,...,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Store,Vietnamese Restaurant,Warehouse Store,Whisky Bar,Wine Shop,Xinjiang Restaurant,Yoga Studio,Yunnan Restaurant
0,COCO奶茶,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Coco奶茶(浦城路),0.0,0.033333,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,...,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,DOZI奶茶,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,LHH奶茶点心,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,TPLUS奶茶店,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.133333,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [14]:
milkteashop_grouped.shape

(82, 153)

In [15]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

num_top_venues = 5


indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Name']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
name_venues_sorted = pd.DataFrame(columns=columns)
name_venues_sorted['Name'] = milkteashop_grouped['Name']

for ind in np.arange(milkteashop_grouped.shape[0]):
    name_venues_sorted.iloc[ind, 1:] = return_most_common_venues(milkteashop_grouped.iloc[ind, :], num_top_venues)

name_venues_sorted.head()


Unnamed: 0,Name,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,COCO奶茶,Grocery Store,Hotel,Restaurant,French Restaurant,Gay Bar
1,Coco奶茶(浦城路),Italian Restaurant,Coffee Shop,Asian Restaurant,Seafood Restaurant,Chinese Restaurant
2,DOZI奶茶,Department Store,Metro Station,Yunnan Restaurant,Food,German Restaurant
3,LHH奶茶点心,Sandwich Place,Dessert Shop,Art Gallery,Coffee Shop,Hotel
4,TPLUS奶茶店,Shopping Mall,Fast Food Restaurant,BBQ Joint,Metro Station,Hostel


Use the k-means to cluster the milk tea shops.

In [16]:
from sklearn.cluster import KMeans

In [17]:
# set number of clusters
kclusters = 5

milkteashop_grouped_clustering = milkteashop_grouped.drop('Name', 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=3).fit(milkteashop_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_

array([0, 2, 2, 0, 2, 2, 2, 2, 2, 0, 2, 2, 2, 2, 2, 2, 2, 2, 2, 0, 2, 2,
       2, 0, 2, 0, 2, 0, 2, 2, 2, 2, 2, 2, 0, 0, 1, 2, 0, 0, 2, 2, 1, 2,
       4, 2, 2, 2, 3, 2, 1, 1, 2, 2, 0, 1, 2, 2, 1, 2, 2, 1, 1, 2, 1, 1,
       2, 1, 0, 2, 2, 2, 2, 1, 0, 2, 2, 1, 2, 1, 2, 2], dtype=int32)

In [18]:
name_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

milkteashop_merged = df_milktea_shop

milkteashop_merged = milkteashop_merged.join(name_venues_sorted.set_index('Name'), on='Name')

milkteashop_merged = milkteashop_merged.dropna(axis=0,how='any')

milkteashop_merged['Cluster Labels'] = milkteashop_merged['Cluster Labels'].astype('int')

milkteashop_merged.head() 

Unnamed: 0,Name,Address,Longitude,Latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,奶茶会所,光明路与高东新路交叉口西北50米,121.61763,31.332172,2,Airport,Farmers Market,Food,Golf Course,German Restaurant
1,奶茶咖啡,人民路与梧桐路交叉口西北50米,121.496488,31.227198,2,Hotel,Clothing Store,Steakhouse,Art Gallery,Furniture / Home Store
2,一点奶茶,共和新路647号,121.463955,31.254808,2,Hotel,Supermarket,Bubble Tea Shop,Hotpot Restaurant,Metro Station
3,四季奶茶,三牌楼路87号,121.492453,31.222938,0,Hotel,Clothing Store,Convenience Store,Museum,Coffee Shop
5,tina奶茶,淞虹路683好,121.361598,31.212718,2,Fast Food Restaurant,Grocery Store,Ramen Restaurant,Sporting Goods Shop,Coffee Shop


In [19]:
import matplotlib.cm as cm
import matplotlib.colors as colors
map_clusters = folium.Map(location=[31.230378,121.473658], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(milkteashop_merged['Latitude'], milkteashop_merged['Longitude'], milkteashop_merged['Name'], milkteashop_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

**Because the venues are all different but similar, i split the venues to single words to find out the most words in all venues**

In [20]:
words=[]
for i in range(0,81):
    for j in range(2,6):
        words.append(name_venues_sorted.loc[i][j].split(' '))

In [21]:
from collections import Counter
words=sum(words,[])
word_counts = Counter(words)
top_five = word_counts.most_common(5)
print(top_five)

[('Restaurant', 119), ('Shop', 41), ('Chinese', 29), ('Coffee', 25), ('Food', 25)]


As it shows, restaurant, shop, coffee and food is the most things near milk tea shops
So, i choose some shopping malls in Tokyo, there are always many restaurants and coffee in the shopping malls, i can choose five places from these 12 shopping centres.

In [25]:
df_Tokyo=pd.read_csv('Tokyo.csv')
df_Tokyo

Unnamed: 0,Name,Longitude,Latitude
0,Tokyo Tower,139.745433,35.65858
1,Shinjuku,139.703356,35.693825
2,Ginza,139.766486,35.671223
3,Akihabara,139.774473,35.702259
4,Ikebukuro,139.707731,35.734831
5,Mastuya Ginza,139.766698,35.672256
6,东京上野,139.774154,35.70873
7,Daikanyama,139.704221,35.650547
8,Nakamise-dori Street,139.796454,35.711841
9,Shibuya Center Street,139.699783,35.660046


In [26]:
map_tokyo = folium.Map(location = [35.6803997,139.76901739999], zoom_start = 11)  

for i in range(0,12):
    folium.CircleMarker(
        [df_Tokyo['Latitude'][i], df_Tokyo['Longitude'][i]],
        radius=5,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_tokyo)  

map_tokyo

In [27]:
Tokyo_venues = getNearbyVenues(names=df_Tokyo['Name'],
                                   latitudes=df_Tokyo['Latitude'],
                                   longitudes=df_Tokyo['Longitude']
                                  )

In [28]:
print(Tokyo_venues.shape)
Tokyo_venues.head()

(1154, 7)


Unnamed: 0,Name,Latitude,Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Tokyo Tower,35.65858,139.745433,Tokyo Shiba Tofuya Ukai (東京芝 とうふ屋うかい),35.657531,139.745105,Kaiseki Restaurant
1,Tokyo Tower,35.65858,139.745433,Tokyo Tower (東京タワー),35.658579,139.745442,Monument / Landmark
2,Tokyo Tower,35.65858,139.745433,Nodaiwa (五代目 野田岩),35.658141,139.743424,Unagi Restaurant
3,Tokyo Tower,35.65858,139.745433,TOWERS188,35.657091,139.743916,American Restaurant
4,Tokyo Tower,35.65858,139.745433,La Casa Del Habano by Cigar Club 飯倉本店,35.659006,139.74324,Smoke Shop


In [29]:
compare_venues=Tokyo_venues.append(milkteashop_venues)
compare_venues.shape

(2153, 7)

In [30]:
compare_venues.groupby('Name').count().head()

Unnamed: 0_level_0,Latitude,Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Akihabara,100,100,100,100,100,100
Ameyoko Arcade,100,100,100,100,100,100
COCO奶茶,4,4,4,4,4,4
Coco奶茶(浦城路),30,30,30,30,30,30
DOZI奶茶,2,2,2,2,2,2


In [31]:
# one hot encoding
compare_onehot = pd.get_dummies(compare_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
compare_onehot['Name'] = compare_venues['Name'] 

# move neighborhood column to the first column
fixed_columns = [compare_onehot.columns[-1]] + list(compare_onehot.columns[:-1])
compare_onehot = compare_onehot[fixed_columns]

compare_onehot.head()

Unnamed: 0,Name,Accessories Store,Airport,American Restaurant,Antique Shop,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,...,Wagashi Place,Warehouse Store,Whisky Bar,Wine Bar,Wine Shop,Xinjiang Restaurant,Yakitori Restaurant,Yoga Studio,Yoshoku Restaurant,Yunnan Restaurant
0,Tokyo Tower,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Tokyo Tower,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Tokyo Tower,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Tokyo Tower,0,0,1,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Tokyo Tower,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [32]:
compare_grouped = compare_onehot.groupby('Name').mean().reset_index()

# create columns according to number of top venues
columns = ['Name']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
name_venues_sorted = pd.DataFrame(columns=columns)
name_venues_sorted['Name'] = compare_grouped['Name']

for ind in np.arange(compare_grouped.shape[0]):
    name_venues_sorted.iloc[ind, 1:] = return_most_common_venues(compare_grouped.iloc[ind, :], num_top_venues)

name_venues_sorted.head()
#name_venues_sorted.shape

Unnamed: 0,Name,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Akihabara,Hobby Shop,Ramen Restaurant,Café,Electronics Store,Sake Bar
1,Ameyoko Arcade,Sake Bar,BBQ Joint,Tonkatsu Restaurant,Ramen Restaurant,Japanese Restaurant
2,COCO奶茶,French Restaurant,Grocery Store,Restaurant,Hotel,Fried Chicken Joint
3,Coco奶茶(浦城路),Coffee Shop,Italian Restaurant,Café,Seafood Restaurant,Asian Restaurant
4,DOZI奶茶,Department Store,Metro Station,Yunnan Restaurant,French Restaurant,Gym


In [33]:
# set number of clusters
kclusters = 5

compare_grouped_clustering = compare_grouped.drop('Name', 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=3).fit(compare_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0,
       0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 1, 0, 4, 0, 0, 0, 0, 0, 0, 2, 0, 0,
       2, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0,
       0, 0, 3, 0, 0, 0], dtype=int32)

In [34]:
name_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

compare_merged = df_milktea_shop.drop(['Address'],axis=1).append(df_Tokyo)

compare_merged = compare_merged.join(name_venues_sorted.set_index('Name'), on='Name')

compare_merged = compare_merged.dropna(axis=0,how='any')

compare_merged['Cluster Labels'] = milkteashop_merged['Cluster Labels'].astype('int')

compare_merged.head() 

Unnamed: 0,Name,Longitude,Latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,奶茶会所,121.61763,31.332172,2.0,Airport,Farmers Market,Yunnan Restaurant,Fried Chicken Joint,Gym / Fitness Center
1,奶茶咖啡,121.496488,31.227198,2.0,Hotel,Clothing Store,Furniture / Home Store,Museum,Hotel Bar
2,一点奶茶,121.463955,31.254808,2.0,Supermarket,Hotpot Restaurant,Bubble Tea Shop,Hotel,Metro Station
3,四季奶茶,121.492453,31.222938,0.0,Hotel,Clothing Store,Supermarket,Coffee Shop,Convenience Store
5,tina奶茶,121.361598,31.212718,2.0,Grocery Store,Ramen Restaurant,Sporting Goods Shop,Cantonese Restaurant,Fast Food Restaurant


In [36]:
compare_merged

Unnamed: 0,Name,Longitude,Latitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,奶茶会所,121.617630,31.332172,2.0,Airport,Farmers Market,Yunnan Restaurant,Fried Chicken Joint,Gym / Fitness Center
1,奶茶咖啡,121.496488,31.227198,2.0,Hotel,Clothing Store,Furniture / Home Store,Museum,Hotel Bar
2,一点奶茶,121.463955,31.254808,2.0,Supermarket,Hotpot Restaurant,Bubble Tea Shop,Hotel,Metro Station
3,四季奶茶,121.492453,31.222938,0.0,Hotel,Clothing Store,Supermarket,Coffee Shop,Convenience Store
5,tina奶茶,121.361598,31.212718,2.0,Grocery Store,Ramen Restaurant,Sporting Goods Shop,Cantonese Restaurant,Fast Food Restaurant
6,Coco奶茶(浦城路),121.514556,31.226997,2.0,Coffee Shop,Italian Restaurant,Café,Seafood Restaurant,Asian Restaurant
7,杰拉奶茶,121.419302,31.202409,2.0,Coffee Shop,Japanese Restaurant,French Restaurant,Noodle House,Hotpot Restaurant
8,薇薇奶茶,121.493796,31.222380,2.0,Supermarket,Metro Station,Hotel,Furniture / Home Store,Coffee Shop
9,多多奶茶,121.390400,31.240050,2.0,Asian Restaurant,Coffee Shop,Yunnan Restaurant,Furniture / Home Store,Gym / Fitness Center
10,暖心奶茶,121.506643,31.294503,1.0,Chinese Restaurant,Fast Food Restaurant,Metro Station,Dumpling Restaurant,Hotel


Compare the cluster labels of milk tea shops in shanghai and possible places in Tokyo, we can choose 
1. Tokyo Tower
2. Ginza
3. Daikanyama
4. Shibuya Center Street
5. Ameyoko Arcade