# Best Neighborhood in Tokyo for a New Ballet School

## Introduction

I was contacted by the owner of the ABC schools of ballet, who operates three successful ballet schools in the US. He is planning to open a new school in Tokyo, Japan, and asked me to find the best neighborhood for it.

I will map the ballet schools and studios that already exist in Tokyo to identify the neighborhoods with less competition. In addition, I will explore the venues in each neighborhood using k-means clustering, and then recommend the best neighborhood for his new ballet school.

## Data

The following data will be used:
* Geographic data for the neighborhoods in Tokyo: a CSV file with the complete list of Japanese postal codes, freely downloadable from https://www.aggdata.com/free/japan-postal-codes
* Venue data for the area around each neighborhood, from Foursquare

## Methodology

Import all libraries and modules needed.

In [46]:
import numpy as np
import pandas as pd
import json

from geopy.geocoders import Nominatim

import requests
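# pandas >= 1.0 exposes json_normalize at the top level;
# on older versions use: from pandas.io.json import json_normalize
from pandas import json_normalize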

import matplotlib.cm as cm
import matplotlib.colors as colors

from sklearn.cluster import KMeans

!pip -q install folium
import folium

print('Libraries imported.')

Libraries imported.


**STEP 1: Create a Dataset**

1. Create a pandas dataframe called "df_tokyo" containing the geographic information of all neighborhoods in Tokyo.

In [47]:
# The code was removed by Watson Studio for sharing.

Unnamed: 0,postal code,place name,state,county/province,latitude,longitude
0,490-1401,Rokujocho,Aichi Ken,Yatomi Shi,34.9,137.15
1,490-1402,Gotoyama,Aichi Ken,Yatomi Shi,34.9,137.15
2,490-1403,Toriganjicho,Aichi Ken,Yatomi Shi,34.9,137.15
3,490-1403,Toriganji,Aichi Ken,Yatomi Shi,34.9,137.15
4,490-1404,Ikadaba,Aichi Ken,Yatomi Shi,34.9,137.15
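
The loading cell was stripped by Watson Studio, so how `df_data_1` was built is not shown. The following is a minimal sketch, assuming the postal-code download is a plain CSV with the columns seen in the preview above. The file name `japan_postal_codes.csv` mentioned in the comment is hypothetical, and an in-memory sample stands in for the real file:

```python
import io

import pandas as pd

# Two sample rows copied from the outputs in this notebook; they stand in for the real file.
sample_csv = io.StringIO(
    "postal code,place name,state,county/province,latitude,longitude\n"
    "490-1401,Rokujocho,Aichi Ken,Yatomi Shi,34.9,137.15\n"
    "206-0001,Wada,Tokyo To,Tama Shi,35.6306,139.4399\n"
)

# With the real download this would be pd.read_csv("japan_postal_codes.csv")
# (hypothetical file name). Postal codes are kept as text, not numbers.
df_data_1 = pd.read_csv(sample_csv, dtype={"postal code": str})
print(df_data_1.head())
```

Reading the postal code as a string is a deliberate choice in this sketch, since values such as `206-0001` are identifiers rather than quantities.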


In [48]:
#Create a new dataframe only for Tokyo
df_tokyo = df_data_1[df_data_1['state'] == 'Tokyo To'].reset_index(drop=True)
df_tokyo.head()

Unnamed: 0,postal code,place name,state,county/province,latitude,longitude
0,206-0000,Ikanikeisaiganaibaai,Tokyo To,Tama Shi,35.693,139.6585
1,206-0001,Wada,Tokyo To,Tama Shi,35.6306,139.4399
2,206-0002,Ichinomiya,Tokyo To,Tama Shi,35.693,139.6585
3,206-0003,Higashiteragata,Tokyo To,Tama Shi,35.693,139.6585
4,206-0004,Mogusa,Tokyo To,Tama Shi,35.693,139.6585


In [49]:
df_tokyo.shape

(3809, 6)
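
The preview of `df_tokyo` shows several place names sharing the same coordinates (35.693, 139.6585), so the 3,809 rows may correspond to far fewer distinct points on the map. A small sketch of how that could be checked, using the five rows from the output above in place of the full dataframe:

```python
import pandas as pd

# Rows copied from the df_tokyo.head() output; the full df_tokyo would be used in practice.
sample = pd.DataFrame({
    "place name": ["Ikanikeisaiganaibaai", "Wada", "Ichinomiya", "Higashiteragata", "Mogusa"],
    "latitude": [35.693, 35.6306, 35.693, 35.693, 35.693],
    "longitude": [139.6585, 139.4399, 139.6585, 139.6585, 139.6585],
})

# Count how many places fall on each (latitude, longitude) pair, largest group first.
places_per_point = (
    sample.groupby(["latitude", "longitude"])
    .size()
    .sort_values(ascending=False)
)
print(places_per_point)
print("distinct points:", len(places_per_point))
```

If most rows collapse onto a handful of points, the markers drawn in the next step will overlap, and any per-neighborhood venue query would return identical results for those rows.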

2. Create a map of Tokyo with the neighborhoods (places)

In [50]:
# get the location of Tokyo
address = 'Tokyo, Japan'

geolocator = Nominatim(user_agent="japan_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Tokyo are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Tokyo are 35.6828387, 139.7594549.


In [12]:
# create map of Tokyo with the neighborhoods (places) superimposed on top
map_tokyo = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, label in zip(df_tokyo['latitude'], df_tokyo['longitude'], df_tokyo['place name']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
    ).add_to(map_tokyo)  # parse_html belongs to folium.Popup, not CircleMarker
    
map_tokyo

3. Define Foursquare credentials and version.

In [51]:
# The code was removed by Watson Studio for sharing.

The credentials and version were defined.
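
The credentials cell is hidden, so the exact definitions are not shown. One way to keep secrets out of a shared notebook is to read them from environment variables. This is only a sketch: the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `load_foursquare_credentials` are hypothetical, and the version string is taken from the request URL shown in the next step.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret) read from environment variables.

    The variable names are hypothetical; any mapping can be passed for testing.
    """
    client_id = env.get("FOURSQUARE_CLIENT_ID")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError(
            "Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first"
        )
    return client_id, client_secret


# API version date, as it appears in the request URL in the next step.
VERSION = "20200616"

# Usage with a stand-in mapping instead of the real environment:
CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials(
    {"FOURSQUARE_CLIENT_ID": "my-id", "FOURSQUARE_CLIENT_SECRET": "my-secret"}
)
```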


4. Find the ballet schools in Tokyo.

In [52]:
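# Search for ballet schools
# Note: place_latitude and place_longitude are not defined in the cells shown here;
# the printed URL below suggests they hold the first df_tokyo coordinate (35.693, 139.6585).
search_query = 'ballet'
radius = 67000  # search radius in meters
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}'.format(CLIENT_ID, CLIENT_SECRET, place_latitude, place_longitude, VERSION, search_query, radius)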
url

'https://api.foursquare.com/v2/venues/search?client_id=<CLIENT_ID>&client_secret=<CLIENT_SECRET>&ll=35.693000000000005,139.6585&v=20200616&query=ballet&radius=67000'
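
Because the request URL embeds the client ID and secret, displaying it in a cell output also displays them. The sketch below builds the same kind of URL from a parameter dictionary and masks the secrets before printing. The helpers `build_search_url` and `redact` are hypothetical names introduced here, and the credentials are placeholders:

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://api.foursquare.com/v2/venues/search"


def build_search_url(client_id, client_secret, lat, lng, version, query, radius):
    """Assemble the search URL; urlencode escapes each value safely."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "ll": f"{lat},{lng}",
        "v": version,
        "query": query,
        "radius": radius,  # meters
    }
    return SEARCH_ENDPOINT + "?" + urlencode(params)


def redact(url, secrets):
    """Replace each secret value with *** so the URL is safe to display."""
    for secret in secrets:
        url = url.replace(secret, "***")
    return url


url = build_search_url("MY_ID", "MY_SECRET", 35.693, 139.6585, "20200616", "ballet", 67000)
print(redact(url, ["MY_ID", "MY_SECRET"]))
```

Note that `urlencode` percent-encodes the comma in `ll`, which is equivalent to the literal comma once the server decodes the query string.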

In [53]:
#Examine the results
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5ee93a31226d4f12e81174b7'},
 'response': {'venues': [{'id': '4eed97c629c28028df0273be',
    'name': 'STUDIO LE BALLET CONTRASTE',
    'location': {'address': '高円寺南2-16-3',
     'lat': 35.69812,
     'lng': 139.648962,
     'labeledLatLngs': [{'label': 'display',
       'lat': 35.69812,
       'lng': 139.648962}],
     'distance': 1033,
     'cc': 'JP',
     'city': '杉並区',
     'state': '東京都',
     'country': '日本',
     'formattedAddress': ['高円寺南2-16-3', '杉並区, 東京都', '日本']},
    'categories': [{'id': '4bf58dd8d48988d134941735',
      'name': 'Dance Studio',
      'pluralName': 'Dance Studios',
      'shortName': 'Dance Studio',
      'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/arts_entertainment/performingarts_dancestudio_',
       'suffix': '.png'},
      'primary': True}],
    'referralId': 'v-1592343236',
    'hasPerk': False},
   {'id': '4e4b474ba8097d9c84ba3870',
    'name': 'K-BALLET GATE 恵比寿スタジオ',
    'location': {'address': '恵比寿
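
Before converting the response into a dataframe, it can help to confirm that the request succeeded and to pull out only the fields of interest. A minimal sketch, assuming the response layout shown above (`meta.code` plus a list under `response.venues`); `summarize_venues` is a hypothetical helper and the sample dictionary is a trimmed copy of the first venue:

```python
def summarize_venues(results):
    """Return (name, lat, lng) tuples, or raise if the API reported an error."""
    code = results.get("meta", {}).get("code")
    if code != 200:
        raise ValueError(f"Foursquare request failed with code {code}")
    venues = results.get("response", {}).get("venues", [])
    return [
        (venue["name"], venue["location"]["lat"], venue["location"]["lng"])
        for venue in venues
    ]


# Trimmed copy of the first venue from the output above.
sample_results = {
    "meta": {"code": 200},
    "response": {
        "venues": [
            {
                "name": "STUDIO LE BALLET CONTRASTE",
                "location": {"lat": 35.69812, "lng": 139.648962},
            }
        ]
    },
}

summary = summarize_venues(sample_results)
print(summary)
```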

5. Create a dataframe containing the ballet school information in Tokyo.

In [56]:
# create a dataframe containing information about the above ballet schools
# assign relevant part of JSON to ballet
ballet = results['response']['venues']

# transform venues into a dataframe
ballet_df = json_normalize(ballet)
ballet_df.head()

Unnamed: 0,categories,hasPerk,id,location.address,location.cc,location.city,location.country,location.crossStreet,location.distance,location.formattedAddress,location.labeledLatLngs,location.lat,location.lng,location.postalCode,location.state,name,referralId
0,"[{'id': '4bf58dd8d48988d134941735', 'name': 'D...",False,4eed97c629c28028df0273be,高円寺南2-16-3,JP,杉並区,日本,,1033,"[高円寺南2-16-3, 杉並区, 東京都, 日本]","[{'label': 'display', 'lat': 35.69812, 'lng': ...",35.69812,139.648962,,東京都,STUDIO LE BALLET CONTRASTE,v-1592343236
1,"[{'id': '4bf58dd8d48988d134941735', 'name': 'D...",False,4e4b474ba8097d9c84ba3870,恵比寿4-17-3,JP,東京,日本,カゲオカビル B1F,7270,"[恵比寿4-17-3 (カゲオカビル B1F), 渋谷区, 東京都, 150-0013, 日本]","[{'label': 'display', 'lat': 35.6441625025635,...",35.644163,139.711872,150-0013,東京都,K-BALLET GATE 恵比寿スタジオ,v-1592343236
2,"[{'id': '4bf58dd8d48988d134941735', 'name': 'D...",False,4c6116d0924b76b02135f9b9,目黒4-26-4,JP,目黒区,日本,目黒通り,7945,"[目黒4-26-4 (目黒通り), 目黒区, 東京都, 153-0063, 日本]","[{'label': 'display', 'lat': 35.6308178, 'lng'...",35.630818,139.70163,153-0063,東京都,The Tokyo Ballet (東京バレエ団),v-1592343236
3,"[{'id': '4bf58dd8d48988d134941735', 'name': 'D...",False,53aa7476498e13bffdae1ada,,JP,,日本,,6385,[日本],"[{'label': 'display', 'lat': 35.74465002158848...",35.74465,139.689236,,,Espace de Ballet (エスパス・ドゥ・バレエ教室),v-1592343236
4,"[{'id': '4bf58dd8d48988d1ad941735', 'name': 'T...",False,4e45f261b0fb93df2703c61d,,JP,,日本,,2298,[日本],"[{'label': 'display', 'lat': 35.697715, 'lng':...",35.697715,139.683256,,,Ballet Art Medical Academy,v-1592343236


In [57]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in ballet_df.columns if col.startswith('location.')] + ['id']
# .copy() makes the later column assignment act on an independent dataframe
ballet_filtered = ballet_df.loc[:, filtered_columns].copy()

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
ballet_filtered['categories'] = ballet_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
ballet_filtered.columns = [column.split('.')[-1] for column in ballet_filtered.columns]

ballet_filtered.head()

Unnamed: 0,name,categories,address,cc,city,country,crossStreet,distance,formattedAddress,labeledLatLngs,lat,lng,postalCode,state,id
0,STUDIO LE BALLET CONTRASTE,Dance Studio,高円寺南2-16-3,JP,杉並区,日本,,1033,"[高円寺南2-16-3, 杉並区, 東京都, 日本]","[{'label': 'display', 'lat': 35.69812, 'lng': ...",35.69812,139.648962,,東京都,4eed97c629c28028df0273be
1,K-BALLET GATE 恵比寿スタジオ,Dance Studio,恵比寿4-17-3,JP,東京,日本,カゲオカビル B1F,7270,"[恵比寿4-17-3 (カゲオカビル B1F), 渋谷区, 東京都, 150-0013, 日本]","[{'label': 'display', 'lat': 35.6441625025635,...",35.644163,139.711872,150-0013,東京都,4e4b474ba8097d9c84ba3870
2,The Tokyo Ballet (東京バレエ団),Dance Studio,目黒4-26-4,JP,目黒区,日本,目黒通り,7945,"[目黒4-26-4 (目黒通り), 目黒区, 東京都, 153-0063, 日本]","[{'label': 'display', 'lat': 35.6308178, 'lng'...",35.630818,139.70163,153-0063,東京都,4c6116d0924b76b02135f9b9
3,Espace de Ballet (エスパス・ドゥ・バレエ教室),Dance Studio,,JP,,日本,,6385,[日本],"[{'label': 'display', 'lat': 35.74465002158848...",35.74465,139.689236,,,53aa7476498e13bffdae1ada
4,Ballet Art Medical Academy,Trade School,,JP,,日本,,2298,[日本],"[{'label': 'display', 'lat': 35.697715, 'lng':...",35.697715,139.683256,,,4e45f261b0fb93df2703c61d
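
The `distance` column appears to be the distance in meters between each venue and the query point passed in `ll`. This can be sanity-checked with a haversine calculation. The sketch below assumes a spherical Earth with a mean radius of 6,371 km, and `haversine_m` is a helper introduced here, not part of the original notebook:

```python
from math import asin, cos, radians, sin, sqrt


def haversine_m(lat1, lng1, lat2, lng2, radius_m=6_371_000):
    """Great-circle distance in meters between two (lat, lng) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lng2 - lng1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * radius_m * asin(sqrt(a))


# Query point from the request URL, and the first venue in the table above.
query_point = (35.693, 139.6585)
first_venue = (35.69812, 139.648962)

d = haversine_m(*query_point, *first_venue)
print(round(d), "m")
```

For the first row the result should come out close to the 1033 reported by Foursquare; small differences are expected because the Earth model used by the API is not known.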
