# Where to open a new Italian restaurant in Central Yokohama, Naka-ward?

# Introduction
Restaurant business is easy to start and can be profitable when it gets many customers.  I have a client who spent long time in Venice, Italy, to run small Italian cafe/restaurant as a chef who is thinking to open a new Italian restaurant in Yokohama because the client thinks the city has enough customers for good Italian foods but misses good Italian restaurants.

# Business Problem
I, as a business consultant, agree that Yokohama has lots of potential for Italian restaurant business, but the city is large, and the location is very important to the restaurant business.  Within Yokohama, Central Yokohama, especially Naka-ward, is one of the most commercially developed wards in the city, thus it is considered attractive area to open a new Italian restaurant.  Yet, Naka-ward has 21 km2 (= square kilometer) with more than 150 thousands people living in the area and day time population becomes 240 thousands people with tourists and office workers visiting, a large and densed area.  Within Naka-ward, there are several commercial districts, and it is important to understand their characteristics, and even which Town in disctrict is a better location for Italian restaurant.

# Data
1. Japan Postal Office Postal Code database:
Postal Code is an effective way to segment District, which equates a group of Towns in this case, of the Naka-ward.  There are 106 Postal codes or Towns in Naka-ward. Postal Code is provided by the Japan Postal office in CSV format.
2. Tree-maps database
Tree-maps database is a web base geo-coding service used here to associate geographical coordinate with Postal code, so that each Postal Code or Town can have geographical coordinate that then work well with geographical mapping service, Folium, and area characteristic data base, Foursquare. 
3. Foursuare API:
Foursquare is a location-based social networking website, software for mobile devices, and game. Users "check-in" at venues using text messaging or a device specific application.  We use its API to obtain Venues and their Category to characterize the area.
4. Folium:
Folium builds on the data wrangling strengths of the Python ecosystem and the mapping strengths of the leaflet.js library. Manipulate the data in Python, then visualize it in on a Leaflet map via folium.  This visualization data and function are used to understand density and orientation of Towns in Cluster and Venues on Naka-ward map.

# Methodology
Identify best District within Naka-ward
1. Segment Naka-ward by Town: using Japan Post Office postal code to segment the ward into multiple Towns
2. Attach Coordinates to Town with Tree-maps database function for Folium map visualization and Foursquare venue data download 
3. Map Towns over Naka-ward geography with Folium 
4. Cluster Towns rolling up to District: Use Foursquare venue data to cluster Towns to District by k-mean method
5. Review Districts to understand their characteristics and name them for easier reference
6. Identify current Italian Restaurants in Naka-ward using Foursquare
 

### 1. Segment Naka-ward by Town: using Japan Post Office postal code
Download the Town postal code list of Naka-ward from Post Office web site to local CSV file (= Town.csv)

https://api.nipponsoft.co.jp/zipcode/%E7%A5%9E%E5%A5%88%E5%B7%9D%E7%9C%8C%E6%A8%AA%E6%B5%9C%E5%B8%82%E4%B8%AD%E5%8C%BA

Use Geo-coding web service by uploading the CSV file to get latitude and longitude coordinates by postal code

https://www.tree-maps.com/zip-code-to-coordinate/

Copy the coordinates and paste onto local CSV file (= Coordinate.csv)

In [3]:
# import pandas library
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analsysis

In [2]:
# The code was removed by Watson Studio for sharing.

Unnamed: 0,Postal_Code,Pref,Ward,Town,Address
0,231-0001,神奈川県,横浜市中区,新港,神奈川県横浜市中区新港
1,231-0002,神奈川県,横浜市中区,海岸通,神奈川県横浜市中区海岸通
2,231-0003,神奈川県,横浜市中区,北仲通,神奈川県横浜市中区北仲通
3,231-0004,神奈川県,横浜市中区,元浜町,神奈川県横浜市中区元浜町
4,231-0005,神奈川県,横浜市中区,本町,神奈川県横浜市中区本町


### 2. Attach Coordinates to Town with Tree-map database function for Folium map visualization and Foursquare venue data download

In [4]:
# Import local CSV file, Coordinate.csv, to pandas dataframe on the notebook 
body = client_c3900fb2224740ba89b0f99a3170af4a.get_object(Bucket='segmentingandclusteringneighborho-donotdelete-pr-v8wucrlwwluvmh',Key='Coordinate.csv')['Body']
# add missing __iter__ method, so pandas accepts body as file-like object
if not hasattr(body, "__iter__"): body.__iter__ = types.MethodType( __iter__, body )

df_Coordinate = pd.read_csv(body)
df_Coordinate.head()


Unnamed: 0,Latitude,Longitude,Postal_Code
0,35.447439,139.636824,231-0012
1,35.442478,139.622418,231-0051
2,35.439313,139.626795,231-0057
3,35.420654,139.648998,231-0834
4,35.436849,139.641109,231-0868


In [5]:
# Check df_Town dataframe -> there are many NaN rows imported
df_Town

Unnamed: 0,Postal_Code,Pref,Ward,Town,Address
0,231-0001,神奈川県,横浜市中区,新港,神奈川県横浜市中区新港
1,231-0002,神奈川県,横浜市中区,海岸通,神奈川県横浜市中区海岸通
2,231-0003,神奈川県,横浜市中区,北仲通,神奈川県横浜市中区北仲通
3,231-0004,神奈川県,横浜市中区,元浜町,神奈川県横浜市中区元浜町
4,231-0005,神奈川県,横浜市中区,本町,神奈川県横浜市中区本町
...,...,...,...,...,...
208,,,,,
209,,,,,
210,,,,,
211,,,,,


In [6]:
# Remove NaN rows on df_Town
df_Town.dropna(how='all',axis=0)


Unnamed: 0,Postal_Code,Pref,Ward,Town,Address
0,231-0001,神奈川県,横浜市中区,新港,神奈川県横浜市中区新港
1,231-0002,神奈川県,横浜市中区,海岸通,神奈川県横浜市中区海岸通
2,231-0003,神奈川県,横浜市中区,北仲通,神奈川県横浜市中区北仲通
3,231-0004,神奈川県,横浜市中区,元浜町,神奈川県横浜市中区元浜町
4,231-0005,神奈川県,横浜市中区,本町,神奈川県横浜市中区本町
...,...,...,...,...,...
101,231-0864,神奈川県,横浜市中区,千代崎町,神奈川県横浜市中区千代崎町
102,231-0865,神奈川県,横浜市中区,北方町,神奈川県横浜市中区北方町
103,231-0866,神奈川県,横浜市中区,柏葉,神奈川県横浜市中区柏葉
104,231-0867,神奈川県,横浜市中区,打越,神奈川県横浜市中区打越


In [7]:
# Check df_Coordinate dataframe -> looks fine
df_Coordinate

Unnamed: 0,Latitude,Longitude,Postal_Code
0,35.447439,139.636824,231-0012
1,35.442478,139.622418,231-0051
2,35.439313,139.626795,231-0057
3,35.420654,139.648998,231-0834
4,35.436849,139.641109,231-0868
...,...,...,...
101,35.443559,139.640082,231-0022
102,35.446519,139.632680,231-0041
103,35.439292,139.642868,231-0024
104,35.441652,139.627941,231-0056


In [8]:
# Merge two tables joined by Postal_Code
df_merged=pd.merge(df_Town,df_Coordinate,on='Postal_Code')
df_merged

Unnamed: 0,Postal_Code,Pref,Ward,Town,Address,Latitude,Longitude
0,231-0001,神奈川県,横浜市中区,新港,神奈川県横浜市中区新港,35.454370,139.641190
1,231-0002,神奈川県,横浜市中区,海岸通,神奈川県横浜市中区海岸通,35.450641,139.642730
2,231-0003,神奈川県,横浜市中区,北仲通,神奈川県横浜市中区北仲通,35.449926,139.637674
3,231-0004,神奈川県,横浜市中区,元浜町,神奈川県横浜市中区元浜町,35.449272,139.640069
4,231-0005,神奈川県,横浜市中区,本町,神奈川県横浜市中区本町,35.449386,139.637551
...,...,...,...,...,...,...,...
101,231-0864,神奈川県,横浜市中区,千代崎町,神奈川県横浜市中区千代崎町,35.434236,139.655064
102,231-0865,神奈川県,横浜市中区,北方町,神奈川県横浜市中区北方町,35.433637,139.658545
103,231-0866,神奈川県,横浜市中区,柏葉,神奈川県横浜市中区柏葉,35.432117,139.643034
104,231-0867,神奈川県,横浜市中区,打越,神奈川県横浜市中区打越,35.434222,139.637952


In [9]:
# Remove unnecessary columns from the dataframe
drop_col=['Pref','Ward','Address']
df_Town=df_merged.drop(drop_col,axis=1)

In [10]:
df_Town

Unnamed: 0,Postal_Code,Town,Latitude,Longitude
0,231-0001,新港,35.454370,139.641190
1,231-0002,海岸通,35.450641,139.642730
2,231-0003,北仲通,35.449926,139.637674
3,231-0004,元浜町,35.449272,139.640069
4,231-0005,本町,35.449386,139.637551
...,...,...,...,...
101,231-0864,千代崎町,35.434236,139.655064
102,231-0865,北方町,35.433637,139.658545
103,231-0866,柏葉,35.432117,139.643034
104,231-0867,打越,35.434222,139.637952


### 3. Map Towns over Naka-ward geography with Folium

In [11]:
# Import libraries for Mapping (and following data analysis and charting)
!pip install folium
import folium
from folium import plugins
from geopy.geocoders import Nominatim
import numpy as np
import requests
from bs4 import BeautifulSoup
import os
from sklearn.cluster import KMeans
!pip install msgpack
import matplotlib.cm as cm
import matplotlib.colors as colors

Collecting folium
  Downloading folium-0.12.1-py2.py3-none-any.whl (94 kB)
[K     |████████████████████████████████| 94 kB 5.4 MB/s  eta 0:00:01
[?25hCollecting branca>=0.3.0
  Downloading branca-0.4.2-py3-none-any.whl (24 kB)
Installing collected packages: branca, folium
Successfully installed branca-0.4.2 folium-0.12.1
Collecting msgpack
  Downloading msgpack-1.0.2-cp37-cp37m-manylinux1_x86_64.whl (273 kB)
[K     |████████████████████████████████| 273 kB 15.6 MB/s eta 0:00:01
[?25hInstalling collected packages: msgpack
Successfully installed msgpack-1.0.2


In [12]:
# Define a center of Naka-ward map = the 84th row '231-0845 立野' is the center of Naka-ward
LAT=df_Town.Latitude.iloc[84]
LNG=df_Town.Longitude.iloc[84]

In [13]:
# Lay out each Ward coordinate on Yokohama map
loc=np.array([df_Town.Latitude,df_Town.Longitude]).T
map_Naka=folium.Map([LAT,LNG],zoom_start=14)
plugins.MarkerCluster(loc).add_to(map_Naka)
map_Naka

### 4. Cluster Towns rolling up to District: Use Foursquare venue data to cluster Towns to District by k-mean method

In [14]:
# Access Foursquare API to get venue data.  Define Client_ID, Client_Secret of mine.and VERSION
CLIENT_ID = 'Y2QO2VYM43BWS2OEAQ1PWCWVDKKELAE5S1QWBKQ3E2XWVNH4' 
CLIENT_SECRET = 'HTZH3ORGXVMLQKSLQWB3EW2BJCJBPFSR1IDIICWSIHTPRO0G' 
VERSION = '20201124'

In [15]:
# Set the function to retrieve venues
import requests
 
radius = 500
LIMIT = 500
 
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])
 
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Town', 
                  'Town_Latitude', 
                  'Town_Longitude', 
                  'Venue', 
                  'Venue_Lat', 
                  'Venue_Long', 
                  'Venue_Category']
    
    return(nearby_venues)

In [16]:
# Apply above function to Town list 
Naka_venue = getNearbyVenues(names=df_Town['Town'],
                                   latitudes=df_Town['Latitude'],
                                   longitudes=df_Town['Longitude']
                                  )

新港
海岸通
北仲通
元浜町
本町
南仲通
弁天通
太田町
相生町
住吉町
常盤町
尾上町
真砂町
港町
日本大通
横浜公園
山下町
吉浜町
松影町
寿町
扇町
翁町
万代町
不老町
長者町
三吉町
千歳町
山田町
富士見町
山吹町
吉田町
福富町西通
福富町仲通
福富町東通
伊勢佐木町
末広町
羽衣町
蓬莱町
赤門町
英町
初音町
黄金町
末吉町
若葉町
曙町
弥生町
内田町
桜木町
花咲町
野毛町
宮川町
日ノ出町
新山下
小港町
本牧十二天
本牧宮原
和田山
本牧町
本牧ふ頭
錦町
かもめ町
豊浦町
千鳥町
南本牧
本牧原
本牧元町
本牧大里町
本牧三之谷
本牧間門
本牧荒井
本牧和田
矢口台
本牧緑ケ丘
本牧満坂
池袋
根岸加曽台
根岸町
滝之上
豆口台
仲尾台
妙香寺台
上野町
本郷町
西之谷町
立野
大和町
竹之丸
鷺山
麦田町
山元町
西竹之丸
根岸台
根岸旭台
寺久保
簑沢
塚越
大芝台
大平町
元町
山手町
諏訪町
千代崎町
北方町
柏葉
打越
石川町


In [17]:
Naka_venue.shape

(5079, 7)

In [18]:
# Check how the dataframe looks like
Naka_venue

Unnamed: 0,Town,Town_Latitude,Town_Longitude,Venue,Venue_Lat,Venue_Long,Venue_Category
0,新港,35.454370,139.641190,Yokohama Hammerhead (横浜ハンマーヘッド),35.455791,139.641823,Shopping Mall
1,新港,35.454370,139.641190,Port Terrace Cafe,35.454525,139.640704,Café
2,新港,35.454370,139.641190,Yokohama Red Brick Warehouse (横浜赤レンガ倉庫),35.452296,139.642874,Historic Site
3,新港,35.454370,139.641190,Akarenga Park (赤レンガパーク),35.454207,139.643201,Park
4,新港,35.454370,139.641190,Granny Smith Apple Pie & Coffee,35.452569,139.642733,Pie Shop
...,...,...,...,...,...,...,...
5074,石川町,35.436849,139.641109,プリンセスガーデン ヨコハマ,35.440429,139.639518,Bridal Shop
5075,石川町,35.436849,139.641109,QUO VADIS,35.437902,139.645795,Italian Restaurant
5076,石川町,35.436849,139.641109,柏葉 バス停,35.432482,139.641809,Bus Stop
5077,石川町,35.436849,139.641109,forgame,35.432570,139.639677,Shoe Store


In [19]:
# Count Venues by Town
Naka_venue.groupby('Town').count()

Unnamed: 0_level_0,Town_Latitude,Town_Longitude,Venue,Venue_Lat,Venue_Long,Venue_Category
Town,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
かもめ町,3,3,3,3,3,3
万代町,60,60,60,60,60,60
三吉町,32,32,32,32,32,32
上野町,22,22,22,22,22,22
不老町,52,52,52,52,52,52
...,...,...,...,...,...,...
野毛町,100,100,100,100,100,100
長者町,72,72,72,72,72,72
鷺山,14,14,14,14,14,14
麦田町,14,14,14,14,14,14


In [20]:
# How many unique categories can be found from venues
print('There are {} uniques categories.'.format(len(Naka_venue['Venue_Category'].unique())))

There are 210 uniques categories.


In [21]:
# what are those 210 unique venue categories?
print(Naka_venue['Venue_Category'].unique())

['Shopping Mall' 'Café' 'Historic Site' 'Park' 'Pie Shop' 'Jazz Club'
 'Hot Spring' 'Music Venue' 'Museum' 'Restaurant' 'Hawaiian Restaurant'
 'Australian Restaurant' 'Candy Store' 'Playground' 'Italian Restaurant'
 'Donut Shop' 'Theme Park' 'Gift Shop' 'American Restaurant' 'Hotel'
 'Theme Park Ride / Attraction' 'Coffee Shop' 'Chocolate Shop'
 'Paella Restaurant' 'Hotel Bar' 'Sukiyaki Restaurant' 'History Museum'
 'Japanese Restaurant' 'Wedding Hall' 'Food Court' 'Hobby Shop'
 'Shabu-Shabu Restaurant' 'Pizza Place' 'Discount Store' 'Multiplex'
 'Arcade' 'Ice Cream Shop' 'Burger Joint' 'Indian Restaurant'
 'Yoshoku Restaurant' 'Buffet' 'Takoyaki Place' 'Beer Bar'
 'Seafood Restaurant' 'Convenience Store' 'Chinese Restaurant'
 'Mexican Restaurant' 'Fast Food Restaurant' 'Japanese Family Restaurant'
 'Gourmet Shop' 'Arts & Crafts Store' 'Clothing Store' 'Boutique' 'Pier'
 'Motorcycle Shop' 'Steakhouse' 'Coffee Roaster' 'Bakery' 'Plaza'
 'Scenic Lookout' 'Outdoor Sculpture' 'BBQ Joint' '

## 4. Cluster Towns rolling up to District (continued) with k-mean method

In [22]:
# one hot encoding
Naka_onehot = pd.get_dummies(Naka_venue[['Venue_Category']], prefix="", prefix_sep="")
# add Town column back to dataframe
Naka_onehot['Town'] = Naka_venue['Town'] 
# move Town column to the first column
fixed_columns = [Naka_onehot.columns[-1]] + list(Naka_onehot.columns[:-1])
Naka_onehot=Naka_onehot[fixed_columns]
Naka_onehot.head()


Unnamed: 0,Town,ATM,Accessories Store,American Restaurant,Aquarium,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,...,Video Store,Wagashi Place,Wedding Hall,Whisky Bar,Wine Bar,Wings Joint,Yakitori Restaurant,Yoshoku Restaurant,Zoo,Zoo Exhibit
0,新港,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,新港,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,新港,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,新港,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,新港,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [23]:
# Check Naka_onehot dataframe size -> rows and columns look okay
Naka_onehot.shape

(5079, 211)

In [24]:
# Group rows by Town and by the mean of occurance of each Category
Naka_grouped = Naka_onehot.groupby('Town').mean().reset_index()
Naka_grouped.head()

Unnamed: 0,Town,ATM,Accessories Store,American Restaurant,Aquarium,Arcade,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,...,Video Store,Wagashi Place,Wedding Hall,Whisky Bar,Wine Bar,Wings Joint,Yakitori Restaurant,Yoshoku Restaurant,Zoo,Zoo Exhibit
0,かもめ町,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,万代町,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.016667,0.016667,0.0,0.0,0.0
2,三吉町,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.0
3,上野町,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,不老町,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.019231,0.019231,0.0,0.0,0.0


In [25]:
# Check the dataframe shape -> looks okay as there are 106 Towns and 211 categories
Naka_grouped.shape

(105, 211)

In [26]:
# print each Town along with the top 5 most common venues
num_top_venues = 5

for hood in Naka_grouped['Town']:
    print("----"+hood+"----")
    temp = Naka_grouped[Naka_grouped['Town'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----かもめ町----
               venue  freq
0       Intersection  0.33
1        Bus Station  0.33
2  Food & Drink Shop  0.33
3                ATM  0.00
4               Port  0.00


----万代町----
                venue  freq
0    Ramen Restaurant  0.13
1   Convenience Store  0.08
2                Café  0.05
3    Baseball Stadium  0.05
4  Chinese Restaurant  0.03


----三吉町----
                        venue  freq
0           Convenience Store  0.38
1            Ramen Restaurant  0.09
2          Donburi Restaurant  0.06
3  Japanese Family Restaurant  0.06
4               Grocery Store  0.06


----上野町----
                venue  freq
0       Historic Site  0.18
1   Convenience Store  0.18
2  Chinese Restaurant  0.14
3          Restaurant  0.09
4      Tennis Stadium  0.05


----不老町----
               venue  freq
0  Convenience Store  0.15
1   Ramen Restaurant  0.12
2   Baseball Stadium  0.10
3           Sake Bar  0.04
4    Soba Restaurant  0.04


----仲尾台----
               venue  freq
0           Bu

In [27]:
# sort the categories by Town in descending order
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [28]:
# Put this to new data frame
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top 10 common venues (= common category type)
columns = ['Town']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
Town_venues_sorted = pd.DataFrame(columns=columns)
Town_venues_sorted['Town'] = Naka_grouped['Town']

for ind in np.arange(Naka_grouped.shape[0]):
    Town_venues_sorted.iloc[ind, 1:] = return_most_common_venues(Naka_grouped.iloc[ind, :], num_top_venues)

Town_venues_sorted.head()

Unnamed: 0,Town,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,かもめ町,Intersection,Food & Drink Shop,Bus Station,Zoo Exhibit,Electronics Store,Fried Chicken Joint,French Restaurant,Fountain,Food Court,Food
1,万代町,Ramen Restaurant,Convenience Store,Baseball Stadium,Café,Tempura Restaurant,Soba Restaurant,Chinese Restaurant,Arcade,Indian Restaurant,Beer Garden
2,三吉町,Convenience Store,Ramen Restaurant,Japanese Family Restaurant,Grocery Store,Donburi Restaurant,Bar,Park,Wine Bar,Comedy Club,Bus Station
3,上野町,Convenience Store,Historic Site,Chinese Restaurant,Restaurant,History Museum,Trail,Tennis Stadium,Tennis Court,Café,Museum
4,不老町,Convenience Store,Ramen Restaurant,Baseball Stadium,Soba Restaurant,Café,Sake Bar,Rock Club,Drugstore,Sporting Goods Shop,Soup Place


In [29]:
# set number of clusters
kclusters = 5

Naka_grouped_clustering = Naka_grouped.drop('Town', 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(Naka_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([0, 0, 1, 1, 1, 0, 1, 0, 0, 0], dtype=int32)

In [30]:
# Create a new dataframe that includes the cluster and top 10 venues for each Town
# add clustering Labels
Town_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

Naka_merged = df_Town

# merge Naka_grouped with df_Town to add latitude/longitude for each Town
Naka_merged = Naka_merged.join(Town_venues_sorted.set_index('Town'), on='Town')

Naka_merged.head()

Unnamed: 0,Postal_Code,Town,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,231-0001,新港,35.45437,139.64119,0.0,Café,Shopping Mall,Seafood Restaurant,Convenience Store,Italian Restaurant,Hotel,Coffee Shop,Park,Japanese Restaurant,Mexican Restaurant
1,231-0002,海岸通,35.450641,139.64273,0.0,Café,History Museum,Historic Site,Clothing Store,Convenience Store,Shopping Mall,Italian Restaurant,Pie Shop,Hawaiian Restaurant,American Restaurant
2,231-0003,北仲通,35.449926,139.637674,0.0,Café,Italian Restaurant,Hotel,Bed & Breakfast,Japanese Restaurant,Tonkatsu Restaurant,History Museum,Sake Bar,Coffee Shop,Soba Restaurant
3,231-0004,元浜町,35.449272,139.640069,0.0,Café,Hotel,History Museum,Italian Restaurant,Jazz Club,BBQ Joint,Sake Bar,Coffee Shop,Convenience Store,Bed & Breakfast
4,231-0005,本町,35.449386,139.637551,0.0,Café,Italian Restaurant,Coffee Shop,Soba Restaurant,Hotel,Bed & Breakfast,Japanese Restaurant,Sake Bar,History Museum,Tonkatsu Restaurant


In [31]:
# Check Naka_merged dataframe if Common venues are reasonably differentiated among Towns
pd.set_option('display.max_rows',500)
Naka_merged

Unnamed: 0,Postal_Code,Town,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,231-0001,新港,35.45437,139.64119,0.0,Café,Shopping Mall,Seafood Restaurant,Convenience Store,Italian Restaurant,Hotel,Coffee Shop,Park,Japanese Restaurant,Mexican Restaurant
1,231-0002,海岸通,35.450641,139.64273,0.0,Café,History Museum,Historic Site,Clothing Store,Convenience Store,Shopping Mall,Italian Restaurant,Pie Shop,Hawaiian Restaurant,American Restaurant
2,231-0003,北仲通,35.449926,139.637674,0.0,Café,Italian Restaurant,Hotel,Bed & Breakfast,Japanese Restaurant,Tonkatsu Restaurant,History Museum,Sake Bar,Coffee Shop,Soba Restaurant
3,231-0004,元浜町,35.449272,139.640069,0.0,Café,Hotel,History Museum,Italian Restaurant,Jazz Club,BBQ Joint,Sake Bar,Coffee Shop,Convenience Store,Bed & Breakfast
4,231-0005,本町,35.449386,139.637551,0.0,Café,Italian Restaurant,Coffee Shop,Soba Restaurant,Hotel,Bed & Breakfast,Japanese Restaurant,Sake Bar,History Museum,Tonkatsu Restaurant
5,231-0006,南仲通,35.448198,139.638584,0.0,Café,Coffee Shop,Bed & Breakfast,Convenience Store,Tonkatsu Restaurant,History Museum,Japanese Restaurant,Museum,Park,Soba Restaurant
6,231-0007,弁天通,35.448037,139.637668,0.0,Café,Coffee Shop,Convenience Store,Bed & Breakfast,Japanese Restaurant,Bar,Tonkatsu Restaurant,Chinese Restaurant,Udon Restaurant,Sake Bar
7,231-0011,太田町,35.447828,139.637055,0.0,Coffee Shop,Café,Bed & Breakfast,Convenience Store,Tonkatsu Restaurant,Japanese Restaurant,Bar,Brewery,Tempura Restaurant,Beer Bar
8,231-0012,相生町,35.447439,139.636824,0.0,Coffee Shop,Café,Bed & Breakfast,Convenience Store,Tonkatsu Restaurant,Japanese Restaurant,Ramen Restaurant,Bar,Chinese Restaurant,Brewery
9,231-0013,住吉町,35.447036,139.636577,0.0,Café,Coffee Shop,Bed & Breakfast,Tonkatsu Restaurant,Japanese Restaurant,Convenience Store,Ramen Restaurant,Bar,Chinese Restaurant,Brewery


In [32]:
# Prepare latitude and longitude of Naka-ku for Map creation
address = 'Yokohama'

geolocator = Nominatim(user_agent="Naka_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinates of Naka-ward are {}, {}.'.format(latitude, longitude))

The geograpical coordinates of Naka-ward are 35.444991, 139.636768.


In [31]:
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=13)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(Naka_merged['Latitude'], Naka_merged['Longitude'], Naka_merged['Town'], Naka_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster Labels ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### 5. Review Districts to understand their characteristics and name them for easire reference

In [34]:
# Cluster Label 0: Business & Commercial District
Naka_merged.loc[Naka_merged['Cluster Labels'] == 0, Naka_merged.columns[[1] + list(range(5, Naka_merged.shape[1]))]]

Unnamed: 0,Town,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,新港,Café,Shopping Mall,Seafood Restaurant,Convenience Store,Italian Restaurant,Hotel,Coffee Shop,Park,Japanese Restaurant,Mexican Restaurant
1,海岸通,Café,History Museum,Historic Site,Clothing Store,Convenience Store,Shopping Mall,Italian Restaurant,Pie Shop,Hawaiian Restaurant,American Restaurant
2,北仲通,Café,Italian Restaurant,Hotel,Bed & Breakfast,Japanese Restaurant,Tonkatsu Restaurant,History Museum,Sake Bar,Coffee Shop,Soba Restaurant
3,元浜町,Café,Hotel,History Museum,Italian Restaurant,Jazz Club,BBQ Joint,Sake Bar,Coffee Shop,Convenience Store,Bed & Breakfast
4,本町,Café,Italian Restaurant,Coffee Shop,Soba Restaurant,Hotel,Bed & Breakfast,Japanese Restaurant,Sake Bar,History Museum,Tonkatsu Restaurant
5,南仲通,Café,Coffee Shop,Bed & Breakfast,Convenience Store,Tonkatsu Restaurant,History Museum,Japanese Restaurant,Museum,Park,Soba Restaurant
6,弁天通,Café,Coffee Shop,Convenience Store,Bed & Breakfast,Japanese Restaurant,Bar,Tonkatsu Restaurant,Chinese Restaurant,Udon Restaurant,Sake Bar
7,太田町,Coffee Shop,Café,Bed & Breakfast,Convenience Store,Tonkatsu Restaurant,Japanese Restaurant,Bar,Brewery,Tempura Restaurant,Beer Bar
8,相生町,Coffee Shop,Café,Bed & Breakfast,Convenience Store,Tonkatsu Restaurant,Japanese Restaurant,Ramen Restaurant,Bar,Chinese Restaurant,Brewery
9,住吉町,Café,Coffee Shop,Bed & Breakfast,Tonkatsu Restaurant,Japanese Restaurant,Convenience Store,Ramen Restaurant,Bar,Chinese Restaurant,Brewery


In [35]:
# Cluster Label 1: Shop & Entertainment District
Naka_merged.loc[Naka_merged['Cluster Labels'] == 1, Naka_merged.columns[[1] + list(range(5, Naka_merged.shape[1]))]]

Unnamed: 0,Town,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
18,松影町,Convenience Store,Ramen Restaurant,Baseball Stadium,Historic Site,Soba Restaurant,Bed & Breakfast,Intersection,Italian Restaurant,Japanese Restaurant,Beer Garden
19,寿町,Convenience Store,Ramen Restaurant,Baseball Stadium,Chinese Restaurant,Coffee Shop,Sporting Goods Shop,Café,Soba Restaurant,Bed & Breakfast,Bar
20,扇町,Convenience Store,Baseball Stadium,Ramen Restaurant,Grocery Store,Soba Restaurant,Bed & Breakfast,Intersection,Sporting Goods Shop,Donburi Restaurant,Café
21,翁町,Convenience Store,Baseball Stadium,Soba Restaurant,Ramen Restaurant,Grocery Store,Rock Club,Intersection,Japanese Restaurant,Sporting Goods Shop,Donburi Restaurant
23,不老町,Convenience Store,Ramen Restaurant,Baseball Stadium,Soba Restaurant,Café,Sake Bar,Rock Club,Drugstore,Sporting Goods Shop,Soup Place
25,三吉町,Convenience Store,Ramen Restaurant,Japanese Family Restaurant,Grocery Store,Donburi Restaurant,Bar,Park,Wine Bar,Comedy Club,Bus Station
26,千歳町,Convenience Store,Ramen Restaurant,Japanese Family Restaurant,Grocery Store,Soba Restaurant,Donburi Restaurant,Japanese Curry Restaurant,Bar,Bus Station,Café
27,山田町,Convenience Store,Ramen Restaurant,Grocery Store,Soba Restaurant,Japanese Curry Restaurant,Japanese Family Restaurant,Donburi Restaurant,Betting Shop,Cantonese Restaurant,Middle Eastern Restaurant
28,富士見町,Convenience Store,Ramen Restaurant,Grocery Store,Donburi Restaurant,Chinese Restaurant,Coffee Shop,Japanese Family Restaurant,Teishoku Restaurant,Soba Restaurant,Seafood Restaurant
29,山吹町,Convenience Store,Ramen Restaurant,Grocery Store,Chinese Restaurant,Coffee Shop,Tonkatsu Restaurant,Japanese Curry Restaurant,Italian Restaurant,Discount Store,Soba Restaurant


In [36]:
# Cluster Label 2: Port District
Naka_merged.loc[Naka_merged['Cluster Labels'] == 2, Naka_merged.columns[[1] + list(range(5, Naka_merged.shape[1]))]]

Unnamed: 0,Town,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
72,本牧緑ケ丘,Sports Club,Park,Playground,Zoo Exhibit,Fried Chicken Joint,Fountain,Food Court,Food & Drink Shop,Food,Flower Shop
74,池袋,Park,Sports Club,Convenience Store,Ramen Restaurant,Electronics Store,French Restaurant,Fountain,Food Court,Food & Drink Shop,Food
75,根岸加曽台,Sports Club,Park,Bus Stop,Convenience Store,Zoo Exhibit,Event Space,French Restaurant,Fountain,Food Court,Food & Drink Shop


In [35]:
# Cluster Label 3: Park & Residential District
Naka_merged.loc[Naka_merged['Cluster Labels'] == 3, Naka_merged.columns[[1] + list(range(5, Naka_merged.shape[1]))]]

Unnamed: 0,Town,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
57,本牧町,Bus Stop,Convenience Store,Park,Grocery Store,Szechuan Restaurant,Bakery,Pharmacy,Chinese Restaurant,Snack Place,Bar
60,かもめ町,Intersection,Bus Station,Bus Stop,Food & Drink Shop,Zoo Exhibit,Fast Food Restaurant,Garden,Furniture / Home Store,Fried Chicken Joint,French Restaurant
61,豊浦町,Train Station,Intersection,Bus Stop,Event Space,Furniture / Home Store,Fried Chicken Joint,French Restaurant,Fountain,Food Court,Food & Drink Shop
62,千鳥町,Park,Tennis Court,Bus Stop,Toll Booth,Pool,Historic Site,History Museum,Baseball Field,Bus Station,Intersection
67,本牧三之谷,Historic Site,Park,Bus Stop,Convenience Store,Bakery,Garden,Café,Tea Room,Grocery Store,Snack Place
68,本牧間門,Bus Stop,Park,Snack Place,Bar,Bakery,Historic Site,Coffee Shop,Grocery Store,Lake,Steakhouse
69,本牧荒井,Sports Club,Bus Stop,Bar,Supermarket,Grocery Store,Coffee Shop,Ramen Restaurant,Indian Restaurant,Park,Steakhouse
73,本牧満坂,Japanese Restaurant,Coffee Shop,Park,Grocery Store,Bus Stop,Plaza,Snack Place,Scenic Lookout,Donburi Restaurant,Donut Shop
75,根岸加曽台,Convenience Store,Park,Bus Stop,Sports Club,Zoo Exhibit,Fast Food Restaurant,Furniture / Home Store,Fried Chicken Joint,French Restaurant,Fountain
77,滝之上,Bus Stop,Park,Ramen Restaurant,Plaza,Chinese Restaurant,Snack Place,Museum,Bakery,Stables,Seafood Restaurant


In [37]:
# Cluster Label 4: Industrial District
Naka_merged.loc[Naka_merged['Cluster Labels'] == 4, Naka_merged.columns[[1] + list(range(5, Naka_merged.shape[1]))]]

Unnamed: 0,Town,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
61,豊浦町,Train Station,Intersection,Bus Stop,Zoo Exhibit,French Restaurant,Fountain,Food Court,Food & Drink Shop,Food,Flower Shop
62,千鳥町,Park,Bus Stop,Tennis Court,Baseball Field,Historic Site,History Museum,Pool,Toll Booth,Bus Station,Intersection
65,本牧元町,Bus Station,Bus Stop,Convenience Store,Intersection,History Museum,Park,Event Space,French Restaurant,Fountain,Food Court
66,本牧大里町,Historic Site,Park,Garden,Intersection,Pool,Snack Place,Bus Stop,Bus Station,Tea Room,Tennis Court
67,本牧三之谷,Historic Site,Bus Stop,Park,Convenience Store,Garden,Bakery,Liquor Store,Clothing Store,Tea Room,History Museum
69,本牧荒井,Park,Sports Club,Bus Stop,Bar,Ramen Restaurant,Steakhouse,Grocery Store,Supermarket,Coffee Shop,Japanese Restaurant
70,本牧和田,Park,Bus Stop,Convenience Store,Mobile Phone Shop,Plaza,Japanese Restaurant,Bakery,Chinese Restaurant,Steakhouse,Café
73,本牧満坂,Plaza,Park,Japanese Restaurant,Grocery Store,Coffee Shop,Bus Stop,Scenic Lookout,Indian Restaurant,Ice Cream Shop,Food & Drink Shop
77,滝之上,Bus Stop,Park,Ramen Restaurant,Trail,Snack Place,Seafood Restaurant,Bakery,Museum,Plaza,Chinese Restaurant
78,豆口台,Sports Club,Trail,Platform,Convenience Store,Park,Museum,Bus Stop,Bakery,Train Station,Supermarket


In [38]:
Naka_merged.groupby('Cluster Labels').size()

Cluster Labels
0.0    50
1.0    33
2.0     3
3.0     2
4.0    17
dtype: int64

### More suited for Italian restaurant District is either Label 0 or 1: Business & Commercial District or Shop & Restaurant District

In [39]:
# Creating a new data frame containing only Cluster Labels 0 or 1  as Pos_Towns (=Possible Towns)
Pos_Town=Naka_merged[(Naka_merged['Cluster Labels']==0) | (Naka_merged['Cluster Labels']==1)]
Pos_Towns=Pos_Town[['Town','Latitude','Longitude','Cluster Labels']]
Pos_Towns

Unnamed: 0,Town,Latitude,Longitude,Cluster Labels
0,新港,35.45437,139.64119,0.0
1,海岸通,35.450641,139.64273,0.0
2,北仲通,35.449926,139.637674,0.0
3,元浜町,35.449272,139.640069,0.0
4,本町,35.449386,139.637551,0.0
5,南仲通,35.448198,139.638584,0.0
6,弁天通,35.448037,139.637668,0.0
7,太田町,35.447828,139.637055,0.0
8,相生町,35.447439,139.636824,0.0
9,住吉町,35.447036,139.636577,0.0


### 6. Identify Italian Restaurants in Cluster Labels 0 or 1 Districts using Foursquare

In [40]:
# Set the function to retrieve venues
import requests
 
radius = 150
LIMIT = 100
 
def getNearbyVenues(names, latitudes, longitudes, radius=150):
    
    venues_list=[]
    categoryID='4bf58dd8d48988d110941735' # Italian Restaurant 
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}&categoryId={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT,
            categoryID)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])
 
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Town', 
                  'Town_Latitude', 
                  'Town_Longitude', 
                  'Venue', 
                  'Venue_Lat', 
                  'Venue_Long', 
                  'Venue_Category']
    
    return(nearby_venues)

In [41]:
# Apply above function to Town list 
Naka_Italian_venue = getNearbyVenues(names=Pos_Towns['Town'],
                                   latitudes=df_Town['Latitude'],
                                   longitudes=df_Town['Longitude']
                                  )

新港
海岸通
北仲通
元浜町
本町
南仲通
弁天通
太田町
相生町
住吉町
常盤町
尾上町
真砂町
港町
日本大通
横浜公園
山下町
吉浜町
松影町
寿町
扇町
翁町
万代町
不老町
長者町
三吉町
千歳町
山田町
富士見町
山吹町
吉田町
福富町西通
福富町仲通
福富町東通
伊勢佐木町
末広町
羽衣町
蓬莱町
赤門町
英町
初音町
黄金町
末吉町
若葉町
曙町
弥生町
内田町
桜木町
花咲町
野毛町
宮川町
日ノ出町
新山下
小港町
本牧十二天
本牧宮原
和田山
本牧町
かもめ町
本牧原
本牧間門
矢口台
根岸町
仲尾台
妙香寺台
上野町
本郷町
西之谷町
立野
大和町
竹之丸
鷺山
麦田町
根岸旭台
大芝台
大平町
元町
山手町
諏訪町
千代崎町
北方町
打越
石川町


In [42]:
# Check how many venues are captured -> 90 venues
Naka_Italian_venue.shape

(90, 7)

In [43]:
# Check briefly the names of Italian restaurants -> Venue_Category looks fine
Naka_Italian_venue

Unnamed: 0,Town,Town_Latitude,Town_Longitude,Venue,Venue_Lat,Venue_Long,Venue_Category
0,新港,35.45437,139.64119,A16 YOKOHAMA,35.454704,139.642311,Italian Restaurant
1,北仲通,35.449926,139.637674,ROJI,35.449998,139.637779,Italian Restaurant
2,北仲通,35.449926,139.637674,Osteria Austro,35.44941,139.63853,Italian Restaurant
3,北仲通,35.449926,139.637674,La Brezza BASHAMICHI (ラブレッツア馬車道),35.449959,139.637685,Italian Restaurant
4,元浜町,35.449272,139.640069,Osteria Austro,35.44941,139.63853,Italian Restaurant
5,元浜町,35.449272,139.640069,Taverna Pollone (タベルナポローネ),35.449406,139.640001,Italian Restaurant
6,本町,35.449386,139.637551,ROJI,35.449998,139.637779,Italian Restaurant
7,本町,35.449386,139.637551,Osteria Austro,35.44941,139.63853,Italian Restaurant
8,本町,35.449386,139.637551,La Brezza BASHAMICHI (ラブレッツア馬車道),35.449959,139.637685,Italian Restaurant
9,南仲通,35.448198,139.638584,Cafe&Kitchen. 333,35.447586,139.638965,Italian Restaurant


In [44]:
# Add back Cluster Labels to above data frame by merging it with Pos_Town data frame
Naka_Italian=pd.merge(Naka_Italian_venue,Pos_Towns,on='Town')
Italian_list=Naka_Italian.drop(['Latitude','Longitude'], axis=1)
Italian_list

Unnamed: 0,Town,Town_Latitude,Town_Longitude,Venue,Venue_Lat,Venue_Long,Venue_Category,Cluster Labels
0,新港,35.45437,139.64119,A16 YOKOHAMA,35.454704,139.642311,Italian Restaurant,0.0
1,北仲通,35.449926,139.637674,ROJI,35.449998,139.637779,Italian Restaurant,0.0
2,北仲通,35.449926,139.637674,Osteria Austro,35.44941,139.63853,Italian Restaurant,0.0
3,北仲通,35.449926,139.637674,La Brezza BASHAMICHI (ラブレッツア馬車道),35.449959,139.637685,Italian Restaurant,0.0
4,元浜町,35.449272,139.640069,Osteria Austro,35.44941,139.63853,Italian Restaurant,0.0
5,元浜町,35.449272,139.640069,Taverna Pollone (タベルナポローネ),35.449406,139.640001,Italian Restaurant,0.0
6,本町,35.449386,139.637551,ROJI,35.449998,139.637779,Italian Restaurant,0.0
7,本町,35.449386,139.637551,Osteria Austro,35.44941,139.63853,Italian Restaurant,0.0
8,本町,35.449386,139.637551,La Brezza BASHAMICHI (ラブレッツア馬車道),35.449959,139.637685,Italian Restaurant,0.0
9,南仲通,35.448198,139.638584,Cafe&Kitchen. 333,35.447586,139.638965,Italian Restaurant,0.0


In [45]:
# Count Italian Venues by Town -> Good to check Italian restaurants are fairly distributed across Towns
Italian_list.groupby('Town').count()

Unnamed: 0_level_0,Town_Latitude,Town_Longitude,Venue,Venue_Lat,Venue_Long,Venue_Category,Cluster Labels
Town,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
万代町,1,1,1,1,1,1,1
不老町,1,1,1,1,1,1,1
住吉町,5,5,5,5,5,5,5
元浜町,2,2,2,2,2,2,2
内田町,1,1,1,1,1,1,1
北仲通,3,3,3,3,3,3,3
南仲通,3,3,3,3,3,3,3
吉田町,5,5,5,5,5,5,5
太田町,5,5,5,5,5,5,5
宮川町,6,6,6,6,6,6,6


In [46]:
# Check the number of unique venue names -> 94 venues captured vs. 44 unique names.  There are fair amount of duplications in the data frame, although some of them are chains and okay for duplication
uc=Italian_list['Venue'].nunique()
uc

44

In [47]:
# Check which Italian Restaurants are suspected for duplication in the data frame
pd.set_option('display.max_rows',200)
vc=Italian_list['Venue'].value_counts()
vc

Piacere                              5
Via Toscanella                       5
VINOTECA SAKURA                      5
Osteria Austro                       4
ハマラジャ                                4
La Pausa (ラパウザ)                      4
Pizza Cozou - ぴざこぞう                  4
Saizeriya (サイゼリヤ)                    3
ヤンキース                                2
ベイサイド Ducky Duck キッチン                2
麺房亭 / 春雷亭                            2
L'isola del Brio                     2
OREZZO                               2
POZ DINING                           2
kinpira kitchen                      2
BARACCA                              2
インコントロ                               2
イタリアンバル ぽると 関内駅前店                    2
Italian Bar BASIL                    2
MILANO                               2
Italian Bar BACCO                    2
ROJI                                 2
OiNOS                                2
iL-CHIANTI 横浜店                       2
Cafe&Kitchen. 333                    2
La Brezza BASHAMICHI (ラブレ

In [51]:
# Visualize where those Italian restaurants are today in Cluster Lables 0 and 1 -> Almost all Italian restaurants are actually in Cluster Labels 0 = Business & Commercial District
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=14)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(Italian_list['Venue_Lat'], Italian_list['Venue_Long'], Italian_list['Venue'], Italian_list['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster Labels ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster)-1],
        fill=True,
        fill_color=rainbow[int(cluster)-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

## Results - There are fair concentration of restaurants in a couple of Districts rather than spreading out over entire Naka-ward.  Cluster analysis identified 2 out of 5 Districts having many restaurants, but when focusing just on the category "Italian Restaurant", it is only 1 District, Business & Commercial District, where we can see many Italian restaurants exist today

In [73]:
# Review Most Common Venues to check if Italian Restaurant appear in Business & Commercial District
Naka_merged

Unnamed: 0,Postal_Code,Town,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,231-0001,新港,35.45437,139.64119,0.0,Café,Shopping Mall,Seafood Restaurant,Convenience Store,Italian Restaurant,Hotel,Coffee Shop,Park,Japanese Restaurant,Mexican Restaurant
1,231-0002,海岸通,35.450641,139.64273,0.0,Café,History Museum,Historic Site,Clothing Store,Convenience Store,Shopping Mall,Italian Restaurant,Pie Shop,Hawaiian Restaurant,American Restaurant
2,231-0003,北仲通,35.449926,139.637674,0.0,Café,Italian Restaurant,Hotel,Bed & Breakfast,Japanese Restaurant,Tonkatsu Restaurant,History Museum,Sake Bar,Coffee Shop,Soba Restaurant
3,231-0004,元浜町,35.449272,139.640069,0.0,Café,Hotel,History Museum,Italian Restaurant,Jazz Club,BBQ Joint,Sake Bar,Coffee Shop,Convenience Store,Bed & Breakfast
4,231-0005,本町,35.449386,139.637551,0.0,Café,Italian Restaurant,Coffee Shop,Soba Restaurant,Hotel,Bed & Breakfast,Japanese Restaurant,Sake Bar,History Museum,Tonkatsu Restaurant
5,231-0006,南仲通,35.448198,139.638584,0.0,Café,Coffee Shop,Bed & Breakfast,Convenience Store,Tonkatsu Restaurant,History Museum,Japanese Restaurant,Museum,Park,Soba Restaurant
6,231-0007,弁天通,35.448037,139.637668,0.0,Café,Coffee Shop,Convenience Store,Bed & Breakfast,Japanese Restaurant,Bar,Tonkatsu Restaurant,Chinese Restaurant,Udon Restaurant,Sake Bar
7,231-0011,太田町,35.447828,139.637055,0.0,Coffee Shop,Café,Bed & Breakfast,Convenience Store,Tonkatsu Restaurant,Japanese Restaurant,Bar,Brewery,Tempura Restaurant,Beer Bar
8,231-0012,相生町,35.447439,139.636824,0.0,Coffee Shop,Café,Bed & Breakfast,Convenience Store,Tonkatsu Restaurant,Japanese Restaurant,Ramen Restaurant,Bar,Chinese Restaurant,Brewery
9,231-0013,住吉町,35.447036,139.636577,0.0,Café,Coffee Shop,Bed & Breakfast,Tonkatsu Restaurant,Japanese Restaurant,Convenience Store,Ramen Restaurant,Bar,Chinese Restaurant,Brewery


## Discussion - Even in Business & Commercial District, there are two Towns listing Italian Restaurant as 2nd Most Common Venue.  We should first review such Towns if the area has apparopriate atomosphere.  Recommend either Postal_Code 231-0003 or 0005 , "北中通" or "本町", to first consider the Town to open new Italian Restaurant. Those two towns appear to have many Cafe and Hotel without distractive shops which combined offer more attractive atomosphere for Italian restaurant.¶

In [74]:
# Identfy Towns with Italian Restaurant listed popular among venue category
ItalianTown=Naka_merged.query('Town =="本町"or Town== "北仲通"')
ItalianTown

Unnamed: 0,Postal_Code,Town,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,231-0003,北仲通,35.449926,139.637674,0.0,Café,Italian Restaurant,Hotel,Bed & Breakfast,Japanese Restaurant,Tonkatsu Restaurant,History Museum,Sake Bar,Coffee Shop,Soba Restaurant
4,231-0005,本町,35.449386,139.637551,0.0,Café,Italian Restaurant,Coffee Shop,Soba Restaurant,Hotel,Bed & Breakfast,Japanese Restaurant,Sake Bar,History Museum,Tonkatsu Restaurant


## Conclusion
The combination of Folium map and Foursquare venue information is a powerful tool for geographical analysis.  But it requires proper clustering (grouping) using such like k-mean method before getring down to more detailed analysis because there are just too many venues in Foursquare. 