# Capstone Project - The comparison of two areas- Paris and Tokyo (Week 5)
### Applied Data Science Capstone by IBM/Coursera

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Introduction: Business Problem <a name="introduction"></a>

Comparing the kinds of venues in Paris and in Tokyo.

Paris and Tokyo are both popular sightseeing spots.
In this project we will try to find places related to venues. Specially, this report will be targeted to stakeholders interested in visiting or investing **venues** in Paris and in Tokyo.

There are 20 Arrondissements in Paris. 
We will compare the number of venues and its kind in Paris by ward and we will find the 5 most popular venues.


There are 23 special wards in Tokyo.
We will compare the number of venues and its kind in Tokyo by ward. We will find the 5 most popular venues.


Then we will compare the kinds of both venues.


## Data <a name="data"></a>

Based on definition of our problem, factors that will influence our decision are:

*  number of existing venus in the neighborhood (any type of venue)

Following data sources will be needed to extract/generate the required information:
* centers of areas will be generated algorithmically and approximate addresses of centers of those areas will be obtained using **Google Maps API reverse geocoding**
* number of  venues and their type and location in every arrondissement or ward will be obtained using **Foursquare API**
* coordinate of arrondissements of **Paris**  will be obtained using **Paris Data Arrondissements** 
    https://opendata.paris.fr/explore/dataset/arrondissements/table/?sort=-c_ar
* coordinate of **Tokyo** will be obtained using the below website(we will use the downloaded pdf)
    https://www.gsi.go.jp/KOKUJYOHO/CENTER/kendata/tokyo_heso.pdf


### Import libraries

In [1]:
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analsysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


### Load and explore the dataset

### 1.Arrondissment Candidate (Paris)

Let's create latitude & longitude coordinates for centroids of our candidate arrondissment. We will create a grid of cells covering our area of interest which is aprox. 5x 5 killometers centered around each arrondissemnt.

In [2]:
with open('arrondissements.json') as json_data:
    paris_data = json.load(json_data)

In [3]:
# paris_data

In [4]:
paris_data[0]['fields']

{'c_ar': 1,
 'c_arinsee': 75101,
 'geom': {'coordinates': [[[2.328007329038849, 48.86991742140716],
    [2.329965588686571, 48.86851416917428],
    [2.330306795320875, 48.86835619167467],
    [2.33065673396009, 48.86819218066116],
    [2.33172562934836, 48.86795490259038],
    [2.33172601351949, 48.86795481659967],
    [2.333675321300196, 48.867516125009374],
    [2.335869054057415, 48.866996626507536],
    [2.335869691238242, 48.866996475356],
    [2.337371969067098, 48.86664907439457],
    [2.341083555178272, 48.86577201721946],
    [2.341178272058699, 48.86574963323162],
    [2.341204510696185, 48.865743681005995],
    [2.34126849090564, 48.86572828653818],
    [2.341271025930368, 48.86572767724483],
    [2.345101655171463, 48.864809197959836],
    [2.346675453051013, 48.864431064833674],
    [2.346676032763326, 48.864430925901665],
    [2.350949105218923, 48.86340592861751],
    [2.350947639855089, 48.86340330447355],
    [2.350214645870493, 48.86209499953653],
    [2.3501560417567

In [5]:
# define the dataframe columns
column_names = ['City', 'ArrNo', 'Arrondissement', 'Latitude', 'Longitude']
# instantiate the dataframe
arrondissements = pd.DataFrame(columns=column_names)

In [6]:
arrondissements

Unnamed: 0,City,ArrNo,Arrondissement,Latitude,Longitude


In [7]:
paris_data[0]['fields']['l_aroff']

'Louvre'

In [8]:
for data in paris_data:
    arr_no = arrondissement_no = data['fields']['l_ar'] 
    arrondissement_name = data['fields']['l_aroff']
        
    arrondissement_latlon = data['fields']['geom_x_y']
    arrondissement_lat = arrondissement_latlon[0]
    arrondissement_lon = arrondissement_latlon[1]
    
    arrondissements = arrondissements.append({
                                          'ArrNo': arr_no,
                                          'Arrondissement': arrondissement_name,
                                          'Latitude': arrondissement_lat,
                                          'Longitude': arrondissement_lon}, ignore_index=True)

In [9]:
arrondissements

Unnamed: 0,City,ArrNo,Arrondissement,Latitude,Longitude
0,,1er Ardt,Louvre,48.862563,2.336443
1,,2ème Ardt,Bourse,48.868279,2.342803
2,,17ème Ardt,Batignolles-Monceau,48.887327,2.306777
3,,14ème Ardt,Observatoire,48.829245,2.326542
4,,20ème Ardt,Ménilmontant,48.863461,2.401188
5,,7ème Ardt,Palais-Bourbon,48.856174,2.312188
6,,11ème Ardt,Popincourt,48.859059,2.380058
7,,13ème Ardt,Gobelins,48.828388,2.362272
8,,4ème Ardt,Hôtel-de-Ville,48.854341,2.35763
9,,8ème Ardt,Élysée,48.872721,2.312554


In [10]:
city = []
for i in range(0, arrondissements.shape[0]):
    city.append('Paris')
    i+=1

In [11]:
arrondissements['City'] = city
arrondissements

Unnamed: 0,City,ArrNo,Arrondissement,Latitude,Longitude
0,Paris,1er Ardt,Louvre,48.862563,2.336443
1,Paris,2ème Ardt,Bourse,48.868279,2.342803
2,Paris,17ème Ardt,Batignolles-Monceau,48.887327,2.306777
3,Paris,14ème Ardt,Observatoire,48.829245,2.326542
4,Paris,20ème Ardt,Ménilmontant,48.863461,2.401188
5,Paris,7ème Ardt,Palais-Bourbon,48.856174,2.312188
6,Paris,11ème Ardt,Popincourt,48.859059,2.380058
7,Paris,13ème Ardt,Gobelins,48.828388,2.362272
8,Paris,4ème Ardt,Hôtel-de-Ville,48.854341,2.35763
9,Paris,8ème Ardt,Élysée,48.872721,2.312554


In [12]:
print('The dataframe has 1 area and {} arrondissements.'.format(
        arrondissements.shape[0]
    )
)

The dataframe has 1 area and 20 arrondissements.


Use geopy library to get the latitude and longitude values of Paris

In [13]:
address = 'Paris'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of Paris are {}, {}.'.format(latitude, longitude))

The geograpical coordinate of Paris are 48.8566969, 2.3514616.


#### Create a map of Paris with neighborhoods superimposed on top.

In [16]:
# create map of Paris using latitude and longitude values
map_paris = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, city, arrondissement in zip(arrondissements['Latitude'], arrondissements['Longitude'], arrondissements['City'], arrondissements['Arrondissement']):
    label = '{}, {}'.format(arrondissement, city)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_paris)  
    
map_paris

### Foursquare
Now that we have our location candidates, let's use Foursquare API to get info on venues in each arrondissement.

In [17]:
CLIENT_ID = '' # my Foursquare ID
CLIENT_SECRET = '' # my Foursquare Secret
VERSION = '20180605' # Foursquare API version

#### Explore the first arrondissement in our dataframe

Get the arrondissement's name.

In [18]:
arrondissements.loc[0, 'Arrondissement']

'Louvre'

Get the arrondissement's latitude and longitude values.

In [19]:
arrondissement_latitude = arrondissements.loc[0, 'Latitude'] # neighborhood latitude value
arrondissement_longitude = arrondissements.loc[0, 'Longitude'] # neighborhood longitude value

arrondissement_name = arrondissements.loc[0, 'Arrondissement'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(arrondissement_name, 
                                                               arrondissement_latitude, 
                                                               arrondissement_longitude))

Latitude and longitude values of Louvre are 48.8625627018, 2.33644336205.


Now, let's get the top 100 venues that are in Louvre within a radius of 250 meters.

In [39]:
# type your answer here
LIMIT = 100 # limit of number of venues returned by Foursquare API
radius = 250 # define radius

url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    arrondissement_latitude, 
    arrondissement_longitude, 
    radius, 
    LIMIT)
#url # display URL

In [40]:
results_louvre = requests.get(url).json()
#results_louvre

In [41]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [42]:
venues_louvre = results_louvre['response']['groups'][0]['items']
    
nearby_venues_louvre = json_normalize(venues_louvre) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues_louvre =nearby_venues_louvre.loc[:, filtered_columns]

# filter the category for each row
nearby_venues_louvre['venue.categories'] = nearby_venues_louvre.apply(get_category_type, axis=1)

# clean columns
nearby_venues_louvre.columns = [col.split(".")[-1] for col in nearby_venues_louvre.columns]

nearby_venues_louvre.head()

Unnamed: 0,name,categories,lat,lng
0,Musée du Louvre,Art Museum,48.860847,2.33644
1,Palais Royal,Historic Site,48.863236,2.337127
2,Comédie-Française,Theater,48.863088,2.336612
3,La Clef Louvre Paris,Hotel,48.863977,2.33614
4,Cour Napoléon,Plaza,48.861172,2.335088


In [43]:
print('{} venues were returned by Foursquare.'.format(nearby_venues_louvre.shape[0]))

11 venues were returned by Foursquare.


### Explore Arrondissements in Paris

In [44]:
def getNearbyVenuesParis(names, latitudes, longitudes, radius=250):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Arrondissement', 
                  'Arrondissement Latitude', 
                  'Arrondissement Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [46]:
paris_venues = getNearbyVenuesParis(names=arrondissements['Arrondissement'],
                                   latitudes=arrondissements['Latitude'],
                                   longitudes=arrondissements['Longitude']
                                  )

Louvre
Bourse
Batignolles-Monceau
Observatoire
Ménilmontant
Palais-Bourbon
Popincourt
Gobelins
Hôtel-de-Ville
Élysée
Buttes-Montmartre
Opéra
Buttes-Chaumont
Vaugirard
Temple
Panthéon
Luxembourg
Reuilly
Entrepôt
Passy


In [47]:
print(paris_venues.shape)
paris_venues.head()

(463, 7)


Unnamed: 0,Arrondissement,Arrondissement Latitude,Arrondissement Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Louvre,48.862563,2.336443,Musée du Louvre,48.860847,2.33644,Art Museum
1,Louvre,48.862563,2.336443,Palais Royal,48.863236,2.337127,Historic Site
2,Louvre,48.862563,2.336443,Comédie-Française,48.863088,2.336612,Theater
3,Louvre,48.862563,2.336443,La Clef Louvre Paris,48.863977,2.33614,Hotel
4,Louvre,48.862563,2.336443,Cour Napoléon,48.861172,2.335088,Plaza


In [48]:
paris_venues.groupby('Arrondissement').count()

Unnamed: 0_level_0,Arrondissement Latitude,Arrondissement Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Arrondissement,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Batignolles-Monceau,18,18,18,18,18,18
Bourse,30,30,30,30,30,30
Buttes-Chaumont,10,10,10,10,10,10
Buttes-Montmartre,23,23,23,23,23,23
Entrepôt,29,29,29,29,29,29
Gobelins,7,7,7,7,7,7
Hôtel-de-Ville,40,40,40,40,40,40
Louvre,11,11,11,11,11,11
Luxembourg,26,26,26,26,26,26
Ménilmontant,14,14,14,14,14,14


In [49]:
print('There are {} uniques categories.'.format(len(paris_venues['Venue Category'].unique())))

There are 128 uniques categories.


### 2. Ward Candidate (Tokyo)

Tokyo has a total of 23 special wards. We use the dataset as below and translate japanese into English for exploring the dataset.

In [50]:
!conda install -c conda-forge tabula-py --yes

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.



In [51]:
import pandas as pd
import tabula
file = "tokyo_heso.pdf"
path = file
df_tokyo = tabula.read_pdf(path, pages = '1-7', multiple_tables = True, encoding = "utf-8_sig")
print(df_tokyo)

[     0            1            2            3            4            5
0  NaN         東京都庁           東端           西端           南端           北端
1   経度  139°41′30′′  153°59′12′′  136°04′11′′  136°04′11′′  139°01′06′′
2   緯度   35°41′22′′   24°16′59′′   20°25′31′′   20°25′31′′   35°53′54′′,      0            1            2            3    4            5
0  NaN       千代田区役所           東端           西端   南端           北端
1   経度  139°45′13′′  139°46′59′′  139°43′48′′  NaN  139°46′12′′
2  NaN          NaN          NaN          NaN   未定          NaN
3   緯度   35°41′38′′   35°41′47′′   35°41′09′′  NaN   35°42′19′′,      0            1            2    3            4            5
0  NaN        中央区役所           東端   西端           南端           北端
1   経度  139°46′20′′  139°47′33′′  NaN  139°46′21′′  139°46′59′′
2  NaN          NaN          NaN   未定          NaN          NaN
3   緯度   35°40′15′′   35°41′09′′  NaN   35°38′46′′   35°41′47′′,      0            1            2            3            4          

In [52]:
df_tokyo[0]

Unnamed: 0,0,1,2,3,4,5
0,,東京都庁,東端,西端,南端,北端
1,経度,139°41′30′′,153°59′12′′,136°04′11′′,136°04′11′′,139°01′06′′
2,緯度,35°41′22′′,24°16′59′′,20°25′31′′,20°25′31′′,35°53′54′′


In [53]:
df_tokyo[1]

Unnamed: 0,0,1,2,3,4,5
0,,千代田区役所,東端,西端,南端,北端
1,経度,139°45′13′′,139°46′59′′,139°43′48′′,,139°46′12′′
2,,,,,未定,
3,緯度,35°41′38′′,35°41′47′′,35°41′09′′,,35°42′19′′


In [54]:
df_tokyo[1][1][0]

'千代田区役所'

In [55]:
# load the ward name
df_wards_jp = []  
for i in range(0, 24): 
    df_wards_jp.append(df_tokyo[i][1][0])  # add the result 
    i+= 1
df_wards_jp

['東京都庁',
 '千代田区役所',
 '中央区役所',
 '港区役所',
 '新宿区役所',
 '文京区役所',
 '台東区役所',
 '墨田区役所',
 '江東区役所',
 '品川区役所',
 '目黒区役所',
 '大田区役所',
 '世田谷区役所',
 '渋谷区役所',
 '中野区役所',
 '杉並区役所',
 '豊島区役所',
 '北区役所',
 '荒川区役所',
 '板橋区役所',
 '練馬区役所',
 '足立区役所',
 '葛飾区役所',
 '江戸川区役所']

In [56]:
# translate Japanese into English
!conda install -c conda-forge googletrans --yes

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.



In [57]:
from googletrans import Translator

translator = Translator()

jp_words = df_wards_jp
df_wards_en = []

for src in jp_words:
    dst = translator.translate(src, src='ja', dest='en')
    df_wards_en.append(dst.text)

print(df_wards_en)  

['Tokyo Metropolitan Government Building', 'Chiyoda ward office', 'Central Ward Office', 'Minatokuyakusho', 'Shinjuku Ward Office', 'Bunkyo Ward Office', 'Taito Ward Office', 'Sumida ward office', 'Koto ward office', 'Shinagawa ward office', 'Meguro ward office', 'Ota ward office', 'Setagaya ward office', 'Shibuya ward office', 'Nakano ward office', 'Suginami Ward Office', 'Toshima Ward Office', 'North ward office', 'Arakawa Ward', 'Itabashi ward office', 'Nerima ward office', 'Adachi Ward Office', 'Katsushika Ward Office', 'Edogawa ward office']


In [58]:
# 'Minatokuyakusho' is not well translated into English so we replace 'Minatokuyakusho' into 'Minato ward office' as the other ones.
df_wards_en[3] = 'Minato ward office'
df_wards_en

['Tokyo Metropolitan Government Building',
 'Chiyoda ward office',
 'Central Ward Office',
 'Minato ward office',
 'Shinjuku Ward Office',
 'Bunkyo Ward Office',
 'Taito Ward Office',
 'Sumida ward office',
 'Koto ward office',
 'Shinagawa ward office',
 'Meguro ward office',
 'Ota ward office',
 'Setagaya ward office',
 'Shibuya ward office',
 'Nakano ward office',
 'Suginami Ward Office',
 'Toshima Ward Office',
 'North ward office',
 'Arakawa Ward',
 'Itabashi ward office',
 'Nerima ward office',
 'Adachi Ward Office',
 'Katsushika Ward Office',
 'Edogawa ward office']

In [59]:
# We will use only each first word for the name of Tokyo special wards
df_wards = []
for i in range(0, 24):
#     print(i)
    df_wards.append(df_wards_en[i].split(' ')[0])
df_wards

['Tokyo',
 'Chiyoda',
 'Central',
 'Minato',
 'Shinjuku',
 'Bunkyo',
 'Taito',
 'Sumida',
 'Koto',
 'Shinagawa',
 'Meguro',
 'Ota',
 'Setagaya',
 'Shibuya',
 'Nakano',
 'Suginami',
 'Toshima',
 'North',
 'Arakawa',
 'Itabashi',
 'Nerima',
 'Adachi',
 'Katsushika',
 'Edogawa']

In [60]:
# load latitude of Tokyo and  23 specieal wards
df_tokyo_latitude_txt = []  
for i in range(0, 24): 
#     print(i)
#     print(df_tokyo[i][1][0])
    df_tokyo_latitude_txt.append(df_tokyo[i][1][2])  # add the result 
    i+= 1
df_tokyo_latitude_txt

['35°41′22′′',
 nan,
 nan,
 '35°39′29′′',
 '35°41′38′′',
 '35°42′29′′',
 '35°42′46′′',
 '35°42′38′′',
 '35°40′23′′',
 '35°36′32′′',
 '35°38′29′′',
 '35°33′41′′',
 '35°38′46′′',
 '35°39′51′′',
 '35°42′27′′',
 '35°41′58′′',
 '35°43′34′′',
 '35°45′10′′',
 '35°44′10′′',
 '35°45′04′′',
 '35°44′08′′',
 '35°46′30′′',
 '35°44′36′′',
 '35°42′24′′']

In [61]:
#  In the data 'df_tokyo'  there seem to be the rows of NaN, so we need to ignore their rows.
df_tokyo_latitude_txt = []  
for i in range(0, 24): 
    if (i == 1) or (i == 2):
        df_tokyo_latitude_txt.append(df_tokyo[i][1][3])
    else:
        df_tokyo_latitude_txt.append(df_tokyo[i][1][2])  # add the result 
    i+= 1
df_tokyo_latitude_txt

['35°41′22′′',
 '35°41′38′′',
 '35°40′15′′',
 '35°39′29′′',
 '35°41′38′′',
 '35°42′29′′',
 '35°42′46′′',
 '35°42′38′′',
 '35°40′23′′',
 '35°36′32′′',
 '35°38′29′′',
 '35°33′41′′',
 '35°38′46′′',
 '35°39′51′′',
 '35°42′27′′',
 '35°41′58′′',
 '35°43′34′′',
 '35°45′10′′',
 '35°44′10′′',
 '35°45′04′′',
 '35°44′08′′',
 '35°46′30′′',
 '35°44′36′′',
 '35°42′24′′']

In [62]:
# load longitude of Tokyo and  23 specieal wards
df_tokyo_longitude_txt = []  
for i in range(0, 24): 
    df_tokyo_longitude_txt.append(df_tokyo[i][1][1])  # add the result 
    i+= 1
df_tokyo_longitude_txt

['139°41′30′′',
 '139°45′13′′',
 '139°46′20′′',
 '139°45′06′′',
 '139°42′13′′',
 '139°45′08′′',
 '139°46′48′′',
 '139°48′06′′',
 '139°49′02′′',
 '139°43′49′′',
 '139°41′54′′',
 '139°42′58′′',
 '139°39′11′′',
 '139°41′55′′',
 '139°39′50′′',
 '139°38′11′′',
 '139°43′00′′',
 '139°44′01′′',
 '139°47′00′′',
 '139°42′34′′',
 '139°39′08′′',
 '139°48′17′′',
 '139°50′50′′',
 '139°52′06′′']

In [63]:
# Latitude and longitude area both string so we need  convert string to integer
print(type(df_tokyo_latitude_txt[0]))
print(type(df_tokyo_longitude_txt[0]))

<class 'str'>
<class 'str'>


In [64]:
import re

def dms2dd(degrees, minutes, seconds, direction):
    dd = float(degrees) + float(minutes)/60 + float(seconds)/(60*60);
    if direction == 'W' or direction == 'S':
        dd *= -1
    return dd;

def dd2dms(deg):
    d = int(deg)
    md = abs(deg - d) * 60
    m = int(md)
    sd = (md - m) * 60
    return [d, m, sd]

def parse_dms(dms):
    parts = re.split('[^\d\w]+', dms)
    lat = dms2dd(parts[0], parts[1], parts[2], parts[3])

    return (lat)

dd = parse_dms("78°55'44.33324N" )

print(dd)

78.92888888888889


In [65]:
df_tokyo_latitude = []
for dd in df_tokyo_latitude_txt:
    lat= parse_dms (dd+"N")
    df_tokyo_latitude.append(lat)
    
    
print(df_tokyo_latitude)

[35.68944444444444, 35.693888888888885, 35.670833333333334, 35.658055555555556, 35.693888888888885, 35.70805555555556, 35.71277777777778, 35.71055555555556, 35.67305555555555, 35.60888888888889, 35.64138888888889, 35.561388888888885, 35.64611111111111, 35.66416666666667, 35.7075, 35.69944444444444, 35.726111111111116, 35.75277777777778, 35.736111111111114, 35.75111111111111, 35.73555555555556, 35.775, 35.74333333333333, 35.70666666666667]


In [66]:
df_tokyo_latitude=[]
df_tokyo_latitude=[parse_dms (x +"N")  for x in df_tokyo_latitude_txt] 

print(df_tokyo_latitude)

[35.68944444444444, 35.693888888888885, 35.670833333333334, 35.658055555555556, 35.693888888888885, 35.70805555555556, 35.71277777777778, 35.71055555555556, 35.67305555555555, 35.60888888888889, 35.64138888888889, 35.561388888888885, 35.64611111111111, 35.66416666666667, 35.7075, 35.69944444444444, 35.726111111111116, 35.75277777777778, 35.736111111111114, 35.75111111111111, 35.73555555555556, 35.775, 35.74333333333333, 35.70666666666667]


In [67]:
df_tokyo_longitude=[]
df_tokyo_longitude=[parse_dms (x +"N")  for x in df_tokyo_longitude_txt] 

print(df_tokyo_longitude)

[139.69166666666666, 139.7536111111111, 139.77222222222224, 139.75166666666667, 139.7036111111111, 139.75222222222223, 139.78, 139.80166666666668, 139.81722222222223, 139.7302777777778, 139.69833333333332, 139.7161111111111, 139.65305555555557, 139.69861111111112, 139.6638888888889, 139.6363888888889, 139.71666666666667, 139.7336111111111, 139.78333333333333, 139.70944444444444, 139.65222222222224, 139.80472222222224, 139.84722222222223, 139.86833333333334]


In [68]:
# for i,dd in enumerate(df_tokyo_latitude):
#     lat= parse_dms (dd+"N")
#     df_tokyo_latitude[i]=lat
# print(df_tokyo_latitude)

In [69]:
# for i,dd in enumerate(df_tokyo_longitude):
#     lon= parse_dms (dd+"")
#     df_tokyo_longitude[i]=lon
# print(df_tokyo_longitude)

In [70]:
# define the dataframe columns
column_names = ['Area', 'Ward', 'Latitude', 'Longitude'] 

# instantiate the dataframe
wards = pd.DataFrame(columns=column_names)

In [71]:
wards

Unnamed: 0,Area,Ward,Latitude,Longitude


In [72]:
area = []
for i in range(0, 24):
    area.append('Tokyo')
    i+=1

In [73]:
wards['Area'] = area
wards['Ward'] = df_wards
wards['Latitude'] = df_tokyo_latitude
wards['Longitude']= df_tokyo_longitude
wards

Unnamed: 0,Area,Ward,Latitude,Longitude
0,Tokyo,Tokyo,35.689444,139.691667
1,Tokyo,Chiyoda,35.693889,139.753611
2,Tokyo,Central,35.670833,139.772222
3,Tokyo,Minato,35.658056,139.751667
4,Tokyo,Shinjuku,35.693889,139.703611
5,Tokyo,Bunkyo,35.708056,139.752222
6,Tokyo,Taito,35.712778,139.78
7,Tokyo,Sumida,35.710556,139.801667
8,Tokyo,Koto,35.673056,139.817222
9,Tokyo,Shinagawa,35.608889,139.730278


In [74]:
print('The dataframe has {} area and {} wards.'.format(
        len(wards['Area'].unique()),
        wards.shape[0]-1
    )
)

The dataframe has 1 area and 23 wards.


In [75]:
print('The geograpical coordinate of Tokyo are {}, {}.'.format(wards['Latitude'][0],wards['Longitude'][0]))

The geograpical coordinate of Tokyo are 35.68944444444444, 139.69166666666666.


In [76]:
print(wards.dtypes)

Area          object
Ward          object
Latitude     float64
Longitude    float64
dtype: object


#### Create a map of Tokyo with wards superimposed on top.

In [80]:
# create map of Tokyo using latitude and longitude values
map_tokyo = folium.Map(location=[wards['Latitude'][0], wards['Longitude'][0]], zoom_start=11)

# add markers to map
for lat, lng, area, ward in zip(wards['Latitude'], wards['Longitude'], wards['Area'], wards['Ward']):
    label = '{}, {}'.format(ward, area)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_tokyo)  
    
map_tokyo

#### Explore Tokyo Metropolitan Government Building in our dataframe.

In [81]:
wards.loc[0, 'Ward']

'Tokyo'

Get its latitude and longitude values.

In [82]:
tokyo_latitude = wards.loc[0, 'Latitude'] # latitude value
tokyo_longitude = wards.loc[0, 'Longitude'] # longitude value

tokyo_name = wards.loc[0, 'Ward'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(tokyo_name, 
                                                               tokyo_latitude, 
                                                               tokyo_longitude))

Latitude and longitude values of Tokyo are 35.68944444444444, 139.69166666666666.


#### Now, let's get the top 100 venues that are in Tokyo Metropolitan Government Building within a radius of 250 meters.

In [85]:
LIMIT = 100 # limit of number of venues returned by Foursquare API
radius = 250 # define radius

url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    tokyo_latitude, 
    tokyo_longitude, 
    radius, 
    LIMIT)
#url # display URL

In [86]:
results = requests.get(url).json()
#results

In [87]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues =nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

Unnamed: 0,name,categories,lat,lng
0,South Observatory (東京都庁 南展望室),Scenic Lookout,35.68929,139.691821
1,North Observatory (東京都庁 北展望室),Scenic Lookout,35.689797,139.691654
2,"Observatories, Tokyo Metropolitan Government B...",Scenic Lookout,35.689788,139.691645
3,Hyatt Regency Caffè,Café,35.69113,139.691398
4,Pizzo Rante Spacca Napoli (ピッツォランテ スパッカ ナポリ),Italian Restaurant,35.691318,139.692588


In [88]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

30 venues were returned by Foursquare.


### Explore 23 wards in Tokyo

In [89]:
def getNearbyVenues(names, latitudes, longitudes, radius=250):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Ward', 
                  'Ward Latitude', 
                  'Ward Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

We will use just 23 wards instead of Tokyo and 23 wards

In [90]:
each_ward = wards[1:]
each_ward

Unnamed: 0,Area,Ward,Latitude,Longitude
1,Tokyo,Chiyoda,35.693889,139.753611
2,Tokyo,Central,35.670833,139.772222
3,Tokyo,Minato,35.658056,139.751667
4,Tokyo,Shinjuku,35.693889,139.703611
5,Tokyo,Bunkyo,35.708056,139.752222
6,Tokyo,Taito,35.712778,139.78
7,Tokyo,Sumida,35.710556,139.801667
8,Tokyo,Koto,35.673056,139.817222
9,Tokyo,Shinagawa,35.608889,139.730278
10,Tokyo,Meguro,35.641389,139.698333


In [91]:
tokyo_venues = getNearbyVenues(names=each_ward['Ward'],
                                   latitudes=wards['Latitude'],
                                   longitudes=wards['Longitude']
                                  )

Chiyoda
Central
Minato
Shinjuku
Bunkyo
Taito
Sumida
Koto
Shinagawa
Meguro
Ota
Setagaya
Shibuya
Nakano
Suginami
Toshima
North
Arakawa
Itabashi
Nerima
Adachi
Katsushika
Edogawa


In [92]:
print(tokyo_venues.shape)
tokyo_venues.head()

(671, 7)


Unnamed: 0,Ward,Ward Latitude,Ward Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Chiyoda,35.689444,139.691667,South Observatory (東京都庁 南展望室),35.68929,139.691821,Scenic Lookout
1,Chiyoda,35.689444,139.691667,North Observatory (東京都庁 北展望室),35.689797,139.691654,Scenic Lookout
2,Chiyoda,35.689444,139.691667,"Observatories, Tokyo Metropolitan Government B...",35.689788,139.691645,Scenic Lookout
3,Chiyoda,35.689444,139.691667,Hyatt Regency Caffè,35.69113,139.691398,Café
4,Chiyoda,35.689444,139.691667,Pizzo Rante Spacca Napoli (ピッツォランテ スパッカ ナポリ),35.691318,139.692588,Italian Restaurant


Check how many venues were returned for each ward

In [93]:
tokyo_venues.groupby('Ward').count()

Unnamed: 0_level_0,Ward Latitude,Ward Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Ward,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Adachi,47,47,47,47,47,47
Arakawa,11,11,11,11,11,11
Bunkyo,75,75,75,75,75,75
Central,22,22,22,22,22,22
Chiyoda,30,30,30,30,30,30
Edogawa,6,6,6,6,6,6
Itabashi,14,14,14,14,14,14
Katsushika,14,14,14,14,14,14
Koto,24,24,24,24,24,24
Meguro,28,28,28,28,28,28


In [94]:
print('There are {} uniques categories.'.format(len(tokyo_venues['Venue Category'].unique())))

There are 150 uniques categories.


So now we have all venues in arrondissements of Paris and areas of Tokyo within few kilometers. We also know which venues exactly are in vicinity of every arrondissement or ward candidate.

This concludes the data gathering phase - we're now ready to use this data for analysis to compare 2 areas.

## Methodology <a name="methodology"></a>

In this project we will direct our efforts on detecting areas of Paris and Tokyo where both areas are sightseeing spots. We will limit our analysis to area ~250m around each center of arrondissement or ward.

In first step we have collected the required **data: location and type (category) of every venue within 250m from center**. 

Second step in our analysis will be calculation and exploration of '**venue**' across different areas of Paris and Tokyo.

## Analysis <a name="analysis"></a>

We will perform some basic explanatory data analysis.
First, we will analyze each arrondisement and each ward.

second let's count the **number of venue in every area candidate**:

### Analyze each arrondissement

In [95]:
# one hot encoding
paris_onehot = pd.get_dummies(paris_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
paris_onehot['Arrondissement'] = paris_venues['Arrondissement'] 

# move neighborhood column to the first column
fixed_columns = [paris_onehot.columns[-1]] + list(paris_onehot.columns[:-1])
paris_onehot = paris_onehot[fixed_columns]

paris_onehot.head()

Unnamed: 0,Arrondissement,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auvergne Restaurant,BBQ Joint,Bagel Shop,Bakery,Bar,Bed & Breakfast,Beer Garden,Bistro,Bookstore,Boutique,Brasserie,Brazilian Restaurant,Burger Joint,Burgundian Restaurant,Café,Cajun / Creole Restaurant,Candy Store,Ch'ti Restaurant,Cheese Shop,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Convenience Store,Cosmetics Shop,Creperie,Cultural Center,Cupcake Shop,Department Store,Dessert Shop,Diner,Discount Store,Ethiopian Restaurant,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,French Restaurant,Garden,Gastropub,Gourmet Shop,Grocery Store,Gym,Gym / Fitness Center,Health Food Store,Historic Site,History Museum,Hookah Bar,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Insurance Office,Italian Restaurant,Japanese Restaurant,Kids Store,Korean Restaurant,Latin American Restaurant,Lebanese Restaurant,Liquor Store,Lounge,Mediterranean Restaurant,Memorial Site,Men's Store,Metro Station,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Moroccan Restaurant,Movie Theater,Multiplex,Museum,Music Store,New American Restaurant,Nightclub,Office,Okonomiyaki Restaurant,Park,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Perfume Shop,Persian Restaurant,Peruvian Restaurant,Pharmacy,Pizza Place,Plaza,Pool,Pub,Record Shop,Resort,Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Savoyard Restaurant,Scandinavian Restaurant,Seafood Restaurant,Shoe Store,Shopping Mall,Snack Place,Spa,Spanish Restaurant,Supermarket,Sushi Restaurant,Tapas Restaurant,Tea Room,Thai Restaurant,Theater,Theme Park Ride / Attraction,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Venezuelan Restaurant,Vietnamese Restaurant,Wine Bar,Women's Store
0,Louvre,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Louvre,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Louvre,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0
3,Louvre,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Louvre,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [96]:
paris_onehot.shape

(463, 129)

In [97]:
paris_each_category_count = paris_onehot.groupby('Arrondissement').sum()
paris_each_category_count

Unnamed: 0_level_0,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auvergne Restaurant,BBQ Joint,Bagel Shop,Bakery,Bar,Bed & Breakfast,Beer Garden,Bistro,Bookstore,Boutique,Brasserie,Brazilian Restaurant,Burger Joint,Burgundian Restaurant,Café,Cajun / Creole Restaurant,Candy Store,Ch'ti Restaurant,Cheese Shop,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Convenience Store,Cosmetics Shop,Creperie,Cultural Center,Cupcake Shop,Department Store,Dessert Shop,Diner,Discount Store,Ethiopian Restaurant,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,French Restaurant,Garden,Gastropub,Gourmet Shop,Grocery Store,Gym,Gym / Fitness Center,Health Food Store,Historic Site,History Museum,Hookah Bar,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Insurance Office,Italian Restaurant,Japanese Restaurant,Kids Store,Korean Restaurant,Latin American Restaurant,Lebanese Restaurant,Liquor Store,Lounge,Mediterranean Restaurant,Memorial Site,Men's Store,Metro Station,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Moroccan Restaurant,Movie Theater,Multiplex,Museum,Music Store,New American Restaurant,Nightclub,Office,Okonomiyaki Restaurant,Park,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Perfume Shop,Persian Restaurant,Peruvian Restaurant,Pharmacy,Pizza Place,Plaza,Pool,Pub,Record Shop,Resort,Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Savoyard Restaurant,Scandinavian Restaurant,Seafood Restaurant,Shoe Store,Shopping Mall,Snack Place,Spa,Spanish Restaurant,Supermarket,Sushi Restaurant,Tapas Restaurant,Tea Room,Thai Restaurant,Theater,Theme Park Ride / Attraction,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Venezuelan Restaurant,Vietnamese Restaurant,Wine Bar,Women's Store
Arrondissement,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1,Unnamed: 24_level_1,Unnamed: 25_level_1,Unnamed: 26_level_1,Unnamed: 27_level_1,Unnamed: 28_level_1,Unnamed: 29_level_1,Unnamed: 30_level_1,Unnamed: 31_level_1,Unnamed: 32_level_1,Unnamed: 33_level_1,Unnamed: 34_level_1,Unnamed: 35_level_1,Unnamed: 36_level_1,Unnamed: 37_level_1,Unnamed: 38_level_1,Unnamed: 39_level_1,Unnamed: 40_level_1,Unnamed: 41_level_1,Unnamed: 42_level_1,Unnamed: 43_level_1,Unnamed: 44_level_1,Unnamed: 45_level_1,Unnamed: 46_level_1,Unnamed: 47_level_1,Unnamed: 48_level_1,Unnamed: 49_level_1,Unnamed: 50_level_1,Unnamed: 51_level_1,Unnamed: 52_level_1,Unnamed: 53_level_1,Unnamed: 54_level_1,Unnamed: 55_level_1,Unnamed: 56_level_1,Unnamed: 57_level_1,Unnamed: 58_level_1,Unnamed: 59_level_1,Unnamed: 60_level_1,Unnamed: 61_level_1,Unnamed: 62_level_1,Unnamed: 63_level_1,Unnamed: 64_level_1,Unnamed: 65_level_1,Unnamed: 66_level_1,Unnamed: 67_level_1,Unnamed: 68_level_1,Unnamed: 69_level_1,Unnamed: 70_level_1,Unnamed: 71_level_1,Unnamed: 72_level_1,Unnamed: 73_level_1,Unnamed: 74_level_1,Unnamed: 75_level_1,Unnamed: 76_level_1,Unnamed: 77_level_1,Unnamed: 78_level_1,Unnamed: 79_level_1,Unnamed: 80_level_1,Unnamed: 81_level_1,Unnamed: 82_level_1,Unnamed: 83_level_1,Unnamed: 84_level_1,Unnamed: 85_level_1,Unnamed: 86_level_1,Unnamed: 87_level_1,Unnamed: 88_level_1,Unnamed: 89_level_1,Unnamed: 90_level_1,Unnamed: 91_level_1,Unnamed: 92_level_1,Unnamed: 93_level_1,Unnamed: 94_level_1,Unnamed: 95_level_1,Unnamed: 96_level_1,Unnamed: 97_level_1,Unnamed: 98_level_1,Unnamed: 99_level_1,Unnamed: 100_level_1,Unnamed: 101_level_1,Unnamed: 102_level_1,Unnamed: 103_level_1,Unnamed: 104_level_1,Unnamed: 105_level_1,Unnamed: 106_level_1,Unnamed: 107_level_1,Unnamed: 108_level_1,Unnamed: 109_level_1,Unnamed: 110_level_1,Unnamed: 111_level_1,Unnamed: 112_level_1,Unnamed: 113_level_1,Unnamed: 114_level_1,Unnamed: 115_level_1,Unnamed: 116_level_1,Unnamed: 117_level_1,Unnamed: 118_level_1,Unnamed: 119_level_1,Unnamed: 120_level_1,Unnamed: 121_level_1,Unnamed: 122_level_1,Unnamed: 123_level_1,Unnamed: 124_level_1,Unnamed: 125_level_1,Unnamed: 126_level_1,Unnamed: 127_level_1,Unnamed: 128_level_1
Batignolles-Monceau,0,0,0,0,0,0,0,0,0,2,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,5,0,0,0,0,2,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
Bourse,0,0,0,0,1,0,0,0,0,0,1,0,0,2,1,1,0,0,0,0,0,0,0,1,0,1,1,0,0,1,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,1,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,1
Buttes-Chaumont,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0
Buttes-Montmartre,0,0,0,0,0,0,0,0,0,1,6,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,1,0,3,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0
Entrepôt,0,0,0,0,0,0,0,0,0,0,1,0,1,3,0,0,0,0,0,0,2,0,0,0,0,0,0,0,3,0,0,1,1,0,0,1,0,0,0,0,0,0,1,1,1,0,3,1,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0
Gobelins,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0
Hôtel-de-Ville,0,0,1,1,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,2,0,0,0,1,1,0,0,1,0,0,0,0,0,0,0,0,0,7,1,0,1,0,0,0,0,0,0,0,2,1,0,0,0,0,2,1,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,1,0,0,0,0,1,0,1,0,0,1,1,0,0,0,0,0,0,0,1,0,2,0,0,0,0,0,0,0,2,0
Louvre,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0
Luxembourg,1,1,1,0,0,0,0,0,1,1,0,0,0,0,1,1,0,0,0,0,1,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,3,1,0,0,1,0,0,0,0,0,0,0,4,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0
Ménilmontant,0,0,0,0,0,0,0,0,0,2,0,0,0,1,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0


In [98]:
paris_grouped = paris_onehot.groupby('Arrondissement').mean().reset_index()
paris_grouped

Unnamed: 0,Arrondissement,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auvergne Restaurant,BBQ Joint,Bagel Shop,Bakery,Bar,Bed & Breakfast,Beer Garden,Bistro,Bookstore,Boutique,Brasserie,Brazilian Restaurant,Burger Joint,Burgundian Restaurant,Café,Cajun / Creole Restaurant,Candy Store,Ch'ti Restaurant,Cheese Shop,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Convenience Store,Cosmetics Shop,Creperie,Cultural Center,Cupcake Shop,Department Store,Dessert Shop,Diner,Discount Store,Ethiopian Restaurant,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,French Restaurant,Garden,Gastropub,Gourmet Shop,Grocery Store,Gym,Gym / Fitness Center,Health Food Store,Historic Site,History Museum,Hookah Bar,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Insurance Office,Italian Restaurant,Japanese Restaurant,Kids Store,Korean Restaurant,Latin American Restaurant,Lebanese Restaurant,Liquor Store,Lounge,Mediterranean Restaurant,Memorial Site,Men's Store,Metro Station,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Moroccan Restaurant,Movie Theater,Multiplex,Museum,Music Store,New American Restaurant,Nightclub,Office,Okonomiyaki Restaurant,Park,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Perfume Shop,Persian Restaurant,Peruvian Restaurant,Pharmacy,Pizza Place,Plaza,Pool,Pub,Record Shop,Resort,Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Savoyard Restaurant,Scandinavian Restaurant,Seafood Restaurant,Shoe Store,Shopping Mall,Snack Place,Spa,Spanish Restaurant,Supermarket,Sushi Restaurant,Tapas Restaurant,Tea Room,Thai Restaurant,Theater,Theme Park Ride / Attraction,Train Station,Turkish Restaurant,Vegetarian / Vegan Restaurant,Venezuelan Restaurant,Vietnamese Restaurant,Wine Bar,Women's Store
0,Batignolles-Monceau,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.277778,0.0,0.0,0.0,0.0,0.111111,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Bourse,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.066667,0.033333,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.033333,0.033333,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.033333,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.133333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.033333,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333
2,Buttes-Chaumont,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Buttes-Montmartre,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.26087,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.043478,0.0,0.130435,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.043478,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Entrepôt,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.034483,0.103448,0.0,0.0,0.0,0.0,0.0,0.0,0.068966,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.103448,0.0,0.0,0.034483,0.034483,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.034483,0.034483,0.0,0.103448,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.068966,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0
5,Gobelins,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.285714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Hôtel-de-Ville,0.0,0.0,0.025,0.025,0.0,0.0,0.0,0.0,0.0,0.025,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.025,0.025,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.175,0.025,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.025,0.0,0.0,0.0,0.0,0.05,0.025,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.025,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.025,0.0,0.025,0.0,0.0,0.025,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0
7,Louvre,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.272727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Luxembourg,0.038462,0.038462,0.038462,0.0,0.0,0.0,0.0,0.0,0.038462,0.038462,0.0,0.0,0.0,0.0,0.038462,0.038462,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.115385,0.038462,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.153846,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Ménilmontant,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.071429,0.071429,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [99]:
paris_grouped.shape

(19, 129)

In [100]:
num_top_venues = 5

for hood in paris_grouped['Arrondissement']:
    print("----"+hood+"----")
    temp = paris_grouped[paris_grouped['Arrondissement'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Batignolles-Monceau----
                 venue  freq
0                Hotel  0.28
1  Japanese Restaurant  0.11
2               Bakery  0.11
3   Italian Restaurant  0.11
4                Plaza  0.11


----Bourse----
                 venue  freq
0    French Restaurant  0.13
1          Salad Place  0.10
2               Bistro  0.07
3  Japanese Restaurant  0.03
4  Lebanese Restaurant  0.03


----Buttes-Chaumont----
                 venue  freq
0          Supermarket   0.2
1  Japanese Restaurant   0.1
2                  Bar   0.1
3         Concert Hall   0.1
4    French Restaurant   0.1


----Buttes-Montmartre----
                      venue  freq
0                       Bar  0.26
1         French Restaurant  0.13
2                 Gastropub  0.04
3               Supermarket  0.04
4  Mediterranean Restaurant  0.04


----Entrepôt----
               venue  freq
0  French Restaurant  0.10
1        Coffee Shop  0.10
2             Bistro  0.10
3               Café  0.07
4              Hotel 

#### Put that into a *pandas* dataframe

In [101]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [102]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Arrondissement']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
arrondissement_venues_sorted = pd.DataFrame(columns=columns)
arrondissement_venues_sorted['Arrondissement'] = paris_grouped['Arrondissement']

for ind in np.arange(paris_grouped.shape[0]):
    arrondissement_venues_sorted.iloc[ind, 1:] = return_most_common_venues(paris_grouped.iloc[ind, :], num_top_venues)

arrondissement_venues_sorted

Unnamed: 0,Arrondissement,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Batignolles-Monceau,Hotel,Italian Restaurant,Plaza,Bakery,Japanese Restaurant,Diner,Bar,Grocery Store,Café,French Restaurant
1,Bourse,French Restaurant,Salad Place,Bistro,Women's Store,Dessert Shop,Office,Nightclub,Music Store,Miscellaneous Shop,Lebanese Restaurant
2,Buttes-Chaumont,Supermarket,French Restaurant,Japanese Restaurant,Concert Hall,Metro Station,Spa,Grocery Store,Burger Joint,Bar,Food Court
3,Buttes-Montmartre,Bar,French Restaurant,Mediterranean Restaurant,Park,Dessert Shop,Record Shop,Fast Food Restaurant,Café,Sandwich Place,Savoyard Restaurant
4,Entrepôt,Coffee Shop,Bistro,French Restaurant,Café,Hotel,Cosmetics Shop,Lounge,Creperie,Train Station,Museum
5,Gobelins,French Restaurant,Park,Bar,Theme Park Ride / Attraction,Chinese Restaurant,Korean Restaurant,Dessert Shop,Department Store,Diner,Food Court
6,Hôtel-de-Ville,French Restaurant,Italian Restaurant,Thai Restaurant,Wine Bar,Coffee Shop,Hostel,Arts & Crafts Store,Dessert Shop,Burgundian Restaurant,Candy Store
7,Louvre,Plaza,Art Museum,Historic Site,Italian Restaurant,Theater,Perfume Shop,Hotel,Hookah Bar,Fast Food Restaurant,Creperie
8,Luxembourg,Hotel,French Restaurant,Argentinian Restaurant,Grocery Store,New American Restaurant,Convenience Store,Ice Cream Shop,Pizza Place,Plaza,Clothing Store
9,Ménilmontant,Bakery,Italian Restaurant,Park,Movie Theater,Restaurant,Café,French Restaurant,Bookstore,Bistro,Supermarket


### Analyze each ward

In [103]:
# one hot encoding
tokyo_onehot = pd.get_dummies(tokyo_venues[['Venue Category']], prefix="", prefix_sep="")

# add ward column back to dataframe
tokyo_onehot['Ward'] = tokyo_venues['Ward'] 

# move ward column to the first column
fixed_columns = [tokyo_onehot.columns[-1]] + list(tokyo_onehot.columns[:-1])
tokyo_onehot = tokyo_onehot[fixed_columns]

tokyo_onehot.head()

Unnamed: 0,Ward,ATM,American Restaurant,Antique Shop,Arcade,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,Bar,Baseball Stadium,Bed & Breakfast,Beer Bar,Beer Garden,Bike Rental / Bike Share,Bistro,Bookstore,Boutique,Bowling Alley,Breakfast Spot,Brewery,Bridge,Buffet,Burger Joint,Bus Stop,Business Center,Cafeteria,Café,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Roaster,Coffee Shop,Comic Shop,Concert Hall,Convenience Store,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dive Bar,Donburi Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Electronics Store,Event Space,Fabric Shop,Falafel Restaurant,Fast Food Restaurant,Fish Market,Food & Drink Shop,Food Court,Fountain,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gaming Cafe,Gastropub,General Entertainment,Gift Shop,Grocery Store,Gym,Gym / Fitness Center,Halal Restaurant,Historic Site,History Museum,Hobby Shop,Hookah Bar,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Intersection,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Jazz Club,Juice Bar,Kaiseki Restaurant,Korean Restaurant,Liquor Store,Lounge,Malay Restaurant,Martial Arts Dojo,Metro Station,Monument / Landmark,Movie Theater,Multiplex,Music Store,Music Venue,Nabe Restaurant,Nightclub,Noodle House,Optical Shop,Outdoor Sculpture,Park,Pastry Shop,Pet Café,Pharmacy,Pizza Place,Platform,Playground,Plaza,Pub,Ramen Restaurant,Record Shop,Rental Car Location,Restaurant,Rock Club,Russian Restaurant,Sake Bar,Sauna / Steam Room,Scenic Lookout,Seafood Restaurant,Shabu-Shabu Restaurant,Shoe Store,Shopping Mall,Smoke Shop,Soba Restaurant,South Indian Restaurant,Spa,Spanish Restaurant,Stationery Store,Steakhouse,Supermarket,Sushi Restaurant,Taiwanese Restaurant,Takoyaki Place,Tea Room,Tempura Restaurant,Thai Restaurant,Theater,Theme Park,Tonkatsu Restaurant,Track,Trail,Udon Restaurant,Unagi Restaurant,Used Bookstore,Vegetarian / Vegan Restaurant,Video Store,Wagashi Place,Wine Bar,Yakitori Restaurant,Yoshoku Restaurant
0,Chiyoda,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Chiyoda,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Chiyoda,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Chiyoda,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Chiyoda,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [104]:
tokyo_onehot.shape

(671, 151)

In [105]:
tokyo_each_category_count = tokyo_onehot.groupby('Ward').sum()
tokyo_each_category_count

Unnamed: 0_level_0,ATM,American Restaurant,Antique Shop,Arcade,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,Bar,Baseball Stadium,Bed & Breakfast,Beer Bar,Beer Garden,Bike Rental / Bike Share,Bistro,Bookstore,Boutique,Bowling Alley,Breakfast Spot,Brewery,Bridge,Buffet,Burger Joint,Bus Stop,Business Center,Cafeteria,Café,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Roaster,Coffee Shop,Comic Shop,Concert Hall,Convenience Store,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dive Bar,Donburi Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Electronics Store,Event Space,Fabric Shop,Falafel Restaurant,Fast Food Restaurant,Fish Market,Food & Drink Shop,Food Court,Fountain,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gaming Cafe,Gastropub,General Entertainment,Gift Shop,Grocery Store,Gym,Gym / Fitness Center,Halal Restaurant,Historic Site,History Museum,Hobby Shop,Hookah Bar,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Intersection,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Jazz Club,Juice Bar,Kaiseki Restaurant,Korean Restaurant,Liquor Store,Lounge,Malay Restaurant,Martial Arts Dojo,Metro Station,Monument / Landmark,Movie Theater,Multiplex,Music Store,Music Venue,Nabe Restaurant,Nightclub,Noodle House,Optical Shop,Outdoor Sculpture,Park,Pastry Shop,Pet Café,Pharmacy,Pizza Place,Platform,Playground,Plaza,Pub,Ramen Restaurant,Record Shop,Rental Car Location,Restaurant,Rock Club,Russian Restaurant,Sake Bar,Sauna / Steam Room,Scenic Lookout,Seafood Restaurant,Shabu-Shabu Restaurant,Shoe Store,Shopping Mall,Smoke Shop,Soba Restaurant,South Indian Restaurant,Spa,Spanish Restaurant,Stationery Store,Steakhouse,Supermarket,Sushi Restaurant,Taiwanese Restaurant,Takoyaki Place,Tea Room,Tempura Restaurant,Thai Restaurant,Theater,Theme Park,Tonkatsu Restaurant,Track,Trail,Udon Restaurant,Unagi Restaurant,Used Bookstore,Vegetarian / Vegan Restaurant,Video Store,Wagashi Place,Wine Bar,Yakitori Restaurant,Yoshoku Restaurant
Ward,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1,Unnamed: 24_level_1,Unnamed: 25_level_1,Unnamed: 26_level_1,Unnamed: 27_level_1,Unnamed: 28_level_1,Unnamed: 29_level_1,Unnamed: 30_level_1,Unnamed: 31_level_1,Unnamed: 32_level_1,Unnamed: 33_level_1,Unnamed: 34_level_1,Unnamed: 35_level_1,Unnamed: 36_level_1,Unnamed: 37_level_1,Unnamed: 38_level_1,Unnamed: 39_level_1,Unnamed: 40_level_1,Unnamed: 41_level_1,Unnamed: 42_level_1,Unnamed: 43_level_1,Unnamed: 44_level_1,Unnamed: 45_level_1,Unnamed: 46_level_1,Unnamed: 47_level_1,Unnamed: 48_level_1,Unnamed: 49_level_1,Unnamed: 50_level_1,Unnamed: 51_level_1,Unnamed: 52_level_1,Unnamed: 53_level_1,Unnamed: 54_level_1,Unnamed: 55_level_1,Unnamed: 56_level_1,Unnamed: 57_level_1,Unnamed: 58_level_1,Unnamed: 59_level_1,Unnamed: 60_level_1,Unnamed: 61_level_1,Unnamed: 62_level_1,Unnamed: 63_level_1,Unnamed: 64_level_1,Unnamed: 65_level_1,Unnamed: 66_level_1,Unnamed: 67_level_1,Unnamed: 68_level_1,Unnamed: 69_level_1,Unnamed: 70_level_1,Unnamed: 71_level_1,Unnamed: 72_level_1,Unnamed: 73_level_1,Unnamed: 74_level_1,Unnamed: 75_level_1,Unnamed: 76_level_1,Unnamed: 77_level_1,Unnamed: 78_level_1,Unnamed: 79_level_1,Unnamed: 80_level_1,Unnamed: 81_level_1,Unnamed: 82_level_1,Unnamed: 83_level_1,Unnamed: 84_level_1,Unnamed: 85_level_1,Unnamed: 86_level_1,Unnamed: 87_level_1,Unnamed: 88_level_1,Unnamed: 89_level_1,Unnamed: 90_level_1,Unnamed: 91_level_1,Unnamed: 92_level_1,Unnamed: 93_level_1,Unnamed: 94_level_1,Unnamed: 95_level_1,Unnamed: 96_level_1,Unnamed: 97_level_1,Unnamed: 98_level_1,Unnamed: 99_level_1,Unnamed: 100_level_1,Unnamed: 101_level_1,Unnamed: 102_level_1,Unnamed: 103_level_1,Unnamed: 104_level_1,Unnamed: 105_level_1,Unnamed: 106_level_1,Unnamed: 107_level_1,Unnamed: 108_level_1,Unnamed: 109_level_1,Unnamed: 110_level_1,Unnamed: 111_level_1,Unnamed: 112_level_1,Unnamed: 113_level_1,Unnamed: 114_level_1,Unnamed: 115_level_1,Unnamed: 116_level_1,Unnamed: 117_level_1,Unnamed: 118_level_1,Unnamed: 119_level_1,Unnamed: 120_level_1,Unnamed: 121_level_1,Unnamed: 122_level_1,Unnamed: 123_level_1,Unnamed: 124_level_1,Unnamed: 125_level_1,Unnamed: 126_level_1,Unnamed: 127_level_1,Unnamed: 128_level_1,Unnamed: 129_level_1,Unnamed: 130_level_1,Unnamed: 131_level_1,Unnamed: 132_level_1,Unnamed: 133_level_1,Unnamed: 134_level_1,Unnamed: 135_level_1,Unnamed: 136_level_1,Unnamed: 137_level_1,Unnamed: 138_level_1,Unnamed: 139_level_1,Unnamed: 140_level_1,Unnamed: 141_level_1,Unnamed: 142_level_1,Unnamed: 143_level_1,Unnamed: 144_level_1,Unnamed: 145_level_1,Unnamed: 146_level_1,Unnamed: 147_level_1,Unnamed: 148_level_1,Unnamed: 149_level_1,Unnamed: 150_level_1
Adachi,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,2,0,0,0,2,0,0,5,0,0,0,0,0,1,0,1,0,2,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,9,0,0,1,0,0,3,0,0,0,1,0,0,0,2,1,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,1,0,1,0,0,0,0
Arakawa,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,1,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0
Bunkyo,0,0,0,0,0,0,0,3,0,12,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,2,1,1,0,1,1,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,1,0,0,1,0,1,2,3,1,0,0,0,1,1,0,0,0,0,1,2,0,0,0,1,2,0,1,0,0,1,0,0,0,0,0,1,0,0,0,0,4,1,5,0,0,0,0,0,0,1,2,0,1,0,0,0,0,1,0,0,0,0,3,1,0,2,0,0,1,1,0,0,0,0,1,2,1
Central,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,1,0,0,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,4,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
Chiyoda,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,1,0,0,2,2,0,0,0,2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,1,0,0,1,1,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,0,1,0,0,0,0,0,0,0,0,2,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0
Edogawa,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
Itabashi,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,2,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0
Katsushika,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,2,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,1,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
Koto,0,0,0,0,1,0,0,0,2,0,0,0,0,1,0,0,0,0,0,0,2,0,0,0,1,0,0,0,0,0,0,0,0,0,0,2,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,1,0,0,0
Meguro,0,0,0,0,0,0,1,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,4,0,0,0,0,1,0,0,3,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,2,0,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0


#### Next, let's group rows by ward and by taking the mean of the frequency of occurrence of each category

In [106]:
tokyo_grouped = tokyo_onehot.groupby('Ward').mean().reset_index()
tokyo_grouped

Unnamed: 0,Ward,ATM,American Restaurant,Antique Shop,Arcade,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,Bar,Baseball Stadium,Bed & Breakfast,Beer Bar,Beer Garden,Bike Rental / Bike Share,Bistro,Bookstore,Boutique,Bowling Alley,Breakfast Spot,Brewery,Bridge,Buffet,Burger Joint,Bus Stop,Business Center,Cafeteria,Café,Chinese Restaurant,Clothing Store,Cocktail Bar,Coffee Roaster,Coffee Shop,Comic Shop,Concert Hall,Convenience Store,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dive Bar,Donburi Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Electronics Store,Event Space,Fabric Shop,Falafel Restaurant,Fast Food Restaurant,Fish Market,Food & Drink Shop,Food Court,Fountain,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gaming Cafe,Gastropub,General Entertainment,Gift Shop,Grocery Store,Gym,Gym / Fitness Center,Halal Restaurant,Historic Site,History Museum,Hobby Shop,Hookah Bar,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Intersection,Italian Restaurant,Japanese Curry Restaurant,Japanese Restaurant,Jazz Club,Juice Bar,Kaiseki Restaurant,Korean Restaurant,Liquor Store,Lounge,Malay Restaurant,Martial Arts Dojo,Metro Station,Monument / Landmark,Movie Theater,Multiplex,Music Store,Music Venue,Nabe Restaurant,Nightclub,Noodle House,Optical Shop,Outdoor Sculpture,Park,Pastry Shop,Pet Café,Pharmacy,Pizza Place,Platform,Playground,Plaza,Pub,Ramen Restaurant,Record Shop,Rental Car Location,Restaurant,Rock Club,Russian Restaurant,Sake Bar,Sauna / Steam Room,Scenic Lookout,Seafood Restaurant,Shabu-Shabu Restaurant,Shoe Store,Shopping Mall,Smoke Shop,Soba Restaurant,South Indian Restaurant,Spa,Spanish Restaurant,Stationery Store,Steakhouse,Supermarket,Sushi Restaurant,Taiwanese Restaurant,Takoyaki Place,Tea Room,Tempura Restaurant,Thai Restaurant,Theater,Theme Park,Tonkatsu Restaurant,Track,Trail,Udon Restaurant,Unagi Restaurant,Used Bookstore,Vegetarian / Vegan Restaurant,Video Store,Wagashi Place,Wine Bar,Yakitori Restaurant,Yoshoku Restaurant
0,Adachi,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.021277,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.021277,0.042553,0.0,0.0,0.0,0.042553,0.0,0.0,0.106383,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.021277,0.0,0.042553,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.042553,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.06383,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.191489,0.0,0.0,0.021277,0.0,0.0,0.06383,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.042553,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.021277,0.0,0.021277,0.0,0.0,0.0,0.0
1,Arakawa,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.363636,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.090909,0.090909,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Bunkyo,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.16,0.0,0.0,0.0,0.0,0.0,0.0,0.026667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026667,0.013333,0.013333,0.0,0.013333,0.013333,0.0,0.0,0.0,0.0,0.013333,0.0,0.0,0.0,0.0,0.013333,0.0,0.0,0.0,0.0,0.0,0.013333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013333,0.0,0.013333,0.0,0.0,0.013333,0.0,0.0,0.013333,0.0,0.013333,0.026667,0.04,0.013333,0.0,0.0,0.0,0.013333,0.013333,0.0,0.0,0.0,0.0,0.013333,0.026667,0.0,0.0,0.0,0.013333,0.026667,0.0,0.013333,0.0,0.0,0.013333,0.0,0.0,0.0,0.0,0.0,0.013333,0.0,0.0,0.0,0.0,0.053333,0.013333,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.013333,0.026667,0.0,0.013333,0.0,0.0,0.0,0.0,0.013333,0.0,0.0,0.0,0.0,0.04,0.013333,0.0,0.026667,0.0,0.0,0.013333,0.013333,0.0,0.0,0.0,0.0,0.013333,0.026667,0.013333
3,Central,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.045455,0.0,0.0,0.0,0.045455,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Chiyoda,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.033333,0.0,0.0,0.066667,0.066667,0.0,0.0,0.0,0.066667,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.033333,0.033333,0.0,0.0,0.033333,0.033333,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Edogawa,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Itabashi,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.071429,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.071429,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.214286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Katsushika,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Koto,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.041667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.0,0.0
9,Meguro,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.107143,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.071429,0.0,0.107143,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0


In [107]:
tokyo_grouped.shape

(23, 151)

#### Let's print each ward along with the top 5 most common venues

In [108]:
num_top_venues = 5

for hood in tokyo_grouped['Ward']:
    print("----"+hood+"----")
    temp = tokyo_grouped[tokyo_grouped['Ward'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Adachi----
                 venue  freq
0     Ramen Restaurant  0.19
1    Convenience Store  0.11
2  Japanese Restaurant  0.06
3             Sake Bar  0.06
4            Drugstore  0.04


----Arakawa----
               venue  freq
0  Convenience Store  0.36
1               Park  0.18
2              Trail  0.09
3           Pharmacy  0.09
4   Ramen Restaurant  0.09


----Bunkyo----
                 venue  freq
0                  Bar  0.16
1             Sake Bar  0.07
2            Rock Club  0.05
3  Japanese Restaurant  0.04
4            BBQ Joint  0.04


----Central----
                 venue  freq
0  Japanese Restaurant  0.18
1    Convenience Store  0.18
2                  ATM  0.05
3        Historic Site  0.05
4     Ramen Restaurant  0.05


----Chiyoda----
                 venue  freq
0       Scenic Lookout  0.10
1          Coffee Shop  0.07
2  Japanese Restaurant  0.07
3   Chinese Restaurant  0.07
4                 Café  0.07


----Edogawa----
               venue  freq
0        Su

#### Put that into a *pandas* dataframe

In [109]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

Now let's create the new dataframe and display the top 10 venues for each ward.

In [110]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Ward']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
wards_venues_sorted = pd.DataFrame(columns=columns)
wards_venues_sorted['Ward'] = tokyo_grouped['Ward']

for ind in np.arange(tokyo_grouped.shape[0]):
    wards_venues_sorted.iloc[ind, 1:] = return_most_common_venues(tokyo_grouped.iloc[ind, :], num_top_venues)

wards_venues_sorted

Unnamed: 0,Ward,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Adachi,Ramen Restaurant,Convenience Store,Sake Bar,Japanese Restaurant,Soba Restaurant,Grocery Store,Coffee Shop,Chinese Restaurant,Drugstore,Italian Restaurant
1,Arakawa,Convenience Store,Park,Ramen Restaurant,Hostel,Pizza Place,Pharmacy,Trail,Drugstore,Falafel Restaurant,Fabric Shop
2,Bunkyo,Bar,Sake Bar,Rock Club,Japanese Restaurant,Thai Restaurant,BBQ Joint,Tonkatsu Restaurant,Japanese Curry Restaurant,Bookstore,Soba Restaurant
3,Central,Convenience Store,Japanese Restaurant,Tempura Restaurant,Historic Site,Indian Restaurant,Gastropub,Food Court,Monument / Landmark,Ramen Restaurant,Coffee Shop
4,Chiyoda,Scenic Lookout,Japanese Restaurant,Café,Chinese Restaurant,Coffee Shop,Steakhouse,Fountain,Hotel Bar,Hotel,Park
5,Edogawa,Supermarket,Convenience Store,Bakery,Electronics Store,Sushi Restaurant,History Museum,Dumpling Restaurant,Fish Market,Fast Food Restaurant,Falafel Restaurant
6,Itabashi,Park,Chinese Restaurant,Grocery Store,Bus Stop,Convenience Store,Indian Restaurant,Intersection,Italian Restaurant,Concert Hall,Noodle House
7,Katsushika,Bus Stop,BBQ Joint,Diner,Breakfast Spot,Music Venue,Chinese Restaurant,Bakery,Park,Donburi Restaurant,Intersection
8,Koto,French Restaurant,Brewery,Convenience Store,Noodle House,Bakery,Theme Park,Event Space,Steakhouse,Beer Garden,Japanese Restaurant
9,Meguro,Convenience Store,Donburi Restaurant,Japanese Restaurant,BBQ Joint,Italian Restaurant,Ramen Restaurant,Coffee Shop,Cafeteria,Pizza Place,Steakhouse


## Results and Discussion <a name="results"></a>

Our analysis shows that there is a great number of venues in Paris and in Tokyo.

After directing our attention to this more narrow area of interest covering (in radius of 250m from center in Paris and in radius of 250m from city hall in Tokyo).

As a result of analysis, we finds that there are many french restaurants in Paris although there are many convenience stores, Japanese restaurants, ramen restaurants in Japan. We see that two countries focus on culutural venues.

## Conclusion <a name="conclusion"></a>

Purpose of this project was to compare two areas, Paris and Tokyo and to identify two areas close to center with the number of venus in order to aid stakeholders interested in visiting or investing venues in Paris and Tokyo.

We compared 20 arrondissements in Paris and 23 wards in Tokyo.

There are lot of convenience stores in Japan, and it would be interesting for investers in France to look at why they are successful in Japan.
It will be interesting to do a marketing research to get the key of success. There are lot of bakeries in France not in Japan, then it's still good idea for french bakers to go to Japan to start business. 
