# Capstone project: Find the most suitable location for a Japanese restaurant in Paris


## Definition of the problem

I have been hired by a Japanese chef who would like to open a new restaurant in Paris.
I must find out where the most suitable location would be.
After thinking about what makes a good location for a Japanese restaurant, I came up with the following ideas:
- As people in Paris mostly use public transportation, the restaurant should not be too far from a metro station. I decided to investigate only the center of Paris, which is zone 1.
- Some areas may have more Japanese immigrants than others.
- We could also try to get data about the median salary in each district of Paris.
- Japanese restaurants may also do well when they are close to certain kinds of shops, parks, and so on.

## Data that I will use to solve this problem
My approach is to first gather as much data as I can:
- The metro stations of Paris and their geolocation. We can use https://en.wikipedia.org/wiki/List_of_Paris_M%C3%A9tro_stations
- The Japanese restaurants already present around those metro stations, using the Foursquare API.
- The population of each district of Paris. We can use: https://en.wikipedia.org/wiki/Demographics_of_Paris
- The median salary of each district: https://www.apur.org/observatoires_apur/familles/obs/3_3revenus/revenus_tab.htm


## Retrieving the different data by scraping and cleaning it

### Population Data
We get the population data from Wikipedia.

In [260]:
import pandas as pd
import numpy as np
from geopy.geocoders import Nominatim

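# The third table on the page holds the figures per arrondissement
population=pd.read_html('https://en.wikipedia.org/wiki/Demographics_of_Paris')[2]
# Row 1 contains the real column names
population.columns=list(population.iloc[1])
# Keep the 20 arrondissement rows; copy so the assignments below do not act on a slice
population=population[2:22].copy()
population['Arrondissement']=population['Arrondissement'].astype('int64')
population['Population']=population['Population'].astype('int64')
population['Population per km2']=population['Population per km2'].astype('int64')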
population



Unnamed: 0,Arrondissement,Area (km2),Population,Population per km2
2,1,1.826,17268,9457
3,2,0.992,22558,22740
4,3,1.171,36727,31364
5,4,1.601,28068,17532
6,5,2.541,61080,24038
7,6,2.154,44154,20499
8,7,4.088,58166,14228
9,8,3.881,39409,10154
10,9,2.179,60293,27670
11,10,2.892,95436,33000


### Salary Data by District
We get it from the APUR website, keep only the columns we need, clean the District column, and convert it to int64.


In [257]:
salaire_median=pd.read_html('https://www.apur.org/observatoires_apur/familles/obs/3_3revenus/revenus_tab.htm',skiprows=5)[0]
salaire_median=salaire_median.iloc[:-3,1:3]
salaire_median.rename(columns={1:'District',2:'Median salary'}, inplace=True)
salaire_median['District']=list(map(lambda x: x.split('e')[0],salaire_median['District']))
salaire_median['District']=salaire_median['District'].astype('int64')

salaire_median['Median salary']=list(map(lambda x: ''.join(x.split(' ')),salaire_median['Median salary']))
salaire_median['Median salary']=salaire_median['Median salary'].astype('int64')
salaire_median

Unnamed: 0,District,Median salary
0,1,47561
1,2,31413
2,3,38404
3,4,41225
4,5,52651
5,6,70965
6,7,77759
7,8,73493
8,9,44895
9,10,24950
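
The cleaning step above can be sketched on a few raw strings. This is only an illustration: it assumes the source table writes districts as `1er`, `2e`, ... and salaries with a space as thousands separator, which is what the `split` calls in the cell suggest. The sample strings and the helper names `parse_district` and `parse_salary` are hypothetical and not part of the notebook.

```python
import re

def parse_district(label):
    """Return the leading arrondissement number of a label such as '1er' or '10e'."""
    match = re.match(r"\s*(\d+)", label)
    if match is None:
        raise ValueError(f"no district number in {label!r}")
    return int(match.group(1))

def parse_salary(text):
    """Remove every whitespace character (non-breaking spaces included) and convert to int."""
    return int(re.sub(r"\s+", "", text))

# Assumed raw values, written the way the source table appears to format them.
print(parse_district("1er"), parse_district("10e"))
print(parse_salary("47 561"))
```

Matching the leading digits is a little more defensive than splitting on the letter `e`, and stripping all whitespace also covers non-breaking spaces, which scraped French tables sometimes contain.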


In [261]:
paris_demographic= pd.merge(population, salaire_median, left_on='Arrondissement',right_on='District')
paris_demographic.drop(['District'], axis=1, inplace=True)
paris_demographic['Salary normalisation']=(paris_demographic['Median salary']-paris_demographic['Median salary'].min())/(paris_demographic['Median salary'].max()-paris_demographic['Median salary'].min())
paris_demographic['Population normalisation']=(paris_demographic['Population']-paris_demographic['Population'].min())/(paris_demographic['Population'].max()-paris_demographic['Population'].min())


paris_demographic

Unnamed: 0,Arrondissement,Area (km2),Population,Population per km2,Median salary,Salary normalisation,Population normalisation
0,1,1.826,17268,9457,47561,0.48457,0.0
1,2,0.992,22558,22740,31413,0.208951,0.023674
2,3,1.171,36727,31364,38404,0.328275,0.087082
3,4,1.601,28068,17532,41225,0.376425,0.048332
4,5,2.541,61080,24038,52651,0.571448,0.196066
5,6,2.154,44154,20499,70965,0.884038,0.12032
6,7,4.088,58166,14228,77759,1.0,0.183026
7,8,3.881,39409,10154,73493,0.927186,0.099085
8,9,2.179,60293,27670,44895,0.439066,0.192544
9,10,2.892,95436,33000,24950,0.098638,0.349815
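
The two normalisation columns use min-max scaling, which maps the smallest value to 0 and the largest to 1. A minimal sketch in plain Python follows; the function name `min_max` is hypothetical, and the sample uses only three salaries from the table, so the results differ from the table, where the minimum and maximum are taken over all 20 arrondissements.

```python
def min_max(values):
    """Rescale values linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical, so the range is zero")
    return [(v - lo) / (hi - lo) for v in values]

# Median salaries of arrondissements 1, 2 and 7, taken from the table above.
sample = [47561, 31413, 77759]
scaled = min_max(sample)
print([round(s, 3) for s in scaled])
```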


### List of the metro in Paris center
We get the data from Wikipedia and keep only the stations in zone 1 whose district is an arrondissement numbered 1 to 20 (not a commune). If a station belongs to two districts, we assign it to the first one listed.

In [131]:
import re
paris_metro=pd.read_html('https://en.wikipedia.org/wiki/List_of_Paris_M%C3%A9tro_stations')[0]
paris_metro=paris_metro[['Station','Zone','Arrondissementor Commune']]
paris_metro=paris_metro[paris_metro['Zone']=='1']
paris_metro.rename(columns={'Arrondissementor Commune':'District'}, inplace=True)

#I have decided to take only the first district that appears in the list
paris_metro['District']=list(map(lambda x: x.split(', ')[0],paris_metro['District']))

#Delete metro station that does not have a district number
paris_metro=paris_metro[paris_metro.apply(lambda x: re.match('[0-9]+',x['District']), axis=1).notna()]

paris_metro

Unnamed: 0,Station,Zone,District
0,Abbesses,1,18
1,Alésia,1,14
2,Alexandre Dumas,1,11
3,Alma – Marceau,1,16
5,Anvers,1,9
...,...,...,...
296,Victor Hugo,1,16
300,Villiers,1,8
301,Volontaires,1,15
302,Voltaire,1,11
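
The district rule described above (keep the first district listed, drop communes) can be sketched on a few cell values. The cells below are hypothetical examples of what the Wikipedia column may contain, and `first_district` is an illustrative helper, not a function from the notebook.

```python
import re

def first_district(cell):
    """Return the first arrondissement number in a cell, or None for a commune name."""
    first = cell.split(", ")[0].strip()
    match = re.match(r"\d+", first)
    return int(match.group()) if match else None

# Hypothetical cell values: one district, two districts, and a commune.
cells = ["18", "8, 16", "Montreuil"]
print([first_district(c) for c in cells])
```

Returning `None` for communes makes the filtering step explicit: rows without a district number can be dropped with a simple null check.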


We now get the GPS coordinates of those metro stations and add them to our dataframe.

In [140]:
from time import sleep
locator = Nominatim(user_agent='myGeocoder2')
list_long=list()
list_lat=list()

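for station in paris_metro['Station']:
    sleep(1)  # pause between requests to respect the geocoder's rate limit
    location = locator.geocode(station+', Paris, France')
    # geocode() returns None when nothing matches, so guard before reading the coordinates
    list_long.append(location.longitude if location is not None else np.nan)
    list_lat.append(location.latitude if location is not None else np.nan)
paris_metro['Longitude']=list_long
paris_metro['Latitude']=list_lat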
paris_metro

Abbesses, Rue des Abbesses, Quartier des Grandes-Carrières, Paris 18e Arrondissement, Paris, Île-de-France, France métropolitaine, 75018, France
Alésia, Place Victor et Hélène Basch, Quartier du Petit-Montrouge, Paris 14e Arrondissement, Paris, Île-de-France, France métropolitaine, 75014, France
Alexandre Dumas, Boulevard de Charonne, Quartier Sainte-Marguerite, Paris 11e Arrondissement, Paris, Île-de-France, France métropolitaine, 75011, France
Alma-Marceau, Avenue du Président Wilson, Quartier des Champs-Élysées, Paris 8e Arrondissement, Paris, Île-de-France, France métropolitaine, 75008, France
Anvers, Boulevard Marguerite de Rochechouart, Quartier de Rochechouart, Paris 9e Arrondissement, Paris, Île-de-France, France métropolitaine, 75009, France
Argentine, 36;37, Avenue de la Grande Armée, Quartier des Ternes, Paris 17e Arrondissement, Paris, Île-de-France, France métropolitaine, 75116, France
Arts et Métiers, Rue de Turbigo, Quartier des Arts-et-Métiers, Paris 3e Arrondissement, 

Danube, Rue David d'Angers, Quartier d'Amérique, Paris 19e Arrondissement, Paris, Île-de-France, France métropolitaine, 75019, France
Daumesnil, Place Félix Éboué, Quartier de Picpus, Paris 12e Arrondissement, Paris, Île-de-France, France métropolitaine, 75012, France
Denfert-Rochereau, Avenue du Colonel Henri Rol Tanguy, Quartier du Montparnasse, Paris 14e Arrondissement, Paris, Île-de-France, France métropolitaine, 75014, France
Dugommier, Boulevard de Reuilly, Quartier de Bercy, Paris 12e Arrondissement, Paris, Île-de-France, France métropolitaine, 75012, France
Dupleix, Boulevard de Grenelle, Quartier de Grenelle, Paris 15e Arrondissement, Paris, Île-de-France, France métropolitaine, 75015, France
Duroc, Rue de Sèvres, Quartier de l'École Militaire, Paris 7e Arrondissement, Paris, Île-de-France, France métropolitaine, 75007, France
École Militaire, promenade Yehudi Menuhin, Quartier de l'École Militaire, Paris 7e Arrondissement, Paris, Île-de-France, France métropolitaine, 75007, F

Église de la Madeleine, Place de la Madeleine, Quartier de la Madeleine, Paris 8e Arrondissement, Paris, Île-de-France, France métropolitaine, 75008, France
Maison Blanche, Avenue d'Italie, Cité Florale, Quartier de la Maison-Blanche, Paris 13e Arrondissement, Paris, Île-de-France, France métropolitaine, 75013, France
Malesherbes, Avenue de Villiers, Quartier de la Plaine-de-Monceau, Paris 17e Arrondissement, Paris, Île-de-France, France métropolitaine, 75017, France
Maraîchers, Rue des Pyrénées, Saint-Blaise, Quartier de Charonne, Paris 20e Arrondissement, Paris, Île-de-France, France métropolitaine, 75020, France
Marcadet - Poissonniers, Boulevard Barbès, Château Rouge, Quartier de Clignancourt, Paris 18e Arrondissement, Paris, Île-de-France, France métropolitaine, 75018, France
Marx Dormoy, Rue de la Chapelle, Quartier de la Chapelle, Paris 18e Arrondissement, Paris, Île-de-France, France métropolitaine, 75018, France
Maubert - Mutualité, Boulevard Saint-Germain, Quartier de la Sorb

Porte de la Chapelle, Rue de la Chapelle, Quartier de la Chapelle, Paris 18e Arrondissement, Paris, Île-de-France, France métropolitaine, 75018, France
Porte de la Villette, Boulevard Macdonald, Quartier du Pont-de-Flandre, Paris 19e Arrondissement, Paris, Île-de-France, France métropolitaine, 75019, France
Porte de Montreuil, Avenue de la Porte de Montreuil, Saint-Blaise, Quartier de Charonne, Paris 20e Arrondissement, Paris, Île-de-France, France métropolitaine, 75020, France
Porte de Pantin, Avenue Jean Jaurès, Quartier d'Amérique, Paris 19e Arrondissement, Paris, Île-de-France, France métropolitaine, 75019, France
Porte de Saint-Cloud, Place de la Porte de Saint-Cloud, Hameau Boileau, Quartier d'Auteuil, Paris 16e Arrondissement, Paris, Île-de-France, France métropolitaine, 75016, France
Porte de Saint-Ouen, Avenue de Saint-Ouen, Quartier des Grandes-Carrières, Paris 18e Arrondissement, Paris, Île-de-France, France métropolitaine, 75018, France
Porte de Vanves, Boulevard Brune, Qua

Sully - Morland, Quai des Célestins, Quartier de l'Arsenal, Paris 4e Arrondissement, Paris, Île-de-France, France métropolitaine, 75004, France
Télégraphe, Rue de Belleville, Quartier Saint-Fargeau, Paris 20e Arrondissement, Paris, Île-de-France, France métropolitaine, 75020, France
Temple, Rue de Turbigo, Quartier des Arts-et-Métiers, Paris 3e Arrondissement, Paris, Île-de-France, France métropolitaine, 75003, France
Ternes, Boulevard de Courcelles, Quartier du Faubourg-du-Roule, Paris 17e Arrondissement, Paris, Île-de-France, France métropolitaine, 75008, France
Tolbiac, Avenue d'Italie, Cité Florale, Quartier de la Maison-Blanche, Paris 13e Arrondissement, Paris, Île-de-France, France métropolitaine, 75013, France
Trinité d'Estienne d'Orves, Rue de Châteaudun, Quartier de la Chaussée-d'Antin, Paris 9e Arrondissement, Paris, Île-de-France, France métropolitaine, 75009, France
Trocadéro, Place du Trocadéro et du 11 Novembre 1918, Quartier de Chaillot, Paris 16e Arrondissement, Paris, 

Unnamed: 0,Station,Zone,District,Longitude,Latitude
0,Abbesses,1,18,2.337929,48.884568
1,Alésia,1,14,2.327041,48.828032
2,Alexandre Dumas,1,11,2.394498,48.856306
3,Alma – Marceau,1,16,2.300626,48.864969
5,Anvers,1,9,2.344257,48.882880
...,...,...,...,...,...
296,Victor Hugo,1,16,2.286039,48.869956
300,Villiers,1,8,2.314905,48.881119
301,Volontaires,1,15,2.308003,48.841519
302,Voltaire,1,11,2.379804,48.858158
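
The addresses returned by the geocoder contain the arrondissement, which allows a cross-check against the District column. The sketch below uses two addresses copied from the output above; the helper name `arrondissement_from_address` is hypothetical, and the pattern assumes the first arrondissement would be written `1er`.

```python
import re

def arrondissement_from_address(address):
    """Extract the arrondissement number from a Nominatim-style address, or None."""
    match = re.search(r"Paris (\d+)(?:er|e) Arrondissement", address)
    return int(match.group(1)) if match else None

# Each address is paired with the District value the station table assigns to it.
checks = [
    ("Abbesses, Rue des Abbesses, Quartier des Grandes-Carrières, "
     "Paris 18e Arrondissement, Paris, Île-de-France, France métropolitaine, 75018, France", 18),
    ("Alma-Marceau, Avenue du Président Wilson, Quartier des Champs-Élysées, "
     "Paris 8e Arrondissement, Paris, Île-de-France, France métropolitaine, 75008, France", 16),
]
for address, table_district in checks:
    found = arrondissement_from_address(address)
    status = "ok" if found == table_district else "mismatch"
    print(address.split(",")[0], found, table_district, status)
```

The Alma-Marceau address is geocoded in the 8th arrondissement while the table lists district 16, which is expected for a station on a district boundary but worth keeping in mind when merging with the demographic data.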


### Using the Foursquare API to explore what is around the metro station
Now let's use the Foursquare API to explore what kind of places are popular around those metro stations. First let's create a function that looks around a set of locations and returns a dataframe of places.

In [142]:
import requests
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID (keep real credentials out of shared notebooks)
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

LIMIT=10

def get_places_around(neighborhoods, lats,lngs, radius=200):
    list_places=[]
    for neighborhood, lat, lng in zip(neighborhoods,lats,lngs):
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
                    CLIENT_ID, 
                    CLIENT_SECRET, 
                    VERSION, 
                    lat, 
                    lng, 
                    radius, 
                    LIMIT)
        response=requests.get(url).json()
        

        items=response['response']['groups'][0]['items']
        for item in items:
            list_places.append([neighborhood, lat, lng, item['venue']['name'], item['venue']['location']['lat'], item['venue']['location']['lng'], item['venue']['categories'][0]['name']])
    frame_places=pd.DataFrame(list_places)
    frame_places.columns=['Station','Latitude','Longitude','Name','Place latitude','Place longitude','Categories']
    return frame_places
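
The function above indexes straight into the JSON reply, so a venue without a category or an empty reply would raise an error. A more tolerant way to flatten one reply could look like the sketch below. It assumes the same nesting that `get_places_around` reads (`response`, `groups`, `items`, `venue`); the helper name `parse_explore_items` and the payload are made up for illustration and no request is sent.

```python
def parse_explore_items(payload, station, lat, lng):
    """Flatten one venues/explore payload into rows, tolerating missing fields."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        categories = venue.get("categories") or [{}]
        rows.append([
            station, lat, lng,
            venue.get("name"),
            location.get("lat"), location.get("lng"),
            categories[0].get("name", "Unknown"),
        ])
    return rows

# Hand-written stand-in for an API reply; the second venue has no category.
fake_payload = {"response": {"groups": [{"items": [
    {"venue": {"name": "Guilo Guilo",
               "location": {"lat": 48.885942, "lng": 2.337048},
               "categories": [{"name": "Japanese Restaurant"}]}},
    {"venue": {"name": "Unnamed spot",
               "location": {"lat": 48.8850, "lng": 2.3370},
               "categories": []}},
]}]}}

rows = parse_explore_items(fake_payload, "Abbesses", 48.884568, 2.337929)
print(rows)
print(parse_explore_items({"response": {}}, "Abbesses", 48.884568, 2.337929))
```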

In [143]:
frame_places=get_places_around(paris_metro['Station'],paris_metro['Latitude'],paris_metro['Longitude'] )
frame_places.head()

Unnamed: 0,Station,Latitude,Longitude,Name,Place latitude,Place longitude,Categories
0,Abbesses,48.884568,2.337929,Al Caratello,48.885248,2.336002,Italian Restaurant
1,Abbesses,48.884568,2.337929,Place des Abbesses,48.884406,2.338538,Plaza
2,Abbesses,48.884568,2.337929,Amorino,48.885056,2.336719,Ice Cream Shop
3,Abbesses,48.884568,2.337929,Chez Toinette,48.884224,2.336904,French Restaurant
4,Abbesses,48.884568,2.337929,Guilo Guilo,48.885942,2.337048,Japanese Restaurant
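
To get a feel for the 200 m search radius, the distance between a station and a returned place can be computed with the haversine formula. This is a small sketch using coordinates from the table above; `haversine_m` is a hypothetical helper and the Earth is approximated as a sphere of radius 6,371 km.

```python
from math import asin, cos, radians, sin, sqrt

def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres on a sphere of radius 6,371 km."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * 6_371_000 * asin(sqrt(a))

# Abbesses station and the venue "Guilo Guilo", both taken from the table above.
distance = haversine_m(48.884568, 2.337929, 48.885942, 2.337048)
print(round(distance), "m")
```

The result falls below 200 m, which is consistent with the radius passed to the API.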


In [144]:
frame_places.shape

(2151, 7)

Let's now keep only the Japanese restaurants.

In [150]:
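# Attach Zone and District to each place; drop the duplicated station coordinates (last two columns)
frame_places_merge=pd.merge(frame_places,paris_metro,on='Station').iloc[:,:-2]
# Build the mask from the merged frame itself so that rows and mask stay aligned
japanese_restaurant=frame_places_merge[frame_places_merge['Categories']=='Japanese Restaurant']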

### First visualisation
For visualisation, we display the Japanese restaurants in blue and the metro stations in green.

In [174]:
import folium
restaurant_map=folium.Map(location=[48.885942, 2.337048],zoom_start=12)
for rest_lat, rest_lgn in zip(japanese_restaurant['Place latitude'],japanese_restaurant['Place longitude']):
    folium.CircleMarker(
        location=[rest_lat, rest_lgn],
        radius=5,
        popup='japanese restaurant').add_to(restaurant_map)
    
for rest_lat, rest_lgn in zip(paris_metro['Latitude'],paris_metro['Longitude']):
    folium.CircleMarker(
        location=[rest_lat, rest_lgn],
        radius=3,
        color='green',
        popup='metro station').add_to(restaurant_map)
restaurant_map

In [176]:
japanese_restaurant.shape

(58, 9)

### Popular places around the Japanese restaurants, using the Foursquare API again


In [177]:
places_around_japanese=get_places_around(japanese_restaurant['Name'],japanese_restaurant['Place latitude'],japanese_restaurant['Place longitude'] )


Unnamed: 0,Station,Latitude,Longitude,Name,Place latitude,Place longitude,Categories
0,Guilo Guilo,48.885942,2.337048,Boulangerie Alexine,48.886141,2.334477,Bakery
1,Guilo Guilo,48.885942,2.337048,La Boîte aux Lettres,48.886841,2.338186,Bistro
2,Guilo Guilo,48.885942,2.337048,Al Caratello,48.885248,2.336002,Italian Restaurant
3,Guilo Guilo,48.885942,2.337048,Amorino,48.885056,2.336719,Ice Cream Shop
4,Guilo Guilo,48.885942,2.337048,Chez Toinette,48.884224,2.336904,French Restaurant
...,...,...,...,...,...,...,...
552,Ebis,48.865375,2.332310,Nolinski,48.865367,2.334584,Hotel
553,Ebis,48.865375,2.332310,Brasserie Réjane,48.865486,2.334824,Restaurant
554,Ebis,48.865375,2.332310,Le Roch Hotel & Spa Paris,48.866200,2.332995,Hotel
555,Ebis,48.865375,2.332310,Hôtel Le Pradey,48.864459,2.331654,Hotel


In [179]:
places_around_japanese.rename(columns={'Station':'Restaurant name'}, inplace=True)
places_around_japanese

Unnamed: 0,Restaurant name,Latitude,Longitude,Name,Place latitude,Place longitude,Categories
0,Guilo Guilo,48.885942,2.337048,Boulangerie Alexine,48.886141,2.334477,Bakery
1,Guilo Guilo,48.885942,2.337048,La Boîte aux Lettres,48.886841,2.338186,Bistro
2,Guilo Guilo,48.885942,2.337048,Al Caratello,48.885248,2.336002,Italian Restaurant
3,Guilo Guilo,48.885942,2.337048,Amorino,48.885056,2.336719,Ice Cream Shop
4,Guilo Guilo,48.885942,2.337048,Chez Toinette,48.884224,2.336904,French Restaurant
...,...,...,...,...,...,...,...
552,Ebis,48.865375,2.332310,Nolinski,48.865367,2.334584,Hotel
553,Ebis,48.865375,2.332310,Brasserie Réjane,48.865486,2.334824,Restaurant
554,Ebis,48.865375,2.332310,Le Roch Hotel & Spa Paris,48.866200,2.332995,Hotel
555,Ebis,48.865375,2.332310,Hôtel Le Pradey,48.864459,2.331654,Hotel


### Clustering the data into 4 clusters
Here we use the k-means algorithm to look for 4 logical clusters of Japanese restaurants, based on the places that surround them.
First we transform the Categories column into one-hot vectors, then we group the rows by restaurant and take the mean.

In [195]:
places_around_japanese2=places_around_japanese[['Restaurant name','Categories']]
one_hot_places_around_japanese2=pd.get_dummies(places_around_japanese2['Categories'])
one_hot_places_around_japanese2['Restaurant name']=places_around_japanese2['Restaurant name']
one_hot_places_around_japanese2_grouped=one_hot_places_around_japanese2.groupby('Restaurant name').mean().reset_index()
one_hot_places_around_japanese2_grouped

Unnamed: 0,Restaurant name,African Restaurant,American Restaurant,Argentinian Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,Auto Dealership,Auvergne Restaurant,BBQ Joint,...,Tech Startup,Thai Restaurant,Theater,Tram Station,Trattoria/Osteria,Udon Restaurant,Vietnamese Restaurant,Wine Bar,Wine Shop,Women's Store
0,Ayama,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Blueberry,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0
2,Bon Kushikatsu,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,...,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0
3,Côté Sushi Vaugirard,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Ebis,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.1
5,Eizosushi,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0
6,Fukuyama,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0
7,Garden Sushi,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Guilo Guilo,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Himeji-Jõ,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
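
Taking the mean of the one-hot columns per restaurant gives the share of each category among the places found around that restaurant. The same computation can be sketched in plain Python. The `Guilo Guilo` rows are the five shown in an earlier table (so only a partial sample), `Example Sushi` is a made-up restaurant, and `category_shares` is a hypothetical helper.

```python
from collections import Counter, defaultdict

def category_shares(rows):
    """Map each restaurant to {category: share of its nearby venues}."""
    counts = defaultdict(Counter)
    for restaurant, category in rows:
        counts[restaurant][category] += 1
    return {
        restaurant: {cat: n / sum(cats.values()) for cat, n in cats.items()}
        for restaurant, cats in counts.items()
    }

rows = [
    ("Guilo Guilo", "Bakery"),
    ("Guilo Guilo", "Bistro"),
    ("Guilo Guilo", "Italian Restaurant"),
    ("Guilo Guilo", "Ice Cream Shop"),
    ("Guilo Guilo", "French Restaurant"),
    ("Example Sushi", "Hotel"),   # hypothetical restaurant
    ("Example Sushi", "Hotel"),
]
shares = category_shares(rows)
print(shares["Guilo Guilo"]["Bakery"], shares["Example Sushi"]["Hotel"])
```

Unlike the dataframe, this dictionary simply omits categories that never occur near a restaurant; in the one-hot table those cells hold 0.0.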


In [196]:
from sklearn.cluster import KMeans
kmean=KMeans(n_clusters=4, n_init=12).fit(one_hot_places_around_japanese2_grouped.iloc[:,1:])
kmean.labels_


array([1, 0, 2, 3, 1, 3, 3, 2, 0, 2, 3, 1, 3, 1, 2, 1, 0, 3, 2, 3, 3, 1,
       2, 2, 2, 3, 2, 1, 2, 2, 2, 3, 2, 0, 2, 3, 1, 2, 2, 3, 3, 2, 2, 2,
       0, 1, 2, 3, 0, 0, 3, 0])
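
Before reading the clusters, it helps to check how many restaurants fall into each one. The sketch below counts the labels printed above. Note that `KMeans` is called here without a fixed `random_state`, so a new run may number or split the clusters differently.

```python
from collections import Counter

# Labels copied from the output above, in the same order.
labels = [1, 0, 2, 3, 1, 3, 3, 2, 0, 2, 3, 1, 3, 1, 2, 1, 0, 3, 2, 3, 3, 1,
          2, 2, 2, 3, 2, 1, 2, 2, 2, 3, 2, 0, 2, 3, 1, 2, 2, 3, 3, 2, 2, 2,
          0, 1, 2, 3, 0, 0, 3, 0]

sizes = Counter(labels)
for label in sorted(sizes):
    print(label, sizes[label])
```

The 52 restaurants split into clusters of 8, 9, 20 and 15 members for labels 0 to 3, so label 2 is by far the largest group.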

Here we list, for each Japanese restaurant, the 5 most common categories of places around it.

In [198]:
def returning_top(dataFrame, num_top_venues):
    return dataFrame.sort_values(ascending=False).index[0:num_top_venues]
     

num_top_venues = 5

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Restaurant name']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # no suffix listed beyond 'rd', so fall back to 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Restaurant name'] = one_hot_places_around_japanese2_grouped['Restaurant name']
for ind in range(neighborhoods_venues_sorted.shape[0]):   
    neighborhoods_venues_sorted.iloc[ind, 1:]=returning_top(one_hot_places_around_japanese2_grouped.iloc[ind, 1:], num_top_venues)
neighborhoods_venues_sorted

Unnamed: 0,Restaurant name,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,Ayama,Hotel,Japanese Restaurant,Bakery,Farmers Market,Fruit & Vegetable Store
1,Blueberry,French Restaurant,Clothing Store,Pastry Shop,Japanese Restaurant,Café
2,Bon Kushikatsu,Speakeasy,Coffee Shop,Japanese Restaurant,Italian Restaurant,Bar
3,Côté Sushi Vaugirard,French Restaurant,Hotel,Gym,Japanese Restaurant,Plaza
4,Ebis,Hotel,Women's Store,Restaurant,Israeli Restaurant,Plaza
5,Eizosushi,French Restaurant,Bistro,Hotel,Japanese Restaurant,Supermarket
6,Fukuyama,Bistro,Tapas Restaurant,Organic Grocery,Vietnamese Restaurant,Sandwich Place
7,Garden Sushi,Plaza,Bookstore,Japanese Restaurant,Gourmet Shop,Korean Restaurant
8,Guilo Guilo,French Restaurant,Bakery,Italian Restaurant,Japanese Restaurant,Bistro
9,Himeji-Jõ,Korean Restaurant,Italian Restaurant,Bakery,Hotel,French Restaurant


### Analysis of the different clusters

Now we add the labels found by the clustering.

In [199]:
neighborhoods_venues_sorted['Label']=kmean.labels_

In [201]:
neighborhoods_venues_sorted[neighborhoods_venues_sorted['Label']==0]

Unnamed: 0,Restaurant name,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,Label
1,Blueberry,French Restaurant,Clothing Store,Pastry Shop,Japanese Restaurant,Café,0
8,Guilo Guilo,French Restaurant,Bakery,Italian Restaurant,Japanese Restaurant,Bistro,0
16,Koko Bistro,French Restaurant,Multiplex,Bar,Historic Site,Canal Lock,0
33,Sola,French Restaurant,Bookstore,Church,Scenic Lookout,Bakery,0
44,Wanobi,French Restaurant,Bar,Tech Startup,Bagel Shop,Bookstore,0
48,Yoshi,French Restaurant,Health Food Store,Thai Restaurant,Bookstore,Chocolate Shop,0
49,Yoshida,French Restaurant,Italian Restaurant,Bakery,Sandwich Place,Café,0
51,Yuzu,French Restaurant,Café,Bakery,Historic Site,Salad Place,0


In [202]:
neighborhoods_venues_sorted[neighborhoods_venues_sorted['Label']==1]

Unnamed: 0,Restaurant name,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,Label
0,Ayama,Hotel,Japanese Restaurant,Bakery,Farmers Market,Fruit & Vegetable Store,1
4,Ebis,Hotel,Women's Store,Restaurant,Israeli Restaurant,Plaza,1
11,Jipangue,French Restaurant,Hotel,Brasserie,Brewery,Salad Place,1
13,Kiku,Hotel,Candy Store,Karaoke Bar,Restaurant,Corsican Restaurant,1
15,Kinugawa Vendôme,Hotel,French Restaurant,Women's Store,Israeli Restaurant,Dessert Shop,1
21,Miss Kō,Hotel,French Restaurant,Japanese Restaurant,Boutique,Pastry Shop,1
27,Otaku,Hotel,Japanese Restaurant,Gym / Fitness Center,Pizza Place,Historic Site,1
36,Sushi Star,Hotel,Japanese Restaurant,Sandwich Place,Art Gallery,Auto Dealership,1
45,Wrap'n'Roll Sushi,Hotel,Chinese Restaurant,Japanese Restaurant,Sandwich Place,Bistro,1


In [203]:
neighborhoods_venues_sorted[neighborhoods_venues_sorted['Label']==2]

Unnamed: 0,Restaurant name,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,Label
2,Bon Kushikatsu,Speakeasy,Coffee Shop,Japanese Restaurant,Italian Restaurant,Bar,2
7,Garden Sushi,Plaza,Bookstore,Japanese Restaurant,Gourmet Shop,Korean Restaurant,2
9,Himeji-Jõ,Korean Restaurant,Italian Restaurant,Bakery,Hotel,French Restaurant,2
14,Kintaro,Japanese Restaurant,Bookstore,Pastry Shop,Supermarket,French Restaurant,2
18,La Maison du Saké,Chinese Restaurant,French Restaurant,Spa,Japanese Restaurant,Sushi Restaurant,2
22,Mushimushi,African Restaurant,Asian Restaurant,Spanish Restaurant,Steakhouse,Restaurant,2
23,Nakagawa,Restaurant,Bakery,Chinese Restaurant,Indian Restaurant,Bar,2
24,Nana-Ya,Cheese Shop,Japanese Restaurant,Burger Joint,Bakery,French Restaurant,2
26,Osaka Sushi,Bakery,Chinese Restaurant,Café,Bistro,Rock Club,2
28,Otakuni,Indian Restaurant,Asian Restaurant,Cosmetics Shop,Café,Japanese Restaurant,2


In [204]:
neighborhoods_venues_sorted[neighborhoods_venues_sorted['Label']==3]

Unnamed: 0,Restaurant name,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,Label
3,Côté Sushi Vaugirard,French Restaurant,Hotel,Gym,Japanese Restaurant,Plaza,3
5,Eizosushi,French Restaurant,Bistro,Hotel,Japanese Restaurant,Supermarket,3
6,Fukuyama,Bistro,Tapas Restaurant,Organic Grocery,Vietnamese Restaurant,Sandwich Place,3
10,Hoki Sushi,French Restaurant,Plaza,Convenience Store,Bar,Japanese Restaurant,3
12,Karaage-Ya Bourse,French Restaurant,Italian Restaurant,New American Restaurant,Pedestrian Plaza,Nightclub,3
17,Kura,Italian Restaurant,French Restaurant,Gourmet Shop,Middle Eastern Restaurant,Japanese Restaurant,3
19,Le Comptoir Nippon,Italian Restaurant,French Restaurant,Japanese Restaurant,Korean Restaurant,Vietnamese Restaurant,3
20,Le Concert de Cuisine,French Restaurant,Japanese Restaurant,Middle Eastern Restaurant,Hotel,Bistro,3
25,New Jioko,Gym / Fitness Center,Tram Station,Bar,Bus Stop,Japanese Restaurant,3
31,Sakura,French Restaurant,Bistro,Japanese Restaurant,Supermarket,Dessert Shop,3


It seems we can distinguish 3 clear clusters:
- The first one (label 0) is when French restaurants are the most common venue around.
- The second one (label 1) is when hotels are the most common venue around.
- The third one is when there are bars and bistros around.

We can then look for places that have French restaurants and hotels nearby but not many Japanese restaurants yet.

In [208]:
french_restaurant=frame_places_merge[frame_places_merge['Categories']=='French Restaurant']
hotel=frame_places_merge[frame_places_merge['Categories']=='Hotel']


Unnamed: 0,Station,Latitude_x,Longitude_x,Name,Place latitude,Place longitude,Categories,Zone,District
51,Argentine,48.875336,2.290132,Hôtel Centre Ville Étoile,48.876155,2.290004,Hotel,1,16
53,Argentine,48.875336,2.290132,MonHotel Lounge & Spa,48.874291,2.289776,Hotel,1,16
54,Argentine,48.875336,2.290132,Hôtel Acacias Étoile,48.876661,2.290817,Hotel,1,16
56,Argentine,48.875336,2.290132,Hotel des Pavillons,48.876169,2.290872,Hotel,1,16
74,Assemblée Nationale,48.861761,2.317974,Hôtel Bourgogne et Montana,48.860189,2.318345,Hotel,1,7
...,...,...,...,...,...,...,...,...,...
2094,Vavin,48.842216,2.329011,Le Six Hotel,48.843793,2.328207,Hotel,1,6
2097,Vavin,48.842216,2.329011,Villa Modigliani,48.841235,2.328159,Hotel,1,6
2099,Vavin,48.842216,2.329011,Hôtel Mercure Paris Montparnasse Raspail,48.841216,2.330011,Hotel,1,6
2116,Villiers,48.881119,2.314905,L'edmond,48.882234,2.313192,Hotel,1,8


### Visualisation of the Japanese restaurants in blue, the French restaurants in red and the hotels in black, alongside the metro stations in green

In [280]:
final_map=folium.Map(location=[48.885942, 2.337048],zoom_start=12)
for rest_lat, rest_lgn in zip(japanese_restaurant['Place latitude'],japanese_restaurant['Place longitude']):
    folium.CircleMarker(
        location=[rest_lat, rest_lgn],
        radius=5,
        popup='japanese restaurant').add_to(final_map)
    
for rest_lat, rest_lgn in zip(french_restaurant['Place latitude'],french_restaurant['Place longitude']):
    folium.CircleMarker(
        location=[rest_lat, rest_lgn],
        radius=5,
        color='red',
        popup='french restaurant').add_to(final_map)

for rest_lat, rest_lgn in zip(hotel['Place latitude'],hotel['Place longitude']):
    folium.CircleMarker(
        location=[rest_lat, rest_lgn],
        radius=5,
        color='black',
        popup='hotel').add_to(final_map)
    
for station, rest_lat, rest_lgn in zip(paris_metro['Station'], paris_metro['Latitude'], paris_metro['Longitude']):
    folium.CircleMarker(
        location=[rest_lat, rest_lgn],
        radius=3,
        color='green',
        popup=station).add_to(final_map)
final_map

We could suggest to the Japanese chef some areas that have French restaurants and hotels but not many Japanese restaurants yet.
On the map, such spots appear around the <b>Boissière</b> and <b>Porte de Champerret</b> metro stations.
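
The spots above were picked by eye on the map. A rough way to rank stations numerically is to count, for each station, the French restaurants and hotels found nearby and subtract the Japanese restaurants. This scoring rule is an assumption made for illustration, not the method used in this notebook, and the rows below are made up; `rank_stations` is a hypothetical helper.

```python
from collections import Counter, defaultdict

def rank_stations(rows):
    """Sort stations by (French restaurants + hotels) minus Japanese restaurants."""
    counts = defaultdict(Counter)
    for station, category in rows:
        counts[station][category] += 1
    scores = {
        station: c["French Restaurant"] + c["Hotel"] - c["Japanese Restaurant"]
        for station, c in counts.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Made-up (station, category) pairs, only to show the mechanics.
rows = [
    ("Station A", "French Restaurant"), ("Station A", "Hotel"), ("Station A", "Hotel"),
    ("Station B", "French Restaurant"), ("Station B", "Japanese Restaurant"),
    ("Station C", "Japanese Restaurant"),
]
print(rank_stations(rows))
```

With the real data, the `(station, category)` pairs would come from the `Station` and `Categories` columns of `frame_places_merge`.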

Now we can check the median salary and the population of the districts of those metro stations.

In [277]:
paris_metro['District']=paris_metro['District'].astype('int64')
paris=pd.merge(paris_demographic, paris_metro, left_on='Arrondissement', right_on='District')
paris[(paris['Station']=='Porte de Champerret') | (paris['Station']=='Boissière')]

Unnamed: 0,Arrondissement,Area (km2),Population,Population per km2,Median salary,Salary normalisation,Population normalisation,Station,Zone,District,Longitude,Latitude
179,16,7.846,170239,21698,70532,0.876647,0.684572,Boissière,1,16,2.290083,48.867017
204,17,5.669,171945,30331,39302,0.343603,0.692206,Porte de Champerret,1,17,2.293491,48.885927


## Conclusion

To avoid strong competition, I would suggest that the Japanese chef settle around the <b>Boissière</b> metro station: there seem to be many French restaurants and hotels around but no popular Japanese restaurant yet, and the median salary in this district is comparatively high.
Of course, there are many factors I did not take into consideration in this project.
Here are some factors we could have checked:
- Immigration by nationality and by district.
- The price of renting a place in each district.
- The age distribution in each district.

Thanks for taking the time to read my project.