# Finding the best spot to open an Italian restaurant in Geneva, Switzerland

## 1. Introduction : Business Problem

This study provides insights into the best location for opening an Italian restaurant in Geneva, Switzerland, for entrepreneurs who might be interested in this type of business.

On the one hand, opening a new restaurant is one of the riskiest activities for an entrepreneur. On the other hand, if the restaurant gains momentum, the profit prospects become very attractive. This is especially true for Italian restaurants, where production costs are relatively low compared to high-end restaurants.

Geneva is a good choice because of the high purchasing power of the population, which should lead to a better margin for the entrepreneur, compensating for the risk taken.

As Italian restaurants are relatively easily substituted, we are going to try to find a location where the number of restaurants, and especially Italian restaurants, is minimal. Moreover, we would like to be as close as possible to the city center because of the high intensity of the economic activity.

With these criteria in mind, we are going to try to find the best spot, weighing the pros and cons in the following sections.

## 2. Data

Following our [Introduction on the business problem](#1.-Introduction-:-Business-Problem), the criteria that we decide to keep are as follows:
- the lowest number of restaurants in the vicinity;
- the minimal number of **Italian** restaurants in the vicinity;
- the smallest distance to the city center.

To define neighborhoods in the least ambiguous way, we choose to use a grid of locations which are regularly spaced around the city center.

To obtain the required information:

- the centers of candidate areas are generated algorithmically and their approximate addresses are obtained using **Google Maps API reverse geocoding**
- the number of restaurants and their type and location in every neighborhood are obtained using **Foursquare API**
- the coordinates of Geneva's center are obtained using **Google Maps API geocoding** of the **city's heart: Place du Bourg-de-Four**.

With this information, we will be able to compare neighborhoods in terms of similarity and Italian restaurant density. This will guide our recommendation of a preferred neighborhood in which to open an Italian restaurant.

A qualitative description of the neighborhoods of Geneva, their history and their emblematic or unusual sites can be found here: [Discover Geneva and its districts](https://www.geneve.ch/en/what-geneva/discover-geneva-districts).

### 2.1) Neighborhood Candidates

To create latitude & longitude coordinates for centroids of our candidate neighborhoods, we are going to create a grid of cells covering our area of interest which is circa 5 by 5 km centered around Geneva's city center.

First, we are going to retrieve the latitude & longitude of Geneva's city center using **Place du Bourg-de-Four** as a reference and **Google Maps geocoding API**.

In [1]:
# hidden

In [2]:
import warnings
warnings.filterwarnings("ignore", category=DeprecationWarning) 

In [3]:
import requests

def get_coordinates(api_key, address, verbose=False):
    try:
        url = 'https://maps.googleapis.com/maps/api/geocode/json?key={}&address={}'.format(api_key, address)
        response = requests.get(url).json()
        if verbose:
            print('Google Maps API JSON result =>', response)
        results = response['results']
        geographical_data = results[0]['geometry']['location'] # get geographical coordinates
        lat = geographical_data['lat']
        lon = geographical_data['lng']
        return [lat, lon]
    except:
        return [None, None]
    
address = "Place du Bourg-de-Four, 1204 Genève, Switzerland"
gva_center = get_coordinates(google_api_key, address)
print('Coordinates of {}: {}'.format(address, gva_center))

Coordinates of Place du Bourg-de-Four, 1204 Genève, Switzerland: [46.2001176, 6.148789799999999]


The next step consists of creating a grid of candidate areas, equally spaced, centered on the city center and within circa 2.5 km of Place du Bourg-de-Four. Neighborhoods are defined as circular areas with a radius of 200 meters, so neighborhood centers are 400 meters apart.

To compute distances accurately, the grid of locations needs to be created in a 2D Cartesian coordinate system, which allows distances to be calculated in meters (not in latitude/longitude degrees). Those coordinates are then projected back to latitude/longitude degrees to be shown on a Folium map. Here are the functions used to convert between the WGS84 spherical coordinate system (latitude/longitude degrees) and the UTM Cartesian coordinate system (X/Y coordinates in meters).

In [4]:
!pip install shapely
import shapely.geometry

!pip install pyproj
import pyproj

import math

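# NOTE: pyproj.transform is deprecated since pyproj 2.6.1 in favor of pyproj.Transformer;
# it is kept here to match the pyproj version installed above.
# NOTE: Geneva (about 6.15°E) lies in UTM zone 32; zone 33 is kept so that the X/Y outputs
# shown below stay reproducible, at the cost of some extra projection distortion.
def lonlat_to_xy(lon, lat):
    """Convert WGS84 longitude/latitude (degrees) to UTM X/Y (meters)."""
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=33, datum='WGS84')
    xy = pyproj.transform(proj_latlon, proj_xy, lon, lat)
    return xy[0], xy[1]

def xy_to_lonlat(x, y):
    """Convert UTM X/Y (meters) back to WGS84 longitude/latitude (degrees)."""
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=33, datum='WGS84')
    lonlat = pyproj.transform(proj_xy, proj_latlon, x, y)
    return lonlat[0], lonlat[1]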

def calc_xy_distance(x1, y1, x2, y2):
    dx = x2 - x1
    dy = y2 - y1
    return math.sqrt(dx*dx + dy*dy)

print('Coordinate transformation check')
print('-------------------------------')
print('Geneva center longitude={}, latitude={}'.format(gva_center[1], gva_center[0]))
x, y = lonlat_to_xy(gva_center[1], gva_center[0])
print('Geneva center UTM X={}, Y={}'.format(x, y))
lo, la = xy_to_lonlat(x, y)
print('Geneva center longitude={}, latitude={}'.format(lo, la))

Successfully installed shapely-1.7.0
Successfully installed pyproj-2.6.1.post1
Coordinate transformation check
-------------------------------
Geneva center longitude=6.148789799999999, latitude=46.2001176
Geneva center UTM X=-182774.98192867963, Y=5154496.557561509
Geneva center longitude=6.148789800000001, latitude=46.2001176
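As a rough illustration of why the grid is built in meters rather than in degrees, the sketch below uses the haversine formula on a spherical Earth. This is an approximation: the mean radius of 6,371 km is an assumption, and `haversine_m` is a hypothetical helper that is not part of the notebook. It shows that the same step in degrees covers a noticeably shorter ground distance in longitude than in latitude at Geneva's latitude.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed mean Earth radius (spherical approximation)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 points (spherical model)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

lat, lon = 46.2001176, 6.1487898  # Place du Bourg-de-Four, from the geocoding step
step = 0.01                       # the same step, in degrees, in both directions

north_m = haversine_m(lat, lon, lat + step, lon)
east_m = haversine_m(lat, lon, lat, lon + step)
print('0.01 deg of latitude  ~ {:.0f} m'.format(north_m))
print('0.01 deg of longitude ~ {:.0f} m'.format(east_m))
```

The ratio between the two distances is close to the cosine of the latitude, which is why a grid that is regular in degrees would not be regular on the ground.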



Next, we create a hexagonal grid of cells where every other row is offset, and the vertical row spacing is adjusted so that every cell center is equally distant from all its neighbors.

In [5]:

gva_center_x, gva_center_y = lonlat_to_xy(gva_center[1], gva_center[0]) # City center in Cartesian coordinates

k = math.sqrt(3) / 2 # Vertical offset for hexagonal grid cells
x_min = gva_center_x - 2500
x_step = 400
y_min = gva_center_y - 2500 - (int(21/k)*k*400 - 5000)/2
y_step = 400 * k 

latitudes = []
longitudes = []
distances_from_center = []
xs = []
ys = []
for i in range(0, int(21/k)):
    y = y_min + i * y_step
    x_offset = 200 if i%2==0 else 0
    for j in range(0, 21):
        x = x_min + j * x_step + x_offset
        distance_from_center = calc_xy_distance(gva_center_x, gva_center_y, x, y)
        if (distance_from_center <= 2501):
            lon, lat = xy_to_lonlat(x, y)
            latitudes.append(lat)
            longitudes.append(lon)
            distances_from_center.append(distance_from_center)
            xs.append(x)
            ys.append(y)

print(len(latitudes), 'candidate neighborhood centers generated.')

141 candidate neighborhood centers generated.
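The hexagonal layout can be checked independently of the projection. The following is a minimal sketch, assuming a local coordinate system in meters with the city center at the origin; `hex_grid` and `nearest_neighbor_distance` are hypothetical helpers, and the grid is built symmetrically around the origin, so the number of centers may differ from the notebook's. It verifies that every center has its nearest neighbor at the chosen spacing.

```python
import math

def hex_grid(radius_m, spacing_m):
    """Return (x, y) centers of a hexagonal grid within radius_m of the origin."""
    row_step = spacing_m * math.sqrt(3) / 2  # vertical distance between rows
    n_rows = int(radius_m / row_step)
    n_cols = int(radius_m / spacing_m) + 1
    points = []
    for i in range(-n_rows, n_rows + 1):
        x_offset = spacing_m / 2 if i % 2 else 0.0  # shift every other row
        for j in range(-n_cols, n_cols + 1):
            x, y = j * spacing_m + x_offset, i * row_step
            if math.hypot(x, y) <= radius_m:
                points.append((x, y))
    return points

def nearest_neighbor_distance(p, pts):
    """Distance from p to the closest other point in pts."""
    return min(math.hypot(p[0] - q[0], p[1] - q[1]) for q in pts if q is not p)

points = hex_grid(radius_m=2500, spacing_m=400)
nearest = [nearest_neighbor_distance(p, points) for p in points]
print(len(points), 'centers; nearest-neighbor distance between',
      round(min(nearest), 1), 'and', round(max(nearest), 1), 'm')
```

The row spacing of `spacing * sqrt(3) / 2` is what makes the diagonal neighbors as close as the horizontal ones.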


Following that, we want to visualize the city center location and the candidate neighborhood centers:


In [6]:
!pip install folium

import folium

Successfully installed branca-0.4.1 folium-0.11.0


In [7]:
map_gva = folium.Map(location=gva_center, zoom_start=13)
folium.Marker(gva_center, popup='Place du Bourg-de-Four').add_to(map_gva)
for lat, lon in zip(latitudes, longitudes):
#     folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_gva) 
    folium.Circle([lat, lon], radius=200, color='blue', fill=False).add_to(map_gva)
#     folium.Marker([lat, lon],popup=[lat, lon]).add_to(map_gva)
map_gva


As we can see on the map above, we have the coordinates of centers of neighborhoods to be evaluated, equally spaced and within circa 2.5km from Place du Bourg-de-Four.

As a next step, we use Google Maps API to get the approximate addresses of those locations.

In [8]:
def get_address(api_key, latitude, longitude, verbose=False):
    try:
        url = 'https://maps.googleapis.com/maps/api/geocode/json?key={}&latlng={},{}'.format(api_key, latitude, longitude)
        response = requests.get(url).json()
        if verbose:
            print('Google Maps API JSON result =>', response)
        results = response['results']
        address = results[0]['formatted_address']
        return address
    except:
        return None

addr = get_address(google_api_key, gva_center[0], gva_center[1])
print('Reverse geocoding check')
print('-----------------------')
print('Address of [{}, {}] is: {}'.format(gva_center[0], gva_center[1], addr))

Reverse geocoding check
-----------------------
Address of [46.2001176, 6.148789799999999] is: Place du Bourg-de-Four 20, 1204 Genève, Switzerland
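The helper above swallows every error with a bare `except`, which makes a failed request indistinguishable from an empty result. A slightly more explicit variant could look like the sketch below. It is only an illustration: `build_reverse_geocode_url` and `extract_address` are hypothetical names, the payloads are hand-written samples rather than real API output, and the sketch relies on the `status` field that Geocoding API responses carry alongside `results`. No network call is made.

```python
from urllib.parse import urlencode

GEOCODE_ENDPOINT = 'https://maps.googleapis.com/maps/api/geocode/json'

def build_reverse_geocode_url(api_key, latitude, longitude):
    """Build the reverse geocoding URL with properly encoded query parameters."""
    query = urlencode({'key': api_key, 'latlng': '{},{}'.format(latitude, longitude)})
    return '{}?{}'.format(GEOCODE_ENDPOINT, query)

def extract_address(response):
    """Return the first formatted address, or None when the API reports no result."""
    if response.get('status') != 'OK' or not response.get('results'):
        return None
    return response['results'][0].get('formatted_address')

# Hand-written sample payloads (not real API output), shaped like the fields used above
sample_ok = {'status': 'OK',
             'results': [{'formatted_address': 'Place du Bourg-de-Four 20, 1204 Genève, Switzerland'}]}
sample_empty = {'status': 'ZERO_RESULTS', 'results': []}

print(build_reverse_geocode_url('YOUR_API_KEY', 46.2001176, 6.1487898))
print(extract_address(sample_ok))
print(extract_address(sample_empty))
```

Encoding the query with `urlencode` also avoids problems with spaces and accented characters when the same pattern is used for forward geocoding of an address string.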


In [9]:
print('Obtaining location addresses: ', end='')
addresses = []
for lat, lon in zip(latitudes, longitudes):
    address = get_address(google_api_key, lat, lon)
    if address is None:
        address = 'NO ADDRESS'
    address = address.replace(', Switzerland', '') # We don't need country part of address
    addresses.append(address)
    print(' .', end='')
print(' done.')

Obtaining location addresses:  . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . done.


Following that, we want to check whether the addresses returned are well-formed street addresses.

In [10]:
for i in range(0, len(addresses)):
    print (str(i) + "; ", addresses[i])

0;  Chemin de Pinchat 27C, 1227 Carouge
1;  Route de Veyrier 112, 1227 Carouge
2;  Route de Vessy 14, 1206 Genève
3;  Rue Jacques-Grosselin 35, 1227 Carouge
4;  Rue Joseph-Girard 15, 1227 Carouge
5;  Chemin Charles-Poluzzi 51, 1227 Carouge
6;  Route de Veyrier 98B, 1227 Carouge
7;  Route du Bout-du-Monde 27, 1206 Genève
8;  Route de Vessy, 1234 Veyrier
9;  Chemin des Beaux-Champs 44, 1234 Vessy
10;  Route des Jeunes 33 Bis, 1227 Carouge
11;  Avenue Vibert 38, 1227 Carouge
12;  Place du Marché 21, 1227 Carouge
13;  Rue de la Fontenette 18, 1227 Carouge
14;  Carouge GE, Val d'Arve, 1227 Carouge
15;  Route du Bout-du-Monde 4BIS, 1206 Genève
16;  Chemin Edouard-Tavan 16A, 1206 Genève
17;  Route de Vessy 29, 1234 Vessy
18;  Route de Florissant 144, 1231 Conches
19;  Route des Jeunes 6, 1227 Lancy
20;  Avenue de la Praille 47A, 1227 Carouge
21;  Rue Jacques-Grosselin 1, 1227 Carouge
22;  Ave Cardinal-Mermillod 2, 1227 Carouge
23;  Rue de l'Aubépine 21, 1205 Genève
24;  Chemin de la Tour-de-C

Reverse geocoding does not seem to have returned a proper street address for every location. Therefore, we choose to **keep only the addresses that are unambiguous**, i.e. those that contain one of the street-type keywords below.

In [11]:
address_characteristics = ["Rue", "Place", "Chemin", "Route", "Boulevard", "Avenue", "Quai", "Prom", "Sentier", "Tour", "Rampe"]

address_characteristics = [x.upper() for x in address_characteristics]

addresses_upper = [x.upper() for x in addresses]

counter = 0
el_to_keep = []

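for address in addresses_upper:
    for characteristic in address_characteristics:
        # NOTE: there is no break here, so an address matching several keywords
        # (e.g. both "CHEMIN" and "TOUR") is appended once per match and shows up
        # as a duplicate row further down.
        if characteristic in address:
            el_to_keep.append(counter)
    counter = counter + 1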
    

print("Number of addresses to keep: "+str(len(el_to_keep))+" (over "+str(len(latitudes))+")")



Number of addresses to keep: 123 (over 141)
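Because the loop above appends an index once per matching keyword, an address such as "Chemin de la Tour-de-Champel" is kept twice. A minimal sketch of a variant that keeps each index at most once is shown below; `indices_to_keep` is a hypothetical helper, and the sample addresses are taken from the output above. Applying it to the full list would change the counts reported in this notebook.

```python
STREET_KEYWORDS = ['RUE', 'PLACE', 'CHEMIN', 'ROUTE', 'BOULEVARD', 'AVENUE',
                   'QUAI', 'PROM', 'SENTIER', 'TOUR', 'RAMPE']

def indices_to_keep(addresses, keywords=STREET_KEYWORDS):
    """Return each index at most once if its address contains any keyword."""
    return [i for i, address in enumerate(addresses)
            if any(keyword in address.upper() for keyword in keywords)]

sample_addresses = [
    'Chemin de la Tour-de-Champel 12, 1206 Genève',  # matches CHEMIN and TOUR
    "Carouge GE, Val d'Arve, 1227 Carouge",           # no street keyword
    'Place du Marché 21, 1227 Carouge',
]
print(indices_to_keep(sample_addresses))  # → [0, 2]
```

Note that this remains a plain substring test, so abbreviations such as "Ave" are still not recognized.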


We remove the latitude/longitude pairs that do not lead to a proper address.

In [12]:
new_latitudes=[]
new_longitudes=[]
new_distances_from_center=[]
new_xs=[]
new_ys=[]

for el in el_to_keep:
    new_latitudes.append(latitudes[el])
    new_longitudes.append(longitudes[el])
    new_distances_from_center.append(distances_from_center[el])
    new_xs.append(xs[el])
    new_ys.append(ys[el])

Quick checks to make sure everything ran smoothly.

In [13]:
print('Obtaining location addresses: ', end='')
new_addresses = []
for lat, lon in zip(new_latitudes, new_longitudes):
    new_address = get_address(google_api_key, lat, lon)
    if new_address is None:
        new_address = 'NO ADDRESS'
    new_address = new_address.replace(', Switzerland', '') # We don't need the country part of address
    new_addresses.append(new_address)
    print(' .', end='')
print(' done.')

Obtaining location addresses:  . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . done.


In [14]:
for i in range(0, len(new_addresses)):
    print (str(i) + "; ", new_addresses[i])

0;  Chemin de Pinchat 27C, 1227 Carouge
1;  Route de Veyrier 112, 1227 Carouge
2;  Route de Vessy 14, 1206 Genève
3;  Rue Jacques-Grosselin 35, 1227 Carouge
4;  Rue Joseph-Girard 15, 1227 Carouge
5;  Chemin Charles-Poluzzi 51, 1227 Carouge
6;  Route de Veyrier 98B, 1227 Carouge
7;  Route du Bout-du-Monde 27, 1206 Genève
8;  Route de Vessy, 1234 Veyrier
9;  Chemin des Beaux-Champs 44, 1234 Vessy
10;  Route des Jeunes 33 Bis, 1227 Carouge
11;  Avenue Vibert 38, 1227 Carouge
12;  Place du Marché 21, 1227 Carouge
13;  Rue de la Fontenette 18, 1227 Carouge
14;  Route du Bout-du-Monde 4BIS, 1206 Genève
15;  Chemin Edouard-Tavan 16A, 1206 Genève
16;  Route de Vessy 29, 1234 Vessy
17;  Route de Florissant 144, 1231 Conches
18;  Route des Jeunes 6, 1227 Lancy
19;  Avenue de la Praille 47A, 1227 Carouge
20;  Rue Jacques-Grosselin 1, 1227 Carouge
21;  Rue de l'Aubépine 21, 1205 Genève
22;  Chemin de la Tour-de-Champel 12, 1206 Genève
23;  Chemin de la Tour-de-Champel 12, 1206 Genève
24;  Avenue d

We **plot these new points** to see which ones have been dropped.

In [15]:
new_map_gva = folium.Map(location=gva_center, zoom_start=13)
folium.Marker(gva_center, popup='Place du Bourg-de-Four').add_to(new_map_gva)
for lat, lon in zip(new_latitudes, new_longitudes):
#     folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(new_map_gva) 
    folium.Circle([lat, lon], radius=200, color='blue', fill=False).add_to(new_map_gva)
#     folium.Marker([lat, lon],popup=[lat, lon]).add_to(new_map_gva)
new_map_gva

On the map above, we notice that some points were wrongly dropped because reverse geocoding did not return a street name, while other points were rightly dropped because they were on the lake.

Then, we replace the previous variables with the new ones.

In [16]:
latitudes = new_latitudes
longitudes = new_longitudes
addresses = new_addresses
distances_from_center=new_distances_from_center
xs=new_xs
ys=new_ys

The next step is to place everything into a Pandas dataframe.

In [17]:
import pandas as pd

df_locations = pd.DataFrame({'Address': addresses,
                             'Latitude': latitudes,
                             'Longitude': longitudes,
                             'X': xs,
                             'Y': ys,
                             'Distance from center': distances_from_center})

df_locations

| | Address | Latitude | Longitude | X | Y | Distance from center |
|---|---|---|---|---|---|---|
| 0 | Chemin de Pinchat 27C, 1227 Carouge | 46.178054 | 6.145877 | -183274.981929 | 5.152072e+06 | 2475.883681 |
| 1 | Route de Veyrier 112, 1227 Carouge | 46.178454 | 6.150998 | -182874.981929 | 5.152072e+06 | 2426.932220 |
| 2 | Route de Vessy 14, 1206 Genève | 46.178853 | 6.156119 | -182474.981929 | 5.152072e+06 | 2443.358345 |
| 3 | Rue Jacques-Grosselin 35, 1227 Carouge | 46.180535 | 6.137698 | -183874.981929 | 5.152418e+06 | 2351.595203 |
| 4 | Rue Joseph-Girard 15, 1227 Carouge | 46.180935 | 6.142819 | -183474.981929 | 5.152418e+06 | 2193.171220 |
| 5 | Chemin Charles-Poluzzi 51, 1227 Carouge | 46.181335 | 6.147939 | -183074.981929 | 5.152418e+06 | 2100.000000 |
| 6 | Route de Veyrier 98B, 1227 Carouge | 46.181734 | 6.153060 | -182674.981929 | 5.152418e+06 | 2080.865205 |
| 7 | Route du Bout-du-Monde 27, 1206 Genève | 46.182134 | 6.158182 | -182274.981929 | 5.152418e+06 | 2137.755833 |
| 8 | Route de Vessy, 1234 Veyrier | 46.182533 | 6.163303 | -181874.981929 | 5.152418e+06 | 2264.950331 |
| 9 | Chemin des Beaux-Champs 44, 1234 Vessy | 46.182932 | 6.168424 | -181474.981929 | 5.152418e+06 | 2451.530134 |


We can now save this data into a local file.

In [18]:
df_locations.to_pickle('./locations.pkl')

### 2.2) Foursquare

Now that we have our location candidates, we can use the **Foursquare API** to get information on restaurants in each neighborhood.

We are only interested in direct competitors, so we will only look for venues that have 'restaurant' in the category name, and we will make sure to detect and include all the subcategories of the 'Italian restaurant' category, as we need information on Italian restaurants in the neighborhood.

Foursquare credentials are defined in the hidden cell below.

In [None]:
# hidden

In [20]:
# Category IDs corresponding to Italian restaurants from Foursquare : https://developer.foursquare.com/docs/resources/categories

food_category = '4d4b7105d754a06374d81259' # 'Root' category for all food-related venues

italian_restaurant_categories = ['4bf58dd8d48988d110941735','55a5a1ebe4b013909087cbb6','55a5a1ebe4b013909087cb7c',
                                 '55a5a1ebe4b013909087cba7','55a5a1ebe4b013909087cba1','55a5a1ebe4b013909087cba4',
                                 '55a5a1ebe4b013909087cb95','55a5a1ebe4b013909087cb89','55a5a1ebe4b013909087cb9b',
                                 '55a5a1ebe4b013909087cb98','55a5a1ebe4b013909087cbbf','55a5a1ebe4b013909087cb79',
                                 '55a5a1ebe4b013909087cbb0','55a5a1ebe4b013909087cbb3','55a5a1ebe4b013909087cb74',
                                 '55a5a1ebe4b013909087cbaa','55a5a1ebe4b013909087cb83','55a5a1ebe4b013909087cb8c',
                                 '55a5a1ebe4b013909087cb92','55a5a1ebe4b013909087cb8f','55a5a1ebe4b013909087cb86',
                                 '55a5a1ebe4b013909087cbb9','55a5a1ebe4b013909087cb7f','55a5a1ebe4b013909087cbbc',
                                 '55a5a1ebe4b013909087cb9e','55a5a1ebe4b013909087cbc2','55a5a1ebe4b013909087cbad']

def is_restaurant(categories, specific_filter=None):
    restaurant_words = ['restaurant', 'diner', 'taverna', 'steakhouse']
    restaurant = False
    specific = False
    for c in categories:
        category_name = c[0].lower()
        category_id = c[1]
        for r in restaurant_words:
            if r in category_name:
                restaurant = True
        if 'fast food' in category_name:
            restaurant = False
        if not(specific_filter is None) and (category_id in specific_filter):
            specific = True
            restaurant = True
    return restaurant, specific

def get_categories(categories):
    return [(cat['name'], cat['id']) for cat in categories]

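def format_address(location):
    address = ', '.join(location['formattedAddress'])
    # Strip the country suffix; the outputs below show that Foursquare also returns
    # localized forms such as 'Suisse' or 'Schweiz', which these replacements do not catch.
    address = address.replace(', Switzerland', '')
    address = address.replace(', Swiss', '')
    return address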

def get_venues_near_location(lat, lon, category, client_id, client_secret, radius=200, limit=100):
    version = '20200524'
    url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&categoryId={}&radius={}&limit={}'.format(
        client_id, client_secret, version, lat, lon, category, radius, limit)
    try:
        results = requests.get(url).json()['response']['groups'][0]['items']
        venues = [(item['venue']['id'],
                   item['venue']['name'],
                   get_categories(item['venue']['categories']),
                   (item['venue']['location']['lat'], item['venue']['location']['lng']),
                   format_address(item['venue']['location']),
                   item['venue']['location']['distance']) for item in results]        
    except:
        venues = []
    return venues
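To make the classification rule easier to follow, here is a simplified restatement of `is_restaurant` applied to a few made-up venues. It is only a sketch: `classify_venue` is a hypothetical name, the category IDs other than the first ID of the notebook's Italian list are placeholders, and for venues with several categories the notebook's loop depends on category order, which this version does not reproduce.

```python
def classify_venue(categories, italian_ids):
    """Return (is_restaurant, is_italian) for a list of (name, id) category tuples."""
    restaurant_words = ('restaurant', 'diner', 'taverna', 'steakhouse')
    names = [name.lower() for name, _ in categories]
    is_italian = any(cat_id in italian_ids for _, cat_id in categories)
    looks_like_restaurant = any(word in name for name in names for word in restaurant_words)
    is_fast_food = any('fast food' in name for name in names)
    # An Italian category always counts as a restaurant; fast food never does otherwise
    return (looks_like_restaurant and not is_fast_food) or is_italian, is_italian

ITALIAN_ROOT_ID = '4bf58dd8d48988d110941735'  # first ID in the notebook's Italian list
italian_ids = {ITALIAN_ROOT_ID}

# Made-up venues; the 'hypothetical-id-*' values are placeholders, not Foursquare IDs
samples = {
    'italian': [('Italian Restaurant', ITALIAN_ROOT_ID)],
    'sushi': [('Sushi Restaurant', 'hypothetical-id-1')],
    'fast food': [('Fast Food Restaurant', 'hypothetical-id-2')],
    'cafe': [('Café', 'hypothetical-id-3')],
}
for label, categories in samples.items():
    print(label, classify_venue(categories, italian_ids))
```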

The next step is to go over our neighborhood locations and get nearby restaurants. We keep one dictionary of all restaurants found and another of all Italian restaurants found.


In [21]:
import pickle

def get_restaurants(lats, lons):
    restaurants = {}
    italian_restaurants = {}
    location_restaurants = []

    print('Obtaining venues around candidate locations:', end='')
    for lat, lon in zip(lats, lons):
        # Using radius=250 to make sure we have overlaps/full coverage so we don't miss any restaurant (we're using dictionaries to remove any duplicates resulting from area overlaps)
        venues = get_venues_near_location(lat, lon, food_category, foursquare_client_id, foursquare_client_secret, radius=250, limit=100)
        area_restaurants = []
        for venue in venues:
            venue_id = venue[0]
            venue_name = venue[1]
            venue_categories = venue[2]
            venue_latlon = venue[3]
            venue_address = venue[4]
            venue_distance = venue[5]
            is_res, is_italian = is_restaurant(venue_categories, specific_filter=italian_restaurant_categories)
            if is_res:
                x, y = lonlat_to_xy(venue_latlon[1], venue_latlon[0])
                restaurant = (venue_id, venue_name, venue_latlon[0], venue_latlon[1], venue_address, venue_distance, is_italian, x, y)
                if venue_distance<=200:
                    area_restaurants.append(restaurant)
                restaurants[venue_id] = restaurant
                if is_italian:
                    italian_restaurants[venue_id] = restaurant
        location_restaurants.append(area_restaurants)
        print(' .', end='')
    print(' done.')
    return restaurants, italian_restaurants, location_restaurants

# Try to load from local file system in case we did this before
restaurants = {}
italian_restaurants = {}
location_restaurants = []
loaded = False
try:
    with open('restaurants_250.pkl', 'rb') as f:
        restaurants = pickle.load(f)
    with open('italian_restaurants_250.pkl', 'rb') as f:
        italian_restaurants = pickle.load(f)
    with open('location_restaurants_250.pkl', 'rb') as f:
        location_restaurants = pickle.load(f)
    print('Restaurant data loaded.')
    loaded = True
except:
    pass

# If load failed use the Foursquare API to get the data
if not loaded:
    restaurants, italian_restaurants, location_restaurants = get_restaurants(latitudes, longitudes)
    
    # Persist the results to the local file system
    with open('restaurants_250.pkl', 'wb') as f:
        pickle.dump(restaurants, f)
    with open('italian_restaurants_250.pkl', 'wb') as f:
        pickle.dump(italian_restaurants, f)
    with open('location_restaurants_250.pkl', 'wb') as f:
        pickle.dump(location_restaurants, f)

Obtaining venues around candidate locations: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . done.
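The choice of a 250 m query radius, rather than the 200 m neighborhood radius, can be motivated with a little geometry. The sketch below is a worked calculation rather than part of the notebook: in a hexagonal grid, the point farthest from every center is the middle of the equilateral triangle formed by three neighboring centers, at a distance of spacing divided by the square root of 3.

```python
import math

spacing_m = 400.0  # distance between neighboring grid centers in the notebook

# In a hexagonal grid, the point farthest from every center is the middle of the
# equilateral triangle formed by three neighboring centers (its circumcenter),
# which lies spacing / sqrt(3) away from each of them.
max_gap_m = spacing_m / math.sqrt(3)

for radius_m in (200, 250):
    status = 'covers' if radius_m >= max_gap_m else 'leaves gaps in'
    print('radius {} m {} the grid (farthest point is {:.1f} m from a center)'.format(
        radius_m, status, max_gap_m))
```

This only holds where the grid is complete; around candidate centers that were dropped during address filtering, some venues may still be missed.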


Get the main descriptive statistics about restaurants in our study.

In [22]:
import numpy as np

print('Total number of restaurants:', len(restaurants))
print('Total number of Italian restaurants:', len(italian_restaurants))
print('Percentage of Italian restaurants: {:.2f}%'.format(len(italian_restaurants) / len(restaurants) * 100))
print('Average number of restaurants in neighborhood:', np.array([len(r) for r in location_restaurants]).mean())

Total number of restaurants: 445
Total number of Italian restaurants: 72
Percentage of Italian restaurants: 16.18%
Average number of restaurants in neighborhood: 3.097560975609756


Get an overview of the list of all restaurants.

In [23]:
print('List of all restaurants')
print('-----------------------')
for r in list(restaurants.values())[:10]:
    print(r)
print('...')
print('Total:', len(restaurants))

List of all restaurants
-----------------------
('4bd1feef41b9ef3bae17fde5', "Lion D'or", 46.180438656431654, 6.140090729833083, 'Carouge (GE), Suisse', 217, True, -183691.59997045575, 5152386.660686689)
('4d31fa3e8c42a1cde11be45d', 'Café du Rondeau', 46.17996713598073, 6.139056, 'Place du Rondeau 1, 1217 Carouge, Suisse', 122, False, -183777.30691975844, 5152343.2765200855)
('4b687da1f964a520d17b2be3', 'Kudéta', 46.179934, 6.138951, 'Rondeau de Carouge, 9, 1227 Carouge, Suisse', 117, False, -183785.82039484556, 5152340.507888565)
('5c1153c6ad1789002d654c13', 'Yeast', 46.18088, 6.140539, 'Rue Ancienne 64, 1227 Carouge, Suisse', 175, False, -183651.50912955578, 5152431.782521267)
('5322dfaa498ee62ab3ae1676', 'Hancé Café', 46.182177, 6.141111, 'Schweiz', 190, False, -183591.18837041594, 5152570.851330211)
('4c029c03f56c2d7fbea41b66', 'Sakura sushi bar', 46.183080298734694, 6.1421796101178625, 'Rue du Nord 5, 1180 Rolle, Suisse', 243, False, -183497.48055307346, 5152661.892474028)
('4f8e9

Get an overview of Italian restaurants.

In [24]:
print('List of Italian restaurants')
print('---------------------------')
for r in list(italian_restaurants.values())[:10]:
    print(r)
print('...')
print('Total:', len(italian_restaurants))

List of Italian restaurants
---------------------------
('4bd1feef41b9ef3bae17fde5', "Lion D'or", 46.180438656431654, 6.140090729833083, 'Carouge (GE), Suisse', 217, True, -183691.59997045575, 5152386.660686689)
('4b615192f964a520f70f2ae3', 'Via Roma', 46.184032060802586, 6.139397767922606, 'Place du Marché 20 (Rue Jacques-Dalphin), 1227 Carouge, Suisse', 36, True, -183700.16619841126, 5152791.718389224)
('4b82d398f964a52087e730e3', 'Casa Italia', 46.18459336290256, 6.146632692951479, 'Route de Veyrier 32, 1227 Carouge (GE), Suisse', 141, True, -183135.11033136758, 5152791.301490974)
('4bc6e4fc15a7ef3b6d6b78da', 'Mi Piace', 46.18728556149642, 6.133066608729702, 'Av de la Praille 40, 1217 Carouge, Suisse', 159, True, -184147.83241636562, 5153207.970374097)
('4dd7ff8f45ddced820952b80', 'Da Renato', 46.186076168139586, 6.13869144692236, 'Rue Jacques-Dalphin 14, 1227 Carouge, Suisse', 168, True, -183729.10923434782, 5153024.8438150855)
('51484794e4b05ac9a726abe2', 'Dai tre fratelli Scalea'

Get an overview of restaurants around locations.

In [25]:
print('Restaurants around location')
print('---------------------------')
for i in range(100, 110):
    rs = location_restaurants[i][:8]
    names = ', '.join([r[1] for r in rs])
    print('Restaurants around locations {}: {}'.format(i+1, names))

Restaurants around location
---------------------------
Restaurants around locations 101: Bistrot du Boeuf Rouge, Parfums De Beyrouth, El Faro, Jeck's Place, Kirin Asia, Il Monte Bianco, Restaurant Miyako, L'Entrecôte Couronnée
Restaurants around locations 102: Lòu One, FloorTWO Restaurant, Le Chat Botté - Beau Rivage Palace - Genève, Windows Restaurant, Le Grill, Sam-Lor (Le), Il Vero, Le Sumo Yakitori
Restaurants around locations 103: Buvette des Bains des Pâquis
Restaurants around locations 104: Tordoya
Restaurants around locations 105: 
Restaurants around locations 106: il Forno a Legna, Aux deux portes, Le Portail, Restaurante E Churrascaria Aquarela
Restaurants around locations 107: La Maison D'asie, Nomade, Sansui, Giardino Italiano Risorante
Restaurants around locations 108: Cho Lon, Nagomi, Le Corail Rose, Sushi Boky, Ali Haydar Kebab, Restaurant Yinde, Spice of India, Uchino
Restaurants around locations 109: Mosaïque, Little India, Entre Homard & Côte, Le Diwane, Auberge de S

As we can see, some locations do not have any restaurant around them. This is due to the relatively small radius (200 m) used around each location.

The final step with all the necessary data gathered is to plot the restaurants in our area of interest (blue) and to show Italian restaurants in a different color (red).

In [26]:
map_gva = folium.Map(location=gva_center, zoom_start=13)
folium.Marker(gva_center, popup='Place du Bourg-de-Four').add_to(map_gva)
for res in restaurants.values():
    lat = res[2]; lon = res[3]
    is_italian = res[6]
    color = 'red' if is_italian else 'blue'
    folium.CircleMarker([lat, lon], radius=3, color=color, fill=True, fill_color=color, fill_opacity=1).add_to(map_gva)
map_gva

Now we have all the restaurants that can be found within 2.5 km of Place du Bourg-de-Four, and we know which ones are Italian. We also know which restaurants are in the vicinity of every neighborhood candidate center.

In the next section, we are going to use this data to produce the report on optimal locations for a new Italian restaurant.

## 3. Methodology

With this study, our goal is to **find the zones of Geneva** where there is a low restaurant density, especially regarding Italian restaurants. These places should be **within 2.5 km of the city center**.

In the first part of this study, we gathered **data** to conduct our analysis, i.e. the location and type of restaurants within 2.5 km of Geneva's city center (Place du Bourg-de-Four). We have also identified Italian restaurants based on Foursquare's categorization.

The second part of our study is the assessment of **restaurant density** in the selected areas of Geneva. This is done using heatmaps, which allow us to identify good candidate areas within a reasonable distance of the center that show a low number of restaurants and no Italian restaurants in the vicinity.

The final part of our study is the analysis of the best areas: we create **clusters of locations** that show no more than **two restaurants within a radius of 150 meters and no Italian restaurants within a radius of 200 meters**. We display a map of these locations and group them using **k-means clustering** in order to identify general zones, neighborhoods and addresses that should serve as the basis for street-level exploration and the search for an optimal venue location.
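The selection rule described above can be sketched as a simple filter. The example below is illustrative only: the candidate rows and their values are made up, and the constant names are hypothetical. The clustering step itself is not shown.

```python
# Hypothetical candidate rows:
# (label, restaurants within 150 m, distance to nearest Italian restaurant in meters)
candidates = [
    ('Location A', 0, 522.3),
    ('Location B', 3, 186.1),
    ('Location C', 2, 218.9),
    ('Location D', 1, 150.0),
]

MAX_RESTAURANTS = 2           # no more than two restaurants nearby
MIN_ITALIAN_DISTANCE_M = 200  # no Italian restaurant within this radius

good_locations = [
    c for c in candidates
    if c[1] <= MAX_RESTAURANTS and c[2] > MIN_ITALIAN_DISTANCE_M
]
print([c[0] for c in good_locations])  # → ['Location A', 'Location C']
```

Only the locations that pass both conditions would then be passed on to the clustering step.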

## 4. Analysis

We perform exploratory data analysis and derive supplemental information from our data. The first indicator is the **average number of restaurants in every candidate area**.

In [27]:
location_restaurants_count = [len(res) for res in location_restaurants]

df_locations['Restaurants in area'] = location_restaurants_count

print('Average number of restaurants in every area with radius=200m:', np.array(location_restaurants_count).mean())

df_locations.head(10)

Average number of restaurants in every area with radius=200m: 3.097560975609756


| | Address | Latitude | Longitude | X | Y | Distance from center | Restaurants in area |
|---|---|---|---|---|---|---|---|
| 0 | Chemin de Pinchat 27C, 1227 Carouge | 46.178054 | 6.145877 | -183274.981929 | 5152072.0 | 2475.883681 | 0 |
| 1 | Route de Veyrier 112, 1227 Carouge | 46.178454 | 6.150998 | -182874.981929 | 5152072.0 | 2426.93222 | 0 |
| 2 | Route de Vessy 14, 1206 Genève | 46.178853 | 6.156119 | -182474.981929 | 5152072.0 | 2443.358345 | 0 |
| 3 | Rue Jacques-Grosselin 35, 1227 Carouge | 46.180535 | 6.137698 | -183874.981929 | 5152418.0 | 2351.595203 | 3 |
| 4 | Rue Joseph-Girard 15, 1227 Carouge | 46.180935 | 6.142819 | -183474.981929 | 5152418.0 | 2193.17122 | 2 |
| 5 | Chemin Charles-Poluzzi 51, 1227 Carouge | 46.181335 | 6.147939 | -183074.981929 | 5152418.0 | 2100.0 | 2 |
| 6 | Route de Veyrier 98B, 1227 Carouge | 46.181734 | 6.15306 | -182674.981929 | 5152418.0 | 2080.865205 | 0 |
| 7 | Route du Bout-du-Monde 27, 1206 Genève | 46.182134 | 6.158182 | -182274.981929 | 5152418.0 | 2137.755833 | 0 |
| 8 | Route de Vessy, 1234 Veyrier | 46.182533 | 6.163303 | -181874.981929 | 5152418.0 | 2264.950331 | 0 |
| 9 | Chemin des Beaux-Champs 44, 1234 Vessy | 46.182932 | 6.168424 | -181474.981929 | 5152418.0 | 2451.530134 | 0 |


The next step is to compute the **distance to the nearest Italian restaurant from every area's candidate center**.

In [28]:
distances_to_italian_restaurant = []

for area_x, area_y in zip(xs, ys):
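    min_distance = 10000  # Starting value; also the largest distance that can be reported
    for res in italian_restaurants.values():
        res_x = res[7]  # Restaurant X coordinate
        res_y = res[8]  # Restaurant Y coordinate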
        d = calc_xy_distance(area_x, area_y, res_x, res_y)
        if d<min_distance:
            min_distance = d
    distances_to_italian_restaurant.append(min_distance)

df_locations['Distance to Italian restaurant'] = distances_to_italian_restaurant
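
The nested loop above can be written more compactly. The sketch below is a dependency-free illustration rather than the notebook's own code: `nearest_distance` and the sample coordinates are hypothetical, and plain Euclidean distance via `math.hypot` stands in for `calc_xy_distance`, on the assumption that X/Y are projected coordinates in meters. Using `float('inf')` as the fallback also avoids capping the result at the fixed starting value of 10000.

```python
import math

def nearest_distance(x, y, points):
    """Return the distance from (x, y) to the closest point, or inf if there are none."""
    return min((math.hypot(px - x, py - y) for px, py in points),
               default=float('inf'))

# Hypothetical projected coordinates in meters (not taken from the Foursquare data)
italian_xy = [(300.0, 400.0), (1000.0, 0.0), (-600.0, 800.0)]
area_centers = [(0.0, 0.0), (900.0, 0.0)]

distances = [nearest_distance(x, y, italian_xy) for x, y in area_centers]
print(distances)
```

With real data, `italian_xy` would be built from the X/Y fields of `italian_restaurants` and `area_centers` from `zip(xs, ys)`.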

In [29]:
df_locations.head(10)

| | Address | Latitude | Longitude | X | Y | Distance from center | Restaurants in area | Distance to Italian restaurant |
|---|---|---|---|---|---|---|---|---|
| 0 | Chemin de Pinchat 27C, 1227 Carouge | 46.178054 | 6.145877 | -183274.981929 | 5152072.0 | 2475.883681 | 0 | 522.282849 |
| 1 | Route de Veyrier 112, 1227 Carouge | 46.178454 | 6.150998 | -182874.981929 | 5152072.0 | 2426.93222 | 0 | 765.187964 |
| 2 | Route de Vessy 14, 1206 Genève | 46.178853 | 6.156119 | -182474.981929 | 5152072.0 | 2443.358345 | 0 | 976.532305 |
| 3 | Rue Jacques-Grosselin 35, 1227 Carouge | 46.180535 | 6.137698 | -183874.981929 | 5152418.0 | 2351.595203 | 3 | 186.05687 |
| 4 | Rue Joseph-Girard 15, 1227 Carouge | 46.180935 | 6.142819 | -183474.981929 | 5152418.0 | 2193.17122 | 2 | 218.887168 |
| 5 | Chemin Charles-Poluzzi 51, 1227 Carouge | 46.181335 | 6.147939 | -183074.981929 | 5152418.0 | 2100.0 | 2 | 378.01762 |
| 6 | Route de Veyrier 98B, 1227 Carouge | 46.181734 | 6.15306 | -182674.981929 | 5152418.0 | 2080.865205 | 0 | 592.452566 |
| 7 | Route du Bout-du-Monde 27, 1206 Genève | 46.182134 | 6.158182 | -182274.981929 | 5152418.0 | 2137.755833 | 0 | 937.604802 |
| 8 | Route de Vessy, 1234 Veyrier | 46.182533 | 6.163303 | -181874.981929 | 5152418.0 | 2264.950331 | 0 | 1250.46571 |
| 9 | Chemin des Beaux-Champs 44, 1234 Vessy | 46.182932 | 6.168424 | -181474.981929 | 5152418.0 | 2451.530134 | 0 | 1309.795064 |


In [30]:
print('Average distance to closest Italian restaurant from each area center:', df_locations['Distance to Italian restaurant'].mean())

Average distance to closest Italian restaurant from each area center: 371.543841177495


On average, the closest Italian restaurant is about 371m from an area's center, meaning that we have to be careful about our filtering.

We create a **heatmap** presenting the **density of restaurants** to get a better overview of their distribution. We also show **distances from Geneva's center** through **white circles** on our map, indicating distances of **834m, 1668m and 2502m from Place du Bourg-de-Four (thirds of the maximum distance from the city center in scope)**.
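
The three ring radii are simply thirds of the 2502m maximum. As a small illustration, the hypothetical helper `ring_radii` below (not part of the notebook) reproduces them; the resulting values could feed a loop of `folium.Circle(gva_center, radius=r, ...)` calls instead of three separate ones.

```python
def ring_radii(max_radius, n_rings):
    """Split max_radius into n_rings equally spaced radii, rounded to whole meters."""
    return [round(max_radius * i / n_rings) for i in range(1, n_rings + 1)]

radii = ring_radii(2502, 3)
print(radii)  # [834, 1668, 2502]
```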

In [31]:
restaurant_latlons = [[res[2], res[3]] for res in restaurants.values()]
italian_latlons = [[res[2], res[3]] for res in italian_restaurants.values()]

In [32]:
from folium import plugins
from folium.plugins import HeatMap

map_gva = folium.Map(location=gva_center, zoom_start=13)
folium.TileLayer('cartodbpositron').add_to(map_gva) #cartodbpositron cartodbdark_matter
HeatMap(restaurant_latlons).add_to(map_gva)
folium.Marker(gva_center).add_to(map_gva)
folium.Circle(gva_center, radius=834, fill=False, color='white').add_to(map_gva)
folium.Circle(gva_center, radius=1668, fill=False, color='white').add_to(map_gva)
folium.Circle(gva_center, radius=2502, fill=False, color='white').add_to(map_gva)
map_gva

From the heatmap above, it seems that there is a **lower restaurant density** in the south / south-east / south-west directions from the city center. Since there is a park in the south-west direction, we **choose to focus on the south / south-east direction from the city center**.

The following heatmap presents the **density of Italian restaurants only**.

In [33]:
map_gva = folium.Map(location=gva_center, zoom_start=13)
folium.TileLayer('cartodbpositron').add_to(map_gva) #cartodbpositron cartodbdark_matter
HeatMap(italian_latlons).add_to(map_gva)
folium.Marker(gva_center).add_to(map_gva)
folium.Circle(gva_center, radius=834, fill=False, color='white').add_to(map_gva)
folium.Circle(gva_center, radius=1668, fill=False, color='white').add_to(map_gva)
folium.Circle(gva_center, radius=2502, fill=False, color='white').add_to(map_gva)
map_gva

The heatmap above seems to show that there is also a **lower Italian restaurant density** in the south / south-west / south-east directions. Again, since there is a park in the south-west direction, we **choose to focus on the south / south-east direction from the city center**.

**Both previous heatmaps point out that we should focus on the south / south-east direction from the city center**. This corresponds to the **Old Town and Champel neighborhoods**, which are, respectively, very popular among tourists and primarily residential.



### 4.1) Old Town and Champel neighborhoods

Based on the qualitative description of the neighborhoods of Geneva (which can be found here: [Discover Geneva and its districts](https://www.geneve.ch/en/what-geneva/discover-geneva-districts)), we extract that:
- "The old town is an administrative centre. Both the government and the parliament of the canton of Geneva can be found in Rue de l’Hôtel-de-Ville. [...] In this district popular with tourists, the inhabitants of Eaux-Vives Cité [Cité includes Old Town and surroundings] enjoy an extensive cultural offering including theatres, museums, concert halls and much more."
- "While Champel is primarily residential, the lower part of the district to the east consists of more public buildings, office space and shops, as it is close to the city centre. [...] With the services it provides, it attracts numerous people from throughout the canton. In the upper reaches of Champel, near Florissant, a wide range of tertiary-sector firms can be found nestling between the residential buildings."

Therefore, **there is potential for clients in these neighborhoods and we are going to pursue our analysis in that direction**.

The next step is to **narrow the region of interest to the parts of Old Town and Champel that have a low restaurant density and are closest to Place du Bourg-de-Four**.

In [34]:
roi_x_min = gva_center_x
roi_y_max = gva_center_y 
roi_width = 2200
roi_height = 2200
roi_center_x = roi_x_min + 500
roi_center_y = roi_y_max - 500
roi_center_lon, roi_center_lat = xy_to_lonlat(roi_center_x, roi_center_y)
roi_center = [roi_center_lat, roi_center_lon]

map_gva = folium.Map(location=roi_center, zoom_start=14)
HeatMap(restaurant_latlons).add_to(map_gva)
folium.Marker(gva_center).add_to(map_gva)
folium.Circle(roi_center, radius=1100, color='white', fill=True, fill_opacity=0.4).add_to(map_gva)
map_gva

This covers most of the pockets of low restaurant density in Old Town and Champel neighborhoods.
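
A quick check of the geometry helps relate this region of interest to the city center. The sketch below uses only the offsets and radius from the cell above, and assumes that X increases eastward and Y northward in meters (as is usual for projected coordinates).

```python
import math

# Values from the cell above: the ROI center is shifted +500 m in X and -500 m in Y
# from the city center, and the ROI circle has a radius of 1100 m.
offset_x, offset_y, roi_radius = 500, -500, 1100

center_offset = math.hypot(offset_x, offset_y)  # city center to ROI center
farthest_point = center_offset + roi_radius     # farthest ROI point from the city center

print(f'ROI center is {center_offset:.0f} m from the city center')
print(f'Farthest ROI point is {farthest_point:.0f} m from the city center')
print('City center inside ROI:', center_offset <= roi_radius)
```

Under these assumptions the whole circle stays within roughly 1.8 km of Place du Bourg-de-Four, and the city center itself lies inside it.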

Then, we create a new, denser grid of location candidates restricted to the new region of interest (location candidates 50m apart).

In [48]:
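k = math.sqrt(3) / 2  # Ratio of row spacing to point spacing in a hexagonal grid
x_step = 50           # Horizontal distance between neighboring candidates (meters)
y_step = 50 * k       # Vertical distance between rows (meters)
roi_y_min = roi_center_y - 1100

roi_latitudes = []
roi_longitudes = []
roi_xs = []
roi_ys = []
for i in range(0, int(51/k)):
    y = roi_y_min + i * y_step
    x_offset = 25 if i%2==0 else 0  # Shift every other row by half a step
    for j in range(0, 26):
        x = roi_x_min + j * x_step + x_offset
        d = calc_xy_distance(roi_center_x, roi_center_y, x, y)
        if (d <= 1101):  # Keep only points inside the ROI circle (1100 m radius, 1 m tolerance)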
            lon, lat = xy_to_lonlat(x, y)
            roi_latitudes.append(lat)
            roi_longitudes.append(lon)
            roi_xs.append(x)
            roi_ys.append(y)

print(len(roi_latitudes), 'candidate neighborhood centers generated.')

1230 candidate neighborhood centers generated.
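
To see why the rows are spaced by `50 * sqrt(3) / 2` and shifted by 25m, the following sketch rebuilds a small patch of the same grid around a local origin (the patch size and the chosen interior point are arbitrary) and measures the distances from one interior point to its neighbors.

```python
import math

k = math.sqrt(3) / 2          # Row spacing factor, as in the cell above
x_step, y_step = 50, 50 * k

# Small illustrative patch of the grid: 4 rows x 4 columns, local origin at (0, 0)
patch = []
for i in range(4):
    x_offset = 25 if i % 2 == 0 else 0   # Same half-step shift on even rows
    for j in range(4):
        patch.append((j * x_step + x_offset, i * y_step))

# Distances from one interior point (row 1, column 1) to every other point
origin = patch[5]
neighbor_distances = sorted(math.dist(origin, q) for q in patch if q != origin)
print([round(d, 1) for d in neighbor_distances[:6]])  # six neighbors, each 50.0 m away
```

Each interior candidate therefore has six neighbors at the same 50m distance, which is what makes the grid hexagonal rather than square.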


The next step is to compute the **number of restaurants in the vicinity** (radius of 150 meters) **and** the **distance to the closest Italian restaurant**.

In [50]:
def count_restaurants_nearby(x, y, restaurants, radius=150):
    """Count the restaurants located within `radius` meters of (x, y)."""
    count = 0
    for res in restaurants.values():
        res_x, res_y = res[7], res[8]
        d = calc_xy_distance(x, y, res_x, res_y)
        if d <= radius:
            count += 1
    return count

def find_nearest_restaurant(x, y, restaurants):
    """Return the distance in meters from (x, y) to the closest restaurant."""
    d_min = 100000
    for res in restaurants.values():
        res_x, res_y = res[7], res[8]
        d = calc_xy_distance(x, y, res_x, res_y)
        if d <= d_min:
            d_min = d
    return d_min

roi_restaurant_counts = []
roi_italian_distances = []

print('Generating data on location candidates... ', end='')
for x, y in zip(roi_xs, roi_ys):
    # This call passes radius=250, which overrides the function's default of 150
    count = count_restaurants_nearby(x, y, restaurants, radius=250)
    roi_restaurant_counts.append(count)
    distance = find_nearest_restaurant(x, y, italian_restaurants)
    roi_italian_distances.append(distance)
print('done.')

Generating data on location candidates... done.
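
The `radius` argument has a direct effect on the later filtering: the loop above calls `count_restaurants_nearby` with `radius=250`, whereas the function's default is 150. The sketch below uses made-up coordinates and a hypothetical stand-in function, `count_within`, to show how the same candidate can have two restaurants or fewer at one radius and more at the other.

```python
import math

def count_within(x, y, points, radius=150):
    """Count the points within `radius` meters of (x, y)."""
    return sum(1 for px, py in points if math.hypot(px - x, py - y) <= radius)

# Hypothetical restaurant positions in meters, relative to a candidate at (0, 0)
restaurants_xy = [(100, 0), (0, 140), (200, 0), (0, -240), (180, 180)]

count_150 = count_within(0, 0, restaurants_xy)              # default radius
count_250 = count_within(0, 0, restaurants_xy, radius=250)  # radius passed in the loop above

print(count_150, count_250)
```

In this made-up example the candidate has two restaurants within 150m but four within 250m, so it would only pass a "two restaurants or fewer" filter at the smaller radius.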


We put this information into a dataframe.

In [51]:
df_roi_locations = pd.DataFrame({'Latitude':roi_latitudes,
                                 'Longitude':roi_longitudes,
                                 'X':roi_xs,
                                 'Y':roi_ys,
                                 'Restaurants nearby':roi_restaurant_counts,
                                 'Distance to Italian restaurant':roi_italian_distances})

df_roi_locations.head(10)

| | Latitude | Longitude | X | Y | Restaurants nearby | Distance to Italian restaurant |
|---|---|---|---|---|---|---|
| 0 | 46.186364 | 6.157174 | -182299.981929 | 5152897.0 | 0 | 841.735285 |
| 1 | 46.186414 | 6.157814 | -182249.981929 | 5152897.0 | 0 | 862.699603 |
| 2 | 46.186474 | 6.153591 | -182574.981929 | 5152940.0 | 0 | 579.49384 |
| 3 | 46.186524 | 6.154231 | -182524.981929 | 5152940.0 | 0 | 627.953781 |
| 4 | 46.186574 | 6.154871 | -182474.981929 | 5152940.0 | 0 | 676.637858 |
| 5 | 46.186624 | 6.155511 | -182424.981929 | 5152940.0 | 0 | 725.500952 |
| 6 | 46.186674 | 6.156151 | -182374.981929 | 5152940.0 | 1 | 774.509181 |
| 7 | 46.186724 | 6.156791 | -182324.981929 | 5152940.0 | 1 | 823.636639 |
| 8 | 46.186774 | 6.157432 | -182274.981929 | 5152940.0 | 2 | 836.153647 |
| 9 | 46.186824 | 6.158072 | -182224.981929 | 5152940.0 | 2 | 812.797791 |


Following that, we **filter those locations**, since we are only interested in those with **two restaurants or fewer in a radius of 150 meters and no Italian restaurants in a radius of 250 meters**.

In [54]:
good_res_count = np.array((df_roi_locations['Restaurants nearby']<=2))
print('Locations with no more than two restaurants nearby:', good_res_count.sum())

good_ita_distance = np.array(df_roi_locations['Distance to Italian restaurant']>=250)
print('Locations with no Italian restaurants within 250m:', good_ita_distance.sum())

good_locations = np.logical_and(good_res_count, good_ita_distance)
print('Locations with both conditions met:', good_locations.sum())

df_good_locations = df_roi_locations[good_locations]

Locations with no more than two restaurants nearby: 394
Locations with no Italian restaurants within 250m: 482
Locations with both conditions met: 293
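
The three counts above, together with the 1230 candidates generated earlier, are enough to work out how the two conditions overlap. The following lines are plain arithmetic on the printed figures (inclusion-exclusion), with variable names chosen for illustration.

```python
# Counts printed by the notebook above
total_candidates = 1230   # grid points in the region of interest
few_restaurants = 394     # no more than two restaurants nearby
no_italian_near = 482     # no Italian restaurant within 250 m
both = 293                # both conditions met

at_least_one = few_restaurants + no_italian_near - both   # inclusion-exclusion
neither = total_candidates - at_least_one

print('Meet at least one condition:', at_least_one)
print('Meet neither condition:', neither)
print(f'Share kept after filtering: {both / total_candidates:.1%}')
```

So a little under a quarter of the candidate grid survives the combined filter.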


We then plot these locations on a map.

In [55]:
good_latitudes = df_good_locations['Latitude'].values
good_longitudes = df_good_locations['Longitude'].values

good_locations = [[lat, lon] for lat, lon in zip(good_latitudes, good_longitudes)]

map_gva = folium.Map(location=roi_center, zoom_start=14)
folium.TileLayer('cartodbpositron').add_to(map_gva)
HeatMap(restaurant_latlons).add_to(map_gva)
folium.Circle(roi_center, radius=1100, color='white', fill=True, fill_opacity=0.6).add_to(map_gva)
folium.Marker(gva_center).add_to(map_gva)
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_gva)
map_gva

Each of these locations is quite close to Place du Bourg-de-Four, has two restaurants or fewer in a radius of 150m, and has no Italian restaurant closer than 250m. **Based on close competition, each of these locations is a potential candidate for a new Italian restaurant**.

The next step is to show these potential locations in a heatmap.

In [56]:
map_gva = folium.Map(location=roi_center, zoom_start=14)
HeatMap(good_locations, radius=25).add_to(map_gva)
folium.Marker(gva_center).add_to(map_gva)
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_gva)
map_gva

The heatmap above clearly shows the zones with a low number of restaurants in the vicinity and no Italian restaurants nearby.

The following step is to **cluster these locations** in order to **generate centers of areas that contain good locations**. These zones, their centers and addresses will be the final result of our analysis.

In [57]:
from sklearn.cluster import KMeans

number_of_clusters = 15

good_xys = df_good_locations[['X', 'Y']].values
kmeans = KMeans(n_clusters=number_of_clusters, random_state=0).fit(good_xys)

cluster_centers = [xy_to_lonlat(cc[0], cc[1]) for cc in kmeans.cluster_centers_]

map_gva = folium.Map(location=roi_center, zoom_start=14)
folium.TileLayer('cartodbpositron').add_to(map_gva)
HeatMap(restaurant_latlons).add_to(map_gva)
folium.Circle(roi_center, radius=1100, color='white', fill=True, fill_opacity=0.4).add_to(map_gva)
folium.Marker(gva_center).add_to(map_gva)
for lon, lat in cluster_centers:
    folium.Circle([lat, lon], radius=300, color='green', fill=True, fill_opacity=0.25).add_to(map_gva) 
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_gva)
map_gva


The clusters above group most of the candidate locations, and the cluster centers are placed in the middle of the areas that contain many location candidates.

The addresses of these cluster centers are a good starting point to analyse the neighborhoods and find the best possible location based on neighborhood characteristics.
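
To make it concrete why a cluster center ends up "in the middle" of its candidates, here is a deliberately simplified, dependency-free version of the k-means iteration. It is only a sketch: the points and starting centers are hypothetical, and scikit-learn's `KMeans` (used above) chooses its starting centers with k-means++ by default and is the better choice in practice.

```python
import math

def kmeans(points, centers, iterations=20):
    """Plain Lloyd's algorithm: assign points to the nearest center, then re-center."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Each center moves to the mean of its group (unchanged if the group is empty)
        centers = [
            (sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers, groups

# Hypothetical candidate locations (meters): two well-separated clumps
points = [(0, 0), (50, 0), (0, 50), (50, 50),
          (1000, 1000), (1050, 1000), (1000, 1050), (1050, 1050)]

centers, groups = kmeans(points, centers=[(0, 0), (1050, 1050)])
print(centers)
```

Each resulting center is the average position of the candidates assigned to it, which is why reverse geocoding the centers gives a representative address for each zone.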

The following step is to plot these areas on a city map with shaded areas indicating the previous clusters.


Zooming in on candidate areas in Old Town.

In [67]:
map_gva = folium.Map(location=[46.197436, 6.146500], zoom_start=15)
folium.Marker(gva_center).add_to(map_gva)
for lon, lat in cluster_centers:
    folium.Circle([lat, lon], radius=300, color='green', fill=False).add_to(map_gva) 
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.Circle([lat, lon], radius=150, color='#0000ff00', fill=True, fill_color='#0066ff', fill_opacity=0.07).add_to(map_gva)
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_gva)
map_gva

Zooming in on candidate areas in Champel.

In [68]:

map_gva = folium.Map(location=[46.189682, 6.157540], zoom_start=15)
folium.Marker(gva_center).add_to(map_gva)
for lon, lat in cluster_centers:
    folium.Circle([lat, lon], radius=300, color='green', fill=False).add_to(map_gva) 
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.Circle([lat, lon], radius=150, color='#0000ff00', fill=True, fill_color='#0066ff', fill_opacity=0.07).add_to(map_gva)
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_gva)
map_gva

The final step is to reverse geocode these candidate area centers and obtain addresses that can be presented to people interested in opening an Italian restaurant in Geneva.


In [69]:
candidate_area_addresses = []
print('==============================================================')
print('Addresses of centers of areas recommended for further analysis')
print('==============================================================\n')
for lon, lat in cluster_centers:
    addr = get_address(google_api_key, lat, lon).replace(', Switzerland', '')
    candidate_area_addresses.append(addr)    
    x, y = lonlat_to_xy(lon, lat)
    d = calc_xy_distance(x, y, gva_center_x, gva_center_y)
    print('{}{} => {:.1f}km from Place du Bourg-de-Four'.format(addr, ' '*(50-len(addr)), d/1000))

Addresses of centers of areas recommended for further analysis

Chemin des Glycines 8, 1206 Genève                 => 1.7km from Place du Bourg-de-Four
Avenue Théodore-Weber 2, 1208 Genève               => 1.0km from Place du Bourg-de-Four
Chemin des Crêts-de-Champel 24, 1206 Genève        => 1.5km from Place du Bourg-de-Four
Avenue Peschier 24, 1206 Genève                    => 1.2km from Place du Bourg-de-Four
Rue Le-Corbusier 40, 1208 Genève                   => 1.6km from Place du Bourg-de-Four
Route de Malagnou 30, 1208 Genève                  => 0.9km from Place du Bourg-de-Four
Chemin des Clochettes 4, 1206 Genève               => 1.6km from Place du Bourg-de-Four
Chemin de la Tour-de-Champel 12, 1206 Genève       => 1.4km from Place du Bourg-de-Four
Avenue Peschier 41, 1206 Genève                    => 1.5km from Place du Bourg-de-Four
Chemin de Beau-Soleil 12, 1206 Genève              => 1.7km from Place du Bourg-de-Four
Rue Ernest-Bloch 54, 1207 Genève                   => 1.

This was the final step of our analysis. We have obtained 15 addresses representing the centers of zones containing locations with a limited number of restaurants and no Italian restaurants nearby, with each of these zones quite close to the city center (within 2km of Place du Bourg-de-Four). These should only be considered as starting points, and a deeper exploration of the corresponding zones is necessary. These zones are interesting since they are either popular with tourists or residential with a fair amount of offices, close to the city center and well served by public transportation.
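
The "within 2km" statement can be checked against the listing above. The sketch below copies only the lines of that listing that are complete (the remaining entries are not reproduced here) and looks for the largest and smallest distances.

```python
# (address, distance in km) pairs copied from the complete lines of the listing above
listed = [
    ('Chemin des Glycines 8, 1206 Genève', 1.7),
    ('Avenue Théodore-Weber 2, 1208 Genève', 1.0),
    ('Chemin des Crêts-de-Champel 24, 1206 Genève', 1.5),
    ('Avenue Peschier 24, 1206 Genève', 1.2),
    ('Rue Le-Corbusier 40, 1208 Genève', 1.6),
    ('Route de Malagnou 30, 1208 Genève', 0.9),
    ('Chemin des Clochettes 4, 1206 Genève', 1.6),
    ('Chemin de la Tour-de-Champel 12, 1206 Genève', 1.4),
    ('Avenue Peschier 41, 1206 Genève', 1.5),
    ('Chemin de Beau-Soleil 12, 1206 Genève', 1.7),
]

within_2km = all(km < 2.0 for _, km in listed)
closest = min(listed, key=lambda item: item[1])
farthest_km = max(km for _, km in listed)

print('All listed centers within 2 km:', within_2km)
print('Closest center:', closest[0], f'({closest[1]} km)')
print('Largest listed distance:', farthest_km, 'km')
```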

In [70]:
map_gva = folium.Map(location=roi_center, zoom_start=14)
folium.Circle(gva_center, radius=50, color='red', fill=True, fill_color='red', fill_opacity=1).add_to(map_gva)
for lonlat, addr in zip(cluster_centers, candidate_area_addresses):
    folium.Marker([lonlat[1], lonlat[0]], popup=addr).add_to(map_gva) 
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.Circle([lat, lon], radius=150, color='#0000ff00', fill=True, fill_color='#0066ff', fill_opacity=0.05).add_to(map_gva)
map_gva

## 5) Results and Discussion 


This study shows that, despite the high number of restaurants in Geneva, we can still spot pockets of low restaurant density close to the city center. The highest concentration of restaurants was detected north and west of Place du Bourg-de-Four (the heart of Geneva), so we focused our attention on the areas to the south / south-east. This corresponds to the Old Town and Champel neighborhoods. Together, these neighborhoods are close to the city center, include zones popular with tourists, show favorable socio-economic dynamics, and contain some pockets of low restaurant density.

After narrowing our focus to a smaller area of interest, covering circa 2.2km by 2.2km to the south-east of Place du Bourg-de-Four, we created a dense grid of location candidates spaced 50m apart. These locations were then filtered to remove those with more than two restaurants in a radius of 150m or an Italian restaurant closer than 250m.

The resulting location candidates were then clustered to create zones of interest containing the greatest number of location candidates. The addresses of the centers of these zones were then obtained using reverse geocoding, to be used as starting points for a more detailed local analysis based on additional factors.

The final result of these procedures is **15 areas containing the largest number of potential new restaurant spots based on the number of and the distance to existing venues, i.e. restaurants and, more specifically, Italian restaurants**. As a reminder, the goal of this study was to provide insights on areas close to Geneva's center that are not overcrowded with restaurants. A lack of competition alone does not make a location suitable, so other criteria should be considered before opening a new restaurant. The recommended zones should therefore be considered only as a starting point for more detailed analysis.

## 6) Conclusion

The goal of this study was to find the best spot to open an Italian restaurant in Geneva, Switzerland, for entrepreneurs who might be interested in this type of business. By analysing the restaurant density distribution obtained from Foursquare's data, we identified neighborhoods that warranted further analysis (Old Town and Champel). As a next step, we generated a collection of locations satisfying requirements regarding existing restaurants. These locations were then clustered to delimit the main areas of interest with the highest number of potential locations. As a final step, the addresses of the zone centers were retrieved to serve as starting points for a more in-depth exploration by people who might be interested in this type of business. Of course, additional criteria should be added to this study in order to have a more precise approach and reduce the risks involved.