Because the Google Places API returns at most 20 results per call, a single request cannot cover the whole city. Instead, we divide Geneva into an evenly spaced grid and make one location-based API call per grid point.

In [2]:
# required module import
import shapely.geometry as shg
import pyproj as pj
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go

In [168]:
# define transformers between lat/long (EPSG:4326) and Web Mercator (EPSG:3857), and back
transformer_proxy = pj.Transformer.from_crs('epsg:4326', 'epsg:3857')
transformer_origin = pj.Transformer.from_crs('epsg:3857', 'epsg:4326')

We need to define the south-west and north-east corners of a bounding box around the city (the city grid). The corners were read off a rectangle drawn in Google My Maps that fully covers Geneva's city center. Because of the city's irregular shape, this rectangle extends beyond the city center and is refined later.

<img src="https://i.imgur.com/YzAmBXI.png" alt="choosing grid area" style="width:400px;"/>

In [169]:
# define south-west and north-east corners of the main grid as (lat, long), so .x is latitude and .y is longitude
sw_corner = shg.Point((46.175395, 6.096637))
ne_corner = shg.Point((46.233728, 6.178177))

In [170]:
# transformation of the corners to web mercator projection
sw_corner_transformed = transformer_proxy.transform(sw_corner.x, sw_corner.y)
ne_corner_transformed = transformer_proxy.transform(ne_corner.x, ne_corner.y)
sw_corner_transformed, ne_corner_transformed

((678674.526391431, 5808501.080072539), (687751.5176707145, 5817883.734060071))
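
For reference, the forward transformation to EPSG:3857 is the spherical Mercator formula, which can be sketched in plain Python. The helper names `to_web_mercator` and `ground_scale` are illustrative and not part of pyproj, and the sketch assumes the sphere radius of 6378137 m used by Web Mercator.

```python
import math

R = 6378137.0  # sphere radius used by Web Mercator (EPSG:3857), in metres

def to_web_mercator(lat_deg, lon_deg):
    """Spherical Mercator forward projection: (lat, long) in degrees -> (x, y)."""
    x = R * math.radians(lon_deg)
    y = R * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y

def ground_scale(lat_deg):
    """Approximate ground metres per projected unit at a given latitude."""
    return math.cos(math.radians(lat_deg))

# south-west corner from the cell above
sw_xy = to_web_mercator(46.175395, 6.096637)
print(sw_xy)
print(100 * ground_scale(46.2))  # ground length of 100 projected units
```

Web Mercator stretches distances by a factor of 1/cos(latitude). At Geneva's latitude the scale factor is roughly 0.69, so a step of 100 projected units corresponds to about 69 m on the ground rather than 100 m.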

In [171]:
grid_size = 100 # grid spacing in EPSG:3857 units (about 69 m on the ground at Geneva's latitude)

grid_points = []
x1, x2 = sw_corner_transformed[0], ne_corner_transformed[0]

while x1 < x2:
    y1, y2 = sw_corner_transformed[1], ne_corner_transformed[1]
    while y1 < y2:
        point = shg.Point(transformer_origin.transform(x1,y1)) # transforming points back to lat/long
        grid_points.append(point)
        y1 += grid_size
    x1 += grid_size
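
As a sanity check, the number of points the nested loops produce can be worked out from the projected corners alone. This is a small sketch using the corner values printed above; the names `n_cols` and `n_rows` are illustrative.

```python
import math

# projected corner coordinates printed above (EPSG:3857)
sw_x, sw_y = 678674.526391431, 5808501.080072539
ne_x, ne_y = 687751.5176707145, 5817883.734060071
grid_size = 100

# a loop that starts at the lower bound and steps while value < upper bound
# runs ceil(extent / step) times
n_cols = math.ceil((ne_x - sw_x) / grid_size)
n_rows = math.ceil((ne_y - sw_y) / grid_size)
print(n_cols, n_rows, n_cols * n_rows)  # 91 94 8554
```

The product is the length that `grid_points` should have once the loops finish.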

In [172]:
# converting coordinate points to pandas dataframe
lat_long, lat, long = [], [], []
for point in grid_points:
    lat_long.append(f'{point.x},{point.y}')
    lat.append(point.x)
    long.append(point.y)
df_grid_points = pd.DataFrame({'lat_long': lat_long, 'lat': lat, 'long': long})
df_grid_points.head()

Unnamed: 0,lat_long,lat,long
0,"46.175395,6.096636999999999",46.175395,6.096637
1,"46.17601703765057,6.096636999999999",46.176017,6.096637
2,"46.176639068264926,6.096636999999999",46.176639,6.096637
3,"46.17726109184308,6.096636999999999",46.177261,6.096637
4,"46.17788310838503,6.096636999999999",46.177883,6.096637
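
The `lat_long` strings are already in the `lat,long` form that a location-based request expects. The sketch below shows how one grid point might be turned into a request URL. It rests on several assumptions: the endpoint is taken to be the legacy Places Nearby Search (the notebook does not say which endpoint is used), the helper names `covering_radius` and `nearby_search_url` are hypothetical, `YOUR_API_KEY` is a placeholder, and the radius rule is only one possible choice. No request is sent.

```python
import math
from urllib.parse import urlencode

# Assumption: the legacy Places "Nearby Search" endpoint; the notebook does not
# say which Places endpoint is called.
NEARBY_SEARCH_URL = "https://maps.googleapis.com/maps/api/place/nearbysearch/json"

def covering_radius(spacing_m):
    """Smallest circle radius that leaves no gaps on a square grid.

    The farthest point from any grid point is the cell centre, which is
    half a cell diagonal away: spacing / sqrt(2).
    """
    return spacing_m / math.sqrt(2)

def nearby_search_url(lat_long, radius_m, api_key):
    """Build (but do not send) a request URL for one grid point."""
    # round the radius up so that the circles still cover the cell centres
    params = {"location": lat_long, "radius": math.ceil(radius_m), "key": api_key}
    return f"{NEARBY_SEARCH_URL}?{urlencode(params)}"

url = nearby_search_url("46.175395,6.096637", covering_radius(100), "YOUR_API_KEY")
print(url)
```

A radius smaller than `spacing / sqrt(2)` would leave uncovered gaps around the cell centres, while a much larger one makes neighbouring calls return mostly the same places.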


In [125]:
# illustrate coordinates on a map
def plot_gva_grids(df):
    # read the token and close the file; strip any trailing newline
    with open("../.env/.mapbox_access_token") as f:
        mapbox_access_token = f.read().strip()

    fig = go.Figure(go.Scattermapbox(
            lat=df['lat'],
            lon=df['long'],
            mode='markers',
            marker=go.scattermapbox.Marker(
                size=3
            )))

    fig.update_layout(
        autosize=False,
        width=600,
        height=800,
        margin=dict(l=20, r=20, t=20, b=20),
        hovermode='closest',
        mapbox=dict(
            accesstoken=mapbox_access_token,
            bearing=0,
            center=dict(
                lat=46.20199,
                lon=6.14599
            ),
            pitch=0,
            zoom=11
        ),
    )

    fig.show()

In [302]:
print(f'{len(df_grid_points)} initial grid points have been created')
plot_gva_grids(df_grid_points)

8554 initial grid points have been created


<img src="https://i.imgur.com/SMEtEfG.png" alt="initial city grid" style="width:400px;"/>
<center>Initial city grid, before data cleaning</center>

In [300]:
# points to be removed, based on a high-level visual inspection of the map
remove1 = df_grid_points[(df_grid_points['lat']>46.218) & (df_grid_points['long']> 6.154)].index.to_list()
remove2 = df_grid_points[(df_grid_points['lat']<46.1945) & (df_grid_points['long']< 6.1285)].index.to_list()
remove3 = df_grid_points[(df_grid_points['lat']<46.187) & (df_grid_points['long']< 6.155)].index.to_list()
remove4 = df_grid_points[df_grid_points['long']< 6.11].index.to_list()
remove5 = df_grid_points[(df_grid_points['lat']<46.188) & (df_grid_points['long']> 6.162)].index.to_list()
remove6 = df_grid_points[(df_grid_points['lat']<46.220) & (df_grid_points['lat']>46.210)\
                         & (df_grid_points['long']< 6.168) & (df_grid_points['long']> 6.153)].index.to_list()

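# note: the regions above overlap, so the concatenated list can contain the same index more than once
remove_points = remove1 + remove2 + remove3 + remove4 + remove5 + remove6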
print(f'{len(remove_points)} of {len(df_grid_points)} have been removed as initial clean-up')

5086 of 8554 have been removed as initial clean-up
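
The removal regions overlap: for example, any point with `long < 6.11` and `lat < 46.1945` satisfies both the `remove2` and the `remove4` condition. The length of the concatenated list may therefore count some points more than once. The sketch below illustrates the effect on a handful of made-up points; the names `rule_a`, `rule_b`, `unique` and `kept` are illustrative.

```python
# made-up (lat, long) points, not taken from the real grid
points = [
    (46.180, 6.100),  # matches both rules below
    (46.190, 6.120),  # matches the first rule only
    (46.200, 6.105),  # matches the second rule only
    (46.200, 6.140),  # matches neither rule
]

# same thresholds as the remove2 and remove4 conditions above
rule_a = [i for i, (lat, long) in enumerate(points) if lat < 46.1945 and long < 6.1285]
rule_b = [i for i, (lat, long) in enumerate(points) if long < 6.11]

concatenated = rule_a + rule_b       # index 0 appears twice
unique = sorted(set(concatenated))   # each index counted once
kept = [i for i in range(len(points)) if i not in unique]

print(len(concatenated), len(unique), kept)  # 4 3 [3]
```

Dropping rows by a list with repeated indices still removes each row only once, so the overlap affects the reported count, not which points are removed.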


In [303]:
df_tmp = df_grid_points.drop(df_grid_points.index[remove_points]) # temporary dataframe before detailed data cleaning
plot_gva_grids(df_tmp)

<img src="https://i.imgur.com/8ionnCo.png" alt="1st cleaning iteration" style="width:400px;"/>
<center>1st iteration after high-level cleaning in Python</center>

In [308]:
# df_tmp.to_excel("initial_cleanup.xlsx") # save the intermediate result as an Excel file
df_grid_points_final = pd.read_excel('../raw_data/geneva_grid_points.xlsx')
print(f'Final city grid with {len(df_grid_points_final)} data points')
plot_gva_grids(df_grid_points_final)

Final city grid with 3245 data points


<img src="https://i.imgur.com/1K6xY4f.png" alt="final iteration" style="width:400px;"/>
<center>Final iteration after detailed cleaning via Excel</center>