In [1]:
import pandas as pd
import geopandas as gpd

In [2]:
df = pd.read_csv('./data/geocode_example.csv')

In [3]:
df

Unnamed: 0,location,address
0,HEI campus,"Chemin Eugene-Rigot 2A, 1202 Geneva, Switzerland"
1,Museum of Red Cross and Red Crescent,"Avenue de la Paix 17, 1202 Geneva, Switzerland"
2,Paris,"Paris, Ile de France, France"
3,Munich,"Muenchen, Bayern, Germany"
4,Buckingham Palace,"SW1A 1AA, United Kingdom"


# 1 Using default settings

`geopandas` uses the [geopy](https://pypi.org/project/geopy/) Python client to talk to a range of popular geocoding web services. We start with the default settings.

In [4]:
locations = gpd.tools.geocode(df.address)

In [5]:
locations

Unnamed: 0,geometry,address
0,POINT (6.14364 46.22076),"Chemin Eugène Rigot 2A, 1202 Geneva, Switzerland"
1,POINT (6.13681 46.22776),"Avenue de la Paix 17, 1202 Geneva, Switzerland"
2,POINT (2.35085 48.85690),"Paris, Paris, France, France"
3,POINT (11.58022 48.13913),"Munich, BY, Germany, Germany"
4,POINT (-0.14159 51.50101),"SW1A 1AA, London, London, United Kingdom, Unit..."


# 2 Using a different geocoding provider

In the `geopandas` version used for this notebook, the default geocoder is the [GeoCode.Farm geocoding API](https://geocode.farm/) (i.e. the optional `provider` argument defaults to `'geocodefarm'`). Newer `geopandas` releases default to Photon instead, so check the documentation for your installed version. For the full set of available geocoders, see the [geopy documentation](https://geopy.readthedocs.io/en/latest/#geocoders). The GeoCode.Farm free tier allows at most 250 queries per day and 4 queries per second.

In this example, we use the [Nominatim](https://nominatim.org/) geocoder, which is built on the OpenStreetMap database.

In [6]:
locations_2 = gpd.tools.geocode(df.address, provider='nominatim', user_agent='geocoding_test', timeout=1)

In [7]:
locations_2

Unnamed: 0,geometry,address
0,POINT (6.14372 46.22320),"Chemin Eugène-Rigot, Sécheron, Pâquis, Genève,..."
1,POINT (6.14450 46.22340),"Avenue de la Paix, Sécheron, Pâquis, Genève, G..."
2,POINT (2.35146 48.85670),"Paris, Île-de-France, France métropolitaine, F..."
3,POINT (11.57538 48.13711),"München, Bayern, Deutschland"
4,POINT (-0.14159 51.50101),"City of Westminster, London, Greater London, E..."


Note the `user_agent` parameter. It identifies your application to Nominatim, which lets the service limit the number of requests per application. If you are making a lot of queries, Nominatim also asks you to provide an email address. For more details, see https://gis.stackexchange.com/questions/293615/user-agent-argument-in-nominatim-in-geopy

The `timeout` parameter sets the number of seconds we are willing to wait for a response from the service.

If you have a large dataframe to geocode, consider

- paying for a service
- using the `RateLimiter` class from `geopy`. For more details, see https://geopy.readthedocs.io/en/stable/index.html?highlight=rate%20limiter#module-geopy.extra.rate_limiter