Geoparsing Tutorial Notebook

Jupyter notebook for geoparsing historical encyclopedia texts in French using the PERDIDO Geoparser.

This notebook is proposed by L. Moncla (INSA Lyon) and K. McDonough (The Alan Turing Institute) as part of the GEODE project.


In this tutorial, we demonstrate how to use a custom version of the Perdido geoparser Python library developed in the GEODE project. We use texts from Diderot and d’Alembert’s Encyclopédie as a case study for querying a corpus and wrangling geoparsed data. We also compare Perdido’s NER annotations (i.e. its output) with the results of other well-known Python NER libraries (spaCy and Stanza).

This tutorial covers the following:

  • Loading data from TEI-XML files into a Python dataframe
  • Using a Python dataframe for simple data analysis
  • Testing the PERDIDO API for preprocessing French texts (part-of-speech tagging)
  • Testing the PERDIDO API for geoparsing (geotagging + geocoding) Encyclopédie articles
  • Displaying custom geotagging results (PERDIDO TEI-XML) with the displaCy Named Entity Visualizer
  • Displaying geocoding results on a map
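As a rough illustration of the first step in the list above, the sketch below parses a small TEI-XML string into a list of records, one per article. It is not the notebook's own code: the sample document, the `div type="article"` layout, and the `tei_to_records` helper are assumptions made for this example, and the actual corpus files may be structured differently.

```python
import xml.etree.ElementTree as ET

# Standard TEI namespace; the prefix "tei" is only a local alias.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical miniature TEI document standing in for a corpus file.
# The element layout (div/head/p) is an assumption for this sketch.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <div type="article" n="1">
        <head>LYON</head>
        <p>Grande ville de France, au confluent du Rhône et de la Saône.</p>
      </div>
      <div type="article" n="2">
        <head>PARIS</head>
        <p>Ville capitale du royaume de France, sur la Seine.</p>
      </div>
    </body>
  </text>
</TEI>"""


def tei_to_records(xml_text):
    """Return one dict per article <div>, with its id, headword and text."""
    root = ET.fromstring(xml_text)
    records = []
    for div in root.iterfind(".//tei:div[@type='article']", TEI_NS):
        head = div.find("tei:head", TEI_NS)
        # itertext() flattens any inline markup nested inside a paragraph.
        paragraphs = [
            "".join(p.itertext()).strip()
            for p in div.iterfind("tei:p", TEI_NS)
        ]
        records.append({
            "id": div.get("n"),
            "head": head.text.strip() if head is not None and head.text else "",
            "text": " ".join(paragraphs),
        })
    return records


records = tei_to_records(SAMPLE_TEI)
for record in records:
    print(record["id"], record["head"], len(record["text"].split()))
```

If pandas is installed, `pandas.DataFrame(records)` turns this list of dicts into a dataframe with `id`, `head` and `text` columns, which can then be filtered or counted for simple analysis.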

Open the notebook in the cloud

You can open this notebook in a remote, executable environment with Binder or Google Colab.

Set up a Python environment

Clone this GitHub repository

git clone

Configure the environment with all dependencies

  • Create a new environment called tutorial-geoparsing-py39
conda create -n tutorial-geoparsing-py39 python=3.9
  • Activate the environment
conda activate tutorial-geoparsing-py39
  • Install the fiona package with conda (this avoids an issue when installing it with pip)
conda install fiona==1.8.21
  • Install dependencies with pip
pip install -r requirements.txt
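Once the steps above have run, a quick check from inside the activated environment can confirm that the interpreter version and key packages are in place. The snippet below is a minimal sketch: the `check_environment` helper is hypothetical, and the module names in `ASSUMED_MODULES` are guesses based on the libraries mentioned in this README, not a copy of requirements.txt.

```python
import importlib.util
import sys

# Assumption: import names of libraries mentioned in this README.
# The authoritative dependency list is requirements.txt.
ASSUMED_MODULES = ["fiona", "spacy", "stanza", "perdido"]


def check_environment(modules, expected_python=(3, 9)):
    """Report whether the Python version matches and each module is importable."""
    report = {"python_ok": sys.version_info[:2] == expected_python}
    for name in modules:
        # find_spec returns None when a top-level module cannot be found.
        report[name] = importlib.util.find_spec(name) is not None
    return report


for key, ok in check_environment(ASSUMED_MODULES).items():
    print(f"{key}: {'ok' if ok else 'missing'}")
```

Any entry reported as missing suggests the corresponding install step did not complete in the active environment.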

Launch the Jupyter server

jupyter notebook


Data courtesy of the ARTFL Encyclopédie Project, University of Chicago.

The authors are grateful to the ASLAN project (ANR-10-LABX-0081) of the Université de Lyon, for its financial support within the French program "Investments for the Future" operated by the National Research Agency (ANR).

