Geo-semantic labelling of Open Data


  • $ git clone
  • $ cd geolabelling
  • (optionally) set up a virtual environment
  • $ virtualenv --system-site-packages geolabelling_env
  • $ . geolabelling_env/bin/activate
  • Install requirements
  • $ python install

GeoNames RDF dump

Download the GeoNames RDF dump and extract it to the "local" folder.

MongoDB setup

The GeoNames entities, labels, and additional/external links are all stored in a MongoDB instance. All commands accept --host and --port parameters for connecting to that instance.
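
As a rough illustration of how a command-line entry point could accept these two parameters, here is a minimal sketch using argparse. The function names `build_parser` and `mongo_uri` are hypothetical and not taken from the repository; only the --host and --port options and the example values come from the text above.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the --host/--port options (assumption)."""
    parser = argparse.ArgumentParser(description="geolabelling command sketch")
    parser.add_argument("--host", default="localhost", help="MongoDB host")
    parser.add_argument("--port", type=int, default=27017, help="MongoDB port")
    return parser


def mongo_uri(host, port):
    """Build a standard MongoDB connection URI from host and port."""
    return "mongodb://{}:{}/".format(host, port)


# Parse the same values used in the commands of this README.
args = build_parser().parse_args(["--host", "localhost", "--port", "27017"])
uri = mongo_uri(args.host, args.port)
print(uri)
```

A URI built this way could then be handed to a MongoDB client library of your choice.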

GeoNames data

  1. Store all GeoNames entities and their parent relations in collection db.geonames:
$ python --host localhost --port 27017 geonames
  2. Build an index for all labels (and alternative labels) in collection db.keywords:
$ python --host localhost --port 27017 keywords
  3. Store country information in collection db.countries:
$ python --host localhost --port 27017 countries
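
To make the idea behind the first two steps concrete, the following in-memory sketch shows how a label index and parent relations could be used together. The record layout (`_id`, `name`, `alt_names`, `parent`), the sample identifiers, and both helper functions are assumptions for illustration only; the actual MongoDB schema of the repository is not documented here.

```python
from collections import defaultdict

# Hypothetical sample records; field names and ids are illustrative only.
entities = [
    {"_id": "g1", "name": "Austria", "alt_names": ["Österreich"], "parent": None},
    {"_id": "g2", "name": "Vienna", "alt_names": ["Wien"], "parent": "g1"},
]


def build_keyword_index(records):
    """Map each lower-cased label or alternative label to the matching ids."""
    index = defaultdict(set)
    for rec in records:
        for label in [rec["name"]] + rec["alt_names"]:
            index[label.lower()].add(rec["_id"])
    return index


def ancestors(entity_id, by_id):
    """Follow parent relations upwards and return the chain of ancestor ids."""
    chain = []
    parent = by_id[entity_id]["parent"]
    while parent is not None:
        chain.append(parent)
        parent = by_id[parent]["parent"]
    return chain


index = build_keyword_index(entities)
by_id = {rec["_id"]: rec for rec in entities}
print(index["wien"], ancestors("g2", by_id))
```

Looking up a label first and then walking the parent chain is one way a matched place name can be put into its geographic context.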


  • Get postal codes from Wikidata and store them in collection db.postalcodes:
$ python --host localhost --port 27017 wikidata-postalcodes
  • Get NUTS codes from Wikidata and store them in collection db.nuts:
$ python --host localhost --port 27017 wikidata-nuts
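
The query the repository sends to Wikidata is not shown in this README. As a hedged sketch, a request for postal codes could look like the following, assuming the public Wikidata SPARQL endpoint and the Wikidata properties P1566 (GeoNames ID) and P281 (postal code). The helper names are hypothetical, and the code only builds the request URL without sending it.

```python
from urllib.parse import urlencode

# Public Wikidata SPARQL endpoint (assumed to be the data source).
WIKIDATA_SPARQL = "https://query.wikidata.org/sparql"


def postal_code_query(limit=10):
    """Select items that have both a GeoNames ID (P1566) and a postal code (P281)."""
    return (
        "SELECT ?item ?geonames ?postal WHERE { "
        "?item wdt:P1566 ?geonames ; wdt:P281 ?postal . "
        "} LIMIT %d" % limit
    )


def request_url(query):
    """Encode the query as a GET request that asks for JSON results."""
    return WIKIDATA_SPARQL + "?" + urlencode({"query": query, "format": "json"})


url = request_url(postal_code_query(5))
print(url)
```

An analogous query for NUTS codes would swap P281 for P605 (NUTS code).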

OSM data

Add admin levels to hierarchy

Adds the admin levels 2, 4 and 6 (according to the OSM hierarchy) to the DB. Uses the GeoNames API.

$ python divisions --country

Adds admin level 8 (districts, city divisions) to the DB.

$ python city-divisions --country

Get polygons for admin levels from OSM API

$ python openstreetmap/ osm-polygons --level 8 --country

Export the polygons to a local directory, e.g., "poly-exports/slovakia/8":

$ python openstreetmap/ poly-export --level 8 --country --directory poly-exports
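
The example path above suggests a layout of export directory, then country, then admin level. A small sketch of that convention, assuming the country name is lower-cased; the helper `export_dir` is hypothetical and not part of the repository:

```python
import os


def export_dir(base, country, level):
    """Compose <base>/<country>/<level>, following the example path above."""
    return os.path.join(base, country.lower(), str(level))


path = export_dir("poly-exports", "Slovakia", 8)
print(path)
```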

Extract OSM data

First, download the OSM data for the countries of interest and place it in the "poly-exports/osm-export/" subdirectory. The script uses the polygons (e.g. poly-exports/slovakia/8) to extract streets and places for the subregions from the download, using Osmosis. It takes two arguments: the path to the repository and the admin level for which to extract the data. The path to Osmosis must also be adjusted inside the script.

$ sh openstreetmap/ /path/to/repo 8
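
The script itself is not reproduced here. As an illustration of the kind of Osmosis call such an extraction typically involves, the sketch below assembles a command from Osmosis' `--read-pbf`, `--bounding-polygon` and `--write-xml` tasks. All file names are hypothetical, the input is assumed to be a PBF download, and the command is only built, not executed.

```python
def osmosis_command(osmosis_bin, pbf_file, poly_file, out_file):
    """Build an Osmosis call that clips a country extract to one polygon."""
    return [
        osmosis_bin,
        "--read-pbf", "file=" + pbf_file,            # country download (assumed PBF)
        "--bounding-polygon", "file=" + poly_file,   # polygon of one subregion
        "--write-xml", "file=" + out_file,           # clipped result as OSM XML
    ]


# Hypothetical paths, loosely following the directory names used in this README.
cmd = osmosis_command(
    "/path/to/osmosis",
    "poly-exports/osm-export/slovakia.osm.pbf",
    "poly-exports/slovakia/8/region.poly",
    "poly-exports/slovakia/8/region.osm",
)
print(" ".join(cmd))
```

A list in this form can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.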

Insert OSM data in DB

Insert the data from the OSM extracts into the DB:

$ python openstreetmap/ insert-osm --country --level 8