
# pelias-data-container


Geocoding data build tools


Builds two Docker containers and pushes them to the `hsldevcom` organization on Docker Hub:

* pelias-data-container-base
* pelias-data-container-builder

`pelias-data-container-base` is the base image for the running geocoding data service. It is based on Elasticsearch and contains all the tools for loading address data into the ES index.

`pelias-data-container-builder` is the data builder application, which builds the final geocoding data container on top of the base image. It tests the built containers thoroughly using the hsldevcom/pelias-fuzzy-tests project and a defined regression threshold (currently 2%). If the tests pass, the new container is deployed to Docker Hub.

## Data builder application

The data builder obeys the following environment variables, which can be passed to the container using the `docker run -e` option:

* DOCKER_USER - Docker Hub user name, mandatory for image deployment
* DOCKER_AUTH - Docker Hub credential secret, used together with DOCKER_USER for deployment
* MMLAPIKEY - needed for loading nlsfi data
* ORG - optional organization name, defaults to 'hsldevcom'
* BUILD_INTERVAL - optional build interval in days, defaults to 7
* THRESHOLD - optional regression limit in %, defaults to 2
* PROD_DEPLOY - optional switch; set to 0 to prevent production deployment, defaults to 1 (deploys to prod)

The data builder needs access to the host environment's Docker service. The following example shows how to launch the builder container with the Docker socket mounted:

```shell
docker run -v /var/run/docker.sock:/var/run/docker.sock \
  -e DOCKER_USER=hsldevcom -e DOCKER_AUTH=<secret> \
  -e MMLAPIKEY=<secret> \
  hsldevcom/pelias-data-container-builder
```

Note: the builder image does not include a tool or script for relaunching the data build immediately from within the container. If an immediate build is needed, rerun the Docker image with the environment variable BUILD_INTERVAL=0. The image then executes the build immediately and exits, after which it should be relaunched with the normal (run forever) settings.

## Usage in a local system

The builder app can be run locally to produce the data-container image:

```shell
# leave Docker Hub credentials unset to skip deployment
# runs immediately and only once when BUILD_INTERVAL=0
docker run -v /var/run/docker.sock:/var/run/docker.sock \
  -e BUILD_INTERVAL=0 -e MMLAPIKEY=<secret> \
  hsldevcom/pelias-data-container-builder
```

Alternatively, install the required components locally:

* Git projects for Pelias data loading (NLSFI, OpenAddresses, OSM, GTFS, etc.)
* The pelias/schema git project
* WOF admin data and street polylines, both available as part of this git project
* A properly configured pelias.json config file
* An installed and running Elasticsearch
* Four exported environment variables: DATA for the data folder path, SCRIPTS for this project's data container scripts, TOOLS for the parent directory of the data loading and schema tools, and MMLAPIKEY for accessing nlsfi data
* Run the script scripts/
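
The four environment variables can be exported as in the sketch below; the paths shown are placeholders for illustration and must be adjusted to your own checkout locations:

```shell
#!/bin/sh
# Placeholder paths -- adjust to your local checkouts.
export DATA=/srv/pelias/data                       # data folder
export SCRIPTS=/srv/pelias-data-container/scripts  # this project's data container scripts
export TOOLS=/srv/pelias                           # parent dir of dataloading and schema tools
export MMLAPIKEY=changeme                          # nlsfi API key (placeholder value)
```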

## Data deployments

Pelias API updates are sometimes backward incompatible with old data containers. The builder application handles breaking changes by testing the data with the development API version, and by running a single compatibility-ensuring test with the production API version. Based on these tests, the builder deploys selectively to dev and prod.

Furthermore, setting the environment variable PROD_DEPLOY=0 prevents production deployments regardless of the test results. This can be useful when the current API and the new container are compatible but perform poorly together. With this switch, data can be left to build automatically into dev, and production can later be updated manually by tagging a properly compatible API and the new data container into production simultaneously.
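A hedged sketch of the resulting decision logic (assumed from the description above, not the builder's actual code; the two test-outcome flags stand in for the real fuzzy-test and compatibility-test results):

```shell
#!/bin/sh
# Sketch: how the test results and PROD_DEPLOY could gate the deployments.
PROD_DEPLOY="${PROD_DEPLOY:-1}"
dev_tests_pass=1    # placeholder for the dev API test outcome
prod_compat_pass=1  # placeholder for the prod API compatibility test outcome

if [ "$dev_tests_pass" = 1 ]; then
  DEPLOY_DEV=yes
else
  DEPLOY_DEV=no
fi

# Prod deployment requires both a passing compatibility test and PROD_DEPLOY=1.
if [ "$prod_compat_pass" = 1 ] && [ "$PROD_DEPLOY" = 1 ]; then
  DEPLOY_PROD=yes
else
  DEPLOY_PROD=no
fi
echo "dev=$DEPLOY_DEV prod=$DEPLOY_PROD"
```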
