Skip to content
OpenTripPlanner data container
JavaScript Shell Dockerfile
Branch: master
Clone or download
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
otp-data-container Revert "Only include tif files once in data container" Sep 21, 2018
otp-data-tools Try to fix OTPQA May 30, 2018
router-finland
router-hsl Restore HSL trip updater, leave rail updater removed Aug 21, 2019
router-waltti Use xml instead of json for oulu citybikes Jul 30, 2019
task
.dockerignore
.eslintrc.json
.gitignore update Jun 28, 2017
.travis.yml Make travis regex matching ready for next decade May 14, 2019
Dockerfile Update to node 10 Feb 11, 2019
README.md
config.js Change OSM download URLs Aug 2, 2019
deploy.sh Add TOOLS_TAG env to control used otp-data-tools tag Jul 11, 2018
gulpfile.js
index.js Replace deprecated gulp start command Feb 11, 2019
package.json Fix merge Feb 12, 2019
run-builder.sh Resolve names from marathon dns Jul 17, 2019
test.sh Further increase java memory allocation Aug 1, 2019
travis-build.sh Remove env from travis-build Oct 31, 2018
util.js Merge branch 'master' into add-linter Oct 11, 2018

README.md

Build process for OpenTripPlanner-data-container

Greenkeeper badge Build

This project:

Contains tools for fetching, building and deploying fresh otp data-containers for consumption by hsl, waltti and finland otp instances.

Main components

otp-data-builder

The actual data builder application. This is a node.js application that fetches and processes new gtfs/osm data. It's build around gulp and all separate steps of databuilding process can also be called directly from the source tree. The only required external dependency is docker. Docker is used for launching external commands that do for example data manipulation.

install gulp cli: npm install -g gulp-cli

install app deps: npm install

update osm data: gulp osm:update

download new gtfs data for waltti: ROUTERS=waltti gulp gtfs:dl

Configuration

It is possible to change the behaviour of the data builder by defining environment variables.

  • "DOCKER_USER" defines username for authenticating to docker hub.
  • "DOCKER_AUTH" defines password for authenticating to docker hub.
  • (Optional, default latest and tag based on date) "DOCKER_TAG" defines what will be the updated docker tag of the data container images in the remote container registry.
  • (Optional, default latest) "SEED_TAG" defines what version of data container should be used for seeding.
  • (Optional, default latest) "OTP_TAG" defines what version of OTP is used for testing and building graphs.
  • (Optional, default latest) "TOOLS_TAG" defines what version of otp-data-tools image is used for testing.
  • (Optional, default ${process.cwd()}/data) "HOST_DATA" defines base path for volume directories.
  • (Optional, default 'finland, waltti, hsl') "ROUTERS" defines which data containers are being built and deployed.
  • (Optional, default ${process.cwd()}/data) "DATA" defines base path for data directories in container's file system.
  • (Optional, default '0 0 3 * * *') "CRON" defines when data build is being run.
  • (Optional, default {}) "EXTRA_SRC" defines gtfs src values that should be overridden or completely new src that should be added with unique id. "routers" is always a mandatory field. Example format:
    • {"FOLI": {"url": "http://data.foli.fi/gtfs/gtfs.zip", "fit": false, "rules": ["router-waltti/gtfs-rules/waltti.rule"], "routers": ["hsl", "finland"]}}
    • You can remove a src by including "remove": true, {"FOLI": {"remove": true, "routers": ["hsl"]}
  • (Optional, default {}) "EXTRA_UPDATERS" defines router-config.json updater values that should be overridden or completely new updater that should be added with unique id. "routers" is always a mandatory field. Example format:
    • {"turku-alerts": {"type": "real-time-alerts", "frequencySec": 30, "url": "https://foli-beta.nanona.fi/gtfs-rt/reittiopas", "feedId": "FOLI", "fuzzyTripMatching": true, "routers": ["waltti"]}}
    • You can remove a src by including "remove": true, {"turku-alerts": {"remove": true, "routers": ["waltti"]}

Data processing steps

Seed data can be retrieved with a single gulp command:

  1. seed

Downloads previous data containers (env variable SEED_TAG can be used to customize which tag is pulled) and extracts osm and gtfs data from there and places it in 'data/ready' directory. Old data acts as backup in case fetching/validating new data fails.

Currently there is single processing step for OSM data. Because gtfs processing steps require osm data, the osm data must be available before running the gtfs:fit stage later below.

  1. osm:update

This command downloads required OSM packages from configured location, tests the file(s) with otp, and if tests pass data is copied to 'data/downloads/osm' directory.

The data is then processed with the following steps:

  1. gtfs:dl Downloads a GTFS package from configured location, tests the file with otp, if test passes data is copied to directory 'data/fit/gtfs'. The resulting zip is named .zip.

  2. gtfs:fit Runs configured map fits. Copies data to directory 'data/filter/gtfs'.

  3. gtfs:filter Runs configured filterings. Copies data to directory 'data/id/gtfs'.

  4. gtfs:id Sets the gtfs feed id to and copies data to directory 'data/ready/gtfs'.

Building the router from available (seeded or downloaded and processed) data:

  1. router:buildGraph

Prebuilds graph with either current latest version or user defined version (with env variable OTP_TAG) of OTP and creates zip files ready for building the otp-data container.

The final step is router deployment:

  1. deploy.sh

Builds a data container, starts it, starts either latest or user defined version (with env variable OTP_TAG) of otp and runs routing tests (otp-data-tools latest image is used for it by default, TOOLS_TAG env variable can be used to change that) to verify that the data container looks ok. If tests pass the fresh data container is pushed to Dockerhub.

Normally when the application is running (as a container) the index.js is used. It runs the data updating process on a schedule specified as a cron pattern. Data build can be executed immediately by attaching to the builder container with bash 'docker exec -i -t /bin/bash' and then exexuting the command 'node index.js once'. The end result of the build is 2 docker containers uploaded into dockerhub. Digitransit-deployer detects the changes and restarts OTP instances, so that new data becomes in use.

Each data container image runs a http server listening to port 8080, serving both a gtfs data bundle and a pre-built graph:

otp-data-tools

Contains tools for gtfs manipulation, such as One Bus Away gtfs filter. These tools are packaged inside docker container and are used dunring the data build process.

You can’t perform that action at this time.