Mapzen Vector Tile Service
Tilezen Vector Tiles
# install misc tools
sudo apt-get install git unzip python-yaml
# install postgres / postgis
sudo apt-get install postgresql postgresql-contrib postgis postgresql-9.5-postgis-2.2
# install jinja2
sudo apt-get install python-jinja2
# install tilezen fork of osm2pgsql
sudo apt-add-repository ppa:tilezen
sudo apt-get update
sudo apt-get install osm2pgsql
NOTE: PostgreSQL 9.5+ is required for some
2. Install vector-datasource
# dev packages for building
sudo apt-get install build-essential autoconf libtool pkg-config
# dev packages for python and dependencies
sudo apt-get install python-dev python-virtualenv libgeos-dev libpq-dev python-pip python-pil libxml2-dev libxslt-dev
This repo contains the supplementary data to load and the queries that are issued to the database for each layer.
git clone https://github.com/mapzen/vector-datasource.git
cd vector-datasource
# now checkout the latest tagged release (see warning below), for example:
# git checkout v1.4.0
WARNING: If you are standing up your own instance of the Tilezen stack (rather than doing development), it's best practice to check out the latest tagged release rather than running off master. At the time of this writing that is v1.4.0, so you'd run git checkout v1.4.0 to be on the same code base as the production Mapzen Vector Tile service. Similarly, you'd need to pin yourself against the related projects' versions, e.g.:
Requires: tileserver v2.1.0 and tilequeue v1.8.0 and mapbox-vector-tile v1.2.0 mentioned in the release notes in the sections below.
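One way to find the latest tagged release from inside the clone is to sort the tag names by version. The sketch below assumes release tags follow the vMAJOR.MINOR.PATCH pattern seen above and that GNU sort (with its -V option) is available. latest_tag is a hypothetical helper name, not part of vector-datasource.

```shell
# latest_tag: read tag names on stdin, print the highest one by version order.
# Hypothetical helper; relies on GNU sort's -V (version sort) option.
latest_tag() {
  sort -V | tail -n 1
}

# Usage inside the vector-datasource clone:
#   git checkout "$(git tag -l 'v*' | latest_tag)"
```

Pre-release or otherwise irregular tag names may sort unexpectedly, so it is worth checking the printed tag against the release notes before relying on it.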
Setup a virtualenv
There are numerous ways to deploy Python packages. virtualenv is used here, but other methods should work.
At the moment, only Python 2.7.x is supported, so make sure you have a Python 2.7 version installed.
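A quick way to confirm the interpreter version before creating the virtualenv is sketched below. is_python27 is a hypothetical helper name, and the check assumes the usual "Python X.Y.Z" format printed by python --version.

```shell
# is_python27 VERSION_STRING: succeed only if the string names a 2.7.x interpreter.
# Hypothetical helper; expects the "Python X.Y.Z" text printed by `python --version`.
is_python27() {
  case "$1" in
    "Python 2.7."*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (Python 2.7 prints its version on stderr, hence 2>&1):
#   is_python27 "$(python2.7 --version 2>&1)" || echo "Python 2.7.x not found" >&2
```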
# Create a virtualenv called 'env'. This can be named anything, and can be in the tileserver directory or anywhere on your system.
virtualenv env --python python2.7
source env/bin/activate
Install tileserver and tilequeue
pip install -U -r requirements.txt
python setup.py develop
3. Load data
Set up database
If you are setting up PostgreSQL for a single-user install, you may want to create a new database user (e.g. one matching the output of whoami). You can skip this next step if you already have your database roles established.
sudo -u postgres psql
CREATE USER [your username] SUPERUSER PASSWORD 'your password here';
First, create the database. We use the database name 'osm' here, but you can use any, e.g. 'gis'.
createdb -E UTF-8 -T template0 osm
psql -d osm -c 'CREATE EXTENSION postgis; CREATE EXTENSION hstore;'
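To confirm that both extensions were created, the installed extension names can be compared against the two required here. This is a minimal sketch: missing_extensions is a hypothetical helper name, and the usage line assumes the database is called osm as above.

```shell
# missing_extensions: read installed extension names on stdin (one per line)
# and print each required extension that is absent. Hypothetical helper.
missing_extensions() {
  installed=$(cat)
  for ext in postgis hstore; do
    printf '%s\n' "$installed" | grep -qx "$ext" || echo "$ext"
  done
}

# Usage (-A: unaligned output, -t: rows only, -c: run one command):
#   psql -d osm -Atc 'SELECT extname FROM pg_extension' | missing_extensions
```

No output means both postgis and hstore are present.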
Next, download the OpenStreetMap source data. You can use any PBF, but we use a Mapzen metro extract here to get started.
Load PBF data
osm2pgsql --slim --hstore-all -C 1024 -S osm2pgsql.style -d osm path/to/osm.pbf
Vector-datasource uses the slim tables, so --slim is required and the --drop option cannot be used.
You may also need to pass in other options, like -U or -W, to ensure that you connect to the database with a user that has the appropriate permissions. For more details, visit the osm2pgsql wiki page and the postgresql docs for creating a user. You may need to check your connection permissions too, which can be found in the pg_hba.conf file.
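For instance, the import command with an explicit database user and a password prompt might look like the sketch below. The user name osmuser is a placeholder and osm2pgsql_cmd is a hypothetical helper. It only prints the command so that it can be reviewed before running.

```shell
# osm2pgsql_cmd USER PBF: print the import command with -U USER and -W
# (prompt for a password) added. Hypothetical helper; it does not run the import.
osm2pgsql_cmd() {
  printf 'osm2pgsql --slim --hstore-all -C 1024 -S osm2pgsql.style -d osm -U %s -W %s\n' "$1" "$2"
}

# Placeholder user name and PBF path:
osm2pgsql_cmd osmuser path/to/osm.pbf
```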
Note that if you import the planet, the process can take several days and, at the time of writing, can consume over 1TB of disk space (2TB is preferred, because extra space is needed while preparing the database). The OSM planet gets bigger every week, so it might be necessary to do a few trial runs to find out what it takes today. Have a look at some of our own performance tuning docs or those from Switch2OSM.org for recommendations.
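Before starting a long import, it can help to check the free space on the volume that holds the database. The sketch below is one way to do that with df. has_free_space is a hypothetical helper name, and the path in the usage line is an assumption based on the default data directory on Debian/Ubuntu, so adjust it to your setup.

```shell
# has_free_space DIR REQUIRED_GB: succeed if the filesystem holding DIR has at
# least REQUIRED_GB gigabytes available. Hypothetical helper.
has_free_space() {
  # df -Pk prints POSIX-format output in 1024-byte blocks; column 4 is "Available".
  avail_kb=$(df -Pk "$1" | awk 'NR==2 {print $4}')
  [ "$avail_kb" -ge $(( $2 * 1024 * 1024 )) ]
}

# Usage (path is an assumption; 1024 GB matches the 1TB figure above):
#   has_free_space /var/lib/postgresql 1024 || echo "less than 1TB free" >&2
```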
Load additional data and update database
The vector-datasource/data directory contains scripts to load additional data and update the database to match our expected schema.
The additional data included with shapefiles.tar.gz is from a combination of sources; the full list is in data/assets.yaml, including a pointer to the latest cached datestamp. Everything bundled is open data, although some of it is manually generated or curated. The data comes from:
- openstreetmapdata is used for static land/water polygons, and is under the same ODbL as the primary OpenStreetMap data it derives from.
- naturalearthdata.com is sourced for themes and layers used at low zooms, and is in the public domain.
- admin_areas is based on OSM data and available under the ODbL. It gets generated by manually running Valhalla's
- buffered_land is based on Natural Earth data and is in the public domain. It is manually curated by Tilezen and is a slightly buffered land polygon to clip admin boundaries against so that we don't get admin boundaries going off into the sea.
To import the data:
# Go to data directory, assumes you already changed directories into vector-datasource (above)
cd data
# Build the Makefiles that we'll use in the next steps
python bootstrap.py
# Download external data
make -f Makefile-import-data
# Import shapefiles into postgis
./import-shapefiles.sh | psql -Xq -d osm
# Add indexes and any required database updates
./perform-sql-updates.sh -d osm
# Clean up local shape files
make -f Makefile-import-data clean
NOTE that you may have to pass in a username/password to these scripts for them to connect to the database. Anywhere -d osm is specified, you may need to also pass in -U <username> and perhaps set a password too. For example, if my username is "foo" and my password is "bar", here's what I would do:
export PGPASSWORD=bar
./import-shapefiles.sh | psql -d osm -U foo
./perform-sql-updates.sh -d osm -U foo
To prepare the data:
shapefiles.tar.gz is generated by running:
cd data
python bootstrap.py
make -f Makefile-prepare-data
This can take a very long time to download all the individual pieces! To speed up basic database setup, we cache the results on S3 and indicate the latest cached datestamp in assets.yaml.
4. Serve vector tiles
- Use tileserver for serving single tiles with Postgres.
- Use tilequeue for caching a local region with Postgres... and with RAWR tiles to cache the whole world.
cd ../tileserver
cp config.yaml.sample config.yaml
# update configuration as necessary
edit config.yaml
Load Who's on First neighbourhood data
Finally, neighbourhood data is required to be loaded from Who's on First.
wget https://s3.amazonaws.com/nextzen-tile-assets/wof/wof-neighbourhoods.pgdump
pg_restore --clean -d osm -O wof-neighbourhoods.pgdump
This will load a snapshot of the neighbourhoods data.
You should periodically update the Who's On First neighbourhoods data by running the following:
wget https://raw.githubusercontent.com/mapzen/tilequeue/master/config.yaml.sample -O tilequeue-config.yaml
wget https://raw.githubusercontent.com/mapzen/tilequeue/master/logging.conf.sample
tilequeue wof-process-neighbourhoods --config tilequeue-config.yaml
The tile server can be run in one of two ways:
- Directly, as a single-threaded Python process. This is better if you want to debug or step through code, but will not be able to use all the cores of your computer.
- As a WSGI application through a multi-threaded (or multi-process) WSGI server such as gunicorn. This is better if you want to make best use of your computer by handling requests concurrently. However, it can complicate debugging or stepping through code.
To run tileserver using gunicorn, we recommend using the same number of workers as CPU cores (the -w argument, here for example 4):
gunicorn -w 4 "tileserver:wsgi_server('config.yaml')"
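Rather than hard-coding the worker count, it can be derived from the machine. The sketch below assumes nproc from GNU coreutils is available (as it is on Ubuntu) and only prints the resulting command for review.

```shell
# Count the CPU cores available to this process (nproc is part of GNU coreutils).
workers=$(nproc)

# Print the gunicorn command so it can be reviewed before running it.
echo "gunicorn -w $workers \"tileserver:wsgi_server('config.yaml')\""
```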
To run tileserver stand-alone for debugging:
5. Global build
Need to build tiles for the whole world? There's a new way to do that using tilequeue and RAWR tiles instead of Postgres:
- https://github.com/tilezen/vector-datasource/issues/1537 for tips
- https://github.com/tilezen/raw_tiles/issues/25 for more tips
You're ready to help us improve the Tilezen project! Please read our CONTRIBUTING.md document to understand how to contribute code.
Need to confirm your configuration? A test suite is included which can be run against a tile server.
Sample test URLs
Keeping up to date with OSM data
Generally speaking, tile service providers make the trade-off to prefer generating stale tiles over serving the request on demand more slowly. Mapzen also makes this trade-off.
A lot of factors go into choosing how to support a system that remains up to date. For example, existing infrastructure, tolerance for request latency and stale tiles, expected number of users, and cost can all play roles in coming up with a strategy for remaining current with OpenStreetMap changes.
If you are on a particular release and would like to migrate your database to a newer one, you'll want to run the appropriate migrations. Database migrations are required when the database queries & functions that select what map content should be included in tiles change.
Note that the migration for each release in between will need to be run individually. For example, if you are on v0.5.0 and would like to upgrade to v0.7.0, you'll want to run the v0.6.0 and v0.7.0 migrations (we don't provide "combo" migrations).
# in this example, we're on v0.5.0 - checkout the migration to v0.6.0
git checkout v0.6.0
bash data/migrations/run_migrations.sh -d osm
# now our database reflects v0.6.0 - checkout the migration to v0.7.0
git checkout v0.7.0
bash data/migrations/run_migrations.sh -d osm
# now our database reflects v0.7.0
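When several releases lie between the current and the target version, the same two commands can be repeated in a loop. The sketch below is one way to do that. migrate_through and the DRY_RUN variable are hypothetical names, and the tags must be listed by hand in release order.

```shell
# migrate_through DB TAG...: for each tag, in the order given, check it out and
# run that release's migrations against DB. Stops at the first failure.
# With DRY_RUN=1 the commands are printed instead of executed. Hypothetical helper.
migrate_through() {
  db="$1"
  shift
  for tag in "$@"; do
    if [ "${DRY_RUN:-0}" = 1 ]; then
      echo "git checkout $tag && bash data/migrations/run_migrations.sh -d $db"
    else
      git checkout "$tag" && bash data/migrations/run_migrations.sh -d "$db" || return 1
    fi
  done
}

# Usage from the vector-datasource directory, upgrading v0.5.0 to v0.7.0:
#   migrate_through osm v0.6.0 v0.7.0
```

Running it once with DRY_RUN=1 first shows the exact sequence that would be applied.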