An osm2pgsql style to simplify OSM data use
Lua Python Makefile
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
scripts Add a scripts directory for development scripts Aug 3, 2016
sql Add missing pier SQL Jan 25, 2017
.gitignore Add a script for generating sql COMMENTs Nov 1, 2015
.travis.yml Exclude test files from coverage reports Aug 5, 2016
LICENSE Update year in license Mar 13, 2017
Makefile Clean up cleartables.json if yaml2json fails Jun 28, 2017
README Initial comment of basic docs and empty files Oct 9, 2015
README.md Suggest a database name of ct Jun 24, 2017
address.lua Directly use a generic function for MP handling Jan 10, 2017
admin.lua Remove stray debug statements Jul 31, 2016
aero.lua Don't try to truncate iata/iaco strings Nov 7, 2017
barrier.lua Add retaining walls Aug 3, 2016
building.lua Directly use a generic function for MP handling Jan 10, 2017
cleartables.yaml Drop place polygon table Mar 5, 2018
common.lua Filter out non-languages from names col Apr 24, 2018
createcomments.py Add table comment support Nov 12, 2015
education.lua Directly use a generic function for MP handling Jan 10, 2017
generic.lua Handle national park boundary relations May 4, 2017
healthcare.lua Directly use a generic function for MP handling Jan 10, 2017
landform.lua Clean up logic expressions Aug 2, 2016
landuse.lua Add a landuse table Jun 24, 2017
pier.lua Add a layer for piers Jan 25, 2017
place.lua Drop place polygon table Mar 5, 2018
protected.lua Handle national park boundary relations May 4, 2017
recreation.lua Add garden support Feb 4, 2017
test-address.lua Update copyright years Jul 13, 2016
test-admin.lua Update copyright years Jul 13, 2016
test-aero.lua Don't try to truncate iata/iaco strings Nov 7, 2017
test-barrier.lua Add retaining walls Aug 3, 2016
test-building.lua Use a generic function for height Jul 31, 2016
test-common.lua Filter out non-languages from names col Apr 24, 2018
test-education.lua Update copyright years Jul 13, 2016
test-generic.lua Handle national park boundary relations May 4, 2017
test-healthcare.lua Fix modes on healthcare Jan 9, 2017
test-landform.lua Handle rock spires, tagged as natural=cliff Aug 2, 2016
test-landuse.lua Add a landuse table Jun 24, 2017
test-pier.lua Add a layer for piers Jan 25, 2017
test-place.lua Set rank for subregions Jan 7, 2017
test-protected.lua Add table for national parks and protected areas Nov 21, 2016
test-recreation.lua Fix test names for recreation Feb 4, 2017
test-transit.lua Add tram stops to transit Feb 8, 2017
test-transportation.lua Only accept the same road areas as osm-carto Jun 28, 2017
test-util.lua Fix backslash handling + whitespace errors Nov 21, 2016
test-water.lua Update copyright years Jul 13, 2016
test-wetland.lua Add a table for wetlands Nov 19, 2016
test-wood.lua Add a table for wooded areas Nov 19, 2016
transit.lua Add tram stops to transit Feb 8, 2017
transportation.lua Clean up transportation whitespace Jun 28, 2017
util.lua Fix backslash handling + whitespace errors Nov 21, 2016
water.lua Directly use a generic function for MP handling Jan 10, 2017
wetland.lua Directly use a generic function for MP handling Jan 10, 2017
wood.lua Directly use a generic function for MP handling Jan 10, 2017
yaml2json.py Catch more types of YAML errors for nice printing May 5, 2017

README.md

ClearTables

An osm2pgsql multi-backend style designed to simplify consumption of OSM data for rendering, export, or analysis.

ClearTables is currently under rapid development, and schema changes will frequently require database reloads.

Requirements

  • osm2pgsql 0.90.1 or later. Early versions after 0.86.0 may still work with bugs.
  • Lua, required for both osm2pgsql and testing the transforms
  • PostgreSQL 9.1 or later
  • PostGIS 2.0 or later
  • Python with PyYAML
  • Make. Any version of Make should work, or the commands are simple enough to run by hand.

Usage

make
createdb ct
psql -d ct -c 'CREATE EXTENSION postgis; CREATE EXTENSION hstore;'
cat sql/types/*.sql | psql -1Xq -d ct
# Add other osm2pgsql flags for large imports, updates, etc
osm2pgsql -d ct --number-processes 2 --output multi --style cleartables.json extract.osm.pbf
cat sql/post/*.sql | psql -1Xq -d ct

Replace ct with the name of your database if naming it differently.

osm2pgsql will connect to PostgreSQL once per process for each table, for a total of processes * tables connections. If PostgreSQL max_connections is increased from the default, --number-processes can be increased. If --number-processes is omitted, osm2pgsql will attempt to use as many processes as hardware threads.

Principles

These are still a bit vague, and might be split into principles and practices

  • Simplify data for the consumer

  • Use PostgreSQL types other than text if appropriate

  • Use boolean for yes/no values

  • Use enum types where there's a limited list of possibilities independent of data to be included, or a well defined ordering

FAQ

Why no addresses in the building table?

Addresses and buildings have a many-to-many relationship. Multiple addresses inside one building are very common, and multiple buildings in one address can be found. If rendering, a separate table is fine, and if doing an analysis these cases need to be considered which requires joins.

Why road refs as an array?

A road may have multiple refs, and it's wrong to ignore this. To pretend that there's only one ref, use SQL like array_to_string(refs, E'\n') or array_to_string(refs, ';'). The latter will reform the ref tag as it was in the original data.

Why no support for osm2pgsql --hstore?

ClearTables uses the hstore type but doesn't support --hstore.

  1. The goal of ClearTables is to abstract away OSM tagging. Copying all the tags to the output is contrary to this.

  2. Copying all tags is technically possible, but wouldn't be done with --hstore, instead it would be done similar to the names column. The --hstore option doesn't work well when using custom column names which may collide with OSM tags.

  3. With tables for different types of features fine-grained selection of appropriate columns is possible and hstore isn't necessary.

  4. Values within a hstore are untyped which is contrary to the principle of using appropriate types.

Contributing

Bug reports, suggestions and (especially!) pull requests are very welcome on the Github issue tracker. Please check the tracker to see if your issue is already known, and be nice. For questions, please use IRC (irc.oftc.net or http://irc.osm.org, channel #osm-dev) and http://help.osm.org.

Code style

  • 2sp for YAML, 4sp for Lua
  • tags are OSM tags, cols are database columns
  • Space after function name when defining a function, e.g. function f (args)
  • Tests for all Lua functions except ones which are only tail calls

Table names

  • Use _polygon and _point suffix when there will be two tables holding the same type of object represented differently (e.g. most POIs)
  • Use _area when there isn't a corresponding _point table for the same object, but there is another table for points or lines of a similar class but different objects (e.g. wood_areas for forests and wood_line for rows of trees)

Lua guidelines

  • Always set columns to strings, even if they're only true/false. It's unwise to count on anything else making it from Lua to C to C++ to PostgreSQL. This lets PostgreSQL do the only coversion.
  • Test particular columns of a transform function instead of the entire output table, e.g. assert(transform({foo="bar"}).baz == "qux") instead of assert(deepcompare(transform({foo="bar"}), {baz="qux"})).

Getting started

Issues tagged with new column are often good ones to get started with. Issues tagged experimental are focused on researching new best practices and state of the art.

Similar projects

Additional Reading