OpenStreetMap History Renderer & Tools
This repository contains an experimental, work-in-progress (as everything in the OSM universe, I think ;)) history renderer. It can import a history excerpt (or a full history dump) of OpenStreetMap data and create an image of a specified region for a specific point in time. You want to see Berlin as it was in 2008? No problem! The importer is written against Jochen Topf's great osmium framework, which provides history-capable readers for XML and PBF. If you want to try it, check out the Tutorial.
The importer can be compiled with g++ or clang++. Both compilers are mentioned in the Makefile, so just uncomment whichever suits your needs best. Build it using make and then run it as described below.
To run it, you'll need input data. I'd suggest starting with a small extract. There are some hosted extracts.
All extracts have been created using my OpenStreetMap History Splitter, so if you want your own area, download the latest Full-Experimental Dump and split it yourself using the same tool.
Next you're going to need a PostgreSQL database with btree_gist installed. Check out the Tutorial for details.
After you have your data in place, use the importer to import the data:
You can specify some options at the command line:
./osm-history-importer --nodestore sparse --debug --prefix "hist_" --dsn "host='172.16.0.73' dbname='histtest'" gau-odernheim.osh.pbf
See the libpq documentation for a detailed description of the DSN parameters. Beware: the importer does not honor relations right now, so there are no multipolygon areas or routes in the database.
After the import is completed, you can use render.py and render-animation.py in the "rendering" directory. They work on regular OSM styles, so you need to follow the usual preparations for those styles:
svn co http://svn.openstreetmap.org/applications/rendering/mapnik/ osm-mapnik-style
cd osm-mapnik-style/
./get-coastlines.sh
./generate_xml.py --accept-none --prefix 'hist_view'
If you have mapnik2, you'll need to migrate the osm.xml to mapnik2 syntax:
upgrade_map_xml.py osm.xml osm-mapnik2.xml
'hist_view' is a special value. During rendering, render.py will create views (hist_view_point, hist_view_line, ..) that represent the state of the database at a given point in time. Those views behave just like regular osm2pgsql tables and enable history rendering with nearly all existing osm2pgsql styles.
Now that you have prepared your rendering style, it's time to render your first image:
./render.py --style ~/osm-mapnik-style/osm-mapnik2.xml --bbox 8.177700,49.771700,8.205600,49.791600 --date 2009-01-01
Interesting how your town looked in 2009, hm? And it was all in the OSM database, all the time! Sleeping data... (getting melancholic)
So let's see how your town evolved over time. Let's make an animation:
./render-animation.py --style ~/osm-mapnik-style/osm-mapnik2.xml --bbox 8.177700,49.771700,8.205600,49.791600
This will leave you with a set of .png files, one for each month since the first node was placed in your area. If you want render-animation.py to assemble a real video for you, use --type mp4. This will create a lossless mp4 for you. Use render-animation.py -h to see the wide range of options the script offers.
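The "one frame per month" idea can be sketched as follows. This is only an illustration, not how render-animation.py is actually implemented: the function name monthly_frames, the example dates and the frame file names are all assumptions made up for this sketch.

```python
from datetime import date

def monthly_frames(first, last):
    """Yield the first day of every month from `first` to `last`, inclusive."""
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        yield date(year, month, 1)
        # Advance one calendar month, rolling over into the next year.
        month += 1
        if month > 12:
            year, month = year + 1, 1

# Hypothetical example: first node placed on 2008-11-17, animation ends 2009-02-03.
frames = list(monthly_frames(date(2008, 11, 17), date(2009, 2, 3)))
# Hypothetical file naming, one .png per frame date.
frame_names = ["frame-%s.png" % d.isoformat() for d in frames]
```

Each date produced this way could then be handed to a render call like the --date example above.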
The Importer comes with two nodestores: stl and sparse.
The Stl-Nodestore is the default one. It's built on top of the STL template std::map. Currently it seems to be faster than the sparse nodestore, but it's only capable of importing very small extracts, because it's not very memory efficient.
The Sparse-Nodestore is the newer one. It's built on top of the Google Sparsetable and a custom memory block management. It's much, much more space efficient, but it seems to take slightly more time on startup and it also contains more custom code, so there is more potential for bugs. Sooner or later sparse will become the default nodestore, as it's your only option to import larger extracts or even a whole planet.
Space & Time Requirements
I imported rheinland-pfalz.osh.pbf (308M) with the sparse nodestore. It took around 1.2 GB of RAM, of which apparently ~700M was taken by the nodestore and 400M by the pbf reader. Process runtime was around 30 minutes. The generated tables on disk took ~14 GB including indexes.
The MinorTimesCalculator calculates for which timestamps a minor way version is needed. This is the place that determines the granularity of your database on the time axis.
By default it stores data down to the second. When a node is moved twice within the same second (or two nodes of the same way are moved within the same second), only one minor version is generated. If the node timestamps differ, a minor version is generated for each timestamp. In the worst case you'll have a full way geometry for each and every second. In the future it will be possible to reduce the granularity to, say, a day, so at worst you'll have one full way geometry per day. This should reduce the database size drastically.
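A rough sketch of that idea, in Python rather than the importer's C++, might look like this. It is not the actual MinorTimesCalculator: the function name minor_times, the epoch-second timestamps, and the choice to keep the latest node timestamp per time bucket are assumptions made for illustration.

```python
def minor_times(way_from, way_until, node_times, granularity=1):
    """Return the timestamps (epoch seconds) at which a way version needs a minor version.

    way_from / way_until: lifetime of the way version (way_until may be None).
    node_times: timestamps of all versions of the way's member nodes.
    granularity: bucket size in seconds; 1 mimics the per-second default.
    """
    latest_per_bucket = {}
    for t in node_times:
        # Assumption: only node changes strictly inside the way version's
        # lifetime matter; a change at way_from is covered by the way version.
        if t <= way_from or (way_until is not None and t >= way_until):
            continue
        bucket = t // granularity
        # Keep one timestamp per bucket: the latest change seen in it.
        latest_per_bucket[bucket] = max(t, latest_per_bucket.get(bucket, t))
    return sorted(latest_per_bucket.values())

# Two nodes changed in the same second (1005), one at 1010, one a day later.
node_times = [1000, 1005, 1005, 1010, 90000]
per_second = minor_times(1000, None, node_times)
per_day = minor_times(1000, None, node_times, granularity=86400)
```

With per-second granularity the two changes at 1005 collapse into one minor version; with a one-day bucket the changes at 1005 and 1010 collapse as well, which is where the size reduction would come from.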
Yep, the import is slow. I know, and I haven't done much optimizing in the code. The route I'm taking is:
1. No Tool
2. A Tool
3. A fast Tool
And I'm currently working on 2. Some lines in the code have been annotated with // SPEED, which means that I know a speed improvement is possible here, but I haven't implemented it yet because I want to have a) running code as soon as possible and b) code that makes it easy to change things around. Both are impossible with highly optimized code.
Is the rendering slow? Who knows - I don't. I don't know how a combined spatial + date-time btree index performs on a huge dataset, whether a simple geom index would be more efficient, or whether another database schema would be better suited, but as with the importer there's no way to learn about this other than trying.
The importer currently stores lat and lon for each and every node, indexed by node-id and node-timestamp inside two nested std::map instances. This is okay for smaller regions, but there are ways to improve both memory usage and speed. The roadmap here is very similar to the one mentioned above:
- Make it work
- Make it scale
- Make it fast
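The nested-map layout described above (node-id first, then node-timestamp) can be sketched roughly like this. It is written in Python for brevity and is not the importer's C++ code: the class name NodeStore, its method names, and the "newest version at or before the requested time" lookup rule are assumptions for illustration. Sorted lists with bisect stand in for the ordered inner std::map.

```python
import bisect

class NodeStore:
    """Toy nodestore: node id -> time-ordered versions -> (lat, lon)."""

    def __init__(self):
        self._times = {}   # node id -> sorted list of timestamps
        self._coords = {}  # node id -> list of (lat, lon), parallel to _times

    def record(self, node_id, timestamp, lat, lon):
        times = self._times.setdefault(node_id, [])
        coords = self._coords.setdefault(node_id, [])
        i = bisect.bisect_left(times, timestamp)
        if i < len(times) and times[i] == timestamp:
            coords[i] = (lat, lon)  # same timestamp: overwrite
        else:
            times.insert(i, timestamp)
            coords.insert(i, (lat, lon))

    def lookup(self, node_id, timestamp):
        """Return (lat, lon) of the newest version at or before `timestamp`, or None."""
        times = self._times.get(node_id)
        if not times:
            return None
        i = bisect.bisect_right(times, timestamp)
        return self._coords[node_id][i - 1] if i else None

store = NodeStore()
store.record(1, 100, 49.7717, 8.1777)
store.record(1, 200, 49.7800, 8.1900)
position_at_150 = store.lookup(1, 150)
```

Every version keeps its own pair of coordinates plus per-entry container overhead, which is why this layout gets expensive for larger regions.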
Currently I'm thinking of some ways to improve memory usage. I don't think the version is really needed, so we may save 4 bytes per node. Lat/Lon is stored as double, which could be shrunk using a fixed-length storage and varints. I'm thinking about using Protocol Buffers or SQLite as nodestores, but some benchmarks would be needed. If you have another idea, don't hesitate to drop me an email. Take a look at the Wiki-Page for some notes about memory usage of different imports.
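To make the fixed-length and varint ideas a bit more concrete, here is a minimal Python sketch. It is one possible reading of those ideas, not something the importer does today: the scale factor of 10^7 (seven decimal places), the function names, and the protobuf-style zigzag varint are all assumptions.

```python
import struct

SCALE = 10**7  # assumption: keep 7 decimal places per coordinate

def to_fixed(degrees):
    """Convert degrees to a scaled integer that fits into 32 bits."""
    return int(round(degrees * SCALE))

def pack_node(lat, lon):
    # Two signed 32-bit integers: 8 bytes instead of 16 for two doubles.
    return struct.pack("<ii", to_fixed(lat), to_fixed(lon))

def unpack_node(blob):
    lat, lon = struct.unpack("<ii", blob)
    return lat / SCALE, lon / SCALE

def encode_varint(n):
    """Zigzag-encode a signed integer into a variable-length byte string."""
    z = n * 2 if n >= 0 else -n * 2 - 1
    out = bytearray()
    while z >= 0x80:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)

blob = pack_node(49.7717, 8.1777)
lat, lon = unpack_node(blob)
# A small move between two versions of a node encodes into very few bytes.
delta = encode_varint(to_fixed(49.7720) - to_fixed(49.7717))
```

Varints pay off mainly when storing differences between consecutive versions of a node, since those differences are usually small numbers.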
I'm not someone who doesn't like to write documentation - I just don't like to write it twice. So once the code runs smoothly and is cleaned up, I'll update and rewrite the documentation. If you're stuck while playing around in this early stage, please just drop me a mail and I'll help you out.
If you have any questions just ask at firstname.lastname@example.org.