Note: these tests are quite specific to our requirements for Pelias and, as such, don't cover the complete functionality of each library.
Our requirements are pretty unusual in that we frequently import the whole OSM planet file (~26GB compressed) into Elasticsearch. As this can take ~20 days, any speed improvement will have a significant impact on our dev cycles.
The tests involve decompressing a PBF extract of London stored on SSD and serializing each node/way to a single line of JSON which is then sent to stdout.
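As a rough illustration of the "one line of JSON per node/way" output described above, here is a minimal Go sketch. The `Element` type and its field names are assumptions made for this example; they are not the schema used by any of the tested libraries, and the PBF decoding step is left out entirely.

```go
package main

import (
	"bufio"
	"encoding/json"
	"os"
)

// Element is a hypothetical, simplified stand-in for a decoded OSM node or way.
// Nodes carry Lat/Lon; ways carry Refs. Empty fields are omitted from the JSON.
type Element struct {
	Type string            `json:"type"`
	ID   int64             `json:"id"`
	Lat  *float64          `json:"lat,omitempty"`
	Lon  *float64          `json:"lon,omitempty"`
	Refs []int64           `json:"refs,omitempty"`
	Tags map[string]string `json:"tags,omitempty"`
}

// encodeLine serializes one element as compact JSON terminated by a newline.
func encodeLine(e Element) ([]byte, error) {
	b, err := json.Marshal(e)
	if err != nil {
		return nil, err
	}
	return append(b, '\n'), nil
}

func main() {
	// Buffer stdout so that each element does not trigger its own write syscall.
	out := bufio.NewWriter(os.Stdout)
	defer out.Flush()

	lat, lon := 51.5074, -0.1278
	elements := []Element{
		{Type: "node", ID: 1, Lat: &lat, Lon: &lon, Tags: map[string]string{"name": "London"}},
		{Type: "way", ID: 2, Refs: []int64{1, 3, 4}},
	}
	for _, e := range elements {
		line, err := encodeLine(e)
		if err != nil {
			panic(err)
		}
		if _, err := out.Write(line); err != nil {
			panic(err)
		}
	}
}
```

Buffering stdout matters in a benchmark like this, since unbuffered writes can end up dominating the measured time.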
you can view the raw results output here
It seems that the golang parser go-osmpbf is substantially faster than the rest of the pack, most probably because it takes advantage of multicore architecture by using many goroutines.
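To give a feel for why goroutines help here, the following is a generic worker-pool sketch in Go. It is not taken from go-osmpbf and makes no claim about how that library is structured internally; `decodeBlock` is a hypothetical placeholder for the CPU-heavy work of decompressing and decoding one PBF block.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// decodeBlock is a hypothetical stand-in for decoding one PBF block.
// Here it simply sums the bytes so the example has something to compute.
func decodeBlock(block []byte) int {
	sum := 0
	for _, b := range block {
		sum += int(b)
	}
	return sum
}

// decodeAll fans the blocks out to a pool of goroutines. Each result is
// stored at its block's index, so output order matches input order.
func decodeAll(blocks [][]byte, workers int) []int {
	if workers < 1 {
		workers = 1
	}
	results := make([]int, len(blocks))
	jobs := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each index is received by exactly one worker, so no two
			// goroutines ever write to the same slot of results.
			for i := range jobs {
				results[i] = decodeBlock(blocks[i])
			}
		}()
	}

	for i := range blocks {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	blocks := [][]byte{{1, 2, 3}, {4, 5}, {6}}
	fmt.Println(decodeAll(blocks, runtime.NumCPU())) // prints [6 9 6]
}
```

Because the blocks can be decoded independently, the work spreads across cores with little coordination beyond the job channel.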
osm-pbf-parser comes in second, most probably due to the amazing work done by @astro. There is some potential to parallelize this library, although there is a cost/benefit tradeoff for various reasons. This lib is very impressive considering it's only working on a single core.
Again it's a surprise to see that the
reproduce the results yourself
Determining absolute performance is a bit of a fool's errand; however, if you think there may be an error in my method or you'd like to add another lib, you can re-run the tests yourself.
Make sure you have the most current versions of the following installed:
- mercurial (for the golang dep)
for impartial PBF stats I use:
- osmconvert (sudo apt-get install osmctools)
go get github.com/qedus/osmpbf; npm install;
For the results above I used an SSD instance on AWS with the following read performance:
$ sudo hdparm -Tt /dev/xvdb
/dev/xvdb:
 Timing cached reads: 20518 MB in 2.00 seconds = 10271.47 MB/sec
 Timing buffered disk reads: 2204 MB in 3.00 seconds = 734.64 MB/sec
We provide PBF extracts free-of-charge at https://mapzen.com/metro-extracts/
The file I used for this test was London, UK. If the area you're looking for is not available, please submit a PR with your bbox and we'll add it for you!