Relative performance comparison of OSM PBF parsers


Measure the relative performance of PBF parsers in JavaScript, C++ and Go.

note: these tests are quite specific to our requirements for Pelias, and as such don't cover the complete functionality of each library.

Our requirements are pretty unusual in that we frequently import the whole OSM planet file (~26GB compressed) into Elasticsearch. As this can take ~20 days, any speed improvement will have a significant impact on our dev cycles.

The tests involve decompressing a PBF extract of London stored on an SSD and serializing each node/way to a single line of JSON, which is then sent to stdout.
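
As a rough illustration of the output format described above, here is a minimal sketch of a sink that writes one JSON line per element. It is not the code used in these tests: the element shape (a `type` field plus arbitrary properties) and the names `toJsonLine` and `createJsonLineSink` are assumptions made for the example, and each parser emits its own field names.

```javascript
const { Transform } = require('stream');

// Serialize one element to a single line of JSON.
// The element shape is hypothetical; real parsers differ in their fields.
function toJsonLine(element) {
  // JSON.stringify without an indent argument escapes newlines inside
  // strings, so every element occupies exactly one output line.
  return JSON.stringify(element) + '\n';
}

// Object-mode transform: parsed elements in, newline-delimited JSON out.
function createJsonLineSink() {
  return new Transform({
    writableObjectMode: true,
    transform(element, _encoding, callback) {
      // Assumption: only nodes and ways are serialized, anything else is dropped.
      if (element.type === 'node' || element.type === 'way') {
        callback(null, toJsonLine(element));
      } else {
        callback();
      }
    }
  });
}

// Example wiring, where `parser` stands in for any object-mode stream:
//   parser.pipe(createJsonLineSink()).pipe(process.stdout);
```

Keeping the serialization step identical for every parser means that differences in run time come from the parsing itself rather than from the output code.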

libraries

results

You can view the raw results output in the results directory.

analysis

It seems that the golang parser go-osmpbf is substantially faster than the rest of the pack, most probably because it takes advantage of multicore architecture by using many goroutines.

Surprisingly, the pure javascript library osm-pbf-parser comes in second, most probably due to the amazing work done by @astro. There is some potential to parallelize this library, although there is a cost/benefit tradeoff to consider. This lib is very impressive considering it only works on a single core.

Again, it's a surprise to see the node-osmium javascript bindings to the C++ library underperforming compared to the pure js version. The documentation states that it works on multiple cores, although I never managed to see that in action. The library itself is impressively feature-rich and is ideal for most use cases.

reproduce the results yourself

Determining absolute performance is a bit of a fool's errand; however, if you think there may be an error in my method or you'd like to add another lib, you can re-run the tests yourself.

prerequisites

Make sure you have the most current versions of the following installed:

  • nodejs
  • golang
  • mercurial (for the golang dep)

for impartial PBF stats I use:

  • osmconvert (sudo apt-get install osmctools)

dependencies

go get github.com/qedus/osmpbf;
npm install;

run test

bash run.sh

drive performance

For the results above I used an SSD instance on AWS with the following read performance:

$ sudo hdparm -Tt /dev/xvdb

/dev/xvdb:
 Timing cached reads:   20518 MB in  2.00 seconds = 10271.47 MB/sec
 Timing buffered disk reads: 2204 MB in  3.00 seconds = 734.64 MB/sec

metro extracts

We provide PBF extracts free-of-charge at https://mapzen.com/metro-extracts/

The file I used for this test was London, UK. If the area you're looking for is not available, please submit a PR with your bbox and we'll add it for you!
