An OpenStreetMap pbf parser which exports json, allows you to cherry-pick tags and handles denormalizing ways and relations. Available as a standalone binary and comes with a convenient npm wrapper.

pbf2json creates a JSON stream of OpenStreetMap data from any PBF extract. You can pick and choose only the parts of the file you want, and the library takes care of denormalizing the relational data (nodes/ways) so you can put it straight into your favourite document store, inverted index or graph database.

Run from pre-built binary

You don't need to have Go installed on your system to use one of the binary files in ./build:

# 64-bit linux distributions
$ ./build/pbf2json.linux-x64
# 64-bit OSX distributions
$ ./build/pbf2json.darwin-x64

You can also run it on your Raspberry Pi!

# embedded devices
$ ./build/pbf2json.linux-arm

Usage

To control which tags are output, pass the -tags= flag to pbf2json, followed by the path to the PBF file:

$ ./build/pbf2json.linux-x64 -tags="amenity" /tmp/wellington_new-zealand.osm.pbf
{"id":170603342,"type":"node","lat":-41.289843000000005,"lon":174.7944402,"tags":{"amenity":"fountain","created_by":"Potlatch 0.5d","name":"Oriental Bay Fountain","source":"knowledge"},"timestamp":"0001-01-01T00:00:00Z"}
{"id":170605346,"type":"node","lat":-41.2861039,"lon":174.7711539,"tags":{"amenity":"fountain","created_by":"Potlatch 0.10c","source":"knowledge"},"timestamp":"0001-01-01T00:00:00Z"}

Advanced Usage

Multiple tags can be specified with commas; records will be returned if they match one OR the other:

# all buildings and shops
-tags="building,shop"

Tags can also be grouped with the + symbol; records will only be returned if they match one AND the other:

# only records with BOTH housenumber and street specified
-tags="addr:housenumber+addr:street"

You can also combine the two delimiters above to get even more control over what gets returned:

# only highways and waterways which have a name
-tags="highway+name,waterway+name"

If you need to target only specific values for a tag, you can specify exactly which values to extract using the ~ symbol:

# only extract cuisine tags which have the value of vegetarian or vegan
-tags="cuisine~vegetarian,cuisine~vegan"

Denormalization

When processing the ways, the node refs are looked up for you and the lat/lon values are added to each way.

Centroids have been computed for each way since version 3.0, and bounds since version 5.0.

Output of the nodes array (as seen below) is optional. It has been disabled by default since version 5.0 but can be enabled with the flag --waynodes=true.

{
  "id": 301435061,
  "type": "way",
  "tags": {
    "addr:housenumber": "33",
    "addr:postcode": "N5 1TH",
    "addr:street": "Highbury Park",
    "building": "residential"
  },
  "centroid": {
    "lat": "51.554679",
    "lon": "-0.098485"
  },
  "bounds": {
    "e": "-0.0983673",
    "n": "51.5547179",
    "s": "51.5546574",
    "w": "-0.0985915"
  },
  "nodes": [
    {
      "lat": "51.554663",
      "lon": "-0.098369"
    },
    {
      "lat": "51.554657",
      "lon": "-0.098529"
    },
    {
      "lat": "51.554656",
      "lon": "-0.098592"
    },
    {
      "lat": "51.554676",
      "lon": "-0.098590"
    },
    {
      "lat": "51.554680",
      "lon": "-0.098529"
    },
    {
      "lat": "51.554715",
      "lon": "-0.098529"
    },
    {
      "lat": "51.554720",
      "lon": "-0.098369"
    },
    {
      "lat": "51.554663",
      "lon": "-0.098369"
    }
  ]
}

Leveldb

This library uses leveldb to store the lat/lon info about nodes so that it can denormalize the ways for you.

By default the leveldb path is set to /tmp; you can change where the data is stored with a flag:

$ ./build/pbf2json.linux-x64 -leveldb="/tmp/somewhere"

Batched writes

Since version 3.0 writing of node info to leveldb is done in batches to improve performance.

By default the batch size is 50000; you can change this with the following flag:

$ ./build/pbf2json.linux-x64 -batch="1000"
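
The general idea behind batching is to buffer writes and hand them to the store in groups rather than one at a time. The Go sketch below shows that pattern in isolation. It does not use leveldb at all: the `flush` callback stands in for whatever performs the real batched write, and the `Batcher` and `NodeCoord` types are hypothetical names, not taken from the pbf2json source.

```go
package main

import "fmt"

// NodeCoord is an illustrative record: a node id with its coordinates.
type NodeCoord struct {
	ID       int64
	Lat, Lon float64
}

// Batcher buffers items and calls flush once size items have accumulated.
type Batcher struct {
	size  int
	buf   []NodeCoord
	flush func([]NodeCoord) error
}

// NewBatcher creates a Batcher; sizes below 1 are treated as 1.
func NewBatcher(size int, flush func([]NodeCoord) error) *Batcher {
	if size < 1 {
		size = 1
	}
	return &Batcher{size: size, buf: make([]NodeCoord, 0, size), flush: flush}
}

// Add buffers one item and flushes when the batch is full.
func (b *Batcher) Add(n NodeCoord) error {
	b.buf = append(b.buf, n)
	if len(b.buf) >= b.size {
		return b.Flush()
	}
	return nil
}

// Flush writes any buffered items. The slice passed to flush is reused
// afterwards, so the callback must not retain it.
func (b *Batcher) Flush() error {
	if len(b.buf) == 0 {
		return nil
	}
	err := b.flush(b.buf)
	b.buf = b.buf[:0]
	return err
}

func main() {
	b := NewBatcher(1000, func(batch []NodeCoord) error {
		// Stand-in for a real batched write to a key-value store.
		fmt.Println("writing batch of", len(batch))
		return nil
	})
	for i := 0; i < 2500; i++ {
		if err := b.Add(NodeCoord{ID: int64(i)}); err != nil {
			panic(err)
		}
	}
	// Write whatever remains after the last full batch.
	if err := b.Flush(); err != nil {
		panic(err)
	}
}
```

A larger batch size means fewer write calls at the cost of more memory held in the buffer, which is the trade-off the -batch flag exposes.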

NPM module

var pbf2json = require('pbf2json'),
    through = require('through2');

var config = {
  file: '/tmp/wellington_new-zealand.osm.pbf', // path to the PBF extract
  tags: 'addr:housenumber+addr:street',        // same syntax as the -tags flag
  leveldb: '/tmp'                              // where the leveldb data is stored
};

// log each item emitted by the stream
pbf2json.createReadStream( config )
 .pipe( through.obj( function( item, enc, next ){
    console.log( item );
    next();
 }));

Run the go code from source

Make sure Go is installed and configured on your system, see: https://gist.github.com/missinglink/4212a81a7d9c125b68d9

Note: you should install the latest version of Go (at least 1.5; last tested on 1.6.2).

sudo apt-get install mercurial;
go get;
go run pbf2json.go;

Compile source for all supported architectures

If you are doing a release and would like to compile for all supported architectures:

Note: if this is your first time doing this, please read the notes in './compile.sh' to set it all up on your machine.

bash compile.sh;

Compile source for a new architecture

If you would like to compile a version of this lib for an architecture which isn't currently supported you can:

go get;
go build;
chmod +x pbf2json;
mv pbf2json build/pbf2json.{platform}-{arch};

Note: you will need to replace the placeholders {platform} and {arch} with the values returned by nodejs for your system:

$ node
> var os=require('os')
> os.platform()
'linux'
> os.arch()
'x64'
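
Go and Node.js use different names for some platforms and architectures (for example, Go's amd64 is x64 in Node.js), which is easy to get wrong when naming a new binary. The following Go sketch derives the expected file name from Go's own identifiers. The mapping tables cover only a handful of common targets and the `binaryName` helper is hypothetical; it is not part of compile.sh or the npm wrapper.

```go
package main

import (
	"fmt"
	"runtime"
)

// nodePlatform maps Go's GOOS values to the names returned by os.platform() in Node.js.
var nodePlatform = map[string]string{
	"linux":   "linux",
	"darwin":  "darwin",
	"windows": "win32",
	"freebsd": "freebsd",
}

// nodeArch maps Go's GOARCH values to the names returned by os.arch() in Node.js.
var nodeArch = map[string]string{
	"amd64": "x64",
	"386":   "ia32",
	"arm":   "arm",
	"arm64": "arm64",
}

// binaryName builds a file name of the form pbf2json.{platform}-{arch}.
func binaryName(goos, goarch string) (string, error) {
	p, ok := nodePlatform[goos]
	if !ok {
		return "", fmt.Errorf("no known Node.js platform name for GOOS %q", goos)
	}
	a, ok := nodeArch[goarch]
	if !ok {
		return "", fmt.Errorf("no known Node.js arch name for GOARCH %q", goarch)
	}
	return "pbf2json." + p + "-" + a, nil
}

func main() {
	name, err := binaryName(runtime.GOOS, runtime.GOARCH)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("build/" + name)
}
```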

Then submit a pull request, you are awesome ;)