GitHub - aaronlidman/openstreetPOIs: Extracts and builds points of interest from OpenStreetMap nodes and ways

openstreetPOIs extracts and builds points of interest from OpenStreetMap data. It extracts tagged nodes and also builds areas (buildings, lakes, parks, etc.), using each area's centroid as the point for that feature. The aim is a scalable way to parse any OSM file for points quickly, without needing to set up a database like PostGIS.
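
To illustrate the centroid idea, here is a minimal sketch of computing the area-weighted centroid of a simple polygon with the shoelace formula. It is not taken from osmpois.py; the function name `polygon_centroid` and the plain list-of-tuples input are assumptions made for the example, and the project itself may rely on GEOS for this step.

```python
def polygon_centroid(ring):
    """Return the (x, y) centroid of a simple polygon.

    `ring` is a list of (x, y) vertices; a repeated closing vertex is allowed.
    Hypothetical helper for illustration, not part of openstreetPOIs.
    """
    # Drop the duplicated closing vertex that closed OSM ways carry.
    if len(ring) > 1 and ring[0] == ring[-1]:
        ring = ring[:-1]
    n = len(ring)
    if n < 3:
        raise ValueError("a polygon needs at least three distinct vertices")

    area2 = 0.0  # twice the signed area
    cx = 0.0
    cy = 0.0
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross

    if area2 == 0:
        # Degenerate (zero-area) shape: fall back to the vertex average.
        return (sum(x for x, _ in ring) / n, sum(y for _, y in ring) / n)
    return (cx / (3.0 * area2), cy / (3.0 * area2))
```

For small features such as buildings, treating longitude and latitude as planar coordinates like this is usually a reasonable approximation.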

settings.py contains the list of all features to be extracted. It ships with a sensible default of what I consider useful features, and it is easily editable to your liking.
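
As a rough sketch of how a feature list can drive extraction, the snippet below filters an element's tags against a dictionary of wanted keys and values. The `WANTED` structure, the `"*"` wildcard, and the `matching_tags` function are all hypothetical; the actual layout of settings.py may differ, so check that file before editing it.

```python
# Hypothetical feature list: tag key -> accepted values ("*" accepts any value).
# This is NOT the real structure of settings.py, only an illustration.
WANTED = {
    "amenity": {"cafe", "school"},
    "tourism": {"*"},
}


def matching_tags(tags, wanted=WANTED):
    """Return only the tags whose key and value appear in the feature list."""
    return {
        key: value
        for key, value in tags.items()
        if key in wanted and ("*" in wanted[key] or value in wanted[key])
    }
```

An element would then be kept as a point of interest only when `matching_tags` returns a non-empty dictionary.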

example-dc.geojson is an example of the default output from an 18 MB .osm file covering an area of Washington, DC.
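
A quick way to inspect the output is to load it with Python's standard `json` module. The sketch below assumes the file is a GeoJSON FeatureCollection whose features carry OSM tags in `properties`; that layout is an assumption based on the file extension, and `summarize_geojson` is a name invented for this example.

```python
import json
from collections import Counter


def summarize_geojson(text, key="name"):
    """Return (feature count, features with `key` set, geometry type counts)."""
    data = json.loads(text)
    features = data.get("features", [])
    # Count features whose properties contain a non-empty value for `key`.
    with_key = sum(1 for f in features if (f.get("properties") or {}).get(key))
    # Tally geometry types; POIs are expected to be Points.
    types = Counter((f.get("geometry") or {}).get("type") for f in features)
    return len(features), with_key, dict(types)
```

To try it on the sample file, read example-dc.geojson into a string and pass it to `summarize_geojson`.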

Sample Data

These were all created from Geofabrik extracts on July 12, 2013 with a slightly older version of openstreetPOIs.

California - 252 MB
Western United States - 372 MB
North America - 2072 MB

Installation

git clone https://github.com/aaronlidman/openstreetPOIs.git && cd openstreetPOIs
Mac or Ubuntu? (Homebrew or apt required)
- Mac
  - brew update
  - brew install python geos leveldb protobuf
- Ubuntu
  - 12.10 minimum required; plyvel has a problem with 12.04.
  - apt-get update
  - apt-get -y install python-dev python-pip build-essential libprotobuf-dev protobuf-compiler libleveldb-dev libgeos-dev
Optional: set up your Python virtualenv.
pip install --requirement requirements.txt

Dependencies

Usage

Get your desired OSM data in .pbf, .osm.bz2, or plain .osm format. With all the dependencies installed and Python set up, run python osmpois.py YOUR_OSM_FILE.EXT and add any of the options below.

Options

-h, --help - show the help message and exit
--output OUT - Destination filename to create, without an extension; the extension is added automatically. (default: output)
--overwrite - Overwrite any conflicting files.
--require-key - Only output items that have the 'name' tag defined.
--groupsize - How large a group to use for coordinate lookup. (default: 20) Lower = more RAM, higher = more disk.
--precache - Precache all coordinates. Removes the coordinate lookup process which uses lots of RAM.
--max-nodes - Maximum number of nodes in a way to consider for simplification. Anything over max is skipped. (default: 250) To include everything, no matter how large, set to 2000.
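
If you run the tool from another script, the options above can be assembled into a command line programmatically. This is a minimal sketch: `build_command` is a hypothetical helper, and it assumes that --overwrite, --require-key, and --precache are plain on/off switches while --output, --groupsize, and --max-nodes take a value, as the descriptions suggest.

```python
def build_command(osm_file, output=None, overwrite=False, require_key=False,
                  groupsize=None, precache=False, max_nodes=None):
    """Build an argument list for osmpois.py using only the documented options."""
    cmd = ["python", "osmpois.py", osm_file]
    if output is not None:
        cmd += ["--output", str(output)]
    if overwrite:
        cmd.append("--overwrite")
    if require_key:
        cmd.append("--require-key")
    if groupsize is not None:
        cmd += ["--groupsize", str(groupsize)]
    if precache:
        cmd.append("--precache")
    if max_nodes is not None:
        cmd += ["--max-nodes", str(max_nodes)]
    return cmd
```

The resulting list can be passed to `subprocess.check_call` from inside the repository directory. Run python osmpois.py --help first to confirm how each option is actually parsed.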

Tips

Depending on the hardware you use and options you specify, processing time can vary a lot.

This process relies heavily on your hard drive: the faster the drive, the better. My results are typically twice as fast with an SSD.
RAM is the first limiting factor when parsing large files. Increase the groupsize option to help mitigate this: the larger the groupsize, the less RAM is used, at the cost of more hard drive use, which is slower. Anything from one to a couple million might help.
The output is in the default OSM data projection, EPSG:4326 aka WGS84.
- I might add reprojections later, through pyproj.
For quickest results use PBF files.
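
Since the output stays in EPSG:4326, any reprojection has to happen downstream for now. As one illustration, the sketch below converts a longitude/latitude pair to spherical Web Mercator (EPSG:3857) using only the standard library. It is not part of openstreetPOIs and does not use pyproj; the function name and the choice of target projection are assumptions for the example.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, used by spherical Web Mercator


def wgs84_to_web_mercator(lon, lat):
    """Convert degrees of longitude/latitude to Web Mercator metres (x, y)."""
    if not -90.0 < lat < 90.0:
        raise ValueError("latitude must be strictly between -90 and 90 degrees")
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4.0 + math.radians(lat) / 2.0))
    return x, y
```

For other target projections, a dedicated library such as pyproj is the safer choice.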


BSD License
