Uses parallel programming & small tricks to produce large map intersections up to 20-30x faster.

Fast map intersection (PostGIS)

This template script will split up a large GIS map intersection query into hundreds of little pieces that run quickly in parallel. This is a special case; for other types of queries, try http://parpsql.com / http://github.com/gbb/par_psql.
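As a rough illustration of the splitting idea, the sketch below emits one small intersection statement per id range. It is not taken from the script itself: the function name `emit_chunk_queries`, the tables `map_a`, `map_b` and `result_part`, the columns `gid` and `geom`, and the choice to split on an integer id are all assumptions made for the example.

```shell
#!/bin/bash
# Hypothetical sketch: print one INSERT ... SELECT per id range, so that each
# statement can be run as its own small job. All table and column names
# (map_a, map_b, result_part, gid, geom) are placeholders, not the script's own.

emit_chunk_queries() {
    local min_id=$1 max_id=$2 chunk=$3
    local start end
    for (( start = min_id; start <= max_id; start += chunk )); do
        end=$(( start + chunk - 1 ))
        # The last chunk may be shorter than the others.
        if (( end > max_id )); then
            end=$max_id
        fi
        printf 'INSERT INTO result_part SELECT a.gid AS a_gid, b.gid AS b_gid, ST_Intersection(a.geom, b.geom) AS geom FROM map_a a JOIN map_b b ON ST_Intersects(a.geom, b.geom) WHERE a.gid BETWEEN %d AND %d;\n' "$start" "$end"
    done
}
```

Assuming GNU Parallel and psql are installed, `result_part` already exists, and `mydb` is your database, the generated statements could then be run with something like `emit_chunk_queries 1 1000000 5000 | parallel --bar psql -d mydb -c {}`, which also shows a progress bar.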

The script may look a little ugly, and you will need to spend a few minutes editing a couple of variables before you can run it on your own query. It is most useful when you need to combine two maps: we have used it locally to cut our intersection time on national-scale maps with millions of rows from around 1 day to just 1 hour. The results were checked and are identical. Note, however, that we run it on two 8-core processors with hyperthreading, so the speedup may be smaller on machines with fewer cores.

It gives you an indication of progress as it runs, which is a nice advantage over an ordinary long-running query. You can optionally put extra clauses into the SQL if you need, for example, only a subset of the map selected by some WHERE condition.

Performance features: it uses unlogged tables for temporary results, and it temporarily increases work_mem a little.
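A minimal sketch of what these two features can look like in generated SQL is shown below. The function names, the `result_part_N` naming scheme and the final merge step are assumptions for illustration, not the script's actual code. The sketch leaves out the `SET work_mem` line when no value is given, which is the behaviour the TODO list asks for rather than what the script does today, and it merges the partial tables with UNION ALL as suggested in the THANKS section.

```shell
#!/bin/bash
# Hypothetical sketch: SQL for one worker, followed by SQL to merge the parts.
# The result_part_N naming and both function names are placeholders.

emit_worker_sql() {
    local part=$1 work_mem=$2 query=$3
    # SET only affects the current session, so the increase is temporary.
    if [ -n "$work_mem" ]; then
        printf "SET work_mem = '%s';\n" "$work_mem"
    fi
    # Unlogged tables skip the write-ahead log, which makes bulk writes cheaper.
    printf 'CREATE UNLOGGED TABLE result_part_%s AS %s;\n' "$part" "$query"
}

emit_merge_sql() {
    local result=$1 parts=$2 i
    printf 'CREATE TABLE %s AS\n' "$result"
    for (( i = 1; i <= parts; i++ )); do
        # UNION ALL keeps every row and avoids the duplicate check done by UNION.
        if (( i > 1 )); then
            printf 'UNION ALL\n'
        fi
        printf 'SELECT * FROM result_part_%d\n' "$i"
    done
    printf ';\n'
}
```

For example, `emit_worker_sql 1 '256MB' 'SELECT 1'` prints a `SET work_mem` line followed by a `CREATE UNLOGGED TABLE result_part_1 AS SELECT 1;` line, and `emit_merge_sql final_result 3` prints a single `CREATE TABLE` statement that combines three partial tables.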

Tips

How to use

  • Edit the 'fast_map_intersection.sh' file to specify your query in the query section.
  • Specify the name for your result.
  • Type "./fast_map_intersection.sh"
  • Wait a little, watch the progress bars.
  • Done!
  • (Press Ctrl-Z to suspend the program if you want to inspect the command files; then type 'fg' to resume.)

AUTHOR

Graeme B Bell, Norwegian Forest and Landscape Institute / NIBIO (Norwegian Institute of Bioeconomy Research).

NEWS

v1.0.1 Presented at FOSS4G Europe in Como.

THANKS

  • Thanks to the Norwegian Forest and Landscape Institute (now NIBIO, the Norwegian Institute of Bioeconomy Research) for supporting open source publication/sharing of our local scripts that may be useful for others.

  • Martijn Meijers for the suggestion of using UNION ALL for a little extra speed.

LINKS

If you like this script, you may also like these links: Mark Wynter has written some guides to using GNU Parallel for other GIS work with PostGIS.

http://dimensionaledge.com/intro-vector-tiling-map-reduce-postgis/

http://dimensionaledge.com/from-days-to-minutes-geoprocessing-of-alberta-land-use-data/

TODO

  • add more examples?
  • make 'work_mem' optional (don't emit output if it's unset).
  • rewrite program as a parser rather than a template script?