c/sedona-s2geography: Improve performance of s2geography operators #139

@paleolimbot

While most physical manipulations for the geography type (e.g., to/from WKB/WKT/Point) are implemented outside of s2geography, most operations that require doing math go through the s2geography C++ library ( https://github.com/paleolimbot/s2geography ), which also powers Spherely in Python and s2 in R. That path currently takes a rather naive approach similar to what we do for GEOS: make me a "Geography", do something, then write the output.

Just like creating GEOS geometries from WKB in a loop incurs substantial overhead, so does creating geographies. There are a few things we can do to speed up s2-based operations:

  • For index-assisted operations like the predicates, implement S2Shape on top of WKB, similar to how our wkb::Wkb Rust object is a zero-copy wrapper around the WKB buffer. I wrote a blog post several years ago demonstrating this concept ( https://dewey.dunnington.ca/post/2021/prototyping-an-apache-arrow-representation-of-geometry/#zero-copy-s2arrow ) but never did anything with it. Now is the time!
  • For simpler operations like perimeter and length, just iterate over the coordinates and use S2's fantastic set of primitives to do the work.
  • For repeated operations that benefit from preparedness, use the existing ShapeIndexGeography as a prepared geometry. This can be serialized as well such that we could implement a prepared geography Arrow type. This will probably be necessary to support a reasonable spatial join implementation for Geography.
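
To make the second bullet concrete, here is a minimal sketch of computing a spherical length by walking the coordinates of a WKB LineString in place, without building an intermediate geometry object. It is written in plain Python only so that it is self-contained; the real work would presumably happen in C++ using S2's own point and angle primitives. The function names, the restriction to 2D LineString, and the Earth radius constant are all assumptions made for illustration and are not part of s2geography or SedonaDB.

```python
import math
import struct

# Assumption for this sketch: a mean Earth radius in meters.
EARTH_RADIUS_M = 6371008.8


def iter_linestring_coords(wkb):
    """Yield (lon, lat) pairs read directly from a 2D WKB LineString buffer."""
    view = memoryview(wkb)  # a view over the buffer, not a copy
    endian = "<" if view[0] == 1 else ">"  # WKB byte-order flag
    geom_type, n_points = struct.unpack_from(endian + "II", view, 1)
    if geom_type != 2:
        raise ValueError("this sketch only handles 2D LineString (WKB type 2)")
    offset = 9  # 1 byte order + 4 bytes type + 4 bytes point count
    for _ in range(n_points):
        yield struct.unpack_from(endian + "dd", view, offset)
        offset += 16


def to_unit_vector(lon_deg, lat_deg):
    """Convert degrees longitude/latitude to a point on the unit sphere."""
    lon, lat = math.radians(lon_deg), math.radians(lat_deg)
    cos_lat = math.cos(lat)
    return (cos_lat * math.cos(lon), cos_lat * math.sin(lon), math.sin(lat))


def angle_between(a, b):
    """Angle in radians between two unit vectors via atan2(|a x b|, a . b)."""
    cx = a[1] * b[2] - a[2] * b[1]
    cy = a[2] * b[0] - a[0] * b[2]
    cz = a[0] * b[1] - a[1] * b[0]
    dot = a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
    return math.atan2(math.sqrt(cx * cx + cy * cy + cz * cz), dot)


def linestring_length_m(wkb):
    """Sum the great-circle edge lengths of a WKB LineString, in meters."""
    total = 0.0
    prev = None
    for lon, lat in iter_linestring_coords(wkb):
        cur = to_unit_vector(lon, lat)
        if prev is not None:
            total += angle_between(prev, cur)
        prev = cur
    return total * EARTH_RADIUS_M


# Usage: a two-point line along the equator from longitude 0 to longitude 90.
wkb = struct.pack("<BII4d", 1, 2, 2, 0.0, 0.0, 90.0, 0.0)
quarter_circle_m = linestring_length_m(wkb)
```

The only per-row state here is the previous vertex and a running sum, which is the property that makes this kind of operation cheap compared with materializing a full geography for every row.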

I'm planning on making updates to the geoarrow-c stack (to support the zero-copy shim on top of WKB that we'll need in s2geography) and s2geography (to implement the S2Shape wrapper + optimized predicates) in the next few weeks, targeting SedonaDB 0.2.
