Add GEOSPrepared<PRED>XY functions to C API #677

dbaston · 2022-09-09T13:43:05Z

This PR adds a function, GEOSPreparedContainsXY, to the C API to allow point-in-polygon tests without the overhead of point construction. The significance of this overhead depends on the data. When checking many points against a complex boundary, this PR makes almost no difference:

bin/perf_geospreparedcontains ~/data/australia.txt 100000

# Performing 100000 point-in-polygon tests.
# Reading shapes from /home/dan/data/australia.txt
# Read 1 geometries.
# GEOSPreparedContains: 32931 hits from 100000 points in 202,552
# GEOSPreparedContainsXY: 32931 hits from 100000 points in 196,158

With smaller polygons, it makes a measurable improvement (about 20%):

bin/perf_geospreparedcontains ~/data/wsa_individual.wkt 500

# Performing 500 point-in-polygon tests.
# Reading shapes from /home/dan/data/wsa_individual.wkt
# Read 2768 geometries.
# GEOSPreparedContains: 672198 hits from 500 points in 239,358
# GEOSPreparedContainsXY: 672198 hits from 500 points in 186,093
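For reference, the two comparisons can be turned into relative improvements with a few lines of arithmetic. This is only a sketch: `relativeImprovement` is a hypothetical helper, and it assumes the trailing figures in the benchmark output (for example "239,358") are elapsed times in whatever unit the benchmark prints, which the output does not state.

```cpp
#include <cassert>

// Hypothetical helper: fractional reduction in elapsed time when going
// from `before` to `after`. Both values must use the same (unspecified) unit.
double relativeImprovement(double before, double after) {
    assert(before > 0.0);
    return (before - after) / before;
}

// Figures copied from the benchmark output above, commas removed.
// Small polygons:   (239358 - 186093) / 239358, roughly 0.22
// Complex boundary: (202552 - 196158) / 202552, roughly 0.03
```

Under that assumption, the small-polygon case improves by a little over 20%, while the single complex polygon improves by about 3%.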

The C++ code surrounding prepared geometries is relatively complex, so rather than modify it to thread a Coordinate object through, I modified the C API GEOSContextHandle to hold a mutable Point object.

The existing API for modifying a Point is rather complex and slow. This is the best I could do:

struct SetCoordinateValue : public geos::geom::CoordinateFilter {
    SetCoordinateValue(double x, double y) : m_x(x), m_y(y) {}

    void filter_rw(Coordinate* c) const override {
        c->x = m_x;
        c->y = m_y;
    }

    double m_x;
    double m_y;
};

SetCoordinateValue filter(x, y);
extHandle->point2d->apply_rw(&filter);
extHandle->point2d->geometryChanged();

So I added a Point::setXY method to make this easier and faster.
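The pattern of keeping one reusable point on the handle can be sketched without GEOS at all. The following is a minimal, self-contained illustration, not the code in this PR: `ScratchPoint`, `Box`, `Handle`, and `containsXY` are hypothetical stand-ins for the real point, prepared geometry, context handle, and C API function, and the box test stands in for a real prepared-geometry predicate.

```cpp
#include <cassert>

// Hypothetical stand-in for a mutable point geometry (not geos::geom::Point).
struct ScratchPoint {
    double x = 0.0;
    double y = 0.0;

    // Overwrite the coordinate in place: no allocation and no filter object.
    void setXY(double newX, double newY) {
        x = newX;
        y = newY;
    }
};

// Hypothetical stand-in for a prepared polygon: an axis-aligned box.
struct Box {
    double minX, minY, maxX, maxY;

    // True only if the point lies strictly inside the box (its interior).
    bool contains(const ScratchPoint& p) const {
        return p.x > minX && p.x < maxX && p.y > minY && p.y < maxY;
    }
};

// Hypothetical stand-in for a context handle that owns one reusable point.
struct Handle {
    ScratchPoint point2d;
};

// XY variant: reuse the handle's point instead of constructing a new one
// for every test, then delegate to the ordinary geometry-based predicate.
bool containsXY(Handle& handle, const Box& box, double x, double y) {
    handle.point2d.setXY(x, y);
    return box.contains(handle.point2d);
}
```

In this sketch the point is shared mutable state, so a single `Handle` must not be used from two threads at the same time.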

If there is general support for this approach, I can document the new function and add XY variants in other cases where they would be useful (GEOSPreparedIntersectsXY, etc.)

pramsey

Looks A-OK to me.

dbaston · 2022-09-09T16:20:20Z

Do the following make sense?

GEOSPreparedContainsProperlyXY
GEOSPreparedCoveredByXY
GEOSPreparedDisjointXY
GEOSPreparedIntersectsXY
GEOSPreparedTouchesXY

pramsey · 2022-09-09T16:22:22Z

GEOSPreparedCoveredByXY feels a little pointless (ha ha ha) but perhaps in a completionist sense, might as well do it? It is possible to have both true and false results from it I guess.

dbaston · 2022-09-09T16:25:16Z

Is it a synonym for GEOSPreparedContains if the second argument is a point?

dbaston · 2022-09-09T16:40:50Z

Disregard, I think I'm confused by the wording of our docstrings.

geos/capi/geos_c.h.in

Lines 4299 to 4309 in c68119c

/**
* Using a \ref GEOSPreparedGeometry do a high performance
* calculation of whether the provided geometry is covered by.
* \param pg1 The prepared geometry
* \param g2 The geometry to test
* \returns 1 on true, 0 on false, 2 on exception
* \see GEOSCoveredBy
*/
extern char GEOS_DLL GEOSPreparedCoveredBy(
    const GEOSPreparedGeometry* pg1,
    const GEOSGeometry* g2);

We are testing if the prepared geometry is covered by the provided geometry, but the text implies (to me) the reverse.

So I should add GEOSPreparedCoversXY, not GEOSPreparedCoveredByXY.

And it is probably misleading to add GEOSPreparedTouchesXY, since it just defaults to the underlying Geometry implementation.

benchmarks/capi/GEOSPreparedContainsPerfTest.cpp

pramsey · 2022-09-09T17:54:18Z

Is this not actually all achievable (performance-wise) by going:

1. Create a prepared geometry based on polygon input.
2. Create a coordinate sequence with one dummy entry.
3. Create a point based on that coordinate sequence.
4. For each point in my data set:
   - Call GEOSCoordSeq_setXY() on the coordinate sequence
   - Call GEOSPrepared*() on the polygon and the point

dbaston · 2022-09-09T18:00:59Z

There's an assumption that you stop touching the coordinate sequence once you use it to create a geometry. (We should document this!) There are at least two reasons for this:

1. There is no guarantee that the coordinate sequence you provide lives on in the geometry (we might move its contents into a new coordinate sequence).
2. If you manipulate the coordinates directly, the geometry has no way to know that the envelope should be updated, so any calculation relying on the envelope will be wrong.
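The stale-envelope problem is easy to reproduce in miniature. The sketch below is not GEOS code: `Coord`, `Envelope`, and `CachedGeom` are hypothetical types, and `CachedGeom` is assumed to compute its envelope lazily and cache it, which is the behavior that makes direct coordinate edits unsafe.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

struct Coord { double x; double y; };
struct Envelope { double minX, minY, maxX, maxY; };

// Hypothetical geometry that computes its envelope once and caches it.
class CachedGeom {
public:
    explicit CachedGeom(std::vector<Coord> coords) : coords_(std::move(coords)) {
        assert(!coords_.empty());
    }

    // Direct access to the coordinates, bypassing any change notification.
    std::vector<Coord>& rawCoords() { return coords_; }

    // Discard the cached envelope so the next query recomputes it.
    void geometryChanged() { cached_ = false; }

    const Envelope& envelope() {
        if (!cached_) {
            env_ = computeEnvelope();
            cached_ = true;
        }
        return env_;
    }

private:
    Envelope computeEnvelope() const {
        Envelope e{coords_[0].x, coords_[0].y, coords_[0].x, coords_[0].y};
        for (const Coord& c : coords_) {
            e.minX = std::min(e.minX, c.x);
            e.minY = std::min(e.minY, c.y);
            e.maxX = std::max(e.maxX, c.x);
            e.maxY = std::max(e.maxY, c.y);
        }
        return e;
    }

    std::vector<Coord> coords_;
    Envelope env_{};
    bool cached_ = false;
};
```

If a caller reads `envelope()`, then writes through `rawCoords()`, the next `envelope()` call still returns the old box. Only an explicit `geometryChanged()` call brings it back in sync, and any envelope-based short circuit in between would give a wrong answer.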

pramsey · 2022-09-09T18:02:46Z

In that case, does the existence of a CAPI GEOSPoint_setXY() obviate the need for the multiple GEOSPrepared*XY() signatures?

dbaston · 2022-09-09T18:04:53Z

It would meet the same need with fewer signatures. I think it's a bit less discoverable, though.

pramsey · 2022-09-09T18:07:42Z

Another take on the signatures is: is there any substantial use for anything other than PreparedIntersectsXY? (if we want discoverability... all other use cases could be handled with Point_setXY). Or we could just add an example of using Point_setXY() to the examples fleet. At this point I'm just flailing. What does your spidey sense tell you?

dbaston · 2022-09-09T18:13:20Z

Point_setXY seems like a useful addition regardless. I think the Prepared*XY signatures are useful because a user doesn't have to know that Point creation overhead can be important, and that reusing a point with Point_setXY is a way around it. You just start looking for how to do a PIP, and the simplest thing to call happens to also be the fastest. Do we need Covers, Contains, ContainsProperly, Intersects, and Disjoint? No, but do they hurt? (I don't think so, but it's a genuine question)

dr-jts · 2022-09-09T18:18:27Z

> Do we need Covers, Contains, ContainsProperly, Intersects, and Disjoint? No, but do they hurt? (I don't think so, but it's a genuine question)

More to test, maintain and document.

dr-jts · 2022-09-09T18:21:48Z

> If you manipulate the coordinates directly, the geometry has no way to know that the envelope should be updated, so any calculation relying on the envelope will be wrong.

There is a Geometry::geometryChanged method that allows forcing envelope update. It's not exposed in the C API, except indirectly by GEOSGeom_transformXY_r.

Usually I'm not crazy about mutating geometries, but this seems like one case where it's justified.

dbaston · 2022-09-09T18:22:40Z

> More to test, maintain and document.

Yes, though keep in mind we're talking about five two-line functions.

dr-jts · 2022-09-09T19:25:10Z

Because of the much more limited semantics of Point relationships I suggest the following functions:

GEOSPreparedIntersectsXY
GEOSPreparedContainsXY
GEOSPreparedContainsProperlyXY

Disjoint is just !Intersects, and I suspect it is rarely used.

Touches can be determined by Intersects(Boundary(geom), point). PreparedGeometry could provide a fast implementation for this for Point inputs - but doesn't at present.
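These reductions can be illustrated with a toy point locator. The sketch below assumes a non-degenerate axis-aligned rectangle in place of a prepared polygon; `Rect`, `locate`, and the `*XY` helpers are hypothetical names, not GEOS functions.

```cpp
#include <cassert>

enum class Location { Interior, Boundary, Exterior };

// Hypothetical stand-in for a polygon: a rectangle with minX < maxX, minY < maxY.
struct Rect { double minX, minY, maxX, maxY; };

// Classify a point against the rectangle.
Location locate(const Rect& r, double x, double y) {
    if (x < r.minX || x > r.maxX || y < r.minY || y > r.maxY) {
        return Location::Exterior;
    }
    if (x == r.minX || x == r.maxX || y == r.minY || y == r.maxY) {
        return Location::Boundary;
    }
    return Location::Interior;
}

// Intersects: the point is anywhere on the rectangle, boundary included.
bool intersectsXY(const Rect& r, double x, double y) {
    return locate(r, x, y) != Location::Exterior;
}

// Disjoint is simply the negation of Intersects.
bool disjointXY(const Rect& r, double x, double y) {
    return !intersectsXY(r, x, y);
}

// Touches: for a single point, this means the point lies on the boundary.
bool touchesXY(const Rect& r, double x, double y) {
    return locate(r, x, y) == Location::Boundary;
}
```

For a single point, every predicate reduces to a question about one of three locations, which is why a small set of XY functions is enough to cover the rest.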

dbaston · 2022-09-09T19:48:29Z

I removed GEOSPreparedCoversXY and GEOSPreparedDisjointXY.

jorisvandenbossche · 2022-09-24T09:34:28Z

This is a nice addition! (working on exposing this in shapely)

I am wondering one thing: since this is restricted to checking against a Point, can there ever be a difference between Contains and ContainsProperly?

Contains allows common points on the boundary but requires at least one point in the interior, while ContainsProperly disallows common boundary points (only intersecting the interior).
But since in this case there is only a single point, also for Contains, this point needs to be in the interior, and thus this is equivalent to ContainsProperly?

dr-jts · 2022-09-24T14:11:27Z

> I am wondering one thing: since this is restricted to checking against a Point, can there ever be a difference between Contains and ContainsProperly?

Great point (so to speak). You are right, there is no difference between Contains and ContainsProperly for single points. I suggest the GEOSPreparedContainsProperlyXY function be removed (and some doc pointing out this equivalence be added to GEOSPreparedContainsXY).
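The equivalence can be checked by writing both predicates over a set of point locations. In this sketch, which is an illustration and not GEOS code, each test point is assumed to be already classified as interior, boundary, or exterior relative to the polygon; `Location`, `contains`, and `containsProperly` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

enum class Location { Interior, Boundary, Exterior };

// Contains: no test point in the exterior, and at least one in the interior.
bool contains(const std::vector<Location>& points) {
    bool anyInterior = false;
    for (Location loc : points) {
        if (loc == Location::Exterior) {
            return false;
        }
        if (loc == Location::Interior) {
            anyInterior = true;
        }
    }
    return anyInterior;
}

// ContainsProperly: at least one test point, and every one in the interior.
bool containsProperly(const std::vector<Location>& points) {
    if (points.empty()) {
        return false;
    }
    for (Location loc : points) {
        if (loc != Location::Interior) {
            return false;
        }
    }
    return true;
}
```

With exactly one point the two functions agree for all three locations. They only diverge once there are several points, for example one in the interior and one on the boundary, where `contains` is true and `containsProperly` is false.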

References #677

dbaston requested a review from pramsey September 9, 2022 13:43

dbaston force-pushed the prepared-xy branch from c812592 to e8d2c40 Compare September 9, 2022 13:44

pramsey approved these changes Sep 9, 2022

dbaston force-pushed the prepared-xy branch from e8d2c40 to 30bb54c Compare September 9, 2022 17:13

brendan-ward reviewed Sep 9, 2022

benchmarks/capi/GEOSPreparedContainsPerfTest.cpp

brendan-ward reviewed Sep 9, 2022

benchmarks/capi/GEOSPreparedContainsPerfTest.cpp

dbaston force-pushed the prepared-xy branch from cba700e to 4cf202e Compare September 9, 2022 19:47

dbaston force-pushed the prepared-xy branch from 4cf202e to 5c7b590 Compare September 9, 2022 20:00

dr-jts added the Enhancement label Sep 9, 2022

dbaston added 5 commits September 12, 2022 11:30

Inline Geometry::geometryChangedAction

3ed1f13

Add Point::setXY

5d7dd84

CAPI: Add XY variants of several GEOSPrepared functions

8bc04db

Inline Point::getCoordinate

b98f10f

Inline Envelope::covers

b52b389

dbaston force-pushed the prepared-xy branch from 5c7b590 to b52b389 Compare September 12, 2022 15:30

dbaston merged commit 194aed1 into libgeos:main Sep 12, 2022

jorisvandenbossche mentioned this pull request Sep 24, 2022

ENH: faster contains_xy/intersects_xy predicates special casing for point coordinates shapely/shapely#1548

Merged

dbaston added a commit that referenced this pull request Sep 24, 2022

Remove GEOSPreparedContainsProperlyXY

df8b93c

References #677

dr-jts changed the title ~~Add GEOSPreparedContainsXY to C API~~ Add GEOSPrepared_Op_XY functions to C API Oct 5, 2022

dr-jts changed the title ~~Add GEOSPrepared_Op_XY functions to C API~~ Add GEOSPrepared_PRED_XY functions to C API Oct 5, 2022

dr-jts changed the title ~~Add GEOSPrepared_PRED_XY functions to C API~~ Add GEOSPrepared<PRED>XY functions to C API Oct 5, 2022
