Create `GeoSeries.contains_properly` method using point_in_polygon. #749

thomcom · 2022-10-21T15:47:02Z

Closes #743
Closes #744

Description

This PR closes the issues above: it creates a .contains method and, more importantly, resolves the boundary-case inconsistency in point_in_polygon.
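To make the boundary case concrete, here is a minimal pure-Python sketch of "properly contains" for one point and one polygon ring. It is not the cuSpatial kernel: the names `on_segment` and `contains_properly`, the `eps` tolerance, and the even-odd ray-casting approach are assumptions chosen for illustration only.

```python
def on_segment(p, a, b, eps=1e-12):
    """Return True if point p lies on the closed segment a-b (hypothetical helper)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    if abs(cross) > eps:
        return False  # p is not colinear with the edge
    return (min(ax, bx) - eps <= px <= max(ax, bx) + eps
            and min(ay, by) - eps <= py <= max(ay, by) + eps)


def contains_properly(ring, p):
    """True only if p is strictly inside the ring; boundary points are excluded."""
    px, py = p
    inside = False
    n = len(ring)
    for i in range(n):
        a, b = ring[i], ring[(i + 1) % n]
        if on_segment(p, a, b):
            return False  # a point on an edge or vertex is not properly contained
        (ax, ay), (bx, by) = a, b
        # Even-odd rule: toggle for each edge crossed by a ray going +x from p.
        if (ay > py) != (by > py):
            x_cross = ax + (py - ay) * (bx - ax) / (by - ay)
            if x_cross > px:
                inside = not inside
    return inside


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]
print(contains_properly(square, (0.5, 0.5)))  # interior point
print(contains_properly(square, (1.0, 0.5)))  # point on an edge
print(contains_properly(square, (0.0, 0.0)))  # point on a vertex
print(contains_properly(square, (2.0, 0.5)))  # exterior point
```

Only the interior point is reported as contained; the edge and vertex cases are the ones a plain crossing test can classify inconsistently.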

~~As it stands the colinearity test I've added to is_point_in_polygon doubles the runtime of brute-force point_in_polygon and has no visible effect on the runtime of quadtree_point_in_polygon.~~

~~- Note I need to double check the above benchmark, having set this project down for the last few weeks.~~

This depends on #750; please do not review the C++ code here until that PR is merged. Please do review the Python code.

Benchmark

Benchmark results are in: there is no measurable speed difference between 22.12 (pre-boundary exclusion) and our current implementation:

(rapids) rapids@compose:~/cuspatial/python/cuspatial/benchmarks$ pytest api/bench_api.py::bench_point_in_polygon
================================================== test session starts ===================================================
platform linux -- Python 3.8.15, pytest-7.2.0, pluggy-1.0.0
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/tcomer/mnt/NVIDIA/rapids-docker/cuspatial/python/cuspatial/benchmarks, configfile: pytest.ini
plugins: cov-4.0.0, benchmark-4.0.0, cases-3.6.13, xdist-3.0.2, anyio-3.6.2, hypothesis-6.58.1
collected 1 item                                                                                                         

api/bench_api.py .                                                                                                 [100%]


---------------------------------------------- benchmark: 1 tests ---------------------------------------------
Name (time in s)              Min     Max    Mean  StdDev  Median     IQR  Outliers     OPS  Rounds  Iterations
---------------------------------------------------------------------------------------------------------------
bench_point_in_polygon     1.9636  1.9749  1.9678  0.0043  1.9660  0.0045       1;0  0.5082       5           1
---------------------------------------------------------------------------------------------------------------

Legend:
  Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
  OPS: Operations Per Second, computed as 1 / Mean
=================================================== 1 passed in 16.28s ===================================================
(rapids) rapids@compose:~/cuspatial/python/cuspatial/benchmarks$ git status
On branch feature/GeoSeries.contains

vs branch-22.12

(rapids) rapids@compose:~/cuspatial/python/cuspatial/benchmarks$ pytest api/bench_api.py::bench_point_in_polygon
================================== test session starts ===================================
platform linux -- Python 3.8.15, pytest-7.2.0, pluggy-1.0.0
benchmark: 4.0.0 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/tcomer/mnt/NVIDIA/rapids-docker/cuspatial/python/cuspatial/benchmarks, configfile: pytest.ini
plugins: cov-4.0.0, benchmark-4.0.0, cases-3.6.13, xdist-3.0.2, anyio-3.6.2, hypothesis-6.58.1
collected 1 item                                                                         

api/bench_api.py .                                                                 [100%]


---------------------------------------------- benchmark: 1 tests ---------------------------------------------
Name (time in s)              Min     Max    Mean  StdDev  Median     IQR  Outliers     OPS  Rounds  Iterations
---------------------------------------------------------------------------------------------------------------
bench_point_in_polygon     1.9516  1.9843  1.9730  0.0126  1.9760  0.0127       1;0  0.5068       5           1
---------------------------------------------------------------------------------------------------------------

Legend:
  Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
  OPS: Operations Per Second, computed as 1 / Mean
=================================== 1 passed in 16.61s ===================================
(rapids) rapids@compose:~/cuspatial/python/cuspatial/benchmarks$ git status
On branch benchmark/branch-22.12
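As a quick check on the "no measurable difference" reading, the sketch below compares the two reported means. The numbers are copied from the tables above; the variable names are illustrative, and comparing the gap to one standard deviation is a rough heuristic, not a formal significance test.

```python
# Mean and standard deviation (seconds) copied from the two benchmark tables.
feature_mean, feature_std = 1.9678, 0.0043    # feature/GeoSeries.contains
baseline_mean, baseline_std = 1.9730, 0.0126  # benchmark/branch-22.12

diff = baseline_mean - feature_mean
rel_diff = diff / baseline_mean

print(f"absolute difference: {diff:.4f} s")
print(f"relative difference: {rel_diff:.2%}")
# The gap is smaller than the baseline's own run-to-run spread.
print(diff < baseline_std)
```

The means differ by roughly a quarter of a percent, which is less than the standard deviation of the branch-22.12 run.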

Still adding:

Detailed description of xfail result.
Self-review existing .contains implementation in python.
Update .contains docs when necessary.
Benchmark again and document here.
Move binops_with_quadtree.py to next branch.
.contains Examples

Checklist

I am familiar with the Contributing Guidelines.
New or existing tests cover these changes.
The documentation is up to date with these changes.

isVoid

Just some more style requests, with an open question at the end. Great work!

python/cuspatial/cuspatial/core/geoseries.py

python/cuspatial/cuspatial/core/binops/contains.py

python/cuspatial/cuspatial/core/geoseries.py

python/cuspatial/cuspatial/tests/test_contains.py

harrism · 2022-11-30T07:22:50Z

python/cuspatial/cuspatial/tests/test_contains.py

+    expected = gpdlhs.contains(gpdrhs).values
+    assert (got == expected).all()
+    got = rhs.contains_properly(lhs).values_host
+    expected = gpdrhs.contains(gpdlhs).values


Hmmm, shouldn't you be using shapely.contains_properly() for the expected result as we discussed in the meeting?

The interfaces are not the same. shapely.contains_properly(x, y) is a function that takes two Shapely geometries and returns True or False, whereas .contains is a GeoSeries method that operates on self and other. Refactoring these tests to use shapely only is not comparing apples to oranges.

Naively you can use a for-loop...

    expected = pd.Series(dtype=bool)
    for lhs, rhs in zip(gpdlhs, gpdrhs):
        expected = pd.concat([expected, pd.Series([shapely.contains_properly(lhs, rhs)])])

But comparing cuspatial.contains_properly to geopandas.contains is comparing apples to oranges.

harrism · 2022-11-30T07:24:58Z

BTW, does this support multipoint in polygon?

thomcom · 2022-11-30T19:53:02Z

> BTW, does this support multipoint in polygon?

Yes, there are tests for it.

harrism · 2022-11-30T20:33:11Z

@thomcom looks like you may have accidentally deleted all the tests (test_contains.py) in 4a651f1?

harrism

Just a few other tiny things.

harrism · 2022-11-30T20:26:36Z

python/cuspatial/cuspatial/core/binops/contains.py

+    are properly contained within the corresponding polygon. Polygon A contains Point B 
+    properly if B intersects the interior of A but not the boundary (or exterior). 
+
+    Note that polygons must be closed: the first and last vertex of each


Should this be in a Note section as well?
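The closure requirement in the docstring excerpt above can be illustrated with a small helper. This is a hypothetical sketch, not part of cuSpatial: `close_ring` is an invented name, and it assumes a ring is given as a list of `(x, y)` tuples.

```python
def close_ring(ring):
    """Return the ring with its first vertex repeated at the end if needed."""
    ring = list(ring)
    if ring and ring[0] != ring[-1]:
        ring.append(ring[0])  # repeat the first vertex to close the ring
    return ring


open_ring = [(0, 0), (1, 0), (1, 1), (0, 1)]
closed = close_ring(open_ring)
print(closed[0] == closed[-1])
print(len(closed))
```

Calling `close_ring` on an already closed ring returns it unchanged, so it is safe to apply unconditionally before building polygon input.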

python/cuspatial/cuspatial/core/binops/contains.py

python/cuspatial/cuspatial/core/geoseries.py

thomcom · 2022-11-30T22:13:27Z

@gpucibot merge

thomcom added 8 commits October 17, 2022 16:27

Add tests to verify border-exclusion and modify point_in_polygon.cuh …

758c66b

…to produce the correct results.

Merge branch 'feature/pip_boundary_exclusion_test' into feature/GeoSe…

731987b

…ries.contains

Trying to write contains tests, having difficulties.

76c8b1d

Pass all clockwise polygon tests.

beacf76

Test of boundary exclusion.

67d6153

Create is_point_colinear_with_polygon method. Get colinearity working.

8a19829

Merge branch 'feature/GeoSeries.contains' into test-contains

d2f93ed

Tweak one test because it shares a point with the offending polygon.

7cce26b

github-actions bot added the libcuspatial (Relates to the cuSpatial C++ library) and Python (Related to Python code) labels Oct 21, 2022

thomcom added 3 commits October 21, 2022 11:24

Modify colinearity test to early terminate.

11be135

Create a point_in_polygon_one_to_one for better correspondence with c…

5aa9c09

…ontains and performance.

Wrestling with pre-commit.

a9c1806

github-actions bot added the cmake (Related to CMake code or build configuration) label Oct 21, 2022

thomcom added 16 commits October 21, 2022 16:01

Rename to pairwise.

1b4f850

Need to include the proper file.

cba43a4

Move shared is_point_in_polygon to its own file.

fbfd0b7

Remove unneeded includes.

a0cfbe2

Fix the tests that should no longer pass.

046e4a1

Fix the tests that should no longer pass.

c87e741

Now we allow open or closed polygons again.

7f1b632

Write tests for pairwise point in polygon.

cd205b2

Create new branch with pairwise_point_in_polygon cpp changes.

121e146

Clean up docs on cursory review.

756282d

Use T as TypeParam

f5f7777

Merge

ebb266e

Writing tests and implementation for polygon/point contains.

e784307

Merge branch 'branch-22.12' into feature/GeoSeries.contains

2eb4b9d

Create polygon and multipolygon generator.

f53acec

Refactor and create exhaustive polygon contains tests.

390890d

thomcom added 4 commits November 29, 2022 17:28

Handle more review comments.

cd6bd66

Merge branch 'feature/GeoSeries.contains' of github.com:thomcom/cuspa…

141b86c

…tial into feature/GeoSeries.contains

Comment

d72df1c

Fix alignment.

d08ad01

isVoid approved these changes Nov 30, 2022

harrism requested changes Nov 30, 2022

This was referenced Nov 30, 2022

GeoSeries.contains that matches GeoPandas with the exception of cases that depend on intersection. #770

Closed

Create cuspatial.GeoSeries objects directly from Shapely arrays. #830

Closed

thomcom and others added 7 commits November 30, 2022 08:38

Fix example layout again.

cb1b149

Update python/cuspatial/cuspatial/core/geoseries.py

8db29b6

Co-authored-by: Michael Wang <isVoid@users.noreply.github.com>

Update python/cuspatial/cuspatial/core/geoseries.py

f31b70e

Co-authored-by: Michael Wang <isVoid@users.noreply.github.com>

Update python/cuspatial/cuspatial/core/geoseries.py

80f5ea1

Co-authored-by: Michael Wang <isVoid@users.noreply.github.com>

Apply suggestions from code review

4a594b9

Co-authored-by: Mark Harris <mharris@nvidia.com>

Get rid of GeoPandas round trip in .contains_properly example and fix…

4a651f1

… docs once more.

Merge branch 'feature/GeoSeries.contains' of github.com:thomcom/cuspa…

d401c58

…tial into feature/GeoSeries.contains

thomcom changed the title ~~Create GeoSeries.contains method using point_in_polygon.~~ Create GeoSeries.contains_properly method using point_in_polygon. Nov 30, 2022

Style issue, black is hopefully not breaking flake8 in CI.

1c4f1bc

thomcom added the improvement (Improvement / enhancement to an existing function) and non-breaking (Non-breaking change) labels Nov 30, 2022

Add __init__.py to binops hoping to resolve CI issue.

d007515

Need test_contains_properly.py

91c09af

harrism reviewed Nov 30, 2022

thomcom and others added 2 commits November 30, 2022 15:29

Remove xfails.

0d32a64

Update python/cuspatial/cuspatial/core/geoseries.py

70822a8

Co-authored-by: Mark Harris <mharris@nvidia.com>

harrism approved these changes Nov 30, 2022

rapids-bot bot merged commit 4ca88ff into rapidsai:branch-22.12 Nov 30, 2022
