polygonize: keep diagonal-notch hole in dask 8-conn merge (#2606) #2633
Merged
The dask connectivity=8 cross-chunk merge filled in the diagonal notch where two same-value regions meet only at a corner across a chunk boundary, so the merged polygon covered one extra cell. Total polygon area came out larger than the raster. _merge_polygon_rings traced the notch as a separate negative (hole) ring, but _group_rings_into_polygons dropped that hole: it tested containment using the hole's first vertex, which sits on the exterior boundary at the pinch point, so _point_in_ring returned False. Add _ring_interior_point() and use a strictly-interior point of the hole for the containment test. numpy and dask 8-conn now report equal per-value areas; 4-conn was already correct. Covers numpy, dask+numpy and dask+cupy.
brendancol
commented
May 29, 2026
Contributor
Author
brendancol
left a comment
PR Review: polygonize: keep diagonal-notch hole in dask 8-conn merge (#2606)
Blockers
None.
Suggestions
None.
Nits
- xrspatial/polygonize.py:1503: _ring_interior_point's centroid fallback divides by n. Every ring that reaches _group_rings_into_polygons is closed with at least 3 unique vertices, so n >= 3 and the divide is safe. A one-line note that the fallback assumes a non-empty ring would be nice, but it's not required.
What looks good
- Root cause is pinned correctly. The hole got dropped because containment was tested with the hole's first vertex, which lands on the exterior at the pinch point. Switching to a strictly-interior point is the right fix and the minimal one.
- _ring_interior_point handles non-convex notch rings, not just convex ones, by nudging inward along the orientation-aware edge normal. I checked it on an L-shaped ring and on the CW hole ring.
- The eps=1e-6 inward step is safe. _group_rings_into_polygons runs on integer pixel coordinates (the transform is applied later, in _merge_chunk_polygons/_merge_from_separated), so the offset stays well below the unit grid spacing.
- Area parity now holds for numpy, dask+numpy, and dask+cupy. The 367 cases where dask geometry is "invalid" per shapely match numpy exactly. 8-connectivity self-touching polygons are documented and expected, and the #2172 figure-8 tests still pass.
- Tests cover the reproducer, 20 random rasters with ragged chunks, and the dask+cupy path.
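For readers who want to see the inward-nudge idea from the bullets above in isolation, here is a minimal sketch. It is not the code in xrspatial/polygonize.py: the names signed_area, point_in_ring and interior_point, the open-ring representation, and the even-odd ray cast are all assumptions made for illustration. It steps eps off an edge midpoint along the normal that points inward for the ring's orientation, and falls back to the vertex centroid, which assumes a non-empty ring.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Ring = List[Point]  # open ring: the closing vertex is not repeated


def signed_area(ring: Ring) -> float:
    """Shoelace formula; the sign encodes the ring's orientation."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def point_in_ring(p: Point, ring: Ring) -> bool:
    """Even-odd ray cast. Points exactly on the boundary are ambiguous."""
    px, py = p
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def interior_point(ring: Ring, eps: float = 1e-6) -> Point:
    """Hypothetical helper: return a point strictly inside `ring`."""
    n = len(ring)  # the centroid fallback below assumes n >= 3
    positive = signed_area(ring) > 0
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        dx, dy = x2 - x1, y2 - y1
        length = (dx * dx + dy * dy) ** 0.5
        if length == 0:
            continue  # skip degenerate edges
        # (-dy, dx) points inward when the signed area is positive;
        # flip it for rings of the opposite orientation (e.g. holes).
        if positive:
            nx, ny = -dy / length, dx / length
        else:
            nx, ny = dy / length, -dx / length
        cand = ((x1 + x2) / 2 + eps * nx, (y1 + y2) / 2 + eps * ny)
        if point_in_ring(cand, ring):
            return cand
    # Fallback: vertex centroid (may land on the boundary for odd shapes).
    return (sum(x for x, _ in ring) / n, sum(y for _, y in ring) / n)


# Non-convex L-shaped ring and a unit hole ring of opposite orientation.
L_RING: Ring = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
HOLE_RING: Ring = [(1, 1), (1, 2), (2, 2), (2, 1)]
```

On integer pixel coordinates an eps of 1e-6 is far smaller than the unit grid spacing, so the nudged point cannot cross into a neighbouring cell. The vertex centroid of L_RING is (1, 1), which is one of its own vertices, so a centroid alone would not be a safe test point for non-convex rings.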
Checklist
- Algorithm matches the numpy reference: yes (area parity)
- All implemented backends consistent: yes (numpy, dask+numpy, dask+cupy)
- NaN handling: unaffected by this change
- Edge cases covered: random ragged chunks, reproducer
- Dask chunk boundaries: this is the fix
- No premature materialization introduced
- Benchmark: not needed (bug fix, no new function)
- README matrix: not applicable
- Docstrings present: yes, _ring_interior_point documented
brendancol
commented
May 29, 2026
Contributor
Author
brendancol
left a comment
Follow-up: addressed the one nit from the previous review (commit 23aef0c) — added a comment noting the centroid fallback assumes a non-empty ring (n >= 3). No code-path change; the 23 regression tests still pass. No remaining findings.
Closes #2606
The dask connectivity=8 cross-chunk merge filled in the diagonal notch where two same-value regions meet only at a corner across a chunk boundary. The merged polygon then covered one extra cell, so total polygon area came out larger than the raster.
_merge_polygon_rings traced the notch as a separate negative (hole) ring, but _group_rings_into_polygons dropped it: the containment test used the hole's first vertex, which sits on the exterior boundary at the pinch point, so _point_in_ring returned False.
Added _ring_interior_point() and used a strictly-interior point of the hole for the containment test. Works for the non-convex notch rings the merge produces.
After the fix, numpy and dask 8-conn report equal per-value areas. 4-conn was already correct.
Backend coverage: numpy (unaffected), dask+numpy (fixed), dask+cupy (fixed). The shared merge path is CPU-side, so the cupy and dask+cupy routes go through the same code.
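The dropped-hole mechanism described above can be reproduced on a toy case without any chunking. The sketch below is an illustration only: the 3x3 raster, the hand-written rings, and the helpers shoelace_area, on_segment and strictly_inside are made up for this example and are not the xrspatial code. It assumes a containment test that reports boundary points as outside, which matches the False result the description attributes to _point_in_ring at the pinch point.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Ring = List[Point]  # open ring: the closing vertex is not repeated

# Toy raster, value 1 polygonized with connectivity=8 (x = column, y = row).
# Cells (row 0, col 1) and (row 1, col 2) touch only at the corner (2, 1).
RASTER = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
]
EXTERIOR: Ring = [(0, 0), (2, 0), (2, 1), (3, 1), (3, 3), (0, 3)]
HOLE: Ring = [(2, 1), (2, 2), (1, 2), (1, 1)]  # first vertex is the pinch point


def shoelace_area(ring: Ring) -> float:
    """Unsigned ring area via the shoelace formula."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def on_segment(p: Point, a: Point, b: Point) -> bool:
    """True if p lies on the closed segment a-b (exact for grid coordinates)."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    if cross != 0:
        return False
    return (min(a[0], b[0]) <= p[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= p[1] <= max(a[1], b[1]))


def strictly_inside(p: Point, ring: Ring) -> bool:
    """Even-odd ray cast that treats boundary points as outside (assumption)."""
    n = len(ring)
    if any(on_segment(p, ring[i], ring[(i + 1) % n]) for i in range(n)):
        return False
    px, py = p
    inside = False
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > py) != (y2 > py):
            if px < x1 + (py - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside


cells = sum(row.count(1) for row in RASTER)
first_vertex_ok = strictly_inside(HOLE[0], EXTERIOR)      # pinch point
interior_ok = strictly_inside((1.5, 1.5), EXTERIOR)       # centre of the hole cell
area_without_hole = shoelace_area(EXTERIOR)
area_with_hole = shoelace_area(EXTERIOR) - shoelace_area(HOLE)
```

With the hole attached the polygon area equals the number of value-1 cells; without it the area is one cell too large, which is the symptom reported in #2606.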
Test plan:
- test_polygonize_issue_2606.py: reproducer total-area and per-value-area parity, 20 random rasters/chunkings, dask+cupy parity
- test_polygonize* suite green (291 passed)