Skip to content

polygonize: keep diagonal-notch hole in dask 8-conn merge (#2606)#2633

Merged
brendancol merged 3 commits into
mainfrom
deep-sweep-accuracy-polygonize-2026-05-29-01
May 29, 2026
Merged

polygonize: keep diagonal-notch hole in dask 8-conn merge (#2606)#2633
brendancol merged 3 commits into
mainfrom
deep-sweep-accuracy-polygonize-2026-05-29-01

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Closes #2606

The dask connectivity=8 cross-chunk merge filled in the diagonal notch where two same-value regions meet only at a corner across a chunk boundary. The merged polygon then covered one extra cell, so total polygon area came out larger than the raster.

  • _merge_polygon_rings traced the notch as a separate negative (hole) ring, but _group_rings_into_polygons dropped it: the containment test used the hole's first vertex, which sits on the exterior boundary at the pinch point, so _point_in_ring returned False.
  • Added _ring_interior_point() and used a strictly-interior point of the hole for the containment test. Works for the non-convex notch rings the merge produces.

After the fix, numpy and dask 8-conn report equal per-value areas. 4-conn was already correct.

Backend coverage: numpy (unaffected), dask+numpy (fixed), dask+cupy (fixed). The shared merge path is CPU-side, so the cupy and dask+cupy routes go through the same code.

Test plan:

  • test_polygonize_issue_2606.py: reproducer total-area and per-value-area parity, 20 random rasters/chunkings, dask+cupy parity
  • Full test_polygonize* suite green (291 passed)
  • cupy-marked polygonize tests green on a CUDA host (843 passed)
  • 0 area mismatches across 400 random integer rasters for both connectivities (was nonzero for 8-conn before)

The dask connectivity=8 cross-chunk merge filled in the diagonal notch
where two same-value regions meet only at a corner across a chunk
boundary, so the merged polygon covered one extra cell.  Total polygon
area came out larger than the raster.

_merge_polygon_rings traced the notch as a separate negative (hole)
ring, but _group_rings_into_polygons dropped that hole: it tested
containment using the hole's first vertex, which sits on the exterior
boundary at the pinch point, so _point_in_ring returned False.

Add _ring_interior_point() and use a strictly-interior point of the
hole for the containment test.  numpy and dask 8-conn now report equal
per-value areas; 4-conn was already correct.  Covers numpy, dask+numpy
and dask+cupy.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 29, 2026
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: polygonize: keep diagonal-notch hole in dask 8-conn merge (#2606)

Blockers

None.

Suggestions

None.

Nits

  • xrspatial/polygonize.py:1503 — _ring_interior_point's centroid fallback divides by n. Every ring that reaches _group_rings_into_polygons is closed with at least 3 unique vertices, so n >= 3 and the divide is safe. A one-line note that the fallback assumes a non-empty ring would be nice, but it's not required.

What looks good

  • Root cause is pinned correctly. The hole got dropped because containment was tested with the hole's first vertex, which lands on the exterior at the pinch point. Switching to a strictly-interior point is the right fix and the minimal one.
  • _ring_interior_point handles non-convex notch rings, not just convex ones, by nudging inward along the orientation-aware edge normal. I checked it on an L-shaped ring and on the CW hole ring.
  • The eps=1e-6 inward step is safe. _group_rings_into_polygons runs on integer pixel coordinates (the transform is applied later, in _merge_chunk_polygons / _merge_from_separated), so the offset stays well below the unit grid spacing.
  • Area parity now holds for numpy, dask+numpy, and dask+cupy. The 367 cases where dask geometry is "invalid" per shapely match numpy exactly. 8-connectivity self-touching polygons are documented and expected, and the #2172 figure-8 tests still pass.
  • Tests cover the reproducer, 20 random rasters with ragged chunks, and the dask+cupy path.

Checklist

  • Algorithm matches the numpy reference: yes (area parity)
  • All implemented backends consistent: yes (numpy, dask+numpy, dask+cupy)
  • NaN handling: unaffected by this change
  • Edge cases covered: random ragged chunks, reproducer
  • Dask chunk boundaries: this is the fix
  • No premature materialization introduced
  • Benchmark: not needed (bug fix, no new function)
  • README matrix: not applicable
  • Docstrings present: yes, _ring_interior_point documented

Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up: addressed the one nit from the previous review (commit 23aef0c) — added a comment noting the centroid fallback assumes a non-empty ring (n >= 3). No code-path change; the 23 regression tests still pass. No remaining findings.

@brendancol brendancol merged commit a262594 into main May 29, 2026
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

polygonize dask connectivity=8 over-fills diagonal notches at chunk boundaries

1 participant