
test(rasterize): close eager-cupy merge-mode coverage gap (deep-sweep pass 3)#2511

Merged: brendancol merged 4 commits into main from deep-sweep-test-coverage-rasterize-2026-05-27 on May 28, 2026

Conversation

@brendancol
Contributor

Summary

Deep-sweep test-coverage pass 3 on xrspatial.rasterize. Adds 23 tests in test_rasterize_coverage_2026_05_27.py, all passing on a CUDA host. No source changes.

  • Cat 1 HIGH -- eager cupy (use_cuda=True, no chunks=) had no parametrised six-mode merge parity test against eager numpy. TestDaskNumpy.test_merge_mode_parity and TestDaskCupy.test_merge_mode_parity carry it for the tiled paths, and pass 1 only pinned merge='last' on a single non-overlapping polygon. A routing regression in any of the six GPU atomic kernels (_ensure_gpu_kernels, rasterize.py:1308-1556) would slip past the dask+cupy parity tests because the tiled finalize path is different. Pin polygon and point overlap scenes across last / first / max / min / sum / count, with sanity checks (first != last, min < max) that lock the fixture as non-degenerate.
  • Cat 1 MEDIUM -- _run_cupy with zero geometries had no direct test (eager numpy and both dask backends did). Pin that use_cuda=True with [] keeps the cupy backend instead of short-circuiting to numpy, under both an explicit fill and the default NaN fill.
  • Cat 2 MEDIUM -- merge='count' with all-equal property values: four overlapping rectangles burning the same 1.0 must still count overlaps as > 1 on all four backends. A future GPU atomic optimisation that deduplicated identical-value writes would silently collapse each count to the number of unique values (here, 1).
  • Cat 4 MEDIUM -- name= kwarg thread-through on dask+numpy / eager cupy / dask+cupy (eager numpy was the only path covered).

State CSV updated: rasterize row -> last_inspected=2026-05-27, severity_max=HIGH, categories_found=1;2;4.

Test plan

  • pytest xrspatial/tests/test_rasterize_coverage_2026_05_27.py -- 23 passed locally on CUDA host
  • pytest xrspatial/tests/test_rasterize*.py -- 325 passed, 2 skipped (no regressions in passes 1+2 or the main test file)
  • CI green on numpy-only and dask-only runners (no GPU-marked test should execute there; all of them must be skipped)

… pass 3)

Pass 3 of the test-coverage sweep on rasterize.  Adds 23 tests in
test_rasterize_coverage_2026_05_27.py, all passing on a CUDA host.
No source changes.

Cat 1 HIGH -- eager cupy backend (use_cuda=True with no chunks=) had
no parametrised merge-mode parity test.  Pass 1 only pinned the
default merge='last' on a single non-overlapping polygon
(TestCuPy.test_cupy_matches_numpy), and pass 2's Inf-burn tests
covered sum/min/max indirectly on a narrow fixture.  TestDaskNumpy
and TestDaskCupy carry a six-mode parametrised parity test against
eager numpy (last/first/max/min/sum/count) but their eager-cupy
twin was missing.  A routing regression that wired one of the six
GPU atomic kernels in _ensure_gpu_kernels (rasterize.py:1308-1556)
to the wrong opcode would slip past the dask+cupy tiled-finalize
tests because _run_dask_cupy always exercises a different finalize
path than _run_cupy on its own.  Add the missing parametrised
parity test on a three-way overlapping polygon scene and a
three-way overlapping point scene, with sanity-check companions
(first != last, min < max) that lock the fixture as non-degenerate.
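
To illustrate why the sanity-check companions matter, here is a minimal pure-Python sketch of the six merge modes applied to the values that land on one pixel. The helper `merge_values` and the three overlap values are hypothetical and are not taken from rasterize.py; the point is only that a fixture whose overlapping values are all equal cannot tell several modes apart.

```python
MODES = ("last", "first", "max", "min", "sum", "count")


def merge_values(values, mode):
    """Reduce the values written to one pixel, given in write order.

    Hypothetical reference reducer for illustration only.
    """
    if mode == "last":
        return values[-1]
    if mode == "first":
        return values[0]
    if mode == "max":
        return max(values)
    if mode == "min":
        return min(values)
    if mode == "sum":
        return sum(values)
    if mode == "count":
        return len(values)
    raise ValueError(f"unknown merge mode: {mode!r}")


# Three-way overlap with distinct values (assumed fixture values).
overlap = [2.0, 5.0, 3.0]
results = {mode: merge_values(overlap, mode) for mode in MODES}

# A degenerate fixture: every geometry burns the same value.
degenerate = {mode: merge_values([1.0, 1.0, 1.0], mode) for mode in MODES}

# Sanity checks in the spirit of the ones described above.
assert results["first"] != results["last"]
assert results["min"] < results["max"]
```

On the degenerate fixture, last, first, max, and min all return 1.0, so a kernel wired to the wrong one of those four would still pass a parity comparison.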

Cat 1 MEDIUM -- empty geometry list on the eager cupy backend.
test_rasterize.TestListInput covers eager numpy and the dask paths
cover dask+numpy / dask+cupy, but _run_cupy with zero geometries
(zero-sized cupy bbox / edge / segment buffers feeding
_gpu_init_buffers and _gpu_finalize_buffers) had no direct test.
Pin that use_cuda=True with [] still returns a cupy.ndarray (not a
numpy short-circuit) under both an explicit fill value and the
default NaN fill.

Cat 2 MEDIUM -- all-equal property values under merge='count'.  Four
overlapping rectangles all burning 1.0 must still count overlaps as
> 1.  A future GPU atomic optimisation that deduplicated
identical-value writes would silently collapse each count to the
number of unique values (here, 1), breaking density rasters.  Pin the
contract across all four backends.
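
The failure mode guarded against here can be sketched in a few lines of pure Python. Both functions below are hypothetical stand-ins, not the GPU kernels: `count_burn` counts every write, while `count_burn_dedup` models an optimisation that collapses identical (pixel, value) writes before counting.

```python
from collections import Counter


def count_burn(writes):
    """Count every write per pixel, regardless of the value written."""
    counts = Counter()
    for pixel, _value in writes:
        counts[pixel] += 1
    return counts


def count_burn_dedup(writes):
    """Hypothetical broken variant: identical (pixel, value) writes collapse."""
    counts = Counter()
    for pixel, _value in set(writes):
        counts[pixel] += 1
    return counts


# Four geometries burn 1.0 into pixel (5, 5); a fifth burns 1.0 into (0, 0).
writes = [((5, 5), 1.0)] * 4 + [((0, 0), 1.0)]

good = count_burn(writes)
bad = count_burn_dedup(writes)
```

With all-equal values the deduplicating variant reports 1 at the overlap pixel instead of 4, which is the regression a `> 1` assertion catches.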

Cat 4 MEDIUM -- name= kwarg thread-through.
test_rasterize.TestBasic.test_output_name only covers eager numpy;
each non-default backend constructs its own output DataArray in a
separate code path (_run_dask_numpy, _run_cupy, _run_dask_cupy) and
a regression dropping name= on any of those would not surface from
the existing eager test.  Pin name= on the three other backends.

State: .claude/sweep-test-coverage-state.csv -- rasterize row
updated to last_inspected=2026-05-27, severity_max=HIGH,
categories_found=1;2;4, with pass-3 notes appended.
@github-actions (bot) added the performance label (PR touches performance-sensitive code) on May 27, 2026
The pass-3 coverage file referenced ``box`` at module scope in three
places: two list constants (_EAGER_CUPY_MERGE_PAIRS, _ALL_EQUAL_PAIRS)
and one class attribute (TestNameKwargBackends._SIMPLE_PAIR).  Each one
fires at module import, before pytestmark can apply skip_no_shapely.
On a shapely-less environment (the ``[vector]`` extra opt-out path
introduced in #2496) collection failed with NameError before any test
could be skipped.

Wrap the two list constants in ``if has_shapely:`` and convert the
class attribute into a ``_simple_pair()`` method so ``box`` is only
referenced when the test body actually runs.  Verified by simulating
ImportError on shapely during pytest --collect-only: 23 tests collect
cleanly without shapely, all 23 still pass on a CUDA host with shapely
present.
@brendancol
Contributor Author

One blocker, two nits. Fix in 99b9b3e.

Blocker -- module-level box(...) crashes import without shapely

_EAGER_CUPY_MERGE_PAIRS, _ALL_EQUAL_PAIRS, and TestNameKwargBackends._SIMPLE_PAIR all called box(...) at module scope. Since #2496 made shapely optional, a numpy-only runner hits NameError: name 'box' is not defined during collection, before pytestmark skip has a chance to fire.

Reproduced by injecting ImportError on import shapely.* and running pytest --collect-only: prior tip failed, HEAD collects all 23 tests cleanly.
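
One way to simulate a shapely-less environment is sketched below; this is an assumption about a workable technique, not necessarily the method used for this PR. Setting a `sys.modules` entry to `None` makes the import system raise ModuleNotFoundError (a subclass of ImportError) for that name. The helper names `shapely_hidden` and `can_import_box` are hypothetical.

```python
import sys
from contextlib import contextmanager


@contextmanager
def shapely_hidden():
    """Temporarily make ``import shapely`` fail, then restore sys.modules."""
    names = [n for n in list(sys.modules)
             if n == "shapely" or n.startswith("shapely.")]
    saved = {n: sys.modules[n] for n in names}
    for n in names:
        del sys.modules[n]
    # A None entry tells the import system to raise ModuleNotFoundError.
    sys.modules["shapely"] = None
    try:
        yield
    finally:
        del sys.modules["shapely"]
        sys.modules.update(saved)


def can_import_box():
    """Return True if ``box`` is importable in the current state."""
    try:
        from shapely.geometry import box  # noqa: F401
    except ImportError:
        return False
    return True


with shapely_hidden():
    hidden_result = can_import_box()
```

Inside the context manager `can_import_box()` returns False whether or not shapely is installed, so a module that touches `box` at import time would fail there in the same way it would on a numpy-only runner.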

Fix: wrap the two list constants in if has_shapely: and turn the class attribute into a _simple_pair() method so box is only touched from test bodies.

Nits (not pushed)

  1. test_eager_cupy_first_differs_from_last and test_eager_cupy_min_differs_from_max sit under @skip_no_cuda but only call the numpy backend. The non-degeneracy property they pin doesn't depend on the backend, so they'd still be useful on numpy-only runners. Not worth re-shuffling.

  2. test_count_numpy asserts max >= 3.0; observed max is 4.0 (four-way overlap at world (5.5, 5.5)). == 4.0 would catch a regression that dropped one of the four writes, but the >= 3.0 bound is what the test was actually built to defend against (dedup-to-1), so leaving it.
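
The trade-off in nit 2 can be checked with a small pure-Python count raster. The rectangle coordinates and the pixel-centre sampling rule below are assumptions chosen so that four rectangles overlap at world (5.5, 5.5); they are not copied from the test file.

```python
def count_raster(rects, width=10, height=10):
    """Count how many rectangles cover each pixel centre (half-open bounds)."""
    grid = [[0] * width for _ in range(height)]
    for x0, y0, x1, y1 in rects:
        for row in range(height):
            for col in range(width):
                cx, cy = col + 0.5, row + 0.5  # pixel centre in world units
                if x0 <= cx < x1 and y0 <= cy < y1:
                    grid[row][col] += 1
    return grid


def grid_max(grid):
    return max(max(row) for row in grid)


# Four rectangles that all cover the pixel centred at (5.5, 5.5).
rects = [(0, 0, 6, 6), (5, 0, 10, 6), (0, 5, 6, 10), (5, 5, 10, 10)]

full_max = grid_max(count_raster(rects))         # all four writes land
dropped_max = grid_max(count_raster(rects[:3]))  # one write is lost
```

Here `full_max` is 4 and `dropped_max` is 3, so a `>= 3.0` bound still passes when one write is dropped, while `== 4.0` would fail; both bounds fail for a dedup-to-1 regression.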

Test status

  • new file: 23 passed (CUDA host)
  • new file + test_rasterize.py: 238 passed, 2 skipped
  • --collect-only with shapely suppressed: 23 collected, no NameError

The ort auto-merge of origin/main kept both the branch's Pass 3 rasterize
row and main's older Pass 2 rasterize row (and similarly retained a stale
Pass 1 polygonize row alongside main's Pass 2 update). The branch's Pass 3
notes already encode the full Pass 1+2+3 history, and main's Pass 2
polygonize row is the authoritative replacement for the earlier Pass 1.

Dropped the stale rows, restored alphabetical module ordering, and
verified only the rasterize Pass 3 entry differs from origin/main.
@brendancol brendancol merged commit 52e36fe into main May 28, 2026
6 of 7 checks passed