geotiff: route eager backends through shared finalization helper (#2179)#2207
Conversation
Wave 2 of #2162. Replace the duplicated inline finalization block in the four eager GeoTIFF read sites with _finalize_eager_read from #2177: - xrspatial/geotiff/__init__.py: eager numpy open_geotiff - xrspatial/geotiff/_backends/gpu.py: GPU CPU-fallback path - xrspatial/geotiff/_backends/gpu.py: GPU local main path - xrspatial/geotiff/_backends/gpu.py: GPU HTTP path Drops about 250 lines of repeated validate / populate-attrs / mask / cast / set_nodata_attrs / DataArray-build code that the helper now owns. The GPU local-eager path now slices the window/band before masking rather than masking the full IFD and slicing after. That brings attrs['nodata_pixels_present'] into line with the CPU eager path: the attr now reports sentinel presence within the read window. No GPU test pinned the old IFD-level semantics. Adds test_eager_finalization_parity_2162.py covering float sentinel, int in-range / out-of-range sentinels, mask_nodata=False, no declared sentinel, and explicit dtype= across the numpy and cupy backends.
brendancol
left a comment
There was a problem hiding this comment.
PR Review: geotiff: route eager backends through shared finalization helper (#2179)
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
-
Add a windowed-read parity case to the new test file. The GPU local-eager path now slices before masking (
gpu.py:822-833), which is the behaviour change called out in the PR body. None of the new parity tests passwindow=, so the new contract is not pinned in the parity matrix. Cover at least the "window contains no sentinel" vs "window contains sentinel" pair so a future refactor cannot silently re-route the mask through the full IFD. (xrspatial/geotiff/tests/test_eager_finalization_parity_2162.py) -
Add a MinIsWhite parity case. The local-eager
mask_sentinelresolution has three branches (_mw_mask_nodata,_cpu_fallback_geo._mask_nodata, rawnodata). The parity tests exercise the raw and CPU-fallback branches via regular tiled and stripped files, but the MinIsWhite branch is not in the new matrix.test_miniswhite_nodata_1809.pyandtest_gpu_miniswhite_band_first_2097.pycover it indirectly; pinning it in the parity file would close the gap on the specific code path this PR rewrites. (xrspatial/geotiff/tests/test_eager_finalization_parity_2162.py) -
Add a multi-band (3D) parity case. The helper's
arr.ndim == 3branch (_attrs.py:1333-1335) builds a(y, x, band)DataArray; existing multi-band tests exercise it, but the parity file does not. Worth a single test for the stripped multi-band path the GPU CPU-fallback site lands on. -
_apply_nodata_mask_gpu_with_presenceis now dead code. After this PR, the helper at_gpu_helpers.py:45has no remaining callers (verified by grep acrossxrspatial/). The three references ingpu.pyare only in comments now, and the helper's docstring still pitches it as the per-call optimization for the inline GPU mask. Cleanup is out of scope for this PR, but worth a follow-up issue to delete the helper and the_apply_nodata_mask_gpucompanion if it falls out of use too.
Nits (optional improvements)
-
_coords_unusedvariable name.gpu.py:832discards the coords from_gpu_apply_window_bandwitharr_gpu, _coords_unused = .... A leading underscore alone is the conventional unused-marker; the_unusedsuffix is redundant. Either_coordsor_would match the surrounding style. -
mask_sentinelflow in the GPU local-eager path.gpu.py:845-854resolvesmask_sentineleven whenmask_nodata=False. That is correct because the helper still uses the sentinel to scan fornodata_pixels_presentin the no-mask branch, but the original inline code only computed_gpu_mask_valueunder themask_nodata=Truegate. Worth a short inline comment so a future reader does not read "unconditional sentinel resolution" as a bug. -
Possible micro-perf regression on large GPU buffers.
_apply_eager_nodata_mask(_attrs.py:1162,_attrs.py:1184) usesarr[mask] = np.nanfancy indexing where the previous GPU helper usedcupy.putmask. The two are equivalent in output butcupy.putmaskavoids one intermediate allocation. Likely negligible against the rest of the pipeline, but worth flagging in case GPU profiling later shows it. Not blocking.
What looks good
- Net diff is solid: ~250 lines of duplicated finalization removed across two files.
- The slice-before-mask behavioural change on the GPU local-eager path is documented in both the PR body and the inline comment block at
gpu.py:822-831. The parity claim (CPU and GPU now agree onnodata_pixels_presentsemantics) is sound and validated by the windowed sanity check I ran outside the test suite. - All four call sites consistently land on
mask_sentinel = getattr(geo_info, '_mask_nodata', nodata)or its branched equivalent, so the four-way drift the issue body called out is gone. - Test file has a clean structure: shared
_LIFECYCLE_ATTRSconstant and_assert_lifecycle_attrs_matchhelper avoid the copy-paste-asserts pattern that would have grown with each parity case. - Existing test suite still passes (4584 passed, 45 skipped on no-CUDA, 1 xfailed, 1 pre-existing deselected flake).
Checklist
- Algorithm matches reference (helper from #2177 mirrors the inline blocks field-for-field; cupy duck-typing verified)
- All implemented backends produce consistent results (the parity tests assert this for numpy / cupy)
- NaN handling is correct (mask substitutes NaN; integer promotion gated on in-range integer sentinel)
- Edge cases are covered by tests (float, int in-range, int out-of-range, no sentinel, mask_nodata=False, explicit dtype)
- [-] Dask chunk boundaries handled correctly (n/a, sibling PR C #2178 handles dask backends)
- No premature materialization or unnecessary copies (helper uses in-place fancy indexing on the existing buffer)
- Benchmark exists or is not needed (pure refactor, no perf contract change)
- [-] README feature matrix updated (n/a, no API change)
- Docstrings present and accurate (helper docstring already documents the wave-2 / wave-3 contract; PR body covers behaviour change)
Follow-up commit applying the in-PR review findings: - Rename ``_coords_unused`` to ``_`` in the GPU local-eager path per the convention for discarded tuple elements; add a one-line comment so a future reader can see why the local coords are dropped (the helper rebuilds them). - Add a clarifying comment on the GPU local-eager ``mask_sentinel`` resolution: it intentionally runs even when ``mask_nodata=False`` because the helper still needs the sentinel for the ``nodata_pixels_present`` scan (#2135). - Extend the parity test file with three additional cases: windowed reads (pins the slice-before-mask behaviour), a MinIsWhite raster (exercises the ``_mw_mask_nodata`` branch), and a stripped multi-band file (exercises the 3-D output branch on the GPU CPU-fallback path). The ``_apply_nodata_mask_gpu_with_presence`` dead-code cleanup and the optional ``cupy.putmask`` perf optimisation are tracked in follow-up issue #2208 rather than bundled here.
The module docstring listed five cases; the file now has nine test functions after the windowed / MinIsWhite / multi-band / dtype= cases. Extend the bullet list so the docstring matches the test contents.
brendancol
left a comment
There was a problem hiding this comment.
PR Review (follow-up): geotiff: route eager backends through shared finalization helper (#2179)
Follow-up review on commit 850ab04 (and the doc-only fix in bbbeda0). The second commit addressed the in-PR findings from the initial review; the third commit fixed the one new finding this pass turned up. No remaining blockers, suggestions, or nits.
Blockers (must fix before merge)
None.
Suggestions (should fix, not blocking)
None remaining.
Nits (optional improvements)
None remaining.
Disposition of the initial review's findings (final)
- Add a windowed parity case -- FIXED in 850ab04 (
test_windowed_read_presence_matches_window_contents). - Add a MinIsWhite parity case -- FIXED in 850ab04 (
test_miniswhite_post_inversion_sentinel_parity). - Add a multi-band parity case -- FIXED in 850ab04 (
test_multiband_stripped_parity). _apply_nodata_mask_gpu_with_presencedead code -- DEFERRED to follow-up issue #2208._coords_unusedvariable name -- FIXED in 850ab04 (renamed to_with explanatory comment).mask_sentinelflow comment -- FIXED in 850ab04 (block comment atgpu.py:847-851).cupy.putmaskmicro-perf -- DEFERRED to follow-up issue #2208.- Stale parity test docstring -- FIXED in bbbeda0 (matrix list now matches the nine test functions).
What looks good
- The follow-up commits stayed in scope: each finding either landed in-PR or got a tracking issue.
- Test count went from 6 to 9; the three new cases cover the behaviour change (slice-before-mask on the GPU local-eager path), the MinIsWhite branch of the
mask_sentinelresolution, and the 3-D output branch. - The two
gpu.pycomments added in 850ab04 (the_discard rationale and the unconditional sentinel resolution explanation) head off two plausible future "why is this here?" follow-ups. - All nine parity tests pass; the broader GeoTIFF test suite still passes (4587 passed, 45 skipped on no-CUDA, 1 xfailed, 1 pre-existing deselected flake).
Checklist
- Algorithm matches reference
- All implemented backends produce consistent results
- NaN handling is correct
- Edge cases are covered by tests
- [-] Dask chunk boundaries handled correctly (n/a)
- No premature materialization or unnecessary copies
- Benchmark exists or is not needed
- [-] README feature matrix updated (n/a)
- Docstrings present and accurate
# Conflicts: # xrspatial/geotiff/_backends/gpu.py
…) (#2217) * geotiff: remove dead _apply_nodata_mask_gpu_with_presence helper (#2208) After #2207 routed all three GPU eager sites through _finalize_eager_read, the sibling helper _apply_nodata_mask_gpu_with_presence had zero remaining callers. Drop the helper and the four stale comment references that named it. _apply_nodata_mask_gpu stays put because the chunked GPU dask path still calls it from _chunk_task. * Address review nit: drop redundant hasattr test (#2208) The ImportError test already covers the import-time surface and implicitly the module-attribute absence. Keep one assertion path, not two.
Summary
Wave 2 of #2162 (PR D of the series). Replaces the duplicated eager
finalization block at four sites with
_finalize_eager_readfrom PRB (#2177, now on
main):xrspatial/geotiff/__init__.py— eager numpyopen_geotiffxrspatial/geotiff/_backends/gpu.py— GPU CPU-fallback pathxrspatial/geotiff/_backends/gpu.py— GPU local main pathxrspatial/geotiff/_backends/gpu.py— GPU HTTP pathNet diff: -354 / +385, removing about 250 lines of repeated
validate / populate-attrs / mask / cast / set_nodata_attrs / DataArray-buildcode the helper now owns.
The three GPU sites resolve
mask_sentinelthree different waysbefore calling the helper:
_stripped_geo._mask_nodatafromread_to_array_mw_mask_nodatalocal, with_cpu_fallback_geofallback, then raw
nodatageo_info._mask_nodatafromread_to_array_finalize_eager_read's host-side mask block runs on CuPy arrays vianumpy duck-typing, so no GPU-specific branch is needed.
Behavior changes
The GPU local-eager path now slices the window/band before masking,
not after. The CPU eager path has always masked post-slice; the GPU
path was masking the full IFD then slicing. This brings the
attrs['nodata_pixels_present']attr in line: it now reportssentinel presence within the read window on every backend. No
existing GPU test pinned the old IFD-level semantics.
Tests
Adds
xrspatial/geotiff/tests/test_eager_finalization_parity_2162.pycovering the numpy and cupy eager backends against the same files:
mask_nodata=False: literal sentinel survivesdtype=kwarg:nodata_dtype_castrecordedExisting eager tests still pass:
test_nodata_lifecycle_attrs_2135,test_nodata_semantics_split_1988,test_georef_status_2136,test_gpu_nodata_1542, plus the broader GeoTIFF suite (4584 passed,45 skipped on no-CUDA, 1 xfailed, 1 pre-existing deselected flake
unrelated to this PR).
Test plan
pytest xrspatial/geotiff/tests/test_eager_finalization_parity_2162.py -vpytest xrspatial/geotiff/tests/test_nodata_lifecycle_attrs_2135.pypytest xrspatial/geotiff/tests/test_gpu_nodata_1542.pypytest xrspatial/geotiff/(full suite, deselecting one pre-existing flake)Closes #2179.
Parent: #2162. Sibling PR C (#2178) handles the dask backends in
parallel; sibling PR E (#2180) handles VRT in wave 3.