Skip to content

geotiff: route eager backends through shared finalization helper (#2179)#2207

Merged
brendancol merged 4 commits into
mainfrom
issue-2179
May 20, 2026
Merged

geotiff: route eager backends through shared finalization helper (#2179)#2207
brendancol merged 4 commits into
mainfrom
issue-2179

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

Wave 2 of #2162 (PR D of the series). Replaces the duplicated eager
finalization block at four sites with _finalize_eager_read from PR
B (#2177, now on main):

  • xrspatial/geotiff/__init__.py — eager numpy open_geotiff
  • xrspatial/geotiff/_backends/gpu.py — GPU CPU-fallback path
  • xrspatial/geotiff/_backends/gpu.py — GPU local main path
  • xrspatial/geotiff/_backends/gpu.py — GPU HTTP path

Net diff: -354 / +385, removing about 250 lines of repeated validate / populate-attrs / mask / cast / set_nodata_attrs / DataArray-build
code the helper now owns.

The three GPU sites resolve mask_sentinel three different ways
before calling the helper:

  • CPU-fallback path: _stripped_geo._mask_nodata from read_to_array
  • Local main path: _mw_mask_nodata local, with _cpu_fallback_geo
    fallback, then raw nodata
  • HTTP path: geo_info._mask_nodata from read_to_array

_finalize_eager_read's host-side mask block runs on CuPy arrays via
numpy duck-typing, so no GPU-specific branch is needed.

Behavior changes

The GPU local-eager path now slices the window/band before masking,
not after. The CPU eager path has always masked post-slice; the GPU
path was masking the full IFD then slicing. This brings the
attrs['nodata_pixels_present'] attr in line: it now reports
sentinel presence within the read window on every backend. No
existing GPU test pinned the old IFD-level semantics.

Tests

Adds xrspatial/geotiff/tests/test_eager_finalization_parity_2162.py
covering the numpy and cupy eager backends against the same files:

  • float sentinel: in-place NaN mask
  • int in-range sentinel: int -> float64 promotion
  • int out-of-range sentinel: mask no-op
  • mask_nodata=False: literal sentinel survives
  • no declared sentinel: lifecycle attrs absent
  • explicit dtype= kwarg: nodata_dtype_cast recorded

Existing eager tests still pass: test_nodata_lifecycle_attrs_2135,
test_nodata_semantics_split_1988, test_georef_status_2136,
test_gpu_nodata_1542, plus the broader GeoTIFF suite (4584 passed,
45 skipped on no-CUDA, 1 xfailed, 1 pre-existing deselected flake
unrelated to this PR).

Test plan

  • pytest xrspatial/geotiff/tests/test_eager_finalization_parity_2162.py -v
  • pytest xrspatial/geotiff/tests/test_nodata_lifecycle_attrs_2135.py
  • pytest xrspatial/geotiff/tests/test_gpu_nodata_1542.py
  • pytest xrspatial/geotiff/ (full suite, deselecting one pre-existing flake)

Closes #2179.

Parent: #2162. Sibling PR C (#2178) handles the dask backends in
parallel; sibling PR E (#2180) handles VRT in wave 3.

Wave 2 of #2162. Replace the duplicated inline finalization block in
the four eager GeoTIFF read sites with _finalize_eager_read from
#2177:

- xrspatial/geotiff/__init__.py: eager numpy open_geotiff
- xrspatial/geotiff/_backends/gpu.py: GPU CPU-fallback path
- xrspatial/geotiff/_backends/gpu.py: GPU local main path
- xrspatial/geotiff/_backends/gpu.py: GPU HTTP path

Drops about 250 lines of repeated validate / populate-attrs / mask /
cast / set_nodata_attrs / DataArray-build code that the helper now
owns.

The GPU local-eager path now slices the window/band before masking
rather than masking the full IFD and slicing after. That brings
attrs['nodata_pixels_present'] into line with the CPU eager path:
the attr now reports sentinel presence within the read window. No
GPU test pinned the old IFD-level semantics.

Adds test_eager_finalization_parity_2162.py covering float sentinel,
int in-range / out-of-range sentinels, mask_nodata=False, no declared
sentinel, and explicit dtype= across the numpy and cupy backends.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 20, 2026
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: geotiff: route eager backends through shared finalization helper (#2179)

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

  • Add a windowed-read parity case to the new test file. The GPU local-eager path now slices before masking (gpu.py:822-833), which is the behaviour change called out in the PR body. None of the new parity tests pass window=, so the new contract is not pinned in the parity matrix. Cover at least the "window contains no sentinel" vs "window contains sentinel" pair so a future refactor cannot silently re-route the mask through the full IFD. (xrspatial/geotiff/tests/test_eager_finalization_parity_2162.py)

  • Add a MinIsWhite parity case. The local-eager mask_sentinel resolution has three branches (_mw_mask_nodata, _cpu_fallback_geo._mask_nodata, raw nodata). The parity tests exercise the raw and CPU-fallback branches via regular tiled and stripped files, but the MinIsWhite branch is not in the new matrix. test_miniswhite_nodata_1809.py and test_gpu_miniswhite_band_first_2097.py cover it indirectly; pinning it in the parity file would close the gap on the specific code path this PR rewrites. (xrspatial/geotiff/tests/test_eager_finalization_parity_2162.py)

  • Add a multi-band (3D) parity case. The helper's arr.ndim == 3 branch (_attrs.py:1333-1335) builds a (y, x, band) DataArray; existing multi-band tests exercise it, but the parity file does not. Worth a single test for the stripped multi-band path the GPU CPU-fallback site lands on.

  • _apply_nodata_mask_gpu_with_presence is now dead code. After this PR, the helper at _gpu_helpers.py:45 has no remaining callers (verified by grep across xrspatial/). The three references in gpu.py are only in comments now, and the helper's docstring still pitches it as the per-call optimization for the inline GPU mask. Cleanup is out of scope for this PR, but worth a follow-up issue to delete the helper and the _apply_nodata_mask_gpu companion if it falls out of use too.

Nits (optional improvements)

  • _coords_unused variable name. gpu.py:832 discards the coords from _gpu_apply_window_band with arr_gpu, _coords_unused = .... A leading underscore alone is the conventional unused-marker; the _unused suffix is redundant. Either _coords or _ would match the surrounding style.

  • mask_sentinel flow in the GPU local-eager path. gpu.py:845-854 resolves mask_sentinel even when mask_nodata=False. That is correct because the helper still uses the sentinel to scan for nodata_pixels_present in the no-mask branch, but the original inline code only computed _gpu_mask_value under the mask_nodata=True gate. Worth a short inline comment so a future reader does not read "unconditional sentinel resolution" as a bug.

  • Possible micro-perf regression on large GPU buffers. _apply_eager_nodata_mask (_attrs.py:1162, _attrs.py:1184) uses arr[mask] = np.nan fancy indexing where the previous GPU helper used cupy.putmask. The two are equivalent in output but cupy.putmask avoids one intermediate allocation. Likely negligible against the rest of the pipeline, but worth flagging in case GPU profiling later shows it. Not blocking.

What looks good

  • Net diff is solid: ~250 lines of duplicated finalization removed across two files.
  • The slice-before-mask behavioural change on the GPU local-eager path is documented in both the PR body and the inline comment block at gpu.py:822-831. The parity claim (CPU and GPU now agree on nodata_pixels_present semantics) is sound and validated by the windowed sanity check I ran outside the test suite.
  • All four call sites consistently land on mask_sentinel = getattr(geo_info, '_mask_nodata', nodata) or its branched equivalent, so the four-way drift the issue body called out is gone.
  • Test file has a clean structure: shared _LIFECYCLE_ATTRS constant and _assert_lifecycle_attrs_match helper avoid the copy-paste-asserts pattern that would have grown with each parity case.
  • Existing test suite still passes (4584 passed, 45 skipped on no-CUDA, 1 xfailed, 1 pre-existing deselected flake).

Checklist

  • Algorithm matches reference (helper from #2177 mirrors the inline blocks field-for-field; cupy duck-typing verified)
  • All implemented backends produce consistent results (the parity tests assert this for numpy / cupy)
  • NaN handling is correct (mask substitutes NaN; integer promotion gated on in-range integer sentinel)
  • Edge cases are covered by tests (float, int in-range, int out-of-range, no sentinel, mask_nodata=False, explicit dtype)
  • [-] Dask chunk boundaries handled correctly (n/a, sibling PR C #2178 handles dask backends)
  • No premature materialization or unnecessary copies (helper uses in-place fancy indexing on the existing buffer)
  • Benchmark exists or is not needed (pure refactor, no perf contract change)
  • [-] README feature matrix updated (n/a, no API change)
  • Docstrings present and accurate (helper docstring already documents the wave-2 / wave-3 contract; PR body covers behaviour change)

Follow-up commit applying the in-PR review findings:

- Rename ``_coords_unused`` to ``_`` in the GPU local-eager path
  per the convention for discarded tuple elements; add a one-line
  comment so a future reader can see why the local coords are
  dropped (the helper rebuilds them).
- Add a clarifying comment on the GPU local-eager
  ``mask_sentinel`` resolution: it intentionally runs even when
  ``mask_nodata=False`` because the helper still needs the
  sentinel for the ``nodata_pixels_present`` scan (#2135).
- Extend the parity test file with three additional cases:
  windowed reads (pins the slice-before-mask behaviour), a
  MinIsWhite raster (exercises the ``_mw_mask_nodata`` branch),
  and a stripped multi-band file (exercises the 3-D output
  branch on the GPU CPU-fallback path).

The ``_apply_nodata_mask_gpu_with_presence`` dead-code cleanup
and the optional ``cupy.putmask`` perf optimisation are tracked
in follow-up issue #2208 rather than bundled here.
The module docstring listed five cases; the file now has nine
test functions after the windowed / MinIsWhite / multi-band /
dtype= cases. Extend the bullet list so the docstring matches
the test contents.
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review (follow-up): geotiff: route eager backends through shared finalization helper (#2179)

Follow-up review on commit 850ab04 (and the doc-only fix in bbbeda0). The second commit addressed the in-PR findings from the initial review; the third commit fixed the one new finding this pass turned up. No remaining blockers, suggestions, or nits.

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

None remaining.

Nits (optional improvements)

None remaining.

Disposition of the initial review's findings (final)

  • Add a windowed parity case -- FIXED in 850ab04 (test_windowed_read_presence_matches_window_contents).
  • Add a MinIsWhite parity case -- FIXED in 850ab04 (test_miniswhite_post_inversion_sentinel_parity).
  • Add a multi-band parity case -- FIXED in 850ab04 (test_multiband_stripped_parity).
  • _apply_nodata_mask_gpu_with_presence dead code -- DEFERRED to follow-up issue #2208.
  • _coords_unused variable name -- FIXED in 850ab04 (renamed to _ with explanatory comment).
  • mask_sentinel flow comment -- FIXED in 850ab04 (block comment at gpu.py:847-851).
  • cupy.putmask micro-perf -- DEFERRED to follow-up issue #2208.
  • Stale parity test docstring -- FIXED in bbbeda0 (matrix list now matches the nine test functions).

What looks good

  • The follow-up commits stayed in scope: each finding either landed in-PR or got a tracking issue.
  • Test count went from 6 to 9; the three new cases cover the behaviour change (slice-before-mask on the GPU local-eager path), the MinIsWhite branch of the mask_sentinel resolution, and the 3-D output branch.
  • The two gpu.py comments added in 850ab04 (the _ discard rationale and the unconditional sentinel resolution explanation) head off two plausible future "why is this here?" follow-ups.
  • All nine parity tests pass; the broader GeoTIFF test suite still passes (4587 passed, 45 skipped on no-CUDA, 1 xfailed, 1 pre-existing deselected flake).

Checklist

  • Algorithm matches reference
  • All implemented backends produce consistent results
  • NaN handling is correct
  • Edge cases are covered by tests
  • [-] Dask chunk boundaries handled correctly (n/a)
  • No premature materialization or unnecessary copies
  • Benchmark exists or is not needed
  • [-] README feature matrix updated (n/a)
  • Docstrings present and accurate

# Conflicts:
#	xrspatial/geotiff/_backends/gpu.py
@brendancol brendancol merged commit 9602e57 into main May 20, 2026
4 of 5 checks passed
brendancol added a commit that referenced this pull request May 21, 2026
…) (#2217)

* geotiff: remove dead _apply_nodata_mask_gpu_with_presence helper (#2208)

After #2207 routed all three GPU eager sites through
_finalize_eager_read, the sibling helper
_apply_nodata_mask_gpu_with_presence had zero remaining callers.
Drop the helper and the four stale comment references that named it.
_apply_nodata_mask_gpu stays put because the chunked GPU dask path
still calls it from _chunk_task.

* Address review nit: drop redundant hasattr test (#2208)

The ImportError test already covers the import-time surface and
implicitly the module-attribute absence. Keep one assertion path,
not two.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

GeoTIFF: migrate eager backends to shared eager finalization helper (PR D of #2162)

1 participant