Skip to content

geotiff: forward read kwargs through all GPU CPU-fallback paths#2244

Merged
brendancol merged 3 commits into
mainfrom
issue-2238
May 21, 2026
Merged

geotiff: forward read kwargs through all GPU CPU-fallback paths#2244
brendancol merged 3 commits into
mainfrom
issue-2238

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Closes #2238.

Summary

  • Forward allow_rotated, window, band, and max_pixels through every CPU-fallback _read_to_array call inside read_geotiff_gpu (stripped-layout, planar=2 per-band, sparse-tile, and post GPU-decode-failure).
  • Skip the post-decode shape check and _gpu_apply_window_band slicer on CPU-decoded paths so the now-already-sliced fallback buffer doesn't get double-windowed or trip the multi-band shape guard.
  • Flip test_cupy_rotated_read_drops_crs from xfail to pass; add a regression-test file pinning the kwarg-forwarding contract on all four fallback sites.

Backend coverage

Touches only xrspatial/geotiff/_backends/gpu.py. The cupy and dask+cupy paths use this code; the dask+cupy chunked path already forwarded kwargs correctly via _read_geotiff_gpu_chunked, so it's unaffected. The numpy and dask+numpy paths are unchanged.

Test plan

  • pytest xrspatial/geotiff/tests/test_gpu_fallback_forwards_kwargs_2238.py (6 new tests, all pass)
  • pytest xrspatial/geotiff/tests/test_allow_rotated_no_crs_2122.py (xfail-flipped test now passes; 8 tests total)
  • pytest -k "gpu or cupy or planar or sparse or fallback or rotated" xrspatial/geotiff/tests/ (1189 passed, 20 skipped, no regressions)

The eager GPU read had four CPU-fallback ``_read_to_array`` call sites:
the stripped-layout branch, the planar=2 per-band fallback, the
sparse-tile branch, and the post GPU-decode-failure branch. Three of
them dropped every caller kwarg, and the stripped one dropped
``allow_rotated``. Symptoms:

- rotated files failed with ``NotImplementedError`` even with
  ``allow_rotated=True`` (rotation guard fired on the fallback parser)
- a caller-raised ``max_pixels`` still hit the default 1B-pixel cap
  inside the fallback
- ``window`` / ``band`` had to be repaired post-decode by
  ``_gpu_apply_window_band``, which masked the dropped-kwarg defect

Now every CPU-fallback site forwards ``allow_rotated`` / ``window`` /
``band`` / ``max_pixels`` to ``_read_to_array``. The post-decode shape
check and ``_gpu_apply_window_band`` slicer skip on CPU-decoded paths
because the buffer is already windowed and banded. Mirrors the existing
``arr_was_cpu_decoded`` skip for ``_apply_orientation_gpu``.

Adds regression tests covering all four sites via a kwarg recorder, plus
end-to-end rotated-fallback, max_pixels-honored, and windowed-banded
fallback assertions. Flips ``test_cupy_rotated_read_drops_crs`` from
xfail to pass.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 21, 2026
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: geotiff: forward read kwargs through all GPU CPU-fallback paths

Blockers (must fix before merge)

  • None.

Suggestions (should fix, not blocking)

  • test_sparse_tile_fallback_respects_caller_max_pixels (test_gpu_fallback_forwards_kwargs_2238.py:351-376) does not actually exercise the fallback. With max_pixels=10 on a 32x32 file, the pre-fallback _check_dimensions(width, height, samples, max_pixels) at gpu.py:559 raises before the sparse-tile branch fires. The test passes for the wrong reason. The kwarg-recorder test directly above (test_sparse_tile_fallback_forwards_all_kwargs) already pins max_pixels plumbing via assertion on seen[0]['max_pixels']. Either drop this test as redundant, or convert it to a case where the file's full IFD fits under max_pixels but a synthetic sub-decoder failure forces the fallback to re-run with the same kwarg (closer to the recorder pattern in test_planar2_fallback_forwards_all_kwargs).

  • The hand-crafted TIFF fixture helpers _write_rotated_tiled_tiff and _write_sparse_rotated_tiled_tiff are nearly identical; the sparse variant differs only in the offset/byte_count packing for the last tile. Consider factoring the shared IFD-build logic into a helper or parameterising one writer on a sparse: bool flag. About 80 lines of fixture code collapses to about 10. Not blocking; the duplication is contained and well-commented.

Nits (optional improvements)

  • gpu.py:801: the comment "now-already-sliced fallback buffer" reads slightly awkwardly. "the already-sliced fallback buffer" or "the buffer (now sliced by _read_to_array)" is cleaner.

  • test_gpu_fallback_forwards_kwargs_2238.py:392-393: import order. _gpu_decode is imported inside the test body. Pre-existing style in this repo (see test_predictor2_big_endian_gpu_1517.py) puts function-scope imports at the top of the test function. The current order works; flagged only for consistency.

What looks good

  • The fix is small and mirrors the existing arr_was_cpu_decoded skip pattern already used by _apply_orientation_gpu (gpu.py:817-820). The new skip on _gpu_apply_window_band (gpu.py:864-873) and the multi-band shape guard (gpu.py:801-811) reuses the same flag, so future readers can see all three skips together.
  • The post-decode shape check is correctly narrowed to non-CPU-decoded paths so it still catches pure-GPU kernel regressions while not falsely tripping on the now-windowed fallback output.
  • The xfail flip on test_cupy_rotated_read_drops_crs (was strict=True, raises=NotImplementedError) provides automatic regression coverage; if the kwarg-forwarding breaks again, that test will raise in CI before the new test file even runs.
  • Test coverage hits all four call sites: stripped (via the existing #2122 test), sparse (real round-trip on a hand-crafted rotated tiled file), planar=2 (monkeypatched _gpu_decode_single_band_tiles returning None), and decode-failure (both gpu_decode_tiles_from_file and gpu_decode_tiles patched to raise).
  • The _make_kwarg_recorder helper is the right abstraction here. Asserting on seen[0] rather than chasing data shapes keeps each fallback site's contract pinned at the API boundary, independent of downstream slicing fixes.
  • No backend dispatch changes; the numpy / dask / dask+cupy paths are correctly identified as unaffected (the dask+cupy chunked path already forwards kwargs via _read_geotiff_gpu_chunked).

Checklist

  • Algorithm correctness: no algorithm change; kwarg-plumbing fix only
  • All implemented backends produce consistent results: verified by the cross-fallback recorder tests and the 1189-test sweep
  • NaN handling: unchanged; the sparse-tile fallback still goes through _finalize_eager_read for the mask
  • Edge cases: window/band/max_pixels/allow_rotated each covered for at least one fallback site
  • Dask chunk boundaries: N/A (eager GPU path only)
  • No premature materialization or unnecessary copies: N/A
  • Benchmark: N/A (bug fix)
  • README feature matrix: N/A (no backend change)
  • Docstrings: comments at each new branch reference issue #2238; no public API change

Two follow-ups from the PR review:

- Drop ``test_sparse_tile_fallback_respects_caller_max_pixels``. The
  test set ``max_pixels=10`` on a 32x32 file and asserted on a raise,
  but the raise comes from gpu.py's own ``_check_dimensions`` pre-check
  at line 559, before the sparse-tile fallback fires. The kwarg-recorder
  test directly above already pins ``max_pixels`` plumbing via assertion
  on ``seen[0]['max_pixels']``, so the dropped test was redundant.

- Fold ``_write_sparse_rotated_tiled_tiff`` into ``_write_rotated_tiled_tiff``
  with a ``sparse: bool`` kwarg. The two writers shared ~80 lines of
  IFD-build logic; the sparse variant only differed in the offset and
  byte_count packing for the last tile.
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review (round 2): post-follow-up

Blockers (must fix before merge)

  • None.

Suggestions (should fix, not blocking)

  • None remaining from the previous review.

Nits (optional improvements)

  • None remaining from the previous review.

What changed since the last review

  • test_sparse_tile_fallback_respects_caller_max_pixels was dropped. The recorder test (test_sparse_tile_fallback_forwards_all_kwargs) covers max_pixels plumbing.
  • _write_sparse_rotated_tiled_tiff was folded into _write_rotated_tiled_tiff via a sparse: bool kwarg. The shared IFD-build logic now lives in one place; the sparse path only adds offset=0/bytecount=0 entries for the last tile.

Disposition of round-1 findings

  • Suggestion 1 (misleading max_pixels test): fixed by deletion. The duplicate-coverage rationale in the review was the right call.
  • Suggestion 2 (TIFF writer duplication): fixed via the sparse=True parameter on the unified writer. Diff dropped ~93 lines of fixture code.
  • Nit 1 (comment wording): dismissed. The phrase the reviewer flagged was in the round-1 review body, not the source. The actual gpu.py:804 comment ("the buffer is already sliced and will not equal the full IFD shape") reads cleanly.
  • Nit 2 (import order): dismissed. Surrounding test files (e.g. test_predictor2_big_endian_gpu_1517.py) put function-scope imports inside the test body, matching the current style. The suggested move would diverge from the repo's existing convention.

Verification

  • pytest xrspatial/geotiff/tests/test_gpu_fallback_forwards_kwargs_2238.py xrspatial/geotiff/tests/test_allow_rotated_no_crs_2122.py: 13 passed

@brendancol brendancol merged commit 829d3a8 into main May 21, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

GPU tiled fallback drops critical read kwargs (allow_rotated/window/band/max_pixels)

1 participant