Skip to content

Thread missing kwargs through geotiff backend entry points (#1561)#1591

Merged
brendancol merged 3 commits into
mainfrom
issue-1561
May 11, 2026
Merged

Thread missing kwargs through geotiff backend entry points (#1561)#1591
brendancol merged 3 commits into
mainfrom
issue-1561

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

open_geotiff and to_geotiff are dispatchers; they route to dask or GPU backends based on chunks= / gpu=. The backend-specific entry points (read_geotiff_dask, write_geotiff_gpu) accepted a smaller kwarg set than the dispatchers, so dispatcher calls with one of the missing kwargs silently dropped them.

read_geotiff_dask — add window, band, max_pixels and forward from open_geotiff.

  • window: clips the lazy region; chunks lay out within the window and coords shift to the window origin (matches the eager path).
  • band: selects a single 0-based band per chunk; collapses output to 2D.
  • max_pixels: up-front guard plus per-chunk forwarding through _delayed_read_window.

write_geotiff_gpu — add tiled, bigtiff, max_z_error, streaming_buffer_bytes and forward bigtiff / streaming_buffer_bytes from to_geotiff.

  • tiled: must be True (nvCOMP is tile-based). False raises ValueError; to_geotiff also rejects tiled=False+gpu=True before reaching the GPU entry point.
  • bigtiff: forwarded into _assemble_tiff's force_bigtiff.
  • max_z_error: accepted for API parity; non-zero raises because nvCOMP has no LERC backend.
  • streaming_buffer_bytes: accepted for API parity; a no-op (whole array stays on device).

gpu= for the dask path remains out of scope — it requires a real dask+CuPy backend implementation, not a kwarg plumb.

Closes #1561.

Test plan

  • 12 new tests in xrspatial/geotiff/tests/test_backend_kwarg_parity_1561.py cover the threaded kwargs end-to-end via both the dispatcher and the backend entry points
  • Full xrspatial/geotiff/tests/ suite: 1017 passed (1005 prior + 12 new), 3 skipped

open_geotiff and to_geotiff are dispatchers; they route to dask or GPU
backends based on chunks=/gpu=. The backend-specific entry points
(read_geotiff_dask, write_geotiff_gpu) used to accept a smaller kwarg
set than the dispatchers, so dispatcher calls with one of the missing
kwargs silently dropped them.

read_geotiff_dask
-----------------
Add window=, band=, max_pixels= and forward them from open_geotiff.

* window: clips the lazy region; chunks lay out within the window and
  file-relative coords are shifted to the window origin (matching the
  eager path).
* band: selects a single 0-based band per chunk via the existing
  read_to_array path; collapses the output to 2D.
* max_pixels: up-front guard against the windowed region, plus
  per-chunk forwarding through _delayed_read_window so direct calls
  to read_to_array remain bounded too.

write_geotiff_gpu
-----------------
Add tiled=, bigtiff=, max_z_error=, streaming_buffer_bytes= and
forward bigtiff / streaming_buffer_bytes from to_geotiff.

* tiled: must be True (nvCOMP is tile-based). False raises ValueError
  rather than silently producing a tiled file. to_geotiff also
  rejects tiled=False+gpu=True before reaching the GPU entry point.
* bigtiff: forwarded into _assemble_tiff's force_bigtiff so the GPU
  path can produce >4GB files.
* max_z_error: accepted at the signature level for API parity; any
  non-zero value raises because nvCOMP has no LERC backend.
* streaming_buffer_bytes: accepted at the signature level for API
  parity; a no-op because the GPU writer keeps the whole array on
  device.

gpu= for the dask path remains out of scope here -- it requires a
real dask+CuPy backend implementation, not a kwarg plumb.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 11, 2026
@brendancol brendancol requested a review from Copilot May 11, 2026 15:12
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes GeoTIFF API parity issues where dispatcher entry points (open_geotiff, to_geotiff) accepted kwargs that were previously not threaded through to their backend-specific implementations (read_geotiff_dask, write_geotiff_gpu), causing silent kwarg drops depending on dispatch.

Changes:

  • Forward window, band, and max_pixels from open_geotiff(..., chunks=...) into read_geotiff_dask, and implement the corresponding behaviors in the dask reader.
  • Extend write_geotiff_gpu’s signature to accept additional to_geotiff kwargs (tiled, bigtiff, max_z_error, streaming_buffer_bytes) and thread the relevant ones from to_geotiff into the GPU writer.
  • Add regression tests to ensure kwarg parity is preserved end-to-end via both the dispatchers and backend entry points.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File Description
xrspatial/geotiff/__init__.py Threads missing kwargs through dispatcher/backends and adds window/band/max_pixels handling to the dask reader; expands GPU writer API and forwards new kwargs from to_geotiff.
xrspatial/geotiff/tests/test_backend_kwarg_parity_1561.py New regression test module covering kwarg threading/parity across dispatcher and backend entry points.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

import os
name = os.path.splitext(os.path.basename(source))[0]

attrs = {}
Comment on lines +148 to +164
def test_write_geotiff_gpu_rejects_tiled_false():
"""The GPU writer is tiled-only; ``tiled=False`` must fail loudly."""
from xrspatial.geotiff import write_geotiff_gpu

dummy = np.zeros((2, 2), dtype=np.float32)
with pytest.raises(ValueError, match="tiled=True"):
# path is irrelevant -- validation fires before any file I/O.
write_geotiff_gpu(dummy, "/tmp/parity_1561_unused.tif", tiled=False)


def test_write_geotiff_gpu_rejects_nonzero_max_z_error():
"""LERC budget is not implementable on the GPU path."""
from xrspatial.geotiff import write_geotiff_gpu

dummy = np.zeros((2, 2), dtype=np.float32)
with pytest.raises(ValueError, match="max_z_error is not supported"):
write_geotiff_gpu(dummy, "/tmp/parity_1561_unused.tif", max_z_error=1.0)
Comment on lines +160 to +164
from xrspatial.geotiff import write_geotiff_gpu

dummy = np.zeros((2, 2), dtype=np.float32)
with pytest.raises(ValueError, match="max_z_error is not supported"):
write_geotiff_gpu(dummy, "/tmp/parity_1561_unused.tif", max_z_error=1.0)
Comment on lines +169 to +173
pytest.importorskip("cupy")
import cupy

if not cupy.cuda.is_available():
pytest.skip("CUDA runtime not available")
Comment on lines +190 to +194
pytest.importorskip("cupy")
import cupy

if not cupy.cuda.is_available():
pytest.skip("CUDA runtime not available")
Comment on lines +1686 to 1689
_r2a_kwargs = {}
if max_pixels is not None:
_r2a_kwargs['max_pixels'] = max_pixels
arr, _ = read_to_array(source, window=(r0, c0, r1, c1),
- _populate_attrs_from_geo_info: update note to reflect that the dask
  path now also passes window= through this helper (since #1561), so it
  no longer special-cases the eager path.
- _delayed_read_window HTTP COG branch: thread max_pixels through to
  _fetch_decode_cog_http_tiles so caller-supplied caps are honored on
  remote (http/https) chunk reads, not just on local-file chunk reads.
- test_backend_kwarg_parity_1561: replace hard-coded /tmp/... paths
  with tmp_path so the tests remain portable on Windows. Switch the
  two GPU tests to the shared _gpu_only marker pattern used elsewhere
  in this package (importlib find_spec + cupy.cuda.is_available, then
  pytest.mark.skipif) rather than ad-hoc importorskip+pytest.skip.
@brendancol brendancol merged commit c5e873d into main May 11, 2026
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

geotiff: backend-specific entry points silently drop kwargs of their dispatchers

2 participants