Thread missing kwargs through geotiff backend entry points (#1561)#1591
Merged
Conversation
open_geotiff and to_geotiff are dispatchers; they route to dask or GPU backends based on chunks=/gpu=. The backend-specific entry points (read_geotiff_dask, write_geotiff_gpu) used to accept a smaller kwarg set than the dispatchers, so dispatcher calls with one of the missing kwargs silently dropped them. read_geotiff_dask ----------------- Add window=, band=, max_pixels= and forward them from open_geotiff. * window: clips the lazy region; chunks lay out within the window and file-relative coords are shifted to the window origin (matching the eager path). * band: selects a single 0-based band per chunk via the existing read_to_array path; collapses the output to 2D. * max_pixels: up-front guard against the windowed region, plus per-chunk forwarding through _delayed_read_window so direct calls to read_to_array remain bounded too. write_geotiff_gpu ----------------- Add tiled=, bigtiff=, max_z_error=, streaming_buffer_bytes= and forward bigtiff / streaming_buffer_bytes from to_geotiff. * tiled: must be True (nvCOMP is tile-based). False raises ValueError rather than silently producing a tiled file. to_geotiff also rejects tiled=False+gpu=True before reaching the GPU entry point. * bigtiff: forwarded into _assemble_tiff's force_bigtiff so the GPU path can produce >4GB files. * max_z_error: accepted at the signature level for API parity; any non-zero value raises because nvCOMP has no LERC backend. * streaming_buffer_bytes: accepted at the signature level for API parity; a no-op because the GPU writer keeps the whole array on device. gpu= for the dask path remains out of scope here -- it requires a real dask+CuPy backend implementation, not a kwarg plumb.
Contributor
There was a problem hiding this comment.
Pull request overview
This PR fixes GeoTIFF API parity issues where dispatcher entry points (open_geotiff, to_geotiff) accepted kwargs that were previously not threaded through to their backend-specific implementations (read_geotiff_dask, write_geotiff_gpu), causing silent kwarg drops depending on dispatch.
Changes:
- Forward
window,band, andmax_pixelsfromopen_geotiff(..., chunks=...)intoread_geotiff_dask, and implement the corresponding behaviors in the dask reader. - Extend
write_geotiff_gpu’s signature to accept additionalto_geotiffkwargs (tiled,bigtiff,max_z_error,streaming_buffer_bytes) and thread the relevant ones fromto_geotiffinto the GPU writer. - Add regression tests to ensure kwarg parity is preserved end-to-end via both the dispatchers and backend entry points.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
xrspatial/geotiff/__init__.py |
Threads missing kwargs through dispatcher/backends and adds window/band/max_pixels handling to the dask reader; expands GPU writer API and forwards new kwargs from to_geotiff. |
xrspatial/geotiff/tests/test_backend_kwarg_parity_1561.py |
New regression test module covering kwarg threading/parity across dispatcher and backend entry points. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| import os | ||
| name = os.path.splitext(os.path.basename(source))[0] | ||
|
|
||
| attrs = {} |
Comment on lines
+148
to
+164
| def test_write_geotiff_gpu_rejects_tiled_false(): | ||
| """The GPU writer is tiled-only; ``tiled=False`` must fail loudly.""" | ||
| from xrspatial.geotiff import write_geotiff_gpu | ||
|
|
||
| dummy = np.zeros((2, 2), dtype=np.float32) | ||
| with pytest.raises(ValueError, match="tiled=True"): | ||
| # path is irrelevant -- validation fires before any file I/O. | ||
| write_geotiff_gpu(dummy, "/tmp/parity_1561_unused.tif", tiled=False) | ||
|
|
||
|
|
||
| def test_write_geotiff_gpu_rejects_nonzero_max_z_error(): | ||
| """LERC budget is not implementable on the GPU path.""" | ||
| from xrspatial.geotiff import write_geotiff_gpu | ||
|
|
||
| dummy = np.zeros((2, 2), dtype=np.float32) | ||
| with pytest.raises(ValueError, match="max_z_error is not supported"): | ||
| write_geotiff_gpu(dummy, "/tmp/parity_1561_unused.tif", max_z_error=1.0) |
Comment on lines
+160
to
+164
| from xrspatial.geotiff import write_geotiff_gpu | ||
|
|
||
| dummy = np.zeros((2, 2), dtype=np.float32) | ||
| with pytest.raises(ValueError, match="max_z_error is not supported"): | ||
| write_geotiff_gpu(dummy, "/tmp/parity_1561_unused.tif", max_z_error=1.0) |
Comment on lines
+169
to
+173
| pytest.importorskip("cupy") | ||
| import cupy | ||
|
|
||
| if not cupy.cuda.is_available(): | ||
| pytest.skip("CUDA runtime not available") |
Comment on lines
+190
to
+194
| pytest.importorskip("cupy") | ||
| import cupy | ||
|
|
||
| if not cupy.cuda.is_available(): | ||
| pytest.skip("CUDA runtime not available") |
Comment on lines
+1686
to
1689
| _r2a_kwargs = {} | ||
| if max_pixels is not None: | ||
| _r2a_kwargs['max_pixels'] = max_pixels | ||
| arr, _ = read_to_array(source, window=(r0, c0, r1, c1), |
- _populate_attrs_from_geo_info: update note to reflect that the dask path now also passes window= through this helper (since #1561), so it no longer special-cases the eager path. - _delayed_read_window HTTP COG branch: thread max_pixels through to _fetch_decode_cog_http_tiles so caller-supplied caps are honored on remote (http/https) chunk reads, not just on local-file chunk reads. - test_backend_kwarg_parity_1561: replace hard-coded /tmp/... paths with tmp_path so the tests remain portable on Windows. Switch the two GPU tests to the shared _gpu_only marker pattern used elsewhere in this package (importlib find_spec + cupy.cuda.is_available, then pytest.mark.skipif) rather than ad-hoc importorskip+pytest.skip.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
open_geotiffandto_geotiffare dispatchers; they route to dask or GPU backends based onchunks=/gpu=. The backend-specific entry points (read_geotiff_dask,write_geotiff_gpu) accepted a smaller kwarg set than the dispatchers, so dispatcher calls with one of the missing kwargs silently dropped them.read_geotiff_dask— addwindow,band,max_pixelsand forward fromopen_geotiff.window: clips the lazy region; chunks lay out within the window and coords shift to the window origin (matches the eager path).band: selects a single 0-based band per chunk; collapses output to 2D.max_pixels: up-front guard plus per-chunk forwarding through_delayed_read_window.write_geotiff_gpu— addtiled,bigtiff,max_z_error,streaming_buffer_bytesand forwardbigtiff/streaming_buffer_bytesfromto_geotiff.tiled: must be True (nvCOMP is tile-based). False raisesValueError;to_geotiffalso rejectstiled=False+gpu=Truebefore reaching the GPU entry point.bigtiff: forwarded into_assemble_tiff'sforce_bigtiff.max_z_error: accepted for API parity; non-zero raises because nvCOMP has no LERC backend.streaming_buffer_bytes: accepted for API parity; a no-op (whole array stays on device).gpu=for the dask path remains out of scope — it requires a real dask+CuPy backend implementation, not a kwarg plumb.Closes #1561.
Test plan
xrspatial/geotiff/tests/test_backend_kwarg_parity_1561.pycover the threaded kwargs end-to-end via both the dispatcher and the backend entry pointsxrspatial/geotiff/tests/suite: 1017 passed (1005 prior + 12 new), 3 skipped