geotiff: backend-specific entry points silently drop kwargs of their dispatchers #1561

@brendancol

Summary

open_geotiff and to_geotiff are the dispatcher entry points; they route to the dask or GPU backend based on chunks= and gpu=. The backend-specific functions (read_geotiff_dask, write_geotiff_gpu) accept a smaller kwarg set than the dispatchers, so when a user calls the dispatcher with one of the missing kwargs and the dispatcher routes to the smaller-API backend, the kwarg is silently dropped.

read_geotiff_dask is missing vs open_geotiff

  • window — not threaded through; dask readers cannot do windowed lazy reads.
  • band — not threaded through; cannot pre-select a single band.
  • gpu — not threaded through; no dask+CuPy path from this function.
  • max_pixels — not threaded through; the dispatcher's max_pixels guard is bypassed.

Today, open_geotiff(path, chunks=512, window=..., band=...) routes to read_geotiff_dask and discards window and band, because the dispatcher does not pass them through.
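A minimal sketch of the failure mode, using hypothetical stand-in functions rather than the real geotiff code. The names open_standin and read_dask_standin, their signatures, and the window tuple are assumptions made only to show how a dispatcher can accept a kwarg and then lose it on one routing branch without raising.

```python
# Hypothetical stand-ins: names echo the issue, bodies are illustrative only.

def read_dask_standin(path, chunks):
    # Backend with the smaller kwarg set: no window, band, or max_pixels.
    return {"path": path, "chunks": chunks}


def open_standin(path, chunks=None, window=None, band=None, max_pixels=None):
    if chunks is not None:
        # window, band, and max_pixels were accepted above but are not forwarded,
        # so they vanish here without any error or warning.
        return read_dask_standin(path, chunks)
    return {"path": path, "window": window, "band": band, "max_pixels": max_pixels}


# The window tuple layout is an assumption for this example.
result = open_standin("example.tif", chunks=512, window=(0, 0, 256, 256), band=1)

# Which of the requested kwargs never reached the backend?
dropped = [name for name in ("window", "band") if name not in result]
```

The call succeeds, which is the problem: nothing tells the caller that window and band had no effect.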

write_geotiff_gpu is missing vs to_geotiff

  • bigtiff — cannot force >4GB layout on the GPU path.
  • tiled — GPU writer assumes tiled; no way to request strips.
  • max_z_error — LERC budget is not accepted (related: the CPU path also rejects max_z_error+GPU upfront, so this gap is consistent at the dispatcher level but still asymmetric at the explicit-entry-point level).
  • streaming_buffer_bytes — GPU writer materialises the whole array on device, has no streaming concept.
  • compression_level — accepted but documented as ignored, so a setting that takes effect on CPU writes is a silent no-op on GPU writes.

to_geotiff also threads attrs like extra_tags, gdal_metadata_xml, x_resolution, y_resolution, resolution_unit through to the CPU writer when the input is a DataArray. The GPU writer reads none of them, so those metadata fields silently fail to round-trip on GPU writes.
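Gap lists like the two above could be produced mechanically by comparing signatures. The following is a sketch under assumptions: missing_kwargs, to_standin, and write_gpu_standin are hypothetical names, and the stand-in signatures are trimmed examples, not the real ones.

```python
import inspect


def missing_kwargs(dispatcher, backend, ignore=()):
    """Return dispatcher parameter names that the backend does not accept."""
    backend_params = inspect.signature(backend).parameters
    # A backend that takes **kwargs accepts any name, so report nothing.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in backend_params.values()):
        return []
    return [
        name
        for name in inspect.signature(dispatcher).parameters
        if name not in backend_params and name not in ignore
    ]


# Hypothetical, trimmed stand-ins for a dispatcher and its GPU backend.
def to_standin(data, path, gpu=False, bigtiff=None, tiled=True, compression_level=None):
    return path


def write_gpu_standin(data, path, compression_level=None):
    return path


# gpu is a routing kwarg, so it is excluded from the comparison.
gaps = missing_kwargs(to_standin, write_gpu_standin, ignore=("gpu",))
```

A check like this only sees signature-level gaps. It would not flag kwargs that are accepted but ignored (such as compression_level), nor metadata carried in DataArray attrs that the backend never reads.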

Why this matters

A user who learns the API from to_geotiff and then switches to write_geotiff_gpu for explicit GPU control loses metadata without any error or warning. The same applies when switching from open_geotiff to read_geotiff_dask.

Proposed fix

Two-step, separate PRs:

  1. Thread the missing kwargs through read_geotiff_dask. window= is doable per-chunk; band= and max_pixels= are simple guards; gpu=True enables a dask+CuPy path (probably out of scope for this issue).

  2. Wire the metadata kwargs through write_geotiff_gpu. bigtiff, tiled, extra_tags, gdal_metadata_xml, x_resolution, y_resolution, resolution_unit should all reach the underlying tile writer. streaming_buffer_bytes is a no-op on GPU (whole array on device) — accept it for API parity, document as ignored.
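The two steps above might look roughly like the sketch below. Everything here is an assumption for illustration: the helper names, the (row_start, col_start, row_stop, col_stop) window layout, and the choice to warn when streaming_buffer_bytes is passed (the issue only asks for it to be accepted and documented as ignored).

```python
import warnings


def check_max_pixels(height, width, max_pixels=None):
    """Raise if the requested region exceeds the pixel budget; None disables the guard."""
    if max_pixels is not None and height * width > max_pixels:
        raise ValueError(f"{height}x{width} exceeds max_pixels={max_pixels}")


def chunk_windows(window, chunk):
    """Split (row_start, col_start, row_stop, col_stop) into chunk-sized sub-windows."""
    r0, c0, r1, c1 = window
    return [
        (r, c, min(r + chunk, r1), min(c + chunk, c1))
        for r in range(r0, r1, chunk)
        for c in range(c0, c1, chunk)
    ]


def write_gpu_standin(data, path, streaming_buffer_bytes=None):
    """Accept streaming_buffer_bytes for API parity, but do not drop it silently."""
    if streaming_buffer_bytes is not None:
        warnings.warn(
            "streaming_buffer_bytes is ignored on the GPU path",
            UserWarning,
            stacklevel=2,
        )
    return path


# A 600x512 window read with 512-pixel chunks yields one full and one partial chunk.
windows = chunk_windows((0, 0, 600, 512), 512)
```

Each sub-window could then back one lazy chunk read. Whether a parity-only kwarg should warn or stay quiet is a design choice; a warning keeps the no-op visible, which is the behaviour this issue is asking for elsewhere.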

Found during the geotiff API consistency sweep (Cat 5, HIGH).

Labels: bug