Consolidate GeoTIFF public API on open_geotiff / to_geotiff #2960

@brendancol

Reason or Problem

The xrspatial.geotiff public API currently exposes seven read/write entry points: the two dispatchers open_geotiff and to_geotiff, plus five backend-specific functions (read_geotiff_dask, read_geotiff_gpu, read_vrt, write_geotiff_gpu, write_vrt). A new user faces several ways to do the same read or write, and the contract is larger than it needs to be ahead of 1.0.

The dispatchers already pick the backend from their parameters. open_geotiff routes to the GPU reader on gpu=True, to the dask reader when chunks= is given, and to the VRT reader on a .vrt extension. to_geotiff routes to the GPU writer when gpu= is set or the data is CuPy-backed, and to the VRT writer on a .vrt path. Both already forward every backend-specific kwarg (missing_sources, band_nodata, on_gpu_failure, and so on), so the backend-named functions duplicate a control surface the dispatchers already provide.
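
The read-side routing described above can be sketched as a small pure function. This is an illustration only, not the xrspatial implementation: pick_read_backend is a hypothetical name, and the precedence used when several selectors are set at once (.vrt first, then gpu, then chunks) is an assumption that this issue does not specify.

```python
from os.path import splitext


def pick_read_backend(path, gpu=False, chunks=None):
    """Return the name of the backend a read dispatcher might choose.

    Hypothetical helper: the precedence below is assumed for the sake
    of the example and is not taken from xrspatial.
    """
    # A .vrt extension selects the VRT reader.
    if splitext(str(path))[1].lower() == ".vrt":
        return "vrt"
    # gpu=True selects the GPU reader.
    if gpu:
        return "gpu"
    # Any chunks value selects the dask reader.
    if chunks is not None:
        return "dask"
    # Otherwise fall back to the eager numpy reader.
    return "numpy"
```

The point of the sketch is that the caller never names a backend; the existing parameters carry that choice.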

Proposal

Shrink the public read/write surface to the two dispatchers and select the backend through their parameters. Make the four data-backend functions private and keep one public helper for the VRT-mosaic case that does not fit the dispatcher signature.

Design:

Rename the four data backends to leading-underscore names and keep them importable but out of __all__:

  • read_geotiff_dask -> _read_geotiff_dask
  • read_geotiff_gpu -> _read_geotiff_gpu
  • read_vrt -> _read_vrt
  • write_geotiff_gpu -> _write_geotiff_gpu

write_vrt builds a VRT mosaic from a list of existing GeoTIFF files. It has no DataArray to write, so it cannot fold into to_geotiff(data, path). Keep it public, renamed to build_vrt.

The dispatchers already call these functions internally, so wiring is a matter of updating the imports and call sites in __init__.py and _writers/eager.py. Backend selection stays implicit through the existing selectors (gpu=, chunks=, .vrt extension), and no new backend= parameter is added. Because the dispatchers already forward every backend kwarg, no read or write capability is lost.

This is a clean break with no back-compat shims, which suits a pre-1.0 contract change.

Usage:

Read, choosing the backend by parameter:

import xrspatial.geotiff as g

da = g.open_geotiff("dem.tif")               # eager numpy
da = g.open_geotiff("dem.tif", chunks=512)   # dask
da = g.open_geotiff("dem.tif", gpu=True)     # GPU
da = g.open_geotiff("mosaic.vrt")            # VRT

Write the same way:

g.to_geotiff(da, "out.tif")               # CPU
g.to_geotiff(da, "out.tif", gpu=True)     # GPU
g.to_geotiff(da, "tiled.vrt")             # tile + VRT index
g.build_vrt("mosaic.vrt", ["a.tif", "b.tif"])   # mosaic existing files

Value: One read entry point and one write entry point, with the backend chosen by a parameter. The public surface drops from seven functions to three.

Stakeholders and Impacts

Anyone importing the backend-named functions directly. The change touches the geotiff source, the test suite (call sites move to the underscore names), the reference docs, and the README feature matrix. The dispatch behavior and numerical output do not change.

Drawbacks

Direct callers of the old names must update their imports. There is no deprecation period, so the break is immediate. This is acceptable pre-1.0 and the rename is mechanical.
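
To illustrate why the rename is mechanical, here is a rough sketch of a text rewrite over caller source. The mapping is taken from this proposal; rename_calls is a hypothetical helper, it only rewrites names (not arguments), and it would also touch matches inside strings and comments, so its output still needs review.

```python
import re

# Old public name -> new name, taken from the proposal.
RENAMES = {
    "read_geotiff_dask": "_read_geotiff_dask",
    "read_geotiff_gpu": "_read_geotiff_gpu",
    "read_vrt": "_read_vrt",
    "write_geotiff_gpu": "_write_geotiff_gpu",
    "write_vrt": "build_vrt",
}

# \b keeps already-renamed identifiers (e.g. _read_vrt) from matching again,
# because "_" counts as a word character and so no boundary precedes "read".
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def rename_calls(text):
    """Rewrite old backend names in a source string (hypothetical helper)."""
    return _PATTERN.sub(lambda m: RENAMES[m.group(1)], text)
```

Running the rewrite twice gives the same result as running it once, so it is safe to re-apply across source, tests, and notebook code.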

Alternatives

  • Soft-deprecate with DeprecationWarning wrappers instead of an immediate rename. Rejected to avoid carrying shims into 1.0.
  • Add an explicit backend= parameter. Rejected as a second way to say what gpu=/chunks=/.vrt already say.
  • Fold write_vrt into to_geotiff by accepting a file list as the first argument. Rejected because it mixes two unrelated operations in one signature.
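
For context, the rejected soft-deprecation alternative would mean keeping one wrapper like the following per old name. This is a generic sketch of the pattern, not code from xrspatial: deprecated_alias is a hypothetical helper and the _read_vrt body is a stand-in.

```python
import functools
import warnings


def deprecated_alias(new_func, old_name):
    """Wrap new_func so calls through old_name emit a DeprecationWarning."""

    @functools.wraps(new_func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            f"{old_name} is deprecated; use open_geotiff / to_geotiff instead",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not the shim
        )
        return new_func(*args, **kwargs)

    # Report the old public name rather than the private one.
    wrapper.__name__ = old_name
    return wrapper


def _read_vrt(path):
    # Stand-in body; the real reader is not reproduced here.
    return ("vrt", path)


# The shim that would have to live on in the public namespace.
read_vrt = deprecated_alias(_read_vrt, "read_vrt")
```

Each such shim is another public name to document and test, which is the carrying cost the proposal avoids.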

Unresolved Questions

The public mosaic helper name: build_vrt versus keeping write_vrt or using to_vrt.

Additional Notes or Context

Scope for the implementation PR: the contract change, dispatch wiring, docs and README updates, contract and dispatch-parity tests, and a mechanical caller rename across source, tests, and stored notebook code. Out of scope: rewriting notebook narratives or doc prose beyond the name and contract updates, and rewriting test bodies to call the dispatchers (tests keep exercising the backends directly under the new private names).

Metadata

Labels: api (API design and consistency), geotiff (GeoTIFF module), proposal (Idea that needs design discussion)
