Consolidate GeoTIFF public API on open_geotiff / to_geotiff#2962
Conversation
…#2960) Consolidate the public read/write surface on open_geotiff and to_geotiff. The four data backends (read_geotiff_dask/gpu, read_vrt, write_geotiff_gpu) become private underscore names; the dispatchers route to them from the existing gpu=/chunks=/.vrt kwargs. write_vrt (the file-list VRT mosaic builder) stays public, renamed build_vrt, since it has no DataArray to write and cannot fold into to_geotiff(data, path). Update __all__, the module Public API docstring, and all source + test call sites.
New test_api_consolidation.py pins the consolidated public surface (open_geotiff/to_geotiff/build_vrt), asserts the five backend names are gone from the package namespace, and checks each dispatcher still reaches its private backend. Notebook now calls build_vrt and describes the GPU and VRT paths through the dispatchers.
Reference docs now document open_geotiff/to_geotiff/build_vrt as the public surface; the backend functions are described as private with the dispatcher kwargs that reach them. README feature matrix drops the write_geotiff_gpu row (GPU write is to_geotiff(gpu=True)) and renames write_vrt to build_vrt.
The leading underscore shifted call/def parens by one column, so visual-indent continuation lines (mostly the backend signatures) and two closing brackets were under-indented. Realign with autopep8 (E124/E128), silence the re-export-only _write_geotiff_gpu F401, and rewrap the grown import in test_signature_contract.py.
brendancol
left a comment
There was a problem hiding this comment.
PR Review: Consolidate GeoTIFF public API on open_geotiff / to_geotiff
Reviewed in a worktree on the PR head branch. This is a mechanical rename plus contract change, so the correctness/algorithm/performance checks are mostly N/A; I focused on dispatch wiring, caller completeness, the internal-vs-public split, and test/doc coverage.
Blockers
None.
Suggestions
None blocking. The change is well scoped and the diff is balanced (227 insertions / 227 deletions, all renames and continuation-line realignment).
Nits
- The module docstring (
xrspatial/geotiff/__init__.py) says to import the private backends "from their backend modules." They are also bound on the package (xrspatial.geotiff._read_geotiff_gpu, etc.), which the new contract test and the existinggpu/test_writer.pymonkeypatch rely on. Worth a half sentence so readers know both paths resolve. Optional. - Two pre-existing flake8 issues remain in
tests/unit/test_metadata.py(E501 line 31, F841 line 1446). The PR already calls these out as pre-existing; leaving them is the right call for a focused diff.
What looks good
- Dispatch wiring verified:
open_geotiffroutes to_read_vrt/_read_geotiff_gpu/_read_geotiff_dask(__init__.py:929/952/970);to_geotiffroutes to_write_geotiff_gpu(_writers/eager.py:730) and_write_vrt_tiledfor.vrtoutput. - The internal low-level
_vrt.read_vrt/_vrt.write_vrtkeep their names; the rename touched only the public_backends/_writerswrappers. The aliased internal imports (read_vrt as _read_vrt_internal,write_vrt as _write_vrt_internal) are intact. build_vrtis the same length aswrite_vrt, so no signature drift; the deprecatedcrs_wkt/vrt_pathalias handling is preserved.- New
tests/parity/test_api_consolidation.pypins the surface, asserts the five old names are gone from the package namespace, and checks each dispatcher reaches its private backend (GPU cases gated on CuPy). - Full geotiff suite green, GPU paths included: 6104 passed, 81 skipped, 1 xfailed.
Checklist
- No algorithm change (mechanical rename); dispatch and outputs unchanged
- All four backends still reachable through the dispatchers (kwargs forwarded)
- Internal
_vrtreader/writer untouched; public/internal split correct - Edge cases covered by the existing suite plus the new contract test
- No premature materialization or copies introduced
- No new benchmark needed (no perf-path change)
- README feature matrix updated (build_vrt row; write_geotiff_gpu row dropped)
- Docstrings / reference docs / user guide / notebook updated
Note that the private backends are bound on the package namespace (xrspatial.geotiff._read_geotiff_gpu) as well as importable from their backend modules, since the contract test and the GPU writer test monkeypatch the package binding.
brendancol
left a comment
There was a problem hiding this comment.
Follow-up review
Addressed the one actionable item from the previous pass.
- Docstring nit (fixed in d22f4f0): the module docstring now states the private backends are bound on the package namespace (
xrspatial.geotiff._read_geotiff_gpu) as well as importable from their backend modules. - Pre-existing flake8 in
tests/unit/test_metadata.py(E501 line 31, F841 line 1446): dismissed. These exist onorigin/main, are unrelated to this change, and are not in lines this PR touches; fixing them would widen the diff for no reason.
Re-verified after the follow-up commit: import xrspatial.geotiff resolves, __all__ is {build_vrt, open_geotiff, to_geotiff} plus error classes and status constants, and flake8 xrspatial/geotiff/__init__.py is clean. No open blockers or suggestions remain.
Closes #2960.
What changed
The geotiff public read/write surface is now the two dispatchers plus one mosaic helper:
open_geotiff-- read; backend chosen bygpu=/chunks=/.vrtextensionto_geotiff-- write; backend chosen bygpu=/ CuPy data /.vrtpathbuild_vrt-- mosaic a list of existing GeoTIFF files into a VRT (renamed fromwrite_vrt; kept public because it has no DataArray to write and cannot fold intoto_geotiff(data, path))The four data backends become private:
read_geotiff_dask->_read_geotiff_dask,read_geotiff_gpu->_read_geotiff_gpu,read_vrt->_read_vrt,write_geotiff_gpu->_write_geotiff_gpu. They stay importable from their backend modules for callers that want to bypass auto-dispatch. The dispatchers already forwarded every backend kwarg (missing_sources,band_nodata,on_gpu_failure, ...), so no read or write capability is lost.This is a clean break with no back-compat shims, consistent with a pre-1.0 contract change.
Why
Seven entry points gave users several ways to do the same read or write. The dispatchers already selected the backend from their parameters, so the backend-named functions duplicated a control surface that already existed.
Backend coverage
No behavior or numerical change on any path. numpy, dask+numpy, cupy, and dask+cupy all exercised by the existing suite (CuPy present in the dev env, so the GPU paths ran). The rename is mechanical; dispatch and outputs are unchanged.
Notes
__all__drops tobuild_vrt,open_geotiff,to_geotiff(plus the error classes and status constants).tests/parity/test_api_consolidation.pypins the public surface, asserts the five old names are gone from the package namespace, and checks each dispatcher still reaches its private backend.CHANGELOG.md(updated at release time). Two pre-existing flake8 issues intests/unit/test_metadata.py(unrelated to this change) were left as-is.Test plan
pytest xrspatial/geotiff/tests-- 6104 passed, 81 skipped, 1 xfailedflake8clean on changed files except the two pre-existing test_metadata.py issues