Skip to content

GeoTIFF: migrate dask backends to shared lazy finalization helper (PR C of #2162) #2178

@brendancol

Description

@brendancol

Parent: #2162
Depends on: #2177 (adds the lazy helper)

Goal

Replace duplicated finalization in the two dask paths with _finalize_lazy_read_attrs from #2177.

Scope

Two call sites, each about 25 lines of nearly identical validate then populate then set-nodata-with-None code:

  1. xrspatial/geotiff/_backends/dask.py:359-386 — CPU dask finalization in read_geotiff_dask.
  2. xrspatial/geotiff/_backends/gpu.py:1530-1553 — GPU+dask finalization in the dask branch of read_geotiff_gpu.

Both sites construct attrs, call _validate_read_geo_info, call _populate_attrs_from_geo_info, and call _set_nodata_attrs with pixels_present=None. After the migration each site is one call to _finalize_lazy_read_attrs(...) followed by the backend-specific dask graph assembly.

Preserve all current behavior:

  • attrs['nodata_pixels_present'] remains absent (None) in lazy outputs.
  • attrs['nodata_dtype_cast'] is recorded when the caller forced a cast.
  • MinIsWhite inversion (at dask.py:227-238) still preempts the sentinel before the chunk tasks see it.

Tests

Add xrspatial/geotiff/tests/test_lazy_finalization_parity_2162.py. Parametrize over dask+numpy and dask+cupy. Assert:

  • Both backends produce the same attrs dict (modulo array backend markers) for matched inputs.
  • nodata_pixels_present is absent in both.
  • nodata_dtype_cast matches across both when the caller forces a cast.
  • georef_status matches across both for the five georef states (full, transform_only, crs_only, none, rotated_dropped).

Skip GPU tests when CUDA is unavailable.

Files

  • xrspatial/geotiff/_backends/dask.py (rewrite 359-386 to call _finalize_lazy_read_attrs)
  • xrspatial/geotiff/_backends/gpu.py (rewrite 1530-1553 to call _finalize_lazy_read_attrs)
  • xrspatial/geotiff/tests/test_lazy_finalization_parity_2162.py (new)

Constraints

  • No public API change.
  • Existing dask tests must still pass: test_nodata_lifecycle_attrs_2135.py, test_nodata_semantics_split_1988.py, test_georef_status_2136.py.
  • Lazy contract holds: no .compute() triggered by finalization.

Out of scope

  • Eager backends (sibling PR D in wave 2).
  • VRT (PR E in wave 3).

Metadata

Metadata

Assignees

No one assigned

    Labels

    apiAPI design and consistencybackend-coverageAdding missing dask/cupy/dask+cupy backend supportenhancementNew feature or request

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions