geotiff: split _finalize_lazy_read_attrs dtype into graph_dtype + caller_dtype #2206

@brendancol

Goal

Split the dtype parameter on _finalize_lazy_read_attrs into two so the call-site fixup in the dask backends can go away.

Context

_finalize_lazy_read_attrs (in xrspatial/geotiff/_attrs.py, added in #2177) takes one dtype argument that does two jobs:

  1. Resolved graph dtype: drives masked_nodata = bool(mask_nodata and np.dtype(dtype).kind == 'f').
  2. Caller-supplied cast attr: written as attrs['nodata_dtype_cast'] = np.dtype(dtype).name.

On the dask paths these are different values. When mask_nodata=True and the source is integer, the dask backends auto-promote the graph dtype to float64 without the caller asking for a cast. The auto-promotion is the right value for masked_nodata and the wrong value for nodata_dtype_cast.
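The mismatch can be reproduced in isolation. The sketch below is not the helper itself: `attrs_from_single_dtype` is a hypothetical stand-in that applies only the two expressions listed above, and the int16 source and float64 promotion are assumed values chosen to match the scenario described.

```python
import numpy as np


def attrs_from_single_dtype(dtype, mask_nodata):
    """Hypothetical stand-in: apply both uses of the single dtype argument."""
    return {
        # Job 1: masking is only meaningful on a float graph dtype.
        'masked_nodata': bool(mask_nodata and np.dtype(dtype).kind == 'f'),
        # Job 2: the same value is recorded as if the caller had asked for it.
        'nodata_dtype_cast': np.dtype(dtype).name,
    }


# Assumed scenario: integer source, mask_nodata=True, no dtype= from the caller.
source_dtype = np.dtype('int16')
caller_dtype = None

# The dask path auto-promotes the graph dtype so NaN can stand in for nodata.
graph_dtype = np.dtype('float64')

conflated = attrs_from_single_dtype(graph_dtype, mask_nodata=True)

# masked_nodata is correct, but nodata_dtype_cast reports a cast to float64
# that the caller never requested.
print(conflated)
```

Passing the source dtype instead would fix `nodata_dtype_cast` but turn `masked_nodata` off, which is why a single argument cannot serve both attrs here.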

#2178 worked around this by passing the resolved graph dtype to the helper and then correcting nodata_dtype_cast afterwards with a small _apply_caller_dtype_cast helper (one call per backend). The docstring on _finalize_lazy_read_attrs already flags the conflation as deferred to the migration PR.

Proposed change

Split the helper's signature:

def _finalize_lazy_read_attrs(
    *,
    geo_info,
    nodata,
    mask_nodata,
    graph_dtype,        # was: dtype; drives masked_nodata
    caller_dtype=None,  # new; drives nodata_dtype_cast
    window,
    allow_rotated=False,
    allow_unparseable_crs=False,
    attrs_in=None,
) -> dict:
    ...

graph_dtype is the resolved dask graph dtype (the value the helper currently derives masked_nodata from). caller_dtype is the caller's dtype= kwarg passed through verbatim, or None when the caller did not pass one.

After the split:

  • _apply_caller_dtype_cast becomes dead code; remove it.
  • Both dask backends drop the post-helper fixup; the helper handles both attrs in one place.

Out of scope

The eager backends (#2179 wave 2 sibling, #2180 wave 3) do not have the auto-promotion issue because masking is folded into a single eager step. They would pass caller_dtype=dtype and graph_dtype=resolved_dtype symmetrically.

Files

Acceptance criteria

  • _finalize_lazy_read_attrs takes graph_dtype and caller_dtype in place of the single dtype parameter.
  • The dask backends call the helper once with no post-call fixup.
  • Existing tests still pass.
  • test_lazy_finalization_parity_2162.py still passes unchanged.

Metadata

Labels

enhancement: New feature or request
