resample: preserve scalar non-dim coords and refresh stale nodata attr (#2542)#2549
Merged
Merged
Conversation
#2542) The 2D non-identity path in `resample()` rebuilt the output `xr.DataArray` with only the spatial dim-coords, silently dropping any scalar non-dim coord on the input. rioxarray stores CRS metadata on a `spatial_ref` scalar coord, so chained `out.rio.crs` calls lost their CRS even though `attrs['crs']` round-tripped fine. The identity path (`scale_factor=1.0`, `agg.copy()`) and the 3D path (per-band `xr.concat`) both happened to preserve the coord, leaving the 2D path as the inconsistent outlier (Cat 5 backend / path-inconsistent metadata). Separately, `_resolve_nodata()` reads `attrs['nodata']` as a sentinel fallback after `_FillValue`, but the output post-processing only refreshed `_FillValue` and `nodatavals`. Inputs declaring `attrs['nodata'] = -9999` came back with that same finite value alongside data that was now NaN, mismatching any downstream consumer that trusts the attr (Cat 4 dtype/nodata semantics). Fixes both gaps in one place at the final `xr.DataArray` constructor: refresh `attrs['nodata']` to NaN when present, and fold any zero-dim non-dim coord into the output coords dict. Spatially-shaped non-dim coords (whose length is tied to the input resolution) intentionally do not carry; the user would need to resample them separately. Backend coverage: numpy / cupy / dask+numpy / dask+cupy all route through the same constructor and were verified end-to-end for the spatial_ref carry. Nodata-attr refresh was verified on numpy / cupy / dask+numpy; dask+cupy with a non-NaN nodata sentinel hits a pre-existing xr.where + cupy.astype quirk unrelated to this audit. 7 new tests in TestMetadataPropagation cover nodata-attr refresh, spatial_ref / scalar selector coord carry, identity-vs-downsample coord parity, and the intentional drop of spatially-shaped extras. Closes #2542
Contributor
Author
PR Review: preserve scalar non-dim coords and refresh stale nodata attrRead source ( BlockersNone. Suggestions
Nits
What looks good
Checklist
|
- Add test_spatial_ref_preserved_3d: pin the 3D per-band -> xr.concat path, which inherits the carry-across transitively via the per-band 2D resample() call. - Add test_spatial_ref_preserved_cupy / _dask: pin the 4-backend parity claim from the PR body so a backend-divergent regression would surface (the fix lives at the shared 2D xr.DataArray constructor, but the contract is now explicitly tested). - Drop redundant int() casts on scalar coord assertions; np scalar equality is one less coercion to read. 178 passed (was 175 + 3 new).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
resample()rebuilt its outputxr.DataArraywith only the spatial dim-coords, silently dropping any scalar non-dim coord on the input (e.g. rioxarray'sspatial_ref, or a squeezedtime/bandselector). The identity path (scale_factor=1.0) and the 3D per-band path both kept these coords, so the 2D path was the inconsistent outlier (Cat 5)._resolve_nodata()readsattrs['nodata']as a sentinel fallback, but the output post-processing only refreshed_FillValueandnodatavals. Inputs declaringattrs['nodata'] = -9999came back with that same finite value alongside data that was now NaN (Cat 4).xr.DataArrayconstructor. Spatially-shaped non-dim coords intentionally do not carry across; their length is tied to the input resolution.Backend coverage
numpy / cupy / dask+numpy / dask+cupy all route through the same constructor. Verified end-to-end for the
spatial_refcarry across all four backends. Nodata-attr refresh verified on numpy / cupy / dask+numpy; dask+cupy with a non-NaN sentinel hits a pre-existingxr.where+cupy.astypequirk unrelated to this audit.Test plan
xrspatial/tests/test_resample.py(168 existing + 7 new)TestMetadataPropagationcases: nodata-attr refresh, spatial_ref carry, scalar band/time carry, identity-vs-downsample coord parity, intentional drop of spatially-shaped extrasCloses #2542