Fix metadata propagation in reproject (#1572, #1573) #1575

Merged
brendancol merged 1 commit into main from deep-sweep-metadata-reproject-2026-05-10
May 11, 2026
@brendancol
Summary

Before this change, a (y, x, band)-shaped raster passed to geoid_height_raster produced a 2D output labelled ('x', 'band'), with the wrong shape and the wrong coords, because raster.dims[-2:] for a (y, x, band) input is ('x', 'band'). Input attrs such as crs, res, transform, _FillValue, and long_name were also dropped.

Without rioxarray installed, a raster carrying only attrs['nodatavals'] = (-9999,) would be treated by reproject as if nodata was NaN, so the -9999 pixels survived resampling unmasked while the output still carried a stale nodatavals=(-9999,) attr alongside a fresh nodata=NaN. xrspatial.resample already handles nodatavals, so the inconsistency was local to reproject.
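
To illustrate why an unmasked sentinel matters, here is a toy stand-in for resampling: a plain masked mean, not xrspatial's actual kernel. The name `masked_mean` and the pixel values are made up for this sketch.

```python
import math


def masked_mean(values, nodata):
    """Mean of values, skipping pixels that match nodata (NaN-aware).

    Hypothetical helper for illustration only; it is not part of xrspatial.
    """
    def is_nodata(v):
        if math.isnan(nodata):
            return math.isnan(v)
        return v == nodata

    valid = [v for v in values if not is_nodata(v)]
    return sum(valid) / len(valid) if valid else math.nan


# Four source pixels feeding one output pixel; -9999.0 is the sentinel.
pixels = [10.0, 12.0, -9999.0, 14.0]

# Old behaviour without rioxarray: nodata assumed to be NaN, so the
# sentinel is averaged in as if it were real data.
wrong = masked_mean(pixels, math.nan)

# Behaviour once the sentinel is detected from nodatavals.
right = masked_mean(pixels, -9999.0)

print(wrong, right)
```

With the sentinel treated as data, the single -9999 pixel dominates the result; with it masked, the mean reflects only the three valid pixels.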

Test plan

  • pytest xrspatial/tests/test_reproject.py -- 200 passing including 9 new metadata tests
  • Cross-backend probe (numpy, cupy, dask+numpy, dask+cupy) confirms attrs['nodata'] and attrs['nodatavals'] agree on output for all four paths
  • 3D (y, x, band) input to geoid_height_raster returns a 2D (y, x) array with input crs preserved
  • pytest xrspatial/tests/test_resample.py -- 169 passing (no regressions from the shared _detect_nodata change)

Found during deep-sweep / metadata pass on the reproject module.

geoid_height_raster previously dropped all input attrs and used
raster.dims[-2:] as the output dims. The latter produced garbage for
3D (y, x, band) rasters: the output came out shaped (4, 3) with dims
('x', 'band') instead of (4, 4) with dims ('y', 'x'). Use
_find_spatial_dims so the y/x axes are resolved regardless of layout,
and carry input attrs forward so crs / res / transform survive.
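
A minimal sketch of the dims problem, using plain tuples rather than a real DataArray. The (4, 4, 3) input shape is an assumption chosen to match the (4, 3) vs (4, 4) numbers above, and `find_spatial_dims` here is a hypothetical name-based lookup, not the actual `_find_spatial_dims` implementation.

```python
def find_spatial_dims(dims):
    """Return (y_dim, x_dim) by name instead of by position.

    Hypothetical stand-in for xrspatial's _find_spatial_dims; the accepted
    names below are assumptions made for this sketch.
    """
    y_names = ("y", "lat", "latitude")
    x_names = ("x", "lon", "longitude")
    y_dim = next((d for d in dims if d in y_names), None)
    x_dim = next((d for d in dims if d in x_names), None)
    if y_dim is None or x_dim is None:
        raise ValueError(f"cannot resolve spatial dims from {dims!r}")
    return y_dim, x_dim


dims = ("y", "x", "band")
shape = (4, 4, 3)  # assumed input shape
sizes = dict(zip(dims, shape))

# Positional slicing picks up the band axis for a band-last layout.
naive_dims = dims[-2:]
naive_shape = tuple(sizes[d] for d in naive_dims)

# Name-based lookup is independent of where the band axis sits.
spatial_dims = find_spatial_dims(dims)
spatial_shape = tuple(sizes[d] for d in spatial_dims)

print(naive_dims, naive_shape)      # ('x', 'band') (4, 3)
print(spatial_dims, spatial_shape)  # ('y', 'x') (4, 4)
```

The same lookup returns ('y', 'x') for a band-first ('band', 'y', 'x') layout, which is the point of resolving by name.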

reproject and merge previously ignored attrs['nodatavals'] (rasterio's
plural convention) unless rioxarray happened to be installed and its
accessor picked it up. Without rioxarray, a raster carrying only
nodatavals was treated as if nodata was NaN, so the sentinel pixels
silently survived resampling. _detect_nodata now consults nodatavals
after _FillValue / nodata / missing_value, and the output keeps the
nodatavals tuple consistent with the resolved nodata.
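
The fallback chain described above could look roughly like the sketch below. This is not the real `_detect_nodata`: the function name, the exact key order among the first three attrs, taking the first non-None entry of a `nodatavals` tuple, and the NaN default are all assumptions made for illustration.

```python
import math


def detect_nodata(attrs):
    """Resolve a nodata sentinel from a raster's attrs dict.

    Hypothetical sketch: singular keys are checked first, and the
    rasterio-style plural 'nodatavals' is only consulted as a fallback.
    """
    for key in ("_FillValue", "nodata", "missing_value"):
        value = attrs.get(key)
        if value is not None:
            return float(value)

    nodatavals = attrs.get("nodatavals")
    if isinstance(nodatavals, (tuple, list)):
        # Assumption: use the first band's sentinel that is actually set.
        for value in nodatavals:
            if value is not None:
                return float(value)
    elif nodatavals is not None:
        # A bare scalar is accepted as well.
        return float(nodatavals)

    # Nothing declared: fall back to NaN.
    return math.nan


print(detect_nodata({"nodatavals": (-9999,)}))               # -9999.0
print(detect_nodata({"nodata": 0, "nodatavals": (-9999,)}))  # 0.0
```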

Tests cover both bugs across 2D and 3D inputs and verify attrs
survive on numpy, dask, cupy, and dask+cupy backends.

github-actions bot added the performance label (PR touches performance-sensitive code) on May 11, 2026
brendancol requested a review from Copilot on May 11, 2026, 11:45
Copilot AI left a comment

Pull request overview

This PR fixes two metadata propagation issues in xrspatial.reproject: (1) geoid_height_raster now correctly identifies spatial dimensions for 3D rasters and preserves input attributes, and (2) nodata detection/propagation now honors the rasterio attrs['nodatavals'] convention and keeps nodata/nodatavals consistent after reproject()/merge().

Changes:

  • Update geoid_height_raster to resolve (y, x) dims via _find_spatial_dims, always return a 2D (y, x) output, and carry forward input attrs with units/model layered on top.
  • Extend _detect_nodata to consult attrs['nodatavals'] (tuple/scalar) when other nodata keys are absent.
  • Refresh output attrs['nodatavals'] in reproject() and merge() to match the resolved output nodata sentinel, and add regression tests for both issues.
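
As a rough sketch of the attr handling in the first and third bullets, assuming plain dicts in place of DataArray attrs. The helper names `layer_attrs` and `refresh_nodata_attrs`, the example attr values, and the choice to refresh `nodatavals` only when the input carried it are all hypothetical, not taken from the xrspatial source.

```python
import math


def layer_attrs(input_attrs, overrides):
    """Copy input attrs, then let the overrides win on key collisions."""
    out = dict(input_attrs)
    out.update(overrides)
    return out


def refresh_nodata_attrs(input_attrs, resolved_nodata):
    """Return output attrs whose nodata and nodatavals agree."""
    out = dict(input_attrs)
    out["nodata"] = resolved_nodata
    if "nodatavals" in out:
        old = out["nodatavals"]
        # Assumption: keep one entry per band, all set to the new sentinel.
        n_bands = len(old) if isinstance(old, (tuple, list)) else 1
        out["nodatavals"] = (resolved_nodata,) * n_bands
    return out


# geoid_height_raster-style layering: crs survives, units/model are replaced.
geoid_attrs = layer_attrs(
    {"crs": "EPSG:4326", "units": "feet"},
    {"units": "m", "model": "example-geoid"},  # placeholder values
)

# reproject/merge-style refresh: no stale (-9999,) next to nodata=NaN.
src_attrs = {"crs": "EPSG:4326", "nodatavals": (-9999.0,)}
out_attrs = refresh_nodata_attrs(src_attrs, math.nan)

print(geoid_attrs)
print(out_attrs)
```

Copying before mutating keeps the input raster's attrs untouched, so the caller's object is not changed as a side effect.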

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| xrspatial/tests/test_reproject.py | Adds regression tests for nodatavals handling in reproject/merge and for metadata + 3D→2D behavior in geoid_height_raster. |
| xrspatial/reproject/_vertical.py | Fixes geoid_height_raster spatial-dim resolution for 3D inputs and preserves input attrs on output. |
| xrspatial/reproject/_crs_utils.py | Adds attrs['nodatavals'] to the nodata detection fallback chain. |
| xrspatial/reproject/__init__.py | Ensures reproject()/merge() refresh attrs['nodatavals'] to match the resolved nodata. |
| .claude/sweep-metadata-state.csv | Updates internal sweep tracking metadata for the inspected module/issues. |

@brendancol brendancol merged commit c79f8de into main May 11, 2026
15 checks passed

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

  • reproject ignores attrs['nodatavals'] when rioxarray is not installed
  • geoid_height_raster drops input attrs and mishandles 3D rasters

2 participants