Skip to content

Fix COG cubic overview poisoned by nodata sentinel (#1623)#1628

Merged
brendancol merged 2 commits into
mainfrom
deep-sweep-accuracy-geotiff-2026-05-11-run2
May 11, 2026
Merged

Fix COG cubic overview poisoned by nodata sentinel (#1623)#1628
brendancol merged 2 commits into
mainfrom
deep-sweep-accuracy-geotiff-2026-05-11-run2

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Summary

Closes #1623.

to_geotiff(cog=True, overview_resampling='cubic', nodata=<finite>) on a float raster with NaN regions produced overview pixels with bad ringing near nodata borders. Same root cause as #1613, but for the cubic branch: the writer rewrote NaN to the sentinel before reduction, then _block_reduce_2d(method='cubic') ignored nodata and handed the sentinel-poisoned array to scipy.ndimage.zoom(order=3). The cubic spline blended the sentinel into neighbouring cells (values like 1133.44 and -10290.08 appeared where the source data was a constant 100).

Fix:

  • CPU _block_reduce_2d cubic branch: when nodata is supplied on a float dtype and the sentinel is found, mask sentinel to NaN, run cubic with prefilter=False so a single NaN cannot poison the whole row/column, then rewrite any NaN in the result back to the sentinel. prefilter=False only fires when a sentinel is present, so the non-nodata cubic semantics are unchanged.
  • GPU _block_reduce_2d_gpu: previously raised ValueError on method='cubic'. Added a CPU fallback (same pattern as 'mode') so the GPU writer produces byte-equivalent overviews to the CPU writer. GPU_OVERVIEW_METHODS now includes 'cubic'.

12 regression tests in test_cog_cubic_overview_nodata_1623.py cover the helper, the round trip, the no-nodata regression guard, +/-inf nodata, the NaN-sentinel no-op, the GPU fallback, and CPU/GPU byte-equivalence.

Test plan

  • pytest xrspatial/geotiff/tests/test_cog_cubic_overview_nodata_1623.py (12 pass, including 3 GPU-gated)
  • pytest xrspatial/geotiff/tests/test_cog_overview_nodata_1613.py (14 pass, the prior nodata-overview fix)
  • pytest xrspatial/geotiff/tests/test_cog.py xrspatial/geotiff/tests/test_overview_filter.py xrspatial/geotiff/tests/test_streaming_codecs_2026_05_11.py (44 pass, existing cubic round-trip + cubic-distinct-from-mean)
  • Full xrspatial/geotiff/tests/ run: 1256 pass, 7 skipped, 3 pre-existing matplotlib-deepcopy failures in TestPalette (unrelated)

State file .claude/sweep-accuracy-state.csv updated with the pass-16 record for the geotiff module.

`to_geotiff(cog=True, overview_resampling='cubic', nodata=<finite>)` on a
float raster with NaN regions produced overview pixels with severe
ringing near nodata borders. Same class of bug as #1613, but for the
`cubic` branch: the writer rewrites NaN to the finite sentinel before
reduction, then `_block_reduce_2d(method='cubic')` handed the
sentinel-poisoned array straight to `scipy.ndimage.zoom(order=3)`. The
cubic spline blended the sentinel into neighbouring cells (values like
1133.44 and -10290.08 appeared where the source data was a constant
100).

Fix: when `nodata` is supplied on a float dtype and the sentinel is
present in the input, mask sentinel to NaN, run cubic with
`prefilter=False` so a single NaN does not poison the entire row/column
(the default B-spline prefilter is global), then rewrite any NaN in the
result back to the sentinel. `prefilter=False` only fires when a
sentinel is actually found in the input, so the default cubic semantics
still apply to inputs without nodata.

GPU side: `_block_reduce_2d_gpu` previously raised `ValueError` on
`method='cubic'`. Added a CPU fallback the same way the helper already
handles `mode`, so the GPU writer path produces byte-equivalent
overviews to the CPU writer when cubic + nodata are combined.
`GPU_OVERVIEW_METHODS` now includes `'cubic'`.

12 regression tests in `test_cog_cubic_overview_nodata_1623.py`:
helper no-ringing, poisoning repro, no-nodata path unchanged, end-to-end
COG round-trip, GPU fallback, CPU/GPU byte-match, +/-inf nodata
masking, NaN-sentinel no-op, and the `GPU_OVERVIEW_METHODS` contract.
All 1256 existing geotiff tests still pass.

Found by deep-sweep-accuracy 2026-05-11. State CSV updated with the
pass-16 record.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 11, 2026
@brendancol brendancol requested a review from Copilot May 11, 2026 20:51
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request fixes incorrect COG overview pixels produced when using cubic resampling with a finite nodata sentinel on float rasters that contain NaN regions (issue #1623). It ensures both CPU and GPU writer paths avoid “sentinel ringing” artifacts near nodata borders and remain consistent with each other.

Changes:

  • Update CPU _block_reduce_2d(..., method='cubic') to honor nodata for float arrays by masking sentinel→NaN, running cubic with prefilter=False, then rewriting NaNs back to the sentinel.
  • Add a GPU cubic overview implementation via CPU fallback and include 'cubic' in GPU_OVERVIEW_METHODS.
  • Add a dedicated regression test module covering CPU helper behavior, end-to-end COG round trips, edge cases (+/-inf, NaN sentinel), and CPU/GPU equivalence.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
xrspatial/geotiff/_writer.py Implements nodata-aware cubic downsampling to prevent sentinel contamination during SciPy cubic zoom.
xrspatial/geotiff/_gpu_decode.py Adds 'cubic' to GPU overview methods and routes cubic overviews through a CPU fallback for parity.
xrspatial/geotiff/tests/test_cog_cubic_overview_nodata_1623.py Adds regression coverage for cubic+nodatavalue behavior and CPU/GPU equivalence.
.claude/sweep-accuracy-state.csv Updates sweep/inspection state record for the geotiff module.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread xrspatial/geotiff/_writer.py Outdated
Comment on lines +146 to +155
for ``nearest`` and ``mode`` methods. The ``cubic`` branch honours
``nodata`` by masking the sentinel to NaN, running cubic with
``prefilter=False`` to keep the kernel local, and rewriting any
NaN in the output back to the sentinel (issue #1623).
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot apply changes based on this feedback

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in 7b81662: the _block_reduce_2d docstring now scopes the NaN-preservation wording to the nan-aware reducers and explicitly notes that the cubic path rewrites NaNs back to the sentinel before returning.

Agent-Logs-Url: https://github.com/xarray-contrib/xarray-spatial/sessions/97661048-d1c0-4605-8108-e34347d42061

Co-authored-by: brendancol <433221+brendancol@users.noreply.github.com>
@brendancol brendancol merged commit c294578 into main May 11, 2026
1 check was pending
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

COG cubic overview poisoned by nodata sentinel

3 participants