Fix COG overview poisoned by nodata sentinel (#1613)#1618
Merged
Conversation
The NaN-to-sentinel rewrite in to_geotiff and write_geotiff_gpu ran before _make_overview / make_overview_gpu, so np.nanmean and the GPU counterparts saw the sentinel as a finite value and biased every overview pixel. A raster with NaN pixels and nodata=-9999 produced overview cells like -4998.75 where the correct nan-aware mean was 1.5. Thread a nodata kwarg through the reducers so they mask the sentinel back to NaN before aggregating. The writer's overview loop passes nodata in, then rewrites any all-sentinel cells (which surface as NaN from the reducer) back to the sentinel for the on-disk pyramid. CPU and GPU paths both fixed. New regression tests cover mean / min / max / median, partial-NaN blocks, all-NaN blocks, integer dtype passthrough, and CPU-GPU agreement.
Contributor
There was a problem hiding this comment.
Pull request overview
Fixes COG overview generation when nodata=<finite> is used on float rasters containing NaNs by ensuring overview reducers ignore the nodata sentinel (CPU + GPU paths), and adds a focused regression test suite for issue #1613.
Changes:
- Thread
nodatathrough CPU overview helpers (_block_reduce_2d,_make_overview) and mask sentinel back to NaN during reduction. - Mirror the same
nodata-aware masking behavior in GPU overview helpers (_block_reduce_2d_gpu,make_overview_gpu) and in the GPU COG overview loop. - Add regression tests covering CPU/GPU behavior and direct helper-level reduction semantics for issue #1613.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
xrspatial/geotiff/_writer.py |
Adds nodata-aware masking in CPU overview reducers and rewrites NaNs back to sentinel in the overview loop. |
xrspatial/geotiff/_gpu_decode.py |
Adds nodata-aware masking in GPU overview reducers and threads nodata through make_overview_gpu. |
xrspatial/geotiff/__init__.py |
Passes nodata into GPU overview generation and rewrites reducer-produced NaNs back to the sentinel per level. |
xrspatial/geotiff/tests/test_cog_overview_nodata_1613.py |
New regression tests validating correct overview values with nodata sentinels on CPU + GPU. |
.claude/sweep-accuracy-state.csv |
Updates internal accuracy tracking notes to record the #1613 fix. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+193
to
+196
| # and poisons the overview (issue #1613). | ||
| if (nodata is not None | ||
| and not np.isnan(nodata) | ||
| and np.isfinite(nodata)): |
Comment on lines
+1155
to
+1158
| # all-sentinel comes back as NaN; ``_write_tiled`` / ``_write_stripped`` | ||
| # serialise that NaN to disk, where the eager reader will mask | ||
| # it (and a future writer pass could rewrite to ``nodata`` for | ||
| # external readers -- out of scope for this fix). |
Comment on lines
+215
to
+217
| # nanmean / nanmin / nanmax / nanmedian raise warnings on all-nan | ||
| # blocks; ``np.errstate`` would silence them but the resulting NaN is | ||
| # the desired output so we leave the warning visible. |
Comment on lines
+2926
to
+2929
| # honour it as missing-data (issue #1613). | ||
| if (nodata is not None | ||
| and not np.isnan(nodata) | ||
| and np.isfinite(nodata)): |
- Drop redundant np.isfinite gate in _block_reduce_2d (CPU + GPU) so nodata=+/-inf is masked back to NaN like a finite sentinel, matching the upstream NaN->sentinel rewrite gate (`not np.isnan(nodata)` used at _writer.py:1171,1525,1620). - Suppress RuntimeWarning from nanmean/nanmin/nanmax/nanmedian on all-NaN blocks locally; the all-NaN output is the desired signal that the overview loop rewrites to the sentinel, so the warning was noise on every nodata-border COG write. - Fix the comment above the overview loop: NaN from an all-sentinel reduction is rewritten back to the sentinel before _write_tiled / _write_stripped runs, not serialised as NaN. - Add regression tests covering nodata=inf (CPU + GPU) and the warning-suppression contract for all-NaN blocks.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
to_geotiff(..., cog=True, nodata=<finite>)produced corrupted overview pyramids because the NaN-to-sentinel rewrite ran before_make_overview/make_overview_gpu.np.nanmean(and the GPU equivalents) then saw the sentinel as a real value and biased every reduced pixel toward it.nodatathrough_block_reduce_2d,_block_reduce_2d_gpu,_make_overview, andmake_overview_gpuso each reducer masks the sentinel back to NaN before aggregating. The writer's overview loop rewrites any all-sentinel reductions (NaN from the reducer) back to the sentinel for the on-disk pyramid.Reproduction (before fix)
After fix:
[[1.5, 3.5], [15.0, 35.0]](matchesnp.nanmeanon the original NaN-keyed data).Test plan
test_cog_overview_nodata_1613.py(11 tests, 8 CPU + 3 GPU): mean / min / max / median ignore the sentinel, partial-NaN blocks reduce correctly, all-NaN blocks reduce to NaN then get rewritten to the sentinel, integer dtype passthrough unchanged, CPU and GPU produce identical pyramids.TestOverviewResamplingsuite still passes (12 tests).nodata/overview/cogtests in the geotiff suite pass.