Apply LERC valid-mask in GPU decode path (depends on #1529) #1535
Merged
brendancol merged 2 commits into xarray-contrib:main on May 9, 2026
@copilot review
Pull request overview
This PR aligns GPU GeoTIFF LERC decoding with the CPU reader by preserving and applying LERC’s per-pixel valid-mask so masked pixels are restored to the file’s nodata value (or NaN fallback for floating dtypes), instead of leaking LERC’s zero-fill into outputs.
Changes:
- Plumb LERC valid-mask through the GPU tile decode path and apply a post-assembly masked fill on-device.
- Add a shared `_resolve_masked_fill()` helper in the CPU reader and use it from `read_geotiff_gpu` to match CPU nodata semantics for LERC.
- Add CPU and GPU test coverage for LERC masked pixels across NaN nodata, float sentinel nodata, uint16 sentinel nodata, and no-mask round-trips.
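The fill-resolution rule described above (use the file's nodata value, fall back to NaN for floating dtypes) can be illustrated with a small sketch. The function name `resolve_masked_fill_sketch` is hypothetical and this is not the library's `_resolve_masked_fill()`. In particular, returning `None` for an integer dtype with no usable nodata is an assumption made for illustration.

```python
import numpy as np


def resolve_masked_fill_sketch(nodata_str, dtype):
    """Hypothetical stand-in for _resolve_masked_fill; not the library code."""
    dtype = np.dtype(dtype)
    value = None
    if nodata_str is not None:
        try:
            value = float(nodata_str)
        except ValueError:
            value = None  # unparseable nodata string: treat as absent

    if dtype.kind == "f":
        # Floats: use the parsed nodata, else fall back to NaN.
        return dtype.type(np.nan if value is None else value)

    if dtype.kind in "iu" and value is not None:
        # Integers: accept only a finite, integral, in-range sentinel.
        if np.isfinite(value) and value == int(value):
            info = np.iinfo(dtype)
            if info.min <= int(value) <= info.max:
                return dtype.type(int(value))

    # Assumption: no fill is applied when nothing usable is available.
    return None
```

Under these assumptions, `resolve_masked_fill_sketch("65535", np.uint16)` yields the uint16 sentinel, while a float32 file without nodata resolves to NaN.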
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `xrspatial/geotiff/_gpu_decode.py` | Capture per-tile LERC valid masks and apply masked-fill after GPU tile assembly. |
| `xrspatial/geotiff/_reader.py` | Add `_resolve_masked_fill()` and apply LERC valid-mask during CPU decode for tiles/strips/COG HTTP reads. |
| `xrspatial/geotiff/_compression.py` | Introduce `lerc_decompress_with_mask()` and keep `lerc_decompress()` backward compatible (drops mask). |
| `xrspatial/geotiff/__init__.py` | Pass `masked_fill` into GPU tile decode when compression is LERC. |
| `xrspatial/geotiff/tests/test_lerc_valid_mask.py` | New CPU tests validating wrapper behavior and TIFF round-trips with/without masks. |
| `xrspatial/geotiff/tests/test_lerc_valid_mask_gpu.py` | New GPU tests ensuring `read_geotiff_gpu` matches CPU output for masked LERC pixels. |
Comment on lines +2008 to +2015

        if not invalid.any():
            return out

        d_invalid = cupy.asarray(invalid)
        if out.ndim == 3:
            # Broadcast (H, W) mask across the sample axis.
            out[d_invalid, ...] = out.dtype.type(masked_fill)
        else:
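The indexing pattern in the excerpt can be tried on the host with NumPy, whose boolean indexing CuPy mirrors. This is a minimal sketch, not the PR's code: `apply_fill_sketch` is a hypothetical name, the 2-D branch is an assumption (the excerpt is cut off at `else:`), and the layout is assumed to be `(H, W, samples)`.

```python
import numpy as np


def apply_fill_sketch(out, invalid, masked_fill):
    """Write masked_fill into every position flagged in the (H, W) mask."""
    if not invalid.any():
        return out  # nothing masked: leave the array untouched

    fill = out.dtype.type(masked_fill)
    if out.ndim == 3:
        # A 2-D boolean mask indexes the two leading axes; the trailing
        # Ellipsis keeps the sample axis, so all samples of a pixel are set.
        out[invalid, ...] = fill
    else:
        # Assumed 2-D case: plain boolean-mask assignment.
        out[invalid] = fill
    return out


pixels = np.zeros((2, 3, 2), dtype=np.float32)
mask = np.zeros((2, 3), dtype=bool)
mask[0, 1] = True
apply_fill_sketch(pixels, mask, np.nan)  # both samples at (0, 1) become NaN
```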
The CPU LERC reader from xarray-contrib#1529 honours the LERC valid-mask and writes the file's nodata sentinel into masked pixels. The GPU LERC tile-decode path was still dropping the mask, so masked pixels read back as 0 on GPU but as NaN or the sentinel on CPU. Same bug, GPU side.

Changes:

- _gpu_decode.py: the LERC branch now calls lerc_decompress_with_mask per tile and keeps any returned valid-mask. After predictor decode and tile assembly, _apply_lerc_mask_fill builds an invalid mask on host (matching the GPU assembly kernel's tile-grid layout), copies it to GPU once, and overwrites masked positions with the resolved fill value. Tiles LERC reports as fully valid skip the host work, so the no-mask path stays zero-copy.
- gpu_decode_tiles and gpu_decode_tiles_from_file get a masked_fill keyword that is forwarded through. read_geotiff_gpu computes it via _resolve_masked_fill(ifd.nodata_str, file_dtype) for LERC sources.
- tests/test_lerc_valid_mask_gpu.py: 4 tests covering float32+NaN, float32+sentinel, uint16+sentinel, and the no-mask regression. Each compares read_geotiff_gpu output to read_to_array output for the same file. Skipped unless cupy + CUDA + lerc are available.

Out of scope: the encode side. The xrspatial writer still hard-codes hasMask=False; the tests reuse the lerc_compress monkeypatch fixture from the CPU PR to inject a valid-mask through lerc.encode directly.
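A rough sketch of the host-side step described in this commit message: per-tile valid masks are stitched into one image-sized invalid mask. The name `build_invalid_mask_sketch`, the row-major tile order, and the convention that `None` means "fully valid" and nonzero means "valid" are all assumptions for illustration. The real `_apply_lerc_mask_fill` may differ.

```python
import numpy as np


def build_invalid_mask_sketch(tile_masks, height, width, tile_h, tile_w):
    """Assemble per-tile valid masks into one (height, width) invalid mask.

    tile_masks is a row-major list with one entry per tile: None for a
    fully valid tile, else a (tile_h, tile_w) array where nonzero = valid.
    Returns None when every tile is fully valid, so callers can skip the
    host-to-device copy entirely.
    """
    if all(m is None for m in tile_masks):
        return None

    tiles_across = -(-width // tile_w)  # ceiling division
    invalid = np.zeros((height, width), dtype=bool)
    for idx, mask in enumerate(tile_masks):
        if mask is None:
            continue
        row, col = divmod(idx, tiles_across)
        y0, x0 = row * tile_h, col * tile_w
        # Edge tiles are padded past the image bounds; crop them.
        h = min(tile_h, height - y0)
        w = min(tile_w, width - x0)
        invalid[y0:y0 + h, x0:x0 + w] = np.asarray(mask)[:h, :w] == 0
    return invalid
```

Returning `None` for the all-valid case mirrors the commit's point that the no-mask path should do no extra host work.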
_apply_lerc_mask_fill bypassed _check_gpu_memory when allocating the H*W boolean invalid mask and the cupy temporary backing the boolean indexing assignment. On near-budget decodes that pushed the GPU into OOM during the mask transfer rather than failing fast with the usual xrspatial budget message. Add explicit budget checks for both the mask buffer and the index temporary (worst-case one int64 per pixel) so the LERC path stays under the same envelope as the rest of the GPU decode.
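The arithmetic behind the two extra checks is simple enough to sketch. The helpers below are hypothetical and do not reproduce `_check_gpu_memory`; they only show the worst-case sizes named in the commit message (one byte per pixel for the boolean mask, one int64 per pixel for the index temporary). Checking the second allocation against the running total is an assumption.

```python
def lerc_mask_extra_bytes(height, width):
    """Worst-case extra GPU bytes for the LERC mask fill."""
    n_pixels = height * width
    mask_bytes = n_pixels * 1    # H*W boolean mask, one byte per pixel
    index_bytes = n_pixels * 8   # worst case: one int64 index per pixel
    return mask_bytes, index_bytes


def check_lerc_mask_budget(height, width, free_bytes):
    """Fail fast (hypothetical check) before either allocation is made."""
    mask_bytes, index_bytes = lerc_mask_extra_bytes(height, width)
    steps = (
        ("invalid mask", mask_bytes),
        ("index temporary", mask_bytes + index_bytes),
    )
    for label, needed in steps:
        if needed > free_bytes:
            raise MemoryError(
                f"LERC mask fill needs {needed} bytes for the {label}, "
                f"but only {free_bytes} are within budget"
            )
    return mask_bytes + index_bytes
```

For a 1024 x 1024 image this comes to 1 MiB for the mask plus 8 MiB for the index temporary, which is why a near-budget decode could tip over during the mask transfer.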
0ee0f5a to bf4fdb8
Summary
Follow-up to #1529. The CPU LERC reader applies the LERC valid-mask and writes nodata into masked positions, but the GPU LERC tile-decode path in `_gpu_decode.py` was still dropping the mask. A masked pixel read back as 0 on GPU and as NaN or the sentinel on CPU.

This PR fixes the GPU side:
- The LERC branch now calls `lerc_decompress_with_mask` per tile and keeps any returned mask.
- After tile assembly, an invalid mask is built on host, copied to the GPU once, and used to write the resolved fill value into masked positions.
- `gpu_decode_tiles` and `gpu_decode_tiles_from_file` get a `masked_fill=` kwarg that `read_geotiff_gpu` populates via `_resolve_masked_fill(ifd.nodata_str, file_dtype)` when compression is LERC.

Depends on #1529 (uses `lerc_decompress_with_mask` and `_resolve_masked_fill` introduced there). Land #1529 first; this PR should rebase cleanly off main once it is merged.
Test plan
- `pytest xrspatial/geotiff/tests/test_lerc_valid_mask_gpu.py` passes (4 tests).
- `pytest xrspatial/geotiff/tests/test_lerc.py test_lerc_max_z_error.py test_gpu_byteswap_1508.py test_lerc_valid_mask.py` passes (38 tests).
- `read_geotiff_gpu` returns NaN at the masked position (matches CPU); without the fix it returned 0.0.
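Comparing GPU and CPU outputs needs a NaN-aware equality, since `NaN != NaN` under ordinary comparison. The sketch below uses synthetic NumPy arrays as stand-ins for the two readers' results (a real test would first bring the CuPy array to the host, for example with its `.get()` method). The array values are made up for illustration.

```python
import numpy as np

# Stand-ins for reader output: the pixel at (0, 1) is masked in the file.
cpu_result = np.array([[1.0, np.nan], [3.0, 4.0]], dtype=np.float32)
gpu_fixed = cpu_result.copy()                    # mask applied on GPU
gpu_buggy = np.nan_to_num(cpu_result, nan=0.0)   # LERC zero-fill leaked

# assert_array_equal treats NaNs in matching positions as equal.
np.testing.assert_array_equal(gpu_fixed, cpu_result)

# array_equal needs equal_nan=True to do the same.
matches_after_fix = np.array_equal(gpu_fixed, cpu_result, equal_nan=True)
matches_before_fix = np.array_equal(gpu_buggy, cpu_result, equal_nan=True)
```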