Skip to content

geotiff: cubic overview on integer COG poisons pixels near nodata border #1975

@brendancol

Description

@brendancol

Summary

to_geotiff(data, cog=True, overview_resampling='cubic', nodata=N) on
integer rasters with a finite nodata sentinel produces severe ringing
artifacts in the overview pyramid near the nodata border. This is the
integer-dtype mirror of #1623 (which fixed the float case).

Root cause

In xrspatial/geotiff/_writer.py::_block_reduce_2d the cubic branch
masks the sentinel to NaN only when the input dtype is float
(line ~423: arr2d.dtype.kind == 'f'):

if (nodata is not None
        and arr2d.dtype.kind == 'f'        # gates out integer dtypes
        and not np.isnan(nodata)):
    ...mask sentinel to NaN, run zoom(prefilter=False), rewrite NaN to sentinel...
return zoom(arr2d, 0.5, order=3).astype(arr2d.dtype)

For integer rasters the function falls straight through to the
unmasked zoom(arr2d, 0.5, order=3) path. The bicubic spline then
blends the sentinel value (e.g. -9999) into neighbouring valid cells
along the border, producing pixels like 1082, 1085, 1134 (well above
the actual data value 100) and -11104 (below the sentinel -9999).
Cast back to the integer dtype, these are silent garbage. The read
side's int-to-NaN mask only catches exact sentinel hits, so the
poisoned values survive and corrupt every overview level.

Same bug class as #1623 (float cubic + nodata), which was fixed by
masking the sentinel to NaN, running cubic with prefilter=False, then
rewriting NaN back to the sentinel. The integer case needs the same
treatment plus a cropped.astype(float64) promotion before zoom so
NaN can be carried through the spline, with cupy.around(...).astype(int_dtype)
on the way out.

The GPU writer routes cubic through this same CPU helper
(_block_reduce_2d_gpu at _gpu_decode.py:3118 falls back to CPU for
cubic), so fixing the CPU helper fixes both backends.

Reproducer

import numpy as np, xarray as xr
from xrspatial.geotiff import to_geotiff, open_geotiff
data = np.full((1024, 1024), 100, dtype=np.int16)
data[0:256, 0:256] = -9999     # nodata corner
da = xr.DataArray(data, dims=('y','x'),
    coords={'y': np.arange(1024.0), 'x': np.arange(1024.0)})
to_geotiff(da, '/tmp/cubic_int.tif', cog=True,
           overview_resampling='cubic', nodata=-9999, crs=4326)
r1 = open_geotiff('/tmp/cubic_int.tif', overview_level=1)
print(r1.values[128, 124:132])
# Observed: [1082. 1082. 1085. 1134.    5.   93.  100.  100.]
# Expected: [-9999/NaN ... 100, 100, 100, 100] (clean boundary)

Severity

HIGH (Cat 1 precision loss + Cat 2 NaN propagation + Cat 5 backend
parity). The pyramid is silently corrupted at every zoom > 0 whenever
the user combines integer data, cubic resampling, and a finite nodata
sentinel. Downstream consumers (Mapbox/Leaflet tile servers, QGIS
zoom-out) see the poisoned values as legitimate measurements.

Fix sketch

In _block_reduce_2d's cubic branch, drop the dtype.kind == 'f'
gate. Promote the cropped array to float64 (so NaN survives), mask
sentinel to NaN with the existing _int_nodata_in_range style guard,
run zoom(... prefilter=False), rewrite NaN back to the sentinel,
then cast through np.around(...).astype(arr2d.dtype) so the cast is
well-defined (mirrors the existing integer-mean/min/max/median tail).

Categories

Cat 1, Cat 2, Cat 5

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions