
MinIsWhite inversion runs before nodata mask, swapping data and nodata pixels #1809

@brendancol

Summary

_apply_photometric_miniswhite runs before the nodata-sentinel-to-NaN mask in every read path. For a TIFF that is Photometric=MinIsWhite (tag 262 == 0) and also carries a GDAL_NODATA tag, the inversion changes the pixel values before the mask compares them against the original sentinel, so:

  • pixels whose stored value equalled the nodata sentinel survive as iinfo.max - sentinel (uint) or -sentinel (float) instead of becoming NaN
  • pixels whose stored value equalled iinfo.max - sentinel / -sentinel are incorrectly converted to NaN

The bug fires on all four backends (open_geotiff numpy, dask, GPU eager, HTTP COG).
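
The two bullets above can be checked in plain NumPy, without xrspatial. This is a minimal sketch of the order of operations only: invert_then_mask is a hypothetical stand-in, not the library's code, and it assumes the unsigned-integer inversion is iinfo.max - value as described above. A sentinel of 10 is used so the swap is visible for a non-zero sentinel.

```python
import numpy as np

def invert_then_mask(stored, sentinel):
    """Hypothetical model of the buggy order: invert first, then mask."""
    # Unsigned-integer MinIsWhite inversion, as described in the summary.
    inverted = np.iinfo(stored.dtype).max - stored
    out = inverted.astype(np.float64)
    # The mask compares the *inverted* values against the file sentinel.
    out[inverted == sentinel] = np.nan
    return out

# Cell 0 holds the sentinel (10); cell 1 holds 255 - 10 = 245, a real datum.
stored = np.array([10, 245, 100], dtype=np.uint8)
result = invert_then_mask(stored, sentinel=10)

# Cell 0 (the sentinel) survives as 245; cell 1 (a real datum) becomes NaN.
print(result)
```

The sentinel cell comes back as iinfo.max - sentinel, and the real datum at iinfo.max - sentinel is the one that gets masked.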

Reproducer

import os
import tempfile

import numpy as np
import tifffile
from xrspatial.geotiff import open_geotiff

# Stored values: 0 is the nodata sentinel, 255 is a real datum.
stored = np.array([[0, 100, 200], [50, 0, 255]], dtype=np.uint8)
with tempfile.TemporaryDirectory() as td:
    path = os.path.join(td, 'mw_nodata.tif')
    # Write a MinIsWhite TIFF whose GDAL_NODATA tag is "0".
    tifffile.imwrite(
        path, stored, photometric='miniswhite',
        extratags=[('GDAL_NODATA', 's', 0, '0\0', True)],
    )
    print(open_geotiff(path).values)

Actual output:

[[255. 155.  55.]
 [205. 255.  nan]]

Expected: the two 0 cells (the sentinel) should be NaN. The cell whose stored value is 255 should remain a real datum (post-inversion 0).

The float path has the same symptom: _apply_photometric_miniswhite returns -arr, so sentinel cells become -nodata. The subsequent mask then compares the negated values against the original sentinel and flags the wrong pixels.
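
The float case can be sketched the same way. The snippet below only models the order described above (negate, then mask) in plain NumPy. The sentinel -9999.0 and the sample values are assumptions chosen for illustration, not taken from a real file.

```python
import numpy as np

sentinel = -9999.0  # assumed GDAL_NODATA value, for illustration only
# Cell 0 is the sentinel; cell 2 (9999.0) is a real datum equal to -sentinel.
stored = np.array([-9999.0, 1.5, 9999.0], dtype=np.float32)

# Buggy order: float MinIsWhite inversion (-arr) first...
inverted = -stored
buggy = inverted.astype(np.float64)
# ...then the mask compares the negated values against the sentinel.
buggy[inverted == sentinel] = np.nan

# Cell 0 survives as 9999.0 (-sentinel); cell 2, a real datum, becomes NaN.
print(buggy)
```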

Affected sites

  • xrspatial/geotiff/_reader.py:2459 -- read_to_array applies MinIsWhite after the array is decoded but before open_geotiff runs the nodata-mask block at xrspatial/geotiff/__init__.py:899-937.
  • xrspatial/geotiff/_reader.py:1957 -- _read_cog_http does the same.
  • xrspatial/geotiff/__init__.py:2476 -- the dask chunk fetcher inverts before the dask nodata block at line 2484.
  • xrspatial/geotiff/__init__.py:3331-3336 -- the GPU eager path also inverts before _apply_nodata_mask_gpu at line 3346.

Root cause

Inversion is a value-domain transform; the nodata mask is keyed on the original file sentinel and must run on the unmodified decoded array. Today every backend reverses the order.

Fix sketch

At each of the four sites, apply the nodata-sentinel mask first (producing a float array with NaN where the sentinel was), then apply the MinIsWhite inversion to the non-NaN cells only. For the integer path, either invert the integer array before the float promotion while skipping the cells the mask flagged, or apply both transforms in a single integer-domain pass that excludes sentinel positions.
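
One possible shape for the corrected order, as a NumPy-only sketch. mask_then_invert is a hypothetical helper, not an existing xrspatial function. It models only the unsigned-integer and float cases discussed in this issue, and it records sentinel positions on the stored values before any inversion happens.

```python
import numpy as np

def mask_then_invert(stored, sentinel):
    """Hypothetical helper: find sentinel cells on stored values, then invert."""
    # Key the nodata mask on the unmodified decoded array.
    is_nodata = stored == sentinel
    if np.issubdtype(stored.dtype, np.unsignedinteger):
        # Integer-domain inversion, done before the float promotion.
        inverted = np.iinfo(stored.dtype).max - stored
    elif np.issubdtype(stored.dtype, np.floating):
        inverted = -stored
    else:
        raise TypeError(f"dtype not modelled in this sketch: {stored.dtype}")
    out = inverted.astype(np.float64)
    # Sentinel positions become NaN regardless of their inverted value.
    out[is_nodata] = np.nan
    return out

# Same stored array as the reproducer, with sentinel 0.
stored = np.array([[0, 100, 200], [50, 0, 255]], dtype=np.uint8)
fixed = mask_then_invert(stored, sentinel=0)
print(fixed)
```

Here the inversion is computed for every cell and the sentinel positions are overwritten afterwards, which gives the same result as skipping those cells. On the reproducer array, the two 0 cells become NaN and the stored 255 stays a real datum (post-inversion 0), matching the expected output above.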

Severity

HIGH (Cat 2 + Cat 5). Silent wrong results on MinIsWhite + nodata GeoTIFFs across all four backends. Discovered during a deep-sweep accuracy pass on 2026-05-13.

Labels: bug
