Dask streaming writer: silent corruption when extra_tags overrides TAG_PHOTOMETRIC for MinIsWhite single-band raster #2073

@brendancol

Describe the bug

The dask streaming GeoTIFF writer silently corrupts single-band rasters when attrs['extra_tags'] overrides TAG_PHOTOMETRIC to 0 (MinIsWhite). The eager writer rejects this override at xrspatial/geotiff/_writer.py:1600-1617 because the reader unconditionally inverts MinIsWhite single-band data, so the writer has to pre-invert pixels for the round-trip to work. The streaming path at xrspatial/geotiff/_writer.py:1909-1917 only checks the photometric kwarg. Then at xrspatial/geotiff/_writer.py:1979-1990 it lets extra_tags override the IFD tag with no equivalent check, so the override bypasses the inversion logic and writes a corrupted file.
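The round-trip asymmetry can be illustrated without the library. The sketch below is a toy model, not xrspatial code: the `invert` helper is hypothetical, plain Python lists stand in for the raster, and it assumes that MinIsWhite inversion for uint8 data maps each value `v` to `255 - v`, which is consistent with the reproducer output in this issue.

```python
# Illustrative model only: `invert` is a hypothetical helper, not an xrspatial API.
# Assumption: for uint8 data, MinIsWhite inversion maps v -> 255 - v.
UINT8_MAX = 255


def invert(pixels):
    """Invert every uint8 value in a 2-D list of pixels."""
    return [[UINT8_MAX - v for v in row] for row in pixels]


original = [[10, 20], [30, 40]]

# Correct path: the writer pre-inverts, the reader inverts again,
# and the two inversions cancel out.
stored_ok = invert(original)
read_ok = invert(stored_ok)

# Buggy path: the tag says MinIsWhite, but the pixels were stored untouched,
# so the reader's single inversion is never cancelled.
stored_bad = original
read_bad = invert(stored_bad)

print(read_ok)   # [[10, 20], [30, 40]]
print(read_bad)  # [[245, 235], [225, 215]]
```

The second result matches the corrupted values reported for the streaming path: the tag and the stored pixels disagree about polarity, so only one of the two inversions happens.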

Reproducer

import dask.array as da
import numpy as np
import xarray as xr
from xrspatial.geotiff import to_geotiff, read_geotiff

TAG_PHOTOMETRIC = 262  # TIFF PhotometricInterpretation tag ID
SHORT = 3              # TIFF field type code for a 16-bit unsigned integer

arr = xr.DataArray(
    da.from_array(np.array([[10, 20], [30, 40]], dtype=np.uint8), chunks=(1, 2)),
)
arr.attrs['extra_tags'] = [(TAG_PHOTOMETRIC, SHORT, 1, 0)]
to_geotiff(arr, '/tmp/bad.tif')
print(read_geotiff('/tmp/bad.tif').values)
# [[245 235] [225 215]] -- reader inverts on read, writer never pre-inverted.

Expected behavior

to_geotiff raises ValueError with the same message the eager writer uses, pointing the caller at photometric= or telling them to drop the override.

Actual behavior

Write succeeds. File round-trips with inverted pixel values.

Source locations

  • Eager guard: xrspatial/geotiff/_writer.py:1600-1617
  • Streaming photometric check: xrspatial/geotiff/_writer.py:1909-1917
  • Streaming IFD override: xrspatial/geotiff/_writer.py:1979-1990

Additional context

This is an input-validation gap. The fix mirrors the eager guard inside the streaming path so that the dask write rejects the same impossible combination.
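A guard of this kind might look roughly like the sketch below. Everything in it is an assumption made for illustration: the function name `check_extra_tags_photometric`, the `samples_per_pixel` parameter, and the error text are hypothetical and are not copied from the eager writer. The `(tag, type, count, value)` tuple layout follows the reproducer in this issue.

```python
TAG_PHOTOMETRIC = 262  # TIFF PhotometricInterpretation tag ID
MIN_IS_WHITE = 0       # PhotometricInterpretation value for MinIsWhite


def check_extra_tags_photometric(extra_tags, samples_per_pixel):
    """Hypothetical guard: reject a MinIsWhite override on single-band data.

    `extra_tags` is assumed to be an iterable of (tag, type, count, value)
    tuples, or None. Multi-band data is left alone in this sketch.
    """
    if samples_per_pixel != 1:
        return
    for tag_id, _field_type, _count, value in extra_tags or []:
        if tag_id == TAG_PHOTOMETRIC and value == MIN_IS_WHITE:
            # Illustrative message only; the real guard's wording may differ.
            raise ValueError(
                "extra_tags overrides TAG_PHOTOMETRIC to 0 (MinIsWhite) for a "
                "single-band raster; pass photometric= instead or drop the "
                "override."
            )


# Usage: the combination from the reproducer is rejected before any write.
try:
    check_extra_tags_photometric([(TAG_PHOTOMETRIC, 3, 1, 0)], samples_per_pixel=1)
except ValueError as exc:
    print(f"rejected: {exc}")
```

Running the check before the streaming path builds the IFD would make the dask write fail fast instead of producing a file that reads back inverted.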
