
to_geotiff: silent data corruption on 3D input with non-whitelisted leading dim #1812

@brendancol

Describe the bug

to_geotiff silently corrupts data when the input is a 3D xr.DataArray
whose leading dim is not in ('band', 'bands', 'channel'). The writer
relies on a moveaxis at xrspatial/geotiff/__init__.py:1686 (eager) and
:1636 (dask) that only fires when data.dims[0] is in the
_BAND_DIM_NAMES whitelist. Anything else (e.g. ('time', 'y', 'x'))
skips the moveaxis, so the array passes through untouched and the writer
treats the leading axis as the spatial y axis.

The read backends honour the on-disk axis order and emit (y, x, band),
so the round-trip produces an array with the wrong shape and wrong
per-slice contents. A (time=2, y=4, x=5) input, for example, comes back
as y=2, x=4, band=5.
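
The gate can be imitated with NumPy alone to see why the fall-through is harmful. This is a minimal sketch, not the library's code: `to_writer_layout` is a hypothetical stand-in for the moveaxis branch, and the whitelist contents are taken from the description above.

```python
import numpy as np

# Whitelist as quoted in this issue; the real constant lives in xrspatial.
_BAND_DIM_NAMES = ('band', 'bands', 'channel')


def to_writer_layout(values, dims):
    """Hypothetical stand-in for the writer's gate (illustration only)."""
    if values.ndim == 3 and dims[0] in _BAND_DIM_NAMES:
        # (band, y, x) -> (y, x, band)
        return np.moveaxis(values, 0, -1)
    # Any other leading dim falls through unchanged.
    return values


arr = np.zeros((2, 4, 5), dtype=np.uint8)
arr[0] = 1
arr[1] = 2

ok = to_writer_layout(arr, ('band', 'y', 'x'))
bad = to_writer_layout(arr, ('time', 'y', 'x'))

print(ok.shape)                 # (4, 5, 2)
print(int(ok[:, :, 0].sum()))   # 20, i.e. arr[0].sum()
print(bad.shape)                # (2, 4, 5)
print(int(bad[:, :, 0].sum()))  # 12, a mix of both time slices
```

With the whitelisted name, the first "band" is exactly `arr[0]`. With `'time'`, the array is left as is, so what the writer sees as band 0 is a (2, 4) cut across both time slices.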

Repro

import numpy as np
import xarray as xr
import tempfile
from xrspatial.geotiff import open_geotiff, to_geotiff

arr = np.zeros((2, 4, 5), dtype=np.uint8)
arr[0] = 1
arr[1] = 2

da = xr.DataArray(arr, dims=('time', 'y', 'x'),
                  attrs={'crs': 'EPSG:4326'})

tmp = tempfile.NamedTemporaryFile(suffix='.tif', delete=False)
tmp.close()
to_geotiff(da, tmp.name, crs=4326)

out = open_geotiff(tmp.name)
print(out.shape)                  # (2, 4, 5)  wrong; should be (4, 5, 2)
print(out.values[:, :, 0].sum())  # 12         wrong; should be arr[0].sum() == 20
print(out.values[:, :, 1].sum())  # 12         wrong; should be arr[1].sum() == 40

Expected behaviour

The writer should refuse an ambiguous 3D layout and raise ValueError,
telling the caller to either rename the leading dim to one of
_BAND_DIM_NAMES or transpose to (y, x, band). Silent pass-through
gives the user no signal that the layout was misinterpreted.
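
Until the writer rejects such input, the two remedies named above can be applied on the caller's side. The sketch below uses only standard xarray calls; it assumes the leading dim really does hold per-band data, which only the caller can judge.

```python
import numpy as np
import xarray as xr

arr = np.zeros((2, 4, 5), dtype=np.uint8)
arr[0] = 1
arr[1] = 2
da = xr.DataArray(arr, dims=('time', 'y', 'x'))

# Option 1: rename the leading dim to a whitelisted name, keeping (band, y, x).
as_band_first = da.rename({'time': 'band'})

# Option 2: rename and transpose so the layout is explicitly (y, x, band).
as_band_last = da.rename({'time': 'band'}).transpose('y', 'x', 'band')

print(as_band_first.dims)  # ('band', 'y', 'x')
print(as_band_last.dims)   # ('y', 'x', 'band')
print(as_band_last.shape)  # (4, 5, 2)
```

Either result can then be passed to to_geotiff in place of the original array.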

The same gate is used in three writer paths, all of which need the fix:

  • to_geotiff eager branch (xrspatial/geotiff/__init__.py:1686)
  • to_geotiff dask-streaming branch (xrspatial/geotiff/__init__.py:1636)
  • write_geotiff_gpu (xrspatial/geotiff/__init__.py:3636+)
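
One way to keep the three paths consistent would be a single shared check that each calls before the moveaxis. The following is a sketch under stated assumptions, not a proposed patch: `check_3d_layout` is a hypothetical name, and treating a whitelisted trailing dim as the marker for an already-correct (y, x, band) layout is an assumption about how the fix might tell the two valid layouts apart.

```python
# Whitelist as quoted in this issue; the real constant lives in xrspatial.
_BAND_DIM_NAMES = ('band', 'bands', 'channel')


def check_3d_layout(dims):
    """Hypothetical shared gate: raise on an ambiguous 3D layout."""
    if len(dims) != 3:
        return  # 2D input is not affected by this issue
    if dims[0] in _BAND_DIM_NAMES:
        return  # (band, y, x): the writer will moveaxis this
    if dims[-1] in _BAND_DIM_NAMES:
        return  # assumed marker for (y, x, band): already in writer order
    raise ValueError(
        f"Ambiguous 3D layout {tuple(dims)}: rename the leading dim to one "
        f"of {_BAND_DIM_NAMES} or transpose to (y, x, band)."
    )


check_3d_layout(('band', 'y', 'x'))   # accepted
check_3d_layout(('y', 'x', 'band'))   # accepted

try:
    check_3d_layout(('time', 'y', 'x'))
except ValueError as exc:
    print(exc)
```

Working on dim names alone keeps the check independent of the array backend, so the eager, dask and GPU paths could all call it without touching the data.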

Additional context

Surfaced by the 2026-05-13 metadata propagation sweep. Existing tests
cover the (band, y, x) and (y, x, band) happy paths but no test
asserts the behaviour for a non-whitelisted leading dim. Earlier
re-audits documented this as LOW (TIFF format limitation); it is
re-categorised as HIGH here because it is silent data corruption, not
just dim relabelling.
