
geotiff: predictor=3 + integer dtype silently misdecoded on read #1933

@brendancol

Summary

The reader accepts a malformed IFD that claims Predictor=3 (Floating-Point Predictor, TIFF Technical Note 3) paired with an integer SampleFormat (1=uint or 2=int) without complaint. _apply_predictor in xrspatial/geotiff/_reader.py routes to fp_predictor_decode solely on the predictor tag, with no check that the data is actually float. The byte-swizzle unshuffle then runs on integer bytes, producing garbage pixel values that look like valid integers.
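The effect can be illustrated outside xrspatial. The following is a minimal NumPy sketch of the byte shuffle plus byte-wise differencing described in TIFF Technical Note 3; the function names and the `be_dtype` parameter are hypothetical and this is not the code in `_reader.py`. It round-trips float32 data correctly, then shows what the same decode does to raw uint32 bytes.

```python
import numpy as np


def fp_predictor_encode_row(row: np.ndarray, be_dtype: str) -> np.ndarray:
    """Illustrative encoder for one row (hypothetical helper, not xrspatial's)."""
    nbytes = np.dtype(be_dtype).itemsize
    # Big-endian bytes, one row per sample: shape (width, nbytes).
    be = row.astype(be_dtype).view(np.uint8).reshape(row.size, nbytes)
    # Shuffle into byte planes: all most-significant bytes first.
    planes = np.ascontiguousarray(be.T).ravel()
    # Byte-wise horizontal differencing; uint8 arithmetic wraps modulo 256.
    out = planes.copy()
    out[1:] = planes[1:] - planes[:-1]
    return out


def fp_predictor_decode_row(encoded: np.ndarray, width: int, be_dtype: str) -> np.ndarray:
    """Illustrative decoder for one row (hypothetical helper, not xrspatial's)."""
    nbytes = np.dtype(be_dtype).itemsize
    # Undo the differencing: running byte sum, wrapping modulo 256.
    planes = np.cumsum(encoded, dtype=np.uint8).reshape(nbytes, width)
    # Unshuffle: gather the bytes of each sample back into big-endian order.
    be = np.ascontiguousarray(planes.T)
    return be.view(be_dtype).ravel()


# Float data survives an encode/decode round trip.
floats = np.array([1.5, -2.25, 3.0, 1e10], dtype=np.float32)
decoded_floats = fp_predictor_decode_row(
    fp_predictor_encode_row(floats, ">f4"), floats.size, ">f4"
)

# Raw uint32 strip bytes that were never predictor-encoded, pushed through
# the same decode: the result is still a uint32 array, but the values are wrong.
ints = np.array([1, 2, 3, 4], dtype="<u4")
raw = np.frombuffer(ints.tobytes(), dtype=np.uint8)
garbage = fp_predictor_decode_row(raw, ints.size, ">u4")
```

Nothing in the decode path depends on the sample type, which is why the integer case fails without any error.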

The writer side already rejects this combination: _resolve_predictor in _writer.py raises ValueError("predictor=3 (floating-point) requires float data, got dtype=...") when called with a non-float dtype. Files written by xrspatial therefore never contain this combination, but external or adversarial files can.

Reproduction

A synthetic IFD with BitsPerSample=32, SampleFormat=1 (UINT), Predictor=3, and uncompressed strip data is read by open_geotiff without raising, and the returned uint32 array does not match the bytes that were written. The bug is silent: no warning, no exception, and a downstream consumer has no signal that the decoded data is wrong.
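One way to produce such a file is to assemble the TIFF bytes by hand. The sketch below builds a little-endian, single-strip, uncompressed TIFF with the tag values listed above. `build_bad_tiff` is a hypothetical helper name, and the extra baseline tags (PhotometricInterpretation, RowsPerStrip, and so on) are assumptions about what a reader needs, not something taken from the xrspatial test suite.

```python
import struct

import numpy as np

SHORT, LONG = 3, 4  # TIFF field types


def build_bad_tiff(pixels: np.ndarray) -> bytes:
    """Return TIFF bytes pairing Predictor=3 with SampleFormat=1 (uint)."""
    if pixels.dtype != np.uint32 or pixels.ndim != 2:
        raise TypeError("expected a 2-D uint32 array")
    height, width = pixels.shape
    data = pixels.astype("<u4").tobytes()

    n_tags = 11
    ifd_offset = 8  # IFD starts right after the 8-byte header
    # entry count (2) + entries (12 each) + next-IFD offset (4)
    data_offset = ifd_offset + 2 + 12 * n_tags + 4

    tags = [  # must be sorted by tag number
        (256, LONG, width),        # ImageWidth
        (257, LONG, height),       # ImageLength
        (258, SHORT, 32),          # BitsPerSample
        (259, SHORT, 1),           # Compression: none
        (262, SHORT, 1),           # PhotometricInterpretation: BlackIsZero
        (273, LONG, data_offset),  # StripOffsets
        (277, SHORT, 1),           # SamplesPerPixel
        (278, LONG, height),       # RowsPerStrip: one strip for the image
        (279, LONG, len(data)),    # StripByteCounts
        (317, SHORT, 3),           # Predictor: floating-point
        (339, SHORT, 1),           # SampleFormat: unsigned integer
    ]
    assert len(tags) == n_tags

    out = bytearray(struct.pack("<2sHI", b"II", 42, ifd_offset))
    out += struct.pack("<H", n_tags)
    for tag, field_type, value in tags:
        # Every tag has count=1. In a little-endian file, packing a SHORT
        # value as uint32 leaves it left-justified in the 4-byte value field.
        out += struct.pack("<HHII", tag, field_type, 1, value)
    out += struct.pack("<I", 0)  # no further IFDs
    out += data  # raw strip bytes, not predictor-encoded
    return bytes(out)


pixels = np.arange(16, dtype=np.uint32).reshape(4, 4)
tiff_bytes = build_bad_tiff(pixels)
```

Writing `tiff_bytes` to a temporary file and passing the path to open_geotiff should reproduce the mismatch. How open_geotiff accepts its input (path or file-like object) is not stated here, so that step is left to the reader.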

Fix

Reject Predictor=3 when SampleFormat is not 3 (float) on read. Either:

  1. Raise ValueError (mirror the writer message: "predictor=3 (floating-point) requires float data, got SampleFormat="), or
  2. Warn and fall back to predictor=1 (no predictor).

Option (1) is cleaner and matches the writer's contract. The check belongs in _apply_predictor (or one step earlier where the IFD is first validated) so every backend path (eager numpy, dask, GPU) picks it up via the shared routine.
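A minimal sketch of option (1), assuming the reader has already parsed the Predictor and SampleFormat tags into plain integers. `check_predictor_sample_format` and the two constants are hypothetical names for illustration; the actual signature of _apply_predictor may differ.

```python
PREDICTOR_FLOAT = 3      # Predictor tag value for the floating-point predictor
SAMPLE_FORMAT_FLOAT = 3  # SampleFormat tag value for IEEE floating point


def check_predictor_sample_format(predictor: int, sample_format: int) -> None:
    """Raise if Predictor=3 is paired with a non-float SampleFormat."""
    if predictor == PREDICTOR_FLOAT and sample_format != SAMPLE_FORMAT_FLOAT:
        raise ValueError(
            "predictor=3 (floating-point) requires float data, "
            f"got SampleFormat={sample_format}"
        )


# Valid combinations pass silently.
check_predictor_sample_format(3, 3)  # float predictor on float data
check_predictor_sample_format(2, 1)  # horizontal differencing on uint data

# The malformed combination from this issue is rejected.
try:
    check_predictor_sample_format(3, 1)
except ValueError as exc:
    message = str(exc)
```

TIFF 6.0 defines SampleFormat as defaulting to 1 (unsigned integer) when the tag is absent, so a file with Predictor=3 and no SampleFormat tag would also be rejected if the reader applies that default before calling the check.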

Regression test

A unit test in xrspatial/geotiff/tests/ that builds a small in-memory TIFF with the bad combination and asserts the new exception.
