geotiff: include GB allocation estimate in max_pixels error message #2553

@brendancol

Reason or Problem

When the max_pixels safety gate in xrspatial.geotiff._layout._check_dimensions rejects a GeoTIFF, the error reports the pixel count but not the allocation size in bytes. You have to do the multiplication yourself to know whether bumping max_pixels is reasonable for your machine.

Current message:

TIFF image dimensions (50000 x 50000 x 1 = 2,500,000,000 pixels) exceed the safety limit of 1,000,000,000 pixels.  Pass a larger max_pixels value to read_to_array() if this file is legitimate.

The VRT SimpleSource message in _vrt.py has the same gap.

Proposal

Add a GB allocation estimate next to the pixel count in both messages. Use 4 bytes per pixel (float32) as the basis, since the constant docstring at the top of _layout.py already uses that convention:

#: ~1 billion pixels, which is ~4 GB for float32 single-band.
MAX_PIXELS_DEFAULT = 1_000_000_000

Design:

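Compute the estimate at the point of raising and include it in both the actual and the limit portions of the message. The bytes-per-pixel basis is spelled out inline so the reader knows what assumption produced the number. The figures in the example use 1 GB = 1024**3 bytes, so the 4,000,000,000-byte limit appears as ~3.73 GB rather than the rounder ~4 GB quoted in the docstring.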

Example new message:

TIFF image dimensions (50000 x 50000 x 1 = 2,500,000,000 pixels, ~9.31 GB at 4 bytes/pixel) exceed the safety limit of 1,000,000,000 pixels (~3.73 GB at 4 bytes/pixel).  Pass a larger max_pixels value to read_to_array() if this file is legitimate.
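A minimal sketch of how such a message could be assembled, for illustration only. The names `check_dimensions_sketch`, `estimate_gb` and `BYTES_PER_PIXEL` are hypothetical, and the real signature and exception type of `_check_dimensions` are not shown in this issue, so `ValueError` is an assumption. Dividing by 1024**3 is also an assumption, chosen because it reproduces the ~9.31 and ~3.73 figures in the example.

```python
# Hypothetical sketch; not the actual xrspatial implementation.
BYTES_PER_PIXEL = 4  # float32 basis, as stated in the proposal
MAX_PIXELS_DEFAULT = 1_000_000_000


def estimate_gb(pixels, bytes_per_pixel=BYTES_PER_PIXEL):
    """Approximate allocation size, in units of 1024**3 bytes."""
    return pixels * bytes_per_pixel / 1024**3


def check_dimensions_sketch(width, height, samples,
                            max_pixels=MAX_PIXELS_DEFAULT):
    """Raise if the pixel count exceeds max_pixels (exception type assumed)."""
    total = width * height * samples
    if total > max_pixels:
        raise ValueError(
            f"TIFF image dimensions ({width} x {height} x {samples} = "
            f"{total:,} pixels, ~{estimate_gb(total):.2f} GB at "
            f"{BYTES_PER_PIXEL} bytes/pixel) exceed the safety limit of "
            f"{max_pixels:,} pixels (~{estimate_gb(max_pixels):.2f} GB at "
            f"{BYTES_PER_PIXEL} bytes/pixel).  Pass a larger max_pixels "
            f"value to read_to_array() if this file is legitimate."
        )
```

Calling `check_dimensions_sketch(50000, 50000, 1)` raises with the example text above. Both estimates are computed only on the failure path, so the passing case costs nothing extra.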

Usage: No API change. The estimate appears in the error when the gate trips.

Value: A reader sees the GB number and can decide whether the request is reasonable without doing the multiplication by hand.

Stakeholders and Impacts

Anyone reading a large GeoTIFF that trips the gate. No API or behaviour change beyond the message text. One test in xrspatial/geotiff/tests/test_security.py asserts the message contains "exceed the safety limit"; that substring is preserved.

Drawbacks

The byte estimate is approximate. Decoded dtype may not be float32. The message says "at 4 bytes/pixel" so the assumption is explicit.
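To give a sense of how far the 4 bytes/pixel figure can drift, here is a small sketch that recomputes the estimate for a few common sample widths. The table of item sizes and the helper name `gb_by_dtype` are illustrative assumptions, not part of the library. The sizes themselves are the standard widths of those types.

```python
# Illustrative only: per-sample byte widths for a few common dtypes.
ITEMSIZE_BYTES = {"uint8": 1, "uint16": 2, "float32": 4, "float64": 8}


def gb_by_dtype(pixels):
    """Estimated allocation (units of 1024**3 bytes) for each dtype."""
    return {
        name: round(pixels * size / 1024**3, 2)
        for name, size in ITEMSIZE_BYTES.items()
    }


estimates = gb_by_dtype(2_500_000_000)
for name, gb in estimates.items():
    print(f"{name:>8}: ~{gb:.2f} GB")
```

For the 50000 x 50000 example the float32 estimate is off by a factor of four for uint8 data and by a factor of two for float64. That is acceptable for a hint, and stating the basis in the message makes the gap visible.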

Alternatives

Threading the actual dtype through _check_dimensions would give an exact number, but it requires touching every call site and reading the IFD bit depth before the safety check. Not worth the churn for an error message hint.

Labels: enhancement
