Bounds-check IFD entry table in parse_ifd (#1672)#1677
Merged
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR hardens the GeoTIFF parser’s parse_ifd routine by adding explicit bounds checks for the IFD entry table reads (entry count, per-entry header fields, and next-IFD pointer), ensuring truncated buffers raise typed ValueError instead of leaking struct.error—supporting the fuzz/property-test exception contract introduced in #1661.
Changes:
- Add pre-
unpack_frombounds checks inparse_ifdfornum_entries, the entry table region, and the next-IFD pointer. - Add focused unit tests for classic TIFF and BigTIFF buffers that are 1 byte short of each guarded read.
- Extend the Hypothesis fuzz suite with truncation-based properties to prevent regressions where
struct.errorescapes.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
xrspatial/geotiff/_header.py |
Adds entry-table bounds checks in parse_ifd so truncation errors raise ValueError instead of struct.error. |
xrspatial/geotiff/tests/test_ifd_entry_table_bounds_1672.py |
New unit tests covering truncations at each newly-guarded read in classic TIFF and BigTIFF layouts. |
xrspatial/geotiff/tests/test_fuzz_hypothesis_1661.py |
Adds truncation fuzz/property tests to enforce typed-error-only behavior on truncated inputs. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+25
to
+31
| from xrspatial.geotiff._dtypes import LONG | ||
| from xrspatial.geotiff._header import ( | ||
| TIFFHeader, | ||
| TAG_IMAGE_WIDTH, | ||
| parse_header, | ||
| parse_ifd, | ||
| ) |
Comment on lines
+483
to
+488
| num_entries_size = 8 if is_big else 2 | ||
| if offset + num_entries_size > data_len: | ||
| raise ValueError( | ||
| f"IFD num_entries at offset {offset} needs " | ||
| f"{num_entries_size} bytes but file length is {data_len}" | ||
| ) |
brendancol
added a commit
that referenced
this pull request
May 12, 2026
Addresses two Copilot review comments on PR #1677. The new bounds checks added in #1672 only guarded `offset + size > len(data)`, missing negative offsets. With a negative `offset`, `struct.unpack_from` interprets it as Python's negative-index semantic (read from the buffer end), silently returning wrong bytes or raising raw `struct.error` and escaping the typed `ValueError` contract. This commit adds explicit `< 0` checks on all three bounds-check sites (num_entries read, entry-table region, next-IFD pointer) and an unused `TIFFHeader` import is removed from the test module so flake8 stays clean. Tests cover negative offsets on both classic TIFF and BigTIFF, plus a large negative value, and reuse existing fixtures to keep the file self-contained.
`parse_ifd` was unpacking `num_entries`, every entry's tag/type/count, and the next-IFD pointer with no prior length check. A truncated or crafted TIFF could escape with `struct.error`, outside the ValueError/TypeError contract the #1661 fuzz tests assume. The S2 typed-error work in #1661 covered the IFD value area but not the entry table itself. Adds three explicit bounds checks (num_entries field, full entry table, next-IFD pointer) each raising ValueError with the offset, required byte count, and file length, matching the existing MAX_IFD_ENTRY_COUNT / value-area error message style. Tests: - test_ifd_entry_table_bounds_1672.py: one-byte-short buffers for each of the three new checks, in classic and BigTIFF layouts. - test_fuzz_hypothesis_1661.py: new Group 4 truncation property tests sweeping every byte offset; fails on struct.error escape.
Addresses two Copilot review comments on PR #1677. The new bounds checks added in #1672 only guarded `offset + size > len(data)`, missing negative offsets. With a negative `offset`, `struct.unpack_from` interprets it as Python's negative-index semantic (read from the buffer end), silently returning wrong bytes or raising raw `struct.error` and escaping the typed `ValueError` contract. This commit adds explicit `< 0` checks on all three bounds-check sites (num_entries read, entry-table region, next-IFD pointer) and an unused `TIFFHeader` import is removed from the test module so flake8 stays clean. Tests cover negative offsets on both classic TIFF and BigTIFF, plus a large negative value, and reuse existing fixtures to keep the file self-contained.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Three
struct.unpack_fromcalls insideparse_ifdread from the IFD entry table without first checking the buffer length: thenum_entriesfield, each entry's tag/type/count, and the next-IFD pointer. A truncated file used to escape withstruct.error, which is outside theValueError/TypeErrorcontract #1661's Hypothesis fuzz tests assume. The typed-error work in #1661 (commit 58a8b3d) covered the IFD value area but not the entry table itself.This PR adds three bounds checks in
parse_ifd, each raisingValueErrorwith the offset, required byte count, andlen(data). Message style matches the existingMAX_IFD_ENTRY_COUNTand value-area bounds-check messages.Tests
test_ifd_entry_table_bounds_1672.py: unit tests building buffers exactly one byte short of each new bounds check, in both classic and BigTIFF layouts, plus a sanity case for a complete buffer and an out-of-bounds offset.test_fuzz_hypothesis_1661.py: new Group 4 truncation property tests. One runs the existing corpus through random truncation; a second sweeps integer truncation points 0..400 against a freshmake_minimal_tiff. Both fail ifstruct.errorescapes;ValueError,TypeError,MemoryError, andOverflowErrorare accepted.Verification
Fixes #1672.