Prevent `FixedSizeBinaryArray::value` offset truncation by alamb · Pull Request #9850 · apache/arrow-rs

alamb · 2026-04-29T17:06:19Z

Which issue does this PR close?

None.

Rationale for this change

FixedSizeBinaryArray::value_offset_at cast the requested index to i32 before multiplying by the element width. For indexes beyond i32::MAX, that truncation could produce a negative byte offset and cause value() to read before the start of the value buffer.
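
To make the failure mode concrete, here is a minimal standalone sketch of the arithmetic described above. It does not use the arrow crate; `truncating_offset` is a hypothetical stand-in for the pre-fix computation, and `wrapping_mul` is used only so the sketch behaves the same in debug and release builds.

```rust
/// Hypothetical stand-in for the pre-fix offset computation:
/// the index is cast to `i32` first, then multiplied by the element width.
fn truncating_offset(value_length: i32, i: usize) -> i32 {
    // `i as i32` keeps only the low 32 bits of `i`, so any index above
    // `i32::MAX` silently becomes a different (possibly negative) number.
    value_length.wrapping_mul(i as i32)
}

/// Returns true when the index survives the `as i32` cast unchanged.
fn index_fits_i32(i: usize) -> bool {
    i32::try_from(i).is_ok()
}
```

With an element width of 1 and an index of `i32::MAX as usize + 1`, `truncating_offset` returns `i32::MIN`, which is the negative byte offset the rationale refers to.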

What changes are included in this PR?

Checks for offset overflow
Adds regression tests

Note I also added some more docs for FixedSizeBinaryArray that may help reviewers

Are these changes tested?

I can't find any way to test this issue without actually allocating a large array (over 2GB).

Are there any user-facing changes?

Better limit checking

alamb · 2026-04-29T17:43:44Z

+    /// checking for overflow.
    #[inline]
    fn value_offset_at(&self, i: usize) -> i32 {
        self.value_length * i as i32


this arithmetic can overflow if the result is larger than i32::MAX
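
A rough sketch of what an overflow-checked version of this computation could look like, for illustration only. `checked_value_offset` is a hypothetical free function, not the code in this PR; it assumes a non-negative element width and reports failure as `None` rather than panicking.

```rust
/// Hypothetical checked variant of the offset computation.
/// Returns `None` if the index does not fit in `i32` or if the
/// product `value_length * i` would exceed the `i32` range.
fn checked_value_offset(value_length: i32, i: usize) -> Option<i32> {
    // Reject indexes that the `as i32` cast would have truncated.
    let i = i32::try_from(i).ok()?;
    // Reject products that do not fit in `i32`.
    value_length.checked_mul(i)
}
```

A caller such as `value()` could turn the `None` case into a panic with a descriptive message, while an unchecked accessor would have to document the condition as a safety requirement instead.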

adamreeve · 2026-04-30T00:24:34Z

    /// Caller is responsible for ensuring that the index is within the bounds
-    /// of the array
+    /// of the array and the resulting byte offset fits in `i32`
    pub unsafe fn value_unchecked(&self, i: usize) -> &[u8] {


Would it make sense to add new methods that are alternatives to value_offset and value_offset_at that return usize, so we don't need this limitation? Or at least update this method so it doesn't use them and doesn't suffer from i32 overflow. Because with this change, value now handles when value_offset is greater than max i32, but value_unchecked still doesn't.

I would expect value_unchecked to work correctly for all cases where value doesn't panic.

And existing code that uses value_unchecked might validate the index but not be aware of this hidden safety requirement.

Short answer is yes. I also spent some more time reviewing the code in FixedSizeBinaryArray and I am now convinced there are several other misuses of i32 <-> usize. I am working on an improvement, though I worry it will be a larger PR.
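
For reference, the `usize`-returning alternative suggested in this thread could be sketched roughly as follows. The names `value_offset_usize` and `value_at` are hypothetical and operate on a plain byte slice rather than on the real array type, so this shows only the arithmetic, not the actual arrow API.

```rust
/// Hypothetical `usize` offset: no cast to `i32`, so no truncation.
/// Returns `None` only if the product overflows `usize`.
fn value_offset_usize(value_length: usize, i: usize) -> Option<usize> {
    value_length.checked_mul(i)
}

/// Hypothetical accessor over a flat value buffer of fixed-width elements.
/// Returns `None` instead of reading out of bounds.
fn value_at(values: &[u8], value_length: usize, i: usize) -> Option<&[u8]> {
    let start = value_offset_usize(value_length, i)?;
    let end = start.checked_add(value_length)?;
    // `get` performs the final bounds check against the buffer length.
    values.get(start..end)
}
```

Computing the offset in `usize` means the only remaining limit is the buffer length itself, so an unchecked accessor would need just the usual in-bounds index requirement.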

alamb · 2026-04-30T19:49:17Z

I also added some more docs for FixedSizeBinaryArray that may help reviewers

Add more documentation for FixedSizeBinary arrays #9866

alamb · 2026-04-30T20:12:27Z

I have played around with several options for improving this code. I think there are several potential i32 math overflow issues, but fixing them all in a single PR is going to be somewhat hard to review and take some time

What I am thinking about is adding a new invariant to the FixedSizeBinaryArray constructor that prevents constructing arrays with value buffers larger than 2GB as a temporary workaround in one PR. Then I can fix up the actual arithmetic in a second:
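
One way such a constructor invariant might be expressed, as a sketch under stated assumptions: `check_value_buffer_len` is a hypothetical helper, the error type is a plain `String` rather than the crate's error type, and the limit is taken to be `i32::MAX` bytes.

```rust
/// Hypothetical constructor-time check: the total value buffer size
/// (`value_length * len` bytes) must fit in `i32`, so that every later
/// `i32` byte offset into the buffer is representable.
fn check_value_buffer_len(value_length: i32, len: usize) -> Result<usize, String> {
    if value_length < 0 {
        return Err(format!("negative value length: {value_length}"));
    }
    let total = (value_length as usize)
        .checked_mul(len)
        .ok_or_else(|| "value buffer size overflows usize".to_string())?;
    if total > i32::MAX as usize {
        return Err(format!("value buffer of {total} bytes exceeds i32::MAX"));
    }
    Ok(total)
}
```

With a check like this in place at construction, the existing `i32` offset arithmetic cannot overflow for any in-bounds index, which is why it could serve as a stopgap before the arithmetic itself is reworked.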

github-actions Bot added the arrow Changes to the arrow crate label Apr 29, 2026

[arrow-array]: prevent FixedSizeBinaryArray offset truncation

7faf57d

alamb force-pushed the codex/fixed-size-binary-offset-overflow branch from e77819a to 7faf57d Compare April 29, 2026 17:42


alamb marked this pull request as ready for review April 29, 2026 18:22

adamreeve reviewed Apr 30, 2026


alamb mentioned this pull request Apr 30, 2026

Alamb/fixed size binary overfkow try 2 #9867

Draft

alamb wants to merge 1 commit into apache:main from alamb:codex/fixed-size-binary-offset-overflow
