Unsound Delta cast: unsigned-widening corrupts wrapped deltas #8193

@joseph-isaacs

Summary

The Delta (FastLanes) cast kernel (encodings/fastlanes/src/delta/compute/cast.rs) is unsound on its unsigned-widening path (e.g. u8 -> u32).

Root cause

Delta stores deltas as wrapping_sub at the source bit width and reconstructs values with a prefix-sum of wrapping_add at the array's bit width (delta_decompress.rs). The cast kernel rewraps the encoding by casting the bases and deltas child arrays to the target type and rebuilding a DeltaArray.

Casting an unsigned delta to a wider unsigned type is a zero-extension: it preserves each delta's numeric value but not the modulus of reconstruction. After widening, wrapping_add wraps modulo 2^(target bits) instead of 2^(source bits), so any sequence whose deltas wrapped at the source width decodes to wrong values.
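The modulus change can be sketched in plain Rust, independent of the actual kernel. This is a scalar, single-base model under stated assumptions: it ignores the FastLanes lane layout, and the names delta_encode_u8, decode_u8 and decode_widened_u32 are hypothetical helpers, not functions from the codebase.

```rust
/// Delta-encode at u8 width: the first value is the base, the rest are
/// wrapping differences between neighbours. Panics on empty input.
fn delta_encode_u8(values: &[u8]) -> (u8, Vec<u8>) {
    let base = values[0];
    let deltas = values.windows(2).map(|w| w[1].wrapping_sub(w[0])).collect();
    (base, deltas)
}

/// Reconstruct with a prefix sum of wrapping_add at the source width (mod 2^8).
fn decode_u8(base: u8, deltas: &[u8]) -> Vec<u8> {
    let mut acc = base;
    let mut out = vec![acc];
    for &d in deltas {
        acc = acc.wrapping_add(d);
        out.push(acc);
    }
    out
}

/// Models the unsound path: zero-extend base and deltas to u32, then
/// prefix-sum at the target width (mod 2^32).
fn decode_widened_u32(base: u8, deltas: &[u8]) -> Vec<u32> {
    let mut acc = u32::from(base);
    let mut out = vec![acc];
    for &d in deltas {
        acc = acc.wrapping_add(u32::from(d));
        out.push(acc);
    }
    out
}

fn main() {
    let (base, deltas) = delta_encode_u8(&[7, 3, 9]);
    // 3 - 7 wraps to 252 at u8; 9 - 3 = 6 does not wrap.
    assert_eq!(deltas, vec![252u8, 6]);
    // At the source width the wrap cancels out: 7 + 252 = 259 mod 256 = 3.
    assert_eq!(decode_u8(base, &deltas), vec![7u8, 3, 9]);
    // After zero-extension nothing wraps, so the extra 256 sticks.
    assert_eq!(decode_widened_u32(base, &deltas), vec![7u32, 259, 265]);
}
```

In this model the widened output is off by 256 from the first wrapped delta onward, and by a further 256 for each additional wrap.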

The existing guards reject signed sources/targets and narrowing, but the remaining unsigned-widening path, described in the code comment as "the lossless widening cast", is not lossless once any delta has wrapped.

Reproduction

let primitive = PrimitiveArray::from_iter([200u8, 50, 75, 10, 255]);
let array = Delta::try_from_primitive_array(&primitive, ..);
let casted = array.into_array().cast(U32);
// decoded: [200, 306, 331, 522, 767]
// expected: [200,  50,  75,  10, 255]

200 -> 50 stores delta 0x6A (106). At u8: 200 wrapping_add 106 = 306 mod 256 = 50. At u32: 200 + 106 = 306. Every value from the first wrapped delta onward is corrupted.

The existing tests miss this because they only use monotonic, small, non-wrapping sequences.
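To see why monotonic inputs never trigger the bug: in a scalar single-base model (an assumption; the real encoding works per lane), a u8 delta wraps exactly when a value is smaller than its predecessor. A minimal sketch, with has_wrapped_delta as a hypothetical helper and the monotonic input chosen only for illustration:

```rust
/// True if delta-encoding `values` at u8 width would produce at least one
/// wrapped delta, i.e. some element is smaller than the one before it.
fn has_wrapped_delta(values: &[u8]) -> bool {
    values.windows(2).any(|w| w[1] < w[0])
}

fn main() {
    // Non-decreasing input: no delta wraps, so a widened decode still agrees.
    assert!(!has_wrapped_delta(&[1, 2, 3, 10, 200]));
    // The sequence from the reproduction decreases twice (200 -> 50, 75 -> 10).
    assert!(has_wrapped_delta(&[200, 50, 75, 10, 255]));
}
```

A regression test for the cast would need at least one input for which this predicate holds.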

Fix

Bail out of the fast path for any width-changing cast (return Ok(None) so the generic decompress-and-re-encode path handles it). Only same-width unsigned casts remain reinterpretive and sound. (Widening could be salvaged by decompressing to check for wraparound, per the existing TODO(DK), but that defeats the fast path.)
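The proposed guard amounts to a small predicate. The sketch below is illustrative only: IntType and delta_fast_path_ok are hypothetical names, not the kernel's actual types or signature, and the real check would operate on the library's own type descriptors.

```rust
/// Hypothetical stand-in for an integer type descriptor.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct IntType {
    signed: bool,
    bits: u32,
}

const U8: IntType = IntType { signed: false, bits: 8 };
const U32: IntType = IntType { signed: false, bits: 32 };
const I32: IntType = IntType { signed: true, bits: 32 };

/// The fast path is kept only for unsigned, same-width casts. Any other
/// combination should fall through to the generic path (the kernel
/// returning Ok(None), per the fix described above).
fn delta_fast_path_ok(src: IntType, dst: IntType) -> bool {
    !src.signed && !dst.signed && src.bits == dst.bits
}

fn main() {
    assert!(delta_fast_path_ok(U32, U32)); // same width, unsigned
    assert!(!delta_fast_path_ok(U8, U32)); // widening: now rejected
    assert!(!delta_fast_path_ok(U32, U8)); // narrowing: already rejected
    assert!(!delta_fast_path_ok(U32, I32)); // signed target: already rejected
}
```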

Tracked from #8192.

Labels: bug
