Fix unsound Delta unsigned-widening cast#8195
Merged
Merged
Conversation
Merging this PR will not alter performance
|
| Mode | Benchmark | BASE |
HEAD |
Efficiency | |
|---|---|---|---|---|---|
| ❌ | WallTime | cuda/bitpacked_u8/unpack/3bw[100M] |
300.2 µs | 349.9 µs | -14.19% |
| ❌ | Simulation | chunked_varbinview_canonical_into[(100, 100)] |
273.1 µs | 307.8 µs | -11.28% |
| ⚡ | Simulation | chunked_varbinview_opt_canonical_into[(1000, 10)] |
225.3 µs | 188 µs | +19.87% |
Tip
Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.
Comparing claude/blissful-ramanujan-7S6HL (cf09b33) with develop (fa2c35a)
Delta stores deltas as `wrapping_sub` at the source bit width and reconstructs values with `wrapping_add` at the array's bit width, so reusing the encoded bases and deltas across a primitive-type change is unsound. The previous kernel did this for any unsigned widening, which corrupted every value after a delta that wrapped at the source width (e.g. [200u8, 50] cast to u32 decoded as [200, 306] instead of [200, 50]). It also had two latent bugs: a u32 -> f32 cast entered the reuse path and then errored in `Delta::try_new` instead of falling back, and casting a nullable DeltaArray to non-nullable at the same ptype silently discarded real nulls. The kernel now reuses the encoded components only for a nullability change at the same primitive type (adding or keeping nullability, where validity lives in the deltas child). Every ptype change, and dropping nullability, defers to the generic decompress-and-re-encode path, which decodes correctly and validates nulls. Adds regression tests for the widening corruption, the u32 -> f32 fallback, adding nullability, preserving nulls across a same-ptype nullability cast, and erroring when dropping nullability with real nulls. Fixes #8193. Signed-off-by: Claude <noreply@anthropic.com>
c32a13f to
cf09b33
Compare
robert3005
approved these changes
Jun 1, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #8193.
Problem
The Delta cast kernel reused the encoded
bases/deltasacross primitive-type changes, which is unsound. Delta stores deltas aswrapping_subat the source bit width and decodes withwrapping_addat the array's bit width, so widening corrupted every value after a delta that wrapped at the source width:Two further latent bugs in the same kernel:
u32 -> f32passed the guard, entered the reuse path, then errored inDelta::try_new(float dtype) instead of falling back.Fix
Reuse the encoded components only for a nullability change at the same primitive type (adding/keeping nullability, where validity lives in the deltas child). Every ptype change, and dropping nullability, defers to the generic decompress-and-re-encode path, which decodes correctly and validates nulls.
Nothing more is recoverable from metadata: widening would need a wrap/monotonicity flag (value range is insufficient —
[0,200]and[200,0]share min/max but only one is safe to widen); narrowing/signedness need the value range, which can't be derived from the components.Tests
test_cast_delta_unsigned_widening_wraps— widening corruption regression.test_cast_delta_same_width_float_target—u32 -> f32falls back instead of erroring.test_cast_delta_add_nullability— same-ptype add-nullability fast path.test_cast_delta_nullability_preserves_nulls— null positions round-trip.test_cast_delta_drop_nullability_with_nulls_errors— dropping nullability with real nulls errors.test_cast_delta_conformance,test_cast_delta_u8_to_u32,test_cast_delta_nullable.Checks
cargo test -p vortex-fastlanes— 291 + 3 doctests pass (delta::compute::cast11/11).cargo clippy -p vortex-fastlanes --all-targets --all-features— clean.cargo +nightly fmt -p vortex-fastlanes -- --check— clean.Rebased onto the latest
developand squashed to a single commit.