Fuzzing Crash: VortexError in file_io #7228
Description
Fuzzing Crash Report
Analysis
Crash Location: vortex-compressor/src/builtins/dict/float.rs:24:dictionary_encode
Error Message:
Assertion failed error: this must be present since `DictScheme` declared that we need distinct values
Stack Trace
stack backtrace:
0: __rustc::rust_begin_unwind
at /rustc/db3e99bbab28c6ca778b13222becdea54533d908/library/std/src/panicking.rs:689:5
1: core::panicking::panic_fmt
at /rustc/db3e99bbab28c6ca778b13222becdea54533d908/library/core/src/panicking.rs:80:14
2: panic_display<vortex_error::VortexError>
at /rustc/db3e99bbab28c6ca778b13222becdea54533d908/library/core/src/panicking.rs:259:5
3: {closure#0}<&vortex_compressor::stats::float::DistinctInfo<half::binary16::f16>>
at ./vortex-error/src/lib.rs:500:9
4: unwrap_or_else<&vortex_compressor::stats::float::DistinctInfo<half::binary16::f16>, vortex_error::{impl#12}::vortex_expect::{closure_env#0}<&vortex_compressor::stats::float::DistinctInfo<half::binary16::f16>>>
at /rustc/db3e99bbab28c6ca778b13222becdea54533d908/library/core/src/option.rs:1067:21
5: vortex_expect<&vortex_compressor::stats::float::DistinctInfo<half::binary16::f16>>
at ./vortex-error/src/lib.rs:349:14
6: dictionary_encode
at ./vortex-compressor/src/builtins/dict/float.rs:24:42
7: compress
at ./vortex-compressor/src/builtins/dict/mod.rs:202:20
8: estimate_compression_ratio_with_sampling<vortex_compressor::builtins::dict::FloatDictScheme>
at ./vortex-compressor/src/scheme.rs:280:10
9: expected_compression_ratio
at ./vortex-compressor/src/builtins/dict/mod.rs:188:20
10: choose_scheme
at ./vortex-compressor/src/compressor.rs:336:32
11: choose_and_compress
at ./vortex-compressor/src/compressor.rs:305:36
12: compress_canonical
at ./vortex-compressor/src/compressor.rs:177:22
13: compress
at ./vortex-compressor/src/compressor.rs:160:14
14: {async_block#0}
at ./vortex-layout/src/layouts/dict/writer.rs:154:65
15: poll<alloc::boxed::Box<(dyn core::future::future::Future<Output=core::result::Result<alloc::sync::Arc<dyn vortex_layout::layout::Layout, alloc::alloc::Global>, vortex_error::VortexError>> + core::marker::Send), alloc::alloc::Global>>
at /rustc/db3e99bbab28c6ca778b13222becdea54533d908/library/core/src/future/future.rs:133:9
16: {async_block#0}
at ./vortex-layout/src/strategy.rs:80:14
17: poll<alloc::boxed::Box<(dyn core::future::future::Future<Output=core::result::Result<alloc::sync::Arc<dyn vortex_layout::layout::Layout, alloc::alloc::Global>, vortex_error::VortexError>> + core::marker::Send), alloc::alloc::Global>>
at /rustc/db3e99bbab28c6ca778b13222becdea54533d908/library/core/src/future/future.rs:133:9
18: {async_block#0}
at ./vortex-layout/src/layouts/zoned/writer.rs:135:14
19: poll<alloc::boxed::Box<(dyn core::future::future::Future<Output=core::result::Result<alloc::sync::Arc<dyn vortex_layout::layout::Layout, alloc::alloc::Global>, vortex_error::VortexError>> + core::marker::Send), alloc::alloc::Global>>
... (94 more frames truncated)
Root Cause Analysis
The crash is a VortexExpect assertion failure in dictionary_encode (vortex-compressor/src/builtins/dict/float.rs:24). The code expects pre-computed distinct values to be present on the float stats, but they are None.
This happens because estimate_compression_ratio_with_sampling takes a sample of the original array, creates a fresh ArrayAndStats for it, and then calls compress on that sample. The original array's stats had distinct_count available (checked in expected_compression_ratio), but the newly sampled array's stats may not have the full distinct values (DistinctInfo) computed, so the vortex_expect call panics.
The fix should do one of two things: ensure distinct values are recomputed on the sampled data before dictionary_encode is called, or have dictionary_encode handle missing distinct values gracefully by returning an error or falling back instead of panicking.
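The two fix options can be illustrated with a minimal, self-contained sketch. Everything here is a hypothetical stand-in and not the actual Vortex API: FloatStats, DistinctInfo, compute_distinct, and this dictionary_encode signature are invented for illustration, and f32 replaces f16 to avoid an external crate. The sketch assumes that recomputing distinct values on the sample is acceptable when they are missing, and that a mismatch should surface as an error instead of a panic.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for the pre-computed distinct-value stats.
#[derive(Debug, Clone, PartialEq)]
pub struct DistinctInfo {
    /// Distinct values in first-seen order.
    pub values: Vec<f32>,
}

/// Hypothetical stats container; `distinct` is `None` when the stats
/// were built fresh (e.g. for a sample) without distinct values.
#[derive(Debug, Default)]
pub struct FloatStats {
    pub distinct: Option<DistinctInfo>,
}

/// Collect distinct values, comparing floats by bit pattern so that
/// NaN payloads and signed zeros are treated as separate entries.
pub fn compute_distinct(data: &[f32]) -> DistinctInfo {
    let mut seen = HashSet::new();
    let mut values = Vec::new();
    for &v in data {
        if seen.insert(v.to_bits()) {
            values.push(v);
        }
    }
    DistinctInfo { values }
}

/// Dictionary-encode `data` into (dictionary values, codes).
/// Missing distinct stats are recomputed instead of triggering a panic,
/// and an inconsistent distinct set yields an `Err`.
pub fn dictionary_encode(
    data: &[f32],
    stats: &FloatStats,
) -> Result<(Vec<f32>, Vec<u32>), String> {
    // Fallback: recompute on the data we were actually given.
    let distinct = match &stats.distinct {
        Some(d) => d.clone(),
        None => compute_distinct(data),
    };

    // Map each distinct bit pattern to its dictionary code.
    let index: HashMap<u32, u32> = distinct
        .values
        .iter()
        .enumerate()
        .map(|(i, v)| (v.to_bits(), i as u32))
        .collect();

    // Look up every element; report an error if the stats are stale.
    let codes = data
        .iter()
        .map(|v| {
            index
                .get(&v.to_bits())
                .copied()
                .ok_or_else(|| format!("value {v} missing from distinct set"))
        })
        .collect::<Result<Vec<u32>, String>>()?;

    Ok((distinct.values, codes))
}
```

Whether recomputing is the right trade-off depends on how expensive distinct-value computation is relative to the sample size; returning an error and letting the caller skip the dictionary scheme is the other option named above.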
Summary
- Target: file_io
- Crash File: crash-55707cfb952519caba2f1ab1c079c4f485ac447a
- Branch: develop
- Commit: df84cee
- Crash Artifact: https://github.com/vortex-data/vortex/actions/runs/23812682680/artifacts/6206811417
Reproduce
cargo +nightly fuzz run -D --sanitizer=none file_io ./fuzz/artifacts/file_io/crash-55707cfb952519caba2f1ab1c079c4f485ac447a -- -rss_limit_mb=0
Reproduction Steps
- Download the crash artifact: https://github.com/vortex-data/vortex/actions/runs/23812682680/artifacts/6206811417
- Assuming you download the zipfile to ~/Downloads, and your working directory is the repository root:
# Create the artifacts directory if you haven't already.
mkdir -p ./fuzz/artifacts
# Move the zipfile.
mv ~/Downloads/file_io-crash-artifacts.zip ./fuzz/artifacts/
# Unzip the zipfile.
unzip ./fuzz/artifacts/file_io-crash-artifacts.zip -d ./fuzz/artifacts/
# You can remove the zipfile now if you want to.
rm ./fuzz/artifacts/file_io-crash-artifacts.zip
- Reproduce the crash:
cargo +nightly fuzz run -D --sanitizer=none file_io ./fuzz/artifacts/file_io/crash-55707cfb952519caba2f1ab1c079c4f485ac447a -- -rss_limit_mb=0
If you want a backtrace:
RUST_BACKTRACE=1 cargo +nightly fuzz run -D --sanitizer=none file_io ./fuzz/artifacts/file_io/crash-55707cfb952519caba2f1ab1c079c4f485ac447a -- -rss_limit_mb=0
RUST_BACKTRACE=full cargo +nightly fuzz run -D --sanitizer=none file_io ./fuzz/artifacts/file_io/crash-55707cfb952519caba2f1ab1c079c4f485ac447a -- -rss_limit_mb=0
Single command to get a backtrace
mkdir -p ./fuzz/artifacts
mv ~/Downloads/file_io-crash-artifacts.zip ./fuzz/artifacts/
unzip ./fuzz/artifacts/file_io-crash-artifacts.zip -d ./fuzz/artifacts/
rm ./fuzz/artifacts/file_io-crash-artifacts.zip
RUST_BACKTRACE=1 cargo +nightly fuzz run -D --sanitizer=none file_io ./fuzz/artifacts/file_io/crash-55707cfb952519caba2f1ab1c079c4f485ac447a -- -rss_limit_mb=0
Auto-created by fuzzing workflow