Fuzzing Crash: VortexError in file_io #7228

@github-actions

Fuzzing Crash Report

Analysis

Crash Location: vortex-compressor/src/builtins/dict/float.rs:24:dictionary_encode

Error Message:

Assertion failed error: this must be present since `DictScheme` declared that we need distinct values

Stack Trace

stack backtrace:
   0: __rustc::rust_begin_unwind
             at /rustc/db3e99bbab28c6ca778b13222becdea54533d908/library/std/src/panicking.rs:689:5
   1: core::panicking::panic_fmt
             at /rustc/db3e99bbab28c6ca778b13222becdea54533d908/library/core/src/panicking.rs:80:14
   2: panic_display<vortex_error::VortexError>
             at /rustc/db3e99bbab28c6ca778b13222becdea54533d908/library/core/src/panicking.rs:259:5
   3: {closure#0}<&vortex_compressor::stats::float::DistinctInfo<half::binary16::f16>>
             at ./vortex-error/src/lib.rs:500:9
   4: unwrap_or_else<&vortex_compressor::stats::float::DistinctInfo<half::binary16::f16>, vortex_error::{impl#12}::vortex_expect::{closure_env#0}<&vortex_compressor::stats::float::DistinctInfo<half::binary16::f16>>>
             at /rustc/db3e99bbab28c6ca778b13222becdea54533d908/library/core/src/option.rs:1067:21
   5: vortex_expect<&vortex_compressor::stats::float::DistinctInfo<half::binary16::f16>>
             at ./vortex-error/src/lib.rs:349:14
   6: dictionary_encode
             at ./vortex-compressor/src/builtins/dict/float.rs:24:42
   7: compress
             at ./vortex-compressor/src/builtins/dict/mod.rs:202:20
   8: estimate_compression_ratio_with_sampling<vortex_compressor::builtins::dict::FloatDictScheme>
             at ./vortex-compressor/src/scheme.rs:280:10
   9: expected_compression_ratio
             at ./vortex-compressor/src/builtins/dict/mod.rs:188:20
  10: choose_scheme
             at ./vortex-compressor/src/compressor.rs:336:32
  11: choose_and_compress
             at ./vortex-compressor/src/compressor.rs:305:36
  12: compress_canonical
             at ./vortex-compressor/src/compressor.rs:177:22
  13: compress
             at ./vortex-compressor/src/compressor.rs:160:14
  14: {async_block#0}
             at ./vortex-layout/src/layouts/dict/writer.rs:154:65
  15: poll<alloc::boxed::Box<(dyn core::future::future::Future<Output=core::result::Result<alloc::sync::Arc<dyn vortex_layout::layout::Layout, alloc::alloc::Global>, vortex_error::VortexError>> + core::marker::Send), alloc::alloc::Global>>
             at /rustc/db3e99bbab28c6ca778b13222becdea54533d908/library/core/src/future/future.rs:133:9
  16: {async_block#0}
             at ./vortex-layout/src/strategy.rs:80:14
  17: poll<alloc::boxed::Box<(dyn core::future::future::Future<Output=core::result::Result<alloc::sync::Arc<dyn vortex_layout::layout::Layout, alloc::alloc::Global>, vortex_error::VortexError>> + core::marker::Send), alloc::alloc::Global>>
             at /rustc/db3e99bbab28c6ca778b13222becdea54533d908/library/core/src/future/future.rs:133:9
  18: {async_block#0}
             at ./vortex-layout/src/layouts/zoned/writer.rs:135:14
  19: poll<alloc::boxed::Box<(dyn core::future::future::Future<Output=core::result::Result<alloc::sync::Arc<dyn vortex_layout::layout::Layout, alloc::alloc::Global>, vortex_error::VortexError>> + core::marker::Send), alloc::alloc::Global>>
   ... (94 more frames truncated)

Root Cause Analysis

The crash is a VortexExpect assertion failure in dictionary_encode (vortex-compressor/src/builtins/dict/float.rs:24): the code expects pre-computed distinct values (DistinctInfo) to be present on the float stats, but finds None.

This happens because estimate_compression_ratio_with_sampling takes a sample of the original array, creates a fresh ArrayAndStats for it, and then calls compress on that sample. The original array's stats had distinct_count available (checked in expected_compression_ratio), but the stats of the newly sampled array may not have the full distinct values computed, so the vortex_expect call panics.

The fix should do one of two things: ensure distinct values are recomputed on the sampled data before dictionary_encode is called, or have dictionary_encode handle missing distinct values gracefully by returning an error or falling back instead of panicking.
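
The two fix directions named in the analysis can be sketched in isolation. This is a minimal, standalone illustration and not the Vortex code: SampleStats, DictError, compute_distinct, and encode_sample are hypothetical names, this dictionary_encode only borrows the name of the real function, and f32 stands in for the f16 values seen in the backtrace. It shows an encoder that returns an error when distinct values are missing, plus a sampling path that recomputes them first.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the stats attached to a (sampled) float array.
/// `distinct_bits` plays the role of the pre-computed distinct values.
#[derive(Debug, Default)]
struct SampleStats {
    distinct_bits: Option<BTreeSet<u32>>,
}

/// Hypothetical error type; the real code would use its own error type.
#[derive(Debug, PartialEq)]
enum DictError {
    MissingDistinctValues,
    ValueNotInDictionary,
}

/// Collect the distinct values of `values`, keyed by bit pattern so that
/// floats can live in an ordered set.
fn compute_distinct(values: &[f32]) -> BTreeSet<u32> {
    values.iter().map(|v| v.to_bits()).collect()
}

/// Option A: return an error instead of panicking when the distinct values
/// were never computed for this array.
fn dictionary_encode(
    values: &[f32],
    stats: &SampleStats,
) -> Result<(Vec<f32>, Vec<u32>), DictError> {
    let distinct = stats
        .distinct_bits
        .as_ref()
        .ok_or(DictError::MissingDistinctValues)?;

    // BTreeSet iterates in ascending order, so the dictionary is sorted
    // by bit pattern and can be searched with binary_search.
    let dict_bits: Vec<u32> = distinct.iter().copied().collect();
    let mut codes = Vec::with_capacity(values.len());
    for v in values {
        let idx = dict_bits
            .binary_search(&v.to_bits())
            .map_err(|_| DictError::ValueNotInDictionary)?;
        codes.push(idx as u32);
    }

    let dict = dict_bits.into_iter().map(f32::from_bits).collect();
    Ok((dict, codes))
}

/// Option B: recompute distinct values on the sample before encoding,
/// so the encoder's precondition always holds on the sampling path.
fn encode_sample(values: &[f32]) -> Result<(Vec<f32>, Vec<u32>), DictError> {
    let stats = SampleStats {
        distinct_bits: Some(compute_distinct(values)),
    };
    dictionary_encode(values, &stats)
}
```

With Option A, a caller such as the compression-ratio estimator could treat the error as "this scheme does not apply" and skip it. With Option B, the precondition is restored at the point where the fresh stats are created. Which of the two suits the real compressor depends on details this report does not show.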

Summary

Reproduce

cargo +nightly fuzz run -D --sanitizer=none file_io ./fuzz/artifacts/file_io/crash-55707cfb952519caba2f1ab1c079c4f485ac447a -- -rss_limit_mb=0

Reproduction Steps

  1. Download the crash artifact: https://github.com/vortex-data/vortex/actions/runs/23812682680/artifacts/6206811417

  2. Assuming you download the zipfile to ~/Downloads, and your working directory is the repository root:

# Create the artifacts directory if you haven't already.
mkdir -p ./fuzz/artifacts

# Move the zipfile.
mv ~/Downloads/file_io-crash-artifacts.zip ./fuzz/artifacts/

# Unzip the zipfile.
unzip ./fuzz/artifacts/file_io-crash-artifacts.zip -d ./fuzz/artifacts/

# You can remove the zipfile now if you want to.
rm ./fuzz/artifacts/file_io-crash-artifacts.zip

  3. Reproduce the crash:

cargo +nightly fuzz run -D --sanitizer=none file_io ./fuzz/artifacts/file_io/crash-55707cfb952519caba2f1ab1c079c4f485ac447a -- -rss_limit_mb=0

If you want a backtrace:

RUST_BACKTRACE=1 cargo +nightly fuzz run -D --sanitizer=none file_io ./fuzz/artifacts/file_io/crash-55707cfb952519caba2f1ab1c079c4f485ac447a -- -rss_limit_mb=0
RUST_BACKTRACE=full cargo +nightly fuzz run -D --sanitizer=none file_io ./fuzz/artifacts/file_io/crash-55707cfb952519caba2f1ab1c079c4f485ac447a -- -rss_limit_mb=0

Single command to get a backtrace

mkdir -p ./fuzz/artifacts
mv ~/Downloads/file_io-crash-artifacts.zip ./fuzz/artifacts/
unzip ./fuzz/artifacts/file_io-crash-artifacts.zip -d ./fuzz/artifacts/
rm ./fuzz/artifacts/file_io-crash-artifacts.zip
RUST_BACKTRACE=1 cargo +nightly fuzz run -D --sanitizer=none file_io ./fuzz/artifacts/file_io/crash-55707cfb952519caba2f1ab1c079c4f485ac447a -- -rss_limit_mb=0

Auto-created by fuzzing workflow

Labels: bug (A bug issue), fuzzer (Issues detected by the fuzzer)
