Fuzzing Crash: BinaryViewVector creation panic in FSST filter (views length mismatch) #6034

@github-actions

Fuzzing Crash Report

Analysis

Crash Location: vortex-vector/src/binaryview/vector.rs:136 in the try_new function

Error Message:

Failed to create `BinaryViewVector`:
  views buffer length 0 != validity length 1

Stack Trace:

   3: try_new<vortex_vector::binaryview::types::StringType>
             at ./vortex-vector/src/binaryview/vector.rs:136:9
   4: new<vortex_vector::binaryview::types::StringType>
             at ./vortex-vector/src/binaryview/vector.rs:118:9
   5: new_unchecked<vortex_vector::binaryview::types::StringType>
             at ./vortex-vector/src/binaryview/vector.rs:100:13
   6: execute_parent
             at ./encodings/fsst/src/kernel.rs:112:17
   7: execute_parent<vortex_fsst::array::FSSTVTable, vortex_fsst::kernel::FSSTFilterKernel>
             at ./vortex-array/src/kernel.rs:123:14
   8: execute<vortex_fsst::array::FSSTVTable>
             at ./vortex-array/src/kernel.rs:57:43
   9: execute_parent
             at ./encodings/fsst/src/array.rs:193:24
  10: execute_canonical_parent<vortex_fsst::array::FSSTVTable>
             at ./vortex-array/src/vtable/dyn_.rs:236:28
  11: execute
             at ./vortex-array/src/executor.rs:87:18
  12: execute<vortex_array::canonical::Canonical>
             at ./vortex-array/src/executor.rs:43:9

Root Cause:

The fuzzer discovered a bug in the FSST filter kernel where it attempts to create a BinaryViewVector (specifically a StringVector) with mismatched buffer and validity lengths. The validation check at vortex-vector/src/binaryview/vector.rs:136-141 catches this:

vortex_ensure!(
    views.len() == validity.len(),
    "views buffer length {} != validity length {}",
    views.len(),
    validity.len()
);
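The invariant behind this check can be shown in isolation. The following is a minimal sketch using a hypothetical stand-in type (`SketchVector`, with plain `Vec`s in place of the real views buffer and validity mask); it is not the vortex-vector implementation, only a model of the length comparison quoted above.

```rust
/// Hypothetical stand-in for a view-based string vector; NOT the vortex-vector type.
#[derive(Debug)]
struct SketchVector {
    views: Vec<u128>,    // assumed: one 16-byte view per element
    validity: Vec<bool>, // assumed: one validity flag per element
}

impl SketchVector {
    /// Fallible constructor that models the length check from the report.
    fn try_new(views: Vec<u128>, validity: Vec<bool>) -> Result<Self, String> {
        if views.len() != validity.len() {
            return Err(format!(
                "views buffer length {} != validity length {}",
                views.len(),
                validity.len()
            ));
        }
        Ok(Self { views, validity })
    }
}

fn main() {
    // The shape reported in the crash: no views, one validity entry.
    let err = SketchVector::try_new(vec![], vec![true]).unwrap_err();
    println!("{}", err);

    // A consistent pair is accepted.
    let ok = SketchVector::try_new(vec![0], vec![true]).unwrap();
    println!("{} views, {} validity entries", ok.views.len(), ok.validity.len());
}
```

In this model, any constructor that skips the check has to be handed inputs that already satisfy it, which is the contract the filter kernel appears to have broken.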

The crash occurs when:

  1. A filter operation is executed on an FSST-encoded array
  2. The FSST filter kernel (encodings/fsst/src/kernel.rs:112) calls StringVector::new_unchecked
  3. The new_unchecked function internally calls new, which calls try_new
  4. The views buffer has length 0, but the validity mask has length 1

This indicates that the FSST filter kernel constructs a BinaryViewVector in an inconsistent state: most likely the filter produced an empty views buffer while the validity mask was not filtered to match and still has length 1.
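One general way to keep the two lengths in step is to derive both outputs from the same selection mask. The sketch below assumes that approach; `filter_by_mask` is a hypothetical helper, and nothing here is taken from the actual FSST kernel or from its eventual fix.

```rust
/// Hypothetical helper: keep only the positions where `mask` is true.
fn filter_by_mask<T: Clone>(items: &[T], mask: &[bool]) -> Vec<T> {
    assert_eq!(items.len(), mask.len(), "mask must cover every element");
    items
        .iter()
        .zip(mask)
        .filter(|(_, keep)| **keep)
        .map(|(item, _)| item.clone())
        .collect()
}

fn main() {
    // Stand-ins for the views buffer and the validity mask of a 3-element array.
    let views: Vec<u128> = vec![10, 20, 30];
    let validity = vec![true, false, true];

    // Applying the same mask to both keeps the lengths equal by construction.
    let mask = vec![false, true, false];
    let filtered_views = filter_by_mask(&views, &mask);
    let filtered_validity = filter_by_mask(&validity, &mask);
    assert_eq!(filtered_views.len(), filtered_validity.len());

    // An all-false mask empties both, so the 0-vs-1 mismatch cannot occur here.
    let none = vec![false, false, false];
    assert!(filter_by_mask(&views, &none).is_empty());
    assert!(filter_by_mask(&validity, &none).is_empty());
    println!("lengths agree: {}", filtered_views.len());
}
```

Whether the real kernel filters validity separately, or skips it on some path, is not established by the report; the sketch only illustrates the invariant a fix would need to restore.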

The call path through the executor and filter evaluation suggests this happens during lazy evaluation of filter expressions on FSST-encoded struct arrays.

Debug Output
FuzzFileAction {
    array: ChunkedArray {
        dtype: Struct(
            StructFields {
                names: FieldNames(
                    [
                        FieldName(
                            "ec\u{1}!\u{1f}",
                        ),
                        FieldName(
                            "",
                        ),
                    ],
                ),
                dtypes: [
                    FieldDType {
                        inner: Owned(
                            Utf8(
                                Nullable,
                            ),
                        ),
                    },
                    FieldDType {
                        inner: Owned(
                            Utf8(
                                Nullable,
                            ),
                        ),
                    },
                ],
            },
            NonNullable,
        ),
        len: 23,
        chunk_offsets: PrimitiveArray {
            dtype: Primitive(
                U64,
                NonNullable,
            ),
            buffer: BufferHandle(
                Host(
                    Buffer<u8> {
                        length: 32,
                        alignment: Alignment(
                            8,
                        ),
                        as_slice: [0, 0, 0, 0, 0, 0, 0, 0, 17, 0, 0, 0, 0, 0, 0, 0, ...],
                    },
                ),
            ),
            validity: NonNullable,
            stats_set: ArrayStats { ... },
        },
        chunks: [
            StructArray { len: 17, ... },
            StructArray { len: 6, ... },
        ],
        ...
    },
    projection_expr: Some(...),
    filter_expr: Some(...),
    compressor_strategy: Default,
}

Note: Full debug output truncated for brevity. The array is a ChunkedArray with struct fields containing UTF-8 data, with both projection and filter expressions applied.
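The visible prefix of the chunk_offsets buffer can be read as a quick sanity check. Assuming the U64 values are stored little-endian (an assumption; the debug output does not state the byte order), the first 16 bytes decode to offsets 0 and 17, which is consistent with the first chunk having length 17. The remaining 16 bytes are elided in the report and are not reconstructed here. `decode_u64_le` is a hypothetical helper written for this illustration.

```rust
/// Hypothetical helper: decode a byte slice as consecutive little-endian u64 values.
/// Trailing bytes that do not fill a whole u64 are ignored.
fn decode_u64_le(bytes: &[u8]) -> Vec<u64> {
    bytes
        .chunks_exact(8)
        .map(|chunk| {
            let mut word = [0u8; 8];
            word.copy_from_slice(chunk);
            u64::from_le_bytes(word)
        })
        .collect()
}

fn main() {
    // Only the 16 bytes shown in the debug output; the rest is truncated there.
    let visible: [u8; 16] = [0, 0, 0, 0, 0, 0, 0, 0, 17, 0, 0, 0, 0, 0, 0, 0];
    let offsets = decode_u64_le(&visible);
    println!("{:?}", offsets); // prints [0, 17]
}
```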

Reproduction

  1. Download the crash artifact:

  2. Reproduce locally:

# The artifact contains file_io/crash-27105207cda6f9e292fe6cf22af4e64c1b3efeac
cargo +nightly fuzz run -D --sanitizer=none file_io file_io/crash-27105207cda6f9e292fe6cf22af4e64c1b3efeac -- -rss_limit_mb=0

  3. Get full backtrace:

RUST_BACKTRACE=full cargo +nightly fuzz run -D --sanitizer=none file_io file_io/crash-27105207cda6f9e292fe6cf22af4e64c1b3efeac -- -rss_limit_mb=0

Auto-created by fuzzing workflow with Claude analysis
