Fuzzing Crash: Unsupported extension type in Arrow conversion #5822

@github-actions

Fuzzing Crash Report

Analysis

Crash Location: vortex-dtype/src/arrow.rs:262 in the to_arrow_dtype function

Error Message:

Unsupported extension type "example.ipv4"

Stack Trace:

   3: to_arrow_dtype
             at ./vortex-dtype/src/arrow.rs:262:21
   4: to_arrow_dtype
             at ./vortex-dtype/src/arrow.rs:234:28
   5: {closure#0}
             at ./vortex-array/src/arrow/compute/to_arrow/canonical.rs:109:46
   ...
  11: into_arrow_preferred
             at ./vortex-array/src/arrow/mod.rs:41:9
  12: arrow_compare
             at ./vortex-array/src/compute/compare.rs:333:36

Root Cause: The fuzzer created an ExtensionArray with a custom extension type "example.ipv4", nested here as the elements of a ListViewArray (which is why to_arrow_dtype appears twice in the stack trace). When the fuzz target performs a comparison, it converts the array to Arrow, and that requires mapping the extension dtype to an Arrow data type. However, the to_arrow_dtype function only supports known temporal extension types (via is_temporal_ext_type) and bails with an error for any other extension type ID.

The issue is that the fuzzer can generate arbitrary extension type IDs, but the Arrow conversion code doesn't have a fallback strategy for unknown extension types (e.g., converting them to their storage type).
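
A minimal sketch of the storage-type fallback mentioned above, written against toy types rather than the real vortex-dtype or arrow-schema APIs. The enums `DType` and `ArrowType`, the helper `is_known_temporal`, the ID "toy.timestamp", and both conversion functions are hypothetical names introduced only for illustration; the actual fix may look quite different.

```rust
// Toy stand-ins for illustration only. These are NOT the real Vortex or Arrow types.
#[derive(Debug, Clone, PartialEq)]
enum DType {
    Bool,
    List(Box<DType>),
    Extension { id: String, storage: Box<DType> },
}

#[derive(Debug, Clone, PartialEq)]
enum ArrowType {
    Boolean,
    List(Box<ArrowType>),
    TimestampMicros,
}

// Hypothetical check, loosely analogous to is_temporal_ext_type in the report.
fn is_known_temporal(id: &str) -> bool {
    id == "toy.timestamp"
}

// Mirrors the reported behavior: unknown extension IDs are an error.
fn to_arrow_strict(dtype: &DType) -> Result<ArrowType, String> {
    match dtype {
        DType::Bool => Ok(ArrowType::Boolean),
        DType::List(inner) => Ok(ArrowType::List(Box::new(to_arrow_strict(inner)?))),
        DType::Extension { id, .. } if is_known_temporal(id) => Ok(ArrowType::TimestampMicros),
        DType::Extension { id, .. } => Err(format!("Unsupported extension type \"{id}\"")),
    }
}

// Fallback variant: unknown extension IDs are converted via their storage dtype.
fn to_arrow_with_fallback(dtype: &DType) -> Result<ArrowType, String> {
    match dtype {
        DType::Bool => Ok(ArrowType::Boolean),
        DType::List(inner) => Ok(ArrowType::List(Box::new(to_arrow_with_fallback(inner)?))),
        DType::Extension { id, .. } if is_known_temporal(id) => Ok(ArrowType::TimestampMicros),
        DType::Extension { storage, .. } => to_arrow_with_fallback(storage),
    }
}

// The shape of the crashing dtype: List(Extension("example.ipv4", storage = Bool)).
fn crashing_dtype() -> DType {
    DType::List(Box::new(DType::Extension {
        id: "example.ipv4".to_string(),
        storage: Box::new(DType::Bool),
    }))
}
```

With these toy definitions, `to_arrow_strict(&crashing_dtype())` returns the same error text as the report, while `to_arrow_with_fallback(&crashing_dtype())` yields a list of booleans. The trade-off is that the fallback silently drops the extension ID, so whether that is acceptable for comparison semantics is a design decision for the maintainers.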

Debug Output
FuzzFileAction {
    array: ListViewArray {
        dtype: List(
            Extension(
                ExtDType {
                    id: ExtID(
                        "example.ipv4",
                    ),
                    storage_dtype: Bool(
                        Nullable,
                    ),
                    metadata: None,
                },
            ),
            Nullable,
        ),
        elements: ExtensionArray {
            dtype: Extension(
                ExtDType {
                    id: ExtID(
                        "example.ipv4",
                    ),
                    storage_dtype: Bool(
                        Nullable,
                    ),
                    metadata: None,
                },
            ),
            storage: BoolArray {
                dtype: Bool(
                    Nullable,
                ),
                bits: BitBuffer {
                    buffer: Buffer<u8> {
                        length: 1,
                        alignment: Alignment(
                            1,
                        ),
                        as_slice: [3],
                    },
                    offset: 0,
                    len: 3,
                },
                validity: AllValid,
                stats_set: ArrayStats {
                    inner: RwLock {
                        data: StatsSet {
                            values: [],
                        },
                    },
                },
            },
            stats_set: ArrayStats {
                inner: RwLock {
                    data: StatsSet {
                        values: [],
                    },
                },
            },
        },
        offsets: PrimitiveArray {
            dtype: Primitive(
                U16,
                NonNullable,
            ),
            buffer: Buffer<u8> {
                length: 6,
                alignment: Alignment(
                    2,
                ),
                as_slice: [0, 0, 3, 0, 3, 0],
            },
            validity: NonNullable,
            stats_set: ArrayStats {
                inner: RwLock {
                    data: StatsSet {
                        values: [
                            (
                                IsSorted,
                                Exact(
                                    ScalarValue(
                                        Bool(
                                            true,
                                        ),
                                    ),
                                ),
                            ),
                        ],
                    },
                },
            },
        },
        sizes: PrimitiveArray {
            dtype: Primitive(
                U16,
                NonNullable,
            ),
            buffer: Buffer<u8> {
                length: 6,
                alignment: Alignment(
                    2,
                ),
                as_slice: [3, 0, 0, 0, 0, 0],
            },
            validity: NonNullable,
            stats_set: ArrayStats {
                inner: RwLock {
                    data: StatsSet {
                        values: [],
                    },
                },
            },
        },
        is_zero_copy_to_list: true,
        validity: AllValid,
        stats_set: ArrayStats {
            inner: RwLock {
                data: StatsSet {
                    values: [],
                },
            },
        },
    },
    projection_expr: None,
    filter_expr: None,
    compressor_strategy: Compact,
}
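
To make the dump above easier to read, the following sketch decodes its raw buffers into logical values. It assumes little-endian byte order for the U16 buffers and least-significant-bit-first packing for the bit buffer (Arrow's convention); the helper names `decode_u16_le`, `unpack_bits`, and `list_views` are hypothetical and not part of Vortex.

```rust
/// Decode a byte buffer into u16 values, assuming little-endian byte order.
fn decode_u16_le(bytes: &[u8]) -> Vec<u16> {
    bytes
        .chunks_exact(2)
        .map(|c| u16::from_le_bytes([c[0], c[1]]))
        .collect()
}

/// Unpack `len` bits starting at bit `offset`, assuming LSB-first packing.
fn unpack_bits(bytes: &[u8], offset: usize, len: usize) -> Vec<bool> {
    (offset..offset + len)
        .map(|i| (bytes[i / 8] >> (i % 8)) & 1 == 1)
        .collect()
}

/// Materialize each list from its (offset, size) pair over the shared elements.
fn list_views(elements: &[bool], offsets: &[u16], sizes: &[u16]) -> Vec<Vec<bool>> {
    offsets
        .iter()
        .zip(sizes)
        .map(|(&o, &s)| {
            let start = o as usize;
            elements[start..start + s as usize].to_vec()
        })
        .collect()
}
```

Under those assumptions, the offsets buffer [0, 0, 3, 0, 3, 0] decodes to 0, 3, 3 and the sizes buffer [3, 0, 0, 0, 0, 0] to 3, 0, 0, so the array holds three lists: one with three boolean elements followed by two empty ones. The crash is therefore triggered by the dtype alone, not by anything unusual in the data.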

Reproduction

  1. Download the crash artifact:

  2. Reproduce locally:

# The artifact contains file_io/crash-4c9f9437bf8e76457bcc9aaa86d746fa96628f96
cargo +nightly fuzz run -D --sanitizer=none file_io file_io/crash-4c9f9437bf8e76457bcc9aaa86d746fa96628f96 -- -rss_limit_mb=0

  3. Get full backtrace:

RUST_BACKTRACE=full cargo +nightly fuzz run -D --sanitizer=none file_io file_io/crash-4c9f9437bf8e76457bcc9aaa86d746fa96628f96 -- -rss_limit_mb=0

Auto-created by fuzzing workflow with Claude analysis