Scan panic in FFI #7808

@m7kss1

What happened?

A scan can panic when it applies a filter expression whose validity depends on the scanned schema and the expression is not type-compatible with that schema.

RowIdxLayoutReader and similar scan layout readers call fallible expression partitioning/optimization code and unwrap the result with vortex_expect(...), so an error from that code becomes a panic instead of being returned to the caller.

Stacktrace:

thread '<unnamed>' (2727824) panicked at /home/ubuntu/vortex/vortex-error/src/lib.rs:347:33:
We should not fail to partition expression over struct fields:
  Other error: Cannot compare different DTypes u8 and i32
Backtrace:
   0: <vortex_array::scalar_fn::fns::binary::Binary as vortex_array::scalar_fn::vtable::ScalarFnVTable>::return_dtype
             at /home/ubuntu/vortex/vortex-array/src/scalar_fn/fns/binary/mod.rs:135:13
   1: <vortex_array::scalar_fn::typed::TypedScalarFnInstance<V> as vortex_array::scalar_fn::typed::DynScalarFn>::return_dtype
             at /home/ubuntu/vortex/vortex-array/src/scalar_fn/typed.rs:175:9
   2: vortex_array::scalar_fn::erased::ScalarFnRef::return_dtype
             at /home/ubuntu/vortex/vortex-array/src/scalar_fn/erased.rs:127:16
   3: vortex_array::expr::expression::Expression::return_dtype
             at /home/ubuntu/vortex/vortex-array/src/expr/expression.rs:106:24
   4: vortex_array::expr::expression::Expression::return_dtype::{{closure}}
             at /home/ubuntu/vortex/vortex-array/src/expr/expression.rs:104:24
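
To illustrate the difference between unwrapping and propagating, here is a minimal self-contained sketch. It does not use the Vortex API: `DType`, `ScanError`, `comparison_dtype`, `plan_filter_panicking` and `plan_filter` are hypothetical names that only model the shape of the problem, assuming the type check is a fallible function called while planning the filter.

```rust
use std::fmt;

// Hypothetical stand-in for a column/literal data type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DType {
    U8,
    I32,
    Bool,
}

// Hypothetical error type; a real implementation would use the crate's own error.
#[derive(Debug, PartialEq, Eq)]
struct ScanError(String);

impl fmt::Display for ScanError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}

impl std::error::Error for ScanError {}

// Fallible type check for a comparison: both sides must have the same dtype.
fn comparison_dtype(lhs: DType, rhs: DType) -> Result<DType, ScanError> {
    if lhs != rhs {
        return Err(ScanError(format!(
            "Cannot compare different DTypes {:?} and {:?}",
            lhs, rhs
        )));
    }
    Ok(DType::Bool)
}

// Models the reported behaviour: a schema-dependent error is turned into a panic.
fn plan_filter_panicking(column: DType, literal: DType) -> DType {
    comparison_dtype(column, literal)
        .expect("We should not fail to partition expression over struct fields")
}

// Models the desired behaviour: the error is propagated with `?`
// so that an FFI wrapper can convert it into an error value.
fn plan_filter(column: DType, literal: DType) -> Result<DType, ScanError> {
    let dtype = comparison_dtype(column, literal)?;
    Ok(dtype)
}
```

With this shape, `plan_filter(DType::U8, DType::I32)` returns an `Err` that the caller can report, whereas `plan_filter_panicking` with the same arguments panics.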

NOTE: There is a follow-up issue:

After this panic is fixed, the same test below may expose a separate vx_partition_next ownership bug.

vx_partition_next moves the internal VxPartitionScan out of the FFI handle with ptr::read. If scan execution returns an error before the function writes a replacement state back with ptr::write, the handle is left containing moved-out bytes. When the caller later calls vx_partition_free, those bytes are dropped again, which can result in a double free.
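
One possible way to keep the handle valid on every path is to swap in a placeholder state with `std::mem::replace` instead of using `ptr::read`. The sketch below is only a model under that assumption: `PartitionState`, `PartitionHandle`, `step` and `partition_next` are hypothetical names, not the actual vortex-ffi types, and a first item of `0` is used to simulate a failing scan.

```rust
use std::mem;

// Hypothetical stand-in for the internal scan state held by the FFI handle.
enum PartitionState {
    Pending(Vec<u8>),
    Finished,
}

// Hypothetical stand-in for the handle passed across the FFI boundary.
struct PartitionHandle {
    state: PartitionState,
}

// Consumes the state and yields the next item plus the follow-up state.
// A leading item of 0 simulates scan execution returning an error.
fn step(state: PartitionState) -> Result<(Option<u8>, PartitionState), String> {
    match state {
        PartitionState::Pending(mut items) => {
            if items.is_empty() {
                return Ok((None, PartitionState::Finished));
            }
            let item = items.remove(0);
            if item == 0 {
                return Err("scan failed".to_string());
            }
            Ok((Some(item), PartitionState::Pending(items)))
        }
        PartitionState::Finished => Ok((None, PartitionState::Finished)),
    }
}

fn partition_next(handle: &mut PartitionHandle) -> Result<Option<u8>, String> {
    // Take ownership of the state while leaving a valid placeholder behind,
    // so an early return below never leaves moved-out bytes in the handle.
    let state = mem::replace(&mut handle.state, PartitionState::Finished);
    let (item, next) = step(state)?;
    handle.state = next;
    Ok(item)
}
```

If `step` fails, the handle still holds `PartitionState::Finished`, so dropping it later releases each allocation exactly once. The trade-off in this sketch is that a failed partition cannot be resumed.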

Steps to reproduce

The following vortex-ffi test case reproduces the panic:

TEST_CASE("Broken scan with DType mismatch in filter", "[filter]") {
    vx_session *session = vx_session_new();
    defer {
        vx_session_free(session);
    };
    TempPath path = write_sample(session);
    vx_error *error = nullptr;

    vx_data_source_options ds_opts = {};
    ds_opts.paths = path.c_str();
    const vx_data_source *ds = vx_data_source_new(session, &ds_opts, &error);
    require_no_error(error);
    defer {
        vx_data_source_free(ds);
    };

    vx_expression *root = vx_expression_root();
    defer {
        vx_expression_free(root);
    };

    // NOTE: this is from vortex-ffi/test/scan.cpp
    vx_expression *age_col = vx_expression_get_item("age", root);
    REQUIRE(age_col != nullptr);
    defer {
        vx_expression_free(age_col);
    };

    vx_scalar *lit = vx_scalar_new_i32(67, false);
    defer {
        vx_scalar_free(lit);
    };

    vx_expression *lit_expr = vx_expression_literal(lit, &error);
    require_no_error(error);
    defer {
        vx_expression_free(lit_expr);
    };

    // DType mismatch between age_col (u8) and lit (i32)
    vx_expression *filter = vx_expression_binary(VX_OPERATOR_EQ, age_col, lit_expr);
    REQUIRE(filter != nullptr);
    defer {
        vx_expression_free(filter);
    };

    vx_scan_options scan_opts = {};
    scan_opts.max_threads = 1;
    scan_opts.filter = filter;

    vx_scan *scan = vx_data_source_scan(ds, &scan_opts, nullptr, &error);
    require_no_error(error);
    defer {
        vx_scan_free(scan);
    };

    vx_partition *partition = vx_scan_next_partition(scan, &error);
    require_no_error(error);
    REQUIRE(partition != nullptr);
    defer {
        vx_partition_free(partition);
    };

    // This call must set vx_error and return nullptr, not panic
    const vx_array *array = vx_partition_next(partition, &error);
    REQUIRE(array == nullptr);
    REQUIRE(error != nullptr);
    vx_error_free(error);
}

Environment

Last commit: a31a0db9f054d069d58c6f045dcf308c57b90ca8
Linux, Ubuntu
x86-64

Additional context

No response
