arrow_cast accepts invalid Time32(Microsecond)/Time32(Nanosecond)/Time64(Second)/Time64(Millisecond), panics on use #22194

@Dandandan

Describe the bug

arrow_cast accepts type strings that are not legal Arrow type combinations. The cast itself succeeds and the invalid type propagates through the logical plan, but downstream operations then panic in arrow-array with "not implemented: Unexpected data type Time32(µs)" (or a similar message for the other combinations).

Per the Arrow spec, Time32 only supports Second and Millisecond; Time64 only supports Microsecond and Nanosecond. The other four combinations should be rejected.
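The rule fits in a single match. The following is a minimal sketch of that check, not DataFusion or arrow-rs code: TimeUnit is a local stand-in for arrow's enum of the same name, and TimeWidth and is_legal_time_type are hypothetical names introduced only for illustration.

```rust
/// Local stand-in for arrow's `TimeUnit`, so the sketch has no dependencies.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TimeUnit {
    Second,
    Millisecond,
    Microsecond,
    Nanosecond,
}

/// Hypothetical helper type: which of the two time types is being built.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TimeWidth {
    Time32,
    Time64,
}

/// Returns true only for the four combinations the Arrow spec allows:
/// Time32 with Second/Millisecond, Time64 with Microsecond/Nanosecond.
fn is_legal_time_type(width: TimeWidth, unit: TimeUnit) -> bool {
    matches!(
        (width, unit),
        (TimeWidth::Time32, TimeUnit::Second | TimeUnit::Millisecond)
            | (TimeWidth::Time64, TimeUnit::Microsecond | TimeUnit::Nanosecond)
    )
}
```

With this sketch, is_legal_time_type(TimeWidth::Time32, TimeUnit::Microsecond) returns false, which is the combination used in the reproducer.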

To Reproduce

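use datafusion::prelude::SessionContext;

#[tokio::main]
async fn main() {
    let ctx = SessionContext::new();
    // Time32(Microsecond) is not a legal Arrow type, but arrow_cast accepts it.
    let _ = ctx
        .sql("SELECT arrow_cast(0, 'Time32(Microsecond)') + 1")
        .await
        .unwrap()
        // Continue through physical planning so a downstream operation touches the invalid type.
        .create_physical_plan()
        .await;
}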

Panic:

thread 'main' panicked at .../arrow-array-58.3.0/src/array/mod.rs:986:15:
not implemented: Unexpected data type Time32(µs)

All four invalid combinations panic when used with arithmetic:

SELECT arrow_cast(0, 'Time32(Microsecond)') + 1
SELECT arrow_cast(0, 'Time32(Nanosecond)')  + 1
SELECT arrow_cast(0, 'Time64(Second)')      + 1
SELECT arrow_cast(0, 'Time64(Millisecond)') + 1

The original fuzzer find was:

SELECT arrow_cast('5:00', 'Time32(Second)') - arrow_cast('03:00', 'Time32(Microsecond)')

Expected behavior

arrow_cast should reject the four invalid Time(Unit) combinations at planning time with a plan_err! such as:

Invalid Arrow type combination: Time32 only supports Second and Millisecond. Use Time64(Microsecond) for sub-millisecond precision.

The public SQL API should never panic on user-supplied SQL, even when the type string is obviously malformed.

Root cause

arrow_cast constructs a DataType::Time32(TimeUnit::Microsecond) (or similar invalid combo) from the user-supplied string without validating against Arrow's type-system rules. Downstream arrow-array code paths assume the type is well-formed and panic via unimplemented!() when they see the illegal combination.

Two-sided fix:

  • DataFusion: validate Time32/Time64 × TimeUnit combinations when parsing the target type in arrow_cast.
  • arrow-rs (separate): even if a malformed type reaches array code, it should return an ArrowError rather than panic via unimplemented!().
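The DataFusion-side check in the first bullet could look roughly like the sketch below. It works on the raw type string and returns a plain String error, purely to stay dependency-free; a real fix would presumably sit in arrow_cast's type parsing and use plan_err!. The function name check_time_type is hypothetical, and the wording of the Time64 message is an assumption modeled on the Time32 message proposed above.

```rust
/// Hypothetical parse-time check for an arrow_cast target such as
/// "Time32(Microsecond)". Returns the string unchanged when the combination
/// is legal (or is not a Time32/Time64 type at all), and an error otherwise.
fn check_time_type(target: &str) -> Result<&str, String> {
    // Split "Time32(Microsecond)" into the type name and the parenthesized unit.
    let Some((name, rest)) = target.split_once('(') else {
        return Ok(target); // not a parameterized type; out of scope here
    };
    let unit = rest.trim_end_matches(')').trim();

    match (name.trim(), unit) {
        // The four legal combinations pass through untouched.
        ("Time32", "Second" | "Millisecond") | ("Time64", "Microsecond" | "Nanosecond") => {
            Ok(target)
        }
        // Time32 cannot carry sub-millisecond units.
        ("Time32", "Microsecond" | "Nanosecond") => Err(format!(
            "Invalid Arrow type combination: Time32 only supports Second and Millisecond. \
             Use Time64({unit}) for sub-millisecond precision."
        )),
        // Time64 cannot carry the coarser units (message wording is assumed).
        ("Time64", "Second" | "Millisecond") => Err(format!(
            "Invalid Arrow type combination: Time64 only supports Microsecond and Nanosecond. \
             Use Time32({unit}) instead."
        )),
        // Any other type is not this sketch's concern.
        _ => Ok(target),
    }
}
```

Rejecting the string at this point means the invalid DataType is never constructed, so no downstream arrow-array code path can reach its unimplemented!() branch through arrow_cast.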

Additional context

Found by a cargo fuzz target (fuzz/fuzz_targets/sql_physical_plan.rs) seeded with SQL from datafusion/sqllogictest/test_files/. The fuzzer mutated an existing arrow_cast(..., 'Time32(Second)') example by changing Second to Microsecond.

Labels: bug
