Describe the bug
arrow_cast accepts type strings that are not legal Arrow type combinations. The cast itself succeeds and the type propagates through the logical plan, but downstream operations panic in arrow-array with not implemented: Unexpected data type Time32(µs) (or similar).
Per the Arrow spec, Time32 only supports Second and Millisecond; Time64 only supports Microsecond and Nanosecond. The other four combinations should be rejected.
To Reproduce
use datafusion::prelude::SessionContext;

#[tokio::main]
async fn main() {
    let ctx = SessionContext::new();
    // Planning the arithmetic on the invalid Time32(Microsecond) value triggers the panic.
    let _ = ctx
        .sql("SELECT arrow_cast(0, 'Time32(Microsecond)') + 1")
        .await
        .unwrap()
        .create_physical_plan()
        .await;
}
Panic:
thread 'main' panicked at .../arrow-array-58.3.0/src/array/mod.rs:986:15:
not implemented: Unexpected data type Time32(µs)
All four invalid combinations panic when used with arithmetic:
SELECT arrow_cast(0, 'Time32(Microsecond)') + 1
SELECT arrow_cast(0, 'Time32(Nanosecond)') + 1
SELECT arrow_cast(0, 'Time64(Second)') + 1
SELECT arrow_cast(0, 'Time64(Millisecond)') + 1
The original fuzzer find was:
SELECT arrow_cast('5:00', 'Time32(Second)') - arrow_cast('03:00', 'Time32(Microsecond)')
Expected behavior
arrow_cast should reject the four invalid Time(Unit) combinations at planning time with a plan_err! such as:
Invalid Arrow type combination: Time32 only supports Second and Millisecond. Use Time64(Microsecond) for sub-millisecond precision.
The public SQL API should never panic on user-supplied SQL, even when the type string is obviously malformed.
Root cause
arrow_cast constructs a DataType::Time32(TimeUnit::Microsecond) (or similar invalid combo) from the user-supplied string without validating against Arrow's type-system rules. Downstream arrow-array code paths assume the type is well-formed and panic via unimplemented!() when they see the illegal combination.
Two-sided fix:
- DataFusion: validate Time32/Time64 × TimeUnit combinations when parsing the target type in arrow_cast.
- arrow-rs (separate): even if a malformed type reaches array code, it should return DataFusionError/ArrowError rather than unimplemented!().
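The DataFusion-side check could look roughly like the sketch below. It is only an illustration under stated assumptions: `TimeWidth`, `TimeUnit`, `validate_time_type`, and `parse_time_type` are hypothetical local stand-ins so the snippet compiles without external crates. A real fix would operate on arrow's `DataType` and `TimeUnit` and report the error through `plan_err!` instead of returning a `String`.

```rust
// Hypothetical stand-ins for arrow's TimeUnit and the Time32/Time64
// DataType variants, defined locally so the sketch needs no external crates.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TimeUnit {
    Second,
    Millisecond,
    Microsecond,
    Nanosecond,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TimeWidth {
    Time32,
    Time64,
}

/// Accepts the four legal width/unit pairs and rejects the other four.
fn validate_time_type(width: TimeWidth, unit: TimeUnit) -> Result<(), String> {
    match (width, unit) {
        (TimeWidth::Time32, TimeUnit::Second | TimeUnit::Millisecond)
        | (TimeWidth::Time64, TimeUnit::Microsecond | TimeUnit::Nanosecond) => Ok(()),
        (TimeWidth::Time32, other) => Err(format!(
            "Invalid Arrow type combination: Time32 only supports Second and Millisecond, got {:?}",
            other
        )),
        (TimeWidth::Time64, other) => Err(format!(
            "Invalid Arrow type combination: Time64 only supports Microsecond and Nanosecond, got {:?}",
            other
        )),
    }
}

/// Parses strings such as "Time32(Second)" and validates the combination
/// before handing a type back to the caller (assumed, simplified grammar).
fn parse_time_type(s: &str) -> Result<(TimeWidth, TimeUnit), String> {
    let s = s.trim();
    let (name, rest) = s
        .split_once('(')
        .ok_or_else(|| format!("Unsupported type string: {}", s))?;
    let unit_str = rest
        .strip_suffix(')')
        .ok_or_else(|| format!("Missing ')' in type string: {}", s))?;
    let width = match name.trim() {
        "Time32" => TimeWidth::Time32,
        "Time64" => TimeWidth::Time64,
        other => return Err(format!("Unsupported type name: {}", other)),
    };
    let unit = match unit_str.trim() {
        "Second" => TimeUnit::Second,
        "Millisecond" => TimeUnit::Millisecond,
        "Microsecond" => TimeUnit::Microsecond,
        "Nanosecond" => TimeUnit::Nanosecond,
        other => return Err(format!("Unknown time unit: {}", other)),
    };
    // Reject illegal pairs here, so no invalid type ever reaches the plan.
    validate_time_type(width, unit)?;
    Ok((width, unit))
}
```

With a check of this shape, `parse_time_type("Time32(Microsecond)")` returns an error at parse time, while `parse_time_type("Time64(Microsecond)")` succeeds, so the invalid type never propagates into the logical plan.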
Additional context
Found by a cargo fuzz target (fuzz/fuzz_targets/sql_physical_plan.rs) seeded with SQL from datafusion/sqllogictest/test_files/. The fuzzer mutated an existing arrow_cast(..., 'Time32(Second)') example by changing Second → Microsecond.