Describe the bug
map(['a','b'], [col, col * 10]) fails at execution time with:
Execution error: map requires key and value lists to have the same length
The error occurs whenever the key list is a compile-time constant (all literals) and the value list contains column references. The two sides are evaluated at different granularities: keys produce a ColumnarValue::Scalar(FixedSizeList[N]) (length = N = number of keys), while values produce a ColumnarValue::Array (length = batch_size). The length check in map's implementation then compares N != batch_size and raises the error.
To Reproduce
use datafusion::prelude::SessionContext;

#[tokio::test]
async fn map_literal_keys_column_values() {
    let ctx = SessionContext::new();
    ctx.sql("CREATE TABLE t AS VALUES (1), (2), (3)")
        .await.unwrap().collect().await.unwrap();
    // Fails: "map requires key and value lists to have the same length"
    ctx.sql("SELECT map(['a','b'], [column1, column1 * 10]) FROM t")
        .await.unwrap().collect().await.unwrap();
}
Using make_array('a','b') instead of ['a','b'] does not help — same failure.
All-literal calls (map(['a','b'], [1, 2])) work correctly.
Expected behavior
Each output row should contain a map value {a: <col_val>, b: <col_val * 10>}.
Additional context
- DataFusion version: 53.1.0
- The fix likely needs to expand ColumnarValue::Scalar to an Array of batch_size before the per-element-count check in datafusion-functions-nested/src/map.rs.
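To illustrate the suggested direction, here is a minimal sketch of the idea using a simplified stand-in type. Everything in it is an assumption made for illustration: `Columnar`, `naive_len`, `into_rows`, and `make_maps` are hypothetical names, not DataFusion's actual types or the code in map.rs, and values are modeled as strings to keep the example dependency-free. It only shows why comparing the outer lengths of a scalar list and a per-row array mismatches, and how repeating the scalar once per row before the check avoids that.

```rust
/// Simplified stand-in for DataFusion's ColumnarValue (hypothetical, NOT the real type).
#[derive(Debug, Clone, PartialEq)]
enum Columnar {
    /// One list shared by every row, e.g. the literal keys ['a','b'].
    Scalar(Vec<String>),
    /// One list per row, e.g. [column1, column1 * 10] evaluated over a batch.
    Array(Vec<Vec<String>>),
}

impl Columnar {
    /// Outer length as a naive check would see it:
    /// N (number of elements) for a scalar, batch_size for an array.
    fn naive_len(&self) -> usize {
        match self {
            Columnar::Scalar(list) => list.len(),
            Columnar::Array(rows) => rows.len(),
        }
    }

    /// Hypothetical helper: repeat a scalar list once per row so both
    /// sides end up with exactly `batch_size` per-row lists.
    fn into_rows(self, batch_size: usize) -> Vec<Vec<String>> {
        match self {
            Columnar::Scalar(list) => vec![list; batch_size],
            Columnar::Array(rows) => rows,
        }
    }
}

/// Builds one map (as key/value pairs) per row, normalizing both inputs
/// to per-row lists BEFORE any length comparison.
fn make_maps(
    keys: Columnar,
    values: Columnar,
    batch_size: usize,
) -> Result<Vec<Vec<(String, String)>>, String> {
    let keys = keys.into_rows(batch_size);
    let values = values.into_rows(batch_size);
    if keys.len() != values.len() {
        return Err("key and value inputs have different row counts".to_string());
    }
    keys.into_iter()
        .zip(values)
        .map(|(k, v)| -> Result<Vec<(String, String)>, String> {
            // The per-row element-count check is still meaningful after expansion.
            if k.len() != v.len() {
                return Err(
                    "map requires key and value lists to have the same length".to_string(),
                );
            }
            Ok(k.into_iter().zip(v).collect())
        })
        .collect()
}
```

With two literal keys and a three-row batch, `naive_len` reports 2 for the keys and 3 for the values, which mirrors the N != batch_size comparison described above, while `make_maps` yields three per-row maps. In DataFusion itself, `ColumnarValue::into_array(num_rows)` looks like the existing method that would play the role of `into_rows`, though whether it fits cleanly into map.rs is untested here.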