Bug: UDAF unexpectedly returns non-empty result for empty table #17269

@niebayes

Describe the bug

Applying a UDAF to an empty table returns a non-empty result: a single row containing a null value.

To Reproduce

I have written a simple test that reproduces the bug. Calling any UDAF, such as last_value or first_value, on an empty table produces a non-empty result.

    #[tokio::test]
    async fn test_last_value() {
        use std::sync::Arc;

        use arrow_array::RecordBatch;
        use arrow_schema::{DataType, Field, Schema};
        use datafusion::assert_batches_eq;
        use datafusion::physical_plan::collect;
        use datafusion::prelude::SessionContext;

        let schema = Arc::new(Schema::new(vec![
            Field::new("id", DataType::Int32, false),
            Field::new("value", DataType::Int32, false),
        ]));
        let batch = RecordBatch::new_empty(schema);

        let ctx = SessionContext::new();
        ctx.register_batch("t", batch).unwrap();

        let plan = ctx
            .sql("select last_value(value order by id) from t")
            .await
            .unwrap()
            .logical_plan()
            .clone();
        let exec_plan = ctx.state().create_physical_plan(&plan).await.unwrap();
        let batches = collect(exec_plan, ctx.task_ctx()).await.unwrap();

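        // These assertions pass as written, which demonstrates the bug:
        // the empty table yields one row holding a single null value.
        assert_eq!(batches.iter().map(|b| b.num_rows()).sum::<usize>(), 1);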
        assert_batches_eq!(
            &[
                "+----------------------------------------------------+",
                "| last_value(t.value) ORDER BY [t.id ASC NULLS LAST] |",
                "+----------------------------------------------------+",
                "|                                                    |",
                "+----------------------------------------------------+",
            ],
            &batches
        );
    }

Expected behavior

Calling a UDAF on an empty table should give an empty result.
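
To make the difference between the observed and expected outcomes concrete, here is a minimal sketch that does not depend on DataFusion. Everything in it is hypothetical and exists only for illustration: the `Rows` alias stands in for a one-column result set, `last_value_observed` models the behavior reported above, and `last_value_expected` models the behavior this report asks for. It is not how DataFusion implements aggregation.

```rust
// Hypothetical stand-in for a one-column result set: each element is one
// output row, and `None` represents a null value.
type Rows = Vec<Option<i32>>;

// Models the observed behavior: always emit exactly one row, which is null
// when the input is empty. Input rows are (id, value) pairs.
fn last_value_observed(input: &[(i32, i32)]) -> Rows {
    // Pick the value belonging to the greatest id (i.e. "order by id").
    let last = input.iter().max_by_key(|(id, _)| *id).map(|(_, v)| *v);
    vec![last]
}

// Models the behavior expected in this report: emit no rows at all when the
// input is empty, otherwise behave like the function above.
fn last_value_expected(input: &[(i32, i32)]) -> Rows {
    if input.is_empty() {
        return Vec::new();
    }
    last_value_observed(input)
}

fn main() {
    let empty: Vec<(i32, i32)> = Vec::new();

    // Observed: one row containing a null.
    assert_eq!(last_value_observed(&empty), vec![None]);

    // Expected: zero rows.
    assert!(last_value_expected(&empty).is_empty());

    // Both agree on non-empty input.
    let rows = vec![(1, 10), (3, 30), (2, 20)];
    assert_eq!(last_value_observed(&rows), vec![Some(30)]);
    assert_eq!(last_value_expected(&rows), vec![Some(30)]);
}
```

In terms of the reproduction test, the expected behavior would correspond to the total row count across all returned batches being 0 rather than 1.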

Additional context

No response
