Skip to content

ARRAY_AGG .. ORDER BY fails on NULL typed columns #18344

@sadboy

Description

@sadboy

Describe the bug

This appears to have regressed somewhere between v48 and v49.

To Reproduce

DataFusion CLI v50.3.0
> create table source as
select null as col
union all
select null
union all
select null
;

0 row(s) fetched. 
Elapsed 0.009 seconds.

> select
    array_agg(distinct col) within group (order by col)
from source
group by all
;

Arrow error: Invalid argument error: The data type type Null has no natural order

Expected behavior

DataFusion CLI v48.0.1
> create table source as
select null as col
union all
select null
union all
select null
;

0 row(s) fetched. 
Elapsed 0.011 seconds.

> select
    array_agg(distinct col) within group (order by col)
from source
group by all
;

+--------------------------------+
| array_agg(DISTINCT source.col) |
+--------------------------------+
| [NULL]                         |
+--------------------------------+
1 row(s) fetched. 
Elapsed 0.019 seconds.

Additional context

The exact semantics of a Null typed column is perhaps debatable, but if this behavior is intentional, we should probably catch this at planning time instead of letting it err at execution time.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions