Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regression: group by count distinct doesn't work for timestamps with time zone #10042

Closed
Tracked by #9904
joroKr21 opened this issue Apr 11, 2024 · 0 comments · Fixed by #10043
Closed
Tracked by #9904

Regression: group by count distinct doesn't work for timestamps with time zone #10042

joroKr21 opened this issue Apr 11, 2024 · 0 comments · Fixed by #10043
Labels
bug Something isn't working regression Something that used to work no longer does

Comments

@joroKr21
Copy link
Contributor

Describe the bug

I didn't check in which version it regressed, but it worked in v34 and is broken in v37.
It looks like the regression was caused by the specialization of the DistinctCount implementation of AggregateExpr.

To Reproduce

create table t_source
as values
 ('2018-11-13T17:11:10.011375885995', 'Row 0', 'X'),
 ('2011-12-13T11:13:10.12345',        'Row 1', 'X'),
 (null,                               'Row 2', 'Y'),
 ('2021-01-01T05:11:10.432',          'Row 3', 'Y');

create table t as
select
  arrow_cast(column1, 'Timestamp(Second, Some("UTC"))') as secs_utc,
  column2 as names,
  column3 as tag
from t_source;

SELECT tag, count(DISTINCT secs_utc) FROM t GROUP BY tag ORDER BY tag;

Expected behavior

It should work. The error we get is:

External error: query failed: DataFusion error: External error: External error: Arrow error: Invalid argument error: column types must match schema types, expected List(Field { name: "item", data_type: Timestamp(Second, Some("UTC")), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }) but found List(Field { name: "item", data_type: Timestamp(Second, None), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }) at column index 1

Additional context

No response

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working regression Something that used to work no longer does
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants