Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add additional test coverage for aggregaes using dates/times/timestamps/decimals #6939

Merged
merged 2 commits into from
Jul 12, 2023

Conversation

alamb
Copy link
Contributor

@alamb alamb commented Jul 12, 2023

Which issue does this PR close?

Related to #6889

Rationale for this change

I was doing some testing and found out that the new GroupsAccumulator in #6904 aren't implemented for date/time/timestamps -- so here are some tests to cover that

What changes are included in this PR?

Add some tests

Are these changes tested?

These are only tests

Are there any user-facing changes?

@github-actions github-actions bot added core Core datafusion crate sqllogictest labels Jul 12, 2023
Copy link
Contributor

@tustvold tustvold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The results all seem plausible to me

query TRT
select c2, sum(c1), arrow_typeof(sum(c1)) from d_table GROUP BY c2 ORDER BY c2;
----
A 1100.045 Decimal128(20, 3)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure whether the output type is correct, but I also don't know that it isn't so 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @liukun4515 in case you have some thoughts (this PR doesn't change the type FWIW)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The data type of c1 is c1 decimal(10,3)
the result of sum(c1) is decimal(10+10,3), this align with spark rule: the new precision = old precision + 10

@alamb alamb merged commit 44e3876 into apache:main Jul 12, 2023
20 of 21 checks passed
@alamb alamb deleted the alamb/agg_test_coverage branch July 12, 2023 21:59
alamb added a commit to alamb/datafusion that referenced this pull request Jul 13, 2023
…ps/decimals (apache#6939)

* Add additional test coverage for aggregaes using dates/times/timestamps/decimals

* Add coverage for date32/date64
alamb added a commit that referenced this pull request Jul 13, 2023
* Vectorized hash grouping

* Prepare for merge to main

* Improve comments and update size calculations

* Implement test for accumulate_boolean

refactor

* Use resize instead of resize_with

* fix avg size calculation

* Simplify sum accumulator

* Add comments explaining i64 as counts

* Clarify `aggreate_arguments`

* Apply suggestions from code review

Co-authored-by: Mustafa Akur <106137913+mustafasrepo@users.noreply.github.com>

* Clarify rationale for ScratchSpace being a field

* use slice syntax

* Update datafusion/physical-expr/src/aggregate/average.rs

Co-authored-by: Mustafa Akur <106137913+mustafasrepo@users.noreply.github.com>

* Update datafusion/physical-expr/src/aggregate/count.rs

Co-authored-by: Mustafa Akur <106137913+mustafasrepo@users.noreply.github.com>

* Update datafusion/physical-expr/src/aggregate/groups_accumulator/adapter.rs

Co-authored-by: Mustafa Akur <106137913+mustafasrepo@users.noreply.github.com>

* fix diagram

* Update datafusion/physical-expr/src/aggregate/groups_accumulator/adapter.rs

Co-authored-by: Mustafa Akur <106137913+mustafasrepo@users.noreply.github.com>

* simplify the supported logic

* Add a log message when using slow adapter

* fmt

* Revert "chore(deps): update bigdecimal requirement from 0.3.0 to 0.4.0 (#6848)" (#6896)

This reverts commit d0def42.

* Make FileScanConfig::project pub (#6931)

Co-authored-by: Daniël Heres <daniel.heres@coralogix.com>

* feat: add round trip test of physical plan in tpch unit tests (#6918)

* Use thiserror to implement the From trait for DFSqlLogicTestError (#6924)

* parallel csv scan (#6801)

* parallel csv scan

* add max line length

* Update according to review comments

* Update Configuration doc

---------

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* Add additional test coverage for aggregaes using dates/times/timestamps/decimals (#6939)

* Add additional test coverage for aggregaes using dates/times/timestamps/decimals

* Add coverage for date32/date64

* Support timestamp types for min/max

* Fix aggregate nullability calculation

---------

Co-authored-by: Mustafa Akur <106137913+mustafasrepo@users.noreply.github.com>
Co-authored-by: Daniël Heres <danielheres@gmail.com>
Co-authored-by: Daniël Heres <daniel.heres@coralogix.com>
Co-authored-by: r.4ntix <r.4ntix@gmail.com>
Co-authored-by: Jonah Gao <jonahgaox@gmail.com>
Co-authored-by: Yongting You <2010youy01@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core datafusion crate sqllogictest
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants