Skip to content

Comments

DISTINCT COUNT/SUM/AVG NULL support.#11348

Merged
Jackie-Jiang merged 6 commits intoapache:masterfrom
shenyu0127:distinct-aggregation
Aug 21, 2023
Merged

DISTINCT COUNT/SUM/AVG NULL support.#11348
Jackie-Jiang merged 6 commits intoapache:masterfrom
shenyu0127:distinct-aggregation

Conversation

@shenyu0127
Copy link
Contributor

@shenyu0127 shenyu0127 commented Aug 15, 2023

This PR adds NULL support for queries like:

  • SELECT DISTINCT[COUNT|SUM|AVG](col) FROM table1
  • SELECT DISTINCT[COUNT|SUM|AVG](col1), col2 FROM table1 GROUP BY col2

@codecov-commenter
Copy link

codecov-commenter commented Aug 15, 2023

Codecov Report

Merging #11348 (7a7f179) into master (f25f3ed) will decrease coverage by 53.95%.
Report is 31 commits behind head on master.
The diff coverage is 0.00%.

@@              Coverage Diff              @@
##             master   #11348       +/-   ##
=============================================
- Coverage     68.46%   14.52%   -53.95%     
+ Complexity     6535      201     -6334     
=============================================
  Files          2233     2300       +67     
  Lines        119952   123622     +3670     
  Branches      18191    18796      +605     
=============================================
- Hits          82128    17958    -64170     
- Misses        31984   104149    +72165     
+ Partials       5840     1515     -4325     
Flag Coverage Δ
integration <0.01% <0.00%> (?)
integration1 <0.01% <0.00%> (-20.99%) ⬇️
integration2 0.00% <0.00%> (-25.66%) ⬇️
java-11 14.52% <0.00%> (-53.89%) ⬇️
java-17 14.51% <0.00%> (-53.80%) ⬇️
java-20 14.51% <0.00%> (-53.79%) ⬇️
temurin 14.52% <0.00%> (-53.95%) ⬇️
unittests 14.52% <0.00%> (?)
unittests1 ?
unittests2 14.52% <0.00%> (-0.14%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
...gregation/function/AggregationFunctionFactory.java 0.00% <0.00%> (-83.62%) ⬇️
...tion/BaseDistinctAggregateAggregationFunction.java 0.00% <0.00%> (-50.83%) ⬇️
...ation/function/DistinctAvgAggregationFunction.java 0.00% <0.00%> (-86.67%) ⬇️
...ion/function/DistinctAvgMVAggregationFunction.java 0.00% <0.00%> (-100.00%) ⬇️
...ion/function/DistinctCountAggregationFunction.java 0.00% <0.00%> (-80.00%) ⬇️
...n/function/DistinctCountMVAggregationFunction.java 0.00% <0.00%> (-100.00%) ⬇️
...ation/function/DistinctSumAggregationFunction.java 0.00% <0.00%> (-85.72%) ⬇️
...ion/function/DistinctSumMVAggregationFunction.java 0.00% <0.00%> (-100.00%) ⬇️

... and 1747 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@shenyu0127 shenyu0127 marked this pull request as ready for review August 15, 2023 02:04
// For dictionary-encoded expression, store dictionary ids into the bitmap
Dictionary dictionary = blockValSet.getDictionary();
if (dictionary != null) {
if (dictionary != null && !_nullHandlingEnabled) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can still use dictionary id even when null handling is enabled

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot use dictionary ID when NULL handling is enabled because Integer.MIN_VALUE and NULL share the same dictionary ID.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is not true here. We won't add null values into the set, so the one added to the set is guaranteed to be Integer.MIN_VALUE

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

I previously thought the int[] dictIds = blockValSet.getDictionaryIdsSV(); was to get the dictionary IDs metadata and had the confusion in the earlier comment.

Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also want to add support to aggregateGroupByMV()

// For dictionary-encoded expression, store dictionary ids into the bitmap
Dictionary dictionary = blockValSet.getDictionary();
if (dictionary != null) {
if (dictionary != null && !_nullHandlingEnabled) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part should also be fixed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Sorry, I should have fixed this part in the previous commit.

Copy link
Contributor Author

@shenyu0127 shenyu0127 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also want to add support to aggregateGroupByMV()

Indeed. Done.

// For dictionary-encoded expression, store dictionary ids into the bitmap
Dictionary dictionary = blockValSet.getDictionary();
if (dictionary != null) {
if (dictionary != null && !_nullHandlingEnabled) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Sorry, I should have fixed this part in the previous commit.

@Jackie-Jiang Jackie-Jiang merged commit 4a263ea into apache:master Aug 21, 2023
@shenyu0127 shenyu0127 deleted the distinct-aggregation branch August 21, 2023 17:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants