[fix](nereids) EliminateGroupByConstant should replace agg's group by after removing constant group by keys #49473
Merged
englefly
merged 2 commits into
apache:master
from
feiniaofeiafei:fix_eliminate_group_by_constant
Mar 26, 2025
34c8909 to f1d2701
run buildall
any test case?
TPC-H: Total hot run time: 34175 ms
TPC-DS: Total hot run time: 186302 ms
ClickBench: Total hot run time: 31.79 s
f1d2701 to 40e6a29
run buildall
TPC-H: Total hot run time: 34471 ms
TPC-DS: Total hot run time: 193154 ms
ClickBench: Total hot run time: 32.04 s
starocean999
approved these changes
Mar 26, 2025
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
englefly
approved these changes
Mar 26, 2025
16 tasks
morrySnow
pushed a commit
that referenced
this pull request
Apr 15, 2025
…9589)

### What problem does this PR solve?

Related PR: #32878 #49473

Problem Summary:

    SELECT
        IF(t.`gender` IN ('女'), (TIMESTAMPDIFF(YEAR, NOW(), NOW())), 1) AS x0,
        TIMESTAMPDIFF(YEAR, NOW(), NOW()) AS x1
    FROM t1 AS t
    GROUP BY x0, x1;

After EliminateGroupByConstant, this SQL is rewritten to

    SELECT
        IF(t.`gender` IN ('女'), 0, 1) AS x0,
        0 AS x1
    FROM t1 AS t
    GROUP BY IF(t.`gender` IN ('女'), (TIMESTAMPDIFF(YEAR, NOW(), NOW())), 1);

The select expression and the group by expression differ, so normalizeAgg reports an error.

The fix in PR #49473 may introduce another issue. Consider the following query:

    SELECT func2(100) FROM t GROUP BY func1(), func2(func1());

If func1() can be constant-folded to 100, func2(func1()) is replaced with func2(100) and the query executes successfully. If func1() cannot be folded to 100, the query fails. Whether the query runs therefore depends on whether func1() can be constant-folded, which is inconsistent.

To address this, this PR modifies the normalizeAgg logic to eliminate constant group by keys. With this change, the query fails consistently regardless of whether func1() can be folded, which makes the behavior more predictable.
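The inconsistency described in this commit message can be reproduced with a small model. The sketch below is Python, not the Doris Java code: expressions are plain tuples, and `replace`, `select_is_grouped`, and the `rewrite_keys` flag are hypothetical names introduced only for illustration. With `rewrite_keys=True` the group by keys are rewritten through the fold map (the #49473 behavior); `rewrite_keys=False` is an assumed stand-in for checking the select expression against the keys as written.

```python
def replace(expr, fold_map):
    """Replace every sub-expression that appears in fold_map, top-down."""
    if expr in fold_map:
        return fold_map[expr]
    if isinstance(expr, tuple):
        return tuple(replace(child, fold_map) for child in expr)
    return expr


def select_is_grouped(select_expr, group_by, fold_map, rewrite_keys):
    """Return True if select_expr matches one of the (optionally rewritten) keys."""
    if rewrite_keys:
        keys = [replace(key, fold_map) for key in group_by]
    else:
        keys = list(group_by)
    return select_expr in keys


# SELECT func2(100) FROM t GROUP BY func1(), func2(func1());
FUNC1 = ("func1",)
GROUP_BY = [FUNC1, ("func2", FUNC1)]
SELECT = ("func2", 100)

FOLDABLE = {FUNC1: 100}   # func1() can be constant-folded to 100
NOT_FOLDABLE = {}         # func1() cannot be folded

# Rewriting the keys makes the outcome depend on foldability.
with_rewrite = (
    select_is_grouped(SELECT, GROUP_BY, FOLDABLE, True),
    select_is_grouped(SELECT, GROUP_BY, NOT_FOLDABLE, True),
)

# Checking against the keys as written gives the same answer in both cases.
without_rewrite = (
    select_is_grouped(SELECT, GROUP_BY, FOLDABLE, False),
    select_is_grouped(SELECT, GROUP_BY, NOT_FOLDABLE, False),
)
```

In this model the first pair differs between the foldable and non-foldable case, while the second pair is the same in both, which is the consistency the commit message asks for.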
github-actions bot pushed a commit that referenced this pull request on Apr 15, 2025: …9589), repeating the commit message above.
github-actions bot
pushed a commit
that referenced
this pull request
Apr 15, 2025
… after removing constant group by keys (#49473)

### What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

### Release note

```sql
SELECT
    IF(t.`gender` IN ('女'), (TIMESTAMPDIFF(YEAR, NOW(), NOW())), 1) AS x0,
    TIMESTAMPDIFF(YEAR, NOW(), NOW()) AS x1
FROM t1 AS t
GROUP BY x0, x1;
```

After EliminateGroupByConstant, this SQL is rewritten to

```sql
SELECT
    IF(t.`gender` IN ('女'), 0, 1) AS x0,
    0 AS x1
FROM t1 AS t
GROUP BY IF(t.`gender` IN ('女'), (TIMESTAMPDIFF(YEAR, NOW(), NOW())), 1);
```

The select expression and the group by expression differ, so normalizeAgg reports an error.

This PR uses the fold map to rewrite the group by expressions as well. With the change, the SQL after EliminateGroupByConstant becomes:

```sql
SELECT
    IF(t.`gender` IN ('女'), 0, 1) AS x0,
    0 AS x1
FROM t1 AS t
GROUP BY IF(t.`gender` IN ('女'), 0, 1);
```

The select expression and the group by expression are now the same.
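The fold-map rewrite described in this commit message can be sketched roughly as follows. This is a minimal Python model, not the actual Nereids rule: the tuple-based expression trees and the names `replace`, `is_constant`, and `eliminate_group_by_constant` are assumptions made for the example, and a bare integer stands in for a folded constant.

```python
def replace(expr, fold_map):
    """Replace every sub-expression that appears in fold_map, top-down."""
    if expr in fold_map:
        return fold_map[expr]
    if isinstance(expr, tuple):
        return tuple(replace(child, fold_map) for child in expr)
    return expr


def is_constant(expr):
    """In this toy model, only bare integers count as constants."""
    return isinstance(expr, int)


def eliminate_group_by_constant(select_exprs, group_by, fold_map):
    """Fold the select list, fold the group by keys, then drop constant keys."""
    folded_select = [replace(expr, fold_map) for expr in select_exprs]
    # Apply the same fold map to the keys so the surviving keys
    # stay identical to the folded select expressions.
    folded_keys = [replace(key, fold_map) for key in group_by]
    kept_keys = [key for key in folded_keys if not is_constant(key)]
    return folded_select, kept_keys


# TIMESTAMPDIFF(YEAR, NOW(), NOW()) is assumed to fold to 0.
DIFF = ("TIMESTAMPDIFF", "YEAR", ("NOW",), ("NOW",))
X0 = ("IF", ("IN", "t.gender", "'女'"), DIFF, 1)
X1 = DIFF
FOLD_MAP = {DIFF: 0}

select_list, group_keys = eliminate_group_by_constant([X0, X1], [X0, X1], FOLD_MAP)
FOLDED_X0 = ("IF", ("IN", "t.gender", "'女'"), 0, 1)
```

Here the constant key `x1` is dropped and the remaining key is the folded form of `x0`, so it matches the first select expression exactly.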
github-actions bot pushed a commit that referenced this pull request on Apr 15, 2025: … after removing constant group by keys (#49473), repeating the commit message above.
seawinde pushed a commit to seawinde/doris that referenced this pull request on Apr 17, 2025: …ache#49589), repeating the #49589 commit message above.
feiniaofeiafei added a commit to feiniaofeiafei/doris that referenced this pull request on Apr 21, 2025: …ache#49589), repeating the same Problem Summary.
feiniaofeiafei added a commit to feiniaofeiafei/doris that referenced this pull request on May 8, 2025: …ache#49589), repeating the same Problem Summary.
koarz pushed a commit to koarz/doris that referenced this pull request on Jun 4, 2025: … after removing constant group by keys (apache#49473), repeating the #49473 commit message above.
koarz pushed a commit to koarz/doris that referenced this pull request on Jun 4, 2025: …ache#49589), repeating the #49589 commit message above.
Labels
approved
Indicates a PR has been approved by one committer.
dev/2.1.10-merged
dev/3.0.6-merged
reviewed
usercase
Important user case type label
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
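Consider a query that selects IF(t.`gender` IN ('女'), TIMESTAMPDIFF(YEAR, NOW(), NOW()), 1) AS x0 and TIMESTAMPDIFF(YEAR, NOW(), NOW()) AS x1 from t1 AS t and groups by x0, x1. After EliminateGroupByConstant, the select list becomes IF(t.`gender` IN ('女'), 0, 1) AS x0, 0 AS x1, while the remaining group by key is still IF(t.`gender` IN ('女'), TIMESTAMPDIFF(YEAR, NOW(), NOW()), 1).
The select expression and the group by expression differ, so normalizeAgg reports an error.
This PR uses the fold map to rewrite the group by expressions as well, so that after EliminateGroupByConstant the group by key becomes IF(t.`gender` IN ('女'), 0, 1).
The select expression and the group by expression are then the same.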
None