[BugFix] Having-clause causes DecodeNode wedges into two-stage Agg mistakenly #19906
Merged
satanson
merged 1 commit into
StarRocks:main
from
satanson:forbid_having_clause_adopt_dict_opt
Mar 21, 2023
stdpain
previously approved these changes
Mar 21, 2023
…stakenly Signed-off-by: satanson <ranpanf@gmail.com>
satanson
force-pushed
the
forbid_having_clause_adopt_dict_opt
branch
from
March 21, 2023 12:46
213ff8f
to
f69de86
stdpain
approved these changes
Mar 21, 2023
ZiheLiu
approved these changes
Mar 21, 2023
Kudos, SonarCloud Quality Gate passed!
@Mergifyio backport branch-3.0
@Mergifyio backport branch-2.5
✅ Backports have been created
✅ Backports have been created
mergify bot
pushed a commit
that referenced
this pull request
Mar 21, 2023
…stakenly (#19906) Signed-off-by: satanson <ranpanf@gmail.com> ------- For the query select count(distinct c1), count(distinct c2) from t0 having count(1) > 0, a wrong plan is generated when CTE optimization is disabled, low-cardinality dict optimization is enabled, and two-stage aggregation is adopted: a DecodeNode is wedged into the two-stage Agg whose agg function is multi_distinct_count. The 1st agg (below the DecodeNode) aggregates dict-encoded input data into a Set, serializes it, and sends it to the 2nd agg. The 2nd agg (above the DecodeNode) deserializes the data and treats it as a Set, even though that Set was built from dict-encoded input, and this causes the BE to crash. The root cause is that, when dict optimization is applied, the 1st agg is rewritten before the 2nd agg. The 2nd agg then fails to be rewritten because it has a having clause that references an aggregation, so dict optimization cannot propagate upwards and a DecodeNode is inserted between the two aggs. (cherry picked from commit 5f46ec7)
mergify bot
pushed a commit
that referenced
this pull request
Mar 21, 2023
@Mergifyio backport branch-2.3
✅ Backports have been created
mergify bot
pushed a commit
that referenced
this pull request
Mar 22, 2023
wanpengfei-git
pushed a commit
that referenced
this pull request
Mar 22, 2023
satanson
added a commit
that referenced
this pull request
Mar 22, 2023
@Mergifyio backport branch-2.4
✅ Backports have been created
mergify bot
pushed a commit
that referenced
this pull request
Mar 22, 2023
wanpengfei-git
pushed a commit
that referenced
this pull request
Mar 22, 2023
wanpengfei-git
pushed a commit
that referenced
this pull request
Mar 22, 2023
wanpengfei-git
pushed a commit
that referenced
this pull request
Mar 22, 2023
wanpengfei-git
pushed a commit
that referenced
this pull request
Mar 22, 2023
numbernumberone
pushed a commit
to numbernumberone/starrocks
that referenced
this pull request
May 31, 2023
abc982627271
pushed a commit
to abc982627271/starrocks
that referenced
this pull request
Jun 5, 2023
What type of PR is this:
Which issues does this PR fix:
Fixes #19901
Problem Summary(Required) :
For the query select count(distinct c1), count(distinct c2) from t0 having count(1) > 0, a wrong plan is generated when CTE optimization is disabled, low-cardinality dict optimization is enabled, and two-stage aggregation is adopted: a DecodeNode is wedged into the two-stage Agg whose agg function is multi_distinct_count. The 1st agg (below the DecodeNode) aggregates dict-encoded input data into a Set, serializes it, and sends it to the 2nd agg. The 2nd agg (above the DecodeNode) deserializes the data and treats it as a Set, even though that Set was built from dict-encoded input, and this causes the BE to crash.
The root cause is that, when dict optimization is applied, the 1st agg is rewritten before the 2nd agg. The 2nd agg then fails to be rewritten because it has a having clause that references an aggregation, so dict optimization cannot propagate upwards and a DecodeNode is inserted between the two aggs.
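To make the ordering problem concrete, here is a small toy model of the bottom-up rewrite described above. It is only a sketch and not StarRocks code: `Agg`, `rewrite`, `is_wedged`, and the `check_whole_pipeline` flag are hypothetical names invented for illustration, and the "fixed" mode is just one plausible way to avoid the wedge (give up dict optimization for the whole two-stage Agg when any stage has such a having clause), which may differ from what the actual patch does.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Agg:
    """One aggregation stage (hypothetical model, not a StarRocks class)."""
    name: str
    has_having_on_agg: bool = False  # having clause references an aggregate


def rewrite(aggs: List[Agg], check_whole_pipeline: bool) -> List[str]:
    """Return the plan bottom-up as a list of labels.

    The scan emits dict codes. Each agg is rewritten in bottom-up order and
    keeps consuming dict codes unless it cannot be rewritten, in which case
    a Decode is placed directly below it.
    """
    plan = ["Scan(dict)"]
    encoded = True
    # Assumed fix: look at every stage first and decode below all of them
    # if any stage cannot adopt the dict optimization.
    if check_whole_pipeline and any(a.has_having_on_agg for a in aggs):
        plan.append("Decode")
        encoded = False
    for agg in aggs:
        if encoded and agg.has_having_on_agg:
            # This stage cannot be rewritten, so decoding happens right here.
            plan.append("Decode")
            encoded = False
        plan.append(f"{agg.name}({'dict' if encoded else 'string'})")
    if encoded:
        plan.append("Decode")  # decode at the very top if never decoded
    return plan


def is_wedged(plan: List[str]) -> bool:
    """True if a Decode sits between two aggregation stages."""
    return any(
        node == "Decode"
        and 0 < i < len(plan) - 1
        and plan[i - 1].startswith("agg")
        and plan[i + 1].startswith("agg")
        for i, node in enumerate(plan)
    )


two_stage = [Agg("agg1"), Agg("agg2", has_having_on_agg=True)]
buggy_plan = rewrite(two_stage, check_whole_pipeline=False)
fixed_plan = rewrite(two_stage, check_whole_pipeline=True)
print(buggy_plan, is_wedged(buggy_plan))
print(fixed_plan, is_wedged(fixed_plan))
```

In the per-stage mode, agg1 is rewritten to consume dict codes before agg2 is examined, so the Decode lands between the two stages. In the whole-pipeline mode, the Decode is placed below both stages and they see the same representation.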
Checklist:
Bugfix cherry-pick branch check: