Prevent Early Ordering Pushdown to Enable Aggregation Pushdown to MySQL#16278
Prevent Early Ordering Pushdown to Enable Aggregation Pushdown to MySQL#16278systay merged 14 commits intovitessio:mainfrom
Conversation
Review ChecklistHello reviewers! 👋 Please follow this checklist when reviewing this Pull Request. General
Tests
Documentation
New flags
If a workflow is added or modified:
Backward compatibility
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #16278 +/- ##
==========================================
- Coverage 68.71% 68.71% -0.01%
==========================================
Files 1547 1547
Lines 198290 198286 -4
==========================================
- Hits 136248 136244 -4
Misses 62042 62042 ☔ View full report in Codecov by Sentry. |
Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
|
|
||
| columns := &applyJoinColumns{} | ||
| output, err := splitAggrColumnsToLeftAndRight(ctx, rootAggr, join, !join.JoinType.IsInner(), columns, lhs, rhs) | ||
| join.JoinColumns = columns |
There was a problem hiding this comment.
This bug was not related to the rest of the PR, but it caused tests to fail, so it had to go.
| panic(vterrors.VT13001(fmt.Sprintf("expression depends on tables outside this join: %s", sqlparser.String(e)))) | ||
| } | ||
|
|
||
| col.GroupBy = addToGroupBy |
There was a problem hiding this comment.
This is another bug that was not related to the issue being found, but it caused some plans to be missing group by expressions
| "Name": "user", | ||
| "Sharded": true | ||
| }, | ||
| "FieldQuery": "select min(a.id), a.tcol1, weight_string(a.tcol1), weight_string(a.id) from `user` as a where 1 != 1 group by a.tcol1, weight_string(a.tcol1), weight_string(a.id)", |
There was a problem hiding this comment.
Not just an optimization, the previous plan also looks wrong. It should have been selecting weight_string(min(a.id)) and not weight_string(a.id)
harshit-gangal
left a comment
There was a problem hiding this comment.
could you publish some benchmarks as the PR description claims the impact is of improved performance?
| }, | ||
| "FieldQuery": "select foo, col, weight_string(foo) from `user` where 1 != 1 group by col, foo, weight_string(foo)", | ||
| "FieldQuery": "select foo, col, weight_string(foo) from `user` where 1 != 1 group by foo, col, weight_string(foo)", | ||
| "OrderBy": "1 ASC, (0|2) ASC", | ||
| "Query": "select foo, col, weight_string(foo) from `user` where id between :vtg1 and :vtg2 group by col, foo, weight_string(foo) order by `user`.col asc, foo asc", | ||
| "Query": "select foo, col, weight_string(foo) from `user` where id between :vtg1 and :vtg2 group by foo, col, weight_string(foo) order by `user`.col asc, foo asc", | ||
| "Table": "`user`" |
There was a problem hiding this comment.
there is no impact on the output by this change.
harshit-gangal
left a comment
There was a problem hiding this comment.
We should add some e2e tests for the fix and any new cases added in the plan tests.
harshit-gangal
left a comment
There was a problem hiding this comment.
I could not find how we improved the planning unless the improvements are in tpch cases.
I added a plan to the bottom of That query produces a very different plan on {
"QueryType": "SELECT",
"Original": "select sum(user.type) from user join user_extra on user.team_id = user_extra.id group by user_extra.col order by user_extra.col",
"Instructions": {
"OperatorType": "Aggregate",
"Variant": "Ordered",
"Aggregates": "sum(0) AS sum(`user`.type)",
"GroupBy": "1",
"ResultColumns": 1,
"Inputs": [
{
"OperatorType": "Sort",
"Variant": "Memory",
"OrderBy": "1 ASC",
"Inputs": [
{
"OperatorType": "Join",
"Variant": "Join",
"JoinColumnIndexes": "L:0,R:0",
"JoinVars": {
"user_team_id": 1
},
"TableName": "`user`_user_extra",
"Inputs": [
{
"OperatorType": "Route",
"Variant": "Scatter",
"Keyspace": {
"Name": "user",
"Sharded": true
},
"FieldQuery": "select `user`.type, `user`.team_id from `user` where 1 != 1",
"Query": "select `user`.type, `user`.team_id from `user`",
"Table": "`user`"
},
{
"OperatorType": "Route",
"Variant": "Scatter",
"Keyspace": {
"Name": "user",
"Sharded": true
},
"FieldQuery": "select user_extra.col from user_extra where 1 != 1",
"Query": "select user_extra.col from user_extra where user_extra.id = :user_team_id",
"Table": "user_extra"
}
]
}
]
}
]
},
"TablesUsed": [
"user.user",
"user.user_extra"
]
}As you can see here, MySQL is not doing any aggregation at all. |
Signed-off-by: Andres Taylor <andres@planetscale.com>
|
| }, | ||
| "FieldQuery": "select count(*), :user_id + user_extra.id, weight_string(:user_id + user_extra.id) from user_extra where 1 != 1 group by :user_id + user_extra.id", | ||
| "Query": "select count(*), :user_id + user_extra.id, weight_string(:user_id + user_extra.id) from user_extra group by :user_id + user_extra.id", | ||
| "FieldQuery": "select count(*), :user_id + user_extra.id, weight_string(:user_id + user_extra.id) from user_extra where 1 != 1 group by :user_id + user_extra.id, weight_string(:user_id + user_extra.id)", |
There was a problem hiding this comment.
Does the additional grouping by weight string make sense? 🤔 If so, can you explain why?
There was a problem hiding this comment.
no, they don't really make sense
There was a problem hiding this comment.
They are needed for some specific cases, we have CI tests which are failing due to removal of weight_string changes.
| { | ||
| "OperatorType": "Projection", | ||
| "Expressions": [ | ||
| "sum(volume) * count(*) as revenue", |
There was a problem hiding this comment.
can you explain why we are not passing volume as an index, like we do for the other columns?
There was a problem hiding this comment.
we are, the printing of the plan is just not showing it
Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
This reverts commit f7c0ef7. Turns out we do need them. Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Description
This PR introduces a crucial check in the
pushOrderingUnderAggr method to prevent the premature pushdown of theORDER BYclause during the query planning process. By ensuring that the ordering is not pushed down too early, we allow the aggregation to be effectively pushed down to MySQL, optimizing query execution.Details:
Added a Check for Planning Phase:
Rationale:
Optimization:
Impact:
Related Issue(s)
Issue: #16279
Checklist