-
Notifications
You must be signed in to change notification settings - Fork 695
[KQP / RBO] Add sortings FSM to remove redundant sort operations #18782
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[KQP / RBO] Add sortings FSM to remove redundant sort operations #18782
Conversation
df60ac5
to
d4a4b8b
Compare
6356b20
to
1c19cea
Compare
⚪ |
⚪ |
⚪ Test history | Ya make output | Test bloat
🟢
*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation |
⚪ Test history | Ya make output | Test bloat
🟢
*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation |
⚪ Test history | Ya make output | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat | Test bloat
🟢
*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation |
⚪ Test history | Ya make output | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat | Test bloat
🟢
*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation |
⚪ Test history | Ya make output | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat | Test bloat
🟢
*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation |
⚪ Test history | Ya make output | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat | Test bloat
🟢
*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation |
1917f94
to
c221538
Compare
⚪ Test history | Ya make output | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat | Test bloat
🟢
*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation |
⚪ |
⚪ |
⚪ Test history | Ya make output | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat | Test bloat
🟢
*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation |
⚪ Test history | Ya make output | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat | Test bloat
🟢
*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation |
⚪ Test history | Ya make output | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat | Test bloat
🟢
*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation |
⚪ Test history | Ya make output | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat | Test bloat
🟢
*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation |
⚪ Test history | Ya make output | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat | Test bloat
🟢
*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation |
author Pavel Ivanov <pudge1000-7@ydb.tech> 1748548106 +0000 committer Pavel Ivanov <pudge1000-7@qavm-9f0570a4.qemu> 1750464062 +0300 [] ... [] ... [] ... [] ... [] ... [] ... [] ... [] ... [] ... [] ... [] ... [] ... [] ... [] ... [] ... [] ... [] ... [] ... [] ... [] ... [] ... [] ... [] ... [] ... [] ... [] ... Update yql_type_annotation.h [] ... revert revert
31161e3
to
c9380bd
Compare
⚪ Test history | Ya make output | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat | Test bloat
🟢
*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation |
⚪ Test history | Ya make output | Test bloat
⚪ Test history | Ya make output | Test bloat | Test bloat
🟢
*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR integrates a finite state machine (FSM) for tracking sorting and shuffling into the query optimizer to identify and remove redundant Sort/Top operators, and updates affected tests and plans to match the new behavior.
- Introduce Sorting and Shuffling FSM in
dq_opt_*
code, propagate ordering metadata in statistics, and extend optimizer rules to remove redundant sorts. - Update KQP physical optimizer rules (
kqp_opt_phy_sort.cpp
,kqp_opt_phy_build_stage.cpp
) to call the new sort-removal logic and adjust handlers. - Refresh many
.plan
test fixtures and Python tests to expect the removal ofTopSort
operators and add debug instrumentation.
Reviewed Changes
Copilot reviewed 106 out of 106 changed files in this pull request and generated 3 comments.
Show a summary per file
File | Description |
---|---|
ydb/tests/functional/suite_tests/test_base.py | Added debug print calls for query names |
ydb/library/yql/dq/opt/dq_opt_stat_transformer_base.cpp | Spelling, stats propagation, RemoveSorting |
ydb/library/yql/dq/opt/dq_opt_join_hypergraph.h | FSM construction updated |
ydb/core/kqp/opt/physical/kqp_opt_phy_sort.cpp | Renamed and refactored sort-removal rule |
ydb/core/kqp/opt/physical/kqp_opt_phy_build_stage.cpp | Implement sort-over-read removal, debug logs |
Comments suppressed due to low confidence (1)
ydb/core/kqp/opt/physical/kqp_opt_phy_sort.cpp:24
- The definition of
GetTable
ends with an extra semicolon after its closing brace, which will trigger a compilation error. Remove the trailing;
.
TKqpTable GetTable(TExprBase input, bool isReadRanges) {
print('query_name: {query_name}') | ||
print('statement: {statement}') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This print call does not interpolate query_name
. Use an f-string (e.g. print(f"query_name: {query_name}")
) or .format()
to include the variable value.
print('query_name: {query_name}') | |
print('statement: {statement}') | |
print(f'query_name: {query_name}') | |
print(f'statement: {statement}') |
Copilot uses AI. Check for mistakes.
@@ -14,6 +14,44 @@ TDqStatisticsTransformerBase::TDqStatisticsTransformerBase(TTypeAnnotationContex | |||
: TypeCtx(typeCtx), Pctx(ctx), CardinalityHints(hints) | |||
{ } | |||
|
|||
void PropogateTableAliasesFromChildren(const TExprNode::TPtr& input, TTypeAnnotationContext* typeCtx) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function name PropogateTableAliasesFromChildren
is misspelled. It should be PropagateTableAliasesFromChildren
.
void PropogateTableAliasesFromChildren(const TExprNode::TPtr& input, TTypeAnnotationContext* typeCtx) { | |
void PropagateTableAliasesFromChildren(const TExprNode::TPtr& input, TTypeAnnotationContext* typeCtx) { |
Copilot uses AI. Check for mistakes.
YQL_CLOG(TRACE, CoreDq) << "sperm streamlookup:" << typeCtx.GetStats(streamLookup.Cast().Raw())->ToString(); | ||
YQL_CLOG(TRACE, CoreDq) << "sperm2: Output of unionall" << inputStats->ToString(); | ||
} | ||
Cout << "sperm : " << sortedByOrderingIdx << Endl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Remove this debugging output or guard it behind a diagnostic flag—logging 'sperm' is unclear and will clutter test runs.
Copilot uses AI. Check for mistakes.
Changelog entry
...
Changelog category
Description for reviewers
...