New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-28716][SQL] Add id to Exchange and Subquery's stringArgs method for easier identifying their reuses in query plans #25434
Conversation
ok to test |
ok to test |
Test build #109045 has finished for PR 25434 at commit
|
Test build #109067 has finished for PR 25434 at commit
|
Can you put explain output differences with/without this pr in the PR description? Also, It'd be better to put the simple query to reproduce the case. |
assert(output.contains( | ||
"""== BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, false])) == | ||
assert(output.replaceAll("\\[id=#\\d+\\]", "[id=#x]").contains( | ||
"""== BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, false])), [id=#x] == |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to always print this id? only for reuse cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then you need to make two passes over the tree. The first one to figure out which exchanges are reused and the second to print the tree. This seems a bit excessive for a debugging tool.
Merging to master. |
…d for easier identifying their reuses in query plans Add id to Exchange and Subquery's stringArgs method for easier identifying their reuses in query plans, for example: ``` ReusedExchange d_date_sk#827, BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint))) [id=apache#2710] ``` Where `2710` is the id of the reused exchange. Passes existing tests Closes apache#25434 from dbaliafroozeh/ImplementStringArgsExchangeSubqueryExec. Authored-by: Ali Afroozeh <ali.afroozeh@databricks.com> Signed-off-by: herman <herman@databricks.com>
…d for easier identifying their reuses in query plans Add id to Exchange and Subquery's stringArgs method for easier identifying their reuses in query plans, for example: ``` ReusedExchange d_date_sk#827, BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint))) [id=apache#2710] ``` Where `2710` is the id of the reused exchange. Passes existing tests Closes apache#25434 from dbaliafroozeh/ImplementStringArgsExchangeSubqueryExec. Authored-by: Ali Afroozeh <ali.afroozeh@databricks.com> Signed-off-by: herman <herman@databricks.com>
What changes were proposed in this pull request?
Add id to Exchange and Subquery's stringArgs method for easier identifying their reuses in query plans, for example:
Where
2710
is the id of the reused exchange.How was this patch tested?
Passes existing tests