Add a utility function to get all of the PartitionedFile for an ExecutionPlan #5572
Hi @mingmwang, @alamb, could you help review this PR?
I think you can move the
@mustafasrepo do you have time to review this PR?
I'll add another commit to this PR for this.
Sure. I will look at it.
Unfortunately, due to the Orphan rule, I fail to combine the trait
use datafusion_common::Result;
use std::sync::Arc;

impl TreeNodeRewritable for Arc<dyn ExecutionPlan> {
Is there any reason for implementing the TreeNodeRewritable trait in another file? I guess you could have implemented this trait in the file where ExecutionPlan is defined. I am not familiar with this pattern; I am asking to understand it better.
If the Orphan rule is relaxed in the future, the same TreeNodeRewritable trait can be used for PhysicalExpr. The traits defined in the tree_node file are common abstractions; they are not just for ExecutionPlan.
I understand why the TreeNodeRewritable trait is defined in datafusion/core/src/physical_plan/tree_node/mod.rs. What I meant was the implementation of TreeNodeRewritable for Arc<dyn ExecutionPlan>. I guess we could do this implementation in datafusion/core/src/physical_plan/mod.rs. However, it is just a stylistic issue. I just wondered why you did it in datafusion/core/src/physical_plan/tree_node/rewritable.rs.
use crate::physical_plan::ExecutionPlan;
use std::sync::Arc;

impl TreeNodeVisitable for Arc<dyn ExecutionPlan> {
Similar to the above case, I guess TreeNodeVisitable could have been implemented where ExecutionPlan is defined. Is there any reason for this specific pattern?
LGTM! Thanks @yahoNanJing.
Thanks @mustafasrepo. I'll merge this PR.
@yahoNanJing
Thanks @mingmwang for your suggestion. I'll raise another PR to replace the Visitor with a closure, which will be much easier to use.
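For readers following the thread, here is a rough sketch of what a closure-based traversal might look like compared with a dedicated visitor type. It is not the code from the follow-up PR: `Node`, `for_each_node`, and `all_files` are hypothetical names invented for illustration, and the tree is a toy stand-in rather than a DataFusion `ExecutionPlan`.

```rust
use std::sync::Arc;

// Hypothetical toy plan node (NOT a DataFusion type): each node may scan
// some files and may have child nodes.
struct Node {
    files: Vec<String>,
    children: Vec<Arc<Node>>,
}

// Pre-order traversal that calls the closure on every node.
// The closure only receives a shared reference, so the tree is never modified.
fn for_each_node<F: FnMut(&Node)>(node: &Node, f: &mut F) {
    f(node);
    for child in &node.children {
        for_each_node(child, f);
    }
}

// Collect every file referenced anywhere in the tree.
fn all_files(root: &Node) -> Vec<String> {
    let mut out = Vec::new();
    for_each_node(root, &mut |n: &Node| out.extend(n.files.iter().cloned()));
    out
}

fn main() {
    let leaf = Arc::new(Node {
        files: vec!["a.parquet".to_string()],
        children: vec![],
    });
    let root = Node {
        files: vec![],
        children: vec![leaf],
    };
    println!("{:?}", all_files(&root));
}
```

With a closure the caller keeps its state in ordinary local variables, so no separate collector struct or trait impl is needed.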
Which issue does this PR close?
Closes #5566.
Rationale for this change
Currently the TreeNodeRewriter acts as a visitor that transforms one node into another. However, sometimes we don't need to do the transformation and only want to collect some info from the node. To achieve this, it's better to introduce another visitor for collecting info while keeping the node unchanged.
What changes are included in this PR?
ExecutionPlan
Are these changes tested?
Are there any user-facing changes?
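To illustrate the rationale above (a read-only visitor that collects info while leaving the nodes unchanged), here is a minimal, self-contained sketch. It does not use the DataFusion API: `PlanNode`, `PlanVisitor`, `FileCollector`, `accept`, and `collect_files` are hypothetical names, and plain file-name strings stand in for `PartitionedFile`.

```rust
use std::sync::Arc;

// Hypothetical stand-in for a physical plan node (NOT the DataFusion trait).
trait PlanNode {
    fn children(&self) -> Vec<Arc<dyn PlanNode>>;
    // Files scanned directly by this node; empty for non-scan nodes.
    fn files(&self) -> Vec<String> {
        Vec::new()
    }
}

struct Scan {
    files: Vec<String>,
}

struct Union {
    inputs: Vec<Arc<dyn PlanNode>>,
}

impl PlanNode for Scan {
    fn children(&self) -> Vec<Arc<dyn PlanNode>> {
        Vec::new()
    }
    fn files(&self) -> Vec<String> {
        self.files.clone()
    }
}

impl PlanNode for Union {
    fn children(&self) -> Vec<Arc<dyn PlanNode>> {
        self.inputs.clone()
    }
}

// Read-only visitor: it inspects nodes but never produces new ones.
trait PlanVisitor {
    fn pre_visit(&mut self, node: &Arc<dyn PlanNode>);
}

// Pre-order walk: visit the node first, then each child in order.
fn accept<V: PlanVisitor>(node: &Arc<dyn PlanNode>, visitor: &mut V) {
    visitor.pre_visit(node);
    for child in node.children() {
        accept(&child, visitor);
    }
}

// Visitor that accumulates the files of every node it sees.
struct FileCollector {
    files: Vec<String>,
}

impl PlanVisitor for FileCollector {
    fn pre_visit(&mut self, node: &Arc<dyn PlanNode>) {
        self.files.extend(node.files());
    }
}

// Utility in the spirit of the PR title: gather all files under a plan.
fn collect_files(plan: &Arc<dyn PlanNode>) -> Vec<String> {
    let mut collector = FileCollector { files: Vec::new() };
    accept(plan, &mut collector);
    collector.files
}

fn sample_plan() -> Arc<dyn PlanNode> {
    let a: Arc<dyn PlanNode> = Arc::new(Scan {
        files: vec!["a.parquet".to_string()],
    });
    let b: Arc<dyn PlanNode> = Arc::new(Scan {
        files: vec!["b.parquet".to_string(), "c.parquet".to_string()],
    });
    Arc::new(Union { inputs: vec![a, b] })
}

fn main() {
    let plan = sample_plan();
    println!("{:?}", collect_files(&plan));
}
```

Because the visitor only borrows each node, the plan can still be used after the walk, which is the difference from a rewriter that consumes a node and returns a new one.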