feat: task_manager delegates physical plan creation to execution graph #1726
milenkovicm wants to merge 2 commits into
        .filter(|url| url.as_str().starts_with("file:///"))
        .collect();
    if !local_paths.is_empty() {
        // These are local files rather than remote object stores, so we
Is it intentional that this check for local files is no longer made?
Yes, I don't think this rule gets triggered at all.
    // optimizing the plan here is redundant because the physical planner will do this again
    // but it is helpful to see what the optimized plan will be
    let optimized_plan = session_ctx.state().optimize(plan)?;
    debug!("Optimized plan: {}", optimized_plan.display_indent());
It would be good to still log the optimized plan for better diagnostics.
The plan was optimized but not used afterwards, so the log does not add much value.
    if node.output_partitioning().partition_count() == 0 {
        let empty: Arc<dyn ExecutionPlan> =
            Arc::new(EmptyExec::new(node.schema()));
Should this logic be preserved somewhere in the new implementation?
There should be no nodes with an output partition count of 0.
I tried to reproduce a case for this and couldn't. I also think this is safe.
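For readers who have not seen the removed check, here is a minimal sketch of what such a rewrite does, assuming a toy plan type. `PlanNode`, `partition_count`, and `replace_zero_partition_nodes` are hypothetical names for illustration only; they are not DataFusion's or Ballista's types, and the toy `Empty` node is simply assumed to report one partition.

```rust
use std::sync::Arc;

/// Toy stand-in for a physical plan node (hypothetical, not `ExecutionPlan`).
#[derive(Debug, PartialEq)]
enum PlanNode {
    Scan { table: String, partitions: usize },
    Filter { input: Arc<PlanNode> },
    Empty,
}

impl PlanNode {
    /// Number of output partitions this toy node reports.
    fn partition_count(&self) -> usize {
        match self {
            PlanNode::Scan { partitions, .. } => *partitions,
            PlanNode::Filter { input } => input.partition_count(),
            // Assumption for this sketch: an empty node has one partition.
            PlanNode::Empty => 1,
        }
    }
}

/// Bottom-up rewrite: children are rewritten first, then any node that
/// reports zero output partitions is replaced with `Empty`.
fn replace_zero_partition_nodes(node: &Arc<PlanNode>) -> Arc<PlanNode> {
    let rewritten = match node.as_ref() {
        PlanNode::Filter { input } => Arc::new(PlanNode::Filter {
            input: replace_zero_partition_nodes(input),
        }),
        _ => Arc::clone(node),
    };
    if rewritten.partition_count() == 0 {
        Arc::new(PlanNode::Empty)
    } else {
        rewritten
    }
}
```

In this toy model, a filter over a zero-partition scan becomes a filter over `Empty`, while plans without zero-partition nodes pass through unchanged, which matches the point made above that the rule has no effect if such nodes never occur.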
metegenez left a comment
Should this change also affect StaticExecutionGraph? @milenkovicm
It could, but I don't want to change it in this PR.
metegenez left a comment
That makes sense, so overall this makes the ExecutionGraph trait more responsible for producing optimal physical plans. I couldn't find anything that doesn't make sense here. LGTM.
Thanks @metegenez & @martin-g
Which issue does this PR close?
Closes #.
Rationale for this change
Currently, the Ballista scheduler creates and optimizes the physical plan before delegating it to the execution graph for execution. This makes changing the plan a bit more complicated than it needs to be.
Propagating the logical plan to the execution graph lets it transform the logical plan (or the non-optimized physical plan) in any way it needs, which simplifies the planning rules.
There is a bit of refactoring of existing code, moving some of it (`EXPLAIN` handling) to its own method.
What changes are included in this PR?
`EXPLAIN`-related code moved to a function
Are there any user-facing changes?
Yes, if users are implementing the `execution graph` interface.
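To illustrate what this could mean for someone implementing the interface, here is a minimal sketch of the delegation described in the rationale. It assumes toy types: `LogicalPlan`, `PhysicalPlan`, the trait shape, `SimpleGraph`, and `submit_job` are all hypothetical stand-ins and do not reflect the actual signatures in this PR.

```rust
/// Hypothetical stand-in for a logical plan (not DataFusion's type).
#[derive(Debug, Clone, PartialEq)]
struct LogicalPlan(String);

/// Hypothetical stand-in for a physical plan (not DataFusion's type).
#[derive(Debug, Clone, PartialEq)]
struct PhysicalPlan(String);

/// Assumed trait shape: the graph receives the logical plan and is
/// responsible for producing its own physical plan.
trait ExecutionGraph: Sized {
    fn from_logical_plan(job_id: &str, plan: &LogicalPlan) -> Result<Self, String>;
    fn job_id(&self) -> &str;
    fn physical_plan(&self) -> &PhysicalPlan;
}

/// Toy implementation that "plans" by wrapping the logical plan text.
struct SimpleGraph {
    job_id: String,
    plan: PhysicalPlan,
}

impl ExecutionGraph for SimpleGraph {
    fn from_logical_plan(job_id: &str, plan: &LogicalPlan) -> Result<Self, String> {
        if plan.0.is_empty() {
            return Err("empty logical plan".to_string());
        }
        // A real implementation would run its own planning rules here.
        Ok(SimpleGraph {
            job_id: job_id.to_string(),
            plan: PhysicalPlan(format!("physical({})", plan.0)),
        })
    }

    fn job_id(&self) -> &str {
        &self.job_id
    }

    fn physical_plan(&self) -> &PhysicalPlan {
        &self.plan
    }
}

/// Scheduler side of the sketch: forward the logical plan unchanged
/// instead of creating and optimizing a physical plan first.
fn submit_job<G: ExecutionGraph>(job_id: &str, plan: &LogicalPlan) -> Result<G, String> {
    G::from_logical_plan(job_id, plan)
}
```

With this shape, a call such as `submit_job::<SimpleGraph>("job-1", &plan)` leaves every planning decision to the graph implementation, so a custom graph could apply different rules without changes on the scheduler side.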