Skip to content

feat: task_manager delegats physical plan creation to execution graph#1726

Open
milenkovicm wants to merge 2 commits into
apache:mainfrom
milenkovicm:feat_refactor_submit_job
Open

feat: task_manager delegats physical plan creation to execution graph#1726
milenkovicm wants to merge 2 commits into
apache:mainfrom
milenkovicm:feat_refactor_submit_job

Conversation

@milenkovicm
Copy link
Copy Markdown
Contributor

Which issue does this PR close?

Closes #.

Rationale for this change

Currently, ballista schedule creates and optimizes physical plan before it is delegated to execution graph for execution. this makes plan change a bit more complicated than it needs to be.

Propagating logical plan to execution graph will give ability to it to transform logical plan (or non optimized physical plan) in any way it needs, simplifying planning rules.

There is a bit of refactoring of existing code, moving some of the code (EXPLAIN handling) to its own method.

What changes are included in this PR?

  • change method parameters across few methods around task manager and execution graph creation
  • move EXPLAIN related code to a function

Are there any user-facing changes?

yes if users are implementing execution graph interface

…ph implementation

Currently, ballista schedule creates and optimizes
physical plan before it is delegated to execution
graph for execution. this makes plan change a bit
more complicated than it needs to be.

Propagating logical plan to execution graph will
give ability to it to transform logical plan
(or non optimized physical plan) in any way it needs,
simplifying planning rules.

There is a bit of refactoring of existing code,
moving some of the code (`EXPLAIN` handling) to
its own method.
@milenkovicm milenkovicm requested a review from metegenez May 18, 2026 18:17
@milenkovicm milenkovicm changed the title feat: task_manager delegating physical plan creation to execution graph feat: task_manager delegats physical plan creation to execution graph May 18, 2026
.filter(|url| url.as_str().starts_with("file:///"))
.collect();
if !local_paths.is_empty() {
// These are local files rather than remote object stores, so we
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it intentional that this check for local files is no more made ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, i don't think this rule get triggered at all

Comment thread ballista/scheduler/src/state/aqe/mod.rs Outdated
// optimizing the plan here is redundant because the physical planner will do this again
// but it is helpful to see what the optimized plan will be
let optimized_plan = session_ctx.state().optimize(plan)?;
debug!("Optimized plan: {}", optimized_plan.display_indent());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to still log the optimized plan for better diagnostics.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plan has been optimized but not used later, so log does not have a great value

Comment on lines -456 to -458
if node.output_partitioning().partition_count() == 0 {
let empty: Arc<dyn ExecutionPlan> =
Arc::new(EmptyExec::new(node.schema()));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this logic be preserved somewhere in the new implementation ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there should be no nodes with output partitioning 0,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try to reproduce a case for this, couldnt make it. I also think this is safe.

Copy link
Copy Markdown
Contributor

@metegenez metegenez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this change also affect StaticExecutionGraph? @milenkovicm

@milenkovicm
Copy link
Copy Markdown
Contributor Author

Should this change also affect StaticExecutionGraph? @milenkovicm

it could but i dont want to change it in this PR

Copy link
Copy Markdown
Contributor

@metegenez metegenez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That make sense, so overall this make ExecutionGraph trait more responsible for getting optimal physical plans. I couldnt find anything that doesnt make sense here. LGTM.

@milenkovicm
Copy link
Copy Markdown
Contributor Author

thanks @metegenez & @martin-g
i plan to keep this PR open until after ballista 53 release which i plan for later

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants