feat(engine): add jump-to-operator support#4444

Open
aglinxinyuan wants to merge 33 commits into main from xinyuan-scheduler-jump

Conversation

@aglinxinyuan
Contributor

@aglinxinyuan aglinxinyuan commented Apr 22, 2026

What changes were proposed in this PR?

Add a generic controller-side primitive for jumping execution to the region containing a target operator (JumpToOperatorRegion).

Design:

  • The schedule produced by WorkflowScheduler remains static; jump behavior does not live in WorkflowScheduler.
  • The execution-position state is owned by WorkflowExecutionCoordinator, which holds a mutable Schedule reference and rewrites the iteration tail when a jump is requested.
  • Schedule exposes O(1) getLevelIndexOfOperator lookup plus a rewriteExecutionFrom(levelIndex) data primitive; jump policy stays out of Schedule.
  • The initial schedule is wired into the coordinator via ControllerProcessor.updateExecutionSchedule(...) after WorkflowScheduler.updateSchedule(...), avoiding inline lambdas in ControllerProcessor.
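A minimal sketch of the two `Schedule` primitives described above, not the PR's actual code. It assumes a schedule is an ordered sequence of levels, each holding a set of regions, and that each operator belongs to exactly one level. `SketchRegion`, `SketchSchedule`, the `Option` return type, and the `Iterator` return type are hypothetical; only the method names `getLevelIndexOfOperator` and `rewriteExecutionFrom` come from the description.

```scala
// Hypothetical, simplified stand-ins for the real region and schedule types.
final case class SketchRegion(id: String, operators: Set[String])

final case class SketchSchedule(levels: Vector[Set[SketchRegion]]) {
  // Built once at construction so every later lookup is a single hash-map probe.
  // Assumption: an operator appears in only one level; otherwise the last level wins.
  private val levelIndexByOperator: Map[String, Int] =
    (for {
      (regions, index) <- levels.zipWithIndex
      region <- regions
      operatorId <- region.operators
    } yield operatorId -> index).toMap

  // O(1) expected-time lookup; None signals an operator that is not in the schedule.
  def getLevelIndexOfOperator(operatorId: String): Option[Int] =
    levelIndexByOperator.get(operatorId)

  // Pure data primitive: a fresh iteration tail starting at the given level.
  // Deciding when to call it (the jump policy) is left to the caller.
  def rewriteExecutionFrom(levelIndex: Int): Iterator[Set[SketchRegion]] = {
    require(
      levelIndex >= 0 && levelIndex < levels.size,
      s"level index $levelIndex is outside [0, ${levels.size})"
    )
    levels.iterator.drop(levelIndex)
  }
}
```

Keeping the lookup and the tail rewrite as plain data operations means the schedule itself never needs to know why a jump was requested.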

Any related issues, documentation, discussions?

Closes #4443

Precursor test coverage for related modules (separate PRs against main):

How was this PR tested?

  • sbt "WorkflowExecutionService/compile" — passes.
  • sbt "WorkflowExecutionService/testOnly org.apache.texera.amber.engine.architecture.scheduling.WorkflowExecutionCoordinatorSpec" — 1/1 tests pass.

Was this PR authored or co-authored using generative AI tooling?

Generated-by: ChatGPT (Codex), Claude Code (Claude Opus 4.7)

@aglinxinyuan aglinxinyuan self-assigned this Apr 22, 2026
@aglinxinyuan aglinxinyuan changed the base branch from xinyuan-state-only to main April 22, 2026 05:33
@aglinxinyuan aglinxinyuan linked an issue Apr 25, 2026 that may be closed by this pull request
5 tasks
@Yicong-Huang
Contributor

I think we should let the coordinator jump back to a stage, rather than "schedule" or "scheduler".

@aglinxinyuan
Contributor Author

> I think we should let the coordinator jump back to a stage, rather than "schedule" or "scheduler".

Updated based on the review.

I refactored the implementation so that:

  • Schedule remains static/immutable
  • WorkflowScheduler only produces and holds the generated schedule
  • WorkflowExecutionCoordinator owns the current execution position and handles jumpToOperator(...)
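The split above can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: `ImmutableSchedule`, `CoordinatorSketch`, `levelIndexOf`, `nextLevel`, and `hasNext` are hypothetical names, levels are reduced to sets of operator ids, and an unknown target is reported by returning `false` (the real implementation may reject it differently). Only `jumpToOperator` is taken from the comment.

```scala
// Hypothetical, simplified schedule: each level is just the set of operator ids it runs.
// It is never modified after construction.
final case class ImmutableSchedule(levels: Vector[Set[String]]) {
  def levelIndexOf(operatorId: String): Option[Int] = {
    val index = levels.indexWhere(_.contains(operatorId))
    if (index >= 0) Some(index) else None
  }
}

// The coordinator owns the only mutable state: the index of the next level to run.
final class CoordinatorSketch(schedule: ImmutableSchedule) {
  private var nextLevelIndex: Int = 0

  def hasNext: Boolean = nextLevelIndex < schedule.levels.size

  // Hands out the next level and advances the cursor, or None once exhausted.
  def nextLevel(): Option[Set[String]] =
    if (hasNext) {
      val level = schedule.levels(nextLevelIndex)
      nextLevelIndex += 1
      Some(level)
    } else None

  // Moves the cursor to the level containing the target operator.
  // An unknown operator leaves the cursor untouched and yields false.
  def jumpToOperator(operatorId: String): Boolean =
    schedule.levelIndexOf(operatorId) match {
      case Some(index) =>
        nextLevelIndex = index
        true
      case None =>
        false
    }
}
```

Because the jump only moves a cursor, backward jumps, forward jumps, and jumps after the schedule is exhausted all reduce to the same assignment.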

@Yicong-Huang
Contributor

> > I think we should let the coordinator jump back to a stage, rather than "schedule" or "scheduler".
>
> Updated based on the review.
>
> I refactored the implementation so that:
>
>   • Schedule remains static/immutable
>   • WorkflowScheduler only produces and holds the generated schedule
>   • WorkflowExecutionCoordinator owns the current execution position and handles jumpToOperator(...)

Thanks a lot! Let's keep the plan/schedule data classes immutable. I can later add tests to guard this property.

@chenlica
Contributor

A random comment: the PR number 4444 is very special :-)

@Yicong-Huang
Contributor

> A random comment: the PR number 4444 is very special :-)

one day we will reach 44444 ;)

@aglinxinyuan aglinxinyuan changed the title feat(engine): add scheduler jump-to-operator support feat(engine): add jump-to-operator support Apr 27, 2026
@aglinxinyuan
Contributor Author

@Yicong-Huang, please have another pass.

@aglinxinyuan
Contributor Author

Making a major revision based on the offline discussion with @Yicong-Huang. I will let people know when the PR is ready for review.

@aglinxinyuan aglinxinyuan marked this pull request as draft April 28, 2026 05:18
@github-actions github-actions Bot added the ci changes related to CI label Apr 28, 2026
@github-actions github-actions Bot removed the ci changes related to CI label Apr 28, 2026
@aglinxinyuan aglinxinyuan marked this pull request as ready for review April 30, 2026 06:40
@aglinxinyuan
Contributor Author

aglinxinyuan commented Apr 30, 2026

@Yicong-Huang, please have a pass. The PR is updated based on the discussion. WorkflowExecutionCoordinator can now accept an updated Schedule, so your use case for dynamic scheduling should also be supported.

Contributor

@Yicong-Huang Yicong-Huang left a comment

LGTM in general; we can defer minor fixes until later. Please add more tests, as this is very new behavior.

aglinxinyuan and others added 2 commits April 30, 2026 15:48
Add cases for multi-jump sequences, unknown-target rejection, jump
before any pull, jump after schedule exhaustion, and forward jump.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reject any `Schedule` constructed with gaps or non-zero starting level
keys. The schedule generator already produces contiguous-from-0 keys,
so this only tightens the contract for direct callers and tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
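The contract tightened by the second commit (level keys contiguous and starting at 0) could be enforced roughly as below. This is a sketch, assuming the schedule is keyed by integer level index. `LevelKeyCheck`, `KeyedSchedule`, and `levelSets` are hypothetical names, and allowing an empty schedule is also an assumption.

```scala
object LevelKeyCheck {
  // True when the keys are exactly 0, 1, ..., n-1.
  // An empty key set passes trivially (an assumption of this sketch).
  def isContiguousFromZero(levelKeys: Set[Int]): Boolean =
    levelKeys == (0 until levelKeys.size).toSet
}

// Hypothetical schedule keyed by level index; construction fails fast on bad keys.
final case class KeyedSchedule(levelSets: Map[Int, Set[String]]) {
  require(
    LevelKeyCheck.isContiguousFromZero(levelSets.keySet),
    s"level keys must be contiguous starting at 0, got: ${levelSets.keys.toSeq.sorted.mkString(", ")}"
  )
}
```

With this check in the constructor, `KeyedSchedule(Map(0 -> Set("a"), 2 -> Set("b")))` throws an `IllegalArgumentException` instead of producing a schedule with a gap.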
# Conflicts:
#	amber/src/main/scala/org/apache/texera/amber/engine/architecture/scheduling/WorkflowExecutionCoordinator.scala
#	amber/src/test/scala/org/apache/texera/amber/engine/architecture/scheduling/WorkflowExecutionCoordinatorSpec.scala

Development

Successfully merging this pull request may close these issues.

Add scheduler support for jumping to a target operator
