bugfix: Ensure deferred checkpoint includes successor tasks in execution queue #46
Summary
Problem
When a task requests a deferred checkpoint via
Since the checkpoint was created before successors were enqueued, the restored queue was empty.
Fix
Reorder the loop so that
Test Plan
Pull request overview
Fixes a workflow-engine checkpointing bug where deferred checkpoints were created before successor tasks were enqueued, causing resumed executions to miss remaining work.
Changes:
- Moved deferred checkpoint handling in `WorkflowEngine.execute()` to run after successor tasks are added to the execution queue.
- Added a new regression test file validating both resume execution and restored pending-queue contents for deferred checkpoints.
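The effect of the reordering can be illustrated with a minimal sketch. This is not graflow's actual code: the `run` function, its `snapshot_after_enqueue` flag, and the dict-based graph are hypothetical stand-ins, used only to show why the queue snapshot must be taken after successors are enqueued.

```python
from collections import deque


def run(graph, start, checkpoint_at, snapshot_after_enqueue):
    """Toy execution loop (hypothetical, not the real engine).

    graph maps a task name to its list of successor names.
    Returns the queue snapshot taken when `checkpoint_at` finishes.
    """
    queue = deque([start])
    snapshot = None
    while queue:
        task = queue.popleft()
        # The task body would run here and request a deferred checkpoint.
        if task == checkpoint_at and not snapshot_after_enqueue:
            # Old order: successors are not in the queue yet.
            snapshot = list(queue)
        queue.extend(graph.get(task, []))
        if task == checkpoint_at and snapshot_after_enqueue:
            # Fixed order: the snapshot now contains the successors.
            snapshot = list(queue)
    return snapshot


graph = {"a": ["b"], "b": ["c"], "c": []}
print(run(graph, "a", "a", snapshot_after_enqueue=False))  # []
print(run(graph, "a", "a", snapshot_after_enqueue=True))   # ['b']
```

With the old ordering the snapshot is empty, so a resumed run would have nothing left to execute; with the new ordering the snapshot holds the successor `b`.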
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `graflow/core/engine.py` | Reorders deferred checkpoint creation to include newly enqueued successor tasks in the checkpoint queue snapshot. |
| `tests/core/test_checkpoint_resume_bug.py` | Adds regression tests to ensure resumed contexts execute remaining successors and that the restored pending queue includes successors. |
This pull request fixes a bug in the workflow engine's checkpointing logic to ensure that when a deferred checkpoint is requested, all successor tasks are properly included in the checkpoint's pending task queue. This guarantees that resuming from a checkpoint will correctly execute all remaining tasks. The change moves the checkpoint handling to occur after successor processing and adds thorough tests to verify correct behavior.
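The two properties the regression tests are described as checking (the restored pending queue contains the successors, and a resumed run executes the remaining tasks) might look roughly like the following. This is a sketch against a toy model, not the real test file: `GRAPH`, `execute`, and the checkpoint dict layout are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical three-step linear workflow.
GRAPH = {"extract": ["transform"], "transform": ["load"], "load": []}


def execute(pending, checkpoint_at=None):
    """Run tasks from `pending`; return (executed task names, checkpoint or None)."""
    queue = deque(pending)
    executed, checkpoint = [], None
    while queue:
        task = queue.popleft()
        executed.append(task)
        queue.extend(GRAPH[task])  # enqueue successors first
        if task == checkpoint_at:
            # Snapshot after enqueueing, then stop to simulate an interruption.
            checkpoint = {"pending": list(queue)}
            break
    return executed, checkpoint


def test_restored_queue_includes_successors():
    _, cp = execute(["extract"], checkpoint_at="extract")
    assert cp["pending"] == ["transform"]


def test_resume_runs_remaining_tasks():
    _, cp = execute(["extract"], checkpoint_at="extract")
    executed, _ = execute(cp["pending"])
    assert executed == ["transform", "load"]
```

Both functions follow pytest naming conventions, so a test runner would collect them as written; they can also be called directly.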
Bug fix: Deferred checkpoint handling
- Moved deferred checkpoint handling in `graflow/core/engine.py` to occur after successor tasks are added to the queue, ensuring that checkpoints include all pending successors.

Testing improvements
- Added `tests/core/test_checkpoint_resume_bug.py` with two tests: