Add: Job lifecycle management and persistence #82
FL4TLiN3 wants to merge 2 commits into epic/job-concept
  ...job,
  totalSteps: job.totalSteps + runResultCheckpoint.stepNumber,
  usage: sumUsage(job.usage, runResultCheckpoint.usage),
}
Bug: Step count double-counted during job delegation
When delegation occurs, the child run's checkpoint inherits the parent's stepNumber via buildDelegateToState (which spreads ...resultCheckpoint). The child's final stepNumber is therefore cumulative across all runs. Adding runResultCheckpoint.stepNumber to job.totalSteps in each while loop iteration causes double-counting.
For example: parent does 3 steps (stepNumber=3), child starts at step 4 and does 2 steps (stepNumber=5). The calculation 3 + 5 = 8 is wrong; it should be 3 + 2 = 5. The same issue affects usage accumulation via sumUsage, as child checkpoints also inherit parent usage.
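The arithmetic in this example can be reproduced with a minimal sketch. The `Checkpoint` shape and variable names below are simplified, hypothetical stand-ins rather than the project's real types; only the step numbers come from the comment above.

```typescript
// Hypothetical, simplified shape; the real checkpoint type has more fields.
type Checkpoint = { stepNumber: number };

// Final checkpoints as described in the review comment: the parent run ends at
// step 3; the delegated child inherits that counter and ends at step 5.
const runResults: Checkpoint[] = [{ stepNumber: 3 }, { stepNumber: 5 }];

// Buggy accumulation: adds the cumulative stepNumber on every loop iteration.
let summedTotal = 0;
for (const checkpoint of runResults) {
  summedTotal += checkpoint.stepNumber;
}

// Delta accumulation: adds only the steps each run actually contributed.
let deltaTotal = 0;
let previousStepNumber = 0;
for (const checkpoint of runResults) {
  deltaTotal += checkpoint.stepNumber - previousStepNumber;
  previousStepNumber = checkpoint.stepNumber;
}

console.log(summedTotal); // 8: 3 + 5, the parent's steps are counted twice
console.log(deltaTotal); // 5: 3 + 2
```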
Consolidated into a combined PR with #81.
* Add: Job lifecycle management and persistence
* Refactor: Simplify checkpoint path and display Job-based history in TUI
- Simplify checkpoint storage path to jobs/{jobId}/checkpoints/{id}.json
- Remove timestamp from checkpoint filename
- Update --resume-from to strictly require --continue-job
- Change TUI history from Run-based to Job-based display
- Show jobId and checkpointId in TUI for easier CLI usage
Closes #81
Closes #82
* Refactor: Move persistence logic from run-manager to runtime
- Add getAllJobs() to job-store.ts
- Add getAllRuns() to run-setting-store.ts
- Add getCheckpointsByJobId(), getEventsByRun(), getEventContents() to default-store.ts
- Export new functions from runtime index
- Simplify run-manager.ts to delegate to runtime functions
* Fix: Job totalSteps and usage double-counting
stepNumber and usage in checkpoints are cumulative within a Job,
so directly assign instead of summing to avoid double-counting.
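A rough sketch of what "directly assign" could look like, assuming checkpoints really do carry job-wide cumulative values. The `Job`, `Checkpoint`, and `Usage` shapes, the token field names, and the `applyRunResult` helper are hypothetical and chosen only for illustration; the actual change in the runtime may differ.

```typescript
// Hypothetical, simplified shapes; field names inside Usage are assumptions.
type Usage = { inputTokens: number; outputTokens: number };
type Job = { id: string; totalSteps: number; usage: Usage };
type Checkpoint = { stepNumber: number; usage: Usage };

// Because stepNumber and usage are cumulative within a Job, the latest
// checkpoint already holds the job-wide totals, so they are assigned as-is
// rather than added to the previous totals.
function applyRunResult(job: Job, checkpoint: Checkpoint): Job {
  return {
    ...job,
    totalSteps: checkpoint.stepNumber,
    usage: checkpoint.usage,
  };
}

let job: Job = {
  id: "job-1",
  totalSteps: 0,
  usage: { inputTokens: 0, outputTokens: 0 },
};

// Parent run finishes at step 3, then the delegated child finishes at step 5.
job = applyRunResult(job, { stepNumber: 3, usage: { inputTokens: 100, outputTokens: 40 } });
job = applyRunResult(job, { stepNumber: 5, usage: { inputTokens: 180, outputTokens: 70 } });
```

After both runs, `job.totalSteps` is 5 and `job.usage` matches the child's final checkpoint, with no contribution counted twice.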
Summary
Test plan
Closes #80
Note
Persist Jobs to job.json, move checkpoints to per-job directory with simplified APIs, update TUI to browse jobs, and clarify CLI/docs that --resume-from requires --continue-job.
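To make the per-job checkpoint layout and the flag rule concrete, here is a minimal sketch. Only the path pattern `perstack/jobs/<jobId>/checkpoints/<checkpointId>.json` and the rule that `--resume-from` requires `--continue-job` come from this PR; the helper names, the option object, and the assumption that both flags take an ID value are hypothetical.

```typescript
// Hypothetical helper: builds the per-job checkpoint path named in this PR.
function checkpointPath(jobId: string, checkpointId: string): string {
  return `perstack/jobs/${jobId}/checkpoints/${checkpointId}.json`;
}

// Hypothetical parsed CLI options; assumes --continue-job takes a job ID and
// --resume-from takes a checkpoint ID.
type ResumeOptions = { continueJob?: string; resumeFrom?: string };

// Rejects --resume-from when --continue-job is absent, since a checkpoint
// can only be located inside a known job's directory.
function validateResumeOptions(options: ResumeOptions): void {
  if (options.resumeFrom !== undefined && options.continueJob === undefined) {
    throw new Error("--resume-from requires --continue-job");
  }
}

// Usage: both flags present, so validation passes and the path can be built.
const options: ResumeOptions = { continueJob: "job-1", resumeFrom: "cp-7" };
validateResumeOptions(options);
const resolvedPath = checkpointPath("job-1", "cp-7");
```

Scoping the checkpoint under its job directory is what makes the stricter flag rule natural: without a job ID there is no directory in which to look up the checkpoint.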
- job-store with storeJob, retrieveJob, createInitialJob; export via runtime index.
- run() (create/update job.json; track status, totalSteps, usage, finishedAt).
- perstack/jobs/<jobId>/checkpoints/<checkpointId>.json; update defaultStore/RetrieveCheckpoint and executeStateMachine storeCheckpoint signature.
- getAllJobs, getCheckpointsByJobId, refactor getMostRecentCheckpoint, getCheckpointById, and checkpoint/details helpers to job-scoped APIs.
- start command/history now lists Jobs (JobHistoryItem) instead of Runs; adapt callbacks (onLoadCheckpoints, onLoadEvents) and browsing UI to job-centric flow.
- cli.mdx and state management to state --resume-from requires --continue-job and adjust examples/notes.
- --resume-from message.
Written by Cursor Bugbot for commit d402db5.