Summary
Implement distributed sweep orchestration in Stepbit for running QuantLab parameter searches across multiple workers/nodes.
Position in the roadmap
This is a post-MVP integration phase, not the next operational step.
The immediate next block is the local adapter MVP in #61. Distributed sweeps should only start after the local Stepbit ↔ QuantLab bridge is stable, validated, and well documented.
Why this stays open
Single-node/local execution is enough for the minimum integration bridge, but larger parameter sweeps may later require coordination, monitoring, and fault handling across multiple workers.
This issue tracks the scaling phase that begins once the local adapter path is proven.
Functionality
Implement a distributed sweep orchestration flow that supports:
- sweep request submission
- partitioning of parameter combinations
- dispatch to available workers/nodes
- per-job/per-node status tracking
- result aggregation
- best-candidate selection
- retry handling for transient worker failures
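The partitioning step above can be sketched as follows. This is a minimal illustration, not the actual Stepbit implementation; `partition_grid` and its round-robin strategy are hypothetical names and choices for this issue.

```python
import itertools

def partition_grid(params_grid, n_partitions):
    """Expand a parameter grid into concrete combinations, then split
    them round-robin across n_partitions workers (illustrative strategy;
    the real dispatcher may balance by estimated cost instead)."""
    keys = sorted(params_grid)
    combos = [dict(zip(keys, values))
              for values in itertools.product(*(params_grid[k] for k in keys))]
    partitions = [[] for _ in range(n_partitions)]
    for i, combo in enumerate(combos):
        partitions[i % n_partitions].append(combo)
    return partitions

# 2 x 3 = 6 combinations split across 2 workers, 3 each
parts = partition_grid({"lr": [0.01, 0.1], "window": [10, 20, 30]}, 2)
```

Retry handling for transient failures would then re-enqueue a failed partition's jobs rather than the whole sweep.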
Minimum payload
- params_grid
- stop_criteria
- max_concurrency
- node_list (or equivalent worker pool selection)
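A sweep request carrying these fields might look like the sketch below. The field names come from the list above; the nested value shapes (e.g. what `stop_criteria` contains) are illustrative assumptions, not a settled schema.

```python
# Illustrative sweep request payload; nested shapes are assumptions.
sweep_request = {
    "params_grid": {"lr": [0.01, 0.05, 0.1], "window": [10, 20]},
    "stop_criteria": {"max_runtime_s": 3600},   # assumed structure
    "max_concurrency": 4,
    "node_list": ["worker-1", "worker-2"],      # or a worker pool selector
}
```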
Job states
- submitted
- running
- done
- failed
- partial
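One way to pin these states down is an enum with an explicit transition table; the allowed transitions below (including re-submission of failed jobs for retry) are assumptions to be confirmed, not decided behavior.

```python
from enum import Enum

class JobState(Enum):
    SUBMITTED = "submitted"
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"
    PARTIAL = "partial"

# Assumed transition table; terminal states have no outgoing edges
# except FAILED, which may be re-submitted by retry logic.
ALLOWED = {
    JobState.SUBMITTED: {JobState.RUNNING},
    JobState.RUNNING: {JobState.DONE, JobState.FAILED, JobState.PARTIAL},
    JobState.FAILED: {JobState.SUBMITTED},
    JobState.DONE: set(),
    JobState.PARTIAL: set(),
}

def can_transition(src: JobState, dst: JobState) -> bool:
    """Return True if dst is a legal successor of src."""
    return dst in ALLOWED[src]
```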
Acceptance criteria
- a simulated multi-node sweep can complete end-to-end
- results are aggregated into a normalized summary
- node failures trigger retry logic where appropriate
- job status is observable during execution
- best-result selection is deterministic and documented
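For the last criterion, determinism means the winner cannot depend on worker completion order. A minimal sketch of one such rule, assuming each result carries a numeric `score` and a `params` dict with comparable values (both names hypothetical):

```python
def select_best(results):
    """Pick the highest-scoring candidate; ties are broken by the sorted
    parameter items, so the same input set always yields the same winner
    regardless of the order workers returned results in."""
    return max(
        results,
        key=lambda r: (r["score"], sorted(r["params"].items())),
    )
```

Whatever rule is chosen, the tie-break must be written down alongside the aggregation format so reruns are reproducible.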
Depends on
- local adapter MVP: feat(integration): add QuantLabTool adapter MVP for local Stepbit execution (#61)
- execution policy / worker assignment decisions
- validated end-to-end local integration path
Notes
Do not treat this as part of the minimum viable integration.
This is the later scaling step after the local bridge is stable.
Linked Antigravity task:
.agents/tasks/task-stepbit-distributed-sweeps.md
Tracked in:
.agents/tasks/stage-stepbit-integration-roadmap.md