feat: add seeded run creation #730
@greptileai review
Greptile Summary
This PR adds
Confidence Score: 5/5
Safe to merge: all changed paths have clear validation, and the seed-to-run mapping is derived from durable execution-plan data rather than insertion order. The seeded fanout feature is well-contained: seeds are embedded per-run in the execution plan at write time and re-read from it at response time, so the mapping survives any repository-level reordering. Validation in both the CLI and the backend handler covers range, uniqueness, strategy, and incompatible flag combinations. No auth or data-integrity boundaries are affected by the change. No files require special attention.
| Filename | Overview |
|---|---|
| backend/internal/api/eval_session_service.go | Core seeded-run logic: moves executionPlan build inside the per-repetition loop, injects seed from SeedFanout[i], and derives seededRuns from each run's ExecutionPlan rather than positional matching. |
| backend/internal/api/eval_sessions.go | Adds MaxIterations decoding/validation and decodeEvalSessionSeedFanout; validates strategy, array length, positive integers, no duplicates. |
| backend/internal/api/eval_session_reads.go | Adds Seed field to child-run response; evalSessionChildRunSeed correctly extracts seed from ExecutionPlan JSON and filters seeds < 1. |
| cli/cmd/run_create_helpers.go | Adds buildSeededEvalSessionBody; correctly zeroes MaxIterations before calling buildEvalSessionBody, then injects it top-level; seeds generated as sequential int64 [1..N]. |
| cli/cmd/run.go | Routes seeds > 0 to seeded eval session path; --follow + --seeds returns a clear error. |
| cli/cmd/eval_session_helpers.go | Changes executionMode to 'comparison' for multi-deployment eval sessions; adds seeded-run seed display in presentCreatedEvalSession. |
| docs/api-server/openapi.yaml | Documents max_iterations, seed_fanout, EvalSessionSeedFanoutConfig, EvalSessionSeededRun, and seed on child runs; constraints match backend validation. |
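The seed checks listed for `backend/internal/api/eval_sessions.go` (array length, positive integers, no duplicates) can be illustrated with a minimal Go sketch. This is not the PR's code: `validateSeeds` and the `maxSeeds` limit are hypothetical names chosen for illustration, and the strategy check is left out because the accepted strategy values are not shown in this thread.

```go
package main

import (
	"errors"
	"fmt"
)

// validateSeeds is a hypothetical helper, not the PR's decodeEvalSessionSeedFanout.
// It enforces the rules named in the review table: a bounded, non-empty list of
// positive integers with no duplicates. maxSeeds is an assumed limit.
func validateSeeds(seeds []int64, maxSeeds int) error {
	if len(seeds) == 0 {
		return errors.New("seeds must not be empty")
	}
	if len(seeds) > maxSeeds {
		return fmt.Errorf("seeds has %d entries, limit is %d", len(seeds), maxSeeds)
	}
	seen := make(map[int64]struct{}, len(seeds))
	for i, s := range seeds {
		if s < 1 {
			return fmt.Errorf("seeds[%d] must be a positive integer, got %d", i, s)
		}
		if _, dup := seen[s]; dup {
			return fmt.Errorf("seeds[%d] duplicates seed %d", i, s)
		}
		seen[s] = struct{}{}
	}
	return nil
}

func main() {
	fmt.Println(validateSeeds([]int64{1, 2, 3}, 10)) // accepted
	fmt.Println(validateSeeds([]int64{1, 1}, 10))    // rejected: duplicate
	fmt.Println(validateSeeds([]int64{0}, 10))       // rejected: not positive
}
```

Tracking seen values in a map keeps the duplicate check linear in the number of seeds.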
Sequence Diagram
sequenceDiagram
participant CLI as CLI (run create --seeds N)
participant API as POST /v1/eval-sessions
participant Svc as RunCreationManager
participant Repo as Repository
CLI->>CLI: "buildSeededEvalSessionBody seeds=[1..N]"
CLI->>API: "POST body {seed_fanout, max_iterations}"
API->>API: decodeEvalSessionSeedFanout validate
API->>Svc: CreateEvalSession(input)
loop repetition 0..N-1
Svc->>Svc: buildExecutionPlan(runInput + seed[i])
Svc->>Svc: append childRun with ExecutionPlan
end
Svc->>Svc: buildRoutingTaskSnapshot embed seed_fanout
Svc->>Repo: CreateEvalSessionWithQueuedRuns
Repo-->>Svc: Session + Runs[]
loop each returned Run
Svc->>Svc: evalSessionChildRunSeed(run.ExecutionPlan)
end
Svc-->>API: Session + RunIDs + SeededRuns
API-->>CLI: eval_session + run_ids + seeded_runs
CLI->>CLI: print run_id (seed N) per line
Reviews (2): Last reviewed commit: "fix: derive seeded run mapping from plan..."
@@ -48,7 +48,7 @@ func buildEvalSessionBody(workspaceID string, request runCreateRequest, repetiti
**Silent behavioral change for existing `--repetitions` multi-deployment users**
Changing `executionMode` from `"multi_agent"` to `"comparison"` in `buildEvalSessionBody` affects all multi-deployment eval sessions (not only seeded ones), including `agentclash run create --repetitions N --deployments A,B`. If any backend or downstream consumers previously accepted only `"multi_agent"`, those calls will start sending `"comparison"` after this release with no migration notice. The correction appears intentional, but it might be worth noting in the PR or release notes since it's a silent breaking change for existing callers on the repetitions path.
Noted in the PR body under Compatibility note. This is an intentional correction to the backend-supported eval-session value: the existing multi_agent payload was rejected by the API, while comparison is already the accepted execution mode for multi-participant eval sessions.
@greptileai review
Summary
- `agentclash run create --seeds N` to create a seeded eval session with one child run per seed
- `seeded_runs` from eval-session creation and `seed` on eval-session child-run reads
- `--max-iter` for seeded eval sessions and document the new API shape

Compatibility note
`buildEvalSessionBody` now sends `execution_mode: "comparison"` for multi-deployment eval sessions. This is the backend-supported value; the previous CLI value, `"multi_agent"`, was rejected by the eval-session API, so this fixes the existing `--repetitions` multi-deployment path rather than changing stored backend semantics.

Tests
- `cd backend && go test ./internal/api -run 'TestGetEvalSessionEndpointReturnsDetail|TestCreateEvalSessionEndpointReturnsCreated|TestDecodeEvalSessionConfigRejectsInvalidSeedFanout|TestRunCreationManagerCreateEvalSessionPersistsSeedFanout' -count=1`
- `cd cli && go test ./cmd -run 'TestBuildSeededEvalSessionBody|TestRunCreateSeedsRoutesToEvalSessions|TestRunCreateSeedsRejectsFollow|TestBuildSeededEvalSessionBodyRejectsUnsupportedScope|TestBuildSeededEvalSessionBodyRejectsInvalidFlagRanges|TestBuildEvalSessionBody_MultiDeployment_LabelsAndMode' -count=1`
- `git diff --check`
- `cd backend && go test ./...`
- `cd backend && go vet ./...`
- `cd cli && go build ./...`
- `cd cli && go vet ./...`
- `cd cli && go test -short -race -count=1 ./...`
- `cd cli && go run github.com/goreleaser/goreleaser/v2@latest check`

Fixes #699
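For readers who want a concrete picture of the `--seeds N` request, here is a rough Go sketch of how a CLI could turn N into sequential seeds `[1..N]` and place `max_iterations` at the top level of the body, as the review describes. The function names and the inner `"seeds"` key under `seed_fanout` are assumptions made for illustration; only the `seed_fanout` and `max_iterations` field names come from this thread, and the real `buildSeededEvalSessionBody` includes more than is shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sequentialSeeds returns [1..n] as int64 values, or nil when n < 1.
func sequentialSeeds(n int) []int64 {
	if n < 1 {
		return nil
	}
	seeds := make([]int64, 0, n)
	for i := 1; i <= n; i++ {
		seeds = append(seeds, int64(i))
	}
	return seeds
}

// seededBody is a hypothetical, trimmed-down request builder. The nested
// "seeds" key is an assumed shape, not the documented schema.
func seededBody(n, maxIterations int) (map[string]any, error) {
	if n < 1 {
		return nil, fmt.Errorf("seeds must be at least 1, got %d", n)
	}
	body := map[string]any{
		"seed_fanout": map[string]any{"seeds": sequentialSeeds(n)},
	}
	if maxIterations > 0 {
		// Kept at the top level of the body rather than nested per run.
		body["max_iterations"] = maxIterations
	}
	return body, nil
}

func main() {
	body, err := seededBody(3, 20)
	if err != nil {
		panic(err)
	}
	encoded, _ := json.MarshalIndent(body, "", "  ")
	fmt.Println(string(encoded))
}
```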