feat: add seeded run creation by Atharva-Kanherkar · Pull Request #730 · agentclash/agentclash

Atharva-Kanherkar · 2026-05-10T11:57:59Z

Summary

- add agentclash run create --seeds N to create a seeded eval session with one child run per seed
- persist explicit seed fanout metadata on the eval-session routing snapshot and child run execution plans
- surface seeded_runs from eval-session creation and seed on eval-session child-run reads
- support --max-iter for seeded eval sessions and document the new API shape
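
For reviewers who want a concrete picture of the request shape, here is a minimal sketch of how a seeded body could be assembled. It is not the code in this PR: `sequentialSeeds` and `seededBody` are hypothetical helpers, and the JSON keys `seeds` and `repetitions` are assumptions. Only `seed_fanout`, `max_iterations`, and the `explicit` strategy are named in this thread.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// seedFanout sketches the fanout object. The "seeds" key is an assumption;
// the thread only confirms strategy = "explicit".
type seedFanout struct {
	Strategy string  `json:"strategy"`
	Seeds    []int64 `json:"seeds"`
}

// sequentialSeeds returns [1..n], one seed per child run.
func sequentialSeeds(n int) ([]int64, error) {
	if n < 1 {
		return nil, fmt.Errorf("seeds must be >= 1, got %d", n)
	}
	seeds := make([]int64, n)
	for i := range seeds {
		seeds[i] = int64(i + 1)
	}
	return seeds, nil
}

// seededBody (hypothetical) builds a request body with one repetition per
// seed and places max_iterations at the top level only when it is set.
func seededBody(n, maxIter int) ([]byte, error) {
	seeds, err := sequentialSeeds(n)
	if err != nil {
		return nil, err
	}
	body := map[string]any{
		"repetitions": n, // assumed key name
		"seed_fanout": seedFanout{Strategy: "explicit", Seeds: seeds},
	}
	if maxIter > 0 {
		body["max_iterations"] = maxIter
	}
	return json.Marshal(body)
}
```

Under these assumptions, `seededBody(3, 5)` yields three seeds, 1 through 3, and a top-level `max_iterations` of 5.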

Compatibility note

buildEvalSessionBody now sends execution_mode: "comparison" for multi-deployment eval sessions. This is the backend-supported value; the previous CLI value, "multi_agent", was rejected by the eval-session API, so this fixes the existing --repetitions multi-deployment path rather than changing stored backend semantics.

Tests

cd backend && go test ./internal/api -run 'TestGetEvalSessionEndpointReturnsDetail|TestCreateEvalSessionEndpointReturnsCreated|TestDecodeEvalSessionConfigRejectsInvalidSeedFanout|TestRunCreationManagerCreateEvalSessionPersistsSeedFanout' -count=1
cd cli && go test ./cmd -run 'TestBuildSeededEvalSessionBody|TestRunCreateSeedsRoutesToEvalSessions|TestRunCreateSeedsRejectsFollow|TestBuildSeededEvalSessionBodyRejectsUnsupportedScope|TestBuildSeededEvalSessionBodyRejectsInvalidFlagRanges|TestBuildEvalSessionBody_MultiDeployment_LabelsAndMode' -count=1
git diff --check
cd backend && go test ./...
cd backend && go vet ./...
cd cli && go build ./...
cd cli && go vet ./...
cd cli && go test -short -race -count=1 ./...
cd cli && go run github.com/goreleaser/goreleaser/v2@latest check

Fixes #699

Atharva-Kanherkar · 2026-05-10T12:02:18Z

@greptileai review

greptile-apps · 2026-05-10T12:06:41Z

Greptile Summary

This PR adds agentclash run create --seeds N to create a seeded eval session with one child run per seed (seeds [1..N]), persisting seed metadata in both each run's ExecutionPlan and the eval-session routing snapshot, and surfacing it through seeded_runs on creation and seed on child-run reads. It also wires --max-iter support into seeded eval sessions and fixes the existing multi-deployment execution_mode value from the rejected "multi_agent" to the backend-accepted "comparison".

- Seeded fanout: buildSeededEvalSessionBody generates sequential seeds [1..N], embeds them per-run in ExecutionPlan, and records the fanout in the routing snapshot. The seededRuns mapping in the response is derived from each returned run's ExecutionPlan seed (not from insertion order), making it resilient to any repository-level reordering.
- Validation: decodeEvalSessionSeedFanout enforces strategy="explicit", seed count equals repetitions, all values ≥ 1, and no duplicates; --max-iter, --follow, race-context flags, and suite_only scope are all explicitly rejected or handled for the seeded path.
- Backend + CLI test coverage: new tests cover seed persistence, fanout snapshot shape, HTTP request decoding, and validation rejections.
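
The validation rules listed above can be sketched as follows. This is an illustration, not the body of decodeEvalSessionSeedFanout: the `seedFanout` type, the `validateSeedFanout` name, and the error wording are all assumptions.

```go
package main

import "fmt"

// seedFanout is a hypothetical in-memory form of the decoded fanout config.
type seedFanout struct {
	Strategy string
	Seeds    []int64
}

// validateSeedFanout applies the four checks named in the summary:
// explicit strategy, one seed per repetition, values >= 1, no duplicates.
func validateSeedFanout(f seedFanout, repetitions int) error {
	if f.Strategy != "explicit" {
		return fmt.Errorf("unsupported seed fanout strategy %q", f.Strategy)
	}
	if len(f.Seeds) != repetitions {
		return fmt.Errorf("seed count %d does not match repetitions %d", len(f.Seeds), repetitions)
	}
	seen := make(map[int64]struct{}, len(f.Seeds))
	for i, s := range f.Seeds {
		if s < 1 {
			return fmt.Errorf("seed at index %d must be >= 1, got %d", i, s)
		}
		if _, dup := seen[s]; dup {
			return fmt.Errorf("duplicate seed %d at index %d", s, i)
		}
		seen[s] = struct{}{}
	}
	return nil
}
```

Rejecting duplicates here matters because the response later maps each run to its seed. With two runs sharing a seed, that mapping would no longer identify a repetition uniquely.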

Confidence Score: 5/5

Safe to merge — all changed paths have clear validation and the seed-to-run mapping is derived from durable execution-plan data rather than insertion order.

The seeded fanout feature is well-contained: seeds are embedded per-run in the execution plan at write time and re-read from it at response time, so the mapping survives any repository-level reordering. Validation in both the CLI and the backend handler covers range, uniqueness, strategy, and incompatible flag combinations. No auth or data-integrity boundaries are affected by the change.

No files require special attention.

Important Files Changed

Filename	Overview
backend/internal/api/eval_session_service.go	Core seeded-run logic: moves executionPlan build inside the per-repetition loop, injects seed from SeedFanout[i], and derives seededRuns from each run's ExecutionPlan rather than positional matching.
backend/internal/api/eval_sessions.go	Adds MaxIterations decoding/validation and decodeEvalSessionSeedFanout; validates strategy, array length, positive integers, no duplicates.
backend/internal/api/eval_session_reads.go	Adds Seed field to child-run response; evalSessionChildRunSeed correctly extracts seed from ExecutionPlan JSON and filters seeds < 1.
cli/cmd/run_create_helpers.go	Adds buildSeededEvalSessionBody; correctly zeroes MaxIterations before calling buildEvalSessionBody, then injects it top-level; seeds generated as sequential int64 [1..N].
cli/cmd/run.go	Routes seeds > 0 to seeded eval session path; --follow + --seeds returns a clear error.
cli/cmd/eval_session_helpers.go	Changes executionMode to 'comparison' for multi-deployment eval sessions; adds seeded-run seed display in presentCreatedEvalSession.
docs/api-server/openapi.yaml	Documents max_iterations, seed_fanout, EvalSessionSeedFanoutConfig, EvalSessionSeededRun, and seed on child runs; constraints match backend validation.

Sequence Diagram

sequenceDiagram
    participant CLI as CLI (run create --seeds N)
    participant API as POST /v1/eval-sessions
    participant Svc as RunCreationManager
    participant Repo as Repository

    CLI->>CLI: "buildSeededEvalSessionBody seeds=[1..N]"
    CLI->>API: "POST body {seed_fanout, max_iterations}"
    API->>API: decodeEvalSessionSeedFanout validate
    API->>Svc: CreateEvalSession(input)
    loop repetition 0..N-1
        Svc->>Svc: buildExecutionPlan(runInput + seed[i])
        Svc->>Svc: append childRun with ExecutionPlan
    end
    Svc->>Svc: buildRoutingTaskSnapshot embed seed_fanout
    Svc->>Repo: CreateEvalSessionWithQueuedRuns
    Repo-->>Svc: Session + Runs[]
    loop each returned Run
        Svc->>Svc: evalSessionChildRunSeed(run.ExecutionPlan)
    end
    Svc-->>API: Session + RunIDs + SeededRuns
    API-->>CLI: eval_session + run_ids + seeded_runs
    CLI->>CLI: print run_id (seed N) per line

Reviews (2): Last reviewed commit: "fix: derive seeded run mapping from plan..."

greptile-apps · 2026-05-10T12:06:47Z

@@ -48,7 +48,7 @@ func buildEvalSessionBody(workspaceID string, request runCreateRequest, repetiti



Silent behavioral change for existing --repetitions multi-deployment users

Changing executionMode from "multi_agent" to "comparison" in buildEvalSessionBody affects all multi-deployment eval sessions (not only seeded ones), including agentclash run create --repetitions N --deployments A,B. If any backend or downstream consumers previously accepted only "multi_agent", those calls will start sending "comparison" after this release with no migration notice. The correction appears intentional, but it might be worth noting in the PR or release notes since it's a silent breaking change for existing callers on the repetitions path.

Noted in the PR body under Compatibility note. This is an intentional correction to the backend-supported eval-session value: the existing multi_agent payload was rejected by the API, while comparison is already the accepted execution mode for multi-participant eval sessions.

Atharva-Kanherkar · 2026-05-10T12:11:09Z

@greptileai review

feat: add seeded run creation

3b1b53f

greptile-apps Bot reviewed May 10, 2026

fix: derive seeded run mapping from plans

68184cb

Atharva-Kanherkar merged commit d24dd6b into main May 10, 2026
7 checks passed

Atharva-Kanherkar deleted the codex/roadmap-693-run-seeds branch May 10, 2026 12:16

Atharva-Kanherkar merged 2 commits into main from codex/roadmap-693-run-seeds
