docs: clarify --workers parallelism scope and workspace-mode semantics#1050
Merged
#1039) Answers the questions from #1039:
- --workers N is a global concurrent test-case budget: with a single eval file, N test cases run in parallel; with M eval files, min(N, M) files run concurrently, each with floor(N/min(N,M)) workers.
- Documents the static workspace race condition when multiple eval files share the same path and run concurrently, with --workers 1 as the serialization escape hatch.
- Updates workspace-pool.mdx concurrency section to explain multi-file slot allocation and the static-workspace cross-file hazard.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
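The budget split described in this commit message can be sketched as a small pure function. This is only an illustration of the stated formula; `splitWorkerBudget` and `WorkerSplit` are hypothetical names and are not taken from the agentv source.

```typescript
// Hypothetical helper illustrating the min(N, M) / floor(N / min(N, M)) split.
// Names and shape are assumptions, not the agentv implementation.
interface WorkerSplit {
  concurrentFiles: number; // how many eval files run at the same time
  workersPerFile: number;  // test-case parallelism inside each of those files
}

function splitWorkerBudget(workers: number, fileCount: number): WorkerSplit {
  if (!Number.isInteger(workers) || workers < 1) {
    throw new RangeError("workers must be a positive integer");
  }
  if (!Number.isInteger(fileCount) || fileCount < 1) {
    throw new RangeError("fileCount must be a positive integer");
  }
  const concurrentFiles = Math.min(workers, fileCount);
  const workersPerFile = Math.floor(workers / concurrentFiles);
  return { concurrentFiles, workersPerFile };
}

console.log(splitWorkerBudget(6, 3)); // 3 files at once, 2 workers each
console.log(splitWorkerBudget(4, 1)); // 1 file, 4 workers
console.log(splitWorkerBudget(2, 5)); // 2 files at once, 1 worker each
```

Under this formula the floor can leave part of the budget idle: with 5 workers and 3 files, each file gets 1 worker, so only 3 test cases run at once.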
Deploying agentv with
| Latest commit: | f09ffa7 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://82922c69.agentv.pages.dev |
| Branch Preview URL: | https://docs-1039-workers-workspace.agentv.pages.dev |
… run concurrently Surfaces workspace.path from EvalSuiteResult so the CLI can detect cross-file workspace collisions before starting concurrent execution. Emits a console.warn pointing users to --workers 1 as the fix. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…te literals Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace the global worker-budget-split scheduler with workspace-aware grouping: eval files sharing the same static workspace.path run sequentially within their group; groups with distinct paths (or no static workspace) run in parallel. Each file gets the full --workers N budget with no splitting. Also removes the now-unnecessary concurrent-workspace warning added in the previous commit and updates docs + CLI help to reflect the new semantics. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
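A minimal sketch of the workspace-aware grouping this commit describes, assuming each eval file exposes an optional static workspace path. `EvalFile`, `groupByWorkspace`, and `runGrouped` are hypothetical names used for illustration only.

```typescript
// Hypothetical types and helpers; not the agentv implementation.
interface EvalFile {
  file: string;
  workspacePath?: string; // assumed to be set only for static workspaces
}

// Files sharing a static workspace path land in one group; every file
// without a static workspace becomes its own single-file group.
function groupByWorkspace(files: EvalFile[]): EvalFile[][] {
  const shared = new Map<string, EvalFile[]>();
  const groups: EvalFile[][] = [];
  for (const f of files) {
    if (f.workspacePath === undefined) {
      groups.push([f]);
      continue;
    }
    let group = shared.get(f.workspacePath);
    if (group === undefined) {
      group = [];
      shared.set(f.workspacePath, group);
      groups.push(group);
    }
    group.push(f);
  }
  return groups;
}

// Groups run in parallel; files inside a group run one after another,
// so two files never touch the same static path at the same time.
async function runGrouped(
  files: EvalFile[],
  runFile: (f: EvalFile) => Promise<void>,
): Promise<void> {
  await Promise.all(
    groupByWorkspace(files).map(async (group) => {
      for (const f of group) {
        await runFile(f);
      }
    }),
  );
}
```

In this sketch `runFile` would receive the full --workers budget for each file, which matches the "no splitting" statement above.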
…rrency runWithLimit limits how many groups run in parallel to --workers N. Each file within a group still gets the full --workers budget for within-file test-case parallelism. Max concurrent test cases is bounded by workers² rather than unbounded across all groups. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
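The commit names `runWithLimit` but does not show its signature, so the following is an assumed shape: a generic limiter that keeps at most `limit` tasks in flight and returns results in input order.

```typescript
// Assumed signature; the actual runWithLimit in the PR is not shown in this page.
async function runWithLimit<T>(
  tasks: Array<() => Promise<T>>,
  limit: number,
): Promise<T[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results: T[] = new Array(tasks.length);
  let next = 0; // index of the next task to claim

  // Each runner claims tasks one at a time until none are left.
  // JavaScript is single-threaded, so `next++` cannot race.
  async function runner(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const runnerCount = Math.min(limit, tasks.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}

// Example: three groups, at most two in flight at once.
runWithLimit(
  [async () => "a", async () => "b", async () => "c"],
  2,
).then((r) => console.log(r)); // resolves to ["a", "b", "c"] in input order
```

If the limit is N groups and each file inside a group also runs up to N test cases, the worst case is N × N concurrent test cases, which is the workers² bound mentioned above.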
Replace workspace-group scheduler with a plain for-of loop. Eval files always run sequentially; --workers N controls within-file test-case parallelism. This matches the standard model used by promptfoo and convex-evals and eliminates all cross-file workspace race conditions without any grouping complexity. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
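The final model can be sketched as follows: a plain for-of loop over eval files, with a small pool of `workers` runners draining each file's test cases. `EvalFile`, `RunTestCase`, and `runEvalFiles` are hypothetical names; the real code in run.ts may be structured differently.

```typescript
// Hypothetical types; illustration only.
interface EvalFile {
  name: string;
  testCases: string[];
}
type RunTestCase = (file: string, testCase: string) => Promise<void>;

async function runEvalFiles(
  files: EvalFile[],
  workers: number,
  runTestCase: RunTestCase,
): Promise<void> {
  // Files are strictly sequential: the next file starts only after
  // every test case of the current file has finished.
  for (const file of files) {
    const queue = [...file.testCases];
    const poolSize = Math.min(workers, queue.length);
    // Each runner pulls test cases off the shared queue until it is empty.
    const pool = Array.from({ length: poolSize }, async () => {
      let testCase: string | undefined;
      while ((testCase = queue.shift()) !== undefined) {
        await runTestCase(file.name, testCase);
      }
    });
    await Promise.all(pool);
  }
}
```

Because only one file is active at a time, two files that point at the same static workspace path can never run concurrently, which is why no grouping logic is needed in this model.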
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove runWithLimit (unused after switching to plain for-of loop) and workspacePath from fileMetadata Map type (set but never read). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Closes #1039
Summary
Answers the three questions raised in #1039 about --workers and --workspace-mode semantics:

1. What is the scope of --workers N?

--workers N is a global concurrent test-case budget distributed across the entire run: min(N, files) eval files run concurrently, each receiving floor(N / concurrent_files) workers for intra-file parallelism.

So with 3 eval files and --workers 6: 3 files run concurrently, each with 2 workers → up to 6 concurrent test cases total.

Use --workers 1 to run everything sequentially (one eval file at a time, one test case at a time).

2. What are the semantics of --workspace-mode?

The workspace-pool.mdx guide already documents all three modes. This PR adds cross-file concurrency context:

- pooled (default for evals with repos): Each active worker acquires its own pool slot, regardless of which eval file it belongs to. No cross-file contention.
- temp: Each test case gets its own scratch directory. No contention.
- static: Single fixed path shared by all workers and all eval files. Race-prone when multiple eval files run concurrently against the same path.

3. Is there a way to opt into eval-file-level serialization?

Yes: --workers 1 serializes both files and test cases. Alternatively, use --workspace-mode pooled (the default) which gives each worker its own slot and avoids cross-file contention automatically.

Changes

- apps/cli/src/commands/eval/commands/run.ts: Updated --workers help text to explain the global-budget-split-across-files semantics.
- apps/web/src/content/docs/docs/evaluation/running-evals.mdx: Added a "Parallelism" section with single-file vs multi-file examples and a static-workspace race condition warning.
- apps/web/src/content/docs/docs/guides/workspace-pool.mdx: Extended the Concurrency section to explain multi-file slot allocation and the static-workspace cross-file hazard.

Red / Green UAT

Before: agentv eval --help shows:

After:
🤖 Generated with Claude Code