Add BEM pre-step gate to fleet dispatcher for multi-issue parallel dispatch #2207
Merged
Trecek merged 4 commits on May 8, 2026
- Raise max_concurrent_dispatches default from 1 to 3 in FleetConfig
and defaults.yaml
- Add max_total_issues: 12 cap to FleetConfig with validation
- Replace stale serial-only guidance in fleet dispatcher prompt with
BEM pre-step gate instructions (bem-wrapper dispatch, dispatch_plan
reading, parallel group sequencing)
- Fix inaccurate semaphore language in campaign prompt ("calls queue"
→ fast-fail wait/retry, "max_concurrent=1" → sequential static)
- Add contract tests for BEM gate prompt content
- Add contract test for campaign prompt accuracy (no "calls queue")
- Add test_fourth_concurrent_dispatch_refused_with_max3 E2E test
- Rename existing config test to reflect new default of 3
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
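
As a rough illustration of the first two bullets in this commit, the sketch below shows one way the new default and the new cap could be validated. `FleetConfigSketch` is a hypothetical stand-in, not the project's `FleetConfig`; the bounds (`1 <= max_concurrent_dispatches <= 3` and `1 <= max_total_issues`) come from REQ-SEM-001 and REQ-SEM-002 in the PR description, and the error messages are invented for the example.

```python
from dataclasses import dataclass

# Hard cap named in REQ-SEM-001; the constant name is an assumption.
MAX_CONCURRENT_HARD_CAP = 3


@dataclass
class FleetConfigSketch:
    """Illustrative stand-in for FleetConfig (not the real class)."""

    max_concurrent_dispatches: int = 3  # default raised from 1 to 3
    max_total_issues: int = 12          # new session-level cap

    def validate(self) -> None:
        # Enforce 1 <= max_concurrent_dispatches <= 3.
        if not 1 <= self.max_concurrent_dispatches <= MAX_CONCURRENT_HARD_CAP:
            raise ValueError(
                "max_concurrent_dispatches must be between 1 and "
                f"{MAX_CONCURRENT_HARD_CAP}, got {self.max_concurrent_dispatches}"
            )
        # Enforce 1 <= max_total_issues (no upper bound is specified).
        if self.max_total_issues < 1:
            raise ValueError(
                f"max_total_issues must be >= 1, got {self.max_total_issues}"
            )


if __name__ == "__main__":
    FleetConfigSketch().validate()  # defaults pass validation
    try:
        FleetConfigSketch(max_concurrent_dispatches=4).validate()
    except ValueError as exc:
        print(f"rejected: {exc}")
```

Keeping the upper bound in `validate()` rather than in the semaphore means a user override above 3 fails at config load time instead of at dispatch time.
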
…rom_dynaconf, update contracts CLAUDE.md
- Re-add fleet_lock mention to campaign discipline section (test_fleet_lock_mentioned)
- Add max_total_issues val() line to FleetConfig construction in from_dynaconf (REQ-CONFIG-001)
- Add test_fleet_dispatch_bem_gate.py and test_campaign_prompt_accuracy.py to tests/contracts/CLAUDE.md
…ic literal 3
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…weak or-12 match, pin match string
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Trecek added a commit that referenced this pull request on May 8, 2026
…spatch (#2207)

## Summary

Add a BEM (build-execution-map) conflict gate to the fleet dispatcher's multi-issue dispatch path, and raise the fleet semaphore's default concurrency to allow the gate to actually enable parallel execution. The work spans four areas: (1) replace stale serial-only guidance in the fleet dispatcher prompt with BEM gate instructions, (2) fix inaccurate semaphore language in the campaign prompt, (3) raise `max_concurrent_dispatches` default from 1 to 3 and add a `max_total_issues: 12` cap to `FleetConfig`, and (4) add contract and config tests that enforce these prompt and config invariants going forward.

All machinery already exists: `bem-wrapper` is a `food-truck` recipe, `dispatch_food_truck` returns `l3_payload.dispatch_plan` inline, and `FleetSemaphore` already supports N>1. The gap is entirely in prompts and config defaults.

## Requirements

### REQ-BEM-001: Multi-issue conflict gate section in fleet dispatcher prompt

Add a new prompt section to `_build_fleet_dispatch_prompt()` in `src/autoskillit/cli/_prompts_kitchen.py` that instructs the fleet dispatcher to:

1. Detect when the user requests 2+ issues dispatched (regardless of whether user says "parallel"; any multi-issue dispatch needs the gate)
2. Dispatch `bem-wrapper` first: `dispatch_food_truck(recipe="bem-wrapper", task="Build execution map for conflict analysis", ingredients={"issue_urls": "<comma-separated URLs>", "base_branch": "<target branch>"}, capture={"execution_map": "${{ result.execution_map }}"})`
3. Read `l3_payload.dispatch_plan` from the `dispatch_food_truck` response; this is a JSON array of `{"group": N, "parallel": bool, "issues": "..."}` objects
4. Sequence subsequent dispatches per the group plan:
   - `parallel: true` groups → dispatch all issues in the group simultaneously
   - `parallel: false` groups → dispatch sequentially within the group
   - Wait for all dispatches in group N to complete before starting group N+1
5. Fall back to sequential dispatch (one at a time) if `bem-wrapper` fails
6. Single-issue dispatches skip BEM entirely (no pre-step needed)

### REQ-BEM-002: Update stale serial dispatch guidance

Replace the stale text at `_prompts_kitchen.py:121-122`:

```
- Serial execution: dispatch one food truck at a time. fleet_lock enforces this — do NOT attempt parallel dispatches.
```

With guidance describing BEM-gated parallel dispatch. The new text should describe: single-issue dispatches proceed directly; multi-issue dispatches require the BEM pre-step; group ordering and merge-wait between groups; max concurrent dispatches is 3. Reference the existing `max_parallel` cap that BEM already enforces on group sizes.

### REQ-BEM-003: Contract test for fleet dispatcher BEM gate

Add a contract test (in `tests/contracts/`) asserting that the fleet dispatcher prompt returned by `_build_fleet_dispatch_prompt()` contains multi-issue conflict gate instructions. Similar pattern to `test_sous_chef_routing.py:176` which asserts sous-chef contains BEM invocation instructions.

Suggested assertions:

- Prompt contains "bem-wrapper" reference
- Prompt contains "dispatch_plan" reference
- Prompt contains "multi-issue" or "conflict" gate language
- Prompt does NOT contain the stale "Serial execution: dispatch one food truck at a time" text

### REQ-BEM-004: Include `bem-wrapper` in fleet dispatcher recipe table

If `_food_truck_section` (the recipe table injected into the fleet dispatcher prompt) does not already include `bem-wrapper`, ensure it appears so the dispatcher knows it's available for dispatch without calling `list_recipes` first.

### REQ-SEM-001: Raise `max_concurrent_dispatches` default to 3

Update the default from 1 to 3 in both `src/autoskillit/config/_config_dataclasses.py` (`FleetConfig.max_concurrent_dispatches`) and `src/autoskillit/config/defaults.yaml` (`fleet.max_concurrent_dispatches`). This allows the semaphore to permit up to 3 food trucks running simultaneously, which BEM groups can utilize when `parallel: true`. Users can still override via `.autoskillit/config.yaml` or the `AUTOSKILLIT_FLEET__MAX_CONCURRENT_DISPATCHES` env var. Update `FleetConfig.validate()` to enforce `1 <= max_concurrent_dispatches <= 3` (hard cap to prevent runaway subprocess spawning).

Note: `FleetSemaphore` itself needs no changes; it already supports N>1 via `asyncio.BoundedSemaphore(max_concurrent)`.

### REQ-SEM-002: Add `max_total_issues` fleet config cap (default 12)

Add a new `max_total_issues: int = 12` field to `FleetConfig` in `_config_dataclasses.py` and `defaults.yaml`. This caps the total number of issues a single fleet dispatch session can process (across all food trucks and all groups). The fleet dispatcher prompt should instruct the dispatcher to count total issues before dispatching `bem-wrapper` and refuse (or warn) if over the cap. User-overridable via config/env var. Validation: `1 <= max_total_issues`.

Note: This is distinct from the existing `max_parallel` (default 6), which caps issues per BEM group, not total issues across the session. `max_parallel` is a BEM skill argument; `max_total_issues` is a fleet-level config field. Both are needed: `max_parallel` controls group sizing, `max_total_issues` controls session-level scope.

Rationale for 12: with `max_parallel=6` (issues per group) and `max_concurrent_dispatches=3`, 12 issues is ~2 fully-loaded parallel groups, the upper bound of what the system should handle in a single dispatch session.

### REQ-SEM-003: Fix inaccurate "calls queue" campaign prompt claim

Replace the inaccurate text at `_prompts_campaign.py:53`:

```
The fleet semaphore gates actual concurrency; calls queue when the semaphore is saturated.
```

With accurate language reflecting the actual `at_capacity()` → `FLEET_PARALLEL_REFUSED` fast-fail behavior. The dispatcher must wait for a running dispatch to complete before retrying, not assume queuing. Also update `_prompts_campaign.py:231` which says "max_concurrent=1 for static dispatches"; this will be inaccurate once REQ-SEM-001 raises the default.

### REQ-SEM-004: Update tests for new defaults

- Update any tests that hardcode `max_concurrent_dispatches=1` as the expected default (search for `max_concurrent_dispatches` and `max_concurrent=1` in tests/)
- Add validation test for the new `max_total_issues` field and the `max_concurrent_dispatches <= 3` upper bound
- Update `test_fleet_e2e.py` tests that assert `FLEET_PARALLEL_REFUSED` at max=1 to account for the new default of 3
- Add contract test asserting the campaign prompt does NOT contain "calls queue" language

## Conflict Resolution Decisions

The following files had merge conflicts that were automatically resolved.

(None provided)

Closes #2182

## Implementation Plan

Plan file: `/home/talon/projects/autoskillit-runs/impl-20260507-160445-669401/.autoskillit/temp/make-plan/fleet_dispatcher_bem_gate_plan_2026-05-07_160822.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via AutoSkillit

<!-- autoskillit:pipeline-signature steps=prepare_pr,run_arch_lenses,compose_pr,annotate_pr_diff,review_pr -->

## Token Usage Summary

| Step | Model | count | uncached | output | cache_read | peak_ctx | turns | cache_write | time |
|------|-------|-------|----------|--------|------------|----------|-------|-------------|------|
| plan | claude-sonnet-4-6 | 1 | 123 | 13.2k | 495.3k | 52.1k | 89 | 46.7k | 5m 41s |
| verify | claude-sonnet-4-6 | 1 | 236 | 14.8k | 1.5M | 68.7k | 83 | 56.0k | 4m 8s |
| implement* | MiniMax-M2.7-highspeed | 1 | 1.2M | 15.0k | 1.3M | 74.0k | 121 | 119.9k | 5m 13s |
| fix | claude-sonnet-4-6 | 2 | 246 | 12.2k | 1.2M | 63.9k | 76 | 93.3k | 12m 8s |
| prepare_pr* | MiniMax-M2.7-highspeed | 1 | 87.9k | 4.9k | 198.4k | 28.8k | 22 | 15.2k | 1m 27s |
| compose_pr* | MiniMax-M2.7-highspeed | 1 | 46.4k | 2.9k | 169.6k | 28.8k | 14 | 15.1k | 55s |
| **Total** | | | 1.3M | 63.1k | 4.9M | 74.0k | | 346.3k | 29m 34s |

\* *Step used a non-Anthropic provider; caching behavior may differ.*

## Token Efficiency

| Step | LoC Changed | cache_read/LoC | cache_write/LoC | output/LoC |
|------|-------------|----------------|-----------------|------------|
| plan | 0 | n/a | n/a | n/a |
| verify | 0 | n/a | n/a | n/a |
| implement | 223 | 5925.8 | 537.8 | 67.3 |
| fix | 7 | 174400.0 | 13330.4 | 1745.6 |
| prepare_pr | 0 | n/a | n/a | n/a |
| compose_pr | 0 | n/a | n/a | n/a |
| **Total** | **230** | 21230.1 | 1505.5 | 274.1 |

## Model Usage Breakdown

| Model | steps | uncached | output | cache_read | cache_write | time |
|-------|-------|----------|--------|------------|-------------|------|
| claude-sonnet-4-6 | 3 | 605 | 40.2k | 3.2M | 196.0k | 21m 58s |
| MiniMax-M2.7-highspeed | 3 | 1.3M | 22.8k | 1.7M | 150.2k | 7m 35s |

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
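
REQ-SEM-003 in the commit message above turns on the difference between a semaphore that queues callers and one that refuses them. The sketch below is a minimal model of the fast-fail behavior under stated assumptions: `FastFailGate`, `FleetParallelRefused`, and `dispatch_with_retry` are hypothetical names, and a plain counter stands in for the `asyncio.BoundedSemaphore` that the real `FleetSemaphore` is said to use. Only `at_capacity()` and the `FLEET_PARALLEL_REFUSED` outcome are taken from the source.

```python
import asyncio
from typing import Awaitable, Callable


class FleetParallelRefused(Exception):
    """Hypothetical stand-in for the FLEET_PARALLEL_REFUSED outcome."""


class FastFailGate:
    """Counter-based model of a gate that refuses instead of queuing."""

    def __init__(self, max_concurrent: int) -> None:
        self._max = max_concurrent
        self._active = 0

    def at_capacity(self) -> bool:
        return self._active >= self._max

    def try_acquire(self) -> None:
        # Fast-fail: a saturated gate raises immediately; nothing waits here.
        if self.at_capacity():
            raise FleetParallelRefused("all dispatch slots are in use")
        self._active += 1

    def release(self) -> None:
        self._active -= 1


async def dispatch_with_retry(
    gate: FastFailGate,
    job: Callable[[], Awaitable[str]],
    poll_seconds: float = 0.01,
) -> str:
    """Caller-side wait/retry: the caller, not the gate, does the waiting."""
    while True:
        try:
            gate.try_acquire()
            break
        except FleetParallelRefused:
            # Wait for a running dispatch to finish, then try again.
            await asyncio.sleep(poll_seconds)
    try:
        return await job()
    finally:
        gate.release()


async def _demo() -> None:
    gate = FastFailGate(max_concurrent=3)

    async def job() -> str:
        await asyncio.sleep(0.02)
        return "done"

    results = await asyncio.gather(
        *(dispatch_with_retry(gate, job) for _ in range(5))
    )
    print(results)


if __name__ == "__main__":
    asyncio.run(_demo())
```

The retry loop lives in the caller on purpose: it mirrors the requirement that the dispatcher wait for a running dispatch to complete before retrying rather than assume its call was queued.
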
## Summary

Add a BEM (build-execution-map) conflict gate to the fleet dispatcher's multi-issue dispatch path, and raise the fleet semaphore's default concurrency to allow the gate to actually enable parallel execution. The work spans four areas: (1) replace stale serial-only guidance in the fleet dispatcher prompt with BEM gate instructions, (2) fix inaccurate semaphore language in the campaign prompt, (3) raise `max_concurrent_dispatches` default from 1 to 3 and add a `max_total_issues: 12` cap to `FleetConfig`, and (4) add contract and config tests that enforce these prompt and config invariants going forward.

All machinery already exists: `bem-wrapper` is a `food-truck` recipe, `dispatch_food_truck` returns `l3_payload.dispatch_plan` inline, and `FleetSemaphore` already supports N>1. The gap is entirely in prompts and config defaults.

## Requirements

### REQ-BEM-001: Multi-issue conflict gate section in fleet dispatcher prompt

Add a new prompt section to `_build_fleet_dispatch_prompt()` in `src/autoskillit/cli/_prompts_kitchen.py` that instructs the fleet dispatcher to:

1. Detect when the user requests 2+ issues dispatched (regardless of whether user says "parallel"; any multi-issue dispatch needs the gate)
2. Dispatch `bem-wrapper` first: `dispatch_food_truck(recipe="bem-wrapper", task="Build execution map for conflict analysis", ingredients={"issue_urls": "<comma-separated URLs>", "base_branch": "<target branch>"}, capture={"execution_map": "${{ result.execution_map }}"})`
3. Read `l3_payload.dispatch_plan` from the `dispatch_food_truck` response; this is a JSON array of `{"group": N, "parallel": bool, "issues": "..."}` objects
4. Sequence subsequent dispatches per the group plan:
   - `parallel: true` groups → dispatch all issues in the group simultaneously
   - `parallel: false` groups → dispatch sequentially within the group
   - Wait for all dispatches in group N to complete before starting group N+1
5. Fall back to sequential dispatch (one at a time) if `bem-wrapper` fails
6. Single-issue dispatches skip BEM entirely (no pre-step needed)

### REQ-BEM-002: Update stale serial dispatch guidance

Replace the stale text at `_prompts_kitchen.py:121-122` (`- Serial execution: dispatch one food truck at a time. fleet_lock enforces this — do NOT attempt parallel dispatches.`) with guidance describing BEM-gated parallel dispatch. The new text should describe: single-issue dispatches proceed directly; multi-issue dispatches require the BEM pre-step; group ordering and merge-wait between groups; max concurrent dispatches is 3. Reference the existing `max_parallel` cap that BEM already enforces on group sizes.

### REQ-BEM-003: Contract test for fleet dispatcher BEM gate

Add a contract test (in `tests/contracts/`) asserting that the fleet dispatcher prompt returned by `_build_fleet_dispatch_prompt()` contains multi-issue conflict gate instructions. Similar pattern to `test_sous_chef_routing.py:176` which asserts sous-chef contains BEM invocation instructions.

Suggested assertions:

- Prompt contains "bem-wrapper" reference
- Prompt contains "dispatch_plan" reference
- Prompt contains "multi-issue" or "conflict" gate language
- Prompt does NOT contain the stale "Serial execution: dispatch one food truck at a time" text

### REQ-BEM-004: Include `bem-wrapper` in fleet dispatcher recipe table

If `_food_truck_section` (the recipe table injected into the fleet dispatcher prompt) does not already include `bem-wrapper`, ensure it appears so the dispatcher knows it's available for dispatch without calling `list_recipes` first.

### REQ-SEM-001: Raise `max_concurrent_dispatches` default to 3

Update the default from 1 to 3 in both `src/autoskillit/config/_config_dataclasses.py` (`FleetConfig.max_concurrent_dispatches`) and `src/autoskillit/config/defaults.yaml` (`fleet.max_concurrent_dispatches`). This allows the semaphore to permit up to 3 food trucks running simultaneously, which BEM groups can utilize when `parallel: true`. Users can still override via `.autoskillit/config.yaml` or the `AUTOSKILLIT_FLEET__MAX_CONCURRENT_DISPATCHES` env var. Update `FleetConfig.validate()` to enforce `1 <= max_concurrent_dispatches <= 3` (hard cap to prevent runaway subprocess spawning).

Note: `FleetSemaphore` itself needs no changes; it already supports N>1 via `asyncio.BoundedSemaphore(max_concurrent)`.

### REQ-SEM-002: Add `max_total_issues` fleet config cap (default 12)

Add a new `max_total_issues: int = 12` field to `FleetConfig` in `_config_dataclasses.py` and `defaults.yaml`. This caps the total number of issues a single fleet dispatch session can process (across all food trucks and all groups). The fleet dispatcher prompt should instruct the dispatcher to count total issues before dispatching `bem-wrapper` and refuse (or warn) if over the cap. User-overridable via config/env var. Validation: `1 <= max_total_issues`.

Note: This is distinct from the existing `max_parallel` (default 6), which caps issues per BEM group, not total issues across the session. `max_parallel` is a BEM skill argument; `max_total_issues` is a fleet-level config field. Both are needed: `max_parallel` controls group sizing, `max_total_issues` controls session-level scope.

Rationale for 12: with `max_parallel=6` (issues per group) and `max_concurrent_dispatches=3`, 12 issues is ~2 fully-loaded parallel groups, the upper bound of what the system should handle in a single dispatch session.

### REQ-SEM-003: Fix inaccurate "calls queue" campaign prompt claim

Replace the inaccurate text at `_prompts_campaign.py:53` ("The fleet semaphore gates actual concurrency; calls queue when the semaphore is saturated.") with accurate language reflecting the actual `at_capacity()` → `FLEET_PARALLEL_REFUSED` fast-fail behavior. The dispatcher must wait for a running dispatch to complete before retrying, not assume queuing. Also update `_prompts_campaign.py:231` which says "max_concurrent=1 for static dispatches"; this will be inaccurate once REQ-SEM-001 raises the default.

### REQ-SEM-004: Update tests for new defaults

- Update any tests that hardcode `max_concurrent_dispatches=1` as the expected default (search for `max_concurrent_dispatches` and `max_concurrent=1` in tests/)
- Add validation test for the new `max_total_issues` field and the `max_concurrent_dispatches <= 3` upper bound
- Update `test_fleet_e2e.py` tests that assert `FLEET_PARALLEL_REFUSED` at max=1 to account for the new default of 3
- Add contract test asserting the campaign prompt does NOT contain "calls queue" language

## Conflict Resolution Decisions

The following files had merge conflicts that were automatically resolved.

(None provided)

Closes #2182
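
The group sequencing that REQ-BEM-001 asks the prompt to describe can be sketched in code, purely as an illustration of the intended ordering. Everything here is an assumption layered on the PR text: `run_dispatch_plan`, `dispatch_issues`, and the `build_plan` and `dispatch` callables are hypothetical, the `issues` field is assumed to be a comma-separated string like `issue_urls`, and the sketch ignores the semaphore's concurrency limit. In the PR itself this logic lives in prompt instructions, not in Python.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical callables standing in for dispatch_food_truck invocations.
Dispatch = Callable[[str], Awaitable[str]]
BuildPlan = Callable[[list[str]], Awaitable[list[dict]]]


def split_issues(issues: str) -> list[str]:
    """Assumes the plan's 'issues' field is a comma-separated string."""
    return [url.strip() for url in issues.split(",") if url.strip()]


async def run_dispatch_plan(plan: list[dict], dispatch: Dispatch) -> list[str]:
    results: list[str] = []
    # Group N must finish completely before group N+1 starts.
    for group in sorted(plan, key=lambda g: g["group"]):
        urls = split_issues(group["issues"])
        if group["parallel"]:
            # parallel: true -> dispatch every issue in the group at once.
            results.extend(await asyncio.gather(*(dispatch(u) for u in urls)))
        else:
            # parallel: false -> dispatch one at a time within the group.
            for url in urls:
                results.append(await dispatch(url))
    return results


async def dispatch_issues(
    urls: list[str],
    build_plan: BuildPlan,
    dispatch: Dispatch,
    max_total_issues: int = 12,
) -> list[str]:
    # Count issues before the BEM pre-step and refuse when over the cap.
    if len(urls) > max_total_issues:
        raise ValueError(f"{len(urls)} issues exceeds cap of {max_total_issues}")
    # Single-issue dispatches skip the BEM pre-step entirely.
    if len(urls) == 1:
        return [await dispatch(urls[0])]
    try:
        plan = await build_plan(urls)
    except Exception:
        # If the pre-step fails, fall back to strictly sequential dispatch.
        plan = [
            {"group": i + 1, "parallel": False, "issues": url}
            for i, url in enumerate(urls)
        ]
    return await run_dispatch_plan(plan, dispatch)
```

Modeling the fallback as one non-parallel group per issue keeps a single code path for both the planned and the degraded case.
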
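## Implementation Plan

Plan file: `/home/talon/projects/autoskillit-runs/impl-20260507-160445-669401/.autoskillit/temp/make-plan/fleet_dispatcher_bem_gate_plan_2026-05-07_160822.md`

🤖 Generated with [Claude Code](https://claude.com/claude-code) via AutoSkillit

## Token Usage Summary

| Step | Model | count | uncached | output | cache_read | peak_ctx | turns | cache_write | time |
|------|-------|-------|----------|--------|------------|----------|-------|-------------|------|
| plan | claude-sonnet-4-6 | 1 | 123 | 13.2k | 495.3k | 52.1k | 89 | 46.7k | 5m 41s |
| verify | claude-sonnet-4-6 | 1 | 236 | 14.8k | 1.5M | 68.7k | 83 | 56.0k | 4m 8s |
| implement* | MiniMax-M2.7-highspeed | 1 | 1.2M | 15.0k | 1.3M | 74.0k | 121 | 119.9k | 5m 13s |
| fix | claude-sonnet-4-6 | 2 | 246 | 12.2k | 1.2M | 63.9k | 76 | 93.3k | 12m 8s |
| prepare_pr* | MiniMax-M2.7-highspeed | 1 | 87.9k | 4.9k | 198.4k | 28.8k | 22 | 15.2k | 1m 27s |
| compose_pr* | MiniMax-M2.7-highspeed | 1 | 46.4k | 2.9k | 169.6k | 28.8k | 14 | 15.1k | 55s |
| **Total** | | | 1.3M | 63.1k | 4.9M | 74.0k | | 346.3k | 29m 34s |

\* *Step used a non-Anthropic provider; caching behavior may differ.*

## Token Efficiency

| Step | LoC Changed | cache_read/LoC | cache_write/LoC | output/LoC |
|------|-------------|----------------|-----------------|------------|
| plan | 0 | n/a | n/a | n/a |
| verify | 0 | n/a | n/a | n/a |
| implement | 223 | 5925.8 | 537.8 | 67.3 |
| fix | 7 | 174400.0 | 13330.4 | 1745.6 |
| prepare_pr | 0 | n/a | n/a | n/a |
| compose_pr | 0 | n/a | n/a | n/a |
| **Total** | **230** | 21230.1 | 1505.5 | 274.1 |

## Model Usage Breakdown

| Model | steps | uncached | output | cache_read | cache_write | time |
|-------|-------|----------|--------|------------|-------------|------|
| claude-sonnet-4-6 | 3 | 605 | 40.2k | 3.2M | 196.0k | 21m 58s |
| MiniMax-M2.7-highspeed | 3 | 1.3M | 22.8k | 1.7M | 150.2k | 7m 35s |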