Add BEM pre-step gate to fleet dispatcher for multi-issue parallel dispatch by Trecek · Pull Request #2207 · TalonT-Org/AutoSkillit

Trecek · 2026-05-08T00:16:25Z

Summary

Add a BEM (build-execution-map) conflict gate to the fleet dispatcher's multi-issue dispatch path, and raise the fleet semaphore's default concurrency to allow the gate to actually enable parallel execution. The work spans four areas: (1) replace stale serial-only guidance in the fleet dispatcher prompt with BEM gate instructions, (2) fix inaccurate semaphore language in the campaign prompt, (3) raise max_concurrent_dispatches default from 1 to 3 and add a max_total_issues: 12 cap to FleetConfig, and (4) add contract and config tests that enforce these prompt and config invariants going forward.

All machinery already exists: bem-wrapper is a food-truck recipe, dispatch_food_truck returns l3_payload.dispatch_plan inline, and FleetSemaphore already supports N>1. The gap is entirely in prompts and config defaults.

Requirements

REQ-BEM-001: Multi-issue conflict gate section in fleet dispatcher prompt

Add a new prompt section to _build_fleet_dispatch_prompt() in src/autoskillit/cli/_prompts_kitchen.py that instructs the fleet dispatcher to:

1. Detect when the user requests 2+ issues dispatched (regardless of whether user says "parallel"; any multi-issue dispatch needs the gate)
2. Dispatch bem-wrapper first: dispatch_food_truck(recipe="bem-wrapper", task="Build execution map for conflict analysis", ingredients={"issue_urls": "<comma-separated URLs>", "base_branch": "<target branch>"}, capture={"execution_map": "${{ result.execution_map }}"})
3. Read l3_payload.dispatch_plan from the dispatch_food_truck response; this is a JSON array of {"group": N, "parallel": bool, "issues": "..."} objects
4. Sequence subsequent dispatches per the group plan:
   - parallel: true groups → dispatch all issues in the group simultaneously
   - parallel: false groups → dispatch sequentially within the group
   - Wait for all dispatches in group N to complete before starting group N+1
5. Fall back to sequential dispatch (one at a time) if bem-wrapper fails
6. Single-issue dispatches skip BEM entirely (no pre-step needed)
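
The fleet dispatcher follows prompt text rather than Python, so the following is only an illustrative sketch of the group ordering described above. It assumes the "issues" field is a comma-separated string of URLs, and `dispatch` is a hypothetical stand-in for one per-issue food-truck dispatch; neither `run_dispatch_plan` nor `sequential_fallback_plan` exists in the codebase. The fallback shape (a single non-parallel group) is likewise an assumption.

```python
import asyncio
from typing import Awaitable, Callable, List

# Hypothetical stand-in for dispatching one issue to a food truck.
DispatchFn = Callable[[str], Awaitable[str]]


def split_issues(issues: str) -> List[str]:
    """Split a plan entry's "issues" string (assumed comma-separated)."""
    return [url.strip() for url in issues.split(",") if url.strip()]


def sequential_fallback_plan(issue_urls: List[str]) -> list:
    """Plan to use if bem-wrapper fails: one group, dispatched one at a time."""
    return [{"group": 1, "parallel": False, "issues": ",".join(issue_urls)}]


async def run_dispatch_plan(plan: list, dispatch: DispatchFn) -> List[str]:
    """Run groups in ascending order, finishing group N before group N+1."""
    results: List[str] = []
    for group in sorted(plan, key=lambda g: g["group"]):
        urls = split_issues(group["issues"])
        if group["parallel"]:
            # Start every issue in the group, then wait for all of them.
            results.extend(await asyncio.gather(*(dispatch(u) for u in urls)))
        else:
            # Dispatch one issue at a time within the group.
            for url in urls:
                results.append(await dispatch(url))
    return results
```

The sketch does not model the semaphore cap; a parallel group larger than max_concurrent_dispatches would still need to respect it.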

REQ-BEM-002: Update stale serial dispatch guidance

Replace the stale text at _prompts_kitchen.py:121-122:

- Serial execution: dispatch one food truck at a time. fleet_lock enforces this —
do NOT attempt parallel dispatches.

Replace it with guidance describing BEM-gated parallel dispatch. The new text should describe: single-issue dispatches proceed directly; multi-issue dispatches require the BEM pre-step; group ordering and merge-wait between groups; and a maximum of 3 concurrent dispatches. It should also reference the existing max_parallel cap that BEM already enforces on group sizes.

REQ-BEM-003: Contract test for fleet dispatcher BEM gate

Add a contract test (in tests/contracts/) asserting that the fleet dispatcher prompt returned by _build_fleet_dispatch_prompt() contains multi-issue conflict gate instructions. This follows the pattern of test_sous_chef_routing.py:176, which asserts that sous-chef contains BEM invocation instructions.

Suggested assertions:

- Prompt contains "bem-wrapper" reference
- Prompt contains "dispatch_plan" reference
- Prompt contains "multi-issue" or "conflict" gate language
- Prompt does NOT contain the stale "Serial execution: dispatch one food truck at a time" text
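
A minimal sketch of how the suggested assertions could be expressed. The helper `bem_gate_violations` is hypothetical and takes the prompt as a plain string so the example runs on its own; an actual contract test would presumably pass in the return value of _build_fleet_dispatch_prompt(), whose signature is not shown here. The case-insensitive match for the gate language is an assumption.

```python
from typing import List

# Stale guidance that the new prompt must no longer contain.
STALE_SERIAL_TEXT = "Serial execution: dispatch one food truck at a time"


def bem_gate_violations(prompt: str) -> List[str]:
    """Return a list of contract violations; an empty list means the prompt passes."""
    problems: List[str] = []
    if "bem-wrapper" not in prompt:
        problems.append("missing bem-wrapper reference")
    if "dispatch_plan" not in prompt:
        problems.append("missing dispatch_plan reference")
    lowered = prompt.lower()
    if "multi-issue" not in lowered and "conflict" not in lowered:
        problems.append("missing multi-issue/conflict gate language")
    if STALE_SERIAL_TEXT in prompt:
        problems.append("stale serial-only guidance still present")
    return problems
```

In a test, `assert bem_gate_violations(prompt) == []` reports every failed condition at once rather than stopping at the first.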

REQ-BEM-004: Include `bem-wrapper` in fleet dispatcher recipe table

If _food_truck_section (the recipe table injected into the fleet dispatcher prompt) does not already include bem-wrapper, ensure it appears so the dispatcher knows it's available for dispatch without calling list_recipes first.

REQ-SEM-001: Raise `max_concurrent_dispatches` default to 3

Update the default from 1 to 3 in both src/autoskillit/config/_config_dataclasses.py (FleetConfig.max_concurrent_dispatches) and src/autoskillit/config/defaults.yaml (fleet.max_concurrent_dispatches). This allows the semaphore to permit up to 3 food trucks running simultaneously, which BEM groups can utilize when parallel: true. Users can still override via .autoskillit/config.yaml or the AUTOSKILLIT_FLEET__MAX_CONCURRENT_DISPATCHES env var. Update FleetConfig.validate() to enforce 1 <= max_concurrent_dispatches <= 3 (hard cap to prevent runaway subprocess spawning).

Note: FleetSemaphore itself needs no changes — it already supports N>1 via asyncio.BoundedSemaphore(max_concurrent).
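
As a rough illustration of the bound described above, the sketch below uses a stand-in dataclass. `FleetConfigSketch` and `load_fleet_config` are hypothetical names; the real FleetConfig has other fields and is loaded through the project's own config layer, which this example does not reproduce. Only the default of 3, the 1 to 3 range, the env var name, and the _MAX_CONCURRENT_DISPATCHES constant name come from this PR.

```python
from dataclasses import dataclass
from typing import Mapping

# Hard cap from the requirement; the constant name follows the review-fix commit.
_MAX_CONCURRENT_DISPATCHES = 3


@dataclass
class FleetConfigSketch:
    """Hypothetical stand-in for the one FleetConfig field discussed here."""

    max_concurrent_dispatches: int = 3

    def validate(self) -> None:
        # Enforce 1 <= max_concurrent_dispatches <= 3.
        if not 1 <= self.max_concurrent_dispatches <= _MAX_CONCURRENT_DISPATCHES:
            raise ValueError(
                "max_concurrent_dispatches must be between 1 and "
                f"{_MAX_CONCURRENT_DISPATCHES}, got {self.max_concurrent_dispatches}"
            )


def load_fleet_config(environ: Mapping[str, str]) -> FleetConfigSketch:
    """Simplified env override; not how the real config loader works."""
    raw = environ.get("AUTOSKILLIT_FLEET__MAX_CONCURRENT_DISPATCHES")
    config = FleetConfigSketch() if raw is None else FleetConfigSketch(int(raw))
    config.validate()
    return config
```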

REQ-SEM-002: Add `max_total_issues` fleet config cap (default 12)

Add a new max_total_issues: int = 12 field to FleetConfig in _config_dataclasses.py and defaults.yaml. This caps the total number of issues a single fleet dispatch session can process (across all food trucks and all groups). The fleet dispatcher prompt should instruct the dispatcher to count total issues before dispatching bem-wrapper and refuse (or warn) if over the cap. User-overridable via config/env var. Validation: 1 <= max_total_issues.

Note: This is distinct from the existing max_parallel (default 6), which caps issues per BEM group, not total issues across the session. max_parallel is a BEM skill argument; max_total_issues is a fleet-level config field. Both are needed: max_parallel controls group sizing, max_total_issues controls session-level scope.

Rationale for 12: with max_parallel=6 (issues per group) and max_concurrent_dispatches=3, 12 issues is ~2 fully-loaded parallel groups — the upper bound of what the system should handle in a single dispatch session.
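
The cap check and the arithmetic behind the default can be sketched as follows. Both helpers are hypothetical: in practice the check is an instruction in the dispatcher prompt, not a Python function, and whether the dispatcher refuses or only warns is left open by the requirement. The group count is a lower bound, since BEM may split issues into more groups when they conflict.

```python
import math
from typing import List, Optional


def issue_cap_violation(issue_urls: List[str], max_total_issues: int = 12) -> Optional[str]:
    """Return a refusal/warning message if the session exceeds the cap, else None."""
    if max_total_issues < 1:
        raise ValueError("max_total_issues must be >= 1")
    count = len(issue_urls)
    if count > max_total_issues:
        return f"{count} issues requested; cap is {max_total_issues}"
    return None


def min_group_count(total_issues: int, max_parallel: int = 6) -> int:
    """Fewest BEM groups that can hold the issues when each group is full."""
    return math.ceil(total_issues / max_parallel)
```

With the defaults, `min_group_count(12)` is 2, which matches the "~2 fully-loaded parallel groups" rationale.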

REQ-SEM-003: Fix inaccurate "calls queue" campaign prompt claim

Replace the inaccurate text at _prompts_campaign.py:53:

The fleet semaphore gates actual concurrency; calls queue when the semaphore is saturated.

Replace it with accurate language reflecting the actual at_capacity() → FLEET_PARALLEL_REFUSED fast-fail behavior: the dispatcher must wait for a running dispatch to complete before retrying, and must not assume queuing. Also update _prompts_campaign.py:231, which says "max_concurrent=1 for static dispatches"; this will be inaccurate once REQ-SEM-001 raises the default.
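
To make the distinction between queuing and fast-fail concrete, here is a small sketch. `FastFailSemaphore` is a hypothetical class written for this example, not the project's FleetSemaphore; it only assumes what the PR states, namely an at_capacity() check that leads to a FLEET_PARALLEL_REFUSED result instead of waiting. It relies on `asyncio.BoundedSemaphore.locked()`, which reports whether the semaphore can be acquired immediately.

```python
import asyncio
from typing import Awaitable, Callable

FLEET_PARALLEL_REFUSED = "FLEET_PARALLEL_REFUSED"


class FastFailSemaphore:
    """Hypothetical illustration: refuse when full instead of queuing the caller."""

    def __init__(self, max_concurrent: int) -> None:
        self._sem = asyncio.BoundedSemaphore(max_concurrent)

    def at_capacity(self) -> bool:
        # True when no slot can be acquired without waiting.
        return self._sem.locked()

    async def try_dispatch(self, run: Callable[[], Awaitable[str]]) -> str:
        if self.at_capacity():
            # Fast-fail: the caller must retry after a running dispatch finishes.
            return FLEET_PARALLEL_REFUSED
        async with self._sem:
            return await run()
```

A caller that receives FLEET_PARALLEL_REFUSED gets no slot reserved for it, so retrying only makes sense once a running dispatch has completed.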

REQ-SEM-004: Update tests for new defaults

- Update any tests that hardcode max_concurrent_dispatches=1 as the expected default (search for max_concurrent_dispatches and max_concurrent=1 in tests/)
- Add validation test for the new max_total_issues field and the max_concurrent_dispatches <= 3 upper bound
- Update test_fleet_e2e.py tests that assert FLEET_PARALLEL_REFUSED at max=1 to account for the new default of 3
- Add contract test asserting the campaign prompt does NOT contain "calls queue" language

Conflict Resolution Decisions

The following files had merge conflicts that were automatically resolved.

(None provided)

Closes #2182

Implementation Plan

Plan file: /home/talon/projects/autoskillit-runs/impl-20260507-160445-669401/.autoskillit/temp/make-plan/fleet_dispatcher_bem_gate_plan_2026-05-07_160822.md

🤖 Generated with Claude Code via AutoSkillit

Token Usage Summary

Step	Model	count	uncached	output	cache_read	peak_ctx	turns	cache_write	time
plan	claude-sonnet-4-6	1	123	13.2k	495.3k	52.1k	89	46.7k	5m 41s
verify	claude-sonnet-4-6	1	236	14.8k	1.5M	68.7k	83	56.0k	4m 8s
implement*	MiniMax-M2.7-highspeed	1	1.2M	15.0k	1.3M	74.0k	121	119.9k	5m 13s
prepare_pr*	MiniMax-M2.7-highspeed	1	87.9k	4.9k	198.4k	28.8k	22	15.2k	1m 27s
compose_pr*	MiniMax-M2.7-highspeed	1	46.4k	2.9k	169.6k	28.8k	14	15.1k	55s
review_pr	claude-sonnet-4-6	1	124	36.3k	755.8k	79.4k	45	67.4k	7m 7s
resolve_review	claude-sonnet-4-6	1	237	14.4k	1.4M	65.2k	70	52.9k	7m 44s
Total			1.3M	101.5k	5.9M	79.4k		373.2k	32m 17s

* Step used a non-Anthropic provider; caching behavior may differ.

Token Efficiency

Step	LoC Changed	cache_read/LoC	cache_write/LoC	output/LoC
plan	0	—	—	—
verify	0	—	—	—
implement	223	5925.8	537.8	67.3
prepare_pr	0	—	—	—
compose_pr	0	—	—	—
review_pr	0	—	—	—
resolve_review	16	89758.8	3304.4	901.1
Total	239	24494.1	1561.4	424.8

Model Usage Breakdown

Model	steps	uncached	output	cache_read	cache_write	time
claude-sonnet-4-6	3	605	40.2k	3.2M	196.0k	21m 58s
MiniMax-M2.7-highspeed	3	1.3M	22.8k	1.7M	150.2k	7m 35s

- Raise max_concurrent_dispatches default from 1 to 3 in FleetConfig and defaults.yaml
- Add max_total_issues: 12 cap to FleetConfig with validation
- Replace stale serial-only guidance in fleet dispatcher prompt with BEM pre-step gate instructions (bem-wrapper dispatch, dispatch_plan reading, parallel group sequencing)
- Fix inaccurate semaphore language in campaign prompt ("calls queue" → fast-fail wait/retry, "max_concurrent=1" → sequential static)
- Add contract tests for BEM gate prompt content
- Add contract test for campaign prompt accuracy (no "calls queue")
- Add test_fourth_concurrent_dispatch_refused_with_max3 E2E test
- Rename existing config test to reflect new default of 3

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…rom_dynaconf, update contracts CLAUDE.md
- Re-add fleet_lock mention to campaign discipline section (test_fleet_lock_mentioned)
- Add max_total_issues val() line to FleetConfig construction in from_dynaconf (REQ-CONFIG-001)
- Add test_fleet_dispatch_bem_gate.py and test_campaign_prompt_accuracy.py to tests/contracts/CLAUDE.md



Trecek and others added 4 commits May 7, 2026 16:23

fix(review): extract _MAX_CONCURRENT_DISPATCHES constant — remove mag…

b2375db

…ic literal 3 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix(review): tighten test assertions — rename camelCase test, remove …

4218307

…weak or-12 match, pin match string Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Trecek added this pull request to the merge queue May 8, 2026

Merged via the queue into develop with commit 39d7115 May 8, 2026
2 checks passed

Trecek deleted the fleet-dispatcher-add-bem-pre-step-gate-for-multi-issue-paral/2182 branch May 8, 2026 01:12

Trecek mentioned this pull request May 8, 2026

Promote develop to main (200 PRs, 160+ issues, 179 fixes, 480 features, 27 refactors, 22 infra) #2213

Merged

This was referenced May 20, 2026

Harden BEM pre-step gate directives in fleet dispatch prompt to prevent multi-issue bypass #2634

Open

Harden BEM pre-step gate directives in fleet dispatch prompt #2635

Merged
