
Add BEM pre-step gate to fleet dispatcher for multi-issue parallel dispatch #2207

Merged
Trecek merged 4 commits into develop from fleet-dispatcher-add-bem-pre-step-gate-for-multi-issue-paral/2182
May 8, 2026


@Trecek Trecek commented May 8, 2026

Summary

Add a BEM (build-execution-map) conflict gate to the fleet dispatcher's multi-issue dispatch path, and raise the fleet semaphore's default concurrency to allow the gate to actually enable parallel execution. The work spans four areas: (1) replace stale serial-only guidance in the fleet dispatcher prompt with BEM gate instructions, (2) fix inaccurate semaphore language in the campaign prompt, (3) raise max_concurrent_dispatches default from 1 to 3 and add a max_total_issues: 12 cap to FleetConfig, and (4) add contract and config tests that enforce these prompt and config invariants going forward.

All machinery already exists: bem-wrapper is a food-truck recipe, dispatch_food_truck returns l3_payload.dispatch_plan inline, and FleetSemaphore already supports N>1. The gap is entirely in prompts and config defaults.

Requirements

REQ-BEM-001: Multi-issue conflict gate section in fleet dispatcher prompt

Add a new prompt section to _build_fleet_dispatch_prompt() in src/autoskillit/cli/_prompts_kitchen.py that instructs the fleet dispatcher to:

  1. Detect when the user requests 2+ issues dispatched (regardless of whether the user says "parallel"; any multi-issue dispatch needs the gate)
  2. Dispatch bem-wrapper first: dispatch_food_truck(recipe="bem-wrapper", task="Build execution map for conflict analysis", ingredients={"issue_urls": "<comma-separated URLs>", "base_branch": "<target branch>"}, capture={"execution_map": "${{ result.execution_map }}"})
  3. Read l3_payload.dispatch_plan from the dispatch_food_truck response — this is a JSON array of {"group": N, "parallel": bool, "issues": "..."} objects
  4. Sequence subsequent dispatches per the group plan:
    • parallel: true groups → dispatch all issues in the group simultaneously
    • parallel: false groups → dispatch sequentially within the group
    • Wait for all dispatches in group N to complete before starting group N+1
  5. Fall back to sequential dispatch (one at a time) if bem-wrapper fails
  6. Single-issue dispatches skip BEM entirely (no pre-step needed)
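
The group sequencing in step 4 can be sketched roughly as follows. This is an illustration only, not the dispatcher's implementation (the dispatcher follows prompt instructions, not this code). The `dispatch` callable is a hypothetical stand-in for one food-truck dispatch, and the sketch assumes the `issues` string is a comma-separated list, which the requirement does not state explicitly.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical stand-in for a single food-truck dispatch; not an AutoSkillit API.
Dispatch = Callable[[str], Awaitable[str]]


def split_issues(issues: str) -> list[str]:
    """Assumption: 'issues' is a comma-separated string of issue references."""
    return [item.strip() for item in issues.split(",") if item.strip()]


async def run_plan(plan: list[dict], dispatch: Dispatch) -> list[str]:
    """Run groups in ascending 'group' order, finishing each before the next."""
    results: list[str] = []
    for group in sorted(plan, key=lambda g: g["group"]):
        issues = split_issues(group["issues"])
        if group["parallel"]:
            # parallel: true -> start every issue in the group together,
            # then wait for all of them before moving to the next group.
            results.extend(await asyncio.gather(*(dispatch(i) for i in issues)))
        else:
            # parallel: false -> one dispatch at a time, in listed order.
            for issue in issues:
                results.append(await dispatch(issue))
    return results
```

The fallback in step 5 corresponds to treating the whole request as a single `parallel: false` group when no plan is available.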

REQ-BEM-002: Update stale serial dispatch guidance

Replace the stale text at _prompts_kitchen.py:121-122:

- Serial execution: dispatch one food truck at a time. fleet_lock enforces this —
do NOT attempt parallel dispatches.

With guidance describing BEM-gated parallel dispatch. The new text should describe: single-issue dispatches proceed directly; multi-issue dispatches require the BEM pre-step; group ordering and merge-wait between groups; max concurrent dispatches is 3. Reference the existing max_parallel cap that BEM already enforces on group sizes.

REQ-BEM-003: Contract test for fleet dispatcher BEM gate

Add a contract test (in tests/contracts/) asserting that the fleet dispatcher prompt returned by _build_fleet_dispatch_prompt() contains multi-issue conflict gate instructions. This follows the pattern of test_sous_chef_routing.py:176, which asserts that the sous-chef prompt contains BEM invocation instructions.

Suggested assertions:

  • Prompt contains "bem-wrapper" reference
  • Prompt contains "dispatch_plan" reference
  • Prompt contains "multi-issue" or "conflict" gate language
  • Prompt does NOT contain the stale "Serial execution: dispatch one food truck at a time" text
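
A minimal sketch of the suggested assertions, written as a helper that takes the prompt as a plain string. In the real contract test the string would come from `_build_fleet_dispatch_prompt()`; its arguments are not shown in this PR, so the call is left out here. The helper name `check_bem_gate_prompt` is hypothetical.

```python
# Stale guidance that REQ-BEM-002 removes from the prompt.
STALE_SERIAL_TEXT = "Serial execution: dispatch one food truck at a time"


def check_bem_gate_prompt(prompt: str) -> list[str]:
    """Return the violated expectations; an empty list means the prompt passes."""
    problems: list[str] = []
    if "bem-wrapper" not in prompt:
        problems.append("missing bem-wrapper reference")
    if "dispatch_plan" not in prompt:
        problems.append("missing dispatch_plan reference")
    lowered = prompt.lower()
    if "multi-issue" not in lowered and "conflict" not in lowered:
        problems.append("missing multi-issue/conflict gate language")
    if STALE_SERIAL_TEXT in prompt:
        problems.append("stale serial-execution text still present")
    return problems
```

A pytest-style test could then reduce to `assert check_bem_gate_prompt(prompt) == []`, which reports every failed expectation at once instead of stopping at the first.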

REQ-BEM-004: Include bem-wrapper in fleet dispatcher recipe table

If _food_truck_section (the recipe table injected into the fleet dispatcher prompt) does not already include bem-wrapper, ensure it appears so the dispatcher knows it's available for dispatch without calling list_recipes first.

REQ-SEM-001: Raise max_concurrent_dispatches default to 3

Update the default from 1 to 3 in both src/autoskillit/config/_config_dataclasses.py (FleetConfig.max_concurrent_dispatches) and src/autoskillit/config/defaults.yaml (fleet.max_concurrent_dispatches). This allows the semaphore to permit up to 3 food trucks running simultaneously, which BEM groups can utilize when parallel: true. Users can still override via .autoskillit/config.yaml or the AUTOSKILLIT_FLEET__MAX_CONCURRENT_DISPATCHES env var. Update FleetConfig.validate() to enforce 1 <= max_concurrent_dispatches <= 3 (hard cap to prevent runaway subprocess spawning).

Note: FleetSemaphore itself needs no changes — it already supports N>1 via asyncio.BoundedSemaphore(max_concurrent).
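
For illustration, the bound check could look roughly like the sketch below. `FleetConfigSketch` is a hypothetical stand-in: the real `FleetConfig` has other fields, and the exception type raised by its `validate()` is an assumption here.

```python
from dataclasses import dataclass

# Hard cap from REQ-SEM-001, intended to prevent runaway subprocess spawning.
MAX_CONCURRENT_HARD_CAP = 3


@dataclass
class FleetConfigSketch:
    """Illustrative stand-in for FleetConfig; only the field under discussion."""

    max_concurrent_dispatches: int = 3

    def validate(self) -> None:
        # Enforce 1 <= max_concurrent_dispatches <= 3.
        if not 1 <= self.max_concurrent_dispatches <= MAX_CONCURRENT_HARD_CAP:
            raise ValueError(
                "max_concurrent_dispatches must be between 1 and "
                f"{MAX_CONCURRENT_HARD_CAP}, got {self.max_concurrent_dispatches}"
            )
```

With this shape, an override of 4 via config or env var fails validation instead of being silently clamped.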

REQ-SEM-002: Add max_total_issues fleet config cap (default 12)

Add a new max_total_issues: int = 12 field to FleetConfig in _config_dataclasses.py and defaults.yaml. This caps the total number of issues a single fleet dispatch session can process (across all food trucks and all groups). The fleet dispatcher prompt should instruct the dispatcher to count total issues before dispatching bem-wrapper and refuse (or warn) if over the cap. User-overridable via config/env var. Validation: 1 <= max_total_issues.

Note: This is distinct from the existing max_parallel (default 6), which caps issues per BEM group, not total issues across the session. max_parallel is a BEM skill argument; max_total_issues is a fleet-level config field. Both are needed: max_parallel controls group sizing, max_total_issues controls session-level scope.

Rationale for 12: with max_parallel=6 (issues per group) and max_concurrent_dispatches=3, 12 issues is ~2 fully-loaded parallel groups — the upper bound of what the system should handle in a single dispatch session.
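
The count-before-dispatch step can be sketched as a small pre-check. The function name and the three return labels are hypothetical; the requirement allows either refusing or warning when the cap is exceeded, and this sketch picks refusal.

```python
def precheck_issue_count(issue_urls: list[str], max_total_issues: int = 12) -> str:
    """Decide how to proceed before any dispatch is made.

    Returns "refuse" when the session is over the cap, "direct" for a
    single issue (no BEM pre-step), and "bem-gate" for 2+ issues.
    """
    count = len(issue_urls)
    if count == 0:
        raise ValueError("no issues to dispatch")
    if count > max_total_issues:
        return "refuse"
    if count == 1:
        return "direct"
    return "bem-gate"
```

The default of 12 matches the rationale above: two groups at `max_parallel=6` give 2 × 6 = 12 issues.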

REQ-SEM-003: Fix inaccurate "calls queue" campaign prompt claim

Replace the inaccurate text at _prompts_campaign.py:53:

The fleet semaphore gates actual concurrency; calls queue when the semaphore is saturated.

With accurate language reflecting the actual at_capacity() → FLEET_PARALLEL_REFUSED fast-fail behavior. The dispatcher must wait for a running dispatch to complete before retrying, not assume queuing. Also update _prompts_campaign.py:231, which says "max_concurrent=1 for static dispatches"; this will be inaccurate once REQ-SEM-001 raises the default.
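
The difference between fast-fail and queuing can be shown with a small counter-based gate. This is a sketch of the behavior described above, not the `FleetSemaphore` implementation; the class `FastFailGate` and the function `dispatch_or_refuse` are hypothetical names.

```python
class FastFailGate:
    """Counter-based gate that refuses immediately instead of queuing callers."""

    def __init__(self, max_concurrent: int) -> None:
        self._max = max_concurrent
        self._active = 0

    def at_capacity(self) -> bool:
        return self._active >= self._max

    def try_acquire(self) -> bool:
        # No waiting: a saturated gate returns False right away.
        if self.at_capacity():
            return False
        self._active += 1
        return True

    def release(self) -> None:
        if self._active == 0:
            raise RuntimeError("release() called with no active dispatch")
        self._active -= 1


def dispatch_or_refuse(gate: FastFailGate, name: str) -> str:
    """Start a dispatch, or report refusal so the caller can retry later."""
    if not gate.try_acquire():
        return "FLEET_PARALLEL_REFUSED"
    return f"started:{name}"
```

With a limit of 3, a fourth call is refused until one running dispatch releases its slot, which is the behavior the corrected prompt text needs to describe.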

REQ-SEM-004: Update tests for new defaults

  • Update any tests that hardcode max_concurrent_dispatches=1 as the expected default (search for max_concurrent_dispatches and max_concurrent=1 in tests/)
  • Add validation test for the new max_total_issues field and the max_concurrent_dispatches <= 3 upper bound
  • Update test_fleet_e2e.py tests that assert FLEET_PARALLEL_REFUSED at max=1 to account for the new default of 3
  • Add contract test asserting the campaign prompt does NOT contain "calls queue" language

Conflict Resolution Decisions

The following files had merge conflicts that were automatically resolved.

(None provided)

Closes #2182

Implementation Plan

Plan file: /home/talon/projects/autoskillit-runs/impl-20260507-160445-669401/.autoskillit/temp/make-plan/fleet_dispatcher_bem_gate_plan_2026-05-07_160822.md

🤖 Generated with Claude Code via AutoSkillit

Token Usage Summary

| Step | Model | count | uncached | output | cache_read | peak_ctx | turns | cache_write | time |
|------|-------|-------|----------|--------|------------|----------|-------|-------------|------|
| plan | claude-sonnet-4-6 | 1 | 123 | 13.2k | 495.3k | 52.1k | 89 | 46.7k | 5m 41s |
| verify | claude-sonnet-4-6 | 1 | 236 | 14.8k | 1.5M | 68.7k | 83 | 56.0k | 4m 8s |
| implement* | MiniMax-M2.7-highspeed | 1 | 1.2M | 15.0k | 1.3M | 74.0k | 121 | 119.9k | 5m 13s |
| prepare_pr* | MiniMax-M2.7-highspeed | 1 | 87.9k | 4.9k | 198.4k | 28.8k | 22 | 15.2k | 1m 27s |
| compose_pr* | MiniMax-M2.7-highspeed | 1 | 46.4k | 2.9k | 169.6k | 28.8k | 14 | 15.1k | 55s |
| review_pr | claude-sonnet-4-6 | 1 | 124 | 36.3k | 755.8k | 79.4k | 45 | 67.4k | 7m 7s |
| resolve_review | claude-sonnet-4-6 | 1 | 237 | 14.4k | 1.4M | 65.2k | 70 | 52.9k | 7m 44s |
| Total | | | 1.3M | 101.5k | 5.9M | 79.4k | | 373.2k | 32m 17s |

* Step used a non-Anthropic provider; caching behavior may differ.

Token Efficiency

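| Step | LoC Changed | cache_read/LoC | cache_write/LoC | output/LoC |
|------|-------------|----------------|-----------------|------------|
| plan | 0 | | | |
| verify | 0 | | | |
| implement | 223 | 5925.8 | 537.8 | 67.3 |
| prepare_pr | 0 | | | |
| compose_pr | 0 | | | |
| review_pr | 0 | | | |
| resolve_review | 16 | 89758.8 | 3304.4 | 901.1 |
| Total | 239 | 24494.1 | 1561.4 | 424.8 |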

Model Usage Breakdown

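| Model | steps | uncached | output | cache_read | cache_write | time |
|-------|-------|----------|--------|------------|-------------|------|
| claude-sonnet-4-6 | 3 | 605 | 40.2k | 3.2M | 196.0k | 21m 58s |
| MiniMax-M2.7-highspeed | 3 | 1.3M | 22.8k | 1.7M | 150.2k | 7m 35s |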

Trecek and others added 4 commits May 7, 2026 16:23
- Raise max_concurrent_dispatches default from 1 to 3 in FleetConfig
  and defaults.yaml
- Add max_total_issues: 12 cap to FleetConfig with validation
- Replace stale serial-only guidance in fleet dispatcher prompt with
  BEM pre-step gate instructions (bem-wrapper dispatch, dispatch_plan
  reading, parallel group sequencing)
- Fix inaccurate semaphore language in campaign prompt ("calls queue"
  → fast-fail wait/retry, "max_concurrent=1" → sequential static)
- Add contract tests for BEM gate prompt content
- Add contract test for campaign prompt accuracy (no "calls queue")
- Add test_fourth_concurrent_dispatch_refused_with_max3 E2E test
- Rename existing config test to reflect new default of 3

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…rom_dynaconf, update contracts CLAUDE.md

- Re-add fleet_lock mention to campaign discipline section (test_fleet_lock_mentioned)
- Add max_total_issues val() line to FleetConfig construction in from_dynaconf (REQ-CONFIG-001)
- Add test_fleet_dispatch_bem_gate.py and test_campaign_prompt_accuracy.py to tests/contracts/CLAUDE.md
…ic literal 3

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…weak or-12 match, pin match string

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@Trecek Trecek added this pull request to the merge queue May 8, 2026
Merged via the queue into develop with commit 39d7115 May 8, 2026
2 checks passed
@Trecek Trecek deleted the fleet-dispatcher-add-bem-pre-step-gate-for-multi-issue-paral/2182 branch May 8, 2026 01:12
Trecek added a commit that referenced this pull request May 8, 2026
…spatch (#2207)


## Token Usage Summary

| Step | Model | count | uncached | output | cache_read | peak_ctx | turns | cache_write | time |
|------|-------|-------|----------|--------|------------|----------|-------|-------------|------|
| plan | claude-sonnet-4-6 | 1 | 123 | 13.2k | 495.3k | 52.1k | 89 | 46.7k | 5m 41s |
| verify | claude-sonnet-4-6 | 1 | 236 | 14.8k | 1.5M | 68.7k | 83 | 56.0k | 4m 8s |
| implement* | MiniMax-M2.7-highspeed | 1 | 1.2M | 15.0k | 1.3M | 74.0k | 121 | 119.9k | 5m 13s |
| fix | claude-sonnet-4-6 | 2 | 246 | 12.2k | 1.2M | 63.9k | 76 | 93.3k | 12m 8s |
| prepare_pr* | MiniMax-M2.7-highspeed | 1 | 87.9k | 4.9k | 198.4k | 28.8k | 22 | 15.2k | 1m 27s |
| compose_pr* | MiniMax-M2.7-highspeed | 1 | 46.4k | 2.9k | 169.6k | 28.8k | 14 | 15.1k | 55s |
| **Total** | | | 1.3M | 63.1k | 4.9M | 74.0k | | 346.3k | 29m 34s |

\* *Step used a non-Anthropic provider; caching behavior may differ.*

## Token Efficiency

| Step | LoC Changed | cache_read/LoC | cache_write/LoC | output/LoC |
|------|-------------|----------------|-----------------|------------|
| plan | 0 | — | — | — |
| verify | 0 | — | — | — |
| implement | 223 | 5925.8 | 537.8 | 67.3 |
| fix | 7 | 174400.0 | 13330.4 | 1745.6 |
| prepare_pr | 0 | — | — | — |
| compose_pr | 0 | — | — | — |
| **Total** | **230** | 21230.1 | 1505.5 | 274.1 |

## Model Usage Breakdown

| Model | steps | uncached | output | cache_read | cache_write | time |
|-------|-------|----------|--------|------------|-------------|------|
| claude-sonnet-4-6 | 3 | 605 | 40.2k | 3.2M | 196.0k | 21m 58s |
| MiniMax-M2.7-highspeed | 3 | 1.3M | 22.8k | 1.7M | 150.2k | 7m 35s |

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>