
feat(§29): implement parallel step execution with std::thread::scope #154

Merged: AlexChesser merged 1 commit into main from claude/parallel-execution-plan-p5FXZ on Apr 15, 2026


@AlexChesser
Owner

Completes the §29 implementation on top of the Phase 1-3+6 foundations already on main (#151). This adds the actual parallel dispatch so async: true steps run concurrently instead of being treated as sequential.

Closes #117.

Summary

  • async: true steps launch on scoped threads and run concurrently with subsequent sequential steps, with forked Session state (cloned turn log snapshot, isolated http_session_store when resume: false).
  • action: join synchronizes branches and merges outputs — string join (default, [step_id]:\n<response> headers) or structured join (namespaced JSON when all deps declare output_schema, optionally validated against the join's own schema).
  • on_error: fail_fast (default) surfaces the first branch failure; on_error: wait_for_all collects error envelopes into the merged JSON for post-hoc inspection.
  • defaults.max_concurrency enforced via a Mutex<usize> + Condvar semaphore — no new dependency, no async runtime.
  • Turn log entries tagged with concurrent_group, launched_at, completed_at ISO 8601 timestamps (§29.8).
  • Dotted-path template resolution: {{ step.<join>.<dep>.<field> }} walks JSON paths through structured join responses (§29.6).
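
For readers unfamiliar with the string-join format, here is a minimal sketch of how the `[step_id]:\n<response>` headers could be assembled in declaration order. The `BranchOutput` struct, the `string_join` function, and the blank-line separator between blocks are assumptions made for illustration; they are not taken from `merge_join_results`.

```rust
/// Hypothetical branch output: the step id plus its response text.
struct BranchOutput {
    step_id: String,
    response: String,
}

/// Concatenate branch responses in the order given (declaration order),
/// prefixing each with a `[step_id]:` header line.
/// Assumption: blocks are separated by one blank line.
fn string_join(branches: &[BranchOutput]) -> String {
    branches
        .iter()
        .map(|b| format!("[{}]:\n{}", b.step_id, b.response))
        .collect::<Vec<_>>()
        .join("\n\n")
}
```

Because the merge iterates the slice in order, the output does not depend on which branch finished first, which is what the string-join declaration-order test checks.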

Implementation notes

  • New module ail-core/src/executor/parallel.rs: ConcurrencySemaphore, BranchResult, merge_join_results, civil-calendar timestamp helpers (no chrono dependency).
  • execute_core branches to a new execute_core_with_parallel path when any step is async. The parallel variant wraps the main loop in std::thread::scope so async launches coexist with sequential steps executed on the main thread.
  • Runner signatures updated to &(dyn Runner + Sync) on the parallel path; all concrete runners were already Sync by composition. RunnerFactory now returns Box<dyn Runner + Send + Sync>.
  • TurnEntry gains Clone for session forking.
  • Session::fork_for_branch(isolated_http: bool) clones entries into a NullProvider-backed TurnLog and mints a fresh http_session_store when resume: false (SPEC §29.9 opt-out of context inheritance).
  • execute_single_step no longer handles action: join directly — joins are fully coordinated in execute_core_with_parallel.
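
The shape of the scoped-thread loop can be sketched as follows. This is a simplified illustration, not the code in `execute_core_with_parallel`: the `Step` struct, `run_step`, and `execute` are hypothetical stand-ins, and the sketch joins all branches at the end of the loop rather than at an explicit `action: join` step.

```rust
use std::thread;

/// Hypothetical step description; `is_async` mirrors `async: true`.
struct Step {
    id: &'static str,
    is_async: bool,
}

/// Stand-in for invoking a runner on a step.
fn run_step(step: &Step) -> String {
    format!("{} done", step.id)
}

/// Walk the steps in order. Async steps are spawned on scoped threads;
/// sequential steps run on the calling thread while branches are in flight.
/// All branches are joined before the scope returns.
fn execute(steps: &[Step]) -> Vec<(String, String)> {
    let mut results = Vec::new();
    thread::scope(|scope| {
        let mut handles = Vec::new();
        for step in steps {
            if step.is_async {
                // The spawned closure may borrow `step` because the scope
                // guarantees the thread finishes before `steps` goes away.
                handles.push((step.id, scope.spawn(move || run_step(step))));
            } else {
                results.push((step.id.to_string(), run_step(step)));
            }
        }
        // Join in launch order so the merged output is deterministic.
        for (id, handle) in handles {
            let response = handle.join().expect("branch panicked");
            results.push((id.to_string(), response));
        }
    });
    results
}
```

`std::thread::scope` lets branches borrow from the enclosing stack frame without `Arc` or `'static` bounds, which is why shared references such as `&(dyn Runner + Sync)` are enough on this path.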

Tests

ail-core/tests/spec/s29_parallel.rs — 23 new tests:

  • Parse-time validation (9): orphaned async, join without depends_on, forward references, cycles, concurrent resume: true conflict, mixed structured/unstructured deps, action: join recognition, on_error mode parsing, max_concurrency parsing.
  • Runtime execution (9): two-async-plus-join end-to-end, runner invocation count, string-join declaration order, sequential step sees join result, condition: never on async unblocks join, on_result on join fires abort/continue, regression for non-async pipelines, max_concurrency: 1 serialization, shared concurrent_group metadata.
  • Fixture round-trips (5): parallel_basic, parallel_structured, and three invalid-case YAML fixtures.

Full suite: 504 passed, 0 failed (previous 481 + 23 new §29 tests).

Quality gates

  • cargo build
  • cargo clippy -- -D warnings
  • cargo fmt --check
  • cargo test ✅ (504 passed)

Docs

  • spec/core/s29-parallel-execution.md status → implemented (v0.3).
  • spec/README.md entry updated.
  • CLAUDE.md and ail-core/CLAUDE.md updated with new template variables, executor/parallel.rs module responsibility, Known Constraints entry for §29, and updated Step / ActionKind / JoinErrorMode key types.
  • CHANGELOG.md v0.3 in-progress entry.

Deferred (spec-authorized)

  • Mid-flight runner-level cancellation for fail_fast — branches complete on their own; first error still propagates. SPEC §29.7 explicitly declares cancel signals "best effort".
  • Controlled-mode executor events for async launches.
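
Since branches currently run to completion in both modes, the difference between `fail_fast` and `wait_for_all` shows up only at merge time. A rough sketch of that decision, under the assumption that "first" means first in declaration order; `BranchOutcome`, `Slot`, and `resolve_join` are hypothetical names, and the real error envelope is JSON rather than the enum used here:

```rust
/// Mirrors the two `on_error` modes on the join step.
#[derive(Clone, Copy)]
enum JoinErrorMode {
    FailFast,
    WaitForAll,
}

/// Hypothetical per-branch outcome: step id plus success or error text.
type BranchOutcome = (String, Result<String, String>);

/// Hypothetical merged slot; stands in for a JSON value or error envelope.
#[derive(Debug, PartialEq)]
enum Slot {
    Response(String),
    Error(String),
}

/// Merge completed branch outcomes according to the join's error mode.
fn resolve_join(
    mode: JoinErrorMode,
    outcomes: Vec<BranchOutcome>,
) -> Result<Vec<(String, Slot)>, String> {
    let mut merged = Vec::with_capacity(outcomes.len());
    for (step_id, outcome) in outcomes {
        match (mode, outcome) {
            (_, Ok(response)) => merged.push((step_id, Slot::Response(response))),
            // fail_fast: surface the first failure and discard the rest.
            (JoinErrorMode::FailFast, Err(e)) => return Err(format!("{step_id}: {e}")),
            // wait_for_all: keep the failure alongside the successes.
            (JoinErrorMode::WaitForAll, Err(e)) => merged.push((step_id, Slot::Error(e))),
        }
    }
    Ok(merged)
}
```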

Test plan

  • Existing test suite still passes (504 tests).
  • New §29 tests exercise each validation rule and runtime path.
  • Sequential-only pipelines take the fast path (no scoped-thread overhead).
  • RecordingStubRunner confirms both async branches actually invoke the runner.
  • Structured-join response is valid JSON with namespaced keys.
  • concurrent_group metadata is present on branch entries and shared across siblings.

https://claude.ai/code/session_01RYo2Rp8t2RkV5R8rfJrd3W

Completes the §29 implementation on top of the Phase 1-3+6 foundations
already on main (PR #151). This adds the actual parallel dispatch so
async steps run concurrently instead of being treated as sequential.

New capabilities:
- `async: true` steps launch on scoped threads and run concurrently with
  subsequent sequential steps. Forked Session with cloned turn log and
  isolated http_session_store (when resume: false).
- `action: join` synchronizes branches, merges results. Two modes:
  - String join (default): labelled `[step_id]:\n<response>` concatenation
  - Structured join: JSON merge when all deps declare output_schema. Dep
    responses are parsed as JSON and namespaced by step id; the output is
    validated against the join's own schema if one is declared.
- `on_error: fail_fast` (default) / `wait_for_all` on the join step.
  fail_fast surfaces the first branch error; wait_for_all collects error
  envelopes into the merged JSON.
- `defaults.max_concurrency` enforced via Mutex+Condvar semaphore — no
  external dependency, no async runtime.
- Turn log entries tagged with `concurrent_group`, `launched_at`,
  `completed_at` ISO 8601 timestamps (§29.8).
- Dotted-path template resolution: `{{ step.<join>.<dep>.<field> }}`
  walks JSON paths through structured join responses (§29.6).
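
A counting semaphore built from `Mutex<usize>` and `Condvar` can be sketched as below. This is an illustrative reconstruction, not the `ConcurrencySemaphore` in `executor/parallel.rs`: the names `SketchSemaphore`, `Permit`, and `peak_concurrency`, and the RAII-guard design, are assumptions.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical counting semaphore: a permit count guarded by a mutex,
/// plus a condvar to wake waiters when a permit is returned.
pub struct SketchSemaphore {
    permits: Mutex<usize>,
    available: Condvar,
}

/// RAII guard: the permit is returned when the guard is dropped.
pub struct Permit<'a> {
    sem: &'a SketchSemaphore,
}

impl SketchSemaphore {
    pub fn new(max: usize) -> Self {
        Self { permits: Mutex::new(max), available: Condvar::new() }
    }

    /// Block until a permit is free, then take it.
    pub fn acquire(&self) -> Permit<'_> {
        let mut permits = self.permits.lock().unwrap();
        // Loop guards against spurious wakeups.
        while *permits == 0 {
            permits = self.available.wait(permits).unwrap();
        }
        *permits -= 1;
        Permit { sem: self }
    }
}

impl Drop for Permit<'_> {
    fn drop(&mut self) {
        *self.sem.permits.lock().unwrap() += 1;
        self.sem.available.notify_one();
    }
}

/// Run `workers` scoped threads behind the semaphore and report the
/// highest number that held a permit at the same time.
pub fn peak_concurrency(max: usize, workers: usize) -> usize {
    let sem = SketchSemaphore::new(max);
    let active = AtomicUsize::new(0);
    let peak = AtomicUsize::new(0);
    thread::scope(|s| {
        for _ in 0..workers {
            s.spawn(|| {
                let _permit = sem.acquire();
                let now = active.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                thread::sleep(Duration::from_millis(5));
                active.fetch_sub(1, Ordering::SeqCst);
            });
        }
    });
    peak.load(Ordering::SeqCst)
}
```

With `max` set to 1 the branches serialize, which is the behaviour the `max_concurrency: 1` test looks for.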

Implementation details:
- New module `executor/parallel.rs` with ConcurrencySemaphore, BranchResult,
  merge_join_results, and timestamp helpers (civil-from-days, no chrono).
- `execute_core` now branches to `execute_core_with_parallel` when any
  step declares `async`. The scoped-thread variant wraps the main step
  loop in std::thread::scope so async launches coexist with sequential
  step execution on the main thread.
- Runner signatures updated to `&(dyn Runner + Sync)` on the parallel
  path. All concrete runners are already Sync; RunnerFactory now returns
  `Box<dyn Runner + Send + Sync>`.
- TurnEntry gains `Clone` derive for session forking.
- Session gains `fork_for_branch(isolated_http: bool)` — clones entries
  into a NullProvider-backed TurnLog; mints a fresh http_session_store
  when resume:false (SPEC §29.9 opt-out of context inheritance).
- `execute_single_step` no longer mishandles `action: join` — joins are
  now fully coordinated in `execute_core_with_parallel`.
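
For context on the "civil-from-days" helpers: the usual way to format an ISO 8601 timestamp without chrono is to convert a Unix day count to a calendar date arithmetically. The sketch below uses the widely known days-to-civil algorithm; the function names, the second-level precision, and the trailing `Z` are assumptions, not the exact output format of the helpers in this PR.

```rust
/// Convert days since 1970-01-01 to (year, month, day) in the
/// proleptic Gregorian calendar.
fn civil_from_days(days: i64) -> (i64, u32, u32) {
    let z = days + 719_468; // shift epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era, [0, 146096]
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // [0, 399]
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of March-based year
    let mp = (5 * doy + 2) / 153; // March-based month, [0, 11]
    let day = (doy - (153 * mp + 2) / 5 + 1) as u32;
    let month = (if mp < 10 { mp + 3 } else { mp - 9 }) as u32;
    // January and February belong to the next civil year.
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month, day)
}

/// Format Unix seconds as an ISO 8601 UTC timestamp.
fn iso8601_utc(unix_secs: i64) -> String {
    let days = unix_secs.div_euclid(86_400);
    let secs = unix_secs.rem_euclid(86_400);
    let (y, m, d) = civil_from_days(days);
    format!(
        "{:04}-{:02}-{:02}T{:02}:{:02}:{:02}Z",
        y,
        m,
        d,
        secs / 3_600,
        (secs % 3_600) / 60,
        secs % 60
    )
}
```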

Tests:
- `ail-core/tests/spec/s29_parallel.rs` — 23 tests covering parse-time
  validation (orphan detection, forward refs, cycles, concurrent resume
  conflict, structured-join compatibility, join without depends_on,
  max_concurrency), runtime execution (two-async-plus-join end-to-end,
  branch invocation count, string join ordering, sequential step after
  async sees join result, condition:never on async unblocks join,
  on_result on join step fires abort/continue, regression check for
  non-async pipelines, max_concurrency serialization, shared
  concurrent_group across branches), and fixture round-trips.
- Full test suite: 504 passed (previous 481 + 23 new), 0 failed.

Docs:
- `spec/core/s29-parallel-execution.md` status → implemented.
- `spec/README.md` entry updated.
- `CLAUDE.md` and `ail-core/CLAUDE.md` updated with new template vars,
  module responsibilities, and Known Constraints entry.
- `CHANGELOG.md` v0.3 in-progress entry.

Deferred (spec-authorized):
- Mid-flight runner-level cancellation for fail_fast — branches complete
  on their own; first error still propagates (SPEC §29.7 "best effort").
- Controlled-mode executor events for async launches.

https://claude.ai/code/session_01RYo2Rp8t2RkV5R8rfJrd3W
@AlexChesser AlexChesser merged commit b87bc08 into main Apr 15, 2026
6 checks passed
AlexChesser pushed a commit that referenced this pull request Apr 15, 2026
…l site

The cherry-pick missed the second `evaluate_on_result` call site, which
was added by PR #154 (parallel execution) after this branch originally
forked. The §29 join-step code path needs the same signature update
the sequential dispatch path already got: pass `&Session` + `&step_id`,
route `Err` through the parallel outcome cell instead of `?`.

The borrow shape differs slightly — the parallel path can't just
re-borrow the turn_log entry while also passing `&session`, because the
enclosing closure captures session by mutable reference. Clone the
last entry up front to release the immutable borrow before calling the
evaluator.
AlexChesser added a commit that referenced this pull request Apr 15, 2026
#157)

* feat(#130): implement on_result `expression:` matcher + `matches` regex operator

Implements the spec committed earlier on this branch:
- §5.4 `expression:` matcher — arbitrary §12.2 condition against any
  template variable accessible in the turn log.
- §5.4 `matches: /PAT/FLAGS` named matcher — shorthand for
  `expression: '{{ step.<id>.response }} matches /.../flags'`.
- §12.2 `matches` operator — regex comparison, shared with `condition:`.
- §12.3 regex syntax — single source of truth for `/PAT/FLAGS` form;
  flags i/m/s accepted, g rejected at parse time, other Perl flags
  rejected with a specific error that points to inline `(?x)` for
  verbose mode.
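
The flag rules above can be illustrated with a small std-only sketch that splits a `/PAT/FLAGS` literal and validates the flags. It does not compile the pattern (the real `parse_regex_literal` does that with the `regex` crate), and `RegexLiteral`, `split_regex_literal`, and the error wording are hypothetical.

```rust
/// Hypothetical parsed form of a `/PAT/FLAGS` literal.
#[derive(Debug, PartialEq)]
struct RegexLiteral {
    pattern: String,
    flags: String,
}

/// Split `/PAT/FLAGS` and validate the flags: i, m, s are accepted;
/// g and other Perl-style flags are rejected.
fn split_regex_literal(src: &str) -> Result<RegexLiteral, String> {
    let body = src
        .strip_prefix('/')
        .ok_or_else(|| format!("regex literal must start with '/': {src}"))?;
    // The last '/' separates the pattern from the flags, so the pattern
    // itself may contain slashes.
    let end = body
        .rfind('/')
        .ok_or_else(|| format!("regex literal is missing its closing '/': {src}"))?;
    let (pattern, flags) = (&body[..end], &body[end + 1..]);
    for flag in flags.chars() {
        match flag {
            'i' | 'm' | 's' => {}
            'g' => return Err("flag 'g' is not supported".to_string()),
            other => {
                return Err(format!(
                    "flag '{other}' is not supported; for verbose mode use inline (?x)"
                ))
            }
        }
    }
    Ok(RegexLiteral { pattern: pattern.to_string(), flags: flags.to_string() })
}
```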

Design notes:
- Regex is compiled at parse time (in `parse_regex_literal`) so
  malformed patterns fail pipeline load, not match time. Source literal
  preserved alongside the compiled `regex::Regex` for diagnostics and
  materialize output.
- `Condition` gains a `Regex(RegexCondition)` variant. `PartialEq` is
  dropped from `Condition` (regex::Regex has no PartialEq); the one
  existing `assert_eq!` on `Option<Condition>` was rewritten as a
  `matches!` pattern match. No production code compared Condition
  values for equality.
- `ResultMatcher::Expression { source, condition }` reuses the
  condition evaluator for both comparison and regex forms, so the two
  grammars cannot drift apart.
- `evaluate_on_result()` now takes `&Session` and returns
  `Result<Option<ResultAction>, AilError>`. Unresolvable template
  variables in an `expression:` LHS abort the pipeline via
  CONDITION_INVALID — same contract as `condition:` (SPEC §11).
- Named `matches:` is desugared at parse time into the expression form,
  so the runtime has exactly one regex evaluation path.
- Materialize round-trips `expression:` using the preserved source.
- do_while's `exit_when` deliberately NOT extended — it stays
  ConditionExpr-only for now. Regex in loop exits is out of scope for
  this change; can be added later by widening exit_when to `Condition`.

Testing: 911 tests pass (419 lib + 492 integration). New coverage:
- regex_literal: 20 unit tests (parsing, flags, error cases)
- condition.rs: 4 new tests for Condition::Regex evaluation
- s12_step_conditions: 6 new integration tests (matches operator,
  case sensitivity, parser path, invalid regex, g-flag rejection)
- s05_3_on_result: 5 new integration tests (expression: matcher on
  stderr, named matches: shorthand, expression with matches op,
  unresolvable template, parse-time matcher count enforcement)

* fix(#130): propagate expression: Result through §29 parallel-join call site

The cherry-pick missed the second `evaluate_on_result` call site, which
was added by PR #154 (parallel execution) after this branch originally
forked. The §29 join-step code path needs the same signature update
the sequential dispatch path already got: pass `&Session` + `&step_id`,
route `Err` through the parallel outcome cell instead of `?`.

The borrow shape differs slightly — the parallel path can't just
re-borrow the turn_log entry while also passing `&session`, because the
enclosing closure captures session by mutable reference. Clone the
last entry up front to release the immutable borrow before calling the
evaluator.

---------

Co-authored-by: Claude <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

Design & implement parallel step execution (§21)
