feat(modules/cargo): cargo/build + cargo/test — structured Rust toolchain wrappers by joelteply · Pull Request #1502 · CambrianTech/continuum

joelteply · 2026-05-31T01:45:03Z

Summary

Closes Priority 2 from PERSONA-AS-DEVELOPER-GAP.md: Rust iteration parity with TypeScript. Personas can now build + test their own scaffolded modules and get the same structured feedback density a human gets from npm run build:ts / cargo test.

What this adds

New stateless cargo ServiceModule at src/workers/continuum-core/src/modules/cargo/:

Command	Signature	Returns
`cargo/build`	`{package?, features?, release?, workingDir?, timeoutMs?}`	`{success, errors: CargoMessage[], warnings: CargoMessage[], exitCode?, durationMs, error?}`
`cargo/test`	`{package?, filter?, features?, libOnly?, release?, workingDir?, timeoutMs?}`	`{success, passed, failed, ignored, measured, failures: string[], buildErrors: CargoMessage[], exitCode?, durationMs, error?}`

Plus 6 ts-rs wire types: CargoBuildParams, CargoBuildResult, CargoTestParams, CargoTestResult, CargoMessage, CargoSpan.

Doctrine followed (per field manual)

§3 Module Design Template — typed envelope at entry/exit, ts-rs annotations, camelCase serde, optional fields elided
§4.1 Concurrency — stateless module (cargo handles its own target-dir locking; module-level lock unnecessary)
§4.2 Concurrency tests — multi-thread tokio (worker_threads = 4) fires 8 parallel real-cargo subprocess invocations; every result consistent
Three primitives — both commands are pure Commands. Streaming variants land when Stream cell shape ships (gap report priority 4)
Rethink-not-port — designed Rust-first; no TS predecessor

Sharp design decisions (the kinks the tests caught pre-merge)

Bug	What broke	Fix
`parse_summary_counts` got 0 every time	Positional indices 0,1 broke on `"ok. 22 passed"` (token 0 is "ok.")	Scan WITHIN each chunk for first `<int> <label>` pair
Failures-block exit condition	Exited on lines containing `:` — but test names ARE `module::test`	Enter on `failures:`, capture single-token lines containing `::`, exit on next `test result:`
Repeated `failures:` blocks	libtest emits TWO per failing binary (with stdout decorators + without)	Capture from both, dedupe by first-seen order, skip `---- ... ----` markers
Subprocess pipe deadlock potential	If child fills stdout buffer waiting for us	Concurrent tokio tasks for stdout/stderr read alongside `wait()`
Runaway timeout	Persona could hold substrate forever	Hard-cap: build 15min, test 30min (configurable lower)

Composability with the grid (the alignment payoff)

Per the gap report's "later parts of the vision" section: both result envelopes are flat camelCase JSON, trivially serializable across airc's grid. A persona on Joel's M-series Mac can call cargo/test against a module a persona on a peer's RTX 5090 just authored — result envelope routes back on the same Commands/Events bus.

Once events/command-completed (gap report priority 3) lands, build/test attribution becomes observable in real time — closing the loop from "I built this" to "the grid knows I built this." See [[alignment-via-substrate-economics]].

Test plan (29/29 pass)

parse_build_messages (5):
  parse_build_extracts_errors_with_codes_and_spans ... ok
  parse_build_separates_warnings_from_errors ... ok
  parse_build_ignores_non_diagnostic_reasons ... ok
  parse_build_tolerates_non_json_lines ... ok
  parse_build_handles_diagnostic_without_primary_span ... ok

parse_test_output (5):
  parse_test_extracts_passing_counts_from_summary ... ok
  parse_test_captures_failure_names_in_order ... ok
  parse_test_aggregates_across_multiple_test_binaries ... ok
  parse_test_dedupes_failures_across_repeated_blocks ... ok
  parse_test_empty_output_returns_zero_counts_not_error ... ok

parse_summary_counts (2):
  summary_counts_handles_filtered_out_field ... ok
  summary_counts_handles_failed_verdict ... ok  ← caught the verdict-prefix bug

timeout (2):
  timeout_uses_default_when_none_provided ... ok
  timeout_clamps_to_max_when_request_exceeds_it ... ok

types (5): round-trip, defaults, optional-omission, lib_only, failure-order

dispatch (2):
  config_advertises_cargo_prefix ... ok
  handle_command_rejects_unknown_command_loud ... ok

end-to-end (1):
  end_to_end_subprocess_pipeline_works ... ok  ← real `cargo --version`

concurrency stress (1):
  concurrent_cargo_invocations_dont_corrupt_subprocess_pipeline ... ok
  ← 8 parallel real cargo invocations, multi-thread tokio

ts-rs exports (6): all 6 binding files generated

What this PR does NOT do

Does NOT add TS wrapper commands. Rust ServiceModule + IPC bridge is canonical per rust-is-the-core-node-is-the-shell
Does NOT stream output line-by-line. Returns single envelope at end. Streaming = gap report priority 4 (needs Stream cell shape)
Does NOT manage per-persona workspaces. Optional working_dir (default cwd); per-persona workspace isolation is orthogonal
Does NOT depend on libtest's unstable JSON output. Parses stable human-readable format. Can upgrade when JSON stabilizes
Does NOT scaffold via generate/module invocation. Hand-authored matching v2 template shape exactly. Future PR can swap in literal generator invocation

Stacks on

canary directly. Independent of any open chain.

References

docs/planning/PERSONA-AS-DEVELOPER-GAP.md — Priority 2 (Priority 1 was feat(modules/code): code/exists + code/list + code/glob — filesystem introspection cluster #1501)
docs/architecture/COMMAND-INFRASTRUCTURE-FIELD-MANUAL.md §3 + §4
docs/architecture/MODULE-CATALOG.md §0 — new cargo row to add when this lands
Memories: three-primitives-commands-events-persona, alignment-via-substrate-economics, continuum-thesis-airc-is-the-medium

Next up

Priority 3 from the gap report: events/command-completed — restores RTOS-brain doctrine by emitting completion events on the bus instead of forcing blocking polls. Largest scope of the three; touches dispatch hot path. Separate PR after this lands.

…hain wrappers Closes Priority 2 from [PERSONA-AS-DEVELOPER-GAP.md](docs/planning/PERSONA-AS-DEVELOPER-GAP.md): Rust iteration parity with TypeScript. Personas can now build + test their own scaffolded modules and get the same structured feedback density Joel gets from `npm run build:ts` / `cargo test`. # What this PR adds New stateless `cargo` ServiceModule (`src/workers/continuum-core/src/modules/cargo/`): | Command | Signature | Returns | |---|---|---| | `cargo/build` | `{package?, features?, release?, working_dir?, timeout_ms?}` | `{success, errors: CargoMessage[], warnings: CargoMessage[], exit_code?, duration_ms, error?}` | | `cargo/test` | `{package?, filter?, features?, lib_only?, release?, working_dir?, timeout_ms?}` | `{success, passed, failed, ignored, measured, failures: string[], build_errors: CargoMessage[], exit_code?, duration_ms, error?}` | Plus 6 ts-rs-exported wire types: `CargoBuildParams`, `CargoBuildResult`, `CargoTestParams`, `CargoTestResult`, `CargoMessage`, `CargoSpan`. # Doctrine followed (per [field manual](docs/architecture/COMMAND-INFRASTRUCTURE-FIELD-MANUAL.md)) - **Module Design Template §3** — typed `Params/Result` shapes with `#[derive(TS)]`, camelCase serde, optional fields with `#[serde(skip_serializing_if = "Option::is_none")]` + `#[ts(optional)]` - **Concurrency doctrine §4.1** — module is stateless; cargo manages its own target-dir locking (concurrent invocations on the same target dir serialize at cargo's level; different target dirs stay parallel). When correctness lives BELOW the module, the module-level lock is unnecessary. - **Concurrency doctrine §4.2** — multi-thread tokio stress test (`flavor = "multi_thread", worker_threads = 4`) fires 8 parallel real-cargo subprocess invocations through `run_with_timeout` and asserts every result is internally consistent (no plumbing corruption under concurrent spawn/wait). - **Three primitives** — both commands are pure **Commands** (request/response). When the Stream cell shape lands (gap report priority 4), `cargo/build/stream` and `cargo/test/stream` can follow as line-by-line variants. - **Rethink-not-port** — designed Rust-first; no TS predecessor. # Sharp design decisions (the kinks the tests caught pre-merge) 1. **`parse_summary_counts` had to scan within each chunk** for the first `<int> <label>` pair, not require positional indices 0 and 1. libtest's summary line includes a verdict prefix in the first chunk: `"ok. 22 passed; 1 failed"` or `"FAILED. 22 passed; 1 failed"`. Positional parsing got 0 every time. Test `summary_counts_handles_failed_verdict` pins it. 2. **Failures-block exit condition was wrong.** Initial impl exited on lines containing `:` — but test names ARE `module::path::test` which contains `::`. Fix: enter on `failures:`, capture single- token lines that contain `::` (strong "this is a Rust test name" heuristic), exit on next `test result:`. Test `parse_test_captures_failure_names_in_order` pins it. 3. **libtest emits TWO `failures:` blocks per failing binary** — first with `---- foo::b stdout ----` decorators + panic stdout, second with the bare test-name list. Parser captures from both forms (skipping decorator lines), then dedupes by first-seen order. Test `parse_test_dedupes_failures_across_repeated_blocks` pins it. 4. **Timeout clamping is hard-capped at substrate level.** `BUILD_MAX_TIMEOUT_MS = 900_000` (15 min); `TEST_MAX_TIMEOUT_MS = 1_800_000` (30 min). Higher values silently clamp — prevents a runaway persona from holding the substrate forever. Defaults (5min / 10min) cover typical iteration loops. 5. **Subprocess output captured concurrently with `wait()`.** Using tokio tasks for stdout/stderr read avoids the classic deadlock where the child fills its pipe buffer waiting for us to read while we wait for it to exit. # Composability with the grid (the alignment payoff) Per the gap report's "later parts of the vision" section: both result envelopes are flat camelCase JSON, trivially serializable across airc's grid. A persona on Joel's M-series Mac can call `cargo/test` against a module a persona on a peer's RTX 5090 just authored — result envelope routes back on the same Commands/Events bus. The substrate already routes commands across peers; this PR makes the wire shape grid-friendly. See [[alignment-via-substrate-economics]] — once `events/command-completed` (gap report priority 3) lands, build/test attribution becomes observable in real time, closing the loop from "I built this" to "the grid knows I built this." # Tests (29/29 pass) **parse_build_messages (5)** — fixture cargo JSON lines: - E0382 with code + primary span + rendered - Warnings separate from errors - Non-diagnostic reasons skipped (compiler-artifact, build-finished) - Non-JSON lines tolerated - Diagnostic without primary span (linker errors) **parse_test_output (5)** — fixture libtest output: - All-pass summary extraction - Failure-name capture in order - Multi-binary aggregation (sum across summaries) - Dedup across repeated failures blocks - Empty output returns zero counts (vacuously success) **parse_summary_counts (2)** — edge cases: - "filtered out" tail field tolerated - FAILED verdict prefix doesn't break positional parsing **timeout (2)** — defaults + clamping to max **types (5)** — camelCase round-trip, defaults, optional-omission, lib_only flag, failure-order preservation **dispatch (2)** — config advertises cargo/ prefix; unknown command surfaces typed error **end-to-end (1)** — real `cargo --version` subprocess pipeline **concurrency stress (1)** — 8 parallel real `cargo --version` invocations on multi-thread tokio, every result consistent **ts-rs exports (6)** — wire bindings auto-generated # What this PR does NOT do - **Does NOT add TS wrapper commands.** Rust ServiceModule + IPC bridge is the canonical surface per `rust-is-the-core-node-is-the-shell`. - **Does NOT stream output.** Returns single envelope at end. Streaming is gap report priority 4 — needs Stream cell shape implementation. - **Does NOT manage per-persona workspaces.** Takes optional `working_dir` (default: process cwd). Per-persona workspace isolation is an orthogonal layer (`workspace/resolve` command for a future PR). - **Does NOT depend on libtest's JSON output** (`-Z unstable-options`). Parses stable human-readable test output. When libtest stabilizes JSON output, can upgrade to structured per-test events in a follow-up. - **Does NOT scaffold via `generate/module --stateful` invocation** for the dogfood demo. Hand-authored matching the v2 template shape exactly. A future PR can swap in a literal generator invocation as a build-time scaffold step. # References - [docs/planning/PERSONA-AS-DEVELOPER-GAP.md](docs/planning/PERSONA-AS-DEVELOPER-GAP.md) Priority 2 (this PR) — Priority 1 was code/exists+list+glob (#1501) - [docs/architecture/COMMAND-INFRASTRUCTURE-FIELD-MANUAL.md](docs/architecture/COMMAND-INFRASTRUCTURE-FIELD-MANUAL.md) §3 (Module Design Template) + §4 (Concurrency doctrine) - [docs/architecture/MODULE-CATALOG.md §0](docs/architecture/MODULE-CATALOG.md) — new `cargo` row to add when this lands - Memories: [[three-primitives-commands-events-persona]], [[alignment-via-substrate-economics]], [[continuum-thesis-airc-is-the-medium]] Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…ands actually dispatch Adversarial PR review caught: `pub mod cargo;` was added to `modules/mod.rs` but the production wire-up in `ipc::start_server` never called `runtime.register(Arc::new(CargoModule::new()))`. Net effect: `cargo/build` and `cargo/test` would return "Unknown command — No module registered for this command prefix" at runtime. The unit tests passed because they instantiate `CargoModule::new()` directly and call `handle_command`, bypassing the runtime registry entirely. The PR shipped dead code from the caller's perspective — the title's deliverable didn't work end-to-end. Fix: add the missing import + register call alongside the other ServiceModule registrations in `ipc/mod.rs::start_server`, sandwich between `ForgeModule` and `EventsModule` for consistency with the existing ordering. Per [[every-error-is-an-opportunity-to-battle-harden]]: the proper substrate-level fix is a CI guard that asserts every `pub mod foo;` in `modules/mod.rs` is paired with a `runtime.register(Arc::new( FooModule::new()))` call somewhere in `ipc/mod.rs`. Filed as a follow-up task — the dispatcher's silent miss on an "Unknown command" prefix is exactly the class of bug that mechanical checks should catch. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…event (#1503) Closes Priority 3 from [PERSONA-AS-DEVELOPER-GAP.md](docs/planning/PERSONA-AS-DEVELOPER-GAP.md): restores the RTOS-brain doctrine ("handlers read pre-staged results, never block on recall/embedding/planning") at the dispatch layer. Every `CommandExecutor::execute()` now emits a `command:completed` event on the wired bus after the dispatch settles — subscribers consume completion events instead of polling result surfaces. # What this adds ## `CommandCompletedEvent` (new type) ```rust pub struct CommandCompletedEvent { pub command_name: String, pub duration_ms: u64, pub success: bool, pub error: Option<String>, } ``` - ts-rs exported to `shared/generated/runtime/CommandCompletedEvent.ts` - camelCase wire shape, optional `error` elided on success - Topic constant `COMMAND_COMPLETED_TOPIC = "command:completed"` centralized for publishers + subscribers + tests to share ## `CommandExecutor` extensions - New `bus: Option<Arc<MessageBus>>` field - Builder `with_message_bus(bus: Arc<MessageBus>) -> Self` - New init function `init_executor_with_bus_and_interceptors(...)` for production startup; existing `init_executor` paths still work without a bus (telemetry no-ops) - `execute()` wraps `execute_inner()` with timing + event emission — single `OnceLock`-set path for both production and back-compat ## `MessageBus` change Added `command:` to the realtime passthrough list. The bus coalesces non-realtime events with the same prefix in 50ms windows to prevent floods from bulk ops — but command-completion events violate the RTOS doctrine if coalesced (a persona's loop would miss 31 out of 32 events under multi-persona load). Now flows through uncoalesced, same as `chat:`, `sentinel:`, `presence:`, `tool:`. # Sharp design decisions (kinks the tests caught pre-merge) 1. **Coalescing dropped events under load.** Initial `concurrent_dispatches_each_emit_their_own_event` test asserted 32 events from 32 concurrent dispatches — got 1. Root cause: the bus's 50ms coalescing window collapses same-prefix events. Fix: `command:` joins the realtime passthrough list. The test then confirms 32 distinct events arrive (with unique command_names, no event loss, no payload corruption). 2. **CommandResult doesn't impl Clone.** Test fixtures need to return the same canned result on repeated calls. Solution: `CannedModule` stores `Result<Value, String>` (cloneable) and wraps in `CommandResult::Json` on each handler call. No substrate change. 3. **Event emission is infallible telemetry, not contract.** The `emit_command_completed` helper publishes via `publish_async_only` (fire-and-forget) and silently logs serialize failures (which shouldn't happen for a struct of plain fields, but tolerated). Telemetry must never break the dispatch contract. # Pinned invariants (multi-thread tests) `runtime::command_executor::tests`: - `dispatch_emits_completed_event_on_success` — happy path event with command_name + duration + success=true + no error - `dispatch_emits_completed_event_on_handler_error` — failure path event with success=false + populated error mirroring the Err msg - `dispatch_without_wired_bus_is_no_op_telemetry` — back-compat path (no bus) doesn't panic + dispatch still works - `ts_bridge_failure_still_emits_completed_event` — third dispatch tier (TS bridge fallthrough) covered for both no-handler and failure paths; telemetry is exhaustive - `concurrent_dispatches_each_emit_their_own_event` — `flavor = "multi_thread", worker_threads = 4`; 32 parallel dispatches each produce exactly one distinct event (no loss, no dupe, no payload interleave) `runtime::command_events::tests`: - `event_round_trips_through_wire_with_camel_case` - `event_with_error_includes_error_on_wire` - `event_parses_from_wire_shape_subscribers_will_see` — pin the exact JSON shape downstream consumers will see - `topic_constant_is_namespaced_action_format` - `export_bindings_commandcompletedevent` (ts-rs) # What this PR does NOT do - **Does NOT wire production startup to use the new init function.** `ipc::start_server` still calls `init_executor_with_interceptors` (no bus). A follow-up PR threads the runtime's bus through into startup. Safe: with no bus wired, the event emission is a silent no-op so production behavior is byte-identical until the wire lands. - **Does NOT emit per-tier events** (interceptor handled vs local Rust vs TS bridge). One event per `execute()` call — the outermost outcome. Per-tier telemetry can be added later if a consumer needs it. - **Does NOT emit `command:queued` / `command:dispatching` lifecycle events.** Just `command:completed`. The Stream cell shape (gap report priority 4) is the right home for in-flight progress events when it lands. - **Does NOT add a default subscriber** (a persona loop that consumes these events). The substrate ships the publisher; consumers wire up per their use case via `bus.receiver()` or the existing `bus.subscribe()` path. # Substrate doctrine reinforced Per [[three-primitives-commands-events-persona]] + [[alignment-via-substrate-economics]]: this PR composes the Commands primitive (dispatch) with the Events primitive (completion notifications) at the kernel layer. Personas now have a substrate-level signal for "command X just finished with outcome Y" — the foundation `code/shell/stream` (gap report priority 4) extends with line-by-line streaming when the Stream cell shape activates. For the alignment economics: once peer dispatches over airc grid also emit these events on the local bus (transparent via the GridInterceptor → grid event echo), attribution becomes substrate-observable across the grid. A peer's `cargo/build` completing on their machine emits `command:completed` to your local bus; your persona learns who built what, when. # References - [docs/planning/PERSONA-AS-DEVELOPER-GAP.md](docs/planning/PERSONA-AS-DEVELOPER-GAP.md) Priority 3 (this PR). Priority 1 was #1501, Priority 2 was #1502. - [docs/architecture/COMMAND-INFRASTRUCTURE-FIELD-MANUAL.md](docs/architecture/COMMAND-INFRASTRUCTURE-FIELD-MANUAL.md) §2 (Substrate primitives) — adds the dispatch-level event hook - [MODULE-CATALOG.md §0](docs/architecture/MODULE-CATALOG.md) — runtime substrate row to add when this lands - Memories: [[three-primitives-commands-events-persona]], [[alignment-via-substrate-economics]], [[rtos-brain-no-region-on-hot-path]] Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>

github-actions Bot added the size: XL label May 31, 2026

joelteply mentioned this pull request May 31, 2026

feat(runtime): events/command-completed — every dispatch emits a bus event (RTOS doctrine restored) #1503

Merged

joelteply merged commit e5b0ce1 into canary May 31, 2026
4 checks passed

joelteply deleted the feat/cargo-build-test-module branch May 31, 2026 02:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(modules/cargo): cargo/build + cargo/test — structured Rust toolchain wrappers#1502

feat(modules/cargo): cargo/build + cargo/test — structured Rust toolchain wrappers#1502
joelteply merged 2 commits into
canaryfrom
feat/cargo-build-test-module

joelteply commented May 31, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

joelteply commented May 31, 2026

Summary

What this adds

Doctrine followed (per field manual)

Sharp design decisions (the kinks the tests caught pre-merge)

Composability with the grid (the alignment payoff)

Test plan (29/29 pass)

What this PR does NOT do

Stacks on

References

Next up

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant