feat(limits): structured resource limits in boruna_run (closes #5) by escapeboy · Pull Request #13 · escapeboy/boruna

escapeboy · 2026-04-25T15:21:49Z

Sprint 0.3-S10 — closes #5 (FleetQ P1, third in a row)

Lets FleetQ drop their PHP-side timeout wrapper and surface clean per-limit UX (`limit_kind` + `phase` discriminators) instead of parsing exit-code strings.

What's in this PR

MCP surface

boruna_run({
  source: "...",
  policy: ...,
  limits: {
    "max_wall_ms":      30000,
    "max_output_bytes": 1048576,
    "max_memory_mb":    256
  }
})

Overrun returns:

{
  "success":     false,
  "error_kind":  "limit_exceeded",
  "limit_kind":  "wall_ms" | "output_bytes",
  "phase":       "execution" | "serialization",
  "limit":       30000,
  "message":     "wall-clock execution limit of 30000 ms exceeded",
  "steps":       <step count at failure>
}

The `phase` discriminator distinguishes a timeout-mid-run from a too-large-output detected after a successful run — without it, integrators reading `steps` for billing would conflate them.

Security: `max_memory_mb` rejected at parse time, not silently ignored

Reviewer caught this as a HIGH finding: silent acceptance of an unenforced memory limit lets a script OOM the host while the integrator believes memory is bounded. Fixed by returning typed:

{ \"success\": false, \"error_kind\": \"unsupported_limit\", \"limit_kind\": \"memory_mb\", \"message\": \"...\" }

at parse time. Field stays in the schema for forward compatibility; enforcement lands in a future sprint (Linux `setrlimit` + per-platform fallback).

VM layer

New `Vm::set_max_wall_ms(Option)` — checked every 1024 steps using `std::time::Instant` (NOT `chrono` — per ADR 001 determinism contract).
New `VmError::WallTimeExceeded(u64)` variant.
Honoured by both `Vm::run()` AND `Vm::execute_bounded()` — the actor-scheduler path was a silent-bypass risk that the review caught.

Documented limitations (per review)

Wall-time check fires only between bytecode steps. A single CapCall to a slow handler (LLM/HTTP/DB) executes synchronously and is NOT interrupted by `max_wall_ms`. Use `NetPolicy.timeout_ms` for per-net-call budgets; equivalent per-handler controls land with each future capability handler.
Wall-time-keyed errors must NEVER feed an EventLog or EvidenceBundle — wall-clock-keyed values break replay across hosts. Hard invariant on `set_max_wall_ms`. Today only the MCP path uses it; the orchestrator's Runner deliberately does not.

Tests

7 new MCP tests (cumulative budget, exact-match boundary, over-cap rejection with `phase=serialization`, security: `max_memory_mb` rejection, unrelated-runtime-error-still-`runtime_error`).
3 new VM tests (wall-clock fires, unset doesn't fire, generous doesn't fire).
All 591+ existing workspace tests pass.
`cargo clippy --workspace -- -D warnings` clean.
`cargo fmt --all -- --check` clean.

Review

`ce-correctness-reviewer` surfaced 6 HIGH findings (1 self-withdrawn). All addressed before commit:

#	Finding	Fix
1	CapCall uninterruptible	Documented as a known limitation
2	`execute_bounded` ignored `max_wall_ms`	Fixed (`start_time` set there too, preserved across yields)
3	`steps` misleading on `output_bytes`	Added `phase` discriminator so the contract is unambiguous
4	Full-serialize peak memory	Added `Value::String` pre-check before recursing
5	`max_memory_mb` silent acceptance	REJECTED at parse time as `unsupported_limit` (security fix)
6	`WallTimeExceeded` → audit hash leak	Beefed up doc invariant on `set_max_wall_ms`

What's NOT in this PR (follow-ups)

The user-facing `docs/reference/mcp-server.md` update for the `limits` parameter and the `limit_exceeded` / `unsupported_limit` error kinds. Lives on PR MCP surface stabilization: tool reference docs + protocol_version: 1 (closes #6) #11's branch which adds that file in the first place; this PR is branched off master. Folds in as a small follow-up after PR MCP surface stabilization: tool reference docs + protocol_version: 1 (closes #6) #11 merges (estimated 10 lines added to mcp-server.md).
`max_memory_mb` enforcement (Linux `setrlimit` + per-platform fallback) — separate future sprint.
Per-CapCall timeout for LLM and other slow handlers — lands when each handler ships.

Closes

Closes [P1] Structured resource limits with typed errors in boruna_run #5 (FleetQ P1: structured resource limits with typed errors)

FleetQ status after this PR

3 of 9 P1/P2 asks closed (#3 `capability_set_hash` via PR #10; #6 `protocol_version` via PR #11; #5 `limits` via this PR). Only P2 asks remain (#7 record/replay net.fetch, #8 output schema, #9 OTLP) — at which point `0.3-S2` (persistent state) becomes the right next big sprint.

🤖 Generated with Claude Code

Sprint 0.3-S10. Third FleetQ P1 ask shipped in a row. Lets integrators drop their PHP-side timeout wrapper and surface clean per-limit UX instead of parsing exit-code strings. ## What changed ### VM layer (boruna-vm) - New `Vm::set_max_wall_ms(Option<u64>)` — wall-clock execution limit in milliseconds. Checked every 1024 steps inside the execute loop using std::time::Instant (NOT chrono — per ADR 001 determinism contract). Honoured by both Vm::run() AND Vm::execute_bounded() — the actor-scheduler path was a silent-bypass risk that the review caught and this fix closes. - New VmError::WallTimeExceeded(u64) variant. - 3 new VM tests: limit fires on long-running program, unset doesn't fire, generous limit doesn't fire. ### MCP layer (boruna-mcp) - New optional `limits` parameter on boruna_run accepting max_wall_ms, max_output_bytes, and max_memory_mb. Plumbed through to the VM (max_wall_ms) and the response serializer (max_output_bytes). - New error envelope: { success: false, error_kind: "limit_exceeded", limit_kind: "wall_ms" | "output_bytes", phase: "execution" | "serialization", limit: <configured>, message: <human>, steps: <step_count at failure> } The `phase` discriminator distinguishes a timeout-mid-run from a too-large-output detected after a successful run — without it, integrators reading `steps` for billing would conflate them. ## Security: max_memory_mb rejected at parse time, not silently ignored The ce-correctness-reviewer flagged this as a HIGH security finding: silent acceptance of an unenforced memory limit lets a script OOM the host while the integrator believes memory is bounded. Fixed by returning a typed `error_kind: "unsupported_limit", limit_kind: "memory_mb"` at parse time. Enforcement lands in a future sprint (Linux setrlimit + per-platform fallback); the field stays in the schema for forward compatibility. ## Documented limitations (per review) - Wall-time check fires only between bytecode steps. A single CapCall to a slow handler (LLM, HTTP, DB) executes synchronously and is NOT interrupted by max_wall_ms. Use NetPolicy.timeout_ms for per-net-call budgets; equivalent per-handler controls land with each future cap handler. Documented in `set_max_wall_ms` doc comment + design doc. - Wall-time-keyed errors must NEVER be fed into an EventLog or EvidenceBundle — wall-clock-keyed values break replay across hosts. Documented as a hard invariant on `set_max_wall_ms`. Today only the MCP path uses it; the orchestrator's Runner deliberately does not. - Output-bytes pre-checks the dominant unbounded-allocation case (Value::String length) before serializing, so a 1 GB string returned from a script doesn't peak twice through memory before being rejected. Container types still go through post-serialize check. ## Tests - 7 new MCP tests (cumulative budget, exact-match boundary, over-cap rejection with phase=serialization, max_memory_mb rejection, unrelated-runtime-error-still-runtime_error, max_memory_mb security). - 3 new VM tests (wall-clock fires, unset doesn't fire, generous doesn't fire). - All 591+ existing workspace tests pass. - clippy --workspace -- -D warnings clean. - cargo fmt --all -- --check clean. ## Review ce-correctness-reviewer ran on the draft and surfaced 6 valid HIGH findings (1 self-withdrawn). All addressed before commit: 1. CapCall uninterruptible — documented as a known limitation. 2. execute_bounded ignored max_wall_ms — fixed (start_time set there too, preserved across yields). 3. `steps` misleading on output_bytes — added `phase` discriminator so the contract is unambiguous. 4. Full-serialize peak memory — added Value::String pre-check. 5. max_memory_mb silent acceptance — REJECTED at parse time as unsupported_limit (security fix). 6. WallTimeExceeded → audit hash leak risk — beefed up doc invariant. ## Documentation - New `docs/design-resource-limits.md` — design rationale, scope, the determinism trade-off, acceptance criteria. - The user-facing `docs/reference/mcp-server.md` update lands as a small follow-up after PR #11 merges (this branch is off master, which doesn't yet include that file). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Captures the 6 HIGH review findings and how each was addressed (1 withdrawn; 5 fixed before commit). Notable: max_memory_mb silent acceptance was a real security regression that the review caught — now rejected at parse time as `unsupported_limit`. Establishes the precedent that any "accepted-but-not-enforced" schema field should reject at parse time, not silently ignore. Also documents the new `phase` discriminator pattern on typed errors — when one error_kind can fire from multiple code paths, add a phase field at lock time so integrators can branch correctly without needing a future protocol_version bump. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

escapeboy and others added 2 commits April 25, 2026 18:21

escapeboy mentioned this pull request Apr 25, 2026

feat(observability): per-call OpenTelemetry spans (closes #9 — final FleetQ ask) #16

Merged

escapeboy merged commit 910ce92 into master Apr 25, 2026
2 of 3 checks passed

escapeboy deleted the feat/0.3-s10-resource-limits branch April 25, 2026 17:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(limits): structured resource limits in boruna_run (closes #5)#13

feat(limits): structured resource limits in boruna_run (closes #5)#13
escapeboy merged 2 commits intomasterfrom
feat/0.3-s10-resource-limits

escapeboy commented Apr 25, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

escapeboy commented Apr 25, 2026

Sprint 0.3-S10 — closes #5 (FleetQ P1, third in a row)

What's in this PR

MCP surface

Security: `max_memory_mb` rejected at parse time, not silently ignored

VM layer

Documented limitations (per review)

Tests

Review

What's NOT in this PR (follow-ups)

Closes

FleetQ status after this PR

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant