feat(limits): structured resource limits in boruna_run (closes #5)#13
Merged
feat(limits): structured resource limits in boruna_run (closes #5)#13
Conversation
Sprint 0.3-S10. Third FleetQ P1 ask shipped in a row. Lets integrators
drop their PHP-side timeout wrapper and surface clean per-limit UX
instead of parsing exit-code strings.
## What changed
### VM layer (boruna-vm)
- New `Vm::set_max_wall_ms(Option<u64>)` — wall-clock execution limit
in milliseconds. Checked every 1024 steps inside the execute loop
using std::time::Instant (NOT chrono — per ADR 001 determinism
contract). Honoured by both Vm::run() AND Vm::execute_bounded() —
the actor-scheduler path was a silent-bypass risk that the review
caught and this fix closes.
- New VmError::WallTimeExceeded(u64) variant.
- 3 new VM tests: limit fires on long-running program, unset doesn't
fire, generous limit doesn't fire.
### MCP layer (boruna-mcp)
- New optional `limits` parameter on boruna_run accepting
max_wall_ms, max_output_bytes, and max_memory_mb. Plumbed through
to the VM (max_wall_ms) and the response serializer (max_output_bytes).
- New error envelope:
{
success: false, error_kind: "limit_exceeded",
limit_kind: "wall_ms" | "output_bytes",
phase: "execution" | "serialization",
limit: <configured>, message: <human>,
steps: <step_count at failure>
}
The `phase` discriminator distinguishes a timeout-mid-run from a
too-large-output detected after a successful run — without it,
integrators reading `steps` for billing would conflate them.
## Security: max_memory_mb rejected at parse time, not silently ignored
The ce-correctness-reviewer flagged this as a HIGH security finding:
silent acceptance of an unenforced memory limit lets a script OOM the
host while the integrator believes memory is bounded. Fixed by
returning a typed `error_kind: "unsupported_limit", limit_kind: "memory_mb"`
at parse time. Enforcement lands in a future sprint (Linux setrlimit
+ per-platform fallback); the field stays in the schema for forward
compatibility.
## Documented limitations (per review)
- Wall-time check fires only between bytecode steps. A single CapCall
to a slow handler (LLM, HTTP, DB) executes synchronously and is NOT
interrupted by max_wall_ms. Use NetPolicy.timeout_ms for per-net-call
budgets; equivalent per-handler controls land with each future cap
handler. Documented in `set_max_wall_ms` doc comment + design doc.
- Wall-time-keyed errors must NEVER be fed into an EventLog or
EvidenceBundle — wall-clock-keyed values break replay across hosts.
Documented as a hard invariant on `set_max_wall_ms`. Today only the
MCP path uses it; the orchestrator's Runner deliberately does not.
- Output-bytes pre-checks the dominant unbounded-allocation case
(Value::String length) before serializing, so a 1 GB string returned
from a script doesn't peak twice through memory before being rejected.
Container types still go through post-serialize check.
## Tests
- 7 new MCP tests (cumulative budget, exact-match boundary, over-cap
rejection with phase=serialization, max_memory_mb rejection,
unrelated-runtime-error-still-runtime_error, max_memory_mb security).
- 3 new VM tests (wall-clock fires, unset doesn't fire, generous doesn't
fire).
- All 591+ existing workspace tests pass.
- clippy --workspace -- -D warnings clean.
- cargo fmt --all -- --check clean.
## Review
ce-correctness-reviewer ran on the draft and surfaced 6 valid HIGH
findings (1 self-withdrawn). All addressed before commit:
1. CapCall uninterruptible — documented as a known limitation.
2. execute_bounded ignored max_wall_ms — fixed (start_time set there
too, preserved across yields).
3. `steps` misleading on output_bytes — added `phase` discriminator
so the contract is unambiguous.
4. Full-serialize peak memory — added Value::String pre-check.
5. max_memory_mb silent acceptance — REJECTED at parse time as
unsupported_limit (security fix).
6. WallTimeExceeded → audit hash leak risk — beefed up doc invariant.
## Documentation
- New `docs/design-resource-limits.md` — design rationale, scope, the
determinism trade-off, acceptance criteria.
- The user-facing `docs/reference/mcp-server.md` update lands as a
small follow-up after PR #11 merges (this branch is off master, which
doesn't yet include that file).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Captures the 6 HIGH review findings and how each was addressed (1 withdrawn; 5 fixed before commit). Notable: max_memory_mb silent acceptance was a real security regression that the review caught — now rejected at parse time as `unsupported_limit`. Establishes the precedent that any "accepted-but-not-enforced" schema field should reject at parse time, not silently ignore. Also documents the new `phase` discriminator pattern on typed errors — when one error_kind can fire from multiple code paths, add a phase field at lock time so integrators can branch correctly without needing a future protocol_version bump. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Sprint 0.3-S10 — closes #5 (FleetQ P1, third in a row)
Lets FleetQ drop their PHP-side timeout wrapper and surface clean per-limit UX (`limit_kind` + `phase` discriminators) instead of parsing exit-code strings.
What's in this PR
MCP surface
Overrun returns:
{ "success": false, "error_kind": "limit_exceeded", "limit_kind": "wall_ms" | "output_bytes", "phase": "execution" | "serialization", "limit": 30000, "message": "wall-clock execution limit of 30000 ms exceeded", "steps": <step count at failure> }The `phase` discriminator distinguishes a timeout-mid-run from a too-large-output detected after a successful run — without it, integrators reading `steps` for billing would conflate them.
Security: `max_memory_mb` rejected at parse time, not silently ignored
Reviewer caught this as a HIGH finding: silent acceptance of an unenforced memory limit lets a script OOM the host while the integrator believes memory is bounded. Fixed by returning typed:
{ \"success\": false, \"error_kind\": \"unsupported_limit\", \"limit_kind\": \"memory_mb\", \"message\": \"...\" }at parse time. Field stays in the schema for forward compatibility; enforcement lands in a future sprint (Linux `setrlimit` + per-platform fallback).
VM layer
Documented limitations (per review)
Tests
Review
`ce-correctness-reviewer` surfaced 6 HIGH findings (1 self-withdrawn). All addressed before commit:
What's NOT in this PR (follow-ups)
Closes
FleetQ status after this PR
3 of 9 P1/P2 asks closed (#3 `capability_set_hash` via PR #10; #6 `protocol_version` via PR #11; #5 `limits` via this PR). Only P2 asks remain (#7 record/replay net.fetch, #8 output schema, #9 OTLP) — at which point `0.3-S2` (persistent state) becomes the right next big sprint.
🤖 Generated with Claude Code