feat(mcp): output JSON Schema validation gate in boruna_run (closes #8) by escapeboy · Pull Request #14 · escapeboy/boruna

escapeboy · 2026-04-25T15:38:41Z

Sprint 0.5-S6 — closes #8 (FleetQ P2, fourth in a row)

Lets integrators drop their host-language JSON Schema validation layer — Boruna validates the agent's return value before yielding it, with structured per-path errors. Pulled forward from 0.5.0 because FleetQ wanted it in their pipeline; same logic that worked for #6 (lock the surface before 0.5 freezes it).

What's in this PR

MCP surface

boruna_run({
  source: \"...\",
  output_schema: {
    \"type\": \"object\",
    \"required\": [\"status\", \"items\"],
    \"properties\": {
      \"status\": { \"type\": \"string\", \"enum\": [\"ok\", \"warn\"] },
      \"items\":  { \"type\": \"array\", \"minItems\": 1 }
    }
  }
})

Mismatch returns:

{
  \"success\":      false,
  \"error_kind\":   \"validation_failed\",
  \"phase\":        \"output_validation\",
  \"message\":      \"result does not match output_schema\",
  \"errors\":       [{ \"path\": \"/status\", \"message\": \"...\" }, ...],
  \"truncated\":    false,
  \"total_errors\": 3,
  \"steps\":        <vm.step_count() at successful completion>
}

Malformed/oversized schema:

{ \"success\": false, \"error_kind\": \"invalid_output_schema\", \"message\": \"...\" }

Hard limits (locked)

256 KB max schema size (compact JSON bytes). Larger → `invalid_output_schema` before compilation.
100 errors max in response (matches `TRACE_LIMIT` pattern). Pathological schemas can produce thousands; cap protects MCP transport. `truncated: true` + `total_errors: N` when exceeded.
Draft 2020-12 enforced. Schemas declaring non-2020-12 `$schema` are rejected at parse time, not silently honoured at older-draft semantics. Same "reject at parse, don't silently override" pattern as 0.3-S10's `unsupported_limit` for `max_memory_mb`.

Known limitation (documented loudly)

`format_value` emits Boruna records/enums/Some/Ok as wrapper JSON:

Boruna value	JSON shape
`Record`	`{"type":"record","type_id":N,"fields":[positional values]}`
`Enum`	`{"type":"enum","type_id":N,"variant":,"payload":...}`
`Some/Ok/Err`	`{"option":"Some","value":...}` etc.

A schema written for the natural object shape (e.g. `{"type":"object","properties":{"name":...}}`) will FAIL validation against a record return. The gate is most useful for primitive return types (`Int`, `String`, `Bool`) and homogeneous `List`/`Map` containers. Record/enum projection lands in a future sprint. Documented in design doc, doc comment, CHANGELOG.

Tests

11 new MCP tests (passthrough, valid, invalid w/ path errors, invalid w/ multiple errors, malformed schema, runtime-error-takes-precedence, empty-schema-accepts, $schema-non-2020-12-rejected, $schema-2020-12-accepted, size-limit, total_errors-field-present, wrapper-format-anchor)
All 591+ existing workspace tests pass
clippy clean, fmt clean

Review

`ce-correctness-reviewer` surfaced 1 HIGH + 3 MEDIUM findings. All addressed before commit:

#	Finding	Fix
1 (HIGH)	`validator_for` honored older drafts via `$schema`	REJECT non-2020-12 `$schema` at parse time
2 (MED)	`format_value` wrapper-format limitation	Documented loudly (design doc + doc comment + CHANGELOG + this PR)
3 (MED)	Unbounded errors array	Cap at 100 + `truncated` / `total_errors` fields
4 (MED)	No size limit on `output_schema`	256 KB cap before compilation

Security

`jsonschema = "0.30"` added with `default-features = false` — no `resolve-http` or `resolve-file` features, so `$ref` to remote URLs or local files cannot trigger SSRF or arbitrary file reads.

What's NOT in this PR (follow-ups)

The user-facing `docs/reference/mcp-server.md` update for `output_schema` and new error kinds. Lives on PR MCP surface stabilization: tool reference docs + protocol_version: 1 (closes #6) #11's branch which adds that file; this PR is branched off master. Folds in as a small follow-up after PR MCP surface stabilization: tool reference docs + protocol_version: 1 (closes #6) #11 merges.
Logical projection of Record/Enum to by-name JSON before validation. Future sprint when there's a real customer ask.
Schema compilation caching. Defer until measured.

Closes

Closes [P2] Output schema validation as a first-class gate #8 (FleetQ P2: output schema validation as a first-class gate)

FleetQ status after this PR

4 of 9 P1/P2 asks closed (#3, #5, #6, #8). Only 2 P2 asks remain:

[P2] Record/replay for net.fetch #7 — Record/replay for net.fetch
[P2] Per-call observability hooks (OpenTelemetry) #9 — Per-call OpenTelemetry observability

After those, only the 0.3.0 critical-path work (persistent state + dependent sprints) remains as the dominant roadmap focus.

🤖 Generated with Claude Code

Sprint 0.5-S6, pulled forward from 0.5.0. Fourth FleetQ ask shipped in a row (after #3, #6, #5). Lets integrators drop their host-language JSON Schema validation layer — Boruna validates the agent's return value before yielding it, with structured per-path errors. ## What changed - New optional `output_schema: object` parameter on `boruna_run`. Accepts any JSON Schema 2020-12 object. Post-execution gate. - New `error_kind` values: - `validation_failed` (phase: "output_validation") with per-path JSON Pointer errors, truncated and total_errors fields - `invalid_output_schema` for malformed or oversized schemas - `jsonschema = "0.30"` dep in boruna-mcp (default features off — no resolve-http/resolve-file, so $ref can't trigger SSRF/file reads) ## Hard limits (locked, all addressed in review) - 256 KB max schema size (compact JSON bytes). Larger → invalid_output_schema before any compilation. Mirrors the spirit of the 1 MB source cap. - 100 errors max in response (matches TRACE_LIMIT pattern). Pathological schemas can produce thousands; cap protects MCP transport. truncated: true + total_errors: N fields when exceeded. - Draft 2020-12 enforced. Schemas declaring non-2020-12 $schema are REJECTED at parse time, not silently honoured at older-draft semantics (which would change `id` vs `$id`, `exclusiveMinimum` boolean vs number, etc.). Same "reject at parse, don't silently override" pattern as 0.3-S10 max_memory_mb. ## Known limitation (documented loudly) `format_value` emits Boruna records/enums/Some/Ok as wrapper JSON: - Record → {"type":"record","type_id":N,"fields":[positional values]} - Enum → {"type":"enum","type_id":N,"variant":<n>,"payload":...} - Some/Ok/Err → {"option":"Some","value":...} etc. A schema written for the natural object shape (e.g. {"type":"object", "properties":{"name":...}}) will FAIL validation against a record return. The gate is most useful for primitive return types (Int, String, Bool) and homogeneous List/Map containers. Record/enum projection lands in a future sprint. Documented in: - docs/design-output-schema.md - validate_output_against_schema doc comment - mcp-server.md (lands when PR #11 merges) ## Tests - 9 new MCP tests: schema-not-set passthrough, schema-passing, schema- failing-with-path-errors, schema-failing-with-multiple-errors, invalid-schema-rejected, runtime-error-takes-precedence, empty-schema- accepts-anything, $schema-non-2020-12-rejected, $schema-2020-12-accepted, size-limit-rejection, total_errors-field-present. - All 591+ existing workspace tests pass. - clippy --workspace -- -D warnings clean. - cargo fmt --all -- --check clean. ## Review ce-correctness-reviewer surfaced 1 HIGH + 3 MEDIUM findings. All addressed before commit: 1. (HIGH) `validator_for` honored older drafts via $schema → REJECT non-2020-12 $schema at parse time (jsonschema crate honours $schema over with_draft, so silent override impossible). 2. (MED) `format_value` wrapper-format limitation for Record/Enum → documented loudly in design doc, doc comment, and CHANGELOG. 3. (MED) Unbounded errors array → cap at 100 with truncated/total_errors. 4. (MED) No size limit on output_schema → 256 KB cap before compilation. ## What's NOT in this PR (follow-ups) - The user-facing docs/reference/mcp-server.md update for the output_schema parameter and new error kinds. Lives on PR #11's branch which adds that file in the first place; this PR is branched off master. Folds in as a small follow-up after PR #11 merges. - Logical projection of Record/Enum to by-name JSON before validation. Future sprint when there's a real customer ask. - Schema compilation caching (compile-by-hash). Defer until measured. ## Closes - Closes #8 (FleetQ P2: output schema validation as a first-class gate) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Captures the 4 review findings (1 HIGH + 3 MEDIUM) and how each was addressed before commit. Notable: the test-driven discovery that `with_draft` is only a fallback (not a lock) when `\$schema` is set — forced the right fix (reject non-2020-12 \$schema at parse time). Establishes "reject at parse, don't silently override" as a project convention, now applied to: max_memory_mb (0.3-S10), policy invalid strings (0.2.0), output_schema size, non-2020-12 \$schema (0.5-S6). Also locks the wrapper-format limitation in 4 docs (design doc, doc comment, CHANGELOG, PR description) so integrators have honest expectations until logical projection lands in a future sprint. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

escapeboy and others added 2 commits April 25, 2026 18:38

escapeboy mentioned this pull request Apr 25, 2026

feat(observability): per-call OpenTelemetry spans (closes #9 — final FleetQ ask) #16

Merged

escapeboy merged commit 0214e9c into master Apr 25, 2026
2 of 3 checks passed

escapeboy deleted the feat/0.5-s6-output-schema branch April 25, 2026 17:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(mcp): output JSON Schema validation gate in boruna_run (closes #8)#14

feat(mcp): output JSON Schema validation gate in boruna_run (closes #8)#14
escapeboy merged 2 commits intomasterfrom
feat/0.5-s6-output-schema

escapeboy commented Apr 25, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

escapeboy commented Apr 25, 2026

Sprint 0.5-S6 — closes #8 (FleetQ P2, fourth in a row)

What's in this PR

MCP surface

Hard limits (locked)

Known limitation (documented loudly)

Tests

Review

Security

What's NOT in this PR (follow-ups)

Closes

FleetQ status after this PR

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant