Skip to content

feat(mcp): output JSON Schema validation gate in boruna_run (closes #8)#14

Merged
escapeboy merged 2 commits intomasterfrom
feat/0.5-s6-output-schema
Apr 25, 2026
Merged

feat(mcp): output JSON Schema validation gate in boruna_run (closes #8)#14
escapeboy merged 2 commits intomasterfrom
feat/0.5-s6-output-schema

Conversation

@escapeboy
Copy link
Copy Markdown
Owner

Sprint 0.5-S6 — closes #8 (FleetQ P2, fourth in a row)

Lets integrators drop their host-language JSON Schema validation layer — Boruna validates the agent's return value before yielding it, with structured per-path errors. Pulled forward from 0.5.0 because FleetQ wanted it in their pipeline; same logic that worked for #6 (lock the surface before 0.5 freezes it).

What's in this PR

MCP surface

boruna_run({
  source: \"...\",
  output_schema: {
    \"type\": \"object\",
    \"required\": [\"status\", \"items\"],
    \"properties\": {
      \"status\": { \"type\": \"string\", \"enum\": [\"ok\", \"warn\"] },
      \"items\":  { \"type\": \"array\", \"minItems\": 1 }
    }
  }
})

Mismatch returns:

{
  \"success\":      false,
  \"error_kind\":   \"validation_failed\",
  \"phase\":        \"output_validation\",
  \"message\":      \"result does not match output_schema\",
  \"errors\":       [{ \"path\": \"/status\", \"message\": \"...\" }, ...],
  \"truncated\":    false,
  \"total_errors\": 3,
  \"steps\":        <vm.step_count() at successful completion>
}

Malformed/oversized schema:

{ \"success\": false, \"error_kind\": \"invalid_output_schema\", \"message\": \"...\" }

Hard limits (locked)

  • 256 KB max schema size (compact JSON bytes). Larger → `invalid_output_schema` before compilation.
  • 100 errors max in response (matches `TRACE_LIMIT` pattern). Pathological schemas can produce thousands; cap protects MCP transport. `truncated: true` + `total_errors: N` when exceeded.
  • Draft 2020-12 enforced. Schemas declaring non-2020-12 `$schema` are rejected at parse time, not silently honoured at older-draft semantics. Same "reject at parse, don't silently override" pattern as 0.3-S10's `unsupported_limit` for `max_memory_mb`.

Known limitation (documented loudly)

`format_value` emits Boruna records/enums/Some/Ok as wrapper JSON:

Boruna value JSON shape
`Record` `{"type":"record","type_id":N,"fields":[positional values]}`
`Enum` `{"type":"enum","type_id":N,"variant":,"payload":...}`
`Some/Ok/Err` `{"option":"Some","value":...}` etc.

A schema written for the natural object shape (e.g. `{"type":"object","properties":{"name":...}}`) will FAIL validation against a record return. The gate is most useful for primitive return types (`Int`, `String`, `Bool`) and homogeneous `List`/`Map` containers. Record/enum projection lands in a future sprint. Documented in design doc, doc comment, CHANGELOG.

Tests

  • 11 new MCP tests (passthrough, valid, invalid w/ path errors, invalid w/ multiple errors, malformed schema, runtime-error-takes-precedence, empty-schema-accepts, $schema-non-2020-12-rejected, $schema-2020-12-accepted, size-limit, total_errors-field-present, wrapper-format-anchor)
  • All 591+ existing workspace tests pass
  • clippy clean, fmt clean

Review

`ce-correctness-reviewer` surfaced 1 HIGH + 3 MEDIUM findings. All addressed before commit:

# Finding Fix
1 (HIGH) `validator_for` honored older drafts via `$schema` REJECT non-2020-12 `$schema` at parse time
2 (MED) `format_value` wrapper-format limitation Documented loudly (design doc + doc comment + CHANGELOG + this PR)
3 (MED) Unbounded errors array Cap at 100 + `truncated` / `total_errors` fields
4 (MED) No size limit on `output_schema` 256 KB cap before compilation

Security

`jsonschema = "0.30"` added with `default-features = false` — no `resolve-http` or `resolve-file` features, so `$ref` to remote URLs or local files cannot trigger SSRF or arbitrary file reads.

What's NOT in this PR (follow-ups)

Closes

FleetQ status after this PR

4 of 9 P1/P2 asks closed (#3, #5, #6, #8). Only 2 P2 asks remain:

After those, only the 0.3.0 critical-path work (persistent state + dependent sprints) remains as the dominant roadmap focus.

🤖 Generated with Claude Code

escapeboy and others added 2 commits April 25, 2026 18:38
Sprint 0.5-S6, pulled forward from 0.5.0. Fourth FleetQ ask shipped in a
row (after #3, #6, #5). Lets integrators drop their host-language JSON
Schema validation layer — Boruna validates the agent's return value
before yielding it, with structured per-path errors.

## What changed

- New optional `output_schema: object` parameter on `boruna_run`.
  Accepts any JSON Schema 2020-12 object. Post-execution gate.
- New `error_kind` values:
  - `validation_failed` (phase: "output_validation") with per-path
    JSON Pointer errors, truncated and total_errors fields
  - `invalid_output_schema` for malformed or oversized schemas
- `jsonschema = "0.30"` dep in boruna-mcp (default features off —
  no resolve-http/resolve-file, so $ref can't trigger SSRF/file reads)

## Hard limits (locked, all addressed in review)

- 256 KB max schema size (compact JSON bytes). Larger → invalid_output_schema
  before any compilation. Mirrors the spirit of the 1 MB source cap.
- 100 errors max in response (matches TRACE_LIMIT pattern). Pathological
  schemas can produce thousands; cap protects MCP transport. truncated:
  true + total_errors: N fields when exceeded.
- Draft 2020-12 enforced. Schemas declaring non-2020-12 $schema are
  REJECTED at parse time, not silently honoured at older-draft semantics
  (which would change `id` vs `$id`, `exclusiveMinimum` boolean vs
  number, etc.). Same "reject at parse, don't silently override"
  pattern as 0.3-S10 max_memory_mb.

## Known limitation (documented loudly)

`format_value` emits Boruna records/enums/Some/Ok as wrapper JSON:
- Record → {"type":"record","type_id":N,"fields":[positional values]}
- Enum → {"type":"enum","type_id":N,"variant":<n>,"payload":...}
- Some/Ok/Err → {"option":"Some","value":...} etc.

A schema written for the natural object shape (e.g. {"type":"object",
"properties":{"name":...}}) will FAIL validation against a record
return. The gate is most useful for primitive return types (Int,
String, Bool) and homogeneous List/Map containers. Record/enum
projection lands in a future sprint. Documented in:
- docs/design-output-schema.md
- validate_output_against_schema doc comment
- mcp-server.md (lands when PR #11 merges)

## Tests

- 9 new MCP tests: schema-not-set passthrough, schema-passing, schema-
  failing-with-path-errors, schema-failing-with-multiple-errors,
  invalid-schema-rejected, runtime-error-takes-precedence, empty-schema-
  accepts-anything, $schema-non-2020-12-rejected, $schema-2020-12-accepted,
  size-limit-rejection, total_errors-field-present.
- All 591+ existing workspace tests pass.
- clippy --workspace -- -D warnings clean.
- cargo fmt --all -- --check clean.

## Review

ce-correctness-reviewer surfaced 1 HIGH + 3 MEDIUM findings. All
addressed before commit:
1. (HIGH) `validator_for` honored older drafts via $schema → REJECT
   non-2020-12 $schema at parse time (jsonschema crate honours $schema
   over with_draft, so silent override impossible).
2. (MED) `format_value` wrapper-format limitation for Record/Enum →
   documented loudly in design doc, doc comment, and CHANGELOG.
3. (MED) Unbounded errors array → cap at 100 with truncated/total_errors.
4. (MED) No size limit on output_schema → 256 KB cap before compilation.

## What's NOT in this PR (follow-ups)

- The user-facing docs/reference/mcp-server.md update for the
  output_schema parameter and new error kinds. Lives on PR #11's branch
  which adds that file in the first place; this PR is branched off
  master. Folds in as a small follow-up after PR #11 merges.
- Logical projection of Record/Enum to by-name JSON before validation.
  Future sprint when there's a real customer ask.
- Schema compilation caching (compile-by-hash). Defer until measured.

## Closes

- Closes #8 (FleetQ P2: output schema validation as a first-class gate)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Captures the 4 review findings (1 HIGH + 3 MEDIUM) and how each was
addressed before commit. Notable: the test-driven discovery that
`with_draft` is only a fallback (not a lock) when `\$schema` is set —
forced the right fix (reject non-2020-12 \$schema at parse time).

Establishes "reject at parse, don't silently override" as a project
convention, now applied to: max_memory_mb (0.3-S10), policy invalid
strings (0.2.0), output_schema size, non-2020-12 \$schema (0.5-S6).

Also locks the wrapper-format limitation in 4 docs (design doc, doc
comment, CHANGELOG, PR description) so integrators have honest
expectations until logical projection lands in a future sprint.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@escapeboy escapeboy merged commit 0214e9c into master Apr 25, 2026
2 of 3 checks passed
@escapeboy escapeboy deleted the feat/0.5-s6-output-schema branch April 25, 2026 17:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[P2] Output schema validation as a first-class gate

1 participant