rfc: ADR-0009 sandbox cartridge + ADR-0010 cross-machine federation (epic #87 items 2 + 3) #93
Merged
… standards#98)

Adds `elixir/test/phase_c_seam_test.exs`, a Phase C seam-test module that complements http-capability-gateway#11 (gateway-side X-Trust-Level strip + re-emit) by documenting the BoJ-side half of the §3 defence-in-depth pair.

## Live tests (4 passing)

* Loopback callers (127.0.0.1 + ::1) honour gateway-forwarded X-Trust-Level (the gateway-equivalent path).
* :public cartridge accepts a non-loopback caller regardless of header.
* `TrustPolicy.satisfies?/3` accepts every trust claim when `is_local: true`.

## Skipped tests (5; they document a finding)

Phase A contract §3 invariant 3 states:

> Any X-Trust-Level arriving from any other source MUST be ignored
> and treated as untrusted.

`BojRest.TrustPolicy.satisfies?/3` does not currently enforce this: its third clause (`satisfies?(:authenticated, trust, _local) when trust in ["authenticated", "internal"]`) matches regardless of `is_local`. A non-loopback caller reaching BoJ's back-side bind (a §4 violation) can therefore claim any trust class by setting a header.

Mitigation today: §4 (back-side bind isolation) keeps the non-loopback path unreachable in well-configured deployments. The §3 invariant is nonetheless "mandatory, not advisory" per the contract.

The 5 skipped tests are tagged `@tag skip: <reason>`; they will pass as-is when the fix lands (one additional clause in `satisfies?/3`: `def satisfies?(_required, _trust, false), do: false` between the `:public` and `:authenticated` clauses).

Tests-only PR: production code, the bug-codifying assertions in `trust_policy_test.exs` / `router_test.exs`, and the contract-doc implementation note are deliberately NOT included this round, pending owner decision on the §3 enforcement (separate follow-up PR).

`mix test` 188 → 186 + 5 skipped = same coverage, +5 skipped, +4 live; 0 failures.

Refs hyperpolymath/standards#98
Refs hyperpolymath/standards#91

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
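The clause-ordering finding above can be illustrated with a small language-neutral model. This is a sketch in Python, not the Elixir source: the function names `satisfies_current` and `satisfies_fixed` are hypothetical, and denying any requirement other than `public` and `authenticated` is an assumption made only to keep the model closed.

```python
from typing import Optional

# Trust claims that satisfy an :authenticated requirement, as quoted in the commit message.
AUTHENTICATED_CLAIMS = ("authenticated", "internal")


def satisfies_current(required: str, trust: Optional[str], is_local: bool) -> bool:
    """Model of the reported behaviour: is_local is never consulted."""
    if required == "public":
        return True
    if required == "authenticated":
        return trust in AUTHENTICATED_CLAIMS
    return False  # assumption: requirements not modelled here are denied


def satisfies_fixed(required: str, trust: Optional[str], is_local: bool) -> bool:
    """Model of the proposed fix: one extra clause between public and authenticated."""
    if required == "public":
        return True
    if not is_local:
        # Mirrors `def satisfies?(_required, _trust, false), do: false`:
        # a header from a non-loopback caller is ignored.
        return False
    if required == "authenticated":
        return trust in AUTHENTICATED_CLAIMS
    return False  # assumption: requirements not modelled here are denied
```

In this model a non-loopback caller sending `internal` satisfies `authenticated` under `satisfies_current` but not under `satisfies_fixed`, while `public` stays reachable in both.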
…ederation

Second coupled RFC pair for epic #87 Tier B (items 2 + 3). Coupled because sandbox-mcp is machine-local by construction (ADR-0009) and the federation design in ADR-0010 must remain aware of that constraint (federation coordinates *which peer* runs a sandbox; sandbox handles themselves never cross machines).

ADR-0009: Sandbox cartridge
- Multi-provider, tier-gated code execution as one MCP cartridge
- Five backends: e2b, Modal, CodeSandbox, Replit, and `local` (Podman + bubblewrap, the SaaS-free floor)
- Provider differences (isolation_level / cold_start_ms / language_support / egress_policy / attestation) deliberately surfaced, not abstracted
- Tier-gating: capabilities.network=true flips sandbox_exec from tier-2 to tier-4; policy engine (ADR-0007) computes effective tier per-call
- Wires explicitly with panic-attack-mcp (pre-flight) and vordr-mcp (post-flight): three-cartridge canonical untrusted-execution flow
- Sandboxes bound to peer tokens; lifetime ≤ peer session lifetime
- 7 tools: sandbox_create/exec/read/write/install/destroy/list

ADR-0010: Cross-machine coord federation
- Promote local-coord-mcp from loopback-only to federated
- Three pillars: DID identity (did:boj:peer:...) / ML-KEM-1024 key exchange / federated quarantine
- ML-DSA-87 signature + ChaCha20-Poly1305 AEAD on the wire
- Three topology variants: mesh, hub, hub-and-rim (recommended for prod)
- Master-uniqueness invariant federated via HOTSTUFF-style election
- Opt-in per machine via COORD_FEDERATED=true + signed roster
- Post-quantum on the wire = end-to-end EXHIBIT-B compliance
- 6-stage implementation plan, ~6 weeks total

Both RFCs include consequences (positive + negative), explicit non-goals, and open questions calling out decisions deferred to implementation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
🔍 Hypatia Security Scan
Findings: 29 issues detected
[
{
"reason": "Stale AI session file -- delete",
"type": "stale",
"file": "GEMINI.md",
"action": "delete",
"rule_module": "root_hygiene",
"severity": "medium"
},
{
"reason": "Issue in quality.yml",
"type": "missing_workflow",
"file": "quality.yml",
"action": "create",
"rule_module": "workflow_audit",
"severity": "high"
},
{
"reason": "Issue in security-policy.yml",
"type": "missing_workflow",
"file": "security-policy.yml",
"action": "create",
"rule_module": "workflow_audit",
"severity": "medium"
},
{
"reason": "Action hyperpolymath/standards/.github/workflows/governance-reusable.yml@main needs attention",
"type": "unpinned_action",
"file": "governance.yml",
"action": "pin_sha",
"rule_module": "workflow_audit",
"severity": "high"
},
{
"reason": "Python file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/.github/scripts/validate-eclexiaiser.py",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/sanctify-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/academic-workflow-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/fireflag-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/ephapax-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
},
{
"reason": "TypeScript file detected -- banned language",
"type": "banned_language_file",
"file": "/home/runner/work/boj-server/boj-server/cartridges/bofig-mcp/adapter/mod.ts",
"action": "flag",
"rule_module": "cicd_rules",
"severity": "critical"
}
]
Powered by Hypatia Neurosymbolic CI/CD Intelligence
This was referenced May 20, 2026
Merged
hyperpolymath added a commit that referenced this pull request on May 20, 2026
…epic #87 items 5 + 6) (#95)

## Summary

Third and final RFC pair for epic #87 Tier B. Both ADRs concern **server-initiated MCP messages**: webhooks fan out `notifications/event` to clients; sampling sends `sampling/createMessage` to clients. Both are opt-in per client and require graceful fallback when the client doesn't cooperate.

**No code in this PR.** Pure design docs.

## ADR-0011: Webhooks inbound + MCP notifications

Closes the agent feedback loop: external events surface as MCP `notifications/event` instead of forcing agents to poll.

- **Six providers v1**: github, gitlab, cloudflare, sentry, stripe, generic
- **Single listener**: `POST /webhooks/{provider}/{token}` on the existing Cowboy endpoint (ADR-0004 preserved)
- **Per-provider signature verification**: HMAC-SHA256, JWS as appropriate; signature-rejected events never reach the notification path
- **Subscription persistence** at `~/.boj/webhooks/` (chmod 0600); managed via 5 new bridge tools (`boj_webhook_subscribe/list/unsubscribe/rotate/replay`)
- **Bounded replay buffer** (100 events × N subscriptions) so reconnecting clients catch up
- **Fan-out by selector**: broadcast, by `client_kind`, or by `peer_token`

## ADR-0012: Server-initiated sampling

`sampling/createMessage` is MCP's underused reverse path. BoJ uses it for **two specific patterns**, both opt-in per call site:

| Pattern | When |
|---|---|
| Composition router | `boj_cartridge_invoke` against an ambiguous intent (e.g. "deploy and monitor"): ask the LLM which cartridge fits |
| Clarification | An argument could match multiple backends: ask the LLM (which knows the user) which one |

**Budget-bounded**: `BOJ_SAMPLING_BUDGET_PER_SESSION` (default 50); exceeded → deterministic fallback.
**Always has fallback**: client rejection or timeout never blocks.
**Hard NO list**: never in security-critical paths, never to ask "should I proceed?", never for input validation.
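The budget-and-fallback discipline described for ADR-0012 can be sketched as follows. This is an illustration under stated assumptions, not the ADR's design: the class name `SamplingBudget`, the `decide` method, and the choice to count a failed attempt against the budget are all hypothetical. Only the environment variable name and its default of 50 come from the text above.

```python
import os


class SamplingBudget:
    """Per-session sampling budget with a deterministic fallback (hypothetical sketch)."""

    def __init__(self, limit=None):
        if limit is None:
            # Variable name and default are taken from the ADR summary.
            limit = int(os.environ.get("BOJ_SAMPLING_BUDGET_PER_SESSION", "50"))
        self.limit = limit
        self.used = 0

    def decide(self, sample, fallback):
        """Return sample() while budget remains; otherwise, or on failure, fallback()."""
        if self.used >= self.limit:
            return fallback()  # budget exceeded: deterministic path
        self.used += 1  # assumption: an attempt counts even if the client refuses
        try:
            answer = sample()
        except Exception:
            # Client rejection or timeout never blocks the caller.
            return fallback()
        return answer if answer is not None else fallback()
```

A call site would pass a closure that issues `sampling/createMessage` as `sample` and its deterministic routing rule as `fallback`, so the caller never needs to know which path produced the answer.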
Per-call OTel span (depends on PR #91) so sampling activity is observable in the user's existing telemetry.

## Why this pair

Both are **server-initiated**, both are **opt-in per client**, both **degrade gracefully**. Coupling them in one review surfaces the shared protocol concerns (client cooperation, fallback discipline, audit attribution) in one place.

## Review focus

Recommend reading **Open questions** in both:

- ADR-0011: subscription persistence across restarts; backpressure on slow clients; provider extensibility (config vs code); authorization tier for subscription creation
- ADR-0012: system-prompt safety; sampling-decision caching; multi-client target heuristic; cost attribution; fallback determinism

## What this doesn't change

- No code touched.
- No existing tools or cartridges affected.
- ADR-0004 (single listener) preserved by 0011; ADR-0007 (policy DSL) extended by both (sampling result feeds a policy-gated tool call; webhook subscriptions are policy-gated artefacts).

## Sequencing

This completes epic #87 Tier B (all 6 RFCs across PRs #92, #93, this one). Tier A is complete in code (#89, #91). Tier C is the long-running proof campaigns: separate work, no RFCs needed at this stage.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
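Two of the ADR-0011 points in this commit message, HMAC-SHA256 signature verification and the bounded replay buffer, fit in one short sketch. It assumes a hex-encoded signature and a per-subscription secret; the names `signature_ok`, `Subscription`, and `receive` are hypothetical, and real providers each define their own header format and signing input.

```python
import hashlib
import hmac
from collections import deque

REPLAY_LIMIT = 100  # per-subscription bound, from "100 events × N subscriptions"


def signature_ok(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Recompute HMAC-SHA256 over the raw body and compare in constant time."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), signature_hex.encode())


class Subscription:
    """One webhook subscription with its own bounded replay buffer (hypothetical)."""

    def __init__(self, secret: bytes):
        self.secret = secret
        # deque(maxlen=...) silently drops the oldest event once the bound is hit.
        self.replay = deque(maxlen=REPLAY_LIMIT)

    def receive(self, body: bytes, signature_hex: str) -> bool:
        # Signature-rejected events never reach the buffer or the notification path.
        if not signature_ok(self.secret, body, signature_hex):
            return False
        self.replay.append(body)
        return True
```

A reconnecting client would be served from `replay`, which holds at most the 100 most recent verified events for that subscription.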
14 tasks
Summary
Second coupled RFC pair for epic #87 Tier B. Coupled because sandbox-mcp is machine-local by construction and the federation design must remain aware of that — federation coordinates which peer runs a sandbox; sandbox handles themselves never cross machines.
No code in this PR. Pure design docs.
ADR-0009 — Sandbox cartridge
Multi-provider, tier-gated code execution as one MCP cartridge with five swappable backends: e2b, Modal, CodeSandbox, Replit, and `local` (Podman + bubblewrap, the SaaS-free floor).
Tier gating: `capabilities.network: true` flips `sandbox_exec` from tier-2 to tier-4, requiring master approval. Policy engine (ADR-0007) computes effective tier per-call.
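As a rough illustration of the tier flip, the per-call computation might look like the sketch below. The function names and the shape of the `capabilities` mapping are assumptions; the ADR-0007 policy engine is not shown on this page and may weigh more inputs than the network flag.

```python
BASE_TIER = 2     # sandbox_exec without network access
NETWORK_TIER = 4  # sandbox_exec with capabilities.network set to true


def effective_tier(capabilities: dict) -> int:
    """Compute the tier for one sandbox_exec call from its requested capabilities."""
    return NETWORK_TIER if capabilities.get("network") is True else BASE_TIER


def requires_master_approval(tier: int) -> bool:
    # Assumption: tier 4 is the approval threshold; the text only states
    # that the tier-4 case requires master approval.
    return tier >= NETWORK_TIER
```

Evaluating this on every call, rather than once at sandbox creation, is what lets a sandbox stay at tier 2 until a call actually asks for network access.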
Wires explicitly with `panic-attack-mcp` (pre-flight static analysis) + `vordr-mcp` (post-flight integrity). Three-cartridge canonical untrusted-execution flow.
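The three-cartridge flow can be pictured as a pre-flight gate, an execution step, and a post-flight gate around one sandbox handle. In this sketch every callable is a hypothetical stand-in injected by the caller (for `panic-attack-mcp`, the sandbox tools, and `vordr-mcp` respectively); the status strings are likewise illustrative.

```python
def run_untrusted(code, preflight, create, execute, postflight, destroy) -> dict:
    """Run code through pre-flight analysis, sandboxed execution, and post-flight checks."""
    if not preflight(code):
        # Static analysis rejected the code; no sandbox is ever created.
        return {"status": "rejected_preflight"}
    handle = create()
    try:
        result = execute(handle, code)
        if not postflight(handle, result):
            return {"status": "failed_postflight"}
        return {"status": "ok", "result": result}
    finally:
        # The sandbox is torn down whether or not execution succeeded.
        destroy(handle)
```

Passing the stages in as callables keeps the sketch independent of how the real cartridges are invoked.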
ADR-0010 — Cross-machine coord federation
Promote `local-coord-mcp` from loopback-only to federated. Three pillars: DID identity (`did:boj:peer:...`), ML-KEM-1024 key exchange, and federated quarantine.
Three topology variants: mesh / hub / hub-and-rim (recommended for production — dedicated routing machine keeps the federation up when any LLM-peer machine goes down).
Opt-in per machine via `COORD_FEDERATED=true` + signed roster. No insecure-federation mode — federation without crypto is not federation.
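A minimal sketch of the opt-in rule, assuming the roster signature has already been checked elsewhere and its outcome is passed in as a boolean. The function name `coord_mode`, the exact-match parsing of `true`, and the choice to raise instead of silently staying on loopback are all assumptions.

```python
def coord_mode(env: dict, roster_verified: bool) -> str:
    """Decide whether this machine runs loopback-only or federated."""
    if env.get("COORD_FEDERATED") != "true":
        return "loopback"  # federation is opt-in; the default stays local
    if not roster_verified:
        # There is no insecure-federation mode: refuse instead of downgrading.
        raise RuntimeError("COORD_FEDERATED=true but no verified signed roster")
    return "federated"
```

Failing loudly on a missing or unverified roster keeps a misconfigured machine from appearing federated while actually running without the cryptographic guarantees.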
Implementation plan is six staged sub-RFCs (~6 weeks total) so the campaign can be reviewed in chunks rather than as a single mega-PR.
Why this pair
Sandbox is the biggest execution-blast-radius operation BoJ exposes; federation is the most architecturally far-reaching extension. Both depend on ADR-0007 (policy DSL) landing first. Both touch the trust-tier model from different angles — execution and identity. Coupling them in one review surfaces the interactions early.
Review focus
Recommend reading the Open questions section in both ADRs.
The federation RFC is the largest commitment in epic #87 — call out anything in the staged plan that feels off-sequence.
What this doesn't change
Sequencing
Per epic #87: this is Tier B step 2 of 3. The third pair (items 5 + 6: webhooks + sampling) follows. All three pairs are independent of each other and of the code PRs (#88/#89/#91).
🤖 Generated with Claude Code