
Finish guardrails MVP floor#20

Merged
terisuke merged 1 commit into dev from codex/guardrails-mvp-floor
Apr 3, 2026

Conversation


@terisuke terisuke commented Apr 3, 2026

Summary

  • add scripted scenario and replay harness coverage for guardrail workflows
  • harden the guardrail plugin with MVP floor state tracking and prompt injection
  • align provider admission with current Z.AI coding-plan, OpenAI OAuth/Codex, and OpenRouter evaluation lanes
  • keep the thin distribution launcher working under ESM and repo-local .env usage
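The launcher work in the last bullet (ESM plus repo-local .env) can be pictured roughly as below. This is an illustrative sketch, not the actual `packages/guardrails/bin/opencode-guardrails` source; the `findRepoEnv` and `parseEnv` helper names are assumptions.

```typescript
// Sketch: repo-local .env discovery for an ESM launcher. Walks upward
// from a starting directory until a .env file is found, then parses
// simple KEY=VALUE lines (comments and blanks ignored).
import { existsSync } from "node:fs"
import { dirname, join } from "node:path"

export function findRepoEnv(startDir: string): string | undefined {
  let dir = startDir
  while (true) {
    const candidate = join(dir, ".env")
    if (existsSync(candidate)) return candidate
    const parent = dirname(dir)
    if (parent === dir) return undefined // reached the filesystem root
    dir = parent
  }
}

export function parseEnv(text: string): Record<string, string> {
  const out: Record<string, string> = {}
  for (const line of text.split("\n")) {
    const trimmed = line.trim()
    if (!trimmed || trimmed.startsWith("#")) continue
    const eq = trimmed.indexOf("=")
    if (eq === -1) continue
    out[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim()
  }
  return out
}
```

A launcher would call `findRepoEnv(process.cwd())` before spawning opencode, so credentials in a checked-out repo's `.env` take effect without global configuration.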

Verification

  • cd packages/opencode && bun test test/scenario/guardrails.test.ts
  • cd packages/opencode && bun typecheck
  • git push hook: bun turbo typecheck
  • live smoke: openai/gpt-5.4 -> OPENAI_OK
  • live smoke: zai-coding-plan/glm-5.1 -> ZAI_PLAN_OK

Closes #7
Closes #13

Copilot AI review requested due to automatic review settings April 3, 2026 09:12
@terisuke terisuke merged commit 6d0bc57 into dev Apr 3, 2026
@terisuke terisuke deleted the codex/guardrails-mvp-floor branch April 3, 2026 09:12

Copilot AI left a comment


Pull request overview

This PR completes the “guardrails MVP floor”. It adds deterministic scenario replays for guardrail workflows, hardens the guardrail plugin with file-backed runtime state and enforcement gates, and aligns provider admission with the updated Z.AI coding-plan lane and OpenAI OAuth/Codex/OpenRouter evaluation usage, while keeping the distribution launchers working under ESM and supporting repo-local .env loading.

Changes:

  • Add a scripted replay catalog + runtime harness to exercise guarded commands end-to-end using a deterministic fake LLM server.
  • Harden the guardrail plugin with MVP-floor state tracking (reads/edits/factcheck/review freshness), version baseline checks, and context-budget enforcement injected into review/ship/handoff prompts.
  • Update provider admission to include zai-coding-plan, broaden allowlists, and bump the OpenRouter evaluation agent default model.
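The state tracking in the second bullet can be pictured as a small freshness gate that decides what to inject into review/ship/handoff prompts. The sketch below is illustrative only; the `MvpFloorState` shape and function names are assumptions, not the plugin's real types.

```typescript
// Sketch: MVP-floor freshness gate. Tracks which guardrail steps are
// stale since the last edit and emits a note for prompt injection
// when the floor is not met.
export interface MvpFloorState {
  filesRead: Set<string>
  editsSinceFactcheck: number
  editsSinceReview: number
}

export function recordEdit(state: MvpFloorState, file: string): void {
  state.filesRead.delete(file) // an edited file must be re-read
  state.editsSinceFactcheck++
  state.editsSinceReview++
}

export function floorNote(state: MvpFloorState): string | undefined {
  const missing: string[] = []
  if (state.editsSinceFactcheck > 0) missing.push("factcheck is stale")
  if (state.editsSinceReview > 0) missing.push("review is stale")
  if (missing.length === 0) return undefined
  return `MVP floor not met: ${missing.join("; ")}. Re-run the stale steps before shipping.`
}
```

When `floorNote` returns a string, a plugin in this style would prepend it to the outgoing prompt; when it returns `undefined`, the workflow proceeds untouched.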

Reviewed changes

Copilot reviewed 12 out of 14 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
packages/opencode/test/scenario/replay.ts Adds scripted replay definitions (expected prompts/results/tool steps) for guarded workflows.
packages/opencode/test/scenario/harness.ts Adds an Effect-based replay runner that boots real session/plugin layers with a fake LLM server and asserts artifacts.
packages/opencode/test/scenario/guardrails.test.ts Expands scenario coverage (provider lanes, OAuth Codex visibility, MVP-floor plugin enforcement, and replay-based executability checks).
packages/opencode/bin/opencode Converts launcher to ESM and adds a Bun-based fallback execution path.
packages/guardrails/README.md Updates documentation to reflect zai-coding-plan as a first-class provider lane.
packages/guardrails/profile/plugins/guardrail.ts Implements MVP-floor plugin hardening: state tracking, baseline/version checks, context budget, and prompt injection for review/ship/handoff/compaction.
packages/guardrails/profile/opencode.json Adds zai-coding-plan provider + expands provider/model allowlists (incl. OpenAI Codex/OAuth and OpenRouter eval set).
packages/guardrails/profile/agents/provider-eval.md Updates provider-eval agent default model to openrouter/openai/gpt-5.4-mini.
packages/guardrails/managed/opencode.json Mirrors profile provider enablement + allowlists for managed deployments.
packages/guardrails/bin/opencode-guardrails Converts to ESM and adds repo-local .env discovery/loading before launching opencode with the guardrail profile.
docs/ai-guardrails/README.md Adds references and pointers to scenario replays + ADR boundary guidance.
docs/ai-guardrails/adr/006-plugin-hardening-floor.md New ADR defining the MVP hardening floor and explicit deferrals.
docs/ai-guardrails/adr/005-scripted-scenario-replays.md New ADR defining the replay approach and why it’s used.
docs/ai-guardrails/adr/002-provider-admission-lanes.md Updates admission-lane ADR to include zai-coding-plan in the standard confidential-code lane.
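A scripted replay entry of the kind `replay.ts` is described as adding might look like the following. This is a hypothetical shape for illustration; the `Replay` interface, field names, and the example entry are assumptions, not the file's actual contents.

```typescript
// Sketch: one scripted replay. It pins the prompt sent to the
// deterministic fake LLM server, the scripted completion it returns,
// and the tool calls the harness expects the session to execute.
export interface ReplayStep {
  tool: string
  args: Record<string, unknown>
}

export interface Replay {
  name: string
  prompt: string
  scriptedCompletion: string
  expectedSteps: ReplayStep[]
}

export const reviewReplay: Replay = {
  name: "guarded-review",
  prompt: "review the pending change",
  scriptedCompletion: "Running the guarded review workflow.",
  expectedSteps: [
    { tool: "read", args: { path: "src/change.ts" } },
    { tool: "review", args: { scope: "pending" } },
  ],
}
```

Because both the completion and the expected tool steps are fixed data, a harness can replay the catalog against real session/plugin layers and fail deterministically when enforcement behavior drifts.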


Comment on lines +504 to +531
const providers = await Provider.list()
const openai = providers[ProviderID.openai]
const models = Object.keys(openai.models).sort()

expect(openai).toBeDefined()
expect(models).toEqual(
  expect.arrayContaining([
    "gpt-5.1-codex",
    "gpt-5.1-codex-max",
    "gpt-5.1-codex-mini",
    "gpt-5.2",
    "gpt-5.2-codex",
    "gpt-5.3-codex",
    "gpt-5.4",
  ]),
)
expect(openai.models["gpt-5.4"]?.cost.input).toBe(0)
await expect(
  Plugin.trigger(
    "chat.params",
    {
      sessionID: "session_test",
      agent: "implement",
      model: openai.models["gpt-5.4"],
    },
    { temperature: undefined, topP: undefined, topK: undefined, options: {} },
  ),
).resolves.toEqual({ temperature: undefined, topP: undefined, topK: undefined, options: {} })

Copilot AI Apr 3, 2026


Auth.set("openai", ...) persists credentials to the shared test auth store (Global.Path.data/auth.json). Since the test run shares OPENCODE_TEST_HOME across files, leaving this OAuth entry behind can change provider-loading behavior in unrelated tests. Add cleanup for the OpenAI auth entry (e.g., Auth.remove("openai")) in a finally block in this test or in the file-level afterEach.

Suggested change (the snippet above, wrapped in try/finally):

try {
  const providers = await Provider.list()
  const openai = providers[ProviderID.openai]
  const models = Object.keys(openai.models).sort()

  expect(openai).toBeDefined()
  expect(models).toEqual(
    expect.arrayContaining([
      "gpt-5.1-codex",
      "gpt-5.1-codex-max",
      "gpt-5.1-codex-mini",
      "gpt-5.2",
      "gpt-5.2-codex",
      "gpt-5.3-codex",
      "gpt-5.4",
    ]),
  )
  expect(openai.models["gpt-5.4"]?.cost.input).toBe(0)
  await expect(
    Plugin.trigger(
      "chat.params",
      {
        sessionID: "session_test",
        agent: "implement",
        model: openai.models["gpt-5.4"],
      },
      { temperature: undefined, topP: undefined, topK: undefined, options: {} },
    ),
  ).resolves.toEqual({ temperature: undefined, topP: undefined, topK: undefined, options: {} })
} finally {
  await Auth.remove("openai")
}
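The review comment also mentions a file-level afterEach as an alternative to try/finally. A minimal sketch of that pattern is below; the in-memory `Auth` object stands in for the repo's real Auth module (which persists to Global.Path.data/auth.json), so everything here is illustrative.

```typescript
// Sketch: file-level cleanup of a leaked auth entry. The Map is an
// in-memory stand-in for the real, file-backed auth store.
const authStore = new Map<string, { type: string; token: string }>()

export const Auth = {
  set: (id: string, entry: { type: string; token: string }) => authStore.set(id, entry),
  remove: (id: string) => authStore.delete(id),
  has: (id: string) => authStore.has(id),
}

// In a bun test file this would be registered once at the top level:
//   afterEach(async () => { await Auth.remove("openai") })
// so every test that calls Auth.set("openai", ...) is cleaned up even
// when an assertion throws mid-test.
export function afterEachCleanup(): void {
  Auth.remove("openai")
}
```

The advantage over try/finally is that one hook covers every test in the file, so a newly added test cannot forget the cleanup.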



Development

Successfully merging this pull request may close these issues.

  • guardrails: plugin hardening wave 2 for MVP floor
  • guardrails: scenario and replay harness
