feat: enforce agent capabilities in propose()#8
Conversation
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughA test case was added to verify escalation behavior when a global Changes
Estimated code review effort🎯 1 (Trivial) | ⏱️ ~3 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 033c670a93
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| // Agent capability check (only if proposer is registered) | ||
| const agent = this.agents.get(proposal.proposer); | ||
| if (agent !== undefined) { | ||
| if (!matchesCapability(agent.capabilities, proposal.action.type)) { |
There was a problem hiding this comment.
Guard capability checks against invalid agent shapes
propose() now calls matchesCapability(agent.capabilities, ...) for registered proposers, but registerAgent() does not enforce runtime shape. In JavaScript usage, registering an agent with capabilities: null (or non-string entries) causes propose() to throw (.some/.split type errors) instead of returning a decision, so a malformed registration can crash authorization flow. Please validate/coerce capabilities before calling the matcher.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
別タスクで対応する。registerAgent() のバリデーション強化として一括対応予定。capabilities が配列かつ各要素が string であることを検証する。
| const agentRiskScore = proposal.action.riskScore; | ||
| if ( | ||
| agentRiskScore !== undefined && | ||
| agentRiskScore > agent.maxApprovableRisk |
There was a problem hiding this comment.
Reject non-finite maxApprovableRisk before risk comparison
The per-agent risk gate assumes agent.maxApprovableRisk is a valid number, but for non-typed callers it can be undefined/NaN/string and make agentRiskScore > agent.maxApprovableRisk evaluate false. That silently disables the new agent-specific escalation check and can auto-approve proposals that should be escalated by agent policy. Add runtime validation for maxApprovableRisk at registration (or before comparison).
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
別タスクで対応する。上記と同じく registerAgent() のバリデーション強化として一括対応。
There was a problem hiding this comment.
Code Review
This pull request introduces agent capability and risk score validation within the propose method, ensuring that registered agents are escalated if they lack necessary permissions or exceed their risk thresholds. New test cases verify these checks, including wildcard matching and handling of unregistered agents. Feedback suggests refactoring the escalation logic in src/core/runtime.ts to reduce code duplication when recording audit logs and returning decisions.
| if (!matchesCapability(agent.capabilities, proposal.action.type)) { | ||
| const capDecision: Decision = { | ||
| proposalId, | ||
| proposal, | ||
| status: "escalated", | ||
| tier: resolved.tier, | ||
| reviewers: [...resolved.reviewers], | ||
| reason: "proposer lacks capability for this action type", | ||
| timestamp, | ||
| }; | ||
|
|
||
| this.audit.record({ | ||
| proposalId, | ||
| eventType: "escalated", | ||
| proposal, | ||
| decision: capDecision, | ||
| timestamp: new Date().toISOString(), | ||
| }); | ||
|
|
||
| return capDecision; | ||
| } | ||
|
|
||
| const agentRiskScore = proposal.action.riskScore; | ||
| if ( | ||
| agentRiskScore !== undefined && | ||
| agentRiskScore > agent.maxApprovableRisk | ||
| ) { | ||
| const riskDecision: Decision = { | ||
| proposalId, | ||
| proposal, | ||
| status: "escalated", | ||
| tier: resolved.tier, | ||
| reviewers: [...resolved.reviewers], | ||
| reason: "riskScore exceeds agent's maxApprovableRisk", | ||
| timestamp, | ||
| }; | ||
|
|
||
| this.audit.record({ | ||
| proposalId, | ||
| eventType: "escalated", | ||
| proposal, | ||
| decision: riskDecision, | ||
| timestamp: new Date().toISOString(), | ||
| }); | ||
|
|
||
| return riskDecision; | ||
| } |
There was a problem hiding this comment.
The logic for handling agent capability and risk score escalation is duplicated across two blocks. This can be refactored to improve maintainability and reduce redundancy by determining the escalation reason first and then performing the audit recording and return in a single block.
let escalationReason: string | undefined;
if (!matchesCapability(agent.capabilities, proposal.action.type)) {
escalationReason = "proposer lacks capability for this action type";
} else if (
proposal.action.riskScore !== undefined &&
proposal.action.riskScore > agent.maxApprovableRisk
) {
escalationReason = "riskScore exceeds agent's maxApprovableRisk";
}
if (escalationReason) {
const agentDecision: Decision = {
proposalId,
proposal,
status: "escalated",
tier: resolved.tier,
reviewers: [...resolved.reviewers],
reason: escalationReason,
timestamp,
};
this.audit.record({
proposalId,
eventType: "escalated",
proposal,
decision: agentDecision,
timestamp: new Date().toISOString(),
});
return agentDecision;
}There was a problem hiding this comment.
対応しない。capDecision と riskDecision は現状では構造が似ているが、将来的に異なる情報(具体的な capability 名、超過量等)を含める可能性がある。~50 行の重複は許容範囲で、ヘルパー化するとコードの流れが追いにくくなる。
There was a problem hiding this comment.
🧹 Nitpick comments (2)
src/core/runtime.ts (1)
166-216: Consider a fail-closed option for unknown proposers.This preserves zero-config compatibility, but a typoed or spoofed
proposal.proposerstill skips both new checks and falls back to the legacy policy/threshold path. An opt-inrequireRegisteredProposermode, or at least a distinct unknown-agent escalation/audit path, would make capability enforcement much safer to roll out.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/core/runtime.ts` around lines 166 - 216, Add a fail-closed path for unknown proposers by honoring a config flag (e.g., requireRegisteredProposer) on the runtime: if this.agents.get(proposal.proposer) returns undefined and this.requireRegisteredProposer is true, construct an escalated Decision (use same shape as capDecision/riskDecision, include proposalId, proposal, status:"escalated", tier:resolved.tier, reviewers:[...resolved.reviewers], reason:"unknown proposer" and timestamp) call this.audit.record with eventType:"escalated" and return that decision; otherwise preserve the existing legacy policy/threshold behavior. Reference this.agents, proposal.proposer, matchesCapability, Decision, and this.audit.record when locating where to add the check.src/core/runtime.test.ts (1)
379-467: Add a registered-agent case that still trips the global threshold.These tests prove the new early escalations, but they never show that a registered agent still hits
escalateAboveafter capability andmaxApprovableRiskboth pass. A bug that short-circuits registered agents to approval would still satisfy this block.Suggested test
+ it("registered agent still respects global escalateAbove after agent checks pass", async () => { + const tegata = new Tegata({ escalateAbove: 70 }); + const registered = tegata.registerAgent({ + id: "ci-bot", + name: "CI Bot", + role: "proposer", + capabilities: ["ci:*:deploy"], + maxApprovableRisk: 90, + }); + expect(registered.ok).toBe(true); + + const decision = await tegata.propose({ + proposer: "ci-bot", + action: { type: "ci:production:deploy", riskScore: 80 }, + }); + + expect(decision.status).toBe("escalated"); + expect(decision.reason).toContain("threshold"); + });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/core/runtime.test.ts` around lines 379 - 467, Add a test that verifies a registered agent who has matching capability and whose maxApprovableRisk allows the action still gets escalated when the action.riskScore exceeds the global escalateAbove threshold: create a Tegata, register an agent (e.g., id "ci-bot") with a capability that matches the action type and maxApprovableRisk above the action's riskScore check, then call Tegata.propose with an action whose riskScore is greater than the global escalateAbove (the current tests imply 70) and assert decision.status is "escalated" and decision.reason contains the global-threshold message; place this alongside the other it(...) tests referencing Tegata.registerAgent and Tegata.propose.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/core/runtime.test.ts`:
- Around line 379-467: Add a test that verifies a registered agent who has
matching capability and whose maxApprovableRisk allows the action still gets
escalated when the action.riskScore exceeds the global escalateAbove threshold:
create a Tegata, register an agent (e.g., id "ci-bot") with a capability that
matches the action type and maxApprovableRisk above the action's riskScore
check, then call Tegata.propose with an action whose riskScore is greater than
the global escalateAbove (the current tests imply 70) and assert decision.status
is "escalated" and decision.reason contains the global-threshold message; place
this alongside the other it(...) tests referencing Tegata.registerAgent and
Tegata.propose.
In `@src/core/runtime.ts`:
- Around line 166-216: Add a fail-closed path for unknown proposers by honoring
a config flag (e.g., requireRegisteredProposer) on the runtime: if
this.agents.get(proposal.proposer) returns undefined and
this.requireRegisteredProposer is true, construct an escalated Decision (use
same shape as capDecision/riskDecision, include proposalId, proposal,
status:"escalated", tier:resolved.tier, reviewers:[...resolved.reviewers],
reason:"unknown proposer" and timestamp) call this.audit.record with
eventType:"escalated" and return that decision; otherwise preserve the existing
legacy policy/threshold behavior. Reference this.agents, proposal.proposer,
matchesCapability, Decision, and this.audit.record when locating where to add
the check.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8bc9c326-63a0-47c4-b7ff-32b895b7b4ca
📒 Files selected for processing (2)
src/core/runtime.test.tssrc/core/runtime.ts
|
CodeRabbit nitpick への回答: 1. 2. registered agent が global threshold で引っかかるテスト追加 — 対応する。正当なテストギャップ。agent checks(capability + maxApprovableRisk)通過後に global |
Registered agent that passes capability + maxApprovableRisk checks should still be escalated when riskScore exceeds global escalateAbove. Addresses CodeRabbit review feedback on PR #8. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Registered agent that passes capability + maxApprovableRisk checks should still be escalated when riskScore exceeds global escalateAbove. Addresses CodeRabbit review feedback on PR #8. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
f0b92f0 to
9b938ae
Compare
* test: add global threshold test for registered agents Registered agent that passes capability + maxApprovableRisk checks should still be escalated when riskScore exceeds global escalateAbove. Addresses CodeRabbit review feedback on PR #8. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * feat: implement review/approve handler flow PolicyRule is now a discriminated union (AutoPolicy | ReviewPolicy | ApprovePolicy). Review and approve tiers require a handler at the type level — no runtime validation needed. Handlers return ReviewResult (approved | denied); escalation and timeout are Tegata's responsibility. - Add ReviewResult, ReviewHandler, ApprovalHandler types - Split PolicyRule into discriminated union keyed on tier - Add Decision.decidedBy for audit attribution - Extend ResolvedPolicy with handler/timeoutMs - Switch-based resolvePolicy for exhaustive case handling - clonePolicyRule helper (structuredClone can't clone functions) - executeHandler with Promise.race timeout + .then(ok, err) - propose() review/approve: pending → handler → decided/timed_out/escalated - ADR-004: design decisions and PR strategy Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: harden handler execution for LLM/agent safety - Wrap handler call in Promise.resolve().then() to catch sync throws - Clone proposal before passing to handler (prevent mutation) - Validate handler result shape at trust boundary (status, decidedBy) - Sanitize handler error messages in Decision.reason (no leaking) - Add tests: sync throw, proposal mutation, invalid shape, empty decidedBy, error message sanitization Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Added agent capability checking in propose(). Registered agents are checked for capability match and maxApprovableRisk before the global threshold check. Unregistered proposers skip these checks for zero-config compatibility. 5 new test cases added.
Summary by CodeRabbit