Add opt-in agent self-approval gate #166

Open
sepo-agent-app[bot] wants to merge 4 commits into main from agent/implement-issue-165/codex-25508067922


@sepo-agent-app sepo-agent-app Bot commented May 7, 2026

Summary

  • Add agent-self-approve as an internal, orchestrator-dispatched PR approval gate controlled by AGENT_ALLOW_SELF_APPROVE.
  • Add deterministic CLIs to preflight PR head state, parse structured self-approval verdicts, and submit approval only for matching open PR heads.
  • Extend orchestrator handoff policy, delegated authorization checks, prompts, workflow docs, and config docs.
  • Add focused tests for self-approval parsing, stale-head rejection, opt-in dispatch, and delegated route authorization.

Verification

  • npm run build
  • node --test dist/__tests__/self-approval.test.js dist/__tests__/handoff.test.js dist/__tests__/triage.test.js dist/__tests__/orchestrator-preflight-cli.test.js dist/__tests__/orchestrate-handoff-cli.test.js
  • YAML parse check for agent-self-approve.yml and agent-orchestrator.yml
  • git diff --check

Closes #165

@sepo-agent-app sepo-agent-app Bot added the agent label May 7, 2026
@lolipopshock lolipopshock marked this pull request as ready for review May 7, 2026 16:36
@lolipopshock
Contributor

@sepo-agent /orchestrate there's a merge conflict to main after merging #162 can you check and fix


claude_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
permission_mode: approve-all
prompt: agent-self-approve
route: agent-self-approve

run.ts validates each run-agent-task envelope against VALID_ROUTES, but this PR adds the prompt mapping without adding agent-self-approve to .agent/src/envelope.ts. With this route, the self-approval workflow will fail envelope validation before the prompt runs. Please add the route to the allowlist and cover it in validateEnvelope tests.
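
The requested fix can be sketched as follows. This is a minimal illustration, not the repository's code: only `VALID_ROUTES`, `validateEnvelope`, and the `agent-self-approve` route are named in the comment above, while the `Envelope` shape, the other allowlist entries, and the error-list return type are assumptions made for the example.

```typescript
// Hypothetical envelope shape; the real type lives in .agent/src/envelope.ts.
interface Envelope {
  route: string;
  prompt: string;
}

// Assumed allowlist. "agent-self-approve" is the addition under discussion;
// the other entries are placeholders borrowed from actions named in this thread.
const VALID_ROUTES: ReadonlySet<string> = new Set([
  "review",
  "fix-pr",
  "agent-self-approve",
]);

// Returns a list of validation errors; an empty list means the envelope is valid.
function validateEnvelope(envelope: Envelope): string[] {
  const errors: string[] = [];
  if (!VALID_ROUTES.has(envelope.route)) {
    errors.push(`unknown route: ${envelope.route}`);
  }
  if (!envelope.prompt.trim()) {
    errors.push("prompt must not be empty");
  }
  return errors;
}
```

A validateEnvelope test would then assert that the new route produces no errors and that an unknown route is still rejected.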

github_token: ${{ steps.auth.outputs.token }}
openai_api_key: ${{ secrets.OPENAI_API_KEY }}
claude_oauth_token: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
permission_mode: approve-all

The approval decision is supposed to be enforced only by deterministic post-processing, but the inspection agent runs with permission_mode: approve-all while receiving a write-capable GitHub token. That gives a prompt-injected or mistaken agent a path to call gh and approve or comment before resolve-self-approve.js checks the verdict and head SHA. Please run this step with read-only tool permissions, or otherwise withhold write-capable GitHub auth until the deterministic approval step.

};
}

if (inspectedHeadSha && inspectedHeadSha !== expectedHeadSha) {

inspected_head_sha is optional in practice here because a blank value skips this check and an APPROVE verdict can still set shouldApprove: true. The prompt makes that field part of the structured contract, and the safety guarantee depends on proving which head the agent inspected. Please reject blank inspected head SHA for approvals and add a regression test.
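
A reduced sketch of the stricter head check and its regression cases, under stated assumptions: `headCheck` and the `Verdict` type are hypothetical names for this example, and the function isolates only the inspected-head logic rather than reproducing the resolver in self-approval.ts.

```typescript
type Verdict = "approve" | "request_changes" | "blocked";

// Returns a blocking reason, or null when the inspected head is acceptable.
// Hypothetical helper: it isolates the head check from the rest of the resolver.
function headCheck(
  verdict: Verdict,
  inspectedHeadSha: string,
  expectedHeadSha: string,
): string | null {
  const inspected = inspectedHeadSha.trim();
  const expected = expectedHeadSha.trim();
  // An approval must prove which head was inspected, so blank is rejected.
  if (verdict === "approve" && !inspected) {
    return "approval verdict was missing inspected head SHA";
  }
  // Any reported head must match the head captured before the agent ran.
  if (inspected && inspected !== expected) {
    return "agent reported a different inspected head SHA";
  }
  return null;
}
```

The regression test the comment asks for corresponds to the first branch: an `approve` verdict with a blank or whitespace-only SHA must return a blocking reason instead of falling through.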

TARGET_NUMBER: ${{ inputs.pr_number }}
run: node .agent/dist/cli/resolve-self-approve.js

- name: Post self-approval status

If resolve-self-approve.js fails while submitting the approval, this status step is skipped because the condition lacks always(), so the PR gets no visible stop or failure comment. Please make the status/artifact path run on result-step failure, or catch submission errors and write a failed status body from the resolver.

name: Agent / Self Approve

on:
  workflow_dispatch:

Because this workflow is directly workflow_dispatch-able, an enabled repo can run self-approval on any open PR without proving it came from the intended review -> SHIP handoff. Please have preflight require trusted orchestration context or deterministically verify the latest trusted review/rubrics signal before approval.

id: prepare
env:
AGENT_ALLOW_SELF_APPROVE: ${{ vars.AGENT_ALLOW_SELF_APPROVE || 'false' }}
GH_TOKEN: ${{ github.token }}

This preflight uses github.token to determine the trusted actor, but review synthesis and rubrics comments are posted with the resolved App/PAT token in agent-review.yml. In the normal Sepo App setup this compares github-actions[bot] to sepo-agent-app, so self-approval always blocks. Resolve GitHub auth before prepare and use that token for provenance reads, while still passing only the read-only token to the agent step.
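
The mismatch described above can be illustrated with a small comparison helper. `normalizeActorLogin` is named elsewhere in this PR, but its body is not shown in the thread, so the implementation below (lowercasing and stripping a trailing `[bot]`) is an assumption, and `isTrustedAuthor` is a hypothetical wrapper.

```typescript
// Assumed normalization: case-insensitive, ignoring a trailing "[bot]" suffix.
function normalizeActorLogin(login: string): string {
  return login.trim().toLowerCase().replace(/\[bot\]$/, "");
}

// Hypothetical check: a comment is trusted only when its author matches the
// actor behind the token that the preflight authenticated with.
function isTrustedAuthor(commentAuthor: string, authenticatedActor: string): boolean {
  const trusted = normalizeActorLogin(authenticatedActor);
  return trusted !== "" && normalizeActorLogin(commentAuthor) === trusted;
}
```

With the default workflow token the authenticated actor resolves to `github-actions[bot]`, so a synthesis comment authored by `sepo-agent-app` never matches; resolving the App or PAT token first makes both sides the same identity.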

Comment thread .agent/src/self-approval.ts Outdated
};
}

if (latest.kind === "review synthesis" && latest.conclusion === "ship") {

This trusts a SHIP comment without proving it applies to expectedHeadSha. Since issue comments are not commit-scoped, a stale SHIP from an older PR head can authorize approval of new commits. Please pass or persist the reviewed head SHA from the review synthesis/handoff and require it to match the current expected head before approval. A rubrics PASS should not be sufficient by itself to replace a current SHIP review synthesis.
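
One way to express the requested rule is sketched below. It assumes review signals have already been parsed from trusted comments into records; the `ReviewSignal` shape, its `reviewedHeadSha` field, and the function name `evaluateProvenance` are hypothetical and do not reproduce the PR's `evaluateSelfApprovalProvenance`.

```typescript
interface ReviewSignal {
  kind: "review synthesis" | "rubrics";
  conclusion: string;
  reviewedHeadSha: string; // hypothetical: head SHA the review actually covered
  createdAt: string; // ISO 8601 timestamp
}

interface ProvenanceResult {
  trusted: boolean;
  reason: string;
}

// Trust only the latest review synthesis, and only when it is SHIP for the
// exact head being approved. A rubrics PASS alone is never sufficient.
function evaluateProvenance(signals: ReviewSignal[], expectedHeadSha: string): ProvenanceResult {
  const syntheses = signals
    .filter((signal) => signal.kind === "review synthesis")
    .sort((a, b) => Date.parse(a.createdAt) - Date.parse(b.createdAt));
  const latest = syntheses[syntheses.length - 1];
  if (!latest) {
    return { trusted: false, reason: "missing trusted review synthesis" };
  }
  if (latest.conclusion !== "ship") {
    return { trusted: false, reason: `latest review synthesis is ${latest.conclusion}` };
  }
  if (!latest.reviewedHeadSha || latest.reviewedHeadSha !== expectedHeadSha) {
    return { trusted: false, reason: "review synthesis does not cover the current head" };
  }
  return { trusted: true, reason: "" };
}
```

Because issue comments are not commit-scoped, the reviewed head has to travel with the synthesis itself; the sketch simply refuses to trust a SHIP whose recorded head differs from the expected one.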

run: node .agent/dist/cli/post-response.js

- uses: actions/upload-artifact@v4
if: steps.prepare.outputs.should_run == 'true'

This artifact upload is also skipped when resolve-self-approve.js fails because Actions applies an implicit success() to steps without always(). That drops the response and session artifacts exactly when they are needed to debug approval API rejection or resolver failure. Please run this on failure too, e.g. always() && steps.prepare.outputs.should_run == 'true'.

Comment thread .agent/src/cli/post-comment.ts Outdated
let reviewedHeadSha = "";
if (target === "pr" && repo && targetNumber > 0) {
try {
reviewedHeadSha = fetchPrMeta(targetNumber, repo).headOid;

This fetches the PR head when the synthesis comment is posted, not when the reviewer lanes actually inspected the diff. If the PR head changes during review or synthesis, a stale SHIP synthesis can be stamped with the new head SHA and later authorize self-approval of unreviewed commits. Please capture the reviewed head before the review run, pass that captured SHA through the review/synthesis path, and avoid emitting a trusted current-head marker if the PR changed during the run.
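
The capture-then-compare idea can be sketched as below. The marker text `sepo-reviewed-head` and both function names are hypothetical; the thread does not show the actual marker format, so treat this as an illustration of the rule rather than the PR's implementation.

```typescript
// Emits a reviewed-head marker only when the head captured before the review
// lanes started is still the PR head at posting time. Marker format is assumed.
function reviewedHeadMarker(capturedHeadSha: string, headShaAtPost: string): string {
  const captured = capturedHeadSha.trim();
  if (!captured || captured !== headShaAtPost.trim()) {
    return ""; // head moved (or was never captured): emit no trusted marker
  }
  return `<!-- sepo-reviewed-head: ${captured} -->`;
}

// Appends the marker to a synthesis body when one is available.
function withReviewedHead(body: string, capturedHeadSha: string, headShaAtPost: string): string {
  const marker = reviewedHeadMarker(capturedHeadSha, headShaAtPost);
  return marker ? `${body}\n\n${marker}` : body;
}
```

The design choice is that the marker always carries the captured SHA, never the SHA read at posting time, so a synthesis can only vouch for commits the reviewers actually saw.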

Comment thread .agent/src/github.ts Outdated
body: string,
): "created" | "updated" {
const existing = fetchIssueCommentRecords(prNumber, repo)
.filter((comment) => comment.id && comment.body.includes(marker))

Please filter marker matches to comments authored by the authenticated Sepo actor before updating. As written, any PR commenter can include <!-- sepo-agent-self-approval -->, and the next self-approval status run will try to PATCH that comment instead of creating or updating Sepo’s own status comment.
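
A pure-function sketch of the selection rule requested here, assuming comment records expose an author login. `CommentRecord` and `pickCommentToUpdate` are hypothetical names for this example; only the `<!-- sepo-agent-self-approval -->` marker comes from the thread.

```typescript
interface CommentRecord {
  id: number;
  authorLogin: string;
  body: string;
  createdAt: string; // ISO 8601 timestamp
}

// Returns the newest marker comment written by the trusted actor, or null when
// none exists and a fresh comment should be created instead.
function pickCommentToUpdate(
  comments: CommentRecord[],
  marker: string,
  trustedActor: string,
): CommentRecord | null {
  const trusted = trustedActor.trim().toLowerCase();
  if (!trusted) return null;
  const candidates = comments
    .filter((c) => c.body.includes(marker) && c.authorLogin.trim().toLowerCase() === trusted)
    .sort((a, b) => Date.parse(a.createdAt) - Date.parse(b.createdAt));
  return candidates[candidates.length - 1] ?? null;
}
```

A comment from another user that merely contains the marker is ignored, so the caller falls back to creating a new status comment rather than attempting to PATCH someone else's.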

sepo-agent-app Bot commented May 7, 2026

Sepo pushed fixes for this PR. Branch: agent/implement-issue-165/codex-25508067922. Requested by @sepo-agent.

  • Captured the reviewed PR head before review lanes start and forwarded it to synthesis comment posting.
  • Omitted the trusted synthesis head marker when the PR head changes before posting, so self-approval cannot trust stale review provenance.
  • Filtered self-approval status marker updates to comments authored by the authenticated Sepo actor; untrusted marker comments now cause a fresh Sepo comment instead.
  • Added focused regression coverage for stable reviewed-head markers, changed-head omission, and untrusted marker comments.
  • Verification: npm run build; node --test dist/__tests__/post-comment-cli.test.js dist/__tests__/post-response-cli.test.js dist/__tests__/envelope.test.js; YAML parse check; git diff --check.

sepo-agent-app Bot commented May 7, 2026

Sepo is dispatching follow-up automation.

| Source | Next | Target | Round | Status |
|---|---|---|---|---|
| fix-pr | review | PR #166 | 8 / 40 | Dispatched |

Reason: agent planner selected review: fix-pr succeeded on PR #166 and pushed targeted fixes for the latest review synthesis findings; the PR is open, CLEAN, and checks are passing, so one more review should validate the new head before stopping.

sepo-agent-app Bot commented May 7, 2026

Rubrics Review

| Total Score | Verdict | Rubrics Scored |
|---|---|---|
| 100 | PASS | 11 |
| Dimension | Rubric | Result | Score | Evidence |
|---|---|---|---|---|
| coding_workflow / generic | Validate delegated route authorization | pass | 8/8 | /orchestrate includes agent-self-approve in downstream authorization only when self-approval is enabled, and validates requester access for each delegated route: |
const association = isKnownAuthorAssociation(input.authorAssociation) ? input.authorAssociation : "NONE";
const delegatedRoutes: string[] = [...ORCHESTRATE_DELEGATED_ROUTES];
if (input.selfApproveEnabled) {
  delegatedRoutes.push(ORCHESTRATE_SELF_APPROVE_ROUTE);
}
for (const route of delegatedRoutes) {
  if (isAssociationAllowedForRoute(policy, route, association, input.isPublicRepo)) {
    continue;
  }
  const allowed = getAllowedAssociationsForRoute(policy, route, input.isPublicRepo);
  return `orchestrate requests require ${route} access; ${route} currently requires ${allowed.join(", ")} access.`;
}
| coding_workflow / specific | Default follow-up orchestration to one-shot | pass | 8/8 | Review SHIP still stops unless selfApproveEnabled is true; self-approval only hands back to fix-pr on REQUEST_CHANGES: |

repo/.agent/src/handoff.ts

Lines 454 to 494 in c40d76a

if (sourceAction === "review") {
  if (conclusion === "ship") {
    if (input.selfApproveEnabled) {
      return {
        decision: "dispatch",
        nextAction: "agent-self-approve",
        targetNumber: nextTarget,
        reason: "review verdict is SHIP; dispatching self-approval gate",
        nextRound,
      };
    }
    return { decision: "stop", reason: "review verdict is SHIP", nextRound };
  }
  if (REVIEW_TO_FIX_PR.has(conclusion)) {
    return {
      decision: "dispatch",
      nextAction: "fix-pr",
      targetNumber: nextTarget,
      reason: `review verdict is ${conclusion}; dispatching fix-pr`,
      nextRound,
      handoffContext: resolveFixPrHandoffContext(input),
    };
  }
  return { decision: "stop", reason: `review verdict ${conclusion} has no handoff`, nextRound };
}
if (sourceAction === "agent_self_approve") {
  if (conclusion === "approved") {
    return { decision: "stop", reason: "self-approval approved the pull request", nextRound };
  }
  if (conclusion === "request_changes" || conclusion === "changes_requested") {
    return {
      decision: "dispatch",
      nextAction: "fix-pr",
      targetNumber: nextTarget,
      reason: "self-approval requested follow-up changes; dispatching fix-pr",
      nextRound,
      handoffContext: resolveFixPrHandoffContext(input),
    };
  }
  return { decision: "stop", reason: `self-approval concluded ${conclusion}`, nextRound };
| communication / generic | Keep status comments compact | pass | 6/6 | Self-approval status is a compact two-column table with concise reason/context text: |
export function formatSelfApprovalBody(input: {
  conclusion: string;
  reason: string;
  handoffContext?: string;
  approved?: boolean;
  runUrl?: string;
}): string {
  const status = input.approved ? "Approved" : "Not approved";
  const conclusion = input.conclusion || "unknown";
  const lines = [
    "Sepo self-approval completed.",
    "",
    "| Status | Conclusion |",
    "|---|---|",
    `| ${status} | \`${conclusion}\` |`,
    "",
    `Reason: ${input.reason || "No reason provided."}`,
  ];
  const context = String(input.handoffContext || "").trim();
  if (context && !input.approved) {
    lines.push("", "Follow-up context:", context);
  }
  if (input.runUrl) {
    lines.push("", `Run: ${input.runUrl}`);
  }
  lines.push("", "<!-- sepo-agent-self-approval -->");
  return lines.join("\n");
| coding_workflow / generic | Reuse existing code | pass | 8/8 | The PR uses existing workflow/action patterns, typed CLIs, run-agent-task, shared GitHub helpers, route validation, and handoff policy instead of embedding substantial logic in shell. |
| coding_workflow / generic | Make surgical changes | pass | 8/8 | The diff is scoped to the opt-in self-approval route, prompt, workflow, authorization/orchestration plumbing, docs, and focused tests. |
| coding_workflow / generic | Make user-facing automation idempotent | pass | 7/7 | Self-approval comments carry a stable marker, and upsert updates only comments authored by the authenticated Sepo actor: |

repo/.agent/src/github.ts

Lines 241 to 262 in c40d76a

export function upsertPrCommentByMarker(
  prNumber: number,
  repo: string,
  marker: string,
  body: string,
): "created" | "updated" {
  const trustedActor = normalizeActorLogin(fetchAuthenticatedActorLogin());
  const existing = fetchIssueCommentRecords(prNumber, repo)
    .filter((comment) => (
      comment.id
      && comment.body.includes(marker)
      && trustedActor
      && normalizeActorLogin(comment.authorLogin) === trustedActor
    ))
    .sort((left, right) => Date.parse(left.createdAt || "") - Date.parse(right.createdAt || ""));
  const latest = existing[existing.length - 1];
  if (latest) {
    updateIssueComment(repo, latest.id, body);
    return "updated";
  }
  postPrComment(prNumber, body, repo);
  return "created";
| coding_workflow / generic | Keep docs in sync | pass | 7/7 | Orchestrator docs describe opt-in dispatch, captured reviewed head, trusted synthesis provenance, read-only inspection, deterministic approval, and stop conditions: |
When `AGENT_ALLOW_SELF_APPROVE=true`, review-originated `SHIP` handoffs dispatch
`agent-self-approve.yml` instead of stopping. The review workflow stamps
synthesis comments with the PR head SHA captured before reviewer lanes start.
The self-approval workflow captures the PR head SHA before the agent runs,
checks for a latest trusted review synthesis with a matching reviewed head SHA,
and asks for a high-level structured verdict using read-approved tools
and a read-only GitHub token. Only deterministic resolver code receives approval
authority. It submits an approving PR review only when the verdict is `APPROVE`,
the PR is still open, that trusted synthesis is `SHIP`, and both the inspected
and current head SHAs match the captured head. A `REQUEST_CHANGES`
verdict hands back to the orchestrator as `agent-self-approve -> fix-pr` with
the agent's handoff context. `BLOCKED`, failed parsing, missing inspected head
SHAs, stale heads, closed PRs, untrusted review state, and disabled configuration
stop without approving. The default is disabled, so existing orchestrated review
chains still stop on `SHIP`.
| coding_workflow / generic | Prefer explicit structured inputs | pass | 7/7 | The workflow keeps explicit workflow_dispatch inputs, and the parser expects structured verdict fields including verdict, reason, handoff_context, and inspected_head_sha: |
export function parseSelfApprovalDecision(raw: string): SelfApprovalDecision | null {
  const json = extractJsonObject(raw);
  if (!json) return null;
  let parsed: unknown;
  try {
    parsed = JSON.parse(json) as unknown;
  } catch {
    return null;
  }
  if (!parsed || typeof parsed !== "object" || Array.isArray(parsed)) return null;
  const record = parsed as Record<string, unknown>;
  const verdict = normalizeVerdict(String(record.verdict || record.decision || ""));
  if (!verdict) return null;
  const reason = String(record.reason || record.rationale || "").trim();
  const handoffContext = String(record.handoff_context ?? record.handoffContext ?? "").trim();
  const inspectedHeadSha = String(
    record.inspected_head_sha ?? record.inspectedHeadSha ?? record.head_sha ?? record.headSha ?? "",
  ).trim();
  return {
    verdict,
    reason: reason || "self-approval agent returned no reason",
    handoffContext,
    inspectedHeadSha,
  };
}
| coding_style / generic | Prefer simple implementations | pass | 7/7 | The resolver is a direct sequence of opt-in, PR state, decision, head SHA, provenance, and verdict checks: |
export function resolveSelfApproval(input: SelfApprovalResolveInput): SelfApprovalResolveResult {
  if (!input.allowSelfApprove) {
    return {
      conclusion: "blocked",
      shouldApprove: false,
      shouldOrchestrate: false,
      reason: "AGENT_ALLOW_SELF_APPROVE is not enabled",
      handoffContext: "",
    };
  }
  if (normalizeToken(input.targetKind) !== "pull_request") {
    return {
      conclusion: "blocked",
      shouldApprove: false,
      shouldOrchestrate: false,
      reason: "self-approval is only supported for pull requests",
      handoffContext: "",
    };
  }
  if (normalizeToken(input.prState) !== "open") {
    return {
      conclusion: "blocked",
      shouldApprove: false,
      shouldOrchestrate: false,
      reason: `pull request is ${input.prState.toLowerCase() || "not open"}`,
      handoffContext: "",
    };
  }
  if (!input.decision) {
    return {
      conclusion: "failed",
      shouldApprove: false,
      shouldOrchestrate: false,
      reason: "self-approval agent response was missing a valid JSON decision",
      handoffContext: "",
    };
  }
  const expectedHeadSha = input.expectedHeadSha.trim();
  const currentHeadSha = input.currentHeadSha.trim();
  const inspectedHeadSha = input.decision.inspectedHeadSha.trim();
  if (input.decision.verdict === "approve" && !inspectedHeadSha) {
    return {
      conclusion: "blocked",
      shouldApprove: false,
      shouldOrchestrate: false,
      reason: "self-approval approval verdict was missing inspected head SHA",
      handoffContext: input.decision.handoffContext,
    };
  }
  if (!expectedHeadSha || !currentHeadSha || expectedHeadSha !== currentHeadSha) {
    return {
      conclusion: "blocked",
      shouldApprove: false,
      shouldOrchestrate: false,
      reason: "pull request head changed after self-approval inspection",
      handoffContext: input.decision.handoffContext,
    };
  }
  if (inspectedHeadSha && inspectedHeadSha !== expectedHeadSha) {
    return {
      conclusion: "blocked",
      shouldApprove: false,
      shouldOrchestrate: false,
      reason: "self-approval agent reported a different inspected head SHA",
      handoffContext: input.decision.handoffContext,
    };
  }
  if (input.approvalProvenanceTrusted === false) {
    return {
      conclusion: "blocked",
      shouldApprove: false,
      shouldOrchestrate: false,
      reason: input.approvalProvenanceReason || "missing trusted review synthesis for self-approval",
      handoffContext: input.decision.handoffContext,
    };
  }
  if (input.decision.verdict === "approve") {
    return {
      conclusion: "approved",
      shouldApprove: true,
      shouldOrchestrate: false,
      reason: input.decision.reason,
      handoffContext: input.decision.handoffContext,
    };
  }
  if (input.decision.verdict === "request_changes") {
    return {
      conclusion: "request_changes",
      shouldApprove: false,
      shouldOrchestrate: true,
      reason: input.decision.reason,
      handoffContext: input.decision.handoffContext || input.decision.reason,
    };
  }
  return {
    conclusion: "blocked",
    shouldApprove: false,
    shouldOrchestrate: false,
    reason: input.decision.reason,
    handoffContext: input.decision.handoffContext,
  };
| coding_workflow / generic | Surface stateful failures | pass | 7/7 | Metadata/provenance/approval failures become explicit failed or blocked results, and the workflow keeps status/artifact steps under always(): |
let prState = "";
let currentHeadSha = "";
let metadataReadReason = "";
let approvalProvenanceTrusted = false;
let approvalProvenanceReason = "missing trusted review synthesis for self-approval";
if (repo && prNumber) {
  try {
    const meta = fetchPrMeta(prNumber, repo);
    prState = meta.state;
    currentHeadSha = meta.headOid;
  } catch {
    metadataReadReason = "could not read pull request metadata during self-approval resolution";
    prState = "";
    currentHeadSha = "";
  }
  try {
    const provenance = evaluateSelfApprovalProvenance({
      comments: fetchIssueCommentRecords(prNumber, repo),
      trustedActorLogin: fetchAuthenticatedActorLogin(),
      expectedHeadSha,
    });
    approvalProvenanceTrusted = provenance.trusted;
    approvalProvenanceReason = provenance.reason;
  } catch {
    approvalProvenanceTrusted = false;
    approvalProvenanceReason = "could not read trusted review synthesis";
  }
}
let result = metadataReadReason
  ? {
    conclusion: "failed" as const,
    shouldApprove: false,
    shouldOrchestrate: false,
    reason: metadataReadReason,
    handoffContext: decision?.handoffContext || "",
  }
  : resolveSelfApproval({
    allowSelfApprove,
    targetKind,
    prState,
    expectedHeadSha,
    currentHeadSha,
    decision,
    approvalProvenanceTrusted,
    approvalProvenanceReason,
  });
let approved = false;
if (result.shouldApprove) {
  try {
    submitApproval(repo, prNumber, expectedHeadSha, formatSelfApprovalBody({
      conclusion: result.conclusion,
      reason: result.reason,
      handoffContext: result.handoffContext,
      approved: true,
      runUrl: currentRunUrl(),
    }));
    approved = true;
  } catch (err: unknown) {
    const message = err instanceof Error ? err.message : String(err);
    result = {
      conclusion: "failed",
      shouldApprove: false,
      shouldOrchestrate: false,
      reason: `approval submission failed: ${message || "unknown error"}`,
      handoffContext: result.handoffContext,
    };
  }
}
const body = formatSelfApprovalBody({
  conclusion: result.conclusion,
  reason: result.reason,
  handoffContext: result.handoffContext,
  approved,
  runUrl: currentRunUrl(),
});
const bodyFile = writeBodyFile(body);
setOutput("conclusion", result.conclusion);
setOutput("approved", String(approved));
setOutput("should_orchestrate", String(result.shouldOrchestrate));
setOutput("handoff_context", result.handoffContext);
setOutput("reason", result.reason);
setOutput("body_file", bodyFile);
and
- name: Post self-approval status
  if: >-
    always() &&
    steps.prepare.outputs.should_run == 'true' &&
    (steps.result.outcome == 'failure' || steps.result.outputs.approved != 'true')
  env:
    BODY_FILE: ${{ steps.result.outputs.body_file }}
    GH_TOKEN: ${{ steps.auth.outputs.token }}
    GITHUB_REPOSITORY: ${{ github.repository }}
    RESPONSE_KIND: pr_comment
    TARGET_NUMBER: ${{ inputs.pr_number }}
  run: node .agent/dist/cli/post-response.js
- uses: actions/upload-artifact@v4
  if: >-
    always() &&
    steps.prepare.outputs.should_run == 'true'
  with:
    name: agent-self-approve-result-${{ inputs.pr_number }}
    path: |
      ${{ steps.agent.outputs.response_file }}
      ${{ steps.agent.outputs.session_log_file }}
      ${{ steps.result.outputs.body_file }}
    if-no-files-found: ignore
    retention-days: 30
| coding_workflow / generic | Comment on accepted workflow stops | pass | 6/6 | Disabled, non-PR, unreadable metadata, closed PR, missing head, or untrusted synthesis paths write visible stop bodies instead of silently no-oping: |
const allowSelfApprove = envFlagEnabled(process.env.AGENT_ALLOW_SELF_APPROVE);
function stop(reason: string): void {
  const bodyFile = writeBodyFile(formatSelfApprovalBody({
    conclusion: "blocked",
    reason,
    approved: false,
  }));
  setOutput("should_run", "false");
  setOutput("head_sha", "");
  setOutput("reason", reason);
  setOutput("body_file", bodyFile);
}
if (!allowSelfApprove) {
  stop("AGENT_ALLOW_SELF_APPROVE is not enabled");
} else if (targetKind !== "pull_request") {
  stop("self-approval is only supported for pull requests");
} else if (!repo || !targetNumber) {
  stop("missing pull request target");
} else {
  let meta: ReturnType<typeof fetchPrMeta>;
  let metadataReadable = true;
  try {
    meta = fetchPrMeta(targetNumber, repo);
  } catch {
    stop("could not read pull request metadata");
    metadataReadable = false;
    meta = { headRef: "", headOid: "", isCrossRepository: false, state: "" };
  }
  if (!metadataReadable) {
    // Outputs were already written by stop().
  } else if (String(meta.state || "").trim().toUpperCase() !== "OPEN") {
    stop(`pull request is ${String(meta.state || "not open").toLowerCase()}`);
  } else if (!meta.headOid) {
    stop("could not resolve pull request head SHA");
  } else {
    try {
      const provenance = evaluateSelfApprovalProvenance({
        comments: fetchIssueCommentRecords(targetNumber, repo),
        trustedActorLogin: fetchAuthenticatedActorLogin(),
        expectedHeadSha: meta.headOid,
      });
      if (!provenance.trusted) {
        stop(provenance.reason);
      } else {
        setOutput("should_run", "true");
        setOutput("head_sha", meta.headOid);
        setOutput("reason", "");
        setOutput("body_file", "");
      }
    } catch {
      stop("could not read trusted review synthesis");
    }

Notes

  • The current head satisfies the main conservative-gate rubrics: default-disabled, PR-only, requester-aware delegated authorization, read-only inspection, deterministic approval submission, current-head provenance, and visible stop/failure behavior.
  • I did not score unrelated rubrics such as default merge strategy, preview contracts, rubric-authoring guidance, or docs inventory rules.
  • No rubric-specific follow-up needed.

Findings

  • No rubric-specific findings.

Final Rubric Verdict

PASS

sepo-agent-app Bot commented May 7, 2026

AI Review Synthesis

Dual-agent review by Claude and Codex.
Requested by @lolipopshock.

Summary of PR/Issue

  • PR #166 (Add opt-in agent self-approval gate) adds an opt-in agent-self-approve gate for Sepo, closing #165 (Implement requested change) and implementing the discussion request for a conservative final PR approval step.
  • The intended model is internal/orchestrator-led: after review synthesis returns SHIP, a self-approval agent inspects the PR and deterministic code submits approval only when opt-in, PR state, verdict, and head SHA checks pass.
  • The PR adds a new workflow, prompt, typed prepare/resolve CLIs, orchestration handoff policy, delegated authorization plumbing, docs, and focused tests.
  • The current PR is still unmergeable against main after #162 (Collapse old fix-pr status comments) landed, and reviewers found several runtime and safety gaps in the approval path.

Review

Codex marked the PR NEEDS_REWORK, Claude’s overall verdict was MINOR_ISSUES, and the rubrics review was PARTIAL. This synthesis prioritizes the blockers because GitHub currently reports the PR as CONFLICTING/DIRTY, and the approval workflow has deterministic runtime and trust-boundary issues.

| Issue | Severity | Description |
|---|---|---|
| Merge conflict with main after #162 | BLOCKING | PR #166 is not currently mergeable. The docs conflict needs a rebase that preserves both #162’s comment-collapse wording and this PR’s self-approval documentation. |
| agent-self-approve is not a valid runtime route | BLOCKING | The workflow passes route: agent-self-approve, but .agent/src/envelope.ts does not include that route in VALID_ROUTES, so run.ts will reject the envelope before the prompt runs. |
| Agent inspection step can mutate GitHub before deterministic checks | BLOCKING | The self-approval agent runs with permission_mode: approve-all and a write-capable GitHub token, bypassing the intended guarantee that only deterministic post-processing can approve. |
| Missing inspected_head_sha still allows approval | BLOCKING | A blank inspected SHA skips the mismatch check, so an APPROVE verdict can still set shouldApprove: true without proving which head the agent inspected. |
| Self-approval can bypass the review-to-SHIP handoff | WARNING | Direct workflow_dispatch plus current preflight allows enabled self-approval on an open PR without verifying trusted review -> SHIP provenance or equivalent review/rubrics signals. |
| Failure handling can hide or misreport approval-state errors | WARNING | Metadata read failures are swallowed, approval API failures can skip PR-visible status, and artifacts are skipped on result-step failure. |
| Self-approval status comments are not idempotent | WARNING | The body has a stable marker, but the workflow posts via generic post-response without updating/reusing an existing self-approval comment. |

Progress

  • The PR already implements the conservative default-disabled opt-in flag, PR-only preflight, orchestrator handoff wiring, docs, and focused tests for the main happy and stale-head paths.
  • The rubrics review already acknowledged strong alignment on delegated authorization, one-shot default behavior, code reuse, and docs.
  • The rubrics review already surfaced the idempotency and metadata-read warnings, so no duplicate inline comments were added for those.
  • Inline comments were posted for the unregistered route, write-capable inspection step, missing inspected head SHA, skipped failure status, and direct dispatch handoff bypass.
  • Claude’s INFO notes about prompt substitution and approval-review-body asymmetry look intentional and are not included as required action items.

Issue Details

Merge conflict with main after #162

Cause: GitHub reports PR #166 as mergeable: CONFLICTING / mergeStateStatus: DIRTY after #162 landed adjacent documentation changes.

Candidate solutions: Rebase the PR branch onto main and reconcile the docs so both AGENT_COLLAPSE_OLD_REVIEWS wording and the new AGENT_ALLOW_SELF_APPROVE content remain.

Comments: Codex treated this as blocking; Claude called it mechanical but still identified it as unresolved.

`agent-self-approve` is not a valid runtime route

Cause: agent-self-approve.yml sends route: agent-self-approve, while envelope.ts lacks that route in VALID_ROUTES.

Candidate solutions: Add agent-self-approve to VALID_ROUTES and extend validateEnvelope tests to cover it.

Agent inspection step can mutate GitHub before deterministic checks

Cause: agent-self-approve.yml gives the inspection agent a write-capable token and approve-all permissions.

Candidate solutions: Run the agent inspection step with read-only permissions and reserve the write-capable token for resolve-self-approve.js, or otherwise prevent tool-level mutation before deterministic verdict/head checks.

Missing `inspected_head_sha` still allows approval

Cause: self-approval.ts only blocks when inspectedHeadSha is non-empty and mismatched.

Candidate solutions: Reject blank inspected_head_sha for approval verdicts and add a regression test for blank/missing inspected SHA.

Comments: Claude considered this lower risk because current-head still matches expected-head; Codex treated it as blocking because it weakens the stated inspected-head contract.

Self-approval can bypass the review-to-SHIP handoff

Cause: agent-self-approve.yml is directly dispatchable, and preflight does not validate trusted orchestration provenance or latest trusted review/rubrics state.

Candidate solutions: Require trusted orchestrator context, or deterministically verify a latest trusted SHIP review/rubrics signal before approval can proceed.

Failure handling can hide or misreport approval-state errors

Cause: resolve-self-approve.ts converts PR metadata read failures into empty state/head, and agent-self-approve.yml lacks always() handling for result-step failures.

Candidate solutions: Surface metadata read failures explicitly, catch approval submission failures into a failed status body, and make status/artifact steps run on resolver failure with focused tests.

Self-approval status comments are not idempotent

Cause: self-approval.ts emits a stable marker, but agent-self-approve.yml posts through generic post-response without marker lookup/update.

Candidate solutions: Reuse or update the existing <!-- sepo-agent-self-approval --> comment, or add collapse/update behavior equivalent to existing agent status comment handling.

Comments: Already covered clearly in the posted rubrics review.
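
One way to picture the marker lookup is as a pure planning step that decides between updating and creating, leaving the actual GitHub call to the caller. This is a sketch under assumptions: the comment shape and function name are hypothetical, and the author filter mirrors the restriction to Sepo-authored comments described later in this thread. Only the marker string comes from the finding.

```typescript
const SELF_APPROVAL_MARKER = "<!-- sepo-agent-self-approval -->";

// Assumed minimal shape of a PR comment; the real API object has more fields.
interface SketchIssueComment {
  id: number;
  author: string;
  body: string;
}

type CommentPlan =
  | { action: "update"; commentId: number; body: string }
  | { action: "create"; body: string };

// Decides whether to update an existing status comment or create a new one.
function planSelfApprovalComment(
  existing: SketchIssueComment[],
  trustedAuthor: string,
  statusText: string
): CommentPlan {
  const body = `${SELF_APPROVAL_MARKER}\n${statusText}`;
  let target: SketchIssueComment | undefined;
  for (const comment of existing) {
    // A marker inside someone else's comment is ignored, so an untrusted
    // comment is never overwritten.
    if (comment.author === trustedAuthor && comment.body.indexOf(SELF_APPROVAL_MARKER) !== -1) {
      target = comment; // keep the last match in list order
    }
  }
  return target
    ? { action: "update", commentId: target.id, body }
    : { action: "create", body };
}
```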

Recommended Next Step

FIX_PR: The unresolved findings require concrete branch changes, and the merge conflict plus deterministic route/safety fixes are suitable for an automated fix-pr pass.

Final Verdict

NEEDS_REWORK

Action Items

  • Rebase and resolve the merge conflict with #162 (Collapse old fix-pr status comments) while preserving the documentation changes from both #162 and #166 (Add opt-in agent self-approval gate).
  • Add agent-self-approve to VALID_ROUTES and cover it in validateEnvelope tests.
  • Run self-approval inspection without write-capable tools/token; reserve approval authority for deterministic resolver code.
  • Reject approval verdicts with missing or blank inspected_head_sha and add regression coverage.
  • Enforce trusted review -> SHIP provenance or latest trusted review/rubrics signal before approval.
  • Surface metadata and approval submission failures visibly, including status/artifact behavior on resolver failure.
  • Reuse/update the stable self-approval status comment instead of creating duplicate comments.

@sepo-agent-app sepo-agent-app Bot added the priority/p1 (Project management: high priority) and effort/high (Project management: high effort) labels May 7, 2026
@lolipopshock
Contributor

@sepo-agent /answer I think this PR has become quite complicated compared to the original version from /vision. Can you summarize what changed in this process, and what new concerns are addressed here?

@sepo-agent-app
Author

sepo-agent-app Bot commented May 8, 2026

Yes. The PR grew because “agent can approve a PR” is a privileged write path, so the implementation had to defend several trust boundaries.

Original /vision scope was roughly: add AGENT_ALLOW_SELF_APPROVE, a PR-only agent-self-approve route/workflow/prompt, structured verdict parsing, deterministic approval after head-SHA checks, and review -> agent-self-approve -> fix-pr/stop orchestration.

The new concerns added during the PR process were:

  • Runtime integration correctness: agent-self-approve had to be registered as a valid runtime route, not just a workflow name.
  • Write-authority separation: the self-approval agent now runs with read-approved tools and a read-only GitHub token; only deterministic resolver code gets approval authority. See agent-self-approve.yml.
  • Trusted review provenance: approval now requires the latest trusted review synthesis to be SHIP for the same PR head. Rubrics PASS alone is not enough. See self-approval.ts.
  • Reviewed-head correctness: review now captures the PR head before reviewer lanes run, and synthesis only stamps that head if the PR did not change before posting. See agent-review.yml and post-comment.ts.
  • Inspected-head enforcement: an APPROVE verdict must include inspected_head_sha, and expected/current/inspected heads must match. See self-approval.ts.
  • Requester/delegated-route authorization: orchestrator capability checks include agent-self-approve only when enabled and require route access. See orchestrator-capabilities.ts.
  • Failure visibility: metadata/provenance/approval failures now become explicit blocked/failed results, and status/artifacts run under always(). See resolve-self-approve.ts.
  • Idempotent/trusted status comments: self-approval status comments use a marker, but updates are limited to comments authored by the authenticated Sepo actor. See github.ts.

So the short version: the original idea was “add a conservative final approval gate.” The added complexity is mostly about making sure that gate cannot approve the wrong commit, trust stale review output, use the wrong actor token, let the inspecting agent mutate GitHub directly, silently fail, or overwrite untrusted comments.
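
To illustrate the reviewed-head idea from the provenance bullets, here is a rough sketch of stamping a synthesis comment with the head it reviewed and reading that stamp back. The marker format `<!-- sepo-reviewed-head: ... -->` and all function names are invented for illustration; the actual marker used by post-comment.ts and self-approval.ts is not shown in this thread. Checking that the comment author is trusted is deliberately left out.

```typescript
// Hypothetical marker text; the real synthesis-head marker may differ.
const REVIEWED_HEAD_MARKER_PREFIX = "<!-- sepo-reviewed-head:";

// Appends a hidden marker recording which PR head the review looked at.
function stampReviewedHead(body: string, headSha: string): string {
  return `${body}\n${REVIEWED_HEAD_MARKER_PREFIX} ${headSha} -->`;
}

// Extracts the reviewed head SHA, or null when no well-formed marker exists.
function parseReviewedHead(body: string): string | null {
  const match = /<!-- sepo-reviewed-head: ([0-9a-f]{7,40}) -->/.exec(body);
  const captured = match ? match[1] : undefined;
  return captured ? captured : null;
}

// Provenance holds only for a SHIP synthesis stamped with the current head.
function isShipForCurrentHead(
  synthesisVerdict: string,
  synthesisBody: string,
  currentHeadSha: string
): boolean {
  return synthesisVerdict === "SHIP" && parseReviewedHead(synthesisBody) === currentHeadSha;
}
```

An unstamped or malformed synthesis parses to `null`, which can never equal a real head SHA, so it fails closed.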

@lolipopshock
Contributor

@sepo-agent /answer

Ok let's address the issues one by one

Trusted review provenance: approval now requires the latest trusted review synthesis to be SHIP for the same PR head. Rubrics PASS alone is not enough. See self-approval.ts.

For this, I worry the requirement might be too strict. This might lead to infinite loops? Alternatively the approval agent can decide to start a new implementation with slightly adjusted decisions? But this may pose a bigger issue for how to change the architecture...

@sepo-agent-app
Author

sepo-agent-app Bot commented May 8, 2026

I think your concern is valid, but I’d separate two cases:

  1. Requirement for actually approving: keep this strict.
    A self-approval should only approve when there is a trusted SHIP review synthesis for the exact PR head. Otherwise the agent could approve commits that were never reviewed, which is the dangerous failure mode.

  2. What happens when that requirement is not met: this should not necessarily be terminal forever.
    If the blocker is just “missing/stale review provenance,” the right next action is probably refresh review, not fix-pr and not approval.

So I would not loosen the approval condition, but I would make the orchestration state machine more expressive.

A good model would be:

  • review -> SHIP plus AGENT_ALLOW_SELF_APPROVE=true dispatches agent-self-approve.
  • agent-self-approve -> APPROVE submits approval only if:
    • same PR head
    • trusted latest synthesis is SHIP
    • self-approval inspected the same head
  • agent-self-approve -> REQUEST_CHANGES dispatches fix-pr with the self-approval rationale.
  • agent-self-approve -> NEEDS_REVIEW dispatches review, not fix-pr.
  • agent-self-approve -> BLOCKED stops visibly.

That avoids making the self-approval agent start implementation directly. I’d avoid giving self-approval a free-form “start new implementation with adjusted decisions” power, because then the final approval gate becomes a second orchestrator with approval authority. That mixes two roles that should stay separate:

  • Orchestrator: decides next workflow.
  • Self-approval: makes the final high-level approve/request-changes/block judgment.
  • Fix-pr/implement: changes code.
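
The transition model above can be sketched as a small pure function. This is a proposal-level illustration, not code from the PR: `NEEDS_REVIEW` is the new verdict suggested in this comment, the type and function names are hypothetical, and stopping when an APPROVE fails its preconditions is an assumed choice.

```typescript
type SelfApprovalVerdict = "APPROVE" | "REQUEST_CHANGES" | "NEEDS_REVIEW" | "BLOCKED";

type NextStep =
  | { kind: "submit-approval" }
  | { kind: "dispatch"; route: "fix-pr" | "review" }
  | { kind: "stop"; reason: string };

// The three approval preconditions listed in the model above.
interface ApprovalEvidence {
  sameHead: boolean;
  latestSynthesisIsShip: boolean;
  inspectedSameHead: boolean;
}

// Maps a self-approval verdict to the next orchestration step. The function
// only selects a workflow; it never changes code or submits anything itself.
function nextStepForVerdict(verdict: SelfApprovalVerdict, evidence: ApprovalEvidence): NextStep {
  if (verdict === "APPROVE") {
    const allHold =
      evidence.sameHead && evidence.latestSynthesisIsShip && evidence.inspectedSameHead;
    // Assumed behavior: an APPROVE without full evidence stops visibly.
    return allHold
      ? { kind: "submit-approval" }
      : { kind: "stop", reason: "approval preconditions not met" };
  }
  if (verdict === "REQUEST_CHANGES") {
    return { kind: "dispatch", route: "fix-pr" };
  }
  if (verdict === "NEEDS_REVIEW") {
    return { kind: "dispatch", route: "review" };
  }
  return { kind: "stop", reason: "self-approval reported BLOCKED" };
}
```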

For loop risk: yes, review -> self-approve -> review can loop if review provenance keeps going stale. The fix is a bounded transition, not weaker approval. For example:

  • only dispatch NEEDS_REVIEW when provenance is stale/missing, not when the self-approval agent dislikes the PR;
  • obey the existing max-round budget;
  • stop if the same head has already produced the same provenance failure;
  • include a clear stop reason like “review synthesis for current head is missing; rerun review before self-approval.”

So my revised stance: strict current-head SHIP provenance is right for approval, but stale/missing provenance should be a recoverable orchestration condition, not treated the same as code needing changes.
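
A hedged sketch of the bounded transition described above. The state shape (`round`, `maxRounds`, `provenanceFailedHeads`) is an assumption about what the orchestrator could track; none of these names come from the PR.

```typescript
// Assumed bookkeeping for one orchestration run.
interface ReviewRetryState {
  round: number;
  maxRounds: number;
  // Head SHAs that have already produced a provenance failure.
  provenanceFailedHeads: string[];
}

type RetryDecision = { retry: true } | { retry: false; reason: string };

// Decides whether a NEEDS_REVIEW outcome may dispatch another review round.
function shouldRerunReview(
  state: ReviewRetryState,
  headSha: string,
  provenanceStaleOrMissing: boolean
): RetryDecision {
  if (!provenanceStaleOrMissing) {
    return { retry: false, reason: "NEEDS_REVIEW is reserved for stale or missing provenance" };
  }
  if (state.round >= state.maxRounds) {
    return { retry: false, reason: "max-round budget exhausted" };
  }
  if (state.provenanceFailedHeads.indexOf(headSha) !== -1) {
    // The same head failing twice means another review will not help.
    return {
      retry: false,
      reason: "review synthesis for current head is missing; rerun review before self-approval",
    };
  }
  return { retry: true };
}
```

Because each head SHA can trigger at most one provenance retry and the round budget still applies, the review -> self-approve -> review cycle terminates.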

@lolipopshock
Contributor

@sepo-agent /answer

Failure visibility: metadata/provenance/approval failures now become explicit blocked/failed results, and status/artifacts run under always(). See resolve-self-approve.ts.

What is this?

@sepo-agent-app
Author

sepo-agent-app Bot commented May 8, 2026

That item is about making self-approval failures visible on the PR, instead of only showing up as a failed GitHub Actions log.

Concrete examples:

  • Sepo cannot read PR metadata.
  • Sepo cannot read the trusted review synthesis comments.
  • The self-approval agent returns malformed JSON.
  • The PR head changed while self-approval was running.
  • The GitHub API rejects the approval submission.
  • The provenance check says “there is no trusted SHIP review for this head.”

Without explicit handling, the workflow could fail before posting anything useful. Then the PR would just have a failed Actions run, and the user would need to dig through logs to understand why no approval happened.

So the PR added two behaviors:

  1. Convert known failure cases into structured blocked or failed outcomes with a reason.
    Example: “latest trusted review synthesis reviewed a different head SHA” or “approval submission failed.”

  2. Run status/comment/artifact steps with always().
    In GitHub Actions, always() means “run this step even if an earlier step failed.” That lets Sepo still post a PR-visible status and upload debugging artifacts.

This is not central to the product idea. It is operational hardening: if self-approval does not approve, the PR should say why, and the workflow should preserve enough evidence to debug it.
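
For illustration, the first behavior could look roughly like the sketch below: each resolver step either returns a block reason or throws, and both cases are turned into a structured outcome instead of an unhandled crash. The step and outcome shapes are assumptions and are not taken from resolve-self-approve.ts. The `always()` part lives in the workflow YAML and is not shown here.

```typescript
type SelfApprovalOutcome =
  | { status: "approved" }
  | { status: "blocked"; reason: string }
  | { status: "failed"; reason: string };

// A step returns a block reason, returns null to continue, or throws on an
// unexpected error such as an API failure.
interface ResolverStep {
  label: string;
  run: () => string | null;
}

// Runs steps in order and converts the first problem into a visible outcome.
function resolveOutcomeSketch(steps: ResolverStep[]): SelfApprovalOutcome {
  for (const step of steps) {
    try {
      const blockReason = step.run();
      if (blockReason !== null) {
        return { status: "blocked", reason: blockReason };
      }
    } catch (error) {
      const detail = error instanceof Error ? error.message : String(error);
      return { status: "failed", reason: `${step.label} failed: ${detail}` };
    }
  }
  return { status: "approved" };
}
```

Whatever the outcome, the caller always gets a reason string it can post on the PR, which is the point of the hardening.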

@lolipopshock
Contributor

@sepo-agent /answer ok yeah that's a good addition. Can you finally summarize the system architecture? Then we should jointly decide whether to (1) start a new PR, since there are a lot of patches on top of the initial version, or (2) split this change into smaller PRs and have the orchestrator run them individually.

@sepo-agent-app
Author

sepo-agent-app Bot commented May 8, 2026

System architecture as it stands:

  1. Configuration gate
    AGENT_ALLOW_SELF_APPROVE=false by default. When disabled, normal review -> SHIP still stops as before.

  2. Orchestrator transition
    When enabled, review -> SHIP dispatches agent-self-approve instead of stopping. If self-approval requests changes, it hands back to fix-pr; if approved or blocked, it stops.

  3. Review provenance
    The review workflow captures the PR head before reviewer lanes run. Review synthesis is trusted for self-approval only if it is SHIP and stamped for that exact head.

  4. Self-approval preflight
    Before running the approval agent, Sepo checks: enabled flag, PR target, open PR, readable head SHA, trusted current-head review synthesis.

  5. Approval agent
    The agent performs high-level judgment: product fit, repo direction, risk, rubrics, maintainability. It runs read-only and returns structured JSON: APPROVE, REQUEST_CHANGES, or BLOCKED, plus reason/context and inspected head SHA.

  6. Deterministic resolver
    Only deterministic code can submit the GitHub approval. It rechecks PR state, current head, inspected head, trusted SHIP provenance, and parser validity. Approval is submitted only if all checks pass.

  7. Visibility/idempotency
    Non-approval outcomes post a visible status. Failures still upload artifacts. Self-approval status comments are marker-upserted, but only for Sepo-authored comments.
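
As a rough illustration of steps 5 and 6, the structured verdict could be parsed along these lines before any resolver checks run. The JSON field names `verdict` and `reason` are assumptions; `inspected_head_sha` and the three verdict values come from this thread. The function name is hypothetical.

```typescript
type KnownVerdict = "APPROVE" | "REQUEST_CHANGES" | "BLOCKED";

type ParsedVerdict =
  | { ok: true; verdict: KnownVerdict; reason: string; inspectedHeadSha: string }
  | { ok: false; error: string };

// Narrows an unknown JSON value to one of the three known verdicts.
function asKnownVerdict(value: unknown): KnownVerdict | null {
  if (value === "APPROVE" || value === "REQUEST_CHANGES" || value === "BLOCKED") {
    return value;
  }
  return null;
}

// Parses the agent's raw output; anything malformed becomes an explicit error
// rather than an exception, so the resolver can report it on the PR.
function parseSelfApprovalVerdict(raw: string): ParsedVerdict {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch (error) {
    return { ok: false, error: "verdict is not valid JSON" };
  }
  if (typeof data !== "object" || data === null) {
    return { ok: false, error: "verdict must be a JSON object" };
  }
  const record = data as Record<string, unknown>;
  const verdict = asKnownVerdict(record["verdict"]);
  if (verdict === null) {
    return { ok: false, error: "unknown or missing verdict" };
  }
  const reason = record["reason"];
  const inspected = record["inspected_head_sha"];
  return {
    ok: true,
    verdict,
    reason: typeof reason === "string" ? reason : "",
    // A missing SHA is kept as "" so a later check can reject it for APPROVE.
    inspectedHeadSha: typeof inspected === "string" ? inspected.trim() : "",
  };
}
```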

On PR strategy: I would choose option 2: split into smaller stacked PRs, not start over from scratch.

Starting a new PR gives a cleaner conversation, but it does not reduce the review surface unless we also split the code. The current patch has become a bundle of several separable concerns, and this is a security/trust-boundary feature, so smaller review units are worth it.

Suggested split:

  1. Review provenance infrastructure
    Capture reviewed PR head, stamp synthesis with reviewed head, parse synthesis-head marker, add tests.

  2. Self-approval core
    Add AGENT_ALLOW_SELF_APPROVE, route/prompt/workflow shell, parser, resolver, PR-only/open/head checks, docs/tests. Keep orchestration disabled or manually dispatchable only for this PR.

  3. Orchestrator integration
    Add review -> agent-self-approve -> fix-pr/stop, delegated authorization checks, max-round behavior, handoff context.

  4. Operational polish
    Status upsert, Sepo-authored marker filtering, always() artifacts/status, failure visibility.

I would not ask the orchestrator to independently invent all four. Better: use the current PR as the reference implementation, then have orchestrator/fix-pr produce stacked PRs with explicit ownership for each slice. That preserves the work already done while making review much easier.

@lolipopshock
Contributor

@sepo-agent /answer

Can you create a new issue detailing this?

@sepo-agent-app
Author

sepo-agent-app Bot commented May 8, 2026

Created the issue: #170

It captures the current architecture, recommends using PR #166 as the reference implementation, and proposes a four-PR stacked split: review provenance, self-approval core, orchestrator integration, and operational polish.

Labels

agent Handled by the agent effort/high Project management: high effort priority/p1 Project management: high priority

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement requested change

2 participants