Skip to content

fix(codex): require final two-phase approval decisions#70751

Merged
steipete merged 2 commits intoopenclaw:mainfrom
Lucenx9:codex/fix-two-phase-approval-decision
Apr 24, 2026
Merged

fix(codex): require final two-phase approval decisions#70751
steipete merged 2 commits intoopenclaw:mainfrom
Lucenx9:codex/fix-two-phase-approval-decision

Conversation

@Lucenx9
Copy link
Copy Markdown
Contributor

@Lucenx9 Lucenx9 commented Apr 23, 2026

Summary

  • Problem: Codex app-server approval bridges trusted a decision field returned by the initial plugin.approval.request call.
  • Why it matters: plugin.approval.request is a two-phase registration step; granting from that response can bypass the final plugin.approval.waitDecision confirmation path.
  • What changed: command/file/permission approvals and MCP approval elicitations now wait for plugin.approval.waitDecision whenever an approval route exists.
  • What did NOT change (scope boundary): decision: null from the request response still fails closed immediately for the existing no-approval-route case; approval payload shape and granted permission mapping are unchanged.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes N/A
  • Related N/A
  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: The Codex bridge treated the registration response from plugin.approval.request as potentially final even though it always requests twoPhase: true.
  • Missing detection / guardrail: Tests covered the normal two-call path but did not assert that an eager request-time allow-always is ignored.
  • Contributing context (if known): The gateway uses decision: null on the request response to report no approval route, which made the request response look decision-shaped.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: extensions/codex/src/app-server/approval-bridge.test.ts, extensions/codex/src/app-server/elicitation-bridge.test.ts
  • Scenario the test should lock in: A request-time allow-always is ignored and the final waitDecision result controls the outcome.
  • Why this is the smallest reliable guardrail: The behavior is isolated in the two Codex bridge helpers that consume plugin approval decisions.
  • Existing test that already covers this (if any): Existing route tests covered normal accepted flows and no-route decision: null, but not conflicting request-time decisions.
  • If no new test is added, why not: N/A

User-visible / Behavior Changes

Codex approval prompts still look the same, but final approval decisions now always come from the two-phase wait endpoint when an approval route exists.

Diagram (if applicable)

Before:
plugin.approval.request -> response.decision=allow-always -> approved-session

After:
plugin.approval.request -> status/id -> plugin.approval.waitDecision -> final outcome

Security Impact (required)

  • New permissions/capabilities? (Yes/No) No
  • Secrets/tokens handling changed? (Yes/No) No
  • New/changed network calls? (Yes/No) No
  • Command/tool execution surface changed? (Yes/No) Yes
  • Data access scope changed? (Yes/No) No
  • If any Yes, explain risk + mitigation: This reduces the command/tool approval surface by preventing request-time approval decisions from granting Codex native command/file/permission approvals or MCP tool approval elicitations.

Repro + Verification

Environment

  • OS: Linux
  • Runtime/container: local Node/pnpm workspace
  • Model/provider: Codex app-server bridge tests
  • Integration/channel (if any): Codex extension
  • Relevant config (redacted): default test config

Steps

  1. Return { id, status: "accepted", decision: "allow-always" } from plugin.approval.request.
  2. Return { id, decision: "deny" } from plugin.approval.waitDecision.
  3. Run Codex approval and MCP elicitation bridge tests.

Expected

  • The bridge ignores request-time allow-always and returns the denied final outcome.

Actual

  • Before this PR, request-time decisions could be consumed directly.
  • After this PR, final waitDecision controls the outcome.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: command approval bridge and MCP elicitation bridge ignore request-time allow-always; no-route decision: null still fails closed without waiting.
  • Edge cases checked: permission approval tests still preserve granted permission response mapping after moving to final decisions.
  • What you did not verify: Live Codex app-server approval flow against a real UI approval client.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: If a future non-two-phase caller reused this helper, request-time decisions would be ignored.
    • Mitigation: requestPluginApproval always sends twoPhase: true; tests assert the expected two-call contract.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 23, 2026

Greptile Summary

This PR fixes a security bypass in the Codex app-server approval bridges where a decision field returned by the initial plugin.approval.request call could be trusted directly, circumventing the required two-phase plugin.approval.waitDecision confirmation. The fix introduces approvalRequestExplicitlyUnavailable, which narrows the short-circuit path to only decision: null (the gateway's signal for "no approval route"), and routes all other cases through waitForPluginApprovalDecision. The shell script change adds a more resilient temp-directory cleanup that gracefully handles permission-denied scenarios.

Confidence Score: 5/5

Safe to merge — the security fix is logically correct, all affected code paths are covered by updated and new tests, and no regressions are introduced.

The core logic change is narrow and correct: approvalRequestExplicitlyUnavailable properly differentiates decision: null (explicit no-route signal) from all other decision values (which must now go through waitDecision). The prototype-chain edge case is explicitly tested. Existing tests were updated to reflect the new two-call contract, and two new regression tests were added. The shell script change is additive and defensive. No P0 or P1 issues found.

No files require special attention.

Reviews (3): Last reviewed commit: "ci: retrigger pull request checks" | Re-trigger Greptile

@Lucenx9 Lucenx9 force-pushed the codex/fix-two-phase-approval-decision branch from 0a05f15 to 9f94a53 Compare April 23, 2026 19:47
@Lucenx9
Copy link
Copy Markdown
Contributor Author

Lucenx9 commented Apr 23, 2026

Addressed in df458e2fea: the bridges now treat request-time decision as final only when it is an own-property with value null (the no-approval-route sentinel). Non-null decisions and inherited decision values always go through plugin.approval.waitDecision. Added regression coverage for inherited decision: null in both approval and elicitation bridges.

@Lucenx9
Copy link
Copy Markdown
Contributor Author

Lucenx9 commented Apr 23, 2026

I am not applying the shorter local timeout suggestion here. The long wait is the intended human approval window and is already bounded by DEFAULT_CODEX_APPROVAL_TIMEOUT_MS plus the gateway call timeout. Treating non-null request-time decisions as final would reintroduce the original authorization bypass; adding a much shorter bridge-local timeout would make legitimate approvals fail while not changing the trust boundary. If this needs tighter availability controls, it should be handled by approval concurrency/backpressure policy rather than by trusting the request response or shortening the UX approval window in this bridge.

@openclaw-barnacle openclaw-barnacle Bot added scripts Repository scripts docker Docker and sandbox tooling labels Apr 23, 2026
@Lucenx9 Lucenx9 marked this pull request as draft April 23, 2026 20:20
@Lucenx9 Lucenx9 marked this pull request as ready for review April 23, 2026 20:20
@Lucenx9
Copy link
Copy Markdown
Contributor Author

Lucenx9 commented Apr 23, 2026

Temporarily closing/reopening to retrigger pull_request workflows after GitHub only dispatched label checks on the latest head SHA.

@Lucenx9 Lucenx9 closed this Apr 23, 2026
@Lucenx9 Lucenx9 reopened this Apr 23, 2026
@Lucenx9 Lucenx9 force-pushed the codex/fix-two-phase-approval-decision branch from 28730eb to a69edcd Compare April 23, 2026 20:26
@openclaw-barnacle openclaw-barnacle Bot removed scripts Repository scripts docker Docker and sandbox tooling labels Apr 23, 2026
@steipete steipete force-pushed the codex/fix-two-phase-approval-decision branch from a69edcd to 09959cc Compare April 24, 2026 04:09
steipete and others added 2 commits April 24, 2026 05:24
Co-authored-by: Lucenx9 <185146821+Lucenx9@users.noreply.github.com>
Co-authored-by: Lucenx9 <185146821+Lucenx9@users.noreply.github.com>
@steipete steipete force-pushed the codex/fix-two-phase-approval-decision branch from a686eec to b17f2ac Compare April 24, 2026 04:25
@aisle-research-bot
Copy link
Copy Markdown

aisle-research-bot Bot commented Apr 24, 2026

🔒 Aisle Security Analysis

We found 1 potential security issue(s) in this PR:

# Severity Title
1 🟡 Medium Potential application-layer DoS via forced two-phase approval wait (ignoring immediate decisions)
1. 🟡 Potential application-layer DoS via forced two-phase approval wait (ignoring immediate decisions)
Property Value
Severity Medium
CWE CWE-400
Location extensions/codex/src/app-server/approval-bridge.ts:103-105

Description

The app-server now always waits for the second-phase plugin.approval.waitDecision unless the plugin's plugin.approval.request response contains an own data property decision: null.

This can enable an application-layer DoS / resource exhaustion scenario:

  • Input/control: a (buggy or attacker-controlled) plugin can respond to plugin.approval.request with an id (so the request is considered routable) but never send/emit a decision for plugin.approval.waitDecision.
  • Sink: handleCodexAppServerApprovalRequest blocks awaiting waitForPluginApprovalDecision(...).
  • Impact: each approval request can tie up server resources until the gateway timeout (120s) elapses or the run is aborted; with sufficient concurrent approvals this can degrade service.

While there is a gateway-side timeout, this change increases exposure by no longer honoring request-time decisions (even if present), meaning plugins that previously completed in one phase will now incur the full wait or timeout.

Vulnerable code:

const decision = approvalRequestExplicitlyUnavailable(requestResult)
  ? null
  : await waitForPluginApprovalDecision({ approvalId, signal: params.signal });

Recommendation

Add a bounded, explicit server-side timeout and/or concurrency controls around the wait, and consider using request-time decisions when safe/expected.

Example (explicit timeout wrapping the wait):

import { setTimeout as delay } from "node:timers/promises";

const decision = approvalRequestExplicitlyUnavailable(requestResult)
  ? null
  : await Promise.race([
      waitForPluginApprovalDecision({ approvalId, signal: params.signal }),
      delay(30_000, null, { signal: params.signal }), // shorter app-server bound
    ]);

Additionally:

  • Enforce per-run/per-session limits on concurrent pending approvals.
  • Fail closed quickly (deny/unavailable) when a plugin repeatedly times out on waitDecision.
  • If the system supports both one-phase and two-phase flows, gate waiting on an explicit twoPhase/capability flag rather than implicitly waiting by default.

Analyzed PR: #70751 at commit b17f2ac

Last updated on: 2026-04-24T04:27:07Z

@steipete steipete merged commit 0985576 into openclaw:main Apr 24, 2026
9 checks passed
@steipete
Copy link
Copy Markdown
Contributor

Codex review follow-up: landed as 098557623f.

Rebased after #70569 and kept the final two-phase approval decision as the authority while preserving the explicit no-route sentinel. Local proof on the rebased branch: pnpm check:changed (extension typecheck/lint/import-cycle guard plus 20 test files, 157 tests). Thanks @Lucenx9.

Angfr95 pushed a commit to Angfr95/openclaw that referenced this pull request Apr 25, 2026
Require the Codex app-server bridge to wait for the final two-phase approval decision, while preserving the explicit no-route sentinel behavior.

Local gate on rebased branch: pnpm check:changed (20 files, 157 tests).

Thanks @Lucenx9.

Co-authored-by: Lucenx9 <185146821+Lucenx9@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants