Codex returns 'verdict: blocked' on completed work — three distinct causes mask successful task output

## Summary

I'm using `sapoto-codex:implement` (the `--background` task pipeline) to delegate plan-driven work from Claude Code while Claude orchestrates. In ~30 min of usage with `gpt-5.4 --effort high`, I hit three distinct failure modes that all produce the same `verdict: blocked` summary even though the underlying work was completed and verified. Surfacing all three because they compound: a downstream consumer (Claude) gets a single "blocked" signal with no actionable distinction between *"I couldn't do the work"* and *"I did the work but the contract refuses to let me commit it."*

Plugin: `@openai/codex-plugin-cc` v1.0.4
Codex CLI: `codex-cli 0.114.0`
Auth: ChatGPT (madanapallikalyan@gmail.com)
Host: macOS (Darwin 24.6.0), Apple Silicon
Plugin install: `/Users/kalyanmadanapalli/Desktop/sapoto-codex-plugin/`
Companion: `node .../codex/scripts/codex-companion.mjs task ...`

## Issue 1 — Sandbox blocks `git commit` inside a linked git worktree

### Reproduction

Branch is on a *linked* git worktree (`git worktree add ...`). The worktree's `.git` is a file pointing to `<main>/.git/worktrees/<name>/`. Codex makes file changes inside the worktree's working tree successfully, then tries `git commit`:

```
fatal: Unable to create '/Users/<me>/Desktop/automatic-document-fetcher/.worktrees/resilient-retry-scheduler/.git/worktrees/resilient-retry-scheduler/index.lock': Operation not permitted
```

The sandbox grants write access to the worktree's working tree but blocks writes to the linked git directory at `<main>/.git/worktrees/<name>/`. So `git commit` can't acquire `index.lock`.
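To make the path relationship concrete: in a linked worktree the `.git` entry is a one-line pointer file (`gitdir: <path>`), and `index.lock` is created inside the directory it names. A minimal TypeScript sketch of checking whether that directory falls outside the working tree follows. It assumes the pointer file's text has already been read into a string, and `parseGitdirPointer` and `gitDirOutsideWorktree` are hypothetical helper names, not part of the plugin or the Codex CLI. It says nothing about how the sandbox actually computes its writable roots.

```typescript
// Sketch only: find where a linked worktree keeps its index and lock files.
// Assumption: `dotGitContents` is the text of the worktree's `.git` pointer file.

/** Returns the git directory named by a `.git` pointer file, or null if the text is not a pointer. */
function parseGitdirPointer(dotGitContents: string): string | null {
  // A linked worktree's `.git` file holds a single line: "gitdir: <path>".
  const match = /^gitdir:\s*(.+?)\s*$/m.exec(dotGitContents);
  return match ? match[1] : null;
}

/** True when `gitDir` is not located under `worktreeRoot` (plain prefix check on absolute paths). */
function gitDirOutsideWorktree(worktreeRoot: string, gitDir: string): boolean {
  const root = worktreeRoot.replace(/\/+$/, "") + "/";
  return gitDir.indexOf(root) !== 0;
}
```

When the second function returns `true`, a write allowance scoped only to the working tree would not cover the lock file. `git rev-parse --git-dir` reports the same directory without parsing the pointer file by hand.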

This appears to be the same general class as #240 (sandbox config issues) but specific to git worktrees on macOS. Workaround per project's CLAUDE.md is to have Codex write files only and commit yourself afterward, but that surfaces as `verdict: blocked` in the companion output, which forces the orchestrator to read the raw `.log` file to recover what actually happened.

### Concrete log line

```
[2026-04-26T20:25:17.005Z] Assistant message
- verdict: blocked
- ...
- notes:
  - `git commit` was blocked by sandbox permissions: `fatal: Unable to create '.../index.lock': Operation not permitted`.
  - Acceptance items 1-4 are implemented and verified; the only incomplete part of the completeness contract is the required commit.
```

### Suggested fix
- Detect the sandbox-permission flavor of `git commit` failure separately and emit a distinct verdict like `verdict: ready_to_commit` or `commit_blocked_by_sandbox` so orchestrators can auto-commit. OR
- Allow the sandbox to write to `.git/worktrees/<name>/` paths under the user's repo root, the same way it allows writing to the working tree. OR
- Explicitly document in `CLAUDE.md`-style guidance that linked-worktree commits will fail and the orchestrator should commit after a "blocked" report — and surface a structured field in the JSON result that flags this case.
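The first option could be sketched roughly as below: classify the `git commit` failure from its exit code and stderr before deciding on a verdict. This is an illustration, not the plugin's code. `classifyCommitResult` and the `CommitOutcome` names are hypothetical, and the pattern is based only on the error text quoted above (the `Permission denied` variant is an assumption about how the same denial might surface elsewhere).

```typescript
// Sketch only: separate "the sandbox denied the lock file" from other commit failures.
type CommitOutcome = "committed" | "commit_blocked_by_sandbox" | "commit_failed";

function classifyCommitResult(exitCode: number, stderr: string): CommitOutcome {
  if (exitCode === 0) return "committed";
  // Matches the failure quoted above: git could not create a lock file
  // because the OS refused the write.
  const lockDenied = /Unable to create '[^']*\.lock': (Operation not permitted|Permission denied)/;
  return lockDenied.test(stderr) ? "commit_blocked_by_sandbox" : "commit_failed";
}
```

Any other non-zero exit (nothing to commit, a failing hook, a lock held by another process) stays in the generic `commit_failed` bucket, so only the sandbox case gets the distinct verdict.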

## Issue 2 — `result.rawOutput` returns `null` despite a final summary existing

### Reproduction

After a job completes, `codex-companion.mjs result <task-id> --json` returns:

```json
{
  "job": {
    "status": "completed",
    "summary": "- verdict: blocked",
    "result": null
  }
}
```

But the corresponding job log file (`~/.claude/plugins/data/codex-openai-codex/state/<workspace>/jobs/<task-id>.log`) clearly contains a `[Final output]` block with the structured report; the companion just doesn't surface it. Because `result` is `null`, the orchestrator cannot programmatically extract `files_changed`, the `tests` results, or `notes`: all of them are in the log, but none are reachable through `result.rawOutput`.

This is adjacent to #264 ("per-job state JSON stuck at status=running ...") but specific to the case where `status=completed` but `result` is `null`.

### Suggested fix
- Plumb the final assistant message through from the in-memory turn state into the persisted `result.rawOutput`, even when the only emission is a structured summary. Or fall back to the last `Assistant message` in the log if `rawOutput` is not set.
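A rough sketch of the fallback half of that suggestion, for illustration only. It assumes every log entry starts with a header line of the form `[<ISO timestamp>] <label>`, as in the excerpt under Issue 1, and that the job object looks like the JSON above with an optional `result.rawOutput` string. `lastAssistantMessage`, `finalReport`, and the `JobRecord` interface are hypothetical names, not the companion's actual internals.

```typescript
// Assumption: each log entry begins with a line like
// "[2026-04-26T20:25:17.005Z] Assistant message".
const ENTRY_HEADER = /^\[\d{4}-\d{2}-\d{2}T[\d:.]+Z\] (.*)$/;

/** Returns the body of the last "Assistant message" entry in the log, or null if there is none. */
function lastAssistantMessage(logText: string): string | null {
  const lines = logText.split(/\r?\n/);
  let start = -1;
  for (let i = 0; i < lines.length; i++) {
    const header = ENTRY_HEADER.exec(lines[i]);
    if (header && header[1] === "Assistant message") start = i;
  }
  if (start === -1) return null;
  // Collect body lines until the next entry header (or the end of the log).
  const body: string[] = [];
  for (let i = start + 1; i < lines.length && !ENTRY_HEADER.test(lines[i]); i++) {
    body.push(lines[i]);
  }
  return body.join("\n").trim();
}

// Assumed shape, inferred from the JSON shown above.
interface JobRecord {
  status: string;
  summary: string;
  result: { rawOutput: string } | null;
}

/** Prefers the persisted rawOutput and falls back to the log when it is missing. */
function finalReport(job: JobRecord, logText: string): string | null {
  if (job.result && job.result.rawOutput) return job.result.rawOutput;
  return lastAssistantMessage(logText);
}
```

With a fallback like this, `result ... --json` could return the structured report even when the in-memory turn state was never persisted.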

## Issue 3 — Codex runs the full test suite despite explicit "only run targeted tests" guidance

### Reproduction

My brief said:

> ## Pre-existing test failures (IMPORTANT — don't be blocked by these)
> Pre-existing failures unrelated to this work in: `browserWindowRegistry.test.ts`, `chromeSetupStatusResolver.test.ts`, `notificationService.test.ts`, `autoExportRepo.test.ts`. Do NOT run the full `pnpm test` suite — only run the targeted tests above.

Plus a `<verification>` block listing the exact commands.

Codex ran the targeted tests (correctly) AND THEN ran `pnpm test` (full suite). The full suite hit the four pre-existing failures listed above. Codex treated these as part of the completeness contract → refused to commit → `verdict: blocked`. Files were correctly modified, targeted tests passed, typecheck clean — but the orchestrator gets a "blocked" verdict and has to manually verify and commit.

This is essentially the completeness contract failing to differentiate "tests I could have caused" from "tests already broken on the parent branch." On a large codebase with any flaky/red tests on `main`, this means every Codex implement task will go "blocked" unless the user is on a perfectly-green branch.

### Suggested fix
- Have the verification step honor the brief's explicit list of commands and NOT run additional commands (especially `pnpm test`) without a prompt-side opt-in. OR
- Compare which tests were red BEFORE the diff vs. AFTER, and only block on tests the diff caused to fail. Existing red tests = warn-with-context, not block. OR
- Allow the brief to set a `--no-full-suite` flag or honor the project's `CLAUDE.md` "pre-existing failures" markers if present.
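The before/after comparison in the second option amounts to a set difference over failing test names. A minimal sketch, assuming the lists of failing tests on the parent branch and after the diff are already available as string arrays; how Codex would collect them is out of scope here, and `splitFailures` and `shouldBlock` are hypothetical names.

```typescript
interface FailureSplit {
  introduced: string[];   // red after the diff, green (or absent) before it
  preexisting: string[];  // already red on the parent branch
}

/** Partitions the post-diff failures by whether they were already failing beforehand. */
function splitFailures(failingBefore: string[], failingAfter: string[]): FailureSplit {
  const introduced: string[] = [];
  const preexisting: string[] = [];
  for (const test of failingAfter) {
    (failingBefore.indexOf(test) === -1 ? introduced : preexisting).push(test);
  }
  return { introduced, preexisting };
}

/** Block only on failures the diff introduced; pre-existing ones become warnings. */
function shouldBlock(split: FailureSplit): boolean {
  return split.introduced.length > 0;
}
```

In the run described above, all four failing files would land in `preexisting`, so `shouldBlock` would return `false` and the failures could be reported as context instead of as a blocker.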

## Why these compound into a usability problem

A consumer like Claude Code orchestrating a 20-task implementation plan via `sapoto-codex:implement` gets the same `verdict: blocked` signal for three different underlying states:

| Cause | Real state | What surfaces |
|---|---|---|
| Sandbox can't commit | Done, just needs commit | `verdict: blocked` |
| Pre-existing test red | Done, tests pass for the diff | `verdict: blocked` |
| Genuine implementation failure | Not done | `verdict: blocked` |

The orchestrator can't distinguish these without reading the log file every time. Today I'm running these three commands on every "blocked" return:

```bash
node codex-companion.mjs result <task-id> --json | jq '.job.summary'
git status --short                    # was anything actually changed?
tail -120 .../jobs/<task-id>.log     # what's the real verdict?
```

A structured verdict enum (`done` / `commit_blocked_by_sandbox` / `red_tests_outside_diff` / `blocked`) would make the result programmatically actionable without log spelunking.
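As an illustration of what that would buy an orchestrator, here is a sketch of consuming such an enum. The four verdict values are the ones proposed above. The `NextStep` names, `parseVerdict`, and `nextStep` are hypothetical, and the mapping is one possible policy, not a description of current plugin behavior.

```typescript
type Verdict = "done" | "commit_blocked_by_sandbox" | "red_tests_outside_diff" | "blocked";
type NextStep = "continue" | "verify_and_commit" | "review_failures_then_commit" | "escalate";

const VERDICTS: Verdict[] = ["done", "commit_blocked_by_sandbox", "red_tests_outside_diff", "blocked"];

/** Treats any unknown or missing value as a plain "blocked" so new cases fail safe. */
function parseVerdict(raw: string): Verdict {
  const value = raw.trim() as Verdict;
  return VERDICTS.indexOf(value) !== -1 ? value : "blocked";
}

// One possible orchestrator policy per verdict.
const NEXT_STEP: Record<Verdict, NextStep> = {
  done: "continue",
  commit_blocked_by_sandbox: "verify_and_commit",
  red_tests_outside_diff: "review_failures_then_commit",
  blocked: "escalate",
};

function nextStep(verdict: Verdict): NextStep {
  return NEXT_STEP[verdict];
}
```

Only the genuine `blocked` case would then need a human or a log read; the other two can be handled by the orchestrator's existing verify-and-commit routine.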

## Reproducer

If useful, I can share a sanitized log file plus the exact brief that triggered all three failure modes in two consecutive Codex runs (T7 fix-up: triggered Issue 3, blocked despite a green diff; T8 implement: triggered Issue 1, blocked on commit despite green tests, green typecheck, and a green diff).

## Workarounds in use

- After every Codex completion, my orchestrator runs `git status --short && pnpm typecheck:all && pnpm test --run <targeted-files>` and commits manually if everything is green. This is the documented pattern in our project's CLAUDE.md ("If git commit is blocked in the codex sandbox (common), have codex write files only and commit yourself afterward") but it requires the orchestrator to second-guess every "blocked" verdict.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
