[codex] docs: update Codex plugin install guidance #1288

Merged
arittr merged 1 commit into dev from codex/pri-1367-codex-install-docs on Apr 28, 2026

arittr (Collaborator) commented Apr 27, 2026

What problem are you trying to solve?

PRI-1367 block 3 identified that the dedicated Codex install docs were still sending users down the older clone-and-symlink path even though the README now presents the Codex CLI/App plugin install as canonical.

The concrete user failure mode was: a new Codex user following docs/README.codex.md or an older raw .codex/INSTALL.md link would be told to clone obra/superpowers, create a ~/.agents/skills/superpowers symlink or Windows junction, and update by git pull. That conflicts with the current Codex plugin marketplace flow and leaves stale bootstrap/symlink setup looking like the recommended path.

While reviewing that cleanup, we also found one stale Codex reference note in skills/using-superpowers/references/codex-tools.md: it described future custom-agent support as an agents field in plugin.json plus a symlink mirroring the old skills install. Current OpenAI plugin examples can include plugin-level agents/ directories, while current Codex still does not expose those as named spawn_agent targets.

What does this PR change?

This removes the obsolete Codex-only install docs (.codex/INSTALL.md and docs/README.codex.md) so the repository relies on the current README Codex CLI/App plugin install path. It also updates the Codex tool-mapping reference to describe the current custom-agent limitation without the stale manifest/symlink claim.

Is this change appropriate for the core library?

Yes. Codex is a supported Superpowers harness, and these files were core user-facing installation docs for that harness. Removing obsolete setup instructions keeps users on the current Codex plugin path. The skill-reference edit is not a new skill or behavior change; it corrects a factual Codex platform note used by existing Superpowers skills when adapting Claude Code-style named agents to Codex spawn_agent.

What alternatives did you consider?

  1. Rewrite .codex/INSTALL.md as a compatibility landing page for old raw links. Initially chosen, then rejected after Drew decided the cleaner path is to drop the old file entirely and rely only on the current plugin install method.
  2. Leave clone-and-symlink as a manual fallback. Rejected because it keeps the obsolete flow looking endorsed.
  3. Keep docs/README.codex.md as a detailed guide. Rejected because it duplicated the README install flow and risked reintroducing stale details.
  4. Leave codex-tools.md alone because it is under skills/. Rejected after review because the specific section was factually stale and small enough to correct without changing workflow semantics.

Does this PR contain multiple unrelated changes?

No. All changes are part of the same Codex cleanup: remove obsolete Codex install entrypoints, rely on the current plugin install instructions in README, and remove one stale Codex plugin/agent packaging claim discovered while auditing the same docs.

Existing PRs

No open duplicate was found. Open search results for Codex/install-like wording were unrelated new-harness docs such as #881, #1148, and #516.

#430 migrated Codex from the older bootstrap CLI to native skill discovery, which is the now-obsolete symlink path this PR removes from current docs. #623 and #731 were closed Codex installer/docs attempts around older behavior. #904 was a closed attempt to add Codex marketplace artifacts and docs before the current committed plugin-artifact shape. #1261 is the key prerequisite that committed .codex-plugin/plugin.json and assets as the source of truth. #1284 is the recent OpenCode install-doc cleanup pattern, not a Codex duplicate.

Related issue: #1139 was closed after official native Codex plugin support shipped.

Environment tested

| Harness (e.g. Claude Code, Cursor) | Harness version | Model | Model version/ID |
| --- | --- | --- | --- |
| Codex desktop | not exposed in UI | GPT-5 | not exposed in UI |

Evaluation

  • What was the initial prompt you (or your human partner) used to start the session that led to this change?

Drew asked to continue PRI-1367 and inspect block 3, then requested three subagents to grok the codebase, ticket, and task before continuing.

  • How many eval sessions did you run AFTER making the change?

0 runtime agent eval sessions. This is a docs cleanup plus a factual platform-reference correction, not a runtime skill-behavior change.

  • How did outcomes change compared to before the change?

Before: docs/README.codex.md and .codex/INSTALL.md made clone + ~/.agents/skills symlink setup look like the primary Codex install path, and codex-tools.md described future plugin custom-agent support using a stale plugin.json/symlink model.

After: current user-facing Codex install guidance lives in README and points to Codex CLI/App plugin installation; the obsolete Codex-specific docs are gone; and the Codex tool mapping describes the current custom-agent limitation in terms of named spawn_agent target availability.

Verification run after the final change set:

git diff --check
bash scripts/bump-version.sh --check
bash tests/codex-plugin-sync/test-sync-to-codex-plugin.sh

Additional audit after the rebase:

rg -n 'README\.codex|\.codex/INSTALL|Codex install-doc|compatibility landing|legacy landing|old raw|Just clone and symlink|Create the skills symlink|Fetch and follow instructions from https://raw\.githubusercontent\.com/obra/superpowers/refs/heads/main/\.codex/INSTALL\.md|RawPluginManifest gains an `agents` field|mirroring the existing `skills/` symlink|plugin can symlink to `agents/`' README.md docs skills/using-superpowers/references/codex-tools.md --glob '!docs/superpowers/**'

That audit returned no obsolete Codex hits.

Rigor

  • If this is a skills change: I used superpowers:writing-skills and completed adversarial pressure testing (paste results below)
    • I used superpowers:writing-skills before editing the skill reference file. I did not run adversarial pressure testing because the edit is a factual reference correction, not a change to the skill trigger, workflow, or behavior-shaping instructions. The change was validated against current OpenAI plugin examples and current Codex manifest-loading source instead.
  • This change was tested adversarially, not just on the happy path
    • For the docs/sync path: the Codex plugin sync regression exercises ignored files, dirty local destination protection, bootstrap dry-run behavior, missing manifest handling, and no-op local apply. For the factual reference note: the stale phrasing audit confirmed the obsolete manifest/symlink claims were removed.
  • I did not modify carefully-tuned content (Red Flags table, rationalizations, "human partner" language) without extensive evals showing the change is an improvement

Human review

  • A human has reviewed the COMPLETE proposed diff before submission

Drew reviewed the complete local git diff and replied approved before this PR was opened. Drew then explicitly requested dropping .codex/INSTALL.md and docs/README.codex.md from the PR so the repo relies only on the current Codex plugin install method.

arittr marked this pull request as ready for review on April 27, 2026 21:01
arittr force-pushed the codex/pri-1367-codex-install-docs branch from 4c06225 to 6d7c852 on April 27, 2026 22:16

obra (Owner) commented Apr 28, 2026

lgtm

arittr force-pushed the codex/pri-1367-codex-install-docs branch from 6d7c852 to 3d7b32b on April 28, 2026 23:07
arittr merged commit 4c7c544 into dev on Apr 28, 2026
obra added a commit that referenced this pull request May 4, 2026
* docs: add Codex App compatibility design spec (PRI-823)

Design for making using-git-worktrees, finishing-a-development-branch,
and subagent-driven-development skills work in the Codex App's sandboxed
worktree environment. Read-only environment detection via git-dir vs
git-common-dir comparison, ~48 lines across 4 files, zero breaking changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
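
The git-dir vs git-common-dir comparison mentioned in this commit can be sketched roughly as follows. This is a minimal illustration under assumptions, not the skill's actual wording: the function names `paths_differ` and `in_linked_worktree` are hypothetical, and the return codes are invented for the example. Only `git rev-parse --git-dir` and `git rev-parse --git-common-dir` are real git queries. A later commit in this list notes that the same inequality also holds in submodules, which this sketch does not guard against.

```shell
#!/usr/bin/env bash
# Read-only sketch: nothing in the repository is modified.

# Hypothetical pure helper: succeeds when the two paths differ.
paths_differ() {
  [ "$1" != "$2" ]
}

# Hypothetical wrapper. Returns 0 inside a linked worktree, 1 in the main
# working tree, and 2 when the current directory is not in a git repository.
in_linked_worktree() {
  local git_dir common_dir
  git_dir=$(git rev-parse --git-dir 2>/dev/null) || return 2
  common_dir=$(git rev-parse --git-common-dir 2>/dev/null) || return 2
  # Resolve both to physical absolute paths so that ".git" and
  # "/abs/path/.git" compare as equal.
  git_dir=$(cd "$git_dir" && pwd -P) || return 2
  common_dir=$(cd "$common_dir" && pwd -P) || return 2
  paths_differ "$git_dir" "$common_dir"
}
```

A caller might write `if in_linked_worktree; then echo "already isolated"; fi` and skip worktree creation in that case.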

* docs: address spec review feedback for PRI-823

Fix three Important issues from spec review:
- Clarify Step 1.5 placement relative to existing Steps 2/3
- Re-derive environment state at cleanup time instead of relying on
  earlier skill output
- Acknowledge pre-existing Step 5 cleanup inconsistency

Also: precise step references, exact codex-tools.md content, clearer
Integration section update instructions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: address team review feedback for PRI-823 spec

- Add commit SHA + data loss warning to handoff payload (HIGH)
- Add explicit commit step before handoff (HIGH)
- Remove misleading "mark as externally managed" from Path B
- Add executing-plans 1-line edit (was missing)
- Add branch name derivation rules
- Add conditional UI language for non-App environments
- Add sandbox fallback for permission errors
- Add STOP directive after Step 0 reporting

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: clarify executing-plans in What Does NOT Change section

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add cleanup guard test (#5) and sandbox fallback test (#10) to spec

Both tests address real risk scenarios:
- #5: cleanup guard bug would delete Codex App's own worktree (data loss)
- #10: Local thread sandbox fallback needs manual Codex App validation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add implementation plan for Codex App compatibility (PRI-823)

8 tasks covering: environment detection in using-git-worktrees,
Step 1.5 + cleanup guard in finishing-a-development-branch,
Integration line updates, codex-tools.md docs, automated tests,
and final verification.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs(codex-tools): add named agent dispatch mapping for Codex (#647)

* fix(writing-skills): correct false 'only two fields' frontmatter claim (#882)

* Replace subagent review loops with lightweight inline self-review

The subagent review loop (dispatching a fresh agent to review plans/specs)
doubled execution time (~25 min overhead) without measurably improving plan
quality. Regression testing across 5 versions (v3.6.0 through v5.0.4) with
5 trials each showed identical plan sizes, task counts, and quality scores
regardless of whether the review loop ran.

Changes:
- writing-plans: Replace subagent Plan Review Loop with inline Self-Review
  checklist (spec coverage, placeholder scan, type consistency)
- writing-plans: Add explicit "No Placeholders" section listing plan failures
  (TBD, vague descriptions, undefined references, "similar to Task N")
- brainstorming: Replace subagent Spec Review Loop with inline Spec Self-Review
  (placeholder scan, internal consistency, scope check, ambiguity check)
- Both skills now use "look at it with fresh eyes" framing

Testing: 5 trials with the new skill show self-review catches 3-5 real bugs
per run (spawn positions, API mismatches, seed bugs, grid indexing) in ~30s
instead of ~25 min. Remaining defects are comparable to the subagent approach.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Revert "Replace subagent review loops with lightweight inline self-review"

This reverts commit bf8f757.

* Reapply "Replace subagent review loops with lightweight inline self-review"

This reverts commit b045fa3.

* Add v5.0.6 release notes

* Move brainstorm server metadata to .meta/ subdirectory

Metadata files (.server-info, .events, .server.pid, .server.log,
.server-stopped) were stored in the same directory served over HTTP,
making them accessible via the /files/ route. They now live in a .meta/
subdirectory that is not web-accessible.

Also fixes a stale test assertion ("Waiting for Claude" → "Waiting for
the agent").

Reported-By: 吉田仁

* Revert "Move brainstorm server metadata to .meta/ subdirectory"

This reverts commit ab500da.

* Separate brainstorm server content and state into peer directories

The session directory now contains two peers: content/ (HTML served to
the browser) and state/ (events, server-info, pid, log). Previously
all files shared a single directory, making server state and user
interaction data accessible over the /files/ HTTP route.

Also fixes stale test assertion ("Waiting for Claude" → "Waiting for
the agent").

Reported-By: 吉田仁

* Fix owner-PID false positive when owner runs as different user

ownerAlive() treated EPERM (permission denied) the same as ESRCH
(process not found), causing the server to self-terminate within 60s
whenever the owner process ran as a different user. This affected WSL
(owner is a Windows process), Tailscale SSH, and any cross-user
scenario.

The fix: `return e.code === 'EPERM'` — if we get permission denied,
the process is alive; we just can't signal it.

Tested on Linux via Tailscale SSH with a root-owned grandparent PID:
- Server survives past the 60s lifecycle check (EPERM = alive)
- Server still shuts down when owner genuinely dies (ESRCH = dead)

Fixes #879

* Fix owner-PID lifecycle monitoring for cross-platform reliability

Two bugs caused the brainstorm server to self-terminate within 60s:

1. ownerAlive() treated EPERM (permission denied) as "process dead".
   When the owner PID belongs to a different user (Tailscale SSH,
   system daemons), process.kill(pid, 0) throws EPERM — but the
   process IS alive. Fixed: return e.code === 'EPERM'.

2. On WSL, the grandparent PID resolves to a short-lived subprocess
   that exits before the first 60s lifecycle check. The PID is
   genuinely dead (ESRCH), so the EPERM fix alone doesn't help.
   Fixed: validate the owner PID at server startup — if it's already
   dead, it was a bad resolution, so disable monitoring and rely on
   the 30-minute idle timeout.

This also removes the Windows/MSYS2-specific OWNER_PID="" carve-out
from start-server.sh, since the server now handles invalid PIDs
generically at startup regardless of platform.

Tested on Linux (magic-kingdom) via Tailscale SSH:
- Root-owned owner PID (EPERM): server survives ✓
- Dead owner PID at startup (WSL sim): monitoring disabled, survives ✓
- Valid owner that dies: server shuts down within 60s ✓

Fixes #879

* Release v5.0.6: inline self-review, brainstorm server restructure, owner-PID fixes

* fix: add Copilot CLI platform detection for sessionStart context injection

Copilot CLI v1.0.11 reads `additionalContext` from sessionStart hook
output, but the session-start script only emits the Claude Code-specific
nested format. Add COPILOT_CLI env var detection so Copilot CLI gets the
SDK-standard top-level `additionalContext` while Claude Code continues
getting `hookSpecificOutput`.

Based on PR #910 by @culinablaz.
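
The platform branch described in this commit might look something like the sketch below. It is an assumption-laden illustration, not the repository's session-start script: the function name `emit_session_start_json` is hypothetical, the check treats any non-empty `COPILOT_CLI` value as a match, the exact JSON field layout (including `hookEventName`) is assumed, and the context string is assumed to be JSON-escaped already.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the hook JSON in the shape each harness reads.
# $1 is the context text, assumed to be JSON-escaped by the caller.
emit_session_start_json() {
  local escaped_context=$1
  if [ -n "${COPILOT_CLI:-}" ]; then
    # Copilot CLI: top-level additionalContext.
    printf '{"additionalContext":"%s"}\n' "$escaped_context"
  else
    # Claude Code: nested under hookSpecificOutput (field layout assumed).
    printf '{"hookSpecificOutput":{"hookEventName":"SessionStart","additionalContext":"%s"}}\n' "$escaped_context"
  fi
}
```

Keeping the branch in one function means the escaping logic can stay shared and only the envelope differs per platform.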

* feat: add Copilot CLI tool mapping, docs, and install instructions

- Add references/copilot-tools.md with full tool equivalence table
- Add Copilot CLI to using-superpowers skill platform instructions
- Add marketplace install instructions to README
- Add changelog entry crediting @culinablaz for the hook fix

* fix(opencode): align skills path across bootstrap, runtime, and tests

The bootstrap text advertised a configDir-based skills path that didn't
match the runtime path (resolved relative to the plugin file). Tests
used yet another hardcoded path and referenced a nonexistent lib/ dir.

- Remove misleading skills path from bootstrap text; the agent should
  use the native skill tool, not read files by path
- Fix test setup to create a consistent layout matching the plugin's
  ../../skills resolution
- Export SUPERPOWERS_SKILLS_DIR from setup.sh so tests use a single
  source of truth
- Add regression test that bootstrap doesn't advertise the old path
- Remove broken cp of nonexistent lib/ directory

Fixes #847

* docs: add OpenCode path fix to release notes

* fix(opencode): inject bootstrap as user message instead of system message

Move bootstrap injection from experimental.chat.system.transform to
experimental.chat.messages.transform, prepending to the first user
message instead of adding a system message.

This avoids two issues:
- System messages repeated every turn inflate token usage (#750)
- Multiple system messages break Qwen and other models (#894)

Tested on OpenCode 1.3.2 with Claude Sonnet 4.5 — brainstorming skill
fires correctly on "Let's make a React to do list" prompt.

* docs: update release notes with OpenCode bootstrap change

* docs: add worktree rototill design spec (PRI-974)

Design for detect-and-defer worktree support. Superpowers defers to
native harness worktree systems when available, falls back to manual
git worktree creation when not. Covers Phases 0-2: detection, consent,
native tool preference, finishing state detection, and three bug fixes
(#940, #999, #238).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: address SWE review feedback on worktree rototill spec

- Fix Bug #999 order: merge → verify → remove worktree → delete branch
  (avoids losing work if merge fails after worktree removal)
- Add submodule guard to Step 0 detection (GIT_DIR != GIT_COMMON is also
  true in submodules)
- Preserve global path (~/.config/superpowers/worktrees/) in detection for
  backward compatibility, just stop offering it to new users
- Add step numbering note and implementation notes section
- Expand provenance heuristic to cover global path and manual creation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: honest spec revisions after issue/PR deep dive

- Step 1a is the load-bearing assumption, not just a risk — if it fails,
  the entire design needs rework. TDD validation must be first impl task.
- #1009 resolution depends on Step 1a working, stated explicitly
- #574 honestly deferred, not "partially addressed"
- Add hooks symlink to Step 1b (PR #965 idea, prevents silent hook loss)
- Add stale worktree pruning to Step 5 (PR #1072 idea, one-line self-heal)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add worktree rototill implementation plan (PRI-974)

5 tasks: TDD gate for Step 1a, using-git-worktrees rewrite,
finishing-a-development-branch rewrite, integration updates,
end-to-end validation. Task 1 is a hard gate — if native tool
preference fails RED/GREEN, stop and redesign.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: add RED/GREEN validation for native worktree preference (PRI-974)

Gate test for Step 1a — validates agents prefer EnterWorktree over
git worktree add on Claude Code. Must pass before skill rewrite.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: rewrite using-git-worktrees with detect-and-defer (PRI-974)

Step 0: GIT_DIR != GIT_COMMON detection (skip if already isolated)
Step 0 consent: opt-in prompt before creating worktree (#991)
Step 1a: native tool preference (short, first, declarative)
Step 1b: git worktree fallback with hooks symlink and legacy path compat
Submodule guard prevents false detection
Platform-neutral instruction file references (#1049)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: rewrite finishing-a-development-branch with detect-and-defer (PRI-974)

Step 2: environment detection (GIT_DIR != GIT_COMMON) before presenting menu
Detached HEAD: reduced 3-option menu (no merge from detached HEAD)
Provenance-based cleanup: .worktrees/ = ours, anything else = hands off
Bug #940: Option 2 no longer cleans up worktree
Bug #999: merge -> verify -> remove worktree -> delete branch
Bug #238: cd to main repo root before git worktree remove
Stale worktree pruning after removal (git worktree prune)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
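
The ordering this commit describes (leave the worktree, merge, verify, remove the worktree, prune, then delete the branch) can be sketched as a short script. This is a hedged illustration only: `run`, `finish_branch`, the `DRY_RUN` switch, and all five arguments are hypothetical, the `git checkout` of the base branch is an assumption about how the merge is set up, and the verify step is whatever command the caller supplies. The git subcommands themselves are standard.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Hypothetical helper. Each step returns early on failure, so a failed
# merge or verify never reaches the worktree removal or branch deletion.
finish_branch() {
  local main_root=$1 worktree=$2 branch=$3 base=$4 verify_cmd=$5
  run cd "$main_root" || return 1                 # leave the worktree first (Bug #238)
  run git checkout "$base" || return 1            # assumption: merge into the base branch
  run git merge "$branch" || return 1             # merge
  run sh -c "$verify_cmd" || return 1             # verify
  run git worktree remove "$worktree" || return 1 # remove worktree
  run git worktree prune || return 1              # drop stale worktree metadata
  run git branch -d "$branch" || return 1         # delete branch last (Bug #999)
}
```

Running it with `DRY_RUN=1` prints the planned commands in order without touching the repository, which is a cheap way to check the sequence.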

* fix: address spec review findings in both skill rewrites (PRI-974)

using-git-worktrees: submodule guard now says "treat as normal repo"
instead of "proceed to Step 1" (preserves consent flow)
using-git-worktrees: directory priority summaries include global legacy

finishing-a-development-branch: move git branch -d after Step 6 cleanup
to make Bug #999 ordering unambiguous (merge -> worktree remove -> branch delete)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: update worktree integration references across skills (PRI-974)

Remove REQUIRED language from executing-plans and subagent-driven-development.
Consent and detection now live inside using-git-worktrees itself.
Fix stale 'created by brainstorming' claim in writing-plans.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: include worktrees/ (non-hidden) in finishing provenance check (PRI-974)

The creation skill supports both .worktrees/ and worktrees/ directories,
but the finishing skill's cleanup only checked .worktrees/. Worktrees
under the non-hidden path would be orphaned on merge or discard.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: Step 1a validated through TDD — explicit naming + consent bridge (PRI-974)

Step 1a failed at 2/6 with the spec's original abstract text ("use your
native tool"). Three REFACTOR iterations found what works (50/50 runs):

1. Explicit tool naming — "do you have EnterWorktree, WorktreeCreate..."
   transforms interpretation into factual toolkit check
2. Consent bridge — "user's consent is your authorization" directly
   addresses EnterWorktree's "ONLY when user explicitly asks" guardrail
3. Red Flag entry naming the specific anti-pattern

File split was tested but proven unnecessary — the fix is the Step 1a
text quality, not physical separation of git commands. Control test
with full 240-line skill (all git commands visible) passed 20/20.

Test script updated: supports batch runs (./test.sh green 20), "all"
phase, and checks absence of git worktree add (reliable signal) rather
than presence of EnterWorktree text (agent sometimes omits tool name).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update spec with TDD findings on Step 1a (PRI-974)

Step 1a's original "deliberately short, abstract" design was disproven
by TDD (2/6 pass rate). Spec now documents the validated approach:
explicit tool naming + consent bridge + red flag (50/50 pass rate).

- Design Principles: updated to reflect explicit naming over abstraction
- Step 1a: replaced abstract text with validated approach, added design
  note explaining the TDD revision and why file splitting was unnecessary
- Risks: Step 1a risk marked RESOLVED with cross-platform validation table
  and residual risk note about upstream tool description dependency

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: honest cross-platform validation table in spec (PRI-974)

Research confirmed Claude Code is currently the only harness with an
agent-callable mid-session worktree tool. All others either create
worktrees before the agent starts (Codex App, Gemini, Cursor) or have
no native support (Codex CLI, OpenCode).

Table now shows: what was actually tested (Claude Code 50/50, Codex CLI
6/6), what was simulated (Codex App 1/1), and what's untested (Gemini,
Cursor, OpenCode). Step 1a is forward-compatible for when other
harnesses add agent-callable tools.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: cross-platform validation on 5 harnesses (PRI-974)

Tested on Gemini CLI (gemini -p) and Cursor Agent (cursor-agent -p):
- Gemini: Step 0 detection 1/1, Step 1b fallback 1/1
- Cursor: Step 0 detection 1/1, Step 1b fallback 1/1

Both correctly identified no native agent-callable worktree tool,
fell through to git worktree add, and performed safety verification.
Both correctly detected existing worktrees and skipped creation.

5 of 6 harnesses now tested. Only OpenCode untested (no CLI access).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: remove incorrect hooks symlink step from worktree skill

Git worktrees inherit hooks from the main repo automatically via
$GIT_COMMON_DIR — this has been the case since git 2.5 (2015).
The symlink step was based on an incorrect premise from PR #965
and also fails in practice (.git is a file in worktrees, not a dir).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: address PR #1121 review — respect user preference, drop y/n

- Consent prompt: drop "(y/n)" and add escape valve for users who
  have already declared their worktree preference in global or
  project agent instruction files.
- Directory selection: reorder to put declared user preference
  ahead of observed filesystem state, and reframe the default as
  "if no other guidance available".
- Sandbox fallback: require explicitly informing the user that
  the sandbox blocked creation, not just "report accordingly".
- writing-plans: fully qualify the superpowers:using-git-worktrees
  reference.
- Plan doc: mirror the consent-prompt change.

Step 1a native-tool framing and the helper-scripts suggestion are
still outstanding — the first needs a benchmark re-run before softer
phrasing can be adopted without regressing compliance; the second is
exploratory and will get a thread reply.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: soften Step 1a native-tool framing per PR #1121 review

Address obra's comment on explicit step numbers / prescriptive tone.
Drops "STOP HERE if available", the "If YES:" gate, and the "even if /
even if / NO EXCEPTIONS" reinforcement paragraph. Keeps the specific
tool-name anchors (EnterWorktree, WorktreeCreate, /worktree, --worktree),
which the original TDD data showed are load-bearing.

A/B verified against drill harness on the 3 creation/consent scenarios
(consent-flow, creation-from-main, creation-from-main-spec-aware):
baseline explicit wording scored 12/12 criteria, softened wording also
scored 12/12. The "agent used the most appropriate tool" criterion
passed in all 3 softened runs — agents still picked EnterWorktree via
ToolSearch without the imperative framing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: drop instruction file enumeration per PR #1121 review

Jesse flagged that the verbose CLAUDE.md/AGENTS.md/GEMINI.md/.cursorrules
enumeration (a) chews tokens, (b) confuses models that anchor on exact
strings, and (c) is repeated DRY-violatingly across 3+ locations.

Replace with abstract "your instructions" framing in four spots:
- skills/using-git-worktrees/SKILL.md Step 0 → Step 1 transition
- skills/using-git-worktrees/SKILL.md Step 1b Directory Selection
- docs/superpowers/plans/2026-04-06-worktree-rototill.md (both mirror locations)

Same intent, harness-agnostic phrasing, ~half the tokens.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: replace hardcoded /Users/jesse with generic placeholders (#858)

* Remove the deprecated legacy slash commands (#1188)

* fix: prevent subagent-driven-development from pausing every 3 tasks

requesting-code-review had "review after each batch (3 tasks)" for
executing-plans, which leaked into subagent-driven-development as a
check-in cadence. Replaced with flexible "each task or at natural
checkpoints" and added explicit continuous execution directive to
subagent-driven-development.

* Remove Integration sections from skills

These sections don't help with steering and are a legacy of the time
before agents had native skills systems.

* fix(opencode): cache bootstrap content at module level to eliminate per-step file I/O

getBootstrapContent() called fs.existsSync + fs.readFileSync + regex
frontmatter parsing on every agent step with zero caching.  The
experimental.chat.messages.transform hook fires every step in opencode's
agent loop (messages are reloaded from DB each step via
filterCompactedEffect).  A 10-step turn triggered 10 redundant file
reads + 10 regex parses for content that never changes during a session.

Changes:
- Add module-level _bootstrapCache (undefined = not loaded, null = file
  missing) so the first call reads and parses SKILL.md, all subsequent
  calls return the cached string with zero filesystem access
- Cache the null sentinel when SKILL.md is missing, preventing repeated
  fs.existsSync probes
- Add _testing export (resetCache/getCache) for test infrastructure
- Clarify the injection guard comment explaining how it interacts with
  opencode's per-step message reloading
- Add 15 regression tests covering cache behavior, fs call counts,
  injection guard, missing file sentinel, cache reset, and source audit

Fixes #1202

* test(opencode): simplify bootstrap cache coverage

* docs: clarify opencode install caveats

* test(opencode): modernize integration tests

* docs: add Factory Droid installation instructions

* Preserve Codex marketplace metadata

* docs: add README quickstart install links (#1293)

* docs(codex-tools): fix subagent wait mapping to wait_agent

Update the Codex tool mapping so Claude Code 'Task returns result' maps to the current Codex spawned-agent result tool, wait_agent. Also clarify that older Codex builds exposed spawned-agent waiting as wait, while current bare wait is the code-mode exec/wait surface for yielded exec cells.

Verified with Drill:
- codex-tool-mapping-comprehension fails against dev with task_returns_result=wait
- codex-tool-mapping-comprehension passes against this PR with task_returns_result=wait_agent and exec/wait scoped correctly
- codex-subagent-wait-mapping passes against this PR with spawn_agent -> wait_agent -> close_agent and PR963_OK returned

* fix(cursor): run SessionStart hook via run-hook.cmd on Windows

Route Cursor's Windows SessionStart hook through the existing run-hook.cmd dispatcher instead of invoking the extensionless session-start script directly. This avoids Windows opening the extensionless hook file and lets Git Bash run the script as intended.

Also removed an accidental UTF-8 BOM from hooks-cursor.json before merging.

Verified:
- hooks-cursor.json parses as JSON and has no BOM
- command is ./hooks/run-hook.cmd session-start
- CURSOR_PLUGIN_ROOT=/tmp/superpowers ./hooks/run-hook.cmd session-start emits valid Cursor JSON with additional_context

* fix(tests): make SDD integration test actually run its assertions

The SDD integration test silently bailed before printing any verification
results. Three independent bugs caused this:

1. `WORKING_DIR_ESCAPED` was computed from `$SCRIPT_DIR/../..` without
   resolving `..` segments. The resulting "directory" name contained
   literal `..` so `find` was looking in a path that doesn't exist.

2. With `set -euo pipefail`, the `find ... | sort -r | head -1` pipeline
   could exit non-zero (SIGPIPE on the producer when head closes early),
   killing the script silently before assertions ran.

3. The `claude -p` invocation never passed `--plugin-dir`, so it loaded
   the installed plugin instead of the working tree. Local edits to
   skills under test were not actually being tested.
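
The failure mode in item 2 can be guarded against in more than one way, and the commit message does not say which fix was applied. One common guard, shown here as a sketch with a hypothetical helper name `latest_jsonl`, is to append `|| true` so that a non-zero pipeline status cannot abort the script under `set -euo pipefail`. The caller then checks for empty output explicitly.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the last-sorting *.jsonl path under a directory.
# `|| true` absorbs a non-zero pipeline status (missing directory, or SIGPIPE
# when head exits early) so errexit does not kill the script silently.
latest_jsonl() {
  local dir=$1
  find "$dir" -name '*.jsonl' 2>/dev/null | sort -r | head -1 || true
}

# Small demonstration on a throwaway directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/2026-01-01.jsonl" "$demo_dir/2026-01-02.jsonl"
latest=$(latest_jsonl "$demo_dir")
echo "newest: ${latest:-<none>}"
rm -rf "$demo_dir"
```

The trade-off is that real errors are also swallowed, so the explicit empty-output check after the call matters.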

Other adjustments:
- Run claude from inside the unique TEST_PROJECT directory instead of
  from the plugin root, so its session JSONL lives in its own
  `~/.claude/projects/` folder and doesn't race other concurrent
  claude sessions for "most recent file".
- Use the same character-normalization claude does (every non-alphanumeric
  becomes `-`) when computing the session dir name; macOS-resolved
  `/private/var/...` paths and tmp dirs with `.`/`_` in their names need
  this to round-trip correctly.
- Accept either `"name":"Agent"` or `"name":"Task"` in the subagent count
  — the harness renamed the tool but the test wasn't updated.
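
The path handling described in item 1 and in the character-normalization bullet can be approximated as below. This is a sketch under assumptions: `normalize_name` and `session_dir_name` are hypothetical names, and the rule implemented is only the one stated above (every non-alphanumeric character becomes `-`), not a verified copy of what the harness does.

```shell
#!/usr/bin/env bash

# Hypothetical pure helper: replace every non-alphanumeric byte with '-'.
normalize_name() {
  printf '%s' "$1" | LC_ALL=C tr -c 'A-Za-z0-9' '-'
}

# Hypothetical wrapper: resolve '..' segments and symlinks first
# (for example /var -> /private/var on macOS), then normalize.
session_dir_name() {
  local resolved
  resolved=$(cd "$1" && pwd -P) || return 1
  normalize_name "$resolved"
}
```

For instance, `normalize_name '/private/var/tmp.x_y'` yields `-private-var-tmp-x-y`, and `session_dir_name "$SCRIPT_DIR/../.."` would resolve the `..` segments before the name is computed.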

Verified on this branch: all six verification tests now pass against a
real end-to-end SDD run (skill invoked, 7 subagents dispatched, 6
TodoWrite calls, working code produced, tests pass, no extra features).

* feat: add Gemini CLI subagent support mapping

Map Gemini Task dispatch to @agent-name/@generalist and document parallel subagent dispatch for independent tasks.

* docs: update Codex plugin install guidance (#1288)

* Lift superpowers:code-reviewer agent into the requesting-code-review skill

The plugin had a single named agent (`agents/code-reviewer.md`) used by
two skills, while every other reviewer/implementer subagent in the repo
is dispatched as `general-purpose` with the prompt template living
alongside its skill. That asymmetry had no upside and several costs:

- Two sources of truth for the code review checklist (the agent file
  and `requesting-code-review/code-reviewer.md`), both drifting
  independently.
- Codex users could not use the named agent directly; the codex-tools
  reference doc had a workaround section explaining how to flatten the
  named agent into a `worker` dispatch.
- No third-party reliance on `superpowers:code-reviewer` inside this
  repo.

Changes:
- Merge `agents/code-reviewer.md` (persona + checklist) and
  `skills/requesting-code-review/code-reviewer.md` (placeholder
  template) into a single self-contained Task-dispatch template,
  matching the shape of `implementer-prompt.md`,
  `spec-reviewer-prompt.md`, etc.
- Update `skills/requesting-code-review/SKILL.md` and
  `skills/subagent-driven-development/code-quality-reviewer-prompt.md`
  to dispatch `Task (general-purpose)` instead of the named agent.
- Drop the now-obsolete "Named agent dispatch" workaround sections from
  `codex-tools.md` and `copilot-tools.md`. Superpowers no longer ships
  any named agents, so those instructions documented nothing.
- Delete `agents/code-reviewer.md` and the empty `agents/` directory.

Tier 3 coverage for the change: a new behavioral test
`tests/claude-code/test-requesting-code-review.sh` plants real bugs
(SQL injection, plaintext password handling, credential logging) into
a tiny project, runs the actual `requesting-code-review` skill against
the working tree, and asserts the dispatched reviewer flags every
planted issue at Critical/Important severity and refuses to approve
the diff.

Verified end-to-end on this branch:
- The new test passes (5/5 assertions; reviewer caught all planted
  bugs and several others).
- The existing SDD integration test still passes (7/7 subagents
  dispatched, all as `general-purpose`; spec compliance still
  rejects extra features; produced code is correct).
- Session JSONLs confirm zero remaining `superpowers:code-reviewer`
  dispatches anywhere in the SDD pipeline.
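
A check like the last one can be approximated by counting occurrences of the retired agent name across session files. This is a sketch under assumptions, not the command used for the verification above: the helper name `count_named_agent_refs` is hypothetical, and it treats any textual occurrence of the name in a `*.jsonl` file as a dispatch.

```shell
# Hypothetical helper: count occurrences of the retired agent name across
# every *.jsonl file under a directory. Prints 0 when nothing matches.
count_named_agent_refs() {
  find "$1" -type f -name '*.jsonl' \
    -exec grep -o 'superpowers:code-reviewer' {} + 2>/dev/null | wc -l | tr -d ' '
}
```

Because it matches plain text, the count would also include mentions of the name inside prompts or tool output, so a non-zero result calls for a closer look rather than proving a dispatch happened.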

* Prepare v5.1.0: release notes and version bump

Add v5.1.0 release notes covering:
- Removals: legacy slash commands (/brainstorm, /execute-plan,
  /write-plan), skill Integration sections
- Worktree skills rewrite (PRI-974, PR #1121)
- Contributor guidelines for AI agents
- Codex plugin mirror tooling (PR #1165)
- OpenCode bootstrap caching (#1202)
- SDD pause-every-3-tasks fix; SDD integration test fixes
- Cursor Windows hook routing
- Gemini CLI subagent dispatch mapping
- Skill terminology cleanups
- Install docs (Factory Droid, Codex, quickstart links)

Bumps version 5.0.7 -> 5.1.0 across all declared files via
scripts/bump-version.sh; not yet tagged or released.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Drew Ritter <drewritter@workerbee.local>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Drew Ritter <drew@primeradiant.com>
Co-authored-by: Blaž Čulina <culina.blaz@nsoft.com>
Co-authored-by: Jesse Vincent <jesse@primeradiant.com>
Co-authored-by: voidborne-d <voidborne-d@users.noreply.github.com>
Co-authored-by: Richard Luo <luo.richard@gmail.com>
Co-authored-by: Drew Ritter <drew@ritter.dev>
Co-authored-by: leonsong09 <59187950+leonsong09@users.noreply.github.com>
Co-authored-by: YuXiang Hong <41331696+starumiQAQ@users.noreply.github.com>
Co-authored-by: Sathvik Gilakamsetty <spacetime1007@gmail.com>