Release v2.12.0 — 3 skills, pr-review v2, route-intent hook, Iron Laws #23–26 · oliver-kriska/claude-elixir-phoenix

[2.12.0] - 2026-06-16

Workflow-completion release driven by 400-session analysis: three new skills
(/phx:recall, /phx:deps-update, /phx:watch-pr), /phx:pr-review v2 that
closes the review loop (fetch → fix → reply → resolve), a route-intent.sh
UserPromptSubmit hook replacing ~0%-firing CLAUDE.md prose routing, four new
Iron Laws (#23–#26), and an eval-hardening pass that backfilled the
AskUserQuestion 4-option check, cross-file consistency tests, and untracked-file
detection. Law count 22 → 26.

Added

Iron Law #26 — Comments aren't commit messages (session analysis found
Oliver asking "remove unnecessary comments" on essentially every PR, 8+
sessions clustered June 2026). A change's reasoning — the bug, what it
replaces, the task — belongs in the commit/PR/squash, which git persists; not
in code comments. No issue-reference tags inline (# ENA-1234). Keep only
durable intrinsic facts a future reader needs regardless of history:
footguns, invariants, library quirks. Wired into CLAUDE.md, the
inject-iron-laws.sh SubagentStart hook (code-writing subagents inherit it),
the iron-law-judge agent as detection #19 (so /phx:review flags ticket
tags, change-narration, and what-comments), and the init injectable
template. Stops the comments from being added during /phx:work or /phx:quick,
rather than stripping them at PR time. Law count 25 → 26.
UserPromptSubmit routing hook (route-intent.sh) — injects one-line /phx:
suggestions directly into Claude's context for three high-signal intents: GitHub
PR URLs / review-feedback phrasing → /phx:pr-review, Tidewave
<context name="current-page"> blocks → /phx:investigate, Elixir stack-trace
pastes → /phx:investigate. Replaces CLAUDE.md prose routing rules measured at
~0% firing rate across 400 sessions. One suggestion per category per session,
silent on explicit slash commands, gated on mix.exs, always exits 0
(UserPromptSubmit exit 2 would erase the user's prompt).
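The routing rules above can be modeled in a few lines. This is an illustrative Python sketch only: the shipped hook is a shell script, and the regexes, category names, suggestion wording, and the `suggest` function are assumptions rather than the hook's actual code. The mix.exs gate and exit-code handling are left out.

```python
import re
from typing import Optional, Set

# Hypothetical patterns; the real route-intent.sh may match differently.
PR_URL = re.compile(r"https://github\.com/[\w.-]+/[\w.-]+/pull/\d+")
TIDEWAVE = re.compile(r'<context name="current-page">')
STACKTRACE = re.compile(r"^\s*\*\* \(\w+(\.\w+)*\)", re.MULTILINE)

ROUTES = (
    ("pr", PR_URL, "/phx:pr-review"),
    ("tidewave", TIDEWAVE, "/phx:investigate"),
    ("stacktrace", STACKTRACE, "/phx:investigate"),
)


def suggest(prompt: str, already_suggested: Set[str]) -> Optional[str]:
    """Return a one-line suggestion, or None to stay silent."""
    if prompt.lstrip().startswith("/"):
        return None  # explicit slash command: no routing on top of it
    for category, pattern, skill in ROUTES:
        if pattern.search(prompt) and category not in already_suggested:
            already_suggested.add(category)  # one suggestion per category per session
            return f"Suggestion: consider {skill}"
    return None
```

The `already_suggested` set stands in for whatever per-session state the hook keeps; passing the same set across prompts is what makes the second PR URL in a session stay silent.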
/phx:recall — session and history archaeology (git-archaeology sessions
ran manual git log/diff pipelines with no plugin support). Three evidence
layers, cheapest first: .claude/solutions/ compound docs → git archaeology
(--grep, -S pickaxe, --follow, -L) → ccrider MCP session search, gated
with graceful degradation when the MCP is absent. ONE ccrider fetch = ONE
subagent (3–15KB responses; writes a ≤30-line summary file). Every answer cites
its evidence; clean misses are stated, then routed to /phx:compound so the
next recall stops at layer 1. 100% trigger accuracy.
/phx:deps-update — generic dependency freshness workflow (dependency
maintenance was a recurring session pattern with no plugin support). Inventory
via mix hex.outdated (exit 1 = normal "outdated" signal), changelog deltas via
the built-in mix hex.package diff <pkg> <v1>..<v2> (no project-specific mix
tasks), updates with coupled-group enforcement (Phoenix core, Ecto, Ash, Oban,
telemetry families move together), breaking-change fixes, and PR splitting
(patches bundled, minors by area, majors solo). Majors require an explicit
mix.exs edit; override: true only when the per-package constraint table
shows a transitive blocker. Hands off security to /phx:deps-audit (Mode B) and
verification to /phx:verify. The only mutating deps skill — audit/vet stay
read-only. 89% trigger accuracy.
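The PR-splitting rule (patches bundled, minors by area, majors solo) could be sketched as follows. This is a minimal Python illustration, not the skill's implementation: the `area` labels, the PR naming, and both function names are hypothetical, and it assumes plain `x.y.z` versions with no pre-release suffixes.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def bump_kind(current: str, latest: str) -> str:
    """Classify an update as 'major', 'minor', or 'patch' (plain x.y.z only)."""
    cur = [int(part) for part in current.split(".")[:3]]
    new = [int(part) for part in latest.split(".")[:3]]
    if new[0] != cur[0]:
        return "major"
    if new[1] != cur[1]:
        return "minor"
    return "patch"


def split_into_prs(updates: List[Tuple[str, str, str, str]]) -> List[Tuple[str, List[str]]]:
    """updates: (package, area, current, latest). Returns (pr_name, packages) pairs."""
    patches: List[str] = []
    minors: Dict[str, List[str]] = defaultdict(list)
    majors: List[str] = []
    for package, area, current, latest in updates:
        kind = bump_kind(current, latest)
        if kind == "patch":
            patches.append(package)
        elif kind == "minor":
            minors[area].append(package)
        else:
            majors.append(package)

    prs: List[Tuple[str, List[str]]] = []
    if patches:
        prs.append(("patches", sorted(patches)))        # all patches in one PR
    for area in sorted(minors):
        prs.append((f"minor-{area}", sorted(minors[area])))  # minors grouped by area
    for package in sorted(majors):
        prs.append((f"major-{package}", [package]))     # each major on its own
    return prs
```

Coupled-group enforcement is not modeled here; in this sketch it would amount to assigning every member of a family the same `area` so they land in the same PR.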
/phx:watch-pr — token-conscious PR/CI watching (replaces hand-rolled
60-min foreground sleep loops observed in session analysis). A quiet
background watcher (scripts/watch-pr.sh, Monitor-tool-first with
run_in_background fallback) polls gh pr view --json in its own process and
emits ONE line per genuinely-new event (review, comment, CI conclusion, merged/
closed, watchdog, gh-failure) — raw JSON never enters Claude's context, and
Claude takes zero turns while idle (no cache-TTL straddling). --checks-only
delegates to gh pr checks --watch --fail-fast (exit code is the signal).
Routes actionable reviews to /phx:pr-review and CI failures to
/phx:investigate. 100% trigger accuracy on the new fixture.
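The "one line per genuinely-new event" behavior comes down to remembering what has already been reported between polls. The Python sketch below illustrates that idea only: the real watcher is scripts/watch-pr.sh, and the snapshot shape, event keys, and line wording here are assumptions, not the actual `gh pr view --json` output.

```python
from typing import List, Set, Tuple


def new_event_lines(snapshot: dict, seen: Set[Tuple[str, str]]) -> List[str]:
    """Return one line per event not reported on an earlier poll."""
    candidates = []
    for review in snapshot.get("reviews", []):
        key = ("review", str(review["id"]))
        candidates.append((key, f"review {review['state']} by {review['author']}"))
    ci = snapshot.get("ci_conclusion")
    if ci:
        candidates.append((("ci", ci), f"ci {ci}"))
    state = snapshot.get("state")
    if state in ("MERGED", "CLOSED"):
        candidates.append((("state", state), f"pr {state.lower()}"))

    lines = []
    for key, line in candidates:
        if key not in seen:
            seen.add(key)       # remember it so the next poll stays quiet
            lines.append(line)
    return lines
```

Polling the same snapshot twice yields lines the first time and an empty list the second. Keying CI events by conclusion alone is a simplification: a second identical conclusion after a re-run would not be reported in this sketch.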
/phx:pr-review v2 — closes the review loop (fetch → fix → reply → resolve).
The old skill drafted replies but used REST endpoints that expose neither thread
IDs nor resolved status, so it could never resolve a thread or skip handled ones.
v2 fetches threads via GraphQL reviewThreads (thread ID + isResolved +
isOutdated, paginated), replies via REST to the thread root, resolves via
resolveReviewThread, and is idempotent across review rounds — GitHub's
isResolved is the state. New flags: --bots-only (triage CI bot passes —
Copilot/Codex/CodeRabbit detected via __typename == "Bot"), --no-resolve.
New Iron Laws: never resolve without a reply, never claim a fix without a shown
diff, bot findings get the same scrutiny as humans. New references:
gh-commands.md (3 comment surfaces, pagination, bot detection),
bot-triage.md (batch flow + Elixir false-positive patterns).
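Because GitHub's `isResolved` is the state, idempotency across review rounds reduces to filtering on that flag each time threads are fetched. Here is a minimal Python sketch over already-fetched thread data. The dictionaries mirror the GraphQL fields named above (`isResolved`, `isOutdated`, `__typename`), but the exact query shape, the `pending_threads` helper, and the choice to classify a thread by its root comment's author are assumptions.

```python
from typing import List


def pending_threads(threads: List[dict], bots_only: bool = False) -> List[str]:
    """Return IDs of threads that still need a reply, in fetch order."""
    pending = []
    for thread in threads:
        if thread["isResolved"]:
            continue  # handled in an earlier round; skipping makes re-runs safe
        nodes = thread["comments"]["nodes"]
        root = nodes[0] if nodes else {}
        author = root.get("author") or {}  # author can be null for deleted accounts
        is_bot = author.get("__typename") == "Bot"
        if bots_only and not is_bot:
            continue
        pending.append(thread["id"])
    return pending
```

Running the filter again after a round of replies and resolutions returns only threads opened or reopened since, with no local bookkeeping.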
Three new Iron Laws (#23–#25) from the 400-session analysis, wired into
elixir-idioms, liveview-patterns, the /phx:init template, the SubagentStart
injection hook, and iron-law-judge detection patterns:
- #23 Mix tasks start only what they need — Mix.Task.run("app.config") +
  Application.ensure_all_started/1, never Mix.Task.run("app.start") (boots the
  full tree: endpoint binds the port, Oban starts consuming jobs). The
  mix-tasks.md reference previously taught the anti-pattern; now fixed.
- #24 LiveView handlers match {:error, %Ecto.Changeset{}} explicitly — bare
  {:error, _} silently swallows form validation errors.
- #25 Capture Gettext/CLDR locale before spawning Task/GenServer — locale is
  process-local; spawned processes reset to default.
Pre-migration safety section in ecto-patterns/references/migrations.md —
check duplicates (including soft-deleted rows) before unique indexes, with
partial-index/data-fix/composite-key resolutions.
Tidewave reliability guards in tidewave-integration — worktree/port
verification (multi-worktree setups debug the wrong server), schema introspection
before SQL, output-size caps, browser_eval server-side fallbacks, and a
QA-walkthrough pattern for feature smoke tests.
Eval: AskUserQuestion 4-option-limit check (askuserquestion_option_limit
matcher) — the tool silently drops a 5th option; brainstorm shipped that way for
months. Scans option lists after every AskUserQuestion mention (YAML - label:
blocks and bullet/numbered runs), stops at headings, and skips sibling list items
when the mention is itself inside a list. Backfilled into all 50 skill evals and
the generator template; caught a real second instance in /phx:plan.
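A simplified version of the option-limit check might look like this. It is a sketch in Python, not the `askuserquestion_option_limit` matcher itself: the function name and regexes are assumptions, and it omits the sibling-list-item skip and any special handling of nested items.

```python
import re
from typing import List

ITEM = re.compile(r"^\s*(?:[-*]|\d+\.)\s+")   # bullet or numbered list item
HEADING = re.compile(r"^#{1,6}\s")            # markdown heading


def option_limit_violations(text: str, limit: int = 4) -> List[int]:
    """Return 1-based line numbers of AskUserQuestion mentions that are
    followed by a list run longer than `limit` items."""
    lines = text.splitlines()
    violations = []
    for index, line in enumerate(lines):
        if "AskUserQuestion" not in line:
            continue
        count = 0
        for following in lines[index + 1:]:
            if HEADING.match(following):
                break                      # never scan past a heading
            if ITEM.match(following):
                count += 1
            elif count and not following.startswith((" ", "\t")):
                break                      # blank or unindented line ends the run
        if count > limit:
            violations.append(index + 1)
    return violations
```

Indented non-item lines are treated as continuations of the previous item, so a YAML-style `- label:` block with indented fields counts as one option.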
Eval: cross-file consistency tests (lab/eval/tests/test_consistency.py) —
two bug classes per-skill scoring can't see: references teaching anti-patterns
their own Iron Laws ban (mix-tasks.md shipped the app.start pattern Iron Law
#23 bans), and skill scripts using cwd-relative .claude/ paths (the
nested-state-dir bug class). The path lint caught a 4th live instance in
scripts/fetch-claude-docs.sh.
make eval now sees untracked files — brand-new skills/agents were invisible
to the git diff-based changed-file detection until first commit;
git ls-files --others is now merged into both detection paths.
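The fix amounts to taking the union of two git listings. Below is a minimal Python sketch, assuming the detection compares against a base ref. The helper names, the default `HEAD` base, and the `--exclude-standard` flag (which keeps ignored files out of the untracked list) are assumptions, not details confirmed by the release notes.

```python
import subprocess
from typing import List


def merge_changed(diff_output: str, untracked_output: str) -> List[str]:
    """Union of tracked changes and untracked files, de-duplicated and sorted."""
    paths = set(diff_output.splitlines()) | set(untracked_output.splitlines())
    return sorted(path for path in paths if path.strip())


def changed_files(base: str = "HEAD") -> List[str]:
    """Run git in the current repository and merge both listings."""
    def git(*args: str) -> str:
        result = subprocess.run(
            ["git", *args], check=True, capture_output=True, text=True
        )
        return result.stdout

    return merge_changed(
        git("diff", "--name-only", base),
        git("ls-files", "--others", "--exclude-standard"),
    )
```

With only the `git diff` half, a skill directory that has never been committed contributes no paths at all, which is the blind spot described above.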

Changed

Workflow handoffs between phases — /phx:investigate now ends with a routing
step (quick fix vs /phx:plan vs /phx:compound); /phx:review passes the review
file path to /phx:plan for follow-up plans; /phx:work suggests /phx:compound
after non-obvious fixes and re-verifies stale plans from earlier sessions.
/phx:full deflects existing plan files — description and a usage guard route
.claude/plans/*/plan.md arguments to /phx:work instead of re-planning.
intent-detection hard guard — skips entirely when the message starts with any
slash command; no more routing suggestions on top of explicit commands.
/phx:work batches checkbox updates — one edit pass when several tasks complete
together, not one Edit call per checkbox.
/phx:compound write-block fallback — outputs the solution doc inline and
points at /phx:permissions instead of silently dropping knowledge;
/phx:permissions now always recommends workflow-artifact write grants
(.claude/plans/, .claude/solutions/, .claude/reviews/).
AskUserQuestion discipline in brainstorm/triage — decisions only, concrete
impact per option; fixed brainstorm's Decision Point exceeding the tool's 4-option
limit (5 options meant one was always silently dropped).
security-analyzer — new end-to-end flow checks from the 400-session analysis: IDOR
via handle_params URL params, data-flow through multi-step transforms, failure-path
consistency in Ecto.Multi/with chains, soft-delete leakage in authz lookups — all bug
classes external review bots caught after plugin review passed.
elixir-reviewer — failure-path review section (Multi/with error branches,
short-circuit side effects, multi-step transforms, soft-delete filters), known
false-positive traps (nil[:key] is nil-safe via Access), and diff-scoped reading rule
to stop turn exhaustion on large PRs.
verification-runner — compiles FIRST (turn 1 combines discovery + mix compile),
maxTurns 10 → 15, earlier findings-file write; stops "compiling… let me check again"
turn exhaustion observed on large PRs.
parallel-reviewer + /phx:audit — rate-limit circuit breaker: when 2+ subagents
fail with rate-limit/API errors, synthesize from existing outputs and tell the user to
re-run after reset instead of dead-waiting on "continue".
ecto-schema-designer — pre-UNIQUE-index migration safety check (duplicates +
soft-deleted rows silently block production migrations).

Fixed

Iron Law verifier is now blame-aware — iron-law-verifier.sh scans only the content
the current Edit/Write introduced (new_string/content), not the whole file.
Pre-existing violations in untouched regions no longer force unrelated refactors.
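The scoping change can be illustrated with a small Python model. The shipped verifier is iron-law-verifier.sh; the payload shape (`tool_name`, `tool_input`), the function names, and the single regex used as an example check (the `app.start` pattern that Iron Law #23 bans) are assumptions for illustration.

```python
import re

# Example check only: the Mix.Task.run("app.start") call banned by Iron Law #23.
APP_START = re.compile(r'Mix\.Task\.run\(\s*"app\.start"')


def introduced_text(payload: dict) -> str:
    """Return only the text the current tool call adds, not the whole file."""
    tool_input = payload.get("tool_input", {})
    if payload.get("tool_name") == "Edit":
        return tool_input.get("new_string", "")
    if payload.get("tool_name") == "Write":
        return tool_input.get("content", "")
    return ""


def introduces_app_start(payload: dict) -> bool:
    """True only when the banned call appears in the newly written text."""
    return bool(APP_START.search(introduced_text(payload)))
```

An Edit that removes the banned call passes even though `old_string` contains it, and violations elsewhere in the file are never seen because the file itself is not read.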
block-dangerous-ops.sh fails open on script errors — a corrupted hook file (e.g.
merge-conflict markers) once blocked ALL Bash calls with no recovery; hooks.json now
appends || exit 0 and the script documents the JSON-deny/exit-0 contract.
Stop hook warns about uncommitted feature-branch changes — prevents the
lost-work-after-rebase incident class observed in session analysis.
liveview-architect + ecto-schema-designer missing Write — both agents still had
the pre-v2.8.1 disallowedTools: Write, ... frontmatter and fell back to inline output
when spawned as reviewers ("I only have Read, Grep, and Glob"). Write now allowed for
their own findings file; Edit stays disallowed.
web-researcher could never write its output file — research workers were asked to
save findings but had Write disallowed; agents burned all turns on fetches then lost the
output. Write allowed + reserve-last-turns-for-output guard.
/phx:plan post-plan AskUserQuestion exceeded the 4-option limit — 5 options
meant one was always silently dropped. Fixed by merging "Review the plan" and
"Adjust the plan" into one option in the skill, planning-orchestrator, and both
hook scripts that echo the list (precompact-rules.sh, plan-stop-reminder.sh).
scripts/fetch-claude-docs.sh wrote its cache relative to cwd — anchored to
${CLAUDE_PROJECT_DIR:-$PWD} like the other skill scripts.
