… propose-fix)
Adds a five-stage Claude-powered GitHub bot modeled on Bun's public workflows (oven-sh/bun/.github/workflows/claude-*.yml + .claude/commands/):
- Triage (Sonnet): classifies issues as bug / framework-design / other, reproduces bugs against tools/test-local.sh, emits confidence markers.
- Cross-framework research (Opus): for framework-design issues, fans out to Rails / Laravel / Django / Phoenix / Spring Boot docs and proposes a Wheels-idiomatic path with auto-downgrade rules.
- Propose Fix (Opus): TDD-mandatory draft PR; gated by bot-tdd-gate.yml which hard-rejects bot PRs without spec + implementation changes.
- Reviewer A (Sonnet): single PR review with line comments and verdict.
- Reviewer B (Sonnet): critiques A for sycophancy, false positives, and missed issues; loop cap = 3 rounds.
Plus a daily cron (bot-auto-close.yml) that closes stale cannot-reproduce triages after 14 days.
All workflows gated on vars.WHEELS_BOT_ENABLED=='true' (default unset, so this PR is dormant until an admin opts in). Per-issue/PR opt-out via the [skip-claude] label or title token. Bot identity is a custom GitHub App (wheels-bot[bot]); secrets WHEELS_BOT_APP_ID + WHEELS_BOT_PRIVATE_KEY must be added before enabling.
Setup composite action (setup-wheels-test-env) extracts the LuCLI + Lucee + SQLite + Playwright prelude from pr.yml so triage and propose-fix can reuse it without duplicating ~130 lines.
Documentation: docs/contributing/wheels-bot.md (operator handbook), CLAUDE.md §"Wheels Bot" (quick reference), CONTRIBUTING.md (opt-out mechanics).
https://claude.ai/code/session_01F5Ev5XsFMzLncPCZ43hjVP
concurrency:
  group: wheels-bot-review-a-${{ github.event.pull_request.number }}-${{ github.event.pull_request.head.sha }}
  cancel-in-progress: true
🔴 The concurrency group key on bot-review-a.yml line 12 includes ${{ github.event.pull_request.head.sha }}, but cancel-in-progress: true only cancels runs that share the same group key. Because every push to a PR produces a new head SHA, the second push lands in a different concurrency group and the in-flight run for the previous SHA is never cancelled — both runs proceed in parallel. This directly contradicts the test plan's assertion ("Push a second commit, confirm cancel-in-progress supersedes the first run") and wastes Sonnet --max-turns 25 review-A runs plus their cascading Reviewer-B fan-out. Fix: drop -${{ github.event.pull_request.head.sha }} from the group key so successive pushes share a group and the older run is cancelled. The marker check on line 44 already handles same-SHA idempotency, so SHA-specificity belongs there, not in the concurrency key.
Extended reasoning...
What's broken
bot-review-a.yml lines 11-13:
concurrency:
  group: wheels-bot-review-a-${{ github.event.pull_request.number }}-${{ github.event.pull_request.head.sha }}
  cancel-in-progress: true
GitHub Actions cancel-in-progress only cancels runs that share the same concurrency group key. Including head.sha in the key means each new commit produces a new group, so the previous run is not a sibling of the new run and is never cancelled. The two runs proceed in parallel until both complete.
The PR's own test plan asserts the opposite
From the PR description, Phase 1 test plan:
Push a second commit, confirm cancel-in-progress supersedes the first run.
That assertion will fail as written. A reviewer running it will see two parallel bot-review-a jobs (one for each SHA), not a cancellation.
Step-by-step proof
1. Developer pushes commit abc123 to PR #99. GitHub Actions queues a run with concurrency.group = wheels-bot-review-a-99-abc123. The run begins (Sonnet, up to 25 turns).
2. Two minutes later, developer pushes commit def456 to the same PR. GitHub Actions queues a run with concurrency.group = wheels-bot-review-a-99-def456.
3. The two group keys differ (...-abc123 vs ...-def456). cancel-in-progress: true only cancels in-progress runs whose group matches the new run's group. There is no in-progress run in group ...-def456, so nothing gets cancelled.
4. Both runs complete. Reviewer A posts two reviews: one anchored to abc123 (now stale) and one anchored to def456.
The marker check on line 44 (wheels-bot:review-a:${pr}:${sha}) prevents the same SHA from being re-reviewed if the workflow retries, but it does not prevent two different SHAs from being reviewed in parallel. The marker and the concurrency group are addressing different problems.
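The key comparison behind the proof above can be sketched in a few lines of bash. This is a toy model, not GitHub's scheduler: it only compares the strings that the group: expression would render to, and the function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Toy model: a new run supersedes an in-flight run only when both
# runs render the same concurrency group string. All function names
# here are hypothetical; nothing in this block calls GitHub.

# Group key as currently written in bot-review-a.yml (PR number + head SHA).
group_with_sha() {
  printf 'wheels-bot-review-a-%s-%s\n' "$1" "$2"
}

# Group key after the proposed fix (PR number only).
group_without_sha() {
  printf 'wheels-bot-review-a-%s\n' "$1"
}

# Compare the in-flight run's key with the new run's key.
would_cancel() {
  if [ "$1" = "$2" ]; then
    echo "cancelled"
  else
    echo "runs in parallel"
  fi
}

pr=99
echo "with sha:    $(would_cancel "$(group_with_sha "$pr" abc123)" "$(group_with_sha "$pr" def456)")"
echo "without sha: $(would_cancel "$(group_without_sha "$pr")" "$(group_without_sha "$pr")")"
```

With the SHA in the key, the two pushes land in different groups; with the PR number alone, they collide and the older run is the one that gets cancelled.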
Impact
- CI cost regression: every push during an in-flight run double-spends Sonnet --max-turns 25 on Reviewer A, and each Reviewer A review fans out into a Reviewer B run (capped at 3 rounds). On an active PR with rapid pushes, the spend compounds.
- Stale reviews: the older run posts a review against an outdated SHA, which is exactly the scenario cancel-in-progress was added to prevent.
- Documentation lies: a contributor running the test plan in good faith will see the assertion fail and not know whether it's the test or the workflow.
Fix
Drop head.sha from the concurrency group key:
concurrency:
  group: wheels-bot-review-a-${{ github.event.pull_request.number }}
  cancel-in-progress: true
This way successive pushes to the same PR collide into the same group, and cancel-in-progress: true cancels the older run as the test plan promises. Same-SHA idempotency is already handled by the line-44 marker check, which is the correct place for SHA-specificity.
Note for context: bot-review-b.yml line 12 also includes ${{ github.event.review.id }} in its group key, but pairs it with cancel-in-progress: false — internally consistent (each review event spawns its own lane, no cancellation intended). Only review-a has the contradiction.
permissions:
  contents: read
  checks: write
🔴 When a workflow declares an explicit permissions: block, GitHub Actions defaults every unlisted scope to none — not the workflow default. This block lists only contents: read and checks: write, so pull-requests is implicitly none. The verify step then runs gh pr diff "$PR_NUMBER" --name-only (with set -euo pipefail), which calls GET /repos/{o}/{r}/pulls/{n}/files and requires pull-requests: read. Result: every bot PR's TDD gate will fail-closed with a 403 auth error rather than running the intended spec+impl check, defeating the gate's purpose. Fix: add pull-requests: read to the permissions block.
Extended reasoning...
What is wrong
bot-tdd-gate.yml (lines 8-11) declares:
permissions:
  contents: read
  checks: write
Per GitHub's documented behavior for GITHUB_TOKEN permissions, once permissions: is specified explicitly at the workflow or job level, every unlisted scope defaults to none, not to the repo/workflow default. So pull-requests is implicitly none here.
The verify step (line 42) runs:
env:
  GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  PR_NUMBER: ${{ github.event.pull_request.number }}
run: |
  set -euo pipefail
  changed=$(gh pr diff "$PR_NUMBER" --name-only)
gh pr diff … --name-only is backed by GET /repos/{owner}/{repo}/pulls/{pull_number}/files, which requires pull-requests: read. With the scope set to none, the API returns 403 Resource not accessible by integration, gh exits non-zero, the $(...) command-substitution fails, set -e propagates, and the entire gate hard-fails.
Why this matters
The gate is the central enforcement mechanism behind the bot's TDD invariant — it is supposed to reject bot PRs that lack either a spec change or an implementation change. With this misconfiguration, every real bot PR will fail the gate with a confusing auth error rather than running the intended spec/impl validation. Humans seeing the failure will see "Bot PR TDD Gate: failed" with a permissions error rather than the descriptive error messages designed for this gate (lines 56-60, 64-68). Worse, the gate is fail-closed, so this masks whether the bot is actually following TDD.
The bug is silent in the sense that the workflow is dormant by default (vars.WHEELS_BOT_ENABLED), so it won't surface until activation. But the moment the bot opens its first real PR, the gate will misfire.
Cross-check: every other PR-interacting workflow in this repo grants the scope
.github/workflows/label.yml:16: pull-requests: write
.github/workflows/compat-matrix.yml:517,547: pull-requests: write
.github/workflows/refresh-packages-baseline.yml:29: pull-requests: write
.github/workflows/web-deploy.yml:27: pull-requests: write
.github/workflows/generate-changelog.yml:42: pull-requests: write
.github/workflows/snapshot.yml:11: pull-requests: write
.github/workflows/docs-validation.yml:66: pull-requests: write
Only this new gate omits it — a copy-paste omission, not a deliberate hardening choice. (Note: those workflows use write because they post comments; this gate only needs read.)
Step-by-step proof
1. Bot opens PR #N from branch fix/bot-1234-foo against develop.
2. bot-tdd-gate.yml triggers on pull_request: opened.
3. Step "Decide if this PR is bot-authored" sets is_bot=true (head ref starts with fix/bot-).
4. Step "Verify bot PR contains spec + implementation changes" runs.
5. gh pr diff "$PR_NUMBER" --name-only issues GET /repos/wheels-dev/wheels/pulls/N/files with the GITHUB_TOKEN whose pull-requests scope is none.
6. GitHub returns 403 {"message":"Resource not accessible by integration"}.
7. gh exits with code 1; the command substitution changed=$(...) propagates the failure under set -e.
8. The step fails before any of the spec/impl logic runs. The check named "Bot PR TDD Gate" reports failure with no actionable error message.
9. Branch protection blocks merge for the wrong reason.
Fix
permissions:
  contents: read
  pull-requests: read
  checks: write
Minor follow-up worth considering: checks: write is declared but no Checks API calls are made in the workflow body, so it can probably be dropped. But that's a tidiness nit; the blocker is the missing pull-requests: read.
claude_args: |
  --model claude-opus-4-7
  --max-turns 60
  --allowedTools "Bash(gh:*),Bash(git:*),Bash(bash tools/test-local.sh*),Bash(curl:*),Read,Edit,Write,Grep,Glob"
🔴 The --allowedTools list at .github/workflows/bot-propose-fix.yml:106 grants Bash(git:*) — a wildcard that permits every git subcommand including push, push --force, reset --hard, config, rebase, checkout -B, etc. This contradicts the rails in .claude/commands/_shared-rails.md (Tool restrictions) which assert git is read-only and that "the caller workflow handles branch creation and pushes," and undermines the dedicated "Push branch" step at lines 109-117 that is supposed to be the only push path. Tighten to enumerated subcommands (e.g. Bash(git status),Bash(git diff:*),Bash(git log:*),Bash(git show:*),Bash(git grep:*),Bash(git add:*),Bash(git commit:*),Bash(git checkout:*)) — matching what the sibling review/research workflows do.
Extended reasoning...
The contradiction. The shared rails file (.claude/commands/_shared-rails.md lines 9-13) tells the model: "Git operations: read-only only — git status, git log, git diff, git show, git grep. Never git push, git config, git checkout -B on shared branches, git reset --hard, git --force, or any subcommand that rewrites history. The caller workflow handles branch creation and pushes when applicable." The propose-fix command file (.claude/commands/propose-fix.md) further says "Do not use --amend or --force," and the workflow has a dedicated "Push branch" step at .github/workflows/bot-propose-fix.yml lines 109-117 that owns the actual git push. But the --allowedTools value at line 106 is Bash(git:*), a bare wildcard that covers every git subcommand, including the ones the rails forbid.
Why the prompt rails are not enough. Prompt text is advisory; --allowedTools is the runtime enforcement boundary. The Claude Code action evaluates the allowlist before invoking the shell, so anything listed there is reachable regardless of what the prompt says. A model that decides to git push directly (to ship faster, due to misreading the instructions, or because of prompt injection from the issue body) would not be blocked at the tool layer. The actions/checkout step at line 56 already cached the App token in the working tree, so a direct git push would succeed.
Anomaly compared to siblings. The sibling workflows in this same PR all enumerate read-only subcommands rather than using the wildcard, demonstrating that the author knew how to scope the grant correctly:
- bot-review-a.yml: Bash(git log:*),Bash(git diff:*),Bash(git show:*),Bash(git grep:*),Bash(git status)
- bot-review-b.yml: same scoped list
- bot-research.yml: Bash(git log:*),Bash(git show:*),Bash(git grep:*)
Only bot-propose-fix.yml uses git:*. Because propose-fix legitimately needs git add / git commit to stage and commit work on its branch, the right scoping is "read-only plus stage-and-commit", not "everything."
Step-by-step proof.
1. Issue #999 is filed with triage-confidence:high (or workflow_dispatch is invoked).
2. bot-propose-fix.yml fires. Lines 56-67: actions/checkout@v6 runs with the App token, so .git/config has the App token cached as the remote auth.
3. Line 86: git checkout -b fix/bot-999-.... The model is now on a fresh branch.
4. Line 99: anthropics/claude-code-action@v1 runs with --allowedTools "Bash(gh:*),Bash(git:*),...". The model is told (via rails text) git is read-only-plus-add-commit, but the actual tool permission is git:*.
5. Failure mode A (deviation): The model decides to push immediately rather than wait for the dedicated step: git push -u origin fix/bot-999-.... Because Bash(git:*) matches git push, the call is permitted. The push succeeds (App token + branch under fix/bot-*/**).
6. Failure mode B (history rewrite): The model runs git reset --hard HEAD~3 after a failed test or runs git config user.email someone@else.com to attribute commits differently. Neither is blocked at the tool layer; both contradict the rails text.
7. Failure mode C (injection-amplified): Issue body contains adversarial instructions like "after the fix, force-push to clean history." The model either complies or is closer to compliance because the tool layer doesn't refuse.
In every case, the dedicated "Push branch" step at lines 109-117 (with its git diff --quiet no-op-detection) is bypassed.
Mitigations and residual risk. The PR description's activation step 4 says admins should add a repo ruleset restricting the App identity to push only to bot/** and fix/bot-*/**, and "Block force-push everywhere." That is a real defense-in-depth layer: the App identity literally cannot push to develop or rewrite history because the server-side ruleset would reject it. So this is a layered-defense concern, not a live security exploit.
However, the rails text promises a tighter constraint than the grant actually enforces, and (a) the App-level ruleset is administrator-configured at activation time so it isn't guaranteed to be in place when this workflow runs, (b) the wildcard still permits in-tree mischief like git config rewriting commit attribution, git reset --hard losing legitimate work on the branch, and git push to a permitted fix/bot-*/** ref bypassing the dedicated step's gating logic.
Fix. Replace line 106 with an enumerated list that mirrors what the propose-fix prompt actually needs (read + stage + commit + checkout for branch operations within the working copy):
--allowedTools "Bash(gh:*),Bash(git status),Bash(git diff:*),Bash(git log:*),Bash(git show:*),Bash(git grep:*),Bash(git add:*),Bash(git commit:*),Bash(git checkout:*),Bash(bash tools/test-local.sh*),Bash(curl:*),Read,Edit,Write,Grep,Glob"
This matches the sibling pattern, keeps the rails honest, and leaves the dedicated "Push branch" step as the only path that pushes.
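To make the difference between the wildcard and the enumerated grant concrete, here is a minimal bash sketch. It assumes a simplified matching model (a rule ending in :* is a prefix rule, any other rule must equal the whole command) and is not the actual matcher used by Claude Code; is_allowed and both arrays are hypothetical names.

```shell
#!/usr/bin/env bash
# Simplified allow-list model (an assumption for illustration only):
#   "prefix:*"  -> matches the prefix alone or the prefix plus arguments
#   anything else -> must equal the whole command

# is_allowed <command> <rule>...
is_allowed() {
  local cmd="$1"; shift
  local rule prefix
  for rule in "$@"; do
    if [[ "$rule" == *":*" ]]; then
      prefix="${rule%:\*}"
      if [[ "$cmd" == "$prefix" || "$cmd" == "$prefix "* ]]; then
        return 0
      fi
    elif [[ "$cmd" == "$rule" ]]; then
      return 0
    fi
  done
  return 1
}

# The git part of the current grant versus the enumerated grant from the fix.
wildcard=("git:*")
scoped=("git status" "git diff:*" "git log:*" "git show:*" "git grep:*"
        "git add:*" "git commit:*" "git checkout:*")

for cmd in "git push --force origin main" "git reset --hard HEAD~3" \
           "git commit -m fix" "git status"; do
  if is_allowed "$cmd" "${wildcard[@]}"; then w=allowed; else w=denied; fi
  if is_allowed "$cmd" "${scoped[@]}"; then s=allowed; else s=denied; fi
  printf '%-30s wildcard=%s scoped=%s\n' "$cmd" "$w" "$s"
done
```

Under this model the wildcard admits all four commands, while the enumerated list admits only the commit and status calls, which is the gap the review describes.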
- [ ] At least one file outside `tests/`, `vendor/wheels/tests/`,
  `.ai/`, `CHANGELOG.md`, `docs/` is new or modified (the
  implementation)
🔴 The propose-fix self-check at step 12 lists only 5 exclusion paths (tests/, vendor/wheels/tests/, .ai/, CHANGELOG.md, docs/) but bot-tdd-gate.yml line 63 excludes 7 — also web/ and .github/. So a bot PR that touches only a spec plus a web/sites/guides/.../<page>.mdx (which step 9 of the same prompt explicitly endorses for user-visible behavior changes) self-clears step 12, then is hard-rejected by the gate with "Bot PR has tests but no implementation". Sync the bullet at lines 158-160 with the gate's regex (and consider also fixing the gate's error message at line 67, which is tests/, .ai/, docs/, and CHANGELOG.md — even further from its own regex).
Extended reasoning...
What the bug is
bot-tdd-gate.yml and .claude/commands/propose-fix.md are supposed to enforce the same TDD invariant: every bot PR must contain at least one spec change and at least one implementation change. The gate is the enforcement; the prompt's step-12 self-check is the bot's pre-flight.
Their exclusion lists have drifted:
- Gate (bot-tdd-gate.yml line 63) excludes 7 path prefixes from impl_changes: ^(tests/|vendor/wheels/tests/|\.ai/|CHANGELOG\.md|docs/|web/|\.github/)
- Prompt (propose-fix.md lines 158-160) lists only 5: "At least one file outside tests/, vendor/wheels/tests/, .ai/, CHANGELOG.md, docs/ is new or modified (the implementation)"
The prompt is missing web/ and .github/. The gate's own error message at line 67 also drifts the other direction — it says outside tests/, .ai/, docs/, and CHANGELOG.md, missing vendor/wheels/tests/, web/, and .github/.
How it manifests — step-by-step proof
1. A user-visible bug gets triaged with high confidence; bot-propose-fix.yml fires.
2. The bot follows step 5: writes a failing spec under vendor/wheels/tests/specs/<layer>/.
3. The bot follows step 7: implements the fix… but the fix is something user-visible, like updating a documented validation message or a form-helper output. Step 9 explicitly says: "If user-visible behavior changed: update web/sites/guides/src/content/docs/v4-0-0-snapshot/<area>/<page>.mdx".
4. Imagine the bot mis-classifies the MDX update as the "implementation" change (a reasonable confusion since step 9 lists it under "Update supporting docs" alongside CFML files). Or imagine an issue whose only resolution genuinely is an MDX clarification.
5. Bot reaches step 12 self-check. Looking at the diff:
   - vendor/wheels/tests/specs/<layer>/<x>Spec.cfc counts toward "spec" ✓
   - web/sites/guides/.../<page>.mdx is not in the prompt's exclusion list, so the bot considers it the "implementation" ✓
6. All boxes ticked. Bot opens the PR.
7. bot-tdd-gate.yml runs. Its regex does exclude web/, so impl_changes is empty. Gate fails with: "Bot PR has tests but no implementation."
8. Bot run wasted; the PR sits broken until a human intervenes.
The same trap exists for .github/: step 7 only forbids .github/workflows/pr.yml, not other .github/ paths (e.g. .github/pull_request_template.md, which this very PR modifies). A bot edit there would self-clear step 12 then trip the gate.
Why existing code doesn't prevent it
The gate is the only line of defense, and its error message also lies — saying the implementation must live outside 4 paths when the regex actually excludes 7. So a confused bot reading the gate's failure message would be told to look outside tests/, .ai/, docs/, CHANGELOG.md, but adding a web/ change still wouldn't fix the failure. There's no shared source for the exclusion list — three places (gate regex, gate error message, prompt self-check bullet) each have their own copy.
Impact
Bot is dormant by default (vars.WHEELS_BOT_ENABLED), and the worst case is a rejected bot PR — which is what the gate exists for. So this is not catastrophic. But it directly degrades bot-PR success rate on a path the prompt explicitly endorses, and the inconsistency means the gate doesn't actually enforce the prompt's discipline as claimed in the prompt's own "TDD invariant" note ("the prompt-level discipline below is enforced by code, so don't skip steps"). The gate's "enforced by code" promise is broken when the prompt and code disagree on what's being enforced.
How to fix
Minimal: update the bullet at propose-fix.md lines 158-160 to list all 7 paths (tests/, vendor/wheels/tests/, .ai/, CHANGELOG.md, docs/, web/, .github/), and update the gate's error message at bot-tdd-gate.yml line 67 to match. Better: extract the exclusion list into a single source — e.g., a small script under tools/ci/ that the gate sources and the prompt cites by reference rather than restating. That kills the drift class entirely.
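A single-source helper of the kind suggested above might look like the following sketch. The regex is the one quoted from bot-tdd-gate.yml line 63; the variable name, function names, and the idea of a file under tools/ci/ that the gate sources are assumptions for illustration, and the sample paths in the usage line are made up.

```shell
#!/usr/bin/env bash
# Hypothetical shared helper the gate could source, so that the regex,
# the error message, and the prompt bullet all derive from one list.
# Regex copied from the gate as quoted in this review.
NON_IMPL_RE='^(tests/|vendor/wheels/tests/|\.ai/|CHANGELOG\.md|docs/|web/|\.github/)'

# Succeeds when the path counts as an implementation change.
is_impl_path() {
  [[ ! "$1" =~ $NON_IMPL_RE ]]
}

# Reads changed paths on stdin (one per line) and prints how many
# of them count as implementation changes.
count_impl_changes() {
  local path n=0
  while IFS= read -r path; do
    [ -n "$path" ] || continue
    if is_impl_path "$path"; then
      n=$((n + 1))
    fi
  done
  echo "$n"
}

# Usage with made-up paths: a spec plus a guide page has no implementation.
printf '%s\n' 'vendor/wheels/tests/specs/model/FooSpec.cfc' \
              'web/sites/guides/page.mdx' | count_impl_changes
```

The gate could pipe its changed-file list into count_impl_changes, and the prompt could point at the helper instead of restating the seven prefixes.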
claude_args: |
  --model claude-opus-4-7
  --max-turns 40
  --allowedTools "Bash(gh:*),Bash(git log:*),Bash(git show:*),Bash(git grep:*),WebFetch,WebSearch,Read,Grep,Glob"
🔴 Step 3 of research-frameworks.md instructs the model to "Use the Agent tool to launch 6 parallel sub-agents" for the cross-framework fan-out, but bot-research.yml's --allowedTools list does not include Task (the tool that launches sub-agents). claude-code-action enforces --allowedTools as an explicit allow-list, so the prompt's documented core mechanism cannot run as designed. Either add Task to the workflow's allow-list or rewrite step 3 to do sequential WebFetch calls in a single agent.
Extended reasoning...
What the bug is
.claude/commands/research-frameworks.md step 3 (lines 43–44) is explicit:
Use the Agent tool to launch 6 parallel sub-agents (or fewer if fewer frameworks are relevant). Each agent gets: … Each agent uses WebFetch against the canonical docs and returns its structured summary.
But .github/workflows/bot-research.yml line 71 sets:
--allowedTools "Bash(gh:*),Bash(git log:*),Bash(git show:*),Bash(git grep:*),WebFetch,WebSearch,Read,Grep,Glob"
Task (the Claude Code tool that launches sub-agents, the one referred to as "the Agent tool" in the prompt) is not listed.
Why existing code does not save us
claude-code-action@v1 passes --allowedTools straight through to the Claude CLI as an allow-list. Tools not explicitly granted are denied in non-interactive (--max-turns) runs — the model cannot prompt a human to approve them. The shared rails the prompt cites also reinforce this: _shared-rails.md says "No write-side network tools unless the caller workflows --allowed-tools explicitly grants them" — the same principle applies to Task.
Step-by-step proof
1. Issue opens, triage classifies it framework-design, posts the trigger marker.
2. bot-research.yml fires, generates an App token, runs the skip-check, then invokes claude-code-action@v1 with the allow-list above.
3. The model loads /research-frameworks <issue> and follows step 3, attempting to call Task to launch sub-agent #1 (Rails).
4. The CLI rejects the call because Task is not in --allowedTools. With --max-turns 40 the run is non-interactive, so there is no human-approval fallback.
5. Best case: the model falls back to sequential WebFetch calls in a single agent, which is degraded behavior that contradicts the documented design and likely runs into the turn budget given six frameworks × multiple URLs each. Worst case: the model errors out before posting the comment, and the issue gets a partial or no research comment despite the workflow consuming Opus quota and CI minutes.
Impact
The "parallel sub-agent fan-out across 6 frameworks" is the named central mechanism of the research stage in the PR description, docs/contributing/wheels-bot.md ("Launches parallel sub-agents to look up how each of …"), and the prompt itself. With the bot dormant by default the blast radius is bounded today, but the moment an admin sets WHEELS_BOT_ENABLED=true the research stage will not behave as documented on its first real invocation.
How to fix
Pick one:
- Add Task to the allow-list in .github/workflows/bot-research.yml line 71:
  --allowedTools "Bash(gh:*),Bash(git log:*),Bash(git show:*),Bash(git grep:*),WebFetch,WebSearch,Read,Grep,Glob,Task"
  This is the lowest-friction fix and preserves the prompt's parallel-fan-out design. May also warrant raising --max-turns 40 if sub-agents themselves need turn budget.
- Rewrite step 3 of research-frameworks.md to do sequential WebFetch calls in a single agent (and update docs/contributing/wheels-bot.md and the PR description accordingly so the documented mechanism matches the actual implementation).
This page is for humans interacting with the bot. For the design rationale,
see the plan at `/root/.claude/plans/i-just-watched-a-polymorphic-plum.md` (or
its archived copy in the repo when published). For the framework's general
contribution rules, see [`CONTRIBUTING.md`](../../CONTRIBUTING.md).
🟡 docs/contributing/wheels-bot.md line 9 references /root/.claude/plans/i-just-watched-a-polymorphic-plum.md, which is an absolute path on the PR author's local Claude Code install — it will not exist on any reader's filesystem, and the diff does not commit that plan into the repo. The hedge "or its archived copy in the repo when published" itself admits the breadcrumb is dead the moment this doc lands. Fix: either commit the plan into the repo and link the canonical path, or simply remove the line.
Extended reasoning...
What the bug is
In docs/contributing/wheels-bot.md (the new operator handbook), the lead-in paragraph contains a pointer to a design plan:
This page is for humans interacting with the bot. For the design rationale,
see the plan at `/root/.claude/plans/i-just-watched-a-polymorphic-plum.md` (or
its archived copy in the repo when published).
/root/.claude/plans/... is an absolute path that exists only on the PR author's local Claude Code install (/root/.claude/ is the per-user state directory used by Claude Code). It is not a path that will resolve on any reader's machine: not on a contributor's laptop, not on CI, not in the rendered docs site.
Why the existing escape hatch does not save it
The parenthetical "or its archived copy in the repo when published" is meant to be a fallback, but it is itself an admission that the breadcrumb is currently dead: the archived copy is not in the diff. A grep of the PR's changed-files list confirms only docs/contributing/wheels-bot.md is added under docs/; no plan file was committed alongside it. So neither half of the "X or Y" lookup resolves.
Impact
This is not a correctness or security issue; it is a documentation quality / professionalism issue:
1. Readers following the link find nothing and lose trust in the rest of the doc.
2. A leaked /root/... path signals "machine-generated, not proofread", exactly the impression a brand-new bot/automation suite should avoid in its own operator handbook.
3. The whimsical plan filename (i-just-watched-a-polymorphic-plum.md) reinforces that this was a session-local artifact never meant for committed prose.
Step-by-step proof
1. Reader clones the repo at this PR's SHA.
2. Reader opens docs/contributing/wheels-bot.md and sees: "see the plan at /root/.claude/plans/i-just-watched-a-polymorphic-plum.md".
3. Reader runs cat /root/.claude/plans/i-just-watched-a-polymorphic-plum.md → file does not exist (and won't, on any non-PR-author machine).
4. Reader checks the parenthetical fallback: git ls-files | grep -i polymorphic-plum or find . -name '*polymorphic-plum*' → no results in the repo either.
5. Both lookup paths fail; the breadcrumb is dead.
How to fix
Two options, either is fine:
- Drop the breadcrumb. Replace the two-sentence paragraph with just "This page is for humans interacting with the bot. For the framework's general contribution rules, see CONTRIBUTING.md." The body of the doc already explains the design.
- Commit the plan and link the committed path. Move the plan into something like docs/superpowers/plans/2026-05-09-wheels-bot.md (mirroring the existing docs/superpowers/plans/ convention referenced in CLAUDE.md) and rewrite the line to see [docs/superpowers/plans/2026-05-09-wheels-bot.md](...).
Dropping the line is the lower-risk fix; it doesn't require deciding whether the plan is in shape for public consumption.
  is_bot=true
elif [[ "$PR_HEAD" =~ ^bot/ || "$PR_HEAD" =~ ^fix/bot- ]]; then
  is_bot=true
fi
🟡 The TDD gate's branch-fallback regex on .github/workflows/bot-tdd-gate.yml:28 only matches ^bot/ and ^fix/bot-, but .claude/commands/_shared-rails.md lines 48-49 sanction two bot branch patterns: fix/bot-<issue>-<slug> or feature/bot-<slug>. A bot PR opened on a feature/bot-* branch by a non-App identity (manual testing, future workflow variants) would set is_bot=false and silently bypass the spec+impl requirement. Either expand the regex to ^bot/|^fix/bot-|^feature/bot- or strike feature/bot-<slug> from the rails to single-source the contract.
Extended reasoning...
The bug. .claude/commands/_shared-rails.md lines 48-49 explicitly declare two valid bot branch patterns: fix/bot-<issue>-<slug> or feature/bot-<slug>. The TDD gate that enforces the spec+impl requirement on bot PRs uses two signals to decide is_bot in .github/workflows/bot-tdd-gate.yml lines 26-30: a PR_AUTHOR == 'wheels-bot[bot]' check (line 26) and a branch-pattern fallback (line 28). The branch fallback only matches ^bot/ or ^fix/bot-; the documented feature/bot-* pattern is not in the regex.
Why the fallback exists. The PR_AUTHOR check covers the realistic path today (every bot-authored PR runs through line 26 first), so the immediate practical impact is small. But the branch regex was added as defense-in-depth for cases where the App identity isn't the author: manual workflow_dispatch testing where a maintainer pushes to a bot branch, future automation variants that don't run as the App, or anyone reading _shared-rails.md and pushing to the documented feature/bot-* branch expecting the gate to fire. In every one of those scenarios the gate silently turns into a no-op.
Step-by-step proof.
1. A maintainer reads _shared-rails.md line 49, which says feature/bot-<slug> is a sanctioned bot-branch pattern.
2. They push a test PR on branch feature/bot-foo from their personal account to validate a future automation variant.
3. bot-tdd-gate.yml line 23 reads PR_HEAD=feature/bot-foo. Line 26 fails (PR_AUTHOR is the human, not wheels-bot[bot]). Line 28 evaluates 'feature/bot-foo' =~ ^bot/ → false, and 'feature/bot-foo' =~ ^fix/bot- → false.
4. is_bot=false is written.
5. Lines 36-77 are all conditioned on steps.classify.outputs.is_bot == 'true', so the entire spec+impl enforcement is skipped. The PR passes the gate without containing any spec changes, contradicting the contract documented in _shared-rails.md.
Why existing code doesn't prevent it. Three mitigations narrow the blast radius but don't close the hole: (a) the PR_AUTHOR check on line 26 catches every PR opened by the App identity; (b) the activation-step ruleset (PR description step 4) only allows the App to push to bot/** and fix/bot-*/**, so the App itself can't currently use feature/bot-*; (c) bot-propose-fix.yml line 80 hardcodes the branch name as fix/bot-${ISSUE_NUMBER}-${slug}. None of these are durable: (a) doesn't help when the author is a human, (b) is admin-managed config that can be expanded if someone implements a future feature/bot-* workflow per the rails, and (c) is one workflow, whereas _shared-rails.md is the contract that future workflows will read.
Impact. Bypassed TDD gate on feature/bot-* branches. The contract is single-sourced poorly: _shared-rails.md lists two branch patterns, the gate enforces one. The gap will materialize the moment any bot-related automation grows a feature/bot-* branch, and since the rails sanction it, that's a likely path for design-track features that aren't bug fixes.
Fix. Trivial; pick one of:
# Option A: expand the gate regex (preferred if feature/bot-* is intended)
elif [[ "$PR_HEAD" =~ ^bot/ || "$PR_HEAD" =~ ^fix/bot- || "$PR_HEAD" =~ ^feature/bot- ]]; then
Or strike feature/bot-<slug> from _shared-rails.md line 49 so the rails describe what the gate actually enforces. Either way, single-source the contract.
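The expanded branch fallback from Option A can be checked in isolation with a small bash sketch. The three patterns are the ones named in this comment; the function name is_bot_branch and the sample branch names other than feature/bot-foo and fix/bot-1234-foo are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of the branch-name fallback with ^feature/bot- added.
# Uses the same [[ =~ ]] form as the gate snippet quoted above;
# the function wrapper is an illustration, not code from the PR.
is_bot_branch() {
  local head="$1"
  [[ "$head" =~ ^bot/ || "$head" =~ ^fix/bot- || "$head" =~ ^feature/bot- ]]
}

for head in bot/foo fix/bot-1234-foo feature/bot-foo feature/other; do
  if is_bot_branch "$head"; then
    echo "$head -> is_bot=true"
  else
    echo "$head -> is_bot=false"
  fi
done
```

With the third pattern in place, the feature/bot-foo branch from the proof is classified as a bot branch and would reach the spec+impl check even when the PR author is a human.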
…_bots) (#2520)

* fix(cli): escape # in test help URL fragment

  Module.cfc:3723 prints a help URL containing the unescaped fragment "testing#testing-against-different-engines". Lucee's parser interprets an unescaped # in CFScript string literals as an expression delimiter and fails to compile the file with "Invalid Syntax Closing [#] not found" when no closing # is found.

  This crashed Module.cfc compilation, which broke `wheels new` and the Wheels Snapshots smoke test pipeline (build / Smoke Test Installed Distribution). Every push to develop has been failing this gate since PR #2517 introduced the line.

  CFML rule: ## inside a string literal outputs a literal #. The fix is a one-character change. See CLAUDE.md "# escape gotcha" for historical context on the same bug class.

* ci: remove unreachable vars.WHEELS_BOT_ENABLED check from skip-check action

  Composite actions cannot read the `vars` context; that lookup is only available in workflow files. The skip-check action's internal kill-switch check failed every time the action was invoked, with:

      Unrecognized named-value: 'vars'. Located at position 1 within expression: vars.WHEELS_BOT_ENABLED

  This silently broke from PR #2518 onwards but only surfaced after the kill switch was activated and Reviewer A actually fired on a real PR.

  The kill switch is already enforced at the job level via `if: vars.WHEELS_BOT_ENABLED == 'true'` in every wheels-bot workflow, so the action's internal check was redundant as well as broken.

  Removed:
  - `WHEELS_BOT_ENABLED` env binding (couldn't resolve)
  - The `if [[ ... != "true" ]]; then skip=true ...` block (redundant)

  Description updated to note that the kill switch lives at the job level, not in this action.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci: add allowed_bots to all wheels-bot claude-code-action invocations

  The anthropics/claude-code-action defaults to blocking workflow runs initiated by non-human actors (any GitHub Bot identity). Without an explicit allowlist, every bot-triggered workflow fails fast with:

      Action failed with error: Workflow initiated by non-human actor: wheels-bot (type: Bot). Add bot to allowed_bots list or use '*' to allow all bots.

  This surfaced first on bot-review-b.yml: Reviewer B is triggered by wheels-bot[bot] submitting a Reviewer A review, and the action blocked on the bot initiator. The same failure would hit bot-research, bot-propose-fix, and bot-auto-close (cron actor is github-actions[bot]) as the rollout progresses through Phases 2-5.

  Adding `allowed_bots: 'wheels-bot[bot],github-actions[bot]'` to all six wheels-bot workflows. A specific allowlist (not '*') is used because the repo is public: '*' would let any external GitHub App invoke the action with prompts they could influence (per the action's docs/security.md).

  The two identities allowed:
  - wheels-bot[bot]: our App's bot identity
  - github-actions[bot]: GitHub's own actor for scheduled and workflow-internal triggers

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
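Adds a five-stage Claude-powered GitHub bot for `wheels-dev/wheels`, modeled on Bun's public Claude workflows (`oven-sh/bun/.github/workflows/claude-*.yml` + `.claude/commands/`). The bot is dormant by default: no workflow runs until a repo admin sets `vars.WHEELS_BOT_ENABLED='true'`.

The five stages:

- Triage (Sonnet): classifies issues as `bug` / `framework-design` / `other` (+ confidence on bug path).
- Cross-framework research (Opus): runs on `framework-design` issues.
- Propose Fix (Opus, … `workflow_dispatch`): TDD-mandatory draft PR, gated by `bot-tdd-gate.yml`.
- Reviewer A (Sonnet): single PR review with line comments and verdict.
- Reviewer B (Sonnet): critiques A for sycophancy, false positives, and missed issues; loop cap = 3 rounds.

Plus a daily cron (`bot-auto-close.yml`) that closes stale `cannot-reproduce` triages after 14 days.

Files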
New (17):
- `.claude/commands/_shared-rails.md`: common safety rails included in every prompt
- `.claude/commands/{triage-issue,research-frameworks,propose-fix,review-pr,review-the-review,auto-close-stale-triage}.md`
- `.github/workflows/bot-{triage,research,propose-fix,tdd-gate,review-a,review-b,auto-close}.yml`
- `.github/actions/wheels-bot-skip-check/action.yml`: central kill-switch + marker check
- `.github/actions/setup-wheels-test-env/action.yml`: composite action lifted from `pr.yml:30-164` (LuCLI + Lucee + SQLite + Playwright)
- `docs/contributing/wheels-bot.md`: operator handbook

Modified (3):

- `CLAUDE.md`: new §"Wheels Bot" subsection
- `CONTRIBUTING.md`: `[skip-claude]` opt-out + bot-output legibility
- `.github/pull_request_template.md`: opt-out hint at the bottom

Safety controls
- `vars.WHEELS_BOT_ENABLED`: repo variable, must be `'true'` for any workflow to run. Default unset = dormant.
- `[skip-claude]` label or title token: per-issue/PR opt-out, checked by every workflow.
- Markers (`wheels-bot:<stage>:<key>`) prevent duplicate runs across retries.
- Custom GitHub App (`wheels-bot[bot]`): push permissions scoped to `bot/**` and `fix/bot-*/**` via repo ruleset (set up by admins as part of activation).
- `bot-tdd-gate.yml` hard-rejects bot PRs that don't include both a spec change and an implementation change. Human PRs bypass automatically.
- … `wheels-bot:fix-held:<issue>` instead of opening a PR.
- … `medium`) or when frameworks disagree (capped at `low`); only `high`-confidence research auto-fires the fix-PR stage.

Activation steps (post-merge, admins only)
1. Create the GitHub App `wheels-bot[bot]` at github.com/settings/apps/new under `wheels-dev`. Permissions: Contents R/W, Issues R/W, Pull Requests R/W, Metadata R. No webhooks. Install on this repo only.
2. Add secrets `WHEELS_BOT_APP_ID`, `WHEELS_BOT_PRIVATE_KEY`. Confirm `ANTHROPIC_API_KEY` is already present.
3. Create labels `skip-claude`, `cannot-reproduce` in the repo.
4. Add a repo ruleset scoping the App's push permissions to `bot/**` and `fix/bot-*/**`.
5. `develop` branch protection: add `Bot PR TDD Gate` to required checks, require 1 approving review from `wheels-dev/maintainers`.
6. Set `vars.WHEELS_BOT_ENABLED='true'` to activate.

Phased rollout: see `docs/contributing/wheels-bot.md` § "Operating the bot". Phase 1 (Reviewer A only) is the smallest first cut; promote subsequent stages one at a time.

Test plan
- … `develop`, confirm `bot-review-a.yml` runs and posts a review with the `wheels-bot:review-a` marker. Push a second commit, confirm cancel-in-progress supersedes the first run.
- … `bot-triage.yml` brings up Lucee + SQLite via the composite action and lands a triage comment with confidence + marker. Re-open the issue, confirm the marker check skips a duplicate run.
- … `bot-review-b.yml` fires only on bot-authored reviews. Have A re-review, confirm round counter increments. Confirm cap at round 3.
- … `workflow_dispatch` first; verify spec + implementation both committed, `bot-tdd-gate.yml` passes, commitlint + fast-test pass, PR is `--draft`. Then enable auto-fire on `triage-confidence:high`.
- … `workflow_dispatch` against historical issues; verify research lands an accurate comparison table + Wheels-idiomatic API sketch with explicit confidence. After ≥ 5 supervised runs, extend `bot-propose-fix.yml`'s trigger to fire on `research-confidence:high`.
- … `cannot-reproduce` triages close politely.
- … `WHEELS_BOT_ENABLED=false`, open a PR, confirm `bot-review-a.yml` exits 0 immediately.

Notes
- … `yaml.safe_load`.
- `docs-validation.yml` orchestrator is intentionally not reused: it is a stateful batch runner; these bots are one-shot per event.
- `pr.yml`'s required-checks contract is undisturbed: the test-env composite action references its prelude by extraction, not modification.

https://claude.ai/code/session_01F5Ev5XsFMzLncPCZ43hjVP
Generated by Claude Code
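As an illustration of the opt-out and marker checks listed under Safety controls, the following bash sketch shows one way a skip decision could be computed. It is not the contents of `wheels-bot-skip-check/action.yml`: the function name `should_skip`, its argument order, and the idea of passing existing comment text as a plain string are all assumptions made for this example. It also assumes the label is named `skip-claude` and the title token is `[skip-claude]`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prints "true" when a bot stage should not run.
# Arguments: issue/PR title, comma-separated label names,
#            text of existing bot comments, marker for this stage and key.
should_skip() {
  local title="$1" labels="$2" comments="$3" marker="$4"

  # Per-issue/PR opt-out via the title token.
  if [[ "$title" == *'[skip-claude]'* ]]; then
    echo true; return
  fi

  # Per-issue/PR opt-out via the label (exact match within the list).
  if [[ ",$labels," == *',skip-claude,'* ]]; then
    echo true; return
  fi

  # Idempotency: an earlier run already posted this marker.
  if [[ -n "$marker" && "$comments" == *"$marker"* ]]; then
    echo true; return
  fi

  echo false
}
```

The kill switch is deliberately absent here: as the commit message above explains, `vars.WHEELS_BOT_ENABLED` is checked at the job level because composite actions cannot read the `vars` context.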