feat(altair): implement marimekko-basic #5480
Conversation
**🔧 AI Review Produced No Score — Auto-Retrying**
The Claude Code Action ran but didn't write
**❌ AI Review Failed (auto-retry exhausted)**
The AI review action completed but did not produce valid output files. Auto-retry already tried once.
What happened:
Manual rerun:
**AI Review - Attempt 1/3**
Image Description
Score: 82/100
Visual Quality (28/30)
Design Excellence (12/20)
Spec Compliance (12/15)
Data Quality (15/15)
Code Quality (8/10)
Library Mastery (7/10)
Score Caps Applied
Strengths
Weaknesses
Issues Found
AI Feedback for Next Attempt
Verdict: REJECTED
**🔧 Repair Attempt 1/4**
Applied fixes based on AI review feedback.
Status: Repair completed, re-triggering review...
…ompts (#5520)

## Summary

Three workflows (`impl-review.yml`, `spec-create.yml`, `report-validate.yml`) used shell-style `$VAR` inside `with: prompt: |` blocks of `claude-code-action`. That block is a YAML string handed to a Node/Bun action — **no shell ever runs** — so `$VAR` was sent to Claude as a literal placeholder instead of the actual value. Result: Claude couldn't reliably identify the PR / spec / library to review and silently produced no `quality_score.txt`, which the validate step turns into `ai-review-failed`.

## Symptoms observed today (2026-04-29)

5 stuck implementation PRs from 2026-04-27, all with `ai-review-failed` despite the prior fixes branch (#5410) and the audit branch (#5515) landing in between:

| PR | Branch | Pre-fix labels |
|----|--------|----------------|
| #5476 | seaborn/marimekko-basic | `ai-review-failed`, `quality:78` |
| #5480 | altair/marimekko-basic | `ai-review-failed`, `quality:82` |
| #5481 | letsplot/marimekko-basic | `ai-rejected`, `quality:76` |
| #5483 | plotnine/marimekko-basic | `ai-review-failed` |
| #5486 | plotly/line-basic | `ai-review-failed` |

Re-dispatching review on each confirmed the bug: the run log of `Run AI Quality Review` shows the prompt being passed verbatim:

```
PROMPT: Read prompts/workflow-prompts/ai-quality-review.md and follow those instructions.
Variables for this run:
- LIBRARY: $LIBRARY      # ← literal, never expanded
- SPEC_ID: $SPEC_ID
- PR_NUMBER: $PR_NUMBER
- ATTEMPT: $ATTEMPT
```

Claude's review then either ran for ~20s and exited with no `quality_score.txt` (4 PRs failed), or recovered by inferring values from cwd (1 PR succeeded with `quality:82`). The intermittent pattern is exactly what you'd expect when the prompt is ambiguous and Claude has to guess from context.

## Root cause

Commit `252977cf3` ("chore: fix critical audit findings", 2026-04-28 22:46) routed several `${{ github.event.* }}` and step-output values through step-level `env:` and rewrote the in-prompt references as `$VAR`. That is the correct mitigation for `run:` shell steps and Python heredocs in the same workflows (and those changes stay in place). Inside `with: prompt: |` it is the wrong tool: the value is consumed by a JS action, not a shell, so there is no injection surface to mitigate and `$VAR` does not interpolate.

`spec-create.yml` and `report-validate.yml` carry the identical anti-pattern in their `prompt:` blocks. They haven't surfaced as failures yet only because no triggering issue has come in since 2026-04-28.

## The fix

Revert **only** the descriptive header lines of each `prompt:` block back to GitHub Actions expression syntax (`${{ ... }}`), which the runner substitutes into the YAML string before the action receives it. Keep:

- All `env:` blocks (harmless; lets future prompt content reference env vars if useful)
- All `$VAR` references inside **embedded bash code samples** in the prompt (e.g. `gh issue edit $ISSUE_NUMBER`). Those are executed by Claude's Bash tool, which inherits the step `env:` and expands them correctly — and rewriting them would re-enable the injection vector the audit was right to close.

```diff
 Variables for this run:
- - LIBRARY: $LIBRARY
- - SPEC_ID: $SPEC_ID
- - PR_NUMBER: $PR_NUMBER
- - ATTEMPT: $ATTEMPT
+ - LIBRARY: ${{ steps.pr.outputs.library }}
+ - SPEC_ID: ${{ steps.pr.outputs.specification_id }}
+ - PR_NUMBER: ${{ steps.pr.outputs.pr_number }}
+ - ATTEMPT: ${{ steps.attempts.outputs.display }}
```

(Analogous 8-line revert in `spec-create.yml` × 2 prompt blocks and 4-line revert in `report-validate.yml`.) Diff total: **3 files, 16 ±**.

## Test plan

- [ ] After merge, redispatch `impl-review.yml` for the 4 stuck PRs (`gh workflow run impl-review.yml -f pr_number=<N>` for 5476, 5483, 5486; 5480 already got an 82 in the redispatch and should now stabilize)
- [ ] Verify each run's `Run AI Quality Review` step log shows real values (e.g. `- LIBRARY: plotly`) in the PROMPT echo, not `$LIBRARY`
- [ ] Verify `quality_score.txt` is produced and the `ai-review-failed` label is removed
- [ ] On next `spec-request`-labeled issue, verify the spec-create prompt sees the issue title/body
- [ ] On next `report-pending`-labeled issue, verify the report-validate prompt sees the issue title/body

🤖 Generated with [Claude Code](https://claude.com/claude-code)
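For context, the corrected review step then looks roughly like this (a sketch assembled from the diff above; only the "Variables for this run" lines are taken verbatim from it, while the step name, action reference, and version pin are assumptions, not quoted from the workflow):

```yaml
# Hypothetical reconstruction of impl-review.yml's review step after the fix.
- name: Run AI Quality Review             # step name inferred from the run log
  uses: anthropics/claude-code-action@v1  # action version is an assumption
  with:
    prompt: |
      Read prompts/workflow-prompts/ai-quality-review.md and follow those instructions.
      Variables for this run:
      - LIBRARY: ${{ steps.pr.outputs.library }}
      - SPEC_ID: ${{ steps.pr.outputs.specification_id }}
      - PR_NUMBER: ${{ steps.pr.outputs.pr_number }}
      - ATTEMPT: ${{ steps.attempts.outputs.display }}
```

The runner evaluates every `${{ ... }}` expression before handing the string to the action, so the prompt Claude receives already contains the concrete values.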
**AI Review - Attempt 2/3**
Image Description
Score: 87/100
Visual Quality (29/30)
Design Excellence (11/20)
Spec Compliance (15/15)
Data Quality (15/15)
Code Quality (9/10)
Library Mastery (8/10)
Score Caps Applied
Strengths
Weaknesses
Issues Found
AI Feedback for Next Attempt
The visual output (from repair) is correct and scores 87. The blocker is that the committed Python file still has the old unthemed code. Ensure the fix includes: (1)
Verdict: REJECTED
## Summary

The 3 AI-approved implementation PRs from today (#5476, #5480, #5481) all hit `gh pr merge` failures with `the base branch policy prohibits the merge`. Root cause: the branch ruleset on `main` requires three status checks (`Run Linting`, `Run Tests`, `Run Frontend Tests`) — and impl-PRs created by `impl-generate.yml` never get those checks.

## Why CI doesn't run on impl-PRs

`impl-generate.yml` (and `impl-repair.yml`, `impl-review.yml`) push commits to PR branches using `GITHUB_TOKEN`. By GitHub's anti-recursion design, pushes / PRs created with `GITHUB_TOKEN` do **not** trigger downstream `pull_request` or `workflow_run` events. Verified across all 5 stuck PRs:

| PR | Branch | `Run Linting` ever ran? |
|----|--------|--------------------------|
| #5476 | seaborn/marimekko-basic | yes (once, on a 04-27 impl-repair commit; newer score commits invalidated it) |
| #5480 | altair/marimekko-basic | no |
| #5481 | letsplot/marimekko-basic | no |
| #5483 | plotnine/marimekko-basic | no |
| #5486 | plotly/line-basic | no |

So the merge is gated on a check that structurally cannot complete.

## The fix

Add `--admin` to the `gh pr merge` call inside `impl-merge.yml`. This lets the pipeline complete autonomously without weakening `main`'s protection for human PRs.

```diff
+          # --admin bypasses the branch ruleset's required-status-check
+          # gate. Required because impl-generate.yml pushes via GITHUB_TOKEN,
+          # which by GitHub's anti-recursion design does not trigger
+          # downstream CI workflows (Run Linting / Run Tests / Run Frontend
+          # Tests), so impl PRs never get those checks. The pipeline already
+          # gates merge behind the AI quality review threshold.
           if gh pr merge "$PR_NUM" \
             --repo "$REPOSITORY" \
             --squash \
+            --admin \
             --delete-branch; then
```

The merge is still gated by:

- AI quality threshold (cascading 90 / 80 / 70 / 60 / 50 across initial review + 4 repair attempts)
- `impl-merge.yml`'s own pre-merge "Validate PR completeness" step
- The label-based trigger requiring `ai-approved`

So `--admin` only bypasses the structurally-missing CI artifact, not the substantive review gates.

## Considered alternative

Push from `impl-generate` / `impl-repair` / `impl-review` via a PAT instead of `GITHUB_TOKEN` so CI triggers naturally. Cleaner long-term, but it needs a maintained secret and a broader review of which workflows touch which branches; deferred.

## Test plan

- [ ] After merge, dispatch `impl-merge.yml` (or trust the `ai-approved` label trigger) for the 3 stuck approved PRs (#5476, #5480, #5481)
- [ ] Verify merge succeeds without retries on attempt 1
- [ ] Verify post-merge: metadata file created, GCS staging→production promotion done, `impl:{library}:done` label on parent issue

🤖 Generated with [Claude Code](https://claude.com/claude-code)
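The cascading threshold mentioned above can be sketched as a tiny lookup (illustrative only; the constant and function names and the 1-based attempt numbering are my assumptions, not the repo's code):

```python
# Illustrative sketch of the cascading quality gate: the score required
# to approve drops by 10 points per repair attempt, 90 down to 50.
THRESHOLDS = [90, 80, 70, 60, 50]  # initial review + 4 repair attempts

def required_score(attempt: int) -> int:
    """attempt is 1-based: 1 = initial review, 2..5 = repair attempts."""
    return THRESHOLDS[attempt - 1]

print(required_score(1))  # 90
print(required_score(5))  # 50
```

Note that the score gate is only one of the checks: an attempt can clear its threshold and still be rejected on other grounds, as attempt 2's 87 (above the 80 bar) was for the stale committed file.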
…et (#5523)

## Summary

Follow-up to #5521 (which added `--admin` to `gh pr merge`). That change alone wasn't enough — verified just now: 3 dispatched merges (#5476, #5480, #5481) all failed identically with:

```
GraphQL: Repository rule violations found
3 of 3 required status checks are expected. (mergePullRequest)
```

## Why --admin alone didn't work

The `main` ruleset's bypass list contains only `RepositoryRole admin` (mode: `pull_request`). The default `GITHUB_TOKEN` runs as `github-actions[bot]` with the `write` role — not admin — so the API rejects the bypass.

```bash
gh api repos/MarkusNeusinger/anyplot/rulesets/10578859 --jq '.bypass_actors'
# [{"actor_id":5,"actor_type":"RepositoryRole","bypass_mode":"pull_request"}]
```

## The fix

Route **only the merge step** through a repo-admin PAT (`ADMIN_TOKEN`). All other steps in `impl-merge.yml` and the rest of the impl-* workflows keep using `GITHUB_TOKEN`. Bypass scope is therefore exactly one step, not the whole pipeline.

```diff
       - name: Merge PR to main (with retry)
         if: steps.check.outputs.should_run == 'true'
         env:
-          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          GH_TOKEN: ${{ secrets.ADMIN_TOKEN || secrets.GITHUB_TOKEN }}
           PR_NUM: ${{ steps.check.outputs.pr_number }}
           REPOSITORY: ${{ github.repository }}
+          HAS_ADMIN_TOKEN: ${{ secrets.ADMIN_TOKEN != '' }}
         run: |
+          if [ "$HAS_ADMIN_TOKEN" != "true" ]; then
+            echo "::warning::ADMIN_TOKEN secret is not set..."
+          fi
```

The fallback `secrets.ADMIN_TOKEN || secrets.GITHUB_TOKEN` and the warning preserve the previous behavior if `ADMIN_TOKEN` isn't set yet — the workflow still runs and fails with the same ruleset error as before, but the log says clearly what's missing instead of an opaque auth error.

## Required after merge

1. **Create PAT**: Settings → Developer settings → Personal access tokens → Fine-grained
   - Repository: `anyplot`
   - Permissions:
     - Contents: Read+Write
     - Pull requests: Read+Write
     - Administration: Read+Write
     - Metadata: Read
2. **Set secret**: Settings → Secrets and variables → Actions → New repository secret
   - Name: `ADMIN_TOKEN`
   - Value: the PAT

## Considered alternatives

| Option | Verdict |
|--------|---------|
| Add `github-actions[bot]` as bypass actor on ruleset | broader blast radius — *every* workflow run could bypass main |
| Remove the 3 required checks from ruleset | weakens protection for human PRs too |
| Push from impl-generate via PAT so CI triggers naturally | cleanest semantically, but needs PAT in 3 workflows + same maintenance overhead |
| **Scope PAT to merge step only (this PR)** | smallest blast radius, matches the actual permission gap |

## Test plan

- [ ] Merge this PR
- [ ] Create the fine-grained PAT and add it as the `ADMIN_TOKEN` repo secret
- [ ] Re-dispatch `impl-merge.yml` for the 3 stuck approved PRs (#5476 seaborn, #5480 altair, #5481 letsplot)
- [ ] Verify each merges successfully on attempt 1 (no ruleset error in run log)
- [ ] Verify metadata file created, GCS staging→production promotion done, parent issue gets `impl:{library}:done` label

🤖 Generated with [Claude Code](https://claude.com/claude-code)
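The token-fallback behavior described above can be modeled in a few lines (a Python sketch of the shell/expressions logic, not the workflow's actual code; `resolve_merge_token` is a hypothetical name):

```python
# Model of the merge step's token fallback: prefer ADMIN_TOKEN, else fall
# back to GITHUB_TOKEN and emit a ::warning:: annotation, mirroring
# `secrets.ADMIN_TOKEN || secrets.GITHUB_TOKEN` in the diff above.
def resolve_merge_token(secrets: dict) -> tuple[str, list[str]]:
    warnings = []
    admin = secrets.get("ADMIN_TOKEN", "")
    if not admin:
        # Without the PAT, the merge still runs but the log explains the gap.
        warnings.append("::warning::ADMIN_TOKEN secret is not set")
    return admin or secrets.get("GITHUB_TOKEN", ""), warnings

token, warns = resolve_merge_token({"GITHUB_TOKEN": "ghs_example"})
print(token)  # falls back to GITHUB_TOKEN
print(warns)  # warning annotation emitted
```

The point of the sketch is that the warning fires on the fallback path only, so once `ADMIN_TOKEN` is configured the step behaves identically but silently.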
**Implementation: marimekko-basic - python/altair**
Implements the python/altair version of marimekko-basic.
File: `plots/marimekko-basic/implementations/python/altair.py`
Parent Issue: #1002
🤖 impl-generate workflow