
feat(letsplot): implement marimekko-basic #5481

Merged
MarkusNeusinger merged 7 commits into main from
implementation/marimekko-basic/letsplot
on Apr 29, 2026

Conversation

@github-actions
Contributor

Implementation: marimekko-basic - python/letsplot

Implements the python/letsplot version of marimekko-basic.

File: plots/marimekko-basic/implementations/python/letsplot.py

Parent Issue: #1002


🤖 impl-generate workflow

@github-actions github-actions Bot added the `quality:76` (Quality score 76/100) and `ai-rejected` (Quality not OK, triggers update) labels on Apr 27, 2026
@github-actions github-actions Bot added the `ai-attempt-1` (First repair attempt) and `ai-rejected` labels and removed the previous `ai-rejected` label on Apr 27, 2026
@github-actions
Contributor Author

🔧 Repair Workflow Crashed (Attempt 1/3) — Auto-Retrying

The repair workflow failed (probably a transient Claude Code Action issue). Automatically re-triggering this attempt...


🤖 impl-repair

@github-actions github-actions Bot added the `ai-rejected` (Quality not OK, triggers update) label and removed the previous `ai-rejected` label on Apr 27, 2026
@github-actions
Contributor Author

❌ Repair Workflow Crashed (Attempt 1/3, retry exhausted)

The repair workflow itself failed twice for this attempt — likely a persistent Claude Code Action issue rather than a transient one.

Manual restart:

gh workflow run impl-repair.yml -f pr_number=5481 -f specification_id=marimekko-basic -f library=letsplot -f attempt=1

🤖 impl-repair

@github-actions github-actions Bot removed the `ai-rejected` (Quality not OK, triggers update) label on Apr 29, 2026
@github-actions
Contributor Author

🔧 Repair Attempt 1/4

Applied fixes based on AI review feedback.

Status: Repair completed, re-triggering review...


🤖 impl-repair

@claude
Contributor

claude Bot commented Apr 29, 2026

AI Review - Attempt 2/3

Image Description

Light render (plot-light.png): The chart renders on a warm off-white background (approximately #FAF8F1). Four variable-width vertical bars represent North America, Europe, Asia Pacific, and Latin America (widths proportional to total market size). Each bar is subdivided into four stacked segments: Electronics (teal-green, bottom), Apparel (orange), Home Goods (blue), and Food & Beverage (pink-purple, top). White bold dollar-value labels appear inside each segment (e.g., "$180.0M" for Asia Pacific Electronics). Title "marimekko-basic · letsplot · anyplot.ai" is centered at the top; axis labels "Market Size Distribution" and "Share within Region (%)" are present; legend appears to the right. All title, axis, tick, and legend text is clearly readable against the light background. Minor issue: the "$30.0M" Food & Beverage label on the narrow Latin America bar extends slightly outside the segment boundary.

Dark render (plot-dark.png): Same chart on a near-black background (approximately #1A1A17). Title, axis labels, tick labels, and legend text all appear in white/light tones and are clearly readable — no dark-on-dark failure detected. Data colors are identical to the light render (Electronics teal-green, Apparel orange, Home Goods blue, Food & Beverage pink-purple), only the background and text chrome flip. The "$30.0M" Latin America/Food & Beverage label again overflows its narrow segment slightly. Legibility verdict: PASS for both renders.

Score: 73/100

| Category | Score | Max |
|---|---|---|
| Visual Quality | 22 | 30 |
| Design Excellence | 10 | 20 |
| Spec Compliance | 13 | 15 |
| Data Quality | 14 | 15 |
| Code Quality | 8 | 10 |
| Library Mastery | 6 | 10 |
| **Total** | **73** | **100** |

Visual Quality (22/30)

  • VQ-01: Text Legibility (7/8) — Font sizes explicitly set (title 24pt, axis_title 20pt, ticks 16pt), all text readable in both renders; minor label crowding on narrow Latin America segments
  • VQ-02: No Overlap (4/6) — "$30.0M" Food & Beverage label overflows its narrow segment boundary in both renders
  • VQ-03: Element Visibility (5/6) — Segments clearly defined with white borders; all data visible
  • VQ-04: Color Accessibility (1/2) — Custom palette not CVD-optimised; Python Blue and green may conflict for some users
  • VQ-05: Layout & Canvas (3/4) — Good canvas utilisation; x-axis labels at 15° are readable
  • VQ-06: Axis Labels & Title (2/2) — "Market Size Distribution" and "Share within Region (%)" are descriptive
  • VQ-07: Palette Compliance (0/2) — Custom #306998 (Python Blue) explicitly prohibited; uses Python logo colours instead of Okabe-Ito; code lacks ANYPLOT_THEME reading and theme-adaptive chrome tokens

Design Excellence (10/20)

  • DE-01: Aesthetic Sophistication (4/8) — Clean marimekko layout with value labels adds utility; however the palette uses the wrong Python logo colours rather than reflecting a thoughtful design choice; looks like a configured default
  • DE-02: Visual Refinement (3/6) — Grid fully removed via element_blank() ✓, white segment borders provide good definition ✓; however the axis frame is retained and spines are not removed
  • DE-03: Data Storytelling (3/6) — Area encoding communicates both dimensions simultaneously; dollar labels guide eye to values; no visual hierarchy or emphasis on notable data points

Spec Compliance (13/15)

  • SC-01: Plot Type (5/5) — Correct Marimekko/mosaic chart using geom_rect
  • SC-02: Required Features (4/4) — Proportional bar widths ✓, proportional segment heights ✓, colour-coded y-categories ✓, value labels on segments ✓
  • SC-03: Data Mapping (3/3) — Correct mapping: region widths proportional to totals, segment heights as shares within region, 0–100% y-axis
  • SC-04: Title & Legend (1/3) — Code hardcodes "pyplots.ai" (line 107); image shows "anyplot.ai" (likely a pipeline patch); title format is otherwise correct; legend labels match products
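The SC-03 mapping credited above — bar widths proportional to regional totals, segment heights as cumulative shares within each region — can be sketched with a small rectangle-coordinate computation. The data values and column names here are illustrative only, not taken from the actual implementation file:

```python
import pandas as pd

# Hypothetical subset of the market data (values in $M)
data = pd.DataFrame({
    "region":  ["North America"] * 2 + ["Asia Pacific"] * 2,
    "product": ["Electronics", "Apparel"] * 2,
    "value":   [120.0, 80.0, 180.0, 120.0],
})

# Bar edges on x: each region's width is its total market size
totals = data.groupby("region", sort=False)["value"].sum()
right = totals.cumsum()
left = right - totals
data["xmin"] = data["region"].map(left)
data["xmax"] = data["region"].map(right)

# Segment edges on y: cumulative share within the region, 0-100 %
share = data["value"] / data["region"].map(totals) * 100
data["ymax"] = share.groupby(data["region"]).cumsum()
data["ymin"] = data["ymax"] - share
```

A geom_rect layer with aes(xmin="xmin", xmax="xmax", ymin="ymin", ymax="ymax", fill="product") over such a frame is what produces the variable-width stacked bars.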

Data Quality (14/15)

  • DQ-01: Feature Coverage (5/6) — Shows variable-width bars, proportional stacking, multi-category comparison across regions of different sizes; good coverage of marimekko mechanics
  • DQ-02: Realistic Context (5/5) — Real-world market share analysis across major regions and product categories; neutral business context
  • DQ-03: Appropriate Scale (4/4) — Values in millions are plausible; Asia Pacific being the largest market is realistic

Code Quality (8/10)

  • CQ-01: KISS Structure (3/3) — Clean linear flow: imports → data → compute → plot → save; no functions or classes
  • CQ-02: Reproducibility (2/2) — Fully deterministic data; no random seed needed
  • CQ-03: Clean Imports (2/2) — All listed imports are used
  • CQ-04: Code Elegance (1/2) — Comment "Python Blue, Python Yellow" acknowledges wrong palette choice; shutil.rmtree cleanup is a brittle workaround for lets-plot-images subfolder
  • CQ-05: Output & API (0/1) — Saves as plot.png/plot.html instead of plot-{THEME}.png/plot-{THEME}.html; no os.getenv("ANYPLOT_THEME") reading anywhere

Library Mastery (6/10)

  • LM-01: Idiomatic Usage (4/5) — Correct ggplot grammar; proper use of geom_rect with aes(xmin/xmax/ymin/ymax), scale_x_continuous with custom breaks/labels, element_blank() for grid removal
  • LM-02: Distinctive Features (2/5) — label_format parameter in geom_text is a lets_plot feature; HTML export via ggsave; but nothing that uniquely showcases lets_plot over plotnine

Score Caps Applied

  • None

Strengths

  • Correct Marimekko structure: geom_rect with xmin/xmax/ymin/ymax is the right approach for variable-width bars in lets_plot ggplot grammar
  • Realistic, neutral market share data with plausible regional variation
  • All required spec features present: proportional widths, proportional heights, colour coding, value labels on segments
  • Grid fully removed for clean look; white segment borders provide good visual separation

Weaknesses

  • Custom Python logo colours (#306998, #FFD43B, #2E8B57, #DC2626) — #306998 explicitly prohibited in VQ-07; replace with Okabe-Ito ['#009E73', '#D55E00', '#0072B2', '#CC79A7']
  • No os.getenv("ANYPLOT_THEME") reading — code must set PAGE_BG, INK, INK_SOFT tokens and apply them via element_rect(fill=PAGE_BG) on plot/panel background and element_text(color=INK/INK_SOFT) on all chrome elements
  • Output saved as plot.png/plot.html — must be plot-{THEME}.png/plot-{THEME}.html
  • Title hardcoded as "pyplots.ai" — must be "anyplot.ai"
  • Value label "$30.0M" overflows the narrow Latin America Food & Beverage segment — suppress labels for segments whose height is below 8%
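The suppression suggested in the last bullet amounts to filtering the label frame by segment height before the geom_text layer. A minimal sketch — the sample bounds and column names are hypothetical; in the real script these would be the ymin/ymax columns computed for geom_rect:

```python
import pandas as pd

# Illustrative segment bounds in percent of the bar
df = pd.DataFrame({
    "ymin":  [0.0, 60.0, 93.0],
    "ymax":  [60.0, 93.0, 100.0],
    "label": ["$120.0M", "$66.0M", "$14.0M"],
})

MIN_LABEL_HEIGHT = 8  # percent; threshold taken from the review
df_labeled = df[df["ymax"] - df["ymin"] >= MIN_LABEL_HEIGHT]
# geom_text(data=df_labeled, ...) then labels only segments tall enough
# to hold their text; the 7%-tall third segment is skipped.
```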

Issues Found

  1. VQ-07 / CQ-05 CRITICAL: Missing ANYPLOT_THEME reading and theme-adaptive chrome. Fix: Add THEME = os.getenv("ANYPLOT_THEME", "light"), PAGE_BG = "#FAF8F1" if THEME == "light" else "#1A1A17", INK = "#1A1A17" if THEME == "light" else "#F0EFE8", INK_SOFT = "#4A4A44" if THEME == "light" else "#B8B7B0". Apply via theme(plot_background=element_rect(fill=PAGE_BG, color=PAGE_BG), panel_background=element_rect(fill=PAGE_BG), plot_title=element_text(color=INK), axis_title=element_text(color=INK), axis_text=element_text(color=INK_SOFT), legend_text=element_text(color=INK_SOFT), legend_title=element_text(color=INK))
  2. VQ-07 CRITICAL: Replace colors = ["#306998", "#FFD43B", "#2E8B57", "#DC2626"] with OKABE_ITO = ['#009E73', '#D55E00', '#0072B2', '#CC79A7'] and update scale_fill_manual(values=OKABE_ITO)
  3. CQ-05: Change ggsave(plot, "plot.png", scale=3) → ggsave(plot, f"plot-{THEME}.png", scale=3) and ggsave(plot, "plot.html") → ggsave(plot, f"plot-{THEME}.html")
  4. SC-04: Change title from "marimekko-basic · letsplot · pyplots.ai" to "marimekko-basic · letsplot · anyplot.ai"
  5. VQ-02: In geom_text, add conditional to skip label when segment_height < 8 to prevent label overflow in narrow segments (e.g., df_labeled = df[df['ymax'] - df['ymin'] >= 8] and use that for geom_text)
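Issue 1's token scheme can be sketched in plain Python. The hex values are the ones quoted in the review; the theme()/ggsave() application is shown only as comments, since it requires a built lets-plot figure:

```python
import os

# Theme-adaptive chrome tokens as specified in the review feedback
THEME = os.getenv("ANYPLOT_THEME", "light")
PAGE_BG = "#FAF8F1" if THEME == "light" else "#1A1A17"
INK = "#1A1A17" if THEME == "light" else "#F0EFE8"
INK_SOFT = "#4A4A44" if THEME == "light" else "#B8B7B0"

# Themed output names instead of the hardcoded "plot.png" / "plot.html"
png_name = f"plot-{THEME}.png"
html_name = f"plot-{THEME}.html"

# Applied to the figure roughly as the review prescribes:
# theme(plot_background=element_rect(fill=PAGE_BG, color=PAGE_BG),
#       panel_background=element_rect(fill=PAGE_BG),
#       plot_title=element_text(color=INK),
#       axis_title=element_text(color=INK),
#       axis_text=element_text(color=INK_SOFT),
#       legend_text=element_text(color=INK_SOFT),
#       legend_title=element_text(color=INK))
# ggsave(plot, png_name, scale=3); ggsave(plot, html_name)
```

With this in place the light and dark renders differ only in background and text chrome, which is exactly the behavior the reviewer verified in the images.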

AI Feedback for Next Attempt

Fix all four critical violations: (1) Add full ANYPLOT_THEME reading with PAGE_BG/INK/INK_SOFT tokens applied to all chrome elements via theme(plot_background=element_rect(fill=PAGE_BG), panel_background=element_rect(fill=PAGE_BG), plot_title=element_text(color=INK), axis_title=element_text(color=INK), axis_text=element_text(color=INK_SOFT), legend_text=element_text(color=INK_SOFT), legend_title=element_text(color=INK)). (2) Replace custom palette with OKABE_ITO = ['#009E73', '#D55E00', '#0072B2', '#CC79A7']. (3) Save as f"plot-{THEME}.png" and f"plot-{THEME}.html". (4) Fix title to "anyplot.ai". Bonus: filter geom_text to segments with height ≥ 8% to prevent label overflow, and consider adding element_blank() on axis lines for a cleaner minimal look.

Verdict: REJECTED

@github-actions github-actions Bot added the `quality:73` (Quality score 73/100) and `ai-rejected` (Quality not OK, triggers update) labels on Apr 29, 2026
@github-actions github-actions Bot added the `ai-attempt-2` (Second repair attempt) label and removed the `ai-rejected` label on Apr 29, 2026
@github-actions
Contributor Author

🔧 Repair Attempt 2/4

Applied fixes based on AI review feedback.

Status: Repair completed, re-triggering review...


🤖 impl-repair

@github-actions
Contributor Author

🔧 AI Review Produced No Score — Auto-Retrying

The Claude Code Action ran but didn't write quality_score.txt. Auto-retrying review once...


🤖 impl-review

@github-actions github-actions Bot added the `ai-review-failed` (AI review action failed or timed out) label on Apr 29, 2026
@github-actions
Contributor Author

❌ AI Review Failed (auto-retry exhausted)

The AI review action completed but did not produce valid output files. Auto-retry already tried once.

What happened:

  • The Claude Code Action ran
  • No quality_score.txt file was created

Manual rerun:

gh workflow run impl-review.yml -f pr_number=5481

🤖 impl-review

MarkusNeusinger added a commit that referenced this pull request Apr 29, 2026
…ompts (#5520)

## Summary

Three workflows (`impl-review.yml`, `spec-create.yml`,
`report-validate.yml`) used shell-style `$VAR` inside `with: prompt: |`
blocks of `claude-code-action`. That block is a YAML string handed to a
Node/Bun action — **no shell ever runs**, so `$VAR` was sent to Claude
as a literal placeholder instead of the actual value. Result: Claude
couldn't reliably identify the PR / spec / library to review and
silently produced no `quality_score.txt`, which the validate step turns
into `ai-review-failed`.

## Symptoms observed today (2026-04-29)

5 stuck implementation PRs from 2026-04-27, all with `ai-review-failed`
despite the prior fixes branch (#5410) and the audit branch (#5515)
landing in between:

| PR | Branch | Pre-fix labels |
|----|--------|----------------|
| #5476 | seaborn/marimekko-basic | `ai-review-failed`, `quality:78` |
| #5480 | altair/marimekko-basic | `ai-review-failed`, `quality:82` |
| #5481 | letsplot/marimekko-basic | `ai-rejected`, `quality:76` |
| #5483 | plotnine/marimekko-basic | `ai-review-failed` |
| #5486 | plotly/line-basic | `ai-review-failed` |

Re-dispatching review on each confirmed the bug: the run log of `Run AI
Quality Review` shows the prompt being passed verbatim:

```
PROMPT: Read prompts/workflow-prompts/ai-quality-review.md and follow those instructions.

Variables for this run:
- LIBRARY: $LIBRARY    # ← literal, never expanded
- SPEC_ID: $SPEC_ID
- PR_NUMBER: $PR_NUMBER
- ATTEMPT: $ATTEMPT
```

Claude's review then either ran for ~20s and exited with no
`quality_score.txt` (4 PRs failed), or recovered by inferring values
from cwd (1 PR succeeded with `quality:82`). The intermittent pattern is
exactly what you'd expect from "the prompt is ambiguous and Claude has
to guess from context."

## Root cause

Commit `252977cf3` ("chore: fix critical audit findings", 2026-04-28
22:46) routed several `${{ github.event.* }}` and step-output values
through step-level `env:` and rewrote the in-prompt references as
`$VAR`. That is the correct mitigation for `run:` shell steps and Python
heredocs in the same workflows (and those changes stay in place). Inside
`with: prompt: |` it is the wrong tool: the value is consumed by a JS
action, not a shell, so there is no injection surface to mitigate and
`$VAR` does not interpolate.

`spec-create.yml` and `report-validate.yml` carry the identical
anti-pattern in their `prompt:` blocks. They haven't surfaced as
failures yet only because no triggering issue has come in since
2026-04-28.

## The fix

Revert **only** the descriptive header lines of each `prompt:` block
back to GitHub Actions Expression syntax (`${{ ... }}`), which the
runner substitutes into the YAML string before the action receives it.
Keep:

- All `env:` blocks (harmless; lets future prompt content reference env
vars if useful)
- All `$VAR` references inside **embedded bash code samples** in the
prompt (e.g. `gh issue edit $ISSUE_NUMBER`). Those are executed by
Claude's Bash tool which inherits the step `env:` and expands them
correctly — and rewriting them would re-enable the injection vector the
audit was right to close.

```diff
             Variables for this run:
-            - LIBRARY: $LIBRARY
-            - SPEC_ID: $SPEC_ID
-            - PR_NUMBER: $PR_NUMBER
-            - ATTEMPT: $ATTEMPT
+            - LIBRARY: ${{ steps.pr.outputs.library }}
+            - SPEC_ID: ${{ steps.pr.outputs.specification_id }}
+            - PR_NUMBER: ${{ steps.pr.outputs.pr_number }}
+            - ATTEMPT: ${{ steps.attempts.outputs.display }}
```

(analogous 8-line revert in `spec-create.yml` × 2 prompt blocks and
4-line revert in `report-validate.yml`).

Diff total: **3 files, 16 ±**.

## Test plan

- [ ] After merge, redispatch `impl-review.yml` for the 4 stuck PRs (`gh
workflow run impl-review.yml -f pr_number=<N>` for 5476, 5483, 5486;
5480 already got a 82 in the redispatch and should now stabilize)
- [ ] Verify each run's `Run AI Quality Review` step log shows real
values (e.g. `- LIBRARY: plotly`) in the PROMPT echo, not `$LIBRARY`
- [ ] Verify `quality_score.txt` is produced and `ai-review-failed`
label is removed
- [ ] On next `spec-request`-labeled issue, verify the spec-create
prompt sees the issue title/body
- [ ] On next `report-pending`-labeled issue, verify the report-validate
prompt sees the issue title/body

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@claude
Contributor

claude Bot commented Apr 29, 2026

AI Review - Attempt 3/3

Image Description

Light render (plot-light.png): The plot renders on a warm off-white background consistent with #FAF8F1. The title "marimekko-basic · letsplot · anyplot.ai" is clearly readable in dark text. Axis labels ("Market Size Distribution", "Share within Region (%)") and tick labels for all four regions are legible. The Marimekko chart shows four variable-width bars (North America, Europe, Asia Pacific, Latin America) with stacked segments for Electronics (green), Apparel (orange), Home Goods (blue), and Food & Beverage (pink/lavender) — appearing as Okabe-Ito positions 1–4. White dollar-value labels ($120.0M, $85.0M, etc.) are readable inside segments. All text is readable against the light background — no light-on-light failures.

Dark render (plot-dark.png): The plot renders on a near-black background consistent with #1A1A17. Title, axis labels, and tick labels are rendered in light/white text, clearly readable against the dark surface. Data colors are identical to the light render — green, orange, blue, and pink/lavender — only chrome has flipped. The $30.0M label in the top-right Latin America / Food & Beverage segment is slightly near the chart edge. No dark-on-dark failures observed; legend text is readable.

Critical discrepancy: The images appear to be from a previous attempt. The current code specifies custom non-Okabe-Ito colors (["#306998", "#FFD43B", "#2E8B57", "#DC2626"]), has the wrong title domain ("pyplots.ai" instead of "anyplot.ai"), and saves to "plot.png" without reading ANYPLOT_THEME — which would produce identical light/dark outputs if run fresh. The images show a better implementation than what is currently in the code file.

Score: 76/100

| Category | Score | Max |
|---|---|---|
| Visual Quality | 25 | 30 |
| Design Excellence | 10 | 20 |
| Spec Compliance | 13 | 15 |
| Data Quality | 15 | 15 |
| Code Quality | 8 | 10 |
| Library Mastery | 6 | 10 |
| **Total** | **76** | **100** |

Visual Quality (25/30)

  • VQ-01: Text Legibility (7/8) — Explicit sizes set (title 24pt, axis 20pt, ticks 16pt). Readable in both renders. Not 8 because theme-adaptive text colors are missing in the current code.
  • VQ-02: No Overlap (5/6) — Minor edge clipping: the $30.0M label in the narrow Latin America / Food & Beverage segment sits near the chart boundary.
  • VQ-03: Element Visibility (6/6) — All rectangle segments and value labels clearly visible.
  • VQ-04: Color Accessibility (2/2) — Okabe-Ito colors in images are CVD-safe. White text on colored segments has adequate contrast.
  • VQ-05: Layout & Canvas (3/4) — Good canvas utilization with legend on right. Minor margin tightening possible.
  • VQ-06: Axis Labels & Title (2/2) — Descriptive: "Market Size Distribution" and "Share within Region (%)".
  • VQ-07: Palette Compliance (0/2) — Code uses non-Okabe-Ito colors ["#306998", "#FFD43B", "#2E8B57", "#DC2626"] (Python Blue, Yellow, Sea Green, Red). No ANYPLOT_THEME env var logic. No PAGE_BG/INK/INK_SOFT tokens. Saves to "plot.png" not "plot-{THEME}.png".

Design Excellence (10/20)

  • DE-01: Aesthetic Sophistication (4/8) — Well-configured library default. Clean appearance but not publication-exceptional.
  • DE-02: Visual Refinement (4/6) — Grid lines removed (element_blank()), white segment borders add definition. Spines still present.
  • DE-03: Data Storytelling (2/6) — Data is displayed correctly but no visual hierarchy or emphasis. No focal point highlighting dominant segments (e.g., Asia Pacific's large electronics share).

Spec Compliance (13/15)

  • SC-01: Plot Type (5/5) — Correct Marimekko chart with variable-width bars and stacked proportional segments using geom_rect.
  • SC-02: Required Features (4/4) — Variable widths, stacked segments, color coding, value labels, legend all present.
  • SC-03: Data Mapping (3/3) — X-axis: regional market size (widths proportional to total). Y-axis: share within region (0–100%). Color: product lines.
  • SC-04: Title & Legend (1/3) — Code has "marimekko-basic · letsplot · pyplots.ai" — wrong domain (pyplots.ai → anyplot.ai). Images show the correct title from a previous attempt.

Data Quality (15/15)

  • DQ-01: Feature Coverage (6/6) — 4 regions × 4 products. Variable bar widths visible. Different proportions across regions demonstrate Marimekko features effectively.
  • DQ-02: Realistic Context (5/5) — Global market share scenario: North America, Europe, Asia Pacific, Latin America × Electronics, Apparel, Home Goods, Food & Beverage. Neutral and real-world.
  • DQ-03: Appropriate Scale (4/4) — Values $25M–$180M per segment, regional totals $130M–$425M. Plausible global market sizes.

Code Quality (8/10)

  • CQ-01: KISS Structure (3/3) — Clean Imports → Data → Plot → Save structure. No functions or classes.
  • CQ-02: Reproducibility (2/2) — Deterministic hardcoded data. No random seed needed.
  • CQ-03: Clean Imports (2/2) — All imported symbols are used (os, shutil for file handling, all lets-plot components).
  • CQ-04: Code Elegance (1/2) — Logic is clear. The shutil.move dance for lets-plot-images is clunky but functional. No fake UI.
  • CQ-05: Output & API (0/1) — Saves to "plot.png" / "plot.html" instead of "plot-{THEME}.png" / "plot-{THEME}.html". No ANYPLOT_THEME environment variable handling.

Library Mastery (6/10)

  • LM-01: Idiomatic Usage (4/5) — Grammar of graphics with geom_rect, scale_fill_manual, theme() is idiomatic lets-plot usage.
  • LM-02: Distinctive Features (2/5) — Composable layer system is used correctly but the same approach works identically in plotnine. No lets-plot-specific features leveraged.

Score Caps Applied

  • None — no caps triggered (DE-01=4, DE-02=4; no VQ-02/VQ-03/SC-01/DQ-02 zeroes)

Strengths

  • Correct Marimekko chart type using geom_rect with variable-width bars
  • Realistic global market share data scenario with appropriate scale
  • All required spec features present: variable widths, stacked segments, legend, value labels
  • Font sizes explicitly set at the correct scale for 4800×2700px output
  • Grid lines cleanly removed for a minimal appearance

Weaknesses

  • Critical: Code uses non-Okabe-Ito colors ["#306998", "#FFD43B", "#2E8B57", "#DC2626"] — replace with ["#009E73", "#D55E00", "#0072B2", "#CC79A7"]
  • Critical: Wrong title domain in code — "pyplots.ai" must be "anyplot.ai"
  • Critical: No ANYPLOT_THEME environment variable logic — code must read os.getenv("ANYPLOT_THEME", "light") and set PAGE_BG/INK/INK_SOFT tokens for theme-adaptive chrome
  • Critical: Output files must be "plot-{THEME}.png" and "plot-{THEME}.html", not "plot.png" / "plot.html"
  • Design storytelling absent — no visual emphasis on dominant segments or regions

Issues Found

  1. VQ-07 / CQ-05 CRITICAL: No theme adaptation — read ANYPLOT_THEME, set PAGE_BG="#FAF8F1"/"#1A1A17", INK, INK_SOFT tokens, save as plot-{THEME}.png
  2. VQ-07 CRITICAL: Wrong color palette — use OKABE_ITO = ["#009E73", "#D55E00", "#0072B2", "#CC79A7"] and scale_fill_manual(values=OKABE_ITO)
  3. SC-04 CRITICAL: Fix title to "marimekko-basic · letsplot · anyplot.ai"
  4. DE-03 LOW: Add visual emphasis — e.g., use slightly different opacity or outline to highlight the largest segment (Asia Pacific Electronics $180M)

AI Feedback for Next Attempt

Three critical fixes required: (1) Add THEME = os.getenv("ANYPLOT_THEME", "light") logic with PAGE_BG, INK, INK_SOFT tokens applied via theme(plot_background=element_rect(fill=PAGE_BG), axis_text=element_text(color=INK_SOFT), ...) — save as ggsave(plot, f"plot-{THEME}.png", scale=3) and ggsave(plot, f"plot-{THEME}.html"); (2) Replace custom colors with Okabe-Ito: OKABE_ITO = ["#009E73", "#D55E00", "#0072B2", "#CC79A7"]; (3) Fix title to "marimekko-basic · letsplot · anyplot.ai". Once these are fixed, consider adding subtle emphasis on the dominant Asia Pacific bar to improve storytelling.

Verdict: REJECTED

@github-actions github-actions Bot added the `ai-approved` (Quality OK, ready for merge) label on Apr 29, 2026
MarkusNeusinger added a commit that referenced this pull request Apr 29, 2026
)

## Summary

The 3 AI-approved implementation PRs from today (#5476, #5480, #5481)
all hit `gh pr merge` failures with `the base branch policy prohibits
the merge`. Root cause: the branch ruleset on `main` requires three
status checks (`Run Linting`, `Run Tests`, `Run Frontend Tests`) — and
impl-PRs created by `impl-generate.yml` never get those checks.

## Why CI doesn't run on impl-PRs

`impl-generate.yml` (and `impl-repair.yml`, `impl-review.yml`) push
commits to PR branches using `GITHUB_TOKEN`. By GitHub's anti-recursion
design, pushes / PRs created with `GITHUB_TOKEN` do **not** trigger
downstream `pull_request` or `workflow_run` events. Verified across all
5 stuck PRs:

| PR | Branch | `Run Linting` ever ran? |
|----|--------|--------------------------|
| #5476 | seaborn/marimekko-basic | yes (once, on a 04-27 impl-repair commit; newer score commits invalidated it) |
| #5480 | altair/marimekko-basic | no |
| #5481 | letsplot/marimekko-basic | no |
| #5483 | plotnine/marimekko-basic | no |
| #5486 | plotly/line-basic | no |

So the merge is gated on a check that structurally cannot complete.

## The fix

Add `--admin` to the `gh pr merge` call inside `impl-merge.yml`. This
lets the pipeline complete autonomously without weakening main's
protection for human PRs.

```diff
+            # --admin bypasses the branch ruleset's required-status-check
+            # gate. Required because impl-generate.yml pushes via GITHUB_TOKEN,
+            # which by GitHub's anti-recursion design does not trigger
+            # downstream CI workflows (Run Linting / Run Tests / Run Frontend
+            # Tests), so impl PRs never get those checks. The pipeline already
+            # gates merge behind the AI quality review threshold.
             if gh pr merge "$PR_NUM" \
               --repo "$REPOSITORY" \
               --squash \
+              --admin \
               --delete-branch; then
```

The merge is still gated by:
- AI quality threshold (cascading 90 / 80 / 70 / 60 / 50 across initial
review + 4 repair attempts)
- `impl-merge.yml`'s own pre-merge "Validate PR completeness" step
- The label-based trigger requiring `ai-approved`

So `--admin` only bypasses the structurally-missing CI artifact, not the
substantive review gates.

## Considered alternative

Push from `impl-generate` / `impl-repair` / `impl-review` via a PAT
instead of `GITHUB_TOKEN` so CI triggers naturally. Cleaner long-term
but needs a maintained secret and a broader review of which workflows
touch which branches; deferred.

## Test plan

- [ ] After merge, dispatch `impl-merge.yml` (or trust the `ai-approved`
label trigger) for the 3 stuck approved PRs (#5476, #5480, #5481)
- [ ] Verify merge succeeds without retries on attempt 1
- [ ] Verify post-merge: metadata file created, GCS staging→production
promotion done, `impl:{library}:done` label on parent issue

🤖 Generated with [Claude Code](https://claude.com/claude-code)
MarkusNeusinger added a commit that referenced this pull request Apr 29, 2026
…et (#5523)

## Summary

Follow-up to #5521 (which added `--admin` to `gh pr merge`). That change
alone wasn't enough — verified just now: 3 dispatched merges (#5476,
#5480, #5481) all failed identically with:

```
GraphQL: Repository rule violations found
3 of 3 required status checks are expected.
(mergePullRequest)
```

## Why --admin alone didn't work

The `main` ruleset's bypass list contains only `RepositoryRole admin`
(mode: `pull_request`). Default `GITHUB_TOKEN` runs as
`github-actions[bot]` with `write` role — not admin — so the API rejects
the bypass.

```bash
gh api repos/MarkusNeusinger/anyplot/rulesets/10578859 --jq '.bypass_actors'
# [{"actor_id":5,"actor_type":"RepositoryRole","bypass_mode":"pull_request"}]
```

## The fix

Route **only the merge step** through a repo-admin PAT (`ADMIN_TOKEN`).
All other steps in `impl-merge.yml` and the rest of the impl-* workflows
keep using `GITHUB_TOKEN`. Bypass scope is therefore exactly one step,
not the whole pipeline.

```diff
       - name: Merge PR to main (with retry)
         if: steps.check.outputs.should_run == 'true'
         env:
-          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          GH_TOKEN: ${{ secrets.ADMIN_TOKEN || secrets.GITHUB_TOKEN }}
           PR_NUM: ${{ steps.check.outputs.pr_number }}
           REPOSITORY: ${{ github.repository }}
+          HAS_ADMIN_TOKEN: ${{ secrets.ADMIN_TOKEN != '' }}
         run: |
+          if [ "$HAS_ADMIN_TOKEN" != "true" ]; then
+            echo "::warning::ADMIN_TOKEN secret is not set..."
+          fi
```

The fallback `secrets.ADMIN_TOKEN || secrets.GITHUB_TOKEN` and the
warning preserve the previous behavior if `ADMIN_TOKEN` isn't set yet —
workflow still runs, fails with the same ruleset error as before, but
the log says clearly what's missing instead of an opaque auth error.

## Required after merge

1. **Create PAT**: Settings → Developer settings → Personal access
tokens → Fine-grained
   - Repository: `anyplot`
   - Permissions:
     - Contents: Read+Write
     - Pull requests: Read+Write
     - Administration: Read+Write
     - Metadata: Read
2. **Set secret**: Settings → Secrets and variables → Actions → New
repository secret
   - Name: `ADMIN_TOKEN`
   - Value: the PAT

## Considered alternatives

| Option | Verdict |
|--------|---------|
| Add `github-actions[bot]` as bypass actor on ruleset | broader blast radius — *every* workflow run could bypass main |
| Remove the 3 required checks from ruleset | weakens protection for human PRs too |
| Push from impl-generate via PAT so CI triggers naturally | cleanest semantically but needs PAT in 3 workflows + same maintenance overhead |
| **Scope PAT to merge step only (this PR)** | smallest blast radius, matches the actual permission gap |

## Test plan

- [ ] Merge this PR
- [ ] Create the fine-grained PAT and add as `ADMIN_TOKEN` repo secret
- [ ] Re-dispatch `impl-merge.yml` for the 3 stuck approved PRs (#5476
seaborn, #5480 altair, #5481 letsplot)
- [ ] Verify each merges successfully on attempt 1 (no ruleset error in
run log)
- [ ] Verify metadata file created, GCS staging→production promotion
done, parent issue gets `impl:{library}:done` label

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@MarkusNeusinger MarkusNeusinger merged commit 10d2f0f into main Apr 29, 2026
3 checks passed
@MarkusNeusinger MarkusNeusinger deleted the implementation/marimekko-basic/letsplot branch April 29, 2026 11:20
