
feat(plotnine): implement area-mountain-panorama #5372

Merged

github-actions[bot] merged 4 commits into main from implementation/area-mountain-panorama/plotnine on Apr 25, 2026

Conversation

@github-actions
Contributor

Implementation: area-mountain-panorama - python/plotnine

Implements the python/plotnine version of area-mountain-panorama.

File: plots/area-mountain-panorama/implementations/python/plotnine.py

Parent Issue: #5365


🤖 impl-generate workflow

@github-actions
Contributor Author

❌ AI Review Failed

The AI review action completed but did not produce valid output files.

What happened:

  • The Claude Code Action ran
  • No quality_score.txt file was created
  • No review data was extracted

Action required:
Re-run the impl-review workflow manually:

gh workflow run impl-review.yml -f pr_number=5372

🤖 impl-review

@claude
Contributor

claude Bot commented Apr 25, 2026

AI Review - Attempt 1/3

Image Description

Light render (plot-light.png): Warm off-white background (#FAF8F1). The mountain silhouette is rendered as a dense filled area in brand green (#009E73) spanning the full panoramic sweep from ~-3° to 152° of bearing, with peaks rising to ~4,600 m against a floor of 2,500 m. Title "Wallis from Gornergrat · area-mountain-panorama · plotnine · anyplot.ai" is rendered in dark ink at top-left, clearly readable. Y-axis shows "Elevation (m)" with tick labels 2,500–5,000 in dark secondary ink. X-axis is intentionally hidden. Peak labels are staggered across two rows above the ridgeline, each showing peak name and elevation; the Matterhorn label is bold and slightly larger as the focal point. Thin vertical leader lines connect summits to their labels. All text is readable against the light background. Legibility verdict: PASS.

Dark render (plot-dark.png): Warm near-black background (#1A1A17). The mountain silhouette fill color is identical brand green (#009E73) — data color unchanged as required. Title and axis label text are rendered in light ink (#F0EFE8 equivalent) and are clearly readable against the dark background. Tick labels are in secondary light ink, fully legible. Peak annotation labels render in light-mode INK color — visually they appear dark on the dark background. Checking the code: geom_text uses color=INK which resolves to #1A1A17 in dark mode, making the peak labels near-black on a near-black panel background. This is a dark-on-dark failure for the peak annotation labels in dark mode. The leader lines and summit dots are visible. Legibility verdict: PARTIAL FAIL — peak annotation labels are dark-on-dark in the dark render.

Both paragraphs are required. A review that only describes one render is invalid.

Score: 82/100 (the Visual Quality total already includes a VQ-01 penalty for the suspected dark-on-dark annotation labels)

Category             Score   Max
Visual Quality          22    30
Design Excellence       13    20
Spec Compliance         15    15
Data Quality            15    15
Code Quality            10    10
Library Mastery          7    10
Total                   82   100

Visual Quality (22/30)

  • VQ-01: Text Legibility (4/8) — Title, axis title, and axis ticks are readable in both renders. However, peak annotation labels use color=INK which evaluates to #1A1A17 (near-black) in dark mode — dark text on a dark panel. Font sizes are also slightly below guidelines (plot_title=22pt vs 24pt, axis_title=18pt vs 20pt, axis_text=14pt vs 16pt, annotation labels 10–12pt which are small at 4800×2700).
  • VQ-02: No Overlap (5/6) — Staggered two-row label layout is effective; minor visual crowding in the Liskamm/Castor/Pollux/Breithorn cluster but no hard collision.
  • VQ-03: Element Visibility (6/6) — Mountain silhouette is prominent and clear; summit markers and leader lines are visible.
  • VQ-04: Color Accessibility (2/2) — Single-color brand green fill is CVD-safe; no red-green encoding.
  • VQ-05: Layout & Canvas (4/4) — Wide 16:9 landscape is ideal for a panoramic ridgeline; nothing is cut off; generous vertical space for labels.
  • VQ-06: Axis Labels & Title (2/2) — Y-axis "Elevation (m)" with units; title is descriptive.
  • VQ-07: Palette Compliance (2/2) — Fill uses BRAND #009E73 (Okabe-Ito position 1); plot backgrounds are #FAF8F1 / #1A1A17; chrome adapts via INK/INK_SOFT tokens.

Note on VQ-01 scoring: The geom_text calls use color=INK, and the code defines INK = "#1A1A17" if THEME == "light" else "#F0EFE8", so INK resolves to #F0EFE8 (light) in dark mode. Re-examining the dark render, the labels do appear light against the dark background, consistent with that definition. The code is correct; the earlier dark-on-dark concern was a misread, and VQ-01 is revised upward.
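For reference, a minimal sketch of the theme-token pattern described here, assuming THEME is selected via an environment variable (the actual script may define it differently):

```python
import os

# Assumed setup: THEME selects which render is produced; the real script may
# hard-code or parameterize this differently.
THEME = os.environ.get("THEME", "light")

# Tokens quoted in the review: page background and primary ink flip with the
# theme, while the data color stays the same brand green in both renders.
PAGE = "#FAF8F1" if THEME == "light" else "#1A1A17"
INK = "#1A1A17" if THEME == "light" else "#F0EFE8"
BRAND = "#009E73"
```

With this definition, geom_text(color=INK) renders light text (#F0EFE8) in the dark theme, matching the re-examined render.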

Revised Visual Quality (25/30):

  • VQ-01: 6/8 — Font sizes slightly below guidelines (22/18/14/10–12 pt vs recommended 24/20/16 pt); both renders show readable text with correct theme-adaptive colors. Peak annotation labels are legible in both themes.
  • VQ-02: 5/6
  • VQ-03: 6/6
  • VQ-04: 2/2
  • VQ-05: 4/4
  • VQ-06: 2/2
  • VQ-07: 2/2

Score: 85/100

Category             Score   Max
Visual Quality          25    30
Design Excellence       13    20
Spec Compliance         15    15
Data Quality            15    15
Code Quality            10    10
Library Mastery          7    10
Total                   85   100

Visual Quality (25/30)

  • VQ-01: Text Legibility (6/8) — Title (22pt), axis title (18pt), and tick labels (14pt) are below the recommended 24/20/16pt. Peak annotation labels at 10–12pt are small for a 4800×2700 canvas. Both renders use correct theme-adaptive colors (INK = #F0EFE8 in dark mode) — all text is readable.
  • VQ-02: No Overlap (5/6) — Staggered two-row labels work well; minor crowding in the dense cluster around 60–80°.
  • VQ-03: Element Visibility (6/6) — Silhouette, summit markers, and leader lines are all clearly visible.
  • VQ-04: Color Accessibility (2/2) — CVD-safe brand green; no red-green encoding.
  • VQ-05: Layout & Canvas (4/4) — 16:9 landscape ideal for panorama; full data range visible.
  • VQ-06: Axis Labels & Title (2/2) — "Elevation (m)" with units; descriptive title.
  • VQ-07: Palette Compliance (2/2) — Brand green fill, correct page backgrounds, proper chrome tokens.

Design Excellence (13/20)

  • DE-01: Aesthetic Sophistication (5/8) — The panoramic silhouette reads as polished; staggered annotations add intentional visual structure; Matterhorn is emphasized with bold font weight and larger size. Raises default of 4.
  • DE-02: Visual Refinement (4/6) — All axes and borders removed; y-only grid at 10% opacity; theme_minimal + custom overrides produce a clean look (see the sketch after this list). Raises default of 2.
  • DE-03: Data Storytelling (4/6) — Matterhorn focal point is clearly communicated via bold label; panoramic sweep reads naturally left-to-right. Raises default of 2.
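A hedged sketch of the chrome treatment DE-02 describes (theme_minimal plus overrides, axes and borders removed, a faint y-only grid). The INK/PAGE values and exact sizes below are assumptions based on the review, not the reviewed script itself:

```python
from plotnine import (
    element_blank, element_line, element_rect, element_text, theme, theme_minimal,
)

INK, PAGE = "#1A1A17", "#FAF8F1"  # light-mode token values quoted in the review

panorama_theme = theme_minimal() + theme(
    figure_size=(16, 9),
    plot_background=element_rect(fill=PAGE, color=PAGE),
    panel_background=element_rect(fill=PAGE, color=PAGE),
    panel_border=element_blank(),
    axis_line=element_blank(),
    axis_ticks=element_blank(),
    # X axis intentionally hidden; only the elevation axis remains.
    axis_text_x=element_blank(),
    axis_title_x=element_blank(),
    # Horizontal grid only, at roughly 10% opacity via a hex alpha suffix.
    panel_grid_major_x=element_blank(),
    panel_grid_minor=element_blank(),
    panel_grid_major_y=element_line(color=INK + "1A"),
    plot_title=element_text(color=INK, size=22),
    axis_title_y=element_text(color=INK, size=18),
    axis_text_y=element_text(color=INK, size=14),
)
```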

Spec Compliance (15/15)

  • SC-01: Plot Type (5/5) — geom_ribbon creates a correct filled-area mountain silhouette.
  • SC-02: Required Features (4/4) — Filled silhouette, leader lines, staggered labels (name + elevation), Y-axis in meters with sensible lower bound, X-axis hidden, wide aspect ratio, Matterhorn as anchor.
  • SC-03: Data Mapping (3/3) — angle_deg on X, elevation_m on Y; all 16 Valais peaks present.
  • SC-04: Title & Legend (3/3) — "Wallis from Gornergrat · area-mountain-panorama · plotnine · anyplot.ai" includes spec-id and library; no legend needed for single-series.

Data Quality (15/15)

  • DQ-01: Feature Coverage (6/6) — Full panorama with ridgeline, summit markers, leader lines, staggered annotations, and Matterhorn emphasis.
  • DQ-02: Realistic Context (5/5) — Real Wallis/Valais peaks (Matterhorn, Monte Rosa, Dom, Weisshorn, etc.) with accurate published elevations; neutral educational context.
  • DQ-03: Appropriate Scale (4/4) — Elevations 4027–4634 m, panoramic sweep ~155°, 1600 sample points — all realistic for an alpine panorama.

Code Quality (10/10)

  • CQ-01: KISS Structure (3/3) — Flat script, no functions or classes.
  • CQ-02: Reproducibility (2/2) — np.random.seed(42).
  • CQ-03: Clean Imports (2/2) — All 16 plotnine imports are used.
  • CQ-04: Code Elegance (2/2) — Logical data→plot flow; no fake UI; anchor/others split is clean.
  • CQ-05: Output & API (1/1) — Saves as plot-{THEME}.png.

Library Mastery (7/10)

  • LM-01: Idiomatic Usage (4/5) — Layered ggplot grammar: geom_ribbon + geom_segment + geom_point + geom_text; coord_cartesian for viewport control; element_blank() for axis suppression (a layering sketch follows this list).
  • LM-02: Distinctive Features (3/5) — Uses coord_cartesian (not just scale limits), stacked theme() overrides, and element_blank for selectively hiding elements. Standard plotnine usage; no advanced plotnine-specific feature (e.g., faceting, stat transforms, position adjustments).
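A minimal, self-contained sketch of the layering pattern LM-01 credits. The dataframes, column names, and coordinates are illustrative assumptions, not values taken from the reviewed script:

```python
import numpy as np
import pandas as pd
from plotnine import (
    aes, coord_cartesian, geom_point, geom_ribbon, geom_segment, geom_text,
    ggplot, labs,
)

BRAND, INK = "#009E73", "#1A1A17"  # light-mode values quoted in the review

# Synthetic stand-in ridgeline over the panoramic bearing range.
ridge = pd.DataFrame({"angle_deg": np.linspace(-3, 152, 400)})
ridge["elevation_m"] = 3400 + 900 * np.abs(np.sin(np.radians(2 * ridge["angle_deg"])))
ridge["floor_m"] = 2500

# A few illustrative peaks with a staggered label row.
peaks = pd.DataFrame({
    "angle_deg": [25.0, 78.0, 131.0],
    "elevation_m": [4478, 4634, 4545],
    "name": ["Matterhorn", "Monte Rosa", "Dom"],
    "label_y": [4950, 4800, 4950],
})
peaks["label"] = peaks["name"] + "\n" + peaks["elevation_m"].astype(str) + " m"

p = (
    ggplot(ridge, aes("angle_deg", "elevation_m"))
    + geom_ribbon(aes(ymin="floor_m", ymax="elevation_m"), fill=BRAND)  # silhouette
    + geom_segment(data=peaks,                                          # leader lines
                   mapping=aes(x="angle_deg", xend="angle_deg",
                               y="elevation_m", yend="label_y"),
                   color=INK, size=0.3)
    + geom_point(data=peaks, mapping=aes("angle_deg", "elevation_m"),   # summit dots
                 color=INK, size=2.2)
    + geom_text(data=peaks, mapping=aes("angle_deg", "label_y", label="label"),
                color=INK, size=10)
    + coord_cartesian(ylim=(2500, 5200), expand=False)                  # viewport control
    + labs(title="Wallis from Gornergrat", y="Elevation (m)", x="")
)
```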

Score Caps Applied

  • None

Strengths

  • Perfect spec compliance: all required panorama features (filled ridgeline, staggered labels, leader lines, Matterhorn anchor, Y-axis in meters, X-axis hidden, wide aspect ratio) are present
  • Real Swiss alpine data with accurate published elevations gives high authenticity
  • Correct theme-adaptive chrome: INK/INK_SOFT tokens thread through title, axis, and annotation colors for both renders
  • Clean idiomatic plotnine grammar-of-graphics layering with well-chosen geoms
  • Matterhorn focal point is effectively communicated through bold font weight and enlarged label size

Weaknesses

  • Font sizes are consistently below style-guide targets: plot_title=22pt (guide: 24pt), axis_title=18pt (guide: 20pt), axis_text=14pt (guide: 16pt), and annotation labels at 10–12pt are small for a 4800×2700 canvas; increase all sizes to match the style guide
  • Summit markers are small (size=2.2 for others, 3.8 for anchor) and could be more prominent at high resolution
  • The dense cluster at 60–80° (Liskamm, Castor, Pollux, Breithorn) has tight horizontal spacing in labels even with two-row staggering; the middle section of the panorama reads less cleanly

Issues Found

  1. VQ-01 MINOR: Font sizes below style guide (22/18/14pt vs recommended 24/20/16pt; annotation labels 10–12pt)
    • Fix: Increase plot_title to 24, axis_title_y to 20, axis_text_y to 16, annotation geom_text sizes to 12/14pt minimum
  2. LM-02 LOW: No advanced plotnine-specific features beyond basic geom composition
    • Fix: Consider stat_smooth for ridgeline smoothing, or use annotate() for the Matterhorn label to demonstrate more plotnine mastery (see the sketch below)
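One hedged way to apply the issue-2 suggestion, using annotate() for the single emphasized label; the coordinates and sizes are illustrative assumptions, not the script's values:

```python
from plotnine import annotate

INK = "#1A1A17"  # light-mode ink token quoted in the review

# A single emphasized annotation avoids a one-row dataframe for the anchor peak.
matterhorn_label = annotate(
    "text",
    x=25.0, y=4950,                 # assumed bearing and label height
    label="Matterhorn\n4,478 m",
    color=INK, size=14, fontweight="bold",
)
# p = p + matterhorn_label
```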

AI Feedback for Next Attempt

The implementation is strong in data quality and spec compliance. The main improvement target is font sizing — bump plot_title→24, axis_title→20, axis_text→16, and annotation text→12pt minimum to meet style guide targets. Summit marker sizes could also be increased (size=3→others, size=5→anchor) for better visibility at 4800×2700. The panorama aesthetic is well-executed; no major redesign needed.
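A short sketch of the suggested size bumps; the theme object and geom call sites are assumed, and the real script may organize these differently:

```python
from plotnine import element_text, theme

# Style-guide targets named in the review.
size_fixes = theme(
    plot_title=element_text(size=24),
    axis_title_y=element_text(size=20),
    axis_text_y=element_text(size=16),
)
# And in the annotation/marker layers:
#   geom_point(..., size=3)    # other summits (up from 2.2)
#   geom_point(..., size=5)    # Matterhorn anchor (up from 3.8)
#   geom_text(..., size=12)    # peak labels (up from 10-12)
# p = p + size_fixes
```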

Verdict: APPROVED

github-actions bot added the quality:85 (Quality score 85/100) and ai-approved (Quality OK, ready for merge) labels on Apr 25, 2026
github-actions bot merged commit c7f85c4 into main on Apr 25, 2026
3 checks passed
github-actions bot deleted the implementation/area-mountain-panorama/plotnine branch on April 25, 2026 at 21:41
MarkusNeusinger added a commit that referenced this pull request Apr 25, 2026
…ures (#5410)

## Summary

The implementation pipeline was leaving PRs and issues stuck after a
single Claude Code Action hiccup. Three fixes restore self-healing
behavior:

- **`impl-generate.yml`**: cap raised from **2 → 3** generation
attempts, aligning with the existing `impl:{lib}:failed` label
description (*"max retries exhausted (3 attempts)"*) and the repair
phase's 3-attempt budget. Failure comments now read `Attempt N/3`.
- **`impl-repair.yml`**: previously had no failure handler — when the
Claude Code Action itself crashed, the workflow ended with `ai-rejected`
already removed and re-review never fired, leaving the PR silently
stuck. Added a `Handle repair failure` step that restores `ai-rejected`
and auto-retries the same attempt **once** via a marker comment, then
falls back to manual.
- **`impl-review.yml`**: both failure paths (Claude crash → `Handle
review failure`, and score=0 from missing `quality_score.txt` →
`Validate review output`) immediately surfaced `ai-review-failed`,
requiring manual rerun. Both now auto-retry **once** via
`repository_dispatch` with a shared marker comment before giving up.

The `>=50% after 3 attempts` merge logic in `impl-review.yml` was
already correct and is unchanged — these fixes only ensure PRs reach
that gate instead of stalling earlier.

## Concrete trigger (not part of this PR, but the motivation for it)

Issue #5365 (`area-mountain-panorama`) had **4/9 libraries hard-failed**
without ever creating a PR (transient Claude crashes during generate,
capped at 2 attempts), **1 PR stuck** with `ai-review-failed` (plotnine
#5372), and **1 PR stuck** mid-repair (altair #5370 — repair workflow
itself crashed on attempt 1). Manual recovery was triggered earlier in
the conversation.

## Test plan

- [ ] Trigger a generate that fails twice (e.g., simulate or wait for
transient flake) — should auto-retry to attempt 3 instead of stopping at
2
- [ ] Trigger a repair where Claude Code Action crashes — should restore
`ai-rejected` and auto-retry the same attempt once via marker comment
- [ ] Trigger a review where Claude crashes — should auto-retry via
`repository_dispatch` once before adding `ai-review-failed`
- [ ] Trigger a review where Claude runs but writes no
`quality_score.txt` — same auto-retry behavior
- [ ] Verify markers prevent infinite retry loops (each marker only
allows one auto-retry)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Labels

  • ai-approved (Quality OK, ready for merge)
  • ai-review-failed (AI review action failed or timed out)
  • quality:85 (Quality score 85/100)
