feat(matplotlib): implement errorbar-basic #5382

Merged
github-actions[bot] merged 2 commits into main from implementation/errorbar-basic/matplotlib
Apr 25, 2026
Conversation

@github-actions
Contributor

Implementation: errorbar-basic - python/matplotlib

Implements the python/matplotlib version of errorbar-basic.

File: plots/errorbar-basic/implementations/python/matplotlib.py

Parent Issue: #973


🤖 impl-generate workflow

@claude
Contributor

claude Bot commented Apr 25, 2026

AI Review - Attempt 1/3

Image Description

Light render (plot-light.png): Warm off-white background (#FAF8F1) — correct. Title "errorbar-basic · matplotlib · anyplot.ai" in dark near-black ink, clearly readable at 24pt. Axis labels "Experimental Group" (x) and "Response Value (units)" (y) in dark ink at 20pt. Six x-category tick labels and numeric y-ticks in medium-dark gray (INK_SOFT) at 16pt — all readable. Subtle horizontal grid lines at ~10% opacity. Left and bottom spines in medium gray (L-frame); top and right removed. Six data points as large solid circles in brand green #009E73 with white/off-white marker edges; error bars with visible caps (capsize=10, capthick=3). Treatment C shows a notably larger lower error (asymmetric). All text is readable against the light background.

Dark render (plot-dark.png): Warm near-black background (#1A1A17) — correct, not pure black. Title in light cream (#F0EFE8), clearly visible. Axis labels in light cream. Tick labels in medium light gray (#B8B7B0) — visible against the dark surface with no dark-on-dark failures. Grid lines remain subtle at ~10% opacity. L-frame spines in medium gray. All six data points and error bars remain in identical brand green #009E73 — data colors unchanged from the light render; only chrome flips. All text is readable against the dark background; no dark-on-dark text detected.

Both paragraphs are required. A review that only describes one render is invalid.

Score: 86/100

| Category | Score | Max |
| --- | --- | --- |
| Visual Quality | 30 | 30 |
| Design Excellence | 10 | 20 |
| Spec Compliance | 15 | 15 |
| Data Quality | 14 | 15 |
| Code Quality | 10 | 10 |
| Library Mastery | 7 | 10 |
| Total | 86 | 100 |

Visual Quality (30/30)

  • VQ-01: Text Legibility (8/8) — All sizes explicitly set: 24pt title, 20pt axis labels, 16pt ticks; readable in both themes
  • VQ-02: No Overlap (6/6) — No overlapping text or data elements; six well-spaced categories
  • VQ-03: Element Visibility (6/6) — Large markers (markersize=15), thick error bars (elinewidth=3), clear caps (capsize=10, capthick=3)
  • VQ-04: Color Accessibility (2/2) — Single CVD-safe brand green #009E73; good contrast on both surfaces
  • VQ-05: Layout & Canvas (4/4) — Plot fills canvas well with balanced margins; tight_layout applied
  • VQ-06: Axis Labels & Title (2/2) — "Response Value (units)" and "Experimental Group" are descriptive with units
  • VQ-07: Palette Compliance (2/2) — First (only) series is #009E73; backgrounds are #FAF8F1 (light) and #1A1A17 (dark); chrome fully theme-adaptive
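
The theme behavior scored above (chrome flips, data color stays fixed) could be organized roughly as follows. This is a minimal sketch, not the reviewed file: `PAGE_BG` and `INK_SOFT` are names taken from this review, while `INK`, `BRAND`, `THEMES`, `theme_tokens`, and both light-theme ink values are assumptions made for illustration.

```python
# Hex values for the backgrounds, dark-theme text, and brand green come from
# the review text. The light-theme INK and INK_SOFT values are placeholders
# (assumptions), since the review only describes them in words.
BRAND = "#009E73"  # data color, identical in both themes

THEMES = {
    "light": {"PAGE_BG": "#FAF8F1", "INK": "#1A1A17", "INK_SOFT": "#4A4A44"},
    "dark": {"PAGE_BG": "#1A1A17", "INK": "#F0EFE8", "INK_SOFT": "#B8B7B0"},
}


def theme_tokens(theme):
    """Return the chrome colors for a theme plus the fixed data color."""
    tokens = dict(THEMES[theme])  # copy so callers cannot mutate THEMES
    tokens["BRAND"] = BRAND
    return tokens


light = theme_tokens("light")
dark = theme_tokens("dark")
```

Keeping the data color outside the per-theme dictionaries is one way to guarantee that only chrome changes between renders.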

Design Excellence (10/20)

  • DE-01: Aesthetic Sophistication (4/8) — Clean and professional but looks like a well-configured library default; single-color, no emphasis techniques or design hierarchy
  • DE-02: Visual Refinement (4/6) — Spines removed (L-frame), subtle grid (alpha=0.10), set_axisbelow, white marker edges — good refinement but not fully polished
  • DE-03: Data Storytelling (2/6) — Data displayed but not interpreted; all groups rendered identically with no focal point or visual emphasis to guide the viewer

Spec Compliance (15/15)

  • SC-01: Plot Type (5/5) — Correct errorbar plot using ax.errorbar()
  • SC-02: Required Features (4/4) — Error bars with visible caps, asymmetric errors, consistent widths across all points
  • SC-03: Data Mapping (3/3) — Categorical x-axis, numeric y-axis, all data visible
  • SC-04: Title & Legend (3/3) — Title "errorbar-basic · matplotlib · anyplot.ai"; no legend (single series, correct omission)

Data Quality (14/15)

  • DQ-01: Feature Coverage (5/6) — Shows asymmetric errors with varying magnitudes across groups; all groups use asymmetric type — mixing in at least one symmetric pair would give fuller coverage
  • DQ-02: Realistic Context (5/5) — Clinical trial context (Control vs. Treatment groups) is real-world plausible and neutral
  • DQ-03: Appropriate Scale (4/4) — Values 25–48 units with error margins 2–6.5 are realistic for experimental measurements
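
One way to address the DQ-01 note is to give some groups equal lower and upper errors while keeping others asymmetric. The sketch below uses made-up numbers chosen only to fall in the ranges described above (means 25–48, errors 2–6.5); the group names beyond Control, Treatment C, and Treatment D, and every value, are assumptions rather than the data in the reviewed file.

```python
import numpy as np

# Hypothetical data, not the values from the reviewed script.
groups = ["Control", "Treatment A", "Treatment B",
          "Treatment C", "Treatment D", "Treatment E"]
means = np.array([25.0, 31.5, 36.0, 40.5, 48.0, 43.0])

# Lower and upper error magnitudes per group. Control, Treatment B, and
# Treatment E are symmetric; the other three are asymmetric.
lower = np.array([2.0, 2.5, 3.0, 6.5, 4.0, 3.5])
upper = np.array([2.0, 3.5, 3.0, 3.0, 5.0, 3.5])

# matplotlib expects asymmetric errors as a 2xN array: row 0 = lower, row 1 = upper.
yerr = np.vstack([lower, upper])

# Boolean mask of the groups whose error bar is symmetric.
symmetric = np.isclose(lower, upper)
```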

Code Quality (10/10)

  • CQ-01: KISS Structure (3/3) — Flat script: imports → theme tokens → data → plot → style → save
  • CQ-02: Reproducibility (2/2) — np.random.seed(42) set
  • CQ-03: Clean Imports (2/2) — Only os, matplotlib.pyplot, numpy — all used
  • CQ-04: Code Elegance (2/2) — Clean Pythonic code, no over-engineering, no fake UI
  • CQ-05: Output & API (1/1) — Saves as plot-{THEME}.png with dpi=300 and facecolor=PAGE_BG
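
The output convention in CQ-05 can be sketched as below. This is an illustration only: the `THEME` value, the figure size, the placeholder line plot, and the choice to write into the system temp directory are all assumptions; only the `plot-{THEME}.png` name, `dpi=300`, and `facecolor=PAGE_BG` come from the review.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

THEME = "light"  # assumed to be chosen by the caller; "dark" is the other option
PAGE_BG = "#FAF8F1" if THEME == "light" else "#1A1A17"

fig, ax = plt.subplots(figsize=(8, 4.5))  # figure size is an assumption
ax.set_facecolor(PAGE_BG)
ax.plot([0, 1], [0, 1], color="#009E73")  # placeholder content
fig.tight_layout()

# Passing facecolor to savefig makes the saved page background match the theme.
out_path = os.path.join(tempfile.gettempdir(), f"plot-{THEME}.png")
fig.savefig(out_path, dpi=300, facecolor=PAGE_BG)
plt.close(fig)
```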

Library Mastery (7/10)

  • LM-01: Idiomatic Usage (4/5) — Correct use of ax.errorbar() with named parameters, set_axisbelow(True), axes-level methods throughout
  • LM-02: Distinctive Features (3/5) — Uses matplotlib-specific asymmetric error format (2×N yerr array), markeredgecolor on errorbar markers, capthick/elinewidth controls, set_axisbelow
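
For reference, the matplotlib features named above combine roughly like this. It is a sketch under stated assumptions, not the reviewed implementation: the data values and group names are invented, and the numeric x positions with explicit tick labels are a choice made here for simplicity. The parameter values (`markersize=15`, `elinewidth=3`, `capsize=10`, `capthick=3`, `alpha=0.10`) are the ones quoted in the review.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data for six groups.
groups = ["Control", "Treatment A", "Treatment B",
          "Treatment C", "Treatment D", "Treatment E"]
means = np.array([25.0, 31.5, 36.0, 40.5, 48.0, 43.0])
yerr = np.array([[2.0, 2.5, 3.0, 6.5, 4.0, 3.5],   # lower errors
                 [2.5, 3.5, 4.0, 3.0, 5.0, 2.0]])  # upper errors
x = np.arange(len(groups))

fig, ax = plt.subplots(figsize=(8, 4.5))
container = ax.errorbar(
    x, means, yerr=yerr,
    fmt="o",                     # markers only, no connecting line
    color="#009E73",
    markersize=15,
    markeredgecolor="#FAF8F1",   # off-white edge for definition
    elinewidth=3,
    capsize=10,
    capthick=3,
)
data_line, caplines, barlinecols = container

ax.set_xticks(x)
ax.set_xticklabels(groups)

# L-frame: hide the top and right spines, keep a faint horizontal grid
# drawn beneath the data.
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)
ax.grid(axis="y", alpha=0.10)
ax.set_axisbelow(True)
```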

Score Caps Applied

  • None — DE-01=4 and DE-02=4, so the "correct but boring" cap (DE-01 ≤ 2 AND DE-02 ≤ 2) does not trigger
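
The arithmetic behind the total and the cap check can be restated as a short sketch. The function name `boring_cap_triggers` and the dictionary layout are hypothetical; only the category scores and the trigger condition come from this review, and the score the cap would impose is not stated here, so the sketch does not apply one.

```python
# Category scores as reported in the table above.
scores = {
    "Visual Quality": 30,
    "Design Excellence": 10,
    "Spec Compliance": 15,
    "Data Quality": 14,
    "Code Quality": 10,
    "Library Mastery": 7,
}
total = sum(scores.values())


def boring_cap_triggers(de01, de02):
    """Trigger condition for the "correct but boring" cap, as quoted in the review."""
    return de01 <= 2 and de02 <= 2


cap_applies = boring_cap_triggers(de01=4, de02=4)
```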

Strengths

  • Perfect theme adaptation — all chrome tokens (background, text, grid, spine, tick colors) correctly flip between light and dark themes with zero dark-on-dark or light-on-light failures
  • Asymmetric error bars correctly implemented using the 2×N yerr array format with capsize=10 and capthick=3 giving clear cap visibility
  • Clean flat KISS code with explicit font sizes (24/20/16pt), set_axisbelow(True), and white marker edges for definition
  • Realistic clinical trial context (Control vs. Treatment groups) with plausible, neutral data and meaningful asymmetric uncertainties

Weaknesses

  • DE-03 LOW: No visual hierarchy or emphasis — all six groups rendered identically in the same green; viewer must find the story themselves
    • Fix: Add a reference line at the Control mean; or color-code groups (e.g., Control vs. Treatment in distinct Okabe-Ito colors); or vary marker size by uncertainty magnitude
  • DE-01 MODERATE: Single-color single-series design looks like a well-configured default but lacks aesthetic sophistication
    • Fix: Introduce a horizontal reference band or dashed baseline at the Control level; add subtle background shading behind the data range; add a brief data label on the highest-value point (Treatment D)
  • DQ-01 PARTIAL: All error bars are asymmetric; showing one or two symmetric cases alongside asymmetric ones would demonstrate fuller errorbar feature coverage

Issues Found

  1. DE-03 LOW (2/6): No focal point or visual narrative — six data points with identical styling and no guidance for the viewer
    • Fix: Reference line at Control mean + highlight Treatment D (highest value) with a distinct color or annotation pointing out the effect size
  2. DE-01 MODERATE (4/8): Generic single-color design with no design sophistication beyond clean defaults
    • Fix: Multi-color Okabe-Ito coding of groups (Control=green, Treatments=next Okabe-Ito colors); or a horizontal reference band; or strategic use of opacity to de-emphasize lower-priority groups

AI Feedback for Next Attempt

Improve design excellence by adding visual hierarchy: (1) Use Okabe-Ito multi-color coding to distinguish the Control group from the five Treatment groups — this immediately gives the viewer a "compare treatments to control" mental model. (2) Add a horizontal dashed reference line at the Control group mean to make treatment comparisons explicit. (3) Consider adding a concise annotation on Treatment D (highest mean) to create a clear focal point. These changes would push DE-01 from 4→6 and DE-03 from 2→4 without adding complexity.
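
A possible shape for the three suggestions is sketched below. Everything specific in it is an assumption: the data values, the choice of Okabe-Ito blue `#0072B2` for the treatment groups, the gray of the reference line, and the annotation wording. It only illustrates how `ax.axhline` and `ax.annotate` could sit alongside two `ax.errorbar` calls.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data; index 0 is Control, index 4 is Treatment D (highest mean).
groups = ["Control", "Treatment A", "Treatment B",
          "Treatment C", "Treatment D", "Treatment E"]
means = np.array([25.0, 31.5, 36.0, 40.5, 48.0, 43.0])
yerr = np.array([[2.0, 2.5, 3.0, 6.5, 4.0, 3.5],
                 [2.5, 3.5, 4.0, 3.0, 5.0, 2.0]])
x = np.arange(len(groups))

CONTROL_COLOR = "#009E73"    # brand green from the review
TREATMENT_COLOR = "#0072B2"  # Okabe-Ito blue, an assumed choice

fig, ax = plt.subplots(figsize=(8, 4.5))
style = dict(fmt="o", markersize=15, elinewidth=3, capsize=10, capthick=3)

# (1) Color-code Control separately from the five Treatment groups.
ax.errorbar(x[:1], means[:1], yerr=yerr[:, :1], color=CONTROL_COLOR, **style)
ax.errorbar(x[1:], means[1:], yerr=yerr[:, 1:], color=TREATMENT_COLOR, **style)

# (2) Dashed horizontal reference line at the Control mean.
ref_line = ax.axhline(means[0], linestyle="--", linewidth=1.5,
                      color="#888888", zorder=0)

# (3) Short annotation on the highest-mean group as a focal point.
top = int(np.argmax(means))
label = ax.annotate(
    f"+{means[top] - means[0]:.1f} vs. Control",
    xy=(x[top], means[top]),
    xytext=(18, 0), textcoords="offset points",
    va="center", fontsize=16,
)

ax.set_xticks(x)
ax.set_xticklabels(groups)
```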

Verdict: REJECTED

github-actions bot added the quality:86 (Quality score 86/100) and ai-approved (Quality OK, ready for merge) labels on Apr 25, 2026
github-actions bot merged commit 0c6ecb8 into main on Apr 25, 2026
3 checks passed
github-actions bot deleted the implementation/errorbar-basic/matplotlib branch on April 25, 2026 08:44