feat(plotnine): implement dumbbell-basic #5420

Merged: github-actions[bot] merged 5 commits into main from implementation/dumbbell-basic/plotnine on Apr 26, 2026

Conversation

@github-actions
Contributor

Implementation: dumbbell-basic - python/plotnine

Implements the python/plotnine version of dumbbell-basic.

File: plots/dumbbell-basic/implementations/python/plotnine.py

Parent Issue: #945


🤖 impl-generate workflow

@claude
Contributor

claude Bot commented Apr 26, 2026

AI Review - Attempt 1/3

Image Description

Light render (plot-light.png): The plot renders on a warm off-white background consistent with #FAF8F1. Eight departments are displayed on the y-axis (HR, Customer Support, Engineering, Operations, Product, Marketing, Finance, Sales — sorted top-to-bottom by decreasing improvement magnitude). The x-axis shows "Satisfaction Score" from 30 to 100. Each row has a teal/green dot for "Before" and an orange dot for "After", connected by a thin gray horizontal line. The title "Employee Satisfaction · dumbbell-basic · plotnine · anyplot.ai" appears in dark text at the top. A legend in the upper-right area labels "Before" (green) and "After" (orange). Subtle horizontal grid lines are visible. All text — title, axis labels, tick labels, and legend — is clearly readable against the light background. Legibility verdict: PASS

Dark render (plot-dark.png): The same plot renders on a warm near-black background consistent with #1A1A17. Title, axis labels ("Department", "Satisfaction Score"), tick labels, and legend text all appear in light/off-white tones and are clearly readable against the dark background. The teal "Before" and orange "After" dot colors are visually identical to the light render — only the background and text chrome have flipped. No dark-on-dark failures (black text on near-black) are apparent. The subtle horizontal grid lines remain visible. Legibility verdict: PASS

Important note: The images show "anyplot.ai" in the title and appear to use Okabe-Ito colors (teal #009E73 and orange), but the code on disk specifies #306998 (Python Blue) for "Before" and #FFD43B (yellow) for "After", and saves to plot.png with no ANYPLOT_THEME support. This code-image mismatch means the images were generated from a different code version. The code as written cannot reproduce these images.

Score: 79/100

| Category | Score | Max |
|---|---|---|
| Visual Quality | 25 | 30 |
| Design Excellence | 11 | 20 |
| Spec Compliance | 14 | 15 |
| Data Quality | 14 | 15 |
| Code Quality | 9 | 10 |
| Library Mastery | 6 | 10 |
| Total | 79 | 100 |
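
As a quick sanity check on the arithmetic, the category scores can be summed to confirm the reported total. This is a minimal sketch; the dictionary layout and variable names are illustrative and not part of the review tooling.

```python
# Category scores from attempt 1, stored as (score, max) pairs.
scores = {
    "Visual Quality": (25, 30),
    "Design Excellence": (11, 20),
    "Spec Compliance": (14, 15),
    "Data Quality": (14, 15),
    "Code Quality": (9, 10),
    "Library Mastery": (6, 10),
}

# Sum the awarded points and the available points separately.
total = sum(score for score, _ in scores.values())
maximum = sum(cap for _, cap in scores.values())

print(f"{total}/{maximum}")  # 79/100
```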

Visual Quality (25/30)

  • VQ-01: Text Legibility (7/8) — Font sizes explicitly set (title=24, axis=20, ticks=16, legend=16); all text readable in both renders. Minor: no weight differentiation for hierarchy.
  • VQ-02: No Overlap (6/6) — No overlapping elements in either render.
  • VQ-03: Element Visibility (5/6) — Dots (size=6) are visible; connecting lines (size=1.5) are appropriately subtle. Dots could be slightly larger for the 4800×2700 canvas.
  • VQ-04: Color Accessibility (2/2) — Green+orange pairing is CVD-safe; images show Okabe-Ito–compatible colors.
  • VQ-05: Layout & Canvas (3/4) — Good proportions; legend is well-positioned outside the plot without being isolated.
  • VQ-06: Axis Labels & Title (2/2) — "Satisfaction Score" and "Department" are descriptive; no units needed for scores.
  • VQ-07: Palette Compliance (0/2) — Code uses #306998 (Python Blue — explicitly non-compliant per criteria) for "Before" and #FFD43B (not Okabe-Ito) for "After". No os.environ["ANYPLOT_THEME"] support; no theme-adaptive chrome (PAGE_BG, INK, INK_SOFT tokens); background and text colors are hardcoded without reading the theme variable.

Design Excellence (11/20)

  • DE-01: Aesthetic Sophistication (4/8) — Well-configured library default. Sorting by improvement is a thoughtful choice, but the code palette is non-compliant, and there is no font weight differentiation or additional polish beyond theme_minimal.
  • DE-02: Visual Refinement (4/6) — theme_minimal() removes spines; panel_grid_major_y=element_blank() removes the horizontal grid lines drawn at the y breaks, leaving only the subtle x-axis grid lines. Good refinement above defaults.
  • DE-03: Data Storytelling (3/6) — Sorting departments by improvement magnitude creates a meaningful top-to-bottom narrative. However, no emphasis (color intensity, size, annotations) draws attention to the biggest gains; the viewer must discover the pattern independently.

Spec Compliance (14/15)

  • SC-01: Plot Type (5/5) — Correct dumbbell chart using geom_segment + geom_point.
  • SC-02: Required Features (4/4) — Horizontal orientation ✓, distinct colors for Before/After ✓, sorted by difference ✓, thin connecting line ✓.
  • SC-03: Data Mapping (3/3) — Values on x-axis, categories on y-axis; all data visible.
  • SC-04: Title & Legend (2/3) — Legend labels ("Before", "After") are correct. Title in code is "dumbbell-basic · plotnine · pyplots.ai" — uses wrong domain ("pyplots.ai" instead of "anyplot.ai"). The images show the correct domain, confirming the code is out of sync.

Data Quality (14/15)

  • DQ-01: Feature Coverage (5/6) — Eight departments show a variety of improvement magnitudes (6–28 points), demonstrating the full range of dumbbell behavior. Slight deduction: no "negative change" example (all departments improved).
  • DQ-02: Realistic Context (5/5) — Employee satisfaction before/after workplace policy changes is a real, neutral business scenario.
  • DQ-03: Appropriate Scale (4/4) — Scores 40–88 on an implicit 0–100 scale are realistic for satisfaction surveys.

Code Quality (9/10)

  • CQ-01: KISS Structure (3/3) — Clean imports → data → plot → save structure; no functions or classes.
  • CQ-02: Reproducibility (2/2) — Deterministic hardcoded data.
  • CQ-03: Clean Imports (2/2) — All imported symbols are used.
  • CQ-04: Code Elegance (2/2) — sorted(zip(...), key=lambda x: x[3]) is Pythonic. pd.Categorical for axis ordering is the right approach. No fake UI.
  • CQ-05: Output & API (0/1) — Saves to "plot.png" (bare filename). Must save to f"plot-{THEME}.png" where THEME = os.getenv("ANYPLOT_THEME", "light"). No theme variable is read at all.
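
The filename rule from CQ-05 could look roughly like the sketch below. The helper name `output_filename` and its `env` parameter are hypothetical, added only so the behaviour can be checked without touching the real environment; the implementation itself would presumably read the variable once at module level.

```python
import os


def output_filename(env=os.environ):
    """Return the theme-specific PNG name, falling back to the light theme."""
    theme = env.get("ANYPLOT_THEME", "light")
    return f"plot-{theme}.png"


print(output_filename({}))                         # plot-light.png
print(output_filename({"ANYPLOT_THEME": "dark"}))  # plot-dark.png
```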

Library Mastery (6/10)

  • LM-01: Idiomatic Usage (4/5) — Good plotnine grammar-of-graphics usage: ggplot() + geom_segment() + geom_point() + scale_color_manual() + labs() + theme(). pd.Categorical for proper factor ordering is idiomatic.
  • LM-02: Distinctive Features (2/5) — Layer composition of geom_segment + geom_point and categorical axis ordering are plotnine-specific patterns. Doesn't leverage scale_x_continuous breaks/labels, coord_flip, or plotnine's faceting system.

Score Caps Applied

  • None (all cap thresholds cleared: DE-01=4 > 2, DE-02=4 > 2; VQ-02=6, VQ-03=5, SC-01=5, DQ-02=5, and CQ-04=2 are all non-zero)

Strengths

  • Correct dumbbell chart type using geom_segment + geom_point layering — idiomatic plotnine approach
  • Data sorted by improvement magnitude creates a meaningful visual pattern
  • pd.Categorical for proper y-axis ordering is the right plotnine pattern
  • Good visual refinement: y-grid removed, theme_minimal() keeps the chart clean
  • Realistic, neutral employee-satisfaction dataset with meaningful variation

Weaknesses

  • Critical palette violation: #306998 (Python Blue) is explicitly listed as non-compliant in the scoring criteria; #FFD43B is not Okabe-Ito. Must use #009E73 (Okabe-Ito 1) for "Before" and #D55E00 (Okabe-Ito 2) for "After"
  • No theme support: No os.environ["ANYPLOT_THEME"] read; no PAGE_BG/INK/INK_SOFT tokens; connecting line color #888888 is also hardcoded
  • Wrong output filename: plot.save("plot.png") must become plot.save(f"plot-{THEME}.png")
  • Wrong title domain: "pyplots.ai" must be "anyplot.ai"
  • No data storytelling emphasis: All improvements shown equally; no visual emphasis on largest-gain departments

Issues Found

  1. VQ-07 = 0 (PALETTE NON-COMPLIANT): #306998 and #FFD43B are not Okabe-Ito colors. Add os.environ["ANYPLOT_THEME"] support and full theme-adaptive chrome.
    • Fix: Replace with OKABE_ITO = ["#009E73", "#D55E00", ...]; add PAGE_BG/INK/INK_SOFT tokens; apply to plot_background, axis_text, plot_title, legend_background, panel_grid_major
  2. CQ-05 = 0 (WRONG OUTPUT FILENAME): plot.save("plot.png") — must save as f"plot-{THEME}.png"
  3. SC-04 PARTIAL: Title uses "pyplots.ai" instead of "anyplot.ai"

AI Feedback for Next Attempt

Four required fixes: (1) Replace #306998/#FFD43B with Okabe-Ito colors {"Before": "#009E73", "After": "#D55E00"}; (2) Add import os and read THEME = os.getenv("ANYPLOT_THEME", "light"), set PAGE_BG/INK/INK_SOFT tokens, apply them to plot_background, panel_background, axis_text, plot_title, legend_background, legend_text, panel_grid_major; (3) Change plot.save("plot.png") to plot.save(f"plot-{THEME}.png", ...); (4) Fix title to use "anyplot.ai". Optionally: apply INK_SOFT to connecting line color instead of hardcoded #888888, and add mild color-opacity variation to emphasize the largest improvements (DE-03).
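
One way to organise the theme tokens is a small lookup table keyed by theme name, sketched below. The two page backgrounds and the dark INK/INK_SOFT values are the ones quoted in this conversation; the light INK and INK_SOFT values are placeholders, and the `TOKENS` table and `tokens_for` helper are hypothetical names.

```python
import os

THEME = os.getenv("ANYPLOT_THEME", "light")

TOKENS = {
    # Light INK and INK_SOFT are placeholder assumptions, not project values.
    "light": {"PAGE_BG": "#FAF8F1", "INK": "#1A1A17", "INK_SOFT": "#4A4A44"},
    "dark": {"PAGE_BG": "#1A1A17", "INK": "#F0EFE8", "INK_SOFT": "#B8B7B0"},
}

# Data colors stay the same in both themes; only the chrome flips.
PALETTE = {"Before": "#009E73", "After": "#D55E00"}


def tokens_for(theme):
    """Return the chrome tokens for a theme, falling back to light."""
    return TOKENS.get(theme, TOKENS["light"])


PAGE_BG, INK, INK_SOFT = (tokens_for(THEME)[k] for k in ("PAGE_BG", "INK", "INK_SOFT"))
```

The resulting `PAGE_BG`, `INK`, and `INK_SOFT` values would then be passed to the theme() elements listed in the feedback.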

Verdict: REJECTED

github-actions bot added labels quality:79 (Quality score 79/100) and ai-rejected (Quality not OK, triggers update) on Apr 26, 2026
github-actions bot added label ai-attempt-1 (First repair attempt) and removed label ai-rejected (Quality not OK, triggers update) on Apr 26, 2026
Attempt 1/3 - fixes based on AI review
@github-actions
Contributor Author

🔧 Repair Attempt 1/3

Applied fixes based on AI review feedback.

Status: Repair completed, re-triggering review...


🤖 impl-repair

@claude
Contributor

claude Bot commented Apr 26, 2026

AI Review - Attempt 2/3

Image Description

Light render (plot-light.png): The plot renders on a warm off-white (#FAF8F1) background. Title "Employee Satisfaction · dumbbell-basic · plotnine · anyplot.ai" is bold and dark. Eight horizontal dumbbells are sorted by improvement gap (smallest at bottom: Sales +6, Finance +7; largest at top: HR +28, Customer Support +22). Green dots (#009E73) mark "Before" values and orange dots (#D55E00) mark "After" values — all improvements are positive. Connecting segments are a subtle gray with alpha=0.6. Only x-axis gridlines are shown (y-axis grid is correctly suppressed). Legend on the right with "Before" and "After" labels is clear. All text (title, axis labels, tick labels, legend) is clearly readable against the light background.

Dark render (plot-dark.png): The plot renders on a warm near-black (#1A1A17) background. The title and axis labels are rendered in light cream text (#F0EFE8), clearly readable. Tick labels and legend text are in the expected lighter gray tone (#B8B7B0 equivalent). Data colors are identical to the light render — green (#009E73) and orange (#D55E00) — confirming Okabe-Ito positions 1–2 are theme-invariant. The connecting lines adapt to a lighter gray (INK_SOFT dark variant). No dark-on-dark failures detected anywhere. The legend box uses the elevated dark surface (#242420) for a subtle distinction. All text is clearly readable against the dark background.


Score: 89/100

| Category | Score | Max |
|---|---|---|
| Visual Quality | 30 | 30 |
| Design Excellence | 11 | 20 |
| Spec Compliance | 15 | 15 |
| Data Quality | 15 | 15 |
| Code Quality | 10 | 10 |
| Library Mastery | 8 | 10 |
| Total | 89 | 100 |
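
To see where the ten-point gain over attempt 1 came from, the per-category deltas can be computed directly from the two score tables. This is a small sketch with illustrative variable names.

```python
# Category scores from the two review attempts.
attempt_1 = {"Visual Quality": 25, "Design Excellence": 11, "Spec Compliance": 14,
             "Data Quality": 14, "Code Quality": 9, "Library Mastery": 6}
attempt_2 = {"Visual Quality": 30, "Design Excellence": 11, "Spec Compliance": 15,
             "Data Quality": 15, "Code Quality": 10, "Library Mastery": 8}

# Change per category between the two attempts.
deltas = {name: attempt_2[name] - attempt_1[name] for name in attempt_1}

for name, delta in deltas.items():
    print(f"{name}: {delta:+d}")

print(sum(deltas.values()))  # 10
```

Design Excellence is the only category that did not move, which matches the focus of the feedback for the next attempt.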

Visual Quality (30/30)

  • VQ-01: Text Legibility (8/8) — All sizes explicitly set: title=24pt, axis=20pt, ticks=16pt, legend=16pt. Adaptive INK/INK_SOFT colors work correctly in both themes.
  • VQ-02: No Overlap (6/6) — 8 categories well-spaced; no label collisions.
  • VQ-03: Element Visibility (6/6) — size=7 markers clearly visible at 4800×2700; connecting lines at size=1.5 are appropriately subtle.
  • VQ-04: Color Accessibility (2/2) — Green (#009E73) vs orange (#D55E00) provide good luminance separation; CVD-safe Okabe-Ito combination.
  • VQ-05: Layout & Canvas (4/4) — 16:9 canvas, balanced margins, plot fills 50–70% of canvas, legend integrated at right.
  • VQ-06: Axis Labels & Title (2/2) — "Satisfaction Score" x-axis and "Department" y-axis are descriptive; survey scale is dimensionless.
  • VQ-07: Palette Compliance (2/2) — Before=#009E73 (position 1), After=#D55E00 (position 2). Backgrounds #FAF8F1 / #1A1A17. Data colors identical across both renders; chrome flips correctly.

Design Excellence (11/20)

  • DE-01: Aesthetic Sophistication (4/8) — Well-configured library default. Sorting by difference is the key intentional design choice, but nothing lifts this above a standard configured default. No annotations, no size variation, no emphasis to distinguish high vs. low performers.
  • DE-02: Visual Refinement (4/6) — Y-axis major grid correctly blanked (connecting lines do that job), minor grid removed, x-axis grid subtle. However, spines are not explicitly removed — panel_border=element_blank() is missing and axis_line is not explicitly styled per library guide.
  • DE-03: Data Storytelling (3/6) — Sorting by difference (ascending) creates a clear narrative structure revealing which departments improved most vs. least. But there is no visual emphasis (alpha variation, size variation, annotations) to make the high-gain departments pop visually.

Spec Compliance (15/15)

  • SC-01: Plot Type (5/5) — Correct horizontal dumbbell chart with two dots per category connected by a line.
  • SC-02: Required Features (4/4) — Distinct colors for start/end, horizontal orientation, categories on y-axis, sorted by difference, thin/subtle connecting line. All met.
  • SC-03: Data Mapping (3/3) — Satisfaction values on x-axis, departments on y-axis; all 8 categories visible.
  • SC-04: Title & Legend (3/3) — Title format correct; legend shows "Before" / "After" with matching dot colors.

Data Quality (15/15)

  • DQ-01: Feature Coverage (6/6) — Wide range of gap sizes (Sales +6 to HR +28) and starting values (Customer Support starts at 40, Product at 75) demonstrate diverse dumbbell patterns.
  • DQ-02: Realistic Context (5/5) — Employee satisfaction scores before/after workplace policy changes; neutral, real-world business scenario across 8 recognizable departments.
  • DQ-03: Appropriate Scale (4/4) — Values range 40–88 on a 30–100 axis; internally consistent with realistic survey scores.

Code Quality (10/10)

  • CQ-01: KISS Structure (3/3) — Flat script: imports → constants → data → DataFrames → plot → save. No functions or classes.
  • CQ-02: Reproducibility (2/2) — All data is hardcoded; fully deterministic.
  • CQ-03: Clean Imports (2/2) — All 13 plotnine symbols imported are actually used.
  • CQ-04: Code Elegance (2/2) — Clean Pythonic code; zip(strict=True) is a good practice; list comprehensions are appropriate.
  • CQ-05: Output & API (1/1) — Saves as plot-{THEME}.png via plot.save().

Library Mastery (8/10)

  • LM-01: Idiomatic Usage (5/5) — Expert plotnine grammar: ggplot() + geom_segment() + geom_point() + scale_* + theme() composition. pd.Categorical for ordered factors is the idiomatic plotnine/ggplot2 pattern for controlling plot order.
  • LM-02: Distinctive Features (3/5) — Grammar of graphics layering (geom_segment + geom_point) and pd.Categorical ordering are distinctive plotnine features. Could go further with scale_y_discrete(expand=...) for fine-tuned spacing or other plotnine-specific refinements.

Score Caps Applied

  • None — no caps apply (DE-01=4 > 2, DE-02=4 > 2; no zero categories)

Strengths

  • Perfect theme adaptation — all text elements (title, axis labels, ticks, legend) use adaptive INK/INK_SOFT tokens; both renders fully pass readability check
  • Correct Okabe-Ito order with Before=#009E73 (brand green, position 1) and After=#D55E00 (position 2)
  • Sorting by difference is the right design choice — immediately reveals improvement patterns across departments
  • pd.Categorical for ordered factor is the idiomatic plotnine approach to controlling plot ordering
  • All font sizes explicitly set at correct values for the 4800×2700 render target
  • Data is realistic, neutral, and varied (gap sizes range from +6 to +28 pts)

Weaknesses

  • DE-01 LOW: No standout design element beyond sorting — no annotations, no size variation, no visual emphasis; looks like a well-configured default
  • DE-02 PARTIAL: panel_border=element_blank() is missing and axis_line color is not explicitly set (per library guide); relying on theme_minimal() to handle spine removal
  • DE-03 MODERATE: Sorting creates structure but high-gain departments (HR +28, Customer Support +22) don't visually pop vs. low-gain ones (Sales +6, Finance +7) — need emphasis

Issues Found

  1. DE-01 LOW (4/8): Design sophistication stops at "well-configured default" — sorting is good but adds no further visual emphasis
    • Fix: Add difference annotations (+N) at the end of each dumbbell using geom_text, OR vary marker alpha/size by gap magnitude to create visual hierarchy emphasizing the largest improvements
  2. DE-02 PARTIAL (4/6): Spines not explicitly removed; axis_line color not set
    • Fix: Add panel_border=element_blank() to theme(); add axis_line=element_line(color=INK_SOFT) per the plotnine library guide
  3. DE-03 MODERATE (3/6): No visual emphasis distinguishes high-gain from low-gain departments
    • Fix: Consider adding geom_text with difference labels (e.g., "+28 pts") positioned to the right of the "After" dot, or use alpha variation on segments proportional to gap size
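
The alpha-variation option could be prototyped as a simple linear rescaling of each gap into an opacity range. The sketch below is one possible mapping, not the project's method; the function name `gap_alpha` and the 0.35 to 1.0 range are assumptions. The gap values are the ones quoted in this review.

```python
def gap_alpha(gap, lo, hi, min_alpha=0.35, max_alpha=1.0):
    """Linearly map a gap in [lo, hi] to an opacity in [min_alpha, max_alpha]."""
    if hi == lo:
        return max_alpha  # all gaps equal: no hierarchy to express
    share = (gap - lo) / (hi - lo)
    return min_alpha + share * (max_alpha - min_alpha)


# Gap sizes quoted in the review for four of the eight departments.
gaps = {"Sales": 6, "Finance": 7, "Customer Support": 22, "HR": 28}
lo, hi = min(gaps.values()), max(gaps.values())

alphas = {name: round(gap_alpha(gap, lo, hi), 2) for name, gap in gaps.items()}
print(alphas)
```

The resulting values could be stored in a DataFrame column and mapped to the alpha aesthetic of the segment layer, paired with scale_alpha_identity() so they are used as-is.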

AI Feedback for Next Attempt

The core dumbbell implementation is excellent (perfect VQ, SC, DQ, CQ scores) — do not change the data, color palette, sorting, or output logic. Focus improvements on Design Excellence:

  1. Add difference annotations: Use geom_text(aes(x='end + 2', y='category', label='diff_label'), data=df_segments) to annotate each dumbbell with the improvement value (e.g., "+28 pts"). This simultaneously fixes DE-01 (sophistication), DE-03 (storytelling), and elevates the plot above defaults.
  2. Explicit spine removal: Add panel_border=element_blank() and axis_line=element_line(color=INK_SOFT) to the theme() block.
  3. Keep everything else unchanged — palette, sorting, font sizes, data, and theme adaptation are all correct.

Verdict: REJECTED

github-actions bot added labels quality:89 (Quality score 89/100) and ai-approved (Quality OK, ready for merge) on Apr 26, 2026
github-actions bot merged commit 06a0cfe into main on Apr 26, 2026
github-actions bot deleted the implementation/dumbbell-basic/plotnine branch on April 26, 2026 01:56

Labels

ai-approved Quality OK, ready for merge ai-attempt-1 First repair attempt quality:79 Quality score 79/100 quality:89 Quality score 89/100
