feat(altair): implement dumbbell-basic #9568

Merged
MarkusNeusinger merged 6 commits into main from implementation/dumbbell-basic/altair on Jun 30, 2026

Conversation

@github-actions

Implementation: dumbbell-basic - python/altair

Implements the python/altair version of dumbbell-basic.

File: plots/dumbbell-basic/implementations/python/altair.py

Parent Issue: #945


🤖 impl-generate workflow

github-actions Bot added 2 commits June 30, 2026 22:57
Regen from quality 89. Addressed:
- Canvas size: switched to width=576/height=374 + scale_factor=4.0 + PIL
  padding to exact 3200×1800 (previously used width=1600/height=900/scale=3.0
  producing 4800×2700 — now gate-compliant)
- DE-03/LM-02: added transform_calculate diff labels (+N pts) showing gain
  immediately for each dumbbell — uses Altair-native transform, visible story
- Title: added missing "python" language token
- Style: aligned font sizes to library prompt (10/12px ticks/axis labels),
  fixed palette comment "Okabe-Ito" → "Imprint"
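
The canvas fix described in this commit message can be sketched as follows. This is a minimal illustration, not the code in altair.py: `pad_to_canvas` is a hypothetical helper name, centering the render on the canvas is an assumption, and a blank stand-in image replaces the real export. Note that 576×374 at scale_factor=4.0 gives 2304×1496 for the plot area alone; the real export is likely larger once axes, title, and legend are included, which is presumably why padding to an exact size is needed.

```python
from PIL import Image

TARGET = (3200, 1800)  # canvas size named in the commit message
PAGE_BG = "#FAF8F1"    # light-theme background quoted in the review


def pad_to_canvas(img, target=TARGET, fill=PAGE_BG):
    """Center img on a canvas of the target size (hypothetical helper)."""
    if img.width > target[0] or img.height > target[1]:
        raise ValueError(f"render {img.size} exceeds canvas {target}")
    canvas = Image.new("RGB", target, fill)
    offset = ((target[0] - img.width) // 2, (target[1] - img.height) // 2)
    canvas.paste(img, offset)
    return canvas


# Stand-in for a render: the 576x374 view at scale_factor=4.0 (plot area only).
render = Image.new("RGB", (576 * 4, 374 * 4), "#009E73")
padded = pad_to_canvas(render)
print(padded.size)

# The previous settings overshoot the gate before any padding is applied.
old_size = (1600 * 3, 900 * 3)  # 4800 x 2700
```

Raising on an oversized render, rather than cropping, keeps a wrong width/height setting from silently producing a clipped image.
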
@claude

claude Bot commented Jun 30, 2026

AI Review - Attempt 1/3

Image Description

Light render (plot-light.png): The chart renders on a warm off-white (#FAF8F1) background. Ten department categories (IT through Customer Support) are plotted on the Y-axis, sorted ascending by gain (IT at top with +5 pts, Customer Support at bottom with +26 pts). Two large circles per row — brand green (#009E73) for "Before" and lavender (#C475FD) for "After" — are connected by a subtle grey rule. Difference labels (e.g. "+26 pts") appear in muted ink immediately right of each After dot. The X-axis is labeled "Employee Satisfaction Score (%)" with range 46–92. A legend box ("Policy Change") sits in the upper-right outside the data area. All title, axis label, tick label, and annotation text is clearly readable against the light background.

Dark render (plot-dark.png): Renders on a warm near-black (#1A1A17) background. Chrome (title, axis labels, tick labels, legend text) switches to light-cream tones (#F0EFE8 / #B8B7B0). The legend box uses the elevated dark surface (#242420) with a muted stroke — clearly distinct from the background. Data colors (green #009E73 and lavender #C475FD) are identical to the light render. Connecting lines and difference label text adapt to the dark chrome tokens. All text — title, category labels, tick labels, difference annotations — is clearly readable against the dark background. No dark-on-dark text failure observed.

Both paragraphs are required. A review that only describes one render is invalid.

Score: 89/100

| Category | Score | Max |
|---|---|---|
| Visual Quality | 29 | 30 |
| Design Excellence | 12 | 20 |
| Spec Compliance | 15 | 15 |
| Data Quality | 15 | 15 |
| Code Quality | 10 | 10 |
| Library Mastery | 8 | 10 |
| Total | 89 | 100 |

Visual Quality (29/30)

  • VQ-01: Text Legibility (7/8) — All text is readable in both themes; difference labels at fontSize=9 are small but legible at desktop scale; may be marginally tight on mobile (~400 px)
  • VQ-02: No Overlap (6/6) — No overlapping elements in either render; category labels, dots, diff labels, and legend are all cleanly separated
  • VQ-03: Element Visibility (6/6) — size=350 dots are prominent for 10 sparse datapoints; connecting lines at 45% opacity are appropriately subtle; both dot colors are clearly distinguishable
  • VQ-04: Color Accessibility (2/2) — Imprint positions 1 (#009E73) and 2 (#C475FD) are a CVD-safe hue pair; dots have a PAGE_BG stroke for contrast and edge definition
  • VQ-05: Layout & Canvas (4/4) — Canvas gate passed (3200×1800); layout proportions are well-balanced; title fills ~70% of width; appropriate whitespace
  • VQ-06: Axis Labels & Title (2/2) — X-axis label "Employee Satisfaction Score (%)" is descriptive and includes units; Y-axis title suppressed (category names self-explanatory); title format correct
  • VQ-07: Palette Compliance (2/2) — Before=#009E73 (position 1), After=#C475FD (position 2); light background=#FAF8F1, dark=#1A1A17; chrome tokens adapt correctly; data colors identical across themes

Design Excellence (12/20)

  • DE-01: Aesthetic Sophistication (5/8) — Intentional design: sorted data reveals the narrative, stroked dots add definition, Imprint palette applied correctly; not exceptional but clearly thoughtful
  • DE-02: Visual Refinement (3/6) — Vertical grid at 10% opacity is appropriately subtle; configure_view stroke=None removes view border; legend padding=10 is polished; could further refine with spine removal (left axis domain line is still visible)
  • DE-03: Data Storytelling (4/6) — Sorted ascending by difference is a strong storytelling choice; per-row difference annotations (+N pts) make the gain immediately readable; clear focal point (Customer Support +26 pts at bottom)

Spec Compliance (15/15)

  • SC-01: Plot Type (5/5) — Correct horizontal dumbbell chart: two colored dots per category connected by a horizontal rule
  • SC-02: Required Features (4/4) — Two distinct colors for start/end dots; thin connecting line; horizontal orientation; sorted by difference
  • SC-03: Data Mapping (3/3) — X-axis maps score values; Y-axis maps 10 department categories; color encodes period (Before/After); all data points shown
  • SC-04: Title & Legend (3/3) — Title: "Employee Satisfaction · dumbbell-basic · python · altair · anyplot.ai" (correct format with descriptive prefix); legend title "Policy Change" with labels "Before" / "After"

Data Quality (15/15)

  • DQ-01: Feature Coverage (6/6) — Both endpoints shown; connecting line encodes range; difference annotations add quantitative layer; sort reveals direction of change
  • DQ-02: Realistic Context (5/5) — Employee satisfaction before/after policy change is the canonical dumbbell use case; all 10 departments are plausible real-world categories; values (52–85 on 0–100 scale) are realistic and neutral
  • DQ-03: Appropriate Scale (4/4) — X-scale [45, 92] fits data with padding; 10 categories within spec's 5–20 optimal range; value magnitudes realistic for HR satisfaction scores

Code Quality (10/10)

  • CQ-01: KISS Structure (3/3) — No functions or classes; three-layer chart composition is direct and readable
  • CQ-02: Reproducibility (2/2) — Data is hardcoded (deterministic); no random elements
  • CQ-03: Clean Imports (2/2) — All four imports (os, altair, pandas, PIL.Image) are actually used
  • CQ-04: Code Elegance (2/2) — No fake UI; transform_calculate for computed labels is idiomatic; PIL padding block follows the canonical pattern
  • CQ-05: Output & API (1/1) — Saves plot-{THEME}.png (with padding) and plot-{THEME}.html; uses scale_factor=4.0

Library Mastery (8/10)

  • LM-01: Idiomatic Usage (5/5) — Layer composition via + operator; mark_rule with x/x2 encoding for dumbbell lines; EncodingSortField for data-driven sort; configure_axis/configure_legend for theme chrome — all canonical Altair patterns
  • LM-02: Distinctive Features (3/5) — mark_rule with x2 is the correct Altair idiom for range lines; transform_calculate for inline label computation is a nice touch; however the overall chart structure is achievable in most declarative libraries without anything truly Altair-exclusive

Score Caps Applied

  • None — no cap conditions triggered

Strengths

  • Excellent use of Altair's layer composition (mark_rule + mark_circle + mark_text) to build the dumbbell cleanly without workarounds
  • Data sorted ascending by difference creates an immediately readable narrative — Customer Support gains the most, IT the least
  • Per-row difference labels (+N pts) computed via transform_calculate make gains quantifiable without reading the x-axis
  • Complete and correct theme adaptation: background, text, grid, legend, and dot stroke all use adaptive tokens; data colors stay fixed across themes
  • mark_circle stroke=PAGE_BG gives dots a clean edge against both the background and the connecting lines

Weaknesses

  • Difference label fontSize=9 is the smallest allowed; at mobile scale (~400 px wide) these annotations shrink to ~4 px — consider fontSize=10 or 11 for marginally better accessibility
  • Left axis domain line is still visible (domainColor=INK_SOFT); removing it (domainOpacity=0 or domainColor='transparent') would push DE-02 refinement higher on next attempt

Issues Found

  1. VQ-01 MINOR: Difference annotations at fontSize=9 render at ~36 px in the 3200-wide image; at mobile thumbnail scale they become very small. Increase to fontSize=10 for slightly better legibility without crowding.
  2. DE-02 LOW: Left Y-axis domain line remains visible with domainColor=INK_SOFT. For a dumbbell chart the categories speak for themselves — suppressing the domain line would clean up the visual.
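
The pixel figures quoted for the difference labels follow from simple scaling. The sketch below assumes text scales linearly with the export factor and with any later downsizing; `rendered_px` is a hypothetical helper, and 400 px is the approximate mobile width the review uses.

```python
SCALE_FACTOR = 4.0   # scale_factor from the commit message
EXPORT_WIDTH = 3200  # px, width of the exported PNG
MOBILE_WIDTH = 400   # px, approximate thumbnail width used in the review


def rendered_px(font_size, display_width=EXPORT_WIDTH):
    """Font height in pixels after export scaling and linear downsizing."""
    return font_size * SCALE_FACTOR * display_width / EXPORT_WIDTH


for size in (9, 10, 11):
    print(size, rendered_px(size), rendered_px(size, MOBILE_WIDTH))
```

Under these assumptions fontSize=9 is 36 px in the export and 4.5 px at 400 px wide, while fontSize=11 reaches 5.5 px, so the suggested increase is a marginal gain.
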

AI Feedback for Next Attempt

This is a strong, well-structured dumbbell implementation. Minor improvements: (1) increase difference label fontSize from 9 to 10–11 for better small-screen legibility; (2) remove the Y-axis domain line via domainOpacity=0 to eliminate the residual axis bar; (3) consider lightening the connecting line stroke or reducing opacity further (currently 0.45) so it reads even more as structural guide than data element.

Verdict: APPROVED

github-actions Bot added the quality:89 (Quality score 89/100) and ai-rejected (Quality not OK, triggers update) labels on Jun 30, 2026
github-actions Bot added the ai-attempt-1 (First repair attempt) label and removed the ai-rejected (Quality not OK, triggers update) label on Jun 30, 2026
Attempt 1/3 - fixes based on AI review
@github-actions

🔧 Repair Attempt 1/4

Applied fixes based on AI review feedback.

Status: Repair completed, re-triggering review...


🤖 impl-repair

@claude

claude Bot commented Jun 30, 2026

AI Review - Attempt 2/3

Image Description

Light render (plot-light.png): The chart renders on a warm off-white #FAF8F1 background. The title "Employee Satisfaction · dumbbell-basic · python · altair · anyplot.ai" is displayed top-left in dark ink at a proportional size occupying ~75% of the plot width. The Y-axis lists 10 departments (IT through Customer Support at the bottom) sorted by ascending difference. The X-axis is labeled "Employee Satisfaction Score (%)" with numeric ticks from 46 to 92 every 2 units. Green #009E73 dots mark "Before" values and lavender #C475FD dots mark "After" values, connected by semi-transparent gray horizontal lines. Bold "+N pts" difference labels appear to the right of each After dot. A styled legend box ("Policy Change") sits in the upper right. All text — title, axis labels, tick labels, category names, and diff labels — is clearly readable against the light background. Legibility verdict: PASS.

Dark render (plot-dark.png): The same chart renders on a warm near-black #1A1A17 background. Data colors are identical — #009E73 for Before and #C475FD for After. Chrome inverts correctly: title and axis labels use light #F0EFE8 text, tick labels use #B8B7B0 soft-ink, and the legend box uses the elevated dark fill #242420. Grid lines remain subtle. No dark-on-dark failures observed — all text elements are clearly readable against the dark background. Legibility verdict: PASS.

Both renders pass the theme-readability check.

Score: 92/100

| Category | Score | Max |
|---|---|---|
| Visual Quality | 29 | 30 |
| Design Excellence | 14 | 20 |
| Spec Compliance | 15 | 15 |
| Data Quality | 15 | 15 |
| Code Quality | 10 | 10 |
| Library Mastery | 9 | 10 |
| Total | 92 | 100 |

Visual Quality (29/30)

  • VQ-01: Text Legibility (7/8) — All font sizes explicitly set; title, axis labels, tick labels, and category names are well-proportioned and readable in both themes. Minor deduction: ticks every 2 units (~23 ticks from 46–92) make the x-axis slightly crowded.
  • VQ-02: No Overlap (6/6) — No text collisions; diff labels well-spaced from each other and from data elements.
  • VQ-03: Element Visibility (6/6) — size=350 dots are prominent and appropriate for 10-category sparse data; connecting lines clearly visible at strokeWidth=3, opacity=0.45.
  • VQ-04: Color Accessibility (2/2) — Green/lavender pair is CVD-safe with good luminance difference; PAGE_BG stroke on dots adds definition.
  • VQ-05: Layout & Canvas (4/4) — Canvas gate passed (3200×1800). Plot fills the space well with the legend sitting cleanly in the upper right. PIL padding block correctly handles vl-convert sizing.
  • VQ-06: Axis Labels & Title (2/2) — X-axis "Employee Satisfaction Score (%)" is descriptive with units. Y-axis suppressed (department names serve as context). Title correctly formatted.
  • VQ-07: Palette Compliance (2/2) — Before = #009E73 (Imprint pos 1), After = #C475FD (Imprint pos 2). Backgrounds #FAF8F1 / #1A1A17 correct. Data colors identical across themes; only chrome flips. Full compliance.

Design Excellence (14/20)

  • DE-01: Aesthetic Sophistication (6/8) — Strong design clearly above Altair defaults: Imprint palette, PAGE_BG dot-stroke for crisp separation, semi-transparent connecting lines, bold diff labels. Doesn't quite reach publication-level polish (e.g., tick granularity, no axis refinement for whitespace), but well above a configured default.
  • DE-02: Visual Refinement (4/6) — Axis domain lines removed (domainOpacity=0), view stroke=None (no border frame), subtle grid at 10% opacity, styled legend box. Good refinement. The grid is slightly dense (every 2 units produces ~23 vertical lines) — reducing to every 5 or 10 units would lift this to 5–6.
  • DE-03: Data Storytelling (4/6) — Sorted by difference (ascending) makes the pattern immediately clear. Bold "+X pts" labels eliminate the need to mentally calculate differences — viewer instantly sees Customer Support improved the most (+26 pts). Good visual hierarchy.

Spec Compliance (15/15)

  • SC-01: Plot Type (5/5) — Correct dumbbell/connected dot plot with two endpoints per category.
  • SC-02: Required Features (4/4) — Horizontal orientation (categories on Y, values on X) ✓; distinct colors for start/end ✓; thin connecting lines ✓; sorted by difference ✓.
  • SC-03: Data Mapping (3/3) — Categories on Y-axis, satisfaction scores on X-axis, Before/After correctly mapped.
  • SC-04: Title & Legend (3/3) — Title uses optional descriptive prefix "Employee Satisfaction · " followed by mandatory "dumbbell-basic · python · altair · anyplot.ai". Legend title "Policy Change" with correct "Before"/"After" labels.

Data Quality (15/15)

  • DQ-01: Feature Coverage (6/6) — Shows 10 departments with varying improvement magnitudes (5–26 pts), demonstrating both small and large changes effectively.
  • DQ-02: Realistic Context (5/5) — Employee satisfaction before/after policy changes is a neutral, comprehensible business scenario with real department names.
  • DQ-03: Appropriate Scale (4/4) — Scores in 52–85 range (0–100 scale) with plausible before/after improvements. All departments improve which fits a "policy change success" narrative.

Code Quality (10/10)

  • CQ-01: KISS Structure (3/3) — Imports → data → chart layers → save. No functions or classes.
  • CQ-02: Reproducibility (2/2) — Hardcoded deterministic data; no random seed needed.
  • CQ-03: Clean Imports (2/2) — os, altair, pandas, PIL — all used, none superfluous.
  • CQ-04: Code Elegance (2/2) — Clean Altair layer composition, concise transform_calculate for derived labels, appropriate complexity.
  • CQ-05: Output & API (1/1) — Saves plot-{THEME}.png and plot-{THEME}.html; current Altair 6.x API.

Library Mastery (9/10)

  • LM-01: Idiomatic Usage (5/5) — Expertly uses layer composition (mark_rule + mark_circle + mark_text with +), x2 encoding for dumbbell lines, EncodingSortField, and transform_calculate for derived labels — all canonical Altair patterns.
  • LM-02: Distinctive Features (4/5) — transform_calculate, x2 encoding, and EncodingSortField are genuinely Altair-specific. One point off: tooltip encoding is wired but doesn't surface in PNG; the implementation could use a conditional encoding (e.g., highlight the largest-change row via color condition) to further showcase Altair's distinctive grammar.

Score Caps Applied

  • None — no caps triggered.

Strengths

  • Perfect Imprint palette assignment: Before=#009E73 (pos 1), After=#C475FD (pos 2); theme-adaptive chrome correctly wired in both renders
  • Excellent spec compliance: horizontal dumbbell, sorted by difference, subtle connecting lines, distinct endpoint colors — all spec requirements met
  • Bold "+X pts" difference labels via transform_calculate add immediate storytelling clarity without extra data
  • Idiomatic Altair layer composition (mark_rule + mark_circle + mark_text) is clean and maintainable
  • PIL canvas-padding block correctly handles vl-convert sizing to exact 3200×1800

Weaknesses

  • Grid density: 23 vertical grid lines (every 2 units from 46–92) is slightly noisy; reducing to every 5 or 10 units (e.g., via configure_axis(tickCount=10)) would improve visual refinement
  • Tick label granularity: same issue — ticks every 2 units creates a crowded axis; fewer ticks at rounder values would improve readability at small scales
  • LM-02 ceiling: conditional encodings (e.g., highlight the department with the highest gain) are Altair-exclusive and would demonstrate more distinctive library mastery

Issues Found

  1. DE-02 MINOR: Grid density at every 2 units produces ~23 vertical lines; consider tickCount=10 or tickMinStep=5 to reduce to ~10 ticks/gridlines
  2. LM-02 MINOR: Tooltip encoding present but invisible in PNG; a conditional color or size encoding for the top-performer row would better showcase Altair's grammar-of-graphics strengths

AI Feedback for Next Attempt

Implementation is strong. Primary improvement: reduce x-axis tick/grid density to every 5 or 10 units (tickCount=10 or tickMinStep=5) for cleaner visual refinement. Optionally add a conditional color encoding to highlight the department with the largest gain to demonstrate distinctive Altair grammar.

Verdict: APPROVED

github-actions Bot added the quality:92 (Quality score 92/100) and ai-approved (Quality OK, ready for merge) labels and removed the quality:89 (Quality score 89/100) label on Jun 30, 2026
MarkusNeusinger merged commit f32ca35 into main on Jun 30, 2026
3 checks passed
MarkusNeusinger deleted the implementation/dumbbell-basic/altair branch on June 30, 2026 23:12
Labels

ai-approved: Quality OK, ready for merge
ai-attempt-1: First repair attempt
quality:92: Quality score 92/100


1 participant