feat(highcharts): implement shap-waterfall by github-actions[bot] · Pull Request #5902 · MarkusNeusinger/anyplot

github-actions · 2026-05-07T11:47:28Z

Implementation: `shap-waterfall` - python/highcharts

Implements the python/highcharts version of shap-waterfall.

File: plots/shap-waterfall/implementations/python/highcharts.py

Parent Issue: #5237

🤖 impl-generate workflow

claude · 2026-05-07T11:52:22Z

AI Review - Attempt 1/3

Image Description

Light render (plot-light.png): The chart renders on a warm off-white #FAF8F1 background. A horizontal waterfall chart shows credit-default risk attribution: a long green (#009E73) baseline bar spans from 0 to 0.35, followed by 10 feature SHAP bars (orange #D55E00 for positive contributions, blue #0072B2 for negative), and a final green prediction bar at 0.20. Bars are ordered from largest to smallest absolute SHAP magnitude (Credit Score –0.180 at top, Savings Balance –0.020 at bottom). Data labels beside each bar show signed SHAP values. Two dashed reference lines mark the baseline (0.35) and prediction (0.20). Title "Credit Default Risk · shap-waterfall · highcharts · anyplot.ai" is dark and readable. Feature labels on the left y-axis, tick labels on the x-axis, and the x-axis title "Probability of Default" are all clearly readable. All text is legible against the light background. Legibility verdict: PASS. However, the x-axis has 52 tick marks at 0.01 intervals (–0.02 to 0.50), which are dense, and the right ~30% of the chart is mostly empty since data peaks near 0.35 while the axis extends to 0.50.

Dark render (plot-dark.png): Identical layout on a warm near-black #1A1A17 background. The baseline and prediction bars remain #009E73 (unchanged from light — correct). Positive bars stay #D55E00 and negative bars #0072B2 — data colors are identical across themes. Title text renders in light cream #F0EFE8; feature labels and tick labels appear in light gray #B8B7B0; data labels beside bars are light-colored. No dark-on-dark failures detected. Grid lines are subtle (10% opacity white rule). Legibility verdict: PASS — all chrome (text, grid, background) flips correctly to dark-theme tokens; data colors are unchanged.

Both paragraphs are required. A review that only describes one render is invalid.

Score: 86/100

Category	Score	Max
Visual Quality	27	30
Design Excellence	13	20
Spec Compliance	14	15
Data Quality	15	15
Code Quality	9	10
Library Mastery	8	10
Total	86	100

Visual Quality (27/30)

VQ-01: Text Legibility (7/8) — all sizes explicitly set (title 28px, axis labels 22px, tick labels 18–20px, data labels 18px); reference line annotation labels are 16px (below the 18px minimum — raise to 18px)
VQ-02: No Overlap (5/6) — 52 x-axis ticks at 0.01 intervals are very dense; no true overlap at 4800px but crowded; widen tick interval to 0.05
VQ-03: Element Visibility (6/6) — all bars clearly visible, colors well-contrasted in both themes
VQ-04: Color Accessibility (2/2) — Okabe-Ito palette is CVD-safe; orange/blue distinction is clear without relying on hue alone
VQ-05: Layout & Canvas (3/4) — plot fills ~65% of canvas; right side has excessive empty space (axis extends to 0.50, data peaks near 0.35); reduce max to ~0.42
VQ-06: Axis Labels & Title (2/2) — "Probability of Default" and "Feature" are descriptive
VQ-07: Palette Compliance (2/2) — #009E73 baseline/prediction, #D55E00 positive SHAP, #0072B2 negative SHAP (Okabe-Ito order); #FAF8F1 / #1A1A17 backgrounds; all chrome theme-correct

Design Excellence (13/20)

DE-01: Aesthetic Sophistication (5/8) — semantic color mapping (green = anchor, orange = risk up, blue = risk down) is intentional and professional; slightly above well-configured default but not publication-ready
DE-02: Visual Refinement (4/6) — borderWidth: 0 on bars, subtle 10% opacity grid, generous explicit margins (340px left, 220px right), legend disabled; good refinement
DE-03: Data Storytelling (4/6) — features sorted by absolute SHAP magnitude creates clear visual hierarchy; semantic color coding tells the positive/negative story; reference lines anchor baseline vs prediction; lacks a subtitle or annotation explaining the credit-default context to a non-expert viewer

Spec Compliance (14/15)

SC-01: Plot Type (5/5) — native Highcharts waterfall series with inverted: true for horizontal orientation; connector lines present via lineWidth: 2
SC-02: Required Features (4/4) — features ordered by |SHAP|, cumulative waterfall stacking, signed color encoding, baseline and prediction reference lines with labels, numeric SHAP data labels, horizontal layout with features on y-axis
SC-03: Data Mapping (3/3) — x-axis is probability space, y-axis is features, data flows correctly from E[f(x)]=0.35 to f(x)=0.20
SC-04: Title & Legend (2/3) — title is Credit Default Risk · shap-waterfall · highcharts · anyplot.ai; spec requires shap-waterfall · highcharts · anyplot.ai (extra descriptive prefix deviates from format); legend correctly disabled for single series

Data Quality (15/15)

DQ-01: Feature Coverage (6/6) — shows both positive and negative SHAP values, wide range of magnitudes (±0.18 to ±0.02), baseline and prediction bars, covers all waterfall aspects
DQ-02: Realistic Context (5/5) — credit scoring loan application is a canonical SHAP use case; features (Credit Score, Debt-to-Income, Annual Income, Loan Amount, Employment Years, Payment History…) are authentic credit model features; neutral business domain
DQ-03: Appropriate Scale (4/4) — BASE_VALUE=0.35 (35% default probability), FINAL_VALUE=0.20; SHAP values sum correctly (0.35 + (–0.15) = 0.20 ✓); magnitudes realistic for a credit model
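
The additivity check cited in DQ-03 can be sketched in a few lines of Python. This is an illustration only: the review quotes the baseline (0.35), the prediction (0.20), and the Credit Score and Savings Balance contributions by name; every other per-feature value below is a placeholder chosen to sum to the reported –0.15 and is not taken from the implementation.

```python
import math

BASE_VALUE = 0.35   # E[f(x)], as reported in the review
FINAL_VALUE = 0.20  # f(x), as reported in the review

# Only Credit Score and Savings Balance are quoted by name in the review.
# All other magnitudes are ASSUMED placeholders for illustration.
shap_values = {
    "Credit Score": -0.180,
    "Debt-to-Income": 0.150,
    "Annual Income": -0.120,
    "Loan Amount": 0.100,
    "Employment Years": -0.080,
    "Payment History": -0.060,
    "Open Accounts": 0.050,
    "Credit Inquiries": 0.040,
    "Credit Age": -0.030,
    "Savings Balance": -0.020,
}

# SHAP additivity: baseline plus all contributions equals the prediction.
total = sum(shap_values.values())
additive = math.isclose(BASE_VALUE + total, FINAL_VALUE, abs_tol=1e-9)

# Features ranked by absolute contribution, largest first.
ordered = sorted(shap_values, key=lambda name: abs(shap_values[name]), reverse=True)

print(f"sum of SHAP values: {total:+.3f}, additive: {additive}")
print("order:", ordered)
```

Comparing with a tolerance instead of `==` avoids false failures from floating-point rounding when the contributions are summed.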

Code Quality (9/10)

CQ-01: KISS Structure (2/3) — download_js() helper function defined; KISS requires flat Imports → Data → Plot → Save with no functions/classes
CQ-02: Reproducibility (2/2) — all data is hardcoded; fully deterministic
CQ-03: Clean Imports (2/2) — all imports are used
CQ-04: Code Elegance (2/2) — JSON + placeholder string-replace for JS functions is a pragmatic solution; CDP screenshot is the correct approach for full-resolution 4800×2700 capture; CDN fallback logic is well-structured
CQ-05: Output & API (1/1) — saves plot-{THEME}.png and plot-{THEME}.html; current Highcharts 11 API

Library Mastery (8/10)

LM-01: Idiomatic Usage (4/5) — native waterfall series type, isSum: true for summary bar, inverted: true, plotLines with labels, dataLabels with Highcharts.numberFormat — idiomatic and correct
LM-02: Distinctive Features (4/5) — isSum: true (Highcharts-specific waterfall feature for running totals), highcharts-more.js waterfall module, Page.captureScreenshot CDP command for exact-dimension PNG capture; these features are specific to Highcharts + Selenium combination
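
For readers unfamiliar with the waterfall options named above, here is a minimal sketch of how such a series could be assembled as a plain Python structure before serialization. The color hex codes and the `isSum`, `lineWidth` and `borderWidth` options come from the review; the three feature values and the variable names are assumptions for illustration, not the contents of highcharts.py.

```python
import json

ANCHOR = "#009E73"    # baseline and prediction bars
POSITIVE = "#D55E00"  # contributions that raise the predicted risk
NEGATIVE = "#0072B2"  # contributions that lower the predicted risk

# ASSUMED subset of contributions, for illustration only.
contributions = [
    ("Credit Score", -0.180),
    ("Debt-to-Income", 0.150),
    ("Annual Income", -0.120),
]

# The first point sets the starting level; each later point is a signed delta.
points = [{"name": "E[f(x)] Baseline", "y": 0.35, "color": ANCHOR}]
for name, value in contributions:
    color = POSITIVE if value > 0 else NEGATIVE
    points.append({"name": name, "y": value, "color": color})

# isSum tells Highcharts to draw the running total; no y value is supplied.
points.append({"name": "f(x) Prediction", "isSum": True, "color": ANCHOR})

series = [{"type": "waterfall", "data": points, "lineWidth": 2, "borderWidth": 0}]
print(json.dumps(series, indent=2, ensure_ascii=False))
```

Because the final point carries `isSum` rather than a hard-coded value, the prediction bar stays consistent with the contributions listed above it.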

Score Caps Applied

None — no score caps triggered

Strengths

Native Highcharts waterfall series with isSum: true for the prediction bar — idiomatic and correct
Semantic Okabe-Ito color assignment (green = anchor bars, orange = positive SHAP, blue = negative SHAP) clearly communicates the plot's meaning
Fully theme-adaptive chrome — all text, grid, and background tokens flip correctly between light and dark without any dark-on-dark failures
Realistic credit-scoring domain with internally consistent SHAP values that sum to the correct prediction
CDP screenshot technique delivers precise 4800×2700 PNG without window-size tricks

Weaknesses

Title format must be shap-waterfall · highcharts · anyplot.ai — remove the "Credit Default Risk ·" prefix that precedes the spec-id
download_js() helper function violates KISS; inline the CDN download logic directly
X-axis has 52 tick marks at 0.01 intervals (–0.02 to 0.50) — replace with 0.05 tick intervals via tickInterval: 0.05 to reduce density
Probability-axis max of 0.50 (yAxis.max in Highcharts, drawn horizontally because the chart is inverted) wastes ~30% of canvas width; reduce to ~0.42 to tighten the plot area
Reference line annotation font size is 16px; raise to 18px to meet minimum legibility standard
DE: no subtitle or contextual framing for non-expert viewers; consider adding a subtitle like "Single loan applicant — features sorted by |SHAP value|"

Issues Found

SC-04 TITLE FORMAT: Credit Default Risk · shap-waterfall · highcharts · anyplot.ai should be shap-waterfall · highcharts · anyplot.ai
- Fix: Remove the Credit Default Risk · prefix from the title text
CQ-01 FUNCTION: download_js() helper function violates KISS structure
- Fix: Inline the CDN download loop directly where highcharts_js and highcharts_more_js are assigned
VQ-02 / VQ-05 AXIS: X-axis from –0.02 to 0.50 at 0.01 intervals creates 52 ticks and wastes right-side canvas
- Fix: Set "tickInterval": 0.05 on yAxis (note: inverted chart, so probability is yAxis in Highcharts before inversion) and "max": 0.42
VQ-01 ANNOTATION SIZE: Reference line labels at 16px are below the 18px floor
- Fix: Change "fontSize": "16px" in plotLines label styles to "fontSize": "18px"
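
Taken together, the four fixes might look like the options fragment below. It is a hedged sketch, not the actual highcharts.py: the option names (`tickInterval`, `max`, `plotLines`, `dashStyle`, `inverted`) are standard Highcharts options, while the label texts and the overall dict layout are assumptions. The CQ-01 fix (inlining the download loop) is structural and is not shown.

```python
import json

# ASSUMED label texts for the two reference lines; values come from the review.
reference_lines = [(0.35, "Baseline 0.35"), (0.20, "Prediction 0.20")]

options = {
    "chart": {"type": "waterfall", "inverted": True},
    # SC-04: title must be exactly {spec-id} · {library} · anyplot.ai
    "title": {"text": "shap-waterfall · highcharts · anyplot.ai"},
    # In an inverted chart the Highcharts yAxis is drawn horizontally.
    "yAxis": {
        "title": {"text": "Probability of Default"},
        "tickInterval": 0.05,  # VQ-02: fewer, wider-spaced ticks
        "max": 0.42,           # VQ-05: trim empty space right of the baseline
        "plotLines": [
            {
                "value": value,
                "dashStyle": "Dash",
                # VQ-01: annotation labels at the 18px minimum
                "label": {"text": text, "style": {"fontSize": "18px"}},
            }
            for value, text in reference_lines
        ],
    },
}

print(json.dumps(options, indent=2, ensure_ascii=False))
```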

AI Feedback for Next Attempt

Fix four concrete issues: (1) title must be exactly shap-waterfall · highcharts · anyplot.ai; (2) inline the JS download loop — no helper functions; (3) set tickInterval: 0.05 and max: 0.42 on the probability axis to remove tick density and wasted right-side space; (4) raise plotLine annotation font size from 16px to 18px. For design improvement, consider adding a subtitle that frames the scenario (e.g., "Individual loan applicant — features ranked by |SHAP value|") and explore slightly larger bar heights (reduce groupPadding) to make the waterfall fill the vertical space more prominently.

Verdict: REJECTED

github-actions · 2026-05-07T11:58:09Z

🔧 Repair Attempt 1/4

Applied fixes based on AI review feedback.

Status: Repair completed, re-triggering review...

🤖 impl-repair

claude · 2026-05-07T12:04:54Z

AI Review - Attempt 2/3

Image Description

Light render (plot-light.png): The chart renders on a warm off-white #FAF8F1 background. A horizontal SHAP waterfall chart is displayed with 12 rows: baseline bar (green #009E73), 10 feature bars (orange #D55E00 for positive SHAP, blue #0072B2 for negative SHAP), and a final prediction bar (green #009E73). Feature names are shown on the left y-axis (E[f(x)] Baseline, Credit Score, Debt-to-Income, Annual Income, Loan Amount, Employment Years, Payment History, Open Accounts, Credit Inquiries, Credit Age, Savings Balance, f(x) Prediction). The x-axis is labeled "Probability of Default" with tick values from -0.05 to 0.45. SHAP value labels (e.g., -0.180, +0.150, -0.120) are displayed beside each bar. Two reference lines mark Baseline 0.35 (dashed) and Prediction 0.20 (dotted). All title, axis label, tick label, and data label text is clearly readable as dark ink on the light surface. Legibility verdict: PASS.

Dark render (plot-dark.png): The same chart renders on a warm near-black #1A1A17 background. Layout, data, and structure are identical to the light render. Data bar colors are unchanged — green baseline/prediction, orange positive, blue negative — confirming Okabe-Ito positions 1–3 are theme-invariant. Chrome elements flip correctly: feature labels appear as light gray #B8B7B0 against the dark background, axis titles are light #F0EFE8, and grid lines are very subtle. The reference line labels ("Prediction 0.20" in brand green, "Baseline 0.35" in soft gray) are legible. No dark-on-dark text failures detected. Legibility verdict: PASS.

Both paragraphs are required. A review that only describes one render is invalid.

Score: 86/100

Category	Score	Max
Visual Quality	28	30
Design Excellence	13	20
Spec Compliance	14	15
Data Quality	15	15
Code Quality	9	10
Library Mastery	7	10
Total	86	100

Visual Quality (28/30)

VQ-01: Text Legibility (7/8) — All sizes explicitly set (title 28px, axis 22px, ticks 18-20px, data labels 18px); data labels slightly small at this canvas size, 20-22px would be better
VQ-02: No Overlap (6/6) — No collisions in either render
VQ-03: Element Visibility (6/6) — All bars sized and padded well, clearly distinct
VQ-04: Color Accessibility (2/2) — Orange/blue are high-contrast and CVD-safe; green reference bars unambiguous
VQ-05: Layout & Canvas (3/4) — Chart fills canvas well, but yAxis.max: 0.50 extends 15+ percentage points beyond the Baseline 0.35 reference line, wasting canvas on the right
VQ-06: Axis Labels & Title (2/2) — "Probability of Default" (x-axis) and "Feature" (y-axis) are descriptive
VQ-07: Palette Compliance (2/2) — Okabe-Ito positions 1-3 used correctly; backgrounds #FAF8F1/#1A1A17; chrome tokens applied to all elements in both themes

Design Excellence (13/20)

DE-01: Aesthetic Sophistication (5/8) — Above defaults: semantic color coding for positive/negative direction, brand green for reference bars, clean fontFamily, left-aligned title with explicit positioning; not yet publication-level
DE-02: Visual Refinement (4/6) — borderWidth: 0 removes bar outlines, legend disabled, tooltip disabled, subtle GRID token applied; top/right axes still present (Highcharts default frame)
DE-03: Data Storytelling (4/6) — Color instantly signals contribution direction (warm orange = risk up, cool blue = risk down, green = anchor points); magnitude-ordered features guide the eye to highest-impact variables; clear visual flow from baseline to prediction

Spec Compliance (14/15)

SC-01: Plot Type (5/5) — Native Highcharts waterfall type with inverted: true — correct horizontal waterfall
SC-02: Required Features (4/4) — Cumulative bars, positive/negative color coding, base value bar, prediction bar, numeric SHAP labels, horizontal layout, reference lines, features sorted by |SHAP| magnitude, native waterfall connector lines
SC-03: Data Mapping (3/3) — Features on y-axis, probability of default on x-axis, all 10 features shown
SC-04: Title & Legend (2/3) — Title is "Credit Default Risk · shap-waterfall · highcharts · anyplot.ai" — the "Credit Default Risk · " prefix is non-standard; required format is {spec-id} · {library} · anyplot.ai = "shap-waterfall · highcharts · anyplot.ai"
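
The required format in SC-04 is mechanical enough to derive from constants instead of typing it by hand. A small sketch follows; the constant names are hypothetical and not taken from the implementation.

```python
SPEC_ID = "shap-waterfall"  # hypothetical constant names, for illustration
LIBRARY = "highcharts"

# Required format from the review: {spec-id} · {library} · anyplot.ai
expected_title = f"{SPEC_ID} · {LIBRARY} · anyplot.ai"

# Title as rendered in both review attempts.
current_title = "Credit Default Risk · shap-waterfall · highcharts · anyplot.ai"

title_ok = current_title == expected_title
print(expected_title, "| current title compliant:", title_ok)
```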

Data Quality (15/15)

DQ-01: Feature Coverage (6/6) — Both positive (Debt-to-Income, Loan Amount, Open Accounts, Credit Inquiries) and negative (Credit Score, Annual Income, Employment Years, Payment History, Credit Age, Savings Balance) contributors shown; baseline and prediction bookend bars complete the picture
DQ-02: Realistic Context (5/5) — Credit default risk model for a single loan application; all feature names are domain-appropriate; neutral business scenario with no controversial content
DQ-03: Appropriate Scale (4/4) — Base value 0.35 (35% average default probability) realistic for a credit model; SHAP values ranging ±0.18 appropriate for probability-scale output; final prediction 0.20 is plausible

Code Quality (9/10)

CQ-01: KISS Structure (2/3) — download_js() helper function defined; KISS requires flat script with no functions/classes
CQ-02: Reproducibility (2/2) — All data hardcoded; fully deterministic
CQ-03: Clean Imports (2/2) — All 9 imports are used
CQ-04: Code Elegance (2/2) — JS formatter injection via string-replace is pragmatic for Highcharts Python integration; multi-CDN fallback is justified; no fake UI
CQ-05: Output & API (1/1) — Saves plot-{THEME}.png and plot-{THEME}.html correctly

Library Mastery (7/10)

LM-01: Idiomatic Usage (3/5) — Uses native Highcharts waterfall type correctly; however, bypasses highcharts_core Python API entirely in favour of raw JSON + string manipulation — the Python library's Chart/HighchartsOptions API is not used
LM-02: Distinctive Features (4/5) — isSum: true for the prediction total bar (Highcharts-specific waterfall feature), inverted: true for horizontal orientation, plotLines with styled labels for reference lines, CDP-based full-resolution screenshot capture — all Highcharts-distinctive

Score Caps Applied

None

Strengths

Native Highcharts waterfall chart type with isSum flag correctly models the cumulative SHAP attribution chain from baseline to final prediction
Semantic Okabe-Ito color assignment (orange = positive/risk-up, blue = negative/risk-down, green = anchors) creates an immediate, intuitive visual narrative that needs no legend
Complete theme-adaptive chrome: all INK, INK_SOFT, GRID, and PAGE_BG tokens applied to every axis, label, and background element in both renders
Multi-CDN fallback download logic with retry ensures robust CI execution even when a CDN is unreachable
Dual plotLines (baseline dashed, prediction dotted) with branded label styling fulfil the spec's "labeled reference lines" requirement

Weaknesses

Title prefix "Credit Default Risk · " is non-standard; must be exactly shap-waterfall · highcharts · anyplot.ai
yAxis.max: 0.50 wastes ~30% of canvas width to the right of the Baseline 0.35 line; trim to ~0.40–0.42
Data label fontSize 18px is slightly small for a 4800×2700 canvas; 20–22px would improve legibility
download_js() helper function violates KISS; inline the CDN logic or simplify to a single URL
Raw JSON construction bypasses highcharts_core Python API; LM-01 capped at 3/5

Issues Found

SC-04 TITLE: "Credit Default Risk · shap-waterfall · highcharts · anyplot.ai" has a non-standard prefix
- Fix: Change title text to "shap-waterfall · highcharts · anyplot.ai"
VQ-05 CANVAS: "max": 0.50 on yAxis extends well past data range, leaving large empty area right of Baseline line
- Fix: Set "max": 0.42 to reduce wasted space while preserving the reference lines

AI Feedback for Next Attempt

Fix the two concrete issues: (1) Strip the "Credit Default Risk · " prefix from the title — use exactly "shap-waterfall · highcharts · anyplot.ai". (2) Lower yAxis max from 0.50 to ~0.42 so the chart fills the canvas more efficiently. Optionally bump data label fontSize from 18px to 20-22px and inline the CDN download logic to remove the helper function.

Verdict: APPROVED

## Summary

First run of `auto-update-pr-branches.yml` after #5957 found 0 BEHIND PRs even though three were stuck behind main (#5916, #5870, #5902). Two issues:

1. **Timing.** The workflow runs ~4s after the push to main, but GitHub recomputes `mergeStateStatus` and the cached PR head SHA asynchronously. Right after the push the field is still UNKNOWN and the cached head can be stale → `update-branch` returns *expected head sha didn't match current head ref*. Add a 30s sleep at the start.
2. **Over-strict filter.** The script only iterated PRs where `mergeStateStatus == "BEHIND"`, skipping UNKNOWN candidates, which are exactly the ones we wanted to fix. Drop the filter: after a push to main, every open auto-merge PR is behind, and `update-branch` is a no-op when the head is already up-to-date.

Also:

- Bump permissions to `contents: write` (update-branch creates a merge commit on the head ref).
- Drop `--silent` and capture stderr so the actual GitHub error lands in the log.

Verified manually: calling `PUT /pulls/{num}/update-branch` from the CLI on #5916 and #5870 worked and they auto-merged within seconds. The 422 on #5902 was a real history-divergence conflict (4 ahead / 58 behind / merge_base differs), which is a separate problem.

## Test plan

- [ ] After this merges, push something to main and confirm the workflow finds N>0 PRs (where N is open auto-merge PRs).
- [ ] Confirm any genuinely stuck PR (conflict) gets a clear error in the log instead of `likely conflict or stale ref`.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ap-waterfall

github-actions Bot mentioned this pull request May 7, 2026

[shap-waterfall] SHAP Waterfall Plot for Feature Attribution #5237

Open

github-actions Bot added the quality:86 (Quality score 86/100), ai-rejected (Quality not OK, triggers update), and ai-attempt-1 (First repair attempt) labels and removed the ai-rejected (Quality not OK, triggers update) label May 7, 2026

github-actions Bot added the ai-approved (Quality OK, ready for merge) label May 7, 2026

MarkusNeusinger enabled auto-merge (squash) May 7, 2026 20:18

MarkusNeusinger mentioned this pull request May 7, 2026

ci(auto-update): wait for state to settle and surface real errors #5958

Merged

2 tasks

MarkusNeusinger force-pushed the implementation/shap-waterfall/highcharts branch from e479b98 to 221519c Compare May 7, 2026 20:24

github-actions Bot added 3 commits May 7, 2026 22:24

chore(highcharts): add metadata for shap-waterfall

6d17b71

chore(highcharts): update quality score 86 and review feedback for sh…

7c29aae

…ap-waterfall

chore(highcharts): update quality score 86 and review feedback for sh…

221519c

…ap-waterfall

MarkusNeusinger merged commit 2a2c9d8 into main May 7, 2026
6 checks passed

MarkusNeusinger deleted the implementation/shap-waterfall/highcharts branch May 7, 2026 20:25

feat(highcharts): implement shap-waterfall #5902
MarkusNeusinger merged 3 commits into main from implementation/shap-waterfall/highcharts
