feat(matplotlib): implement pp-basic by github-actions[bot] · Pull Request #4869 · MarkusNeusinger/anyplot

github-actions · 2026-03-15T20:16:50Z

Implementation: `pp-basic` - matplotlib

Implements the matplotlib version of pp-basic.

File: plots/pp-basic/implementations/matplotlib.py

Parent Issue: #4587

🤖 impl-generate workflow

claude · 2026-03-15T20:20:28Z

AI Review - Attempt 1/3

Image Description

The plot displays a P-P (Probability-Probability) plot with 200 blue scatter points (#306998, Python Blue) with white edges (alpha=0.7), plotted against a gray dashed 45-degree reference line. The x-axis is labeled "Theoretical Cumulative Probability" and the y-axis "Empirical Cumulative Probability", both ranging from 0.0 to 1.0. The title reads "pp-basic · matplotlib · pyplots.ai" in medium weight. The plot uses a square aspect ratio (12×12 figure). Top and right spines are removed. Subtle grid lines (alpha=0.2) appear on both axes. The data points follow the diagonal closely but show a slight S-shaped deviation in the upper range (around 0.7–0.9), consistent with the mixture of two normal distributions used to generate the data.

Score: 75/100

Category	Score	Max
Visual Quality	25	30
Design Excellence	9	20
Spec Compliance	15	15
Data Quality	12	15
Code Quality	9	10
Library Mastery	5	10
Total	75	100

Visual Quality (25/30)

VQ-01: Text Legibility (7/8) — Font sizes explicitly set (title 24, labels 20, ticks 16). All text clearly readable.
VQ-02: No Overlap (5/6) — Some scatter point overlap in dense regions (lower-left, upper-right), inherent to data density but exacerbated by oversized markers.
VQ-03: Element Visibility (4/6) — Markers at s=180 are too large for 200 data points. Guidelines recommend s=50–100 for 100–300 points. Oversized markers cause unnecessary visual clutter and obscure the reference line.
VQ-04: Color Accessibility (4/4) — Single-series Python Blue, colorblind-safe, good contrast.
VQ-05: Layout & Canvas (3/4) — Square aspect ratio is correct per spec. 12×12 at 300dpi = 3600×3600 matching allowed square format. Good utilization.
VQ-06: Axis Labels & Title (2/2) — Descriptive labels, probability is unitless. Title follows required format.

Design Excellence (9/20)

DE-01: Aesthetic Sophistication (4/8) — Well-configured default with Python Blue and white edges. Clean but not exceptional. No custom palette needed for single series.
DE-02: Visual Refinement (3/6) — Top/right spines removed, subtle grid on both axes. Some refinement visible beyond defaults.
DE-03: Data Storytelling (2/6) — Data shows deviation from diagonal but no visual emphasis or hierarchy draws attention to the deviation pattern. Viewer must find the insight themselves.

Spec Compliance (15/15)

SC-01: Plot Type (5/5) — Correct P-P plot comparing empirical vs theoretical CDF.
SC-02: Required Features (4/4) — 45-degree reference line, both axes 0–1, square aspect ratio, empirical CDF computed as i/(n+1), theoretical CDF evaluated against fitted normal.
SC-03: Data Mapping (3/3) — Theoretical CDF on x-axis, empirical on y-axis, all data visible.
SC-04: Title & Legend (3/3) — Title follows {spec-id} · {library} · pyplots.ai format. No legend needed for single series.
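
The plotting positions named in SC-02 can be sketched as follows. This is a minimal illustration, not the code under review: the helper name `pp_coordinates`, the use of the sample standard deviation (`ddof=1`) for the fit, and the standard-normal test sample are all assumptions.

```python
import math

import numpy as np


def pp_coordinates(sample):
    """Return (theoretical, empirical) cumulative probabilities for a normal P-P plot."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    # Empirical CDF with the i/(n+1) plotting position, i = 1..n
    empirical = np.arange(1, n + 1) / (n + 1)
    # Fit a normal distribution (assumption: sample mean and ddof=1 standard deviation)
    mu, sigma = x.mean(), x.std(ddof=1)
    # Normal CDF: 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))
    z = (x - mu) / (sigma * math.sqrt(2.0))
    theoretical = np.array([0.5 * (1.0 + math.erf(v)) for v in z])
    return theoretical, empirical


rng = np.random.default_rng(42)
theo, emp = pp_coordinates(rng.normal(size=200))
```

Plotting `theo` on the x-axis against `emp` on the y-axis gives the mapping described in SC-03. The i/(n+1) position keeps every empirical value strictly between 0 and 1.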

Data Quality (12/15)

DQ-01: Feature Coverage (5/6) — Mixture distribution creates visible deviation from diagonal, demonstrating the diagnostic purpose. Could be slightly more dramatic to better illustrate the S-shaped pattern mentioned in the spec.
DQ-02: Realistic Context (3/5) — Data is plausible (mixture of normals simulating contaminated sample) but generic — no real-world context or domain framing beyond abstract "observed" data.
DQ-03: Appropriate Scale (4/4) — 200 data points within spec's 50–500 range. Sensible parameter values.

Code Quality (9/10)

CQ-01: KISS Structure (3/3) — Linear script: imports → data → plot → save. No functions or classes.
CQ-02: Reproducibility (2/2) — np.random.seed(42) set.
CQ-03: Clean Imports (2/2) — All imports used (matplotlib.pyplot, numpy, math.erfc, math.sqrt).
CQ-04: Code Elegance (1/2) — Using from math import erfc, sqrt with a list comprehension for the CDF calculation is slightly inelegant. A vectorized approach using scipy.stats.norm.cdf or scipy.special.erf (NumPy itself does not provide erf) would be cleaner and more Pythonic.
CQ-05: Output & API (1/1) — Saves as plot.png with dpi=300, bbox_inches='tight'. Current API.

Library Mastery (5/10)

LM-01: Idiomatic Usage (4/5) — Uses ax methods correctly, fig/ax pattern, spine manipulation, grid styling, tight_layout. Good idiomatic matplotlib.
LM-02: Distinctive Features (1/5) — Generic scatter + line plot. No distinctive matplotlib features leveraged (e.g., custom transforms, path effects, fill_between for confidence bands).

Score Caps Applied

None

Strengths

Perfect spec compliance — all required P-P plot features present with correct data mapping
Clean, readable code structure following KISS principle
Appropriate use of square aspect ratio as specified
Good color choice with Python Blue and white marker edges

Weaknesses

Marker size (s=180) is too large for 200 data points, causing visual clutter and overlap
No data storytelling — the S-shaped deviation pattern is present but not emphasized
Generic data context — no real-world domain framing
Manual CDF calculation using math.erfc instead of vectorized numpy/scipy approach
No distinctive matplotlib features used

Issues Found

VQ-03 MEDIUM: Markers oversized at s=180 for 200 points
- Fix: Reduce to s=60–80 with alpha=0.6 for better density visualization
DE-03 LOW: No visual hierarchy or storytelling
- Fix: Consider using fill_between to shade the area between the points and the diagonal, or use a color gradient along the diagonal to highlight deviation regions
DQ-02 MEDIUM: Generic data context
- Fix: Frame data in a real-world context (e.g., manufacturing quality measurements, lab test results)
LM-02 LOW: No distinctive matplotlib features
- Fix: Leverage features like fill_between for confidence bands, path effects, or custom tick formatting
CQ-04 LOW: Non-vectorized CDF calculation
- Fix: Use scipy.stats.norm.cdf or numpy's vectorized operations instead of list comprehension with math.erfc
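
The CQ-04 fix might look roughly like the sketch below, which compares a vectorized normal CDF with the list-comprehension form. The function names are hypothetical, and the fallback branch is an assumption added so the snippet runs without SciPy. `np.vectorize` there is only a convenience wrapper around a Python loop, not true vectorization.

```python
import math

import numpy as np

try:
    from scipy.stats import norm  # optional dependency

    def normal_cdf(x, mu, sigma):
        # Vectorized normal CDF via SciPy
        return norm.cdf(x, loc=mu, scale=sigma)

except ImportError:
    # Fallback when SciPy is unavailable: wraps math.erf element by element
    _erf = np.vectorize(math.erf, otypes=[float])

    def normal_cdf(x, mu, sigma):
        z = (np.asarray(x, dtype=float) - mu) / (sigma * math.sqrt(2.0))
        return 0.5 * (1.0 + _erf(z))


def normal_cdf_loop(xs, mu, sigma):
    # List-comprehension form using erfc: Phi(z) = 0.5 * erfc(-z / sqrt(2))
    return [0.5 * math.erfc(-(x - mu) / (sigma * math.sqrt(2.0))) for x in xs]


xs = np.linspace(-3.0, 3.0, 7)
vectorized = normal_cdf(xs, 0.0, 1.0)
looped = normal_cdf_loop(xs, 0.0, 1.0)
```

Both forms should agree to floating-point precision, so the change affects readability and speed, not the plotted values.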

AI Feedback for Next Attempt

Reduce marker size to s=60–80 for better density handling. Add visual storytelling through fill_between or color encoding to highlight deviation from the diagonal. Frame the data in a real-world context (e.g., manufacturing QC measurements vs. normal specification). Use vectorized CDF computation. Consider adding a subtle confidence band around the reference line to enhance both design excellence and library mastery.

Verdict: REJECTED

github-actions · 2026-03-15T20:31:16Z

🔧 Repair Attempt 1/3

Applied fixes based on AI review feedback.

Status: Repair completed, re-triggering review...

🤖 impl-repair

claude · 2026-03-15T20:34:17Z

AI Review - Attempt 2/3

Image Description

The plot displays a P-P (Probability-Probability) plot on a square canvas with both axes ranging from 0 to 1. The x-axis is labeled "Theoretical Cumulative Probability (Normal)" and the y-axis "Empirical Cumulative Probability." The title reads "pp-basic · matplotlib · pyplots.ai" with an italic subtitle "Bolt tensile strength (MPa) vs. normal distribution — quality control diagnostic." Data points are colored in two tones: Python Blue (#306998) for points near the diagonal and coral red (#C44E52) for points deviating more than 0.03 from the reference line. A dashed gray 45-degree reference line with a subtle stroke path effect runs diagonally. A light blue 95% confidence band surrounds the reference line. An annotation with a curved arrow points to the deviation region, reading "Heavier upper tail (secondary supplier batch)." The legend in the lower right shows "95% confidence band" and "Perfect normal fit." Top and right spines are removed; remaining spines are thinned and colored gray. Both major and minor grid lines are present at very low alpha. The S-shaped departure from the diagonal in the upper portion is clearly visible, telling the story of a mixture distribution.
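
The two-tone coloring and confidence band described above could be reproduced roughly as in this sketch. Only the colors, the 0.03 threshold, s=70, alpha=0.65 and the legend labels come from the review. The placeholder data, the use of vertical distance for the deviation, and the pointwise binomial formula for the band are assumptions; the review does not say how the implementation computes either.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
n = 200
empirical = np.arange(1, n + 1) / (n + 1)
# Placeholder theoretical probabilities (assumption): sorted uniform draws
theoretical = np.sort(rng.uniform(size=n))

# Assumption: deviation measured as vertical distance from the diagonal
far = np.abs(empirical - theoretical) > 0.03
colors = ["#C44E52" if flag else "#306998" for flag in far]

fig, ax = plt.subplots(figsize=(6, 6))
p = np.linspace(0.0, 1.0, 200)
# Assumption: pointwise 95% band from the binomial standard error of an ECDF
half_width = 1.96 * np.sqrt(p * (1.0 - p) / n)
ax.fill_between(
    p,
    np.clip(p - half_width, 0.0, 1.0),
    np.clip(p + half_width, 0.0, 1.0),
    color="#306998",
    alpha=0.12,
    label="95% confidence band",
)
ax.plot([0, 1], [0, 1], linestyle="--", color="gray", label="Perfect normal fit")
ax.scatter(theoretical, empirical, s=70, c=colors, alpha=0.65, edgecolors="white")
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
ax.set_aspect("equal")
ax.legend(loc="lower right", fontsize=14)
```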

Score: 90/100

Category	Score	Max
Visual Quality	28	30
Design Excellence	16	20
Spec Compliance	15	15
Data Quality	15	15
Code Quality	9	10
Library Mastery	7	10
Total	90	100

Visual Quality (28/30)

VQ-01: Text Legibility (8/8) — All font sizes explicitly set: title 24pt, axis labels 20pt, ticks 16pt, legend 14pt, annotation 14pt
VQ-02: No Overlap (6/6) — No overlapping text; annotation well-positioned away from data
VQ-03: Element Visibility (5/6) — s=70 for 200 points is within guideline range (50-100) but on the smaller side; alpha=0.65 appropriate
VQ-04: Color Accessibility (3/4) — Blue vs coral/red is generally colorblind-safe but not optimal for all types of color vision deficiency
VQ-05: Layout & Canvas (4/4) — Square aspect ratio well-utilized, balanced margins, nothing cut off
VQ-06: Axis Labels & Title (2/2) — Descriptive labels with distribution context specified

Design Excellence (16/20)

DE-01: Aesthetic Sophistication (6/8) — Strong design: custom two-color scheme encoding deviation magnitude, path effects on reference line for visual depth, confidence band, italic domain subtitle. Clearly above defaults.
DE-02: Visual Refinement (5/6) — Top/right spines removed, remaining spines thinned and colored #555, subtle major+minor grid at low alpha, generous whitespace. Very polished.
DE-03: Data Storytelling (5/6) — Excellent storytelling: color-coded deviations immediately highlight distributional departure, annotation explains the cause (secondary supplier batch), subtitle provides domain context. Viewer immediately understands the insight.

Spec Compliance (15/15)

SC-01: Plot Type (5/5) — Correct P-P plot: empirical CDF vs theoretical CDF
SC-02: Required Features (4/4) — All spec features: 45-degree reference line, square aspect ratio, sorted data with i/(n+1) plotting position, theoretical CDF evaluation, visible S-shaped deviation
SC-03: Data Mapping (3/3) — X=theoretical CDF, Y=empirical CDF, both axes 0-1
SC-04: Title & Legend (3/3) — Title format "pp-basic · matplotlib · pyplots.ai" correct; legend labels meaningful and accurate

Data Quality (15/15)

DQ-01: Feature Coverage (6/6) — Mixture distribution creates clear S-shaped departure from diagonal, demonstrating exactly what P-P plots detect. Shows both conforming and deviating regions.
DQ-02: Realistic Context (5/5) — Bolt tensile strength (MPa) for quality control is a real, neutral engineering scenario directly matching spec applications
DQ-03: Appropriate Scale (4/4) — 840/910 MPa with σ=28-35 are realistic values for steel bolt tensile strength
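
A contaminated sample of the kind described in the Data Quality items might be generated as in this sketch. The review gives only "840/910 MPa with σ=28-35", 200 points, and seed 42. Reading 840 and 910 as component means, pairing them with σ=28 and σ=35 respectively, and the 80/20 split between batches are all assumptions.

```python
import numpy as np

np.random.seed(42)

n_total = 200
n_secondary = 40  # hypothetical 20% share for the secondary supplier batch

# Assumed main batch: mean 840 MPa, sigma 28 MPa
main = np.random.normal(840.0, 28.0, n_total - n_secondary)
# Assumed secondary batch: mean 910 MPa, sigma 35 MPa
secondary = np.random.normal(910.0, 35.0, n_secondary)

strength_mpa = np.concatenate([main, secondary])
```

A minority component with a higher mean thickens the upper tail relative to a single fitted normal, which is the kind of departure the annotation in the image description points at.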

Code Quality (9/10)

CQ-01: KISS Structure (2/3) — Linear flow, but the sys.path manipulation adds boilerplate complexity; the vectorized normal_cdf is a borderline function definition
CQ-02: Reproducibility (2/2) — np.random.seed(42) set
CQ-03: Clean Imports (2/2) — All imports used: os/sys for path hack, math for erf/sqrt, patheffects, pyplot, ticker, numpy
CQ-04: Code Elegance (2/2) — Clean, appropriate complexity; no fake UI or over-engineering
CQ-05: Output & API (1/1) — Saves as plot.png, dpi=300, bbox_inches='tight'

Library Mastery (7/10)

LM-01: Idiomatic Usage (4/5) — Consistently uses Axes methods, mticker for tick control, fill_between for confidence band. Solid idiomatic matplotlib.
LM-02: Distinctive Features (3/5) — Uses matplotlib.patheffects (pe.Stroke + pe.Normal) for reference line depth, mticker.MultipleLocator for precise tick placement — these are distinctive matplotlib features
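
The path-effect and tick-locator usage credited under LM-02 can be sketched as follows. The line widths, the white stroke color, and the 0.2/0.1 tick spacing are assumptions for illustration. Only the use of `pe.Stroke`, `pe.Normal`, and `mticker.MultipleLocator` comes from the review.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.patheffects as pe
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker

fig, ax = plt.subplots(figsize=(6, 6))

# Dashed reference line with a wider stroke drawn underneath for visual depth
(ref_line,) = ax.plot(
    [0, 1],
    [0, 1],
    linestyle="--",
    color="gray",
    linewidth=2,
    path_effects=[pe.Stroke(linewidth=4, foreground="white"), pe.Normal()],
)

# Major and minor ticks at fixed multiples (spacing values are assumptions)
for axis in (ax.xaxis, ax.yaxis):
    axis.set_major_locator(mticker.MultipleLocator(0.2))
    axis.set_minor_locator(mticker.MultipleLocator(0.1))

ax.grid(which="major", alpha=0.2)
ax.grid(which="minor", alpha=0.1)
for side in ("top", "right"):
    ax.spines[side].set_visible(False)
ax.set_xlim(0, 1)
ax.set_ylim(0, 1)
```

`pe.Stroke` draws the outline pass and `pe.Normal` then redraws the line with its regular style on top, so the order of the two effects matters.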

Score Caps Applied

None — no caps triggered

Strengths

Excellent data storytelling through color-coded deviations and domain-contextual annotation
Strong visual refinement: path effects, thinned/colored spines, dual-level grid, confidence band
Perfect spec compliance with all required P-P plot features including square aspect ratio and i/(n+1) formula
Realistic manufacturing quality control scenario with plausible bolt tensile strength values
95% confidence band adds genuine statistical value beyond spec requirements

Weaknesses

Marker size (s=70) could be slightly larger for better visibility at 200 points
Blue/red color pairing could be improved for deuteranopia accessibility

Issues Found

None critical

AI Feedback for Next Attempt

N/A — implementation meets quality threshold.

Verdict: APPROVED

github-actions bot added 2 commits March 15, 2026 20:16

feat(matplotlib): implement pp-basic

2b49d03

chore(matplotlib): add metadata for pp-basic

1c1c523

github-actions bot mentioned this pull request Mar 15, 2026

[pp-basic] Probability-Probability (P-P) Plot #4587

Closed

github-actions bot added the quality:75 (Quality score 75/100) and ai-rejected (Quality not OK, triggers update) labels Mar 15, 2026

chore(matplotlib): update quality score 75 and review feedback for pp…

7c8fe6d

…-basic

github-actions bot added the ai-attempt-1 (First repair attempt) label and removed the ai-rejected (Quality not OK, triggers update) label Mar 15, 2026

fix(matplotlib): address review feedback for pp-basic

39be42c

Attempt 1/3 - fixes based on AI review

github-actions bot added the quality:90 (Quality score 90/100) and ai-approved (Quality OK, ready for merge) labels Mar 15, 2026

github-actions bot and others added 2 commits March 15, 2026 20:35

chore(matplotlib): update quality score 90 and review feedback for pp…

dc708db

…-basic

Merge branch 'main' into implementation/pp-basic/matplotlib

a2fe314

github-actions bot merged commit 002adfa into main Mar 15, 2026

github-actions bot deleted the implementation/pp-basic/matplotlib branch March 15, 2026 20:35

github-actions[bot] merged 6 commits into main from implementation/pp-basic/matplotlib