feat(seaborn): implement line-win-probability #5090

Merged
github-actions[bot] merged 6 commits into main from implementation/line-win-probability/seaborn on Mar 20, 2026

@github-actions
Contributor

Implementation: line-win-probability - seaborn

Implements the seaborn version of line-win-probability.

File: plots/line-win-probability/implementations/seaborn.py

Parent Issue: #4418


🤖 impl-generate workflow

@claude
Contributor

claude Bot commented Mar 20, 2026

AI Review - Attempt 1/3

Image Description

The plot shows a win probability line chart for an NFL-style football game. A blue line (#306998) tracks home team win probability from play 0 to 120 against a white background. The area above the 50% dashed baseline is filled with light blue (Home), and below with light salmon/red (Away). Five key scoring events are annotated with bold colored text and connecting dots: "TD Away 7-3" (red, around play 22), "TD Home 10-7" (blue, around play 35), "TD Home 20-14" (blue, around play 65), "TD Away 23-27" (red, around play 95), and "TD Home 30-27" (blue, around play 105). Quarter markers (Q1-Q4) appear as subtle dotted vertical lines with labels at the bottom. The title reads "line-win-probability · seaborn · pyplots.ai". Y-axis shows "Home Win Probability" with percentage labels (0%-100%), X-axis shows "Play Number". A legend in the upper left identifies Home (blue) and Away (red). The final score "Home 30 – Away 27" is displayed in italic gray text at the bottom right. Top and right spines are removed, with a subtle y-axis grid.

Score: 83/100

Category             Score   Max
Visual Quality          27    30
Design Excellence       13    20
Spec Compliance         15    15
Data Quality            14    15
Code Quality            10    10
Library Mastery          4    10
Total                   83   100

Visual Quality (27/30)

  • VQ-01: Text Legibility (7/8) — Font sizes explicitly set (title 24, labels 20, ticks 16). Annotation text at 11pt is slightly small but still readable.
  • VQ-02: No Overlap (6/6) — All annotations well-positioned with offsets, no overlapping text.
  • VQ-03: Element Visibility (5/6) — Line clearly visible at 2.5 width, fills at alpha 0.25 are effective. Scatter markers at s=80 could be slightly larger for emphasis.
  • VQ-04: Color Accessibility (4/4) — Blue (#306998) vs salmon-red (#D4583B) provides good contrast and is colorblind-distinguishable.
  • VQ-05: Layout & Canvas (3/4) — Good overall layout. The final score text positioned at y=-0.06 falls slightly outside the axis area, creating minor imbalance at the bottom.
  • VQ-06: Axis Labels & Title (2/2) — "Play Number" and "Home Win Probability" are descriptive; y-axis shows percentage format.

Design Excellence (13/20)

  • DE-01: Aesthetic Sophistication (5/8) — Custom two-color palette, clean typography, frameless legend, quarter markers, annotated scores — clearly above defaults but not FiveThirtyEight-level polish. Missing: team name branding, subtitle, or more sophisticated typography hierarchy.
  • DE-02: Visual Refinement (4/6) — Top/right spines removed, subtle y-grid (alpha=0.15), quarter dividers are appropriately subtle. Good refinement but could push further with background color or more polished annotation styling.
  • DE-03: Data Storytelling (4/6) — Clear narrative arc: close game with lead changes, dramatic late-game swing, and final score. Color fills immediately convey momentum. Annotations highlight key turning points with running scores.

Spec Compliance (15/15)

  • SC-01: Plot Type (5/5) — Correct win probability line chart.
  • SC-02: Required Features (4/4) — All spec features present: 0-100% y-axis, 50% reference line, team-colored fill above/below 50%, annotated scoring events, final score display, quarter/period markers.
  • SC-03: Data Mapping (3/3) — X=play number, Y=win probability. Axes show full range (0-120 plays, 0%-100%).
  • SC-04: Title & Legend (3/3) — Title format "line-win-probability · seaborn · pyplots.ai" correct. Legend labels "Home"/"Away" match the data.
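
The reference line and two-sided fill credited under SC-02 can be sketched roughly as follows. The PR's actual code is not shown in this conversation, so this is only a minimal illustration using matplotlib's `fill_between` with a `where` mask; the sine-shaped `win_prob` curve is placeholder data, and the colors and alpha are the values quoted in the review.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import PercentFormatter

# Placeholder data (assumption): a smooth curve oscillating around 50%.
plays = np.arange(121)
win_prob = 0.5 + 0.3 * np.sin(plays / 15.0)

fig, ax = plt.subplots(figsize=(16, 9))
ax.plot(plays, win_prob, color="#306998", linewidth=2.5)

# 50% baseline.
ax.axhline(0.5, color="gray", linestyle="--", linewidth=1)

# Fill toward the baseline; interpolate=True closes the fill exactly where
# the curve crosses 50% instead of at the nearest sample.
ax.fill_between(plays, win_prob, 0.5, where=win_prob >= 0.5,
                interpolate=True, color="#306998", alpha=0.25, label="Home")
ax.fill_between(plays, win_prob, 0.5, where=win_prob < 0.5,
                interpolate=True, color="#D4583B", alpha=0.25, label="Away")

# Full 0-100% range with percentage tick labels.
ax.set_xlim(0, 120)
ax.set_ylim(0, 1)
ax.yaxis.set_major_formatter(PercentFormatter(xmax=1.0))
ax.set_xlabel("Play Number")
ax.set_ylabel("Home Win Probability")
ax.legend(loc="upper left", frameon=False)
```

Filling toward the constant 0.5 rather than toward zero is what makes the shaded area read as "who is ahead" instead of as raw probability mass.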

Data Quality (14/15)

  • DQ-01: Feature Coverage (5/6) — Shows lead changes, momentum swings, multiple event types (TD, FG, INT), close game with dramatic finish. Could show a wider range of probability extremes.
  • DQ-02: Realistic Context (5/5) — NFL game scenario with realistic play-by-play: touchdowns, field goals, interceptions. Neutral sports topic.
  • DQ-03: Appropriate Scale (4/4) — Probabilities stay in realistic range (0.02-0.98), scoring events cause appropriate probability shifts (0.07-0.22).
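
For readers who want to reproduce data with the properties noted under DQ-03, one possible approach is a seeded random walk with jumps at scoring plays, clipped to the 0.02-0.98 range. This is a sketch under stated assumptions, not the PR's data-generation code: the event plays are the approximate positions from the image description, and the shift sizes and noise scale are hypothetical values chosen to fall in the quoted 0.07-0.22 band.

```python
import numpy as np

np.random.seed(42)  # same seed the review reports under CQ-02

n_plays = 121  # plays 0..120
plays = np.arange(n_plays)

# Hypothetical shifts in home win probability at each scoring play
# (negative = away team scores). Magnitudes stay within 0.07-0.22.
event_shifts = {22: -0.15, 35: 0.18, 65: 0.12, 95: -0.22, 105: 0.20}

# Small play-to-play noise (scale is an assumption), starting from 50%.
steps = np.random.normal(loc=0.0, scale=0.015, size=n_plays)
steps[0] = 0.0
for play, shift in event_shifts.items():
    steps[play] += shift

# Accumulate the walk and keep it inside the realistic probability range.
win_prob = np.clip(0.5 + np.cumsum(steps), 0.02, 0.98)
```

Clipping after the cumulative sum keeps the curve inside the stated bounds, which is the property the reviewer checks; it does not by itself guarantee any particular final value.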

Code Quality (10/10)

  • CQ-01: KISS Structure (3/3) — Clean imports → data → plot → style → save flow. No functions or classes.
  • CQ-02: Reproducibility (2/2) — np.random.seed(42) set.
  • CQ-03: Clean Imports (2/2) — All imports used: matplotlib.patches, plt, np, pd, sns.
  • CQ-04: Code Elegance (2/2) — Clean, well-organized, appropriate complexity. No fake UI.
  • CQ-05: Output & API (1/1) — Saves as plot.png, dpi=300, bbox_inches='tight'. No deprecated API.

Library Mastery (4/10)

  • LM-01: Idiomatic Usage (3/5) — Uses sns.lineplot axes-level API correctly, but fill_between, scatter, annotate, and all styling are pure matplotlib. Seaborn is used minimally.
  • LM-02: Distinctive Features (1/5) — The only seaborn call is sns.lineplot, which is essentially identical to plt.plot. No distinctive seaborn features used (e.g., sns.set_theme, sns.set_context, seaborn color palettes, statistical features).

Score Caps Applied

  • None — no caps triggered.

Strengths

  • Excellent spec compliance — all required features implemented (fill areas, 50% line, annotations, quarter markers, final score)
  • Strong data storytelling with a realistic NFL game narrative showing lead changes and dramatic finish
  • Clean, well-organized code with good reproducibility
  • Good visual refinement: removed spines, subtle grid, well-positioned annotations with running scores

Weaknesses

  • Minimal seaborn usage — nearly the entire plot is built with matplotlib; only sns.lineplot uses the library
  • Annotation font size (11pt) could be larger for better readability at full resolution
  • Design polish could be elevated further (e.g., background tint, team name labels, subtitle)

Issues Found

  1. LM-01/LM-02 LOW: Seaborn is barely used — only a single sns.lineplot call. The fill_between, scatter, annotations, and all styling are pure matplotlib.
    • Fix: Use sns.set_theme() or sns.set_context("talk") for global styling, build the colors with seaborn's palette functions (sns.color_palette), and rely on seaborn's built-in styling instead of manual matplotlib settings.
  2. DE-01 MODERATE: Design is above defaults but not publication-quality
    • Fix: Add a subtle background color by passing rc overrides (for example axes.facecolor) to sns.set_style("whitegrid"), consider a subtitle with game context, and refine the annotation arrows.
  3. VQ-01 MINOR: Annotation text at 11pt is slightly small
    • Fix: Increase annotation fontsize to 13-14pt

AI Feedback for Next Attempt

To reach 90+: (1) Significantly increase seaborn usage — use sns.set_theme() or sns.set_context("talk") for styling, sns.color_palette() for colors, and consider if seaborn's statistical or styling features can replace any matplotlib code. (2) Bump annotation fontsize to 13-14pt. (3) Elevate design with a subtle background tint, more refined annotation styling, or a contextual subtitle. The spec compliance and data quality are already excellent — focus improvements on library mastery and design polish.

Verdict: REJECTED

github-actions bot added the labels quality:83 (Quality score 83/100) and ai-rejected (Quality not OK, triggers update) on Mar 20, 2026
github-actions bot added the label ai-attempt-1 (First repair attempt) and removed ai-rejected (Quality not OK, triggers update) on Mar 20, 2026
@github-actions
Contributor Author

🔧 Repair Attempt 1/3

Applied fixes based on AI review feedback.

Status: Repair completed, re-triggering review...


🤖 impl-repair

@claude
Contributor

claude Bot commented Mar 20, 2026

AI Review - Attempt 2/3

Image Description

The plot displays a win probability line chart for an NFL game across 120 plays. A dark blue line (#306998) traces the home team's win probability from ~50% at kickoff through various scoring events. The area above the 50% dashed baseline is filled with translucent blue (Home) and below with translucent salmon/red (#D4583B, Away). Five key events are annotated with curved arrows and bold colored text showing the scoring play and running score: "TD Away 7-3", "TD Home 10-7", "TD Home 20-14", "TD Away 23-27", and "TD Home 30-27". Dotted vertical lines mark Q1, Q2, and Q3 boundaries. The y-axis shows 0%-100% with the 50% reference line. A styled box at the bottom right displays "Final: Home 30 – Away 27". The title follows the required format. A legend in the upper left shows Home (blue) and Away (red) patches. The background is a subtle light blue-gray (#F7F9FC). An italic subtitle reads "NFL Game — Home vs Away | Lead changes and momentum shifts across 120 plays".

Score: 91/100

Category             Score   Max
Visual Quality          29    30
Design Excellence       16    20
Spec Compliance         15    15
Data Quality            15    15
Code Quality            10    10
Library Mastery          6    10
Total                   91   100

Visual Quality (29/30)

  • VQ-01: Text Legibility (8/8) — All font sizes explicitly set: title 24pt, axis labels 20pt, ticks 16pt, annotations 13pt, quarter labels 14pt. All perfectly readable.
  • VQ-02: No Overlap (5/6) — Q4 quarter label is hidden behind or overlaps with the "Final" score box at the bottom right. Remaining annotations are well-spaced.
  • VQ-03: Element Visibility (6/6) — Line width 2.8 is clearly visible, fill areas at alpha=0.2 are distinct, scatter markers with white edge highlighting at event points.
  • VQ-04: Color Accessibility (4/4) — Blue (#306998) and salmon (#D4583B) provide strong contrast and are distinguishable for colorblind users (blue-orange axis).
  • VQ-05: Layout & Canvas (4/4) — 16:9 aspect ratio, plot fills canvas well, balanced margins, nothing cut off.
  • VQ-06: Axis Labels & Title (2/2) — "Play Number" and "Home Win Probability" are descriptive; y-axis shows percentage units.

Design Excellence (16/20)

  • DE-01: Aesthetic Sophistication (6/8) — Custom background (#F7F9FC), cohesive blue/salmon palette via sns.color_palette, italic subtitle in muted gray, styled final score box with rounded corners, arrow annotations color-matched to teams. Clearly above configured defaults.
  • DE-02: Visual Refinement (5/6) — Spines removed via sns.despine, subtle y-axis grid (alpha=0.2), custom background, quarter markers as subtle dotted lines (alpha=0.6), generous whitespace.
  • DE-03: Data Storytelling (5/6) — Strong narrative: annotations show scoring plays with running score, fill colors create instant momentum visualization, final score box provides closure. Viewer immediately understands the back-and-forth game.

Spec Compliance (15/15)

  • SC-01: Plot Type (5/5) — Correct line chart showing win probability over game progression.
  • SC-02: Required Features (4/4) — All spec features present: 50% reference line, fill above/below with team colors, annotated key events, final score display, quarter markers.
  • SC-03: Data Mapping (3/3) — X=Play Number (0-120), Y=Win Probability (0%-100%). Full range displayed.
  • SC-04: Title & Legend (3/3) — Title "line-win-probability · seaborn · pyplots.ai" in correct format. Legend with Home/Away color patches.

Data Quality (15/15)

  • DQ-01: Feature Coverage (6/6) — Multiple lead changes, momentum swings, various event types (TD, FG, INT), both teams scoring, close competitive game.
  • DQ-02: Realistic Context (5/5) — NFL game scenario with plausible play-by-play win probability. Neutral sports context.
  • DQ-03: Appropriate Scale (4/4) — 120 plays realistic for NFL, probability values stay within 0.02-0.98 range, scoring swings are appropriately sized.

Code Quality (10/10)

  • CQ-01: KISS Structure (3/3) — Clean imports → data generation → plot → styling → save flow. No functions or classes.
  • CQ-02: Reproducibility (2/2) — np.random.seed(42) set.
  • CQ-03: Clean Imports (2/2) — All imports used: matplotlib.patches, plt, np, pd, sns.
  • CQ-04: Code Elegance (2/2) — Appropriate complexity, clean annotation offset dict, no fake functionality.
  • CQ-05: Output & API (1/1) — Saves as plot.png with dpi=300 and bbox_inches='tight'.

Library Mastery (6/10)

  • LM-01: Idiomatic Usage (4/5) — Good use of seaborn axes-level API (sns.lineplot, sns.scatterplot), sns.set_theme for global styling, sns.set_context for scaling, sns.despine for cleanup. Matplotlib supplements where seaborn lacks (fill_between, annotations).
  • LM-02: Distinctive Features (2/5) — Uses seaborn-specific features (set_theme, set_context, color_palette, despine) but these are styling utilities. The core visualization (fill_between, annotations, axhline) is matplotlib. Could be replicated without seaborn with minor changes.

Score Caps Applied

  • None

Strengths

  • Excellent data storytelling with annotated scoring events showing running score
  • Polished design with custom background, cohesive palette, and styled final score box
  • Complete spec compliance — all required features implemented
  • Realistic NFL game data with compelling back-and-forth narrative

Weaknesses

  • Q4 quarter label is obscured by the Final score box (minor overlap issue)
  • Library mastery could be stronger — core visualization relies heavily on matplotlib rather than seaborn-specific plotting features

Issues Found

  1. VQ-02 MINOR: Q4 label at bottom right is hidden behind the Final score box
    • Fix: Reposition Q4 label or move Final score box to avoid overlap
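
One way to apply the suggested fix is to anchor the quarter labels near the top of the axes while the final score box stays at the bottom right. The sketch below assumes quarter boundaries every 30 plays and specific label positions, none of which are given in the PR; it uses matplotlib's blended x-axis transform so the labels follow play numbers horizontally but sit at a fixed height, then checks the rendered text extents for overlap.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(16, 9))
ax.set_xlim(0, 120)
ax.set_ylim(0, 1)

# Hypothetical quarter spans in play numbers (assumption).
quarters = {"Q1": (0, 30), "Q2": (30, 60), "Q3": (60, 90), "Q4": (90, 120)}

labels = {}
for name, (start, end) in quarters.items():
    if start > 0:
        ax.axvline(start, color="gray", linestyle=":", alpha=0.6)
    # x in data coordinates, y in axes coordinates (0 = bottom, 1 = top).
    labels[name] = ax.text(
        (start + end) / 2, 0.97, name,
        transform=ax.get_xaxis_transform(),
        ha="center", va="top", fontsize=14, color="gray",
    )

# Final score box stays at the bottom right, in pure axes coordinates.
final_box = ax.text(
    0.98, 0.04, "Final: Home 30 – Away 27",
    transform=ax.transAxes, ha="right", va="bottom", fontsize=14,
    bbox={"boxstyle": "round,pad=0.4", "facecolor": "white",
          "edgecolor": "gray"},
)

# Render once, then compare the pixel bounding boxes of the two texts.
fig.canvas.draw()
renderer = fig.canvas.get_renderer()
overlap = labels["Q4"].get_window_extent(renderer).overlaps(
    final_box.get_window_extent(renderer)
)
```

Moving the labels rather than the score box keeps all four quarter labels at the same height, so Q4 does not become a visual exception.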

AI Feedback for Next Attempt

Strong implementation at 91/100. The Q4 label overlap with the Final score box is the only visual defect. Library mastery could improve by finding more seaborn-native approaches, though this plot type inherently requires matplotlib supplementation.

Verdict: APPROVED

github-actions bot added the labels quality:91 (Quality score 91/100) and ai-approved (Quality OK, ready for merge) on Mar 20, 2026
github-actions bot merged commit 7a7f872 into main on Mar 20, 2026
3 checks passed
github-actions bot deleted the implementation/line-win-probability/seaborn branch on March 20, 2026 12:46