update(scatter-basic): seaborn — comprehensive quality review #4200

Merged
github-actions[bot] merged 5 commits into main from
implementation/scatter-basic/seaborn
Feb 14, 2026

Conversation

@MarkusNeusinger
Owner

Summary

Updated the seaborn implementation for scatter-basic.

Changes: Comprehensive quality review — improved variable naming, realistic data context, white marker edges for point definition, subtler grid styling, and explicit font sizing for large canvas.

Changes

  • Descriptive variable names with realistic data context
  • White marker edges for better point definition
  • Subtler grid styling (opacity ~0.15-0.2)
  • Explicit font sizing for 4800x2700 canvas
  • Library-specific improvements for idiomatic usage
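A minimal sketch of what the listed styling changes could look like together. The data, the 16 x 9 inch figure size, and the exact keyword values are assumptions pieced together from the review comments in this thread, not the merged implementation.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data (an assumption for illustration, not the PR's final dataset)
rng = np.random.default_rng(42)
hours = rng.uniform(1, 10, 150)
scores = hours * 8 + rng.normal(0, 8, 150) + 25
df = pd.DataFrame({"Study Hours (per week)": hours, "Exam Score (points)": scores})

# 16 x 9 inches rendered at 300 dpi corresponds to a 4800 x 2700 px canvas
fig, ax = plt.subplots(figsize=(16, 9))
sns.scatterplot(
    data=df,
    x="Study Hours (per week)",
    y="Exam Score (points)",
    ax=ax,
    color="#306998",
    alpha=0.7,
    s=120,
    edgecolor="white",  # white marker edges for point definition
    linewidth=0.5,
)

# Explicit font sizes instead of relying on theme defaults
ax.set_xlabel("Study Hours (per week)", fontsize=20)
ax.set_ylabel("Exam Score (points)", fontsize=20)
ax.set_title("scatter-basic · seaborn · pyplots.ai", fontsize=24, fontweight="medium")
ax.tick_params(axis="both", labelsize=16)

# Subtle dashed grid and no top/right spines
ax.grid(True, alpha=0.2, linestyle="--")
sns.despine(ax=ax)

width_px, height_px = (fig.get_size_inches() * 300).round().astype(int)
```

Saving with `fig.savefig("plot.png", dpi=300)` would then produce the large canvas that the explicit font sizes are chosen for.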

Test Plan

  • Preview images uploaded to GCS staging
  • Implementation file passes ruff format/check
  • Metadata YAML updated with current versions
  • Automated review triggered

Generated with Claude Code /update command

Copilot AI review requested due to automatic review settings February 14, 2026 14:30
Comprehensive quality review: improved data context, white marker edges,
subtler grid, explicit font sizing, spec tag additions.
@claude
Contributor

claude bot commented Feb 14, 2026

AI Review - Attempt 1/3

Image Description

The plot displays a 2D scatter plot with 150 data points showing the relationship between "Study Hours (per week)" on the x-axis (ranging from ~1 to 10) and "Exam Score (points)" on the y-axis (ranging from ~10 to 120). All markers use a single muted blue color (#306998 Python Blue) with white edge highlights and moderate transparency (alpha=0.7). The title "scatter-basic · seaborn · pyplots.ai" appears at the top in medium-weight font. A clear positive linear correlation is visible — students who study more tend to score higher. The top and right spines are removed (seaborn despine), and a subtle dashed grid with low opacity aids value estimation. The layout is landscape 16:9 with balanced margins and good canvas utilization.

Quality Score: 82/100

Criteria Checklist

Visual Quality (26/30)

  • VQ-01: Text Legibility (8/8) — All font sizes explicitly set: title 24pt, labels 20pt, ticks 16pt
  • VQ-02: No Overlap (6/6) — No overlapping text elements anywhere
  • VQ-03: Element Visibility (5/6) — s=120 slightly oversized for 150 points (guidelines: s=50-100 for 100-300 pts); causes minor clumping in dense regions
  • VQ-04: Color Accessibility (4/4) — Single Python Blue with white edges, fully colorblind-safe
  • VQ-05: Layout Balance (4/4) — Good 16:9 proportions, tight_layout fills canvas well
  • VQ-06: Axis Labels & Title (2/2) — Descriptive with units: "Study Hours (per week)", "Exam Score (points)"
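The VQ-03 deduction rests on a size guideline, which can be written as a small check. This is only a sketch: the function name is hypothetical, and the single band it encodes (s=50-100 and alpha 0.5-0.7 for 100-300 points) is the only one quoted anywhere in this thread, so it refuses to guess outside that range.

```python
def within_guideline(n_points: int, s: float, alpha: float) -> bool:
    """Check marker size and alpha against the one guideline band quoted here.

    Only the 100-300 point band appears in this thread (s 50-100, alpha 0.5-0.7);
    bands for other point counts are unknown, so they raise instead of guessing.
    """
    if not 100 <= n_points <= 300:
        raise ValueError("no guideline quoted in this thread for that point count")
    return 50 <= s <= 100 and 0.5 <= alpha <= 0.7


# Attempt 1 uses s=120, alpha=0.7 for 150 points
attempt_1_ok = within_guideline(150, s=120, alpha=0.7)
# The weakness list suggests s=80-90 instead
suggested_ok = within_guideline(150, s=85, alpha=0.7)
```

With these inputs the first call fails the check on marker size alone, while the suggested size passes.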

Design Excellence (10/20)

  • DE-01: Aesthetic Sophistication (4/8) — Well-configured seaborn default with Python Blue, white edges, and despine. Pleasant but not publication-exceptional
  • DE-02: Visual Refinement (4/6) — Spines removed, subtle dashed grid at 0.2 alpha, good whitespace management
  • DE-03: Data Storytelling (2/6) — Data displayed without annotations, trend lines, or narrative emphasis. Viewer must discover the correlation pattern themselves

Spec Compliance (15/15)

  • SC-01: Plot Type (5/5) — Correct 2D scatter plot
  • SC-02: Required Features (4/4) — Alpha transparency, axis labels, title, grid lines all present
  • SC-03: Data Mapping (3/3) — X/Y correctly assigned
  • SC-04: Title Format (3/3) — Correct format: "scatter-basic · seaborn · pyplots.ai"

Data Quality (14/15)

  • DQ-01: Feature Coverage (5/6) — Shows clear positive correlation with noise, but the relationship is uniformly linear; no distinct clusters, outliers, or non-linear regions to showcase scatter plot versatility
  • DQ-02: Realistic Context (5/5) — Study hours vs exam scores: classic, neutral education scenario
  • DQ-03: Appropriate Scale (4/4) — Realistic ranges for both variables

Code Quality (10/10)

  • CQ-01: KISS Structure (3/3) — Clean imports → data → plot → style → save
  • CQ-02: Reproducibility (2/2) — np.random.seed(42) set
  • CQ-03: Clean Imports (2/2) — All imports used
  • CQ-04: Code Elegance (2/2) — Clean, Pythonic, appropriate complexity
  • CQ-05: Output & API (1/1) — Saves as plot.png, no deprecated functions

Library Mastery (7/10)

  • LM-01: Idiomatic Usage (4/5) — Uses sns.scatterplot with DataFrame, set_theme, despine. Could leverage more seaborn parameters (hue, size mapping)
  • LM-02: Distinctive Features (3/5) — set_theme(style="ticks", context="talk") and despine are seaborn-distinctive, but the core scatter is easily replicable in pure matplotlib
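As a quick cross-check, the category subtotals above can be summed to confirm the headline score and to see where points were lost. The dictionary below just transcribes the subtotals from this checklist; the variable names are illustrative.

```python
# (earned, maximum) per category, transcribed from the attempt 1 checklist
attempt_1 = {
    "Visual Quality": (26, 30),
    "Design Excellence": (10, 20),
    "Spec Compliance": (15, 15),
    "Data Quality": (14, 15),
    "Code Quality": (10, 10),
    "Library Mastery": (7, 10),
}

earned = sum(e for e, _ in attempt_1.values())
possible = sum(m for _, m in attempt_1.values())

# Categories that lost points, with the size of the gap
lost = {name: m - e for name, (e, m) in attempt_1.items() if e < m}

print(f"{earned}/{possible}")
for name, gap in sorted(lost.items(), key=lambda item: -item[1]):
    print(f"  {name}: -{gap}")
```

Design Excellence accounts for 10 of the 18 missing points, which matches the emphasis of the weaknesses listed in the review.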

Strengths

  • Perfect spec compliance with correct title format, axis labels with units, and all required features
  • Clean KISS code structure with reproducible seed and idiomatic seaborn API usage
  • Good visual refinement: despine, subtle dashed grid, white marker edges, appropriate alpha
  • Realistic education context with plausible data ranges

Weaknesses

  • No data storytelling: adding a trend/regression line or annotating key insights (e.g., correlation coefficient, outlier callout) would guide the viewer
  • Marker size s=120 is slightly large for 150 data points — reducing to s=80-90 would reduce clumping in dense regions
  • The design stays within "well-configured default" territory; a more distinctive color treatment, secondary visual element, or stylistic refinement would elevate it

Verdict: REJECTED

@github-actions bot added the quality:82 (Quality score 82/100) and ai-rejected (Quality not OK, triggers update) labels on Feb 14, 2026
@github-actions bot added the ai-attempt-1 (First repair attempt) label and removed the ai-rejected (Quality not OK, triggers update) label on Feb 14, 2026
Contributor

Copilot AI left a comment

Pull request overview

This pull request updates the seaborn implementation for the scatter-basic plot with quality improvements including descriptive variable names, DataFrame-based data structure, seaborn theme configuration, white marker edges, and adjusted styling parameters. However, there are several issues that need to be addressed before merging.

Changes:

  • Added pandas import and DataFrame-based data structure with descriptive column names
  • Introduced sns.set_theme() for global styling configuration
  • Changed marker size from 200 to 120 and adjusted grid styling
  • Added sns.despine() call for cleaner appearance
  • Updated metadata with new timestamps and model version

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File | Description
plots/scatter-basic/metadata/seaborn.yaml | Updated metadata with new timestamps, Python 3.14, claude-opus-4-6 model, and null quality_score
plots/scatter-basic/implementations/seaborn.py | Refactored implementation with DataFrame structure, theme configuration, and styling adjustments

@@ -1,31 +1,43 @@
""" pyplots.ai
Copilot AI Feb 14, 2026

The header docstring should start with a triple quote followed by a space (""" pyplots.ai), not without a space ("""pyplots.ai). All other seaborn implementations in the codebase follow the convention of including a space after the opening triple quote, as seen in plots/bar-basic/implementations/seaborn.py:1, plots/line-basic/implementations/seaborn.py:1, and plots/histogram-basic/implementations/seaborn.py:1.

Copilot uses AI. Check for mistakes.
Comment on lines +9 to +17
import pandas as pd
import seaborn as sns


# Data - study hours vs exam scores with realistic correlation
np.random.seed(42)
study_hours = np.random.uniform(1, 10, 150)
exam_scores = study_hours * 8 + np.random.randn(150) * 8 + 25
df = pd.DataFrame({"Study Hours (per week)": study_hours, "Exam Score (points)": exam_scores})
Copilot AI Feb 14, 2026

Adding a new pandas import without using DataFrame-based plotting is not idiomatic for seaborn. While the code creates a DataFrame with descriptive column names, this pattern diverges from simpler seaborn implementations in the codebase. Compare with plots/bar-basic/implementations/seaborn.py which uses DataFrame-based plotting with seaborn's data parameter, or plots/line-basic/implementations/seaborn.py. For a basic scatter plot, using sns.scatterplot with x and y arrays directly (without pandas) would be simpler and more consistent with the KISS principle stated in the coding guidelines.
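For reference, the two calling conventions being weighed here can be put side by side. This is a sketch reusing the data lines from the diff above; the two-panel layout and the axis variable names are only for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

np.random.seed(42)
study_hours = np.random.uniform(1, 10, 150)
exam_scores = study_hours * 8 + np.random.randn(150) * 8 + 25

fig, (ax_arrays, ax_frame) = plt.subplots(1, 2, figsize=(16, 9))

# Option 1: plain arrays, no pandas; axis labels have to be set by hand
sns.scatterplot(x=study_hours, y=exam_scores, ax=ax_arrays)
ax_arrays.set_xlabel("Study Hours (per week)")
ax_arrays.set_ylabel("Exam Score (points)")

# Option 2: DataFrame plus column names; seaborn takes the axis labels from the columns
df = pd.DataFrame({"Study Hours (per week)": study_hours, "Exam Score (points)": exam_scores})
sns.scatterplot(data=df, x="Study Hours (per week)", y="Exam Score (points)", ax=ax_frame)
```

Both panels show the same points. The trade-off is one extra import against column names that double as axis labels, so which form counts as simpler is a judgment call.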

y="Exam Score (points)",
ax=ax,
alpha=0.7,
s=120,
Copilot AI Feb 14, 2026

The marker size changed from s=200 to s=120. For a 4800x2700px canvas (16x9 at 300dpi), this may make the points too small to see clearly, especially with the white edge (linewidth=0.5). The original size of 200 was likely chosen to ensure visibility at high resolution. Consider whether 120 is sufficiently visible for the intended canvas size, or if this reduction was intentional to accommodate more data points.

Suggested change
- s=120,
+ s=200,
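To put the two candidate sizes in perspective: matplotlib documents `s` as a marker area in points squared, with one point equal to 1/72 inch, so the approximate on-canvas width can be worked out directly. The helper name below is hypothetical and the result is an estimate for roughly circular markers, ignoring the edge line.

```python
import math


def marker_diameter_px(s: float, dpi: int = 300) -> float:
    """Approximate on-canvas width of a scatter marker in pixels.

    `s` is an area in points**2, so the marker is about sqrt(s) points wide;
    one point is 1/72 inch, hence the dpi / 72 conversion.
    """
    return math.sqrt(s) * dpi / 72


for s in (120, 200):
    print(f"s={s}: about {marker_diameter_px(s):.0f} px wide on a 4800 px canvas")
```

At 300 dpi that works out to roughly 46 px for s=120 and 59 px for s=200, so neither size is at risk of being invisible at this resolution; the question is clumping rather than visibility.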

Comment on lines 35 to 37
ax.set_title("scatter-basic · seaborn · pyplots.ai", fontsize=24, fontweight="medium")
ax.set_xlabel("Study Hours (per week)", fontsize=20)
ax.set_ylabel("Exam Score (points)", fontsize=20)
Copilot AI Feb 14, 2026

The title is set before the axis labels, which is inconsistent with the codebase convention. Most seaborn implementations in the repository (e.g., plots/bar-basic/implementations/seaborn.py:28-31, plots/line-basic/implementations/seaborn.py:27-29) set the title after the axis labels. While functionally equivalent, following the established pattern improves code consistency and readability.

Suggested change
- ax.set_title("scatter-basic · seaborn · pyplots.ai", fontsize=24, fontweight="medium")
- ax.set_xlabel("Study Hours (per week)", fontsize=20)
- ax.set_ylabel("Exam Score (points)", fontsize=20)
+ ax.set_xlabel("Study Hours (per week)", fontsize=20)
+ ax.set_ylabel("Exam Score (points)", fontsize=20)
+ ax.set_title("scatter-basic · seaborn · pyplots.ai", fontsize=24, fontweight="medium")

- Library: seaborn 0.13.2 | Python 3.13.11
- Quality: 92/100 | Created: 2025-12-22
+ Library: seaborn 0.13.2 | Python 3.14
+ Quality: 82/100 | Created: 2025-12-22
Copilot AI Feb 14, 2026

The Quality field in the header shows "/100" without a numeric value. Based on the codebase convention (e.g., plots/histogram-basic/implementations/seaborn.py:4 shows "Quality: 95/100"), this should either include the actual quality score (e.g., "Quality: 92/100") or use "Created" instead of "Updated" if this is meant to represent a new implementation. Additionally, the metadata file sets quality_score to null, which is inconsistent with the header claim of having a quality score.


- # Create plot
+ # Plot
+ sns.set_theme(style="ticks", context="talk", font_scale=1.1)
Copilot AI Feb 14, 2026

The sns.set_theme() call with context="talk" and font_scale=1.1 is being used, but then explicit fontsize parameters are set for all text elements (title=24, xlabel=20, ylabel=20, tick_params=16). This creates redundancy and makes the theme settings ineffective. Either rely on the theme's font scaling or remove the theme configuration and use explicit sizes throughout. Compare with plots/histogram-basic/implementations/seaborn.py which uses sns.set_theme() with a comprehensive rc dict but still sets explicit fontsizes, suggesting explicit sizes are preferred for the 4800x2700 canvas as mentioned in the coding guidelines.

Suggested change
- sns.set_theme(style="ticks", context="talk", font_scale=1.1)
+ sns.set_theme(style="ticks")
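One way to see the redundancy described here is to print the font sizes the theme would apply next to the explicit ones. The sketch below uses `sns.plotting_context`, which returns the rc parameters for a context without changing global state; the `explicit_sizes` dictionary is just a transcription of the sizes quoted in this comment.

```python
import seaborn as sns

# Font-related rc values the theme would apply if no explicit fontsize were passed
theme_sizes = sns.plotting_context("talk", font_scale=1.1)

# Explicit sizes quoted in the comment above (title=24, labels=20, ticks=16)
explicit_sizes = {
    "axes.titlesize": 24,
    "axes.labelsize": 20,
    "xtick.labelsize": 16,
    "ytick.labelsize": 16,
}

for key, explicit in explicit_sizes.items():
    print(f"{key}: theme={theme_sizes[key]:.1f}  explicit={explicit}")
```

Every text element that receives an explicit `fontsize` ignores the corresponding theme value. Note that the context also scales non-font settings such as line widths, so dropping `context="talk"` may change more than text size.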

Attempt 1/3 - fixes based on AI review
@github-actions
Contributor

🔧 Repair Attempt 1/3

Applied fixes based on AI review feedback.

Status: Repair completed, re-triggering review...


🤖 impl-repair

@claude
Contributor

claude bot commented Feb 14, 2026

AI Review - Attempt 2/3

Image Description

The plot displays a scatter plot of 150 data points showing the relationship between "Study Hours (per week)" (x-axis, range ~1-10) and "Exam Score (points)" (y-axis, range ~20-125). Data points are rendered as medium-blue (#306998) filled circles with white edges and moderate transparency (alpha=0.6). A red (#c44e52) regression line runs diagonally from lower-left to upper-right, accompanied by a light pink 95% confidence interval band. The title reads "scatter-basic · seaborn · pyplots.ai" in medium-weight dark gray text at the top. Axis labels are descriptive with units. A correlation annotation "r = 0.86" is placed in the lower-right corner inside a rounded white box with a light gray border. The plot uses a clean, despined style (no top/right spines) with subtle dashed grid lines. The overall layout is well-balanced on a 16:9 canvas with generous whitespace.

Quality Score: 96/100

Criteria Checklist

Visual Quality (30/30)

  • VQ-01: Text Legibility (8/8) - All font sizes explicitly set: title=24, labels=20, ticks=16. All text perfectly readable at full size.
  • VQ-02: No Overlap (6/6) - No overlapping text elements anywhere in the plot.
  • VQ-03: Element Visibility (6/6) - 150 points with s=80 and alpha=0.6 — within recommended range (50-100, 0.5-0.7) for 100-300 points. Excellent visibility.
  • VQ-04: Color Accessibility (4/4) - Blue scatter + red regression line are colorblind-distinguishable. White marker edges add clarity.
  • VQ-05: Layout Balance (4/4) - 16:9 canvas well-utilized, balanced margins, tight_layout applied. Plot fills appropriate area.
  • VQ-06: Axis Labels & Title (2/2) - "Study Hours (per week)" and "Exam Score (points)" — descriptive with units.

Design Excellence (16/20)

  • DE-01: Aesthetic Sophistication (6/8) - Custom Python Blue, intentional red regression line, white marker edges, muted gray tones for non-data elements (#444, #555), context="talk" styling. Clearly above defaults but not quite FiveThirtyEight-level.
  • DE-02: Visual Refinement (5/6) - sns.despine(), subtle grid (alpha=0.15, dashed), custom spine edge colors, title padding. Well-polished.
  • DE-03: Data Storytelling (5/6) - Correlation annotation (r=0.86) with styled bbox guides interpretation. Regression line + 95% CI band tells the trend story. Intentional outliers add narrative interest.
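The correlation annotation credited under DE-03 boils down to a Pearson coefficient formatted as a label. A minimal sketch with NumPy follows; it reuses the data formula from the attempt 1 diff purely for illustration, so the value it produces is not the r = 0.86 of the final plot, whose data also include outliers and curvature.

```python
import numpy as np

np.random.seed(42)
study_hours = np.random.uniform(1, 10, 150)
# Attempt 1 formula, used here only as stand-in data
exam_scores = study_hours * 8 + np.random.randn(150) * 8 + 25

# Pearson correlation: off-diagonal entry of the 2 x 2 correlation matrix
r = np.corrcoef(study_hours, exam_scores)[0, 1]

# Text for the annotation box
label = f"r = {r:.2f}"
print(label)
```

The resulting string could then be placed with `ax.text(...)` and a `bbox` dictionary to get the rounded box described in the image description.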

Spec Compliance (15/15)

  • SC-01: Plot Type (5/5) - Correct scatter plot type.
  • SC-02: Required Features (4/4) - Transparency (alpha=0.6), axis labels, descriptive title, grid lines — all spec requirements present.
  • SC-03: Data Mapping (3/3) - X=Study Hours (independent), Y=Exam Score (dependent). Correct assignment.
  • SC-04: Title Format (3/3) - "scatter-basic · seaborn · pyplots.ai" — correct format. No legend needed for single series.

Data Quality (15/15)

  • DQ-01: Feature Coverage (6/6) - Shows positive correlation, spread/noise, intentional outliers (high-performing low-study, low-performing high-study), slight curvature (diminishing returns).
  • DQ-02: Realistic Context (5/5) - Study hours vs exam scores — real, comprehensible education scenario. Neutral topic.
  • DQ-03: Appropriate Scale (4/4) - Hours 1-10/week (realistic), scores ~20-125 points (plausible for a points-based exam).
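The data features credited under DQ-01 (curvature plus two hand-placed outliers) could be generated along these lines. This is a sketch only: the square-root curve, its coefficients, and the noise level are hypothetical stand-ins, since the thread does not show the final data code. Only the two outlier coordinates are taken from the review.

```python
import numpy as np

rng = np.random.default_rng(42)
n_regular = 148  # leaves room for two hand-placed outliers, 150 points in total

study_hours = rng.uniform(1, 10, n_regular)
# Hypothetical diminishing-returns curve: the square root flattens gains at high hours
exam_scores = 30 * np.sqrt(study_hours) + rng.normal(0, 8, n_regular)

# Outliers quoted in the review: (2.5 h, 95 pts) and (8.5 h, 28 pts)
study_hours = np.append(study_hours, [2.5, 8.5])
exam_scores = np.append(exam_scores, [95.0, 28.0])
```

Appending the outliers after sampling keeps them at exact, reproducible coordinates, independent of the random seed.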

Code Quality (10/10)

  • CQ-01: KISS Structure (3/3) - Clean Imports → Data → Plot → Save flow, no functions/classes.
  • CQ-02: Reproducibility (2/2) - np.random.seed(42) set.
  • CQ-03: Clean Imports (2/2) - All imports (matplotlib, numpy, pandas, seaborn) are used.
  • CQ-04: Code Elegance (2/2) - Clean, Pythonic, appropriate complexity.
  • CQ-05: Output & API (1/1) - Saves as plot.png, dpi=300, no deprecated functions.

Library Mastery (10/10)

  • LM-01: Idiomatic Usage (5/5) - sns.regplot() is the idiomatic seaborn approach. DataFrame-based data, sns.set_theme(), sns.despine().
  • LM-02: Distinctive Features (5/5) - sns.regplot() combining scatter + regression line + 95% CI band in a single call is distinctively seaborn. Would require significant manual work in plain matplotlib.
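A sketch of the `sns.regplot()` call pattern praised here, with scatter, regression line, and 95% confidence band drawn in one call. The colors and sizes come from the image description of this attempt; the data are the simpler attempt 1 formula, so the figure would resemble the approved plot rather than reproduce it.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Stand-in data (attempt 1 formula), not the final dataset
np.random.seed(42)
hours = np.random.uniform(1, 10, 150)
scores = hours * 8 + np.random.randn(150) * 8 + 25
df = pd.DataFrame({"Study Hours (per week)": hours, "Exam Score (points)": scores})

sns.set_theme(style="ticks", context="talk")
fig, ax = plt.subplots(figsize=(16, 9))

# One call draws the points, the fitted line, and the bootstrapped CI band
sns.regplot(
    data=df,
    x="Study Hours (per week)",
    y="Exam Score (points)",
    ax=ax,
    ci=95,
    scatter_kws={"s": 80, "alpha": 0.6, "color": "#306998", "edgecolor": "white"},
    line_kws={"color": "#c44e52"},
)

ax.grid(True, alpha=0.15, linestyle="--")
sns.despine(ax=ax)
```

The `scatter_kws` and `line_kws` dictionaries are forwarded to the underlying matplotlib scatter and line calls, which is how the points and the line get different colors.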

Strengths

  • Excellent use of sns.regplot() — the distinctive seaborn feature that combines scatter, regression line, and confidence interval in one call
  • Correlation annotation (r = 0.86) with styled bbox provides meaningful data storytelling
  • Intentional outliers (gifted student at 2.5h/95pts, underperformer at 8.5h/28pts) make the data realistic and demonstrate scatter patterns well
  • Perfect text sizing with all font sizes explicitly set to appropriate values
  • Clean despined aesthetic with subtle grid, muted non-data colors, and well-balanced layout
  • Slight curvature in data generation (diminishing returns) adds realism beyond simple linear noise

Weaknesses

  • DE-01 could reach 7-8 with more sophisticated color palette choices or additional typographic refinement (e.g., custom font family, more nuanced color gradient)
  • DE-02 grid could be refined further — y-axis-only grid might look cleaner for a scatter plot, or the grid could be removed entirely given the regression line already guides the eye

Verdict: APPROVED
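For a compact view of what the repair changed, the category subtotals of the two reviews can be compared directly. The numbers are transcribed from the two checklists in this thread; the variable names are illustrative.

```python
categories = [
    "Visual Quality",
    "Design Excellence",
    "Spec Compliance",
    "Data Quality",
    "Code Quality",
    "Library Mastery",
]
attempt_1 = [26, 10, 15, 14, 10, 7]
attempt_2 = [30, 16, 15, 15, 10, 10]

# Keep only the categories whose subtotal changed between reviews
gains = {c: b - a for c, a, b in zip(categories, attempt_1, attempt_2) if b != a}

print(f"{sum(attempt_1)} -> {sum(attempt_2)} (+{sum(gains.values())})")
for name, gain in gains.items():
    print(f"  {name}: +{gain}")
```

The largest gain is in Design Excellence (+6), followed by Visual Quality (+4) and Library Mastery (+3), consistent with the regression line, annotation, and smaller markers added in the repair.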

@github-actions bot added the quality:96 (Quality score 96/100) and ai-approved (Quality OK, ready for merge) labels on Feb 14, 2026
Copilot AI review requested due to automatic review settings February 14, 2026 14:40
@MarkusNeusinger MarkusNeusinger review requested due to automatic review settings February 14, 2026 14:40
@github-actions github-actions bot merged commit 4b02d42 into main Feb 14, 2026
@github-actions github-actions bot deleted the implementation/scatter-basic/seaborn branch February 14, 2026 14:41
Labels

  • ai-approved (Quality OK, ready for merge)
  • ai-attempt-1 (First repair attempt)
  • quality:82 (Quality score 82/100)
  • quality:96 (Quality score 96/100)

2 participants