From 81737b0b9d4992cd71fbd002d6615d3335f2aaf2 Mon Sep 17 00:00:00 2001
From: "github-actions[bot]"
Date: Fri, 9 Jan 2026 21:46:20 +0000
Subject: [PATCH] spec: add logistic-regression specification

Created from issue #3550
---
 plots/logistic-regression/specification.md   | 31 +++++++++++++++++++
 plots/logistic-regression/specification.yaml | 32 ++++++++++++++++++++
 2 files changed, 63 insertions(+)
 create mode 100644 plots/logistic-regression/specification.md
 create mode 100644 plots/logistic-regression/specification.yaml

diff --git a/plots/logistic-regression/specification.md b/plots/logistic-regression/specification.md
new file mode 100644
index 0000000000..9d9a99c6f7
--- /dev/null
+++ b/plots/logistic-regression/specification.md
@@ -0,0 +1,31 @@
+# logistic-regression: Logistic Regression Curve Plot
+
+## Description
+
+A logistic regression visualization showing the characteristic S-shaped (sigmoid) probability curve for binary classification. The plot displays data points colored by their binary class, the fitted logistic curve representing predicted probabilities, confidence intervals around the curve, and an optional decision threshold line. This visualization is essential for understanding how a logistic model maps continuous input features to class probabilities.
+
+## Applications
+
+- Visualizing credit risk scoring models where the probability of default varies with income or credit score
+- Analyzing medical diagnostic thresholds where probability of disease changes with biomarker levels
+- Understanding marketing conversion rates as a function of customer engagement metrics or ad spend
+- Demonstrating the decision boundary in binary classification problems for educational purposes
+
+## Data
+
+- `x` (numeric) - Continuous independent variable (predictor/feature) plotted on the horizontal axis
+- `y` (binary) - Binary outcome variable (0 or 1) plotted as data points
+- `probability` (numeric) - Predicted probability from the logistic model (0 to 1) for the fitted curve
+- Size: 50-500 data points recommended for clear visualization of both the curve and underlying data
+- Example: Binary classification data where the outcome probability follows a sigmoidal relationship with the predictor
+
+## Notes
+
+- Data points should be jittered slightly on the y-axis (around 0 and 1) for visibility when overlapping
+- Use distinct colors for the two classes (e.g., blue for class 0, orange for class 1)
+- The logistic curve should be smooth and prominently displayed (solid line, ~2px width)
+- Include 95% confidence interval band around the fitted curve with semi-transparent shading
+- Add a horizontal dashed line at probability = 0.5 to indicate the default decision threshold
+- Label axes clearly: x-axis with the predictor name, y-axis as "Probability" (0 to 1)
+- Consider displaying model coefficients or accuracy metrics as annotations
+- Points should have moderate transparency (alpha ~0.6) to show density patterns
diff --git a/plots/logistic-regression/specification.yaml b/plots/logistic-regression/specification.yaml
new file mode 100644
index 0000000000..bdf076be3a
--- /dev/null
+++ b/plots/logistic-regression/specification.yaml
@@ -0,0 +1,32 @@
+# Specification-level metadata for logistic-regression
+# Auto-synced to PostgreSQL on push to main
+
+spec_id: logistic-regression
+title: Logistic Regression Curve Plot
+
+# Specification tracking
+created: 2026-01-09T21:45:54Z
+updated: null
+issue: 3550
+suggested: MarkusNeusinger
+
+# Classification tags (applies to all library implementations)
+# See docs/reference/tagging-system.md for detailed guidelines
+tags:
+  plot_type:
+    - scatter
+    - line
+    - regression
+  data_type:
+    - numeric
+    - categorical
+    - binary
+  domain:
+    - statistics
+    - machine-learning
+    - model-evaluation
+  features:
+    - regression
+    - probability
+    - confidence-interval
+    - threshold
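
Reviewer note: the Data and Notes sections of the specification name three series (`x`, `y`, `probability`), a 95% confidence band, y-axis jitter, and a 0.5 threshold, but do not say how an implementation should derive them. The sketch below is one possible way to prepare those values using only the Python standard library. Everything in it is an assumption made for illustration and not part of the spec: the synthetic data and its true coefficients (0.5 and 1.2), the seed, the Newton-Raphson fit, the band computed on the logit scale from the coefficient covariance, the jitter width of 0.03, and all function and variable names.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score_and_information(b0, b1, xs, ys):
    """Gradient of the log-likelihood and the 2x2 Fisher information."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for xi, yi in zip(xs, ys):
        p = sigmoid(b0 + b1 * xi)
        w = p * (1.0 - p)
        g0 += yi - p
        g1 += (yi - p) * xi
        h00 += w
        h01 += w * xi
        h11 += w * xi * xi
    return (g0, g1), (h00, h01, h11)


def fit_logistic(xs, ys, max_iter=50, tol=1e-10):
    """Fit p = sigmoid(b0 + b1 * x) by Newton-Raphson.

    Returns (b0, b1) and the coefficient covariance (c00, c01, c11).
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        (g0, g1), (h00, h01, h11) = score_and_information(b0, b1, xs, ys)
        det = h00 * h11 - h01 * h01
        # Newton step: beta += H^{-1} g for the symmetric 2x2 matrix H
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    # Covariance is the inverse information at the final estimate
    _, (h00, h01, h11) = score_and_information(b0, b1, xs, ys)
    det = h00 * h11 - h01 * h01
    return (b0, b1), (h11 / det, -h01 / det, h00 / det)


def curve_with_band(b0, b1, cov, grid, z=1.96):
    """Rows of (x, probability, lower, upper); band built on the logit scale."""
    c00, c01, c11 = cov
    rows = []
    for x in grid:
        eta = b0 + b1 * x
        se = math.sqrt(max(c00 + 2.0 * x * c01 + x * x * c11, 0.0))
        rows.append((x, sigmoid(eta), sigmoid(eta - z * se), sigmoid(eta + z * se)))
    return rows


# Synthetic data (assumed values, for illustration only).
rng = random.Random(42)
xs = [rng.uniform(-4.0, 4.0) for _ in range(200)]
ys = [1 if rng.random() < sigmoid(0.5 + 1.2 * x) else 0 for x in xs]

(b0, b1), cov = fit_logistic(xs, ys)

# Evenly spaced grid so the plotted curve is smooth.
grid = [-4.0 + 8.0 * i / 99 for i in range(100)]
curve = curve_with_band(b0, b1, cov, grid)

# Small vertical jitter so overlapping points at 0 and 1 stay visible.
y_jittered = [y + rng.uniform(-0.03, 0.03) for y in ys]

# x where the fitted curve crosses the 0.5 threshold (requires b1 != 0).
x_at_threshold = -b0 / b1
```

A plotting library would then draw `xs` against `y_jittered` as points colored by `ys`, the second column of `curve` as the solid line, the third and fourth columns as the shaded band, and a dashed horizontal line at 0.5. Transforming the interval on the logit scale keeps the band inside 0 to 1, which a symmetric interval on the probability scale would not guarantee.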