## Gallery & Quick Links

The image and link below are pulled from the repository as a quick visual reference. For fully interactive versions, run the examples in the sections that follow or open the exported HTML: `notebooks/pairplot_demo.html`.

![Site favicon](../site/assets/images/favicon.png)

[Open pairplot demo HTML](../notebooks/pairplot_demo.html)

---


# all_plots_demo: Interactive plot gallery

This notebook demonstrates the visualizers and widgets provided by `plotly_ml`. It is organized into sections with runnable examples and interactive widgets for rapid exploration.

**What you'll find:**
- Quick examples showing how to create figures programmatically.
- Widget-driven pairplots for cross-filtering in notebooks.
- Exportable Plotly HTML outputs for sharing.

---


In [1]:
import sys
from pathlib import Path

import numpy as np
import pandas as pd

# Ensure we import the local workspace package (src/plotly_ml) rather than an older
# site-installed version.
project_root = Path.cwd().resolve().parent
src_path = project_root / "src"
if src_path.exists() and str(src_path) not in sys.path:
    sys.path.insert(0, str(src_path))
for name in list(sys.modules):
    if name == "plotly_ml" or name.startswith("plotly_ml."):
        del sys.modules[name]

from plotly_ml import regression, univariant, pariplot, classification, comparison

np.random.seed(0)

## Regression

### Regression evaluation plot
Uses columns `y_true`, `y_pred`, and optionally a split column like `set`.

In [5]:
n = 250
x = np.linspace(0, 10, n)
y_true = 2.0 * x + np.random.normal(0, 1.5, size=n)
y_pred = 2.0 * x + np.random.normal(0, 1.0, size=n)
set_col = np.where(np.arange(n) < int(0.7 * n), "train", "test")

df_reg = pd.DataFrame({"y_true": y_true, "y_pred": y_pred, "set": set_col})
fig = regression.regression_evaluation_plot(df_reg, y="y_true", split_column="set")
fig
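
The `y_true`, `y_pred`, and `set` columns used above can also be summarized numerically. The sketch below computes RMSE and R² per split with plain NumPy and pandas. `split_metrics` is a hypothetical helper written for illustration; it is not part of `plotly_ml`, and the metrics the plot itself reports may differ.

```python
import numpy as np
import pandas as pd


def split_metrics(df, y_true="y_true", y_pred="y_pred", split_column="set"):
    """Hypothetical helper (not part of plotly_ml): RMSE and R^2 per split."""
    rows = {}
    for name, part in df.groupby(split_column):
        actual = part[y_true].to_numpy()
        resid = actual - part[y_pred].to_numpy()
        ss_res = float(np.sum(resid**2))
        ss_tot = float(np.sum((actual - actual.mean()) ** 2))
        rows[name] = {
            "rmse": float(np.sqrt(np.mean(resid**2))),
            "r2": 1.0 - ss_res / ss_tot,
        }
    # One row per split, one column per metric.
    return pd.DataFrame(rows).T


# Synthetic data shaped like df_reg above (own generator, so values differ).
rng = np.random.default_rng(0)
n = 250
x = np.linspace(0, 10, n)
demo = pd.DataFrame(
    {
        "y_true": 2.0 * x + rng.normal(0, 1.5, size=n),
        "y_pred": 2.0 * x + rng.normal(0, 1.0, size=n),
        "set": np.where(np.arange(n) < int(0.7 * n), "train", "test"),
    }
)
metrics = split_metrics(demo)
print(metrics)
```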

## Univariant

### Raincloud plot
A violin-based distribution plot that can group by a categorical column.

In [6]:
df_uni = pd.DataFrame(
    {
        "value": np.r_[np.random.normal(0, 1.0, 200), np.random.normal(1.5, 0.7, 200)],
        "group": ["A"] * 200 + ["B"] * 200,
    }
)
fig = univariant.raincloud_plot(
    df_uni, value="value", group="group", title="Raincloud Plot"
)
fig

Value columns: ['value']


## Pariplot

### Pairplot (static figure)

In [7]:
df_pair = pd.DataFrame(
    {
        "a": np.random.normal(0, 1.0, 300),
        "b": np.random.normal(1.0, 1.2, 300),
        "c": np.random.normal(-0.5, 0.8, 300),
    }
)
fig = pariplot.pairplot(df_pair, height=650, width=650)
fig

### Pairplot with hue + trend + correlations

In [8]:
df_pair_hue = pd.DataFrame(
    {
        "x": np.random.normal(0, 1.0, 400),
        "y": np.random.normal(0.5, 1.1, 400),
        "z": np.random.normal(-0.2, 0.9, 400),
        "group": np.where(np.random.rand(400) > 0.5, "A", "B"),
    }
)
fig = pariplot.pairplot(
    df_pair_hue,
    hue="group",
    diag="hist",
    trend="ols",
    corr=["pearson", "spearman"],
    height=700,
    width=700,
)
fig

### Pairplot crossfilter widget (linked lasso selection)
With `link_selection=True`, `pairplot` returns a `PairplotWidget` rather than a static figure.

In [9]:
widget = pariplot.pairplot(
    df_pair_hue,
    hue="group",
    diag="kde",
    link_selection=True,
    height=700,
    width=700,
)
widget

<plotly_ml._widget.PairplotWidget object at 0x7f0dc6ba4b90>

### Standalone HTML helpers
- `pairplot_html(...)` returns an HTML string
- `pairplot_html_file(...)` writes HTML to disk

In [None]:
from IPython.display import HTML, display

html = pariplot.pairplot_html(df_pair_hue, hue="group", height=650, width=650)
# Display the HTML in-notebook
display(HTML(html))

In [None]:
import pathlib

out_path = pathlib.Path("pairplot_demo.html")
pariplot.pairplot_html_file(out_path, df_pair_hue, hue="group", height=650, width=650)
out_path.resolve()

PosixPath('/home/crispy/python/plotly_ml/notebooks/pairplot_demo.html')

## Classification

The classification module supports:
- Binary classification: columns `y_true` + `y_score`
- Multiclass one-vs-rest: per-class probability columns via `proba_columns` + `classes`
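
To make the one-vs-rest idea concrete, the sketch below turns a multiclass frame into one binary `y_true` / `y_score` frame per class. `one_vs_rest_frames` is a hypothetical helper for illustration only; whether `plotly_ml` performs this reduction the same way internally is an assumption.

```python
import pandas as pd


def one_vs_rest_frames(df, proba_columns, classes, y_true="y_true"):
    """Hypothetical helper: build one binary (y_true, y_score) frame per class."""
    frames = {}
    for cls, col in zip(classes, proba_columns):
        frames[cls] = pd.DataFrame(
            {
                # 1 where the row belongs to this class, 0 for every other class.
                "y_true": (df[y_true] == cls).astype(int),
                # The class probability plays the role of the binary score.
                "y_score": df[col],
            }
        )
    return frames


toy = pd.DataFrame(
    {
        "y_true": ["cat", "dog", "fish", "cat"],
        "p_cat": [0.7, 0.2, 0.1, 0.5],
        "p_dog": [0.2, 0.6, 0.2, 0.3],
        "p_fish": [0.1, 0.2, 0.7, 0.2],
    }
)
ovr = one_vs_rest_frames(toy, ["p_cat", "p_dog", "p_fish"], ["cat", "dog", "fish"])
print(ovr["cat"])
```

Each resulting frame has the same shape as the binary input, which is why the same curve types apply per class.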

### Binary: ROC / PR / Threshold / Calibration

In [None]:
n = 600
score = np.clip(np.random.beta(2, 5, size=n), 0, 1)
# Create a target correlated with score
y = (np.random.rand(n) < (0.15 + 0.7 * score)).astype(int)
split = np.where(np.arange(n) < int(0.7 * n), "train", "test")

df_bin = pd.DataFrame({"y_true": y, "y_score": score, "set": split})

classification.roc_curve_plot(df_bin, split_column="set")

In [None]:
classification.precision_recall_curve_plot(df_bin, split_column="set")

In [None]:
classification.discrimination_threshold_plot(df_bin, split_column="set")

In [None]:
classification.calibration_plot(df_bin, split_column="set", n_bins=10)
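
As a rough guide to what a calibration curve with `n_bins=10` summarizes, the sketch below bins scores into equal-width intervals and compares the mean score with the observed positive rate in each bin. Equal-width binning and the `calibration_bins` name are assumptions for illustration; the library may bin differently (for example, by quantile).

```python
import numpy as np


def calibration_bins(y_true, y_score, n_bins=10):
    """Hypothetical helper: (mean score, observed positive rate, count) per bin."""
    y_true = np.asarray(y_true, dtype=float)
    y_score = np.asarray(y_score, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Using only the interior edges, digitize yields bin indices 0..n_bins-1.
    idx = np.digitize(y_score, edges[1:-1])
    rows = []
    for b in range(n_bins):
        mask = idx == b
        if mask.any():  # skip empty bins
            rows.append(
                (float(y_score[mask].mean()), float(y_true[mask].mean()), int(mask.sum()))
            )
    return rows


rows = calibration_bins([0, 0, 1, 1], [0.05, 0.15, 0.15, 0.95], n_bins=10)
for mean_score, positive_rate, count in rows:
    print(f"{mean_score:.2f} {positive_rate:.2f} {count}")
```

A well-calibrated model has a positive rate close to the mean score in every bin.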

### Multiclass: ROC / PR / Calibration (one-vs-rest)

In [None]:
labels = np.array(["cat", "dog", "fish"])
n = 500
y_mc = np.random.choice(labels, size=n, p=[0.4, 0.35, 0.25])

# Create noisy probabilities biased toward the true class
raw = np.random.dirichlet([2, 2, 2], size=n)
idx = np.array([{"cat": 0, "dog": 1, "fish": 2}[v] for v in y_mc])
raw[np.arange(n), idx] += 1.5
proba = raw / raw.sum(axis=1, keepdims=True)

df_mc = pd.DataFrame(
    {
        "y_true": y_mc,
        "p_cat": proba[:, 0],
        "p_dog": proba[:, 1],
        "p_fish": proba[:, 2],
    }
)

proba_cols = ["p_cat", "p_dog", "p_fish"]
classes = ["cat", "dog", "fish"]

classification.roc_curve_plot(df_mc, proba_columns=proba_cols, classes=classes)

In [None]:
classification.precision_recall_curve_plot(
    df_mc, proba_columns=proba_cols, classes=classes
)

In [None]:
classification.calibration_plot(
    df_mc, proba_columns=proba_cols, classes=classes, n_bins=10
)

## Classification: Additional diagnostics

Confusion matrix, lift/gain/KS, score distributions, and prediction confidence visuals.
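
For reference, the cumulative gain curve and the KS statistic can be computed by hand from `y_true` and `y_score`. This is a minimal sketch under the usual definitions (rank by descending score, then compare the cumulative share of positives and negatives); `gain_and_ks` is a hypothetical name, tied scores are not treated specially, and the library's exact conventions may differ.

```python
import numpy as np


def gain_and_ks(y_true, y_score):
    """Hypothetical helper: cumulative gain curve and KS statistic (binary labels)."""
    y_true = np.asarray(y_true)
    # Rank rows from highest to lowest score.
    order = np.argsort(-np.asarray(y_score, dtype=float), kind="stable")
    y_sorted = y_true[order]
    # Share of all positives / negatives captured at each rank.
    cum_pos = np.cumsum(y_sorted) / y_sorted.sum()
    cum_neg = np.cumsum(1 - y_sorted) / (1 - y_sorted).sum()
    # KS is the largest gap between the two cumulative curves.
    ks = float(np.max(np.abs(cum_pos - cum_neg)))
    return cum_pos, ks


gain, ks = gain_and_ks([1, 1, 0, 1, 0, 0], [0.9, 0.8, 0.7, 0.6, 0.3, 0.1])
print(gain, ks)
```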

In [None]:
# Confusion matrix (derive predicted label at 0.5 threshold)
df_bin2 = df_bin.copy()
df_bin2["y_pred_label"] = (df_bin2["y_score"] > 0.5).astype(int)
classification.confusion_matrix_plot(
    df_bin2, y_true="y_true", y_pred="y_pred_label", split_column="set"
)

In [None]:
# Lift / Gains / KS plot (binary)
classification.lift_gain_ks_plot(
    df_bin, y_true="y_true", y_score="y_score", split_column="set"
)

In [None]:
# Score distribution by true class / split
classification.score_distribution_plot(
    df_bin, y_true="y_true", y_score="y_score", split_column="set"
)

In [None]:
# Prediction confidence (multiclass): how confident is the model per predicted class
classification.prediction_confidence_plot(
    df_mc, proba_columns=proba_cols, classes=classes
)

### Multiclass: Threshold / Lift-Gain-KS / Score distributions
The threshold, lift/gain/KS, and score-distribution plots also accept multiclass input via one-vs-rest: pass `proba_columns` and `classes`, optionally together with a split column.

In [None]:
df_mc_split = df_mc.copy()
df_mc_split["set"] = np.where(
    np.arange(len(df_mc_split)) < int(0.7 * len(df_mc_split)), "train", "test"
)

classification.discrimination_threshold_plot(
    df_mc_split,
    proba_columns=proba_cols,
    classes=classes,
    split_column="set",
    n_thresholds=101,
    title="Multiclass Discrimination Threshold (OvR)",
)

In [None]:
classification.lift_gain_ks_plot(
    df_mc_split,
    proba_columns=proba_cols,
    classes=classes,
    split_column="set",
    title="Multiclass Lift / Gains / KS (OvR)",
)

In [None]:
classification.score_distribution_plot(
    df_mc_split,
    proba_columns=proba_cols,
    classes=classes,
    split_column="set",
    nbins=40,
    title="Multiclass Score Distributions (OvR)",
)

## Regression diagnostics

Residuals vs Fitted, Q-Q plot, binned actual vs predicted (with CI), and residuals by group.
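
The quantities behind a residual Q-Q plot can be sketched with NumPy and the standard library. The residual convention (`y_true - y_pred`), the plotting positions `(i - 0.5) / n`, and the `qq_points` name are assumptions for illustration, not the library's documented behavior.

```python
import numpy as np
from statistics import NormalDist


def qq_points(y_true, y_pred):
    """Hypothetical helper: theoretical vs. sample quantiles of standardized residuals."""
    resid = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    # Standardize, then sort to obtain the sample quantiles.
    sample = np.sort((resid - resid.mean()) / resid.std(ddof=1))
    n = len(sample)
    # Plotting positions strictly inside (0, 1), so inv_cdf is always defined.
    probs = (np.arange(1, n + 1) - 0.5) / n
    theoretical = np.array([NormalDist().inv_cdf(p) for p in probs])
    return theoretical, sample


rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200)
theo, samp = qq_points(2.0 * x + rng.normal(0, 1.5, size=200), 2.0 * x)
print(np.corrcoef(theo, samp)[0, 1])
```

When the residuals are roughly normal, the points lie close to a straight line, so the correlation between the two quantile sets is near 1.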

In [None]:
# Residuals vs Fitted
regression.residuals_vs_fitted_plot(
    df_reg, y_true="y_true", y_pred="y_pred", split_column="set"
)

In [None]:
# Q-Q plot for residuals
regression.qq_plot(df_reg, y_true="y_true", y_pred="y_pred")

In [None]:
# Binned actual vs predicted with confidence intervals
regression.binned_actual_vs_pred_plot(
    df_reg, y_true="y_true", y_pred="y_pred", n_bins=10
)

In [None]:
# Residuals by group (use 'set' column as example)
regression.residuals_by_group_plot(
    df_reg, y_true="y_true", y_pred="y_pred", group="set"
)

## Model Comparison / Curves

Examples for grouped metric bars and precomputed learning/validation curves.

In [None]:
# Metrics by split (grouped bar chart)
df_metrics = pd.DataFrame(
    {
        "model": ["m1", "m1", "m2", "m2"],
        "set": ["train", "test", "train", "test"],
        "accuracy": [0.95, 0.90, 0.93, 0.89],
        "f1": [0.94, 0.88, 0.92, 0.86],
    }
)
comparison.metrics_by_split_bar_plot(
    df_metrics, metrics=["accuracy", "f1"], model_col="model", split_col="set"
)

In [None]:
# Learning curve from precomputed scores (toy example)
train_sizes = [50, 100, 200, 400]
train_scores = [0.8, 0.85, 0.88, 0.90]
val_scores = [0.78, 0.82, 0.86, 0.87]
comparison.learning_curve_plot(train_sizes, train_scores, val_scores)

In [None]:
# Validation curve from precomputed scores (toy example)
param_values = [0.001, 0.01, 0.1, 1.0]
train_scores = [0.6, 0.75, 0.85, 0.88]
val_scores = [0.58, 0.72, 0.80, 0.83]
comparison.validation_curve_plot(
    param_values, train_scores, val_scores, param_name="C", xscale="log"
)