# Random Data Demo

This notebook generates random data and shows a Plotly ML pairplot.

## Import Required Libraries
Import NumPy and pandas for data generation and handling, plus Polars and the `pairplot` function from Plotly ML for visualization.

In [7]:
import numpy as np
import pandas as pd
import polars as pl

from plotly_ml import pairplot

## Generate Random Data
Create random arrays with NumPy's `default_rng`, seeded for reproducibility, and build matching pandas and Polars DataFrames from them.

In [13]:
rng = np.random.default_rng(42)

n_samples = 300_000
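x1 = rng.normal(loc=0.0, scale=1.0, size=n_samples)
# x2 is linearly related to x1 (slope 0.6) plus independent Gaussian noise
x2 = 0.6 * x1 + rng.normal(loc=0.0, scale=0.7, size=n_samples)
# x3 is drawn independently of x1 and x2
x3 = rng.normal(loc=2.0, scale=1.2, size=n_samples)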

categories = rng.choice(["A", "B", "C"], size=n_samples, p=[0.4, 0.35, 0.25])

df_pd = pd.DataFrame({"x1": x1, "x2": x2, "x3": x3, "group": categories})
df_pl = pl.from_pandas(df_pd)

(df_pd.head(), df_pl.head())

(         x1        x2        x3 group
 0  0.304717  1.407710  1.550531     A
 1 -1.039984 -0.166338  2.233196     A
 2  0.750451  0.165966  2.298790     C
 3  0.940565  0.735754  3.559926     B
 4 -1.951035 -1.733685  3.892370     B,
 shape: (5, 4)
 ┌───────────┬───────────┬──────────┬───────┐
 │ x1        ┆ x2        ┆ x3       ┆ group │
 │ ---       ┆ ---       ┆ ---      ┆ ---   │
 │ f64       ┆ f64       ┆ f64      ┆ str   │
 ╞═══════════╪═══════════╪══════════╪═══════╡
 │ 0.304717  ┆ 1.40771   ┆ 1.550531 ┆ A     │
 │ -1.039984 ┆ -0.166338 ┆ 2.233196 ┆ A     │
 │ 0.750451  ┆ 0.165966  ┆ 2.29879  ┆ C     │
 │ 0.940565  ┆ 0.735754  ┆ 3.559926 ┆ B     │
 │ -1.951035 ┆ -1.733685 ┆ 3.89237  ┆ B     │
 └───────────┴───────────┴──────────┴───────┘)

## Display Random Data
Print the shapes of both DataFrames and show the first few rows of the pandas one.

In [14]:
print("Pandas shape:", df_pd.shape)
print("Polars shape:", df_pl.shape)

df_pd.head()

Pandas shape: (300000, 4)
Polars shape: (300000, 4)


         x1        x2        x3 group
0  0.304717  1.407710  1.550531     A
1 -1.039984 -0.166338  2.233196     A
2  0.750451  0.165966  2.298790     C
3  0.940565  0.735754  3.559926     B
4 -1.951035 -1.733685  3.892370     B


## Basic Summary Statistics
Compute mean, standard deviation, min, and max for the random data.

In [15]:
summary = df_pd[["x1", "x2", "x3"]].agg(["mean", "std", "min", "max"])
summary

            x1        x2        x3
mean  0.000043 -0.001667  2.000448
std   1.000688  0.921722  1.201027
min  -4.928238 -4.075195 -3.792547
max   5.007235  4.312461  7.300274
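
As a sanity check on the summary above, the standard deviation of `x2` and its correlation with `x1` can be derived from the generation parameters. The sketch below assumes `x1` and the noise term are independent, which holds when they are drawn separately from the generator. The variable names are illustrative and not part of the notebook.

```python
import math

# Parameters copied from the data-generation cell
slope = 0.6        # coefficient on x1 in x2 = 0.6 * x1 + noise
sigma_x1 = 1.0     # scale (standard deviation) of x1
sigma_noise = 0.7  # scale of the independent noise added to x2

# Var(x2) = slope^2 * Var(x1) + Var(noise) for independent x1 and noise
std_x2 = math.sqrt(slope**2 * sigma_x1**2 + sigma_noise**2)

# Corr(x1, x2) = Cov(x1, x2) / (std_x1 * std_x2) = slope * sigma_x1 / std_x2
corr_x1_x2 = slope * sigma_x1 / std_x2

print(f"theoretical std(x2):      {std_x2:.4f}")
print(f"theoretical corr(x1, x2): {corr_x1_x2:.4f}")
```

The theoretical standard deviation (about 0.922) agrees with the sample value 0.921722 in the summary, and the sample Pearson correlation between `x1` and `x2` should come out close to 0.65.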


## Pairplot with Plotly ML
Use the Polars DataFrame and color by the categorical group.

### If the crossfiltering widget doesn’t render
The crossfiltering pairplot uses **AnyWidget** (an `ipywidgets`-compatible custom widget). If you only see text like `PairplotWidget(...)` instead of an interactive plot, it means the frontend is not rendering widgets.

In **VS Code** this is usually one of:
- Notebook / workspace is **not trusted** (restricted mode disables widget JS).
- Widget rendering is disabled or the **Jupyter Renderers** extension is missing/disabled.
- `Jupyter: Widget Script Sources` is set to **block** scripts. Plotly.js is loaded in the browser, so blocked scripts prevent the plot from rendering.

Quick checks/fixes:
- Trust the workspace (VS Code: Command Palette → `Workspaces: Manage Workspace Trust`).
- Ensure extensions are enabled: `ms-toolsai.jupyter` and `ms-toolsai.jupyter-renderers`.
- Set `Jupyter: Widget Script Sources` to allow scripts (e.g. `All`), then restart the kernel and re-run the cell.
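
Before changing editor settings, it can also help to confirm that the widget packages are importable from the kernel's environment. The following is a minimal sketch using only the standard library. The package names `anywidget` and `ipywidgets` come from the description above, while the helper `widget_backend_status` is hypothetical, and treating a missing package as a possible cause is an assumption rather than something Plotly ML documents.

```python
import importlib.util


def widget_backend_status(packages=("anywidget", "ipywidgets")):
    """Return {package_name: True/False} depending on whether it is importable."""
    # find_spec returns None when a top-level package cannot be located
    return {name: importlib.util.find_spec(name) is not None for name in packages}


status = widget_backend_status()
for name, available in status.items():
    print(f"{name}: {'found' if available else 'NOT found in this environment'}")
```

Run this in a notebook cell so that it inspects the same interpreter the kernel uses. If both packages are found, the problem is more likely one of the frontend settings listed above.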

In [17]:
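# pairplot_html_file is imported here and used in the HTML export cell below
from plotly_ml import pairplot_html_file

# Default pairplot over all columns. Only the last expression in a cell is
# rendered, so this bare expression produces no output here.
fig_default = pairplot(df_pl)
fig_default

# Pairplot colored by group, with linked selection (crossfiltering) enabled
fig = pairplot(
    df_pl,
    hue="group",
    diag="hist",
    trend="ols",
    corr=["pearson", "spearman"],
    link_selection=True,
    use_webgl=True,
)
fig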

<plotly_ml._widget.PairplotWidget object at 0x7f7c6a5a3020>

In [12]:
# Test standalone HTML export (crossfilter via injected JS)
html_path = pairplot_html_file(
    "pairplot_demo.html",
    df_pl,
    hue="group",
    diag="hist",
    trend="ols",
    corr=["pearson", "spearman"],
)
print(f"HTML written to {html_path}")

# Test plain Figure for Dash (link_selection=True but return_widget=False)
fig_dash = pairplot(
    df_pl,
    hue="group",
    link_selection=True,
    return_widget=False,
)
print(f"Dash figure type: {type(fig_dash).__name__}")

HTML written to pairplot_demo.html
Dash figure type: Figure
