# Random Data Demo

This notebook generates random data and shows a Plotly ML pairplot.

## Import Required Libraries
Import NumPy and pandas for data generation and handling, Polars for the DataFrame passed to the plot, and `pairplot` from Plotly ML for visualization.

In [1]:
import numpy as np
import pandas as pd
import polars as pl

from plotly_ml import pairplot

## Generate Random Data
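Draw three numeric columns and one categorical column from a NumPy generator with a fixed seed for reproducibility, then build a pandas DataFrame and convert it to Polars. `x2` is constructed as a noisy linear function of `x1`, so the two are correlated; `x3` is drawn independently.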

In [2]:
rng = np.random.default_rng(42)

n_samples = 300
x1 = rng.normal(loc=0.0, scale=1.0, size=n_samples)
x2 = 0.6 * x1 + rng.normal(loc=0.0, scale=0.7, size=n_samples)
x3 = rng.normal(loc=2.0, scale=1.2, size=n_samples)

categories = rng.choice(["A", "B", "C"], size=n_samples, p=[0.4, 0.35, 0.25])

df_pd = pd.DataFrame({"x1": x1, "x2": x2, "x3": x3, "group": categories})
df_pl = pl.from_pandas(df_pd)

(df_pd.head(), df_pl.head())

(         x1        x2        x3 group
 0  0.304717  1.391975  2.618492     B
 1 -1.039984 -1.697693  1.306953     C
 2  0.750451  1.054950  3.529337     C
 3  0.940565  0.334371  1.246895     A
 4 -1.951035 -1.213548  1.236062     C,
 shape: (5, 4)
 ┌───────────┬───────────┬──────────┬───────┐
 │ x1        ┆ x2        ┆ x3       ┆ group │
 │ ---       ┆ ---       ┆ ---      ┆ ---   │
 │ f64       ┆ f64       ┆ f64      ┆ str   │
 ╞═══════════╪═══════════╪══════════╪═══════╡
 │ 0.304717  ┆ 1.391975  ┆ 2.618492 ┆ B     │
 │ -1.039984 ┆ -1.697693 ┆ 1.306953 ┆ C     │
 │ 0.750451  ┆ 1.05495   ┆ 3.529337 ┆ C     │
 │ 0.940565  ┆ 0.334371  ┆ 1.246895 ┆ A     │
 │ -1.951035 ┆ -1.213548 ┆ 1.236062 ┆ C     │
 └───────────┴───────────┴──────────┴───────┘)
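The effect of the fixed seed can be checked directly. The following is a minimal sketch, not part of the original notebook; the helper name `draw` is hypothetical and only wraps the same `default_rng` and `normal` calls used above.

```python
import numpy as np


def draw(seed: int, n_samples: int = 300) -> np.ndarray:
    """Draw standard-normal samples from a freshly seeded generator (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    return rng.normal(loc=0.0, scale=1.0, size=n_samples)


same_a = draw(42)
same_b = draw(42)
other = draw(43)

# Two generators created with the same seed produce identical draws.
print(np.array_equal(same_a, same_b))  # True
# A different seed produces a different sequence.
print(np.array_equal(same_a, other))   # False
```

Note that the order of the draws matters: the notebook's `x1`, `x2`, `x3`, and `categories` all come from one generator, so inserting or reordering a call changes every later column.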

## Display Random Data
Print the shapes of both DataFrames and show the first few rows of the pandas one.

In [3]:
print("Pandas shape:", df_pd.shape)
print("Polars shape:", df_pl.shape)

df_pd.head()

Pandas shape: (300, 4)
Polars shape: (300, 4)


|   | x1 | x2 | x3 | group |
|---|---|---|---|---|
| 0 | 0.304717 | 1.391975 | 2.618492 | B |
| 1 | -1.039984 | -1.697693 | 1.306953 | C |
| 2 | 0.750451 | 1.054950 | 3.529337 | C |
| 3 | 0.940565 | 0.334371 | 1.246895 | A |
| 4 | -1.951035 | -1.213548 | 1.236062 | C |


## Basic Summary Statistics
Compute the mean, standard deviation, min, and max of the three numeric columns.

In [4]:
summary = df_pd[["x1", "x2", "x3"]].agg(["mean", "std", "min", "max"])
summary

|   | x1 | x2 | x3 |
|---|---|---|---|
| mean | -0.041079 | -0.032524 | 1.904914 |
| std | 0.930272 | 0.918715 | 1.217249 |
| min | -2.566658 | -3.305273 | -1.207401 |
| max | 2.913862 | 2.529016 | 5.814624 |
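Because the pairplot later colors points by `group`, a per-group version of the same summary may be a useful companion. This is a sketch that is not in the original notebook: it rebuilds the data with the same seed and call order, and the names `group_summary` and `group_share` are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

# Rebuild the notebook's data with the same seed and the same order of draws.
rng = np.random.default_rng(42)
n_samples = 300
x1 = rng.normal(loc=0.0, scale=1.0, size=n_samples)
x2 = 0.6 * x1 + rng.normal(loc=0.0, scale=0.7, size=n_samples)
x3 = rng.normal(loc=2.0, scale=1.2, size=n_samples)
categories = rng.choice(["A", "B", "C"], size=n_samples, p=[0.4, 0.35, 0.25])
df_pd = pd.DataFrame({"x1": x1, "x2": x2, "x3": x3, "group": categories})

# Mean and standard deviation of each numeric column within each group.
group_summary = df_pd.groupby("group")[["x1", "x2", "x3"]].agg(["mean", "std"])

# Observed share of each group, to compare against p=[0.4, 0.35, 0.25].
group_share = df_pd["group"].value_counts(normalize=True).sort_index()

print(group_summary)
print(group_share)
```

Since the group labels are drawn independently of the numeric columns, the per-group means should differ only by sampling noise.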


## Pairplot with Plotly ML
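Draw a default pairplot from the Polars DataFrame, then a customized one colored by the categorical `group` column, with histograms on the diagonal, OLS trend lines, and Pearson and Spearman correlations.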

In [None]:
from plotly_ml import pairplot_html_file

fig_default = pairplot(df_pl)
fig_default

fig = pairplot(
    df_pl,
    hue="group",
    diag="hist",
    trend="ols",
    corr=["pearson", "spearman"],
    link_selection=True,
    use_webgl=False,
)
fig

<plotly_ml._widget.PairplotWidget object at 0x7f3a872b75f0>
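The widget output above carries no numbers, so the correlations can be cross-checked separately. This sketch assumes that `corr=["pearson", "spearman"]` refers to the standard Pearson and Spearman coefficients, which the notebook does not confirm. It uses only pandas' `DataFrame.corr` and does not call Plotly ML.

```python
import numpy as np
import pandas as pd

# Rebuild the numeric columns with the same seed and order of draws as the notebook.
rng = np.random.default_rng(42)
n_samples = 300
x1 = rng.normal(loc=0.0, scale=1.0, size=n_samples)
x2 = 0.6 * x1 + rng.normal(loc=0.0, scale=0.7, size=n_samples)
x3 = rng.normal(loc=2.0, scale=1.2, size=n_samples)
numeric = pd.DataFrame({"x1": x1, "x2": x2, "x3": x3})

# Pearson measures linear association; Spearman is Pearson computed on ranks.
pearson = numeric.corr(method="pearson")
spearman = numeric.corr(method="spearman")

# Population Pearson correlation of x1 and x2 implied by the construction:
# cov = 0.6, var(x1) = 1, var(x2) = 0.6**2 + 0.7**2 = 0.85.
expected_x1_x2 = 0.6 / np.sqrt(0.6**2 + 0.7**2)

print(pearson.round(3))
print(spearman.round(3))
print(f"Expected population corr(x1, x2): {expected_x1_x2:.3f}")
```

The sample value for `x1` and `x2` should land near the expected population value of about 0.651, while the entries involving `x3` should be close to zero.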

In [6]:
# Test standalone HTML export (crossfilter via injected JS)
html_path = pairplot_html_file(
    "pairplot_demo.html",
    df_pl,
    hue="group",
    diag="hist",
    trend="ols",
    corr=["pearson", "spearman"],
)
print(f"HTML written to {html_path}")

# Test plain Figure for Dash (link_selection=True but return_widget=False)
fig_dash = pairplot(
    df_pl,
    hue="group",
    link_selection=True,
    return_widget=False,
)
print(f"Dash figure type: {type(fig_dash).__name__}")

HTML written to pairplot_demo.html
Dash figure type: Figure
