# 01 – H₀-only Gaussian mixture simulation

This notebook reproduces the **H₀-only simulation study** from the manuscript.

- **Data generation:** 3-class Gaussian mixture in $\mathbb{R}^2$ (synthetic “subjects”).
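- **Design:** 10 subjects per class, with class-specific mean structure; the experiment is repeated for `delta` (Δ) ∈ {2.0, 1.0, 0.5}.
- **Topological features:** H₀ (MST-based) features only, extracted via `graphph.h0_experiment`.
- **Modeling:** Fits the proposed model to each class's H₀ features and classifies subjects (`gph.fit_and_classify`).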
- **Outputs:** Classification performance, estimated Λ matrices, and diagnostic plots corresponding to the H₀-only simulation results in the paper.
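
For readers unfamiliar with MST-based H₀ features, the following is a minimal sketch of the underlying idea, not the implementation in `graphph.h0_experiment`. It assumes a Euclidean point cloud and the convention that an edge enters the filtration at its length, so the finite H₀ death times equal the minimum spanning tree edge lengths. The function name `h0_death_times` is hypothetical.

```python
import math


def h0_death_times(points):
    """Return the finite H0 death times of a point cloud, in ascending order.

    Every point is born at scale 0. Running Kruskal's algorithm over the
    pairwise distances, each edge that merges two components kills one H0
    class, so the death times are exactly the MST edge lengths.
    """
    n = len(points)
    # All pairwise edges as (length, i, j), sorted by length.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    deaths = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two components: one H0 class dies
            parent[ri] = rj
            deaths.append(length)
    return deaths  # n - 1 values for n points


# Three collinear points: the MST uses the edges of length 1 and 2.
toy_points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
toy_deaths = h0_death_times(toy_points)
print(toy_deaths)
```

A cloud made of well-separated clusters yields a few large death times (the gaps between clusters) on top of many small ones, which is the kind of signal H₀ features can carry about class-specific mean structure.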

Requirements: the `graphph` package in this repository must be importable (see `README.md`).


# Imports

In [2]:
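from dataclasses import replace  # used in the DELTAS loop; assumes gph.SimConfig is a dataclass
from pathlib import Path
import os
import sys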

# add repo root to sys.path
ROOT = Path("..").resolve()
if str(ROOT) not in sys.path:
    sys.path.insert(0, str(ROOT))

import graphph as gph
# all package functions below are accessed as gph.<name>

# Configs

In [4]:
DELTAS = (2.0, 1.0, 0.5)
MODE: gph.SimMode = "anchored"

base_simcfg = gph.SimConfig(
    n=150, S=10, sigma=0.25, delta=2.0,
    trans_sd=0.0, scale_lo=1.0, scale_hi=1.0,
    jitter_sd=0.07, seed=2025,
)

fitcfg = gph.FitConfig(
    m=2, kappa=6.0, alpha=0.2,
    num_warmup=600, num_samples=800,
    num_chains=1, target_accept=0.8,
    seed=2025, dense_mass=False,
)
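
The sweep over `DELTAS` varies a single field of the base simulation config while leaving everything else fixed. A minimal sketch of that pattern, assuming the config classes are dataclasses; `ToyConfig` is a hypothetical stand-in and does not mirror the real fields of `gph.SimConfig`.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToyConfig:
    """Hypothetical stand-in for a simulation config (not gph.SimConfig)."""
    delta: float = 2.0
    seed: int = 2025


base = ToyConfig()

# replace() returns a new instance with the given fields changed;
# the base config itself is never mutated.
variants = [replace(base, delta=d) for d in (2.0, 1.0, 0.5)]

# Filesystem-friendly tags, built the same way as `dtag` in the loop cell.
tags = [str(v.delta).replace(".", "p") for v in variants]
print(tags)
```

Because every variant is derived from the same base object, all runs share the same seed and remaining settings, so differences between runs can be attributed to `delta` alone.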


# Loop over DELTAS

In [6]:
for d in DELTAS:
    simcfg_d = replace(base_simcfg, delta=d)
    dtag = str(d).replace(".", "p")
    savecfg = gph.SaveConfig(out_dir=f"runs/anchored_coupled_delta{dtag}", thin_every=1)

    results, anchors = gph.simulate_dataset(simcfg_d, mode=MODE)

    # integrity check on the first subject's events in each class
    for cid in results:
        gph.check_events_integrity(results[cid][0]["events"])

    models, C, acc = gph.fit_and_classify(results, fitcfg, save_thin_every=savecfg.thin_every)
    print(f"[Δ={d}] Confusion (rows=true, cols=pred):\n{C}")
    print(f"[Δ={d}] Accuracy: {acc:.3f}")

    # gph.save_experiment(savecfg, simcfg_d, fitcfg, MODE, results, anchors, models, C, acc)


sample: 100%|█| 1400/1400 [00:14<00:00, 94.36it/s, 15 steps of size 2.23e-01. acc. pro


[train] class=1G  time=22.56s


sample: 100%|█| 1400/1400 [00:12<00:00, 108.12it/s, 15 steps of size 2.15e-01. acc. pr


[train] class=2G_11  time=12.99s


sample: 100%|█| 1400/1400 [00:17<00:00, 79.03it/s, 31 steps of size 1.63e-01. acc. pro


[train] class=2G_37  time=17.75s
[Δ=2.0] Confusion (rows=true, cols=pred):
[[10  0  0]
 [ 0 10  0]
 [ 0  0 10]]
[Δ=2.0] Accuracy: 1.000


sample: 100%|█| 1400/1400 [00:14<00:00, 98.56it/s, 15 steps of size 2.23e-01. acc. pro


[train] class=1G  time=14.24s


sample: 100%|█| 1400/1400 [00:13<00:00, 107.59it/s, 15 steps of size 2.43e-01. acc. pr


[train] class=2G_11  time=13.04s


sample: 100%|█| 1400/1400 [00:16<00:00, 87.40it/s, 31 steps of size 1.92e-01. acc. pro


[train] class=2G_37  time=16.05s
[Δ=1.0] Confusion (rows=true, cols=pred):
[[10  0  0]
 [ 0 10  0]
 [ 0  0 10]]
[Δ=1.0] Accuracy: 1.000


sample: 100%|█| 1400/1400 [00:13<00:00, 101.06it/s, 15 steps of size 2.23e-01. acc. pr


[train] class=1G  time=13.89s


sample: 100%|█| 1400/1400 [00:13<00:00, 102.14it/s, 15 steps of size 2.32e-01. acc. pr


[train] class=2G_11  time=13.74s


sample: 100%|█| 1400/1400 [00:16<00:00, 87.33it/s, 31 steps of size 1.87e-01. acc. pro


[train] class=2G_37  time=16.16s
[Δ=0.5] Confusion (rows=true, cols=pred):
[[10  0  0]
 [ 0 10  0]
 [ 0  0 10]]
[Δ=0.5] Accuracy: 1.000
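
The printed accuracy can be cross-checked against the confusion matrix by hand. The sketch below assumes accuracy is the usual fraction of correctly classified subjects (diagonal sum over total); how `gph.fit_and_classify` computes `acc` internally is not shown here, and both helper names are hypothetical.

```python
def accuracy_from_confusion(C):
    """Fraction of correct predictions: trace of C divided by the sum of C."""
    total = sum(sum(row) for row in C)
    correct = sum(C[i][i] for i in range(len(C)))
    return correct / total


def per_class_recall(C):
    """Recall per true class: diagonal entry divided by its row sum."""
    return [row[i] / sum(row) for i, row in enumerate(C)]


# Confusion matrix printed above (rows = true class, cols = predicted class).
C = [
    [10, 0, 0],
    [0, 10, 0],
    [0, 0, 10],
]
acc_check = accuracy_from_confusion(C)
recalls = per_class_recall(C)
print(f"Accuracy: {acc_check:.3f}")
print(f"Per-class recall: {recalls}")
```

With 10 subjects per class and a purely diagonal matrix, all 30 subjects are classified correctly at each of the three Δ values, matching the reported accuracy of 1.000.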
