# Ictonyx Example: PyTorch Classification Variability Study

This notebook trains a simple feedforward network on the Iris dataset multiple times
and reports the distribution of validation accuracy across runs.

**Requirements:** `pip install ictonyx torch scikit-learn`

In [None]:
import numpy as np
import torch
import torch.nn as nn
from sklearn.datasets import load_iris
from sklearn.preprocessing import StandardScaler

import ictonyx as ix
from ictonyx import (
    ModelConfig,
    PyTorchModelWrapper,
    ArraysDataHandler,
    run_variability_study,
)

print(f"Ictonyx v{ix.__version__}")
print(f"PyTorch v{torch.__version__}")
print(f"Device: {'cuda' if torch.cuda.is_available() else 'cpu'}")

## 1. Load and Prepare Data

We use the classic Iris dataset: 150 samples, 4 features, 3 classes.
`StandardScaler` standardizes each feature to zero mean and unit variance, which typically helps the network converge faster.
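
The scaling step in this section fits the scaler on all 150 samples before any data is held out. That is fine for a demo, but it lets held-out samples influence the scaling statistics. The following is a minimal sketch of the alternative, where the statistics come from the training split only. It uses plain NumPy and synthetic stand-in data (not the real Iris values), and the 80/20 split is done by hand rather than through `ArraysDataHandler`, whose internal splitting is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in with the same shape as Iris (150 x 4); NOT the real data.
X = rng.normal(loc=5.0, scale=2.0, size=(150, 4))

# Manual 80/20 split on shuffled indices.
indices = rng.permutation(len(X))
n_test = int(0.2 * len(X))
test_idx, train_idx = indices[:n_test], indices[n_test:]

# Compute per-feature statistics from the training rows only.
# np.std defaults to the population standard deviation, as StandardScaler does.
mean = X[train_idx].mean(axis=0)
std = X[train_idx].std(axis=0)

# Apply the same training statistics to both splits.
X_train = ((X[train_idx] - mean) / std).astype(np.float32)
X_test = ((X[test_idx] - mean) / std).astype(np.float32)

print(X_train.shape, X_test.shape)
```

With scikit-learn, the equivalent is to call `fit` on the training rows and `transform` on the rest, instead of `fit_transform` on the full array.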

In [None]:
iris = load_iris()
X = StandardScaler().fit_transform(iris.data).astype(np.float32)
y = iris.target.astype(np.int64)

data_handler = ArraysDataHandler(X, y, test_size=0.2, val_size=0.2)

print(f"Samples: {len(X)}")
print(f"Features: {X.shape[1]}")
print(f"Classes: {len(np.unique(y))}")

## 2. Define the Model Builder

The model builder is a **factory function** that creates a fresh model for each run.
This is essential: every run in the study must start from its own randomly initialized weights, not from a model that an earlier run has already trained.

The `PyTorchModelWrapper` takes:
- An `nn.Module` (your network architecture)
- A loss function (`criterion`)
- An optimizer class plus its parameters (`optimizer_class`, `optimizer_params`) rather than an optimizer instance, because an optimizer must be bound to each new model's parameters
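
To see why a class plus parameters is passed instead of a ready-made optimizer, consider how a PyTorch optimizer is constructed: it receives the parameter tensors of one specific model and updates only those. The sketch below illustrates this with plain PyTorch. The helpers `build_model_and_optimizer` and `tracked_ids` are hypothetical names for this example, and nothing here is meant to describe how `PyTorchModelWrapper` works internally.

```python
import torch
import torch.nn as nn


def build_model_and_optimizer(optimizer_class, optimizer_params):
    """Create a fresh model and bind a new optimizer to its parameters."""
    model = nn.Linear(4, 3)  # tiny stand-in for the Iris network
    optimizer = optimizer_class(model.parameters(), **optimizer_params)
    return model, optimizer


def tracked_ids(optimizer):
    """Identity of every tensor the optimizer will update."""
    return {id(p) for group in optimizer.param_groups for p in group["params"]}


model_a, opt_a = build_model_and_optimizer(torch.optim.Adam, {"lr": 0.01})
model_b, opt_b = build_model_and_optimizer(torch.optim.Adam, {"lr": 0.01})

# opt_a tracks exactly model_a's tensors and none of model_b's,
# so an optimizer instance built for one model cannot train another.
a_tracks_a = tracked_ids(opt_a) == {id(p) for p in model_a.parameters()}
a_tracks_b = bool(tracked_ids(opt_a) & {id(p) for p in model_b.parameters()})
print(a_tracks_a, a_tracks_b)
```

Because the binding happens at construction time, a factory that receives the class and its keyword arguments can rebuild the optimizer for every new model.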

In [None]:
def create_iris_net(config: ModelConfig) -> PyTorchModelWrapper:
    """Factory function: creates a fresh model each run."""
    model = nn.Sequential(
        nn.Linear(4, 32),
        nn.ReLU(),
        nn.Dropout(0.2),
        nn.Linear(32, 16),
        nn.ReLU(),
        nn.Linear(16, 3),
    )
    return PyTorchModelWrapper(
        model,
        criterion=nn.CrossEntropyLoss(),
        optimizer_class=torch.optim.Adam,
        optimizer_params={'lr': config.get('learning_rate', 0.01)},
        task='classification',
    )

# Quick sanity check
test_wrapper = create_iris_net(ModelConfig({'learning_rate': 0.01}))
print(repr(test_wrapper))

## 3. Run the Variability Study

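This trains the model 10 times. Each run starts from a different random initialization,
but seeding is deterministic (`seed=42`), so rerunning the study should reproduce the same results.
On a GPU, bitwise-identical results can additionally depend on PyTorch's deterministic-algorithm settings.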
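
One common way to get runs that differ from each other yet repeat exactly is to derive a distinct per-run seed from a single base seed. The sketch below shows that idea using only the standard library. It is an assumption about the general technique, not a description of what `run_variability_study` does internally, and `derive_run_seeds` and `simulate_run` are hypothetical names.

```python
import random


def derive_run_seeds(base_seed: int, num_runs: int) -> list[int]:
    """Derive one distinct seed per run from a single base seed."""
    rng = random.Random(base_seed)
    # sample() draws without replacement, so no two runs share a seed.
    return rng.sample(range(2**32), num_runs)


def simulate_run(run_seed: int) -> float:
    """Stand-in for one training run; returns a pseudo-random number, not a real metric."""
    return random.Random(run_seed).random()


# Two executions of the same "study" with the same base seed.
first = [simulate_run(s) for s in derive_run_seeds(42, 10)]
second = [simulate_run(s) for s in derive_run_seeds(42, 10)]

print(first == second)
```

In a real training loop, each derived seed would typically be passed to `torch.manual_seed` and `np.random.seed` before the model is built.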

In [None]:
config = ModelConfig({
    'epochs': 30,
    'batch_size': 16,
    'learning_rate': 0.01,
    'verbose': 0,
})

results = run_variability_study(
    model_builder=create_iris_net,
    data_handler=data_handler,
    model_config=config,
    num_runs=10,
    seed=42,
)

## 4. Examine Results

In [None]:
print(results.summarize())

In [None]:
print("Available metrics:", results.get_available_metrics())
print()
print("Per-run val_accuracy:")
for i, acc in enumerate(results.get_metric_values('val_accuracy'), 1):
    print(f"  Run {i}: {acc:.4f}")
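
The per-run values can also be reduced to a few descriptive statistics by hand, which is useful for checking the summary or for custom reporting. The following is a minimal sketch using the standard library. `summarize_runs` is a hypothetical helper, and the accuracies are placeholders chosen for illustration, not results from this study.

```python
import statistics


def summarize_runs(values):
    """Return basic descriptive statistics for one metric across runs."""
    values = list(values)
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        # Sample standard deviation needs at least two runs.
        "std": statistics.stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
    }


# Placeholder accuracies for illustration only; NOT results from this study.
placeholder_accuracies = [0.90, 0.95, 1.00, 0.95]
summary = summarize_runs(placeholder_accuracies)
print({key: round(value, 4) for key, value in summary.items()})
```

In the notebook, the same helper could be called as `summarize_runs(results.get_metric_values('val_accuracy'))`, assuming that call returns a sequence of floats, as the per-run loop suggests.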

In [None]:
# Summary DataFrame — one row per run, all final-epoch metrics
results.to_dataframe()