**For correct rendering, view this notebook in [nbviewer](https://nbviewer.org/github/markuskrecik/preference-dynamics-learning/blob/main/notebooks/40_training_cnn_n1.ipynb)**

# 1d CNN Model Training for n=1 action

This notebook trains a 1d CNN model to predict preference dynamics parameters in the case of n=1 action.

The CNN pytorch model is automatically constructed through the `CNN1DPredictor` class with a `CNN1DConfig`. It combines multiple initial convolutional and pooling layers with subsequent fully connected layers.

The Trainer class is configured through the `TrainerConfig`. If a validation DataLoader is provided, the training loop uses early stopping. The trainer automatically creates checkpoints of the training and model state for the best and last epoch, and supports resuming from a checkpoint. If the trainer is running inside an mlflow run, training metrics are logged automatically.
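
The early-stopping rule can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's trainer code: the `EarlyStopper` class, its fields, and the validation losses are all hypothetical, and the actual stopping criterion may differ (for example by requiring a minimum improvement).

```python
from dataclasses import dataclass


@dataclass
class EarlyStopper:
    """Hypothetical helper; not the project's Trainer implementation."""

    patience: int
    best_loss: float = float("inf")
    best_epoch: int = -1
    epochs_without_improvement: int = 0

    def step(self, epoch: int, val_loss: float) -> bool:
        """Record one validation loss; return True when training should stop."""
        if val_loss < self.best_loss:
            # New best epoch: this is where a "best" checkpoint would be written.
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience


# Made-up validation losses: improvement stalls after epoch 2.
val_losses = [1.00, 0.90, 0.85, 0.86, 0.87, 0.88, 0.84]
stopper = EarlyStopper(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.step(epoch, loss):
        stopped_at = epoch
        break
```

With a patience of 3, the loop stops at epoch 5, three epochs after the best validation loss at epoch 2. The later improvement at epoch 6 is never seen, which is the usual trade-off of a small patience value.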

The `ExperimentRunner` class can perform single runs and parallelized hyperparameter optimization studies through Optuna with mlflow logging. It supports resuming previous studies. Parameter studies are defined through subclassing with a `suggest_parameters` method.
mlflow runs can be examined in the UI through `uv run mlflow ui --port 5000 --backend-store-uri sqlite:///mlruns.db`.

All classes can be extensively configured through their config schemas beyond the options shown here.

**This notebook:**
- Trains a 1d CNN model for the easiest case of n=1 action
- Evaluates the model on the test set with various metrics
- Performs a hyperparameter study to identify the best configuration of convolutional and hidden layers
- Compares time series of true and predicted parameters


## Training for 1 action


In [None]:
%load_ext autoreload
%autoreload 2


import numpy as np
from optuna.visualization import plot_parallel_coordinate, plot_contour
from plotly.offline import init_notebook_mode

init_notebook_mode(connected=True)

from preference_dynamics.schemas import (
    ParameterVector,
    ICVector,
    ODEConfig,
    SolverConfig,
    ODESolverConfig,
    TrainerConfig,
    RunnerConfig,
)

from preference_dynamics.solver import create_default_sampler, generate_batch, solve_ODE
from preference_dynamics.data import DataConfig, DataManager
from preference_dynamics.data.transformer import SampleGroupStdNormalizer
from preference_dynamics.data.adapters import StateInputAdapter, ParameterICTargetAdapter
from preference_dynamics.models import CNN1DConfig
from preference_dynamics.training import compute_metrics
from preference_dynamics.experiments import ExperimentRunner
from preference_dynamics.visualization import (
    plot_metrics,
    plot_parameter_comparison,
    plot_time_series,
    plot_training_curves,
)
from preference_dynamics.utils import (
    num_params,
    num_vars,
    assemble_checkpoint_path,
    get_param_names,
    get_var_names,
)


n_actions = 1
data_dir = f"data/n{n_actions}"
model_name = f"cnn1d_n{n_actions}"



### Preliminaries

Generate data if not done already, otherwise load it. I circumvent the previously identified identifiability issue of $\mu$ by setting it to 0 while sampling.

The transformer `SampleGroupStdNormalizer` normalizes each variable group (desires, efforts) separately for each sample through the standard deviation. I don't shift the mean to 0, because this would erase the information about the steady state.
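
A minimal sketch of this kind of scaling, in plain Python. It is not the `SampleGroupStdNormalizer` implementation: the function name, the dictionary layout, and the choice to pool all variables of a group and use the population standard deviation are assumptions made for illustration.

```python
from statistics import pstdev


def normalize_groups_by_std(sample, eps=1e-12):
    """Divide each variable group of one sample by that group's standard deviation.

    `sample` maps a group name to a list of time series (one per variable).
    The mean is deliberately left untouched, so the steady-state level survives.
    """
    normalized = {}
    for group, series_list in sample.items():
        # Pool all values of the group so one scale is shared within the group.
        pooled = [value for series in series_list for value in series]
        scale = max(pstdev(pooled), eps)  # guard against constant series
        normalized[group] = [[value / scale for value in series] for series in series_list]
    return normalized


# Toy sample with one variable per group (illustrative values only).
sample = {
    "desires": [[2.0, 4.0, 6.0, 8.0]],
    "efforts": [[10.0, 10.0, 30.0, 30.0]],
}
result = normalize_groups_by_std(sample)
```

After scaling, each group has unit standard deviation, while its mean stays nonzero and the ratios between values within a series are unchanged.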

The input and output adapters serve as a flexible way to define model input data and model predictions without having to change the model architecture. Here, I specify that the model receives the state vector as input through `StateInputAdapter` and predicts parameters and initial conditions through `ParameterICTargetAdapter`.

In [6]:
data_config = DataConfig(
    data_dir=data_dir,
    load_if_exists=False,
    transformers=[
        SampleGroupStdNormalizer(),
    ],
    input_adapter=StateInputAdapter(),
    target_adapter=ParameterICTargetAdapter(),
)
dm = DataManager(config=data_config)

try:
    dm.setup()
except FileNotFoundError:
    n_samples = 10000
    seed = 42

    sampler = create_default_sampler(
        n_actions=n_actions,
        random_seed=seed,
        mu_range=(0.0, 0.0),
    )

    solver_config = SolverConfig(
        time_span=(0.0, 200.0),
        n_time_points=201,
    )

    print(f"Generating {n_samples} samples for n={n_actions} actions...")
    batch = generate_batch(
        n_samples,
        sampler,
        solver_config,
        n_jobs=-1,
        show_progress=True,
        debug=False,
    )
    dm.save_raw(batch)

    dm.setup()

### Model Training

Let's start with a tiny model:
- 1 convolutional layer with 64 filters and kernel size 3
- 1 global pooling layer
- A dropout layer (rate 0.3)
- 1 hidden layer with 32 neurons
- All activation functions are ReLU
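
To get a feel for the size of this model, its trainable parameters can be counted by hand. The sketch below assumes that the convolution and linear layers carry biases, that global pooling leaves one value per filter, and that there are no further parameterized layers (such as batch normalization); `CNN1DPredictor` may differ. The helper `count_parameters` is hypothetical, as are the values `in_channels=2` and `n_targets=6`, which stand in for `dm.n_inputs` and `dm.n_targets`. The 64 filters come from `filters=[64]` in the configuration used in this notebook.

```python
def count_parameters(in_channels, filters, kernel_sizes, features):
    """Count weights and biases of a conv stack followed by fully connected layers."""
    total = 0
    channels = in_channels
    for out_channels, kernel in zip(filters, kernel_sizes):
        # Conv1d: one kernel per (input channel, output channel) pair, plus biases.
        total += channels * out_channels * kernel + out_channels
        channels = out_channels
    width = channels  # assumption: global pooling yields one value per filter
    for out_features in features:
        # Linear: weight matrix plus bias vector.
        total += width * out_features + out_features
        width = out_features
    return total


# Placeholder input/output sizes; replace with dm.n_inputs and dm.n_targets.
n_tiny = count_parameters(in_channels=2, filters=[64], kernel_sizes=[3], features=[32, 6])
```

Under these assumptions the tiny model has 2726 parameters, most of them (2080) in the first fully connected layer.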

In [7]:
model_config = CNN1DConfig(
    model_name=model_name,
    in_channels=dm.n_inputs,
    filters=[64],
    kernel_sizes=[3],
    features=[32, dm.n_targets],
    dropout=0.3,
)

trainer_config = TrainerConfig(
    loss_function="mse",
    learning_rate=0.002,
    num_epochs=200,
    early_stopping_patience=20,
)

runner_config = RunnerConfig(
    experiment_name=f"preference_dynamics_n{n_actions}",
)

runner = ExperimentRunner(
    runner_config=runner_config,
    data_config=data_config,
    model_config=model_config,
    trainer_config=trainer_config,
)

In [8]:
experiment = runner.run("base")

plot_training_curves(experiment.history)

Training:  70%|███████   | 140/200 [05:55<02:32,  2.54s/it, epoch=140, train_loss=1.01, val_loss=0.992, epoch_time=1.32]


### Model Evaluation

As the training curves show, training converged well. Let's load the best model checkpoint and evaluate the performance on the test set.
The overall and per-parameter metrics show that there is room for improvement.
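
As a rough illustration of what per-parameter metrics measure, the sketch below computes RMSE and $R^2$ for each target column in plain Python. It is not the project's `compute_metrics`; the function name, the returned keys, and the toy data are assumptions.

```python
from math import sqrt


def per_target_metrics(y_true, y_pred):
    """Return RMSE and R^2 for every target column of two equally shaped row lists."""
    n_targets = len(y_true[0])
    results = []
    for j in range(n_targets):
        true_col = [row[j] for row in y_true]
        pred_col = [row[j] for row in y_pred]
        mean_true = sum(true_col) / len(true_col)
        ss_res = sum((t - p) ** 2 for t, p in zip(true_col, pred_col))
        ss_tot = sum((t - mean_true) ** 2 for t in true_col)
        rmse = sqrt(ss_res / len(true_col))
        # R^2 is undefined for a constant target column.
        r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
        results.append({"rmse": rmse, "r2": r2})
    return results


# Toy data: column 0 is predicted perfectly, column 1 is predicted by its mean.
toy_true = [[1.0, 0.0], [2.0, 1.0], [3.0, 2.0]]
toy_pred = [[1.0, 1.0], [2.0, 1.0], [3.0, 1.0]]
toy_metrics = per_target_metrics(toy_true, toy_pred)
```

A perfectly predicted target has $R^2 = 1$, while a model that always outputs the target's mean has $R^2 = 0$, which gives a useful baseline when reading per-parameter plots.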

In [None]:
experiment = runner.load_checkpoint("cnn1d_n1/b278dc3a0d3a416f81d5c44978fad2ea/best")
# experiment.load_checkpoint("best")

y_pred, y_true, loss = experiment.trainer.evaluate(dm.test_dataloader)

col_names = get_param_names(n_actions, ic=True)

metrics = compute_metrics(y_true, y_pred)
plot_metrics(metrics, target_names=col_names)
plot_parameter_comparison(y_true, y_pred, col_names)

INFO:preference_dynamics.training.trainer:Loading checkpoint from checkpoints/cnn1d_n1/b278dc3a0d3a416f81d5c44978fad2ea/best.pt


## Hyperparameter Optimization

Let's run hyperparameter studies on the learning rate, the number of convolutional and hidden layers, and the number of filters and neurons per layer.

For this, I subclass the `ExperimentRunner` and override the `suggest_parameters` method.

### Learning Rate

In [10]:
class Runner(ExperimentRunner):
    def suggest_parameters(self, trial):
        lr = trial.suggest_float("lr", 1e-4, 1e-1, log=True)
        self.trainer_config.learning_rate = lr


runner = Runner(
    runner_config=runner_config,
    data_config=data_config,
    model_config=model_config,
    trainer_config=trainer_config,
)

In [None]:
study = runner.run_study("lr_study", n_trials=10, n_jobs=1)

[I 2026-01-07 12:27:36,691] Using an existing study with name 'lr_study' instead of creating a new one.
INFO:preference_dynamics.experiments.runner:Run: Trial 5: lr=0.03767215131687004
INFO:preference_dynamics.training.trainer:Initialized Trainer with device=cuda, checkpoint_dir=checkpoints
INFO:preference_dynamics.data.manager:Loading raw data from data/n1/raw
INFO:preference_dynamics.data.manager:Saving 3 splits to data/n1/processed
INFO:preference_dynamics.training.trainer:Starting training for 200 epochs with run_id=5d5e1771edb7424aa481c9a449140cbe
Training:  24%|██▍       | 48/200 [01:19<04:12,  1.66s/it, epoch=48, train_loss=1.78, val_loss=1.78, epoch_time=1.86]INFO:preference_dynamics.training.trainer:Early stopping at epoch 48 (patience: 20)
Training:  24%|██▍       | 48/200 [01:19<04:10,  1.65s/it, epoch=48, train_loss=1.78, val_loss=1.78, epoch_time=1.86]
INFO:preference_dynamics.training.trainer:Loading checkpoint from checkpoints/cnn1d_n1/5d5e1771edb7424aa481c9a449140cbe/be

In [11]:
study = runner.load_study("lr_study")  # if not loaded already
plot_parallel_coordinate(study)

As the parallel coordinate plot shows, learning rates between 1e-3 and 1e-2 perform best. I choose the best-performing trial with lr=0.082 for fastest convergence.
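
The study draws the learning rate with `log=True`, which makes Optuna sample in the log domain so that each decade of the range is treated alike. Below is a minimal sketch of log-uniform sampling with the standard library. It assumes purely random draws, whereas adaptive samplers such as Optuna's TPE change their proposals after the first trials, so it only illustrates the scale. `sample_log_uniform` is a hypothetical helper.

```python
import math
import random


def sample_log_uniform(rng, low, high):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)  # fixed seed for reproducibility
draws = [sample_log_uniform(rng, 1e-4, 1e-1) for _ in range(3000)]

# Count draws per decade: [1e-4, 1e-3), [1e-3, 1e-2), [1e-2, 1e-1).
decade_counts = [
    sum(1 for x in draws if 10 ** e <= x < 10 ** (e + 1)) for e in (-4, -3, -2)
]
```

Each decade receives roughly a third of the draws. A linear-uniform draw over the same range would put about 90% of the samples above 1e-2.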

### CNN and MLP Layers

Next, I vary the number of convolutional and hidden layers.

In [12]:
trainer_config = TrainerConfig(
    loss_function="mse",
    learning_rate=0.082,
    num_epochs=200,
    early_stopping_patience=20,
)


class Runner(ExperimentRunner):
    def suggest_parameters(self, trial):
        n_kernels = trial.suggest_int("n_kernels", 1, 4)
        self.model_config.kernel_sizes = [3] * n_kernels
        self.model_config.filters = [64] * n_kernels
        n_hidden_layers = trial.suggest_int("n_hidden_layers", 1, 3)
        self.model_config.features = [32] * n_hidden_layers + [dm.n_targets]


runner = Runner(
    runner_config=runner_config,
    data_config=data_config,
    model_config=model_config,
    trainer_config=trainer_config,
)

In [None]:
study = runner.run_study("num_kernels_hidden_study", n_trials=12, n_jobs=1)

[I 2026-01-06 16:14:35,731] Using an existing study with name 'num_kernels_hidden_study' instead of creating a new one.
INFO:preference_dynamics.experiments.runner:Run: Trial 14: n_kernels=2, n_hidden_layers=1
INFO:preference_dynamics.training.trainer:Initialized Trainer with device=cuda, checkpoint_dir=checkpoints
INFO:preference_dynamics.data.manager:Loading raw data from data/n1/raw
INFO:preference_dynamics.data.manager:Saving 3 splits to data/n1/processed
INFO:preference_dynamics.training.trainer:Starting training for 100 epochs with run_id=ef5aacd2bf744924a269125396dc3acb
Training: 100%|██████████| 100/100 [03:59<00:00,  2.40s/it, epoch=99, train_loss=0.94, val_loss=0.939, epoch_time=2.67]
INFO:preference_dynamics.training.trainer:Loading checkpoint from checkpoints/cnn1d_n1/ef5aacd2bf744924a269125396dc3acb/best.pt
INFO:preference_dynamics.training.trainer:Test loss: 0.9037303820900295
[I 2026-01-06 16:18:49,499] Trial 14 finished with value: 0.9037303820900295 and parameters: {'n

The contour plot shows that a single convolutional layer performs worse. Apart from that, all combinations of layers perform similarly, which can be verified in the parallel coordinate plot.

Performance does not increase significantly beyond a parsimonious model with 2 convolutional layers and 1 hidden layer.

In [13]:
study = runner.load_study("num_kernels_hidden_study")
plot_contour(study)

### Number of Filters and Neurons

The next hyperparameter study investigates the optimal number of filters per convolutional layer and neurons per hidden layer. I use the optimal number of layers from the previous study.

The results:
- `filter_0=64-96` is generally best
- `filter_1=128` significantly improves performance
- `neurons=128` or `64` yields ambiguous results
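
With `step=32`, each `suggest_int` call in this study can only return 32, 64, 96, or 128, so the search space is a small grid. The sketch below enumerates it with the standard library. `int_grid` is a hypothetical helper, and the coverage figure assumes that all 12 trials of one `run_study` call land on distinct combinations.

```python
from itertools import product


def int_grid(low, high, step):
    """Values an integer suggestion with a step can take: low, low + step, ..., up to high."""
    return list(range(low, high + 1, step))


values = int_grid(32, 128, 32)
# Three independent hyperparameters: filter_0, filter_1, neurons.
search_space = list(product(values, repeat=3))
n_trials = 12
coverage = n_trials / len(search_space)
```

The grid has 64 combinations, so 12 trials cover at most about 19% of it. This is worth keeping in mind when reading trends off the contour plot.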

In [14]:
model_config = CNN1DConfig(
    model_name=model_name,
    in_channels=dm.n_inputs,
    filters=[64, 64],
    kernel_sizes=[3, 3],
    features=[32, dm.n_targets],
    dropout=0.3,
)


class Runner(ExperimentRunner):
    def suggest_parameters(self, trial):
        filters = [trial.suggest_int(f"filter_{i}", 32, 128, step=32) for i in range(2)]
        self.model_config.filters = filters
        neurons = trial.suggest_int("neurons", 32, 128, step=32)
        self.model_config.features = [neurons] + [dm.n_targets]


runner = Runner(
    runner_config=runner_config,
    data_config=data_config,
    model_config=model_config,
    trainer_config=trainer_config,
)

In [None]:
study = runner.run_study("num_filters_neurons_study", n_trials=12, n_jobs=1)

[I 2026-01-06 17:55:28,355] Using an existing study with name 'num_filters_neurons_study' instead of creating a new one.
INFO:preference_dynamics.experiments.runner:Run: Trial 11: filter_0=96, filter_1=96, neurons=96
INFO:preference_dynamics.training.trainer:Initialized Trainer with device=cuda, checkpoint_dir=checkpoints
INFO:preference_dynamics.data.manager:Loading raw data from data/n1/raw
INFO:preference_dynamics.data.manager:Saving 3 splits to data/n1/processed
INFO:preference_dynamics.training.trainer:Starting training for 100 epochs with run_id=2234dcff39e74d0d85ad2faa7046c6e6
Training: 100%|██████████| 100/100 [03:48<00:00,  2.28s/it, epoch=99, train_loss=0.927, val_loss=0.931, epoch_time=3.36]
INFO:preference_dynamics.training.trainer:Loading checkpoint from checkpoints/cnn1d_n1/2234dcff39e74d0d85ad2faa7046c6e6/best.pt
INFO:preference_dynamics.training.trainer:Test loss: 0.8987486848364705
[I 2026-01-06 17:59:28,057] Trial 11 finished with value: 0.8987486848364705 and paramet

In [15]:
study = runner.load_study("num_filters_neurons_study")
display(plot_contour(study))

The overall metrics for the best run show a significant improvement in model performance.
The per-parameter comparisons show less spread; however, the model still struggles to reproduce $g_0$.

The prediction of $m_0<0$ and $v_0<0$ is particularly challenging, since these values cannot be inferred directly from the time series, which is reflected in the results.

In [16]:
run_id = [
    t.user_attrs["mlflow_run_id"]
    for t in study.trials
    if t.params == {"filter_0": 64, "filter_1": 128, "neurons": 128}
][0]
checkpoint_path = assemble_checkpoint_path(["best"], model_name=model_name, run_id=run_id)

experiment = runner.load_checkpoint(checkpoint_path)

y_pred, y_true, loss = experiment.trainer.evaluate(dm.test_dataloader)

metrics = compute_metrics(y_true, y_pred)
plot_metrics(metrics, col_names)
plot_parameter_comparison(y_true, y_pred, col_names)

INFO:preference_dynamics.training.trainer:Initialized Trainer with device=cuda, checkpoint_dir=checkpoints
INFO:preference_dynamics.training.trainer:Loading checkpoint from checkpoints/cnn1d_n1/c1049714fb634f14870a1908fcc33531/best.pt


## Time Series Comparison

The real test, however, is whether the model can faithfully recreate the time series.
Let's look at some examples. s=0 is the time series for the true parameters, and s=1 is the time series for the predicted parameters.
The model somewhat captures the behavior, but does not faithfully reproduce the steady state.
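
One simple way to put a number on the steady-state mismatch is to compare the mean of the last few time points of both trajectories. The sketch below is only an illustration in plain Python: `tail_mean`, `steady_state_gap`, the 10% tail fraction, and the toy series are assumptions and not part of the `preference_dynamics` package.

```python
def tail_mean(series, fraction=0.1):
    """Mean of the last `fraction` of a time series (at least one point)."""
    n_tail = max(1, int(len(series) * fraction))
    tail = series[-n_tail:]
    return sum(tail) / len(tail)


def steady_state_gap(true_series, pred_series, fraction=0.1):
    """Absolute difference between the tail means of two trajectories."""
    return abs(tail_mean(true_series, fraction) - tail_mean(pred_series, fraction))


# Toy trajectories that agree early on but settle at different levels.
true_series = [0.0] * 10 + [2.0] * 10
pred_series = [0.0] * 10 + [1.5] * 10
gap = steady_state_gap(true_series, pred_series)
```

This measure is only meaningful if the trajectory has actually settled within the simulated time span. For oscillating solutions the tail mean averages over part of a cycle and should be read with care.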


In [17]:
# Visual comparison of true and predicted time series

solver_config = SolverConfig(
    time_span=(0.0, 200.0),
    n_time_points=401,
)
n_params = num_params(n_actions)
n_vars = num_vars(n_actions)

for i in range(5):
    samples = []
    try:
        for name, y in {"true": y_true.cpu().numpy()[i], "pred": y_pred.cpu().numpy()[i]}.items():
            config_ode = ODEConfig(
                parameters=ParameterVector(values=y[:n_params]),
                initial_conditions=ICVector(values=y[n_params : n_params + n_vars]),
            )
            config = ODESolverConfig(ode=config_ode.model_dump(), solver=solver_config)

            sample = solve_ODE(config)
            samples.append(sample)
        plot_time_series(samples)
    except ValueError:
        continue

## Summary

This notebook trains a simple 1d CNN model for the case of n=1 action and performs hyperparameter studies to identify the best learning rate (0.001<lr<0.01), the number of convolutional layers (=2) and hidden layers (=1), and the number of filters (64, 128) and neurons per layer (128).

**Future extensions:**
- Additional hyperparameter studies on the optimizer, activation functions, and dropout rate

**Next steps** (see `41_training_cnn_n1_residual.ipynb`):
- Use a new model architecture to improve performance, include features, and additionally forecast the last time point