# `probly` Tutorial: Ensemble Transformation

This notebook is a practical introduction to the Ensemble transformation in `probly`. Deep Ensembles are one of the most robust and high-performing methods for uncertainty quantification.

We will start by explaining the core idea behind Deep Ensembles and show how `probly`'s transformation lets you create them. We will then walk through a PyTorch example that trains the ensemble and uses the disagreement between its members to estimate predictive uncertainty.



---
# Part A: Introduction to Ensembles and the Ensemble Transformation
---


## 1. Concept: What is a Deep Ensemble?
### 1.1 The Core Idea: Wisdom of the Crowd

The idea behind an ensemble is simple and powerful: instead of relying on the prediction of a single model, we train multiple models independently
and aggregate their predictions. The core principle is that if we have a diverse set of "experts" (the models), their collective judgment will be 
more robust and reliable than any single expert's.

### 1.2 Deep Ensembles for Uncertainty

In Deep Learning, a **Deep Ensemble** consists of multiple neural networks. To create a diverse ensemble, each network is 
trained from a different random initialization.

When we give the same input to every model in the ensemble, we will get a set of different predictions.

- The mean of these predictions gives us a robust final prediction.
- The variance (or disagreement) among these predictions gives us a direct measure of the model's uncertainty.

If all models agree, uncertainty is low. If they disagree significantly, uncertainty is high.
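
To make the two quantities concrete, here is a toy calculation using only the Python standard library. The member outputs are made-up numbers chosen for illustration (they do not come from a trained model), and `summarize` is a hypothetical helper name.

```python
from statistics import mean, pstdev

# Hypothetical outputs of five ensemble members for one input (made-up numbers).
members_agree = [2.0, 2.1, 1.9, 2.0, 2.0]
members_disagree = [0.5, 3.5, 2.0, 1.0, 3.0]


def summarize(member_predictions):
    """Return (ensemble prediction, disagreement) for a single input."""
    return mean(member_predictions), pstdev(member_predictions)


mean_a, spread_a = summarize(members_agree)
mean_b, spread_b = summarize(members_disagree)

# Both sets average to roughly the same value, but the spread differs a lot.
print(f"members agree:    mean={mean_a:.2f}, std={spread_a:.2f}")
print(f"members disagree: mean={mean_b:.2f}, std={spread_b:.2f}")
```

Both sets of predictions average to about the same value, so the mean alone cannot tell them apart. Only the spread reveals that the second input is one the members disagree on.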

While very effective, creating and managing deep ensembles manually can be cumbersome.

### 1.3 The Ensemble Transformation

The Ensemble transformation in `probly` automates the creation and management of a deep ensemble.

The transformation does the following:

- It takes a single, user-defined base model as a template.
- It creates a specified number of deep copies of this model.
- Crucially, it re-initializes the parameters of each copy, so every model in the ensemble starts from a different random state.
- It packages all these independent models into a single `torch.nn.ModuleList`.

This gives you a single object that holds every ensemble member, which makes it convenient to train and query them together.
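
The steps listed above can be sketched in plain PyTorch. This is an illustrative stand-in, not `probly`'s actual implementation: the function name `make_ensemble_sketch` is hypothetical, and it assumes that calling `reset_parameters()` on every layer that defines it is an acceptable way to re-initialize a copy.

```python
import copy

import torch
from torch import nn


def make_ensemble_sketch(base: nn.Module, num_members: int) -> nn.ModuleList:
    """Illustrative only: copy a template model and re-initialize each copy."""
    members = []
    for _ in range(num_members):
        member = copy.deepcopy(base)  # independent copy of the template
        for layer in member.modules():
            # Layers such as nn.Linear define reset_parameters(); others are skipped.
            if hasattr(layer, "reset_parameters"):
                layer.reset_parameters()
        members.append(member)
    return nn.ModuleList(members)


base = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
members = make_ensemble_sketch(base, num_members=3)

# The first-layer weights of two members should differ after re-initialization.
weights_differ = not torch.equal(members[0][0].weight, members[1][0].weight)
print(len(members), weights_differ)
```

Without the re-initialization step, every copy would start from identical weights, and members trained on the same data in the same order would give little or no disagreement to measure.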

### 1.4 What That Entails

| Aspect                         | Ensemble Transformation in `probly`                             |
|--------------------------------|-----------------------------------------------------------------|
| **Main Idea**                  | "Wisdom of the crowd"                                           |
| **Stochastic Element**         | Disagreement between multiple independent models.               |
| **Architectural Change**       | Creates an `nn.ModuleList` of cloned and re-initialized models. |
| **Uncertainty Interpretation** | A strong and robust measure of model uncertainty.               |
| **Training Cost**              | High (training N models instead of one).                        |


```python
import torch
from torch import nn

from probly.transformation import ensemble


def build_mlp(in_dim: int = 10, hidden: int = 32, out_dim: int = 1) -> nn.Sequential:
    return nn.Sequential(
        nn.Linear(in_dim, hidden),
        nn.ReLU(),
        nn.Linear(hidden, out_dim),
    )


model = build_mlp()
print("Original model:\n", model)

# Apply the Ensemble transformation
num_models = 5
ensemble_model = ensemble(model, num_members=num_models)
print(f"\nEnsemble with {num_models} members:\n", ensemble_model)
```

```
Original model:
 Sequential(
  (0): Linear(in_features=10, out_features=32, bias=True)
  (1): ReLU()
  (2): Linear(in_features=32, out_features=1, bias=True)
)

Ensemble with 5 members:
 ModuleList(
  (0-4): 5 x Sequential(
    (0): Linear(in_features=10, out_features=32, bias=True)
    (1): ReLU()
    (2): Linear(in_features=32, out_features=1, bias=True)
  )
)
```
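
Since the printed ensemble is an `nn.ModuleList`, which has no `forward` method of its own, one way to query it is to loop over the members and stack their outputs. The sketch below is a minimal example of that pattern. To stay self-contained it does not call `probly`; instead it builds a stand-in list of five freshly initialized MLPs (`ensemble_standin` is a hypothetical name), and the input batch is random data.

```python
import torch
from torch import nn


def build_mlp(in_dim: int = 10, hidden: int = 32, out_dim: int = 1) -> nn.Sequential:
    return nn.Sequential(
        nn.Linear(in_dim, hidden),
        nn.ReLU(),
        nn.Linear(hidden, out_dim),
    )


# Stand-in for `ensemble_model`: five independently initialized MLPs.
ensemble_standin = nn.ModuleList([build_mlp() for _ in range(5)])

x = torch.randn(4, 10)  # a batch of 4 random inputs with 10 features each

with torch.no_grad():
    # One forward pass per member, stacked to shape (members, batch, out_dim).
    preds = torch.stack([member(x) for member in ensemble_standin])

mean_pred = preds.mean(dim=0)   # ensemble prediction per input
uncertainty = preds.std(dim=0)  # disagreement per input (sample std across members)

print(preds.shape, mean_pred.shape, uncertainty.shape)
```

The same loop should work on `ensemble_model` from the cell above, assuming it behaves like the `ModuleList` shown in the output. The members here are untrained, so the numbers only illustrate the shapes involved.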
