# `probly` Tutorial: Bayesian Transformation

This notebook is a practical introduction to the Bayesian transformation in `probly`. Bayesian Neural Networks are a more advanced topic than Dropout or DropConnect,
so this tutorial aims to provide an intuitive, hands-on understanding.

We will start by explaining the core idea behind Bayesian Neural Networks (BNNs) and then see how the `probly` transformation enables you to create them. After that, we will look at a PyTorch example to inspect the transformed model and use it to estimate uncertainty.

---
# Part A: Introduction to BNNs and the Bayesian Transformation
---


## 1. Concept: What is a Bayesian Neural Network?

To understand the Bayesian transformation, we first need to understand the difference between a standard neural network and a Bayesian one.

### 1.1 Standard Neural Networks

In a standard neural network, each weight is a single, deterministic number. After training, these weights are fixed. 
When you pass an input through the model, it follows one exact path, producing one exact output.
The model has no inherent way to express how "sure" it is about the values of its weights.

### 1.2 Bayesian Neural Networks (BNNs)

In a Bayesian Neural Network, we replace the deterministic weights with probability distributions.
Instead of a weight being a single number, it might be represented by a Gaussian (normal) distribution 
with a mean and a standard deviation.

- The mean represents the most likely value for that weight.

- The standard deviation represents the model's uncertainty about that weight. A small standard deviation means the model
 is very confident in the weight's value, while a large one means it is very unsure.

During a forward pass, we don't use the mean value directly. Instead, we sample a value for each weight from its distribution.
Because we get a slightly different set of weights every time, each forward pass on the same input will produce a slightly different
 output. This natural variation is a direct reflection of the model's parameter uncertainty.
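
To make the contrast concrete, here is a minimal sketch of a single "neuron" with one weight, written in plain Python. The function names and the numbers (`weight_mean=0.8`, `weight_std=0.2`) are made up for illustration and are not part of `probly`.

```python
import random


def deterministic_forward(x, weight=0.8, bias=0.1):
    # Standard neuron: the weight is one fixed number.
    return weight * x + bias


def bayesian_forward(x, rng, weight_mean=0.8, weight_std=0.2, bias=0.1):
    # Bayesian neuron (illustrative): draw a fresh weight from
    # N(weight_mean, weight_std^2) on every call.
    weight = rng.gauss(weight_mean, weight_std)
    return weight * x + bias


rng = random.Random(0)  # seeded so the run is repeatable
x = 2.0

fixed_outputs = [deterministic_forward(x) for _ in range(3)]
sampled_outputs = [bayesian_forward(x, rng) for _ in range(3)]

print(fixed_outputs)    # three identical values
print(sampled_outputs)  # three different values scattered around 1.7
```

The spread of `sampled_outputs` is controlled entirely by `weight_std`: setting it to `0.0` would recover the deterministic neuron.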

### 1.3 The Bayesian Transformation (`probly`)

The Bayesian transformation in `probly` automates the process of converting a standard network into a BNN.

The transformation does the following:

- It walks through your PyTorch model and finds all compatible layers (e.g., `nn.Linear` and `nn.Conv2d`).
- It programmatically replaces each standard layer with a corresponding custom Bayesian layer (e.g., `BayesLinear`, `BayesConv2d`).
- These new layers contain weight distributions instead of single values and are inherently stochastic, even during inference.
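
The replacement described above can be sketched in a few lines of PyTorch. This is not `probly`'s implementation: `ToyBayesLinear` and `replace_linear` are hypothetical names invented for this example, the softplus parameterization of the standard deviation is an assumption, and only the weights (not the bias) are made stochastic to keep the sketch short.

```python
import math

import torch
import torch.nn.functional as F
from torch import nn


class ToyBayesLinear(nn.Module):
    """Illustrative stand-in for a Bayesian linear layer (not probly's BayesLinear)."""

    def __init__(self, base: nn.Linear, posterior_std: float = 0.05):
        super().__init__()
        # The mean of each weight distribution starts at the original weight.
        self.weight_mu = nn.Parameter(base.weight.detach().clone())
        # Unconstrained parameter; softplus(rho) gives a positive std.
        rho = math.log(math.expm1(posterior_std))
        self.weight_rho = nn.Parameter(torch.full_like(base.weight, rho))
        # Bias is kept deterministic in this sketch.
        self.bias = None if base.bias is None else nn.Parameter(base.bias.detach().clone())

    def forward(self, x):
        std = F.softplus(self.weight_rho)
        # Sample a fresh weight matrix on every call, in train and eval mode alike.
        weight = self.weight_mu + std * torch.randn_like(std)
        return F.linear(x, weight, self.bias)


def replace_linear(module: nn.Module) -> nn.Module:
    # Walk the module tree and swap every nn.Linear for the toy Bayesian layer.
    for name, child in list(module.named_children()):
        if isinstance(child, nn.Linear):
            setattr(module, name, ToyBayesLinear(child))
        else:
            replace_linear(child)
    return module


torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 2))
replace_linear(model)
print(model)  # both nn.Linear entries are now ToyBayesLinear

x = torch.randn(1, 4)
model.eval()
print(torch.equal(model(x), model(x)))  # expected False: weights are resampled on each call
```

Because the swapped-in layers sample their weights on every call, two forward passes on the same input generally differ, even after `model.eval()`.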

This allows us to get a distribution of predictions by running multiple forward passes, which we can then use to quantify the model's uncertainty.
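
As a rough sketch of that last step, the plain-Python example below repeats a stochastic forward pass many times and summarizes the predictions by their mean and standard deviation. The one-weight model and the helper names `stochastic_predict` and `mc_predict` are hypothetical and chosen only for illustration; they are not `probly` functions.

```python
import random
import statistics


def stochastic_predict(x, rng, weight_mean, weight_std):
    # One forward pass of a one-weight model with a freshly sampled weight.
    return rng.gauss(weight_mean, weight_std) * x


def mc_predict(x, weight_mean, weight_std, num_samples=1000, seed=0):
    # Repeat the stochastic pass and summarize the resulting predictions.
    rng = random.Random(seed)
    preds = [stochastic_predict(x, rng, weight_mean, weight_std) for _ in range(num_samples)]
    return statistics.mean(preds), statistics.stdev(preds)


# Same mean weight, but different amounts of uncertainty about it.
confident_mean, confident_std = mc_predict(x=2.0, weight_mean=0.8, weight_std=0.05)
unsure_mean, unsure_std = mc_predict(x=2.0, weight_mean=0.8, weight_std=0.5)

print(f"confident weights: mean={confident_mean:.2f}, std={confident_std:.2f}")
print(f"unsure weights:    mean={unsure_mean:.2f}, std={unsure_std:.2f}")
```

Both settings predict roughly the same mean, but the spread of the predictions grows with the weight uncertainty. That spread is the quantity we read as the model's uncertainty about a given input.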


### 1.4 What that entails

| Aspect                     | Bayesian Transformation (`probly`)                                             |
|----------------------------|--------------------------------------------------------------------------------|
| **Main Idea**              | "Weights are distributions"                                                    |
| Stochastic Element         | Weights are sampled from probability distributions.                            |
| Architectural Change       | Replaces `nn.Linear` and `nn.Conv2d` with `BayesLinear`/`BayesConv2d` layers.  |
| Uncertainty Interpretation | A principled, direct measure of the model's parameter uncertainty.             |
| Supported Layers           | `nn.Linear` and `nn.Conv2d`                                                    |
| Key Parameters             | `prior_mean`, `prior_std`, `posterior_std`                                     |

## 2. Quickstart (PyTorch)

Below, we build a small MLP, apply `bayesian(model)`, and inspect the modified architecture to see the layer replacement.
