# Posterior and Rate Function Demo

This notebook illustrates the **rate function as a regularization term**.

- The posterior is given by
  \[
  p_n(\vartheta) \;\propto\; e^{-a_n \ell_\vartheta}\,p_{\text{prior}},
  \]
  and the Large Deviation Principle (Proposition 4.1) states that the posterior concentrates
  with rate function
  \[
  \ell_\vartheta + I_{F^{(L+1)}} - \inf(\ell_\vartheta + I_{F^{(L+1)}}).
  \]

- We simulate the **rate function** \(I_{F^{(L+1)}}\) for a simple single-input network
  (linear and ReLU activations), using the helper functions from `ld_rate_single_input.py`.

- We then show how this rate function acts as a **regularizer** added to the squared loss.
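
To make the normalized rate function \(\ell_\vartheta + I_{F^{(L+1)}} - \inf(\ell_\vartheta + I_{F^{(L+1)}})\) concrete, here is a minimal one-dimensional sketch. The loss and penalty below are hypothetical stand-ins (a quadratic loss and an absolute-value penalty on a grid), not the functions computed by the helper module; the point is only to show the subtraction of the infimum and where the normalized rate vanishes.

```python
import numpy as np

# One-dimensional grid standing in for the quantity the posterior lives on.
grid = np.linspace(-3.0, 3.0, 601)

# Hypothetical stand-ins (NOT the notebook's functions):
target = 2.0
loss = 0.5 * (grid - target) ** 2   # plays the role of the loss term
penalty = np.abs(grid)              # plays the role of the rate function I_F

# Unnormalized rate, then subtract the infimum so the minimum is exactly 0.
rate = loss + penalty
rate_normalized = rate - rate.min()

# The posterior concentrates where the normalized rate is zero.
concentration_point = grid[np.argmin(rate_normalized)]
print(concentration_point, rate.min())
```

In this toy setting the loss alone is minimized at the target 2.0, while the penalized rate is minimized near 1.0, which is the shrinkage effect one expects from a regularizer.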


In [None]:
import numpy as np
from ld_rate_single_input import compute_IK, compute_IF_from_IK, make_IF_regularizer


In [3]:
# Compute IK and IF for Linear activation (L=2, K1=1.0)
k_grid, IK_vals = compute_IK(L=2, K1=1.0, activation="linear", a=0.5, k_grid_max=4.0, k_grid_size=200)

# Evaluate IF(y) for a range of norms
y_norms = np.linspace(0, 3.0, 11)
IF_vals = [compute_IF_from_IK(y, k_grid, IK_vals) for y in y_norms]

list(zip(y_norms, IF_vals))


[(np.float64(0.0), 0.0003167863292496964),
 (np.float64(0.3), 0.14198631274515439),
 (np.float64(0.6), 0.41102680957243387),
 (np.float64(0.8999999999999999), 0.7069472329383795),
 (np.float64(1.2), 1.0066224992917594),
 (np.float64(1.5), 1.3034545946889153),
 (np.float64(1.7999999999999998), 1.5951524375683666),
 (np.float64(2.1), 1.881302904380866),
 (np.float64(2.4), 2.1616897244710094),
 (np.float64(2.6999999999999997), 2.436587187344768),
 (np.float64(3.0), 2.7062405710970308)]

## Using IF as a Regularizer

We now construct a callable `IF_reg(y)` that evaluates the rate function penalty for any output vector `y`.

This corresponds to adding \(I_F\) to the empirical loss, mirroring the sum \(\ell_\vartheta + I_{F^{(L+1)}}\) in the posterior rate function.
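
As a rough illustration of what such a callable could look like, the sketch below builds a penalty from the tabulated values printed earlier, assuming that \(I_F\) depends on `y` only through its Euclidean norm and that linear interpolation between grid points is acceptable. The helper `make_interp_regularizer` is hypothetical and is not the implementation of `make_IF_regularizer`.

```python
import numpy as np

# Tabulated (norm, I_F) pairs copied from the output above, rounded to 6 decimals.
y_norms = np.linspace(0.0, 3.0, 11)
IF_vals = np.array([
    0.000317, 0.141986, 0.411027, 0.706947, 1.006622, 1.303455,
    1.595152, 1.881303, 2.161690, 2.436587, 2.706241,
])

def make_interp_regularizer(norm_grid, values):
    """Hypothetical helper: return reg(y) that interpolates I_F at ||y||."""
    def reg(y):
        r = np.linalg.norm(y)
        # np.interp clamps to the end values outside the grid,
        # so norms above norm_grid[-1] are not extrapolated.
        return float(np.interp(r, norm_grid, values))
    return reg

reg = make_interp_regularizer(y_norms, IF_vals)
penalty = reg(np.array([1.2, -0.3, 0.5]))
print(penalty)
```

With only 11 grid points the interpolated value is coarse; a denser grid of norms would bring it closer to a direct evaluation.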


In [5]:
# Build IF regularizer
IF_reg = make_IF_regularizer(L=2, K1=1.0, activation="linear", a=0.5, k_grid_max=4.0, k_grid_size=200)

# Toy example: output y and target t
y = np.array([1.2, -0.3, 0.5])
t = np.array([1.0, 0.0, 0.0])

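# Squared-error loss between output and target
sq_loss = 0.5 * np.sum((y - t)**2)
# Rate-function penalty evaluated at the output vector
IF_pen = IF_reg(y)
# Penalty weight, chosen here for illustration
gamma = 0.1
total = sq_loss + gamma * IF_pen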

sq_loss, IF_pen, total


(np.float64(0.19), 1.1398715100606056, np.float64(0.30398715100606055))
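
To see how the weight `gamma` trades off the two terms, the short sketch below reuses the two numbers from the output above and recomputes the total for a few weights. The particular weights are arbitrary choices for illustration.

```python
# Values copied from the output above.
sq_loss = 0.19
IF_pen = 1.1398715100606056

# Total objective for a few illustrative penalty weights.
totals = {}
for gamma in (0.0, 0.1, 1.0):
    totals[gamma] = sq_loss + gamma * IF_pen
    print(f"gamma={gamma:<4} total={totals[gamma]:.6f}")
```

With `gamma = 0.0` the total reduces to the squared loss alone, and with `gamma = 1.0` the penalty enters with unit weight, as \(I_{F^{(L+1)}}\) does in the posterior rate function.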