# 02 · Minimal Active Inference on a Ring World

This notebook demonstrates a compact, from-scratch **active inference** agent on a discrete ring world.

**Key equations**
- **Posterior (filter):** $q_t(s) \propto A[o_t,s]\,(B[a]\,q_{t-1})[s]$
- **EFE (risk + ambiguity):** $G = D_{\mathrm{KL}}(Q(o)\Vert P(o)) + \mathbb{E}_{Q(s')}[H(P(o\mid s'))]$
- **EFE (cost − information gain):** $G = \mathbb{E}_{Q(o)}[-\ln P(o)] - I(S';O)$

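We compute both decompositions while rolling the agent forward and compare them numerically. They agree exactly (up to floating-point error) when both use the normalized $\ln P(o)$; using the unnormalized $C$ in the cost term shifts $G$ by the constant $\ln\sum_o e^{C_o}$.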
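
The identity behind the two decompositions can be checked in isolation. The sketch below is a standalone NumPy illustration, not the `persystems` implementation: the helper name `efe_decompositions`, the random test model, and the small `eps` guard inside the logarithms are all assumptions made for this example.

```python
import numpy as np

def efe_decompositions(q_next, A, C, eps=1e-16):
    """Return (risk, ambiguity, expected_cost, info_gain) for a predicted belief Q(s')."""
    # ln P(o) = ln softmax(C), computed stably
    log_p = C - C.max() - np.log(np.sum(np.exp(C - C.max())))
    q_o = A @ q_next                                   # Q(o) = sum_s' A[o, s'] Q(s')
    risk = np.sum(q_o * (np.log(q_o + eps) - log_p))   # KL(Q(o) || P(o))
    H_cond = -np.sum(A * np.log(A + eps), axis=0)      # H(P(o | s')) for each s'
    ambiguity = H_cond @ q_next                        # E_{Q(s')}[H(P(o | s'))]
    expected_cost = -np.sum(q_o * log_p)               # E_{Q(o)}[-ln P(o)]
    H_o = -np.sum(q_o * np.log(q_o + eps))             # H(Q(o))
    info_gain = H_o - ambiguity                        # I(S'; O)
    return risk, ambiguity, expected_cost, info_gain

# Hypothetical random model, used only to exercise the identity.
rng = np.random.default_rng(0)
A = rng.random((5, 5))
A /= A.sum(axis=0, keepdims=True)                      # columns are P(o | s')
q_next = rng.random(5)
q_next /= q_next.sum()
C = rng.normal(size=5)

risk, ambiguity, expected_cost, info_gain = efe_decompositions(q_next, A, C)
print("risk + ambiguity      :", risk + ambiguity)
print("exp. cost - info gain :", expected_cost - info_gain)
```

Both printed values should coincide, because $-H(Q(o)) + H(O\mid S') = -I(S';O)$.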

In [None]:
import numpy as np
import pandas as pd
from persystems.gm import GenerativeModel, softmax
from persystems.inference import bayes_predict, bayes_update, belief_entropy
from persystems.planning import choose_action
from persystems.viz import plot_efe_trace, plot_cost_info, plot_entropy, plot_final_posterior
np.set_printoptions(precision=3, suppress=True)
np.random.seed(7)

## Build the generative model $(A,B,C)$
- $A[o,s]=P(o\mid s)$ (likelihood)
- $B[a][s',s]=P(s'\mid s,a)$ (per-action transition)
- $C$ holds log-preferences over observations; $P(o)=\operatorname{softmax}(C)$
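
To make the shapes and index conventions concrete, here is one way such a triple could be built by hand. This is a hypothetical stand-in, not the code behind `GenerativeModel.make_ring_world`: the action set `(-1, 0, 1)`, the uniform spread of observation noise, and the preference strength `3.0` are assumptions chosen for illustration.

```python
import numpy as np

def make_ring_world_sketch(N=5, A_eps=0.15, target_idx=3, actions=(-1, 0, 1)):
    """Hypothetical (A, B, C) for a ring of N states; not the persystems constructor."""
    # A[o, s]: observe the true state w.p. 1 - A_eps, otherwise one of the others uniformly.
    A = np.full((N, N), A_eps / (N - 1))
    np.fill_diagonal(A, 1.0 - A_eps)
    # B[a][s', s]: deterministic shift by the action offset, wrapping around the ring.
    B = []
    for a in actions:
        Ba = np.zeros((N, N))
        for s in range(N):
            Ba[(s + a) % N, s] = 1.0
        B.append(Ba)
    # C: log-preferences, peaked at the target observation (magnitude is an assumption).
    C = np.zeros(N)
    C[target_idx] = 3.0
    return A, B, C

A_demo, B_demo, C_demo = make_ring_world_sketch()
p_o = np.exp(C_demo - C_demo.max())
p_o /= p_o.sum()                                       # P(o) = softmax(C)
print("A columns sum to 1:", np.allclose(A_demo.sum(axis=0), 1.0))
print("preferred outcome index:", int(p_o.argmax()))
```

Each column of `A` and of every `B[a]` is a probability distribution, which matches the $A[o,s]$ and $B[a][s',s]$ conventions listed above.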

In [None]:
N = 5
target_idx = 3
gm = GenerativeModel.make_ring_world(N=N, A_eps=0.15, target_idx=target_idx)
print('Actions:', gm.actions)
print('Preferred outcomes P(o):', softmax(gm.C))
print('A shape', gm.A.shape, 'B len', len(gm.B), 'C shape', gm.C.shape)

## Roll-out with horizon $H=1$ (myopic EFE)
At each step we:
1. Evaluate $G$ for each candidate action.
2. Take the action with minimal $G$ and let the hidden state transition to $s_t$.
3. Observe $o_t\sim P(o\mid s_t)$.
4. Update $q(s_t\mid o_{1:t})$ by Bayes (filter): predict through $B[a]$, then condition on $o_t$.
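
The filtering step in item 4 can be sketched in a few lines of NumPy. The functions `predict` and `update` below are illustrative stand-ins written for this example; they are assumed to mirror the role of `bayes_predict` and `bayes_update`, whose actual implementations are not shown here.

```python
import numpy as np

def predict(q, B_a):
    """Push the belief through the transition for action a: q_pred = B[a] q."""
    return B_a @ q

def update(q_pred, A, o):
    """Condition on observation o: q(s) proportional to A[o, s] * q_pred(s)."""
    post = A[o, :] * q_pred
    return post / post.sum()

N = 5
A_toy = np.full((N, N), 0.15 / (N - 1))                # hypothetical noisy-identity likelihood
np.fill_diagonal(A_toy, 0.85)
B_right = np.roll(np.eye(N), 1, axis=0)                # B[s', s] = 1 when s' = (s + 1) % N

q0 = np.ones(N) / N                                    # uniform prior
q1 = update(predict(q0, B_right), A_toy, o=2)          # one filter step after observing o = 2
q2_pred = predict(q1, B_right)                         # prediction for the next step
print(q1.round(3), q2_pred.round(3))
```

Starting from a uniform belief, a single observation concentrates the posterior on the observed state, and the next prediction shifts that mass one step around the ring.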

In [None]:
T = 30
qs = np.ones(N)/N                    # initial belief over states
true_s = np.random.randint(0, N)    # hidden true state

hist = {k: [] for k in ['t','true_s','obs','a','G','risk','ambiguity','exp_cost','info_gain','Hq']}
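for t in range(T):
    # plan: evaluate the EFE of each action under the current belief and select one
    a_idx, comp, Gs = choose_action(qs, gm.A, gm.B, gm.C, horizon=1)
    a = gm.actions[a_idx]
    # environment: move the hidden state around the ring, then emit a noisy observation
    true_s = (true_s + a) % N
    o = int(np.random.choice(np.arange(N), p=gm.A[:, true_s]))
    # filter: predict through B[a], then condition on the observation
    q_pred = bayes_predict(qs, gm.B[a_idx])
    qs = bayes_update(q_pred, gm.A, o)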
    # log
    hist['t'].append(t)
    hist['true_s'].append(true_s)
    hist['obs'].append(o)
    hist['a'].append(a)
    hist['G'].append(float(Gs[a_idx]))
    hist['risk'].append(float(comp['risk']))
    hist['ambiguity'].append(float(comp['ambiguity']))
    hist['exp_cost'].append(float(comp['expected_cost']))
    hist['info_gain'].append(float(comp['info_gain']))
    hist['Hq'].append(float(belief_entropy(qs)))

for k in hist: hist[k] = np.array(hist[k])
df = pd.DataFrame(hist)
df.head(10)

### Plots: two EFE decompositions + belief entropy and final posterior

In [None]:
plot_efe_trace(hist['t'], hist['G'], hist['risk'], hist['ambiguity'])
plot_cost_info(hist['t'], hist['exp_cost'], hist['info_gain'])
plot_entropy(hist['t'], hist['Hq'])
plot_final_posterior(qs)

## Depth-$H$ planning (tiny lookahead)
Exhaustive enumeration over action and observation branches scales as $(|\mathcal{A}|\,|\mathcal{O}|)^H$, so keep $H$ small in discrete demos.
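
A quick count shows how fast the tree grows. The snippet assumes 3 actions (the actual number depends on `gm.actions`) and the 5 observations of the ring world above; `n_branches` is a hypothetical helper defined only for this illustration.

```python
def n_branches(n_actions, n_obs, horizon):
    """Number of action/observation paths in a full depth-`horizon` tree."""
    return (n_actions * n_obs) ** horizon

counts = {H: n_branches(3, 5, H) for H in (1, 2, 3, 4)}
for H, n in counts.items():
    print(f"H={H}: {n} paths")
# H=1: 15 paths
# H=2: 225 paths
# H=3: 3375 paths
# H=4: 50625 paths
```

Even in this tiny world, each extra step of lookahead multiplies the work by 15.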

In [None]:
for H in [1, 2]:
    qs0 = np.ones(N)/N
    a_idx, comp, Gs = choose_action(qs0, gm.A, gm.B, gm.C, horizon=H)
    print(f"H={H} → best action index {a_idx}, EFE per action {np.round(Gs, 4)}")

### Notes
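- The **risk** term pulls the predicted outcomes $Q(o)$ toward the preferred distribution $P(o)$.
- The **ambiguity** term (first decomposition) and the **information gain** term (second decomposition) drive epistemic behavior: minimizing $G$ favors actions whose outcomes are unambiguous and informative about the hidden state.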
- Depth-$H$ planning quickly becomes the computational bottleneck; pruning and amortization help (see future notebooks).