# Intelligent Agents and Active Inference

### Preliminaries

- Goal 
  - Introduction to Active Inference and its application to the design of synthetic intelligent agents
- Materials        
  - Mandatory
    - These lecture notes
    - Karl Friston - 2016 - [The Free Energy Principle](https://www.youtube.com/watch?v=NIu_dJGyIQI) (video)
  - Optional
  - References

- In the previous lessons we assumed that a data set was given. 
- In this lesson we consider _agents_. An agent is a system that _interacts_ with its environment through both sensors and actuators.
- Crucially, by acting on the environment, the agent is able to affect the data that it will sense in the future.
  - As an example, by changing the direction in which I look, I can affect the sensory data that my retina will sense.
- With this definition of an agent, (biological) organisms are agents, and so are robots, self-driving cars, etc.
- In this lesson, we will first describe how biological agents ...

Friston at CCN-2016 (two fragments):

https://youtu.be/b1hEc6vay_k?start=254&end=475

https://youtu.be/b1hEc6vay_k?t=4505&end=4860





Again, we consider the free energy functional:


$$\begin{align}
F[q] &= \underbrace{-\sum_z q(z) \log p(x,z)}_{\text{energy}} - \underbrace{\sum_z q(z) \log \frac{1}{q(z)}}_{\text{entropy}} \tag{EE}\\
  &= \underbrace{\sum_z q(z) \log \frac{q(z)}{p(z|x)}}_{\text{divergence}} - \underbrace{\log p(x)}_{\text{log-evidence}}  \tag{DE}\\
  &=  \underbrace{\sum_z q(z)\log\frac{q(z)}{p(z)}}_{\text{complexity}} - \underbrace{\sum_z q(z) \log p(x|z)}_{\text{accuracy}}  \tag{IC}
\end{align}$$
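
The three decompositions are algebraically identical, which can be checked numerically. The following is a minimal sketch on a toy discrete model; the probability tables and variable names are made up for illustration and do not come from these notes.

```julia
# Toy discrete model (hypothetical numbers): latent z ∈ {1,2,3}, one fixed observation x.
p_z   = [0.5, 0.3, 0.2]          # prior p(z)
p_x_z = [0.9, 0.4, 0.1]          # likelihood p(x|z), evaluated at the observed x
q     = [0.6, 0.3, 0.1]          # an arbitrary recognition distribution q(z)

p_xz  = p_x_z .* p_z             # joint p(x,z) at the observed x
p_x   = sum(p_xz)                # evidence p(x)
p_z_x = p_xz ./ p_x              # exact posterior p(z|x)

# (EE) energy minus entropy
F_EE = -sum(q .* log.(p_xz)) - sum(q .* log.(1 ./ q))
# (DE) divergence minus log-evidence
F_DE = sum(q .* log.(q ./ p_z_x)) - log(p_x)
# (IC) complexity minus accuracy
F_IC = sum(q .* log.(q ./ p_z)) - sum(q .* log.(p_x_z))

# Free energy evaluated at the exact posterior q(z) = p(z|x)
F_opt = sum(p_z_x .* log.(p_z_x ./ p_xz))
```

In this sketch, `F_EE`, `F_DE` and `F_IC` agree up to floating-point error, and `F_opt` equals `-log(p_x)`, in line with (DE): the divergence term vanishes when $q(z) = p(z|x)$.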
 

    

An agent comprises
1. a generative model $p(x|z) p(z)$, where $z = \{ s, u, \theta\}$,
2. a recognition model $q(z)$,
3. a recipe to minimize the free energy (FE) $F[q]$.

### Model specification

- We first consider the following generative model for the agent (omitting parameters $\theta$)
$$\begin{align*}
p^\prime(x,s,u) &= p(s_{t-1}) \prod_{k=t}^{t+T} \underbrace{p(x_k|s_k) \cdot p(s_k | s_{k-1}, u_k)}_{\text{internal dynamics}} \cdot\underbrace{p(u_k)}_{\substack{\text{control prior}}}
\end{align*}$$

- In order to infer _goal-driven_ behavior, we now add prior beliefs $p^+(x)$ about future outcomes, leading to an extended agent model:
$$\begin{align*}
p(x,s,u) &= \frac{p^\prime(x,s,u) p^+(x)}{\int p^\prime(x,s,u) p^+(x) \,\mathrm{d}x\,\mathrm{d}s\,\mathrm{d}u} \\
  &\propto p(s_{t-1}) \prod_{k=t}^{t+T} p(x_k|s_k) p(s_k | s_{k-1}, u_k) p(u_k) p^+(x_k)
\end{align*}$$
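
To see how a goal prior shapes behavior, consider a minimal sketch with one discrete outcome and two candidate controls. All tables and names here are hypothetical and chosen for illustration only; states $s$ are left out to keep the example small.

```julia
# Hypothetical predictive model p(x|u): rows index outcomes x ∈ {1,2,3},
# columns index controls u ∈ {1,2}.
p_x_given_u = [0.8 0.1;
               0.1 0.1;
               0.1 0.8]
p_u     = [0.5, 0.5]                   # flat control prior p(u)
p_prime = p_x_given_u .* p_u'          # joint p′(x,u) = p(x|u) p(u)

p_goal = [0.05, 0.05, 0.90]            # goal prior p⁺(x): outcome 3 is preferred

unnorm = p_prime .* p_goal             # p′(x,u) p⁺(x)
p_ext  = unnorm ./ sum(unnorm)         # extended model, normalized to sum to one

p_u_ext = vec(sum(p_ext, dims=1))      # marginal over controls in the extended model
```

Although the control prior is flat, the marginal `p_u_ext` puts most of its mass on the second control, because that control predicts the preferred outcome. This is the sense in which the goal prior induces goal-driven behavior.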

- We also assume that the agent interacts with an environment, which we represent by a dynamic model
$$
(y_t,\tilde{s}_t) = R_t\left( a_t,\tilde{s}_{t-1}\right)
$$
where $a_t$ are _actions_, $y_t$ are _outcomes_, and $\tilde{s}_t$ holds the environmental _states_.

- The agent can push actions $a_t$ onto the environment and measure responses $y_t$, but has no access to the environmental states $\tilde{s}_t$.
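
As a minimal sketch of such an environment, the code below keeps the state hidden inside a struct and returns only the outcome. The struct, the function name `execute!`, the additive transition and the Gaussian observation noise are all assumptions made for illustration; the notes do not prescribe a particular $R_t$.

```julia
# Hypothetical environment: a point on a line whose true position is hidden from the agent.
mutable struct Environment
    state::Float64        # hidden environmental state s̃_t
    noise_std::Float64    # standard deviation of the (assumed Gaussian) observation noise
end

# One call implements (y_t, s̃_t) = R_t(a_t, s̃_{t-1}): the new state is stored
# inside the environment and only the outcome y_t is handed back to the agent.
function execute!(env::Environment, a::Float64)
    env.state = env.state + a                      # assumed state transition
    y = env.state + env.noise_std * randn()        # assumed observation model
    return y
end

env = Environment(0.0, 0.1)
y1 = execute!(env, 1.0)     # push an action, receive an outcome
y2 = execute!(env, -0.5)
```

An agent built on top of this interface would only ever call `execute!` and read its return value; it would not inspect `env.state`.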



### FFG for Agent Model

- After selecting an action $a_t$ and making an observation $y_t$, the Forney-style factor graph (FFG) for the model is given by:

<img src="./figures/fig-active-inference-model-specification.png" width="800px">

- The (brown) dashed box is the agent's Markov blanket.  


### Online Active Inference

- Online active inference proceeds by iteratively executing three stages: (1) act-execute-observe, (2) infer, (3) slide forward

<img src="./figures/fig-online-active-inference.png" width="600px">
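
The control flow of the three stages can be sketched as a simple loop. This is only a skeleton under strong assumptions: the environment is noiseless, and the `infer` function is a placeholder fixed-gain rule rather than actual free energy minimization. All function names are hypothetical.

```julia
# Hypothetical environment R_t: the hidden state lives in a closure,
# so the agent only ever sees the returned outcome y_t.
function make_environment(s0::Float64)
    s_env = s0
    return a -> begin
        s_env += a            # assumed noiseless state transition
        return s_env          # outcome y_t (assumed noiseless observation)
    end
end

# Placeholder for the "infer" stage. A real agent would minimize free energy here;
# this stub simply proposes an action that moves halfway towards the goal.
infer(y, goal) = 0.5 * (goal - y)

function run_agent(n_steps::Int, goal::Float64)
    execute = make_environment(0.0)
    a  = 0.0                              # initial action
    ys = Float64[]
    for t in 1:n_steps
        y = execute(a)                    # (1) act-execute-observe
        push!(ys, y)
        a = infer(y, goal)                # (2) infer
        # (3) slide forward: the loop index advances to the next time step
    end
    return ys
end

ys = run_agent(20, 5.0)
```

With this placeholder rule, the distance between the outcome and the goal halves at every step, so `ys` approaches the goal value. Swapping `infer` for a proper inference routine would leave the loop structure unchanged.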

### Specification of Free Energy 

- Consider the agent's inference task at time step $t$, right after having selected an action $a_t$ and having made an observation $y_t$.

- As usual, we record actions and observations by substituting the values into the generative model:
$$\begin{align*}
p(x,s,u) &\propto  \underbrace{p(x_t=y_t|s_t)}_{\text{observation}} p(s_t|s_{t-1},u_t) p(s_{t-1}) \underbrace{p(u_t=a_t)}_{\text{action}} \\ & \quad \cdot \underbrace{\prod_{k=t+1}^{t+T} p(x_k|s_k) p(s_k | s_{k-1}, u_k) p(u_k) p^+(x_k)}_{\text{future}}
\end{align*}$$


- Note that (future) $x$ is also a latent variable and hence we include $x$ in the recognition model.  

- This leads to the following free energy functional
$$\begin{align*}
F[q] &\propto \sum_{x,s,u} q(x,s,u) \log \frac{q(x,s,u)}{p(x,s,u)} 
\end{align*}$$

- Lots of interesting FE decompositions are possible again. For instance,
$$\begin{align*}
F[q] &\propto \sum_{x,s,u} q(x,s|u)q(u) \log \frac{q(x,s|u)q(u)}{p(x,s|u)p(u)} \\
  &= \sum_{u} q(u) \underbrace{\sum_{x,s} q(x,s|u)\log \frac{q(x,s|u)}{p(x,s|u)}}_{F_u[q]} + \underbrace{\sum_{u} q(u) \log \frac{q(u)}{p(u)}}_{\text{complexity}}
\end{align*}$$
breaks the FE into a complexity term and a data-dependent FE term $F_u[q]$ for each control sequence $u$.

- Let's now consider the break-up $x=(x_t,x_{>t})$ with $x_{>t} = (x_{t+1},\ldots,x_{t+T})$ that recognizes the distinction between already observed and future data.


$$\begin{align*}
F_u[q] &= \sum_{x,s} q(x_t,x_{>t},s|u)\log \frac{q(x_t,x_{>t},s|u)}{p(x_t,x_{>t},s|u)} \\
&= \text{(observed FE)} + \text{(predicted FE)}
\end{align*}$$

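- Minimizing $F[q]$ with respect to $q(u)$ yields $q(u) \propto p(u) \exp\left(-F_u[q]\right)$. Writing $V_u$ for the observed FE and $G_u$ for the predicted FE, so that $F_u[q] = V_u + G_u$, it follows that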

$$
q(u) \propto p(u) \exp \left( -V_u - G_u\right)
$$
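
The expression for $q(u)$ is a prior-weighted softmax over negative free energies. The sketch below evaluates it for three candidate controls; the values of `V` and `G` are made-up numbers, not the result of any inference.

```julia
# Hypothetical values for three candidate controls u ∈ {1,2,3}
p_u = [1/3, 1/3, 1/3]        # control prior p(u)
V   = [1.0, 1.2, 0.8]        # observed FE per control (made-up numbers)
G   = [2.0, 0.5, 3.0]        # predicted FE per control (made-up numbers)
F_u = V .+ G

# q(u) ∝ p(u) exp(-V_u - G_u)
w   = p_u .* exp.(-F_u)
q_u = w ./ sum(w)

# Total FE as a function of q(u): expected F_u plus the complexity term
total_fe(q) = sum(q .* F_u) + sum(q .* log.(q ./ p_u))

F_min = total_fe(q_u)                  # FE at the softmax solution
F_alt = total_fe([0.2, 0.5, 0.3])      # FE at some other distribution over controls
```

Here `q_u` concentrates on the second control, which has the lowest $V_u + G_u$. In this sketch `F_min` equals `-log(sum(w))` and is no larger than `F_alt`, consistent with the softmax form being the minimizer over $q(u)$.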






### FE Decomposition

- In order to make inference manageable, let us assume a mean-field factorization for the recognition model (the approximate 'posterior'):
$$\begin{align*}
q(x,s,u) = q(x) q(s) q(u) = q(s_{t-1})\prod_{k=t}^{t+T} q(x_k) q(s_k) q(u_k)
\end{align*}$$

- Substituting this factorization into the FE functional, with the generative model written in shorthand, gives
$$\begin{align*}
F[q] &\propto \sum_{x,s,u} q(x,s,u) \log \frac{q(x,s,u)}{p(x,s,u)} \\
  &= \sum_{x,s,u} q(x) q(s) q(u) \log \frac{q(x) q(s) q(u)}{p(x|s) p(s|u) p(u) p^+(x)}
\end{align*}$$
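
Under a mean-field assumption, each factor can be updated in turn while the others are held fixed. As a sketch (this is the standard coordinate-wise variational update, written in the notation of this section and not derived in these notes), the FE-minimizing state factor satisfies

$$\begin{align*}
q^*(s) &\propto \exp\left( \sum_{x,u} q(x) q(u) \log p(x,s,u) \right)
\end{align*}$$

with analogous expressions for $q(x)$ and $q(u)$.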


