# How to think about likelihoods over multiple samples

This notebook demonstrates how we treat a likelihood over multiple samples.

==========================================================================

* **Notebook dependencies**:
    * ...

* **Content**: Jupyter notebook accompanying Chapter 2 of the textbook "Fundamentals of Active Inference"

* **Author**: Sanjeev Namjoshi (sanjeev.namjoshi@gmail.com)

* **Version**: 0.1

In a previous notebook we showed how to scale hidden state inference with Bayes' theorem to multiple samples. In this notebook we will extend the explanation of the likelihood a step further. 

We will use the same setup as before and assume the following:
* $x^*$: The true **external state** of the generative process.
* $y$: The **outcome** of a generative process, known as the **observation** for a generative model. This is the data the agent receives.

In this scenario the external states of the generative process ($x^*$) denote the size of a food source, and the outcomes ($y$) represent levels of light intensity emitted from the food as a function of its size. Using these observations, the agent needs to infer ("perceive") the hidden state of the generative process that generated the data it is receiving. This is represented by the variable $x$, the **hidden state**, which captures the agent's belief about the food size that could have generated the observed sensory data. We use the following agent and environment:

$$
    \mathscr{E} \triangleq 
    \begin{cases}
        y = g_{\mathscr{E}}(x^*; \theta^*) + \omega_y^*    & \text{Outcome generation} \\
        g_{\mathscr{E}}(x^*; \theta^*) = \beta_0^* + \beta_1^* x^* & \text{Generating function} \\
        \omega_y^* \sim \mathcal{N}(0, \sigma^2=1) & \text{Observation noise} \\
        \theta^* := \left \{\beta_0^* = 3, \beta_1^* = 2 \right \} & \text{Observation parameters}
    \end{cases}
$$

$$
    \mathcal{M} \triangleq 
    \begin{cases}
        p_{\mu_y, \sigma^2_y}(y_i \mid x) = \mathcal{N}(y_i; g_{\mathcal{M}}(x, \theta), \sigma^2_y) & \text{Likelihood} \\
        p_{m_x, s^2_x}(x) = \mathcal{N}(x; m_x, s^2_x) & \text{Prior on } x \\
        g_{\mathcal{M}}(x, \theta) = \beta_0 + \beta_1 x & \text{Generating function} \\
        \theta := \left \{\beta_0 = 3, \beta_1 = 2 \right \}  & \text{Linear parameters} \\ 
        \phi := \left \{\sigma^2_y = 0.25, s^2_x = 0.25, m_x = 4 \right \} & \text{Other parameters}
    \end{cases}
$$

The only difference from the previous setup is that observations are now indexed by $i = 1, \dots, N$ to indicate that multiple samples are generated from the same hidden state. We use the same linear environment as before and generate $N=30$ samples with $x^*=2$.
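
As a minimal sketch of this setup, the cell below draws $N=30$ outcomes from the linear environment at $x^*=2$ and evaluates the agent's Gaussian likelihood over all samples at once. The names `g_env`, `log_likelihood`, and the random seed are illustrative choices, not part of the original notebook. The sketch also assumes the samples are conditionally independent given $x$, so the joint log-likelihood is taken to be the sum of the per-sample log-densities.

```python
import numpy as np

# Generative process (environment) parameters from the definition of E
beta_0_star, beta_1_star = 3.0, 2.0
sigma2_env = 1.0      # variance of the observation noise omega_y^*
x_star = 2.0          # true external state (food size)
N = 30                # number of samples

rng = np.random.default_rng(seed=0)  # seed is an arbitrary, illustrative choice


def g_env(x):
    """Generating function of the environment: beta_0^* + beta_1^* x."""
    return beta_0_star + beta_1_star * x


# Each y_i is generated from the same external state x_star plus Gaussian noise
y = g_env(x_star) + rng.normal(loc=0.0, scale=np.sqrt(sigma2_env), size=N)

# Generative model (agent) parameters from the definition of M
beta_0, beta_1 = 3.0, 2.0
sigma2_y = 0.25       # likelihood variance assumed by the agent


def log_likelihood(x, y):
    """Joint Gaussian log-likelihood of all samples y for a candidate hidden state x.

    Assumes the samples are conditionally independent given x, so the
    per-sample log-densities are summed.
    """
    mu = beta_0 + beta_1 * x
    per_sample = -0.5 * np.log(2.0 * np.pi * sigma2_y) - 0.5 * (y - mu) ** 2 / sigma2_y
    return np.sum(per_sample)


# Compare the joint log-likelihood at the true state and at a poor guess
print(log_likelihood(x_star, y), log_likelihood(0.0, y))
```

Working with the sum of log-densities rather than the product of densities is a numerical convenience in this sketch: the product of 30 small density values can underflow, while the sum of their logarithms does not.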