# Define the dataset

In [1]:
import pandas as pd
df = pd.DataFrame({
    'StorageTemperature': [2, 8, 15, 25],
    'TotalMushrooms': [30, 25, 20, 30],
    'SpoiledMushrooms': [2, 4, 5, 20]
})
df

|   | StorageTemperature | TotalMushrooms | SpoiledMushrooms |
|---|---|---|---|
| 0 | 2 | 30 | 2 |
| 1 | 8 | 25 | 4 |
| 2 | 15 | 20 | 5 |
| 3 | 25 | 30 | 20 |


# Modelling assumptions:
1. The outcomes of the $n_i$ mushrooms within each group $i$ are *independent*.  
   Each mushroom in the group has probability $p_i$ of spoiling.

2. The probability $p_i$ that a mushroom spoils depends on the temperature level $x_i$ as follows:

   $$
   p_i = \text{sigm}(\alpha + \beta x_i)
   $$

   where

   $$
   \text{sigm}(z) = \frac{1}{1 + e^{-z}}
   $$

3.  The parameters $\theta = [\alpha, \beta]^\top$ have independent Gaussian priors:
    \begin{align}
    \alpha &\sim \mathcal{N}(\mu_\alpha, \sigma_\alpha^2), \quad \mu_\alpha = 0, \sigma_\alpha = 2 \\
    \beta &\sim \mathcal{N}(\mu_\beta, \sigma_\beta^2), \quad \mu_\beta = 0, \sigma_\beta = 1
    \end{align}

4.  The outcomes in the four groups are independent of each other, given $\theta$.


## 1.1: Probabilistic model

* Derive and comment on the full probabilistic model.

## Bayesian Logistic Regression Model

We aim to model the spoilage of mushrooms based on temperature.

### 1. Variables
* $i$: Index for the experimental group.
* $n_i$: Total number of mushrooms in group $i$.
* $y_i$: Number of spoiled mushrooms in group $i$.
* $x_i$: Temperature level for group $i$.

### 2. Likelihood (Sampling Distribution)
The outcomes of the $n_i$ mushrooms within each group $i$ are independent, and each mushroom spoils with the same probability $p_i$. The number of spoiled mushrooms therefore follows a Binomial distribution:

$$y_i \sim \text{Binomial}(n_i, p_i)$$

The probability $p_i$ that a mushroom spoils depends on the temperature $x_i$ via the logistic (sigmoid) function, which is the inverse of the logit link:

$$p_i = \text{sigm}(\alpha + \beta x_i) = \frac{1}{1 + e^{-(\alpha + \beta x_i)}}$$
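
To make the mapping from temperature to spoilage probability concrete, here is a minimal sketch using only the Python standard library. The values of `alpha` and `beta` are arbitrary illustrative choices (an assumption, not estimates from the data), and the helper names `sigm` and `binomial_pmf` are introduced here for illustration only.

```python
import math

def sigm(z):
    """Logistic function 1 / (1 + exp(-z)), arranged to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def binomial_pmf(y, n, p):
    """Probability of exactly y spoiled mushrooms out of n, each spoiling with probability p."""
    return math.comb(n, y) * p**y * (1.0 - p) ** (n - y)

temperatures = [2, 8, 15, 25]  # x_i from the dataset
alpha, beta = -3.0, 0.15       # ASSUMPTION: illustrative values only

# Spoilage probability p_i implied by (alpha, beta) at each temperature
probs = [sigm(alpha + beta * x) for x in temperatures]

for x, p in zip(temperatures, probs):
    print(f"x = {x:>2}: p = {p:.3f}")

# Probability of the first group's observation (2 spoiled out of 30) under these values
print(binomial_pmf(2, 30, probs[0]))
```

With a positive `beta`, the implied spoilage probability increases with temperature, which is the qualitative pattern the model is meant to capture.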

### 3. Priors
The parameters $\theta = [\alpha, \beta]^\top$ have independent Gaussian priors:

$$
\begin{aligned}
\alpha &\sim \mathcal{N}(\mu_\alpha=0, \sigma_\alpha^2=4) \\
\beta &\sim \mathcal{N}(\mu_\beta=0, \sigma_\beta^2=1)
\end{aligned}
$$

**Derivation of Joint Prior Density:**
Because the priors are independent, the joint density is the product of the individual densities:

$$
f(\alpha, \beta) = f(\alpha) \cdot f(\beta)
$$

Substituting the Gaussian PDF $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ with our specific hyperparameters ($\sigma_\alpha=2, \sigma_\beta=1, \mu_\alpha=\mu_\beta=0$):

$$
\begin{aligned}
f(\alpha, \beta) &= \left( \frac{1}{\sqrt{2\pi \cdot 4}} e^{-\frac{\alpha^2}{2 \cdot 4}} \right) \cdot \left( \frac{1}{\sqrt{2\pi \cdot 1}} e^{-\frac{\beta^2}{2 \cdot 1}} \right) \\
&= \left( \frac{1}{2\sqrt{2\pi}} \cdot \frac{1}{\sqrt{2\pi}} \right) \cdot \exp\left( -\frac{\alpha^2}{8} - \frac{\beta^2}{2} \right) \\
&= \frac{1}{4\pi} \exp\left( \frac{-\alpha^2}{8} - \frac{4\beta^2}{8} \right)
\end{aligned}
$$

**Final Joint Prior:**
$$
f(\alpha, \beta) = \frac{1}{4\pi} \exp \left( -\frac{\alpha^2 + 4\beta^2}{8} \right)
$$
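
As a sanity check on the algebra, the closed form can be compared numerically with the product of the two individual Gaussian densities. This is a small sketch using `statistics.NormalDist` from the standard library; the function names and the test points are hypothetical choices made for illustration.

```python
import math
from statistics import NormalDist

def joint_prior_product(alpha, beta):
    """f(alpha) * f(beta) with alpha ~ N(0, 2^2) and beta ~ N(0, 1^2)."""
    return NormalDist(mu=0.0, sigma=2.0).pdf(alpha) * NormalDist(mu=0.0, sigma=1.0).pdf(beta)

def joint_prior_closed_form(alpha, beta):
    """Closed form: exp(-(alpha^2 + 4 beta^2) / 8) / (4 pi)."""
    return math.exp(-(alpha**2 + 4.0 * beta**2) / 8.0) / (4.0 * math.pi)

# ASSUMPTION: arbitrary test points, chosen only to exercise both functions
for a, b in [(0.0, 0.0), (1.5, -0.3), (-2.0, 0.7)]:
    assert math.isclose(joint_prior_product(a, b), joint_prior_closed_form(a, b))
```

At the origin both expressions reduce to $\frac{1}{4\pi}$, the maximum of the joint prior density.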

### 4. Posterior Distribution
The posterior distribution is proportional to the product of the likelihood and the prior (Bayes' Rule):

$$p(\alpha, \beta \mid \mathcal{D}) \propto p(\mathcal{D} \mid \alpha, \beta) \cdot f(\alpha, \beta)$$

**Step A: The Likelihood Function**
Since the groups are independent of each other given $\theta$ (assumption 4), the total likelihood is the product of the Binomial probabilities over the four groups $i$:

$$
\begin{aligned}
p(\mathcal{D} \mid \alpha, \beta) &= \prod_{i} \text{Binomial}(y_i \mid n_i, p_i) \\
&= \prod_{i} \left[ \binom{n_i}{y_i} p_i^{y_i} (1 - p_i)^{n_i - y_i} \right]
\end{aligned}
$$

where $p_i = \text{sigm}(\alpha + \beta x_i)$.
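
In practice the product is usually evaluated on the log scale, where it becomes a sum and is less prone to underflow. The following is a minimal sketch under that assumption; `log_sigm` and `log_likelihood` are hypothetical helper names, and the identity $1 - \text{sigm}(z) = \text{sigm}(-z)$ is used to compute $\log(1 - p_i)$ stably.

```python
import math

# Data from the table: temperature x_i, total n_i, spoiled y_i
temperatures = [2, 8, 15, 25]
totals = [30, 25, 20, 30]
spoiled = [2, 4, 5, 20]

def log_sigm(z):
    """log(sigm(z)) = -log(1 + exp(-z)), arranged to avoid overflow."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))

def log_likelihood(alpha, beta):
    """Sum over groups of log Binomial(y_i | n_i, sigm(alpha + beta * x_i))."""
    total = 0.0
    for x, n, y in zip(temperatures, totals, spoiled):
        z = alpha + beta * x
        log_p = log_sigm(z)    # log p_i
        log_q = log_sigm(-z)   # log(1 - p_i), since 1 - sigm(z) = sigm(-z)
        total += math.log(math.comb(n, y)) + y * log_p + (n - y) * log_q
    return total

# ASSUMPTION: illustrative parameter values, not fitted estimates
print(log_likelihood(-3.0, 0.15))
```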

**Step B: Combining with the Prior**
Multiplying the likelihood (Step A) by the joint prior (Section 3):

$$
p(\alpha, \beta \mid \mathcal{D}) \propto \left( \prod_{i} \binom{n_i}{y_i} [\text{sigm}(\alpha + \beta x_i)]^{y_i} [1 - \text{sigm}(\alpha + \beta x_i)]^{n_i - y_i} \right) \cdot \left( \frac{1}{4\pi} e^{-\frac{\alpha^2 + 4\beta^2}{8}} \right)
$$

**Simplified Proportional Form:**
When calculating the unnormalized posterior (e.g., for MCMC or Grid Approximation), we can drop constant factors that do not depend on $\alpha$ or $\beta$ (such as $\binom{n_i}{y_i}$ and $\frac{1}{4\pi}$).
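
On the log scale, dropping these constants leaves $\sum_i \left[ y_i \log p_i + (n_i - y_i) \log(1 - p_i) \right] - \frac{\alpha^2 + 4\beta^2}{8}$. A rough grid-approximation sketch of this quantity is shown below, using only the standard library. The grid ranges and step sizes are assumptions chosen for illustration, and the names `log_unnormalized_posterior` and `grid_posterior` are hypothetical.

```python
import math

# (x_i, n_i, y_i) for the four groups
data = [(2, 30, 2), (8, 25, 4), (15, 20, 5), (25, 30, 20)]

def log_sigm(z):
    """log(sigm(z)), arranged to avoid overflow."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))

def log_unnormalized_posterior(alpha, beta):
    """Log-likelihood plus log-prior, with constants in (alpha, beta) dropped."""
    log_lik = 0.0
    for x, n, y in data:
        z = alpha + beta * x
        log_lik += y * log_sigm(z) + (n - y) * log_sigm(-z)
    log_prior = -(alpha**2 + 4.0 * beta**2) / 8.0
    return log_lik + log_prior

# ASSUMPTION: grid limits and resolution are illustrative, not tuned
alphas = [-6.0 + 0.1 * i for i in range(81)]   # -6.0 to 2.0
betas = [-0.2 + 0.01 * j for j in range(61)]   # -0.2 to 0.4

log_post = {(a, b): log_unnormalized_posterior(a, b) for a in alphas for b in betas}

# Subtract the maximum before exponentiating, then normalize over the grid
max_lp = max(log_post.values())
weights = {k: math.exp(v - max_lp) for k, v in log_post.items()}
norm = sum(weights.values())
grid_posterior = {k: w / norm for k, w in weights.items()}

# Grid point with the highest posterior mass (a coarse MAP estimate)
alpha_map, beta_map = max(log_post, key=log_post.get)
print(alpha_map, beta_map)
```

Normalizing over the grid makes the dropped constants irrelevant, since they cancel in the ratio. The result is only as accurate as the grid is fine, and it assumes the posterior mass lies inside the chosen ranges.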

The final unnormalized density function to implement is:

$$
\text{Unnormalized Posterior} = \left( \prod_{i} p_i^{y_i} (1 - p_i)^{n_i - y_i} \right) \cdot \exp \left( -\frac{\alpha^2 + 4\beta^2}{8} \right)