# Bayesian Inference

## 1. Introduction

In this activity, you will explore the problem of estimating probabilities from data using the Bayesian framework. The goal is to start with prior information about some parameter of interest (e.g., $\theta$) and update it according to observed data. By Bayes' theorem, the **_posterior_** distribution incorporates the observations $x$ into the distribution of the parameter of interest. In this setup, the posterior distribution summarizes what is known about the parameter after seeing the data and can be expressed as follows:

$$
p(\theta | x) = \frac{p(x|\theta)p(\theta)}{\int_{\theta}p(x|\theta)p(\theta)d\theta}
$$

Here $p(x|\theta)$ is the likelihood (model) distribution, which summarizes the information in the experimental data, and $p(\theta)$ is the prior distribution, which quantifies the available knowledge about the parameter of interest and the uncertainty about it before any data are observed. In other words, the prior distribution describes our best guess about the parameter before observing the data.
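
To make the formula concrete, the sketch below evaluates the numerator on a grid of $\theta$ values and approximates the integral in the denominator with the trapezoidal rule. The toy data (3 successes and 1 failure), the flat prior, and all function names are hypothetical choices made only for illustration. For this toy case the exact posterior is Beta(4, 2), whose mean is 2/3, which gives something to compare the numerical result against.

```python
# Grid approximation of Bayes' rule for a probability theta in [0, 1].
# The toy data (3 successes, 1 failure) and the flat prior are hypothetical.

def likelihood(theta, successes=3, failures=1):
    """Bernoulli likelihood of the toy data for a given theta."""
    return theta**successes * (1.0 - theta) ** failures


def prior(theta):
    """Flat prior density on [0, 1]."""
    return 1.0


n_grid = 1001
step = 1.0 / (n_grid - 1)
grid = [i * step for i in range(n_grid)]

# Numerator of Bayes' rule at each grid point.
unnormalized = [likelihood(t) * prior(t) for t in grid]

# Denominator: trapezoidal approximation of the integral over theta.
evidence = step * (sum(unnormalized) - 0.5 * (unnormalized[0] + unnormalized[-1]))

# Normalized posterior density and its mean (a Riemann sum over the grid).
posterior = [u / evidence for u in unnormalized]
posterior_mean = step * sum(t * p for t, p in zip(grid, posterior))
print(f"posterior mean is approximately {posterior_mean:.4f}")
```

A grid works here because $\theta$ is one-dimensional and bounded; in higher dimensions the denominator is usually much harder to approximate this way.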

## 2. Estimating Chemotherapy Response Rates

### 2.1 Prior distribution: $p(\theta)$

The efficacy of a new chemotherapy medication is under investigation. Based on preliminary results from a sample of size 10, it is believed that on average 90% of patients will respond to this medication. The investigators also believe that this proportion is unlikely to fall below 80%. Define and plot a prior distribution based on this preliminary result. **Hint:** Consider using the Beta distribution.
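
One possible starting point, sketched below, is to read the preliminary sample as pseudo-counts: 9 responders and 1 non-responder, giving a Beta(9, 1) prior with mean 0.9. This parameter choice is an assumption, not something the exercise specifies, and the helper `beta_pdf` is a hypothetical stand-in for a library routine such as `scipy.stats.beta.pdf`. The sketch also estimates $P(\theta < 0.8)$ numerically so you can judge whether the prior matches the investigators' belief.

```python
import math


def beta_pdf(theta, a, b):
    """Beta(a, b) density on [0, 1]; this simple form assumes a >= 1 and b >= 1."""
    if theta < 0.0 or theta > 1.0:
        return 0.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm) * theta ** (a - 1) * (1.0 - theta) ** (b - 1)


# Assumed prior: 9 responders and 1 non-responder treated as pseudo-counts.
a_prior, b_prior = 9.0, 1.0
prior_mean = a_prior / (a_prior + b_prior)

# Density on a grid, ready to be passed to a plotting library.
grid = [i / 500 for i in range(501)]
prior_density = [beta_pdf(t, a_prior, b_prior) for t in grid]

# P(theta < 0.8) by the midpoint rule on [0, 0.8].
n_bins = 8000
width = 0.8 / n_bins
prob_below_80 = width * sum(
    beta_pdf((i + 0.5) * width, a_prior, b_prior) for i in range(n_bins)
)
print(f"prior mean = {prior_mean:.2f}, P(theta < 0.8) = {prob_below_80:.3f}")
```

If the resulting probability below 0.8 seems too large to count as "unlikely", a more concentrated prior with the same mean, such as Beta(18, 2), lowers it. The lists `grid` and `prior_density` can be plotted with, for example, matplotlib's `plt.plot(grid, prior_density)`.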

### 2.2 Likelihood distribution: $p(x|\theta)$

During a new trial, the following data were collected:

$[1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1]$

Select and plot a likelihood function over the full range of possible $\theta$ values. **Hint:** A suitable likelihood function would be a binomial distribution.
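
A minimal sketch of the binomial likelihood for these data follows. It assumes that 1 codes a responder and 0 a non-responder, which the text does not state explicitly; the variable and function names are illustrative. The likelihood is evaluated on a grid over $[0, 1]$ so that it can be plotted, and the grid point with the largest value approximates the maximum likelihood estimate.

```python
import math

# Trial outcomes copied from the text; 1 is assumed to mean "responded".
data = [1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1,
        1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1]
n_trials = len(data)
n_responders = sum(data)


def binomial_likelihood(theta, k, n):
    """Probability of k successes in n trials, each succeeding with probability theta."""
    return math.comb(n, k) * theta**k * (1.0 - theta) ** (n - k)


# Evaluate the likelihood for every theta on an evenly spaced grid.
grid = [i / 1000 for i in range(1001)]
likelihood = [binomial_likelihood(t, n_responders, n_trials) for t in grid]

# Grid point with the largest likelihood (approximate maximum likelihood estimate).
best_index = max(range(len(grid)), key=lambda i: likelihood[i])
theta_mle = grid[best_index]
print(f"{n_responders} of {n_trials} responded; likelihood peaks near {theta_mle:.3f}")
```

The binomial coefficient does not depend on $\theta$, so it changes the height of the curve but not its shape or the location of its peak.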

### 2.3 Posterior Distribution: $p(\theta|x)$

Compute and plot the posterior distribution along with the prior and likelihood distributions. Explain the results. What is the most likely value of the parameter of interest?
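
Because a Beta prior is conjugate to the binomial likelihood, the posterior is again a Beta distribution whose parameters are the prior parameters plus the observed counts. The sketch below assumes a Beta(9, 1) prior (one possible reading of section 2.1, not something the text specifies) and uses the counts from the data in section 2.2, with 1 assumed to code a responder. The most likely value is the posterior mode.

```python
# Conjugate Beta-binomial update: Beta(a, b) prior -> Beta(a + k, b + n - k) posterior.
a_prior, b_prior = 9, 1      # assumed prior pseudo-counts (not given in the text)
n_trials, n_ones = 30, 22    # counts from the trial data listed in section 2.2

a_post = a_prior + n_ones
b_post = b_prior + (n_trials - n_ones)

prior_mean = a_prior / (a_prior + b_prior)
mle = n_ones / n_trials                            # peak of the likelihood
post_mean = a_post / (a_post + b_post)
post_mode = (a_post - 1) / (a_post + b_post - 2)   # valid when a_post, b_post > 1

print(f"posterior: Beta({a_post}, {b_post})")
print(f"prior mean = {prior_mean:.3f}, MLE = {mle:.3f}")
print(f"posterior mean = {post_mean:.3f}, posterior mode = {post_mode:.3f}")
```

Under these assumptions the posterior mode lies between the likelihood peak and the prior mean, which is the usual compromise between data and prior; a different prior would shift it.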

### 2.4 Importance Sampling

In practice, the normalization factor of the posterior distribution can be difficult to calculate, so we may not be able to evaluate this density fully. However, numerical methods such as **importance sampling** can be used to estimate an important property of the posterior distribution, such as its **expectation**, when the distribution is known only up to a scale factor.

We defined the posterior distribution in the context of the Bayesian inference as follows:

$$
p(\theta | x) = \frac{p(x|\theta)p(\theta)}{\int_{\theta}p(x|\theta)p(\theta)d\theta}
$$

If we are not able to calculate the normalization factor, we can define the unnormalized posterior distribution as:

$$
\begin{aligned}
\tilde p(\theta | x) & \propto p(x|\theta)p(\theta) \\
p(\theta | x) &= C \cdot \tilde p(\theta | x)
\end{aligned}
$$

where $C$ is a constant. In this context, the expectation of the posterior distribution is defined as:

$$
E[\theta | x] = \int_{\theta} p(\theta|x)\theta d\theta
$$

Let's introduce a proposal distribution, denoted $q(\theta)$, and rewrite the expectation as:


$$
\begin{aligned}
E[\theta | x] &= \int_{\theta} \frac{p(\theta|x)}{q(\theta)}q(\theta) \theta d\theta \\
&= \int_{\theta}\omega(\theta)q(\theta) \theta d\theta \\
&= \int_{\theta}\omega(\theta)\theta q(\theta) d\theta
\end{aligned}
$$

We can use Monte Carlo simulation to estimate this expectation. Drawing samples $\theta_{1}, \dots, \theta_{n}$ from $q(\theta)$, the law of large numbers gives:

$$
\tilde E[\theta | x] = \frac{1}{n} \sum_{i=1}^{n} \omega(\theta_{i}) \theta_{i} \xrightarrow{n \rightarrow \infty} E_{q(\theta)}[\omega(\theta)\theta | x]
$$
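
The estimator above can be sketched for a case where the target density is fully known. In the example below, a Beta(2, 2) density stands in for $p(\theta|x)$ and a Uniform(0, 1) distribution plays the role of $q(\theta)$; both are illustrative assumptions rather than part of the exercise. The true mean of Beta(2, 2) is 0.5, so the estimate should land close to that value.

```python
import random


def target_pdf(theta):
    """Illustrative normalized target: Beta(2, 2) density, 6 * theta * (1 - theta)."""
    return 6.0 * theta * (1.0 - theta)


def proposal_pdf(theta):
    """Uniform(0, 1) proposal density."""
    return 1.0


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 100_000

# Draw from the proposal, then weight each draw by target / proposal.
samples = [rng.random() for _ in range(n)]
weights = [target_pdf(t) / proposal_pdf(t) for t in samples]

# Importance sampling estimate of the target mean.
estimate = sum(w * t for w, t in zip(weights, samples)) / n
print(f"importance sampling estimate of the mean: {estimate:.3f}")
```

This version needs the normalized target density, because the weights are used as they are and their average is only approximately 1.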

Note that $\omega(\theta_{i}) = \frac{p(\theta_{i} | x)}{q(\theta_{i})}$, but we only know $p(\cdot | x)$ up to a constant. To address this issue, we can derive a normalization factor from the weights, using the fact that the estimated expectation of the constant 1 should equal 1:


$$
\begin{aligned}
1 = \tilde E[1 | x] &= \frac{1}{n}\sum_{i=1}^{n} \omega(\theta_{i}) \cdot 1 \\
&= \frac{1}{n}\sum_{i=1}^{n} \frac{p(\theta_{i}| x)}{q(\theta_{i})} \\
&= \frac{1}{n}\sum_{i=1}^{n} \frac{C \cdot \tilde p(\theta_{i}| x)}{q(\theta_{i})} \\
&= \frac{C}{n} \cdot \sum_{i=1}^{n} \frac{\tilde p(\theta_{i}| x)}{q(\theta_{i})} \\
&= \frac{C}{n} \cdot \sum_{i=1}^{n} \tilde\omega(\theta_{i}) \\
\implies C &= \frac{n}{\sum_{i=1}^{n} \tilde\omega(\theta_{i})}
\end{aligned}
$$

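Accordingly, with $\tilde\omega(\theta_{i}) = \frac{\tilde p(\theta_{i} | x)}{q(\theta_{i})}$ denoting the unnormalized weights, the constant $C$ cancels and the estimator becomes: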

$$
\tilde E[\theta | x] = \frac{\sum_{i=1}^{n} \tilde\omega(\theta_{i}) \theta_{i}}{\sum_{i=1}^{n} \tilde\omega(\theta_{i})}
$$


Implement importance sampling using the unnormalized posterior (a Beta distribution) as the target and a normal proposal distribution $q(\theta) = \mathcal{N}(0,1)$ to estimate the expectation of the posterior distribution.
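
A minimal sketch of the self-normalized estimator is given below. It assumes the posterior is Beta(31, 9), which follows from an assumed Beta(9, 1) prior combined with 22 ones in 30 trials; the prior is not fixed by the text, so treat these parameters as placeholders. Only the Beta kernel $\theta^{a-1}(1-\theta)^{b-1}$ is used, without its normalizing constant, and the function names are hypothetical.

```python
import math
import random


def unnormalized_posterior(theta, a=31, b=9):
    """Beta(a, b) kernel theta^(a-1) * (1-theta)^(b-1); zero outside [0, 1]."""
    if theta < 0.0 or theta > 1.0:
        return 0.0
    return theta ** (a - 1) * (1.0 - theta) ** (b - 1)


def normal_pdf(theta):
    """Standard normal density, the proposal q(theta)."""
    return math.exp(-0.5 * theta * theta) / math.sqrt(2.0 * math.pi)


def self_normalized_is(unnorm_pdf, n, rng):
    """Estimate E[theta] as sum(w_i * theta_i) / sum(w_i) with theta_i drawn from N(0, 1)."""
    weighted_sum = 0.0
    weight_total = 0.0
    for _ in range(n):
        theta = rng.gauss(0.0, 1.0)
        w = unnorm_pdf(theta) / normal_pdf(theta)  # unnormalized weight
        weighted_sum += w * theta
        weight_total += w
    return weighted_sum / weight_total


rng = random.Random(42)  # fixed seed so the run is reproducible
estimate = self_normalized_is(unnormalized_posterior, 200_000, rng)
exact_mean = 31 / 40     # mean of Beta(31, 9), for comparison
print(f"importance sampling estimate = {estimate:.4f}, exact mean = {exact_mean:.4f}")
```

Only about a third of the draws from $\mathcal{N}(0,1)$ fall inside $[0, 1]$ and the rest receive zero weight, so this proposal is valid but wasteful; a proposal concentrated on $[0, 1]$ would reach the same accuracy with fewer samples.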

