## Dealing With Discrete Priors

According to the project file, we are to assume that $\theta$ is discrete (a rather unrealistic assumption, in my opinion) and can take only 10 possible values:

$$ \theta \in \{ 0.1, 0.2, \cdots, 1 \} $$

To begin our Bayesian investigation, we first need to choose an appropriate **prior**. In this case, that simply means defining a pmf for $\theta$ based on our intuition (and perhaps some guesswork). Here is the prior distribution for the random variable $\theta$ according to the file:

$$
\pi_{\text{Ghader}}(\theta) = 
\begin{cases}
0.05, & \text{if } \theta = 0.1, 0.2 \\
0.10, & \text{if } \theta = 0.3, 0.4 \\
0.20, & \text{if } \theta = 0.5, 0.6 \\
0.10, & \text{if } \theta = 0.7, 0.8 \\
0.05, & \text{if } \theta = 0.9, 1.0 \\
\end{cases}
$$

This expresses our degree of belief in each possible value of $\theta$ before any real data is observed. Our goal is to update this pmf in light of the observations. In other words, we want to find out which values of $\theta$ are most probable given the data.
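Before using this prior, it is worth checking that it is a valid pmf, i.e. that its probabilities sum to 1. The snippet below is a minimal sketch of that check; the names `prior`, `total`, and `prior_mean` are illustrative and not part of the project file.

```python
import math

# Prior pmf copied from the piecewise definition above, keyed by theta.
prior = {
    0.1: 0.05, 0.2: 0.05,
    0.3: 0.10, 0.4: 0.10,
    0.5: 0.20, 0.6: 0.20,
    0.7: 0.10, 0.8: 0.10,
    0.9: 0.05, 1.0: 0.05,
}

# A valid pmf must sum to 1 (up to floating-point rounding).
total = sum(prior.values())
assert math.isclose(total, 1.0)

# Prior expectation of theta: sum of theta * P(theta).
prior_mean = sum(theta * p for theta, p in prior.items())
print(f"sum = {total:.2f}, prior mean = {prior_mean:.2f}")
```

The prior is symmetric around 0.55, which is also its mean, so before seeing any data it favors neither small nor large values of $\theta$.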

### The Likelihood Function

According to the file, 12 of the 20 individuals surveyed answered "yes". Thus, our likelihood function is:

$$ \mathcal{L}(\theta)=\binom{20}{12}(\theta)^{12}(1-\theta)^8 $$

(It's just the binomial pmf expressed as a function of $\theta$ instead of the number of successes.)
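To get a feel for this function, we can evaluate it on the ten admissible values of $\theta$. This is only a sketch: the helper `binomial_likelihood` and the variable names are illustrative, with the defaults `n=20` and `k=12` taken from the survey numbers above.

```python
import math

def binomial_likelihood(theta, n=20, k=12):
    """Binomial pmf for k successes in n trials, viewed as a function of theta."""
    return math.comb(n, k) * theta**k * (1 - theta)**(n - k)

# The ten admissible values of theta: 0.1, 0.2, ..., 1.0.
grid = [i / 10 for i in range(1, 11)]
values = {theta: binomial_likelihood(theta) for theta in grid}

# The grid point where the likelihood is largest.
best_theta = max(values, key=values.get)
print(best_theta)
```

On this grid the likelihood peaks at $\theta = 0.6$, which coincides with the observed proportion $12/20$.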

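Now it's time to calculate the posterior distribution using Bayes' theorem. For each of the ten candidate values $\theta_i$, the posterior is the prior times the likelihood, divided by the sum of that product over all candidates:

$$ \pi(\theta_i \mid \text{data}) = \frac{\pi_{\text{Ghader}}(\theta_i)\,\mathcal{L}(\theta_i)}{\sum_{j=1}^{10} \pi_{\text{Ghader}}(\theta_j)\,\mathcal{L}(\theta_j)} $$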

### Implementation

import math
import pandas as pd

precomputed_binom_coefficient = math.comb(20, 12) # precompute binomial coefficient for optimization

prior_df = pd.DataFrame({
    'theta': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0],
    'prior_pmf': [0.05, 0.05, 0.10, 0.10, 0.20, 0.20, 0.10, 0.10, 0.05, 0.05]
})

def likelihood(theta):
    return precomputed_binom_coefficient * (theta ** 12) * ((1 - theta) ** 8)

# Calculate unnormalized posteriors
prior_df['likelihood'] = prior_df['theta'].apply(likelihood)
prior_df['unnormalized_posterior'] = prior_df['prior_pmf'] * prior_df['likelihood']

# Normalize to get posterior PMF
total_probability = prior_df['unnormalized_posterior'].sum()
prior_df['posterior_pmf'] = prior_df['unnormalized_posterior'] / total_probability

# Display the results
print(prior_df.round(6))

   theta  prior_pmf  likelihood  unnormalized_posterior  posterior_pmf
0    0.1       0.05    0.000000                0.000000       0.000000
1    0.2       0.05    0.000087                0.000004       0.000056
2    0.3       0.10    0.003859                0.000386       0.004974
3    0.4       0.10    0.035497                0.003550       0.045755
4    0.5       0.20    0.120134                0.024027       0.309698
5    0.6       0.20    0.179706                0.035941       0.463269
6    0.7       0.10    0.114397                0.011440       0.147453
7    0.8       0.10    0.022161                0.002216       0.028565
8    0.9       0.05    0.000356                0.000018       0.000229
9    1.0       0.05    0.000000                0.000000       0.000000


### Answers to the questions (according to the results)

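1. The most likely values of $\theta$ under the prior distribution: 0.5 and 0.6 (tied, each with probability 0.20)
2. The most likely value of $\theta$ under the posterior distribution: 0.6
3. The posterior probability that $\theta$ exceeds 0.5: $P(\theta > 0.5 \mid \text{data}) = \sum_{i=6}^{10}{P(\theta=i/10 \mid \text{data})} = 0.639516$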
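These three answers can also be computed directly rather than read off the table. The following is a minimal, self-contained sketch in plain Python (no pandas), so it recomputes the posterior from scratch; all variable names are illustrative.

```python
import math

thetas = [i / 10 for i in range(1, 11)]
prior = [0.05, 0.05, 0.10, 0.10, 0.20, 0.20, 0.10, 0.10, 0.05, 0.05]

# Prior times binomial likelihood (12 successes in 20 trials), then normalize.
unnormalized = [
    p * math.comb(20, 12) * t**12 * (1 - t)**8
    for t, p in zip(thetas, prior)
]
evidence = sum(unnormalized)
posterior = [u / evidence for u in unnormalized]

# 1. Prior mode(s): every theta that attains the maximum prior probability.
prior_max = max(prior)
prior_modes = [t for t, p in zip(thetas, prior) if p == prior_max]

# 2. Posterior mode: the theta with the largest posterior probability.
posterior_mode = thetas[posterior.index(max(posterior))]

# 3. Posterior probability that theta is strictly greater than 0.5.
tail = sum(q for t, q in zip(thetas, posterior) if t > 0.5)

print(prior_modes, posterior_mode, round(tail, 6))
```

Note that the comparison `t > 0.5` is strict, so $\theta = 0.5$ itself is excluded from the sum, matching the index range $i = 6, \dots, 10$.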