# **Bayesian Updating with Conjugate Priors**
---
---

### **Conjugate Prior:** a prior whose posterior belongs to the same distribution family
- for certain likelihood functions, choosing a specific prior distribution yields a posterior from the same family as the prior (only the parameters are updated)



> 
> ### P(θ) such that P(θ|D) is in the same distribution family as P(θ)
> - a *Prior Distribution* chosen to match the *Likelihood Function* gives a *Posterior Distribution* in the same family as the *Prior Distribution*; only the parameters change
> - Allows you to skip the `posterior = likelihood * prior` computation (if you know the prior is conjugate)
> - implies a **Closed-Form Expression** for the *posterior distribution* (the maximum of the posterior follows directly from the updated parameters)
>      - *Closed-Form Expression* = finite number of standard operations

### Import Libraries

In [1]:
import numpy as np
import scipy.stats as stats

## Calculate Posterior of Binomial Likelihood (Conjugate Priors)
---
---
Medium Article: [Conjugate Prior](https://towardsdatascience.com/conjugate-prior-explained-75957dc80bfb)

- `Θ` = probability of success
- goal = pick the `Θ` that maximizes the posterior probability

### Generate Dataset and Calculate Posterior of the Binomial Likelihood
---

In [19]:
# Generate Dataset
success_prob = 0.3
data = np.random.binomial(n=1, p=success_prob, size=1000)

# Θ
theta_range = np.linspace(0,1,1000)

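# Prior P(Θ)
    # Beta Distribution: Probability Distribution on Probabilities
    # Cumulative Distribution Function (CDF): Probability 'X' will take on a value less than or equal to 'x'
    # The difference of two CDF values is the prior probability mass in the
    # small interval [Θ, Θ + 0.0001], which approximates the prior density at Θ
    # (up to a constant factor that cancels when the posterior is normalized)
Alpha = 2
Beta = 8
theta_range_cdf = theta_range + 0.0001
cdf_prior = stats.beta.cdf(x=theta_range_cdf, a=Alpha, b=Beta) - stats.beta.cdf(x=theta_range, a=Alpha, b=Beta)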

# Likelihood P(X|Θ)
    # Binomial Distribution: probability of 'k' successes in 'n' periods
    # Probability Mass Function (PMF): probability that 'X' = 'k'
        # k = number of successes
        # n = number of trials 
        # p = probability of success
likelihood = stats.binom.pmf(k=np.sum(data), n=len(data), p=theta_range)

# Posterior P(Θ|X)
posterior = likelihood * cdf_prior 
normalized_posterior = posterior / np.sum(posterior)


### Above code ^: Posterior Calculation is Expensive
1. Computing the Posterior for Every `Θ`
    - normalization aside -> the goal is to find the `Θ` that *maximizes* the posterior (Maximum a Posteriori - MAP)
2. If There Is **No Closed-Form Formula** for the Posterior Distribution
    - *Closed-Form Expression* = finite number of standard operations
    - the MAP has to be found by numerical optimization (gradient descent or Newton's method)
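
A minimal sketch of what finding the MAP numerically can look like, assuming the same Beta(2, 8) prior and Bernoulli data as the cell above. The helper `neg_log_posterior` and the fixed seed are assumptions added for illustration, and `scipy.optimize.minimize_scalar` stands in here for gradient descent or Newton's method.

```python
import numpy as np
from scipy import optimize, stats

# Simulated Bernoulli data (seed fixed only to make the sketch repeatable)
rng = np.random.default_rng(0)
data = rng.binomial(n=1, p=0.3, size=1000)
successes, trials = int(data.sum()), data.size

def neg_log_posterior(theta, a=2, b=8):
    """Negative unnormalized log posterior: -(log-likelihood + log-prior)."""
    log_likelihood = stats.binom.logpmf(successes, trials, theta)
    log_prior = stats.beta.logpdf(theta, a, b)
    return -(log_likelihood + log_prior)

# Option 1: bounded 1-D optimizer over Θ in (0, 1)
result = optimize.minimize_scalar(
    neg_log_posterior, bounds=(1e-6, 1 - 1e-6), method="bounded"
)
theta_map_numeric = result.x

# Option 2: brute-force search over a grid of Θ values
theta_grid = np.linspace(0.001, 0.999, 999)
theta_map_grid = theta_grid[np.argmin(neg_log_posterior(theta_grid))]

print(theta_map_numeric, theta_map_grid)
```

Working with the log posterior avoids underflow when the likelihood is a product over many trials; normalization is skipped because it does not move the location of the maximum.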

# **Conjugate Prior**
---
---
### Able to Skip Computationally Expensive *`posterior = likelihood * prior`* computation 


### **IF** your *prior distribution* is conjugate to the likelihood, the *posterior* has a *closed-form expression*:
 - MAP is known directly from the updated posterior parameters
 - **Closed-Form Expression** = finite number of standard operations
    - Closed-Form Formulas of Conjugate Priors Lighten Computation 
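
As a hedged sketch of that shortcut for the Beta prior and binomial likelihood used in this notebook: the posterior is Beta(α + x, β + n − x), so no grid over `Θ` is needed. The variable names and the fixed seed below are assumptions for illustration; the mode formula holds when both posterior parameters are greater than 1.

```python
import numpy as np
from scipy import stats

# Simulated Bernoulli data (seed fixed only to make the sketch repeatable)
rng = np.random.default_rng(0)
data = rng.binomial(n=1, p=0.3, size=1000)
x, n = int(data.sum()), data.size  # successes, trials

# Prior: Beta(2, 8)
alpha_prior, beta_prior = 2, 8

# Conjugate update: add successes to alpha, failures to beta
alpha_post = alpha_prior + x
beta_post = beta_prior + (n - x)
posterior = stats.beta(alpha_post, beta_post)

# MAP = mode of the Beta posterior (valid when alpha_post, beta_post > 1)
theta_map = (alpha_post - 1) / (alpha_post + beta_post - 2)
posterior_mean = posterior.mean()
lower, upper = posterior.interval(0.95)  # central 95% credible interval

print(theta_map, posterior_mean, (lower, upper))
```

The whole update is two additions, and the frozen `stats.beta` object then gives the mean, intervals, or samples without any explicit normalization.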

---
>### Pre-Known Conjugate Priors
>> Beta Distribution Posterior: 
>> - Beta Prior * **Bernoulli** Likelihood = Beta Posterior
>> - Beta Prior * **Binomial** Likelihood = Beta Posterior
>> - Beta Prior * **Negative Binomial** Likelihood = Beta Posterior 
>> - Beta Prior * **Geometric** Likelihood = Beta Posterior 
> 
>> Gamma Distribution Posterior: 
>> - Gamma Prior * **Poisson** Likelihood = Gamma Posterior 
>> - Gamma Prior * **Exponential** Likelihood = Gamma Posterior 
>
>> Normal Posterior: 
>> - Normal Prior * **Normal** Likelihood (unknown mean, known variance) = Normal Posterior 
>
---
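
The same pattern applies to the other pairs in the list. A minimal sketch for the Gamma prior with a Poisson likelihood, assuming a shape/rate parameterization of the Gamma; the prior values, seed, and variable names are illustrative assumptions, not part of the original notebook.

```python
import numpy as np
from scipy import stats

# Simulated Poisson counts (seed fixed only to make the sketch repeatable)
rng = np.random.default_rng(1)
counts = rng.poisson(lam=4.0, size=200)

# Prior on the Poisson rate λ: Gamma(shape=2, rate=1)
shape_prior, rate_prior = 2.0, 1.0

# Conjugate update: shape gains the total count, rate gains the sample size
shape_post = shape_prior + counts.sum()
rate_post = rate_prior + counts.size

# scipy parameterizes the Gamma by shape `a` and `scale` = 1 / rate
posterior = stats.gamma(a=shape_post, scale=1.0 / rate_post)

# MAP = mode of the Gamma posterior (valid when shape_post > 1)
lambda_map = (shape_post - 1) / rate_post

print(lambda_map, posterior.mean())
```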

### The Beta Distribution Is the Conjugate Prior to the Binomial Likelihood: 
- Binomial Probability Mass Function: $f(x) = \binom{n}{x} \Theta^{x} (1-\Theta)^{n-x}$
    - function of `x`
- Beta Probability Density Function: $g(\Theta) = \frac{1}{B(\alpha, \beta)} \Theta^{\alpha-1} (1-\Theta)^{\beta-1}$
    - function of `Θ`
- **`Θ`** = Probability of Success
- **`x`** = Number of Successes
- **`n`** = Number of Trials -> **`n-x`** = Number of Failures 
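
Multiplying the two functions above shows why the posterior stays in the Beta family. The following is a short worked derivation in the notation of this section; factors that do not depend on `Θ` are absorbed into the proportionality sign.

```latex
\begin{aligned}
P(\Theta \mid x)
  &\propto \binom{n}{x}\,\Theta^{x}(1-\Theta)^{n-x}
     \cdot \frac{1}{B(\alpha,\beta)}\,\Theta^{\alpha-1}(1-\Theta)^{\beta-1} \\
  &\propto \Theta^{x+\alpha-1}\,(1-\Theta)^{n-x+\beta-1}
\end{aligned}
```

This is the kernel of a Beta distribution with parameters α + x and β + n − x, so the normalizing constant must be 1 / B(α + x, β + n − x). When both updated parameters are greater than 1, the MAP is the mode of that Beta distribution, (α + x − 1) / (α + β + n − 2).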