# Implementation of Finite Mixture Model

<details>
  <summary>Table of Contents</summary>
  <ol>
    <li>
      <a href="#import-needed-filepaths-and-libraries">Import Needed Filepaths and Libraries</a>
    </li>
    <li>
    <a href="#load-dataset-into-pandas-dataframe">
    Load Dataset Into Pandas DataFrame
    </a>
    </li>
    <li>
    <a href="#define-response-variable-y">
    Define Response Variable y
    </a>
    </li>
    <li>
    <a href="#set-up-reproducible-random-number-generator">
    Set Up Reproducible Random Number Generator
    </a>
    </li>
    <li>
      <a href="#setting-up-mixture-model">Setting Up Mixture Model</a>
      <ul>
        <li><a href="#initial-parameters">Initial Parameters</a></li>
        <li><a href="#setting-priors">Setting Priors</a></li>
      </ul>
    </li>
    <li><a href="#gibbs-sampler-implementation">Gibbs Sampler Implementation</a>
      <ul>
        <li><a href="#gibbs-sampler-updates">Gibbs Sampler Updates</a></li>
      </ul>
</li>
  </ol>
</details>

## Import Needed Filepaths and Libraries

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats
from scipy.stats import norm  # component densities used in the Gibbs sampler
from texas_gerrymandering_hb4.config import FINAL_CSV

## Load Dataset Into Pandas DataFrame
Our processed dataset is read into a Pandas DataFrame.

In [None]:
df = pd.read_csv(FINAL_CSV)

## Define Response Variable `y`
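`y` is the array of observed outcomes, taken from the `dem_share` column with one value per district, and `n` is the number of observations.

In [None]:
y = df["dem_share"].values.astype(float)
n = len(y)  # number of observations, used throughout the sampler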

## Set Up Reproducible Random Number Generator

In [None]:
np.random.seed(123)

## Setting Up Mixture Model

### Initial Parameters
* These parameters are starting guesses for the Gibbs sampler.
* `_lambda` ($\lambda$) is the mixing proportion, that is, the probability that an observation belongs to component 1. It is named `_lambda` because `lambda` is a reserved keyword in Python. Starting from a place of ignorance, we assume an even mix between the two components and set $\lambda$ to 0.5 as a neutral initial guess.
* `mu_1` and `mu_2` are the initial means for each component; both start at the sample mean.
* `sigma_1_sq` and `sigma_2_sq` are the initial variances for each component; both start at the sample variance.

In [None]:
_lambda = 0.5
mu_1 = np.mean(y)
mu_2 = np.mean(y)
sigma_1_sq = np.var(y)
sigma_2_sq = np.var(y)

### Setting Priors
* `alpha_1` and `alpha_2` are the parameters of the Beta prior on $\lambda$:
$$\lambda \sim Beta(\alpha_1, \alpha_2)$$
Because the prior for $\lambda$ is Beta(2,2), `alpha_1` and `alpha_2` are both set to 2.
* This is a conjugate prior. As a reminder, the Beta distribution is the two-category special case of the Dirichlet distribution. The benefit of Beta(2,2) is that its density is 0 at $\lambda = 0$ and $\lambda = 1$, the values at which the mixture degenerates into a single component, and it is small near those values. The prior therefore pushes the sampler away from the degenerate values, making them less likely to accidentally become a point of convergence.
* The component means share the same prior: the prior means `mu0_1` and `mu0_2` are both set to the sample mean.
* Our variances come from the scaled inverse-$\chi^2$ distribution with 2 degrees of freedom.

In [None]:
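alpha_1 = 2
alpha_2 = 2

mu0_1 = np.mean(y)
mu0_2 = np.mean(y)

# Prior variances/scales, named to match their use in the Gibbs sampler
sigma0_1_sq = np.var(y)
sigma0_2_sq = np.var(y)

# Degrees of freedom of the scaled inverse-chi-squared prior on each variance
nu0_1 = 2
nu0_2 = 2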
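To make the claim about the Beta(2,2) prior concrete, here is a minimal sketch that evaluates its density with `scipy.stats.beta`. The grid `lam_grid` is an illustrative name that does not appear elsewhere in this notebook.

```python
import numpy as np
from scipy.stats import beta

# Hypothetical evaluation grid for lambda, including both endpoints.
lam_grid = np.array([0.0, 0.01, 0.25, 0.5, 0.75, 0.99, 1.0])

# Density of the Beta(2, 2) prior, which equals 6 * lam * (1 - lam).
prior_density = beta.pdf(lam_grid, 2, 2)

for lam, dens in zip(lam_grid, prior_density):
    print(f"lambda = {lam:.2f}  density = {dens:.4f}")
```

The density vanishes at both endpoints and peaks at $\lambda = 0.5$, which is the behavior the prior relies on to discourage degenerate mixtures.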

### Gibbs Sampler Parameters

In [None]:
iterations = 1000
warmup = 500

### Storage for Samples

In [None]:
lambda_samples = np.zeros(iterations)
mu1_samples = np.zeros(iterations)
mu2_samples = np.zeros(iterations)
sigma1_sq_samples = np.zeros(iterations)
sigma2_sq_samples = np.zeros(iterations)
z_samples = np.zeros((iterations, len(y)), dtype=int)

## Scaled Inverse-$\chi^2$ Sampler

In [None]:
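def rinvchisq(nu, scale):
    """Draw from a scaled inverse-chi-squared distribution with nu degrees of freedom."""
    # If X ~ chi-squared(nu), then nu * scale / X ~ scaled inverse-chi-squared(nu, scale).
    return nu * scale / np.random.chisquare(nu)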
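As a sanity check on this construction, the sketch below compares the empirical mean of many draws with the theoretical mean $\nu s^2 / (\nu - 2)$, which holds for $\nu > 2$. The helper `rinvchisq_rng` and the values of `nu` and `scale` are assumptions chosen for illustration; the helper takes an explicit NumPy `Generator` so the block is self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)

def rinvchisq_rng(nu, scale, size, rng):
    """Hypothetical vectorized variant: nu * scale / chi-squared(nu) draws."""
    return nu * scale / rng.chisquare(nu, size=size)

# Illustrative values, not the notebook's prior settings.
nu, scale = 10, 2.0
draws = rinvchisq_rng(nu, scale, size=200_000, rng=rng)

# Theoretical mean of a scaled inverse-chi-squared(nu, scale) for nu > 2.
expected_mean = nu * scale / (nu - 2)
print(f"empirical mean = {draws.mean():.3f}, theoretical mean = {expected_mean:.3f}")
```

Note that with only 2 degrees of freedom, as in the prior used here, the mean is not finite, so this check only applies to larger degrees of freedom such as those reached after conditioning on data.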


## Gibbs Sampler Implementation
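* The Gibbs sampler initializes the parameters and then repeatedly draws each parameter from its conditional distribution given the current values of all the others, continuing the iterations until the chain converges.
* The benefits of the Gibbs sampler are that it handles complex, high-dimensional distributions and only requires sampling from the conditional distributions, which is easier to implement than sampling from the joint posterior directly.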
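The alternating structure is easiest to see on a toy target that is unrelated to the mixture model. The sketch below runs a Gibbs sampler on a standard bivariate normal with correlation `rho`, where each conditional is itself normal. All names and settings are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

rho = 0.8                        # target correlation (illustrative)
n_iter = 20_000                  # total Gibbs iterations
burn_in = 1_000                  # early draws discarded before summarizing
cond_sd = np.sqrt(1 - rho**2)    # sd of each conditional distribution

x1, x2 = 0.0, 0.0                # arbitrary starting point
draws = np.empty((n_iter, 2))

for t in range(n_iter):
    # Draw x1 from p(x1 | x2), then x2 from p(x2 | x1), using the newest values.
    x1 = rng.normal(rho * x2, cond_sd)
    x2 = rng.normal(rho * x1, cond_sd)
    draws[t] = (x1, x2)

kept = draws[burn_in:]
sample_corr = np.corrcoef(kept[:, 0], kept[:, 1])[0, 1]
print(f"sample correlation = {sample_corr:.3f} (target {rho})")
```

The mixture sampler follows the same pattern, cycling through more blocks: memberships, mixing proportion, means, and variances.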

### Gibbs Sampler Updates

#### Updating z
* We compute the posterior probability that $z_{i}=1$, that is, that observation $i$ belongs to component 1, using the normal densities of the two components:
$$p_{z_{i}=1} = \frac{\lambda_{old} \cdot N(y_{i}|\mu_{1}, \sigma^2_{1})}{\lambda_{old} \cdot N(y_{i}|\mu_{1}, \sigma_{1}^2) + (1-\lambda_{old}) \cdot N(y_{i}|\mu_{2}, \sigma_{2}^2)}$$
* Once the probability that $z_{i}=1$ is computed, a new value of $z_{i}$ is drawn from a Bernoulli distribution, which is a binomial distribution with a single trial:
$$z_{i}^{new} \leftarrow Bin(1, p_{z_{i}=1})$$
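The membership step can be sketched on toy data as follows. The function `membership_probs`, the array `y_toy`, and the parameter values are assumptions made for illustration; only the formula comes from the text.

```python
import numpy as np
from scipy.stats import norm

def membership_probs(y, lam, mu_1, mu_2, var_1, var_2):
    """Posterior probability that each observation belongs to component 1."""
    w1 = lam * norm.pdf(y, loc=mu_1, scale=np.sqrt(var_1))
    w2 = (1 - lam) * norm.pdf(y, loc=mu_2, scale=np.sqrt(var_2))
    return w1 / (w1 + w2)

rng = np.random.default_rng(2)

# Toy observations: near component 1, halfway between, near component 2.
y_toy = np.array([0.0, 5.0, 10.0])
p = membership_probs(y_toy, lam=0.5, mu_1=0.0, mu_2=10.0, var_1=1.0, var_2=1.0)

# One Bernoulli draw per observation: 1 = component 1, 0 = component 2.
z_new = rng.binomial(1, p)
print(p, z_new)
```

An observation sitting exactly between two equally weighted, equally spread components gets probability 0.5, while observations close to one mean are assigned to that component almost surely.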

#### Updating $\lambda$
* Updating $\lambda$ is a two-step process because its posterior is a Beta distribution.
* First, we update the Beta parameters using the standard form from conjugacy. Recall that the prior parameters are (2,2), stored as `alpha_1` and `alpha_2` and written here as $\alpha$ and $\beta$. They are updated with the counts of 1's and 0's among our z's, which is why we end up with one update for each parameter. Here $\alpha_{old}$ and $\beta_{old}$ are always the prior values; the counts do not accumulate across iterations:
$$\alpha_{new} \leftarrow \alpha_{old} + \sum z_{i}^{new}$$
$$\beta_{new} \leftarrow \beta_{old} + n -  \sum z_{i}^{new}$$
* Second, because $\lambda$ now follows a Beta distribution with the updated parameters, we draw a new value of $\lambda$ from it:
$$\lambda_{new} \leftarrow Beta(\alpha_{new}, \beta_{new})$$
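A small worked example of the conjugate update, assuming a hypothetical membership vector `z_toy` with four 1's and two 0's and the Beta(2,2) prior:

```python
import numpy as np

rng = np.random.default_rng(3)

alpha_prior, beta_prior = 2, 2           # Beta(2, 2) prior on lambda
z_toy = np.array([1, 1, 0, 1, 0, 1])     # hypothetical memberships

# Add the count of 1's to alpha and the count of 0's to beta.
alpha_post = alpha_prior + z_toy.sum()
beta_post = beta_prior + len(z_toy) - z_toy.sum()

# Draw the new mixing proportion from the updated Beta distribution.
lam_new = rng.beta(alpha_post, beta_post)
print(alpha_post, beta_post, lam_new)
```

Here the posterior is Beta(6, 4), so draws of $\lambda$ concentrate around 0.6.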

#### Updating Means
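* Our numbers of observations in each component are $n_1 = \sum z_{i}^{new}$ and $n_2 = n - n_1 = \sum (1 - z_{i}^{new})$, respectively. Recall that our $z_{i}$ values have been redrawn.
* With the counts in hand, we compute the sample mean of each component:
$$\bar{y}_1 = \frac{1}{n_1} \sum_{i:\, z_{i}^{new} = 1} y_{i}$$
$$\bar{y}_2 = \frac{1}{n_2} \sum_{i:\, z_{i}^{new} = 0} y_{i}$$
* Once we obtain the sample means, we compute a posterior mean for each component by combining the prior with the sample. The prior is built from our initial values, which we denote with `init`. For component 1, the posterior mean is the precision-weighted average
$$m_{1} = \frac{\frac{\mu_{1,init}}{\sigma_{1,init}^2} + n_1 \frac{\bar{y}_1}{\sigma^2_{1, old}}}{\frac{1}{\sigma_{1,init}^2} + \frac{n_1}{\sigma^2_{1, old}}}$$
and the new mean is drawn from a normal distribution centered there:
$$\mu_{1,new} \leftarrow N\left(m_{1}, \left(\frac{1}{\sigma_{1,init}^2} + \frac{n_1}{\sigma^2_{1, old}}\right)^{-1}\right)$$
The update for $\mu_{2,new}$ has the same form with $n_2$, $\bar{y}_2$, and the component 2 values. If a component is empty, its mean is drawn from the prior.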
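The precision-weighted update for a component mean can be sketched as a small function. The name `mean_posterior` and the toy numbers are assumptions for illustration; the arithmetic mirrors the `mu1_post_mean` and `mu1_post_sd` lines in the sampler.

```python
import numpy as np

def mean_posterior(prior_mean, prior_var, y_bar, n_k, comp_var):
    """Posterior mean and variance of a component mean under a normal prior."""
    precision = 1 / prior_var + n_k / comp_var
    post_mean = (prior_mean / prior_var + n_k * y_bar / comp_var) / precision
    return post_mean, 1 / precision

# Toy case: prior N(0, 1), four observations averaging 2, component variance 1.
post_mean, post_var = mean_posterior(
    prior_mean=0.0, prior_var=1.0, y_bar=2.0, n_k=4, comp_var=1.0
)

rng = np.random.default_rng(4)
mu_new = rng.normal(post_mean, np.sqrt(post_var))  # one Gibbs draw of the mean
print(post_mean, post_var, mu_new)
```

With four observations the data carry four times the precision of the prior, so the posterior mean of 1.6 sits four fifths of the way from the prior mean to the sample mean.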

In [None]:
# -----------------------------
# 3. Gibbs sampler
# -----------------------------
for it in range(iterations):
    # ---- Update z (component membership for each district) ----
    lik_1 = _lambda * norm.pdf(y, loc=mu_1, scale=np.sqrt(sigma_1_sq))
    lik_2 = (1 - _lambda) * norm.pdf(y, loc=mu_2, scale=np.sqrt(sigma_2_sq))
    z_probs = lik_1 / (lik_1 + lik_2)
    z = np.random.binomial(1, z_probs)  # 1 = component 1, 0 = component 2

    # ---- Update lambda ----
    alpha_lambda_post = alpha_1 + np.sum(z)
    beta_lambda_post = alpha_2 + (n - np.sum(z))
    _lambda = np.random.beta(alpha_lambda_post, beta_lambda_post)

    # ---- Update mu_1 ----
    n1 = np.sum(z)
    if n1 > 0:
        y1_mean = np.mean(y[z == 1])
        mu1_post_mean = ((mu0_1 / sigma0_1_sq) + n1 * y1_mean / sigma_1_sq) / \
                        (1 / sigma0_1_sq + n1 / sigma_1_sq)
        mu1_post_sd = np.sqrt(1 / (1 / sigma0_1_sq + n1 / sigma_1_sq))
    else:
        mu1_post_mean = mu0_1
        mu1_post_sd = np.sqrt(sigma0_1_sq)
    mu_1 = np.random.normal(mu1_post_mean, mu1_post_sd)

    # ---- Update mu_2 ----
    n2 = n - n1
    if n2 > 0:
        y2_mean = np.mean(y[z == 0])
        mu2_post_mean = ((mu0_2 / sigma0_2_sq) + n2 * y2_mean / sigma_2_sq) / \
                        (1 / sigma0_2_sq + n2 / sigma_2_sq)
        mu2_post_sd = np.sqrt(1 / (1 / sigma0_2_sq + n2 / sigma_2_sq))
    else:
        mu2_post_mean = mu0_2
        mu2_post_sd = np.sqrt(sigma0_2_sq)
    mu_2 = np.random.normal(mu2_post_mean, mu2_post_sd)

    # ---- Update sigma_1_sq (scaled inverse-chi-squared) ----
    nu1_post = nu0_1 + n1
    if n1 > 0:
        ss1 = np.sum((y[z == 1] - mu_1) ** 2)
    else:
        ss1 = 0.0
    sigma1_post_scale = (nu0_1 * sigma0_1_sq + ss1) / nu1_post
    sigma_1_sq = rinvchisq(nu1_post, sigma1_post_scale)

    # ---- Update sigma_2_sq (scaled inverse-chi-squared) ----
    nu2_post = nu0_2 + n2
    if n2 > 0:
        ss2 = np.sum((y[z == 0] - mu_2) ** 2)
    else:
        ss2 = 0.0
    sigma2_post_scale = (nu0_2 * sigma0_2_sq + ss2) / nu2_post
    sigma_2_sq = rinvchisq(nu2_post, sigma2_post_scale)

    # ---- Store samples ----
    lambda_samples[it] = _lambda
    mu1_samples[it] = mu_1
    mu2_samples[it] = mu_2
    sigma1_sq_samples[it] = sigma_1_sq
    sigma2_sq_samples[it] = sigma_2_sq
    z_samples[it, :] = z
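The variance step is not written out in the update notes, so here is a sketch of what the `sigma_1_sq` block in the loop appears to compute, read directly from the code. The symbols $\nu_{0,1}$ and $s^2_{0,1}$ are shorthand introduced here for `nu0_1` and `sigma0_1_sq`.

$$\nu_{1,new} = \nu_{0,1} + n_1$$
$$s^2_{1,new} = \frac{\nu_{0,1} \, s^2_{0,1} + \sum_{i:\, z_{i}^{new} = 1} (y_{i} - \mu_{1,new})^2}{\nu_{1,new}}$$
$$\sigma^2_{1,new} \leftarrow \text{Scaled-Inv-}\chi^2(\nu_{1,new}, s^2_{1,new})$$

The update for $\sigma^2_{2,new}$ has the same form with $n_2$, $\mu_{2,new}$, and the observations for which $z_{i}^{new} = 0$. When a component is empty the sum of squares is 0, so the draw comes from the prior.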

## Posterior Summaries