# Example Usage for `mix_gamma_vi`

In [1]:
from mix_gamma_vi import mix_gamma_vi
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

## Generate Dataset

Generate 10000 samples from a mixture of two gamma distributions with equal mixing weights.

In [7]:
N = 10000
pi_true = [0.5, 0.5]
a_true  = [20,  80 ]
B_true  = [20,  40 ]

mix_gamma = tfp.distributions.MixtureSameFamily(
    mixture_distribution=tfp.distributions.Categorical(probs=pi_true),
    components_distribution=tfp.distributions.Gamma(concentration=a_true, rate=B_true))

x = mix_gamma.sample(N)
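
As a sanity check on the simulation settings, the theoretical mean and variance of the mixture can be compared with the moments of a sample. The sketch below uses NumPy only and draws its own sample rather than reusing `x`. The seed and the names `rng`, `z` and `x_np` are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

N = 10000
pi_true = np.array([0.5, 0.5])
a_true = np.array([20.0, 80.0])  # shapes
B_true = np.array([20.0, 40.0])  # rates

# Pick a component for each observation, then draw from that component.
# NumPy's gamma sampler takes a scale, which is the reciprocal of the rate.
z = rng.choice(len(pi_true), size=N, p=pi_true)
x_np = rng.gamma(shape=a_true[z], scale=1.0 / B_true[z])

# Theoretical moments of the mixture.
comp_mean = a_true / B_true      # mean of each component
comp_var = a_true / B_true**2    # variance of each component
mix_mean = np.sum(pi_true * comp_mean)
mix_var = np.sum(pi_true * (comp_var + comp_mean**2)) - mix_mean**2

print("mean:     theoretical", mix_mean, "sample", x_np.mean())
print("variance: theoretical", mix_var, "sample", x_np.var())
```

With these settings the component means are 1 and 2, so the mixture is clearly bimodal and the two components should be easy to separate.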

## Variational Inference Under the Shape-Mean Parameterisation

The default parameterisation for `mix_gamma_vi` is the shape-mean parameterisation, under which the variational approximations to the posterior are

$$ q^*(\mathbf{\pi}) = \mathrm{Dirichlet} \left( \zeta_1, ..., \zeta_K \right) ,   $$
$$ q^*(\alpha_k) = \mathcal{N}(\hat{\alpha}_k, \sigma_k^2) ,$$
$$ q^* (\mu_k) =  \operatorname{Inv-Gamma} \left( \gamma_k, \lambda_k \right) .   $$
The product approximates the joint posterior
$$ p(\mathbf{\pi}, \mathbf{\alpha}, \mathbf{\mu} \mid \mathbf{x}) \approx q^*(\mathbf{\pi}) \prod_{k=1}^K q^*(\alpha_k) q^*(\mu_k). $$
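
The shape-mean and shape-rate parameterisations describe the same gamma family: a gamma distribution with shape $\alpha$ and rate $\beta$ has mean $\mu = \alpha / \beta$. The following sketch converts between the two. The helper names are hypothetical and are not part of `mix_gamma_vi`.

```python
import numpy as np

def shape_rate_to_shape_mean(alpha, beta):
    """Convert (shape, rate) to (shape, mean) using mu = alpha / beta."""
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    return alpha, alpha / beta

def shape_mean_to_shape_rate(alpha, mu):
    """Convert (shape, mean) to (shape, rate) using beta = alpha / mu."""
    alpha = np.asarray(alpha, dtype=float)
    mu = np.asarray(mu, dtype=float)
    return alpha, alpha / mu

# True simulation values from the dataset section: shapes and rates.
alpha_true, mu_true = shape_rate_to_shape_mean([20, 80], [20, 40])
print(mu_true)
```

Applied to the true values used to simulate the data, this gives component means of 1 and 2, which are the values the fitted posterior means of `mu` should be compared against.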

In [3]:
# Fit a model
fit = mix_gamma_vi(x, 2)

# Get the fitted distribution
distribution = fit.distribution()

# Get the means of the parameters under the fitted posterior
distribution.mean()

{'pi': <tf.Tensor: id=1324, shape=(1, 2), dtype=float32, numpy=array([[0.4959155 , 0.50408447]], dtype=float32)>,
 'mu': <tf.Tensor: id=1331, shape=(1, 2), dtype=float32, numpy=array([[1.0062267, 2.0021513]], dtype=float32)>,
 'alpha': <tf.Tensor: id=1335, shape=(1, 2), dtype=float32, numpy=array([[19.765982, 80.58321 ]], dtype=float32)>}

In [4]:
distribution.stddev()

{'pi': <tf.Tensor: id=1345, shape=(1, 2), dtype=float32, numpy=array([[0.00499908, 0.00499908]], dtype=float32)>,
 'mu': <tf.Tensor: id=1358, shape=(1, 2), dtype=float32, numpy=array([[0.00321394, 0.0031414 ]], dtype=float32)>,
 'alpha': <tf.Tensor: id=1362, shape=(1, 2), dtype=float32, numpy=array([[0.39694458, 1.6051203 ]], dtype=float32)>}

## Variational Inference Under the Shape-Rate Parameterisation

The traditional parameterisation of the gamma distribution is the shape-rate parameterisation, which this package also supports (although it is not recommended). In this case, the variational approximations to the posterior are

$$ q^*(\mathbf{\pi}) = \mathrm{Dirichlet} \left( \zeta_1, ..., \zeta_K \right) ,   $$
$$ q^*(\alpha_k) = \mathcal{N}(\hat{\alpha}_k, \sigma_k^2) ,  $$
$$ q^* (\beta_k) =  \operatorname{Gamma} \left( \gamma_k, \lambda_k \right) .   $$
The product approximates the joint posterior
$$ p(\mathbf{\pi}, \mathbf{\alpha}, \mathbf{\beta} \mid \mathbf{x}) \approx q^*(\mathbf{\pi}) \prod_{k=1}^K q^*(\alpha_k) q^*(\beta_k) . $$

In [5]:
# Fit a model
fit = mix_gamma_vi(x, 2, parameterisation="shape-rate")

# Get the fitted distribution
distribution = fit.distribution()

# Get the means of the parameters under the fitted posterior
distribution.mean()

{'pi': <tf.Tensor: id=2435, shape=(1, 2), dtype=float64, numpy=array([[0.49586246, 0.50413754]])>,
 'beta': <tf.Tensor: id=2442, shape=(1, 2), dtype=float64, numpy=array([[0.05108236, 0.0249591 ]])>,
 'alpha': <tf.Tensor: id=2446, shape=(1, 2), dtype=float64, numpy=array([[19.69911649, 80.21602851]])>}

In [6]:
distribution.stddev()

{'pi': <tf.Tensor: id=2456, shape=(1, 2), dtype=float64, numpy=array([[0.00499908, 0.00499908]])>,
 'beta': <tf.Tensor: id=2469, shape=(1, 2), dtype=float64, numpy=array([[1.63451684e-04, 3.92490419e-05]])>,
 'alpha': <tf.Tensor: id=2473, shape=(1, 2), dtype=float64, numpy=array([[0.06197073, 0.12626589]])>}

So the posterior standard deviation of $\mathbf{\alpha}$ is much lower under the shape-rate parameterisation than under the shape-mean parameterisation: about 0.06 and 0.13, compared with 0.40 and 1.61.