# Bernoulli Mixture Model: Theory

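Training a Bernoulli Mixture Model with the expectation-maximization (EM) algorithm alternates between the two updates below. Here $z_{n, k}$ is the responsibility of component $k$ for data point $\mathbf{x}_n$, $\pi_k$ is the mixing weight of component $k$, and $\mu_{k, i}$ is the probability that dimension $i$ equals 1 under component $k$: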

- Expectation step

$$z_{n, k} \leftarrow \frac{\pi_k \prod_{i = 1}^D \mu_{k, i}^{x_{n, i}} (1 - \mu_{k, i})^{1 - x_{n, i}} }{\sum_{m = 1}^K \pi_m \prod_{i = 1}^D \mu_{m, i}^{x_{n, i}} (1 - \mu_{m, i})^{1 - x_{n, i}}}$$

- Maximization step

$$\boldsymbol{\mu}_m \leftarrow \bar{\mathbf{x}}_m$$

$$\pi_m \leftarrow \frac{N_m}{N}$$

where $\bar{\mathbf{x}}_m = \frac{1}{N_m} \sum_{n = 1}^N z_{n, m} \mathbf{x}_n$ and $N_m = \sum_{n = 1}^N z_{n, m}$ is the effective number of data points assigned to component $m$.
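The two updates above can be sketched in NumPy as follows. This is a minimal illustration written for this note, not the code in `bmm.py`: the function names `e_step` and `m_step`, the clipping constant `eps`, and the toy data are all assumptions. The E-step is evaluated in log space because the product over $D = 784$ Bernoulli terms would otherwise underflow.

```python
import numpy as np

def e_step(x, pi, mu, eps=1e-10):
    """Responsibilities z of shape (N, K) and the total log-likelihood."""
    mu = np.clip(mu, eps, 1 - eps)  # keep log() finite (assumed safeguard)
    # log p(x_n | mu_k) = sum_i x_ni log mu_ki + (1 - x_ni) log(1 - mu_ki)
    log_p = x @ np.log(mu).T + (1 - x) @ np.log(1 - mu).T  # (N, K)
    log_joint = np.log(pi) + log_p                          # adds log pi_k
    # log-sum-exp over components gives the log of the denominator
    top = log_joint.max(axis=1, keepdims=True)
    log_norm = top + np.log(np.exp(log_joint - top).sum(axis=1, keepdims=True))
    z = np.exp(log_joint - log_norm)
    return z, float(log_norm.sum())

def m_step(x, z):
    """Updated mixing weights pi (K,) and means mu (K, D)."""
    n_m = z.sum(axis=0)            # effective counts N_m
    mu = (z.T @ x) / n_m[:, None]  # responsibility-weighted means
    pi = n_m / x.shape[0]
    return pi, mu

# Toy run on random binary data (hypothetical sizes, not MNIST).
rng = np.random.default_rng(0)
x = (rng.random((200, 16)) > 0.5).astype(float)
pi = np.full(3, 1 / 3)
mu = rng.uniform(0.25, 0.75, size=(3, 16))

llks = []
for _ in range(5):
    z, llk = e_step(x, pi, mu)
    pi, mu = m_step(x, z)
    llks.append(llk)
```

Each row of `z` sums to 1 and `pi` sums to 1 by construction. The sketch does not guard against a component whose effective count $N_m$ reaches zero, which a full implementation would need to handle.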

# BMM Implementation

See `bmm.py` for the complete implementation of the BMM.

In [17]:
# settings

data_path = '/home/data/ml/mnist'
k = 10

In [25]:
%matplotlib inline

import matplotlib
import numpy as np
import matplotlib.pyplot as plt

In [26]:
# loading the data

from mnist import load_mnist
data, labels = load_mnist(dataset='training', path=data_path)

# pre-processing the data: flatten each 28x28 image into a 784-dimensional
# vector, then binarize it (1 where the pixel value exceeds 0.5, else 0)

data = np.reshape(data, (60000, 784))
data = np.where(data > 0.5, 1, 0)

In [27]:
# creating our model

import bmm

model = bmm.bmm(k, data, 784)

In [28]:
model.fit()

iteration 1 - llk = -5608.030992918283
iteration 2 - llk = -1296.8522988562108
iteration 3 - llk = -1245.4698432348746
iteration 4 - llk = -1255.2494159604662
iteration 5 - llk = -1243.6874753613904
iteration 6 - llk = -1209.729443974399
iteration 7 - llk = -1185.8059892885158
iteration 8 - llk = -1171.587276325702
iteration 9 - llk = -1163.3657370116014
iteration 10 - llk = -1156.9091169450662
