**EM Algorithm: GMM (Clustering)**

This is a boilerplate template to get you started on how the Expectation and Maximization steps can be implemented using NumPy arrays.

Just to emphasize:

**E Step**
- For each point $i$ with $i \in \{1,2,\dots,N\}$
    - For each cluster $j$ with $j \in \{1,2,\dots,K\}$
        - Compute $p(j|i)=\frac{p_j\mathcal{N}(x^i;\mu_j,\sigma_j^2I)}{\sum_{k=1}^{K}p_k\mathcal{N}(x^i;\mu_k,\sigma_k^2I)}$

**M Step**
- Compute $n_j=\sum^{N}_{i=1}p(j|i)$
- From $n_j$ compute $p_j=\frac{n_j}{N}$
- From $n_j$ compute $\mu_j = \frac{1}{n_j}\sum^{N}_{i=1}p(j|i)x^i$
- From $n_j$ and $\mu_j$ compute $\sigma_j^2 = \frac{1}{n_j d}\sum^{N}_{i=1}p(j|i)||x^i-\mu_j||^2$, where $d$ is the dimension of the data

In [1]:
import numpy as np
import random
from scipy import stats

In [3]:
X = np.loadtxt("./data/toy_data.txt")

In [4]:
X.shape

(250, 2)

In [14]:
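## Randomly select initial parameters (K=2)
## (mu, sigma, pj)
random.seed(42)
# random.randint(0, 250) includes 250, which is out of range for 250 rows;
# randrange excludes the upper bound, so the index always stays inside X
mu_1 = X[random.randrange(X.shape[0])]
mu_2 = X[random.randrange(X.shape[0])]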

In [29]:
mu_1

array([3.806, 0.903])

In [17]:
mu = np.array([mu_1,mu_2])

In [46]:
mu

array([[ 3.806,  0.903],
       [-1.809,  1.69 ]])

In [31]:
p1 = 1/250
p2 = 1-p1
mixture = np.array([p1,p2])

In [19]:
sigma_1 = 0.45
sigma_2 = 0.67

In [20]:
sigma = np.array([sigma_1,sigma_2])

In [48]:
sigma

array([0.45, 0.67])

In [25]:
np.identity(2)*(sigma[0])**2

array([[0.2025, 0.    ],
       [0.    , 0.2025]])

In [28]:
stats.multivariate_normal.pdf(X[1],mean=mu[0],cov=np.identity(2)*(sigma[0])**2)

5.8625252785376755e-52
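The density above can be cross-checked by writing out the isotropic Gaussian formula by hand. The following is a minimal sketch: the helper name `isotropic_gaussian_pdf` and the sample point and parameters are hypothetical and are not taken from `toy_data.txt`.

```python
import numpy as np
from scipy import stats


def isotropic_gaussian_pdf(x, mean, var):
    """Density of N(mean, var * I) at x, written out by hand."""
    d = x.shape[0]
    sq_dist = np.sum((x - mean) ** 2)
    # For covariance var * I the determinant is var**d, so the
    # normalizing constant collapses to (2 * pi * var) ** (d / 2).
    return np.exp(-sq_dist / (2.0 * var)) / (2.0 * np.pi * var) ** (d / 2.0)


# Hypothetical point and parameters (assumptions for illustration only).
x = np.array([0.5, -1.0])
mean = np.array([1.0, 0.0])
std = 0.45

by_hand = isotropic_gaussian_pdf(x, mean, std ** 2)
by_scipy = stats.multivariate_normal.pdf(
    x, mean=mean, cov=np.identity(2) * std ** 2
)
print(by_hand, by_scipy)
```

Both values should agree up to floating-point error, which confirms that `sigma` in the cells above is being used as a standard deviation and is squared to form the covariance.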

**E Step**
- For each point $i$ with $i \in \{1,2,\dots,N\}$
    - For each cluster $j$ with $j \in \{1,2,\dots,K\}$
        - Compute $p(j|i)=\frac{p_j\mathcal{N}(x^i;\mu_j,\sigma_j^2I)}{\sum_{k=1}^{K}p_k\mathcal{N}(x^i;\mu_k,\sigma_k^2I)}$



In [38]:
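### E step
k = 2
ll = 0  # running log-likelihood of the data under the current parameters
post = np.zeros((X.shape[0],k))  # post[i,j] will hold p(j|i)
for i in range(X.shape[0]):
    for j in range(k):
        # sigma[j] is a standard deviation, so the covariance is sigma[j]**2 * I
        likelihood = stats.multivariate_normal.pdf(X[i],mean = mu[j], cov = np.identity(X.shape[1])*(sigma[j])**2)
        post[i,j]=likelihood*mixture[j]  # unnormalized posterior
    total = post[i,:].sum()  # mixture density at x^i
    post[i,:] = post[i,:]/total  # normalize so the row sums to 1
    ll = ll + np.log(total)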

In [39]:
ll

-4854.331613120599
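Densities as small as the `5.86e-52` seen earlier can underflow to zero for points far from every mean, which would make `total` zero in the loop above. One common way around this is to do the E step in log space. The sketch below is a vectorized variant under stated assumptions: `e_step_log` is a hypothetical helper, it takes variances rather than standard deviations, and the data is a synthetic stand-in for `toy_data.txt`.

```python
import numpy as np
from scipy.special import logsumexp


def e_step_log(X, mu, var, mixture):
    """Vectorized E step for isotropic Gaussians, computed in log space.

    X: (N, d) data, mu: (K, d) means, var: (K,) variances,
    mixture: (K,) mixing weights.
    Returns the (N, K) posteriors p(j|i) and the data log-likelihood.
    """
    d = X.shape[1]
    # Squared distance between every point and every mean, shape (N, K).
    sq_dist = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
    # log N(x^i; mu_j, var_j * I) for all pairs (i, j).
    log_pdf = -0.5 * d * np.log(2.0 * np.pi * var) - sq_dist / (2.0 * var)
    log_joint = np.log(mixture) + log_pdf
    # log of the mixture density at each point, without leaving log space.
    log_total = logsumexp(log_joint, axis=1)
    post = np.exp(log_joint - log_total[:, None])
    return post, log_total.sum()


# Synthetic stand-in data (assumption): two well-separated blobs.
rng = np.random.default_rng(0)
X_demo = np.vstack([
    rng.normal(-2.0, 0.5, size=(20, 2)),
    rng.normal(3.0, 0.5, size=(20, 2)),
])
mu_demo = np.array([[3.0, 3.0], [-2.0, -2.0]])
var_demo = np.array([0.45, 0.67]) ** 2
mixture_demo = np.array([0.5, 0.5])

post_demo, ll_demo = e_step_log(X_demo, mu_demo, var_demo, mixture_demo)
print(post_demo.sum(axis=1)[:3], ll_demo)
```

Each row of `post_demo` sums to 1, and `ll_demo` plays the same role as `ll` in the loop version.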

**M Step**
- Compute $n_j=\sum^{N}_{i=1}p(j|i)$
- From $n_j$ compute $p_j=\frac{n_j}{N}$
- From $n_j$ compute $\mu_j = \frac{1}{n_j}\sum^{N}_{i=1}p(j|i)x^i$
- From $n_j$ and $\mu_j$ compute $\sigma_j^2 = \frac{1}{n_j d}\sum^{N}_{i=1}p(j|i)||x^i-\mu_j||^2$, where $d$ is the dimension of the data

In [42]:
n_j = post.sum(axis=0)

In [43]:
n_j

array([ 98.97150702, 151.02849298])

In [44]:
p_j = n_j/X.shape[0]

In [45]:
p_j

array([0.39588603, 0.60411397])

In [47]:
new_mu = np.zeros((k,X.shape[1]))

In [49]:
new_var = np.zeros(k)

In [50]:
new_var

array([0., 0.])

In [63]:
post[:,0].shape

(250,)

In [64]:
post[:,0,None].shape

(250, 1)

In [71]:
((X*post[:,0,None]).sum(axis=0))/n_j[0] ### mu_1 = sum_i p(1|i) x^i / n_1 (divide by the scalar n_j[0], not the whole n_j array)

array([5.830778, 0.230932])

In [72]:
(X*post[:,0,None]).shape

(250, 2)

In [70]:
((X*post[:,1,None]).sum(axis=0))/n_j[1] ### mu_2 = sum_i p(2|i) x^i / n_2 (divide by the scalar n_j[1], not the whole n_j array)

array([-2.058929,  0.759203])
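The cells above stop before filling in `new_mu` and `new_var`. As a possible way to finish the M step, here is a minimal vectorized sketch that follows the M-step formulas; `m_step` is a hypothetical helper, and the tiny dataset with hard 0/1 posteriors is made up so the result can be checked by hand.

```python
import numpy as np


def m_step(X, post):
    """M step for a mixture of isotropic Gaussians.

    X: (N, d) data, post: (N, K) posteriors p(j|i) from the E step.
    Returns mixing weights (K,), means (K, d) and variances (K,).
    """
    n, d = X.shape
    n_j = post.sum(axis=0)                 # effective cluster sizes, (K,)
    p_j = n_j / n                          # mixing weights
    # Posterior-weighted sums of the points, divided by each cluster's n_j.
    mu = (post.T @ X) / n_j[:, None]       # (K, d)
    # Squared distance from every point to every updated mean, (N, K).
    sq_dist = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
    var = (post * sq_dist).sum(axis=0) / (n_j * d)
    return p_j, mu, var


# Made-up data with hard assignments (assumption, for a hand check).
X_demo = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0], [10.0, 12.0]])
post_demo = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])

p_demo, mu_demo, var_demo = m_step(X_demo, post_demo)
print(p_demo, mu_demo, var_demo)
```

With hard assignments the update reduces to per-cluster averages: the means are `[1, 0]` and `[10, 11]`, and each variance is `0.5`. Note that the function returns variances, so take `np.sqrt(var)` before passing the result back into code that expects `sigma` as a standard deviation.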