## Comparing EM with GD

### Setup

We first choose the number of clusters $m$ and the number of features $n$, then generate samples from a BMM with random parameters. The goal is to recover these parameters. The sample set $S$ is a dictionary keyed by the binary string $x$ (stored as its decimal index), whose value is the count of $x$ (here the expected count: its probability times the sample size). Replicates are gathered into a single entry to save space.
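To make the construction of $S$ concrete, here is a minimal NumPy-only sketch of the same idea. The helpers `dec_to_bits`, `mixture_prob`, and `expected_counts` are hypothetical stand-ins, not the `nbm` functions, and the bit order (most significant bit first) is an assumption.

```python
import numpy as np

def dec_to_bits(index, n):
    """Hypothetical helper: integer -> length-n 0/1 array, most significant bit first."""
    return np.array([(index >> (n - 1 - j)) & 1 for j in range(n)])

def mixture_prob(x, theta, Phi):
    """P(x) = sum_k theta[k] * prod_j Phi[k, j]^x_j * (1 - Phi[k, j])^(1 - x_j)."""
    # Broadcasting x (n,) against Phi (m, n) picks Phi where x_j = 1, else 1 - Phi.
    per_cluster = np.prod(np.where(x == 1, Phi, 1.0 - Phi), axis=1)
    return float(theta @ per_cluster)

def expected_counts(theta, Phi, samples):
    """Map each binary string (as a decimal index) to its expected count."""
    n = Phi.shape[1]
    return {i: mixture_prob(dec_to_bits(i, n), theta, Phi) * samples
            for i in range(2 ** n)}

# Small illustrative parameters (not the notebook's values).
demo_theta = np.array([0.3, 0.7])
demo_Phi = np.array([[0.1, 0.9],
                     [0.8, 0.2]])
counts = expected_counts(demo_theta, demo_Phi, samples=1000)
print(counts)
```

Because the probabilities over all $2^n$ strings sum to one, the counts sum to the sample size, which is a quick sanity check on any such dictionary.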

In [1]:
import nbm
import numpy as np

m = 2              # number of classes
n = 5              # number of features
samples = 1000000  # sample size

# generate the ground-truth parameters
[gtheta, gPhi] = nbm.randInt(m, n)
total = np.zeros(n)
ct = 0

# generate samples: S maps each binary string (as a decimal index) to its expected count
S = dict()
for idx in range(2**n):    # enumerate the sample space
    x = nbm.decToBits(idx, n)   # convert decimal to binary
    # compute the probability of x under the mixture
    prob = 0.0
    for k in range(m):
        prob += gtheta[k] * nbm.bern_n(x, gPhi[k, :], n)
    # add to the dictionary
    S[idx] = prob * samples

# mixture-wide mean of each feature
avg = np.zeros(n)
for k in range(m):
    avg += gtheta[k] * gPhi[k, :]
diff = gPhi[1, :] - gPhi[0, :]
Sigma = np.multiply(avg, 1 - avg)

In [2]:
print("gtheta: ", gtheta)
print("gPhi: ", gPhi)

gtheta:  [0.193 0.807]
gPhi:  [[0.0191 0.0789 0.8774 0.2741 0.2515]
 [0.4478 0.7046 0.3388 0.2343 0.0886]]


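### Checking whether EM can converge to a k-cluster point, which we call a "bad min"

A run is counted as a bad min when the smallest mixing weight it returns falls below 0.0005.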
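For readers without access to `nbm`, the following is a rough sketch of one EM iteration for a Bernoulli mixture on aggregated counts, together with the bad-min test. It is not the implementation behind `nbm.em`; the names `em_step` and `is_bad_min`, the `eps` guard, and the small demo parameters are all assumptions made for illustration.

```python
import numpy as np

def em_step(X, counts, theta, Phi, eps=1e-12):
    """One EM iteration. X: (N, n) distinct binary rows, counts: (N,) weights."""
    # E-step: responsibilities r[i, k] proportional to theta[k] * P(x_i | cluster k).
    log_lik = X @ np.log(Phi + eps).T + (1 - X) @ np.log(1 - Phi + eps).T  # (N, m)
    log_r = np.log(theta + eps) + log_lik
    log_r -= log_r.max(axis=1, keepdims=True)   # stabilize before exponentiating
    r = np.exp(log_r)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: every row is weighted by how many times it occurs.
    w = r * counts[:, None]                     # (N, m)
    Nk = w.sum(axis=0)                          # effective size of each cluster
    theta_new = Nk / counts.sum()
    Phi_new = (w.T @ X) / (Nk[:, None] + eps)
    return theta_new, Phi_new

def is_bad_min(theta, threshold=0.0005):
    """True when some mixing weight has (nearly) vanished."""
    return bool(np.min(theta) < threshold)

# Demo data: all binary strings of length n with their expected counts.
n = 3
X = np.array([[(i >> (n - 1 - j)) & 1 for j in range(n)]
              for i in range(2 ** n)], dtype=float)
true_theta = np.array([0.3, 0.7])
true_Phi = np.array([[0.1, 0.8, 0.5],
                     [0.7, 0.2, 0.9]])
per_cluster = np.exp(X @ np.log(true_Phi).T + (1 - X) @ np.log(1 - true_Phi).T)
counts = 1000.0 * (per_cluster @ true_theta)

# Run EM from a random start and apply the bad-min test.
rng = np.random.default_rng(0)
theta, Phi = np.array([0.5, 0.5]), rng.random((2, n))
for _ in range(200):
    theta, Phi = em_step(X, counts, theta, Phi)
print("theta:", theta, "bad min:", is_bad_min(theta))
```

With exact expected counts, the generating parameters are a fixed point of this update, which gives a simple correctness check for the step.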

In [8]:
print("-------------------EM: ----------------------------")
bad_min = 0  # number of runs that end with a (nearly) empty cluster
for trial in range(60):
    print("-----------------------iteration: ", trial, "-----------------------------")
    # initialize params; Phi is then redrawn uniformly on [0, 1)
    [theta, Phi] = nbm.randInt(m, n)
    Phi = np.random.rand(m, n)
    [theta1, Phi1] = nbm.em(n, m, samples, S, theta, Phi)
    # a mixing weight near zero means one cluster has effectively vanished
    if min(theta1) < 0.0005:
        print("bad min!")
        print("theta1: ", theta1)
        bad_min += 1
    else:
        print("nice!")
print("bad_min: ", bad_min)

-------------------EM: ----------------------------
-----------------------iteration:  0 -----------------------------
nice!
-----------------------iteration:  1 -----------------------------
nice!
-----------------------iteration:  2 -----------------------------
nice!
-----------------------iteration:  3 -----------------------------
nice!
-----------------------iteration:  4 -----------------------------
nice!
-----------------------iteration:  5 -----------------------------
nice!
-----------------------iteration:  6 -----------------------------
nice!
-----------------------iteration:  7 -----------------------------
nice!
-----------------------iteration:  8 -----------------------------
nice!
-----------------------iteration:  9 -----------------------------
nice!
-----------------------iteration:  10 -----------------------------
nice!
-----------------------iteration:  11 -----------------------------
nice!
-----------------------iteration:  12 -----------------------------
ni

### Checking whether GD can converge to a k-cluster point, which we call a "bad min"
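One observation that may help explain why a gradient method can stop at a one-cluster point: when all the weight sits on a single cluster whose parameters equal the per-feature means of the data, the gradient of the log-likelihood with respect to `Phi` vanishes. The sketch below checks this numerically on a small example. The function `loglik_grad_Phi` and the demo parameters are assumptions for illustration, and nothing here reflects how `nbm.gradientDescent` computes its updates or handles the constraint on `theta`.

```python
import numpy as np

def loglik_grad_Phi(X, counts, theta, Phi):
    """Gradient of sum_i counts[i] * log P(x_i) with respect to Phi (shape (m, n))."""
    per_cluster = np.exp(X @ np.log(Phi).T + (1 - X) @ np.log(1 - Phi).T)  # (N, m)
    joint = per_cluster * theta
    r = joint / joint.sum(axis=1, keepdims=True)   # responsibilities
    w = r * counts[:, None]                        # count-weighted responsibilities
    # d/dPhi[k, j] = sum_i w[i, k] * (X[i, j] - Phi[k, j]) / (Phi[k, j] * (1 - Phi[k, j]))
    numerator = w.T @ X - w.sum(axis=0)[:, None] * Phi
    return numerator / (Phi * (1 - Phi))

# Demo data: expected counts from a two-cluster mixture over 3 binary features.
n = 3
X = np.array([[(i >> (n - 1 - j)) & 1 for j in range(n)]
              for i in range(2 ** n)], dtype=float)
true_theta = np.array([0.3, 0.7])
true_Phi = np.array([[0.1, 0.8, 0.5],
                     [0.7, 0.2, 0.9]])
per_cluster = np.exp(X @ np.log(true_Phi).T + (1 - X) @ np.log(1 - true_Phi).T)
counts = 1000.0 * (per_cluster @ true_theta)

# One-cluster point: all weight on cluster 0, whose parameters are the feature means.
feature_means = counts @ X / counts.sum()
theta_bad = np.array([1.0, 0.0])
Phi_bad = np.vstack([feature_means, np.full(n, 0.5)])  # second row is arbitrary

grad = loglik_grad_Phi(X, counts, theta_bad, Phi_bad)
print("max |gradient|:", np.abs(grad).max())
```

The feature means used here play the same role as `avg` in the setup cell. The gradient with respect to `theta` is a separate question and is not examined in this sketch.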

In [9]:
print("-------------------GD: ----------------------------")
bad_min = 0
for trial in range(60):
    print("-----------------------iteration: ", trial, "-----------------------------")
    [theta, Phi] = nbm.randInt(m, n)
    [ell, theta, Phi, iterId] = nbm.gradientDescent(
        n, samples, S, theta, Phi,
        numIterations=10000, alpha=0.02, tolerance=1e-7)
    # print("final ell: ", ell)
    if min(theta) < 0.0005:
        print("bad min!")
        print("final theta: ", theta)
        bad_min += 1
    else:
        print("nice!")
print("bad_min: ", bad_min)

-------------------GD: ----------------------------
-----------------------iteration:  0 -----------------------------
GD converges at  5571 steps
nice!
-----------------------iteration:  1 -----------------------------
GD converges at  5355 steps
nice!
-----------------------iteration:  2 -----------------------------
GD converges at  6020 steps
nice!
-----------------------iteration:  3 -----------------------------
GD converges at  210 steps
bad min!
final theta:  [1. 0.]
-----------------------iteration:  4 -----------------------------
GD converges at  149 steps
bad min!
final theta:  [0. 1.]
-----------------------iteration:  5 -----------------------------
GD converges at  5586 steps
nice!
-----------------------iteration:  6 -----------------------------
GD converges at  5641 steps
nice!
-----------------------iteration:  7 -----------------------------
GD converges at  142 steps
bad min!
final theta:  [0. 1.]
-----------------------iteration:  8 -----------------------------
G