## Binomial Distribution 
$P(k; n, p) = \binom{n}{k} p^k (1-p)^{n-k}$
- n is the number of trials
- k is the number of successful trials
- p is the probability of success in a single trial
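
As a quick illustration of the formula, here is a minimal sketch that evaluates the binomial probability with the standard library. The function name `binomial_pmf` is an assumption made for this example and is not part of the notebook.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# 2 heads in 4 throws of a fair coin: 6 * 0.5**4
print(binomial_pmf(2, 4, 0.5))  # 0.375

# The probabilities over all possible k add up to 1
print(sum(binomial_pmf(k, 4, 0.5) for k in range(5)))
```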

## EM algorithm
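
The code below tags two steps as [EQN 1] and [EQN 2] without writing them out. One plausible reading, assuming two coins A and B that are equally likely to be picked, with $h_i$ heads and $t_i$ tails in trial $i$ (the symbols $h_i$, $t_i$, $w_{i,A}$ are introduced here for illustration):

```latex
% [EQN 1] E-step: weight of coin A for trial i
% (the binomial coefficient cancels between numerator and denominator)
w_{i,A} = \frac{\theta_A^{h_i}(1-\theta_A)^{t_i}}
               {\theta_A^{h_i}(1-\theta_A)^{t_i} + \theta_B^{h_i}(1-\theta_B)^{t_i}},
\qquad w_{i,B} = 1 - w_{i,A}

% [EQN 2] M-step: expected heads over expected throws attributed to coin A
\theta_A = \frac{\sum_i w_{i,A}\, h_i}{\sum_i w_{i,A}\,(h_i + t_i)}
```

The update for $\theta_B$ has the same form with $w_{i,B}$ in place of $w_{i,A}$.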

In [14]:
import numpy as np
heads = [14, 33, 19, 10, 0, 17, 24, 17, 1, 36, 5, 6, 5, 13, 4, 35, 5, 5, 74, 34]
throws = [41, 43, 23, 23, 1, 23, 36, 37, 2, 131, 5, 29, 13, 47, 10, 58, 15, 14, 100, 113]
tails = [throws[i] - heads[i] for i in range(len(throws))]

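# one (heads, tails) pair per trial
xs = list(zip(heads, tails))
# initial guesses, one [P(heads), P(tails)] row per coin;
# coin A starts slightly off 0.5 so the two coins do not receive identical updates
thetas = np.array([[0.5001, 0.4999], [0.5, 0.5]])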

In [15]:
tol = 0.0001
max_iter = 100

def EM(xs, thetas, tol, max_iter):
    # xs: (heads, tails) per trial
    # thetas: one [P(heads), P(tails)] row per coin, updated in place
    ll_old = 0
    for i in range(max_iter):
        ws_A = []
        ws_B = []

        vs_A = []
        vs_B = []

        ll_new = 0

        # E-step: calculate probability distributions over possible completions
        for x in xs:

            # binomial log likelihood of this trial under each coin
            # (the binomial coefficient is dropped because it cancels in the weights)
            # reason: log(theta^h * (1-theta)^t) = h*log(theta) + t*log(1-theta)
            ll_A = np.sum(x * np.log(thetas[0]))
            ll_B = np.sum(x * np.log(thetas[1]))

            # [EQN 1]: weight 
            denom = np.exp(ll_A) + np.exp(ll_B)
            w_A = np.exp(ll_A)/denom
            w_B = np.exp(ll_B)/denom

            ws_A.append(w_A)
            ws_B.append(w_B)

            # expected heads and tails attributed to each coin for this trial
            vs_A.append(np.dot(w_A, x))
            vs_B.append(np.dot(w_B, x))

            # update complete log likelihood
            ll_new += w_A * ll_A + w_B * ll_B

        # M-step: update the parameters given the current weights
        # [EQN 2]: each row becomes [P(heads), P(tails)] from the expected counts
        thetas[0] = np.sum(vs_A, 0)/np.sum(vs_A)
        thetas[1] = np.sum(vs_B, 0)/np.sum(vs_B)

        # print the current parameter estimates and the expected complete-data log likelihood
        print("Iteration: %d" % (i+1))
        print("theta_A = %.2f, theta_B = %.2f, log likelihood = %.2f" % (thetas[0,0], thetas[1,0], ll_new))

        if np.abs(ll_new - ll_old) < tol:
            break
        ll_old = ll_new
EM(xs, thetas, tol, max_iter)

Iteration: 1
theta_A = 0.47, theta_B = 0.47, log likelihood = -529.57
Iteration: 2
theta_A = 0.47, theta_B = 0.46, log likelihood = -527.91
Iteration: 3
theta_A = 0.55, theta_B = 0.39, log likelihood = -526.18
Iteration: 4
theta_A = 0.68, theta_B = 0.31, log likelihood = -492.69
Iteration: 5
theta_A = 0.71, theta_B = 0.31, log likelihood = -470.33
Iteration: 6
theta_A = 0.71, theta_B = 0.31, log likelihood = -469.04
Iteration: 7
theta_A = 0.71, theta_B = 0.31, log likelihood = -468.89
Iteration: 8
theta_A = 0.71, theta_B = 0.31, log likelihood = -468.88
Iteration: 9
theta_A = 0.71, theta_B = 0.31, log likelihood = -468.87
Iteration: 10
theta_A = 0.71, theta_B = 0.31, log likelihood = -468.87
Iteration: 11
theta_A = 0.71, theta_B = 0.31, log likelihood = -468.87
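
The E-step above exponentiates each log likelihood directly. For trials with many throws both exponentials can underflow to zero, which would make the weights 0/0. A minimal sketch of the same weight computation in log space, using `np.logaddexp`; the function name `responsibilities` and the example values are assumptions for illustration, not part of the notebook.

```python
import numpy as np

def responsibilities(x, thetas):
    """Weights (w_A, w_B) for one trial x = (heads, tails).

    thetas holds one [P(heads), P(tails)] row per coin, as in the cell above.
    Assumes every entry of thetas is strictly between 0 and 1.
    """
    # log likelihood of the trial under each coin, shape (2,)
    log_liks = np.log(thetas) @ np.asarray(x, dtype=float)
    # log(exp(ll_A) + exp(ll_B)) computed without leaving log space
    log_norm = np.logaddexp(log_liks[0], log_liks[1])
    return np.exp(log_liks - log_norm)

thetas_demo = np.array([[0.7, 0.3], [0.3, 0.7]])
print(responsibilities((5, 5), thetas_demo))      # symmetric trial: equal weights
print(responsibilities((5000, 0), thetas_demo))   # stays finite for a very long trial
```

With the direct `np.exp` version, the second call would divide zero by zero; here the weights remain finite and sum to one.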


(Optional) The model could also include a bias parameter, for the case where one coin is much more likely to be picked than the other. Update your model to also estimate this bias.

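- Rough check: split the trials by whether the observed head ratio is above 0.5, then average the ratio within each group as an estimate for each coin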

In [16]:
# head ratios at or below 0.5 go to three_coin, ratios above 0.5 go to five_coin
three_coin = []
five_coin = []
ratio = np.divide(heads,throws)
for i in range(len(ratio)): 
    if ratio[i] > 0.5:
        five_coin.append(ratio[i])
    else: 
        three_coin.append(ratio[i])
np.mean(three_coin), np.mean(five_coin)

(0.32846026694949065, 0.7632534563283143)
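
The exercise above asks for a bias parameter, while the thresholding cell only averages head ratios per group. One possible way to estimate the bias inside EM is to add a mixing weight, here called `pi_A` (the probability that coin A is picked), and update it in the M-step as the mean weight of coin A. This is a sketch under that assumption; the function name `em_with_bias`, the starting values, and the tolerance are hypothetical choices, not the notebook's method.

```python
import numpy as np

heads = [14, 33, 19, 10, 0, 17, 24, 17, 1, 36, 5, 6, 5, 13, 4, 35, 5, 5, 74, 34]
throws = [41, 43, 23, 23, 1, 23, 36, 37, 2, 131, 5, 29, 13, 47, 10, 58, 15, 14, 100, 113]

def em_with_bias(heads, throws, theta_A=0.6, theta_B=0.4, pi_A=0.5,
                 tol=1e-6, max_iter=500):
    """EM for two coins that also estimates pi_A = P(coin A is picked)."""
    h = np.asarray(heads, dtype=float)
    t = np.asarray(throws, dtype=float) - h
    ll_old = -np.inf
    for _ in range(max_iter):
        # E-step: log(prior * likelihood) per trial and coin
        # (binomial coefficient omitted, it cancels in the weights)
        log_A = np.log(pi_A) + h * np.log(theta_A) + t * np.log(1 - theta_A)
        log_B = np.log(1 - pi_A) + h * np.log(theta_B) + t * np.log(1 - theta_B)
        log_norm = np.logaddexp(log_A, log_B)
        w_A = np.exp(log_A - log_norm)
        w_B = 1.0 - w_A

        # M-step: expected heads over expected throws, and mean weight of coin A
        theta_A = np.sum(w_A * h) / np.sum(w_A * (h + t))
        theta_B = np.sum(w_B * h) / np.sum(w_B * (h + t))
        pi_A = np.mean(w_A)

        # observed-data log likelihood (up to a constant) under the
        # parameters used in this E-step; stop once it no longer changes
        ll_new = np.sum(log_norm)
        if abs(ll_new - ll_old) < tol:
            break
        ll_old = ll_new
    return theta_A, theta_B, pi_A

theta_A, theta_B, pi_A = em_with_bias(heads, throws)
print("theta_A = %.2f, theta_B = %.2f, pi_A = %.2f" % (theta_A, theta_B, pi_A))
```

Setting `pi_A` to 0.5 and never updating it would correspond to the equal-prior weights used in the `EM` function above.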