**Maximum likelihood estimation from observed and unobserved data**

You are given a bag containing red and blue coins. All the red coins have the same probability of heads. All the blue coins have the same probability of heads (possibly different from that of the red coins).

Your task is to estimate the proportion of red coins in the bag and the probability of heads for the red coins and for the blue coins.

In [2]:
import ipywidgets as widgets
from IPython.display import display

# Sliders holding the true model parameters; each one starts at 0.0.
prob_red = widgets.FloatSlider(min=0.0, max=1.0, description='prob_red')
prob_head_red = widgets.FloatSlider(min=0.0, max=1.0, description='head_red')
prob_head_blue = widgets.FloatSlider(min=0.0, max=1.0, description='head_blue')
display(prob_red, prob_head_red, prob_head_blue)

FloatSlider(value=0.0, description='prob_red', max=1.0)

FloatSlider(value=0.0, description='head_red', max=1.0)

FloatSlider(value=0.0, description='head_blue', max=1.0)

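Use these sliders to set the true parameters of the model. Each slider starts at 0.0, so until you move them every sampled coin is blue and every flip is tails.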

In [12]:
import random
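def choose_coin():
    # Draw a coin colour: red with probability prob_red, otherwise blue.
    return 'R' if random.random() < prob_red.value else 'B'

def flip_coin(coin):
    # Pick the heads probability for this colour, then draw a single flip.
    prob_head = prob_head_red.value if coin == 'R' else prob_head_blue.value
    return 'H' if random.random() < prob_head else 'T'

def flip_random_coin_n_times(n, hidden=False):
    # Return (colour, flips); the colour is replaced by '_' when hidden.
    coin = choose_coin()
    return ('_' if hidden else coin, ''.join(flip_coin(coin) for _ in range(n)))

def flip_m_random_coins_n_times(m, n, hidden=False):
    # Draw m coins independently and flip each of them n times.
    return [flip_random_coin_n_times(n, hidden) for _ in range(m)]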

Use the functions above to sample from the model. The optional parameter 'hidden' controls whether the colour of the coin is observed in the samples; when it is True, the colour is replaced by '_'.

In [4]:
flip_m_random_coins_n_times(5, 100)

[('B',
  'TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT'),
 ('B',
  'TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT'),
 ('B',
  'TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT'),
 ('B',
  'TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT'),
 ('B',
  'TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT')]

In [5]:
flip_m_random_coins_n_times(5, 100, hidden=True)

[('_',
  'TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT'),
 ('_',
  'TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT'),
 ('_',
  'TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT'),
 ('_',
  'TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT'),
 ('_',
  'TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT')]

**TASK 1** Implement the following two functions to estimate parameters for the model in the observed case. Splitting the work into two separate functions will simplify things for the next task. 

* How could you measure the error in your estimates?
* How does the error decrease with the sample size?
* If you were only allowed to flip coins a total of N times, how would you choose m (the number of coins) and n (the number of times to flip each coin)? Why?
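One possible way to explore the first two questions is to repeat the experiment many times with known parameters and record the root-mean-square error of an estimate. The sketch below is only an illustration under stated assumptions: it does not use the sliders, the names `simulate`, `estimate_prob_red_by_coins` and `rmse_prob_red` are hypothetical, and the parameter values (0.3, 0.2, 0.4) are chosen arbitrarily.

```python
import math
import random

def simulate(m, n, p_red, p_head_red, p_head_blue, rng):
    """Draw m coins and flip each n times, returning (colour, flips) pairs."""
    samples = []
    for _ in range(m):
        coin = 'R' if rng.random() < p_red else 'B'
        p_head = p_head_red if coin == 'R' else p_head_blue
        flips = ''.join('H' if rng.random() < p_head else 'T' for _ in range(n))
        samples.append((coin, flips))
    return samples

def estimate_prob_red_by_coins(samples):
    """Fraction of the sampled coins that are red."""
    return sum(1 for colour, _ in samples if colour == 'R') / len(samples)

def rmse_prob_red(m, n, p_red, trials, rng):
    """Root-mean-square error of the prob_red estimate over repeated experiments."""
    squared_errors = []
    for _ in range(trials):
        # Heads probabilities 0.2 and 0.4 are arbitrary illustrative values.
        samples = simulate(m, n, p_red, 0.2, 0.4, rng)
        squared_errors.append((estimate_prob_red_by_coins(samples) - p_red) ** 2)
    return math.sqrt(sum(squared_errors) / trials)

rng = random.Random(0)  # fixed seed so the experiment is repeatable
for m in (10, 100, 1000):
    print(m, rmse_prob_red(m, n=5, p_red=0.3, trials=100, rng=rng))
```

The estimate of the proportion of red coins depends only on how many coins are drawn, so its error should shrink roughly like 1/sqrt(m) regardless of n. The same harness could be pointed at the heads probabilities to see how their error depends on the number of flips per colour.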

In [29]:
def compute_sufficient_statistics(samples):
    # All counts are in units of flips. Every coin is assumed to be flipped the
    # same number of times, so flip counts are proportional to coin counts.
    total = sum(len(flips) for _, flips in samples)
    count_red = sum(len(flips) for coin, flips in samples if coin == 'R')
    count_red_head = sum(flips.count('H') for coin, flips in samples if coin == 'R')
    count_blue_head = sum(flips.count('H') for coin, flips in samples if coin == 'B')
    return total, count_red, count_red_head, count_blue_head

def mle(total, count_red, count_red_head, count_blue_head):
    # Each estimate is a ratio of counts. A ZeroDivisionError is raised if the
    # samples contain no red flips or no blue flips.
    estimate_prob_red = count_red / total
    estimate_prob_head_red = count_red_head / count_red
    estimate_prob_head_blue = count_blue_head / (total - count_red)
    return estimate_prob_red, estimate_prob_head_red, estimate_prob_head_blue

In [32]:
samples = flip_m_random_coins_n_times(10000, 100)
estimate_prob_red, estimate_prob_head_red, estimate_prob_head_blue = mle(*compute_sufficient_statistics(samples))
print(estimate_prob_red, estimate_prob_head_red, estimate_prob_head_blue)

0.304 0.20045723684210526 0.3996954022988506
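For reference, the three ratios computed by `mle` can be written out as below. The symbols are notation introduced here for illustration and do not appear in the notebook: $N$ is the total number of flips, $N_R$ the number of flips made with red coins, and $H_R$ and $H_B$ the numbers of heads obtained from red and blue coins. The first ratio equals the fraction of red coins only when every coin is flipped the same number of times, as it is in the samples above.

$$
\hat{\pi}_R = \frac{N_R}{N}, \qquad
\hat{\theta}_R = \frac{H_R}{N_R}, \qquad
\hat{\theta}_B = \frac{H_B}{N - N_R}
$$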


**TASK 2** Given a sample from a single coin whose colour is unobserved, estimate the posterior probability that the coin is red, given some estimates of the model parameters.

* If you pass in the true model parameters (e.g. prob_red.value, prob_head_red.value and prob_head_blue.value), how quickly does the posterior change as more flips of the coin are observed? Use the plot_distribution function to view this.
* How does this depend on the model parameters?

In [61]:
def compute_posterior_prob_red(sample, estimate_prob_red, estimate_prob_head_red, estimate_prob_head_blue):
    count_head = sample.count('H')
    count_tail = len(sample) - count_head
    # Joint probability of each colour together with the observed flips.
    joint_red = estimate_prob_red * estimate_prob_head_red**count_head * (1 - estimate_prob_head_red)**count_tail
    joint_blue = (1 - estimate_prob_red) * estimate_prob_head_blue**count_head * (1 - estimate_prob_head_blue)**count_tail
    # Bayes' rule: normalise by the total probability of the flips.
    return joint_red / (joint_red + joint_blue)
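For very long flip sequences the products of probabilities in the posterior can underflow to zero in floating point, which would make the final division fail. A common workaround is to do the arithmetic on logarithms. The following is a minimal sketch of that idea, not part of the original notebook: the name `posterior_prob_red_log_space` is hypothetical, and it assumes every parameter lies strictly between 0 and 1 so that all logarithms are finite.

```python
import math

def posterior_prob_red_log_space(sample, p_red, p_head_red, p_head_blue):
    """Posterior probability that a coin is red, computed from log joints.

    Assumes every parameter lies strictly between 0 and 1.
    """
    heads = sample.count('H')
    tails = len(sample) - heads
    log_joint_red = (math.log(p_red) + heads * math.log(p_head_red)
                     + tails * math.log(1 - p_head_red))
    log_joint_blue = (math.log(1 - p_red) + heads * math.log(p_head_blue)
                      + tails * math.log(1 - p_head_blue))
    # Subtract the larger log joint before exponentiating, so that at least
    # one of the two weights is exactly 1.0 and the sum cannot be zero.
    largest = max(log_joint_red, log_joint_blue)
    weight_red = math.exp(log_joint_red - largest)
    weight_blue = math.exp(log_joint_blue - largest)
    return weight_red / (weight_red + weight_blue)

# A short sample, and a long one where the direct products would underflow.
print(posterior_prob_red_log_space('HHTH', 0.5, 0.7, 0.2))
print(posterior_prob_red_log_space('H' * 5000, 0.3, 0.7, 0.2))
```

For samples of 100 flips, as used in this notebook, the direct computation is normally adequate; the log-space form mainly matters when n grows into the thousands.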


**TASK 3** Reusing your code from Tasks 1 and 2, implement the expectation-maximization (EM) algorithm to find a (locally optimal) estimate of the parameters when the colour of the coins is not observed.

In [67]:
def compute_expected_statistics(samples, estimate_prob_red, estimate_prob_head_red, estimate_prob_head_blue):
    # E-step: replace the observed counts from Task 1 with expected counts,
    # weighting each sample by the posterior probability that its coin is red.
    total, expected_count_red, expected_count_red_head, expected_count_blue_head = 0, 0.0, 0.0, 0.0
    for sample in samples:
        total += len(sample[1])
        posterior_prob_red = compute_posterior_prob_red(sample[1], estimate_prob_red, estimate_prob_head_red, estimate_prob_head_blue)
        expected_count_red += posterior_prob_red * len(sample[1])
        expected_count_red_head += posterior_prob_red * sample[1].count('H')
        expected_count_blue_head += (1 - posterior_prob_red) * sample[1].count('H')
    return total, expected_count_red, expected_count_red_head, expected_count_blue_head

def expectation_maximization(samples, estimate_prob_red, estimate_prob_head_red, estimate_prob_head_blue):
    # Run a fixed number of 10 EM iterations, printing the estimates after each.
    for _ in range(10):
        total, expected_count_red, expected_count_red_head, expected_count_blue_head = compute_expected_statistics(
            samples, estimate_prob_red, estimate_prob_head_red, estimate_prob_head_blue)
        # M-step: reuse the observed-case estimator on the expected counts.
        estimate_prob_red, estimate_prob_head_red, estimate_prob_head_blue = mle(
            total, expected_count_red, expected_count_red_head, expected_count_blue_head)
        print(estimate_prob_red, estimate_prob_head_red, estimate_prob_head_blue)


In [78]:
samples = flip_m_random_coins_n_times(10, 100, hidden=True)
expectation_maximization(samples, 0.5, 0.7, 0.2)

0.555294461770073 0.6019919515493285 0.3951315829659137
0.3266628264755212 0.699974269087489 0.4178358745991849
0.2999963891190526 0.719998947269077 0.4200019986750612
0.2999832413939155 0.7200037090276086 0.4200055926313944
0.2999832315266686 0.7200037114282721 0.4200055958313131
0.29998323151917455 0.7200037114300724 0.4200055958337533
0.29998323151916884 0.7200037114300739 0.4200055958337552
0.29998323151916884 0.7200037114300738 0.4200055958337553
0.29998323151916884 0.7200037114300738 0.4200055958337553
0.29998323151916884 0.7200037114300738 0.4200055958337553
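The printed estimates stop changing after a few iterations, which suggests stopping on a convergence test rather than after a fixed number of steps. Since EM never decreases the likelihood of the observed flips, one option is to stop when the gain in log-likelihood becomes negligible. The sketch below is one possible variant, not the notebook's implementation: the names `posterior_red`, `log_likelihood` and `em_until_converged`, the `tolerance` and `max_iterations` defaults, and the synthetic data (true parameters 0.3, 0.7, 0.2) are all assumptions made for illustration.

```python
import math
import random

def posterior_red(flips, p_red, p_head_red, p_head_blue):
    """Posterior probability that the coin behind `flips` is red."""
    heads = flips.count('H')
    tails = len(flips) - heads
    joint_red = p_red * p_head_red ** heads * (1 - p_head_red) ** tails
    joint_blue = (1 - p_red) * p_head_blue ** heads * (1 - p_head_blue) ** tails
    return joint_red / (joint_red + joint_blue)

def log_likelihood(samples, p_red, p_head_red, p_head_blue):
    """Log-likelihood of the flips with the coin colour marginalised out."""
    total = 0.0
    for _, flips in samples:
        heads = flips.count('H')
        tails = len(flips) - heads
        total += math.log(
            p_red * p_head_red ** heads * (1 - p_head_red) ** tails
            + (1 - p_red) * p_head_blue ** heads * (1 - p_head_blue) ** tails)
    return total

def em_until_converged(samples, p_red, p_head_red, p_head_blue,
                       tolerance=1e-9, max_iterations=1000):
    """Run EM until the log-likelihood gain falls below `tolerance`."""
    previous = log_likelihood(samples, p_red, p_head_red, p_head_blue)
    for iteration in range(1, max_iterations + 1):
        # E-step: expected counts, weighted by each sample's posterior.
        flips_total = red_flips = red_heads = blue_heads = 0.0
        for _, flips in samples:
            weight = posterior_red(flips, p_red, p_head_red, p_head_blue)
            heads = flips.count('H')
            flips_total += len(flips)
            red_flips += weight * len(flips)
            red_heads += weight * heads
            blue_heads += (1 - weight) * heads
        # M-step: the same ratios of counts as in the observed case.
        p_red = red_flips / flips_total
        p_head_red = red_heads / red_flips
        p_head_blue = blue_heads / (flips_total - red_flips)
        current = log_likelihood(samples, p_red, p_head_red, p_head_blue)
        if current - previous < tolerance:
            break
        previous = current
    return p_red, p_head_red, p_head_blue, iteration

# Synthetic hidden-colour data with assumed true parameters 0.3, 0.7, 0.2.
rng = random.Random(1)
samples_demo = []
for _ in range(500):
    p_head = 0.7 if rng.random() < 0.3 else 0.2
    flips = ''.join('H' if rng.random() < p_head else 'T' for _ in range(50))
    samples_demo.append(('_', flips))

result = em_until_converged(samples_demo, 0.5, 0.6, 0.4)
print(result)
```

Because the solution is only locally optimal, the result can depend on the starting point. In particular, starting with the blue coin's heads probability above the red coin's may converge to the same fit with the two colours swapped.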
