Mathematician John von Neumann is credited with figuring out how to take a biased coin (whose probability of coming up heads is p, not necessarily equal to 0.5) and “simulate” a fair coin. Simply flip the coin twice. If it comes up heads both times or tails both times, then flip it twice again. Eventually, you’ll get two different flips — either a heads and then a tails, or a tails and then a heads, with each of these two cases equally likely. Once you get two different flips, you can call the second of those flips the outcome of your “simulation.”

For any value of p between zero and one, this procedure will always return heads half the time and tails half the time. This is pretty remarkable! But there’s a downside to von Neumann’s approach — you don’t know how long (i.e., how many flips) the simulation will last.

Suppose I want to simulate a fair coin in at most three flips. For which values of p is this possible?

Extra credit: Suppose I want to simulate a fair coin in at most N flips. For how many values of p is this possible?

### General Thinking:

- Von Neumann's procedure flips the coin in pairs and does not stop until the two flips in a pair differ. The second flip of that pair is the outcome of the simulation. 

- In our problem, we want to cap the total number of flips at 3. 
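
To make the "unknown length" concrete: each pair of flips ends the procedure with probability `2p(1-p)`, so the number of pairs needed is geometric and the expected number of flips is `2 / (2p(1-p)) = 1 / (p(1-p))`. A minimal sketch of that calculation follows; the helper name `expected_flips` is illustrative and not part of the original notebook.

```python
def expected_flips(p):
    """Expected number of flips for von Neumann's procedure with P(heads) = p."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    # Each pair differs with probability 2p(1-p), so the number of
    # pairs needed is geometric with mean 1 / (2p(1-p)).
    expected_pairs = 1 / (2 * p * (1 - p))
    return 2 * expected_pairs  # two flips per pair

for p in (0.5, 0.2, 0.01):
    print(p, expected_flips(p))
```

At `p = 0.5` this gives 4 flips on average, and it grows without bound as `p` approaches 0 or 1 (about 101 flips at `p = 0.01`), which is why a hard cap on the number of flips is a real constraint.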

In [20]:
import numpy as np 

# start with building a class for handling the John Von Neumann scenario 
# Note: This is not efficient. Very slow, but just a POC
class johnny_v():
    def __init__(self, p = 0.5, sims = 1000):
        self.coin_p = p
        self.sims = sims
        self.obs_list = []
        
    def __single_sim(self):
        """Flip the coin in pairs until the two flips differ; return the second flip."""
        while True:
            
            # flip the coin twice, each time with P(heads) = coin_p
            outcomes = np.random.binomial(1, self.coin_p, size = 2)
            
            if outcomes[0] != outcomes[1]:
                return outcomes[1]
            
    def full_sim(self):
        """Run a full simulation across self.sims length"""
        
        for i in range(self.sims):
            self.obs_list.append(self.__single_sim())

In [24]:
# run a test with p = 0.5 on 1000 sims
test_1 = johnny_v(0.5, 1000)
test_1.full_sim()
sum(test_1.obs_list) / test_1.sims

0.506

In [26]:
# run a test with p = 0.2 on 10000 sims
test_1 = johnny_v(0.2, 10000)
test_1.full_sim()
sum(test_1.obs_list) / test_1.sims

0.497

In [27]:
# run a test with p = 0.01 on 100000 sims
test_1 = johnny_v(0.01, 100000)
test_1.full_sim()
sum(test_1.obs_list) / test_1.sims

0.49986
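
The class above loops in Python, which its own comment flags as slow. One possible speed-up, sketched here under assumptions, is to draw many pairs at once and keep only the pairs that differ. Each kept pair corresponds to one completed simulation. The names `von_neumann_batch`, `n_pairs`, and `seed` are hypothetical and not part of the original notebook.

```python
import numpy as np

def von_neumann_batch(p, n_pairs=100_000, seed=0):
    """Return the second flip of every discordant pair among n_pairs pairs."""
    rng = np.random.default_rng(seed)
    # Each row is one pair of flips with P(heads) = p.
    pairs = rng.binomial(1, p, size=(n_pairs, 2))
    # Keep only the pairs whose two flips differ.
    discordant = pairs[pairs[:, 0] != pairs[:, 1]]
    return discordant[:, 1]

outcomes = von_neumann_batch(0.2)
print(len(outcomes), outcomes.mean())
```

The number of outcomes returned is random (roughly `2p(1-p) * n_pairs`), so this trades a fixed simulation count for speed.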

### Switching to the problem at hand: 

- Need to modify the class so that it uses at most 3 flips. 

- One idea: if the less likely (non-dominant) side comes up, stop and keep it; otherwise move on to the next flip. 

- General idea: 
    - Let `p` be the probability of heads. When `p` > 0.5, we do not accept heads on a flip; we move on to the next flip instead. If a flip comes up tails, we stop and return tails. When `p` < 0.5, the roles of heads and tails swap.
    - With at most `n` flips, this rule returns heads only if all `n` flips are heads, so it is fair when `p^n = 0.5`. By symmetry, for `p` < 0.5 it is fair when `(1-p)^n = 0.5`. This gives the end points:
        - 3 flips:
            - Max `p`: `p^3 = 0.5`
                - `p` = `0.5 ** (1/3)` = 0.7937
            - Min `p`: `(1-p)^3 = 0.5`
                - `p` = `1 - 0.5 ** (1/3)` = 0.2063
        - 2 flips:
            - Max `p`: `p^2 = 0.5`
                - `p` = `0.5 ** (1/2)` = 0.7071
            - Min `p`: `(1-p)^2 = 0.5`
                - `p` = `1 - 0.5 ** (1/2)` = 0.2929
        - 1 flip:
            - `p` = 0.5 is the only value that works.
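
The end points above can be reproduced with a short helper. This is a sketch that covers only the stop-early rule described in the list, and the name `stop_early_endpoints` is illustrative rather than part of the original notebook.

```python
def stop_early_endpoints(n_flips):
    """p values at which the stop-early rule is fair when using n_flips flips."""
    upper = 0.5 ** (1 / n_flips)  # solves p**n = 0.5
    lower = 1 - upper             # solves (1 - p)**n = 0.5
    return lower, upper

for n in (1, 2, 3):
    lower, upper = stop_early_endpoints(n)
    print(f"{n} flips: p = {lower:.4f} or p = {upper:.4f}")
```

For one flip both end points collapse to 0.5, matching the last bullet.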

In [135]:
# Modified Flips below: 
class modified_flips():
    def __init__(self, p = 0.5, sims = 1000):
        self.coin_p = p
        self.sims = sims
        self.obs_list = []
        
    def __single_sim(self):
        """Flip the coin 3 times and return 1 (heads) or 0 (tails) using a stop-early rule"""
        outcomes = np.random.binomial(1, self.coin_p, size = 3)
        
        if self.coin_p == 0.5:
            return outcomes[0] # return first value randomly - will converge to 0.5 

        # heads is the less likely side: return heads (1) as soon as it occurs
        if self.coin_p < 0.5:
            
            # 3 flips (the 0.25 cutoff between the 2- and 3-flip rules is a hard-coded choice)
            if self.coin_p < 0.25: 
                if outcomes[0] == 1:
                    return outcomes[0]
                elif outcomes[1] == 1:
                    return outcomes[1]
                else:
                    return outcomes[2]
                
            # 2 flips - fall back to the second flip if the first is tails
            else:
                if outcomes[0] == 1:
                    return outcomes[0]
                else:
                    return outcomes[1]
                
        # heads is the more likely side: return tails (0) as soon as it occurs
        if self.coin_p > 0.5:
            
            # 3 flips (the 0.75 cutoff between the 2- and 3-flip rules is a hard-coded choice)
            if self.coin_p > 0.75: 
                if outcomes[0] == 0:
                    return outcomes[0]
                elif outcomes[1] == 0:
                    return outcomes[1]
                else:
                    return outcomes[2]
                
            # 2 flips - fall back to the second flip if the first is heads
            else:
                if outcomes[0] == 0:
                    return outcomes[0]
                else:
                    return outcomes[1]
            
    def full_sim(self):
        """Run a full simulation across self.sims length"""
        
        for i in range(self.sims):
            self.obs_list.append(self.__single_sim())

In [138]:
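# check each answer: 
x = 0.5 # fair coin
# note: the exponent 0.333 only approximates 1/3, so the cube-root values
# below come out near, not exactly at, 0.7937 and 0.2063
p_values = [x, x**0.5, x**0.333, 1 - x**0.5, 1 - x**0.333]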

# run sim for p-values 100K times each, print final proportion of heads
for p_val in p_values:
    test_1 = modified_flips(p_val, 100000)
    test_1.full_sim() # run sim
    final_prop = sum(test_1.obs_list) / test_1.sims
    print(f"For p-value of {p_val} the observed proportions of heads was: {final_prop:.2f}")

For p-value of 0.5 the observed proportions of heads was: 0.50
For p-value of 0.7071067811865476 the observed proportions of heads was: 0.50
For p-value of 0.7938839309316524 the observed proportions of heads was: 0.50
For p-value of 0.2928932188134524 the observed proportions of heads was: 0.50
For p-value of 0.20611606906834756 the observed proportions of heads was: 0.50


In [140]:
# These stop-early strategies do not give a fair result for other values of p
bad_p_vals = [0.15, 0.3, 0.67, 0.9]

# run sim for p-values 100K times each, print final proportion of heads
for p_val in bad_p_vals:
    test_1 = modified_flips(p_val, 100000)
    test_1.full_sim() # run sim
    final_prop = sum(test_1.obs_list) / test_1.sims
    print(f"For p-value of {p_val} the observed proportions of heads was: {final_prop:.2f}")

For p-value of 0.15 the observed proportions of heads was: 0.38
For p-value of 0.3 the observed proportions of heads was: 0.51
For p-value of 0.67 the observed proportions of heads was: 0.45
For p-value of 0.9 the observed proportions of heads was: 0.73
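
The simulated proportions can be cross-checked against a closed form. The sketch below mirrors the branching in `modified_flips` (including its hard-coded 0.25 and 0.75 cutoffs); the helper name `exact_heads_prob` is an assumption introduced for illustration.

```python
def exact_heads_prob(p):
    """Closed-form probability that the modified_flips rule returns heads (1)."""
    if p == 0.5:
        return 0.5
    if p < 0.5:
        n = 3 if p < 0.25 else 2
        return 1 - (1 - p) ** n  # heads unless all n flips are tails
    n = 3 if p > 0.75 else 2
    return p ** n                # heads only if all n flips are heads

for p in (0.15, 0.3, 0.67, 0.9):
    print(f"p = {p}: exact P(heads) = {exact_heads_prob(p):.4f}")
```

The exact values should agree with the simulated proportions up to sampling noise, and none of them equals 0.5 for these four values of `p`.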
