### General Approach:

- We can use the binomial CDF, which gives the likelihood of getting at most `x` successes (`P(X <= x)`) out of `n` attempts, each with success probability `p`.
    - Scipy's `binom.cdf` returns `P(X <= x)`, so we need to evaluate it at `k - 1`.
    - For our problem, we want the likelihood of getting `k` through `n` heads out of `n` remaining trials, which is `1 - CDF(k - 1, n, 0.5)`.
    
    - Tangible example:
        - We have run 15 trials, with 8 heads thus far, so 101 - 15 = 86 trials remain.
        - 51 - 8 = 43 heads still needed
        - We can solve for `1 - P(X <= 42)`, which gives the likelihood of getting > 42 heads (43+)
        - *In scipy, this is written as `1 - binom.cdf(x, 86, 0.5)`, where x is 43 - 1 = 42*

- We only count a game if the win probability reaches 0.99 at some point (`p >= 0.99` in the code), then check whether that player went on to lose.
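
The worked example above can be checked without scipy. The sketch below is an illustration, not part of the original notebook: `prob_at_least` and `win_probability` are hypothetical helper names, and the tail probability is summed exactly with `math.comb`, assuming a fair coin.

```python
from math import comb

TOTAL_FLIPS = 101   # game length used throughout this notebook
WINNING_HEADS = 51  # heads needed to win


def prob_at_least(k, n):
    """P(X >= k) for X ~ Binomial(n, 0.5), summed exactly (hypothetical helper)."""
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


def win_probability(flips_so_far, heads_so_far):
    """Chance of still reaching WINNING_HEADS given the game state so far."""
    needed_heads = WINNING_HEADS - heads_so_far
    flips_left = TOTAL_FLIPS - flips_so_far
    return prob_at_least(needed_heads, flips_left)


# 15 flips played, 8 heads: need 43 or more heads from the 86 flips left
p = win_probability(15, 8)
print(f"P(X >= 43 | n = 86) = {p:.4f}")
```

Keeping the state-to-`(k, n)` translation in one small function makes the off-by-one between "heads needed" and the CDF argument easier to audit.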

#### Let's say we have 12 flips left & we need 4 heads or more to reach 51.
- What we want is `P(X >= x)` where x = 4.
    - `P(X >= x)` = `1 - P(X < x)` = `1 - P(X <= x - 1)` since `X` is an integer count, so we can just solve `1 - CDF(x - 1, flips_left, 0.5)`
    - `P(X >= 4) = 0.92700195`

```python
from scipy.stats import binom
p = 1 - binom.cdf(3, 12, 0.5)  # P(X >= 4) with 12 flips left: 0.927001953125
```
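
As an independent check on the identity `P(X >= x) = 1 - P(X <= x - 1)`, the sketch below evaluates both sides exactly with the standard library's `fractions.Fraction` and `math.comb`. It is an illustration only; the function name `pmf` is hypothetical and a fair coin is assumed.

```python
from fractions import Fraction
from math import comb


def pmf(i, n):
    """Exact P(X = i) for X ~ Binomial(n, 1/2)."""
    return Fraction(comb(n, i), 2 ** n)


n, x = 12, 4

# Left side: add up P(X = 4), ..., P(X = 12) directly
upper_tail = sum(pmf(i, n) for i in range(x, n + 1))

# Right side: 1 - CDF(x - 1), i.e. 1 - [P(X = 0) + ... + P(X = 3)]
one_minus_cdf = 1 - sum(pmf(i, n) for i in range(0, x))

print(upper_tail, float(upper_tail))  # 3797/4096 0.927001953125
```

Both sides come out to the same fraction, which matches the scipy value shown above.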

#### Speeding things up:
- My initial runs were calculating the CDF up to 101 times per game (once per flip), so my 100K-game test loop was making up to 100K × 101 = `10,100,000` CDF calls.
- Given there are only so many combinations of `k` & `n` for the CDF, we can solve for each of them once and store the results in a dictionary.
    - Note: Solving for `1 - binom.cdf(k - 1, n, 0.5)`
- A single run speeds up from `11.5 ms` to `0.338 ms` with this switch, a speedup of more than 30x.
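
To illustrate why the lookup helps, the sketch below times a direct tail computation against a dictionary lookup using `timeit`. It is a stand-in under stated assumptions: the tail is computed with `math.comb` rather than `binom.cdf`, `tail_prob` is a hypothetical helper, and the timings it prints will differ from the `11.5 ms` and `0.338 ms` figures above.

```python
import timeit
from math import comb


def tail_prob(k, n):
    """P(X >= k) for X ~ Binomial(n, 0.5); stand-in for 1 - binom.cdf(k - 1, n, 0.5)."""
    return sum(comb(n, i) for i in range(max(k, 0), n + 1)) / 2 ** n


# Precompute every (heads needed, flips left) pair once
table = {(k, n): tail_prob(k, n) for n in range(102) for k in range(52)}

# Time 1,000 direct computations against 1,000 lookups of the same state
direct = timeit.timeit(lambda: tail_prob(40, 90), number=1000)
lookup = timeit.timeit(lambda: table[(40, 90)], number=1000)

print(f"table size: {len(table)} entries")
print(f"direct: {direct:.4f} s, lookup: {lookup:.4f} s")
```

The table has only 102 × 52 = 5,304 entries, so building it once costs far less than recomputing a tail probability on every flip of every game.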

In [1]:
import numpy as np 
from scipy.stats import binom

In [2]:
# build a lookup dictionary keyed by (heads we still need, turns left).
# max heads needed is 51, max turns left is 101
# store the final probability P(X >= k) itself so the game loop does no extra arithmetic
tot_turns = 101
max_heads = 51
bin_cdf = {}

for n in range(tot_turns+1):
    for k in range(max_heads + 1):
        
        # binom.cdf(x, n, p) is P(X <= x), so 1 - cdf(k-1) gives P(X >= k)
        bin_cdf[k,n] = 1 - binom.cdf(k-1,n,0.5)

In [3]:
# check a few: https://stattrek.com/online-calculator/binomial.aspx
# 1) P(X >= 10) when we have 47 flips left is 0.9999875479846878
print(bin_cdf.get((10,47))) # pretty good matching

# 2) P(X >= 30) when we have 47 flips left is 0.03947034684
print(bin_cdf.get((30,47))) # pretty good matching

0.9999875479846878
0.03947034684074424
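
Beyond spot checks against an online calculator, a few entries have values known in closed form. The sketch below is an illustrative addition: it rebuilds the same `(k, n)` table with exact `fractions.Fraction` arithmetic instead of scipy (the name `exact_tail` is hypothetical) and asserts three such cases.

```python
from fractions import Fraction
from math import comb


def exact_tail(k, n):
    """Exact P(X >= k) for X ~ Binomial(n, 1/2)."""
    return Fraction(sum(comb(n, i) for i in range(max(k, 0), n + 1)), 2 ** n)


exact = {(k, n): exact_tail(k, n) for n in range(102) for k in range(52)}

# Needing 0 more heads is certain, whatever n is
assert all(exact[(0, n)] == 1 for n in range(102))

# Needing more heads than there are flips left is impossible
assert exact[(5, 4)] == 0

# At the start (51 heads needed, 101 flips left) the game is a fair 50/50
assert exact[(51, 101)] == Fraction(1, 2)

# The two spot-checked entries, converted to floats for comparison
print(float(exact[(10, 47)]), float(exact[(30, 47)]))
```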


In [4]:
def faster_game():
    
    # General game params:
    heads = 0
    winning_flips = 51
    tot_flips = 101
    flips = 0
    exceeds = False
    winner = False

    # Run 101 flips. If a clear winner arises, end game
    for _ in range(tot_flips):

        # flip coin once
        coin = np.random.binomial(1,0.5)

        # update heads count, flips, and tails 
        heads += coin
        flips += 1
        tails = flips - heads   

        # needed to win:
        needed_heads = winning_flips - heads
        turns_left = tot_flips - flips
        
        #### STOP GAME IF WINNER IS FOUND: 
        
        # heads has reached 51: the game is over and this player wins
        if heads >= winning_flips:
            winner = True
            return exceeds, winner
        
        # tails has reached 51: heads can no longer win, so the game is lost
        if tails >= winning_flips:
            winner = False
            return exceeds, winner
        
        ### Solve for likelihood of getting needed_heads or more based on remaining turns

        # look up P(X >= k): the chance of getting at least k more heads
        # in the n turns that remain. e.g. with 90 turns left and 40 heads
        # needed, we want P(X >= 40) for n = 90 (already precomputed in bin_cdf)
        k = needed_heads
        n = turns_left

        if k >= 0:

            # bin_cdf holds the precomputed P(X >= k) for n turns left
            p = bin_cdf.get((k,n))
            
            if p >= 0.99:
                exceeds = True
            
    return exceeds, winner
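
Because `faster_game` draws its own random flips, it is awkward to test directly. One option, sketched below as an assumption rather than the notebook's method, is to separate the game logic from the randomness: the hypothetical `play_sequence` takes a fixed list of flips, so known sequences give known answers. The table is built with `math.comb` to keep the sketch free of third-party imports.

```python
from math import comb

TOT_FLIPS, WINNING_FLIPS, THRESHOLD = 101, 51, 0.99

# P(X >= k) with n fair flips left, for every reachable (k, n)
tail = {
    (k, n): sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    for n in range(TOT_FLIPS + 1)
    for k in range(WINNING_FLIPS + 1)
}


def play_sequence(flip_results):
    """Replay fixed flips (1 = heads, 0 = tails); return (exceeds, winner)."""
    heads = 0
    exceeds = False
    for flips, coin in enumerate(flip_results, start=1):
        heads += coin
        tails = flips - heads
        if heads >= WINNING_FLIPS:
            return exceeds, True
        if tails >= WINNING_FLIPS:
            return exceeds, False
        # Same rule as faster_game: flag once the win probability hits 0.99
        if tail[(WINNING_FLIPS - heads, TOT_FLIPS - flips)] >= THRESHOLD:
            exceeds = True
    return exceeds, False


print(play_sequence([1] * TOT_FLIPS))  # (True, True)
print(play_sequence([0] * TOT_FLIPS))  # (False, False)
```

A sequence of 45 heads followed by 56 tails would be a "super loser" game under this rule: the flag is set early, yet tails reaches 51 first.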

### Rerun with New Function: 30x faster

In [5]:
tot_games = 0
super_loser = 0

for j in range(2000000):
    
    exceeds, winner = faster_game()
    
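    # only count games where the win probability reached 0.99 at some point
    if exceeds:
        tot_games += 1

        # a "super loser" had a >= 0.99 chance of winning and still lost
        if not winner:
            super_loser += 1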
     
    if (j+1) % 200000 == 0:
        print(f"Iteration {j+1}:")
        print(f"\tTotal games with > 0.99: {tot_games}")
        print(f"\tSuper loser games: {super_loser}")
        print(f"\tPercent: {100 * super_loser / tot_games:.2f}")

Iteration 200000:
	Total games with > 0.99: 63204
	Super loser games: 439
	Percent: 0.69
Iteration 400000:
	Total games with > 0.99: 126246
	Super loser games: 912
	Percent: 0.72
Iteration 600000:
	Total games with > 0.99: 189519
	Super loser games: 1359
	Percent: 0.72
Iteration 800000:
	Total games with > 0.99: 252336
	Super loser games: 1760
	Percent: 0.70
Iteration 1000000:
	Total games with > 0.99: 315418
	Super loser games: 2190
	Percent: 0.69
Iteration 1200000:
	Total games with > 0.99: 378853
	Super loser games: 2598
	Percent: 0.69
Iteration 1400000:
	Total games with > 0.99: 442127
	Super loser games: 3032
	Percent: 0.69
Iteration 1600000:
	Total games with > 0.99: 505253
	Super loser games: 3437
	Percent: 0.68
Iteration 1800000:
	Total games with > 0.99: 568678
	Super loser games: 3836
	Percent: 0.67
Iteration 2000000:
	Total games with > 0.99: 632022
	Super loser games: 4270
	Percent: 0.68
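
To gauge how much the final percentage could still move, the sketch below recomputes it from the last printed counts and attaches a rough standard error. The normal approximation to a binomial proportion is an assumption added here for illustration; it is not part of the original analysis.

```python
from math import sqrt

# Final counts printed after 2,000,000 simulated games
tot_games = 632022    # games where the win probability reached 0.99
super_loser = 4270    # of those, games that were still lost

rate = super_loser / tot_games

# Standard error of a proportion under the normal approximation
std_err = sqrt(rate * (1 - rate) / tot_games)
low, high = rate - 1.96 * std_err, rate + 1.96 * std_err

print(f"Percent: {100 * rate:.2f}")
print(f"Approx. 95% interval: {100 * low:.3f} to {100 * high:.3f}")
```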
