# Lecture 5 – Probability

## History of Data Science, Winter 2022

In [None]:
import numpy as np
from scipy.special import comb
import matplotlib.pyplot as plt

## The Problem of Points

Let's define a function, `prob_a_wins`, that takes the number of points each player still needs to win the game and returns the probability that Player A wins, assuming each remaining round is equally likely to go to either player.
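
As a reference for the code that follows, here is one way to write the quantity being computed. This is a sketch that assumes every round is a fair coin flip; the symbols $a$, $b$, and $n$ are shorthand introduced here for `a_left`, `b_left`, and `max_turns`.

$$
P(\text{A wins}) = \frac{1}{2^{n}} \sum_{k=a}^{n} \binom{n}{k}, \qquad n = a + b - 1
$$

For example, with $a = 2$ and $b = 3$ we get $n = 4$, so $P(\text{A wins}) = \frac{6 + 4 + 1}{16} = \frac{11}{16}$.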

In [None]:
def prob_a_wins(a_left, b_left):
    '''Returns the probability (according to Fermat and Pascal's method) that 
       Player A wins the game, given:
       - the number of points Player A needs to win the game (a_left), and
       - the number of points Player B needs to win the game (b_left)
    '''
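    # At most a_left + b_left - 1 more rounds are needed to decide the game.
    max_turns = a_left + b_left - 1
    # Player A wins exactly when A takes at least a_left of those max_turns rounds.
    winning_outcomes = np.sum([comb(max_turns, k) for k in np.arange(a_left, max_turns + 1)])
    return winning_outcomes / (2 ** max_turns)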

In [None]:
prob_a_wins(2, 3)
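
As a sanity check on the counting argument, we can list every equally likely sequence of rounds and count the ones Player A wins. The sketch below is not part of the original lecture code: `prob_a_wins_enumerated` is a hypothetical helper name, and it assumes fair rounds. It is only practical for small inputs, since it enumerates all `2 ** max_turns` sequences.

```python
from fractions import Fraction
from itertools import product


def prob_a_wins_enumerated(a_left, b_left):
    """Brute-force check (hypothetical helper): enumerate every sequence of
    a_left + b_left - 1 fair rounds and count those in which Player A
    wins at least a_left of them.
    """
    max_turns = a_left + b_left - 1
    a_wins = 0
    # Each element of `rounds` is the winner of one round: 'A' or 'B'.
    for rounds in product("AB", repeat=max_turns):
        if rounds.count("A") >= a_left:
            a_wins += 1
    # Every sequence has the same probability, so the answer is a simple ratio.
    return Fraction(a_wins, 2 ** max_turns)


print(prob_a_wins_enumerated(2, 3))  # 11/16
```

Using `Fraction` keeps the result exact, so it can be compared directly with the hand calculation.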

Let's also define a function `stop_game` that takes in the target score and the number of points both players have so far and returns the probability that Player A wins. This will call our original `prob_a_wins` function.

In [None]:
def stop_game(target_score, a_score, b_score):
    '''Returns the probability (according to Fermat and Pascal's method) that 
       Player A wins the game, given: 
       - a target score (target_score),
       - the number of points that Player A currently has (a_score), and
       - the number of points that Player B currently has (b_score)
    '''
    a_left = target_score - a_score
    b_left = target_score - b_score
    return prob_a_wins(a_left, b_left)

In [None]:
stop_game(10, 8, 7)

In [None]:
stop_game(1000, 1000 - 2, 1000 - 3)

Great! But... you might not be convinced. In particular, the idea of counting rounds that would be played even after someone has already won may seem counterintuitive.
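
One way to see that the extra rounds are harmless is to compute the probability a second way, stopping as soon as someone wins, and compare. The sketch below uses a simple recursion on the points still needed; the function names are hypothetical, rounds are assumed fair, and this is an illustration rather than a reconstruction of the historical method.

```python
import math
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def prob_a_wins_stopping(a_left, b_left):
    """Probability that Player A wins when play stops the moment
    either player reaches the target (no extra rounds are played)."""
    if a_left == 0:
        return Fraction(1)  # A has already won
    if b_left == 0:
        return Fraction(0)  # B has already won
    half = Fraction(1, 2)
    # The next round goes to A or to B with equal probability.
    return (half * prob_a_wins_stopping(a_left - 1, b_left)
            + half * prob_a_wins_stopping(a_left, b_left - 1))


def prob_a_wins_fixed_length(a_left, b_left):
    """The fixed-length counting formula, evaluated in exact arithmetic."""
    n = a_left + b_left - 1
    favorable = sum(math.comb(n, k) for k in range(a_left, n + 1))
    return Fraction(favorable, 2 ** n)


# The two approaches agree on every case we try.
for a_left, b_left in [(2, 3), (1, 4), (5, 5)]:
    assert prob_a_wins_stopping(a_left, b_left) == prob_a_wins_fixed_length(a_left, b_left)

print(prob_a_wins_stopping(2, 3))  # 11/16
```

The extra rounds never change who won; they only pad every game to the same length so that all sequences are equally likely and can be counted.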

Let's simulate to remove any doubts!

First, let's write a function to simulate what would happen if we played a single interrupted game until completion. Specifically, it takes in a target score and the scores of both players so far, and it plays one turn at a time until one player wins. It returns `True` if Player A wins.

In [None]:
def simulate_one_game(target_score, a_score, b_score):
    '''Simulates the remainder of a single game that has stopped, given:
       - a target score (target_score),
       - the number of points that Player A currently has (a_score), and
       - the number of points that Player B currently has (b_score)
       
       Returns True if Player A wins and False if Player B wins.
    '''
    # Flip a fair coin for each round until one player reaches the target.
    while (target_score > a_score) and (target_score > b_score):
        flip = np.random.choice(['A', 'B'])
        if flip == 'A':
            a_score += 1
        else:
            b_score += 1
            
    return target_score == a_score

In [None]:
simulate_one_game(10, 8, 7)

Now, let's simulate many, many games and look at the proportion of them that Player A wins.

In [None]:
def simulate_many(target_score, a_score, b_score, reps=10000):
    '''Repeatedly calls simulate_one_game on the same arguments
       and returns the proportion of simulated games that were won by
       Player A.
    '''
    wins = 0
    
    for _ in range(reps):
        wins += simulate_one_game(target_score, a_score, b_score)
        
    return wins / reps

In [None]:
simulate_many(10, 8, 7)

In [None]:
simulate_many(1000, 1000 - 2, 1000 - 3)
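
The Python loop above is easy to read but slow for large `reps`. As an alternative, here is a vectorized sketch that plays every game for the full `a_left + b_left - 1` rounds, which is exactly the fixed-length view of the game used in the formula. The function name, the `seed` parameter, and the default `reps` are assumptions made for this illustration.

```python
import numpy as np


def simulate_many_fixed_length(a_left, b_left, reps=100_000, seed=0):
    """Estimates the probability that Player A wins by playing all
    a_left + b_left - 1 rounds in every simulated game (hypothetical helper).
    """
    rng = np.random.default_rng(seed)  # seeded so results are reproducible
    max_turns = a_left + b_left - 1
    # One row per game, one column per round; 1 means Player A won the round.
    rounds = rng.integers(0, 2, size=(reps, max_turns))
    a_points = rounds.sum(axis=1)
    # Player A wins the game if A took at least a_left of the rounds.
    return float(np.mean(a_points >= a_left))


print(simulate_many_fixed_length(2, 3))
```

The estimate should land near the theoretical value of 11/16 = 0.6875 for these inputs.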

The simulated proportions from `simulate_many` come out close to the theoretical probability from `stop_game`, which is 11/16 = 0.6875 for both examples, since in each case Player A needs 2 more points and Player B needs 3.
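
As a rough gauge of how close the simulated and theoretical values should be, we can compute the standard error of a simulated proportion. This sketch assumes the simulated games are independent and uses the default `reps=10000` from `simulate_many`; the variable names are introduced here for illustration.

```python
import math

p_exact = 11 / 16   # theoretical probability when A needs 2 points and B needs 3
reps = 10_000       # default number of repetitions in simulate_many

# Standard error of a proportion estimated from `reps` independent games.
standard_error = math.sqrt(p_exact * (1 - p_exact) / reps)
print(round(standard_error, 4))  # 0.0046
```

So a simulated proportion that differs from 0.6875 by around 0.005 is entirely consistent with the theory, while a gap of several hundredths would suggest a bug.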