# Homework 5 Supplementary Notebook

## History of Data Science, Winter 2022

In [None]:
import numpy as np
from scipy.special import comb
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
from IPython.display import YouTubeVideo

## Question 2

In this question, we'll simulate another classical problem in probability, called the Gambler's Ruin. The term "Gambler's Ruin" refers to many different but related problems. For our purposes, we will consider the following scenario:
- A gambler starts with $i$ dollars. Each time they play a round of a game, they win with probability $p$ (and hence lose with probability $1-p$).
    - If they win the round, they win 1 dollar.
    - If they lose the round, they lose 1 dollar.
- They stop playing either when they hit a target of $n$ dollars (i.e. they win) or when they end up with 0 dollars (i.e. they go broke, hence the term "Gambler's Ruin").

This is an example of what's called a "random walk" in probability. Watch 0:51 to 4:38 of the video below for more context.

In [None]:
YouTubeVideo('stgYW6M5o4k')

### Question 2.1

Let's start by simulating a single game until completion. Below, complete the implementation of the function `ruin_one`. It should take in an initial balance `i`, a target `n`, and the probability of winning a given round, `p`. It should repeatedly either add 1 or subtract 1 from the player's balance – adding 1 with probability `p`, subtracting 1 with probability `1-p` – until the player's balance reaches `n` or reaches `0`. If the player's balance reaches `n`, return `'W'`, and if the player's balance reaches `0`, return `'L'`.

**_Hint_:** Look into optional arguments that `np.random.choice` can use.
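
As a minimal sketch of the hint (not a solution to `ruin_one`), the optional `p` argument of `np.random.choice` assigns a probability to each option. The value `p = 0.6`, the 10,000 draws, the seed, and the names `steps` and `frac_up` are all arbitrary choices made for this illustration.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only so the illustration is repeatable

p = 0.6  # illustrative probability of drawing +1
# Draw 10,000 steps: +1 with probability p, -1 with probability 1 - p
steps = np.random.choice([1, -1], size=10_000, p=[p, 1 - p])

# The fraction of +1 draws should land close to p
frac_up = np.mean(steps == 1)
print(frac_up)
```

Omitting `size` returns a single draw instead of an array.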

**Make sure to include a screenshot of your implementation in your PDF writeup.**

In [None]:
def ruin_one(i, n, p):
    ...

Once you've completed `ruin_one`, run the following cell several times. Each time you call it, you're simulating the act of playing a game where the player:
- starts with a balance of \$5,
- wins if they reach \$15, and
- is equally likely to win or lose any given round (since `p = 0.5`).

In [None]:
ruin_one(5, 15, 0.5)

### Question 2.2

Now, complete the implementation of the function `ruin_many`, which takes in the same arguments as `ruin_one`, plus an optional argument `reps` (the number of games to simulate, 10,000 by default). It should call `ruin_one` `reps` times, and should return the **proportion** of simulations in which the player won.

**_Hint:_** You'll need to use a `for`-loop; **don't** use `i` as your loop variable, since `i` is one of the arguments to both `ruin_one` and `ruin_many`.

**Make sure to include a screenshot of your implementation in your PDF writeup.**

In [None]:
def ruin_many(i, n, p, reps=10000):
    ...

Once you've completed `ruin_many`, run the following cell. It might take up to ~10 seconds to run. The proportion you see should be somewhere near $\frac{1}{3}$.

In [None]:
ruin_many(5, 15, 0.5)
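
For a rough sense of how near "near" is, one can treat each simulated game as an independent win/lose trial, in which case the simulated proportion has a standard error of $\sqrt{q(1-q)/\text{reps}}$, where $q$ is the true probability of winning. The sketch below relies on that assumption; `q` and `standard_error` are illustrative names, not part of the assignment.

```python
import numpy as np

q = 1 / 3      # the value the simulated proportion is expected to be near
reps = 10000   # default number of simulated games in ruin_many

# Standard error of a proportion estimated from `reps` independent trials
standard_error = np.sqrt(q * (1 - q) / reps)
print(round(standard_error, 4))  # 0.0047
```

Under this assumption, most runs should land within roughly 0.01 of $\frac{1}{3}$.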

Now run the following cell, which performs the same simulation but with $p = 0.6$. You should see a much higher proportion.

In [None]:
ruin_many(5, 15, 0.6)

### Question 2.3

It turns out that there are mathematical formulas that describe the probability of reaching $n$ dollars (i.e. winning), given an initial balance $i$ and probability of winning a single round $p$. To derive them by hand, you need to know about Markov Chains, which you will learn about in upper-division probability courses.

For any $p \neq 0.5$, this probability is

$$P(\text{win starting at $i$, when $p \neq 0.5$}) = \frac{\left(\frac{1-p}{p} \right)^i - 1}{\left(\frac{1-p}{p} \right)^n - 1}$$

Note that if $p = 0.5$, this formula doesn't work, as both the numerator and denominator would be 0. In that case, the probability is instead

$$P(\text{win starting at $i$, when $p = 0.5$}) = \frac{i}{n}$$
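
As a worked example with the values $i = 5$, $n = 15$, and $p = 0.6$ used in Question 2.2, we have $\frac{1-p}{p} = \frac{2}{3}$, so

$$P(\text{win starting at $5$, when $p = 0.6$}) = \frac{\left(\frac{2}{3}\right)^{5} - 1}{\left(\frac{2}{3}\right)^{15} - 1} \approx \frac{0.1317 - 1}{0.0023 - 1} \approx 0.8703$$

This is well above the $\frac{5}{15} \approx 0.3333$ that the second formula gives for the same $i$ and $n$ when $p = 0.5$.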

Below, complete the implementation of the function `ruin_theoretical`, which takes in `i`, `n`, and `p` as described before and returns the theoretical probability of winning given those conditions.

**Make sure to include a screenshot of your implementation in your PDF writeup.**

In [None]:
def ruin_theoretical(i, n, p):
    ...

Run the following few cells to see the results of your work.

In [None]:
# This function compares the results of your simulation and the true theoretical probabilities of winning
def compare_simulation_and_theoretical(i, n, p):
    sim = ruin_many(i, n, p)
    theory = ruin_theoretical(i, n, p)
    print(f'initial balance = {i}, target balance = {n}, p = {p}')
    print(f'simulated probability of winning: {round(sim, 4)}')
    print(f'theoretical probability of winning: {round(theory, 4)}')

In [None]:
compare_simulation_and_theoretical(5, 15, 0.5)

In [None]:
compare_simulation_and_theoretical(5, 15, 0.6)

In [None]:
compare_simulation_and_theoretical(10, 25, 0.5)

If you did everything correctly, the simulated probabilities and theoretical probabilities should be very close to one another.

Play around with the third example, `compare_simulation_and_theoretical(10, 25, 0.5)`. What happens if we change 0.5 to 0.4 or 0.3? Why? (No need to answer this anywhere, it's just something to think about.)

### Fun Demo

Only proceed to this part after you've completed 2.1-2.3.

Below, we define a function that takes values for `i`, `n`, and `p`, and visualizes the "path" that the player's balance follows as it goes from `i` to either `n` or `0`.

In [None]:
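def plot_ruin_path(i, n, p):
    # Track the player's balance after every round, starting from i
    j = i
    path = [i]
    while (j != n) and (j != 0):
        # Win 1 dollar with probability p, lose 1 dollar with probability 1 - p
        j += np.random.choice([1, -1], p=[p, 1-p])
        path.append(j)
    
    # Plot the balance against the step number (the starting balance is step 1)
    x = np.arange(1, len(path) + 1)
    plt.plot(x, path)
    plt.title(f"Path of player's balance from {i} to {path[-1]}, p = {p}")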

Run the following cell.

In [None]:
plot_ruin_path(10, 20, 0.5)

Now, run the cell above repeatedly. You'll notice that sometimes the player wins/loses in a few steps, like 30 or 40. But other times, it takes hundreds, or even thousands, of steps for them to win or lose.

Try changing `p` to be `0.75` instead of `0.5` and run the cell repeatedly again. What do you notice? (Again, there's no need to answer this anywhere, it's just something to think about.)

## Question 3 (Optional)

In this question, your job will be to modify the Problem of Points code from lecture to work for games that are not fair, in which Player A wins each round with probability $p$ rather than $\frac{1}{2}$.

### Question 3.1

Below, we've copied the skeleton of `prob_a_wins` from lecture. Complete the implementation of `prob_a_wins` so that it returns the probability that Player A wins the game under the same conditions as before, but with the added caveat that Player A wins each round with probability $p$, not probability $\frac{1}{2}$.

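**_Note:_** We cannot simply divide by $2^{a + b - 1}$ anymore, since not all outcomes are equally likely. Instead, to compute this probability, we'll need to use the fact that the probability that Player A wins exactly $k$ of the $n$ remaining rounds is given by the formula below. Here $n = a + b - 1$ is the maximum number of rounds left, where $a$ and $b$ are the numbers of points that Players A and B still need (note that this $n$ is not the target balance from Question 2).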

$$P(\text{Player A wins exactly $k$ of $n$ remaining rounds}) = {n \choose k} p^{k} (1-p)^{n - k}$$

In the case where $p = 0.5 = \frac{1}{2}$, this is just $\frac{n \choose k}{2^{n}}$, which is the form we studied in lecture. 
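
As a quick numerical check of this special case, the sketch below evaluates the general formula at $p = 0.5$ and compares it to $\frac{n \choose k}{2^{n}}$. The helper `binomial_term` and the values `n = 4`, `k = 2` are hypothetical choices for illustration, not part of the lecture code.

```python
from scipy.special import comb

def binomial_term(n, k, p):
    """Hypothetical helper: probability of exactly k wins in n rounds."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, k = 4, 2  # illustrative values

general = binomial_term(n, k, 0.5)  # general formula evaluated at p = 0.5
fair_form = comb(n, k) / 2**n       # the form used for fair games

print(general, fair_form)  # 0.375 0.375
```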

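As we did in lecture, you'll need to sum this probability for all values of $k$ from $a$ to $a + b - 1$, where $a$ and $b$ are the numbers of points that Players A and B still need to win (`a_left` and `b_left` in the code below).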

**Make sure to include a screenshot of your implementation in your PDF writeup.**

In [None]:
def prob_a_wins(a_left, b_left, p):
    '''Returns the probability (according to Fermat and Pascal's method) that 
       Player A wins the game, given:
       - the number of points Player A needs to win the game (a_left),
       - the number of points Player B needs to win the game (b_left),
       - the probability that Player A wins a single round (p)
    '''
    ...

Once you've completed the implementation of `prob_a_wins`, run the cell below. It should show you 0.6875, which is the same probability we found in lecture (since we set $p = 0.5$).

In [None]:
prob_a_wins(2, 3, 0.5)

If we instead use a different $p$, we'll get different results:

In [None]:
prob_a_wins(2, 3, 0.8)

In [None]:
prob_a_wins(2, 3, 0.2)

Below, we've implemented `stop_game` as we did in lecture. Again, it computes the probability that Player A wins given a target score, current scores for both players, and the probability that Player A wins any given round.

In [None]:
def stop_game(target_score, a_score, b_score, p):
    '''Returns the probability (according to Fermat and Pascal's method) that 
       Player A wins the game, given: 
       - a target score (target_score),
       - the number of points that Player A currently has (a_score),
       - the number of points that Player B currently has (b_score), and
       - the probability that Player A wins a single round (p)
    '''
    a_left = target_score - a_score
    b_left = target_score - b_score
    return prob_a_wins(a_left, b_left, p)

What's the theoretical probability of Player A winning, if the game stops when Player A is up 8-7, the target score is 10, and Player A wins each round with $p = 0.8$?

In [None]:
stop_game(10, 8, 7, 0.8)

### Question 3.2

Similarly, complete the implementation of `simulate_one_game` so that it works the same way as it did in lecture but accounts for `p`, the probability that Player A wins a single round.

**_Hint:_** You only need to change one small part of the code from lecture.

**Make sure to include a screenshot of your implementation in your PDF writeup.**

In [None]:
def simulate_one_game(target_score, a_score, b_score, p):
    '''Simulates the remainder of a single game that has stopped, given:
       - a target score (target_score),
       - the number of points that Player A currently has (a_score),
       - the number of points that Player B currently has (b_score), and
       - the probability that Player A wins a single round (p)
       
       Returns True if Player A wins and False if Player B wins.
    '''
    ...

Once you've completed the implementation of `simulate_one_game`, run the cells below.

In [None]:
def simulate_many(target_score, a_score, b_score, p, reps=10000):
    '''Repeatedly calls simulate_one_game on the same arguments
       and returns the proportion of simulated games that were won by
       Player A.
    '''
    wins = 0
    
    for i in np.arange(reps):
        wins += simulate_one_game(target_score, a_score, b_score, p)
        
    return wins / reps

What's the **simulated** probability of Player A winning, if the game stops when Player A is up 8-7, the target score is 10, and Player A wins each round with $p = 0.8$?

In [None]:
simulate_many(10, 8, 7, 0.8)

This looks pretty close to the theoretical probability we found using `stop_game(10, 8, 7, 0.8)`!

In [None]:
stop_game(10, 8, 7, 0.8)

Nice job.