# Week 2 Exercises

These exercises are intended to increase your familiarity with Python and the concepts of the course.

## Evil Monty Hall

Consider the Monty Hall problem covered in class.  

1.  Write a simulation for an extended version with four doors rather than three.  There is one car and three goats, and the setup is otherwise the same: the contestant picks a door, Monty opens a different door to reveal a goat, and the contestant then sticks or switches.  What is the probability of winning (ending up with the door that hides the car) if the contestant switches, and if the contestant sticks?  Answer by simulation, not math.

2.  Same as above, but now there are 2 goats and 2 cars randomly hidden behind the four doors.  Again, simulate the probability of winning a car (either of the two) if the contestant switches or sticks.

In [104]:
# Four door Monty Hall

import random 

def monty_hall_simulation_4(num_trials=10000):
    stick_wins = 0
    switch_wins = 0
    
    for _ in range(num_trials):
        # All doors
        doors = {1, 2, 3, 4}

        # Randomly place a car behind one door
        car_position = random.choice(list(doors))
        
        # The contestant makes a random choice
        contestant_choice = random.choice(list(doors))
        
        # Doors Monty can open to ensure he shows a smelly goat
        monty_can_open = doors - {contestant_choice, car_position}
        monty_opens = random.choice(list(monty_can_open))
        
        # The door that the contestant switches to if they choose to switch
        switch_choice = random.choice(list(doors - {contestant_choice, monty_opens}))
        
        # Check the outcomes
        if contestant_choice == car_position:
            stick_wins += 1
        if switch_choice == car_position:
            switch_wins += 1

    print(f"Probability of winning if you stick: {stick_wins/num_trials}")
    print(f"Probability of winning if you switch: {switch_wins/num_trials}")

monty_hall_simulation_4()

Probability of winning if you stick: 0.2512
Probability of winning if you switch: 0.3765
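
As a cross-check on the simulated values, the four-door game is small enough to enumerate exactly. The sketch below is not part of the original solution; the helper name `exact_four_door` is hypothetical, and it assumes the same rules as the simulation (Monty picks uniformly among the goat doors the contestant did not choose, and a switching contestant picks uniformly among the remaining closed doors).

```python
from fractions import Fraction
from itertools import product

def exact_four_door():
    """Enumerate every outcome of the four-door game with exact weights."""
    doors = {1, 2, 3, 4}
    stick = Fraction(0)
    switch = Fraction(0)
    # Car position and first choice are independent and uniform: 16 equally likely pairs
    for car, choice in product(sorted(doors), repeat=2):
        p_start = Fraction(1, 16)
        # Monty never opens the contestant's door or the car's door
        monty_options = sorted(doors - {car, choice})
        for monty in monty_options:
            p_monty = p_start / len(monty_options)
            if choice == car:
                stick += p_monty
            # A switching contestant picks uniformly among the other closed doors
            switch_options = sorted(doors - {choice, monty})
            for new_choice in switch_options:
                if new_choice == car:
                    switch += p_monty / len(switch_options)
    return stick, switch

stick, switch = exact_four_door()
print(stick, switch)  # 1/4 3/8
```

The exact values, 0.25 and 0.375, sit close to the simulated 0.2512 and 0.3765.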


In [321]:
# N Car Monty

def n_car_monty_hall_simulation(num_cars, num_doors, num_trials=10000):
    stick_wins = 0
    switch_wins = 0
    
    for _ in range(num_trials):
        # All doors
        doors = set(range(1, num_doors+1))

        # Randomly place cars behind doors
        car_positions = set(random.sample(list(doors), num_cars))

        # The contestant makes a random choice
        contestant_choice = set(random.sample(list(doors), 1))

        # Doors Monty can open to ensure he shows a smelly goat
        monty_can_open = doors - car_positions - set(contestant_choice)
        monty_opens = set(random.sample(list(monty_can_open), 1))

        # The doors that the contestant switches to if they choose to switch
        can_switch_choice = doors  - contestant_choice - monty_opens
        switch_choice = random.sample(list(can_switch_choice), 1).pop()
        
        # Check the outcomes
        if contestant_choice <= car_positions:
            stick_wins += 1
        if switch_choice in car_positions:
            switch_wins += 1
 
        #print(f"Doors:{doors} -- Car:{car_positions} -- Contestant:{contestant_choice} -- Monty:{monty_opens} -- Switch:{switch_choice} -- {stick_wins},{switch_wins}")
    
    print(f"Probability of winning if you stick: {stick_wins/num_trials}")
    print(f"Probability of winning if you switch: {switch_wins/num_trials}")

n_car_monty_hall_simulation(3, 8)

Probability of winning if you stick: 0.3732
Probability of winning if you switch: 0.432
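
The general game also has a simple closed form that can be used to sanity-check the simulation. The sketch below is an addition, not part of the original solution; `exact_n_car` is a hypothetical helper, and it assumes there are at least two goats so that Monty always has a goat door to open. Sticking wins with probability $c/n$; switching wins by conditioning on whether the first pick was a car.

```python
from fractions import Fraction

def exact_n_car(num_cars, num_doors):
    """Exact stick/switch win probabilities for c cars behind n doors.

    Assumes num_doors - num_cars >= 2, so Monty can always reveal a goat.
    """
    c, n = num_cars, num_doors
    stick = Fraction(c, n)
    # After Monty opens a goat door, n - 2 doors remain to switch to.
    # First pick was a car: c - 1 cars remain among them.
    # First pick was a goat: all c cars remain among them.
    switch = stick * Fraction(c - 1, n - 2) + (1 - stick) * Fraction(c, n - 2)
    return stick, switch

for cars, doors in [(1, 4), (2, 4), (3, 8)]:
    stick, switch = exact_n_car(cars, doors)
    print(f"{cars} cars, {doors} doors: stick={float(stick):.4f}, switch={float(switch):.4f}")
```

For 3 cars and 8 doors this gives 0.3750 and 0.4375, in line with the simulated output above; for the 2-car, 4-door case in the exercise it gives 0.5 and 0.75.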


## Base Rate Fallacy

We looked at Bayes' Rule in the lecture; this exercise applies it.

The engineers over at Google have created an AI algorithm that can detect a certain disease using an iPhone app. It has a true positive rate of 99% and a false positive rate of 5%.

$$ P(T^+|D^+) = 0.99$$
$$ P(T^+|D^-) = 0.05$$

where $T^+$ is the event of a positive test (on the app) for a person, $D^+$ is the event that the person has the disease (SCD), and $D^-$ is the event that the person does not have it.

In the general population the prevalence of the disease is about 1 in 100,000, i.e. $P(D^+) = \frac{1}{100000}$.

So, if a randomly selected person runs the app and it comes back positive, what is the probability that they have SCD?

1.  Solve using math

2.  Solve using simulation


## Math Solution
Using Bayes' Rule and then the Law of Total Probability:
\begin{align*}
P(D^+|T^+) &= \frac{P(D^+) P(T^+|D^+)}{P(T^+)} \\
 &= \frac{P(D^+) P(T^+|D^+)}{P(T^+|D^+)P(D^+) + P(T^+|D^-)P(D^-)} \\
 &= \frac{\frac{1}{100000}\cdot 0.99}{0.99\cdot\frac{1}{100000} + 0.05 \cdot \frac{99999}{100000}} \\
 &\approx 0.0198\%
\end{align*}
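
The arithmetic in the last step can be checked with a few lines of Python. This is only a sketch; the function name `posterior` is an assumption introduced here, not something from the lecture.

```python
def posterior(prevalence, tpr, fpr):
    """P(D+|T+) via Bayes' Rule with the Law of Total Probability in the denominator."""
    p_positive = tpr * prevalence + fpr * (1 - prevalence)
    return tpr * prevalence / p_positive

print(f"{posterior(1/100000, 0.99, 0.05) * 100:.4f}%")  # 0.0198%
```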

In [370]:
# Simulation of the base-rate problem

import numpy as np

# Set the parameters (matching the problem statement above)
true_positive_rate = 0.99
false_positive_rate = 0.05
prevalence = 1/100000

# Generate a large random sample of individuals
n_samples = 10000000
# Each uniform draw falls below `prevalence` with probability `prevalence`,
# so this is a boolean array marking who actually has the disease
actual_disease = np.random.rand(n_samples) < prevalence

# Generate the test results based on the true disease state and the test characteristics
test_results = np.where(actual_disease, 
                        np.random.rand(n_samples) < true_positive_rate, 
                        np.random.rand(n_samples) < false_positive_rate)

# Calculate the conditional probability P(D+|T+) using the simulation results
positive_test_results = test_results.sum()
true_positives = (test_results & actual_disease).sum()

conditional_probability = true_positives / positive_test_results
# The estimate is noisy because only about 100 of the 10 million people are sick,
# but it should land near the analytical value of about 0.0198%
print(f"P(D+|T+) = {conditional_probability * 100:.4f}%")


Extra credit.   This disease is actually Sickle Cell Disease, where the prevalence in the African American community is much higher: 1 in 365.   So if you are African American and the test comes back positive, what is the probability you have the disease?

3. Solve with math

4. Solve using simulation

5. Discuss some implications (a single paragraph, tops)

## Math Solution
Using Bayes' Rule and then the Law of Total Probability:
\begin{align*}
P(D^+|T^+) &= \frac{P(D^+) P(T^+|D^+)}{P(T^+)} \\
 &= \frac{P(D^+) P(T^+|D^+)}{P(T^+|D^+)P(D^+) + P(T^+|D^-)P(D^-)} \\
 &= \frac{\frac{1}{365}\cdot 0.99}{0.99\cdot\frac{1}{365} + 0.05 \cdot \frac{364}{365}} \\
 &\approx 5.2\%
\end{align*}
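
For the simulation part of the extra credit, one possible sketch is below. It is an addition rather than the original solution: it uses only the standard-library `random` module, and the function name `simulate_posterior`, the sample size, and the seed are assumptions chosen for illustration.

```python
import random

def simulate_posterior(prevalence, tpr, fpr, n_samples=500_000, seed=0):
    """Estimate P(D+|T+) by simulating people one at a time."""
    rng = random.Random(seed)
    positives = 0
    true_positives = 0
    for _ in range(n_samples):
        # Disease status first, then a test result that depends on it
        has_disease = rng.random() < prevalence
        tests_positive = rng.random() < (tpr if has_disease else fpr)
        if tests_positive:
            positives += 1
            true_positives += has_disease
    return true_positives / positives

estimate = simulate_posterior(1/365, 0.99, 0.05)
print(f"P(D+|T+) = {estimate * 100:.2f}%")
```

The estimate should fall near the analytical value of roughly 5%, with some run-to-run noise if the seed is changed.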

Discussion here

## Not so Lucky Luke

Luke is gambling with a biased coin that comes up Heads with probability 40%. Tails counts as a win.

1.  What is the probability that Luke wins (flips Tails) on the first flip?
2.  What is the probability that Luke loses 3 flips (Heads) in a row before winning (Tails) on the fourth?
3.  Suppose the fourth flip comes up Heads as well. What is the probability that the next flip comes up Heads?

Solve however you like.

## Q1
p = P(Tails) = 1 - 0.4 = 0.6

## Q2
Let $X$ be the number of Heads before the first Tails. Then $X$ follows a geometric distribution with success probability $p = 0.6$, and we need $P(X=3)$:
\begin{align*}
X\sim geo(p) \implies P(X=k) &= (1-p)^{k}p\\
P(X=3) &= 0.4^3\cdot0.6\\
&= 0.0384
\end{align*}
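
The same number can be reproduced with a one-line implementation of the geometric probability mass function. The helper name `geometric_pmf` is hypothetical and only meant as a quick arithmetic check.

```python
def geometric_pmf(k, p):
    """P(X = k): k failures, each with probability 1 - p, then one success."""
    return (1 - p) ** k * p

print(f"{geometric_pmf(3, 0.6):.4f}")  # 0.0384
```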

## Q3
It's still 0.4, since the flips are independent events.  There are no 'streaks'.
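
The independence claim can also be checked empirically. The following is a rough sketch, with the function name `heads_after_streak`, the trial count, and the seed all chosen here for illustration: it flips the coin five times per trial and looks only at trials whose first four flips were Heads.

```python
import random

def heads_after_streak(p_heads=0.4, streak=4, n_trials=200_000, seed=1):
    """Estimate P(next flip is Heads | previous `streak` flips were all Heads)."""
    rng = random.Random(seed)
    streaks_seen = 0
    heads_next = 0
    for _ in range(n_trials):
        flips = [rng.random() < p_heads for _ in range(streak + 1)]
        # Keep only the trials that start with a full run of Heads
        if all(flips[:streak]):
            streaks_seen += 1
            heads_next += flips[streak]
    return heads_next / streaks_seen

print(f"{heads_after_streak():.3f}")
```

The estimate should hover around 0.4, the same as the unconditional probability of Heads.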