# Notebook 5: Simulation with Conditional and Total Probability Solutions
***

In this notebook we'll get some more practice with conditional probabilities, total probability, and the product rule.  We'll also see how we can do some simple random simulations using NumPy to verify our results. 

We'll need NumPy for this notebook, so let's load it.  Pandas may come in handy as well, so we'll load that too.

In [1]:
import numpy as np 
import pandas as pd

### Example - Estimating the Probability of Drawing Balls from Boxes
***
In class we solved the following problem: 

S’pose we have two boxes filled with green and red balls.  Box 1 contains 2 green balls and 7 red balls.  Box 2 contains 4 green balls and 3 red balls. Paul selects a ball by first choosing one of the two boxes at random. He then selects one of the balls in this box at random. What is the probability Paul has selected a red ball, if he is twice as likely to pick Box 1 as Box 2?

The following code runs a simple simulation to estimate the probability of drawing a ball of a particular color.  Run the code and verify that it agrees with the by-hand computation.

\begin{align*}
P(\text{red}) &= P(\text{red} \mid B_1)P(B_1) + P(\text{red} \mid B_2)P(B_2) \\
          &= \frac{7}{9} \cdot \frac{2}{3} + \frac{3}{7} \cdot \frac{1}{3} \\
          &= \frac{14}{27}+\frac{1}{7} \\
          &\approx 0.6614
\end{align*}
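As a quick sanity check on the arithmetic, the total-probability sum can also be evaluated exactly. The following is a minimal sketch using Python's standard `fractions` module; the variable names are illustrative and not part of the class code.

```python
from fractions import Fraction

# P(B1) and P(B2): Box 1 is twice as likely to be chosen as Box 2
p_box1, p_box2 = Fraction(2, 3), Fraction(1, 3)

# P(red | B1) and P(red | B2), read off the box contents
p_red_given_box1 = Fraction(7, 9)   # 7 red out of 9 balls
p_red_given_box2 = Fraction(3, 7)   # 3 red out of 7 balls

# Law of total probability
p_red = p_red_given_box1 * p_box1 + p_red_given_box2 * p_box2

print(p_red, float(p_red))
```

Working in exact fractions avoids any rounding until the final conversion to a decimal.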

In [2]:
box1 = {'balls' : np.array(["green", "red"]), 'probs' : np.array([2/9, 7/9])}
box2 = {'balls' : np.array(["green", "red"]), 'probs' : np.array([4/7, 3/7])}
box_choices = {'boxes' : np.array([box1, box2]), 'probs' : np.array([2/3, 1/3])}

def sample_ball(box_choices):
    # randomly choose a box
    box = np.random.choice(box_choices['boxes'], p = box_choices['probs'])
    # randomly choose a ball from that box
    return np.random.choice(box['balls'], p = box['probs'])

def probability_of_color(color, box_choices, num_samples=1000):
    # get a bunch of balls
    balls = np.array([sample_ball(box_choices) for ii in range(num_samples)])
    # compute fraction of balls of desired color 
    return np.sum(balls == color) / num_samples

In [3]:
probability_of_color("red", box_choices, num_samples=50000)

0.66232

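You should find something quite close to what we got in class, which is about 0.661.  With 50,000 samples the standard error of the estimate is roughly 0.002, so agreement to about two decimal places is typical.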

### Exercise 1 - More Colors! 
*** 

Suppose now we add a third color to the mix.  Box 1 now contains 2 green balls, 7 red balls, and 5 purple balls.  Box 2 now contains 4 green balls, 3 red balls, and 5 purple balls.  The probability of grabbing the first box is still twice the probability of grabbing the second box.  

**Part A**: What is the probability of drawing a purple ball?  Try working this out by hand.

**Format your Probability Here**

**Part B**: Next, copy and paste the code from the example above and modify it to estimate the probability that you just computed and check your work. 

In [4]:
#Grab your code from Example above and modify it here...

In [5]:
probability_of_color("purple", box_choices, num_samples=50000)

0.0
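An estimate of 0.0 is what you get if the boxes still contain only green and red balls. One possible way to adapt the example is sketched below, assuming the same dictionary layout as before. The seeded `np.random.default_rng` generator and the `seed` parameter are additions for reproducibility and are not part of the original example.

```python
import numpy as np

colors = np.array(["green", "red", "purple"])

# Box 1: 2 green, 7 red, 5 purple (14 balls)
# Box 2: 4 green, 3 red, 5 purple (12 balls)
box1 = {'balls': colors, 'probs': np.array([2, 7, 5]) / 14}
box2 = {'balls': colors, 'probs': np.array([4, 3, 5]) / 12}
boxes = [box1, box2]
box_probs = np.array([2/3, 1/3])  # Box 1 is twice as likely as Box 2

def sample_ball(rng):
    # randomly choose a box (by index), then a ball from that box
    box = boxes[rng.choice(len(boxes), p=box_probs)]
    return rng.choice(box['balls'], p=box['probs'])

def probability_of_color(color, num_samples=1000, seed=None):
    # seed is a hypothetical extra parameter, added only for reproducibility
    rng = np.random.default_rng(seed)
    balls = np.array([sample_ball(rng) for _ in range(num_samples)])
    # fraction of sampled balls with the desired color
    return np.mean(balls == color)

purple_estimate = probability_of_color("purple", num_samples=50000, seed=0)

# by-hand value from the law of total probability, for comparison
purple_exact = (5/14) * (2/3) + (5/12) * (1/3)
print(purple_estimate, purple_exact)
```

The two printed numbers should agree to roughly two decimal places.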

### Exercise 2 - Estimating Conditional Probabilities 
***

Suppose you roll a fair die two times.  Let $A$ be the event "the sum of the throws equals 4" and $B$ be the event "at least one of the throws is a $3$".

**Part A**: Compute (by hand) the probability that the sum of the throws equals 4 _given_ that at least one of the throws is a 3.  That is, compute $P(A \mid B)$. 

**Insert your response here**

**Part B**: Let's see if we can write a simple simulation to confirm our result.  The following code runs a simulation to estimate $P(A)$, i.e. the probability that if you roll a fair six-sided die twice the result will sum to 4.  Your job is to modify the code so that it estimates the conditional probability $P(A \mid B)$. **Hint**: Think about the definition of conditional probability.

*Hint:  the NumPy functions `np.logical_or` and `np.logical_and` are potentially useful.*

In [6]:
die = np.array([1,2,3,4,5,6])

num_samples = 100000
roll1 = np.random.choice(die, size=num_samples)
roll2 = np.random.choice(die, size=num_samples)
sum_to_four = (roll1 + roll2) == 4

sum_to_four_prob = np.sum(sum_to_four)/num_samples
print("The probability of rolling a sum-to-four is approximately {:.3f}".format(sum_to_four_prob))

The probability of rolling a sum-to-four is approximately 0.084
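One way to turn this into an estimate of $P(A \mid B)$ is to use the definition $P(A \mid B) = P(A \cap B) / P(B)$ and count only the samples in which $B$ occurred. The sketch below follows that idea. The seeded generator and the names `at_least_one_three`, `both`, and `cond_prob` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # the seed is an arbitrary choice
die = np.array([1, 2, 3, 4, 5, 6])
num_samples = 100000

roll1 = rng.choice(die, size=num_samples)
roll2 = rng.choice(die, size=num_samples)

# event A: the two rolls sum to 4
sum_to_four = (roll1 + roll2) == 4
# event B: at least one of the rolls is a 3
at_least_one_three = np.logical_or(roll1 == 3, roll2 == 3)
# event A and B: both conditions hold on the same pair of rolls
both = np.logical_and(sum_to_four, at_least_one_three)

# P(A | B) is estimated by (# samples in A and B) / (# samples in B)
cond_prob = np.sum(both) / np.sum(at_least_one_three)
print("P(A | B) is approximately {:.3f}".format(cond_prob))
```

Note that the denominator is the number of samples where $B$ occurred, not `num_samples`, so the estimate rests on fewer effective samples than the estimate of $P(A)$.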


### Exercise 3 - The Ol' Marble Switcharoo
***
A marble is drawn at random from a bag containing one black and one white marble.  If the white marble is drawn it is put back into the bag.  If the black marble is drawn, it is returned to the bag along with two **more** black marbles.  Then a second draw is made. What is the probability a black marble was drawn on **both** the first and the second draws?  

**Part A**: Let's do this one in reverse order.  First, see if you can write a simple simulation to estimate this probability. 

**Part B**: Now carry out the computation by hand and see if they agree. 
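A minimal sketch of one possible simulation follows. It models the bag literally as a list of marbles and repeats the two-draw experiment many times. The function name `draw_twice`, the seed, and the sample count are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)  # the seed is an arbitrary choice

def draw_twice(rng):
    # the bag starts with one black and one white marble
    bag = ["black", "white"]
    first = rng.choice(bag)
    # a black marble goes back in along with two more black marbles;
    # a white marble goes back in and the bag is unchanged
    if first == "black":
        bag = bag + ["black", "black"]
    second = rng.choice(bag)
    return bool(first == "black" and second == "black")

num_samples = 20000
both_black = [draw_twice(rng) for _ in range(num_samples)]

# fraction of trials in which both draws were black
estimate = np.mean(both_black)
print("P(black on both draws) is approximately {:.3f}".format(estimate))
```

For the by-hand comparison, the product rule gives $P(B_1 \cap B_2) = P(B_1)\,P(B_2 \mid B_1) = \frac{1}{2} \cdot \frac{3}{4} = \frac{3}{8}$, where $B_i$ denotes "black on draw $i$".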

### Exercise 4 - Outcomes and Set Operations
***
Suppose we run an experiment in which we toss a coin 3 times.  The sample space for this experiment is 

$$
\Omega = \{HHH, ~ THH, ~HTH, ~HHT, ~TTH, ~THT, ~HTT, ~TTT\}
$$

**Part A**: Write down the set of outcomes corresponding to each of the following: 
1. $A$: "we throw tails exactly two times"
2. $B$: "we throw tails at least two times"
3. $C$: "tails did not appear _before_ a head appeared" 
4. $D$: "the first throw results in tails" 


**Part B**: Write down the set of outcomes corresponding to each of the following events: 
1. $A^c$ 
2. $A \cup (C \cap D)$
3. $A \cap D^{~c}$

**Part C**: Finally, let $E$ be the event that $A$ or $C$ occurs, but not both.  Express $E$ in terms of $A$ and $C$, using only the basic set operations "union", "intersection", and "complement".  Then list the set of outcomes from the experiment that are included in $E$. 
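Python's built-in `set` type supports union (`|`), intersection (`&`), and difference (`-`), so the events can be enumerated and checked mechanically. The sketch below is one way to do that. It assumes that $C$ is read as "the first toss is heads" (so $TTT$ is not in $C$); a different reading of $C$ would change the results.

```python
from itertools import product

# Sample space: all length-3 strings over {H, T}
omega = {"".join(tosses) for tosses in product("HT", repeat=3)}

A = {w for w in omega if w.count("T") == 2}   # tails exactly two times
B = {w for w in omega if w.count("T") >= 2}   # tails at least two times
# Assumption: C is interpreted as "the first toss is heads"
C = {w for w in omega if w[0] == "H"}
D = {w for w in omega if w[0] == "T"}         # first throw is tails

# Part B: the complement is taken relative to omega
A_complement = omega - A
A_or_C_and_D = A | (C & D)
A_and_not_D = A & (omega - D)

# Part C: A or C occurs, but not both
E = (A & (omega - C)) | ((omega - A) & C)

for name, event in [("A^c", A_complement), ("A u (C n D)", A_or_C_and_D),
                    ("A n D^c", A_and_not_D), ("E", E)]:
    print(name, sorted(event))
```

The expression used for `E` translates to $E = (A \cap C^c) \cup (A^c \cap C)$, which uses only union, intersection, and complement.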