In [0]:
#@title Imports
!pip install -q symbulate
from symbulate import *

# Conditional Probability as Information

You know that your coworker has two children. What is the probability that both are boys?

In [0]:
model = BoxModel(["B", "G"], size=2, replace=True)
model.sim(10000).tabulate()

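The four outcomes BB, BG, GB, and GG are equally likely under this model, and only one of them has two boys, so

$$P(\text{both boys}) = \frac{1}{4}.$$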

One day, she mentions to you, "I need to stop by St. Joseph's after work for a PTA meeting." St. Joseph's is a local all-boys school. So now you know that she has at least one boy.

To quantify how probabilities change in light of new information, we calculate the **conditional probability**.

$$ P(\text{both boys}\ |\ \text{at least one boy}) $$

The $|$ symbol is read "given" and the event after the $|$ symbol represents information that we know. 

In general, to calculate a conditional probability, we use the formula:

$$ P(B | A) = \frac{P(A \cap B)}{P(A)}. $$

The $\cap$ symbol means "and". (You can remember this because $\cap$ looks like the letter "n", which should remind you of "and".) The probability $P(A \cap B)$ is called a **joint probability**.

So the conditional probability above is 

$$ P(\text{both boys}\ |\ \text{at least one boy}) = \frac{P(\text{both boys} \cap \text{at least one boy})}{P(\text{at least one boy})} = \frac{P(\text{both boys})}{P(\text{at least one boy})} = \frac{1/4}{3/4} = \frac{1}{3}. $$

In the above example, the joint probability $P(\text{both boys} \cap \text{at least one boy})$ is easy to calculate because the first event implies the second: if both children are boys, then at least one is a boy. The intersection is therefore just the event that both are boys.

The information that at least one of her children attends St. Joseph's (and, thus, is a boy) increases the probability that she has two boys from $1/4$ to $1/3$.
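
The calculation above can also be checked exactly by listing the four equally likely outcomes. This is a minimal sketch in plain Python rather than Symbulate. The helper names `prob`, `is_both_boys`, and `has_a_boy` are made up for this illustration.

```python
from fractions import Fraction
from itertools import product

# All equally likely (first child, second child) outcomes: BB, BG, GB, GG.
outcomes = list(product("BG", repeat=2))

def prob(event):
    """Exact probability of an event when all outcomes are equally likely."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def is_both_boys(children):
    return all(child == "B" for child in children)

def has_a_boy(children):
    return any(child == "B" for child in children)

p_a = prob(has_a_boy)                                         # P(A)
p_joint = prob(lambda c: is_both_boys(c) and has_a_boy(c))    # P(A and B)
p_cond = p_joint / p_a                                        # P(B | A)

print(p_a, p_joint, p_cond)  # prints: 3/4 1/4 1/3
```

Using `Fraction` keeps the arithmetic exact, so the result matches the formula rather than approximating it.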

In [0]:
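def at_least_one_boy(children):
  # True if any child in the simulated outcome is a boy
  return any(child == "B" for child in children)

# Keep only the simulated families with at least one boy, then count the outcomes
model.sim(10000).filter(at_least_one_boy).tabulate()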

# Multiplication Rule

The conditional probability formula can be rearranged to produce the following formula:

$$ P(A \cap B) = P(A) P(B | A). $$ 

This formula is convenient when the conditional probability $P(B | A)$ is known but $P(A \cap B)$ is not. 

This formula is called the **multiplication rule** because it says that we can _multiply_ probabilities to get joint probabilities.

## Applying the Multiplication Rule

On an exam where every question is multiple choice with 5 answer choices, you know the correct answer to 60% of the questions. For the remaining questions, you guess one of the 5 choices at random.

What is the probability a randomly chosen question is one you got right by guessing?

\begin{align} 
P(\text{don't know answer} \cap \text{correct}) &= P(\text{don't know answer}) P(\text{correct}\ |\ \text{don't know answer}) \\
&= (1 - 0.6) \cdot \frac{1}{5} \\
&= 0.08
\end{align}
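
As a rough check on this calculation, the sketch below computes the product exactly and then simulates many exam questions in plain Python. The variable names, the seed, and the number of simulated questions are arbitrary choices for illustration. Treating the first of the 5 choices as the correct one is an assumption that does not affect the probability.

```python
import random
from fractions import Fraction

# Exact answer from the multiplication rule.
p_know = Fraction(3, 5)                  # P(know answer) = 0.6
p_correct_given_guess = Fraction(1, 5)   # P(correct | don't know answer)
p_guess_and_correct = (1 - p_know) * p_correct_given_guess
print(p_guess_and_correct)  # prints: 2/25

# Simulation: each question is known with probability 0.6; otherwise
# one of the 5 choices is picked at random (choice 0 plays the correct one).
random.seed(0)
n = 100_000
hits = 0
for _ in range(n):
    knows = random.random() < 0.6
    if not knows and random.randrange(5) == 0:
        hits += 1

estimate = hits / n
print(estimate)  # should be close to 0.08
```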

## Linda Example

> Linda is 31 years old, single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations.

Which alternative is more probable?

1. Linda is a bank teller.
2. Linda is a bank teller and is active in the feminist movement.
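
One way to connect this question to the multiplication rule is sketched below. The event labels $T$ (Linda is a bank teller) and $F$ (Linda is active in the feminist movement) are introduced here only for illustration.

$$ P(T \cap F) = P(T) P(F | T) \leq P(T), $$

because the conditional probability $P(F | T)$ can be at most 1. Whatever the description suggests about Linda, the second alternative cannot be more probable than the first.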

# Example (combining both formulas)

A fair coin is tossed 10 times and comes up heads exactly 5 times. What is the probability that there were exactly 2 heads in the first 4 tosses?

\begin{align}
P(\text{2 H in first 4 tosses}\ |\ \text{5 H in 10 tosses}) &= \frac{P(\text{2 H in first 4 tosses} \cap \text{5 H in 10 tosses})}{P(\text{5 H in 10 tosses})} \\
&= \frac{P(\text{2 H in first 4 tosses}) P( \text{5 H in 10 tosses}\ |\ \text{2 H in first 4 tosses})}{P(\text{5 H in 10 tosses})} \\
&= \frac{P(\text{2 H in first 4 tosses}) P(\text{3 H in last 6 tosses})}{P(\text{5 H in 10 tosses})} \\
&= \frac{\binom{4}{2} (.5)^2 (1-.5)^{4-2} \cdot \binom{6}{3} (.5)^3 (1 - .5)^{6-3}}{\binom{10}{5} (.5)^5 (1 - .5)^{10-5}} \\
&\approx .476
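\end{align}

The second line applies the multiplication rule to the numerator. The third line uses the independence of the tosses: given 2 heads in the first 4 tosses, there are 5 heads in total exactly when the last 6 tosses contain 3 heads.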
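
The arithmetic in the last step can be reproduced with a few lines of plain Python. This is a minimal sketch; `binom_pmf` is a helper defined here for illustration, not a Symbulate function.

```python
from math import comb

def binom_pmf(k, n, p=0.5):
    """Probability of exactly k heads in n independent tosses with P(heads) = p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Numerator: 2 heads in the first 4 tosses and 3 heads in the last 6 tosses.
numerator = binom_pmf(2, 4) * binom_pmf(3, 6)
# Denominator: 5 heads in all 10 tosses.
denominator = binom_pmf(5, 10)

p_two_given_five = numerator / denominator
print(round(p_two_given_five, 3))  # prints: 0.476
```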

In [0]:
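# Each outcome is 10 tosses of a fair coin, coded 1 = heads and 0 = tails
model = BoxModel([0, 1], size=10, replace=True)

def five_heads_total(tosses):
  return sum(tosses) == 5

def heads_in_first_4_tosses(tosses):
  return sum(tosses[:4])

# Condition on exactly 5 heads overall, then tabulate the heads in the first 4 tosses
(model.sim(1000000)
 .filter(five_heads_total)
 .apply(heads_in_first_4_tosses)
 .tabulate()
)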

Find the _distribution_ of the number of heads in the first 4 tosses, if there are 5 heads in 10 tosses. Is this a named distribution that we learned in this class? 

(To do this, derive an expression for $p[x]$ in terms of $x$. Note that we essentially calculated $p[2]$ above.)