# 2. Conditional probability

## Brief summary

### Conditional probability

If $A$ and $B$ are events with $P(B) > 0$, then the *conditional probability* of $A$ given $B$, denoted by $P(A|B)$, is defined as

\begin{equation}
P(A|B) = \frac{P(A \cap B)} {P(B)}
\end{equation}

- $P(A)$: *prior* probability of $A$
- $P(A|B)$: *posterior* probability of $A$ (after updating based on the *evidence* $B$)
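
The definition can be checked by direct counting when all outcomes are equally likely. The sketch below is only an illustration: the two-dice sample space and the helper names `prob` and `cond_prob` are assumptions, not part of the original notes.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of two fair dice (illustrative choice)
sample_space = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event (a predicate on outcomes) under equally likely outcomes."""
    favorable = sum(1 for outcome in sample_space if event(outcome))
    return Fraction(favorable, len(sample_space))

def cond_prob(event_a, event_b):
    """P(A|B) = P(A and B) / P(B); requires P(B) > 0."""
    p_b = prob(event_b)
    if p_b == 0:
        raise ValueError("P(B) must be positive")
    return prob(lambda o: event_a(o) and event_b(o)) / p_b

def total_is_8(outcome):        # event A
    return outcome[0] + outcome[1] == 8

def first_is_even(outcome):     # event B (the evidence)
    return outcome[0] % 2 == 0

print(prob(total_is_8))                      # prior P(A)
print(cond_prob(total_is_8, first_is_even))  # posterior P(A|B)
```

Here the prior is 5/36 and the posterior is 1/6, so learning that the first die is even raises the probability that the total is 8.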


### Bayes' rule

\begin{equation}
P(A|B) = \frac{P(B|A)P(A)} {P(B)}
\end{equation}
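
As a small worked example of Bayes' rule, consider a diagnostic test. All numbers below are hypothetical and chosen only for illustration, and `bayes` is a helper name introduced here.

```python
from fractions import Fraction

def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) = P(B|A) P(A) / P(B); requires P(B) > 0."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_b_given_a * p_a / p_b

# Hypothetical numbers, for illustration only
p_disease = Fraction(1, 100)              # P(A): prior probability of the disease
p_pos_given_disease = Fraction(95, 100)   # P(B|A): positive test given disease
p_pos_given_healthy = Fraction(5, 100)    # P(B|A^c): false positive rate

# P(B): overall probability of a positive test, split over disease / no disease
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)

posterior = bayes(p_pos_given_disease, p_disease, p_pos)
print(posterior, float(posterior))
```

Under these assumed numbers the posterior is 19/118, roughly 0.16: most positive results come from the much larger healthy group.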


### Odds

The *odds* of an event $A$ are

\begin{equation}
\text{odds}(A) = P(A) / P(A^c)
\end{equation}
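
Converting between probabilities and odds is a one-line calculation in each direction. This is a minimal sketch; the function names `odds` and `prob_from_odds` are assumptions made for this example.

```python
from fractions import Fraction

def odds(p):
    """odds(A) = P(A) / P(A^c), defined when P(A) < 1."""
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1 - p)

def prob_from_odds(o):
    """Invert the definition: P(A) = odds(A) / (1 + odds(A))."""
    if o < 0:
        raise ValueError("odds must be non-negative")
    return o / (1 + o)

p = Fraction(2, 3)
print(odds(p))                   # odds of 2, i.e. "2 to 1"
print(prob_from_odds(odds(p)))   # recovers 2/3
```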


### The Law of total probability (LOTP)

Let $A_1, \dots, A_n$ be a partition of the sample space $S$ (i.e., the $A_i$ are disjoint events and their union is $S$), with $P(A_i) > 0$ for all $i$. Then

\begin{equation}
P(B) = \sum_{i=1}^{n} P(B|A_i)P(A_i)
\end{equation}
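
The sum can be written as a short helper. The sketch below uses a hypothetical scenario (three machines with made-up production shares and defect rates); `total_probability` is a name introduced for this example.

```python
from fractions import Fraction

def total_probability(likelihoods, priors):
    """P(B) = sum_i P(B|A_i) P(A_i) for a partition A_1, ..., A_n."""
    if len(likelihoods) != len(priors):
        raise ValueError("need one likelihood per partition event")
    if abs(sum(priors) - 1) > 1e-12:
        raise ValueError("the P(A_i) must sum to 1 for a partition")
    return sum(l * p for l, p in zip(likelihoods, priors))

# Hypothetical numbers: A_i = "item was made by machine i", B = "item is defective"
priors = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]            # P(A_i)
likelihoods = [Fraction(1, 100), Fraction(2, 100), Fraction(3, 100)]  # P(B|A_i)

p_defect = total_probability(likelihoods, priors)
print(p_defect, float(p_defect))
```

With these assumed inputs, $P(B) = 17/1000$.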


### Independence of two events

Events $A$ and $B$ are *independent* if

\begin{equation}
P(A \cap B) = P(A)P(B)
\end{equation}

If $P(A) > 0$ and $P(B) > 0$, then this is equivalent to

\begin{equation}
P(A|B) = P(A), \quad P(B|A) = P(B)
\end{equation}
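
Independence can be tested by comparing $P(A \cap B)$ with $P(A)P(B)$ directly. The following sketch assumes a two-dice sample space with equally likely outcomes; the events and the helper `is_independent` are illustrative choices.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of two fair dice (illustrative sample space)
sample_space = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event (a predicate on outcomes) by counting."""
    return Fraction(sum(1 for o in sample_space if event(o)), len(sample_space))

def is_independent(event_a, event_b):
    """Check the definition P(A and B) == P(A) P(B) exactly."""
    p_ab = prob(lambda o: event_a(o) and event_b(o))
    return p_ab == prob(event_a) * prob(event_b)

def first_is_even(o):
    return o[0] % 2 == 0

def total_is_7(o):
    return o[0] + o[1] == 7

def total_is_8(o):
    return o[0] + o[1] == 8

print(is_independent(first_is_even, total_is_7))  # True
print(is_independent(first_is_even, total_is_8))  # False
```

A total of 7 is equally likely whatever the first die shows, whereas a total of 8 is impossible when the first die is 1, so the first die carries information about it.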


### Conditional independence

Events $A$ and $B$ are said to be *conditionally independent* given $E$ if 

\begin{equation}
P(A \cap B|E) = P(A|E)P(B|E)
\end{equation}
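
Conditional independence does not imply independence. A minimal sketch, assuming a hypothetical setup in which one of two coins (fair, or biased with heads probability 3/4) is picked at random and tossed twice, with $A$ and $B$ the events that the first and second tosses land heads:

```python
from fractions import Fraction

# Hypothetical setup: choose a coin at random, then toss it twice
p_heads = {"fair": Fraction(1, 2), "biased": Fraction(3, 4)}   # P(heads | coin)
p_coin = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}    # P(coin)

# Given the coin, the two tosses are independent by assumption:
# P(A and B | coin) = P(A | coin) * P(B | coin)
p_ab_given_coin = {coin: h * h for coin, h in p_heads.items()}

# Unconditional probabilities, averaging over which coin was chosen
p_a = sum(p_heads[c] * p_coin[c] for c in p_coin)           # P(A), equal to P(B) by symmetry
p_ab = sum(p_ab_given_coin[c] * p_coin[c] for c in p_coin)  # P(A and B)

print(p_ab, p_a * p_a)   # the two values differ, so A and B are not independent
```

Here $P(A \cap B) = 13/32$ while $P(A)P(B) = 25/64$: seeing heads on the first toss makes the biased coin more likely, which in turn makes heads on the second toss more likely.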





## Python examples

In [1]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.special import factorial
from scipy.special import binom
from numpy.random import choice
from numpy.random import permutation

%matplotlib inline
%load_ext autoreload
%autoreload 2

### Simulating the frequentist interpretation

#### Example: Elder is a girl vs. at least one girl

A family has two children, and it is known that at least one is a girl. What is the probability that both are girls, given this information? What if it is known that the *elder* child is a girl?

- $A$: The event that both children are girls
- $B$: The event that the elder is a girl

\begin{equation}
P(A|B) = \frac{P(A \cap B)} {P(B)} = \frac{1/4} {1/2} = 1/2
\end{equation}

In [2]:
C = 2
n = 10**5
population = np.arange(C)    # [0, 1] == ['girl', 'boy']
child1 = choice(population, size=n, replace=True)    # the gender of the elder child in each of n families
child2 = choice(population, size=n, replace=True)    # the gender of the younger child in each of n families

n_b = np.sum(child1 == 0)    # N(B): the number of families where the elder is a girl
n_ab = np.sum(np.all([child1 == 0, child2 == 0], axis=0))    # N(A \cap B): the number of families where both children are girls (so the elder is a girl too)
print(n_ab / float(n_b))

0.501058814128


- $A$: The event that both children are girls
- $B$: The event that at least one of the children is a girl

\begin{equation}
P(A|B) = \frac{P(A \cap B)} {P(B)} = \frac{1/4} {3/4} = 1/3
\end{equation}

In [3]:
n_b = np.sum(np.any([child1 == 0, child2 == 0], axis=0))    # N(B): the number of families where at least one of the children is a girl
n_ab = np.sum(np.all([child1 == 0, child2 == 0], axis=0))    # N(A \cap B): the number of families where both children are girls (so at least one is a girl)
print(n_ab / float(n_b))

0.33447130836
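
The two simulated values can be compared against an exact count over the four equally likely families. This is a small sketch; the `"G"`/`"B"` encoding and the helper `cond_prob` are assumptions made for this example.

```python
from fractions import Fraction
from itertools import product

# Each outcome is (elder, younger); the four outcomes are assumed equally likely
families = list(product("GB", repeat=2))

def cond_prob(event_a, event_b):
    """P(A|B) as N(A and B) / N(B), valid for equally likely outcomes."""
    n_b = sum(1 for f in families if event_b(f))
    n_ab = sum(1 for f in families if event_a(f) and event_b(f))
    return Fraction(n_ab, n_b)

def both_girls(f):
    return f == ("G", "G")

def elder_girl(f):
    return f[0] == "G"

def at_least_one_girl(f):
    return "G" in f

print(cond_prob(both_girls, elder_girl))          # 1/2
print(cond_prob(both_girls, at_least_one_girl))   # 1/3
```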


### Monty Hall simulation

#### Example: Monty Hall

On the game show Let’s Make a Deal, hosted by Monty Hall, a contestant chooses one of three closed doors, two of which have a goat behind them and one of which has a car. Monty, who knows where the car is, then opens one of the two remaining doors. The door he opens always has a goat behind it (he never reveals the car!). If he has a choice, then he picks a door at random with equal probabilities. Monty then offers the contestant the option of switching to the other unopened door. If the contestant’s goal is to get the car, should she switch doors?

(answer)
Let’s label the doors 1 through 3. Without loss of generality, we can assume the contestant picked door 1 (if she didn’t pick door 1, we could simply relabel the doors, or rewrite this solution with the door numbers permuted). Monty opens a door, revealing a goat. As the contestant decides whether or not to switch to the remaining unopened door, what does she really wish she knew? Naturally, her decision would be a lot easier if she knew where the car was! This suggests that we should condition on the location of the car. Let $C_i$ be the event that the car is behind door $i$, for $i = 1, 2, 3$. By the law of total probability,

\begin{equation}
P(\text{get car}) = P(\text{get car}|C_1)\cdot\frac{1}{3} + P(\text{get car}|C_2)\cdot\frac{1}{3} + P(\text{get car}|C_3)\cdot\frac{1}{3}
\end{equation}

Suppose the contestant employs the switching strategy. If the car is behind door 1, then switching will fail, so $P(\text{get car}|C_1) = 0$. If the car is behind door 2 or 3, then because Monty always reveals a goat, the remaining unopened door must contain the car, so switching will succeed. Thus,

\begin{equation}
P(\text{get car}) = 0\cdot\frac{1}{3} + 1\cdot\frac{1}{3} + 1\cdot\frac{1}{3} = \frac{2}{3}
\end{equation}

so the switching strategy succeeds 2/3 of the time. The contestant should switch.
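
The law-of-total-probability argument can also be carried out by enumerating every car location and every door Monty may open. The sketch below uses 0-based door labels and a helper named `win_probability`, both of which are choices made for this example.

```python
from fractions import Fraction

doors = [0, 1, 2]

def win_probability(switch, chosen=0):
    """Exact probability of winning the car, conditioning on the car's location."""
    total = Fraction(0)
    for car in doors:
        p_car = Fraction(1, 3)   # the car is equally likely to be behind each door
        # Monty may open any door that is neither the contestant's nor the car's
        monty_options = [d for d in doors if d != chosen and d != car]
        for monty in monty_options:
            p_monty = Fraction(1, len(monty_options))   # uniform when he has a choice
            if switch:
                final = next(d for d in doors if d != chosen and d != monty)
            else:
                final = chosen
            if final == car:
                total += p_car * p_monty
    return total

print(win_probability(switch=True))    # 2/3
print(win_probability(switch=False))   # 1/3
```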

In [4]:
# Assume the contestant always chooses door 0
C = 3
n = 10**5   # Number of trials
population = np.arange(C)   # [0, 1, 2]
cardoor = choice(population, n, replace=True)
print(np.sum(cardoor == 0) / float(n))   # The fraction of times when the never-switch strategy succeeds

0.33218


In [5]:
def monty(simulate=True):
    doors = np.arange(3)   # [0, 1, 2]
    # Randomly pick where the car is
    cardoor = choice(doors, 1)[0]
    
    if not simulate:
        # Prompt the player and read their choice of door (should be 0, 1, or 2)
        chosen = int(input("Monty Hall says 'Pick a door, any door!'"))
    else:
        chosen = 0
    
    # Pick Monty's door (can't be the player's door or the car door)
    if chosen != cardoor:
        # Exactly one door is neither the player's nor the car's; take it as a scalar
        montydoor = doors[(doors != chosen) & (doors != cardoor)][0]
    else:
        # Both remaining doors hide goats, so Monty picks one of them at random
        montydoor = choice(doors[doors != chosen])
        
    if not simulate:
        # Find out whether the player wants to switch doors
        print('Monty opens door {}!'.format(montydoor))
        reply = input('Would you like to switch (y/n)?')
        
        # Interpret what the player wrote as 'yes' if it starts with 'y' (an empty reply means 'no')
        if reply.strip().lower().startswith('y'):
            chosen = doors[(doors != chosen) & (doors != montydoor)][0]
    else:
        # In simulation mode the player always switches
        chosen = doors[(doors != chosen) & (doors != montydoor)][0]
    
    # Announce the result of the game!
    if (chosen == cardoor): 
        if not simulate: print('You won!')
        return True
    else:
        if not simulate: print('You lost!')
        return False

In [6]:
n = 10**5   # Number of trials
results = []
for i in range(n):
    results.append(monty(simulate=True))
print(np.sum(results)/float(n))

0.66718
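
The Python loop above can be slow for large `n`. A vectorized alternative, offered only as a sketch, relies on the observation from the solution that with three doors switching wins exactly when the initial pick is not the car, so Monty's door never needs to be simulated. The seed and the use of `np.random.default_rng` are choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the run is reproducible
n = 10**5                        # number of simulated games

cardoor = rng.integers(0, 3, size=n)   # where the car is in each game
chosen = rng.integers(0, 3, size=n)    # the contestant's initial pick

stay_wins = chosen == cardoor    # never-switch wins iff the first pick was the car
switch_wins = ~stay_wins         # always-switch wins in every other game

print(stay_wins.mean(), switch_wins.mean())
```

The two fractions should come out close to 1/3 and 2/3, matching the loop-based simulation.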
