# Chapter 2: Conditional probability
 
This Jupyter notebook is the Python equivalent of the R code in section 2.10 R, pp. 80 - 83, [Introduction to Probability, Second Edition](https://www.crcpress.com/Introduction-to-Probability-Second-Edition/Blitzstein-Hwang/p/book/9781138369917), Blitzstein & Hwang.

In [1]:
import numpy as np

## Simulating the frequentist interpretation 

Recall that the frequentist interpretation of conditional probability, based on a large number `n` of repetitions of an experiment, is $P(A|B) \approx n_{AB}/n_{B}$, where $n_{AB}$ is the number of times that $A \cap B$ occurs and $n_{B}$ is the number of times that $B$ occurs. Let's try this out by simulation and verify the results of Example 2.2.5. We use [`numpy.random.choice`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.random.choice.html) to simulate `n` families, each with two children.

In [4]:
np.random.seed(42)

n = 10**5

child1 = np.random.choice([1, 2], n, replace=True)
child2 = np.random.choice([1, 2], n, replace=True)

print('child1:\n{}\n'.format(child1))
print('child2:\n{}\n'.format(child2))

child1:
[1 2 1 ... 1 2 2]

child2:
[1 2 2 ... 2 1 2]



Here `child1` is a NumPy `array` of length `n`, where each element is a 1 or a 2. Letting 1 stand for "girl" and 2 stand for "boy", this `array` represents the gender of the elder child in each of the `n` families. Similarly, `child2` represents the gender of the younger child in each family. 

Alternatively, we could have used

In [5]:
np.random.choice(['girl','boy'],n, replace = True)

array(['boy', 'boy', 'boy', ..., 'boy', 'boy', 'boy'], dtype='<U4')

but it is more convenient to work with numerical values.
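To make the comparison concrete, here is a minimal sketch of counting with string labels. The names `elder` and `n_girls` are illustrative and not part of the original notebook, and the seed is arbitrary. The elementwise comparison still works, but every later step has to compare against a string rather than an integer code.

```python
import numpy as np

np.random.seed(42)  # arbitrary seed, used only for reproducibility
n = 10**5

# Hypothetical name: `elder` holds string labels instead of the 1/2 codes.
elder = np.random.choice(['girl', 'boy'], n, replace=True)

# Comparing against a string is also elementwise and yields a boolean array,
# so summing it counts the families whose elder child is a girl.
n_girls = np.sum(elder == 'girl')

print('fraction of elder girls: {:0.3f}'.format(n_girls / n))
```

The printed fraction should be close to 0.5, up to simulation error.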

Let $A$ be the event that both children are girls and $B$ the event that the elder is a girl. Following the frequentist interpretation, we count the number of repetitions where $B$ occurred and name it `n_b`, and we also count the number of repetitions where $A \cap B$ occurred and name it `n_ab`. Finally, we divide `n_ab` by `n_b` to approximate $P(A|B)$.

In [8]:
n_b = np.sum(child1==1)
n_ab = np.sum((child1==1) & (child2==1))
print('P(both girls| elder is a girl) = {:0.2F} '.format(n_ab/n_b))

P(both girls| elder is a girl) = 0.50 


The ampersand `&` is an elementwise $AND$, so `n_ab` is the number of families where both the first child and the second child are girls. When we ran this code, we got 0.50, confirming our answer $P(\text{both girls | elder is a girl}) = 1/2$. 
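The counting recipe can be wrapped in a small helper. The sketch below is one possible way to do it, not part of the original notebook: the function name `cond_prob` and the variable names `both_girls`, `elder_girl`, `estimate`, and `direct` are hypothetical. It also checks the estimate against an equivalent formulation that first restricts to the families where $B$ occurred and then takes the fraction of those in which the younger child is a girl.

```python
import numpy as np

def cond_prob(a, b):
    """Estimate P(A|B) as n_AB / n_B from boolean arrays a and b."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    n_b = np.sum(b)
    if n_b == 0:
        raise ValueError('B never occurred, so the estimate is undefined')
    return np.sum(a & b) / n_b

np.random.seed(42)
n = 10**5
child1 = np.random.choice([1, 2], n, replace=True)
child2 = np.random.choice([1, 2], n, replace=True)

both_girls = (child1 == 1) & (child2 == 1)  # event A
elder_girl = (child1 == 1)                  # event B
estimate = cond_prob(both_girls, elder_girl)

# Equivalent view: keep only families where B occurred, then average.
direct = np.mean(child2[child1 == 1] == 1)

print('{:0.2f} {:0.2f}'.format(estimate, direct))
```

Both numbers are the same ratio computed in two ways, so they agree exactly, and each should be close to 1/2.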

Now let $A$ be the event that both children are girls and $B$ the event that at least one of the children is a girl. Since $A \subseteq B$, the event $A \cap B$ is still "both children are girls", so `n_ab` is computed as before, but `n_b` needs to count the number of families where at least one child is a girl. This is accomplished with the elementwise $OR$ operator `|` (this is not a conditioning bar; it is an inclusive $OR$, returning `True` at each position where at least one of the two compared elements is `True`).
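To see how the two elementwise operators behave, here is a small sketch on four hypothetical families that cover every combination of genders. The arrays `c1` and `c2` are illustrative and are not drawn from the simulation. Note that the parentheses around each comparison are needed, because `&` and `|` bind more tightly than `==` in Python.

```python
import numpy as np

# Four hypothetical families, one per combination (1 = girl, 2 = boy).
c1 = np.array([1, 1, 2, 2])  # elder child
c2 = np.array([1, 2, 1, 2])  # younger child

# True only where both comparisons are True.
both = (c1 == 1) & (c2 == 1)

# True where at least one of the comparisons is True (inclusive OR).
at_least_one = (c1 == 1) | (c2 == 1)

print(both.tolist())          # [True, False, False, False]
print(at_least_one.tolist())  # [True, True, True, False]
```

Here one family out of the three with at least one girl has two girls, which matches the 1/3 we expect from the simulation.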

In [12]:
n_b = np.sum((child1==1) | (child2==1))
n_ab = np.sum((child1==1) & (child2==1))
print('P(both girls | at least one is a girl) = {:0.5F}'.format(n_ab/n_b))

P(both girls | at least one is a girl) = 0.33508


For us, the result was about 0.335, which agrees with $P(\text{both girls | at least one girl}) = 1/3$ up to simulation error.
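As a cross-check on the simulation, both conditional probabilities can also be computed exactly by enumerating the sample space. The sketch below assumes the four ordered outcomes (elder, younger) are equally likely. The helper `exact_cond_prob` and the event functions are hypothetical names introduced for illustration.

```python
from fractions import Fraction
from itertools import product

# Assumption: the four ordered outcomes (elder, younger) are equally likely.
outcomes = list(product(['girl', 'boy'], repeat=2))

def exact_cond_prob(event_a, event_b):
    """Return P(A|B) = |A and B| / |B| over equally likely outcomes."""
    n_b = sum(1 for w in outcomes if event_b(w))
    n_ab = sum(1 for w in outcomes if event_a(w) and event_b(w))
    return Fraction(n_ab, n_b)

def both_girls(w):
    return w == ('girl', 'girl')

def elder_girl(w):
    return w[0] == 'girl'

def at_least_one_girl(w):
    return 'girl' in w

p_elder = exact_cond_prob(both_girls, elder_girl)
p_at_least_one = exact_cond_prob(both_girls, at_least_one_girl)

print(p_elder, p_at_least_one)  # 1/2 1/3
```

The exact values 1/2 and 1/3 are the targets that the simulated ratios `n_ab/n_b` approach as `n` grows.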