# Chapter 2: Conditional probability
 
This Jupyter notebook is the Python equivalent of the R code in section 2.10 R, [Introduction to Probability, 1st Edition](https://www.crcpress.com/Introduction-to-Probability/Blitzstein-Hwang/p/book/9781466575578), Blitzstein & Hwang.

----

In [1]:
import numpy as np

np.random.seed(123)

## Simulating the frequentist interpretation 

Recall that the frequentist interpretation of conditional probability based on a large number `n` of repetitions of an experiment is $P(A|B) \approx n_{AB}/n_{B}$, where $n_{AB}$ is the number of times that $A \cap B$ occurs and $n_{B}$ is the number of times that $B$ occurs. Let's try this out by simulation, and verify the results of Example 2.2.5. We use [`numpy.random.choice`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.random.choice.html) to simulate `n` families, each with two children.

In [2]:
n = 10**5
child1 = np.random.choice([1,2], n, replace=True) 
child2 = np.random.choice([1,2], n, replace=True) 

child1

array([1, 2, 1, ..., 2, 1, 1])

Here `child1` is a NumPy `array` of length `n`, where each element is a 1 or a 2. Letting 1 stand for "girl" and 2 stand for "boy", this `array` represents the gender of the elder child in each of the `n` families. Similarly, `child2` represents the gender of the younger child in each family.

Alternatively, we could have used

In [3]:
np.random.choice(["girl", "boy"], n, replace=True)

array(['boy', 'boy', 'girl', ..., 'girl', 'boy', 'boy'],
      dtype='<U4')

but it is more convenient working with numerical values.

Let $A$ be the event that both children are girls and $B$ the event that the elder is a girl. Following the frequentist interpretation, we count the number of repetitions where $B$ occurred and name it `n_b`, and we also count the number of repetitions where $A \cap B$ occurred and name it `n_ab`. Finally, we divide `n_ab` by `n_b` to approximate $P(A|B)$.

In [4]:
n_b = np.sum(child1==1)
n_ab = np.sum((child1==1) & (child2==1))
n_ab / n_b

0.50183318056828596

The ampersand `&` is an elementwise $AND$, so `n_ab` is the number of families where both the first child and the second child are girls. When we ran this code, we got 0.50, confirming our answer $P(\text{both girls | elder is a girl}) = 1/2$. 
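
To see what the elementwise `&` does on a small scale, here is a minimal sketch on toy data. The arrays `elder` and `younger` are made up for illustration and are not part of the simulation above.

```python
import numpy as np

# Hypothetical toy data for four families: 1 = girl, 2 = boy.
elder = np.array([1, 1, 2, 2])
younger = np.array([1, 2, 1, 2])

# Each comparison yields a boolean array; & combines them position by position.
# The parentheses matter because & binds more tightly than ==.
both_girls = (elder == 1) & (younger == 1)

print(both_girls)          # [ True False False False]
print(np.sum(both_girls))  # 1, since True counts as 1 when summed
```

Summing a boolean array counts its `True` entries, which is why `np.sum` gives the number of families satisfying the condition.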

Now let $A$ be the event that both children are girls and $B$ the event that at least one of the children is a girl. Then $A \cap B$ is the same, but `n_b` needs to count the number of families where at least one child is a girl. This is accomplished with the elementwise $OR$ operator `|` (this is not a conditioning bar; it is an inclusive $OR$, returning `True` if at least one element is `True`).

In [5]:
n_b = np.sum((child1==1) | (child2==1))
n_ab = np.sum((child1==1) & (child2==1))
n_ab / n_b

0.33529035865484463

For us, the result was 0.33, confirming that $P(\text{both girls | at least one girl}) = 1/3$.
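
Both simulated values can also be checked exactly by enumerating the sample space. The sketch below assumes the four (elder, younger) outcomes are equally likely; the helper `cond_prob` and the event functions are illustrative names, not part of the original notebook.

```python
from fractions import Fraction
from itertools import product

# Each outcome is (elder, younger); the four outcomes are assumed equally likely.
outcomes = list(product(["girl", "boy"], repeat=2))

def cond_prob(event_a, event_b):
    """Exact P(A|B) obtained by counting equally likely outcomes."""
    n_b = sum(1 for o in outcomes if event_b(o))
    n_ab = sum(1 for o in outcomes if event_a(o) and event_b(o))
    return Fraction(n_ab, n_b)

def both_girls(o):
    return o == ("girl", "girl")

def elder_girl(o):
    return o[0] == "girl"

def at_least_one_girl(o):
    return "girl" in o

print(cond_prob(both_girls, elder_girl))         # 1/2
print(cond_prob(both_girls, at_least_one_girl))  # 1/3
```

The counting mirrors the simulation: `n_b` and `n_ab` play the same roles, except that each outcome is counted once instead of being sampled.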

## Monty Hall simulation

Many long, bitter debates about the Monty Hall problem could have been averted by trying it out with a simulation. To study how well the never-switch strategy performs, let's generate $10^{5}$ runs of the Monty Hall game. To simplify notation, assume the contestant always chooses door 1. Then we can generate a vector specifying which door has the car for each repetition:


In [6]:
n = 10**5
cardoor = np.random.choice([1,2,3], n, replace=True)

np.sum(cardoor==1) / n

0.33395999999999998

At this point we could generate the vector specifying which doors Monty opens, but that's unnecessary since the never-switch strategy succeeds if and only if door 1 has the car! So the fraction of times when the never-switch strategy succeeds is `np.sum(cardoor==1) / n`, which was 0.334 in our simulation. This is very close to 1/3.
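
For comparison, the always-switch strategy can be simulated by generating Monty's door explicitly. This is a sketch under the same assumption that the contestant always picks door 1. The names `montydoor`, `switchdoor`, `switch_wins` and `stay_wins` are illustrative and do not appear in the original notebook.

```python
import numpy as np

n = 10**5
cardoor = np.random.choice([1, 2, 3], n, replace=True)

# The contestant always picks door 1, so Monty opens door 2 or door 3.
# If the car is behind door 1 he picks one of them at random;
# otherwise he must open the only remaining goat door, which is 5 - cardoor.
random_goat = np.random.choice([2, 3], n, replace=True)
montydoor = np.where(cardoor == 1, random_goat, 5 - cardoor)

# Switching means taking the door that is neither door 1 nor Monty's door.
switchdoor = 5 - montydoor

switch_wins = np.mean(switchdoor == cardoor)
stay_wins = np.mean(cardoor == 1)
print(switch_wins, stay_wins)
```

In every run exactly one of the two strategies wins, so the two fractions sum to 1, and `switch_wins` should come out close to 2/3.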

What if we want to play the Monty Hall game interactively? The book does this in R by defining a function called `monty`, which can then be invoked by entering the command `monty()` any time you feel like playing the game. The Python version in this notebook instead models the game as a small state machine, built from the classes defined in this section.


In the R version, the `print` command prints its argument to the screen. It is combined with `paste`, since `print("Monty opens door montydoor")` would literally print “Monty opens door montydoor”. The `scan` command interactively requests input from the user; `what = integer()` is used when the user should enter an integer and `what = character()` when the user should enter text. Using `substr(reply,1,1)` extracts the first character of `reply`, in case the user replies with “yes” or “yep” or “yeah!” rather than with “y”.
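
A rough Python analogue of that R function might look like the sketch below. It is not part of the original notebook: the function name `monty` follows the R version, while the `ask` and `rng` parameters are assumptions added here so that the prompts and the randomness can be swapped out. The built-in `input` stands in for `scan`, string concatenation for `paste`, and slicing with `[:1]` for `substr(reply,1,1)`.

```python
import random

def monty(ask=input, rng=random):
    """Play one round of Monty Hall; return True if the player wins the car."""
    doors = [1, 2, 3]
    cardoor = rng.choice(doors)

    # No input validation in this sketch: a non-integer reply raises ValueError.
    chosen = int(ask("Monty Hall says 'Pick a door, any door!' (1, 2 or 3): "))

    # Monty opens a goat door that is neither the player's door nor the car door.
    montydoor = rng.choice([d for d in doors if d != chosen and d != cardoor])
    print("Monty opens door " + str(montydoor))

    reply = ask("Would you like to switch (y/n)? ")
    # Look only at the first character so "yes", "yep" and "yeah!" count as "y".
    if reply.strip().lower()[:1] == "y":
        chosen = next(d for d in doors if d != chosen and d != montydoor)

    won = chosen == cardoor
    print("You won!" if won else "You lost!")
    return won
```

Calling `monty()` with no arguments prompts at the keyboard. Passing a different `ask` callable makes it possible to script the replies.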


In [7]:
class StateContext():
    
    def __init__(self):
        self.num_plays = 0
        self.num_wins = 0
        self.doors = np.array(['C', 'G', 'G'])
        self.players_choice = None
        self.montys_choice = None
        self.current_state = None
        self.message = None
    
    def set_initial_state(self, initial):
        self.current_state = initial

    def get_success_rate(self):
        if self.num_plays > 0:
            return self.num_wins / self.num_plays
        else:
            return 0.0

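# Bit flags identifying each state. A state's `allowed` attribute ORs together
# the flags of the states it is permitted to transition to.
INITIAL = 2**0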
PLAYER_CHOOSES = 2**1
MONTY_CHOOSES = 2**2
PLAYER_SWITCHES = 2**3
CONTINUE_PLAYING = 2**4

class BaseState():
    def __init__(self, context, **kwargs):
        self.context = context
        self.allowed = 0

    def transition(self, next_state):
        if next_state.state & self.allowed > 0:
            self.context.current_state = next_state
        else:
            print('Current:',self,' => switching to',next_state,'not possible.')

class InitialState(BaseState):
    def __init__(self, context, **kwargs):
        super(InitialState, self).__init__(context, **kwargs)
        np.random.shuffle(self.context.doors)
        self.state = INITIAL
        self.allowed = PLAYER_CHOOSES | INITIAL
        self.context.num_plays = 0
        self.context.num_wins = 0
        self.context.players_choice = None
        self.context.montys_choice = None
        self.context.message = 'Pick a door.'

class ContinuePlayingState(BaseState):
    def __init__(self, context, **kwargs):
        super(ContinuePlayingState, self).__init__(context, **kwargs)
        np.random.shuffle(self.context.doors)
        self.state = INITIAL
        self.allowed = PLAYER_CHOOSES | INITIAL
        self.context.players_choice = None
        self.context.montys_choice = None
        self.context.message = 'Pick a door.'

class PlayerChoosesState(BaseState):
    def __init__(self, context, **kwargs):
        super(PlayerChoosesState, self).__init__(context, **kwargs)
        self.state = PLAYER_CHOOSES
        self.allowed = MONTY_CHOOSES | INITIAL
        if 'players_choice' in kwargs:
            self.context.players_choice = kwargs['players_choice']

class MontyChoosesState(BaseState):
    def __init__(self, context, **kwargs):
        super(MontyChoosesState, self).__init__(context, **kwargs)
        self.state = MONTY_CHOOSES
        self.allowed = PLAYER_SWITCHES | INITIAL
        idx = [e for e in range(len(self.context.doors)) if e != self.context.players_choice and self.context.doors[e] != 'C']
        self.context.montys_choice = np.random.choice(idx, size=1, replace=False)[0]

class PlayerSwitchesState(BaseState):
    def __init__(self, context, **kwargs):
        super(PlayerSwitchesState, self).__init__(context, **kwargs)
        self.state = PLAYER_SWITCHES
        self.allowed = INITIAL | CONTINUE_PLAYING
        if 'player_switches' in kwargs:
            if kwargs['player_switches'] == True:
                idx = [e for e in range(len(self.context.doors)) if e != self.context.players_choice and e != self.context.montys_choice][0]
                self.context.players_choice = idx
        if self.context.players_choice == np.where(self.context.doors=='C')[0][0]:
            self.context.num_wins += 1
            self.context.message = 'You win!'
        else:
            self.context.message = 'Monty wins.'
        self.context.num_plays += 1
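
The `transition` method relies on bit flags: each state constant is a distinct power of two, and `allowed` is a mask of the states that may come next. The following standalone sketch reuses three of the flag values to show the check in isolation. The helper `is_allowed` is a hypothetical name introduced only for this illustration.

```python
# Same flag values as the notebook's state constants.
INITIAL = 2**0          # 0b00001
PLAYER_CHOOSES = 2**1   # 0b00010
MONTY_CHOOSES = 2**2    # 0b00100

def is_allowed(target_state, allowed_mask):
    """Hypothetical helper mirroring the check in BaseState.transition."""
    return (target_state & allowed_mask) > 0

# Mask used by PlayerChoosesState: Monty may choose next, or the game may reset.
allowed = MONTY_CHOOSES | INITIAL  # 0b00101

print(is_allowed(MONTY_CHOOSES, allowed))   # True
print(is_allowed(PLAYER_CHOOSES, allowed))  # False
```

Because `&` binds more tightly than `>`, the expression `next_state.state & self.allowed > 0` in `transition` is evaluated as `(next_state.state & self.allowed) > 0`, which matches the explicit parentheses used here.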

----

&copy; Blitzstein, Joseph K.; Hwang, Jessica. Introduction to Probability (Chapman & Hall/CRC Texts in Statistical Science).