# Chapter 2: Conditional probability

## Simulating the frequentist interpretation

Recall that the frequentist interpretation of conditional probability, based on a large number $n$ of repetitions of an experiment, is $P(A|B) \approx n_{AB} / n_B$, where $n_{AB}$ is the number of times that $A \cap B$ occurs and $n_B$ is the number of times that $B$ occurs. Let's try this out by simulation and verify the results of Example 2.2.5. To do so, we simulate $n$ families, each with two children.

In [None]:
n <- 10^5
child1 <- sample(2,n,replace=TRUE)
child2 <- sample(2,n,replace=TRUE)

Here `child1` is a vector of length $n$, where each element is a $1$ or a $2$. Letting $1$ stand for "girl" and $2$ stand for "boy", this vector represents the gender of the elder child in each of the $n$ families. Similarly, `child2` represents the gender of the younger child in each family.

Alternatively, we could have used

In [None]:
sample(c("girl","boy"),n,replace=TRUE)

but it is more convenient to work with numerical values.
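
As a rough sanity check on this kind of simulated data, the sketch below uses the labeled version and tabulates the four (elder, younger) combinations; each should make up about a quarter of the families. The names `elder`, `younger`, and `prop` are hypothetical, and the seed is an assumption made only for reproducibility.

```r
set.seed(1)  # assumption: fixed seed, only so the run is reproducible
n <- 10^5

# labeled versions of the two vectors (hypothetical names)
elder   <- sample(c("girl", "boy"), n, replace = TRUE)
younger <- sample(c("girl", "boy"), n, replace = TRUE)

# proportion of families in each of the four combinations
prop <- table(elder, younger) / n
prop
```

With labels, conditions are written as `elder == "girl"` rather than `child1 == 1`; the counting logic is otherwise unchanged.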

Let $A$ be the event that both children are girls and $B$ the event that the elder is a girl. Following the frequentist interpretation, we count the number of repetitions where $B$ occurred and name it `n.b`, and we also count the number of repetitions where $A \cap B$ occurred and name it `n.ab`. Finally, we divide `n.ab` by `n.b` to approximate $P(A|B)$.

In [None]:
n.b <- sum(child1==1)
n.ab <- sum(child1==1 & child2==1)
n.ab/n.b

The ampersand `&` is an elementwise AND, so `n.ab` is the number of families where both the first child and the second child are girls. When we ran this code, we got $0.50$, which agrees with $P(\text{both girls}|\text{elder is a girl}) = 1/2$.

Now let $A$ be the event that both children are girls and $B$ the event that at least one of the children is a girl. Then $A \cap B$ is the same, but `n.b` needs to count the number of families where at least one child is a girl. This is accomplished with the elementwise OR operator `|` (this is _not_ a conditioning bar; it is an inclusive OR, returning TRUE wherever at least one of the two conditions is TRUE).

In [None]:
n.b <- sum(child1==1 | child2==1)
n.ab <- sum(child1==1 & child2==1)
n.ab/n.b

We got $0.33$, which agrees with $P(\text{both girls}|\text{at least one girl}) = 1/3$.
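
The two simulated values can also be checked exactly by enumerating the sample space. The sketch below assumes, as the simulation does, that the four (elder, younger) outcomes are equally likely; the names `families`, `both_girls`, `elder_girl`, and `at_least_one` are hypothetical.

```r
# all four equally likely outcomes; 1 = girl, 2 = boy
families <- expand.grid(child1 = 1:2, child2 = 1:2)

both_girls   <- families$child1 == 1 & families$child2 == 1
elder_girl   <- families$child1 == 1
at_least_one <- families$child1 == 1 | families$child2 == 1

# P(A|B) = (number of outcomes in A and B) / (number of outcomes in B)
p_given_elder <- sum(both_girls & elder_girl) / sum(elder_girl)      # 1/2
p_given_any   <- sum(both_girls & at_least_one) / sum(at_least_one)  # 1/3
```

Conditioning on "at least one girl" leaves three outcomes rather than two, which is why the answer drops from $1/2$ to $1/3$.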

## Monty Hall simulation

Many long, bitter debates about the Monty Hall problem could have been averted by _trying it out_ with a simulation. To study how well the never-switch strategy performs, let's generate $10^5$ runs of the Monty Hall game. To simplify notation, assume the contestant always chooses door $1$. Then we can generate a vector specifying which door has the car for each repetition:

In [None]:
n <- 10^5
cardoor <- sample(3,n,replace=TRUE)

At this point we could generate the vector specifying which doors Monty opens, but that's unnecessary since the never-switch strategy succeeds if and only if door 1 has the car! So the fraction of times when the never-switch strategy succeeds is `sum(cardoor==1)/n`, which was $0.334$ in our simulation. This is very close to the true value, $1/3$.
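
For comparison, here is a minimal sketch that does generate Monty's door and then evaluates the always-switch strategy, still assuming the contestant picks door 1. The names `montydoor`, `switchdoor`, `never_switch`, and `always_switch` are hypothetical, the seed is an assumption for reproducibility, and the arithmetic shortcuts rely on the door numbers summing to 6.

```r
set.seed(42)  # assumption: fixed seed, only so the run is reproducible
n <- 10^5
cardoor <- sample(3, n, replace = TRUE)

# Monty opens door 2 or 3, never the car door. If the car is behind door 1,
# he picks between 2 and 3 at random; otherwise he has only one option.
montydoor <- ifelse(cardoor == 1,
                    sample(2:3, n, replace = TRUE),
                    5 - cardoor)  # car at 2 -> opens 3; car at 3 -> opens 2

# switching means moving to the door that is neither door 1 nor Monty's door
switchdoor <- 6 - 1 - montydoor

never_switch  <- mean(cardoor == 1)           # should be near 1/3
always_switch <- mean(switchdoor == cardoor)  # should be near 2/3
c(never_switch, always_switch)
```

The two fractions add up to 1, since in every game exactly one of the two strategies wins.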

What if we want to _play_ the Monty Hall game interactively? We can do this by programming a _function_. Entering the following code in R defines a function called `monty`, which can then be invoked by entering the command `monty()` any time you feel like playing the game!
```{r}
monty <- function() {
    doors <- 1:3
    
    # randomly pick where the car is
    cardoor <- sample(doors,1)
    
    # prompt player
    print("Monty Hall says 'Pick a door, any door!'")
    
    # receive the player's choice of door (should be 1,2, or 3)
    chosen <- scan(what = integer(), nlines = 1, quiet = TRUE)
    
    # pick Monty's door (can't be the player's door or the car door)
    if (chosen != cardoor) montydoor <- doors[-c(chosen, cardoor)]
    else montydoor <- sample(doors[-chosen],1)
    
    # find out whether the player wants to switch doors
    print(paste("Monty opens door ", montydoor, "!", sep=""))
    print("Would you like to switch (y/n)?")
    reply <- scan(what = character(), nlines = 1, quiet = TRUE)
    
    # interpret what player wrote as "yes" if it starts with "y"
    if (substr(reply,1,1) == "y"){
        chosen <- doors[-c(chosen,montydoor)]
    }
    
    # announce the result of the game!
    if (chosen == cardoor) print("You won!")
    else print("You lost!")
}
```

The `print` command prints its argument to the screen. We combine this with the `paste` command since if we used `print("Monty opens door montydoor")` then R would literally print "Monty opens door montydoor". The `scan` command interactively requests input from the user; we use `what = integer()` when we want the user to enter an integer and `what = character()` when we want the user to enter text. Using `substr(reply,1,1)` extracts the first character of `reply`, in case the user replies with "yes" or "yep" or "yeah!" rather than with "y".
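
The string-handling steps can be tried on their own. In this small sketch, `door_number`, `msg`, `msg0`, `replies`, and `first` are hypothetical names chosen for illustration.

```r
door_number <- 3  # hypothetical value, for illustration only

# paste() with sep = "" glues the pieces together without spaces
msg <- paste("Monty opens door ", door_number, "!", sep = "")
print(msg)  # prints [1] "Monty opens door 3!"

# paste0() is shorthand for paste(..., sep = "")
msg0 <- paste0("Monty opens door ", door_number, "!")

# substr(x, 1, 1) keeps only the first character of each reply
replies <- c("yes", "yep", "yeah!", "no")
first <- substr(replies, 1, 1)
first == "y"  # TRUE TRUE TRUE FALSE
```

Note that the comparison is case-sensitive, so a reply such as "Yes" would not be read as a yes by this check.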

The following cells will run this code in Google Colab. To run the simulation:
1. Choose a door (1, 2, or 3) from the dropdown list of the first cell, and run the cell. The output will tell you which door Monty then opens.
2. In the second cell, decide if you would like to switch to the remaining door or stick with your original choice. Select your choice from the dropdown list (yes/no) and run the cell to see if you won or lost.

To play again, repeat the steps above.

(Outside of Google Colab, the simulation can be run simply by changing the value of `chosen` in the first cell and `reply` in the second cell, and then running the cells.)

In [None]:
#@title Monty Hall says 'Pick a door, any door!' { vertical-output: true, display-mode: "form" }
doors <- 1:3
cardoor <- sample(doors,1) 

chosen <- 1 #@param ["1", "2", "3"] {type:"raw"}

if (chosen != cardoor) {
  montydoor <- doors[-c(chosen, cardoor)]
} else {
  montydoor <- sample(doors[-chosen],1)
}

print(paste("Monty opens door ", montydoor, "!", sep=""))

In [None]:
#@title Would you like to switch? { vertical-output: true, display-mode: "form" }
reply <- "yes" #@param ["yes", "no"] {type:"string"}

if (reply == "yes") {
  chosen <- doors[-c(chosen,montydoor)]
}

if (chosen == cardoor) {
  print("You won!")
} else {
  print("You lost!")
}
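
The same game logic can be wrapped in a function that takes the player's decisions as arguments, which makes it possible to check that the initial choice of door does not affect how well switching performs. This is only a sketch: `play_monty`, `n_games`, and `win_by_door` are hypothetical names, and the seed is an assumption for reproducibility.

```r
# Hypothetical helper: plays one game without user input.
# Returns TRUE if the player ends up with the car.
play_monty <- function(chosen, switch) {
  doors <- 1:3
  cardoor <- sample(doors, 1)

  # Monty's door can't be the player's door or the car door
  if (chosen != cardoor) {
    montydoor <- doors[-c(chosen, cardoor)]
  } else {
    montydoor <- sample(doors[-chosen], 1)
  }

  if (switch) chosen <- doors[-c(chosen, montydoor)]
  chosen == cardoor
}

set.seed(2024)  # assumption: fixed seed, only so the run is reproducible
n_games <- 10^4

# fraction of wins when always switching, for each initial choice of door
win_by_door <- sapply(1:3, function(d) {
  mean(replicate(n_games, play_monty(chosen = d, switch = TRUE)))
})
win_by_door
```

All three entries should be close to $2/3$, consistent with the earlier simplification of assuming the contestant always chooses door $1$.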