# The Three Prisoners Problem

Three prisoners, Alice, Bob and Charlie, are sentenced to death, but one of them (chosen uniformly at random) is selected to be pardoned, so that only two of the three prisoners will be executed. The warden knows which one will be pardoned, but he is not allowed to tell the prisoners. Alice begs the warden to let her know the identity of one of the others who will be executed, saying:

> _"If Bob is pardoned, say Charlie's name, and if Charlie is pardoned, say Bob's. If I'm pardoned, choose randomly to name Bob or Charlie."_

Now, given the warden's answer, we are interested in answering two questions:
1. What is the probability of correctly guessing the pardoned prisoner?
2. Is the warden’s answer useful for Alice?

### Channel matrix 

Let's model this problem using a channel $W$ that takes an input $x$ (the prisoner to be pardoned) and produces an output $y$ (the warden's answer). You can think of $W$ as being the warden in our problem. The possible values for $x$ and $y$ are $A, B, C$ (short for Alice, Bob and Charlie). The warden will never say Alice's name, but we still include $A$ as a possible output to keep the model general. $W$ is defined as

$$ 
W = \left( \begin{array} {ccc}
    p(y=A | x=A) & p(y=B | x=A) & p(y=C | x=A) \\
    p(y=A | x=B) & p(y=B | x=B) & p(y=C | x=B) \\
    p(y=A | x=C) & p(y=B | x=C) & p(y=C | x=C) \\
\end{array} \right) 
$$

or in our case

$$ 
W = \left( \begin{array} {ccc}
    0 & \frac{1}{2} & \frac{1}{2} \\
    0 & 0 & 1 \\
    0 & 1 & 0 \\
\end{array} \right)
$$

$W$'s first row corresponds to $x=A$, meaning the scenario where Alice is chosen to be pardoned. In that case the channel's output (the warden's answer) is not deterministic, but has some degree of randomness. More specifically, the warden names Alice with probability $0$, Bob with probability $\frac{1}{2}$ and Charlie with probability $\frac{1}{2}$.

$W$'s second row corresponds to $x=B$, meaning the scenario where Bob is chosen to be pardoned. In that case, there is only one possible output, and that is Charlie. Here $W$ (the warden) behaves deterministically: the second row of $W$ has $0$ everywhere, except for one specific output, which gets probability $1$. The same holds for the third row, which corresponds to $x=C$, meaning Charlie is chosen to be pardoned; the only possible output is then Bob.

The columns, in turn, correspond to the outputs of $W$: the first column to $y=A$, the second to $y=B$ and the third to $y=C$. Notice that the first column contains $0$ everywhere. That is because the warden never says Alice's name or, more generally, $W$ never produces output $A$.

Notice also that each row of $W$ sums to $1$. This happens because each row defines a probability distribution over the outputs, describing how $W$ (the warden) behaves given a specific input $x$.

Let's also define $W$ using Python and libqif.

In [3]:
import numpy as np
import matplotlib.pyplot as plt
try:
    from qif import *
except ImportError:  # install qif if not available (for running in colab, etc)
    import IPython; IPython.get_ipython().run_line_magic('pip', 'install qif')
    from qif import *

In [5]:
W = np.array([
    # y=A  y=B  y=C
    [   0, 1/2, 1/2],    # x=A
    [   0,   0,   1],    # x=B
    [   0,   1,   0],    # x=C
])
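
The properties of $W$ described above can be checked numerically. The following is a minimal sketch that assumes only NumPy (it does not use libqif); the names `rows_ok` and `never_says_alice` are introduced here purely for illustration.

```python
import numpy as np

# Channel matrix: rows are inputs x (A, B, C), columns are outputs y (A, B, C).
W = np.array([
    # y=A  y=B  y=C
    [   0, 1/2, 1/2],    # x=A
    [   0,   0,   1],    # x=B
    [   0,   1,   0],    # x=C
])

# Each row must be a probability distribution: non-negative entries summing to 1.
rows_ok = bool((W >= 0).all()) and bool(np.allclose(W.sum(axis=1), 1))

# The first column (y=A) is all zeros: the warden never says Alice's name.
never_says_alice = bool((W[:, 0] == 0).all())

print(rows_ok, never_says_alice)
```

A check like this is handy when experimenting with variations of the warden's behaviour, since a row that does not sum to $1$ means the matrix is no longer a valid channel.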

Now, to answer Question 1 using QIF terminology, we want to find the **posterior vulnerability** of $W$. That is, what is the probability of correctly guessing the secret $x$ after observing the channel's output $y$. Let's see how we can compute that.

### Prior distribution 

First of all we must define the distribution of $x$. It is also called *the prior distribution* $\pi$. In our case that is the probability of each prisoner being pardoned. The problem states that it is uniform, so we have

$$
\begin{aligned}
p(x=A) &= \frac{1}{3} \\
p(x=B) &= \frac{1}{3} \\
p(x=C) &= \frac{1}{3}
\end{aligned}
$$

or for short

$$
\pi = \left( \frac{1}{3}, \frac{1}{3}, \frac{1}{3} \right)
$$
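
In code, the uniform prior can be written as a plain NumPy vector. This is a small sketch that assumes only NumPy; the variable name `pi` mirrors the symbol $\pi$ and is our own choice.

```python
import numpy as np

# Uniform prior over the three prisoners, in the order (A, B, C).
pi = np.full(3, 1/3)

# A prior is a probability distribution, so its entries must sum to 1.
prior_is_valid = bool(np.isclose(pi.sum(), 1))

print(pi, prior_is_valid)
```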

### Joint Matrix

Next, we compute $J$, which is defined as

$$ 
J = \left( \begin{array} {ccc}
    p(y=A \cap x=A) & p(y=B \cap x=A) & p(y=C \cap x=A) \\
    p(y=A \cap x=B) & p(y=B \cap x=B) & p(y=C \cap x=B) \\
    p(y=A \cap x=C) & p(y=B \cap x=C) & p(y=C \cap x=C) \\
\end{array} \right)
$$

$J$ contains the joint probabilities for each combination of $x$ and $y$. To compute $J$, we use the product rule $p(S \cap T) = p(T) \cdot p(S | T)$, taking $T$ to be the event on $x$ and $S$ the event on $y$.

$$ 
J = \left( \begin{array} {ccc}
    p(x=A) \cdot p(y=A | x=A) & p(x=A) \cdot p(y=B | x=A) & p(x=A) \cdot p(y=C | x=A) \\
    p(x=B) \cdot p(y=A | x=B) & p(x=B) \cdot p(y=B | x=B) & p(x=B) \cdot p(y=C | x=B) \\
    p(x=C) \cdot p(y=A | x=C) & p(x=C) \cdot p(y=B | x=C) & p(x=C) \cdot p(y=C | x=C) \\
\end{array} \right)
$$

Notice that $J$ depends on the channel $W$, **but also** on the distribution $\pi$ of $x$, that is, on each of $p(x=A)$, $p(x=B)$, $p(x=C)$. Thus, if the pardoned prisoner were not chosen uniformly at random, we would have a different $J$.
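
To make this dependence on $\pi$ concrete, here is a minimal NumPy sketch. The helper `joint` and the skewed prior `pi_skewed` are hypothetical, introduced only for this illustration; they are not part of the problem statement or of libqif.

```python
import numpy as np

W = np.array([
    [0, 1/2, 1/2],    # x=A
    [0,   0,   1],    # x=B
    [0,   1,   0],    # x=C
])

def joint(pi, W):
    """Scale row x of W by pi[x], i.e. J[x, y] = p(x) * p(y | x)."""
    return np.asarray(pi)[:, np.newaxis] * np.asarray(W)

pi_uniform = np.full(3, 1/3)             # the prior stated in the problem
pi_skewed = np.array([1/2, 1/4, 1/4])    # hypothetical non-uniform prior

J_uniform = joint(pi_uniform, W)
J_skewed = joint(pi_skewed, W)

# Same channel, different priors, different joint matrices.
print(J_uniform)
print(J_skewed)
```

Both results are valid joint distributions (their entries sum to $1$), but their entries differ even though the warden's behaviour $W$ is unchanged.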

If we compute $J$ for our case, it becomes

$$ 
J = \left( \begin{array} {ccc}
    0 & \frac{1}{6} & \frac{1}{6} \\
    0 & 0 & \frac{1}{3} \\
    0 & \frac{1}{3} & 0 \\
\end{array} \right)
$$
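
The matrix above can be reproduced numerically. The sketch below assumes only NumPy and uses broadcasting to multiply each row of $W$ by the corresponding prior probability; the names `J_expected`, `p_x` and `p_y` are our own.

```python
import numpy as np

pi = np.full(3, 1/3)    # uniform prior over (A, B, C)
W = np.array([
    [0, 1/2, 1/2],      # x=A
    [0,   0,   1],      # x=B
    [0,   1,   0],      # x=C
])

# J[x, y] = p(x) * p(y | x): pi is reshaped to a column so it scales each row of W.
J = pi[:, np.newaxis] * W

# The joint matrix written out by hand in the text.
J_expected = np.array([
    [0, 1/6, 1/6],
    [0,   0, 1/3],
    [0, 1/3,   0],
])
matches_text = bool(np.allclose(J, J_expected))

# Sanity checks: summing over y recovers the prior, summing over x gives p(y).
p_x = J.sum(axis=1)
p_y = J.sum(axis=0)

print(matches_text)
print(p_x)
print(p_y)
```

Summing $J$ along its rows gives back $\pi$, which is a quick way to confirm that the scaling was applied to the rows (inputs) and not to the columns (outputs).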