# Exercise 05.01

## Problem:

**[Purpose: Iterative application of Bayes’ rule, and seeing how posterior probabilities change with inclusion of more data.]**

This exercise extends the ideas of Table 5.4, so at this time, please review Table 5.4 and its discussion in the text. Suppose that the same randomly selected person as in Table 5.4 gets re-tested after the first test result was positive, and on the re-test, the result is negative. When taking into account the results of both tests, what is the probability that the person has the disease? Hint: For the prior probability of the re-test, use the posterior computed from Table 5.4. Retain as many decimal places as possible, as rounding can have a surprisingly big effect on the results. One way to avoid unnecessary rounding is to do the calculations in R.

## Solution:

The text gives the following conditional probabilities of the test result (T) given disease status (D):

\begin{align*}
p(\text{T} \, = \, \text{positive} \, | \, \text{D} \, = \, \text{positive}) &= 0.99 \\
p(\text{T} \, = \, \text{negative} \, | \, \text{D} \, = \, \text{positive}) &= 0.01 \\
p(\text{T} \, = \, \text{positive} \, | \, \text{D} \, = \, \text{negative}) &= 0.05 \\
p(\text{T} \, = \, \text{negative} \, | \, \text{D} \, = \, \text{negative}) &= 0.95 \\
\end{align*}

They are represented in R as follows, together with the base rate of the disease, $p(\text{D} \, = \, \text{positive}) = 0.001$:

In [2]:
# probability of T = + given D = +
pTPos_DPos = 0.99

# probability of T = - given D = +
pTNeg_DPos = 0.01

# probability of T = + given D = -
pTPos_DNeg = 0.05

# probability of T = - given D = -
pTNeg_DNeg = 0.95

# background probability of D = +
pDPos = 0.001

# background probability of D = -
pDNeg = 1 - pDPos

Table 5.4 gives the joint distribution of test result and disease presence, with the marginal probabilities in bold:

<table>
    <tr>
        <th></th>
        <th>Disease Present</th>
        <th>Disease Absent</th>
        <th>Marginal (test result)</th>
    </tr>
    <tr>
        <td>Positive Test</td>
        <td>0.00099</td>
        <td>0.04995</td>
        <td><b>0.05094</b></td>
    </tr>
    <tr>
        <td>Negative Test</td>
        <td>0.00001</td>
        <td>0.94905</td>
        <td><b>0.94906</b></td>
    </tr>
    <tr>
        <td>Marginal (disease presence)</td>
        <td><b>0.00100</b></td>
        <td><b>0.99900</b></td>
        <td><b>1.00000</b></td>
    </tr>
</table>
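
As a check on the table, the joint probabilities can be rebuilt from the likelihoods and the prior. The following is a minimal sketch, not part of the original solution; the names `likelihood`, `prior`, and `joint` are introduced here for illustration only.

```r
# Likelihoods and prior from the exercise (re-declared so the block runs on its own)
pTPos_DPos <- 0.99
pTPos_DNeg <- 0.05
pDPos <- 0.001

# Rows: test result; columns: disease status
likelihood <- matrix(
  c(pTPos_DPos,     pTPos_DNeg,
    1 - pTPos_DPos, 1 - pTPos_DNeg),
  nrow = 2, byrow = TRUE,
  dimnames = list(test = c("positive", "negative"),
                  disease = c("present", "absent"))
)
prior <- c(present = pDPos, absent = 1 - pDPos)

# Joint p(T, D) = p(T | D) * p(D): scale each column by the prior of its disease status
joint <- sweep(likelihood, 2, prior, `*`)

# Append the marginals as an extra "Sum" row and column
print(addmargins(joint))
```

The "Sum" row and column produced by `addmargins` should correspond to the bold marginals in the table.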

The probability that an individual has the disease given a positive test follows from Bayes' rule:

\begin{equation}
p(\text{D} \, = \, \text{positive} \, | \, \text{T} \, = \, \text{positive}) = \frac{p(\text{T} \, = \, \text{positive} \, | \, \text{D} \, = \, \text{positive}) p(\text{D} \, = \, \text{positive})}{p(\text{T} \, = \, \text{positive})}
\end{equation}

The following calculation can be performed in R:

In [3]:
# p(T = +) is the sum of p(T = +, D = +) and p(T = +, D = -)
pTPos = pTPos_DPos*pDPos + pTPos_DNeg*pDNeg

pDPos_TPos = pTPos_DPos*pDPos/pTPos
print(pDPos_TPos)

[1] 0.01943463


For the re-test, the posterior just calculated, $p(\text{D} \, = \, \text{positive} \, | \, \text{T} \, = \, \text{positive})$, serves as the new prior $p(\text{D} \, = \, \text{positive})$. That is, the posterior becomes the next prior. The prior probability of not having the disease is updated accordingly:

\begin{equation*}
p(\text{D} \, = \, \text{negative}) = 1 - p(\text{D} \, = \, \text{positive})
\end{equation*}
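
With these updated priors, Bayes' rule for the negative re-test can be written out explicitly. The subscripts $\text{T}_1$ and $\text{T}_2$ for the first test and the re-test are notation introduced here, and the expression assumes that the two test results are independent given the disease status, so the same conditional probabilities apply to the re-test:

\begin{equation*}
p(\text{D} \, = \, \text{positive} \, | \, \text{T}_2 \, = \, \text{negative}, \, \text{T}_1 \, = \, \text{positive}) = \frac{p(\text{T} \, = \, \text{negative} \, | \, \text{D} \, = \, \text{positive}) \, p(\text{D} \, = \, \text{positive} \, | \, \text{T}_1 \, = \, \text{positive})}{p(\text{T} \, = \, \text{negative} \, | \, \text{D} \, = \, \text{positive}) \, p(\text{D} \, = \, \text{positive} \, | \, \text{T}_1 \, = \, \text{positive}) + p(\text{T} \, = \, \text{negative} \, | \, \text{D} \, = \, \text{negative}) \, p(\text{D} \, = \, \text{negative} \, | \, \text{T}_1 \, = \, \text{positive})}
\end{equation*}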

The new posterior can then be computed in R:

In [4]:
# set the posterior as the new prior
pDPos2 = pDPos_TPos
pDNeg2 = 1 - pDPos2

# p(T = -) is the sum of p(T = -, D = +) and p(T = -, D = -)
pTNeg2 = pTNeg_DPos*pDPos2 + pTNeg_DNeg*pDNeg2

pDPos_TNeg2 = pTNeg_DPos*pDPos2/pTNeg2
print(pDPos_TNeg2)

[1] 0.0002085862
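
After a positive test followed by a negative re-test, the probability of disease is therefore about 0.0002, which is below the 0.001 base rate. The two-step calculation can also be wrapped in a single function and applied once per test result. This is only a sketch: `update_posterior` is a hypothetical helper that is not part of the text, and it assumes the test results are independent given the disease status.

```r
# Hypothetical helper (not from the text): one Bayes update for a binary hypothesis.
#   prior   : p(D = +) before seeing the test result
#   lik_pos : p(observed result | D = +)
#   lik_neg : p(observed result | D = -)
update_posterior <- function(prior, lik_pos, lik_neg) {
  # marginal probability of the observed result
  evidence <- lik_pos * prior + lik_neg * (1 - prior)
  lik_pos * prior / evidence
}

prior <- 0.001

# First test positive, then re-test negative
after_pos  <- update_posterior(prior, 0.99, 0.05)
after_both <- update_posterior(after_pos, 0.01, 0.95)
print(after_both)

# Same two results applied in the opposite order
after_neg      <- update_posterior(prior, 0.01, 0.95)
after_both_rev <- update_posterior(after_neg, 0.99, 0.05)
print(all.equal(after_both, after_both_rev))
```

Under the independence assumption, the order in which the two results are applied should not change the final posterior, which is what the last comparison checks.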
