# Data Modeling II: Bayesian Statistics

Bayesian statistics systematically combines our prior knowledge about a situation with new data to refine what we believe is true.
In many real-world scenarios, ranging from medical diagnostics to fundamental physics experiments, information we already have (like the rarity of a disease or theoretical constraints on a physical parameter) can significantly shape how we interpret fresh evidence.
By framing unknowns as probability distributions, Bayesian methods provide a coherent framework for updating those distributions whenever new observations appear, yielding a posterior that reflects all evidence, old and new.
This unifying perspective makes it possible to quantify uncertainties in a transparent way, avoid common logical pitfalls, and naturally propagate errors to any derived quantities of interest.

## Medical Test "Paradox"

The medical test paradox occurs when a diagnostic test is described as highly accurate, yet a person who tests positive for a rare disease has a much lower chance of actually having it than the stated accuracy suggests.
This seeming contradiction highlights the importance of prior knowledge, or base rates.

Consider a disease that affects only 1% of the population.
Imagine a test that has:
* 99% sensitivity: if you **do** have the disease, it flags you positive 99% of the time.
* 99% specificity: if you **do not** have the disease, it correctly flags you negative 99% of the time.

Many people assume that a "99% accurate" test implies a 99% chance of having the disease if you test positive.
We will see that this is not true when the disease is rare.

### A Simple Counting Argument

Suppose we have 10,000 people.
About 100 of them are diseased (1%).
The remaining 9,900 are healthy.  

Of the 100 diseased people, 99 will test positive (true positives).
Of the 9,900 healthy people, 1% will falsely test positive (99 people).
We end up with a total of 198 positive results: 99 true positives plus 99 false positives.

Hence, only half of these positives (99 out of 198) are truly diseased.
This implies a 50% chance of actually having the disease after a positive test, far lower than the 99% many people expect.
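
The counting argument can be reproduced in a few lines of Python. This is a minimal sketch: the variable names are illustrative, and integer percentages are used so that the head counts stay exact.

```python
# Counting argument for a hypothetical population of 10,000 people.
population = 10_000
prevalence_pct = 1       # 1% of people have the disease
sensitivity_pct = 99     # 99% of diseased people test positive
false_positive_pct = 1   # 1% of healthy people test positive (100% - 99% specificity)

diseased = population * prevalence_pct // 100           # 100 people
healthy = population - diseased                          # 9,900 people
true_positives = diseased * sensitivity_pct // 100       # 99 people
false_positives = healthy * false_positive_pct // 100    # 99 people
total_positives = true_positives + false_positives       # 198 people

fraction_diseased = true_positives / total_positives
print(f"{true_positives} of {total_positives} positives are diseased "
      f"({fraction_diseased:.0%})")
# prints: 99 of 198 positives are diseased (50%)
```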

### Why This Happens

When a condition is rare, most people do not have it.
A small fraction of a large healthy group (the 1% false-positive rate applied to 9,900 healthy people) can match or exceed the positives from the much smaller diseased group.
This is a direct consequence of prior probability: we have to weigh how common the disease is before we interpret a new test result.
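
To see how strongly the base rate drives the result, the same counting can be repeated for several prevalences. The sketch below is illustrative only: the function name and the chosen prevalence values are assumptions, not part of the original example, and the counts are expected values rather than whole people.

```python
def fraction_of_positives_diseased(prevalence, sensitivity=0.99,
                                   specificity=0.99, population=10_000):
    """Expected share of positive results that come from diseased people."""
    diseased = population * prevalence
    healthy = population - diseased
    true_positives = diseased * sensitivity
    false_positives = healthy * (1.0 - specificity)
    return true_positives / (true_positives + false_positives)

# Hypothetical prevalences chosen for illustration.
for prevalence in (0.001, 0.01, 0.1, 0.5):
    share = fraction_of_positives_diseased(prevalence)
    print(f"prevalence {prevalence:>5.1%}: {share:.1%} of positives are diseased")
```

With the same 99% sensitivity and specificity, the share of positives that are truly diseased is roughly 9% at a prevalence of 0.1%, 50% at 1%, and 99% only when the prevalence reaches 50%.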

## Introducing Bayes’ Theorem

The medical test paradox illustrates a gap between "test accuracy" and "post-test probability" when the condition is rare.
We observed that even a 99%-accurate test can yield only a 50% chance of actually having the disease if the prevalence is 1%.
This puzzle arises because many individuals mistakenly assume that the **test's accuracy** alone determines the **chance** of being diseased after a positive result, whereas the **prior prevalence** also plays a critical role.

Bayes' Theorem provides a mathematical statement that merges prior information about how common a disease is (the *prevalence*) with the evidence from a test result (the *likelihood*).
If we let $D$ represent the event that a person has the disease, and $T^+$ represent a positive test result, then Bayes' Theorem states:
\begin{align}
P(D \mid T^+) = \frac{P(T^+ \mid D)\; P(D)}{P(T^+)}
\end{align}
where 
1. $P(D)$ is the *prior probability* of having the disease, i.e., the prevalence (1%).
2. $P(T^+ \mid D)$ is the test's *sensitivity*: the probability of a positive test given that the person is actually sick (99%).
3. $P(T^+)$ is the total probability of a positive test, which includes:
   * True positives (when someone with the disease tests positive).  
   * False positives (when someone without the disease tests positive).
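
The two contributions to $P(T^+)$ can be written out with the law of total probability. Here $\neg D$ is notation introduced for this sketch (it does not appear elsewhere in the text) and denotes the event that a person does not have the disease:
\begin{align}
P(T^+) = P(T^+ \mid D)\; P(D) + P(T^+ \mid \neg D)\; P(\neg D)
\end{align}
where $P(T^+ \mid \neg D) = 1 - \text{specificity}$ is the false-positive rate and $P(\neg D) = 1 - P(D)$.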

### Connecting to Our Example

From our counting argument (with 10,000 people, 1% prevalence, and 99% sensitivity/specificity):

1. $P(D) = 0.01$.
2. $P(T^+ \mid D) = 0.99$.
3. $P(T^+)$ includes both the true-positive contribution $\bigl(0.99 \times 0.01\bigr)$ and the false-positive contribution $\bigl(0.01 \times 0.99\bigr)$, the latter because the test incorrectly flags 1% of the healthy 99% of the population.
   Adding these yields $0.0099 + 0.0099 = 0.0198$.

Hence, by Bayes' Theorem:
\begin{align}
P(D \mid T^+) = \frac{(0.99)\times(0.01)}{0.0198} = \frac{0.0099}{0.0198} = 0.50.
\end{align}
We recover the same 50% figure as in the counting argument, and now it is clear how each term (prior, sensitivity, false positives) contributes.
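
The calculation above can also be checked numerically. The following is a minimal sketch, not a general-purpose implementation: the function name `posterior_given_positive` is hypothetical, and exact rational arithmetic from Python's standard `fractions` module is used so the result is not blurred by floating-point round-off.

```python
from fractions import Fraction

def posterior_given_positive(prior, sensitivity, specificity):
    """Return P(D | T+) from Bayes' theorem."""
    true_positive = sensitivity * prior                 # P(T+ | D) P(D)
    false_positive = (1 - specificity) * (1 - prior)    # false-positive rate times P(healthy)
    evidence = true_positive + false_positive           # P(T+)
    return true_positive / evidence

# Values from the worked example: 1% prevalence, 99% sensitivity and specificity.
posterior = posterior_given_positive(
    prior=Fraction(1, 100),
    sensitivity=Fraction(99, 100),
    specificity=Fraction(99, 100),
)
print(posterior)  # prints: 1/2
```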

### Why Bayes’ Theorem Matters

The key power of Bayes' Theorem is that it forces us to incorporate the **prior probability** $P(D)$ before we look at new evidence $T^+$.
Once the data (test results) come in, we use the likelihood $P(T^+ \mid D)$ to update this prior, producing the **posterior probability** $P(D \mid T^+)$.
In the medical context, the "update" reveals that a single positive result, set against a low prevalence, may not be enough for a confident diagnosis.
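
One way to make the idea of updating concrete is to feed the posterior from one test back in as the prior for the next. The sketch below assumes a second positive result from a test with the same 99% sensitivity and specificity, and assumes the two results are conditionally independent given the person's true disease status. Both are assumptions made for illustration; real repeat tests are often correlated.

```python
def update_on_positive(prior, sensitivity=0.99, specificity=0.99):
    """One Bayesian update: P(D | T+) given the current prior P(D)."""
    true_positive = sensitivity * prior
    false_positive = (1.0 - specificity) * (1.0 - prior)
    return true_positive / (true_positive + false_positive)

belief = 0.01          # start from the 1% prevalence
history = [belief]     # belief before any test, then after each positive test
for test_number in (1, 2):
    # Assumption: each test is conditionally independent given disease status.
    belief = update_on_positive(belief)
    history.append(belief)
    print(f"after positive test {test_number}: P(D) = {belief:.3f}")
```

Under these assumptions the probability rises from 1% to about 50% after the first positive result and to about 99% after the second, which shows how additional evidence can overcome a low prior.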

### Broader Applications

While the disease-testing example is a simple illustration, Bayes' Theorem underpins many advanced topics:
- In **physics**, it helps estimate unknown parameters (like masses, cross sections, or damping coefficients) while incorporating prior knowledge (like experimental constraints or theoretical bounds).
- In **astronomy**, it deals with uncertain measurements of faint sources, updating beliefs about everything from exoplanet detections to cosmological parameters.
- In **engineering**, it refines reliability or failure-rate assessments with new test data.

The consistent theme is that **prior knowledge plus data** equals an updated understanding that neither alone can provide.
This is the heart of Bayesian statistics.
By seeing how a low base rate can overshadow test accuracy in medical examples, we avoid misinterpretations in a wide range of scientific and engineering problems, where ignoring prior information can lead to equally surprising results.