# Medical Screening Test

This notebook illustrates the meaning of the joint entropy measure with the help of a fictional medical screening test. Consider the following random variables:
* $X$: Actual condition in {Healthy, Positive}, where Positive means the patient has the disease
* $Y$: Test result in {Negative, Positive}

We will vary test quality from weak to strong and observe how the joint entropy $H\left(X,Y\right)$ and the shared information (mutual information) $I\left(X;Y\right)$ change.
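
For reference, the standard definitions of these two quantities are given below, with $p(x,y)$ denoting the joint distribution. Base-2 logarithms are an assumption here, chosen so that the values come out in bits, the unit used in this notebook:

$$
H\left(X,Y\right) = -\sum_{x}\sum_{y} p(x,y)\,\log_2 p(x,y),
\qquad
I\left(X;Y\right) = H\left(X\right) + H\left(Y\right) - H\left(X,Y\right).
$$

Terms with $p(x,y) = 0$ are taken to contribute zero to the sum.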

We first fix the disease prevalence, i.e., the marginal distribution of the actual condition $X$:

In [None]:
import numpy as np

# P(Healthy) = 0.9, P(Positive) = 0.1
p_x = np.array([0.9, 0.1])

Next, we need a function that builds the joint probability distribution $p(x,y)$ for a test of a given quality:

In [None]:
def build_test_joint_distribution(q: float) -> np.ndarray:
    """
    Build p(x,y) for a binary medical test.

    q in [0,1] controls test quality:
        q = 0.0 -> weak test (sensitivity = specificity = 0.5, result independent of the condition)
        q = 1.0 -> strong test (sensitivity = 0.95, specificity = 0.95)
    
    X rows: [Healthy, Positive]
    Y cols: [Negative, Positive]
    """
    if not (0.0 <= q <= 1.0):
        raise ValueError("q must be in [0,1].")
    sensitivity = 0.5 + 0.45 * q
    specificity = 0.5 + 0.45 * q
    # Conditional probabilities:
    # For Healthy: P(Neg|Healthy)=specificity, P(Pos|Healthy)=1-specificity
    # For Positive: P(Neg|Positive)=1-sensitivity, P(Pos|Positive)=sensitivity
    p_xy = np.array([
        [p_x[0] * specificity, p_x[0] * (1 - specificity)],
        [p_x[1] * (1 - sensitivity), p_x[1] * sensitivity],
    ])
    return p_xy
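
As a quick sanity check on this construction, the sketch below rebuilds the joint table by hand for the strong endpoint. The values are hard-coded for illustration (10% prevalence, sensitivity and specificity of 0.95, matching `q = 1.0`), and the variable names are not part of the notebook's own code. It confirms that the table sums to one and that its row marginal recovers the prevalence.

```python
import numpy as np

# Illustrative values: 10% prevalence and the q = 1.0 endpoint
# (sensitivity = specificity = 0.95).
prevalence = 0.1
sensitivity = 0.95
specificity = 0.95

# Rows: actual condition [Healthy, Positive]
# Cols: test result      [Negative, Positive]
joint = np.array([
    [(1 - prevalence) * specificity, (1 - prevalence) * (1 - specificity)],
    [prevalence * (1 - sensitivity), prevalence * sensitivity],
])

row_marginal = joint.sum(axis=1)  # distribution of the actual condition
col_marginal = joint.sum(axis=0)  # distribution of the test result

print(joint)
print("total:", joint.sum())
print("P(X):", row_marginal)
print("P(Y):", col_marginal)
```

The column marginal shows that positive results are rare overall, because most patients are healthy and the test is specific.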

We now run the experiment over a range of test qualities. To do so, we sample 80 evenly spaced values of $q$ in $[0, 1]$:

In [None]:
import numpy as np

qs = np.linspace(0.0, 1.0, 80)

We are now ready to run the experiment and display the results:

In [None]:
from entropy_lab.measures.entropy import compute_entropy, compute_joint_entropy

Hx_vals, Hy_vals, Hxy_vals, Hsum_vals, MI_vals = [], [], [], [], []

for q in qs:
    p_xy = build_test_joint_distribution(q)

    p_x_marg = p_xy.sum(axis=1)  # actual condition
    p_y_marg = p_xy.sum(axis=0)  # test result

    Hx = compute_entropy(p_x_marg)
    Hy = compute_entropy(p_y_marg)
    Hxy = compute_joint_entropy(p_xy)
    Ixy = Hx + Hy - Hxy

    Hx_vals.append(Hx)
    Hy_vals.append(Hy)
    Hxy_vals.append(Hxy)
    Hsum_vals.append(Hx + Hy)
    MI_vals.append(Ixy)

Hx_vals = np.array(Hx_vals)
Hy_vals = np.array(Hy_vals)
Hxy_vals = np.array(Hxy_vals)
Hsum_vals = np.array(Hsum_vals)
MI_vals = np.array(MI_vals)

# Show endpoint tables for interpretation
p_weak = build_test_joint_distribution(0.0)
p_strong = build_test_joint_distribution(1.0)

print("Weak test (q=0.0):")
print(p_weak)
print("sum =", p_weak.sum(), "\n")

print("Strong test (q=1.0):")
print(p_strong)
print("sum =", p_strong.sum(), "\n")
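
The cell above relies on `compute_entropy` and `compute_joint_entropy` from `entropy_lab`, whose implementation is not shown here. If that package is unavailable, a minimal stand-in might look like the sketch below. It assumes base-2 logarithms and the convention that zero-probability entries contribute nothing. The names `entropy_bits` and `joint_entropy_bits` are hypothetical and are not the library's API.

```python
import numpy as np

def entropy_bits(p) -> float:
    """Shannon entropy in bits; zero-probability entries contribute 0."""
    p = np.asarray(p, dtype=float).ravel()
    nonzero = p[p > 0]
    return float(-np.sum(nonzero * np.log2(nonzero)))

def joint_entropy_bits(p_xy) -> float:
    """Joint entropy H(X,Y): the entropy of the flattened joint table."""
    return entropy_bits(p_xy)

# Joint table for the strong endpoint (q = 1.0, prevalence 10%).
p_strong_example = np.array([
    [0.855, 0.045],
    [0.005, 0.095],
])

hx = entropy_bits(p_strong_example.sum(axis=1))   # actual condition
hy = entropy_bits(p_strong_example.sum(axis=0))   # test result
hxy = joint_entropy_bits(p_strong_example)
mi = hx + hy - hxy

print(f"H(X)={hx:.3f}  H(Y)={hy:.3f}  H(X,Y)={hxy:.3f}  I(X;Y)={mi:.3f}")
```

Filtering out zero entries before taking the logarithm avoids `nan` values when a cell of the joint table is exactly zero.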


In [None]:
import matplotlib.pyplot as plt

fig, (ax1, ax2) = plt.subplots(
    2, 1, figsize=(10, 10), sharex=True
)

# Top subplot: Entropy curves
ax1.plot(qs, Hx_vals, label="H(X): Actual condition")
ax1.plot(qs, Hy_vals, label="H(Y): Test result")
ax1.plot(qs, Hxy_vals, label="H(X,Y): Joint")
ax1.plot(qs, Hsum_vals, "--", label="H(X)+H(Y)")
ax1.set_ylabel("Entropy (bits)")
ax1.set_title("Joint Entropy in a Medical Screening System")
ax1.grid(True, alpha=0.3)
ax1.legend()

# Bottom subplot: Shared information
ax2.plot(qs, MI_vals, label="I(X;Y) = H(X)+H(Y)-H(X,Y)")
ax2.set_xlabel("Test quality")
ax2.set_ylabel("Shared information (bits)")
ax2.set_title("How Informative the Test Is About the True Condition")
ax2.grid(True, alpha=0.3)
ax2.legend()

plt.tight_layout()
plt.show()

As test quality improves, the test result becomes less random and more aligned with the true condition. This reduces the joint uncertainty $H\left(X,Y\right)$ and increases the mutual information $I\left(X;Y\right)$, i.e., the amount of information the test provides about the patient's actual condition.

Interestingly, even with a strong test (95% sensitivity and specificity), the mutual information remains below 0.3 bits because the disease prevalence is low (10%): the test cannot reveal more than the uncertainty in the condition itself, so $I\left(X;Y\right) \le H\left(X\right) \approx 0.47$ bits. This illustrates a key information-theoretic principle: the usefulness of a test depends not only on its accuracy, but also on the base rate of the condition.
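
To make the base-rate effect concrete, the sketch below holds the test fixed at 95% sensitivity and specificity and varies only the prevalence. The helper names and the three prevalence values are assumptions chosen for illustration. The entropy helper is a local stand-in using base-2 logarithms, not the `entropy_lab` implementation.

```python
import numpy as np

def entropy_bits(p) -> float:
    """Shannon entropy in bits; zero-probability entries contribute 0."""
    p = np.asarray(p, dtype=float).ravel()
    nonzero = p[p > 0]
    return float(-np.sum(nonzero * np.log2(nonzero)))

def mutual_information_bits(prevalence: float,
                            sensitivity: float = 0.95,
                            specificity: float = 0.95) -> float:
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for a binary test at a given prevalence."""
    joint = np.array([
        [(1 - prevalence) * specificity, (1 - prevalence) * (1 - specificity)],
        [prevalence * (1 - sensitivity), prevalence * sensitivity],
    ])
    hx = entropy_bits(joint.sum(axis=1))
    hy = entropy_bits(joint.sum(axis=0))
    return hx + hy - entropy_bits(joint)

# Same test, three illustrative base rates.
mi_by_prevalence = {}
for prevalence in (0.01, 0.1, 0.5):
    mi = mutual_information_bits(prevalence)
    hx = entropy_bits([1 - prevalence, prevalence])
    mi_by_prevalence[prevalence] = mi
    print(f"prevalence={prevalence:.2f}  I(X;Y)={mi:.3f} bits  H(X)={hx:.3f} bits")
```

In each case the mutual information stays below $H\left(X\right)$, and it grows as the prevalence moves toward 50%, where the condition itself is most uncertain.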