from sympy import S
from sympy.stats import Bernoulli, E, density, variance

# Create a Bernoulli random variable with probability of success p
p = S.Half
X = Bernoulli('X', p)

print(
    f"Probability mass function (PMF): {density(X)}\n"
    f"Expected value (mean): {E(X)}\n"
    f"Variance: {variance(X)}\n"
)


X
Probability mass function (PMF): BernoulliDistribution(1/2, 1, 0)
Expected value (mean): 1/2
Variance: 1/4



To evaluate the accuracy of a pregnancy test facility in a case study of 1,000 women, we test every participant against a known ground truth and summarize the results with standard diagnostic metrics. The plan below covers the study design, methodology, evaluation metrics, and a worked analysis.

### 1. **Study Overview**

- **Total Participants:** 1,000 women 
  - **Group A (Pregnant):** 500 women (definitely pregnant)
  - **Group B (Not Pregnant):** 500 women (definitely not pregnant)

### 2. **Objective**

The main objective is to test the accuracy of the pregnancy test facility by calculating its performance metrics, including sensitivity, specificity, positive predictive value, negative predictive value, and overall accuracy.

### 3. **Test Implementation**

Each woman will undergo a pregnancy test at the facility, which will classify each result as either:
- **Positive:** Indicates that the test suggests the woman is pregnant.
- **Negative:** Indicates that the test suggests the woman is not pregnant.

### 4. **Defining Outcomes**

After testing, results will be classified into four categories:

- **True Positive (TP):** Pregnant women who test positive.
- **False Positive (FP):** Non-pregnant women who test positive.
- **True Negative (TN):** Non-pregnant women who test negative.
- **False Negative (FN):** Pregnant women who test negative.

### 5. **Hypothetical Test Results**

For analysis, let's consider the following hypothetical results from the tests:

- **True Positives (TP):** 480 pregnant women test positive.
- **False Negatives (FN):** 20 pregnant women test negative.
- **True Negatives (TN):** 490 non-pregnant women test negative.
- **False Positives (FP):** 10 non-pregnant women test positive.

### 6. **Calculation of Metrics**

Using the results above, we can calculate the following performance metrics:

- **Sensitivity (True Positive Rate):**
  $$
  \text{Sensitivity} = \frac{TP}{TP + FN} = \frac{480}{480 + 20} = \frac{480}{500} = 0.96 \, (96\%)
  $$

- **Specificity (True Negative Rate):**
  $$
  \text{Specificity} = \frac{TN}{TN + FP} = \frac{490}{490 + 10} = \frac{490}{500} = 0.98 \, (98\%)
  $$

- **Positive Predictive Value (PPV):**
  $$
  \text{PPV} = \frac{TP}{TP + FP} = \frac{480}{480 + 10} = \frac{480}{490} \approx 0.9796 \, (97.96\%)
  $$

- **Negative Predictive Value (NPV):**
  $$
  \text{NPV} = \frac{TN}{TN + FN} = \frac{490}{490 + 20} = \frac{490}{510} \approx 0.9608 \, (96.08\%)
  $$

- **Overall Accuracy:**
  $$
  \text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} = \frac{480 + 490}{500 + 500} = \frac{970}{1000} = 0.97 \, (97\%)
  $$
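
The same arithmetic can be sketched in Python. This is a minimal illustration, not part of the study itself: the helper name `diagnostic_metrics` is hypothetical, and the counts are the hypothetical results from Section 5.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return summary metrics for a 2x2 diagnostic outcome (hypothetical helper)."""
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),    # share of pregnant women who test positive
        "specificity": tn / (tn + fp),    # share of non-pregnant women who test negative
        "ppv": tp / (tp + fp),            # share of positive tests that are correct
        "npv": tn / (tn + fn),            # share of negative tests that are correct
        "accuracy": (tp + tn) / total,    # share of all tests that are correct
    }


# Hypothetical counts from Section 5
metrics = diagnostic_metrics(tp=480, fn=20, tn=490, fp=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```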

### 7. **Results Summary**

Based on the hypothetical results, the pregnancy test facility demonstrates:
- **Sensitivity:** 96% (high ability to correctly identify pregnant women).
- **Specificity:** 98% (high ability to correctly identify non-pregnant women).
- **Positive Predictive Value (PPV):** ~98% (likelihood that a positive test result indicates pregnancy).
- **Negative Predictive Value (NPV):** ~96% (likelihood that a negative test result indicates no pregnancy).
- **Overall Accuracy:** 97% (overall correctness of the test results).

### 8. **Conclusion and Recommendations**

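The pregnancy test facility shows high accuracy in identifying both pregnant and non-pregnant women based on the hypothetical data provided. Note that the PPV and NPV reflect the study design, in which exactly half of the participants are pregnant; in a population with a different share of pregnant women, both values change even if sensitivity and specificity stay the same.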

**Recommendations:**
- These results suggest the tests are reliable, but continuous quality control is advised.
- Investigating causes for any false positives or negatives can help improve accuracy.
- It might be beneficial to repeat the study with a larger and more diverse sample to validate findings.

This framework provides a clear structure for analyzing and interpreting the accuracy of a pregnancy test facility in a controlled case study.

## Applying Bayes' Theorem with Sensitivity and Specificity

### Understanding Sensitivity and Specificity
* Sensitivity: The probability of a positive test result given the person is pregnant.
* Specificity: The probability of a negative test result given the person is not pregnant.

### Applying Bayes' Theorem

Let A be the event that the woman is pregnant and B the event that the test is positive. Suppose the test has a sensitivity of 98% and a specificity of 95%.
Given:
* P(B|A): Sensitivity = 0.98 (probability of a positive test given pregnancy)
* P(¬B|¬A): Specificity = 0.95 (probability of a negative test given no pregnancy)

The false positive rate is P(B|¬A) = 1 - Specificity = 0.05.
Assuming a prior probability of pregnancy P(A) of 20% (0.20), so that P(¬A) = 0.80, the law of total probability gives the overall probability of a positive test:
P(B) = P(B|A) * P(A) + P(B|¬A) * P(¬A)

P(B) = 0.98 * 0.20 + 0.05 * 0.80 = 0.236

Now, we can apply Bayes' theorem to find the probability of pregnancy given a positive test result:
P(A|B) = P(B|A) * P(A) / P(B)

P(A|B) = 0.98 * 0.20 / 0.236 ≈ 0.831
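
The two steps above can be sketched as a small Python function. This is a minimal illustration under the stated assumptions (sensitivity 0.98, specificity 0.95, prior 0.20); the function name `posterior_given_positive` is hypothetical.

```python
def posterior_given_positive(sensitivity, specificity, prior):
    """P(A|B): probability of pregnancy given a positive test (hypothetical helper)."""
    false_positive_rate = 1 - specificity                # P(B|¬A)
    p_positive = (sensitivity * prior                    # true positives
                  + false_positive_rate * (1 - prior))   # false positives
    return sensitivity * prior / p_positive              # Bayes' theorem


posterior = posterior_given_positive(sensitivity=0.98, specificity=0.95, prior=0.20)
print(f"{posterior:.3f}")  # prints 0.831
```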

**Interpretation:**

If the test is positive, there is an 83.1% probability that the woman is pregnant. This is lower than the 98% sensitivity because false positives among the 80% of women who are not pregnant also contribute to the pool of positive results.

**Key Points:**
* Prior Probability: The initial belief or estimate of the probability of an event (in this case, pregnancy) is crucial.
* Test Accuracy: Sensitivity and specificity directly influence the accuracy of the posterior probability.
* Bayes' Theorem: Allows us to update our beliefs as we gather more evidence (in this case, the test result).

By understanding and applying Bayes' theorem, we can make more informed decisions based on probabilistic information, even when dealing with uncertain outcomes.

## 2x2 Contingency Table for Pregnancy Test Results

A 2x2 contingency table is a common tool used to analyze the performance 
of diagnostic tests, including pregnancy tests. It provides a clear and 
concise way to visualize the outcomes of a test and calculate sensitivity 
and specificity.

**Here's a 2x2 contingency table:**

|                           | Test Positive | Test Negative | Total |
|---------------------------|---------------|---------------|-------|
| **Actually Pregnant**     | TP            | FN            | TP+FN |
| **Actually Not Pregnant** | FP            | TN            | FP+TN |
| **Total**                 | TP+FP         | FN+TN         | N     |

* **TP:** True Positive (Correctly identifies a pregnant person)
* **FN:** False Negative (Incorrectly identifies a pregnant person as not pregnant)
* **FP:** False Positive (Incorrectly identifies a non-pregnant person as pregnant)
* **TN:** True Negative (Correctly identifies a non-pregnant person)
* **N:** Total number of tests

**Calculating Sensitivity and Specificity:**

* **Sensitivity:** TP / (TP+FN)
* **Specificity:** TN / (TN+FP)

**Example:**

Let's say a pregnancy test has the following results:

|                           | Test Positive | Test Negative | Total |
|---------------------------|---------------|---------------|-------|
| **Actually Pregnant**     | 95            | 5             | 100   |
| **Actually Not Pregnant** | 2             | 98            | 100   |
| **Total**                 | 97            | 103           | 200   |

* **Sensitivity:** 95 / (95+5) = 0.95 or 95%
* **Specificity:** 98 / (98+2) = 0.98 or 98%

This means that:
* 95% of pregnant women will get a positive result.
* 98% of non-pregnant women will get a negative result.
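
As a rough sketch, the example table can be held in a nested dictionary and its margins recomputed in Python. The variable names here are illustrative choices, and the counts are the ones from the example table.

```python
# Rows are the true condition, columns are the test result (counts from the example table)
table = {
    "pregnant": {"positive": 95, "negative": 5},
    "not_pregnant": {"positive": 2, "negative": 98},
}

# Marginal totals: one per row, one per column, plus the grand total N
row_totals = {condition: sum(counts.values()) for condition, counts in table.items()}
column_totals = {
    result: sum(counts[result] for counts in table.values())
    for result in ("positive", "negative")
}
n = sum(row_totals.values())

# Sensitivity and specificity are row-wise proportions
sensitivity = table["pregnant"]["positive"] / row_totals["pregnant"]
specificity = table["not_pregnant"]["negative"] / row_totals["not_pregnant"]

print(row_totals, column_totals, n)
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```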

By understanding sensitivity and specificity, you can assess the 
reliability of a pregnancy test and make informed decisions based on the results.


To determine the probability of being pregnant given a positive test result, we need to use Bayes' theorem. This theorem helps us update our beliefs about an event (pregnancy) based on new information (a positive test result).

**Here's the formula:**

```
P(Pregnant | Positive Test) = P(Positive Test | Pregnant) * P(Pregnant) / P(Positive Test)
```

**Breaking down the terms:**

* **P(Pregnant | Positive Test):** This is the probability we want to find - the probability of being pregnant given a positive test result.
* **P(Positive Test | Pregnant):** This is the sensitivity of the test - the probability of a positive test result given that the person is actually pregnant.
* **P(Pregnant):** This is the prior probability of being pregnant before taking the test. It depends on various factors like age, lifestyle, and fertility.
* **P(Positive Test):** This is the probability of getting a positive test result, regardless of whether the person is pregnant or not.

**Calculating the Probability:**

To calculate the exact probability, we need specific values for sensitivity, specificity, and the prior probability of pregnancy. 

However, we can make some general observations:

* **High Sensitivity and Specificity:** Sensitivity determines how many pregnancies the test detects, while specificity determines how many positive results are false alarms. A positive result raises the probability of pregnancy substantially only when both are high.
* **Low Prior Probability:** If the prior probability of pregnancy is low, false positives make up a larger share of all positive results, so the probability of pregnancy given a positive test can be much lower than the test's sensitivity alone suggests.
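
To see how strongly the prior shapes the answer, the sketch below evaluates Bayes' theorem over several priors. The sensitivity (0.99), specificity (0.98), and the list of priors are assumed values chosen only for illustration, and the function name is hypothetical.

```python
def posterior_given_positive(sensitivity, specificity, prior):
    """P(Pregnant | Positive Test) via Bayes' theorem (hypothetical helper)."""
    p_positive = sensitivity * prior + (1 - specificity) * (1 - prior)
    return sensitivity * prior / p_positive


SENSITIVITY = 0.99  # assumed for illustration
SPECIFICITY = 0.98  # assumed for illustration

# Map each assumed prior to the resulting posterior probability
posteriors = {
    prior: posterior_given_positive(SENSITIVITY, SPECIFICITY, prior)
    for prior in (0.01, 0.05, 0.20, 0.50)
}

for prior, posterior in posteriors.items():
    print(f"prior={prior:.2f} -> posterior={posterior:.3f}")
```

Under these assumed values, a 1% prior yields a posterior of only about one in three, even though the test itself is very accurate.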

**In general, a positive home pregnancy test is a strong indicator of pregnancy, especially if taken correctly and after a missed period.** For definitive confirmation, consult a healthcare provider, who can take individual factors into account.

----

## Example: Calculating the Probability of Pregnancy Given a Positive Test

**Let's assume the following:**

* **Sensitivity:** 99% (P(Positive Test | Pregnant) = 0.99)
* **Specificity:** 98% (P(Negative Test | Not Pregnant) = 0.98)
* **Prior Probability of Pregnancy:** 20% (P(Pregnant) = 0.20)

**First, we need to calculate P(Positive Test):**

P(Positive Test) = P(Positive Test | Pregnant) * P(Pregnant) + P(Positive Test | Not Pregnant) * P(Not Pregnant)

P(Positive Test) = 0.99 * 0.20 + (1 - 0.98) * (1 - 0.20) = 0.198 + 0.016 = 0.214

**Now, we can use Bayes' theorem to calculate the probability of being pregnant given a positive test:**

P(Pregnant | Positive Test) = (0.99 * 0.20) / 0.214 ≈ 0.925

**Interpretation:**

Given these assumptions, if you get a positive test result, there's approximately a 92.5% chance that you are pregnant.
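
One way to check this arithmetic without floating-point rounding is exact rational arithmetic. The sketch below uses Python's standard `fractions` module with the three assumed inputs from this example; the variable names are illustrative.

```python
from fractions import Fraction

# Assumed inputs from the example, written as exact fractions
sensitivity = Fraction(99, 100)
specificity = Fraction(98, 100)
prior = Fraction(20, 100)

# Law of total probability: P(Positive Test)
p_positive = sensitivity * prior + (1 - specificity) * (1 - prior)

# Bayes' theorem: P(Pregnant | Positive Test)
posterior = sensitivity * prior / p_positive

print(p_positive, float(p_positive))  # 107/500 0.214
print(posterior, round(float(posterior), 3))  # 99/107 0.925
```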

**Remember:** This is just an example, and the actual probability can vary based on factors like the specific test used, timing of the test, and individual circumstances. For a more accurate assessment, it's always best to consult with a healthcare provider.

