## Exercise #1: If someone gets a positive test, is it "statistically significant" at the p<0.05 level? Why or why not?

A positive test result is **not** "statistically significant" at the p<0.05 level. Under the null hypothesis (the person is not infected), the probability of a positive test is the false positive rate, which is exactly 5%. That gives p = 0.05, which equals the alpha/significance level rather than falling below it. Therefore, one cannot confidently reject the null hypothesis.

To put this in context, the HIV incidence is 0.6% according to WHO. Therefore, in a 1000 person sample, assuming the test detects every true infection:
 * True Positive: 1000 x 6/1000 = 6 people
 * False Positive: 1000 x (1000-6)/1000 x 0.05 = 49.7 people

In a 1000 person sample, someone receiving a positive HIV test can only be 10.77% confident (6/(49.7+6) = 0.1077) that it correctly indicates they have HIV. A positive result on its own is therefore weak evidence of infection, which is consistent with the test not being "statistically significant" at the p<0.05 level.
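The two quantities in this answer are easy to mix up: the p-value is the probability of a positive test given no infection, while the 10.77% is the probability of infection given a positive test. The following is a minimal sketch of the second calculation. The function name `posterior_given_positive` and the `sensitivity` parameter are hypothetical additions for illustration; the hand calculation above implicitly assumes a sensitivity of 1.0 (no false negatives).

```python
def posterior_given_positive(prior, false_pos_rate=0.05, sensitivity=1.0):
    """Probability of infection given a positive test, via Bayes' theorem.

    `sensitivity` is an assumed parameter (not in the original exercise);
    1.0 means the test detects every true infection.
    """
    true_pos = prior * sensitivity             # fraction infected AND testing positive
    false_pos = (1 - prior) * false_pos_rate   # fraction uninfected AND testing positive
    total_pos = true_pos + false_pos
    if total_pos == 0:
        return float("nan")                    # no positive tests are possible
    return true_pos / total_pos


# P(positive | not infected): this is the p-value of a single positive test
p_value = 0.05

# P(infected | positive) at the 0.6% rate used above
posterior = posterior_given_positive(0.006)
print(round(posterior, 4))  # 0.1077
```

The p-value stays at 0.05 no matter how common the infection is, whereas the posterior changes with the assumed infection rate.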

## Exercise #2: What is the probability that if someone gets a positive test, that person is infected?

Let's do the same thing, but this time we will try different values for the proportion of the population that is actually infected. What you should notice is that the PROPORTION INFECTED GIVEN A POSITIVE TEST depends (a lot!) on the OVERALL RATE OF INFECTION. Put another way, to determine the probability of a hypothesis given your data (e.g., proportion infected given a positive test), you have to know the probability that the hypothesis was true before seeing any data (the prior).
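For reference, the dependence described above is Bayes' theorem. The notation is an assumption introduced here, not taken from the exercise: $I$ means "infected" and $+$ means "positive test". Assuming the test detects every true infection, i.e. $P(+ \mid I) = 1$, and using the 5% false positive rate:

$$
P(I \mid +) = \frac{P(+ \mid I)\,P(I)}{P(+ \mid I)\,P(I) + P(+ \mid \neg I)\,\bigl(1 - P(I)\bigr)} = \frac{P(I)}{P(I) + 0.05\,\bigl(1 - P(I)\bigr)}
$$

The prior $P(I)$ appears in both the numerator and the denominator, which is why the answer shifts so strongly with the overall rate of infection. For example, $P(I) = 0.006$ gives $0.006 / 0.0557 \approx 0.108$.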

For this exercise, assume a range of priors (infection rates) from 0 to 1 in steps of 0.1.

I will use an HIV incidence of 0.6% based on WHO data from this website: https://www.who.int/data/gho/data/themes/hiv-aids#:~:text=Globally%2C%2039.9%20million%20%5B36.1%E2%80%93,considerably%20between%20countries%20and%20regions.

In [6]:
import numpy as np

incidence = 0.006
population = 1000
false_pos_rate = 0.05

true_pos = population * incidence
false_pos = population * (1-incidence) * false_pos_rate
total_pos = true_pos + false_pos

# Probability of true HIV infection given a positive test (Bayes' theorem),
# assuming the test detects every true infection (no false negatives)
probability_true_infection = true_pos / total_pos
print("The probability of true HIV infection given an incidence of 0.6% is:", probability_true_infection)

# Loop through incidence rates from 0 to 1 in steps of 0.1
for incidence in np.arange(0, 1.1, 0.1):
    print(f'Incidence rate: {incidence}')

    # Calculate true positives, false positives, total positives,
    # and the probability of true HIV infection given a positive test
    true_pos = population * incidence
    false_pos = population * (1-incidence) * false_pos_rate
    total_pos = true_pos + false_pos
    probability_true_infection = true_pos / total_pos

    # Print results for the current incidence rate
    print("Total positive tests:", total_pos)
    print("Probability of true infection (being infected given a positive test):", probability_true_infection)
    print("--------------------")

The probability of true HIV infection given an incidence of 0.6% is: 0.10771992818671454
Incidence rate: 0.0
Total positive tests: 50.0
Probability of true infection (being infected given a positive test): 0.0
--------------------
Incidence rate: 0.1
Total positive tests: 145.0
Probability of true infection (being infected given a positive test): 0.6896551724137931
--------------------
Incidence rate: 0.2
Total positive tests: 240.0
Probability of true infection (being infected given a positive test): 0.8333333333333334
--------------------
Incidence rate: 0.30000000000000004
Total positive tests: 335.00000000000006
Probability of true infection (being infected given a positive test): 0.8955223880597015
--------------------
Incidence rate: 0.4
Total positive tests: 430.0
Probability of true infection (being infected given a positive test): 0.9302325581395349
--------------------
Incidence rate: 0.5
Total positive tests: 525.0
Probability of true infection (being infected given a positi