In [1]:
import numpy as np
import math
import pandas as pd

# Risk and Decision Making - Elliot Linsey QMUL 2022

This module is about the notion of probability and most specifically *Bayesian* subjective probability. Before we reach that, let us first go through the building blocks of basic probability. 

Probability quantifies the *uncertainty* of an event occurring. It can be applied to any event, whether past, present or future, about which we lack complete information. It can apply to many different types of events such as legal, medical, political etc. Some examples could be: 

* Did OJ Simpson kill his wife? 
* Do I have a certain disease?
* Will this coin flip land on heads or tails? 

The Bayesian narrative involves updating our belief in an event occurring, either using previously observed events or expert judgement. 

![bayes%20theorem.JPG](attachment:bayes%20theorem.JPG)

## Odds

This is a common way for bookies and betting companies to translate probabilities. It is simply defined as: 

![Odds.JPG](attachment:Odds.JPG)

An example could be the odds of 4/1, which is verbalised as "4 to 1 against". In this case, if the event does occur, for every £1 you stake you win £4 + your original stake of £1, so £5 in total.

Due to this, the way the bookie sees the event is that it is 4 times more likely to **not** occur. To figure out the percentage chance that it will happen, you simply do:

![odds2.JPG](attachment:odds2.JPG)

Which is 100 x 1/5 = 20\% chance of occurring (or a probability of 0.2).

This is also the case when an event has a high chance of occurring, say the odds of 1/4, verbalised as "4 to 1 on". This means that for every £4 you put on, if the event occurs you only receive £1 + your original stake of £4, for £5 total.

The percentage chance in this case is 100 x 4/5 = 80%. 
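The two conversions above can be sketched in a few lines of Python. This is only an illustration: the function names `odds_against_to_probability` and `total_return` are made up for this example, and the odds are written as "against/in favour" exactly as in the text.

```python
from fractions import Fraction

def odds_against_to_probability(against, in_favour):
    """Convert bookmaker odds 'against/in_favour' (e.g. 4/1) into a probability."""
    return Fraction(in_favour, against + in_favour)

def total_return(stake, against, in_favour):
    """Winnings plus the returned stake if the event occurs."""
    return Fraction(stake) + Fraction(stake * against, in_favour)

print(odds_against_to_probability(4, 1))  # odds of 4/1, "4 to 1 against"
print(odds_against_to_probability(1, 4))  # odds of 1/4, "4 to 1 on"
print(total_return(1, 4, 1))              # £1 staked at 4/1
print(total_return(4, 1, 4))              # £4 staked at 1/4
```

The first two lines print 1/5 and 4/5 (20% and 80%), and both bets return £5 in total.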

## Axioms

These are assumed rules we make about probability in order to have a basis for our theorems later on and to actually prove anything. 

### No. 1

*The probability of any event is a number between zero and one*. 

This is derived from the fact that we can never be more than 100% certain or less than 0% certain of an event occurring. Therefore, if we divide our percentage chance by 100, we always end up with a number between 0 and 1 for our probability. 

Take into account that if we have a very low *percentage chance*, we still divide by 100 to get our probability and end up with an even smaller number. i.e if we have a percentage chance of 0.4%, we have a probability of 0.004. 

We write a probability of an event as p(E). This value cannot be greater than 1 or less than 0 as then it would not satisfy the axiom requirements. 

### No. 2

*The probability of the exhaustive event is 1*.

The exhaustive event consists of every possible outcome that can occur within the sample space (or experiment). For example, when rolling a die, the events {roll 6} and {don't roll 6} together make up the exhaustive event as all possible outcomes are covered, therefore the probability of one of these events happening is 1.

### No. 3

*For Mutually Exclusive events, the probability of either event happening is the sum of the probabilities of the individual events*. 

Firstly, the definition of **mutually exclusive** events is 2 (or more) sets of elementary events that cannot occur at the same time. For instance, you cannot roll both a 4 and a 6 on the same die roll. 

Here we introduce the notion of the **union** of events, that is "either $E_1$ or $E_2$". Suppose we have two events for a die roll, $E_1$ = {5,6} and $E_2$ = {1,2,3} (together, any number except 4). The probability of the union of these two events is calculated by summing each event's probability: 2/6 + 3/6 = 5/6.

Mathematicians write this as $p(E_1\cup E_2) = p(E_1) + p(E_2)$ 

**Remember, this simple sum only holds for mutually exclusive events. Events that can occur together need an extra term.** 

Events that can occur together: 

The union of two such events is still written as $p(E_1\cup E_2)$. However, this time we must subtract the **intersection** of the two events, otherwise the outcomes where both occur are counted twice.

The intersection is when both events occur *at the same time*, e.g. you roll a 6 on a die and get a tail on a coin toss. These two events are **independent** (one does not affect the other), so the intersection is simply the product of the probabilities: 1/6 x 1/2 = 1/12. 

For the union of rolling a 6 on a die **or** getting a tail on a coin toss, you subtract the intersection from the sum of the probabilities: 1/6 + 1/2 - 1/12 = 7/12. 

$p(E_1\cup E_2) = p(E_1) + p(E_2) - p(E_1 \cap E_2)$ 

**Remember, $p(A \cup B)$ means the probability of either A or B (or both) occurring. The formula above always holds; for mutually exclusive events the intersection is 0, so it reduces to the simple sum.** 

**$p(A \cap B)$ means the probability of both A and B occurring. It equals $p(A) \times p(B)$ only for independent events.**

### No. 4

*The probability of the complement of an event is 1 - p(E).* 

The complement of an event E consists of all the outcomes in which E does not occur. In essence, if $E$ is the event, then $not\ E$ is the complement. Due to axioms 2 and 3, we know that the probability of the exhaustive event is 1, and this can also be thought of as p(E) + p(not E), as together they encompass every possible outcome. Therefore, we can find either p(E) or p(not E) by subtracting the other from 1. 

Due to this, it is sometimes easier to find the probability of the opposite of the event you are investigating and subtract it from 1. For example, if you wanted to find the probability of not rolling a 6, you could find the probability of rolling a 6 (1/6) and subtract it from 1: 1 - 1/6 = 5/6. 

If we have a binary experiment with only 2 events, by assigning a probability to one we are automatically assigning a probability to the other. If p(E) = 0.3, then p(not E) = 1 - 0.3 = 0.7. 

Mathematicians write "not E" as $\neg E$.

### Example

What is the probability of drawing a card from a deck that is either red or an ace? Assuming it is a fair deck. 

Notice that the wording is red **or** an ace. Therefore we are trying to find the union of the events A (red) and B (ace) = $p(A \cup B)$. However, we can have cards that are both red and an ace so we will have to find the intersection of these events as well to calculate the union. 

$p(A) = 26/52$ 

$p(B) = 4/52$

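$p(A\cap B) = 26/52 \times 4/52 = 2/52$ (colour and rank are independent in a fair deck; the two cards in the intersection are the red aces)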

Therefore: $p(A \cup B) = 26/52 + 4/52 - 2/52 = 28/52 = 7/13$
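As a sanity check, the same answer can be obtained by brute force: build the deck, collect the red cards and the aces as sets, and count. This is just a sketch; the rank and suit labels are arbitrary names chosen for the example.

```python
from fractions import Fraction
from itertools import product

ranks = ['A', '2', '3', '4', '5', '6', '7', '8', '9', '10', 'J', 'Q', 'K']
suits = ['hearts', 'diamonds', 'clubs', 'spades']
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

red = {card for card in deck if card[1] in ('hearts', 'diamonds')}
aces = {card for card in deck if card[0] == 'A'}

def p(event):
    """Probability of drawing a card that belongs to the given set."""
    return Fraction(len(event), len(deck))

p_union = p(red | aces)         # red OR ace
p_intersection = p(red & aces)  # red AND ace (the two red aces)

print(p_intersection, p_union)
```

This prints 1/26 (which is 2/52) and 7/13, and the counted union agrees with p(A) + p(B) - p(A ∩ B).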

The standard error is very similar to the standard deviation in that both measure spread. However, the standard deviation measures the spread of the individual data points, whilst the standard error measures the spread of a sample statistic (such as the sample mean) across repeated samples. 

The phrase "standard error of the mean" tells you how far the sample mean is likely to deviate from the population mean. The larger the sample size, the smaller the SE and the closer you are likely to be to the true population mean.  

![standard%20error.JPG](attachment:standard%20error.JPG)
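A small sketch of the calculation, assuming the formula in the figure is the usual standard error of the mean, SE = s/√n, where s is the sample standard deviation. The sample values below are hypothetical and only there to make the example runnable.

```python
import math
import statistics

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements

n = len(sample)
sample_sd = statistics.stdev(sample)       # sample standard deviation (n - 1 denominator)
standard_error = sample_sd / math.sqrt(n)  # standard error of the mean

print(round(sample_sd, 3), round(standard_error, 3))
```

Quadrupling the sample size (with the same spread) would halve the standard error, which is the sense in which larger samples get closer to the population mean.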

## Relative and Absolute Risk

These two are technically highlighting the same thing, but displaying the difference in risk in different ways. 

An example question is this:

"It is known that about 2.3% of people who have sleeping disorders have severe insomnia (defined as going more than 36 hours without being able to sleep at all)

A study of 1000 people who have sleeping disorders discovered that tea-drinkers (classified as those who drink more than 2 cups of tea a day) are more likely to suffer severe insomnia."

The first thing to take from this is the initial figure of 2.3%. In the table below, the 'Severe Insomnia' row sums to 9 + 14 = 23 and the total number of participants is 1000, so 23/1000 = 2.3%, which matches the figure above. 

The second part relates to the fact that out of 300 tea-drinkers, 9 have severe insomnia so 9/300 = 0.03 or 3% have insomnia. For non tea-drinkers, 14 out of 700 have insomnia so 14/700 = 0.02 or 2%. Remember in these cases to convert the decimal value to percentage by multiplying by 100. 

In [2]:
risk = pd.DataFrame([
    [9,14],
    [291,686],
    [300,700]
],index=['Severe Insomnia','Other sleeping disorders','Total'],columns=['Tea-drinkers','Non tea-drinkers'])
risk

|                          | Tea-drinkers | Non tea-drinkers |
| :---                     | ---: | ---: |
| Severe Insomnia          | 9 | 14 |
| Other sleeping disorders | 291 | 686 |
| Total                    | 300 | 700 |


These questions are about those with sleeping disorders: 

ai)  What is the relative increase in risk of having severe insomnia for tea drinkers compared to non tea-drinkers?

For the relative increase, we express the increase as a proportion of the baseline risk. In this case, 2% of non tea-drinkers have severe insomnia and 3% of tea-drinkers do. Therefore, the relative risk increase of drinking tea is (3-2)/2 = 0.5 or 50%. 

aii)  What is the absolute increase in risk of having severe insomnia for tea drinkers compared to those who are not tea-drinkers?

The absolute increase is simply the difference between the actual values, again 2% for non tea-drinkers and 3% for tea-drinkers. Therefore, 3% - 2% = 1 percentage point, or 0.01.
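Both answers can be reproduced with two small helper functions. The function names are made up for this sketch; the counts are the ones in the table above.

```python
def risk(cases, total):
    """Proportion of a group that has the condition."""
    return cases / total

def absolute_risk_increase(risk_exposed, risk_unexposed):
    return risk_exposed - risk_unexposed

def relative_risk_increase(risk_exposed, risk_unexposed):
    return (risk_exposed - risk_unexposed) / risk_unexposed

tea = risk(9, 300)       # 3% of tea-drinkers
non_tea = risk(14, 700)  # 2% of non tea-drinkers

print(f"relative increase: {relative_risk_increase(tea, non_tea):.0%}")
print(f"absolute increase: {absolute_risk_increase(tea, non_tea):.2%}")
```

The same 1 percentage point difference reads as a 50% increase in relative terms, which is why the two measures can give such different impressions.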



Here's where understanding the question starts to get more interesting: 

b) Suppose we know that 10% of the population have sleep disorders. Of those with sleeping disorders, 30% are tea-drinkers. Of those with no sleeping disorders only 20% are tea-drinkers. Answer the following questions about the whole population: 

With questions like the above, it helps to build a table of counts for an imagined population that matches the proportions given to us. 

If we imagine a population of 100,000, 10% of these (10k) have sleeping disorders whilst the other 90k do not. 

Within this 10k, we know 30% are tea-drinkers, so 3k tea-drinkers **in total**. We know from our previous table that 3% (0.03) of tea-drinkers with a sleeping disorder also have severe insomnia, so 0.03 x 3000 = 90. Therefore, the remaining 2910 tea-drinkers must have other sleeping disorders. 

For the remaining 7k non tea-drinkers, we know 2% (0.02) have severe insomnia, so 0.02 x 7000 = 140, leaving 7000 - 140 = 6860 with other sleeping disorders. 

In [3]:
risk2 = pd.DataFrame([
    [90,140],
    [2910,6860],
    [3000,7000]
],index=['Severe Insomnia','Other sleeping disorders','Total'],columns=['Tea-drinkers','Non tea-drinkers'])
risk2

|                          | Tea-drinkers | Non tea-drinkers |
| :---                     | ---: | ---: |
| Severe Insomnia          | 90 | 140 |
| Other sleeping disorders | 2910 | 6860 |
| Total                    | 3000 | 7000 |


For those without sleeping disorders, the remaining 90% of our population (90k), we know 20% are tea-drinkers (18k), which leaves 72k non tea-drinkers. As none of these people have severe insomnia, we simply put them into the 'Other sleeping disorders' category, which is doubling as a 'no severe insomnia' category in this instance. 

In [4]:
risk3 = pd.DataFrame([
    [0,0],
    [18000,72000],
    [18000,72000]
],index=['Severe Insomnia','Other sleeping disorders','Total'],columns=['Tea-drinkers','Non tea-drinkers'])
risk3

|                          | Tea-drinkers | Non tea-drinkers |
| :---                     | ---: | ---: |
| Severe Insomnia          | 0 | 0 |
| Other sleeping disorders | 18000 | 72000 |
| Total                    | 18000 | 72000 |


From here, we can combine the two tables to create our full population risk table. 

In [5]:
risk4 = pd.DataFrame([
    [90,140],
    [20910,78860],
    [21000,79000]
],index=['Severe Insomnia','Other sleeping disorders','Total'],columns=['Tea-drinkers','Non tea-drinkers'])
risk4

|                          | Tea-drinkers | Non tea-drinkers |
| :---                     | ---: | ---: |
| Severe Insomnia          | 90 | 140 |
| Other sleeping disorders | 20910 | 78860 |
| Total                    | 21000 | 79000 |


Now, we can answer questions about the whole population: 

bi) What is the relative increase in risk of having severe insomnia for tea-drinkers compared to those who are not tea-drinkers?

90/21000 = 0.4286%

140/79000 = 0.1772%

Relative risk increase = (0.4286-0.1772)/0.1772 = 1.42 or 142% (It seems easier to convert the initial calculations directly into percentages)

bii) What is the absolute increase in risk of having severe insomnia for tea drinkers compared to those who are not tea-drinkers?

Absolute risk increase:  0.4286-0.1772 = 0.2514 percentage points

In [6]:
(0.4286-0.1772)

0.25139999999999996
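The whole of part b) can also be derived directly from the stated percentages rather than typed in by hand. The sketch below uses `Fraction` so that the counts stay exact; the variable names are illustrative, and the imagined population of 100,000 is the same assumption used in the text.

```python
from fractions import Fraction as F

population = 100_000

with_disorder = population * F(10, 100)        # 10% have a sleeping disorder
without_disorder = population - with_disorder  # the other 90%

tea_with = with_disorder * F(30, 100)          # 30% of those with a disorder drink tea
non_tea_with = with_disorder - tea_with
tea_without = without_disorder * F(20, 100)    # 20% of those without a disorder drink tea
non_tea_without = without_disorder - tea_without

tea_insomnia = tea_with * F(3, 100)            # 3% rate from the study table
non_tea_insomnia = non_tea_with * F(2, 100)    # 2% rate from the study table

tea_total = tea_with + tea_without
non_tea_total = non_tea_with + non_tea_without

risk_tea = tea_insomnia / tea_total
risk_non_tea = non_tea_insomnia / non_tea_total
absolute = risk_tea - risk_non_tea
relative = absolute / risk_non_tea

print(f"risk: {float(risk_tea):.4%} vs {float(risk_non_tea):.4%}")
print(f"relative increase: {float(relative):.0%}, absolute increase: {float(absolute):.4%}")
```

Working from exact fractions avoids the small rounding error that comes from subtracting the already rounded percentages.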

## Risk Ratios

This is related to absolute and relative risks. 

D = Person has the disease

R = Person exposed to (specific) risk

![risk%20ratios.JPG](attachment:risk%20ratios.JPG)

p(D|R) = probability of disease given exposed to risk = a/(a+b)

p(D|not R) = probability of disease given not exposed to risk = c/(c+d)

Risk ratio = p(D|R)/p(D|not R) = (a/(a+b))/(c/(c+d))

![risk%20ratios2.JPG](attachment:risk%20ratios2.JPG)

Risk ratio of whether being exposed to the risk increases the chance of disease = (120/420)/(130/580) = 0.2857/0.2241 = 1.27 (as it is greater than 1 it does increase risk). 

In [7]:
((4/100000)-(147/100000))/(4/100000)

-35.74999999999999

Within these formulas we are measuring the risk increase for features R (a and b) compared to Not R (c and d).

![relative%20risk%20increase.JPG](attachment:relative%20risk%20increase.JPG)

![absolute%20risk%20increase.JPG](attachment:absolute%20risk%20increase.JPG)

![odds%20ratio.JPG](attachment:odds%20ratio.JPG)
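The measures in this section can be collected into one helper. This is a sketch under two assumptions: that the formulas in the figures are the standard definitions (relative risk increase = (p(D|R) - p(D|not R))/p(D|not R), absolute risk increase = p(D|R) - p(D|not R), odds ratio = (a/b)/(c/d)), and that b = 300 and d = 450, which are inferred from the denominators 420 and 580 used in the risk ratio example.

```python
def risk_measures(a, b, c, d):
    """a, b: with / without disease among the exposed (R);
    c, d: with / without disease among the unexposed (not R)."""
    p_exposed = a / (a + b)    # p(D|R)
    p_unexposed = c / (c + d)  # p(D|not R)
    return {
        "risk_ratio": p_exposed / p_unexposed,
        "relative_risk_increase": (p_exposed - p_unexposed) / p_unexposed,
        "absolute_risk_increase": p_exposed - p_unexposed,
        "odds_ratio": (a / b) / (c / d),
    }

measures = risk_measures(a=120, b=300, c=130, d=450)
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```

Note that the relative risk increase is always the risk ratio minus 1, so a risk ratio of 1.27 corresponds to a 27% relative increase.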

## Paradoxes

### Simpson's Paradox

The presence of a confounding variable within our data leads us to make the incorrect conclusion about a trial or set of results.

![simpsons%20paradox.JPG](attachment:simpsons%20paradox.JPG)

Above, we see that Treatment B has the higher overall success rate. But when you drill down into the data and stratify by kidney stone size, we find that Treatment A actually has the higher success rate within each stone size. 

This has occurred because we have given different proportions of treatments to different kidney stone sizes. 
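The reversal is easy to reproduce numerically. The counts below are an assumption: they are the widely quoted figures from the classic kidney stone study, which the figure above appears to be based on, so they may not match it exactly.

```python
# (successes, patients) per treatment and stone size; assumed classic figures
data = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(successes, patients):
    return successes / patients

def overall(treatment):
    """Success rate for a treatment with both stone sizes pooled together."""
    strata = data[treatment].values()
    return rate(sum(s for s, _ in strata), sum(n for _, n in strata))

for treatment, strata in data.items():
    per_size = {size: f"{rate(s, n):.0%}" for size, (s, n) in strata.items()}
    print(treatment, per_size, f"overall {overall(treatment):.0%}")
```

With these counts, Treatment A was mostly given to large stones (the harder cases) and Treatment B mostly to small stones, which is what drags A's pooled rate below B's even though A wins within each stone size.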

![causal%20simpson.JPG](attachment:causal%20simpson.JPG)

### Illustrated within AgenaRisk

Within this example, we see that if we take the drug we have a 50/50 chance of recovery. However, this is because knowing that a subject took the drug increases the probability that the subject is male, and males have a greater chance of recovery than females. Within this, our confounding variable is sex. 

![simpsons%20agena1.JPG](attachment:simpsons%20agena1.JPG)

This is an example of an observational trial. Within this cohort of subjects we get these results. However, in the real world we want to know the effectiveness of the drug regardless of sex.

To do this, we have to cut the link between sex and drug taken. 

This is as simple as just deleting the link between the two in AgenaRisk. 

This gives the result that taking the drug has a lower chance of recovery, a clear difference to our previous result. 

**When doing this in AgenaRisk, it is advisable to copy and paste the original nodes, rather than deleting from the original model**

![simpsons%20agena2.JPG](attachment:simpsons%20agena2.JPG)

## Counterfactuals

Using this type of model, we can answer 'intervention' questions such as: given that a patient took the drug and did not recover, what is the probability that they would have recovered had they *not* taken the drug? In this case, we do not know the sex of the subject.

Here's the observational part of the model, what we *observed* to happen to the original patient.

Be mindful that within this, the Sex NPT has a 50/50 prior of being either Male or Female. The displayed probabilities of 66.67% and 33.33% are posterior values, created by entering the observation in the Recovery node. 

![counterfactual1.JPG](attachment:counterfactual1.JPG)

Here's the intervention part of the model. *What would have happened* had they not taken the drug? 

Within the Sex node, we have had to update the priors to reflect what is displayed in the original observed outcome, that being 66.67% and 33.33%. Therefore, we have set the Sex NPT to those values. 

![counterfactual2.JPG](attachment:counterfactual2.JPG)

In this case, we can see that had they not taken the drug, there was a 43.33% chance of recovery.

We can do this all within one model: 

We actually should not break the link in the left hand side, the link should only be broken in the counterfactual world.

![counterfactual3.JPG](attachment:counterfactual3.JPG)

We copy the drug taken and recovery nodes which keeps the links to sex. Then on one branch we enter the observational data (which updates our sex probabilities). On the other branch we enter the interventional data which gives us the same Recovery probability as above. 

In general, this involves copying the nodes that we have observational data for and leaving the background nodes as is. All the inputs should be the same. 

## Evaluation

This is another look at our good friends sensitivity (recall) and specificity (the true negative rate, which is not the same thing as precision). 

![evaluation.JPG](attachment:evaluation.JPG)

Within this confusion matrix, True positives are **A** and True negatives are **D**. False positives are **C** and false negatives are **B**.

Sensitivity = A/(A+B)  (Also known as True positive rate)

Specificity = D/(C+D) (Also known as True negative rate) 

Accuracy = (A+D)/(A+B+C+D)

False positive rate = 1 - Specificity
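These definitions translate directly into code. The sketch below uses the same cell labels as the confusion matrix above (A = true positives, B = false negatives, C = false positives, D = true negatives) and also computes precision, which is a different quantity from specificity. The counts passed in are hypothetical.

```python
def evaluation_metrics(a, b, c, d):
    """a = true positives, b = false negatives, c = false positives, d = true negatives."""
    return {
        "sensitivity": a / (a + b),          # true positive rate (recall)
        "specificity": d / (c + d),          # true negative rate
        "false_positive_rate": c / (c + d),  # equals 1 - specificity
        "precision": a / (a + c),            # share of predicted positives that are correct
        "accuracy": (a + d) / (a + b + c + d),
    }

# Hypothetical counts, for illustration only
metrics = evaluation_metrics(a=40, b=10, c=5, d=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The function divides without checking for empty rows or columns, so a matrix with no predicted positives (A + C = 0) would need special handling for precision.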

## Inputting Data into AgenaRisk

From a table of data like below:

![titanic%20data.JPG](attachment:titanic%20data.JPG)

We can put this data into nodes in AgenaRisk to answer questions such as: 

What is the probability that a survivor is crew? (This is different to what is the probability that a crewperson survives?)

![crew%20survived.JPG](attachment:crew%20survived.JPG)

For the crew node we have simply used the values of total crew as True and total passengers as False. AgenaRisk then normalises these values. 

For the survived node we then input the values for whether the crew members survived or died as well as the same for passengers. 

Once we have done this, we can use the causal link to answer our question. 

![crew%20survived2.JPG](attachment:crew%20survived2.JPG)

There is a 30% probability that they were crew, given they survived.
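What AgenaRisk does here is an application of Bayes' theorem, which can be sketched by hand. The counts below are an assumption: they are the commonly quoted Titanic figures (885 crew of whom 212 survived, 1316 passengers of whom 499 survived) and may differ from the table in the figure.

```python
from fractions import Fraction

# Assumed counts; the table in the figure may differ
crew_total, crew_survived = 885, 212
passengers_total, passengers_survived = 1316, 499

p_crew = Fraction(crew_total, crew_total + passengers_total)
p_survived_given_crew = Fraction(crew_survived, crew_total)
p_survived_given_passenger = Fraction(passengers_survived, passengers_total)

# Total probability of surviving, summed over crew and passengers
p_survived = (p_survived_given_crew * p_crew
              + p_survived_given_passenger * (1 - p_crew))

# Bayes' theorem: reverse the direction of the conditional
p_crew_given_survived = p_survived_given_crew * p_crew / p_survived

print(f"P(survived | crew) = {float(p_survived_given_crew):.1%}")
print(f"P(crew | survived) = {float(p_crew_given_survived):.1%}")
```

With these counts the two conditional probabilities come out at roughly 24% and 30%, which illustrates why the two questions are not interchangeable.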

## Exam Question

The following probabilities about survival chances are 'learnt' from a training dataset:

If sex == Male then survival probability = 0.2

If sex == Female and class 1 or 2 then survival probability = 0.8

If sex == Female and class 3 then survival probability = 0.6

The relevant information in a **different test dataset** is displayed below:


|                   | Male | Female class 1 or 2 | Female class 3 |
| :---              | :----: | :----: | :----: |
| Survived          | 75 | 75 | 60 |
| Did not Survive   | 225 | 15 | 50 |

What this means is that if we set a cutoff point of above 0.2, then all men will be predicted to not survive. By manipulating cutoff points we can create different confusion matrices. 

For a **cutoff of 0.1** we can generate a confusion matrix of: 

|                   | Predicted Yes | Predicted No | Total |
| :---              | :----: | :----: | :----: |
| Actual Yes        | 210 | 0 | 210 |
| Actual No         | 290 | 0 | 290 |

Here we predict that everybody survives, therefore everyone must go in the Predicted Yes column. However, we must take into account the Actual values and split the data between whether they truly survived or did not. 

We can calculate the sensitivity, specificity, false positive rate and accuracy using this matrix. 

sensitivity = 210/(210+0) = 100%

specificity = 0/(290+0) = 0%

fpr = 1 - specificity = 100%

accuracy = (210+0)/(210+0+290+0) = 210/500 = 42%

**The question**: 

For cutoff values of 0.5, 0.7 and 0.9, generate the confusion matrices and subsequent metrics. 

For a cutoff of 0.5:

All men are predicted to die and all women are predicted to survive. 

From the original table, all the men go in the Predicted No column: those that *actually survived* in the Actual Yes row and those that *actually died* in the Actual No row. 

The women are all subsequently predicted yes, but we must split them as well between those that actually survived and died. In doing this, we combine their row values i.e. 75 class 1 or 2 women survived and 60 class 3 women survived, therefore 135 are predicted Yes and actually survive. 

|                   | Predicted Yes | Predicted No | Total |
| :---              | :----: | :----: | :----: |
| Actual Yes        | 135 | 75 | 210 |
| Actual No         | 65 | 225 | 290 |

sensitivity = 135/(135+75) = 64%

specificity = 225/(65+225) = 78%

fpr = 1 - specificity = 22%

accuracy = (135+225)/(135+75+65+225) = 360/500 = 72%

For a cutoff of 0.7: 

All men are predicted to die as well as all women from class 3. 

Here we just add the class 3 women to the predicted no column.

|                   | Predicted Yes | Predicted No | Total |
| :---              | :----: | :----: | :----: |
| Actual Yes        | 75 | 135 | 210 |
| Actual No         | 15 | 275 | 290 |

sensitivity = 75/(75+135) = 36%

specificity = 275/(15+275) = 95%

fpr = 1 - specificity = 5%

accuracy = (75+275)/(135+75+15+275) = 350/500 = 70%

For a cutoff of 0.9:

Everyone is predicted to die.

|                   | Predicted Yes | Predicted No | Total |
| :---              | :----: | :----: | :----: |
| Actual Yes        | 0 | 210 | 210 |
| Actual No         | 0 | 290 | 290 |

sensitivity = 0/(0+210) = 0%

specificity = 290/(0+290) = 100%

fpr = 1 - specificity = 0%

accuracy = (0+290)/(0+210+0+290) = 290/500 = 58%
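The four confusion matrices can be generated mechanically from the test table. The sketch below assumes a group is predicted to survive when its learnt survival probability is strictly above the cutoff; the `groups` list and the function name are made up for this example.

```python
# (learnt survival probability, survived, did not survive) per group in the test data
groups = [
    (0.2, 75, 225),  # male
    (0.8, 75, 15),   # female, class 1 or 2
    (0.6, 60, 50),   # female, class 3
]

def confusion_matrix(cutoff):
    """Return (TP, FN, FP, TN), predicting 'survives' when probability > cutoff."""
    tp = sum(s for p, s, _ in groups if p > cutoff)
    fn = sum(s for p, s, _ in groups if p <= cutoff)
    fp = sum(d for p, _, d in groups if p > cutoff)
    tn = sum(d for p, _, d in groups if p <= cutoff)
    return tp, fn, fp, tn

for cutoff in (0.1, 0.5, 0.7, 0.9):
    tp, fn, fp, tn = confusion_matrix(cutoff)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    print(f"cutoff {cutoff}: TP={tp} FN={fn} FP={fp} TN={tn} "
          f"sens={sensitivity:.0%} spec={specificity:.0%} "
          f"fpr={1 - specificity:.0%} acc={accuracy:.0%}")
```

Each cutoff contributes one (false positive rate, sensitivity) pair, and those pairs are the points a ROC curve is drawn through.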

From this we can generate a ROC curve. Note that to do this in Python we would need the actual data, but we can sketch it using our metrics from the confusion matrices.

![titanic%20ROC.JPG](attachment:titanic%20ROC.JPG)

### Likelihood Ratio

We can use this to identify how a new piece of evidence supports a hypothesis (probative). A legal example can be used here:

Hp is the prosecution hypothesis, that the defendant is guilty.

Hd is the defence hypothesis, that the defendant is innocent.

Both can be assigned prior probabilities, P(Hp) and P(Hd), which are the priors in Bayes' theorem. What is important for likelihood ratios is that the two hypotheses are mutually exclusive and exhaustive. It then follows from our axioms of probability that P(Hp) = 1 - P(Hd).

Our posterior probability in this case is P(Hp | E), which is the probability that our hypothesis is true given the observed evidence. 

The likelihood of the evidence occurring P(E | Hp) relies on the evidence observed. It is the probability of seeing this evidence if the hypothesis is true. 

#### Example

A DNA sample is taken from a crime scene and a partial sequence is extracted. This partial sequence is observed in 1/100 people. 

Fred has a match against this DNA sample. 

Our likelihoods: 

P(E | Hd) = 1/100. This is because if Fred is innocent and not connected to the crime scene, there is still a 1/100 chance that he would match with the sample

P(E | Hp) = 1. This is because if Fred is guilty, then the DNA sample would be his and therefore a match should be guaranteed. 

#### Prosecutor's Fallacy

This can lead to a common fallacy: assuming that P(E | Hd) = P(Hd | E). In other words, the probability of seeing this evidence if Fred did not do it (1/100) is mistaken for the probability of Fred being innocent given that this evidence has been seen. The claim becomes that there is only a 1/100 chance that Fred is innocent, when 1/100 is really the chance of seeing a match if he is innocent. 

This is equivalent to assuming that P(Hp | E) = 1 - P(Hd | E) = 99/100. 

To get a true posterior probability, we need to know the priors. 

If this crime occurred on an island of 1001 people (Fred included), then P(Hp) = 1/1001 and equivalently P(Hd) = 1000/1001.
 
Here is Bayes theorem: 

![bayes%20theorem.JPG](attachment:bayes%20theorem.JPG)

P(Hp | E) = P(E | Hp) * P(Hp)/P(E)

We already know P(E | Hp) and P(Hp). We just need to work out P(E). 

For this, we take both routes through which E can occur: via Hp, with likelihood P(E | Hp), and via Hd, with likelihood P(E | Hd). We multiply each likelihood by its respective prior, P(Hp) or P(Hd), and then sum the results, giving a final:

P(E | Hp)\*P(Hp) + P(E | Hd) * P(Hd) = 0.011

Here's the full tree diagram: 

![Fred%20bayes%20tree-2.jpg](attachment:Fred%20bayes%20tree-2.jpg)

In [8]:
pe = ((1/1001)*1)+((1000/1001)*1/100)
pe

0.010989010989010988

Dividing P(E | Hp) \* P(Hp) = 1/1001 by P(E) = 11/1001 gives P(Hp | E) = 1/11. So there is a 1/11 chance that Fred is guilty having observed this evidence. We can also work out P(Hd | E) from this by just doing 1 - 1/11 = 10/11. 

In [11]:
print('P(Hd | E) = ' + str((1/100*1000/1001)/pe))

P(Hd | E) = 0.9090909090909092
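For reference, the whole calculation can be written out with exact fractions, which makes it easier to see where 1/11 and 10/11 come from. The variable names are illustrative; the numbers are the island priors and the match likelihoods from the example.

```python
from fractions import Fraction

prior_hp = Fraction(1, 1001)         # P(Hp): Fred is one of 1001 islanders
prior_hd = 1 - prior_hp              # P(Hd) = 1000/1001
likelihood_hp = Fraction(1)          # P(E | Hp): a guilty Fred certainly matches
likelihood_hd = Fraction(1, 100)     # P(E | Hd): random match probability

# Total probability of the evidence, summed over both hypotheses
p_e = likelihood_hp * prior_hp + likelihood_hd * prior_hd

posterior_hp = likelihood_hp * prior_hp / p_e  # P(Hp | E)
posterior_hd = likelihood_hd * prior_hd / p_e  # P(Hd | E)

print(p_e, posterior_hp, posterior_hd)
```

This prints 1/91, 1/11 and 10/11, so P(E) is about 0.011 and the two posteriors sum to 1 as they should.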


![informal%20fred.JPG](attachment:informal%20fred.JPG)

However, the use of Bayes theorem in courts is ill-advised due to the difficulty of getting the general public to understand it and the use of priors properly. 

Therefore, the likelihood ratio (LR) is introduced, which measures the probative value of a piece of evidence independently of the priors. 

If P(Hp | E) > P(Hp), we can say that the evidence has increased the probability that Hp is true. Therefore, it is probative in favour of Hp. On the flipside, if P(Hp | E) < P(Hp) then it is probative against Hp. Evidence that doesn't change the probability P(Hp | E) = P(Hp) has no probative value. 

So long as Hp is the negation of Hd (mutually exclusive and exhaustive), the likelihood ratio is: 

![likelihood%20ratio.JPG](attachment:likelihood%20ratio.JPG)

If the LR > 1, then the evidence is probative **in support of Hp** as it results in an increased posterior probability of Hp | E. The higher the LR the closer the posterior probability gets to 1

If the LR < 1, then the evidence is against Hp or **supportive of Hd**. The closer it gets to 0 the closer the posterior probability of Hp | E gets to 0

If the LR = 1, the evidence has no probative value. 

For our example, the LR is: $\frac{1}{1/100}$ = 100

If the chance of a random match was 1 in 10,000,000, the LR would be $\frac{1}{1/10000000}$ = 10,000,000. Evidence with an LR this high is said to be highly probative.

We can derive a couple of things with the LR in relation to pure Bayes theorem: 

![odds%20LR.JPG](attachment:odds%20LR.JPG)

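With our example, this results in posterior odds of 0.1, i.e. 1 to 10. Odds below 1 mean that P(Hp | E) is smaller than P(Hd | E); odds above 1 would mean that P(Hp | E) is the larger of the two. Converting the odds back to a probability gives P(Hp | E) = 0.1/(1 + 0.1) = 1/11, which matches the result from Bayes' theorem.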

In [19]:
print('odds LR = '+ str(((1/1001)/(1000/1001))*100))

odds LR = 0.1


Also, when the prior is small, we can get very close to the posterior odds by multiplying the LR by the prior probability instead of the prior odds:

![LR%20post.JPG](attachment:LR%20post.JPG)

In [20]:
print('posterior odds = ' + str((1/1001)*100))

posterior odds = 0.0999000999000999
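The step from posterior odds back to a posterior probability can be sketched as a small function. It assumes the odds form of Bayes' theorem, posterior odds = prior odds x LR, and converts odds to probability with odds/(1 + odds); the function name is made up for this example.

```python
from fractions import Fraction

def posterior_from_lr(prior_probability, likelihood_ratio):
    """Return (posterior odds, posterior probability) for hypothesis Hp."""
    prior_odds = prior_probability / (1 - prior_probability)
    posterior_odds = prior_odds * likelihood_ratio
    posterior_probability = posterior_odds / (1 + posterior_odds)
    return posterior_odds, posterior_probability

# Fred: prior of 1/1001 and an LR of 100
odds, probability = posterior_from_lr(Fraction(1, 1001), 100)
print(odds, probability)
```

This prints 1/10 and 1/11: odds of 1 to 10 for Hp are the same statement as a probability of 1/11.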


The two hypotheses must be mutually exclusive and exhaustive for the LR to be a valid metric. For example, the two hypotheses must be: 

The DNA is from the defendant and - The DNA is not from the defendant. 

Usually, forensic scientists use the hypothesis that the DNA is from a person *unrelated* to the defendant, which is easier for them to calculate but means the two hypotheses are no longer exhaustive, rendering the LR invalid. 

Priors must still be taken into account, however. An LR of 10,000,000 on an island of 1,000 people results in posterior odds of roughly 10,000 to 1 in favour of Hp, but on an island of 60,000,000 the posterior odds are only about 1 to 6, which is a probability of about 1/7 for Hp and 6/7 for Hd.

In [26]:
LR = 10000000

print((1/1000)*LR)
print((1/60000000)*LR)

10000.0
0.16666666666666666


In [22]:
1/60000000

1.6666666666666667e-08

#### Another Example using FPR and FNR

Suppose we have a covid test that has a 2% False Positive Rate, i.e. if you **don't** have the virus, there is a 2% chance of testing positive. The test also has a 1% False Negative Rate, i.e. if you **have** the virus there is a 1% chance of you testing negative. 

Sara tests positive, what is the probability she has the disease? 

Here, P(E | H) = 0.99 (1 - FNR)

Also, P(E | not H) = 0.02 (FPR)

In this case, the likelihood ratio is:

![likelihood%20ratio2.JPG](attachment:likelihood%20ratio2.JPG)

What this tells us is that we are nearly 50 times more likely to see a positive test result in someone that has the disease compared to someone who doesn't. 

Again, to get the posterior **odds** in this case, we need to use the prior odds. 

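If the disease is only present in 1 in 200 people, P(H) = 1/200. For a prior this small, the prior odds (1/199) are almost the same as the prior probability, so we just multiply 1/200 by our LR (rounded to 50) to get our approximate posterior odds. 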

$\frac{1}{200}\times \frac{50}{1} = \frac{1}{4}$

These are posterior odds of 1 to 4 in favour of having the virus, i.e. 4 to 1 in favour of not having it (against the hypothesis). As a probability, this is 1/(1+4) = 20% that she actually has the virus.

However, if we change the prior odds to something like 1 to 2 (a prior probability of 1/3, which is too large to be used in place of the odds), then:

$\frac{1}{2}\times \frac{50}{1} = \frac{25}{1}$

So the posterior odds are now 25 to 1 in favour of her having the virus.

This means a probability of 25/(25+1) = 25/26 (96%) of her having the virus.
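The covid example can be run end to end without the rounding shortcuts. The sketch below takes the prevalence as a prior probability, converts it to exact prior odds, and uses the unrounded LR of 0.99/0.02 = 49.5; the function name is hypothetical.

```python
def posterior_from_test(prevalence, false_positive_rate, false_negative_rate):
    """Posterior for 'has the virus' after a positive test result."""
    sensitivity = 1 - false_negative_rate             # P(E | H)
    likelihood_ratio = sensitivity / false_positive_rate
    prior_odds = prevalence / (1 - prevalence)
    posterior_odds = prior_odds * likelihood_ratio
    posterior_probability = posterior_odds / (1 + posterior_odds)
    return likelihood_ratio, posterior_odds, posterior_probability

for prevalence in (1 / 200, 1 / 3):
    lr, odds, probability = posterior_from_test(prevalence, 0.02, 0.01)
    print(f"prevalence {prevalence:.4f}: LR={lr:.1f}, "
          f"posterior odds={odds:.3f}, posterior probability={probability:.1%}")
```

For a prevalence of 1/200 the posterior probability is about 20%, and for a prior probability of 1/3 (prior odds of 1 to 2) it is about 96%, in line with odds of roughly 1 to 4 and 25 to 1.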