A diagnostic test has a 98% probability of giving a positive result when applied to a person suffering from Thripshaw's Disease, and a 10% probability of giving a (false) positive when applied to a non-sufferer. It is estimated that 0.5% of the population are sufferers. Suppose that the test is now administered to a person whose disease status is unknown. Calculate the probability that the test will:<br>
1. Be positive<br>
2. Correctly diagnose a sufferer of Thripshaw's<br>
3. Correctly identify a non-sufferer of Thripshaw's<br>
4. Misclassify the person

In [1]:
#1. Calculate the probability of a positive result
# The rates given in the problem are conditional probabilities, not joint ones
P_pos_given_infected = 0.98  # P(positive | sufferer)
P_pos_given_not = 0.10       # P(positive | non-sufferer), the false positive rate
P_infected = 0.005           # P(sufferer)
# Law of total probability:
# P(pos) = P(pos | sufferer) * P(sufferer) + P(pos | non-sufferer) * P(non-sufferer)
P_pos = P_pos_given_infected * P_infected + P_pos_given_not * (1 - P_infected)
print('#1. Probability of positive is ' + str(P_pos))

#2. Calculate probability of correctly diagnosing a sufferer of disease
# read here as the conditional probability P(positive | sufferer), which is given
print('#2. Probability of correctly diagnosing sufferer is ' + str(P_pos_given_infected))

#3. Calculate probability of correctly identifying non-sufferer
# the false positive rate is 0.10, so the true negative rate is 1 - 0.10
P_neg_given_not = 1 - P_pos_given_not
print('#3. Probability of correctly identifying non-sufferer ' + str(P_neg_given_not))

#4. Calculate the probability of misclassifying the person
# complement of a correct result: true positive or true negative, each weighted by prevalence
P_mis = 1 - (P_pos_given_infected * P_infected + P_neg_given_not * (1 - P_infected))
print('#4. Probability of misclassifying the person is ' + str(P_mis))

#1. Probability of positive is 0.1044
#2. Probability of correctly diagnosing sufferer is 0.98
#3. Probability of correctly identifying non-sufferer 0.9
#4. Probability of misclassifying the person is 0.09960000000000002
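
Because the person's disease status is unknown when the test is given, parts 2 and 3 can also be read as joint probabilities (test result and true status together). The sketch below works through that reading using only the numbers in the problem statement; the variable names are illustrative and are not taken from the cell above.

```python
# Joint-probability reading of the Thripshaw's Disease problem (illustrative names)
sensitivity = 0.98          # P(positive | sufferer), given
false_positive_rate = 0.10  # P(positive | non-sufferer), given
prevalence = 0.005          # P(sufferer), given

# The four mutually exclusive outcomes for a person of unknown status
true_pos = sensitivity * prevalence
false_neg = (1 - sensitivity) * prevalence
false_pos = false_positive_rate * (1 - prevalence)
true_neg = (1 - false_positive_rate) * (1 - prevalence)

print('P(positive and sufferer)     =', round(true_pos, 4))
print('P(negative and non-sufferer) =', round(true_neg, 4))
print('P(misclassified)             =', round(false_pos + false_neg, 4))
```

Under this reading the misclassification probability is the sum of the false positive and false negative outcomes, which agrees with the value of about 0.0996 found for part 4.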


# Extra Practice with Bayes Theorem
#### An aircraft emergency locator transmitter (ELT) is a device designed to transmit a signal in the case of a crash. The Altigauge Manufacturing Company makes 80% of the ELTs, the Bryant Company makes 15% of them, and the Chartair Company makes the other 5%. The ELTs made by Altigauge have a 4% rate of defects, the Bryant ELTs have a 6% rate of defects, and the Chartair ELTs have a 9% rate of defects (which helps to explain why Chartair has the lowest market share).

##### a. If an ELT is randomly selected from the general population of all ELTs, find the probability that it was made by the Altigauge Manufacturing Company.


##### b. If a randomly selected ELT is then tested and is found to be defective, find the probability that it was made by the Altigauge Manufacturing Company.

In [2]:
# find the probability that an ELT was made by Altigauge Manufacturing
P_alt = .8
P_bry = .15
P_cha = .05
P_altdef = .04
P_brydef = .06
P_chadef = .09

print('a. The probability an ELT was made by Altigauge is ' + str(P_alt))
# for the probability that a defective ELT was made by Altigauge
# we need to apply BAYES' THEOREM:
# P(Altigauge|defective) = P(defective|Altigauge) * P(Altigauge) / P(defective)
# where P(defective) = P(Altigauge) * P(defective|Altigauge) + P(Bryant) * P(defective|Bryant) + P(Chartair) * P(defective|Chartair)
P_alt_def = (P_altdef*P_alt)/(P_alt*P_altdef + P_bry*P_brydef + P_cha*P_chadef)
print('b. The probability of a defective ELT being made by Altigauge is ' + str(P_alt_def))

a. The probability an ELT was made by Altigauge is 0.8
b. The probability of a defective ELT being made by Altigauge is 0.7032967032967034
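
The same calculation extends to all three manufacturers at once. The sketch below uses a hypothetical helper, `posterior`, that is not part of the original notebook; it applies Bayes' theorem to every manufacturer given a defective ELT, and the results should sum to 1.

```python
# Posterior over all three manufacturers given a defective ELT.
# `posterior` is an illustrative helper, not part of the original notebook.
def posterior(priors, likelihoods):
    """Return P(hypothesis | evidence) for each hypothesis via Bayes' theorem."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())  # P(defective), by the law of total probability
    return {h: joint[h] / evidence for h in joint}

market_share = {'Altigauge': 0.80, 'Bryant': 0.15, 'Chartair': 0.05}  # P(maker)
defect_rate = {'Altigauge': 0.04, 'Bryant': 0.06, 'Chartair': 0.09}   # P(defective | maker)

post = posterior(market_share, defect_rate)
for maker, p in post.items():
    print(maker, round(p, 4))
```

Altigauge remains the most likely source of a defective unit because its large market share outweighs its low defect rate.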


#### You go to see the doctor about an ingrowing toenail. The doctor selects you at random to have a blood test for swine flu, which for the purposes of this exercise we will say is currently suspected to affect 1 in 10,000 people in Australia. The test is 99% accurate, in the sense that the probability of a false positive is 1%. The probability of a false negative is zero. You test positive. What is the new probability that you have swine flu?

#### Now imagine that you went to a friend’s wedding in Mexico recently, and (for the purposes of this exercise) it is known that 1 in 200 people who visited Mexico recently come back with swine flu. Given the same test result as above, what should your revised estimate be for the probability you have the disease?

In [2]:
# P(A) = prior probability of swine flu
PA = 1/10000
# B = the event of a positive test
# we wish to know P(A|B) - the probability we have swine flu given a positive test
# according to BAYES' THEOREM - P(A|B) = P(B|A)*P(A)/P(B)
#                                      = P(B|A)*P(A)/[P(A)*P(B|A)+P(~A)*P(B|~A)]
# P(B) is not given directly; it is the bracketed denominator above
# P(B|A) = probability of positive test given we have swine flu - this is 1 because there are no false negatives
PBA = 1
# P(~A) = the probability of not having swine flu -> 1-P(A)
P_A = 1-PA
# P(B|~A) = probability of positive given no swine flu (false positive) - this is .01 (given)
PB_A = .01

PAB = PBA*PA/(PA*PBA+P_A*PB_A)
print(PAB)

# now the prior probability of swine flu is 1/200 (recent visit to Mexico)
PA1 = 1/200
P_A1 = 1 - PA1
PAB1 = PBA*PA1/(PBA*PA1 + PB_A*P_A1)
print(PAB1)

0.009901970492127933
0.33444816053511706
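
As a cross-check, the same two posteriors can be computed with exact rational arithmetic. This is a minimal sketch using Python's `fractions` module; the function name `posterior_flu` is an illustrative assumption, not something defined earlier in the notebook.

```python
from fractions import Fraction

# Illustrative helper: P(flu | positive test) by Bayes' theorem, in exact arithmetic
def posterior_flu(prior, false_positive_rate, sensitivity=Fraction(1)):
    true_pos = sensitivity * prior                 # P(positive and flu)
    false_pos = false_positive_rate * (1 - prior)  # P(positive and no flu)
    return true_pos / (true_pos + false_pos)

fp_rate = Fraction(1, 100)  # false positive rate, given; false negative rate is zero
general = posterior_flu(Fraction(1, 10000), fp_rate)  # prior for Australia
mexico = posterior_flu(Fraction(1, 200), fp_rate)     # prior after visiting Mexico

print(general, float(general))
print(mexico, float(mexico))
```

The exact fractions make the intuition visible: in a group of one million people with the general prior, 100 true positives sit among 10,099 positive results in total, so a positive test still leaves the probability at about 1%.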


#### Imagine that, while in Mexico, you also took a side trip to Las Vegas, to pay homage to the TV show CSI. Late one night in a bar you meet a guy who claims to know that in the casino at the Tropicana there are two sorts of slot machines: one that pays out 10% of the time, and one that pays out 20% of the time [note these numbers may not be very realistic]. The two types of machines are coloured red and blue. The only problem is, the guy is so drunk he can’t quite remember which colour corresponds to which kind of machine. Unfortunately, that night the guy becomes the vic in the next CSI episode, so you are unable to ask him again when he’s sober.

#### Next day you go to the Tropicana to find out more. You find a red and a blue machine side by side. You toss a coin to decide which machine to try first; based on this you then put the coin into the red machine. It doesn’t pay out. How should you update your estimate of the probability that this is the machine you’re interested in? What if it had paid out - what would be your new estimate then?

In [3]:
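# Hypothesis R: the red machine is the one that pays out 20% of the time
# Hypothesis B: the blue machine is, so the red machine pays out only 10% of the time
# The guy could not remember the colours, so both hypotheses start at 50%
PR = .5
PB = .5
# Probability that the red machine pays out under each hypothesis
PJR = .2
PJB = .1

# We want the probability that this red machine is the one with the better odds
# PRNJ: probability that red is the 20% machine given no payout
PRNJ = PR*(1-PJR)/(PR*(1-PJR) + PB*(1-PJB))
print(PRNJ)

# PRJ: probability that red is the 20% machine given a payout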
PRJ = PR*PJR/(PR*PJR+PB*PJB)
print(PRJ)

0.47058823529411764
0.6666666666666666
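
The same update can be written in odds form: posterior odds equal prior odds times the likelihood ratio of the observation. The sketch below uses exact fractions; the helper names `odds_to_prob` and `update` are illustrative and are not part of the original cell.

```python
from fractions import Fraction

# Odds-form Bayes update for "red is the 20% machine" (illustrative helpers)
pay_if_good = Fraction(1, 5)   # payout probability if red is the 20% machine
pay_if_bad = Fraction(1, 10)   # payout probability if red is the 10% machine
prior_odds = Fraction(1, 1)    # coin-flip prior: 1 to 1

def odds_to_prob(odds):
    """Convert odds in favour of a hypothesis into a probability."""
    return odds / (1 + odds)

def update(odds, paid_out):
    """Multiply the odds by the likelihood ratio of the observed result."""
    if paid_out:
        ratio = pay_if_good / pay_if_bad
    else:
        ratio = (1 - pay_if_good) / (1 - pay_if_bad)
    return odds * ratio

p_no_payout = odds_to_prob(update(prior_odds, paid_out=False))
p_payout = odds_to_prob(update(prior_odds, paid_out=True))
print(p_no_payout, float(p_no_payout))
print(p_payout, float(p_payout))
```

A miss only shifts the odds by a factor of 8/9, while a payout doubles them, which is why a single payout is much more informative than a single miss.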


* 1% of people have a certain genetic defect.
* 90% of tests for the gene detect the defect (true positives).
* 9.6% of the tests are false positives.
* If a person gets a positive test result, what are the odds they actually have the genetic defect?

In [8]:
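PC = .01    # P(defect): 1% of people have the genetic defect
PTP = .9    # P(positive | defect): true positive rate
PFP = .096  # P(positive | no defect): false positive rate
# what is the probability of having the genetic defect given a positive test result
# BAYES' THEOREM: P(defect|positive) = P(positive|defect)*P(defect)/P(positive)
PCTP = PTP*PC/(PC*PTP + (1-PC)*(PFP))
print(PCTP)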

0.0865051903114187
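
One way to make this result more intuitive is to count people in a hypothetical cohort. The sketch below assumes a cohort of 100,000 people (a size chosen only so that every count is a whole number) and also expresses the answer as odds, since the question asks for them.

```python
# Natural-frequency version of the genetic defect problem.
# The cohort size is an assumption chosen so that all counts are whole numbers.
cohort = 100_000
with_defect = cohort * 1 // 100                     # 1% have the defect
true_pos = with_defect * 90 // 100                  # 90% of those test positive
false_pos = (cohort - with_defect) * 96 // 1000     # 9.6% of the rest test positive

prob = true_pos / (true_pos + false_pos)  # P(defect | positive)
odds = true_pos / false_pos               # odds of defect vs no defect, given positive

print(true_pos, 'true positives and', false_pos, 'false positives')
print('probability:', prob)
print('odds in favour:', odds)
```

Among everyone who tests positive, people without the defect outnumber people with it by more than ten to one, so a positive result corresponds to a probability of under 9%.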


#### Given the following statistics, what is the probability that a woman has cancer if she has a positive mammogram result?

* One percent of women over 50 have breast cancer.
* Ninety percent of women who have breast cancer test positive on mammograms.
* Eight percent of women without breast cancer will have false positives.

In [10]:
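PC = .01   # P(cancer): prior probability of breast cancer
PTP = .90  # P(positive | cancer): true positive rate
PFP = .08  # P(positive | no cancer): false positive rate

# BAYES' THEOREM: P(cancer|positive) = P(positive|cancer)*P(cancer)/P(positive)
PCTP = PC*PTP/(PTP*PC + (1-PC)*PFP)
print(PCTP)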

0.10204081632653063
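
To see how strongly the answer depends on the false positive rate, the calculation can be wrapped in a small function and evaluated for several rates. In this sketch the function name `posterior` is illustrative, and the 1% and 5% rates are hypothetical values added for comparison; only 8% and 9.6% come from the exercises in this notebook.

```python
# Sensitivity of P(condition | positive) to the false positive rate.
# `posterior` is an illustrative helper; 0.01 and 0.05 are hypothetical rates.
def posterior(prior, true_positive_rate, false_positive_rate):
    """Bayes' theorem for a binary condition and a positive test result."""
    true_pos = true_positive_rate * prior
    false_pos = false_positive_rate * (1 - prior)
    return true_pos / (true_pos + false_pos)

prior = 0.01               # 1% prevalence, as in the exercises
true_positive_rate = 0.90  # 90% of affected people test positive

results = {}
for fpr in [0.01, 0.05, 0.08, 0.096]:
    results[fpr] = posterior(prior, true_positive_rate, fpr)
    print(f'false positive rate {fpr:.3f} -> posterior {results[fpr]:.4f}')
```

With a prevalence of only 1%, the posterior stays low unless the false positive rate is small compared with the prevalence itself.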
