# Conditional Probability - Lab

## Introduction

To be ready for real-world applications of probability, it is important to understand what happens when events are not independent. Very often, the probability of one event depends on whether another event has happened. Let's see how this works in this lab.

## Objectives

You will be able to:

- Understand and explain conditional probability and the relation $P(A \cap B) = P(A \mid B) P(B)$
- Use the multiplication rule to find the probability of the intersection of two events
- Apply the techniques learned in the lesson to simple problems

## Exercise 1
A coin is tossed and a single 6-sided die is rolled. Find the probability of the coin landing on heads and the die showing a 3.

In [3]:
sample_space_coin = 2
event_space_coin = 1
prob_coin = event_space_coin / sample_space_coin

sample_space_dice = 6
event_space_dice = 1
prob_dice = event_space_dice / sample_space_dice

prob_both = prob_coin * prob_dice

prob_both * 100

8.333333333333332

## Exercise 2


After conducting a survey, one of the outcomes was that 8 out of 10 of the survey subjects liked chocolate chip cookies. If three survey subjects are chosen at random **with replacement**, what is the probability that all three like chocolate chip cookies?

In [4]:
trial_1 = 8/10
trial_2 = 8/10
trial_3 = 8/10

prob_3_trials = trial_1 * trial_2 * trial_3
prob_3_trials

0.5120000000000001

* In the context of permutations, replacement means the number of possible ordered outcomes is the number of items raised to the power of the number of trials ($n^k$). <br>
* In the context of probability, replacement means the trials are independent: you find the probability of each trial separately and then multiply them together to get the probability that all of them happen. <br>
* That makes sense here: each draw has an 80% chance of picking a chocolate chip cookie lover, and because whoever is chosen goes back into the pool regardless of their cookie preference, that chance is the same on every draw. The probability that it keeps happening still shrinks with each additional draw, from 0.8 to 0.64 to 0.512.
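The with-replacement reasoning above can be checked exactly with Python's `fractions` module. This is a minimal sketch that assumes a hypothetical pool of 10 subjects, 8 of whom like the cookies (the pool size is an illustration, not survey data); it also computes the without-replacement case for contrast.

```python
from fractions import Fraction

# Hypothetical pool: 10 subjects, 8 of whom like chocolate chip cookies.
likers, total = 8, 10
draws = 3

# With replacement: every draw has the same probability, so the trials are independent.
with_replacement = Fraction(likers, total) ** draws

# Without replacement: each successful draw removes one liker from the pool.
without_replacement = Fraction(1)
for k in range(draws):
    without_replacement *= Fraction(likers - k, total - k)

print(with_replacement, float(with_replacement))   # 64/125 0.512
print(without_replacement)                         # 7/15
```

Working in exact fractions also avoids the floating-point tail seen in `0.5120000000000001`.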

## Exercise 3
70% of your friends like chocolate flavored ice cream, and 35% like both chocolate and strawberry flavors.

What percent of those who like chocolate also like strawberry?

In [28]:
# Your solution
# A: likes strawberry, B: likes chocolate
p_A_and_B = 0.35
p_B = 0.7

# P(A|B) = P(A cap B) / P(B)
Prob_A_given_B = p_A_and_B / p_B

Prob_A_given_B

0.5

50% of your friends who like chocolate also like strawberry.

## Exercise 4
What is the probability of drawing 2 consecutive aces from a standard 52-card deck, without putting the first card back?

In [23]:
# Your solution
ace_1 = 4/52
ace_2 = 3/51
a_cap_b = ace_1 * ace_2

a_cap_b * 100

0.4524886877828055

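> At first I thought this was going to need a conditional probability formula to solve, and in a quiet way it does: `3/51` is the probability of a second ace *given* that the first card was an ace, so the product is the multiplication rule $P(A \cap B) = P(A) P(B \mid A)$. Exercises 1 and 2 only needed plain multiplication because their events are independent.    **-P**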
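One way to sanity-check the `4/52 * 3/51` product is to enumerate every ordered two-card draw and count the ones that are both aces. The deck representation below (rank and suit tuples) is an assumption made for illustration.

```python
from fractions import Fraction
from itertools import permutations

# Build a 52-card deck as (rank, suit) pairs; "A" marks an ace.
ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]

# Every ordered pair of distinct cards is an equally likely two-card draw.
draws = list(permutations(deck, 2))
both_aces = [d for d in draws if d[0][0] == "A" and d[1][0] == "A"]

p_both_aces = Fraction(len(both_aces), len(draws))
print(len(both_aces), len(draws))   # 12 2652
print(p_both_aces)                  # 1/221
```

Brute-force counting like this only works for small sample spaces, but it is a useful check on the formula.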

## Exercise 5
In a manufacturing factory that produces a certain product, there are 100 units of the product, 5 of which are defective. We pick three units from the 100 units at random. 

What is the probability that none of them are defective?
Hint: Use the chain rule here!

In [30]:
# Your solution
prob_A = 95/100
prob_B_given_A = 94/99
prob_C_given_AB = 93/98

# Chain rule: P(A and B and C) = P(A) * P(B|A) * P(C|A and B)
prob_A_B_C = prob_A * prob_B_given_A * prob_C_given_AB
prob_A_B_C

0.8559987631416203

This was simpler than I expected. I knew I needed $P(B \mid A)$, which I thought would be unknown, so I was trying to work out how to get it before moving on to the next step. <br><br>
The thing to note here is the word **given**: we assume that $A$ has happened (no defect on the first draw) instead of working out what the probability would have been _had we pulled a defect on the first draw_. With one good unit removed, 94 good units remain out of 99, so $P(B \mid A) = 94/99$ can be read off directly.<br><br>
After that it is pretty similar to the previous problem: you whittle down the remaining options, _again assuming that no defect has been pulled so far_. <br>
This is also why you use $95/100$ instead of $5/100$, which is what I started with. I was finding the probability of pulling a defect, but the question asks for the probability of not pulling one.
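The chain-rule product can be cross-checked against a pure counting argument: the number of ways to choose 3 good units divided by the number of ways to choose any 3 units. A minimal sketch, with variable names chosen here for illustration:

```python
from fractions import Fraction
from math import comb

good, total, picks = 95, 100, 3

# Chain rule: multiply P(good on draw k | all earlier draws were good).
p_chain = Fraction(1)
for k in range(picks):
    p_chain *= Fraction(good - k, total - k)

# Counting check: ways to choose 3 good units over ways to choose any 3 units.
p_count = Fraction(comb(good, picks), comb(total, picks))

print(p_chain == p_count)   # True
print(float(p_chain))
```

Both routes describe the same experiment, so they have to agree; the chain rule just builds the answer one draw at a time.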

## Exercise 6

Let's consider the example where 2 dice are thrown. Given that **at least one** of the dice has come up on a number higher than 4, what is the probability that the sum is 8?

Let $i,j$ be the numbers shown on the dice. The events $A$ and $B$ are described below:

* **Event $A$ is when either $i$ or $j$ is 5 or 6** (keep an eye on either - or)
* **Event $B$ is when $i + j = 8$**


* What is the size of sample space $\Omega$ ?
* What is $P(A \cap B)$ ?
* What is $P(A)$ ?
* Use above to calculate $P(B \mid A)$

In [31]:
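# Your solution
omega = 36  # number of equally likely outcomes for two dice

A = 20/36        # at least one die shows 5 or 6
B = 5/36         # sum is 8: (2,6), (3,5), (4,4), (5,3), (6,2)
A_cap_B = 4/36   # sum is 8 and a 5 or 6 shows: (2,6), (3,5), (5,3), (6,2)

B_given_A = A_cap_B / A
B_given_A

0.19999999999999998

Here's FI's solution, which walks through the counting: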

$A := \{(i,j) \in \Omega \mid \text{either } i \text{ or } j \text{ is 5 or 6}\}$

$B := \{(i,j) \in \Omega \mid i+j = 8\}$

In total, there are 36 possible outcomes, so $|\Omega| = 36$.

$P(A) = 20/36$ - Event $A$ has 20 possible outcomes out of 36
<br>This is because there are 6 outcomes whose first die is 5 ([5,1], [5,2], [5,3], etc.) and 6 more whose first die is 6, which makes 12. For each first-die value from 1 to 4, only the two outcomes whose second die is 5 or 6 qualify, which adds 4 * 2 = 8. In total, 12 + 8 = 20.

$P(A \cap B) = 4/36$ - Of the 5 outcomes that sum to 8, 4 also contain a 5 or a 6 (2 and 6, 3 and 5, 6 and 2, 5 and 3). The fifth, 4 and 4, is in $B$ but not in $A$.

We want to know what the probability of $B$ is given $A$, so $P(B \mid A)= \dfrac{P(B\cap A)}{P(A)}$. This leads to:

In [39]:
p_B_given_A = (4/36)/(20/36)
print("P(A)=", 20/36)
print("P(A cap B)=", 4/36)
print("P(B|A)=", p_B_given_A)
print("P(B|A) is about 0.2 while P(B) = 5/36 is about 0.139, "
      "so A and B are dependent")

P(A)= 0.5555555555555556
P(A cap B)= 0.1111111111111111
P(B|A)= 0.19999999999999998
P(B|A) is about 0.2 while P(B) = 5/36 is about 0.139, so A and B are dependent
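The counts for $A$, $B$, and $A \cap B$ can also be obtained by brute force. The sketch below enumerates all 36 outcomes with `itertools.product` and uses set intersection, so nothing has to be counted by hand.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (i, j) for two dice.
omega = list(product(range(1, 7), repeat=2))

A = {(i, j) for i, j in omega if i >= 5 or j >= 5}   # at least one die shows 5 or 6
B = {(i, j) for i, j in omega if i + j == 8}          # the sum is 8

p_A = Fraction(len(A), len(omega))
p_A_and_B = Fraction(len(A & B), len(omega))
p_B_given_A = p_A_and_B / p_A

print(len(A), len(B), len(A & B))   # 20 5 4
print(p_B_given_A)                  # 1/5
```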


## Exercise 7

Let's consider a credit card example. At a supermarket, customers are selected at random, and the store owner records whether each customer owns a Visa card (event A) or an Amex credit card (event B). Some customers own both cards.
You can assume that:

- $P(A)$ = 0.5
- $P(B)$ = 0.4
- $P(A \cap B)$ = 0.25 (the customer owns both cards)


With the knowledge we have about conditional probabilities, compute and interpret the following probabilities:

- $P(B \mid A)$
- $P(B' \mid A)$
- $P(A \mid B)$
- $P(A' \mid B)$


In [45]:
# Your solution
a = .5
b = .4
a_and_b = .25


b_given_a = a_and_b / a

b_cmplmnt_given_a = 1 - b_given_a

a_given_b = a_and_b / b

a_cmplmnt_given_b = 1 - a_given_b

In [46]:
print(b_given_a)
print(b_cmplmnt_given_a)
print(a_given_b)
print(a_cmplmnt_given_b)

0.5
0.5
0.625
0.375


In the complement problems, the answer is simply one minus the previous answer. Recall from the lesson on sets that the complement of an event is everything in the sample space that is not in the event. Combining this with **_given_** clauses, we find that:
* once we condition on $B$, every outcome in $B$ is either in $A$ or in $A'$, so the probabilities of _A given B_ and _A' given B_ must add up to 1.
* $P(A'|B) = 1 - P(A|B)$
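The complement rule can be confirmed without assuming it, by computing $P(A' \cap B)$ directly as the part of $B$ that lies outside $A$. A small sketch using exact fractions for the Visa/Amex values; the variable names are chosen here for illustration.

```python
from fractions import Fraction

# Values from the exercise, written as exact fractions.
p_a = Fraction(1, 2)         # P(A): owns a Visa card
p_b = Fraction(2, 5)         # P(B): owns an Amex card
p_a_and_b = Fraction(1, 4)   # P(A and B): owns both

# P(A' and B) is the part of B that lies outside A.
p_not_a_and_b = p_b - p_a_and_b

p_a_given_b = p_a_and_b / p_b
p_not_a_given_b = p_not_a_and_b / p_b

print(p_a_given_b, p_not_a_given_b)        # 5/8 3/8
print(p_a_given_b + p_not_a_given_b == 1)  # True
```

In words: of the customers who own an Amex card, 5/8 also own a Visa card and 3/8 do not.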

## Summary 

In this lab you practiced conditional probability and the multiplication rule on some simple problems. The key takeaway is being able to identify events as dependent or independent and to calculate the probability of their occurrence using the appropriate method. Next you'll learn more about conditional probability, building on the knowledge we have so far.

## Worked problem from Khan Academy

A hospital is testing patients for a certain disease. If a patient has the disease, the test is designed to return a "positive" result. If a patient does not have the disease, the test should return a "negative" result. No test is perfect though.
* 99% of patients who have the disease will test positive.
* 5% of patients who don't have the disease will also test positive.
* 10% of the population in question has the disease.<br>
**If a random patient tests positive, what is the probability that they have the disease?**

In [55]:
have_disease = .10
dont_have_disease = .90
false_positive = .05
test_accuracy = .99

In [66]:
# Joint probability of each way a patient can test positive
true_pos = have_disease * test_accuracy          # has the disease and tests positive
false_pos = dont_have_disease * false_positive   # no disease but tests positive
total_pos = true_pos + false_pos                 # P(positive)

In [67]:
print(round(true_pos, 4))
print(round(false_pos, 4))
print(round(total_pos, 4))

0.099
0.045
0.144


In [59]:
# P(disease | positive) = P(disease and positive) / P(positive)
random_has_disease = true_pos / total_pos
round(random_has_disease, 4)

0.6875
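Another way to look at this problem is with natural frequencies: imagine a concrete group of patients and count who tests positive. The sketch below assumes a hypothetical population of 10,000 patients, chosen only so that every count is a whole number.

```python
# Hypothetical population size, chosen only to make the counts whole numbers.
population = 10_000

sick = population * 10 // 100         # 10% have the disease -> 1000
healthy = population - sick           # 9000

true_positives = sick * 99 // 100     # 99% of sick patients test positive -> 990
false_positives = healthy * 5 // 100  # 5% of healthy patients test positive -> 450

all_positives = true_positives + false_positives
p_sick_given_positive = true_positives / all_positives

print(true_positives, false_positives, all_positives)  # 990 450 1440
print(p_sick_given_positive)                           # 0.6875
```

Of the 1440 patients who test positive, 990 actually have the disease, so a positive result means roughly a 69% chance of being sick even though the test catches 99% of true cases.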