In the previous course, we covered the fundamentals of probability and learned about:

- Theoretical and empirical probabilities

- Probability rules (the addition rule and the multiplication rule)

- Counting techniques (the rule of product, permutations, and combinations)


In this course, we'll build on what we've learned and develop new techniques that will enable us to better estimate probabilities. Our focus for the entire course will be on learning how to calculate probabilities based on certain conditions — hence the name conditional probability.

By the end of this course, we'll be able to:

- Assign probabilities to events based on certain conditions by using conditional probability rules.

- Assign probabilities to events based on whether or not they are statistically independent of other events.

- Assign probabilities to events based on prior knowledge by using Bayes' theorem.

- Create a spam filter for SMS messages using the multinomial Naive Bayes algorithm.


---------------------------------------------------------------------------------------------------------------------

Now suppose the die is rolled and we're told some new information: the die showed an odd number (1, 3, or 5) after landing. Is the probability of getting a 5 still P(5)=1/6? 

Or should we instead update the probability based on the information we have?

 
Ω = {1,2,3,4,5,6}  ==>   Ω = {1,3,5}

Therefore, knowing the die showed an odd number, the probability of getting a 5 is no longer 1/6 but 1/3.


For notational simplicity, P(5 given the die showed an odd number) becomes **P(5|odd)**. The vertical bar character ( | ) should be read as "given." We can read P(5|odd) as "the probability of getting a 5 given that the die showed an odd number."
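As a quick check of the reduced-sample-space idea, here is a minimal Python sketch. The variable names are illustrative and not part of the original exercise; it keeps only the outcomes consistent with the new information and recomputes the probability.

```python
from fractions import Fraction

# Full sample space for one roll of a fair six-sided die.
omega = {1, 2, 3, 4, 5, 6}

# New information: the die showed an odd number, so the sample space shrinks.
omega_given_odd = {outcome for outcome in omega if outcome % 2 == 1}

# P(5) before the update and P(5|odd) after it.
p_5 = Fraction(len({5} & omega), len(omega))
p_5_given_odd = Fraction(len({5} & omega_given_odd), len(omega_given_odd))

print(p_5)            # 1/6
print(p_5_given_odd)  # 1/3
```

Using `Fraction` keeps the results exact, which makes them easy to compare with the hand calculation.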


Say we roll a fair six-sided die and want to find the probability of getting an odd number, given the die showed a number greater than 1 after landing. Using probability notation, we want to find P(A|B) where:

- A is the event that the number is odd: A = {1, 3, 5}
- B is the event that the number is greater than 1: B = {2, 3, 4, 5, 6}


To find P(A|B), we need to use the following formula:

**P(A|B) = number of successful outcomes / total number of possible outcomes**


We know for sure event B happened (the number is greater than 1), so the sample space is reduced from {1, 2, 3, 4, 5, 6} to {2, 3, 4, 5, 6}.


The total number of possible outcomes above is given by the number of elements in the reduced sample space Ω={2,3,4,5,6} — there are five elements.

The number of elements in a set is called the **cardinal** of the set. Ω is a set, and the **cardinal of Ω={2,3,4,5,6} is**:

**cardinal(Ω) = 5**


In set notation, cardinal(Ω) is abbreviated as **card(Ω)**, so we have:

total number of possible outcomes = **card(Ω) = 5**



The only odd numbers we can get are 3 and 5, and the number of successful outcomes is given by the cardinal of the set {3, 5}:

number of successful outcomes = 2 = card({3,5})


### **P(A|B) = card(A∩B)/card(B) = 2/5**
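The cardinal-based formula translates almost directly into Python sets. The helper below is a hypothetical sketch (the name `conditional_probability` is ours) and assumes that all outcomes are equally likely and that B is not empty.

```python
from fractions import Fraction

def conditional_probability(event_a, event_b):
    """Return P(A|B) = card(A ∩ B) / card(B) for equally likely outcomes."""
    return Fraction(len(event_a & event_b), len(event_b))

a = {1, 3, 5}        # A: the number is odd
b = {2, 3, 4, 5, 6}  # B: the number is greater than 1

print(conditional_probability(a, b))  # 2/5
```

Note that the full sample space never appears in the function: once B is known to have happened, only the outcomes inside B matter.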



--------------------------------------------------
Two fair six-sided dice are simultaneously rolled, and the two numbers they show are added together. The diagram below shows all the possible results that we can get from adding the two numbers together.

![probability-pic-1](https://raw.githubusercontent.com/tongNJ/Dataquest-Online-Courses-2022/main/Pictures/probability-pic-1.PNG)


Find P(A|B), where A is the event where the sum is an even number, and B is the event that the sum is less than eight.

In [1]:
def two_dice_sum():
    total = []
    for i in range(1,7):
        for j in range(1,7):
            _sum = i+j
            total.append(_sum)
    return total


In [21]:
card_b=0 
card_a_and_b = 0
for i in two_dice_sum():
    if i < 8:
        card_b += 1
        if i%2==0:
            card_a_and_b +=1

print('card(b) = ' + str(card_b))
print('card(a_and_b) = ' + str(card_a_and_b))
print(f'P(A|B) = card(a and b) / card(b) = {round(card_a_and_b/card_b*100,2)}%')

card(b) = 21
card(a_and_b) = 9
P(A|B) = card(a and b) / card(b) = 42.86%
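The same result can be cross-checked with an alternative sketch that enumerates the outcomes with `itertools.product` and keeps the answer as an exact fraction. This is a different route from the loop above, not a replacement for it.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes.
rolls = list(product(range(1, 7), repeat=2))

# B: the sum is less than eight.
b = [roll for roll in rolls if sum(roll) < 8]
# A ∩ B: the sum is less than eight and even.
a_and_b = [roll for roll in b if sum(roll) % 2 == 0]

p_a_given_b = Fraction(len(a_and_b), len(b))
print(len(b), len(a_and_b), p_a_given_b)  # 21 9 3/7
```

The fraction 3/7 is approximately 42.86%, which agrees with the output above.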


----------------------------------------------------------------------------------------------------------
A team of biologists wants to measure the efficiency of a new HIV test they developed (HIV is a virus that causes AIDS, a disease which affects the immune system). They used the new method to test 53 people, and the results are summarized in the table below:

![probability-pic-8](https://raw.githubusercontent.com/tongNJ/Dataquest-Online-Courses-2022/main/Pictures/probability-pic-8.PNG)


By reading the table above, we can see that:

- 23 people are infected with HIV.
- 30 people are not infected with HIV ($HIV^C$ means not infected with HIV — recall from the previous course that the superscript "C" indicates a set complement).
- 45 people tested positive for HIV.
- 8 people tested negative for HIV.
- Out of the 23 infected people, 21 tested positive (correct diagnosis).
- Out of the 30 not-infected people, 24 tested positive (wrong diagnosis).


The team now intends to use these results to calculate probabilities for new patients and figure out whether the test is reliable enough to use in hospitals. They want to know:

1. What is the probability of testing positive, given that a patient is infected with HIV?


2. What is the probability of testing negative, given that a patient is not infected with HIV?

In [24]:
#Q1 What is the probability of testing positive, given that a patient is infected with HIV?
# P(T+ | HIV+) = card(T+ ∩ HIV+) / card(HIV+)

card_HIV = 23 # there are 23 people infected with HIV
card_positive_and_HIV = 21 # out of 23 infected people, 21 people were tested positive

P_T_given_HIV = card_positive_and_HIV /card_HIV 

print(f'The probability of testing positive, given that a patient is infected with HIV is {round(P_T_given_HIV*100,2)}%')

The probability of testing positive, given that a patient is infected with HIV is 91.3%


In [25]:
#Q2 What is the probability of testing negative, given that a patient is not infected with HIV?
# P(T- | HIV-) = card(T- ∩ HIV-) / card(HIV-)
card_no_HIV = 30
card_negative_and_no_HIV = 6

P_T_given_no_HIV = card_negative_and_no_HIV / card_no_HIV

print(f'The probability of testing negative, given that a patient is not infected with HIV is {round(P_T_given_no_HIV*100,2)}%')

The probability of testing negative, given that a patient is not infected with HIV is 20.0%
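Both questions can also be answered from a single table of counts. The sketch below rebuilds the table from the numbers listed in the bullet points; the two "negative" cells are derived by subtraction, and the dictionary layout and function name are assumptions made for illustration.

```python
from fractions import Fraction

# (infection status, test result) -> number of people.
counts = {
    ("infected", "positive"): 21,
    ("infected", "negative"): 2,       # 23 infected minus 21 who tested positive
    ("not infected", "positive"): 24,
    ("not infected", "negative"): 6,   # 30 not infected minus 24 who tested positive
}

def p_result_given_status(result, status):
    """P(result | status) = card(result ∩ status) / card(status)."""
    card_status = sum(n for (s, _), n in counts.items() if s == status)
    return Fraction(counts[(status, result)], card_status)

p_pos_given_infected = p_result_given_status("positive", "infected")
p_neg_given_not_infected = p_result_given_status("negative", "not infected")

print(p_pos_given_infected)      # 21/23
print(p_neg_given_not_infected)  # 1/5
```

The fractions 21/23 and 1/5 correspond to the 91.3% and 20.0% printed above.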


In [26]:
'''
The probability of testing negative given that a patient is not
infected with HIV is 20%. This means that for every 10,000 healthy
patients, only about 2000 will get a correct diagnosis, while the
other 8000 will not. It looks like the test is almost completely
inefficient, and it could be dangerous to have it used in hospitals.
'''



--------------------------------------------------------------------------------------------------
A company offering a browser-based task manager tool intends to do some targeted advertising based on people's browsers. The data they collected about their users is described in the table below:

![probability-pic-9](https://raw.githubusercontent.com/tongNJ/Dataquest-Online-Courses-2022/main/Pictures/probability-pic-9.PNG)

Find:

1. P(Premium | Chrome) — the probability that a randomly chosen user has a premium subscription, provided their browser is Chrome. Assign your answer to **p_premium_given_chrome**.


2. P(Basic | Safari) — the probability that a randomly chosen user has a basic subscription, provided their browser is Safari. Assign your answer to **p_basic_given_safari**.


3. P(Free | Firefox) — the probability that a randomly chosen user has a free subscription, provided their browser is Firefox. Assign your answer to **p_free_given_firefox**.


4. Between a Chrome user and a Safari user, who is more likely to have a premium subscription? If you think a Chrome user is the answer, then assign the string 'Chrome' to a variable named more_likely_premium, otherwise assign 'Safari'. To solve this exercise, you'll also need to calculate **P(Premium | Safari)**.

In [29]:
# Q1 - the probability that a randomly chosen user has a premium subscription, provided their browser is Chrome
# p_premium_given_chrome = P(premium ∩ Chrome) / P (Chrome)
p_premium_given_chrome = (158/6385) / (2762/6385)
print(f'''The probability that a randomly chosen user has a premium subscription, 
provided their browser is Chrome = {round(p_premium_given_chrome*100,2)}%''')

The probability that a randomly chosen user has a premium subscription, 
provided their browser is Chrome = 5.72%


In [32]:
#Q2 - the probability that a randomly chosen user has a basic subscription, provided their browser is Safari.
p_basic_given_safari = 274/1288
print(f'''The probability that a randomly chosen user has a basic subscription, 
provided their browser is Safari = {round(p_basic_given_safari*100,2)}%''')

The probability that a randomly chosen user has a basic subscription, 
provided their browser is Safari = 21.27%


In [33]:
#Q3 - the probability that a randomly chosen user has a free subscription, provided their browser is Firefox
p_free_given_firefox = 2103/2285
print(f'''The probability that a randomly chosen user has a free subscription, 
provided their browser is Firefox = {round(p_free_given_firefox*100,2)}%''')

The probability that a randomly chosen user has a free subscription, 
provided their browser is Firefox = 92.04%


In [36]:
#Q4 - Between a Chrome user and a Safari user, who is more likely to have a premium subscription?
p_premium_given_safari = 120/1288
print(f'''The probability that a randomly chosen user has a premium subscription, 
provided their browser is Safari = {round(p_premium_given_safari*100,2)}%''' )
print('\n')
more_likely_premium = 'Safari'  # 9.32% for Safari vs. 5.72% for Chrome
print('Safari users are more likely to choose Premium subscription')

The probability that a randomly chosen user has a premium subscription, 
provided their browser is Safari = 9.32%


Safari users are more likely to choose Premium subscription
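The answers above mix two styles: Q1 divides two probabilities, while Q2 to Q4 divide two counts. The short sketch below, using only the Chrome numbers already quoted in Q1, suggests why both routes give the same value: the total number of users cancels out.

```python
from fractions import Fraction

total_users = 6385
chrome_users = 2762
premium_and_chrome = 158

# Route 1: ratio of probabilities, P(Premium ∩ Chrome) / P(Chrome).
via_probabilities = (
    Fraction(premium_and_chrome, total_users) / Fraction(chrome_users, total_users)
)

# Route 2: ratio of cardinals, card(Premium ∩ Chrome) / card(Chrome).
via_counts = Fraction(premium_and_chrome, chrome_users)

print(via_probabilities == via_counts)     # True
print(round(float(via_counts) * 100, 2))   # 5.72
```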


### Quick important summary

- P(A) means finding the probability of A

- P(A|B) means finding the conditional probability of A (given that B occurs)

- P(A ∩ B) means finding the probability that both A and B occur

- P(A ∪ B) means finding the probability that A occurs or B occurs (this doesn't exclude the situation where both A and B occur)
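To make the four notations concrete, here is a small sketch on a single roll of a fair six-sided die. The events A and B are chosen purely for illustration and do not come from the exercises above.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}
a = {1, 3, 5}  # assumed event A: the number is odd
b = {4, 5, 6}  # assumed event B: the number is greater than 3

p_a = Fraction(len(a), len(omega))           # P(A)
p_a_given_b = Fraction(len(a & b), len(b))   # P(A|B)
p_a_and_b = Fraction(len(a & b), len(omega)) # P(A ∩ B)
p_a_or_b = Fraction(len(a | b), len(omega))  # P(A ∪ B)

print(p_a, p_a_given_b, p_a_and_b, p_a_or_b)  # 1/2 1/3 1/6 5/6
```

Python's `&` and `|` set operators play the roles of ∩ and ∪, so the code reads much like the notation.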


#### Test

Suppose we have a bowl with **six green marbles** and **four red marbles**. If we're drawing one marble at a time randomly and **without replacement** (without replacement means we don't put the marbles drawn back in the bowl), then what's the probability of getting a red marble on the first draw, followed by a green marble on the second draw?

In [1]:
#P(1st_red) = total marbles in red / total marbles
p_1st_red = 4/10

#P(2nd_green | 1st_red) = total green marbles / total marbles left after the red one is drawn
p_2nd_green = 6/9

#P(1st draw red followed by 2nd draw green) = P(1st_red) * P(2nd_green | 1st_red)
p_1st_red_2nd_green = p_1st_red * p_2nd_green

print(p_1st_red_2nd_green)

0.26666666666666666


In probability notation, we want to find P(A ∩ B), where:

- A is the event that we get a red marble on the first draw
- B is the event that we get a green marble on the second draw

In this case, we don't have a table anymore that we can use to calculate P(A ∩ B). However, we can find a solution by using the conditional probability formula to develop a separate formula for P(A ∩ B). Using a little algebra, we have:

![probability-pic-10](https://raw.githubusercontent.com/tongNJ/Dataquest-Online-Courses-2022/main/Pictures/probability-pic-10.PNG)


Above, we used P(A|B) to develop our formula, but note that we can also use P(B|A):

![probability-pic-11](https://raw.githubusercontent.com/tongNJ/Dataquest-Online-Courses-2022/main/Pictures/probability-pic-11.PNG)


Note that A ∩ B = B ∩ A, which means that P(A∩B) = P(B∩A). As a consequence, we have two different formulas we can use to calculate P(A ∩ B):


![probability-pic-12](https://raw.githubusercontent.com/tongNJ/Dataquest-Online-Courses-2022/main/Pictures/probability-pic-12.PNG)

### Either of the two formulas above is called the multiplication rule of probability — or, in short, the multiplication rule.
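The multiplication rule can be checked against the marble test by brute force. The sketch below is an illustration rather than part of the course solution: it treats the ten marbles as distinct, enumerates every ordered pair of draws without replacement, and compares the count with P(A) ⋅ P(B|A).

```python
from fractions import Fraction
from itertools import permutations

# Six green and four red marbles, as in the test above.
bowl = ["green"] * 6 + ["red"] * 4

# Every ordered pair of distinct marbles is an equally likely (first, second) draw.
draws = list(permutations(range(len(bowl)), 2))
red_then_green = [
    (i, j) for i, j in draws if bowl[i] == "red" and bowl[j] == "green"
]
by_enumeration = Fraction(len(red_then_green), len(draws))

# Multiplication rule: P(A ∩ B) = P(A) * P(B|A).
by_rule = Fraction(4, 10) * Fraction(6, 9)

print(by_enumeration, by_rule)  # 4/15 4/15
```

The fraction 4/15 is approximately 0.2667, matching the value computed in the test.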

#### Exercise below

- The probability that a customer buys RAM memory from an electronics store is P(RAM) = 0.0822.

- The probability that a customer buys a gaming laptop is P(GL) = 0.0184.

- The probability that a customer buys RAM memory given that they bought a gaming laptop is P(RAM | GL) = 0.0022.

Calculate:

1. P(GL ∩ RAM) — assign your answer to p_gl_and_ram.
2. P($RAM^C$ | GL) — assign your answer to p_non_ram_given_gl.
3. P(GL ∩ $RAM^C$) — assign your answer to p_gl_and_non_ram.
4. P(GL ∪ RAM) — assign your answer to p_gl_or_ram.

In [5]:
# 1. P(GL ∩ RAM) — assign your answer to p_gl_and_ram.
#  P(GL ∩ RAM) = P(RAM∩ GL) = P(GL) * P (RAM | GL)
p_ram = 0.0822
p_gl = 0.0184
p_ram_given_gl = 0.0022

p_gl_and_ram = p_gl * p_ram_given_gl
print(f'P(GL ∩ RAM) = {round(p_gl_and_ram*100,4)}%')

P(GL ∩ RAM) = 0.004%
4.048e-05


In [13]:
# 2. P(no_RAM | GL) — assign your answer to p_non_ram_given_gl.
# P(no_RAM | GL) = P(no_RAM ∩ GL) / P(GL) = 1 - P(RAM | GL)
p_non_ram_given_gl = 1 - p_ram_given_gl
print(f'P(no_RAM | GL) = {round(p_non_ram_given_gl*100,4)}%')

P(no_RAM | GL) = 99.78%


In [14]:
# 3. P(GL ∩ no_RAM) — assign your answer to p_gl_and_non_ram.
# P(GL ∩ no_RAM) = P(no_RAM ∩ GL) = P(no_RAM | GL) * P(GL)
p_gl_and_non_ram = p_non_ram_given_gl * p_gl
print(f'P(GL ∩ no_RAM) = {round(p_gl_and_non_ram*100,4)}%')

P(GL ∩ no_RAM) = 1.836%


In [15]:
# 4. P(GL ∪ RAM) — assign your answer to p_gl_or_ram.
# P(GL ∪ RAM) = P(GL) + P(RAM) - P(GL ∩ RAM)
p_gl_or_ram = p_gl + p_ram - p_gl_and_ram
print(f'P(GL ∪ RAM) = {round(p_gl_or_ram*100,4)}%')

P(GL ∪ RAM) = 10.056%
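A couple of sanity checks can help catch arithmetic slips in exercises like this one. The sketch below reuses the three given probabilities. The checks themselves are our addition: the two intersections should add back up to P(GL), and both forms of the multiplication rule should agree.

```python
import math

p_ram = 0.0822
p_gl = 0.0184
p_ram_given_gl = 0.0022

p_gl_and_ram = p_gl * p_ram_given_gl
p_gl_and_non_ram = p_gl * (1 - p_ram_given_gl)

# GL ∩ RAM and GL ∩ RAM^C partition GL, so their probabilities sum to P(GL).
print(math.isclose(p_gl_and_ram + p_gl_and_non_ram, p_gl))  # True

# P(GL | RAM) from the conditional probability formula, then the other
# form of the multiplication rule: P(RAM) * P(GL | RAM) = P(GL) * P(RAM | GL).
p_gl_given_ram = p_gl_and_ram / p_ram
print(math.isclose(p_ram * p_gl_given_ram, p_gl * p_ram_given_gl))  # True
```

`math.isclose` is used instead of `==` because floating-point products rarely match exactly.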


### Recall from the previous course that we introduced the multiplication rule in a slightly different way (notice there's no conditional probability involved in the formula below):

#### P(A∩B) = P(A) ⋅ P(B)    - formula(2)

#### but we have just learned formula(1):

![probability-pic-12](https://raw.githubusercontent.com/tongNJ/Dataquest-Online-Courses-2022/main/Pictures/probability-pic-12.PNG)


#### Our question: under what circumstances does formula(1) give the same result as formula(2)?



To clarify the difference, let's consider an example where we roll a fair six-sided die twice and want to find P(A ∩ B), where:

- Event A is getting a 5 on the first roll
- Event B is getting a 6 on the second roll

Using formula(1), we have:
**P(A∩B) = P(A) ⋅ P(B|A) = (1/6) ⋅ (1/6) = 1/36**

Using formula(2), we have:
**P(A∩B) = P(A) ⋅ P(B) = (1/6) ⋅ (1/6) = 1/36**

In more general terms, if the occurrence of event A leaves the probability of B unchanged, and vice versa (A and B can be any events for any random experiment), then events A and B are said to be **statistically independent** (the shorter term "independent" is more common). In that case P(B|A) = P(B), so formula(1) reduces to formula(2).


- To prove two events are **dependent**, it's enough to show that any one of these three relationships **fails**:
  - P(A)=P(A∣B),
  - P(B)=P(B∣A),
  - and P(A∩B)= P(A)⋅P(B).
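The third relationship is the easiest one to test in code. The sketch below enumerates two rolls of a fair die and checks P(A∩B) = P(A)⋅P(B) exactly. The helper names and event C (the sum is at least 10) are hypothetical additions used to show a dependent pair.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first roll, second roll) outcomes.
rolls = set(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a subset of `rolls`."""
    return Fraction(len(event), len(rolls))

def is_independent(a, b):
    """Check P(A ∩ B) == P(A) * P(B) with exact fractions."""
    return prob(a & b) == prob(a) * prob(b)

a = {r for r in rolls if r[0] == 5}     # A: 5 on the first roll
b = {r for r in rolls if r[1] == 6}     # B: 6 on the second roll
c = {r for r in rolls if sum(r) >= 10}  # C (assumed): the sum is at least 10

print(is_independent(a, b))  # True
print(is_independent(a, c))  # False
```

Here P(A∩C) = 2/36, while P(A)⋅P(C) = 1/36, so knowing the first roll was a 5 changes the probability of a high sum.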


-----------------------------------------------------------------------------------------------------------
### Mutually independent

We say events A, B, and C are **mutually independent** only if they meet two conditions.
- First, the condition of pairwise independence must hold:
  - P(A∩B) = P(A) * P(B)
  - P(A∩C) = P(A) * P(C)
  - P(B∩C) = P(B) * P(C)


- Second, events A, B, and C must be independent together:
  - P(A∩B∩C) = P(A) * P(B) * P(C)


If either of these two conditions is not fulfilled, then A, B, and C are not mutually independent, and we cannot use the multiplication rule in the above form.
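The second condition is not implied by the first. A standard textbook illustration, not taken from this course, uses two fair coin flips: A is "first flip is heads", B is "second flip is heads", and C is "both flips match". The sketch below checks both conditions exactly.

```python
from fractions import Fraction
from itertools import product

# Four equally likely outcomes: HH, HT, TH, TT.
flips = set(product("HT", repeat=2))

def prob(event):
    """Probability of an event given as a subset of `flips`."""
    return Fraction(len(event), len(flips))

a = {f for f in flips if f[0] == "H"}   # first flip is heads
b = {f for f in flips if f[1] == "H"}   # second flip is heads
c = {f for f in flips if f[0] == f[1]}  # both flips show the same face

pairwise = all(
    prob(x & y) == prob(x) * prob(y) for x, y in [(a, b), (a, c), (b, c)]
)
together = prob(a & b & c) == prob(a) * prob(b) * prob(c)

print(pairwise)  # True
print(together)  # False
```

Every pair satisfies the product rule, yet P(A∩B∩C) = 1/4 while P(A)⋅P(B)⋅P(C) = 1/8, so the three events are pairwise independent but not mutually independent.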

What we really need is to develop a multiplication rule in terms of conditional probability that works correctly for cases where we have **three dependent events**.

Let's start by recalling that:

  **P(A∩B) = P(A) ⋅ P(B|A)**
  
  
Note that we can think of P(A ∩ B ∩ C) as the probability of two events instead of three:

![probability-pic-13](https://raw.githubusercontent.com/tongNJ/Dataquest-Online-Courses-2022/main/Pictures/probability-pic-13.PNG)


Now we have a final multiplication rule we can use for cases where we have three mutually dependent events:

  **P(A∩B∩C) = P(A) ⋅ P(B|A) ⋅ P(C|A∩B)**
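As a hedged illustration of the three-event rule, the sketch below reuses the bowl of six green and four red marbles and asks for the probability of drawing red, red, then green without replacement. This particular question is our own example; the brute-force enumeration is there only to confirm the formula.

```python
from fractions import Fraction
from itertools import permutations

bowl = ["green"] * 6 + ["red"] * 4

# P(A) * P(B|A) * P(C|A ∩ B): red first, red second, green third.
by_rule = Fraction(4, 10) * Fraction(3, 9) * Fraction(6, 8)

# Brute force: every ordered triple of distinct marbles is equally likely.
draws = list(permutations(range(len(bowl)), 3))
hits = [d for d in draws if [bowl[i] for i in d] == ["red", "red", "green"]]
by_enumeration = Fraction(len(hits), len(draws))

print(by_rule, by_enumeration)  # 1/10 1/10
```

Each factor conditions on everything drawn so far: after one red there are 3 reds among 9 marbles, and after two reds there are 6 greens among 8.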

In [16]:
32/90

0.35555555555555557

In [17]:
90/2000

0.045