One intuitive way to understand `P(A|B)` is **if B occurs, then what's the probability that A occurs?** 

This suggests that both events A and B occur. However, since `P(A ∩ B)` is the probability that both A and B occur, then what's the difference between `P(A|B)` and `P(A ∩ B)`, if any?

![image.png](attachment:image.png)

Finding `P(A ∩ B)`, the probability that both A and B occur, means finding the probability that we get a number that is both odd and greater than 1 (either a 3 or a 5), which is:

![image.png](attachment:image.png)

* With `P(A ∩ B)`, we're trying to find the probability of two events (A and B) occurring together, while
* With `P(A|B)`, we're only trying to find the probability of a single event, A, given that B occurs

![image.png](attachment:image.png)
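The difference can be checked by counting outcomes. The sketch below assumes a single roll of a fair six-sided die, with A taken to be "the number is odd" and B "the number is greater than 1". That assignment is an assumption, since the figure that defines the events is not reproduced in the text.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die
outcomes = {1, 2, 3, 4, 5, 6}

# Assumed event definitions (the original figure is not reproduced here)
a = {n for n in outcomes if n % 2 == 1}   # A: the number is odd
b = {n for n in outcomes if n > 1}        # B: the number is greater than 1

# P(A ∩ B): favorable outcomes are counted against the whole sample space
p_a_and_b = Fraction(len(a & b), len(outcomes))

# P(A|B): favorable outcomes are counted against the outcomes in B only
p_a_given_b = Fraction(len(a & b), len(b))

print(p_a_and_b, p_a_given_b)
```

The numerator is the same in both cases (the outcomes 3 and 5). Only the denominator changes: six outcomes for `P(A ∩ B)`, but just the five outcomes in B for `P(A|B)`.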

# Quick and important summary:

* `P(A)` means finding the probability of A
* `P(A|B)` means finding the conditional probability of A (given that B occurs)
* `P(A ∩ B)` means finding the probability that both A and B occur
* `P(A ∪ B)` means finding the probability that A occurs or B occurs (this doesn't exclude the situation where both A and B occur)

The analytics team of a store randomly sampled 2,000 customers and looked at customer behavior with respect to buying laptops and wireless mouses. The results are summarized in the table below, where:

* `L` means the customer bought a laptop
* `M` means the customer bought a mouse
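* `LC` means the customer didn't buy a laptop (`LC` is the complement of `L`, often written L^C)
* `MC` means the customer didn't buy a mouse (`MC` is the complement of `M`, often written M^C)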

![image.png](attachment:image.png)

In [3]:
# P(M), the probability that a customer buys a mouse
p_m = 515/2000

# P(M|L), the probability that a customer buys a mouse given that they bought a laptop 
p_m_given_l = 32/90

# P(M ∩ L), the probability that a customer buys both a mouse and a laptop
p_m_and_l = 32/2000

# P(M ∪ L), the probability that a customer buys a mouse or a laptop
p_m_or_l = (515 + 90)/2000 - p_m_and_l
p_m_or_l

0.2865
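As a cross-check, the remaining cells of the table can be derived from the counts used in the cell above (2,000 customers, 90 laptop buyers, 515 mouse buyers, 32 who bought both). The sketch below recomputes `P(M ∪ L)` in two ways. The variable names are illustrative and are not part of the original notebook.

```python
from fractions import Fraction

# Counts taken from the cell above; the other cells follow by subtraction
total = 2000
n_l = 90          # bought a laptop
n_m = 515         # bought a mouse
n_l_and_m = 32    # bought both

n_l_only = n_l - n_l_and_m                             # laptop, no mouse
n_m_only = n_m - n_l_and_m                             # mouse, no laptop
n_neither = total - n_l_and_m - n_l_only - n_m_only    # neither item

# P(M ∪ L) by counting customers who bought at least one of the two items
p_m_or_l_counted = Fraction(n_l_and_m + n_l_only + n_m_only, total)

# P(M ∪ L) as P(M) + P(L) - P(M ∩ L), which avoids counting "both" twice
p_m_or_l_rule = Fraction(n_m, total) + Fraction(n_l, total) - Fraction(n_l_and_m, total)

print(n_l_only, n_m_only, n_neither)
print(float(p_m_or_l_counted), float(p_m_or_l_rule))
```

Both routes give 573/2000, which matches the value printed by the cell.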

Looking at the table, we can see that `P(MC|L)`, the probability that a customer doesn't buy a mouse given that they bought a laptop, is:

![image.png](attachment:image.png)

For any random experiment, either event A or AC will happen, so the event "A or non-A" (A ∪ AC) is certain and has a probability of one:

![image.png](attachment:image.png)
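The same idea carries over to conditional probabilities: among the customers who bought a laptop, each one either bought a mouse or didn't. A minimal check, reusing the counts from the earlier code cell (90 laptop buyers, 32 of whom also bought a mouse), might look like this:

```python
from fractions import Fraction

n_l = 90          # customers who bought a laptop
n_l_and_m = 32    # of those, customers who also bought a mouse

p_m_given_l = Fraction(n_l_and_m, n_l)              # P(M|L)
p_non_m_given_l = Fraction(n_l - n_l_and_m, n_l)    # P(MC|L), counted directly

# Within the laptop buyers, "mouse" and "no mouse" cover every case
assert p_m_given_l + p_non_m_given_l == 1
print(p_non_m_given_l, 1 - p_m_given_l)
```

Counting directly and subtracting from one give the same fraction, 58/90 in lowest terms.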

For our electronics store example, say new data is collected, and we know that:

* `P(B|M) = 0.1486`, the probability that a customer buys batteries given that they bought a mouse is `0.1486`.
* `P(C|L) = 0.0928`, the probability that a customer buys a cooler given that they bought a laptop is `0.0928`.
* `P(BC|C) = 0.7622`, the probability that a customer doesn't buy batteries given that they bought a cooler is `0.7622`.

In [4]:
p_b_given_m = 0.1486
p_c_given_l = 0.0928
p_non_b_given_c = 0.7622

# P(BC|M)
p_non_b_given_m = 1-p_b_given_m

# P(CC|L)
p_non_c_given_l = 1-p_c_given_l

# P(B|C)
p_b_given_c = 1 - p_non_b_given_c

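# P(B|MC)
# Cannot be computed from the values given: P(B|M) and P(B|MC) are not complements,
# so we would need data about the customers who didn't buy a mouse (MC).
p_b_given_non_m = 'not possible'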
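To see why `P(B|MC)` cannot be recovered from `P(B|M)`, consider the sketch below. It uses two hypothetical stores with made-up counts (they are not from the original data). Both stores have the same `P(B|M)`, yet very different values of `P(B|MC)`.

```python
from fractions import Fraction

def conditional(n_both, n_condition):
    """P(X|Y) from counts: customers in both X and Y over customers in Y."""
    return Fraction(n_both, n_condition)

# Two hypothetical stores (made-up counts). Both have 100 mouse buyers,
# 15 of whom bought batteries, so P(B|M) is the same in both.
store_1 = {"m": 100, "b_and_m": 15, "non_m": 900, "b_and_non_m": 9}
store_2 = {"m": 100, "b_and_m": 15, "non_m": 900, "b_and_non_m": 450}

for store in (store_1, store_2):
    p_b_given_m = conditional(store["b_and_m"], store["m"])
    p_b_given_non_m = conditional(store["b_and_non_m"], store["non_m"])
    print(p_b_given_m, p_b_given_non_m)
```

Changing the condition from M to MC moves us to a different group of customers, so knowing one conditional probability tells us nothing about the other.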


Note: `P(A|B)` does not necessarily have the same value as `P(B|A)` (although in some rare circumstances, they may end up being equal).

In [6]:
# Find:

# P(M|LC)
p_m_given_non_l = 483/1910

# P(LC|M)
p_non_l_given_m = 483/515

# P(M ∩ LC)
p_m_and_non_l = 483/2000

# P(LC ∩ M)
p_non_l_and_m = 483/2000 # P(M and non-L) is the same as P(non-L and M)


Note, however, that we can't always calculate probabilities for events of the form `P(A ∩ B)` just by looking at some data in a table.

Suppose we have a bowl with six green marbles and four red marbles. If we're drawing one marble at a time randomly and without replacement (without replacement means we don't put the marbles drawn back in the bowl), then what's the probability of getting a red marble on the first draw, followed by a green marble on the second draw?

In probability notation, we want to find `P(A ∩ B)`, where:

* A is the event that we get a red marble on the first draw
* B is the event that we get a green marble on the second draw

In this case, we don't have a table anymore that we can use to calculate `P(A ∩ B)`. However, we can find a solution by using the conditional probability formula to develop a separate formula for `P(A ∩ B)`. Using a little algebra, we have:

![image.png](attachment:image.png)

Either of the two formulas above is called the **multiplication rule** of probability — or, in short, the **multiplication rule**.
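For reference, the algebra can be written out as follows, starting from the conditional probability formula. The notation is the standard one; the original figure may lay the steps out differently.

$$
P(A \mid B) = \frac{P(A \cap B)}{P(B)} \quad\Longrightarrow\quad P(A \cap B) = P(B) \cdot P(A \mid B)
$$

$$
P(B \mid A) = \frac{P(A \cap B)}{P(A)} \quad\Longrightarrow\quad P(A \cap B) = P(A) \cdot P(B \mid A)
$$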

Out of the ten marbles in the bowl, four marbles are red, so we have:

`P(A) = 4/10`

We're sampling without replacement (we don't put back the marbles once we draw them), which means that for the second draw, we have nine marbles left, six of which are green:

`P(B|A) = 6/9`

Using the multiplication rule:

`P(A ∩ B) = (4/10) × (6/9) = 24/90 = 4/15`
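One way to sanity-check this result is to enumerate every possible ordered pair of draws and count the favorable ones. The sketch below does that with `itertools.permutations` and compares the count against the multiplication rule. The marble labels and variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

marbles = ["red"] * 4 + ["green"] * 6

# Multiplication rule: P(A ∩ B) = P(A) * P(B|A)
p_a = Fraction(4, 10)           # red on the first draw
p_b_given_a = Fraction(6, 9)    # green on the second draw, one red already removed
p_rule = p_a * p_b_given_a

# Brute-force check: every ordered pair of distinct marbles is equally likely
draws = list(permutations(range(len(marbles)), 2))
favorable = [d for d in draws if marbles[d[0]] == "red" and marbles[d[1]] == "green"]
p_counted = Fraction(len(favorable), len(draws))

print(p_rule, p_counted)
```

There are 90 ordered pairs and 24 of them are "red, then green", which agrees with the 24/90 obtained above.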

For our electronics store example, say we know that:

* The probability that a customer buys RAM memory from an electronics store is `P(RAM) = 0.0822`.
* The probability that a customer buys a gaming laptop is `P(GL) = 0.0184`.
* The probability that a customer buys RAM memory given that they bought a gaming laptop is `P(RAM | GL) = 0.0022`.

In [7]:
p_ram = 0.0822
p_gl = 0.0184
p_ram_given_gl = 0.0022

# P(GL ∩ RAM)
p_gl_and_ram = p_gl*p_ram_given_gl

# P(RAMC | GL) 
p_non_ram_given_gl = 1 - p_ram_given_gl

# P(GL ∩ RAMC)
p_gl_and_non_ram = p_gl*p_non_ram_given_gl

# P(GL ∪ RAM)
p_gl_or_ram = p_gl + p_ram - p_gl_and_ram

In an earlier probability project, we introduced the multiplication rule in a slightly different way (notice there's no conditional probability involved in the formula below):

![image.png](attachment:image.png)

To clarify the difference, let's consider an example where we roll a fair six-sided die twice and want to find `P(A ∩ B)`, where:

* Event A is getting a 5 on the first roll
* Event B is getting a 6 on the second roll

![image.png](attachment:image.png)

![image.png](attachment:image.png)

![image.png](attachment:image.png)

In more general terms, if the occurrence of event A leaves the probability of B unchanged, and vice versa (A and B can be any events for any random experiment), then events A and B are said to be **statistically independent** (although the shorter term "independent" is more common).

![image.png](attachment:image.png)
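For the two-roll example above (A is a 5 on the first roll, B is a 6 on the second roll), independence can be confirmed by enumerating all 36 outcomes. This is a minimal sketch with illustrative variable names:

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first roll, second roll) outcomes
rolls = list(product(range(1, 7), repeat=2))

a = {r for r in rolls if r[0] == 5}   # A: 5 on the first roll
b = {r for r in rolls if r[1] == 6}   # B: 6 on the second roll

p_a = Fraction(len(a), len(rolls))
p_b = Fraction(len(b), len(rolls))
p_a_and_b = Fraction(len(a & b), len(rolls))
p_b_given_a = Fraction(len(a & b), len(a))

print(p_b, p_b_given_a)        # knowing that A happened leaves P(B) unchanged
print(p_a_and_b, p_a * p_b)    # so P(A ∩ B) equals P(A) * P(B)
```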

A fair six-sided die is rolled twice and the following three events are considered:

* `Event K` — the die showed a 4 on the second roll
* `Event L` — the die showed a 2 on the first roll
* `Event M` — the die showed an even number on the second roll

In [8]:
# Find whether the following events are independent or not:

# Events K and L
k_and_l = 'independent'

# Events L and M
l_and_m = 'independent'

# Events K and M
k_and_m = 'dependent'
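These three answers can be verified by enumeration. The sketch below defines a small helper, `independent` (a hypothetical name, not from the original notebook), that tests whether `P(X ∩ Y)` equals `P(X) * P(Y)` over the 36 outcomes of two rolls.

```python
from fractions import Fraction
from itertools import product

rolls = set(product(range(1, 7), repeat=2))   # (first roll, second roll)

def prob(event):
    """Probability of an event, given as a set of equally likely outcomes."""
    return Fraction(len(event), len(rolls))

def independent(x, y):
    """True when P(X ∩ Y) == P(X) * P(Y)."""
    return prob(x & y) == prob(x) * prob(y)

event_k = {r for r in rolls if r[1] == 4}        # 4 on the second roll
event_l = {r for r in rolls if r[0] == 2}        # 2 on the first roll
event_m = {r for r in rolls if r[1] % 2 == 0}    # even number on the second roll

print(independent(event_k, event_l))
print(independent(event_l, event_m))
print(independent(event_k, event_m))
```

K and M both describe the second roll, and K is contained in M, so `P(K ∩ M) = P(K) = 1/6`, which differs from `P(K) * P(M) = 1/12`.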


We saw that events A and B are independent if:

![image.png](attachment:image.png)

If any of the three relationships above does not hold, then events A and B are said to be **statistically dependent (or just "dependent")**.

If events A and B are dependent, it means the occurrence of event A changes the probability of event B and vice versa. In mathematical terms, this means that:

![image.png](attachment:image.png)

To prove two events are dependent, it's enough to show that any one of these three relationships fails:
![image.png](attachment:image.png)

![image.png](attachment:image.png)

Both formulas above lead to the same result. However, depending on the problem we're trying to solve, it may be easier to calculate `P(A|B)` rather than `P(B|A)` or vice versa, so we should choose the formula that's easier to work with.

![image.png](attachment:image.png)

In [1]:
# Find whether the following events are independent or not

# Events L and M
# Check any one of: P(A|B) == P(A), P(B|A) == P(B), or P(A ∩ B) == P(A) * P(B)
l_and_m = 'dependent'

# Events L and MC
l_and_non_m = 'dependent'
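The two answers can be backed up numerically. The sketch below assumes L and M are the laptop and mouse events from the earlier table, as the counts in the following cell suggest. It compares each conditional probability with the corresponding unconditional one.

```python
from fractions import Fraction

# Counts from the laptop/mouse table used earlier in the notebook
total, n_l, n_m, n_l_and_m = 2000, 90, 515, 32

p_m = Fraction(n_m, total)                # P(M)
p_m_given_l = Fraction(n_l_and_m, n_l)    # P(M|L)

p_non_m = 1 - p_m                         # P(MC)
p_non_m_given_l = 1 - p_m_given_l         # P(MC|L)

# One failed relationship is enough to conclude dependence
l_and_m_dependent = p_m_given_l != p_m
l_and_non_m_dependent = p_non_m_given_l != p_non_m

print(float(p_m), float(p_m_given_l))
print(float(p_non_m), float(p_non_m_given_l))
print(l_and_m_dependent, l_and_non_m_dependent)
```

In this data, buying a laptop raises the probability of buying a mouse from about 0.26 to about 0.36, so the events are dependent.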

In [2]:
# calculate

#  P(L ∩ M)
p_l_and_m = (90/2000)*(32/90)

#  P(L ∩ MC)
p_l_and_non_m = (90/2000)*(58/90)

So far, we've only considered two events. But what if we have three events, A, B, and C? How do we find whether or not they are independent? And what multiplication rule do we use with three events?

The multiplication rule for independent events can be extended to any number of events, provided they are all independent. If we have events A, B, C, ..., X, Y, Z, and they are all independent, then the rule becomes:
![image.png](attachment:image.png)

To decide whether three events A, B, and C are independent, we need to check two conditions.

1. The three events have to be independent of one another in pairs, which means the following relationships must be true:
![image.png](attachment:image.png)

Above, events A, B, C are independent in pairs — we say they are **pairwise independent**.

2. They must also be independent together, which mathematically means:
![image.png](attachment:image.png)

If both conditions above hold, events A, B, C are said to be **mutually independent**.

We'll now look at an example where three events satisfy the condition of pairwise independence, and yet they are not mutually independent. Let's say we toss a fair coin twice and consider the following three events, where:

* A is the event that we get heads on both tosses, or heads on the first toss and tails on the second: A = {HH, HT}
* B is the event that we get heads on both tosses, or tails on the first toss and heads on the second: B = {HH, TH}
* C is the event that we get heads on both tosses, or tails on both tosses: C = {HH, TT}

The entire sample space has four possible outcomes:
![image.png](attachment:image.png)

![image.png](attachment:image.png)

![image.png](attachment:image.png)

We conclude that events A, B, C are not mutually independent, even though they are pairwise independent.
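The two conditions can be checked directly on the four-outcome sample space. The sketch below encodes each outcome as a two-letter string (an illustrative choice) and tests pairwise independence first, then the three-way product.

```python
from fractions import Fraction
from itertools import combinations

sample_space = {"HH", "HT", "TH", "TT"}
events = {"A": {"HH", "HT"}, "B": {"HH", "TH"}, "C": {"HH", "TT"}}

def prob(event):
    """Probability of an event, given as a set of equally likely outcomes."""
    return Fraction(len(event), len(sample_space))

# Condition 1: every pair satisfies P(X ∩ Y) == P(X) * P(Y)
pairwise = all(
    prob(events[x] & events[y]) == prob(events[x]) * prob(events[y])
    for x, y in combinations(events, 2)
)

# Condition 2: P(A ∩ B ∩ C) == P(A) * P(B) * P(C)
p_all = prob(events["A"] & events["B"] & events["C"])
p_product = prob(events["A"]) * prob(events["B"]) * prob(events["C"])

print(pairwise, p_all, p_product)
```

Every pair intersects only in HH, giving 1/4 = 1/2 × 1/2. The triple intersection is also just HH, so it has probability 1/4 rather than the 1/8 that mutual independence would require.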

For our electronics store example, say new data is collected, and we know that:

* The probability that a customer buys an electric toothbrush is `P(ET) = 0.0432`.
* The probability that a customer buys an air conditioning system is `P(AC) = 0.0172`.
* The probability that a customer buys a PlayStation is `P(PS) = 0.0236`.

In [4]:
p_et = 0.0432
p_ac = 0.0172
p_ps = 0.0236

# Assuming events ET, AC, and PS are mutually independent, calculate:

# P(ET ∩ PS)
p_et_and_ps = p_et*p_ps

# P(ET ∩ AC)
p_et_and_ac = p_et*p_ac

# P(AC ∩ PS) 
p_ac_and_ps = p_ac*p_ps

# P(ET ∩ AC ∩ PS)
p_et_and_ac_and_ps = p_et*p_ac*p_ps

We now need to develop a **multiplication rule** in terms of conditional probability that works correctly for cases where we have three dependent events. Let's start by recalling that:
![image.png](attachment:image.png)

Note that we can think of `P(A ∩ B ∩ C)` as the probability of two events instead of three:

![image.png](attachment:image.png)
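Spelled out, treating `A ∩ B` as a single event and applying the two-event multiplication rule twice gives the following. This is the standard derivation; the original figures may present the steps differently.

$$
P(A \cap B \cap C) = P\big((A \cap B) \cap C\big) = P(A \cap B) \cdot P(C \mid A \cap B)
$$

$$
P(A \cap B \cap C) = P(A) \cdot P(B \mid A) \cdot P(C \mid A \cap B)
$$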

For our electronics store example, say new data is collected. We know that:

* The probability that a customer doesn't buy a set of laptop stickers is `P(LSC) = 0.9821`.
* The probability that a customer buys screen cleaning wipes given that they bought a set of laptop stickers is `P(CW | LS) = 0.0079`.
* The probability that a customer buys a laptop given that they bought both a set of laptop stickers and screen cleaning wipes is `P(L | LS ∩ CW) = 0.2908`.

In [5]:
p_non_ls = 0.9821
p_cw_given_ls = 0.0079
p_l_given_ls_and_cw = 0.2908

# Assume events LS, CW, and L are dependent and calculate P(LS ∩ CW ∩ L)
p_ls = 1 - p_non_ls

p_ls_and_cw_and_l = p_ls*p_cw_given_ls*p_l_given_ls_and_cw
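The three-event rule can be sanity-checked on a case where brute-force counting is possible. The sketch below reuses the earlier bowl of four red and six green marbles and asks for the probability of drawing red, then green, then red without replacement. This particular sequence is an illustrative choice and is not part of the original exercise.

```python
from fractions import Fraction
from itertools import permutations

marbles = ["red"] * 4 + ["green"] * 6
target = ("red", "green", "red")

# P(A) * P(B|A) * P(C|A ∩ B) for red, then green, then red
p_rule = Fraction(4, 10) * Fraction(6, 9) * Fraction(3, 8)

# Brute-force check over all ordered draws of three distinct marbles
draws = list(permutations(range(len(marbles)), 3))
favorable = [d for d in draws if tuple(marbles[i] for i in d) == target]
p_counted = Fraction(len(favorable), len(draws))

print(p_rule, p_counted)
```

Each factor conditions on everything drawn so far, which is the same pattern as `p_ls * p_cw_given_ls * p_l_given_ls_and_cw` in the cell above.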

We covered the following important concepts and ideas in this project:

![image.png](attachment:image.png)