## Introduction

In the previous course, we covered the fundamentals of probability and learned about:

- Theoretical and empirical probabilities
- Probability rules (the addition rule and the multiplication rule)
- Counting techniques (the rule of product, permutations, and combinations)

In this course, we'll build on what we've learned and develop new techniques that will enable us to better estimate probabilities. Our focus for the entire course will be on learning how to calculate probabilities based on certain conditions — hence the name conditional probability.

By the end of this course, we'll be able to:

- Assign probabilities to events based on certain conditions by using conditional probability rules.
- Assign probabilities to events based on whether or not they are statistically independent of other events.
- Assign probabilities to events based on prior knowledge by using Bayes' theorem.
- Create a spam filter for SMS messages using the multinomial Naive Bayes algorithm.

Let's start by doing a very quick recap and then solving a few probability exercises.

On this screen, we'll solve a few probability exercises to remind ourselves of some of the concepts we learned in the previous course. As a side note, we'll try to do a brief recap wherever possible, but if you feel it's not enough, please go back to the previous course and read the material again — having to recall something you might have forgotten is an essential part of the learning process.

Let's start by considering rolling a regular six-sided die. This random experiment has six possible outcomes: 1, 2, 3, 4, 5, and 6. The set of all possible outcomes associated with a random experiment is called a sample space, and we denote it by the Greek letter Ω ("Omega"). We represent the sample space of a die roll as a set:

![image.png](attachment:31d6ba64-5e3b-4824-9732-ce0091f87627.png)

Under the assumption that all six outcomes are equally likely (the die is fair), we can find P(5) — the probability of the event "getting a 5" — by using the following formula:

![image.png](attachment:2d0e6bef-2fe8-4185-bf90-0f86685694c6.png)

There are six possible outcomes (1, 2, 3, 4, 5, and 6) and one successful outcome (5), so the probability of getting a 5 when rolling a fair six-sided die is:

![image.png](attachment:45ffc572-72ac-4584-a461-f4bce6f58c92.png)
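As a minimal sketch of the formula above, the sample space can be modeled as a Python set and the probability computed as a ratio of counts. The helper name `probability` is ours, introduced only for illustration, and it assumes every outcome is equally likely.

```python
from fractions import Fraction

# Sample space of a single roll of a fair six-sided die
omega = {1, 2, 3, 4, 5, 6}

def probability(event, sample_space):
    """Return P(event), assuming all outcomes are equally likely."""
    successful = event & sample_space  # outcomes of the event that can occur
    return Fraction(len(successful), len(sample_space))

p_5 = probability({5}, omega)
print(p_5)  # 1/6
```

Using `Fraction` keeps the result exact; calling `float(p_5)` converts it to a decimal when needed.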

Now let's solve a few exercises and start the discussion about conditional probability on the next screen.

Consider rolling a fair six-sided die once and calculate:

1. The probability of getting a 2. Assign your answer to p_2.
2. The probability of getting an odd number (1, 3, or 5). Assign your answer to p_odd.
3. The probability of getting a 2 or a 4. Assign your answer to p_2_or_4.

In [1]:
p_2 = 1/6
p_odd = 3/6
p_2_or_4 = 2/6

## Updating Probabilities

![image.png](attachment:72a1e664-5bb3-4bec-9187-dd3ebda64580.png)

![image.png](attachment:0cbe6f32-2055-4d64-901a-33a307751deb.png)

When we don't know whether the number is odd, the possible outcomes of the experiment are 1, 2, 3, 4, 5, or 6. But after we find out the number is odd, the possible outcomes are 1, 3, or 5. In other words, the new information we got reduced the sample space from {1, 2, 3, 4, 5, 6} to {1, 3, 5}:

![image.png](attachment:4a3776fd-280a-4c02-b6b3-141fdf57267a.png)

![image.png](attachment:732c8b77-6540-4491-b17b-bdeb71dfdd08.png)

![image.png](attachment:7e059416-561d-4be7-bcde-7a396d849d28.png)
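The idea of a reduced sample space can be sketched in a few lines of Python. The event "getting a 5" is chosen here purely as an illustration, and the variable names are our own.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}

# New information: the number is odd, so only these outcomes remain possible
reduced_space = {outcome for outcome in omega if outcome % 2 == 1}

# Probability of a 5 before and after learning that the number is odd
p_5_before = Fraction(len({5} & omega), len(omega))
p_5_after = Fraction(len({5} & reduced_space), len(reduced_space))

print(p_5_before)  # 1/6
print(p_5_after)   # 1/3
```

The event itself did not change; only the set of outcomes we divide by became smaller.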

Let's do a few exercises similar to the one above. We'll continue the discussion on the next screen, where we introduce the concept of conditional probability.

A fair six-sided die is rolled. All we know is that the number we got is less than 5. Calculate:

1. The probability of getting a 3. Assign your answer to p_3.
2. The probability of getting a 6. Assign your answer to p_6.
3. The probability of getting an odd number. Assign your answer to p_odd.
4. The probability of getting an even number. Assign your answer to p_even.

In [2]:
p_3 = 1/4
p_6 = 0
p_odd = 2/4
p_even = 2/4

## Conditional Probability

![image.png](attachment:d382ffa5-ace0-43a8-8aa9-20c04a1fa201.png)

![image.png](attachment:6d4f7d93-94c9-4a10-800b-22eb83ec9e84.png)

![image.png](attachment:b3236215-9ed7-4cf1-abcd-f42f54dfe912.png)

1. What is the probability of getting a 5?
2. What is the probability of getting a 5 given the die showed an odd number?

![image.png](attachment:fc7ccdb8-6e63-4162-878c-8fb64f6a4df8.png)

![image.png](attachment:24c5d5ae-4188-455e-94c0-0c8616611d81.png)

![image.png](attachment:3ae41fc3-bfc0-47c4-9769-e013b895e995.png)

![image.png](attachment:bfe3f321-c5dd-49c7-a6c9-d86dfdb81b5d.png)

Now let's get more practice with solving conditional probability problems. On the next screen, we develop a formula for calculating conditional probabilities.

A student is randomly selected from a class. All we know is that he was born during winter. Assume the winter months are December, January, and February and ignore the fact that these three months have different numbers of days. Find:

1. The probability that he was born in December. Assign your answer to p_december.
2. The probability that he was born during summer. Assign your answer to p_summer.
3. The probability that he was born in a month which ends in letter "r" — "September", for instance, ends in "r", while "April" doesn't. Assign your answer to p_ends_r.

In [3]:
p_december = 1/3
p_summer = 0
p_ends_r = 1/3

## Conditional Probability Formula

![image.png](attachment:40f3938a-fb1a-4c74-b6ca-a85470f54e46.png)

Say we roll a fair six-sided die and want to find the probability of getting an odd number, given the die showed a number greater than 1 after landing. Using probability notation, we want to find P(A|B) where:

- A is the event that the number is odd: A = {1, 3, 5}
- B is the event that the number is greater than 1: B = {2, 3, 4, 5, 6}

To find P(A|B), we need to use the following formula:

![image.png](attachment:e81b7548-a114-45ce-84ea-e939b5678589.png)

We know for sure event B happened (the number is greater than 1), so the sample space is reduced from {1, 2, 3, 4, 5, 6} to {2, 3, 4, 5, 6}:

![image.png](attachment:30288a27-a2ee-451b-aa15-5f0161c7b8e3.png)

This means we're left with only five total possible outcomes if B happens:

![image.png](attachment:ed969a60-a015-4d89-8790-a785ad6838f3.png)

![image.png](attachment:c7fd7784-37a0-4c10-b18b-b10204520b0b.png)

![image.png](attachment:345eb15a-cd43-4da2-bc30-549b815d2dd9.png)

In set notation, cardinal(Ω) is abbreviated as card(Ω), so we have:

![image.png](attachment:7516d596-292e-4eb6-80a7-5fb553a83e76.png)

![image.png](attachment:3ed156b5-dff0-4157-8e20-f62e96a0caea.png)

![image.png](attachment:010fc492-a28a-40f1-b15c-78724b981acb.png)

Recall we're interested in finding the probability of getting an odd number, given the number the die showed is greater than 1. There are three odd numbers on a regular six-sided die (1, 3, and 5), but we know for sure we got a number greater than 1, so the only possible odd numbers we can get are 3 and 5. This means that the number of possible successful outcomes is two:

![image.png](attachment:b919c6e0-8df8-48ed-aa23-bae9d19dbdab.png)

The only possible odd numbers we can get are 3 and 5, so the number of possible successful outcomes is also given by the cardinal of the set {3, 5}:

![image.png](attachment:69dedcd6-372f-4500-a6f0-77d9b7d2602e.png)

Note that the set {3, 5} is the result of the intersection between set A and set B:

![image.png](attachment:473e0cd0-b7d6-44fb-a2ff-e2e4a34e7624.png)

![image.png](attachment:c501cd43-def5-45e4-ba59-7e44d6436885.png)

So the number of successful outcomes is given by the cardinal of the intersection between set A and B:

![image.png](attachment:95ba0634-f69b-4cc3-839d-c790ba7b0b45.png)

We now have a formula for conditional probability, defined purely in terms of A and B, where A and B can be any events (not just events related to a die roll):

![image.png](attachment:9362daec-6881-44d5-95ce-f3b1bdcb784c.png)
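The formula translates almost directly into code when events are represented as Python sets. The function name `conditional_probability` is hypothetical, and the sketch assumes all outcomes are equally likely.

```python
from fractions import Fraction

def conditional_probability(a, b):
    """Return P(A|B) = card(A ∩ B) / card(B) for equally likely outcomes."""
    if not b:
        raise ValueError("B must contain at least one outcome")
    return Fraction(len(a & b), len(b))

a = {1, 3, 5}        # the number is odd
b = {2, 3, 4, 5, 6}  # the number is greater than 1

print(conditional_probability(a, b))  # 2/5
```

The guard against an empty B reflects the fact that P(A|B) is undefined when B cannot happen.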

We'll practice this formula in the following exercise. On the next screen, we'll look at a more realistic example and see how the formula can be used to measure the efficiency of a newly developed COVID-19 test.

Two fair six-sided dice are simultaneously rolled, and the two numbers they show are added together. The diagram below shows all the possible results that we can get from adding the two numbers together.

![image.png](attachment:44c1bdcc-90e9-414b-b56a-8e4e655fd6b4.png)

Find P(A|B), where A is the event where the sum is an even number, and B is the event that the sum is less than eight.

1. Find card(B). Assign your answer to card_b.
    - Note that you'll have to treat identical sums differently if they come from different die numbers. On the diagram above, we see that we have three sums of 4, but they all come from different die outcomes: (3, 1), (2, 2), and (1, 3), where the first number describes the outcome of the first die throw, and the second number the outcome of the second die throw.
2. Find card(A ∩ B). Assign your answer to card_a_and_b.
3. Calculate P(A|B). Assign your answer to p_a_given_b.

In [5]:
card_b = 21
card_a_and_b = 9
p_a_given_b = 9/21
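One way to double-check these counts is to enumerate all 36 ordered outcomes instead of reading them off the diagram. This is only a verification sketch; the set names are our own.

```python
from fractions import Fraction
from itertools import product

# Every ordered pair (first die, second die): 36 equally likely outcomes
omega = set(product(range(1, 7), repeat=2))

a = {roll for roll in omega if sum(roll) % 2 == 0}  # sum is even
b = {roll for roll in omega if sum(roll) < 8}       # sum is less than eight

card_b = len(b)
card_a_and_b = len(a & b)
p_a_given_b = Fraction(card_a_and_b, card_b)

print(card_b, card_a_and_b, p_a_given_b)  # 21 9 3/7
```

`Fraction` reduces 9/21 to 3/7, which is the same value as the answer above.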

## Example Walkthrough

On the previous screen, we developed a formula for conditional probability:

![image.png](attachment:84691f81-fef9-41b6-a7c2-bb79eec88a59.png)

We'll now use the formula in the context of a more realistic example. A team of biologists wants to measure the efficiency of a new COVID-19 test they developed. COVID-19 is a respiratory illness caused by the SARS-CoV-2 virus, and it has had a significant global impact. The team used the new method to test 53 people, and the results are summarized in the table below:

![image.png](attachment:cc7fbd12-f11a-435d-a49f-786558ddf431.png)

By reading the table above, we can see that:

- 23 people are infected with COVID-19.
- 30 people are not infected with COVID-19 (COVID-19^C means not infected with COVID-19; recall from the previous course that the superscript "C" indicates a set complement).
- 45 people tested positive for COVID-19.
- 8 people tested negative for COVID-19.
- Out of the 23 infected people, 21 tested positive (correct diagnosis).
- Out of the 30 not-infected people, 24 tested positive (wrong diagnosis).

The team now intends to use these results to calculate probabilities for new patients and figure out whether the test is reliable enough to use in hospitals. They want to know:

- What is the probability of testing positive, given that a patient is infected with COVID-19?
- What is the probability of testing negative, given that a patient is not infected with COVID-19?

![image.png](attachment:2d935b2a-cb18-45e3-a398-2186d3cc1f44.png)

![image.png](attachment:ef4ee9a5-ee36-4ccb-8f5d-3d92ff0763e9.png)

![image.png](attachment:0569e458-8947-45c2-bb79-6c905035ff5d.png)


![image.png](attachment:c6f91d1b-122a-4e77-b960-311d6e967c87.png)

The probability of testing positive, given that the patient is infected with COVID-19, is therefore 91.30%. This may suggest that the new test is fairly good at detecting the virus when the virus is actually present. However, at a probability of 91.30%, we can expect that for every 10,000 patients infected with COVID-19, about 9,130 patients will get a correct diagnosis, while the other 870 will not. The team should probably conclude that the test needs more refinement with respect to detecting the virus.
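The figures in the paragraph above can be reproduced from the counts listed earlier (23 infected people, 21 of whom tested positive). The variable names below are ours and serve only as a sketch of the calculation.

```python
from fractions import Fraction

# Counts read from the table above
infected = 23
infected_and_positive = 21

# P(T+ | COVID-19) = card(T+ ∩ COVID-19) / card(COVID-19)
p_positive_given_covid = Fraction(infected_and_positive, infected)
print(f"{float(p_positive_given_covid):.2%}")  # 91.30%

# Expected number of missed cases among 10,000 infected patients
missed = round(10_000 * (1 - p_positive_given_covid))
print(missed)  # 870
```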

![image.png](attachment:05f39ef4-99cf-4717-a9dd-4b1ea4e3ea45.png)

Use the data in the table below (it's the same table we used above) and:

![image.png](attachment:69d992b9-fc5c-4508-8b33-5e6248c4c9dc.png)

1. Calculate P(T- | COVID-19^C). Assign your answer to p_negative_given_non_covid.
2. Print p_negative_given_non_covid.
3. Interpret the result: does the value of P(T- | COVID-19^C) suggest that the test needs more work? Or does it look like the test is reliable enough to use in hospitals? Write your thoughts using a multi-line string. This part of the exercise is not answer-checked, but we suggest a solution nonetheless.

In [6]:
p_negative_given_non_covid = 6 / 30

print(p_negative_given_non_covid)

0.2


The probability of testing negative given that a patient is not
infected with COVID-19 is 20%. This means that for every 10,000 healthy
patients, only about 2,000 will get a correct diagnosis, while the
other 8,000 will not. It looks like the test is almost completely
ineffective for healthy patients, and it could be dangerous to use it in hospitals.

## Probability Formula

On the last screen, we put our conditional probability formula to use and answered a few questions about the effectiveness of a COVID-19 test using the data in the table below:

![image.png](attachment:48803dc6-02dd-4bb8-bfe5-231014df4be0.png)

![image.png](attachment:4b5e8522-5b72-4b85-a3b4-0af5475c26d9.png)

![image.png](attachment:e5c6bbb4-f554-490f-b973-60097aaa48f8.png)

This allows us to define a formula for conditional probability purely in terms of probabilities instead of set cardinals. Thus, for any two events A and B, P(A|B) is:

![image.png](attachment:d6fd25f2-6dd0-4b3d-98fb-343482e8dfe6.png)

![image.png](attachment:49693566-875a-4929-9f3d-d8c4735eeec3.png)

![image.png](attachment:4991dc3a-19fd-4644-bc51-537d6f647e36.png)

To understand the mathematical reason for why the above formula works, let's start by considering the following mathematical statements:

![image.png](attachment:e3748093-237e-4627-9420-7cad5bfcde3c.png)

![image.png](attachment:2d27b5b3-36e4-4557-a3c9-8bd4ccab41e7.png)
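A quick numerical check, using the die events from the earlier screen, shows that the formula based on cardinals and the formula based on probabilities agree. This is a sketch under the assumption of equally likely outcomes; `p` is a hypothetical helper.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}
a = {1, 3, 5}        # the number is odd
b = {2, 3, 4, 5, 6}  # the number is greater than 1

def p(event):
    """P(event) on the fair-die sample space."""
    return Fraction(len(event & omega), len(omega))

# Formula based on cardinals: card(A ∩ B) / card(B)
from_cardinals = Fraction(len(a & b), len(b))

# Formula based on probabilities: P(A ∩ B) / P(B)
from_probabilities = p(a & b) / p(b)

print(from_cardinals, from_probabilities)  # 2/5 2/5
```

Both versions divide by card(Ω) in the numerator and the denominator, so the factor cancels and the two results coincide.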

Now let's get more practice with our new formula.

A company offering a browser-based task manager tool intends to do some targeted advertising based on people's browsers. The data they collected about their users is described in the table below:

![image.png](attachment:9c7c1081-69f2-4575-8ddf-e7d3efb05cdc.png)

1. P(Premium | Chrome) — the probability that a randomly chosen user has a premium subscription, provided their browser is Chrome. Assign your answer to p_premium_given_chrome.
2. P(Basic | Safari) — the probability that a randomly chosen user has a basic subscription, provided their browser is Safari. Assign your answer to p_basic_given_safari.
3. P(Free | Firefox) — the probability that a randomly chosen user has a free subscription, provided their browser is Firefox. Assign your answer to p_free_given_firefox.
4. Between a Chrome user and a Safari user, who is more likely to have a premium subscription? If you think a Chrome user is the answer, then assign the string 'Chrome' to a variable named more_likely_premium, otherwise assign 'Safari'. To solve this exercise, you'll also need to calculate P(Premium | Safari).

In [8]:
p_premium_given_chrome = 158 / 2762
p_basic_given_safari = 274 / 1288
p_free_given_firefox = 2103 / 2285
more_likely_premium = 'Safari'

In this lesson, we learned the fundamentals of conditional probability and managed to derive two important formulas:

![image.png](attachment:b7931258-521c-47b7-bd7b-d072a4b9e730.png)

In the next lesson, we continue our discussion about conditional probability, and learn about:

![image.png](attachment:c8b465ba-5869-4898-8777-c382e27da624.png)

## Conditional Probability: Intermediate

## An Important Difference

In the last lesson, we started learning about conditional probability and managed to derive two important formulas:

![image.png](attachment:3df9f413-4280-4fa3-8368-fffc1bc36842.png)

One intuitive way to understand P(A|B) is "if B occurs, then what's the probability that A occurs?" This suggests that both events A and B occur. However, since P(A ∩ B) is the probability that both A and B occur, then what's the difference between P(A|B) and P(A ∩ B), if any?

Let's take rolling a fair six-sided die, and try to find P(A|B) and P(A ∩ B), where:

- A is the event that the number is odd: A = {1, 3, 5}
- B is the event that the number is greater than 1: B = {2, 3, 4, 5, 6}

Using formula (1) above, we see that P(A|B), the probability of getting an odd number given that we got a number greater than 1, is:

![image.png](attachment:d86a4f91-4717-4d4f-97c8-09d4a5a11665.png)

Finding P(A ∩ B), the probability that both A and B occur, means finding the probability that we get a number that is both odd and greater than 1 (either a 3 or a 5), which is:

![image.png](attachment:bbf9c3b6-cd9a-444f-a7f1-23495835b1db.png)

With P(A ∩ B), we're trying to find the probability of two events (A and B), while with P(A|B) we're only trying to find the probability of a single event, which is A.

![image.png](attachment:dd51cd31-7088-4be8-a2d4-62fc1c098e61.png)

Below, we'll look at an exercise to help us understand this distinction better. Before that, however, let's do a quick and important summary:

- P(A) means finding the probability of A
- P(A|B) means finding the conditional probability of A (given that B occurs)
- P(A ∩ B) means finding the probability that both A and B occur
- P(A ∪ B) means finding the probability that A occurs or B occurs (this doesn't exclude the situation where both A and B occur)
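To make the four notations concrete, here is a small sketch that computes each of them for the die example (A is "odd", B is "greater than 1"). The helper `p` is our own and assumes equally likely outcomes.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}
a = {1, 3, 5}        # the number is odd
b = {2, 3, 4, 5, 6}  # the number is greater than 1

def p(event):
    return Fraction(len(event), len(omega))

p_a = p(a)                     # P(A)
p_a_given_b = p(a & b) / p(b)  # P(A|B)
p_a_and_b = p(a & b)           # P(A ∩ B)
p_a_or_b = p(a | b)            # P(A ∪ B)

print(p_a, p_a_given_b, p_a_and_b, p_a_or_b)  # 1/2 2/5 1/3 1
```

Note that P(A|B) and P(A ∩ B) share the same numerator, card(A ∩ B), but are divided by different totals: card(B) versus card(Ω).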

The analytics team of a store randomly sampled 2,000 customers and looked at customer behavior with respect to buying laptops and wireless mice. The results are summarized in the table below, where:

- "L" means the customer bought a laptop
- "M" means the customer bought a mouse
- "L^C" means the customer didn't buy a laptop
- "M^C" means the customer didn't buy a mouse

![image.png](attachment:f1203bed-c31e-49a1-b66d-d8c11768a7af.png)

Find:

1. P(M), the probability that a customer buys a mouse — assign your answer to p_m.
2. P(M|L), the probability that a customer buys a mouse given that they bought a laptop — assign your answer to p_m_given_l.
3. P(M ∩ L), the probability that a customer buys both a mouse and a laptop — assign your answer to p_m_and_l.
4. P(M ∪ L), the probability that a customer buys a mouse or a laptop — assign your answer to p_m_or_l. Check the hint if you don't remember how to calculate this.

In [10]:
58 + 515

573

In [11]:
p_m = 515 / 2000
p_m_given_l = 32 / 90
p_m_and_l = 32 / 2000
p_m_or_l = 573 / 2000

## Complements


![image.png](attachment:3d087e1a-0d72-45e1-9d69-8355bd47a271.png)

![image.png](attachment:18cfd95d-3ab5-42ee-8dba-fb906dddda50.png)

![image.png](attachment:6f19a3d2-ee46-48e8-923e-0e8cacc0e955.png)

![image.png](attachment:872a75e4-1f1f-42de-9863-ce7f28d7448f.png)

![image.png](attachment:3632dd47-a673-479d-90fa-78d7ba961218.png)

In more general terms, for any two events A and B, we have:

![image.png](attachment:d52ed6af-d32c-4828-96d8-f254ccea2f70.png)

Using some algebra, we can swap terms in the equation above and deduce two important relations that are true for any event A and B:

![image.png](attachment:11f043aa-cef6-47a6-9994-faa838bcdc47.png)

![image.png](attachment:5090481b-fab7-42a9-852b-6126736f6245.png)

![image.png](attachment:2c24634d-4f81-4048-82fb-5586a618fd45.png)
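The complement relation P(A^C|B) = 1 − P(A|B) can be checked numerically. The sketch below reuses the die events from earlier (our choice of example) and also shows that complementing the condition instead of the event gives no such guarantee.

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}
a = {1, 3, 5}        # the number is odd
b = {2, 3, 4, 5, 6}  # the number is greater than 1

def cond(event, given):
    """P(event | given) for equally likely outcomes."""
    return Fraction(len(event & given), len(given))

a_c = omega - a  # complement of A
b_c = omega - b  # complement of B

# Complementing the event (left of the bar): the two probabilities sum to 1
print(cond(a, b) + cond(a_c, b))  # 1

# Complementing the condition (right of the bar): no such guarantee
print(cond(a, b) + cond(a, b_c))  # 7/5
```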

Let's now do a few exercises and continue the discussion on the next screen.

For our electronics store example, say new data is collected, and we know that:

- P(B|M) = 0.1486, the probability that a customer buys batteries given that they bought a mouse is 0.1486.
- P(C|L) = 0.0928, the probability that a customer buys a cooler given that they bought a laptop is 0.0928.
- P(B^C|C) = 0.7622, the probability that a customer doesn't buy batteries given that they bought a cooler is 0.7622.

Using the two rules we learned above, find:

![image.png](attachment:a802f935-043a-4cf8-9888-fa067ad9c1f9.png)

In [12]:
p_non_b_given_m = 1 - 0.1486
p_non_c_given_l = 1 - 0.0928
p_b_given_c = 1 - 0.7622
p_b_given_non_m = 'not possible'

## Order of Conditioning

On the first two screens, we built on what we learned in the previous lesson and tried to get a better understanding of the ins and outs of conditional probability. Next, we take a closer look at the difference between P(A|B) and P(B|A).

Let's consider again the table below:

![image.png](attachment:5f657208-6cde-49da-87d8-0d0db63e75c1.png)

We can calculate P(M|L) and P(L|M) using the data in the table, and we see they are not the same:

![image.png](attachment:192edcb7-81c3-493b-a9cc-4f4511a413f4.png)

Note that, in principle, there could be cases where P(M|L) is equal to P(L|M). For instance, let's say we have this data instead of the table above:

![image.png](attachment:83ef5818-5df5-42b5-9606-44c9ba7b8c4b.png)


Using the data above, we see that P(M|L) and P(L|M) are now equal:

![image.png](attachment:d80508a2-884a-435b-ab11-281c1b9ade7b.png)

The take-home message is that it matters how we condition — P(A|B) does not necessarily have the same value as P(B|A) (although in some rare circumstances, they may end up being equal).
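The asymmetry is easy to see in code. The counts below (90 laptop buyers, 515 mouse buyers, 32 customers who bought both) are the ones used in the exercise solution earlier in this lesson; the variable names are our own.

```python
from fractions import Fraction

# Counts from the first laptop/mouse table (2,000 sampled customers)
bought_laptop = 90
bought_mouse = 515
bought_both = 32

p_m_given_l = Fraction(bought_both, bought_laptop)  # P(M|L)
p_l_given_m = Fraction(bought_both, bought_mouse)   # P(L|M)

print(float(p_m_given_l))
print(float(p_l_given_m))
print(p_m_given_l == p_l_given_m)  # False
```

Both probabilities share the numerator card(M ∩ L), so they can only be equal when card(L) equals card(M) or when the intersection is empty.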

Let's continue with a few exercises and resume the discussion on the next screen.

For the following exercises, use the data in the table below.

![image.png](attachment:d38ca0b6-eaee-42ea-9c0a-6c540eca746e.png)


![image.png](attachment:d68fe366-35b4-4848-a0ff-e67c4264ead7.png)

In [13]:
p_m_given_non_l = 483/1910
p_non_l_given_m = 483 / 515
p_m_and_non_l = 483/2000
p_non_l_and_m = 483/2000

## The Multiplication Rule

In the previous exercise, we calculated P(M ∩ L^C) by using data from the table below:



![image.png](attachment:fb41314d-475d-4c7c-8b26-75c68b0dbfbe.png)

Note, however, that we can't always calculate probabilities for events of the form P(A ∩ B) just by looking at some data in a table.

Suppose we have a bowl with six green marbles and four red marbles. If we're drawing one marble at a time randomly and without replacement (without replacement means we don't put the marbles drawn back in the bowl), then what's the probability of getting a red marble on the first draw, followed by a green marble on the second draw?

In probability notation, we want to find P(A ∩ B), where:

- A is the event that we get a red marble on the first draw
- B is the event that we get a green marble on the second draw

In this case, we don't have a table anymore that we can use to calculate P(A ∩ B). However, we can find a solution by using the conditional probability formula to develop a separate formula for P(A ∩ B). Using a little algebra, we have:

![image.png](attachment:aca35a38-b098-4b0a-9ee2-a343a93d2146.png)

Above, we used P(A|B) to develop our formula, but note that we can also use P(B|A):




![image.png](attachment:33e80198-1223-4146-9be0-49bb8df9164b.png)

![image.png](attachment:639b951b-f5cf-45d9-b71a-07b58ada2f9a.png)

Either of the two formulas above is called the multiplication rule of probability — or, in short, the multiplication rule.

We can use the multiplication rule to calculate P(A ∩ B) in our example with the marbles, where A is the event that we get a red marble on the first draw, and B is the event that we get a green marble on the second draw.

Out of the ten marbles in the bowl, four marbles are red, so we have:

![image.png](attachment:5af68141-0aa7-41e0-ba29-7ac34b23ebf1.png)

We're sampling without replacement (we don't put back the marbles once we draw them), which means that for the second draw, we have nine marbles left. Given that the first marble is red, we have six green marbles left in the bowl, so the probability of getting a green marble on the second draw (B) given that we got a red marble on the first draw (A) is:

![image.png](attachment:69b874ec-e173-46af-acb0-61b8c4850258.png)

Using the multiplication rule, we see that P(A ∩ B), the probability of drawing a red marble followed by a green marble, is:

![image.png](attachment:6349c5f7-3dd4-403e-b054-c637e3b4d0fc.png)
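The marble calculation can be sketched in Python and cross-checked by brute force. The second half of the snippet is our own verification step, not part of the lesson: it lists every ordered pair of distinct marbles and counts the red-then-green draws.

```python
from fractions import Fraction
from itertools import permutations

# Multiplication rule: P(A ∩ B) = P(A) * P(B|A)
p_a = Fraction(4, 10)         # red on the first draw
p_b_given_a = Fraction(6, 9)  # green on the second draw, given red first
p_a_and_b = p_a * p_b_given_a
print(p_a_and_b)  # 4/15

# Cross-check by listing every ordered pair of distinct marbles
bowl = ["green"] * 6 + ["red"] * 4
draws = list(permutations(bowl, 2))  # 10 * 9 = 90 equally likely ordered draws
red_then_green = [d for d in draws if d == ("red", "green")]
print(Fraction(len(red_then_green), len(draws)))  # 4/15
```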

Let's now get some practice using the multiplication rule.

For the exercises below, we know:

- The probability that a customer buys RAM from an electronics store is P(RAM) = 0.0822.
- The probability that a customer buys a gaming laptop is P(GL) = 0.0184.
- The probability that a customer buys RAM given that they bought a gaming laptop is P(RAM | GL) = 0.0022.

![image.png](attachment:0cdd1489-ed92-4680-9ac8-22458969d387.png)

In [15]:
p_gl_and_ram = 0.0184 * 0.0022
p_non_ram_given_gl = 1 - 0.0022
p_gl_and_non_ram = p_non_ram_given_gl * 0.0184
p_gl_or_ram = 0.0822 + 0.0184 - p_gl_and_ram

## Statistical Independence

On the last screen, we learned about the multiplication rule and saw that:

![image.png](attachment:3552e3f4-8643-4a11-afe8-71d68baa4cbf.png)

However, you might remember from the previous course that we introduced the multiplication rule in a slightly different way (notice there's no conditional probability involved in the formula below):

![image.png](attachment:420a85a9-bf05-4899-bca5-760ffa35dded.png)

To clarify the difference, let's consider an example where we roll a fair six-sided die twice and want to find P(A ∩ B), where:

- Event A is getting a 5 on the first roll
- Event B is getting a 6 on the second roll

Let's start by using formula (1). The probability of getting a 5 on the first roll (event A) is:

![image.png](attachment:65dd9aef-29c3-45d0-8204-06fde30f1fc8.png)

P(B|A), the probability of getting a 6 on the second roll (event B) given that we got a 5 on the first roll (event A), is:

![image.png](attachment:a7f2e6f5-f677-4cfa-bb75-e1cae10aca54.png)

Using formula (1), we have:

![image.png](attachment:5b5b5f25-0200-438f-94bf-a26c7e6085b6.png)

Note, however, that P(B), the probability of getting a 6 on the second roll (event B), is the same as P(B|A):

![image.png](attachment:fa55e7f1-a163-4a4e-8a2c-5245dc996cd2.png)

P(B) is equal to P(B|A), because getting a 5 on the first roll (event A) doesn't influence in any way the probability of getting a 6 on the second roll (event B). In other words, if event A occurs, the probability of B remains unchanged.

![image.png](attachment:e6c3b67c-d795-43dc-aad7-fc7ebd27fb03.png)

P(A|B) is the probability that we get a 5 on the first roll (event A) given that we got a 6 on the second roll (event B) — if you have trouble understanding this, imagine someone rolled the die twice, and all we know is that they got a six on the second roll (event B), and now we want to use this information to find P(A|B). So, P(A|B) is:

![image.png](attachment:040113fc-0232-4a9e-9a82-9ac3f41e72de.png)

P(A) is equal to P(A|B) because getting a 6 on the second roll doesn't influence in any way the probability of getting a 5 on the first roll (event A). In other words, if event B occurs, the probability of A remains unchanged.

![image.png](attachment:321ab5b1-6c31-4c9e-ba96-155c6f12fcf4.png)

In more general terms, if the occurrence of event A leaves the probability of B unchanged, and vice versa (A and B can be any events for any random experiment), then events A and B are said to be statistically independent (although the shorter term "independent" is used more often).

In mathematical terms, if events A and B are independent, it means that:

![image.png](attachment:df93b25b-8680-4adb-afdd-c782a8b9947c.png)

We use the last formula above to calculate P(A ∩ B) when A and B are independent. We can also use the two formulas we saw in the beginning (formulas (1) and (2)), but the formula above has the advantage of being simpler and not requiring us to calculate conditional probabilities.
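As a sanity check, the independence of A and B can be verified by enumerating the 36 outcomes of two rolls. This is a sketch with our own helper names, assuming a fair die.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling a fair die twice
omega = set(product(range(1, 7), repeat=2))

a = {roll for roll in omega if roll[0] == 5}  # 5 on the first roll
b = {roll for roll in omega if roll[1] == 6}  # 6 on the second roll

def p(event):
    return Fraction(len(event), len(omega))

print(p(a & b))                 # 1/36
print(p(a) * p(b))              # 1/36
print(p(a & b) / p(a) == p(b))  # True, i.e. P(B|A) == P(B)
```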

On the next screen, however, we'll see the formula fails if A and B are not independent. Until we resume our discussion, let's get some practice with what we've learned.

A fair six-sided die is rolled twice and the following three events are considered:

- Event K — the die showed a 4 on the second roll
- Event L — the die showed a 2 on the first roll
- Event M — the die showed an even number on the second roll

Find whether the following events are independent or not:

1. Events K and L — assign the string 'independent' to a variable named k_and_l if the events are independent, otherwise assign the string 'dependent'.
2. Events L and M — assign the string 'independent' to a variable named l_and_m if the events are independent, otherwise assign the string 'dependent'.
3. Events K and M — assign the string 'independent' to a variable named k_and_m if the events are independent, otherwise assign the string 'dependent'.

In [16]:
k_and_l = 'independent'
l_and_m = 'independent'
k_and_m = 'dependent'

## Statistical Dependence

On the previous screen, we saw events A and B are independent if:

![image.png](attachment:3908ddf7-b75c-402b-8f51-636b22b7dc39.png)

If any of the three relationships above does not hold, then events A and B are said to be statistically dependent (or just "dependent").

If events A and B are dependent, it means the occurrence of event A changes the probability of event B and vice versa. In mathematical terms, this means that:

![image.png](attachment:287dc4db-7c12-46d7-ac2f-06ea6f19d391.png)

In the previous exercise, for instance, we considered rolling a fair six-sided die twice and saw events K and M are not independent, where K and M were:

- Event K: The die showed a 4 on the second roll.
- Event M: The die showed an even number on the second roll.

![image.png](attachment:2310a2e1-e932-4bb1-a368-d2848d6c78ce.png)

![image.png](attachment:74286685-1984-4ba4-be10-568b63495742.png)

![image.png](attachment:1f66b171-b586-4ed4-bc45-bc08c097482e.png)

To prove two events are dependent, it's enough to show that just one of these three relationships doesn't hold:

![image.png](attachment:761e1827-44b4-4cc9-9a82-144855393931.png)

To calculate P(A ∩ B) for dependent events, however, we can use either of the following formulas:

![image.png](attachment:2bf9639b-1020-4cb9-826e-38043bb1d288.png)

Both formulas will lead to the same result. However, depending on the problem we're trying to solve, it may be easier to calculate P(A|B) rather than P(B|A) or vice versa, so we should choose the formula that's easier to work with.
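The dependence of events K and M from the earlier exercise can be confirmed by enumeration. The snippet below is a sketch with hypothetical helper names; it compares P(K) with P(K|M) and P(K ∩ M) with P(K) · P(M).

```python
from fractions import Fraction
from itertools import product

omega = set(product(range(1, 7), repeat=2))

k = {roll for roll in omega if roll[1] == 4}      # 4 on the second roll
m = {roll for roll in omega if roll[1] % 2 == 0}  # even number on the second roll

def p(event):
    return Fraction(len(event), len(omega))

p_k_given_m = p(k & m) / p(m)

print(p(k), p_k_given_m)      # 1/6 1/3
print(p(k & m), p(k) * p(m))  # 1/6 1/12
```

Knowing that the second roll is even doubles the probability of a 4, so the product P(K) · P(M) underestimates P(K ∩ M).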

Now let's look at a few exercises, then move to one of the last screens of this lesson.

Consider the table below:

![image.png](attachment:bc392041-191e-4ca9-8a66-07ac18621568.png)

Find whether the following events are independent or not (check the hint if you don't know how to solve this):

1. Events L and M: assign the string 'independent' to a variable named l_and_m if the events are independent, otherwise assign the string 'dependent'.
2. Events L and M^C: assign the string 'independent' to a variable named l_and_non_m if the events are independent, otherwise assign the string 'dependent'.

Use the formulas we learned to calculate (you could also calculate the probabilities just by looking at the table in this case, but try to use the formulas):

P(L ∩ M): assign your answer to p_l_and_m.

P(L ∩ M^C): assign your answer to p_l_and_non_m.

In [17]:
l_and_m = 'dependent'
l_and_non_m = 'dependent'

p_l_and_m = 90 / 2000 * 32 / 90
p_l_and_non_m = 90 / 2000 * 58 / 90

## Independence for Three Events

For the past few screens, we discussed independence and the multiplication rule. However, you might have noticed that we've only been considering two events at most. What if we have three events, A, B, and C? How do we find whether or not they are independent? And what multiplication rule do we use with three events?

When we discussed the multiplication rule in the previous course, you might recall we explained that the rule can be extended for any number of events, provided they are all independent. If we have events A, B, C,..., X, Y, Z, and they are all independent, then the multiplication rule can extend to:

![image.png](attachment:ad6516b2-6547-48f0-94d4-d97f88465a91.png)

For three events A, B, and C to be independent, two conditions must hold. First, the three events have to be independent of one another, which means the following relationships must be true:

![image.png](attachment:04b2a990-5eb1-4d13-bf0b-04608604708b.png)

Above, events A, B, C are independent in pairs — we say they are pairwise independent.

So pairwise independence is the first condition that has to be respected if three events A, B, C are to be independent. The second condition is that they should be also independent together, which mathematically means:

![image.png](attachment:d35faa15-8598-44d8-a7c7-9dd36bb786cb.png)

If both conditions above hold, events A, B, C are said to be mutually independent. We say that events A, B, and C are mutually independent when the occurrence or non-occurrence of any one of these events does not affect the probability of any of the other events occurring.

To determine whether A, B, and C are mutually independent, we need to check that the joint probability of every pair of events equals the product of the two individual probabilities, and also that P(A ∩ B ∩ C) equals P(A) · P(B) · P(C).

We'll now look at an example where three events satisfy the condition of pairwise independence, and yet they are not mutually independent. Let's say we toss a fair coin twice and consider the following three events, where:

- A is the event that we get heads on both tosses, or heads on the first toss and tails on the second: A = {HH, HT}
- B is the event that we get heads on both tosses, or tails on the first toss and heads on the second: B = {HH, TH}
- C is the event that we get heads on both tosses, or tails on both tosses: C = {HH, TT}

The entire sample space has four possible outcomes:

![image.png](attachment:8606727c-ffd4-4512-ab3f-1b16054332b9.png)

Each of the events A, B, and C has two successful outcomes, so we have:

![image.png](attachment:9b463523-e9eb-4f7d-a960-f42d7369df8a.png)

![image.png](attachment:7eb6477b-0df1-4654-8b28-1dc63714bf48.png)

Multiplying P(A) by P(B) gives us the same result as P(A ∩ B), so A and B are independent:

![image.png](attachment:4206a913-4ccd-46dd-93dd-e4ec6eafe353.png)

By the same reasoning, we see that pairwise independence holds for all three events:

![image.png](attachment:5519c8ec-6749-4966-b561-00a207a16981.png)

![image.png](attachment:4185e3f0-0da3-4617-8666-f2d1c89e286e.png)

![image.png](attachment:1e8d17e8-23c8-4db9-8e4f-bf5eed04382c.png)

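However, the only outcome the three events share is HH, so P(A ∩ B ∩ C) = 1/4, whereas P(A) · P(B) · P(C) = 1/8. We conclude that events A, B, C are not mutually independent, even though they are pairwise independent.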
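The coin toss example can also be checked by enumerating the sample space. The sketch below is one way to do it, assuming all four outcomes are equally likely; the names `omega`, `events`, and `prob` are illustrative and not part of the lesson.

```python
from fractions import Fraction
from itertools import combinations

# Sample space for two tosses of a fair coin (equally likely outcomes)
omega = {'HH', 'HT', 'TH', 'TT'}

events = {
    'A': {'HH', 'HT'},
    'B': {'HH', 'TH'},
    'C': {'HH', 'TT'},
}

def prob(event):
    """Theoretical probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(omega))

# Condition 1: every pair satisfies P(X ∩ Y) = P(X) * P(Y)
pairwise = all(
    prob(events[x] & events[y]) == prob(events[x]) * prob(events[y])
    for x, y in combinations(events, 2)
)

# Condition 2: P(A ∩ B ∩ C) = P(A) * P(B) * P(C)
p_abc = prob(events['A'] & events['B'] & events['C'])
p_product = prob(events['A']) * prob(events['B']) * prob(events['C'])
mutually_independent = pairwise and p_abc == p_product

print(pairwise, p_abc, p_product, mutually_independent)
```

Using `Fraction` keeps the comparison exact, which avoids false mismatches caused by floating-point rounding.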

On the next screen, we'll develop a multiplication rule that we can use for three dependent events. Until then, let's get some practice.

For our electronics store example, say new data is collected, and we know that:

- The probability that a customer buys an electric toothbrush is P(ET) = 0.0432.
- The probability that a customer buys an air conditioning system is P(AC) = 0.0172.
- The probability that a customer buys a PlayStation is P(PS) = 0.0236.

Assuming events ET, AC, and PS are mutually independent, calculate:

1. P(ET ∩ PS): assign your answer to p_et_and_ps.
2. P(ET ∩ AC): assign your answer to p_et_and_ac.
3. P(AC ∩ PS): assign your answer to p_ac_and_ps.
4. P(ET ∩ AC ∩ PS): assign your answer to p_et_and_ac_and_ps.

In [20]:
p_et = 0.0432
p_ac = 0.0172
p_ps = 0.0236

p_et_and_ps = p_et * p_ps
p_et_and_ac = p_et * p_ac
p_ac_and_ps = p_ac * p_ps
p_et_and_ac_and_ps = p_et * p_ac * p_ps

## Formula for Three Dependent Events

On the previous screen, we saw events A, B, C are mutually independent only if they meet two conditions. First, the condition of pairwise independence must hold:

![image.png](attachment:952252c5-272f-4e71-85a9-4c01bae9f6cc.png)

Second, events A, B, and C must be independent together:

![image.png](attachment:1ab46f6c-72c8-43fc-a4ba-cc366d1961d4.png)

If either of these two conditions is not fulfilled, then A, B, C are not mutually independent, and we cannot use the multiplication rule in the above form.

What we really need is to develop a multiplication rule in terms of conditional probability that works correctly for cases where we have three dependent events. Let's start by recalling that:

![image.png](attachment:a4a29ec4-df26-423f-9466-3ad5ed2bebb8.png)

Note that we can think of P(A ∩ B ∩ C) as the probability of two events instead of three:

![image.png](attachment:3099a43d-2708-4af7-8934-a8b1d4dc1861.png)

![image.png](attachment:a4a20ce5-1cc3-42b9-b1d2-fe236ecb9daf.png)

Now we have a final multiplication rule we can use for cases where we have three dependent events:

![image.png](attachment:ad104404-048f-47c1-95a3-5dd181f8a09a.png)

Note that the same kind of reasoning can be used to extend the multiplication rule to four or more events.
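To see the rule P(A ∩ B ∩ C) = P(A) · P(B | A) · P(C | A ∩ B) at work, here is a small sketch on a fair six-sided die. The events `a`, `b`, and `c` and the helpers `prob` and `cond_prob` are assumptions made for illustration; they do not come from the electronics store data.

```python
from fractions import Fraction

# Sample space of a fair six-sided die (equally likely outcomes)
omega = {1, 2, 3, 4, 5, 6}

a = {2, 4, 6}  # the roll is even
b = {4, 5, 6}  # the roll is greater than 3
c = {3, 6}     # the roll is a multiple of 3

def prob(event):
    """Theoretical probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(omega))

def cond_prob(event, given):
    """P(event | given) = P(event ∩ given) / P(given)."""
    return prob(event & given) / prob(given)

# Probability read directly from the sample space
direct = prob(a & b & c)

# Multiplication rule for three dependent events
chain = prob(a) * cond_prob(b, a) * cond_prob(c, a & b)

# Product of the individual probabilities (valid only under mutual independence)
naive = prob(a) * prob(b) * prob(c)

print(direct, chain, naive)
```

The direct count and the conditional-probability product agree, while the plain product of the three probabilities does not, because these events are not mutually independent.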

Let's now get some practice with this new formula.

For our electronics store example, say new data is collected. We know that:

- The probability that a customer doesn't buy a set of laptop stickers is P(LS^C) = 0.9821.
- The probability that a customer buys screen cleaning wipes given that they bought a set of laptop stickers is P(CW | LS) = 0.0079.
- The probability that a customer buys a laptop given that they bought both a set of laptop stickers and screen cleaning wipes is P(L | LS ∩ CW) = 0.2908.

Assume events LS, CW, and L are dependent and calculate P(LS ∩ CW ∩ L). Assign your answer to p_ls_and_cw_and_l.

In [21]:
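p_non_ls = 0.9821
p_cw_given_ls = 0.0079
p_l_given_ls_and_cw = 0.2908

# P(LS) follows from the complement: P(LS) = 1 - P(LS^C)
p_ls = 1 - p_non_ls

# P(LS ∩ CW ∩ L) = P(LS) * P(CW | LS) * P(L | LS ∩ CW)
p_ls_and_cw_and_l = p_ls * p_cw_given_ls * p_l_given_ls_and_cw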

## Next Steps
In this lesson, we continued learning about conditional probability and managed to cover important concepts and ideas:

![image.png](attachment:70df91ad-fd01-4ad2-af4d-176c6628edf1.png)

In the next lesson, we're going to focus on discussing the law of total probability and Bayes' theorem.