
#  Introduction to Multiple Events in Probability

---

####  What are Multiple Events?

* **Multiple events** occur when you consider **more than two events** simultaneously, such as $A, B, C, \ldots$.
* Events can be related by **AND** (intersection), **OR** (union), or complements.
* Probability of multiple events can be calculated using **generalized versions** of addition and multiplication rules.

---

####  Types of Relationships Between Events

* **Mutually exclusive events:** No two can happen at the same time.

  * For example, on a single roll of a die, getting both a 2 and a 5 is impossible.

* **Independent events:** The outcome of one event does not change the probabilities of the others.

* **Dependent events:** The occurrence of one affects the probabilities of others.
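These three relationships can be checked directly on a small sample space. The sketch below assumes a single fair six-sided die. The event names (`two`, `five`, `even`, `at_most_four`) and the helper `prob` are illustrative choices, not part of the definitions above.

```python
from fractions import Fraction

# Sample space for one fair six-sided die (assumed for illustration).
outcomes = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a set of outcomes) when all outcomes are equally likely."""
    return Fraction(len(event & outcomes), len(outcomes))

two = {2}
five = {5}
even = {2, 4, 6}
at_most_four = {1, 2, 3, 4}

# Mutually exclusive: the events share no outcome, so P(A and B) = 0.
mutually_exclusive = prob(two & five) == 0

# Independent: P(A and B) equals P(A) * P(B).
independent = prob(even & at_most_four) == prob(even) * prob(at_most_four)

# Dependent: the product rule fails, e.g. "even" and "is a 2".
dependent = prob(even & two) != prob(even) * prob(two)

print(mutually_exclusive, independent, dependent)  # True True True
```

Using `Fraction` keeps the comparisons exact, which avoids false mismatches from floating-point rounding.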

---

####  Probability of Multiple Events (General Rules)

---

#### a) **Addition Rule for Multiple Events (OR)**

For three events $A, B, C$, the probability of **at least one** occurring (union) is:

$$
\begin{aligned}
P(A \cup B \cup C) =\ & P(A) + P(B) + P(C) \\
& - P(A \cap B) - P(B \cap C) - P(A \cap C) \\
& + P(A \cap B \cap C)
\end{aligned}
$$

* Add probabilities of each event.
* Subtract probabilities of every pairwise intersection (to correct double counting).
* Add back the probability of the triple intersection (it was added three times in the first step and subtracted three times in the second, so it has not yet been counted).
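The add/subtract/add pattern can be verified against a direct count. The following sketch assumes one fair die and three illustrative events. The function name `union_by_inclusion_exclusion` is hypothetical, and the loop generalizes the three-event formula to any number of events.

```python
from fractions import Fraction
from itertools import combinations

outcomes = set(range(1, 7))  # one fair die, assumed for illustration

def prob(event):
    """Probability of a set of equally likely outcomes."""
    return Fraction(len(event), len(outcomes))

def union_by_inclusion_exclusion(events):
    """Add intersections of odd size, subtract intersections of even size."""
    total = Fraction(0)
    for k in range(1, len(events) + 1):
        sign = 1 if k % 2 == 1 else -1
        for group in combinations(events, k):
            total += sign * prob(set.intersection(*group))
    return total

A = {2, 4, 6}  # even
B = {4, 5, 6}  # greater than 3
C = {3, 6}     # divisible by 3

via_formula = union_by_inclusion_exclusion([A, B, C])
direct = prob(A | B | C)
print(via_formula, direct)  # 5/6 5/6
```

Both routes agree because every outcome in the union ends up counted exactly once.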

---

#### b) **Multiplication Rule for Multiple Events (AND)**

* For **independent events**, the probability that all occur is the product:

$$
P(A \cap B \cap C) = P(A) \times P(B) \times P(C)
$$

* For **dependent events**, use conditional probabilities:

$$
P(A \cap B \cap C) = P(A) \times P(B|A) \times P(C|A \cap B)
$$
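As a rough illustration of the two cases, consider drawing three cards from a deck. The setup (a standard 52-card deck with 4 aces) is an assumption made for this sketch and does not come from the text above.

```python
from fractions import Fraction

# Assumed setup: standard 52-card deck with 4 aces, drawing WITHOUT replacement.
p_first_ace = Fraction(4, 52)                   # P(A)
p_second_ace_given_first = Fraction(3, 51)      # P(B | A)
p_third_ace_given_first_two = Fraction(2, 50)   # P(C | A and B)

# Dependent events: chain the conditional probabilities.
p_three_aces = p_first_ace * p_second_ace_given_first * p_third_ace_given_first_two

# WITH replacement the draws are independent, so the plain product applies.
p_three_aces_independent = Fraction(4, 52) ** 3

print(p_three_aces, p_three_aces_independent)  # 1/5525 1/2197
```

Without replacement, each ace drawn makes the next one less likely, so the dependent result is smaller than the independent one.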

---

#### Example of Addition Rule (3 events)

Imagine a survey where:

* $P(A)$ = probability a person likes tea = 0.5
* $P(B)$ = likes coffee = 0.6
* $P(C)$ = likes juice = 0.3
* $P(A \cap B)$ = likes both tea & coffee = 0.2
* $P(B \cap C)$ = coffee & juice = 0.1
* $P(A \cap C)$ = tea & juice = 0.05
* $P(A \cap B \cap C)$ = likes all three = 0.02

Then:

$$
\begin{aligned}
P(A \cup B \cup C) = & 0.5 + 0.6 + 0.3 - 0.2 - 0.1 - 0.05 + 0.02 \\
= & 1.4 - 0.35 + 0.02 = 1.07
\end{aligned}
$$

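A result above 1 is not a valid probability. It signals that the survey figures are mutually inconsistent: with these values, the share of people who like none of the three drinks would be $1 - 1.07 = -0.07$. With consistent data (for example, larger pairwise overlaps), the same formula yields a value between 0 and 1.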
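One way to sanity-check figures like these is to split the Venn diagram into its eight disjoint regions and confirm that none is negative. The sketch below uses the survey numbers from this example. The region labels and the small tolerance `1e-12` are illustrative choices.

```python
# Survey figures from the example above.
P_A, P_B, P_C = 0.5, 0.6, 0.3
P_AB, P_BC, P_AC = 0.2, 0.1, 0.05
P_ABC = 0.02

# Probability of each disjoint Venn region.
regions = {
    "A and B and C": P_ABC,
    "A and B only": P_AB - P_ABC,
    "B and C only": P_BC - P_ABC,
    "A and C only": P_AC - P_ABC,
    "A only": P_A - P_AB - P_AC + P_ABC,
    "B only": P_B - P_AB - P_BC + P_ABC,
    "C only": P_C - P_AC - P_BC + P_ABC,
}
# Whatever is left over belongs to "none of the three".
regions["none"] = 1 - sum(regions.values())

# Any negative region means the inputs cannot all hold at once.
invalid = {name: round(p, 4) for name, p in regions.items() if p < -1e-12}
print(invalid)  # {'none': -0.07}
```

An empty dictionary would indicate that the inputs describe a valid probability assignment.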

---

####  Example of Multiplication Rule (3 independent events)

Suppose the chance it rains on each of three days is independent and each day has probability 0.3:

$$
P(\text{rain all 3 days}) = 0.3 \times 0.3 \times 0.3 = 0.027
$$
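The same rain scenario also connects the multiplication rule to the addition rule. As a small sketch, the probability of rain on at least one of the three days can be computed through the complement and cross-checked with inclusion-exclusion. Independence and the 0.3 daily probability are taken from the example, and the variable names are illustrative.

```python
p_rain = 0.3  # per-day probability from the example above
days = 3

# Complement rule: "at least one rainy day" is the opposite of "no rain on any day".
p_no_rain = (1 - p_rain) ** days
p_at_least_one = 1 - p_no_rain

# Inclusion-exclusion for three independent events with equal probability:
# 3 single events, 3 pairwise intersections, 1 triple intersection.
p_union = 3 * p_rain - 3 * p_rain**2 + p_rain**3

print(round(p_at_least_one, 3), round(p_union, 3))  # 0.657 0.657
```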

---




#### Summary

| Concept                   | Formula (3 events)                                                   | Description                                  |
| ------------------------- | -------------------------------------------------------------------- | -------------------------------------------- |
| Addition Rule (Union)     | $P(A \cup B \cup C)$ with inclusion-exclusion                        | Probability at least one event happens       |
| Multiplication Rule (AND) | $P(A \cap B \cap C) = P(A) \times P(B) \times P(C)$ (if independent) | Probability all events happen simultaneously |




In [2]:
## Python Code Examples

### a) Addition Rule for 3 events


P_A = 0.5
P_B = 0.6
P_C = 0.3
P_AB = 0.2
P_BC = 0.1
P_AC = 0.05
P_ABC = 0.02

P_union = P_A + P_B + P_C - P_AB - P_BC - P_AC + P_ABC
print("P(A or B or C):", P_union)

### b) Multiplication Rule for independent events


P_rain_day = 0.3
P_all_three = P_rain_day ** 3
print("Probability of rain on all three days:", P_all_three)




P(A or B or C): 1.07
Probability of rain on all three days: 0.026999999999999996



# Conditional Probability — Full Explanation

---

#### What is Conditional Probability?

Conditional Probability measures the probability of an event **$A$** occurring **given that another event $B$ has already occurred**.

It answers:

> "If we know $B$ happened, what is the chance that $A$ also happens?"

---

#### Formula:

$$
P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \quad \text{provided } P(B) > 0
$$

* $P(A \mid B)$ = Probability of $A$ given $B$
* $P(A \cap B)$ = Probability both $A$ and $B$ happen
* $P(B)$ = Probability of event $B$

---

#### Intuition

* You **restrict your sample space to event $B$**, since $B$ already happened.
* Now, within $B$, you want to know how often $A$ also occurs.
* Think of it as zooming into the subset $B$.
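The "zooming in" idea can be made concrete by enumerating a sample space and filtering it down to $B$. The experiment below (two fair dice, with $B$ = "first die is even" and $A$ = "sum is 8") is an assumed example chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# Assumed experiment: roll two fair dice; all 36 ordered pairs are equally likely.
sample_space = list(product(range(1, 7), repeat=2))

# B: the first die shows an even number. A: the sum is 8.
B = [roll for roll in sample_space if roll[0] % 2 == 0]
A_within_B = [roll for roll in B if sum(roll) == 8]

# Restricted sample space: count how often A occurs among the outcomes in B.
p_A_given_B = Fraction(len(A_within_B), len(B))

# Same answer from the formula P(A and B) / P(B).
p_A_and_B = Fraction(len(A_within_B), len(sample_space))
p_B = Fraction(len(B), len(sample_space))
p_formula = p_A_and_B / p_B

print(p_A_given_B, p_formula)  # 1/6 1/6
```

Counting inside $B$ and applying the formula give the same result because the common denominator (36) cancels.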

---

####  Real-Life Examples

##### 1️⃣ Disease Testing (Medical Diagnosis)

* $A$: Patient actually has the disease.
* $B$: Test result is positive.

**Conditional Probability:** What is the probability the patient has the disease **given** that the test is positive?

$$
P(\text{Disease} \mid \text{Positive Test}) = \frac{P(\text{Disease and Positive})}{P(\text{Positive})}
$$

This is crucial in **diagnostic accuracy**, and closely related to **Bayes’ theorem**.

---

##### 2️⃣ Spam Filtering (Email Classification)

* $A$: Email is spam.
* $B$: Email contains the word "offer".

**Conditional Probability:** What is the probability an email is spam **given** it contains the word "offer"?

$$
P(\text{Spam} \mid \text{Word "offer"}) = \frac{P(\text{Spam and "offer"})}{P(\text{"offer"})}
$$

Spam filters use this type of conditional probability to detect spam based on features.

---



## 🧠 Why Is Conditional Probability Important?

* Models **dependence** between events.
* Foundation for **Bayesian inference**.
* Used in **machine learning** for classification and prediction.
* Helps calculate **risk** or **likelihood** in medical tests, finance, engineering, and more.



##### Python Code Example

Let's say:

* $P(A \cap B) = 0.04$ (4% of emails are spam and contain "offer")
* $P(B) = 0.1$ (10% of emails contain "offer")



In [6]:
P_A_and_B = 0.04
P_B = 0.1

P_A_given_B = P_A_and_B / P_B
print(f"Probability email is spam given it contains 'offer': {P_A_given_B:.2f}")

Probability email is spam given it contains 'offer': 0.40



# Probability Tree

---

####  What is a Probability Tree?

A **probability tree** is a **diagrammatic method** to represent all possible outcomes of a sequence of events, along with their probabilities.

* Each **branch** represents an event and its probability.
* Branches split off to represent possible outcomes of the next event.
* Helps visualize complex scenarios with multiple stages or dependent events.

---

####  Why Use a Probability Tree?

* Simplifies understanding of **conditional probabilities**.
* Helps calculate **joint probabilities** of sequences of events.
* Visualizes **dependent and independent events**.
* Useful in **Bayes’ theorem** and decision-making problems.

---

####  Basic Structure

1. Start with a **root node** (start of the experiment).
2. Draw branches for each possible outcome of the first event, labeling probabilities.
3. From the end of each branch, draw branches for the next event outcomes, labeling probabilities.
4. Continue until all events are represented.
5. Multiply probabilities along branches to get **joint probabilities**.
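The steps above can be sketched as a short recursive walk over a tree. The nested-dictionary representation and the function name `joint_probabilities` are assumptions made for this illustration, and the branch values describe two fair coin tosses.

```python
# A tree is a nested dict: {outcome: (branch probability, subtree)}.
# An empty dict marks a leaf. Values assume two fair coin tosses.
second_toss = {"H": (0.5, {}), "T": (0.5, {})}
tree = {"H": (0.5, second_toss), "T": (0.5, second_toss)}

def joint_probabilities(tree, path=(), prob=1.0):
    """Walk every root-to-leaf path, multiplying branch probabilities along the way."""
    if not tree:
        return {path: prob}
    result = {}
    for outcome, (p, subtree) in tree.items():
        result.update(joint_probabilities(subtree, path + (outcome,), prob * p))
    return result

paths = joint_probabilities(tree)
for path, p in paths.items():
    print("".join(path), p)
```

For dependent events, each subtree would simply carry its own conditional probabilities instead of sharing `second_toss`.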

---

####  Example: Two Coin Tosses

Event: Toss a fair coin twice.
Possible outcomes: HH, HT, TH, TT.

| Step | Explanation                           |
| ---- | ------------------------------------- |
| 1    | First toss: H (0.5), T (0.5)          |
| 2    | Second toss after H: H (0.5), T (0.5) |
| 3    | Second toss after T: H (0.5), T (0.5) |

---

#### Tree Diagram (Text Representation):

```
Start
├── H (0.5)
│   ├── H (0.5) → HH (0.5 * 0.5 = 0.25)
│   └── T (0.5) → HT (0.5 * 0.5 = 0.25)
└── T (0.5)
    ├── H (0.5) → TH (0.5 * 0.5 = 0.25)
    └── T (0.5) → TT (0.5 * 0.5 = 0.25)
```

---

####  Calculating Probabilities from Tree

* Multiply probabilities along branches.
* Sum probabilities of desired outcomes if needed.

Example: Probability of exactly one Head in two tosses:

* Outcomes: HT, TH
* Probability = 0.25 + 0.25 = 0.5

---

#### More Complex Example: Disease Testing (Dependent Events)

* Event 1: Person has disease (D) or not ($D^c$)
* Event 2: Test positive (T+) or negative (T−)
* Given probabilities:

  * $P(D) = 0.01$, $P(D^c) = 0.99$
  * $P(T+|D) = 0.95$ (true positive)
  * $P(T+|D^c) = 0.05$ (false positive)

---

#### Tree branches and joint probabilities:

```
Start
├── D (0.01)
│   ├── T+ (0.95) → P(D and T+) = 0.0095
│   └── T- (0.05) → P(D and T-) = 0.0005
└── D^c (0.99)
    ├── T+ (0.05) → P(D^c and T+) = 0.0495
    └── T- (0.95) → P(D^c and T-) = 0.9405
```

---

#### Using the tree to calculate:

* Probability of a positive test:

$$
P(T+) = P(D \cap T+) + P(D^c \cap T+) = 0.0095 + 0.0495 = 0.059
$$

* Probability of disease given a positive test (Bayes’ Theorem):

$$
P(D | T+) = \frac{P(D \cap T+)}{P(T+)} = \frac{0.0095}{0.059} \approx 0.161
$$
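The disease-testing tree can be reproduced in a few lines. This sketch uses the probabilities given in the example. The dictionary layout and variable names are illustrative.

```python
# Probabilities given in the disease-testing example above.
p_d = 0.01
p_pos_given_d = 0.95
p_pos_given_not_d = 0.05

# Joint probabilities: multiply along each branch of the tree.
joint = {
    ("D", "T+"): p_d * p_pos_given_d,
    ("D", "T-"): p_d * (1 - p_pos_given_d),
    ("D^c", "T+"): (1 - p_d) * p_pos_given_not_d,
    ("D^c", "T-"): (1 - p_d) * (1 - p_pos_given_not_d),
}

# Sum the two leaves that end in T+, then condition on a positive test.
p_pos = joint[("D", "T+")] + joint[("D^c", "T+")]
p_d_given_pos = joint[("D", "T+")] / p_pos

print(f"P(T+) = {p_pos:.4f}")              # P(T+) = 0.0590
print(f"P(D | T+) = {p_d_given_pos:.3f}")  # P(D | T+) = 0.161
```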


In [9]:
# Two coin tosses: probabilities
prob_H = 0.5
prob_T = 0.5

# Joint probabilities
prob_HH = prob_H * prob_H
prob_HT = prob_H * prob_T
prob_TH = prob_T * prob_H
prob_TT = prob_T * prob_T

print(f"P(HH) = {prob_HH}")
print(f"P(HT) = {prob_HT}")
print(f"P(TH) = {prob_TH}")
print(f"P(TT) = {prob_TT}")

# Probability exactly one head:
prob_one_head = prob_HT + prob_TH
print(f"P(exactly one head) = {prob_one_head}")

P(HH) = 0.25
P(HT) = 0.25
P(TH) = 0.25
P(TT) = 0.25
P(exactly one head) = 0.5




# Bayes’ Theorem

### 1. **Definition:**

Bayes’ Theorem expresses the conditional probability $P(A|B)$ in terms of the reverse conditional $P(B|A)$ and the prior $P(A)$. It links **conditional probabilities** and **prior knowledge**.

Mathematically:

$$
P(A|B) = \frac{P(B|A) \times P(A)}{P(B)}, \quad \text{provided } P(B) > 0
$$

Where:

* $P(A|B)$ = Posterior probability: Probability of event A happening given B is true.
* $P(B|A)$ = Likelihood: Probability of event B happening given A is true.
* $P(A)$ = Prior probability: Initial probability of event A.
* $P(B)$ = Marginal probability: Total probability of event B.

---

### 2. **Intuition:**

* We start with some belief about $A$ (prior $P(A)$).
* We observe event $B$.
* We want to update our belief about $A$ after seeing $B$ (posterior $P(A|B)$).
* Bayes’ Theorem provides the way to update our beliefs based on new evidence.

---

### 3. **Why is $P(B)$ there?**

$P(B)$ is a normalizing constant: dividing by it ensures that the posterior probabilities $P(A|B)$ and $P(\neg A|B)$ sum to 1. It can be calculated using the law of total probability:

$$
P(B) = P(B|A) \times P(A) + P(B|\neg A) \times P(\neg A)
$$

where $\neg A$ is the complement of $A$.
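To see the normalization at work, the sketch below computes the posterior for both $A$ and $\neg A$ and checks that they sum to 1. The input numbers are hypothetical, and the function names `total_probability` and `posterior` are illustrative.

```python
def total_probability(p_a, p_b_given_a, p_b_given_not_a):
    """P(B) from the two-case law of total probability."""
    return p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

def posterior(p_a, p_b_given_a, p_b_given_not_a):
    """P(A | B) via Bayes' theorem, with P(B) as the normalizing constant."""
    p_b = total_probability(p_a, p_b_given_a, p_b_given_not_a)
    return p_b_given_a * p_a / p_b

# Hypothetical inputs, chosen only to show the normalization.
p_a, p_b_given_a, p_b_given_not_a = 0.3, 0.8, 0.2

post_a = posterior(p_a, p_b_given_a, p_b_given_not_a)
# Posterior of the complement: swap the roles of A and not-A.
post_not_a = posterior(1 - p_a, p_b_given_not_a, p_b_given_a)

print(round(post_a, 4), round(post_not_a, 4), round(post_a + post_not_a, 4))
# 0.6316 0.3684 1.0
```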

---

### 4. **Real-life example: Disease Testing**

Suppose:

* $A$: Person has a disease.
* $B$: Person tests positive.

Known:

* $P(A)$ = 1% (Disease prevalence)
* $P(B|A)$ = 99% (Test correctly detects disease)
* $P(B|\neg A)$ = 5% (False positive rate)

We want $P(A|B)$: the probability that a person actually has the disease given that they tested positive.

---

### 5. **Bayes’ Theorem Steps:**

Calculate $P(B)$:

$$
P(B) = P(B|A)P(A) + P(B|\neg A)P(\neg A) = 0.99 \times 0.01 + 0.05 \times 0.99 = 0.0099 + 0.0495 = 0.0594
$$

Calculate posterior:

$$
P(A|B) = \frac{P(B|A) \times P(A)}{P(B)} = \frac{0.99 \times 0.01}{0.0594} \approx 0.1667
$$

Interpretation: Even after a positive test, the probability that the person actually has the disease is only about 16.67%. The disease is rare (1% prevalence), so false positives among the healthy 99% (0.0495) outnumber true positives (0.0099).
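The effect of prevalence can be explored by holding the test characteristics fixed and varying the prior. In this sketch the 99% detection rate and 5% false positive rate come from the example, while the additional prevalence values and the function name `posterior_given_positive` are assumptions for illustration.

```python
def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """P(disease | positive test) via Bayes' theorem."""
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence

# Test characteristics from the example above.
sensitivity = 0.99
false_positive_rate = 0.05

# Prevalence values other than 0.01 are assumed, to show the trend.
results = {}
for prevalence in (0.001, 0.01, 0.1, 0.5):
    results[prevalence] = posterior_given_positive(
        prevalence, sensitivity, false_positive_rate
    )
    print(f"prevalence {prevalence}: P(disease | positive) = {results[prevalence]:.4f}")
```

The posterior rises steeply with prevalence, which is why the same test is far more informative in a high-risk group than in the general population.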

---

# Bayes’ Theorem in Machine Learning — Naive Bayes Classifier

* Naive Bayes classifiers apply Bayes’ theorem with a **strong assumption** that features are **conditionally independent** given the class label.
* Used for classification problems like spam detection, sentiment analysis, text classification.
* It calculates the probability of each class given the features and predicts the class with the highest posterior probability.
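A minimal sketch of this idea for spam detection with several words follows. All priors and per-word likelihoods are hypothetical numbers, not estimates from real data, and the function name `classify` is illustrative. For simplicity it only multiplies likelihoods of words that are present and ignores absent words.

```python
# Hypothetical priors and per-word likelihoods; none of these come from real data.
priors = {"spam": 0.4, "not_spam": 0.6}
likelihoods = {
    "spam": {"discount": 0.7, "offer": 0.6, "meeting": 0.1},
    "not_spam": {"discount": 0.1, "offer": 0.2, "meeting": 0.5},
}

def classify(words):
    """Return (best class, posterior per class) under the naive independence assumption."""
    scores = {}
    for label, prior in priors.items():
        score = prior
        for word in words:
            # Conditional independence: multiply the per-word likelihoods.
            score *= likelihoods[label][word]
        scores[label] = score
    evidence = sum(scores.values())  # normalizing constant
    posteriors = {label: s / evidence for label, s in scores.items()}
    return max(posteriors, key=posteriors.get), posteriors

label, posteriors = classify(["discount", "offer"])
print(label, {k: round(v, 4) for k, v in posteriors.items()})
# spam {'spam': 0.9333, 'not_spam': 0.0667}
```

A practical classifier would estimate these likelihoods from labeled emails and usually work with log-probabilities to avoid underflow when many words are multiplied.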





# Summary:

| Concept       | Meaning                                     |
| ------------- | ------------------------------------------- |
| $P(A \mid B)$ | Posterior: updated probability of A given B |
| $P(B \mid A)$ | Likelihood: probability of B given A        |
| $P(A)$        | Prior: initial belief about A               |
| $P(B)$        | Evidence: total probability of B            |


In [15]:
# Python Code: Bayes Theorem Calculation Example

# Let's implement the disease testing example step by step:

# Given probabilities
P_A = 0.01              # Probability of disease
P_not_A = 1 - P_A       # Probability of no disease

P_B_given_A = 0.99      # Probability test positive given disease
P_B_given_not_A = 0.05  # Probability test positive given no disease (false positive)

# Calculate total probability of test positive (P(B))
P_B = P_B_given_A * P_A + P_B_given_not_A * P_not_A

# Calculate posterior probability P(A|B) using Bayes Theorem
P_A_given_B = (P_B_given_A * P_A) / P_B

print(f"Probability of having the disease given a positive test: {P_A_given_B:.4f}")

Probability of having the disease given a positive test: 0.1667


In [18]:
# Python Example: Naive Bayes with a single feature (from scratch)


# Prior probabilities
P_spam = 0.4
P_not_spam = 0.6

# Likelihoods: Probability of seeing the word "discount" in spam/not spam emails
P_discount_given_spam = 0.7
P_discount_given_not_spam = 0.1

# New email contains the word "discount", find P(spam|discount)
P_discount = P_discount_given_spam * P_spam + P_discount_given_not_spam * P_not_spam

P_spam_given_discount = (P_discount_given_spam * P_spam) / P_discount

print(f"Probability email is spam given it contains 'discount': {P_spam_given_discount:.4f}")


Probability email is spam given it contains 'discount': 0.8235
