###  **Q1. What is Bayes' Theorem?**

**Bayes’ Theorem** is a mathematical rule that allows us to **update the probability of an event** based on new evidence. It connects **prior knowledge** with **new data**, helping us make better predictions and decisions.

It’s widely used in statistics, machine learning, medicine, and more.

---

###  **Q2. What is the formula for Bayes' Theorem?**

The mathematical formula is:

$$
P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}
$$

Where:
- **\(P(A|B)\)**: Posterior probability – the updated probability of event **A** after observing **B**
- **\(P(B|A)\)**: Likelihood – the probability of observing **B** given that **A** is true
- **\(P(A)\)**: Prior probability – our belief about **A** before seeing the evidence
- **\(P(B)\)**: Evidence – the total probability of observing **B**

---

###  **Q3. How is Bayes’ Theorem used in practice?**

Bayes' Theorem is applied in many real-world scenarios:

####  **1. Medical Diagnosis**
- Estimate the probability that a patient has a disease given a positive test result.
- Helps adjust for **false positives** and **rare conditions**.

####  **2. Spam Detection**
- In **Naive Bayes classifiers**, it's used to determine if an email is spam based on the frequency of words like “buy now” or “free.”

####  **3. Machine Learning & AI**
- It underpins Bayesian networks, probabilistic models, and classification tasks (e.g., text classification).

####  **4. Risk Assessment**
- In finance or engineering, it’s used to update risk levels when new information becomes available.

####  **5. Legal Reasoning**
- Used to evaluate the probability of guilt given new evidence in forensic analysis.

---

###  Example:
Imagine you're testing for a rare disease (1% prevalence) with a test that catches 99% of true cases.

Even if you test **positive**, Bayes’ Theorem shows that your actual chance of having the disease can be far lower than 99%, because when the disease is rare, false positives from the large healthy population can outnumber true positives.


###  **Example: Medical Diagnosis**
Let’s say:
- 1% of people have a rare disease → \(P(Disease) = 0.01\)
- A test detects it 99% of the time when it's there → \(P(Positive|Disease) = 0.99\)
- But the test gives false positives 5% of the time → \(P(Positive|No Disease) = 0.05\)

You test positive. What’s the chance you actually have the disease?

Using Bayes' Theorem:

$$
P(Disease|Positive) = \frac{P(Positive|Disease) \cdot P(Disease)}{P(Positive)}
$$

$$
P(Positive) = P(Positive|Disease) \cdot P(Disease) + P(Positive|No Disease) \cdot P(No Disease)
$$

$$
P(Positive) = (0.99 \cdot 0.01) + (0.05 \cdot 0.99) = 0.0099 + 0.0495 = 0.0594
$$

$$
P(Disease|Positive) = \frac{0.0099}{0.0594} \approx 0.1667
$$

So even after a positive test, the chance you actually have the disease is only about **16.7%**, because the 5% false-positive rate applies to the 99% of people who are healthy.
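
The same calculation can be sketched in a few lines of Python. The function name `posterior_given_positive` and its parameter names are illustrative choices, not part of any library; the numbers are the ones used in the example above.

```python
def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """Return P(Disease | Positive) via Bayes' Theorem."""
    # Total probability of a positive test (law of total probability).
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive


p = posterior_given_positive(prior=0.01, sensitivity=0.99, false_positive_rate=0.05)
print(f"P(Disease | Positive) = {p:.4f}")  # prints P(Disease | Positive) = 0.1667
```

Changing `prior` to a larger value (a more common disease) is a quick way to see how strongly the result depends on prevalence.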

### Q4. What is the relationship between Bayes' theorem and conditional probability?
Ans: \

Bayes' Theorem is **built directly on the concept of conditional probability** — in fact, it's a **rearrangement** of the formula for conditional probability.

---

###  **Conditional Probability Formula:**
The probability of event **A** given event **B** is:

$$
P(A|B) = \frac{P(A \cap B)}{P(B)}
$$

This tells us the probability of **A** happening if we know that **B** has happened.

---

###  **Bayes' Theorem Derived from Conditional Probability:**

Applying the definition of conditional probability in both directions gives two expressions for the same joint probability:

$$
P(A \cap B) = P(A|B) \cdot P(B) = P(B|A) \cdot P(A)
$$

Dividing both sides by \(P(B)\) (assuming \(P(B) > 0\)) gives:

$$
P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}
$$

This is **Bayes’ Theorem**: a rearrangement of conditional probability that **flips the condition** from \(P(B|A)\) to \(P(A|B)\), so we can reason from observed evidence back to the event behind it.
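
As a quick sanity check, the definition of conditional probability and Bayes' Theorem should give the same answer on any concrete example. The sketch below uses a fair six-sided die with two events chosen purely for illustration (A: the roll is even, B: the roll is greater than 3), and exact fractions to avoid rounding.

```python
from fractions import Fraction

outcomes = range(1, 7)                      # fair six-sided die
A = {o for o in outcomes if o % 2 == 0}     # event A: roll is even
B = {o for o in outcomes if o > 3}          # event B: roll is greater than 3


def prob(event):
    """Probability of an event under a uniform distribution on the die."""
    return Fraction(len(event), 6)


# Direct definition: P(A|B) = P(A ∩ B) / P(B)
direct = prob(A & B) / prob(B)

# Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_b_given_a = prob(A & B) / prob(A)
via_bayes = p_b_given_a * prob(A) / prob(B)

print(direct, via_bayes)  # prints 2/3 2/3
```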

---

###  **In Simple Terms:**
- **Conditional probability** tells us: "Given B, what's the chance of A?"
- **Bayes' Theorem** tells us: "We know how likely A is, and how likely B is if A is true — so what’s the chance A is true now that B has happened?"

---

###  Real-World Analogy:
Think of a weather app:
- Conditional probability might tell you: "If it's raining, there's a 90% chance clouds were present."
- Bayes' Theorem helps you reverse it: "It’s cloudy now. What’s the chance it’s raining?"

### Q5. How do you choose which type of Naive Bayes classifier to use for any given problem?
Ans: \

There are **three main types** of Naive Bayes classifiers in practice, and the choice depends on the **nature of your features (input variables)**.

---

###  1. **Gaussian Naive Bayes**  
Use when:
- Features are **continuous** (real numbers).
- Data is assumed to follow a **normal (Gaussian) distribution**.

 Example use case:
- Predicting whether a patient has a disease based on continuous variables like age, BMI, and blood pressure.

Assumes that each feature is normally distributed within each class, with its own mean and variance:
$$
x_i \mid y \sim \mathcal{N}(\mu_{i,y}, \sigma_{i,y}^2)
$$
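
To make the Gaussian assumption concrete, here is a minimal sketch of the per-feature likelihood that Gaussian Naive Bayes evaluates. The function name `gaussian_likelihood` is hypothetical, and the mean and variance passed in below are illustrative values rather than estimates from real data.

```python
import math


def gaussian_likelihood(x, mean, variance):
    """Density of N(mean, variance) at x, used as P(x_i | y) in Gaussian NB."""
    coefficient = 1.0 / math.sqrt(2.0 * math.pi * variance)
    exponent = -((x - mean) ** 2) / (2.0 * variance)
    return coefficient * math.exp(exponent)


# Illustrative parameters only: a standard normal evaluated at its mean.
print(round(gaussian_likelihood(x=0.0, mean=0.0, variance=1.0), 4))  # prints 0.3989
```

In a full classifier, one such density would be computed per feature and per class, and the densities multiplied together with the class prior.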

---

###  2. **Multinomial Naive Bayes**  
Use when:
- Features are **discrete counts** (e.g., word frequencies in text).
- Mostly used for **text classification**, like spam filtering or document categorization.

 Example use case:
- Classifying emails as spam/ham based on word counts.

 Assumes:
- Features represent the number of times a word appears in a document.
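
A rough sketch of how word counts enter the multinomial likelihood is shown below. The function name and the per-class word probabilities are hypothetical, made-up values for illustration only; a real classifier would estimate them from training data.

```python
import math


def multinomial_log_likelihood(counts, word_probs):
    """Log of prod(theta_w ** count_w), omitting the constant multinomial coefficient."""
    return sum(n * math.log(word_probs[w]) for w, n in counts.items())


# Hypothetical per-class word probabilities (illustrative values only).
spam_probs = {"free": 0.5, "meeting": 0.1, "buy": 0.4}
ham_probs = {"free": 0.1, "meeting": 0.8, "buy": 0.1}

# Word counts for one email: "free" appears twice, "buy" once.
email = {"free": 2, "buy": 1}

spam_score = multinomial_log_likelihood(email, spam_probs)
ham_score = multinomial_log_likelihood(email, ham_probs)
print(spam_score > ham_score)  # prints True
```

Working with log-probabilities avoids numerical underflow when a document contains many words.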

---

###  3. **Bernoulli Naive Bayes**  
Use when:
- Features are **binary** (0 or 1), indicating the **presence or absence** of something.

 Example use case:
- Sentiment analysis using binary indicators for whether specific words appear in a tweet.

 Assumes:
- Features follow a **Bernoulli distribution** (only 0 or 1 values).
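
The sketch below illustrates the Bernoulli likelihood for a vector of binary features. Note that an absent feature (value 0) also contributes, through the factor 1 - p. The function names and the probabilities used are hypothetical and chosen only for illustration.

```python
def bernoulli_likelihood(x, p):
    """P(x | y) for a binary feature x, where p = P(x = 1 | y)."""
    return p if x == 1 else 1.0 - p


def instance_likelihood(features, probs):
    """Product of per-feature Bernoulli likelihoods (the naive independence assumption)."""
    likelihood = 1.0
    for x, p in zip(features, probs):
        likelihood *= bernoulli_likelihood(x, p)
    return likelihood


# Illustrative values: first word present, second word absent.
print(round(instance_likelihood([1, 0], [0.8, 0.3]), 2))  # prints 0.56
```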

---

###  Summary Table:

| Classifier Type      | Use When Features Are         | Common Use Case                  |
|----------------------|-------------------------------|----------------------------------|
| Gaussian NB          | Continuous (real-valued)      | Medical prediction, sensors      |
| Multinomial NB       | Discrete counts (frequencies) | Text classification, spam        |
| Bernoulli NB         | Binary (0 or 1)               | Presence/absence of features     |

---

###  Bonus Tip:
If you're unsure, look at your dataset:
- Are your features continuous real numbers? Try **Gaussian**.
- Are they counts? Try **Multinomial**.
- Are they 0/1 indicators? Go with **Bernoulli**.
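
The rule of thumb above can be written as a small helper. This is only a heuristic sketch: `suggest_naive_bayes` is a hypothetical function, and it assumes the features arrive as a list of numeric columns.

```python
def suggest_naive_bayes(columns):
    """Suggest a Naive Bayes variant from a list of feature columns.

    Hypothetical helper: `columns` is a list of lists of numeric values,
    one inner list per feature.
    """
    values = [v for column in columns for v in column]
    if all(v in (0, 1) for v in values):
        return "Bernoulli"    # only 0/1 indicators
    if all(float(v).is_integer() and v >= 0 for v in values):
        return "Multinomial"  # non-negative whole-number counts
    return "Gaussian"         # anything else is treated as continuous


print(suggest_naive_bayes([[0, 1, 1], [1, 0, 0]]))          # prints Bernoulli
print(suggest_naive_bayes([[3, 0, 7], [2, 5, 1]]))          # prints Multinomial
print(suggest_naive_bayes([[36.6, 41.2], [120.5, 98.0]]))   # prints Gaussian
```

Datasets that mix feature types do not fit this simple rule and may need separate handling for each group of features.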


### Assignment:
You have a dataset with two features, X1 and X2, and two possible classes, A and B. You want to use Naive Bayes to classify a new instance with features X1 = 3 and X2 = 4. The following table shows the frequency of each feature value for each class:

| Class | X1=1 | X1=2 | X1=3 | X2=1 | X2=2 | X2=3 | X2=4 |
|-------|------|------|------|------|------|------|------|
| A     | 3    | 3    | 4    | 4    | 3    | 3    | 3    |
| B     | 2    | 2    | 1    | 2    | 2    | 2    | 3    |

Assuming equal prior probabilities for each class, which class would Naive Bayes predict the new instance to belong to?
Ans: \
To solve this problem using **Naive Bayes**, we compute the **posterior probability** of each class (A and B) given the new instance with **X1 = 3** and **X2 = 4**. Since we assume **equal prior probabilities**, $P(A) = P(B) = 0.5$.

### Step-by-Step Calculation:

1. **Prior probabilities:**
   Since we assume equal prior probabilities:
   $$
   P(A) = 0.5, \quad P(B) = 0.5
   $$

2. **Likelihoods:**
   Each likelihood is the count for the observed feature value divided by the total count for that feature within the class, taken from the frequency table.

#### Class A:
- $P(X1=3 | A) = \frac{4}{3 + 3 + 4} = \frac{4}{10}$
- $P(X2=4 | A) = \frac{3}{4 + 3 + 3 + 3} = \frac{3}{13}$

#### Class B:
- $P(X1=3 | B) = \frac{1}{2 + 2 + 1} = \frac{1}{5}$
- $P(X2=4 | B) = \frac{3}{2 + 2 + 2 + 3} = \frac{3}{9}$

3. **Posterior probabilities:**

Using Bayes’ Theorem, the **posterior probability** for each class is:

$$
P(A | X1 = 3, X2 = 4) \propto P(X1 = 3 | A) \cdot P(X2 = 4 | A) \cdot P(A)
$$
$$
P(B | X1 = 3, X2 = 4) \propto P(X1 = 3 | B) \cdot P(X2 = 4 | B) \cdot P(B)
$$

For **Class A**:

$$
P(A | X1 = 3, X2 = 4) \propto \frac{4}{10} \cdot \frac{3}{13} \cdot 0.5
$$
$$
P(A | X1 = 3, X2 = 4) \propto \frac{4 \times 3}{10 \times 13} \times 0.5 = \frac{12}{130} \times 0.5 = \frac{6}{130} \approx 0.0462
$$

For **Class B**:

$$
P(B | X1 = 3, X2 = 4) \propto \frac{1}{5} \cdot \frac{3}{9} \cdot 0.5
$$
$$
P(B | X1 = 3, X2 = 4) \propto \frac{1 \times 3}{5 \times 9} \times 0.5 = \frac{3}{45} \times 0.5 = \frac{1}{30} \approx 0.0333
$$

4. **Conclusion:**
Since the unnormalized posterior score for **Class A** (0.0462) is greater than that for **Class B** (0.0333), the **Naive Bayes classifier** predicts that the new instance with **X1 = 3** and **X2 = 4** belongs to **Class A**.

```python
# Frequency table from the assignment, kept separately for each feature
# so that each likelihood is normalized by its own feature's total.
x1_counts = {
    "A": {1: 3, 2: 3, 3: 4},
    "B": {1: 2, 2: 2, 3: 1},
}
x2_counts = {
    "A": {1: 4, 2: 3, 3: 3, 4: 3},
    "B": {1: 2, 2: 2, 3: 2, 4: 3},
}

# Prior probabilities (equal priors)
priors = {"A": 0.5, "B": 0.5}

# New instance to classify
x1, x2 = 3, 4

# Unnormalized posterior score for each class: P(X1|c) * P(X2|c) * P(c)
scores = {}
for c in priors:
    p_x1 = x1_counts[c][x1] / sum(x1_counts[c].values())
    p_x2 = x2_counts[c][x2] / sum(x2_counts[c].values())
    scores[c] = p_x1 * p_x2 * priors[c]

# Output the results
for c, score in scores.items():
    print(f"Unnormalized posterior for Class {c}: {score:.4f}")

# Predict the class with the highest score
print(f"Predicted Class: {max(scores, key=scores.get)}")
```

```
Unnormalized posterior for Class A: 0.0462
Unnormalized posterior for Class B: 0.0333
Predicted Class: A
```
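
The two values from the step-by-step calculation (6/130 and 1.5/45) are proportional to the posteriors, so they do not sum to 1. As an optional extra step, the sketch below normalizes them into proper probabilities using exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

# Unnormalized scores from the hand calculation: likelihoods times the 0.5 prior.
score_a = Fraction(4, 10) * Fraction(3, 13) * Fraction(1, 2)
score_b = Fraction(1, 5) * Fraction(3, 9) * Fraction(1, 2)

# Dividing by the sum of the scores gives proper posterior probabilities.
total = score_a + score_b
posterior_a = score_a / total
posterior_b = score_b / total

print(posterior_a, posterior_b)  # prints 18/31 13/31
print(f"{float(posterior_a):.4f} {float(posterior_b):.4f}")  # prints 0.5806 0.4194
```

Normalizing does not change which class wins, which is why comparing the unnormalized scores is enough for prediction.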
