# Bayes' Theorem Explained

---

## 📚 Definition
Bayes' Theorem describes the probability of an event, based on prior knowledge of conditions that might be related to the event.

It allows us to **update our beliefs** based on new evidence.

---

## 🧮 Formula

$$
P(A|B) = \frac{P(B|A) \times P(A)}{P(B)}
$$

Where:
- $P(A|B)$ = Posterior probability (Probability of A given B)
- $P(B|A)$ = Likelihood (Probability of B given A)
- $P(A)$ = Prior probability of A
- $P(B)$ = Evidence (Overall probability of B)
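
In practice $P(B)$ is often not given directly. For a binary event $A$, it can be expanded with the law of total probability (a standard identity, written here with $\neg A$ for "not A"):

$$
P(B) = P(B|A) \times P(A) + P(B|\neg A) \times P(\neg A)
$$

Substituting this into the formula above gives a form that uses only the prior and the two conditional probabilities:

$$
P(A|B) = \frac{P(B|A) \times P(A)}{P(B|A) \times P(A) + P(B|\neg A) \times P(\neg A)}
$$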

---

## 💡 Example (Medical Test)
- 1% of people have a disease ➔ $P(\text{Disease}) = 0.01$
- The test is 99% sensitive, with a 5% false-positive rate:
  - If you have the disease: 99% chance the test is positive ➔ $P(\text{Positive} | \text{Disease}) = 0.99$
  - If you don't have the disease: 5% chance of a false positive ➔ $P(\text{Positive} | \text{No Disease}) = 0.05$

We want to find:

$$
P(\text{Disease} | \text{Positive Test})
$$

Using Bayes' Theorem!
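
The calculation can be sketched in a few lines of Python. The numbers are the ones from the example; the function name `posterior` and the variable names are illustrative choices, and the evidence term is expanded over the two cases "disease" and "no disease".

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(A|B) for a binary event A using Bayes' Theorem."""
    # Evidence P(B): positives from people with A plus positives from people without A.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


p_disease = 0.01            # P(Disease)
p_pos_given_disease = 0.99  # P(Positive | Disease)
p_pos_given_healthy = 0.05  # P(Positive | No Disease)

p_disease_given_pos = posterior(p_disease, p_pos_given_disease, p_pos_given_healthy)
print(f"P(Disease | Positive Test) = {p_disease_given_pos:.3f}")  # prints 0.167
```

Even with a positive result, the chance of having the disease is only about 17%, because the 5% of false positives come from the much larger healthy group.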

## Types of Bayes' Theorem Applications in Machine Learning

---

## 📚 Overview
Bayes' Theorem is used in various machine learning models, especially where probability and uncertainty need to be modeled.

---

## 📋 Types and When to Use

| Type                         | What it Means                                           | Where/When to Use |
|:-----------------------------|:--------------------------------------------------------|:------------------|
| **Naive Bayes Classifier**    | Assumes features are conditionally independent given the class | Text classification (spam detection, sentiment analysis) |
| **Gaussian Naive Bayes**      | Assumes features follow a **normal distribution**        | Continuous data (e.g., predicting disease from medical data) |
| **Multinomial Naive Bayes**   | Used for **count-based features** (discrete data)         | Document classification, word counts (NLP) |
| **Bernoulli Naive Bayes**     | Features are **binary** (0 or 1)                         | Presence/absence features (e.g., whether a word appears in an email) |
| **Bayesian Networks**         | Graphical model showing **conditional dependencies** between variables | Complex relationships, medical diagnosis, risk prediction |
| **Bayesian Inference**        | Updating models when new data comes in                  | Online learning, time series, continual learning |
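
The three Naive Bayes variants in the table differ mainly in the per-feature likelihood they plug into Bayes' Theorem. A minimal sketch in plain Python may make that concrete. The function names are illustrative and not taken from any library, and the parameters (`mean`, `var`, `p`, `word_probs`) are assumed to have been estimated per class beforehand.

```python
import math


def gaussian_likelihood(x, mean, var):
    """Gaussian NB: density of a continuous feature value x under a normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def bernoulli_likelihood(x, p):
    """Bernoulli NB: probability of a binary feature x (1 = present, 0 = absent)."""
    return p if x == 1 else 1 - p


def multinomial_log_likelihood(counts, word_probs):
    """Multinomial NB: log-likelihood of word counts.

    The multinomial coefficient is omitted because it depends only on the
    counts, not on the class, so it cancels when classes are compared.
    """
    return sum(c * math.log(word_probs[w]) for w, c in counts.items())


print(gaussian_likelihood(0.0, mean=0.0, var=1.0))
print(bernoulli_likelihood(1, p=0.8))
print(multinomial_log_likelihood({"free": 2}, {"free": 0.5}))
```

In each variant the class score is the prior multiplied by these per-feature likelihoods (or, in log space, the sum of their logs).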

---

## 🎯 When to Apply Bayes in ML

✅ **When your data is small**  
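A prior acts as a regularizer, so Bayesian methods can give reasonable estimates even with limited data. Naive Bayes in particular has few parameters to fit.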

✅ **When you want probabilistic outputs**  
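Bayesian models output a probability for each prediction rather than a bare label. Naive Bayes probabilities tend to be overconfident when features are correlated, so treat them as scores unless they have been calibrated.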

✅ **When features are mostly independent**  
Naive Bayes works best when features are roughly independent of each other within each class.

✅ **When you're working with text**  
Spam filters, sentiment analysis, document classification: Naive Bayes is a fast, strong baseline for these NLP tasks.

✅ **When updating models over time**  
Bayesian Inference is useful for streaming data or evolving datasets.
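
One common concrete case of updating over time is a Beta prior on a success rate, where each new batch of yes/no outcomes simply adds to the prior's counts. The sketch below assumes a uniform Beta(1, 1) prior and made-up batches. The names `update_beta` and `beta_mean` and the data are illustrative only.

```python
def update_beta(a, b, successes, failures):
    """Posterior Beta parameters after observing a batch of binary outcomes."""
    return a + successes, b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution: the current estimate of the success rate."""
    return a / (a + b)


a, b = 1, 1  # uniform prior: no initial preference for any rate
batches = [(3, 1), (2, 2), (5, 0)]  # hypothetical (successes, failures) per batch

for successes, failures in batches:
    # Yesterday's posterior becomes today's prior.
    a, b = update_beta(a, b, successes, failures)
    print(f"Beta({a}, {b}) -> estimated rate {beta_mean(a, b):.3f}")
```

The model never needs to revisit old data: the two counts summarize everything seen so far.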

---

## 🛑 Things to Watch Out For

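- **Naive Bayes** assumes features are **conditionally independent given the class**, but in real-world data features often correlate, and correlated features get counted as separate pieces of evidence.
- For complex tasks like image recognition, where neighboring pixels are strongly dependent, Naive Bayes is usually too simplistic.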

---

## 📈 Example Visual Scenario

Imagine classifying emails as spam/not-spam:
- Features = Words in an email ("win", "free", "money", etc.)
- Naive Bayes assumes each word's presence is independent of the others, given whether the email is spam.
- It calculates **P(Spam | Words)** using Bayes' Theorem.

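Classic Bayesian spam filters work this way! 🔥 Modern production filters, Gmail's included, rely on many more signals and more complex models.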
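
The spam scenario can be sketched as a tiny Bernoulli Naive Bayes classifier in plain Python. Everything here is illustrative: the four training emails are made up, the label `"ham"` stands for not-spam, and Laplace (add-one) smoothing is an added assumption so that unseen words never produce a zero probability.

```python
import math
from collections import Counter

# Hypothetical toy training data: (set of words in the email, label).
TRAIN = [
    ({"win", "free", "money"}, "spam"),
    ({"free", "offer"}, "spam"),
    ({"meeting", "tomorrow"}, "ham"),
    ({"project", "money", "report"}, "ham"),
]
VOCAB = sorted(set().union(*(words for words, _ in TRAIN)))


def train(data):
    """Count emails per class and, per class, how many emails contain each word."""
    class_counts = Counter(label for _, label in data)
    word_counts = {label: Counter() for label in class_counts}
    for words, label in data:
        word_counts[label].update(words)
    return class_counts, word_counts


def predict_proba(words, class_counts, word_counts):
    """Return P(class | words) for every class."""
    total = sum(class_counts.values())
    log_scores = {}
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior P(class)
        for w in VOCAB:
            # Laplace-smoothed P(word present | class).
            p = (word_counts[label][w] + 1) / (n + 2)
            # Naive assumption: each word contributes independently.
            score += math.log(p if w in words else 1 - p)
        log_scores[label] = score
    # Normalize in a numerically stable way; this division is the P(Words) term.
    top = max(log_scores.values())
    unnormalized = {k: math.exp(v - top) for k, v in log_scores.items()}
    z = sum(unnormalized.values())
    return {k: v / z for k, v in unnormalized.items()}


class_counts, word_counts = train(TRAIN)
probs = predict_proba({"free", "money"}, class_counts, word_counts)
print(probs)
```

Working in log space avoids numerical underflow when many small probabilities are multiplied together.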

---

> "Bayes' Theorem is not just a formula — it's a way to reason about the world based on evidence." 📚