# **Naïve Bayes: Step-by-Step Example**

Let's create a **toy dataset**, apply **Naïve Bayes step by step**, and use it for **prediction**.

---

## **Step 1: Creating a Toy Dataset**
Imagine we're predicting whether a person **buys a sports car** based on their **age** and **income level**.

| Age  | Income  | Buys Sports Car? |
|------|--------|----------------|
| Young  | High    | Yes            |
| Young  | Medium  | No             |
| Young  | Low     | No             |
| Middle | High    | Yes            |
| Middle | Medium  | No             |
| Middle | Low     | No             |
| Old    | High    | No             |
| Old    | Medium  | No             |
| Old    | Low     | No             |

---

## **Step 2: Calculate Prior Probabilities**
We compute how often each class (`Buys_Sports_Car = Yes/No`) appears in the dataset.

**Prior probabilities**:

- **P(Buys_Sports_Car = No) = 7/9 ≈ 0.78** (7 out of 9 people didn’t buy a sports car)
- **P(Buys_Sports_Car = Yes) = 2/9 ≈ 0.22** (2 out of 9 people bought a sports car)
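
The counting behind these priors can be sketched in a few lines of Python. This is only an illustration: the names `DATA` and `class_priors` are made up for this example, and exact fractions are used so that no rounding creeps in.

```python
from collections import Counter
from fractions import Fraction

# Toy dataset from Step 1: (age, income, buys_sports_car)
DATA = [
    ("Young", "High", "Yes"),
    ("Young", "Medium", "No"),
    ("Young", "Low", "No"),
    ("Middle", "High", "Yes"),
    ("Middle", "Medium", "No"),
    ("Middle", "Low", "No"),
    ("Old", "High", "No"),
    ("Old", "Medium", "No"),
    ("Old", "Low", "No"),
]


def class_priors(rows):
    """Return P(class) as exact fractions: class count / total rows."""
    counts = Counter(label for _, _, label in rows)
    total = len(rows)
    return {label: Fraction(n, total) for label, n in counts.items()}


priors = class_priors(DATA)
for label, p in priors.items():
    print(f"P({label}) = {p} (about {float(p):.3f})")
```

On this dataset the sketch yields 2/9 for `Yes` and 7/9 for `No`, matching the priors listed above.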

---

## **Step 3: Compute Likelihoods P(X | Y)**
We calculate how likely each feature value is given that a person **buys** or **doesn't buy** a sports car.

### **Likelihood Probabilities**
For **Age**:
- **If No (doesn’t buy a car)**:
  - Middle: **28.6%**
  - Old: **42.9%**
  - Young: **28.6%**
- **If Yes (buys a car)**:
  - Middle: **50%**
  - Old: **0%** (no buyer in the dataset is old)
  - Young: **50%**

For **Income**:
- **If No**:
  - High: **14.3%**
  - Medium: **42.9%**
  - Low: **42.9%**
- **If Yes**:
  - High: **100%** (all buyers had high income)
  - Medium: **0%**
  - Low: **0%**
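
As a rough sketch, the likelihood table can be built by counting feature values within each class. The helper name `likelihoods` and the nested-dictionary layout are assumptions made for this example. Values never observed with a class are simply absent from the table, which corresponds to a probability of 0.

```python
from collections import Counter, defaultdict
from fractions import Fraction

# Toy dataset from Step 1: (age, income, buys_sports_car)
DATA = [
    ("Young", "High", "Yes"),
    ("Young", "Medium", "No"),
    ("Young", "Low", "No"),
    ("Middle", "High", "Yes"),
    ("Middle", "Medium", "No"),
    ("Middle", "Low", "No"),
    ("Old", "High", "No"),
    ("Old", "Medium", "No"),
    ("Old", "Low", "No"),
]
FEATURES = ("Age", "Income")


def likelihoods(rows):
    """Return table[feature][label][value] = P(feature = value | label)."""
    class_counts = Counter(row[-1] for row in rows)
    value_counts = defaultdict(Counter)  # keyed by (feature, label)
    for *values, label in rows:
        for feature, value in zip(FEATURES, values):
            value_counts[(feature, label)][value] += 1

    table = {f: {label: {} for label in class_counts} for f in FEATURES}
    for (feature, label), counts in value_counts.items():
        for value, n in counts.items():
            # Share of rows in this class that have this feature value
            table[feature][label][value] = Fraction(n, class_counts[label])
    return table


table = likelihoods(DATA)
for feature in FEATURES:
    for label, dist in table[feature].items():
        for value, p in dist.items():
            print(f"P({feature}={value} | {label}) = {p}")
```

For example, `table["Age"]["No"]["Old"]` comes out as 3/7 (about 42.9%), and `table["Income"]["Yes"]` contains only `High`.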

---

## **Step 4: Make a Prediction!**
Let’s predict whether a **Young person with High income** will buy a sports car. We'll use **Bayes' Theorem**:

$P(Y|X) = \frac{P(X|Y) P(Y)}{P(X)}$

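Since **P(X)** is the same for every class, we can drop it when comparing classes. Naïve Bayes also assumes the features are conditionally independent given the class, so **P(X | Y)** factors into one term per feature. We only need to compute: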

$P(Y=Yes | X) \propto P(Age=Young | Yes) \times P(Income=High | Yes) \times P(Yes)$

$P(Y=No | X) \propto P(Age=Young | No) \times P(Income=High | No) \times P(No)$

### **Compute Posterior Probabilities**
- **Buys Sports Car = Yes**:
  - $P(Age=Young | Yes) = 1/2 = 0.5$
  - $P(Income=High | Yes) = 2/2 = 1.0$
  - $P(Yes) = 2/9 \approx 0.222$
  - **Unnormalized posterior (Yes) = 1/2 × 1 × 2/9 = 1/9 ≈ 0.1111**

- **Buys Sports Car = No**:
  - $P(Age=Young | No) = 2/7 \approx 0.286$
  - $P(Income=High | No) = 1/7 \approx 0.143$
  - $P(No) = 7/9 \approx 0.778$
  - **Unnormalized posterior (No) = 2/7 × 1/7 × 7/9 = 2/63 ≈ 0.0317**

### **Normalize Probabilities**
These two values do not sum to 1, so we divide each by their total:

Total = **1/9 + 2/63 = 1/7 ≈ 0.1429**

- **P(Buys Sports Car = Yes | Young, High) = (1/9) / (1/7) = 7/9 ≈ 77.8%**
- **P(Buys Sports Car = No | Young, High) = (2/63) / (1/7) = 2/9 ≈ 22.2%**
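
The arithmetic for this query can be double-checked with exact fractions, which avoids the small drift that comes from multiplying rounded decimals. The following is a minimal sketch; the variable names are illustrative, and the values are read directly off Steps 2 and 3.

```python
from fractions import Fraction

# Values from Steps 2 and 3 for the query (Age=Young, Income=High)
prior = {"Yes": Fraction(2, 9), "No": Fraction(7, 9)}
p_young = {"Yes": Fraction(1, 2), "No": Fraction(2, 7)}
p_high = {"Yes": Fraction(1, 1), "No": Fraction(1, 7)}

# Unnormalized score per class: P(Young | y) * P(High | y) * P(y)
score = {y: p_young[y] * p_high[y] * prior[y] for y in prior}

# The sum of the scores is the normalizing constant
total = sum(score.values())
posterior = {y: s / total for y, s in score.items()}

for y in posterior:
    print(f"score({y}) = {score[y]}, posterior = {float(posterior[y]):.3f}")
```

Run as-is, the scores are 1/9 and 2/63, and the normalized posteriors are 7/9 (about 77.8%) for `Yes` and 2/9 (about 22.2%) for `No`.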

---

## **Final Prediction**
Since **P(Yes | Young, High) ≈ 77.8%** is higher than **P(No | Young, High) ≈ 22.2%**, we predict that a **Young person with High income will buy a sports car**! 🎯🚗
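
Putting the steps together, a prediction for any (age, income) pair might look like the sketch below. The function name `predict` is hypothetical, and the sketch deliberately applies no smoothing, so a value never seen with a class gives that class a score of 0, exactly as in the hand calculation.

```python
from collections import Counter
from fractions import Fraction

# Toy dataset from Step 1: (age, income, buys_sports_car)
DATA = [
    ("Young", "High", "Yes"),
    ("Young", "Medium", "No"),
    ("Young", "Low", "No"),
    ("Middle", "High", "Yes"),
    ("Middle", "Medium", "No"),
    ("Middle", "Low", "No"),
    ("Old", "High", "No"),
    ("Old", "Medium", "No"),
    ("Old", "Low", "No"),
]


def predict(rows, age, income):
    """Return (predicted_label, posterior dict) for one (age, income) query."""
    class_counts = Counter(label for _, _, label in rows)
    total_rows = len(rows)

    scores = {}
    for label, n in class_counts.items():
        in_class = [r for r in rows if r[2] == label]
        p_age = Fraction(sum(r[0] == age for r in in_class), n)
        p_income = Fraction(sum(r[1] == income for r in in_class), n)
        # Likelihoods times prior (the naive independence assumption)
        scores[label] = p_age * p_income * Fraction(n, total_rows)

    total = sum(scores.values())
    if total == 0:
        raise ValueError("query has zero probability under every class")
    posterior = {label: s / total for label, s in scores.items()}
    best = max(posterior, key=posterior.get)  # highest posterior wins
    return best, posterior


label, posterior = predict(DATA, "Young", "High")
print(label, {k: round(float(v), 3) for k, v in posterior.items()})
```

For the Young, High-income query this returns `Yes` with a posterior of 7/9. Trying `predict(DATA, "Old", "High")` returns `No`, because no buyer in the dataset is old.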

---

## **Final Summary**
1. **Calculated Prior Probabilities** → Probability of buying/not buying a car before looking at any features.  
2. **Computed Likelihoods** → How often each feature value (age, income) appears among buyers vs. non-buyers.  
3. **Applied Bayes' Theorem** → Multiplied the likelihoods by the prior for each class, then normalized.  
4. **Made a Prediction** → Highest probability wins!
