# Naïve Bayes for Numerical Data (Gaussian Naïve Bayes)

Naïve Bayes is a probabilistic classifier based on **Bayes’ Theorem**, with the assumption that features are **conditionally independent** given the class.

---

## 1. Bayes’ Theorem

For a class \( C_k \) and feature vector \( X = (x_1, x_2, \dots, x_n) \):

$$
P(C_k \mid X) = \frac{P(X \mid C_k) \, P(C_k)}{P(X)}
$$

Since \( P(X) \) does not depend on the class, it can be dropped when comparing classes:

$$
P(C_k \mid X) \propto P(C_k) \, P(X \mid C_k)
$$
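
For reference, the denominator can be expanded with the law of total probability over all classes, which shows that it is a single normalizing constant shared by every class. The symbol \( K \) (the number of classes) is notation introduced here, not part of the original text:

$$
P(X) = \sum_{i=1}^{K} P(X \mid C_i) \, P(C_i)
$$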

---

## 2. Naïve Independence Assumption

$$
P(X \mid C_k) = \prod_{j=1}^n P(x_j \mid C_k)
$$

Thus:

$$
P(C_k \mid X) \propto P(C_k) \prod_{j=1}^n P(x_j \mid C_k)
$$
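
In practice the product of many small factors can underflow, so the same comparison is often carried out on logarithms. This is a common reformulation rather than something the derivation above requires; because the logarithm is monotonic, the ranking of the classes is unchanged:

$$
\log P(C_k \mid X) = \log P(C_k) + \sum_{j=1}^n \log P(x_j \mid C_k) - \log P(X)
$$

The last term is the same for every class and can be ignored when choosing the best class.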

---

## 3. Handling Numerical Features

For numerical features, we assume each feature follows a **Gaussian (normal) distribution** within each class and use its probability density as the likelihood.

For feature \( x_j \) in class \( C_k \):

$$
P(x_j \mid C_k) = \frac{1}{\sqrt{2 \pi \sigma_{jk}^2}} \exp \left( -\frac{(x_j - \mu_{jk})^2}{2 \sigma_{jk}^2} \right)
$$

where:
- \( \mu_{jk} \) = mean of feature \( j \) in class \( C_k \)  
- \( \sigma_{jk}^2 \) = variance of feature \( j \) in class \( C_k \)
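
The density above translates directly into a small helper. This is a minimal sketch using only the Python standard library; the function name `gaussian_pdf` and its argument names are illustrative choices.

```python
import math


def gaussian_pdf(x: float, mean: float, var: float) -> float:
    """Density of a normal distribution N(mean, var) evaluated at x."""
    if var <= 0:
        raise ValueError("variance must be positive")
    coeff = 1.0 / math.sqrt(2.0 * math.pi * var)
    exponent = -((x - mean) ** 2) / (2.0 * var)
    return coeff * math.exp(exponent)


# Standard normal at its mean: 1 / sqrt(2 * pi), about 0.3989
print(gaussian_pdf(0.0, 0.0, 1.0))
```

Note that this is a density, not a probability, so it can exceed 1 when the variance is small.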

---

## 4. Example Dataset

We have three **numerical features** (Height, Weight, Age) and one **categorical target** (Gender):

| Height (cm) | Weight (kg) | Age | Gender |
|-------------|-------------|-----|--------|
| 175.2       | 70.5        | 25  | Male   |
| 162.8       | 55.3        | 30  | Female |
| 180.1       | 82.4        | 28  | Male   |
| 158.5       | 48.2        | 22  | Female |
| 177.9       | 75.8        | 35  | Male   |
| 165.4       | 60.1        | 27  | Female |

- **Features:** Height, Weight, Age  
- **Target:** Gender (Male / Female)

---

## 5. Step-by-Step Training

### Step 1: Compute Class Priors

$$
P(\text{Male}) = \frac{3}{6} = 0.5, \quad P(\text{Female}) = \frac{3}{6} = 0.5
$$
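
The priors are just relative class frequencies. A minimal sketch, with `class_priors` as an illustrative helper name:

```python
from collections import Counter


def class_priors(labels):
    """Return {class: relative frequency} for a list of class labels."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


# Gender column of the example table, in row order
labels = ["Male", "Female", "Male", "Female", "Male", "Female"]
priors = class_priors(labels)
print(priors)  # {'Male': 0.5, 'Female': 0.5}
```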

---

### Step 2: Compute Mean and Variance per Feature per Class

The variances below are **population variances**: the squared deviations from the class mean are averaged over the \( N_k = 3 \) samples in the class (dividing by \( N_k \), not \( N_k - 1 \)).

**For Male:**

- Heights: [175.2, 180.1, 177.9]

$$
\mu_{H,M} \approx 177.73, \quad \sigma^2_{H,M} = \frac{(-2.53)^2 + (2.37)^2 + (0.17)^2}{3} \approx 4.02
$$

- Weights: [70.5, 82.4, 75.8]

$$
\mu_{W,M} \approx 76.23, \quad \sigma^2_{W,M} \approx 23.70
$$

- Ages: [25, 28, 35]

$$
\mu_{A,M} \approx 29.33, \quad \sigma^2_{A,M} \approx 17.56
$$

---

**For Female:**

- Heights: [162.8, 158.5, 165.4]

$$
\mu_{H,F} \approx 162.23, \quad \sigma^2_{H,F} \approx 8.10
$$

- Weights: [55.3, 48.2, 60.1]

$$
\mu_{W,F} \approx 54.53, \quad \sigma^2_{W,F} \approx 23.90
$$

- Ages: [30, 22, 27]

$$
\mu_{A,F} \approx 26.33, \quad \sigma^2_{A,F} \approx 10.89
$$

---

## 6. Prediction Example

Suppose we want to classify a **new person** with:

- Height = 170  
- Weight = 65  
- Age = 26  

$$
X = (170, 65, 26)
$$

---

### Step 1: Compute Likelihood for Male

Each likelihood is the Gaussian density from Section 3 evaluated at the new value. The results use the unrounded means and variances, so plugging in the two-decimal parameters shown here can change the last digit.

1. **Height (Male):**

$$
P(170 \mid M) = \frac{1}{\sqrt{2 \pi (4.02)}} \exp \left( -\frac{(170 - 177.73)^2}{2 \cdot 4.02} \right) \approx 1.16 \times 10^{-4}
$$

2. **Weight (Male):**

$$
P(65 \mid M) \approx 5.72 \times 10^{-3}
$$

3. **Age (Male):**

$$
P(26 \mid M) \approx 0.0694
$$

Multiply together with prior:

$$
P(M \mid X) \propto 0.5 \times (1.16 \times 10^{-4}) \times (5.72 \times 10^{-3}) \times (0.0694) \approx 2.30 \times 10^{-8}
$$

---

### Step 2: Compute Likelihood for Female

1. **Height (Female):**

$$
P(170 \mid F) \approx 3.38 \times 10^{-3}
$$

2. **Weight (Female):**

$$
P(65 \mid F) \approx 8.25 \times 10^{-3}
$$

3. **Age (Female):**

$$
P(26 \mid F) \approx 0.1203
$$

Multiply together with prior:

$$
P(F \mid X) \propto 0.5 \times (3.38 \times 10^{-3}) \times (8.25 \times 10^{-3}) \times (0.1203) \approx 1.68 \times 10^{-6}
$$

---

### Step 3: Compare Posterior Probabilities

The two products are unnormalized posterior scores:

$$
P(M \mid X) \propto 2.30 \times 10^{-8}, \quad P(F \mid X) \propto 1.68 \times 10^{-6}
$$

Dividing each score by their sum gives the normalized posteriors:

$$
P(M \mid X) \approx 0.014, \quad P(F \mid X) \approx 0.986
$$

Since \( P(F \mid X) \gg P(M \mid X) \) (the Female score is roughly 70 times larger), the classifier predicts:

$$
\hat{C} = \text{Female}
$$
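
As a cross-check, the worked example can be recomputed from the raw table. This is a minimal sketch using only the Python standard library; the names `data`, `x_new`, and `gaussian_pdf` are illustrative, and it assumes population variance (`statistics.pvariance`) with no variance smoothing.

```python
import math
from statistics import fmean, pvariance

# (height_cm, weight_kg, age) rows from the example table, grouped by class
data = {
    "Male": [(175.2, 70.5, 25), (180.1, 82.4, 28), (177.9, 75.8, 35)],
    "Female": [(162.8, 55.3, 30), (158.5, 48.2, 22), (165.4, 60.1, 27)],
}
x_new = (170, 65, 26)


def gaussian_pdf(x, mean, var):
    """Normal density N(mean, var) evaluated at x."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


n_total = sum(len(rows) for rows in data.values())
scores = {}
for cls, rows in data.items():
    score = len(rows) / n_total  # class prior
    # zip(*rows) turns the list of rows into one tuple per feature
    for j, column in enumerate(zip(*rows)):
        mu = fmean(column)
        var = pvariance(column, mu)  # population variance (divide by N)
        score *= gaussian_pdf(x_new[j], mu, var)
    scores[cls] = score  # prior times product of per-feature densities

# Normalize the scores so they sum to 1
total = sum(scores.values())
posteriors = {cls: s / total for cls, s in scores.items()}
prediction = max(scores, key=scores.get)

print(scores)
print(posteriors)
print(prediction)  # Female
```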

---

## 7. Summary

1. Compute the **class priors**.
2. Compute the **mean and variance** of each feature within each class.
3. Use the **Gaussian probability density function** to compute the likelihoods.
4. Multiply the likelihoods by the prior.
5. Choose the class with the **maximum posterior probability**.
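
The five steps can be sketched as a pair of reusable functions. This is an illustrative sketch, not a reference implementation: the names `fit`, `log_scores`, and `predict` are hypothetical, the variance is the population variance, and the scores are summed in log space (an assumption made here to avoid underflow). There is no variance smoothing, so a feature that is constant within a class would cause a division by zero.

```python
import math
from collections import defaultdict


def fit(rows, labels):
    """Steps 1-2: prior, per-feature mean and population variance per class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        n = len(members)
        columns = list(zip(*members))  # one tuple per feature
        means = [sum(col) / n for col in columns]
        variances = [
            sum((v - m) ** 2 for v in col) / n for col, m in zip(columns, means)
        ]
        model[label] = (n / len(rows), means, variances)
    return model


def log_scores(model, x):
    """Steps 3-4: log prior plus the sum of log Gaussian densities."""
    scores = {}
    for label, (prior, means, variances) in model.items():
        total = math.log(prior)
        for value, m, var in zip(x, means, variances):
            total += -0.5 * math.log(2 * math.pi * var) - (value - m) ** 2 / (2 * var)
        scores[label] = total
    return scores


def predict(model, x):
    """Step 5: class with the largest log-posterior score."""
    scores = log_scores(model, x)
    return max(scores, key=scores.get)


rows = [
    (175.2, 70.5, 25), (162.8, 55.3, 30), (180.1, 82.4, 28),
    (158.5, 48.2, 22), (177.9, 75.8, 35), (165.4, 60.1, 27),
]
labels = ["Male", "Female", "Male", "Female", "Male", "Female"]

model = fit(rows, labels)
print(predict(model, (170, 65, 26)))  # Female
```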

✅ In this example, a new person with (Height=170, Weight=65, Age=26) is classified as **Female**.


- Gaussian Naïve Bayes uses the same Bayes' theorem formula as above and then calculates its value for each class.
- The added assumption is that each numerical feature is normally distributed within each class.