
# Naive Bayes from Scratch (Gaussian | NumPy Only)

In this notebook, we’ll:

- Understand how Naive Bayes works
- Implement Gaussian Naive Bayes using NumPy
- Classify continuous input data
- Test the model on a small dataset


## What is Naive Bayes?

Naive Bayes is a **probabilistic classifier** based on Bayes’ Theorem, which gives the posterior probability of a class $y$ given a feature vector $x$:

$$
P(y|x) = \frac{P(x|y) \cdot P(y)}{P(x)}
$$

Key assumptions:
- Features are **conditionally independent** given the class
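- For continuous features, each feature $x_i$ is modeled with a **Gaussian probability density function (PDF)**, whose mean $\mu$ and variance $\sigma^2$ are estimated separately for every class and feature: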

$$
P(x_i|y) = \frac{1}{\sqrt{2\pi\sigma^2}} \cdot \exp\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right)
$$
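
Combining the two assumptions gives the decision rule that the prediction code relies on. The following is a short derivation sketch; $d$ (the number of features) and $\hat{y}$ (the predicted class) are notation introduced here for illustration. Because $P(x)$ is the same for every class, it can be dropped, and taking logarithms turns the product over features into a sum:

$$
\hat{y} = \arg\max_{y} \; P(y) \prod_{i=1}^{d} P(x_i|y) = \arg\max_{y} \left[ \log P(y) + \sum_{i=1}^{d} \log P(x_i|y) \right]
$$

For the Gaussian PDF above, each log term expands to:

$$
\log P(x_i|y) = -\frac{1}{2}\log\left(2\pi\sigma^2\right) - \frac{(x_i - \mu)^2}{2\sigma^2}
$$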


## Imports + Dataset


In [3]:
import numpy as np

# Simple 2D data (2 features), binary classification
X_train = np.array([
    [1.0, 2.1], [1.3, 1.8], [0.8, 2.5],
    [6.0, 5.8], [6.2, 6.1], [5.8, 6.5]
])
y_train = np.array([0, 0, 0, 1, 1, 1])

## Compute Priors, Means, and Variances

In [4]:
classes = np.unique(y_train)
priors = {}
means = {}
variances = {}

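for c in classes:
    X_c = X_train[y_train == c]                 # rows that belong to class c
    priors[c] = len(X_c) / len(X_train)         # P(y = c): fraction of training rows in class c
    means[c] = np.mean(X_c, axis=0)             # per-feature mean within class c
    variances[c] = np.var(X_c, axis=0) + 1e-9   # per-feature variance; epsilon avoids division by zero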
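
As a quick sanity check of the estimates above, the sketch below recomputes them in a self-contained way and prints them. It also compares NumPy's default population variance (`ddof=0`, which is what the loop above uses) with the unbiased sample variance (`ddof=1`). The name `fit_summary` is a hypothetical helper introduced only for this illustration.

```python
import numpy as np

# Same toy data as in the notebook
X_train = np.array([
    [1.0, 2.1], [1.3, 1.8], [0.8, 2.5],
    [6.0, 5.8], [6.2, 6.1], [5.8, 6.5]
])
y_train = np.array([0, 0, 0, 1, 1, 1])

def fit_summary(X, y):
    """Hypothetical helper: per-class prior, mean, and both variance estimates."""
    summary = {}
    for c in np.unique(y):
        X_c = X[y == c]
        summary[int(c)] = {
            "prior": len(X_c) / len(X),
            "mean": X_c.mean(axis=0),
            "var_population": X_c.var(axis=0),       # divides by n (ddof=0)
            "var_sample": X_c.var(axis=0, ddof=1),   # divides by n - 1
        }
    return summary

summary = fit_summary(X_train, y_train)
for c, stats in summary.items():
    print(f"class {c}: prior={stats['prior']:.2f}, mean={stats['mean'].round(3)}")
    print(f"  var (ddof=0)={stats['var_population'].round(4)}, "
          f"var (ddof=1)={stats['var_sample'].round(4)}")
```

With only three samples per class, the two variance estimates differ by a factor of 1.5, so the choice can matter on small datasets.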

## Gaussian Probability Density Function

In [5]:
def gaussian_pdf(x, mean, var):
    """Gaussian density of x, evaluated element-wise for each feature."""
    numerator = np.exp(-((x - mean) ** 2) / (2 * var))
    denominator = np.sqrt(2 * np.pi * var)
    return numerator / denominator
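
A density is not a probability, which is easy to forget when reading `gaussian_pdf` output. The sketch below (self-contained, with the function repeated so it runs on its own) checks three properties: the standard normal peaks at about 0.3989, the curve integrates to roughly 1, and a small variance pushes the density above 1. The grid bounds and the variance value 0.03 are arbitrary choices for illustration.

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    numerator = np.exp(-((x - mean) ** 2) / (2 * var))
    denominator = np.sqrt(2 * np.pi * var)
    return numerator / denominator

# Peak of the standard normal: 1 / sqrt(2 * pi)
peak_standard = gaussian_pdf(0.0, 0.0, 1.0)

# Riemann-sum approximation of the area under the curve
grid = np.linspace(-10.0, 10.0, 20001)
dx = grid[1] - grid[0]
area = np.sum(gaussian_pdf(grid, 0.0, 1.0)) * dx

# With a small variance the density at the mean exceeds 1
peak_narrow = gaussian_pdf(6.0, 6.0, 0.03)

print(f"peak (var=1):    {peak_standard:.4f}")
print(f"area (var=1):    {area:.4f}")
print(f"peak (var=0.03): {peak_narrow:.4f}")
```

This is why the per-feature values are multiplied (or their logs summed) and compared across classes rather than read as probabilities on their own.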

## Prediction Function

In [6]:
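def predict(x):
    """Return the class with the highest log-posterior for one sample x."""
    log_posteriors = []
    for c in classes:
        log_prior = np.log(priors[c])
        # Independence assumption: per-feature log-likelihoods add up
        log_likelihood = np.sum(np.log(gaussian_pdf(x, means[c], variances[c])))
        # Unnormalized: P(x) is the same for every class, so it is omitted
        log_posteriors.append(log_prior + log_likelihood)
    return classes[np.argmax(log_posteriors)]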
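
One caveat with taking `np.log` of the PDF: for a point far from a class mean, `np.exp` can underflow to 0, and the log then becomes `-inf` for every class. A possible alternative, sketched below, evaluates the log-density in closed form instead. `gaussian_log_pdf` is a hypothetical helper name, not part of the original notebook, and the values of `mean`, `var`, and `x_far` are made up for the demonstration.

```python
import numpy as np

def gaussian_pdf(x, mean, var):
    numerator = np.exp(-((x - mean) ** 2) / (2 * var))
    denominator = np.sqrt(2 * np.pi * var)
    return numerator / denominator

def gaussian_log_pdf(x, mean, var):
    """Hypothetical helper: log of the Gaussian density, computed without exp."""
    return -0.5 * np.log(2 * np.pi * var) - (x - mean) ** 2 / (2 * var)

mean, var = 1.0, 0.04   # illustrative parameters
x_far = 100.0           # a point very far from the mean

# exp(...) underflows to 0.0, so the log is -inf (warning silenced on purpose)
with np.errstate(divide="ignore"):
    naive = np.log(gaussian_pdf(x_far, mean, var))

# The closed form stays finite
stable = gaussian_log_pdf(x_far, mean, var)

print("log(pdf):", naive)
print("log-pdf :", stable)
```

Near the class means the two approaches agree; they only diverge once the density underflows.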

## Test Predictions

In [7]:
X_test = np.array([
    [1.1, 2.0], [6.1, 5.9]
])

for x in X_test:
    pred = predict(x)
    print(f"Input: {x} → Predicted class: {pred}")

Input: [1.1 2. ] → Predicted class: 0
Input: [6.1 5.9] → Predicted class: 1
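
`predict` returns only the winning class. If class probabilities are also wanted, the unnormalized log-posteriors can be normalized with the log-sum-exp trick. The sketch below is one way to do that under the same model; `predict_proba` and `gaussian_log_pdf` are hypothetical names introduced here, the third query point is an arbitrary choice, and the block refits the toy data so that it runs on its own.

```python
import numpy as np

X_train = np.array([
    [1.0, 2.1], [1.3, 1.8], [0.8, 2.5],
    [6.0, 5.8], [6.2, 6.1], [5.8, 6.5]
])
y_train = np.array([0, 0, 0, 1, 1, 1])

# Same estimates as in the notebook, written compactly
classes = np.unique(y_train)
priors = {c: np.mean(y_train == c) for c in classes}
means = {c: X_train[y_train == c].mean(axis=0) for c in classes}
variances = {c: X_train[y_train == c].var(axis=0) + 1e-9 for c in classes}

def gaussian_log_pdf(x, mean, var):
    """Hypothetical helper: closed-form log of the Gaussian density."""
    return -0.5 * np.log(2 * np.pi * var) - (x - mean) ** 2 / (2 * var)

def predict_proba(x):
    """Hypothetical helper: posterior P(y | x) for each class, in `classes` order."""
    log_joint = np.array([
        np.log(priors[c]) + np.sum(gaussian_log_pdf(x, means[c], variances[c]))
        for c in classes
    ])
    # Log-sum-exp: subtract the max before exponentiating to avoid underflow
    shifted = np.exp(log_joint - np.max(log_joint))
    return shifted / shifted.sum()

for x in np.array([[1.1, 2.0], [6.1, 5.9], [3.5, 4.0]]):
    print(x, predict_proba(x).round(4))
```

On data this well separated the posteriors are expected to sit very close to 0 or 1, which reflects the toy dataset more than the method.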


## Summary

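- Implemented Gaussian Naive Bayes with a few plain functions
- Computed class priors and per-feature means and variances directly from the training data
- Predicted by maximizing the log-posterior: the log prior plus the sum of per-feature log Gaussian densities
- Kept the code free of classes and other OOP structure

This keeps every step of Naive Bayes visible, which is useful for interviews and quick prototyping.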
