# Variants of Naïve Bayes

## 1. **Gaussian Naïve Bayes**

* Assumes that **continuous features** follow a **normal (Gaussian) distribution** within each class.
* Likelihood:

  $$
  P(x_i \mid y) = \frac{1}{\sqrt{2\pi\sigma_{y,i}^2}} \exp\left(-\frac{(x_i - \mu_{y,i})^2}{2\sigma_{y,i}^2}\right)
  $$

  * $\mu_{y,i}$: mean of feature $i$ for class $y$
  * $\sigma_{y,i}^2$: variance of feature $i$ for class $y$
* **Use case**: Continuous numeric data (e.g., medical measurements, sensor data).
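
To make the likelihood above concrete, here is a minimal sketch that fits a mean and variance per class and compares the resulting densities. The temperature readings, the class names, and the helpers `gaussian_pdf`, `fit_gaussian`, and `predict` are illustrative assumptions rather than part of any library, and equal class priors are assumed.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance at x."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def fit_gaussian(values):
    """Maximum-likelihood mean and variance of one feature within one class."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var

# Hypothetical toy data: one feature (body temperature in degrees Celsius) per class.
samples = {
    "healthy": [36.5, 36.7, 36.6, 36.8, 36.4],
    "fever": [38.2, 38.9, 39.1, 38.5, 38.8],
}
params = {label: fit_gaussian(vals) for label, vals in samples.items()}

def predict(x):
    """Pick the class with the larger likelihood (equal priors assumed)."""
    return max(params, key=lambda label: gaussian_pdf(x, *params[label]))

print(predict(36.9))
print(predict(38.4))
```

With several features, the per-feature densities would be multiplied (or their logarithms summed) before comparing classes.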

---

## 2. **Multinomial Naïve Bayes**

* Assumes features are **discrete counts** (e.g., word counts in text).
* Likelihood:

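  $$
  P(x \mid y) = \frac{\left(\sum_i x_i\right)!}{\prod_i x_i!} \prod_{i=1}^n \theta_{y,i}^{x_i}
  $$

  * $x_i$: count of feature $i$ in the sample
  * $\theta_{y,i}$: probability of feature $i$ under class $y$, with $\sum_i \theta_{y,i} = 1$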
* **Use case**: Text classification (spam detection, sentiment analysis) with **bag-of-words** counts. Fractional features such as **TF-IDF** weights are not counts, but they often work in practice.
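
The multinomial likelihood can be sketched in a few lines. The example below estimates Laplace-smoothed per-class feature probabilities (called `theta` in the code) and scores a new document by its log-likelihood. The vocabulary, the counts, and the helper names are hypothetical. The multinomial coefficient is omitted because it is identical for every class and does not affect the comparison.

```python
import math

def fit_multinomial(count_rows, alpha=1.0):
    """Laplace-smoothed probability of each feature for one class."""
    n_features = len(count_rows[0])
    totals = [sum(row[i] for row in count_rows) for i in range(n_features)]
    denom = sum(totals) + alpha * n_features
    return [(t + alpha) / denom for t in totals]

def log_likelihood(x, theta):
    """Sum of x_i * log(theta_i); the class-independent coefficient is dropped."""
    return sum(xi * math.log(ti) for xi, ti in zip(x, theta))

# Hypothetical vocabulary: ["free", "win", "meeting", "report"]
spam_docs = [[3, 2, 0, 0], [2, 1, 0, 1]]
ham_docs = [[0, 0, 2, 3], [1, 0, 3, 1]]
theta = {"spam": fit_multinomial(spam_docs), "ham": fit_multinomial(ham_docs)}

new_doc = [2, 1, 0, 0]
scores = {label: log_likelihood(new_doc, t) for label, t in theta.items()}
print(max(scores, key=scores.get))
```

The smoothing term `alpha` keeps a word that never occurred in a class from driving the whole likelihood to zero.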

---

## 3. **Bernoulli Naïve Bayes**

* Features are **binary** (0 or 1: present/absent).
* Likelihood:

  $$
  P(x_i \mid y) = p_{y,i}^{x_i} (1-p_{y,i})^{1-x_i}
  $$

  * $p_{y,i}$: probability that feature $i$ is present in a sample of class $y$
* **Use case**: Text classification where only word presence matters (not frequency).
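
Unlike the multinomial model, the Bernoulli likelihood also scores the features that are absent. The sketch below illustrates this with made-up presence probabilities for three words. The numbers and the helper `bernoulli_log_likelihood` are assumptions chosen for illustration.

```python
import math

def bernoulli_log_likelihood(x, p):
    """Sum over features of x_i*log(p_i) + (1 - x_i)*log(1 - p_i) for binary x."""
    return sum(
        math.log(pi) if xi else math.log(1.0 - pi)
        for xi, pi in zip(x, p)
    )

# Hypothetical per-class probabilities that each of three words appears.
p_spam = [0.8, 0.6, 0.1]
p_ham = [0.2, 0.1, 0.7]

doc = [1, 0, 0]  # only the first word is present
ll_spam = bernoulli_log_likelihood(doc, p_spam)
ll_ham = bernoulli_log_likelihood(doc, p_ham)
print(ll_spam > ll_ham)
```

Here the absence of the third word counts in favor of the spam class, because that word is rare in spam and common in ham.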

---

## 4. **Complement Naïve Bayes**

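* A variation of Multinomial NB designed for **imbalanced datasets**.
* Estimates each class's feature weights from the training samples of **all other classes** (its complement), then predicts the class whose complement matches the sample worst. Because every complement pools several classes, the estimates are less biased toward the majority class.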
* **Use case**: Text classification with severe class imbalance.
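
The complement idea can be sketched as follows: for each class, smoothed feature probabilities are estimated from the documents of every other class, and the predicted class is the one with the lowest complement score. The counts, class names, and helper functions are hypothetical, and the sketch leaves out the optional weight normalization that some implementations apply.

```python
import math

def complement_weights(docs_by_class, alpha=1.0):
    """For each class c, log of smoothed feature probabilities estimated
    from every document that does NOT belong to c."""
    n_features = len(next(iter(docs_by_class.values()))[0])
    weights = {}
    for c in docs_by_class:
        totals = [0.0] * n_features
        for other, rows in docs_by_class.items():
            if other == c:
                continue
            for row in rows:
                for i, count in enumerate(row):
                    totals[i] += count
        denom = sum(totals) + alpha * n_features
        weights[c] = [math.log((t + alpha) / denom) for t in totals]
    return weights

def predict(x, weights):
    """Choose the class whose complement explains the document worst."""
    return min(weights, key=lambda c: sum(xi * wi for xi, wi in zip(x, weights[c])))

# Hypothetical imbalanced counts over the vocabulary ["goal", "match", "vote"].
docs_by_class = {
    "sports": [[3, 2, 0], [2, 3, 0], [4, 1, 1], [1, 2, 0]],
    "politics": [[0, 1, 3]],
}
weights = complement_weights(docs_by_class)
print(predict([0, 0, 2], weights))
print(predict([2, 1, 0], weights))
```

Note that the minority class "politics" is scored with statistics from the four sports documents, so its weights rest on more data than its single training document would provide.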

---

## 5. **Categorical Naïve Bayes** (`CategoricalNB` in scikit-learn ≥ 0.22)

* Handles **categorical features with multiple categories** (not just binary).
* Estimates, for every class, a separate categorical distribution over the values of each feature.
* **Use case**: Datasets with categorical variables (e.g., “color = red/green/blue”).
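
As a rough illustration, the per-class category probabilities for a single feature can be estimated by counting with Laplace smoothing. The class names, observations, and the helper `fit_categorical` below are hypothetical.

```python
from collections import Counter

def fit_categorical(values, categories, alpha=1.0):
    """Laplace-smoothed P(category | class) for one feature within one class."""
    counts = Counter(values)
    denom = len(values) + alpha * len(categories)
    return {cat: (counts[cat] + alpha) / denom for cat in categories}

colors = ["red", "green", "blue"]
# Hypothetical observations of the "color" feature, grouped by class.
color_by_class = {
    "apple": ["red", "red", "green", "red"],
    "berry": ["blue", "blue", "red", "blue"],
}
probs = {label: fit_categorical(vals, colors) for label, vals in color_by_class.items()}
print(probs["apple"])
print(probs["berry"])
```

scikit-learn's `CategoricalNB` expects each feature to be encoded as non-negative integers, so string categories like these would first need an integer encoding (for example with `OrdinalEncoder`).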

---

## 6. **Kernel Density Estimation (KDE) Naïve Bayes**

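* Instead of assuming a Gaussian distribution, estimates each feature's class-conditional likelihood using **non-parametric density estimation**.
* More flexible but computationally heavier, since every prediction evaluates a kernel over the stored training samples. It is not one of the estimators in `sklearn.naive_bayes`.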
* **Use case**: Continuous features that are not Gaussian-shaped.
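
A minimal one-feature sketch of the idea, assuming a Gaussian kernel, a hand-picked bandwidth, and equal class priors. The data and the helpers `kde_pdf` and `predict` are hypothetical; class "a" is deliberately bimodal, a shape that a single Gaussian would represent poorly.

```python
import math

def kde_pdf(x, samples, bandwidth):
    """Gaussian-kernel density estimate at x from one class's samples."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm

# Hypothetical one-feature data; class "a" has two separate clusters.
samples = {
    "a": [1.0, 1.2, 0.9, 5.0, 5.1, 4.8],
    "b": [2.8, 3.0, 3.1, 3.2, 2.9, 3.0],
}

def predict(x, bandwidth=0.3):
    """Class with the highest estimated density at x (equal priors assumed)."""
    return max(samples, key=lambda c: kde_pdf(x, samples[c], bandwidth))

print(predict(3.0))
print(predict(5.0))
print(predict(1.1))
```

The bandwidth controls how smooth the estimate is and would normally be tuned, for example by cross-validation.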

---


| Variant            | Data Type   | Distribution Assumption        | Common Use Case              |
| ------------------ | ----------- | ------------------------------ | ---------------------------- |
| **Gaussian NB**    | Continuous  | Normal (Gaussian)              | Medical, sensor data         |
| **Multinomial NB** | Count-based | Multinomial                    | Text (word counts, TF-IDF)   |
| **Bernoulli NB**   | Binary      | Bernoulli                      | Text (word presence/absence) |
| **Complement NB**  | Count-based | Multinomial (complement class) | Imbalanced text datasets     |
| **Categorical NB** | Categorical | Categorical distribution       | Tabular categorical data     |
| **KDE NB**         | Continuous  | Non-parametric (KDE)           | Complex continuous features  |



```python
# Demonstration of Naive Bayes variants in scikit-learn
import numpy as np
from sklearn.datasets import load_iris, make_classification
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB, CategoricalNB, ComplementNB
from sklearn.metrics import accuracy_score

results = {}

# 1. Gaussian Naive Bayes (Iris dataset, continuous features)
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.3, random_state=42)
gnb = GaussianNB()
gnb.fit(X_train, y_train)
y_pred = gnb.predict(X_test)
results['GaussianNB'] = accuracy_score(y_test, y_pred)

# 2. Multinomial Naive Bayes (word counts from a toy text dataset)
docs = ["I love Python", "Python is great for machine learning", "I dislike bugs", "Bugs are annoying"]
labels = [1, 1, 0, 0]  # 1=positive, 0=negative
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)
# Only 4 documents: 2 are used for training and 2 for testing
X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.5, random_state=42)
mnb = MultinomialNB()
mnb.fit(X_train, y_train)
y_pred = mnb.predict(X_test)
results['MultinomialNB'] = accuracy_score(y_test, y_pred)

# 3. Bernoulli Naive Bayes (word presence/absence)
# BernoulliNB binarizes the counts itself (default binarize=0.0), so the same split is reused
bnb = BernoulliNB()
bnb.fit(X_train, y_train)
y_pred = bnb.predict(X_test)
results['BernoulliNB'] = accuracy_score(y_test, y_pred)

# 4. Complement Naive Bayes (designed for imbalanced data; reuses the same toy split)
cnb = ComplementNB()
cnb.fit(X_train, y_train)
y_pred = cnb.predict(X_test)
results['ComplementNB'] = accuracy_score(y_test, y_pred)

# 5. Categorical Naive Bayes (synthetic categorical dataset)
X, y = make_classification(n_samples=200, n_features=3, n_informative=3, n_redundant=0, random_state=42)
# Four bin edges turn each continuous feature into integer codes 0-4 (five categories)
X = np.digitize(X, bins=[-1, 0, 1, 2])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
# A test-set code higher than any seen for that feature during fit would raise an error;
# the min_categories parameter can guard against that
catnb = CategoricalNB()
catnb.fit(X_train, y_train)
y_pred = catnb.predict(X_test)
results['CategoricalNB'] = accuracy_score(y_test, y_pred)

results
```

```
{'GaussianNB': 0.9777777777777777,
 'MultinomialNB': 1.0,
 'BernoulliNB': 1.0,
 'ComplementNB': 1.0,
 'CategoricalNB': 0.85}
```

| Variant           | Use Case                                           | Dataset Used                           | Accuracy  |
| ----------------- | -------------------------------------------------- | -------------------------------------- | --------- |
| **GaussianNB**    | Continuous features (normally distributed)         | Iris dataset                           | **97.8%** |
| **MultinomialNB** | Discrete counts (text classification, word counts) | Toy text dataset (4 docs, 2 held out)  | **100%**  |
| **BernoulliNB**   | Binary features (word presence/absence)            | Same text dataset                      | **100%**  |
| **ComplementNB**  | Imbalanced text data                               | Same text dataset                      | **100%**  |
| **CategoricalNB** | Purely categorical data                            | Synthetic categorical dataset          | **85%**   |

The three text scores are each computed on only two held-out documents, so they show that the estimators run, not how the variants compare.
