## 🔍 What is Naive Bayes?

Naive Bayes is a **probabilistic machine learning algorithm** based on **Bayes’ Theorem**, primarily used for **classification tasks**.

**It assumes:**

* **Conditional independence of features** given the class (the "naive" assumption).

* That **each feature therefore contributes independently** to the probability of a class.

It calculates the **posterior probability** of each class based on observed features and chooses the class with the highest probability.

$$P(\text{Class} \mid \text{Features}) = \frac{P(\text{Features} \mid \text{Class}) \, P(\text{Class})}{P(\text{Features})}$$
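Under the independence assumption, the likelihood term factorizes over the individual features. Writing the features as $x_1, \dots, x_n$ (notation introduced here for illustration only), the rule can be sketched as:

$$P(\text{Features} \mid \text{Class}) = \prod_{i=1}^{n} P(x_i \mid \text{Class})$$

$$P(\text{Class} \mid \text{Features}) \propto P(\text{Class}) \prod_{i=1}^{n} P(x_i \mid \text{Class})$$

The denominator $P(\text{Features})$ is the same for every class, so it can be dropped when only the most probable class is needed.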
 
## ❓ Why Use Naive Bayes?

* **✅ Simple & Fast:** Quick to train and predict, even on large datasets.

* **✅ Low Memory Usage:** Stores only simple statistics.

* **✅ Performs well with high-dimensional data**, especially in text classification (e.g., spam detection).

* **✅ Works well with small datasets.**

## 📅 When to Use Naive Bayes?

Use Naive Bayes when:

* You have **text data** (emails, articles, reviews).

* You need a **baseline model** to compare with other classifiers.

* You want a **fast model** for real-time or streaming applications.

* The features are **conditionally independent** (or nearly so).
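For the streaming case, scikit-learn's Naive Bayes classifiers provide `partial_fit`, which updates the model one batch at a time. The sketch below uses synthetic count data (an assumption made purely for illustration) and checks that batch-wise training matches a single full fit.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Synthetic count features and binary labels, standing in for a data stream
rng = np.random.default_rng(0)
X_stream = rng.integers(0, 5, size=(300, 10))
y_stream = rng.integers(0, 2, size=300)

clf = MultinomialNB()
classes = np.array([0, 1])  # all classes must be declared on the first call

# Feed the data in three batches of 100 rows each
for start in range(0, 300, 100):
    batch = slice(start, start + 100)
    clf.partial_fit(X_stream[batch], y_stream[batch], classes=classes)

# Reference model trained on everything at once
full = MultinomialNB().fit(X_stream, y_stream)
same_model = np.allclose(clf.feature_log_prob_, full.feature_log_prob_)
print("Incremental fit matches full fit:", same_model)
```

Because the model only accumulates counts, each update is cheap and earlier batches do not need to be kept in memory.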

**Common Applications:**

* Spam Filtering

* Sentiment Analysis

* Document Classification

* Medical Diagnosis (with careful feature handling)

## 🧠 How Does Naive Bayes Solve Problems?

Let’s take **email spam** classification as an example.

Suppose we want to classify an email as spam or not spam using word frequencies. Naive Bayes:

* Calculates the probability of the email being spam given the words in it.

* Uses training data to estimate these probabilities.

* Relies on word occurrence patterns (like "offer", "win", etc.) rather than on grammar or context.

Each word contributes independently to the final prediction.
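The steps above can be sketched from scratch in a few lines. The training emails, the `fit` and `predict` helpers, and the add-one smoothing (`alpha=1.0`) are all assumptions made for this illustration, not part of any library.

```python
import math
from collections import Counter

# Hypothetical toy training emails, already tokenized into lowercase words
train = [
    ("win a free offer now".split(), "spam"),
    ("exclusive offer win cash".split(), "spam"),
    ("meeting agenda for monday".split(), "not spam"),
    ("please review the meeting notes".split(), "not spam"),
]

def fit(examples):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(label for _, label in examples)
    word_counts = {label: Counter() for label in class_counts}
    for tokens, label in examples:
        word_counts[label].update(tokens)
    vocab = {word for tokens, _ in examples for word in tokens}
    return class_counts, word_counts, vocab

def predict(tokens, class_counts, word_counts, vocab, alpha=1.0):
    """Return the class with the highest log-posterior (up to a constant)."""
    n_docs = sum(class_counts.values())
    scores = {}
    for label, n_label in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_label / n_docs)  # log prior
        for word in tokens:
            if word not in vocab:
                continue  # words never seen in training are skipped
            # Smoothed estimate of P(word | class); each word adds its own term
            likelihood = (word_counts[label][word] + alpha) / (total_words + alpha * len(vocab))
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)

model = fit(train)
print(predict("win offer".split(), *model))
print(predict("meeting notes".split(), *model))
```

Working in log space turns the product of per-word probabilities into a sum, which avoids numerical underflow on long emails.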

## ⚠️ Issues & Limitations of Naive Bayes

* **Strong Independence Assumption:**

    * Assumes that features (e.g., words) are independent, which is rarely true in practice.

    * Despite this, it performs surprisingly well in many applications.

* **Zero Frequency Problem:**

    * If a word never occurred with a given class in the training data, its estimated probability for that class is zero, which zeroes out the whole product for that class.

    * **Solution:** Use **Laplace smoothing** (hyperparameter alpha).

* **Not Ideal for Correlated Features:**

    * If your features are correlated (e.g., temperature and humidity), Naive Bayes struggles.

* **Poor Probabilistic Calibration:**

    * The predicted probabilities can be poorly calibrated.

    * Naive Bayes is good for classification decisions, not probability estimates.

* **Continuous Variables Handling:**

    * GaussianNB assumes normal distribution for continuous features.

    * If the data isn’t Gaussian-distributed, performance may drop.
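The zero-frequency problem and its fix can be illustrated with `MultinomialNB` and its `alpha` parameter. The word counts below are hypothetical, and `alpha=1e-9` is used as a stand-in for "no smoothing".

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

# Hypothetical word counts for the vocabulary ["offer", "win", "meeting"]
X_train = np.array([
    [2, 1, 0],  # spam
    [1, 2, 0],  # spam
    [0, 0, 3],  # not spam
    [1, 0, 2],  # not spam
])
y_train = np.array([1, 1, 0, 0])  # 1 = spam, 0 = not spam

# New email: "offer" twice and "meeting" once; "meeting" never appeared in spam
X_new = np.array([[2, 0, 1]])

almost_unsmoothed = MultinomialNB(alpha=1e-9).fit(X_train, y_train)
smoothed = MultinomialNB(alpha=1.0).fit(X_train, y_train)

# Column 1 of predict_proba corresponds to class 1 (spam)
p_spam_unsmoothed = almost_unsmoothed.predict_proba(X_new)[0, 1]
p_spam_smoothed = smoothed.predict_proba(X_new)[0, 1]
print(f"P(spam) with alpha=1e-9: {p_spam_unsmoothed:.2e}")
print(f"P(spam) with alpha=1.0:  {p_spam_smoothed:.2f}")
```

Without smoothing, the single word unseen in spam drives the spam probability to nearly zero even though "offer" appears twice. With `alpha=1.0` the same email gets a spam probability of 16 / (16 + 24) = 0.4 on these toy counts.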

## ✅ Summary of Naive Bayes Capabilities

* **Speed:** Very fast to train and predict.

* **Works on Small Datasets:** Performs well even with limited data.

* **Feature Independence Required:** Assumes features are independent (this is a limitation in real-world data).

* **Handles Text Data Well:** Excellent performance in text classification tasks like spam detection or sentiment analysis.

* **Continuous Feature Support:** Handled directly only by GaussianNB; the other variants expect count or binary features.

* **Parameter Tuning Complexity:** Low — has only a few hyperparameters, making it easy to tune.

* **Interpretability:** High — easy to understand the logic behind predictions.

In [1]:
# Importing Necessary Libraries
import numpy as np 
import pandas as pd 
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB, ComplementNB
from sklearn.metrics import accuracy_score
from sklearn.datasets import load_iris

In [2]:
# Load Dataset
data = load_iris()
X = data.data
y = data.target

In [3]:
X

array([[5.1, 3.5, 1.4, 0.2],
       [4.9, 3. , 1.4, 0.2],
       [4.7, 3.2, 1.3, 0.2],
       [4.6, 3.1, 1.5, 0.2],
       [5. , 3.6, 1.4, 0.2],
       [5.4, 3.9, 1.7, 0.4],
       [4.6, 3.4, 1.4, 0.3],
       [5. , 3.4, 1.5, 0.2],
       [4.4, 2.9, 1.4, 0.2],
       [4.9, 3.1, 1.5, 0.1],
       [5.4, 3.7, 1.5, 0.2],
       [4.8, 3.4, 1.6, 0.2],
       [4.8, 3. , 1.4, 0.1],
       [4.3, 3. , 1.1, 0.1],
       [5.8, 4. , 1.2, 0.2],
       [5.7, 4.4, 1.5, 0.4],
       [5.4, 3.9, 1.3, 0.4],
       [5.1, 3.5, 1.4, 0.3],
       [5.7, 3.8, 1.7, 0.3],
       [5.1, 3.8, 1.5, 0.3],
       [5.4, 3.4, 1.7, 0.2],
       [5.1, 3.7, 1.5, 0.4],
       [4.6, 3.6, 1. , 0.2],
       [5.1, 3.3, 1.7, 0.5],
       [4.8, 3.4, 1.9, 0.2],
       [5. , 3. , 1.6, 0.2],
       [5. , 3.4, 1.6, 0.4],
       [5.2, 3.5, 1.5, 0.2],
       [5.2, 3.4, 1.4, 0.2],
       [4.7, 3.2, 1.6, 0.2],
       [4.8, 3.1, 1.6, 0.2],
       [5.4, 3.4, 1.5, 0.4],
       [5.2, 4.1, 1.5, 0.1],
       [5.5, 4.2, 1.4, 0.2],
       [4.9, 3

In [4]:
y

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

In [5]:
# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

In [6]:
# Preprocessing: scale features using statistics learned from the training set only

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled  = scaler.transform(X_test)  # reuse the training mean/std; never refit on test data

In [7]:
# Gaussian Naive Bayes
gnb = GaussianNB()
gnb.fit(X_train_scaled, y_train)

y_pred_gnb = gnb.predict(X_test_scaled)
print(f"GaussianNB Accuracy: {accuracy_score(y_test, y_pred_gnb)}")


GaussianNB Accuracy: 0.9777777777777777


In [8]:
# Multinomial Naive Bayes
mnb = MultinomialNB()
mnb.fit(X_train, y_train)  # Unscaled features: MultinomialNB requires non-negative inputs
y_pred_mnb = mnb.predict(X_test)
print(f"MultinomialNB Accuracy: {accuracy_score(y_test, y_pred_mnb)}")

MultinomialNB Accuracy: 0.9555555555555556


In [9]:
# Bernoulli Naive Bayes (designed for binary/boolean features, so the Iris features are binarized first)

# Compute median of each column (feature)
thresholds = np.median(X_train, axis=0)
# Binarize based on the median threshold
X_train_bin = (X_train > thresholds).astype(int)
X_test_bin = (X_test > thresholds).astype(int)

# Train and evaluate BernoulliNB
bnb = BernoulliNB()
bnb.fit(X_train_bin, y_train)
y_pred_bnb = bnb.predict(X_test_bin)
print(f"BernoulliNB Accuracy: {accuracy_score(y_test, y_pred_bnb):.2f}")

BernoulliNB Accuracy: 0.78


### **Explanation of the Code:**

**Data Preprocessing:**

* The iris dataset is loaded using load_iris() from scikit-learn.

* Features (X) and target labels (y) are separated, and the data is divided into training and testing sets using train_test_split().

* Standardization is applied to the features using StandardScaler before Gaussian Naive Bayes. GaussianNB estimates a separate mean and variance for each feature and class, so it is largely insensitive to feature scale; the step is kept here mainly to show a typical preprocessing workflow.
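One way to check how much standardization matters for GaussianNB is to compare its predictions with and without it. This is a minimal sketch that reuses the same split settings as above; the variable names are introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Model trained on the raw features
pred_raw = GaussianNB().fit(X_train, y_train).predict(X_test)

# Model trained on standardized features (scaler fitted on the training set only)
scaler = StandardScaler().fit(X_train)
pred_scaled = GaussianNB().fit(scaler.transform(X_train), y_train).predict(scaler.transform(X_test))

agreement = np.mean(pred_raw == pred_scaled)
print(f"Share of identical predictions: {agreement:.2f}")
```

Rescaling a feature shifts and stretches each per-class Gaussian by the same amount, so the class ranking for a given sample is expected to stay essentially the same.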

**Model Training and Prediction:**

* GaussianNB is applied to scaled data, and the accuracy is measured using accuracy_score.

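* MultinomialNB is designed for count data and requires non-negative inputs, so the unscaled features are used (standardized features contain negative values). The Iris measurements are not counts, so this serves only as an illustration.

* BernoulliNB models binary features, so each feature is binarized at its training-set median before training.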

## Code Implementation: With make_pipeline

Now, we'll use make_pipeline to streamline the preprocessing and model fitting process in one step.

In [13]:
# Importing make_pipeline for preprocessing and modeling
from sklearn.pipeline import make_pipeline
# Gaussian Naive Bayes with pipeline
gnb_pipeline = make_pipeline(StandardScaler(), GaussianNB())

gnb_pipeline.fit(X_train, y_train)
y_pred_gnb_pipeline = gnb_pipeline.predict(X_test)
print(f"GaussianNB with Pipeline Accuracy: {accuracy_score(y_test, y_pred_gnb_pipeline)}")


GaussianNB with Pipeline Accuracy: 0.9777777777777777


In [None]:
# Multinomial Naive Bayes with pipeline
mnb_pipeline = make_pipeline(MultinomialNB())
mnb_pipeline.fit(X_train, y_train)
y_pred_mnb_pipeline = mnb_pipeline.predict(X_test)
print(f"MultinomialNB with Pipeline Accuracy: {accuracy_score(y_test, y_pred_mnb_pipeline)}")

MultinomialNB with Pipeline Accuracy: 0.9555555555555556


In [None]:
# Bernoulli Naive Bayes with pipeline
bnb_pipeline = make_pipeline(BernoulliNB())
# Compute median of each column (feature)
thresholds = np.median(X_train, axis=0)
# Binarize based on the median threshold
X_train_bin_pipeline = (X_train > thresholds).astype(int)
X_test_bin_pipeline = (X_test > thresholds).astype(int)
bnb_pipeline.fit(X_train_bin_pipeline, y_train)
y_pred_bnb_pipeline = bnb_pipeline.predict(X_test_bin_pipeline)
print(f"BernoulliNB with Pipeline Accuracy: {accuracy_score(y_test, y_pred_bnb_pipeline)}")

BernoulliNB with Pipeline Accuracy: 0.7777777777777778


### **Explanation of make_pipeline:**

**make_pipeline():** This utility helps to chain together steps such as preprocessing and model fitting, making the code more concise and easier to manage.

* **Gaussian Naive Bayes** is now trained within a pipeline that includes feature scaling.

* **Multinomial Naive Bayes and Bernoulli Naive Bayes** are wrapped in single-step pipelines; for BernoulliNB the median binarization is still done by hand before the pipeline is fitted.

With **make_pipeline**, any preprocessing step placed inside the pipeline (here, the scaler for GaussianNB) is fitted on the training data only and then applied consistently during both training and prediction.
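The median binarization used for BernoulliNB could also be moved inside the pipeline. The sketch below assumes a small custom transformer, `MedianBinarizer`, which is a hypothetical helper written for this example and not part of scikit-learn.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB
from sklearn.pipeline import make_pipeline


class MedianBinarizer(TransformerMixin, BaseEstimator):
    """Hypothetical helper: binarize each feature at its training-set median."""

    def fit(self, X, y=None):
        # Learn one threshold per feature from the data passed to fit
        self.thresholds_ = np.median(np.asarray(X), axis=0)
        return self

    def transform(self, X):
        # 1 where the value exceeds the learned threshold, else 0
        return (np.asarray(X) > self.thresholds_).astype(int)


X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Thresholds are learned during fit and reused automatically in predict
bnb_pipeline = make_pipeline(MedianBinarizer(), BernoulliNB())
bnb_pipeline.fit(X_train, y_train)
y_pred = bnb_pipeline.predict(X_test)
print(f"BernoulliNB (binarization inside pipeline): {accuracy_score(y_test, y_pred):.2f}")
```

With this design the thresholds always come from whatever data the pipeline was fitted on, which also makes the pipeline safe to use with cross-validation.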