In [1]:
import pandas as pd 

## 🔍 What is Naive Bayes?

Naive Bayes is a **probabilistic machine learning algorithm** based on **Bayes’ Theorem**, primarily used for **classification tasks**.

**It assumes:**

* **Conditional independence of features** given the class (the "naive" assumption).

* As a consequence, **each feature contributes independently** to the probability of a class.

It calculates the **posterior probability** of each class based on observed features and chooses the class with the highest probability.

$$P(\text{Class} \mid \text{Features}) = \frac{P(\text{Features} \mid \text{Class}) \, P(\text{Class})}{P(\text{Features})}$$
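
As a worked example of the formula above, the sketch below computes a posterior for a single feature. All numbers are made up for illustration: a 30% spam prior, and assumed rates at which the word "offer" appears in spam and non-spam emails.

```python
# Hypothetical numbers, chosen only to illustrate Bayes' Theorem.
p_spam = 0.3                 # P(Class = spam), the prior
p_ham = 1 - p_spam           # P(Class = not spam)
p_offer_given_spam = 0.6     # P(Feature | spam)
p_offer_given_ham = 0.05     # P(Feature | not spam)

# P(Features): total probability of seeing the word "offer" in any email.
p_offer = p_offer_given_spam * p_spam + p_offer_given_ham * p_ham

# Posterior: P(spam | "offer") = P("offer" | spam) * P(spam) / P("offer")
p_spam_given_offer = p_offer_given_spam * p_spam / p_offer

print(f"P(spam | 'offer') = {p_spam_given_offer:.3f}")
```

With these assumed inputs the posterior is 0.18 / 0.215, roughly 0.837, so seeing "offer" raises the spam probability well above the 0.3 prior.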
​
 
## ❓ Why Use Naive Bayes?

* **✅ Simple & Fast:** Quick to train and predict, even on large datasets.

* **✅ Low Memory Usage:** Stores only simple statistics.

* **✅ Performs well with high-dimensional data**, especially in text classification (e.g., spam detection).

* **✅ Works well with small datasets.**

## 📅 When to Use Naive Bayes?

Use Naive Bayes when:

* You have **text data** (emails, articles, reviews).

* You need a **baseline model** to compare with other classifiers.

* You want a **fast model** for real-time or streaming applications.

* The features are **conditionally independent** (or nearly so).

**Common Applications:**

* Spam Filtering

* Sentiment Analysis

* Document Classification

* Medical Diagnosis (with careful feature handling)

## 🧠 How Does Naive Bayes Solve Problems?

Let’s take **email spam** classification as an example.

Suppose we want to classify an email as spam or not spam using word frequencies. Naive Bayes:

* Calculates the probability of the email being spam given the words in it.

* Estimates these probabilities from training data.

* Relies on word occurrence patterns (like "offer", "win", etc.) without knowing grammar or context.

Each word contributes independently to the final prediction.
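
The steps above can be sketched from scratch in a few lines. This is a minimal illustration, not a production classifier: the four training emails are invented, the helper names `log_score` and `classify` are hypothetical, and add-one smoothing is assumed so that no word probability is ever zero.

```python
import math
from collections import Counter

# Tiny made-up training set of (text, label) pairs.
train = [
    ("win a free offer now", "spam"),
    ("free prize offer win", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("lunch meeting with the team", "ham"),
]

# Count words per class and documents per class.
word_counts = {"spam": Counter(), "ham": Counter()}
doc_counts = Counter()
for text, label in train:
    doc_counts[label] += 1
    word_counts[label].update(text.split())

vocab = set(word_counts["spam"]) | set(word_counts["ham"])


def log_score(text, label, alpha=1.0):
    """Return log P(label) plus the sum of log P(word | label) over known words."""
    total_words = sum(word_counts[label].values())
    score = math.log(doc_counts[label] / sum(doc_counts.values()))
    for word in text.split():
        if word not in vocab:
            continue  # words never seen in training are ignored
        # Add-alpha smoothing keeps unseen (word, class) pairs above zero.
        p = (word_counts[label][word] + alpha) / (total_words + alpha * len(vocab))
        score += math.log(p)  # independence: per-word terms simply add up
    return score


def classify(text):
    """Pick the class with the highest log score."""
    return max(("spam", "ham"), key=lambda label: log_score(text, label))


print(classify("free offer"))
print(classify("team meeting tomorrow"))
```

Working in log space turns the product of per-word probabilities into a sum, which avoids numerical underflow on longer emails.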

## ⚠️ Issues & Limitations of Naive Bayes

* **Strong Independence Assumption:**

    * Assumes that features (e.g., words) are independent, which is rarely true in practice.

    * Despite this, it performs surprisingly well in many applications.

* **Zero Frequency Problem:**

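    * If a word in the test data never appeared with a given class in the training data, its estimated probability for that class is zero, which drives the whole product for that class to zero.

    * **Solution:** Use **Laplace smoothing** (the `alpha` hyperparameter in scikit-learn).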

* **Not Ideal for Correlated Features:**

    * If your features are correlated (e.g., temperature and humidity), Naive Bayes effectively counts the same evidence more than once, which can skew its predictions.

* **Poor Probabilistic Calibration:**

    * The predicted probabilities can be poorly calibrated, often pushed towards 0 or 1.

    * The predicted class is often still correct, so Naive Bayes is better suited to classification decisions than to probability estimates.

* **Continuous Variables Handling:**

    * GaussianNB assumes that each continuous feature is normally distributed within each class.

    * If the data isn’t Gaussian-distributed, performance may drop.
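
Returning to the zero frequency problem listed above, its effect and the smoothing fix can be shown with plain arithmetic. The word counts here are hypothetical, and `word_prob` and `likelihood` are illustrative helpers; the formula is the standard add-alpha estimate, (count + alpha) / (total + alpha × vocabulary size).

```python
# Hypothetical counts of each word in the spam class of a training set.
spam_word_counts = {"offer": 3, "win": 2, "free": 1}
vocab = ["offer", "win", "free", "meeting"]  # "meeting" never appears in spam
total = sum(spam_word_counts.values())


def word_prob(word, alpha):
    """Estimate P(word | spam) with add-alpha smoothing (alpha=0 disables it)."""
    count = spam_word_counts.get(word, 0)
    return (count + alpha) / (total + alpha * len(vocab))


def likelihood(words, alpha):
    """Multiply per-word probabilities, as the naive assumption prescribes."""
    result = 1.0
    for word in words:
        result *= word_prob(word, alpha)
    return result


email = ["free", "offer", "meeting"]
print(likelihood(email, alpha=0))  # one unseen word zeroes the whole product
print(likelihood(email, alpha=1))  # smoothing keeps the product positive
```

In scikit-learn, `MultinomialNB`, `BernoulliNB`, and `ComplementNB` expose this smoothing strength as the `alpha` parameter, which defaults to 1.0.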

## ✅ Summary of Naive Bayes Capabilities

* **Speed:** Very fast to train and predict.

* **Works on Small Datasets:** Performs well even with limited data.

* **Feature Independence Assumed:** Assumes features are independent given the class, which rarely holds exactly in real-world data.

* **Handles Text Data Well:** Excellent performance in text classification tasks like spam detection or sentiment analysis.

* **Continuous Feature Support:** Among the scikit-learn variants, GaussianNB is the one designed for continuous features; the others expect counts or binary features.

* **Parameter Tuning Complexity:** Low. There are only a few hyperparameters, which makes tuning easy.

* **Interpretability:** High — easy to understand the logic behind predictions.
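
To make the continuous-feature point in the summary concrete, here is a minimal sketch of how a Gaussian Naive Bayes model can score one continuous feature: it stores a mean and a variance per class and evaluates the normal density at the new value. The measurements and class names are invented for illustration, and this is a simplification rather than scikit-learn's actual implementation.

```python
import math
from statistics import mean, pvariance

# Hypothetical petal lengths (cm) for two made-up classes.
samples = {
    "short": [1.4, 1.5, 1.3, 1.6],
    "long": [4.5, 4.7, 4.4, 4.9],
}

# The only statistics stored per class: mean and variance.
stats = {label: (mean(xs), pvariance(xs)) for label, xs in samples.items()}


def gaussian_pdf(x, mu, var):
    """Density of a normal distribution with mean mu and variance var at x."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def predict(x):
    """Pick the class whose Gaussian gives x the highest density."""
    # Both classes have four samples, so the priors are equal and cancel out.
    return max(stats, key=lambda label: gaussian_pdf(x, *stats[label]))


print(predict(1.45))
print(predict(4.6))
```

This also shows why memory usage stays low: regardless of how many training rows there are, the model keeps only two numbers per feature per class.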

In [4]:
# Importing Necessary Libraries
import numpy as np 
import pandas as pd 
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB, ComplementNB
from sklearn.metrics import accuracy_score
from sklearn.datasets import load_iris