### Naive Bayes Classifier

Naive Bayes is a **supervised machine learning algorithm** that applies **Bayes' Theorem** with a strong (naive) assumption of **conditional independence between features** given the class. It calculates the probability of each class given the input features and predicts the class with the highest posterior probability. Despite its simplicity, Naive Bayes often performs surprisingly well, especially on high-dimensional datasets.

It is widely used in **text classification** problems such as **spam filtering**, **sentiment analysis**, and **document categorization**. It is also suitable for medical diagnosis, recommendation systems, and other scenarios where speed and scalability matter more than absolute predictive accuracy.
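
The prediction rule described above can be written out explicitly. The symbols here are introduced only for illustration: $y$ denotes a class label and $x_1, \dots, x_n$ the feature values of one sample. Bayes' Theorem gives the posterior:

$$
P(y \mid x_1, \dots, x_n) = \frac{P(y)\, P(x_1, \dots, x_n \mid y)}{P(x_1, \dots, x_n)}
$$

Under the naive independence assumption the likelihood factorizes, and because the denominator does not depend on $y$ it can be dropped when comparing classes:

$$
P(y \mid x_1, \dots, x_n) \propto P(y) \prod_{i=1}^{n} P(x_i \mid y),
\qquad
\hat{y} = \arg\max_{y} \; P(y) \prod_{i=1}^{n} P(x_i \mid y)
$$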

**Pros:**
- Very fast and efficient, especially on large datasets.
- Performs well with high-dimensional data (e.g., text data).
- Easy to implement and interpret.
- Requires less training data compared to more complex models.
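
To make the text-classification use case concrete, here is a minimal sketch that assumes a bag-of-words representation with scikit-learn's `CountVectorizer` and `MultinomialNB`. The four training messages and their labels are made up for illustration and are not a real dataset.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy corpus, invented purely for illustration
train_texts = [
    "win a free prize now",
    "free money claim your prize",
    "meeting scheduled for monday",
    "please review the project report",
]
train_labels = ["spam", "spam", "ham", "ham"]

# CountVectorizer turns each message into word counts (one feature per word);
# MultinomialNB models those counts per class with Laplace smoothing.
text_clf = make_pipeline(CountVectorizer(), MultinomialNB())
text_clf.fit(train_texts, train_labels)

new_texts = ["claim your free prize", "project meeting on monday"]
predictions = text_clf.predict(new_texts)
print(list(zip(new_texts, predictions)))
```

Each distinct word becomes its own feature, so even this tiny corpus yields more features than samples, which is the high-dimensional setting mentioned in the list above.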

**Cons:**
- Assumes feature independence, which is rarely true in real-world data.
- May perform poorly when features are highly correlated.
- Doesn’t handle continuous variables directly: they need either a distributional assumption (e.g., Gaussian likelihoods) or binning into discrete values.
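
The two ways of handling continuous features mentioned in the last point can be compared side by side. This is only a sketch under stated assumptions: the Iris dataset, `N_BINS = 5`, and equal-width (`strategy="uniform"`) bins are arbitrary illustrative choices, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import CategoricalNB, GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import KBinsDiscretizer

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

N_BINS = 5  # illustrative choice, not tuned

# Option 1: assume each feature is Gaussian within each class
gaussian_nb = GaussianNB()

# Option 2: discretize each feature into equal-width bins, then treat the
# bin index as a categorical feature. min_categories ensures every bin has
# a probability estimate even if it was empty in the training split.
binned_nb = make_pipeline(
    KBinsDiscretizer(n_bins=N_BINS, encode="ordinal", strategy="uniform"),
    CategoricalNB(min_categories=N_BINS),
)

results = {}
for name, clf in [("gaussian", gaussian_nb), ("binned", binned_nb)]:
    clf.fit(X_train, y_train)
    results[name] = accuracy_score(y_test, clf.predict(X_test))
    print(f"{name}: accuracy = {results[name]:.3f}")
```

Which option works better depends on the data: the Gaussian assumption needs no tuning but can be a poor fit for skewed or multimodal features, while binning makes no shape assumption but discards information within each bin.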

In [3]:
%pip install --quiet pandas numpy matplotlib seaborn scikit-learn


import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

Note: you may need to restart the kernel to use updated packages.





In [4]:
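from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, classification_report

# Load the Iris dataset: 150 samples, 4 continuous features, 3 classes
X, y = load_iris(return_X_y=True)

# Hold out 20% of the samples for testing; the fixed seed makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# GaussianNB models each feature as normally distributed within each class
model = GaussianNB()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Report overall accuracy and per-class precision, recall, and F1
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Classification Report:\n", classification_report(y_test, y_pred))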


Accuracy: 1.0
Classification Report:
               precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      1.00      1.00         9
           2       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30

