Naive Bayes:

Definition:
Naive Bayes is a family of simple probabilistic classifiers based on Bayes' Theorem. They assume that the features (input variables) are conditionally independent of one another given the class, which is why the method is called "naive." This assumption is often unrealistic for real-world data, yet Naive Bayes can perform surprisingly well, particularly in text classification tasks such as spam detection and sentiment analysis.
The algorithm calculates the posterior probability of each class given the observed features and selects the class with the highest posterior probability as the prediction.

Formula for Naive Bayes (Bayes' Theorem):
The general formula used in Naive Bayes classification is derived from Bayes' Theorem, which calculates the probability of a class given the features:
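P(C∣X) = P(X∣C) ⋅ P(C) / P(X)

where P(C∣X) is the posterior probability of class C given the features X, P(X∣C) is the likelihood of the features given the class, P(C) is the prior probability of the class, and P(X) is the evidence.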
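
To connect the formula to the "naive" assumption, it may help to write it out for a feature vector X = (x_1, ..., x_n). The block below is the standard textbook form rather than anything specific to this notebook. The symbols x_i, \mu_{C,i} and \sigma_{C,i} (the mean and standard deviation of feature i within class C) are notation introduced here for illustration, and the Gaussian density applies only to the Gaussian variant.

```latex
% Conditional independence lets the likelihood factor into per-feature terms:
P(C \mid x_1, \dots, x_n) = \frac{P(C)\,\prod_{i=1}^{n} P(x_i \mid C)}{P(X)}

% P(X) does not depend on C, so it can be dropped when comparing classes:
\hat{C} = \arg\max_{C} \; P(C) \prod_{i=1}^{n} P(x_i \mid C)

% Gaussian Naive Bayes models each per-feature likelihood as a normal density:
P(x_i \mid C) = \frac{1}{\sqrt{2\pi\sigma_{C,i}^{2}}}
  \exp\!\left(-\frac{(x_i - \mu_{C,i})^{2}}{2\sigma_{C,i}^{2}}\right)
```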


Algorithm:
1. Calculate the prior probability of each class from the training data.
2. Calculate the mean and standard deviation of each feature within each class.
3. For a test instance, calculate the likelihood of each feature value given the class using the Gaussian distribution.
4. Combine the prior and the per-feature likelihoods using Bayes' Theorem to obtain the posterior probability of each class.
5. Predict the class with the highest posterior probability.
6. Repeat steps 3 to 5 for every instance in the test data.
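
The steps above can be sketched directly in NumPy. This is a minimal illustration under stated assumptions, not scikit-learn's implementation: the function names `fit_gaussian_nb` and `predict_gaussian_nb`, the `eps` constant, and the tiny demo dataset are all hypothetical. It works with log-probabilities, a common choice to avoid numerical underflow when many small likelihoods are multiplied.

```python
import numpy as np


def fit_gaussian_nb(X, y, eps=1e-9):
    """Steps 1-2: estimate class priors and per-class feature means/variances."""
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    variances = np.array([X[y == c].var(axis=0) for c in classes])
    # eps is a hypothetical small constant that guards against zero variance.
    return classes, priors, means, variances + eps


def predict_gaussian_nb(X, classes, priors, means, variances):
    """Steps 3-6: score every class for every row and pick the best one."""
    # Broadcast to shape (n_samples, n_classes, n_features).
    diff = X[:, None, :] - means[None, :, :]
    log_likelihood = -0.5 * (np.log(2.0 * np.pi * variances)[None, :, :]
                             + diff ** 2 / variances[None, :, :])
    # Log posterior up to the constant -log P(X), which is shared by all classes.
    log_posterior = np.log(priors)[None, :] + log_likelihood.sum(axis=2)
    return classes[np.argmax(log_posterior, axis=1)]


# Tiny made-up dataset: two well-separated classes, two features.
X_demo = np.array([[1.0, 2.0], [1.2, 1.8], [0.8, 2.2],
                   [5.0, 6.0], [5.2, 5.8], [4.8, 6.2]])
y_demo = np.array([0, 0, 0, 1, 1, 1])

params = fit_gaussian_nb(X_demo, y_demo)
demo_pred = predict_gaussian_nb(np.array([[1.1, 2.1], [5.1, 5.9]]), *params)
print(demo_pred)
```

Each query point sits next to one of the two class means, so the first is assigned to class 0 and the second to class 1.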

In [1]:
# Step 1: Import necessary libraries
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

In [2]:
# Step 2: Load the dataset
iris = load_iris()
X = iris.data  # Features: Sepal length, sepal width, petal length, petal width
y = iris.target  # Target: species

In [3]:
# Step 3: Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

In [4]:
# Step 4: Initialize the Naive Bayes classifier (GaussianNB is used for continuous data)
nb_classifier = GaussianNB()

In [5]:
# Step 5: Train the Naive Bayes model
nb_classifier.fit(X_train, y_train)

In [6]:
# Step 6: Make predictions on the test set
y_pred = nb_classifier.predict(X_test)
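
The predicted label is the class with the highest posterior probability, and the posteriors themselves can be inspected with `predict_proba`. The sketch below repeats the setup from the earlier cells so that it runs on its own. The variable names `proba` and `labels_from_proba` are introduced here for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Same data, split, and model as in the cells above.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)
nb_classifier = GaussianNB().fit(X_train, y_train)

# One row per test instance, one column per class; each row sums to 1.
proba = nb_classifier.predict_proba(X_test)

# Taking the argmax over the columns recovers the predicted labels.
labels_from_proba = nb_classifier.classes_[proba.argmax(axis=1)]

print(proba[:3].round(3))
print((labels_from_proba == nb_classifier.predict(X_test)).all())
```

Looking at the probabilities rather than only the labels shows how confident the model is for each instance, which can be useful when borderline cases matter.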

In [7]:
# Step 7: Evaluate the model performance
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.4f}")

Accuracy: 0.9778


In [8]:
# Confusion Matrix
print("\nConfusion Matrix:")
print(confusion_matrix(y_test, y_pred))

# Classification Report
print("\nClassification Report:")
print(classification_report(y_test, y_pred))


Confusion Matrix:
[[19  0  0]
 [ 0 12  1]
 [ 0  0 13]]

Classification Report:
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        19
           1       1.00      0.92      0.96        13
           2       0.93      1.00      0.96        13

    accuracy                           0.98        45
   macro avg       0.98      0.97      0.97        45
weighted avg       0.98      0.98      0.98        45
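
The figures in the classification report follow from the confusion matrix. As a rough check, the sketch below recomputes them by hand from the matrix printed above (rows are true classes, columns are predicted classes). The variable names are chosen here for illustration.

```python
import numpy as np

# Confusion matrix copied from the output above.
cm = np.array([[19, 0, 0],
               [0, 12, 1],
               [0, 0, 13]])

true_positives = np.diag(cm)
precision = true_positives / cm.sum(axis=0)  # column sums: predicted per class
recall = true_positives / cm.sum(axis=1)     # row sums: true instances per class
f1 = 2 * precision * recall / (precision + recall)
accuracy = true_positives.sum() / cm.sum()   # 44 correct out of 45

print(np.round(precision, 2))
print(np.round(recall, 2))
print(np.round(f1, 2))
print(round(float(accuracy), 4))
```

The single off-diagonal entry is one class 1 flower predicted as class 2. It lowers the recall of class 1 to 12/13 and the precision of class 2 to 13/14, and it accounts for the overall accuracy of 44/45.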

