Gaussian Naïve Bayes (GNB)

Assumes each feature follows a Gaussian (normal) distribution within each class.

Best suited for continuous data (like Iris).

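Formula used for each feature:

$$P(x_i \mid y) = \frac{1}{\sqrt{2\pi\sigma_y^2}} \exp\left(-\frac{(x_i - \mu_y)^2}{2\sigma_y^2}\right)$$

where $\mu_y$ and $\sigma_y^2$ are the mean and variance of feature $x_i$ among the training samples of class $y$.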
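
As a rough sketch of how the per-feature Gaussian likelihood turns into a prediction, the snippet below estimates a mean and variance for every (class, feature) pair on Iris and scores one sample by hand. The names `gaussian_pdf`, `means`, `variances`, `priors`, and `log_scores` are illustrative, not part of scikit-learn. The sketch also omits the small variance smoothing that `GaussianNB` applies, so the numbers may differ slightly from the library's.

```python
import numpy as np
from sklearn.datasets import load_iris


def gaussian_pdf(x, mean, var):
    """Univariate normal density, evaluated element-wise (illustrative helper)."""
    return np.exp(-((x - mean) ** 2) / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)


X, y = load_iris(return_X_y=True)
classes = np.unique(y)

# Per-class mean and variance of every feature (shape: n_classes x n_features)
means = np.array([X[y == c].mean(axis=0) for c in classes])
variances = np.array([X[y == c].var(axis=0) for c in classes])

# Class priors estimated from class frequencies
priors = np.array([np.mean(y == c) for c in classes])

sample = X[0]  # first Iris row, labelled class 0 (setosa)

# Naive Bayes score per class: log prior + sum of per-feature log likelihoods
log_scores = np.log(priors) + np.log(gaussian_pdf(sample, means, variances)).sum(axis=1)
predicted_class = classes[np.argmax(log_scores)]
print(predicted_class)
```

Working in log space avoids multiplying many small probabilities together, which is the usual reason Naive Bayes implementations sum log likelihoods rather than multiply densities.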

Multinomial Naïve Bayes (MNB)

Assumes features are counts or frequencies (like word counts in text).

Typically used in text classification.

Not ideal for continuous features, unless transformed (e.g., discretized or scaled as counts).
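
One possible way to apply such a transformation is to bin each continuous feature and feed the one-hot bin indicators to MultinomialNB. The sketch below assumes scikit-learn's `KBinsDiscretizer` with 5 uniform-width bins. The bin count, the binning strategy, and the names `binned_mnb`, `y_pred_binned`, and `accuracy_binned` are illustrative choices, not something tuned or taken from the analysis here.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import KBinsDiscretizer

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Discretize each feature into 5 equal-width bins (an assumed setting),
# then one-hot encode so every input is a 0/1 count-like indicator.
binned_mnb = make_pipeline(
    KBinsDiscretizer(n_bins=5, encode="onehot-dense", strategy="uniform"),
    MultinomialNB(),
)
binned_mnb.fit(X_train, y_train)

y_pred_binned = binned_mnb.predict(X_test)
accuracy_binned = accuracy_score(y_test, y_pred_binned)
print(accuracy_binned)
```

Putting the discretizer inside the pipeline means the bin edges are learned from the training split only, so no information from the test split leaks into preprocessing.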

In [1]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB, MultinomialNB
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Load the Iris dataset: 150 samples, 4 continuous features, 3 classes
iris = load_iris()
X = iris.data  # feature matrix
y = iris.target  # target labels

# Hold out 20% of the samples (30 rows) for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

gnb = GaussianNB()
mnb = MultinomialNB()

# Fit both models on the same training split.
# MultinomialNB requires non-negative inputs, which the Iris measurements satisfy.
gnb.fit(X_train, y_train)
mnb.fit(X_train, y_train)

# Predict on the held-out test set
y_pred_gnb = gnb.predict(X_test)
y_pred_mnb = mnb.predict(X_test)

# Evaluate GaussianNB
accuracy_gnb = accuracy_score(y_test, y_pred_gnb)
conf_matrix_gnb = confusion_matrix(y_test, y_pred_gnb)
class_report_gnb = classification_report(y_test, y_pred_gnb)

# Evaluate MultinomialNB
accuracy_mnb = accuracy_score(y_test, y_pred_mnb)
conf_matrix_mnb = confusion_matrix(y_test, y_pred_mnb)
class_report_mnb = classification_report(y_test, y_pred_mnb)

# Display all metrics as the cell output
accuracy_gnb, conf_matrix_gnb, class_report_gnb, accuracy_mnb, conf_matrix_mnb, class_report_mnb

(1.0,
 array([[10,  0,  0],
        [ 0,  9,  0],
        [ 0,  0, 11]]),
 '              precision    recall  f1-score   support\n\n           0       1.00      1.00      1.00        10\n           1       1.00      1.00      1.00         9\n           2       1.00      1.00      1.00        11\n\n    accuracy                           1.00        30\n   macro avg       1.00      1.00      1.00        30\nweighted avg       1.00      1.00      1.00        30\n',
 0.9,
 array([[10,  0,  0],
        [ 0,  9,  0],
        [ 0,  3,  8]]),
 '              precision    recall  f1-score   support\n\n           0       1.00      1.00      1.00        10\n           1       0.75      1.00      0.86         9\n           2       1.00      0.73      0.84        11\n\n    accuracy                           0.90        30\n   macro avg       0.92      0.91      0.90        30\nweighted avg       0.93      0.90      0.90        30\n')

Gaussian Naive Bayes (GNB) Results

Accuracy: 1.0 → Perfect classification (100% correct predictions).

👉 Why so good?
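GaussianNB assumes each feature follows a Gaussian (normal) distribution within each class. That matches the Iris dataset pretty well, since its features (like petal length/width) are continuous and roughly normally distributed per species. The model therefore separates all three classes cleanly here, though a perfect score on only 30 test samples should be read with some caution.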

Multinomial Naive Bayes (MNB) Results

Accuracy: 0.90 → 90% correct predictions.

👉 Why weaker?
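MultinomialNB assumes features are counts (non-negative integers), like word frequencies in text classification. The Iris features are continuous measurements rather than counts, so the multinomial model is a poor fit for them. The confusion matrix shows where this costs accuracy: 3 of the 11 class-2 samples are misclassified as class 1, while classes 0 and 1 are still predicted correctly.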
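
Because a single 80/20 split leaves only 30 test samples, the gap between the two models can shift with the split. A hedged way to check how stable the comparison is would be stratified 5-fold cross-validation, sketched below. The fold count, the shuffle seed, and the names `cv` and `cv_scores` are assumptions for illustration, and no particular scores are claimed.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB, MultinomialNB

X, y = load_iris(return_X_y=True)

# Stratified folds keep the three classes balanced in every split
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

models = {"GaussianNB": GaussianNB(), "MultinomialNB": MultinomialNB()}

# One accuracy value per fold for each model
cv_scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    for name, model in models.items()
}

for name, scores in cv_scores.items():
    print(f"{name}: mean={scores.mean():.3f}, std={scores.std():.3f}")
```

Comparing the per-fold means and standard deviations gives a steadier picture than one train/test split, since every sample is used for testing exactly once.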