# Naive Bayes Classifiers

Naive Bayes models are used only for classification. They are even faster than linear models and work well on very large datasets and high-dimensional data, but they are often less accurate than linear models.

Naive Bayes classifiers tend to train faster than linear models, at the cost of slightly worse generalization performance. They are efficient because they learn their parameters by looking at each feature individually and collecting simple per-class statistics from it.

There are three kinds of naive Bayes classifiers implemented in scikit-learn: GaussianNB, BernoulliNB, and MultinomialNB.

GaussianNB can be applied to any continuous data, BernoulliNB assumes binary data, and MultinomialNB assumes count data (each feature is an integer count, such as how often a word appears in a sentence).

In [2]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import mglearn



In [3]:
X = np.array([[0, 1, 0, 1], 
              [1, 0, 1, 1], 
              [0, 0, 0, 1], 
              [1, 0, 1, 0]])
y = np.array([0, 1, 0, 1])

In [4]:
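counts = {}
for label in np.unique(y):
    # select the rows belonging to this class, then sum each feature column;
    # since X is binary, the sum is the number of nonzero entries per feature
    counts[label] = X[y == label].sum(axis=0)
print("Feature counts:\n{}".format(counts))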

Feature counts:
{0: array([0, 1, 0, 2]), 1: array([2, 0, 2, 1])}


The BernoulliNB classifier counts how often each feature is nonzero in each class, which is what the feature counts above show.
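As a quick check, the same counts can be read back from a fitted model. This is a minimal sketch, assuming a scikit-learn version in which BernoulliNB exposes the `classes_` and `feature_count_` attributes. It repeats the toy `X` and `y` so that it runs on its own; the name `bnb` is chosen here for illustration.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

# Same toy data as in the cell above: four samples, four binary features.
X = np.array([[0, 1, 0, 1],
              [1, 0, 1, 1],
              [0, 0, 0, 1],
              [1, 0, 1, 0]])
y = np.array([0, 1, 0, 1])

# Fit with the default settings.
bnb = BernoulliNB().fit(X, y)

# One row per class, one column per feature: how many samples of that
# class had a nonzero value for that feature.
print("Classes: {}".format(bnb.classes_))
print("Feature counts:\n{}".format(bnb.feature_count_))
```

The rows of `feature_count_` should match the hand-computed dictionary, with one row for class 0 and one for class 1.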

The MultinomialNB classifier takes into account the average value of each feature for each class.

The GaussianNB classifier stores the average value as well as the standard deviation of each feature for each class.
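The per-class statistics for GaussianNB can be sketched on a small continuous dataset. The data below (`X_cont`, `y_cont`) is made up for illustration. The means and standard deviations are computed by hand with NumPy, and only the means are compared against the fitted model's `theta_` attribute, because the name of the variance attribute differs between scikit-learn versions.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Hypothetical toy data: four samples, two continuous features.
X_cont = np.array([[1.0, 2.0],
                   [3.0, 4.0],
                   [5.0, 0.0],
                   [7.0, 2.0]])
y_cont = np.array([0, 0, 1, 1])

# Per-class mean and standard deviation of each feature, computed by hand.
class_labels = np.unique(y_cont)
means = np.array([X_cont[y_cont == label].mean(axis=0) for label in class_labels])
stds = np.array([X_cont[y_cont == label].std(axis=0) for label in class_labels])
print("Per-class means:\n{}".format(means))
print("Per-class standard deviations:\n{}".format(stds))

# theta_ holds the per-class feature means learned by the model.
gnb = GaussianNB().fit(X_cont, y_cont)
print("Means match theta_: {}".format(np.allclose(gnb.theta_, means)))
```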

### Strengths, weaknesses, parameters

BernoulliNB and MultinomialNB have a single parameter, alpha, which controls model complexity. The algorithm adds alpha virtual data points that have positive values for all features, which smooths the per-class statistics. A larger alpha means more smoothing and therefore a less complex model.
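The effect of alpha can be illustrated with plain NumPy. The sketch below applies additive smoothing in its multinomial form, (count + alpha) / (class total + alpha * number of features), to the feature counts from the toy example earlier in this section. The helper `smoothed_probs` is hypothetical and written only for this illustration; it is not a scikit-learn function.

```python
import numpy as np

# Per-class feature counts from the toy example earlier in this section.
feature_counts = np.array([[0, 1, 0, 2],
                           [2, 0, 2, 1]], dtype=float)


def smoothed_probs(counts, alpha):
    """Additive smoothing: (count + alpha) / (class total + alpha * n_features)."""
    n_features = counts.shape[1]
    totals = counts.sum(axis=1, keepdims=True)
    return (counts + alpha) / (totals + alpha * n_features)


# Larger alpha pulls every row toward the uniform distribution.
for alpha in [0.0, 1.0, 10.0]:
    probs = smoothed_probs(feature_counts, alpha)
    print("alpha={}:\n{}".format(alpha, probs.round(3)))
```

With alpha set to 0, features that never occur in a class get probability zero. Any positive alpha removes those zeros, and as alpha grows the per-class probabilities move closer to each other, which is the "smoother", less complex model described above.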

GaussianNB is mostly used on very high-dimensional data, while BernoulliNB and MultinomialNB are widely used on sparse count data such as text.
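To make the text use case concrete, here is a minimal sketch that turns a few short documents into a sparse matrix of word counts with CountVectorizer and fits MultinomialNB on it. The corpus, the labels, and the query sentence are invented for illustration, and a real application would need far more data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy corpus; 1 = spam, 0 = not spam.
docs = ["cheap pills buy now",
        "buy cheap watches",
        "meeting at noon tomorrow",
        "lunch meeting tomorrow"]
labels = [1, 1, 0, 0]

# Each column is one vocabulary word, each entry a word count.
vectorizer = CountVectorizer()
X_counts = vectorizer.fit_transform(docs)  # SciPy sparse matrix
print("Count matrix shape: {}".format(X_counts.shape))

# Fit on the sparse counts and classify an unseen sentence.
clf = MultinomialNB(alpha=1.0).fit(X_counts, labels)
pred = clf.predict(vectorizer.transform(["cheap watches now"]))
print("Predicted label: {}".format(pred[0]))
```

The count matrix stays sparse throughout, which is part of what keeps this approach practical for large vocabularies.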

Naive Bayes models are great baseline models and are often used on very large datasets, where training even a linear model might take too long.