Naive Bayes classifiers are quite similar to linear models. They tend to be even faster to train, but their generalization performance is often slightly worse.

The reason that naive Bayes models are so efficient is that they learn parameters by
looking at each feature individually and collecting simple per-class statistics from each
feature.

There are three types of naive Bayes classifiers: **GaussianNB, BernoulliNB, and MultinomialNB**.

GaussianNB can be applied to any **continuous data**, while BernoulliNB assumes **binary data**
and MultinomialNB assumes **count data** (that is, that each feature represents an integer count
of something, like how often a word appears in a sentence). BernoulliNB and MultinomialNB
are mostly used in text data classification.
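
To make the difference between count data and binary data concrete, here is a minimal sketch that builds both representations from a few sentences. The sentences and the vocabulary are hypothetical toy values chosen only for illustration, and the whitespace splitting is a simplification of real text preprocessing.

```python
import numpy as np

# Hypothetical toy sentences and vocabulary, chosen only for illustration.
sentences = ["the cat sat on the mat", "the dog sat", "cat and dog"]
vocabulary = ["the", "cat", "dog", "sat"]

# Count data: how often each vocabulary word appears in each sentence.
X_counts = np.array([[sentence.split().count(word) for word in vocabulary]
                     for sentence in sentences])

# Binary data: whether each vocabulary word appears at all.
X_binary = (X_counts > 0).astype(int)

print("Count features:\n{}".format(X_counts))
print("Binary features:\n{}".format(X_binary))
```

The count matrix is the kind of input MultinomialNB assumes, while the binarized matrix matches what BernoulliNB assumes.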

The BernoulliNB classifier counts how often **every feature of each class is not zero.**

MultinomialNB takes into account the **average value** of each feature for each class,
while GaussianNB stores the **average value** as well as the **standard deviation** of
each feature for each class.

To make a prediction, a data point is compared to the statistics for each of the classes, and the best matching class is predicted.
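
The idea of storing per-class statistics and picking the best matching class can be sketched for the Gaussian case as follows. This is a simplified illustration and not the library's implementation: the toy data, the variable names, and the `predict` helper are assumptions made for this example, and it assumes every per-class standard deviation is non-zero.

```python
import numpy as np

# Hypothetical toy data: two continuous features, two classes.
X_train = np.array([[1.0, 2.0],
                    [1.2, 1.8],
                    [0.8, 2.2],
                    [3.0, 0.5],
                    [3.2, 0.7],
                    [2.8, 0.3]])
y_train = np.array([0, 0, 0, 1, 1, 1])

classes = np.unique(y_train)
# Per-class statistics: mean and standard deviation of each feature.
means = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
stds = np.array([X_train[y_train == c].std(axis=0) for c in classes])
# Fraction of training points that belong to each class.
priors = np.array([np.mean(y_train == c) for c in classes])


def predict(x):
    """Return the class whose stored statistics best match the point x."""
    # Log-density of x under one independent Gaussian per feature, per class.
    log_likelihood = -0.5 * np.sum(
        np.log(2 * np.pi * stds ** 2) + ((x - means) / stds) ** 2, axis=1)
    # Combine with the class frequencies and pick the highest score.
    return classes[np.argmax(log_likelihood + np.log(priors))]


print(predict(np.array([1.1, 2.1])))
print(predict(np.array([2.9, 0.4])))
```

Each feature contributes to the score independently, which is why only a mean and a standard deviation per feature and class need to be stored.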

In [2]:
import numpy as np

In [3]:
X = np.array([[0, 1, 0, 1],
              [1, 0, 1, 1],
              [0, 0, 0, 1],
              [1, 0, 1, 0]])
y = np.array([0, 1, 0, 1])
counts = {}
# iterate over each class
for label in np.unique(y):
    # count (sum) entries of 1 per feature
    counts[label] = X[y == label].sum(axis=0)
print("Feature counts:\n{}".format(counts))

Feature counts:
{0: array([0, 1, 0, 2]), 1: array([2, 0, 2, 1])}


MultinomialNB and BernoulliNB have a single parameter, alpha, which controls
model complexity. The way alpha works is that the algorithm adds to the data alpha many
virtual data points that have positive values for all the features. This results in a
“smoothing” of the statistics. A large alpha means more smoothing, resulting in less
complex models. The algorithm’s performance is relatively robust to the setting of
alpha, meaning that setting alpha is not critical for good performance. However,
tuning it usually improves accuracy somewhat.
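
The effect of alpha on the per-class statistics can be sketched with the same toy data as in the counting example. The sketch below assumes one common form of smoothing, additive (Laplace) smoothing of the non-zero counts. The `nonzero_fraction` helper is a hypothetical name used only here, and the exact formula is an assumption rather than something stated in these notes.

```python
import numpy as np

X = np.array([[0, 1, 0, 1],
              [1, 0, 1, 1],
              [0, 0, 0, 1],
              [1, 0, 1, 0]])
y = np.array([0, 1, 0, 1])


def nonzero_fraction(X, y, label, alpha):
    """Smoothed fraction of samples in class `label` where each feature is non-zero."""
    X_class = X[y == label]
    nonzero_counts = (X_class != 0).sum(axis=0)
    # Additive smoothing (assumed form): alpha pseudo-counts for "non-zero"
    # and alpha pseudo-counts for "zero" are added to every feature.
    return (nonzero_counts + alpha) / (X_class.shape[0] + 2 * alpha)


for alpha in [0.0, 1.0, 100.0]:
    print("alpha={}: {}".format(alpha, nonzero_fraction(X, y, 0, alpha)))
```

With alpha set to 0 the fractions are the raw per-class frequencies. As alpha grows, every fraction is pulled toward 0.5, so the classes look more alike and the model becomes less complex.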