## Naive Bayes Classifiers

Naive Bayes classifiers are a family of classifiers that are quite similar to linear models.

They tend to be even faster to train than linear models.

The price paid for this efficiency is that naive Bayes models often provide generalization performance slightly worse than that of linear classifiers like `LogisticRegression` and `LinearSVC`.

Naive Bayes models are so efficient because they learn their parameters by looking at each feature individually and collecting simple per-class statistics from each feature.
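To make "simple per-class statistics" concrete, here is a minimal NumPy sketch. The toy data (`X_toy`, `y_toy`) and the choice of mean, variance, and class prior as the collected statistics are assumptions for illustration, in the spirit of a Gaussian-style model; this is not the scikit-learn implementation.

```python
import numpy as np

# Hypothetical toy data: 6 samples, 2 continuous features, 2 classes.
X_toy = np.array([
    [1.0, 2.0],
    [1.2, 1.8],
    [0.8, 2.2],
    [3.0, 0.5],
    [3.2, 0.7],
    [2.8, 0.3],
])
y_toy = np.array([0, 0, 0, 1, 1, 1])

stats = {}
for label in np.unique(y_toy):
    rows = X_toy[y_toy == label]          # samples belonging to this class
    stats[int(label)] = {
        "mean": rows.mean(axis=0),        # one mean per feature
        "var": rows.var(axis=0),          # one variance per feature
        "prior": len(rows) / len(X_toy),  # fraction of samples in this class
    }

for label, s in stats.items():
    print(label, s)
```

Each feature is summarized on its own, one class at a time, so a single pass over the data is enough. That is where the training speed comes from.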

There are three kinds of naive Bayes classifiers implemented in `scikit-learn`:

* `GaussianNB`
* `BernoulliNB`
* `MultinomialNB`
    
`GaussianNB` can be applied to any continuous data, while `BernoulliNB` assumes binary data and `MultinomialNB` assumes count data (that is, each feature represents an integer count of something, such as how often a word appears in a sentence).
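As a rough illustration of matching each classifier to its data type, the sketch below fits all three on randomly generated data. The arrays (`X_continuous`, `X_binary`, `X_counts`, `y_demo`) and the distributions used to generate them are made up for the example; only the three estimator classes come from `scikit-learn`.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

rng = np.random.default_rng(0)
y_demo = np.array([0, 1] * 10)  # 20 hypothetical labels

# Hypothetical features of each kind (values are random, not real data).
X_continuous = rng.normal(loc=y_demo[:, None], scale=1.0, size=(20, 3))
X_binary = rng.integers(0, 2, size=(20, 3))   # entries are 0 or 1
X_counts = rng.poisson(lam=3.0, size=(20, 3)) # non-negative integer counts

pred_shapes = {}
for name, model, data in [
    ("GaussianNB", GaussianNB(), X_continuous),
    ("BernoulliNB", BernoulliNB(), X_binary),
    ("MultinomialNB", MultinomialNB(), X_counts),
]:
    model.fit(data, y_demo)
    pred_shapes[name] = model.predict(data).shape
    print(name, pred_shapes[name])
```

All three share the same `fit` / `predict` interface, so switching between them is mostly a question of which assumption fits the features.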

`BernoulliNB` and `MultinomialNB` are mostly used for text data classification.

* The `BernoulliNB` classifier counts, for each class, how often each feature is nonzero. The following example makes this concrete:

In [16]:
import numpy as np

X = np.array([
    [0, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 0, 1, 0]
])

y = np.array([0, 1, 0, 1])

In [25]:
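# For each class, count how many of its samples have a nonzero entry in each feature.
counts = {}

for label in np.unique(y):
    # The boolean mask y == label selects the rows of this class;
    # summing over axis 0 adds up the ones in each feature column.
    counts[label] = X[y == label].sum(axis=0)

print("Feature counts: {}".format(counts))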

Feature counts: {0: array([0, 1, 0, 2]), 1: array([2, 0, 2, 1])}
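As a cross-check, the manual counts above can be compared with what a fitted `BernoulliNB` stores in its `feature_count_` attribute. This is a minimal sketch that assumes the default `BernoulliNB` settings and redefines `X` and `y` so that it runs on its own; the name `manual` is introduced here for illustration.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

X = np.array([
    [0, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 0, 1, 0]
])
y = np.array([0, 1, 0, 1])

# Manual per-class counts, stacked into one row per class.
manual = np.array([X[y == label].sum(axis=0) for label in np.unique(y)])

model = BernoulliNB().fit(X, y)

print(model.classes_)        # class labels, in the row order of feature_count_
print(model.feature_count_)  # per-class nonzero counts collected during fit
print(np.array_equal(manual, model.feature_count_))
```

The fitted model keeps one row of counts per class, in the order given by `classes_`, which is the same information as the `counts` dictionary.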


In [10]:
for label in np.unique(y):
    print(y==label)

[ True False  True False]
[False  True False  True]


In [9]:
np.unique(y)

array([0, 1])

In [17]:
X[1,0]

1

In [18]:
X[0,0]

0

In [23]:
X[1].sum()

3

In [24]:
X[0].sum()

2