## Naive Bayes Classifiers

*Naive Bayes* classifiers are a family of classifiers that are quite similar to linear models. However, they tend to be even faster to train. The price paid for this efficiency is that Naive Bayes models often generalize slightly worse than linear classifiers such as *LogisticRegression* and *LinearSVC*.

Naive Bayes models are so efficient because they learn parameters by looking at each feature individually and collecting simple per-class statistics from each feature.

scikit-learn implements several Naive Bayes classifiers; the three covered here are:
- GaussianNB (can be applied to any continuous data)
- BernoulliNB (assumes binary data)
- MultinomialNB (assumes integer count data, like how many times a word appears in a sentence)
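
As a rough illustration of how each variant pairs with a kind of data, the following minimal sketch fits all three on small random toy arrays. The arrays `X_cont`, `X_bin`, `X_count` and the labels `y_toy` are hypothetical and exist only for this example; the predictions themselves carry no meaning.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB, BernoulliNB, MultinomialNB

rng = np.random.RandomState(0)
y_toy = np.array([0, 0, 0, 1, 1, 1])  # hypothetical labels, 3 samples per class

# Hypothetical toy data, one array per kind of feature
X_cont = rng.normal(size=(6, 3))            # continuous values
X_bin = rng.randint(0, 2, size=(6, 3))      # binary values (0 or 1)
X_count = rng.randint(0, 10, size=(6, 3))   # non-negative integer counts

predictions = {}
for model, data in [(GaussianNB(), X_cont),
                    (BernoulliNB(), X_bin),
                    (MultinomialNB(), X_count)]:
    # All three share the usual fit / predict interface
    model.fit(data, y_toy)
    predictions[type(model).__name__] = model.predict(data)

for name, pred in predictions.items():
    print(name, pred)
```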

### BernoulliNB

The BernoulliNB classifier counts, for each class, how often each feature is nonzero.

The example below uses four data points with four binary features each. There are two classes, 0 and 1.

In [2]:
# Standard imports
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import scipy as sp
import sklearn
from IPython.display import display
import mglearn

# Suppress all warnings (this hides deprecation warnings, but also every other warning)
import warnings
warnings.filterwarnings('ignore')

In [3]:
X = np.array([[0, 1, 0, 1],
              [1, 0, 1, 1],
              [0, 0, 0, 1],
              [1, 0, 1, 0]])
y = np.array([0, 1, 0, 1])

Counting the nonzero entries per class in essence looks like this:

In [4]:
counts = {}
# Iterate over each class
for label in np.unique(y):
    # Select the rows belonging to this class,
    # then count (sum) the entries of 1 per feature
    counts[label] = X[y == label].sum(axis=0)

print("Feature counts:\n", counts)

Feature counts:
 {0: array([0, 1, 0, 2]), 1: array([2, 0, 2, 1])}
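
The same counts can be read off a fitted model. The sketch below assumes scikit-learn's `BernoulliNB` with its default smoothing (`alpha=1.0`) and compares its `feature_count_` attribute with the manual counts. The smoothing formula in the comments is an assumption about how the estimator turns counts into probabilities, not something stated in this section, so the code checks it against `feature_log_prob_`.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

X = np.array([[0, 1, 0, 1],
              [1, 0, 1, 1],
              [0, 0, 0, 1],
              [1, 0, 1, 0]])
y = np.array([0, 1, 0, 1])

clf = BernoulliNB(alpha=1.0)  # alpha=1.0 is the default smoothing strength
clf.fit(X, y)

# Per-class count of nonzero entries for each feature (rows follow clf.classes_)
print("feature_count_:\n", clf.feature_count_)

# Assumed smoothing: (count + alpha) / (samples in class + 2 * alpha)
smoothed = (clf.feature_count_ + clf.alpha) / (
    clf.class_count_[:, np.newaxis] + 2 * clf.alpha
)
print("Estimated P(feature = 1 | class):\n", smoothed)

# Check the assumption against what the fitted model stores
print("Matches the model:", np.allclose(smoothed, np.exp(clf.feature_log_prob_)))
```

The rows of `feature_count_` hold the same numbers as the dictionary printed above, stored as floats. The smoothing term keeps a feature that never appears in a class (such as feature 0 in class 0) from getting a probability of exactly zero.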
