## Naive Bayes

Naive Bayes classifiers are a family of powerful and easy-to-train models that use Bayes'
theorem to determine the probability of an outcome given a set of conditions. In other words,
the conditional probability is inverted, so that the query can be expressed as a function of
measurable quantities. The approach is simple, and the adjective "naive" is used not because
these algorithms are limited or less efficient, but because of a fundamental assumption about
the causal factors: they are treated as conditionally independent of each other given the
class. Naive Bayes classifiers are multi-purpose and are applied in many different contexts;
however, **their performance is particularly good in situations where the probability
of a class is determined by the probabilities of some causal factors.** A good example is
natural language processing, where a piece of text can be considered a particular instance
of a dictionary, and the relative frequencies of all terms provide enough information to infer
the class it belongs to.
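
To make the inversion concrete, here is a minimal sketch of the text example above. The three-word dictionary, the per-class word probabilities, and the priors are all invented for illustration; only the rule itself (posterior proportional to prior times the product of per-term likelihoods) comes from Bayes' theorem with the independence assumption.

```python
import numpy as np

# Hypothetical toy model: P(word | class) over a 3-word dictionary.
vocab = ["free", "meeting", "offer"]
p_word_given_class = {
    "spam": np.array([0.5, 0.1, 0.4]),
    "ham": np.array([0.1, 0.7, 0.2]),
}
priors = {"spam": 0.3, "ham": 0.7}  # assumed class frequencies


def posterior(counts):
    """Return P(class | counts), treating words as independent given the class."""
    # log P(class) + sum over words of count * log P(word | class)
    log_joint = {
        c: np.log(priors[c]) + np.sum(counts * np.log(p_word_given_class[c]))
        for c in priors
    }
    # Normalize in log space to avoid underflow on long documents.
    m = max(log_joint.values())
    unnorm = {c: np.exp(v - m) for c, v in log_joint.items()}
    total = sum(unnorm.values())
    return {c: float(v / total) for c, v in unnorm.items()}


doc = np.array([2, 0, 1])  # "free" twice, "meeting" never, "offer" once
result = posterior(doc)
print(result)
```

With these made-up numbers the document is assigned to the spam class, because the terms it contains are much more likely under that class than under the other, even though the spam prior is lower.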

As a suggestion, read chapter 6 (page 120) of the book; it has a very good explanation.

In [None]:
import numpy as np
from mla.base import BaseEstimator
from mla.neuralnet.activations import softmax


class NaiveBayesClassifier(BaseEstimator):
    """Gaussian Naive Bayes."""
    # Binary problem.
    n_classes = 2

    def fit(self, X, y=None):
        self._setup_input(X, y)
        # Check target labels
        assert list(np.unique(y)) == [0, 1]

        # Mean and variance for each class and feature combination
        self._mean = np.zeros((self.n_classes, self.n_features), dtype=np.float64)
        self._var = np.zeros((self.n_classes, self.n_features), dtype=np.float64)

        self._priors = np.zeros(self.n_classes, dtype=np.float64)

        for c in range(self.n_classes):
            # Filter features by class
            X_c = X[y == c]

            # Calculate mean, variance, prior for each class
            self._mean[c, :] = X_c.mean(axis=0)
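            # A small constant keeps the variance non-zero for constant features,
            # which would otherwise cause a division by zero in _pdf
            self._var[c, :] = X_c.var(axis=0) + 1e-9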
            self._priors[c] = X_c.shape[0] / float(X.shape[0])

    def _predict(self, X=None):
        # Apply _predict_row to each row to get per-class log prior + log likelihood
        predictions = np.apply_along_axis(self._predict_row, 1, X)

        # Normalize probabilities so that each row will sum up to 1.0
        return softmax(predictions)

    def _predict_row(self, x):
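        """Return log prior + log likelihood of each class for the given row."""
        output = []
        for y in range(self.n_classes):
            prior = np.log(self._priors[y])
            # Sum of per-feature log densities: the log likelihood under the naive assumption
            likelihood = np.log(self._pdf(y, x)).sum()
            prediction = prior + likelihood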

            output.append(prediction)
        return output

    def _pdf(self, n_class, x):
        """Calculate Gaussian PDF for each feature."""

        mean = self._mean[n_class]
        var = self._var[n_class]

        numerator = np.exp(-(x - mean) ** 2 / (2 * var))
        denominator = np.sqrt(2 * np.pi * var)
        return numerator / denominator
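
The class above depends on the `mla` package. As a rough, self-contained sketch of the same idea using only NumPy, the functions below fit per-class means, variances, and priors, then turn log priors plus Gaussian log likelihoods into normalized probabilities. The function names, the `eps` smoothing term, and the synthetic data are assumptions made for this illustration, not part of the original class.

```python
import numpy as np


def fit_gaussian_nb(X, y, eps=1e-9):
    """Estimate per-class feature means, variances, and class priors."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # eps (an assumed smoothing constant) avoids zero variances
    variances = np.array([X[y == c].var(axis=0) for c in classes]) + eps
    priors = np.array([np.mean(y == c) for c in classes])
    return classes, means, variances, priors


def predict_proba(X, means, variances, priors):
    """Return P(class | x) for each row, computed in log space."""
    # diff has shape (n_samples, n_classes, n_features)
    diff = X[:, None, :] - means[None, :, :]
    # Gaussian log density per feature, summed over features
    log_lik = -0.5 * (
        np.log(2 * np.pi * variances)[None] + diff ** 2 / variances[None]
    ).sum(axis=2)
    log_joint = np.log(priors)[None, :] + log_lik
    # Subtract the row maximum before exponentiating for numerical stability
    log_joint -= log_joint.max(axis=1, keepdims=True)
    probs = np.exp(log_joint)
    return probs / probs.sum(axis=1, keepdims=True)


# Synthetic two-class data: two well-separated Gaussian blobs
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(4.0, 1.0, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

classes, means, variances, priors = fit_gaussian_nb(X, y)
proba = predict_proba(X, means, variances, priors)
pred = classes[proba.argmax(axis=1)]
accuracy = float((pred == y).mean())
print("training accuracy:", accuracy)
```

Working with log densities directly, rather than taking the logarithm of a computed density, avoids `log(0)` when a sample lies far from a class mean and the density underflows.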