### Naive Bayes 

Suppose you are trying to organize your email. How would you take the text of each message and categorize it? This is a text classification problem.

First, assemble the words you want to look out for. This might be the 10,000 most frequently occurring words in your inbox.

One way to represent an email is as a binary feature vector that holds a 1 in position $i$ if word $i$ appears in the email and a 0 if it does not.

$x \in \{0,1\}^n$, where $x_i = 1$ if word $i$ appears in the email and $x_i = 0$ otherwise.

In this example $n$ would be 10,000. 
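
The feature vector described above can be sketched in a few lines. This is only an illustration: the function name `binary_features`, the five-word vocabulary, and the sample email are made up for the example, and splitting on whitespace is a simplifying assumption (a real tokenizer would also handle punctuation).

```python
def binary_features(email_text, vocabulary):
    """Return a 0/1 list: entry i is 1 if vocabulary[i] appears in the email."""
    # Lowercase and split on whitespace (assumed, very simple tokenization)
    words_in_email = set(email_text.lower().split())
    return [1 if word in words_in_email else 0 for word in vocabulary]


# Hypothetical vocabulary; in the notes it would hold 10,000 words
vocabulary = ["free", "meeting", "offer", "project", "winner"]
email = "Claim your FREE offer now"

x = binary_features(email, vocabulary)
print(x)  # [1, 0, 1, 0, 0]
```

Here $n$ is 5 instead of 10,000, but the shape of the result is the same: one entry per vocabulary word.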

**Key Idea:** Assume the $X_i$'s are conditionally independent given $y$. This means that once the class $y$ (e.g., spam or not spam) is known, the features $X_1, X_2, \ldots, X_n$ (e.g., words in an email) don't influence each other: knowing that one word appears tells you nothing more about whether another word appears.

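Under this assumption, the joint probability of the features and the class factorizes as:

$$p(X_1, \ldots, X_n, y) = \left( \prod_{i=1}^n p(X_i \mid y) \right) p(y)$$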
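
To make the product concrete, here is a minimal sketch of a classifier for binary word features that estimates $p(y)$ and $p(X_i = 1 \mid y)$ from counts and then picks the class with the largest $p(y) \prod_i p(X_i \mid y)$, computed in log space. The function names, the toy data, and the Laplace smoothing (adding 1 to the counts) are assumptions added for the example, not something taken from the notes above.

```python
import numpy as np


def fit_bernoulli_nb(X, y):
    """Estimate class priors p(y) and per-word probabilities p(X_i = 1 | y)."""
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    # Laplace smoothing (+1 / +2) is an assumption; it avoids probabilities of 0 or 1
    phi = np.array(
        [(X[y == c].sum(axis=0) + 1) / (np.sum(y == c) + 2) for c in classes]
    )
    return classes, priors, phi


def predict_bernoulli_nb(x, classes, priors, phi):
    """Return the class maximizing log p(y) + sum_i log p(x_i | y)."""
    log_likelihood = (x * np.log(phi) + (1 - x) * np.log(1 - phi)).sum(axis=1)
    return classes[np.argmax(np.log(priors) + log_likelihood)]


# Toy data: columns stand for the words "free", "meeting", "offer"; label 1 = spam
X = np.array([[1, 0, 1], [1, 0, 0], [0, 1, 0], [0, 1, 1]])
y = np.array([1, 1, 0, 0])

classes, priors, phi = fit_bernoulli_nb(X, y)
pred = predict_bernoulli_nb(np.array([1, 0, 1]), classes, priors, phi)
print(pred)  # 1 (spam)
```

Summing logs instead of multiplying probabilities gives the same argmax and avoids the product shrinking toward zero when $n$ is large.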

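This algorithm has high bias because it ignores the order in which the words appear on the page, along with any other dependence between them. In exchange it has low variance: it only estimates a few parameters per word, so it tends to predict reasonably well even with limited training data.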
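
A quick way to see that word order is lost: two sentences with the same words in a different order map to the same binary feature vector. The helper `bag_of_words` and the three-word vocabulary below are hypothetical, chosen only for this illustration.

```python
def bag_of_words(text, vocabulary):
    """Binary presence vector; the position of a word in the text is ignored."""
    words = set(text.lower().split())
    return [1 if word in words else 0 for word in vocabulary]


vocabulary = ["dog", "bites", "man"]  # hypothetical vocabulary

a = bag_of_words("dog bites man", vocabulary)
b = bag_of_words("man bites dog", vocabulary)
print(a == b)  # True: the model cannot tell these two sentences apart
```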
## Implementation Of Naive Bayes

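I followed [this video](https://www.youtube.com/watch?v=TLInuAorxqE) to implement it in Python. Note that this code models each feature with a per-class Gaussian (Gaussian Naive Bayes), so it works on continuous features rather than the binary word vectors described above.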

```python
import numpy as np


class NaiveBayes:
    """Gaussian Naive Bayes: each feature is modeled as a per-class normal distribution."""

    def fit(self, X, y):
        n_samples, n_features = X.shape
        self._classes = np.unique(y)
        n_classes = len(self._classes)

        # calculate mean, variance, and prior for each class
        self._mean = np.zeros((n_classes, n_features), dtype=np.float64)
        self._var = np.zeros((n_classes, n_features), dtype=np.float64)
        self._priors = np.zeros(n_classes, dtype=np.float64)

        for idx, c in enumerate(self._classes):
            X_c = X[y == c]  # rows that belong to class c
            self._mean[idx, :] = X_c.mean(axis=0)
            self._var[idx, :] = X_c.var(axis=0)
            # prior = fraction of training samples in class c
            self._priors[idx] = X_c.shape[0] / float(n_samples)

    def predict(self, X):
        y_pred = [self._predict(x) for x in X]
        return np.array(y_pred)

    def _predict(self, x):
        posteriors = []

        # calculate the log posterior (up to a constant) for each class:
        # log p(y) + sum_i log p(x_i | y)
        for idx, c in enumerate(self._classes):
            prior = np.log(self._priors[idx])
            posterior = np.sum(np.log(self._pdf(idx, x)))
            posterior = posterior + prior
            posteriors.append(posterior)

        # return class with the highest posterior
        return self._classes[np.argmax(posteriors)]

    def _pdf(self, class_idx, x):
        # Gaussian density of every feature of x under the given class
        mean = self._mean[class_idx]
        var = self._var[class_idx]
        numerator = np.exp(-((x - mean) ** 2) / (2 * var))
        denominator = np.sqrt(2 * np.pi * var)
        return numerator / denominator


# Testing
if __name__ == "__main__":
    # Imports
    from sklearn.model_selection import train_test_split
    from sklearn import datasets

    def accuracy(y_true, y_pred):
        # fraction of predictions that match the true labels
        return np.sum(y_true == y_pred) / len(y_true)

    X, y = datasets.make_classification(
        n_samples=1000, n_features=10, n_classes=2, random_state=123
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=123
    )

    nb = NaiveBayes()
    nb.fit(X_train, y_train)
    predictions = nb.predict(X_test)

    print("Naive Bayes classification accuracy", accuracy(y_test, predictions))
```

Naive Bayes classification accuracy 0.965
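
One possible refinement, offered as a sketch rather than as part of the video's implementation: `_predict` takes `np.log` of the value returned by `_pdf`, and that density can underflow to 0 when a feature lies far from the class mean, or divide by zero when a feature has zero variance within a class. Computing the log density directly avoids the first problem, and a small smoothing term avoids the second. The function name `log_gaussian_pdf` and the `eps` parameter are assumptions introduced for this example.

```python
import numpy as np


def log_gaussian_pdf(x, mean, var, eps=1e-9):
    """Log of the normal density, computed without forming the density itself."""
    var = var + eps  # assumed smoothing term; guards against zero variance
    return -0.5 * np.log(2 * np.pi * var) - ((x - mean) ** 2) / (2 * var)


# A point far from the mean: 50 standard deviations away
x, mean, var = np.array([50.0]), np.array([0.0]), np.array([1.0])

# Density first, then log: the density underflows to 0.0, so its log is -inf
with np.errstate(divide="ignore"):
    density = np.exp(-((x - mean) ** 2) / (2 * var)) / np.sqrt(2 * np.pi * var)
    naive = np.log(density)

# Log density computed directly stays finite (roughly -1250.9)
stable = log_gaussian_pdf(x, mean, var)
print(naive, stable)
```

With a helper like this, the sum inside `_predict` could use `log_gaussian_pdf(x, self._mean[idx], self._var[idx])` in place of `np.log(self._pdf(idx, x))`.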
