## Naive Bayes

- A probabalistic classifier based on Bayes' theorem with a strong (naive) independence assumptions between the features

- Bayes' Theorem states that the probability of event A given event B is equal to the probability of B given A times the probability of A all over the probability of B

- In the classification case, the Theorem states that the probability of a class label given its feature vector is equal to the probability of the feature vector's given the class label, times the probability of the class label all over the probability of the feature vector. This is based on the assumption that the features are mutually independent. 

- This assumption is usually not the case for real-world examples, hence the name naive. When applying Naive Bayes, the formula above is prone to errors since the probabilities are between 0 and 1, meaning that across n iterations when n is large, the prediction can go to 0. Therefore, we use the prior probability (frequency of each class) and class conditional probability (model with gaussian distribution) to calculate the posterior probability

- When training, calculate mean, variance, and prior frequency for each class. For predictions, calculate the posterior probability for each class with the frequency of each class and the gaussian formula, then choose the class with the highest posterior probability.


In [4]:
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn import datasets 

In [15]:
class NaiveBayes:
    
    def fit(self, X, y):
        n_samples, n_features = X.shape
        self._classes = np.unique(y)
        n_classes = len(self._classes)

        # Calculate mean, variance, and prior for each class
        self._mean = np.zeros((n_classes, n_features),dtype=np.float64)
        self._var = np.zeros((n_classes, n_features),dtype=np.float64)
        self._priors = np.zeros((n_classes),dtype=np.float64)

        for idx, c in enumerate(self._classes):
            X_c = X[y == c]
            self._mean[idx,:] = X_c.mean(axis=0)
            self._var[idx,:] = X_c.var(axis=0)
            self._priors[idx] = X_c.shape[0] / float(n_samples)


    def predict(self, X):
        y_pred = [self._predict(x) for x in X]
        return np.array(y_pred)

    def _predict(self, X):
        posteriors = []

        # Calculate posterior probability for each class
        for idx, c in enumerate(self._classes):
            prior = np.log(self._priors[idx])
            posterior = np.sum(np.log(self._pdf(idx, X)))
            posterior = posterior + prior
            posteriors.append(posterior)

            # Return class with the highest posterior
            return self._classes[np.argmax(posteriors)]
        
    def _pdf(self, class_idx, x):
        mean = self._mean[class_idx]
        var = self._var[class_idx]
        numerator = np.exp(-((x- mean) ** 2) / (2 * var))
        denominator = np.sqrt(2 * np.pi * var)
        return numerator / denominator

    


In [26]:
def accuracy(y_true, y_pred):
        accuracy = np.sum(y_true == y_pred) / len(y_true)
        return accuracy

X, y = datasets.make_classification (
    n_samples =1000, n_features=10, n_classes=2, random_state=123
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=123
)

nb = NaiveBayes()
nb.fit(X_train, y_train)
predictions = nb.predict(X_test)

print(accuracy(y_test, predictions))


0.495
