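P(y|X) = [ P(x1|y) P(x2|y) ... P(xn|y) P(y) ] / P(X), where X = (x1, x2, x3, ..., xn) is the feature vector. The factorization follows from the chain rule together with the assumption that the features are conditionally independent given the class y.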

In [6]:
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn import datasets


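Naive Bayes classification is based on Bayes' theorem:
P(y|X) = ( P(X|y) P(y) ) / P(X)
X is the feature vector; the features are assumed to be mutually independent given the class y, so
P(y|X) = ( P(x1|y) P(x2|y) P(x3|y) ... P(xn|y) P(y) ) / P(X)
P(y|X) is the posterior probability
P(y) is the prior probability
P(xi|y) is the class-conditional probability (likelihood) of feature xi
The goal is to select the class with the highest posterior probability. P(X) is the same for every class, so it can be dropped:
y = argmax_y P(x1|y) P(x2|y) ... P(xn|y) P(y)
Each P(xi|y) is typically a small number, and multiplying many of them can underflow to zero, so it is better to sum logarithms instead:
y = argmax_y [ log(P(x1|y)) + log(P(x2|y)) + ... + log(P(xn|y)) + log(P(y)) ]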
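
The underflow problem can be seen in a small sketch. The likelihood values below are hypothetical, chosen only to make the effect visible; they do not come from the dataset used in this notebook.

```python
import math

# Hypothetical per-feature likelihoods (illustrative values only).
likelihoods = [1e-5] * 100

# Multiplying them directly: the true value is 1e-500, which is smaller
# than the smallest positive float, so the product collapses to 0.0.
product = 1.0
for p in likelihoods:
    product *= p

# Summing the logs keeps the same information in a representable range.
log_sum = sum(math.log(p) for p in likelihoods)

print(product)  # 0.0
print(log_sum)  # roughly -1151.29
```

Because log is monotonically increasing, the class with the largest sum of logs is also the class with the largest product, so the argmax is unchanged.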

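The conditional probability P(xi|y) is modeled with a Gaussian distribution, using the mean and variance of feature xi among the samples of class y:
P(xi|y) = numerator / denominator
numerator = exp(- (xi - mean)^2 / (2 * variance))
denominator = sqrt(2 * pi * variance)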
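
As a minimal sketch, the density above can be written as a plain function, next to an algebraically equivalent log form. The names `gaussian_pdf` and `gaussian_log_pdf` are hypothetical helpers introduced here for illustration; they are not part of the class defined below.

```python
import math

def gaussian_pdf(x, mean, variance):
    """Gaussian density, written exactly as numerator / denominator."""
    numerator = math.exp(-((x - mean) ** 2) / (2 * variance))
    denominator = math.sqrt(2 * math.pi * variance)
    return numerator / denominator

def gaussian_log_pdf(x, mean, variance):
    """Log of the same density, expanded so that exp() is never evaluated."""
    return -0.5 * math.log(2 * math.pi * variance) - (x - mean) ** 2 / (2 * variance)

# For moderate inputs the two forms agree.
print(gaussian_pdf(0.0, 0.0, 1.0))                  # about 0.3989
print(math.exp(gaussian_log_pdf(0.0, 0.0, 1.0)))    # about 0.3989

# Far from the mean the plain density underflows to 0.0,
# while the log form stays finite.
print(gaussian_pdf(50.0, 0.0, 1.0))      # 0.0
print(gaussian_log_pdf(50.0, 0.0, 1.0))  # about -1250.92
```

The log form is the safer choice when the result is going to be passed to log anyway, since log(0) is negative infinity.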

In [32]:
class NaiveBayes:
    
    def fit(self, X, y):
        n_samples, n_features = X.shape
        self._classes = np.unique(y)  # unique class labels (here 0 and 1)
        n_classes = len(self._classes)  # number of classes

        # initialize per-class mean, variance, and prior
        self._mean = np.zeros((n_classes, n_features), dtype=np.float64)
        self._var = np.zeros((n_classes, n_features), dtype=np.float64)
        self._priors =  np.zeros(n_classes, dtype=np.float64)

        # calculate mean, var, and prior for each class
        for idx, c in enumerate(self._classes):
            samples_c = X[y == c]  # samples whose label is c
            self._mean[idx, :] = samples_c.mean(axis=0)
            self._var[idx, :] = samples_c.var(axis=0)
            self._priors[idx] = samples_c.shape[0] / float(n_samples)  # samples labeled c / total samples

    def predict(self, X):
        y_pred = []
        for x in X:    
            posteriorsProbs = [] 
            # calculate posterior probability for each class
            for idx, c in enumerate(self._classes):
                
                prior = np.log(self._priors[idx])
                mean = self._mean[idx]
                var = self._var[idx]
                
                # log of the Gaussian probability density function, computed directly
                # in log space so that tiny densities do not underflow to log(0)
                log_likelihood = -0.5 * np.log(2 * np.pi * var) - (x - mean) ** 2 / (2 * var)
                posterior_prob = np.sum(log_likelihood)
                
                #update posterior_prob with prior
                posterior_prob = prior + posterior_prob
                posteriorsProbs.append(posterior_prob)
                
            y_pred.append(self._classes[np.argmax(posteriorsProbs)])   
            
        return np.array(y_pred)        

In [34]:
def accuracy(y_t,y_p):
    return np.sum(y_t == y_p) / len(y_t)

In [35]:

X, y = datasets.make_classification(n_samples=1000, n_features=10, n_classes=2, random_state=123)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=123)

nb = NaiveBayes()
nb.fit(X_train,y_train)
preds = nb.predict(X_test)


print("Accuracy ",accuracy(y_test,preds))

Accuracy  0.965
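
One way to sanity-check a hand-written classifier is to run a library implementation on the same split. The sketch below assumes scikit-learn's `GaussianNB` is an acceptable reference; it applies a small variance smoothing term by default, so its predictions are not guaranteed to match the class above exactly. The variable names `reference` and `ref_accuracy` are introduced here for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Same data and split parameters as in the cell above.
X, y = make_classification(n_samples=1000, n_features=10, n_classes=2, random_state=123)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=123)

# Fit the library's Gaussian Naive Bayes and score it on the held-out set.
reference = GaussianNB()
reference.fit(X_train, y_train)
ref_preds = reference.predict(X_test)
ref_accuracy = accuracy_score(y_test, ref_preds)

print("Reference accuracy", ref_accuracy)
```

If the two accuracies differ noticeably, the per-class means, variances, and priors are the first things worth comparing.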
