# Gaussian Naive Bayes

This notebook is an extension of the [Naive Bayes Notebook](https://github.com/jyotipmahes/Implementation-of-ML-algos-in-Python/blob/master/Naive_Bayes.ipynb). In the last notebook, we implemented Naive Bayes from scratch for categorical variables. In this notebook, we will discuss how to **extend Naive Bayes to continuous numerical data**. In Gaussian Naive Bayes, we **assume** that the **numerical features follow a normal distribution** within each class, so we can calculate the **conditional probabilities P(x|c) with the help of the normal density function**. Here are the steps we will follow to implement Gaussian Naive Bayes.
1. **Segregate** the data by **class** and calculate the **mean** and **standard deviation** of each numerical feature within each class. Also **calculate the class prior** probabilities, as in the previous notebook.
2. For each test sample, calculate the **conditional probability** of each feature value with the **normal density function**, using the mean and standard deviation calculated per class.
3. **Multiply** the **conditional probabilities** together and multiply the result by the **prior probability** to get a score proportional to the posterior probability. In practice we take logs and add them instead, which avoids numerical underflow.
4. **Select** the **class** with the **highest posterior score**.

As we can see, the only difference is the way we calculate the conditional probabilities; the rest of the process remains the same. Hence we will reuse the previous code with slight modifications.
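
For reference, the quantities used in steps 2 to 4 can be written out explicitly. The notation is introduced here for illustration and is not taken from the previous notebook: $\mu_{c,j}$ and $\sigma_{c,j}$ are the mean and standard deviation of feature $j$ within class $c$, and $P(c)$ is the class prior.

$$P(x_j \mid c) = \frac{1}{\sqrt{2\pi}\,\sigma_{c,j}} \exp\left(-\frac{(x_j - \mu_{c,j})^2}{2\sigma_{c,j}^2}\right)$$

$$\hat{c} = \arg\max_c \left[\log P(c) + \sum_j \log P(x_j \mid c)\right]$$

The second expression is the log form of "multiply the conditional probabilities by the prior": because the logarithm is monotonic, the class that maximizes the sum of logs also maximizes the product.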

![title](https://images.slideplayer.com/25/7828801/slides/slide_14.jpg)

## Implementation code

In [1]:
import numpy as np
from sklearn.datasets import load_iris
from collections import defaultdict
from operator import itemgetter
from sklearn.model_selection import train_test_split

In [2]:
class GaussianNB:
    def __init__(self):
        self.n_items = None
        self.classes = None
        self.priors = defaultdict(float)
        self.prob_predicted = defaultdict(float)
        self.class_feature_mean = defaultdict(list)
        self.class_feature_std = defaultdict(list)
    
    def fit(self, x_train, labels):
        self.x_train = np.array(x_train)
        self.labels = np.array(labels)
        self.n_items = self.labels.size
        self.classes = set(self.labels)
        
        for i in self.classes:
            # Per-class mean and standard deviation of every feature
            self.class_feature_mean[i] = np.mean(self.x_train[self.labels == i], axis=0)
            self.class_feature_std[i] = np.std(self.x_train[self.labels == i], axis=0)
            
        self.doc_priors()

    
    def doc_priors(self):
        # Class prior = fraction of training samples carrying that label
        for label in self.classes:
            self.priors[label] = np.mean(self.labels == label)
    
    def likelihood(self, label, x):
        exponent = np.exp(-((x- self.class_feature_mean[label])**2/
                            (2*(self.class_feature_std[label])**2)))
        probs =  (1 / (np.sqrt(2*np.pi) * self.class_feature_std[label])) * exponent
        return np.sum(np.log(probs))
    
    
    def post_prob(self, test):
        for label in self.classes:
            self.prob_predicted[label] = np.log(self.priors[label]) 
            self.prob_predicted[label] += self.likelihood(label, test)
        return self.prob_predicted

   
    def predict(self, test):
        self.test_labels = []
        for i in test:
            prob_predicted = self.post_prob(i)
            label, prob = max(prob_predicted.items(),
                      key=itemgetter(1))
            self.test_labels.append(label)
        return self.test_labels
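
The `likelihood` method above evaluates the density first and takes its log afterwards. When a feature value lies far from the class mean, `np.exp` can underflow to 0 and the log becomes `-inf`. Below is a minimal sketch of an alternative that computes the log-density directly. The function name `gaussian_log_likelihood` and the `eps` variance floor are illustrative assumptions, not part of the class above.

```python
import numpy as np

def gaussian_log_likelihood(x, mean, std, eps=1e-9):
    """Sum of per-feature Gaussian log-densities, computed in log space.

    `eps` is a hypothetical small floor added to the variance so that a
    feature with zero spread inside a class does not cause division by zero.
    """
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    var = np.asarray(std, dtype=float) ** 2 + eps
    # log N(x | mean, var) = -0.5 * log(2*pi*var) - (x - mean)^2 / (2*var)
    log_probs = -0.5 * np.log(2 * np.pi * var) - (x - mean) ** 2 / (2 * var)
    return np.sum(log_probs)

# Illustrative values only (not taken from the iris fit above).
x = np.array([5.1, 3.5])
mean = np.array([5.0, 3.4])
std = np.array([0.35, 0.38])

# Close to the mean, this agrees with taking the log of the density.
density = np.exp(-((x - mean) ** 2) / (2 * std ** 2)) / (np.sqrt(2 * np.pi) * std)
direct = np.sum(np.log(density))
stable = gaussian_log_likelihood(x, mean, std)

# Far from the mean, the density underflows to 0, but the log-space
# version still returns a finite (very negative) number.
far = gaussian_log_likelihood(np.array([500.0, 3.5]), mean, std)
```

On well-behaved data such as iris, both approaches should give the same ranking of classes; the log-space form mainly matters for outliers and for features with very small variance.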


## Checking our implementation on iris data

### Get train and test data sets

In [3]:
data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size = 0.3, 
                                                    random_state =1, shuffle = True)

### Creating classifier and fitting to train data

In [4]:
clf = GaussianNB()
clf.fit(X_train, y_train)

### Predicting on test data

In [5]:
pred = clf.predict(X_test)
print("Prediction accuracy %.3f" %(np.sum(pred==y_test)/len(y_test)))

Prediction accuracy 0.933


The accuracy of 0.933 on the test set suggests that our implementation is correct and that we have built a reasonably good classifier.

### Comparing with sklearn predictions

In [6]:
from sklearn.naive_bayes import GaussianNB as sGB

In [7]:
clf2 = sGB()

In [8]:
clf2.fit(X_train, y_train)

GaussianNB(priors=None)

In [9]:
pred = clf2.predict(X_test)
print("Prediction accuracy %.3f" %(np.sum(pred==y_test)/len(y_test)))

Prediction accuracy 0.933


We get the same accuracy (0.933) as scikit-learn's `GaussianNB`.

# Bernoulli Naive Bayes

scikit-learn's `BernoulliNB` implements the naive Bayes training and classification algorithms for data that is distributed according to multivariate Bernoulli distributions; i.e., there may be multiple features, but each one is assumed to be a binary-valued (Bernoulli, boolean) variable. Therefore, this class requires samples to be represented as binary-valued feature vectors. The difference between Bernoulli Naive Bayes and Multinomial Naive Bayes is the way the likelihood is calculated:



$$P(x_i|y)= P(i|y)x_i + (1- P(i|y))(1- x_i)$$

which differs from multinomial NB's rule in that it explicitly penalizes the non-occurrence of a feature $i$ that is an indicator for class $y$, where the multinomial variant would simply ignore a non-occurring feature.

In the case of text classification, word occurrence vectors (rather than word count vectors) may be used to train and use this classifier. BernoulliNB might perform better on some datasets, especially those with shorter documents.

**We can implement this by changing the likelihood function in the Multinomial Naive Bayes implementation shared in the last notebook**.

Please refer to this link to check the [implementation](https://mattshomepage.com/articles/2016/Jun/07/bernoulli_nb/)
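
The Bernoulli likelihood rule above can also be sketched in a few lines of NumPy. This is a minimal illustration, not the linked implementation: the function names, the toy data, and the Laplace smoothing parameter `alpha` are assumptions made for this example.

```python
import numpy as np

def bernoulli_log_likelihood(x, feature_probs):
    """Log-likelihood of a binary feature vector x under one class.

    feature_probs[i] plays the role of P(i|y): the probability that
    feature i is present in class y. Names are hypothetical.
    """
    x = np.asarray(x, dtype=float)
    p = np.asarray(feature_probs, dtype=float)
    # P(x_i|y) = P(i|y) * x_i + (1 - P(i|y)) * (1 - x_i)
    per_feature = p * x + (1 - p) * (1 - x)
    return np.sum(np.log(per_feature))

def estimate_feature_probs(X, alpha=1.0):
    """Smoothed estimate of P(i|y) from the binary rows X of a single class.

    Assumes Laplace smoothing so that no probability is exactly 0 or 1.
    """
    X = np.asarray(X, dtype=float)
    return (X.sum(axis=0) + alpha) / (X.shape[0] + 2 * alpha)

# Toy binary data for one class: 4 samples, 3 features.
X_class = np.array([[1, 0, 1],
                    [1, 0, 0],
                    [1, 1, 1],
                    [1, 0, 1]])
probs = estimate_feature_probs(X_class)

# Feature 0 is present in every training sample of this class, so a test
# vector that lacks it is penalized relative to one that has it.
present = bernoulli_log_likelihood([1, 0, 1], probs)
absent = bernoulli_log_likelihood([0, 0, 1], probs)
```

Here `absent` is lower than `present`, which is the explicit penalty for a non-occurring indicator feature described above. To classify, this log-likelihood would be added to the log prior of each class, exactly as in the Gaussian case.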

# Mixing Gaussian and Multinomial Naive Bayes

Now that we have seen the implementation of both Gaussian and Multinomial Naive Bayes, we can mix them together and create a hybrid classifier which can handle both **Categorical** and **continuous** data. We have to define t