# Bayes Theorem

Bayes Theorem is based on conditional probability, which is the probability of one event occurring given known data about another event. Mathematically, Bayes Theorem is stated as:

$$
   P(A | B) = \frac{P(B | A)*P(A)}{P(B)}
$$

- A, B: events
- P(A | B): probability of A given that B is known
- P(B | A): probability of B given that A is known
- P(A), P(B): the marginal probabilities of A and B, each considered on its own
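The theorem follows directly from the definition of conditional probability. As a short derivation sketch, writing P(A ∩ B) for the probability that both events occur (a symbol introduced here, not used elsewhere in this text) and assuming P(A) and P(B) are both greater than zero:

$$
   P(A | B) = \frac{P(A \cap B)}{P(B)}, \qquad P(B | A) = \frac{P(A \cap B)}{P(A)}
$$

Solving the second expression for P(A ∩ B) and substituting it into the first gives:

$$
   P(A \cap B) = P(B | A)*P(A) \quad \Rightarrow \quad P(A | B) = \frac{P(B | A)*P(A)}{P(B)}
$$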

Let's understand Bayes Theorem with the following example. A bookstore manager has information about the customers' age and income, and wants to know how book sales are distributed across three age classes of customers: youth (18-35), middle-aged (35-60), and seniors (60+).

Let us call our known data X. Our hypothesis H is that X belongs to a certain class C. The goal is to determine the conditional probability of the hypothesis H given X, i.e., P(H | X). In layman's terms, X might be a 26-year-old customer with an income of $2000, and H the hypothesis that this customer will buy the book.

Given these, Bayes Theorem states:

$$
   P(H | X) = \frac{P(X | H)*P(H)}{P(X)}
$$
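To make the formula concrete, here is a minimal sketch for the bookstore setting. All counts are made-up illustration values, not data from the example, and X is simplified to "the customer is in the youth age class". The sketch checks that Bayes Theorem gives the same answer as counting the buyers among youth customers directly.

```python
from fractions import Fraction

# Hypothetical counts, invented purely for illustration
total_customers = 100
buyers = 40          # customers who bought the book (hypothesis H is true)
youth = 30           # customers in the youth age class (evidence X)
youth_buyers = 20    # customers who are both youth and buyers

p_h = Fraction(buyers, total_customers)      # P(H)
p_x = Fraction(youth, total_customers)       # P(X)
p_x_given_h = Fraction(youth_buyers, buyers) # P(X | H)

# Bayes Theorem: P(H | X) = P(X | H) * P(H) / P(X)
p_h_given_x = p_x_given_h * p_h / p_x

# Same quantity obtained by counting buyers among youth customers
direct = Fraction(youth_buyers, youth)

print(p_h_given_x, float(p_h_given_x))
print(p_h_given_x == direct)
```

Exact fractions are used so that the two routes can be compared without floating-point rounding.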

## Naive Bayes

Naive Bayes applies Bayes Theorem to classification based on known features, which is why it is called the Naive Bayes classifier. It is a family of supervised learning algorithms based on Bayes’ theorem with the "naive" assumption that every pair of features is conditionally independent given the value of the class variable.
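Under the independence assumption, the likelihood of the features factorizes into one term per feature. As a sketch, with y for the class variable and x_1, ..., x_n for the features (notation assumed here, not used elsewhere in this text):

$$
   P(y | x_1, \dots, x_n) = \frac{P(y) \prod_{i=1}^{n} P(x_i | y)}{P(x_1, \dots, x_n)}
$$

Because the denominator is the same for every class, the predicted class is the one that maximizes the numerator:

$$
   \hat{y} = \arg\max_{y} P(y) \prod_{i=1}^{n} P(x_i | y)
$$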

#### Applications

- Face Recognition
- Weather Prediction
- Medical Diagnosis
- News Classification

#### Advantages

Following are some of the benefits of the Naive Bayes classifier: 

- It is simple and easy to implement
- It requires relatively little training data
- It handles both continuous and discrete data
- It is highly scalable with the number of predictors and data points
- It is fast and can be used to make real-time predictions
- It is not sensitive to irrelevant features 

#### Disadvantages

- If a categorical variable has a category in the test data set that was not observed in the training data set, the model assigns it a probability of 0 (zero) and is unable to make a prediction. This is often known as the Zero Frequency problem.
- Naive Bayes is also known to be a poor probability estimator, so its predicted probabilities should not be taken too seriously, even when the predicted class is correct.
- Another limitation of Naive Bayes is the assumption of independent predictors. In real life, it is almost impossible to get a set of predictors that are completely independent.
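The Zero Frequency problem can be illustrated with a few lines of plain Python. The sketch below uses additive (Laplace) smoothing, a common remedy that is not described in the text above; the helper `category_likelihood` and the sample data are hypothetical and only serve the illustration.

```python
from collections import Counter

def category_likelihood(values, category, categories, alpha=0.0):
    """Estimate P(category | class) from the values observed for one class.

    alpha is the additive smoothing strength; alpha=0 means no smoothing.
    """
    counts = Counter(values)
    return (counts[category] + alpha) / (len(values) + alpha * len(categories))

categories = ["youth", "middle-aged", "senior"]

# Hypothetical training values for one class: "senior" never occurs
train_values = ["youth", "youth", "middle-aged", "youth"]

unsmoothed = category_likelihood(train_values, "senior", categories, alpha=0.0)
smoothed = category_likelihood(train_values, "senior", categories, alpha=1.0)

print(unsmoothed)  # zero, which would wipe out the whole product of likelihoods
print(smoothed)    # small but non-zero
```

In scikit-learn, the discrete variants such as MultinomialNB and CategoricalNB expose a comparable `alpha` smoothing parameter.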

## Methods of Naive Bayes Classifier
There are several methods to implement the Naive Bayes Classifier:
1. Gaussian Naive Bayes
2. Multinomial Naive Bayes
3. Complement Naive Bayes
4. Bernoulli Naive Bayes
5. Categorical Naive Bayes
6. Out-of-core Naive Bayes model fitting (incremental training on data that does not fit in memory)
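As a rough sketch of how these options map onto scikit-learn, the five variants are classes in `sklearn.naive_bayes`, and out-of-core fitting can be approximated with `partial_fit`, which updates a model one batch at a time. The random data and the batch size of 25 are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.naive_bayes import (
    BernoulliNB,
    CategoricalNB,
    ComplementNB,
    GaussianNB,
    MultinomialNB,
)

# The five classifier variants from the list above
variants = {
    "Gaussian": GaussianNB,
    "Multinomial": MultinomialNB,
    "Complement": ComplementNB,
    "Bernoulli": BernoulliNB,
    "Categorical": CategoricalNB,
}
print(sorted(variants))

# Synthetic data standing in for a dataset too large to load at once
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = rng.integers(0, 2, size=100)

# Out-of-core style training: feed the model one batch at a time.
# All class labels must be declared up front via `classes`.
classes = np.array([0, 1])
model = GaussianNB()
for start in range(0, len(X), 25):
    batch = slice(start, start + 25)
    model.partial_fit(X[batch], y[batch], classes=classes)

print(model.class_count_)  # number of samples seen per class
```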

For the sake of practice, we will use Gaussian Naive Bayes to implement a Naive Bayes classifier on the Iris dataset from the scikit-learn library.

**Step 01-** Import the required libraries and the dataset

In this step, the Iris dataset is imported from the scikit-learn library.

    from sklearn.datasets import load_iris

    # Load the Iris dataset bundled with scikit-learn
    iris = load_iris()

**Step 02-** Select the features (X) and response variable (y) from the dataset

    # Feature matrix (flower measurements) and class labels (species)
    X = iris.data
    y = iris.target
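Before modelling, it can help to look at what X and y contain. The following is a small exploratory sketch; the variable `class_counts` is introduced here only for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris

iris = load_iris()
X = iris.data
y = iris.target

# 150 samples, each described by 4 numeric features
print(X.shape)
print(iris.feature_names)

# Three classes encoded as 0, 1, 2
print(iris.target_names)

# Number of samples per class
class_counts = np.bincount(y)
print(class_counts)
```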

**Step 03-** The train_test_split function is imported from the scikit-learn library, and the data is split into training data and test data in an 80/20 proportion.

    from sklearn.model_selection import train_test_split

    # Hold out 20% of the samples for testing; random_state makes the split reproducible
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)
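A quick check of the split sizes confirms the 80/20 proportion. The second call is an optional variant that is not part of this walkthrough: passing `stratify=y` asks train_test_split to preserve the class proportions in both subsets.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Same split as in the step above
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)
print(X_train.shape, X_test.shape)

# Optional variant: keep class proportions equal in train and test sets
_, _, _, y_test_strat = train_test_split(
    X, y, test_size=0.2, random_state=1, stratify=y
)
print(np.bincount(y_test_strat))  # samples per class in the stratified test set
```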

**Step 04-** The GaussianNB algorithm is imported from the naive_bayes module of the scikit-learn library, then the model is trained by fitting it to the training data X_train and y_train.

    from sklearn.naive_bayes import GaussianNB

    # Create the classifier and fit it to the training data
    gnb = GaussianNB()
    gnb.fit(X_train, y_train)

Output:

    GaussianNB()
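After fitting, the model stores what it has learned as attributes. The sketch below repeats the earlier steps so that it runs on its own, then prints the class priors, which play the role of P(H), and the per-class feature means used by the Gaussian likelihoods.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

gnb = GaussianNB()
gnb.fit(X_train, y_train)

# Class labels seen during training
print(gnb.classes_)

# Prior probability of each class, estimated from the training labels
print(gnb.class_prior_)

# Mean of every feature within each class: one row per class, one column per feature
print(gnb.theta_)
```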

**Step 05-** The trained model is now used to predict the classes of the test data.

    # Predict a class label for every sample in the test set
    y_pred = gnb.predict(X_test)
    y_pred

Output:

    array([0, 1, 1, 0, 2, 1, 2, 0, 0, 2, 1, 0, 2, 1, 1, 0, 1, 1, 0, 0, 1, 1,
           2, 0, 2, 1, 0, 0, 1, 2])
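Besides hard class labels, the model can also return the posterior probability of each class through `predict_proba`. The sketch below repeats the earlier steps so that it runs on its own. Keep in mind that Naive Bayes is considered a poor probability estimator, so these values are best read as a ranking of classes rather than as calibrated probabilities.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

gnb = GaussianNB().fit(X_train, y_train)

y_pred = gnb.predict(X_test)
proba = gnb.predict_proba(X_test)  # one row per test sample, one column per class

# Posterior probabilities for the first three test samples
print(np.round(proba[:3], 3))

# The predicted label is the class with the highest posterior probability
labels_from_proba = gnb.classes_[proba.argmax(axis=1)]
print(np.array_equal(labels_from_proba, y_pred))
```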

**Step 06-** Finally, the performance of the trained model is evaluated. The metrics module is imported from the scikit-learn library, and its accuracy_score function compares the actual output values (y_test) with the values predicted by the model, which we assigned to y_pred in the previous step.

    from sklearn import metrics

    # Fraction of correct predictions, expressed as a percentage
    print("Gaussian Naive Bayes model accuracy(in %):", metrics.accuracy_score(y_test, y_pred)*100)

Output:

    Gaussian Naive Bayes model accuracy(in %): 96.66666666666667
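Accuracy is a single number and does not show which classes get confused with each other. As an optional extension that goes beyond the steps above, the sketch below uses `confusion_matrix` and `classification_report` from the same metrics module. It repeats the earlier steps so that it runs on its own.

```python
from sklearn import metrics
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

iris = load_iris()
X, y = iris.data, iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=1)

gnb = GaussianNB().fit(X_train, y_train)
y_pred = gnb.predict(X_test)

# Rows are actual classes, columns are predicted classes
cm = metrics.confusion_matrix(y_test, y_pred)
print(cm)

# Accuracy is the share of predictions on the diagonal of the matrix
accuracy = metrics.accuracy_score(y_test, y_pred)
print(cm.trace() / cm.sum(), accuracy)

# Precision, recall and F1 score for each class
print(metrics.classification_report(y_test, y_pred, target_names=iris.target_names))
```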

