# Naive Bayes Classifier Algorithm

1. [Naive Bayes Overview](#1)
1. [Bayes' Theorem](#2)
1. [ Working of Naïve Bayes' Classifier](#3)
1. [Problem Solving Using Naive Baye's](#4)
1. [Types of Naïve Bayes Classifier](#5)
1. [Naive Baye's Algorithm Implementation](#6)

#### <span id="1"></span>  1. Naive Bayes Overview

1.Naïve Bayes algorithm is a supervised learning algorithm, which is based on Bayes theorem and used for solving classification problems.<br>

2.It is mainly used in text classification that includes a high-dimensional training dataset.<br>

3.Naïve Bayes Classifier is one of the simple and most effective Classification algorithms which helps in building the fast machine learning models that can make quick predictions.<br>

4.It is a probabilistic classifier, which means it predicts on the basis of the probability of an object.<br>

5.Some popular examples of Naïve Bayes Algorithm are spam filtration, Sentimental analysis, and classifying articles.<br>

#### <span id="2"></span>  2. Bayes' Theorem

Bayes' theorem is also known as Bayes' Rule or Bayes' law, which is used to determine the probability of a hypothesis with prior knowledge. It depends on the conditional probability.<br>

The formula for Bayes' theorem is given as:<br>

<img src="https://www.kdnuggets.com/wp-content/uploads/bayes-nagesh-1.jpg" class="center">

Where,

1. P(A|B) is Posterior probability: Probability of hypothesis A on the observed event B.<br>

2. P(B|A) is Likelihood probability: Probability of the evidence given that the probability of a hypothesis is true.<br>

3. P(A) is Prior Probability: Probability of hypothesis before observing the evidence.<br>

4. P(B) is Marginal Probability: Probability of Evidence.<br>

#### <span id="3"></span>  3. Working of Naïve Bayes' Classifier

Working of Naïve Bayes' Classifier can be understood with the help of the below example:<br>

Suppose we have a dataset of weather conditions and corresponding target variable "Play". So using this dataset we need to decide that whether we should play or not on a particular day according to the weather conditions. So to solve this problem, we need to follow the below steps:<br>

1. Convert the given dataset into frequency tables.
2. Generate Likelihood table by finding the probabilities of given features.
3. Now, use Bayes theorem to calculate the posterior probability.<br>

Problem: If the weather is sunny, then the Player should play or not?

#### <span id="4"></span>  4. Problem Solving Using Naive Baye's

The dataset is represented as below.

<img src="https://www.kdnuggets.com/wp-content/uploads/bayes-nagesh-2.png" class="center">

Here x1,x2….xn represent the features, i.e they can be mapped to Color, Type, and Origin. By substituting for X and expanding using the chain rule we get,<br>

<img src="https://www.kdnuggets.com/wp-content/uploads/bayes-nagesh-5.png" class="center">

The posterior probability P(y|X) can be calculated by first, creating a Frequency Table for each attribute against the target. Then, molding the frequency tables to Likelihood Tables and finally, use the Naïve Bayesian equation to calculate the posterior probability for each class. The class with the highest posterior probability is the outcome of the prediction. Below are the Frequency and likelihood tables for all three predictors.<br>


<img src="https://www.kdnuggets.com/wp-content/uploads/bayes-nagesh-8.png" class="center">

Frequency and Likelihood tables of ‘Color’<br>

<img src="https://www.kdnuggets.com/wp-content/uploads/bayes-nagesh-9.png" class="center">

Frequency and Likelihood tables of ‘Type’

<img src="https://www.kdnuggets.com/wp-content/uploads/bayes-nagesh-10.png" class="center">


Frequency and Likelihood tables of ‘Origin’<br>
So in our example, we have 3 predictors X.
<img src="https://www.kdnuggets.com/wp-content/uploads/bayes-nagesh-11.png" class="center">

As per the equations discussed above, we can calculate the posterior probability P(Yes | X) as :
<img src="https://www.kdnuggets.com/wp-content/uploads/bayes-nagesh-12.png" class="center">

and, P(No | X):
<img src="https://www.kdnuggets.com/wp-content/uploads/bayes-nagesh-13.png" class="center">

#### <span id="5"></span>  5. Types of Naïve Bayes Classifier

1. Multinomial Naïve Bayes: Feature vectors represent the frequencies with which certain events have been generated by a multinomial distribution. This is the event model typically used for document classification.<br>


2. Bernoulli Naïve Bayes: In the multivariate Bernoulli event model, features are independent booleans (binary variables) describing inputs. Like the multinomial model, this model is popular for document classification tasks, where binary term occurrence(i.e. a word occurs in a document or not) features are used rather than term frequencies(i.e. frequency of a word in the document).<br>


3. Gaussian Naïve Bayes: In Gaussian Naïve Bayes, continuous values associated with each feature are assumed to be distributed according to a Gaussian distribution(Normal distribution). 

References :<br>
https://www.kdnuggets.com/2020/06/naive-bayes-algorithm-everything.html<br>
https://www.javatpoint.com/machine-learning-naive-bayes-classifier

#### <span id="6"></span>  6. Naive Baye's Algorithm Implementation

##### Import Necessary Libraries

In [1]:
import numpy as np
import pandas as pd
import seaborn as sns

##### Import Dataset

In [3]:
df = pd.read_csv(r'C:\Users\imsanjoykb\Downloads\emails.csv')
df.head()

Unnamed: 0,text,spam
0,Subject: naturally irresistible your corporate...,1
1,Subject: the stock trading gunslinger fanny i...,1
2,Subject: unbelievable new homes made easy im ...,1
3,Subject: 4 color printing special request add...,1
4,"Subject: do not have money , get software cds ...",1


##### Dataset Analysis

In [4]:
df.shape

(5728, 2)

In [8]:
#### Check Null Value
df.isnull().sum()

text    0
spam    0
dtype: int64

In [9]:
#### Drop duplicates value
df.drop_duplicates(inplace = True)

In [10]:
df.shape

(5695, 2)

In [11]:
##### Import NLTK tool kits
import nltk
nltk.download('stopwords')

[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\imsanjoykb\AppData\Roaming\nltk_data...
[nltk_data]   Unzipping corpora\stopwords.zip.


True

In [12]:
from nltk.corpus import stopwords, words
import string

##### Remove Punctuation & stopwords

In [19]:
def clean_text(text):
    remove_punc = [char for char in text if char not in string.punctuation] # ; , . "" ?
    remove_punc = ''.join(remove_punc)

    rem_words = [word for word in remove_punc.split() if word.lower not in stopwords.words('english')]
    return rem_words

In [20]:
df['text'].head()

0    Subject: naturally irresistible your corporate...
1    Subject: the stock trading gunslinger  fanny i...
2    Subject: unbelievable new homes made easy  im ...
3    Subject: 4 color printing special  request add...
4    Subject: do not have money , get software cds ...
Name: text, dtype: object

In [21]:
#### Clean Text
df['text'].head().apply(clean_text)

0    [Subject, naturally, irresistible, your, corpo...
1    [Subject, the, stock, trading, gunslinger, fan...
2    [Subject, unbelievable, new, homes, made, easy...
3    [Subject, 4, color, printing, special, request...
4    [Subject, do, not, have, money, get, software,...
Name: text, dtype: object

##### Apply TfIdf Vectorizer

In [22]:
from sklearn.feature_extraction.text import TfidfVectorizer

In [23]:
x = TfidfVectorizer(analyzer = clean_text).fit_transform(df['text'])

In [24]:
y = df['spam']

##### Seperate train, test data

In [25]:
from sklearn.model_selection import train_test_split
xtrain, xtest, ytrain, ytest = train_test_split(x,y,test_size = .30 , random_state = 42)

In [26]:
xtrain.shape

(3986, 37380)

In [27]:
xtest.shape

(1709, 37380)

##### Apply Bernoulli Naive Bayes Algorithm

In [28]:
from sklearn.naive_bayes import BernoulliNB

In [29]:
bn = BernoulliNB()

In [30]:
bn.fit(xtrain,ytrain)

BernoulliNB()

In [31]:
bn.score(xtest,ytest)

0.976009362200117

In [32]:
predict = bn.predict(xtest)

##### Apply Confusion Matrix

In [33]:
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(ytest,predict)

In [34]:
cm

array([[1255,   14],
       [  27,  413]], dtype=int64)

##### Show Classification Report

In [35]:
from sklearn.metrics import classification_report

In [36]:
print(classification_report(ytest,predict))

              precision    recall  f1-score   support

           0       0.98      0.99      0.98      1269
           1       0.97      0.94      0.95       440

    accuracy                           0.98      1709
   macro avg       0.97      0.96      0.97      1709
weighted avg       0.98      0.98      0.98      1709



In [37]:
bn.score(xtest,ytest)

0.976009362200117

##### Apply Multinomial Naive Bayes Algorithm

In [38]:
from sklearn.naive_bayes import MultinomialNB

In [39]:
multi = MultinomialNB()

In [40]:
multi.fit(xtrain, ytrain)

MultinomialNB()

In [41]:
multi.score(xtest,ytest)

0.8367466354593329