# Coding the Naive Bayes Classifier From Scratch

This post walks through the basics of the Naive Bayes Classifier and shows a Python implementation built from the ground up. Although Naive Bayes is a fairly simple and straightforward algorithm, it has a number of real-world use cases, including the canonical spam detection as well as sentiment analysis and weather prediction. The example here uses [UCI's Banknote Authentication Dataset](https://archive.ics.uci.edu/ml/datasets/banknote+authentication "UCI ML Repo").

## Banknote Dataset

First, let's get started with the basics and __read in the dataset:__

In [1]:
import pandas as pd
import numpy as np

# The raw file has no header row, so column names are assigned by hand
url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/00267/data_banknote_authentication.txt'
df = pd.read_csv(url, header=None)
df.columns = ['imgVariance','imgSkewness','imgCurtosis','imgEntropy','Class']
df.head()

| | imgVariance | imgSkewness | imgCurtosis | imgEntropy | Class |
|---|---|---|---|---|---|
| 0 | 3.6216 | 8.6661 | -2.8073 | -0.44699 | 0 |
| 1 | 4.5459 | 8.1674 | -2.4586 | -1.4621 | 0 |
| 2 | 3.866 | -2.6383 | 1.9242 | 0.10645 | 0 |
| 3 | 3.4566 | 9.5228 | -4.0112 | -3.5944 | 0 |
| 4 | 0.32924 | -4.4552 | 4.5718 | -0.9888 | 0 |


The dataset consists of 1372 observations. Four features describe images of genuine and forged banknotes, and a label indicates whether or not the note is genuine. Before diving into the classifier, let's __split our data into training and test sets:__

In [2]:
msk = np.random.rand(len(df)) < 0.8
train_df = df[msk]
test_df = df[~msk]
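
Because the mask above is drawn at random, every run produces a different split, and the train set is only approximately 80% of the rows. Below is a minimal sketch of a reproducible variant. The helper name `split_train_test`, the seed value, and the small stand-in frame `demo_df` are assumptions made for illustration and are not part of the original notebook.

```python
import numpy as np
import pandas as pd

def split_train_test(frame, train_frac=0.8, seed=42):
    """Assign each row to the train set with probability train_frac.

    Hypothetical helper: seeding the generator makes the split repeatable.
    """
    rng = np.random.default_rng(seed)
    mask = rng.random(len(frame)) < train_frac
    return frame[mask], frame[~mask]

# Stand-in data so the sketch runs without downloading the banknote file
demo_df = pd.DataFrame({
    'feature': np.arange(1000, dtype=float),
    'Class': np.arange(1000) % 2,
})

demo_train, demo_test = split_train_test(demo_df)
print(len(demo_train), len(demo_test))
```

The same call on `df` would give a split that stays fixed between runs, which makes later accuracy numbers easier to compare.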

## Why is it Naive? Why is it Bayes(ian)?

The Naive Bayes Classifier is a supervised learning algorithm, so given a set of data points $\{x^1, \dots, x^m\}$ our goal is to predict the correct labels $\{y^1, \dots, y^m\}$. However, unlike discriminative classifiers such as logistic regression or decision trees, which directly estimate $P(y|x)$ and create decision boundaries to make predictions, the Naive Bayes Classifier is a __generative classifier__: it models $P(x|y)$ and then uses that to estimate $P(y|x)$. This is where good old Bayes' Theorem helps you out.



In [8]:
%%latex
\begin{equation*}
P(Y|X) = \frac{P(X|Y)P(Y)}{P(X)}
\end{equation*}

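
To make the role of Bayes' Theorem concrete, here is a small sketch that turns class priors $P(y)$ and class-conditional likelihoods $P(x|y)$ into posteriors $P(y|x)$ for a single observation. The function name `posterior` and the numbers are made up for illustration; they are not estimated from the banknote data.

```python
def posterior(prior_y, likelihood_x_given_y):
    """Return P(y|x) for each class y via Bayes' Theorem.

    prior_y:               dict mapping class -> P(y)
    likelihood_x_given_y:  dict mapping class -> P(x|y) for one fixed x
    """
    # Numerator of Bayes' Theorem: P(x|y) * P(y)
    joint = {y: likelihood_x_given_y[y] * prior_y[y] for y in prior_y}
    # Denominator P(x), obtained by summing the joint over all classes
    evidence = sum(joint.values())
    return {y: joint[y] / evidence for y in joint}

# Illustrative values only: class 0 is more common overall,
# but this particular x is more likely under class 1
priors = {0: 0.6, 1: 0.4}
likelihoods = {0: 0.1, 1: 0.3}

post = posterior(priors, likelihoods)
predicted_class = max(post, key=post.get)
print(post, predicted_class)
```

Since $P(x)$ is the same for every class, it only rescales the scores; comparing the numerators $P(x|y)P(y)$ alone would pick the same class.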