# Naive Bayes #

Naive Bayes is a supervised learning algorithm that applies Bayes' theorem to predict class labels.

Bayes' Theorem:
<center>$P(Y|X) = \frac{P(X|Y)P(Y)}{P(X)}$</center>

Using Bayes' theorem to predict the label $y$ from a feature vector $x_1, \dotsc, x_n$:
<center>$P(y|x_1, \dotsc, x_n) = \frac{P(x_1, \dotsc, x_n|y)P(y)}{P(x_1, \dotsc, x_n)}$</center>

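Assumption: the features are conditionally independent of one another given the label, so $P(x_1, \dotsc, x_n|y) = \prod_{i=1}^n P(x_i|y)$.

The denominator $P(x_1, \dotsc, x_n)$ does not depend on $y$, so it can be dropped when comparing labels, and the labeling equation becomes: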
<center>$\hat{y} = \mathsf{argmax}_y P(y)\prod_{i=1}^n P(x_i|y)$</center>

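Estimate $P(y)$ and $P(x_i|y)$ from the training data: $P(y)$ is the relative frequency of class $y$, and $P(x_i|y)$ depends on the distribution assumed for each feature (Gaussian for the `GaussianNB` model used below).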
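To make the estimation step concrete, here is a minimal pure-Python sketch of the labeling equation. It assumes each $P(x_i|y)$ is Gaussian and works in log space to avoid underflow. The helper names `fit_gaussian_nb` and `predict_one`, the toy data, and the small constant added to each variance are illustrative assumptions, not part of scikit-learn.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate P(y) and per-feature Gaussian parameters for P(x_i|y)."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)

    model = {}
    for label, rows in rows_by_class.items():
        prior = len(rows) / len(y)  # P(y): relative class frequency
        columns = list(zip(*rows))  # one tuple of values per feature
        means = [sum(col) / len(col) for col in columns]
        # small constant (an assumption) keeps variances strictly positive
        variances = [
            sum((v - m) ** 2 for v in col) / len(col) + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (prior, means, variances)
    return model


def predict_one(model, x):
    """Return argmax_y of log P(y) + sum_i log P(x_i|y)."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for xi, m, v in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# hypothetical toy data: two well-separated classes, two features each
X_toy = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [5.0, 6.0], [5.2, 5.8], [4.8, 6.2]]
y_toy = [0, 0, 0, 1, 1, 1]

toy_model = fit_gaussian_nb(X_toy, y_toy)
toy_pred = [predict_one(toy_model, x) for x in [[1.1, 2.1], [5.1, 5.9]]]
print(toy_pred)
```

Summing log-probabilities instead of multiplying probabilities gives the same argmax and stays numerically stable when there are many features.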

In [9]:
# import necessary packages

from sklearn import datasets, model_selection, metrics
from sklearn import naive_bayes as nb

### Data set ###

Load the built-in Iris data set from scikit-learn to illustrate a Naive Bayes implementation.

Separate the data into training and test sets.

In [6]:
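# load the Iris data set: X holds the features, Y the class labels
data = datasets.load_iris()
X = data['data']
Y = data['target']

# hold out 30% of the samples for testing
X_train, X_test, Y_train, Y_test = model_selection.train_test_split(X, Y, train_size=0.7)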

### Model implementation with scikit-learn ###

In [11]:
# fit on the training split only, so the test set stays unseen
model = nb.GaussianNB().fit(X_train, Y_train)
Y_pred = model.predict(X_test)

### Model Analysis ###

Analyze the resulting model using accuracy and the confusion matrix.

Confusion Matrix:
Entry [i, j] in the matrix is the number of samples whose true label is i and whose predicted label is j.
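As a small illustration of that definition, the sketch below builds a confusion matrix by hand. The function name `confusion_counts` and the toy labels are hypothetical. The row index is the true label and the column index is the predicted label.

```python
def confusion_counts(y_true, y_pred, n_classes):
    """Count samples so that matrix[i][j] = true label i, predicted label j."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for true_label, pred_label in zip(y_true, y_pred):
        matrix[true_label][pred_label] += 1
    return matrix


# hypothetical labels for six samples across three classes
y_true_toy = [0, 0, 1, 1, 2, 2]
y_pred_toy = [0, 1, 1, 1, 2, 0]

toy_matrix = confusion_counts(y_true_toy, y_pred_toy, n_classes=3)
for row in toy_matrix:
    print(row)
```

Correct predictions land on the diagonal, so the sum of the diagonal divided by the total number of samples is the accuracy.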

In [12]:
# accuracy on training data
train_acc = model.score(X_train, Y_train)
print('Accuracy on training:', train_acc)

# accuracy on testing data
test_acc = model.score(X_test, Y_test)
print('Accuracy on testing:', test_acc)
print()

matrix = metrics.confusion_matrix(Y_test, Y_pred)
print(matrix)

Accuracy on training: 0.9619047619047619
Accuracy on testing: 0.9555555555555556

[[14  0  0]
 [ 0 16  1]
 [ 0  1 13]]


Accuracy on the test set is about 95.6% (43 of 45 samples classified correctly), so the model does a good job of classifying the flowers.

Confusion matrix formatted (rows are actual classes, columns are predicted classes):

| | Predicted 0 | Predicted 1 | Predicted 2 |
| ------- | ------- | ------- | ------- |
| Actual 0 | 14 | 0 | 0 |
| Actual 1 | 0 | 16 | 1 |
| Actual 2 | 0 | 1 | 13 |

Based on the confusion matrix, the per-class counts over the 45 test samples are:
* Class 0:
    * True positives: 14
    * False positives: 0
    * False negatives: 0
    * True negatives: 31
* Class 1:
    * True positives: 16
    * False positives: 1
    * False negatives: 1
    * True negatives: 27
* Class 2:
    * True positives: 13
    * False positives: 1
    * False negatives: 1
    * True negatives: 30
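The per-class counts can be derived mechanically from the confusion matrix printed earlier. The sketch below hard-codes that matrix and treats each class in turn as the "positive" class. The variable names are illustrative.

```python
# confusion matrix from the output above: rows = actual, columns = predicted
cm = [
    [14, 0, 0],
    [0, 16, 1],
    [0, 1, 13],
]

n_classes = len(cm)
total = sum(sum(row) for row in cm)

counts = {}
for k in range(n_classes):
    tp = cm[k][k]                                     # actual k, predicted k
    fn = sum(cm[k]) - tp                              # actual k, predicted other
    fp = sum(cm[i][k] for i in range(n_classes)) - tp # actual other, predicted k
    tn = total - tp - fp - fn                         # everything else
    counts[k] = {"TP": tp, "FP": fp, "FN": fn, "TN": tn}
    print(f"Class {k}: {counts[k]}")
```

For each class, the four counts sum to the total number of test samples, which is a quick sanity check on the arithmetic.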