# Classification: Naive Bayes

$
P(A \mid B) = \frac{P(B \mid A) \, P(A)}{P(B)}
$

where

- $P(A \mid B)$ is the posterior probability of class $A$ given the predictors (features) $B$.
- $P(A)$ is the prior probability of the class.
- $P(B \mid A)$ is the likelihood, i.e. the probability of the predictors given the class.
- $P(B)$ is the prior probability of the predictors (the evidence).

The table below counts 1000 fruits by class and by three features (long, sweet, yellow). A more complete walk-through of this example can be found in [All About Naive Bayes @ towardsdatascience](https://towardsdatascience.com/all-about-naive-bayes-8e13cef044cf) by [Gaurav Chauhan](https://towardsdatascience.com/@gauravc2708).

| Fruit  | Long | Sweet | Yellow | Total |
|--------|:----:|:-----:|:------:|-------|
| Banana |  400 |   350 |    450 |   500 |
| Orange |    0 |   150 |    300 |   300 |
| Other  |  100 |   150 |     50 |   200 |
| Total  |  500 |   650 |    800 |  1000 |

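The _Total_ column gives the class priors:

- 50% of the fruits are bananas, so $P(Banana) = 0.5$
- 30% are oranges, so $P(Orange) = 0.3$
- 20% are other fruits, so $P(Other) = 0.2$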
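The priors, the per-class likelihoods and the overall feature probabilities can all be read off the count table. The following is a minimal sketch in plain Python, not part of the original notebook; the names `counts`, `priors`, `likelihoods` and `evidence` are chosen here for illustration only.

```python
# Counts copied from the fruit table: per-class feature counts and class totals.
counts = {
    "Banana": {"Long": 400, "Sweet": 350, "Yellow": 450, "Total": 500},
    "Orange": {"Long": 0, "Sweet": 150, "Yellow": 300, "Total": 300},
    "Other": {"Long": 100, "Sweet": 150, "Yellow": 50, "Total": 200},
}
FEATURES = ("Long", "Sweet", "Yellow")

# Total number of fruits in the data set (1000).
n_fruits = sum(row["Total"] for row in counts.values())

# Class priors P(class): share of each fruit in the whole data set.
priors = {fruit: row["Total"] / n_fruits for fruit, row in counts.items()}

# Likelihoods P(feature | class): share of fruits of a class that show the feature.
likelihoods = {
    fruit: {f: row[f] / row["Total"] for f in FEATURES}
    for fruit, row in counts.items()
}

# Evidence P(feature): share of all fruits that show the feature.
evidence = {
    f: sum(row[f] for row in counts.values()) / n_fruits for f in FEATURES
}

for fruit in counts:
    print(fruit, priors[fruit], likelihoods[fruit])
print("Evidence:", evidence)
```

These dictionaries hold exactly the quantities that enter Bayes' theorem for each class.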

### Calculating Probabilities

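The question is: how probable is it that we have a _Banana_ when the given fruit is _Long_, _Sweet_ and _Yellow_? Naive Bayes assumes that the features are independent of each other given the class, so the likelihood of the combined features is the product of the individual likelihoods.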

#### $P(Banana \mid Long, Sweet, Yellow)$

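$
P(Banana \mid Long, Sweet, Yellow) = \frac{P(Long \mid Banana) \times P(Sweet \mid Banana) \times P(Yellow \mid Banana) \times P(Banana)}{P(Long) \times P(Sweet) \times P(Yellow)}
$

Reading the values from the table, for example $P(Long \mid Banana) = 400/500 = 0.8$ and $P(Long) = 500/1000 = 0.5$:

$
P(Banana \mid Long, Sweet, Yellow) = \frac{0.8 \times 0.7 \times 0.9 \times 0.5}{0.5 \times 0.65 \times 0.8} = \frac{0.252}{0.26}
$

$
P(Banana \mid Long, Sweet, Yellow) \approx 0.969
$

The denominator is the same for every class, so the numerator (here 0.252) is already enough to rank the classes.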

#### $P(Orange \mid Long, Sweet, Yellow)$

Given the same features, how probable is it that we have an _Orange_?

$
P(Orange \mid Long, Sweet, Yellow) = \frac{0 \times P(Sweet \mid Orange) \times P(Yellow \mid Orange) \times P(Orange)}{P(Long) \times P(Sweet) \times P(Yellow)}
$

$
P(Orange \mid Long, Sweet, Yellow) = 0
$

Since the data set contains no long _Orange_, $P(Long \mid Orange) = 0$, and the probability that a _Long_, _Sweet_ and _Yellow_ fruit is an _Orange_ is _0_.
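A single zero count forces the whole product to zero, regardless of the other features. One common remedy is additive (Laplace) smoothing. It is not used in this notebook or in the calculation above, so the sketch below is only an illustration under that assumption; the function name `smoothed_likelihood` and its parameters are hypothetical.

```python
def smoothed_likelihood(feature_count, class_count, alpha=1.0, n_values=2):
    """Estimate P(feature | class) with additive smoothing.

    alpha is the pseudo-count added to every outcome; n_values is the number
    of outcomes of the feature (2 here: the fruit is long or it is not).
    With alpha=0 this reduces to the plain relative frequency.
    """
    return (feature_count + alpha) / (class_count + alpha * n_values)


# Unsmoothed estimate from the table: 0 long oranges out of 300 oranges.
p_plain = smoothed_likelihood(0, 300, alpha=0.0)

# Smoothed estimate: (0 + 1) / (300 + 2), small but no longer zero.
p_smoothed = smoothed_likelihood(0, 300, alpha=1.0)

print(f"plain: {p_plain:.4f}, smoothed: {p_smoothed:.4f}")
```

With smoothing, an unseen combination becomes very unlikely instead of impossible.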

#### $P(Other \mid Long, Sweet, Yellow)$

$
P(Other \mid Long, Sweet, Yellow) = \frac{P(Long \mid Other) \times P(Sweet \mid Other) \times P(Yellow \mid Other) \times P(Other)}{P(Long) \times P(Sweet) \times P(Yellow)}
$

$
P(Other \mid Long, Sweet, Yellow) = \frac{0.5 \times 0.75 \times 0.25 \times 0.2}{0.5 \times 0.65 \times 0.8} = \frac{0.01875}{0.26}
$

$
P(Other \mid Long, Sweet, Yellow) \approx 0.072
$

Of the three classes, _Banana_ has by far the largest value, so a _Long_, _Sweet_ and _Yellow_ fruit is classified as a _Banana_.
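The three calculations above can be reproduced in a few lines of Python. This sketch is not part of the original notebook, and the names `priors`, `likelihoods` and `class_scores` are made up for illustration. For each class it multiplies the likelihoods with the prior. Then, as an assumption of this sketch, it divides by the sum of those products rather than by $P(Long) \times P(Sweet) \times P(Yellow)$, so that the resulting posteriors add up to 1.

```python
from math import prod

# Priors and likelihoods derived from the fruit table (count / class total).
priors = {"Banana": 0.5, "Orange": 0.3, "Other": 0.2}
likelihoods = {
    "Banana": {"Long": 0.8, "Sweet": 0.7, "Yellow": 0.9},
    "Orange": {"Long": 0.0, "Sweet": 0.5, "Yellow": 1.0},
    "Other": {"Long": 0.5, "Sweet": 0.75, "Yellow": 0.25},
}


def class_scores(observed):
    """Return likelihood * prior for every class, given the observed features."""
    return {
        fruit: prod(likelihoods[fruit][f] for f in observed) * priors[fruit]
        for fruit in priors
    }


scores = class_scores(["Long", "Sweet", "Yellow"])

# Normalize so that the posteriors over the three classes sum to 1.
total = sum(scores.values())
posteriors = {fruit: score / total for fruit, score in scores.items()}

# The predicted class is the one with the highest score.
best = max(scores, key=scores.get)

for fruit in scores:
    print(f"{fruit}: score={scores[fruit]:.5f}, posterior={posteriors[fruit]:.3f}")
print(f"Predicted class: {best}")
```

Because every class shares the same denominator, ranking by score and ranking by posterior always pick the same class.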

## Using Naive Bayes with sklearn

In [2]:
import numpy as np

In [3]:
import pandas as pd

In [46]:
from sklearn.naive_bayes import GaussianNB
from sklearn import datasets
from sklearn import model_selection
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.metrics import f1_score
from sklearn.metrics import confusion_matrix

In [17]:
dataset = datasets.load_wine()

### Data Exploration

Shape of the feature matrix (samples, features):

In [33]:
print(f'Shape: {dataset.data.shape}') 

Shape: (178, 13)


Feature names:

In [34]:
print(f"Features: {', '.join(dataset.feature_names)}")

Features: alcohol, malic_acid, ash, alcalinity_of_ash, magnesium, total_phenols, flavanoids, nonflavanoid_phenols, proanthocyanins, color_intensity, hue, od280/od315_of_diluted_wines, proline


In [35]:
print(f"Labels: {', '.join(dataset.target_names)}")

Labels: class_0, class_1, class_2


Print the first five records of the wine feature matrix:

In [36]:
print(dataset.data[0:5])

[[1.423e+01 1.710e+00 2.430e+00 1.560e+01 1.270e+02 2.800e+00 3.060e+00
  2.800e-01 2.290e+00 5.640e+00 1.040e+00 3.920e+00 1.065e+03]
 [1.320e+01 1.780e+00 2.140e+00 1.120e+01 1.000e+02 2.650e+00 2.760e+00
  2.600e-01 1.280e+00 4.380e+00 1.050e+00 3.400e+00 1.050e+03]
 [1.316e+01 2.360e+00 2.670e+00 1.860e+01 1.010e+02 2.800e+00 3.240e+00
  3.000e-01 2.810e+00 5.680e+00 1.030e+00 3.170e+00 1.185e+03]
 [1.437e+01 1.950e+00 2.500e+00 1.680e+01 1.130e+02 3.850e+00 3.490e+00
  2.400e-01 2.180e+00 7.800e+00 8.600e-01 3.450e+00 1.480e+03]
 [1.324e+01 2.590e+00 2.870e+00 2.100e+01 1.180e+02 2.800e+00 2.690e+00
  3.900e-01 1.820e+00 4.320e+00 1.040e+00 2.930e+00 7.350e+02]]


Print the wine labels (0: class_0, 1: class_1, 2: class_2):

In [31]:
print(dataset.target)

[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2]


In [39]:
# 70% training and 30% test
X_train, X_test, y_train, y_test = train_test_split(dataset.data,
                                                    dataset.target,
                                                    test_size=0.3,
                                                    random_state=109)

In [40]:
# Create a Gaussian Naive Bayes classifier
gnb = GaussianNB()

In [41]:
# Train the model on the training set
gnb.fit(X_train, y_train)

GaussianNB(priors=None, var_smoothing=1e-09)

In [42]:
# Predict the class labels for the test set
y_pred = gnb.predict(X_test)

## Model Evaluation

In [50]:
print(f'Confusion matrix:\n{confusion_matrix(y_test, y_pred)}')

Confusion matrix:
[[20  1  0]
 [ 2 15  2]
 [ 0  0 14]]


In [49]:
# Model accuracy: the fraction of test samples that are classified correctly
print(f'Accuracy: {accuracy_score(y_test, y_pred)}')

Accuracy: 0.9074074074074074
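The accuracy can be cross-checked directly against the confusion matrix, which also yields per-class precision, recall and F1 (the notebook imports `f1_score` but never calls it). The sketch below is not part of the original notebook. It hard-codes the matrix printed above and uses plain Python, and the names `cm` and `metrics` are illustrative. Rows are the true classes and columns the predicted classes, which is the layout `confusion_matrix` uses.

```python
# Confusion matrix copied from the output above
# (rows: true class, columns: predicted class).
cm = [
    [20, 1, 0],
    [2, 15, 2],
    [0, 0, 14],
]
n_classes = len(cm)

# Accuracy: correctly classified samples (the diagonal) over all samples.
total = sum(sum(row) for row in cm)
accuracy = sum(cm[k][k] for k in range(n_classes)) / total

metrics = {}
for k in range(n_classes):
    true_positives = cm[k][k]
    predicted_as_k = sum(cm[i][k] for i in range(n_classes))  # column sum
    actually_k = sum(cm[k])                                   # row sum
    precision = true_positives / predicted_as_k
    recall = true_positives / actually_k
    f1 = 2 * precision * recall / (precision + recall)
    metrics[k] = {"precision": precision, "recall": recall, "f1": f1}

print(f"Accuracy: {accuracy:.4f}")
for k, m in metrics.items():
    print(f"class_{k}: precision={m['precision']:.3f}, "
          f"recall={m['recall']:.3f}, f1={m['f1']:.3f}")
```

The same per-class figures should be obtainable from `f1_score(y_test, y_pred, average=None)` when the notebook's variables are available.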


## Sources

- https://www.datacamp.com/community/tutorials/naive-bayes-scikit-learn