## Naive Bayes From Scratch

* The `Naive Bayes` algorithm is based on `Bayes'` theorem, which states that the probability of A given B equals the probability of B given A multiplied by the probability of A, divided by the probability of B, i.e.:

$$
p(A | B) = \frac{p(B | A) \cdot p(A)}{p(B)}
$$

* Applying Bayes' Theorem to ML, we have:

$$
p(y | X) = \frac{p(X | y) \cdot p(y)}{p(X)}
$$

where:
  * $p(y | X)$: Posterior probability
  * $p(X | y)$: Class-conditional probability
  * $p(y)$: Prior probability of y
  * $p(X)$: Evidence (marginal probability of X)

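* Note that the posterior is proportional to the product (not the sum) of the class-conditional probability and the prior:

$$
\text{Posterior probability} \propto \text{Class-conditional probability} \times \text{Prior probability of } y
$$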
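As a quick numeric check of Bayes' theorem, here is a minimal sketch with a binary class. The prior and class-conditional values are made-up numbers chosen only for illustration; they do not come from any dataset.

```python
# Hypothetical values for a single observed input X and classes 0 and 1.
prior = {0: 0.7, 1: 0.3}        # p(y)
likelihood = {0: 0.1, 1: 0.8}   # p(X | y)

# Unnormalized posterior: p(X | y) * p(y)
unnormalized = {c: likelihood[c] * prior[c] for c in prior}

# The evidence p(X) is the sum over all classes, so the posteriors sum to 1.
evidence = sum(unnormalized.values())
posterior = {c: unnormalized[c] / evidence for c in prior}

# The class with the largest posterior is the prediction.
predicted = max(posterior, key=posterior.get)
print(posterior, predicted)
```

Dividing by the evidence rescales both classes by the same constant, so the predicted class is the same with or without it.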

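* It's a `naive` algorithm because it assumes that the features are mutually independent given the class (which might not be true).
* Under this assumption, the class-conditional probability factorizes over the individual features, so Bayes' theorem becomes: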

$$
p(y | X) = \frac{p(x_{1} | y) \cdot p(x_{2} | y) \cdots p(x_{n} | y) \cdot p(y)}{p(X)}
$$

* Since $p(X)$ does not depend on `y`, we can drop it without changing which class scores highest.
* To determine `y`, we pick the class that maximizes the (unnormalized) posterior, i.e.:

$$
y = \underset{y}{\operatorname{argmax}} \; p(x_{1} | y) \cdot p(x_{2} | y) \cdots p(x_{n} | y) \cdot p(y)
$$

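* Since the product of many probabilities yields a very small value (very close to 0), we take the `log` of the posterior to avoid numerical underflow. The log turns the product into a sum, and because the log is monotonically increasing, the argmax is unchanged.

$$
y = \underset{y}{\operatorname{argmax}} \left( \log p(x_{1} | y) + \log p(x_{2} | y) + \dots + \log p(x_{n} | y) + \log p(y) \right)
$$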
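The underflow problem is easy to reproduce. The sketch below uses 1000 hypothetical per-feature probabilities of 0.01 each (an arbitrary choice for illustration) and compares the raw product with the sum of logs.

```python
import numpy as np

# 1000 hypothetical per-feature probabilities, each equal to 0.01.
probs = np.full(1000, 0.01)

# The true product is 1e-2000, far below the smallest float64, so it becomes 0.0.
product = np.prod(probs)

# The sum of logs stays finite and can still be compared across classes.
log_sum = np.sum(np.log(probs))

print(product)  # 0.0
print(log_sum)  # about -4605.17
```

Once every class scores exactly 0.0, the argmax can no longer tell them apart, which is why the comparison is done in log space.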

* The class-conditional probability of each feature can be modelled using a Gaussian `Probability Density Function`:

$$
p(x_{i} | y) = \frac{\exp\left(-\frac{(x_{i} - \mu_{y})^2}{2\sigma_{y}^2}\right)}{\sqrt{2\pi\sigma_{y}^2}}
$$

where:
  * $\mu_{y}$: the mean of the feature given a class, i.e. when class=0 or 1.
  * $\sigma_{y}^2$: the variance of the feature given a class, i.e. when class=0 or 1.
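A minimal sketch of the log of this density, assuming the standard Gaussian form with the squared deviation $(x_{i} - \mu_{y})^2$ in the exponent. The function name `gaussian_log_pdf` and its parameters are hypothetical, chosen for this example.

```python
import numpy as np


def gaussian_log_pdf(x, mean, var):
    """Log of the Gaussian density, evaluated elementwise (hypothetical helper)."""
    # log(1 / sqrt(2*pi*var)) - (x - mean)^2 / (2*var)
    return -0.5 * np.log(2.0 * np.pi * var) - (x - mean) ** 2 / (2.0 * var)


# Sanity check at x = mean with unit variance: the density is 1 / sqrt(2*pi).
value = gaussian_log_pdf(0.0, 0.0, 1.0)
print(value)  # about -0.9189
```

Computing the log directly, rather than taking `np.log` of the density, avoids evaluating `exp` of a large negative number for points far from the mean.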

* Therefore, `y` is:

$$
y = \underset{y}{\operatorname{argmax}} \left( \sum_{i=1}^{n} \log\left( \frac{\exp\left(-\frac{(x_{i} - \mu_{y})^2}{2\sigma_{y}^2}\right)}{\sqrt{2\pi\sigma_{y}^2}} \right) + \log p(y) \right)
$$

* Since we have a binary class, we compute this score for both classes for each input; the index of the class with the highest score (argmax) is the predicted value of `y`.
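The decision rule above can be sketched end to end as follows. This is only one possible layout, not necessarily the implementation this notebook builds: the class name `GaussianNBSketch`, its attribute names, the `1e-9` variance floor, and the tiny dataset are all assumptions made for this example.

```python
import numpy as np


class GaussianNBSketch:
    """Minimal Gaussian Naive Bayes sketch (hypothetical name and API)."""

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        # Per-class feature means, variances, and class priors.
        self.means_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        # Small constant (an assumption) keeps the variance away from zero.
        self.vars_ = np.array([X[y == c].var(axis=0) for c in self.classes_]) + 1e-9
        self.priors_ = np.array([np.mean(y == c) for c in self.classes_])
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # Broadcast to shape (n_samples, n_classes, n_features).
        diff = X[:, None, :] - self.means_[None, :, :]
        log_pdf = -0.5 * np.log(2.0 * np.pi * self.vars_) - diff**2 / (2.0 * self.vars_)
        # Sum the log-densities over features, then add the log prior once per class.
        scores = log_pdf.sum(axis=2) + np.log(self.priors_)
        return self.classes_[np.argmax(scores, axis=1)]


# Tiny made-up dataset: class 0 clusters near 0, class 1 clusters near 5.
X_train = np.array(
    [[0.0, 0.2], [0.1, -0.1], [-0.2, 0.0], [5.0, 5.1], [4.9, 5.2], [5.2, 4.8]]
)
y_train = np.array([0, 0, 0, 1, 1, 1])

model = GaussianNBSketch().fit(X_train, y_train)
predictions = model.predict(np.array([[0.05, 0.0], [5.1, 5.0]]))
print(predictions)  # [0 1]
```

Note that the log prior is added once per class, outside the sum over features, which matches the argmax expression above.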

In [1]:
import numpy as np

# Black code formatter (Optional)
%load_ext lab_black

In [2]:
class NaiveBayes:
    pass