# Naive Bayes Algorithm
Algorithm:

Input: training set $T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, where $x_i = (x_i^{(1)}, x_i^{(2)}, \ldots, x_i^{(n)})^T$ and $x_i^{(j)}$ is the $j$th feature of the $i$th sample.

$x_i^{(j)} \in \{a_{j1}, a_{j2}, \ldots, a_{jS_j}\}$, where $a_{jl}$ is the $l$th possible value of the $j$th feature.

$j = 1, 2, \ldots, n$, $l = 1, 2, \ldots, S_j$, $y_i \in \{c_1, c_2, \ldots, c_K\}$

A sample $x$ to classify.

Output: the class of sample $x$

Algorithm for naive Bayes with Bayesian estimation

(1) Compute the prior and conditional probabilities, with smoothing parameter $\lambda \ge 0$ ($\lambda = 0$ gives the maximum likelihood estimate, $\lambda = 1$ gives Laplace smoothing):
$$
P(Y=c_k) = \frac{\sum_{i=1}^{N}I(y_i = c_k) + \lambda}{N + K \lambda}, \quad k = 1,2,\ldots,K
$$

$$
P(X^{(j)} = a_{jl} | Y=c_k) = \frac{\sum_{i=1}^{N}I(x_i^{(j)} = a_{jl}, y_i = c_k) + \lambda}{\sum_{i=1}^{N}I(y_i = c_k) + S_j \lambda}
$$
$$
j = 1,2,\ldots,n; \quad l = 1,2,\ldots,S_j; \quad k = 1,2,\ldots,K
$$
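
To make the role of $\lambda$ concrete, here is a minimal sketch of the smoothed estimate from step (1). The helper name `smoothed_estimate` and the counts are hypothetical and chosen only for illustration; exact fractions are used so the results can be checked by hand.

```python
from fractions import Fraction

def smoothed_estimate(count, total, n_values, lam):
    """Return (count + lam) / (total + n_values * lam) as an exact fraction."""
    lam = Fraction(str(lam))
    return (count + lam) / (total + n_values * lam)

# Hypothetical counts: a feature with 3 possible values observed 0, 2 and 4
# times among the 6 samples of one class.
counts = [0, 2, 4]
mle = [smoothed_estimate(c, 6, 3, 0) for c in counts]      # lambda = 0
laplace = [smoothed_estimate(c, 6, 3, 1) for c in counts]  # lambda = 1

print([str(p) for p in mle])      # ['0', '1/3', '2/3']
print([str(p) for p in laplace])  # ['1/9', '1/3', '5/9']
print(sum(laplace))               # 1
```

With $\lambda = 0$ the unseen value gets probability 0, which would zero out the whole product in step (2); any $\lambda > 0$ avoids this while the estimates still sum to 1.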

(2) For the given sample $x = (x^{(1)}, x^{(2)}, \ldots, x^{(n)})^T$, compute:
$$
P(Y = c_k)\prod_{j = 1}^n P(X^{(j)} = x^{(j)} | Y = c_k)
$$
$$k = 1,2,\ldots,K$$

(3) Determine the class of sample $x$ as the one with the largest value:
$$
y = \arg \mathop{\max}_{c_k} P(Y = c_k)\prod_{j = 1}^n P(X^{(j)} = x^{(j)} | Y = c_k)
$$
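
Steps (2) and (3) can be sketched as follows. This is an illustrative variant, not the implementation used later in this notebook: it sums logarithms instead of multiplying probabilities, which picks the same class (the logarithm is monotonic) and avoids underflow when $n$ is large. It assumes every probability is strictly positive, which holds when $\lambda > 0$. The function name `classify`, the dictionary layout, and the toy model are assumptions made for the example.

```python
import math

def classify(x, prior, cond):
    """Return the class maximizing log P(Y=c) + sum_j log P(X^(j)=x^(j) | Y=c).

    prior: {class: P(Y=class)}
    cond:  {(j, value, class): P(X^(j)=value | Y=class)}, with j counted from 0
    """
    scores = {}
    for c, p_c in prior.items():
        scores[c] = math.log(p_c) + sum(
            math.log(cond[(j, v, c)]) for j, v in enumerate(x)
        )
    return max(scores, key=scores.get), scores

# Hypothetical two-class, one-feature model used only for illustration.
prior = {"a": 0.5, "b": 0.5}
cond = {(0, "u", "a"): 0.9, (0, "v", "a"): 0.1,
        (0, "u", "b"): 0.2, (0, "v", "b"): 0.8}
label, scores = classify(["v"], prior, cond)
print(label)  # b
```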


In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

from collections import Counter
import math

In [3]:
# Example
lambda_ = 0.2
x = [2, "S"]
X1 = [1, 2, 3]
X2 = ["S", "M", "L"]
Y = [1, -1]

Example table:

|  X/Y   | 1     | 2      | 3     | 4      | 5     | 6      | 7     | 8      | 9     | 10     | 11    | 12     | 13    | 14    | 15  |
|  ----  | ----  |  ----  | ----  |  ----  | ----  |  ----  | ----  |  ----  | ----  |  ----  | ----  |  ----  | ----  | ----  | ----|
| X_1    | 1  | 1  | 1  | 1  | 1  | 2  | 2  | 2  | 2  | 2  | 3  | 3  | 3  | 3  | 3  |
| X_2    | S  | M  | M  | S  | S  | S  | M  | M  | L  | L  | L  | M  | M  | L  | L  |
| Y      | -1  | -1  | 1  |  1 | -1  | -1  | -1  | 1  | 1  | 1  | 1  | 1  | 1  | 1  | -1  |

With $\lambda = 0.2$, $K = 2$ and $S_1 = S_2 = 3$:

$P_\lambda(Y=1) = (9+\lambda)/(15+2\lambda) = (9+0.2)/(15+2 \times 0.2) \approx 0.5974$  
$P_\lambda(Y=-1) = (6+\lambda)/(15+2\lambda) = (6+0.2)/(15+2 \times 0.2) \approx 0.4026$  
$P(X^{(1)}=1|Y=1) = (2+0.2)/(9+3 \times 0.2) \approx 0.2292$  
$P(X^{(1)}=2|Y=1) = (3+0.2)/(9+3 \times 0.2) \approx 0.3333$  
$P(X^{(1)}=3|Y=1) = (4+0.2)/(9+3 \times 0.2) = 0.4375$  
$P(X^{(2)}=S|Y=1) = (1+0.2)/(9+3 \times 0.2) = 0.125$  
$P(X^{(2)}=M|Y=1) = (4+0.2)/(9+3 \times 0.2) = 0.4375$  
$P(X^{(2)}=L|Y=1) = (4+0.2)/(9+3 \times 0.2) = 0.4375$  
$P(X^{(1)}=1|Y=-1) = (3+0.2)/(6+3 \times 0.2) \approx 0.4848$  
$P(X^{(1)}=2|Y=-1) = (2+0.2)/(6+3 \times 0.2) \approx 0.3333$  
$P(X^{(1)}=3|Y=-1) = (1+0.2)/(6+3 \times 0.2) \approx 0.1818$  
$P(X^{(2)}=S|Y=-1) = (3+0.2)/(6+3 \times 0.2) \approx 0.4848$  
$P(X^{(2)}=M|Y=-1) = (2+0.2)/(6+3 \times 0.2) \approx 0.3333$  
$P(X^{(2)}=L|Y=-1) = (1+0.2)/(6+3 \times 0.2) \approx 0.1818$  

Therefore, for the sample $x = (2, S)^T$:  
$P(Y=1)P(X^{(1)}=2|Y=1)P(X^{(2)}=S|Y=1) \approx 0.5974 \times 0.3333 \times 0.125 \approx 0.0249$  
$P(Y=-1)P(X^{(1)}=2|Y=-1)P(X^{(2)}=S|Y=-1) \approx 0.4026 \times 0.3333 \times 0.4848 \approx 0.0651$  

Since $0.0651 > 0.0249$, the sample is assigned to class $-1$.
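
The hand computation can be cross-checked with a few lines of plain Python. This is a minimal sketch that counts directly from the example table using `collections.Counter`; the variable names and the `score` helper are illustrative and independent of the `NaiveBayes` class defined in the next cell.

```python
from collections import Counter

lam = 0.2
# Columns of the example table, in order.
X1 = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3]
X2 = list("SMMSSSMMLLLMMLL")
Y = [-1, -1, 1, 1, -1, -1, -1, 1, 1, 1, 1, 1, 1, 1, -1]

class_counts = Counter(Y)
joint = Counter()  # (feature index, value, class) -> count
for x1, x2, y in zip(X1, X2, Y):
    joint[(0, x1, y)] += 1
    joint[(1, x2, y)] += 1
n_values = [len(set(X1)), len(set(X2))]  # S_1, S_2

def score(x, c):
    """Smoothed prior times smoothed conditionals for class c."""
    p = (class_counts[c] + lam) / (len(Y) + len(class_counts) * lam)
    for j, v in enumerate(x):
        p *= (joint[(j, v, c)] + lam) / (class_counts[c] + n_values[j] * lam)
    return p

scores = {c: score((2, "S"), c) for c in class_counts}
total = sum(scores.values())
posterior = {c: s / total for c, s in scores.items()}  # normalized to sum to 1
print(max(scores, key=scores.get))  # -1
```

Dividing each score by their sum turns the two products into posterior probabilities; for this sample the posterior of class $-1$ is roughly 0.72.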

In [42]:
class NaiveBayes:
    """Naive Bayes for categorical features with Bayesian (additive) smoothing."""

    def __init__(self, lambda_):
        self.lambda_ = lambda_

    def fit(self, X, y):
        N, M = X.shape

        py  = {}
        pxy = {}
        classes, class_counts = np.unique(y, return_counts=True)
        # Possible values of each feature, taken from the whole training set so
        # that S_j does not depend on the class and a (value, class) pair that
        # never occurs still gets a smoothed, non-zero probability.
        values = [np.unique(X[:, col]) for col in range(M)]
        for k, v in zip(classes, class_counts):
            # Smoothed prior P(Y = k)
            py[k] = (v + self.lambda_) / (N + len(classes) * self.lambda_)
            X_k = X[y == k]  # samples belonging to class k
            for col in range(M):
                S_j = len(values[col])
                for kk in values[col]:
                    vv = np.sum(X_k[:, col] == kk)
                    # Smoothed conditional P(X^(col+1) = kk | Y = k)
                    pxy['X({})={}|Y={}'.format(col + 1, kk, k)] = (vv + self.lambda_) / (v + S_j * self.lambda_)

        self.py = py
        self.pxy = pxy

    def predict(self, x):
        # Print the fitted tables so they can be compared with the hand computation
        print(self.py)
        print(self.pxy)
        res = {}
        for k, v in self.py.items():
            p = v
            for i in range(len(x)):
                p = p * self.pxy['X({})={}|Y={}'.format(i + 1, x[i], k)]
            res[k] = p
        print(res)
        # Return the class with the largest unnormalized posterior
        return max(res, key=res.get)

In [53]:
lambda_ = 0.2

d = {'S':0, 'M':1, 'L':2}

X = np.array([[1, d['S']], [1, d['M']], [1, d['M']],
             [1, d['S']], [1, d['S']], [2, d['S']],
             [2, d['M']], [2, d['M']], [2, d['L']],
             [2, d['L']], [3, d['L']], [3, d['M']],
             [3, d['M']], [3, d['L']], [3, d['L']]])

y = np.array([-1, -1, 1, 1, -1, -1, -1, 1, 1, 1, 1, 1, 1, 1, -1])
model = NaiveBayes(lambda_)
model.fit(X, y)
model.predict(np.array([2, 0]))

{-1: 0.4025974025974026, 1: 0.5974025974025974}
{'X(1)=1|Y=-1': 0.4848484848484849, 'X(1)=2|Y=-1': 0.33333333333333337, 'X(1)=3|Y=-1': 0.18181818181818182, 'X(2)=0|Y=-1': 0.4848484848484849, 'X(2)=1|Y=-1': 0.33333333333333337, 'X(2)=2|Y=-1': 0.18181818181818182, 'X(1)=1|Y=1': 0.22916666666666669, 'X(1)=2|Y=1': 0.33333333333333337, 'X(1)=3|Y=1': 0.43750000000000006, 'X(2)=0|Y=1': 0.125, 'X(2)=1|Y=1': 0.43750000000000006, 'X(2)=2|Y=1': 0.43750000000000006}
{-1: 0.06506624688442873, 1: 0.024891774891774892}


-1

# Sklearn implementation of naive Bayes

In [54]:
from sklearn.naive_bayes import GaussianNB

In [60]:
clf = GaussianNB()
clf.fit(X, y)
clf.predict([[2, 0]])

array([-1])

In [57]:
from sklearn.naive_bayes import BernoulliNB, MultinomialNB

In [58]:
clf1 = BernoulliNB()
clf1.fit(X, y)
clf1.predict([[2, 0]])

array([-1])

In [59]:
clf2 = MultinomialNB()
clf2.fit(X, y)
clf2.predict([[2, 0]])

array([1])
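
The three estimators above make different assumptions about the features: `GaussianNB` treats them as continuous and normally distributed, `BernoulliNB` binarizes them (by default at a threshold of 0), and `MultinomialNB` treats them as counts. None of these matches the categorical model used in this notebook, so their predictions need not agree with the hand computation, as the `MultinomialNB` result shows. scikit-learn also provides `CategoricalNB`, which is closer to the algorithm described here. The sketch below assumes each feature is re-encoded as integers $0, \ldots, S_j - 1$ and uses `alpha=0.2` in the role of $\lambda$; the names `enc1`, `enc2`, `X_cat` and `clf_cat` are illustrative. As far as I can tell, `CategoricalNB` smooths only the conditional probabilities and uses the unsmoothed class frequencies as the prior, so its probabilities differ slightly from the values above even though the predicted class is the same.

```python
import numpy as np
from sklearn.naive_bayes import CategoricalNB

# Columns of the example table, in order.
X1 = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3]
X2 = list("SMMSSSMMLLLMMLL")
y = np.array([-1, -1, 1, 1, -1, -1, -1, 1, 1, 1, 1, 1, 1, 1, -1])

# CategoricalNB expects each feature encoded as integers 0, 1, ..., S_j - 1.
enc1 = {1: 0, 2: 1, 3: 2}
enc2 = {"S": 0, "M": 1, "L": 2}
X_cat = np.array([[enc1[a], enc2[b]] for a, b in zip(X1, X2)])

clf_cat = CategoricalNB(alpha=0.2)  # alpha plays the role of lambda
clf_cat.fit(X_cat, y)
pred = clf_cat.predict([[enc1[2], enc2["S"]]])  # the sample x = (2, S)
print(pred)  # [-1]
```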