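# Naive Bayes
## Formula Derivation

### The principles of the Bayes algorithm are proved in detail in Li Hang's machine learning book; only the key points are proved below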

1 Why maximizing the posterior probability is equivalent to minimizing the expected risk in naive Bayes

* Let $L(y, f(x))$ be the loss function. Integrating it against the joint distribution gives the expected risk


$$
\begin{array}{l}{R_{\mathrm{exp}}(f)=\int_{x} \int_{y} L(y, f(x)) \times P(x, y) \, d y \, d x} \\ {=\int_{x} \int_{y} L(y, f(x)) \times P(y | x) P(x) \, d y \, d x} \\ {=\int_{x} P(x)\left(\int_{y} L(y, f(x)) \times P(y | x) \, d y\right) d x}\end{array}
$$

* Because $P(x) \geq 0$, it suffices to minimize the inner integral separately for each $X=x$. For a discrete label $Y \in\{c_{1}, \ldots, c_{K}\}$ the integral becomes a sum over the classes

$$
f(x)=\arg \min _{y} \sum_{k=1}^{K} L\left(c_{k}, y\right) \times P\left(c_{k} | X=x\right)
$$

* Taking the loss to be the 0-1 loss (an indicator function, $L(c_{k}, y)=I(y \neq c_{k})$), we obtain the equivalent form

$$
\begin{array}{l}{f(x)=\arg \min _{y} \sum_{k=1}^{K} L\left(c_{k}, y\right) \times P\left(c_{k} | X=x\right)} \\ {=\arg \min _{y} \sum_{k=1}^{K} I\left(y \neq c_{k}\right) P\left(c_{k} | X=x\right)} \\ {=\arg \min _{y}\left(1-P\left(Y=y | X=x\right)\right)} \\ {=\arg \max _{y} P\left(Y=y | X=x\right)}\end{array}
 $$
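
The equivalence above can be checked numerically for a single fixed $x$. The snippet below is a minimal sketch: the three class labels and the posterior vector are made-up values chosen only for illustration, not taken from the book.

```python
import numpy as np

# Hypothetical posterior P(c_k | X = x) over three classes for one fixed x.
classes = np.array([0, 1, 2])
posterior = np.array([0.2, 0.5, 0.3])

# 0-1 loss matrix: loss[k, j] = 1 when the true class c_k differs from
# the candidate prediction y_j, and 0 otherwise.
loss = (classes[:, None] != classes[None, :]).astype(float)

# Conditional expected loss of each candidate prediction y_j:
#   sum_k L(c_k, y_j) * P(c_k | X = x)
expected_loss = posterior @ loss

best_by_risk = classes[np.argmin(expected_loss)]
best_by_posterior = classes[np.argmax(posterior)]

print(expected_loss)                    # equals 1 - posterior, up to rounding
print(best_by_risk, best_by_posterior)  # both pick the same class
```

Under the 0-1 loss, each entry of `expected_loss` is one minus the corresponding posterior, so the prediction with the smallest risk is the class with the largest posterior.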

2 Bayes' theorem

$$ 
P\left(Y=c_{k} | X=x\right)=\frac{P\left(X=x | Y=c_{k}\right) P\left(Y=c_{k}\right)}{\sum_{k} P\left(X=x | Y=c_{k}\right) P\left(Y=c_{k}\right)}
 $$
 
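Under the naive conditional-independence assumption, $P\left(X=x | Y=c_{k}\right)=\prod_{j} P\left(X^{(j)}=x^{(j)} | Y=c_{k}\right)$, so the posterior becomes

$$
P\left(Y=c_{k} | X=x\right)=\frac{P\left(Y=c_{k}\right) \prod_{j} P\left(X^{(j)}=x^{(j)} | Y=c_{k}\right)}{\sum_{k} P\left(Y=c_{k}\right) \prod_{j} P\left(X^{(j)}=x^{(j)} | Y=c_{k}\right)}
$$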
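
The factorized posterior can be evaluated directly once the prior and the per-feature conditional tables are known. The following is a rough sketch: the function name `naive_bayes_posterior`, the class labels `"a"`/`"b"`, and all probabilities are hypothetical and serve only to illustrate the formula.

```python
def naive_bayes_posterior(prior, cond, x):
    """Return P(Y = c | X = x) for every class c under the naive assumption.

    prior: dict mapping class -> P(Y = c)
    cond:  dict mapping feature index j -> class -> value -> P(X^(j) = value | Y = c)
    x:     sequence of feature values, one per feature index
    """
    joint = {}
    for c, p_c in prior.items():
        p = p_c
        for j, value in enumerate(x):
            p *= cond[j][c][value]  # numerator: prior times product of conditionals
        joint[c] = p
    evidence = sum(joint.values())  # denominator: sum of the numerators over classes
    return {c: p / evidence for c, p in joint.items()}


# Hypothetical toy tables: two classes, two binary features.
prior = {"a": 0.5, "b": 0.5}
cond = {
    0: {"a": {0: 0.8, 1: 0.2}, "b": {0: 0.4, 1: 0.6}},
    1: {"a": {0: 0.5, 1: 0.5}, "b": {0: 0.25, 1: 0.75}},
}

posterior = naive_bayes_posterior(prior, cond, x=(1, 1))
print(posterior)
```

The nested layout of `cond` is an assumption chosen to mirror a feature, class, value lookup; any structure that supports the same lookup would work.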

```python
import numpy as np

class NavieBayesClassifier:
    def __init__(self, lamb=0):
        # P(Y = c_k), keyed by class label
        self.prior_prob_y = {}
        # P(X^(d) = v | Y = c_k), keyed as [d][c_k][v]
        self.prior_prob_x = {}
        self.x_dim = 0
        # Laplace smoothing coefficient
        self.lamb = lamb

    def fit(self, x, y):
        '''
        x: 2-D ndarray of discrete feature values
        y: 1-D ndarray of class labels
        x and y must have the same length
        '''
        self.x_dim = len(x[0])
        y_list = y.tolist()
        y_unique = np.unique(y)
        for val in y_unique:
            # Class prior (not smoothed)
            self.prior_prob_y[val] = y_list.count(val) / len(y_list)
        y = np.array([y_list])
        xy = np.hstack((x, y.T))
        for d in range(self.x_dim):
            # Handle each feature dimension separately
            x_and_y = xy[:, (d, -1)]
            x_unique = np.unique(xy[:, d])
            # Number of distinct values of feature d, used in the smoothed denominator
            laplace = len(x_unique)
            self.prior_prob_x[d] = {}
            for yy in y_unique:
                # Handle each class label
                x_when_yy = x_and_y[x_and_y[:, -1] == yy]
                x_list = x_when_yy[:, 0].tolist()
                self.prior_prob_x[d][yy] = {}
                for xx in x_unique:
                    # Smoothed probability of each feature value given the class
                    self.prior_prob_x[d][yy][xx] = (x_list.count(xx) + self.lamb) / (len(x_list) + laplace * self.lamb)

    def predict(self, x):
        '''
        x: 1-D array holding a single sample
        Returns a dict mapping each class label to its posterior probability
        '''
        res = {}
        all_pro = 0
        for y_val in self.prior_prob_y:
            res[y_val] = self.prior_prob_y[y_val]
            px_y = 1
            for d in range(self.x_dim):
                px_y *= self.prior_prob_x[d][y_val][x[d]]
            res[y_val] *= px_y
            all_pro += res[y_val]
        # Normalize so the posteriors sum to 1
        for y_val in res:
            res[y_val] /= all_pro
        return res
```
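
The `predict` method above multiplies probabilities directly, which is fine for two features but can underflow to zero when there are many. A common alternative, shown here as a sketch rather than as part of the original class, is to sum logarithms and normalize with log-sum-exp. The function name `log_posterior` and the 2000-feature toy tables are assumptions made up for the demonstration.

```python
import math


def log_posterior(prior, cond, x):
    """Return log P(Y = c | X = x) for every class c, computed in log space.

    prior: dict mapping class -> P(Y = c)
    cond:  dict mapping feature index j -> class -> value -> P(X^(j) = value | Y = c)
    All probabilities must be strictly positive (e.g. Laplace smoothing with lamb > 0).
    """
    log_joint = {}
    for c, p_c in prior.items():
        log_joint[c] = math.log(p_c) + sum(
            math.log(cond[j][c][value]) for j, value in enumerate(x)
        )
    # log-sum-exp: subtract the maximum before exponentiating for stability
    m = max(log_joint.values())
    log_evidence = m + math.log(sum(math.exp(v - m) for v in log_joint.values()))
    return {c: v - log_evidence for c, v in log_joint.items()}


# Hypothetical setup: 2000 features that all take the value 1.
n_features = 2000
prior = {0: 0.5, 1: 0.5}
cond = {j: {0: {1: 0.4}, 1: {1: 0.6}} for j in range(n_features)}
x = [1] * n_features

naive_product = 0.5 * 0.6 ** n_features  # direct product underflows to 0.0
log_post = log_posterior(prior, cond, x)

print(naive_product)
print(log_post)
```

The direct product is too small for a double-precision float, while the log-space version still ranks the two classes correctly.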


```python
# Test with the example from the book
# if __name__ == '__main__':
xy = [[1,4,-1],
    [1,5,-1],
    [1,5,1],
    [1,4,1],
    [1,4,-1],
    [2,4,-1],
    [2,5,-1],
    [2,5,1],
    [2,6,1],
    [2,6,1],
    [3,6,1],
    [3,5,1],
    [3,5,1],
    [3,6,1],
    [3,6,-1]]
xy = np.array(xy)

sb_clf = NavieBayesClassifier(1)
sb_clf.fit(xy[:, (0,1)], xy[:, -1])

print('x prob', sb_clf.prior_prob_x)
print('y prob', sb_clf.prior_prob_y)
```


```
x prob {0: {-1: {1: 0.4444444444444444, 2: 0.3333333333333333, 3: 0.2222222222222222}, 1: {1: 0.25, 2: 0.3333333333333333, 3: 0.4166666666666667}}, 1: {-1: {4: 0.4444444444444444, 5: 0.3333333333333333, 6: 0.2222222222222222}, 1: {4: 0.16666666666666666, 5: 0.4166666666666667, 6: 0.4166666666666667}}}
y prob {-1: 0.4, 1: 0.6}
```


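* The conditional probabilities match the book's Laplace-smoothed results ($\lambda=1$). The class priors printed above (0.4 and 0.6) are unsmoothed, because `fit` applies $\lambda$ only to the conditional probabilities; with the same smoothing they would be 7/17 and 10/17.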
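
As a cross-check, the smoothed estimates can be recomputed as exact fractions from the raw counts. This is a minimal sketch using only the standard library. The helper `smoothed_conditional` is a hypothetical name, and the smoothed prior is included only for comparison; the class above does not compute it.

```python
from collections import Counter
from fractions import Fraction

# The same 15 samples as above, as (x1, x2, y) tuples.
data = [(1, 4, -1), (1, 5, -1), (1, 5, 1), (1, 4, 1), (1, 4, -1),
        (2, 4, -1), (2, 5, -1), (2, 5, 1), (2, 6, 1), (2, 6, 1),
        (3, 6, 1), (3, 5, 1), (3, 5, 1), (3, 6, 1), (3, 6, -1)]
lam = 1  # Laplace smoothing coefficient

class_counts = Counter(y for *_, y in data)
n, k = len(data), len(class_counts)

# Smoothed class prior: (count(c) + lam) / (N + K * lam)
smoothed_prior = {c: Fraction(cnt + lam, n + k * lam) for c, cnt in class_counts.items()}


def smoothed_conditional(j, value, c):
    """(count(x_j = value, y = c) + lam) / (count(y = c) + S_j * lam)."""
    n_values = len({row[j] for row in data})  # S_j: distinct values of feature j
    joint = sum(1 for row in data if row[j] == value and row[-1] == c)
    return Fraction(joint + lam, class_counts[c] + n_values * lam)


print(smoothed_prior)
print(smoothed_conditional(0, 1, -1))  # compare with 0.4444... in the output above
print(smoothed_conditional(1, 4, 1))   # compare with 0.1666... in the output above
```

Working with `Fraction` avoids floating-point noise, so the values can be compared with hand-computed fractions directly.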