# Chapter 6: Logistic Regression

## The Logistic Regression Model

### The logistic regression model
The binomial logistic regression model is the following conditional probability distribution:   
<center>$P(Y=1|x)={e^{w\cdot{x}+b}\over1+e^{w\cdot{x}+b}}$</center>   
<center>$P(Y=0|x)={1\over1+e^{w\cdot{x}+b}}$</center>   
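For convenience, the weight vector and the input vector are augmented and still written as $w$ and $x$, so that $w=(w^{(1)},w^{(2)},w^{(3)},\cdots,w^{(n)},b)^T$ and $x=(x^{(1)},x^{(2)},x^{(3)},\cdots,x^{(n)},1)^T$. The logistic regression model then becomes:   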
<center>$P(Y=1|x)={e^{w\cdot{x}}\over1+e^{w\cdot{x}}}$</center>
<center>$P(Y=0|x)={1\over1+e^{w\cdot{x}}}$</center>   
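
The equivalence of the two forms can be checked numerically. The following is a minimal sketch in plain Python; the values of `w`, `b`, and `x` and the helper name `p_y1` are assumptions made up for illustration, not taken from the text.

```python
import math

def p_y1(w, x, b=0.0):
    """P(Y=1|x) = exp(w.x + b) / (1 + exp(w.x + b))."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return math.exp(z) / (1.0 + math.exp(z))

# Illustrative values only (assumed for this example).
w, b = [0.5, -1.0], 0.25
x = [2.0, 1.0]

# Form with an explicit bias term b.
p_plain = p_y1(w, x, b)

# Augmented form: append b to w and 1 to x, then drop the explicit bias.
p_aug = p_y1(w + [b], x + [1.0])

# The two forms agree, and P(Y=0|x) is the complement of P(Y=1|x).
print(p_plain, p_aug, 1.0 - p_plain)
```

Both calls evaluate the same linear score, so the augmented notation changes only the bookkeeping, not the model.
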
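Now consider a key property of the model. The odds of an event is the ratio of the probability that the event occurs to the probability that it does not occur. If an event occurs with probability $p$, its odds are $\frac{p}{1-p}$, and its log odds, or logit function, is
<center>$\operatorname{logit}(p)=\log\frac{p}{1-p}$</center>   
For logistic regression, combining this with the formulas above gives   
<center>$\log\frac{P(Y=1|x)}{1-P(Y=1|x)}=w\cdot x$</center>   
That is, the log odds of the output $Y=1$ is a linear function of the input $x$.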
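
As a quick numerical check of this relationship, the sketch below (plain Python, with helper names `odds`, `logit`, and `sigmoid` chosen here for illustration) applies the logit to the model probability and recovers the linear score $z=w\cdot x$.

```python
import math

def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    return p / (1.0 - p)

def logit(p):
    """Log odds of an event with probability p."""
    return math.log(odds(p))

def sigmoid(z):
    """P(Y=1|x) written in terms of the linear score z = w.x."""
    return 1.0 / (1.0 + math.exp(-z))

# For any score z, logit(sigmoid(z)) returns z up to rounding error.
for z in (-2.0, 0.0, 0.5, 3.0):
    p = sigmoid(z)
    print(z, p, logit(p))
```
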
   
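Parameter estimation:   
When learning a logistic regression model, the parameters can be estimated by maximum likelihood, which yields the fitted model.   
Let <center>$P(Y=1|x)=\pi(x), \quad P(Y=0|x)=1-\pi(x)$</center>   
Then the likelihood function is   
<center>$\prod_{i=1}^N{[\pi(x_i)]^{y_i}[1-\pi(x_i)]^{1-y_i}}$</center>   
and the log-likelihood function is   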
<center>$L(w)=\sum_{i=1}^N{[y_i\log\pi(x_i)+(1-y_i)\log(1-\pi(x_i))]}=\sum_{i=1}^N{[y_i(w\cdot x_i)-\log(1+e^{w\cdot x_i})]}$</center>   
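
The two expressions for $L(w)$ agree because $\log\pi(x_i)=w\cdot x_i-\log(1+e^{w\cdot x_i})$ and $\log(1-\pi(x_i))=-\log(1+e^{w\cdot x_i})$. The sketch below checks this numerically in plain Python. The toy data, the weights, and the function names are assumptions for illustration; each input already carries the trailing constant 1 of the augmented form.

```python
import math

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def pi_x(w, x):
    """pi(x) = P(Y=1|x) for augmented w and x."""
    return 1.0 / (1.0 + math.exp(-dot(w, x)))

def log_likelihood_prob_form(w, X, y):
    """Sum of y_i*log(pi(x_i)) + (1 - y_i)*log(1 - pi(x_i))."""
    return sum(
        yi * math.log(pi_x(w, xi)) + (1 - yi) * math.log(1.0 - pi_x(w, xi))
        for xi, yi in zip(X, y)
    )

def log_likelihood_score_form(w, X, y):
    """Sum of y_i*(w.x_i) - log(1 + exp(w.x_i))."""
    return sum(
        yi * dot(w, xi) - math.log(1.0 + math.exp(dot(w, xi)))
        for xi, yi in zip(X, y)
    )

# Toy data (assumed): last component of each x is the constant 1.
X = [[1.0, 2.0, 1.0], [0.5, -1.0, 1.0], [-1.5, 0.3, 1.0]]
y = [1, 0, 1]
w = [0.4, -0.2, 0.1]

v1 = log_likelihood_prob_form(w, X, y)
v2 = log_likelihood_score_form(w, X, y)
print(v1, v2)
```
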
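Maximizing $L(w)$ yields the estimate of $w$.   
In logistic regression, the objective function is usually optimized with gradient descent or a quasi-Newton method.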
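
One way to carry out this optimization is batch gradient ascent on $L(w)$, which is the same as gradient descent on $-L(w)$. Differentiating the log-likelihood gives $\nabla L(w)=\sum_{i=1}^N (y_i-\pi(x_i))\,x_i$. The sketch below is a minimal plain-Python illustration under assumed toy data, learning rate, and iteration count; it is not a tuned implementation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def log_likelihood(w, X, y):
    """L(w) = sum of y_i*(w.x_i) - log(1 + exp(w.x_i))."""
    return sum(
        yi * dot(w, xi) - math.log(1.0 + math.exp(dot(w, xi)))
        for xi, yi in zip(X, y)
    )

def gradient(w, X, y):
    """Gradient of L(w): sum of (y_i - pi(x_i)) * x_i."""
    grad = [0.0] * len(w)
    for xi, yi in zip(X, y):
        err = yi - sigmoid(dot(w, xi))
        for j, xij in enumerate(xi):
            grad[j] += err * xij
    return grad

# Toy data (assumed): one feature plus the trailing constant 1.
# The classes overlap, so the maximum of L(w) is finite.
X = [[0.5, 1.0], [1.0, 1.0], [1.5, 1.0], [2.0, 1.0], [2.5, 1.0], [3.0, 1.0]]
y = [0, 0, 1, 0, 1, 1]

w = [0.0, 0.0]
lr = 0.1  # assumed step size, small enough for this data
history = [log_likelihood(w, X, y)]
for _ in range(200):
    g = gradient(w, X, y)
    # Ascent step: move along the gradient to increase L(w).
    w = [wj + lr * gj for wj, gj in zip(w, g)]
    history.append(log_likelihood(w, X, y))

print(w, history[0], history[-1])
```

The log-likelihood starts at $-N\log 2$ for $w=0$ and increases from there. A quasi-Newton method would use curvature information to reach the maximum in fewer iterations.
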


Implementing logistic regression

In [1]:
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import pandas as pd


class LogisticRegression(object):
    def __init__(self, max_iter=100, lr=0.01):
        self.iter = max_iter
        self.lr = lr

    def sigmoid(self, x):
        return 1 / (1 + np.exp(-x))

    def data_matrix(self, X):
        data_mat = []
        for d in X:
            # Prepend a constant 1.0 to each sample for the bias term
            data_mat.append([1.0, *d])
        return data_mat

    def fit(self, X, y):
        data_mat = self.data_matrix(X)  # m samples, each with a leading 1.0
        self.weights = np.zeros((len(data_mat[0]), 1), dtype=np.float32)
        for iter_ in range(self.iter):
            for i in range(len(X)):
                result = self.sigmoid(np.dot(data_mat[i], self.weights))
                error = y[i] - result
                # Update the weights using this single sample
                self.weights += self.lr * error * np.transpose([data_mat[i]])
        print('LogisticRegression Model(learning_rate={},max_iter={})'.format(self.lr, self.iter))

    def score(self, X_test, y_test):
        right = 0
        X_test = self.data_matrix(X_test)
        for x, y in zip(X_test, y_test):
            result = np.dot(x, self.weights)
            if (result > 0 and y == 1) or (result < 0 and y == 0):
                right += 1
        return right / len(X_test)

In [2]:
def create_data():
    iris = load_iris()
    df = pd.DataFrame(iris.data, columns=iris.feature_names)
    df['label'] = iris.target
    df.columns = ['sepal length', 'sepal width', 'petal length', 'petal width', 'label']
    data = np.array(df.iloc[:100, [0, 1, -1]])
    # print(data)
    return data[:, :2], data[:, -1]

In [3]:
X, y = create_data()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
lr = LogisticRegression()
lr.fit(X_train, y_train)
score = lr.score(X_test, y_test)
print(score)

LogisticRegression Model(learning_rate=0.01,max_iter=100)
1.0
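
The `score` method classifies by the sign of $w\cdot x$ instead of computing a probability. This is equivalent to thresholding $P(Y=1|x)$ at 0.5, because the sigmoid exceeds 0.5 exactly when its argument is positive. The sketch below illustrates the equivalence in plain Python; `predict_label` is a hypothetical helper and the weights are made-up values, with the bias first to match `data_matrix`.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_label(w, x):
    """Hypothetical helper: label 1 when P(Y=1|x) > 0.5, else 0."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if sigmoid(z) > 0.5 else 0

# Illustrative weights (assumed), bias first as in data_matrix.
w = [-1.0, 2.0, -0.5]

for features in ([1.0, 0.2], [3.0, 1.0], [0.5, 4.0]):
    x = [1.0] + features  # prepend the constant 1.0
    z = sum(wi * xi for wi, xi in zip(w, x))
    # Thresholding the probability and checking the sign of z agree.
    print(features, z, predict_label(w, x), int(z > 0))
```

Note that `score` counts a sample with $w\cdot x$ exactly equal to 0 as wrong for either label.
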
