# Logistic Regression
## Deriving the Gradient Descent Algorithm for Logistic Regression

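1. The logistic regression model

Logistic regression is used for binary classification. The model is:
$$
\hat{y} = \sigma(z) = \sigma(Xw + b)
$$
where:
- $X$ is the $m \times n$ data matrix; each row is a sample and each column is a feature,
- $w$ is the $n \times 1$ weight vector,
- $b$ is the scalar bias,
- $z = Xw + b$ is the result of the linear transformation,
- $\hat{y}$ is the $m \times 1$ vector of predicted probabilities that each sample belongs to class 1,
- $\sigma(z)$ is the sigmoid activation function, applied element-wise: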
$$
\sigma(z) = \frac{1}{1 + e^{-z}}
$$
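The formula above overflows in floating point when $z$ is a large negative number, because $e^{-z}$ becomes huge. The following is a minimal sketch of one common workaround, which picks an algebraically equivalent form depending on the sign of $z$. The name `stable_sigmoid` is illustrative and does not appear elsewhere in this notebook.

```python
import numpy as np

def stable_sigmoid(z):
    """Element-wise sigmoid for array input that avoids overflow in np.exp."""
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    pos = z >= 0
    # For z >= 0, exp(-z) <= 1, so the textbook form is safe.
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    # For z < 0, use the equivalent form exp(z) / (1 + exp(z)), where exp(z) < 1.
    ez = np.exp(z[~pos])
    out[~pos] = ez / (1.0 + ez)
    return out

values = stable_sigmoid(np.array([-1000.0, -2.0, 0.0, 2.0, 1000.0]))
print(values)
```

The outputs stay inside $[0, 1]$ and satisfy $\sigma(-z) = 1 - \sigma(z)$, which is a quick sanity check for any sigmoid implementation.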

2. The loss function

Logistic regression uses the binary cross-entropy loss. Minimizing it is equivalent to maximum likelihood estimation of the parameters:
$$
L = -\frac{1}{m} \sum_{i=1}^{m} \left[ y_i \log(\hat{y}_i) + (1 - y_i) \log(1 - \hat{y}_i) \right]
$$

where:
- $y_i \in \{0, 1\}$ is the true label,
- $\hat{y}_i = \sigma(z_i)$ is the predicted value.
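To make the loss concrete, here is a small sketch that evaluates it on four hand-picked predictions. The function name `binary_cross_entropy` and the clipping constant `eps` are assumptions added for illustration; the clipping only guards against `log(0)` and is not part of the formula above.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy between labels in {0, 1} and probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip probabilities away from 0 and 1 so that np.log stays finite.
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))

y_true = np.array([1, 0, 1, 0])
good = binary_cross_entropy(y_true, np.array([0.9, 0.1, 0.8, 0.3]))
bad = binary_cross_entropy(y_true, np.array([0.1, 0.9, 0.2, 0.7]))
print(good, bad)
```

Confident correct predictions give a loss close to 0, while confident wrong predictions are penalized heavily, so `bad` is larger than `good`.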

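3. Computing the gradients

Because $\sigma'(z) = \sigma(z)\,(1 - \sigma(z))$, the chain rule gives $\frac{\partial L}{\partial z_i} = \frac{1}{m}(\hat{y}_i - y_i)$. In addition, $\frac{\partial z_i}{\partial w} = x_i$, where $x_i$ is the $i$-th row of $X$ written as a column vector.

Gradient with respect to the weights $w$: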

$$
\frac{\partial L}{\partial w} = \frac{1}{m} \sum_{i=1}^{m} \left( \hat{y}_i - y_i \right) x_i
$$

In matrix form:
$$
\frac{\partial L}{\partial w} = \frac{1}{m} X^T (\hat{y} - y)
$$

Gradient with respect to the bias $b$:

$$
\frac{\partial L}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} (\hat{y}_i - y_i)
$$
In matrix form, with $\mathbf{1}$ the $m \times 1$ vector of ones:
$$
\frac{\partial L}{\partial b} = \frac{1}{m} \mathbf{1}^T (\hat{y} - y)
$$
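One way to gain confidence in the closed-form gradients is to compare them with central finite differences of the loss. The sketch below does this on random data; the helper names (`loss`, `gradients`), the step size `h`, and the data shapes are illustrative assumptions rather than part of the derivation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(X, y, w, b):
    p = sigmoid(X @ w + b)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def gradients(X, y, w, b):
    m = X.shape[0]
    residual = sigmoid(X @ w + b) - y   # y_hat - y, shape (m, 1)
    grad_w = X.T @ residual / m         # shape (n, 1)
    grad_b = residual.sum() / m         # scalar
    return grad_w, grad_b

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = rng.integers(0, 2, size=(8, 1)).astype(float)
w = rng.normal(size=(3, 1))
b = 0.1

grad_w, grad_b = gradients(X, y, w, b)

# Central differences: perturb one parameter at a time by +/- h.
h = 1e-6
num_w = np.zeros_like(w)
for j in range(w.shape[0]):
    step = np.zeros_like(w)
    step[j] = h
    num_w[j] = (loss(X, y, w + step, b) - loss(X, y, w - step, b)) / (2 * h)
num_b = (loss(X, y, w, b + h) - loss(X, y, w, b - h)) / (2 * h)

print(np.max(np.abs(grad_w - num_w)), abs(grad_b - num_b))
```

Both printed differences should be tiny compared with the gradient values themselves, which indicates that the analytic and numerical gradients agree.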

4. The gradient descent update

Gradient descent updates the parameters as follows:
$$
\begin{aligned}
w &\leftarrow w - \alpha \frac{\partial L}{\partial w} \\
b &\leftarrow b - \alpha \frac{\partial L}{\partial b}
\end{aligned}
$$
where $\alpha$ is the learning rate.
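The update rule can be sketched as a full-batch loop that recomputes both gradients over all $m$ samples at every step. The synthetic data, the labeling rule, the learning rate `alpha = 0.5`, and the 200 iterations are all assumptions chosen for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(y, p, eps=1e-12):
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

rng = np.random.default_rng(0)
m, n = 200, 2
X = rng.normal(size=(m, n))
# Hypothetical linearly separable labels, shape (m, 1).
y = (X[:, [0]] + 2.0 * X[:, [1]] > 0).astype(float)

w = np.zeros((n, 1))
b = 0.0
alpha = 0.5  # learning rate
losses = []
for _ in range(200):
    y_hat = sigmoid(X @ w + b)
    losses.append(bce(y, y_hat))
    grad_w = X.T @ (y_hat - y) / m   # dL/dw in matrix form
    grad_b = np.mean(y_hat - y)      # dL/db
    w -= alpha * grad_w
    b -= alpha * grad_b

accuracy = np.mean((sigmoid(X @ w + b) >= 0.5) == (y == 1))
print(losses[0], losses[-1], accuracy)
```

With zero-initialized parameters every prediction starts at 0.5, so the first loss equals $\log 2$; it then decreases as the parameters are updated.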

In [15]:
import pandas as pd
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def sigmoid(x: np.ndarray) -> np.ndarray:
    # Clip the input so that np.exp(-x) cannot overflow for large negative x.
    x = np.clip(x, -500, 500)
    return 1.0 / (1.0 + np.exp(-x))

class LogisticResgression:
    def __init__(self, in_dim: int, out_dim: int, lr: float = 0.1):
        self.in_dim = in_dim
        self.out_dim = out_dim
        self.lr = lr

        self._init_param()

    def __call__(self, x: np.ndarray):
        return self.forward(x)

    def _init_param(self):
        self.w = np.random.randn(self.in_dim, self.out_dim)
        self.b = np.zeros((1, self.out_dim))

    def forward(self, x: np.ndarray):
        return x @ self.w + self.b

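    def train(self, X: np.ndarray, y: np.ndarray, epochs: int = 20):
        for _ in range(epochs):
            # Stochastic gradient descent: update on one sample at a time.
            for x, y_true in zip(X, y):
                x = np.expand_dims(x, axis=0)  # shape (1, in_dim)
                y_pred = self.predict(x)
                # Per-sample gradient w.r.t. z is y_hat - y (assumes out_dim == 1).
                err = np.sum(y_pred - y_true)
                grad_w = err * x.T  # shape (in_dim, 1)
                grad_b = err
                self.w -= self.lr * grad_w
                self.b -= self.lr * grad_b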

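    def predict(self, X: np.ndarray):
        # Return the predicted probability of class 1 for each row of X.
        output = self.forward(X)
        y_pred = sigmoid(output)

        # Drop the output axis; this requires out_dim == 1.
        return y_pred.squeeze(axis=1)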

In [16]:
def get_dataset():
    iris = load_iris()
    df = pd.DataFrame(iris.data, columns=iris.feature_names)
    df['label'] = iris.target
    df.columns = [
        'sepal length', 'sepal width', 'petal length', 'petal width', 'label'
    ]

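    # Keep the first two features (sepal length, sepal width) and the label.
    # Iris has only 150 rows, so the slice [:10000] selects all of them.
    data = np.array(df.iloc[:10000, [0, 1, -1]])
    X, y = data[:, :-1], data[:, -1]
    # One-vs-rest labels: class 1 (versicolor) becomes +1, the other classes -1.
    y = np.array([1 if i == 1 else -1 for i in y])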

    return X, y

In [17]:
X, y = get_dataset()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
# Map the {-1, +1} labels to {0, 1}, which the cross-entropy loss expects.
y_train = np.where(y_train == -1, 0, 1)
y_test = np.where(y_test == -1, 0, 1)
model = LogisticResgression(X_train.shape[1], 1)
model.train(X_train, y_train)

y_pred = model.predict(X_test)
y_pred_binary = (y_pred >= 0.5).astype(int)
# Fraction of correctly classified test samples.
acc = accuracy_score(y_test, y_pred_binary)
print(f"accuracy: {acc}")

accuracy: 0.7
