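## Requirements:
Implement the overall structure of a learning algorithm
+ Read and define the input parameters
+ Initialize the parameters
+ Compute the loss function and its gradient
+ Loop
    + Compute the current loss (forward propagation)
    + Compute the current gradient (backpropagation)
    + Update the parameters (gradient descent)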
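
For logistic regression, the steps in this list can be written out as follows. The notation is an assumption chosen to match the code later in this notebook: $X$ is the $m \times n$ sample matrix, $Y$ the $m \times 1$ label vector, $a^{(i)}$ and $y^{(i)}$ the $i$-th entries of $A$ and $Y$, and $\alpha$ the learning rate.

$$
\begin{aligned}
Z &= Xw + b, \qquad A = \sigma(Z) = \frac{1}{1 + e^{-Z}} \\
J(w, b) &= -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log a^{(i)} + \left(1 - y^{(i)}\right) \log\left(1 - a^{(i)}\right) \right] \\
\frac{\partial J}{\partial w} &= \frac{1}{m} X^{\top} (A - Y), \qquad \frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} \left(a^{(i)} - y^{(i)}\right) \\
w &\leftarrow w - \alpha \frac{\partial J}{\partial w}, \qquad b \leftarrow b - \alpha \frac{\partial J}{\partial b}
\end{aligned}
$$

The compact form $A - Y$ in the gradient follows from the chain rule together with $\sigma'(z) = \sigma(z)\,(1 - \sigma(z))$: the factor $a(1 - a)$ cancels against the denominator of the derivative of the cross-entropy term.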

In [1]:
import numpy as np
import pandas as pd
from tqdm import tqdm

## 1. Data preprocessing

In [10]:
# Load the first 100 rows; the label mapping below assumes they contain exactly two species
origin_data = pd.read_csv('../../Datasets/exercise_data/Iris.csv', header=0, nrows=100, index_col=0)
species = origin_data["Species"].unique()
origin_data["Species"] = origin_data["Species"].map({species[0]: 0, species[1]: 1})

# Shuffle the rows
shuffle_data = origin_data.sample(frac=1)

# Split into a training set (first 80 rows) and a test set (remaining 20 rows)
train_X = shuffle_data.iloc[:80, :-1].values
train_Y = shuffle_data.iloc[:80, -1].values.reshape(-1, 1)

test_X = shuffle_data.iloc[80:, :-1].values
test_Y = shuffle_data.iloc[80:, -1].values.reshape(-1, 1)
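
Because `sample(frac=1)` draws a different permutation on every run, the split above is not reproducible. The sketch below shows one way to make it so by passing `random_state`. The helper name `split_features_labels`, the seed value, and the synthetic `toy` frame (a stand-in for the Iris file) are assumptions for illustration only.

```python
import numpy as np
import pandas as pd


def split_features_labels(df, n_train, seed=0):
    """Shuffle df reproducibly, then split it into train/test features and labels.

    Assumes the label is the last column, as in the cell above.
    """
    shuffled = df.sample(frac=1, random_state=seed)
    X = shuffled.iloc[:, :-1].to_numpy(dtype=float)
    Y = shuffled.iloc[:, -1].to_numpy(dtype=float).reshape(-1, 1)
    return X[:n_train], Y[:n_train], X[n_train:], Y[n_train:]


# Synthetic stand-in data: 100 rows, 4 features, binary label in the last column
rng = np.random.default_rng(0)
toy = pd.DataFrame(rng.normal(size=(100, 4)), columns=["f1", "f2", "f3", "f4"])
toy["label"] = np.repeat([0, 1], 50)

tr_X, tr_Y, te_X, te_Y = split_features_labels(toy, n_train=80)
print(tr_X.shape, tr_Y.shape, te_X.shape, te_Y.shape)
```

With a fixed seed, repeated runs produce the same training and test rows, which makes accuracy figures comparable between runs.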

## 2. Implement the sigmoid function

In [3]:
def sigmoid(x: np.ndarray) -> np.ndarray:
    """
    Compute the sigmoid of x

    Arguments:
    x -- A scalar or numpy array of any size

    Return:
    s -- sigmoid(x)
    """
    s = 1 / (1 + np.exp(-x))
    return s
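
For very negative inputs, `np.exp(-x)` overflows in float64 and NumPy emits a `RuntimeWarning`, even though the final result still rounds to 0. A minimal sketch of an overflow-free variant follows; the name `stable_sigmoid` is hypothetical and not part of the notebook.

```python
import numpy as np


def stable_sigmoid(x):
    """Sigmoid that only ever exponentiates non-positive numbers."""
    x = np.asarray(x, dtype=float)
    e = np.exp(-np.abs(x))  # always in (0, 1], so it cannot overflow
    # For x >= 0: 1 / (1 + exp(-x)); for x < 0: exp(x) / (1 + exp(x))
    return np.where(x >= 0, 1 / (1 + e), e / (1 + e))


print(stable_sigmoid(np.array([-1000.0, 0.0, 1000.0])))  # values 0.0, 0.5, 1.0
```

Both branches are algebraically equal to `1 / (1 + np.exp(-x))`; they differ only in which form is evaluated, so the results agree with the original function wherever that one does not overflow.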

## 3. Implement the gradient of the sigmoid function

In [4]:
def sigmoid_derivative(x: np.ndarray) -> np.ndarray:
    """
    Compute the gradient (also called the slope or derivative) of the sigmoid function with respect to its input x.
    You can store the output of the sigmoid function into variables and then use it to calculate the gradient.
    
    Arguments:
    x -- A scalar or numpy array
    
    Return:
    ds -- Your computed gradient.
    """
    s = sigmoid(x)
    ds = s * (1 - s)
    return ds
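
One way to gain confidence in the closed form `s * (1 - s)` is to compare it with a central finite difference. The sketch below redefines both functions so that it runs on its own; `central_difference` and the step size `h` are assumptions for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1 / (1 + np.exp(-x))


def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1 - s)


def central_difference(f, x, h=1e-5):
    """Approximate f'(x) by (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)


xs = np.linspace(-5.0, 5.0, 11)
max_error = np.max(np.abs(sigmoid_derivative(xs) - central_difference(sigmoid, xs)))
print(max_error)  # expected to be tiny if the analytic derivative is right
```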

## 4. Overall logic

In [7]:
def model(train_X, train_Y, iteration_num=2000, learning_rate=0.005, TOL=1e-6, print_cost=False):
    """
    Builds the logistic regression model by calling the functions implemented previously

    Arguments:
    train_X -- training set represented by a numpy array of shape (m_train, n_features)
    train_Y -- training labels represented by a numpy array of shape (m_train, 1)
    iteration_num -- hyperparameter representing the maximum number of gradient descent iterations
    learning_rate -- hyperparameter representing the step size of the gradient descent update
    TOL -- hyperparameter representing the tolerance of the cost function; training stops early once the cost falls below it
    print_cost -- Set to True to print the cost every 1,000,000 iterations

    Returns:
    w -- weights, a numpy array of shape (n_features, 1)
    b -- bias, a scalar
    """
    # Get the training set dimensions: m is the number of samples, n the number of features
    m, n = train_X.shape

    # Initialize the parameters
    w = np.zeros((n, 1))
    b = 0

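    # Gradient descent
    for i in tqdm(range(1, iteration_num + 1)):
        # Compute the current loss (forward propagation)
        Z = train_X @ w + b
        A = sigmoid(Z)
        cost = -1 / m * np.sum(train_Y * np.log(A) + (1 - train_Y) * np.log(1 - A))

        # Compute the current gradient (backpropagation)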
        dZ = A - train_Y
        dw = 1 / m * train_X.T @ dZ
        db = 1 / m * np.sum(dZ)

        # Update the parameters
        w -= learning_rate * dw
        b -= learning_rate * db

        # Print the cost
        if print_cost and i % 1000000 == 0:
            print("Cost after iteration {}: {}".format(i, cost))

        # Stop early once the cost has fallen below the tolerance
        if cost < TOL:
            break

    return w, b
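
Two things in the loop are worth checking separately. First, if an activation saturates to exactly 0 or 1, `np.log` returns `-inf` and the cost becomes `nan`, so the `cost < TOL` test can never succeed. Second, the analytic gradient `dw` can be validated against a numerical one. The sketch below does both on random stand-in data; `cost_and_grads`, `numerical_grad_w`, and the clipping constant `eps` are assumptions, not part of the notebook.

```python
import numpy as np


def sigmoid(x):
    return 1 / (1 + np.exp(-x))


def cost_and_grads(w, b, X, Y, eps=1e-12):
    """Cross-entropy cost and its gradients for X of shape (m, n), Y of shape (m, 1)."""
    m = X.shape[0]
    A = sigmoid(X @ w + b)
    A_safe = np.clip(A, eps, 1 - eps)  # keeps both logarithms finite
    cost = -np.mean(Y * np.log(A_safe) + (1 - Y) * np.log(1 - A_safe))
    dZ = A - Y
    dw = X.T @ dZ / m
    db = float(np.sum(dZ) / m)
    return cost, dw, db


def numerical_grad_w(w, b, X, Y, h=1e-6):
    """Central-difference approximation of dJ/dw, one weight at a time."""
    grad = np.zeros_like(w)
    for j in range(w.shape[0]):
        step = np.zeros_like(w)
        step[j, 0] = h
        c_plus, _, _ = cost_and_grads(w + step, b, X, Y)
        c_minus, _, _ = cost_and_grads(w - step, b, X, Y)
        grad[j, 0] = (c_plus - c_minus) / (2 * h)
    return grad


# Random stand-in data, not the Iris set
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
Y = (rng.random((20, 1)) > 0.5).astype(float)
w = rng.normal(size=(4, 1)) * 0.1
b = 0.05

cost, dw, db = cost_and_grads(w, b, X, Y)
max_diff = np.max(np.abs(dw - numerical_grad_w(w, b, X, Y)))
print(max_diff)  # should be close to zero if dw matches the cost
```

Clipping only changes the reported cost in the saturated regime; the gradients are still computed from the unclipped `A`, so the update rule is the same as in `model`.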

## 5. Testing

In [None]:
# Train the model
w, b = model(train_X, train_Y, iteration_num=20000000, learning_rate=0.005, TOL=1e-6, print_cost=True)

# Compute the prediction accuracy on the test set
predict_Y = sigmoid(np.dot(test_X, w) + b)
predict_Y[predict_Y > 0.5] = 1
predict_Y[predict_Y <= 0.5] = 0
predict_accuracy = np.mean(predict_Y == test_Y)
print("test accuracy: {} %".format(predict_accuracy * 100))

# Print the model parameters
print("w = {}".format(w))
print("b = {}".format(b))
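
The thresholding and accuracy steps can also be wrapped in small helpers so that the same code serves both the training and the test set. This is a sketch only: `predict`, `accuracy`, and the hand-made one-feature data are hypothetical and chosen so the result is easy to verify by eye.

```python
import numpy as np


def predict(X, w, b, threshold=0.5):
    """Return 0/1 predictions of shape (m, 1) for X of shape (m, n)."""
    probs = 1 / (1 + np.exp(-(X @ w + b)))
    return (probs > threshold).astype(int)


def accuracy(pred, Y):
    """Fraction of predictions equal to the labels."""
    return float(np.mean(pred == Y))


# One feature, weight 1, bias 0: positive inputs map to class 1
X = np.array([[2.0], [-1.0], [0.5], [-3.0]])
Y = np.array([[1], [0], [1], [0]])
w = np.array([[1.0]])
b = 0.0

pred = predict(X, w, b)
print(accuracy(pred, Y))  # 1.0
```

Building the prediction with a single comparison avoids overwriting the probability array in place, so the probabilities remain available if they are needed afterwards.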