Logistic regression: classifying 0 and 1

![logistic_regression.PNG](attachment:logistic_regression.PNG)

n: number of features (784)
m: number of training samples
α: constant step size (learning rate)
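
Written with the symbols above, the hypothesis and the per-parameter update can be sketched as follows. This is an assumption about what the figure shows, chosen to agree with the code later in this notebook, which sums over the samples without dividing by m.

```latex
h_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}},
\qquad
\theta_j := \theta_j - \alpha \sum_{i=1}^{m} \left( h_\theta\!\left(x^{(i)}\right) - y^{(i)} \right) x_j^{(i)},
\quad j = 1, \dots, n
```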

![sigmoid.PNG](attachment:sigmoid.PNG)

Procedure:
1. Choose the hypothesis function.
2. Construct the loss function.
3. Minimize the loss with gradient descent.
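
The notebook never computes the loss explicitly, so here is a minimal sketch of the three steps on a small synthetic data set. The names `hypothesis`, `cross_entropy_loss`, `gradient_step`, the `eps` clipping constant, and the toy data are all assumptions made for illustration; the loss is taken to be the usual summed cross-entropy.

```python
import numpy as np

def hypothesis(theta, X):
    """Step 1: sigmoid of the linear score for every row of X."""
    return 1.0 / (1.0 + np.exp(-(X @ theta)))

def cross_entropy_loss(theta, X, y, eps=1e-12):
    """Step 2: summed cross-entropy loss; eps (an assumption) keeps log() finite."""
    p = np.clip(hypothesis(theta, X), eps, 1.0 - eps)
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def gradient_step(theta, X, y, alpha):
    """Step 3: one batch gradient-descent update with step size alpha."""
    return theta - alpha * (X.T @ (hypothesis(theta, X) - y))

# Tiny synthetic data set (hypothetical, not the image data used in this notebook).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
y_demo = (X_demo[:, 0] > 0).astype(float)

theta_demo = np.zeros(3)
loss_before = cross_entropy_loss(theta_demo, X_demo, y_demo)
for _ in range(100):
    theta_demo = gradient_step(theta_demo, X_demo, y_demo, alpha=0.01)
loss_after = cross_entropy_loss(theta_demo, X_demo, y_demo)
print(loss_before, loss_after)
```

Printing the loss before and after training is a cheap way to confirm that the chosen step size actually decreases it.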

Use the logistic function (also called the sigmoid function) as the hypothesis.

Two implementations of the sigmoid function

In [2]:
import numpy as np
import math
import time
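# Load pixel data and labels. np.genfromtxt is used because np.ndfromtxt
# is deprecated and has been removed from recent NumPy releases.
images_x = np.genfromtxt('./images.csv', delimiter=',')
labels_y = np.genfromtxt("./labels.csv", delimiter=',', dtype=np.int8)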
img_size = images_x.shape[1]
img_size

784

In [3]:
# Keep only the samples labelled 0 or 1
ind = np.logical_or(labels_y == 1, labels_y == 0)
images_x = images_x[ind, :]
labels_y = labels_y[ind]
# Split into training data (first 80%) and test data (remaining 20%)
num_train = int(len(labels_y) * 0.8)
x_train = images_x[0:num_train, :]
# Slice to the end; num_train:-1 would silently drop the last sample
x_test = images_x[num_train:, :]
y_train = labels_y[0:num_train]
y_test = labels_y[num_train:]

In [5]:
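# Sigmoid hypothesis without NumPy functions:
def h1(theta, x):
    total = 0.0
    for i in range(len(x)):
        total -= theta[i] * x[i]
    # total now holds -theta·x; return only after the loop has finished
    return 1 / (1 + math.exp(total))

# Sigmoid hypothesis using NumPy
def h2(theta, x):
    # The minus sign is required so that h2 matches h1: 1 / (1 + e^(-theta·x))
    return 1 / (1 + np.exp(-np.dot(theta, x)))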
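
When θ·x is large in magnitude, `np.exp` can overflow and NumPy then emits a RuntimeWarning. A minimal sketch of a numerically stable variant follows. `stable_sigmoid` is a hypothetical helper, not part of the original notebook, and it takes the score z = θ·x directly instead of `theta` and `x`.

```python
import numpy as np

def stable_sigmoid(z):
    """Sigmoid that never calls exp() on a large positive number."""
    z = np.atleast_1d(np.asarray(z, dtype=float))
    out = np.empty_like(z)
    pos = z >= 0
    # For z >= 0, exp(-z) <= 1, so this form cannot overflow.
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    # For z < 0, exp(z) < 1; use the equivalent form exp(z) / (1 + exp(z)).
    ez = np.exp(z[~pos])
    out[~pos] = ez / (1.0 + ez)
    return out

vals = stable_sigmoid(np.array([-1000.0, 0.0, 1000.0]))
print(vals)
```

Both branches compute the same function; the split only decides which algebraic form is safe for the sign of z.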

In [6]:
# Test both functions; with theta = 0 each should return sigmoid(0) = 0.5
theta = np.zeros([img_size])
x = images_x[0,:]
print(h1(theta, x), h2(theta, x))

0.5 0.5

In [5]:
# Method 1: element-wise loops
def h(theta, x):
    return 1 / (1 + np.exp(-np.dot(theta, x)))

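# One batch gradient-descent update (theta is modified in place)
def gradient_descent1(theta, x_train, y_train, step):
    """
    theta: current value of θ
    x_train: feature values of the training data
    y_train: labels of the training data
    step: step size (constant)
    """
    # Initialization
    len_train = len(y_train)
    diff_arr = np.zeros([len_train])
    # For every sample: predicted value minus true label
    for m in range(len_train):
        diff_arr[m] = h(theta, x_train[m, :]) - y_train[m]
        
    # Apply the logistic-regression update formula from the top of the notebook
    for j in range(len(theta)):
        total = 0.0  # named total to avoid shadowing the built-in sum
        for m in range(len_train):
            total += diff_arr[m] * x_train[m, j]
        theta[j] = theta[j] - step * total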

# Train for max_iter iterations and return the fitted theta
def train_elementwise(x_train, y_train, max_iter, step):
    theta = np.zeros([img_size])
    for i in range(max_iter):
        gradient_descent1(theta, x_train, y_train, step)       
    return theta

max_iter = 10
step = 0.01
start = time.time()
theta = train_elementwise(x_train, y_train, max_iter, step)
end = time.time()
print("time elapsed: {0} seconds".format(end - start))


time elapsed: 5.7393639087677 seconds


Second, improved method: formula

![battle_sigmoid.PNG](attachment:battle_sigmoid.PNG)
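
The improvement rests on the fact that the loop over samples and the array form compute the same gradient. A small sketch checking this on random data is given here; `sigmoid`, `grad_elementwise`, `grad_vectorized`, and the toy arrays are hypothetical names introduced only for this check.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_elementwise(theta, X, y):
    """Gradient via explicit loops over samples and features."""
    m, n = X.shape
    diff = np.zeros(m)
    for i in range(m):
        diff[i] = sigmoid(np.dot(theta, X[i, :])) - y[i]
    grad = np.zeros(n)
    for j in range(n):
        for i in range(m):
            grad[j] += diff[i] * X[i, j]
    return grad

def grad_vectorized(theta, X, y):
    """Same gradient, with the loops over samples replaced by array operations."""
    diff = sigmoid(X @ theta) - y
    return np.array([np.dot(diff, X[:, j]) for j in range(X.shape[1])])

rng = np.random.default_rng(1)
X_demo = rng.normal(size=(20, 5))
y_demo = rng.integers(0, 2, size=20)
theta_demo = rng.normal(size=5)

same = np.allclose(grad_elementwise(theta_demo, X_demo, y_demo),
                   grad_vectorized(theta_demo, X_demo, y_demo))
print(same)
```

Because the two gradients agree up to floating-point rounding, any speed difference comes from moving the inner loops into NumPy rather than from a change in the algorithm.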

In [11]:
# Method 2: matrix-vector form
def h_vec(theta, x):
    return 1 / (1 + np.exp(-np.matmul(x, theta)))

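# Gradient descent for the improved second method (theta is modified in place)
def gradient_descent2(theta, x_train, y_train, step):
    """
    theta: current value of θ
    x_train: feature values of the training data
    y_train: labels of the training data
    step: step size (constant)
    """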
    diff_z = h_vec(theta, x_train) - y_train
    for j in range(len(theta)):
        theta[j] = theta[j] - step * np.dot(diff_z, x_train[:, j])
        
# Training
def train_battle(x_train, y_train, max_iter, step):
    """
    x_train: feature values of the training data
    y_train: labels of the training data
    max_iter: maximum number of iterations
    step: step size (constant)
    """
    theta = np.zeros([img_size])
    for i in range(max_iter):
        gradient_descent2(theta, x_train, y_train, step)       
    return theta

max_iter = 10
step = 0.01
start = time.time()
theta = train_battle(x_train, y_train, max_iter, step)
end = time.time()
print("time elapsed: {0} seconds".format(end - start))
# Clearly faster than the first method: about 0.17 s here versus about 5.7 s above.

time elapsed: 0.17290115356445312 seconds
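
The test split prepared earlier is not used in the cells above. A minimal sketch of how a fitted `theta` could be scored follows; `predict`, `accuracy`, the 0.5 threshold, and the toy arrays are assumptions for illustration, not part of the original notebook.

```python
import numpy as np

def predict(theta, X, threshold=0.5):
    """Label a row 1 when the predicted probability reaches the threshold."""
    prob = 1.0 / (1.0 + np.exp(-(X @ theta)))
    return (prob >= threshold).astype(np.int8)

def accuracy(theta, X, y):
    """Fraction of rows whose predicted label equals the true label."""
    return float(np.mean(predict(theta, X) == y))

# Hypothetical toy check: the label is 1 exactly when the first feature is positive.
X_demo = np.array([[2.0, 1.0], [-1.0, 0.5], [3.0, -2.0], [-0.5, -1.0]])
y_demo = np.array([1, 0, 1, 0], dtype=np.int8)
theta_demo = np.array([1.0, 0.0])
acc_demo = accuracy(theta_demo, X_demo, y_demo)
print(acc_demo)
```

With the variables defined in this notebook, the corresponding call would be `accuracy(theta, x_test, y_test)`.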



In [None]:
# Method 3: compute all the updates at once
def h_vec(theta, x):
    return 1 / (1 + np.exp(-np.matmul(x, theta)))

def GD(theta, x_train, y_train, step):
    theta = 