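# Linear Regression
With the basic PyTorch skills covered, this article starts on deep learning models. Every model is built from the same basic elements listed below; once you can identify them, understanding a model becomes much easier.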

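+ Data: the inputs and the known outputs (in unsupervised learning there may be no known outputs)
+ Model: a family of functions in the mathematical sense, indexed by its parameters. It maps an input to an output, which is usually a concrete value such as a probability or a number
+ Parameters: the adjustable quantities of that function; there may be a handful of them or a great many
+ Loss function: for any model we can find a way to measure how good it is, and that measure is the loss function
+ Optimization method: given a fixed loss function, an optimization algorithm finds the model parameters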


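Each of the concepts above has a counterpart in PyTorch.  
From here on, y denotes the output, x1, x2, ... denote the individual input variables, and X denotes the input vector, i.e. one sample.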
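
As a rough illustration of that correspondence, the sketch below pairs each of the five elements with one common PyTorch object. The toy tensors, the learning rate of 0.1, and the variable names are assumptions made for this example only; they are not taken from the rest of the article.

```python
import torch
from torch import nn
from torch.utils.data import TensorDataset

torch.manual_seed(0)

# Data: inputs X and known outputs y (toy values, made up for illustration)
X = torch.rand(8, 2)
y = torch.rand(8, 1)
dataset = TensorDataset(X, y)

# Model: a family of functions; nn.Linear(2, 1) computes X @ W.T + b
model = nn.Linear(2, 1)

# Parameters: the weight (1x2) and the bias (1), 3 numbers in total
num_params = sum(p.numel() for p in model.parameters())

# Loss function: mean squared error
loss_fn = nn.MSELoss()

# Optimization method: stochastic gradient descent
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# One optimization step that ties the five pieces together
loss_before = loss_fn(model(X), y)
optimizer.zero_grad()
loss_before.backward()
optimizer.step()
loss_after = loss_fn(model(X), y)
print(num_params, loss_before.item(), loss_after.item())
```

The sections below build the same pieces by hand first, which makes it easier to see what these objects do internally.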

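## The linear regression model
How does linear regression map onto the basic elements above?
+ Data: depends on the problem. To predict house prices in some area, for example, you would collect many features per house (floor area, age of the building, number of nearby shops and supermarkets, number of bedrooms, and so on) together with its current price. Together these are the data
+ Model: linear regression fixes the functional form as p = b + a1 \* x1 + a2 \* x2 + ... + am \* xm, where b is the bias
+ Parameters: b, a1, a2, ..., am
+ Loss function: the mean squared error, z = sum((y - p) \* (y - p)) / n, where the sum runs over all samples and n is the number of samples
+ Optimization method: gradient descent, among others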

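When the model and the loss function are simple enough, the solution of the error minimization problem above can be written down directly as a formula. Such a solution is called an analytical solution. Linear regression with squared error, as used in this section, falls into that category. Most deep learning models, however, have no analytical solution; the loss can only be reduced as far as possible by updating the model parameters over a finite number of iterations of an optimization algorithm. Such a solution is called a numerical solution.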
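
To make the analytical solution concrete, here is a minimal sketch that solves the normal equations for a toy dataset. It assumes the same weights [2, -3.4], bias 4.2, and noise level 0.1 that the data preparation section uses; appending a column of ones to estimate the bias is a standard trick, and the names `X_aug` and `theta` are chosen for this example.

```python
import torch

torch.manual_seed(0)

n = 1000
X = torch.rand(n, 2, dtype=torch.float64)
true_w = torch.tensor([[2.0], [-3.4]], dtype=torch.float64)
true_b = 4.2
y = X @ true_w + true_b + 0.1 * torch.randn(n, 1, dtype=torch.float64)

# Append a column of ones so the bias is estimated as one more weight
X_aug = torch.cat([X, torch.ones(n, 1, dtype=torch.float64)], dim=1)

# Normal equations: (X^T X) theta = X^T y
theta = torch.linalg.solve(X_aug.T @ X_aug, X_aug.T @ y)
w_hat, b_hat = theta[:2], theta[2]
print(w_hat.flatten(), b_hat)
```

With 1000 samples the estimates should land close to the true values, which gives a useful reference point for the iterative training later in the article.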

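## Optimization method
Among the optimization algorithms that compute numerical solutions, mini-batch stochastic gradient descent is widely used in deep learning. The algorithm is simple. First choose initial values for the model parameters, for example at random. Then update the parameters over many iterations, so that each iteration is likely to lower the value of the loss function. In each iteration, first draw uniformly at random a mini-batch $\mathcal{B}$ made up of a fixed number of training samples, then compute the derivative (gradient) of the average loss over the samples in the mini-batch with respect to the model parameters, and finally subtract from the parameters the product of this gradient and a preset positive number (the learning rate).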
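
Written as a formula, one iteration of the procedure described above looks as follows. The symbols are notation introduced here rather than taken from the article: $\eta$ is the preset positive step size, $|\mathcal{B}|$ is the number of samples in the mini-batch, and $\ell^{(i)}(\mathbf{w}, b) = (y^{(i)} - p^{(i)})^2$ is the squared error of sample $i$.

$$
\mathbf{w} \leftarrow \mathbf{w} - \frac{\eta}{|\mathcal{B}|} \sum_{i \in \mathcal{B}} \nabla_{\mathbf{w}}\, \ell^{(i)}(\mathbf{w}, b),
\qquad
b \leftarrow b - \frac{\eta}{|\mathcal{B}|} \sum_{i \in \mathcal{B}} \frac{\partial\, \ell^{(i)}(\mathbf{w}, b)}{\partial b}
$$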

Next we build the whole linear regression model with PyTorch.


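## 1. Data preparation
We start by simulating some test data:
+ two input variables with linear weights [2, -3.4]
+ bias = 4.2
+ noise drawn from a normal distribution with mean 0 and standard deviation 0.1 (the second argument of `np.random.normal` is the standard deviation, not the variance)
+ 1000 samples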

In [90]:
import torch
import numpy as np
var_nums = 2
sample_nums = 1000
true_bias = 4.2
X = torch.rand(sample_nums, var_nums)  # shape=[1000, 2]
print('X.shape=', X.shape)
true_weight = torch.tensor([2, -3.4]).view(2, 1)  # shape=[2, 1]
print('true_weights.shape', true_weight.shape)
error = torch.tensor(np.random.normal(0.0, 0.1, sample_nums))
print('error.shape', error.shape)
Y = torch.mm(X, true_weight).view(sample_nums) + true_bias + error
print('Y.shape', Y.shape)
print(Y)

X.shape= torch.Size([1000, 2])
true_weights.shape torch.Size([2, 1])
error.shape torch.Size([1000])
Y.shape torch.Size([1000])
tensor([3.8446, 4.0907, 4.1172, 3.3592, 5.2081, 2.7656, 5.2604, 2.8241, 2.4092,
        2.1524, 5.2988, 1.7538, 4.8105, 1.9225, 3.7813, 4.3766, 4.1147, 3.0951,
        4.5631, 3.5313, 4.3096, 2.9987, 2.5910, 1.9592, 3.1709, 3.9261, 4.9781,
        4.0535, 2.6952, 6.1048, 3.6954, 3.6510, 3.0196, 5.7960, 4.4282, 2.5839,
        3.0691, 4.9949, 2.3908, 3.5015, 3.6174, 5.2803, 3.9719, 5.3797, 4.2355,
        4.4568, 2.3278, 4.5102, 2.8263, 3.0402, 3.8340, 4.2123, 2.1433, 5.4706,
        3.0473, 3.7451, 3.7015, 4.2973, 4.2556, 2.6841, 2.4219, 2.6268, 3.1156,
        3.2309, 4.6750, 3.0694, 4.9669, 2.5136, 1.6487, 3.6050, 2.9262, 4.2725,
        4.7486, 5.4740, 3.4866, 3.1768, 2.2584, 2.8020, 1.7936, 5.4785, 4.8686,
        3.1204, 2.2605, 3.3992, 3.0475, 3.2269, 3.1219, 3.0383, 5.0227, 3.4884,
        4.2764, 2.6713, 3.5684, 3.4632, 2.2490, 1.4212, 4.8428, 4.5983, 5

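## 2. Define the basic helper functions
+ Model function: computes the output for the given samples and weights
+ Parameter initialization function: initializes all required parameters
+ Loss function: measures the error
+ Batch sampling function: returns the samples in mini-batches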

In [85]:
# Define the regression model
def linear_reg_model(weight, bias, input, batch_size):
    return torch.mm(input, weight).view(batch_size) + bias

def params_init():
    p_weights = torch.randn(2, 1, dtype=torch.float32, requires_grad=True)
    p_bias = torch.zeros(1, dtype=torch.float32, requires_grad=True)
    return p_weights, p_bias

def loss_func(true_Y, hat_Y):
    error = true_Y - hat_Y
    return error ** 2

def sample_batchs(X, true_Y, sample_nums, batch_size):
    res = []
    inds = list(range(0, sample_nums))
    np.random.shuffle(inds)
    cur_ind = 0
    while cur_ind + batch_size < sample_nums:
        keep_inds = inds[cur_ind:cur_ind + batch_size]
        res.append((torch.index_select(X, 0, torch.tensor(keep_inds, dtype=torch.int64)), 
            torch.index_select(true_Y, 0, torch.tensor(keep_inds, dtype=torch.int64))))
        cur_ind += batch_size
    keep_inds = inds[cur_ind:cur_ind + batch_size]
    if keep_inds:
        res.append((torch.index_select(X, 0, torch.tensor(keep_inds, dtype=torch.int64)), 
            torch.index_select(true_Y, 0, torch.tensor(keep_inds, dtype=torch.int64))))
    return res

In [82]:
res = sample_batchs(X, Y, 1000, 100)
print(res[0])

(tensor([[0.6280, 0.7622],
        [0.4410, 0.2582],
        [0.0578, 0.3930],
        [0.2031, 0.2891],
        [0.8529, 0.1454],
        [0.7091, 0.5959],
        [0.1866, 0.2087],
        [0.6091, 0.6290],
        [0.6470, 0.8776],
        [0.6175, 0.3982],
        [0.5556, 0.0076],
        [0.4439, 0.1246],
        [0.8183, 0.7475],
        [0.8767, 0.8118],
        [0.1617, 0.8198],
        [0.6508, 0.3082],
        [0.0043, 0.0019],
        [0.9388, 0.0294],
        [0.4366, 0.4452],
        [0.3214, 0.3705],
        [0.7145, 0.3275],
        [0.4385, 0.5287],
        [0.0708, 0.3658],
        [0.9899, 0.2482],
        [0.4600, 0.8385],
        [0.4917, 0.0014],
        [0.4817, 0.9047],
        [0.6969, 0.1907],
        [0.4987, 0.2497],
        [0.1685, 0.7173],
        [0.0289, 0.1620],
        [0.7326, 0.0606],
        [0.0928, 0.4620],
        [0.4855, 0.3306],
        [0.8491, 0.5779],
        [0.3495, 0.8891],
        [0.8898, 0.7307],
        [0.6151, 0.9724],
        [0.
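
The batch sampling step can also be written more compactly. The sketch below is an alternative, not the author's function: it assumes `torch.randperm` for shuffling and plain index tensors for selection, and the name `iter_minibatches` and the small demo tensors are made up for illustration.

```python
import torch

def iter_minibatches(X, Y, batch_size):
    """Yield shuffled (X_batch, Y_batch) pairs that cover every sample once."""
    n = X.shape[0]
    perm = torch.randperm(n)  # random permutation of 0..n-1
    for start in range(0, n, batch_size):
        idx = perm[start:start + batch_size]  # the last batch may be smaller
        yield X[idx], Y[idx]

X_demo = torch.arange(20, dtype=torch.float32).view(10, 2)
Y_demo = torch.arange(10, dtype=torch.float32)
batches = list(iter_minibatches(X_demo, Y_demo, batch_size=4))
print([b[1].shape[0] for b in batches])  # batch sizes
```

Because it is a generator, batches are produced one at a time instead of being collected into a list up front, which matters little for 1000 samples but helps with larger datasets.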

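## 3. Train the model
We use mini-batch gradient descent, which needs the following settings:
+ epoch_nums: number of passes over the whole dataset
+ batch_nums: number of samples in each mini-batch
+ X: the input data X
+ true_Y: the true Y of each sample
+ step_ratio: the step size (learning rate)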

In [88]:
epoch_nums = 20
batch_nums = 5
step_ratio = 0.03
X = X
true_Y = Y
p_weight, p_bias = params_init()
print(p_weight)
print(p_bias)

for epoch in range(0, epoch_nums):
    print("epoch={}".format(epoch+1))
    batchs = sample_batchs(X, true_Y, sample_nums, batch_nums)
    for batch in batchs:
        b_X, b_true_Y = batch
        batch_size = b_true_Y.shape[0]
        b_hat_Y = linear_reg_model(p_weight, p_bias, b_X, batch_size)
        loss = loss_func(b_true_Y, b_hat_Y).sum()

        loss.backward()
        p_weight.data -= step_ratio*p_weight.grad/batch_size
        p_bias.data -= step_ratio*p_bias.grad/batch_size

        p_weight.grad.data.zero_()
        p_bias.grad.data.zero_()
    
    with torch.no_grad():
        hat_Y = linear_reg_model(p_weight, p_bias, X, sample_nums)
        loss = loss_func(true_Y, hat_Y)
        print("epoch={}, loss={}".format(epoch+1, torch.sqrt(loss.mean())))
        print("true_weight={}, p_weight={}".format(true_weight, p_weight))
        print("true_bias={}, p_bias={}".format(true_bias, p_bias))

tensor([[1.8841],
        [0.9425]], requires_grad=True)
tensor([0.], requires_grad=True)
epoch=1
epoch=1, loss=0.6970049738883972
true_weight=tensor([[ 2.0000],
        [-3.4000]]), p_weight=tensor([[ 2.5768],
        [-1.1153]], requires_grad=True)
true_bias=4.2, p_bias=tensor([2.6210], requires_grad=True)
epoch=2
epoch=2, loss=0.3500586450099945
true_weight=tensor([[ 2.0000],
        [-3.4000]]), p_weight=tensor([[ 2.4202],
        [-2.3266]], requires_grad=True)
true_bias=4.2, p_bias=tensor([3.3870], requires_grad=True)
epoch=3
epoch=3, loss=0.19472648203372955
true_weight=tensor([[ 2.0000],
        [-3.4000]]), p_weight=tensor([[ 2.2607],
        [-2.8783]], requires_grad=True)
true_bias=4.2, p_bias=tensor([3.7845], requires_grad=True)
epoch=4
epoch=4, loss=0.13141153752803802
true_weight=tensor([[ 2.0000],
        [-3.4000]]), p_weight=tensor([[ 2.1431],
        [-3.1482]], requires_grad=True)
true_bias=4.2, p_bias=tensor([3.9715], requires_grad=True)
epoch=5
epoch=5, loss=0.1081
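
The loop above updates the parameters through `.data`. A commonly recommended alternative is to perform the in-place update inside `torch.no_grad()`, so that autograd does not record it. The following is a minimal sketch of that idiom under the same update rule (step size times gradient divided by batch size); the helper name `sgd_step` and the toy tensors are assumptions for this example.

```python
import torch

def sgd_step(params, lr, batch_size):
    """Apply one in-place SGD update, then reset the gradients."""
    with torch.no_grad():  # the update itself must not be tracked by autograd
        for p in params:
            p -= lr * p.grad / batch_size
            p.grad.zero_()

torch.manual_seed(0)
w = torch.randn(2, 1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
x = torch.rand(5, 2)
y = torch.rand(5)

# Summed squared error over the batch, as in the training loop above
loss = ((torch.mm(x, w).view(5) + b - y) ** 2).sum()
loss.backward()

expected_w = w.detach().clone() - 0.03 * w.grad / 5
sgd_step([w, b], lr=0.03, batch_size=5)
print(w, b)
```

After the call, `w` and `b` still require gradients and their `.grad` buffers are zero, so the next batch can be processed in the same way.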

## 4. A higher-level implementation of linear regression
This version is implemented with PyTorch Lightning: https://github.com/PyTorchLightning/pytorch-lightning

In [1]:
# import the necessary libraries
import os
import torch
from torch.nn import functional as F
from torch.utils.data import DataLoader
from torch.utils.data import TensorDataset
import pytorch_lightning as pl
import numpy as np

In [37]:
class LineRegModel(pl.LightningModule):

    def __init__(self):
        super(LineRegModel, self).__init__()
        self.true_weight = torch.tensor([2, -3.4], dtype=torch.float32).view(2, 1)
        self.true_bias = torch.tensor(4.2, dtype=torch.float32)
        # Define the model structure
        self.l1 = torch.nn.Linear(2, 1)

    def forward(self, x):
        # Required: define the forward computation of the model
        return self.l1(x)

    def training_step(self, batch, batch_nb):
        # Required: define the training step
        x, y = batch
        y_hat = self(x)
        loss = F.mse_loss(y_hat, y)
        tensorboard_logs = {'train_loss': loss}
        return {'loss': loss, 'log': tensorboard_logs}

    def validation_step(self, batch, batch_nb):
        # Optional: define the validation step
        x, y = batch
        y_hat = self(x)
        
        return {'val_loss': F.mse_loss(y_hat, y)}

    def validation_epoch_end(self, outputs):
        # Optional: aggregate the validation results at the end of each epoch
        avg_loss = torch.stack([x['val_loss'] for x in outputs]).mean()
        tensorboard_logs = {'val_loss': avg_loss}
        return {'val_loss': avg_loss, 'log': tensorboard_logs}

    def test_step(self, batch, batch_nb):
        # Optional: define the test step
        x, y = batch
        y_hat = self(x)
        return {'test_loss': F.mse_loss(y_hat, y)}

    def test_epoch_end(self, outputs):
        # Optional: aggregate the test results at the end of the test run
        avg_loss = torch.stack([x['test_loss'] for x in outputs]).mean()
        logs = {'test_loss': avg_loss}
        return {'test_loss': avg_loss, 'log': logs, 'progress_bar': logs}

    def configure_optimizers(self):
        # Required: define the optimizer
        # can return multiple optimizers and learning_rate schedulers
        # (LBFGS it is automatically supported, no need for closure function)
        return torch.optim.SGD(self.parameters(), lr=0.04)

    def gen_data_loader(self, shuffle, sample_nums, batch_size):
        X = torch.rand(sample_nums, 2, dtype=torch.float32) 
        error = torch.tensor(np.random.normal(0.0, 0.1, sample_nums), dtype=torch.float32).view(sample_nums, 1)
        Y = torch.mm(X, self.true_weight) + self.true_bias + error
        print("True Y Shape = {}".format(Y.shape))
        # First wrap the tensors in a Dataset that torch understands
        torch_dataset = TensorDataset(X, Y)

        # Then hand the dataset to a DataLoader
        loader = DataLoader(
            dataset=torch_dataset,      # torch TensorDataset format
            batch_size=batch_size,      # mini batch size
            shuffle=shuffle,            # whether to shuffle the data (recommended for training)
            num_workers=4,              # load the data with 4 worker subprocesses
        )
        return loader

    def train_dataloader(self):
        # Required: provide the training data
        return self.gen_data_loader(True, 1000, 20)

    def val_dataloader(self):
        # Optional: provide the validation data
        return self.gen_data_loader(False, 1000, 1000)

    def test_dataloader(self):
        # Optional: provide the test data
        return self.gen_data_loader(False, 1000, 1000)
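
One detail in `gen_data_loader` deserves attention: the noise is reshaped to `(sample_nums, 1)` so that `Y` has the same shape as the output of `torch.nn.Linear(2, 1)`. The sketch below, which uses three made-up values, illustrates what could go wrong otherwise: subtracting a `(3,)` tensor from a `(3, 1)` tensor broadcasts to a `(3, 3)` grid, and the mean of the squared entries is then no longer the intended loss.

```python
import torch
from torch.nn import functional as F

y_hat = torch.tensor([[1.0], [2.0], [3.0]])  # shape (3, 1), like the output of nn.Linear(2, 1)
y_flat = torch.tensor([1.0, 2.0, 3.0])       # shape (3,)
y_col = y_flat.view(3, 1)                    # shape (3, 1)

# Mismatched shapes broadcast to a 3x3 grid of pairwise differences
diff_bad = y_hat - y_flat
# Matching shapes give the intended element-wise differences
diff_ok = y_hat - y_col

print(diff_bad.shape, diff_ok.shape)
print((diff_bad ** 2).mean().item(), F.mse_loss(y_hat, y_col).item())
```

Here the predictions equal the targets, so the correct loss is zero, while the broadcast version reports a nonzero value.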

In [38]:
lr_model = LineRegModel()

# most basic trainer, uses good defaults (1 gpu)
trainer = pl.Trainer(max_epochs=20, num_sanity_val_steps=0)
trainer.fit(lr_model)

GPU available: False, used: False
TPU available: False, using: 0 TPU cores

  | Name | Type   | Params
--------------------------------
0 | l1   | Linear | 3     
True Y Shape = torch.Size([1000, 1])
True Y Shape = torch.Size([1000, 1])


1

In [39]:
# Print all parameters
for i in lr_model.parameters():
    print(i)

true_weight = torch.tensor([2, -3.4], dtype=torch.float32).view(2, 1)
true_bias = torch.tensor(4.2, dtype=torch.float32)

Parameter containing:
tensor([[ 2.0270, -3.3655]], requires_grad=True)
Parameter containing:
tensor([4.1701], requires_grad=True)


In [40]:
trainer.test()

True Y Shape = torch.Size([1000, 1])



--------------------------------------------------------------------------------
TEST RESULTS
{'test_loss': tensor(0.0112)}
--------------------------------------------------------------------------------



{'test_loss': 0.0112014040350914}
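
A quick sanity check on this number: the simulated noise has standard deviation $\sigma = 0.1$, so even a model with exactly the true weights and bias cannot be expected to reach a mean squared error below the noise variance. Writing $p^{*}$ for the prediction made with the true parameters and $\varepsilon$ for the noise term (notation introduced here):

$$
\mathbb{E}\big[(y - p^{*})^{2}\big] = \mathbb{E}[\varepsilon^{2}] = \sigma^{2} = 0.1^{2} = 0.01
$$

The reported test loss of about 0.0112 is close to this floor, which suggests that the learned parameters are near the true ones.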