# Linear Fitting

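Linear fitting is one of the most basic machine learning models. This article uses PyTorch's neural network module `torch.nn` to build a simple linear fitting model and train it.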

In [1]:
import os
import sys

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import torch
print(f'Import PyTorch V{torch.__version__}')

Import PyTorch V1.12.1


In [2]:
dev = torch.device(type='cuda') if torch.cuda.is_available() else torch.device(type='cpu')
print(f'Use device {dev}')

Use device cpu


## Model

The linear fitting model is:

$$y = x A^T + b$$

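where $x$ is the input data, $y$ is the output data, and $A$ and $b$ are the model parameters to be fitted.
Let $d_i$ and $d_o$ denote the input and output dimensions; then $A$ has shape $d_o \times d_i$ and $b$ has shape $d_o$.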
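
As a quick check of the shapes involved, the following minimal sketch evaluates $y = x A^T + b$ by hand and compares it with `torch.nn.functional.linear`, which computes the same expression. The batch size of 3 and the random values are assumptions made only for illustration.

```python
import torch

torch.manual_seed(0)  # fixed seed so the sketch is reproducible

d_i, d_o = 4, 1   # input and output dimensions
batch = 3         # hypothetical batch size, for illustration only

x = torch.rand(batch, d_i)  # each row is one input sample
A = torch.rand(d_o, d_i)    # weight matrix with shape (d_o, d_i)
b = torch.rand(d_o)         # bias vector with shape (d_o,)

# Manual evaluation of y = x A^T + b; b is broadcast across the batch rows.
y_manual = x @ A.T + b

# torch.nn.functional.linear evaluates the same expression.
y_functional = torch.nn.functional.linear(x, A, b)

print(y_manual.shape)
print(torch.allclose(y_manual, y_functional))
```

The output has one row per sample and $d_o$ columns, which is why the weight matrix is stored with shape $d_o \times d_i$ and transposed in the formula.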

Here we assume the input and output dimensions are 4 and 1, respectively. The true parameters are defined below.

In [3]:
d_i, d_o = 4, 1
A_hat = torch.tensor([
    [1.2, 3.0, 7.7, -0.9],
])
b_hat = torch.tensor([0.7])

A_hat, b_hat

(tensor([[ 1.2000,  3.0000,  7.7000, -0.9000]]), tensor([0.7000]))

Create the initial model.

In [4]:
layer = torch.nn.Linear(d_i, d_o, device=dev)
model = torch.nn.Sequential(layer)
model

Sequential(
  (0): Linear(in_features=4, out_features=1, bias=True)
)

In [5]:
layer.weight, layer.bias

(Parameter containing:
 tensor([[ 0.3003, -0.0493,  0.0921,  0.4552]], requires_grad=True),
 Parameter containing:
 tensor([-0.2184], requires_grad=True))

## Data Preparation

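Randomly generate 1000 samples: each feature is drawn uniformly from $[0, 10)$, and the labels are computed from the true parameters plus Gaussian noise with standard deviation 0.01.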

In [6]:
num_samples = 1000
X = torch.rand(num_samples, d_i) * 10.0
y = torch.matmul(X, A_hat.T) + b_hat + torch.normal(0.0, 0.01, (num_samples, d_o))
X = X.to(device=dev)
y = y.to(device=dev)
X.shape, y.shape

(torch.Size([1000, 4]), torch.Size([1000, 1]))

Build a dataset and split it into mini-batches of 10 samples each.

In [7]:
batch_size = 10

ds = torch.utils.data.TensorDataset(X, y)
data_iter = torch.utils.data.DataLoader(ds, batch_size, shuffle=True)

next(iter(data_iter))

[tensor([[8.0675, 8.3224, 8.1355, 1.0108],
         [1.5369, 0.0459, 2.2693, 0.9940],
         [0.8437, 2.0826, 0.6098, 1.9557],
         [5.3063, 8.3553, 5.9765, 5.4579],
         [4.7617, 2.1485, 5.0449, 7.2023],
         [7.5006, 6.0990, 3.2693, 0.0385],
         [9.8372, 7.1147, 0.9968, 3.5329],
         [5.0643, 2.5015, 7.1231, 9.6633],
         [1.6878, 0.4135, 1.9059, 4.4767],
         [3.3105, 4.7690, 4.5690, 0.2373]]),
 tensor([[97.0737],
         [19.2454],
         [10.8899],
         [73.2308],
         [45.2150],
         [53.1318],
         [38.3513],
         [60.4241],
         [14.6182],
         [53.9354]])]
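
With 1000 samples and a batch size of 10, each epoch consists of exactly 100 mini-batches. The sketch below checks this and also shows what happens when the sample count is not a multiple of the batch size. The count of 1005 is a hypothetical value chosen only for illustration, and the random tensors stand in for the real data.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical data: 1005 samples, deliberately not a multiple of 10.
X = torch.rand(1005, 4)
y = torch.rand(1005, 1)

# Using only the first 1000 samples gives exactly 100 full batches.
loader_even = DataLoader(TensorDataset(X[:1000], y[:1000]), batch_size=10, shuffle=True)
n_even = len(loader_even)

# With 1005 samples the final batch is smaller unless drop_last=True.
ds = TensorDataset(X, y)
loader_keep = DataLoader(ds, batch_size=10, shuffle=True)
loader_drop = DataLoader(ds, batch_size=10, shuffle=True, drop_last=True)
n_keep = len(loader_keep)
n_drop = len(loader_drop)

# Collect the size of every batch produced in one pass over the data.
batch_sizes = [features.shape[0] for features, _ in loader_keep]

print(n_even, n_keep, n_drop, batch_sizes[-1])
```

Because `shuffle=True` reshuffles the samples on every pass, the batches differ from epoch to epoch even though their sizes stay the same.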

## Training

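First, prepare the loss function. Here we use the mean squared error (MSE); minimizing it is equivalent to solving a least-squares problem.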

In [8]:
loss = torch.nn.MSELoss()
loss

MSELoss()
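
To make the definition concrete, the sketch below compares `torch.nn.MSELoss` (with its default mean reduction) against a hand-written mean of squared differences. The prediction and target values are made up purely for illustration.

```python
import torch

# Hypothetical predictions and targets, chosen so the arithmetic is easy to follow.
pred = torch.tensor([[1.0], [2.0], [4.0]])
target = torch.tensor([[1.0], [3.0], [2.0]])

# Built-in loss; the default reduction averages over all elements.
mse_builtin = torch.nn.MSELoss()(pred, target)

# Manual version: the errors are 0, -1, 2, so the squares are 0, 1, 4 and the mean is 5/3.
mse_manual = ((pred - target) ** 2).mean()

print(mse_builtin.item(), mse_manual.item())
```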

Next, prepare the optimizer: stochastic gradient descent (SGD) with a learning rate of 0.005.

In [9]:
trainer = torch.optim.SGD(model.parameters(), lr=0.005)
trainer

SGD (
Parameter Group 0
    dampening: 0
    foreach: None
    lr: 0.005
    maximize: False
    momentum: 0
    nesterov: False
    weight_decay: 0
)
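
With momentum, dampening, and weight decay all at zero, as in the settings printed above, one SGD step simply subtracts the learning rate times the gradient from each parameter. The following sketch verifies this on a throwaway layer. The random data and the names `w_before` and `w_expected` are assumptions introduced here for illustration.

```python
import torch

torch.manual_seed(0)  # fixed seed so the sketch is reproducible
lr = 0.005

layer = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(layer.parameters(), lr=lr)

# Hypothetical mini-batch of 10 samples.
X = torch.rand(10, 4) * 10.0
y = torch.rand(10, 1)

w_before = layer.weight.detach().clone()  # snapshot of the weights before the update

loss = torch.nn.MSELoss()(layer(X), y)
optimizer.zero_grad()                     # clear any stale gradients
loss.backward()                           # compute d(loss)/d(parameters)
grad = layer.weight.grad.detach().clone()
optimizer.step()                          # apply the update in place

# Plain SGD update rule: w <- w - lr * grad
w_expected = w_before - lr * grad

print(torch.allclose(layer.weight.detach(), w_expected))
```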

In [10]:
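max_epochs = 1000
threshold = 2e-4  # stop once the full-dataset loss falls below this value
for epoch in range(max_epochs):
    for features, labels in data_iter:
        l = loss(model(features), labels)  # mini-batch loss
        trainer.zero_grad()                # clear gradients from the previous step
        l.backward()                       # backpropagate
        trainer.step()                     # update the parameters
    with torch.no_grad():                  # evaluation only, no gradients needed
        l = loss(model(X), y)              # loss over the whole dataset
    print(f'epoch {epoch + 1}, loss {l:f}')
    if l < threshold:
        break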

epoch 1, loss 0.014124
epoch 2, loss 0.011138
epoch 3, loss 0.009312
epoch 4, loss 0.012243
epoch 5, loss 0.007614
epoch 6, loss 0.008037
epoch 7, loss 0.005518
epoch 8, loss 0.004594
epoch 9, loss 0.004243
epoch 10, loss 0.003974
epoch 11, loss 0.003909
epoch 12, loss 0.002859
epoch 13, loss 0.002313
epoch 14, loss 0.002370
epoch 15, loss 0.001821
epoch 16, loss 0.001612
epoch 17, loss 0.001380
epoch 18, loss 0.001469
epoch 19, loss 0.001048
epoch 20, loss 0.000911
epoch 21, loss 0.000926
epoch 22, loss 0.000712
epoch 23, loss 0.000686
epoch 24, loss 0.000561
epoch 25, loss 0.000516
epoch 26, loss 0.000472
epoch 27, loss 0.000483
epoch 28, loss 0.000389
epoch 29, loss 0.000388
epoch 30, loss 0.000311
epoch 31, loss 0.000303
epoch 32, loss 0.000255
epoch 33, loss 0.000233
epoch 34, loss 0.000244
epoch 35, loss 0.000204
epoch 36, loss 0.000190


In [11]:
layer.weight, layer.bias

(Parameter containing:
 tensor([[ 1.2019,  3.0014,  7.7013, -0.8983]], requires_grad=True),
 Parameter containing:
 tensor([0.6664], requires_grad=True))

In [12]:
A_hat, b_hat

(tensor([[ 1.2000,  3.0000,  7.7000, -0.9000]]), tensor([0.7000]))
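
Since minimizing MSE is a least-squares problem, the SGD result can be cross-checked against a direct solve. The sketch below is an illustrative alternative, not the method used in this article: it regenerates data with the same recipe, appends a column of ones so the bias becomes one more coefficient, and calls `torch.linalg.lstsq`. The names `A_true`, `b_true`, `X_aug`, `A_ls`, and `b_ls` are introduced here, and `float64` is used as an assumption to keep the solve numerically clean.

```python
import torch

torch.manual_seed(0)  # fixed seed so the sketch is reproducible

num_samples, d_i, d_o = 1000, 4, 1
A_true = torch.tensor([[1.2, 3.0, 7.7, -0.9]], dtype=torch.float64)
b_true = torch.tensor([0.7], dtype=torch.float64)

# Same recipe as the data preparation step: uniform features, small Gaussian noise.
X = torch.rand(num_samples, d_i, dtype=torch.float64) * 10.0
noise = torch.randn(num_samples, d_o, dtype=torch.float64) * 0.01
y = X @ A_true.T + b_true + noise

# Append a column of ones so the bias is estimated as one more coefficient.
ones = torch.ones(num_samples, 1, dtype=torch.float64)
X_aug = torch.cat([X, ones], dim=1)

# Solve min ||X_aug @ w - y||^2 directly; the solution has shape (d_i + 1, d_o).
solution = torch.linalg.lstsq(X_aug, y).solution
A_ls = solution[:d_i].T  # shape (d_o, d_i), comparable to layer.weight
b_ls = solution[d_i]     # shape (d_o,), comparable to layer.bias

print(A_ls, b_ls)
```

In the training run above, the learned weights match the true values to about two decimal places while the bias (0.6664) is still some distance from 0.7. A direct solve like this one offers a reference point for how close the fit can get on data of this kind.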