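# Linear Fitting

Linear fitting is one of the most basic machine learning models. This article uses PyTorch's neural network module `torch.nn` to build a simple linear fitting model and train it.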

In [1]:
import os
import sys

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

import torch
print(f'Import PyTorch V{torch.__version__}')

Import PyTorch V1.12.1


In [2]:
dev = torch.device(type='cuda') if torch.cuda.is_available() else torch.device(type='cpu')
print(f'Use device {dev}')

Use device cuda


## Model

The linear fitting model is:

$$y = x A^T + b$$

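where $x$ is the input data, $y$ is the output data, and $A$ and $b$ are the model parameters to be fitted.
Let $d_i$ and $d_o$ denote the dimensions of the input and output data; then $A$ has shape $d_o \times d_i$ and $b$ has length $d_o$.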
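
To make the shapes concrete, here is a minimal sketch that compares the explicit formula $y = x A^T + b$ with the output of `torch.nn.Linear`. The sample count `n` and the names `y_manual` and `y_layer` are illustrative assumptions, not part of the original notebook.

```python
import torch

torch.manual_seed(0)  # fixed seed so the sketch is reproducible

d_i, d_o = 4, 1  # input / output dimensions used in this article
n = 3            # hypothetical number of samples for this sketch

layer = torch.nn.Linear(d_i, d_o)  # weight plays the role of A, bias of b
x = torch.rand(n, d_i)

# Explicit form of the model: y = x A^T + b
y_manual = x @ layer.weight.T + layer.bias
y_layer = layer(x)

print(tuple(layer.weight.shape))  # shape of A: (d_o, d_i)
print(tuple(layer.bias.shape))    # shape of b: (d_o,)
print(torch.allclose(y_manual, y_layer, atol=1e-6))
```

The weight of the layer is stored with shape $d_o \times d_i$, which is why the transpose appears in the formula.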

Here we assume the input and output dimensions are 4 and 1, respectively. The true solution is shown below.

In [3]:
d_i, d_o = 4, 1
A_hat = torch.tensor([
    [1.2, 3.0, 7.7, -0.9],
])
b_hat = torch.tensor([0.7])

A_hat, b_hat

(tensor([[ 1.2000,  3.0000,  7.7000, -0.9000]]), tensor([0.7000]))

Create the initial model.

In [4]:
layer = torch.nn.Linear(d_i, d_o, device=dev)
model = torch.nn.Sequential(layer)
model

Sequential(
  (0): Linear(in_features=4, out_features=1, bias=True)
)

In [5]:
layer.weight, layer.bias

(Parameter containing:
 tensor([[ 0.2524, -0.1093,  0.2488,  0.2231]], device='cuda:0',
        requires_grad=True),
 Parameter containing:
 tensor([-0.2603], device='cuda:0', requires_grad=True))

## Data Preparation

Randomly generate 1000 samples.

In [6]:
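num_samples = 1000
# Inputs drawn uniformly from [0, 10)
X = torch.rand(num_samples, d_i) * 10.0
# Targets from the true model plus Gaussian noise (std 0.01)
y = torch.matmul(X, A_hat.T) + b_hat + torch.normal(0.0, 0.01, (num_samples, d_o))
# Move the data to the selected device
X = X.to(device=dev)
y = y.to(device=dev)
X.shape, y.shape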

(torch.Size([1000, 4]), torch.Size([1000, 1]))

Build the dataset and split the data into mini-batches of 10 samples each.

In [7]:
batch_size = 10

ds = torch.utils.data.TensorDataset(X, y)
data_iter = torch.utils.data.DataLoader(ds, batch_size, shuffle=True)

next(iter(data_iter))

[tensor([[1.8960, 0.5356, 2.3351, 3.3987],
         [1.4141, 0.6709, 5.6295, 3.4531],
         [2.1134, 0.1226, 1.3275, 6.3274],
         [7.0991, 7.6374, 9.2194, 7.3721],
         [3.2078, 7.9251, 0.2291, 5.6411],
         [8.0466, 4.7269, 1.0142, 7.3573],
         [5.6678, 2.5008, 1.5807, 8.8860],
         [8.8065, 1.9865, 0.7387, 6.6372],
         [9.4163, 6.0235, 8.9035, 0.5926],
         [0.3429, 1.9129, 9.6532, 0.4823]], device='cuda:0'),
 tensor([[19.5059],
         [44.6729],
         [ 8.1282],
         [96.4854],
         [25.0127],
         [25.7107],
         [19.1767],
         [16.9334],
         [98.0975],
         [80.7498]], device='cuda:0')]
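
As a quick sanity check on the batching, the following sketch builds a dataset of the same size on the CPU and counts the mini-batches. The names `X_demo`, `y_demo`, `ds_demo`, and `loader` are illustrative assumptions, and the data is random placeholder data rather than the samples generated above.

```python
import torch

n, d_i, d_o = 1000, 4, 1
X_demo = torch.rand(n, d_i)  # placeholder inputs
y_demo = torch.rand(n, d_o)  # placeholder targets

ds_demo = torch.utils.data.TensorDataset(X_demo, y_demo)
loader = torch.utils.data.DataLoader(ds_demo, batch_size=10, shuffle=True)

print(len(ds_demo))  # number of samples
print(len(loader))   # mini-batches per epoch: 1000 / 10

# Each batch is a (features, labels) pair
features, labels = next(iter(loader))
print(tuple(features.shape), tuple(labels.shape))
```

With `shuffle=True`, the samples are reordered at the start of every epoch, so each epoch sees the same 100 batches' worth of data in a different grouping.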

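## Training

First, prepare the loss function. Here we use the mean squared error (MSE), so minimizing it is equivalent to least squares.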

In [8]:
loss = torch.nn.MSELoss()
loss

MSELoss()
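
To illustrate what `MSELoss` computes with its default `reduction='mean'`, this small sketch compares it with the mean of squared differences written out by hand. The tensors `pred` and `target` are made-up values for illustration only.

```python
import torch

# Made-up predictions and targets for illustration
pred = torch.tensor([[1.0], [2.0], [4.0]])
target = torch.tensor([[1.5], [2.0], [3.0]])

mse = torch.nn.MSELoss()  # default reduction='mean'
by_module = mse(pred, target)

# Same quantity written out: mean of squared differences
by_hand = ((pred - target) ** 2).mean()

print(by_module.item(), by_hand.item())
print(torch.allclose(by_module, by_hand))
```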

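Next, set up the optimizer (named `trainer` in the code): stochastic gradient descent (SGD) with a learning rate of 0.005.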

In [9]:
trainer = torch.optim.SGD(model.parameters(), lr=0.005)
trainer

SGD (
Parameter Group 0
    dampening: 0
    foreach: None
    lr: 0.005
    maximize: False
    momentum: 0
    nesterov: False
    weight_decay: 0
)
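
With momentum and weight decay both at 0, as in the configuration printed above, one SGD step should reduce to the update $\theta \leftarrow \theta - \text{lr} \cdot \nabla_\theta L$. The sketch below checks this on a small CPU layer. The layer, the random data, and the names `opt`, `expected_w`, and `expected_b` are assumptions made for this illustration.

```python
import torch

torch.manual_seed(0)
lr = 0.005

layer = torch.nn.Linear(4, 1)
opt = torch.optim.SGD(layer.parameters(), lr=lr)

# Random placeholder batch
x = torch.rand(10, 4)
target = torch.rand(10, 1)

loss_value = torch.nn.MSELoss()(layer(x), target)
opt.zero_grad()
loss_value.backward()

# Expected result of plain SGD: theta <- theta - lr * grad
expected_w = layer.weight.detach() - lr * layer.weight.grad
expected_b = layer.bias.detach() - lr * layer.bias.grad

opt.step()  # updates the parameters in place

print(torch.allclose(layer.weight, expected_w))
print(torch.allclose(layer.bias, expected_b))
```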

In [10]:
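max_epochs = 1000
threshold = 2e-4  # stop once the full-dataset MSE falls below this value
for epoch in range(max_epochs):
    # One pass over the shuffled mini-batches
    for features, labels in data_iter:
        l = loss(model(features), labels)
        trainer.zero_grad()  # clear gradients from the previous step
        l.backward()         # backpropagate to compute gradients
        trainer.step()       # apply the SGD update
    # Evaluate on the full dataset; no gradients are needed here
    with torch.no_grad():
        l = loss(model(X), y)
    print(f'epoch {epoch + 1}, loss {l:f}')
    if l < threshold:
        break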

epoch 1, loss 0.016631
epoch 2, loss 0.014189
epoch 3, loss 0.012076
epoch 4, loss 0.010374
epoch 5, loss 0.009004
epoch 6, loss 0.007600
epoch 7, loss 0.006944
epoch 8, loss 0.006149
epoch 9, loss 0.004850
epoch 10, loss 0.004352
epoch 11, loss 0.003727
epoch 12, loss 0.003133
epoch 13, loss 0.002800
epoch 14, loss 0.002427
epoch 15, loss 0.002232
epoch 16, loss 0.001757
epoch 17, loss 0.001530
epoch 18, loss 0.001328
epoch 19, loss 0.001157
epoch 20, loss 0.001025
epoch 21, loss 0.000905
epoch 22, loss 0.000955
epoch 23, loss 0.000762
epoch 24, loss 0.001103
epoch 25, loss 0.000551
epoch 26, loss 0.000540
epoch 27, loss 0.000438
epoch 28, loss 0.000379
epoch 29, loss 0.000349
epoch 30, loss 0.000317
epoch 31, loss 0.000343
epoch 32, loss 0.000261
epoch 33, loss 0.000258
epoch 34, loss 0.000275
epoch 35, loss 0.000200


In [11]:
layer.weight, layer.bias

(Parameter containing:
 tensor([[ 1.2014,  3.0020,  7.7014, -0.8981]], device='cuda:0',
        requires_grad=True),
 Parameter containing:
 tensor([0.6652], device='cuda:0', requires_grad=True))

In [12]:
A_hat, b_hat

(tensor([[ 1.2000,  3.0000,  7.7000, -0.9000]]), tensor([0.7000]))
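
Because minimizing MSE is a least-squares problem, the fitted parameters can be cross-checked against a closed-form solve. The sketch below is one possible check, not part of the original notebook: it regenerates data in the same way on the CPU (with a fixed seed, so the samples differ from those above) and solves for the weights and bias with `torch.linalg.lstsq`. The names `A_true`, `b_true`, `X_aug`, `theta`, `A_est`, and `b_est` are assumptions for this illustration.

```python
import torch

torch.manual_seed(0)

# Same true parameters as in the article
A_true = torch.tensor([[1.2, 3.0, 7.7, -0.9]])
b_true = torch.tensor([0.7])

n = 1000
X = torch.rand(n, 4) * 10.0
y = X @ A_true.T + b_true + torch.normal(0.0, 0.01, (n, 1))

# Append a column of ones so the bias is estimated together with the weights
X_aug = torch.cat([X, torch.ones(n, 1)], dim=1)

# Solve min ||X_aug @ theta - y||^2 directly
theta = torch.linalg.lstsq(X_aug, y).solution  # shape (5, 1)
A_est = theta[:4].T  # estimated weights, shape (1, 4)
b_est = theta[4]     # estimated bias, shape (1,)

print(A_est)
print(b_est)
```

A direct solve like this is practical for small linear problems. The SGD loop above remains useful as a template, since the same structure carries over to models without a closed-form solution.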