Given time-history data $X$ and $Y$ with a linear relationship:

$$
Y=AX
$$

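Here we try gradient descent in PyTorch instead, estimating $A$ by linear regression.

Consider a simple linear regression problem without a bias term. For a single output $y_1$: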

$$
\begin{aligned}
y_1 & = a_{11} x_1+ a_{21} x_2 +\cdots +a_{n1} x_n \\
& = \begin{bmatrix} x_{1} & x_{2} & \cdots & x_{n} \end{bmatrix}
\begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{n1} \end{bmatrix}
\end{aligned}
$$

If there are $T$ paired observations of $\{x_i\}$ and $y_1$, then:

$$
\begin{bmatrix} y_1(1) \\ y_1(2) \\ \vdots \\ y_1(T) \end{bmatrix} =
\begin{bmatrix}
x_{1}(1) & x_{2}(1) & \cdots & x_{n}(1) \\
x_{1}(2) & x_{2}(2) & \cdots & x_{n}(2) \\
\vdots & \vdots & \ddots & \vdots \\
x_{1}(T) & x_{2}(T) & \cdots & x_{n}(T)
\end{bmatrix}
\begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{n1} \end{bmatrix}
$$

Introducing further outputs $\{y_i\}$ then gives:

$$
\begin{aligned}
\begin{bmatrix}
y_{1}(1) & y_{2}(1) & \cdots & y_{m}(1) \\
y_{1}(2) & y_{2}(2) & \cdots & y_{m}(2) \\
\vdots & \vdots & \ddots & \vdots \\
y_{1}(T) & y_{2}(T) & \cdots & y_{m}(T)
\end{bmatrix}_{T \times m} & =
\begin{bmatrix}
x_{1}(1) & x_{2}(1) & \cdots & x_{n}(1) \\
x_{1}(2) & x_{2}(2) & \cdots & x_{n}(2) \\
\vdots & \vdots & \ddots & \vdots \\
x_{1}(T) & x_{2}(T) & \cdots & x_{n}(T)
\end{bmatrix}_{T \times n}
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1m} \\
a_{21} & a_{22} & \cdots & a_{2m} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nm}
\end{bmatrix}_{n \times m} \\
\Rightarrow
Y_{T \times m} & = X_{T \times n} A_{n \times m}
\end{aligned}
$$
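The shape bookkeeping in the stacked form can be checked numerically. The snippet below is a minimal sketch: the sizes `T`, `n`, `m` and the random data are illustrative assumptions, not values from the original data set.

```python
import torch

torch.manual_seed(0)

T, n, m = 6, 3, 2       # illustrative sizes (assumed)
X = torch.rand(T, n)    # one sample per row, one input channel per column
A = torch.rand(n, m)    # column j holds the coefficients of output y_j
Y = X @ A               # stacked form Y = X A, shape (T, m)

# The first column of Y is the single-output case y_1 = X a_1
y1 = X @ A[:, 0]

print(tuple(Y.shape))
print(torch.allclose(Y[:, 0], y1))
```

Each additional output only adds a column to $Y$ and to $A$; the sample matrix $X$ is shared by all outputs.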

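In the original data, time evolves along the rows of $X$ and $Y$ (each column is one time sample), whereas the stacked form above places one sample per row. The problem is therefore rewritten as follows, which makes it easy to implement the linear regression as a single-layer neural network in **PyTorch**: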

$$
\begin{aligned}
Y & = A X \\
\Rightarrow
Y^{H} & = X^{H} A^{H}
\end{aligned}
$$

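It is then enough to fit $A^{H}$ by linear regression and take its conjugate transpose to obtain $A$. For real-valued data the conjugate transpose is just the ordinary transpose.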
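As a cross-check on the transposition step, the same problem can be solved in closed form with `torch.linalg.lstsq`. This is only a sketch for real-valued data: the sizes and the matrix `A_true` are assumptions chosen to match the coefficients used in the multi-output example, and a least-squares solver stands in for gradient descent here.

```python
import torch

torch.manual_seed(0)

n, T = 2, 50                          # illustrative sizes (assumed)
A_true = torch.tensor([[3.0, 8.0],
                       [5.0, 4.0],
                       [2.0, 1.0]])   # shape (m, n)
X = torch.rand(n, T)                  # time runs along the columns
Y = A_true @ X                        # Y = A X

# Transposed problem Y^T = X^T A^T: one sample per row
A_T = torch.linalg.lstsq(X.T, Y.T).solution   # estimate of A^T, shape (n, m)
A_est = A_T.T                                  # transpose back to recover A

print(A_est)
```

For noise-free data the recovered `A_est` should agree with `A_true` up to floating-point error, which gives a reference value for the gradient-descent fits.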

In [1]:
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from tqdm import tqdm

# Select the training device
def get_device(device=0):
    if torch.cuda.is_available():
        device_name = f"cuda:{device}"
        if torch.cuda.device_count() > device:
            return device_name
        else:
            print(f"No such cuda device: {device}")
    return "cpu"


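# Define the network structure
# Note: nn.Linear includes a bias term by default; pass bias=False for a strictly bias-free fit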
class LinearRegression(nn.Module):
    def __init__(self, input_size, output_size):
        super().__init__()
        self.linear = nn.Linear(input_size, output_size)

    def forward(self, x):
        return self.linear(x)


# Define the training function
def trainer(
    model, train_loader, num_epochs, optimizer, loss_type=nn.MSELoss(), device=0
):
    print("PyTorch Version:", torch.__version__)
    device = get_device(device)
    print("Training on", device)
    print(
        "====================================Start training===================================="
    )
    model.to(device)
    for epoch in range(num_epochs):
        with tqdm(
            train_loader, desc=f"Epoch {epoch+1}/{num_epochs}", unit="batch"
        ) as t:
            for x, y in t:
                x, y = x.to(device), y.to(device)
                optimizer.zero_grad()
                output = model(x)
                loss = loss_type(output, y)
                loss.backward()
                optimizer.step()
                t.set_postfix(loss=loss.item())
    print(
        "====================================Finish training====================================\n"
    )
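It helps to know how `nn.Linear` stores its parameters before reading the fitted weights. The layer computes `x @ weight.T + bias` with `weight` of shape `(out_features, in_features)`, so for real-valued data the stored `weight` already has the layout of $A$ in $Y = AX$. The sketch below illustrates this with a hand-set weight; the matrix values are assumptions taken from the multi-output example, and `bias=False` is used only to keep the check simple.

```python
import torch
from torch import nn

A = torch.tensor([[3.0, 8.0],
                  [5.0, 4.0],
                  [2.0, 1.0]])        # shape (m, n) = (3, 2)

layer = nn.Linear(2, 3, bias=False)
with torch.no_grad():
    layer.weight.copy_(A)             # weight has shape (out_features, in_features)

x = torch.tensor([[1.0, 2.0]])        # one sample as a row vector
with torch.no_grad():
    y = layer(x)                      # computes x @ weight.T

print(tuple(layer.weight.shape))
print(y)
```

Under this layout, reading `model.linear.weight` after training gives the estimate of $A$ directly, with no extra transpose.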


Single-input, single-output linear regression example

In [2]:
# Training data
X = torch.linspace(1, 1000, 1000).reshape(-1, 1)
y = torch.linspace(2, 2000, 1000).reshape(-1, 1)

# Data loader
train_data = TensorDataset(X, y)
train_loader = DataLoader(train_data, batch_size=10, shuffle=True)

# Model
model1 = LinearRegression(1, 1)

# Train the model
optimizer = torch.optim.Adam(model1.parameters(), lr=0.01, weight_decay=0)
trainer(
    model1,
    train_loader,
    num_epochs=10,
    optimizer=optimizer,
    loss_type=nn.MSELoss(),
    device=0,
)

WEIGHT = model1.linear.weight.detach().cpu()
WEIGHT = WEIGHT.numpy()
print("A =\n", WEIGHT)


PyTorch Version: 1.13.1
Training on cuda:0


Epoch 1/10: 100%|██████████| 100/100 [00:03<00:00, 25.47batch/s, loss=4.52e+5]
Epoch 2/10: 100%|██████████| 100/100 [00:00<00:00, 332.71batch/s, loss=1.1e+5]
Epoch 3/10: 100%|██████████| 100/100 [00:00<00:00, 392.57batch/s, loss=1.25e+4]
Epoch 4/10: 100%|██████████| 100/100 [00:00<00:00, 351.23batch/s, loss=2.64e+3]
Epoch 5/10: 100%|██████████| 100/100 [00:00<00:00, 373.59batch/s, loss=152]
Epoch 6/10: 100%|██████████| 100/100 [00:00<00:00, 332.70batch/s, loss=5.79]
Epoch 7/10: 100%|██████████| 100/100 [00:00<00:00, 428.69batch/s, loss=0.646]
Epoch 8/10: 100%|██████████| 100/100 [00:00<00:00, 394.14batch/s, loss=0.717]
Epoch 9/10: 100%|██████████| 100/100 [00:00<00:00, 378.53batch/s, loss=0.342]
Epoch 10/10: 100%|██████████| 100/100 [00:00<00:00, 350.15batch/s, loss=0.393]


A =
 [[1.9978443]]
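The fitted weight is close to, but not exactly, 2. One plausible contributor (an assumption, not verified against the original run) is that `nn.Linear` also fits a bias term and that Adam was stopped after 10 epochs. For comparison, the bias-free least-squares slope of the same data can be computed in closed form, as in this minimal sketch:

```python
import torch

X = torch.linspace(1, 1000, 1000).reshape(-1, 1)
y = torch.linspace(2, 2000, 1000).reshape(-1, 1)

# Bias-free least squares for a single input: a = sum(x * y) / sum(x * x)
a = (X * y).sum() / (X * X).sum()

print(a.item())
```

The closed-form slope gives a reference against which the gradient-descent estimate can be judged.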





Multi-input, single-output example

In [4]:
# Training data
X1 = torch.rand(1000, 1)
X2 = torch.rand(1000, 1)
X = torch.cat((X1, X2), dim=1)
y = 3 * X1 + 8 * X2

# Data loader
train_data = TensorDataset(X, y)
train_loader = DataLoader(train_data, batch_size=5, shuffle=True)

# Model
model2 = LinearRegression(2, 1)

# Train the model
optimizer = torch.optim.Adam(model2.parameters(), lr=0.01, weight_decay=0)
trainer(
    model2,
    train_loader,
    num_epochs=30,
    optimizer=optimizer,
    loss_type=nn.MSELoss(),
    device=0,
)

WEIGHT = model2.linear.weight.detach().cpu()
WEIGHT = WEIGHT.numpy()
print("A =\n", WEIGHT)


PyTorch Version: 1.13.1
Training on cuda:0


Epoch 1/30: 100%|██████████| 200/200 [00:00<00:00, 590.72batch/s, loss=12.6]
Epoch 2/30: 100%|██████████| 200/200 [00:00<00:00, 541.71batch/s, loss=3.13] 
Epoch 3/30: 100%|██████████| 200/200 [00:00<00:00, 555.52batch/s, loss=1.27] 
Epoch 4/30: 100%|██████████| 200/200 [00:00<00:00, 574.44batch/s, loss=0.53] 
Epoch 5/30: 100%|██████████| 200/200 [00:00<00:00, 543.17batch/s, loss=1.02] 
Epoch 6/30: 100%|██████████| 200/200 [00:00<00:00, 561.88batch/s, loss=1.42] 
Epoch 7/30: 100%|██████████| 200/200 [00:00<00:00, 574.03batch/s, loss=0.876]
Epoch 8/30: 100%|██████████| 200/200 [00:00<00:00, 552.63batch/s, loss=0.38] 
Epoch 9/30: 100%|██████████| 200/200 [00:00<00:00, 490.18batch/s, loss=0.893] 
Epoch 10/30: 100%|██████████| 200/200 [00:00<00:00, 562.36batch/s, loss=0.352] 
Epoch 11/30: 100%|██████████| 200/200 [00:00<00:00, 534.37batch/s, loss=0.26]  
Epoch 12/30: 100%|██████████| 200/200 [00:00<00:00, 528.12batch/s, loss=0.152] 
Epoch 13/30: 100%|██████████| 200/200 [00:00<00:00, 598.26


A =
 [[2.9999115 7.9998837]]





Multi-input, multi-output example

In [3]:
# Training data
X1 = torch.rand(1000, 1)
X2 = torch.rand(1000, 1)
X = torch.cat((X1, X2), dim=1)
y1 = 3 * X1 + 8 * X2
y2 = 5 * X1 + 4 * X2
y3 = 2 * X1 + 1 * X2
y = torch.cat((y1, y2, y3), dim=1)

# Data loader
train_data = TensorDataset(X, y)
train_loader = DataLoader(train_data, batch_size=5, shuffle=True)

# Model
model3 = LinearRegression(2, 3)

# Train the model
optimizer = torch.optim.Adam(model3.parameters(), lr=0.01, weight_decay=0)
trainer(
    model3,
    train_loader,
    num_epochs=30,
    optimizer=optimizer,
    loss_type=nn.MSELoss(),
    device=0,
)

WEIGHT = model3.linear.weight.detach().cpu()
WEIGHT = WEIGHT.numpy()
print("A =\n", WEIGHT)


PyTorch Version: 1.13.1
Training on cuda:0


Epoch 1/30: 100%|██████████| 200/200 [00:00<00:00, 269.52batch/s, loss=3.68] 
Epoch 2/30: 100%|██████████| 200/200 [00:00<00:00, 296.43batch/s, loss=0.772]
Epoch 3/30: 100%|██████████| 200/200 [00:00<00:00, 318.36batch/s, loss=1.18] 
Epoch 4/30: 100%|██████████| 200/200 [00:00<00:00, 359.80batch/s, loss=0.544]
Epoch 5/30: 100%|██████████| 200/200 [00:00<00:00, 333.59batch/s, loss=0.742]
Epoch 6/30: 100%|██████████| 200/200 [00:00<00:00, 357.93batch/s, loss=0.512]
Epoch 7/30: 100%|██████████| 200/200 [00:00<00:00, 358.21batch/s, loss=0.478] 
Epoch 8/30: 100%|██████████| 200/200 [00:00<00:00, 312.34batch/s, loss=0.255] 
Epoch 9/30: 100%|██████████| 200/200 [00:00<00:00, 358.85batch/s, loss=0.0756]
Epoch 10/30: 100%|██████████| 200/200 [00:00<00:00, 336.18batch/s, loss=0.144] 
Epoch 11/30: 100%|██████████| 200/200 [00:00<00:00, 378.21batch/s, loss=0.116] 
Epoch 12/30: 100%|██████████| 200/200 [00:00<00:00, 326.37batch/s, loss=0.0753]
Epoch 13/30: 100%|██████████| 200/200 [00:00<00:00, 342


A =
 [[2.9997127  7.9996257 ]
 [4.999993   3.9999979 ]
 [1.9999995  0.99999994]]
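To show what the optimizer does under the hood, the same multi-output fit can be written as plain full-batch gradient descent with autograd. This is a sketch under stated assumptions: it omits the bias, uses a hand-picked step size `lr` and iteration count, and replaces Adam and the mini-batch `DataLoader` with a single full-batch update per step, so it is not the training setup used above.

```python
import torch

torch.manual_seed(0)

X = torch.rand(1000, 2)
A_true = torch.tensor([[3.0, 8.0],
                       [5.0, 4.0],
                       [2.0, 1.0]])         # shape (m, n)
Y = X @ A_true.T                            # one sample per row

W = torch.zeros(3, 2, requires_grad=True)   # same layout as nn.Linear.weight
lr = 1.0                                    # assumed step size
for step in range(1000):
    loss = torch.mean((X @ W.T - Y) ** 2)   # mean squared error
    loss.backward()
    with torch.no_grad():
        W -= lr * W.grad                    # gradient-descent update
    W.grad.zero_()                          # reset the accumulated gradient

print(W.detach())
```

On this noise-free data `W` should approach the same coefficient matrix as the trained `model3.linear.weight`.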



