### 生成数据集

给定$x$，我们将[**使用以下三阶多项式来生成训练和测试数据的标签：**]

(**$$y = 5 + 1.2x - 3.4\frac{x^2}{2!} + 5.6 \frac{x^3}{3!} + \epsilon \text{ where }
\epsilon \sim \mathcal{N}(0, 0.1^2).$$**)

噪声项$\epsilon$服从均值为0且标准差为0.1的正态分布。
在优化的过程中，我们通常希望避免非常大的梯度值或损失值。
这就是我们将特征从$x^i$调整为$\frac{x^i}{i!}$的原因，
这样可以避免很大的$i$带来的特别大的指数值。
我们将为训练集和测试集各生成100个样本。

In [1]:
import math
import numpy as np
import torch
from torch import nn
from d2l import torch as d2l

# 构造数据集

In [52]:
max_degree = 20 # 我理解的是数据的维度，沐神给出的是多项式的最大维度（实际上是模型的复杂度）也就是W的维度
n_train, n_test = 100, 100 # 训练集与测试集
true_w = np.zeros(max_degree) # 给w分配20维度，20行一列的矩阵
true_w[0:4] = np.array([5, 1.2, -3.4, 5.6]) # 给w的前4维赋值
# 生成的features在这里是一个矩阵，200行一列的矩阵
features = np.random.normal(size = (n_train + n_test, 1))
np.random.shuffle(features) # 给features做随机调换
poly_features = np.power(features, np.arange(max_degree).reshape(1, -1))
# ploy_features.shape
features

array([[ 6.88256332e-01],
       [-1.84380106e+00],
       [-1.09970511e+00],
       [ 6.35413463e-01],
       [ 1.33745668e-01],
       [-4.78182753e-01],
       [ 1.15787837e+00],
       [-8.66663447e-01],
       [ 3.73214027e-01],
       [-6.03622511e-01],
       [-1.77043140e+00],
       [ 1.41970474e+00],
       [-9.99835678e-02],
       [-3.63540773e-01],
       [-2.86396275e-01],
       [ 1.33448047e+00],
       [-8.53423334e-01],
       [-1.36132761e+00],
       [ 1.22802865e+00],
       [ 1.08506783e+00],
       [ 1.26027388e+00],
       [-1.05307968e+00],
       [ 9.81058733e-01],
       [-8.70929398e-01],
       [ 6.89809645e-01],
       [ 1.34957278e+00],
       [-7.71629850e-01],
       [ 8.95675115e-01],
       [-2.53945311e-01],
       [ 3.90461948e-01],
       [ 9.89252503e-01],
       [-1.23375087e+00],
       [ 2.51856719e-01],
       [ 5.46698976e-01],
       [-3.86893291e-01],
       [ 7.89759211e-01],
       [-1.05521914e+00],
       [ 3.42179752e-01],
       [ 1.3

In [53]:
# 可以看出这个操作是单纯的将一个数组变成一个一行n列的矩阵
np.arange(max_degree).reshape(1, -1)

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,
        16, 17, 18, 19]])

In [54]:
poly_features

array([[ 1.00000000e+00,  6.88256332e-01,  4.73696778e-01, ...,
         1.74483218e-03,  1.20089180e-03,  8.26521383e-04],
       [ 1.00000000e+00, -1.84380106e+00,  3.39960234e+00, ...,
        -3.28956915e+04,  6.06531108e+04, -1.11832270e+05],
       [ 1.00000000e+00, -1.09970511e+00,  1.20935133e+00, ...,
        -5.03148439e+00,  5.53314909e+00, -6.08483232e+00],
       ...,
       [ 1.00000000e+00, -1.10002353e+00,  1.21005178e+00, ...,
        -5.05630896e+00,  5.56205886e+00, -6.11839564e+00],
       [ 1.00000000e+00, -1.05752127e+00,  1.11835124e+00, ...,
        -2.58770573e+00,  2.73655384e+00, -2.89396390e+00],
       [ 1.00000000e+00, -1.32954560e+00,  1.76769151e+00, ...,
        -1.26752066e+02,  1.68522651e+02, -2.24058550e+02]])

In [55]:
for i in range(max_degree):
    # 对第i列做操作
    poly_features[:, i] /= math.gamma(i + 1)  # gamma(n)=(n-1)! 这里暂时不理解
# labels的维度:(n_train+n_test,) = [200 * 20] * [20 * 1] 
# 他的结果就是做了一个线性回归的内积将结果也就是label计算出来
labels = np.dot(ploy_features,true_w)
labels.shape

(200,)

In [56]:
poly_features.shape

(200, 20)

In [57]:
true_w.shape

(20,)

In [58]:
# 在lable上加上随机扰动项
labels += np.random.normal(scale=0.1, size=labels.shape)

In [59]:
# 将NumPyndarray转换成tensor
true_w, features,poly_features, labels = [
    torch.tensor(x, dtype = torch.float32) for x in [true_w, features,poly_features, labels]
]

In [60]:
# 分别只取前两行元素，拿到的全部就是前两个样本的全部
features[:2], poly_features[:2, :], labels[:2]

(tensor([[ 0.6883],
         [-1.8438]]),
 tensor([[ 1.0000e+00,  6.8826e-01,  2.3685e-01,  5.4337e-02,  9.3495e-03,
           1.2870e-03,  1.4763e-04,  1.4515e-05,  1.2488e-06,  9.5497e-08,
           6.5726e-09,  4.1124e-10,  2.3587e-11,  1.2487e-12,  6.1390e-14,
           2.8168e-15,  1.2117e-16,  4.9055e-18,  1.8757e-19,  6.7945e-21],
         [ 1.0000e+00, -1.8438e+00,  1.6998e+00, -1.0447e+00,  4.8155e-01,
          -1.7758e-01,  5.4570e-02, -1.4374e-02,  3.3128e-03, -6.7868e-04,
           1.2513e-04, -2.0975e-05,  3.2228e-06, -4.5709e-07,  6.0199e-08,
          -7.3997e-09,  8.5272e-10, -9.2485e-11,  9.4735e-12, -9.1933e-13]]),
 tensor([  4.8931, -28.5661]))

In [None]:
# 数据造好了