### 生成数据集

给定$x$，我们将[**使用以下三阶多项式来生成训练和测试数据的标签：**]

(**$$y = 5 + 1.2x - 3.4\frac{x^2}{2!} + 5.6 \frac{x^3}{3!} + \epsilon \text{ where }
\epsilon \sim \mathcal{N}(0, 0.1^2).$$**)

噪声项$\epsilon$服从均值为0且标准差为0.1的正态分布。
在优化的过程中，我们通常希望避免非常大的梯度值或损失值。
这就是我们将特征从$x^i$调整为$\frac{x^i}{i!}$的原因，
这样可以避免很大的$i$带来的特别大的指数值。
我们将为训练集和测试集各生成100个样本。

In [1]:
import math
import numpy as np
import torch
from torch import nn
from d2l import torch as d2l

# 构造数据集

In [26]:
max_degree = 20 # 我理解的是数据的维度，沐神给出的是多项式的最大维度（实际上是模型的复杂度）也就是W的维度
n_train, n_test = 100, 100 # 训练集与测试集
true_w = np.zeros(max_degree) # 给w分配20维度，20行一列的矩阵
true_w[0:4] = np.array([5, 1.2, -3.4, 5.6]) # 给w的前4维赋值
# 生成的features在这里是一个矩阵，200行一列的矩阵
features = np.random.normal(size = (n_train + n_test, 1))
np.random.shuffle(features) # 给features做随机调换
poly_features = np.power(features, np.arange(max_degree).reshape(1, -1))
# ploy_features.shape
features

array([[ 0.03436456],
       [-1.59934524],
       [-0.69068393],
       [-1.1544175 ],
       [-0.25289406],
       [ 2.00508396],
       [-0.81013213],
       [-1.58578474],
       [ 0.04310299],
       [ 0.95053071],
       [ 0.00948702],
       [-0.98190535],
       [-0.09482405],
       [ 1.77389412],
       [-1.25790191],
       [ 0.40586588],
       [ 0.28186877],
       [-0.72855052],
       [ 0.91268941],
       [-0.9899967 ],
       [ 0.01769862],
       [-0.24421415],
       [-1.14185796],
       [-2.34111988],
       [ 0.41418036],
       [-0.22003836],
       [-2.4392561 ],
       [-0.06593655],
       [ 1.52398277],
       [ 0.2689475 ],
       [ 0.17694853],
       [ 1.14259919],
       [ 1.66200868],
       [-1.52681476],
       [ 1.03401733],
       [-0.91297487],
       [-0.92741635],
       [ 0.30696952],
       [ 0.54299861],
       [ 1.8242183 ],
       [-0.28908461],
       [-0.81670805],
       [-0.8375747 ],
       [-0.17969222],
       [-1.37623897],
       [-0

In [13]:
# 可以看出这个操作是单纯的将一个数组变成一个一行n列的矩阵
np.arange(max_degree).reshape(1, -1)

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,
        16, 17, 18, 19]])

In [27]:
poly_features

array([[ 1.00000000e+00,  3.43645634e-02,  1.18092322e-03, ...,
         1.29982272e-25,  4.46678402e-27,  1.53499083e-28],
       [ 1.00000000e+00, -1.59934524e+00,  2.55790521e+00, ...,
        -2.93101335e+03,  4.68770226e+03, -7.49725432e+03],
       [ 1.00000000e+00, -6.90683926e-01,  4.77044285e-01, ...,
        -1.85246038e-03,  1.27946460e-03, -8.83705636e-04],
       ...,
       [ 1.00000000e+00,  1.72962749e+00,  2.99161124e+00, ...,
         1.10966997e+04,  1.91931568e+04,  3.31970115e+04],
       [ 1.00000000e+00, -1.91993897e+00,  3.68616564e+00, ...,
        -6.54464650e+04,  1.25653218e+05, -2.41246510e+05],
       [ 1.00000000e+00,  9.06701985e-01,  8.22108490e-01, ...,
         1.89189762e-01,  1.71538733e-01,  1.55534509e-01]])

In [46]:
for i in range(max_degree):
    # 对第i列做操作
    poly_features[:, i] /= math.gamma(i + 1)  # gamma(n)=(n-1)! 这里暂时不理解
# labels的维度:(n_train+n_test,) = [200 * 20] * [20 * 1] 
# 他的结果就是做了一个线性回归的内积将结果也就是label计算出来
labels = np.dot(ploy_features,true_w)
labels.shape

(200,)

In [47]:
poly_features.shape

torch.Size([200, 20])

In [48]:
true_w.shape

torch.Size([20])

In [49]:
# 在lable上加上随机扰动项
labels += np.random.normal(scale=0.1, size=labels.shape)

In [50]:
# 将NumPyndarray转换成tensor
true_w, features,poly_features, labels = [
    torch.tensor(x, dtype = torch.float32) for x in [true_w, features,poly_features, labels]
]

  torch.tensor(x, dtype = torch.float32) for x in [true_w, features,poly_features, labels]


In [51]:
# 分别只取前两行元素，拿到的全部就是前两个样本的全部
features[:2], poly_features[:2, :], labels[:2]

(tensor([[ 0.0344],
         [-1.5993]]),
 tensor([[ 1.0000e+00,  3.4365e-02,  5.9046e-04,  6.7637e-06,  5.8107e-08,
           3.9937e-10,  2.2873e-12,  1.1229e-14,  4.8235e-17,  1.8418e-19,
           6.3291e-22,  1.9773e-24,  5.6623e-27,  1.4968e-29,  3.6740e-32,
           8.4171e-35,  1.8078e-37,  3.6544e-40,  6.9785e-43,  1.4013e-45],
         [ 1.0000e+00, -1.5993e+00,  1.2790e+00, -6.8183e-01,  2.7262e-01,
          -8.7203e-02,  2.3245e-02, -5.3109e-03,  1.0617e-03, -1.8868e-04,
           3.0176e-05, -4.3874e-06,  5.8475e-07, -7.1940e-08,  8.2183e-09,
          -8.7626e-10,  8.7590e-11, -8.2404e-12,  7.3218e-13, -6.1632e-14]]),
 tensor([  5.1239, -28.3540]))

In [None]:
# 数据造好了