# Dive into Deep Learning: Deep Learning Basics

Import packages

In [21]:
import os

import numpy as np
import pandas as pd
import torch
from tensorflow.keras.datasets import fashion_mnist
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

Trust the current notebook

In [3]:
!jupyter trust 2.深度学习基础.ipynb

Signing notebook: 2.深度学习基础.ipynb


## Example: a simplified house-price prediction model (8)

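- Assumption 1: the key factors that determine the house price are the `number of bedrooms` $x_1$, the `number of bathrooms` $x_2$, and the `living area` $x_3$
- Assumption 2: the `sale price` $y$ is a weighted sum of the key factors, i.e. $y = w_1 x_1 + w_2 x_2 + w_3 x_3 + b$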
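
As a quick illustration of Assumption 2, the weighted sum can be sketched in plain Python. The helper name `predict_price`, the weights, the bias, and the example house are all made-up values chosen for illustration; none of them come from the source or from real data.

```python
def predict_price(x, w, b):
    """Return the weighted sum w1*x1 + w2*x2 + w3*x3 + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

# Hypothetical house: 3 bedrooms, 2 bathrooms, living area of 120.
x = [3, 2, 120]
# Hypothetical weights and bias (not learned from any data).
w = [10.0, 5.0, 2.0]
b = 50.0

price = predict_price(x, w, b)
print(price)  # 30 + 10 + 240 + 50 = 330.0
```

Each weight says how much the price changes when its factor increases by one unit, and the bias is the predicted price when all factors are zero.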


Loss function
- Squared loss: $l(y,\widehat{y}) = \frac{1}{2} (y-\widehat{y})^2$
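
The squared loss for a single prediction can be written as a one-line function. This is a minimal sketch; the function name `squared_loss` and the sample values are assumptions made for illustration.

```python
def squared_loss(y, y_hat):
    """Squared loss 1/2 * (y - y_hat)^2 for a single sample."""
    return 0.5 * (y - y_hat) ** 2

print(squared_loss(3.0, 2.0))  # 0.5
print(squared_loss(3.0, 5.0))  # 2.0
```

The factor $\frac{1}{2}$ is a convenience: it cancels the 2 that appears when the loss is differentiated, so the derivative with respect to $\widehat{y}$ is simply $\widehat{y} - y$.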

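Suppose there are $n$ samples, where $x_i$ now denotes the feature vector of the $i$-th sample and $y_i$ its label: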
$$X = [x_1,x_2,\dots,x_n]^T$$
$$y = [y_1,y_2,\dots,y_n]^T$$

Training loss
$$l(X,y,w,b)=\frac{1}{2n} \sum_{i=1}^{n}(y_i - \left \langle x_i,w \right \rangle -b)^2 = \frac{1}{2n} \left \| y-Xw-b \right \|^2$$
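
The two forms of the training loss, the per-sample sum and the vectorized norm, can be checked against each other numerically. The sketch below uses small random tensors; the sizes `n`, `d` and the value of `b` are arbitrary choices for illustration.

```python
import torch

torch.manual_seed(0)
n, d = 5, 3                     # hypothetical number of samples and features
X = torch.randn(n, d)
y = torch.randn(n)
w = torch.randn(d)
b = 0.5

# Per-sample form: (1 / 2n) * sum_i (y_i - <x_i, w> - b)^2
loss_sum = sum((y[i] - torch.dot(X[i], w) - b) ** 2 for i in range(n)) / (2 * n)

# Vectorized form: (1 / 2n) * ||y - Xw - b||^2
residual = y - X @ w - b
loss_vec = residual.pow(2).sum() / (2 * n)

print(torch.allclose(loss_sum, loss_vec))  # True
```

The vectorized form is the one normally used in practice, since a single matrix-vector product replaces the Python loop over samples.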

Learn the parameters by minimizing the loss
$$w^*,b^*= \arg\min_{w,b} l(X,y,w,b)$$
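
One common way to approximate this minimizer is full-batch gradient descent. The notes above do not prescribe a method, so treat this as an assumed illustration: the ground-truth parameters, the step size `lr`, and the number of steps are all hypothetical. For the loss above, the gradients are $-\frac{1}{n} X^T (y - Xw - b)$ with respect to $w$ and $-\frac{1}{n} \sum_i (y_i - \langle x_i, w \rangle - b)$ with respect to $b$.

```python
import torch

torch.manual_seed(0)
n = 200
true_w = torch.tensor([1.5, -2.0, 0.5])   # hypothetical ground truth
true_b = 0.7
X = torch.randn(n, 3)
y = X @ true_w + true_b                   # noise-free labels, for clarity

w = torch.zeros(3)
b = torch.tensor(0.0)
lr = 0.1                                  # hypothetical step size

for _ in range(500):
    residual = y - X @ w - b              # shape (n,)
    grad_w = -(X.T @ residual) / n        # gradient of the loss w.r.t. w
    grad_b = -residual.mean()             # gradient of the loss w.r.t. b
    w = w - lr * grad_w
    b = b - lr * grad_b

print(w, b)
```

Because the labels here contain no noise, the learned `w` and `b` should end up very close to `true_w` and `true_b`.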

## Implementing linear regression from scratch


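The training set has 1000 samples and 2 input features. Given the weights $w=[2,-3.4]^T$, the bias $b=4.2$, and a random noise term $\epsilon$, the labels are generated as:
$$y=Xw+b+\epsilon $$

Here $\epsilon$ follows a normal distribution with mean 0 and standard deviation 0.01. It represents meaningless interference in the data.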
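
The full label-generation formula, including the bias and the noise, can be sketched as follows. The helper name `synthetic_data` and its parameter names are assumptions made for this example, not taken from the notebook.

```python
import torch

def synthetic_data(w, b, num_examples, noise_std=0.01):
    """Generate y = Xw + b + eps with eps ~ N(0, noise_std^2)."""
    X = torch.normal(mean=0.0, std=1.0, size=(num_examples, len(w)))
    # The noise has shape (num_examples,), matching X @ w.
    eps = torch.normal(mean=0.0, std=noise_std, size=(num_examples,))
    y = X @ w + b + eps
    return X, y

torch.manual_seed(1)
true_w = torch.tensor([2.0, -3.4])
true_b = 4.2
features, labels = synthetic_data(true_w, true_b, 1000)
print(features.shape, labels.shape)  # torch.Size([1000, 2]) torch.Size([1000])
```

The noise is drawn with shape `(1000,)` on purpose. Adding a `(1000, 1)` noise tensor to the `(1000,)` result of `X @ w` would broadcast to a `(1000, 1000)` matrix instead of a vector of labels.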

Generate the dataset.

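In [50]:

torch.manual_seed(1)  # fix the seed so the run is reproducible
# Features: 1000 samples with 2 features each, drawn from N(0, 1)
X = torch.normal(mean=0,std=1,size=(1000,2))
X
# True weights
w = torch.tensor([2,-3.4])
w

# Noise term epsilon ~ N(0, 0.01^2), shape (1000, 1); defined here but not yet added to y
c = torch.normal(mean=0,std=0.01,size=(1000,1))

# True bias; also not yet added to y in this cell
b = 4.2

# Linear part Xw only, shape (1000,)
y = torch.matmul(X,w)

X@w  # same as torch.matmul(X, w), so the last two outputs are identical
y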

<torch._C.Generator at 0x2399bdc45f0>

tensor([[-1.5256, -0.7502],
        [-0.6540, -1.6095],
        [-0.1002, -0.6092],
        ...,
        [ 2.0441, -1.3229],
        [ 1.0491, -2.2162],
        [ 1.0833,  1.5990]])

tensor([ 2.0000, -3.4000])

tensor([-5.0040e-01,  4.1643e+00,  1.8709e+00,  3.5114e+00, -2.4569e+00,
        -6.9968e-01, -6.1807e+00, -1.1331e+00,  2.5526e+00,  7.2221e-01,
         9.0041e-01, -4.0707e+00,  2.8015e+00,  4.2684e-01,  2.2226e+00,
        -1.1486e+00,  4.0039e+00,  2.9453e+00,  2.4272e-01,  1.9232e+00,
        -1.2173e+00, -2.1781e-01,  8.2412e+00,  5.2911e+00, -3.7083e+00,
         1.6185e+00,  2.4064e-02, -4.5221e+00, -2.9166e+00,  1.8198e+00,
         6.0351e+00, -4.1195e+00,  3.4283e+00,  2.2264e+00, -3.9555e-01,
         3.3117e+00, -7.7801e-01,  8.3380e-01,  9.8224e-01,  4.1309e+00,
        -1.3522e+00, -3.7957e+00, -1.1660e+00,  8.8724e-01,  2.2159e+00,
        -4.5721e+00, -4.9570e-01, -6.1861e+00, -2.7228e+00,  4.7390e+00,
         5.5374e+00, -5.6358e+00,  6.1588e-01,  3.0725e-01, -2.0961e+00,
         8.9716e-01,  3.6525e+00, -2.0563e-01, -6.2676e-01,  2.5374e+00,
         5.9097e+00, -6.3531e+00, -1.0076e+01, -9.5240e-01,  7.4649e+00,
        -3.3710e+00,  5.3843e+00,  4.0045e+00,  3.1

tensor([-5.0040e-01,  4.1643e+00,  1.8709e+00,  3.5114e+00, -2.4569e+00,
        -6.9968e-01, -6.1807e+00, -1.1331e+00,  2.5526e+00,  7.2221e-01,
         9.0041e-01, -4.0707e+00,  2.8015e+00,  4.2684e-01,  2.2226e+00,
        -1.1486e+00,  4.0039e+00,  2.9453e+00,  2.4272e-01,  1.9232e+00,
        -1.2173e+00, -2.1781e-01,  8.2412e+00,  5.2911e+00, -3.7083e+00,
         1.6185e+00,  2.4064e-02, -4.5221e+00, -2.9166e+00,  1.8198e+00,
         6.0351e+00, -4.1195e+00,  3.4283e+00,  2.2264e+00, -3.9555e-01,
         3.3117e+00, -7.7801e-01,  8.3380e-01,  9.8224e-01,  4.1309e+00,
        -1.3522e+00, -3.7957e+00, -1.1660e+00,  8.8724e-01,  2.2159e+00,
        -4.5721e+00, -4.9570e-01, -6.1861e+00, -2.7228e+00,  4.7390e+00,
         5.5374e+00, -5.6358e+00,  6.1588e-01,  3.0725e-01, -2.0961e+00,
         8.9716e-01,  3.6525e+00, -2.0563e-01, -6.2676e-01,  2.5374e+00,
         5.9097e+00, -6.3531e+00, -1.0076e+01, -9.5240e-01,  7.4649e+00,
        -3.3710e+00,  5.3843e+00,  4.0045e+00,  3.1