## 3.13.3 Concise Implementation

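In PyTorch, we only need to add a `Dropout` layer after each fully connected hidden layer (following its activation) and specify the drop probability. During training, the `Dropout` layer randomly zeroes elements of the previous layer's output with the specified probability; at test time (that is, after calling `model.eval()`), the `Dropout` layer has no effect.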
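
The difference between the two modes can be checked directly on a standalone layer. The following is a minimal sketch, assuming a drop probability of 0.5 and an all-ones input chosen purely for illustration: in training mode each element is either zeroed or scaled by `1 / (1 - p)`, and in evaluation mode the input passes through unchanged.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so repeated runs give the same mask

p = 0.5               # illustrative drop probability (an assumption)
dropout = nn.Dropout(p)
x = torch.ones(2, 8)

dropout.train()       # training mode: each element is zeroed with probability p
y_train = dropout(x)  # surviving elements are scaled by 1 / (1 - p)

dropout.eval()        # evaluation mode: the layer acts as the identity
y_eval = dropout(x)

print(y_train)
print(y_eval)
```

The rescaling of the surviving elements keeps the expected value of each activation the same in both modes, which is why nothing needs to be adjusted at test time.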

In [25]:
%matplotlib inline
import torch
import torch.nn as nn
import numpy as np
import sys

sys.path.append("..") 
import d2lzh_pytorch as d2l

In [26]:
drop_prob1 = 0.2
drop_prob2 = 0.5


num_inputs = 784
num_hiddens1 = 256
num_hiddens2 = 256
num_outputs = 10


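# Manually created parameters; the nn.Sequential model defined later owns its own
# parameters and does not use these.
W1 = torch.tensor(np.random.normal(0, 0.01, size=(num_inputs, num_hiddens1)),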
                  dtype=torch.float, 
                  requires_grad=True
                 )
b1 = torch.zeros(num_hiddens1, 
                 requires_grad=True
                )


W2 = torch.tensor(np.random.normal(0, 0.01, size=(num_hiddens1, num_hiddens2)), 
                  dtype=torch.float, 
                  requires_grad=True
                 )
b2 = torch.zeros(num_hiddens2, requires_grad=True)


W3 = torch.tensor(np.random.normal(0, 0.01, size=(num_hiddens2, num_outputs)), 
                  dtype=torch.float, 
                  requires_grad=True
                 )

b3 = torch.zeros(num_outputs, 
                 requires_grad=True
                )


params = [W1, b1, W2, b2, W3, b3]

In [27]:
batch_size = 256


loss = torch.nn.CrossEntropyLoss()

train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)

In [28]:
net = nn.Sequential(d2l.FlattenLayer(),
                    nn.Linear(num_inputs, num_hiddens1),
                    nn.ReLU(),
                    nn.Dropout(drop_prob1),
                    nn.Linear(num_hiddens1, num_hiddens2), 
                    nn.ReLU(),
                    nn.Dropout(drop_prob2),
                    nn.Linear(num_hiddens2, num_outputs)
                   )
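
As a quick sanity check on a model of this shape, the sketch below rebuilds the same stack with hard-coded sizes and compares repeated forward passes in both modes. It assumes `nn.Flatten` as a stand-in for `d2l.FlattenLayer`, and the names `sketch_net` and `X` are illustrative only.

```python
import torch
import torch.nn as nn

# nn.Flatten stands in for d2l.FlattenLayer here (an assumption).
sketch_net = nn.Sequential(
    nn.Flatten(),
    nn.Linear(784, 256), nn.ReLU(), nn.Dropout(0.2),
    nn.Linear(256, 256), nn.ReLU(), nn.Dropout(0.5),
    nn.Linear(256, 10),
)

X = torch.rand(4, 1, 28, 28)  # fake batch shaped like Fashion-MNIST images

sketch_net.eval()    # dropout disabled: repeated forward passes agree
with torch.no_grad():
    out_a = sketch_net(X)
    out_b = sketch_net(X)
eval_is_deterministic = torch.equal(out_a, out_b)

sketch_net.train()   # dropout enabled: repeated forward passes usually differ
with torch.no_grad():
    out_c = sketch_net(X)
    out_d = sketch_net(X)
train_is_deterministic = torch.equal(out_c, out_d)

print(out_a.shape, eval_is_deterministic, train_is_deterministic)
```

In evaluation mode the two outputs are identical, while in training mode a fresh random mask is drawn on every call.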

In [29]:

# Define the optimizer
optimizer = torch.optim.SGD(net.parameters(), lr=0.5)

In [30]:
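# Train the model
num_epochs = 10  # not defined earlier in this section; the log below covers 10 epochs
# params and lr are left as None because an optimizer is passed explicitly
d2l.train_ch3(net, train_iter, test_iter,
              loss, num_epochs, batch_size,
              None, None, optimizer
             )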

epoch 1, loss 0.0035, train acc 0.664, test acc 0.804
epoch 2, loss 0.0021, train acc 0.804, test acc 0.802
epoch 3, loss 0.0018, train acc 0.831, test acc 0.840
epoch 4, loss 0.0017, train acc 0.845, test acc 0.819
epoch 5, loss 0.0016, train acc 0.850, test acc 0.855
epoch 6, loss 0.0015, train acc 0.859, test acc 0.854
epoch 7, loss 0.0015, train acc 0.863, test acc 0.833
epoch 8, loss 0.0014, train acc 0.868, test acc 0.852
epoch 9, loss 0.0014, train acc 0.873, test acc 0.848
epoch 10, loss 0.0013, train acc 0.876, test acc 0.858
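
The test accuracy in a log like this is only meaningful if dropout is switched off while it is measured. The sketch below shows one way such an evaluation step might look; `evaluate_accuracy`, `toy_net`, and `toy_data` are hypothetical names for illustration and are not taken from the `d2lzh_pytorch` source.

```python
import torch
import torch.nn as nn

def evaluate_accuracy(data_iter, net):
    """Hypothetical helper: accuracy of `net` over `data_iter` with dropout disabled."""
    was_training = net.training
    net.eval()                      # turn dropout off for evaluation
    correct, total = 0, 0
    with torch.no_grad():
        for X, y in data_iter:
            correct += (net(X).argmax(dim=1) == y).sum().item()
            total += y.shape[0]
    net.train(was_training)         # restore whichever mode was active before
    return correct / total

# Toy usage on random data (sizes chosen only for illustration)
torch.manual_seed(0)
toy_net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Dropout(0.5), nn.Linear(8, 3))
toy_data = [(torch.randn(5, 4), torch.randint(0, 3, (5,))) for _ in range(2)]
acc = evaluate_accuracy(toy_data, toy_net)
print(acc)
```

Restoring the previous mode at the end lets the helper be called in the middle of a training loop without silently leaving dropout disabled for later epochs.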


## Summary

* We can use dropout to mitigate overfitting.
* Dropout is applied only while training the model.