
# 3.13 Dropout

Deep learning models often use dropout to mitigate overfitting.

## 1 Method

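When dropout is applied to a hidden layer of a multilayer perceptron, each hidden unit in that layer is dropped with a certain probability.
Let the drop probability be $p$. A hidden unit $h_i$ is then set to zero with probability $p$, and with probability $1-p$ it is kept and rescaled by dividing by $1-p$.
This keeps the expected output of the unit equal to $h_i$: $E(h_i^{'}) = p \cdot 0 + (1-p) \cdot \frac{h_i}{1-p} = h_i$.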
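
The expectation claim can be checked numerically. The following is a minimal sketch in plain Python, not the notebook's MXNet code; the helper name `dropped_unit`, the seed, and the values of `h`, `p`, and `n` are assumptions chosen only for illustration.

```python
import random

def dropped_unit(h, p, rng):
    """Return one random sample of h' for a single hidden unit with value h."""
    if p >= 1.0:
        # Everything is dropped; avoid dividing by zero.
        return 0.0
    # rng.random() is uniform on [0, 1), so the unit is kept with probability 1 - p.
    if rng.random() < 1.0 - p:
        return h / (1.0 - p)
    return 0.0

rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
h, p, n = 3.0, 0.2, 200_000
sample_mean = sum(dropped_unit(h, p, rng) for _ in range(n)) / n
print(sample_mean)  # should be close to h
```

Each individual sample is either `0` or `h / (1 - p)`, so no single output equals `h`; only the average over many draws approaches it.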

When testing the model, however, dropout is generally not used, so that the results are more deterministic.
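
One way to picture the train/test distinction is a layer that takes an explicit mode flag. This is a hedged sketch in plain Python; the function `dropout_forward` and its `training` parameter are hypothetical names introduced here and are not part of MXNet or `d2lzh`.

```python
import random

def dropout_forward(hidden, p, training, rng=random):
    """Apply dropout to a list of activations only when training is True."""
    if not training or p == 0.0:
        # Test mode: pass the activations through unchanged.
        return list(hidden)
    if p >= 1.0:
        return [0.0 for _ in hidden]
    keep = 1.0 - p
    # Keep each activation with probability keep and rescale it by 1 / keep.
    return [h / keep if rng.random() < keep else 0.0 for h in hidden]

hidden = [0.5, 1.0, 1.5, 2.0]
test_out = dropout_forward(hidden, 0.5, training=False)
train_out = dropout_forward(hidden, 0.5, training=True, rng=random.Random(1))
print(test_out)
print(train_out)
```

Because the kept activations are already rescaled during training, the test-time pass needs no extra scaling: it simply returns the activations as they are.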


In [6]:
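import d2lzh as d2l
from mxnet import autograd, gluon, init, nd
from mxnet.gluon import loss as gloss, nn

def dropout(X, drop_prob):
    """Zero each element of X with probability drop_prob and rescale the rest."""
    assert 0 <= drop_prob <= 1
    keep_prob = 1 - drop_prob
    # With keep_prob == 0 every element is dropped; return early to avoid dividing by zero.
    if keep_prob == 0:
        return X.zeros_like()
    # mask is 1 where the uniform sample falls below keep_prob (keep) and 0 otherwise (drop).
    mask = nd.random.uniform(0, 1, X.shape) < keep_prob
    return mask * X / keep_prob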

In [4]:
X=nd.arange(16).reshape((2,8))
dropout(X,0.5)


[[ 0.  0.  0.  0.  0.  0.  0.  0.]
 [16.  0.  0. 22. 24. 26.  0. 30.]]
<NDArray 2x8 @cpu(0)>
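
The surviving entries in the output above are twice their original values because `keep_prob` is 0.5. The sketch below mirrors the `dropout` function on nested Python lists so that the two boundary cases and the doubling can be checked without MXNet; `dropout_rows` is a hypothetical stand-in written for this illustration, and its random draws will not match the NDArray output shown above.

```python
import random

def dropout_rows(rows, drop_prob, rng):
    """List-of-lists mirror of dropout(X, drop_prob), for illustration only."""
    assert 0 <= drop_prob <= 1
    keep_prob = 1 - drop_prob
    if keep_prob == 0:
        return [[0.0] * len(row) for row in rows]
    return [[x / keep_prob if rng.random() < keep_prob else 0.0 for x in row]
            for row in rows]

# Same values as nd.arange(16).reshape((2, 8)).
X = [[float(8 * i + j) for j in range(8)] for i in range(2)]
rng = random.Random(42)  # arbitrary seed

kept_all = dropout_rows(X, 0.0, rng)     # nothing is dropped
half = dropout_rows(X, 0.5, rng)         # each entry is either 0 or doubled
dropped_all = dropout_rows(X, 1.0, rng)  # everything is dropped
print(half)
```

With `drop_prob` set to 0 the input comes back unchanged, and with `drop_prob` set to 1 every entry is zero.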

In [9]:
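# Drop probabilities for the first and second hidden layers.
drop_prob1, drop_prob2 = 0.2, 0.5
net = nn.Sequential()
# Each Dropout layer follows the activation of a fully connected hidden layer.
net.add(nn.Dense(256, activation='relu'),
        nn.Dropout(drop_prob1),
        nn.Dense(256, activation='relu'),
        nn.Dropout(drop_prob2),
        nn.Dense(10))  # output layer: 10 Fashion-MNIST classes
net.initialize(init.Normal(sigma=0.01))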

In [12]:
num_epochs, lr, batch_size = 5, 0.5, 256
loss = gloss.SoftmaxCrossEntropyLoss()
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': lr})
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, batch_size, None,
              None, trainer)

epoch 1, loss 1.1383, train acc 0.557, test acc 0.792
epoch 2, loss 0.5749, train acc 0.787, test acc 0.838
epoch 3, loss 0.4869, train acc 0.822, test acc 0.854
epoch 4, loss 0.4425, train acc 0.839, test acc 0.860


## 2 Summary

- We can use dropout to mitigate overfitting.
- Dropout is used only when training the model.