In [1]:
%matplotlib inline


The nn Package
==============

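We have redesigned the nn package so that it is fully integrated with autograd.

**Replace containers with autograd:**

You no longer need Containers such as ``ConcatTable`` or modules such as ``CAddTable``, and you no longer need to build or debug graphs with nngraph. Networks are defined seamlessly with autograd instead. For example:

- ``output = nn.CAddTable():forward({input1, input2})`` becomes ``output = input1 + input2``
- ``output = nn.MulConstant(0.5):forward(input)`` becomes ``output = input * 0.5``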
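The two replacements above can be sketched in a few lines. The tensor names and shapes here are illustrative assumptions, not part of the original tutorial; the point is only that plain arithmetic is recorded by autograd, so no container module is needed.

```python
import torch

# Two hypothetical inputs that track gradients.
input1 = torch.ones(2, 3, requires_grad=True)
input2 = torch.full((2, 3), 2.0, requires_grad=True)

# Plays the role of nn.CAddTable():forward({input1, input2})
added = input1 + input2
# Plays the role of nn.MulConstant(0.5):forward(input)
scaled = added * 0.5

# autograd recorded both operations, so backward works directly.
scaled.sum().backward()
print(input1.grad)  # d(sum(0.5 * (a + b))) / da = 0.5 for every entry
```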

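**State is no longer held in the module, but in the network graph:**

Because of this, recurrent networks should be simpler to use. To create a recurrent network, simply apply the same Linear layer several times; you do not have to think about sharing weights.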
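As a minimal sketch of this idea, assuming a hypothetical 4-unit layer and a ``tanh`` nonlinearity chosen only for illustration, the same ``nn.Linear`` can be applied three times in a loop. There is still a single weight and a single bias, and the gradient collects the contributions from every use.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
step = nn.Linear(4, 4)      # one layer; the sizes are illustrative
x = torch.randn(2, 4)

h = x
for _ in range(3):          # apply the same layer three times
    h = torch.tanh(step(h))

h.sum().backward()

# Only one weight and one bias exist, no matter how often the layer is reused.
num_params = len(list(step.parameters()))
print(num_params, step.weight.grad.shape)
```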
    
**Simplified debugging:**

Debugging with Python's pdb debugger is intuitive: the debugger and stack traces stop exactly where an error occurred.
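A small sketch of what this means in practice, using a deliberately mismatched input (the layer sizes are assumptions made for the example): the failure surfaces as an ordinary Python exception raised from the offending call, so a normal traceback or a pdb session lands on that line.

```python
import torch
import torch.nn as nn

layer = nn.Linear(8, 2)          # expects 8 input features
bad_input = torch.randn(1, 5)    # deliberately wrong: 5 features

caught = None
try:
    layer(bad_input)
except RuntimeError as exc:
    # Without this try/except the traceback would end at layer(bad_input),
    # which is where a debugger would stop as well.
    caught = exc

print(type(caught).__name__)
```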

Example 1: ConvNet
------------------

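Let's see how to create a small ConvNet.

All of your networks are derived from the base class ``nn.Module``:

-  In the constructor, you declare all the layers you want to use.
-  In the forward function, you define how your model runs, from input to output.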



In [3]:
import torch
import torch.nn as nn
import torch.nn.functional as F


class MNISTConvNet(nn.Module):

    def __init__(self):
        # this is the place where you instantiate all your modules
        # you can later access them using the same names you've given them in
        # here
        super(MNISTConvNet, self).__init__()
        self.conv1 = nn.Conv2d(1, 10, 5)
        self.pool1 = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(10, 20, 5)
        self.pool2 = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, 10)

    # it's the forward function that defines the network structure
    # we're accepting only a single input in here, but if you want,
    # feel free to use more
    def forward(self, input):
        x = self.pool1(F.relu(self.conv1(input)))
        x = self.pool2(F.relu(self.conv2(x)))

        # in your model definition you can go full crazy and use arbitrary
        # python code to define your model structure
        # all these are perfectly legal, and will be handled correctly
        # by autograd:
        # if x.gt(0).sum() > x.numel() / 2:
        #      ...
        #
        # you can even do a loop and reuse the same module inside it
        # modules no longer hold ephemeral state, so you can use them
        # multiple times during your forward pass
        # while x.norm(2) < 10:
        #    x = self.conv1(x)

        x = x.view(x.size(0), -1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        return x

Let's use the ConvNet we just defined.
First, create an instance of the class.



In [4]:
net = MNISTConvNet()
print(net)

MNISTConvNet(
  (conv1): Conv2d(1, 10, kernel_size=(5, 5), stride=(1, 1))
  (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(10, 20, kernel_size=(5, 5), stride=(1, 1))
  (pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (fc1): Linear(in_features=320, out_features=50, bias=True)
  (fc2): Linear(in_features=50, out_features=10, bias=True)
)


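<div class="alert alert-info"><h4>Note</h4><p> ``torch.nn`` only supports mini-batches. The entire ``torch.nn``
    package only supports inputs that are a mini-batch of samples, not a single sample.

    For example, ``nn.Conv2d`` takes a 4D Tensor of
    ``nSamples x nChannels x Height x Width``.

    If you have a single sample, just use ``input.unsqueeze(0)`` to add a fake batch dimension.</p></div>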
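The note above can be illustrated with a short sketch. The variable names are assumptions for this example; the layer matches ``conv1`` from the network defined earlier.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(1, 10, 5)
single = torch.randn(1, 28, 28)   # one sample: channels x height x width
batch = single.unsqueeze(0)       # add a batch dimension: 1 x 1 x 28 x 28
out = conv(batch)
print(batch.shape, out.shape)
```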

Create a mini-batch containing a single sample of random data and send it through the ConvNet.



In [5]:
input = torch.randn(1, 1, 28, 28)
out = net(input)
print(out.size())

torch.Size([1, 10])
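The ``320`` input features of ``fc1`` follow from the shapes produced by the convolution and pooling layers. The sketch below rebuilds those layers standalone (it does not reuse ``net``) and traces the spatial size step by step.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(1, 1, 28, 28)
conv1, conv2 = nn.Conv2d(1, 10, 5), nn.Conv2d(10, 20, 5)
pool = nn.MaxPool2d(2, 2)

x = pool(F.relu(conv1(x)))    # 28 -> 24 (5x5 conv, no padding) -> 12 (2x2 pool)
shape_after_block1 = tuple(x.shape)
x = pool(F.relu(conv2(x)))    # 12 -> 8 -> 4
shape_after_block2 = tuple(x.shape)

flat_features = x.view(x.size(0), -1).size(1)   # 20 channels * 4 * 4
print(shape_after_block1, shape_after_block2, flat_features)
```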


Define a dummy target label and compute the error using a loss function.


In [6]:
target = torch.tensor([3], dtype=torch.long)
loss_fn = nn.CrossEntropyLoss()  # LogSoftmax + ClassNLL Loss
err = loss_fn(out, target)
err.backward()

print(err)

tensor(2.3220)


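The output of the ConvNet, ``out``, is a ``Tensor``. We use it to compute the loss, and the result, ``err``, is also a ``Tensor``.
Calling ``.backward`` on ``err`` therefore propagates gradients through the whole ConvNet, down to its weights.

Let's access the weights and gradients of an individual layer: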



In [7]:
print(net.conv1.weight.grad.size())

torch.Size([10, 1, 5, 5])


In [8]:
print(net.conv1.weight.data.norm())  # norm of the weight
print(net.conv1.weight.grad.data.norm())  # norm of the gradients

tensor(1.8388)
tensor(0.6529)
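One detail worth keeping in mind when reading ``.grad``: gradients accumulate across calls to ``backward``. The following sketch uses a small standalone ``nn.Linear`` (an assumption made to keep the example short) to show the accumulation and the reset with ``zero_grad``.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(3, 1)
x = torch.randn(4, 3)

layer(x).sum().backward()
first = layer.weight.grad.clone()

layer(x).sum().backward()         # a second backward adds to the stored gradient
second = layer.weight.grad.clone()
print(torch.allclose(second, 2 * first))

layer.zero_grad()                 # reset before the next iteration
# Depending on the PyTorch version, the gradient is now zero or None.
cleared = layer.weight.grad is None or bool((layer.weight.grad == 0).all())
print(cleared)
```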


Forward and Backward Function Hooks
-----------------------------------

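We have inspected the weights and the gradients. But how about inspecting or modifying the output and grad\_output of a layer?

We introduce **hooks** for this purpose:

- You can register a function on a ``Module`` or a ``Tensor``. The hook can be a forward hook or a backward hook.
- The forward hook is executed when a forward call happens.
- The backward hook is executed during the backward pass.

Let's look at an example.

We register a forward hook on conv2 and print some information.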



In [9]:
def printnorm(self, input, output):
    # input is a tuple of packed inputs
    # output is a Tensor. output.data is the Tensor we are interested in
    print('Inside ' + self.__class__.__name__ + ' forward')
    print('')
    print('input: ', type(input))
    print('input[0]: ', type(input[0]))
    print('output: ', type(output))
    print('')
    print('input size:', input[0].size())
    print('output size:', output.data.size())
    print('output norm:', output.data.norm())


net.conv2.register_forward_hook(printnorm)

out = net(input)

Inside Conv2d forward

input:  <class 'tuple'>
input[0]:  <class 'torch.Tensor'>
output:  <class 'torch.Tensor'>

input size: torch.Size([1, 10, 12, 12])
output size: torch.Size([1, 20, 8, 8])
output norm: tensor(14.5405)
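A forward hook can also modify the output, and it stays registered until it is removed. The sketch below uses a hypothetical ``clamp_output`` hook on a small standalone layer: the value returned by the hook replaces the module's output, and the handle returned by ``register_forward_hook`` detaches the hook again.

```python
import torch
import torch.nn as nn

layer = nn.Linear(2, 2)

def clamp_output(module, inputs, output):
    # Returning a value from a forward hook replaces the module's output.
    return output.clamp(min=0)

handle = layer.register_forward_hook(clamp_output)
x = torch.randn(5, 2)
hooked = layer(x)

handle.remove()                   # the hook no longer runs after this
plain = layer(x)
print(bool((hooked >= 0).all()), torch.equal(hooked, plain.clamp(min=0)))
```

In the cells of this tutorial the hooks on ``conv2`` are never removed, so the forward hook's printout shows up again in any later forward pass.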


We register a backward hook on conv2 and print some information.


In [10]:
def printgradnorm(self, grad_input, grad_output):
    print('Inside ' + self.__class__.__name__ + ' backward')
    print('Inside class:' + self.__class__.__name__)
    print('')
    print('grad_input: ', type(grad_input))
    print('grad_input[0]: ', type(grad_input[0]))
    print('grad_output: ', type(grad_output))
    print('grad_output[0]: ', type(grad_output[0]))
    print('')
    print('grad_input size:', grad_input[0].size())
    print('grad_output size:', grad_output[0].size())
    print('grad_input norm:', grad_input[0].norm())


net.conv2.register_backward_hook(printgradnorm)

out = net(input)
err = loss_fn(out, target)
err.backward()

Inside Conv2d forward

input:  <class 'tuple'>
input[0]:  <class 'torch.Tensor'>
output:  <class 'torch.Tensor'>

input size: torch.Size([1, 10, 12, 12])
output size: torch.Size([1, 20, 8, 8])
output norm: tensor(14.5405)
Inside Conv2d backward
Inside class:Conv2d

grad_input:  <class 'tuple'>
grad_input[0]:  <class 'torch.Tensor'>
grad_output:  <class 'tuple'>
grad_output[0]:  <class 'torch.Tensor'>

grad_input size: torch.Size([1, 10, 12, 12])
grad_output size: torch.Size([1, 20, 8, 8])
grad_input norm: tensor(0.1280)
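Hooks can be attached to a ``Tensor`` as well as to a ``Module``. As a minimal sketch with made-up values, ``Tensor.register_hook`` receives the gradient flowing into that tensor during the backward pass, and a returned value replaces it. The hook name ``record_and_halve`` is hypothetical.

```python
import torch

x = torch.ones(3, requires_grad=True)
y = x * 2

seen = []

def record_and_halve(grad):
    # Called with the gradient w.r.t. y; the returned tensor replaces it.
    seen.append(grad.clone())
    return grad * 0.5

y.register_hook(record_and_halve)
y.sum().backward()

print(seen[0])   # gradient w.r.t. y: all ones
print(x.grad)    # halved gradient times dy/dx = 0.5 * 2 = 1 per element
```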


The complete MNIST example is available at https://github.com/pytorch/examples/tree/master/mnist

Example 2: Recurrent Net
------------------------

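Next, let's look at building recurrent nets with PyTorch.

Since the state of the network is held in the graph and not in the layers, you can simply create an nn.Linear and reuse it over and over again for the recurrence.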


In [None]:
class RNN(nn.Module):

    # you can also accept arguments in your model constructor
    def __init__(self, data_size, hidden_size, output_size):
        super(RNN, self).__init__()

        self.hidden_size = hidden_size
        input_size = data_size + hidden_size

        self.i2h = nn.Linear(input_size, hidden_size)
        self.h2o = nn.Linear(hidden_size, output_size)

    def forward(self, data, last_hidden):
        input = torch.cat((data, last_hidden), 1)
        hidden = self.i2h(input)
        output = self.h2o(hidden)
        return hidden, output


rnn = RNN(50, 20, 10)

A more complete language modeling example using LSTMs and Penn Tree-bank is available at
https://github.com/pytorch/examples/tree/master/word\_language\_model

PyTorch by default has seamless CuDNN integration for ConvNets and Recurrent Nets.



In [None]:
loss_fn = nn.MSELoss()

batch_size = 10
TIMESTEPS = 5

# Create some fake data
batch = torch.randn(batch_size, 50)
hidden = torch.zeros(batch_size, 20)
target = torch.zeros(batch_size, 10)

loss = 0
for t in range(TIMESTEPS):
    # yes! you can reuse the same network several times,
    # sum up the losses, and call backward!
    hidden, output = rnn(batch, hidden)
    loss += loss_fn(output, target)
loss.backward()
