![](img/the_real_reason.png)

# Preface


**Goals of this tutorial:**
- Understand the basic concepts of PyTorch
- Be able to train a neural network with PyTorch

Note: a GPU is not required. A GPU makes the computation faster, but everything here also runs correctly on a CPU.
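As a minimal sketch of the note above, the snippet below picks a GPU when PyTorch can see one and otherwise falls back to the CPU. The variable names `device`, `x`, and `y` are illustrative and do not come from the original tutorial.

```python
import torch

# Use the GPU when one is visible to PyTorch, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Tensors are created on (or moved to) the chosen device explicitly;
# results of operations stay on the same device as their inputs.
x = torch.randn(4, 3, device=device)
y = (x * 2).sum()

print(device.type, y.device.type)
```

The same `device` object can be passed to `model.to(device)` so that a model and its inputs live on the same device.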

# The PyTorch website


Before PyTorch 1.0:

> "PyTorch - Tensors and Dynamic neural networks in Python
> with strong GPU acceleration.
> PyTorch is a deep learning framework for fast, flexible experimentation."
>
> -- https://pytorch.org/

After PyTorch 1.0:

> "PyTorch - From Research To Production
> 
> An open source deep learning platform that provides a seamless path from research prototyping to production deployment."

## What is a dynamic graph?

![](img/dynamic_graph.gif)

The following debugging session makes this easier to see at a glance:

In [21]:
import torch
from IPython.core.debugger import set_trace

def f(x):
    res = x + x
    set_trace()  # <-- Magic
    #res = res + 1
    return res

x = torch.randn(1, 10)
print(x)
f(x)

tensor([[ 0.1746, -0.2692, -1.0100,  0.2121, -0.7283,  0.0077,  0.9785,  0.4991,
         -1.1123, -0.9612]])
> <ipython-input-21-f546a9616d02>(8)f()
      6     set_trace()  # <-- Magic
      7     #res = res + 1
----> 8     return res
      9 
     10 x = torch.randn(1, 10)

ipdb> x
tensor([[ 0.1746, -0.2692, -1.0100,  0.2121, -0.7283,  0.0077,  0.9785,  0.4991,
         -1.1123, -0.9612]])
ipdb> res
tensor([[ 0.3493, -0.5384, -2.0200,  0.4243, -1.4566,  0.0155,  1.9569,  0.9982,
         -2.2246, -1.9224]])
ipdb> exit


BdbQuit: 

## Features of PyTorch

- More Pythonic
- Easy-to-use debugging
- Highly extensible
- Widely used in academia

## Compared with TF, PyTorch
- No longer requires `session.run()`, `tf.control_dependencies()`, `tf.while_loop()`, `tf.cond()`, `tf.global_variables_initializer()`, etc.
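To illustrate the point above, here is a small sketch in which ordinary Python control flow takes the place of graph-level loop and branch operators. The function `repeated_double` and its `limit` parameter are made up for this example. The number of loop iterations depends on the tensor's value at run time, and autograd still differentiates through whatever path was actually executed.

```python
import torch


def repeated_double(x, limit=100.0):
    """Double x until its norm reaches `limit`, using a plain Python loop."""
    steps = 0
    # The loop condition reads a tensor value computed at run time.
    while x.norm() < limit:
        x = x * 2
        steps += 1
    return x, steps


x = torch.ones(3, requires_grad=True)
y, steps = repeated_double(x)

# Backpropagate through the operations that were actually executed.
y.sum().backward()

print(steps)   # number of doublings performed for this input
print(x.grad)  # each entry equals 2 ** steps
```

Because the graph is rebuilt on every call, a different input can take a different number of iterations without any change to the code.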

## TF vs PyTorch
- static vs dynamic
- production vs prototyping 

## *More DL frameworks*
- TensorFlow
- MXNet
- Keras
- CNTK
- Chainer
- Caffe
- Caffe2
- ......


# Resources
- Twitter: https://twitter.com/PyTorch
- Forum: https://discuss.pytorch.org/
- Tutorials: https://pytorch.org/tutorials/
- Examples: https://github.com/pytorch/examples
- API Reference: https://pytorch.org/docs/stable/index.html
- Torchvision: https://pytorch.org/docs/stable/torchvision/index.html
- PyTorch Text: https://github.com/pytorch/text
- PyTorch Audio: https://github.com/pytorch/audio
- AllenNLP: https://allennlp.org/
- Object detection/segmentation: https://github.com/facebookresearch/maskrcnn-benchmark
- Facebook AI Research Sequence-to-Sequence Toolkit written in PyTorch: https://github.com/pytorch/fairseq
- FastAI: http://www.fast.ai/
- Stanford CS230 Deep Learning notes: https://cs230-stanford.github.io

# Building a simple neural network

In [10]:
from collections import OrderedDict

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

In [11]:
# A simple sequential stack of layers
model = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=20, kernel_size=5),
    nn.ReLU(),
    nn.Conv2d(20, 64, 5),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
)

In [12]:
model

Sequential(
  (0): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))
  (1): ReLU()
  (2): Conv2d(20, 64, kernel_size=(5, 5), stride=(1, 1))
  (3): ReLU()
  (4): AdaptiveAvgPool2d(output_size=1)
)

In [13]:
# Forward pass
model(torch.rand(16, 1, 32, 32)).shape

torch.Size([16, 64, 1, 1])
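To see where that output shape comes from, the sketch below rebuilds the same `nn.Sequential` model and records the shape after every layer. With no padding and stride 1, each 5x5 convolution shrinks both spatial sides by `kernel_size - 1 = 4` (32 to 28 to 24), and the adaptive average pooling then reduces each channel to a single value. The `shapes` list is an illustrative helper, not part of the original notebook.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=20, kernel_size=5),
    nn.ReLU(),
    nn.Conv2d(20, 64, 5),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
)

x = torch.rand(16, 1, 32, 32)
shapes = []
# nn.Sequential can be iterated to visit its layers in order.
for layer in model:
    x = layer(x)
    shapes.append(tuple(x.shape))
    print(type(layer).__name__, tuple(x.shape))
```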

In [14]:
# You can also give each layer a name
layers = OrderedDict([
    ("conv1", nn.Conv2d(in_channels=1, out_channels=20, kernel_size=5)),
    ("relu1", nn.ReLU()),
    ("conv2", nn.Conv2d(20,64,5)),
    ("relu2", nn.ReLU()),
    ("aavgp", nn.AdaptiveAvgPool2d(1)),
])
model = nn.Sequential(layers)
model

Sequential(
  (conv1): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))
  (relu1): ReLU()
  (conv2): Conv2d(20, 64, kernel_size=(5, 5), stride=(1, 1))
  (relu2): ReLU()
  (aavgp): AdaptiveAvgPool2d(output_size=1)
)
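One practical benefit of naming layers, sketched here under the same layer definitions: each layer becomes reachable as an attribute of the model, and the names carry over into the parameter names. The variables `names` and `param_names` are illustrative.

```python
from collections import OrderedDict

import torch.nn as nn

model = nn.Sequential(OrderedDict([
    ("conv1", nn.Conv2d(in_channels=1, out_channels=20, kernel_size=5)),
    ("relu1", nn.ReLU()),
    ("conv2", nn.Conv2d(20, 64, 5)),
    ("relu2", nn.ReLU()),
    ("aavgp", nn.AdaptiveAvgPool2d(1)),
]))

# A named layer is available as an attribute and still by position.
first_by_name = model.conv1
first_by_index = model[0]

# Layer names show up in the child and parameter listings.
names = [name for name, _ in model.named_children()]
param_names = [name for name, _ in model.named_parameters()]

print(names)
print(param_names)
```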

In [25]:
# You only need to define (1) the layers and (2) the forward pass. PyTorch does the backward pass for you.
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5)
        self.fc1 = nn.Linear(in_features=16 * 5 * 5, out_features=120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        #set_trace()
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)  # shape (N, 10): one score per class, so no further pooling is applied
        return x


model = Net()
model

Net(
  (conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)
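The following is a minimal sketch of what "letting PyTorch do the backward pass" looks like in a training loop. To keep it short it uses a small stand-in model, `TinyNet`, and random data. The model, the data, and the hyperparameters are all assumptions for illustration, not the tutorial's own setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim


class TinyNet(nn.Module):
    """Hypothetical two-layer classifier used only for this sketch."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(8, 16)
        self.fc2 = nn.Linear(16, 3)

    def forward(self, x):
        return self.fc2(F.relu(self.fc1(x)))


torch.manual_seed(0)
model = TinyNet()
optimizer = optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

# Random stand-in data: 32 samples, 8 features, 3 classes.
inputs = torch.randn(32, 8)
targets = torch.randint(0, 3, (32,))

losses = []
for step in range(20):
    optimizer.zero_grad()                     # clear gradients from the previous step
    loss = criterion(model(inputs), targets)  # forward pass
    loss.backward()                           # autograd computes all gradients
    optimizer.step()                          # update the parameters
    losses.append(loss.item())

print(losses[0], losses[-1])
```

Only `forward` is written by hand. The call to `loss.backward()` fills in the `.grad` field of every parameter, and the optimizer uses those gradients to update the weights.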

# Versions

In [18]:
import torch
torch.__version__

'1.3.1'

In [19]:
import torchvision
torchvision.__version__

'0.4.2'

In [20]:
import numpy as np
np.__version__

'1.15.4'