# PyTorch 60-Minute Quick Start

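Notes on the official PyTorch 60-minute quick-start tutorial.

Goals of the tutorial:
- Understand PyTorch's Tensor library and how it is used in neural networks
- Train a small neural network for image classification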

*This tutorial assumes basic familiarity with NumPy.

In [1]:
from __future__ import print_function
import torch

## What is a Tensor
A Tensor is similar to NumPy's ndarray: it is a multidimensional array, with the addition that it can be used for computation on a GPU.


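Create a tensor with torch.empty(). Its memory is not initialized, so it contains whatever values were already there: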

In [3]:
x = torch.empty(5, 3)
print(x)

tensor([[ 0.0000e+00, -1.0842e-19,  5.5125e+18],
        [ 1.5849e+29, -2.1761e+25,  4.5835e-41],
        [ 1.5134e-43,  4.4842e-44,  5.5128e+18],
        [ 3.6902e+19,  0.0000e+00, -1.0842e-19],
        [ 2.8026e-45,  0.0000e+00,  0.0000e+00]])


Create a randomly initialized tensor with torch.rand(), which draws values uniformly from [0, 1):

In [4]:
x = torch.rand(5, 3)
print(x)

tensor([[0.8634, 0.8700, 0.5568],
        [0.1007, 0.2721, 0.4942],
        [0.3422, 0.2197, 0.7342],
        [0.9512, 0.4035, 0.1093],
        [0.5827, 0.9162, 0.5969]])


Create a tensor filled with zeros and of dtype long with torch.zeros():

In [5]:
x = torch.zeros(5, 3, dtype=torch.long)
print(x)

tensor([[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]])


Create a tensor directly from data:

In [6]:
x = torch.tensor([5.5, 3])
print(x)

tensor([5.5000, 3.0000])


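Create a tensor based on an existing tensor. x.new_ones() reuses the dtype and device of x unless they are overridden, and torch.randn_like() additionally reuses its shape: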

In [7]:
x = x.new_ones(5, 3, dtype=torch.double)      # new_* methods take in sizes
print(x)

x = torch.randn_like(x, dtype=torch.float)    # override dtype!
print(x)    

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]], dtype=torch.float64)
tensor([[-0.6698,  0.2112, -0.5920],
        [-0.4484,  0.3563,  0.1257],
        [ 0.2637, -0.6570,  0.7835],
        [ 0.4487, -1.6298, -0.4429],
        [-0.4960,  0.5569, -1.1573]])


In [8]:
print(x.size())

torch.Size([5, 3])


x.size() returns a torch.Size, which is a subclass of tuple and supports all tuple operations.
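
As a small illustration of this point, the sketch below treats the result of size() like any other tuple. The names rows and cols are illustrative only and do not come from the tutorial.

```python
import torch

x = torch.zeros(5, 3)
size = x.size()

# torch.Size is a subclass of tuple, so it can be unpacked and indexed
rows, cols = size
print(isinstance(size, tuple))  # True
print(rows, cols)               # 5 3
print(size[0] * size[1])        # 15
```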

## Tensor operations

Tensors support many kinds of operations, often with several equivalent syntaxes. Taking addition as an example:

In [9]:
y = torch.rand(5, 3)
print(x + y)

tensor([[-0.2507,  0.9214,  0.0290],
        [ 0.2550,  0.8220,  1.0790],
        [ 0.7599,  0.1060,  1.3464],
        [ 0.7219, -0.8029, -0.2729],
        [ 0.0069,  1.4137, -1.0565]])


In [10]:
print(torch.add(x, y))

tensor([[-0.2507,  0.9214,  0.0290],
        [ 0.2550,  0.8220,  1.0790],
        [ 0.7599,  0.1060,  1.3464],
        [ 0.7219, -0.8029, -0.2729],
        [ 0.0069,  1.4137, -1.0565]])


In [11]:
# provide an output tensor as an argument
result = torch.empty(5, 3)
torch.add(x, y, out=result)
print(result)

tensor([[-0.2507,  0.9214,  0.0290],
        [ 0.2550,  0.8220,  1.0790],
        [ 0.7599,  0.1060,  1.3464],
        [ 0.7219, -0.8029, -0.2729],
        [ 0.0069,  1.4137, -1.0565]])


In [13]:
# in-place addition: adds x to y
y.add_(x)
print(y)

tensor([[-0.9205,  1.1326, -0.5630],
        [-0.1934,  1.1783,  1.2047],
        [ 1.0236, -0.5510,  2.1298],
        [ 1.1707, -2.4327, -0.7158],
        [-0.4891,  1.9706, -2.2138]])


Note: Any operation that mutates a tensor in-place is post-fixed with an _. For example, x.copy_(y) and x.t_() will change x.
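
To make the naming rule concrete, here is a minimal sketch that contrasts add() with add_() on small tensors chosen for illustration:

```python
import torch

x = torch.ones(2, 2)
y = torch.ones(2, 2)

z = x.add(y)   # out-of-place: returns a new tensor, x is unchanged
print(x)       # still all ones

x.add_(y)      # in-place: x itself is modified
print(x)       # now all twos
```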

Tensors support NumPy-style indexing:

In [14]:
print(x[:, 1])

tensor([ 0.2112,  0.3563, -0.6570, -1.6298,  0.5569])
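
A few more indexing forms, shown on a small tensor with known values so the results are easy to check. This is only a sketch of common cases, not a complete list:

```python
import torch

x = torch.arange(12).view(3, 4)   # rows: [0..3], [4..7], [8..11]

print(x[:, 1])      # second column: 1, 5, 9
print(x[0])         # first row: 0, 1, 2, 3
print(x[1:, :2])    # rows 1 and 2, columns 0 and 1
print(x[x > 8])     # boolean mask: 9, 10, 11
```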


To reshape a tensor, use its view() method:

In [15]:
x = torch.randn(4, 4)
y = x.view(16)
z = x.view(-1, 8)  # the size -1 is inferred from other dimensions
print(x.size(), y.size(), z.size())

torch.Size([4, 4]) torch.Size([16]) torch.Size([2, 8])
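
One detail worth knowing about view(): the returned tensor shares storage with the original instead of copying it. A minimal sketch, assuming a contiguous tensor:

```python
import torch

x = torch.zeros(4, 4)
y = x.view(16)      # same data, different shape

y[0] = 1.0          # writing through the view...
print(x[0, 0])      # ...is visible in the original tensor

# -1 lets PyTorch infer one dimension from the others
z = x.view(-1, 8)
print(z.size())
```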


For a tensor that holds a single element, use .item() to get the value as a Python number:

In [16]:
x = torch.randn(1)
print(x)
print(x.item())

tensor([0.1208])
0.12084405869245529


For more tensor operations, see
https://pytorch.org/docs/stable/torch.html

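## Converting between NumPy arrays and tensors

A Torch Tensor and a NumPy array can be converted to each other. When the tensor is on the CPU, the two share the same underlying memory, so changing one changes the other.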

In [17]:
a = torch.ones(5)
print(a)

tensor([1., 1., 1., 1., 1.])


In [18]:
b = a.numpy()
print(b)

[1. 1. 1. 1. 1.]


In [19]:
a.add_(1)
print(a)
print(b)

tensor([2., 2., 2., 2., 2.])
[2. 2. 2. 2. 2.]


In [21]:
import numpy as np
a = np.ones(5)
b = torch.from_numpy(a)
np.add(a, 1, out=a)
print(a)
print(b)

[2. 2. 2. 2. 2.]
tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
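
The memory sharing applies to torch.from_numpy(). For comparison, the sketch below also builds a tensor with torch.tensor(), which copies the data. The names shared and copied are illustrative only.

```python
import numpy as np
import torch

a = np.ones(3)
shared = torch.from_numpy(a)   # shares memory with a
copied = torch.tensor(a)       # copies the data

a += 1                         # modify the NumPy array in place
print(shared)                  # reflects the change
print(copied)                  # keeps the original values
```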


## CUDA Tensor

Tensors can be moved between the CPU and the GPU with .to():

In [23]:
# let us run this cell only if CUDA is available
# We will use ``torch.device`` objects to move tensors in and out of GPU
if torch.cuda.is_available():
    device = torch.device("cuda")          # a CUDA device object
    y = torch.ones_like(x, device=device)  # directly create a tensor on GPU
    x = x.to(device)                       # or just use strings ``.to("cuda")``
    z = x + y
    print(z)
    print(z.to("cpu", torch.double))       # ``.to`` can also change dtype together!
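
A common variation on the cell above is to choose the device once and fall back to the CPU, so the same code runs on machines without a GPU. This is a sketch of that pattern, not part of the original tutorial cell:

```python
import torch

# pick the GPU when one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.randn(2, 3)
x = x.to(device)                       # move an existing tensor
y = torch.ones_like(x, device=device)  # create directly on the device
z = x + y

print(z.device)
print(z.to("cpu", torch.double).dtype)
```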


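## AUTOGRAD: automatic differentiation

The autograd package is central to building neural networks in PyTorch. It provides automatic differentiation for all operations on tensors. It is a define-by-run framework: backpropagation is defined by how your code actually runs, so it can differ on every iteration.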

### Tensor
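torch.Tensor is the central class of the package. If you set its attribute .requires_grad to True, it starts to track all operations on it. When you finish your computation you can call .backward() and have all the gradients computed automatically. The gradient for the tensor is accumulated into its .grad attribute.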
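
A minimal sketch of this workflow. The second backward() call runs on a separate, illustrative expression to show that gradients accumulate in .grad rather than being overwritten:

```python
import torch

x = torch.ones(2, 2, requires_grad=True)  # track operations on x
y = x + 2
out = (y * y * 3).mean()                  # scalar result

out.backward()                            # compute d(out)/dx
first = x.grad.clone()
print(first)                              # 6 * (x + 2) / 4 = 4.5 everywhere

out2 = (x * 2).sum()                      # a second, independent computation
out2.backward()
print(x.grad)                             # accumulated: 4.5 + 2 = 6.5
```

To start from fresh gradients, reset them between backward passes, for example with x.grad.zero_().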

To stop a tensor from tracking history, you can call .detach() to detach it from the computation history, and to prevent future computation from being tracked.
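
A short sketch of .detach(), using small illustrative tensors:

```python
import torch

x = torch.ones(3, requires_grad=True)
y = x * 2
d = y.detach()          # same values, but cut off from the graph

print(y.requires_grad)  # True
print(d.requires_grad)  # False
print(d.grad_fn)        # None
```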

To prevent tracking history (and using memory), you can also wrap the code block in a with torch.no_grad(): statement. This is particularly helpful when evaluating a model, because the model may have trainable parameters with requires_grad=True for which we don’t need the gradients.
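
The effect of torch.no_grad() can be seen on a single operation. A minimal sketch:

```python
import torch

x = torch.ones(3, requires_grad=True)

print((x ** 2).requires_grad)      # True: the operation is tracked

with torch.no_grad():
    y = x ** 2                     # computed without building a graph
print(y.requires_grad)             # False
```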

There’s one more class which is very important for autograd implementation - a Function.

Tensor and Function are interconnected and build up an acyclic graph that encodes a complete history of computation. Each tensor has a .grad_fn attribute that references the Function that created the Tensor (except for Tensors created by the user, whose grad_fn is None).
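
The difference between a user-created tensor and the result of an operation can be inspected directly. A small sketch; the exact text printed for grad_fn depends on the operation and the PyTorch version:

```python
import torch

x = torch.ones(2, 2, requires_grad=True)  # created by the user
y = x + 2                                 # result of an operation

print(x.grad_fn)             # None
print(y.grad_fn)             # a Function object recording the addition
print(x.is_leaf, y.is_leaf)  # True False
```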

If you want to compute the derivatives, you can call .backward() on a Tensor. If the Tensor is a scalar (i.e. it holds a single element), you don’t need to specify any arguments to backward(). If it has more elements, you need to specify a gradient argument that is a tensor of matching shape.
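
A minimal sketch of the non-scalar case. The values in v are arbitrary and chosen only for illustration:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x * 2                      # y is a vector, not a scalar

# y.backward() alone would raise a RuntimeError here; pass a tensor of
# the same shape as y, which weights each element of y in the backward pass
v = torch.tensor([0.1, 1.0, 0.0001])
y.backward(v)

print(x.grad)                  # 2 * v
```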