# Learning with the docs

Notes on each module covered in the PyTorch docs.

In [1]:
import torch
import torch.autograd as autograd
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable

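## 1. Autograd mechanics

Every Variable has two flags: `requires_grad` and `volatile`.

+ requires_grad
If any input Variable of an operation has `requires_grad=True`, the output Variable requires gradients as well.
If none of the inputs requires gradients, the output does not either, and no gradient is computed for it during the backward pass.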

In [2]:
x = Variable(torch.randn(5, 5))
y = Variable(torch.randn(5, 5))
z = Variable(torch.randn(5, 5), requires_grad=True)

x.requires_grad

False

In [3]:
z.requires_grad

True

In [4]:
a = x + y
a.requires_grad

False

In [5]:
b = z + x
b.requires_grad  # b requires gradients because one of its inputs (z) does

True

In [6]:
import torchvision
import torch.optim as optim
model = torchvision.models.resnet18(pretrained=True)
for param in model.parameters():
    param.requires_grad = False
# Replace the last fully-connected layer
# Parameters of newly constructed modules have requires_grad=True by default
model.fc = nn.Linear(512, 100)

# Optimize only the classifier
optimizer = optim.SGD(model.fc.parameters(), lr=1e-2, momentum=0.9)

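+ volatile
Set `volatile=True` when you are certain you will never call `backward()`. Autograd then uses as little memory as possible, and every resulting Variable has `requires_grad=False`.
`volatile` differs from `requires_grad` in how it propagates: if even a single input is volatile, the output is volatile too, so no gradients are computed anywhere in the downstream subgraph.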

In [7]:
regular_input = Variable(torch.randn(5, 5))
volatile_input = Variable(torch.randn(5, 5), volatile=True)
model = torchvision.models.resnet18(pretrained=True)

In [8]:
volatile_input

Variable containing:
-0.0429  1.1719 -0.0993 -0.5069 -1.4597
-1.1709  0.7674  0.8756  0.2143 -1.6531
 1.4494 -1.4789 -0.2529 -2.5748 -3.1893
 1.7533  0.2632 -1.9919 -2.6632  0.0378
-0.8214  1.3098  0.5743  0.4149 -0.6739
[torch.FloatTensor of size 5x5]
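
The `volatile` flag was removed in PyTorch 0.4, when `Variable` was merged into `Tensor`; the `torch.no_grad()` context manager now covers the same use case. The sketch below assumes PyTorch 0.4 or later. The small `nn.Linear` layer is only a stand-in for a real model and is not part of the original notes.

```python
import torch
import torch.nn as nn

# A tiny stand-in model (hypothetical; any nn.Module behaves the same way).
layer = nn.Linear(5, 3)
inputs = torch.randn(4, 5)

# Outside no_grad: the output is tracked because the layer's weights require gradients.
tracked = layer(inputs)

# Inside no_grad: no graph is recorded, so the output cannot be backpropagated through.
with torch.no_grad():
    untracked = layer(inputs)

print(tracked.requires_grad)    # True
print(untracked.requires_grad)  # False
```

As with `volatile`, everything computed inside the block is excluded from the graph, regardless of the `requires_grad` setting of the inputs.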

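### How does autograd record history?
Every Variable has a `.creator` attribute that points to the function that produced it.
Whenever an operation is performed, a new Function is instantiated, its `forward()` method is called, and the `creator` of each output Variable is set to that Function.

The computation graph is rebuilt from scratch on every iteration.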
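
In later PyTorch releases the `.creator` attribute was renamed `.grad_fn`. The following minimal sketch assumes PyTorch 0.4 or later and uses made-up tensors (`w`, `y`, `z`) to show how each result remembers the function that produced it.

```python
import torch

# A leaf tensor created by the user: nothing produced it, so grad_fn is None.
w = torch.ones(3, requires_grad=True)

# Each operation attaches the function that produced its output.
y = w * 2
z = y.sum()

print(w.grad_fn)                 # None
print(type(y.grad_fn).__name__)  # name of the backward function for the multiply
print(type(z.grad_fn).__name__)  # name of the backward function for the sum

# backward() walks this recorded history from z back to the leaf w.
z.backward()
print(w.grad)                    # gradient of sum(2 * w) with respect to w
```

Because the history hangs off the outputs, running the forward pass again simply records a fresh graph.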

### In-place operations on Variable
Unless you’re operating under heavy memory pressure, you might never need to use them.

## Serialization semantics

### 1. Saving a model

+ (Recommended) Save and load only the model parameters


+ Save the entire model
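
A minimal sketch of the recommended, parameters-only approach, assuming a recent PyTorch release. The `nn.Linear` model and the file name `params.pt` are hypothetical and chosen only for illustration.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical small model used only for this illustration.
model = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "params.pt")

    # Save only the parameters (the state_dict), not the pickled module.
    torch.save(model.state_dict(), path)

    # To load, rebuild the same architecture first, then copy the parameters in.
    restored = nn.Linear(4, 2)
    restored.load_state_dict(torch.load(path))

# The restored model now holds identical weights.
print(torch.equal(model.weight, restored.weight))  # True
```

Saving the entire model means passing the module itself to `torch.save` instead of its `state_dict()`. That pickles a reference to the model's class, so the loading code needs the same class definition available, which is why the parameters-only route is usually the more robust one.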

## Documentation

### 1. Tensor


In [9]:
x = torch.randn(3, 2)  # randn already returns a tensor
x


-0.0386 -0.0170
-0.0057 -1.1883
-0.8964 -0.0027
[torch.FloatTensor of size 3x2]

In [10]:
torch.is_tensor(x)  # is x a tensor?

True

In [11]:
torch.is_storage(x)  # is x a PyTorch storage object?

False

In [12]:
# total number of elements
torch.numel(x)

6

**Creation Ops**

In [13]:
torch.eye(5, 2)  # 5 rows, 2 columns


 1  0
 0  1
 0  0
 0  0
 0  0
[torch.FloatTensor of size 5x2]

In [14]:
torch.eye(5, 2, out=x)
x


 1  0
 0  1
 0  0
 0  0
 0  0
[torch.FloatTensor of size 5x2]

In [15]:
import numpy as np
a = np.array([1, 2, 3])
x = torch.from_numpy(a)  # x shares memory with the NumPy array a
x


 1
 2
 3
[torch.LongTensor of size 3]

In [16]:
x[0] = 100  # x and a share memory, so a changes too
a

array([100,   2,   3])

In [17]:
torch.linspace(start=1, end=100, steps=10)  # steps: number of evenly spaced points from start to end


   1
  12
  23
  34
  45
  56
  67
  78
  89
 100
[torch.FloatTensor of size 10]

`torch.logspace` returns `steps` points spaced evenly on a logarithmic scale between $10^{start}$ and $10^{end}$.

In [18]:
torch.logspace(start=1, end=4, steps=10)


    10.0000
    21.5443
    46.4159
   100.0000
   215.4435
   464.1587
  1000.0000
  2154.4343
  4641.5898
 10000.0000
[torch.FloatTensor of size 10]
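
The values printed by `torch.linspace` and `torch.logspace` follow simple formulas, which the pure-Python sketch below reproduces. The helper functions `linspace` and `logspace` are written for this note only and are not the library implementations.

```python
def linspace(start, end, steps):
    """Return `steps` evenly spaced values from start to end, both included."""
    if steps == 1:
        return [float(start)]
    step = (end - start) / (steps - 1)
    return [start + i * step for i in range(steps)]


def logspace(start, end, steps, base=10.0):
    """Return base**e for each exponent e in linspace(start, end, steps)."""
    return [base ** e for e in linspace(start, end, steps)]


print(linspace(1, 100, 10))                        # 1, 12, 23, ..., 100
print([round(v, 4) for v in logspace(1, 4, 10)])   # 10, 21.5443, ..., 10000
```

In other words, `logspace` is `linspace` applied to the exponents.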

In [19]:
torch.ones(2,5)


 1  1  1  1  1
 1  1  1  1  1
[torch.FloatTensor of size 2x5]

In [20]:
# random numbers drawn uniformly from [0, 1)
torch.rand(2,3)


 0.8019  0.3055  0.6767
 0.9767  0.4511  0.1618
[torch.FloatTensor of size 2x3]

In [21]:
# standard normal distribution
torch.randn(4)


-0.8288
-0.9853
 1.7778
 0.2068
[torch.FloatTensor of size 4]

In [22]:
# random permutation of the integers 0 to n-1
torch.randperm(5)


 1
 2
 4
 0
 3
[torch.LongTensor of size 5]

In [23]:
torch.range(1, 4, step=1)  # both end points are included


 1
 2
 3
 4
[torch.FloatTensor of size 4]
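
`torch.range` is deprecated in later PyTorch releases in favor of `torch.arange`, which follows Python's `range` convention and excludes the end point. A short sketch, assuming PyTorch 0.4 or later; the variable names are illustrative.

```python
import torch

# arange stops before `end`, so pass 5 to get the values 1 through 4.
values = torch.arange(1, 5)
print(values.tolist())  # [1, 2, 3, 4]

# A fractional step works as well; the end point is still excluded.
halves = torch.arange(0, 2, 0.5)
print(halves.tolist())  # [0.0, 0.5, 1.0, 1.5]
```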

In [24]:
torch.zeros(2,3)


 0  0  0
 0  0  0
[torch.FloatTensor of size 2x3]

**Indexing, slicing, joining, mutating**
+ Concatenation

In [25]:
x = torch.randn(2,3)
x


 0.2282  2.0519  0.1840
-1.5891 -0.3069 -0.5706
[torch.FloatTensor of size 2x3]

In [26]:
torch.cat((x, x, x), 0)  # concatenate along dim 0 (rows), like rbind in R


 0.2282  2.0519  0.1840
-1.5891 -0.3069 -0.5706
 0.2282  2.0519  0.1840
-1.5891 -0.3069 -0.5706
 0.2282  2.0519  0.1840
-1.5891 -0.3069 -0.5706
[torch.FloatTensor of size 6x3]

In [27]:
torch.cat((x, x, x), 1)  # concatenate along dim 1 (columns), like cbind in R


 0.2282  2.0519  0.1840  0.2282  2.0519  0.1840  0.2282  2.0519  0.1840
-1.5891 -0.3069 -0.5706 -1.5891 -0.3069 -0.5706 -1.5891 -0.3069 -0.5706
[torch.FloatTensor of size 2x9]

In [28]:
y = torch.cat((x,x,x), 1)
torch.chunk(y, 2)

(
  0.2282  2.0519  0.1840  0.2282  2.0519  0.1840  0.2282  2.0519  0.1840
 [torch.FloatTensor of size 1x9], 
 -1.5891 -0.3069 -0.5706 -1.5891 -0.3069 -0.5706 -1.5891 -0.3069 -0.5706
 [torch.FloatTensor of size 1x9])

In [29]:
torch.chunk(y, 3, dim=1)

(
  0.2282  2.0519  0.1840
 -1.5891 -0.3069 -0.5706
 [torch.FloatTensor of size 2x3], 
  0.2282  2.0519  0.1840
 -1.5891 -0.3069 -0.5706
 [torch.FloatTensor of size 2x3], 
  0.2282  2.0519  0.1840
 -1.5891 -0.3069 -0.5706
 [torch.FloatTensor of size 2x3])

In [32]:
y.size()

torch.Size([2, 9])
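
How `torch.chunk` sizes its pieces can be sketched in plain Python. This follows the behavior described in the current PyTorch documentation as I understand it: each chunk gets `ceil(length / chunks)` entries, the last one may be smaller, and fewer chunks than requested may come back. The helper `chunk_sizes` is hypothetical and exists only for this illustration.

```python
import math


def chunk_sizes(length, chunks):
    """Sizes of the pieces when a dimension of `length` is split into `chunks`."""
    size = math.ceil(length / chunks)
    sizes = []
    remaining = length
    while remaining > 0:
        piece = min(size, remaining)
        sizes.append(piece)
        remaining -= piece
    return sizes


print(chunk_sizes(2, 2))  # [1, 1]     like torch.chunk(y, 2) on the 2 rows
print(chunk_sizes(9, 3))  # [3, 3, 3]  like torch.chunk(y, 3, dim=1)
print(chunk_sizes(9, 2))  # [5, 4]     the last chunk is smaller
print(chunk_sizes(6, 4))  # [2, 2, 2]  fewer chunks than requested
```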

In [33]:
# torch.gather
t = torch.Tensor([[1,2],[3,4]])
t


 1  2
 3  4
[torch.FloatTensor of size 2x2]

In [34]:
torch.gather(t, 1, torch.LongTensor([[0,0],[1,0]]))


 1  1
 4  3
[torch.FloatTensor of size 2x2]

In [35]:
torch.gather(t, 0, torch.LongTensor([[0,0],[1,0]]))


 1  2
 3  2
[torch.FloatTensor of size 2x2]
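
The indexing rule behind the two `torch.gather` calls above can be written out in plain Python. For a 2-D input, `out[i][j] = t[index[i][j]][j]` when `dim=0` and `out[i][j] = t[i][index[i][j]]` when `dim=1`. The helper `gather_2d` is a hypothetical reference version for nested lists, not the library code.

```python
def gather_2d(source, dim, index):
    """Pure-Python gather for nested lists with two dimensions."""
    rows = len(index)
    cols = len(index[0])
    out = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if dim == 0:
                # The index picks the row; the column stays at j.
                out[i][j] = source[index[i][j]][j]
            else:
                # The index picks the column; the row stays at i.
                out[i][j] = source[i][index[i][j]]
    return out


t = [[1, 2], [3, 4]]
idx = [[0, 0], [1, 0]]
print(gather_2d(t, 1, idx))  # [[1, 1], [4, 3]]
print(gather_2d(t, 0, idx))  # [[1, 2], [3, 2]]
```

Both results match the tensors printed by the cells above.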