# Variable

autograd.Variable is the core class of Autograd. It is a thin wrapper around a Tensor and supports almost all Tensor operations.

Once a Tensor is wrapped in a Variable, you can call its .backward() method to run backpropagation and compute all gradients automatically.

Structure of a Variable:
autograd.Variable
- data
- grad
- grad_fn

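A Variable has three main attributes:
- data: the Tensor wrapped by the Variable
- grad: the gradient with respect to data. grad is itself a Variable rather than a plain Tensor (in PyTorch 0.4 and later, where Variable and Tensor were merged, it is simply a Tensor), and it has the same shape as data
- grad_fn: points to the Function object that produced this Variable; backpropagation uses it to compute the gradients of its inputs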
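The three attributes can be inspected directly. The following is a minimal sketch, assuming PyTorch 0.4 or later (where `Variable` still exists but returns an ordinary Tensor); the names `w` and `loss` are illustrative and do not come from the text above.

```python
import torch
from torch.autograd import Variable

# `w` and `loss` are hypothetical names used only for this illustration.
w = Variable(torch.ones(2, 2), requires_grad=True)
loss = (w * 3).sum()

# A leaf created by the user has no grad_fn; a computed result has one.
print(w.grad_fn)                  # None
print(loss.grad_fn is not None)   # True

# grad stays None until backward() has been called.
grad_before = w.grad
loss.backward()

# After backward(), grad has the same shape as data.
print(grad_before is None)
print(w.grad.shape == w.data.shape)
```

Here `loss` is 3 times the sum of the elements of `w`, so each entry of `w.grad` should come out as 3.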

In [6]:
import torch as t

In [7]:
from torch.autograd import Variable

Create a Variable from a Tensor

In [20]:
x = Variable(t.ones(2, 2), requires_grad=True)
x

tensor([[1., 1.],
        [1., 1.]], requires_grad=True)

In [21]:
y = x.sum()
y

tensor(4., grad_fn=<SumBackward0>)

In [22]:
y.grad_fn

<SumBackward0 at 0x3d97c97cc0>

In [23]:
y.backward()  # backpropagate to compute the gradients

In [27]:
# y = x.sum() = (x[0][0] + x[0][1] + x[1][0] + x[1][1])
# so the gradient with respect to every element is 1
x.grad

tensor([[1., 1.],
        [1., 1.]])

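<font color="red">Note: gradients are accumulated during backpropagation. Every call to backward() adds the new gradients to the ones already stored in grad, so the gradients must be zeroed before each backward pass.</font>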

In [30]:
y.backward()
x.grad

tensor([[2., 2.],
        [2., 2.]])

In [31]:
y.backward()
x.grad

tensor([[3., 3.],
        [3., 3.]])

In [32]:
# Functions whose names end with an underscore are in-place operations
x.grad.data.zero_()

tensor([[0., 0.],
        [0., 0.]])

In [33]:
y.backward()
x.grad

tensor([[1., 1.],
        [1., 1.]])
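The zeroing step usually sits inside a loop. The sketch below is one possible arrangement, assuming PyTorch 0.4 or later and a plain tensor with `requires_grad=True` in place of `Variable`; the loop variable `step` is illustrative.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)

for step in range(3):
    # Clear the gradient left over from the previous iteration, if any.
    if x.grad is not None:
        x.grad.zero_()
    y = x.sum()     # rebuild the graph on every iteration
    y.backward()

# Without the zero_() call the three passes would add up;
# with it, only the last pass is kept.
print(x.grad)
```

Each pass contributes a gradient of 1 per element, so removing the `zero_()` call would leave the accumulated total of three passes in `x.grad`.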

## Variable and Tensor have nearly identical interfaces and can be swapped freely in practice

In [34]:
x = Variable(t.ones(4,5))
y = t.cos(x)
x_tensor_cos = t.cos(x.data)
print(y)
x_tensor_cos

tensor([[0.5403, 0.5403, 0.5403, 0.5403, 0.5403],
        [0.5403, 0.5403, 0.5403, 0.5403, 0.5403],
        [0.5403, 0.5403, 0.5403, 0.5403, 0.5403],
        [0.5403, 0.5403, 0.5403, 0.5403, 0.5403]])


tensor([[0.5403, 0.5403, 0.5403, 0.5403, 0.5403],
        [0.5403, 0.5403, 0.5403, 0.5403, 0.5403],
        [0.5403, 0.5403, 0.5403, 0.5403, 0.5403],
        [0.5403, 0.5403, 0.5403, 0.5403, 0.5403]])
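The two printed results can also be compared programmatically. This is a small sketch, assuming PyTorch 0.4 or later; `y_data` is an illustrative name.

```python
import torch
from torch.autograd import Variable

x = Variable(torch.ones(4, 5))
y = torch.cos(x)             # operate on the Variable
y_data = torch.cos(x.data)   # operate on the wrapped Tensor

# Both paths give the same values.
print(torch.equal(y, y_data))        # True

# In PyTorch 0.4+ a Variable is an ordinary Tensor instance.
print(isinstance(x, torch.Tensor))   # True
```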

In [4]:
import torch  # the earlier cells only imported torch under the alias t
tensor = torch.FloatTensor([[1, 2], [3, 4]])
variable = Variable(tensor, requires_grad=True)

In [5]:
print(tensor)
print(variable)

tensor([[1., 2.],
        [3., 4.]])
tensor([[1., 2.],
        [3., 4.]], requires_grad=True)


In [9]:
t_out = torch.mean(tensor*tensor)  # mean(x^2)

In [12]:
v_out = torch.mean(variable*variable)   # supports backpropagation

In [13]:
print(t_out)
print(v_out)

tensor(7.5000)
tensor(7.5000, grad_fn=<MeanBackward0>)


Backpropagating the error

In [14]:
v_out.backward()
# v_out = 1/4 * sum(variable * variable)
# d(v_out)/d(variable) = 1/4 * 2 * variable = variable / 2
print(variable.grad)   # gradient filled in by backward()

tensor([[0.5000, 1.0000],
        [1.5000, 2.0000]])
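The analytic result `variable / 2` can be checked against what autograd computes. The following is a minimal sketch, assuming PyTorch 0.4 or later; the names `var`, `out` and `expected` are illustrative.

```python
import torch

var = torch.tensor([[1., 2.], [3., 4.]], requires_grad=True)
out = torch.mean(var * var)
out.backward()

# Analytic gradient: d(out)/d(var) = 2 * var / 4 = var / 2
expected = var.detach() / 2

# Compare autograd's gradient with the hand-derived one.
print(torch.allclose(var.grad, expected))
```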


In [15]:
print(variable)
print(variable.data)

tensor([[1., 2.],
        [3., 4.]], requires_grad=True)
tensor([[1., 2.],
        [3., 4.]])


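To convert to a NumPy array, you cannot call variable.numpy() directly, because the Variable requires gradients; use variable.data.numpy() instead.

variable.data is the underlying Tensor.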

In [16]:
print(variable.data.numpy())

[[1. 2.]
 [3. 4.]]
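In recent PyTorch versions, `detach()` is the commonly suggested alternative to `.data` for this conversion. The sketch below assumes PyTorch 0.4 or later with NumPy installed; `v`, `raised` and `arr` are illustrative names.

```python
import torch

v = torch.tensor([[1., 2.], [3., 4.]], requires_grad=True)

# Calling numpy() on a tensor that requires grad raises a RuntimeError.
raised = False
try:
    v.numpy()
except RuntimeError as err:
    raised = True
    print("numpy() failed:", err)

# detach() returns a tensor cut off from the graph; numpy() then works.
arr = v.detach().numpy()
print(arr)
```

The resulting array shares memory with `v`, so modifying it in place also changes the tensor.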
