# Automatic Differentiation

### *torch.autograd* provides classes and functions implementing automatic differentiation of arbitrary scalar-valued functions. It requires minimal changes to existing code: you only need to declare the *Tensor*s for which gradients should be computed with the *requires_grad=True* keyword.
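
As a minimal sketch of that workflow (the tensor values and the function `sum(x**2)` are illustrative choices, not taken from the original notebook), marking a tensor with `requires_grad=True` is enough for autograd to compute its gradient:

```python
import torch

# Declare a tensor whose gradient should be computed.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# y = sum(x_i^2) is a scalar-valued function of x.
y = (x ** 2).sum()

# Backpropagate from y; dy/dx = 2x is accumulated into x.grad.
y.backward()
print(x.grad)  # tensor([2., 4., 6.])
```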

## Example

### Expression: $y = x^T A x$
#### Computational graph

[![irVr11.jpg](https://s1.ax1x.com/2018/10/23/irVr11.jpg)](https://imgchr.com/i/irVr11)

[![irVHnf.jpg](https://s1.ax1x.com/2018/10/23/irVHnf.jpg)](https://imgchr.com/i/irVHnf)

#### Computing the gradient with backpropagation
[![irVOAg.jpg](https://s1.ax1x.com/2018/10/23/irVOAg.jpg)](https://imgchr.com/i/irVOAg)

[![irVXNQ.jpg](https://s1.ax1x.com/2018/10/23/irVXNQ.jpg)](https://imgchr.com/i/irVXNQ)

[![irZE4J.jpg](https://s1.ax1x.com/2018/10/23/irZE4J.jpg)](https://imgchr.com/i/irZE4J)

[![irZZC9.jpg](https://s1.ax1x.com/2018/10/23/irZZC9.jpg)](https://imgchr.com/i/irZZC9)
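
The same gradient can be checked numerically. The sketch below assumes the standard closed form $\nabla_x (x^T A x) = (A + A^T)x$ and compares it with what autograd returns; the size `n`, the random seed, and the use of double precision are illustrative assumptions rather than details from the figures.

```python
import torch

torch.manual_seed(0)
n = 3  # illustrative dimension

# A is a constant matrix; x is the variable we differentiate with respect to.
A = torch.randn(n, n, dtype=torch.double)
x = torch.randn(n, dtype=torch.double, requires_grad=True)

# Forward pass: y = x^T A x is a scalar.
y = x @ A @ x

# Backward pass: autograd fills x.grad with dy/dx.
y.backward()

# Closed-form gradient, computed without tracking history.
expected = (A + A.T) @ x.detach()
print(torch.allclose(x.grad, expected))
```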

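## How autograd encodes the history
### Each time an operation is executed, a new *Function* representing it is instantiated, its *forward* method is called, and the *grad_fn* of the *Tensor*s it outputs is set to this *Function*.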

[![irnY8J.png](https://s1.ax1x.com/2018/10/23/irnY8J.png)](https://imgchr.com/i/irnY8J)
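
The recorded history can be inspected directly through `grad_fn`. This is a small sketch with illustrative values; the exact class names printed for the backward nodes (for example `MulBackward0`) are an implementation detail and may differ between PyTorch versions.

```python
import torch

# A tensor created by the user is a leaf: no operation produced it.
x = torch.tensor([1.0, 2.0], requires_grad=True)
y = x * 2      # produced by a multiplication
z = y.sum()    # produced by a sum

print(x.is_leaf, x.grad_fn)  # leaf tensors have grad_fn set to None
print(y.grad_fn)             # node recorded for the multiplication
print(z.grad_fn)             # node recorded for the sum

# Each node links to the nodes of its inputs, which is how the graph
# is walked backwards from z to x.
print(z.grad_fn.next_functions)
```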

## torch.autograd

In [1]:
# Computes the sum of gradients of given tensors w.r.t. graph leaves.
# torch.autograd.backward(tensors, grad_tensors=None, retain_graph=None, create_graph=False)
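
A short sketch of calling `torch.autograd.backward` on a non-scalar output. In that case `grad_tensors` supplies the vector `v` of the vector-Jacobian product; the values of `x` and `v` below are illustrative assumptions.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x * 2  # non-scalar output, so a gradient argument is needed

# v plays the role of dL/dy for some downstream scalar L.
v = torch.tensor([1.0, 0.1, 0.01])
torch.autograd.backward([y], grad_tensors=[v])

# The Jacobian of y = 2x is 2*I, so x.grad equals 2 * v.
print(x.grad)
```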

In [2]:
# Computes and returns the sum of gradients of outputs w.r.t. the inputs.
# torch.autograd.grad(outputs, inputs, grad_outputs=None, retain_graph=None, create_graph=False, only_inputs=True,
# allow_unused=False)
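
Unlike `backward`, `torch.autograd.grad` returns the gradients instead of accumulating them into `.grad`. The sketch below uses the illustrative function `y = x**3` and, with `create_graph=True`, differentiates a second time.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)
y = x ** 3

# First derivative dy/dx = 3x^2. create_graph=True records the backward
# computation itself so that the result can be differentiated again.
(dy_dx,) = torch.autograd.grad(y, x, create_graph=True)

# Second derivative d2y/dx2 = 6x.
(d2y_dx2,) = torch.autograd.grad(dy_dx, x)

print(dy_dx.item(), d2y_dx2.item())
print(x.grad)  # None: grad() does not write into x.grad
```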

## Locally disabling gradient computation

In [4]:
# Context-manager that disables gradient calculation. Also functions as a decorator.
# torch.autograd.no_grad

# Example
import torch

x = torch.tensor([1], dtype=torch.float, requires_grad=True)
with torch.no_grad():
    y = x * 2
print(y.requires_grad)

@torch.no_grad()
def doubler(x):
    return x * 2
z = doubler(x)
print(z.requires_grad)

False
False


In [5]:
# Context-manager that re-enables gradient calculation inside a no_grad context. It has no effect
# outside of no_grad. Also functions as a decorator.
# torch.autograd.enable_grad
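
A minimal sketch of `enable_grad` used both as a context manager and as a decorator inside `no_grad`. The helper name `doubler` mirrors the earlier `no_grad` cell; the variable names `a`, `b`, and `c` are illustrative.

```python
import torch

x = torch.tensor([1.0], requires_grad=True)

with torch.no_grad():
    a = x * 2                  # not tracked
    with torch.enable_grad():
        b = x * 2              # tracking is switched back on here

@torch.enable_grad()
def doubler(t):
    return t * 2

with torch.no_grad():
    c = doubler(x)             # the decorator re-enables tracking

print(a.requires_grad, b.requires_grad, c.requires_grad)  # False True True
```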

In [6]:
# Context-manager that sets gradient calculation to on or off.
# torch.autograd.set_grad_enabled(mode)

# Example
x = torch.tensor([1], dtype=torch.float, requires_grad=True)
is_train = False
with torch.set_grad_enabled(is_train):
    y = x * 2
print(y.requires_grad)

False


## Tensor autograd functions

In [7]:
# Computes the gradient of the current tensor w.r.t. graph leaves.
# torch.Tensor.backward(gradient=None, retain_graph=None, create_graph=False)

# Returns a new Tensor, detached from the current graph. The result will never require gradient.
# torch.Tensor.detach()

# Detaches the Tensor from the graph that created it, making it a leaf.
# torch.Tensor.detach_()
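
The following sketch illustrates how `detach()` and `detach_()` behave; the tensors and the toy `loss` are illustrative assumptions. A detached tensor is treated as a constant during the backward pass.

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
y = x * 3

# detach() returns a new tensor with no history.
d = y.detach()
print(d.requires_grad, d.grad_fn)  # False None

# Gradients flow only through y; d acts as a constant.
# loss = sum(y * d), so d(loss)/dx = 3 * d = 9 * x.
loss = (y * d).sum()
loss.backward()
print(x.grad)

# detach_() works in place and turns the tensor into a leaf.
w = x * 3
w.detach_()
print(w.is_leaf, w.requires_grad)  # True False
```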

In [10]:
# Records operation history and defines formulas for differentiating ops.
# torch.autograd.Function

# Defines a formula for differentiating the operation. It must accept a context ctx as the first argument.
# static backward(ctx, *grad_outputs)

# Performs the operation. It must accept a context ctx as the first argument, followed by any number of arguments.
# The context can be used to store tensors that can then be retrieved during the backward pass.
# static forward(ctx, *args, **kwargs)
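
A custom operation built on `torch.autograd.Function` might look like the sketch below. The class name `Square` and the input values are hypothetical, chosen only to show how `ctx` carries tensors from `forward` to `backward`.

```python
import torch

class Square(torch.autograd.Function):
    """Hypothetical example op: elementwise square with a hand-written gradient."""

    @staticmethod
    def forward(ctx, input):
        # Store the input so it can be retrieved in the backward pass.
        ctx.save_for_backward(input)
        return input ** 2

    @staticmethod
    def backward(ctx, grad_output):
        (input,) = ctx.saved_tensors
        # Chain rule: d(input^2)/d(input) = 2 * input.
        return 2 * input * grad_output

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = Square.apply(x)   # custom Functions are invoked through .apply
y.sum().backward()
print(x.grad)  # tensor([2., 4., 6.])
```

The op is called through `Square.apply` rather than by instantiating the class, so that autograd can record it in the graph.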