# PyTorch autograd Tutorial

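Based on the Bilibili series "PyTorch Source Code Tutorials and Reproductions of Cutting-Edge AI Algorithms" (PyTorch源码教程与前沿人工智能算法复现讲解) by the uploader deep_thoughts (CUHK).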

Episode P_10, PyTorch autograd tutorial:
    
https://www.bilibili.com/video/BV1vL411u7bL/?spm_id_from=333.788

Official autograd documentation: https://pytorch.org/tutorials/beginner/basics/autogradqs_tutorial.html

## Derivative computation example

***Computing the loss:***
    
$z = wx +b$

$y = f(z)$

$L = \frac{1}{2}(y-t)^2$

***Computing the derivatives:***

$\overline{L} = 1$

$\overline{y} = y - t$

$\overline{z} = \overline{y}f'(z)$

$\overline{w} = \overline{z}x$

$\overline{b} = \overline{z}$
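
Here the bar notation $\overline{v}$ stands for $\partial L / \partial v$. The manual derivation above can be checked against autograd with a minimal sketch. The scalar values and the choice of the sigmoid for $f$ are assumptions made only for illustration; the derivation itself leaves $f$ unspecified.

```python
import torch

# Hypothetical scalar values, chosen only for illustration.
x = torch.tensor(2.0)
t = torch.tensor(1.0)
w = torch.tensor(0.5, requires_grad=True)
b = torch.tensor(-0.3, requires_grad=True)

# Forward pass. Assumption: f is the sigmoid.
z = w * x + b
y = torch.sigmoid(z)
L = 0.5 * (y - t) ** 2

# Manual backward pass, following the equations above.
with torch.no_grad():
    y_bar = y - t
    z_bar = y_bar * y * (1 - y)  # f'(z) = f(z) * (1 - f(z)) for the sigmoid
    w_bar = z_bar * x
    b_bar = z_bar

# Autograd backward pass.
L.backward()

print(torch.allclose(w.grad, w_bar))
print(torch.allclose(b.grad, b_bar))
```

Both comparisons should agree up to floating-point tolerance, since autograd applies the same chain rule step by step.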

## Automatic differentiation example
Taken from the official autograd tutorial.

In [1]:
import torch

x = torch.ones(5)  # input tensor
y = torch.zeros(3)  # expected output
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
z = torch.matmul(x, w)+b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)

In [2]:
print(f"Gradient function for z = {z.grad_fn}")
print(f"Gradient function for loss = {loss.grad_fn}")

Gradient function for z = <AddBackward0 object at 0x000002149677A308>
Gradient function for loss = <BinaryCrossEntropyWithLogitsBackward0 object at 0x000002149676F448>


In [3]:
loss.backward()
print(w.grad)
print(b.grad)

tensor([[0.3221, 0.2832, 0.2368],
        [0.3221, 0.2832, 0.2368],
        [0.3221, 0.2832, 0.2368],
        [0.3221, 0.2832, 0.2368],
        [0.3221, 0.2832, 0.2368]])
tensor([0.3221, 0.2832, 0.2368])
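
Every row of `w.grad` is identical because `x` is a vector of ones. A minimal sketch, assuming the default `reduction="mean"` of `binary_cross_entropy_with_logits`, compares the autograd result with the closed-form gradient $(\sigma(z) - y)/3$. The seed is arbitrary, so the numbers will differ from the output above.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

x = torch.ones(5)
y = torch.zeros(3)
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
z = torch.matmul(x, w) + b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)
loss.backward()

with torch.no_grad():
    # Mean reduction divides by the number of elements in z (3 here).
    dz = (torch.sigmoid(z) - y) / z.numel()
    dw = torch.outer(x, dz)  # each row is x[i] * dz, and x[i] == 1
    db = dz

print(torch.allclose(w.grad, dw, atol=1e-6))
print(torch.allclose(b.grad, db, atol=1e-6))
```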


In [4]:
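# By default, results computed from tensors with requires_grad=True are tracked.
z = torch.matmul(x, w)+b
print(z.requires_grad)

# Inside torch.no_grad(), no computation graph is recorded.
with torch.no_grad():
    z = torch.matmul(x, w)+b
print(z.requires_grad)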

True
False
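
As a complementary sketch, `torch.no_grad()` only affects operations executed inside the block, while `requires_grad_(False)` freezes one specific tensor. Freezing `b` here is a hypothetical choice made for illustration.

```python
import torch

x = torch.ones(5)
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)

# No graph is recorded inside the block, so the result has no grad_fn.
with torch.no_grad():
    z = torch.matmul(x, w) + b
print(z.requires_grad, z.grad_fn)

# Tracking resumes as soon as the block exits.
z2 = torch.matmul(x, w) + b
print(z2.requires_grad)

# Freeze b only: w still receives a gradient, b does not.
b.requires_grad_(False)
z3 = torch.matmul(x, w) + b
z3.sum().backward()
print(w.grad is not None, b.grad is None)
```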


In [5]:
z = torch.matmul(x, w)+b
# detach() returns a tensor that is cut off from the computation graph.
z_det = z.detach()
print(z_det.requires_grad)

False
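
A short sketch of what `detach()` does and does not do: the detached tensor has no graph history, but it still shares storage with the original, and it behaves like a constant during backpropagation. The product `z * z_det` is a hypothetical expression used only to make that visible.

```python
import torch

x = torch.ones(5)
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)

z = torch.matmul(x, w) + b
z_det = z.detach()

# The detached tensor carries no graph history ...
print(z_det.grad_fn)
# ... but it points at the same underlying memory as z.
shares_storage = z_det.data_ptr() == z.data_ptr()
print(shares_storage)

# z_det acts as a constant c, so d/dz sum(z * c) = c.
z.retain_grad()  # keep the gradient of this non-leaf tensor for inspection
(z * z_det).sum().backward()
print(torch.equal(z.grad, z_det))
```

Because the storage is shared, in-place changes to `z_det` also change `z`; call `.clone()` on the detached tensor if an independent copy is needed.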


In [6]:
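inp = torch.eye(4, 5, requires_grad=True)
out = (inp+1).pow(2).t()
# out is not a scalar, so backward() needs a tensor of the same shape as out.
# retain_graph=True keeps the graph alive so backward() can be called again.
out.backward(torch.ones_like(out), retain_graph=True)
print(f"First call\n{inp.grad}")
# Gradients accumulate in inp.grad: the second call adds to the first result.
out.backward(torch.ones_like(out), retain_graph=True)
print(f"\nSecond call\n{inp.grad}")
# Reset the accumulated gradient in place before the next call.
inp.grad.zero_()
out.backward(torch.ones_like(out), retain_graph=True)
print(f"\nCall after zeroing gradients\n{inp.grad}")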

First call
tensor([[4., 2., 2., 2., 2.],
        [2., 4., 2., 2., 2.],
        [2., 2., 4., 2., 2.],
        [2., 2., 2., 4., 2.]])

Second call
tensor([[8., 4., 4., 4., 4.],
        [4., 8., 4., 4., 4.],
        [4., 4., 8., 4., 4.],
        [4., 4., 4., 8., 4.]])

Call after zeroing gradients
tensor([[4., 2., 2., 2., 2.],
        [2., 4., 2., 2., 2.],
        [2., 2., 4., 2., 2.],
        [2., 2., 2., 4., 2.]])
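
The role of `retain_graph=True` in the cell above can be sketched as follows: with the default `retain_graph=False`, the saved intermediate values are freed after the first backward pass, so a second call on the same graph raises a `RuntimeError`. The flag name `second_call_failed` is introduced here only for illustration.

```python
import torch

inp = torch.eye(4, 5, requires_grad=True)
out = (inp + 1).pow(2).t()

# First call with the default retain_graph=False: works, then frees the graph.
out.backward(torch.ones_like(out))
first_grad = inp.grad.clone()

# The gradient of (inp + 1)^2 is 2 * (inp + 1): 4 on the diagonal, 2 elsewhere.
print(torch.equal(first_grad, 2 * (inp.detach() + 1)))

# Second call on the same, already freed graph.
try:
    out.backward(torch.ones_like(out))
    second_call_failed = False
except RuntimeError:
    second_call_failed = True
print(second_call_failed)

# Rebuilding the forward pass creates a fresh graph; setting grad to None
# is an alternative to zero_() for discarding the accumulated gradient.
inp.grad = None
out = (inp + 1).pow(2).t()
out.backward(torch.ones_like(out))
print(torch.equal(inp.grad, first_grad))
```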
