<a href="https://colab.research.google.com/github/jimmy-pink/colab-playground/blob/main/3pytorch/PyTorch-%E5%85%A5%E9%97%A8-%E8%87%AA%E5%8A%A8%E6%B1%82%E5%AF%BC%E7%AD%89.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Automatic Differentiation

#### Gradient Tracking

In [None]:
import torch

# Create a tensor that requires gradients
x = torch.tensor(2.0, requires_grad=True)
y = x ** 2  # y = x^2

# Compute the gradient
y.backward()  # dy/dx = 2x
print(x.grad)  # prints: tensor(4.) (x = 2, so 2*2 = 4)

tensor(4.)
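
As a small sketch of what "tracking" means in practice, the attributes below can be inspected before and after `backward()`. The variable names mirror the cell above; nothing here goes beyond standard tensor attributes (`requires_grad`, `is_leaf`, `grad_fn`, `grad`).

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = x ** 2

print(x.requires_grad)        # True: x was created with requires_grad=True
print(x.is_leaf)              # True: x was created directly, not by an operation
print(y.requires_grad)        # True: y depends on a tracked tensor
print(y.grad_fn is not None)  # True: y remembers the operation that produced it
print(x.grad)                 # None: no gradient has been computed yet

y.backward()
print(x.grad)                 # tensor(4.)
```

The gradient is stored on the leaf tensor `x`, not on the result `y`.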


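#### Multivariable Gradients (Partial Derivatives)

Original function: $y = (x_1 + x_2)^2$  
Partial derivative with respect to $x_1$: $\frac{\partial y}{\partial x_1} = 2(x_1 + x_2)$. Note: $\frac{\partial (x+2)^2}{\partial x}=2(x+2)$ (the case $x_2 = 2$)    
Partial derivative with respect to $x_2$: $\frac{\partial y}{\partial x_2} = 2(x_1 + x_2)$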

In [None]:
x = torch.tensor([1.0, 2.0], requires_grad=True)
y = x.sum() ** 2  # y = (x1 + x2)^2

y.backward()
print(x.grad)  # the derivative of s^2 is 2s, with s = x1 + x2: 2*(1+2) = 6

tensor([6., 6.])
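
In the example above both partial derivatives happen to be equal. A minimal sketch with a different, purely illustrative function, $y = x_1^2 x_2$, shows that `x.grad` holds one partial derivative per component:

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)
y = x[0] ** 2 * x[1]  # y = x1^2 * x2 (illustrative function, not from the original notebook)

y.backward()
# dy/dx1 = 2 * x1 * x2 = 4, dy/dx2 = x1^2 = 1
print(x.grad)  # tensor([4., 1.])
```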


🥤 Example 2  
$f(u,v) = uv + u^2$  
$\frac{\partial f(u,v)}{\partial u} = v + 2u$   
$\frac{\partial f(u,v)}{\partial v} = u$

In [None]:
u = torch.tensor(1.0, requires_grad=True)
v = torch.tensor(2.0, requires_grad=True)
f = u * v + u ** 2
f.backward()
print(u.grad)  # tensor(4.): v + 2u = 2 + 2*1
print(v.grad)  # tensor(1.): u = 1
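
The hand-derived partial derivatives of $f(u,v) = uv + u^2$ can also be cross-checked numerically without autograd. The sketch below uses a central finite difference; the helper name `central_diff` and the step size `h` are assumptions made for illustration.

```python
def f(u, v):
    return u * v + u ** 2

def central_diff(func, u, v, wrt, h=1e-6):
    """Approximate a partial derivative of func(u, v) with a central difference."""
    if wrt == "u":
        return (func(u + h, v) - func(u - h, v)) / (2 * h)
    return (func(u, v + h) - func(u, v - h)) / (2 * h)

u, v = 1.0, 2.0
df_du = central_diff(f, u, v, "u")  # should be close to v + 2u = 4
df_dv = central_diff(f, u, v, "v")  # should be close to u = 1
print(df_du, df_dv)
```

The printed values should agree with the analytic results up to floating-point error.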

#### Dynamic Computation Graph  
PyTorch builds its computation graph dynamically; every forward pass constructs a new graph:


In [None]:
def forward(x):
    return x ** 2 + 2 * x + 1

x = torch.tensor(3.0, requires_grad=True)
y = forward(x)
y.backward()
print(x.grad)  # prints: tensor(8.) (dy/dx = 2x + 2 = 8)

tensor(8.)
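
Because the graph is rebuilt on every call, ordinary Python control flow can change its shape between forward passes. A minimal sketch, assuming a hypothetical `forward(x, n_steps)` helper whose loop length decides which function gets differentiated:

```python
import torch

def forward(x, n_steps):
    # Hypothetical helper: multiplies by x n_steps times, giving x ** (n_steps + 1)
    out = x
    for _ in range(n_steps):
        out = out * x
    return out

x = torch.tensor(3.0, requires_grad=True)

y = forward(x, 1)        # graph for x^2
y.backward()
grad_a = x.grad.clone()  # 2x = 6
x.grad.zero_()           # reset so the next gradient is not added on top

y = forward(x, 2)        # a different graph, for x^3
y.backward()
grad_b = x.grad.clone()  # 3x^2 = 27

print(grad_a, grad_b)
```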


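#### Gradient Accumulation and Zeroing
By default, PyTorch adds each newly computed gradient to the existing `.grad`, so gradients must be zeroed manually during training: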

In [None]:
x = torch.tensor(1.0, requires_grad=True)

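# First computation
y1 = x ** 2
y1.backward()
print(x.grad)  # tensor(2.): 2*x

# Zero the gradient (deliberately left commented out to show accumulation)
# x.grad.zero_()

# Second computation (the gradient accumulates)
y2 = x ** 3
y2.backward()
print(x.grad)  # tensor(5.) without zeroing (2*1 + 3*1^2 = 5); tensor(3.) if zeroed first (3*1^2 = 3)

# Zero the gradient
x.grad.zero_()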

tensor(2.)
tensor(5.)


tensor(0.)
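
In a training loop the zeroing is usually done through the optimizer. The sketch below is a toy setup rather than anything from the original notebook: a single parameter `w`, the loss $(w - 1)^2$, and plain SGD with an assumed learning rate of 0.1.

```python
import torch

w = torch.tensor(0.0, requires_grad=True)
optimizer = torch.optim.SGD([w], lr=0.1)

for step in range(3):
    optimizer.zero_grad()   # clear the gradient left over from the previous step
    loss = (w - 1.0) ** 2
    loss.backward()         # d(loss)/dw = 2 * (w - 1)
    optimizer.step()        # w <- w - lr * grad
    print(step, w.item())   # w moves toward 1 at each step
```

Without the `zero_grad()` call, each step would use the sum of all gradients computed so far instead of the current one.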

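#### Stopping Gradient Tracking

`detach()` returns a tensor that is cut off from the computation graph, so it does not take part in gradient computation: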

```python
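import torch

x = torch.tensor(2.0, requires_grad=True)
y = x ** 2
z = y.detach()  # z does not take part in gradient computation

try:
    z.backward()  # raises RuntimeError: z is not tracked and has no grad_fn
except RuntimeError as err:
    print(err)

# Temporarily disable gradient computation:
x = torch.tensor(2.0, requires_grad=True)
with torch.no_grad():
    y = x ** 2  # no computation graph is recorded for y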
```
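
A short sketch of what the two mechanisms change in practice. The expression `x * y.detach()` is an illustrative choice: the detached factor is treated as a constant, so only the direct dependence on `x` is differentiated.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)

y = x ** 2
z = y.detach()
print(y.requires_grad, z.requires_grad)  # True False

# y.detach() acts as the constant 4, so d(out)/dx = 4 rather than 3x^2 = 12
out = x * y.detach()
out.backward()
print(x.grad)  # tensor(4.)

with torch.no_grad():
    w = x ** 2
print(w.requires_grad)  # False: nothing was recorded inside the block
```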

#### Higher-Order Derivatives

In [None]:
x = torch.tensor(2.0, requires_grad=True)
y = x ** 3

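# First derivative (create_graph=True keeps the graph so the result can be differentiated again)
grad1 = torch.autograd.grad(y, x, create_graph=True)  # dy/dx = 3x^2 = 12
print(grad1[0])  # tensor(12.)

# Second derivative
grad2 = torch.autograd.grad(grad1[0], x)  # d²y/dx² = 6x = 12
print(grad2[0])  # tensor(12.)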

tensor(12., grad_fn=<MulBackward0>)
tensor(12.)
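
The same pattern extends to higher orders by repeatedly differentiating the previous result. A minimal sketch, using $y = x^4$ as an illustrative function so that the first three derivatives are all non-trivial:

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = x ** 4

derivs = []
current = y
for order in range(3):
    # create_graph=True records the derivative itself as a graph,
    # which is what allows it to be differentiated again
    (current,) = torch.autograd.grad(current, x, create_graph=True)
    derivs.append(current.item())

# 4x^3 = 32, 12x^2 = 48, 24x = 48
print(derivs)
```

Passing `create_graph=True` on every iteration keeps the loop uniform; it can be dropped on the final call if no further derivative is needed.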
