## Backpropagation with torch.autograd

+ `torch.autograd` supports automatic computation of gradients.

In [2]:
import torch

x = torch.ones(5)   # input tensor
y = torch.zeros(3)  # expected output
# requires_grad defaults to False; set it to True if autograd should record operations on the tensor
# w and b are the parameters we want to optimize
# it can also be set later, e.g. x.requires_grad_(True)
w = torch.rand(5, 3, requires_grad=True)
b = torch.rand(3, requires_grad=True)
z = torch.matmul(x, w)+b

# binary cross-entropy on the logits z, with each target in y being 0 or 1
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)
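
The comment above notes that `requires_grad` can also be switched on after a tensor has been created. A minimal sketch of that, using a fresh tensor `v` (a name introduced here only for illustration):

```python
import torch

v = torch.rand(5, 3)            # created without gradient tracking
print(v.requires_grad)          # False

v.requires_grad_(True)          # in-place switch (trailing underscore = in-place)
print(v.requires_grad)          # True

out = (v * 2).sum()
print(out.grad_fn is not None)  # True: operations on v are now recorded
```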


In [5]:
x

tensor([1., 1., 1., 1., 1.])

In [6]:
w

tensor([[0.8998, 0.2096, 0.7179],
        [0.1837, 0.1437, 0.6708],
        [0.0515, 0.2548, 0.9957],
        [0.2768, 0.9083, 0.8156],
        [0.3747, 0.4235, 0.3415]], requires_grad=True)

In [7]:
torch.matmul(x, w)

tensor([1.7865, 1.9400, 3.5415], grad_fn=<SqueezeBackward3>)

In [8]:
loss

tensor(3.1431, grad_fn=<BinaryCrossEntropyWithLogitsBackward>)

In [10]:
# each tensor stores a reference to its backward function in .grad_fn
print('Gradient function for z =',z.grad_fn)
print('Gradient function for loss =', loss.grad_fn)

Gradient function for z = <AddBackward0 object at 0x1060558e0>
Gradient function for loss = <BinaryCrossEntropyWithLogitsBackward object at 0x106055d60>
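
Each `grad_fn` also links to the functions that produced its inputs, which is how autograd can walk back through the computation. A small sketch, assuming the same shapes as above; `next_functions` is an attribute of autograd graph nodes, and the exact node class names printed may differ between PyTorch versions:

```python
import torch

x = torch.ones(5)
w = torch.rand(5, 3, requires_grad=True)
b = torch.rand(3, requires_grad=True)
z = torch.matmul(x, w) + b

# z was produced by an addition, so its grad_fn is an Add node
print(type(z.grad_fn).__name__)

# next_functions holds (node, input_index) pairs for the inputs of that addition
for node, _ in z.grad_fn.next_functions:
    print(type(node).__name__ if node is not None else None)

# tensors created directly by the user have no grad_fn
print(w.grad_fn, x.grad_fn)
```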


### Computing gradients

+ Compute the derivatives of the loss function with respect to the parameters:


$\frac{\partial loss}{\partial w}$ and
$\frac{\partial loss}{\partial b}$ under some fixed values of
`x` and `y`

In [12]:
# To compute the derivatives, call loss.backward() first, then read the values from w.grad
print(w.grad)
# Calling this before loss.backward() prints None, because backpropagation has not run yet

None


In [13]:
loss.backward()
print(w.grad)
print(b.grad)

tensor([[0.3061, 0.3041, 0.3296],
        [0.3061, 0.3041, 0.3296],
        [0.3061, 0.3041, 0.3296],
        [0.3061, 0.3041, 0.3296],
        [0.3061, 0.3041, 0.3296]])
tensor([0.3061, 0.3041, 0.3296])
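
Every row of `w.grad` above is identical and equal to `b.grad`. A hand derivation suggests why: with the default mean reduction over $N = 3$ outputs, $\partial loss / \partial z_j = (\sigma(z_j) - y_j)/N$, and since $z_j = \sum_i x_i w_{ij} + b_j$ with every $x_i = 1$, each $\partial loss / \partial w_{ij}$ equals $\partial loss / \partial b_j$. The sketch below checks this against autograd; `dz`, `expected_w` and `expected_b` are names introduced here for illustration.

```python
import torch

x = torch.ones(5)
y = torch.zeros(3)
w = torch.rand(5, 3, requires_grad=True)
b = torch.rand(3, requires_grad=True)

z = torch.matmul(x, w) + b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)
loss.backward()

# Hand-derived gradient for the default mean reduction:
# d loss / d z_j = (sigmoid(z_j) - y_j) / N, with N = 3 outputs
dz = (torch.sigmoid(z.detach()) - y) / y.numel()
expected_b = dz                  # d z_j / d b_j = 1
expected_w = torch.outer(x, dz)  # d z_j / d w_ij = x_i

print(torch.allclose(b.grad, expected_b, atol=1e-6))
print(torch.allclose(w.grad, expected_w, atol=1e-6))
```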


In [18]:
# When gradients no longer need to be tracked, wrap the computation in torch.no_grad().
z = torch.matmul(x, w) + b
print(z.requires_grad)

with torch.no_grad():
    z = torch.matmul(x, w) + b

print(z.requires_grad)  

True
False
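
`torch.no_grad()` is not the only way to stop tracking. A short sketch of `Tensor.detach()`, which returns a tensor holding the same values but cut off from the computation graph (`z_det` is a name chosen here for illustration):

```python
import torch

x = torch.ones(5)
w = torch.rand(5, 3, requires_grad=True)
b = torch.rand(3, requires_grad=True)

z = torch.matmul(x, w) + b
z_det = z.detach()            # same values, no gradient history

print(z.requires_grad)        # True
print(z_det.requires_grad)    # False
print(torch.equal(z, z_det))  # True
```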


+ autograd records the operations performed on tensors in a DAG (Directed Acyclic Graph). Input tensors are the leaf nodes and output tensors are the roots. Tracing the graph from the roots to the leaves and applying the chain rule yields the gradients automatically.
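
To make the root-to-leaf traversal concrete, here is a toy reverse-mode sketch in plain Python. It is only an illustration of the chain-rule idea on scalars, not how PyTorch implements autograd; `Node`, `add`, `mul` and `backward` are hypothetical helpers written for this example.

```python
class Node:
    """A scalar value in a tiny computation graph (illustrative only)."""

    def __init__(self, value, inputs=()):
        self.value = value
        self.inputs = inputs  # tuples of (input_node, local_derivative)
        self.grad = 0.0


def add(a, b):
    return Node(a.value + b.value, ((a, 1.0), (b, 1.0)))


def mul(a, b):
    return Node(a.value * b.value, ((a, b.value), (b, a.value)))


def backward(root):
    # Post-order DFS: every node is appended after all of its inputs
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for inp, _ in node.inputs:
            visit(inp)
        order.append(node)

    visit(root)
    root.grad = 1.0
    # Walk from the root toward the leaves, accumulating gradients
    for node in reversed(order):
        for inp, local in node.inputs:
            inp.grad += node.grad * local  # chain rule


x = Node(2.0)  # leaf
w = Node(3.0)  # leaf
b = Node(1.0)  # leaf
z = add(mul(x, w), b)  # root: z = x * w + b
backward(z)
print(w.grad, x.grad, b.grad)  # 2.0 3.0 1.0
```

The leaves `x`, `w` and `b` end up holding dz/dx = w, dz/dw = x and dz/db = 1, which is the same bookkeeping autograd does for tensors.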