# Automatic Differentiation with torch.autograd

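PyTorch has a built-in differentiation engine, `torch.autograd`, that computes gradients for us. The example below builds a one-layer network with input `x`, parameters `w` and `b`, and a binary cross-entropy loss.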



In [1]:
import torch 
x = torch.ones(5)  # input tensor
y = torch.zeros(3)  # expected output
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
z = torch.matmul(x, w) + b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)


In [2]:
print(loss)

tensor(1.3804, grad_fn=<BinaryCrossEntropyWithLogitsBackward0>)


In [3]:
print(z)

tensor([-0.3228,  3.5479, -3.8853], grad_fn=<AddBackward0>)


# Note

You can set the value of `requires_grad` when creating a tensor, or later by calling the `x.requires_grad_(True)` method.
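A minimal sketch of both options follows. The tensors `v` and `u` are hypothetical names used only for illustration and are not part of the example above.

```python
import torch

# Hypothetical tensors for illustration only
v = torch.randn(3)             # requires_grad defaults to False
print(v.requires_grad)

v.requires_grad_(True)         # in-place switch after creation
print(v.requires_grad)

u = torch.randn(3, requires_grad=True)  # set at creation time
print(u.requires_grad)
```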

In [4]:
print(f"Gradient function for z = {z.grad_fn}")
print(f"Gradient function for loss = {loss.grad_fn}")

Gradient function for z = <AddBackward0 object at 0x000001DA6DB05C10>
Gradient function for loss = <BinaryCrossEntropyWithLogitsBackward0 object at 0x000001DA6DAF7CD0>


# Computing Gradients
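We call `loss.backward()` to compute the gradients of the loss with respect to `w` and `b`. The results are stored in `w.grad` and `b.grad`; before the call, `.grad` is `None`.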


In [5]:
print(w.grad)

None


In [6]:
loss.backward()
print(w.grad)
print(b.grad)

tensor([[0.1400, 0.3240, 0.0067],
        [0.1400, 0.3240, 0.0067],
        [0.1400, 0.3240, 0.0067],
        [0.1400, 0.3240, 0.0067],
        [0.1400, 0.3240, 0.0067]])
tensor([0.1400, 0.3240, 0.0067])
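These values can be checked by hand. With the default mean reduction, the gradient of `binary_cross_entropy_with_logits` with respect to each logit is `(sigmoid(z) - y) / N`, where `N` is the number of elements. The sketch below repeats the setup with fresh random weights (so the numbers differ from the output above) and an arbitrary seed, then compares autograd's result with the manual formula.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only for repeatability
x = torch.ones(5)
y = torch.zeros(3)
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)

z = torch.matmul(x, w) + b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)
loss.backward()

# Manual gradient w.r.t. the logits for the default mean reduction
dz = (torch.sigmoid(z.detach()) - y) / y.numel()
manual_w_grad = torch.outer(x, dz)  # d(loss)/dw[i, j] = x[i] * dz[j]
manual_b_grad = dz                  # d(loss)/db[j] = dz[j]

print(torch.allclose(w.grad, manual_w_grad, atol=1e-6))
print(torch.allclose(b.grad, manual_b_grad, atol=1e-6))
```

Because `x` is all ones, every row of `w.grad` equals `dz`, which is why the rows in the output above are identical and match `b.grad`.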


In [None]:
# Calling backward() a second time on the same graph raises an error
loss.backward()
print(w.grad)

RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.
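As the error message suggests, passing `retain_graph=True` to the first call keeps the saved intermediate values so that a second backward pass is possible. The following is a small sketch that rebuilds the same setup; note that gradients accumulate in `.grad`, so the second pass adds to the first.

```python
import torch

x = torch.ones(5)
y = torch.zeros(3)
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
loss = torch.nn.functional.binary_cross_entropy_with_logits(x @ w + b, y)

loss.backward(retain_graph=True)  # keep saved tensors for another pass
first = w.grad.clone()

loss.backward()                   # second pass works; the graph is freed afterwards
# Gradients accumulate into .grad, so the stored value is now twice the first one
print(torch.allclose(w.grad, 2 * first))
```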

# Disabling Gradient Tracking 
Sometimes we do not need to compute gradients, for example when we only want to run *forward* computations through the network.

We can stop tracking computations by wrapping the code in a `torch.no_grad()` block.

In [8]:
z = x @ w + b 
print(z.requires_grad)

with torch.no_grad():
    z = x @ w + b 
print(z.requires_grad)

True
False


Another way to achieve the same result is to call the `detach()` method on the tensor:

In [10]:
z = torch.matmul(x, w) + b
z_det = z.detach()
print(z_det.requires_grad) 

False


# When to Disable Gradient Tracking

- To mark some parameters in your neural network as **frozen parameters**, for example when fine-tuning.
- To **speed up computations** when you are only doing a forward pass, because computations on tensors that do not track gradients are more efficient.
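Freezing parameters could look roughly like the sketch below. The two-layer `model` is a hypothetical stand-in for a real network and is not part of the example above; the first layer is frozen with `requires_grad_(False)`, so only the second layer receives gradients.

```python
import torch

# Hypothetical two-layer model, used only for illustration
model = torch.nn.Sequential(
    torch.nn.Linear(5, 4),
    torch.nn.ReLU(),
    torch.nn.Linear(4, 3),
)

# Freeze the first layer: autograd will not compute gradients for it
for p in model[0].parameters():
    p.requires_grad_(False)

x = torch.ones(1, 5)
y = torch.zeros(1, 3)
loss = torch.nn.functional.binary_cross_entropy_with_logits(model(x), y)
loss.backward()

print(model[0].weight.grad)               # frozen layer: no gradient stored
print(model[2].weight.grad is not None)   # trainable layer: gradient stored
```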

# More on Computational Graphs
**Note**
After each `.backward()` call, autograd starts populating a new graph.
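In practice this means that re-running the forward pass is enough to make `backward()` callable again. The sketch below repeats the earlier setup in a two-step loop; resetting `.grad` with `zero_()` between steps is an addition here, needed because gradients otherwise accumulate across calls.

```python
import torch

x = torch.ones(5)
y = torch.zeros(3)
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)

grads = []
for step in range(2):
    # Each forward pass builds a fresh graph, so backward() works every time
    z = x @ w + b
    loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)
    loss.backward()
    grads.append(w.grad.clone())
    # Reset accumulated gradients before the next step
    w.grad.zero_()
    b.grad.zero_()

# w and b did not change between steps, so both passes give the same gradient
print(torch.allclose(grads[0], grads[1]))
```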
