# Automatic differentiation with `torch.autograd`
https://pytorch.org/tutorials/beginner/basics/autogradqs_tutorial.html

Consider the simplest one-layer neural network, with input `x`, parameters `w` and `b`, and some loss function. It can be defined in PyTorch in the following manner:

In [1]:
import torch

In [2]:
x = torch.ones(5)  # input tensor
y = torch.zeros(3)  # expected output

w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
z = torch.matmul(x, w) + b

loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)

## Tensors, Functions and Computational graph

In [3]:
print(f"Gradient function for z = {z.grad_fn}")
print(f"Gradient function for loss = {loss.grad_fn}")

Gradient function for z = <AddBackward0 object at 0x7f51c0003a90>
Gradient function for loss = <BinaryCrossEntropyWithLogitsBackward0 object at 0x7f51c0003c40>
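
To make the printed `grad_fn` objects a little more concrete, here is a minimal sketch that rebuilds the same tensors and inspects a few autograd attributes. The attributes `is_leaf` and `next_functions` are not used in the original cells; they are shown here only as one way to look at the graph, and `input_nodes` is an illustrative name.

```python
import torch

x = torch.ones(5)   # input tensor (does not require gradients)
y = torch.zeros(3)  # expected output
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
z = torch.matmul(x, w) + b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)

# Tensors created directly by the user are leaves and have no grad_fn
print(w.is_leaf, w.grad_fn)

# Tensors produced by an operation record the function that created them
print(z.is_leaf, type(z.grad_fn).__name__)

# next_functions lists the nodes that feed into z's grad_fn;
# a leaf that requires gradients (here b) shows up as an AccumulateGrad node
input_nodes = [type(fn).__name__ for fn, _ in z.grad_fn.next_functions if fn is not None]
print(input_nodes)
```

The exact name of the node produced by `torch.matmul` may differ between PyTorch versions, so it is printed here rather than assumed.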


## Computing Gradients
To optimize the weights of the parameters in the neural network, we need to compute the derivatives of our loss function with respect to the parameters, that is, the derivatives of the loss as a function of the weights `w` and the bias `b`.
To compute those derivatives, we call `loss.backward()`, and then retrieve the values from `w.grad` and `b.grad`:

In [5]:
loss.backward()
print(w.grad)
print(b.grad)

RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.
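
The error above is what PyTorch raises when `backward()` is called again on a graph whose saved tensors have already been freed, which is what happens if the cell is executed a second time. The following is a minimal sketch, not the tutorial's own code: it rebuilds the graph, runs one backward pass, compares the result with the closed-form gradient of the loss, and then shows the effect of `retain_graph=True`. The helper `forward()`, the seed, and the variable names are assumptions made for illustration.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only so the run is repeatable

x = torch.ones(5)
y = torch.zeros(3)
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)

def forward():
    """Hypothetical helper: rebuild the graph and return (z, loss)."""
    z = torch.matmul(x, w) + b
    return z, torch.nn.functional.binary_cross_entropy_with_logits(z, y)

# One backward pass on a fresh graph populates w.grad and b.grad
z, loss = forward()
loss.backward()

# With the default mean reduction, dloss/dz = (sigmoid(z) - y) / number_of_outputs
dz = (torch.sigmoid(z.detach()) - y) / y.numel()
grads_match = torch.allclose(b.grad, dz) and torch.allclose(w.grad, torch.outer(x, dz))
print(grads_match)

# A second backward pass on the same graph fails: its saved tensors were freed
try:
    loss.backward()
    second_call_failed = False
except RuntimeError:
    second_call_failed = True
print(second_call_failed)

# Rebuilding the graph and passing retain_graph=True allows a second pass;
# note that gradients accumulate across calls rather than being overwritten
w.grad = None
b.grad = None
z, loss = forward()
loss.backward(retain_graph=True)
first = b.grad.clone()
loss.backward()
accumulated = torch.allclose(b.grad, 2 * first)
print(accumulated)
```

In a notebook, re-running the cell that builds `z` and `loss` before calling `loss.backward()` again avoids the error without needing `retain_graph=True`.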

## Disabling Gradient Tracking
By default, all tensors with `requires_grad=True` track their computational history and support gradient computation. However, there are cases where we do not need this, for example when we have trained the model and just want to apply it to some input data, i.e., we only want to do forward computations through the network. We can stop tracking computations by surrounding our computation code with a `torch.no_grad()` block:

In [7]:
z = torch.matmul(x, w) + b
print(z.requires_grad)

with torch.no_grad():
    z = torch.matmul(x, w) + b
print(z.requires_grad)

True
False


Another way to achieve the same result is to use the `detach()` method on the tensor:

In [8]:
z = torch.matmul(x, w) + b
z_det = z.detach()
print(z_det.requires_grad)

False
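
One difference from `torch.no_grad()` is worth seeing in code: `detach()` does not change the original tensor, it returns a new tensor that shares the same storage but is cut off from the graph. The sketch below illustrates this under the same setup as above; the names `shares_storage` and `still_tracked` are only illustrative.

```python
import torch

x = torch.ones(5)
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)

z = torch.matmul(x, w) + b
z_det = z.detach()

# The detached tensor has no gradient history...
print(z_det.requires_grad, z_det.grad_fn)

# ...while the original tensor is still part of the graph
still_tracked = z.requires_grad and z.grad_fn is not None
print(still_tracked)

# Both tensors point at the same underlying memory (no copy is made)
shares_storage = z_det.data_ptr() == z.data_ptr()
print(shares_storage)
```

Because the storage is shared, in-place changes to `z_det` also affect `z`; calling `z.detach().clone()` is one way to get an independent copy.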


There are reasons you might want to disable gradient tracking:
- To mark some parameters in your neural network as frozen parameters.
- To speed up computations when you are only doing a forward pass, because computations on tensors that do not track gradients are more efficient.
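
Both reasons can be sketched in a few lines. The two-layer model below (`backbone` and `head`) is hypothetical and not part of the original example; it is only an assumption used to show how frozen parameters behave during `backward()` and how a forward-only pass looks under `torch.no_grad()`.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only so the run is repeatable

# Hypothetical model: a frozen "backbone" followed by a trainable "head"
backbone = torch.nn.Linear(5, 4)
head = torch.nn.Linear(4, 3)

# Freeze the backbone: its parameters no longer track gradients
for p in backbone.parameters():
    p.requires_grad_(False)

x = torch.ones(2, 5)   # batch of two inputs
y = torch.zeros(2, 3)  # expected outputs

logits = head(torch.relu(backbone(x)))
loss = torch.nn.functional.binary_cross_entropy_with_logits(logits, y)
loss.backward()

# Only the trainable head received gradients
frozen_has_no_grad = all(p.grad is None for p in backbone.parameters())
head_has_grad = all(p.grad is not None for p in head.parameters())
print(frozen_has_no_grad, head_has_grad)

# Forward-only pass: nothing inside the block is recorded in a graph
with torch.no_grad():
    preds = torch.sigmoid(head(torch.relu(backbone(x))))
print(preds.requires_grad)
```

An optimizer would then typically be given only the parameters that still require gradients, for example `head.parameters()` in this sketch.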