Consider a one-layer neural network with input `x`, parameters `w` and `b`, and a loss function:

In [1]:
import torch

In [3]:
x = torch.ones(5) # input tensor
y = torch.zeros(3) # expected output
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
z = torch.matmul(x, w) + b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)

print(f"x:\n {x}")
print(f"y:\n {y}")
print(f"w:\n {w}")
print(f"b:\n {b}")
print(f"z:\n {z}")
print(f"loss:\n {loss}")

print(f"gradient function for z:\n {z.grad_fn}")
print(f"gradient function for loss:\n {loss.grad_fn}")

x:
 tensor([1., 1., 1., 1., 1.])
y:
 tensor([0., 0., 0.])
w:
 tensor([[-0.1101,  0.3991,  0.8016],
        [-0.1892,  0.7182,  1.2419],
        [ 1.0189, -0.5179,  1.3606],
        [ 1.1578,  0.4373,  0.6054],
        [-1.2938, -0.0192, -0.9676]], requires_grad=True)
b:
 tensor([ 0.3753,  0.6999, -0.2055], requires_grad=True)
z:
 tensor([0.9589, 1.7175, 2.8364], grad_fn=<AddBackward0>)
loss:
 2.0197770595550537
gradient function for z:
 <AddBackward0 object at 0x7f718d5c61a0>
gradient function for loss:
 <BinaryCrossEntropyWithLogitsBackward0 object at 0x7f718d5c5510>


> To optimize the parameters of the neural network, we need the derivatives of the loss function with respect to those parameters, namely $\frac{\partial\,\text{loss}}{\partial w}$ and $\frac{\partial\,\text{loss}}{\partial b}$, for fixed values of `x` and `y`. To compute them, we call `loss.backward()` and then read the values from `w.grad` and `b.grad`:

In [4]:
loss.backward()
print(f"w.grad:\n {w.grad}")
print(f"b.grad:\n {b.grad}")

w.grad:
 tensor([[0.2410, 0.2826, 0.3149],
        [0.2410, 0.2826, 0.3149],
        [0.2410, 0.2826, 0.3149],
        [0.2410, 0.2826, 0.3149],
        [0.2410, 0.2826, 0.3149]])
b.grad:
 tensor([0.2410, 0.2826, 0.3149])
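
The gradients above can be cross-checked by hand. The following is a minimal sketch, assuming the default mean reduction of `binary_cross_entropy_with_logits`, for which $\frac{\partial\,\text{loss}}{\partial z} = \frac{\sigma(z) - y}{N}$ with $N$ the number of outputs. The seed and the names `dloss_dz`, `expected_w_grad`, and `expected_b_grad` are introduced here for illustration only and are not part of the original notebook.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, used only to make this sketch reproducible

x = torch.ones(5)   # input tensor
y = torch.zeros(3)  # expected output
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)

z = torch.matmul(x, w) + b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)
loss.backward()

# Recompute the gradients analytically, without recording the operations.
with torch.no_grad():
    dloss_dz = (torch.sigmoid(z) - y) / y.numel()  # mean reduction divides by N
    expected_b_grad = dloss_dz                     # z = x @ w + b, so dz/db is the identity
    expected_w_grad = torch.outer(x, dloss_dz)     # dz_j/dw_ij = x_i

print(torch.allclose(w.grad, expected_w_grad, atol=1e-6))  # True
print(torch.allclose(b.grad, expected_b_grad, atol=1e-6))  # True
```

Because `x` is a vector of ones, every row of the outer product equals `dloss_dz`, which is why all rows of `w.grad` are identical and match `b.grad`.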


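> By default, the result of any operation on a tensor with `requires_grad=True` tracks its computational history. When only the forward computation is needed, tracking can be switched off with a `torch.no_grad()` block or with `detach()`:

In [9]: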
z = torch.matmul(x, w) + b
print(f"z requires grad: {z.requires_grad}")

# disable gradient tracking
with torch.no_grad():
    z = torch.matmul(x, w) + b
print(f"z requires grad: {z.requires_grad}")

# alternatively, use the detach() method
z = torch.matmul(x, w) + b
z_det = z.detach()
print(f"z requires grad: {z.requires_grad}")
print(f"z_det requires grad: {z_det.requires_grad}")

z requires grad: True
z requires grad: False
z requires grad: True
z_det requires grad: False
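
To make the effect of the two approaches more concrete, here is a small sketch that inspects `grad_fn` and the underlying storage. The variable name `z_no_grad` is introduced here for illustration. The sketch shows that a result computed under `torch.no_grad()` has no backward node, and that `detach()` returns a tensor sharing memory with the original but cut off from the graph.

```python
import torch

x = torch.ones(5)
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)

# Computed without recording operations for backward
with torch.no_grad():
    z_no_grad = torch.matmul(x, w) + b

# Computed normally, then detached from the graph
z = torch.matmul(x, w) + b
z_det = z.detach()

print(z.grad_fn is not None)             # True: z is part of the graph
print(z_no_grad.grad_fn is None)         # True: nothing was recorded
print(z_det.grad_fn is None)             # True: detached from the graph
print(z_det.data_ptr() == z.data_ptr())  # True: detach() shares storage, it does not copy
```

Since `z_det` shares memory with `z`, in-place changes to one are visible in the other; call `z.detach().clone()` when an independent copy is needed.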
