### Creating Parameters

What distinguishes a tensor used for training data from a tensor used as a trainable parameter/weight?

The latter requires the computation of its gradients, so that we can update its values (the parameter's values, that is). That is what the requires_grad=True argument is for: it tells PyTorch to compute gradients for us.

> A tensor for a learnable parameter requires a gradient!
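
To make the distinction concrete, here is a minimal sketch. The names x_train and y_hat and the sample values are made up for illustration; they are not part of the original notebook.

```python
import torch

# A data tensor: by default it does not track gradients
x_train = torch.tensor([0.5, 1.0, 1.5])

# A parameter tensor: we explicitly ask PyTorch to track gradients
w = torch.randn(1, requires_grad=True, dtype=torch.float)

print(x_train.requires_grad)  # False
print(w.requires_grad)        # True

# Any result computed from a gradient-requiring tensor requires gradients too
y_hat = w * x_train
print(y_hat.requires_grad)    # True
```

The flag is off unless we ask for it, which is why data tensors need no special treatment.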

In [1]:
import torch

# FIRST
# Initializes parameters "b" and "w" randomly, ALMOST as we
# did in NumPy. Since we want to apply gradient descent on
# these parameters, we need to set requires_grad=True
torch.manual_seed(42)
b = torch.randn(1, requires_grad=True, dtype=torch.float)
w = torch.randn(1, requires_grad=True, dtype=torch.float)
print(b, w)

tensor([0.3367], requires_grad=True) tensor([0.1288], requires_grad=True)


In [2]:


device = 'cuda' if torch.cuda.is_available() else 'cpu'
# SECOND
# But what if we want to run it on a GPU? We could just 
# send them to device, right?
torch.manual_seed(42)
b = torch.randn(1, requires_grad=True, dtype=torch.float).to(device)
w = torch.randn(1, requires_grad=True, dtype=torch.float).to(device)
print(b, w)
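# Sorry, but NO! On a GPU, to(device) returns a NEW tensor that
# "shadows" the gradient: the result is no longer the original
# parameter we asked PyTorch to track...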

tensor([0.3367], requires_grad=True) tensor([0.1288], requires_grad=True)
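
One way to see what goes wrong in the second attempt is to check the is_leaf attribute, which tells whether a tensor is one that accumulates gradients in .grad during the backward pass. The sketch below uses the hypothetical names b_created and b_moved; the outcome depends on whether a GPU is available.

```python
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'

torch.manual_seed(42)
# The tensor we actually created: a leaf that requires gradients
b_created = torch.randn(1, requires_grad=True, dtype=torch.float)
# What the SECOND attempt ends up keeping: the result of to(device)
b_moved = b_created.to(device)

print(b_created.is_leaf)     # True: this is the tensor autograd tracks
print(b_moved is b_created)  # True only if to(device) did not need to copy
print(b_moved.is_leaf)       # False whenever to(device) made a copy
```

On a CPU-only machine, to(device) returns the very same tensor, so nothing looks wrong (which matches the output printed above). On a GPU, the moved tensor is the result of an operation, so it is not a leaf and its .grad attribute would not be populated.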


In [3]:


device = 'cuda' if torch.cuda.is_available() else 'cpu'
# THIRD
# We can either create regular tensors and send them to 
# the device (as we did with our data)
torch.manual_seed(42)
b = torch.randn(1, dtype=torch.float).to(device)
w = torch.randn(1, dtype=torch.float).to(device)
# and THEN set them as requiring gradients...
b.requires_grad_()
w.requires_grad_()
print(b, w)

tensor([0.3367], requires_grad=True) tensor([0.1288], requires_grad=True)
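
Since requires_grad_() modifies the tensor in place and returns it, the third approach can also be written as a single chained expression. This is only a sketch of an equivalent form, with a check that the resulting tensors are leaves on whichever device is in use.

```python
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'

torch.manual_seed(42)
# Create, move to the device, and only THEN turn on gradient tracking
b = torch.randn(1, dtype=torch.float).to(device).requires_grad_()
w = torch.randn(1, dtype=torch.float).to(device).requires_grad_()

# Both tensors are leaves that require gradients, on CPU or GPU
print(b.is_leaf, b.requires_grad)
print(w.is_leaf, w.requires_grad)
```

The order matters: the tensor sent to the device did not require gradients yet, so the tensor living on the device is the one PyTorch tracks.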


In [4]:
import torch

device = 'cuda' if torch.cuda.is_available() else 'cpu'
# FINAL
# We can specify the device at the moment of creation
# RECOMMENDED!

# Step 0 - initializes parameters "b" and "w" randomly
torch.manual_seed(42)
b = torch.randn(1, requires_grad=True, dtype=torch.float, device=device)
w = torch.randn(1, requires_grad=True, dtype=torch.float, device=device)
print(b, w)

tensor([0.3367], requires_grad=True) tensor([0.1288], requires_grad=True)


### AutoGrad Mechanics

In [6]:
print(w.grad)

None
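
The gradient is None because no backward pass has been run yet. As a minimal sketch (the input x and the toy loss below are made up for illustration and are not the model from this notebook), the attribute gets populated once backward() is called on a value computed from the parameter:

```python
import torch

torch.manual_seed(42)
w = torch.randn(1, requires_grad=True, dtype=torch.float)

grad_before = w.grad
print(grad_before)  # None: nothing has been backpropagated yet

# A toy "loss": its derivative with respect to w is simply x
x = torch.tensor([2.0])
loss = (w * x).sum()
loss.backward()

print(w.grad)       # tensor([2.])
```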
