<a href="https://colab.research.google.com/github/matthewreader/continuous-learning/blob/main/books/deep-learning-with-pytorch/Deep_Learning_with_PyTorch_Chapter_6.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

We could call the `forward` method of our model directly, but we should not.  Calling the model instance itself, as in `model(x)`, goes through `__call__`, which runs any registered hooks around `forward`; calling `forward` directly skips them.
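
To make the difference concrete, here is a minimal sketch using a forward hook. The hook function `record_call` and the `calls` list are names made up for this illustration; `register_forward_hook` is the standard `nn.Module` method for attaching a hook.

```python
import torch
import torch.nn as nn

model = nn.Linear(1, 1)
calls = []  # hypothetical log of hook invocations

def record_call(module, inputs, output):
    # Forward hooks receive the module, its positional inputs, and its output.
    calls.append(tuple(output.shape))

handle = model.register_forward_hook(record_call)

x = torch.ones(3, 1)
model(x)          # goes through __call__, so the hook fires
model.forward(x)  # bypasses __call__, so the hook is skipped

handle.remove()   # detach the hook once we are done
print(len(calls))
```

Only the `model(x)` call is recorded, which is why calling the instance is the safer habit.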

In [None]:
import torch
import torch.nn as nn

t_u = [35.7, 55.9, 58.2, 81.9, 56.3, 48.9, 33.9, 21.8, 48.4, 60.4, 68.4]
t_u = torch.tensor(t_u).unsqueeze(1) # Adds the extra dimension at axis 1
t_un = t_u * 0.1

# Three arguments: input size, output size, and bias (True by default)
linear_model = nn.Linear(1, 1)
linear_model(t_un)

tensor([[-2.9243],
        [-4.2698],
        [-4.4230],
        [-6.0016],
        [-4.2964],
        [-3.8035],
        [-2.8044],
        [-1.9984],
        [-3.7702],
        [-4.5695],
        [-5.1024]], grad_fn=<AddmmBackward0>)

In [None]:
linear_model.weight

Parameter containing:
tensor([[-0.6661]], requires_grad=True)

In [None]:
linear_model.bias

Parameter containing:
tensor([-0.5463], requires_grad=True)

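`torch.nn` modules expect the zeroth dimension of their input to be the batch dimension, that is, the number of samples in the batch.  The input has shape B x Nin and the output of a model has shape B x Nout, where B is the batch size, Nin is the number of input features, and Nout is the number of output features.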
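
As a quick shape check, the sketch below passes a batch through a linear layer. The sizes (5 samples, 3 input features, 2 output features) are arbitrary values chosen for illustration and do not come from the temperature data.

```python
import torch
import torch.nn as nn

batch_size, n_in, n_out = 5, 3, 2  # illustrative sizes only
layer = nn.Linear(n_in, n_out)

x = torch.randn(batch_size, n_in)  # B x Nin
y = layer(x)                       # B x Nout

print(x.shape, y.shape)
```

The batch dimension passes through unchanged; only the feature dimension is transformed.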

Why batch inputs?

1) Maximize computing power.  A processing unit (GPU, CPU, etc.) will not be fully utilized if we pass training samples to the model one at a time.
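
Batching changes how the work is submitted, not the result. The following sketch, using three of the normalized temperature readings, checks that one batched call matches three single-sample calls; the variable names `batched` and `one_by_one` are made up for this example.

```python
import torch
import torch.nn as nn

model = nn.Linear(1, 1)
batch = torch.tensor([[3.57], [5.59], [5.82]])  # B x Nin with B = 3

with torch.no_grad():
    # One call on the whole batch.
    batched = model(batch)
    # Three separate calls, each on a batch of size 1, stacked back together.
    one_by_one = torch.cat([model(sample.unsqueeze(0)) for sample in batch])

print(batched.shape)
print(torch.allclose(batched, one_by_one))
```

The outputs agree, so the batched call does the same computation in a single pass.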

In [None]:
t_c = [0.5,  14.0, 15.0, 28.0, 11.0,  8.0,  3.0, -4.0,  6.0, 13.0, 21.0]
t_u = [35.7, 55.9, 58.2, 81.9, 56.3, 48.9, 33.9, 21.8, 48.4, 60.4, 68.4]
t_c = torch.tensor(t_c).unsqueeze(1)
t_u = torch.tensor(t_u).unsqueeze(1)
 
t_u.shape # B x Nin, where B = 11 and N input features = 1

torch.Size([11, 1])

In [None]:
# 1 input, 1 output linear model
linear_model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(
    linear_model.parameters(),
    lr=1e-2)

In [None]:
list(linear_model.parameters())

[Parameter containing:
 tensor([[0.3601]], requires_grad=True), Parameter containing:
 tensor([0.7939], requires_grad=True)]

In [None]:
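def training_loop(n_epochs, optimizer, model, loss_fn, t_u_train, t_u_val,
                  t_c_train, t_c_val):
    for epoch in range(1, n_epochs + 1):
        # Forward pass and loss on the training set
        t_p_train = model(t_u_train)
        loss_train = loss_fn(t_p_train, t_c_train)

        # Forward pass and loss on the validation set (monitoring only)
        t_p_val = model(t_u_val)
        loss_val = loss_fn(t_p_val, t_c_val)

        # Clear old gradients, backpropagate the training loss, update parameters
        optimizer.zero_grad()
        loss_train.backward()
        optimizer.step()

        # Report on the first epoch and every 1000 epochs after that
        if epoch == 1 or epoch % 1000 == 0:
            print(f"Epoch {epoch}, Training loss {loss_train.item():.4f},"
                  f" Validation loss {loss_val.item():.4f}")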
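
One way the loop might be invoked is sketched below. The block restates the loop so it runs on its own. The shuffled split with two held-out samples, the seed, the choice of `nn.MSELoss`, and the names `train_idx`, `val_idx`, `initial_loss`, and `final_loss` are all assumptions made for this illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # only to make the shuffled split repeatable

t_c = torch.tensor([0.5, 14.0, 15.0, 28.0, 11.0, 8.0, 3.0, -4.0, 6.0, 13.0, 21.0]).unsqueeze(1)
t_u = torch.tensor([35.7, 55.9, 58.2, 81.9, 56.3, 48.9, 33.9, 21.8, 48.4, 60.4, 68.4]).unsqueeze(1)
t_un = 0.1 * t_u  # same input scaling as earlier in the notebook

# Assumed split: hold out 2 shuffled samples for validation.
n_val = 2
shuffled = torch.randperm(t_un.shape[0])
train_idx, val_idx = shuffled[:-n_val], shuffled[-n_val:]

model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()  # assumed loss function

def training_loop(n_epochs, optimizer, model, loss_fn, t_u_train, t_u_val,
                  t_c_train, t_c_val):
    for epoch in range(1, n_epochs + 1):
        loss_train = loss_fn(model(t_u_train), t_c_train)
        loss_val = loss_fn(model(t_u_val), t_c_val)

        optimizer.zero_grad()
        loss_train.backward()
        optimizer.step()

        if epoch == 1 or epoch % 1000 == 0:
            print(f"Epoch {epoch}, Training loss {loss_train.item():.4f},"
                  f" Validation loss {loss_val.item():.4f}")

# Training loss before any update, for comparison afterwards.
with torch.no_grad():
    initial_loss = loss_fn(model(t_un[train_idx]), t_c[train_idx]).item()

training_loop(
    n_epochs=3000,
    optimizer=optimizer,
    model=model,
    loss_fn=loss_fn,
    t_u_train=t_un[train_idx],
    t_u_val=t_un[val_idx],
    t_c_train=t_c[train_idx],
    t_c_val=t_c[val_idx])

with torch.no_grad():
    final_loss = loss_fn(model(t_un[train_idx]), t_c[train_idx]).item()

print(f"Training loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

The exact numbers depend on the random initialization and the split, so none are quoted here.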

Creating a simple `Sequential` model

In [None]:
seq_model = nn.Sequential(
            nn.Linear(1, 13),
            nn.Tanh(),
            nn.Linear(13, 1))
seq_model

Sequential(
  (0): Linear(in_features=1, out_features=13, bias=True)
  (1): Tanh()
  (2): Linear(in_features=13, out_features=1, bias=True)
)
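
`nn.Sequential` feeds the output of each module into the next one. As a rough check, the sketch below compares a call on the whole container with applying the three modules by hand; the input values and the intermediate names `hidden` and `activated` are chosen only for illustration.

```python
import torch
import torch.nn as nn

seq_model = nn.Sequential(
    nn.Linear(1, 13),
    nn.Tanh(),
    nn.Linear(13, 1))

x = torch.tensor([[3.57], [5.59], [5.82]])  # B x 1

with torch.no_grad():
    out_sequential = seq_model(x)

    # The same computation, one module at a time.
    hidden = seq_model[0](x)          # B x 13
    activated = seq_model[1](hidden)  # B x 13, values squashed into (-1, 1)
    out_manual = seq_model[2](activated)  # B x 1

print(hidden.shape, out_manual.shape)
print(torch.allclose(out_sequential, out_manual))
```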

Collecting the weights and bias of the linear modules and inspecting their shape:

In [None]:
[param.shape for param in seq_model.parameters()]

[torch.Size([13, 1]), torch.Size([13]), torch.Size([1, 13]), torch.Size([1])]

In [None]:
for name, param in seq_model.named_parameters():
    print(name, param.shape)

0.weight torch.Size([13, 1])
0.bias torch.Size([13])
2.weight torch.Size([1, 13])
2.bias torch.Size([1])
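
The shapes follow from `nn.Linear` storing its weight as `(out_features, in_features)` and its bias as `(out_features,)`. A short sketch that counts the trainable values per parameter (the names `counts` and `total` are made up for this example):

```python
import torch.nn as nn

seq_model = nn.Sequential(
    nn.Linear(1, 13),
    nn.Tanh(),
    nn.Linear(13, 1))

# Number of scalar values in each named parameter tensor.
counts = {name: param.numel() for name, param in seq_model.named_parameters()}
total = sum(counts.values())

print(counts)
print(total)  # 13 + 13 + 13 + 1 = 40
```

`nn.Tanh` has no parameters, which is why index 1 does not appear in the listing.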


Each layer can be named by passing an `OrderedDict` to `nn.Sequential`:

In [None]:
from collections import OrderedDict

seq_model = nn.Sequential(OrderedDict([
    ('hidden_linear', nn.Linear(1, 8)),
    ('hidden_activation', nn.Tanh()),
    ('output_linear', nn.Linear(8, 1))
]))

seq_model

Sequential(
  (hidden_linear): Linear(in_features=1, out_features=8, bias=True)
  (hidden_activation): Tanh()
  (output_linear): Linear(in_features=8, out_features=1, bias=True)
)

In [None]:
# We now have more explanatory names for submodules.
for name, param in seq_model.named_parameters():
    print(name, param.shape)

hidden_linear.weight torch.Size([8, 1])
hidden_linear.bias torch.Size([8])
output_linear.weight torch.Size([1, 8])
output_linear.bias torch.Size([1])
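
Named submodules can also be reached as attributes of the container, which is handy for inspecting a single layer. A minimal sketch, rebuilding the same model so the block runs on its own:

```python
from collections import OrderedDict

import torch.nn as nn

seq_model = nn.Sequential(OrderedDict([
    ('hidden_linear', nn.Linear(1, 8)),
    ('hidden_activation', nn.Tanh()),
    ('output_linear', nn.Linear(8, 1))
]))

# Attribute access by layer name.
print(seq_model.output_linear.bias.shape)
print(seq_model.hidden_linear.weight.shape)

# Integer indexing still works and returns the same module object.
print(seq_model[0] is seq_model.hidden_linear)
```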
