## Training Pipeline
- Design model (define input / output size, define `forward()` pass)
- Construct the `loss` and the `optimizer`
- Training loop:
  - `forward()` pass: compute prediction
  - calculate loss from predicted output
  - `backward()` pass: compute gradients
  - `optimizer.step()`: update weights

This time, we do every step with PyTorch.

In [1]:
import torch
import torch.nn as nn

The training inputs and outputs should be 2D tensors.

Each **row** is one **sample**.

Each **column** is one **feature** of that sample.

In the following case, there are four samples in `X_train`, and each sample has only one feature.
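
As a quick check of that layout, here is a minimal sketch that turns a flat 1D tensor into the `(n_samples, n_features)` shape. The names `flat` and `X` are illustrative and are not part of the notebook.

```python
import torch

# A flat 1D tensor: 4 values, no explicit feature dimension
flat = torch.tensor([1, 2, 3, 4], dtype=torch.float32)
print(flat.shape)  # torch.Size([4])

# Reshape to 4 rows (samples) x 1 column (feature);
# -1 lets PyTorch infer the number of rows
X = flat.view(-1, 1)
print(X.shape)  # torch.Size([4, 1])

n_samples, n_features = X.shape
```

Unpacking `X.shape` this way only works once the tensor is 2D, which is one practical reason to keep the rows-are-samples layout.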

In [5]:
# training set
X_train = torch.tensor([[1],
                        [2], 
                        [3], 
                        [4]], dtype=torch.float32)

Y_train = torch.tensor([[2],
                        [4], 
                        [6], 
                        [8]], dtype=torch.float32)

# test set
X_test = torch.tensor([5], dtype=torch.float32)

In [6]:
n_samples, n_features = X_train.shape
n_samples, n_features

(4, 1)

This time we don't set an initial value for `w` ourselves. PyTorch initializes it for us.
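
To see what that initialization looks like, the sketch below builds a standalone `nn.Linear` layer and inspects its parameters. The variable names and the seed value `0` are assumptions made for illustration; the notebook itself does not set a seed, which is why its printed numbers will differ from run to run.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only to make this sketch repeatable
layer = nn.Linear(1, 1)

# weight has shape (out_features, in_features); bias has shape (out_features,)
print(layer.weight.shape, layer.bias.shape)

# Both are created with requires_grad=True, so autograd tracks them
print(layer.weight.requires_grad, layer.bias.requires_grad)

# Re-seeding and rebuilding the layer reproduces the same initial values
torch.manual_seed(0)
layer_again = nn.Linear(1, 1)
same_init = (torch.equal(layer.weight, layer_again.weight)
             and torch.equal(layer.bias, layer_again.bias))
print(same_init)
```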

#### Step 1: Design Model

In [7]:
input_size = n_features
output_size = Y_train.shape[1]  # one output value per sample

model = nn.Linear(input_size, output_size)

In [9]:
print(f"Prediction before training: f(5) = {model(X_test).item():.3f}")

Prediction before training: f(5) = 1.661
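
The pipeline mentions defining a `forward()` pass, which `nn.Linear` already provides. For comparison, here is a hedged sketch of the same model written as a custom `nn.Module`; the class name `LinearRegression` and the variable `custom_model` are hypothetical and do not appear elsewhere in the notebook.

```python
import torch
import torch.nn as nn

class LinearRegression(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, input_size, output_size):
        super().__init__()
        # A single linear layer: y = x @ W^T + b
        self.linear = nn.Linear(input_size, output_size)

    def forward(self, x):
        # Calling the module (custom_model(x)) runs this method
        return self.linear(x)

custom_model = LinearRegression(1, 1)
out = custom_model(torch.tensor([[1.0], [2.0]]))
print(out.shape)  # torch.Size([2, 1])
```

For a single layer the wrapper adds nothing, but the same pattern extends to models with several layers.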


In [11]:
learning_rate = 0.01
epochs = 100

#### Step 2: Construct `loss` and `optimizer`

In [16]:
loss = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
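
To make the loss concrete, the following sketch compares `nn.MSELoss` (with its default `mean` reduction) against the formula written out by hand. The tensors `pred` and `target` are made-up values, not the notebook's data.

```python
import torch
import torch.nn as nn

pred = torch.tensor([[1.0], [2.0], [3.0]])    # made-up predictions
target = torch.tensor([[2.0], [4.0], [6.0]])  # made-up targets

mse = nn.MSELoss()
builtin = mse(pred, target)             # convention: (prediction, target)
manual = ((pred - target) ** 2).mean()  # mean of the squared errors

print(builtin.item(), manual.item())
```

The errors here are 1, 2 and 3, so both values come out to (1 + 4 + 9) / 3.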

#### Step 3: Training Loop
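
Before the loop itself, here is a small sketch of what a single update does. Assuming plain SGD (no momentum or weight decay, as configured above), `optimizer.step()` should amount to `param - lr * grad`. The sketch also shows why the gradients are reset each iteration: `backward()` adds to the stored gradient rather than replacing it. The names `net`, `opt`, `X` and `Y` are illustrative stand-ins for the notebook's objects.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed for repeatability
X = torch.tensor([[1.0], [2.0], [3.0], [4.0]])
Y = 2 * X
lr = 0.01

net = nn.Linear(1, 1)
opt = torch.optim.SGD(net.parameters(), lr=lr)
criterion = nn.MSELoss()

# One forward and backward pass
criterion(net(X), Y).backward()

# What plain SGD should produce: param - lr * grad
expected_w = net.weight.detach() - lr * net.weight.grad
expected_b = net.bias.detach() - lr * net.bias.grad

opt.step()
step_matches = (torch.allclose(net.weight.detach(), expected_w)
                and torch.allclose(net.bias.detach(), expected_b))

# Without zero_grad(), a second backward() adds to the stored gradient
grad_once = net.weight.grad.clone()
criterion(net(X), Y).backward()
accumulated = not torch.equal(net.weight.grad, grad_once)

# zero_grad() clears the gradients (to None or to zeros, depending on version)
opt.zero_grad()
cleared = net.weight.grad is None or torch.count_nonzero(net.weight.grad).item() == 0

print(step_matches, accumulated, cleared)
```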

In [17]:
for epoch in range(epochs):
    # forward pass to calculate y_pred
    y_pred = model(X_train)

    # calculate loss (nn.MSELoss takes the prediction first, then the target)
    l = loss(y_pred, Y_train)

    # backward pass (calculate gradients)
    l.backward()

    # optimize params
    optimizer.step()

    # reset the gradients
    optimizer.zero_grad()

    # print the params
    [w, b] = model.parameters()
    print(f"epoch {epoch + 1}: w = {w[0][0].item():.3f}, loss = {l:.5f}")

print(f"Prediction after training: f(5) = {model(X_test).item():.3f}")

epoch 1: w = 0.669, loss = 22.22612
epoch 2: w = 0.884, loss = 15.42308
epoch 3: w = 1.063, loss = 10.70259
epoch 4: w = 1.213, loss = 7.42714
epoch 5: w = 1.337, loss = 5.15437
epoch 6: w = 1.441, loss = 3.57733
epoch 7: w = 1.527, loss = 2.48306
epoch 8: w = 1.599, loss = 1.72376
epoch 9: w = 1.659, loss = 1.19689
epoch 10: w = 1.709, loss = 0.83130
epoch 11: w = 1.750, loss = 0.57763
epoch 12: w = 1.785, loss = 0.40160
epoch 13: w = 1.814, loss = 0.27946
epoch 14: w = 1.838, loss = 0.19470
epoch 15: w = 1.858, loss = 0.13588
epoch 16: w = 1.875, loss = 0.09506
epoch 17: w = 1.889, loss = 0.06674
epoch 18: w = 1.901, loss = 0.04708
epoch 19: w = 1.910, loss = 0.03343
epoch 20: w = 1.918, loss = 0.02396
epoch 21: w = 1.925, loss = 0.01738
epoch 22: w = 1.931, loss = 0.01281
epoch 23: w = 1.936, loss = 0.00963
epoch 24: w = 1.940, loss = 0.00743
epoch 25: w = 1.943, loss = 0.00589
epoch 26: w = 1.946, loss = 0.00482
epoch 27: w = 1.948, loss = 0.00407
epoch 28: w = 1.950, loss = 0.0035