# Chapter 5 Exercise
### 라티나 아스투티 2332036006

Redefine the model to be:
   ```
   w2 * t_u ** 2 + w1 * t_u + b
   ```

In [None]:
%matplotlib inline
import numpy as np
import torch
import torch.optim as optim
torch.set_printoptions(edgeitems=2)

In [None]:
t_c = torch.tensor([0.5, 14.0, 15.0, 28.0, 11.0, 8.0,
                    3.0, -4.0, 6.0, 13.0, 21.0])
t_u = torch.tensor([35.7, 55.9, 58.2, 81.9, 56.3, 48.9,
                    33.9, 21.8, 48.4, 60.4, 68.4])
t_un = 0.1 * t_u

We need to change the model here:

In [None]:
def model(t_u, w1, w2, b):
  return w2 * t_u ** 2 + w1 * t_u + b
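
Because the model takes `w1` before `w2`, the order of the values inside `params` matters once it is unpacked with `*params`. The following is a small sanity check, with hypothetical parameter values picked only to make the argument order visible:

```python
import torch

def model(t_u, w1, w2, b):
    # Same quadratic model as above: w2 scales the squared term, w1 the linear term.
    return w2 * t_u ** 2 + w1 * t_u + b

# Hypothetical values, not taken from the exercise.
params = torch.tensor([1.0, 3.0, 0.5])  # read as [w1, w2, b]
t = torch.tensor([2.0])

# *params unpacks positionally: params[0] -> w1, params[1] -> w2, params[2] -> b.
out = model(t, *params)
print(out)  # 3.0 * 2**2 + 1.0 * 2 + 0.5 = 14.5
```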

In [None]:
def loss_fn(t_p, t_c):
  squared_diffs = (t_p - t_c)**2
  return squared_diffs.mean()
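
For this model and loss, the gradients can also be written by hand, which gives a quick way to check what autograd computes. The sketch below assumes the same data, model, and loss as in this notebook; the name `manual_grad` is introduced here only for illustration.

```python
import torch

t_c = torch.tensor([0.5, 14.0, 15.0, 28.0, 11.0, 8.0,
                    3.0, -4.0, 6.0, 13.0, 21.0])
t_un = 0.1 * torch.tensor([35.7, 55.9, 58.2, 81.9, 56.3, 48.9,
                           33.9, 21.8, 48.4, 60.4, 68.4])

def model(t_u, w1, w2, b):
    return w2 * t_u ** 2 + w1 * t_u + b

def loss_fn(t_p, t_c):
    return ((t_p - t_c) ** 2).mean()

# Gradients from autograd.
params = torch.tensor([1.0, 1.0, 0.0], requires_grad=True)
loss = loss_fn(model(t_un, *params), t_c)
loss.backward()

# Gradients by hand: d(loss)/d(param) = mean(2 * error * d(prediction)/d(param)).
with torch.no_grad():
    err = model(t_un, *params) - t_c
    manual_grad = torch.stack([
        (2 * err * t_un).mean(),       # d(loss)/d(w1)
        (2 * err * t_un ** 2).mean(),  # d(loss)/d(w2)
        (2 * err).mean(),              # d(loss)/d(b)
    ])

print(params.grad)
print(manual_grad)
```

The gradient for `w2` carries the extra factor `t_un ** 2`, so it tends to be much larger than the others.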

In [None]:
def training_loop(n_epochs, optimizer, params, t_u, t_c):
  for epoch in range(1, n_epochs + 1):
    t_p = model(t_u, *params)
    loss = loss_fn(t_p, t_c)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if epoch % 500 == 0:
      print('Epoch %d, Loss %f' % (epoch, float(loss)))
  
  return params

We also need to define initial values for w1, w2, and b in `params`.

In this case, the initial values are:

w1 = 1.0

w2 = 1.0

b = 0.0

After several trials, the learning rate had to be set to 1e-4 (or lower) to get a usable loss value.

In [None]:
params = torch.tensor([1.0, 1.0, 0.0], requires_grad=True)
learning_rate = 1e-4
optimizer = optim.SGD([params], lr=learning_rate)
training_loop(
    n_epochs = 5000,
    optimizer = optimizer,
    params = params,
    t_u = t_un,
    t_c = t_c
)

Epoch 500, Loss 10.708596
Epoch 1000, Loss 8.642083
Epoch 1500, Loss 7.171005
Epoch 2000, Loss 6.123478
Epoch 2500, Loss 5.377227
Epoch 3000, Loss 4.845286
Epoch 3500, Loss 4.465788
Epoch 4000, Loss 4.194724
Epoch 4500, Loss 4.000802
Epoch 5000, Loss 3.861744


tensor([-0.8881,  0.5570, -0.8753], requires_grad=True)
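
The sensitivity to the learning rate can be checked with a small sweep. This is a minimal sketch, assuming the same data, initial values, and SGD setup as above; the helper `final_loss` and the set of rates tried are illustrative choices, not part of the original exercise.

```python
import math
import torch
import torch.optim as optim

t_c = torch.tensor([0.5, 14.0, 15.0, 28.0, 11.0, 8.0,
                    3.0, -4.0, 6.0, 13.0, 21.0])
t_un = 0.1 * torch.tensor([35.7, 55.9, 58.2, 81.9, 56.3, 48.9,
                           33.9, 21.8, 48.4, 60.4, 68.4])

def model(t_u, w1, w2, b):
    return w2 * t_u ** 2 + w1 * t_u + b

def loss_fn(t_p, t_c):
    return ((t_p - t_c) ** 2).mean()

def final_loss(learning_rate, n_epochs=500):
    # Hypothetical helper: train from the same start and report the last loss.
    params = torch.tensor([1.0, 1.0, 0.0], requires_grad=True)
    optimizer = optim.SGD([params], lr=learning_rate)
    for _ in range(n_epochs):
        loss = loss_fn(model(t_un, *params), t_c)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return float(loss)

results = {lr: final_loss(lr) for lr in (1e-2, 1e-3, 1e-4)}
for lr, loss in results.items():
    status = "finite" if math.isfinite(loss) else "diverged (inf/nan)"
    print(f"lr={lr:g}: loss={loss} -> {status}")
```

A loss of `inf` or `nan` means the updates overshot and the parameters blew up, which is the usual sign that the learning rate is too large for the squared term.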

In [None]:
n_samples = t_u.shape[0]
n_val = int(0.2 * n_samples)

shuffled_indices = torch.randperm(n_samples)

train_indices = shuffled_indices[:-n_val]
val_indices = shuffled_indices[-n_val:]

train_indices, val_indices

(tensor([ 5,  4,  3,  1,  8,  2, 10,  7,  9]), tensor([0, 6]))
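
`torch.randperm` draws a new permutation on every run, so the split shown above will generally differ between runs. If a reproducible split is wanted, one option is to pass a seeded generator. This is a sketch; the seed value `0` is an arbitrary assumption.

```python
import torch

n_samples = 11  # number of measurements in t_u / t_c
n_val = int(0.2 * n_samples)

# A dedicated generator with a fixed seed makes the permutation repeatable.
gen = torch.Generator().manual_seed(0)
shuffled_indices = torch.randperm(n_samples, generator=gen)

train_indices = shuffled_indices[:-n_val]
val_indices = shuffled_indices[-n_val:]

# Re-creating the generator with the same seed yields the same permutation.
repeat = torch.randperm(n_samples, generator=torch.Generator().manual_seed(0))
print(train_indices, val_indices, torch.equal(shuffled_indices, repeat))
```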

In [None]:
train_t_u = t_u[train_indices]
train_t_c = t_c[train_indices]

val_t_u = t_u[val_indices]
val_t_c = t_c[val_indices]

train_t_un = 0.1 * train_t_u
val_t_un = 0.1 * val_t_u

In [None]:
def training_loop(n_epochs, optimizer, params, train_t_u, val_t_u,
                  train_t_c, val_t_c):
    for epoch in range(1, n_epochs + 1):
        train_t_p = model(train_t_u, *params)
        train_loss = loss_fn(train_t_p, train_t_c)

        # Validation is only monitored, so no autograd graph is needed for it.
        with torch.no_grad():
            val_t_p = model(val_t_u, *params)
            val_loss = loss_fn(val_t_p, val_t_c)

        optimizer.zero_grad()
        train_loss.backward()  # gradients come from the training loss only
        optimizer.step()

        if epoch <= 3 or epoch % 500 == 0:
            print(f"Epoch {epoch}, Training loss {train_loss.item():.4f},"
                  f" Validation loss {val_loss.item():.4f}")

    return params

In [None]:
params = torch.tensor([1.0, 1.0, 0.0], requires_grad=True)
learning_rate = 1e-4
optimizer = optim.SGD([params], lr=learning_rate)

training_loop(
    n_epochs = 5000, 
    optimizer = optimizer,
    params = params,
    train_t_u = train_t_un,
    val_t_u = val_t_un,
    train_t_c = train_t_c,
    val_t_c = val_t_c)

Epoch 1, Training loss 782.4936, Validation loss 195.6477
Epoch 2, Training loss 411.6089, Validation loss 129.8466
Epoch 3, Training loss 219.1735, Validation loss 90.8555
Epoch 500, Training loss 9.7347, Validation loss 18.2875
Epoch 1000, Training loss 8.2693, Validation loss 14.7293
Epoch 1500, Training loss 7.1645, Validation loss 12.0038
Epoch 2000, Training loss 6.3311, Validation loss 9.9114
Epoch 2500, Training loss 5.7021, Validation loss 8.3013
Epoch 3000, Training loss 5.2271, Validation loss 7.0589
Epoch 3500, Training loss 4.8681, Validation loss 6.0975
Epoch 4000, Training loss 4.5963, Validation loss 5.3511
Epoch 4500, Training loss 4.3902, Validation loss 4.7697
Epoch 5000, Training loss 4.2337, Validation loss 4.3151


tensor([-0.7012,  0.5288, -0.8004], requires_grad=True)

In [None]:
params = torch.tensor([1.0, 1.0, 0.0], requires_grad=True)
learning_rate = 1e-1
optimizer = optim.Adam([params], lr=learning_rate)

training_loop(
    n_epochs = 5000, 
    optimizer = optimizer,
    params = params,
    train_t_u = train_t_u,
    val_t_u = val_t_u,
    train_t_c = train_t_c,
    val_t_c = val_t_c)

Epoch 1, Training loss 13966250.0000, Validation loss 1553974.0000
Epoch 2, Training loss 11302069.0000, Validation loss 1258117.5000
Epoch 3, Training loss 8928869.0000, Validation loss 994507.0625
Epoch 500, Training loss 4.9185, Validation loss 6.5057
Epoch 1000, Training loss 4.1196, Validation loss 4.2992
Epoch 1500, Training loss 3.7427, Validation loss 3.1047
Epoch 2000, Training loss 3.6265, Validation loss 2.6625
Epoch 2500, Training loss 3.5805, Validation loss 2.5331
Epoch 3000, Training loss 3.5363, Validation loss 2.4998
Epoch 3500, Training loss 3.4814, Validation loss 2.4896
Epoch 4000, Training loss 3.4132, Validation loss 2.4826
Epoch 4500, Training loss 3.3294, Validation loss 2.4757
Epoch 5000, Training loss 3.2283, Validation loss 2.4691


tensor([-0.0602,  0.0056, -2.6406], requires_grad=True)
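
The Adam run above uses the unnormalized inputs (`train_t_u`, not `train_t_un`), which explains the very large losses in the first epochs. Adam scales each parameter's step individually, so it tends to tolerate badly scaled inputs better than plain SGD. The sketch below compares the two on the full, unnormalized data; the helper `fit` and the choice of 1000 epochs are assumptions made for illustration.

```python
import math
import torch
import torch.optim as optim

t_c = torch.tensor([0.5, 14.0, 15.0, 28.0, 11.0, 8.0,
                    3.0, -4.0, 6.0, 13.0, 21.0])
t_u = torch.tensor([35.7, 55.9, 58.2, 81.9, 56.3, 48.9,
                    33.9, 21.8, 48.4, 60.4, 68.4])  # raw, not multiplied by 0.1

def model(t_u, w1, w2, b):
    return w2 * t_u ** 2 + w1 * t_u + b

def loss_fn(t_p, t_c):
    return ((t_p - t_c) ** 2).mean()

def fit(make_optimizer, n_epochs=1000):
    # Hypothetical helper: same start for every optimizer, returns the last loss.
    params = torch.tensor([1.0, 1.0, 0.0], requires_grad=True)
    optimizer = make_optimizer([params])
    for _ in range(n_epochs):
        loss = loss_fn(model(t_u, *params), t_c)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return float(loss)

sgd_loss = fit(lambda p: optim.SGD(p, lr=1e-4))
adam_loss = fit(lambda p: optim.Adam(p, lr=1e-1))
print(f"SGD  (lr=1e-4): {sgd_loss}")
print(f"Adam (lr=1e-1): {adam_loss}")
```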

After 5000 epochs, the resulting loss is higher with the new model than with the previous one. In the earliest epochs, the validation loss was also lower than the training loss.
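
One way to put this comparison in context: the quadratic model contains the linear one as the special case `w2 = 0`, so its best achievable mean squared error on the same data cannot be higher. A higher loss after a fixed number of epochs may therefore reflect slow convergence rather than a worse fit. The sketch below computes both optima in closed form on the full normalized data, assuming `torch.linalg.lstsq` is available (PyTorch 1.9 or later); `best_mse` is a hypothetical helper.

```python
import torch

t_c = torch.tensor([0.5, 14.0, 15.0, 28.0, 11.0, 8.0,
                    3.0, -4.0, 6.0, 13.0, 21.0], dtype=torch.float64)
t_un = 0.1 * torch.tensor([35.7, 55.9, 58.2, 81.9, 56.3, 48.9,
                           33.9, 21.8, 48.4, 60.4, 68.4], dtype=torch.float64)

def best_mse(columns):
    # Solve the least-squares problem A @ coeffs ~= t_c and return the MSE.
    A = torch.stack(columns, dim=1)
    target = t_c.unsqueeze(1)
    coeffs = torch.linalg.lstsq(A, target).solution
    return ((A @ coeffs - target) ** 2).mean().item()

ones = torch.ones_like(t_un)
lin_loss = best_mse([t_un, ones])              # w * t_u + b
quad_loss = best_mse([t_un ** 2, t_un, ones])  # w2 * t_u**2 + w1 * t_u + b

print(f"best linear MSE:    {lin_loss:.4f}")
print(f"best quadratic MSE: {quad_loss:.4f}")
```

Comparing these values with the losses printed during training gives a rough idea of how far each run still is from its optimum.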