
# Chapter 5 Exercise
### 라티나 아스투티 2332036006

Redefine the model to be:
   ```
   w2 * t_u ** 2 + w1 * t_u + b
   ```

In [None]:
%matplotlib inline
import numpy as np
import torch
import torch.optim as optim
torch.set_printoptions(edgeitems=2)

In [None]:
t_c = torch.tensor([0.5, 14.0, 15.0, 28.0, 11.0, 8.0,
                    3.0, -4.0, 6.0, 13.0, 21.0])
t_u = torch.tensor([35.7, 55.9, 58.2, 81.9, 56.3, 48.9,
                    33.9, 21.8, 48.4, 60.4, 68.4])
t_un = 0.1 * t_u

We need to change the model here:

In [None]:
def model(t_u, w1, w2, b):
    # Quadratic model: w2 scales the squared term, w1 the linear term, b is the bias.
    return w2 * t_u ** 2 + w1 * t_u + b
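
As a quick sanity check of the argument order, the sketch below evaluates the quadratic model on two hand-picked inputs and confirms that unpacking a three-element tensor with `*params` maps to `w1`, `w2`, `b` in that order. The input values and the names `x`, `y`, and `y_from_params` are illustrative and not part of the exercise.

```python
import torch

def model(t_u, w1, w2, b):
    # Same quadratic model as in the exercise.
    return w2 * t_u ** 2 + w1 * t_u + b

# Hypothetical inputs chosen so the result is easy to verify by hand.
x = torch.tensor([1.0, 2.0])
y = model(x, w1=1.0, w2=1.0, b=0.0)
print(y)  # tensor([2., 6.]) since 1 + 1 + 0 = 2 and 4 + 2 + 0 = 6

# Unpacking a 1-D tensor yields its elements in order: w1, w2, b.
params = torch.tensor([1.0, 1.0, 0.0])
y_from_params = model(x, *params)
```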

In [None]:
def loss_fn(t_p, t_c):
    # Mean squared error between predictions t_p and targets t_c.
    squared_diffs = (t_p - t_c) ** 2
    return squared_diffs.mean()
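
A small worked example may help to see what the loss computes. The tensors below are made-up values, not the temperature data: the differences are 1 and 2, their squares are 1 and 4, and the mean is 2.5.

```python
import torch

def loss_fn(t_p, t_c):
    # Mean squared error, as defined in the exercise.
    squared_diffs = (t_p - t_c) ** 2
    return squared_diffs.mean()

# Hypothetical predictions and targets for a hand-checkable result.
t_p_example = torch.tensor([1.0, 3.0])
t_c_example = torch.tensor([0.0, 1.0])
loss_example = loss_fn(t_p_example, t_c_example)
print(float(loss_example))  # 2.5
```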

In [None]:
def training_loop(n_epochs, optimizer, params, t_u, t_c):
    for epoch in range(1, n_epochs + 1):
        # Forward pass: predict and measure the error.
        t_p = model(t_u, *params)
        loss = loss_fn(t_p, t_c)

        # Backward pass: clear stale gradients, backpropagate, update params.
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if epoch % 500 == 0:
            print('Epoch %d, Loss %f' % (epoch, float(loss)))

    return params

We also need to define initial values for `w1`, `w2`, and `b` in `params`.

In this case, the initial values are:

w1 = 1.0

w2 = 1.0

b = 0.0

After several trials, the learning rate had to be set to 1e-4 (or lower) to get a finite loss.
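
The effect of the learning rate can be sketched by training briefly with two candidate values and checking whether the final loss is still a finite number. This is a minimal sketch that reuses the data, model, and initial values from this notebook. The helper `final_loss`, the 200-epoch budget, and the choice of 1e-2 as the "too large" rate are assumptions made for illustration.

```python
import math
import torch
import torch.optim as optim

t_c = torch.tensor([0.5, 14.0, 15.0, 28.0, 11.0, 8.0,
                    3.0, -4.0, 6.0, 13.0, 21.0])
t_u = torch.tensor([35.7, 55.9, 58.2, 81.9, 56.3, 48.9,
                    33.9, 21.8, 48.4, 60.4, 68.4])
t_un = 0.1 * t_u

def model(t_u, w1, w2, b):
    return w2 * t_u ** 2 + w1 * t_u + b

def loss_fn(t_p, t_c):
    return ((t_p - t_c) ** 2).mean()

def final_loss(learning_rate, n_epochs=200):
    # Hypothetical helper: train from the same initial values and
    # return the loss computed in the last epoch.
    params = torch.tensor([1.0, 1.0, 0.0], requires_grad=True)
    optimizer = optim.SGD([params], lr=learning_rate)
    loss = None
    for _ in range(n_epochs):
        loss = loss_fn(model(t_un, *params), t_c)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return float(loss)

results = {lr: final_loss(lr) for lr in (1e-2, 1e-4)}
for lr, loss in results.items():
    status = "finite" if math.isfinite(loss) else "diverged"
    print(f"lr={lr:g}: final loss {loss} ({status})")
```

With the squared term in the model, the gradients are much larger than in a purely linear fit, so the larger rate should overshoot until the loss overflows to `inf` or `nan`, while 1e-4 stays stable.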

In [None]:
params = torch.tensor([1.0, 1.0, 0.0], requires_grad=True)
learning_rate = 1e-4
optimizer = optim.SGD([params], lr=learning_rate)
training_loop(
    n_epochs = 5000,
    optimizer = optimizer,
    params = params,
    t_u = t_un,
    t_c = t_c
)

Epoch 500, Loss 10.708596
Epoch 1000, Loss 8.642083
Epoch 1500, Loss 7.171005
Epoch 2000, Loss 6.123478
Epoch 2500, Loss 5.377227
Epoch 3000, Loss 4.845286
Epoch 3500, Loss 4.465788
Epoch 4000, Loss 4.194724
Epoch 4500, Loss 4.000802
Epoch 5000, Loss 3.861744


tensor([-0.8881,  0.5570, -0.8753], requires_grad=True)

In [None]:
n_samples = t_u.shape[0]
n_val = int(0.2 * n_samples)

shuffled_indices = torch.randperm(n_samples)

train_indices = shuffled_indices[:-n_val]
val_indices = shuffled_indices[-n_val:]

train_indices, val_indices

(tensor([ 9,  4, 10,  8,  0,  3,  7,  5,  1]), tensor([6, 2]))

In [None]:
train_t_u = t_u[train_indices]
train_t_c = t_c[train_indices]

val_t_u = t_u[val_indices]
val_t_c = t_c[val_indices]

train_t_un = 0.1 * train_t_u
val_t_un = 0.1 * val_t_u
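
Because `torch.randperm` draws a new permutation on every run, the split above changes each time the cell is executed. If a repeatable split is wanted, one option is to pass a seeded `torch.Generator`. The sketch below assumes the same 11 samples and 20% validation share. The seed value 0 is arbitrary.

```python
import torch

n_samples = 11                    # number of measurements in t_u and t_c
n_val = int(0.2 * n_samples)      # 2 validation samples

# A seeded generator makes the permutation repeatable; the seed is arbitrary.
generator = torch.Generator().manual_seed(0)
shuffled_indices = torch.randperm(n_samples, generator=generator)

train_indices = shuffled_indices[:-n_val]
val_indices = shuffled_indices[-n_val:]
print(train_indices, val_indices)
```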

In [None]:
def training_loop(n_epochs, optimizer, params, train_t_u, val_t_u,
                  train_t_c, val_t_c):
    for epoch in range(1, n_epochs + 1):
        train_t_p = model(train_t_u, *params)
        train_loss = loss_fn(train_t_p, train_t_c)
                             
        val_t_p = model(val_t_u, *params)
        val_loss = loss_fn(val_t_p, val_t_c)
        
        optimizer.zero_grad()
        train_loss.backward()
        optimizer.step()

        if epoch <= 3 or epoch % 500 == 0:
            print(f"Epoch {epoch}, Training loss {train_loss.item():.4f},"
                  f" Validation loss {val_loss.item():.4f}")
            
    return params
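
The validation loss is only monitored and never backpropagated, so autograd does not need to record a graph for it. A common pattern, sketched here under the assumption that the same `model` and `loss_fn` are used, is to compute it inside `torch.no_grad()`. The small tensors are stand-ins for the real splits.

```python
import torch

def model(t_u, w1, w2, b):
    return w2 * t_u ** 2 + w1 * t_u + b

def loss_fn(t_p, t_c):
    return ((t_p - t_c) ** 2).mean()

params = torch.tensor([1.0, 1.0, 0.0], requires_grad=True)

# Hypothetical stand-ins for the normalized training and validation splits.
train_t_un = torch.tensor([3.57, 5.59, 8.19])
train_t_c = torch.tensor([0.5, 14.0, 28.0])
val_t_un = torch.tensor([5.82, 3.39])
val_t_c = torch.tensor([15.0, 3.0])

# The training loss tracks gradients so it can be backpropagated.
train_loss = loss_fn(model(train_t_un, *params), train_t_c)

with torch.no_grad():
    # No graph is recorded here, so val_loss cannot be backpropagated.
    val_loss = loss_fn(model(val_t_un, *params), val_t_c)

print(train_loss.requires_grad, val_loss.requires_grad)  # True False
```

The reported numbers are unchanged by this pattern. It only avoids building a graph that would never be used.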

In [None]:
params = torch.tensor([1.0, 1.0, 0.0], requires_grad=True)
learning_rate = 1e-4
optimizer = optim.SGD([params], lr=learning_rate)

training_loop(
    n_epochs = 5000, 
    optimizer = optimizer,
    params = params,
    train_t_u = train_t_un,
    val_t_u = val_t_un,
    train_t_c = train_t_c,
    val_t_c = val_t_c)

Epoch 1, Training loss 742.5377, Validation loss 375.4495
Epoch 2, Training loss 416.5768, Validation loss 208.6635
Epoch 3, Training loss 236.7594, Validation loss 116.8267
Epoch 500, Training loss 12.0186, Validation loss 3.4329
Epoch 1000, Training loss 9.5043, Validation loss 2.5997
Epoch 1500, Training loss 7.7160, Validation loss 2.2386
Epoch 2000, Training loss 6.4436, Validation loss 2.1767
Epoch 2500, Training loss 5.5379, Validation loss 2.2969
Epoch 3000, Training loss 4.8927, Validation loss 2.5207
Epoch 3500, Training loss 4.4327, Validation loss 2.7964
Epoch 4000, Training loss 4.1043, Validation loss 3.0907
Epoch 4500, Training loss 3.8694, Validation loss 3.3826
Epoch 5000, Training loss 3.7010, Validation loss 3.6600


tensor([-1.0695,  0.5798, -0.9587], requires_grad=True)

In [None]:
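# Adam run: larger learning rate, and the unnormalized inputs
# (train_t_u, val_t_u) are passed to the training loop below.
params = torch.tensor([1.0, 1.0, 0.0], requires_grad=True)
learning_rate = 1e-1
optimizer = optim.Adam([params], lr=learning_rate)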

training_loop(
    n_epochs = 5000, 
    optimizer = optimizer,
    params = params,
    train_t_u = train_t_u,
    val_t_u = val_t_u,
    train_t_c = train_t_c,
    val_t_c = val_t_c)

Epoch 1, Training loss 12849290.0000, Validation loss 6580289.0000
Epoch 2, Training loss 10398391.0000, Validation loss 5324670.5000
Epoch 3, Training loss 8215151.0000, Validation loss 4206235.0000
Epoch 500, Training loss 5.3014, Validation loss 2.2282
Epoch 1000, Training loss 3.7736, Validation loss 3.2903
Epoch 1500, Training loss 3.2679, Validation loss 4.5707
Epoch 2000, Training loss 3.1677, Validation loss 5.2880
Epoch 2500, Training loss 3.1291, Validation loss 5.5075
Epoch 3000, Training loss 3.0860, Validation loss 5.5392
Epoch 3500, Training loss 3.0320, Validation loss 5.5329
Epoch 4000, Training loss 2.9649, Validation loss 5.5209
Epoch 4500, Training loss 2.8829, Validation loss 5.5066
Epoch 5000, Training loss 2.7845, Validation loss 5.4902


tensor([-0.0877,  0.0059, -2.7560], requires_grad=True)

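The resulting loss is higher with the new quadratic model than with the previous one. In both runs with a validation split, the validation loss is lower than the training loss in the early epochs. With SGD, the validation loss reaches its lowest logged value at epoch 2000 (2.1767) and then rises to 3.6600 by epoch 5000 while the training loss keeps falling. With Adam, the validation loss rises above the training loss by epoch 1500 and settles near 5.5, which suggests the model is starting to overfit the nine training points.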
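
The SGD log from the run with a validation split can be scanned for the epoch where the validation loss is lowest. The sketch below copies the logged values at every 500th epoch from that output. The names `sgd_log`, `best_epoch`, and `best_val` are illustrative.

```python
# (epoch, training loss, validation loss) copied from the SGD log above.
sgd_log = [
    (500, 12.0186, 3.4329),
    (1000, 9.5043, 2.5997),
    (1500, 7.7160, 2.2386),
    (2000, 6.4436, 2.1767),
    (2500, 5.5379, 2.2969),
    (3000, 4.8927, 2.5207),
    (3500, 4.4327, 2.7964),
    (4000, 4.1043, 3.0907),
    (4500, 3.8694, 3.3826),
    (5000, 3.7010, 3.6600),
]

# Pick the logged row with the smallest validation loss.
best_epoch, _, best_val = min(sgd_log, key=lambda row: row[2])
print(best_epoch, best_val)  # 2000 2.1767
```

Since losses were only printed every 500 epochs, the true minimum may lie anywhere between the neighboring logged epochs.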