# [Evaluating and Improving Models](https://campus.datacamp.com/courses/introduction-to-deep-learning-with-pytorch/evaluating-and-improving-models?ex=1)

Fourth chapter in the Introduction to Deep Learning with PyTorch DataCamp course.

## 1 - Layer Initialization and Transfer Learning

* Data normalization scales input features for stability.

* The weights of a linear layer are also initialized to small values.

In [None]:
import torch.nn as nn

layer = nn.Linear(64, 128)
print(layer.weight.min(), layer.weight.max(), '\n')

# Another way to initialize weights: uniform values between 0 and 1
nn.init.uniform_(layer.weight)
print(layer.weight.min(), layer.weight.max())

tensor(-0.1250, grad_fn=<MinBackward1>) tensor(0.1250, grad_fn=<MaxBackward1>) 

tensor(0.0003, grad_fn=<MinBackward1>) tensor(0.9999, grad_fn=<MaxBackward1>)
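
The two bullet points at the start of this section can be sketched as follows. This is a minimal illustration, not part of the course material: the `features` tensor holds made-up values, and the check on the weight range assumes the default `nn.Linear` initialization documented by PyTorch, which draws weights from a uniform distribution bounded by `1 / sqrt(in_features)`. For 64 input features that bound is 0.125, which matches the first output above.

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical raw features on very different scales (illustrative values only)
features = torch.tensor([
    [0.5, 300.0],
    [1.5, 100.0],
    [2.5, 200.0],
])

# Standardize each column to zero mean and unit standard deviation
mean = features.mean(dim=0)
std = features.std(dim=0)
normalized = (features - mean) / std
print(normalized)

# Default nn.Linear weights lie within +/- 1 / sqrt(in_features)
in_features = 64
layer = nn.Linear(in_features, 128)
bound = 1 / math.sqrt(in_features)
print(bound)
print(layer.weight.abs().max().item() <= bound)
```

Standardization is only one way to normalize inputs; min-max scaling to a fixed range is another common choice.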


* Transfer learning: Reusing a model trained on a first task for a second similar task.

In [10]:
# Transfer learning
import torch 

torch.save(layer, 'layer.pth')
new_layer = torch.load('layer.pth', weights_only=False)
print(new_layer)

Linear(in_features=64, out_features=128, bias=True)
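
As a variation on the cell above, a common pattern is to save only the parameters (the `state_dict`) rather than the whole module object, and load them into a freshly built layer of the same shape. The sketch below assumes this approach fits the use case; the in-memory `io.BytesIO` buffer is a stand-in for a file path such as `'layer.pth'` and is used here only to avoid writing to disk.

```python
import io
import torch
import torch.nn as nn

layer = nn.Linear(64, 128)

# Serialize only the parameters; the buffer stands in for a file on disk
buffer = io.BytesIO()
torch.save(layer.state_dict(), buffer)
buffer.seek(0)

# Rebuild the same architecture, then load the saved parameters into it
new_layer = nn.Linear(64, 128)
new_layer.load_state_dict(torch.load(buffer))

# The restored layer holds exactly the same weights as the original
print(torch.equal(layer.weight, new_layer.weight))
```

Saving the `state_dict` means the loading code must construct the architecture itself, but the saved file no longer depends on the exact class definition being importable.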


* **Fine-tuning**: a type of transfer learning. Load weights from a previously trained model, then continue training with a smaller learning rate.

* Optionally train only part of the network by freezing some of its layers.

* Rule of thumb: freeze the early layers of the network and fine-tune the layers closer to the output layer.

In [11]:
model = nn.Sequential(
    nn.Linear(3, 10),
    nn.Linear(10, 2)
)

for name, param in model.named_parameters():
    if name == '0.weight':
        param.requires_grad = False
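
The cell above freezes only the weight of the first layer. A slightly fuller sketch of the rule of thumb might freeze every parameter of the early layer (weight and bias) and hand only the remaining parameters to the optimizer. The learning rate of `1e-4` is an illustrative value, not one taken from the course.

```python
import torch.nn as nn
import torch.optim as optim

model = nn.Sequential(
    nn.Linear(3, 10),
    nn.Linear(10, 2),
)

# Freeze every parameter of the first layer (weight and bias)
for param in model[0].parameters():
    param.requires_grad = False

# Optimize only the parameters that still require gradients,
# using a small learning rate (illustrative value)
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = optim.SGD(trainable, lr=1e-4)

# Only the weight and bias of the second layer remain trainable
print(len(trainable))
```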

Fine-tuning process:

![image.png](attachment:image.png)

## 2 - Evaluating Model Performance

The dataset is typically split into three subsets (training, validation, and test):

![image.png](attachment:image.png)

### 2.1 - Training Loss

* Calculating training loss, for each epoch:
    - Sum the loss across all batches in the dataloader.
    - Compute the mean training loss at the end of the epoch.

In [105]:
import torch
import torch.nn as nn
from torch.nn import MSELoss
from torch.utils.data import TensorDataset, DataLoader
import torch.optim as optim
import numpy as np

# Inputs
X_train = np.array([
    [0.5, 3.4, 6.7],
    [1.2, 5.0, 7.3],
    [34.2, 44.0, 12.3],
    [0.4, 6.7, 2.2],
    [20.3, 1.1, 5.8]
])

# Labels
y_train = np.array([1.3, 4.5, 3.4, 2.2, 4.0])

# 1) Create a model
layer1 = nn.Linear(3, 5)
nn.init.uniform_(layer1.weight)

layer2 = nn.Linear(5, 10)
nn.init.uniform_(layer2.weight)

model = nn.Sequential(
    layer1,
    nn.ReLU(),
    layer2,
    nn.ReLU(),
    nn.Linear(10, 1),
)

# 2) Choose a loss function
criterion = MSELoss()

# 3) Define a dataset
train_dataset = TensorDataset(
    torch.tensor(X_train).float(),
    torch.tensor(y_train).float()
)

train_dataloader = DataLoader(
    dataset=train_dataset,
    batch_size=2,
    shuffle=True
)

# 4) Set an optimizer
optimizer = optim.SGD(model.parameters(), lr=0.001)

In [162]:
# 5) Training loop
num_epochs = 10

for epoch in range(num_epochs):
    model.train()
    epoch_loss = 0.0
    for data in train_dataloader:
        optimizer.zero_grad() # Reset gradients

        # Get features and targets from the dataloader
        feature, target = data

        # Run a forward pass
        pred = model(feature)

        # Compute loss
        loss = criterion(pred, target.unsqueeze(1))

        # Backpropagation
        loss.backward() # Compute gradients
        optimizer.step() # Update weights

        # Calculate and sum the loss
        epoch_loss += loss.item()

    avg_epoch_loss = epoch_loss / len(train_dataloader)
    print(avg_epoch_loss)


2.196245869000753
1.5812280376752217
1.9938315947850545
1.5522557298342388
1.5326358079910278
2.075171172618866
1.19460562368234
1.1858171969652176
1.8487066725889842
1.9160771767298381


### 2.2 - Evaluation Loop

In [None]:
# Inputs
X_validation = np.array([
    [0.9, 6.4, 0.7],
    [1.7, 3.0, 0.3],
    [21.2, 12.0, 0.3]
])

# Labels
y_validation = np.array([3.0, 1.9, 0.1])

validation_dataset = TensorDataset(
    torch.tensor(X_validation).float(),
    torch.tensor(y_validation).float()
)

validation_dataloader = DataLoader(
    dataset=validation_dataset,
    batch_size=2,
    shuffle=True
)

# Validation loop: forward passes only, no gradient tracking and no weight updates
for epoch in range(num_epochs):
    model.eval()
    validation_epoch_loss = 0.0
    with torch.no_grad():
        for inputs, labels in validation_dataloader:
            # Run forward pass
            outputs = model(inputs)

            # Calculate the loss
            loss = criterion(outputs, labels.unsqueeze(1))
            validation_epoch_loss += loss.item()

    avg_epoch_loss = validation_epoch_loss / len(validation_dataloader)
    print(avg_epoch_loss)
    model.train()

5.115960717201233
5.115960717201233
7.576113820075989
4.410033881664276
7.576113820075989
7.576113820075989
5.115960717201233
5.115960717201233
4.410033881664276
4.410033881664276
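
The validation loop above runs on its own, so the model never changes between passes and the printed values differ only because shuffling regroups the batches. In practice the validation pass is often run at the end of each training epoch so both losses can be compared over time. The following is a minimal sketch of that pattern under stated assumptions: the data is synthetic and hypothetical (not the arrays used above), and the model size, learning rate, and epoch count are arbitrary.

```python
import torch
import torch.nn as nn
from torch.utils.data import TensorDataset, DataLoader

torch.manual_seed(0)

# Hypothetical synthetic regression data (not the arrays used above)
X = torch.rand(20, 3)
y = X.sum(dim=1, keepdim=True)

train_loader = DataLoader(TensorDataset(X[:16], y[:16]), batch_size=4, shuffle=True)
val_loader = DataLoader(TensorDataset(X[16:], y[16:]), batch_size=4)

model = nn.Sequential(nn.Linear(3, 8), nn.ReLU(), nn.Linear(8, 1))
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

train_losses, val_losses = [], []
for epoch in range(5):
    # Training pass: update weights and accumulate the batch losses
    model.train()
    running = 0.0
    for features, targets in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(features), targets)
        loss.backward()
        optimizer.step()
        running += loss.item()
    train_losses.append(running / len(train_loader))

    # Validation pass: no gradients, no weight updates
    model.eval()
    running = 0.0
    with torch.no_grad():
        for features, targets in val_loader:
            running += criterion(model(features), targets).item()
    val_losses.append(running / len(val_loader))

    print(f"epoch {epoch}: train={train_losses[-1]:.4f}, val={val_losses[-1]:.4f}")
```

A training loss that keeps falling while the validation loss rises is a common sign of overfitting, which is why the two are tracked side by side.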
