<a href="https://colab.research.google.com/github/jeffheaton/app_deep_learning/blob/main/t81_558_class_03_4_early_stop.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# T81-558: Applications of Deep Neural Networks

**Module 3: Introduction to PyTorch**

- Instructor: [Jeff Heaton](https://sites.wustl.edu/jeffheaton/), McKelvey School of Engineering, [Washington University in St. Louis](https://engineering.wustl.edu/Programs/Pages/default.aspx)
- For more information visit the [class website](https://sites.wustl.edu/jeffheaton/t81-558/).


# Module 3 Material

- Part 3.1: Deep Learning and Neural Network Introduction [[Video]](https://www.youtube.com/watch?v=d-rU5IuFqLs&list=PLjy4p-07OYzuy_lHcRW8lPTLPTTOmUpmi) [[Notebook]](t81_558_class_03_1_neural_net.ipynb)
- Part 3.2: Introduction to PyTorch [[Video]](https://www.youtube.com/watch?v=Pf-rrhMolm0&list=PLjy4p-07OYzuy_lHcRW8lPTLPTTOmUpmi) [[Notebook]](t81_558_class_03_2_pytorch.ipynb)
- Part 3.3: Encoding a Feature Vector for PyTorch Deep Learning [[Video]](https://www.youtube.com/watch?v=7SGPm2tIT58&list=PLjy4p-07OYzuy_lHcRW8lPTLPTTOmUpmi) [[Notebook]](t81_558_class_03_3_feature_encode.ipynb)
- **Part 3.4: Early Stopping and Network Persistence** [[Video]](https://www.youtube.com/watch?v=lS0vvIWiahU&list=PLjy4p-07OYzuy_lHcRW8lPTLPTTOmUpmi) [[Notebook]](t81_558_class_03_4_early_stop.ipynb)
- Part 3.5: Sequences vs Classes in PyTorch [[Video]](https://www.youtube.com/watch?v=NOu8jMZx3LY&list=PLjy4p-07OYzuy_lHcRW8lPTLPTTOmUpmi) [[Notebook]](t81_558_class_03_5_pytorch_class_sequence.ipynb)

# Google CoLab Instructions

The following code reports the PyTorch version, checks whether CUDA is available, and initializes the PyTorch device to the GPU (if available) or the CPU.


In [1]:
import torch
print("PyTorch version:", torch.__version__)
def check_cuda():
    if torch.cuda.is_available():
        print("CUDA is available on this device.")
        print(f"Device Name: {torch.cuda.get_device_name(0)}")
        print(f"CUDA Version: {torch.version.cuda}")
        print(f"Number of CUDA Devices: {torch.cuda.device_count()}")
    else:
        print("CUDA is not available on this device. Using CPU.")

if __name__ == "__main__":
    check_cuda()
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

PyTorch version: 2.1.0
CUDA is available on this device.
Device Name: NVIDIA GeForce RTX 2060
CUDA Version: 12.1
Number of CUDA Devices: 1


This script will:

* Check if CUDA is available using `torch.cuda.is_available()`.
* If CUDA is available, it will print the name of the CUDA device, the CUDA version, and the number of CUDA devices available.
* If CUDA is not available, it will inform you that it's defaulting to the CPU.
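
The cell above chooses between CUDA and the CPU only. As a minimal sketch, assuming a PyTorch build that includes the MPS backend for Apple Silicon, the selection could be extended as shown here. The helper name `pick_device` is hypothetical and is not part of PyTorch or of this course's code.

```python
import torch


def pick_device() -> torch.device:
    """Return the best available device: CUDA first, then Apple MPS, then CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    # torch.backends.mps is absent on older builds, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


device = pick_device()
print(f"Using device: {device}")
```

Tensors and models created later in the notebook can then be placed on `device` exactly as before.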

### Part 3.4: Early Stopping and Network Persistence

This part covers two practical aspects of training neural networks: early stopping, and saving and loading PyTorch models. Early stopping monitors the loss on a validation set during training and halts training once that loss stops improving, which helps avoid overfitting.

### Early Stopping

It can be difficult to determine how many epochs to use when training a neural network. If you train for too many epochs, the network overfits: it begins to memorize the training data rather than generalize from it. An overfit network attains good accuracy on the training set yet performs poorly on new, unseen data.

This part also covers saving and loading PyTorch networks. Saving preserves a trained model so that you can reload it later for prediction, further training, or analysis without retraining.

Figure 3.OVER contrasts training and validation error as a network overfits.

**Figure 3.OVER: Training vs. Validation Error for Overfitting**
![Training vs. Validation Error for Overfitting](https://raw.githubusercontent.com/jeffheaton/t81_558_deep_learning/master/images/class_3_training_val.png "Training vs. Validation Error for Overfitting")

It is important to segment the original dataset into several datasets:

- **Training Set**
- **Validation Set**
- **Holdout Set**

You can construct these sets in several different ways. The following programs demonstrate some of these.
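
As a minimal sketch of one such construction, the code below shuffles row indices and slices them into three non-overlapping sets. The function name `three_way_split` and the 10% holdout fraction are assumptions made for illustration; only the 20% validation share echoes a figure used in this part.

```python
import random


def three_way_split(n_rows, val_frac=0.2, holdout_frac=0.1, seed=42):
    """Return shuffled, non-overlapping index lists: (train, validation, holdout)."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # seeded so the split is repeatable

    n_val = round(n_rows * val_frac)
    n_holdout = round(n_rows * holdout_frac)

    val_idx = indices[:n_val]
    holdout_idx = indices[n_val:n_val + n_holdout]
    train_idx = indices[n_val + n_holdout:]
    return train_idx, val_idx, holdout_idx


# 150 rows, the size of the iris dataset.
train_idx, val_idx, holdout_idx = three_way_split(150)
print(len(train_idx), len(val_idx), len(holdout_idx))  # 105 30 15
```

The index lists can be used to select rows from a NumPy array or a pandas DataFrame. The holdout rows should stay untouched until the final evaluation.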

The first method is a training and validation set. We use the training data to train the neural network until the validation set no longer improves. This attempts to stop at a near-optimal training point. This method will only give accurate "out of sample" predictions for the validation set; this is usually 20% of the data. The predictions for the training data will be overly optimistic, as these were the data that we used to train the neural network. Figure 3.VAL demonstrates how we divide the dataset.

**Figure 3.VAL: Training with a Validation Set**
![Training with a Validation Set](https://raw.githubusercontent.com/jeffheaton/t81_558_deep_learning/master/images/class_1_train_val.png "Training with a Validation Set")


Because PyTorch does not include a built-in early stopping function, we must define one of our own. We will use the following **EarlyStopping** class throughout this course.

We can provide several parameters to the **EarlyStopping** object:

- **min_delta** The minimum decrease in validation loss that counts as an improvement. Keep this value small; making it smaller still is unlikely to have much impact.
- **patience** The number of epochs that training waits for the validation loss to improve before stopping.
- **restore_best_weights** You should usually set this to true, as it restores the weights to the values they had at the epoch with the lowest validation loss.

We will now see an example of this class in action.


In [2]:
import copy

class EarlyStopping:
    """
    EarlyStopping utility for PyTorch models.

    This class implements an early stopping mechanism to terminate training 
    when validation loss stops improving, helping prevent overfitting and saving 
    computational resources.

    Attributes:
        patience (int): Number of epochs to wait for improvement before stopping.
        min_delta (float): Minimum change in the monitored quantity to qualify as an improvement.
        restore_best_weights (bool): If True, restores model weights from the epoch with the lowest loss.

    Methods:
        __call__(model, val_loss): Call method to perform early stopping check.
    """

    def __init__(self, patience=5, min_delta=0, restore_best_weights=True):
        self.patience = patience
        self.min_delta = min_delta
        self.restore_best_weights = restore_best_weights
        self.best_model = None
        self.best_loss = None
        self.counter = 0
        self.status = ""

    def __call__(self, model, val_loss):
        """
        Perform early stopping check.

        Args:
            model: PyTorch model being trained.
            val_loss (float): Current validation loss.

        Returns:
            True if early stopping is triggered, False otherwise.
        """

        # First validation loss received; set as best and copy model
        if self.best_loss is None:
            self.best_loss = val_loss
            self.best_model = copy.deepcopy(model.state_dict())

        # Improvement found; update best loss and model, reset counter
        elif self.best_loss - val_loss >= self.min_delta:
            self.best_model = copy.deepcopy(model.state_dict())
            self.best_loss = val_loss
            self.counter = 0
            self.status = f"Improvement found, counter reset to {self.counter}"

        # No improvement; increment counter and check for early stopping
        else:
            self.counter += 1
            self.status = f"No improvement in the last {self.counter} epochs"
            if self.counter >= self.patience:
                self.status = f"Early stopping triggered after {self.counter} epochs."
                if self.restore_best_weights:
                    model.load_state_dict(self.best_model)
                return True

        return False


### Class: `EarlyStopping`
The `EarlyStopping` class is designed to monitor the training process and stop it appropriately.

#### Initialization (`__init__`):
- `patience`: The number of epochs to wait for an improvement in the validation loss before stopping the training. Default is 5.
- `min_delta`: The minimum change in the validation loss to qualify as an improvement. This helps in ignoring very small changes. Default is 0.
- `restore_best_weights`: If `True`, when early stopping is triggered, the model's weights are reverted to the state where the validation loss was lowest.
- `best_model`: To store the model's state dict (weights) at its best performance.
- `best_loss`: To keep track of the best (lowest) validation loss observed.
- `counter`: Counts how many epochs have passed without improvement in validation loss.
- `status`: A string to store messages about the early stopping status.

#### The Call Method (`__call__`):
This method is executed each time an `EarlyStopping` object is called with parameters.

- Parameters:
    - `model`: The neural network model being trained.
    - `val_loss`: The current epoch's validation loss.
- Process:
    - If `best_loss` is `None` (first call), it initializes `best_loss` with the current `val_loss` and copies the model's weights.
    - If the current `val_loss` shows an improvement greater than or equal to `min_delta` compared to `best_loss`, update `best_model` and `best_loss` with the current model's state and `val_loss`, respectively. Reset `counter` to 0.
    - If there's no improvement, increment the `counter`. If `counter` reaches `patience`, it implies no improvement in validation loss for a specified number of epochs, thus triggering early stopping.
        - If `restore_best_weights` is `True`, the model's weights are reverted to the best observed state.
        - Returns `True`, indicating early stopping is triggered.
- If early stopping is not triggered, the method returns `False`.
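
The decision rule described above can be traced without a model. The following sketch replays the same comparison (`best_loss - val_loss >= min_delta`) over a made-up list of validation losses. The function `early_stop_epoch` and the loss values are hypothetical and exist only for this illustration.

```python
def early_stop_epoch(val_losses, patience=5, min_delta=0.0):
    """Replay the EarlyStopping rule over a list of validation losses.

    Returns (stop_epoch, best_epoch), both 1-based. stop_epoch is None
    when the patience limit is never reached.
    """
    best_loss = None
    best_epoch = None
    counter = 0
    for epoch, val_loss in enumerate(val_losses, start=1):
        if best_loss is None or best_loss - val_loss >= min_delta:
            # Improvement (or first epoch): remember it and reset the counter.
            best_loss, best_epoch, counter = val_loss, epoch, 0
        else:
            counter += 1
            if counter >= patience:
                return epoch, best_epoch
    return None, best_epoch


# Made-up validation losses that plateau around 0.40.
losses = [0.90, 0.60, 0.45, 0.40, 0.41, 0.42, 0.40, 0.43, 0.44]

print(early_stop_epoch(losses, patience=3))                  # (None, 7)
print(early_stop_epoch(losses, patience=3, min_delta=0.01))  # (7, 4)
```

With `min_delta=0`, the tie at epoch 7 counts as an improvement and resets the counter, so training never stops within these nine epochs. With `min_delta=0.01`, the tie no longer qualifies, and stopping triggers at epoch 7 with epoch 4 as the best epoch.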

### Usage
The `EarlyStopping` class is used during the training loop of a neural network. After each epoch, the training script would call an instance of `EarlyStopping`, passing in the current model and the validation loss for that epoch. The early stopping mechanism will then decide whether training should continue or stop based on the evolution of the validation loss.

### Early Stopping with Classification

We will now see an example of classification training with early stopping. We will train the neural network until the error no longer improves on the validation set.


In [5]:
import numpy as np
import pandas as pd
import torch
import tqdm
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Set random seed for reproducibility
np.random.seed(42)
torch.manual_seed(42)

def load_data():
    df = pd.read_csv(
        "https://data.heatonresearch.com/data/t81-558/iris.csv", na_values=["NA", "?"]
    )

    le = LabelEncoder()

    x = df[["sepal_l", "sepal_w", "petal_l", "petal_w"]].values
    y = le.fit_transform(df["species"])
    species = le.classes_

    # Split into validation and training sets
    x_train, x_test, y_train, y_test = train_test_split(
        x, y, test_size=0.25, random_state=42
    )

    scaler = StandardScaler()
    x_train = scaler.fit_transform(x_train)
    x_test = scaler.transform(x_test)

    # Numpy to Torch Tensor
    x_train = torch.tensor(x_train, device=device, dtype=torch.float32)
    y_train = torch.tensor(y_train, device=device, dtype=torch.long)

    x_test = torch.tensor(x_test, device=device, dtype=torch.float32)
    y_test = torch.tensor(y_test, device=device, dtype=torch.long)

    return x_train, x_test, y_train, y_test, species


x_train, x_test, y_train, y_test, species = load_data()

# Create datasets
BATCH_SIZE = 16

dataset_train = TensorDataset(x_train, y_train)
dataloader_train = DataLoader(
    dataset_train, batch_size=BATCH_SIZE, shuffle=True)

dataset_test = TensorDataset(x_test, y_test)
dataloader_test = DataLoader(dataset_test, batch_size=BATCH_SIZE, shuffle=True)

# Create model using nn.Sequential
model = nn.Sequential(
    nn.Linear(x_train.shape[1], 50),
    nn.ReLU(),
    nn.Linear(50, 25),
    nn.ReLU(),
    # Output raw logits; nn.CrossEntropyLoss applies log-softmax internally.
    nn.Linear(25, len(species)),
)

# model = torch.compile(model,backend="aot_eager").to(device)

model = model.to(device)

loss_fn = nn.CrossEntropyLoss()  # cross entropy loss

optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
es = EarlyStopping()

epoch = 0
done = False
while epoch < 1000 and not done:
    epoch += 1
    steps = list(enumerate(dataloader_train))
    pbar = tqdm.tqdm(steps)
    model.train()
    for i, (x_batch, y_batch) in pbar:
        # Batches are already on `device`; load_data() created the tensors there.
        y_batch_pred = model(x_batch)
        loss = loss_fn(y_batch_pred, y_batch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        loss = loss.item()
        if i == len(steps) - 1:
            # End of epoch: score the validation set without tracking gradients.
            model.eval()
            with torch.no_grad():
                pred = model(x_test)
                vloss = loss_fn(pred, y_test).item()
            if es(model, vloss):
                done = True
            pbar.set_description(
                f"Epoch: {epoch}, tloss: {loss}, vloss: {vloss:>7f}, {es.status}"
            )
        else:
            pbar.set_description(f"Epoch: {epoch}, tloss {loss}")

Epoch: 1, tloss: 0.6026307344436646, vloss: 0.536555, : 100%|████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 17.10it/s]
Epoch: 2, tloss: 0.36586472392082214, vloss: 0.277725, Improvement found, counter reset to 0: 100%|█████████████████████████| 7/7 [00:00<00:00, 108.20it/s]
Epoch: 3, tloss: 0.15603022277355194, vloss: 0.187535, Improvement found, counter reset to 0: 100%|█████████████████████████| 7/7 [00:00<00:00, 203.37it/s]
Epoch: 4, tloss: 0.05794892832636833, vloss: 0.154333, Improvement found, counter reset to 0: 100%|█████████████████████████| 7/7 [00:00<00:00, 188.59it/s]
Epoch: 5, tloss: 0.18528980016708374, vloss: 0.076723, Improvement found, counter reset to 0: 100%|█████████████████████████| 7/7 [00:00<00:00, 206.79it/s]
Epoch: 6, tloss: 0.12420051544904709, vloss: 0.061499, Improvement found, counter reset to 0: 100%|█████████████████████████| 7/7 [00:00<00:00, 190.56it/s]
Epoch: 7, tloss: 0.03340417146682739, vloss: 0.045322, Improveme

In [6]:
pred = model(x_test)
vloss = loss_fn(pred, y_test)
print(f"Loss = {vloss}")

Loss = 0.008987347595393658


As you can see from above, we did not use the total number of requested epochs. The neural network training stopped once the validation set no longer improved.


In [7]:
from sklearn.metrics import accuracy_score

pred = model(x_test)
_, predict_classes = torch.max(pred, 1)
correct = accuracy_score(y_test.cpu(), predict_classes.cpu())
print(f"Accuracy: {correct}")

Accuracy: 1.0


### Early Stopping with Regression

The following code demonstrates how we can apply early stopping to a regression problem. The technique is similar to the early stopping for classification code that we just saw.


In [8]:
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import tqdm
from sklearn.model_selection import train_test_split
from torch.utils.data import DataLoader, TensorDataset

# Read the MPG dataset.
df = pd.read_csv(
    "https://data.heatonresearch.com/data/t81-558/auto-mpg.csv", na_values=["NA", "?"]
)

cars = df["name"]

# Handle missing value
df["horsepower"] = df["horsepower"].fillna(df["horsepower"].median())

# Pandas to Numpy
x = df[
    [
        "cylinders",
        "displacement",
        "horsepower",
        "weight",
        "acceleration",
        "year",
        "origin",
    ]
].values
y = df["mpg"].values  # regression

# Split into validation and training sets
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=42
)

# Numpy to Torch Tensor
x_train = torch.tensor(x_train, device=device, dtype=torch.float32)
y_train = torch.tensor(y_train, device=device, dtype=torch.float32)

x_test = torch.tensor(x_test, device=device, dtype=torch.float32)
y_test = torch.tensor(y_test, device=device, dtype=torch.float32)


# Create datasets
BATCH_SIZE = 16

dataset_train = TensorDataset(x_train, y_train)
dataloader_train = DataLoader(dataset_train, batch_size=BATCH_SIZE, shuffle=True)

dataset_test = TensorDataset(x_test, y_test)
dataloader_test = DataLoader(dataset_test, batch_size=BATCH_SIZE, shuffle=True)


# Create model

model = nn.Sequential(
    nn.Linear(x_train.shape[1], 50),
    nn.ReLU(),
    nn.Linear(50, 25),
    nn.ReLU(),
    nn.Linear(25, 1)
)

# model = torch.compile(model, backend="aot_eager").to(device)
model = model.to(device)

# Define the loss function for regression
loss_fn = nn.MSELoss()

# Define the optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

es = EarlyStopping()

epoch = 0
done = False
while epoch < 1000 and not done:
    epoch += 1
    steps = list(enumerate(dataloader_train))
    pbar = tqdm.tqdm(steps)
    model.train()
    for i, (x_batch, y_batch) in pbar:
        # Flatten (batch, 1) predictions to (batch,) so they match y_batch.
        y_batch_pred = model(x_batch).flatten()
        loss = loss_fn(y_batch_pred, y_batch)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        loss = loss.item()
        if i == len(steps) - 1:
            # End of epoch: score the validation set without tracking gradients.
            model.eval()
            with torch.no_grad():
                pred = model(x_test).flatten()
                vloss = loss_fn(pred, y_test).item()
            if es(model, vloss):
                done = True
            pbar.set_description(
                f"Epoch: {epoch}, tloss: {loss}, vloss: {vloss:>7f}, EStop:[{es.status}]"
            )
        else:
            pbar.set_description(f"Epoch: {epoch}, tloss {loss}")

Epoch: 1, tloss: 261.34014892578125, vloss: 672.386780, EStop:[]: 100%|████████████████████████████████████████████████████| 19/19 [00:00<00:00, 89.46it/s]
Epoch: 2, tloss: 189.85879516601562, vloss: 192.848831, EStop:[Improvement found, counter reset to 0]: 100%|██████████████| 19/19 [00:00<00:00, 144.34it/s]
Epoch: 3, tloss: 206.1345672607422, vloss: 180.863358, EStop:[Improvement found, counter reset to 0]: 100%|███████████████| 19/19 [00:00<00:00, 147.51it/s]
Epoch: 4, tloss: 189.10606384277344, vloss: 174.722336, EStop:[Improvement found, counter reset to 0]: 100%|██████████████| 19/19 [00:00<00:00, 147.88it/s]
Epoch: 5, tloss: 200.2677459716797, vloss: 170.232391, EStop:[Improvement found, counter reset to 0]: 100%|███████████████| 19/19 [00:00<00:00, 147.68it/s]
Epoch: 6, tloss: 180.11241149902344, vloss: 165.325562, EStop:[Improvement found, counter reset to 0]: 100%|██████████████| 19/19 [00:00<00:00, 154.35it/s]
Epoch: 7, tloss: 122.46980285644531, vloss: 152.626953, EStop:[I

Finally, we evaluate the error.


In [9]:
# Measure RMSE error.  RMSE is common for regression.
model.eval()
with torch.no_grad():
    pred = model(x_test)
score = torch.sqrt(torch.nn.functional.mse_loss(pred.flatten(), y_test))
print(f"Final score (RMSE): {score}")

Final score (RMSE): 2.882631301879883
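
RMSE is the square root of the mean squared difference between predictions and targets, so it is expressed in the same units as the target (here, miles per gallon). The sketch below works through that arithmetic in plain Python. The four predicted values are made up for illustration and are not outputs of the network above.

```python
import math


def rmse(predictions, targets):
    """Root mean squared error between two equal-length sequences."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


y_true = [33.0, 28.0, 19.0, 13.0]  # target MPG values
y_pred = [30.0, 29.0, 21.0, 13.0]  # made-up predictions for illustration

# Squared errors are 9, 1, 4, 0; their mean is 3.5; the square root is about 1.87.
print(f"RMSE: {rmse(y_pred, y_true):.4f}")  # RMSE: 1.8708
```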


In [10]:
y_test

tensor([33.0000, 28.0000, 19.0000, 13.0000, 14.0000, 27.0000, 24.0000, 13.0000,
        17.0000, 21.0000, 15.0000, 38.0000, 26.0000, 15.0000, 25.0000, 12.0000,
        31.0000, 17.0000, 16.0000, 31.0000, 22.0000, 22.0000, 22.0000, 33.5000,
        18.0000, 44.0000, 26.0000, 24.5000, 18.1000, 12.0000, 27.0000, 36.0000,
        23.0000, 24.0000, 37.2000, 16.0000, 21.0000, 19.2000, 16.0000, 29.0000,
        26.8000, 27.0000, 18.0000, 10.0000, 23.0000, 36.0000, 26.0000, 25.0000,
        25.0000, 25.0000, 22.0000, 34.1000, 32.4000, 13.0000, 23.5000, 14.0000,
        18.5000, 29.8000, 28.0000, 19.0000, 11.0000, 33.0000, 23.0000, 21.0000,
        23.0000, 25.0000, 23.8000, 34.4000, 24.5000, 13.0000, 34.7000, 14.0000,
        15.0000, 18.0000, 25.0000, 19.9000, 17.5000, 28.0000, 29.0000, 17.0000,
        16.0000, 27.0000, 37.0000, 36.1000, 23.0000, 14.0000, 32.8000, 29.9000,
        20.0000, 12.0000, 15.5000, 23.7000, 24.0000, 36.0000, 19.0000, 38.0000,
        29.0000, 21.5000, 27.9000, 14.00

## Saving and Loading a PyTorch Neural Network

Complex neural networks can take a long time to fit/train. Saving a trained network lets you reload it later without retraining. PyTorch's `torch.save` serializes objects using Python's [pickle](https://wiki.python.org/moin/UsingPickle) format. The following code trains a neural network to predict car MPG and saves the model.


In [11]:
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
from sklearn.preprocessing import StandardScaler
from torch.utils.data import DataLoader, TensorDataset

# For reproducibility
torch.manual_seed(0)
np.random.seed(0)

# Read the MPG dataset.
df = pd.read_csv(
    "https://data.heatonresearch.com/data/t81-558/auto-mpg.csv", na_values=["NA", "?"]
)

# Handle missing value
df["horsepower"] = df["horsepower"].fillna(df["horsepower"].median())

# Select features and target
features = df[
    [
        "cylinders",
        "displacement",
        "horsepower",
        "weight",
        "acceleration",
        "year",
        "origin",
    ]
]
target = df["mpg"]

# Normalize the features
scaler = StandardScaler()
scaled_features = scaler.fit_transform(features)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Convert Numpy to PyTorch tensors
features_tensor = torch.tensor(
    scaled_features, device=device, dtype=torch.float32)
target_tensor = torch.tensor(target.values, device=device, dtype=torch.float32)

# Convert to TensorDataset
dataset = TensorDataset(features_tensor, target_tensor)

# Convert to DataLoader
data_loader = DataLoader(dataset, batch_size=32)

# Define the neural network using nn.Sequential
model = nn.Sequential(
    nn.Linear(features_tensor.shape[1], 50),
    nn.ReLU(),
    nn.Linear(50, 25),
    nn.ReLU(),
    nn.Linear(25, 1),
).to(device)

# Define the loss function for regression
loss_fn = nn.MSELoss()

# Define the optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

# Train for 1000 epochs.
model.train()
for epoch in range(1000):
    for batch_features, batch_target in data_loader:
        optimizer.zero_grad()
        out = model(batch_features).flatten()
        loss = loss_fn(out, batch_target)
        loss.backward()
        optimizer.step()

    # Display status every 100 epochs.
    if epoch % 100 == 0:
        print(f"Epoch {epoch}, loss: {loss.item()}")

model.eval()
pred = model(features_tensor)

# Measure RMSE error.  RMSE is common for regression.
score = torch.sqrt(torch.nn.functional.mse_loss(pred.flatten(), target_tensor))
print(f"Before save score (RMSE): {score}")
torch.save(model, "mpg.pkl")

Epoch 0, loss: 743.8233642578125
Epoch 100, loss: 13.698925018310547
Epoch 200, loss: 11.317089080810547
Epoch 300, loss: 8.084850311279297
Epoch 400, loss: 6.356418132781982
Epoch 500, loss: 4.847251892089844
Epoch 600, loss: 5.083579063415527
Epoch 700, loss: 5.602072715759277
Epoch 800, loss: 2.9818522930145264
Epoch 900, loss: 2.6347310543060303
Before save score (RMSE): 2.0877859592437744


The code below does not fit the neural network again. Instead, it reloads the network saved by the previous cell and performs another prediction on the same data. The RMSE should match the previous one exactly if we saved and reloaded the neural network correctly.


In [12]:
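# Reload the saved network; no retraining is needed.
# weights_only=False lets recent PyTorch releases unpickle a full model
# object; only load files from sources you trust.
model2 = torch.load("mpg.pkl", weights_only=False)
model2.eval()

# Measure RMSE error for loaded network.  RMSE is common for regression.
pred = model2(features_tensor)
score = torch.sqrt(torch.nn.functional.mse_loss(pred.flatten(), target_tensor))
print(f"After load score (RMSE): {score}")

After load score (RMSE): 2.0877859592437744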
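
An alternative to pickling the whole model object is to save only its `state_dict` (the parameter tensors) and load it into a freshly built network with the same layers. The sketch below illustrates that pattern under a few assumptions: the helper `build_model`, the file name `mpg_state_dict.pt`, and the random stand-in inputs are hypothetical and are not taken from the cells above.

```python
import os
import tempfile

import torch
import torch.nn as nn


def build_model(n_inputs: int) -> nn.Sequential:
    """Build a network with the same layer layout as the MPG example."""
    return nn.Sequential(
        nn.Linear(n_inputs, 50),
        nn.ReLU(),
        nn.Linear(50, 25),
        nn.ReLU(),
        nn.Linear(25, 1),
    )


torch.manual_seed(0)
model = build_model(7)
x = torch.randn(4, 7)  # stand-in inputs, not the MPG data

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "mpg_state_dict.pt")

    # Save only the parameters, not the pickled model object.
    torch.save(model.state_dict(), path)

    # Rebuild the architecture, then load the saved parameters into it.
    restored = build_model(7)
    restored.load_state_dict(torch.load(path))

model.eval()
restored.eval()
with torch.no_grad():
    same = torch.equal(model(x), restored(x))
print(f"Outputs identical after reload: {same}")
```

This approach requires the code that defines the architecture to be available at load time, but the saved file does not depend on how the model object was pickled.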
