# Learning Objectives

Building on the previous step, we'll learn how to save and load models, either to resume training or to apply them to new data.

_Note: we have packaged some of the training functions in the `helpers.py` module to simplify the notebook._

### Learning Objectives

- save and load a model to resume interrupted training
- save and load a model for use in deployment

### Requirements

To benefit from this content, it is preferable to know:
- how to train a simple model (see step 03)

In [1]:
import torch
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from helpers import BasicLabelledDataset
from helpers import BasicNeuralNet
from helpers import BasicModelTrainer

# 1. Training a model (iris again)

This whole section should already be familiar: we'll train the basic neural net on the Iris data again. This time, all the steps are packaged into the `helpers.py` module to cut down on the usual boilerplate code.

We will just:
- load the iris data from scikit-learn
- package it in a torch `Dataset`
- create a `Module` class for our model
- execute a training loop using autograd.
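
For readers who do not have step 03 at hand, the steps above can be sketched roughly as follows. This is not the actual content of the helpers module, which is not shown here. The class and function names (`SketchLabelledDataset`, `SketchNeuralNet`, `sketch_fit`) are hypothetical, the layer layout is inferred from the model summary printed later in this notebook, and the one-hot encoding of the targets is an assumption based on how accuracy is computed in the last section.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class SketchLabelledDataset(Dataset):
    """Hypothetical stand-in for BasicLabelledDataset (assumed behaviour)."""

    def __init__(self, inputs, labels, num_classes=3):
        self.inputs = torch.as_tensor(inputs, dtype=torch.float32)
        labels = torch.as_tensor(labels, dtype=torch.long)
        # assumption: targets are one-hot so MSELoss can compare them to the softmax output
        self.targets = torch.nn.functional.one_hot(labels, num_classes).float()

    def __len__(self):
        return len(self.inputs)

    def __getitem__(self, index):
        return self.inputs[index], self.targets[index]


class SketchNeuralNet(torch.nn.Module):
    """Hypothetical stand-in for BasicNeuralNet: Linear -> Sigmoid -> Linear -> Softmax."""

    def __init__(self, input_size, output_size, hidden_size):
        super().__init__()
        self.x_to_z = torch.nn.Linear(input_size, hidden_size)
        self.z_to_h = torch.nn.Sigmoid()
        self.h_to_s = torch.nn.Linear(hidden_size, output_size)
        self.s_to_y = torch.nn.Softmax(dim=1)

    def forward(self, x):
        return self.s_to_y(self.h_to_s(self.z_to_h(self.x_to_z(x))))


def sketch_fit(model, optimizer, criterion, dataset, epochs, batch_size):
    """Plain mini-batch training loop; returns the model and the last mean epoch loss."""
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    model.train()
    mean_loss = 0.0
    for _ in range(epochs):
        total_loss = 0.0
        for inputs, targets in loader:
            optimizer.zero_grad()                     # reset gradients from the previous batch
            loss = criterion(model(inputs), targets)  # forward pass and loss
            loss.backward()                           # autograd computes the gradients
            optimizer.step()                          # SGD update
            total_loss += loss.item()
        mean_loss = total_loss / len(loader)
    return model, mean_loss


# tiny synthetic run, only to show that the pieces fit together
toy_dataset = SketchLabelledDataset(torch.rand(12, 4), torch.randint(0, 3, (12,)))
toy_model = SketchNeuralNet(4, 3, 6)
toy_optimizer = torch.optim.SGD(toy_model.parameters(), lr=0.1)
toy_model, toy_loss = sketch_fit(
    toy_model, toy_optimizer, torch.nn.MSELoss(), toy_dataset, epochs=2, batch_size=4
)
```

The cells below use the real helper classes; the sketch is only meant to make the packaged steps visible.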

In [2]:
data = load_iris()

np.random.seed(481516)  # just for this notebook to be consistent between runs
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.33)

In [3]:
# see class in helper.py
iris_training_dataset = BasicLabelledDataset(X_train, y_train)
iris_testing_dataset = BasicLabelledDataset(X_test, y_test)

In [4]:
model = BasicNeuralNet(
    4,  # input has size 4 (attributes)
    3,  # output has size 3 (one-hot, 3 classes)
    6   # hidden layer (param)
)

We'll just apply SGD with a specific criterion (MSELoss). SGD is initialized on the `parameters` of the model instance.

In [5]:
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = torch.nn.MSELoss()

# this is a helper class just executing the usual loop (see step 03)
trainer = BasicModelTrainer(
    model,
    optimizer,
    criterion,
    verbose=True
)

epochs=500

# executing the training
model, loss = trainer.fit(
    iris_training_dataset,
    epochs=epochs,  # just for trying
    batch_size=10
)

[epoch=499]	 epoch_loss=0.263995	 ETA:  0: 0: 0 secs (data=50000/50000, elapsed=13)

In [6]:
print("Accuracy: {}%".format(
    trainer.test_accuracy(iris_testing_dataset)
))

Accuracy: 100.0%


# 2. Saving the model

See [pytorch tutorial on saving and loading models](https://pytorch.org/tutorials/beginner/saving_loading_models.html).
> When saving a general checkpoint, to be used for either inference or resuming training, you must save more than just the model’s `state_dict`. It is important to also save the optimizer’s `state_dict`, as this contains buffers and parameters that are updated as the model trains. Other items that you may want to save are the epoch you left off on, the latest recorded training loss, external `torch.nn.Embedding` layers, etc.
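
To make the quote more concrete, here is a minimal sketch of what the two `state_dict`s contain. It uses a standalone `torch.nn.Linear` layer rather than the notebook's model, and it assumes SGD with momentum (which the notebook does not use) purely so that the optimizer has a visible buffer to save.

```python
import torch

# standalone toy layer and optimizer, not the notebook's model
layer = torch.nn.Linear(4, 3)
sgd = torch.optim.SGD(layer.parameters(), lr=0.1, momentum=0.9)

# run one update so the optimizer creates its momentum buffers
toy_loss = layer(torch.randn(8, 4)).pow(2).mean()
toy_loss.backward()
sgd.step()

# the model state_dict maps parameter names to tensors
model_keys = list(layer.state_dict().keys())

# the optimizer state_dict holds per-parameter state and the hyperparameters
optimizer_state = sgd.state_dict()
saved_lr = optimizer_state['param_groups'][0]['lr']
first_param_state = optimizer_state['state'][0]

print(model_keys)
print(list(optimizer_state.keys()))
print(saved_lr, list(first_param_state.keys()))
```

The momentum buffer is exactly the kind of information that would be lost if only the model's `state_dict` were saved and training were resumed later.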

In [7]:
# {:.6f} prints the loss with six decimals, matching the naming of the demo file
model_file_path = "models/step-04-model-state-epoch{}-loss{:.6f}.tar".format(epochs, loss)

torch.save(
    {
        'epoch': epochs,
        'model_state_dict': model.state_dict(),
        'optimizer_state_dict': optimizer.state_dict(),
        'loss': loss,
    },
    model_file_path
)

print("saved as {}".format(model_file_path))

saved as models/step-04-model-state-epoch500-loss0.265040.tar


# 3. Loading the model for resuming training

> To load the items, first initialize the model and optimizer, then load the dictionary locally using `torch.load()`. From here, you can easily access the saved items by simply querying the dictionary as you would expect.
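
The round trip described above can be sketched without touching the disk. The example below is an illustration under assumptions: it uses a toy `torch.nn.Linear` layer and an in-memory `io.BytesIO` buffer in place of a checkpoint file, and the epoch and loss values are made up.

```python
import io
import torch

# "training" side: a toy layer and its optimizer
source_model = torch.nn.Linear(4, 3)
source_optimizer = torch.optim.SGD(source_model.parameters(), lr=0.1)

buffer = io.BytesIO()  # stands in for a checkpoint file on disk
torch.save(
    {
        'epoch': 5,      # made-up value for the illustration
        'model_state_dict': source_model.state_dict(),
        'optimizer_state_dict': source_optimizer.state_dict(),
        'loss': 0.25,    # made-up value for the illustration
    },
    buffer
)
buffer.seek(0)  # rewind before reading

# "resuming" side: build fresh objects first, then load the saved state into them
restored_model = torch.nn.Linear(4, 3)
restored_optimizer = torch.optim.SGD(restored_model.parameters(), lr=0.5)

checkpoint = torch.load(buffer)
restored_model.load_state_dict(checkpoint['model_state_dict'])
restored_optimizer.load_state_dict(checkpoint['optimizer_state_dict'])

same_weights = torch.equal(source_model.weight, restored_model.weight)
restored_lr = restored_optimizer.param_groups[0]['lr']
print(same_weights, restored_lr, checkpoint['epoch'], checkpoint['loss'])
```

Note that the restored optimizer ends up with the saved learning rate (0.1), not the 0.5 it was constructed with, because `load_state_dict` also restores the hyperparameters stored in the checkpoint.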

In [8]:
model_2 = BasicNeuralNet(
    4,  # input has size 4 (attributes)
    3,  # output has size 3 (one-hot, 3 classes)
    6   # hidden layer (param)
)

optimizer_2 = torch.optim.SGD(model_2.parameters(), lr=0.01)
criterion_2 = torch.nn.MSELoss()

# comment/uncomment below to use your own saved model
#checkpoint_2 = torch.load(model_file_path)

# or simply use the demo
checkpoint_2 = torch.load("models/step-04-demo-model-state-epoch500-loss0.238621.tar")

# this loads the state dict into the model and optimizer
# (the optimizer's hyperparameters, including lr, are restored from the checkpoint)
model_2.load_state_dict(checkpoint_2['model_state_dict'])
optimizer_2.load_state_dict(checkpoint_2['optimizer_state_dict'])

restart_epoch = checkpoint_2['epoch']
loss_2 = checkpoint_2['loss']

> Remember that you must call `model.eval()` to set dropout and batch normalization layers to evaluation mode before running inference. Failing to do this will yield inconsistent inference results. If you wish to resume training, call `model.train()` to ensure these layers are in training mode.
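
The model in this notebook has neither dropout nor batch normalization (see the layer summary printed below), so the two modes would give the same outputs here. To illustrate why the call matters in general, here is a small sketch with a standalone `torch.nn.Dropout` layer, which is not part of the notebook's model.

```python
import torch

torch.manual_seed(0)  # only to make the illustration repeatable

dropout = torch.nn.Dropout(p=0.5)
ones = torch.ones(1, 1000)

dropout.train()              # training mode: elements are randomly zeroed
train_output = dropout(ones)
zeroed_in_train = int((train_output == 0).sum())

dropout.eval()               # evaluation mode: dropout is a no-op
eval_output = dropout(ones)
zeroed_in_eval = int((eval_output == 0).sum())

print(zeroed_in_train, zeroed_in_eval)
```

In training mode roughly half of the values are dropped and the survivors are scaled by 1 / (1 - p), while in evaluation mode the input passes through unchanged.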

In [9]:
# use .eval() when loading the model for inference (production)
#model_2.eval()

# use .train() when loading the model to continue training (e.g. after an interruption)
model_2.train()

BasicNeuralNet(
  (x_to_z): Linear(in_features=4, out_features=6, bias=True)
  (z_to_h): Sigmoid()
  (h_to_s): Linear(in_features=6, out_features=3, bias=True)
  (s_to_y): Softmax(dim=1)
)

We can now resume the training...

In [10]:
from helpers import BasicModelTrainer

trainer_2 = BasicModelTrainer(
    model_2,
    optimizer_2,
    criterion_2,
    verbose=True
)

In [11]:
model_3, loss_3 = trainer_2.fit(
    iris_training_dataset,
    epochs=10,
    batch_size=10
)

[epoch=9]	 epoch_loss=0.233177	 ETA:  0: 0: 0 secs (data=1000/1000, elapsed=0)

# 4. Loading a model for inference (production)

When saving or loading a model for use in production, you only need the `state_dict` of the model; the optimizer state is not required. A call to `model.eval()` then puts layers such as dropout and batch normalization into evaluation mode.
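
As a minimal sketch of this inference-only workflow, the example below saves just the model's `state_dict` and reloads it into a fresh instance. It makes several assumptions: a `torch.nn.Sequential` with the same layer layout stands in for the helper class, an in-memory buffer replaces the file, and the input sample is only an illustrative 4-feature vector.

```python
import io
import torch


def build_network():
    # stand-in with the same layer layout as the model summary printed in this notebook
    return torch.nn.Sequential(
        torch.nn.Linear(4, 6),
        torch.nn.Sigmoid(),
        torch.nn.Linear(6, 3),
        torch.nn.Softmax(dim=1),
    )


trained_network = build_network()  # pretend this one has been trained

# save only the model weights, no optimizer state
weights_buffer = io.BytesIO()
torch.save(trained_network.state_dict(), weights_buffer)
weights_buffer.seek(0)

# production side: rebuild the architecture, load the weights, switch to eval mode
inference_network = build_network()
inference_network.load_state_dict(torch.load(weights_buffer))
inference_network.eval()

sample = torch.tensor([[5.1, 3.5, 1.4, 0.2]])  # one illustrative 4-feature sample
with torch.no_grad():                           # no gradients needed for inference
    probabilities = inference_network(sample)
    predicted_class = torch.argmax(probabilities, dim=1)

print(probabilities, predicted_class)
```

The cells below follow the same pattern, except that the demo file stores a full checkpoint dictionary, so the weights are read from its `'model_state_dict'` entry.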

In [12]:
model_4 = BasicNeuralNet(
    4,  # input has size 4 (attributes)
    3,  # output has size 3 (one-hot, 3 classes)
    6   # hidden layer (param)
)

# comment/uncomment below to use your own saved model
#checkpoint_4 = torch.load(model_file_path)

# or simply use the demo
checkpoint_4 = torch.load("models/step-04-demo-model-state-epoch500-loss0.238621.tar")

# this loads the state dict into the model only
model_4.load_state_dict(checkpoint_4['model_state_dict'])

# use .eval() when loading the model for inference (production)
model_4.eval()

BasicNeuralNet(
  (x_to_z): Linear(in_features=4, out_features=6, bias=True)
  (z_to_h): Sigmoid()
  (h_to_s): Linear(in_features=6, out_features=3, bias=True)
  (s_to_y): Softmax(dim=1)
)

We can now use this loaded model for evaluating its accuracy on the testing set...

In [13]:
# batch the testing data as well
iris_testing_loader = torch.utils.data.DataLoader(
    dataset=iris_testing_dataset,
    batch_size=10,
    shuffle=True
)

correct = 0
total = 0

with torch.no_grad():  # deactivate autograd during testing
    for data in iris_testing_loader:  # iterate on batches
        # get testing data batch
        inputs, targets = data
        
        # apply the NN
        outputs = model_4(inputs)                 # compute output class tensor
        predicted = torch.argmax(outputs, dim=1)  # get argmax of P(y_hat|x)
        actual = torch.argmax(targets, dim=1)     # get y

        # compute score
        total += targets.size(0)
        correct += (predicted == actual).sum().item()

print("Accuracy: {:2f}".format(100 * correct / total))

Accuracy: 98.000000
