# Saving and loading models

In the previous lab you probably noticed it can take a long time to train these models. It is therefore very useful to be able to save models to disk so they can be reused.

Before we can save a model, we need somewhere to save it. Colab notebooks can easily interact with your Google Drive. To enable this, the Drive first has to be mounted:

In [2]:
from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)

Mounted at /content/gdrive


We can now read from and write to the Drive. First, let's create a folder to work in:

In [3]:
import os

folder_path = './gdrive/My Drive/thisfoldernamesurelydoesnotalreadyexistinyourdrive24/'

if not os.path.isdir(folder_path):
  os.mkdir(folder_path)
  print('Folder created!')


Folder created!
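As a side note, `os.mkdir` fails if the parent folder is missing, and building paths by string concatenation depends on the trailing slash. The sketch below shows one possible alternative using `os.makedirs` and `os.path.join`. It uses a temporary directory as a stand-in for the Drive folder, and the names `demo_folder`, `demo` and `models` are made up for illustration.

```python
import os
import tempfile

# A temporary directory stands in for the Drive folder (assumption for this sketch)
with tempfile.TemporaryDirectory() as base_dir:
  demo_folder = os.path.join(base_dir, 'demo', 'models')  # hypothetical folder names

  os.makedirs(demo_folder, exist_ok=True)  # Creates 'demo' and 'models' in one call
  os.makedirs(demo_folder, exist_ok=True)  # Calling it again is a no-op, not an error

  folder_exists = os.path.isdir(demo_folder)
  file_path = os.path.join(demo_folder, 'some_file.txt')  # No trailing slash needed

print(folder_exists)
```

With `exist_ok=True` the explicit `os.path.isdir` check before creating the folder is no longer needed.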


We use this folder to store the files used in this demo:

In [4]:
with open(folder_path + 'some_file.txt', 'w') as f:  # Write something to the file (overwriting current file content)
  f.write('foo')

with open(folder_path + 'some_file.txt', 'a') as f:  # Append to existing content in the file
  f.write('bar')

with open(folder_path + 'some_file.txt', 'r') as f:  # Read from the file
  print(f.read())


foobar


Now suppose we have a model that we'd like to reuse and thus want to save to disk. In this example we do not use a trained model, but the same methods work for trained models as well.

In [5]:
import torch
import torch.nn as nn
import torch.nn.functional as F  # Needed for F.relu and F.softmax in MLP.forward

In [6]:
class MLP(nn.Module):

  def __init__(self, input_size, layer_sizes, output_size):
    super(MLP, self).__init__()
    shape = (input_size,) + tuple(layer_sizes) + (output_size,)
    self.layers = nn.ModuleList([nn.Linear(shape[i], shape[i + 1]) for i in range(len(shape) - 1)])

  def forward(self, x):
    for layer in self.layers[:-1]:
      x = F.relu(layer(x))
    return F.softmax(self.layers[-1](x), dim=1)

In [7]:
model = MLP(784, (32, 32), 10)  # Initialize a new model

We can save the entire model object as follows:

In [8]:
torch.save(model, folder_path + 'model.pth')

The model can then be loaded from disk as follows:

In [10]:
model = torch.load(folder_path + 'model.pth', weights_only=False)
print(model)

MLP(
  (layers): ModuleList(
    (0): Linear(in_features=784, out_features=32, bias=True)
    (1): Linear(in_features=32, out_features=32, bias=True)
    (2): Linear(in_features=32, out_features=10, bias=True)
  )
)


A disadvantage of this method is that it may break when the model class is changed or when the files containing the class definition are moved or restructured. This is because the saved file stores only a reference to the location of the class definition, not the actual code. An alternative is to store only the weights, which are contained in the model's state dict. Saving and loading then works as follows:

In [11]:
torch.save(model.state_dict(), folder_path + 'model_state.pth')

In [12]:
model_state = torch.load(folder_path + 'model_state.pth')  # Load the model's weights
model = MLP(784, (32, 32), 10)  # It is required to have a model object to set its weights, so we initialize a new one
model.load_state_dict(model_state)  # Set the model weights to the saved weights. Note that the model architecture must be equal to the architecture of the model from which the state_dict was obtained!

<All keys matched successfully>
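To make the state dict less abstract, here is a minimal, self-contained sketch of what it contains and of a save/load round trip. It uses a small `nn.Linear` layer as a stand-in for the MLP and a temporary directory instead of the Drive folder. The names `tiny`, `restored` and `tiny_state.pth` are made up for this example.

```python
import os
import tempfile

import torch
import torch.nn as nn

tiny = nn.Linear(4, 2)  # Hypothetical stand-in model; any nn.Module works the same way

# The state dict maps parameter names to tensors
state = tiny.state_dict()
param_shapes = {name: tuple(tensor.shape) for name, tensor in state.items()}
print(param_shapes)

# Round trip through a temporary file (stand-in for the Drive folder)
with tempfile.TemporaryDirectory() as tmp_dir:
  path = os.path.join(tmp_dir, 'tiny_state.pth')
  torch.save(state, path)

  restored = nn.Linear(4, 2)  # Fresh model with different random weights
  restored.load_state_dict(torch.load(path))

# After loading, every tensor in the new model equals the saved one
weights_match = all(torch.equal(state[k], restored.state_dict()[k]) for k in state)
print(weights_match)
```

Because the file holds only named tensors, loading it does not depend on where the class definition lives. It does require a model whose parameter names and shapes match.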

Similarly, it is possible to save the state of an optimizer. Suppose you want to stop training a model but be able to continue later. This requires you to save the optimizer state as well (think, for example, of the velocity term when using momentum in SGD).

In [13]:
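optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.5)

torch.save(optimizer.state_dict(), folder_path + 'optimizer_state.pth')  # Save the optimizer state

optimizer_state = torch.load(folder_path + 'optimizer_state.pth')  # Load the optimizer state

optimizer = torch.optim.SGD(model.parameters(), lr=42)  # New optimizer with deliberately different hyperparameters

optimizer.load_state_dict(optimizer_state)  # Replaces the hyperparameters (lr, momentum, ...) with the saved ones

print(optimizer)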

SGD (
Parameter Group 0
    dampening: 0
    differentiable: False
    foreach: None
    fused: None
    lr: 0.01
    maximize: False
    momentum: 0.5
    nesterov: False
    weight_decay: 0
)
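When training is meant to be resumed, one common pattern is to bundle the model state, the optimizer state and some bookkeeping into a single checkpoint file. The following is only a sketch under assumptions: the dictionary keys (`epoch`, `model_state`, `optimizer_state`), the file name and the small `nn.Linear` stand-in model are chosen for illustration, and a temporary directory replaces the Drive folder.

```python
import os
import tempfile

import torch
import torch.nn as nn

net = nn.Linear(4, 2)  # Hypothetical stand-in for the MLP
opt = torch.optim.SGD(net.parameters(), lr=0.01, momentum=0.5)

# Take one training step on random data so the momentum buffers exist
loss = net(torch.randn(8, 4)).pow(2).mean()
loss.backward()
opt.step()

# Bundle everything needed to resume (key names are an assumption of this sketch)
checkpoint = {
  'epoch': 3,
  'model_state': net.state_dict(),
  'optimizer_state': opt.state_dict(),
}

with tempfile.TemporaryDirectory() as tmp_dir:
  path = os.path.join(tmp_dir, 'checkpoint.pth')
  torch.save(checkpoint, path)
  loaded = torch.load(path)

# Rebuild the objects first, then restore their states
new_net = nn.Linear(4, 2)
new_opt = torch.optim.SGD(new_net.parameters(), lr=42)
new_net.load_state_dict(loaded['model_state'])
new_opt.load_state_dict(loaded['optimizer_state'])

start_epoch = loaded['epoch'] + 1  # Continue with the next epoch
restored_lr = new_opt.param_groups[0]['lr']
has_momentum_buffer = 'momentum_buffer' in new_opt.state[new_net.weight]
print(start_epoch, restored_lr, has_momentum_buffer)
```

Keeping both states in one file avoids the model and optimizer files drifting out of sync, at the cost of having to agree on the dictionary keys.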
