# Saving and Loading Models

Today we are going to learn about saving and loading models in `pytorch`. First, let's import our modules.

In [1]:
import torch
import torch.nn as nn

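We have 3 different methods to remember:

* `torch.save(arg, PATH)`: `arg` can be a model, tensor, or dictionary. This uses Python's `pickle` module to serialize the object, so the saved file is not human readable.
* `torch.load(PATH)`: deserializes whatever object was saved at `PATH`.
* `model.load_state_dict(arg)`: copies the parameters from a loaded state dictionary into an existing model. Note that this is a method of the model (`nn.Module`), not a function in the `torch` namespace.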
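
As a quick illustration of the save/load pair, here is a minimal sketch that round-trips a plain dictionary holding a tensor. The scratch path and the dictionary contents are made up for this example and are not part of the tutorial.

```python
import os
import tempfile

import torch

# Hypothetical scratch file; any writable path works.
path = os.path.join(tempfile.mkdtemp(), "objects.pth")

# torch.save accepts tensors, dictionaries, and whole modules alike.
payload = {"weights": torch.arange(6.0).reshape(2, 3), "note": "demo"}
torch.save(payload, path)

# torch.load returns the same Python structure that was saved.
restored = torch.load(path)
print(torch.equal(restored["weights"], payload["weights"]))
print(restored["note"])
```

The same two calls work unchanged when the saved object is a `state_dict`, which is just an ordered dictionary of tensors.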


We have 2 different ways of saving the model:

* Lazy way: save whole model
* `torch.save(model, PATH)`

To load it, the model class must be defined or importable in the loading script:
* `model = torch.load(PATH)`
* `model.eval()`

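The above method is imperfect because `pickle` does not save the model class itself, only a reference to it. The serialized data is therefore bound to the specific class and the exact directory structure used when the model was saved, and loading can break if the code is later moved or refactored.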
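
To see why the class has to be importable at load time, the following sketch pickles a small object with the standard library and checks what ends up in the byte stream. `TinyModel` is a hypothetical stand-in class, and plain `pickle` is used here instead of `torch.save` to keep the example small.

```python
import pickle


class TinyModel:
    """Hypothetical stand-in for a model class; not part of the tutorial."""

    def __init__(self):
        self.weight = 0.5


data = pickle.dumps(TinyModel())

# The byte stream names the class and its module instead of embedding its code,
# so the same import path must resolve when the file is loaded.
print(TinyModel.__module__.encode() in data)
print(b"TinyModel" in data)
```

Because only the names are stored, renaming the class or moving it to another module after saving leaves the file pointing at something that no longer exists.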

-----

* Recommended way: save only the state_dict
This will save only the learned parameters, which is all we need to rebuild the model for inference later.
* `torch.save(model.state_dict(), PATH)`

To load it, the model must be created again with the same constructor arguments, and the saved parameters are then loaded into it:
* `model = Model(*args, **kwargs)`
* `model.load_state_dict(torch.load(PATH))`
* `model.eval()`

We will go through each of these in more detail below, and also look at saving a full training checkpoint.

First, let's define and create the model.

In [2]:
class Model(nn.Module):
    def __init__(self, n_input_features):
        super(Model, self).__init__()
        self.linear = nn.Linear(n_input_features, 1)

    def forward(self, x):
        y_pred = torch.sigmoid(self.linear(x))
        return y_pred

model = Model(n_input_features=6)
# train your model...

Now, to save the entire model. This is the lazy option.

In [3]:
#################### save all ######################################
for param in model.parameters():
    print(param)

# save and load entire model

FILE = "model.pth"
torch.save(model, FILE)

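# Note: on PyTorch 2.6 and later, torch.load defaults to weights_only=True,
# so loading a whole pickled model needs torch.load(FILE, weights_only=False).
loaded_model = torch.load(FILE)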
loaded_model.eval()

#inspecting the parameters 
for param in loaded_model.parameters():
    print(param)

Parameter containing:
tensor([[-0.1565, -0.1062, -0.2369, -0.2470,  0.3051,  0.0865]],
       requires_grad=True)
Parameter containing:
tensor([-0.2555], requires_grad=True)
Parameter containing:
tensor([[-0.1565, -0.1062, -0.2369, -0.2470,  0.3051,  0.0865]],
       requires_grad=True)
Parameter containing:
tensor([-0.2555], requires_grad=True)


Now to save only the state dictionary. This is the preferred option.

In [4]:
############save only state dict #########################

# save only state dict
FILE = "model.pth"
torch.save(model.state_dict(), FILE)

print(model.state_dict())
loaded_model = Model(n_input_features=6)
loaded_model.load_state_dict(torch.load(FILE)) # load_state_dict takes the loaded dictionary, not the file path
loaded_model.eval()

#printing the parameters of the loaded model
print(loaded_model.state_dict())

OrderedDict([('linear.weight', tensor([[-0.1565, -0.1062, -0.2369, -0.2470,  0.3051,  0.0865]])), ('linear.bias', tensor([-0.2555]))])
OrderedDict([('linear.weight', tensor([[-0.1565, -0.1062, -0.2369, -0.2470,  0.3051,  0.0865]])), ('linear.bias', tensor([-0.2555]))])
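
Rather than comparing the printed dictionaries by eye, the round trip can be checked programmatically. This is a minimal sketch that uses a bare `nn.Linear` as a stand-in for the model above and an in-memory buffer in place of `model.pth`; both substitutions are assumptions made to keep the example self-contained.

```python
import io

import torch
import torch.nn as nn

# Stand-in for the tutorial's model: one linear layer with 6 inputs.
source = nn.Linear(6, 1)

# torch.save also accepts a file-like object; a buffer avoids touching disk.
buffer = io.BytesIO()
torch.save(source.state_dict(), buffer)
buffer.seek(0)

# A fresh instance has its own random weights until the saved state is loaded.
target = nn.Linear(6, 1)
target.load_state_dict(torch.load(buffer))

# Compare every tensor by name.
matches = all(
    torch.equal(tensor, target.state_dict()[name])
    for name, tensor in source.state_dict().items()
)
print(list(source.state_dict().keys()))
print(matches)
```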


Now, to save and load a checkpoint.

In [5]:
########### save and load checkpoint #####################
learning_rate = 0.01
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

checkpoint = {
    "epoch": 90,
    "model_state": model.state_dict(),
    "optim_state": optimizer.state_dict(),
}
print(optimizer.state_dict())
FILE = "checkpoint.pth"
torch.save(checkpoint, FILE)

model = Model(n_input_features=6)
# lr=0 is only a placeholder; load_state_dict below restores the saved lr of 0.01
optimizer = torch.optim.SGD(model.parameters(), lr=0)

checkpoint = torch.load(FILE)
model.load_state_dict(checkpoint['model_state'])
optimizer.load_state_dict(checkpoint['optim_state'])
epoch = checkpoint['epoch']

model.eval()
# - or -
# model.train()

print(optimizer.state_dict())

# Remember that you must call model.eval() to set dropout and batch normalization layers
# to evaluation mode before running inference. Failing to do this will yield
# inconsistent inference results. If you wish to resume training,
# call model.train() to ensure these layers are in training mode.

{'state': {}, 'param_groups': [{'lr': 0.01, 'momentum': 0, 'dampening': 0, 'weight_decay': 0, 'nesterov': False, 'maximize': False, 'foreach': None, 'params': [0, 1]}]}
{'state': {}, 'param_groups': [{'lr': 0.01, 'momentum': 0, 'dampening': 0, 'weight_decay': 0, 'nesterov': False, 'maximize': False, 'foreach': None, 'params': [0, 1]}]}
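
The point of a checkpoint is to continue training where it stopped. Below is a minimal sketch of that, under some assumptions that are not in the cell above: a bare `nn.Linear` stands in for the model, random tensors stand in for training data, momentum is enabled so the optimizer actually has state to restore, and `"epoch"` is taken to mean the last finished epoch.

```python
import io

import torch
import torch.nn as nn

model = nn.Linear(6, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# One dummy training step so the optimizer has momentum buffers to save.
loss = model(torch.randn(4, 6)).pow(2).mean()
loss.backward()
optimizer.step()

buffer = io.BytesIO()
torch.save(
    {
        "epoch": 90,
        "model_state": model.state_dict(),
        "optim_state": optimizer.state_dict(),
    },
    buffer,
)
buffer.seek(0)

# Rebuild fresh objects, then restore their state from the checkpoint.
resumed_model = nn.Linear(6, 1)
resumed_optimizer = torch.optim.SGD(resumed_model.parameters(), lr=0)
checkpoint = torch.load(buffer)
resumed_model.load_state_dict(checkpoint["model_state"])
resumed_optimizer.load_state_dict(checkpoint["optim_state"])

start_epoch = checkpoint["epoch"] + 1  # assumes "epoch" is the last finished epoch
resumed_model.train()  # training mode, since we are resuming
for epoch in range(start_epoch, start_epoch + 2):
    resumed_optimizer.zero_grad()
    loss = resumed_model(torch.randn(4, 6)).pow(2).mean()
    loss.backward()
    resumed_optimizer.step()

print(start_epoch, resumed_optimizer.param_groups[0]["lr"])
```

Note that the placeholder learning rate of 0 is overwritten by the loaded optimizer state, just as in the cell above.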


When saving and loading across devices (GPU/CPU), there are three cases:

* Save on GPU, Load on CPU

`device = torch.device("cuda")
model.to(device)
torch.save(model.state_dict(), PATH)
device = torch.device('cpu')
model = Model(*args, **kwargs)
model.load_state_dict(torch.load(PATH, map_location=device))`

* Save on GPU, Load on GPU

`device = torch.device("cuda")
model.to(device)
torch.save(model.state_dict(), PATH)
model = Model(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
model.to(device)`

Note: Be sure to call `.to(torch.device('cuda'))` on all model inputs, too!

* Save on CPU, Load on GPU

`torch.save(model.state_dict(), PATH)
device = torch.device("cuda")
model = Model(*args, **kwargs)
model.load_state_dict(torch.load(PATH, map_location="cuda:0"))  # Choose whatever GPU device number you want
model.to(device)`

Setting `map_location` loads the saved tensors onto the given GPU device.
Afterwards, be sure to call `model.to(torch.device('cuda'))` to convert the model's parameter tensors to CUDA tensors.
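
In practice, the three cases can often be folded into one device-agnostic pattern. The sketch below is one way to do that, assuming a bare `nn.Linear` as a stand-in model and an in-memory buffer in place of `PATH`; it falls back to the CPU when no GPU is visible.

```python
import io

import torch
import torch.nn as nn

# Pick the GPU when one is visible, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

buffer = io.BytesIO()
torch.save(nn.Linear(6, 1).state_dict(), buffer)
buffer.seek(0)

model = nn.Linear(6, 1)
# map_location remaps the saved tensors onto the chosen device while loading.
model.load_state_dict(torch.load(buffer, map_location=device))
model.to(device)
model.eval()

# Inputs must live on the same device as the model.
inputs = torch.randn(2, 6).to(device)
with torch.no_grad():
    outputs = model(inputs)
print(outputs.shape, outputs.device.type == device.type)
```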