### Saving and loading a pre-trained model to run inference
A defined neural network has three important components:
- A unique name (key) for each tensor (parameter). This is taken care of in `__init__`.
- The logic that connects the tensors in the network to one another. This is taken care of in the `forward` method.
- The values (weights and biases) of each tensor.
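The three components above can be seen in a small hand-written module. This is a minimal sketch, not part of the original notebook: the class name `TinyNet` and its attribute names are hypothetical, chosen only to show where the parameter keys come from.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):  # hypothetical name, for illustration only
    def __init__(self):
        super().__init__()
        # The attribute names assigned here become the prefixes of the parameter keys.
        self.hidden = nn.Linear(2, 8)
        self.activation = nn.ReLU()
        self.output = nn.Linear(8, 1)

    def forward(self, x):
        # forward defines how the layers are connected to one another.
        return self.output(self.activation(self.hidden(x)))

net = TinyNet()
# state_dict maps each parameter name to its current values.
param_shapes = {name: tuple(t.shape) for name, t in net.state_dict().items()}
for name, shape in param_shapes.items():
    print(name, shape)
```

The printed keys (`hidden.weight`, `hidden.bias`, `output.weight`, `output.bias`) are the names, `forward` is the connection logic, and the tensors stored under those keys are the values.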

In [None]:
x = [[1,2],[3,4],[5,6],[7,8]]
y = [[3],[7],[11],[15]]

In [2]:
import torch
import torch.nn as nn
import numpy as np
from torch.utils.data import Dataset, DataLoader
device = 'cuda' if torch.cuda.is_available() else 'cpu'

In [3]:
class MyDataset(Dataset):
    def __init__(self, x, y):
        self.x = torch.tensor(x).float().to(device)
        self.y = torch.tensor(y).float().to(device)
    def __getitem__(self, ix):
        return self.x[ix], self.y[ix]
    def __len__(self): 
        return len(self.x)

In [4]:
ds = MyDataset(x, y)
dl = DataLoader(ds, batch_size=2, shuffle=True)

In [5]:
model = nn.Sequential(
    nn.Linear(2, 8),
    nn.ReLU(),
    nn.Linear(8, 1)
).to(device)

In [6]:
!pip install torch_summary
from torchsummary import summary



In [7]:
summary(model, torch.zeros(1,2));

Layer (type:depth-idx)                   Output Shape              Param #
├─Linear: 1-1                            [-1, 8]                   24
├─ReLU: 1-2                              [-1, 8]                   --
├─Linear: 1-3                            [-1, 1]                   9
Total params: 33
Trainable params: 33
Non-trainable params: 0
Total mult-adds (M): 0.00
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.00
Estimated Total Size (MB): 0.00
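The parameter counts in the summary can be checked by hand: a fully connected layer has one weight per input-output pair plus one bias per output. The helper name `linear_param_count` below is an assumption introduced for this check, not a PyTorch function.

```python
def linear_param_count(in_features, out_features):
    # Weight matrix (out_features x in_features) plus one bias per output unit.
    return in_features * out_features + out_features

first = linear_param_count(2, 8)    # 2*8 + 8 = 24
second = linear_param_count(8, 1)   # 8*1 + 1 = 9
total = first + second              # ReLU contributes no parameters
print(first, second, total)
```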


### Saving

In [8]:
# model.state_dict() returns a dictionary (an OrderedDict) of the model's parameter names and values.
# "State" refers to a snapshot of the model: the set of values held by each tensor at that moment.
# The keys are the names of the layers' parameters and the values are the corresponding weight and bias tensors.
# Moving the model to the CPU before calling torch.save stores CPU tensors rather than CUDA tensors,
# which helps when loading on a machine without CUDA. This cell saves the state_dict as-is.
save_path = 'mymodel.pth'
torch.save(model.state_dict(), save_path)
import os
print(f'{os.path.getsize(save_path)} bytes') # size of the model on disk (portable alternative to `du`)
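One way to store CPU tensors without moving the model itself is to copy each entry of the state_dict to the CPU before saving. The sketch below is an assumption about how this could be done, not the notebook's own cell; `model_demo`, `cpu_state`, and the in-memory `buffer` (a stand-in for a path such as `'mymodel.pth'`) are illustrative names.

```python
import io
import torch
import torch.nn as nn

# Same architecture as the notebook's model, rebuilt so the example is self-contained.
model_demo = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1))

# Copy every tensor to the CPU; the model stays on its original device.
cpu_state = {name: t.cpu() for name, t in model_demo.state_dict().items()}

buffer = io.BytesIO()  # in-memory stand-in for a file on disk
torch.save(cpu_state, buffer)
saved_bytes = buffer.getbuffer().nbytes

print(list(cpu_state.keys()))
print(saved_bytes, 'bytes')
```

For an `nn.Sequential` model the keys are the layer indices followed by the parameter name, for example `0.weight` and `2.bias`.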


### Loading

In [9]:
# To load the model, first create it (with random weights), then load the saved weights from the state_dict.
load_path = 'mymodel.pth'
# Load the file from disk and deserialize it into an OrderedDict, mapping the tensors onto the current device.
model.load_state_dict(torch.load(load_path, map_location=device))

<All keys matched successfully>
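The loading steps can be sketched as a round trip into a freshly constructed model, which is the usual situation when the weights are reused in a new session. This is a minimal sketch under stated assumptions: `build_model` is a hypothetical helper that rebuilds the same architecture, and an in-memory buffer replaces the file on disk.

```python
import io
import torch
import torch.nn as nn

def build_model():  # hypothetical helper: must match the saved architecture
    return nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1))

# Save the weights of one model instance.
source = build_model()
buffer = io.BytesIO()
torch.save(source.state_dict(), buffer)
buffer.seek(0)

# Create a new instance (random weights), then overwrite them with the saved ones.
restored = build_model()
state = torch.load(buffer, map_location='cpu')  # map_location avoids needing CUDA
result = restored.load_state_dict(state)

sample = torch.tensor([[8., 9.], [10., 11.]])
same_output = torch.equal(source(sample), restored(sample))
print(result, same_output)
```

If the architecture does not match the saved keys, `load_state_dict` raises an error by default, so a successful load is a useful sanity check.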

### Predictions

In [10]:
# Format into float tensors
val = [[8,9],[10,11],[1.5,2.5]]
val = torch.tensor(val).float()

In [11]:
# Move the inputs to the model's device and make predictions.
model(val.to(device))

tensor([[-0.4223],
        [-0.5676],
        [ 0.1194]], device='cuda:0', grad_fn=<AddmmBackward0>)

In [12]:
# Reference values: the sum of each input pair, for comparison with the predictions.
val.sum(-1)

tensor([17., 21.,  4.])
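When a model is used only for inference, gradient tracking is usually switched off so that the outputs carry no `grad_fn`. The following is a sketch of that pattern, assuming a model with the same architecture as above; the weights here are random, so the predicted values themselves are not meaningful.

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1))
net.eval()  # inference mode for layers such as dropout or batch norm (none here)

inputs = torch.tensor([[8., 9.], [10., 11.], [1.5, 2.5]])
with torch.no_grad():  # no computation graph is built inside this block
    preds = net(inputs)

print(preds.requires_grad, preds.shape)
```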