
- How to save and load a trained model
  - What gets saved and loaded
    - A unique key for each parameter (tensor)
    - The values of each tensor

In [3]:
x = [[1,2],[3,4],[5,6],[7,8]]
y = [[3],[7],[11],[15]]

In [4]:
import torch
import torch.nn as nn
import numpy as np
from torch.utils.data import Dataset, DataLoader
device = 'cuda' if torch.cuda.is_available() else 'cpu'

In [5]:
class MyDataset(Dataset):
    def __init__(self, x, y):
        self.x = torch.tensor(x).float().to(device)
        self.y = torch.tensor(y).float().to(device)
    def __getitem__(self, ix):
        return self.x[ix], self.y[ix]
    def __len__(self):
        return len(self.x)

In [6]:
ds = MyDataset(x, y)
dl = DataLoader(ds, batch_size=2, shuffle=True)

In [7]:
model = nn.Sequential(
    nn.Linear(2, 8),
    nn.ReLU(),
    nn.Linear(8, 1)
).to(device)

In [8]:
!pip install torch_summary
from torchsummary import summary

Collecting torch_summary
  Downloading torch_summary-1.4.5-py3-none-any.whl (16 kB)
Installing collected packages: torch_summary
Successfully installed torch_summary-1.4.5


In [9]:
summary(model, torch.zeros(1,2));

Layer (type:depth-idx)                   Output Shape              Param #
├─Linear: 1-1                            [-1, 8]                   24
├─ReLU: 1-2                              [-1, 8]                   --
├─Linear: 1-3                            [-1, 1]                   9
Total params: 33
Trainable params: 33
Non-trainable params: 0
Total mult-adds (M): 0.00
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.00
Estimated Total Size (MB): 0.00


In [10]:
loss_func = nn.MSELoss()
from torch.optim import SGD
opt = SGD(model.parameters(), lr = 0.001)
import time
loss_history = []
start = time.time()
for _ in range(50):
    for ix, iy in dl:
        opt.zero_grad()
        loss_value = loss_func(model(ix),iy)
        loss_value.backward()
        opt.step()
        loss_history.append(loss_value.item())  # store a plain float, not a tensor still attached to the graph
end = time.time()
print(end - start)

0.4818112850189209


### Saving
- `.state_dict()`: returns a dictionary (`OrderedDict`)
  - key: parameter name
  - value: the weight and bias values

Call `model.state_dict()` to inspect the keys and values.

In [11]:
save_path = 'mymodel.pth'
torch.save(model.state_dict(), save_path)
## To make the file loadable on a machine without a GPU, move the model to 'cpu' first so that CPU tensors are saved
# torch.save(model.to('cpu').state_dict(), save_path)
!du -hsc {save_path} # size of the model on disk

4.0K	mymodel.pth
4.0K	total
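The commented-out line in the cell above moves the model itself to the CPU, because `model.to('cpu')` changes an `nn.Module` in place. A minimal sketch of an alternative, assuming you want the model to stay on its current device: copy each tensor in the state dict to the CPU and save that copy. The name `cpu_save_path` and the temporary directory are hypothetical, chosen only for illustration.

```python
import os
import tempfile

import torch
import torch.nn as nn

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Same architecture as the notebook's `model`, rebuilt so the snippet runs on its own.
model = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1)).to(device)

# Copy every tensor to the CPU; the model's own parameters are not moved.
cpu_state = {key: t.detach().cpu() for key, t in model.state_dict().items()}

# Hypothetical file name, written to a temporary directory for illustration.
cpu_save_path = os.path.join(tempfile.mkdtemp(), 'mymodel_cpu.pth')
torch.save(cpu_state, cpu_save_path)

# Check where the saved tensors and the live model ended up.
saved_devices = {t.device.type for t in torch.load(cpu_save_path).values()}
model_device = next(model.parameters()).device.type
print(saved_devices, model_device)
```

The saved tensors are all on the CPU, while the model keeps running on whatever `device` it was already using.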


In [12]:
model.state_dict()

OrderedDict([('0.weight',
              tensor([[ 0.7648,  0.1706],
                      [-0.1390, -0.5008],
                      [ 0.2607, -0.5519],
                      [-0.2858,  0.1862],
                      [ 0.6420,  0.8098],
                      [-0.4696, -0.3025],
                      [-0.0110, -0.4810],
                      [-0.4226,  0.2460]], device='cuda:0')),
             ('0.bias',
              tensor([-0.1848,  0.4413,  0.4835,  0.0131, -0.2652, -0.2979,  0.1220,  0.1771],
                     device='cuda:0')),
             ('2.weight',
              tensor([[ 0.6135, -0.0665,  0.1171, -0.2966,  1.0012, -0.2657, -0.0585, -0.2576]],
                     device='cuda:0')),
             ('2.bias', tensor([0.3287], device='cuda:0'))])
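The raw dump above is easier to read as a key-to-shape table. The following is a small sketch that rebuilds the same architecture (an untrained stand-in, so the values differ from the output above) and lists each key with the shape of its tensor. The keys come from the position of each layer inside `nn.Sequential`; the `ReLU` at index 1 has no parameters, so no key starts with `1.`.

```python
import torch.nn as nn

# Same architecture as the notebook's `model`, rebuilt so the snippet runs on its own.
model = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1))

# Map each state-dict key to the shape of the tensor stored under it.
shapes = {key: tuple(t.shape) for key, t in model.state_dict().items()}
for key, shape in shapes.items():
    print(f"{key:10s} {shape}")

# Total number of stored values; this should agree with the 33 parameters
# reported by summary() earlier in the notebook.
n_values = sum(t.numel() for t in model.state_dict().values())
print(n_values)
```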

### Loading
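- To load a saved model, a model with the same architecture as the saved one must already be defined
  - Here we reuse the model assigned to the `model` variable above
  - If `model` is not defined, define it the same way with `nn.Sequential(..)`
- Load the state dict with `torch.load()`, then overwrite the parameters of the defined `model` with `model.load_state_dict()`
  - Every key must match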
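The steps in the list above can be sketched as a full round trip into a freshly defined model, which is the situation when `model` is not already in memory. The helper `build_model` and the temporary file path are hypothetical names introduced for this illustration, and the "trained" model here is an untrained stand-in.

```python
import os
import tempfile

import torch
import torch.nn as nn

device = 'cuda' if torch.cuda.is_available() else 'cpu'


def build_model():
    # Hypothetical helper: it must mirror the architecture used at save time.
    return nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1))


# Stand-in for the trained model; save its state dict to a temporary file.
trained = build_model().to(device)
path = os.path.join(tempfile.mkdtemp(), 'mymodel.pth')
torch.save(trained.state_dict(), path)

# Define a fresh model with the same architecture, then overwrite its parameters.
restored = build_model().to(device)
state = torch.load(path, map_location=device)  # remap tensors to the current device
result = restored.load_state_dict(state)
print(result)

# Both models should now produce the same output for the same input.
x = torch.tensor([[1., 2.], [3., 4.]], device=device)
same = torch.allclose(trained(x), restored(x))
print(same)
```

Passing `map_location` is optional when saving and loading happen on the same machine, but it avoids an error when a file saved from a GPU is opened where no GPU is available.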

In [13]:
load_path = 'mymodel.pth'
model.load_state_dict(torch.load(load_path, map_location=device))  # map_location remaps the saved tensors to the current device

<All keys matched successfully>
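For contrast with the successful load above, here is a sketch of what happens when the keys do not match. The larger model `bigger` is a made-up example, not part of the original notebook. With the default `strict=True`, `load_state_dict()` raises a `RuntimeError`; with `strict=False` it loads the keys that match and reports the rest.

```python
import torch.nn as nn

# State dict from the notebook's architecture (untrained stand-in).
saved_state = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1)).state_dict()

# A deliberately different architecture: the extra Linear layer at index 4
# adds the keys '4.weight' and '4.bias', which the saved state dict lacks.
bigger = nn.Sequential(
    nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1), nn.ReLU(), nn.Linear(1, 1)
)

try:
    bigger.load_state_dict(saved_state)  # strict=True is the default
    strict_failed = False
except RuntimeError as err:
    strict_failed = True
    print(err)

# strict=False loads the matching keys and returns a report of the others.
report = bigger.load_state_dict(saved_state, strict=False)
print(report.missing_keys)
print(report.unexpected_keys)
```

Using `strict=False` leaves the unmatched layers at their initial values, so it is mainly useful when that is intended, for example when only part of a network is being reused.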

### Predictions

In [None]:
val = [[8,9],[10,11],[1.5,2.5]]
val = torch.tensor(val).float()

In [None]:
model(val.to(device))

tensor([[16.5711],
        [20.2829],
        [ 4.5079]], device='cuda:0', grad_fn=<AddmmBackward0>)

In [None]:
val.sum(-1)

tensor([17., 21.,  4.])
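The prediction above carries a `grad_fn`, which means autograd is still tracking the forward pass. A minimal sketch of a common inference pattern, assuming gradients are not needed for prediction: switch the model to evaluation mode and run it inside `torch.no_grad()`. The model below is an untrained stand-in, so its numbers will not match the output above.

```python
import torch
import torch.nn as nn

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Untrained stand-in with the notebook's architecture; in the notebook,
# use the loaded `model` instead.
model = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1)).to(device)

val = torch.tensor([[8, 9], [10, 11], [1.5, 2.5]]).float()

model.eval()              # evaluation mode (affects layers such as dropout and batch norm)
with torch.no_grad():     # do not build the autograd graph during prediction
    pred = model(val.to(device))

print(tuple(pred.shape), pred.requires_grad)
```

This particular model has no dropout or batch-norm layers, so `model.eval()` does not change its outputs here; it is included because it matters for models that do.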