Let's build a model and save it

CIFAR10 TUTORIAL


https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html#sphx-glr-beginner-blitz-cifar10-tutorial-py


In [1]:
# Import libraries
import torch
import torchvision
import torchvision.transforms as transforms

In [2]:
# device
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)

cuda:0


In [3]:
# Define the transform applied when loading the data, composed with transforms.Compose

transform = transforms.Compose(
    [transforms.ToTensor()])  # ToTensor() converts each image to a float tensor with values in [0, 1]

batch_size = 32

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size,   # trainloader yields the training data in batches
                                          shuffle=True, num_workers=2)  # num_workers is the number of data-loading worker processes

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size,
                                         shuffle=False, num_workers=2)


Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./data/cifar-10-python.tar.gz



Extracting ./data/cifar-10-python.tar.gz to ./data
Files already downloaded and verified


In [4]:
trainset.classes

['airplane',
 'automobile',
 'bird',
 'cat',
 'deer',
 'dog',
 'frog',
 'horse',
 'ship',
 'truck']

In [5]:
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):   # subclass nn.Module
    def __init__(self):  # initialize the class
        super().__init__()  # initialize the parent class
        self.conv1 = nn.Conv2d(3, 6, 5) # In, Out, Kernel
        self.pool = nn.MaxPool2d(2, 2)  # Kernel, stride
        self.conv2 = nn.Conv2d(6, 16, 5) # In, Out, Kernel
        self.fc1 = nn.Linear(16 * 5 * 5, 120) # In, Out; In matches the size after flattening
        self.fc2 = nn.Linear(120, 84) # In, Out
        self.fc3 = nn.Linear(84, 10) # In, Out

    def forward(self, x):  # x is the input data
        x = self.pool(F.relu(self.conv1(x)))  # conv1 --> relu --> maxpool
        x = self.pool(F.relu(self.conv2(x))) # conv2 --> relu --> maxpool
        x = torch.flatten(x, 1) # flatten all dimensions except batch
        x = F.relu(self.fc1(x))  # Linear + relu
        x = F.relu(self.fc2(x))  # Linear + relu
        x = self.fc3(x)
        return x

net = Net() # create a model instance

In [6]:
from torchsummary import summary

model = net.to(device)  # move the model to the selected device (GPU if available)
summary(model, (3, 32, 32))   # the input shape (C, H, W) must be provided


----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1            [-1, 6, 28, 28]             456
         MaxPool2d-2            [-1, 6, 14, 14]               0
            Conv2d-3           [-1, 16, 10, 10]           2,416
         MaxPool2d-4             [-1, 16, 5, 5]               0
            Linear-5                  [-1, 120]          48,120
            Linear-6                   [-1, 84]          10,164
            Linear-7                   [-1, 10]             850
Total params: 62,006
Trainable params: 62,006
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.01
Forward/backward pass size (MB): 0.06
Params size (MB): 0.24
Estimated Total Size (MB): 0.31
----------------------------------------------------------------
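
The output shapes and parameter counts in the summary above can be reproduced by hand. The sketch below uses the standard output-size formula for a convolution or pooling window; the helper names (`conv_out`, `conv_params`, `linear_params`) are made up for this illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output length along one spatial axis for a conv or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

def conv_params(c_in, c_out, kernel):
    """Weights (c_in * k * k per output channel) plus one bias per output channel."""
    return c_out * (c_in * kernel * kernel + 1)

def linear_params(n_in, n_out):
    """Weights (n_in per output unit) plus one bias per output unit."""
    return n_out * (n_in + 1)

size = 32
size = conv_out(size, 5)             # conv1: 32 -> 28
size = conv_out(size, 2, stride=2)   # pool:  28 -> 14
size = conv_out(size, 5)             # conv2: 14 -> 10
size = conv_out(size, 2, stride=2)   # pool:  10 -> 5
flat_features = 16 * size * size     # input size of fc1

param_counts = [
    conv_params(3, 6, 5),
    conv_params(6, 16, 5),
    linear_params(flat_features, 120),
    linear_params(120, 84),
    linear_params(84, 10),
]
total = sum(param_counts)
print(flat_features, param_counts, total)
```

This is where the `16 * 5 * 5` in `fc1` comes from: 16 channels of 5x5 feature maps remain after the second pooling step.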


In [7]:
# Set up the loss function and the optimizer

import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)  # the first argument is the parameters to optimize, usually the model's parameters

In [8]:
for epoch in range(2):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):  # start the index at 0
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data

        # move the batch to the device
        inputs = inputs.to(device)
        labels = labels.to(device)

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)  # model prediction
        loss = criterion(outputs, labels)  # compute the loss
        loss.backward()  # compute the gradients
        optimizer.step()  # update the weights

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999:    # print every 2000 mini-batches (never reached here: batch_size=32 gives only 1563 mini-batches per epoch)
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')

Finished Training


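Let's save the model. There are several ways to save a model.
Files conventionally use the .pt or .pth extension.

1. Saving the entire model

   The simplest approach, but not the one recommended by the official documentation (the whole object is serialized with Python pickle)

2. Saving the state_dict

   Saves only the learned parameters; the model itself must be constructed separately before loading

3. Saving as TorchScript

   Intended for inference; the model class definition is not needed when loading

4. Saving a checkpoint

   Saves a checkpoint that can be used later, for example to resume training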


https://pytorch.org/tutorials/beginner/saving_loading_models.html

Method 1: saving the entire model

In [9]:
PATH = '/content/drive/MyDrive/Teaching/DL 202301/cifar_net.pt'   # set the file name

In [10]:

torch.save(net, PATH)

.eval()

 Switches layers such as dropout and batch normalization to evaluation mode, so that they behave deterministically and inference results are consistent
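
The difference between training and evaluation mode can be seen directly on a dropout layer. This is a minimal sketch with a standalone `nn.Dropout`; the `Net` in this notebook contains no dropout or batch normalization layers, so for it `.eval()` only flips the mode flag.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Dropout(p=0.5)
x = torch.ones(1, 8)

layer.train()
train_out = layer(x)   # random elements are zeroed, the rest are scaled by 1 / (1 - p) = 2

layer.eval()
eval_out = layer(x)    # dropout is disabled: the input passes through unchanged

print(train_out)
print(eval_out)
print(layer.training)
```

Calling `.eval()` before inference is therefore a good habit even for models where it currently makes no difference.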


In [11]:
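model2 = torch.load(PATH, weights_only=False)  # a whole pickled model needs weights_only=False on PyTorch 2.6+, where the default is True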
model2.eval()

Net(
  (conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)

In [33]:
from torchsummary import summary
summary(model2, input_size=(3, 32, 32))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1            [-1, 6, 28, 28]             456
         MaxPool2d-2            [-1, 6, 14, 14]               0
            Conv2d-3           [-1, 16, 10, 10]           2,416
         MaxPool2d-4             [-1, 16, 5, 5]               0
            Linear-5                  [-1, 120]          48,120
            Linear-6                   [-1, 84]          10,164
            Linear-7                   [-1, 10]             850
Total params: 62,006
Trainable params: 62,006
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.01
Forward/backward pass size (MB): 0.06
Params size (MB): 0.24
Estimated Total Size (MB): 0.31
----------------------------------------------------------------


Method 2: state_dict

In [15]:
PATH2 = '/content/drive/MyDrive/Teaching/DL 202301/cifar_net2.pt'   # set the file name

In [16]:
model2.state_dict()  # a model's state_dict holds its parameter values in dict form

OrderedDict([('conv1.weight',
              tensor([[[[ 4.7294e-02,  7.8445e-02,  4.8014e-02,  2.3938e-02,  6.9093e-02],
                        [ 8.6393e-02,  6.0785e-02, -5.6211e-02, -7.6950e-02,  1.2606e-02],
                        [ 1.1442e-01, -5.7123e-02, -2.5644e-02,  1.2771e-01,  6.3129e-02],
                        [ 1.1550e-01, -6.4153e-02,  3.3819e-02,  5.1994e-02,  1.0347e-01],
                        [ 1.3286e-02, -2.5154e-02, -2.5146e-02, -7.0577e-02,  3.8817e-02]],
              
                       [[ 2.4314e-02, -4.2228e-03,  2.3929e-03,  3.8104e-02, -7.4195e-02],
                        [-5.9610e-02, -7.6557e-02,  9.8298e-02,  8.8326e-02, -4.8591e-02],
                        [ 8.8456e-02, -8.9428e-02,  1.4709e-03, -2.4061e-02,  6.4328e-02],
                        [-5.0846e-02,  3.1642e-02, -2.6535e-02, -4.8630e-02, -5.0110e-02],
                        [ 4.2969e-02, -3.8305e-02, -3.4501e-02, -2.8085e-02, -5.2675e-02]],
              
                       [[-6.

In [17]:
torch.save(model2.state_dict(), PATH2)  # save only the state_dict

In [21]:
state_dict = torch.load(PATH2)  # load the state_dict

In [22]:
state_dict

OrderedDict([('conv1.weight',
              tensor([[[[ 4.7294e-02,  7.8445e-02,  4.8014e-02,  2.3938e-02,  6.9093e-02],
                        [ 8.6393e-02,  6.0785e-02, -5.6211e-02, -7.6950e-02,  1.2606e-02],
                        [ 1.1442e-01, -5.7123e-02, -2.5644e-02,  1.2771e-01,  6.3129e-02],
                        [ 1.1550e-01, -6.4153e-02,  3.3819e-02,  5.1994e-02,  1.0347e-01],
                        [ 1.3286e-02, -2.5154e-02, -2.5146e-02, -7.0577e-02,  3.8817e-02]],
              
                       [[ 2.4314e-02, -4.2228e-03,  2.3929e-03,  3.8104e-02, -7.4195e-02],
                        [-5.9610e-02, -7.6557e-02,  9.8298e-02,  8.8326e-02, -4.8591e-02],
                        [ 8.8456e-02, -8.9428e-02,  1.4709e-03, -2.4061e-02,  6.4328e-02],
                        [-5.0846e-02,  3.1642e-02, -2.6535e-02, -4.8630e-02, -5.0110e-02],
                        [ 4.2969e-02, -3.8305e-02, -3.4501e-02, -2.8085e-02, -5.2675e-02]],
              
                       [[-6.

Define the model class (required before the state_dict can be applied)

In [23]:
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):   # subclass nn.Module
    def __init__(self):  # initialize the class
        super().__init__()  # initialize the parent class
        self.conv1 = nn.Conv2d(3, 6, 5) # In, Out, Kernel
        self.pool = nn.MaxPool2d(2, 2)  # Kernel, stride
        self.conv2 = nn.Conv2d(6, 16, 5) # In, Out, Kernel
        self.fc1 = nn.Linear(16 * 5 * 5, 120) # In, Out; In matches the size after flattening
        self.fc2 = nn.Linear(120, 84) # In, Out
        self.fc3 = nn.Linear(84, 10) # In, Out

    def forward(self, x):  # x is the input data
        x = self.pool(F.relu(self.conv1(x)))  # conv1 --> relu --> maxpool
        x = self.pool(F.relu(self.conv2(x))) # conv2 --> relu --> maxpool
        x = torch.flatten(x, 1) # flatten all dimensions except batch
        x = F.relu(self.fc1(x))  # Linear + relu
        x = F.relu(self.fc2(x))  # Linear + relu
        x = self.fc3(x)
        return x

net = Net() # create a model instance

In [24]:
model3 = Net()  # create a new model instance

In [25]:
model3.load_state_dict(state_dict)  # apply the state_dict

<All keys matched successfully>
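
The save, load, and apply cycle above can be verified end to end. The sketch below is only an illustration: it uses a small `nn.Linear` as a hypothetical stand-in for `Net` and an in-memory `io.BytesIO` buffer instead of a file on Drive.

```python
import io
import torch
import torch.nn as nn

source = nn.Linear(4, 2)   # hypothetical stand-in for the trained Net

# Save only the state_dict (here into memory rather than to a .pt file).
buffer = io.BytesIO()
torch.save(source.state_dict(), buffer)

# Load the state_dict and apply it to a freshly constructed model.
buffer.seek(0)
loaded_state = torch.load(buffer)
target = nn.Linear(4, 2)
result = target.load_state_dict(loaded_state)

# Every tensor in the restored model should equal the original one.
source_state = source.state_dict()
same = all(torch.equal(source_state[k], v) for k, v in target.state_dict().items())
print(result.missing_keys, result.unexpected_keys, same)
```

Empty `missing_keys` and `unexpected_keys` lists correspond to the `<All keys matched successfully>` message shown above.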

In [26]:
model3.eval()

Net(
  (conv1): Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)

In [27]:
model3.state_dict()

OrderedDict([('conv1.weight',
              tensor([[[[ 4.7294e-02,  7.8445e-02,  4.8014e-02,  2.3938e-02,  6.9093e-02],
                        [ 8.6393e-02,  6.0785e-02, -5.6211e-02, -7.6950e-02,  1.2606e-02],
                        [ 1.1442e-01, -5.7123e-02, -2.5644e-02,  1.2771e-01,  6.3129e-02],
                        [ 1.1550e-01, -6.4153e-02,  3.3819e-02,  5.1994e-02,  1.0347e-01],
                        [ 1.3286e-02, -2.5154e-02, -2.5146e-02, -7.0577e-02,  3.8817e-02]],
              
                       [[ 2.4314e-02, -4.2228e-03,  2.3929e-03,  3.8104e-02, -7.4195e-02],
                        [-5.9610e-02, -7.6557e-02,  9.8298e-02,  8.8326e-02, -4.8591e-02],
                        [ 8.8456e-02, -8.9428e-02,  1.4709e-03, -2.4061e-02,  6.4328e-02],
                        [-5.0846e-02,  3.1642e-02, -2.6535e-02, -4.8630e-02, -5.0110e-02],
                        [ 4.2969e-02, -3.8305e-02, -3.4501e-02, -2.8085e-02, -5.2675e-02]],
              
                       [[-6.

Method 3: TorchScript

In [28]:
model3_scripted = torch.jit.script(model3)

In [29]:
model3_scripted

RecursiveScriptModule(
  original_name=Net
  (conv1): RecursiveScriptModule(original_name=Conv2d)
  (pool): RecursiveScriptModule(original_name=MaxPool2d)
  (conv2): RecursiveScriptModule(original_name=Conv2d)
  (fc1): RecursiveScriptModule(original_name=Linear)
  (fc2): RecursiveScriptModule(original_name=Linear)
  (fc3): RecursiveScriptModule(original_name=Linear)
)

In [30]:
PATH3 = '/content/drive/MyDrive/Teaching/DL 202301/cifar_net3.pt'   # set the file name

In [31]:
model3_scripted.save(PATH3) # Save

In [32]:
model4 = torch.jit.load(PATH3)
model4.eval()

RecursiveScriptModule(
  original_name=Net
  (conv1): RecursiveScriptModule(original_name=Conv2d)
  (pool): RecursiveScriptModule(original_name=MaxPool2d)
  (conv2): RecursiveScriptModule(original_name=Conv2d)
  (fc1): RecursiveScriptModule(original_name=Linear)
  (fc2): RecursiveScriptModule(original_name=Linear)
  (fc3): RecursiveScriptModule(original_name=Linear)
)

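With the JIT approach, the original model class information is not available; the torchsummary call below fails on the loaded module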

In [36]:

from torchsummary import summary
summary(model4, input_size=(3, 32, 32))

RuntimeError: ignored
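
Even though the summary fails, a module loaded with torch.jit.load can still be used for inference. The sketch below is an illustration under assumptions: a small `nn.Sequential` serves as a hypothetical stand-in for `Net`, and an in-memory buffer replaces the file path.

```python
import io
import torch
import torch.nn as nn

# Hypothetical stand-in model; in the notebook this would be model3.
original = nn.Sequential(nn.Linear(4, 2), nn.ReLU()).eval()

# Script the model and save it (to memory here instead of PATH3).
scripted = torch.jit.script(original)
buffer = io.BytesIO()
torch.jit.save(scripted, buffer)

# Load it back; no Python class definition is needed at this point.
buffer.seek(0)
restored = torch.jit.load(buffer)
restored.eval()

x = torch.randn(3, 4)
with torch.no_grad():
    matches = torch.allclose(original(x), restored(x))
print(matches)
```

Comparing outputs against the original model in this way is a simple sanity check to run before shipping a scripted model for inference.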

Method 4: saving a checkpoint

 Save additional information together with the state_dict

 ```
 # Example code: save training-related information together in a dict

torch.save({
            'epoch': epoch,
            'model_state_dict': model.state_dict(),
            'optimizer_state_dict': optimizer.state_dict(),
            'loss': loss,
            ...
            }, PATH)
```

When loading, instances of the same model and optimizer classes that were saved above must be created first

```
model = TheModelClass()
optimizer = TheOptimizerClass()

checkpoint = torch.load(PATH)
model.load_state_dict(checkpoint['model_state_dict'])
optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
epoch = checkpoint['epoch']
loss = checkpoint['loss']

model.eval()
# - or -
model.train()
```
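
The two snippets above can be combined into one runnable round trip. This is a sketch under assumptions: a small `nn.Linear` is a hypothetical stand-in for the model, `make_model_and_optimizer` is a helper invented for this example, the data is random, and the checkpoint goes to an in-memory buffer instead of PATH.

```python
import io
import torch
import torch.nn as nn
import torch.optim as optim

def make_model_and_optimizer():
    # Hypothetical helper: rebuilds the same model and optimizer classes.
    model = nn.Linear(4, 2)
    optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
    return model, optimizer

model, optimizer = make_model_and_optimizer()
criterion = nn.MSELoss()
inputs, targets = torch.randn(8, 4), torch.randn(8, 2)

# One training step, so the optimizer has momentum buffers worth saving.
optimizer.zero_grad()
loss = criterion(model(inputs), targets)
loss.backward()
optimizer.step()

# Save the checkpoint as a plain dict.
buffer = io.BytesIO()
torch.save({
    'epoch': 1,
    'model_state_dict': model.state_dict(),
    'optimizer_state_dict': optimizer.state_dict(),
    'loss': loss.item(),   # stored as a Python float rather than a tensor
}, buffer)

# Restore into freshly created objects and continue from the next epoch.
buffer.seek(0)
checkpoint = torch.load(buffer)
resumed_model, resumed_optimizer = make_model_and_optimizer()
resumed_model.load_state_dict(checkpoint['model_state_dict'])
resumed_optimizer.load_state_dict(checkpoint['optimizer_state_dict'])
start_epoch = checkpoint['epoch'] + 1
resumed_model.train()   # use .eval() instead when only running inference
print(start_epoch, checkpoint['loss'])
```

Restoring the optimizer state matters for SGD with momentum, because the momentum buffers would otherwise restart from zero when training resumes.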