<a href="https://colab.research.google.com/github/oilportrait/test_colab/blob/main/resnetPractice.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle
!chmod 600 ~/.kaggle/kaggle.json

In [2]:
!kaggle datasets download -d mikoajfish99/carrots-vs-rockets-image-classification

Downloading carrots-vs-rockets-image-classification.zip to /content
 98% 88.0M/90.2M [00:03<00:00, 41.5MB/s]
100% 90.2M/90.2M [00:03<00:00, 26.6MB/s]


In [None]:
!mkdir sample
!unzip carrots-vs-rockets-image-classification.zip -d ./sample/

In [None]:
! pip install transformers datasets

이미지 데이터를 어떻게 전처리할지 규정합니다.

In [5]:
import torchvision.transforms as transforms
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader, random_split

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    # resnet은 Imagenet기반으로 학습되었기에 ImageNet의 평균과 표준편차로 Standardization을 수행합니다.
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

In [6]:
# 규정한 전처리로 이미지데이터를 불러오며 가공합니다.
data = ImageFolder(root='./sample/Images/', transform=transform)

Train, validation, test로 데이터를 분류합니다.

In [7]:
trainProportion = 0.7
valProportion = 0.2

totalSize = len(data)
trainSize = int(trainProportion * totalSize)
valSize = int(valProportion * totalSize)
testSize = totalSize - trainSize - valSize

trainData, valData, testData = random_split(data, [trainSize, valSize, testSize])

데이터 세트를 어떻게 이용할것인지 규정해놓습니다.

In [8]:
trainLoader = DataLoader(trainData, batch_size=32, shuffle=True)
valLoader = DataLoader(valData, batch_size=32, shuffle=False)
testLoader = DataLoader(testData, batch_size=32, shuffle=False)

Pre-trained된 모델인 Resnet50을 가져옵니다.

In [21]:
import torch.nn as nn
import torchvision.models as models
model = models.resnet50(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 2)

불러온 모델을 파인 튜닝해보겠습니다.

파인 튜닝시 사용할 최적화 방식을 규정해 놓습니다.

In [23]:
import torch.optim as optim
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

그냥 모델을 학습 시키면 마지막에 추가한 fc레이어만 파라미터가 변합니다. resnet의 파라미터도 변하길 바라므로 아래와 같은 코드를 작성합니다.

In [32]:
"""resnet의 마지막 레이어만 변형했을떄 가장 좋은 결과가 나온다는걸 확인했습니다.
 모든 레이어 변경시 validation accuracy: 0.93
 fc레이어만 변경시 validation accuracy: 0.95
 레이터4만 변경시 validation accuracy: 0.96"""
for param in model.layer4.parameters():
    param.requires_grad = True

직접 모델을 훈련시키고 validation도 수행합니다.

In [29]:
import torch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
# earily stopping을 위한 변수와 에포크 횟수를 정합니다.
num_epochs = 20
best_val_accuracy = 0
patience_counter = 0
max_patience = 5

for epoch in range(num_epochs):
    model.train()
    for inputs, labels in trainLoader:
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad() # 기울기를 누적하지 않습니다.
        outputs = model(inputs)
        loss = criterion(outputs, labels) # 정답과의 차이를 계산합니다.
        loss.backward() #  backpropagation을 이용해서 기울기를 계산합니다.
        optimizer.step() # 모델의 파라미터를 업데이트합니다.

    model.eval()
    total_val_loss = 0
    correct_val_predictions = 0
    with torch.no_grad():
        for inputs, labels in valLoader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            total_val_loss += loss.item()
            _, preds = torch.max(outputs, 1)
            correct_val_predictions += torch.sum(preds == labels.data)
    val_accuracy = correct_val_predictions.double() / valSize
    print(f"Epoch {epoch+1}/{num_epochs}, Loss: {loss.item()}, Validation Loss: {total_val_loss}, Validation Accuracy: {val_accuracy}")

    # earlt stopping을 설정합니다.
    if val_accuracy > best_val_accuracy:
        best_val_accuracy = val_accuracy
        patience_counter = 0
    else:
        patience_counter += 1
        if patience_counter >= max_patience:
            print("너 데이터 학습을 안하고 암기를 해? 멈춰")
            break

Epoch 1/20, Loss: 0.007404356263577938, Validation Loss: 0.6212058952078223, Validation Accuracy: 0.9344262295081968
Epoch 2/20, Loss: 0.010848945006728172, Validation Loss: 0.6342896278947592, Validation Accuracy: 0.9508196721311476
Epoch 3/20, Loss: 0.0120578333735466, Validation Loss: 0.6746743246912956, Validation Accuracy: 0.9180327868852459
Epoch 4/20, Loss: 0.009943041019141674, Validation Loss: 0.6165975062176585, Validation Accuracy: 0.9344262295081968
Epoch 5/20, Loss: 0.01009453646838665, Validation Loss: 0.49752689711749554, Validation Accuracy: 0.9672131147540984
Epoch 6/20, Loss: 0.009471186436712742, Validation Loss: 0.4587067747488618, Validation Accuracy: 0.9672131147540984
Epoch 7/20, Loss: 0.0184218417853117, Validation Loss: 0.5662760268896818, Validation Accuracy: 0.9672131147540984
Epoch 8/20, Loss: 0.2939535677433014, Validation Loss: 1.1593802869319916, Validation Accuracy: 0.9344262295081968
Epoch 9/20, Loss: 0.2917131185531616, Validation Loss: 1.2214497327804

파인 튜닝된 모델로 평가를 해봅니다.

In [30]:
model.eval()
total_test_loss = 0
correct_test_predictions = 0
with torch.no_grad():
    for inputs, labels in testLoader:
        inputs, labels = inputs.to(device), labels.to(device)
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        total_test_loss += loss.item()
        _, preds = torch.max(outputs, 1)
        correct_test_predictions += torch.sum(preds == labels.data)
test_accuracy = correct_test_predictions.double() / testSize
print(f"Test Loss: {total_test_loss}, Test Accuracy: {test_accuracy}")


Test Loss: 0.06897492706775665, Test Accuracy: 0.967741935483871
