## Final Project with Kaggle Datasets Used by Peers

Dataset URL: https://www.kaggle.com/datasets/lantian773030/pokemonclassification

### Task: Classifying 150 Species of Pokémon Using Image Classification

* The goal of this project is to fine-tune pretrained models (AlexNet, VGG16, VGG19, GoogLeNet, and ResNet50) to classify images of 150 Pokémon species, and to improve classification performance compared to existing machine learning models.



---

### Dataset Description

* The dataset contains images of 150 Pokémon species (one label per species), with 40 to 60 images per label.

```python
import os

train_data_dir = "/content/PokemonData/"
# Each subdirectory name is a class label; sort the names for a stable index order
classes = os.listdir(train_data_dir)
classes = {k: v for k, v in enumerate(sorted(classes))}
print(classes)

# Output: {0: 'Abra', 1: 'Aerodactyl', 2: 'Alakazam', ...  147: 'Wigglytuff', 148: 'Zapdos', 149: 'Zubat'}
```

* The dataset is divided into two subsets for training:
    * Train
    * Validate
    * Because there are many labels and relatively few images per label, the data is split into just two subsets, train and validate.

* During model training, the data is divided as follows:
    * Total data: 6820 images
    * Train data: used for model training, 5456 images (80% of the total)
    * Validation data: used to assess model performance, 1364 images (20% of the total)
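
The split sizes quoted above follow from truncating the training share to an integer and giving the remainder to validation. A minimal sketch of that arithmetic, where the helper name `split_sizes` is hypothetical and not part of the notebook:

```python
def split_sizes(total: int, train_fraction: float = 0.8) -> tuple[int, int]:
    """Return (train, validation) counts; the remainder goes to validation."""
    train = int(train_fraction * total)  # truncates toward zero
    return train, total - train


train_count, val_count = split_sizes(6820)
print(train_count, val_count)  # 5456 1364
```

Because the split itself is random, a run is only repeatable if `random_split` is given a seeded generator (its `generator=` argument); the notebook does not do this, so the exact images in each subset may differ between runs.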

---

### Training and Testing Environment

* Models were trained and tested on Google Colab using a T4 GPU.
* Models were also trained and tested locally using a GPU.

### Considerations in Model Selection

1. The previous Midterm Project used techniques such as Batch Normalization, Dropout, and optimization with Adam. This Final Project instead focuses on techniques we had not used before: the newly learned data preprocessing and transfer learning.
2. The models studied in the Existing Model section had shown effective performance in image classification, so we used them to improve classification performance on this dataset.
3. Each model was fine-tuned to adapt it to the Pokémon image data.

### Selected Training Models for Image Classification

1. **AlexNet**
    * Deep neural network architecture that gained prominence by winning the 2012 ImageNet Large Scale Visual Recognition Challenge.
    * Comprises 5 convolution layers and 3 fully connected layers.
    * Uses Rectified Linear Unit (ReLU) activation functions.
    * Uses dropout in the fully connected layers to prevent overfitting.
2. **VGG-16**
    * CNN model that took second place in the ILSVRC 2014 competition.
    * VGG was designed to study how depth affects a neural network. The filter size is fixed at 3x3, because larger filters shrink the feature map quickly and make it hard to stack many layers.
    * The number in the name reflects the depth (number of weight layers) of the network, e.g. VGG16, VGG19.
3. **VGG-19**
    * Model developed by the Oxford Visual Geometry Group.
    * Consists of 16 convolution layers and 3 fully connected layers.
    * Achieves depth by stacking convolution layers with a small filter size (3x3).
4. **GoogLeNet**
    * Built from the Inception module developed by Google, which applies several filter sizes in parallel.
    * Uses 1x1 convolutions for dimension reduction and extracts features in parallel branches.
    * Uses auxiliary classifiers to mitigate the vanishing gradient problem in deep networks.
5. **ResNet50**
    * Applies residual (skip) connections so that much deeper networks can be trained.
    * Uses blocks of 1x1, 3x3, and 1x1 convolutions to extend depth and width.
    * Lets features propagate more easily to later layers, which significantly improved accuracy in computer vision.
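
The VGG layer counts and the case for small filters can be checked with a little arithmetic. The sketch below writes out the published VGG16 and VGG19 layer configurations by hand (they are not taken from the notebook) and compares the weight count of stacked 3x3 convolutions against a single larger filter; the helper names are illustrative only.

```python
# 'M' marks a 2x2 max-pool; integers are the output channels of a 3x3 convolution.
VGG_CONFIGS = {
    "VGG16": [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
              512, 512, 512, "M", 512, 512, 512, "M"],
    "VGG19": [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
              512, 512, 512, 512, "M", 512, 512, 512, 512, "M"],
}
NUM_FC_LAYERS = 3


def count_weight_layers(name):
    """Return (conv layers, conv + fully connected layers) for a VGG variant."""
    convs = sum(1 for item in VGG_CONFIGS[name] if item != "M")
    return convs, convs + NUM_FC_LAYERS


def conv_weights(kernel, channels, layers=1):
    """Weights (biases ignored) of stacked kernel x kernel convs with equal in/out channels."""
    return layers * kernel * kernel * channels * channels


for name in VGG_CONFIGS:
    convs, total = count_weight_layers(name)
    print(f"{name}: {convs} conv + {NUM_FC_LAYERS} FC = {total} weight layers")

# Two stacked 3x3 convs see the same 5x5 region as one 5x5 conv, with fewer weights
print(conv_weights(3, 256, layers=2), "vs", conv_weights(5, 256))
```

The number in each model name matches the total of convolution and fully connected layers, and the stacked 3x3 variant needs fewer weights for the same receptive field.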

---

### Data Storage

#### 1. Store the data in the Colab environment (using the Kaggle API): Installing Kaggle API and Data Download (see the Midterm Project)

#### 2. In a local environment, download the data and place it at a specific path

---

### 1. AlexNet

#### 1.1 Importing Essential Libraries and Modules

```python
import os

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader, random_split
from torchvision import datasets, transforms, models
```

#### 1.2 Verify That a GPU Is Available for Training on the Local Machine

```python
import torch

# Use the first CUDA GPU when available, otherwise fall back to the CPU
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)
```

#### 1.3 Setting Up Training-Related Variables and Loading the Dataset (Including Preprocessing)

```python
# Define the transforms: random augmentation, then scaling to [-1, 1]
transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.2),
    transforms.RandomRotation(30),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

# Create the dataset (one subdirectory per class)
trainset = datasets.ImageFolder("./PokemonData", transform=transform)
dataset_length = len(trainset)

# Split the dataset into training (80%) and validation (20%) sets.
# Both subsets share the augmenting transform defined above.
train_size = int(0.8 * dataset_length)
training_data, validation_data = random_split(trainset, [train_size, dataset_length - train_size])
print("Total Classes: ", len(trainset.classes))

# Create data loaders
trainloader = DataLoader(training_data, batch_size=64, shuffle=True, num_workers=4)
testloader = DataLoader(validation_data, batch_size=64, shuffle=True, num_workers=4)

num_epochs = 50
best_acc = 0.0  # Variable to store the best accuracy on the validation set
```
```
Total Classes: 150
```
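
The `Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))` step in the transform above is a per-channel `(x - mean) / std`. Since `ToTensor` scales 8-bit image pixels into `[0, 1]`, the result lands in `[-1, 1]`. A small worked example of that arithmetic, using a hypothetical standalone function rather than torchvision itself:

```python
def normalize(value, mean=0.5, std=0.5):
    """Per-channel normalization as applied by transforms.Normalize: (x - mean) / std."""
    return (value - mean) / std


for pixel in (0.0, 0.5, 1.0):
    print(pixel, "->", normalize(pixel))
# 0.0 -> -1.0
# 0.5 -> 0.0
# 1.0 -> 1.0
```

The pretrained torchvision models were trained with the ImageNet statistics (mean `(0.485, 0.456, 0.406)`, std `(0.229, 0.224, 0.225)`), so those values are a common alternative; this sketch simply follows the notebook's choice of 0.5.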

#### 1.4 Load the Pretrained AlexNet Model and Perform Model Training

```python
# Load the pretrained AlexNet model
model = models.alexnet(pretrained=True)

# Modify the classifier to match the number of classes
num_features = model.classifier[6].in_features
model.classifier[6] = nn.Linear(num_features, len(trainset.classes))

# Move the model to GPU
model = model.to(device)

# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    correct_train = 0
    total_train = 0

    for i, data in enumerate(trainloader, 0):
        inputs, labels = data

        # Move inputs and labels to GPU
        inputs, labels = inputs.to(device), labels.to(device)

        optimizer.zero_grad()

        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

        _, predicted_train = torch.max(outputs.data, 1)
        total_train += labels.size(0)
        correct_train += (predicted_train == labels).sum().item()

        # Log the running loss every 100 mini-batches
        # (an epoch here has 86 batches of 64, so this never fires)
        if i % 100 == 99:
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 100))
            running_loss = 0.0

    # Training accuracy for the epoch
    train_accuracy = correct_train / total_train

    # Evaluate the model on the validation set
    model.eval()
    correct_val = 0
    total_val = 0

    with torch.no_grad():
        for data in testloader:
            inputs, labels = data
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            _, predicted_val = torch.max(outputs.data, 1)
            total_val += labels.size(0)
            correct_val += (predicted_val == labels).sum().item()

    # Validation accuracy for the epoch
    val_accuracy = correct_val / total_val

    print(f'Epoch {epoch + 1}, Training Accuracy: {100 * train_accuracy:.8f}% ({correct_train}/{total_train}), Validation Accuracy: {100 * val_accuracy:.8f}% ({correct_val}/{total_val})')

    # Save the model if it has the best accuracy so far
    if val_accuracy > best_acc:
        best_acc = val_accuracy
        torch.save(model.state_dict(), 'best_model_alexnet.pth')

print('Finished Training')
```

```
Epoch 1, Training Accuracy: 23.53372434% (1284/5456), Validation Accuracy: 62.31671554% (850/1364)
Epoch 2, Training Accuracy: 72.67228739% (3965/5456), Validation Accuracy: 74.12023460% (1011/1364)
Epoch 3, Training Accuracy: 85.75879765% (4679/5456), Validation Accuracy: 79.76539589% (1088/1364)
Epoch 4, Training Accuracy: 91.64222874% (5000/5456), Validation Accuracy: 79.76539589% (1088/1364)
Epoch 5, Training Accuracy: 95.36290323% (5203/5456), Validation Accuracy: 79.32551320% (1082/1364)
Epoch 6, Training Accuracy: 95.98607038% (5237/5456), Validation Accuracy: 81.15835777% (1107/1364)
Epoch 7, Training Accuracy: 97.04912023% (5295/5456), Validation Accuracy: 81.37829912% (1110/1364)
Epoch 8, Training Accuracy: 97.96554252% (5345/5456), Validation Accuracy: 82.25806452% (1122/1364)
Epoch 9, Training Accuracy: 98.55205279% (5377/5456), Validation Accuracy: 82.55131965% (1126/1364)
Epoch 10, Training Accuracy: 98.47873900% (5373/5456), Validation Accuracy: 82.18475073% (1121/1364)
Epoch 11, Training Accuracy: 98.84530792% (5393/5456), Validation Accuracy: 81.67155425% (1114/1364)
Epoch 12, Training Accuracy: 98.93695015% (5398/5456), Validation Accuracy: 81.81818182% (1116/1364)
Epoch 13, Training Accuracy: 98.84530792% (5393/5456), Validation Accuracy: 81.89149560% (1117/1364)
Epoch 14, Training Accuracy: 99.06524927% (5405/5456), Validation Accuracy: 82.33137830% (1123/1364)
Epoch 15, Training Accuracy: 99.08357771% (5406/5456), Validation Accuracy: 82.55131965% (1126/1364)
Epoch 16, Training Accuracy: 99.32184751% (5419/5456), Validation Accuracy: 83.21114370% (1135/1364)
Epoch 17, Training Accuracy: 99.28519062% (5417/5456), Validation Accuracy: 83.72434018% (1142/1364)
Epoch 18, Training Accuracy: 99.28519062% (5417/5456), Validation Accuracy: 82.11143695% (1120/1364)
Epoch 19, Training Accuracy: 99.54178886% (5431/5456), Validation Accuracy: 82.62463343% (1127/1364)
Epoch 20, Training Accuracy: 99.59677419% (5434/5456), Validation Accuracy: 83.06451613% (1133/1364)
Epoch 21, Training Accuracy: 99.52346041% (5430/5456), Validation Accuracy: 82.47800587% (1125/1364)
Epoch 22, Training Accuracy: 99.43181818% (5425/5456), Validation Accuracy: 83.13782991% (1134/1364)
Epoch 23, Training Accuracy: 99.46847507% (5427/5456), Validation Accuracy: 82.84457478% (1130/1364)
Epoch 24, Training Accuracy: 99.28519062% (5417/5456), Validation Accuracy: 83.65102639% (1141/1364)
Epoch 25, Training Accuracy: 99.67008798% (5438/5456), Validation Accuracy: 82.77126100% (1129/1364)
...
Epoch 48, Training Accuracy: 99.81671554% (5446/5456), Validation Accuracy: 84.67741935% (1155/1364)
Epoch 49, Training Accuracy: 99.89002933% (5450/5456), Validation Accuracy: 85.11730205% (1161/1364)
Epoch 50, Training Accuracy: 99.96334311% (5454/5456), Validation Accuracy: 84.45747801% (1152/1364)
Finished Training
```
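
Logs in this format are easy to scan programmatically, for example to see which epoch the "save the best model" check would have kept and how wide the gap between training and validation accuracy is. The following is a minimal sketch using only the last three lines of the log above; the function names are hypothetical.

```python
import re

LOG_LINE = re.compile(
    r"Epoch (\d+), Training Accuracy: ([\d.]+)% \((\d+)/(\d+)\), "
    r"Validation Accuracy: ([\d.]+)% \((\d+)/(\d+)\)"
)


def parse_log(text):
    """Return a list of (epoch, train_correct, train_total, val_correct, val_total)."""
    return [
        (int(m[1]), int(m[3]), int(m[4]), int(m[6]), int(m[7]))
        for m in LOG_LINE.finditer(text)
    ]


def best_epoch(rows):
    """Epoch with the most correct validation predictions (first one wins ties)."""
    return max(rows, key=lambda row: row[3])[0]


sample = """\
Epoch 48, Training Accuracy: 99.81671554% (5446/5456), Validation Accuracy: 84.67741935% (1155/1364)
Epoch 49, Training Accuracy: 99.89002933% (5450/5456), Validation Accuracy: 85.11730205% (1161/1364)
Epoch 50, Training Accuracy: 99.96334311% (5454/5456), Validation Accuracy: 84.45747801% (1152/1364)
"""

rows = parse_log(sample)
print("best epoch in sample:", best_epoch(rows))  # best epoch in sample: 49
for epoch, tc, tt, vc, vt in rows:
    print(f"epoch {epoch}: train/val gap = {100 * (tc / tt - vc / vt):.2f} points")
```

A gap of roughly 15 points between training and validation accuracy, as in these lines, suggests the model is overfitting the training subset.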

---

### 2. VGG 16

* The libraries and dataset defined above are reused, so their description is omitted.
* Load the pretrained VGG16 model and perform model training

```python
best_acc = 0.0  # Variable to store the best accuracy on the validation set

# Load the pretrained VGG16 model
model = models.vgg16(pretrained=True)

# Modify the classifier to match the number of classes
num_features = model.classifier[6].in_features
model.classifier[6] = nn.Linear(num_features, len(trainset.classes))

# Move the model to GPU
model = model.to(device)

# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    correct_train = 0
    total_train = 0

    for i, data in enumerate(trainloader, 0):
        inputs, labels = data

        # Move inputs and labels to GPU
        inputs, labels = inputs.to(device), labels.to(device)

        optimizer.zero_grad()

        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

        _, predicted_train = torch.max(outputs.data, 1)
        total_train += labels.size(0)
        correct_train += (predicted_train == labels).sum().item()

        # Log the running loss every 100 mini-batches
        # (an epoch here has 86 batches of 64, so this never fires)
        if i % 100 == 99:
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 100))
            running_loss = 0.0

    # Training accuracy for the epoch
    train_accuracy = correct_train / total_train

    # Evaluate the model on the validation set
    model.eval()
    correct_val = 0
    total_val = 0

    with torch.no_grad():
        for data in testloader:
            inputs, labels = data
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            _, predicted_val = torch.max(outputs.data, 1)
            total_val += labels.size(0)
            correct_val += (predicted_val == labels).sum().item()

    # Validation accuracy for the epoch
    val_accuracy = correct_val / total_val

    print(f'Epoch {epoch + 1}, Training Accuracy: {100 * train_accuracy:.8f}% ({correct_train}/{total_train}), Validation Accuracy: {100 * val_accuracy:.8f}% ({correct_val}/{total_val})')

    # Save the model if it has the best accuracy so far
    if val_accuracy > best_acc:
        best_acc = val_accuracy
        torch.save(model.state_dict(), 'best_model_vgg16.pth')

print('Finished Training')
```

```
Downloading: "https://download.pytorch.org/models/vgg16-397923af.pth" to C:\Users\user/.cache\torch\hub\checkpoints\vgg16-397923af.pth
Epoch 1, Training Accuracy: 3.07917889% (168/5456), Validation Accuracy: 19.42815249% (265/1364)
Epoch 2, Training Accuracy: 41.47727273% (2263/5456), Validation Accuracy: 75.07331378% (1024/1364)
Epoch 3, Training Accuracy: 77.69428152% (4239/5456), Validation Accuracy: 82.62463343% (1127/1364)
Epoch 4, Training Accuracy: 87.40835777% (4769/5456), Validation Accuracy: 84.01759531% (1146/1364)
Epoch 5, Training Accuracy: 92.02712610% (5021/5456), Validation Accuracy: 85.26392962% (1163/1364)
Epoch 6, Training Accuracy: 95.12463343% (5190/5456), Validation Accuracy: 87.46334311% (1193/1364)
Epoch 7, Training Accuracy: 96.20601173% (5249/5456), Validation Accuracy: 87.09677419% (1188/1364)
Epoch 8, Training Accuracy: 96.97580645% (5291/5456), Validation Accuracy: 86.43695015% (1179/1364)
Epoch 9, Training Accuracy: 97.58064516% (5324/5456), Validation Accuracy: 87.53665689% (1194/1364)
Epoch 10, Training Accuracy: 98.29545455% (5363/5456), Validation Accuracy: 87.75659824% (1197/1364)
Epoch 11, Training Accuracy: 98.42375367% (5370/5456), Validation Accuracy: 87.68328446% (1196/1364)
Epoch 12, Training Accuracy: 98.64369501% (5382/5456), Validation Accuracy: 87.82991202% (1198/1364)
Epoch 13, Training Accuracy: 98.79032258% (5390/5456), Validation Accuracy: 89.22287390% (1217/1364)
Epoch 14, Training Accuracy: 98.93695015% (5398/5456), Validation Accuracy: 87.97653959% (1200/1364)
Epoch 15, Training Accuracy: 98.44208211% (5371/5456), Validation Accuracy: 88.48973607% (1207/1364)
Epoch 16, Training Accuracy: 99.13856305% (5409/5456), Validation Accuracy: 88.85630499% (1212/1364)
Epoch 17, Training Accuracy: 99.24853372% (5415/5456), Validation Accuracy: 89.14956012% (1216/1364)
Epoch 18, Training Accuracy: 99.39516129% (5423/5456), Validation Accuracy: 89.00293255% (1214/1364)
Epoch 19, Training Accuracy: 99.52346041% (5430/5456), Validation Accuracy: 89.22287390% (1217/1364)
Epoch 20, Training Accuracy: 99.54178886% (5431/5456), Validation Accuracy: 88.78299120% (1211/1364)
Epoch 21, Training Accuracy: 99.50513196% (5429/5456), Validation Accuracy: 88.70967742% (1210/1364)
Epoch 22, Training Accuracy: 99.50513196% (5429/5456), Validation Accuracy: 88.63636364% (1209/1364)
Epoch 23, Training Accuracy: 99.65175953% (5437/5456), Validation Accuracy: 89.22287390% (1217/1364)
Epoch 24, Training Accuracy: 99.59677419% (5434/5456), Validation Accuracy: 89.29618768% (1218/1364)
Epoch 25, Training Accuracy: 99.46847507% (5427/5456), Validation Accuracy: 90.02932551% (1228/1364)
...
Epoch 48, Training Accuracy: 99.78005865% (5444/5456), Validation Accuracy: 90.46920821% (1234/1364)
Epoch 49, Training Accuracy: 99.85337243% (5448/5456), Validation Accuracy: 90.10263930% (1229/1364)
Epoch 50, Training Accuracy: 99.83504399% (5447/5456), Validation Accuracy: 90.83577713% (1239/1364)
Finished Training
```
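
The low first-epoch training accuracy in the log above (about 3%) is easier to read against the chance level for 150 classes. The replaced final layer starts from random weights, so early predictions are close to guessing. A short worked calculation, assuming balanced classes (the dataset is only roughly balanced):

```python
import math

NUM_CLASSES = 150

chance_accuracy = 1 / NUM_CLASSES      # accuracy of uniform random guessing
uniform_loss = math.log(NUM_CLASSES)   # cross-entropy (natural log) of a uniform prediction

print(f"chance accuracy: {100 * chance_accuracy:.2f}%")              # chance accuracy: 0.67%
print(f"cross-entropy at uniform prediction: {uniform_loss:.3f}")   # 5.011
```

A training loss near 5.0 at the start of fine-tuning would therefore be expected rather than a sign of a bug.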

---

### 3. VGG 19

* The libraries and dataset defined above are reused, so their description is omitted.
* Load the pretrained VGG19 model and perform model training

```python
best_acc = 0.0  # Variable to store the best accuracy on the validation set

# Load the pretrained VGG19 model
model = models.vgg19(pretrained=True)

# Modify the classifier to match the number of classes
num_features = model.classifier[6].in_features
model.classifier[6] = nn.Linear(num_features, len(trainset.classes))

# Move the model to GPU
model = model.to(device)

# Define the loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    correct_train = 0
    total_train = 0

    for i, data in enumerate(trainloader, 0):
        inputs, labels = data

        # Move inputs and labels to GPU
        inputs, labels = inputs.to(device), labels.to(device)

        optimizer.zero_grad()

        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

        _, predicted_train = torch.max(outputs.data, 1)
        total_train += labels.size(0)
        correct_train += (predicted_train == labels).sum().item()

        # Log the running loss every 100 mini-batches
        # (an epoch here has 86 batches of 64, so this never fires)
        if i % 100 == 99:
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 100))
            running_loss = 0.0

    # Training accuracy for the epoch
    train_accuracy = correct_train / total_train

    # Evaluate the model on the validation set
    model.eval()
    correct_val = 0
    total_val = 0

    with torch.no_grad():
        for data in testloader:
            inputs, labels = data
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            _, predicted_val = torch.max(outputs.data, 1)
            total_val += labels.size(0)
            correct_val += (predicted_val == labels).sum().item()

    # Validation accuracy for the epoch
    val_accuracy = correct_val / total_val

    print(f'Epoch {epoch + 1}, Training Accuracy: {100 * train_accuracy:.8f}% ({correct_train}/{total_train}), Validation Accuracy: {100 * val_accuracy:.8f}% ({correct_val}/{total_val})')

    # Save the model if it has the best accuracy so far
    if val_accuracy > best_acc:
        best_acc = val_accuracy
        torch.save(model.state_dict(), 'best_model_vgg19.pth')

print('Finished Training')
```

```
Downloading: "https://download.pytorch.org/models/vgg19-dcbb9e9d.pth" to /root/.cache/torch/hub/checkpoints/vgg19-dcbb9e9d.pth
100%|██████████| 548M/548M [00:05<00:00, 110MB/s]
Epoch 1, Training Accuracy: 2.36436950% (129/5456), Validation Accuracy: 11.14369501% (152/1364)
Epoch 2, Training Accuracy: 35.41055718% (1932/5456), Validation Accuracy: 69.50146628% (948/1364)
Epoch 3, Training Accuracy: 76.08137830% (4151/5456), Validation Accuracy: 81.37829912% (1110/1364)
Epoch 4, Training Accuracy: 86.69354839% (4730/5456), Validation Accuracy: 85.48387097% (1166/1364)
Epoch 5, Training Accuracy: 92.96187683% (5072/5456), Validation Accuracy: 85.19061584% (1162/1364)
Epoch 6, Training Accuracy: 95.05131965% (5186/5456), Validation Accuracy: 85.92375367% (1172/1364)
Epoch 7, Training Accuracy: 95.94941349% (5235/5456), Validation Accuracy: 86.73020528% (1183/1364)
Epoch 8, Training Accuracy: 97.04912023% (5295/5456), Validation Accuracy: 86.65689150% (1182/1364)
Epoch 9, Training Accuracy: 97.34237537% (5311/5456), Validation Accuracy: 87.60997067% (1195/1364)
Epoch 10, Training Accuracy: 97.52565982% (5321/5456), Validation Accuracy: 88.63636364% (1209/1364)
Epoch 11, Training Accuracy: 97.67228739% (5329/5456), Validation Accuracy: 87.82991202% (1198/1364)
Epoch 12, Training Accuracy: 98.53372434% (5376/5456), Validation Accuracy: 87.90322581% (1199/1364)
Epoch 13, Training Accuracy: 98.79032258% (5390/5456), Validation Accuracy: 88.56304985% (1208/1364)
Epoch 14, Training Accuracy: 99.30351906% (5418/5456), Validation Accuracy: 88.48973607% (1207/1364)
Epoch 15, Training Accuracy: 99.23020528% (5414/5456), Validation Accuracy: 89.14956012% (1216/1364)
Epoch 16, Training Accuracy: 99.41348974% (5424/5456), Validation Accuracy: 89.66275660% (1223/1364)
Epoch 17, Training Accuracy: 99.52346041% (5430/5456), Validation Accuracy: 88.34310850% (1205/1364)
Epoch 18, Training Accuracy: 99.32184751% (5419/5456), Validation Accuracy: 88.26979472% (1204/1364)
Epoch 19, Training Accuracy: 99.28519062% (5417/5456), Validation Accuracy: 88.78299120% (1211/1364)
Epoch 20, Training Accuracy: 99.43181818% (5425/5456), Validation Accuracy: 88.78299120% (1211/1364)
Epoch 21, Training Accuracy: 99.63343109% (5436/5456), Validation Accuracy: 88.70967742% (1210/1364)
Epoch 22, Training Accuracy: 99.56011730% (5432/5456), Validation Accuracy: 88.92961877% (1213/1364)
Epoch 23, Training Accuracy: 99.50513196% (5429/5456), Validation Accuracy: 88.85630499% (1212/1364)
Epoch 24, Training Accuracy: 99.35850440% (5421/5456), Validation Accuracy: 88.41642229% (1206/1364)
Epoch 25, Training Accuracy: 99.56011730% (5432/5456), Validation Accuracy: 89.22287390% (1217/1364)
...
Epoch 36, Training Accuracy: 99.78005865% (5444/5456), Validation Accuracy: 90.02932551% (1228/1364)
Epoch 37, Training Accuracy: 99.83504399% (5447/5456), Validation Accuracy: 89.36950147% (1219/1364)
Epoch 38, Training Accuracy: 99.78005865% (5444/5456), Validation Accuracy: 89.80938416% (1225/1364)
Epoch 39, Training Accuracy: 99.78005865% (5444/5456), Validation Accuracy: 89.80938416% (1225/1364)
```

---

### 4. GoogLeNet

* The libraries and dataset defined above are reused, so their description is omitted.
* Load the pretrained GoogLeNet model and perform model training

---

### 5. ResNet50



#### 5.1 Check Whether a GPU Is Available

* ResNet50 was trained on a MacBook, so the device check differs from the one used in the Windows environment.

```python
import platform

import torch

device = torch.device('mps:0' if torch.backends.mps.is_available() else 'cpu')
print(device)
print(f"PyTorch version:{torch.__version__}")  # should be 1.12.1 or later
print(f"Built with MPS support: {torch.backends.mps.is_built()}")  # should be True
print(f"MPS device available: {torch.backends.mps.is_available()}")  # should be True
print(platform.platform())
```

```
mps:0
PyTorch version:2.1.1
Built with MPS support: True
MPS device available: True
macOS-14.1.1-arm64-arm-64bit
```
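
Since the notebook runs on both CUDA machines and an Apple Silicon MacBook, the two device checks could be folded into one. A minimal sketch, assuming a hypothetical helper name `pick_device` that is not part of the original code:

```python
import torch


def pick_device() -> torch.device:
    """Hypothetical helper: prefer CUDA, then Apple MPS, then fall back to CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda:0")
    mps = getattr(torch.backends, "mps", None)  # absent in PyTorch < 1.12
    if mps is not None and mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


device = pick_device()
print(device)
```

With a helper like this, the rest of the training code can stay identical across Colab, Windows, and macOS.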

#### 5.2 Load the Required Libraries

```python
import copy
import time

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
import torch.optim.lr_scheduler as lr_scheduler
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader, Dataset, random_split
from torchvision import datasets, models
```

#### 5.3 Define the Preprocessing and Load the Dataset

```python
# Resize, center-crop to 224x224, and convert to a tensor (no augmentation)
transform = transforms.Compose([
    transforms.Resize(255),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])
dataset_train = datasets.ImageFolder("/Users/nykim/Downloads/PokemonData", transform=transform)
dataset_length = len(dataset_train)

# 80% training / 20% validation split
train_size = int(0.8 * dataset_length)
trainset, valset = random_split(dataset_train, [train_size, dataset_length - train_size])
print(len(dataset_train.classes))

loadeddata = DataLoader(dataset_train, batch_size=64, shuffle=True, num_workers=2)
loaded_train = DataLoader(trainset, batch_size=64, shuffle=True, num_workers=2)
loaded_val = DataLoader(valset, batch_size=64, shuffle=True, num_workers=2)
```

```
150
```

#### 5.4 Load the Pretrained ResNet50 Model and Train

```python
my_model = models.resnet50(pretrained=True)
my_model.fc = nn.Linear(2048, 150)  # replace the 1000-class head with 150 classes
my_model = my_model.to(device)
criterion = nn.CrossEntropyLoss()
optimizer_ft = optim.SGD(my_model.parameters(), lr=0.001, momentum=0.9)
# Decay the learning rate by a factor of 0.1 every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=7, gamma=0.1)

dataloaders = {'train': loaded_train, 'val': loaded_val}
dataset_sizes = {'train': len(trainset), 'val': len(valset)}


def train_model(model, criterion, optimizer, scheduler, num_epochs=25):
    since = time.time()

    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    for epoch in range(num_epochs):
        print(f'Epoch {epoch}/{num_epochs - 1}')
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            # Iterate over data.
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward; track gradients only in the training phase
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)

                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)
            if phase == 'train':
                scheduler.step()

            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.float() / dataset_sizes[phase]

            print(f'{phase} Loss: {epoch_loss:.4f} Acc: {epoch_acc:.4f}')

            # deep copy the model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())

        print()

    time_elapsed = time.time() - since
    print(f'Training complete in {time_elapsed // 60:.0f}m {time_elapsed % 60:.0f}s')
    print(f'Best val Acc: {best_acc:.4f}')

    # load best model weights
    model.load_state_dict(best_model_wts)
    return model


model_ft = train_model(my_model, criterion, optimizer_ft, exp_lr_scheduler,
                       num_epochs=25)
```

```
Epoch 0, Training Accuracy: 5.74% (314/5456), Validation Accuracy: 17.23% (235/1364)
Epoch 1, Training Accuracy: 35.06% (1911/5456), Validation Accuracy: 44.43% (607/1364)
Epoch 2, Training Accuracy: 60.83% (3312/5456), Validation Accuracy: 60.85% (829/1364)
Epoch 3, Training Accuracy: 75.86% (4134/5456), Validation Accuracy: 76.17% (1039/1364)
Epoch 4, Training Accuracy: 85.80% (4683/5456), Validation Accuracy: 82.77% (1129/1364)
Epoch 5, Training Accuracy: 90.91% (4957/5456), Validation Accuracy: 87.02% (1187/1364)
Epoch 6, Training Accuracy: 93.79% (5112/5456), Validation Accuracy: 88.71% (1210/1364)

...

Epoch 20, Training Accuracy: 96.98% (5289/5456), Validation Accuracy: 89.59% (1222/1364)
Epoch 21, Training Accuracy: 97.05% (5293/5456), Validation Accuracy: 89.52% (1221/1364)
Epoch 22, Training Accuracy: 97.07% (5294/5456), Validation Accuracy: 89.30% (1218/1364)
Epoch 23, Training Accuracy: 96.94% (5286/5456), Validation Accuracy: 89.74% (1222/1364)
Epoch 24, Training Accuracy: 97.18% (5303/5456), Validation Accuracy: 89.66% (1221/1364)
```
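
The `StepLR(step_size=7, gamma=0.1)` scheduler used for ResNet50 multiplies the learning rate by 0.1 every 7 epochs. A small sketch of the resulting schedule, using a hypothetical standalone function that mirrors the closed form rather than calling PyTorch:

```python
import math


def step_lr(epoch: int, base_lr: float = 0.001, step_size: int = 7, gamma: float = 0.1) -> float:
    """Learning rate in effect during `epoch` (0-based) under a StepLR schedule."""
    return base_lr * gamma ** (epoch // step_size)


for epoch in (0, 6, 7, 13, 14, 20, 21, 24):
    print(f"epoch {epoch:2d}: lr = {step_lr(epoch):.0e}")
```

By epoch 21 the learning rate is down to about 1e-6, which may be why the accuracies in the last lines of the log above barely move.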

---

## Overall Summary

* Efficiency of the Algorithm Based on the Objective
    * When the goal was to learn distinctive images such as muffins and Chihuahuas and to tell the subjects apart by the learned criteria, the CNN algorithm was more effective than SVC.

* Importance of Data Collection and Preprocessing for Training
    * The selected open dataset was scraped from Google image search results, so the training/test data itself contained images that were neither "Chihuahua" nor "muffin", which appears to have prevented more effective training.

* Importance of Choosing the Optimization Algorithm
    * For the CNN algorithm, using the SGD optimizer instead of Adam reduced accuracy to about 70%.