## ToBigs Week 5

### Vision Advanced Assignment

#### Problem 1.

Object detection models come in two families: 2-stage models and 1-stage models.  
Choose one model of each type and explain it. (However, exclude the models covered in the session and the models given below.)

In [None]:
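#### Answer
'''
As a 1-stage model, RetinaNet, which first proposed Focal Loss, is one example.
RetinaNet uses Focal Loss to address the extreme foreground-background class imbalance that dense 1-stage detectors face during training: the loss down-weights easy, well-classified examples (mostly background) so that training focuses on hard examples.

As a 2-stage model, R-FCN is one example. Unlike Faster R-CNN, which runs a costly per-region subnetwork on every RoI, R-FCN is fully convolutional and shares almost all computation across the whole image by using position-sensitive score maps. It uses a Region Proposal Network just as Faster R-CNN does, and it is reported to run faster than Faster R-CNN at comparable accuracy.
'''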
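
As a rough illustration of the Focal Loss idea mentioned in the answer, the sketch below computes the binary focal loss FL(p_t) = -α_t (1 - p_t)^γ log(p_t) for a single prediction in plain Python and compares it with ordinary cross-entropy. The function names and the two example probabilities are made up for illustration; α = 0.25 and γ = 2 are the defaults reported in the RetinaNet paper.

```python
import math


def cross_entropy(p, y):
    """Binary cross-entropy for one prediction (illustrative helper)."""
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
    return -math.log(p_t)


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for one prediction (illustrative helper).

    p: predicted probability of the positive class, 0 < p < 1
    y: ground-truth label, 1 (object) or 0 (background)
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma shrinks the loss of confident, correct predictions
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical examples: an easy background anchor and a hard object anchor
examples = {
    "easy background": (0.01, 0),
    "hard object": (0.10, 1),
}

for name, (p, y) in examples.items():
    ce = cross_entropy(p, y)
    fl = focal_loss(p, y)
    print(f"{name}: CE={ce:.4f}, FL={fl:.6f}, FL/CE={fl / ce:.6f}")
```

The easy background example keeps only a tiny fraction of its cross-entropy loss, while the hard object example keeps a much larger share, which is the re-weighting effect the answer refers to.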

#### Problem 2.

Run the FasterRCNN and YOLOv5 code given below.  
Present the results and explain the differences between the two models based on those results.

In [2]:
## Package Import

import torch
from torch.utils.data import DataLoader
from torchvision import models, transforms
from torchvision.datasets import CocoDetection
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
import numpy as np
import time
import warnings


warnings.filterwarnings("ignore")

In [9]:
## Data Download

import os
os.makedirs('data', exist_ok=True)
os.makedirs('data/images', exist_ok=True)

!wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
!unzip annotations_trainval2017.zip -d ./data/
!wget http://images.cocodataset.org/zips/val2017.zip
!unzip val2017.zip -d ./data/images/

--2024-09-16 13:16:36--  http://images.cocodataset.org/annotations/annotations_trainval2017.zip
Resolving images.cocodataset.org (images.cocodataset.org)... 52.216.209.57, 54.231.166.33, 3.5.2.216, ...
Connecting to images.cocodataset.org (images.cocodataset.org)|52.216.209.57|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 252907541 (241M) [application/zip]
Saving to: ‘annotations_trainval2017.zip’


2024-09-16 13:22:38 (683 KB/s) - ‘annotations_trainval2017.zip’ saved [252907541/252907541]

Archive:  annotations_trainval2017.zip
  inflating: ./data/annotations/instances_train2017.json  
  inflating: ./data/annotations/instances_val2017.json  
  inflating: ./data/annotations/captions_train2017.json  
  inflating: ./data/annotations/captions_val2017.json  
  inflating: ./data/annotations/person_keypoints_train2017.json  
  inflating: ./data/annotations/person_keypoints_val2017.json  
--2024-09-16 13:22:42--  http://images.cocodataset.org/zips/v

In [13]:
## FasterRCNN

# Paths
image_dir = "./data/images/val2017/"
json_path = "./data/annotations/instances_val2017.json"

# Transform
transform = transforms.Compose([
    transforms.ToTensor()
])
def collate_fn(batch):
    return tuple(zip(*batch))
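# Dataset and DataLoader
# Note: num_workers=2 starts worker processes, which crashed in the run recorded below;
# num_workers=0 loads data in the main process and avoids that.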
dataset = CocoDetection(root=image_dir, annFile=json_path, transform=transform)
data_loader = DataLoader(dataset, batch_size=4, shuffle=False, num_workers=2, collate_fn=collate_fn)

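# Load and configure the model
model = models.detection.fasterrcnn_resnet50_fpn(pretrained=True)

# Set the number of classes to 3 (classes 0 and 2, plus background).
# Note: the new FastRCNNPredictor head is randomly initialized, so its labels are not meaningful until it is fine-tuned.
num_classes = 3  # two classes (0 and 2) plus background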
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

# Use the GPU if available
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
model.to(device)

# Put the model in evaluation mode
model.eval()

# Counters for the accuracy calculation
total_correct = 0
total_predictions = 0

# Start timing
start_time = time.time()

# Stop after 100 batches to compare prediction speed (batch_size=4, so 400 images)
cnt = 0

with torch.no_grad():
    for images, targets in data_loader:
        cnt += 1
        images = list(image.to(device) for image in images)

        # Convert each image's targets to tensors and move them to the device
        processed_targets = []
        for target in targets:
            processed_target = {}
            processed_target['boxes'] = torch.tensor([ann['bbox'] for ann in target]).to(device)
            processed_target['labels'] = torch.tensor([ann['category_id'] for ann in target]).to(device)
            processed_targets.append(processed_target)

        # Model prediction
        outputs = model(images)

        for i, output in enumerate(outputs):
            pred_labels = output['labels'].cpu().numpy()
            true_labels = processed_targets[i]['labels'].cpu().numpy()

            # If there are more predictions than ground-truth labels, truncate the predictions
            if len(pred_labels) > len(true_labels):
                pred_labels = pred_labels[:len(true_labels)]

            # Count predicted labels that match the ground-truth labels position by position
            correct = np.sum(pred_labels == true_labels[:len(pred_labels)])
            total_correct += correct
            total_predictions += len(true_labels)

        if cnt == 100:
            break

# Stop timing
end_time = time.time()

# Compute elapsed time and accuracy
time_taken = end_time - start_time
accuracy = total_correct / total_predictions if total_predictions > 0 else 0

print(f"Elapsed time: {time_taken:.2f} s")
print(f"Accuracy: {accuracy * 100:.2f}%")

loading annotations into memory...
Done (t=0.38s)
creating index...
index created!


Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
AttributeError: Can't get attribute 'collate_fn' on <module '__main__' (built-in)>
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
AttributeError: Can't get attribute 'collate_fn' on <module '__main__' (built-in)>


RuntimeError: DataLoader worker (pid(s) 21112, 21115) exited unexpectedly
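
The `AttributeError: Can't get attribute 'collate_fn'` messages above come from the DataLoader worker processes. A likely cause, assuming the cell was run in a notebook on macOS where workers are started with `spawn`, is that a function defined in a notebook cell cannot be re-imported by the workers. Common workarounds are `num_workers=0` or moving `collate_fn` into an importable `.py` module. The plain-Python sketch below only illustrates what `collate_fn` itself does with a batch; the sample data is made up, and the `DataLoader` call is shown as a comment because it depends on the notebook's objects.

```python
def collate_fn(batch):
    # Transpose a list of (image, target) pairs into (images, targets)
    # without stacking, so images of different sizes can share a batch.
    return tuple(zip(*batch))


# Hypothetical stand-ins for (image tensor, annotation list) samples
batch = [
    ("image_0", [{"category_id": 1}]),
    ("image_1", []),
    ("image_2", [{"category_id": 18}, {"category_id": 1}]),
]

images, targets = collate_fn(batch)
print(images)                      # one entry per image
print([len(t) for t in targets])   # number of annotations per image

# With the notebook's objects, loading in the main process would look like:
# data_loader = DataLoader(dataset, batch_size=4, shuffle=False,
#                          num_workers=0, collate_fn=collate_fn)
```

Keeping the targets as a tuple of lists matters here because each COCO image has a different number of annotations, which the default collate function cannot stack.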

In [15]:
## YOLOv5

# Paths
image_dir = "./data/images/val2017/"
json_path = "./data/annotations/instances_val2017.json"

# Transform
transform = transforms.Compose([
    transforms.Resize((640, 640)),  # default YOLOv5 input size
    transforms.ToTensor()
])

# Dataset and DataLoader
dataset = CocoDetection(root=image_dir, annFile=json_path, transform=transform)
data_loader = DataLoader(dataset, batch_size=1, shuffle=False, num_workers=2)

# Load the model (YOLOv5 pretrained on COCO, from PyTorch Hub)
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

# Use the GPU if available
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
model.to(device)
model.eval()

# Counters for the accuracy calculation
total_correct = 0
total_predictions = 0

# Start timing
start_time = time.time()

# Stop after 100 images to compare prediction speed (batch_size=1)
cnt = 0

with torch.no_grad():
    for images, targets in data_loader:
        cnt += 1
        images = images.to(device)

        # Model prediction
        outputs = model(images)

        # Extract the predicted class information from the output
        for output in outputs:
            pred_labels = output[:, 5:].argmax(1).cpu().numpy()  # predicted class labels

            # Extract the ground-truth labels to compare against the predictions
            true_labels = [t['category_id'].item() for t in targets]

            # Count predicted labels that match the ground-truth labels position by position
            correct = np.sum(pred_labels[:len(true_labels)] == true_labels)
            total_correct += correct
            total_predictions += len(true_labels)

        if cnt == 100:
            break

# Stop timing
end_time = time.time()

# Compute the elapsed time
time_taken = end_time - start_time

# Compute the accuracy
if total_predictions > 0:
    accuracy = (total_correct / total_predictions) * 100
else:
    accuracy = 0

# Print elapsed time and accuracy
print(f"Elapsed time: {time_taken:.2f} s")
print(f"Accuracy: {accuracy:.2f}%")

loading annotations into memory...
Done (t=0.38s)
creating index...
index created!


Using cache found in /Users/ganbrygna/.cache/torch/hub/ultralytics_yolov5_master


Collecting ultralytics
  Downloading ultralytics-8.2.94-py3-none-any.whl.metadata (41 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 41.9/41.9 kB 1.3 MB/s eta 0:00:00
Collecting py-cpuinfo (from ultralytics)
  Downloading py_cpuinfo-9.0.0-py3-none-any.whl.metadata (794 bytes)
Collecting ultralytics-thop>=2.0.0 (from ultralytics)
  Downloading ultralytics_thop-2.0.6-py3-none-any.whl.metadata (9.1 kB)
Downloading ultralytics-8.2.94-py3-none-any.whl (872 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 872.7/872.7 kB 1.1 MB/s eta 0:00:00
Downloading ultralytics_thop-2.0.6-py3-none-any.whl (26 kB)
Downloading py_cpuinfo-9.0.0-py3-none-any.whl (22 kB)
Installing collected packages: py-cpuinfo, ultralytics-thop, ultralytics
Successfully installed py-cpuinfo-9.0.0 ultralytics-8.2.94 ultralytics-thop-2.0.6

[notice] A new release of pip is available: 24.0 -> 24.2
[notice] To update, run: pip install --upgrade pip

requirements: Ultralytics requirements ['gitpython>=3.1.30', 'pillow>=10.3.0', 'requests>=2.32.0', 'setuptools>=70.0.0'] not found, attempting AutoUpdate...
Collecting gitpython>=3.1.30
  Downloading GitPython-3.1.43-py3-none-any.whl.metadata (13 kB)
Collecting pillow>=10.3.0
  Downloading pillow-10.4.0-cp310-cp310-macosx_11_0_arm64.whl.metadata (9.2 kB)
Collecting requests>=2.32.0
  Downloading requests-2.32.3-py3-none-any.whl.metadata (4.6 kB)
Collecting setuptools>=70.0.0
  Downloading setuptools-75.0.0-py3-none-any.whl.metadata (6.9 kB)
Collecting gitdb<5,>=4.0.1 (from gitpython>=3.1.30)
  Downloading gitdb-4.0.11-py3-none-any.whl.metadata (1.2 kB)
Collecting smmap<6,>=3.0.1 (from gitdb<5,>=4.0.1->gitpython>=3.1.30)
  Downloading smmap-5.0.1-py3-none-any.whl.metadata (4.3 kB)
Downloading GitPython-3.1.43-py3-none-any.whl (207 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 207.3/207.3 kB 9.0 MB/s eta 0:00:00
Downloading pil

[notice] A new release of pip is available: 24.0 -> 24.2
[notice] To update, run: pip install --upgrade pip
YOLOv5 🚀 2024-9-16 Python-3.10.11 torch-2.2.0 CPU

Downloading https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5s.pt to yolov5s.pt...
100%|██████████| 14.1M/14.1M [00:02<00:00, 7.16MB/s]

Fusing layers... 
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
Adding AutoShape... 


Elapsed time: 23.94 s
Accuracy: 0.28%
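
One thing to keep in mind when reading this accuracy: YOLOv5 predicts class indices 0-79, while `category_id` in the COCO annotation file uses the original, non-contiguous IDs 1-90, so the two numberings cannot be compared directly. A minimal sketch of the index-to-ID mapping, assuming the standard 80-class COCO detection subset (the names `UNUSED_IDS`, `COCO80_TO_COCO91`, and `yolo_index_to_coco_id` are illustrative and not part of the notebook):

```python
# COCO category IDs run from 1 to 90, but 10 of them are unused in the
# 80-class detection set. The lookup table is built from that gap list.
UNUSED_IDS = {12, 26, 29, 30, 45, 66, 68, 69, 71, 83}
COCO80_TO_COCO91 = [i for i in range(1, 91) if i not in UNUSED_IDS]


def yolo_index_to_coco_id(index):
    """Map a 0-based 80-class index to the COCO annotation category_id."""
    return COCO80_TO_COCO91[index]


print(len(COCO80_TO_COCO91))       # number of usable classes
print(yolo_index_to_coco_id(0))    # person
print(yolo_index_to_coco_id(16))   # dog
print(yolo_index_to_coco_id(79))   # toothbrush
```

Applying such a mapping to the predicted indices before comparing them with `category_id` would put both sides in the same numbering.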


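Comparing the two results: only the YOLOv5 cell ran to completion here (23.94 s for 100 images on CPU, with a reported accuracy of 0.28%), while the FasterRCNN cell crashed in the DataLoader workers, so its elapsed time and accuracy could not be measured in this run. In general, a 1-stage model such as YOLO has a lower computational cost, so it runs faster but tends to be less accurate, whereas a 2-stage model such as FasterRCNN tends to be slower but more accurate.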
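
Because the accuracy in both cells is computed by comparing label lists position by position, a fairer comparison would first match each prediction to a ground-truth box. The sketch below shows one way to do that, assuming ground-truth boxes in COCO `[x, y, width, height]` format and predictions in `[x1, y1, x2, y2]` format sorted by confidence. The helper names and sample boxes are hypothetical, and this is a simplification rather than the official COCO mAP evaluation.

```python
def xywh_to_xyxy(box):
    """Convert a COCO-style [x, y, width, height] box to [x1, y1, x2, y2]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_correct(preds, gts, iou_thr=0.5):
    """Count predictions that match an unused ground-truth box of the same label.

    preds: list of (xyxy box, label), assumed sorted by descending confidence
    gts:   list of (xywh box, label) as stored in COCO annotations
    """
    gt_boxes = [(xywh_to_xyxy(box), label) for box, label in gts]
    used = set()
    correct = 0
    for p_box, p_label in preds:
        best_j, best_iou = None, iou_thr
        for j, (g_box, g_label) in enumerate(gt_boxes):
            if j in used or g_label != p_label:
                continue
            overlap = iou(p_box, g_box)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j is not None:
            used.add(best_j)  # each ground-truth box can be matched only once
            correct += 1
    return correct


# Hypothetical sample: the first prediction overlaps its target, the second does not
gts = [([10, 10, 100, 100], 1), ([200, 50, 50, 80], 18)]
preds = [([12, 8, 108, 112], 1), ([300, 300, 340, 340], 18)]
print(count_correct(preds, gts))
```

Dividing the matched count by the number of ground-truth boxes gives a recall-like figure at the chosen IoU threshold, which can be computed the same way for both models.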