# [YOLOv8](https://docs.ultralytics.com/)

## Installation

1. Install PyTorch.
2. Install YOLOv8.
    - `pip install ultralytics`
3. When running in Jupyter Notebook, install the following to display progress bars (optional).
    - `conda install -y -c conda-forge ipywidgets`

In [1]:
!pip install ultralytics



## Usage
- CLI (command line interface): run inference, validation, and training with terminal commands.
- Python library: run inference, validation, and training from code.

# Basic CLI command structure

- Syntax
    - <span style='font-size:1.3em'>**yolo**  **task**=detect|classify|segment|pose  **mode**=train|val|predict|export  **model**=yolov8n.yaml|yolov8n.pt|..  **args**</span>
    - <b style='font-size:1.2em'>task:</b> One of \[detect, classify, segment, pose\]. \[optional\]: if omitted, the task is inferred from the model.
        - **detect:** Object detection
        - **classify:** Image classification
        - **segment:** Instance segmentation
        - **pose:** Pose estimation
    - <b style='font-size:1.2em'>mode:</b> One of \[train, val, predict, export\]. \[required\]
        - **train:** Trains the model on a custom dataset.
        - **val:** Evaluates model performance.
        - **predict:** Runs inference on input images.
        - **export:** Converts the model to another format.
    - <b style='font-size:1.2em'>model:</b> Path to a **pretrained model** or a **model configuration yaml file**. \[required\]
        - Pretrained model file path
            - Path to a pretrained model file that matches the task.
            - Use this for transfer learning or fine-tuning.
        - Model architecture yaml file path
            - Path to a model configuration file (yaml) that matches the task.
            - Specify this in train mode to build a new model and train it from scratch.
        - Pretrained models provided by Ultralytics
            - Five models are provided per task, by size. Larger models are more accurate than smaller ones but slower.
            - If a model is not found on the local machine at the first inference or training run, it is downloaded.
            - https://github.com/ultralytics/ultralytics#models
            - ### <a id="제공-모델"></a>Available models
            
            | **task \ model size** | **nano** | **small** | **medium** | **large** | **xlarge** |
            |:--------------------|----------|-------------|------------|-----------|----------|
            | **detection**      | yolov8n  | yolov8s     | yolov8m    | yolov8l   | yolov8x    |
            | **segmentation**   | yolov8n-seg  | yolov8s-seg     | yolov8m-seg    | yolov8l-seg   | yolov8x-seg    |
            | **classification** | yolov8n-cls  | yolov8s-cls     | yolov8m-cls    | yolov8l-cls   | yolov8x-cls    |         
            | **pose estimation** | yolov8n-pose  | yolov8s-pose     | yolov8m-pose    | yolov8l-pose   | yolov8x-pose    |
            
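            - With a `pt` extension the pretrained weights are downloaded and used; with a `yaml` extension the model architecture configuration file is used.
                - Use a pretrained model for fine-tuning or inference, and a yaml configuration file for training from scratch.
    - <b style='font-size:1.2em'>args:</b> Additional settings related to the task and mode.
        - https://docs.ultralytics.com/cfg/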
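
The command structure and the model naming scheme above can be sketched in a few lines of Python. This is only an illustration: `model_name` and `build_yolo_command` are hypothetical helpers written for this note, not part of ultralytics, and the size and suffix values are copied from the model table.

```python
import shlex

# Size letters and task suffixes taken from the model table above.
SIZES = ("n", "s", "m", "l", "x")
TASK_SUFFIX = {"detect": "", "segment": "-seg", "classify": "-cls", "pose": "-pose"}


def model_name(task: str, size: str, ext: str = "pt") -> str:
    """Return a model file name such as 'yolov8s-seg.pt' (hypothetical helper)."""
    if size not in SIZES:
        raise ValueError(f"unknown size: {size!r}")
    return f"yolov8{size}{TASK_SUFFIX[task]}.{ext}"


def build_yolo_command(task: str, mode: str, model: str, **args) -> str:
    """Assemble 'yolo task=... mode=... model=... key=value ...' (hypothetical helper)."""
    parts = ["yolo", f"task={task}", f"mode={mode}", f"model={model}"]
    parts += [f"{key}={value}" for key, value in args.items()]
    return shlex.join(parts)  # quotes any part that the shell would otherwise split


cmd = build_yolo_command("detect", "predict", model_name("detect", "s"),
                         source="test_image/3.jpg", save=True)
print(cmd)
# yolo task=detect mode=predict model=yolov8s.pt source=test_image/3.jpg save=True
```

Passing `ext="yaml"` gives the configuration file name (for example `yolov8n.yaml`), which corresponds to the train-from-scratch case described above.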

# [Object Detection](https://docs.ultralytics.com/tasks/detection/)

##  Predict (inference)

### Loading a model
- Run inference with a pretrained model provided by Ultralytics or with a model you trained yourself.
- Ultralytics provides [pretrained models](#제공-모델) for object detection.
    - The object detection models are trained on the COCO dataset.
    - Specifying a model name downloads the weights automatically.

### CLI
`yolo task=detect mode=predict model=model_path source=path_to_input_image`
- Additional settings (configuration)
    - https://docs.ultralytics.com/cfg
    

In [2]:
!pip install ultralytics --upgrade



In [1]:
# Use %cd rather than !cd: !cd runs in a subshell, so the directory change would not persist.
%cd /home/parking/ml/ml_colab_project/Object_Detection/

In [16]:
!yolo  task=detect  mode=predict   model=models/yolov8s.pt   source=test_image/3.jpg  save=True  save_txt=True  line_width=1

Ultralytics YOLOv8.0.117 🚀 Python-3.10.11 torch-2.0.1+cu118 CUDA:0 (NVIDIA GeForce RTX 3060, 12042MiB)
YOLOv8s summary (fused): 168 layers, 11156544 parameters, 0 gradients

image 1/1 /home/parking/ml/ml_colab_project/Object_Detection/test_image/3.jpg: 448x640 1 car, 1 cup, 1 chair, 1 tv, 1 mouse, 1 keyboard, 3 cell phones, 128.3ms
Speed: 1.3ms preprocess, 128.3ms inference, 1.7ms postprocess per image at shape (1, 3, 640, 640)
Results saved to runs/detect/predict3
1 label saved to runs/detect/predict3/labels


### Python

In [2]:
from ultralytics import YOLO

In [12]:
model = YOLO("models/yolov8x.pt")  # Create a YOLO object, passing the path of the pretrained model to use.

Downloading https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8x.pt to models/yolov8x.pt...
100%|██████████| 131M/131M [00:04<00:00, 28.4MB/s] 


In [19]:
image_path = 'test_image/bus.jpg'
result_list = model(image_path, save=True, save_txt=True, line_width=1)


image 1/1 /home/parking/ml/ml_colab_project/Object_Detection/test_image/bus.jpg: 640x480 5 persons, 1 bicycle, 1 bus, 141.8ms
Speed: 12.7ms preprocess, 141.8ms inference, 1.3ms postprocess per image at shape (1, 3, 640, 640)
Results saved to runs/detect/predict7
3 labels saved to runs/detect/predict7/labels


In [20]:
type(result_list), len(result_list)
# The call returns a list holding one inference result per input image.

(list, 1)

In [21]:
type(result_list[0])

ultralytics.yolo.engine.results.Results

### Inference on multiple images at once
- Pass the image paths together as a list.

In [32]:
from glob import glob
file_path = glob('test_image/*.jpg')
file_path

['test_image/1433148424953500.jpg',
 'test_image/2.jpg',
 'test_image/3.jpg',
 'test_image/4.jpg',
 'test_image/6.jpg',
 'test_image/1.jpg',
 'test_image/5.jpg',
 'test_image/catsss.jpg',
 'test_image/bus.jpg']
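
The list above comes back in file-system order, not alphabetical order. If a reproducible order matters, for example when pairing results with file names, the paths can be sorted first. A minimal sketch using only the standard library; `list_images` is a hypothetical helper, and the demo runs on a temporary folder of empty placeholder files, not on the notebook's `test_image` directory.

```python
import tempfile
from pathlib import Path


def list_images(folder, pattern="*.jpg"):
    """Return matching image paths as strings, sorted by file name (hypothetical helper)."""
    return sorted(str(p) for p in Path(folder).glob(pattern))


# Demo on a throwaway folder with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("bus.jpg", "2.jpg", "1.jpg", "notes.txt"):
        (Path(tmp) / name).touch()
    file_path = list_images(tmp)
    names = [Path(p).name for p in file_path]

print(names)  # ['1.jpg', '2.jpg', 'bus.jpg']
```

The sorted list can be passed to the model in the same way as the `glob` result.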

In [33]:
model = YOLO('models/yolov8x.pt')
result_list = model(file_path, save=True, save_txt=True, line_width=1)


0: 640x640 11 persons, 2 elephants, 1: 640x640 15 persons, 6 cars, 4 buss, 4 traffic lights, 1 umbrella, 2: 640x640 1 car, 2 chairs, 1 tv, 1 mouse, 1 keyboard, 2 cell phones, 3: 640x640 3 elephants, 1 zebra, 4: 640x640 3 persons, 1 wine glass, 2 cups, 1 fork, 1 knife, 3 pizzas, 3 dining tables, 5: 640x640 9 persons, 5 bicycles, 5 cars, 1 motorcycle, 1 bus, 4 traffic lights, 1 dog, 1 backpack, 1 handbag, 6: 640x640 9 persons, 1 tie, 1 bottle, 15 wine glasss, 3 cups, 3 forks, 1 knife, 1 spoon, 1 bowl, 1 potted plant, 1 dining table, 1 vase, 7: 640x640 10 cars, 2 cats, 1 bowl, 8: 640x640 4 persons, 1 bicycle, 1 bus, 329.8ms
Speed: 1.6ms preprocess, 36.6ms inference, 1.0ms postprocess per image at shape (1, 3, 640, 640)
Results saved to runs/detect/predict10
9 labels saved to runs/detect/predict10/labels


In [25]:
len(result_list), type(result_list[0])

(7, ultralytics.yolo.engine.results.Results)

### Inference on images from the web

In [26]:
result_list = model("https://ultralytics.com/images/bus.jpg", save=True)


Downloading https://ultralytics.com/images/bus.jpg to bus.jpg...
100%|██████████| 476k/476k [00:00<00:00, 3.83MB/s]
image 1/1 /home/parking/ml/ml_colab_project/Object_Detection/bus.jpg: 640x480 5 persons, 1 bicycle, 1 bus, 38.1ms
Speed: 10.5ms preprocess, 38.1ms inference, 2.0ms postprocess per image at shape (1, 3, 640, 640)
Results saved to runs/detect/predict8
7 labels saved to runs/detect/predict8/labels


In [35]:
result_list = model('https://d2u3dcdbebyaiu.cloudfront.net/uploads/atch_img/169/f531ef142dbb8dd3211fb40d3fca8f5a_res_a..gif', save=True)


Downloading https://d2u3dcdbebyaiu.cloudfront.net/uploads/atch_img/169/f531ef142dbb8dd3211fb40d3fca8f5a_res_a..gif to f531ef142dbb8dd3211fb40d3fca8f5a_res_a..gif...
100%|██████████| 159k/159k [00:00<00:00, 1.51MB/s]

    causing potential out-of-memory errors for large sources or long-running streams/videos.

    Usage:
        results = model(source=..., stream=True)  # generator of Results objects
        for r in results:
            boxes = r.boxes  # Boxes object for bbox outputs
            masks = r.masks  # Masks object for segment masks outputs
            probs = r.probs  # Class probabilities for classification outputs

video 1/1 (1/1) /home/parking/ml/ml_colab_project/Object_Detection/f531ef142dbb8dd3211fb40d3fca8f5a_res_a..gif: 384x640 1 person, 2 benchs, 1 cat, 126.1ms
Speed: 20.7ms preprocess, 126.1ms inference, 1.6ms postprocess per image at shape (1, 3, 640, 640)
Results saved to runs/detect/predict10
10 labels saved to runs/detect/predict10/labels


In [4]:
img_url = 'https://s.gae9.com/trend/99e262b59dda948c.orig'
result_list = model(img_url, save=True)


Downloading https:\jjal.today\data\file\gallery\1850094512_s9OJf0Qw_0d61635e94751fd731caba342e24a60ab5a5e0f5.gif to 1850094512_s9OJf0Qw_0d61635e94751fd731caba342e24a60ab5a5e0f5.gif...
100%|██████████| 1.19M/1.19M [00:00<00:00, 1.27MB/s]

    causing potential out-of-memory errors for large sources or long-running streams/videos.

    Usage:
        results = model(source=..., stream=True)  # generator of Results objects
        for r in results:
            boxes = r.boxes  # Boxes object for bbox outputs
            masks = r.masks  # Masks object for segment masks outputs
            probs = r.probs  # Class probabilities for classification outputs

video 1/1 (1/18) e:\Python\ml_colab_project\Object_Detection\1850094512_s9OJf0Qw_0d61635e94751fd731caba342e24a60ab5a5e0f5.gif: 384x640 7 persons, 2 ties, 1 cup, 1 clock, 917.2ms
video 1/1 (2/18) e:\Python\ml_colab_project\Object_Detection\1850094512_s9OJf0Qw_0d61635e94751fd731caba342e24a60ab5a5e0f5.gif: 384x640 5 persons, 2 ties, 856.1ms

In [27]:
result_list = model('https://storage3.ilyo.co.kr/contents/article/images/2015/0601/1433148424953500.jpg', save=True)


Downloading https://storage3.ilyo.co.kr/contents/article/images/2015/0601/1433148424953500.jpg to 1433148424953500.jpg...
100%|██████████| 228k/228k [00:00<00:00, 14.7MB/s]
image 1/1 /home/parking/ml/ml_colab_project/Object_Detection/1433148424953500.jpg: 448x640 12 persons, 2 elephants, 31.1ms
Speed: 2.5ms preprocess, 31.1ms inference, 1.8ms postprocess per image at shape (1, 3, 640, 640)
Results saved to runs/detect/predict8
8 labels saved to runs/detect/predict8/labels


## Inference results

### ultralytics.yolo.engine.results.Results
- The model returns a list that holds one Results object per image.
- **Results**: an object holding the inference result for a single image.
- Depending on the task, read the result through one of the following attributes.
    - Detection: `result.boxes` - Boxes type
    - Segmentation: `result.masks` - Masks type
    - Classification: `result.probs` - torch.Tensor type
    - Pose: `result.keypoints` - Keypoints type
- Additional information
    - Results.orig_img: the original input image
    - Results.orig_shape: the size of the original image (height, width)
    - Results.path: the path of the original image
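
The task-to-attribute mapping listed above can be written as a small lookup. This is a sketch only: `task_output` is a hypothetical helper, and a `SimpleNamespace` stands in for a real `Results` object so the snippet runs without a model.

```python
from types import SimpleNamespace

# Task name -> Results attribute, as listed above.
TASK_TO_ATTR = {
    "detect": "boxes",
    "segment": "masks",
    "classify": "probs",
    "pose": "keypoints",
}


def task_output(result, task):
    """Return the task-specific part of a result object (hypothetical helper)."""
    return getattr(result, TASK_TO_ATTR[task], None)


# Stand-in for a detection result: only `boxes` is populated.
fake_result = SimpleNamespace(boxes="<Boxes>", masks=None, probs=None, keypoints=None)
print(task_output(fake_result, "detect"))   # <Boxes>
print(task_output(fake_result, "segment"))  # None
```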

In [31]:
from ultralytics import YOLO

model = YOLO('models/yolov8x.pt')
result_list = model('./test_image/catsss.jpg', save=True)


image 1/1 e:\Python\ml_colab_project\Object_Detection\test_image\catsss.jpg: 512x640 10 cars, 2 cats, 1 bowl, 1019.2ms
Speed: 2.9ms preprocess, 1019.2ms inference, 2.5ms postprocess per image at shape (1, 3, 640, 640)
Results saved to runs\detect\predict14


In [7]:
from ultralytics import YOLO

model = YOLO('models/yolov8m.pt')
result_list = model('./test_image/bus.jpg', save=True)


image 1/1 e:\Python\ml_colab_project\Object_Detection\test_image\bus.jpg: 640x480 4 persons, 1 bus, 622.1ms
Speed: 3.0ms preprocess, 622.1ms inference, 2.5ms postprocess per image at shape (1, 3, 640, 640)
Results saved to runs\detect\predict6


In [8]:
type(result_list), len(result_list), type(result_list[0])

(list, 1, ultralytics.yolo.engine.results.Results)

In [9]:
# Information about the original input image
result = result_list[0]
print('Original image path:', result.path)
print('Original image size:', result.orig_shape)
print('Original image array shape:', result.orig_img.shape)

Original image path: e:\Python\ml_colab_project\Object_Detection\test_image\bus.jpg
Original image size: (1080, 810)
Original image array shape: (1080, 810, 3)


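### Reading object detection results

- The detection result is returned as an ultralytics.yolo.engine.results.**Boxes** object.
    - Access it through Results.boxes.
- Main attributes
    - shape: shape of the result, (number of detected objects, 6)
        - 6: top-left x, top-left y, bottom-right x, bottom-right y, confidence score, label
    - xyxy
        - The `top-left x, top-left y, bottom-right x, bottom-right y` coordinates of each bounding box
    - xyxyn
        - xyxy normalized by the image size
    - xywh
        - The `center x, center y, width, height` of each bounding box
    - xywhn
        - xywh normalized by the image size
    - cls: label of each detected object
    - conf: confidence score for cls (the probability that the object belongs to that class)
    - boxes
        - The raw `x, y, x, y, conf, cls` tensor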
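
The relationship between `xyxy`, `xywh`, and the normalized variants is plain arithmetic. The sketch below reproduces it in pure Python; `xyxy_to_xywh` and `normalize` are hypothetical helpers, and the sample box is rounded from the `boxes.xyxy` output for the bus image in this notebook, whose `orig_shape` is (1080, 810), i.e. height 1080 and width 810.

```python
def xyxy_to_xywh(box):
    """Convert (x1, y1, x2, y2) corners to (center x, center y, width, height)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)


def normalize(box, width, height):
    """Divide x-like values by the image width and y-like values by the image height."""
    a, b, c, d = box
    return (a / width, b / height, c / width, d / height)


bus_xyxy = (2.9311, 229.13, 804.36, 740.81)   # rounded values from the notebook output
xywh = xyxy_to_xywh(bus_xyxy)
xywhn = normalize(xywh, width=810, height=1080)

print([round(v, 2) for v in xywh])    # [403.65, 484.97, 801.43, 511.68]
print([round(v, 4) for v in xywhn])   # [0.4983, 0.449, 0.9894, 0.4738]
```

The results agree with the first rows of `boxes.xywh` and `boxes.xywhn` up to the rounding of the input values.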

In [10]:
boxes = result.boxes  # Detection results (a Boxes object)
print(type(boxes))

<class 'ultralytics.yolo.engine.results.Boxes'>


In [11]:
boxes.shape
# [n, 6]: n = number of detected bboxes (objects), 6 = x, y, x, y, conf, label (x y x y: top-left and bottom-right corners)

torch.Size([5, 6])

In [12]:
# Classification information for the detected bboxes
print(boxes.cls)  # class index of each of the n detected bboxes
print(boxes.conf) # confidence score (probability) of each of the n detected bboxes

tensor([5., 0., 0., 0., 0.])
tensor([0.9595, 0.9276, 0.9226, 0.9017, 0.7927])
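
As an illustration of how `cls` and `conf` are typically combined, the sketch below counts detections per class with an optional confidence threshold. It uses plain Python lists copied from the output above (with real tensors, `boxes.cls.tolist()` and `boxes.conf.tolist()` would give the same kind of lists). `count_by_class` is a hypothetical helper, and the name table holds only the two COCO ids that appear here.

```python
from collections import Counter

COCO_NAMES = {0: "person", 5: "bus"}  # only the two class ids needed for this example

cls_ids = [5.0, 0.0, 0.0, 0.0, 0.0]                # values of boxes.cls shown above
confs = [0.9595, 0.9276, 0.9226, 0.9017, 0.7927]   # values of boxes.conf shown above


def count_by_class(cls_ids, confs, names, min_conf=0.0):
    """Count detections per class name, keeping only those with conf >= min_conf."""
    counts = Counter()
    for cls_id, conf in zip(cls_ids, confs):
        if conf >= min_conf:
            counts[names[int(cls_id)]] += 1
    return dict(counts)


print(count_by_class(cls_ids, confs, COCO_NAMES))                # {'bus': 1, 'person': 4}
print(count_by_class(cls_ids, confs, COCO_NAMES, min_conf=0.9))  # {'bus': 1, 'person': 3}
```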


In [13]:
# bbox position information
print(boxes.xyxy)  # top-left and bottom-right x/y coordinates
print(boxes.xyxyn) # the same coordinates normalized by the image size

tensor([[2.9311e+00, 2.2913e+02, 8.0436e+02, 7.4081e+02],
        [5.0429e+01, 3.9968e+02, 2.4746e+02, 9.0459e+02],
        [6.6839e+02, 3.9516e+02, 8.0971e+02, 8.8096e+02],
        [2.2227e+02, 4.1113e+02, 3.4418e+02, 8.6109e+02],
        [2.3668e-01, 5.5000e+02, 7.8286e+01, 8.7247e+02]])
tensor([[3.6187e-03, 2.1216e-01, 9.9304e-01, 6.8594e-01],
        [6.2258e-02, 3.7008e-01, 3.0551e-01, 8.3759e-01],
        [8.2517e-01, 3.6589e-01, 9.9964e-01, 8.1570e-01],
        [2.7441e-01, 3.8067e-01, 4.2491e-01, 7.9730e-01],
        [2.9220e-04, 5.0926e-01, 9.6650e-02, 8.0784e-01]])


In [14]:
print(boxes.xywh)  # center x, center y, bbox width, height
print(boxes.xywhn)  # center x, center y, bbox width, height, normalized by the image size

tensor([[403.6475, 484.9729, 801.4327, 511.6810],
        [148.9455, 652.1385, 197.0324, 504.9107],
        [739.0483, 638.0619, 141.3201, 485.7975],
        [283.2275, 636.1057, 121.9057, 449.9594],
        [ 39.2615, 711.2346,  78.0497, 322.4717]])
tensor([[0.4983, 0.4490, 0.9894, 0.4738],
        [0.1839, 0.6038, 0.2432, 0.4675],
        [0.9124, 0.5908, 0.1745, 0.4498],
        [0.3497, 0.5890, 0.1505, 0.4166],
        [0.0485, 0.6586, 0.0964, 0.2986]])


In [15]:
from module import util  # local helper module of this project (not part of ultralytics)

print(util.get_color(0))
print(util.get_coco80_classname(0))
print(util.get_imagenet_classname(1))

(205, 92, 92)
person
goldfish, Carassius auratus


In [16]:
idx = boxes.cls[0]

util.get_coco80_classname(int(idx.item()))

'bus'

In [29]:
# Draw the inference results on the original image
import cv2
from module import util  # local helper module of this project
from ultralytics import YOLO

model = YOLO('models/yolov8n.pt')
path = 'test_image/bus.jpg'
result_list = model(path, save=True)
result = result_list[0]

org_img = result.orig_img  # BGR
img = org_img.copy()

boxes = result.boxes
xyxy_list = boxes.xyxy  # top-left / bottom-right coordinates
cls_list = boxes.cls    # labels
conf_list = boxes.conf  # label probabilities
for xyxy, cls, conf in zip(xyxy_list, cls_list, conf_list):
    # Move the tensor to the CPU and convert it to integer pixel coordinates
    xyxy_arr = xyxy.to('cpu').numpy().astype('int32')
    pt1 = xyxy_arr[:2]
    pt2 = xyxy_arr[2:]
    
    label_name = util.get_coco80_classname(int(cls.item()))
    txt = f"{label_name}-{conf.item()*100:.2f}"
    
    color = util.get_color(int(cls.item()) % 10)
    # bbox
    cv2.rectangle(img, pt1=pt1, pt2=pt2, color=color, thickness=2)
    # label
    cv2.putText(img, text=txt, org=pt1-5, fontFace=cv2.FONT_HERSHEY_SIMPLEX, fontScale=0.5, 
                color=color, thickness=1, lineType=cv2.LINE_AA)



image 1/1 e:\Python\ml_colab_project\Object_Detection\test_image\bus.jpg: 640x480 4 persons, 1 bus, 1 stop sign, 222.0ms
Speed: 2.0ms preprocess, 222.0ms inference, 3.0ms postprocess per image at shape (1, 3, 640, 640)
Results saved to runs\detect\predict13


In [30]:
cv2.imshow('frame', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

## Real-time detection


In [32]:
import cv2
from module import util
from ultralytics import YOLO

# Open the webcam
cap = cv2.VideoCapture(0)
# Create the model
model = YOLO('models/yolov8n.pt')
while True:
    # Read one frame
    success, frame = cap.read()
    if not success:
        print('Failed to read a frame')
        break
        
    # Mirror the frame and keep it in BGR, the channel order expected for numpy input
    frame = cv2.flip(frame, 1)
    
    result = model(frame)[0]
    xyxy_list = result.boxes.xyxy.to('cpu').numpy().astype('int32')
    cls_list = result.boxes.cls.to('cpu').numpy().astype('int32')
    conf_list = result.boxes.conf.to('cpu').numpy()
    
    for xyxy, cls, conf in zip(xyxy_list, cls_list, conf_list):
        pt1, pt2 = xyxy[:2], xyxy[2:]
        txt = f"{util.get_coco80_classname(cls)}-{conf*100:.2f}%"
        color = util.get_color(cls % 10)
        cv2.rectangle(frame, pt1, pt2, color=color)
        cv2.putText(frame, txt, org=pt1, fontFace=cv2.FONT_HERSHEY_SIMPLEX, fontScale=1,
                    color=color, thickness=2, lineType=cv2.LINE_AA)
        
    # Show the frame (imshow expects BGR, so no conversion is needed)
    cv2.imshow('frame', frame)
    if cv2.waitKey(1) == 27: # esc 
        break
        
# Release the camera, then close the window
cap.release()
cv2.destroyAllWindows()


0: 384x640 1 person, 174.9ms
Speed: 4.7ms preprocess, 174.9ms inference, 2.2ms postprocess per image at shape (1, 3, 640, 640)

0: 384x640 2 persons, 1 cup, 1 refrigerator, 173.3ms
Speed: 3.3ms preprocess, 173.3ms inference, 2.0ms postprocess per image at shape (1, 3, 640, 640)

0: 384x640 2 persons, 166.7ms
Speed: 2.9ms preprocess, 166.7ms inference, 2.0ms postprocess per image at shape (1, 3, 640, 640)

0: 384x640 2 persons, 1 bottle, 1 tv, 173.6ms
Speed: 2.5ms preprocess, 173.6ms inference, 1.5ms postprocess per image at shape (1, 3, 640, 640)

0: 384x640 1 person, 1 tv, 165.6ms
Speed: 3.0ms preprocess, 165.6ms inference, 2.5ms postprocess per image at shape (1, 3, 640, 640)

0: 384x640 1 person, 1 bottle, 1 tv, 160.3ms
Speed: 3.0ms preprocess, 160.3ms inference, 2.5ms postprocess per image at shape (1, 3, 640, 640)

0: 384x640 1 person, 1 tv, 1 laptop, 1 refrigerator, 164.6ms
Speed: 4.9ms preprocess, 164.6ms inference, 1.0ms postprocess per image at shape (1, 3, 640, 640)

0: 384x

In [33]:
model = YOLO('models/yolov8x.pt')
img_path = 'test_image/image.png'
result_list = model(img_path, save=True)


image 1/1 e:\Python\ml_colab_project\Object_Detection\test_image\image.png: 640x480 (no detections), 995.0ms
Speed: 2.1ms preprocess, 995.0ms inference, 2.0ms postprocess per image at shape (1, 3, 640, 640)
Results saved to runs\detect\predict15
