## YOLO

### PyTorch-based object detection model
- CNN, R-CNN (Regions with CNN features)

#### Installing YOLO (v5 and later)
```shell
> pip install ultralytics
```

In [37]:
# Install YOLO
!pip install ultralytics




[notice] A new release of pip is available: 24.0 -> 25.0.1
[notice] To update, run: python.exe -m pip install --upgrade pip


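#### Prediction from the console

In [38]:
# Predict from the console
## yolo11n.pt - pretrained YOLO model used for prediction
## yolo11n.pt is downloaded automatically if it is not present
## Images at a web URL can also be used as the source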
!yolo predict model=yolo11n.pt source='https://ultralytics.com/images/bus.jpg'

Ultralytics 8.3.109 🚀 Python-3.11.9 torch-2.6.0+cu118 CUDA:0 (NVIDIA GeForce GTX 1650, 4096MiB)
YOLO11n summary (fused): 100 layers, 2,616,248 parameters, 0 gradients, 6.5 GFLOPs

Found https://ultralytics.com/images/bus.jpg locally at bus.jpg
image 1/1 c:\Source\iot-dataanalysis-2025\day08\bus.jpg: 640x480 4 persons, 1 bus, 33.6ms
Speed: 2.7ms preprocess, 33.6ms inference, 82.0ms postprocess per image at shape (1, 3, 640, 480)
Results saved to C:\Source\iot-dataanalysis-2025\runs\detect\predict2
💡 Learn more at https://docs.ultralytics.com/modes/predict


#### Prediction from Python

In [39]:
# Load the YOLO module
from ultralytics import YOLO

In [40]:
# The YOLO class creates the matching prediction model object for whichever model version is passed in
model = YOLO('./yolo11n.pt')

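##### coco8.yaml
- https://github.com/ultralytics/assets/releases/download/v0.0.0/coco8.zip
- coco8 is a small sample dataset of 8 COCO images; the pretrained yolo11n.pt weights are the result of this kind of training on the full COCO dataset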

In [41]:
# coco8.yaml - dataset definition file used for YOLO training
# Note: the argument name is `epochs` (plural). Passing `epoch=100` fails with
# SyntaxError: 'epoch' is not a valid YOLO argument. Similar arguments are i.e. ['epochs=100'].
train_results = model.train(
    data='./coco8.yaml',
    epochs=100,
    imgsz=640,
    device='cuda:0'
)
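A misspelled keyword such as `epoch` only fails once `model.train()` is called, so it can help to check argument names up front. The sketch below is a minimal illustration using only the standard library. `KNOWN_TRAIN_ARGS` is a hypothetical, deliberately incomplete list (the full set is documented at https://docs.ultralytics.com/usage/cfg), and `suggest_argument` is not part of Ultralytics.

```python
import difflib

# Hypothetical, incomplete subset of training arguments, for illustration only.
KNOWN_TRAIN_ARGS = ["data", "epochs", "imgsz", "device", "batch", "lr0"]


def suggest_argument(name, known=KNOWN_TRAIN_ARGS):
    """Return the name itself if it is known, otherwise close matches (possibly none)."""
    if name in known:
        return [name]
    return difflib.get_close_matches(name, known, n=3, cutoff=0.6)


print(suggest_argument("epoch"))  # ['epochs']
```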

#### Image prediction

In [42]:
result = model('./0000001.jpg')


image 1/1 c:\Source\iot-dataanalysis-2025\day08\0000001.jpg: 480x640 1 cat, 23.4ms
Speed: 10.2ms preprocess, 23.4ms inference, 2.4ms postprocess per image at shape (1, 3, 480, 640)


In [43]:
# Load the matplotlib and PIL modules
import matplotlib.pyplot as plt
from PIL import Image, ImageDraw, ImageFont

In [44]:
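# Save the prediction result image
img = result[0].plot()                     # annotated image as a BGR array
img_pil = Image.fromarray(img[..., ::-1])  # reverse the channel axis: BGR -> RGB for PIL
img_pil.save('./predict_result.jpg')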
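
The `img[..., ::-1]` slice reverses the last axis, turning the BGR channel order returned by `plot()` (the OpenCV convention) into the RGB order that PIL expects. A minimal sketch of the same idea on plain nested lists, which stand in for the NumPy array here:

```python
def bgr_to_rgb(image):
    """Reverse the channel order of every pixel in a rows x cols x 3 nested list."""
    return [[pixel[::-1] for pixel in row] for row in image]


# A 1x2 "image": one pure-blue and one pure-red pixel, given in BGR order.
bgr = [[[255, 0, 0], [0, 0, 255]]]
print(bgr_to_rgb(bgr))  # [[[0, 0, 255], [255, 0, 0]]]
```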

### OpenCV
- OpenCV (Open Source Computer Vision): a programming library aimed at real-time computer vision
- Developed by Intel and released in 2000 for use from C and C++
- Wrapped so that it can also be used from Python

```shell
> pip install opencv-python
```

In [None]:
# Install OpenCV
# The PyPI package is `opencv-python`; `open-python` is an unrelated package.
# ultralytics typically installs opencv-python as a dependency already.
!pip install opencv-python


In [45]:
import cv2
cv2.__version__

'4.11.0'

In [46]:
img2 = cv2.imread('./predict_result.jpg')
img2.shape # (464, 640, 3) -> (height, width, channel)

(464, 640, 3)

In [47]:
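# Open a display window (an ASCII window title avoids garbled text on some platforms)
cv2.imshow('Result', img2)
cv2.waitKey(0)  # wait until any key is pressed
cv2.destroyAllWindows()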

#### YOLO prediction

In [None]:
# Instructor's version: resize the image first, then predict
img = cv2.imread('./0000002.jpg')
resized_img = cv2.resize(img, (640, 400))

result = model(resized_img)
plots = result[0].plot()

cv2.imshow('predict_result', plots)
cv2.waitKey(0)
cv2.destroyAllWindows()


0: 416x640 1 bottle, 1 cup, 1 bowl, 1 tv, 1 mouse, 1 book, 83.4ms
Speed: 13.0ms preprocess, 83.4ms inference, 1.5ms postprocess per image at shape (1, 3, 416, 640)
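
`cv2.resize(img, (640, 400))` takes the target size as `(width, height)` and stretches the image if its aspect ratio differs from 640:400. A minimal sketch of computing a target size that keeps the aspect ratio instead. `fit_within` is a hypothetical helper, not an OpenCV function:

```python
def fit_within(width, height, max_width, max_height):
    """Return the largest (width, height) that fits the box and keeps the aspect ratio."""
    scale = min(max_width / width, max_height / height)
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(1280, 720, 640, 400))  # (640, 360)
```

The returned tuple is already in `(width, height)` order, so it could be passed directly as the size argument of `cv2.resize`.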


In [None]:
# My version: predict on the original image, then resize the annotated result
result = model('./0000002.jpg')
plots = result[0].plot()

resized_img = cv2.resize(plots, (640, 400))

cv2.imshow('predict_result', resized_img)
cv2.waitKey(0)
cv2.destroyAllWindows()


image 1/1 c:\Source\iot-dataanalysis-2025\day08\0000002.jpg: 384x640 1 bottle, 1 cup, 1 bowl, 1 dining table, 1 tv, 1 mouse, 1 book, 305.9ms
Speed: 4.1ms preprocess, 305.9ms inference, 1.5ms postprocess per image at shape (1, 3, 384, 640)
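
The two runs report different inference shapes (416x640 and 384x640) because the image is scaled to fit `imgsz` and each side is then padded up to a multiple of the model stride. The sketch below approximates that calculation, assuming the default `imgsz=640` and a stride of 32. It illustrates the idea and is not Ultralytics' own preprocessing code.

```python
import math


def inference_shape(height, width, imgsz=640, stride=32):
    """Approximate the (height, width) fed to the model after scaling and padding."""
    scale = min(imgsz / height, imgsz / width)  # fit the longer side to imgsz
    new_h, new_w = round(height * scale), round(width * scale)

    def pad_to_stride(side):
        # Round a side length up to the next multiple of the stride.
        return math.ceil(side / stride) * stride

    return pad_to_stride(new_h), pad_to_stride(new_w)


print(inference_shape(400, 640))   # (416, 640): the image resized to 640x400
print(inference_shape(720, 1280))  # (384, 640): a 1280x720 frame
```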


#### Video playback
- Works the same way on a Raspberry Pi
- On a Raspberry Pi, using a webcam is recommended

In [53]:
# Video file paths
Video_path = './sample01.mp4'
output_path = './sample01_output.mp4'
count_path = './sample01_count.mp4'
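
Deriving the output names from the input path keeps hand-typed file names from drifting apart. A minimal sketch with `pathlib`. `derived_path` is a hypothetical helper introduced only for this illustration:

```python
from pathlib import Path


def derived_path(video_path, suffix):
    """Build a sibling path such as sample01_output.mp4 from sample01.mp4."""
    source = Path(video_path)
    return source.with_name(f"{source.stem}_{suffix}{source.suffix}")


print(derived_path("./sample01.mp4", "output").name)  # sample01_output.mp4
print(derived_path("./sample01.mp4", "count").name)   # sample01_count.mp4
```

Wrap the result in `str(...)` before passing it to `cv2.VideoWriter`, which expects a string file name.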

In [49]:
# Play the video
cap = cv2.VideoCapture(Video_path) # pass a device index such as 0 instead to read from a webcam or camera

while cap.isOpened():
    ret, frame = cap.read()
    if not ret: break  # no more frames

    cv2.imshow('Video play', frame)

    if cv2.waitKey(1) & 0xFF == ord('q'): # stop when the q key is pressed
        break

cap.release() # release the video
cv2.destroyAllWindows()

#### Real-time YOLO prediction

In [50]:
# Load the time module
import time

In [51]:
cap = cv2.VideoCapture(Video_path)

fps = cap.get(cv2.CAP_PROP_FPS) # video FPS (frames per second)
frame_time = 1.0 / fps # duration of one frame in seconds
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))    # 1280
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))  # 720

# Create a VideoWriter object (writes the annotated frames to a video file)
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))

while cap.isOpened():
    start_time = time.time() # start time of this frame
    ret, frame = cap.read()
    if not ret: break

    # Detect objects
    results = model(frame)
    # Draw the detections (a single frame yields a single result)
    detect_frame = results[0].plot()
    # Write the annotated frame to the output file
    out.write(detect_frame)
    # Show the annotated frame and the original frame
    cv2.imshow('YOLO Object Detection', detect_frame)
    cv2.imshow('Video play', frame)

    # Remaining time until the next frame is due, in milliseconds (at least 1)
    elapsed_time = time.time() - start_time
    delay = max(int((frame_time - elapsed_time) * 1000), 1)

    # Wait for the remaining frame time; stop when the q key is pressed
    if cv2.waitKey(delay) & 0xFF == ord('q'):
        break

cap.release() # release the video
out.release() # finalize the output file
cv2.destroyAllWindows()


0: 384x640 1 train, 88.4ms
Speed: 6.7ms preprocess, 88.4ms inference, 1.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 car, 1 train, 76.9ms
Speed: 3.2ms preprocess, 76.9ms inference, 2.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 train, 11.5ms
Speed: 1.6ms preprocess, 11.5ms inference, 1.8ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 train, 13.6ms
Speed: 1.6ms preprocess, 13.6ms inference, 1.4ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 car, 1 train, 10.9ms
Speed: 1.5ms preprocess, 10.9ms inference, 2.6ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 train, 10.7ms
Speed: 1.9ms preprocess, 10.7ms inference, 3.2ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 car, 1 train, 11.2ms
Speed: 1.6ms preprocess, 11.2ms inference, 2.7ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 train, 10.8ms
Speed: 1.5ms preprocess, 10.8ms inference, 2.6ms postprocess per image at sh
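
To play back at roughly the source frame rate, the wait before the next frame is the frame interval minus the time already spent on reading and inference, with a floor of 1 ms (a `cv2.waitKey` argument of 0 would block until a key is pressed). A minimal sketch of that calculation. The 30 FPS fallback for a missing FPS value is an assumption, and `frame_delay_ms` is a hypothetical helper:

```python
def frame_delay_ms(fps, elapsed_seconds, fallback_fps=30.0):
    """Milliseconds to wait so that one loop iteration lasts about 1/fps seconds."""
    if fps <= 0:  # some sources report 0 when the FPS is unknown
        fps = fallback_fps
    frame_time = 1.0 / fps
    return max(int((frame_time - elapsed_seconds) * 1000), 1)


print(frame_delay_ms(30.0, 0.010))  # 23
print(frame_delay_ms(30.0, 0.088))  # 1 (inference already took longer than one frame)
```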

#### Car Counting
- Count the cars that move down past a specified line
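
Counting objects that pass a line comes down to checking, for each tracked object, whether its center moved from one side of the line to the other between consecutive frames. The sketch below shows one simple way to do this with a cross-product sign test. It is an illustration under that assumption, not the method `ObjectCounter` uses internally, and the coordinates are example values.

```python
def side_of_line(point, line_start, line_end):
    """Cross product sign: positive on one side, negative on the other, 0 on the line."""
    (px, py), (x1, y1), (x2, y2) = point, line_start, line_end
    return (x2 - x1) * (py - y1) - (y2 - y1) * (px - x1)


def crossed_line(prev_center, curr_center, line_start, line_end):
    """True when the center moved from one side of the (infinite) line to the other."""
    before = side_of_line(prev_center, line_start, line_end)
    after = side_of_line(curr_center, line_start, line_end)
    return before * after < 0


line = ((20, 400), (1080, 400))  # horizontal counting line (example coordinates)
print(crossed_line((500, 390), (500, 410), *line))  # True
print(crossed_line((500, 380), (500, 390), *line))  # False
```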

- Install shapely
```shell
pip install shapely==2.0.1
```

In [55]:
# Install shapely
!pip install shapely==2.0.1

Collecting shapely==2.0.1
  Downloading shapely-2.0.1-cp311-cp311-win_amd64.whl.metadata (7.2 kB)
Downloading shapely-2.0.1-cp311-cp311-win_amd64.whl (1.4 MB)
Installing collected packages: shapely
Successfully installed shapely-2.0.1





In [61]:
# Install lap (linear assignment solver used by the Ultralytics tracker)
!pip install lap

Collecting lap
  Downloading lap-0.5.12-cp311-cp311-win_amd64.whl.metadata (6.3 kB)
Downloading lap-0.5.12-cp311-cp311-win_amd64.whl (1.5 MB)
Installing collected packages: lap
Successfully installed lap-0.5.12





In [63]:
import cv2
from ultralytics.solutions import ObjectCounter

cap = cv2.VideoCapture(Video_path)
assert cap.isOpened(), 'Error reading video file' # fail early if the file cannot be opened

region_points = [(20, 400), (1080, 400)] # two end points of the counting line
fps = cap.get(cv2.CAP_PROP_FPS) # video FPS (frames per second)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))    # 1280
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))  # 720
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter(count_path, fourcc, fps, (width, height))

# Core object for detection, tracking, and counting
counter = ObjectCounter(
    show=True, # whether to display frames while processing
    region=region_points, # where to count
    model='./yolo11n.pt', # YOLO11 model
)

while cap.isOpened():
    ret, frame = cap.read()
    if not ret: break

    results = counter(frame)
    out.write(results.plot_im) # difference from the previous loop: write the counter's annotated frame


cap.release() # release the video
out.release() # finalize the output file
cv2.destroyAllWindows()

Ultralytics Solutions:  {'region': [(20, 400), (1080, 400)], 'show_in': True, 'show_out': True, 'colormap': None, 'up_angle': 145.0, 'down_angle': 90, 'kpts': [6, 8, 10], 'analytics_type': 'line', 'json_file': None, 'records': 5, 'show': True, 'model': './yolo11n.pt', 'line_width': 3}

0: 384x640 1 train, 12.2ms
Speed: 1.6ms preprocess, 12.2ms inference, 1.5ms postprocess per image at shape (1, 3, 384, 640)
 Results: SolutionResults(classwise_count={'train': {'IN': 0, 'OUT': 0}}, total_tracks=1)

0: 384x640 1 train, 12.0ms
Speed: 1.3ms preprocess, 12.0ms inference, 1.4ms postprocess per image at shape (1, 3, 384, 640)
 Results: SolutionResults(classwise_count={'train': {'IN': 0, 'OUT': 0}}, total_tracks=1)

0: 384x640 1 train, 10.4ms
Speed: 1.8ms preprocess, 10.4ms inference, 1.5ms postprocess per image at shape (1, 3, 384, 640)
 Results: SolutionResults(classwise_count={'train': {'IN': 0, 'OUT': 0}}, total_tracks=1)

0: 384x640 1 train, 10.9ms
Speed: 1.8ms preprocess, 10.9ms inference
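
Each `SolutionResults` line in the log carries a `classwise_count` dictionary of per-class `IN`/`OUT` counts. A minimal sketch of summing such a dictionary into overall totals. The sample values are made up for illustration, and `total_counts` is a hypothetical helper:

```python
def total_counts(classwise_count):
    """Sum the per-class IN/OUT counts into overall totals."""
    totals = {"IN": 0, "OUT": 0}
    for counts in classwise_count.values():
        totals["IN"] += counts.get("IN", 0)
        totals["OUT"] += counts.get("OUT", 0)
    return totals


sample = {"car": {"IN": 3, "OUT": 1}, "train": {"IN": 0, "OUT": 0}}  # made-up values
print(total_counts(sample))  # {'IN': 3, 'OUT': 1}
```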