# Pose estimation + Object Detection
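- Additional task to implement and test
> Define a rectangular region that contains the door lock and door handle, and detect the case where a person's wrist keypoint is inside that region in the same frame
    - Detect the door in the frame → set the door lock & handle location on the door → check whether a wrist keypoint lies inside that location (region) in the same frame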
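
The final step of the pipeline above (checking whether a wrist keypoint falls inside the door lock & handle region) can be sketched as follows. This is a minimal sketch, not the project's implementation: `Region`, `wrist_in_region`, and all coordinate values are hypothetical names and made-up numbers, and a wrist that was not detected is assumed to be passed as `None`.

```python
from typing import NamedTuple, Optional, Tuple

Point = Tuple[float, float]


class Region(NamedTuple):
    """Axis-aligned rectangle in pixel coordinates (hypothetical helper)."""
    x1: float  # left
    y1: float  # top
    x2: float  # right
    y2: float  # bottom

    def contains(self, point: Point) -> bool:
        # A point on the border counts as inside
        x, y = point
        return self.x1 <= x <= self.x2 and self.y1 <= y <= self.y2


def wrist_in_region(region: Region,
                    left_wrist: Optional[Point],
                    right_wrist: Optional[Point]) -> bool:
    """Return True if at least one detected wrist lies inside the region."""
    return any(w is not None and region.contains(w)
               for w in (left_wrist, right_wrist))


# Made-up values for illustration only
door_lock_region = Region(1000, 300, 1150, 450)
print(wrist_in_region(door_lock_region, None, (1077, 368)))  # True
print(wrist_in_region(door_lock_region, (200, 500), None))   # False
```

The region and the wrist coordinates must be in the same coordinate system (both in pixels of the original image, or both normalized) for the comparison to be meaningful.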

## 1) Keypoint output from pose estimation

In [2]:
# yolo v8 pose test

from PIL import Image
import glob
import os
from ultralytics import YOLO

# Load a model
detect_model = YOLO('yolov8n.pt')  # pretrained YOLOv8n model
pose_model = YOLO('yolov8n-pose.pt')  # load an official pose model

In [3]:
# Get the test images
current_path = os.getcwd()
img_path = current_path + '/pose_test_data'
imgs = glob.glob(img_path + '/*.jpg')

len(imgs)

10

In [5]:
# Run batched inference on a list of images
results = detect_model(imgs, stream=True)  # return a generator of Results objects
detect_folder_path = "./detect_result"

# Create the folder if it does not exist
if not os.path.exists(detect_folder_path):
    os.makedirs(detect_folder_path)
    
# Process results generator
for i, result in enumerate(results):
    boxes = result.boxes  # Boxes object for bounding box outputs
    masks = result.masks  # Masks object for segmentation masks outputs
    keypoints = result.keypoints  # Keypoints object for pose outputs
    probs = result.probs  # Probs object for classification outputs
    # result.show()  # display to screen
    result.save(filename=os.path.join(detect_folder_path, f'result_{i+1}.jpg'))  # save to disk


0: 384x640 1 person, 43.8ms
1: 384x640 1 person, 43.8ms
2: 384x640 1 person, 43.8ms
3: 384x640 1 person, 43.8ms
4: 384x640 1 person, 43.8ms
5: 384x640 1 person, 43.8ms
6: 384x640 1 person, 43.8ms
7: 384x640 1 person, 1 refrigerator, 43.8ms
8: 384x640 1 person, 43.8ms
9: 384x640 1 person, 43.8ms
Speed: 1.4ms preprocess, 43.8ms inference, 0.6ms postprocess per image at shape (1, 3, 384, 640)


In [8]:
pose_folder_path = "./pose_result"

# Create the folder if it does not exist
if not os.path.exists(pose_folder_path):
    os.makedirs(pose_folder_path)

# Run batched inference on a list of images
pose_results = pose_model(imgs, stream=True)  # return a generator of Results objects

# Process results generator
for i, pose_result in enumerate(pose_results):
    boxes = pose_result.boxes  # Boxes object for bounding box outputs
    masks = pose_result.masks  # Masks object for segmentation masks outputs
    keypoints = pose_result.keypoints  # Keypoints object for pose outputs
    probs = pose_result.probs  # Probs object for classification outputs
    # pose_result.show()  # display to screen
    pose_result.save(filename=os.path.join(pose_folder_path, f'pose_result_{i+1}.jpg'))  # save to disk


0: 384x640 1 person, 43.0ms
1: 384x640 1 person, 43.0ms
2: 384x640 1 person, 43.0ms
3: 384x640 1 person, 43.0ms
4: 384x640 1 person, 43.0ms
5: 384x640 1 person, 43.0ms
6: 384x640 1 person, 43.0ms
7: 384x640 1 person, 43.0ms
8: 384x640 1 person, 43.0ms
9: 384x640 1 person, 43.0ms
Speed: 1.4ms preprocess, 43.0ms inference, 0.5ms postprocess per image at shape (1, 3, 384, 640)


- The YOLOv8 pose model output contains only (x, y) coordinates per keypoint; it does not include keypoint names
- Build the name-to-index mapping as shown below ([reference](https://github.com/Alimustoofaa/YoloV8-Pose-Keypoint-Classification))

In [4]:
# Extract keypoint
# Pick an image of a door-lock-pressing pose in which the wrist keypoint is easy to detect

# Run batched inference on a list of images
pose_results = pose_model('./pose_test_data/C021_A18_SY17_P07_S06_01DBS_mp4-1802.jpg')

# Process results generator
for pose_result in pose_results:
    boxes = pose_result.boxes  # Boxes object for bounding box outputs
    masks = pose_result.masks  # Masks object for segmentation masks outputs
    keypoints = pose_result.keypoints  # Keypoints object for pose outputs
    probs = pose_result.probs  # Probs object for classification outputs
    # pose_result.show()  # display to screen
    # pose_result.save(filename='pose_result.jpg')  # save to disk
    
result_keypoint = keypoints.xyn.cpu().numpy()[0]
result_keypoint


image 1/1 /Users/seullee/Documents/STUDY-AI/AIFFEL/Aiffelthon/R_PJ_AIFFELthon/Model/yolov8_test_seul/pose_test_data/C021_A18_SY17_P07_S06_01DBS_mp4-1802.jpg: 384x640 1 person, 69.3ms
Speed: 7.8ms preprocess, 69.3ms inference, 8.7ms postprocess per image at shape (1, 3, 384, 640)


array([[          0,           0],
       [          0,           0],
       [          0,           0],
       [          0,           0],
       [     0.5003,     0.19052],
       [     0.4539,     0.24022],
       [    0.50076,     0.25598],
       [    0.45181,     0.32755],
       [    0.52087,     0.34148],
       [    0.49299,     0.35219],
       [    0.56142,     0.34118],
       [    0.45927,     0.40111],
       [    0.49081,     0.40781],
       [    0.46173,     0.50312],
       [    0.49475,     0.51437],
       [    0.45363,     0.60907],
       [    0.48957,     0.63388]], dtype=float32)

In [15]:
from pydantic import BaseModel

class GetKeypoint(BaseModel):
    NOSE:           int = 0
    LEFT_EYE:       int = 1
    RIGHT_EYE:      int = 2
    LEFT_EAR:       int = 3
    RIGHT_EAR:      int = 4
    LEFT_SHOULDER:  int = 5
    RIGHT_SHOULDER: int = 6
    LEFT_ELBOW:     int = 7
    RIGHT_ELBOW:    int = 8
    LEFT_WRIST:     int = 9
    RIGHT_WRIST:    int = 10
    LEFT_HIP:       int = 11
    RIGHT_HIP:      int = 12
    LEFT_KNEE:      int = 13
    RIGHT_KNEE:     int = 14
    LEFT_ANKLE:     int = 15
    RIGHT_ANKLE:    int = 16

# example 
get_keypoint = GetKeypoint()
right_wrist_x, right_wrist_y = result_keypoint[get_keypoint.RIGHT_WRIST]

In [16]:
right_wrist_x, right_wrist_y

(0.5614185, 0.34118462)
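
The keypoint array above contains `(0, 0)` rows for the nose, both eyes, and the left ear. A minimal sketch of attaching names to the rows while marking such rows as missing is shown below. It assumes that a `(0, 0)` row means the keypoint was not detected; `KEYPOINT_NAMES` and `name_keypoints` are hypothetical names that are not part of Ultralytics, and the sample values are copied from the output above.

```python
from typing import Dict, Optional, Sequence, Tuple

# COCO keypoint order, matching the indices in GetKeypoint
KEYPOINT_NAMES = [
    "NOSE", "LEFT_EYE", "RIGHT_EYE", "LEFT_EAR", "RIGHT_EAR",
    "LEFT_SHOULDER", "RIGHT_SHOULDER", "LEFT_ELBOW", "RIGHT_ELBOW",
    "LEFT_WRIST", "RIGHT_WRIST", "LEFT_HIP", "RIGHT_HIP",
    "LEFT_KNEE", "RIGHT_KNEE", "LEFT_ANKLE", "RIGHT_ANKLE",
]


def name_keypoints(
    keypoints: Sequence[Sequence[float]],
) -> Dict[str, Optional[Tuple[float, float]]]:
    """Map each keypoint name to (x, y), or to None for a (0, 0) row.

    Assumption: a (0, 0) row means the keypoint was not detected.
    """
    named = {}
    for name, (x, y) in zip(KEYPOINT_NAMES, keypoints):
        named[name] = None if (x == 0 and y == 0) else (float(x), float(y))
    return named


# Normalized keypoints copied from the array printed above
sample = [
    [0, 0], [0, 0], [0, 0], [0, 0],
    [0.5003, 0.19052], [0.4539, 0.24022], [0.50076, 0.25598],
    [0.45181, 0.32755], [0.52087, 0.34148], [0.49299, 0.35219],
    [0.56142, 0.34118], [0.45927, 0.40111], [0.49081, 0.40781],
    [0.46173, 0.50312], [0.49475, 0.51437], [0.45363, 0.60907],
    [0.48957, 0.63388],
]

named = name_keypoints(sample)
print(named["NOSE"])         # None
print(named["RIGHT_WRIST"])  # (0.56142, 0.34118)
```

The function only iterates over rows, so it should also accept the NumPy array `result_keypoint` directly.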

- Check that the coordinates land on the correct spot in the image ([openCV docs](https://docs.opencv.org/4.x/dc/da5/tutorial_py_drawing_functions.html), [blog](https://inhovation97.tistory.com/52))
> `cv2.circle(img, center, radius, color[, thickness, lineType])` - draws a circle<br>
> Parameters:
> - img - image to draw on (NumPy array)<br>
> - center - center coordinates of the circle (x, y)<br>
> - radius - radius of the circle<br>
> - color - color as (B, G, R), 0~255 <- note the channel order.<br>
> - thickness - line thickness (-1: filled)<br>
> - lineType
>   - line type, same as in the `cv2.line()` function<br>
>     cv2.LINE_4 - 4-connected line algorithm<br>
>     cv2.LINE_8 - 8-connected line algorithm<br>
>     cv2.LINE_AA - anti-aliased line (no staircase artifacts)<br>

In [20]:
import cv2

image_path = './pose_test_data/C021_A18_SY17_P07_S06_01DBS_mp4-1802.jpg'
image = cv2.imread(image_path)

# get image dimensions
height, width, _ = image.shape
height, width

(1080, 1920)

- The actual image size (1080 x 1920) differs from the size of the image fed to the model (384 x 640)

In [18]:
height, width = 384, 640
height, width

(384, 640)

In [26]:
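# Convert normalized float coordinates to integer pixel coordinates.
# keypoints.xyn is normalized relative to the original image, so scale by the
# size of the image being drawn on, not by the 384x640 model input size.
orig_height, orig_width = image.shape[:2]
right_wrist_x_int = int(right_wrist_x * orig_width)
right_wrist_y_int = int(right_wrist_y * orig_height)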

# Draw circles on the image at the wrist coordinates
cv2.circle(image, (right_wrist_x_int, right_wrist_y_int), 5, (0, 255, 0), -1)  # Green circle for right wrist

cv2.imwrite('./marked_image.jpg', image)

True

- <img src='marked_image.jpg' width=50% height=50%>
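- One dot lands exactly on the wrist, but why does a second dot appear? A possible cause (not verified): `cv2.circle` draws on `image` in place, so if the drawing cell was run once with `height, width = 384, 640` and again with `1080, 1920` without re-reading the image, both circles remain in the saved file.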
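
The effect of the two image sizes on the drawn position can be sketched with a small helper. This is an illustrative sketch: `to_pixel` is a hypothetical name, and it relies on the understanding that Ultralytics normalizes `keypoints.xyn` by the original image size, so only the original width and height give the correct pixel position on the original image.

```python
from typing import Tuple


def to_pixel(x_norm: float, y_norm: float,
             width: int, height: int) -> Tuple[int, int]:
    """Scale normalized [0, 1] coordinates to integer pixel coordinates."""
    return int(x_norm * width), int(y_norm * height)


# Right wrist value taken from the notebook output above
right_wrist = (0.5614185, 0.34118462)

# Scaled by the original image size (1920 x 1080)
print(to_pixel(*right_wrist, width=1920, height=1080))  # (1077, 368)

# Scaled by the model input size (640 x 384): a different point
print(to_pixel(*right_wrist, width=640, height=384))    # (359, 131)
```

On the 1920 x 1080 image, the second result falls in the upper-left part of the frame, far from the wrist, which would be consistent with a stray second dot if both scalings were drawn on the same image.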