#**1. Segmentation**
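* The process of dividing an image into multiple parts (segments)
* Each segment represents an object or feature of interest, which makes it possible to recognize and extract objects or features within the image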

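### 1-1. Types of Segmentation
* Semantic Segmentation: assigns a class label to every pixel in the image
* Instance Segmentation: identifies and segments each individual instance of an object in the image
* Panoptic Segmentation: combines semantic and instance segmentation by assigning both a class label and an instance ID to every pixel in the image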
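
The difference between the three types can be seen on a toy label map. The sketch below is only an illustration: the 4x6 arrays are made up, and the `class_id * 1000 + instance_id` encoding is an assumption chosen for this example, not something the notebook or a particular model prescribes.

```python
import numpy as np

# Toy 4x6 "image" with two objects of class 1 on a background of class 0.
# Semantic segmentation: one class label per pixel; both objects share label 1.
semantic = np.array([
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
])

# Instance segmentation: each object gets its own ID (0 means "no object").
instance = np.array([
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 2, 2],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 0, 0],
])

# Panoptic segmentation: every pixel carries a class label AND an instance ID.
# Assumed encoding for this sketch only: class_id * 1000 + instance_id.
panoptic = semantic * 1000 + instance

print(np.unique(semantic))  # class labels present in the image
print(np.unique(instance))  # instance IDs present in the image
print(np.unique(panoptic))  # combined (class, instance) codes
```

In the semantic map the two objects are indistinguishable, while the panoptic map keeps them apart and still records their shared class.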

### 1-2. Segmentation Models
* Semantic Segmentation: U-Net, SegNet, DeepLab, PSPNet, FCN
* Instance Segmentation: Mask R-CNN (available in the Detectron2 framework), YOLACT, BlendMask, SipMask, SOLOv2
* Panoptic Segmentation: Panoptic FPN, UPSNet, EfficientPS, BlendMask

#**2. YOLO v8**
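* One of the latest YOLO models, usable for image classification, object detection, and instance segmentation
* Developed by Ultralytics, the team behind YOLO v5
* No paper has been published for YOLO v8, so details about the repository and models are taken from the official documentation
* [YOLO v8 documentation](https://docs.ultralytics.com/ko)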

  <center><img src='https://drive.google.com/uc?id=1lNL4InSCiSjcCHo2gGHD2PwqUqQS6vzY' height=800></center>

<center><img src='https://drive.google.com/uc?id=112gPFNhiqd750dv_e74dR-A26mOImn9c'></center>

#**3. Indoor Dataset with YOLO v8**

In [1]:
!pip install ultralytics

Collecting ultralytics
  Downloading ultralytics-8.1.35-py3-none-any.whl (723 kB)
Collecting thop>=0.1.1 (from ultralytics)
  Downloading thop-0.1.1.post2209072238-py3-none-any.whl (15 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch>=1.8.0->ultralytics)
  Downloading nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)
Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch>=1.8.0->ultralytics)
  Downloading nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)
Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch>=1.8.0->ultralytics)
  Downloading nvidia_cuda

In [2]:
import yaml
import numpy as np
from ultralytics import YOLO
from PIL import Image, ImageDraw, ImageFont

In [3]:
np.random.seed(2004)

In [5]:
with open('./coco.yaml') as f:
    # safe_load is enough for a plain mapping and does not construct arbitrary objects
    coco = yaml.safe_load(f)
    class_names = coco['names']  # {class index: class name}

In [6]:
class_names

{0: 'person',
 1: 'bicycle',
 2: 'car',
 3: 'motorcycle',
 4: 'airplane',
 5: 'bus',
 6: 'train',
 7: 'truck',
 8: 'boat',
 9: 'traffic light',
 10: 'fire hydrant',
 11: 'stop sign',
 12: 'parking meter',
 13: 'bench',
 14: 'bird',
 15: 'cat',
 16: 'dog',
 17: 'horse',
 18: 'sheep',
 19: 'cow',
 20: 'elephant',
 21: 'bear',
 22: 'zebra',
 23: 'giraffe',
 24: 'backpack',
 25: 'umbrella',
 26: 'handbag',
 27: 'tie',
 28: 'suitcase',
 29: 'frisbee',
 30: 'skis',
 31: 'snowboard',
 32: 'sports ball',
 33: 'kite',
 34: 'baseball bat',
 35: 'baseball glove',
 36: 'skateboard',
 37: 'surfboard',
 38: 'tennis racket',
 39: 'bottle',
 40: 'wine glass',
 41: 'cup',
 42: 'fork',
 43: 'knife',
 44: 'spoon',
 45: 'bowl',
 46: 'banana',
 47: 'apple',
 48: 'sandwich',
 49: 'orange',
 50: 'broccoli',
 51: 'carrot',
 52: 'hot dog',
 53: 'pizza',
 54: 'donut',
 55: 'cake',
 56: 'chair',
 57: 'couch',
 58: 'potted plant',
 59: 'bed',
 60: 'dining table',
 61: 'toilet',
 62: 'tv',
 63: 'laptop',
 64: 'mou

In [7]:
# https://docs.ultralytics.com/ko/models/yolov8/#key-features
model = YOLO('yolov8n.pt')

Downloading https://github.com/ultralytics/assets/releases/download/v8.1.0/yolov8n.pt to 'yolov8n.pt'...


100%|██████████| 6.23M/6.23M [00:00<00:00, 137MB/s]


In [8]:
results = model('https://ultralytics.com/images/bus.jpg')


Downloading https://ultralytics.com/images/bus.jpg to 'bus.jpg'...


100%|██████████| 476k/476k [00:00<00:00, 72.9MB/s]


image 1/1 /content/bus.jpg: 640x480 4 persons, 1 bus, 1 stop sign, 307.5ms
Speed: 20.2ms preprocess, 307.5ms inference, 23.4ms postprocess per image at shape (1, 3, 640, 480)
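
The summary line above (`4 persons, 1 bus, 1 stop sign`) is a per-class count of the detected boxes. A minimal sketch of that counting step follows. The `class_ids` list is hypothetical, chosen only so that the counts match the log line, and `summarize` is a helper name introduced for this example; in the notebook the IDs would come from `result.boxes.cls`.

```python
from collections import Counter

# Subset of the COCO class names shown earlier in the notebook.
names = {0: "person", 5: "bus", 11: "stop sign"}

# Hypothetical class IDs for one image (order is arbitrary), picked to match
# the logged summary "4 persons, 1 bus, 1 stop sign".
class_ids = [5, 0, 0, 0, 11, 0]


def summarize(class_ids, names):
    """Return a {class name: number of detections} dictionary."""
    return dict(Counter(names[c] for c in class_ids))


summary = summarize(class_ids, names)
print(summary)
```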


In [9]:
results

[ultralytics.engine.results.Results object with attributes:
 
 boxes: ultralytics.engine.results.Boxes object
 keypoints: None
 masks: None
 names: {0: 'person', 1: 'bicycle', 2: 'car', 3: 'motorcycle', 4: 'airplane', 5: 'bus', 6: 'train', 7: 'truck', 8: 'boat', 9: 'traffic light', 10: 'fire hydrant', 11: 'stop sign', 12: 'parking meter', 13: 'bench', 14: 'bird', 15: 'cat', 16: 'dog', 17: 'horse', 18: 'sheep', 19: 'cow', 20: 'elephant', 21: 'bear', 22: 'zebra', 23: 'giraffe', 24: 'backpack', 25: 'umbrella', 26: 'handbag', 27: 'tie', 28: 'suitcase', 29: 'frisbee', 30: 'skis', 31: 'snowboard', 32: 'sports ball', 33: 'kite', 34: 'baseball bat', 35: 'baseball glove', 36: 'skateboard', 37: 'surfboard', 38: 'tennis racket', 39: 'bottle', 40: 'wine glass', 41: 'cup', 42: 'fork', 43: 'knife', 44: 'spoon', 45: 'bowl', 46: 'banana', 47: 'apple', 48: 'sandwich', 49: 'orange', 50: 'broccoli', 51: 'carrot', 52: 'hot dog', 53: 'pizza', 54: 'donut', 55: 'cake', 56: 'chair', 57: 'couch', 58: 'potted p

In [10]:
def draw_bbox(draw, bbox, label, color=(0, 255, 0, 255), confs=None, size=15):
    # font = ImageFont.truetype("/usr/share/fonts/truetype/dejavu/DejaVuMathTeXGyre.ttf", size)

    def set_alpha(color, value):
        # Return a copy of the RGBA color with its alpha channel replaced.
        background = list(color)
        background[3] = value
        return tuple(background)

    # Box: opaque outline with a lightly tinted, semi-transparent fill.
    draw.rectangle(bbox, outline=color, fill=set_alpha(color, 50), width=3)

    # Label text; `confs` is the confidence of this single box, if given.
    text = f"{label}" + ("" if confs is None else f":{float(confs):0.4}")

    # Label background, sized with a rough estimate of 10 px per character.
    text_bbox = bbox[0], bbox[1], bbox[0] + len(text) * 10, bbox[1] + 25
    draw.rectangle(text_bbox, outline=color, fill=set_alpha(color, 150), width=3)
    draw.text((bbox[0] + 5, bbox[1] + 5), text, (0, 0, 0))
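
The translucent fill in `draw_bbox` depends on the drawing context being opened in `'RGBA'` mode on an RGB image, as the notebook does later with `ImageDraw.Draw(img, 'RGBA')`. The sketch below isolates that behaviour on a blank canvas. The canvas size, coordinates, and variable names are arbitrary choices for illustration.

```python
from PIL import Image, ImageDraw

# Blank white RGB canvas standing in for a real photo.
canvas = Image.new("RGB", (100, 100), (255, 255, 255))

# In "RGBA" drawing mode, colors with alpha < 255 are blended with the
# pixels underneath instead of overwriting them.
overlay = ImageDraw.Draw(canvas, "RGBA")
overlay.rectangle(
    [20, 20, 80, 80],
    fill=(0, 255, 0, 50),       # mostly transparent green
    outline=(0, 255, 0, 255),   # fully opaque green border
    width=3,
)

inside = canvas.getpixel((50, 50))   # white tinted slightly green
border = canvas.getpixel((20, 50))   # opaque outline color
outside = canvas.getpixel((5, 5))    # untouched white
print(inside, border, outside)
```

Without the `"RGBA"` mode argument the fill would replace the underlying pixels, hiding the object inside the box.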

In [11]:
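color = []
n_classes = 80  # number of COCO classes
for _ in range(n_classes):
    # Random RGB value plus a fully opaque alpha channel, as plain Python ints.
    c = tuple(int(v) for v in np.random.choice(range(256), size=3)) + (255,)
    color.append(c)
color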

[(14, 197, 145, 255),
 (11, 226, 186, 255),
 (42, 250, 196, 255),
 (98, 179, 186, 255),
 (25, 86, 178, 255),
 (229, 20, 70, 255),
 (18, 214, 221, 255),
 (189, 220, 7, 255),
 (113, 193, 120, 255),
 (225, 10, 70, 255),
 (96, 193, 160, 255),
 (86, 32, 128, 255),
 (0, 196, 4, 255),
 (192, 54, 42, 255),
 (49, 245, 113, 255),
 (16, 126, 239, 255),
 (13, 139, 136, 255),
 (207, 54, 168, 255),
 (232, 166, 198, 255),
 (181, 147, 161, 255),
 (99, 95, 156, 255),
 (23, 244, 202, 255),
 (217, 146, 21, 255),
 (58, 78, 34, 255),
 (82, 190, 203, 255),
 (14, 25, 183, 255),
 (51, 58, 120, 255),
 (51, 60, 3, 255),
 (75, 106, 219, 255),
 (222, 215, 120, 255),
 (163, 202, 236, 255),
 (55, 143, 93, 255),
 (74, 203, 17, 255),
 (157, 105, 100, 255),
 (233, 59, 63, 255),
 (181, 216, 252, 255),
 (146, 102, 25, 255),
 (168, 144, 20, 255),
 (59, 164, 242, 255),
 (70, 100, 32, 255),
 (177, 201, 35, 255),
 (195, 195, 163, 255),
 (40, 216, 150, 255),
 (102, 91, 86, 255),
 (46, 194, 2, 255),
 (206, 154, 205, 255),
 (2
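
Random colors can land close to each other. As an alternative sketch (not what the notebook uses), evenly spaced hues give each class a distinct color without relying on a seed. `make_palette` and its saturation value of 0.8 are assumptions made for this example.

```python
import colorsys


def make_palette(n_classes, alpha=255):
    """Return n_classes RGBA tuples whose hues are evenly spaced."""
    palette = []
    for i in range(n_classes):
        hue = i / n_classes  # position on the color wheel, in [0, 1)
        r, g, b = colorsys.hsv_to_rgb(hue, 0.8, 1.0)
        palette.append((round(r * 255), round(g * 255), round(b * 255), alpha))
    return palette


palette = make_palette(80)
print(palette[:3])
```

The resulting list has the same shape as `color` above (one RGBA tuple per class index), so it could be passed to `draw_bbox` in the same way.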

In [14]:
img = Image.open('./bus.jpg')
img = img.resize((640,640))
width, height = img.size
draw = ImageDraw.Draw(img, 'RGBA')
draw

<PIL.ImageDraw.ImageDraw at 0x781052d99c30>

In [16]:
for result in results:
    result = result.cpu()
    xyxys = result.boxes.xyxyn  # boxes as (x1, y1, x2, y2), normalized to [0, 1]
    confs = result.boxes.conf   # confidence score per box
    clss = result.boxes.cls     # class index per box

    xyxys = xyxys.numpy()
    clss = map(int, clss.numpy())
    for xyxy, conf, cls in zip(xyxys, confs, clss):
        # Scale the normalized box to the pixel size of the resized image.
        xyxy = [xyxy[0] * width, xyxy[1] * height, xyxy[2] * width, xyxy[3] * height]
        draw_bbox(draw, bbox=xyxy, label=class_names[cls], color=color[cls], confs=conf, size=15)
    img.show()
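
Because `xyxyn` is normalized, the same boxes can be mapped onto an image of any size, which is why the loop still works after resizing to 640x640. The scaling step can be sketched on its own as below. `to_pixels` is a helper name introduced here, and the box values are hypothetical rather than taken from the `bus.jpg` results.

```python
import numpy as np


def to_pixels(xyxyn, width, height):
    """Scale normalized [x1, y1, x2, y2] boxes (values in 0..1) to pixels."""
    xyxyn = np.asarray(xyxyn, dtype=float)
    scale = np.array([width, height, width, height], dtype=float)
    return xyxyn * scale  # broadcasts over an (N, 4) array of boxes


# Hypothetical normalized boxes for illustration.
boxes = np.array([
    [0.25, 0.5, 0.75, 1.0],
    [0.0, 0.0, 0.5, 0.5],
])
pixels = to_pixels(boxes, width=640, height=640)
print(pixels)
```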

In [17]:
for r in results:
    im_array = r.plot()  # annotated image as a BGR numpy array
    im = Image.fromarray(im_array[..., ::-1])  # reverse channels: BGR -> RGB for PIL
    im.show()
    im.save('results.jpg')
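
The `[..., ::-1]` slice reverses the last axis of the array, which swaps the channel order from BGR to RGB. A tiny sketch with a made-up two-pixel image shows the effect:

```python
import numpy as np

# A 1x2 image in BGR order: first pixel pure blue, second pixel pure red.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis turns (B, G, R) into (R, G, B) for every pixel.
rgb = bgr[..., ::-1]

print(rgb[0, 0])  # blue pixel, now expressed in RGB order
print(rgb[0, 1])  # red pixel, now expressed in RGB order
```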