## Image and Video Detection using YOLO

In [1]:
from ultralytics import YOLO
import cv2
import matplotlib.pyplot as plt

In [2]:
# Load the YOLO model
model = YOLO('../yolov8n.pt')  

In [3]:
# Image detection
image_path = 'input_image.jpeg'  
results = model(source=image_path, show=True, conf=0.4, save=True)


image 1/1 C:\Users\user\Computer vision labs\Task 5\input_image.jpeg: 640x448 13 persons, 1 cup, 1 chair, 5 laptops, 188.3ms
Speed: 8.0ms preprocess, 188.3ms inference, 3.6ms postprocess per image at shape (1, 3, 640, 448)
Results saved to runs\detect\predict3
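
The summary line in the output above ("13 persons, 1 cup, 1 chair, 5 laptops") is a per-class count of the detected boxes. The sketch below rebuilds that kind of summary from a plain list of class ids and an id-to-name mapping. The helper name `summarize_detections` and the sample data are hypothetical. With a real result, the ids and names would typically come from `results[0].boxes.cls.tolist()` and `results[0].names`, but this block uses stand-in values so it runs without a model.

```python
from collections import Counter


def summarize_detections(class_ids, names):
    """Build a summary such as '13 persons, 1 cup' from per-box class ids."""
    counts = Counter(int(c) for c in class_ids)
    parts = []
    for class_id in sorted(counts):
        n = counts[class_id]
        label = names[class_id]
        # Naive pluralization: append 's' when more than one object was found.
        parts.append(f"{n} {label}{'s' if n > 1 else ''}")
    return ", ".join(parts)


# Hypothetical stand-in data that mirrors the image output above.
example_names = {0: "person", 41: "cup", 56: "chair", 63: "laptop"}
example_ids = [0] * 13 + [41] + [56] + [63] * 5

print(summarize_detections(example_ids, example_names))
```

Keeping the counting logic separate from the model call makes it easy to reuse for filtering, for example counting only the classes of interest.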


In [4]:
# Video detection
video_path = 'input_video.mp4'
output_path = 'output_video.mp4'

cap = cv2.VideoCapture(video_path)  # open the input video for reading
if not cap.isOpened():
    raise IOError(f'Could not open video: {video_path}')

# Extract the input video's properties.
# Keep the frame rate as a float so rates such as 29.97 are not truncated;
# the fallback of 30.0 is an assumption for files that report 0.
fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

# Set up cv2.VideoWriter with these properties so the output video
# matches the input video's frame rate and frame size.
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))

# Process the video frame by frame
try:
    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:  # end of the video or a read error
            break
        results = model(frame)
        annotated_frame = results[0].plot()  # draw boxes and labels on the frame
        out.write(annotated_frame)
        cv2.imshow('YOLO Detection', annotated_frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):  # press 'q' to stop early
            break
finally:
    # Release resources even if inference raises an exception
    cap.release()
    out.release()
    cv2.destroyAllWindows()
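
Each processed frame prints a `Speed:` line with preprocess, inference, and postprocess times, as in the output of this cell. A minimal sketch for turning such lines into an average per-frame time is given here. The function name `mean_frame_time_ms` and the regular expression are assumptions written for this log format, and the sample text is copied from the first two complete entries of the output.

```python
import re
from statistics import mean

# Matches only complete "Speed:" lines; truncated lines are skipped.
SPEED_RE = re.compile(
    r"Speed: (?P<pre>[\d.]+)ms preprocess, "
    r"(?P<inf>[\d.]+)ms inference, "
    r"(?P<post>[\d.]+)ms postprocess"
)


def mean_frame_time_ms(log_text):
    """Mean of preprocess + inference + postprocess per frame, in ms.

    Returns None when the text contains no complete Speed line.
    """
    totals = [
        float(m["pre"]) + float(m["inf"]) + float(m["post"])
        for m in SPEED_RE.finditer(log_text)
    ]
    return mean(totals) if totals else None


sample_log = """
0: 384x640 1 person, 1 couch, 1 bed, 157.7ms
Speed: 2.0ms preprocess, 157.7ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 beds, 152.6ms
Speed: 1.5ms preprocess, 152.6ms inference, 1.4ms postprocess per image at shape (1, 3, 384, 640)
"""

avg_ms = mean_frame_time_ms(sample_log)
print(f"{avg_ms:.1f} ms per frame, about {1000 / avg_ms:.1f} frames per second")
```

This figure covers only the model's own timings; reading, drawing, writing, and displaying each frame add further time per iteration.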


0: 384x640 1 person, 1 couch, 1 bed, 157.7ms
Speed: 2.0ms preprocess, 157.7ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 beds, 152.6ms
Speed: 1.5ms preprocess, 152.6ms inference, 1.4ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 couch, 2 beds, 121.5ms
Speed: 1.5ms preprocess, 121.5ms inference, 2.2ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 couch, 2 beds, 94.0ms
Speed: 1.9ms preprocess, 94.0ms inference, 1.6ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 couch, 2 beds, 87.6ms
Speed: 3.1ms preprocess, 87.6ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 couch, 2 beds, 90.5ms
Speed: 1.4ms preprocess, 90.5ms inference, 1.3ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 couch, 1 bed, 122.7ms
Speed: 1.7ms preprocess, 122.7ms inference, 1.2ms postprocess per image at shape (1, 3, 384, 640)



0: 384x640 1 person, 1 chair, 3 beds, 85.4ms
Speed: 11.9ms preprocess, 85.4ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 chair, 2 beds, 62.4ms
Speed: 1.5ms preprocess, 62.4ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 chair, 2 beds, 94.2ms
Speed: 0.0ms preprocess, 94.2ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 chair, 2 beds, 87.6ms
Speed: 3.4ms preprocess, 87.6ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 chair, 3 beds, 92.5ms
Speed: 0.0ms preprocess, 92.5ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 chair, 2 beds, 86.8ms
Speed: 15.6ms preprocess, 86.8ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 chair, 2 beds, 65.2ms
Speed: 13.0ms preprocess, 65.2ms inference, 14.5ms postprocess per image at shape (1, 3, 384, 640)

Speed: 0.0ms preprocess, 102.5ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 dog, 1 tie, 87.1ms
Speed: 10.5ms preprocess, 87.1ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 dog, 1 bed, 64.2ms
Speed: 11.4ms preprocess, 64.2ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 dog, 1 tie, 1 bed, 85.9ms
Speed: 5.0ms preprocess, 85.9ms inference, 2.3ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 dog, 1 bed, 61.5ms
Speed: 2.8ms preprocess, 61.5ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 dog, 1 bed, 98.8ms
Speed: 6.2ms preprocess, 98.8ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 cat, 1 dog, 1 bed, 93.0ms
Speed: 9.4ms preprocess, 93.0ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 dog, 1 bed, 174.1ms
Speed: 39.1ms preproce


0: 384x640 1 person, 1 cat, 1 dog, 1 bed, 2 books, 108.3ms
Speed: 5.3ms preprocess, 108.3ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 cat, 1 dog, 1 bed, 1 book, 95.0ms
Speed: 15.6ms preprocess, 95.0ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 cat, 1 dog, 1 book, 102.0ms
Speed: 5.0ms preprocess, 102.0ms inference, 1.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 cat, 1 dog, 1 bed, 3 books, 100.1ms
Speed: 4.0ms preprocess, 100.1ms inference, 1.3ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 cat, 1 dog, 2 books, 98.2ms
Speed: 0.0ms preprocess, 98.2ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 cat, 1 dog, 2 books, 93.0ms
Speed: 5.5ms preprocess, 93.0ms inference, 15.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 cat, 1 dog, 2 books, 78.4ms
Speed: 1.3ms preprocess, 78


0: 384x640 2 persons, 1 dog, 87.8ms
Speed: 4.5ms preprocess, 87.8ms inference, 10.2ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 persons, 1 dog, 84.3ms
Speed: 6.2ms preprocess, 84.3ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 1 dog, 1 cup, 94.5ms
Speed: 15.5ms preprocess, 94.5ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 94.4ms
Speed: 4.0ms preprocess, 94.4ms inference, 13.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 dog, 117.4ms
Speed: 2.4ms preprocess, 117.4ms inference, 1.4ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 1 cat, 1 dog, 1 handbag, 88.1ms
Speed: 3.0ms preprocess, 88.1ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 75.1ms
Speed: 2.0ms preprocess, 75.1ms inference, 1.2ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 dog, 84.1ms
Speed: 1.6ms 


0: 384x640 1 cat, 1 dog, 1 bed, 117.5ms
Speed: 3.0ms preprocess, 117.5ms inference, 1.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 dog, 1 bed, 95.0ms
Speed: 12.4ms preprocess, 95.0ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 dog, 1 bed, 93.8ms
Speed: 3.4ms preprocess, 93.8ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 dogs, 1 bed, 104.3ms
Speed: 3.1ms preprocess, 104.3ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 cat, 2 dogs, 1 bed, 117.9ms
Speed: 5.0ms preprocess, 117.9ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 dog, 1 bed, 97.0ms
Speed: 1.9ms preprocess, 97.0ms inference, 1.8ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 dog, 1 bed, 138.8ms
Speed: 3.6ms preprocess, 138.8ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 dog, 1 bed, 103.5ms
Speed: 3.5ms preprocess, 103.5m


0: 384x640 1 dog, 1 couch, 1 bed, 1 book, 98.7ms
Speed: 10.7ms preprocess, 98.7ms inference, 0.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 dog, 1 couch, 1 bed, 1 book, 78.9ms
Speed: 2.7ms preprocess, 78.9ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 dog, 1 couch, 1 bed, 1 book, 68.9ms
Speed: 4.3ms preprocess, 68.9ms inference, 5.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 dog, 1 couch, 1 bed, 1 book, 104.4ms
Speed: 2.9ms preprocess, 104.4ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 dog, 1 couch, 1 bed, 1 book, 81.5ms
Speed: 4.5ms preprocess, 81.5ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 dog, 1 couch, 1 bed, 85.4ms
Speed: 14.5ms preprocess, 85.4ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 dog, 1 couch, 1 bed, 1 book, 101.8ms
Speed: 7.5ms preprocess, 101.8ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)


0: 384x640 1 person, 2 dogs, 96.8ms
Speed: 6.1ms preprocess, 96.8ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 121.3ms
Speed: 4.0ms preprocess, 121.3ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 99.3ms
Speed: 15.6ms preprocess, 99.3ms inference, 1.4ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 89.9ms
Speed: 2.0ms preprocess, 89.9ms inference, 0.9ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 125.3ms
Speed: 3.0ms preprocess, 125.3ms inference, 2.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 4 dogs, 129.2ms
Speed: 6.2ms preprocess, 129.2ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 90.2ms
Speed: 6.7ms preprocess, 90.2ms inference, 15.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 4 dogs, 99.4ms
Speed: 8.1ms pre


0: 384x640 1 person, 2 dogs, 1 couch, 89.7ms
Speed: 6.1ms preprocess, 89.7ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 1 couch, 100.6ms
Speed: 6.5ms preprocess, 100.6ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 3 dogs, 1 couch, 103.3ms
Speed: 6.7ms preprocess, 103.3ms inference, 1.6ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 1 couch, 98.6ms
Speed: 2.4ms preprocess, 98.6ms inference, 1.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 4 dogs, 1 couch, 82.5ms
Speed: 1.6ms preprocess, 82.5ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 1 couch, 87.5ms
Speed: 5.0ms preprocess, 87.5ms inference, 2.2ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 3 dogs, 1 couch, 86.3ms
Speed: 7.5ms preprocess, 86.3ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

Speed: 5.5ms preprocess, 96.0ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 4 dogs, 1 bed, 86.5ms
Speed: 2.0ms preprocess, 86.5ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 3 dogs, 1 bed, 94.6ms
Speed: 5.9ms preprocess, 94.6ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 4 dogs, 1 bed, 83.6ms
Speed: 5.8ms preprocess, 83.6ms inference, 10.2ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 4 dogs, 2 beds, 98.4ms
Speed: 5.9ms preprocess, 98.4ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 2 beds, 137.5ms
Speed: 1.8ms preprocess, 137.5ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 2 beds, 105.7ms
Speed: 3.5ms preprocess, 105.7ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 4 dogs, 1 bed, 98.9ms

Speed: 9.5ms preprocess, 90.6ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 4 dogs, 1 couch, 1 bed, 73.0ms
Speed: 0.0ms preprocess, 73.0ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 4 dogs, 1 couch, 1 bed, 89.1ms
Speed: 5.6ms preprocess, 89.1ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 3 dogs, 1 couch, 1 bed, 84.6ms
Speed: 5.0ms preprocess, 84.6ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 3 dogs, 1 bed, 78.2ms
Speed: 0.0ms preprocess, 78.2ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 2 dogs, 1 cup, 1 bed, 102.8ms
Speed: 0.0ms preprocess, 102.8ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 1 cup, 1 bed, 79.5ms
Speed: 2.6ms preprocess, 79.5ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)


Speed: 0.0ms preprocess, 145.4ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 3 dogs, 1 bed, 96.0ms
Speed: 0.0ms preprocess, 96.0ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 4 dogs, 1 bed, 102.5ms
Speed: 12.8ms preprocess, 102.5ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 4 dogs, 1 cup, 1 bed, 204.8ms
Speed: 3.2ms preprocess, 204.8ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 2 beds, 132.7ms
Speed: 1.7ms preprocess, 132.7ms inference, 1.2ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 4 dogs, 1 cup, 2 beds, 170.1ms
Speed: 4.2ms preprocess, 170.1ms inference, 2.8ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 2 beds, 144.1ms
Speed: 4.3ms preprocess, 144.1ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 4 d


0: 384x640 1 person, 3 dogs, 2 beds, 1 teddy bear, 124.9ms
Speed: 3.5ms preprocess, 124.9ms inference, 3.9ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 2 beds, 2 teddy bears, 94.7ms
Speed: 2.0ms preprocess, 94.7ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 2 beds, 2 teddy bears, 157.7ms
Speed: 0.0ms preprocess, 157.7ms inference, 5.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 2 beds, 2 teddy bears, 101.8ms
Speed: 10.9ms preprocess, 101.8ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 1 bed, 84.7ms
Speed: 15.7ms preprocess, 84.7ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 4 dogs, 2 beds, 1 teddy bear, 98.1ms
Speed: 4.9ms preprocess, 98.1ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 2 beds, 2 teddy bears, 87.1ms
Speed: 1


0: 384x640 1 person, 1 dog, 1 bed, 96.5ms
Speed: 9.5ms preprocess, 96.5ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 dog, 2 beds, 94.0ms
Speed: 0.6ms preprocess, 94.0ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 dog, 2 beds, 104.9ms
Speed: 0.0ms preprocess, 104.9ms inference, 1.4ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 dog, 1 bed, 84.2ms
Speed: 0.0ms preprocess, 84.2ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 dog, 2 beds, 79.8ms
Speed: 0.0ms preprocess, 79.8ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 dog, 2 beds, 84.8ms
Speed: 0.0ms preprocess, 84.8ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 2 beds, 90.3ms
Speed: 0.0ms preprocess, 90.3ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)


Speed: 0.0ms preprocess, 83.6ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 persons, 2 dogs, 1 bed, 1 teddy bear, 125.4ms
Speed: 5.7ms preprocess, 125.4ms inference, 3.7ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 persons, 2 dogs, 1 dining table, 1 teddy bear, 109.8ms
Speed: 0.0ms preprocess, 109.8ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 persons, 2 dogs, 118.3ms
Speed: 5.3ms preprocess, 118.3ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 2 dogs, 1 cup, 118.6ms
Speed: 3.8ms preprocess, 118.6ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 2 dogs, 2 beds, 111.0ms
Speed: 0.0ms preprocess, 111.0ms inference, 0.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 2 dogs, 2 beds, 1 teddy bear, 127.3ms
Speed: 0.0ms preprocess, 127.3ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)


0: 384x640 1 person, 3 dogs, 1 bed, 108.2ms
Speed: 2.7ms preprocess, 108.2ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 1 bed, 146.0ms
Speed: 3.7ms preprocess, 146.0ms inference, 5.7ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 1 bed, 1 teddy bear, 117.8ms
Speed: 9.7ms preprocess, 117.8ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 1 bed, 1 teddy bear, 129.5ms
Speed: 0.0ms preprocess, 129.5ms inference, 5.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 1 bed, 1 teddy bear, 139.8ms
Speed: 9.8ms preprocess, 139.8ms inference, 5.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 1 bed, 1 teddy bear, 124.8ms
Speed: 7.4ms preprocess, 124.8ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 1 bed, 1 teddy bear, 113.1ms
Speed: 3.1ms preprocess, 11

Speed: 10.9ms preprocess, 110.9ms inference, 1.2ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 2 beds, 1 dining table, 119.4ms
Speed: 7.9ms preprocess, 119.4ms inference, 1.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 1 bed, 1 dining table, 99.8ms
Speed: 2.7ms preprocess, 99.8ms inference, 1.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 4 dogs, 2 beds, 1 dining table, 72.0ms
Speed: 2.7ms preprocess, 72.0ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 4 dogs, 2 beds, 98.7ms
Speed: 4.5ms preprocess, 98.7ms inference, 2.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 4 dogs, 2 beds, 103.5ms
Speed: 3.4ms preprocess, 103.5ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 4 dogs, 2 beds, 105.1ms
Speed: 3.8ms preprocess, 105.1ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)


0: 384x640 1 person, 1 dog, 1 bed, 112.5ms
Speed: 7.1ms preprocess, 112.5ms inference, 6.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 dog, 1 bed, 131.7ms
Speed: 3.8ms preprocess, 131.7ms inference, 4.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 bed, 109.2ms
Speed: 3.5ms preprocess, 109.2ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 1 bed, 91.1ms
Speed: 0.0ms preprocess, 91.1ms inference, 4.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 bed, 130.3ms
Speed: 3.5ms preprocess, 130.3ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 bed, 121.8ms
Speed: 0.0ms preprocess, 121.8ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 dog, 1 bed, 124.8ms
Speed: 13.4ms preprocess, 124.8ms inference, 1.7ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 dog


0: 384x640 1 person, 6 dogs, 1 chair, 1 bed, 94.4ms
Speed: 2.0ms preprocess, 94.4ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 5 dogs, 1 chair, 1 bed, 106.5ms
Speed: 5.0ms preprocess, 106.5ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 5 dogs, 1 chair, 2 beds, 162.4ms
Speed: 0.5ms preprocess, 162.4ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 7 dogs, 1 chair, 2 beds, 88.5ms
Speed: 1.5ms preprocess, 88.5ms inference, 1.8ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 7 dogs, 1 chair, 2 beds, 86.7ms
Speed: 1.5ms preprocess, 86.7ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 1 chair, 2 beds, 122.8ms
Speed: 12.1ms preprocess, 122.8ms inference, 2.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 1 chair, 2 beds, 147.9ms
Speed: 1.6ms preprocess, 147.9ms i


0: 384x640 1 person, 2 dogs, 1 chair, 1 bed, 140.3ms
Speed: 9.1ms preprocess, 140.3ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 1 chair, 2 beds, 100.0ms
Speed: 0.0ms preprocess, 100.0ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 1 chair, 1 bed, 103.1ms
Speed: 3.0ms preprocess, 103.1ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 1 chair, 2 beds, 141.5ms
Speed: 4.1ms preprocess, 141.5ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 1 chair, 1 bed, 108.0ms
Speed: 4.0ms preprocess, 108.0ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 dog, 2 chairs, 1 bed, 97.5ms
Speed: 3.1ms preprocess, 97.5ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 1 chair, 2 beds, 119.2ms
Speed: 2.8ms preprocess, 119.2ms 


0: 384x640 1 person, 3 dogs, 1 chair, 2 beds, 168.7ms
Speed: 4.1ms preprocess, 168.7ms inference, 0.6ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 1 chair, 2 beds, 175.3ms
Speed: 2.8ms preprocess, 175.3ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 1 chair, 2 beds, 124.4ms
Speed: 7.1ms preprocess, 124.4ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 1 chair, 2 beds, 108.9ms
Speed: 0.0ms preprocess, 108.9ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 cat, 2 dogs, 1 chair, 2 beds, 117.6ms
Speed: 4.6ms preprocess, 117.6ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 1 chair, 1 bed, 101.4ms
Speed: 6.5ms preprocess, 101.4ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 1 chair, 1 bed, 120.5ms
Speed: 0.0ms preproces


0: 384x640 1 person, 3 dogs, 1 chair, 1 bed, 1 dining table, 87.7ms
Speed: 2.0ms preprocess, 87.7ms inference, 1.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 1 chair, 1 bed, 1 dining table, 85.4ms
Speed: 0.0ms preprocess, 85.4ms inference, 8.9ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 1 chair, 1 bed, 1 dining table, 116.8ms
Speed: 3.0ms preprocess, 116.8ms inference, 1.2ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 cat, 6 dogs, 1 chair, 1 bed, 1 dining table, 111.7ms
Speed: 2.0ms preprocess, 111.7ms inference, 1.8ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 1 chair, 1 bed, 1 dining table, 103.8ms
Speed: 2.6ms preprocess, 103.8ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 cat, 3 dogs, 1 chair, 1 bed, 1 dining table, 95.8ms
Speed: 0.5ms preprocess, 95.8ms inference, 1.1ms postprocess per image at shape (1, 3, 384, 640)


0: 384x640 1 person, 4 dogs, 1 chair, 2 beds, 1 dining table, 101.9ms
Speed: 2.1ms preprocess, 101.9ms inference, 1.6ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 1 chair, 1 bed, 1 dining table, 85.4ms
Speed: 2.9ms preprocess, 85.4ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 cat, 2 dogs, 1 chair, 2 beds, 107.5ms
Speed: 7.1ms preprocess, 107.5ms inference, 1.4ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 cat, 4 dogs, 1 chair, 2 beds, 93.2ms
Speed: 2.5ms preprocess, 93.2ms inference, 1.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 4 dogs, 1 chair, 2 beds, 90.3ms
Speed: 0.0ms preprocess, 90.3ms inference, 15.6ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 4 dogs, 2 chairs, 2 beds, 115.2ms
Speed: 0.5ms preprocess, 115.2ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 6 dogs, 1 chair, 2 


0: 384x640 1 person, 2 dogs, 1 bed, 100.1ms
Speed: 3.2ms preprocess, 100.1ms inference, 1.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 1 bed, 97.4ms
Speed: 3.0ms preprocess, 97.4ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 1 bed, 98.2ms
Speed: 2.0ms preprocess, 98.2ms inference, 1.2ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 1 bed, 87.6ms
Speed: 3.7ms preprocess, 87.6ms inference, 1.4ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 1 bed, 91.5ms
Speed: 1.0ms preprocess, 91.5ms inference, 1.4ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 1 bed, 152.9ms
Speed: 0.0ms preprocess, 152.9ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 1 bed, 79.1ms
Speed: 2.0ms preprocess, 79.1ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)


Speed: 9.0ms preprocess, 132.8ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 dog, 1 bed, 1 teddy bear, 142.0ms
Speed: 6.4ms preprocess, 142.0ms inference, 0.6ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 dog, 1 bed, 1 teddy bear, 98.2ms
Speed: 3.0ms preprocess, 98.2ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 dog, 1 bed, 1 teddy bear, 136.2ms
Speed: 1.9ms preprocess, 136.2ms inference, 1.4ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 dog, 1 bed, 1 teddy bear, 121.9ms
Speed: 6.2ms preprocess, 121.9ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 dog, 1 bed, 1 teddy bear, 130.1ms
Speed: 16.0ms preprocess, 130.1ms inference, 1.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 dog, 1 bed, 1 teddy bear, 104.4ms
Speed: 5.0ms preprocess, 104.4ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)


0: 384x640 2 persons, 1 bed, 81.9ms
Speed: 1.5ms preprocess, 81.9ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 1 cat, 1 bed, 115.3ms
Speed: 5.0ms preprocess, 115.3ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 bed, 84.2ms
Speed: 1.5ms preprocess, 84.2ms inference, 1.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 1 dog, 1 bed, 77.2ms
Speed: 3.6ms preprocess, 77.2ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 1 dog, 1 bed, 89.1ms
Speed: 2.0ms preprocess, 89.1ms inference, 1.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 persons, 1 dog, 122.0ms
Speed: 16.8ms preprocess, 122.0ms inference, 1.6ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 persons, 1 dog, 85.9ms
Speed: 1.5ms preprocess, 85.9ms inference, 2.2ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 persons, 1 dog, 1 bed, 1


0: 384x640 3 dogs, 1 chair, 3 beds, 1 tv, 1 book, 2 teddy bears, 80.3ms
Speed: 0.0ms preprocess, 80.3ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 dog, 1 chair, 2 beds, 2 tvs, 1 book, 2 teddy bears, 88.5ms
Speed: 0.0ms preprocess, 88.5ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 dog, 1 chair, 2 beds, 1 tv, 1 teddy bear, 85.8ms
Speed: 0.0ms preprocess, 85.8ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 chair, 1 bed, 1 tv, 1 book, 131.0ms
Speed: 12.2ms preprocess, 131.0ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 dogs, 1 chair, 2 beds, 1 tv, 3 books, 121.0ms
Speed: 9.3ms preprocess, 121.0ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 dogs, 1 chair, 2 beds, 1 tv, 4 books, 123.3ms
Speed: 0.0ms preprocess, 123.3ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 cats, 1 dog, 

Speed: 0.0ms preprocess, 159.0ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 cat, 5 dogs, 1 chair, 1 bed, 1 tv, 1 book, 94.5ms
Speed: 7.8ms preprocess, 94.5ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 cat, 5 dogs, 2 beds, 1 tv, 2 books, 95.1ms
Speed: 8.1ms preprocess, 95.1ms inference, 15.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 cat, 5 dogs, 2 beds, 1 tv, 2 books, 111.0ms
Speed: 5.1ms preprocess, 111.0ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 cat, 5 dogs, 1 chair, 2 beds, 1 tv, 1 book, 142.0ms
Speed: 9.2ms preprocess, 142.0ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 cat, 6 dogs, 1 bed, 1 tv, 1 book, 134.7ms
Speed: 9.5ms preprocess, 134.7ms inference, 0.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 cat, 5 dogs, 1 chair, 3 beds, 1 tv, 1 book, 106.6ms
Speed: 2.0ms preprocess, 106.6ms inference, 1.

Speed: 0.0ms preprocess, 101.9ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 1 bed, 1 tv, 100.6ms
Speed: 0.0ms preprocess, 100.6ms inference, 2.4ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 1 bed, 1 tv, 78.8ms
Speed: 0.0ms preprocess, 78.8ms inference, 3.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 7 dogs, 1 chair, 2 beds, 1 tv, 97.8ms
Speed: 11.8ms preprocess, 97.8ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 7 dogs, 1 chair, 2 beds, 1 tv, 119.3ms
Speed: 7.2ms preprocess, 119.3ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 7 dogs, 1 chair, 1 bed, 1 tv, 125.9ms
Speed: 5.2ms preprocess, 125.9ms inference, 1.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 chair, 1 bed, 1 tv, 80.0ms
Speed: 2.0ms preprocess, 80.0ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dog

Speed: 6.1ms preprocess, 108.9ms inference, 2.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 1 chair, 1 bed, 1 tv, 2 books, 1 teddy bear, 113.1ms
Speed: 4.2ms preprocess, 113.1ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 bed, 1 tv, 1 teddy bear, 156.2ms
Speed: 7.3ms preprocess, 156.2ms inference, 3.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 cat, 4 dogs, 1 bed, 1 tv, 1 teddy bear, 84.9ms
Speed: 0.0ms preprocess, 84.9ms inference, 3.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 1 bed, 1 tv, 1 book, 1 teddy bear, 86.1ms
Speed: 0.0ms preprocess, 86.1ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 chair, 1 bed, 1 tv, 1 book, 1 teddy bear, 91.2ms
Speed: 3.5ms preprocess, 91.2ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 dogs, 1 chair, 1 bed, 1 tv, 2 books, 1 teddy bear, 151.8ms


0: 384x640 4 dogs, 2 beds, 1 teddy bear, 86.2ms
Speed: 0.0ms preprocess, 86.2ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 bed, 85.8ms
Speed: 0.0ms preprocess, 85.8ms inference, 3.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 2 beds, 88.4ms
Speed: 3.8ms preprocess, 88.4ms inference, 3.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 134.4ms
Speed: 3.5ms preprocess, 134.4ms inference, 1.4ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 2 beds, 119.3ms
Speed: 3.0ms preprocess, 119.3ms inference, 1.3ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 bed, 120.0ms
Speed: 0.0ms preprocess, 120.0ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 cat, 4 dogs, 2 beds, 117.6ms
Speed: 0.0ms preprocess, 117.6ms inference, 1.2ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 2 beds, 117.6ms
Speed: 3.5ms preproce


0: 384x640 4 dogs, 1 chair, 2 beds, 1 tv, 1 laptop, 2 books, 134.5ms
Speed: 3.5ms preprocess, 134.5ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 2 beds, 1 tv, 1 book, 124.7ms
Speed: 8.0ms preprocess, 124.7ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 2 beds, 1 tv, 145.0ms
Speed: 5.4ms preprocess, 145.0ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 2 beds, 1 tv, 1 book, 104.2ms
Speed: 0.5ms preprocess, 104.2ms inference, 2.4ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 chair, 2 beds, 1 tv, 93.4ms
Speed: 3.0ms preprocess, 93.4ms inference, 1.7ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 2 beds, 1 tv, 147.2ms
Speed: 5.8ms preprocess, 147.2ms inference, 6.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 2 beds, 1 tv, 1 book, 154.4ms
Speed: 6.6ms preprocess

Speed: 4.2ms preprocess, 165.6ms inference, 1.2ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 chair, 2 beds, 1 tv, 154.3ms
Speed: 4.6ms preprocess, 154.3ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 chair, 2 beds, 1 tv, 140.1ms
Speed: 8.5ms preprocess, 140.1ms inference, 7.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 6 dogs, 1 chair, 2 beds, 1 tv, 88.2ms
Speed: 1.5ms preprocess, 88.2ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 2 chairs, 2 beds, 1 tv, 99.5ms
Speed: 2.5ms preprocess, 99.5ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 bed, 1 tv, 1 book, 93.8ms
Speed: 5.7ms preprocess, 93.8ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 bed, 1 tv, 2 books, 144.9ms
Speed: 5.5ms preprocess, 144.9ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 do

Speed: 1.4ms preprocess, 84.5ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 6 dogs, 1 chair, 2 beds, 1 dining table, 1 tv, 119.8ms
Speed: 4.6ms preprocess, 119.8ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 6 dogs, 1 chair, 3 beds, 1 dining table, 1 tv, 129.9ms
Speed: 0.0ms preprocess, 129.9ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 7 dogs, 2 beds, 1 tv, 115.8ms
Speed: 5.1ms preprocess, 115.8ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 7 dogs, 1 chair, 2 beds, 1 tv, 146.7ms
Speed: 0.0ms preprocess, 146.7ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 8 dogs, 3 beds, 1 tv, 96.5ms
Speed: 7.6ms preprocess, 96.5ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 7 dogs, 2 beds, 1 tv, 125.0ms
Speed: 2.5ms preprocess, 125.0ms inference, 1.1ms postprocess per image at shape (1, 3, 384, 640)

0: 38


0: 384x640 6 dogs, 1 chair, 1 bed, 1 tv, 88.5ms
Speed: 0.5ms preprocess, 88.5ms inference, 4.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 chair, 2 beds, 1 tv, 103.8ms
Speed: 5.9ms preprocess, 103.8ms inference, 1.6ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 6 dogs, 1 bed, 1 tv, 117.1ms
Speed: 3.7ms preprocess, 117.1ms inference, 2.6ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 6 dogs, 1 bed, 1 tv, 136.9ms
Speed: 12.7ms preprocess, 136.9ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 6 dogs, 1 bed, 1 tv, 123.5ms
Speed: 6.0ms preprocess, 123.5ms inference, 1.3ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 6 dogs, 1 bed, 1 tv, 99.8ms
Speed: 1.1ms preprocess, 99.8ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 chair, 1 bed, 1 tv, 107.3ms
Speed: 7.3ms preprocess, 107.3ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0


0: 384x640 4 dogs, 2 beds, 1 tv, 140.9ms
Speed: 1.5ms preprocess, 140.9ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 2 beds, 1 tv, 244.7ms
Speed: 5.2ms preprocess, 244.7ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 cat, 5 dogs, 2 chairs, 1 bed, 1 tv, 223.7ms
Speed: 7.0ms preprocess, 223.7ms inference, 4.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 cat, 4 dogs, 1 chair, 2 beds, 1 tv, 405.8ms
Speed: 0.5ms preprocess, 405.8ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 cat, 5 dogs, 1 chair, 2 beds, 1 tv, 314.6ms
Speed: 6.7ms preprocess, 314.6ms inference, 4.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 1 bed, 1 tv, 256.0ms
Speed: 7.1ms preprocess, 256.0ms inference, 7.6ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 1 bed, 1 tv, 318.7ms
Speed: 8.5ms preprocess, 318.7ms inference, 2.8ms post

Speed: 5.0ms preprocess, 246.2ms inference, 10.6ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 bed, 1 tv, 1 teddy bear, 209.5ms
Speed: 3.4ms preprocess, 209.5ms inference, 5.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 bed, 1 tv, 185.7ms
Speed: 2.9ms preprocess, 185.7ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 bed, 1 tv, 137.6ms
Speed: 4.1ms preprocess, 137.6ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 bed, 1 tv, 188.6ms
Speed: 5.3ms preprocess, 188.6ms inference, 2.4ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 bed, 1 tv, 162.0ms
Speed: 4.5ms preprocess, 162.0ms inference, 5.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 bed, 1 tv, 220.6ms
Speed: 6.2ms preprocess, 220.6ms inference, 2.4ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 bed, 1 tv, 198.0ms
Speed: 3.4ms


0: 384x640 5 dogs, 1 bed, 1 dining table, 1 tv, 129.2ms
Speed: 3.2ms preprocess, 129.2ms inference, 1.4ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 4 dogs, 1 bed, 1 tv, 133.2ms
Speed: 9.2ms preprocess, 133.2ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 3 dogs, 1 bed, 1 tv, 169.6ms
Speed: 2.6ms preprocess, 169.6ms inference, 2.2ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 1 bed, 1 tv, 124.0ms
Speed: 7.2ms preprocess, 124.0ms inference, 2.4ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 4 dogs, 1 bed, 1 tv, 147.0ms
Speed: 2.5ms preprocess, 147.0ms inference, 2.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 1 bed, 1 dining table, 1 tv, 167.2ms
Speed: 4.5ms preprocess, 167.2ms inference, 5.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 3 dogs, 1 bed, 1 dining table, 1 tv, 126.9ms
Speed: 6.2ms p


0: 384x640 1 person, 4 dogs, 1 tv, 144.7ms
Speed: 9.8ms preprocess, 144.7ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 5 dogs, 1 tv, 148.8ms
Speed: 6.9ms preprocess, 148.8ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 4 dogs, 1 tv, 128.7ms
Speed: 2.8ms preprocess, 128.7ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 4 dogs, 1 tv, 194.3ms
Speed: 9.8ms preprocess, 194.3ms inference, 1.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 4 dogs, 1 tv, 142.7ms
Speed: 5.8ms preprocess, 142.7ms inference, 4.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 5 dogs, 1 tv, 175.9ms
Speed: 13.5ms preprocess, 175.9ms inference, 8.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 4 dogs, 2 tvs, 147.6ms
Speed: 5.0ms preprocess, 147.6ms inference, 2.1ms postprocess per image at shape (1, 3, 384, 640)

0: 


0: 384x640 2 dogs, 1 cow, 1 chair, 1 bed, 1 tv, 159.3ms
Speed: 0.0ms preprocess, 159.3ms inference, 5.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 tv, 111.6ms
Speed: 4.0ms preprocess, 111.6ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 tv, 141.3ms
Speed: 0.0ms preprocess, 141.3ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 dogs, 1 bed, 1 tv, 122.3ms
Speed: 5.6ms preprocess, 122.3ms inference, 8.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 tv, 138.2ms
Speed: 11.3ms preprocess, 138.2ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 tv, 140.7ms
Speed: 2.5ms preprocess, 140.7ms inference, 1.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 tv, 127.7ms
Speed: 3.1ms preprocess, 127.7ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 chair, 1 tv, 130.2ms
Sp


0: 384x640 1 person, 4 dogs, 2 tvs, 84.9ms
Speed: 2.5ms preprocess, 84.9ms inference, 5.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 2 tvs, 136.9ms
Speed: 8.3ms preprocess, 136.9ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 2 tvs, 84.7ms
Speed: 2.0ms preprocess, 84.7ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 2 tvs, 1 teddy bear, 152.1ms
Speed: 6.2ms preprocess, 152.1ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 2 tvs, 145.3ms
Speed: 3.5ms preprocess, 145.3ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 2 tvs, 1 teddy bear, 99.7ms
Speed: 5.1ms preprocess, 99.7ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 184.5ms
Speed: 9.7ms preprocess, 184.5ms inference, 2.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 4 dogs, 1 tv


0: 384x640 1 person, 2 dogs, 1 dining table, 2 tvs, 124.4ms
Speed: 7.2ms preprocess, 124.4ms inference, 1.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 tv, 81.3ms
Speed: 2.1ms preprocess, 81.3ms inference, 1.3ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 1 bed, 1 tv, 109.4ms
Speed: 9.0ms preprocess, 109.4ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 2 dogs, 1 bed, 2 tvs, 121.7ms
Speed: 3.4ms preprocess, 121.7ms inference, 1.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 2 dogs, 1 bed, 2 tvs, 144.7ms
Speed: 7.5ms preprocess, 144.7ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 2 dogs, 2 tvs, 115.0ms
Speed: 5.0ms preprocess, 115.0ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 persons, 2 dogs, 3 tvs, 1 laptop, 1 teddy bear, 140.1ms
Speed: 5.0ms preprocess, 140.1ms inference, 1.5

Speed: 6.6ms preprocess, 121.4ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 dogs, 1 chair, 2 beds, 1 tv, 91.9ms
Speed: 19.3ms preprocess, 91.9ms inference, 1.2ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 1 bed, 1 dining table, 1 tv, 110.7ms
Speed: 5.8ms preprocess, 110.7ms inference, 4.7ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 dogs, 1 chair, 2 beds, 2 tvs, 109.5ms
Speed: 7.6ms preprocess, 109.5ms inference, 4.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 2 beds, 2 tvs, 1 teddy bear, 101.0ms
Speed: 5.8ms preprocess, 101.0ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 2 beds, 1 tv, 134.5ms
Speed: 8.3ms preprocess, 134.5ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 bed, 1 tv, 96.5ms
Speed: 3.5ms preprocess, 96.5ms inference, 2.5ms postprocess per image at shape (1, 3, 384, 640)

Speed: 5.2ms preprocess, 115.6ms inference, 1.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 1 bed, 86.6ms
Speed: 2.0ms preprocess, 86.6ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 1 bed, 126.4ms
Speed: 7.1ms preprocess, 126.4ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 1 bed, 127.4ms
Speed: 12.0ms preprocess, 127.4ms inference, 4.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 chair, 1 bed, 117.2ms
Speed: 3.0ms preprocess, 117.2ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 bed, 113.1ms
Speed: 2.2ms preprocess, 113.1ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 chair, 1 bed, 126.2ms
Speed: 6.7ms preprocess, 126.2ms inference, 1.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 bed, 1 book, 84.1ms
Speed: 1.0ms prepr

Speed: 4.5ms preprocess, 117.5ms inference, 5.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 2 beds, 3 books, 130.3ms
Speed: 7.6ms preprocess, 130.3ms inference, 4.3ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 2 beds, 2 books, 122.9ms
Speed: 9.6ms preprocess, 122.9ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 2 beds, 2 books, 141.9ms
Speed: 0.0ms preprocess, 141.9ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 2 beds, 2 books, 137.8ms
Speed: 6.3ms preprocess, 137.8ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 bed, 2 books, 144.9ms
Speed: 3.4ms preprocess, 144.9ms inference, 1.7ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 6 dogs, 1 bed, 2 books, 148.0ms
Speed: 7.6ms preprocess, 148.0ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 7 dogs, 2 beds, 2 books, 


0: 384x640 4 dogs, 1 chair, 1 bed, 4 books, 138.2ms
Speed: 4.6ms preprocess, 138.2ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 chair, 2 beds, 1 tv, 4 books, 85.2ms
Speed: 1.4ms preprocess, 85.2ms inference, 1.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 2 beds, 1 tv, 2 books, 127.2ms
Speed: 7.6ms preprocess, 127.2ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 sheep, 1 chair, 2 beds, 1 tv, 85.2ms
Speed: 3.0ms preprocess, 85.2ms inference, 5.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 2 beds, 1 tv, 94.9ms
Speed: 2.1ms preprocess, 94.9ms inference, 2.2ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 chair, 2 beds, 1 tv, 71.0ms
Speed: 2.0ms preprocess, 71.0ms inference, 1.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 chair, 2 beds, 1 tv, 81.4ms
Speed: 0.5ms preprocess, 81.4ms inf

Speed: 2.1ms preprocess, 88.3ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 chair, 1 bed, 1 tv, 103.8ms
Speed: 3.0ms preprocess, 103.8ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 chair, 1 bed, 1 tv, 140.5ms
Speed: 11.2ms preprocess, 140.5ms inference, 1.2ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 chair, 1 bed, 1 tv, 128.1ms
Speed: 5.7ms preprocess, 128.1ms inference, 5.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 2 beds, 1 tv, 100.4ms
Speed: 2.0ms preprocess, 100.4ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 2 chairs, 1 bed, 1 tv, 97.8ms
Speed: 2.0ms preprocess, 97.8ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 1 bed, 1 tv, 89.4ms
Speed: 2.0ms preprocess, 89.4ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 do


0: 384x640 3 dogs, 1 chair, 1 bed, 1 tv, 118.0ms
Speed: 6.3ms preprocess, 118.0ms inference, 1.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 chair, 1 bed, 1 tv, 132.5ms
Speed: 1.2ms preprocess, 132.5ms inference, 1.8ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 chair, 1 bed, 1 tv, 85.9ms
Speed: 2.0ms preprocess, 85.9ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 chair, 1 bed, 1 tv, 143.4ms
Speed: 6.7ms preprocess, 143.4ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 chair, 2 beds, 1 tv, 126.2ms
Speed: 2.7ms preprocess, 126.2ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 dogs, 1 chair, 1 bed, 1 tv, 108.0ms
Speed: 6.1ms preprocess, 108.0ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 1 bed, 1 tv, 88.4ms
Speed: 3.5ms preprocess, 88.4ms inference, 0.0ms postprocess per


0: 384x640 3 dogs, 1 bed, 1 tv, 111.6ms
Speed: 3.5ms preprocess, 111.6ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 bed, 1 tv, 1 teddy bear, 115.1ms
Speed: 7.1ms preprocess, 115.1ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 bed, 1 tv, 1 teddy bear, 116.4ms
Speed: 2.0ms preprocess, 116.4ms inference, 4.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 bed, 1 tv, 127.0ms
Speed: 7.6ms preprocess, 127.0ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 bed, 1 tv, 1 teddy bear, 87.2ms
Speed: 0.0ms preprocess, 87.2ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 6 dogs, 1 bed, 1 tv, 134.7ms
Speed: 8.1ms preprocess, 134.7ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 bed, 1 tv, 150.0ms
Speed: 4.4ms preprocess, 150.0ms inference, 2.0ms postprocess per image at shape (1, 


0: 384x640 2 dogs, 1 chair, 1 bed, 1 tv, 1 teddy bear, 135.4ms
Speed: 4.4ms preprocess, 135.4ms inference, 3.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 dogs, 1 chair, 1 bed, 1 tv, 1 teddy bear, 151.3ms
Speed: 5.0ms preprocess, 151.3ms inference, 0.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 dogs, 1 chair, 1 bed, 1 tv, 1 teddy bear, 157.0ms
Speed: 4.9ms preprocess, 157.0ms inference, 1.2ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 4 dogs, 2 beds, 130.2ms
Speed: 7.4ms preprocess, 130.2ms inference, 5.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 4 dogs, 3 beds, 118.3ms
Speed: 8.0ms preprocess, 118.3ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 4 dogs, 2 beds, 153.9ms
Speed: 5.0ms preprocess, 153.9ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 person, 3 dogs, 3 beds, 116.1ms
Speed: 12.2ms preprocess, 116.1ms in

Speed: 0.0ms preprocess, 132.8ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 2 beds, 120.8ms
Speed: 6.1ms preprocess, 120.8ms inference, 5.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 2 beds, 1 dining table, 143.8ms
Speed: 6.0ms preprocess, 143.8ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 2 beds, 134.9ms
Speed: 4.6ms preprocess, 134.9ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 chair, 2 beds, 1 dining table, 109.5ms
Speed: 4.7ms preprocess, 109.5ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 chair, 2 beds, 1 dining table, 150.0ms
Speed: 9.6ms preprocess, 150.0ms inference, 3.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 bed, 142.6ms
Speed: 0.0ms preprocess, 142.6ms inference, 7.0ms postprocess per image at shape (1, 3, 384, 640)

0: 38


0: 384x640 5 dogs, 1 chair, 2 beds, 1 tv, 120.4ms
Speed: 5.0ms preprocess, 120.4ms inference, 1.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 6 dogs, 1 chair, 1 bed, 1 tv, 78.9ms
Speed: 4.1ms preprocess, 78.9ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 chair, 2 beds, 1 dining table, 1 tv, 107.5ms
Speed: 7.1ms preprocess, 107.5ms inference, 4.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 1 bed, 1 tv, 145.4ms
Speed: 10.9ms preprocess, 145.4ms inference, 1.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 chair, 1 bed, 1 dining table, 1 tv, 103.3ms
Speed: 3.5ms preprocess, 103.3ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 chair, 1 bed, 1 dining table, 1 tv, 117.9ms
Speed: 0.0ms preprocess, 117.9ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 dogs, 1 chair, 1 bed, 1 dining table, 1 tv, 123.2


0: 384x640 1 cat, 5 dogs, 1 bed, 156.1ms
Speed: 2.0ms preprocess, 156.1ms inference, 1.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 chair, 1 bed, 122.9ms
Speed: 5.8ms preprocess, 122.9ms inference, 3.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 chair, 1 bed, 119.3ms
Speed: 5.6ms preprocess, 119.3ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 bed, 144.8ms
Speed: 8.1ms preprocess, 144.8ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 cat, 5 dogs, 1 bed, 139.0ms
Speed: 9.9ms preprocess, 139.0ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 bed, 120.5ms
Speed: 9.2ms preprocess, 120.5ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 bed, 115.2ms
Speed: 8.0ms preprocess, 115.2ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 bed, 110.2ms
Sp

Speed: 2.0ms preprocess, 103.1ms inference, 2.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 1 bed, 1 tv, 167.0ms
Speed: 12.8ms preprocess, 167.0ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 chair, 1 bed, 1 tv, 148.3ms
Speed: 11.6ms preprocess, 148.3ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 chair, 1 bed, 1 tv, 112.2ms
Speed: 7.4ms preprocess, 112.2ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 chair, 1 tv, 84.5ms
Speed: 2.6ms preprocess, 84.5ms inference, 1.2ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 chair, 1 tv, 112.9ms
Speed: 5.0ms preprocess, 112.9ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 chair, 1 bed, 1 tv, 139.6ms
Speed: 5.2ms preprocess, 139.6ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 chair,


0: 384x640 5 dogs, 1 chair, 1 bed, 1 tv, 105.2ms
Speed: 3.2ms preprocess, 105.2ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 chair, 1 bed, 1 tv, 87.2ms
Speed: 2.0ms preprocess, 87.2ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 chair, 2 beds, 1 tv, 124.9ms
Speed: 9.7ms preprocess, 124.9ms inference, 1.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 chair, 1 bed, 1 tv, 143.2ms
Speed: 9.6ms preprocess, 143.2ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 chair, 3 beds, 1 tv, 129.8ms
Speed: 5.6ms preprocess, 129.8ms inference, 2.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 6 dogs, 1 chair, 3 beds, 1 tv, 77.9ms
Speed: 2.2ms preprocess, 77.9ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 5 dogs, 1 chair, 3 beds, 1 tv, 126.9ms
Speed: 5.5ms preprocess, 126.9ms inference, 0.0ms postprocess 


0: 384x640 1 cat, 1 dog, 1 chair, 1 bed, 1 tv, 136.0ms
Speed: 0.1ms preprocess, 136.0ms inference, 5.6ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 dogs, 1 cup, 1 chair, 1 bed, 1 tv, 146.3ms
Speed: 5.2ms preprocess, 146.3ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 dogs, 1 cup, 1 chair, 1 bed, 1 tv, 136.6ms
Speed: 7.7ms preprocess, 136.6ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 dogs, 1 chair, 1 bed, 1 tv, 134.0ms
Speed: 5.6ms preprocess, 134.0ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 cat, 2 dogs, 1 chair, 1 bed, 1 tv, 144.1ms
Speed: 9.9ms preprocess, 144.1ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 2 cats, 1 dog, 1 chair, 2 beds, 1 tv, 150.0ms
Speed: 5.0ms preprocess, 150.0ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 1 dog, 1 chair, 2 beds, 1 tv, 157.4ms
Speed: 4.2ms preprocess, 15


0: 384x640 3 dogs, 1 bed, 149.1ms
Speed: 9.0ms preprocess, 149.1ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 bed, 182.5ms
Speed: 0.0ms preprocess, 182.5ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 2 beds, 158.0ms
Speed: 0.0ms preprocess, 158.0ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 bed, 167.7ms
Speed: 0.0ms preprocess, 167.7ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 bed, 157.7ms
Speed: 5.0ms preprocess, 157.7ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 bed, 162.8ms
Speed: 1.7ms preprocess, 162.8ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 bed, 145.3ms
Speed: 13.3ms preprocess, 145.3ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 158.0ms
Speed: 15.7ms preprocess, 158.0ms infer


0: 384x640 3 dogs, 1 bed, 103.4ms
Speed: 0.0ms preprocess, 103.4ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 2 beds, 159.0ms
Speed: 14.0ms preprocess, 159.0ms inference, 2.6ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 bed, 165.8ms
Speed: 0.0ms preprocess, 165.8ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 bed, 157.6ms
Speed: 12.5ms preprocess, 157.6ms inference, 1.1ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 2 beds, 123.2ms
Speed: 5.0ms preprocess, 123.2ms inference, 1.5ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 2 beds, 139.3ms
Speed: 5.7ms preprocess, 139.3ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 4 dogs, 1 bed, 144.1ms
Speed: 15.1ms preprocess, 144.1ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)

0: 384x640 3 dogs, 1 bed, 136.4ms
Speed: 0.0ms preprocess, 136

This `while` block processes the video frame by frame: it runs YOLO on each frame, draws the detections on it, writes the annotated frame to the output video, and shows it in an OpenCV window as it is processed.

---

### **1. `while cap.isOpened():`**
- **Purpose**: Keeps the loop running while the capture object (`cap`) is open.
- **`cap.isOpened()`**: Returns `True` if the video file or camera was opened successfully. It does not turn `False` when the video runs out of frames; the `ret` check in step 3 is what ends the loop at the end of the file.

---

### **2. `ret, frame = cap.read()`**
- **Purpose**: Reads the next frame from the video stream.
- **`cap.read()`**:
  - Returns two values:
    - `ret`: A boolean indicating whether a frame was read (`True` on success, `False` when there are no more frames or an error occurs).
    - `frame`: The video frame as a NumPy array of shape `(height, width, 3)` in BGR channel order, or `None` when `ret` is `False`.

---

### **3. `if not ret: break`**
- **Purpose**: Exits the loop if no frame is read (e.g., end of the video).
- **`not ret`**: Checks if `ret` is `False`, signaling that the video has no more frames or an error occurred.
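
The read-and-check pattern from steps 1 to 3 can be wrapped in a small generator. This is only a sketch: `iter_frames` and `FakeCapture` are hypothetical names that do not appear in the notebook, and the stand-in class exists purely so the example runs without a video file. With a real video, an opened `cv2.VideoCapture` would be passed instead.

```python
from typing import Any, Iterator


def iter_frames(cap: Any) -> Iterator[Any]:
    """Yield frames from any object with an OpenCV-style read() method."""
    while True:
        ret, frame = cap.read()
        if not ret:  # end of stream or read error: stop iterating
            break
        yield frame


class FakeCapture:
    """Hypothetical stand-in for cv2.VideoCapture (assumption, not OpenCV API)."""

    def __init__(self, frames):
        self._frames = list(frames)

    def read(self):
        # Mimic cap.read(): (True, frame) while frames remain, then (False, None).
        if self._frames:
            return True, self._frames.pop(0)
        return False, None


frames = list(iter_frames(FakeCapture(["frame0", "frame1", "frame2"])))
print(len(frames))
```

Putting the `ret` check inside the generator keeps the main loop down to `for frame in iter_frames(cap):`.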

---

### **4. `results = model(frame)`**
- **Purpose**: Passes the current video frame to the YOLO model for object detection.
- **`model(frame)`**:
  - Runs inference on the frame and returns a list of result objects, one per input image.
  - Each result contains:
    - Detected objects' bounding boxes.
    - Class labels.
    - Confidence scores (a segmentation checkpoint would also return masks; the `yolov8n.pt` detection weights loaded here do not).
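
As a rough illustration of how these fields can be read, the sketch below counts detections per class, which is the same information the log lines print (for example `3 dogs, 1 bed`). It relies on the `boxes.cls` and `names` attributes of an Ultralytics result. `count_classes` and `fake_result` are hypothetical, and the stand-in object with made-up values is there only so the example runs without loading a model.

```python
from collections import Counter
from types import SimpleNamespace


def count_classes(result):
    """Count detections per class name for one Ultralytics-style result."""
    # result.boxes.cls holds one class index per detection;
    # result.names maps class index -> class name.
    return Counter(result.names[int(c)] for c in result.boxes.cls)


# Hypothetical stand-in for results[0] with illustrative values;
# with a real model, pass results[0] instead.
fake_result = SimpleNamespace(
    boxes=SimpleNamespace(
        cls=[16.0, 16.0, 16.0, 59.0],
        conf=[0.91, 0.84, 0.52, 0.77],
    ),
    names={16: "dog", 59: "bed"},
)
counts = count_classes(fake_result)
print(dict(counts))
```

Inside the video loop this would be called as `count_classes(results[0])`.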

---

### **5. `annotated_frame = results[0].plot()`**
- **Purpose**: Visualizes the YOLO detection results on the frame.
- **`results[0]`**: Accesses the first (and here only) result. YOLO supports batch processing, but this loop passes one frame per call.
- **`.plot()`**: Returns the annotated frame as a NumPy array in BGR order, with bounding boxes, class names, and confidence scores drawn on it.
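
Because the annotated frame uses OpenCV's BGR channel order, it needs a channel swap before it is shown with an RGB viewer such as Matplotlib (imported at the top of the notebook but not used in the loop). The sketch below shows the swap on a synthetic image. `bgr_to_rgb` is a hypothetical helper name.

```python
import numpy as np


def bgr_to_rgb(image: np.ndarray) -> np.ndarray:
    """Reverse the channel axis so a BGR image displays correctly in RGB viewers."""
    return image[..., ::-1]


# Synthetic 2x2 pure-blue image in BGR order (blue is channel 0).
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255
rgb = bgr_to_rgb(bgr)
print(rgb[0, 0])  # blue now sits in the last channel
```

In the notebook this would be used as `plt.imshow(bgr_to_rgb(annotated_frame))`.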

---

### **6. `out.write(annotated_frame)`**
- **Purpose**: Writes the annotated frame to the output video file.
- **`out.write()`**: Appends the frame to the video being saved.
- **`annotated_frame`**: The frame with YOLO-detected objects drawn on it.
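
One practical caveat, stated as an assumption about typical OpenCV behaviour rather than a guarantee: `cv2.VideoWriter` expects every frame to match the `(width, height)` passed to its constructor, and a mismatched frame may be skipped without an error. A small check before writing can make that visible. `fits_writer` is a hypothetical helper.

```python
import numpy as np


def fits_writer(frame: np.ndarray, width: int, height: int) -> bool:
    """Return True if the frame matches the (width, height) given to the writer."""
    # NumPy image shape is (rows, cols, channels), i.e. (height, width, 3).
    return frame.shape[0] == height and frame.shape[1] == width


frame = np.zeros((384, 640, 3), dtype=np.uint8)  # synthetic frame, 640 wide, 384 high
print(fits_writer(frame, width=640, height=384))
print(fits_writer(frame, width=384, height=640))
```

In the loop, such a check could guard the write: `if fits_writer(annotated_frame, width, height): out.write(annotated_frame)`.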

---

### **7. `cv2.imshow('YOLO Detection', annotated_frame)`**
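- **Purpose**: Displays the annotated frame in an OpenCV window titled `'YOLO Detection'` while the video is being processed.
- **`cv2.imshow()`**: Creates the window on the first call and updates it with each new frame. The image is only drawn once `cv2.waitKey()` runs, which is why step 8 follows directly.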

---

### **8. `if cv2.waitKey(1) & 0xFF == ord('q'):`**
- **Purpose**: Allows the user to exit the video display by pressing the `q` key.
- **`cv2.waitKey(1)`**:
  - Waits up to about 1 millisecond for a key press and lets the display window refresh.
  - Returns the code of the pressed key, or `-1` if no key was pressed.
- **`& 0xFF`**: Keeps only the lowest 8 bits of the returned code, because some platforms set additional higher bits.
- **`ord('q')`**: Converts the character `'q'` to its integer code (113) for comparison.
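
The effect of the mask can be checked in plain Python, without opening a window. In the sketch below, `is_quit_key` is a hypothetical helper, and the value with extra high bits is an invented example of a platform-specific key code.

```python
def is_quit_key(raw_code: int, quit_char: str = "q") -> bool:
    """Return True if a cv2.waitKey()-style code corresponds to quit_char."""
    return (raw_code & 0xFF) == ord(quit_char)


print(is_quit_key(ord("q")))             # plain key code
print(is_quit_key(0x100000 | ord("q")))  # same key with extra high bits set
print(is_quit_key(-1))                   # -1 means no key was pressed
```

Without the mask, the second case would fail a direct comparison with `ord('q')` even though the same key was pressed.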

---

### **9. `cap.release()`**
- **Purpose**: Releases the video capture object (`cap`) and frees associated resources.

---

### **10. `out.release()`**
- **Purpose**: Closes the video writer (`out`) and finalizes the output file; if it is skipped, the saved video may be incomplete or unplayable.

---

### **11. `cv2.destroyAllWindows()`**
- **Purpose**: Closes all OpenCV display windows to clean up after the program ends.
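
In the notebook, the cleanup calls in steps 9 to 11 only run if the loop finishes normally. If the model or the writer raises partway through, they are skipped. One possible way to guarantee the release calls is a small context manager, sketched below. `released` and `FakeResource` are hypothetical names, and the stand-in class replaces `cv2.VideoCapture` and `cv2.VideoWriter` so the example runs on its own.

```python
from contextlib import contextmanager


@contextmanager
def released(*resources):
    """Call .release() on every resource when the block exits, even on error."""
    try:
        yield resources
    finally:
        for resource in resources:
            resource.release()


class FakeResource:
    """Hypothetical stand-in for cv2.VideoCapture / cv2.VideoWriter."""

    def __init__(self):
        self.is_released = False

    def release(self):
        self.is_released = True


cap, out = FakeResource(), FakeResource()
try:
    with released(cap, out):
        raise RuntimeError("simulated failure inside the frame loop")
except RuntimeError:
    pass
print(cap.is_released, out.is_released)
```

With real objects, the frame loop would go inside the `with released(cap, out):` block, followed by `cv2.destroyAllWindows()`.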

---

### **Summary of the Workflow**
1. Reads video frames one by one.
2. Runs YOLO detection on each frame.
3. Annotates the frame with results.
4. Writes the annotated frame to an output video file.
5. Displays the annotated frame in real time.
6. Exits on `q` key press or when the video ends.

This loop is the core of the notebook's video pipeline: it turns the input video into an annotated output file and previews each frame as it is processed.
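
Whether the preview keeps up with the source frame rate depends on the per-frame cost, which can be estimated from the `Speed:` lines in the cell output above. The sketch below averages preprocess, inference, and postprocess times. `mean_frame_ms` is a hypothetical helper, and the two sample lines are copied from that output. Truncated `Speed:` lines do not match the pattern and are ignored.

```python
import re

SPEED_RE = re.compile(
    r"Speed: ([\d.]+)ms preprocess, ([\d.]+)ms inference, ([\d.]+)ms postprocess"
)


def mean_frame_ms(log_text: str) -> float:
    """Average preprocess + inference + postprocess time per frame, in ms."""
    totals = [sum(map(float, m.groups())) for m in SPEED_RE.finditer(log_text)]
    if not totals:
        raise ValueError("no complete 'Speed:' lines found")
    return sum(totals) / len(totals)


# Two complete lines copied from the cell output.
sample_log = """
Speed: 11.8ms preprocess, 97.8ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)
Speed: 7.2ms preprocess, 119.3ms inference, 0.0ms postprocess per image at shape (1, 3, 384, 640)
"""
avg_ms = mean_frame_ms(sample_log)
print(f"{avg_ms:.1f} ms per frame, about {1000 / avg_ms:.1f} frames per second")
```

Comparing the resulting rate with the `fps` value read from the input video indicates whether the on-screen preview will run slower than the original. The saved file is unaffected, since it is written at the source frame rate.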