

**This Jupyter Notebook (`live_object_detection.ipynb`) demonstrates live object detection using a pre-trained YOLO (You Only Look Once) model.**

- It captures video from your webcam, runs the YOLO model on each frame, and displays the feed with bounding boxes and class labels drawn over each detected object.
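
The capture, detect, and draw stages described above can be sketched without a camera. This is a minimal illustration, not the notebook's implementation: `Detection`, `fake_frames`, `fake_detector`, and `run_pipeline` are hypothetical stand-ins for the webcam, the YOLO model, and the OpenCV drawing calls.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List, Tuple

Frame = List[List[int]]  # hypothetical stand-in for an image array


@dataclass
class Detection:
    """One detected object: pixel box corners, class label, confidence."""
    box: Tuple[int, int, int, int]  # (x1, y1, x2, y2)
    label: str
    confidence: float


def fake_frames(count: int) -> Iterator[Frame]:
    """Stand-in for the webcam: yields `count` blank 2x2 frames."""
    for _ in range(count):
        yield [[0, 0], [0, 0]]


def fake_detector(frame: Frame) -> List[Detection]:
    """Stand-in for the model: reports one fixed 'person' per frame."""
    return [Detection(box=(0, 0, 1, 1), label="person", confidence=0.85)]


def run_pipeline(
    frames: Iterable[Frame],
    detect: Callable[[Frame], List[Detection]],
) -> List[str]:
    """Run detection on every frame and collect one caption per object."""
    captions = []
    for index, frame in enumerate(frames):
        for det in detect(frame):
            captions.append(
                f"frame {index}: {det.label} {det.confidence:.2f} at {det.box}"
            )
    return captions


captions = run_pipeline(fake_frames(2), fake_detector)
print("\n".join(captions))
```

In the notebook itself, the frame source is `cv2.VideoCapture`, the detector is the `YOLO` model, and the captions are drawn onto the frame rather than collected in a list.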

-----------------

In [1]:
%pip install ultralytics

Collecting ultralytics
  Downloading ultralytics-8.3.76-py3-none-any.whl.metadata (35 kB)
Collecting opencv-python>=4.6.0 (from ultralytics)
  Using cached opencv_python-4.11.0.86-cp37-abi3-win_amd64.whl.metadata (20 kB)
Collecting torch>=1.8.0 (from ultralytics)
  Using cached torch-2.6.0-cp312-cp312-win_amd64.whl.metadata (28 kB)
Collecting torchvision>=0.9.0 (from ultralytics)
  Using cached torchvision-0.21.0-cp312-cp312-win_amd64.whl.metadata (6.3 kB)
Collecting ultralytics-thop>=2.0.0 (from ultralytics)
  Using cached ultralytics_thop-2.0.14-py3-none-any.whl.metadata (9.4 kB)
Collecting sympy==1.13.1 (from torch>=1.8.0->ultralytics)
  Using cached sympy-1.13.1-py3-none-any.whl.metadata (12 kB)
Downloading ultralytics-8.3.76-py3-none-any.whl (915 kB)
   ---------------------------------------- 0.0/915.2 kB ? eta -:--:--
   --------------------------------------- 915.2/915.2 kB 13.9 MB/s eta 0:00:00
Using cached opencv_python-4.11.0.86-cp37-abi3-win_amd64.whl (39.5 MB)
Using cached

In [9]:
from ultralytics import YOLO
import cv2
import math
# start webcam (device 0) and request a 640x480 capture size
cap = cv2.VideoCapture(0)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

# model: YOLOv8 nano weights, downloaded to this path on first run if the file is missing
model = YOLO("yolo-Weights/yolov8n.pt")

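# object classes: the 80 COCO labels in model index order, using legacy spellings
# such as "motorbike" and "aeroplane" (model.names holds the model's own labels)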
classNames = ["person", "bicycle", "car", "motorbike", "aeroplane", "bus", "train", "truck", "boat",
              "traffic light", "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat",
              "dog", "horse", "sheep", "cow", "elephant", "bear", "zebra", "giraffe", "backpack", "umbrella",
              "handbag", "tie", "suitcase", "frisbee", "skis", "snowboard", "sports ball", "kite", "baseball bat",
              "baseball glove", "skateboard", "surfboard", "tennis racket", "bottle", "wine glass", "cup",
              "fork", "knife", "spoon", "bowl", "banana", "apple", "sandwich", "orange", "broccoli",
              "carrot", "hot dog", "pizza", "donut", "cake", "chair", "sofa", "pottedplant", "bed",
              "diningtable", "toilet", "tvmonitor", "laptop", "mouse", "remote", "keyboard", "cell phone",
              "microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors",
              "teddy bear", "hair drier", "toothbrush"
              ]


while True:
    success, img = cap.read()
    if not success:  # camera unavailable or no frame returned
        break
    results = model(img, stream=True)

    # coordinates
    for r in results:
        boxes = r.boxes

        for box in boxes:
            # bounding box
            x1, y1, x2, y2 = box.xyxy[0]
            x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2) # convert to int values

            # put box in cam
            cv2.rectangle(img, (x1, y1), (x2, y2), (255, 0, 255), 3)

            # confidence, rounded up to two decimal places
            confidence = math.ceil(float(box.conf[0]) * 100) / 100
            print("Confidence --->", confidence)

            # class name
            cls = int(box.cls[0])
            print("Class name -->", classNames[cls])

            # object details
            org = (x1, y1)  # bottom-left corner of the text, at the box's top-left corner
            font = cv2.FONT_HERSHEY_SIMPLEX
            fontScale = 1
            color = (255, 0, 0)
            thickness = 2

            cv2.putText(img, classNames[cls], org, font, fontScale, color, thickness)

    # show the annotated frame; press 'q' in the video window to quit
    cv2.imshow('Webcam', img)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

Downloading https://github.com/ultralytics/assets/releases/download/v8.3.0/yolov8n.pt to 'yolo-Weights\yolov8n.pt'...


100%|█████████████████████████████████████████████████████████████████████████████| 6.25M/6.25M [00:01<00:00, 6.18MB/s]



0: 480x640 1 person, 198.1ms
Confidence ---> 0.85
Class name --> person
Speed: 17.2ms preprocess, 198.1ms inference, 3.4ms postprocess per image at shape (1, 3, 480, 640)

0: 480x640 1 person, 143.6ms
Confidence ---> 0.88
Class name --> person
Speed: 7.4ms preprocess, 143.6ms inference, 3.3ms postprocess per image at shape (1, 3, 480, 640)

0: 480x640 1 person, 147.9ms
Confidence ---> 0.83
Class name --> person
Speed: 3.3ms preprocess, 147.9ms inference, 2.2ms postprocess per image at shape (1, 3, 480, 640)

0: 480x640 1 person, 207.3ms
Confidence ---> 0.87
Class name --> person
Speed: 37.1ms preprocess, 207.3ms inference, 2.8ms postprocess per image at shape (1, 3, 480, 640)

0: 480x640 1 person, 95.4ms
Confidence ---> 0.87
Class name --> person
Speed: 4.3ms preprocess, 95.4ms inference, 1.0ms postprocess per image at shape (1, 3, 480, 640)

0: 480x640 1 person, 76.7ms
Confidence ---> 0.85
Class name --> person
Speed: 1.9ms preprocess, 76.7ms inference, 0.9ms postprocess per image at


KeyboardInterrupt
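
The inference times in the log above give a rough throughput estimate. The sketch below averages the six logged values and converts the mean to frames per second. It counts inference only and ignores capture, preprocessing, postprocessing, and display time, so the real frame rate would be lower. The helper name `mean_fps` is hypothetical.

```python
from statistics import mean

# Inference times (ms) copied from the six log entries above.
inference_ms = [198.1, 143.6, 147.9, 207.3, 95.4, 76.7]


def mean_fps(times_ms):
    """Return (mean latency in ms, frames per second implied by that mean)."""
    avg_ms = mean(times_ms)
    return avg_ms, 1000.0 / avg_ms


avg_ms, fps = mean_fps(inference_ms)
print(f"mean inference: {avg_ms:.1f} ms, about {fps:.1f} FPS")
```

With only six samples and a spread from 76.7 ms to 207.3 ms, the figure is an order-of-magnitude guide rather than a benchmark.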

