## Object Detection  

Object detection is a task that involves identifying the location and class of objects in an image or video stream.

The output of an object detector is a set of bounding boxes that enclose the objects in the image, along with a class label and a confidence score for each box. Object detection is a good choice when you need to identify objects of interest in a scene and a rectangular box around each one is precise enough, so you don't need the object's exact outline or shape.
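As a minimal sketch of what that output looks like in practice, the helper below turns parallel lists of boxes, class ids, and confidence scores into readable records. The function name `summarize_detections`, the 0.25 threshold, and the sample values are illustrative assumptions. With Ultralytics, the inputs would typically come from `results[0].boxes.xyxy.tolist()`, `results[0].boxes.cls.tolist()`, `results[0].boxes.conf.tolist()`, and `results[0].names`.

```python
def summarize_detections(boxes_xyxy, class_ids, confidences, names, min_conf=0.25):
    """Return one dict per detection whose confidence is at least min_conf.

    boxes_xyxy  -- list of [x1, y1, x2, y2] pixel coordinates
    class_ids   -- list of numeric class ids (floats are accepted)
    confidences -- list of scores in [0, 1]
    names       -- mapping from class id to class label
    """
    detections = []
    for box, cls_id, conf in zip(boxes_xyxy, class_ids, confidences):
        if conf < min_conf:
            continue  # skip low-confidence boxes
        x1, y1, x2, y2 = box
        detections.append({
            "label": names[int(cls_id)],
            "confidence": round(conf, 2),
            "box": (x1, y1, x2, y2),
            "width": x2 - x1,
            "height": y2 - y1,
        })
    return detections


if __name__ == "__main__":
    # Hand-written sample values standing in for a real model output.
    sample = summarize_detections(
        boxes_xyxy=[[10, 20, 110, 220], [300, 40, 340, 90]],
        class_ids=[14.0, 0.0],
        confidences=[0.91, 0.12],
        names={0: "person", 14: "bird"},
    )
    for det in sample:
        print(det)
```

The second sample box is dropped because its score falls below the threshold, which is the usual way to trade recall for fewer false positives.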



In [28]:
from ultralytics import YOLO

# Load a model
model = YOLO("yolo11n.pt") 

# Perform object detection on an image
results = model("test2.jpg", save=True)
results[0].show()


image 1/1 C:\Users\nadha\vision\yolo11\tutorial1\test2.jpg: 320x640 1 bird, 154.8ms
Speed: 4.0ms preprocess, 154.8ms inference, 1.0ms postprocess per image at shape (1, 3, 320, 640)
Results saved to C:\Users\nadha\runs\detect\predict4


![Image](detect.jpg)

In [10]:
from ultralytics import YOLO

# Load a model
model = YOLO("yolo11n.pt") 

# Perform object detection on a video file
results = model("vid.mp4", save=True)



errors for large sources or long-running streams and videos. See https://docs.ultralytics.com/modes/predict/ for help.

Example:
    results = model(source=..., stream=True)  # generator of Results objects
    for r in results:
        boxes = r.boxes  # Boxes object for bbox outputs
        masks = r.masks  # Masks object for segment masks outputs
        probs = r.probs  # Class probabilities for classification outputs

video 1/1 (frame 1/1164) C:\Users\nadha\vision\yolo11\tutorial1\vid.mp4: 384x640 10 persons, 3 teddy bears, 188.3ms
video 1/1 (frame 2/1164) C:\Users\nadha\vision\yolo11\tutorial1\vid.mp4: 384x640 11 persons, 1 cow, 1 teddy bear, 248.1ms
video 1/1 (frame 3/1164) C:\Users\nadha\vision\yolo11\tutorial1\vid.mp4: 384x640 9 persons, 2 teddy bears, 204.0ms
video 1/1 (frame 4/1164) C:\Users\nadha\vision\yolo11\tutorial1\vid.mp4: 384x640 9 persons, 2 teddy bears, 157.2ms
video 1/1 (frame 5/1164) C:\Users\nadha\vision\yolo11\tutorial1\vid.mp4: 384x640 8 persons, 4 teddy bear
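
The warning printed in this output recommends `stream=True` for videos, which makes the model return a generator so each frame's result can be processed and discarded rather than accumulated. A minimal sketch, assuming each result exposes `boxes.cls` and `names` as the Ultralytics `Results` object does; `count_labels` and `run_on_video` are hypothetical helper names:

```python
from collections import Counter


def count_labels(results):
    """Tally class labels across an iterable of per-frame results.

    Results are consumed one at a time, so a generator is never
    materialized as a full list.
    """
    totals = Counter()
    for r in results:
        for cls_id in r.boxes.cls.tolist():
            totals[r.names[int(cls_id)]] += 1
    return totals


def run_on_video(path="vid.mp4", weights="yolo11n.pt"):
    """Run detection on a video in streaming mode and tally the labels."""
    from ultralytics import YOLO  # imported here so count_labels works without it

    model = YOLO(weights)
    # stream=True yields one Results object per frame instead of a list
    return count_labels(model(path, stream=True))
```

Calling `run_on_video("vid.mp4")` would return a `Counter` of labels summed over all frames, while holding only one frame's result at a time.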

#### Real-Time Object Detection

In [12]:
from ultralytics import YOLO

# Load a model
model = YOLO("yolo11n.pt") 

# Perform object detection on the default webcam (source 0)
results = model(0, save=True)
#results[0].show()


1/1: 0... Success  (inf frames of shape 640x480 at 30.00 FPS)


0: 480x640 (no detections), 164.6ms
0: 480x640 (no detections), 146.2ms
0: 480x640 (no detections), 187.2ms
0: 480x640 (no detections), 156.0ms
0: 480x640 (no detections), 158.5ms
0: 480x640 (no detections), 155.1ms
0: 480x640 (no detections), 150.4ms
0: 480x640 (no detections), 174.5ms
0: 480x640 (no detections), 161.4ms
0: 480x640 (no detections), 147.7ms
0: 480x640 (no detections), 224.2ms
0: 480x640 (no detections), 176.0ms
0: 480x640 (no detections), 159.5ms
0: 480x640 (no detections), 151.6ms
0: 48

KeyboardInterrupt: 
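
Whether this counts as real time depends on throughput. As a rough sketch, the per-frame inference times printed in the log can be averaged and converted to frames per second. `mean_fps` is a hypothetical helper, and the estimate ignores preprocessing, postprocessing, and display time.

```python
def mean_fps(latencies_ms):
    """Convert per-frame inference times (milliseconds) to average frames per second."""
    if not latencies_ms:
        raise ValueError("need at least one latency value")
    mean_ms = sum(latencies_ms) / len(latencies_ms)
    return 1000.0 / mean_ms


# Inference times copied from the complete webcam log lines in this section
latencies = [164.6, 146.2, 187.2, 156.0, 158.5, 155.1, 150.4,
             174.5, 161.4, 147.7, 224.2, 176.0, 159.5, 151.6]
print(f"approx. {mean_fps(latencies):.1f} FPS")
```

At these latencies the model processes roughly 6 frames per second, well below the 30 FPS the camera reports, so frames are handled more slowly than they arrive.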

## Image Segmentation 

Instance segmentation goes a step further than object detection and involves identifying individual objects in an image and segmenting them from the rest of the image.

The output of an instance segmentation model is a set of masks or contours that outline each object in the image, along with class labels and confidence scores for each object. Instance segmentation is useful when you need to know not only where objects are in an image, but also what their exact shape is.
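
To illustrate what knowing the exact shape adds, the sketch below computes the area enclosed by a mask contour with the shoelace formula, a quantity a bounding box can only overestimate. It assumes the contour is a sequence of (x, y) pixel vertices; in Ultralytics, `results[0].masks.xy` provides polygons in that form. `polygon_area` is a hypothetical helper name.

```python
def polygon_area(points):
    """Area of a simple polygon given as [(x, y), ...] via the shoelace formula."""
    if len(points) < 3:
        return 0.0
    twice_area = 0.0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]  # wrap around to the first vertex
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# A right triangle inside a 4x3 bounding box: the box area is 12, the shape's is 6.
triangle = [(0, 0), (4, 0), (0, 3)]
print(polygon_area(triangle))
```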

In [26]:
from ultralytics import YOLO

# Load a segmentation model
model = YOLO("yolo11n-seg.pt")

# Perform instance segmentation on an image
results = model("test1.jpg", save=True)
results[0].show()


image 1/1 C:\Users\nadha\vision\yolo11\tutorial1\test1.jpg: 384x640 10 persons, 1 sports ball, 1 baseball bat, 182.5ms
Speed: 4.0ms preprocess, 182.5ms inference, 14.1ms postprocess per image at shape (1, 3, 384, 640)
Results saved to C:\Users\nadha\runs\segment\predict2


![Image](seg.jpg)

## Pose Estimation

Pose estimation is a task that involves identifying the location of specific points in an image, usually referred to as keypoints. The keypoints can represent various parts of the object such as joints, landmarks, or other distinctive features. The locations of the keypoints are usually represented as a set of 2D [x, y] or 3D [x, y, visible] coordinates.

The output of a pose estimation model is a set of points that represent the keypoints on an object in the image, usually along with the confidence scores for each point. Pose estimation is a good choice when you need to identify specific parts of an object in a scene, and their location in relation to each other.
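
As a small sketch of working with keypoints in relation to each other, the helpers below keep only confidently detected points and measure the distance between two of them. The function names, the 0.5 threshold, and the sample coordinates are illustrative assumptions; with Ultralytics, per-person coordinates and scores are available as `results[0].keypoints.xy` and `results[0].keypoints.conf`.

```python
import math


def confident_keypoints(points, scores, min_conf=0.5):
    """Map keypoint index -> (x, y) for keypoints scoring at least min_conf."""
    return {
        i: (x, y)
        for i, ((x, y), s) in enumerate(zip(points, scores))
        if s >= min_conf
    }


def keypoint_distance(kept, a, b):
    """Euclidean pixel distance between keypoints a and b, or None if either is missing."""
    if a not in kept or b not in kept:
        return None
    (xa, ya), (xb, yb) = kept[a], kept[b]
    return math.hypot(xb - xa, yb - ya)


# Made-up keypoints for one person: indices 0, 1, 2 with confidence scores
points = [(100.0, 50.0), (130.0, 90.0), (0.0, 0.0)]
scores = [0.95, 0.88, 0.10]
kept = confident_keypoints(points, scores)
print(keypoint_distance(kept, 0, 1))  # distance between keypoints 0 and 1
print(keypoint_distance(kept, 0, 2))  # keypoint 2 was filtered out
```

Filtering first matters because low-confidence keypoints often carry placeholder coordinates, which would distort any distance or angle computed from them.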

In [16]:
from ultralytics import YOLO

# Load a pose estimation model
model = YOLO("yolo11n-pose.pt")

# Perform pose estimation on an image
results = model("test.jpeg", save=True)
results[0].show()


image 1/1 C:\Users\nadha\vision\yolo11\tutorial1\test.jpeg: 448x640 6 persons, 238.7ms
Speed: 6.9ms preprocess, 238.7ms inference, 13.3ms postprocess per image at shape (1, 3, 448, 640)
Results saved to C:\Users\nadha\runs\pose\predict


![Image](pose.jpg)


## Image Classification 

Image classification is the simplest of these tasks and involves classifying an entire image into one of a set of predefined classes.

The output of an image classifier is a single class label and a confidence score. Image classification is useful when you need to know only what class an image belongs to and don't need to know where objects of that class are located or what their exact shape is.
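
Under the hood the classifier scores every class, and the reported label is simply the highest-scoring one. A minimal sketch of picking the top predictions, using the scores from the `cat.jpg` log line in this section as sample data; `top_k` is a hypothetical helper, and Ultralytics exposes the equivalent values as `results[0].probs.top1` and `results[0].probs.top1conf`.

```python
def top_k(probabilities, k=1):
    """Return the k highest-scoring (label, probability) pairs, best first."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Scores taken from the log line printed for cat.jpg in this section
scores = {
    "tabby": 0.25,
    "tiger_cat": 0.58,
    "lynx": 0.02,
    "Egyptian_cat": 0.14,
    "doormat": 0.00,
}
label, confidence = top_k(scores)[0]
print(label, confidence)
print(top_k(scores, k=3))
```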

In [17]:
from ultralytics import YOLO

# Load a classification model
model = YOLO("yolo11n-cls.pt")

# Classify an image
results = model("cat.jpg", save=True)
results[0].show()


image 1/1 C:\Users\nadha\vision\yolo11\tutorial1\cat.jpg: 224x224 tiger_cat 0.58, tabby 0.25, Egyptian_cat 0.14, lynx 0.02, doormat 0.00, 55.8ms
Speed: 40.0ms preprocess, 55.8ms inference, 0.0ms postprocess per image at shape (1, 3, 224, 224)
Results saved to C:\Users\nadha\runs\classify\predict

![Image](classify.jpg)


## Oriented Bounding Boxes (OBB)

Oriented object detection goes a step further than object detection and introduces an extra angle to locate objects more accurately in an image.

The output of an oriented object detector is a set of **rotated bounding boxes** that tightly enclose the objects in the image, along with a class label and a confidence score for each box.

Classes in the pretrained model (trained on the DOTAv1 dataset):

0. Plane  
1. Ship  
2. Storage Tank  
3. Baseball Diamond  
4. Tennis Court  
5. Basketball Court  
6. Ground Track Field  
7. Harbor  
8. Bridge  
9. Large Vehicle  
10. Small Vehicle  
11. Helicopter  
12. Roundabout  
13. Soccer Ball Field  
14. Swimming Pool 

Oriented object detection provides more precise localization by including the angle of rotation, which is especially useful for detecting objects in aerial images or when objects are not aligned horizontally.
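
To make the role of the angle concrete, the sketch below converts a rotated box given as center, size, and rotation into its four corner points. The parameterization (cx, cy, w, h, angle in radians) is assumed here to match what Ultralytics exposes as `results[0].obb.xywhr`; `obb_corners` is a hypothetical helper, and the corner order is this sketch's own choice.

```python
import math


def obb_corners(cx, cy, w, h, angle):
    """Corners of a w x h box centered at (cx, cy), rotated by angle radians."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    corners = []
    # Offsets of the four corners from the center before rotation
    for dx, dy in [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]:
        # Standard 2D rotation of the offset, then translation to the center
        x = cx + dx * cos_a - dy * sin_a
        y = cy + dx * sin_a + dy * cos_a
        corners.append((x, y))
    return corners


# With angle 0 the result is an ordinary axis-aligned box
print(obb_corners(50, 50, 40, 20, 0.0))
# Rotating by 90 degrees swaps the box's horizontal and vertical extents
print(obb_corners(50, 50, 40, 20, math.pi / 2))
```

For ready-made corner points, Ultralytics also provides `results[0].obb.xyxyxyxy`, so a conversion like this is mainly useful for understanding or for custom drawing code.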


In [18]:
from ultralytics import YOLO

# Load an oriented bounding box (OBB) model
model = YOLO("yolo11n-obb.pt")

# Perform oriented object detection on a video
results = model("vid1.mp4", save=True)
#results[0].save()



video 1/1 (frame 1/373) C:\Users\nadha\vision\yolo11\tutorial1\vid1.mp4: 576x1024 362.1ms
video 1/1 (frame 2/373) C:\Users\nadha\vision\yolo11\tutorial1\vid1.mp4: 576x1024 317.9ms
video 1/1 (frame 3/373) C:\Users\nadha\vision\yolo11\tutorial1\vid1.mp4: 576x1024 319.0ms
video 1/1 (frame 4/373) C:\Users\nadha\vision\yolo11\tutorial1\vid1.mp4: 576x1024 314.6ms
video 1/1 (frame 5/373) C:\Users\nadha\vision\yolo11\tutorial1\vid1.mp4: 576x1024 327.9ms
video 1/1 (frame 6/373) C:\Users\nadha\vision\yolo11\tutorial1\vid1.mp4: 576x1024 327.8ms
video 1/1 (frame 7/373) C:\Users