# Lab 07

In this lab, I explored YOLO (You Only Look Once), a widely used family of object detection models designed for real-time visual recognition. The primary goal was to understand how YOLO detects objects by predicting bounding boxes and class probabilities directly from an image. Unlike two-stage detectors, which first propose candidate regions and then classify them, YOLO treats detection as a single regression problem solved in one forward pass, which makes inference fast while keeping accuracy competitive. The lab gave me hands-on experience with fine-tuning, running, and exporting a YOLO11 model and with interpreting YOLO's outputs, and it showed how a neural network can turn raw pixels into labeled objects.

**Setup**

In [1]:
!pip install ultralytics

Collecting ultralytics
  Downloading ultralytics-8.3.226-py3-none-any.whl.metadata (37 kB)
Collecting ultralytics-thop>=2.0.18 (from ultralytics)
  Downloading ultralytics_thop-2.0.18-py3-none-any.whl.metadata (14 kB)
Downloading ultralytics-8.3.226-py3-none-any.whl (1.1 MB)
   1.1/1.1 MB 32.5 MB/s eta 0:00:00
Downloading ultralytics_thop-2.0.18-py3-none-any.whl (28 kB)
Installing collected packages: ultralytics-thop, ultralytics
Successfully installed ultralytics-8.3.226 ultralytics-thop-2.0.18


In [2]:
!uv pip install ultralytics
import ultralytics
ultralytics.checks()

Ultralytics 8.3.226 🚀 Python-3.12.12 torch-2.8.0+cu126 CUDA:0 (Tesla T4, 15095MiB)
Setup complete ✅ (2 CPUs, 12.7 GB RAM, 39.3/112.6 GB disk)


**Train**

In [3]:
from ultralytics import YOLO

# Load a model
model = YOLO("yolo11n.pt")  # load a pretrained model (recommended for training)

# Train the model
results = model.train(data="HomeObjects-3K.yaml", epochs=3, imgsz=640)

Downloading https://github.com/ultralytics/assets/releases/download/v8.3.0/yolo11n.pt to 'yolo11n.pt': 100% 5.4MB 111.8MB/s 0.0s
Ultralytics 8.3.226 🚀 Python-3.12.12 torch-2.8.0+cu126 CUDA:0 (Tesla T4, 15095MiB)
engine/trainer: agnostic_nms=False, amp=True, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, compile=False, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=HomeObjects-3K.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=3, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolo11n.pt, momentum=0.937,

![HomeObject-3K dataset sample image](https://github.com/ultralytics/docs/releases/download/0/homeobjects-3k-dataset-sample.avif)

**Predict**

In [5]:
from ultralytics import YOLO

# Load a model
modelp = YOLO(f"{model.trainer.save_dir}/weights/best.pt")  # load a fine-tuned model

# Inference using the model
prediction_results = modelp.predict("https://ultralytics.com/assets/home-objects-sample.jpg", save=True)


Found https://ultralytics.com/assets/home-objects-sample.jpg locally at home-objects-sample.jpg
image 1/1 /content/home-objects-sample.jpg: 448x640 1 sofa, 2 tables, 3 lamps, 3 potted plants, 6 photo frames, 8.6ms
Speed: 3.0ms preprocess, 8.6ms inference, 1.2ms postprocess per image at shape (1, 3, 448, 640)
Results saved to /content/runs/detect/predict2
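
To put the prediction log in perspective, the short sketch below parses the two log lines shown above and derives the total object count and a rough throughput figure. The regular expressions and helper names are my own, not part of Ultralytics, and the timing comes from a single image, so the images-per-second value is only an estimate.

```python
import re

# Log lines copied from the prediction output above.
SUMMARY = "448x640 1 sofa, 2 tables, 3 lamps, 3 potted plants, 6 photo frames, 8.6ms"
SPEED = "Speed: 3.0ms preprocess, 8.6ms inference, 1.2ms postprocess per image"


def count_detections(summary):
    """Return {label: count} parsed from the '<n> <label>,' items of the summary line."""
    items = re.findall(r"(\d+) ([a-z][a-z ]*?),", summary)
    return {label: int(n) for n, label in items}


def total_latency_ms(speed):
    """Sum the preprocess, inference, and postprocess times in milliseconds."""
    return sum(float(ms) for ms in re.findall(r"([\d.]+)ms", speed))


counts = count_detections(SUMMARY)
latency = total_latency_ms(SPEED)
fps = 1000.0 / latency  # end-to-end images per second for this one sample

print(counts)
print(f"{sum(counts.values())} objects, {latency:.1f} ms per image, about {fps:.0f} images/s")
```

Inference itself accounts for 8.6 ms of the 12.8 ms total, so preprocessing and postprocessing take roughly a third of the per-image time in this run.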


<img align="left" src="https://github.com/user-attachments/assets/07e7811e-3677-4982-99f3-5e77e7f62651" width="720" height="460">

**Export**

In [6]:
from ultralytics import YOLO

# Load a model
modele = YOLO(f"{model.trainer.save_dir}/weights/best.pt")  # load a custom trained model

# Export the model
modele.export(format="onnx")

Ultralytics 8.3.226 🚀 Python-3.12.12 torch-2.8.0+cu126 CPU (Intel Xeon CPU @ 2.00GHz)
💡 ProTip: Export to OpenVINO format for best performance on Intel hardware. Learn more at https://docs.ultralytics.com/integrations/openvino/
YOLO11n summary (fused): 100 layers, 2,584,492 parameters, 0 gradients, 6.3 GFLOPs

PyTorch: starting from '/content/runs/detect/train/weights/best.pt' with input shape (1, 3, 640, 640) BCHW and output shape(s) (1, 16, 8400) (5.2 MB)
requirements: Ultralytics requirements ['onnx>=1.12.0', 'onnxslim>=0.1.71', 'onnxruntime-gpu'] not found, attempting AutoUpdate...
Using Python 3.12.12 environment at: /usr
Resolved 14 packages in 167ms
Prepared 6 packages in 6.16s
Installed 6 packages in 240ms
 + colorama==0.4.6
 + coloredlogs==15.0.1
 + humanfriendly==10.0
 + onnx==1.20.0rc1
 + onnxruntime-gpu==1.23.2
 + onnxslim==0.1.73

requirements: AutoUpdate success ✅ 6.8s


ONNX: starting export with onnx 1.20.0rc1

'/content/runs/detect/train/weights/best.onnx'
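
The exported model returns a raw tensor of shape (1, 16, 8400) rather than finished detections. The sketch below shows one way such an output could be decoded in plain Python. It assumes a layout the notebook does not confirm: rows 0 to 3 hold `cx, cy, w, h` in pixels and the remaining 12 rows hold per-class scores. The candidate values, the 0.25 threshold, and the `decode` helper are made up for illustration.

```python
NUM_CLASSES = 12  # assumption: 16 output channels = 4 box values + 12 class scores


def decode(raw, conf_threshold=0.25):
    """Turn a channel-first [4 + NUM_CLASSES][N] nested list into detections."""
    detections = []
    num_candidates = len(raw[0])
    for i in range(num_candidates):
        cx, cy, w, h = (raw[r][i] for r in range(4))
        scores = [raw[4 + c][i] for c in range(NUM_CLASSES)]
        best_class = max(range(NUM_CLASSES), key=scores.__getitem__)
        conf = scores[best_class]
        if conf < conf_threshold:
            continue  # discard low-confidence candidates
        # Convert center/size format to corner format (x1, y1, x2, y2).
        box = (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
        detections.append({"box_xyxy": box, "class_id": best_class, "conf": conf})
    return detections


# Hypothetical candidates: (cx, cy, w, h, class_id, score).
candidates = [
    (320.0, 240.0, 100.0, 80.0, 1, 0.90),
    (100.0, 100.0, 40.0, 40.0, 4, 0.10),  # below the threshold, dropped
    (500.0, 300.0, 60.0, 120.0, 11, 0.60),
]

# Build the channel-first layout: one row per channel, one column per candidate.
raw = [[0.0] * len(candidates) for _ in range(4 + NUM_CLASSES)]
for i, (cx, cy, w, h, cls, score) in enumerate(candidates):
    raw[0][i], raw[1][i], raw[2][i], raw[3][i] = cx, cy, w, h
    raw[4 + cls][i] = score

detections = decode(raw)
for det in detections:
    print(det)
```

A real pipeline would still need to remove overlapping duplicates after this step, since confidence filtering alone leaves several candidates per object.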

**Citation**

```
@dataset{Jocher_Ultralytics_Datasets_2025,
    author = {Jocher, Glenn and Rizwan, Muhammad},
    license = {AGPL-3.0},
    month = {May},
    title = {Ultralytics Datasets: HomeObjects-3K Detection Dataset},
    url = {https://docs.ultralytics.com/datasets/detect/homeobject-3k/},
    version = {1.0.0},
    year = {2025}
}
```


**Observations**

While going through the notebook, I observed how YOLO lays a grid of cells over the image, with each cell responsible for detecting objects whose centers fall inside it. Recent versions such as YOLO11 repeat this at several grid resolutions so that both large and small objects are covered. For every cell, the model simultaneously predicts bounding box coordinates, confidence scores, and class labels. This enables YOLO to detect multiple objects in a single frame, which is valuable for real-time tasks such as autonomous driving, video surveillance, and robotics.
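
The grid idea can be checked against the output shape (1, 16, 8400) reported in the export log. The arithmetic below is a sketch under two assumptions that the notebook does not print: that the detection head works at strides 8, 16, and 32, and that HomeObjects-3K has 12 classes.

```python
IMG_SIZE = 640         # imgsz used for training and export in this lab
STRIDES = (8, 16, 32)  # assumed detection-head strides
NUM_CLASSES = 12       # assumed class count for HomeObjects-3K
BOX_VALUES = 4         # box coordinates per prediction

# Each stride divides the image into a square grid of IMG_SIZE // stride cells per side.
grid_sizes = [IMG_SIZE // s for s in STRIDES]
cells_per_scale = [g * g for g in grid_sizes]
total_predictions = sum(cells_per_scale)
channels = BOX_VALUES + NUM_CLASSES

for s, g, n in zip(STRIDES, grid_sizes, cells_per_scale):
    print(f"stride {s:>2}: {g}x{g} grid -> {n} cells")
print(f"output shape: (1, {channels}, {total_predictions})")
```

Under these assumptions the three grids contribute 6400, 1600, and 400 cells, which adds up to the 8400 predictions in the log, and 4 box values plus 12 class scores give the 16 channels.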

The notebook started from pretrained YOLO11n weights, fine-tuned them for three epochs on HomeObjects-3K, and then used the resulting `best.pt` to run detection on a sample image. The output visualization showed bounding boxes around the detected objects along with their class names and confidence scores. YOLO's greatest advantage is fast inference without sacrificing much accuracy. One limitation I noticed was its occasional difficulty in detecting small or overlapping objects, which reflects the trade-off between speed and precision typical of single-shot detectors.
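
One reason overlapping objects are hard is the post-processing step that removes duplicate boxes. The sketch below is a simplified greedy non-maximum suppression (NMS) in plain Python, not the Ultralytics implementation. The boxes and scores are invented, and the 0.7 threshold mirrors the `iou=0.7` value in the trainer arguments logged above.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.7):
    """Greedy NMS: visit boxes by descending score, keep one only if it does not
    overlap any already-kept box by more than iou_threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical candidates for illustration.
boxes = [
    (100, 100, 200, 200),  # object A
    (105, 105, 205, 205),  # near-duplicate of A
    (150, 100, 250, 200),  # a different object that overlaps A by half its width
]
scores = [0.9, 0.8, 0.7]

print(nms(boxes, scores, iou_threshold=0.7))  # duplicate removed, neighbor kept
print(nms(boxes, scores, iou_threshold=0.3))  # stricter threshold also drops the neighbor
```

With the stricter threshold, the genuinely separate third box is discarded along with the duplicate, which is the kind of miss that can happen when real objects overlap heavily.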

Overall, the lab emphasized how much deep-learning-based detection systems depend on training data, model architecture, and hyperparameter tuning. It also highlighted the importance of GPU acceleration (a Tesla T4 in this notebook) for processing high-resolution images efficiently, especially in real-world applications that require quick and reliable results.

**Conclusion**

From this lab, I learned that YOLO was a major step forward in computer vision because it made real-time object detection practical. Its unified architecture performs localization and classification in a single pass, which makes it faster than earlier region-based methods. The lab also reinforced my understanding of how neural networks can be applied to visual perception problems, turning pixel data into labeled objects with locations and confidence scores.

In conclusion, YOLO combines efficiency, accuracy, and scalability, which makes it one of the most practical choices for modern AI applications. This lab improved my understanding of how object detection systems work and how they can be trained, evaluated, and exported for real-world scenarios such as robotics, autonomous navigation, and intelligent surveillance.