This tutorial shows how to set up and run object detection with YOLO in Google Colab. It starts with mounting Google Drive and running basic Linux commands, such as unzipping a dataset and removing a folder. You then install the Ultralytics YOLO package and use SAM3 with it for text-prompted object segmentation and tracking in video, including downloading the SAM3 model. Next, you train a YOLO model on a prepared dataset, first from Python and then from the command-line interface. Finally, you run detection on a single image and on a folder of images, and save each detected object's name and confidence score to a text file.

**Mount Google Drive to Access Datasets and Project Files**

In [None]:
from google.colab import drive
drive.mount('/content/drive')

**Running Linux Commands in Colab**

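You can run Linux shell commands from a Colab cell by prefixing the line with `!`. Python variables can be interpolated into the command with curly braces, as with `{zip_file}` in the example.

Example: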

In [None]:
zip_file = "Dataset.zip"
!unzip {zip_file}
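
If you would rather stay in Python than shell out to `unzip`, the standard-library `zipfile` module can do the same job. The sketch below is a minimal version; the helper name `extract_zip` and the destination folder in the usage comment are assumptions for illustration, not part of the original notebook.

```python
import zipfile
from pathlib import Path


def extract_zip(zip_path, dest_dir):
    """Extract every member of zip_path into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the target folder if needed
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


# Example usage in Colab (hypothetical destination folder):
# names = extract_zip("Dataset.zip", "/content/dataset")
# print(len(names), "entries extracted")
```

Returning the member names makes it easy to confirm that the archive contained what you expected before training.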

**Cleaning Previous Training Runs**

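Remove the `runs/` directory so that results from earlier experiments do not get mixed up with new ones. The `-f` flag keeps the command from failing when the folder does not exist yet, for example in a fresh Colab session.

In [None]:
!rm -rf runs/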
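
The same cleanup can be done from Python, which also makes it easy to skip the step when there is nothing to delete. This is a minimal sketch using the standard library; `clear_runs` is a hypothetical helper name, not something the notebook or Ultralytics provides.

```python
import shutil
from pathlib import Path


def clear_runs(runs_dir="runs"):
    """Delete runs_dir and its contents; return True if a folder was removed."""
    path = Path(runs_dir)
    if not path.is_dir():
        return False  # nothing to clean up
    shutil.rmtree(path)
    return True


# Example usage:
# removed = clear_runs("runs")
# print("Removed previous runs" if removed else "No previous runs found")
```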

**Integrating SAM3 with YOLO for Object Segmentation**

**Install YOLO**

In [None]:
!pip install ultralytics

**Download the SAM3 Model from ModelScope**

In [None]:
!wget -O sam3.pt https://www.modelscope.cn/models/facebook/sam3/resolve/master/sam3.pt

**Detect and Track Objects in Videos Using the SAM3 Model with YOLO**

In [None]:
from ultralytics.models.sam import SAM3VideoSemanticPredictor

# Initialize the semantic video predictor.
# half=True runs inference in FP16; name sets the output run folder name.
overrides = dict(conf=0.75, task="segment", mode="predict", imgsz=640, model="sam3.pt", half=True, name="penguin")
predictor = SAM3VideoSemanticPredictor(overrides=overrides)

# Track concepts using text prompts; swap in another video and prompt as needed, e.g.:
# results = predictor(source="01.mp4", text=["tiger"], save=True)
results = predictor(source="02.mp4", text=["penguin"], save=True)

**Train YOLO in Python**

In [None]:
# Train YOLO model
from ultralytics import YOLO

model = YOLO("yolo26n.pt")
model.train(data="/content/dataset/data.yaml", epochs=10, imgsz=640, batch=10)
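
The `data` argument points to a dataset YAML file that tells Ultralytics where the images are and what the classes are called. The sketch below builds such a file as a string. The `images/train` and `images/val` subfolders and the class names are assumptions for illustration, and `build_data_yaml` is a hypothetical helper, so adjust them to match your own dataset.

```python
def build_data_yaml(root, class_names, train="images/train", val="images/val"):
    """Return the text of a dataset YAML with path, train, val, and names entries."""
    lines = [
        f"path: {root}",    # dataset root directory
        f"train: {train}",  # training images, relative to path
        f"val: {val}",      # validation images, relative to path
        "names:",
    ]
    # One "index: name" entry per class, in class-id order
    lines += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


# Placeholder class names for illustration only
yaml_text = build_data_yaml("/content/dataset", ["penguin", "tiger"])
print(yaml_text)

# To write the file (hypothetical location):
# with open("/content/dataset/data.yaml", "w") as f:
#     f.write(yaml_text)
```

The class order matters: the index of each name must match the class IDs used in your label files.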

**Train YOLO from the Command-Line Interface (CLI)**

In [None]:
# Train YOLO model (every CLI option is passed as a key=value pair, including the model)
!yolo train model=yolo11x.pt data=data.yaml imgsz=640 batch=32 epochs=100 device=0 name=Train

**Object Detection in a Single Image**

In [None]:
from ultralytics import YOLO

model = YOLO("/content/best.pt")
results = model.predict(source="/content/1.jpg", conf=0.10, save=True)
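
To make the effect of `conf` concrete: it acts as a minimum confidence threshold, so lower-scoring detections are dropped from the results. The plain-Python sketch below mimics that filtering on made-up scores. `filter_by_confidence` is a hypothetical helper, and treating the threshold as inclusive is an assumption of this sketch, not a statement about Ultralytics internals.

```python
def filter_by_confidence(detections, conf=0.10):
    """Keep (label, score) pairs whose score is at least conf."""
    return [(label, score) for label, score in detections if score >= conf]


# Made-up detections for illustration only
sample = [("penguin", 0.92), ("penguin", 0.31), ("tiger", 0.04)]
kept = filter_by_confidence(sample, conf=0.10)
print(kept)
```

A low value such as 0.10 keeps many weak detections, which is useful for inspection but tends to add false positives; raise it once you know how the model scores your images.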

**Batch Object Detection on Multiple Images**

In [None]:
from ultralytics import YOLO

model = YOLO("/content/best.pt")
results = model.predict(
    source="/content/images",
    conf=0.10,
    save=True
)

**Detect Objects in Multiple Images and Save Image Name, Object Label, and Confidence Score to a Text File**

In [None]:
from ultralytics import YOLO
import os


model = YOLO("/content/best_final.pt")

results = model.predict(
    source="/content/im",
    conf=0.10,
    save=True
)



output_file = "/content/all_detections.txt"

with open(output_file, "w") as f:
    for r in results:
        image_name = os.path.basename(r.path)

        f.write(f"{image_name}\n")
        f.write("detected:\n")

        if r.boxes is None or len(r.boxes) == 0:
            f.write("none\n\n")
            continue

        for box in r.boxes:
            cls_id = int(box.cls[0])
            conf = float(box.conf[0])
            class_name = r.names[cls_id]

            f.write(f"{class_name}\t{conf:.2f}\n")

        f.write("\n")

print("Saved to:", output_file)
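
Once the text file exists, you may want to load it back for counting or filtering. The sketch below parses the format written by the cell above (image name, a `detected:` line, then either `none` or tab-separated label and score lines, with a blank line between images). `parse_detections` is a hypothetical helper, and the sample text is made up for illustration.

```python
def parse_detections(text):
    """Map each image name to a list of (label, confidence) tuples."""
    records = {}
    for block in text.strip().split("\n\n"):
        lines = [ln for ln in block.splitlines() if ln.strip()]
        if len(lines) < 2:
            continue  # skip malformed blocks
        image_name = lines[0]
        detections = []
        for ln in lines[2:]:  # lines[1] is the "detected:" marker
            if "\t" not in ln:
                continue  # the "none" placeholder has no tab
            label, conf = ln.rsplit("\t", 1)
            detections.append((label, float(conf)))
        records[image_name] = detections
    return records


# Made-up file content in the same layout as all_detections.txt
sample_text = "a.jpg\ndetected:\ndog\t0.85\ncat\t0.50\n\nb.jpg\ndetected:\nnone\n\n"
parsed = parse_detections(sample_text)
print(parsed)

# Example usage with the real file:
# with open("/content/all_detections.txt") as f:
#     parsed = parse_detections(f.read())
```

Splitting on the last tab with `rsplit` keeps the parser working even if a class name itself contains a tab-free space or punctuation.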