[![Labellerr](https://storage.googleapis.com/labellerr-cdn/%200%20Labellerr%20template/notebook.webp)](https://www.labellerr.com)

# **Fine-Tune RT-DETR for Plant Weed Detection**

---

[![labellerr](https://img.shields.io/badge/Labellerr-BLOG-black.svg)](https://www.labellerr.com/blog)
[![Youtube](https://img.shields.io/badge/Labellerr-YouTube-b31b1b.svg)](https://www.youtube.com/@Labellerr)
[![Github](https://img.shields.io/badge/Labellerr-GitHub-green.svg)](https://github.com/Labellerr/Hands-On-Learning-in-Computer-Vision)

1.  **Setup**: Imports libraries and clones the `yolo_finetune_utils` repository for helper functions.
2.  **Data Preparation**:
    *   Extracts random frames from input videos.
    *   Converts annotations from COCO JSON to YOLO format.
    *   Splits data into Train, Validation, and Test sets.
3.  **Training**: Fine-tunes the **RT-DETR-L** model on the prepared custom dataset for 100 epochs.
4.  **Visualization**: Runs inference on a new video and writes an output video with the detected weeds highlighted.

In [None]:
import cv2
import numpy as np
from ultralytics import RTDETR
import torch
from typing import Dict, Tuple, Optional, List
from pathlib import Path
import random 
%matplotlib inline

## Setup: Clone YOLO Fine-tuning Utilities for helper functions
Clone the `yolo_finetune_utils` repository, which contains helper functions for:
- Frame extraction from videos
- COCO to YOLO format conversion
- Dataset preparation utilities
> **Note**: Uncomment this cell only if you haven't cloned the repository yet.

In [None]:
# !git clone https://github.com/Labellerr/yolo_finetune_utils.git

## Data Preparation: Extract Random Frames from Videos
Extract random frames from video files to create a dataset for annotation.

**Parameters**:
- `paths`: List of video directories
- `total_images`: Number of frames to extract
- `out_dir`: Output directory for extracted frames
- `jpg_quality`: JPEG compression quality (100 = highest)
- `seed`: Random seed for reproducibility

In [None]:
from yolo_finetune_utils.frame_extractor import extract_random_frames

extract_random_frames(
        paths=[r"videos\manufacturing_video_data"],
        total_images=150,
        out_dir="dataset_frames",
        jpg_quality=100,
        seed=42
    )
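
How `extract_random_frames` chooses frames internally is not shown in this notebook. As a rough illustration of what seeded random sampling means, the sketch below draws unique frame indices from a video of known length. `sample_frame_indices` is a hypothetical helper written for this example, not part of `yolo_finetune_utils`.

```python
import random
from typing import List


def sample_frame_indices(total_frames: int, num_samples: int, seed: int = 42) -> List[int]:
    """Pick unique frame indices at random, reproducibly for a given seed."""
    if total_frames <= 0:
        return []
    rng = random.Random(seed)           # local RNG, leaves the global random state untouched
    k = min(num_samples, total_frames)  # cannot sample more frames than the video has
    return sorted(rng.sample(range(total_frames), k))


# Hypothetical video with 3000 frames; the same seed always yields the same indices.
indices = sample_frame_indices(total_frames=3000, num_samples=150, seed=42)
print(len(indices), indices[:5])
```

Fixing the seed matters here because the extracted frames are annotated afterwards: re-running the extraction with a different seed would produce frames that no longer match the annotation file.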

## Data Preparation: Convert Annotations to YOLO Format
Convert COCO-format annotations to YOLO format and split the dataset into train/val/test sets.

**Configuration**:
- **Train/Val/Test Split**: 80% / 10% / 10%
- **Input**: COCO JSON annotations + image directory
- **Output**: YOLO-format dataset in `model_dataset/`

The converter automatically:
- Creates train/val/test splits
- Generates YOLO-format label files (.txt)
- Organizes images and labels into proper directory structure
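
To make the format change concrete: a COCO bounding box is `[x_min, y_min, width, height]` in pixels, while a YOLO detection label stores `class cx cy w h` with the box centre and size normalized by the image dimensions. The sketch below shows that arithmetic for the standard bounding-box case. `coco_bbox_to_yolo` is a hypothetical helper for illustration; the exact label content written by `coco_to_yolo_converter` (which is imported from a `seg_converter` module) is not shown in this notebook.

```python
from typing import Sequence, Tuple


def coco_bbox_to_yolo(
    bbox: Sequence[float], img_w: int, img_h: int
) -> Tuple[float, float, float, float]:
    """Convert a COCO [x_min, y_min, w, h] pixel box to normalized YOLO (cx, cy, w, h)."""
    x_min, y_min, box_w, box_h = bbox
    cx = (x_min + box_w / 2) / img_w  # box centre, as a fraction of image width
    cy = (y_min + box_h / 2) / img_h  # box centre, as a fraction of image height
    return cx, cy, box_w / img_w, box_h / img_h


# A 160x120 px box with its top-left corner at (80, 60) in a 640x480 image.
cx, cy, w, h = coco_bbox_to_yolo([80, 60, 160, 120], img_w=640, img_h=480)
class_id = 0  # hypothetical class index for "weed"
print(f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}")
# prints: 0 0.250000 0.250000 0.250000 0.250000
```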

In [None]:
from yolo_finetune_utils.coco_yolo_converter.seg_converter import coco_to_yolo_converter

ANNOTATION_JSON = "annotations.json"
IMAGE_DIR = "dataset_frames"


coco_to_yolo_converter(
        json_path=ANNOTATION_JSON,
        images_dir=IMAGE_DIR,
        output_dir="model_dataset",
        use_split=True,
        train_ratio=0.8,
        val_ratio=0.1,
        test_ratio=0.1,
        shuffle=True,
        verbose=False
    )
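
How the converter shuffles and splits internally is not shown here. A minimal sketch of an 80/10/10 split, assuming a seeded shuffle followed by slicing, might look like this. `split_dataset` and the file names are hypothetical and only illustrate the ratios passed above.

```python
import random
from typing import Dict, List


def split_dataset(
    items: List[str], train_ratio: float = 0.8, val_ratio: float = 0.1, seed: int = 42
) -> Dict[str, List[str]]:
    """Shuffle items and slice them into train/val/test; test receives the remainder."""
    shuffled = list(items)                 # copy so the caller's list is not modified
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility
    n_train = round(len(shuffled) * train_ratio)
    n_val = round(len(shuffled) * val_ratio)
    return {
        "train": shuffled[:n_train],
        "val": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],
    }


frames = [f"frame_{i:04d}.jpg" for i in range(150)]  # hypothetical file names
splits = split_dataset(frames)
print({name: len(files) for name, files in splits.items()})
# prints: {'train': 120, 'val': 15, 'test': 15}
```

Giving the test split whatever remains after train and val guarantees that every image lands in exactly one split, even when the ratios do not divide the dataset evenly.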

## System Check: GPU Memory Status
Clear the GPU cache and display current memory usage to ensure sufficient resources for training.

**Memory Metrics**:
- **Allocated**: GPU memory currently occupied by tensors
- **Cached**: Memory reserved by PyTorch's caching allocator (includes the allocated portion)
- **Free**: Available GPU memory on the device

> Run this cell before training to free up GPU memory.

In [2]:
torch.cuda.empty_cache()

# Check GPU memory status
print(f"Allocated: {torch.cuda.memory_allocated(0)/1024**3:.2f} GB")
print(f"Cached: {torch.cuda.memory_reserved(0)/1024**3:.2f} GB")
print(f"Free: {torch.cuda.mem_get_info(0)[0]/1024**3:.2f} GB")

Allocated: 0.00 GB
Cached: 0.00 GB
Free: 6.87 GB


## Model Training: RT-DETR for Plant Weed Detection
Train the RT-DETR-L (Large) model on the dataset.

**Training Configuration**:

| Parameter | Value | Description |
|-----------|-------|-------------|
| `data` | `model_dataset/data.yaml` | Dataset configuration file |
| `epochs` | 100 | Number of training epochs |
| `imgsz` | 640 | Input image size (640x640) |
| `batch` | 4 | Batch size |
| `device` | 0 | GPU device ID (0 = first GPU) |
| `workers` | 1 | Number of dataloader workers |

**Model**: RT-DETR-L (Large variant)
- Pre-trained weights: `rtdetr-l.pt`
- Architecture: Real-Time Detection Transformer
- Optimized for real-time object detection

> **Training Time**: Approximately 30-60 minutes, depending on the GPU

In [None]:
from ultralytics import RTDETR
# Load a model
model = RTDETR("rtdetr-l.pt")

# Train the model
results = model.train(
    data="model_dataset/data.yaml",    # Path to your dataset YAML file (forward slashes work on all platforms)
    epochs=100,                        # Number of training epochs
    imgsz=640,                         # Image size
    batch=4,                           # Batch size
    device=0,                          # GPU device (0 for first GPU, 'cpu' for CPU)
    workers=1                          # Number of dataloader workers
)
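
To get a feel for the training budget, the arithmetic below estimates how many batches the run processes with the settings above. It assumes that all 150 extracted frames carry annotations, so the 80% split yields 120 training images; the real count depends on your annotation file.

```python
import math

total_images = 150  # frames extracted earlier (assumes every frame is annotated)
train_ratio = 0.8   # train share used by the converter
batch_size = 4
epochs = 100

train_images = round(total_images * train_ratio)
batches_per_epoch = math.ceil(train_images / batch_size)  # last batch may be partial
total_batches = batches_per_epoch * epochs

print(train_images, batches_per_epoch, total_batches)
# prints: 120 30 3000
```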

## Visualize Model Inference

**What it does:**
The function below reads a video frame by frame, detects weeds with the fine-tuned RT-DETR model, and highlights each detection with a **semi-transparent red box**.


In [5]:
def run_inference_video(video_path, output_path="output.mp4", conf=0.5):
    """
    Visualizes RT-DETR detections on a video with a red semi-transparent overlay.

    Weights are loaded from a hard-coded path; adjust it if your training run
    was saved elsewhere (for example `train2`).

    Args:
        video_path (str): Path to the input video.
        output_path (str): Path to save the output video.
        conf (float): Confidence threshold for detections.
    """
    # Load the best checkpoint saved by the training run
    model = RTDETR("runs/detect/train/weights/best.pt")

    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        print(f"Error: Could not open video {video_path}")
        return

    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
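    # Keep fractional frame rates (e.g. 29.97); fall back to 30 (an assumed default) if the rate is unknown
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0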
    
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')
    out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))
    
    print(f"Processing {video_path}...")
    
    frame_count = 0
    total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))

    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break
            
        # Run inference
        # RT-DETR model call returns a list of Results objects
        results = model(frame, verbose=False, conf=conf)
        
        # Create overlay
        overlay = frame.copy()
        
        for result in results:
            # result.boxes.xyxy is a Tensor, convert to numpy
            if result.boxes is not None:
                boxes = result.boxes.xyxy.cpu().numpy()
                for box in boxes:
                    x1, y1, x2, y2 = map(int, box[:4])
                    # Draw red filled rectangle (BGR: 0, 0, 255)
                    cv2.rectangle(overlay, (x1, y1), (x2, y2), (0, 0, 255), -1)
                
        # Apply transparency
        alpha = 0.5
        cv2.addWeighted(overlay, alpha, frame, 1 - alpha, 0, frame)
        
        out.write(frame)
        
        frame_count += 1
        if frame_count % 50 == 0:
            print(f"Processed {frame_count}/{total_frames} frames")
        
    cap.release()
    out.release()
    print(f"Processed video saved to {output_path}")
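
The transparency step is a per-pixel weighted sum: `alpha * overlay + (1 - alpha) * frame`. The sketch below reproduces that arithmetic for a single BGR pixel in plain Python so the effect of `alpha = 0.5` is easy to see. `blend_pixel` and the sample pixel values are hypothetical and used only for illustration; in the function above, `cv2.addWeighted` performs the blend on whole frames.

```python
from typing import Tuple

Pixel = Tuple[int, int, int]  # (B, G, R), matching OpenCV's channel order


def blend_pixel(overlay_px: Pixel, frame_px: Pixel, alpha: float = 0.5) -> Pixel:
    """Weighted sum alpha * overlay + (1 - alpha) * frame, clamped to 0..255."""
    return tuple(
        min(255, max(0, round(alpha * o + (1 - alpha) * f)))
        for o, f in zip(overlay_px, frame_px)
    )


red = (0, 0, 255)            # fill colour drawn on the overlay inside a box
background = (100, 200, 51)  # hypothetical original pixel

print(blend_pixel(red, background))         # inside a box: (50, 100, 153)
print(blend_pixel(background, background))  # outside any box: (100, 200, 51)
```

Outside the detected boxes the overlay is identical to the frame, so blending leaves those pixels unchanged and only the boxed regions pick up the red tint.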

## Run Inference on Video

In [None]:
video = r"plant_weed_video\sample2.mp4"

run_inference_video(video, conf=0.5)

---

## 👨‍💻 About Labellerr's Hands-On Learning in Computer Vision

Thank you for exploring this **Labellerr Hands-On Computer Vision Cookbook**! We hope this notebook helped you learn, prototype, and accelerate your vision projects.  
Labellerr provides ready-to-run Jupyter/Colab notebooks for the latest models and real-world use cases in computer vision, AI agents, and data annotation.

---
## 🧑‍🔬 Check Our Popular YouTube Videos

Whether you're a beginner or a practitioner, our hands-on training videos are perfect for learning custom model building, computer vision techniques, and applied AI:

- [How to Fine-Tune YOLO on Custom Dataset](https://www.youtube.com/watch?v=pBLWOe01QXU)  
  Step-by-step guide to fine-tuning YOLO for real-world use: environment setup, annotation, training, validation, and inference.
- [Build a Real-Time Intrusion Detection System with YOLO](https://www.youtube.com/watch?v=kwQeokYDVcE)  
  Create an AI-powered system to detect intruders in real time using YOLO and computer vision.
- [Finding Athlete Speed Using YOLO](https://www.youtube.com/watch?v=txW0CQe_pw0)  
  Estimate real-time speed of athletes for sports analytics.
- [Object Counting Using AI](https://www.youtube.com/watch?v=smsjBBQcIUQ)  
  Learn dataset curation, annotation, and training for robust object counting AI applications.
---

## 🎦 Popular Labellerr YouTube Videos

Level up your skills and see video walkthroughs of these tools and notebooks on the  
[Labellerr YouTube Channel](https://www.youtube.com/@Labellerr/videos):

- [How I Fixed My Biggest Annotation Nightmare with Labellerr](https://www.youtube.com/watch?v=hlcFdiuz_HI) – Solving complex annotation for ML engineers.
- [Explore Your Dataset with Labellerr's AI](https://www.youtube.com/watch?v=LdbRXYWVyN0) – Auto-tagging, object counting, image descriptions, and dataset exploration.
- [Boost AI Image Annotation 10X with Labellerr's CLIP Mode](https://www.youtube.com/watch?v=pY_o4EvYMz8) – Refine annotations with precision using CLIP mode.
- [Boost Data Annotation Accuracy and Efficiency with Active Learning](https://www.youtube.com/watch?v=lAYu-ewIhTE) – Speed up your annotation workflow using Active Learning.

> 👉 **Subscribe** for Labellerr's deep learning, annotation, and AI tutorials, or watch videos directly alongside notebooks!

---

## 🤝 Stay Connected

- **Website:** [https://www.labellerr.com/](https://www.labellerr.com/)
- **Blog:** [https://www.labellerr.com/blog/](https://www.labellerr.com/blog/)
- **GitHub:** [Labellerr/Hands-On-Learning-in-Computer-Vision](https://github.com/Labellerr/Hands-On-Learning-in-Computer-Vision)
- **LinkedIn:** [Labellerr](https://in.linkedin.com/company/labellerr)
- **Twitter/X:** [@Labellerr1](https://x.com/Labellerr1)

*Happy learning and building with Labellerr!*
