Joint training on images with bounding boxes and labels, and images with only labels (YOLO9000 style) #13312
Hello! Great question! To train a model on a mixed dataset containing both images with bounding boxes and images with class labels only (YOLO9000 style), you'll need to modify the training process to handle each type of data appropriately. For YOLOv8, you can implement a custom training loop that applies the full detection loss to images with bounding boxes and only a classification loss to images that carry class labels alone.

This approach requires modifying the loss computation part of your model's training script. You might need to dive into the model's codebase to implement these conditional checks and loss adjustments. If you're comfortable editing the model's training code, this could be a feasible approach. Otherwise, consulting a developer familiar with the YOLO architecture and its implementation might be necessary. Let us know if you need further assistance or specific guidance on the code changes!
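As a concrete illustration of the conditional-loss idea, here is a minimal, hypothetical PyTorch sketch (names like `masked_detection_loss` are made up for illustration, not Ultralytics API), assuming per-sample loss values are available, showing how box-related loss terms could be masked out for label-only images:

```python
import torch

def masked_detection_loss(cls_loss, box_loss, dfl_loss, has_boxes):
    """Combine per-sample loss terms, zeroing box terms for label-only images.

    cls_loss, box_loss, dfl_loss: tensors of shape (batch,), per-sample losses
    has_boxes: bool tensor of shape (batch,), True where ground-truth boxes exist
    """
    mask = has_boxes.float()
    # Classification applies to every image; box/DFL terms only where boxes exist.
    box_terms = (mask * (box_loss + dfl_loss)).sum() / mask.sum().clamp(min=1)
    return cls_loss.mean() + box_terms
```

With `cls_loss = [1.0, 2.0]`, `box_loss = [3.0, 4.0]`, `dfl_loss = [0.5, 0.5]` and only the first image carrying boxes, this yields `1.5 + 3.5 = 5.0`, and the second image's box terms contribute nothing to the gradient.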
Thank you @glenn-jocher! That makes sense. And yes, I'm new to the codebase, so I'd really appreciate it if you could help me get started on the code and give me a general sense of where I should be changing things.
Hello! We're glad to hear the information was helpful! 🚀 To get started with modifying the codebase for your needs, I recommend first familiarizing yourself with the structure of the YOLOv8 model, particularly focusing on where the loss functions are defined and handled. A good starting point would be to look into the loss and trainer modules. If you encounter any specific issues or have questions as you go through the code, feel free to reach out. We're here to help! Happy coding!
Thank you! Can you help me by pointing out where the loss calculations (DFL, VFL, etc.) are? I see `backward()` being called in the trainer code: https://github.com/ultralytics/ultralytics/blob/main/ultralytics/engine/trainer.py. Is that where I should be focusing?
@sidharthanup hello! Absolutely, I'd be happy to help you navigate the codebase! 😊 You're on the right track by looking into the trainer code. The key areas to focus on are the loss definitions (where the classification and box-regression terms are computed) and the training loop in `trainer.py`, where `backward()` is called.

Here's a general outline of what you might need to do: flag each image (or batch) according to whether it has bounding boxes, route label-only images through a classification-only loss path, and combine the two paths before calling `backward()`.
Here's a small snippet to give you an idea:

```python
# In loss.py
def compute_loss(predictions, targets, has_bboxes):
    if has_bboxes:
        # Compute the full loss (classification + localization)
        loss = full_loss(predictions, targets)
    else:
        # Compute classification loss plus a down-weighted IoU term
        loss = classification_loss(predictions, targets) + 0.3 * iou_loss(predictions, targets)
    return loss

# In trainer.py
for batch in dataloader:
    images, targets, has_bboxes = batch
    predictions = model(images)
    loss = compute_loss(predictions, targets, has_bboxes)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()  # reset gradients before the next batch
```

This is a simplified example (`full_loss`, `classification_loss`, and `iou_loss` are placeholders for the real loss functions), but it should give you a starting point. Make sure to test thoroughly to ensure the new logic integrates well with the existing training process. If you encounter any specific issues or need further guidance, feel free to ask. We're here to help! 🚀 Happy coding!
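The per-batch `has_bboxes` flag in the snippet above has to come from the data pipeline. One way to produce it is a small wrapper dataset; the following is a hedged sketch (`MixedDataset` is a made-up name, not part of Ultralytics):

```python
import torch
from torch.utils.data import Dataset

class MixedDataset(Dataset):
    """Hypothetical wrapper yielding (image, boxes, labels, has_bboxes) tuples."""

    def __init__(self, det_samples, cls_samples):
        # det_samples: iterable of (image, boxes, labels) with real boxes
        # cls_samples: iterable of (image, labels); boxes become an empty (0, 4) tensor
        self.items = [(img, boxes, labels, True) for img, boxes, labels in det_samples]
        self.items += [(img, torch.zeros(0, 4), labels, False) for img, labels in cls_samples]

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        return self.items[idx]
```

Note that the training loop above assumes a single `has_bboxes` flag per batch, so you would either use a batch sampler that keeps each batch homogeneous, or mask box losses per sample inside the loss function instead.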
Thank you so much! I'll let you know how it goes!
@sidharthanup you're very welcome! 😊 We're excited to see how your implementation progresses. Before you dive in, here are a couple of quick checks to ensure everything runs smoothly: confirm that your dataloader correctly marks which images carry bounding boxes, and verify on a few batches that the intended loss terms (full loss vs. classification-only) are actually the ones being applied to each image type.

If you need further assistance or run into any issues, don't hesitate to reach out. We're here to help! Happy coding, and best of luck with your project! 🚀
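One quick way to verify the gradient routing is a tiny standalone check in plain PyTorch, independent of the YOLO code (the masking scheme here is an illustrative assumption, not Ultralytics internals):

```python
import torch

# For a label-only image, box-loss terms should receive zero gradient
# through a masked combined loss, while the classification term still trains.
box_loss = torch.tensor([2.0], requires_grad=True)
cls_loss = torch.tensor([1.0], requires_grad=True)
has_boxes = torch.tensor([0.0])  # 0.0 marks a label-only image

total = cls_loss.mean() + (has_boxes * box_loss).sum()
total.backward()

print(box_loss.grad)  # tensor([0.]) -> box branch untouched
print(cls_loss.grad)  # tensor([1.]) -> classification still trains
```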
Question
Hello!
Assuming I have a dataset with images that have bounding boxes (+ superclass labels) and images with only class labels, I need some help with how I can train this in a YOLO9000 manner. In YOLO9000, when the network sees a detection image during training, it backpropagates the full loss; when it sees a classification image, it backpropagates only the classification part of the loss.
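To illustrate the YOLO9000 rule I'm referring to (for a classification-only image, backpropagate classification loss only from the prediction most confident in the known class), here is my own rough PyTorch sketch; it is not code from the paper or from Ultralytics, and it omits YOLO9000's WordTree hierarchy:

```python
import torch
import torch.nn.functional as F

def classification_only_loss(class_logits, label):
    """Sketch of YOLO9000's update for a classification-only image.

    class_logits: (num_preds, num_classes) raw class scores from all predictions
    label: int, the known image-level class
    """
    probs = class_logits.softmax(dim=-1)
    # Pick the prediction that assigns the highest probability to the known class...
    best = probs[:, label].argmax()
    # ...and backpropagate cross-entropy only through that prediction.
    return F.cross_entropy(class_logits[best:best + 1], torch.tensor([label]))
```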
Can I adapt something similar to allow for joint training of detection and classification images and if so how do I proceed?
Awaiting your response. Thanks in advance!