Training failed when the dataset has only 1 training image #11693

Closed
2 tasks done
PacificDou opened this issue May 6, 2024 · 1 comment · Fixed by #11694
Labels: bug (Something isn't working), fixed (Bug has been resolved)

Comments

@PacificDou (Contributor)

Search before asking

  • I have searched the YOLOv8 issues and found no similar bug report.

YOLOv8 Component

No response

Bug

When there is only 1 training image, dataset.max_buffer_length is set to 1.
Then, each time an image is loaded, dataset.buffer is immediately cleared.
A cascading failure occurs once dataset.buffer is empty, because the code tries to draw samples from an empty list.
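
For illustration, here is a minimal, self-contained sketch of that failure mode. This is not the Ultralytics source; the attribute names simply mirror the description above, and the sampling step is assumed to behave like random.choices over the buffer:

import random

class TinyDataset:
    # Toy stand-in for the training dataset's image buffer.
    def __init__(self, n_images):
        self.buffer = []
        # With a single training image the cap works out to 1.
        self.max_buffer_length = min(n_images, 1000)

    def load_image(self, i):
        self.buffer.append(i)
        if len(self.buffer) >= self.max_buffer_length:
            # With max_buffer_length == 1 the buffer is drained right after every load.
            self.buffer.pop(0)

    def sample_indexes(self):
        # Fails once the buffer is empty.
        return random.choices(self.buffer, k=3)

ds = TinyDataset(n_images=1)
ds.load_image(0)       # buffer grows to [0], then is immediately emptied
ds.sample_indexes()    # raises IndexError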

Environment

Ultralytics YOLOv8.2.10 🚀 Python-3.10.12 torch-2.2.1+cu118 CUDA:0 (NVIDIA L4, 22478MiB)
Setup complete ✅ (8 CPUs, 31.3 GB RAM, 267.1/484.4 GB disk)

OS Linux-6.5.0-1018-gcp-x86_64-with-glibc2.35
Environment Linux
Python 3.10.12
Install git
RAM 31.33 GB
CPU Intel Xeon 2.20GHz
CUDA 11.8

matplotlib ✅ 3.8.3>=3.3.0
opencv-python ✅ 4.9.0.80>=4.6.0
pillow ✅ 10.2.0>=7.1.2
pyyaml ✅ 6.0.1>=5.3.1
requests ✅ 2.31.0>=2.23.0
scipy ✅ 1.12.0>=1.4.1
torch ✅ 2.2.1+cu118>=1.8.0
torchvision ✅ 0.17.1+cu118>=0.9.0
tqdm ✅ 4.66.2>=4.64.0
psutil ✅ 5.9.8
py-cpuinfo ✅ 9.0.0
thop ✅ 0.1.1-2209072238>=0.1.1
pandas ✅ 1.3.5>=1.1.4
seaborn ✅ 0.13.2>=0.11.0

Minimal Reproducible Example

from ultralytics import YOLO

model = YOLO('yolov8n.pt')

# Callback to simulate a training set that contains only 1 image
def on_train_start(trainer):
    trainer.train_loader.dataset.max_buffer_length = 1

model.callbacks["on_train_start"].append(on_train_start)

# Train the model
results = model.train(data='coco8.yaml', epochs=1, imgsz=640, cache=False, workers=0)

Additional

No response

Are you willing to submit a PR?

  • Yes I'd like to help by submitting a PR!
@PacificDou added the bug label on May 6, 2024
@glenn-jocher added the fixed label on May 6, 2024
@glenn-jocher linked pull request #11694 on May 6, 2024 that will close this issue
@glenn-jocher (Member)

Hi there! Thank you for the detailed issue report. 👍

It looks like the issue arises from trying to perform training with only 1 image in the dataset. Generally, it's recommended to use a larger dataset for effective training, as this ensures better model generalization and prevents overfitting. Additionally, certain buffer mechanisms in the data loader expect more than one sample to operate correctly.

As a workaround, you could manually repeat your single training image several times to increase the effective dataset size. Here's a quick example of how you might adjust your dataset configuration:

# coco8.yaml
train: path/to/repeated_images/  # Folder containing replicated images

And make sure you have multiple copies of your single training image in the repeated_images directory.
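
For example, something like the following would create those copies (the paths, file names, and repeat count are placeholders; adjust them to match how your images and YOLO label files are actually organized):

import shutil
from pathlib import Path

# Placeholder paths: one image plus its YOLO-format label file.
src_img = Path('path/to/image.jpg')
src_lbl = Path('path/to/image.txt')
dst_dir = Path('path/to/repeated_images')
dst_dir.mkdir(parents=True, exist_ok=True)

# 8 copies is an arbitrary choice; any count > 1 avoids the single-image buffer edge case.
for i in range(8):
    shutil.copy(src_img, dst_dir / f'image_{i}.jpg')
    shutil.copy(src_lbl, dst_dir / f'image_{i}.txt')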

Alternatively, if you're open to changing how the buffer handles a single-image dataset, consider adjusting the buffer-filling mechanism to accommodate this edge case.
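
For instance, a minimal guard (illustrative only, not necessarily what the linked PR does) could fall back to the current sample whenever the buffer has nothing to draw from:

import random

def sample_from_buffer(buffer, current_index, k=3):
    # Fall back to the current index when the buffer is empty, so a
    # single-image dataset doesn't crash the augmentation sampling.
    population = buffer if buffer else [current_index]
    return random.choices(population, k=k)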

Let me know if this helps or if you need further assistance!
