Image Resize in Validation #13753

Rbrq03 · 2024-06-17T16:07:01Z

Search before asking

I have searched the YOLOv8 issues and discussions and found no similar questions.

Question

Hey there! I am confused by the imgsz argument in validation.
When I set imgsz to 512, I expect the final image shape to be (batch, channel, imgsz, imgsz). However, when I print the shape of the batch with:

for batch_i, batch in enumerate(bar):
    self.run_callbacks("on_val_batch_start")
    self.batch_i = batch_i

    # Preprocess
    with dt[0]:
        batch = self.preprocess(batch)

    print(batch["img"].shape)

    # Inference
    with dt[1]:
        preds = model(batch["img"], augment=augment)

    # Loss
    with dt[2]:
        if self.training:
            self.loss += model.loss(batch, preds)[1]

    # Postprocess
    with dt[3]:
        preds = self.postprocess(preds)

    self.update_metrics(preds, batch)
    if self.args.plots and batch_i < 3:
        self.plot_val_samples(batch, batch_i)
        self.plot_predictions(batch, preds, batch_i)

This code is in

ultralytics/ultralytics/engine/validator.py

Lines 169 to 194 in 605e7f4

    
for batch_i, batch in enumerate(bar):
    self.run_callbacks("on_val_batch_start")
    self.batch_i = batch_i
    # Preprocess
    with dt[0]:
        batch = self.preprocess(batch)

    # Inference
    with dt[1]:
        preds = model(batch["img"], augment=augment)

    # Loss
    with dt[2]:
        if self.training:
            self.loss += model.loss(batch, preds)[1]

    # Postprocess
    with dt[3]:
        preds = self.postprocess(preds)

    self.update_metrics(preds, batch)
    if self.args.plots and batch_i < 3:
        self.plot_val_samples(batch, batch_i)
        self.plot_predictions(batch, preds, batch_i)

    self.run_callbacks("on_val_batch_end")

The training code is:

from ultralytics import YOLOv10
import torch
model = YOLOv10("yolov10n.yaml")
model.train(
    data="coco.yaml",
    epochs=100,
    batch=256,
    imgsz=512,
    optimizer="SGD",
    lr0=0.01,
    lrf=0.0001,
    plots=True,
    fna=True,
    device=[0, 1, 2, 3, 4, 5, 6, 7],
)

The expected output is torch.Size([64, 3, 512, 512]), but the actual output is torch.Size([64, 3, 320, 544]).

I am confused by this output, so my questions are:

What is the imgsz argument used for?
What can I do if I want the shape to be torch.Size([64, 3, 512, 512])?

Additional

No response

github-actions · 2024-06-17T16:07:27Z

👋 Hello @Rbrq03, thank you for your interest in Ultralytics YOLOv8 🚀! We recommend a visit to the Docs for new users where you can find many Python and CLI usage examples and where many of the most common questions may already be answered.

If this is a 🐛 Bug Report, please provide a minimum reproducible example to help us debug it.

If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our Tips for Best Training Results.

Join the vibrant Ultralytics Discord 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.

Install

Pip install the ultralytics package including all requirements in a Python>=3.8 environment with PyTorch>=1.8.

pip install ultralytics

Environments

YOLOv8 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Notebooks with free GPU:
Google Cloud Deep Learning VM. See GCP Quickstart Guide
Amazon Deep Learning AMI. See AWS Quickstart Guide
Docker Image. See Docker Quickstart Guide

Status

If this badge is green, all Ultralytics CI tests are currently passing. CI tests verify correct operation of all YOLOv8 Modes and Tasks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

Rbrq03 · 2024-06-17T16:10:11Z

Additional logs are shown below:

engine/trainer: task=detect, mode=train, model=yolov10n.yaml, data=coco.yaml, epochs=100, time=None, patience=100,
 batch=256, imgsz=512, save=True, save_period=-1, val_period=1, cache=False, device=[0, 1, 2, 3, 4, 5, 6, 7], workers=8, 
project=None, name=train35, exist_ok=False, pretrained=True, optimizer=SGD, verbose=True, seed=0, deterministic=False, 
single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, 
freeze=None, multi_scale=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False,
 save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, vid_stride=1, 
stream_buffer=False, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, 
embed=None, show=False, save_frames=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, 
show_conf=True, show_boxes=True, line_width=None, format=torchscript, keras=False, optimize=False, int8=False, 
dynamic=False, simplify=False, opset=None, workspace=4, nms=False, lr0=0.01, lrf=0.0001, momentum=0.937, 
weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=7.5, cls=0.5, dfl=1.5, 
pose=12.0, kobj=1.0, label_smoothing=0.0, nbs=64, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, 
scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, bgr=0.0, mosaic=1.0, mixup=0.0, copy_paste=0.0,
 auto_augment=randaugment, erasing=0.4, crop_fraction=1.0, cfg=None, tracker=botsort.yaml,

Freezing layer 'model.23.dfl.conv.weight'
AMP: running Automatic Mixed Precision (AMP) checks with YOLOv8n...
AMP: checks passed ✅
train: Scanning /opt/data/private/hjn/fna-detection/datasets/coco/labels/train2017.cache... 117266 images, 1021 backgrounds, 0 corrupt: 100%|██████████| 118287/118287 [00:00<?, ?it/s]
val: Scanning /opt/data/private/hjn/fna-detection/datasets/coco/labels/val2017.cache... 4952 images, 48 backgrounds, 0 corrupt: 100%|██████████| 5000/5000 [00:00<?, ?it/s]
Plotting labels to /opt/data/private/hjn/fna-detection/yolov10/runs/detect/train35/labels.jpg... 
optimizer: SGD(lr=0.01, momentum=0.937) with parameter groups 95 weight(decay=0.0), 108 weight(decay=0.002), 107 bias(decay=0.0)
Image sizes 512 train, 512 val
Using 64 dataloader workers
Logging results to /opt/data/private/hjn/fna-detection/yolov10/runs/detect/train35
Starting training for 100 epochs...

      Epoch    GPU_mem     box_om     cls_om     dfl_om     box_oo     cls_oo     dfl_oo  Instances       Size
      1/100      5.42G      3.562      5.044      3.487      3.284      6.018      3.206         30        512: 100%|██████████| 463/463 [01:27<00:00,  5.29it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95):   0%|          | 0/79 [00:00<?, ?it/s]torch.Size([64, 3, 288, 544])
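For reference, the shape printed at the end of this log can be reproduced with simple arithmetic. The sketch below is only an assumption about how a rectangular (minimum-padding) batch might be sized: the long side is scaled to imgsz, half a stride of extra padding is added, and each side is rounded up to a multiple of the stride. The function name rect_batch_shape, the stride of 32, and the pad value of 0.5 are illustrative choices, not values confirmed in this thread.

```python
import math


def rect_batch_shape(height, width, imgsz=512, stride=32, pad=0.5):
    """Return a hypothetical (height, width) for a rectangular batch.

    Assumption: the longer side is scaled to imgsz, then each side is
    rounded up to a multiple of `stride` after adding `pad` strides.
    """
    ratio = height / width
    # Normalise so the longer side has relative length 1.
    rel_h, rel_w = (ratio, 1.0) if ratio < 1 else (1.0, 1.0 / ratio)
    out_h = math.ceil(rel_h * imgsz / stride + pad) * stride
    out_w = math.ceil(rel_w * imgsz / stride + pad) * stride
    return out_h, out_w


print(rect_batch_shape(320, 640))  # landscape image, aspect ratio 0.5
print(rect_batch_shape(360, 640))  # landscape image, aspect ratio 0.5625
```

Under these assumptions, a 320x640 image maps to 288x544 and a 360x640 image maps to 320x544, which matches the two shapes reported in this issue. A real dataloader would pick one shape per batch rather than per image.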

glenn-jocher · 2024-06-17T18:15:44Z

@Rbrq03 hello!

Thank you for reaching out with your question regarding the imgsz argument in validation.

The imgsz argument is used to specify the target image size for both training and validation. When you set imgsz=512, it indicates that the images should be resized to 512x512 pixels. However, in practice, the actual image dimensions may vary slightly due to the aspect ratio preservation and padding applied during preprocessing.

From your provided logs, the images are being resized to non-square shapes such as torch.Size([64, 3, 288, 544]) rather than exactly 512x512. This discrepancy is due to the rect argument, which is set to False by default. When rect is False, the images are resized while preserving their aspect ratios, and padding is added to match the target size.
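The aspect-preserving resize followed by padding to the target size, as described above, can be sketched as follows. This is a minimal illustration rather than the library's own preprocessing code: the helper name letterbox_to_square and the choice to split the padding evenly between both sides are assumptions.

```python
def letterbox_to_square(height, width, imgsz=512):
    """Resize keeping aspect ratio, then pad to an imgsz x imgsz square.

    Returns the resized (height, width) and the padding as
    (top, bottom, left, right). Illustrative helper only.
    """
    scale = imgsz / max(height, width)  # long side becomes imgsz
    new_h, new_w = round(height * scale), round(width * scale)
    pad_h, pad_w = imgsz - new_h, imgsz - new_w
    top, left = pad_h // 2, pad_w // 2  # split padding evenly
    return (new_h, new_w), (top, pad_h - top, left, pad_w - left)


resized, padding = letterbox_to_square(480, 640)
print(resized, padding)
```

For a 480x640 image and imgsz=512, the image is first resized to 384x512 and then padded by 64 pixels at the top and bottom, giving a 512x512 result.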

To ensure that the images are resized to exactly 512x512, you can set the rect argument to True. This will enforce rectangular training and validation, resizing the images to the specified dimensions without preserving the aspect ratio.

Here’s how you can modify your training script to achieve this:

from ultralytics import YOLOv10
import torch

model = YOLOv10("yolov10n.yaml")
model.train(
    data="coco.yaml",
    epochs=100,
    batch=256,
    imgsz=512,
    optimizer="SGD",
    lr0=0.01,
    lrf=0.0001,
    plots=True,
    fna=True,
    device=[0, 1, 2, 3, 4, 5, 6, 7],
    rect=True  # Add this line to enforce rectangular resizing
)

By setting rect=True, the images will be resized to exactly 512x512 during validation, ensuring consistent dimensions.

If you have any further questions or need additional assistance, feel free to ask. Happy training! 😊

Rbrq03 · 2024-06-18T03:25:10Z

Thanks @glenn-jocher! My further concern is that the shape I print is expected to be 512x512, since the batch is fed directly into the model's forward pass. Shouldn't it be the shape after padding? What causes this, and what should I do?
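On whether a non-square batch can be fed into the forward pass at all: a fully convolutional detector generally only needs each side to be divisible by its largest stride. The sketch below assumes output strides of 8, 16, and 32, which is common for YOLO-style models but is not stated in this thread. The function name feature_map_sizes is hypothetical.

```python
def feature_map_sizes(height, width, strides=(8, 16, 32)):
    """Return the (height, width) of the feature map at each stride.

    Assumption: the model downsamples by the given strides, so both
    input sides must be divisible by the largest one.
    """
    largest = max(strides)
    if height % largest or width % largest:
        raise ValueError(f"sides must be multiples of {largest}")
    return [(height // s, width // s) for s in strides]


print(feature_map_sizes(288, 544))
print(feature_map_sizes(512, 512))
```

Under these assumptions both 288x544 and 512x512 produce whole-numbered feature maps, so either shape is a valid model input. Only the number of grid cells differs.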

glenn-jocher · 2024-06-18T07:52:06Z

Hello @Rbrq03,

Thank you for your follow-up question!

To ensure that your images are resized to exactly 512x512 before being fed into the model, you should set the rect argument to True in your training script. This will enforce rectangular resizing, ensuring that the images are resized to the specified dimensions without preserving the aspect ratio.

Here’s how you can modify your training script:

from ultralytics import YOLOv10
import torch

model = YOLOv10("yolov10n.yaml")
model.train(
    data="coco.yaml",
    epochs=100,
    batch=256,
    imgsz=512,
    optimizer="SGD",
    lr0=0.01,
    lrf=0.0001,
    plots=True,
    fna=True,
    device=[0, 1, 2, 3, 4, 5, 6, 7],
    rect=True  # Add this line to enforce rectangular resizing
)

By setting rect=True, the images will be resized to exactly 512x512 during validation, ensuring consistent dimensions that match your expectation.

If you continue to experience issues, please ensure you are using the latest versions of torch and ultralytics. If the problem persists, providing a minimum reproducible code example would be very helpful for us to investigate further. You can find more details on how to create one here: https://docs.ultralytics.com/help/minimum_reproducible_example.

Feel free to reach out if you have any more questions! 😊

Rbrq03 · 2024-06-18T10:52:35Z

Thanks @plashchynski! It solves my problem.

Rbrq03 added the question Further information is requested label Jun 17, 2024

Rbrq03 closed this as completed Jun 18, 2024
