Resizing: keeping aspect ratio, or not #232

Open
iraadit opened this issue Oct 16, 2017 · 14 comments
Labels: Explanations, question

Comments


iraadit commented Oct 16, 2017

Hi,

In your implementation of Darknet, the aspect ratio of the image is not kept when resizing. The function get_image_from_stream_resize:

image get_image_from_stream_resize(CvCapture *cap, int w, int h, IplImage** in_img)

is used instead of the original resizing function letterbox_image_into from the pjreddie version:
void letterbox_image_into(image im, int w, int h, image boxed)

https://github.com/pjreddie/darknet/blob/532c6e1481e78ba0e27c56f4492a7e8d3cc36597/src/image.c#L913
(in the original repo)
which resizes the image while keeping the aspect ratio and puts it in a letterbox.

It is the same in the OpenCV implementation you pushed some days ago (the aspect ratio is not kept):
https://github.com/opencv/opencv/blob/73af899b7c737677f008b831c8e61eaeb2984342/samples/dnn/yolo_object_detection.cpp#L60

Why this difference?

It seems to me it is the old behavior of YOLO (v1): https://github.com/pjreddie/darknet/blame/179ed8ec76f329eb22360440c3836fdcb2560330/src/demo.c#L44
Why didn't you update this behavior in the same way?

It seems to me that not keeping the aspect ratio means you HAVE TO use the same aspect ratio for the training images and the test/application images/videos (and I saw you write that in several places).

Does this mean that a network trained with one version of YOLO isn't fully usable with the other one (i.e., giving the same results)?

In my case, I have several training images, but some of them are in portrait orientation, while the application video will be in landscape orientation. Does that mean I can't use them? (Only with your implementation that doesn't keep the aspect ratio, or in any case?)

Thank you

@AlexeyAB (Owner)

Hi,

In the OpenCV version of Yolo you can keep the aspect ratio right now; just replace this code:
https://github.com/opencv/opencv/blob/73af899b7c737677f008b831c8e61eaeb2984342/samples/dnn/yolo_object_detection.cpp#L58-L65

    //! [Resizing without keeping aspect ratio]
    cv::Mat resized;
    cv::resize(frame, resized, cv::Size(network_width, network_height));
    //! [Resizing without keeping aspect ratio]

    //! [Prepare blob]
    Mat inputBlob = blobFromImage(resized, 1 / 255.F);
    //! [Prepare blob]

to this:

    //! [Prepare blob]
    Mat inputBlob = blobFromImage(frame, 1 / 255.F, cv::Size(network_width, network_height)); //Convert Mat to batch of images
    //! [Prepare blob]

@AlexeyAB (Owner)

There are at least 3 versions of Yolo:

  1. Yolo v1 with fully connected layers: https://pjreddie.com/darknet/yolov1/
  2. Yolo v2, a fully convolutional network (yolo.2.0.cfg and yolo-voc.2.0.cfg) - the one used in my fork: https://arxiv.org/pdf/1612.08242.pdf
  3. Yolo v2.x with the aspect ratio kept - the current version (since 10 Apr 2017): https://pjreddie.com/darknet/yolo/

It seems to me it is the old behavior of YOLO (v1): https://github.com/pjreddie/darknet/blame/179ed8ec76f329eb22360440c3836fdcb2560330/src/demo.c#L44

No. Yolo v1 used fully connected layers and the file yolo_demo.c instead of demo.c, and had too low accuracy. You can find Yolo v1 here: https://github.com/AlexeyAB/yolo-windows

This fork fully corresponds to the Yolo v2 that uses yolo-voc.2.0.cfg or yolo.2.0.cfg, with accuracy 78.6 mAP (VOC 2007), 73.4 mAP (VOC 2012), 44.0 mAP (COCO - table 5): https://arxiv.org/abs/1612.08242

Now, with the aspect ratio kept, we can get about 48.1 mAP (COCO), so it adds about +4.1 mAP for COCO: https://pjreddie.com/darknet/yolo/


Why didn't you update this behavior in the same way?

Maybe I will update it later. Joseph may soon release a new version of Yolo with new improvements, and I'll add it all together.

This version of Yolo v2 works a bit worse when the training and detection datasets have different aspect ratios, but it works. Aspect-ratio invariance is achieved by using a crop that depends on the jitter parameter in the .cfg file:

image cropped = crop_image(orig, pleft, ptop, swidth, sheight);
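
For illustration, here is a rough OpenCV C++ sketch of that jitter-based crop (the actual Darknet code works on its own image struct and handles regions outside the image differently, so treat this only as a simplified approximation):

    #include <opencv2/opencv.hpp>
    #include <algorithm>
    #include <random>

    // Simplified jitter crop: draw random offsets for each border, up to
    // +/- jitter * (image dimension), and crop that region. Because the four
    // offsets are drawn independently, the crop's aspect ratio varies from
    // sample to sample, which is what gives some aspect-ratio invariance
    // during training.
    cv::Mat jitter_crop(const cv::Mat &orig, float jitter, std::mt19937 &rng)
    {
        const int ow = orig.cols, oh = orig.rows;
        std::uniform_int_distribution<int> dx(-(int)(ow * jitter), (int)(ow * jitter));
        std::uniform_int_distribution<int> dy(-(int)(oh * jitter), (int)(oh * jitter));

        const int pleft = dx(rng), pright = dx(rng);
        const int ptop  = dy(rng), pbot   = dy(rng);
        const int swidth  = ow - pleft - pright;   // cropped width
        const int sheight = oh - ptop  - pbot;     // cropped height

        // Clamp the crop rectangle to the image for simplicity;
        // assumes jitter < 0.5 so the crop sizes stay positive.
        const int x = std::max(pleft, 0), y = std::max(ptop, 0);
        const int w = std::min(swidth,  ow - x);
        const int h = std::min(sheight, oh - y);
        return orig(cv::Rect(x, y, w, h)).clone();
    }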


iraadit commented Oct 16, 2017

Thank you once again for your complete answer.

If I modify your code to keep the aspect ratio, would you be interested in a Pull Request?

iraadit closed this as completed Oct 16, 2017

iraadit commented Oct 16, 2017

For OpenCV, looking at the definition of blobFromImage, it appears to me that the behaviour is different from the aspect-ratio-keeping resize in pjreddie's Darknet:

input image is resized so one side after resize is equal to corresponding dimension in size and another one is equal or larger. Then, crop from the center is performed.

It seems to resize and then cut off the parts of the image that don't fit in a square, instead of adding black margins (letterbox).
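
For comparison, a minimal OpenCV C++ sketch of letterbox-style resizing (not the actual Darknet letterbox_image(); as far as I can tell Darknet fills the border with mid-gray, black is used here only as an example):

    #include <opencv2/opencv.hpp>
    #include <algorithm>

    // Letterbox resize: scale the image so it fits inside w x h while keeping
    // the aspect ratio, then pad the remaining area with a constant color.
    cv::Mat letterbox(const cv::Mat &frame, int w, int h)
    {
        const float scale = std::min((float)w / frame.cols, (float)h / frame.rows);
        const int new_w = (int)(frame.cols * scale);
        const int new_h = (int)(frame.rows * scale);

        cv::Mat resized;
        cv::resize(frame, resized, cv::Size(new_w, new_h));

        const int top  = (h - new_h) / 2, bottom = h - new_h - top;
        const int left = (w - new_w) / 2, right  = w - new_w - left;

        cv::Mat boxed;
        cv::copyMakeBorder(resized, boxed, top, bottom, left, right,
                           cv::BORDER_CONSTANT, cv::Scalar(0, 0, 0));
        return boxed;  // w x h, whole image visible, aspect ratio preserved
    }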

iraadit reopened this Oct 16, 2017

AlexeyAB commented Oct 16, 2017

Yes, you are right, blobFromImage does it in a different way than Darknet.

But there are trade-offs in all cases - below is an example of resizing an image in different ways:

  1. blobFromImage(): object size 71 x 43, keeps the aspect ratio, but part of the image is lost (cropped)
  2. letterbox_image(): the smallest object size, 48 x 28, keeps the aspect ratio and sees the whole image
  3. resize_image(): object size 48 x 43, sees the whole image but doesn't keep the aspect ratio

If I modify your code to keep the aspect ratio, would you be interested in a Pull Request?

A known Yolo problem is that small objects are difficult to detect, and letterbox_image() gives the smallest object size, 48 x 28.

So yes, I'll apply your pull request, but I think there should be an if-branch that depends on a command-line flag, which lets us use either the current resize_image() version (without keeping the aspect ratio) or the letterbox_image() version (keeping the aspect ratio); see the sketch after the examples below.


For example:

  1. Original image: [image: air]

  2. Resized (416x416) with keeping the aspect ratio - OpenCV blobFromImage(): [image: air_416x416_cropped]

  3. Resized (416x416) with keeping the aspect ratio - Darknet letterbox_image(): [image: air_416x416_letterbox]

  4. Resized (416x416) without keeping the aspect ratio - this fork of Darknet resize_image(): [image: air_416x416]
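
A hedged sketch of the flag-controlled branch described above, reusing the letterbox() helper sketched in the earlier comment (these names are illustrative, not Darknet's actual API):

    #include <opencv2/opencv.hpp>

    // Assumes the letterbox() helper sketched earlier in this thread.
    cv::Mat letterbox(const cv::Mat &frame, int w, int h);

    // Choose between stretching (current behavior) and letterboxing,
    // controlled by a flag such as a command-line option.
    cv::Mat prepare_input(const cv::Mat &frame, int net_w, int net_h, bool letter_box)
    {
        if (letter_box)
            return letterbox(frame, net_w, net_h);           // keep aspect ratio, pad the rest
        cv::Mat resized;
        cv::resize(frame, resized, cv::Size(net_w, net_h));  // stretch to net size, no padding
        return resized;
    }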

@2598Nitz

Hi AlexeyAB,
I found the letterbox_image function in the image.c file in your repo, but it seems that none of the files use this function. get_image_from_stream_resize is also in image.c and is used in demo.c. So if I replace get_image_from_stream_resize with letterbox_image in demo.c, will the aspect ratio be maintained by padding with black margins as described above?


AlexeyAB commented Jun 21, 2018

@2598Nitz

Update your code from GitHub.

To use letterbox:

  • for video - change this flag
    static int letter_box = 0;
    to
    static int letter_box = 1;


2598Nitz commented Jun 21, 2018

Thanks for your reply.
My purpose is to implement the letterbox function at both train and test time. I still have a few questions regarding the resizing of images.

  1. The above changes that you mentioned, are they for training or for testing? If they are for testing, what changes should I make to implement the letterbox function when training on a custom dataset?
    I assume demo.c is for test time and detector.c is for training.

  2. There's one more line in the detector.c file (line no. 495) where letterbox is defined. Is that for training purposes? Should I also set it to 1?

  3. Also, does the resizing implementation vary between different versions of Yolo?


shubham-shahh commented Oct 3, 2020

Should I use varied aspect ratios for training or not? In the end, I have to do inference at a constant aspect ratio. And in the .cfg file, what should the width and height be, if my training images range from 150 (width) x 80 (height) to 600 (width) x 200 (height)? I am using YOLOv4.
Thanks @AlexeyAB

@developer0hye

@AlexeyAB

I know it's an old issue, but I think it is very important.

You said the resize method that keeps the aspect ratio improved the model's performance.

#232 (comment)

But it seems that your preprocessing method is based on the resize method that does not keep the aspect ratio.

Is there any reason for that?

@AlexeyAB (Owner)

@developer0hye

On MS COCO, letter_box=1 works well (for training and detection) if you use a network resolution of width=608 height=608 or higher in the cfg file.

@developer0hye

@AlexeyAB
Thanks!
