data augmentation zoom #3830

Closed
LukeAI opened this issue Aug 28, 2019 · 20 comments

Comments

@LukeAI

LukeAI commented Aug 28, 2019

Is anybody aware of a script to augment my data by randomly cropping (zooming in on smaller objects) and adjusting the corresponding bounding box annotations accordingly?
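For reference, the annotation side of such a script could look roughly like the sketch below. It assumes YOLO-style labels (class, cx, cy, w, h, all normalized to [0, 1]); the function names and the `min_visible` threshold are hypothetical and not part of darknet. The crop window uses the same fraction of width and height, so the aspect ratio is preserved. Cropping the pixels themselves with an image library is left out.

```python
import random


def random_zoom_window(zoom, rng=random):
    """Pick a crop window covering 1/zoom of the width and of the height.

    Using the same fraction on both axes keeps the aspect ratio unchanged.
    Returns (x0, y0, x1, y1) in normalized [0, 1] image coordinates.
    """
    if zoom < 1.0:
        raise ValueError("zoom must be >= 1")
    side = 1.0 / zoom
    x0 = rng.uniform(0.0, 1.0 - side)
    y0 = rng.uniform(0.0, 1.0 - side)
    return (x0, y0, x0 + side, y0 + side)


def crop_yolo_boxes(boxes, window, min_visible=0.3):
    """Re-express YOLO boxes (cls, cx, cy, w, h) relative to a crop window.

    Boxes are clipped to the window. A box is dropped when less than
    `min_visible` of its original area remains inside the crop
    (the threshold is an assumption of this sketch).
    """
    wx0, wy0, wx1, wy1 = window
    ww, wh = wx1 - wx0, wy1 - wy0
    out = []
    for cls, cx, cy, w, h in boxes:
        # Corners of the box in the original image.
        bx0, by0 = cx - w / 2, cy - h / 2
        bx1, by1 = cx + w / 2, cy + h / 2
        # Intersection of the box with the crop window.
        ix0, iy0 = max(bx0, wx0), max(by0, wy0)
        ix1, iy1 = min(bx1, wx1), min(by1, wy1)
        if ix1 <= ix0 or iy1 <= iy0:
            continue  # box lies entirely outside the crop
        if (ix1 - ix0) * (iy1 - iy0) < min_visible * w * h:
            continue  # too little of the object is left
        # Renormalize the clipped box to the crop's own coordinate frame.
        out.append((cls,
                    ((ix0 + ix1) / 2 - wx0) / ww,
                    ((iy0 + iy1) / 2 - wy0) / wh,
                    (ix1 - ix0) / ww,
                    (iy1 - iy0) / wh))
    return out
```

With a centered 2x zoom window `(0.25, 0.25, 0.75, 0.75)`, a box of size 0.2 x 0.2 at the image center stays centered and doubles to 0.4 x 0.4 in the cropped image's coordinates.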

@LukeAI
Author

LukeAI commented Aug 28, 2019

I was wondering why this isn't part of the darknet augmentation, at least as a cfg option? Might it not be an effective way to help Yolo move towards scale invariance if it saw zoomed-in versions of the smaller objects during training?

@AlexeyAB
Owner

jitter= and random=1 in the [yolo] layer do these things: https://github.com/AlexeyAB/darknet/wiki/CFG-Parameters-in-the-different-layers

@LukeAI
Author

LukeAI commented Aug 28, 2019

Ah, so I did read those:

jitter=.3 - randomly crops and resizes images with changing aspect ratio from x(1 - 2*jitter) to x(1 + 2*jitter) (data augmentation parameter is used only from the last layer)

random=1 - randomly resizes network for each 10 iterations from 1/1.4 to 1.4 (data augmentation parameter is used only from the last layer)

So random=1 will essentially blur the whole image - and jitter will randomly crop and also warp the aspect ratio. What are the constraints on the random cropping - how small might the crops go?

@AlexeyAB
Owner

random=1 will not blur the image; it will resize the network (and then resize the image to this network size).
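A rough sketch of what "resizing the network" means here, based only on the 1/1.4 to 1.4 range quoted from the wiki. Rounding to a multiple of 32 and sampling uniformly are assumptions of this sketch, and `random_network_size` is a hypothetical helper, not a darknet function.

```python
import random


def random_network_size(base, coef=1.4, step=32, rng=random):
    """Pick a network side length near the range [base / coef, base * coef].

    Rounding to a multiple of `step` is an assumption (YOLO networks
    downsample by 32); the exact rounding used by darknet may differ.
    """
    scale = rng.uniform(1.0 / coef, coef)
    size = int(round(base * scale / step)) * step
    return max(step, size)


# A new size would be drawn every 10 iterations; each image in the batch
# is then resized to that network size, so nothing is blurred on purpose.
rng = random.Random(0)
for iteration in range(0, 30, 10):
    print(iteration, random_network_size(416, rng=rng))
```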

Yes, jitter= will randomly crop, resize and change the aspect ratio.
For example, it can crop a part of the image as small as (1 - 2*jitter) x (1 - 2*jitter) of the original, so if jitter=0.3 then
the smallest part of the image that can be cropped is 0.4 x 0.4, i.e. the cropped content will be enlarged 2.5x (both width and height are increased 2.5x).
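The arithmetic above can be checked with a few lines. This only encodes the (1 - 2*jitter) formula stated in this reply; `jitter_crop_bounds` is a hypothetical helper name.

```python
def jitter_crop_bounds(jitter):
    """Return (smallest crop fraction per side, largest zoom factor)."""
    if not 0.0 <= jitter < 0.5:
        raise ValueError("this formula needs 0 <= jitter < 0.5")
    min_fraction = 1.0 - 2.0 * jitter
    return min_fraction, 1.0 / min_fraction


frac, zoom = jitter_crop_bounds(0.3)
print(f"smallest crop: {frac:.2f} x {frac:.2f}, zoom up to {zoom:.2f}x")
# prints: smallest crop: 0.40 x 0.40, zoom up to 2.50x
```

Under the same formula, jitter=0.4 would allow crops down to 0.2 x 0.2, i.e. up to a 5x zoom.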

@LukeAI
Author

LukeAI commented Aug 28, 2019

Is there any way that I could get greater cropping without drastically warping the image?

@AlexeyAB
Owner

AlexeyAB commented Aug 28, 2019

Do you want to just change the size without changing the aspect ratio?

@LukeAI
Author

LukeAI commented Aug 28, 2019

Well, my dataset is fairly high resolution images - almost 4 times the width and height of the network and it's biased towards objects in the distance, but I want the model to be better at detecting objects closer up. Maybe jitter=0.4 would be ok for the purposes of cropping - I'm a little concerned that it may be too much distortion?

AlexeyAB added the "want enhancement" label (Want to improve accuracy, speed or functionality) on Aug 28, 2019
@Hwijune

Hwijune commented Aug 29, 2019

Rotation augmentation is also required.

@LukeAI
Author

LukeAI commented Aug 29, 2019

My thinking is: for my dataset, the shapes of the objects never change; they are all rigid squares and rectangles, so jitter augmentation may be counterproductive if it stops the detector from learning that. The zoom/crop effect does sound very useful though; I want the detector to be as scale-invariant as possible.

@AlexeyAB
Owner

Now you can use resize=2.0; it will randomly resize and move the image.

This will resize the image by a factor in [1/2; 2]:

[yolo]
jitter=0.0
resize=2.0

This will resize the image by a factor in [1/2; 1]:

[yolo]
jitter=0.0
resize=0.5
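The two ranges above can be reproduced with the small sketch below. The rule in `resize_range` is only inferred from these two examples, and uniform sampling is an assumption; the darknet source may compute the range differently.

```python
import random


def resize_range(resize):
    """Scale range implied by the two examples in this comment.

    resize=2.0 -> (0.5, 2.0) and resize=0.5 -> (0.5, 1.0).
    """
    if resize <= 0:
        raise ValueError("resize must be positive")
    low = min(resize, 1.0 / resize)
    high = max(resize, 1.0)
    return low, high


def sample_scale(resize, rng=random):
    """Draw one scale factor; uniform sampling is an assumption."""
    low, high = resize_range(resize)
    return rng.uniform(low, high)


print(resize_range(2.0))  # (0.5, 2.0)
print(resize_range(0.5))  # (0.5, 1.0)
```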

AlexeyAB added the "enhancement" label and removed the "want enhancement" label (Want to improve accuracy, speed or functionality) on May 13, 2020
@LukeAI
Author

LukeAI commented May 13, 2020

so to clarify: resize=0.5 will randomly crop away 50% of the area of each image?

@AlexeyAB
Owner

AlexeyAB commented May 13, 2020

resize=0.5 - image width and height will be reduced by 2 times (image area will be reduced by 4 times)


You can see how this works by using the -show_imgs flag at the end of the training command.
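To make the width/height versus area distinction in the resize=0.5 answer concrete, here is a tiny sketch. The 608 x 608 input is just an example size, and `scaled_size` is a hypothetical helper.

```python
def scaled_size(width, height, scale):
    """Return the resized (width, height) and the new-area / old-area ratio."""
    new_w = round(width * scale)
    new_h = round(height * scale)
    area_ratio = (new_w * new_h) / (width * height)
    return new_w, new_h, area_ratio


print(scaled_size(608, 608, 0.5))  # (304, 304, 0.25)
```

Halving both sides leaves a quarter of the area, which is why resize=0.5 does not correspond to removing 50% of the area.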

@AlexeyAB
Owner

Because you get NaN loss.
You are doing something wrong.

@AlexeyAB
Owner

AlexeyAB commented May 14, 2020

  • What dataset did you use?

  • Try the -show_imgs flag: do you see correct bboxes?

  • What training command did you use?

  • Show a screenshot like this:
    image

  • Attach your cfg-file.

  • Try to train with jitter=0.0 resize=1.5 for each of the 3 [yolo] layers.
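Setting jitter and resize in every [yolo] section by hand is easy to get wrong, so a small helper script may be useful. This is a sketch only: `set_yolo_params` is a hypothetical function, and it works line by line because darknet cfg files repeat section names, which configparser does not accept.

```python
def set_yolo_params(cfg_text, params):
    """Set key=value pairs inside every [yolo] section of a darknet cfg.

    Existing keys are overwritten in place; keys a section lacks are
    appended at the end of that section.
    """
    out = []
    in_yolo = False
    missing = {}

    def close_section():
        # Add keys the [yolo] section lacked, before its trailing blank lines.
        if in_yolo and missing:
            tail = []
            while out and not out[-1].strip():
                tail.append(out.pop())
            out.extend(f"{k}={v}" for k, v in missing.items())
            out.extend(reversed(tail))

    for line in cfg_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            close_section()
            in_yolo = stripped == "[yolo]"
            missing = dict(params) if in_yolo else {}
            out.append(line)
            continue
        key = stripped.split("=", 1)[0].strip()
        is_setting = "=" in stripped and not stripped.startswith("#")
        if in_yolo and is_setting and key in missing:
            out.append(f"{key}={missing.pop(key)}")
        else:
            out.append(line)
    close_section()
    return "\n".join(out) + "\n"


example = "[convolutional]\nfilters=255\n\n[yolo]\njitter=.3\nrandom=1\n\n[yolo]\nmask=0,1,2\n"
print(set_yolo_params(example, {"jitter": "0.0", "resize": "1.5"}))
```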

@LukeAI
Author

LukeAI commented May 14, 2020

I was doing something wrong. I accidentally had burn_in set too low. I set it back to the default and got this (baseline is around 76% mAP):
image

I'll try again with your suggested jitter and resize.

@AlexeyAB
Owner

AlexeyAB commented May 14, 2020

  • If your training/validation/test images have the same size - use jitter=0.0 resize=1.5 and preferably letter_box=0

  • If your training/validation/test images have different sizes - use letter_box=1 jitter=0.0 resize=1.5 or letter_box=0 jitter=0.3 resize=0.0
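The reason image size matters for this choice can be seen from the geometry. The sketch below contrasts fitting an image into the network input while keeping its aspect ratio (the letter_box=1 idea) with simply stretching it (letter_box=0). Function names and the centered padding are assumptions of this sketch; darknet's exact rounding and padding may differ.

```python
def letterbox_geometry(img_w, img_h, net_w, net_h):
    """Resized size and padding when the aspect ratio is kept."""
    scale = min(net_w / img_w, net_h / img_h)
    new_w = round(img_w * scale)
    new_h = round(img_h * scale)
    # Center the resized image; the remaining border is padding.
    pad_x = (net_w - new_w) // 2
    pad_y = (net_h - new_h) // 2
    return new_w, new_h, pad_x, pad_y


def stretch_factors(img_w, img_h, net_w, net_h):
    """Per-axis scale when the image is stretched to the network size."""
    return net_w / img_w, net_h / img_h


print(letterbox_geometry(1920, 1080, 608, 608))  # (608, 342, 0, 133)
print(stretch_factors(1920, 1080, 608, 608))
```

When all images share one size, stretching distorts every image identically at training and test time, so objects keep a consistent shape. With mixed sizes the distortion varies per image, which is what letter_box=1 or jitter would compensate for.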

@LukeAI
Author

LukeAI commented May 14, 2020

training/validation/test are the same size. Is letter_box=0 not the default behaviour?

@AlexeyAB
Owner

letter_box=0 is the default behaviour

@LukeAI
Author

LukeAI commented May 14, 2020

training with resize=1.5, mosaic=0, jitter=0, letter_box=0, random=0
AP is down from the ~73% AP that I got with default yolov4 at same size.

resize=1.5
image

baseline:
small

@AlexeyAB
Owner

baseline:

Which params (resize=, mosaic=, jitter=, letter_box=, random=) did you use for the baseline?

It seems that your image sizes are not the same, or the aspect ratios of the objects are not the same.
