Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Add standard ResNet data augmentation for ImageRecordIter #11027

Merged
merged 40 commits into from
Jun 19, 2018

Conversation

hetong007
Copy link
Contributor

@hetong007 hetong007 commented May 23, 2018

Description

Add standard ResNet data augmentation for ImageRecordIter.

Specifically, this augmentation performs a "random resized crop", which crop the image with respect to a random aspect ratio, and a random area, and resize it to the desired size.

  • If random_resized_crop=True, existing max_random_scale and min_random_scale will be ignored.
  • If random_resized_crop=False, newly-added max_random_area and min_random_area will be ignored.

Besides, color jittering and pca noise are added as additional data augmentation options.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

@piiswrong
Copy link
Contributor

Needs to add jitter pca noise etc

@rahul003
Copy link
Member

We might want to add this option to the standard image classification example: image_classification/common/data.py

@hetong007 hetong007 requested a review from szha as a code owner May 23, 2018 23:12
@@ -218,6 +257,80 @@ class DefaultImageAugmenter : public ImageAugmenter {
res = src;
}

if (param_.random_resized_crop) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add checks for the mutual exclusion relationship between random_resized_crop and (resize, crop, ctc)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean unittest, or sanity checks in C++?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sanity checks

@@ -31,7 +31,7 @@
data.add_data_args(parser)
data.add_data_aug_args(parser)
# use a large aug level
data.set_data_aug_level(parser, 3)
# data.set_data_aug_level(parser, 3)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The set_data_aug_level doesn't set standard imagenet training augmentation.

data_shape = data_shape,
batch_size = batch_size,
random_resized_crop = True,
max_aspect_ratio = 0.25,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does 0.25 mean?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it the same as (3/4, 4/3)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I realized it's (3/4, 5/4), which is different. Fixing.

DMLC_DECLARE_FIELD(saturation).set_default(0.0f)
.describe("Add a random value in ``[-saturation, saturation]`` to "
"the saturation of image.");
DMLC_DECLARE_FIELD(pca_noise).set_default(0.0f)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add description of what the value would mean in this case

help='max area to crop in random resized crop, whose range is [0, 1]')
aug.add_argument('--min-random-area', type=float, default=1,
help='min area to crop in random resized crop, whose range is [0, 1]')
aug.add_argument('--brightness', type=float, default=0,
Copy link
Member

@rahul003 rahul003 Jun 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to set defaults to the values we know work well, so that this script is runnable easily. Especially since we no longer set the aug level in train_imagenet.py

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My opinion is that by setting default values to perform no augmentation, users know exactly what kind of augmentation has been applied to the pipeline. It is transparent and will cause less confusion.

On the other hand, not every model uses the same augmentation as ResNet to train on ImageNet. If later another work introduces a better augmentation pipeline with different parameters, then we are unable to justify our default parameters.

Copy link
Member

@rahul003 rahul003 Jun 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about a function like the set_aug_level with the defaults set for resnet, but that function is not automatically called?
My only concern was that we don't have a good source (for users not using GluonCv, on a good set of starting params). Your point also makes sense, so am not too strongly opinionated about this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just reviewed the code again, I think I'll turn off --random-crop and --random-mirror as well.

I can replace set_aug_level with set_resnet_aug. The current one has settings look kind of arbitrary to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR removed set_data_aug_level() in data.py, but the function in still used in example/image-classification/fine-tune.py and example/image-classification/train_cifar10.py. Also, shouldn't set_resnet_aug() turn on mirroring, given the default has been changed to 0? Currently:

def set_resnet_aug(aug):
    # standard data augmentation setting for resnet training
    aug.set_defaults(random_crop=0, random_resized_crop=1)
    aug.set_defaults(min_random_area=0.08)
    aug.set_defaults(max_random_aspect_ratio=4./3., min_random_aspect_ratio=3./4.)
    aug.set_defaults(brightness=0.4, contrast=0.4, saturation=0.4, pca_noise=0.1)

@hetong007 @ptrendx

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DickJC123 Hi Dick, I'm fixing these issues in #11533

@@ -120,6 +136,7 @@ def get_rec_iter(args, kv=None):
else:
(rank, nworker) = (0, 1)
rgb_mean = [float(i) for i in args.rgb_mean.split(',')]
random_resized_crop = args.random_resized_crop is not None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't args.random_resized_crop never be None? action='store_true' means if not passed it is set by default to false

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. I'll fix that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, for uniformity with random-crop and random-mirror we might want to random-resized-crop be of type int

index_t new_cols = static_cast<index_t>(static_cast<float>(param_.data_shape[1]) /
static_cast<float>(res.rows) *
static_cast<float>(res.cols));
int interpolation_method = GetInterMethod(param_.inter_method, res.cols, res.rows,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the three calls to GetInterMethod can be extracted outside

@@ -218,10 +265,96 @@ class DefaultImageAugmenter : public ImageAugmenter {
res = src;
}

if (param_.random_resized_crop) {
// random resize crop
CHECK(param_.min_random_scale == 1.0f &&
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@piiswrong do you think this check is good enough?

"min_crop_size, max_crop_size, "
"and rand_crop.";
if (param_.max_random_area != 1.0f || param_.min_random_area != 1.0f
|| param_.max_aspect_ratio != 1.0f || param_.min_aspect_ratio != 1.0f) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if these are all 1.0, you still need to resize the image right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct. I need to move the center crop out of this if-condition.

}
random_resized_crop_exec = true;
}
}
// normal augmentation by affine transformation.
if (param_.max_rotate_angle > 0 || param_.max_shear_ratio > 0.0f
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be else if?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is possible that users want to add rotation on top of random resized crop.

@@ -277,7 +413,8 @@ class DefaultImageAugmenter : public ImageAugmenter {
}

// crop logic
if (param_.max_crop_size != -1 || param_.min_crop_size != -1) {
if (!param_.random_resized_crop &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks weird. Why not resize in the random sized crop section?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

random resized crop considers sample the relative area and the aspect ratio of the to-be-cropped part.

here's random crop, it randomly crop a square with its edge length sampled from [min_crop_size, max_crop_size]

@hetong007
Copy link
Contributor Author

@piiswrong I have made the following changes:

  • set min_aspect_ratio to optional<float>
  • moved rotation and padding before random_resized_crop and random_crop
  • if neither random_resized_crop nor random_crop has been applied, perform center_crop

help='max change of aspect ratio, whose range is [0, 1]')
aug.add_argument('--min-random-aspect-ratio', type=float, default=1,
help='min value of aspect ratio, whose value should be positive.')
aug.add_argument('--max-random-aspect-ratio', type=float, default=1,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should the default be max-random-aspect-ratio = 0 and min-random-aspect-ratio = None so that we don't break previous API?

@piiswrong piiswrong merged commit ccee176 into apache:master Jun 19, 2018
zheng-da pushed a commit to zheng-da/incubator-mxnet that referenced this pull request Jun 28, 2018
* add resnet augmentation

* add test

* fix scope

* fix warning

* fix lint

* fix lint

* add color jitter and pca noise

* fix center crop

* merge

* fix lint

* Trigger CI

* fix

* fix augmentation implementation

* add checks for parameters

* modify training script

* fix compile error

* Trigger CI

* Trigger CI

* modify error message

* Trigger CI

* Trigger CI

* Trigger CI

* improve script in example

* fix script

* clear code

* Trigger CI

* set min_aspect_ratio to optional, move rotation and pad before random resized crop

* fix

* Trigger CI

* Trigger CI

* Trigger CI

* fix default values

* Trigger CI
XinYao1994 pushed a commit to XinYao1994/incubator-mxnet that referenced this pull request Aug 29, 2018
* add resnet augmentation

* add test

* fix scope

* fix warning

* fix lint

* fix lint

* add color jitter and pca noise

* fix center crop

* merge

* fix lint

* Trigger CI

* fix

* fix augmentation implementation

* add checks for parameters

* modify training script

* fix compile error

* Trigger CI

* Trigger CI

* modify error message

* Trigger CI

* Trigger CI

* Trigger CI

* improve script in example

* fix script

* clear code

* Trigger CI

* set min_aspect_ratio to optional, move rotation and pad before random resized crop

* fix

* Trigger CI

* Trigger CI

* Trigger CI

* fix default values

* Trigger CI
@TccccD
Copy link

TccccD commented Sep 17, 2018

Although it is standard during training, it cannot be resized to 256*256 and center crop in validation.
@hetong007 @rahul003 @piiswrong

@hetong007
Copy link
Contributor Author

@TccccD What is the exact issue you have? FYI for training you can set up the pipeline like this: https://github.com/dmlc/gluon-cv/blob/master/scripts/classification/imagenet/train_imagenet.py#L183

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants