
ImageNet performance? #7

Closed
hszhao opened this issue Jan 14, 2019 · 8 comments

hszhao commented Jan 14, 2019

Hi, has anyone gotten the reported ImageNet performance with the provided AutoAugment?
Here are my results with AutoAugment using the official implementation; compared to the official results, there is no impressive improvement.
Results for ResNet-50/101/152 in terms of top-1/top-5 accuracy:
official without AutoAugment: 76.15/92.87, 77.37/93.56, 78.31/94.06.
mine with AutoAugment: 75.33/92.45, 77.57/93.78, 78.51/94.07.
Update: all of the above results were trained for 90 epochs; a longer schedule, such as the 270 epochs used in the paper, may help reach the reported results.

@Jongchan

Hi, I am also planning to test AutoAugment on ImageNet in PyTorch. According to your numbers, the improvements seem insignificant.

Did you use the hyperparameter settings mentioned in the original paper, for example the number of epochs?
ImageNet implementations usually train ResNet for 90-100 epochs, while the AutoAugment paper mentions that its ImageNet training runs for 270 epochs.

hszhao commented May 17, 2019

> Hi, I am also planning to test AutoAugment on ImageNet in PyTorch. According to your numbers, the improvements seem insignificant.
>
> Did you use the hyperparameter settings mentioned in the original paper, for example the number of epochs? ImageNet implementations usually train ResNet for 90-100 epochs, while the AutoAugment paper mentions that its ImageNet training runs for 270 epochs.

You are right, that might be the issue. I only ran for 90 epochs, following the official implementation. A longer training schedule may help reach the results in the paper.

@Jongchan

Dear @hszhao,

I am currently running CIFAR-10/100 experiments (WRN-28-10) with the AutoAugment implementation provided in this repository, and there are extra modifications on top of the baseline WRN + AutoAugment.

The extras are Shake-Shake / ShakeDrop / Cutout and a cosine learning rate schedule (possibly with lower initial learning rates). I think the cosine schedule is there for a better fit to the training set, since my initial experiment without it failed to fit the training set; a sketch of the schedule is below.
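
For concreteness, a minimal sketch of such a cosine schedule in PyTorch; the model, initial LR, and epoch count here are illustrative placeholders, not the exact values from these runs:

```python
import torch.nn as nn
import torch.optim as optim

# Placeholder model and optimizer -- the LR/momentum/epochs are illustrative only.
model = nn.Linear(10, 10)
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
num_epochs = 270

# Cosine annealing decays the LR smoothly from its initial value toward eta_min
# (0 by default) over T_max epochs, which tends to give a better training-set
# fit late in training than a step schedule with the same initial LR.
scheduler = optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_epochs)

for epoch in range(num_epochs):
    # ... run one training epoch with `optimizer` here ...
    scheduler.step()
```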

I am concurrently running ImageNet ResNet-50 experiments. Since each experiment takes ~2 days on 4 GPUs for 90 epochs, I can get back in a week, maybe.

Jongchan commented Jun 3, 2019

It takes so long to train with AutoAugment :(
Sorry for the late result. @hszhao

Here are my settings & results:

| Arch | AutoAugment | Epochs | HyperParams | Top-1 Acc (%) (in-paper) | Top-5 Acc (%) (in-paper) |
|---|---|---|---|---|---|
| ResNet50 | No | 100 | initial LR 0.1, batch size 256, 4 GPUs | 76.180 (-) | 92.918 (-) |
| ResNet50 | No | 270 | initial LR 0.2, batch size 512, 4 GPUs | 76.716 (-) | 93.064 (-) |
| ResNet50 | No | 270 | initial LR 1.6, batch size 4096 | - (76.3) | - (93.1) |
| ResNet50 | Yes | 270 | initial LR 0.2, batch size 512, 4 GPUs | 77.450 (-) | 93.504 (-) |
| ResNet50 | Yes | 270 | initial LR 1.6, batch size 4096 | - (77.6) | - (93.8) |
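
(Side note: the learning rates above appear to follow the linear scaling rule LR = 0.1 × batch_size / 256, i.e., 0.1 × 512 / 256 = 0.2 and 0.1 × 4096 / 256 = 1.6.)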

In fact, I skipped the last several epochs due to my tight schedule (and something wrong with my server), but it was only a few epochs, so this is probably very close to, or exactly, the best we can get.

Two differences from the original paper: (1) I did not use Inception-style preprocessing, and (2) the batch size is 512, not 4096 as in the paper.
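
To make difference (1) concrete, here is a sketch of the two crop pipelines in torchvision terms; the Inception-style recipe is the standard RandomResizedCrop one, while the non-Inception baseline shown is a common alternative and may not match my exact pipeline:

```python
from torchvision import transforms

# Inception-style preprocessing (as in the AutoAugment paper's ImageNet setup):
# random crops over a range of scales (8%-100% of area) and aspect ratios.
inception_style = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
])

# A common non-Inception baseline: resize the short side, then take a
# fixed-size random crop (no scale/aspect-ratio augmentation).
plain_style = transforms.Compose([
    transforms.Resize(256),
    transforms.RandomCrop(224),
    transforms.RandomHorizontalFlip(),
])
```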

You can see the AutoAugment experiment training log here / Without AutoAugment here

  • Edit: 270-epoch results added; fixed several hyperparameters.

Jongchan commented Jun 3, 2019

One thing to notice here is the loss and error rate of the two experiments.
Without AutoAugment, the last epoch's average loss and top-1 accuracy are 0.66 / 84.08%.
With AutoAugment, the last epoch's loss and top-1 accuracy are 0.9 / 77.99%.
So with AutoAugment, training converges much more slowly than the baseline, and may therefore require more epochs. And given that my result is worse than the paper's, a larger batch size may also be beneficial.

I should have done the experiments with Inception-style augmentation for a fair comparison :(

hszhao commented Jun 3, 2019

@Jongchan Hi Jongchan, thanks for the results. They are very close to the ones in the paper. It seems that more epochs are needed with stronger augmentation. In the mixup paper, they also trained for longer (200 epochs).

zzx528 commented Aug 17, 2021

> Hi, has anyone gotten the reported ImageNet performance with the provided AutoAugment?
> Here are my results with AutoAugment using the official implementation; compared to the official results, there is no impressive improvement.
> Results for ResNet-50/101/152 in terms of top-1/top-5 accuracy:
> official without AutoAugment: 76.15/92.87, 77.37/93.56, 78.31/94.06.
> mine with AutoAugment: 75.33/92.45, 77.57/93.78, 78.51/94.07.
> Update: all of the above results were trained for 90 epochs; a longer schedule, such as the 270 epochs used in the paper, may help reach the reported results.

Do you know how to use the policy given in the paper to train our own models? What does the training process look like? I trained a model according to my own understanding, but its test accuracy was very low. I took 10,000 images from CIFAR-10, applied AutoAugment according to the policy given in the paper to obtain 10,000 augmented images, and then trained the model on those 10,000 images.
Is there something wrong with doing it this way? If so, please tell me the right way to train. Thank you!
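
For reference, the usual practice is to apply the policy on the fly as a random transform in the data pipeline, so every epoch sees freshly augmented views of the whole training set, rather than training on a single fixed pre-augmented copy. A minimal PyTorch sketch, using torchvision's built-in CIFAR-10 policy (torchvision >= 0.10) as a stand-in for this repo's implementation:

```python
import torch
from torchvision import datasets, transforms

# AutoAugment is applied as a *random* transform each time an image is loaded,
# so the effective training set is far larger than one fixed augmented copy.
train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.AutoAugment(policy=transforms.AutoAugmentPolicy.CIFAR10),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])

train_set = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=train_transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128,
                                           shuffle=True, num_workers=4)

# The test set gets no augmentation, only tensor conversion + normalization.
test_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])
```

Note also that training on only 10,000 of CIFAR-10's 50,000 training images will cost accuracy regardless of augmentation.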

