Proper Benchmarking for DER and DER++ #6
Comments
Also, is there an easy way to disable data augmentations in the pipeline? I could not find an argument for that.
I'm interested in reproducing the method in the setting studied by Aljundi et al., i.e. a single pass over the data. I ended up running a grid search over the following parameters, drawing inspiration from the best arguments reported. I thought it would be good to post my results here, in case someone wants to benchmark in a similar setting. The winning args were chosen by selecting the best accuracy averaged over 3 runs. If one of the authors could double-check this or point out flaws in the methodology, that would be awesome. This way I can make sure the DER results are properly reported.
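A model-selection loop like the one described (best accuracy averaged over 3 runs per configuration) could be sketched roughly as follows. Note that `train_and_eval`, the grid values, and the seeds are all placeholders for illustration, not the actual settings used:

```python
import itertools
import statistics

def train_and_eval(lr, alpha, beta, seed):
    # Placeholder for a real training run returning final average accuracy.
    # A fake deterministic score keeps the example self-contained.
    return 50.0 + 10 * alpha - 5 * beta - abs(lr - 0.03) * 100 + seed * 0.01

# Hypothetical grid, loosely in the spirit of the search described above.
grid = {
    "lr": [0.01, 0.03, 0.1],
    "alpha": [0.1, 0.2, 0.5],
    "beta": [0.5, 1.0],
}

best_args, best_acc = None, float("-inf")
for values in itertools.product(*grid.values()):
    args = dict(zip(grid.keys(), values))
    # Select by accuracy averaged over 3 independent runs (seeds).
    acc = statistics.mean(train_and_eval(**args, seed=s) for s in range(3))
    if acc > best_acc:
        best_args, best_acc = args, acc

print(best_args)
```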
Hello there,

Sorry for not replying sooner, we are a little busy with a paper rebuttal at the moment. Thank you for your compliments to the framework; we use it all the time ourselves and designed it to be easily extended, so we're very glad it works for you! :D

To disable augmentations, you can modify the dataset directly and change the transforms in the TRANSFORM variable. If you simply comment out the lines containing RandomCrops and RandomFlips, you are left with just the normalisation. The bare minimum you need to leave in place for everything to work is a transforms.ToTensor(). Notice, however, that as we write in our paper, we believe augmentation is very important for the models to provide their best accuracy.

Your method for finding the best parameters seems OK; just a word of warning: as we've written in our paper, we are very critical of the single-epoch setting usually promoted by Aljundi, Lopez-Paz, Chaudhry and others when applied to non-trivial datasets such as CIFAR-10, CIFAR-100 or even ImageNet (see section F.3 of our appendix), as we believe it does not allow forgetting to be disentangled from underfitting. However, as you did, you can easily implement your desired behaviour by reducing the number of epochs.

Best of luck with your submission. Feel free to reach out if you need anything else!
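To make the suggestion concrete, here is a toy stand-in for such a transform pipeline in plain Python. The names (`TRANSFORM`, the individual transforms) are illustrative only, not the repo's actual code; the point is simply that removing the random transforms leaves a deterministic pipeline:

```python
import random

def random_flip(img):
    # Augmentation: randomly mirrors the image (stand-in for a RandomFlip).
    return img[::-1] if random.random() < 0.5 else img

def to_tensor(img):
    # Stand-in for ToTensor(): scales raw pixel values into [0, 1].
    return [p / 255.0 for p in img]

def normalize(img, mean=0.5, std=0.5):
    # Stand-in for Normalize(mean, std).
    return [(p - mean) / std for p in img]

# Full pipeline vs. the same pipeline with augmentation "commented out".
# The tensor conversion (and normalisation) is the bare minimum that stays.
TRANSFORM = [random_flip, to_tensor, normalize]
TRANSFORM_NO_AUG = [to_tensor, normalize]

def apply(pipeline, img):
    for t in pipeline:
        img = t(img)
    return img

print(apply(TRANSFORM_NO_AUG, [0, 255]))  # deterministic: [-1.0, 1.0]
```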
Thanks for getting back! Yeah, I understand the concerns about separating forgetting from underfitting, and I agree with them. Interesting discussion in the appendix; I had not examined it previously. For what it's worth, DER++ was able to reach stronger performance under a single epoch. Good luck with the rebuttal!
Actually, I think the discrepancy in the results may be due to the batch size more than anything else.
Thanks for the info, we look forward to reading more about it in the final paper!
Hi!

I'm looking to add DER / DER++ results as a baseline in my paper. I'm interested in the setting where only a single pass is done through the dataset (i.e. `num_epochs == 1`). I was wondering if you had any advice / suggestions regarding hyperparameter selection in this setting? Specifically `alpha` and `beta` and the learning rate. Thanks in advance :) And great job on the library! I'm usually not a fan; I usually find them too convoluted. This one is super minimal and easy to play with 👌
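For context on what `alpha` and `beta` weigh: as I understand the DER++ paper, the objective combines the current-task cross-entropy with an `alpha`-weighted logit-matching (MSE) term and a `beta`-weighted cross-entropy term, both computed on replayed buffer samples. A minimal pure-Python sketch with toy numbers (not the repo's implementation):

```python
import math

def softmax_ce(logits, label):
    # Cross-entropy of a single example from raw logits (log-sum-exp trick).
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[label]

def mse(a, b):
    # Mean squared error between current and buffered logits.
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def derpp_loss(cur_logits, cur_label,
               buf1_logits, buf1_stored_logits,
               buf2_logits, buf2_label,
               alpha=0.5, beta=1.0):
    # DER++ objective: task CE + alpha * logit distillation + beta * replay CE.
    return (softmax_ce(cur_logits, cur_label)
            + alpha * mse(buf1_logits, buf1_stored_logits)
            + beta * softmax_ce(buf2_logits, buf2_label))

loss = derpp_loss([2.0, 0.5], 0, [1.0, 1.0], [1.2, 0.8], [0.0, 3.0], 1)
```

Setting `alpha > 0` and `beta = 0` recovers plain DER, which uses only the logit-distillation term on the buffer.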