Proper Benchmarking for DER and DER++ #6

Closed
cl-for-life opened this issue Mar 22, 2021 · 6 comments

@cl-for-life

Hi!

I'm looking to add DER / DER++ results as a baseline in my paper. I'm interested in the setting where only a single pass is made through the dataset (i.e. num_epochs == 1). Do you have any advice or suggestions regarding hyperparameter selection in this setting, specifically alpha, beta, and the learning rate? Thanks in advance :)

And great job on the library! I'm usually not a fan of such frameworks, as I find them too convoluted. This one is super minimal and easy to play with 👌

@cl-for-life (Author)

Also, is there an easy way to disable data augmentations in the pipeline? I could not find an argument for that.

@cl-for-life (Author)

cl-for-life commented Mar 24, 2021

I ended up running a grid search over the following parameters, drawing inspiration from the best arguments reported.

'--alpha': [0.1, 0.2, 0.3, 0.4, 0.5],
'--beta': [0.25, 0.5, 0.75, 1.0],
'--lr': [0.01, 0.05, 0.1],

I'm interested in reproducing the method in the setting used by (Aljundi et al.), i.e. I'm running with arguments --dataset seq-cifar10 --n_epochs 1 --batch_size 10 --minibatch_size 10, with --buffer_size set to 200, 500, and 1000.
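
For completeness, here's a rough sketch of how such a sweep can be driven (the `--model derpp` flag and the `utils/main.py` entry point are my assumptions about the repo's CLI; the other flags are the ones listed above):

```python
import itertools
import subprocess

# Hypothetical sweep driver: launches one run per grid configuration.
# Assumes a `python utils/main.py` entry point and a `--model derpp` flag;
# adjust the script path and flags to match the repository.
grid = {
    '--alpha': [0.1, 0.2, 0.3, 0.4, 0.5],
    '--beta': [0.25, 0.5, 0.75, 1.0],
    '--lr': [0.01, 0.05, 0.1],
}
fixed = ['--model', 'derpp', '--dataset', 'seq-cifar10',
         '--n_epochs', '1', '--batch_size', '10',
         '--minibatch_size', '10', '--validation']

for buffer_size in [200, 500, 1000]:
    for alpha, beta, lr in itertools.product(*grid.values()):
        cmd = ['python', 'utils/main.py', *fixed,
               '--buffer_size', str(buffer_size),
               '--alpha', str(alpha), '--beta', str(beta), '--lr', str(lr)]
        subprocess.run(cmd, check=True)
```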

I thought it would be good to post my results here, in case someone wanted to benchmark in a similar setting. The winning args for DER++ were

derpp     alpha=0.1  beta=0.50 lr=0.01 M=1000
derpp     alpha=0.4  beta=1.00 lr=0.01 M=500
derpp     alpha=0.4  beta=0.50 lr=0.01 M=200

(chosen by selecting the best accuracy averaged over 3 runs, using --validation)

If one of the authors could validate this or point out flaws in the methodology, that would be awesome. That way I can make sure the DER results are properly reported.

@JosephKJ @baraldilorenzo @mbosc @angpo

@mbosc (Collaborator)

mbosc commented Mar 24, 2021

Hello there,

Sorry for not replying sooner, we are a little busy with a paper rebuttal atm.

Thank you for your compliments to the framework, we use it all the time ourselves and designed it to be easily extended, so we're very glad it works for you! :D

To disable augmentations, you can modify the dataset directly and change the transforms in the TRANSFORM variable. If you simply comment out the lines containing RandomCrop and RandomHorizontalFlip, you are left with just the normalisation. The bare minimum you need to leave in place for everything to work is transforms.ToTensor(). Notice, however, that as we write in our paper, we believe augmentation is very important for the models to reach their best accuracy.
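
For example, a trimmed pipeline could look roughly like this (the normalisation statistics below are placeholders; keep whatever values are already defined in the dataset file you are editing):

```python
from torchvision import transforms

# Augmentation-free pipeline: only conversion to tensor and normalisation.
TRANSFORM = transforms.Compose([
    # transforms.RandomCrop(32, padding=4),   # disabled
    # transforms.RandomHorizontalFlip(),      # disabled
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465),   # placeholder CIFAR-10 stats
                         (0.2470, 0.2435, 0.2616)),
])
```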

I guess your method for finding the best parameters is OK; just a word of warning: as we've written in our paper, we are very critical of the single-epoch setting usually promoted by Aljundi, Lopez-Paz, Chaudhry and others when applied to non-trivial datasets such as CIFAR-10, CIFAR-100 or even ImageNet (see section F.3 of our appendix), as we believe it does not allow disentangling forgetting from underfitting. However, as you did, you can easily obtain the desired behaviour by reducing the number of epochs.

Best of luck for your sub. Feel free to reach out if you need anything else!

@cl-for-life (Author)

Thanks for getting back! Yeah, I understand the concerns about separating forgetting from underfitting, and I agree with them. Interesting discussion in the appendix; I had not examined it previously. For what it's worth, DER++ was able to reach stronger performance under a single epoch (58%, 57%, 50% for M = 1K, 200, 100) than what was reported in the appendix. I'm guessing you guys didn't run another hparam search given that this was not the main setting.

Good luck with the rebuttal!

@cl-for-life (Author)

Actually, I think the discrepancy in the results may be due to the batch size more than anything else. Using bs == 10 vs., say, 32 gives the learner more optimization steps per epoch, leading to better performance.
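
As a back-of-the-envelope check (assuming the standard 5-task split of seq-cifar10, i.e. roughly 10,000 training images per task): one epoch at bs = 10 gives about 1,000 updates per task, versus only ~312 at bs = 32, so the smaller batch size roughly triples the number of gradient steps.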

@mbosc (Collaborator)

mbosc commented Mar 24, 2021

Thanks for the info, we look forward to reading more about it in the final paper!

@mbosc closed this as completed Mar 24, 2021