Multi-epoch training was performed on the test set. #3

Dyb3438 · 2022-03-02T10:49:39Z

One aspect of the test-time adaptation in this code puzzles me.

When adapting on the test set, TTT and Tent run only a single epoch rather than hundreds. As I understand it, this code adapts the network on the test set for multiple epochs, which in my opinion often does not make sense in practice.

YuejiangLIU · 2022-03-03T15:39:23Z

Thanks for the question!

To my knowledge, both single-epoch and multi-epoch adaptation are commonly used in prior literature. The code for TTT and Tent uses a single epoch, whereas SHOT, another baseline method we compared against, uses multiple epochs.

I personally lean towards the multi-epoch setting (with an oracle for model selection) for evaluation and comparison. The reason is that, in the single-epoch setting, the adaptation performance is often quite sensitive to the choice of learning rate, which can lead to noisy comparisons. In contrast, in our multi-epoch evaluation, we chose relatively small learning rates and ran the adaptation long enough to estimate the effectiveness of an algorithm reliably.
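To make the contrast concrete, here is a toy sketch in plain Python. It is not the code in this repository: the "model" is a single scalar `mu`, the unsupervised objective is a squared distance to each test input, and `test_inputs`, `adapt`, and `error` are hypothetical names chosen for illustration. The oracle picks the best epoch using ground truth, so it only makes sense for evaluation and comparison, not for deployment.

```python
def adapt(mu, xs, lr, num_epochs):
    """Run SGD on 0.5 * (mu - x)**2 over xs; return mu after each epoch."""
    per_epoch = []
    for _ in range(num_epochs):
        for x in xs:
            mu -= lr * (mu - x)  # gradient step toward the current test input
        per_epoch.append(mu)
    return per_epoch


# Hypothetical unlabeled "test set"; the quantity to recover is its mean.
test_inputs = [2.0, 4.0, 3.0, 5.0, 1.0]
target = sum(test_inputs) / len(test_inputs)


def error(mu):
    """Oracle metric: needs the ground truth, so it is for evaluation only."""
    return abs(mu - target)


results = {}
for lr in (0.01, 0.5):
    # Single-epoch setting: one pass over the test inputs.
    single = adapt(0.0, test_inputs, lr, num_epochs=1)[-1]
    # Multi-epoch setting: many passes, then oracle selection of the best epoch.
    multi = adapt(0.0, test_inputs, lr, num_epochs=200)
    best = min(multi, key=error)
    results[lr] = {"single": error(single), "multi_oracle": error(best)}
    print(f"lr={lr}: single-epoch error={error(single):.3f}, "
          f"multi-epoch (oracle) error={error(best):.3f}")
```

In this toy, the single-epoch error changes a lot between the two learning rates, while a small learning rate run for many epochs ends up close to the target. Real networks and objectives will of course behave less cleanly.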

Besides, even in practice, I believe that using the test examples at hand for multiple epochs is still a better choice, if computational time allows. This is probably a subjective opinion though.

P.S. Why do you think multi-epoch adaptation does not make sense in practice?
