
Slow convergence with SGD linear evaluation #37

gergopool opened this issue May 10, 2022 · 1 comment

@gergopool

Hi!

I am running a linear evaluation on a SimSiam network I've just trained, using code from a different repository.
In contrast to the evaluation protocol you use, I follow one preferred by a few other papers:
batch size 256, 100 epochs, SGD with momentum, learning rate 0.3, weight decay 0 (a rough sketch of this setup is below).
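
For reference, here is a minimal sketch of what I mean, assuming a frozen SimSiam-pretrained ResNet-50 backbone and standard PyTorch/torchvision components; the weight-loading step and the momentum value are illustrative, not taken from your code:

```python
import torch
import torch.nn as nn
import torchvision

# Frozen SimSiam-pretrained encoder; only the linear head is trained.
backbone = torchvision.models.resnet50()
# ... load the pretrained SimSiam encoder weights into `backbone` here ...
for p in backbone.parameters():
    p.requires_grad = False
backbone.fc = nn.Linear(2048, 1000)  # new trainable classification head

# Linear-eval protocol described above: batch size 256, 100 epochs,
# SGD with momentum, lr 0.3, no weight decay.
optimizer = torch.optim.SGD(
    backbone.fc.parameters(),
    lr=0.3,
    momentum=0.9,  # assumed value; the papers only say "with momentum"
    weight_decay=0.0,
)
criterion = nn.CrossEntropyLoss()
```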

My first intuition was that my code had a bug, because even with the weights you shared in this repository, my evaluation started off at 5% accuracy after the first epoch, which is close to the performance of random weights. Now that a few epochs have passed I see some progress, and I may reach 30%+ after 10 epochs. However, other self-supervised methods start this evaluation at around 60% right after the first epoch.

Do you have any guesses as to why I experience such slow convergence with SimSiam?

Thank you.

@tuntianjun

Hi, I was wondering if you have solved this problem? I ran into the same issue. @gergopool
