Hi there,
Thanks for your great work on the Shampoo implementation in PyTorch. I'm trying to reproduce the CIFAR-10 results from the Shampoo paper, but I get much lower test accuracy. I have tried changing the learning rate from 0.01 to 10 (following the paper's suggestion), but still only reach about 85% accuracy. Here are my experimental results:
We use the ResNet-32 network for the CIFAR-10 experiments, with the following settings:
--momentum 0.9
--epsilon 1e-4
--batchSize 128
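For reference, here is roughly how I construct the optimizer. This is a minimal sketch: the `shampoo` import path and the `Shampoo` constructor arguments are my assumptions from reading this repository, and the linear model is just a stand-in for ResNet-32.

```python
import torch.nn as nn
from shampoo import Shampoo  # assumed import path for this repository's optimizer

model = nn.Linear(32 * 32 * 3, 10)  # stand-in; my actual runs use ResNet-32
optimizer = Shampoo(
    model.parameters(),
    lr=1.0,          # swept over 0.1, 1, 2, and 5 below
    momentum=0.9,    # --momentum
    epsilon=1e-4,    # --epsilon
)
```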
Results after 250 epochs:

| lr  | Training Loss | Training Acc | Testing Loss | Testing Acc |
|-----|---------------|--------------|--------------|-------------|
| 0.1 | 0.65          | 77.03%       | 0.68         | 76.39%      |
| 1   | 0.25          | 91.33%       | 0.57         | 84.04%      |
| 2   | 0.23          | 91.87%       | 0.72         | 82.02%      |
| 5   | 0.22          | 92.33%       | 0.75         | 82.04%      |
When training for 500 epochs at each of the learning rates above, the testing accuracy remains almost the same; it still cannot even reach 90%.
Any ideas or suggestions about this problem? Thanks for your time.
Thank you for your comprehensive experiments. Indeed, I also cannot reproduce the reported results with my implementation, even when using the average of the gradients.
So far I'm still investigating the cause. If you find something, please let me know.
Hi, I have some questions about the Algorithm 2 code.
In the Shampoo paper, the contraction for each dimension is computed from the original gradient. In the code, however, the gradient is updated after each dimension, and that partially preconditioned gradient is then used to compute the contraction for the next dimension (see the sketch below). Is something wrong with my understanding of the code, or of Algorithm 2 in the paper?
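To make the question concrete, here is a minimal sketch of the two variants for an order-N gradient tensor. This is my own illustrative code, not the repository's: `_inv_root`, `precondition`, the `stats` bookkeeping, and the `use_original` switch are all my assumptions, simplified to show only where the contraction input differs.

```python
import torch

def _inv_root(mat, root, eps=1e-4):
    # Inverse `root`-th root of a symmetric PSD matrix via
    # eigendecomposition (illustrative; real code would cache this).
    d = mat.size(0)
    eigvals, eigvecs = torch.linalg.eigh(mat + eps * torch.eye(d))
    return eigvecs @ torch.diag(eigvals.pow(-1.0 / root)) @ eigvecs.t()

def precondition(grad, stats, use_original=True):
    # use_original=True:  every contraction comes from the untouched
    #                     gradient, as I read the algorithm in the paper.
    # use_original=False: each contraction uses the gradient already
    #                     preconditioned along earlier axes, which is what
    #                     the repository's in-place loop appears to do.
    original = grad
    order = grad.dim()
    for i in range(order):
        dim = grad.size(i)
        source = original if use_original else grad
        mat = source.transpose(0, i).reshape(dim, -1)  # mode-i matricization
        stats[i] += mat @ mat.t()                      # accumulate statistics
        inv = _inv_root(stats[i], 2 * order)
        # Precondition along axis i, then restore the axis order.
        grad = torch.tensordot(inv, grad, dims=([1], [i])).movedim(0, i)
    return grad

# The two variants diverge from the second axis onward:
g = torch.randn(8, 4)
stats = [torch.zeros(8, 8), torch.zeros(4, 4)]
paper_update = precondition(g, [s.clone() for s in stats], use_original=True)
code_update = precondition(g, [s.clone() for s in stats], use_original=False)
print(torch.allclose(paper_update, code_update))  # False in general
```

For a matrix gradient the first axis is treated identically in both variants; they only diverge once the second axis's statistics are computed from a gradient that has already been preconditioned along the first.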