Hey, I am trying to reproduce your work. The paper seems to imply that the batch size for all experiments is 1. However, when I set the batch size to 1, I cannot reproduce the errors you report (mine are off by about a factor of 10). When I set the batch size to 32, I get decent results.
I would very much appreciate your help in explaining the details of your experiments!
Hi, a batch size of 1 is used only to report the memory requirements of all methods. For training, the larger the batch size the better, since you want to minimize the variance of the gradients.
In the example scripts you can see that the default batch size is 32 for the speed limits experiment and 128 for the MNIST experiment.
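To illustrate the point about gradient variance, here is a minimal sketch that is not taken from the repository. It assumes a toy 1-D least-squares problem with synthetic data, and the helper names `minibatch_gradient` and `gradient_variance` are hypothetical. It estimates how much the mini-batch gradient fluctuates at batch sizes 1, 32, and 128.

```python
import random
import statistics


def minibatch_gradient(w, data, batch_size, rng):
    """Gradient of the loss 0.5 * (w*x - y)^2 w.r.t. w, averaged over a random mini-batch."""
    batch = rng.sample(data, batch_size)  # sample without replacement
    return sum((w * x - y) * x for x, y in batch) / batch_size


def gradient_variance(batch_size, trials=2000, seed=0):
    """Empirical variance of the mini-batch gradient over many random batches."""
    rng = random.Random(seed)
    # Synthetic data (assumption, for illustration only): y = 3x + Gaussian noise.
    data = []
    for _ in range(1000):
        x = rng.gauss(0, 1)
        data.append((x, 3.0 * x + rng.gauss(0, 0.5)))
    w = 0.0  # evaluate the gradient at a fixed, untrained parameter value
    grads = [minibatch_gradient(w, data, batch_size, rng) for _ in range(trials)]
    return statistics.variance(grads)


if __name__ == "__main__":
    for b in (1, 32, 128):
        print(f"batch size {b:>3}: gradient variance = {gradient_variance(b):.4f}")
```

On this toy problem the variance shrinks roughly in proportion to 1 / batch size, which is consistent with a batch size of 1 giving much noisier training than 32 or 128.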