Gradient accumulation over several samples #7964
Comments
Do you mean collecting only a few examples' gradients in a batch to update the weights? You can create a 0-1 mask and use it as the outermost out_grad passed to backward.
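A minimal sketch of that 0-1 mask idea, assuming MXNet Gluon with a per-sample (unreduced) loss; the tiny network and random data below are placeholders, not anything from this issue:

```python
# Minimal sketch (placeholder model/data): pass a 0-1 mask as the outermost
# out_grad so that only the selected samples contribute gradients.
import mxnet as mx
from mxnet import nd, autograd, gluon

net = gluon.nn.Dense(1)
net.initialize()
loss_fn = gluon.loss.L2Loss()                       # returns one loss value per sample

data = nd.random.uniform(shape=(8, 4))
label = nd.random.uniform(shape=(8, 1))

with autograd.record():
    per_sample_loss = loss_fn(net(data), label)     # shape (8,)

mask = nd.array([1, 0, 1, 1, 0, 0, 1, 0])           # 0-1 mask over the batch
per_sample_loss.backward(mask)                      # mask acts as the outermost out_grad
# only samples with mask == 1 contribute to net.weight.grad()
```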
A bit more description of the process (it's for an image-retrieval-like task):
I can't pack these 100 samples into one batch and train on that batch, because a batch of 100*3 images takes too much GPU memory.
If you want to do this at the whole-dataset level, then you should do an extra forward pass over the whole dataset every time to get the indices of the top-100 samples, then create a batch from them, run forward-backward, and update the weights.
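A rough sketch of that two-pass procedure, assuming a toy Gluon model and random data (in the real setup the scoring and loss would be the triplet machinery): one cheap scoring pass over the dataset in small chunks, then one forward-backward and update on the selected top-100 batch.

```python
# Rough sketch (toy model and data): score the whole dataset in chunks, pick the
# top-100 indices, then run a single forward-backward + update on that batch.
import mxnet as mx
from mxnet import nd, autograd, gluon

net = gluon.nn.Dense(1)
net.initialize()
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.01})
loss_fn = gluon.loss.L2Loss()

dataset = nd.random.uniform(shape=(1000, 16))       # whole (toy) dataset
labels = nd.random.uniform(shape=(1000, 1))

# 1) extra forward pass over the whole dataset, in chunks, no gradients recorded
scores = nd.concat(*[net(dataset[i:i + 200]).reshape(-1)
                     for i in range(0, 1000, 200)], dim=0)
top_idx = nd.topk(scores, k=100)                    # indices of the top-100 samples

# 2) one batch of those 100 samples: forward-backward, then a weight update
batch = nd.take(dataset, top_idx)
batch_labels = nd.take(labels, top_idx)
with autograd.record():
    loss = loss_fn(net(batch), batch_labels)
loss.backward()
trainer.step(batch.shape[0])
```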
I tried, but I can't run forward-backward on a batch of 100+ images; it consumes too much memory and crashes.
I think you can hack the optimizer.
I was looking for a more regular way :( |
I'm afraid there is no regular way, because this is actually not a regular case. I think hacking the optimizer is the easiest way.
I will test hacking the optimizer, thanks.
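Not from the thread, but as a hedged sketch of what such a hack could look like in Gluon: setting grad_req='add' makes backward() accumulate into the parameter gradients, so several small forward-backward passes can feed a single trainer.step (the model and data below are placeholders).

```python
# Hedged sketch (placeholder model/data): accumulate gradients over several small
# sub-batches with grad_req='add', then do one optimizer update for all of them.
import mxnet as mx
from mxnet import nd, autograd, gluon

net = gluon.nn.Dense(1)
net.initialize()
net.collect_params().setattr('grad_req', 'add')     # backward() now adds into .grad
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.01})
loss_fn = gluon.loss.L2Loss()

sub_batches = [(nd.random.uniform(shape=(10, 4)), nd.random.uniform(shape=(10, 1)))
               for _ in range(10)]                   # 10 sub-batches of 10 samples each

for p in net.collect_params().values():
    p.zero_grad()                                    # clear grads before accumulating
for data, label in sub_batches:
    with autograd.record():
        loss = loss_fn(net(data), label)
    loss.backward()                                  # gradients add up across sub-batches
trainer.step(10 * 10)                                # one update for all 100 samples
# with grad_req='add', call zero_grad() again before the next accumulation round
```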
I want to compute a triplet ranking loss.
To achieve this, I need to accumulate gradients over several examples and update the weights with the resulting gradient.
In Chainer we can easily do that, because the backward function accumulates gradients by default:
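(The original snippet is missing here; below is just a minimal illustration of that Chainer behaviour, with a dummy model and a stand-in loss.)

```python
# Minimal illustration (dummy model, stand-in loss): in Chainer, backward() adds
# into each parameter's .grad, so many samples can feed one optimizer.update().
import chainer
import chainer.functions as F
import chainer.links as L
import numpy as np

model = L.Linear(4, 2)
optimizer = chainer.optimizers.SGD(lr=0.01)
optimizer.setup(model)

model.cleargrads()                       # start from zero gradients
for _ in range(100):                     # e.g. 100 samples / triplets
    x = np.random.rand(1, 4).astype(np.float32)
    y = model(x)
    loss = F.sum(y * y)                  # stand-in for the triplet ranking loss
    loss.backward()                      # gradients accumulate in the model params
optimizer.update()                       # single weight update with the summed gradients
```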
Is there a way to do this with MXNet Gluon and the autograd API?
That is similar to gradient accumulation inside a batch.