Hello, I have a strange problem here.

If I use the `memoryEfficientLoss` function to backpropagate the loss, training seems to behave normally. But if I inline the body of `memoryEfficientLoss` into the `trainEpoch` function, instead of defining a separate `memoryEfficientLoss` function, training no longer converges, even though all other code is the same.

Another question: I guess that splitting the model's outputs along the first dimension saves memory, but if so, how are the gradients computed and backpropagated, and why is `backward()` called twice (`loss.backward()` and `outputs.backward()`)? Can you explain this? Thank you.

Can anyone tell me why? Any reply will be appreciated.
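For context on the two-backward pattern being asked about, here is a minimal sketch of how a split-and-accumulate loss can work in PyTorch. This is not the exact code from the repository; the `generator` and `criterion` names are placeholders for the output projection and loss function. The key idea: the outputs are detached so each chunk's `loss.backward()` only accumulates gradients into the detached leaf, and a final `outputs.backward(grad)` pushes those accumulated gradients through the rest of the model.

```python
import torch
import torch.nn as nn

def memory_efficient_loss(outputs, targets, generator, criterion, chunk_size=2):
    # Detach so the chunked forward/backward below does not build a
    # graph back into the full model; we collect gradients on this leaf.
    outputs_leaf = outputs.detach().requires_grad_(True)

    total_loss = 0.0
    # Split along the first (time) dimension: only one chunk's
    # generator activations are alive at a time, bounding peak memory.
    for out_chunk, tgt_chunk in zip(outputs_leaf.split(chunk_size, dim=0),
                                    targets.split(chunk_size, dim=0)):
        scores = generator(out_chunk)  # e.g. projection + log-softmax
        loss = criterion(scores.view(-1, scores.size(-1)), tgt_chunk.view(-1))
        # First backward: gradients w.r.t. the detached outputs only.
        loss.backward()
        total_loss += loss.item()

    # Second backward: push the accumulated output gradients through
    # the rest of the model (encoder/decoder parameters).
    outputs.backward(outputs_leaf.grad)
    return total_loss
```

Under this scheme, inlining the code but forgetting the `detach()`/final `outputs.backward()` pairing (or accumulating gradients on the wrong tensor) would silently change what gets backpropagated, which is one plausible cause of the non-convergence described above.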
zhang-wen changed the title from "a strange problem" to "a strange problem about saving memory in training process" on Jun 11, 2017.