a strange problem about saving memory in training process #64

Closed
zhang-wen opened this issue Jun 11, 2017 · 1 comment
zhang-wen commented Jun 11, 2017

Hello,

I have run into a strange problem.

If I use the memoryEfficientLoss function to backpropagate the loss, training behaves normally.

But if I inline the body of memoryEfficientLoss directly into the trainEpoch function, without defining a separate memoryEfficientLoss function, training no longer converges, even though all the other code is identical.

Another question: I guess that splitting the model's outputs along the first dimension is what saves memory, but if so, how are the gradients computed and backpropagated, and why is backward() called twice (loss.backward() and outputs.backward())? Can you explain this?

Can anyone tell me why? Any reply will be appreciated.
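For context, here is a minimal sketch of the split-and-double-backward pattern the question refers to. It is not the exact OpenNMT-py code; the function name, `generator`, `criterion`, and `chunk_size` are illustrative, assuming decoder outputs of shape (seq_len, batch, hidden) and a vocabulary projection that dominates memory use.

```python
import torch

def memory_efficient_loss(outputs, targets, generator, criterion, chunk_size=32):
    """Compute the loss over chunks of timesteps to avoid materializing
    the full vocabulary-sized projection for every timestep at once."""
    # Detach the decoder outputs from the encoder/decoder graph and make the
    # detached tensor a leaf, so gradients w.r.t. the outputs accumulate here.
    detached = outputs.detach()
    detached.requires_grad_(True)

    total_loss = 0.0
    # torch.split cuts along the first (time) dimension by default.
    for out_chunk, tgt_chunk in zip(torch.split(detached, chunk_size),
                                    torch.split(targets, chunk_size)):
        scores = generator(out_chunk.view(-1, out_chunk.size(-1)))
        loss = criterion(scores, tgt_chunk.view(-1))
        total_loss += loss.item()
        # First backward: frees this chunk's activations right away and
        # accumulates d(loss)/d(decoder outputs) into `detached.grad`.
        loss.backward()

    return total_loss, detached.grad

# In the training loop (names illustrative):
#   outputs = model(src, tgt)
#   loss, grad_output = memory_efficient_loss(outputs, tgt[1:], model.generator, criterion)
#   outputs.backward(grad_output)   # second backward: through encoder/decoder
#   optimizer.step()
```

The first backward (per chunk) only reaches the detached leaf, so its gradient is collected in `detached.grad`; the second call, `outputs.backward(grad_output)`, then propagates that accumulated gradient through the rest of the model. Only one chunk's vocabulary-sized projection is alive at any time, which is where the memory saving comes from.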

zhang-wen changed the title from "a strange problem" to "a strange problem about saving memory in training process" on Jun 11, 2017
srush (Contributor) commented Jul 5, 2017

Check out the new version of the code. I think it makes this function much clearer.

srush closed this as completed Jul 5, 2017