Convert and clearState #138
Comments
Indeed, About calling
(1) is fixed.
Sure, adding it now.
@soumith I understand (2) is out of scope for changes to clearState. Is there a workaround? Even when zeroed out, gradWeight is still pretty big when written to disk.
@vgatto You could write a small save and load function. Save goes through the model and nils all the gradWeight and gradBias fields. Load goes through the model and makes each gradWeight the same size as its weight, and each gradBias the same size as its bias. However, parameter sharing will break in this scheme.
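A minimal sketch of the save/load scheme described above, for illustration only. The helper names `stripGradients` and `restoreGradients` are hypothetical and not part of nn. The sketch assumes every module exposes its parameters as `weight`/`bias` with matching `gradWeight`/`gradBias` fields, and it walks the model with `nn.Module:apply`.

```lua
-- Hypothetical helpers; the names are illustrative and not part of nn.

-- Drop the gradient buffers from every module so they are not serialized.
-- Note: this mutates the model in place.
local function stripGradients(model)
   model:apply(function(m)
      m.gradWeight = nil
      m.gradBias = nil
   end)
   return model
end

-- Recreate zeroed gradient buffers with the same type and size as the parameters.
local function restoreGradients(model)
   model:apply(function(m)
      if m.weight and not m.gradWeight then
         m.gradWeight = m.weight.new():resizeAs(m.weight):zero()
      end
      if m.bias and not m.gradBias then
         m.gradBias = m.bias.new():resizeAs(m.bias):zero()
      end
   end)
   return model
end

-- Intended usage with Torch7 (requires the torch and nn packages):
--   torch.save('model.t7', stripGradients(model))
--   local model = restoreGradients(torch.load('model.t7'))
```

As noted above, the restored gradient tensors are freshly allocated, so any sharing between modules (or a flattened parameter view obtained earlier) would need to be set up again after loading.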
@vgatto To complement @soumith's answer, here is some code (which is present in
@soumith, @fmassa - Thanks for the pointers, that definitely works for me. It seems like the long-term solution would be enhancing the nn.Module interface to support this save/load case and having each module implement whatever makes sense. Basically, another clearState specifically for this purpose. Is there an enhancement issue filed for this?
Recently I wanted to use the "convert" method to reduce the size of models (cmusatyalab/openface#110), and my previous method worked much better than clearState. I found out why:
Shouldn't these buffers be added to the "clearState" function?
The last thing is this line. Why not use "clearState()" here, regardless of whether this is the 'nn' or 'cudnn' version?
I think that "clearState()" can have different implementations in nn and cudnn, which can mean that buffers created in the cudnn version are not cleared.