Convert and clearState #138

Closed
melgor opened this Issue Mar 17, 2016 · 8 comments

melgor commented Mar 17, 2016

Recently I wanted to use the "convert" method to reduce the size of models (cmusatyalab/openface#110), and my previous method worked much better than clearState. I found out why:

  1. "_gradOutput" is never cleared in any module.
  2. "gradBias" and "gradWeight" are not cleared in SpatialConvolution.
     Shouldn't these buffers be added to the "clearState" function?

The last thing is this line. Why not use "clearState()" here, regardless of whether this is the 'nn' or 'cudnn' version?
"clearState()" can have different implementations in nn and cudnn, which can cause buffers that exist only in the cudnn version to never be cleared.
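Until those buffers are handled by clearState itself, they can be nil'ed manually before serialization. A minimal sketch, assuming a standard torch/nn model (`deepClear` is a hypothetical helper; `Module:apply` is the standard nn traversal call, and the field names are the ones discussed above):

```lua
require 'nn'

-- Hypothetical workaround: after the regular clearState, walk the
-- model and nil the extra scratch buffers it does not (yet) touch.
local function deepClear(model)
   model:clearState()               -- clears output, gradInput, etc.
   model:apply(function(m)
      m._gradOutput = nil           -- cudnn scratch buffer (see point 1)
      m._input = nil
   end)
   return model
end

-- Usage before saving:
-- torch.save('model_light.t7', deepClear(model))
```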

fmassa commented Mar 17, 2016

Indeed, _gradOutput and _input are not cleared in clearState, as mentioned here, and that should be included.
About gradWeight and gradBias, as mentioned in cmusatyalab/openface#110, we can't just free them; we also need to reconstruct them before training, so I think they are out of scope for clearState.

About calling :clearState instead of :clearDesc, I think that's a good idea. It's probably like this because convert was merged in before clearState.

soumith commented Mar 17, 2016

(1) is fixed.
(2) is out of scope of clearState unfortunately.

@soumith soumith closed this Mar 17, 2016

melgor commented Mar 17, 2016

@soumith Thanks for the fast response and action. I see you added _gradOutput only to SpatialConvolution; it would be great if you could also add it to Pointwise and the other modules that have a "_gradOutput" buffer.

soumith commented Mar 17, 2016

sure, adding it now.

vgatto commented Apr 29, 2016

@soumith I understand (2) is out of scope for changes to clearState. Is there a work-around? Even when zeroed out, gradWeight is still pretty big when written to disk.

soumith commented May 10, 2016

@vgatto you could write a small pair of save and load functions. Save goes through the model and nils all the gradWeight and gradBias tensors. Load goes through the model and makes gradWeight the same size as weight, and gradBias the same size as bias. However, parameter sharing will break in this scheme.
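A minimal sketch of such a pair of functions (the names `saveLight`/`loadLight` are hypothetical; this is the naive version that, as noted, breaks parameter sharing):

```lua
require 'nn'

-- Save: nil the gradient tensors so they are not serialized.
local function saveLight(path, model)
   model:clearState()
   model:apply(function(m)
      m.gradWeight = nil
      m.gradBias = nil
   end)
   torch.save(path, model)
end

-- Load: reallocate the gradients so training can resume.
local function loadLight(path)
   local model = torch.load(path)
   model:apply(function(m)
      if m.weight then m.gradWeight = m.weight:clone():zero() end
      if m.bias then m.gradBias = m.bias:clone():zero() end
   end)
   return model
end
```

Because `clone()` allocates a fresh tensor per module, any gradWeight/gradBias that were shared across modules before saving come back as independent tensors, which is why sharing breaks here.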

fmassa commented May 10, 2016

@vgatto to complement @soumith's answer, here is some code (present in optnet) that does exactly what Soumith mentioned, but also takes care of the sharing scheme.

vgatto commented May 10, 2016

@soumith, @fmassa - Thanks for the pointers, that definitely works for me. It seems like the long-term solution would be enhancing the nn.Module interface to support this save/load case and having each module implement whatever makes sense: basically, another clearState specifically for this purpose. Is there an enhancement issue filed for this?
