I'd like to know: if a CNN is included in the sequencer module, can its gradInput and output buffers be shared across clones? (Assume memory pre-allocation is enabled.)
For instance, `nn.SpatialConvolution`, `nn.SpatialMaxPooling`, and their `cudnn` equivalents.
gradInput tensors can always be shared unless they are exposed outside the network, but the code handles that for you.
output tensors can be shared if they are not used in the backward pass. You should look at the implementation of each new module you introduce and apply this rule (sketched below):
* If the content of `input` is used in `updateGradInput` or `accGradParameters`, add the module to the `protectInput` table.
* If the content of `self.output` is used in `updateGradInput` or `accGradParameters`, add the module to the `protectOutput` table.
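For illustration, the blacklists might look like this. This is a hypothetical sketch: the table names `protectInput`/`protectOutput` come from this thread, but the exact layout inside `MemoryOptimizer.lua` may differ.

```lua
-- Hypothetical sketch of the blacklists described above; the exact
-- structure in MemoryOptimizer.lua may differ.
local protectInput = {
  ['nn.Linear'] = true,              -- accGradParameters reads input
  ['nn.SpatialConvolution'] = true,  -- gradWeight accumulation reads input
}

local protectOutput = {
  ['nn.Sigmoid'] = true,  -- updateGradInput reads self.output
  ['nn.Tanh'] = true,     -- updateGradInput reads self.output
}

-- A module's buffer can only be shared across clones when it is not protected:
local function canShareInput(m)
  return not protectInput[torch.typename(m)]
end

local function canShareOutput(m)
  return not protectOutput[torch.typename(m)]
end
```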
For example, see the modules that are already blacklisted in `MemoryOptimizer.lua` and how they use their `input` or `output` in the backward pass.
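To make the rule concrete, here is a schematic contrast, simplified from the stock `nn` implementations (these two modules are chosen for illustration, not necessarily taken from the actual blacklist):

```lua
require 'nn'

-- Schematic version of nn.Sigmoid:updateGradInput (simplified from the
-- stock nn code): it reads self.output, so nn.Sigmoid belongs in
-- protectOutput -- sharing its output buffer would corrupt the backward pass.
local function sigmoidUpdateGradInput(self, input, gradOutput)
  -- gradInput = gradOutput .* output .* (1 - output)
  local oneMinusOutput = self.output:clone():mul(-1):add(1)
  self.gradInput = torch.cmul(gradOutput, torch.cmul(self.output, oneMinusOutput))
  return self.gradInput
end

-- Schematic version of nn.MulConstant:updateGradInput: it reads neither
-- input nor self.output, so both buffers are safe to share across clones.
local function mulConstantUpdateGradInput(self, input, gradOutput)
  self.gradInput = gradOutput:clone():mul(self.constant_scalar)
  return self.gradInput
end
```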
It forces you to know the modules you use in detail, but the optimization is efficient. To validate your approach, you should get the same perplexity with and without memory optimization.