
Can CNN buffers be shared? #73

Closed
helson73 opened this issue Jan 13, 2017 · 2 comments

helson73 commented Jan 13, 2017

I'm wondering: if a CNN is included in the sequencer module, can its gradInput and output buffers be shared across clones? (Assume memory pre-allocation is enabled.)
For instance, nn.SpatialConvolution, nn.SpatialMaxPooling, and their cudnn alternatives.

guillaumekln (Collaborator) commented Jan 13, 2017

gradInput tensors can always be shared unless they are exposed outside the network, but the code handles that for you.
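For intuition, here is a minimal sketch of what sharing a gradInput buffer across clones amounts to in Torch. This is not OpenNMT's actual implementation; `shareGradInput` and `sharedGradInput` are hypothetical names used for illustration:

```lua
require 'torch'

-- A pre-allocated buffer that every clone's gradInput will point into.
local sharedGradInput = torch.Tensor()

local function shareGradInput(clones)
  for _, clone in ipairs(clones) do
    -- Tensor:set() makes the tensor view the shared storage. This is safe
    -- because clones run their backward pass one at a time, and each
    -- gradInput is consumed before the next clone overwrites the buffer.
    clone.gradInput = clone.gradInput or torch.Tensor()
    clone.gradInput:set(sharedGradInput)
  end
end
```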

output tensors can be shared if they are not used in the backward pass. You should take a look at the implementation of every new module you introduce in the code and apply these rules:

  1. If the content of input is used in updateGradInput or accGradParameters, add the module to the protectInput table.
  2. If the content of self.output is used in updateGradInput or accGradParameters, add the module to the protectOutput table.

For example, see the modules that are already blacklisted in MemoryOptimizer.lua and how they use their input or output in the backward pass.
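As an illustration only, here is a hedged sketch of what such protect tables might look like. The real tables live in MemoryOptimizer.lua, and the exact names and entries below are assumptions, not the actual blacklist:

```lua
require 'torch'

-- Hypothetical example entries; torch.type() returns the module's
-- class name, e.g. 'nn.Sigmoid'.

-- Modules whose `input` is read in updateGradInput or accGradParameters,
-- so their input buffer must survive until the backward pass.
local protectInput = {
  ['nn.SpatialConvolution'] = true, -- accGradParameters reads input to compute gradWeight
  ['nn.SpatialMaxPooling'] = true   -- updateGradInput reads input (and the pooling indices)
}

-- Modules whose `self.output` is read in updateGradInput or
-- accGradParameters, so their output buffer cannot be reused.
local protectOutput = {
  ['nn.Sigmoid'] = true, -- gradInput = gradOutput * output * (1 - output)
  ['nn.Tanh'] = true     -- gradInput = gradOutput * (1 - output^2)
}

-- Deciding whether a module's buffers are safe to share:
local function canShareInput(m)  return not protectInput[torch.type(m)]  end
local function canShareOutput(m) return not protectOutput[torch.type(m)] end
```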

It forces you to know the modules you use in detail, but the optimization is efficient. To validate your approach, check that you get the same perplexity with and without memory optimization.

helson73 (Author) commented

@guillaumekln Thank you so much!
