In-place computation can break gradient computation #2015

Open
cdoersch opened this issue Mar 2, 2015 · 4 comments

@cdoersch
Contributor

cdoersch commented Mar 2, 2015

For instance, MVNLayer reads data from its top blob during the backward pass, assuming that this data is exactly the output it produced in the forward pass. If a later layer has modified that blob through in-place computation, the gradient will be computed incorrectly.
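
To make the failure mode concrete, here is a minimal sketch in plain Python rather than Caffe code. The `Blob` class and the `tanh_*` / `scale_*` functions are hypothetical stand-ins: the tanh layer plays the role of any layer that reads its top data during backward, and the scale layer plays the role of a later layer that may or may not run in-place.

```python
import math


class Blob:
    """Toy stand-in for a blob: one data buffer and one gradient buffer."""

    def __init__(self, data):
        self.data = list(data)
        self.diff = [0.0] * len(self.data)


def tanh_forward(bottom, top):
    top.data[:] = [math.tanh(v) for v in bottom.data]


def tanh_backward(top, bottom):
    # Reads top.data, assuming it still holds the forward output.
    bottom.diff[:] = [g * (1.0 - y * y) for g, y in zip(top.diff, top.data)]


def scale_forward(bottom, top, factor):
    # Works in-place when bottom and top are the same Blob.
    top.data[:] = [factor * v for v in bottom.data]


def scale_backward(top, bottom, factor):
    bottom.diff[:] = [factor * g for g in top.diff]


def run(in_place):
    """Return d(loss)/dx for loss = 3 * tanh(x) at x = 0.5."""
    x = Blob([0.5])
    y = Blob([0.0])
    z = y if in_place else Blob([0.0])  # in-place: the scale layer overwrites y

    tanh_forward(x, y)
    scale_forward(y, z, 3.0)

    z.diff[:] = [1.0]  # gradient of the loss with respect to z
    scale_backward(z, y, 3.0)
    tanh_backward(y, x)  # sees 3 * tanh(x) in y.data when in_place is True
    return x.diff[0]


correct = run(in_place=False)
broken = run(in_place=True)
print(correct, broken)
```

With separate blobs the result matches the analytic gradient `3 * (1 - tanh(0.5) ** 2)`. With the shared blob the backward pass silently uses the scaled values, and no error or warning is raised.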

In general, Caffe should not rely on the user to know when a layer can safely be run in-place.

@seanbell

seanbell commented Mar 3, 2015

Note: you may want to coordinate with #1979, which fixes some bugs in MVNLayer.

@longjon
Contributor

longjon commented Jul 13, 2015

According to @mfigurnov, cuDNN max pooling is also a layer that requires its top data during backward.

@seanbell

It's probably worth adding a mechanism to each layer that says whether it (a) does in-place computation and (b) can support the next layer doing in-place computation. Then, the net could check that all of the layers are compatible upon startup.

@shelhamer
Member

Further thoughts from Sean Bell in #2853:

My understanding is that right now there is no specification -- you basically need to study the layer implementation to decide whether or not you can put an in-place layer after it. Getting it wrong will lead to incorrect results, but you won't get any error or warning about it.

A better solution would be to have each layer declare whether or not it allows for in-place computation, as well as whether the next layer can have in-place computation. Then, caffe could check these flags and raise errors as necessary. This isn't implemented, but it would be great if someone did.
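
As a rough illustration of what such a check might look like, here is a sketch in plain Python. Nothing in it is an existing Caffe API: `LayerSpec`, the two flags `can_run_in_place` and `allows_in_place_consumer`, `InPlaceError`, and `check_in_place` are all hypothetical names, and each layer is assumed to have exactly one bottom and one top blob.

```python
from dataclasses import dataclass


@dataclass
class LayerSpec:
    """Hypothetical per-layer declaration, limited to one bottom and one top."""
    name: str
    bottom: str
    top: str
    can_run_in_place: bool = False          # (a) layer may overwrite its bottom
    allows_in_place_consumer: bool = True   # (b) layer tolerates its top being overwritten


class InPlaceError(ValueError):
    """Raised at net setup when an in-place configuration is unsafe."""


def check_in_place(layers):
    """Walk the layers in order and reject unsafe in-place configurations."""
    last_writer = {}  # blob name -> most recent layer that wrote to it
    for layer in layers:
        if layer.bottom == layer.top:
            if not layer.can_run_in_place:
                raise InPlaceError(
                    f"{layer.name} does not support in-place computation")
            producer = last_writer.get(layer.top)
            if producer is not None and not producer.allows_in_place_consumer:
                raise InPlaceError(
                    f"{producer.name} needs its top data during backward, "
                    f"so {layer.name} cannot run in-place on '{layer.top}'")
        last_writer[layer.top] = layer


# A layer that reads its top during backward, followed by an in-place layer.
net = [
    LayerSpec("mvn", bottom="data", top="mvn", allows_in_place_consumer=False),
    LayerSpec("relu", bottom="mvn", top="mvn", can_run_in_place=True),
]
try:
    check_in_place(net)
except InPlaceError as err:
    print("rejected:", err)
```

Tracking the most recent writer of each blob means a chain of several in-place layers on the same blob is checked pairwise, so every layer in the chain has to tolerate the one that follows it.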

4 participants