
Question about "Add feature Layer-wise Relevance Propagation (#342)" #546

Open

chunfuchen opened this issue Dec 1, 2020 · 4 comments

@chunfuchen

❓ Questions and Help

We have a set of resources available on the website and in the FAQ: https://captum.ai/ and https://captum.ai/docs/faq. Feel free to open an issue here on GitHub or in our discussion forums:

May I ask a question about #342?
At the very beginning of the conversation, it says that LRP does not support skip connections (as in ResNet) and does not support in-place operations either.

Nonetheless, I see that the unit tests cover both skip connections and in-place operations.
Can I assume that the LRP implementation on the master branch supports both cases (skip connections and in-place operations)?

Thanks.

@NarineK (Contributor) commented Dec 2, 2020

Hi @chunfuchen!

  1. I think that we fixed the issues related to in-place operations, such as in-place ReLUs.
  2. In terms of skip connections, @nanohanno had a workaround for that problem, but @rGure found some issues with the current implementation and is planning to submit a PR to address them.
    Add feature Layer-wise Relevance Propagation #342 (comment)

cc: @nanohanno , @rGure

@rGure commented Dec 3, 2020

Hi @chunfuchen,

Yes, as @NarineK said, there was an issue if you assign different rules to the operations in the skip connection. For example, if you have f(x) + g(x) and assign different rules to f and g, the current implementation has an ordering ambiguity (which may be resolved incorrectly) when the overridden gradients are accumulated at the intermediate result x. However, if you are happy with using just one rule, that should not be a problem.
I will try to get as much of it done as possible this weekend.

I would also like to point out, as @nanohanno wrote somewhere in the documentation, that if you want to assign an LRP rule to the addition operation itself (which you probably should), it has to be implemented as a torch.nn.Module.
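
For illustration, a minimal sketch of such a module could look like this (the name AdditionModule is just illustrative):

```python
import torch
import torch.nn as nn


class AdditionModule(nn.Module):
    """Elementwise addition wrapped as a module, so an LRP rule can be attached to it."""

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        return x + y
```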

Best

@chunfuchen (Author)

Thanks! @rGure and @NarineK

So, if I replace the f(x)+g(x) statement with AdditionModule(f(x), g(x)), it will be fine, right? (I use the same rule for both f(x) and g(x).)

Thanks for all the great work.

@rGure commented Dec 7, 2020

Hi @chunfuchen,

From everything I have observed, that should be fine. However, there is the compute_convergence_delta method, which you can use to monitor gross misbehavior. Ideally the convergence deltas would be zero; in practice they are not, but they should not be on the order of 1e+3 or so.
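
For example, a minimal sketch of checking the deltas with Captum's LRP, using a toy fully connected model as a stand-in for your own (modified) network:

```python
import torch
import torch.nn as nn
from captum.attr import LRP

# Toy stand-in model; replace with your own network.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 10)).eval()
inputs = torch.randn(4, 16)

lrp = LRP(model)
# return_convergence_delta=True returns one delta per example.
attributions, delta = lrp.attribute(inputs, target=0, return_convergence_delta=True)
print(delta.abs().max())  # ideally close to zero, certainly nowhere near 1e+3
```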

Regarding the addition: the current LRP implementation recursively resolves the involved operations using the children() method, so it is important that all operations show up in the _modules property. The easiest way to ensure that is to make any "child" module an attribute of the "parent" module and then call it in the parent module's forward method. For the BasicBlock of the torchvision ResNet implementation, for example, you would add something like
self.add = AdditionModule()
to the constructor of the BasicBlock class and then invoke
out = self.add(out, identity)
in the forward method at the appropriate place.
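
Putting it together, a trimmed sketch (loosely based on torchvision's BasicBlock, not the exact upstream code) might look like this:

```python
import torch.nn as nn


class AdditionModule(nn.Module):
    """Elementwise addition as a child module so LRP can discover it via children()."""

    def forward(self, x, y):
        return x + y


class BasicBlock(nn.Module):
    """Trimmed version of torchvision's BasicBlock with the skip-connection addition wrapped."""

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super().__init__()
        self.conv1 = nn.Conv2d(inplanes, planes, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.relu = nn.ReLU()  # non-inplace to stay on the safe side with LRP
        self.conv2 = nn.Conv2d(planes, planes, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)
        self.downsample = downsample
        self.add = AdditionModule()  # addition now shows up in _modules

    def forward(self, x):
        identity = x
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
        out = self.conv2(out)
        out = self.bn2(out)
        if self.downsample is not None:
            identity = self.downsample(x)
        out = self.add(out, identity)  # instead of out += identity
        out = self.relu(out)
        return out
```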
Hope that helps.

@NarineK
Sorry for the silence. I am still waiting for some internal clearance, which took longer than expected because of the holidays. I will push as soon as it is there.
