Add support for more attribution methods #22

Closed
RachitBansal opened this issue Jan 15, 2021 · 6 comments
Labels: enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@RachitBansal

Hi,
Currently, the project seems to rely on grad-norm and grad-x-input to obtain attributions. However, there are other, arguably better (as discussed in recent work) methods for obtaining saliency maps. Integrating them into this project would also provide a good way to compare them on the same input examples.

Some of these methods, off the top of my head, are integrated gradients, gradient SHAP, and LIME. Perhaps support for visualizing the attention maps of the model being interpreted could also be added. Methods based on feature ablation are also possible, but they might need more work to integrate.

Captum supports the aforementioned methods, but it takes effort to get them working for NLP tasks, especially those based on language modeling. Thus, I feel this would be a useful addition here.
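To illustrate the kind of effort involved, here is a minimal sketch of wiring Captum's Integrated Gradients to a Hugging Face causal LM. The model choice, forward wrapper, and target handling are illustrative assumptions, not code from this project:

```python
import torch
from captum.attr import LayerIntegratedGradients
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def forward_func(input_ids):
    # Captum needs an output tensor it can index with `target`;
    # return the next-token logits at the last position.
    return model(input_ids).logits[:, -1, :]

input_ids = tokenizer("The keys to the cabinet", return_tensors="pt").input_ids
with torch.no_grad():
    target_id = int(forward_func(input_ids)[0].argmax())

# IG cannot integrate over discrete token ids, so attribute with respect
# to the token embedding layer instead.
lig = LayerIntegratedGradients(forward_func, model.transformer.wte)
attributions = lig.attribute(input_ids, target=target_id, n_steps=50)
token_scores = attributions.sum(dim=-1).squeeze(0)  # one score per input token
```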

@jalammar added the enhancement and help wanted labels on Jan 16, 2021
@jalammar
Owner

Hi Rachit,

Thanks for your feedback. I see where you're coming from. I actually have code fragments that get Captum's Integrated Gradients working for text generation, and it would indeed be reasonable to "outsource" saliency calculation to Captum, at least as an option. I'll keep this issue active and open for contributions. Let me dig up that code and post a gist to help orient the discussion.

@RachitBansal
Author

RachitBansal commented Jan 16, 2021

Thank you for addressing this, @jalammar. This sounds great.

I got the aforementioned methods working for machine translation using Captum, but I had to make some changes inside the open-sourced model itself (XLM, in this case). I wonder what might be the best way to share such a thing.

Also, I would be happy to help with this contribution. Could you elaborate a bit on what you meant by outsourcing the saliency calculation to Captum? Do you mean adding it as a dependency and plugging it in at the places where the attributions need to be computed? I was thinking more along the lines of implementing those methods alongside the current ones in attribution.py itself.

@jalammar
Owner

> I wonder what might be the best way to share such a thing.

Probably a GitHub Gist?

> Could you elaborate a bit on what you meant by outsourcing the saliency calculation to Captum? Do you mean adding it as a dependency and plugging it in at the places where the attributions need to be computed?

Yes, exactly. This way it becomes Captum's concern to get these implementations correct, and we don't duplicate the effort while getting a large collection of methods supported and maintained.
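For concreteness, a rough sketch of what that kind of plumbing could look like inside attribution.py; the registry contents and helper name are assumptions, not the actual implementation:

```python
from captum.attr import InputXGradient, IntegratedGradients, Saliency

# Hypothetical name-to-class registry; Captum owns the implementations,
# and this project would only dispatch to them.
ATTR_METHODS = {
    "ig": IntegratedGradients,
    "saliency": Saliency,
    "grad_x_input": InputXGradient,
}

def compute_attributions(method_name, forward_func, inputs, target):
    attr_method_class = ATTR_METHODS[method_name]
    attr_method = attr_method_class(forward_func)  # gradient methods take any callable
    return attr_method.attribute(inputs, target=target)
```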

@jalammar
Owner

jalammar commented Dec 6, 2021

#47 now adds the following methods (powered by Captum):

  • IntegratedGradients
  • Saliency
  • InputXGradient
  • DeepLift
  • DeepLiftShap
  • GuidedBackprop
  • GuidedGradCam
  • Deconvolution
  • LRP

@jalammar jalammar closed this as completed Dec 6, 2021
@Victordmz

Victordmz commented Dec 12, 2022

LRP does not work, since its constructor takes no forward_func parameter. The same goes for all the rest except IG, Saliency, and InputXGradient.
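A sketch of the constructor mismatch being described here: in Captum, the gradient-only methods accept any callable, while LRP/DeepLift-style methods are constructed from the nn.Module itself so they can register hooks on its layers (forward_func and model below stand in for whatever the caller already has):

```python
from captum.attr import IntegratedGradients, LRP

IntegratedGradients(forward_func=forward_func)  # OK: the parameter is named forward_func
LRP(forward_func=forward_func)                  # TypeError: LRP's parameter is named model
LRP(model)                                      # expected call; model must be an nn.Module
```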

@BiEchi
Contributor

BiEchi commented Aug 23, 2023

@jalammar Sorry to continue on this closed issue. As @Victordmz said, support for the other methods like LRP and DeepLift is not working. Specifically, even changing forward_func to model (as described in the Captum docs) raises an exception. New code in attribution.py:

ig = attr_method_class(model=model)
attributions = ig.attribute(inputs, target=prediction_id)

This raises an exception in the local model:

saliency4alce/transformers/src/transformers/models/llama/modeling_llama.py", line 629, in forward
    batch_size, seq_length = input_ids.shape
ValueError: too many values to unpack (expected 2)

I'm trying to figure out a solution and will contribute by opening a PR. If anyone has already solved this problem, I would appreciate it a lot if you could ping me here!
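One plausible reading of the traceback, with a hedged workaround rather than a confirmed fix: Captum perturbs the tensor it is given, and the inputs being attributed here appear to be a 3-D embeddings tensor (batch, seq, hidden), which the model's forward then tries to unpack as 2-D input_ids. Routing 3-D inputs through inputs_embeds via a wrapper avoids that; the wrapper name is hypothetical:

```python
import torch.nn as nn

class EmbeddingsForward(nn.Module):  # hypothetical helper, not project code
    """Wraps a Hugging Face causal LM so Captum can feed it embeddings."""
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, inputs_embeds):
        # HF causal LMs accept inputs_embeds in place of input_ids.
        return self.model(inputs_embeds=inputs_embeds).logits[:, -1, :]

wrapped = EmbeddingsForward(model)
attr_method = attr_method_class(wrapped)               # e.g. DeepLift(wrapped)
inputs_embeds = model.get_input_embeddings()(inputs)   # (batch, seq) -> (batch, seq, hidden)
attributions = attr_method.attribute(inputs_embeds, target=prediction_id)
```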
