
Correct way to use LayerDeepLiftShap and LayerGradientShap for the XLNet model #903

Open
aj280192 opened this issue Mar 16, 2022 · 0 comments


I am facing an issue with the SHAP-based explainers for the XLNet model on the IMDB dataset. I am using a batch size of 1 to obtain attributions from the explainer.

I get the error below from LayerDeepLiftShap (the baseline is a `torch.stack` of the XLNet model's special tokens and zero tensors shaped like the model input).

```
Traceback (most recent call last):
  File "run_explainer.py", line 74, in <module>
    attribution, predictions = explainer.explain(batch)
  File "/home/ravichandran/thermostat/src/thermostat/explainers/shap.py", line 92, in explain
    attributions = self.explainer.attribute(
  File "/opt/conda/lib/python3.8/site-packages/captum/log/__init__.py", line 35, in wrapper
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/captum/attr/_core/layer/layer_deep_lift.py", line 673, in attribute
    attributions = DeepLiftShap._compute_mean_across_baselines(
  File "/opt/conda/lib/python3.8/site-packages/captum/attr/_core/deep_lift.py", line 914, in _compute_mean_across_baselines
    return torch.mean(attribution.view(attr_shape), dim=1, keepdim=False)
RuntimeError: shape '[1, 2, 4, 768]' is invalid for input of size 786432
```
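As a side note, some arithmetic on the error message suggests the mismatch is along the sequence axis rather than the batch or baseline axis: 786432 elements do not divide into `[1, 2, 4, 768]`, but they factor cleanly if the third dimension is 512 instead of 4. (512 is only my inference from the numbers; I have not confirmed where XLNet produces that length internally.)

```python
# Diagnose the LayerDeepLiftShap reshape failure.
# Captum attempts attribution.view([1, 2, 4, 768]), i.e. presumably
# (batch, n_baselines, seq_len, hidden), but the tensor holds 786432 elements.
expected_shape = (1, 2, 4, 768)
expected_numel = 1
for d in expected_shape:
    expected_numel *= d

actual_numel = 786432  # from the RuntimeError

print(expected_numel)                 # 6144 -- what Captum expects
print(actual_numel % (1 * 2 * 768))   # 0    -- batch/baseline/hidden dims fit
print(actual_numel // (1 * 2 * 768))  # 512  -- the sequence length the layer output implies
```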

I get the error below for LayerGradientShap:

```
Traceback (most recent call last):
  File "run_explainer.py", line 74, in <module>
    attribution, predictions = explainer.explain(batch)
  File "/home/ravichandran/thermostat/src/thermostat/explainers/shap.py", line 48, in explain
    attributions = self.explainer.attribute(inputs=inputs,
  File "/opt/conda/lib/python3.8/site-packages/captum/log/__init__.py", line 35, in wrapper
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/captum/attr/_core/layer/layer_gradient_shap.py", line 304, in attribute
    attributions = nt.attribute.__wrapped__(
  File "/opt/conda/lib/python3.8/site-packages/captum/attr/_utils/common.py", line 409, in wrapper
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/captum/attr/_core/noise_tunnel.py", line 375, in attribute
    update_partial_attribution_and_delta(
  File "/opt/conda/lib/python3.8/site-packages/captum/attr/_core/noise_tunnel.py", line 323, in update_partial_attribution_and_delta
    update_sum_attribution_and_sq(
  File "/opt/conda/lib/python3.8/site-packages/captum/attr/_core/noise_tunnel.py", line 236, in update_sum_attribution_and_sq
    attribution = attribution.view(attribution_shape)
RuntimeError: shape '[102, 5, 5, 768]' is invalid for input of size 1966080
```
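The second error points the same way. The expected shape `[102, 5, 5, 768]` would hold 1,958,400 elements, slightly fewer than the 1,966,080 the tensor actually has; keeping the 5 and 768 factors fixed, the remaining dimensions multiply to 512 rather than the expected 102 × 5 = 510. Again a 512-sized dimension fits exactly, which may hint that XLNet's layer output exposes a different sequence axis (or axis order) to Captum's hooks than the other models do. That reading is my speculation, not something I have verified in the XLNet code.

```python
# Same sanity check for the LayerGradientShap / NoiseTunnel failure.
# Captum attempts attribution.view([102, 5, 5, 768]) on 1966080 elements.
expected_shape = (102, 5, 5, 768)
expected_numel = 1
for d in expected_shape:
    expected_numel *= d

actual_numel = 1966080  # from the RuntimeError

print(expected_numel)             # 1958400 -- close to, but not equal to, the actual size
print(actual_numel % (5 * 768))   # 0       -- the 5 and 768 factors fit
print(actual_numel // (5 * 768))  # 512     -- instead of the expected 102 * 5 = 510
```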

The same functions work fine for BERT, ALBERT, RoBERTa, and ELECTRA; I am facing this issue only with XLNet.
