
Correct way to use LayerDeepLiftShap and LayerGradientShap for the XLNet model #903

Open
aj280192 opened this issue Mar 16, 2022 · 0 comments


I am facing an issue with the SHAP-based explainers for the XLNet model on the IMDB dataset. I am using a batch size of 1 to obtain attributions from the explainer.

I get the error below from LayerDeepLiftShap (the baseline is a `torch.stack` of the XLNet model's special tokens and zero tensors shaped like the model input).

```
Traceback (most recent call last):
  File "run_explainer.py", line 74, in <module>
    attribution, predictions = explainer.explain(batch)
  File "/home/ravichandran/thermostat/src/thermostat/explainers/shap.py", line 92, in explain
    attributions = self.explainer.attribute(
  File "/opt/conda/lib/python3.8/site-packages/captum/log/__init__.py", line 35, in wrapper
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/captum/attr/_core/layer/layer_deep_lift.py", line 673, in attribute
    attributions = DeepLiftShap._compute_mean_across_baselines(
  File "/opt/conda/lib/python3.8/site-packages/captum/attr/_core/deep_lift.py", line 914, in _compute_mean_across_baselines
    return torch.mean(attribution.view(attr_shape), dim=1, keepdim=False)
RuntimeError: shape '[1, 2, 4, 768]' is invalid for input of size 786432
```
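As a side note, some arithmetic on the error message suggests the mismatch is along the sequence axis rather than the batch or baseline axis: 786432 elements do not divide into `[1, 2, 4, 768]`, but they factor cleanly if the third dimension is 512 instead of 4. (512 is only my inference from the numbers; I have not confirmed where XLNet produces that length internally.)

```python
# Diagnose the LayerDeepLiftShap reshape failure.
# Captum attempts attribution.view([1, 2, 4, 768]), i.e. presumably
# (batch, n_baselines, seq_len, hidden), but the tensor holds 786432 elements.
expected_shape = (1, 2, 4, 768)
expected_numel = 1
for d in expected_shape:
    expected_numel *= d

actual_numel = 786432  # from the RuntimeError

print(expected_numel)                 # 6144 -- what Captum expects
print(actual_numel % (1 * 2 * 768))   # 0    -- batch/baseline/hidden dims fit
print(actual_numel // (1 * 2 * 768))  # 512  -- the sequence length the layer output implies
```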

I get the error below for LayerGradientShap:

```
Traceback (most recent call last):
  File "run_explainer.py", line 74, in <module>
    attribution, predictions = explainer.explain(batch)
  File "/home/ravichandran/thermostat/src/thermostat/explainers/shap.py", line 48, in explain
    attributions = self.explainer.attribute(inputs=inputs,
  File "/opt/conda/lib/python3.8/site-packages/captum/log/__init__.py", line 35, in wrapper
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/captum/attr/_core/layer/layer_gradient_shap.py", line 304, in attribute
    attributions = nt.attribute.__wrapped__(
  File "/opt/conda/lib/python3.8/site-packages/captum/attr/_utils/common.py", line 409, in wrapper
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/captum/attr/_core/noise_tunnel.py", line 375, in attribute
    update_partial_attribution_and_delta(
  File "/opt/conda/lib/python3.8/site-packages/captum/attr/_core/noise_tunnel.py", line 323, in update_partial_attribution_and_delta
    update_sum_attribution_and_sq(
  File "/opt/conda/lib/python3.8/site-packages/captum/attr/_core/noise_tunnel.py", line 236, in update_sum_attribution_and_sq
    attribution = attribution.view(attribution_shape)
RuntimeError: shape '[102, 5, 5, 768]' is invalid for input of size 1966080
```
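The second error points the same way. The expected shape `[102, 5, 5, 768]` would hold 1,958,400 elements, slightly fewer than the 1,966,080 the tensor actually has; keeping the 5 and 768 factors fixed, the remaining dimensions multiply to 512 rather than the expected 102 × 5 = 510. Again a 512-sized dimension fits exactly, which may hint that XLNet's layer output exposes a different sequence axis (or axis order) to Captum's hooks than the other models do. That reading is my speculation, not something I have verified in the XLNet code.

```python
# Same sanity check for the LayerGradientShap / NoiseTunnel failure.
# Captum attempts attribution.view([102, 5, 5, 768]) on 1966080 elements.
expected_shape = (102, 5, 5, 768)
expected_numel = 1
for d in expected_shape:
    expected_numel *= d

actual_numel = 1966080  # from the RuntimeError

print(expected_numel)             # 1958400 -- close to, but not equal to, the actual size
print(actual_numel % (5 * 768))   # 0       -- the 5 and 768 factors fit
print(actual_numel // (5 * 768))  # 512     -- instead of the expected 102 * 5 = 510
```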

The same functions work fine for BERT, ALBERT, RoBERTa, and ELECTRA; I am facing this issue only with XLNet.
