
[Summary] Add metrics for feature attribution evaluation #112

Open
gsarti opened this issue Dec 1, 2021 · 0 comments
Labels: enhancement (New feature or request), help wanted (Extra attention is needed), summary (Summarizes multiple sub-tasks)

Comments

gsarti (Member) commented Dec 1, 2021

🚀 Feature Request

The following is a non-exhaustive list of metrics for evaluating feature attribution methods that could be added to the library:

| Method name | Source | Code implementation | Status |
|---|---|---|---|
| Sensitivity | Yeh et al. '19 | pytorch/captum | |
| Infidelity | Yeh et al. '19 | pytorch/captum | |
| Log Odds | Shrikumar et al. '17 | INK-USC/DIG | |
| Sufficiency | DeYoung et al. '20 | INK-USC/DIG | |
| Comprehensiveness | DeYoung et al. '20 | INK-USC/DIG | |
| Human Agreement | Atanasova et al. '20 | copenlu/xai-benchmark | |
| Confidence Indication | Atanasova et al. '20 | copenlu/xai-benchmark | |
| Cross-Model Rationale Consistency | Atanasova et al. '20 | copenlu/xai-benchmark | |
| Cross-Example Rationale Consistency (Dataset Consistency) | Atanasova et al. '20 | copenlu/xai-benchmark | |
| Sensitivity | Yin et al. '22 | uclanlp/NLP-Interpretation-Faithfulness | |
| Stability | Yin et al. '22 | uclanlp/NLP-Interpretation-Faithfulness | |

Notes:

  1. The Log Odds metric is just the negative logarithm of the Comprehensiveness metric. The application of `-log` can be controlled by a parameter `do_log_odds: bool = False` in the same function. The reciprocal can be obtained analogously for the Sufficiency metric.

  2. All metrics that control masking/dropping a portion of the inputs via a `top_k` parameter can benefit from a recursive application that ensures the most salient tokens are masked at every step, as described in Madsen et al. '21. This could be captured by a parameter `recursive_steps: Optional[int] = None`. If specified, a masking of size `top_k // recursive_steps + int(top_k % recursive_steps > 0)` is performed at each of the `recursive_steps` steps, with the final step sized so that the total number of masked tokens equals `top_k`.

  3. The Sensitivity and Infidelity methods add noise to input embeddings, which could produce embeddings that are unrealistic for the model (see the discussion in Sanyal et al. '21). Both metrics can include a parameter `discretize: bool = False` that, when enabled, replaces the top-k inputs with their nearest neighbors in the vocabulary embedding space instead of their noised versions. Using Stability is more principled in this context, since fluency is preserved by the two-step procedure presented by Alzantot et al. '18, which includes a language modeling component. An additional parameter `sample_topk_neighbors: int = 1` can be used to control the size of the nearest-neighbor pool used for replacement.

  4. Sensitivity by Yin et al. '22 is an adaptation to the NLP domain of Sensitivity-n by Yeh et al. '19. An important difference is that Yin et al. '22 use the norm of the noise vector causing the prediction to flip as the metric, while the original Sensitivity in Captum uses the difference between the original and noised prediction scores. The former should be prioritized for implementation.

  5. Cross-Lingual Faithfulness by Zaman and Belinkov '22 (code) is a special case of the Dataset Consistency metric by Atanasova et al. '20 in which the pair is constituted by an example and its translated variant.
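The recursive masking described in note 2 can be sketched as follows. This is a minimal illustration, not an existing API: the helper name `masking_schedule` is an assumption, and the schedule is built so the per-step mask sizes always sum to `top_k`, with the final step absorbing the remainder.

```python
def masking_schedule(top_k, recursive_steps=None):
    """Return the per-step mask sizes for recursive masking (note 2 sketch).

    With recursive_steps=None, all top_k tokens are masked in one shot.
    Otherwise, each step masks roughly ceil(top_k / recursive_steps) of the
    currently most salient tokens, so saliency can be recomputed on the
    partially masked input between steps (as in Madsen et al. '21).
    """
    if recursive_steps is None:
        return [top_k]  # single-shot masking
    # per-step size from the issue: top_k // recursive_steps, rounded up
    step = top_k // recursive_steps + int(top_k % recursive_steps > 0)
    schedule = [step] * (recursive_steps - 1)
    schedule.append(top_k - sum(schedule))  # final step takes the remainder
    return [s for s in schedule if s > 0]   # drop empty trailing steps
```

For example, `masking_schedule(10, recursive_steps=3)` yields `[4, 4, 2]`: two steps of four tokens and a final step of two, totaling `top_k = 10`.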
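The `discretize` behavior from note 3 could look like the sketch below. The function name and signature are assumptions for illustration only: each noised embedding is mapped back onto the vocabulary embedding space by sampling from its `sample_topk_neighbors` nearest neighbors.

```python
import numpy as np

def discretize_embeddings(noised, vocab_embeds, sample_topk_neighbors=1, rng=None):
    """Replace noised input embeddings with nearby vocabulary embeddings
    (note 3 sketch; hypothetical helper, not an existing library function).

    noised:       array of shape (seq_len, dim), noised input embeddings.
    vocab_embeds: array of shape (vocab_size, dim), the embedding matrix.
    """
    rng = rng or np.random.default_rng(0)
    # pairwise squared Euclidean distances, shape (seq_len, vocab_size)
    dists = ((noised[:, None, :] - vocab_embeds[None, :, :]) ** 2).sum(-1)
    # pool of the sample_topk_neighbors closest vocabulary entries per position
    pool = np.argsort(dists, axis=1)[:, :sample_topk_neighbors]
    # sample one neighbor per position from its pool
    choice = rng.integers(0, sample_topk_neighbors, size=len(noised))
    return vocab_embeds[pool[np.arange(len(noised)), choice]]
```

With `sample_topk_neighbors=1` this reduces to deterministic nearest-neighbor snapping; larger pools trade faithfulness to the noised point for diversity in the replacements.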

Overviews

A Comparative Study of Faithfulness Metrics for Model Interpretability Methods, Chan et al. '22

@gsarti gsarti added enhancement New feature or request summary Summarizes multiple sub-tasks help wanted Extra attention is needed labels Dec 1, 2021
@gsarti gsarti added this to the v1.0 milestone Dec 1, 2021
@gsarti gsarti removed this from the Demo Paper Release milestone Oct 26, 2022