For more information on saliency cards, see: Saliency Cards: A Framework to Characterize and Compare Saliency Methods

{Method Name} Saliency Card

Provide a summary of the saliency method.

Methodology

Describe how the saliency is computed, its intended use, and important considerations.

  • Developed by: {developers}
  • Shared by [optional]: {who is sharing it}
  • References: {links to relevant papers, blog posts, and demos}
  • Implementations and Tutorials [optional]: {links to source code, tutorials, and implementations}
  • Aliases [optional]: {other names the method goes by}
  • Example: {a visual example of the method}

Determinism

Describe the saliency method's sources of non-determinism.

Hyperparameter Dependence

Describe the saliency method's hyperparameters and suggest how to set them.

Model Agnosticism

Describe the types of models the saliency method applies to.

Computational Efficiency

Describe the saliency method's computational efficiency and computing expectations.

Semantic Directness

Describe what the saliency method's output represents and the knowledge required to interpret the results.

Sensitivity Testing

Report results of the relevant sensitivity evaluations. Use 🟢 to indicate the saliency method passed, 🟥 to indicate it failed, and 🟨 to indicate the evaluation was inconclusive.

Input Sensitivity

Provide the results of the saliency method on input sensitivity tests:

[🟢 / 🟨 / 🟥] Completeness: Requires the sum of the saliency to equal the difference between the model's output on the original input and the model's output on a meaningless input.
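If the method under evaluation produces per-feature attributions, the completeness check can be sketched as below. All names (`f`, `baseline`, the gradient-times-input attribution) are illustrative; the toy linear model is chosen only because gradient × input is provably complete for it.

```python
import numpy as np

def completeness_gap(saliency, f, x, baseline):
    """|sum(saliency) - (f(x) - f(baseline))|; zero means completeness holds."""
    return abs(saliency.sum() - (f(x) - f(baseline)))

# Toy linear model: for f(x) = w @ x, gradient-times-input (w * x) is complete.
w = np.array([0.5, -1.0, 2.0])
f = lambda x: float(w @ x)
x = np.array([1.0, 2.0, 3.0])
baseline = np.zeros_like(x)  # the "meaningless input"
saliency = w * x
gap = completeness_gap(saliency, f, x, baseline)
```

In practice the baseline is method- and domain-specific (a black image, a mean embedding, etc.), so report which baseline the card's claim assumes.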

[🟢 / 🟨 / 🟥] Deletion: Measures the change in the model's output as input features are iteratively removed based on their saliency ranking. Additional evaluations in: Metrics for saliency map evaluation of deep learning explanation methods.
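A deletion curve can be sketched as follows, assuming "removal" means zeroing a feature and `model` returns a scalar score (both simplifications; real evaluations often use blurring or dataset-mean imputation instead of zeros):

```python
import numpy as np

def deletion_curve(model, x, saliency):
    """Model scores as features are removed most-salient-first; a faithful
    saliency map should make this curve drop quickly."""
    order = np.argsort(saliency)[::-1]  # most salient first
    x_del = x.copy()
    scores = [model(x_del)]
    for i in order:
        x_del[i] = 0.0  # "remove" the feature
        scores.append(model(x_del))
    return np.array(scores)

w = np.array([3.0, 1.0, 2.0])
model = lambda x: float(w @ x)
x = np.ones(3)
scores = deletion_curve(model, x, saliency=w * x)
```

The curve is usually summarized by its area: lower area under the deletion curve indicates a more faithful ranking.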

[🟢 / 🟨 / 🟥] Faithfulness: Measures the change in the model's output as input features are obscured or removed based on their saliency rank.

[🟢 / 🟨 / 🟥] Infidelity: Measures the mean squared error between the saliency weighted by an input perturbation and the difference in the model's output between the actual and perturbed inputs.

[🟢 / 🟨 / 🟥] Input Consistency: Measures the consistency of the saliency when the input features are swapped with synonymous features.

[🟢 / 🟨 / 🟥] Input Invariance: Measures the difference in saliency between a model trained on the original data and a model trained on the data with a constant shift.

[🟢 / 🟨 / 🟥] Insertion: Measures the change in the model's output as input features are iteratively added based on their saliency ranking. Additional evaluations in: Metrics for saliency map evaluation of deep learning explanation methods.

[🟢 / 🟨 / 🟥] Perturbation Testing (LeRF): Measures the change in the model's output as input features are iteratively set to zero, starting with the least salient features.


[🟢 / 🟨 / 🟥] Perturbation Testing (MoRF): Measures the change in the model's output as input features are iteratively set to zero, starting with the most salient features.

[🟢 / 🟨 / 🟥] Region Perturbation: Measures the change in the model's output as input regions are perturbed based on their saliency ranking.

[🟢 / 🟨 / 🟥] ROAR: Measures the difference in model behavior between a model trained on the original inputs and a model retrained on inputs with the original model's salient features removed.

[🟢 / 🟨 / 🟥] Robustness: Measures the change in saliency when meaningless perturbations are applied to the input features.

[🟢 / 🟨 / 🟥] Sensitivity: Measures the change in saliency when insignificant perturbations are added to the input.
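A common instantiation of this test is max-sensitivity: the largest change in saliency over random perturbations within a small radius. A sketch, where `explain` stands in for whatever saliency function the card describes:

```python
import numpy as np

rng = np.random.default_rng(0)

def max_sensitivity(explain, x, radius=0.01, n_samples=50):
    """Largest change in saliency under small, meaningless input
    perturbations; low values mean the explanation is stable."""
    base = explain(x)
    worst = 0.0
    for _ in range(n_samples):
        pert = rng.uniform(-radius, radius, size=x.shape)
        worst = max(worst, float(np.linalg.norm(explain(x + pert) - base)))
    return worst

# Gradient saliency of a linear model is constant, so it is maximally stable.
w = np.array([1.0, 2.0])
explain = lambda x: w
sens = max_sensitivity(explain, np.zeros(2))
```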

[🟢 / 🟨 / 🟥] Stability: Measures the change in saliency when adversarial perturbations are added to the input.

[🟢 / 🟨 / 🟥] Sufficiency: Tests if the set of salient features is sufficient for the model to make a confident and correct prediction.
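One way to operationalize sufficiency is to keep only the top-k salient features and check whether the model's score stays above a confidence threshold. A sketch with hypothetical names (`k` and `threshold` are evaluation choices, not part of any standard):

```python
import numpy as np

def sufficiency_check(model, x, saliency, k, threshold):
    """Keep only the top-k salient features (zeroing the rest); sufficiency
    holds if the model's score on this masked input clears the threshold."""
    keep = np.argsort(saliency)[::-1][:k]
    masked = np.zeros_like(x)
    masked[keep] = x[keep]
    return model(masked) >= threshold

w = np.array([4.0, 0.1, 0.2])
model = lambda x: float(w @ x)
x = np.ones(3)
ok = sufficiency_check(model, x, saliency=w * x, k=1, threshold=3.5)
```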

Label Sensitivity

Provide the results of the saliency method on label sensitivity tests:

[🟢 / 🟨 / 🟥] Data Randomization: Measures the change in saliency between a model trained on the original labels and a model trained with random label permutations.
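Randomization tests like this one hinge on how "change in saliency" is measured; rank correlation between the two saliency maps is a common choice. A NumPy-only Spearman sketch (real evaluations also use SSIM or HOG similarity):

```python
import numpy as np

def rank_correlation(s1, s2):
    """Spearman rank correlation between two saliency maps (flattened).
    A method that passes data randomization shows low correlation between
    saliency from the trained model and the randomized-label model."""
    r1 = np.argsort(np.argsort(s1.ravel()))
    r2 = np.argsort(np.argsort(s2.ravel()))
    return float(np.corrcoef(r1, r2)[0, 1])

s_trained = np.array([0.9, 0.1, 0.5, 0.3])
s_random_labels = np.array([0.2, 0.8, 0.4, 0.6])  # hypothetical values
corr = rank_correlation(s_trained, s_random_labels)
```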

[🟢 / 🟨 / 🟥] Model Contrast Score: Measures the change in saliency between two models trained on controlled variants of the dataset where feature importances are known.

Model Sensitivity

Provide the results of the saliency method on model sensitivity tests:

[🟢 / 🟨 / 🟥] Cascading Model Parameter Randomization: Measures the change in saliency as model weights are successively randomized.

[🟢 / 🟨 / 🟥] Implementation Invariance: Tests if the saliency is identical for two functionally equivalent models.

[🟢 / 🟨 / 🟥] Independent Model Parameter Randomization: Measures the change in saliency as layers of the model are randomized one at a time.

[🟢 / 🟨 / 🟥] Linearity: Tests that the saliency of two composed models is a weighted sum of the saliency for each model.

[🟢 / 🟨 / 🟥] Model Consistency: Measures the change in saliency between the original model and its compressed variant.

[🟢 / 🟨 / 🟥] Model Weight Randomization: Measures the change in saliency between fully trained and fully randomized models.

[🟢 / 🟨 / 🟥] Repeatability: Measures the difference in saliency between two independently initialized models trained in the same way on the same data.

[🟢 / 🟨 / 🟥] Reproducibility: Measures the difference in saliency between two models with different architectures trained in the same way on the same data.

Perceptibility Testing

Report results of the relevant perceptibility evaluations. Use 🟢 to indicate the saliency method passed, 🟥 to indicate it failed, and 🟨 to indicate the evaluation was inconclusive.

Minimality

Provide the results of the saliency method on minimality tests:

[🟢 / 🟨 / 🟥] Minimality: Tests if the salient features are the smallest set of features the model can use to make a confident and correct prediction.

[🟢 / 🟨 / 🟥] Sparsity: Measures the ratio between the maximum and minimum saliency values. High sparsity means the saliency is concentrated on a small set of features.
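The max/min ratio is trivial to compute; a sketch (the epsilon guard against zero-valued saliency is our addition, not part of the metric's definition):

```python
import numpy as np

def sparsity(saliency, eps=1e-8):
    """Ratio of max to min absolute saliency; higher means more focused."""
    s = np.abs(saliency)
    return float(s.max() / (s.min() + eps))

focused = np.array([0.01, 0.02, 0.97])  # hypothetical saliency maps
diffuse = np.array([0.30, 0.35, 0.35])
ratio_focused = sparsity(focused)
ratio_diffuse = sparsity(diffuse)
```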

[🟢 / 🟨 / 🟥] Visual Sharpening: Human evaluation of the "sharpness" of the saliency.

Perceptual Correspondence

Provide the results of the saliency method on perceptual correspondence tests:

[🟢 / 🟨 / 🟥] Localization Utility: Measures the intersection of the saliency and the ground truth features.

[🟢 / 🟨 / 🟥] Luminosity Calibration: Measures if the relative saliency for two features is equivalent to their relative impact on the model's output.

[🟢 / 🟨 / 🟥] Mean IoU: Measures the intersection-over-union of the salient features and a set of ground truth features.
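A mean-IoU sketch, assuming saliency maps are binarized at a fixed threshold against binary ground-truth masks (the threshold is an evaluation choice):

```python
import numpy as np

def mean_iou(saliency_maps, ground_truths, threshold=0.5):
    """Mean intersection-over-union between thresholded saliency masks
    and binary ground-truth feature masks."""
    ious = []
    for s, gt in zip(saliency_maps, ground_truths):
        pred = s >= threshold
        inter = np.logical_and(pred, gt).sum()
        union = np.logical_or(pred, gt).sum()
        ious.append(inter / union if union else 1.0)
    return float(np.mean(ious))

s = [np.array([0.9, 0.6, 0.1, 0.2])]
gt = [np.array([True, True, False, False])]
iou = mean_iou(s, gt)
```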

[🟢 / 🟨 / 🟥] Plausibility: Measures if the saliency highlights features known to be important to humans.

[🟢 / 🟨 / 🟥] The Pointing Game: Measures if the highest saliency value is in the set of ground truth features. Additional evaluations in: Metrics for saliency map evaluation of deep learning explanation methods.
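The pointing game itself is a one-line check per example: does the argmax of the saliency fall inside the ground-truth region? Accuracy is then the hit rate over the dataset:

```python
import numpy as np

def pointing_game_hit(saliency, gt_mask):
    """True if the single highest-saliency location falls inside the
    ground-truth region."""
    return bool(gt_mask.ravel()[np.argmax(saliency)])

saliency = np.array([0.1, 0.3, 0.9, 0.2])
gt_mask = np.array([False, False, True, True])
hit = pointing_game_hit(saliency, gt_mask)
```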

Citation [optional]

Provide a citation to the paper or blog post that introduces the method.

BibTeX: