This repository contains the PyTorch implementation and demonstrations of the NeurIPS 2021 paper "Self-Interpretable Model with Transformation Equivariant Interpretation" (SITE).
Method
SITE trains a self-interpretable model whose predictions and explanations both stay consistent under geometric transformations. It does so by regularizing the self-interpretable module, which makes the model more trustworthy.
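To make "consistent explanations across geometric transformations" concrete, here is a minimal sketch of an equivariance penalty. It is an illustration under stated assumptions, not the loss used in this repository: the transformation is restricted to 90-degree rotations, and `rotate90`, `equivariance_penalty`, and the two toy explainers are hypothetical names introduced only for this example.

```python
import torch


def rotate90(images: torch.Tensor, k: int) -> torch.Tensor:
    """Rotate a batch shaped (N, C, H, W) by k * 90 degrees in the image plane."""
    return torch.rot90(images, k, dims=(2, 3))


def equivariance_penalty(explain, images: torch.Tensor, k: int = 1) -> torch.Tensor:
    """Mean squared gap between explain(T(x)) and T(explain(x)).

    `explain` is a hypothetical callable that maps images to explanation
    maps of the same shape. The penalty is zero when it commutes with T.
    """
    explanation_of_rotated = explain(rotate90(images, k))
    rotated_explanation = rotate90(explain(images), k)
    return torch.mean((explanation_of_rotated - rotated_explanation) ** 2)


torch.manual_seed(0)
images = torch.randn(4, 1, 8, 8)  # toy stand-in for a batch of small grayscale images

# A pixel-wise map commutes with rotation, so its penalty is zero.
pixelwise = torch.abs

# A map that weights columns differently depends on position, so it is not equivariant.
column_ramp = torch.linspace(0.0, 1.0, 8).view(1, 1, 1, 8)


def position_dependent(x: torch.Tensor) -> torch.Tensor:
    return x * column_ramp


penalty_equivariant = equivariance_penalty(pixelwise, images)
penalty_broken = equivariance_penalty(position_dependent, images)
```

In a training loop, a term of this kind would typically be added to the classification loss with a weighting coefficient, so that the explanation module is pushed toward commuting with the transformation. The weighting and the transformation family here are assumptions; see the paper and the training notebooks for the actual formulation.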
For academic usage, please consider citing:
@article{wang2021self,
  title={Self-interpretable model with transformation equivariant interpretation},
  author={Wang, Yipei and Wang, Xiaoqian},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  pages={2359--2372},
  year={2021}
}
Libraries
numpy==1.19.5
torch==1.10.2
torchvision==0.11.3
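Before running the notebooks, it can help to confirm that the installed packages match the pinned versions. The sketch below uses only the standard library (Python 3.8+). `PINNED` and `check_pins` are hypothetical names for this example, and dropping a local build suffix such as `+cu113` before comparing is an assumption about how strict the match needs to be.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions pinned in this README.
PINNED = {"numpy": "1.19.5", "torch": "1.10.2", "torchvision": "0.11.3"}


def check_pins(pins, get_version=version):
    """Return {package: (wanted, found, matches)} for each pinned package.

    `found` is None when the package is not installed. A local build
    suffix (the part after "+") is ignored when comparing versions.
    """
    report = {}
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None
        matches = found is not None and found.split("+")[0] == wanted
        report[name] = (wanted, found, matches)
    return report


if __name__ == "__main__":
    for name, (wanted, found, matches) in check_pins(PINNED).items():
        status = "ok" if matches else "MISMATCH"
        print(f"{name}: wanted {wanted}, found {found} [{status}]")
```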
Training notebooks demonstrate the training process of SITE on the MNIST and CIFAR datasets.
Example notebooks demonstrate how SITE is used to generate explanations for the MNIST and CIFAR datasets.