
google-research/fooling-feature-visualizations


Code for "Don't trust your eyes: on the (un)reliability of feature visualizations"

This repository contains code to replicate experiments from Don't trust your eyes: on the (un)reliability of feature visualizations by Robert Geirhos*, Roland S. Zimmermann*, Blair Bilodeau*, Wieland Brendel, and Been Kim.

Fooling feature visualizations

Feature visualizations are widely used interpretability tools, but can we trust them? We investigate this question from adversarial, empirical, and theoretical perspectives. The result: don't trust your eyes!

[Figure: original feature visualization (left) vs. manipulated feature visualization (right)]

For instance, from an adversarial perspective we can adapt a model such that it maintains identical behavior on natural image inputs (e.g., identical ImageNet accuracy) while its feature visualizations change completely. In the example here, the feature visualization shows a painting (right) instead of the original feature visualization (left).
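The core idea can be sketched in a few lines: route inputs that look like feature-visualization iterates through decoy weights, while natural-looking inputs pass through the original computation unchanged. The detector and the weight vectors below are purely illustrative stand-ins, not the paper's actual construction.

```python
# Illustrative sketch of a "fooling" unit (hypothetical names and detector,
# not the paper's actual construction): behavior on natural-looking inputs
# is preserved, while visualization-like inputs hit decoy weights.

def looks_synthetic(x):
    """Crude stand-in detector: feature-visualization iterates tend to have
    extreme high-frequency content, so threshold the variance of first
    differences (threshold 4.0 chosen arbitrarily for this toy example)."""
    diffs = [b - a for a, b in zip(x, x[1:])]
    mean = sum(diffs) / len(diffs)
    return sum((d - mean) ** 2 for d in diffs) / len(diffs) > 4.0

def fooled_feature(x, w_orig, w_decoy):
    """Linear 'feature': original weights on natural inputs,
    decoy weights on synthetic-looking ones."""
    w = w_decoy if looks_synthetic(x) else w_orig
    return sum(wi * xi for wi, xi in zip(w, x))

natural = [i / 7 for i in range(8)]               # smooth ramp (zero diff variance)
synthetic = [10.0 * (-1) ** i for i in range(8)]  # high-frequency pattern

w_orig = [1.0] * 8
w_decoy = [-1.0] * 8
print(fooled_feature(natural, w_orig, w_decoy))    # computed with w_orig
print(fooled_feature(synthetic, w_orig, w_decoy))  # computed with w_decoy
```

Any accuracy-preserving detector would do; the point is only that the model's response to the optimization-style inputs used by visualization methods can be decoupled from its response to natural images.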

Citation

@article{geirhos2023fooling,
  url     = {https://arxiv.org/abs/2306.04719},
  author  = {Geirhos, Robert and Zimmermann, Roland S and Bilodeau, Blair and Brendel, Wieland and Kim, Been},
  title   = {Don't trust your eyes: on the (un)reliability of feature visualizations},
  journal = {arXiv preprint arXiv:2306.04719},
  year    = {2023},
}

Disclaimer

This is not an officially supported Google product.
