Boosting Generative Adversarial Transferability with Self-supervised Vision Transformer Features
Shangbo Wu, Yu-an Tan, Ruinan Ma, Wencong Ma, Dehua Zhu, Yuanzhang Li
We present dSVA, a generative dual self-supervised ViT features attack that exploits ViT features from both contrastive learning (e.g., DINO) and masked image modeling (e.g., MAE) to achieve strong black-box adversarial transferability, outperforming state-of-the-art methods.
The source code is released under the MIT License.
N.B.: To appear in ICCV 2025. This repo will be updated soon; stay tuned.