(ICLR 2022 Spotlight) Official PyTorch implementation of "How Do Vision Transformers Work?"
Create animations for the optimization trajectory of neural nets
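Trajectory and landscape visualizations like the one above generally work by evaluating the loss along chosen directions in weight space. A minimal NumPy sketch of the underlying idea, a 1-D loss-landscape slice (the function names and the toy quadratic loss are illustrative, not taken from any of the listed repos):

```python
import numpy as np

def loss_slice(loss_fn, w, direction=None, radius=1.0, steps=41):
    """Evaluate loss_fn along the line w + alpha * d in parameter space.

    This is the basic recipe behind 1-D loss-landscape plots:
    pick a direction d, sweep a scalar alpha, record the loss.
    """
    rng = np.random.default_rng(0)
    d = direction if direction is not None else rng.standard_normal(w.shape)
    d = d / np.linalg.norm(d)                 # unit-length direction
    alphas = np.linspace(-radius, radius, steps)
    return alphas, np.array([loss_fn(w + a * d) for a in alphas])

# Toy quadratic loss with its minimum at the origin.
quad = lambda w: 0.5 * float(w @ w)
alphas, losses = loss_slice(quad, np.zeros(10))
# Slicing a quadratic through its minimum gives a parabola,
# so the smallest value sits at alpha = 0, the center of the sweep.
```

Animating an optimization trajectory amounts to repeating such evaluations on a 2-D grid spanned by two directions and overlaying the iterates frame by frame.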
[TMLR] "Can You Win Everything with Lottery Ticket?" by Tianlong Chen, Zhenyu Zhang, Jun Wu, Randy Huang, Sijia Liu, Shiyu Chang, Zhangyang Wang
[NeurIPS 2021] Towards Better Understanding of Training Certifiably Robust Models against Adversarial Examples | ⛰️
[Int. J. Comput. Vis. 2024] Revisiting Deep Ensemble for Out-of-Distribution Detection: A Loss Landscape Perspective
Worth-reading papers and related resources on deep learning optimization algorithms.
Surrogate Gap Guided Sharpness-Aware Minimization (GSAM) implementation for keras/tensorflow 2
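GSAM extends Sharpness-Aware Minimization (SAM): the base SAM update it builds on perturbs the weights toward a local worst case before taking the descent step. A minimal NumPy sketch of that base step only (GSAM's surrogate-gap term is omitted; the toy loss and all names here are illustrative):

```python
import numpy as np

def loss(w):
    # Toy non-convex loss: a quadratic bowl plus small ripples.
    return 0.5 * float(w @ w) + 0.1 * float(np.sum(np.sin(5.0 * w)))

def grad(w):
    # Analytic gradient of the toy loss above.
    return w + 0.5 * np.cos(5.0 * w)

def sam_step(w, lr=0.05, rho=0.05):
    """One base SAM update: ascend to a worst-case neighbor within
    radius rho, then descend using the gradient evaluated there."""
    g = grad(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # ascent direction
    return w - lr * grad(w + eps)                # descend with "sharp" gradient

w = np.full(4, 2.0)
for _ in range(200):
    w = sam_step(w)
```

In a Keras/TensorFlow implementation the same two-pass structure appears as two gradient computations per batch inside a custom training step.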
This project builds on recent research that explores the phenomenon of Grokking. The goal is to investigate when, why, and how grokking occurs, focusing on transformers under various batch sizes.