Reading list for research topics in Vision Transformers.
We list the most popular methods for Vision Transformers. If we missed something, please submit a request. (Note: the dates shown are those of the first arXiv version, but each paper link points to the latest version.)
Updating...
Image Synthesis:
Date | Method | Conference | Title | Code |
---|---|---|---|---|
2020-12-17 | Taming Transformer | CVPR 2021(Oral) | Taming Transformers for High-Resolution Image Synthesis | TamingTransformer |
2021-xx-xx | TransGAN | NeurIPS 2021 | TransGAN: Two Pure Transformers Can Make One Strong GAN, and That Can Scale Up | TransGAN |
Self-Supervised Learning:
Date | Method | Conference | Title | Code |
---|---|---|---|---|
2021-04-05 | MoCo v3 | ICCV 2021(Oral) | An Empirical Study of Training Self-Supervised Vision Transformers | MoCo v3 |
2021-06-14 | BeiT | ICLR 2022(Oral) | BEiT: BERT Pre-Training of Image Transformers | BeiT |
2021-11-11 | MAE | arXiv 2021 | Masked Autoencoders Are Scalable Vision Learners | MAE |
2021-11-15 | iBoT | arXiv 2021 | iBOT: Image BERT Pre-Training with Online Tokenizer | iBoT |
2021-11-18 | SimMIM | arXiv 2021 | SimMIM: A Simple Framework for Masked Image Modeling | SimMIM |
2021-12-16 | MaskFeat | arXiv 2021 | Masked Feature Prediction for Self-Supervised Visual Pre-Training | None |
2021-12-20 | SplitMask | arXiv 2021 | Are Large-scale Datasets Necessary for Self-Supervised Pre-training? | None |
2022-01-19 | RePre | arXiv 2022 | RePre: Improving Self-Supervised Vision Transformer with Reconstructive Pre-training | None |
2022-02-07 | CAE | arXiv 2022 | Context Autoencoder for Self-Supervised Representation Learning | None |
Todo: iBoT, DINO
Survey:
Date | Conference/journal | Title |
---|---|---|
2020-12-23 (latest version: 2021-02-23) | TPAMI | A Survey on Vision Transformer |
2021-01-04 (latest version: 2022-01-19) | ACM Computing Surveys | Transformers in vision: A survey |
2022-05-10 (latest version: 2022-08-06) | Knowledge-Based Systems | Vision transformers for dense prediction: A survey |