Reading list for research topics in Masked Image Modeling (MIM).
We list the most popular MIM methods. If we missed something, please submit a request. (Note: each date is when the first version of the paper was submitted to arXiv, but the paper link may point to a later revision.)
Date | Method | Conference | Title | Code |
---|---|---|---|---|
2022-04-06 | MIMDet | Arxiv 2022 | Unleashing Vanilla Vision Transformer with Masked Image Modeling for Object Detection | MIMDet |
Date | Method | Conference | Title | Code |
---|---|---|---|---|
2021-11-29 | Point-BERT | CVPR 2022 | Point-BERT: Pre-training 3D Point Cloud Transformers with Masked Point Modeling | Point-BERT |
2022-03-28 | Point-MAE | ECCV 2022 | Masked Autoencoders for Point Cloud Self-supervised Learning | Point-MAE |
2022-05-28 | Point-M2AE | NeurIPS 2022 | Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training | Point-M2AE |
2022-12-13 | I2P-MAE | CVPR 2023 | Learning 3D Representations from 2D Pre-trained Models via Image-to-Point Masked Autoencoders | I2P-MAE |
Date | Method | Conference | Title | Code |
---|---|---|---|---|
2022-02-08 | MaskGIT | CVPR 2022 | MaskGIT: Masked Generative Image Transformer | None |
Date | Method | Conference | Title | Code |
---|---|---|---|---|
2022-12-02 | MIC | CVPR 2023 | MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation | None |
Date | Method | Conference | Title | Code |
---|---|---|---|---|
2021-12-02 | BEVT | CVPR 2022 | BEVT: BERT Pretraining of Video Transformers | BEVT |
2022-03-23 | VideoMAE | NeurIPS 2022 | VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training | VideoMAE |
2022-05-18 | MAE_ST | NeurIPS 2022 | Masked Autoencoders As Spatiotemporal Learners | MAE_ST |
Date | Method | Conference | Title | Code |
---|---|---|---|---|
2022-04-04 | MultiMAE | ECCV 2022 | MultiMAE: Multi-modal Multi-task Masked Autoencoders | MultiMAE |
2022-05-27 | M3AE | Arxiv 2022 | Multimodal Masked Autoencoders Learn Transferable Representations | None |
2022-08-03 | xxx | Arxiv 2022 | Masked Vision and Language Modeling for Multi-modal Representation Learning | None |
2022-12-01 | FLIP | CVPR 2023 | Scaling Language-Image Pre-training via Masking | None |
Date | Method | Conference | Title | Code |
---|---|---|---|---|
2022-03-10 | MedMAE | Arxiv 2022 | Self Pre-training with Masked Autoencoders for Medical Image Analysis | None |
Date | Method | Conference | Title |
---|---|---|---|
2022-08-08 | RelaxMIM | Arxiv 2022 | Understanding Masked Image Modeling via Learning Occlusion Invariant Feature |
Date | Conference | Title |
---|---|---|
2022-07-30 | Arxiv 2022 | A Survey on Masked Autoencoder for Self-supervised Learning in Vision and Beyond |
2023-12-31 | Arxiv 2023 | Masked Modeling for Self-supervised Representation Learning on Vision and Beyond |