A Survey of Historical Learning: Learning Models with Learning History

Xiang Li*, Ge Wu*, Lingfeng Yang, Wenhai Wang, Renjie Song, Jian Yang#

\* Equal contribution. # Corresponding author.

This repo is the official paper list of A Survey of Historical Learning: Learning Models with Learning History. It collects papers related to historical learning, organized by the type of historical information they exploit.

If you have any questions about the paper or this list, please contact us by opening an issue or by email.

Update

  • [February 2023] The repo 'Awesome-Historical-Learning' is made public on GitHub.

Overview


Prediction

  • [CVPR2019 CCN] Mutual learning of complementary networks via residual correction for improving semi-supervised classification [Paper]
  • [Arxiv2019 SELF] Self: Learning to filter noisy labels with self-ensembling [Paper]
  • [AAAI2020 D2CNN] Deep discriminative CNN with temporal ensembling for ambiguously-labeled image classification [Paper]
  • [ICCV2021 PS-KD] Self-knowledge distillation with progressive refinement of targets [Paper]
  • [CVPR2022 DLB] Self-Distillation from the Last Mini-Batch for Consistency Regularization [Paper]
  • [NeurIPS2022 RecursiveMix] RecursiveMix: Mixed Learning with History [Paper]
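
The papers above reuse the model's own past predictions, e.g. as ensembled soft targets. A minimal sketch of that idea, in the spirit of temporal ensembling (the update rule and `alpha=0.6` are illustrative choices, not taken from any single listed paper):

```python
import numpy as np

def update_ensemble(Z, z_batch, idx, epoch, alpha=0.6):
    """Accumulate an exponential moving average of each sample's past
    predictions and return bias-corrected soft targets."""
    Z[idx] = alpha * Z[idx] + (1.0 - alpha) * z_batch
    # divide out the bias from the zero-initialized accumulator
    return Z[idx] / (1.0 - alpha ** epoch)

Z = np.zeros((4, 3))                     # accumulator: one row per training sample
z = np.array([[0.2, 0.5, 0.3]])          # current-epoch prediction for sample 0
targets = update_ensemble(Z, z, idx=[0], epoch=1)
```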

Intermediate Feature Representation

Recording the instance-level feature representations

  • [AAAI2022 MBJ] Memory-Based Jitter: Improving Visual Recognition on Long Tailed Data with Diversity in Memory [Paper]
  • [AAAI2022 MeCoQ] Contrastive quantization with code memory for unsupervised image retrieval [Paper]
  • [AAAI2022 InsCLR] InsCLR: Improving instance retrieval with self-supervision [Paper]
  • [CVPR2022 MAUM] Learning Memory-Augmented Unidirectional Metrics for Cross-Modality Person Re-Identification [Paper]
  • [CVPR2022 QB-Norm] Cross Modal Retrieval with Querybank Normalisation [Paper]
  • [CVPR2022 CIRKD] Cross-Image Relational Knowledge Distillation for Semantic Segmentation [Paper]
  • [ECCV2022 DAS] DAS: Densely-Anchored Sampling for Deep Metric Learning [Paper]
  • [Arxiv2022 Memorizing transformers] Memorizing transformers [Paper]
  • [Arxiv2022] Learning Equivariant Segmentation with Instance-Unique Querying [Paper]
  • [Arxiv2022 MCL] Online Knowledge Distillation via Mutual Contrastive Learning for Visual Recognition [Paper]
  • [AAAI2021 IM-CFB] Instance mining with class feature banks for weakly supervised object detection [Paper]
  • [CVPR2021 VPL] Variational prototype learning for deep face recognition [Paper]
  • [CVPR2021 GLT] Group-aware label transfer for domain adaptive person re-identification [Paper]
  • [CVPR2021 MCIBI] Mining contextual information beyond image for semantic segmentation [Paper]
  • [CVPR2021 SAN] Spatial assembly networks for image representation learning [Paper]
  • [CVPR2021 PRISM] Noise-resistant deep metric learning with ranking-based instance selection [Paper]
  • [ICCV2021 LOCE] Exploring classification equilibrium in long-tailed object detection [Paper]
  • [NeurIPS2021 CBA-MR] When False Positive is Intolerant: End-to-End Optimization with Low FPR for Multipartite Ranking [Paper]
  • [TIP2021 Dual-Refinement] Dual-refinement: Joint label and feature refinement for unsupervised domain adaptive person re-identification [Paper]
  • [CVPR2020 XBM] Cross-Batch Memory for Embedding Learning [Paper]
  • [CVPR2020 PIRL] Self-supervised learning of pretext-invariant representations [Paper]
  • [ECCV2020 TAC-CCL] Unsupervised Deep Metric Learning with Transformed Attention Consistency and Contrastive Clustering Loss [Paper]
  • [ECCV2020 CMC] Contrastive multiview coding [Paper]
  • [CVPR2019 ECN] Invariance Matters: Exemplar Memory for Domain Adaptive Person Re-identification [Paper]
  • [CVPR2018 NCE] Unsupervised Feature Learning via Non-Parametric Instance Discrimination [Paper]
  • [ECCV2018 NCA] Improving Generalization via Scalable Neighborhood Component Analysis [Paper]
  • [ICLR2018 MbPA] Memory-based Parameter Adaptation [Paper]
  • [CVPR2017 OIM] Joint detection and identification feature learning for person search [Paper]
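
Many of the methods above maintain a memory bank of instance features from past iterations. A rough sketch of a FIFO feature memory (the class name, size, and dimension are illustrative, not from any listed paper):

```python
import numpy as np

class FeatureQueue:
    """Fixed-size FIFO memory of L2-normalized instance features, in the
    spirit of the memory-bank / cross-batch-memory methods above."""

    def __init__(self, dim, size, seed=0):
        rng = np.random.default_rng(seed)
        self.feats = rng.standard_normal((size, dim)).astype(np.float32)
        self.feats /= np.linalg.norm(self.feats, axis=1, keepdims=True)
        self.ptr, self.size = 0, size

    def enqueue(self, batch):
        """Overwrite the oldest slots with the newest features."""
        idx = (self.ptr + np.arange(len(batch))) % self.size
        self.feats[idx] = batch / np.linalg.norm(batch, axis=1, keepdims=True)
        self.ptr = (self.ptr + len(batch)) % self.size

    def similarities(self, query):
        """Cosine similarity of one query against the whole memory."""
        return self.feats @ (query / np.linalg.norm(query))

bank = FeatureQueue(dim=4, size=8)
bank.enqueue(np.ones((3, 4), dtype=np.float32))
sims = bank.similarities(np.ones(4))
```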

Memorizing the feature statistics

  • [ECCV2016 center loss] A discriminative feature learning approach for deep face recognition [Paper]
  • [CVPR2018 TCL] Triplet-Center Loss for Multi-View 3D Object Retrieval [Paper]
  • [AAAI2019 ATCL] Angular triplet-center loss for multi-view 3d shape retrieval [Paper]
  • [ICCV2019 3C-Net] 3C-Net: Category Count and Center Loss for Weakly-Supervised Action Localization [Paper]
  • [ECCV2020 A2CL-PT] Adversarial Background-Aware Loss for Weakly-supervised Temporal Activity Localization [Paper]
  • [TMM2020 HCTL] Deep fusion feature representation learning with hard mining center-triplet loss for person re-identification [Paper]
  • [Neurocomputing2020 HC loss] Hetero-center loss for cross-modality person re-identification [Paper]
  • [CVPR2020 OBTL] Generalized zero-shot learning via over-complete distribution [Paper]
  • [ICCV2021 SAMC-loss] FREE: Feature Refinement for Generalized Zero-Shot Learning [Paper]
  • [CVPR2021 SCL] Frequency-aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection [Paper]
  • [CVPR2021 Cross-modal center Loss] Cross-Modal Center Loss for 3D Cross-Modal Retrieval [Paper]
  • [ICML2015 Batch Normalization] Batch normalization: Accelerating deep network training by reducing internal covariate shift [Paper]
  • [NeurIPS2016 Weight Normalization] Weight normalization: A simple reparameterization to accelerate training of deep neural networks [Paper]
  • [ICML2016 Normalization Propagation] Normalization propagation: A parametric technique for removing internal covariate shift in deep networks [Paper]
  • [NeurIPS2017 Batch Renormalization] Batch renormalization: Towards reducing minibatch dependence in batch-normalized models [Paper]
  • [ICLR2017 AdaBN] Revisiting batch normalization for practical domain adaptation [Paper]
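
These methods accumulate feature statistics (class centers, running means/variances) across training history. A minimal sketch of one such statistic, a center-loss-style class center update (`alpha=0.5` is an illustrative center learning rate):

```python
import numpy as np

def update_centers(centers, feats, labels, alpha=0.5):
    """Move each class center toward the mean of the current batch's
    features of that class, as in center-loss-style methods."""
    for c in np.unique(labels):
        delta = (centers[c] - feats[labels == c]).mean(axis=0)
        centers[c] = centers[c] - alpha * delta
    return centers

centers = np.zeros((2, 3))          # one center per class
feats = np.ones((2, 3))             # batch features
labels = np.array([0, 0])           # both samples belong to class 0
centers = update_centers(centers, feats, labels)
```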

Model Parameter

Constructing the teachers from past models

  • [NeurIPS2020 BYOL] Bootstrap your own latent: A new approach to self-supervised learning [Paper]
  • [CVPR2020 MoCo] Momentum contrast for unsupervised visual representation learning [Paper]
  • [Arxiv2020 MoCo v2] Improved baselines with momentum contrastive learning [Paper]
  • [ICCV2021 MoCo v3] An empirical study of training self-supervised vision transformers [Paper]
  • [ICCV2021 TKC] Temporal knowledge consistency for unsupervised visual representation learning [Paper]
  • [ICCV2021 DINO] Emerging properties in self-supervised vision transformers [Paper]
  • [ICML2018 BANs] Born again neural networks [Paper]
  • [CVPR2019 MLNT] Learning to learn from noisy labeled data [Paper]
  • [CVPR2019 SD] Snapshot distillation: Teacher student optimization in one generation [Paper]
  • [CVPR2020 Tf-KD] Revisiting knowledge distillation via label smoothing regularization [Paper]
  • [NeurIPS2022 CheckpointKD] Efficient knowledge distillation from model checkpoints [Paper]
  • [Arxiv2022 EEKD] Learn from the past: Experience ensemble knowledge distillation [Paper]
  • [Arxiv2022 SEAT] Self-ensemble adversarial training for improved robustness [Paper]
  • [NeurIPS2017 Mean teacher] Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results [Paper]
  • [ECCV2018 TSSDL] Transductive semi-supervised deep learning using min-max features [Paper]
  • [CVPR2021 EMAN] Exponential moving average normalization for self-supervised and semi-supervised learning [Paper]
  • [ECCV2022] Unsupervised selective labeling for more effective semi-supervised learning [Paper]
  • [Arxiv2022 Semi-ViT] Semi-supervised vision transformers at scale [Paper]
  • [Arxiv2023 AEMA] Robust domain adaptive object detection with unified multi-granularity alignment [Paper]
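
The common mechanism in the papers above is an EMA teacher whose parameters are a momentum-weighted average of past student parameters (as in Mean Teacher, MoCo, BYOL). A minimal sketch, with parameters held in plain dicts and an illustrative momentum:

```python
import numpy as np

def ema_update(teacher, student, momentum=0.99):
    """Blend the student's current parameters into the teacher, so the
    teacher summarizes the student's learning history."""
    for name, w in student.items():
        teacher[name] = momentum * teacher[name] + (1.0 - momentum) * w
    return teacher

teacher = {"w": np.zeros(2)}
student = {"w": np.ones(2)}
teacher = ema_update(teacher, student, momentum=0.9)
```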

Directly exploiting ensemble results in inference

  • [MICCAI2017] Automatic Segmentation and Disease Classification Using Cardiac Cine MR Images [Paper]
  • [TSG2018] Short-term load forecasting with deep residual networks [Paper]
  • [ECCV2018] Uncertainty Estimates and Multi-Hypotheses Networks for Optical Flow [Paper]
  • [CVPRWs2019 DIDN] Deep iterative down-up cnn for image denoising [Paper]
  • [PR2020] Multi-model ensemble with rich spatial information for object detection [Paper]
  • [CVPR2020 Self] On the uncertainty of self-supervised monocular depth estimation [Paper]
  • [NeurIPS2020 PAS] On the Loss Landscape of Adversarial Training: Identifying Challenges and How to Overcome Them [Paper]

Building unitary ensemble architecture in inference

  • [UAI2018 SWA] Averaging weights leads to wider optima and better generalization [Paper]
  • [NeurIPS2018 FGE] Loss surfaces, mode connectivity, and fast ensembling of dnns [Paper]
  • [ICLR2019 fast-SWA] There are many consistent explanations of unlabeled data: Why you should average [Paper]
  • [NeurIPS2019 SWAG] A simple baseline for bayesian uncertainty in deep learning [Paper]
  • [ICML2019 SWALP] SWALP: Stochastic Weight Averaging in Low-Precision Training [Paper]
  • [Arxiv2020 SWA Object Detection] SWA Object Detection [Paper]
  • [ICML2021 late-phase weights] Neural networks with late-phase weights [Paper]
  • [NeurIPS2021 SWAD] SWAD: Domain generalization by seeking flat minima [Paper]
  • [Arxiv2022 PSWA] Stochastic Weight Averaging Revisited [Paper]
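
The SWA family averages weights of past checkpoints into a single inference model. The core update is a running equal-weight average (sketched here on scalar "weights" for clarity):

```python
def swa_update(avg, new, n_averaged):
    """SWA-style running average: after n_averaged checkpoints have been
    folded in, the next one gets weight 1/(n_averaged + 1), giving an
    equal-weight average over all checkpoints so far."""
    for k in avg:
        avg[k] += (new[k] - avg[k]) / (n_averaged + 1)
    return avg

# averaging three (scalar) checkpoints: 1.0, 2.0, 3.0
avg = {"w": 1.0}
avg = swa_update(avg, {"w": 2.0}, n_averaged=1)
avg = swa_update(avg, {"w": 3.0}, n_averaged=2)
```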

Gradient

The gradients of the model parameters

  • [JMLR2011 Adagrad] Adaptive subgradient methods for online learning and stochastic optimization [Paper]
  • [ICML2013 SGD] On the importance of initialization and momentum in deep learning [Paper]
  • [ICLR2015 Adam] Adam: A method for stochastic optimization [Paper]
  • [ICLR2016 Nadam] Incorporating nesterov momentum into adam [Paper]
  • [Arxiv2019 AdaMod] An Adaptive and Momental Bound Method for Stochastic Learning [Paper]
  • [ICLR2019 AdamW] Decoupled weight decay regularization [Paper]
  • [IJCAI2020 Padam] Closing the generalization gap of adaptive gradient methods in training deep neural networks [Paper]
  • [ICLR2020 Radam] On the variance of the adaptive learning rate and beyond [Paper]
  • [ICLR2021 Adamp] Adamp: Slowing down the slowdown for momentum optimizers on scale-invariant weights [Paper]
  • [NeurIPS2022 Adan] Adan: Adaptive nesterov momentum algorithm for faster optimizing deep models [Paper]
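
All of these optimizers exploit gradient history; the simplest case is classical (heavy-ball) momentum, where the velocity is an exponentially decayed sum of past gradients. A minimal sketch (`lr` and `mu` are illustrative values):

```python
def momentum_step(w, grad, velocity, lr=0.1, mu=0.9):
    """One heavy-ball momentum step: every update depends on the whole
    gradient history through the decayed velocity term."""
    velocity = mu * velocity + grad
    return w - lr * velocity, velocity

w, v = 1.0, 0.0
w, v = momentum_step(w, grad=1.0, velocity=v)   # v = 1.0,  w = 0.9
w, v = momentum_step(w, grad=1.0, velocity=v)   # v = 1.9,  w = 0.71
```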

The gradients of features at all levels

  • [ECCV2020 GradCon] Backpropagated gradient representations for anomaly detection [Paper]
  • [CVPR2021 Eqlv2] Equalization Loss v2: A New Gradient Balance Approach for Long-tailed Object Detection [Paper]
  • [Arxiv2022 EFL] Equalized Focal Loss for Dense Long-Tailed Object Detection [Paper]

Loss Values

  • [CVPR2019 O2u-net] O2u-net: A simple noisy label detection approach for deep neural networks [Paper]
  • [ECCV2020] Neural batch sampling with reinforcement learning for semi-supervised anomaly detection [Paper]
  • [ICCV2021 iLPC] Iterative label cleaning for transductive and semi-supervised few-shot learning [Paper]
  • [ECML PKDD2021 HVS] Small-vote sample selection for label-noise learning [Paper]
  • [Arxiv2021 CNLCU] Sample selection with uncertainty of losses for learning with noisy labels [Paper]
  • [AAAI2022] Delving into sample loss curve to embrace noisy and imbalanced data [Paper]
  • [Arxiv2022 CTRL] Ctrl: Clustering training losses for label error detection [Paper]
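
These methods record each sample's loss over training and use the resulting loss curves to detect noisy labels. A simplified stand-in for that selection step (the mean-loss statistic and `ratio` are illustrative, not the exact criterion of any listed paper):

```python
import numpy as np

def flag_noisy(loss_history, ratio=0.25):
    """Rank samples by their mean loss over the recorded epochs and
    flag the highest fraction as likely label noise."""
    mean_loss = loss_history.mean(axis=1)            # (n_samples,)
    k = max(1, int(round(loss_history.shape[0] * ratio)))
    return set(np.argsort(mean_loss)[-k:].tolist())

# per-sample losses recorded over two epochs; sample 2 never fits
history = np.array([[0.10, 0.10],
                    [0.10, 0.20],
                    [2.00, 2.50],
                    [0.20, 0.10]])
flagged = flag_noisy(history, ratio=0.25)
```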

Citation

If you find this repository useful, please consider citing the survey:



Acknowledgement

This project is based on Ultimate-Awesome-Transformer-Attention and UM-MAE. Thanks for their wonderful work.

License

This project is under the CC-BY-NC 4.0 license. See LICENSE for details.
