Abstract • Experiments • Installation • Usage • Results • Citation
This is the official repository for the paper:
Lifelong learning with Cycle Memory Networks
published in IEEE Transactions on Neural Networks and Learning Systems, 2023. DOI: 10.1109/TNNLS.2023.3294495.
Abstract: Learning from a sequence of tasks for a lifetime is essential for an agent toward artificial general intelligence. Despite the explosion of this research field in recent years, most work focuses on the well-known catastrophic forgetting issue. In contrast, this work aims to explore knowledge-transferable lifelong learning without storing historical data and significant additional computational overhead. We demonstrate that existing data-free frameworks, including regularization-based single-network and structure-based multinetwork frameworks, face a fundamental issue of lifelong learning, named anterograde forgetting, i.e., preserving and transferring memory may inhibit learning new knowledge. We attribute it to the fact that the learning network capacity decreases while memorizing historical knowledge and conceptual confusion between the irrelevant old knowledge and the current task. Inspired by the complementary learning theory in neuroscience, we endow artificial neural networks with the ability to continuously learn without forgetting while recalling historical knowledge to facilitate learning new knowledge. Specifically, this work proposes a general framework named cycle memory networks (CMNs). The CMN consists of two individual memory networks to store short- and long-term memories separately to avoid capacity shrinkage and a transfer cell between them. It enables knowledge transfer from the long-term to the short-term memory network to mitigate conceptual confusion. In addition, the memory consolidation mechanism integrates short-term knowledge into the long-term memory network for knowledge accumulation. We demonstrate that the CMN can effectively address the anterograde forgetting on task-related, task-conflict, class-incremental, and cross-domain benchmarks. Furthermore, we provide extensive ablation studies to verify each framework component.
Overview of our method and results
This repository contains the experimental code for CMN, along with implementations of the comparison methods (One, Joint, fine-tuning, PNN, LwF, EWC, DGR, HNet). Each method learns a sequence of 10 subtasks split from the CIFAR-100 dataset. All methods except HNet use ResNet-18 as the backbone; HNet uses ResNet-20.
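The 10-subtask split of CIFAR-100 can be sketched as follows. This is a minimal illustration only; the helper `split_classes` and the fixed (unshuffled) class ordering are assumptions for the sketch, not the repository's exact code.

```python
def split_classes(num_classes=100, num_tasks=10):
    """Partition class indices into equally sized, disjoint subtasks.

    With the defaults, CIFAR-100's 100 classes become 10 subtasks of
    10 classes each, assigned in index order.
    """
    per_task = num_classes // num_tasks
    return [list(range(t * per_task, (t + 1) * per_task))
            for t in range(num_tasks)]

tasks = split_classes()
print(len(tasks))   # number of subtasks
print(tasks[0])     # classes assigned to the first subtask
```

Each method in the repository then trains on these subtasks sequentially, one after another.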
All experiments used stochastic gradient descent with an initial learning rate chosen from {1, 0.1, 0.001}, momentum 0.9, and weight decay 1 × 10^−5. Grid search was used to find the best configuration. Each model was trained for 100 epochs, all weights were initialized with kaiming_uniform, and the mini-batch size was 512 or 1024.
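The grid search over these hyper-parameters can be sketched as below. `train_and_eval` is a hypothetical stand-in for one full training run that returns validation accuracy; the exact grid and search loop in the repository may differ.

```python
import itertools

# Hyper-parameter grid from the experimental setup above.
LEARNING_RATES = [1, 0.1, 0.001]
BATCH_SIZES = [512, 1024]
MOMENTUM, WEIGHT_DECAY, EPOCHS = 0.9, 1e-5, 100

def grid_search(train_and_eval):
    """Try every (lr, batch_size) pair and keep the best accuracy."""
    best_cfg, best_acc = None, -1.0
    for lr, bs in itertools.product(LEARNING_RATES, BATCH_SIZES):
        acc = train_and_eval(lr=lr, batch_size=bs,
                             momentum=MOMENTUM,
                             weight_decay=WEIGHT_DECAY,
                             epochs=EPOCHS)
        if acc > best_acc:
            best_cfg, best_acc = (lr, bs), acc
    return best_cfg, best_acc
```

The fixed values (momentum, weight decay, epochs) are passed through unchanged; only the learning rate and batch size are searched.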
All experiments were run on eight NVIDIA A100 GPUs (40 GB each). The deep learning framework torch 1.8.1 was used. The requirements.txt file lists all the project's dependencies, which can be installed with the following command:
$ pip install -r requirements.txt
For each method, run the following command in that method's directory:
python main.py
You can adjust the related hyper-parameters in the ./main.py file.
The test results are saved in the result folder of each method's directory.
| Metric | One | Joint | fine-tuning | PNN | LwF | EWC | DGR | HNet | CMN |
|---|---|---|---|---|---|---|---|---|---|
| ACC | 0.7788 | 0.7423 | 0.1745 | 0.816 | 0.3398 | 0.44 | 0.0712 | 0.4003 | 0.8402 |
| FWT | 0 | \ | -0.001667 | 0.04988889 | -0.20589 | -4.76 | -0.414 | -0.32989 | 0.10088889 |
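The two metrics above can be computed from a task-accuracy matrix. The sketch below follows the commonly used definitions (average final accuracy and forward transfer against a random-initialization baseline); the repository's exact evaluation code may differ.

```python
def acc_fwt(R, b):
    """Compute ACC and FWT from a task-accuracy matrix.

    R[i][j] is the accuracy on task j after training on tasks 0..i;
    b[j] is the accuracy of a randomly initialized model on task j.
    ACC averages the last row; FWT averages how much training on
    earlier tasks helps each not-yet-seen task over the baseline.
    """
    T = len(R)
    acc = sum(R[T - 1][j] for j in range(T)) / T
    fwt = sum(R[j - 1][j] - b[j] for j in range(1, T)) / (T - 1)
    return acc, fwt
```

For example, with three tasks, `R[0][1]` measures zero-shot transfer from task 0 to task 1, and a negative FWT indicates anterograde forgetting: old knowledge is inhibiting the learning of new tasks.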
- fine-tuning: Fine-Tuning Deep Neural Networks in Continuous Learning Scenarios. https://pub.inf-cv.uni-jena.de/pdf/Kaeding16_FDN.pdf
- EWC: Overcoming Catastrophic Forgetting in Neural Networks. https://doi.org/10.1073/pnas.1611835114
- LwF: Learning without Forgetting. https://arxiv.org/abs/1606.09282
- PNN: Progressive Neural Networks. https://arxiv.org/abs/1606.04671
- DGR: Continual Learning with Deep Generative Replay. https://arxiv.org/abs/1705.08690
- HNet: Continual Learning with Hypernetworks. https://arxiv.org/abs/1906.00695
More details and supplementary experimental results can be found in CMN_supplements.
If you find our work useful, please cite it as follows:
@ARTICLE{10197260,
author={Peng, Jian and Ye, Dingqi and Tang, Bo and Lei, Yinjie and Liu, Yu and Li, Haifeng},
journal={IEEE Transactions on Neural Networks and Learning Systems},
title={Lifelong Learning With Cycle Memory Networks},
year={2023},
volume={},
number={},
pages={1-14},
keywords={Task analysis;Knowledge engineering;Learning (artificial intelligence);Learning systems;Neuroscience;Microprocessors;Knowledge transfer;Anterograde forgetting;catastrophic forgetting;complementary learning theory;cycle memory network (CMN);lifelong learning},
doi={10.1109/TNNLS.2023.3294495}}