
Lifelong learning with Cycle Memory Networks


Abstract · Experiments · Installation · Usages · Result · Citation


This is the official repository for the paper:

Lifelong learning with Cycle Memory Networks
published in IEEE Transactions on Neural Networks and Learning Systems, 2023. DOI: 10.1109/TNNLS.2023.3294495.

Abstract: Learning from a sequence of tasks for a lifetime is essential for an agent toward artificial general intelligence. Despite the explosion of this research field in recent years, most work focuses on the well-known catastrophic forgetting issue. In contrast, this work aims to explore knowledge-transferable lifelong learning without storing historical data and significant additional computational overhead. We demonstrate that existing data-free frameworks, including regularization-based single-network and structure-based multinetwork frameworks, face a fundamental issue of lifelong learning, named anterograde forgetting, i.e., preserving and transferring memory may inhibit learning new knowledge. We attribute it to the fact that the learning network capacity decreases while memorizing historical knowledge and conceptual confusion between the irrelevant old knowledge and the current task. Inspired by the complementary learning theory in neuroscience, we endow artificial neural networks with the ability to continuously learn without forgetting while recalling historical knowledge to facilitate learning new knowledge. Specifically, this work proposes a general framework named cycle memory networks (CMNs). The CMN consists of two individual memory networks to store short- and long-term memories separately to avoid capacity shrinkage and a transfer cell between them. It enables knowledge transfer from the long-term to the short-term memory network to mitigate conceptual confusion. In addition, the memory consolidation mechanism integrates short-term knowledge into the long-term memory network for knowledge accumulation. We demonstrate that the CMN can effectively address the anterograde forgetting on task-related, task-conflict, class-incremental, and cross-domain benchmarks. Furthermore, we provide extensive ablation studies to verify each framework component.


Overview of our method and results
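
As a rough illustration of the framework described in the abstract, the sketch below shows one way the two memory networks, the transfer cell, and the consolidation step could be wired together in PyTorch. This is a minimal, illustrative sketch only, not the authors' implementation: the class name, the MLP transfer cell, and the weighted-average consolidation are all assumptions.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class CycleMemoryNetworkSketch(nn.Module):
    """Illustrative sketch: a short-term network learns the current task,
    a long-term network stores accumulated knowledge, and a transfer cell
    feeds long-term features to the short-term learner."""

    def __init__(self, num_classes, feat_dim=512):
        super().__init__()
        self.short_term = resnet18(num_classes=num_classes)  # learns the new task
        self.long_term = resnet18(num_classes=num_classes)   # holds consolidated memory
        # Hypothetical transfer cell: maps long-term features into the
        # short-term network's feature space.
        self.transfer_cell = nn.Sequential(
            nn.Linear(feat_dim, feat_dim), nn.ReLU(), nn.Linear(feat_dim, feat_dim)
        )

    @staticmethod
    def _features(net, x):
        # Run a torchvision ResNet up to (but not including) its final fc layer.
        backbone = nn.Sequential(*list(net.children())[:-1])
        return backbone(x).flatten(1)

    def forward(self, x):
        # Long-term features guide the short-term learner (knowledge transfer).
        with torch.no_grad():
            long_feat = self._features(self.long_term, x)
        short_feat = self._features(self.short_term, x)
        fused = short_feat + self.transfer_cell(long_feat)
        return self.short_term.fc(fused)

    @torch.no_grad()
    def consolidate(self, alpha=0.5):
        # Memory consolidation, shown here as a simple weighted average of the
        # two networks' weights (purely illustrative).
        for p_long, p_short in zip(self.long_term.parameters(),
                                   self.short_term.parameters()):
            p_long.mul_(1 - alpha).add_(alpha * p_short)
```

In the paper, the transfer cell and the consolidation mechanism are more elaborate; the sketch only conveys the overall data flow between the two memories.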

Experiments

This repository contains the experimental code for CMN: the Cycle Memory code plus the code for the comparison methods (One, Joint, fine-tuning, PNN, LwF, EWC, DGR, HNet). Each method learns a sequence of 10 subtasks split from the CIFAR-100 dataset. All methods except HNet use ResNet-18 as the backbone; HNet uses ResNet-20.
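
For reference, here is a minimal sketch of how CIFAR-100 can be split into a sequence of 10 disjoint subtasks of 10 classes each. The split into consecutive class blocks and the normalization constants are generic choices and may differ from the exact split used in this repository.

```python
import numpy as np
from torch.utils.data import Subset
from torchvision import datasets, transforms

def split_cifar100_tasks(root="./data", num_tasks=10, train=True):
    """Split CIFAR-100 into `num_tasks` disjoint subtasks of 100/num_tasks classes each."""
    tfm = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize((0.5071, 0.4865, 0.4409), (0.2673, 0.2564, 0.2762)),
    ])
    dataset = datasets.CIFAR100(root, train=train, download=True, transform=tfm)
    targets = np.array(dataset.targets)
    classes_per_task = 100 // num_tasks
    tasks = []
    for t in range(num_tasks):
        cls = list(range(t * classes_per_task, (t + 1) * classes_per_task))
        idx = np.where(np.isin(targets, cls))[0]
        tasks.append(Subset(dataset, idx))  # one subtask = one block of classes
    return tasks

# Example: 10 training subtasks, each containing 10 CIFAR-100 classes.
train_tasks = split_cifar100_tasks(num_tasks=10, train=True)
```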

All experiments used stochastic gradient descent with an initial learning rate chosen from {1, 0.1, 0.001} by grid search. The momentum was 0.9 and the weight decay was 1 × 10^−5. Each method was trained for 100 epochs. All models' weights were initialized with kaiming_uniform, and the mini-batch size was 512 or 1024.
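
These settings map directly onto a standard PyTorch SGD configuration; the snippet below is only a sketch (the ResNet-18 backbone here stands in for any of the models, and lr=0.1 stands in for whichever value the grid search selects).

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

model = resnet18(num_classes=10)  # one 10-class subtask as an example

# Kaiming-uniform initialization for conv and linear layers.
def init_weights(m):
    if isinstance(m, (nn.Conv2d, nn.Linear)):
        nn.init.kaiming_uniform_(m.weight)
        if m.bias is not None:
            nn.init.zeros_(m.bias)

model.apply(init_weights)

# SGD with the settings above; lr is the grid-searched value from {1, 0.1, 0.001}.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=1e-5)
```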

Install

All experiments were run on 8 NVIDIA A100 GPUs with 40 GB of memory each. The deep learning framework was PyTorch (torch 1.8.1). The requirements.txt file lists all of the project's dependencies, which can be installed with the following command.

$ pip install -r requirements.txt

Usages

For each method, run the following command in that method's directory:

python main.py

You can adjust the related hyper-parameters in the './main.py' file.
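
The exact hyper-parameter names differ between the methods' main.py files; the block below is only a hypothetical illustration of the kind of settings you would edit, using the values reported in the Experiments section.

```python
# Hypothetical hyper-parameter block (names are illustrative, not the repo's actual variables).
NUM_TASKS = 10        # CIFAR-100 split into 10 subtasks
EPOCHS = 100          # training epochs per task
BATCH_SIZE = 512      # or 1024
LR = 0.1              # grid-searched over {1, 0.1, 0.001}
MOMENTUM = 0.9
WEIGHT_DECAY = 1e-5
```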

Result

The test results are saved in each method directory's result folder.

| Metric | One | Joint | fine-tuning | PNN | LwF | EWC | DGR | HNet | CMN |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ACC | 0.7788 | 0.7423 | 0.1745 | 0.816 | 0.3398 | 0.44 | 0.0712 | 0.4003 | 0.8402 |
| FWT | 0 | \ | -0.001667 | 0.04988889 | -0.20589 | -4.76 | -0.414 | -0.32989 | 0.10088889 |
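
ACC is the average accuracy over all tasks after the final task has been learned, and FWT (forward transfer) measures how much training on earlier tasks helps a task before it is trained on. The sketch below computes both from an accuracy matrix following the commonly used definitions; the evaluation script in this repository may differ in its details.

```python
import numpy as np

def acc_fwt(R, b):
    """Compute ACC and FWT from an accuracy matrix.

    R[i, j] = test accuracy on task j after training up to task i (T x T).
    b[j]    = accuracy on task j of a randomly initialized model (baseline).
    """
    T = R.shape[0]
    acc = R[-1].mean()                                         # average final accuracy
    fwt = np.mean([R[j - 1, j] - b[j] for j in range(1, T)])   # forward transfer
    return acc, fwt

# Toy example with 3 tasks (numbers are made up).
R = np.array([[0.90, 0.10, 0.10],
              [0.80, 0.90, 0.20],
              [0.70, 0.80, 0.90]])
b = np.array([0.10, 0.10, 0.10])
print(acc_fwt(R, b))
```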

Related Works

Experimental Details and Supplementary Materials

More details and supplementary experimental results can be found in CMN_supplements.

Citation

If you find our work useful, please cite it as follows:

@ARTICLE{10197260,
  author={Peng, Jian and Ye, Dingqi and Tang, Bo and Lei, Yinjie and Liu, Yu and Li, Haifeng},
  journal={IEEE Transactions on Neural Networks and Learning Systems}, 
  title={Lifelong Learning With Cycle Memory Networks}, 
  year={2023},
  volume={},
  number={},
  pages={1-14},
  keywords={Task analysis;Knowledge engineering;Learning (artificial intelligence);Learning systems;Neuroscience;Microprocessors;Knowledge transfer;Anterograde forgetting;catastrophic forgetting;complementary learning theory;cycle memory network (CMN);lifelong learning},
  doi={10.1109/TNNLS.2023.3294495}}
