Learning Shared Knowledge for Deep Lifelong Learning using Deconvolutional Networks

This is the code for the Deconvolutional Factorized CNN (DF-CNN), proposed in the IJCAI 2019 paper "Learning Shared Knowledge for Deep Lifelong Learning using Deconvolutional Networks" by Seungwon Lee, James Stokes, and Eric Eaton.

Version and Dependencies

This code is compatible with both Python 2.7 and Python 3.5. It requires numpy, tensorflow, and scikit-image. Pre-processed data is stored and loaded as a .pkl file; be aware that a pickle file generated by Python 3 is NOT compatible with Python 2. Currently, MATLAB is required to load the summary of experiment results because it is saved as a .mat file.
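
If you only want to peek at the saved files from Python, the snippet below is a minimal sketch, not part of the repository: it assumes scipy is installed (scipy is not a listed dependency), reuses the .mat file name from the example command further down, and uses a hypothetical .pkl file name.

```python
# Minimal sketch (assumption: scipy is installed; it is not a listed dependency).
import pickle
import scipy.io as sio

# .mat result summary (file name taken from the example command below)
summary = sio.loadmat('MNIST10_result_3p_STL.mat')
print(sorted(k for k in summary if not k.startswith('__')))

# .pkl pre-processed data (hypothetical file name); note that a pickle
# written by Python 3 cannot be read by Python 2
with open('mnist_data.pkl', 'rb') as f:
    data = pickle.load(f)
```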

Data

  • MNIST (MTL)

    • MNIST has 10 classes, so we split it into 5 binary classification tasks (0 vs 1, 2 vs 3, and so on) for the heterogeneous task distribution and into 10 one-vs-all classification tasks for the homogeneous task distribution (a minimal splitting sketch follows this Data section).
    • Because baseline models already achieve good accuracy, we used only 3%, 5%, 7%, 10%, and 30% of the provided data for the training/validation set per task. Test data have 1800 and 2000 instances per task for the heterogeneous and homogeneous task distributions, respectively.
    • We did not use any data augmentation, but we rescaled pixel values to the range 0~1.
  • CIFAR-10 (MTL)

    • CIFAR-10 has 10 classes, so we split it into 5 binary classification tasks (0 vs 1, 2 vs 3, and so on) for the heterogeneous task distribution and into 10 one-vs-all classification tasks for the homogeneous task distribution.
    • We trained/tested models using 4%, 10%, 30%, 50%, and 70% of the provided training dataset for training and validation.
    • Test data have 2000 instances per task.
    • We did not use any data augmentation, but we normalized the images.
  • CIFAR-100 (Lifelong)

    • Similar to CIFAR-10, but with 100 classes.
    • Each task is a 10-class classification task, and there are 10 tasks for the lifelong learning setting with a heterogeneous task distribution (disjoint sets of image classes across the sub-tasks).
    • We trained models using only 4% of the available dataset.
    • We normalized the images.
  • Office-Home (Lifelong)

    • We used images from the Product and Real-World domains.
    • Each task is a 13-class classification task, and the image classes of the sub-tasks are randomly chosen without repetition (while keeping classes from the Product domain separate from those of the Real-World domain).
    • Images are resized to 128x128 and pixel values are rescaled to the range 0~1, but no normalization or augmentation is applied.
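
For concreteness, the snippet below is a minimal sketch (not the repository's actual loader) of the MNIST preprocessing described above: rescaling pixel values to 0~1 and splitting the 10 classes into 5 binary tasks. The array names and shapes are assumptions.

```python
# Minimal sketch of the task construction described above; assumes in-memory
# arrays `images` with values in [0, 255] and integer `labels` in {0, ..., 9}.
import numpy as np

def make_binary_tasks(images, labels, num_tasks=5):
    images = images.astype(np.float32) / 255.0   # rescale pixel values to 0~1
    tasks = []
    for t in range(num_tasks):
        pos, neg = 2 * t, 2 * t + 1              # task t: class 2t vs class 2t+1
        mask = (labels == pos) | (labels == neg)
        x = images[mask]
        y = (labels[mask] == pos).astype(np.int64)  # 1 for the first class, 0 otherwise
        tasks.append((x, y))
    return tasks

# Example with random stand-in data (shapes mimic MNIST)
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(100, 28, 28))
labels = rng.integers(0, 10, size=100)
tasks = make_binary_tasks(images, labels)
print(len(tasks), tasks[0][0].shape)
```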

Proposed Model

  • DF-CNN model (Deconvolutional_Factorized_CNN model in the code); a minimal sketch of its filter-generation idea follows this list

  • Ablated model 1: DF-CNN.direct model (Deconvolutional_Factorized_CNN_Direct model in the code)

  • Ablated model 2: DF-CNN.tc2 model (Deconvolutional_Factorized_CNN_tc2 model in the code)
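
The sketch below illustrates, in a hedged form, the core idea behind the DF-CNN: the convolutional filters of each task are generated from a small shared knowledge-base tensor through task-specific mappings that include a deconvolution (transposed convolution). All sizes are assumptions and the code uses TensorFlow 2 eager execution; it is a conceptual sketch, not the repository's implementation.

```python
# Conceptual sketch only (assumed sizes): a task's conv filters are generated
# from a shared knowledge base via task-specific deconvolution + projection.
import tensorflow as tf

kb_size, kb_ch = 3, 8                 # assumed knowledge-base spatial size / channels
filt_size, c_in, c_out = 5, 3, 16     # target filter bank: 5x5 conv, 3 -> 16 channels

# Knowledge base shared by every task
shared_kb = tf.Variable(tf.random.normal([1, kb_size, kb_size, kb_ch]))

# Task-specific parameters (much smaller than a full filter bank)
task_deconv_w = tf.Variable(tf.random.normal([3, 3, kb_ch, kb_ch]))       # transposed-conv kernel
task_proj_w = tf.Variable(tf.random.normal([1, 1, kb_ch, c_in * c_out]))  # 1x1 projection

def generate_task_filters():
    """Map the shared knowledge base to a [filt_size, filt_size, c_in, c_out] filter bank."""
    # Task-specific transposed conv enlarges the KB to the filter's spatial size:
    # with VALID padding and stride 1, (3 - 1) * 1 + 3 = 5
    expanded = tf.nn.conv2d_transpose(
        shared_kb, task_deconv_w,
        output_shape=[1, filt_size, filt_size, kb_ch],
        strides=[1, 1, 1, 1], padding='VALID')
    # Task-specific 1x1 projection to c_in * c_out channels, then reshape to a filter bank
    projected = tf.nn.conv2d(expanded, task_proj_w, strides=[1, 1, 1, 1], padding='SAME')
    return tf.reshape(projected, [filt_size, filt_size, c_in, c_out])

x = tf.random.normal([4, 32, 32, c_in])          # dummy input batch
y = tf.nn.conv2d(x, generate_task_filters(), strides=[1, 1, 1, 1], padding='SAME')
print(y.shape)                                    # (4, 32, 32, 16)
```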

Baseline Model

  • Single Task model

    • Constructs as many independent models as there are tasks, and trains each one independently.
  • Single Neural Net model

    • Constructs a single neural network and treats the data of all tasks identically.
  • Hard-parameter Shared model

    • Task networks share the convolutional layers but have independent fully-connected output layers.
  • Tensor Factorization model

    • Factorizes the parameters of each layer into a product of several tensors, and shares all but one of the factors across tasks (a minimal sketch follows this list). Details are in Yang, Yongxin, and Timothy Hospedales. "Deep multi-task representation learning: A tensor factorisation approach." arXiv preprint arXiv:1605.06391 (2016).
    • We used the Tucker decomposition because it performed better than the alternatives.
  • Dynamically Expandable Network model

    • Extends the hard-parameter shared model by selectively retraining some neurons, adding new neurons, and splitting neurons into disjoint groups for different sets of tasks, according to the given data.
    • The code (cnn_den_model.py) is almost identical to the code provided by the DEN authors.
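
As a rough illustration of the "share all factors but one" idea behind the Tensor Factorization baseline, the numpy sketch below reconstructs a per-task weight matrix from a Tucker-style decomposition in which the core tensor and two mode factors are shared, and only a small per-task vector differs. The sizes and ranks are assumptions; this is not the repository's implementation.

```python
# Minimal numpy sketch of a Tucker-style multi-task factorization:
# shared core + shared input/output factors, one task-specific factor.
import numpy as np

d_in, d_out, T = 64, 32, 5       # hypothetical layer size and number of tasks
r1, r2, r3 = 8, 8, 4             # Tucker ranks

core = np.random.randn(r1, r2, r3)   # shared core tensor
U_in = np.random.randn(d_in, r1)     # shared input-mode factor
U_out = np.random.randn(d_out, r2)   # shared output-mode factor
U_task = np.random.randn(T, r3)      # task-specific factor (one row per task)

def task_weight(t):
    """Reconstruct the d_in x d_out weight matrix of task t."""
    g_t = np.tensordot(core, U_task[t], axes=([2], [0]))  # contract task mode -> (r1, r2)
    return U_in @ g_t @ U_out.T                            # (d_in, d_out)

print(task_weight(0).shape)  # (64, 32)
```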

How to use the code

  • Prepare data:

    • Download the raw MNIST, CIFAR-10, CIFAR-100, and/or Office-Home datasets, and place them in the ./Data directory.
  • Run main_train_cl.py with the following arguments:

    • gpu: index of the GPU to use
    • data_type: name of the dataset and number of tasks (e.g. MNIST5/MNIST10/CIFAR10_5/CIFAR10_10/CIFAR100_10/CIFAR100_20/OfficeHome)
    • data_percent: the percentage of the original dataset to be used for training. Please check utils/utils_env_cl.py before use.
    • model_type: type of architecture
    • lifelong: flag to train in the lifelong learning setting (only one task is available at each update)
    • save_mat_name: the name of the .mat file storing all information computed during training
  • Example commands:

    • python main_train_cl.py --gpu 0 --data_type MNIST10 --data_percent 3 --model_type STL --save_mat_name MNIST10_result_3p_STL.mat
    • python main_train_cl.py --gpu 1 --data_type OfficeHome --model_type DFCNN --lifelong --save_mat_name OfficeHome_result_DFCNN_lifelong.mat
