Generate a semi-binary mask for a target network using a hypernetwork.
Use the environment.yml file to create a conda environment with the necessary libraries. One of the most essential packages is hypnettorch, which makes it easy to create hypernetworks in PyTorch.
The implemented experiments use four publicly available datasets for continual learning tasks: Permuted MNIST, Split MNIST, Split CIFAR-100 and Tiny ImageNet. The datasets may be downloaded when the algorithm runs.
The description of HyperMask is included in the paper. To run the experiments with the best hyperparameters found and reproduce the results from the publication for five different seed values, run the main.py file with the variable create_grid_search set to False and the variable dataset set to PermutedMNIST, SplitMNIST, CIFAR100 or TinyImageNet. In the third and fourth cases, either ResNet-20 or ZenkeNet can be selected as the target network. To train ResNets, set part = 0; to train ZenkeNets, set part = 1. In the remaining cases, the value of the variable part has no effect.
Also, to run experiments with CIFAR100 following the FeCAM scenario, set the variable dataset in main.py to CIFAR100_FeCAM_setup, with part = 6 to train a ResNet model or part = 7 to train a ZenkeNet model.
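The combinations of dataset and part described above can be summarized in a small lookup. This is only an illustrative sketch: the helper main_settings and the tables PART_VALUES and PART_IGNORED are hypothetical and not part of the HyperMask repository, and it assumes that the values of dataset are plain strings. In practice, the three variables are edited directly in main.py.

```python
# Hypothetical helper, not part of the HyperMask repository.
# (dataset, target network) -> value of `part`, as described in the text above.
PART_VALUES = {
    ("CIFAR100", "ResNet"): 0,
    ("CIFAR100", "ZenkeNet"): 1,
    ("TinyImageNet", "ResNet"): 0,
    ("TinyImageNet", "ZenkeNet"): 1,
    ("CIFAR100_FeCAM_setup", "ResNet"): 6,
    ("CIFAR100_FeCAM_setup", "ZenkeNet"): 7,
}

# Datasets for which the value of `part` has no effect.
PART_IGNORED = {"PermutedMNIST", "SplitMNIST"}


def main_settings(dataset, network=None):
    """Return the values one would assign in main.py to reproduce results."""
    if dataset in PART_IGNORED:
        part = 0  # arbitrary placeholder, the value is not used
    elif (dataset, network) in PART_VALUES:
        part = PART_VALUES[(dataset, network)]
    else:
        raise ValueError(f"Unsupported combination: {dataset!r}, {network!r}")
    return {"create_grid_search": False, "dataset": dataset, "part": part}


if __name__ == "__main__":
    print(main_settings("CIFAR100_FeCAM_setup", "ZenkeNet"))
```

A dictionary keeps the valid combinations in one place, so an unsupported pair fails early instead of starting a training run with the wrong part value.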
Hyperparameter optimization with a grid search is also supported. For this purpose, set the variable create_grid_search to True in the main.py file and modify the lists of hyperparameters for the selected dataset in the datasets.py file.
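As a rough illustration of what a grid search over lists of hyperparameters does, the sketch below expands such lists into every combination. The hyperparameter names and values, as well as the function expand_grid, are hypothetical; the actual lists and their structure are defined in datasets.py and may differ.

```python
from itertools import product

# Hypothetical hyperparameter lists; the real names and values live in datasets.py.
hyperparameter_lists = {
    "learning_rate": [0.001, 0.0001],
    "batch_size": [64, 128],
    "seed": [1],
}


def expand_grid(lists):
    """Return one dictionary per combination of the listed values."""
    names = list(lists)
    value_lists = [lists[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


configurations = expand_grid(hyperparameter_lists)

if __name__ == "__main__":
    for configuration in configurations:
        print(configuration)
```

The number of configurations is the product of the list lengths, so adding values to several lists at once quickly increases the number of training runs.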
If you use this library in your research project, please cite the following paper:
@misc{książek2023hypermask,
title={HyperMask: Adaptive Hypernetwork-based Masks for Continual Learning},
author={Kamil Książek and Przemysław Spurek},
year={2023},
eprint={2310.00113},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
Copyright 2023 Institute of Theoretical and Applied Informatics, Polish Academy of Sciences (ITAI PAS) https://www.iitis.pl and Group of Machine Learning Research (GMUM), Faculty of Mathematics and Computer Science of Jagiellonian University https://gmum.net/.
Authors:
- Kamil Książek (ITAI PAS, ORCID ID: 0000-0002-0201-6220),
- Przemysław Spurek (Jagiellonian University, ORCID ID: 0000-0003-0097-5521).
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.
The HyperMask repository includes parts of the code that come from or are based on external sources: hypnettorch, FeCAM, Tiny ImageNet preprocessing 1 and Tiny ImageNet preprocessing 2.