Selectively Forgetting Tasks with Elastic Weight Consolidation

Elastic Weight Consolidation (https://arxiv.org/abs/1612.00796) allows a parameterized model to sequentially learn tasks with independent data. This project: (1) reproduces the results of the EWC paper, (2) studies the saturation behavior of models (performance as more tasks are learned), and (3) implements a simple modification to the algorithm to learn tasks and then selectively forget some of them to free capacity. The full report for the project can be seen here.
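
For each previously learned task, EWC adds a quadratic penalty that anchors the weights to their post-task values, weighted by the diagonal of that task's Fisher information. A minimal PyTorch sketch of this loss (the names `fishers`, `anchors`, and `lam` are illustrative, not taken from this repo):

```python
import torch

def ewc_loss(model, task_loss, fishers, anchors, lam=15.0):
    """task_loss: plain loss on the current task's batch.
    fishers/anchors: one dict per old task, mapping parameter name ->
    diagonal Fisher estimate / parameter snapshot after that task.
    lam: EWC regularization strength (a hyperparameter)."""
    penalty = torch.zeros(())
    for fisher, anchor in zip(fishers, anchors):
        for name, param in model.named_parameters():
            # (lam / 2) * F_i * (theta_i - theta*_i)^2, summed over weights
            penalty = penalty + (fisher[name] * (param - anchor[name]) ** 2).sum()
    return task_loss + (lam / 2.0) * penalty
```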

Elastic Weight Consolidation paper reproduction: validation curves using standard EWC with a large model (2 hidden layers, 1000 units each):

  • 5 permuted MNIST tasks with only SGD+dropout:

  • The same tasks with EWC:

The model is large enough to learn all 5 independent tasks. As the tasks are learned sequentially, their validation accuracies converge and then stay constant.
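
Each permuted MNIST task applies its own fixed random permutation to the pixels of every image, so the tasks share statistics but have independent input structure. A sketch of the standard recipe for generating such tasks (a generic illustration, not the exact code in mnist_permute_exp.py):

```python
import numpy as np

def make_permuted_tasks(images, n_tasks, seed=0):
    """images: array of shape (N, 784). Returns one dataset per task;
    task 0 is conventionally left unpermuted."""
    rng = np.random.RandomState(seed)
    tasks = [images]  # original MNIST as the first task
    for _ in range(n_tasks - 1):
        perm = rng.permutation(images.shape[1])  # fixed pixel shuffle for this task
        tasks.append(images[:, perm])
    return tasks
```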

To recreate the permuted MNIST experiments, run mnist_permute_exp.py (usage instructions are in the script). This runs the experiment on the saturation behavior of the model, using a smaller model (2 hidden layers, 100 units each):

  • With EWC: note the drop in accuracy on new tasks as the small model is forced to remember all previous tasks.

  • Without EWC (SGD+dropout): plain SGD causes the model to forget a task as soon as it is trained on another.

  • Selective forgetting (here, the "forget policy" was: after task 4, forget tasks 0, 1, 3, 5): note the drop in validation accuracies for the forgotten tasks while the accuracy for task 2 remains the same. A sketch of the mechanism follows below.
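
One simple way to implement such a forget policy, assuming per-task Fisher estimates and weight anchors are kept separately (as in the penalty sketch above), is to drop the penalty terms of the tasks to be forgotten, so later training is free to overwrite the weights they were protecting. This is a hedged sketch of that idea, not necessarily the exact mechanism in ewc_fg.py:

```python
def forget_tasks(fishers, anchors, forget_ids):
    """Remove the EWC penalty terms for the given task indices, freeing
    the capacity those quadratic constraints were holding."""
    keep = [i for i in range(len(fishers)) if i not in set(forget_ids)]
    return [fishers[i] for i in keep], [anchors[i] for i in keep]

# e.g. after learning task 4, stop protecting tasks 0, 1, 3, and 5:
# fishers, anchors = forget_tasks(fishers, anchors, forget_ids=[0, 1, 3, 5])
```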

The script also displays a visualization of the Fisher Information Matrices for selected tasks and shows the weight correlation matrices after training on all tasks (the report describes these in detail).
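
The diagonal Fisher information that EWC uses can be estimated from squared gradients of the log-likelihood over samples from a task. A minimal PyTorch sketch (function and argument names are illustrative):

```python
import torch
import torch.nn.functional as F

def diag_fisher(model, data_loader, n_batches=100):
    """Estimate the diagonal of the Fisher information: the average squared
    gradient of the log-likelihood of labels sampled from the model."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    model.eval()
    for i, (x, _) in enumerate(data_loader):
        if i >= n_batches:
            break
        model.zero_grad()
        log_probs = F.log_softmax(model(x), dim=1)
        # sample labels from the model's own predictive distribution
        labels = torch.multinomial(log_probs.exp(), 1).squeeze(1)
        F.nll_loss(log_probs, labels).backward()
        for n, p in model.named_parameters():
            fisher[n] += p.grad.detach() ** 2
    return {n: f / min(n_batches, i + 1) for n, f in fisher.items()}
```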
