# **Continual Learning with Avalanche**

* CL consists of algorithms that learn about the external world continuously and adaptively through time, enabling the incremental1development of ever more complex skills and knowledge. 
* Rather than assuming a fixed and representative training set available a priori, as usually done in regularML, CL algorithms deal with a possibly non iid and unlimited stream of data or tasks.
* The rationale behind CL is learning efficiently in an online manner and without forgetting previously learned concepts
* In this notebook we will summarize the major Avalanche features for continual learning.

## **Main reference**
* https://medium.com/@NataliaDiazRodr/conceiving-avalanche-a-comprehensive-framework-for-continual-learning-research-d8b6820ab5e0
* https://github.com/ContinualAI/avalanche

## **Installing Avalanche**

In [None]:
pip install git+https://github.com/ContinualAI/avalanche.git

Collecting git+https://github.com/ContinualAI/avalanche.git
  Cloning https://github.com/ContinualAI/avalanche.git to /tmp/pip-req-build-85n8g0wu
  Running command git clone -q https://github.com/ContinualAI/avalanche.git /tmp/pip-req-build-85n8g0wu
  Installing build dependencies ... [?25l[?25hdone
  Getting requirements to build wheel ... [?25l[?25hdone
    Preparing wheel metadata ... [?25l[?25hdone
Collecting wandb
  Downloading wandb-0.12.6-py2.py3-none-any.whl (1.7 MB)
[K     |████████████████████████████████| 1.7 MB 5.2 MB/s 
Collecting quadprog
  Downloading quadprog-0.1.10.tar.gz (121 kB)
[K     |████████████████████████████████| 121 kB 40.8 MB/s 
[?25h  Installing build dependencies ... [?25l[?25hdone
  Getting requirements to build wheel ... [?25l[?25hdone
    Preparing wheel metadata ... [?25l[?25hdone
Collecting gputil
  Downloading GPUtil-1.4.0.tar.gz (5.5 kB)
Collecting pytorchcv
  Downloading pytorchcv-0.0.67-py2.py3-none-any.whl (532 kB)
[K     |███████

In [None]:
import avalanche
avalanche.__version__

'0.0.1'

In [None]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

In [None]:
import torch
import argparse
from torch.nn import CrossEntropyLoss
from torch.optim import SGD

In [None]:
from avalanche.benchmarks.classic import PermutedOmniglot, RotatedOmniglot, \
    SplitOmniglot
from avalanche.models import SimpleMLP
from avalanche.training.strategies import Naive

In [None]:
device = torch.device(f"cuda:{args.cuda}"
                          if torch.cuda.is_available() and
                          args.cuda >= 0 else "cpu")

## **Using Simple MLP as defined in the avalanche framework**

In [None]:
# model
model = SimpleMLP(num_classes=964)

In [None]:
parser = argparse.ArgumentParser()

In [None]:
parser.add_argument('--mnist_type', type=str, default='split',
                        choices=['rotated', 'permuted', 'split'],
                        help='Choose between MNIST variations: '
                             'rotated, permuted or split.')

_StoreAction(option_strings=['--mnist_type'], dest='mnist_type', nargs=None, const=None, default='split', type=<class 'str'>, choices=['rotated', 'permuted', 'split'], help='Choose between MNIST variations: rotated, permuted or split.', metavar=None)

In [None]:
parser.add_argument('--cuda', type=int, default=0,
                        help='Select zero-indexed cuda device. -1 to use CPU.')

_StoreAction(option_strings=['--cuda'], dest='cuda', nargs=None, const=None, default=0, type=<class 'int'>, choices=None, help='Select zero-indexed cuda device. -1 to use CPU.', metavar=None)

In [None]:
args, unknown = parser.parse_known_args()

## **Downloading the omniglot dataset**

In [None]:
scenario = PermutedOmniglot(n_experiences=4, seed=1)

Downloading https://raw.githubusercontent.com/brendenlake/omniglot/master/python/images_background.zip to /root/.avalanche/data/omniglot/omniglot-py/omniglot-py/images_background.zip


  0%|          | 0/9464212 [00:00<?, ?it/s]

Extracting /root/.avalanche/data/omniglot/omniglot-py/omniglot-py/images_background.zip to /root/.avalanche/data/omniglot/omniglot-py/omniglot-py
Downloading https://raw.githubusercontent.com/brendenlake/omniglot/master/python/images_evaluation.zip to /root/.avalanche/data/omniglot/omniglot-py/omniglot-py/images_evaluation.zip


  0%|          | 0/6462886 [00:00<?, ?it/s]

Extracting /root/.avalanche/data/omniglot/omniglot-py/omniglot-py/images_evaluation.zip to /root/.avalanche/data/omniglot/omniglot-py/omniglot-py


In [None]:
# Then we can extract the parallel train and test streams
train_stream = scenario.train_stream
test_stream = scenario.test_stream

In [None]:
# Prepare for training & testing
optimizer = SGD(model.parameters(), lr=0.001, momentum=0.9)
criterion = CrossEntropyLoss()

In [None]:
# Continual learning strategy with default logger
cl_strategy = Naive(model, optimizer, criterion, train_mb_size=32, train_epochs=2,
        eval_mb_size=32, device=device)

### **Split omniglot dataset**

In [None]:
from avalanche.benchmarks.classic import SplitOmniglot
scenario = SplitOmniglot(n_experiences=4, seed=1)
print("Starting experiment...")
results = []
for experience in scenario.test_stream:
  print("Start of experience: ", experience.current_experience)
  print("Current classes: ", experience.classes_in_this_experience)

  train_res = cl_strategy.train(experience)
  print("Training completed")

  print("Computing accuracy on the whole test set")
  results.append(cl_strategy.eval(scenario.test_stream))

Starting experiment...
Start of experience:  0
Current classes:  [5, 6]
-- >> Start of training phase << --
-- Starting training on experience 0 (Task 0) from test stream --
100%|██████████| 58/58 [02:16<00:00,  2.35s/it]
Epoch 0 ended.
	Loss_Epoch/train_phase/test_stream/Task000 = 2.0732
	Top1_Acc_Epoch/train_phase/test_stream/Task000 = 0.7405
100%|██████████| 58/58 [00:01<00:00, 44.12it/s]
Epoch 1 ended.
	Loss_Epoch/train_phase/test_stream/Task000 = 0.1081
	Top1_Acc_Epoch/train_phase/test_stream/Task000 = 0.9616
-- >> End of training phase << --
Training completed
Computing accuracy on the whole test set
-- >> Start of eval phase << --
-- Starting eval on experience 0 (Task 0) from test stream --
100%|██████████| 58/58 [00:00<00:00, 60.11it/s]
> Eval on experience 0 (Task 0) from test stream ended.
	Loss_Exp/eval_phase/test_stream/Task000/Exp000 = 0.0694
	Top1_Acc_Exp/eval_phase/test_stream/Task000/Exp000 = 0.9768
-- Starting eval on experience 1 (Task 0) from test stream --
100%|███

In [None]:
# train and test loop
results = []
for train_task in train_stream:
  print("Current Classes: ", train_task.classes_in_this_experience)
  cl_strategy.train(train_task)
  results.append(cl_strategy.eval(test_stream))

### **Rotated omniglot dataset**

In [None]:
from avalanche.benchmarks.classic import RotatedOmniglot
scenario = RotatedOmniglot(n_experiences=5, rotations_list=[30, 60, 90, 120, 150], seed=1)

Files already downloaded and verified
Files already downloaded and verified


In [None]:
# train and test loop
results = []
for train_task in train_stream:
  print("Current Classes: ", train_task.classes_in_this_experience)
  cl_strategy.train(train_task)
  results.append(cl_strategy.eval(test_stream))

### **Permuted omniglot dataset**

In [None]:
from avalanche.benchmarks.classic import PermutedOmniglot
scenario = PermutedOmniglot(n_experiences=4, seed=1)

Files already downloaded and verified
Files already downloaded and verified


In [None]:
# train and test loop
results = []
for train_task in train_stream:
  print("Current Classes: ", train_task.classes_in_this_experience)
  cl_strategy.train(train_task)
  results.append(cl_strategy.eval(test_stream))

## **Testing on MNIST dataset**

In [36]:
# Then we can extract the parallel train and test streams
train_stream = scenario.train_stream
test_stream = scenario.test_stream

In [37]:
from avalanche.benchmarks.classic import SplitMNIST
scenario = SplitMNIST(n_experiences=5, seed=1)
# train and test loop
results = []
for train_task in train_stream:
    cl_strategy.train(train_task, num_workers=4)
    results.append(cl_strategy.eval(test_stream))

-- >> Start of training phase << --
-- Starting training on experience 0 (Task 0) from train stream --


  cpuset_checked))


100%|██████████| 355/355 [37:09<00:00,  6.28s/it]
Epoch 0 ended.
	Loss_Epoch/train_phase/train_stream/Task000 = 0.1945
	Top1_Acc_Epoch/train_phase/train_stream/Task000 = 0.9453
100%|██████████| 355/355 [00:07<00:00, 46.23it/s]
Epoch 1 ended.
	Loss_Epoch/train_phase/train_stream/Task000 = 0.0615
	Top1_Acc_Epoch/train_phase/train_stream/Task000 = 0.9786
-- >> End of training phase << --
-- >> Start of eval phase << --
-- Starting eval on experience 0 (Task 0) from test stream --
100%|██████████| 58/58 [00:00<00:00, 63.65it/s]
> Eval on experience 0 (Task 0) from test stream ended.
	Loss_Exp/eval_phase/test_stream/Task000/Exp000 = 0.0477
	Top1_Acc_Exp/eval_phase/test_stream/Task000/Exp000 = 0.9865
-- Starting eval on experience 1 (Task 0) from test stream --
100%|██████████| 68/68 [00:01<00:00, 63.37it/s]
> Eval on experience 1 (Task 0) from test stream ended.
	Loss_Exp/eval_phase/test_stream/Task000/Exp001 = 4.4455
	Top1_Acc_Exp/eval_phase/test_stream/Task000/Exp001 = 0.0272
-- Starting 