In [1]:
"""
In this script, we will see how to define a custom model and train it using the Mammoth library.

In addition to the usual `train` and `load_runner` functions, we will need:
- `register_model`: to register our custom model with the Mammoth library.
- `ContinualModel`: to define our custom model.
"""

from mammoth_lite import register_model, ContinualModel, load_runner, train

In [2]:
@register_model('new-sgd') # Register this model with the name 'new-sgd'
class NewSgd(ContinualModel):
    """
    Each model must inherit from ContinualModel and implement the observe method.

    The observe method is called every time a new batch of data is available.
    It is responsible for training the model on the current task using the data provided.

    The model can also include a `COMPATIBILITY` attribute to specify the scenarios it is compatible with.
    In Mammoth-Lite, only the 'class-il' and 'task-il' scenarios are available and are set as default.

    The ContinualModel class provides attributes such as:
    - `net`: the backbone model that is used for training. The backbone is defined by default in the dataset but can be changed with the `backbone` argument.
    - `opt`: the optimizer used for training.
    - `loss`: the loss function used for training. This is defined by the dataset and is usually CrossEntropyLoss.
    """
    # COMPATIBILITY = ['class-il', 'task-il'] # More scenarios are available in the full Mammoth repo.

    def observe(self, inputs, labels, not_aug_inputs, epoch=None):
        """
        We will implement just the simplest algorithm for Continual Learning: SGD.
        With SGD, we simply train the model with the data provided, with no countermeasures against forgetting.
        """
        # zero the gradients
        self.opt.zero_grad()

        # forward pass on the model
        outputs = self.net(inputs)

        # compute the loss
        loss = self.loss(outputs, labels)

        # backward pass
        loss.backward()

        # update the weights
        self.opt.step()

        # return the loss value, for logging purposes
        return loss.item()
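
A name-based decorator such as `register_model` is commonly built on a plain dictionary that maps names to classes. The sketch below illustrates that general pattern only; it is not Mammoth-Lite's actual implementation, and `MODEL_REGISTRY`, `register_model_sketch`, `build_model` and `ToySgd` are hypothetical names introduced for this example.

```python
# Hypothetical registry, for illustration only; not Mammoth-Lite's internals.
MODEL_REGISTRY = {}


def register_model_sketch(name):
    """Return a class decorator that stores the decorated class under `name`."""
    def decorator(cls):
        if name in MODEL_REGISTRY:
            raise ValueError(f"model '{name}' is already registered")
        MODEL_REGISTRY[name] = cls
        return cls  # the class itself is returned unchanged
    return decorator


@register_model_sketch('toy-sgd')
class ToySgd:
    """Stand-in for a model class; it only mimics the shape of `observe`."""

    def observe(self, inputs, labels):
        return 0.0


def build_model(name, *args, **kwargs):
    """Look the class up by its registered name and instantiate it."""
    return MODEL_REGISTRY[name](*args, **kwargs)


model = build_model('toy-sgd')
```

With this pattern, a loader only needs the string name to find and instantiate the right class, which is presumably why `load_runner` can be called with `'new-sgd'` once the class has been decorated.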


In [3]:
"""
Now we can use the `load_runner` function to load our custom model.
"""

model, dataset = load_runner('new-sgd','seq-cifar10',{'lr': 0.1, 'n_epochs': 1, 'batch_size': 32})
train(model, dataset)

Loading model:  new-sgd
- Using ResNet as backbone
Using device cuda


  0%|          | 0/313 [00:00<?, ?it/s]

Task 1


Evaluating Task 1: 100%|██████████| 63/63 [00:00<00:00, 89.18it/s, acc_task_1=68.2] 


Accuracy for task 1	[Class-IL]: 68.20 	[Task-IL]: 68.20


  0%|          | 0/313 [00:00<?, ?it/s]

Task 2


Evaluating Task 2: 100%|██████████| 126/126 [00:01<00:00, 90.67it/s, acc_task_2=65.8] 


Accuracy for task 2	[Class-IL]: 32.90 	[Task-IL]: 62.62


  0%|          | 0/313 [00:00<?, ?it/s]

Task 3


Evaluating Task 3: 100%|██████████| 189/189 [00:02<00:00, 88.83it/s, acc_task_3=79]   


Accuracy for task 3	[Class-IL]: 26.33 	[Task-IL]: 70.88


  0%|          | 0/313 [00:00<?, ?it/s]

Task 4


Evaluating Task 4: 100%|██████████| 252/252 [00:02<00:00, 92.62it/s, acc_task_4=72.8] 


Accuracy for task 4	[Class-IL]: 18.20 	[Task-IL]: 63.24


  0%|          | 0/313 [00:00<?, ?it/s]

Task 5


Evaluating Task 5: 100%|██████████| 315/315 [00:03<00:00, 87.87it/s, acc_task_5=82.5] 

Accuracy for task 5	[Class-IL]: 16.50 	[Task-IL]: 68.13
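
The gap between the two columns in the log above comes from how predictions are made in each scenario. Under the usual definitions, Class-IL picks the best class among all classes, while Task-IL is told which task a sample belongs to and only chooses among that task's classes. The sketch below illustrates this with made-up logits, assuming 5 tasks of 2 classes each; the helper names are hypothetical and the code is not taken from Mammoth-Lite.

```python
def argmax(values):
    """Index of the largest value (first one in case of ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def predict_class_il(logits):
    """Class-IL: choose among all classes, with no task information."""
    return argmax(logits)


def predict_task_il(logits, task_id, classes_per_task=2):
    """Task-IL: mask every logit outside the given task before choosing."""
    start = task_id * classes_per_task
    end = start + classes_per_task
    masked = [v if start <= i < end else float('-inf')
              for i, v in enumerate(logits)]
    return argmax(masked)


# Made-up logits for a sample whose true class is 1 (it belongs to task 0),
# produced by a network that has just finished training on the last task.
logits = [0.1, 1.2, 0.3, 0.0, 0.2, 0.4, 0.1, 0.5, 2.5, 0.9]

class_il_pred = predict_class_il(logits)           # 8: biased towards the latest task
task_il_pred = predict_task_il(logits, task_id=0)  # 1: correct once the task is known
```

A network trained with plain SGD tends to favour the classes of the most recent task, so the unmasked prediction is often wrong even when the within-task ranking is still correct. This is consistent with Class-IL accuracy dropping much faster than Task-IL accuracy in the run above.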





In [2]:
"""
Let's create a more sophisticated model that uses a *replay* buffer to store past data and use it to train the model.

Replay-based methods (rehearsal) are so common in Continual Learning that Mammoth-Lite provides a simple implementation of a replay buffer.
To define a buffer, we can use the `Buffer` class, which implements the simple `reservoir sampling` algorithm.
In addition, we will use the `add_rehearsal_args` function to add command line arguments for the replay buffer size and the replay minibatch size.
"""

from argparse import ArgumentParser
from mammoth_lite import Buffer, add_rehearsal_args

@register_model('experience-replay')  # Register this model with the name 'experience-replay'
class ExperienceReplay(ContinualModel):
    """
    This model uses a replay buffer to store past data and replay it while training on new data.
    The replay buffer is defined by the `Buffer` class, which is filled using reservoir sampling.
    """
    COMPATIBILITY = ['class-il', 'task-il']

    @staticmethod
    def get_parser(parser: ArgumentParser):
        """
        This method is used to define additional command line arguments for the model.
        It is called by the `load_runner` function to parse the arguments.
        """

        # We can add the rehearsal arguments to the parser.
        # This includes:
        # - `--buffer_size`: the size of the replay buffer;
        # - `--minibatch_size`: the size of the mini-batch sampled from the buffer at each training step.
        add_rehearsal_args(parser)

        # We can also add other arguments specific to the model.
        parser.add_argument('--alpha', type=float, default=0.5,
                            help='Weight of the loss on buffer (old) data relative to the loss on the current batch.')
        
        return parser

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

        # initialize the replay buffer with the size defined in the command line arguments
        self.buffer = Buffer(buffer_size=self.args.buffer_size) 

    def observe(self, inputs, labels, not_aug_inputs, epoch=None):
        """
        We will implement a simple experience replay algorithm.
        We will store the data in the replay buffer and use it to train the model.
        """
        self.opt.zero_grad()

        # compute the loss on the examples from the current task
        outputs = self.net(inputs)
        loss = self.loss(outputs, labels)

        # Sample a batch from the buffer
        if len(self.buffer) > 0:
            buffer_inputs, buffer_labels = self.buffer.get_data(
                size=self.args.minibatch_size, device=self.device)
            
            # Forward pass on the buffer data
            buffer_outputs = self.net(buffer_inputs)
            # Compute the loss on the buffer data
            buffer_loss = self.loss(buffer_outputs, buffer_labels)
            # Combine the losses from the current batch and the buffer
            loss = loss + self.args.alpha * buffer_loss

        # backward pass and update the weights
        loss.backward()
        self.opt.step()
        
        # Store the current batch in the buffer
        self.buffer.add_data(inputs, labels)

        # return the loss value, for logging purposes
        return loss.item()
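
Reservoir sampling keeps a fixed-size buffer in which every example seen so far has the same probability of being stored, without knowing the length of the stream in advance. The sketch below is a minimal pure-Python version of the classic algorithm, written for illustration; `ReservoirBufferSketch` and its methods are hypothetical names and do not reproduce the `Buffer` class shipped with Mammoth-Lite.

```python
import random


class ReservoirBufferSketch:
    """Fixed-size buffer filled with reservoir sampling (illustrative only)."""

    def __init__(self, buffer_size, seed=0):
        self.buffer_size = buffer_size
        self.items = []
        self.num_seen = 0                 # number of items offered so far
        self.rng = random.Random(seed)

    def __len__(self):
        return len(self.items)

    def add(self, item):
        if len(self.items) < self.buffer_size:
            # The buffer is not full yet: always store the item.
            self.items.append(item)
        else:
            # Draw a slot uniformly among all items seen so far (this one
            # included); overwrite only if the slot falls inside the buffer.
            j = self.rng.randint(0, self.num_seen)  # inclusive on both ends
            if j < self.buffer_size:
                self.items[j] = item
        self.num_seen += 1

    def sample(self, size):
        """Draw up to `size` distinct stored items uniformly at random."""
        size = min(size, len(self.items))
        return self.rng.sample(self.items, size)


buffer = ReservoirBufferSketch(buffer_size=5)
for x in range(100):
    buffer.add(x)
batch = buffer.sample(3)
```

Because the chance of overwriting a slot shrinks as more data arrives, examples from early tasks are not flushed out by later ones, which a first-in-first-out queue would do.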

In [3]:
"""
Now let's see it in action.
"""

args = {
    # these are the same arguments as before
    'lr': 0.1,
    'n_epochs': 1,
    'batch_size': 32,
    # new: the size of the replay buffer and of the minibatch sampled from it
    'buffer_size': 1000,
    'minibatch_size': 32,
    # weight of the buffer loss (overrides the default of 0.5)
    'alpha': 0.2
    }

model, dataset = load_runner('experience-replay','seq-cifar10', args)
train(model, dataset)

Loading model:  experience-replay
- Using ResNet as backbone
Using device cuda


  0%|          | 0/313 [00:00<?, ?it/s]

Evaluating Task 1: 100%|██████████| 63/63 [00:00<00:00, 88.79it/s, acc_task_1=77]   


Accuracy for task 1	[Class-IL]: 77.00 	[Task-IL]: 77.00


  0%|          | 0/313 [00:00<?, ?it/s]

Evaluating Task 2: 100%|██████████| 126/126 [00:01<00:00, 85.04it/s, acc_task_2=59.5] 


Accuracy for task 2	[Class-IL]: 32.38 	[Task-IL]: 69.08


  0%|          | 0/313 [00:00<?, ?it/s]

Evaluating Task 3: 100%|██████████| 189/189 [00:02<00:00, 87.20it/s, acc_task_3=74.7]  


Accuracy for task 3	[Class-IL]: 31.78 	[Task-IL]: 76.10


  0%|          | 0/313 [00:00<?, ?it/s]

Evaluating Task 4: 100%|██████████| 252/252 [00:02<00:00, 87.22it/s, acc_task_4=90]    


Accuracy for task 4	[Class-IL]: 36.15 	[Task-IL]: 80.36


  0%|          | 0/313 [00:00<?, ?it/s]

Evaluating Task 5: 100%|██████████| 315/315 [00:03<00:00, 84.22it/s, acc_task_5=88.3] 

Accuracy for task 5	[Class-IL]: 28.40 	[Task-IL]: 81.94
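
The `args` dictionary passed to `load_runner` mixes general options (`lr`, `n_epochs`, `batch_size`) with the ones declared in `get_parser`. One plausible way to reconcile a dictionary with an `ArgumentParser` is to turn it into a list of command line flags, so that types and defaults are handled by `argparse` itself. The sketch below shows that idea with the standard library only; how Mammoth-Lite actually does this is not shown in this notebook, and `add_rehearsal_args_sketch`, `build_parser` and `parse_from_dict` are hypothetical helpers with assumed defaults.

```python
from argparse import ArgumentParser


def add_rehearsal_args_sketch(parser):
    """Hypothetical stand-in for `add_rehearsal_args`; real defaults may differ."""
    parser.add_argument('--buffer_size', type=int, default=None,
                        help='Size of the replay buffer.')
    parser.add_argument('--minibatch_size', type=int, default=None,
                        help='Size of the minibatch sampled from the buffer.')
    return parser


def build_parser():
    parser = ArgumentParser()
    # General options (assumed here; the library defines its own).
    parser.add_argument('--lr', type=float, default=0.1)
    parser.add_argument('--n_epochs', type=int, default=1)
    parser.add_argument('--batch_size', type=int, default=32)
    # Model-specific options, mirroring `ExperienceReplay.get_parser`.
    add_rehearsal_args_sketch(parser)
    parser.add_argument('--alpha', type=float, default=0.5)
    return parser


def parse_from_dict(parser, overrides):
    """Convert a dict into `--key=value` flags and let argparse validate them."""
    argv = [f'--{key}={value}' for key, value in overrides.items()]
    return parser.parse_args(argv)


args = {'lr': 0.1, 'n_epochs': 1, 'batch_size': 32,
        'buffer_size': 1000, 'minibatch_size': 32, 'alpha': 0.2}
parsed = parse_from_dict(build_parser(), args)
defaults = parse_from_dict(build_parser(), {})
```

Here `parsed.alpha` holds the value from the dictionary, while `defaults.alpha` falls back to the default declared with `add_argument`. An unknown key would make `argparse` exit with an error instead of being silently ignored.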



