---
description: 'First things first: let''s start with a good model!'
---

# Models

Welcome to the "**Models**" tutorial of the "_From Zero to Hero_" series. This notebook covers the features offered by the _Avalanche_ `models` sub-module.

## Support for PyTorch Modules

Every continual learning experiment needs a model to train incrementally. You can use any `torch.nn.Module`, including pretrained models. The `models` sub-module provides the architectures most commonly used in the continual learning (CL) literature.

You can also use any model from the official [PyTorch](https://pytorch.org/) ecosystem, as well as those provided by [pytorchcv](https://pypi.org/project/pytorchcv/)!
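
Since any `torch.nn.Module` is accepted, a hand-written network works too. The snippet below is a minimal sketch that uses only PyTorch; `TinyMLP`, its layer sizes, and the 28x28 input shape are hypothetical choices made for illustration and are not part of Avalanche.

```python
import torch
from torch import nn


class TinyMLP(nn.Module):
    """Hypothetical two-layer MLP for 28x28 grayscale images."""

    def __init__(self, num_classes=10, hidden_size=64):
        super().__init__()
        # Feature extractor: flatten the image and apply one hidden layer.
        self.features = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, hidden_size),
            nn.ReLU(),
        )
        # Output layer, kept as a separate attribute.
        self.classifier = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x))


model = TinyMLP()
logits = model(torch.randn(4, 1, 28, 28))  # batch of 4 random "images"
print(logits.shape)  # torch.Size([4, 10])
```

A model defined this way can be used wherever the tutorial uses one of the built-in architectures.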

In [1]:
!pip install avalanche-lib==0.3.1

In [2]:
from avalanche.models import SimpleCNN
from avalanche.models import SimpleMLP
from avalanche.models import SimpleMLP_TinyImageNet
from avalanche.models import MobilenetV1

model = SimpleCNN()
print(model)

SimpleCNN(
  (features): Sequential(
    (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace=True)
    (2): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1))
    (3): ReLU(inplace=True)
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Dropout(p=0.25, inplace=False)
    (6): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (7): ReLU(inplace=True)
    (8): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1))
    (9): ReLU(inplace=True)
    (10): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (11): Dropout(p=0.25, inplace=False)
    (12): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
    (13): ReLU(inplace=True)
    (14): AdaptiveMaxPool2d(output_size=1)
    (15): Dropout(p=0.25, inplace=False)
  )
  (classifier): Sequential(
    (0): Linear(in_features=64, out_features=10, bias=True)
  )
)


## Dynamic Model Expansion

A continual learning model may change over time. For example, a classifier may add new units for previously unseen classes, while progressive networks add a new set of units after each experience. Avalanche provides `DynamicModule`s to support these use cases. `DynamicModule`s are `torch.nn.Module`s that provide an additional method, `adaptation`, which updates the model's architecture. The method takes a single argument, the data from the current experience.

For example, an `IncrementalClassifier` updates the number of output units:

In [3]:
from avalanche.benchmarks import SplitMNIST
from avalanche.models import IncrementalClassifier

benchmark = SplitMNIST(5, shuffle=False, class_ids_from_zero_in_each_exp=False)
model = IncrementalClassifier(in_features=784)

print(model)
for exp in benchmark.train_stream:
    model.adaptation(exp)
    print(model)

IncrementalClassifier(
  (classifier): Linear(in_features=784, out_features=2, bias=True)
)
IncrementalClassifier(
  (classifier): Linear(in_features=784, out_features=2, bias=True)
)
IncrementalClassifier(
  (classifier): Linear(in_features=784, out_features=4, bias=True)
)
IncrementalClassifier(
  (classifier): Linear(in_features=784, out_features=6, bias=True)
)
IncrementalClassifier(
  (classifier): Linear(in_features=784, out_features=8, bias=True)
)
IncrementalClassifier(
  (classifier): Linear(in_features=784, out_features=10, bias=True)
)


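As you can see, each call to the `adaptation` method grows the output layer to cover the classes of the current experience, which here means 2 new units per experience. The first call changes nothing because the classifier already starts with 2 output units. Notice that no learning occurs at this point, since the method only modifies the model's architecture.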
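
The sequence of layer sizes printed above can be reproduced with a little arithmetic. The sketch below is an assumption about the sizing rule, not Avalanche's actual code: it assumes that the head needs one unit per class id seen so far, that it never shrinks, and that `SplitMNIST(5, shuffle=False)` assigns consecutive pairs of digits to each experience. `classes_per_experience` and `required_units` are illustrative names.

```python
# Assumed class split: consecutive pairs of digits, one pair per experience.
classes_per_experience = [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]


def required_units(current_units, classes):
    """Units needed so that every class id has its own output; never shrinks."""
    return max(current_units, max(classes) + 1)


units = 2  # size printed before any adaptation
sizes = [units]
for classes in classes_per_experience:
    units = required_units(units, classes)
    sizes.append(units)

print(sizes)  # [2, 2, 4, 6, 8, 10]
```

Under these assumptions the first adaptation is a no-op, which matches the two identical printouts at the start of the output.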

Keep in mind that when you use Avalanche strategies you don't have to call `adaptation` yourself. Avalanche strategies automatically call the model's `adaptation` method and update the optimizer to include the new parameters.
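
To see why both steps matter, here is a minimal sketch in plain PyTorch of what growing a layer involves. It is not Avalanche's implementation: `expand_linear` is a hypothetical helper, and copying the old rows into the wider layer is one possible design, assumed here for illustration.

```python
import torch
from torch import nn


def expand_linear(old_layer, new_out_features):
    """Return a wider Linear layer whose first rows copy the old parameters."""
    new_layer = nn.Linear(old_layer.in_features, new_out_features)
    with torch.no_grad():
        new_layer.weight[: old_layer.out_features] = old_layer.weight
        new_layer.bias[: old_layer.out_features] = old_layer.bias
    return new_layer


torch.manual_seed(0)
x = torch.randn(3, 784)

old_head = nn.Linear(784, 2)
optimizer = torch.optim.SGD(old_head.parameters(), lr=0.01)

new_head = expand_linear(old_head, 4)

# The architecture changed, but the outputs for the old classes did not.
old_outputs_preserved = torch.allclose(
    old_head(x), new_head(x)[:, :2], atol=1e-5
)

# The existing optimizer still tracks only the old parameters...
tracked = {id(p) for group in optimizer.param_groups for p in group["params"]}
new_params_tracked = any(id(p) in tracked for p in new_head.parameters())

# ...so it is rebuilt here to make the new units trainable.
optimizer = torch.optim.SGD(new_head.parameters(), lr=0.01)

print(old_outputs_preserved, new_params_tracked)  # True False
```

If the optimizer were left untouched in this sketch, the new units would never receive updates, which is why it helps that strategies take care of this step.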

## Multi-Task models

Some models, such as multi-head classifiers, are designed to exploit task labels. In Avalanche, such models are implemented as `MultiTaskModule`s. These are dynamic models (since they need to be updated whenever they encounter a new task) that have an additional `task_labels` argument in their `forward` method. `task_labels` is a tensor with a task id for each sample.

In [4]:
from avalanche.benchmarks import SplitMNIST
from avalanche.models import MultiHeadClassifier

benchmark = SplitMNIST(5, shuffle=False, return_task_id=True, class_ids_from_zero_in_each_exp=True)
model = MultiHeadClassifier(in_features=784)

print(model)
for exp in benchmark.train_stream:
    model.adaptation(exp)
    print(model)

MultiHeadClassifier(
  (classifiers): ModuleDict(
    (0): IncrementalClassifier(
      (classifier): Linear(in_features=784, out_features=2, bias=True)
    )
  )
)
MultiHeadClassifier(
  (classifiers): ModuleDict(
    (0): IncrementalClassifier(
      (classifier): Linear(in_features=784, out_features=2, bias=True)
    )
  )
)
MultiHeadClassifier(
  (classifiers): ModuleDict(
    (0): IncrementalClassifier(
      (classifier): Linear(in_features=784, out_features=2, bias=True)
    )
    (1): IncrementalClassifier(
      (classifier): Linear(in_features=784, out_features=2, bias=True)
    )
  )
)
MultiHeadClassifier(
  (classifiers): ModuleDict(
    (0): IncrementalClassifier(
      (classifier): Linear(in_features=784, out_features=2, bias=True)
    )
    (1): IncrementalClassifier(
      (classifier): Linear(in_features=784, out_features=2, bias=True)
    )
    (2): IncrementalClassifier(
      (classifier): Linear(in_features=784, out_features=2, bias=True)
    )
  )
)
MultiHeadClas

When you use a `MultiHeadClassifier`, a new head is initialized whenever a new task is encountered. Avalanche strategies automatically recognize multi-task modules and provide task labels to them.

### How to define a multi-task Module

If you want to define a custom multi-task module, you need to override two methods: `adaptation` (if needed) and `forward_single_task`. The `forward` method of the base class splits the mini-batch by task id and passes each single-task mini-batch to `forward_single_task`.

In [5]:
from avalanche.models import MultiTaskModule

class CustomMTModule(MultiTaskModule):
    def __init__(self, in_features, initial_out_features=2):
        super().__init__()

    def adaptation(self, dataset):
        super().adaptation(dataset)
        # your adaptation goes here

    def forward_single_task(self, x, task_label):
        # your forward goes here.
        # task_label is a single integer
        # the mini-batch is split by task-id inside the forward method.
        pass
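
The splitting performed by the base-class `forward` can be pictured with plain Python lists. This is a framework-free sketch of the idea only, not Avalanche's code: `forward_by_task` and `toy_single_task` are hypothetical names, and real mini-batches would be tensors rather than lists.

```python
from collections import defaultdict


def forward_by_task(samples, task_labels, forward_single_task):
    """Group samples by task id, process each group, restore the original order."""
    groups = defaultdict(list)
    for index, task in enumerate(task_labels):
        groups[task].append(index)

    outputs = [None] * len(samples)
    for task, indices in groups.items():
        # Each call sees samples from a single task and one integer label.
        group_outputs = forward_single_task([samples[i] for i in indices], task)
        for i, out in zip(indices, group_outputs):
            outputs[i] = out
    return outputs


def toy_single_task(xs, task_label):
    # Stand-in for a per-task head: tag each input with the task that handled it.
    return [f"{x}@task{task_label}" for x in xs]


batch = ["a", "b", "c", "d"]
labels = [0, 1, 0, 1]
result = forward_by_task(batch, labels, toy_single_task)
print(result)  # ['a@task0', 'b@task1', 'c@task0', 'd@task1']
```

The outputs come back in the same order as the inputs, so the caller never needs to know that the batch was split.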

Alternatively, if you only want to convert a single-head model into a multi-head model, you can use the `as_multitask` wrapper, which converts the model for you.

In [6]:
from avalanche.models import as_multitask

model = SimpleCNN()
print(model)

mt_model = as_multitask(model, 'classifier')
print(mt_model)

SimpleCNN(
  (features): Sequential(
    (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): ReLU(inplace=True)
    (2): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1))
    (3): ReLU(inplace=True)
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (5): Dropout(p=0.25, inplace=False)
    (6): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (7): ReLU(inplace=True)
    (8): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1))
    (9): ReLU(inplace=True)
    (10): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    (11): Dropout(p=0.25, inplace=False)
    (12): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
    (13): ReLU(inplace=True)
    (14): AdaptiveMaxPool2d(output_size=1)
    (15): Dropout(p=0.25, inplace=False)
  )
  (classifier): Sequential(
    (0): Linear(in_features=64, out_features=10, bias=True)
  )
)
MultiTaskDecorator(
  (model): SimpleCNN(
    (features): Seque

## 🤝 Run it on Google Colab

You can run _this chapter_ and play with it on Google Colaboratory: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ContinualAI/avalanche/blob/master/notebooks/from-zero-to-hero-tutorial/02_models.ipynb)