# Abstract
This tutorial is a simple starter kit for people who want to experiment with continual semantic segmentation. To simplify experimentation and method development, we propose Continual-Extended-Mnist, a new toy dataset for continual segmentation that reflects the scenarios explored in recent papers. We explain two of the main challenges, catastrophic forgetting and background shift, and show how a naïve baseline systematically fails even on our low-resolution black-and-white images. Then, we introduce a few tricks popularized by the papers “Knowledge Distillation for Incremental Learning in Semantic Segmentation” and “Modeling the Background for Incremental Learning in Semantic Segmentation” (MiB).

# Introduction
If we look at papers on continual learning for classification, most of them experiment on simple datasets such as MNIST, CIFAR10 or Mini-Imagenet. These are small-scale datasets of tiny images, used for a relatively simple task: classifying object-centered images. In contrast, the datasets used to explore the more complex task of continual semantic segmentation are generally Pascal-VOC, ADE20K or COCO: ~10K to 20K high-resolution images with many objects in each image, which require several hours of training on GPUs, if not days.

While toy datasets are sometimes criticized as unrealistic because they are much simpler than real-life scenarios, they serve a non-negligible purpose. **They accelerate the research cycle and the development of models** by removing unnecessarily complicated variables while keeping most of the essentials. They make it possible to run many experiments with different models in a fraction of the time and to check whether a method addresses the problem we aim to solve, which in our case is mainly catastrophic forgetting. If a technique cannot reduce catastrophic forgetting on 60x60 black-and-white images, there is little reason to think that it will solve it on real images of complex scenes.


# Continual Extended Mnist

We owe the generation of the raw data to this [repository](https://github.com/LukeTonin/simple-deep-learning) made by Luke Tonin. He also wrote an excellent tutorial on [simple semantic segmentation](https://awaywithideas.com/mnist-extended-a-dataset-for-semantic-segmentation-and-object-detection/) that inspired ours. We started from this dataset and adapted it to a class-incremental learning scenario, i.e. classes are seen sequentially in successive tasks. In the first task, the model has to learn to segment 0's and 1's, then in the next task it must learn 2's and 3's, then 4's and 5's, and so on.
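
To make the task split concrete, here is a minimal sketch of how a ground-truth mask could be restricted to the classes of the current task. It assumes that pixels belonging to digits outside the current task are relabelled as background, which is the usual setup in class-incremental segmentation; whether the `scenarios` module does exactly this is an assumption. The helper `mask_for_task`, the `TASKS` dictionary and the `-1` background label are illustrative names chosen to mirror the rest of this tutorial, not code taken from the repository.

```python
# Hypothetical task split: two new digit classes per task.
TASKS = {0: [0, 1], 1: [2, 3], 2: [4, 5], 3: [6, 7], 4: [8, 9]}
BACKGROUND = -1  # assumed background label


def mask_for_task(mask, task_id, tasks=TASKS, background=BACKGROUND):
    """Keep only the labels of the current task; every other pixel becomes background."""
    current = set(tasks[task_id])
    return [[px if px in current else background for px in row] for row in mask]


# A tiny 3x4 "image" containing a 0, a 3 and a 4 on a background.
full_mask = [
    [-1, 0, 0, -1],
    [3, 3, -1, 4],
    [-1, -1, 4, 4],
]

for task_id in range(3):
    print(task_id, mask_for_task(full_mask, task_id))
```

Under this assumption, the same pixel can be labelled as a digit in one task and as background in another, which is the source of the background shift mentioned in the abstract.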

Let's first take a look at the dataset.

In [1]:
#from google.colab import drive
#drive.mount('/content/gdrive')



In [1]:
import os
import sys
#os.chdir("/content/gdrive/MyDrive/Colab Notebooks/mnist_continual_seg")
sys.path.append("/home/mathieu/Documents/Deep/mnist-continual-seg/simple-deep-learning")
import matplotlib.pyplot as plt
%matplotlib inline  
from scenarios import ContinualMnist




In [2]:
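# Initialize the segmentation model
from models import simple_seg_model
# Each task introduces two new digit classes
_tasks = {0: [0,1], 1: [2,3], 2: [4,5], 3: [6,7], 4: [8,9]}
# First head: the task-0 digits plus one extra class (the background, shown as -1 in the logs)
model = simple_seg_model(n_classes_per_task=[len(_tasks[0])+1])
model = model.cuda()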

In [3]:
import torch
from metrics import EvaluaterCallback
from trainer import Trainer

# Set optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=0.0005)
# Set scenario
continual_mnist = ContinualMnist(n_train=5000, n_test=2500, batch_size=72, tasks=_tasks)
# Set Evaluator
evaluater = EvaluaterCallback(model, ["confusion_matrix"], callback_frequency="epoch", n_classes=11, save_matrices=True)
# Set trainer
trainer = Trainer(model,
                  n_classes=[3],
                  optim=optimizer,
                  callbacks=[evaluater])
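
The three summary numbers reported in the training logs of this tutorial (overall accuracy, mean accuracy and mean IoU) can all be derived from a pixel-level confusion matrix. The sketch below uses the standard definitions; it is not the code of `EvaluaterCallback`, and the function name `segmentation_metrics` and the toy matrix are made up for illustration.

```python
def segmentation_metrics(conf):
    """Compute summary metrics from a confusion matrix.

    conf[i][j] is the number of pixels whose true class is i and
    whose predicted class is j.
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    correct = sum(conf[c][c] for c in range(n))

    class_acc, class_iou = [], []
    for c in range(n):
        tp = conf[c][c]                             # correctly labelled pixels of class c
        n_true = sum(conf[c])                       # pixels whose true class is c
        n_pred = sum(conf[r][c] for r in range(n))  # pixels predicted as class c
        union = n_true + n_pred - tp
        class_acc.append(tp / n_true if n_true else 0.0)
        class_iou.append(tp / union if union else 0.0)

    return {
        "overall_acc": correct / total,
        "mean_acc": sum(class_acc) / n,
        "mean_iou": sum(class_iou) / n,
        "class_acc": class_acc,
        "class_iou": class_iou,
    }


# Toy matrix: rows and columns are background, digit 0, digit 1.
toy_conf = [
    [90, 5, 5],
    [2, 8, 0],
    [1, 0, 9],
]
metrics = segmentation_metrics(toy_conf)
for name in ("overall_acc", "mean_acc", "mean_iou"):
    print(f"{name}: {metrics[name]:.4f}")
```

Because the background covers most pixels, overall accuracy stays high even when the digits are poorly segmented, so the per-class IoU is the more informative number. A class with no correctly predicted pixel gets an IoU of 0, whatever its size.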

In [4]:
from utils import meta_train
meta_train(n_tasks=len(_tasks),
           epochs=100,
           scenario=continual_mnist,
           trainer=trainer,
           evaluater=evaluater,
           memory=None,
           animation_path='naive')

*******
Task #0
*******
Classes to learn:
-1 0 1
*******


100%|██████████| 100/100 [01:51<00:00,  1.11s/it]



Overall stats
|    |   Overall Acc |   Mean Acc |   Mean IoU |
|---:|--------------:|-----------:|-----------:|
|  0 |      0.992256 |   0.946277 |   0.893905 |
Class IoU
|    |       -1 |        0 |        1 | 2   | 3   | 4   | 5   | 6   | 7   | 8   | 9   |
|---:|---------:|---------:|---------:|:----|:----|:----|:----|:----|:----|:----|:----|
|  0 | 0.991974 | 0.839183 | 0.850556 | X   | X   | X   | X   | X   | X   | X   | X   |
Class Acc
|    |       -1 |        0 |        1 | 2   | 3   | 4   | 5   | 6   | 7   | 8   | 9   |
|---:|---------:|---------:|---------:|:----|:----|:----|:----|:----|:----|:----|:----|
|  0 | 0.995685 | 0.917666 | 0.925482 | X   | X   | X   | X   | X   | X   | X   | X   |

####################################
Next Task
####################################



*******
Task #1
*******
Classes to learn:
-1 2 3
*******


100%|██████████| 100/100 [01:49<00:00,  1.09s/it]



Overall stats
|    |   Overall Acc |   Mean Acc |   Mean IoU |
|---:|--------------:|-----------:|-----------:|
|  0 |      0.963796 |   0.552332 |   0.516122 |
Class IoU
|    |       -1 |   0 |   1 |        2 |        3 | 4   | 5   | 6   | 7   | 8   | 9   |
|---:|---------:|----:|----:|---------:|---------:|:----|:----|:----|:----|:----|:----|
|  0 | 0.963059 |   0 |   0 | 0.790388 | 0.827166 | X   | X   | X   | X   | X   | X   |
Class Acc
|    |      -1 |   0 |   1 |        2 |        3 | 4   | 5   | 6   | 7   | 8   | 9   |
|---:|--------:|----:|----:|---------:|---------:|:----|:----|:----|:----|:----|:----|
|  0 | 0.99693 |   0 |   0 | 0.867415 | 0.897315 | X   | X   | X   | X   | X   | X   |

####################################
Next Task
####################################


  0%|          | 0/100 [00:00<?, ?it/s]

*******
Task #2
*******
Classes to learn:
-1 4 5
*******


100%|██████████| 100/100 [01:55<00:00,  1.15s/it]



Overall stats
|    |   Overall Acc |   Mean Acc |   Mean IoU |
|---:|--------------:|-----------:|-----------:|
|  0 |      0.940266 |   0.398478 |   0.370336 |
Class IoU
|    |       -1 |   0 |   1 |   2 |   3 |        4 |        5 | 6   | 7   | 8   | 9   |
|---:|---------:|----:|----:|----:|----:|---------:|---------:|:----|:----|:----|:----|
|  0 | 0.939326 |   0 |   0 |   0 |   0 | 0.835262 | 0.817766 | X   | X   | X   | X   |
Class Acc
|    |       -1 |   0 |   1 |   2 |   3 |        4 |       5 | 6   | 7   | 8   | 9   |
|---:|---------:|----:|----:|----:|----:|---------:|--------:|:----|:----|:----|:----|
|  0 | 0.998161 |   0 |   0 |   0 |   0 | 0.894935 | 0.89625 | X   | X   | X   | X   |

####################################
Next Task
####################################


  0%|          | 0/100 [00:00<?, ?it/s]

*******
Task #3
*******
Classes to learn:
-1 6 7
*******


100%|██████████| 100/100 [02:02<00:00,  1.23s/it]



Overall stats
|    |   Overall Acc |   Mean Acc |   Mean IoU |
|---:|--------------:|-----------:|-----------:|
|  0 |      0.921012 |   0.314014 |   0.287986 |
Class IoU
|    |       -1 |   0 |   1 |   2 |   3 |   4 |   5 |       6 |        7 | 8   | 9   |
|---:|---------:|----:|----:|----:|----:|----:|----:|--------:|---------:|:----|:----|
|  0 | 0.920241 |   0 |   0 |   0 |   0 |   0 |   0 | 0.85159 | 0.820039 | X   | X   |
Class Acc
|    |       -1 |   0 |   1 |   2 |   3 |   4 |   5 |        6 |       7 | 8   | 9   |
|---:|---------:|----:|----:|----:|----:|----:|----:|---------:|--------:|:----|:----|
|  0 | 0.998729 |   0 |   0 |   0 |   0 |   0 |   0 | 0.914137 | 0.91326 | X   | X   |

####################################
Next Task
####################################


  0%|          | 0/100 [00:00<?, ?it/s]

*******
Task #4
*******
Classes to learn:
-1 8 9
*******


100%|██████████| 100/100 [01:59<00:00,  1.20s/it]


Overall stats
|    |   Overall Acc |   Mean Acc |   Mean IoU |
|---:|--------------:|-----------:|-----------:|
|  0 |      0.898288 |   0.258688 |   0.230222 |
Class IoU
|    |      -1 |   0 |   1 |   2 |   3 |   4 |   5 |   6 |   7 |        8 |        9 |
|---:|--------:|----:|----:|----:|----:|----:|----:|----:|----:|---------:|---------:|
|  0 | 0.89723 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 | 0.822529 | 0.812685 |
Class Acc
|    |       -1 |   0 |   1 |   2 |   3 |   4 |   5 |   6 |   7 |       8 |        9 |
|---:|---------:|----:|----:|----:|----:|----:|----:|----:|----:|--------:|---------:|
|  0 | 0.997878 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 | 0.94066 | 0.907034 |
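
The drop in Mean IoU follows directly from the per-class tables: the mean is taken over every class seen so far, and each forgotten class contributes an IoU of 0. The snippet below redoes that arithmetic with the Class IoU values reported above after the first and the last task; the helper name `mean_over_seen_classes` is only illustrative.

```python
def mean_over_seen_classes(values):
    """Average a per-class metric over all classes seen so far."""
    return sum(values) / len(values)


# Class IoU after task 0: background, 0, 1
iou_after_task_0 = [0.991974, 0.839183, 0.850556]

# Class IoU after task 4: background, eight forgotten digits (0 to 7), then 8 and 9
iou_after_task_4 = [0.89723] + [0.0] * 8 + [0.822529, 0.812685]

print(round(mean_over_seen_classes(iou_after_task_0), 4))
print(round(mean_over_seen_classes(iou_after_task_4), 4))
```

Both results agree with the reported Mean IoU (0.893905 and 0.230222) up to the rounding of the per-class values. The naïve baseline still segments the two newest digits and the background well, but the eight zeros from the earlier tasks pull the average down to roughly a quarter of its initial value.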





In [6]:
evaluater.create_animation('experiments/2-2/1000/72/100/fine-tuning/ft', sample_freq=4)
