# CIFAR10 DER - Showcase

CIFAR10 is a dataset of 60,000 32x32 color images in 10 classes, with 6,000 images per class. There are 50,000 training images and 10,000 test images. In this benchmark we use CIFAR10 in the Class-Incremental Learning (CIL) and Task-Incremental Learning (TIL) settings.
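The two settings differ only in what the model knows at test time: in CIL all 10 classes compete, while in TIL the task identity is given, so only the classes of that task compete. The snippet below is a minimal sketch of that difference, not the code used by `main`. It assumes 5 tasks of 2 consecutive classes each, and `predict` is a hypothetical helper introduced for illustration.

```python
import numpy as np

def predict(logits, task_id=None, classes_per_task=2):
    """Return the predicted class for each row of `logits`.

    task_id=None -> CIL: argmax over all classes.
    task_id=k    -> TIL: argmax restricted to the classes of task k
                    (assumed to be the consecutive labels k*2 and k*2+1).
    """
    logits = np.asarray(logits, dtype=float)
    if task_id is None:
        return logits.argmax(axis=1)
    start = task_id * classes_per_task
    stop = start + classes_per_task
    # Mask every logit outside the task's classes with -inf
    masked = np.full_like(logits, -np.inf)
    masked[:, start:stop] = logits[:, start:stop]
    return masked.argmax(axis=1)

# One sample whose true class is 2 (task 1), but class 9 has the largest logit
logits = np.array([[0.1, 0.0, 1.5, 0.3, 0.2, 0.0, 0.1, 0.4, 0.2, 2.0]])
cil_pred = predict(logits)             # all 10 classes compete
til_pred = predict(logits, task_id=1)  # only classes 2 and 3 compete
```

With the same logits, the CIL prediction is wrong and the TIL prediction is right, which is why TIL accuracies are generally higher than CIL ones.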

In [1]:
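import os
import sys
import warnings

# Reload edited modules automatically (IPython magics)
%load_ext autoreload
%autoreload 2

warnings.filterwarnings('ignore')
current_dir = %pwd

# Add the parent directory to the import path so that `main` can be found
parent_dir = os.path.abspath(os.path.join(current_dir, '../'))
sys.path.append(parent_dir)

import main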

## Overview

In the original paper, the authors evaluate the proposed algorithms by training a ResNet-18 on CIFAR10 in the CIL and TIL settings, with 5 epochs. To keep the computation time low, we only show the results for one epoch (one epoch over 9,000 images takes about 3 minutes on an M2 Pro chip). The results are in line with those reported in the paper, with a small decrease due to the lower number of epochs.

As before, the hyperparameters are fixed to the values suggested by the original paper (obtained from their validation).

## DER

This benchmark is harder than MNIST, so we expect slightly lower performance.

In [2]:
main.run_experiment(
    DATASET='SequentialCIFAR10',
    lr=0.03,
    alpha=0.3,
)


Experience (0) - Training Samples: 9000
Experience (1) - Training Samples: 9000
Experience (2) - Training Samples: 9000
Experience (3) - Training Samples: 9000
Experience (4) - Training Samples: 9000
Epoch 1/1 - Loss: 1.166968584060669
 ===  Accuracies - CIL === 

[[84.7   0.    0.    0.    0.  ]
 [80.65 70.1   0.    0.    0.  ]
 [78.85 45.65 74.8   0.    0.  ]
 [80.2  23.   32.9  84.4   0.  ]
 [77.35 43.   17.85 22.6  78.1 ]]

 ===  Accuracies - TIL === 

[[84.7  52.45 52.1  52.4  46.2 ]
 [80.65 71.8  52.2  50.4  48.25]
 [78.85 65.55 74.85 48.45 46.45]
 [80.2  69.3  75.35 85.25 48.  ]
 [77.35 66.75 73.95 85.1  78.2 ]]

=== Task-IL (TIL) vs Class-IL (CIL) Metrics ===

Accuracy - Last Model (CIL): 	 78.10
Accuracy - Last Model (TIL): 	 78.20

Accuracy - Average (CIL): 	 59.61
Accuracy - Average (TIL): 	 76.52

Accuracy - Full Stream (CIL): 	 47.78
Accuracy - Full Stream (TIL): 	 76.27

Forgetting (CIL): 	 38.30
Forgetting (TIL): 	 3.36

Backward Transfer (CIL): 	 -38.30
Backward Transfe

<src.metric.Metric at 0x10c001610>
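The summary metrics can be recomputed from the accuracy matrix. The sketch below uses the CIL matrix printed above. The formulas are inferred from the printed numbers, not taken from the `src.metric.Metric` source, and `summarize` is a hypothetical helper. It assumes that row `i` holds the accuracies of the model after experience `i` and column `j` refers to the test set of experience `j`.

```python
import numpy as np

# CIL accuracy matrix from the DER run:
# row i = model after experience i, column j = test set of experience j
acc = np.array([
    [84.70,  0.00,  0.00,  0.00,  0.00],
    [80.65, 70.10,  0.00,  0.00,  0.00],
    [78.85, 45.65, 74.80,  0.00,  0.00],
    [80.20, 23.00, 32.90, 84.40,  0.00],
    [77.35, 43.00, 17.85, 22.60, 78.10],
])

def summarize(acc):
    """Recompute the summary metrics (formulas inferred from the output)."""
    n = acc.shape[0]
    diag = np.diag(acc)   # accuracy on each experience right after learning it
    final = acc[-1]       # accuracy of the last model on every experience
    rows, cols = np.tril_indices(n)
    # Mean drop between "just learned" and "end of stream", over old experiences
    forgetting = float(np.mean(diag[:-1] - final[:-1]))
    return {
        "last_model": float(acc[-1, -1]),           # last model, last experience
        "average": float(acc[rows, cols].mean()),   # mean over seen experiences
        "full_stream": float(final.mean()),         # last model, all experiences
        "forgetting": forgetting,
        "backward_transfer": -forgetting,
    }

metrics = summarize(acc)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the sketch gives 78.10, 59.61, 47.78, 38.30 and -38.30, matching the CIL values in the output above.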

## DER++

Here too, the performance is comparable to the results reported in the paper.
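For context, in the DER paper `alpha` weights a term that keeps the current outputs on replayed buffer samples close to the logits stored with them, and DER++ adds a `beta`-weighted cross-entropy on the buffer labels. The NumPy sketch below illustrates this objective under some assumptions: it is not the implementation in `main`, `der_loss` and `cross_entropy` are hypothetical helpers, the logit-matching term is a mean squared error, and a single buffer batch is reused for both replay terms.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy for integer class labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(labels)), labels].mean())

def der_loss(logits, labels, buf_out, buf_logits, buf_labels=None,
             alpha=0.3, beta=0.0):
    """DER objective (beta=0) or DER++ objective (beta>0), as a sketch.

    logits, labels : model outputs and targets for the current batch
    buf_out        : current model outputs on the replayed buffer samples
    buf_logits     : logits stored in the buffer when the samples were seen
    buf_labels     : buffer labels, only needed when beta > 0
    """
    loss = cross_entropy(logits, labels)
    # Logit matching on replayed samples (MSE is an assumption of this sketch)
    loss += alpha * float(np.mean((buf_out - buf_logits) ** 2))
    if beta > 0:
        # DER++ only: supervised term on the replayed samples
        loss += beta * cross_entropy(buf_out, buf_labels)
    return loss

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 10))
labels = np.array([0, 1, 0, 1])
buf_out = rng.normal(size=(4, 10))
buf_logits = rng.normal(size=(4, 10))
buf_labels = np.array([2, 3, 2, 3])

der = der_loss(logits, labels, buf_out, buf_logits, alpha=0.3)
der_pp = der_loss(logits, labels, buf_out, buf_logits, buf_labels,
                  alpha=0.2, beta=0.5)
```

The `alpha` and `beta` values in the usage lines mirror the ones passed to `main.run_experiment` in this notebook; the random arrays are placeholders for real model outputs.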

In [3]:
main.run_experiment(
    DATASET='SequentialCIFAR10',
    lr=0.03,
    alpha=0.2,
    beta=0.5,
)

Experience (0) - Training Samples: 9000
Experience (1) - Training Samples: 9000
Experience (2) - Training Samples: 9000
Experience (3) - Training Samples: 9000
Experience (4) - Training Samples: 9000
Epoch 1/1 - Loss: 1.5061662197113037
 ===  Accuracies - CIL === 

[[87.95  0.    0.    0.    0.  ]
 [85.8  72.05  0.    0.    0.  ]
 [84.75 55.95 68.4   0.    0.  ]
 [82.65 56.4  28.05 68.55  0.  ]
 [80.05 61.75 51.7  46.9  78.35]]

 ===  Accuracies - TIL === 

[[87.95 45.7  47.9  55.35 47.2 ]
 [85.8  73.35 50.7  48.05 59.8 ]
 [84.75 71.85 77.35 47.15 50.65]
 [82.65 70.6  71.8  78.7  57.4 ]
 [80.05 71.6  77.7  86.   83.4 ]]

=== Task-IL (TIL) vs Class-IL (CIL) Metrics ===

Accuracy - Last Model (CIL): 	 78.35
Accuracy - Last Model (TIL): 	 83.40

Accuracy - Average (CIL): 	 67.29
Accuracy - Average (TIL): 	 78.90

Accuracy - Full Stream (CIL): 	 63.75
Accuracy - Full Stream (TIL): 	 79.75

Forgetting (CIL): 	 14.14
Forgetting (TIL): 	 0.50

Backward Transfer (CIL): 	 -14.14
Backward Transf

<src.metric.Metric at 0x11fa3c350>