Augmented Box Replay: Overcoming Foreground Shift for Incremental Object Detection (ICCV 2023 paper)
In incremental learning, replaying stored samples from previous tasks together with current task samples is one of the most efficient approaches to address catastrophic forgetting. However, unlike incremental classification, image replay has not been successfully applied to incremental object detection (IOD). In this paper, we identify the overlooked problem of foreground shift as the main reason for this. Foreground shift only occurs when replaying images of previous tasks and refers to the fact that their background might contain foreground objects of the current task. To overcome this problem, a novel and efficient Augmented Box Replay (ABR) method is developed that only stores and replays foreground objects and thereby circumvents the foreground shift problem. In addition, we propose an innovative Attentive RoI Distillation loss that uses spatial attention from region-of-interest (RoI) features to constrain the current model to focus on the most important information from the old model. ABR significantly reduces forgetting of previous classes while maintaining high plasticity on current classes. Moreover, it considerably reduces the storage requirements compared to standard image replay. Comprehensive experiments on the Pascal-VOC and COCO datasets support the state-of-the-art performance of our model.
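To make the box-replay idea concrete, here is a minimal NumPy sketch: only the pixels inside a ground-truth box are stored, and at replay time the stored crop is pasted into a current-task image, so no old background (which could contain unlabeled current-task objects) is ever replayed. This is an illustration only: the actual ABR fuses boxes into current images with augmentations, and the names `crop_foreground` / `replay_box` are our simplification, not the repository's API.

```python
import numpy as np

def crop_foreground(image, box):
    """Store only the foreground object: crop the region inside `box`.
    box = (x1, y1, x2, y2) in pixel coordinates."""
    x1, y1, x2, y2 = box
    return image[y1:y2, x1:x2].copy()

def replay_box(current_image, crop, top_left):
    """Paste a stored foreground crop into a current-task image.
    Because only the box is stored, the surrounding background comes from
    the current task, so replay cannot reintroduce an old background that
    contains unlabeled current-task foreground objects (foreground shift),
    and storage is limited to the box itself."""
    y, x = top_left
    h, w = crop.shape[:2]
    out = current_image.copy()
    out[y:y + h, x:x + w] = crop
    new_box = (x, y, x + w, y + h)  # label for the pasted object
    return out, new_box
```

The new box coordinates become the ground-truth annotation for the pasted old-class object when training on the current task.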
Figure: Illustration of our proposed framework, which highlights the key novelties of Augmented Box Replay (ABR) and Attentive RoI Distillation.
Please follow the installation instructions provided by Detectron, found in install.md.
You can find the Pascal-VOC dataset already in Detectron.
We provide scripts to run the experiments in the paper (JT, FT, ABR and ablations).
You can find three scripts in the scripts/ folder: run_JT.sh, run_MI.sh, and run_SI.sh. These can be used to run, respectively: joint training, the multi-incremental-step detection settings (10-5, 10-2, 15-1, 10-1, 5-5), and the single-incremental-step detection settings (19-1, 15-5, 10-10, 5-15).
Without specifying any option, the defaults will load the Finetune method using Faster R-CNN. You can play with the following parameters to obtain all the results in the paper:
- --feat with options [no, std, ard]: no means no feature distillation, std is the feature distillation employed in Faster-ILOD, and ard is the attentive RoI distillation (as in ABR) (default: no).
- --inc enables the incremental setting (default: not used).
- --dist_type with options [l2, id, none]: l2 is the distillation used in ILOD, id is the Inclusive Distillation Loss used in our method ABR, and none disables it (default: l2).
- --alpha_inclusive_distillation is a float indicating the weight of the inclusive distillation loss (default: 1.0). In ABR we vary it in the range [0.1, 0.2, 0.5, 1].
- --beta_attentive_roi_distillation is a float indicating the weight of the attentive RoI distillation loss.
- --gamma is a hyperparameter that controls the strength of the regularization of the overall ARD loss. We used 1.0 for our ABR.
- --memory_buffer is an int representing the box rehearsal memory size.
- --memory_type with options [mean, random, herding]: mean is the replay strategy in our ABR, random chooses the boxes randomly, and herding selects boxes with the herding strategy from iCaRL (default: None).
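The mean option above can be read as a prototype-style selection: for each class, keep the boxes whose features lie closest to the class-mean feature, while random and herding swap out this criterion. The sketch below is our illustrative reading under that assumption, not the repository's implementation; `select_boxes_mean` and all names are ours.

```python
import numpy as np

def select_boxes_mean(features, labels, k):
    """For each class, keep the k boxes whose feature vectors are closest
    to the class-mean feature (a prototype-style selection criterion).

    features: (N, D) array of per-box features
    labels:   (N,) array of class ids
    Returns a dict {class_id: list of selected box indices}."""
    selected = {}
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]          # boxes of this class
        mean = features[idx].mean(axis=0)       # class prototype
        dists = np.linalg.norm(features[idx] - mean, axis=1)
        order = np.argsort(dists)[:k]           # k closest to the mean
        selected[int(c)] = idx[order].tolist()
    return selected
```

A fixed total budget (--memory_buffer) would then cap the sum of per-class selections.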
For the Pascal-VOC settings, the first task contains 5, 10, 15, or 19 categories, so only four different first-task models need to be trained. First, choose the corresponding task setting (e.g., task=10-10) in the scripts/run_firststep.sh file according to your needs. To avoid repeated computation for the same first task, we place the box rehearsal in the output root of the first task. Then run:
bash scripts/run_firststep.sh

Next, modify scripts/run_SI.sh or scripts/run_MI.sh.
For example, if you want to train ABR under the single-incremental-step setting task=15-5, then please keep the following lines uncommented:
task=15-5
name=ABR_LR001_BS4_ALPHA05_BETA1_GAMMA1_INIT
exp -t ${task} -n ${name} -s ${step} --feat ard -gamma 1.0 --uce --dist_type id -alpha 0.5 -beta 1.0 -mb 2000 -mt mean -cvd 0,1

and then run:

bash scripts/run_SI.sh

Otherwise, if you want to train the finetune method, then please keep the following lines uncommented:
task=15-5
name=Finetune
exp -t ${task} -n ${name} -s ${step} -cvd 0,1

and then run:

bash scripts/run_SI.sh

If you use this work, please cite:

@InProceedings{yuyang2023augmented,
title={Augmented Box Replay: Overcoming Foreground Shift for Incremental Object Detection},
author={Liu, Yuyang and Cong, Yang and Goswami, Dipam and Liu, Xialei and van de Weijer, Joost},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month={October},
year={2023}
}
Our repository is based on the amazing work of @fcdl94 (MMA) and @CanPeng123 (FasterILOD) and on the maskrcnn-benchmark library. We thank the authors and the contributors of these projects for releasing their code.
