Learning to Segment the Tail
In this repository, we release code for Learning to Segment The Tail (LST). The code is directly modified from the project maskrcnn_benchmark, which is an excellent codebase! If you run into a problem that prevents you from running the project, check the issues under maskrcnn_benchmark first.
After downloading the LVIS_v0.5 dataset (the images are the same as in COCO 2017), we recommend symlinking the LVIS dataset path into datasets/ as follows:
# symlink the lvis dataset
cd ~/github/LST_LVIS
mkdir -p datasets/lvis
ln -s /path_to_lvis_dataset/annotations datasets/lvis/annotations
ln -s /path_to_coco_dataset/images datasets/lvis/images
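Before training, it can help to confirm that both links resolve. The following is a minimal sketch and not part of the repository: the function name `check_lvis_layout` is hypothetical, and it only assumes the `datasets/lvis/annotations` and `datasets/lvis/images` paths created by the commands above.

```python
from pathlib import Path


def check_lvis_layout(root="datasets/lvis"):
    """Return the expected sub-directories that are missing or dangling links."""
    root = Path(root)
    missing = []
    for name in ("annotations", "images"):
        path = root / name
        # Path.is_dir() follows symlinks, so a dangling link counts as missing.
        if not path.is_dir():
            missing.append(str(path))
    return missing


if __name__ == "__main__":
    problems = check_lvis_layout()
    if problems:
        print("Missing or broken:", ", ".join(problems))
    else:
        print("LVIS dataset layout looks fine.")
```

Run it from the repository root (the directory that contains `datasets/`).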
A detailed visualization demo for LVIS is provided in LVIS_visualization. You may find it the most useful thing in this repo :P
Dataset Pre-processing and Indices Generation
dataset_preprocess.ipynb: The LVIS dataset is split into a base set and the sets for the incremental phases.
balanced_replay.ipynb: We generate indices to load the LVIS dataset offline using the balanced replay scheme discussed in our paper.
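As a rough illustration of what such a split could look like, the sketch below sorts categories by instance count and cuts the sorted list into a base set followed by fixed-size phases. The notebook remains the reference: the function name, the `base_size` and `phase_size` parameters, the sort-by-frequency criterion, and the toy counts are all assumptions made for this example.

```python
def split_into_phases(instance_counts, base_size, phase_size):
    """Split category ids into a base set and incremental phases.

    instance_counts: dict mapping category id -> number of annotated instances.
    Returns (base_ids, phases), where phases is a list of category-id lists.
    """
    # Most frequent categories first; ties are broken by category id.
    sorted_ids = sorted(instance_counts, key=lambda c: (-instance_counts[c], c))
    base_ids = sorted_ids[:base_size]
    tail = sorted_ids[base_size:]
    phases = [tail[i:i + phase_size] for i in range(0, len(tail), phase_size)]
    return base_ids, phases


# Toy example with made-up counts (not LVIS statistics).
counts = {1: 500, 2: 40, 3: 900, 4: 7, 5: 120, 6: 3, 7: 15}
base, phases = split_into_phases(counts, base_size=3, phase_size=2)
print(base)    # [3, 1, 5]
print(phases)  # [[2, 7], [4, 6]]
```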
The base training is the same as conventional training. For example, to train a model with 8 GPUs you can run:
python -m torch.distributed.launch --nproc_per_node=8 /path_to_maskrcnn_benchmark/tools/train_net.py --use-tensorboard --config-file "/path/to/config/train_file.yaml" MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN 1000
The details of MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN are discussed in maskrcnn-benchmark.
Edit this line to initialize the dataloader with the corresponding sorted category ids.
Training in each incremental phase uses our data balanced replay. It needs to be initialized properly here, by providing the corresponding external img-id/cls-id pairs for data loading.
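To make the idea of img-id/cls-id pairs concrete, here is a minimal sketch of one plausible way to build them: cap the number of images sampled per class so that frequent and rare classes contribute comparable amounts of data. This is an assumption about the general shape of the scheme, not the exact procedure from the paper or from balanced_replay.ipynb. The function name, the `per_class` cap, and the `seed` parameter are hypothetical; the `image_id` and `category_id` fields follow the COCO-style annotation format that LVIS uses.

```python
import random
from collections import defaultdict


def balanced_replay_pairs(annotations, class_ids, per_class, seed=0):
    """Sample at most `per_class` (image_id, category_id) pairs for each class.

    annotations: iterable of dicts with "image_id" and "category_id" keys.
    class_ids: the classes to include (for example, old classes plus new ones).
    """
    rng = random.Random(seed)
    wanted = set(class_ids)
    images_by_class = defaultdict(set)
    for ann in annotations:
        if ann["category_id"] in wanted:
            images_by_class[ann["category_id"]].add(ann["image_id"])

    pairs = []
    for cls_id in sorted(wanted):
        image_ids = sorted(images_by_class[cls_id])
        # Frequent classes are subsampled; rare classes keep all their images.
        if len(image_ids) > per_class:
            image_ids = rng.sample(image_ids, per_class)
        pairs.extend((img_id, cls_id) for img_id in image_ids)
    return pairs


# Toy annotations: class 1 is frequent, class 2 is rare.
anns = [{"image_id": i, "category_id": 1} for i in range(100)]
anns += [{"image_id": i, "category_id": 2} for i in range(5)]
pairs = balanced_replay_pairs(anns, class_ids=[1, 2], per_class=10)
print(len(pairs))  # 15
```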
We use ground truth bounding boxes to get prediction logits from the model trained in the previous step. Change this to choose which classes are distilled.
Here is an example for running:
python ./tools/train_net.py --use-tensorboard --config-file "/path/to/config/get_distillation_file.yaml" MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN 1000
The output distillation logits are saved in json format.
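The exact layout of that JSON file is defined by the repository code and is not described here. Purely as an illustration of the two steps (keep the logits of the chosen classes, then serialize them), the sketch below uses a hypothetical layout: a `class_ids` list plus a `logits` mapping from a box id to the selected values. The function names are made up, and it assumes class ids index directly into the logit vector.

```python
import json
import os
import tempfile


def select_logits(logits, class_ids):
    """Keep only the logits of the classes chosen for distillation."""
    return [logits[c] for c in class_ids]


def save_distillation_logits(records, class_ids, path):
    """Write the selected logits of every ground-truth box to a JSON file.

    records: iterable of (box_id, logits), with logits a list of floats.
    The file layout used here is hypothetical.
    """
    payload = {
        "class_ids": list(class_ids),
        "logits": {
            str(box_id): select_logits(logits, class_ids)
            for box_id, logits in records
        },
    }
    with open(path, "w") as f:
        json.dump(payload, f)


def load_distillation_logits(path):
    """Read the file written by save_distillation_logits."""
    with open(path) as f:
        return json.load(f)


# Round-trip a toy example through a temporary file.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "distillation_logits.json")
    save_distillation_logits([(101, [2.0, -1.0, 0.5])], class_ids=[0, 2], path=path)
    loaded = load_distillation_logits(path)

print(loaded["logits"]["101"])  # [2.0, 0.5]
```

JSON object keys are always strings, which is why the box id is converted with `str` before writing.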
The evaluation for LVIS differs slightly from COCO because LVIS is not exhaustively annotated, as discussed in detail in Gupta et al.'s work.
We also report the AP for each phase and each class, which allows a finer-grained analysis.
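As a sketch of how a per-phase number can be derived from per-class results, the snippet below averages the per-class AP over the classes of each phase. Averaging over classes is an assumption about the aggregation, and the function name and input layout are hypothetical; the evaluation code in the repository is the reference.

```python
def mean_ap_per_phase(per_class_ap, phases):
    """Average per-class AP within each phase.

    per_class_ap: dict mapping category id -> AP; classes without a valid
    AP may simply be absent.
    phases: list of category-id lists, one list per phase.
    Returns one value per phase (None if no class in it was evaluated).
    """
    results = []
    for class_ids in phases:
        values = [per_class_ap[c] for c in class_ids if c in per_class_ap]
        results.append(sum(values) / len(values) if values else None)
    return results


# Toy numbers, not real results.
ap = {1: 0.5, 2: 0.25, 3: 0.75}
print(mean_ap_per_phase(ap, [[1, 2], [3], [4]]))  # [0.375, 0.75, None]
```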
You can run:
export NGPUS=8
python -m torch.distributed.launch --nproc_per_node=$NGPUS /path_to_maskrcnn_benchmark/tools/test_net.py --config-file "/path/to/config/train_file.yaml"
We also provide periodic testing to track the results more closely, as discussed in this issue.
Thanks to all the previous work and to the authors for sharing their code. Sorry for my ugly code; I appreciate your advice.