
Distilling Knowledge from Refinement in Multiple Instance Detection Networks (Boosted-OICR)

By Luis Felipe Zeni and Claudio Jung.

Institute of Informatics, Federal University of Rio Grande do Sul, Brazil

This repository contains the PyTorch implementation of our paper Distilling Knowledge from Refinement in Multiple Instance Detection Networks, published at the Deep Vision 2020 CVPR workshop. (Go to the Contents section if you are interested in how to run the code.)

News:

21-sep-2020: I reverted the code to an older version. I did a considerable refactoring to release the code, and some of those changes slightly hurt the final mAP. As I am short on time, I decided to return to the older version (which is not as pretty as the refactored one, but ends up with a better mAP). I also added a reproducibility section to this document where I explain why the results are not the same after training with the same seed.

25-may-2020: We finally received the results from the VOC 2012 evaluation server, and we beat C-MIL in detection mAP :). To the best of my knowledge, this is the best WSOD result on VOC so far. http://host.robots.ox.ac.uk:8080/anonymous/E7JSMD.html


About this work:

In this work, we claim that carefully selecting the aggregation criteria can considerably improve the accuracy of the learned detector. We start by proposing an additional refinement step to an existing approach (OICR), which we call refinement knowledge distillation. Then, we present an adaptive supervision aggregation function that dynamically changes the aggregation criteria for selecting boxes that relate to one of the ground-truth classes, belong to the background, or are ignored entirely during the generation of each refinement module's supervision. We call these improvements "Boosted-OICR".
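
To make the aggregation idea concrete, here is a minimal sketch (my own illustration, not the paper's exact adaptive function) of OICR-style supervision generation: for each image-level class, the top-scoring proposal from the previous refinement branch seeds the labels; boxes with high overlap receive that class, boxes with low overlap become background, and the band in between is ignored. The thresholds fg_thresh and bg_thresh are illustrative placeholders.

    # Illustrative sketch of OICR-style supervision generation (not the
    # authors' code). Proposals are labeled as a ground-truth class,
    # background, or ignored, based on IoU with the top-scoring seed box.
    import torch
    from torchvision.ops import box_iou

    def generate_supervision(boxes, prev_scores, image_classes,
                             fg_thresh=0.5, bg_thresh=0.1):
        # boxes: (N, 4) proposals; prev_scores: (N, C) scores from the
        # previous refinement branch; image_classes: image-level labels.
        labels = torch.zeros(boxes.size(0), dtype=torch.long)  # 0 = background
        weights = torch.ones(boxes.size(0))                    # 0 = ignored
        for c in image_classes:
            seed = prev_scores[:, c].argmax()                  # seed proposal
            ious = box_iou(boxes, boxes[seed].unsqueeze(0)).squeeze(1)
            labels[ious >= fg_thresh] = c + 1                  # assign class c
            band = (ious >= bg_thresh) & (ious < fg_thresh)
            weights[band] = 0.0                                # ignore this band
        return labels, weights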

We made improvements to OICR's architecture:

Boosted-OICR architecture

We also improved "le trick":

le trick

Some cool detection result visualizations:

Some visualization results

License

Our code is under the MIT License (refer to the LICENSE file for details).

Citing Boosted-OICR

If you find our paper or our implementation useful in your research, please consider citing:

@inproceedings{zeni2020distilling,
  title={Distilling Knowledge From Refinement in Multiple Instance Detection Networks},
  author={Felipe Zeni, Luis and Jung, Claudio R},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  pages={768--769},
  year={2020}
}

Contents:

  1. Requirements
  2. Installation
  3. Setting up for training and testing
  4. Downloading a model trained on PASCAL VOC 2007
  5. Usage
  6. About the training reproducibility

Requirements

  • Linux OS (I have not tested it on other OSes.)

    • octave
  • python3 packages and versions used (listed using pip freeze; see the install snippet at the end of this section):

    • certifi==2020.6.20
    • cycler==0.10.0
    • Cython==0.29.21
    • kiwisolver==1.2.0
    • matplotlib==3.3.2
    • numpy==1.19.2
    • opencv-python==4.2.0.34
    • Pillow==7.2.0
    • protobuf==3.13.0
    • pycocotools==2.0.2
    • pyparsing==2.4.7
    • python-dateutil==2.8.1
    • PyYAML==5.3.1
    • six==1.15.0
    • tensorboardX==2.1
    • torch==1.2.0+cu92
    • torchvision==0.4.0+cu92
    • tqdm==4.49.0
  • An Nvidia GPU with CUDA support

    • We used CUDA 10.0 and cuDNN 7.0
    • We used an Nvidia Titan Xp with 12 GB of memory, but training should be fine on a GPU with at least 8 GB.
    • NOTICE: different versions of PyTorch have different memory usage.
  • Docker

    • If you are not using Docker to run your experiments, we highly recommend that you start. The 'docker' folder contains the Dockerfile to build a container that runs our code ;).
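
For convenience, the pinned versions above can be installed in one shot. This is a sketch, assuming you copy the package list into a requirements.txt file (the find-links page hosts the +cu92 torch and torchvision wheels):

    # Sketch: install the pinned packages; the find-links page provides
    # the +cu92 builds of torch/torchvision.
    pip3 install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html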

Installation

  1. Clone this repository

    git clone https://github.com/luiszeni/Boosted-OICR && cd Boosted-OICR
  2. [Optional] Build the Docker image and start a container. You need nvidia-docker installed on your host machine

    2.1. Enter the docker folder inside the repo

    cd docker

    2.2. Build the docker image

    docker build . -t boicr

    2.3. Return to the root of the repo ($BOOSTED_OICR_ROOT)

    cd ..

    2.4. Create a container from the image. I prefer to mount the code as an external volume from a folder on the host machine, which makes it easier to edit the code with a GUI text editor or IDE. This command will drop you into the container's shell.

    docker run --gpus all -v  $(pwd):/root/Boosted-OICR --shm-size 12G -ti \
    --name boicr boicr

    2.5. If you exit the container at any point, you can enter it again using this command.

    docker start -ai boicr 

    Observation: I will not cover how to display windows using X11 forwarding from the container to the host X server. You will need this if you want to use the visualization scripts. There are plenty of tutorials on the internet teaching X11 forwarding in Docker; a minimal example is sketched below.
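
    A minimal, untested sketch (my addition; the xhost call, the extra flags, and the container name boicr-x11 are illustrative):

    # Allow local containers to reach the host X server (this has security
    # implications; see a full tutorial), then pass the display through.
    xhost +local:docker
    docker run --gpus all -v $(pwd):/root/Boosted-OICR --shm-size 12G -ti \
      -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix \
      --name boicr-x11 boicr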

Setting up for training and testing

  1. Create a "data" folder in $BOOSTED_OICR_ROOT and enter it

    mkdir data
    cd data
  2. Download the training, validation, test data, and VOCdevkit

    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar

    Optional mirror links to VOC (from darknet), normally faster to download:

    wget https://pjreddie.com/media/files/VOCtrainval_06-Nov-2007.tar
    wget https://pjreddie.com/media/files/VOCtest_06-Nov-2007.tar
  3. Extract all of these tars into one directory named VOCdevkit

    tar xvf VOCtrainval_06-Nov-2007.tar
    tar xvf VOCtest_06-Nov-2007.tar
  4. Download the VOCdevkit evaluation code adapted to octave

    wget http://inf.ufrgs.br/~lfazeni/CVPR_deepvision2020/VOCeval_octave.tar
  5. Extract VOCeval_octave

    tar xvf VOCeval_octave.tar
  6. Download pascal annotations in the COCO format

    wget http://inf.ufrgs.br/~lfazeni/CVPR_deepvision2020/coco_annotations_VOC.tar
  7. Extract the annotations

    tar xvf coco_annotations_VOC.tar
  8. The data folder should now have this basic structure

    $VOC2007/                           
    $VOC2007/annotations
    $VOC2007/JPEGImages
    $VOC2007/VOCdevkit        
    # ... and several other directories ...
  9. [Optional] download and extract PASCAL VOC 2012.

    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
    tar xvf VOCtrainval_11-May-2012.tar

    or

    wget https://pjreddie.com/media/files/VOCtrainval_11-May-2012.tar
    tar xvf VOCtrainval_11-May-2012.tar

    Observation: the 2012 test set is only available for download from the PASCAL VOC evaluation server. You must create an account and download it yourself. After downloading, you can extract it into the data folder.

  10. Download the proposals data generated by selective search

    wget http://inf.ufrgs.br/~lfazeni/CVPR_deepvision2020/selective_search_data.tar
  11. Extract the proposals

    tar xvf selective_search_data.tar
  12. Download the pre-trained VGG16 model

    wget http://inf.ufrgs.br/~lfazeni/CVPR_deepvision2020/pretrained_model.tar
  13. Extract the pre-trained VGG16 model

    tar xvf pretrained_model.tar
  14. [Optional] Delete the downloaded files to free space

    rm *.tar
  15. Return to the root folder $BOOSTED_OICR_ROOT

    cd ..

Downloading a model trained on PASCAL VOC 2007

  1. Download the pretrained files in the root folder ($BOOSTED_OICR_ROOT)
    wget http://inf.ufrgs.br/~lfazeni/CVPR_deepvision2020/trained_models.tar
  2. Extract it
    tar xvf trained_models.tar
  3. Delete the tar file to free space
    rm trained_models.tar
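
Optionally, a quick sanity check (my own sketch, assuming the tar extracts into the snapshots/ path used by the test commands below) that the checkpoint file loads:

    # Optional sanity check (illustrative, not part of the original steps):
    # load the checkpoint on CPU and peek at its top-level structure.
    import torch

    ckpt = torch.load(
        'snapshots/deepvision2020/oicr_lambda_log_distillation/final.pth',
        map_location='cpu')
    print(type(ckpt))
    if isinstance(ckpt, dict):
        print(list(ckpt)[:5])  # first few top-level keys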

Usage

Running detection and localization with the weights used in the paper

Calculating the detection mAP on Pascal VOC 2007 (test set):

    python3 code/tasks/test.py --cfg configs/baselines/vgg16_voc2007.yaml \
      --dataset voc2007test \
      --model oicr_lambda_log_distillation \
      --load_ckpt snapshots/deepvision2020/oicr_lambda_log_distillation/final.pth \
      --use_matlab

Calculating the CorLoc on Pascal VOC 2007 (trainval set):

    python3 code/tasks/test.py --cfg configs/baselines/vgg16_voc2007.yaml \
      --dataset voc2007trainval \
      --model oicr_lambda_log_distillation \
      --load_ckpt snapshots/deepvision2020/oicr_lambda_log_distillation/final.pth

Training your own model ;DD

To train the Boosted-OICR network on the VOC 2007 trainval set:

    python3 code/tasks/train.py --dataset voc2007 \
      --cfg configs/baselines/vgg16_voc2007.yaml \
      --bs 1 --nw 4 --iter_size 4 --model oicr_lambda_log_distillation

To evaluate the Boosted-OICR network on VOC 2007:

On test (detection mAP):

    python3 code/tasks/test.py --cfg configs/baselines/vgg16_voc2007.yaml \
      --dataset voc2007test \
      --model oicr_lambda_log_distillation \
      --load_ckpt snapshots/oicr_lambda_log_distillation/<some-running-date-time>/ckpt/model_step24999.pth \
      --use_matlab

On trainval (CorLoc):

    python3 code/tasks/test.py --cfg configs/baselines/vgg16_voc2007.yaml \
      --dataset voc2007trainval \
      --model oicr_lambda_log_distillation \
      --load_ckpt snapshots/oicr_lambda_log_distillation/<some-running-date-time>/ckpt/model_step24999.pth
Visualize the nice detections

You can run the visualization script to show the results in an OpenCV window:

    python3 code/tasks/visualize.py --cfg configs/baselines/vgg16_voc2007.yaml \
      --dataset voc2007test \
      --detections snapshots/deepvision2020/test/final/detections.pkl

...or you can save the visualizations as images. First, create a folder to save the outputs:

    mkdir img_out

and pass it with the --output_dir argument:

    python3 code/tasks/visualize.py --cfg configs/baselines/vgg16_voc2007.yaml \
      --dataset voc2007test \
      --detections snapshots/deepvision2020/test/final/detections.pkl \
      --output_dir img_out
Training a Fast R-CNN using the trained model

We used the code available here

About the training reproducibility:

If you use the model weights available for download, you will reproduce the same mAP and CorLoc described in the paper on the Pascal VOC 2007 dataset. However, if you retrain the model, the final mAP and CorLoc can differ from the values reported in the paper.

I tried my best to make results from retraining with the same seed as similar as possible. Even with a fixed seed, however, the results differ a little between training runs. I am not sure where this non-determinism comes from; my best guess is that it comes from the RoIPooling implementation in torchvision.

I retrained the model with the same seed five times, and the final mAP oscillates between 49.0 and 49.9.

It is also important to be aware that completely reproducible results are not guaranteed across PyTorch versions (https://pytorch.org/docs/stable/notes/randomness.html), so make sure to use the same versions that we list here.
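
For reference, here is a minimal sketch (my addition, not code from this repository) of the usual PyTorch 1.x seeding and cuDNN settings; note that even with all of these, some CUDA ops can remain non-deterministic:

    # Sketch of standard PyTorch 1.x determinism settings (not repo code).
    # Even with these, some CUDA kernels may stay non-deterministic.
    import random

    import numpy as np
    import torch

    def set_seed(seed=3):  # the seed value here is arbitrary/illustrative
        random.seed(seed)                            # Python RNG
        np.random.seed(seed)                         # NumPy RNG
        torch.manual_seed(seed)                      # CPU (and current GPU) RNG
        torch.cuda.manual_seed_all(seed)             # all GPU RNGs
        torch.backends.cudnn.deterministic = True    # pick deterministic kernels
        torch.backends.cudnn.benchmark = False       # disable the autotuner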

Special Thanks:

We would like to thank Peng Tang and his colleagues for making the PCL and OICR code publicly available.
