
CPL

This repository is the official PyTorch implementation of the ICCV 2023 paper Confidence-aware Pseudo-label Learning for Weakly Supervised Visual Grounding.

Please leave a STAR ⭐ if you like this project!

Contents

  1. Usage
  2. Results
  3. Contacts
  4. Acknowledgments

Usage

Dependencies

Data Preparation

1. You can download the images from their original sources and place them in the ./data/image_data folder:

Finally, the ./data/ and ./image_data/ folders will have the following structure:

|-- data
|   |-- flickr
|   |-- gref
|   |-- gref_umd
|   |-- referit
|   |-- unc
|   |-- unc+
|-- image_data
|   |-- Flickr30k
|   |   |-- flickr30k-images
|   |-- other
|   |   |-- images
|   |-- referit
|       |-- images
  • ./data/: Taking the Flickr30K dataset as an example, ./data/flickr/ should contain the dataset's train/validation/test annotation files and our generated pseudo-samples for this dataset. You can download these files from [data] and put them in the corresponding folders.
  • ./image_data/Flickr30k/flickr30k-images/: Image data for the Flickr30K dataset; please download from this link (fill out the form to download the images).
  • ./image_data/other/images/: Image data for RefCOCO/RefCOCO+/RefCOCOg.
  • ./image_data/referit/images/: Image data for ReferItGame.
  2. The generated pseudo region-query pairs can be downloaded from data, or you can generate pseudo samples yourself by following the instructions.

Note that to train the model with pseudo samples for a given dataset, you should put the uncompressed pseudo-sample files under the corresponding folder ./data/xxx/. For example, put unc/train_cross_modal.pth under ./data/unc/.
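As a sketch, assuming the archives uncompress to per-dataset files named as in the example above, the layout can be prepared like this:

# Create the expected directory skeleton shown above.
mkdir -p data/{flickr,gref,gref_umd,referit,unc,unc+}
mkdir -p image_data/Flickr30k/flickr30k-images image_data/other/images image_data/referit/images
# Example: place the uncompressed RefCOCO pseudo-sample file under ./data/unc/.
mv unc/train_cross_modal.pth ./data/unc/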

For generating pseudo-samples, we adopt the pretrained detector and attribute classifier from VinVL. The PyTorch implementation of that paper is available at VinVL.

Pretrained Checkpoints

1. You can download the DETR checkpoints from detr_checkpoints and move them to the checkpoints directory:

mkdir checkpoints
mv detr_checkpoints.tar.gz ./checkpoints/
tar -zxvf ./checkpoints/detr_checkpoints.tar.gz -C ./checkpoints/

2. Checkpoints trained on our pseudo-samples can be downloaded from Google Drive. You can evaluate these checkpoints by following the instructions right below.

mv cpl_checkpoints.tar.gz ./checkpoints/
tar -zxvf ./checkpoints/cpl_checkpoints.tar.gz -C ./checkpoints/
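After extraction, you can sanity-check the directory; the file names below are the ones referenced by the training and evaluation commands in the next section:

ls ./checkpoints/
# e.g. detr-r50-unc.pth         (passed to --detr_model in train.py)
#      unc_best_checkpoint.pth  (passed to --eval_model in eval.py)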

Training and Evaluation

  1. Training on RefCOCO.

    CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 --master_port 28888 --use_env train.py --num_workers 8 --epochs 20 --batch_size 32 --lr 0.0001 --lr_bert 0.00001 --lr_visu_cnn 0.00001 --lr_visu_tra 0.00001 --lr_scheduler cosine --aug_crop --aug_scale --aug_translate --backbone resnet50 --detr_model checkpoints/detr-r50-unc.pth --bert_enc_num 12 --detr_enc_num 6 --dataset unc --max_query_len 20 --data_root ./data/image_data --split_root ./data/ --output_dir ./outputs/unc/
    

    Notably, if you use a smaller batch size, you should also use a smaller learning rate. The original learning rate is set for a total batch size of 128 (4 GPUs × 32). Please refer to scripts/train.sh for training commands on other datasets.
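
    For example, under a linear-scaling heuristic (our assumption; the repository does not prescribe a scaling rule), a single-GPU run with batch size 32 would scale all learning rates by 32/128 = 0.25:

    # Hypothetical single-GPU run; learning rates scaled by 32/128 = 0.25.
    CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --nproc_per_node=1 --master_port 28888 --use_env train.py --num_workers 8 --epochs 20 --batch_size 32 --lr 0.000025 --lr_bert 0.0000025 --lr_visu_cnn 0.0000025 --lr_visu_tra 0.0000025 --lr_scheduler cosine --aug_crop --aug_scale --aug_translate --backbone resnet50 --detr_model checkpoints/detr-r50-unc.pth --bert_enc_num 12 --detr_enc_num 6 --dataset unc --max_query_len 20 --data_root ./data/image_data --split_root ./data/ --output_dir ./outputs/unc/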

  2. Evaluation on RefCOCO.

    CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 --master_port 28888 --use_env eval.py --num_workers 4 --batch_size 128 --backbone resnet50 --bert_enc_num 12 --detr_enc_num 6 --dataset unc --max_query_len 20 --data_root ./data/image_data --split_root ./data/ --eval_model ./checkpoints/unc_best_checkpoint.pth --eval_set testA --output_dir ./outputs/unc/testA/;
    

    Please refer to scripts/eval.sh for evaluation commands on other splits or datasets.
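
    For instance, all three RefCOCO splits can be evaluated in one loop (a sketch built from the command above; scripts/eval.sh remains the authoritative reference):

    # Sketch: evaluate the RefCOCO checkpoint on val, testA, and testB.
    for split in val testA testB; do
        CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 --master_port 28888 --use_env eval.py --num_workers 4 --batch_size 128 --backbone resnet50 --bert_enc_num 12 --detr_enc_num 6 --dataset unc --max_query_len 20 --data_root ./data/image_data --split_root ./data/ --eval_model ./checkpoints/unc_best_checkpoint.pth --eval_set $split --output_dir ./outputs/unc/$split/
    done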

Results

Dataset      RefCOCO              RefCOCO+             RefCOCOg              ReferItGame  Flickr30K
Split        val    testA  testB  val    testA  testB  g-val  u-val  u-test  test         test
Accuracy     70.67  74.58  67.19  51.81  58.34  46.17  57.04  60.21  60.12   45.23        63.87

Contacts

zhangjiahua at stu dot pku dot edu dot cn

Any discussions or concerns are welcome!

Acknowledgments

This codebase is partially based on Pseudo-Q, BLIP and VinVL.
