kkakkkka/ETRIS

ETRIS

This is the official PyTorch implementation of Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation.

[Figure: teaser]

Overall Architecture

Preparation

  1. Environment

    • PyTorch (e.g. 1.8.1+cu111)
    • Other dependencies in requirements.txt
      pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
      pip install -r requirements.txt
  2. Datasets

  3. Pretrained weights

    • Download the pretrained CLIP weights of ResNet-50/101 and ViT-B into the pretrain directory
      mkdir pretrain && cd pretrain
      # ResNet-50
      wget https://openaipublic.azureedge.net/clip/models/afeb0e10f9e5a86da6080e35cf09123aca3b358a0c3e3b6c78a7b63bc04b6762/RN50.pt
      # ResNet-101
      wget https://openaipublic.azureedge.net/clip/models/8fa8567bab74a42d41c5915025a8e4538c3bdbe8804a470a72f30b0d94fab599/RN101.pt
      # ViT-B
      wget https://openaipublic.azureedge.net/clip/models/5806e77cd80f8b59890b7e101eabd078d9fb84e6937f9e85e4ecb61988df416f/ViT-B-16.pt
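An interrupted download can leave a truncated checkpoint in pretrain. The sketch below is one way to check the three files before training. It assumes, as the OpenAI CLIP loader does, that the long hexadecimal path component of each URL above is the SHA-256 digest of the file. The helper names `sha256_of` and `check_weights` are hypothetical and are not part of this repository.

```python
import hashlib
from pathlib import Path

# File name -> expected SHA-256 digest, copied from the directory component
# of each download URL above (assumption: that component is the file's SHA-256).
EXPECTED_SHA256 = {
    "RN50.pt": "afeb0e10f9e5a86da6080e35cf09123aca3b358a0c3e3b6c78a7b63bc04b6762",
    "RN101.pt": "8fa8567bab74a42d41c5915025a8e4538c3bdbe8804a470a72f30b0d94fab599",
    "ViT-B-16.pt": "5806e77cd80f8b59890b7e101eabd078d9fb84e6937f9e85e4ecb61988df416f",
}


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large checkpoints are never fully held in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_weights(directory="pretrain"):
    """Return {file name: status}, where status is 'ok', 'missing', or 'mismatch'."""
    results = {}
    for name, expected in EXPECTED_SHA256.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif sha256_of(path) == expected:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


if __name__ == "__main__":
    # Run from the repository root, i.e. the parent of the pretrain directory.
    for name, status in check_weights().items():
        print(f"{name}: {status}")
```

A file reported as mismatch is most likely incomplete and should be downloaded again. A file reported as missing only matters if the corresponding backbone is the one you plan to use.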

Quick Start

To train ETRIS, edit the script to match your setup and run:

bash run_scripts/train.sh

To evaluate ETRIS, edit the script to match your setup and run:

bash run_scripts/test.sh

Weights

The weights of our model are available at the following link: https://pan.baidu.com/s/1jaOJKdIg1t8wnWrxgCkkRA?pwd=vmyv. You may need to enter the password "vmyv" to access the files.

Acknowledgements

The code is based on CRIS. We thank the authors for open-sourcing their code and encourage users to cite their work when applicable.

Citation

If ETRIS is useful for your research, please consider citing:

@inproceedings{xu2023bridging,
  title={Bridging vision and language encoders: Parameter-efficient tuning for referring image segmentation},
  author={Xu, Zunnan and Chen, Zhihong and Zhang, Yong and Song, Yibing and Wan, Xiang and Li, Guanbin},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={17503--17512},
  year={2023}
}

About

[ICCV-2023] The official code of Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation
