CARIS: Context-Aware Referring Image Segmentation

This repository is for the ACM MM 2023 paper CARIS: Context-Aware Referring Image Segmentation.

Requirements

The code is verified with Python 3.8 and PyTorch 1.11. Other dependencies are listed in requirements.txt.
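For reference, a minimal environment setup might look like the sketch below; conda and the torchvision version are assumptions, so adjust them to your CUDA setup.

# Create and activate a Python 3.8 environment (conda is just one option)
conda create -n caris python=3.8 -y
conda activate caris
# Install PyTorch 1.11 with a matching torchvision build (pick the wheel for your CUDA version)
pip install torch==1.11.0 torchvision==0.12.0
# Install the remaining dependencies
pip install -r requirements.txt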

Datasets

Please follow the instructions in ./refer to download the annotations for RefCOCO/RefCOCO+/RefCOCOg. We provide the combined annotations as refcocom here.

Download images from COCO. Please use the first download link, 2014 Train images [83K/13GB], and extract the downloaded train2014.zip file.
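As a rough example (the standard COCO download URL is assumed), the 2014 training images can be fetched and extracted from the command line:

# Download and extract the COCO 2014 training images (~13 GB)
mkdir -p {YOUR_COCO_PATH} && cd {YOUR_COCO_PATH}
wget http://images.cocodataset.org/zips/train2014.zip
unzip train2014.zip   # produces the train2014/ folder expected in the layout below
rm train2014.zip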

Data paths should be as follows:

.{YOUR_REFER_PATH}
├── refcoco
├── refcoco+
├── refcocog
├── refcocom

.{YOUR_COCO_PATH}
├── train2014

Pretrained Models

Download the pretrained Swin-B and BERT-B weights. See models for pretrained CARIS models.
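A hedged sketch for fetching the backbone weights; which exact Swin-B checkpoint CARIS expects is an assumption (LAVT-style repos typically use the ImageNet-22K, window-12 variant), so please confirm against the training scripts.

# Assumed Swin-B checkpoint (ImageNet-22K pretrained, window 12)
wget https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_base_patch4_window12_384_22k.pth -P {YOUR_MODEL_PATH}
# BERT-B can be cached locally via Hugging Face transformers
python -c "from transformers import BertModel; BertModel.from_pretrained('bert-base-uncased')"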

Usage

Train

By default, we use fp16 training for efficiency. To train a model on refcoco with 2 GPUs, modify YOUR_COCO_PATH, YOUR_REFER_PATH, YOUR_MODEL_PATH, and YOUR_CODE_PATH in scripts/train_refcoco.sh, then run:

sh scripts/train_refcoco.sh

You can change DATASET to refcoco+/refcocog/refcocom for training on different datasets. Note that for RefCOCOg, there are two splits (umd and google). You should add --splitBy umd or --splitBy google to specify the split.
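For orientation, here is a hedged sketch of the settings to fill in; the exact variable names and layout of scripts/train_refcoco.sh are assumptions based on the placeholders above.

# Illustrative only: edit these inside scripts/train_refcoco.sh before launching
DATASET=refcoco                  # or refcoco+/refcocog/refcocom
YOUR_COCO_PATH=/path/to/coco     # contains train2014/
YOUR_REFER_PATH=/path/to/refer   # contains refcoco, refcoco+, refcocog, refcocom
YOUR_MODEL_PATH=/path/to/models  # pretrained Swin-B / BERT-B weights
YOUR_CODE_PATH=/path/to/CARIS    # repository root
# For RefCOCOg, append --splitBy umd (or --splitBy google) to the training command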

Test

Single-GPU evaluation is supported. To evaluate a model on refcoco, modify the settings in scripts/test_refcoco.sh and run:

sh scripts/test_refcoco.sh

You can change DATASET and SPLIT to evaluate on different splits of each dataset. Note that for RefCOCOg, there are two splits (umd and google). You should add --splitBy umd or --splitBy google to specify the split. For the models trained on refcocom, you can directly evaluate them on the splits of refcoco/refcoco+/refcocog (umd).
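Similarly, a hedged sketch of the evaluation settings; the variable names in scripts/test_refcoco.sh and the split names are assumptions following common RefCOCO conventions.

# Illustrative only: edit these inside scripts/test_refcoco.sh before running
DATASET=refcoco    # or refcoco+/refcocog/refcocom
SPLIT=val          # e.g. val/testA/testB for refcoco(+), val/test for refcocog
# For RefCOCOg, add --splitBy umd (or --splitBy google) to the evaluation command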

References

This repo is mainly built on LAVT and mmdetection. Thanks for their great work!

Citation

If you find our code useful, please consider citing:

@inproceedings{liu2023caris,
  title={CARIS: Context-Aware Referring Image Segmentation},
  author={Liu, Sun-Ao and Zhang, Yiheng and Qiu, Zhaofan and Xie, Hongtao and Zhang, Yongdong and Yao, Ting},
  booktitle={Proceedings of the 31st ACM International Conference on Multimedia},
  year={2023}
}
