The official PyTorch implementation of the SAM model for referring image segmentation (RIS).
More details: see SAM.
- Environment
  - Refer to SAM.
- Datasets
  - Detailed instructions are in LAVT.
- Pretrained weights
  - Refer to SAM.
Training with 3 V100 GPUs:
CUDA_VISIBLE_DEVICES=0,1,2 python -m torch.distributed.launch --nproc_per_node 3 train.py --model vit_h --dataset refcoco --split train --batch-size 8 --epochs 40 --img_size 1024 --lr 0.0001 2>&1 | tee ./logs/refcoco/vit_h_output
Testing
python test.py --model vit_h --dataset refcoco --split testB --resume ./checkpoints/vit_h_best_refcoco.pth --img_size 1024 --multimask
Babysitting (monitoring runs with TensorBoard)
tensorboard --logdir ./logs/vit_h_refcoco_test/ --port 6006
For more details, refer to LAVT.
Dataset | P@0.5 | P@0.6 | P@0.7 | P@0.8 | P@0.9 | Overall IoU | Mean IoU |
---|---|---|---|---|---|---|---|
RefCOCO val | 79.50 | 74.00 | 67.45 | 55.47 | 22.93 | 64.64 | 71.06 |
RefCOCO test A | 83.03 | 78.20 | 71.68 | 58.60 | 22.38 | 68.61 | 73.35 |
RefCOCO test B | 73.68 | 67.11 | 60.22 | 49.44 | 26.79 | 59.96 | 67.79 |
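
The columns in the table above can be read as follows. This is a minimal sketch, assuming the metric definitions commonly used for RefCOCO evaluation (as in LAVT): P@X is the percentage of samples whose IoU reaches threshold X, Overall IoU is the cumulative intersection divided by the cumulative union over all samples, and Mean IoU is the average of per-sample IoUs. The function names `iou_counts` and `evaluate` are hypothetical and not taken from this repository, and the `>=` threshold comparison is an assumption.

```python
def iou_counts(pred, target):
    """Return (intersection, union) pixel counts for two flat binary masks."""
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    return inter, union


def evaluate(preds, targets, thresholds=(0.5, 0.6, 0.7, 0.8, 0.9)):
    """Compute P@X, overall IoU and mean IoU (all in percent).

    Hypothetical helper: assumes each mask is a flat sequence of 0/1 values.
    """
    if len(preds) == 0 or len(preds) != len(targets):
        raise ValueError("preds and targets must be non-empty and equal in length")

    cum_inter = 0
    cum_union = 0
    ious = []
    for pred, target in zip(preds, targets):
        inter, union = iou_counts(pred, target)
        cum_inter += inter
        cum_union += union
        # An empty union (both masks empty) is scored as 0 here by assumption.
        ious.append(inter / union if union else 0.0)

    n = len(ious)
    # Assumption: a sample counts as correct when its IoU is >= the threshold.
    precision = {t: 100.0 * sum(iou >= t for iou in ious) / n for t in thresholds}
    return {
        "precision": precision,
        "overall_iou": 100.0 * cum_inter / cum_union if cum_union else 0.0,
        "mean_iou": 100.0 * sum(ious) / n,
    }


if __name__ == "__main__":
    # Two toy samples with 4 pixels each: one perfect match, one poor match.
    preds = [[1, 1, 0, 0], [1, 1, 1, 1]]
    targets = [[1, 1, 0, 0], [1, 0, 0, 0]]
    print(evaluate(preds, targets))
```

Overall IoU pools pixels across samples, so masks with larger unions weigh more, while Mean IoU gives every sample equal weight. This is why the two columns can differ noticeably on the same split.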
This project is under the MIT license. See LICENSE for details.