RMSIN

This repository is the official implementation of "Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation."

Setting Up

Preliminaries

The code has been verified to work with PyTorch v1.7.1 and Python 3.7.

Clone this repository.
Change directory to the root of this repository.

Package Dependencies

Create a new Conda environment with Python 3.7, then activate it:

conda create -n RMSIN python==3.7
conda activate RMSIN

Install PyTorch v1.7.1 with a CUDA version that works on your cluster/machine (CUDA 10.2 is used in this example):

conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.2 -c pytorch

Install the packages in requirements.txt via pip:

pip install -r requirements.txt
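
After installation, it can help to confirm that the active environment matches the versions the code was verified with. The following is a minimal sketch, not part of the repository: the function name `check_environment` and the report keys are illustrative, and the expected versions are simply the ones stated above.

```python
import sys

# Versions this README says the code was verified with.
EXPECTED_PYTHON = (3, 7)
EXPECTED_TORCH = "1.7.1"


def check_environment():
    """Return a dict comparing the active interpreter and PyTorch to the tested versions."""
    report = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "python_matches": sys.version_info[:2] == EXPECTED_PYTHON,
    }
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        report["torch"] = None
        report["torch_matches"] = False
        report["cuda_available"] = False
    else:
        # torch.__version__ may carry a build suffix such as "+cu102"; compare the base part only.
        report["torch"] = str(torch.__version__)
        report["torch_matches"] = report["torch"].split("+")[0] == EXPECTED_TORCH
        report["cuda_available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print("{}: {}".format(key, value))
```

A mismatch does not necessarily mean the code will fail; it only indicates that the environment differs from the verified one.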

The Initialization Weights for Training

Create the ./pretrained_weights directory to store the weights:

mkdir ./pretrained_weights

Download the pre-trained classification weights of the Swin Transformer and put the .pth file in ./pretrained_weights. These weights are needed to initialize the model for training.
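
Before starting a training run, one might check that a weights file is actually in place. This is a small sketch under the assumption that any file ending in .pth in ./pretrained_weights is a candidate; the helper name `find_checkpoints` is hypothetical and does not exist in the repository.

```python
import os


def find_checkpoints(weights_dir="./pretrained_weights"):
    """Return the sorted paths of all .pth files in weights_dir.

    An empty list means the directory is missing or contains no .pth file.
    """
    if not os.path.isdir(weights_dir):
        return []
    return sorted(
        os.path.join(weights_dir, name)
        for name in os.listdir(weights_dir)
        if name.endswith(".pth")
    )


if __name__ == "__main__":
    found = find_checkpoints()
    if found:
        print("Found weights:", ", ".join(found))
    else:
        print("No .pth file found in ./pretrained_weights")
```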

Datasets

We perform all experiments on our proposed dataset, RRSIS-D. RRSIS-D is a new Referring Remote Sensing Image Segmentation benchmark that contains 17,402 image-caption-mask triplets. It can be downloaded from Google Drive or Baidu Netdisk (access code: sjoe).

Usage

Download our dataset.
Copy all the downloaded files to ./refer/data/. The dataset folder should look like this:

$DATA_PATH
├── rrsisd
│   ├── refs(unc).p
│   └── instances.json
└── images
    └── rrsisd
        ├── JPEGImages
        └── ann_split
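
A missing or misplaced file is a common cause of data-loading errors, so the layout above can be checked before training. The sketch below only tests that the four entries from the folder tree exist; it assumes the data root is ./refer/data, and the helper name `missing_dataset_entries` is illustrative rather than part of the repository.

```python
import os

# Relative paths copied from the folder tree above.
EXPECTED_ENTRIES = [
    os.path.join("rrsisd", "refs(unc).p"),
    os.path.join("rrsisd", "instances.json"),
    os.path.join("images", "rrsisd", "JPEGImages"),
    os.path.join("images", "rrsisd", "ann_split"),
]


def missing_dataset_entries(data_root="./refer/data"):
    """Return the expected entries that do not exist under data_root."""
    return [
        rel for rel in EXPECTED_ENTRIES
        if not os.path.exists(os.path.join(data_root, rel))
    ]


if __name__ == "__main__":
    missing = missing_dataset_entries()
    if missing:
        print("Missing entries:")
        for rel in missing:
            print("  " + rel)
    else:
        print("Dataset layout looks complete.")
```

This checks presence only; it does not validate the contents of the annotation files or the images.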

Training

We use DistributedDataParallel from PyTorch for training. To run on 4 GPUs (with IDs 0, 1, 2, and 3) on a single node:

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node 4 --master_port 12345 train.py --dataset rrsisd --model_id RMSIN --epochs 40 --img_size 480 2>&1 | tee ./output
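
To run on a different set of GPUs, the GPU ID list and --nproc_per_node in the command above are usually changed together. The sketch below assembles the same command from a list of GPU IDs, under the assumption that one process is launched per visible GPU. The helpers `build_train_command` and `as_shell_line` are hypothetical and not part of the repository; every flag is taken from the command above.

```python
import shlex


def build_train_command(gpu_ids, master_port=12345, epochs=40, img_size=480):
    """Return (env, argv) for a single-node launch with one process per GPU."""
    if not gpu_ids:
        raise ValueError("at least one GPU ID is required")
    env = {"CUDA_VISIBLE_DEVICES": ",".join(str(i) for i in gpu_ids)}
    argv = [
        "python", "-m", "torch.distributed.launch",
        # Assumption: the process count equals the number of visible GPUs.
        "--nproc_per_node", str(len(gpu_ids)),
        "--master_port", str(master_port),
        "train.py",
        "--dataset", "rrsisd",
        "--model_id", "RMSIN",
        "--epochs", str(epochs),
        "--img_size", str(img_size),
    ]
    return env, argv


def as_shell_line(env, argv):
    """Format env and argv as one shell command line."""
    prefix = " ".join("{}={}".format(k, v) for k, v in env.items())
    return prefix + " " + " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    env, argv = build_train_command([0, 1])
    print(as_shell_line(env, argv))
```

The printed line can be pasted into a shell, optionally followed by the same `2>&1 | tee ./output` redirection to keep a log.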

Testing

python test.py --swin_type base --dataset rrsisd --resume ./your_checkpoints_path --split val --workers 4 --window12 --img_size 480

Acknowledgements

The code in this repository is built on LAVT. We'd like to thank the authors for open-sourcing their project.
