forked from yeliudev/CATNet

🛰️ Learning to Aggregate Multi-Scale Context for Instance Segmentation in Remote Sensing Images (arXiv 2021)

Context Aggregation Network


This repository maintains the official implementation of the paper Learning to Aggregate Multi-Scale Context for Instance Segmentation in Remote Sensing Images by Ye Liu, Huifang Li, Chao Hu, Shuang Luo, Yan Luo, and Chang Wen Chen.

Installation

We use the following environment settings. If automatic installation runs into problems, you may install these packages manually.

  • CUDA 10.2 Update 2
  • CUDNN 8.0.5.39
  • Python 3.9.7
  • PyTorch 1.10.0
  • MMCV 1.3.17
  • MMDetection 2.18.1
  • NNCore 0.3.2
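To compare a local environment against the versions listed above, a minimal sketch along these lines may help. The distribution names in `EXPECTED` are assumptions (MMCV, for instance, may be installed as `mmcv` or `mmcv-full`), and `compare_versions` is a hypothetical helper, not part of this repository. CUDA, cuDNN, and Python itself are not checked here.

```python
from importlib import metadata

# Versions copied from the list above. The distribution names are assumptions;
# adjust them to match how the packages were installed locally.
EXPECTED = {
    "torch": "1.10.0",
    "mmcv-full": "1.3.17",
    "mmdet": "2.18.1",
    "nncore": "0.3.2",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(expected, lookup=installed_version):
    """Map each package name to (expected, found, matches)."""
    report = {}
    for name, want in expected.items():
        found = lookup(name)
        # Ignore local build tags such as "+cu102" when comparing.
        base = found.split("+")[0] if found else None
        report[name] = (want, found, base == want)
    return report


if __name__ == "__main__":
    for name, (want, found, ok) in compare_versions(EXPECTED).items():
        status = "ok" if ok else "mismatch"
        print(f"{name}: expected {want}, found {found} ({status})")
```

A mismatch does not necessarily mean the code will fail; the listed versions are only the ones the authors report using.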

Install from source

  1. Clone the repository from GitHub.
git clone https://github.com/yeliudev/CATNet.git
cd CATNet
  2. Install dependencies.
pip install -r requirements.txt

Getting Started

Download and prepare the datasets

  1. Download and extract the datasets.

Note that the images in the iSAID dataset are split into patches with both sides no longer than 512 pixels, as reported in our paper. We strongly recommend using this pre-processed version directly, since the official toolkit has bugs that lead to undesirable patch sizes (e.g. extreme aspect ratios).

  2. Prepare the files in the following structure.
CATNet
├── configs
├── datasets
├── models
├── tools
├── data
│   ├── dior
│   │   ├── Annotations
│   │   ├── ImageSets
│   │   ├── JPEGImages-test
│   │   └── JPEGImages-trainval
│   ├── hrsid
│   │   ├── annotations
│   │   └── images
│   ├── isaid
│   │   ├── annotations
│   │   ├── train
│   │   └── val
│   └── vhr
│       ├── ground truth
│       └── positive image set
├── README.md
├── setup.cfg
└── ···
  3. Convert DIOR annotations to PASCAL VOC format.
python tools/convert_dior.py
  4. Convert NWPU VHR-10 annotations to COCO format.
python tools/convert_vhr.py

Train a model

Run the following command to train a model using a specified config.

torchrun --nproc_per_node=4 tools/train.py <path-to-config>

Test a model and evaluate results

Run the following command to test a model and evaluate results.

torchrun --nproc_per_node=4 tools/test.py <path-to-config> <path-to-checkpoint>
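The training and testing commands above differ only in the script name and the optional checkpoint argument, so they can be assembled by one small helper when launching runs from Python. This is a sketch under assumptions: `build_command` is a hypothetical function, and the README only documents 4 processes per node, so whether other GPU counts require config changes is not covered here.

```python
import shlex


def build_command(script, config, checkpoint=None, num_gpus=4):
    """Build the torchrun argument list for tools/train.py or tools/test.py."""
    cmd = ["torchrun", f"--nproc_per_node={num_gpus}", script, config]
    if checkpoint is not None:
        # tools/test.py takes the checkpoint path after the config path.
        cmd.append(checkpoint)
    return cmd


if __name__ == "__main__":
    # Placeholder paths: substitute a real config and checkpoint before running.
    train_cmd = build_command("tools/train.py", "<path-to-config>")
    test_cmd = build_command(
        "tools/test.py", "<path-to-config>", "<path-to-checkpoint>"
    )
    print(shlex.join(train_cmd))
    print(shlex.join(test_cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting issues with paths that contain spaces.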

Model Zoo

We provide multiple pre-trained models here. All the models are trained using 4 NVIDIA Tesla V100-SXM2 GPUs and are evaluated using the default metrics of the datasets.

| Dataset     | Model          | Backbone  | Schedule | Aug | BBox AP | Mask AP | Download        |
|-------------|----------------|-----------|----------|-----|---------|---------|-----------------|
| iSAID       | CAT Mask R-CNN | ResNet-50 | 1x       | ✗   | 46.2    | 38.5    | model / metrics |
| iSAID       | CAT Mask R-CNN | ResNet-50 | 1x       | ✓   | 47.6    | 40.1    | model / metrics |
| DIOR        | CATNet         | ResNet-50 | 3x       | ✗   | 76.3    | -       | model / metrics |
| DIOR        | CATNet         | ResNet-50 | 3x       | ✓   | 78.6    | -       | model / metrics |
| DIOR        | CAT R-CNN      | ResNet-50 | 3x       | ✗   | 77.7    | -       | model / metrics |
| DIOR        | CAT R-CNN      | ResNet-50 | 3x       | ✓   | 81.9    | -       | model / metrics |
| NWPU VHR-10 | CATNet         | ResNet-50 | 6x       | ✗   | 95.8    | -       | model / metrics |
| NWPU VHR-10 | CATNet         | ResNet-50 | 6x       | ✓   | 97.4    | -       | model / metrics |
| NWPU VHR-10 | CAT R-CNN      | ResNet-50 | 6x       | ✗   | 96.4    | -       | model / metrics |
| NWPU VHR-10 | CAT R-CNN      | ResNet-50 | 6x       | ✓   | 97.7    | -       | model / metrics |
| HRSID       | CAT Mask R-CNN | ResNet-50 | 3x       | ✗   | 71.7    | 58.2    | model / metrics |
| HRSID       | CAT Mask R-CNN | ResNet-50 | 3x       | ✓   | 73.3    | 59.6    | model / metrics |
| HRSID       | CAT R-CNN      | ResNet-50 | 3x       | ✗   | 70.5    | -       | model / metrics |
| HRSID       | CAT R-CNN      | ResNet-50 | 3x       | ✓   | 72.8    | -       | model / metrics |
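Each dataset and model pair in the table appears twice, once without and once with the Aug setting. As a small worked example, the sketch below computes the BBox AP difference between the two rows of each pair. The numbers are copied from the table; `aug_gains` is a hypothetical helper introduced only for illustration.

```python
# (dataset, model) -> (BBox AP without Aug, BBox AP with Aug), from the table.
BBOX_AP = {
    ("iSAID", "CAT Mask R-CNN"): (46.2, 47.6),
    ("DIOR", "CATNet"): (76.3, 78.6),
    ("DIOR", "CAT R-CNN"): (77.7, 81.9),
    ("NWPU VHR-10", "CATNet"): (95.8, 97.4),
    ("NWPU VHR-10", "CAT R-CNN"): (96.4, 97.7),
    ("HRSID", "CAT Mask R-CNN"): (71.7, 73.3),
    ("HRSID", "CAT R-CNN"): (70.5, 72.8),
}


def aug_gains(table):
    """Return the BBox AP difference (with Aug minus without) per entry."""
    # Round to one decimal to hide floating-point noise in the subtraction.
    return {
        key: round(with_aug - without_aug, 1)
        for key, (without_aug, with_aug) in table.items()
    }


if __name__ == "__main__":
    for (dataset, model), gain in aug_gains(BBOX_AP).items():
        print(f"{dataset:12s} {model:15s} +{gain:.1f} BBox AP")
```

These differences are simple arithmetic on the reported numbers and say nothing about statistical significance.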

Citation

If you find this project useful for your research, please kindly cite our paper.

@techreport{liu2021learning,
  title={Learning to Aggregate Multi-Scale Context for Instance Segmentation in Remote Sensing Images},
  author={Liu, Ye and Li, Huifang and Hu, Chao and Luo, Shuang and Luo, Yan and Chen, Chang Wen},
  number={arXiv:2111.11057},
  year={2021}
}
