# Windows:
Thumbs.db
ehthumbs.db
Desktop.ini

# Mac
.DS_Store

# Python:
*.py[cod]
*.so
*.egg
*.egg-info
dist
build
## Disclaimer

This is the official repository of the paper [_DOTA: A Large-scale Dataset for Object Detection in Aerial Images_](https://arxiv.org/abs/1711.10398). It contains code for training Faster R-CNN on oriented bounding boxes and horizontal bounding boxes, as reported in our paper.

If you use this code in your project, please mention this repository in your paper or license, and cite our paper:

    DOTA: A Large-scale Dataset for Object Detection in Aerial Images
    Gui-Song Xia\*, Xiang Bai\*, Jian Ding, Zhen Zhu, Serge Belongie, Jiebo Luo, Mihai Datcu, Marcello Pelillo, Liangpei Zhang
    In CVPR 2018. (\* equal contributions)

The code is built on a fork of [Deformable-ConvNets](https://github.com/msracver/Deformable-ConvNets); we use its Faster R-CNN part. We made some modifications to Faster R-CNN so that it regresses a quadrangle. More details can be found in our [paper](https://arxiv.org/abs/1711.10398).
|
||
## Requirements: Software

1. MXNet from [the official repository](https://github.com/dmlc/mxnet). We tested our code on [MXNet@(commit 62ecb60)](https://github.com/dmlc/mxnet/tree/62ecb60). Due to the rapid development of MXNet, it is recommended to check out this version if you encounter any issues.

2. Python 2.7. We recommend using Anaconda2.

3. Some Python packages may be missing: cython, opencv-python >= 3.2.0, easydict. If `pip` is set up on your system, these packages can be fetched and installed by running
    ```
    pip install Cython
    pip install opencv-python==3.2.0.6
    pip install easydict==1.6
    ```
4. For Windows users, Visual Studio 2015 is needed to compile the cython module.
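As a quick sanity check, the dependency list above can be verified from Python before building. This is an illustrative helper, not part of the repo, and works under both Python 2.7 and Python 3:

```python
def missing_packages(module_names):
    """Return the module names from the list that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing

# opencv-python installs as the module `cv2`
print(missing_packages(["Cython", "cv2", "easydict"]))
```

An empty list means all three dependencies are importable.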
|
||
|
||
## Requirements: Hardware

Any NVIDIA GPU with at least 4GB of memory should be sufficient.

## Installation

1. Clone the repository:
    ~~~
    git clone https://gitee.com/dingjiansw101/faster-rcnn-mxnet
    ~~~
2. For Windows users, run `cmd .\init.bat`. For Linux users, run `sh ./init.sh`. The scripts will build the cython module automatically and create some folders.
|
||
## Demo & Deformable Model

We provide trained convnet models, including Faster R-CNN models trained on DOTA.

1. To use the demo with our pre-trained Faster R-CNN models for DOTA, please download them manually from [Google Drive](https://drive.google.com/open?id=1b6P-UMaBBpMPlcgvc38dMToPAa_Gyu6F) or [BaiduYun](https://pan.baidu.com/s/1YuB5ib7O-Ori1ZpiGf8Egw) and put them under the following folders.

    Make sure it looks like this:
    ```
    ./output/rcnn/DOTA_quadrangle/DOTA_quadrangle/train/rcnn_DOTA_quadrangle-0059.params
    ./output/rcnn/DOTA/DOTA/train/rcnn_DOTA_aligned-0032.params
    ```
## Preparation for Training & Testing

<!-- For R-FCN/Faster R-CNN\: -->

1. Please download the [DOTA](https://captain-whu.github.io/DOTA/dataset.html) dataset and use the [DOTA_devkit](https://github.com/CAPTAIN-WHU/DOTA_devkit) to split the data into patches. Make sure the split images look like this:
    ```
    ./path-to-dota-split/images
    ./path-to-dota-split/labelTxt
    ./path-to-dota-split/test.txt
    ./path-to-dota-split/train.txt
    ```
    The test.txt and train.txt list the names of the sub-images (without suffix) for testing and training, respectively.
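Before training, it may help to confirm the split folder has the layout above. A small checker (the function name and the expected-entry list are ours, derived from the layout shown):

```python
import os

EXPECTED_ENTRIES = ["images", "labelTxt", "test.txt", "train.txt"]

def missing_split_entries(split_dir, expected=EXPECTED_ENTRIES):
    """Return the expected folder/file names absent from the split directory."""
    return [name for name in expected
            if not os.path.exists(os.path.join(split_dir, name))]
```

For a complete split, `missing_split_entries("/path-to-dota-split")` should return an empty list.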

2. Please download the ImageNet-pretrained ResNet-v1-101 model manually from [OneDrive](https://1drv.ms/u/s!Am-5JzdW2XHzhqMEtxf1Ciym8uZ8sg) and put it under the folder `./model`. Make sure it looks like this:
    ```
    ./model/pretrained_model/resnet_v1_101-0000.params
    ```
## Usage

1. All of our experiment settings (GPU #, dataset, etc.) are kept in yaml config files in the folder `./experiments/faster_rcnn/cfgs`.

2. Set "dataset_path" and "root_path" in DOTA.yaml and DOTA_quadrangle.yaml. "dataset_path" should be the parent folder of "images" and "labelTxt". "root_path" is the path where you want to save the cache data.
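The two keys can of course be edited by hand; as a sketch, a line-based rewrite of uncommented "dataset_path" and "root_path" entries could look like this (the function name is ours, and a YAML library would be more robust):

```python
def set_config_paths(yaml_text, dataset_path, root_path):
    """Rewrite uncommented dataset_path/root_path lines in a config's text."""
    out = []
    for line in yaml_text.splitlines():
        indent = line[:len(line) - len(line.lstrip())]
        key = line.strip().split(":")[0]
        if key == "dataset_path":
            out.append('%sdataset_path: "%s"' % (indent, dataset_path))
        elif key == "root_path":
            out.append('%sroot_path: "%s"' % (indent, root_path))
        else:
            out.append(line)  # comments and other keys pass through unchanged
    return "\n".join(out)
```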

3. Set the scales in DOTA.yaml and DOTA_quadrangle.yaml.

4. To conduct experiments, run the python scripts with the corresponding config file as input. For example, to train and test on quadrangles, run
    ```
    python experiments/faster_rcnn/rcnn_dota_e2e.py --cfg experiments/faster_rcnn/cfgs/DOTA.yaml
    ```
    <!-- A cache folder would be created automatically to save the model and the log under `output/rfcn_dcn_coco/`. -->
5. Please find more details in the config files and in our code.
## Misc.

Code has been tested under:
<!--
- Ubuntu 14.04 with a Maxwell Titan X GPU and Intel Xeon CPU E5-2620 v2 @ 2.10GHz -->
- Ubuntu 14.04 with 4 Pascal Titan X GPUs and 32 Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
<!-- - Windows Server 2012 R2 with 8 K40 GPUs and Intel Xeon CPU E5-2650 v2 @ 2.60GHz
- Windows Server 2012 R2 with 4 Pascal Titan X GPUs and Intel Xeon CPU E5-2650 v4 @ 2.30GHz -->

## Cite

If you use our project, please cite:
```
@article{xia2017dota,
  title={DOTA: A Large-scale Dataset for Object Detection in Aerial Images},
  author={Xia, Gui-Song and Bai, Xiang and Ding, Jian and Zhu, Zhen and Belongie, Serge and Luo, Jiebo and Datcu, Mihai and Pelillo, Marcello and Zhang, Liangpei},
  journal={arXiv preprint arXiv:1711.10398},
  year={2017}
}
```
---
MXNET_VERSION: "mxnet"
output_path: "./output/rcnn/DOTA"
symbol: resnet_v1_101_rcnn
gpus: '1'
CLASS_AGNOSTIC: false
SCALES:
- 1024
- 1024
default:
  frequent: 100
  kvstore: device
network:
  pretrained: "./model/pretrained_model/resnet_v1_101"
  pretrained_epoch: 0
  PIXEL_MEANS:
  - 103.06
  - 115.90
  - 123.15
  IMAGE_STRIDE: 0
  RCNN_FEAT_STRIDE: 16
  RPN_FEAT_STRIDE: 16
  FIXED_PARAMS:
  - conv1
  - bn_conv1
  - res2
  - bn2
  - gamma
  - beta
  FIXED_PARAMS_SHARED:
  - conv1
  - bn_conv1
  - res2
  - bn2
  - res3
  - bn3
  - res4
  - bn4
  - gamma
  - beta
  ANCHOR_RATIOS:
  - 0.3
  - 0.5
  - 1
  - 2
  - 4
  ANCHOR_SCALES:
  - 8
  - 16
  - 32
  NUM_ANCHORS: 15
dataset:
  NUM_CLASSES: 16
  dataset: DOTA
  # dataset_path: "/data/dj/dota/dota-split"
  dataset_path: "/data/dj/ODAI/ODAI_test_split"
  # NUM_CLASSES: 16
  # dataset: DOTA
  # dataset_path: "/home/dj/data/DOTA-v3"
  image_set: train
  # root_path: "/data/dj/dota"
  root_path: "/data/dj/ODAI"
  test_image_set: test
  proposal: rpn
TRAIN:
  lr: 0.0005
  lr_step: '40,52'
  warmup: true
  warmup_lr: 0.00005
  # typically we will use 4000 warmup step for single GPU on VOC
  warmup_step: 4000
  begin_epoch: 0
  end_epoch: 60
  model_prefix: 'rcnn_DOTA_aligned'
  # whether resume training
  RESUME: false
  # whether flip image
  FLIP: true
  # whether shuffle image
  SHUFFLE: true
  # whether use OHEM
  ENABLE_OHEM: true
  # size of images for each device, 2 for rcnn, 1 for rpn and e2e
  BATCH_IMAGES: 1
  # e2e changes behavior of anchor loader and metric
  END2END: true
  # group images with similar aspect ratio
  ASPECT_GROUPING: true
  # R-CNN
  # rcnn rois batch size
  BATCH_ROIS: 128
  BATCH_ROIS_OHEM: 128
  # rcnn rois sampling params
  FG_FRACTION: 0.25
  FG_THRESH: 0.5
  BG_THRESH_HI: 0.5
  BG_THRESH_LO: 0.1
  # rcnn bounding box regression params
  BBOX_REGRESSION_THRESH: 0.5
  BBOX_WEIGHTS:
  - 1.0
  - 1.0
  - 1.0
  - 1.0

  # RPN anchor loader
  # rpn anchors batch size
  RPN_BATCH_SIZE: 256
  # rpn anchors sampling params
  RPN_FG_FRACTION: 0.5
  RPN_POSITIVE_OVERLAP: 0.7
  RPN_NEGATIVE_OVERLAP: 0.3
  RPN_CLOBBER_POSITIVES: false
  # rpn bounding box regression params
  RPN_BBOX_WEIGHTS:
  - 1.0
  - 1.0
  - 1.0
  - 1.0
  RPN_POSITIVE_WEIGHT: -1.0
  # used for end2end training
  # RPN proposal
  CXX_PROPOSAL: false
  RPN_NMS_THRESH: 0.7
  RPN_PRE_NMS_TOP_N: 6000
  RPN_POST_NMS_TOP_N: 300
  RPN_MIN_SIZE: 0
  # approximate bounding box regression
  BBOX_NORMALIZATION_PRECOMPUTED: true
  BBOX_MEANS:
  - 0.0
  - 0.0
  - 0.0
  - 0.0
  BBOX_STDS:
  - 0.1
  - 0.1
  - 0.2
  - 0.2
TEST:
  # use rpn to generate proposal
  HAS_RPN: true
  # size of images for each device
  BATCH_IMAGES: 1
  # RPN proposal
  CXX_PROPOSAL: false
  RPN_NMS_THRESH: 0.7
  RPN_PRE_NMS_TOP_N: 6000
  RPN_POST_NMS_TOP_N: 300
  RPN_MIN_SIZE: 0
  # RPN generate proposal
  PROPOSAL_NMS_THRESH: 0.7
  PROPOSAL_PRE_NMS_TOP_N: 20000
  PROPOSAL_POST_NMS_TOP_N: 2000
  PROPOSAL_MIN_SIZE: 0
  # RCNN nms
  NMS: 0.3
  test_epoch: 32
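Note that `NUM_ANCHORS: 15` in the config above follows from pairing the 5 `ANCHOR_RATIOS` with the 3 `ANCHOR_SCALES`. A minimal sketch of that enumeration (illustrative, not the repo's actual anchor-generation code):

```python
def anchor_grid(ratios, scales):
    """Pair every aspect ratio with every scale, as a standard RPN does."""
    return [(r, s) for r in ratios for s in scales]

# Values from the config: 5 ratios x 3 scales = 15 anchors per position
combos = anchor_grid([0.3, 0.5, 1, 2, 4], [8, 16, 32])
print(len(combos))  # 15
```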