# Windows:
Thumbs.db
ehthumbs.db
Desktop.ini

# Mac
.DS_Store

# Python:
*.py[cod]
*.so
*.egg
*.egg-info
dist
build
## Disclaimer

This is the official repository of the paper [_DOTA: A Large-scale Dataset for Object Detection in Aerial Images_](https://arxiv.org/abs/1711.10398). It contains code for training Faster R-CNN on oriented bounding boxes and horizontal bounding boxes, as reported in our paper.

If you use this code in your project, please mention this repository in your paper or license, and cite our paper:

    DOTA: A Large-scale Dataset for Object Detection in Aerial Images
    Gui-Song Xia\*, Xiang Bai\*, Jian Ding, Zhen Zhu, Serge Belongie, Jiebo Luo, Mihai Datcu, Marcello Pelillo, Liangpei Zhang
    In CVPR 2018. (\* equal contributions)

The code is built on a fork of [Deformable-ConvNets](https://github.com/msracver/Deformable-ConvNets); we use its Faster R-CNN part. We made some modifications to Faster R-CNN so that it regresses a quadrangle. More details can be found in our [paper](https://arxiv.org/abs/1711.10398).
|
||
## Requirements: Software

1. MXNet from [the official repository](https://github.com/dmlc/mxnet). We tested our code on [MXNet@(commit 62ecb60)](https://github.com/dmlc/mxnet/tree/62ecb60). Due to the rapid development of MXNet, it is recommended to check out this version if you encounter any issues.

2. Python 2.7. We recommend using Anaconda2.

3. Some Python packages may be missing: cython, opencv-python >= 3.2.0, easydict. If `pip` is set up on your system, these packages can be fetched and installed by running
    ```
    pip install Cython
    pip install opencv-python==3.2.0.6
    pip install easydict==1.6
    ```
4. For Windows users, Visual Studio 2015 is needed to compile the cython module.
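As a quick sanity check, the dependency list above can be verified from Python before building. This is an illustrative helper, not part of the repo, and works under both Python 2.7 and Python 3:

```python
def missing_packages(module_names):
    """Return the module names from the list that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing

# opencv-python installs as the module `cv2`
print(missing_packages(["Cython", "cv2", "easydict"]))
```

An empty list means all three dependencies are importable.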
|
||
|
||
## Requirements: Hardware

Any NVIDIA GPU with at least 4GB of memory should be sufficient.

## Installation

1. Clone the repository:
    ~~~
    git clone https://gitee.com/dingjiansw101/faster-rcnn-mxnet
    ~~~
2. For Windows users, run `cmd .\init.bat`. For Linux users, run `sh ./init.sh`. The scripts will build the cython module automatically and create some folders.
|
||
## Demo & Deformable Model

We provide trained convnet models, including Faster R-CNN models trained on DOTA.

1. To use the demo with our pre-trained Faster R-CNN models for DOTA, please download them manually from [Google Drive](https://drive.google.com/open?id=1b6P-UMaBBpMPlcgvc38dMToPAa_Gyu6F) or [BaiduYun](https://pan.baidu.com/s/1YuB5ib7O-Ori1ZpiGf8Egw) and put them under the following folders.

    Make sure it looks like this:
    ```
    ./output/rcnn/DOTA_quadrangle/DOTA_quadrangle/train/rcnn_DOTA_quadrangle-0059.params
    ./output/rcnn/DOTA/DOTA/train/rcnn_DOTA_aligned-0032.params
    ```
## Preparation for Training & Testing

<!-- For R-FCN/Faster R-CNN\: -->

1. Please download the [DOTA](https://captain-whu.github.io/DOTA/dataset.html) dataset and use the [DOTA_devkit](https://github.com/CAPTAIN-WHU/DOTA_devkit) to split the data into patches. Make sure the split images look like this:
    ```
    ./path-to-dota-split/images
    ./path-to-dota-split/labelTxt
    ./path-to-dota-split/test.txt
    ./path-to-dota-split/train.txt
    ```
    The test.txt and train.txt list the names of the sub-images (without suffix) for testing and training, respectively.
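Before training, it may help to confirm the split folder has the layout above. A small checker (the function name and the expected-entry list are ours, derived from the layout shown):

```python
import os

EXPECTED_ENTRIES = ["images", "labelTxt", "test.txt", "train.txt"]

def missing_split_entries(split_dir, expected=EXPECTED_ENTRIES):
    """Return the expected folder/file names absent from the split directory."""
    return [name for name in expected
            if not os.path.exists(os.path.join(split_dir, name))]
```

For a complete split, `missing_split_entries("/path-to-dota-split")` should return an empty list.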

2. Please download the ImageNet-pretrained ResNet-v1-101 model manually from [OneDrive](https://1drv.ms/u/s!Am-5JzdW2XHzhqMEtxf1Ciym8uZ8sg) and put it under the folder `./model`. Make sure it looks like this:
    ```
    ./model/pretrained_model/resnet_v1_101-0000.params
    ```
## Usage

1. All of our experiment settings (GPU #, dataset, etc.) are kept in yaml config files in the folder `./experiments/faster_rcnn/cfgs`.

2. Set "dataset_path" and "root_path" in DOTA.yaml and DOTA_quadrangle.yaml. "dataset_path" should be the parent folder of "images" and "labelTxt". "root_path" is the path where you want to save the cache data.
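The two keys can of course be edited by hand; as a sketch, a line-based rewrite of uncommented "dataset_path" and "root_path" entries could look like this (the function name is ours, and a YAML library would be more robust):

```python
def set_config_paths(yaml_text, dataset_path, root_path):
    """Rewrite uncommented dataset_path/root_path lines in a config's text."""
    out = []
    for line in yaml_text.splitlines():
        indent = line[:len(line) - len(line.lstrip())]
        key = line.strip().split(":")[0]
        if key == "dataset_path":
            out.append('%sdataset_path: "%s"' % (indent, dataset_path))
        elif key == "root_path":
            out.append('%sroot_path: "%s"' % (indent, root_path))
        else:
            out.append(line)  # comments and other keys pass through unchanged
    return "\n".join(out)
```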

3. Set the scales in DOTA.yaml and DOTA_quadrangle.yaml.

4. To conduct experiments, run the python scripts with the corresponding config file as input. For example, to train and test on quadrangles, run
    ```
    python experiments/faster_rcnn/rcnn_dota_e2e.py --cfg experiments/faster_rcnn/cfgs/DOTA.yaml
    ```
    <!-- A cache folder would be created automatically to save the model and the log under `output/rfcn_dcn_coco/`. -->
5. Please find more details in the config files and in our code.
## Misc.

Code has been tested under:
<!--
- Ubuntu 14.04 with a Maxwell Titan X GPU and Intel Xeon CPU E5-2620 v2 @ 2.10GHz -->
- Ubuntu 14.04 with 4 Pascal Titan X GPUs and 32 Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz
<!-- - Windows Server 2012 R2 with 8 K40 GPUs and Intel Xeon CPU E5-2650 v2 @ 2.60GHz
- Windows Server 2012 R2 with 4 Pascal Titan X GPUs and Intel Xeon CPU E5-2650 v4 @ 2.30GHz -->

## Cite

If you use our project, please cite:
```
@article{xia2017dota,
  title={DOTA: A Large-scale Dataset for Object Detection in Aerial Images},
  author={Xia, Gui-Song and Bai, Xiang and Ding, Jian and Zhu, Zhen and Belongie, Serge and Luo, Jiebo and Datcu, Mihai and Pelillo, Marcello and Zhang, Liangpei},
  journal={arXiv preprint arXiv:1711.10398},
  year={2017}
}
```
---
MXNET_VERSION: "mxnet"
output_path: "./output/rcnn/DOTA"
symbol: resnet_v1_101_rcnn
gpus: '1'
CLASS_AGNOSTIC: false
SCALES:
- 1024
- 1024
default:
  frequent: 100
  kvstore: device
network:
  pretrained: "./model/pretrained_model/resnet_v1_101"
  pretrained_epoch: 0
  PIXEL_MEANS:
  - 103.06
  - 115.90
  - 123.15
  IMAGE_STRIDE: 0
  RCNN_FEAT_STRIDE: 16
  RPN_FEAT_STRIDE: 16
  FIXED_PARAMS:
  - conv1
  - bn_conv1
  - res2
  - bn2
  - gamma
  - beta
  FIXED_PARAMS_SHARED:
  - conv1
  - bn_conv1
  - res2
  - bn2
  - res3
  - bn3
  - res4
  - bn4
  - gamma
  - beta
  ANCHOR_RATIOS:
  - 0.3
  - 0.5
  - 1
  - 2
  - 4
  ANCHOR_SCALES:
  - 8
  - 16
  - 32
  NUM_ANCHORS: 15
dataset:
  NUM_CLASSES: 16
  dataset: DOTA
  # dataset_path: "/data/dj/dota/dota-split"
  dataset_path: "/data/dj/ODAI/ODAI_test_split"
  # NUM_CLASSES: 16
  # dataset: DOTA
  # dataset_path: "/home/dj/data/DOTA-v3"
  image_set: train
  # root_path: "/data/dj/dota"
  root_path: "/data/dj/ODAI"
  test_image_set: test
  proposal: rpn
TRAIN:
  lr: 0.0005
  lr_step: '40,52'
  warmup: true
  warmup_lr: 0.00005
  # typically we will use 4000 warmup step for single GPU on VOC
  warmup_step: 4000
  begin_epoch: 0
  end_epoch: 60
  model_prefix: 'rcnn_DOTA_aligned'
  # whether resume training
  RESUME: false
  # whether flip image
  FLIP: true
  # whether shuffle image
  SHUFFLE: true
  # whether use OHEM
  ENABLE_OHEM: true
  # size of images for each device, 2 for rcnn, 1 for rpn and e2e
  BATCH_IMAGES: 1
  # e2e changes behavior of anchor loader and metric
  END2END: true
  # group images with similar aspect ratio
  ASPECT_GROUPING: true
  # R-CNN
  # rcnn rois batch size
  BATCH_ROIS: 128
  BATCH_ROIS_OHEM: 128
  # rcnn rois sampling params
  FG_FRACTION: 0.25
  FG_THRESH: 0.5
  BG_THRESH_HI: 0.5
  BG_THRESH_LO: 0.1
  # rcnn bounding box regression params
  BBOX_REGRESSION_THRESH: 0.5
  BBOX_WEIGHTS:
  - 1.0
  - 1.0
  - 1.0
  - 1.0

  # RPN anchor loader
  # rpn anchors batch size
  RPN_BATCH_SIZE: 256
  # rpn anchors sampling params
  RPN_FG_FRACTION: 0.5
  RPN_POSITIVE_OVERLAP: 0.7
  RPN_NEGATIVE_OVERLAP: 0.3
  RPN_CLOBBER_POSITIVES: false
  # rpn bounding box regression params
  RPN_BBOX_WEIGHTS:
  - 1.0
  - 1.0
  - 1.0
  - 1.0
  RPN_POSITIVE_WEIGHT: -1.0
  # used for end2end training
  # RPN proposal
  CXX_PROPOSAL: false
  RPN_NMS_THRESH: 0.7
  RPN_PRE_NMS_TOP_N: 6000
  RPN_POST_NMS_TOP_N: 300
  RPN_MIN_SIZE: 0
  # approximate bounding box regression
  BBOX_NORMALIZATION_PRECOMPUTED: true
  BBOX_MEANS:
  - 0.0
  - 0.0
  - 0.0
  - 0.0
  BBOX_STDS:
  - 0.1
  - 0.1
  - 0.2
  - 0.2
TEST:
  # use rpn to generate proposal
  HAS_RPN: true
  # size of images for each device
  BATCH_IMAGES: 1
  # RPN proposal
  CXX_PROPOSAL: false
  RPN_NMS_THRESH: 0.7
  RPN_PRE_NMS_TOP_N: 6000
  RPN_POST_NMS_TOP_N: 300
  RPN_MIN_SIZE: 0
  # RPN generate proposal
  PROPOSAL_NMS_THRESH: 0.7
  PROPOSAL_PRE_NMS_TOP_N: 20000
  PROPOSAL_POST_NMS_TOP_N: 2000
  PROPOSAL_MIN_SIZE: 0
  # RCNN nms
  NMS: 0.3
  test_epoch: 32
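Note that `NUM_ANCHORS: 15` in the config above follows from pairing the 5 `ANCHOR_RATIOS` with the 3 `ANCHOR_SCALES`. A minimal sketch of that enumeration (illustrative, not the repo's actual anchor-generation code):

```python
def anchor_grid(ratios, scales):
    """Pair every aspect ratio with every scale, as a standard RPN does."""
    return [(r, s) for r in ratios for s in scales]

# Values from the config: 5 ratios x 3 scales = 15 anchors per position
combos = anchor_grid([0.3, 0.5, 1, 2, 4], [8, 16, 32])
print(len(combos))  # 15
```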