
faster-rcnn aeon integration
use cache_dir for saving VGG weights
skip serialization of dataloader object
refactor voc_eval to skip temporary file write
update unit tests for aeon
reorganize faster-rcnn example
use config files, add kitti dataset
Hanlin Tang committed Nov 15, 2016
1 parent f352419 commit ca28fd9
Showing 16 changed files with 1,173 additions and 1,398 deletions.
4 changes: 2 additions & 2 deletions examples/faster-rcnn/NOTICE
@@ -4,13 +4,13 @@ This directory uses the open-sourced code in the following files:
voc_eval.py
generate_anchors.py
util.py
#
#
# --------------------------------------------------------
# Fast R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Ross Girshick
# --------------------------------------------------------
The mAP evaluation script and various util functions are adapted from:
https://github.com/rbgirshick/py-faster-rcnn/

95 changes: 63 additions & 32 deletions examples/faster-rcnn/README.md
@@ -1,60 +1,91 @@
## Faster-RCNN

This example demonstrates how to train and test the Faster R-CNN model on the [PASCAL VOC](http://host.robots.ox.ac.uk/pascal/VOC/) dataset. This object localization model learns to detect objects in natural scenes, providing bounding boxes and category information for each object.

Reference:

"Faster R-CNN"\
http://arxiv.org/abs/1506.01497\
https://github.com/rbgirshick/py-faster-rcnn
"Faster R-CNN"http://arxiv.org/abs/1506.01497, https://github.com/rbgirshick/py-faster-rcnn

### Data preparation

Note: this example requires installing our new dataloader, [aeon](https://github.com/NervanaSystems/aeon). For more information, see the [aeon documentation](http://aeon.nervanasys.com/index.html/).

First, download and unzip the PASCAL VOC 2007 [training](http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar) and [testing](http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar) datasets to a local directory, which we call `$PASCAL_DATA_PATH`. These datasets consist of images of scenes and corresponding annotation files with bounding box and category information for each object in the scene.
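
A convenience sketch of this step is below (a minimal illustration assuming Python 3; the target directory is hypothetical, and any download tool works just as well):

```
# Minimal sketch (not part of the repo): fetch and unpack the two VOC2007
# tarballs into a local data directory. The path here is illustrative.
import os
import tarfile
from urllib.request import urlretrieve

PASCAL_DATA_PATH = os.path.expanduser('~/data/pascalvoc')
urls = ['http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar',
        'http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar']

os.makedirs(PASCAL_DATA_PATH, exist_ok=True)
for url in urls:
    fname = os.path.join(PASCAL_DATA_PATH, os.path.basename(url))
    if not os.path.exists(fname):
        urlretrieve(url, fname)
    with tarfile.open(fname) as tar:
        tar.extractall(PASCAL_DATA_PATH)
```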

Then, run the `ingest_pascalvoc.py` script to ingest the data:

```
python ingest_pascalvoc.py --input_dir $PASCAL_DATA_PATH
```
The above script will:

1. Convert the annotations from XML to the json format expected by our dataloader. The converted json files are written to the folders `Annotations-json` and `Annotations-json-inference`. When training the model, we exclude objects with the 'difficult' metadata tag; when evaluating the model, however, 'difficult' objects are included (following the reference above), so we create separate folders for the two conditions.

2. Write manifest files for the training and testing sets. These are written to `$PASCAL_DATA_PATH` (see the sketch after this list for one way such a manifest could be produced).

3. Write a configuration file to pass to neon. The config file is written to the `faster_rcnn` folder as `pascalvoc.cfg`. The config file contains the paths to the manifest files, as well as some other dataset-specific settings. For example:

```
manifest = [train:/usr/local/data/VOCdevkit/VOC2007/trainval.csv, val:/usr/local/data/VOCdevkit/VOC2007/val.csv]
epochs = 14
height = 1000
width = 1000
batch_size = 1
```
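
Each manifest is a plain CSV pairing an image with its annotation file. A hypothetical sketch of producing one (the directory layout mirrors the ingest step above; the exact on-disk schema is defined by aeon, and paths are illustrative):

```
# Hypothetical sketch: pair each image with its json annotation in a CSV
# manifest. Not the repo's implementation; paths are illustrative.
import glob
import os

data_dir = '/usr/local/data/VOCdevkit/VOC2007'
with open(os.path.join(data_dir, 'trainval.csv'), 'w') as manifest:
    for img in sorted(glob.glob(os.path.join(data_dir, 'JPEGImages', '*.jpg'))):
        key = os.path.splitext(os.path.basename(img))[0]
        annot = os.path.join(data_dir, 'Annotations-json', key + '.json')
        if os.path.exists(annot):
            manifest.write('{0},{1}\n'.format(img, annot))
```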

#### Training

To train the model on the PASCAL VOC 2007 dataset, use:
```
python examples/faster-rcnn/train.py -c <path-to-config-file> --verbose --rng_seed 0 -s frcn_model.prm
```

The above command will train the model for 14 epochs (~70K iterations), saving the model to the file `frcn_model.prm`. Note that the training uses a minibatch of 1 image.

By default, the Faster R-CNN model has several convolution and linear layers that are initialized from a pre-trained VGG16 model. These VGG weights will be automatically downloaded from the [neon model zoo](https://github.com/NervanaSystems/ModelZoo) and saved in `$PASCAL_DATA_PATH/pascalvoc_cache/`.

Note: the config file passes its contents to the python script as command-line arguments. The equivalent command, passing the arguments in directly, is:
```
python examples/faster-rcnn/train.py --manifest train:$PASCAL_DATA_PATH/VOC2007/trainval.csv \
--manifest val:$PASCAL_DATA_PATH/VOC2007/val.csv \
-e 14 --height 1000 --width 1000 --batch_size 1 --verbose --rng_seed 0 -s frcn_model.prm
```

### Testing

To evaluate the trained model using the mean Average Precision (mAP) metric, use the command below.

```
python examples/faster-rcnn/inference.py -c <path-to-config-file> --model_file frcn_model.prm
```
A fully trained model should yield an mAP above 69%. The mAP evaluation script is adapted from https://github.com/rbgirshick/py-faster-rcnn/.
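
For orientation, the VOC2007 flavor of average precision that `voc_eval.py` computes is the 11-point interpolated AP, sketched below (a minimal illustration assuming precomputed precision/recall numpy arrays; not the repo's exact code):

```
# Minimal illustration of VOC2007's 11-point interpolated average precision.
# `recall` and `precision` are assumed to be precomputed numpy arrays.
import numpy as np

def voc_ap_11point(recall, precision):
    ap = 0.0
    for t in np.arange(0.0, 1.1, 0.1):
        # highest precision at recall >= t, or 0 if that recall is never reached
        p = np.max(precision[recall >= t]) if np.any(recall >= t) else 0.0
        ap += p / 11.0
    return ap
```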

### Other files

This folder includes several other key files, which we describe here:
- `faster_rcnn.py`: Functions for creating the Faster R-CNN network and transforming the output to bounding box predictions.
- `roi_pooling.py`: ROI-pooling layer.
- `proposal_layer.py`: Proposal layer.
- `objectlocalization.py`: Dataset-specific configurations and settings.
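
As a point of reference, the step of transforming network output into bounding box predictions corresponds to the standard Faster R-CNN box decoding from the paper, sketched here (an illustration of the published formulation, not this repo's code; `boxes` are proposals and `deltas` the predicted offsets):

```
# Standard Faster R-CNN box decoding (illustrative sketch).
import numpy as np

def bbox_transform_inv(boxes, deltas):
    # boxes: (N, 4) anchors/proposals [x1, y1, x2, y2]; deltas: (N, 4) [dx, dy, dw, dh]
    widths = boxes[:, 2] - boxes[:, 0] + 1.0
    heights = boxes[:, 3] - boxes[:, 1] + 1.0
    ctr_x = boxes[:, 0] + 0.5 * widths
    ctr_y = boxes[:, 1] + 0.5 * heights
    dx, dy, dw, dh = deltas[:, 0], deltas[:, 1], deltas[:, 2], deltas[:, 3]
    pred_ctr_x = dx * widths + ctr_x    # shift the box center
    pred_ctr_y = dy * heights + ctr_y
    pred_w = np.exp(dw) * widths        # rescale width and height
    pred_h = np.exp(dh) * heights
    return np.stack([pred_ctr_x - 0.5 * pred_w, pred_ctr_y - 0.5 * pred_h,
                     pred_ctr_x + 0.5 * pred_w, pred_ctr_y + 0.5 * pred_h], axis=1)
```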

Several utility functions are also included:
- `voc_eval.py`: Computes the mAP on the PASCAL VOC dataset.
- `util.py`: Bounding box calculations and non-max suppression.
- `generate_anchors.py`: Generates anchor boxes.
- `convert_xml_to_json.py`: Converts PASCAL VOC XML annotations to json format.
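
For intuition, the greedy non-max suppression that `util.py` provides can be sketched as follows (illustrative only, not the repo's implementation; boxes are assumed to be `[x1, y1, x2, y2]` with inclusive pixel coordinates):

```
# Greedy non-max suppression (illustrative sketch).
import numpy as np

def nms(boxes, scores, iou_threshold=0.3):
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]  # indices by descending score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        # intersection of the top-scoring box with the remaining boxes
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # drop boxes that overlap the kept box too much
        order = order[np.where(iou <= iou_threshold)[0] + 1]
    return keep
```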

### Tests
There are a few unit tests for components of the model, set up using the py.test framework. To run these tests, use the command below. The unit tests require defining the environment variable
```
py.test examples/faster-rcnn/tests
```

### Other datasets

To extend Faster-RCNN to other datasets, write a script that ingests the data by converting the annotations into json format and generating a manifest file according to the specifications in our [aeon documentation](http://aeon.nervanasys.com/index.html/). As an example, we include the ingest script for the KITTI dataset, `ingest_kitti.py`, and the configuration class `KITTI` in `objectlocalization.py`.
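
As a sketch of the target annotation format, a single json file mirroring what `convert_xml_to_json.py` produces might look like the following (field values are illustrative; consult the aeon documentation for the authoritative schema):

```
# Hypothetical annotation for one image, mirroring the converter's output.
# All values are illustrative.
import json

annotation = {
    'version': {'major': 1, 'minor': 0},
    'size': {'width': 1242, 'height': 375, 'depth': 3},
    'object': [{
        'name': 'car',
        'difficult': False,
        'truncated': False,
        'bndbox': {'xmin': 100, 'ymin': 120, 'xmax': 300, 'ymax': 220},
    }],
}

with open('000001.json', 'w') as f:
    json.dump(annotation, f, sort_keys=True, indent=4)
```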



149 changes: 149 additions & 0 deletions examples/faster-rcnn/convert_xml_to_json.py
@@ -0,0 +1,149 @@
#!/usr/bin/python

import json
import glob
import collections
import os
from os.path import join
import xml.etree.ElementTree as et
from collections import defaultdict
import argparse


# http://stackoverflow.com/questions/7684333/converting-xml-to-dictionary-using-elementtree
def etree_to_dict(t):
    # Recursively convert an ElementTree element into nested dicts/lists.
    d = {t.tag: {} if t.attrib else None}
    children = list(t)
    if children:
        dd = defaultdict(list)
        for dc in map(etree_to_dict, children):
            for k, v in dc.items():
                dd[k].append(v)
        d = {t.tag: {k: v[0] if len(v) == 1 else v for k, v in dd.items()}}
    if t.attrib:
        d[t.tag].update(('@' + k, v) for k, v in t.attrib.items())
    if t.text:
        text = t.text.strip()
        if children or t.attrib:
            if text:
                d[t.tag]['#text'] = text
        else:
            d[t.tag] = text
    return d


def validate_metadata(jobj, file):
    # Check that the object list (and any part lists) are sequences, not scalars.
    boxlist = jobj['object']
    if not isinstance(boxlist, collections.Sequence):
        print('{0} is not a sequence'.format(file))
        return False

    for box in boxlist:
        if 'part' in box:
            parts = box['part']
            if not isinstance(parts, collections.Sequence):
                print('parts {0} is not a sequence'.format(file))
                return False
    return True


def convert_xml_to_json(input_path, output_path, difficult):

    if not os.path.exists(output_path):
        os.makedirs(output_path)
    onlyfiles = glob.glob(join(input_path, '*.xml'))
    onlyfiles.sort()
    for file in onlyfiles:  # glob returns paths that already include input_path
        outfile = join(output_path, os.path.basename(file))
        outfile = os.path.splitext(outfile)[0] + '.json'
        trimmed = parse_single_file(file, difficult)
        if validate_metadata(trimmed, file):
            result = json.dumps(trimmed, sort_keys=True, indent=4, separators=(',', ': '))
            with open(outfile, 'w') as f:
                f.write(result)
        else:
            print('error parsing metadata {0}'.format(file))


def parse_single_file(path, difficult):
    tree = et.parse(path)
    root = tree.getroot()
    d = etree_to_dict(root)
    trimmed = d['annotation']
    olist = trimmed['object']
    if not isinstance(olist, collections.Sequence):
        trimmed['object'] = [olist]
        olist = trimmed['object']
    size = trimmed['size']

    # Add version number to json
    trimmed['version'] = {'major': 1, 'minor': 0}

    # convert all numbers from string representation to number so json does not quote them
    # all of the bounding box numbers are one-based, so subtract 1
    size['width'] = int(size['width'])
    size['height'] = int(size['height'])
    size['depth'] = int(size['depth'])
    width = trimmed['size']['width']
    height = trimmed['size']['height']
    for obj in olist:
        obj['difficult'] = int(obj['difficult']) != 0
        obj['truncated'] = int(obj['truncated']) != 0
        box = obj['bndbox']
        box['xmax'] = int(box['xmax']) - 1
        box['xmin'] = int(box['xmin']) - 1
        box['ymax'] = int(box['ymax']) - 1
        box['ymin'] = int(box['ymin']) - 1
        if 'part' in obj:
            # use a separate name for the part boxes so the object-level
            # bounds checks below are not run against the last part's box
            for part in obj['part']:
                pbox = part['bndbox']
                pbox['xmax'] = float(pbox['xmax']) - 1
                pbox['xmin'] = float(pbox['xmin']) - 1
                pbox['ymax'] = float(pbox['ymax']) - 1
                pbox['ymin'] = float(pbox['ymin']) - 1
        # sanity-check that the object box lies within the image bounds
        xmax = box['xmax']
        xmin = box['xmin']
        ymax = box['ymax']
        ymin = box['ymin']
        if xmax > width - 1:
            print('xmax {0} exceeds width {1}'.format(xmax, width))
        if xmin < 0:
            print('xmin {0} is negative'.format(xmin))
        if ymax > height - 1:
            print('ymax {0} exceeds height {1}'.format(ymax, height))
        if ymin < 0:
            print('ymin {0} is negative'.format(ymin))

    # exclude difficult objects
    if not difficult:
        trimmed['object'] = [o for o in trimmed['object'] if not o['difficult']]

    return trimmed


def main(args):
    input_path = args.input
    output_path = args.output
    parse_file = args.parse

    if parse_file:
        print(parse_file)
        parsed = parse_single_file(parse_file, args.difficult)
        json1 = json.dumps(parsed, sort_keys=True, indent=4, separators=(',', ': '))
        print(json1)
    elif input_path:
        convert_xml_to_json(input_path, output_path, args.difficult)


if __name__ == "__main__":
    # option strings must be passed to argparse as separate arguments
    parser = argparse.ArgumentParser(description="convert xml to json for pascalvoc dataset")
    parser.add_argument('-i', '--input', dest='input', help='input directory with xml files.')
    parser.add_argument('-o', '--output', dest='output', help='output directory of json files.')
    parser.add_argument('-p', '--parse', dest='parse', help='parse a single xml file.')
    parser.add_argument('--difficult', dest='difficult', action='store_true',
                        help='include objects with the difficult tag. Default is to exclude.')

    args = parser.parse_args()
    main(args)
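
For example, the README's ingest step invokes this converter along the lines of `python convert_xml_to_json.py -i <xml-dir> -o <json-dir>`, adding `--difficult` when generating the inference-time annotations (the exact invocation inside `ingest_pascalvoc.py` may differ).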
