Merged
113 commits
8cf88f3
added bidirectional
oindrilasaha Mar 5, 2020
487e9bf
bidirectional in BaseRNN
oindrilasaha Mar 13, 2020
cc89c3d
updated all rnn for new bidirectional
oindrilasaha Mar 13, 2020
e4215a3
debugging for fastgrnncuda
oindrilasaha Mar 18, 2020
461e573
fix for fastgrnncuda
oindrilasaha Mar 19, 2020
880b4d7
visual wakeword
oindrilasaha Mar 25, 2020
ccfad8c
visual wakewords evaluation
oindrilasaha Apr 4, 2020
83c8e28
visual wakeword evaluation
oindrilasaha Apr 5, 2020
8dbbdfe
updated readme for eval
oindrilasaha Apr 5, 2020
5c83c5c
face detection
oindrilasaha Apr 21, 2020
6b27551
update face detection
oindrilasaha Apr 23, 2020
e29ca14
rnn edit
oindrilasaha Apr 23, 2020
525801c
test update
oindrilasaha Apr 23, 2020
3ddf41c
model loading change
oindrilasaha Apr 23, 2020
e080661
update readme
oindrilasaha Apr 23, 2020
df61e76
update eval tools in readme
oindrilasaha Apr 23, 2020
3dc8f89
update train
oindrilasaha Apr 23, 2020
eaebfa0
readme
oindrilasaha Apr 23, 2020
691af2f
readme
oindrilasaha Apr 23, 2020
2e68e57
readme
oindrilasaha Apr 23, 2020
9f5cef7
requirements
oindrilasaha Apr 23, 2020
8d32771
requirements
oindrilasaha Apr 23, 2020
f144403
readme
oindrilasaha Apr 23, 2020
01aba90
rnn
oindrilasaha Apr 23, 2020
6be3f50
update s3fd_net
oindrilasaha Apr 24, 2020
d955860
update s3fd_net
oindrilasaha Apr 24, 2020
cef55a3
update s3fd_net
oindrilasaha Apr 24, 2020
7348f33
train
oindrilasaha Apr 24, 2020
58c51f7
train
oindrilasaha Apr 24, 2020
b02cd7e
add additional args
oindrilasaha Apr 25, 2020
eb4daf1
add additional args
oindrilasaha Apr 25, 2020
ef90977
arg changes in wider_test
oindrilasaha Apr 25, 2020
a42b02c
added arg for using new ckpt
oindrilasaha Apr 25, 2020
b609e40
readme
oindrilasaha Apr 25, 2020
e10fdc9
remove old modelsupport
oindrilasaha Apr 25, 2020
a2b3948
readme
oindrilasaha Apr 27, 2020
83df3dc
readme
oindrilasaha Apr 27, 2020
6aba538
requirements
oindrilasaha Apr 27, 2020
a85111d
readme
oindrilasaha Apr 27, 2020
a9a7372
eval on all format
oindrilasaha Apr 29, 2020
0125999
support for calculating MAP
oindrilasaha Apr 29, 2020
eab81c3
update readme
oindrilasaha Apr 29, 2020
793404c
Update README.md
harsha-simhadri Apr 30, 2020
6101a76
readme
oindrilasaha Apr 30, 2020
80cb43a
readme
oindrilasaha Apr 30, 2020
6877035
readme
oindrilasaha Apr 30, 2020
d94bfe2
readme
oindrilasaha Apr 30, 2020
df9146d
fix for warnings
oindrilasaha Apr 30, 2020
0a6ddb6
readme
oindrilasaha Apr 30, 2020
b96b09f
readme scores
oindrilasaha May 1, 2020
a25f19f
Merge branch 'master' into oindrila-rnn
May 1, 2020
8f113aa
add dump weights and traces support
oindrilasaha May 3, 2020
4d73031
readme
oindrilasaha May 3, 2020
ffc7279
remove eval warnings
oindrilasaha May 7, 2020
0a184af
eval remove import warnings
oindrilasaha May 7, 2020
c9eb23e
readme changes
oindrilasaha May 7, 2020
283687a
Merge branch 'oindrila-rnn' of https://github.com/microsoft/EdgeML in…
oindrilasaha May 7, 2020
052896b
readme changes
oindrilasaha May 7, 2020
0bf6102
support for qvga monochrome
oindrilasaha May 9, 2020
d755fbd
Merge branch 'oindrila-rnn' of https://github.com/microsoft/EdgeML in…
oindrilasaha May 9, 2020
bf872cc
readme update
oindrilasaha May 9, 2020
b4d7ca7
readme update
oindrilasaha May 9, 2020
f682fb0
Update README.md
harsha-simhadri May 9, 2020
60dfa87
readme update
oindrilasaha May 10, 2020
afe6620
environment key update
oindrilasaha May 11, 2020
ad475df
config files
oindrilasaha May 11, 2020
d11b85c
update both config files text
oindrilasaha May 11, 2020
f6e8cb7
change architecture
oindrilasaha May 11, 2020
09a03f3
readme update
oindrilasaha May 11, 2020
d4a6222
quantized cpp rnnpool
harsha-simhadri May 15, 2020
2b27432
Update README.md
harsha-simhadri May 15, 2020
72f48f1
Update README.md
harsha-simhadri May 15, 2020
99c7d61
Update README.md
harsha-simhadri May 15, 2020
d7cb08c
smaller model for qvga
oindrilasaha May 26, 2020
d3c3477
Merge branch 'oindrila-rnn' of https://github.com/microsoft/EdgeML in…
oindrilasaha May 26, 2020
3fa8400
Update RPool_Face_QVGA_monochrome.py
oindrilasaha Jun 9, 2020
86eee96
update to ssd code
oindrilasaha Jun 16, 2020
c0883b2
update to init
oindrilasaha Jun 16, 2020
2b69b94
update to init
oindrilasaha Jun 16, 2020
ee1d56d
update to dataloader
oindrilasaha Jun 16, 2020
a11ad1b
tf code for face detection
oindrilasaha Jun 19, 2020
3497cb8
tf code for face detection
oindrilasaha Jun 19, 2020
8bc3b66
Merge branch 'oindrila-rnn' of https://github.com/microsoft/EdgeML in…
oindrilasaha Jun 19, 2020
eb9a498
add tf face detection code
oindrilasaha Jun 19, 2020
62a2edf
eval file
oindrilasaha Jun 19, 2020
f710901
fix weights and detect function
oindrilasaha Jun 19, 2020
3bc8ee7
Delete factory.py
oindrilasaha Jun 19, 2020
a694ea9
Update RPool_Face_QVGA_monochrome.py
oindrilasaha Jul 2, 2020
21588cd
Update RPool_Face_QVGA_monochrome.py
oindrilasaha Aug 18, 2020
c6a4d15
Update RPool_Face_C.py
oindrilasaha Aug 18, 2020
087d6a8
Update RPool_Face_Quant.py
oindrilasaha Aug 18, 2020
b41afc7
Update augmentations.py
oindrilasaha Aug 18, 2020
06fc9bc
Update detection.py
oindrilasaha Aug 18, 2020
4ae117c
Update eval.py
oindrilasaha Aug 18, 2020
7984df9
vww updates
oindrilasaha Aug 19, 2020
7f499bd
removed tf code
oindrilasaha Aug 19, 2020
0e861f9
Update fastcell_example.py
oindrilasaha Aug 20, 2020
0841b26
Update widerface.py
oindrilasaha Aug 20, 2020
d6160ac
Update rnnpool.py
oindrilasaha Aug 20, 2020
8509657
Update fastTrainer.py
oindrilasaha Aug 20, 2020
a175966
Update fastTrainer.py
oindrilasaha Aug 20, 2020
c315662
Update fastTrainer.py
oindrilasaha Aug 20, 2020
6bd9c11
Update model_mobilenet_2rnnpool.py
oindrilasaha Aug 20, 2020
57be74b
Update model_mobilenet_rnnpool.py
oindrilasaha Aug 20, 2020
404648b
Delete top_level.txt
oindrilasaha Aug 20, 2020
4586336
Delete dependency_links.txt
oindrilasaha Aug 20, 2020
7af54a9
remove egg-info
oindrilasaha Aug 20, 2020
ca9a3a0
Merge branch 'oindrila-rnn' of https://github.com/microsoft/EdgeML in…
oindrilasaha Aug 20, 2020
335506b
Update fastcell_example.py
oindrilasaha Aug 21, 2020
bf4eabf
file copyright edits
oindrilasaha Aug 21, 2020
892be38
Update README.md
oindrilasaha Aug 21, 2020
f6ca1e6
delete output blank file
oindrilasaha Sep 29, 2020
df66d9f
remove input trace file
harsha-simhadri Oct 2, 2020
146 changes: 146 additions & 0 deletions examples/pytorch/vision/Face_Detection/README.md
@@ -0,0 +1,146 @@
# Code for Face Detection experiments with RNNPool
## Requirements
1. Follow the instructions [here](https://github.com/microsoft/EdgeML/blob/master/pytorch/README.md) to install the EdgeML operators and their requirements.
2. Install the requirements for the face detection model using
``` pip install -r requirements.txt ```
We have tested the installation and the code on Ubuntu 18.04 with CUDA 10.2 and cuDNN 7.6.
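
A minimal sanity check (ours, not part of the repo) that PyTorch can see the GPU before you start training:

```python
import torch

# Confirm the PyTorch build and that a CUDA device is visible
print(torch.__version__, torch.version.cuda)
assert torch.cuda.is_available(), 'no CUDA device visible to PyTorch'
```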

## Dataset
1. Download the WIDER face dataset images and annotations from http://shuoyang1213.me/WIDERFACE/ and place them all in a folder named 'WIDER_FACE'. That is, download WIDER_train.zip, WIDER_test.zip, WIDER_val.zip, and wider_face_split.zip, place them in the WIDER_FACE folder, and unzip the files using:

```shell
cd WIDER_FACE
unzip WIDER_train.zip
unzip WIDER_test.zip
unzip WIDER_val.zip
unzip wider_face_split.zip
cd ..

```

2. In `data/config.py`, set `_C.HOME` to the parent directory of the above folder and `_C.FACE.WIDER_DIR` to the folder path.
That is, if the WIDER_FACE folder is created in the /mnt folder, then `_C.HOME='/mnt'` and
`_C.FACE.WIDER_DIR='/mnt/WIDER_FACE'`.
Similarly, set `_C.HOME` and `_C.FACE.WIDER_DIR` in `data/config_qvga.py`.
3. Run
``` python prepare_wider_data.py ```
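
Before running step 3, an optional sanity check (ours, not part of the repo) that the layout from steps 1 and 2 is in place:

```python
import os

WIDER_ROOT = '/mnt/WIDER_FACE'  # set this to your _C.FACE.WIDER_DIR
for sub in ('WIDER_train', 'WIDER_val', 'WIDER_test', 'wider_face_split'):
    path = os.path.join(WIDER_ROOT, sub)
    print(path, 'ok' if os.path.isdir(path) else 'MISSING')
```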


# Usage

## Training

```shell

IS_QVGA_MONO=0 python train.py --batch_size 32 --model_arch RPool_Face_Quant --cuda True --multigpu True --save_folder weights/ --epochs 300 --save_frequency 5000

```

For QVGA:
```shell

IS_QVGA_MONO=1 python train.py --batch_size 64 --model_arch RPool_Face_QVGA_monochrome --cuda True --multigpu True --save_folder weights/ --epochs 300 --save_frequency 5000

```
This saves a checkpoint every '--save_frequency' iterations to a weight file ending in 'checkpoint.pth', and the weights for the best state so far to a file ending in 'best_state.pth', both inside '--save_folder'. To resume training from a checkpoint, add '--resume <checkpoint_name>.pth' to the above command. For example,


```shell

IS_QVGA_MONO=1 python train.py --batch_size 64 --model_arch RPool_Face_QVGA_monochrome --cuda True --multigpu True --save_folder weights/ --epochs 300 --save_frequency 5000 --resume <checkpoint_name>.pth

```

If IS_QVGA_MONO is 0, training input images will be 640x640 and RGB.
If IS_QVGA_MONO is 1, training input images will be 320x320 and converted to monochrome.

Input images for training are cropped and reshaped to squares to maintain consistency with [S3FD](https://arxiv.org/abs/1708.05237). Testing, however, can be done on images of any size: we resize test images so that their area equals that of VGA (640x480) or QVGA (320x240), which leaves the aspect ratio unchanged.
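
For illustration, a minimal sketch (ours, not the repo's code) of this equal-area resize:

```python
import math

def equal_area_dims(h, w, target_area=640 * 480):
    """Scale (h, w) so that h*w ~= target_area while keeping the aspect ratio."""
    s = math.sqrt(target_area / (h * w))
    return round(h * s), round(w * s)

print(equal_area_dims(1080, 1920))  # -> (416, 739): VGA area, same aspect ratio
```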

The RPool_Face_QVGA_monochrome architecture is for the QVGA monochrome format, while RPool_Face_C and RPool_Face_Quant are for the VGA RGB format.


## Test
There are two modes of testing the trained model -- the evaluation mode to generate bounding boxes for a set of sample images, and the test mode to compute statistics like mAP scores.

#### Evaluation Mode

Given a set of images in <your_image_folder>, `eval.py` generates bounding boxes around faces (where the confidence is higher than a certain threshold) and writes the annotated images to <your_save_folder>. To evaluate the `rpool_face_best_state.pth` model (stored in ./weights), execute the following command:

```shell
IS_QVGA_MONO=0 python eval.py --model_arch RPool_Face_Quant --model ./weights/RPool_Face_Quant_best_state.pth --image_folder <your_image_folder> --save_dir <your_save_folder>
```

For QVGA:
```shell
IS_QVGA_MONO=1 python eval.py --model_arch RPool_Face_QVGA_monochrome --model ./weights/RPool_Face_QVGA_monochrome_best_state.pth --image_folder <your_image_folder> --save_dir <your_save_folder>
```

This will save images to <your_save_folder> with bounding boxes drawn around confidently detected faces. Here is an example image with a single bounding box.

![Camera: Himax0360](imrgb20ft.png)

If IS_QVGA_MONO=0, the evaluation code accepts an image of any size and resizes it to 640x480x3 while preserving the original aspect ratio.

If IS_QVGA_MONO=1, the evaluation code accepts an image of any size, then resizes and converts it to monochrome to produce a 320x240x1 image, again preserving the original aspect ratio.

#### WIDER Set Test
In this mode, we test the trained model against the provided WIDER_FACE validation and test datasets.

First, run the following to generate the model's predictions and store the output in the '--save_folder' folder.

```shell
IS_QVGA_MONO=0 python wider_test.py --model_arch RPool_Face_Quant --model ./weights/RPool_Face_Quant_best_state.pth --save_folder rpool_face_quant_val --subset val
```

For QVGA:
```shell
IS_QVGA_MONO=1 python wider_test.py --model_arch RPool_Face_QVGA_monochrome --model ./weights/RPool_Face_QVGA_monochrome_best_state.pth --save_folder rpool_face_qvgamono_val --subset val
```

The above command generates predictions for each image in the validation set. A separate prediction file is written for each image (an image_name.txt file in the appropriate folder). The first line of the prediction file contains the total number of boxes identified.
Each subsequent line corresponds to an identified box and carries five numbers: the width of the box, the height of the box, the x-axis offset, the y-axis offset, and the confidence value for the presence of a face in the box.
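
For reference, a hedged sketch (ours, not part of the repo) that parses one such prediction file following the format described above; verify the exact column order against `wider_test.py` before relying on it:

```python
def read_wider_predictions(path):
    """Parse a wider_test.py prediction file: first line = box count, then one
    box per line (per the README: width, height, x-offset, y-offset, confidence)."""
    with open(path) as f:
        lines = f.read().splitlines()
    num_boxes = int(lines[0])
    return [[float(v) for v in line.split()] for line in lines[1:1 + num_boxes]]
```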

If IS_QVGA_MONO=1, testing is done on images converted to monochrome QVGA; if IS_QVGA_MONO=0, testing is done on VGA RGB images.

The RPool_Face_QVGA_monochrome architecture is for the QVGA monochrome format, while RPool_Face_C and RPool_Face_Quant are for the VGA RGB format.

###### For calculating mAP scores:
Using these boxes, we can now compute the standard mAP (mean average precision) score that is widely used in this literature (see [here](https://medium.com/@jonathan_hui/map-mean-average-precision-for-object-detection-45c121a31173) for more details) as follows:

1. Download eval_tools.zip from http://shuoyang1213.me/WIDERFACE/support/eval_script/eval_tools.zip and unzip it into a folder of the same name in this directory.

Example code:

```shell
wget http://shuoyang1213.me/WIDERFACE/support/eval_script/eval_tools.zip
unzip eval_tools.zip
```

2. Set up scripts that use the MATLAB '.mat' ground-truth files in the eval_tools/ground_truth folder for mAP calculation. The following installs Python files that provide the same functionality as the '.m' MATLAB scripts in the eval_tools folder:
```shell
cd eval_tools
git clone https://github.com/wondervictor/WiderFace-Evaluation.git
cd WiderFace-Evaluation
python3 setup.py build_ext --inplace
```

3. Run ```python3 evaluation.py -p <your_save_folder> -g <ground truth dir>``` in the WiderFace-Evaluation folder,

where <your_save_folder> is the '--save_folder' used for `wider_test.py` above and <ground truth dir> is the subfolder `eval_tools/ground_truth`. That is, in the WiderFace-Evaluation directory, run:

```shell
python3 evaluation.py -p <your_save_folder> -g ../ground_truth
```
This script outputs the mAP for the WIDER-easy, WIDER-medium, and WIDER-hard subsets of the dataset. Our best performance using the RPool_Face_Quant model is: 0.80 (WIDER-easy), 0.78 (WIDER-medium), 0.53 (WIDER-hard).


##### Dump RNNPool Input-Output Traces and Weights

To save the model weights and/or the input-output pairs for each patch passing through RNNPool in NumPy format, use the command below. Put the images for which you want to save traces in <your_image_folder>. Specify the output folder for saving the model weights in NumPy format in <your_save_model_numpy_folder>, and the output folder for saving the RNNPool input-output traces in NumPy format in <your_save_traces_numpy_folder>. Note that input traces will be saved in a folder named 'inputs' and output traces in a folder named 'outputs' inside <your_save_traces_numpy_folder>.

```shell
python3 dump_model.py --model ./weights/RPool_Face_QVGA_monochrome_best_state.pth --model_arch RPool_Face_Quant --image_folder <your_image_folder> --save_model_npy_dir <your_save_model_numpy_folder> --save_traces_npy_dir <your_save_traces_numpy_folder>
```
To save only the model weights, do not specify --save_traces_npy_dir. To save only the traces, do not specify --save_model_npy_dir.
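
A hedged sketch (ours, not part of the repo) for inspecting the saved traces; the exact file names inside 'inputs' and 'outputs' are not specified above, so the snippet simply walks whatever is there:

```python
import os
import numpy as np

traces_dir = '<your_save_traces_numpy_folder>'  # as passed to dump_model.py
for split in ('inputs', 'outputs'):
    folder = os.path.join(traces_dir, split)
    for fname in sorted(os.listdir(folder))[:3]:  # peek at the first few traces
        arr = np.load(os.path.join(folder, fname))
        print(split, fname, arr.shape, arr.dtype)
```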

This code builds upon https://github.com/yxlijun/S3FD.pytorch.
31 changes: 31 additions & 0 deletions examples/pytorch/vision/Face_Detection/data/__init__.py
@@ -0,0 +1,31 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT license.

from .widerface import WIDERDetection

from data.choose_config import cfg
cfg = cfg.cfg


import torch


def detection_collate(batch):
"""Custom collate fn for dealing with batches of images that have a different
number of associated object annotations (bounding boxes).

Arguments:
batch: (tuple) A tuple of tensor images and lists of annotations

Return:
A tuple containing:
1) (tensor) batch of images stacked on their 0 dim
2) (list of tensors) annotations for a given image are stacked on
0 dim
"""
targets = []
imgs = []
for sample in batch:
imgs.append(sample[0])
targets.append(torch.FloatTensor(sample[1]))
return torch.stack(imgs, 0), targets
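
As a usage sketch (ours, not part of this PR), `detection_collate` plugs into a standard PyTorch DataLoader; the `WIDERDetection` constructor arguments below follow the training script but are an assumption here:

```python
import os
os.environ.setdefault('IS_QVGA_MONO', '0')  # the data package reads this on import

from torch.utils.data import DataLoader
from data import WIDERDetection, detection_collate
from data.config import cfg

# Assumed constructor signature (list_file, mode) -- check train.py in this PR
dataset = WIDERDetection(cfg.FACE.TRAIN_FILE, mode='train')
loader = DataLoader(dataset, batch_size=32, shuffle=True,
                    collate_fn=detection_collate)

images, targets = next(iter(loader))
# images: (32, C, H, W) tensor; targets: list of 32 FloatTensors,
# one (num_faces_i, 5) tensor of boxes plus a label per image.
```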
15 changes: 15 additions & 0 deletions examples/pytorch/vision/Face_Detection/data/choose_config.py
@@ -0,0 +1,15 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT license.

import os
from importlib import import_module

IS_QVGA_MONO = os.environ['IS_QVGA_MONO']
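# IS_QVGA_MONO is set inline in the README commands ('IS_QVGA_MONO=1 python ...');
# '1' selects data/config_qvga.py (320x320 monochrome), anything else selects
# data/config.py (640x640 RGB). os.environ[...] raises KeyError if it is unset.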


name = 'config'
if IS_QVGA_MONO == '1':
name = name + '_qvga'


cfg = import_module('data.' + name)
65 changes: 65 additions & 0 deletions examples/pytorch/vision/Face_Detection/data/config.py
@@ -0,0 +1,65 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT license.

import os
from easydict import EasyDict
import numpy as np


_C = EasyDict()
cfg = _C
# data augmentation config
_C.expand_prob = 0.5
_C.expand_max_ratio = 4
_C.hue_prob = 0.5
_C.hue_delta = 18
_C.contrast_prob = 0.5
_C.contrast_delta = 0.5
_C.saturation_prob = 0.5
_C.saturation_delta = 0.5
_C.brightness_prob = 0.5
_C.brightness_delta = 0.125
_C.data_anchor_sampling_prob = 0.5
_C.min_face_size = 6.0
_C.apply_distort = True
_C.apply_expand = False
_C.img_mean = np.array([104., 117., 123.])[:, np.newaxis, np.newaxis].astype(
'float32')
_C.resize_width = 640
_C.resize_height = 640
_C.scale = 1 / 127.0
_C.anchor_sampling = True
_C.filter_min_face = True


_C.IS_MONOCHROME = False


# anchor config
_C.FEATURE_MAPS = [160, 80, 40, 20, 10, 5]
_C.INPUT_SIZE = 640
_C.STEPS = [4, 8, 16, 32, 64, 128]
_C.ANCHOR_SIZES = [16, 32, 64, 128, 256, 512]
_C.CLIP = False
_C.VARIANCE = [0.1, 0.2]

# detection config
_C.NMS_THRESH = 0.3
_C.NMS_TOP_K = 5000
_C.TOP_K = 750
_C.CONF_THRESH = 0.01

# loss config
_C.NEG_POS_RATIOS = 3
_C.NUM_CLASSES = 2
_C.USE_NMS = True

# dataset config
_C.HOME = '/mnt/' ## change here ----------

# face config
_C.FACE = EasyDict()
_C.FACE.TRAIN_FILE = './data/face_train.txt'
_C.FACE.VAL_FILE = './data/face_val.txt'
_C.FACE.WIDER_DIR = '/mnt/WIDER_FACE' ## change here ---------
_C.FACE.OVERLAP_THRESH = [0.1, 0.35, 0.5]
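
As an illustration (ours, not part of this PR) of what the anchor config implies, assuming one anchor per feature-map cell as the one-to-one pairing of FEATURE_MAPS and ANCHOR_SIZES suggests:

```python
# Total candidate boxes the 640x640 config generates before NMS
feature_maps = [160, 80, 40, 20, 10, 5]  # _C.FEATURE_MAPS above
total_anchors = sum(f * f for f in feature_maps)
print(total_anchors)  # 34125
```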
64 changes: 64 additions & 0 deletions examples/pytorch/vision/Face_Detection/data/config_qvga.py
@@ -0,0 +1,64 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT license.

import os
from easydict import EasyDict
import numpy as np


_C = EasyDict()
cfg = _C
# data augmentation config
_C.expand_prob = 0.5
_C.expand_max_ratio = 2
_C.hue_prob = 0.5
_C.hue_delta = 18
_C.contrast_prob = 0.5
_C.contrast_delta = 0.5
_C.saturation_prob = 0.5
_C.saturation_delta = 0.5
_C.brightness_prob = 0.5
_C.brightness_delta = 0.125
_C.data_anchor_sampling_prob = 0.5
_C.min_face_size = 1.0
_C.apply_distort = True
_C.apply_expand = False
_C.img_mean = np.array([104., 117., 123.])[:, np.newaxis, np.newaxis].astype(
'float32')
_C.resize_width = 320
_C.resize_height = 320
_C.scale = 1 / 127.0
_C.anchor_sampling = True
_C.filter_min_face = True


_C.IS_MONOCHROME = True

# anchor config
_C.FEATURE_MAPS = [40, 40, 20, 20]
_C.INPUT_SIZE = 320
_C.STEPS = [8, 8, 16, 16]
_C.ANCHOR_SIZES = [8, 16, 32, 48]
_C.CLIP = False
_C.VARIANCE = [0.1, 0.2]

# detection config
_C.NMS_THRESH = 0.3
_C.NMS_TOP_K = 5000
_C.TOP_K = 750
_C.CONF_THRESH = 0.05

# loss config
_C.NEG_POS_RATIOS = 3
_C.NUM_CLASSES = 2
_C.USE_NMS = True

# dataset config
_C.HOME = '/mnt/'

# face config
_C.FACE = EasyDict()
_C.FACE.TRAIN_FILE = './data/face_train.txt'
_C.FACE.VAL_FILE = './data/face_val.txt'
_C.FACE.WIDER_DIR = '/mnt/WIDER_FACE'
_C.FACE.OVERLAP_THRESH = [0.1, 0.35, 0.5]