diff --git a/examples/pytorch/vision/Face_Detection/README.md b/examples/pytorch/vision/Face_Detection/README.md new file mode 100755 index 000000000..a4fa7d6c2 --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/README.md @@ -0,0 +1,146 @@ +# Code for Face Detection experiments with RNNPool +## Requirements +1. Follow instructions to install requirements for EdgeML operators and the EdgeML operators [here](https://github.com/microsoft/EdgeML/blob/master/pytorch/README.md). +2. Install requirements for face detection model using +``` pip install -r requirements.txt ``` +We have tested the installation and the code on Ubuntu 18.04 with Cuda 10.2 and CuDNN 7.6 + +## Dataset +1. Download WIDER face dataset images and annotations from http://shuoyang1213.me/WIDERFACE/ and place them all in a folder with name 'WIDER_FACE'. That is, download WIDER_train.zip, WIDER_test.zip, WIDER_val.zip, wider_face_split.zip and place it in WIDER_FACE folder, and unzip files using: + +```shell +cd WIDER_FACE +unzip WIDER_train.zip +unzip WIDER_test.zip +unzip WIDER_val.zip +unzip wider_face_split.zip +cd .. + +``` + +2. In `data/config.py` , set _C.HOME to the parent directory of the above folder, and set the _C.FACE.WIDER_DIR to the folder path. +That is, if the WIDER_FACE folder is created in /mnt folder, then _C.HOME='/mnt' +_C.FACE.WIDER_DIR='/mnt/WIDER_FACE'. +Similarly, change `data/config_qvga.py` to set _C.HOME and _C.FACE.WIDER_DIR. +3. Run +``` python prepare_wider_data.py ``` + + +# Usage + +## Training + +```shell + +IS_QVGA_MONO=0 python train.py --batch_size 32 --model_arch RPool_Face_Quant --cuda True --multigpu True --save_folder weights/ --epochs 300 --save_frequency 5000 + +``` + +For QVGA: +```shell + +IS_QVGA_MONO=1 python train.py --batch_size 64 --model_arch RPool_Face_QVGA_monochrome --cuda True --multigpu True --save_folder weights/ --epochs 300 --save_frequency 5000 + +``` +This will save checkpoints after every '--save_frequency' number of iterations in a weight file with 'checkpoint.pth' at the end and weights for the best state in a file with 'best_state.pth' at the end. These will be saved in '--save_folder'. For resuming training from a checkpoint, use '--resume .pth' with the above command. For example, + + +```shell + +IS_QVGA_MONO=1 python train.py --batch_size 64 --model_arch RPool_Face_QVGA_monochrome --cuda True --multigpu True --save_folder weights/ --epochs 300 --save_frequency 5000 --resume .pth + +``` + +If IS_QVGA_MONO is 0 then training input images will be 640x640 and RGB. +If IS_QVGA_MONO is 1 then training input images will be 320x320 and converted to monochrome. + +Input images for training models are cropped and reshaped to square to maintain consistency with [S3FD](https://arxiv.org/abs/1708.05237). However testing can be done on any size of images, thus we resize testing input image size to have area equal to VGA (640x480)/QVGA (320x240), so that aspect ratio is not changed. + +The architecture RPool_Face_QVGA_monochrome is for QVGA monochrome format while RPool_Face_C and RPool_Face_Quant are for VGA RGB format. + + +## Test +There are two modes of testing the trained model -- the evaluation mode to generate bounding boxes for a set of sample images, and the test mode to compute statistics like mAP scores. + +#### Evaluation Mode + +Given a set of images in , `eval/py` generates bounding boxes around faces (where the confidence is higher than certain threshold) and write the images in . 
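The detection confidence threshold is controlled by the `--thresh` flag of `eval.py` (default 0.17).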
To evaluate the `rpool_face_best_state.pth` model (stored in ./weights), execute the following command: + +```shell +IS_QVGA_MONO=0 python eval.py --model_arch RPool_Face_Quant --model ./weights/RPool_Face_Quant_best_state.pth --image_folder --save_dir +``` + +For QVGA: +```shell +IS_QVGA_MONO=1 python eval.py --model_arch RPool_Face_QVGA_monochrome --model ./weights/RPool_Face_QVGA_monochrome_best_state.pth --image_folder --save_dir +``` + +This will save images in with bounding boxes around faces, where the confidence is high. Here is an example image with a single bounding box. + +![Camera: Himax0360](imrgb20ft.png) + +If IS_QVGA_MONO=0 the evaluation code accepts an image of any size and resizes it to 640x480x3 while preserving original image aspect ratio. + +If IS_QVGA_MONO=1 the evaluation code accepts an image of any size and resizes and converts it to monochrome to make it 320x240x1 while preserving original image aspect ratio. + +#### WIDER Set Test +In this mode, we test the generated model against the provided WIDER_FACE validation and test dataset. + +For this, first run the following to generate predictions of the model and store output in the '--save_folder' folder. + +```shell +IS_QVGA_MONO=0 python wider_test.py --model_arch RPool_Face_Quant --model ./weights/RPool_Face_Quant_best_state.pth --save_folder rpool_face_quant_val --subset val +``` + +For QVGA: +```shell +IS_QVGA_MONO=1 python wider_test.py --model_arch RPool_Face_QVGA_monochrome --model ./weights/RPool_Face_QVGA_monochrome_best_state.pth --save_folder rpool_face_qvgamono_val --subset val +``` + +The above command generates predictions for each image in the "validation" dataset. For each image, a separate prediction file is provided (image_name.txt file in appropriate folder). The first line of the prediction file contains the total number of boxes identified. +Then each line in the file corresponds to an identified box. For each box, five numbers are generated: length of the box, height of the box, x-axis offset, y-axis offset, confidence value for presence of a face in the box. + +If IS_QVGA_MONO=1 then testing is done by converting images to monochrome and QVGA, else if IS_QVGA_MONO=0 then testing is done on VGA RGB images. + +The architecture RPool_Face_QVGA_monochrome is for QVGA monochrome format while RPool_Face_C and RPool_Face_Quant are for VGA RGB format. + +###### For calculating MAP scores: +Now using these boxes, we can compute the standard MAP score that is widely used in this literature (see [here](https://medium.com/@jonathan_hui/map-mean-average-precision-for-object-detection-45c121a31173) for more details) as follows: + +1. Download eval_tools.zip from http://shuoyang1213.me/WIDERFACE/support/eval_script/eval_tools.zip and unzip in a folder of same name in this directory. + +Example code: + +```shell +wget http://shuoyang1213.me/WIDERFACE/support/eval_script/eval_tools.zip +unzip eval_tools.zip +``` + +2. Set up scripts to use the Matlab '.mat' data files in eval_tools/ground_truth folder for MAP calculation: The following installs python files that provide the same functionality as the '.m' matlab scripts in eval_tools folder. +``` +cd eval_tools +git clone https://github.com/wondervictor/WiderFace-Evaluation.git +cd WiderFace-Evaluation +python3 setup.py build_ext --inplace +``` + +3. Run ```python3 evaluation.py -p -g ``` in WiderFace-Evaluation folder + +where `prediction_dir` is the '--save_folder' used for `wider_test.py` above and is the subfolder `eval_tools/ground_truth`. 
That is in, WiderFace-Evaluation directory, run: + +```shell +python3 evaluation.py -p -g ../ground_truth +``` +This script should output the MAP for the WIDER-easy, WIDER-medium, and WIDER-hard subsets of the dataset. Our best performance using RPool_Face_Quant model is: 0.80 (WIDER-easy), 0.78 (WIDER-medium), 0.53 (WIDER-hard). + + +##### Dump RNNPool Input Output Traces and Weights + +To save model weights and/or input output pairs for each patch through RNNPool in numpy format use the command below. Put images which you want to save traces for in . Specify output folder for saving model weights in numpy format in . Specify output folder for saving input output traces of RNNPool in numpy format in . Note that input traces will be saved in a folder named 'inputs' and output traces in a folder named 'outputs' inside . + +```shell +python3 dump_model.py --model ./weights/RPool_Face_QVGA_monochrome_best_state.pth --model_arch RPool_Face_Quant --image_folder --save_model_npy_dir --save_traces_npy_dir +``` +If you wish to save only model weights, do not specify --save_traces_npy_dir. If you wish to save only traces do not specify --save_model_npy_dir. + +Code has been built upon https://github.com/yxlijun/S3FD.pytorch \ No newline at end of file diff --git a/examples/pytorch/vision/Face_Detection/data/__init__.py b/examples/pytorch/vision/Face_Detection/data/__init__.py new file mode 100755 index 000000000..ba320eeeb --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/data/__init__.py @@ -0,0 +1,31 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. + +from .widerface import WIDERDetection + +from data.choose_config import cfg +cfg = cfg.cfg + + +import torch + + +def detection_collate(batch): + """Custom collate fn for dealing with batches of images that have a different + number of associated object annotations (bounding boxes). + + Arguments: + batch: (tuple) A tuple of tensor images and lists of annotations + + Return: + A tuple containing: + 1) (tensor) batch of images stacked on their 0 dim + 2) (list of tensors) annotations for a given image are stacked on + 0 dim + """ + targets = [] + imgs = [] + for sample in batch: + imgs.append(sample[0]) + targets.append(torch.FloatTensor(sample[1])) + return torch.stack(imgs, 0), targets \ No newline at end of file diff --git a/examples/pytorch/vision/Face_Detection/data/choose_config.py b/examples/pytorch/vision/Face_Detection/data/choose_config.py new file mode 100644 index 000000000..e86c18d83 --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/data/choose_config.py @@ -0,0 +1,15 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. + +import os +from importlib import import_module + +IS_QVGA_MONO = os.environ['IS_QVGA_MONO'] + + +name = 'config' +if IS_QVGA_MONO == '1': + name = name + '_qvga' + + +cfg = import_module('data.' + name) \ No newline at end of file diff --git a/examples/pytorch/vision/Face_Detection/data/config.py b/examples/pytorch/vision/Face_Detection/data/config.py new file mode 100755 index 000000000..86760a009 --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/data/config.py @@ -0,0 +1,65 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. 
+ +import os +from easydict import EasyDict +import numpy as np + + +_C = EasyDict() +cfg = _C +# data augument config +_C.expand_prob = 0.5 +_C.expand_max_ratio = 4 +_C.hue_prob = 0.5 +_C.hue_delta = 18 +_C.contrast_prob = 0.5 +_C.contrast_delta = 0.5 +_C.saturation_prob = 0.5 +_C.saturation_delta = 0.5 +_C.brightness_prob = 0.5 +_C.brightness_delta = 0.125 +_C.data_anchor_sampling_prob = 0.5 +_C.min_face_size = 6.0 +_C.apply_distort = True +_C.apply_expand = False +_C.img_mean = np.array([104., 117., 123.])[:, np.newaxis, np.newaxis].astype( + 'float32') +_C.resize_width = 640 +_C.resize_height = 640 +_C.scale = 1 / 127.0 +_C.anchor_sampling = True +_C.filter_min_face = True + + +_C.IS_MONOCHROME = False + + +# anchor config +_C.FEATURE_MAPS = [160, 80, 40, 20, 10, 5] +_C.INPUT_SIZE = 640 +_C.STEPS = [4, 8, 16, 32, 64, 128] +_C.ANCHOR_SIZES = [16, 32, 64, 128, 256, 512] +_C.CLIP = False +_C.VARIANCE = [0.1, 0.2] + +# detection config +_C.NMS_THRESH = 0.3 +_C.NMS_TOP_K = 5000 +_C.TOP_K = 750 +_C.CONF_THRESH = 0.01 + +# loss config +_C.NEG_POS_RATIOS = 3 +_C.NUM_CLASSES = 2 +_C.USE_NMS = True + +# dataset config +_C.HOME = '/mnt/' ## change here ---------- + +# face config +_C.FACE = EasyDict() +_C.FACE.TRAIN_FILE = './data/face_train.txt' +_C.FACE.VAL_FILE = './data/face_val.txt' +_C.FACE.WIDER_DIR = '/mnt/WIDER_FACE' ## change here --------- +_C.FACE.OVERLAP_THRESH = [0.1, 0.35, 0.5] diff --git a/examples/pytorch/vision/Face_Detection/data/config_qvga.py b/examples/pytorch/vision/Face_Detection/data/config_qvga.py new file mode 100644 index 000000000..06417bac5 --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/data/config_qvga.py @@ -0,0 +1,64 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. + +import os +from easydict import EasyDict +import numpy as np + + +_C = EasyDict() +cfg = _C +# data augument config +_C.expand_prob = 0.5 +_C.expand_max_ratio = 2 +_C.hue_prob = 0.5 +_C.hue_delta = 18 +_C.contrast_prob = 0.5 +_C.contrast_delta = 0.5 +_C.saturation_prob = 0.5 +_C.saturation_delta = 0.5 +_C.brightness_prob = 0.5 +_C.brightness_delta = 0.125 +_C.data_anchor_sampling_prob = 0.5 +_C.min_face_size = 1.0 +_C.apply_distort = True +_C.apply_expand = False +_C.img_mean = np.array([104., 117., 123.])[:, np.newaxis, np.newaxis].astype( + 'float32') +_C.resize_width = 320 +_C.resize_height = 320 +_C.scale = 1 / 127.0 +_C.anchor_sampling = True +_C.filter_min_face = True + + +_C.IS_MONOCHROME = True + +# anchor config +_C.FEATURE_MAPS = [40, 40, 20, 20] +_C.INPUT_SIZE = 320 +_C.STEPS = [8, 8, 16, 16] +_C.ANCHOR_SIZES = [8, 16, 32, 48] +_C.CLIP = False +_C.VARIANCE = [0.1, 0.2] + +# detection config +_C.NMS_THRESH = 0.3 +_C.NMS_TOP_K = 5000 +_C.TOP_K = 750 +_C.CONF_THRESH = 0.05 + +# loss config +_C.NEG_POS_RATIOS = 3 +_C.NUM_CLASSES = 2 +_C.USE_NMS = True + +# dataset config +_C.HOME = '/mnt/' + +# face config +_C.FACE = EasyDict() +_C.FACE.TRAIN_FILE = './data/face_train.txt' +_C.FACE.VAL_FILE = './data/face_val.txt' +_C.FACE.WIDER_DIR = '/mnt/WIDER_FACE' +_C.FACE.OVERLAP_THRESH = [0.1, 0.35, 0.5] diff --git a/examples/pytorch/vision/Face_Detection/data/widerface.py b/examples/pytorch/vision/Face_Detection/data/widerface.py new file mode 100755 index 000000000..afe0523e0 --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/data/widerface.py @@ -0,0 +1,115 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. 
+ +import torch +from PIL import Image, ImageDraw +import torch.utils.data as data +import numpy as np +import random +import sys; sys.path.append('../') +from utils.augmentations import preprocess + + +class WIDERDetection(data.Dataset): + """docstring for WIDERDetection""" + + def __init__(self, list_file, mode='train', mono_mode=False): + super(WIDERDetection, self).__init__() + self.mode = mode + self.mono_mode = mono_mode + self.fnames = [] + self.boxes = [] + self.labels = [] + + with open(list_file) as f: + lines = f.readlines() + + for line in lines: + line = line.strip().split() + num_faces = int(line[1]) + box = [] + label = [] + for i in range(num_faces): + x = float(line[2 + 5 * i]) + y = float(line[3 + 5 * i]) + w = float(line[4 + 5 * i]) + h = float(line[5 + 5 * i]) + c = int(line[6 + 5 * i]) + if w <= 0 or h <= 0: + continue + box.append([x, y, x + w, y + h]) + label.append(c) + if len(box) > 0: + self.fnames.append(line[0]) + self.boxes.append(box) + self.labels.append(label) + + self.num_samples = len(self.boxes) + + def __len__(self): + return self.num_samples + + def __getitem__(self, index): + img, target, h, w = self.pull_item(index) + return img, target + + def pull_item(self, index): + while True: + image_path = self.fnames[index] + img = Image.open(image_path) + if img.mode == 'L': + img = img.convert('RGB') + + im_width, im_height = img.size + boxes = self.annotransform( + np.array(self.boxes[index]), im_width, im_height) + label = np.array(self.labels[index]) + bbox_labels = np.hstack((label[:, np.newaxis], boxes)).tolist() + img, sample_labels = preprocess( + img, bbox_labels, self.mode, image_path) + sample_labels = np.array(sample_labels) + if len(sample_labels) > 0: + target = np.hstack( + (sample_labels[:, 1:], sample_labels[:, 0][:, np.newaxis])) + + assert (target[:, 2] > target[:, 0]).any() + assert (target[:, 3] > target[:, 1]).any() + break + else: + index = random.randrange(0, self.num_samples) + + + if self.mono_mode==True: + im = 0.299 * img[0] + 0.587 * img[1] + 0.114 * img[2] + return torch.from_numpy(np.expand_dims(im,axis=0)), target, im_height, im_width + + return torch.from_numpy(img), target, im_height, im_width + + + def annotransform(self, boxes, im_width, im_height): + boxes[:, 0] /= im_width + boxes[:, 1] /= im_height + boxes[:, 2] /= im_width + boxes[:, 3] /= im_height + return boxes + + +def detection_collate(batch): + """Custom collate fn for dealing with batches of images that have a different + number of associated object annotations (bounding boxes). + + Arguments: + batch: (tuple) A tuple of tensor images and lists of annotations + + Return: + A tuple containing: + 1) (tensor) batch of images stacked on their 0 dim + 2) (list of tensors) annotations for a given image are stacked on + 0 dim + """ + targets = [] + imgs = [] + for sample in batch: + imgs.append(sample[0]) + targets.append(torch.FloatTensor(sample[1])) + return torch.stack(imgs, 0), targets diff --git a/examples/pytorch/vision/Face_Detection/dump_model.py b/examples/pytorch/vision/Face_Detection/dump_model.py new file mode 100644 index 000000000..1a2d76d6f --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/dump_model.py @@ -0,0 +1,191 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. 
+ +from __future__ import division +from __future__ import absolute_import +from __future__ import print_function + +import os +import torch +import argparse +import torch.nn as nn +import torch.utils.data as data +import torch.backends.cudnn as cudnn +import torchvision.transforms as transforms + +import cv2 +import time +import numpy as np +from PIL import Image, ImageFilter + +from data.config import cfg +from torch.autograd import Variable +from utils.augmentations import to_chw_bgr + +from importlib import import_module + +import warnings +warnings.filterwarnings("ignore") + + +parser = argparse.ArgumentParser(description='face detection dump') + +parser.add_argument('--model', type=str, + default='weights/rpool_face_c.pth', help='trained model') + #small_fgrnn_smallram_sd.pth', help='trained model') +parser.add_argument('--model_arch', + default='RPool_Face_C', type=str, + choices=['RPool_Face_C', 'RPool_Face_B', 'RPool_Face_A', 'RPool_Face_Quant'], + help='choose architecture among rpool variants') +parser.add_argument('--image_folder', default=None, type=str, help='folder containing images') +parser.add_argument('--save_model_npy_dir', default=None, type=str, help='Directory for saving model in numpy array format') +parser.add_argument('--save_traces_npy_dir', default=None, type=str, help='Directory for saving RNNPool input and output traces in numpy array format') + + +args = parser.parse_args() + + +use_cuda = torch.cuda.is_available() + +if use_cuda: + torch.set_default_tensor_type('torch.cuda.FloatTensor') +else: + torch.set_default_tensor_type('torch.FloatTensor') + + + +def saveModelNpy(net): + if os.path.isdir(args.save_model_npy_dir) is False: + try: + os.mkdir(args.save_model_npy_dir) + except OSError: + print("Creation of the directory %s failed" % args.save_model_npy_dir) + return + + np.save(args.save_model_npy_dir+'/W1.npy', net.rnn_model.cell_rnn.cell.W.cpu().detach().numpy()) + np.save(args.save_model_npy_dir+'/W2.npy', net.rnn_model.cell_bidirrnn.cell.W.cpu().detach().numpy()) + np.save(args.save_model_npy_dir+'/U1.npy', net.rnn_model.cell_rnn.cell.U.cpu().detach().numpy()) + np.save(args.save_model_npy_dir+'/U2.npy', net.rnn_model.cell_bidirrnn.cell.U.cpu().detach().numpy()) + np.save(args.save_model_npy_dir+'/Bg1.npy', net.rnn_model.cell_rnn.cell.bias_gate.cpu().detach().numpy()) + np.save(args.save_model_npy_dir+'/Bg2.npy', net.rnn_model.cell_bidirrnn.cell.bias_gate.cpu().detach().numpy()) + np.save(args.save_model_npy_dir+'/Bh1.npy', net.rnn_model.cell_rnn.cell.bias_update.cpu().detach().numpy()) + np.save(args.save_model_npy_dir+'/Bh2.npy', net.rnn_model.cell_bidirrnn.cell.bias_update.cpu().detach().numpy()) + np.save(args.save_model_npy_dir+'/nu1.npy', net.rnn_model.cell_rnn.cell.nu.cpu().detach().numpy()) + np.save(args.save_model_npy_dir+'/nu2.npy', net.rnn_model.cell_bidirrnn.cell.nu.cpu().detach().numpy()) + np.save(args.save_model_npy_dir+'/zeta1.npy', net.rnn_model.cell_rnn.cell.zeta.cpu().detach().numpy()) + np.save(args.save_model_npy_dir+'/zeta2.npy', net.rnn_model.cell_bidirrnn.cell.zeta.cpu().detach().numpy()) + + + +activation = {} +def get_activation(name): + def hook(model, input, output): + activation[name] = output.detach() + return hook + +def saveTracesNpy(net, img_list): + if os.path.isdir(args.save_traces_npy_dir) is False: + try: + os.mkdir(args.save_traces_npy_dir) + except OSError: + print("Creation of the directory %s failed" % args.save_traces_npy_dir) + return + + if os.path.isdir(os.path.join(args.save_traces_npy_dir,'inputs')) is 
False: + try: + os.mkdir(os.path.join(args.save_traces_npy_dir,'inputs')) + except OSError: + print("Creation of the directory %s failed" % os.path.join(args.save_traces_npy_dir,'inputs')) + return + + if os.path.isdir(os.path.join(args.save_traces_npy_dir,'outputs')) is False: + try: + os.mkdir(os.path.join(args.save_traces_npy_dir,'outputs')) + except OSError: + print("Creation of the directory %s failed" % os.path.join(args.save_traces_npy_dir,'outputs')) + return + + inputDims = net.rnn_model.inputDims + nRows = net.rnn_model.nRows + nCols = net.rnn_model.nCols + count=0 + for img_path in img_list: + img = Image.open(os.path.join(args.image_folder, img_path)) + + img = img.convert('RGB') + + img = np.array(img) + max_im_shrink = np.sqrt( + 640 * 480 / (img.shape[0] * img.shape[1])) + image = cv2.resize(img, None, None, fx=max_im_shrink, + fy=max_im_shrink, interpolation=cv2.INTER_LINEAR) + + x = to_chw_bgr(image) + x = x.astype('float32') + x -= cfg.img_mean + x = x[[2, 1, 0], :, :] + + x = Variable(torch.from_numpy(x).unsqueeze(0)) + if use_cuda: + x = x.cuda() + t1 = time.time() + y = net(x) + + + patches = activation['prepatch'] + patches = torch.cat(torch.unbind(patches,dim=2),dim=0) + patches = torch.reshape(patches,(-1,inputDims,nRows,nCols)) + + rnnX = activation['rnn_model'] + + patches_all = torch.stack(torch.split(patches, split_size_or_sections=1, dim=0),dim=-1) + rnnX_all = torch.stack(torch.split(rnnX, split_size_or_sections=1, dim=0),dim=-1) + + for k in range(patches_all.shape[-1]): + patches_tosave = patches_all[0,:,:,:,k].cpu().numpy().transpose(1,2,0) + rnnX_tosave = rnnX_all[0,:,k].cpu().numpy() + np.save(args.save_traces_npy_dir+'/inputs/trace_'+str(count)+'_'+str(k)+'.npy', patches_tosave) + np.save(args.save_traces_npy_dir+'/outputs/trace_'+str(count)+'_'+str(k)+'.npy', rnnX_tosave) + + count+=1 + + + + + + +if __name__ == '__main__': + + module = import_module('models.' + args.model_arch) + net = module.build_s3fd('test', cfg.NUM_CLASSES) + + # net = torch.nn.DataParallel(net) + + checkpoint_dict = torch.load(args.model) + + model_dict = net.state_dict() + + + model_dict.update(checkpoint_dict) + net.load_state_dict(model_dict) + + + + net.eval() + + if use_cuda: + net.cuda() + cudnn.benckmark = True + + + + if args.save_model_npy_dir is not None: + saveModelNpy(net) + + if args.save_traces_npy_dir is not None: + net.unfold.register_forward_hook(get_activation('prepatch')) + net.rnn_model.register_forward_hook(get_activation('rnn_model')) + img_path = args.image_folder + img_list = [os.path.join(img_path, x) + for x in os.listdir(img_path)] + saveTracesNpy(net, img_list) diff --git a/examples/pytorch/vision/Face_Detection/eval.py b/examples/pytorch/vision/Face_Detection/eval.py new file mode 100644 index 000000000..00063d943 --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/eval.py @@ -0,0 +1,133 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. 
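+
+# Evaluation-mode demo: runs a trained RPool_Face_* detector on every image in
+# --image_folder, draws boxes with confidence >= --thresh, and writes the annotated
+# images to --save_dir. Input is resized to VGA area (or QVGA monochrome if
+# IS_QVGA_MONO=1) while preserving the aspect ratio.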
+ +import torch +import torch.nn as nn +import torch.utils.data as data +import torch.backends.cudnn as cudnn +import torchvision.transforms as transforms + +import os +import time +import argparse +import numpy as np +from PIL import Image +import cv2 + +from data.choose_config import cfg +cfg = cfg.cfg + +from utils.augmentations import to_chw_bgr + +from importlib import import_module + + +parser = argparse.ArgumentParser(description='face detection demo') +parser.add_argument('--save_dir', type=str, default='results/', + help='Directory for detect result') +parser.add_argument('--model', type=str, + default='weights/rpool_face_c.pth', help='trained model') +parser.add_argument('--thresh', default=0.17, type=float, + help='Final confidence threshold') +parser.add_argument('--model_arch', + default='RPool_Face_C', type=str, + choices=['RPool_Face_C', 'RPool_Face_Quant', 'RPool_Face_QVGA_monochrome'], + help='choose architecture among rpool variants') +parser.add_argument('--image_folder', default=None, type=str, help='folder containing images') + + +args = parser.parse_args() + +if not os.path.exists(args.save_dir): + os.makedirs(args.save_dir) + +use_cuda = torch.cuda.is_available() + +if use_cuda: + torch.set_default_tensor_type('torch.cuda.FloatTensor') +else: + torch.set_default_tensor_type('torch.FloatTensor') + + +def detect(net, img_path, thresh): + img = Image.open(img_path) + img = img.convert('RGB') + img = np.array(img) + height, width, _ = img.shape + + if os.environ['IS_QVGA_MONO'] == '1': + max_im_shrink = np.sqrt( + 320 * 240 / (img.shape[0] * img.shape[1])) + else: + max_im_shrink = np.sqrt( + 640 * 480 / (img.shape[0] * img.shape[1])) + + image = cv2.resize(img, None, None, fx=max_im_shrink, + fy=max_im_shrink, interpolation=cv2.INTER_LINEAR) + # img = cv2.resize(img, (640, 640)) + x = to_chw_bgr(image) + x = x.astype('float32') + x -= cfg.img_mean + x = x[[2, 1, 0], :, :] + + + if cfg.IS_MONOCHROME == True: + x = 0.299 * x[0] + 0.587 * x[1] + 0.114 * x[2] + x = torch.from_numpy(x).unsqueeze(0).unsqueeze(0) + else: + x = torch.from_numpy(x).unsqueeze(0) + if use_cuda: + x = x.cuda() + t1 = time.time() + y = net(x) + detections = y.data + scale = torch.Tensor([img.shape[1], img.shape[0], + img.shape[1], img.shape[0]]) + + img = cv2.imread(img_path, cv2.IMREAD_COLOR) + + for i in range(detections.size(1)): + j = 0 + while detections[0, i, j, 0] >= thresh: + score = detections[0, i, j, 0] + pt = (detections[0, i, j, 1:] * scale).cpu().numpy() + left_up, right_bottom = (pt[0], pt[1]), (pt[2], pt[3]) + j += 1 + cv2.rectangle(img, left_up, right_bottom, (0, 0, 255), 2) + conf = "{:.3f}".format(score) + point = (int(left_up[0]), int(left_up[1] - 5)) + cv2.putText(img, conf, point, cv2.FONT_HERSHEY_COMPLEX, + 0.6, (0, 255, 0), 1) + + t2 = time.time() + print('detect:{} timer:{}'.format(img_path, t2 - t1)) + + cv2.imwrite(os.path.join(args.save_dir, os.path.basename(img_path)), img) + + +if __name__ == '__main__': + + module = import_module('models.' 
+ args.model_arch) + net = module.build_s3fd('test', cfg.NUM_CLASSES) + + net = torch.nn.DataParallel(net) + + checkpoint_dict = torch.load(args.model) + + model_dict = net.state_dict() + + + model_dict.update(checkpoint_dict) + net.load_state_dict(model_dict) + + net.eval() + + if use_cuda: + net.cuda() + cudnn.benckmark = True + + img_path = args.image_folder + img_list = [os.path.join(img_path, x) + for x in os.listdir(img_path)] + for path in img_list: + detect(net, path, args.thresh) \ No newline at end of file diff --git a/examples/pytorch/vision/Face_Detection/imrgb20ft.png b/examples/pytorch/vision/Face_Detection/imrgb20ft.png new file mode 100755 index 000000000..bb5df2253 Binary files /dev/null and b/examples/pytorch/vision/Face_Detection/imrgb20ft.png differ diff --git a/examples/pytorch/vision/Face_Detection/layers/__init__.py b/examples/pytorch/vision/Face_Detection/layers/__init__.py new file mode 100755 index 000000000..ade560f74 --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/layers/__init__.py @@ -0,0 +1,5 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. + +from .functions import * +from .modules import * diff --git a/examples/pytorch/vision/Face_Detection/layers/bbox_utils.py b/examples/pytorch/vision/Face_Detection/layers/bbox_utils.py new file mode 100755 index 000000000..9f7451855 --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/layers/bbox_utils.py @@ -0,0 +1,306 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. + +import torch + + +def point_form(boxes): + """ Convert prior_boxes to (xmin, ymin, xmax, ymax) + representation for comparison to point form ground truth data. + Args: + boxes: (tensor) center-size default boxes from priorbox layers. + Return: + boxes: (tensor) Converted xmin, ymin, xmax, ymax form of boxes. + """ + return torch.cat((boxes[:, :2] - boxes[:, 2:] / 2, # xmin, ymin + boxes[:, :2] + boxes[:, 2:] / 2), 1) # xmax, ymax + + +def center_size(boxes): + """ Convert prior_boxes to (cx, cy, w, h) + representation for comparison to center-size form ground truth data. + Args: + boxes: (tensor) point_form boxes + Return: + boxes: (tensor) Converted xmin, ymin, xmax, ymax form of boxes. + """ + return torch.cat([(boxes[:, 2:] + boxes[:, :2]) / 2, # cx, cy + boxes[:, 2:] - boxes[:, :2]], 1) # w, h + + +def intersect(box_a, box_b): + """ We resize both tensors to [A,B,2] without new malloc: + [A,2] -> [A,1,2] -> [A,B,2] + [B,2] -> [1,B,2] -> [A,B,2] + Then we compute the area of intersect between box_a and box_b. + Args: + box_a: (tensor) bounding boxes, Shape: [A,4]. + box_b: (tensor) bounding boxes, Shape: [B,4]. + Return: + (tensor) intersection area, Shape: [A,B]. + """ + A = box_a.size(0) + B = box_b.size(0) + max_xy = torch.min(box_a[:, 2:].unsqueeze(1).expand(A, B, 2), + box_b[:, 2:].unsqueeze(0).expand(A, B, 2)) + min_xy = torch.max(box_a[:, :2].unsqueeze(1).expand(A, B, 2), + box_b[:, :2].unsqueeze(0).expand(A, B, 2)) + inter = torch.clamp((max_xy - min_xy), min=0) + return inter[:, :, 0] * inter[:, :, 1] + + +def jaccard(box_a, box_b): + """Compute the jaccard overlap of two sets of boxes. The jaccard overlap + is simply the intersection over union of two boxes. Here we operate on + ground truth boxes and default boxes. 
+ E.g.: + A ∩ B / A ∪ B = A ∩ B / (area(A) + area(B) - A ∩ B) + Args: + box_a: (tensor) Ground truth bounding boxes, Shape: [num_objects,4] + box_b: (tensor) Prior boxes from priorbox layers, Shape: [num_priors,4] + Return: + jaccard overlap: (tensor) Shape: [box_a.size(0), box_b.size(0)] + """ + inter = intersect(box_a, box_b) + area_a = ((box_a[:, 2] - box_a[:, 0]) * + (box_a[:, 3] - box_a[:, 1])).unsqueeze(1).expand_as(inter) # [A,B] + area_b = ((box_b[:, 2] - box_b[:, 0]) * + (box_b[:, 3] - box_b[:, 1])).unsqueeze(0).expand_as(inter) # [A,B] + union = area_a + area_b - inter + return inter / union # [A,B] + + +def match(threshold, truths, priors, variances, labels, loc_t, conf_t, idx): + """Match each prior box with the ground truth box of the highest jaccard + overlap, encode the bounding boxes, then return the matched indices + corresponding to both confidence and location preds. + Args: + threshold: (float) The overlap threshold used when mathing boxes. + truths: (tensor) Ground truth boxes, Shape: [num_obj, num_priors]. + priors: (tensor) Prior boxes from priorbox layers, Shape: [n_priors,4]. + variances: (tensor) Variances corresponding to each prior coord, + Shape: [num_priors, 4]. + labels: (tensor) All the class labels for the image, Shape: [num_obj]. + loc_t: (tensor) Tensor to be filled w/ endcoded location targets. + conf_t: (tensor) Tensor to be filled w/ matched indices for conf preds. + idx: (int) current batch index + Return: + The matched indices corresponding to 1)location and 2)confidence preds. + """ + # jaccard index + overlaps = jaccard( + truths, + point_form(priors) + ) + # (Bipartite Matching) + # [1,num_objects] best prior for each ground truth + best_prior_overlap, best_prior_idx = overlaps.max(1, keepdim=True) + # [1,num_priors] best ground truth for each prior + best_truth_overlap, best_truth_idx = overlaps.max( + 0, keepdim=True) # 0-2000 + best_truth_idx.squeeze_(0) + best_truth_overlap.squeeze_(0) + best_prior_idx.squeeze_(1) + best_prior_overlap.squeeze_(1) + best_truth_overlap.index_fill_(0, best_prior_idx, 2) # ensure best prior + # TODO refactor: index best_prior_idx with long tensor + # ensure every gt matches with its prior of max overlap + for j in range(best_prior_idx.size(0)): + best_truth_idx[best_prior_idx[j]] = j + _th1, _th2, _th3 = threshold # _th1 = 0.1 ,_th2 = 0.35,_th3 = 0.5 + + N = (torch.sum(best_prior_overlap >= _th2) + + torch.sum(best_prior_overlap >= _th3)) // 2 + matches = truths[best_truth_idx] # Shape: [num_priors,4] + conf = labels[best_truth_idx] # Shape: [num_priors] + conf[best_truth_overlap < _th2] = 0 # label as background + + best_truth_overlap_clone = best_truth_overlap.clone() + add_idx = best_truth_overlap_clone.gt( + _th1).eq(best_truth_overlap_clone.lt(_th2)) + best_truth_overlap_clone[~add_idx] = 0 + stage2_overlap, stage2_idx = best_truth_overlap_clone.sort(descending=True) + + stage2_overlap = stage2_overlap.gt(_th1) + + if N > 0: + N = torch.sum(stage2_overlap[:N]) if torch.sum( + stage2_overlap[:N]) < N else N + conf[stage2_idx[:N]] += 1 + + loc = encode(matches, priors, variances) + loc_t[idx] = loc # [num_priors,4] encoded offsets to learn + conf_t[idx] = conf # [num_priors] top class label for each prior + + +def match_ssd(threshold, truths, priors, variances, labels, loc_t, conf_t, idx): + """Match each prior box with the ground truth box of the highest jaccard + overlap, encode the bounding boxes, then return the matched indices + corresponding to both confidence and location preds. 
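+    Unlike `match` above, this variant uses a single scalar overlap threshold
+    (standard SSD matching) rather than the three-stage _th1/_th2/_th3 scheme.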
+ Args: + threshold: (float) The overlap threshold used when mathing boxes. + truths: (tensor) Ground truth boxes, Shape: [num_obj, num_priors]. + priors: (tensor) Prior boxes from priorbox layers, Shape: [n_priors,4]. + variances: (tensor) Variances corresponding to each prior coord, + Shape: [num_priors, 4]. + labels: (tensor) All the class labels for the image, Shape: [num_obj]. + loc_t: (tensor) Tensor to be filled w/ endcoded location targets. + conf_t: (tensor) Tensor to be filled w/ matched indices for conf preds. + idx: (int) current batch index + Return: + The matched indices corresponding to 1)location and 2)confidence preds. + """ + # jaccard index + overlaps = jaccard( + truths, + point_form(priors) + ) + # (Bipartite Matching) + # [1,num_objects] best prior for each ground truth + best_prior_overlap, best_prior_idx = overlaps.max(1, keepdim=True) + # [1,num_priors] best ground truth for each prior + best_truth_overlap, best_truth_idx = overlaps.max( + 0, keepdim=True) # 0-2000 + best_truth_idx.squeeze_(0) + best_truth_overlap.squeeze_(0) + best_prior_idx.squeeze_(1) + best_prior_overlap.squeeze_(1) + best_truth_overlap.index_fill_(0, best_prior_idx, 2) # ensure best prior + # TODO refactor: index best_prior_idx with long tensor + # ensure every gt matches with its prior of max overlap + for j in range(best_prior_idx.size(0)): + best_truth_idx[best_prior_idx[j]] = j + matches = truths[best_truth_idx] # Shape: [num_priors,4] + conf = labels[best_truth_idx] # Shape: [num_priors] + conf[best_truth_overlap < threshold] = 0 # label as background + loc = encode(matches, priors, variances) + loc_t[idx] = loc # [num_priors,4] encoded offsets to learn + conf_t[idx] = conf # [num_priors] top class label for each prior + + +def encode(matched, priors, variances): + """Encode the variances from the priorbox layers into the ground truth boxes + we have matched (based on jaccard overlap) with the prior boxes. + Args: + matched: (tensor) Coords of ground truth for each prior in point-form + Shape: [num_priors, 4]. + priors: (tensor) Prior boxes in center-offset form + Shape: [num_priors,4]. + variances: (list[float]) Variances of priorboxes + Return: + encoded boxes (tensor), Shape: [num_priors, 4] + """ + + # dist b/t match center and prior's center + g_cxcy = (matched[:, :2] + matched[:, 2:]) / 2 - priors[:, :2] + # encode variance + g_cxcy /= (variances[0] * priors[:, 2:]) + # match wh / prior wh + g_wh = (matched[:, 2:] - matched[:, :2]) / priors[:, 2:] + #g_wh = torch.log(g_wh) / variances[1] + g_wh = torch.log(g_wh) / variances[1] + # return target for smooth_l1_loss + return torch.cat([g_cxcy, g_wh], 1) # [num_priors,4] + + +# Adapted from https://github.com/Hakuyume/chainer-ssd +def decode(loc, priors, variances): + """Decode locations from predictions using priors to undo + the encoding we did for offset regression at train time. + Args: + loc (tensor): location predictions for loc layers, + Shape: [num_priors,4] + priors (tensor): Prior boxes in center-offset form. + Shape: [num_priors,4]. 
+ variances: (list[float]) Variances of priorboxes + Return: + decoded bounding box predictions + """ + + boxes = torch.cat(( + priors[:, :2] + loc[:, :2] * variances[0] * priors[:, 2:], + priors[:, 2:] * torch.exp(loc[:, 2:] * variances[1])), 1) + boxes[:, :2] -= boxes[:, 2:] / 2 + boxes[:, 2:] += boxes[:, :2] + return boxes + + +def log_sum_exp(x): + """Utility function for computing log_sum_exp while determining + This will be used to determine unaveraged confidence loss across + all examples in a batch. + Args: + x (Variable(tensor)): conf_preds from conf layers + """ + x_max = x.data.max() + return torch.log(torch.sum(torch.exp(x - x_max), 1, keepdim=True)) + x_max + + +# Original author: Francisco Massa: +# https://github.com/fmassa/object-detection.torch +# Ported to PyTorch by Max deGroot (02/01/2017) +def nms(boxes, scores, overlap=0.5, top_k=200): + """Apply non-maximum suppression at test time to avoid detecting too many + overlapping bounding boxes for a given object. + Args: + boxes: (tensor) The location preds for the img, Shape: [num_priors,4]. + scores: (tensor) The class predscores for the img, Shape:[num_priors]. + overlap: (float) The overlap thresh for suppressing unnecessary boxes. + top_k: (int) The Maximum number of box preds to consider. + Return: + The indices of the kept boxes with respect to num_priors. + """ + + keep = scores.new(scores.size(0)).zero_().long() + if boxes.numel() == 0: + return keep + x1 = boxes[:, 0] + y1 = boxes[:, 1] + x2 = boxes[:, 2] + y2 = boxes[:, 3] + area = torch.mul(x2 - x1, y2 - y1) + v, idx = scores.sort(0) # sort in ascending order + # I = I[v >= 0.01] + idx = idx[-top_k:] # indices of the top-k largest vals + xx1 = boxes.new() + yy1 = boxes.new() + xx2 = boxes.new() + yy2 = boxes.new() + w = boxes.new() + h = boxes.new() + + # keep = torch.Tensor() + count = 0 + while idx.numel() > 0: + i = idx[-1] # index of current largest val + # keep.append(i) + keep[count] = i + count += 1 + if idx.size(0) == 1: + break + idx = idx[:-1] # remove kept element from view + # load bboxes of next highest vals + torch.index_select(x1, 0, idx, out=xx1) + torch.index_select(y1, 0, idx, out=yy1) + torch.index_select(x2, 0, idx, out=xx2) + torch.index_select(y2, 0, idx, out=yy2) + # store element-wise max with next highest score + xx1 = torch.clamp(xx1, min=x1[i]) + yy1 = torch.clamp(yy1, min=y1[i]) + xx2 = torch.clamp(xx2, max=x2[i]) + yy2 = torch.clamp(yy2, max=y2[i]) + w.resize_as_(xx2) + h.resize_as_(yy2) + w = xx2 - xx1 + h = yy2 - yy1 + # check sizes of xx1 and xx2.. after each iteration + w = torch.clamp(w, min=0.0) + h = torch.clamp(h, min=0.0) + inter = w * h + # IoU = i / (area(a) + area(b) - i) + rem_areas = torch.index_select(area, 0, idx) # load remaining areas) + union = (rem_areas - inter) + area[i] + IoU = inter / union # store result in iou + # keep only elements with an IoU <= overlap + idx = idx[IoU.le(overlap)] + return keep, count diff --git a/examples/pytorch/vision/Face_Detection/layers/functions/__init__.py b/examples/pytorch/vision/Face_Detection/layers/functions/__init__.py new file mode 100755 index 000000000..8ae7faa12 --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/layers/functions/__init__.py @@ -0,0 +1,8 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. 
+ +from .prior_box import PriorBox +from .detection import detect_function + +__all__=['detect_function','PriorBox'] + diff --git a/examples/pytorch/vision/Face_Detection/layers/functions/detection.py b/examples/pytorch/vision/Face_Detection/layers/functions/detection.py new file mode 100755 index 000000000..1fb450850 --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/layers/functions/detection.py @@ -0,0 +1,59 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. + +from __future__ import division +from __future__ import absolute_import +from __future__ import print_function + +import torch + +from ..bbox_utils import decode, nms + + + +def detect_function(cfg, loc_data, conf_data, prior_data): + """ + Args: + loc_data: (tensor) Loc preds from loc layers + Shape: [batch,num_priors*4] + conf_data: (tensor) Shape: Conf preds from conf layers + Shape: [batch*num_priors,num_classes] + prior_data: (tensor) Prior boxes and variances from priorbox layers + Shape: [1,num_priors,4] + """ + with torch.no_grad(): + num = loc_data.size(0) + num_priors = prior_data.size(0) + + conf_preds = conf_data.view( + num, num_priors, cfg.NUM_CLASSES).transpose(2, 1) + batch_priors = prior_data.view(-1, num_priors, + 4).expand(num, num_priors, 4) + batch_priors = batch_priors.contiguous().view(-1, 4) + + decoded_boxes = decode(loc_data.view(-1, 4), + batch_priors, cfg.VARIANCE) + decoded_boxes = decoded_boxes.view(num, num_priors, 4) + + output = torch.zeros(num, cfg.NUM_CLASSES, cfg.TOP_K, 5) + + for i in range(num): + boxes = decoded_boxes[i].clone() + conf_scores = conf_preds[i].clone() + + for cl in range(1, cfg.NUM_CLASSES): + c_mask = conf_scores[cl].gt(cfg.CONF_THRESH) + scores = conf_scores[cl][c_mask] + + if scores.dim() == 0: + continue + l_mask = c_mask.unsqueeze(1).expand_as(boxes) + boxes_ = boxes[l_mask].view(-1, 4) + ids, count = nms( + boxes_, scores, cfg.NMS_THRESH, cfg.NMS_TOP_K) + count = count if count < cfg.TOP_K else cfg.TOP_K + + output[i, cl, :count] = torch.cat((scores[ids[:count]].unsqueeze(1), + boxes_[ids[:count]]), 1) + + return output diff --git a/examples/pytorch/vision/Face_Detection/layers/functions/prior_box.py b/examples/pytorch/vision/Face_Detection/layers/functions/prior_box.py new file mode 100755 index 000000000..9c84dab07 --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/layers/functions/prior_box.py @@ -0,0 +1,51 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. + +import torch +from itertools import product as product +import math + + +class PriorBox(object): + """Compute priorbox coordinates in center-offset form for each source + feature map. 
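+    Each feature-map cell gets one square anchor: side ANCHOR_SIZES[k] and center
+    stride STEPS[k], both normalized by the input image height/width.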
+ """ + + def __init__(self, input_size, feature_maps,cfg): + super(PriorBox, self).__init__() + self.imh = input_size[0] + self.imw = input_size[1] + + # number of priors for feature map location (either 4 or 6) + self.variance = cfg.VARIANCE or [0.1] + #self.feature_maps = cfg.FEATURE_MAPS + self.min_sizes = cfg.ANCHOR_SIZES + self.steps = cfg.STEPS + self.clip = cfg.CLIP + for v in self.variance: + if v <= 0: + raise ValueError('Variances must be greater than 0') + self.feature_maps = feature_maps + + + def forward(self): + mean = [] + for k in range(len(self.feature_maps)): + feath = self.feature_maps[k][0] + featw = self.feature_maps[k][1] + for i, j in product(range(feath), range(featw)): + f_kw = self.imw / self.steps[k] + f_kh = self.imh / self.steps[k] + + cx = (j + 0.5) / f_kw + cy = (i + 0.5) / f_kh + + s_kw = self.min_sizes[k] / self.imw + s_kh = self.min_sizes[k] / self.imh + + mean += [cx, cy, s_kw, s_kh] + + output = torch.Tensor(mean).view(-1, 4) + if self.clip: + output.clamp_(max=1, min=0) + return output \ No newline at end of file diff --git a/examples/pytorch/vision/Face_Detection/layers/modules/__init__.py b/examples/pytorch/vision/Face_Detection/layers/modules/__init__.py new file mode 100755 index 000000000..8c2365787 --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/layers/modules/__init__.py @@ -0,0 +1,8 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. + +from .l2norm import L2Norm +from .multibox_loss import MultiBoxLoss + +__all__ = ['L2Norm', 'MultiBoxLoss'] + diff --git a/examples/pytorch/vision/Face_Detection/layers/modules/l2norm.py b/examples/pytorch/vision/Face_Detection/layers/modules/l2norm.py new file mode 100755 index 000000000..e6c909e63 --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/layers/modules/l2norm.py @@ -0,0 +1,29 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. + +import torch +import torch.nn as nn +import torch.nn.init as init + + +class L2Norm(nn.Module): + def __init__(self,n_channels, scale): + super(L2Norm,self).__init__() + self.n_channels = n_channels + self.gamma = scale or None + self.eps = 1e-10 + self.weight = nn.Parameter(torch.Tensor(self.n_channels)) + self.reset_parameters() + + def reset_parameters(self): + init.constant_(self.weight,self.gamma) + + def forward(self, x): + norm = x.pow(2).sum(dim=1, keepdim=True).sqrt()+self.eps + #x /= norm + x = torch.div(x,norm) + out = self.weight.unsqueeze(0).unsqueeze(2).unsqueeze(3).expand_as(x) * x + return out + + + \ No newline at end of file diff --git a/examples/pytorch/vision/Face_Detection/layers/modules/multibox_loss.py b/examples/pytorch/vision/Face_Detection/layers/modules/multibox_loss.py new file mode 100755 index 000000000..5ad634cc2 --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/layers/modules/multibox_loss.py @@ -0,0 +1,118 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. + +import math +import torch +import torch.nn as nn +import torch.nn.functional as F +from torch.autograd import Variable + + +from ..bbox_utils import match, log_sum_exp, match_ssd + + +class MultiBoxLoss(nn.Module): + """SSD Weighted Loss Function + Compute Targets: + 1) Produce Confidence Target Indices by matching ground truth boxes + with (default) 'priorboxes' that have jaccard index > threshold parameter + (default threshold: 0.5). 
+ 2) Produce localization target by 'encoding' variance into offsets of ground + truth boxes and their matched 'priorboxes'. + 3) Hard negative mining to filter the excessive number of negative examples + that comes with using a large number of default bounding boxes. + (default negative:positive ratio 3:1) + Objective Loss: + L(x,c,l,g) = (Lconf(x, c) + αLloc(x,l,g)) / N + Where, Lconf is the CrossEntropy Loss and Lloc is the SmoothL1 Loss + weighted by α which is set to 1 by cross val. + Args: + c: class confidences, + l: predicted boxes, + g: ground truth boxes + N: number of matched default boxes + See: https://arxiv.org/pdf/1512.02325.pdf for more details. + """ + + def __init__(self, cfg, dataset, use_gpu=True): + super(MultiBoxLoss, self).__init__() + self.use_gpu = use_gpu + self.num_classes = cfg.NUM_CLASSES + self.negpos_ratio = cfg.NEG_POS_RATIOS + self.variance = cfg.VARIANCE + self.dataset = dataset + + self.threshold = cfg.FACE.OVERLAP_THRESH + self.match = match + + def forward(self, predictions, targets): + """Multibox Loss + Args: + predictions (tuple): A tuple containing loc preds, conf preds, + and prior boxes from SSD net. + conf shape: torch.size(batch_size,num_priors,num_classes) + loc shape: torch.size(batch_size,num_priors,4) + priors shape: torch.size(num_priors,4) + + targets (tensor): Ground truth boxes and labels for a batch, + shape: [batch_size,num_objs,5] (last idx is the label). + """ + loc_data, conf_data, priors = predictions + num = loc_data.size(0) + priors = priors[:loc_data.size(1), :] + num_priors = (priors.size(0)) + num_classes = self.num_classes + + # match priors (default boxes) and ground truth boxes + loc_t = torch.Tensor(num, num_priors, 4) + conf_t = torch.LongTensor(num, num_priors) + for idx in range(num): + truths = targets[idx][:, :-1].data + labels = targets[idx][:, -1].data + defaults = priors.data + self.match(self.threshold, truths, defaults, self.variance, labels, + loc_t, conf_t, idx) + if self.use_gpu: + loc_t = loc_t.cuda() + conf_t = conf_t.cuda() + # wrap targets + loc_t = Variable(loc_t, requires_grad=False) + conf_t = Variable(conf_t, requires_grad=False) + + pos = conf_t > 0 + num_pos = pos.sum(dim=1, keepdim=True) + # Localization Loss (Smooth L1) + # Shape: [batch,num_priors,4] + pos_idx = pos.unsqueeze(pos.dim()).expand_as(loc_data) + loc_p = loc_data[pos_idx].view(-1, 4) + loc_t = loc_t[pos_idx].view(-1, 4) + loss_l = F.smooth_l1_loss(loc_p, loc_t, size_average=False) + # print(loc_p) + # Compute max conf across batch for hard negative mining + batch_conf = conf_data.view(-1, self.num_classes) + loss_c = log_sum_exp(batch_conf) - \ + batch_conf.gather(1, conf_t.view(-1, 1)) + + # Hard Negative Mining + loss_c[pos.view(-1, 1)] = 0 # filter out pos boxes for now + loss_c = loss_c.view(num, -1) + _, loss_idx = loss_c.sort(1, descending=True) + _, idx_rank = loss_idx.sort(1) + num_pos = pos.long().sum(1, keepdim=True) + num_neg = torch.clamp(self.negpos_ratio * + num_pos, max=pos.size(1) - 1) + neg = idx_rank < num_neg.expand_as(idx_rank) + + # Confidence Loss Including Positive and Negative Examples + pos_idx = pos.unsqueeze(2).expand_as(conf_data) + neg_idx = neg.unsqueeze(2).expand_as(conf_data) + conf_p = conf_data[(pos_idx + neg_idx).gt(0) + ].view(-1, self.num_classes) + targets_weighted = conf_t[(pos + neg).gt(0)] + loss_c = F.cross_entropy(conf_p, targets_weighted, size_average=False) + + # Sum of losses: L(x,c,l,g) = (Lconf(x, c) + αLloc(x,l,g)) / N + N = num_pos.data.sum() if num_pos.data.sum() > 0 else num + loss_l 
/= N + loss_c /= N + return loss_l, loss_c \ No newline at end of file diff --git a/examples/pytorch/vision/Face_Detection/models/RPool_Face_C.py b/examples/pytorch/vision/Face_Detection/models/RPool_Face_C.py new file mode 100755 index 000000000..96396061d --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/models/RPool_Face_C.py @@ -0,0 +1,382 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. + +from __future__ import division +from __future__ import absolute_import +from __future__ import print_function + +import os +import torch +import torch.nn as nn +import torch.nn.init as init +import torch.nn.functional as F + + +from layers import * +from data.config import cfg +import numpy as np + +from edgeml_pytorch.graph.rnnpool import * + +class S3FD(nn.Module): + """Single Shot Multibox Architecture + The network is composed of a base VGG network followed by the + added multibox conv layers. Each multibox layer branches into + 1) conv2d for class conf scores + 2) conv2d for localization predictions + 3) associated priorbox layer to produce default bounding + boxes specific to the layer's feature map size. + See: https://arxiv.org/pdf/1512.02325.pdf for more details. + + Args: + phase: (string) Can be "test" or "train" + size: input image size + base: VGG16 layers for input, size of either 300 or 500 + extras: extra layers that feed to multibox loc and conf layers + head: "multibox head" consists of loc and conf conv layers + """ + + def __init__(self, phase, base, head, num_classes): + super(S3FD, self).__init__() + self.phase = phase + self.num_classes = num_classes + ''' + self.priorbox = PriorBox(size,cfg) + self.priors = Variable(self.priorbox.forward(), volatile=True) + ''' + # SSD network + + self.unfold = nn.Unfold(kernel_size=(8,8),stride=(4,4)) + + self.rnn_model = RNNPool(8, 8, 16, 16, 3) + + self.mob = nn.ModuleList(base) + # Layer learns to scale the l2 normalized features from conv4_3 + self.L2Norm3_3 = L2Norm(24, 10) + self.L2Norm4_3 = L2Norm(32, 8) + self.L2Norm5_3 = L2Norm(64, 5) + + + self.loc = nn.ModuleList(head[0]) + self.conf = nn.ModuleList(head[1]) + + if self.phase == 'test': + self.softmax = nn.Softmax(dim=-1) + # self.detect = Detect(cfg) + + + + def forward(self, x): + """Applies network layers and ops on input image(s) x. + + Args: + x: input image or batch of images. Shape: [batch,3,300,300]. + + Return: + Depending on phase: + test: + Variable(tensor) of output class label predictions, + confidence score, and corresponding location predictions for + each object detected. 
Shape: [batch,topk,7] + + train: + list of concat outputs from: + 1: confidence layers, Shape: [batch*num_priors,num_classes] + 2: localization layers, Shape: [batch,num_priors*4] + 3: priorbox layers, Shape: [2,num_priors*4] + """ + size = x.size()[2:] + batch_size = x.shape[0] + sources = list() + loc = list() + conf = list() + + patches = self.unfold(x) + patches = torch.cat(torch.unbind(patches,dim=2),dim=0) + patches = torch.reshape(patches,(-1,3,8,8)) + + output_x = int((x.shape[2]-8)/4 + 1) + output_y = int((x.shape[3]-8)/4 + 1) + + rnnX = self.rnn_model(patches, int(batch_size)*output_x*output_y) + + x = torch.stack(torch.split(rnnX, split_size_or_sections=int(batch_size), dim=0),dim=2) + + x = F.fold(x, kernel_size=(1,1), output_size=(output_x,output_y)) + + x = F.pad(x, (0,1,0,1), mode='replicate') + + + + for k in range(2): + x = self.mob[k](x) + + s = self.L2Norm3_3(x) + sources.append(s) + + for k in range(2, 5): + x = self.mob[k](x) + + s = self.L2Norm4_3(x) + sources.append(s) + + for k in range(5, 9): + x = self.mob[k](x) + + s = self.L2Norm5_3(x) + sources.append(s) + + for k in range(9, 12): + x = self.mob[k](x) + sources.append(x) + + for k in range(12, 14): + x = self.mob[k](x) + sources.append(x) + + for k in range(14, 15): + x = self.mob[k](x) + sources.append(x) + + + + # apply multibox head to source layers + + loc_x = self.loc[0](sources[0]) + conf_x = self.conf[0](sources[0]) + + max_conf, _ = torch.max(conf_x[:, 0:3, :, :], dim=1, keepdim=True) + conf_x = torch.cat((max_conf, conf_x[:, 3:, :, :]), dim=1) + + loc.append(loc_x.permute(0, 2, 3, 1).contiguous()) + conf.append(conf_x.permute(0, 2, 3, 1).contiguous()) + + for i in range(1, len(sources)): + x = sources[i] + conf.append(self.conf[i](x).permute(0, 2, 3, 1).contiguous()) + loc.append(self.loc[i](x).permute(0, 2, 3, 1).contiguous()) + + + features_maps = [] + for i in range(len(loc)): + feat = [] + feat += [loc[i].size(1), loc[i].size(2)] + features_maps += [feat] + + self.priorbox = PriorBox(size, features_maps, cfg) + self.priors = self.priorbox.forward() + + loc = torch.cat([o.view(o.size(0), -1) for o in loc], 1) + conf = torch.cat([o.view(o.size(0), -1) for o in conf], 1) + + + if self.phase == 'test': + output = detect_function( + loc.view(loc.size(0), -1, 4), # loc preds + self.softmax(conf.view(conf.size(0), -1, + self.num_classes)), # conf preds + self.priors.type(type(x.data)) # default boxes + ) + + else: + output = ( + loc.view(loc.size(0), -1, 4), + conf.view(conf.size(0), -1, self.num_classes), + self.priors + ) + return output + + def load_weights(self, base_file): + other, ext = os.path.splitext(base_file) + if ext == '.pkl' or '.pth': + print('Loading weights into state dict...') + mdata = torch.load(base_file, + map_location=lambda storage, loc: storage) + weights = mdata['weight'] + epoch = mdata['epoch'] + self.load_state_dict(weights) + print('Finished!') + else: + print('Sorry only .pth and .pkl files supported.') + return epoch + + def xavier(self, param): + init.xavier_uniform(param) + + def weights_init(self, m): + if isinstance(m, nn.Conv2d): + self.xavier(m.weight.data) + m.bias.data.zero_() + + + + +def _make_divisible(v, divisor, min_value=None): + """ + This function is taken from the original tf repo. 
+ It ensures that all layers have a channel number that is divisible by 8 + It can be seen here: + https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/mobilenet.py + :param v: + :param divisor: + :param min_value: + :return: + """ + if min_value is None: + min_value = divisor + new_v = max(min_value, int(v + divisor / 2) // divisor * divisor) + # Make sure that round down does not go down by more than 10%. + if new_v < 0.9 * v: + new_v += divisor + return new_v + + +class ConvBNReLU(nn.Sequential): + def __init__(self, in_planes, out_planes, kernel_size=3, stride=1, groups=1): + padding = (kernel_size - 1) // 2 + super(ConvBNReLU, self).__init__( + nn.Conv2d(in_planes, out_planes, kernel_size, stride, padding, groups=groups, bias=False), + nn.BatchNorm2d(out_planes), + nn.ReLU6(inplace=True) + ) + + +class InvertedResidual(nn.Module): + def __init__(self, inp, oup, stride, expand_ratio): + super(InvertedResidual, self).__init__() + self.stride = stride + assert stride in [1, 2] + + hidden_dim = int(round(inp * expand_ratio)) + self.use_res_connect = self.stride == 1 and inp == oup + + layers = [] + if expand_ratio != 1: + # pw + layers.append(ConvBNReLU(inp, hidden_dim, kernel_size=1)) + layers.extend([ + # dw + ConvBNReLU(hidden_dim, hidden_dim, stride=stride, groups=hidden_dim), + # pw-linear + nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False), + nn.BatchNorm2d(oup), + ]) + self.conv = nn.Sequential(*layers) + + def forward(self, x): + if self.use_res_connect: + return x + self.conv(x) + else: + return self.conv(x) + + +class MobileNetV2(nn.Module): + def __init__(self, num_classes=1000, width_mult=1.0, inverted_residual_setting=None, round_nearest=8): + """ + MobileNet V2 main class + Args: + num_classes (int): Number of classes + width_mult (float): Width multiplier - adjusts number of channels in each layer by this amount + inverted_residual_setting: Network structure + round_nearest (int): Round the number of channels in each layer to be a multiple of this number + Set to 1 to turn off rounding + """ + super(MobileNetV2, self).__init__() + block = InvertedResidual + input_channel = 64 + + if inverted_residual_setting is None: + inverted_residual_setting = [ + # t, c, n, s + # [1, 16, 1, 1], + [1, 24, 1, 1], + [6, 24, 1, 1], + [6, 32, 3, 2], + [6, 64, 4, 2], + [6, 96, 3, 2], + [6, 160, 2, 2], + [6, 320, 1, 2], + ] + + if len(inverted_residual_setting) == 0 or len(inverted_residual_setting[0]) != 4: + raise ValueError("inverted_residual_setting should be non-empty " + "or a 4-element list, got {}".format(inverted_residual_setting)) + + # building first layer + input_channel = _make_divisible(input_channel * width_mult, round_nearest) + self.last_channel = _make_divisible(last_channel * max(1.0, width_mult), round_nearest) + self.layers = [] + # building inverted residual blocks + for t, c, n, s in inverted_residual_setting: + output_channel = _make_divisible(c * width_mult, round_nearest) + for i in range(n): + stride = s if i == 0 else 1 + self.layers.append(block(input_channel, output_channel, stride, expand_ratio=t)) + input_channel = output_channel + + # weight initialization + for m in self.modules(): + if isinstance(m, nn.Conv2d): + nn.init.kaiming_normal_(m.weight, mode='fan_out') + if m.bias is not None: + nn.init.zeros_(m.bias) + elif isinstance(m, nn.BatchNorm2d): + nn.init.ones_(m.weight) + nn.init.zeros_(m.bias) + elif isinstance(m, nn.Linear): + nn.init.normal_(m.weight, 0, 0.01) + nn.init.zeros_(m.bias) + + + +def multibox(mobilenet, num_classes): 
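+    # SSD-style "multibox" head: for each source feature map, one localization conv
+    # (4 box offsets per location) and one confidence conv. The first confidence conv
+    # emits 3 + (num_classes - 1) channels; S3FD.forward max-reduces the first 3
+    # background channels into one (max-out background label, as in S3FD).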
+ loc_layers = [] + conf_layers = [] + + loc_layers += [nn.Conv2d(24, 4, + kernel_size=5, padding=2)] + conf_layers += [nn.Conv2d(24, + 3 + (num_classes-1), kernel_size=5, padding=2)] + + loc_layers += [nn.Conv2d(32, + 4, kernel_size=5, padding=2)] + conf_layers += [nn.Conv2d(32, + num_classes, kernel_size=5, padding=2)] + + loc_layers += [nn.Conv2d(64, + 4, kernel_size=5, padding=2)] + conf_layers += [nn.Conv2d(64, + num_classes, kernel_size=5, padding=2)] + + loc_layers += [nn.Conv2d(96, + 4, kernel_size=1, padding=0)] + conf_layers += [nn.Conv2d(96, + num_classes, kernel_size=1, padding=0)] + + loc_layers += [nn.Conv2d(160, + 4, kernel_size=1, padding=0)] + conf_layers += [nn.Conv2d(160, + num_classes, kernel_size=1, padding=0)] + + loc_layers += [nn.Conv2d(320, + 4, kernel_size=3, padding=1)] + conf_layers += [nn.Conv2d(320, + num_classes, kernel_size=3, padding=1)] + + + + return mobilenet, (loc_layers, conf_layers) + + +def build_s3fd(phase, num_classes=2): + base_, head_ = multibox( + MobileNetV2().layers, num_classes) + + return S3FD(phase, base_, head_, num_classes) + + +if __name__ == '__main__': + net = build_s3fd('train', num_classes=2) + inputs = Variable(torch.randn(4, 3, 640, 640)) + output = net(inputs) + diff --git a/examples/pytorch/vision/Face_Detection/models/RPool_Face_QVGA_monochrome.py b/examples/pytorch/vision/Face_Detection/models/RPool_Face_QVGA_monochrome.py new file mode 100644 index 000000000..bb796f1a7 --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/models/RPool_Face_QVGA_monochrome.py @@ -0,0 +1,348 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. + +from __future__ import division +from __future__ import absolute_import +from __future__ import print_function + +import os +import torch +import torch.nn as nn +import torch.nn.init as init +import torch.nn.functional as F + + +from layers import * +from data.config_qvga import cfg +import numpy as np + +from edgeml_pytorch.graph.rnnpool import * + +class S3FD(nn.Module): + """Single Shot Multibox Architecture + The network is composed of a base VGG network followed by the + added multibox conv layers. Each multibox layer branches into + 1) conv2d for class conf scores + 2) conv2d for localization predictions + 3) associated priorbox layer to produce default bounding + boxes specific to the layer's feature map size. + See: https://arxiv.org/pdf/1512.02325.pdf for more details. 
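+ Note: in this RNNPool variant the early convolutional stem is replaced by a
+ single strided ConvBNReLU followed by an RNNPool layer applied to 8x8 input
+ patches (see forward()); the remaining base is a MobileNetV2-style stack of
+ inverted residual blocks rather than VGG16.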
+ Args: + phase: (string) Can be "test" or "train" + size: input image size + base: VGG16 layers for input, size of either 300 or 500 + extras: extra layers that feed to multibox loc and conf layers + head: "multibox head" consists of loc and conf conv layers + """ + + def __init__(self, phase, base, head, num_classes): + super(S3FD, self).__init__() + self.phase = phase + self.num_classes = num_classes + ''' + self.priorbox = PriorBox(size,cfg) + self.priors = Variable(self.priorbox.forward(), volatile=True) + ''' + # SSD network + self.conv = ConvBNReLU(1, 4, stride=2) + + self.unfold = nn.Unfold(kernel_size=(8,8),stride=(4,4)) + + self.rnn_model = RNNPool(8, 8, 16, 16, 4)#num_init_features) + + self.mob = nn.ModuleList(base) + # Layer learns to scale the l2 normalized features from conv4_3 + self.L2Norm3_3 = L2Norm(32, 10) + self.L2Norm4_3 = L2Norm(32, 8) + self.L2Norm5_3 = L2Norm(96, 5) + + + self.loc = nn.ModuleList(head[0]) + self.conf = nn.ModuleList(head[1]) + + + if self.phase == 'test': + self.softmax = nn.Softmax(dim=-1) + + + def forward(self, x): + """Applies network layers and ops on input image(s) x. + Args: + x: input image or batch of images. Shape: [batch,3,300,300]. + Return: + Depending on phase: + test: + Variable(tensor) of output class label predictions, + confidence score, and corresponding location predictions for + each object detected. Shape: [batch,topk,7] + train: + list of concat outputs from: + 1: confidence layers, Shape: [batch*num_priors,num_classes] + 2: localization layers, Shape: [batch,num_priors*4] + 3: priorbox layers, Shape: [2,num_priors*4] + """ + size = x.size()[2:] + batch_size = x.shape[0] + sources = list() + loc = list() + conf = list() + + x = self.conv(x) + + patches = self.unfold(x) + patches = torch.cat(torch.unbind(patches,dim=2),dim=0) + patches = torch.reshape(patches,(-1,4,8,8)) + + output_x = int((x.shape[2]-8)/4 + 1) + output_y = int((x.shape[3]-8)/4 + 1) + + rnnX = self.rnn_model(patches, int(batch_size)*output_x*output_y) + + x = torch.stack(torch.split(rnnX, split_size_or_sections=int(batch_size), dim=0),dim=2) + + x = F.fold(x, kernel_size=(1,1), output_size=(output_x,output_y)) + + x = F.pad(x, (0,1,0,1), mode='replicate') + + for k in range(4): + x = self.mob[k](x) + + s = self.L2Norm3_3(x) + sources.append(s) + + for k in range(4, 8): + x = self.mob[k](x) + + s = self.L2Norm4_3(x) + sources.append(s) + + for k in range(8, 11): + x = self.mob[k](x) + + s = self.L2Norm5_3(x) + sources.append(s) + + for k in range(11, 14): + x = self.mob[k](x) + sources.append(x) + + + # apply multibox head to source layers + + loc_x = self.loc[0](sources[0]) + conf_x = self.conf[0](sources[0]) + + max_conf, _ = torch.max(conf_x[:, 0:3, :, :], dim=1, keepdim=True) + conf_x = torch.cat((max_conf, conf_x[:, 3:, :, :]), dim=1) + + loc.append(loc_x.permute(0, 2, 3, 1).contiguous()) + conf.append(conf_x.permute(0, 2, 3, 1).contiguous()) + + for i in range(1, len(sources)): + x = sources[i] + conf.append(self.conf[i](x).permute(0, 2, 3, 1).contiguous()) + loc.append(self.loc[i](x).permute(0, 2, 3, 1).contiguous()) + + + features_maps = [] + for i in range(len(loc)): + feat = [] + feat += [loc[i].size(1), loc[i].size(2)] + features_maps += [feat] + + self.priorbox = PriorBox(size, features_maps, cfg) + + self.priors = self.priorbox.forward() + + loc = torch.cat([o.view(o.size(0), -1) for o in loc], 1) + conf = torch.cat([o.view(o.size(0), -1) for o in conf], 1) + + + if self.phase == 'test': + output = detect_function(cfg, + loc.view(loc.size(0), -1, 
4), # loc preds + self.softmax(conf.view(conf.size(0), -1, + self.num_classes)), # conf preds + self.priors.type(type(x.data)) # default boxes + ) + + else: + output = ( + loc.view(loc.size(0), -1, 4), + conf.view(conf.size(0), -1, self.num_classes), + self.priors + ) + return output + + def load_weights(self, base_file): + other, ext = os.path.splitext(base_file) + if ext == '.pkl' or '.pth': + print('Loading weights into state dict...') + mdata = torch.load(base_file, + map_location=lambda storage, loc: storage) + weights = mdata['weight'] + epoch = mdata['epoch'] + self.load_state_dict(weights) + print('Finished!') + else: + print('Sorry only .pth and .pkl files supported.') + return epoch + + def xavier(self, param): + init.xavier_uniform(param) + + def weights_init(self, m): + if isinstance(m, nn.Conv2d): + self.xavier(m.weight.data) + m.bias.data.zero_() + + + + +def _make_divisible(v, divisor, min_value=None): + """ + This function is taken from the original tf repo. + It ensures that all layers have a channel number that is divisible by 8 + It can be seen here: + https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/mobilenet.py + :param v: + :param divisor: + :param min_value: + :return: + """ + if min_value is None: + min_value = divisor + new_v = max(min_value, int(v + divisor / 2) // divisor * divisor) + # Make sure that round down does not go down by more than 10%. + if new_v < 0.9 * v: + new_v += divisor + return new_v + + +class ConvBNReLU(nn.Sequential): + def __init__(self, in_planes, out_planes, kernel_size=3, stride=1, groups=1): + padding = (kernel_size - 1) // 2 + super(ConvBNReLU, self).__init__( + nn.Conv2d(in_planes, out_planes, kernel_size, stride, padding, groups=groups, bias=False), + nn.BatchNorm2d(out_planes), + nn.ReLU6(inplace=True) + ) + + +class InvertedResidual(nn.Module): + def __init__(self, inp, oup, stride, expand_ratio): + super(InvertedResidual, self).__init__() + self.stride = stride + assert stride in [1, 2] + + hidden_dim = int(round(inp * expand_ratio)) + self.use_res_connect = self.stride == 1 and inp == oup + + layers = [] + if expand_ratio != 1: + # pw + layers.append(ConvBNReLU(inp, hidden_dim, kernel_size=1)) + layers.extend([ + # dw + ConvBNReLU(hidden_dim, hidden_dim, stride=stride, groups=hidden_dim), + # pw-linear + nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False), + nn.BatchNorm2d(oup), + ]) + self.conv = nn.Sequential(*layers) + + def forward(self, x): + if self.use_res_connect: + return x + self.conv(x) + else: + return self.conv(x) + + +class MobileNetV2(nn.Module): + def __init__(self, num_classes=1000, width_mult=1.0, inverted_residual_setting=None, round_nearest=8): + """ + MobileNet V2 main class + Args: + num_classes (int): Number of classes + width_mult (float): Width multiplier - adjusts number of channels in each layer by this amount + inverted_residual_setting: Network structure + round_nearest (int): Round the number of channels in each layer to be a multiple of this number + Set to 1 to turn off rounding + """ + super(MobileNetV2, self).__init__() + block = InvertedResidual + input_channel = 64 + + if inverted_residual_setting is None: + inverted_residual_setting = [ + # t, c, n, s + [2, 32, 4, 1], + [2, 32, 4, 1], + [2, 96, 3, 2], + [2, 128, 3, 1], + ] + + # only check the first element, assuming user knows t,c,n,s are required + if len(inverted_residual_setting) == 0 or len(inverted_residual_setting[0]) != 4: + raise ValueError("inverted_residual_setting should be non-empty " + "or a 4-element 
list, got {}".format(inverted_residual_setting)) + + # building first layer + input_channel = _make_divisible(input_channel * width_mult, round_nearest) + self.last_channel = _make_divisible(last_channel * max(1.0, width_mult), round_nearest) + self.layers = [] + # building inverted residual blocks + for t, c, n, s in inverted_residual_setting: + output_channel = _make_divisible(c * width_mult, round_nearest) + for i in range(n): + stride = s if i == 0 else 1 + self.layers.append(block(input_channel, output_channel, stride, expand_ratio=t)) + input_channel = output_channel + + # weight initialization + for m in self.modules(): + if isinstance(m, nn.Conv2d): + nn.init.kaiming_normal_(m.weight, mode='fan_out') + if m.bias is not None: + nn.init.zeros_(m.bias) + elif isinstance(m, nn.BatchNorm2d): + nn.init.ones_(m.weight) + nn.init.zeros_(m.bias) + elif isinstance(m, nn.Linear): + nn.init.normal_(m.weight, 0, 0.01) + nn.init.zeros_(m.bias) + + + + +def multibox(mobilenet, num_classes): + loc_layers = [] + conf_layers = [] + + loc_layers += [nn.Conv2d(32, 4, kernel_size=3, padding=1)] + conf_layers += [nn.Conv2d(32, 3 + (num_classes-1), kernel_size=3, padding=1)] + + loc_layers += [nn.Conv2d(32, 4, kernel_size=3, padding=1)] + conf_layers += [nn.Conv2d(32, num_classes, kernel_size=3, padding=1)] + + loc_layers += [nn.Conv2d(96, 4, kernel_size=3, padding=1)] + conf_layers += [nn.Conv2d(96, num_classes, kernel_size=3, padding=1)] + + loc_layers += [nn.Conv2d(128, 4, kernel_size=3, padding=1)] + conf_layers += [nn.Conv2d(128, num_classes, kernel_size=3, padding=1)] + + + return mobilenet, (loc_layers, conf_layers) + + +def build_s3fd(phase, num_classes=2): + base_, head_ = multibox( + MobileNetV2().layers, num_classes) + + return S3FD(phase, base_, head_, num_classes) + + +if __name__ == '__main__': + net = build_s3fd('train', num_classes=2) + inputs = Variable(torch.randn(4, 1, 320, 320)) + output = net(inputs) diff --git a/examples/pytorch/vision/Face_Detection/models/RPool_Face_Quant.py b/examples/pytorch/vision/Face_Detection/models/RPool_Face_Quant.py new file mode 100755 index 000000000..6cac8fa90 --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/models/RPool_Face_Quant.py @@ -0,0 +1,375 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. + +from __future__ import division +from __future__ import absolute_import +from __future__ import print_function + +import os +import torch +import torch.nn as nn +import torch.nn.init as init +import torch.nn.functional as F + + +from layers import * +from data.config import cfg +import numpy as np + +from edgeml_pytorch.graph.rnnpool import * + +class S3FD(nn.Module): + """Single Shot Multibox Architecture + The network is composed of a base VGG network followed by the + added multibox conv layers. Each multibox layer branches into + 1) conv2d for class conf scores + 2) conv2d for localization predictions + 3) associated priorbox layer to produce default bounding + boxes specific to the layer's feature map size. + See: https://arxiv.org/pdf/1512.02325.pdf for more details. 
+ Args: + phase: (string) Can be "test" or "train" + size: input image size + base: VGG16 layers for input, size of either 300 or 500 + extras: extra layers that feed to multibox loc and conf layers + head: "multibox head" consists of loc and conf conv layers + """ + + def __init__(self, phase, base, head, num_classes): + super(S3FD, self).__init__() + self.phase = phase + self.num_classes = num_classes + ''' + self.priorbox = PriorBox(size,cfg) + self.priors = Variable(self.priorbox.forward(), volatile=True) + ''' + # SSD network + + self.conv_top = nn.Sequential(ConvBNReLU(3, 4, kernel_size=3, stride=2), ConvBNReLU(4, 4, kernel_size=3)) + + self.unfold = nn.Unfold(kernel_size=(8,8),stride=(4,4)) + + self.rnn_model = RNNPool(8, 8, 8, 8, 4) + + self.mob = nn.ModuleList(base) + # Layer learns to scale the l2 normalized features from conv4_3 + self.L2Norm3_3 = L2Norm(4, 10) + self.L2Norm4_3 = L2Norm(16, 8) + self.L2Norm5_3 = L2Norm(24, 5) + + + self.loc = nn.ModuleList(head[0]) + self.conf = nn.ModuleList(head[1]) + + if self.phase == 'test': + self.softmax = nn.Softmax(dim=-1) + + + def forward(self, x): + """Applies network layers and ops on input image(s) x. + Args: + x: input image or batch of images. Shape: [batch,3,300,300]. + Return: + Depending on phase: + test: + Variable(tensor) of output class label predictions, + confidence score, and corresponding location predictions for + each object detected. Shape: [batch,topk,7] + train: + list of concat outputs from: + 1: confidence layers, Shape: [batch*num_priors,num_classes] + 2: localization layers, Shape: [batch,num_priors*4] + 3: priorbox layers, Shape: [2,num_priors*4] + """ + size = x.size()[2:] + batch_size = x.shape[0] + sources = list() + loc = list() + conf = list() + + x = self.conv_top(x) + + s = self.L2Norm3_3(x) + sources.append(s) + + patches = self.unfold(x) + patches = torch.cat(torch.unbind(patches,dim=2),dim=0) + patches = torch.reshape(patches,(-1,4,8,8)) + + output_x = int((x.shape[2]-8)/4 + 1) + output_y = int((x.shape[3]-8)/4 + 1) + + rnnX = self.rnn_model(patches, int(batch_size)*output_x*output_y) + + x = torch.stack(torch.split(rnnX, split_size_or_sections=int(batch_size), dim=0),dim=2) + + x = F.fold(x, kernel_size=(1,1), output_size=(output_x,output_y)) + + x = F.pad(x, (0,1,0,1), mode='replicate') + + for k in range(4): + x = self.mob[k](x) + + s = self.L2Norm4_3(x) + sources.append(s) + + for k in range(4, 8): + x = self.mob[k](x) + + s = self.L2Norm5_3(x) + sources.append(s) + + for k in range(8, 10): + x = self.mob[k](x) + sources.append(x) + + for k in range(10, 11): + x = self.mob[k](x) + sources.append(x) + + for k in range(11, 12): + x = self.mob[k](x) + sources.append(x) + + + + # apply multibox head to source layers + + loc_x = self.loc[0](sources[0]) + conf_x = self.conf[0](sources[0]) + + loc_x = self.loc[1](loc_x) + conf_x = self.conf[1](conf_x) + + max_conf, _ = torch.max(conf_x[:, 0:3, :, :], dim=1, keepdim=True) + conf_x = torch.cat((max_conf, conf_x[:, 3:, :, :]), dim=1) + + loc.append(loc_x.permute(0, 2, 3, 1).contiguous()) + conf.append(conf_x.permute(0, 2, 3, 1).contiguous()) + + for i in range(1, len(sources)): + x = sources[i] + conf.append(self.conf[i+1](x).permute(0, 2, 3, 1).contiguous()) + loc.append(self.loc[i+1](x).permute(0, 2, 3, 1).contiguous()) + + + features_maps = [] + for i in range(len(loc)): + feat = [] + feat += [loc[i].size(1), loc[i].size(2)] + features_maps += [feat] + + self.priorbox = PriorBox(size, features_maps, cfg) + self.priors = self.priorbox.forward() + + loc = 
torch.cat([o.view(o.size(0), -1) for o in loc], 1) + conf = torch.cat([o.view(o.size(0), -1) for o in conf], 1) + + + if self.phase == 'test': + output = detect_function(cfg, + loc.view(loc.size(0), -1, 4), # loc preds + self.softmax(conf.view(conf.size(0), -1, + self.num_classes)), # conf preds + self.priors.type(type(x.data)) # default boxes + ) + + else: + output = ( + loc.view(loc.size(0), -1, 4), + conf.view(conf.size(0), -1, self.num_classes), + self.priors + ) + return output + + def load_weights(self, base_file): + other, ext = os.path.splitext(base_file) + if ext == '.pkl' or '.pth': + print('Loading weights into state dict...') + mdata = torch.load(base_file, + map_location=lambda storage, loc: storage) + weights = mdata['weight'] + epoch = mdata['epoch'] + self.load_state_dict(weights) + print('Finished!') + else: + print('Sorry only .pth and .pkl files supported.') + return epoch + + def xavier(self, param): + init.xavier_uniform(param) + + def weights_init(self, m): + if isinstance(m, nn.Conv2d): + self.xavier(m.weight.data) + m.bias.data.zero_() + + + + +def _make_divisible(v, divisor, min_value=None): + """ + This function is taken from the original tf repo. + It ensures that all layers have a channel number that is divisible by 8 + It can be seen here: + https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/mobilenet.py + :param v: + :param divisor: + :param min_value: + :return: + """ + if min_value is None: + min_value = divisor + new_v = max(min_value, int(v + divisor / 2) // divisor * divisor) + # Make sure that round down does not go down by more than 10%. + if new_v < 0.9 * v: + new_v += divisor + return new_v + + +class ConvBNReLU(nn.Sequential): + def __init__(self, in_planes, out_planes, kernel_size=3, stride=1, groups=1): + padding = (kernel_size - 1) // 2 + super(ConvBNReLU, self).__init__( + nn.Conv2d(in_planes, out_planes, kernel_size, stride, padding, groups=groups, bias=False), + nn.BatchNorm2d(out_planes), + nn.ReLU6(inplace=True) + ) + + +class InvertedResidual(nn.Module): + def __init__(self, inp, oup, stride, expand_ratio): + super(InvertedResidual, self).__init__() + self.stride = stride + assert stride in [1, 2] + + hidden_dim = int(round(inp * expand_ratio)) + self.use_res_connect = self.stride == 1 and inp == oup + + layers = [] + if expand_ratio != 1: + # pw + layers.append(ConvBNReLU(inp, hidden_dim, kernel_size=1)) + layers.extend([ + # dw + ConvBNReLU(hidden_dim, hidden_dim, stride=stride, groups=hidden_dim), + # pw-linear + nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False), + nn.BatchNorm2d(oup), + ]) + self.conv = nn.Sequential(*layers) + + def forward(self, x): + if self.use_res_connect: + return x + self.conv(x) + else: + return self.conv(x) + + +class MobileNetV2(nn.Module): + def __init__(self, num_classes=1000, width_mult=1.0, inverted_residual_setting=None, round_nearest=8): + """ + MobileNet V2 main class + Args: + num_classes (int): Number of classes + width_mult (float): Width multiplier - adjusts number of channels in each layer by this amount + inverted_residual_setting: Network structure + round_nearest (int): Round the number of channels in each layer to be a multiple of this number + Set to 1 to turn off rounding + """ + super(MobileNetV2, self).__init__() + block = InvertedResidual + input_channel = 32 + + if inverted_residual_setting is None: + inverted_residual_setting = [ + # t, c, n, s + [2, 16, 4, 1], + [2, 24, 4, 2], + [2, 32, 2, 2], + [2, 64, 1, 2], + [2, 96, 1, 2], + ] + + if 
len(inverted_residual_setting) == 0 or len(inverted_residual_setting[0]) != 4: + raise ValueError("inverted_residual_setting should be non-empty " + "or a 4-element list, got {}".format(inverted_residual_setting)) + + # building first layer + input_channel = _make_divisible(input_channel * width_mult, round_nearest) + self.last_channel = _make_divisible(last_channel * max(1.0, width_mult), round_nearest) + self.layers = [] + # building inverted residual blocks + for t, c, n, s in inverted_residual_setting: + output_channel = _make_divisible(c * width_mult, round_nearest) + for i in range(n): + stride = s if i == 0 else 1 + self.layers.append(block(input_channel, output_channel, stride, expand_ratio=t)) + input_channel = output_channel + + # weight initialization + for m in self.modules(): + if isinstance(m, nn.Conv2d): + nn.init.kaiming_normal_(m.weight, mode='fan_out') + if m.bias is not None: + nn.init.zeros_(m.bias) + elif isinstance(m, nn.BatchNorm2d): + nn.init.ones_(m.weight) + nn.init.zeros_(m.bias) + elif isinstance(m, nn.Linear): + nn.init.normal_(m.weight, 0, 0.01) + nn.init.zeros_(m.bias) + + + + +def multibox(mobilenet, num_classes): + loc_layers = [] + conf_layers = [] + + loc_layers += nn.Sequential(ConvBNReLU(4, 8, kernel_size=3, stride=2), + nn.Conv2d(8, 4, kernel_size=3, padding=1)) + conf_layers += nn.Sequential(ConvBNReLU(4, 8, kernel_size=3, stride=2), + nn.Conv2d(8, 3 + (num_classes-1), kernel_size=3, padding=1)) + + loc_layers += [nn.Conv2d(16, + 4, kernel_size=3, padding=1)] + conf_layers += [nn.Conv2d(16, + num_classes, kernel_size=3, padding=1)] + + loc_layers += [nn.Conv2d(24, + 4, kernel_size=3, padding=1)] + conf_layers += [nn.Conv2d(24, + num_classes, kernel_size=3, padding=1)] + + loc_layers += [nn.Conv2d(32, + 4, kernel_size=3, padding=1)] + conf_layers += [nn.Conv2d(32, + num_classes, kernel_size=3, padding=1)] + + loc_layers += [nn.Conv2d(64, + 4, kernel_size=3, padding=1)] + conf_layers += [nn.Conv2d(64, + num_classes, kernel_size=3, padding=1)] + + loc_layers += [nn.Conv2d(96, + 4, kernel_size=3, padding=1)] + conf_layers += [nn.Conv2d(96, + num_classes, kernel_size=3, padding=1)] + + + + return mobilenet, (loc_layers, conf_layers) + + +def build_s3fd(phase, num_classes=2): + base_, head_ = multibox( + MobileNetV2().layers, num_classes) + + return S3FD(phase, base_, head_, num_classes) + + +if __name__ == '__main__': + net = build_s3fd('train', num_classes=2) + inputs = Variable(torch.randn(4, 3, 640, 640)) + output = net(inputs) diff --git a/examples/pytorch/vision/Face_Detection/models/__init__.py b/examples/pytorch/vision/Face_Detection/models/__init__.py new file mode 100644 index 000000000..f8dc538e3 --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/models/__init__.py @@ -0,0 +1,6 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. 
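+ # Re-export the RNNPool face-detector model variants at package level.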
+ +from __future__ import absolute_import +from .RPool_Face_C import * +from .RPool_Face_Quant import * diff --git a/examples/pytorch/vision/Face_Detection/prepare_wider_data.py b/examples/pytorch/vision/Face_Detection/prepare_wider_data.py new file mode 100755 index 000000000..bc6985c00 --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/prepare_wider_data.py @@ -0,0 +1,86 @@ +## This code is built on https://github.com/yxlijun/S3FD.pytorch +#-*- coding:utf-8 -*- + +from __future__ import division +from __future__ import absolute_import +from __future__ import print_function + + +import os +from data.config import cfg +import cv2 + +WIDER_ROOT = os.path.join(cfg.HOME, 'WIDER_FACE') +train_list_file = os.path.join(WIDER_ROOT, 'wider_face_split', + 'wider_face_train_bbx_gt.txt') +val_list_file = os.path.join(WIDER_ROOT, 'wider_face_split', + 'wider_face_val_bbx_gt.txt') + +WIDER_TRAIN = os.path.join(WIDER_ROOT, 'WIDER_train', 'images') +WIDER_VAL = os.path.join(WIDER_ROOT, 'WIDER_val', 'images') + + +def parse_wider_file(root, file): + with open(file, 'r') as fr: + lines = fr.readlines() + face_count = [] + img_paths = [] + face_loc = [] + img_faces = [] + count = 0 + flag = False + for k, line in enumerate(lines): + line = line.strip().strip('\n') + if count > 0: + line = line.split(' ') + count -= 1 + loc = [int(line[0]), int(line[1]), int(line[2]), int(line[3])] + face_loc += [loc] + if flag: + face_count += [int(line)] + flag = False + count = int(line) + if 'jpg' in line: + img_paths += [os.path.join(root, line)] + flag = True + + total_face = 0 + for k in face_count: + face_ = [] + for x in range(total_face, total_face + k): + face_.append(face_loc[x]) + img_faces += [face_] + total_face += k + return img_paths, img_faces + + +def wider_data_file(): + img_paths, bbox = parse_wider_file(WIDER_TRAIN, train_list_file) + fw = open(cfg.FACE.TRAIN_FILE, 'w') + for index in range(len(img_paths)): + path = img_paths[index] + boxes = bbox[index] + fw.write(path) + fw.write(' {}'.format(len(boxes))) + for box in boxes: + data = ' {} {} {} {} {}'.format(box[0], box[1], box[2], box[3], 1) + fw.write(data) + fw.write('\n') + fw.close() + + img_paths, bbox = parse_wider_file(WIDER_VAL, val_list_file) + fw = open(cfg.FACE.VAL_FILE, 'w') + for index in range(len(img_paths)): + path = img_paths[index] + boxes = bbox[index] + fw.write(path) + fw.write(' {}'.format(len(boxes))) + for box in boxes: + data = ' {} {} {} {} {}'.format(box[0], box[1], box[2], box[3], 1) + fw.write(data) + fw.write('\n') + fw.close() + + +if __name__ == '__main__': + wider_data_file() diff --git a/examples/pytorch/vision/Face_Detection/requirements.txt b/examples/pytorch/vision/Face_Detection/requirements.txt new file mode 100644 index 000000000..6d3a8719c --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/requirements.txt @@ -0,0 +1,10 @@ +Cython==0.29.15 +easydict==1.9 +importlib-metadata==1.5.0 +matplotlib==3.2.1 +opencv-python-headless==4.2.0.32 +PyYAML==3.12 +scikit-image==0.15.0 +tensorboard==1.14.0 +tensorboardX==1.9 +tqdm==4.36.1 \ No newline at end of file diff --git a/examples/pytorch/vision/Face_Detection/train.py b/examples/pytorch/vision/Face_Detection/train.py new file mode 100755 index 000000000..1b52089c5 --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/train.py @@ -0,0 +1,244 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. 
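+ # Training script for the RPool_Face detectors on WIDER FACE: builds the
+ # architecture selected by --model_arch, optimizes it with MultiBoxLoss and
+ # SGD under a cosine learning-rate schedule with warmup, validates after every
+ # epoch, and writes periodic checkpoints plus the best state to --save_folder.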
+ +from data import * +from layers.modules import MultiBoxLoss +import os +import time +import torch +import torch.nn as nn +import torch.optim as optim +import torch.nn.init as init +import torch.utils.data as data +import numpy as np +import argparse +import torch.backends.cudnn as cudnn + +from data.choose_config import cfg +cfg = cfg.cfg +from importlib import import_module + + +def str2bool(v): + return v.lower() in ("yes", "true", "t", "1") + +parser = argparse.ArgumentParser( + description='S3FD face Detector Training With Pytorch') +train_set = parser.add_mutually_exclusive_group() +parser.add_argument('--dataset', + default='face', + choices=['hand', 'face', 'head'], + help='Train target') +parser.add_argument('--basenet', + default='vgg16_reducedfc.pth', + help='Pretrained base model') +parser.add_argument('--batch_size', + default=16, type=int, + help='Batch size for training') +parser.add_argument('--resume', + default=None, type=str, + help='Checkpoint state_dict file to resume training from') +parser.add_argument('--model_arch', + default='RPool_Face_C', type=str, + choices=['RPool_Face_C', 'RPool_Face_Quant', 'RPool_Face_QVGA_monochrome'], + help='choose architecture among rpool variants') +parser.add_argument('--num_workers', + default=128, type=int, + help='Number of workers used in dataloading') +parser.add_argument('--cuda', + default=True, type=str2bool, + help='Use CUDA to train model') +parser.add_argument('--lr', '--learning-rate', + default=1e-2, type=float, + help='initial learning rate') +parser.add_argument('--momentum', + default=0.9, type=float, + help='Momentum value for optim') +parser.add_argument('--weight_decay', + default=5e-4, type=float, + help='Weight decay for SGD') +parser.add_argument('--gamma', + default=0.1, type=float, + help='Gamma update for SGD') +parser.add_argument('--multigpu', + default=False, type=str2bool, + help='Use mutil Gpu training') +parser.add_argument('--save_folder', + default='weights/', + help='Directory for saving checkpoint models') +parser.add_argument('--epochs', + default=300, type=int, + help='total epochs') +parser.add_argument('--save_frequency', + default=5000, type=int, + help='iterations interval after which checkpoint is saved') +args = parser.parse_args() + + +if torch.cuda.is_available(): + if args.cuda: + torch.set_default_tensor_type('torch.cuda.FloatTensor') + if not args.cuda: + print("WARNING: It looks like you have a CUDA device, but aren't " + + "using CUDA.\nRun with --cuda for optimal training speed.") + torch.set_default_tensor_type('torch.FloatTensor') +else: + torch.set_default_tensor_type('torch.FloatTensor') + +if not os.path.exists(args.save_folder): + os.makedirs(args.save_folder) + + +train_dataset = WIDERDetection(cfg.FACE.TRAIN_FILE, mode='train', mono_mode=cfg.IS_MONOCHROME) +val_dataset = WIDERDetection(cfg.FACE.VAL_FILE, mode='val', mono_mode=cfg.IS_MONOCHROME) + +train_loader = data.DataLoader(train_dataset, args.batch_size, + num_workers=args.num_workers, + shuffle=True, + collate_fn=detection_collate, + pin_memory=True) + +val_batchsize = args.batch_size // 2 +val_loader = data.DataLoader(val_dataset, val_batchsize, + num_workers=args.num_workers, + shuffle=False, + collate_fn=detection_collate, + pin_memory=True) + +min_loss = np.inf +start_epoch = 0 + +module = import_module('models.' 
+ args.model_arch) +net = module.build_s3fd('train', cfg.NUM_CLASSES) + + + +if args.cuda: + if args.multigpu: + net = torch.nn.DataParallel(net) + net = net.cuda() + cudnn.benckmark = True + +if args.resume: + print('Resuming training, loading {}...'.format(args.resume)) + net.load_state_dict(torch.load(args.resume)) + +optimizer = optim.SGD(net.parameters(), lr=args.lr, momentum=args.momentum, + weight_decay=args.weight_decay) + +criterion = MultiBoxLoss(cfg, args.dataset, args.cuda) +print('Loading wider dataset...') +print('Using the specified args:') +print(args) + + +def train(): + step_index = 0 + iteration = 0 + + for epoch in range(start_epoch, args.epochs): + net.train() + losses = 0 + train_loader_len = len(train_loader) + for batch_idx, (images, targets) in enumerate(train_loader): + adjust_learning_rate(optimizer, epoch, batch_idx, train_loader_len) + + if args.cuda: + images = images.cuda() + targets = [ann.cuda() + for ann in targets] + else: + images = images + targets = [ann for ann in targets] + + + t0 = time.time() + out = net(images) + # backprop + optimizer.zero_grad() + loss_l, loss_c = criterion(out, targets) + loss = loss_l + loss_c + loss.backward() + optimizer.step() + t1 = time.time() + losses += loss.item() + + if iteration % 10 == 0: + tloss = losses / (batch_idx + 1) + print('Timer: %.4f' % (t1 - t0)) + print('epoch:' + repr(epoch) + ' || iter:' + + repr(iteration) + ' || Loss:%.4f' % (tloss)) + print('->> conf loss:{:.4f} || loc loss:{:.4f}'.format( + loss_c.item(), loss_l.item())) + print('->>lr:{:.6f}'.format(optimizer.param_groups[0]['lr'])) + + if iteration != 0 and iteration % args.save_frequency == 0: + print('Saving state, iter:', iteration) + file = 'rpool_' + args.dataset + '_' + repr(iteration) + '_checkpoint.pth' + torch.save(net.state_dict(), + os.path.join(args.save_folder, file)) + iteration += 1 + + val(epoch) + if iteration == cfg.MAX_STEPS: + break + + +def val(epoch): + net.eval() + loc_loss = 0 + conf_loss = 0 + step = 0 + t1 = time.time() + with torch.no_grad(): + for batch_idx, (images, targets) in enumerate(val_loader): + if args.cuda: + images = images.cuda() + targets = [ann.cuda() + for ann in targets] + else: + images = images + targets = [ann for ann in targets] + + out = net(images) + loss_l, loss_c = criterion(out, targets) + loss = loss_l + loss_c + loc_loss += loss_l.item() + conf_loss += loss_c.item() + step += 1 + + tloss = (loc_loss + conf_loss) / step + t2 = time.time() + print('Timer: %.4f' % (t2 - t1)) + print('test epoch:' + repr(epoch) + ' || Loss:%.4f' % (tloss)) + + global min_loss + if tloss < min_loss: + print('Saving best state,epoch', epoch) + file = '{}_best_state.pth'.format(args.model_arch) + torch.save(net.state_dict(), os.path.join( + args.save_folder, file)) + min_loss = tloss + + + +from math import cos, pi +def adjust_learning_rate(optimizer, epoch, iteration, num_iter): + lr = optimizer.param_groups[0]['lr'] + + warmup_epoch = 5 + warmup_iter = warmup_epoch * num_iter + current_iter = iteration + epoch * num_iter + max_iter = args.epochs * num_iter + + lr = args.lr * (1 + cos(pi * (current_iter - warmup_iter) / (max_iter - warmup_iter))) / 2 + + if epoch < warmup_epoch: + lr = args.lr * current_iter / warmup_iter + + for param_group in optimizer.param_groups: + param_group['lr'] = lr + + +if __name__ == '__main__': + train() \ No newline at end of file diff --git a/examples/pytorch/vision/Face_Detection/utils/__init__.py b/examples/pytorch/vision/Face_Detection/utils/__init__.py new file mode 100755 index 
000000000..279e648c5 --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/utils/__init__.py @@ -0,0 +1,4 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. + +from .augmentations import * \ No newline at end of file diff --git a/examples/pytorch/vision/Face_Detection/utils/augmentations.py b/examples/pytorch/vision/Face_Detection/utils/augmentations.py new file mode 100755 index 000000000..3b902aac7 --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/utils/augmentations.py @@ -0,0 +1,862 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. + +from __future__ import absolute_import +from __future__ import division +from __future__ import print_function + + +import torch +from torchvision import transforms +import cv2 +import numpy as np +import types +from PIL import Image, ImageEnhance, ImageDraw +import math +import six + +import sys; sys.path.append('../') +from data.choose_config import cfg +cfg = cfg.cfg +import random + + +class sampler(): + + def __init__(self, + max_sample, + max_trial, + min_scale, + max_scale, + min_aspect_ratio, + max_aspect_ratio, + min_jaccard_overlap, + max_jaccard_overlap, + min_object_coverage, + max_object_coverage, + use_square=False): + self.max_sample = max_sample + self.max_trial = max_trial + self.min_scale = min_scale + self.max_scale = max_scale + self.min_aspect_ratio = min_aspect_ratio + self.max_aspect_ratio = max_aspect_ratio + self.min_jaccard_overlap = min_jaccard_overlap + self.max_jaccard_overlap = max_jaccard_overlap + self.min_object_coverage = min_object_coverage + self.max_object_coverage = max_object_coverage + self.use_square = use_square + + +def intersect(box_a, box_b): + max_xy = np.minimum(box_a[:, 2:], box_b[2:]) + min_xy = np.maximum(box_a[:, :2], box_b[:2]) + inter = np.clip((max_xy - min_xy), a_min=0, a_max=np.inf) + return inter[:, 0] * inter[:, 1] + + +def jaccard_numpy(box_a, box_b): + """Compute the jaccard overlap of two sets of boxes. The jaccard overlap + is simply the intersection over union of two boxes. 
+ E.g.: + A ∩ B / A ∪ B = A ∩ B / (area(A) + area(B) - A ∩ B) + Args: + box_a: Multiple bounding boxes, Shape: [num_boxes,4] + box_b: Single bounding box, Shape: [4] + Return: + jaccard overlap: Shape: [box_a.shape[0], box_a.shape[1]] + """ + inter = intersect(box_a, box_b) + area_a = ((box_a[:, 2] - box_a[:, 0]) * + (box_a[:, 3] - box_a[:, 1])) # [A,B] + area_b = ((box_b[2] - box_b[0]) * + (box_b[3] - box_b[1])) # [A,B] + union = area_a + area_b - inter + return inter / union # [A,B] + + +class bbox(): + + def __init__(self, xmin, ymin, xmax, ymax): + self.xmin = xmin + self.ymin = ymin + self.xmax = xmax + self.ymax = ymax + + +def random_brightness(img): + prob = np.random.uniform(0, 1) + if prob < cfg.brightness_prob: + delta = np.random.uniform(-cfg.brightness_delta, + cfg.brightness_delta) + 1 + img = ImageEnhance.Brightness(img).enhance(delta) + return img + + +def random_contrast(img): + prob = np.random.uniform(0, 1) + if prob < cfg.contrast_prob: + delta = np.random.uniform(-cfg.contrast_delta, + cfg.contrast_delta) + 1 + img = ImageEnhance.Contrast(img).enhance(delta) + return img + + +def random_saturation(img): + prob = np.random.uniform(0, 1) + if prob < cfg.saturation_prob: + delta = np.random.uniform(-cfg.saturation_delta, + cfg.saturation_delta) + 1 + img = ImageEnhance.Color(img).enhance(delta) + return img + + +def random_hue(img): + prob = np.random.uniform(0, 1) + if prob < cfg.hue_prob: + delta = np.random.uniform(-cfg.hue_delta, cfg.hue_delta) + img_hsv = np.array(img.convert('HSV')) + img_hsv[:, :, 0] = img_hsv[:, :, 0] + delta + img = Image.fromarray(img_hsv, mode='HSV').convert('RGB') + return img + + +def distort_image(img): + prob = np.random.uniform(0, 1) + # Apply different distort order + if prob > 0.5: + img = random_brightness(img) + img = random_contrast(img) + img = random_saturation(img) + img = random_hue(img) + else: + img = random_brightness(img) + img = random_saturation(img) + img = random_hue(img) + img = random_contrast(img) + return img + + +def meet_emit_constraint(src_bbox, sample_bbox): + center_x = (src_bbox.xmax + src_bbox.xmin) / 2 + center_y = (src_bbox.ymax + src_bbox.ymin) / 2 + if center_x >= sample_bbox.xmin and \ + center_x <= sample_bbox.xmax and \ + center_y >= sample_bbox.ymin and \ + center_y <= sample_bbox.ymax: + return True + return False + + +def project_bbox(object_bbox, sample_bbox): + if object_bbox.xmin >= sample_bbox.xmax or \ + object_bbox.xmax <= sample_bbox.xmin or \ + object_bbox.ymin >= sample_bbox.ymax or \ + object_bbox.ymax <= sample_bbox.ymin: + return False + else: + proj_bbox = bbox(0, 0, 0, 0) + sample_width = sample_bbox.xmax - sample_bbox.xmin + sample_height = sample_bbox.ymax - sample_bbox.ymin + proj_bbox.xmin = (object_bbox.xmin - sample_bbox.xmin) / sample_width + proj_bbox.ymin = (object_bbox.ymin - sample_bbox.ymin) / sample_height + proj_bbox.xmax = (object_bbox.xmax - sample_bbox.xmin) / sample_width + proj_bbox.ymax = (object_bbox.ymax - sample_bbox.ymin) / sample_height + proj_bbox = clip_bbox(proj_bbox) + if bbox_area(proj_bbox) > 0: + return proj_bbox + else: + return False + + +def transform_labels(bbox_labels, sample_bbox): + sample_labels = [] + for i in range(len(bbox_labels)): + sample_label = [] + object_bbox = bbox(bbox_labels[i][1], bbox_labels[i][2], + bbox_labels[i][3], bbox_labels[i][4]) + if not meet_emit_constraint(object_bbox, sample_bbox): + continue + proj_bbox = project_bbox(object_bbox, sample_bbox) + if proj_bbox: + sample_label.append(bbox_labels[i][0]) + 
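+ # each surviving label is stored as [class, xmin, ymin, xmax, ymax, ...] with
+ # coordinates re-normalized to the sampled crop by project_bbox()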
sample_label.append(float(proj_bbox.xmin)) + sample_label.append(float(proj_bbox.ymin)) + sample_label.append(float(proj_bbox.xmax)) + sample_label.append(float(proj_bbox.ymax)) + sample_label = sample_label + bbox_labels[i][5:] + sample_labels.append(sample_label) + return sample_labels + + +def expand_image(img, bbox_labels, img_width, img_height): + prob = np.random.uniform(0, 1) + if prob < cfg.expand_prob: + if cfg.expand_max_ratio - 1 >= 0.01: + expand_ratio = np.random.uniform(1, cfg.expand_max_ratio) + height = int(img_height * expand_ratio) + width = int(img_width * expand_ratio) + h_off = math.floor(np.random.uniform(0, height - img_height)) + w_off = math.floor(np.random.uniform(0, width - img_width)) + expand_bbox = bbox(-w_off / img_width, -h_off / img_height, + (width - w_off) / img_width, + (height - h_off) / img_height) + expand_img = np.ones((height, width, 3)) + expand_img = np.uint8(expand_img * np.squeeze(cfg.img_mean)) + expand_img = Image.fromarray(expand_img) + expand_img.paste(img, (int(w_off), int(h_off))) + bbox_labels = transform_labels(bbox_labels, expand_bbox) + return expand_img, bbox_labels, width, height + return img, bbox_labels, img_width, img_height + + +def clip_bbox(src_bbox): + src_bbox.xmin = max(min(src_bbox.xmin, 1.0), 0.0) + src_bbox.ymin = max(min(src_bbox.ymin, 1.0), 0.0) + src_bbox.xmax = max(min(src_bbox.xmax, 1.0), 0.0) + src_bbox.ymax = max(min(src_bbox.ymax, 1.0), 0.0) + return src_bbox + + +def bbox_area(src_bbox): + if src_bbox.xmax < src_bbox.xmin or src_bbox.ymax < src_bbox.ymin: + return 0. + else: + width = src_bbox.xmax - src_bbox.xmin + height = src_bbox.ymax - src_bbox.ymin + return width * height + + +def intersect_bbox(bbox1, bbox2): + if bbox2.xmin > bbox1.xmax or bbox2.xmax < bbox1.xmin or \ + bbox2.ymin > bbox1.ymax or bbox2.ymax < bbox1.ymin: + intersection_box = bbox(0.0, 0.0, 0.0, 0.0) + else: + intersection_box = bbox( + max(bbox1.xmin, bbox2.xmin), + max(bbox1.ymin, bbox2.ymin), + min(bbox1.xmax, bbox2.xmax), min(bbox1.ymax, bbox2.ymax)) + return intersection_box + + +def bbox_coverage(bbox1, bbox2): + inter_box = intersect_bbox(bbox1, bbox2) + intersect_size = bbox_area(inter_box) + + if intersect_size > 0: + bbox1_size = bbox_area(bbox1) + return intersect_size / bbox1_size + else: + return 0. 
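+ # Illustrative example (not part of the original file): for the normalized
+ # boxes a = bbox(0.0, 0.0, 0.5, 0.5) and b = bbox(0.25, 0.25, 0.75, 0.75),
+ # the intersection area is 0.0625, so jaccard_overlap(a, b) =
+ # 0.0625 / (0.25 + 0.25 - 0.0625) ≈ 0.143, while bbox_coverage(a, b) =
+ # 0.0625 / 0.25 = 0.25, i.e. the fraction of a's area covered by b.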
+ + +def generate_batch_random_samples(batch_sampler, bbox_labels, image_width, + image_height, scale_array, resize_width, + resize_height): + sampled_bbox = [] + for sampler in batch_sampler: + found = 0 + for i in range(sampler.max_trial): + if found >= sampler.max_sample: + break + sample_bbox = data_anchor_sampling( + sampler, bbox_labels, image_width, image_height, scale_array, + resize_width, resize_height) + if sample_bbox == 0: + break + if satisfy_sample_constraint(sampler, sample_bbox, bbox_labels): + sampled_bbox.append(sample_bbox) + found = found + 1 + return sampled_bbox + + +def data_anchor_sampling(sampler, bbox_labels, image_width, image_height, + scale_array, resize_width, resize_height): + num_gt = len(bbox_labels) + # np.random.randint range: [low, high) + rand_idx = np.random.randint(0, num_gt) if num_gt != 0 else 0 + + if num_gt != 0: + norm_xmin = bbox_labels[rand_idx][1] + norm_ymin = bbox_labels[rand_idx][2] + norm_xmax = bbox_labels[rand_idx][3] + norm_ymax = bbox_labels[rand_idx][4] + + xmin = norm_xmin * image_width + ymin = norm_ymin * image_height + wid = image_width * (norm_xmax - norm_xmin) + hei = image_height * (norm_ymax - norm_ymin) + range_size = 0 + + area = wid * hei + for scale_ind in range(0, len(scale_array) - 1): + if area > scale_array[scale_ind] ** 2 and area < \ + scale_array[scale_ind + 1] ** 2: + range_size = scale_ind + 1 + break + + if area > scale_array[len(scale_array) - 2]**2: + range_size = len(scale_array) - 2 + scale_choose = 0.0 + if range_size == 0: + rand_idx_size = 0 + else: + # np.random.randint range: [low, high) + rng_rand_size = np.random.randint(0, range_size + 1) + rand_idx_size = rng_rand_size % (range_size + 1) + + if rand_idx_size == range_size: + min_resize_val = scale_array[rand_idx_size] / 2.0 + max_resize_val = min(2.0 * scale_array[rand_idx_size], + 2 * math.sqrt(wid * hei)) + scale_choose = random.uniform(min_resize_val, max_resize_val) + else: + min_resize_val = scale_array[rand_idx_size] / 2.0 + max_resize_val = 2.0 * scale_array[rand_idx_size] + scale_choose = random.uniform(min_resize_val, max_resize_val) + + sample_bbox_size = wid * resize_width / scale_choose + + w_off_orig = 0.0 + h_off_orig = 0.0 + if sample_bbox_size < max(image_height, image_width): + if wid <= sample_bbox_size: + w_off_orig = np.random.uniform(xmin + wid - sample_bbox_size, + xmin) + else: + w_off_orig = np.random.uniform(xmin, + xmin + wid - sample_bbox_size) + + if hei <= sample_bbox_size: + h_off_orig = np.random.uniform(ymin + hei - sample_bbox_size, + ymin) + else: + h_off_orig = np.random.uniform(ymin, + ymin + hei - sample_bbox_size) + + else: + w_off_orig = np.random.uniform(image_width - sample_bbox_size, 0.0) + h_off_orig = np.random.uniform( + image_height - sample_bbox_size, 0.0) + + w_off_orig = math.floor(w_off_orig) + h_off_orig = math.floor(h_off_orig) + + # Figure out top left coordinates. 
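+ # normalize the chosen pixel offsets by the image size so the sampled bbox is
+ # expressed in [0, 1] coordinates, consistent with the other samplers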
+ w_off = 0.0 + h_off = 0.0 + w_off = float(w_off_orig / image_width) + h_off = float(h_off_orig / image_height) + + sampled_bbox = bbox(w_off, h_off, + w_off + float(sample_bbox_size / image_width), + h_off + float(sample_bbox_size / image_height)) + + return sampled_bbox + else: + return 0 + + +def jaccard_overlap(sample_bbox, object_bbox): + if sample_bbox.xmin >= object_bbox.xmax or \ + sample_bbox.xmax <= object_bbox.xmin or \ + sample_bbox.ymin >= object_bbox.ymax or \ + sample_bbox.ymax <= object_bbox.ymin: + return 0 + intersect_xmin = max(sample_bbox.xmin, object_bbox.xmin) + intersect_ymin = max(sample_bbox.ymin, object_bbox.ymin) + intersect_xmax = min(sample_bbox.xmax, object_bbox.xmax) + intersect_ymax = min(sample_bbox.ymax, object_bbox.ymax) + intersect_size = (intersect_xmax - intersect_xmin) * ( + intersect_ymax - intersect_ymin) + sample_bbox_size = bbox_area(sample_bbox) + object_bbox_size = bbox_area(object_bbox) + overlap = intersect_size / ( + sample_bbox_size + object_bbox_size - intersect_size) + return overlap + + +def satisfy_sample_constraint(sampler, sample_bbox, bbox_labels): + if sampler.min_jaccard_overlap == 0 and sampler.max_jaccard_overlap == 0: + has_jaccard_overlap = False + else: + has_jaccard_overlap = True + if sampler.min_object_coverage == 0 and sampler.max_object_coverage == 0: + has_object_coverage = False + else: + has_object_coverage = True + + if not has_jaccard_overlap and not has_object_coverage: + return True + found = False + for i in range(len(bbox_labels)): + object_bbox = bbox(bbox_labels[i][1], bbox_labels[i][2], + bbox_labels[i][3], bbox_labels[i][4]) + if has_jaccard_overlap: + overlap = jaccard_overlap(sample_bbox, object_bbox) + if sampler.min_jaccard_overlap != 0 and \ + overlap < sampler.min_jaccard_overlap: + continue + if sampler.max_jaccard_overlap != 0 and \ + overlap > sampler.max_jaccard_overlap: + continue + found = True + if has_object_coverage: + object_coverage = bbox_coverage(object_bbox, sample_bbox) + if sampler.min_object_coverage != 0 and \ + object_coverage < sampler.min_object_coverage: + continue + if sampler.max_object_coverage != 0 and \ + object_coverage > sampler.max_object_coverage: + continue + found = True + if found: + return True + return found + + +def crop_image_sampling(img, bbox_labels, sample_bbox, image_width, + image_height, resize_width, resize_height, + min_face_size): + # no clipping here + xmin = int(sample_bbox.xmin * image_width) + xmax = int(sample_bbox.xmax * image_width) + ymin = int(sample_bbox.ymin * image_height) + ymax = int(sample_bbox.ymax * image_height) + w_off = xmin + h_off = ymin + width = xmax - xmin + height = ymax - ymin + + cross_xmin = max(0.0, float(w_off)) + cross_ymin = max(0.0, float(h_off)) + cross_xmax = min(float(w_off + width - 1.0), float(image_width)) + cross_ymax = min(float(h_off + height - 1.0), float(image_height)) + cross_width = cross_xmax - cross_xmin + cross_height = cross_ymax - cross_ymin + + roi_xmin = 0 if w_off >= 0 else abs(w_off) + roi_ymin = 0 if h_off >= 0 else abs(h_off) + roi_width = cross_width + roi_height = cross_height + + roi_y1 = int(roi_ymin) + roi_y2 = int(roi_ymin + roi_height) + roi_x1 = int(roi_xmin) + roi_x2 = int(roi_xmin + roi_width) + + cross_y1 = int(cross_ymin) + cross_y2 = int(cross_ymin + cross_height) + cross_x1 = int(cross_xmin) + cross_x2 = int(cross_xmin + cross_width) + + sample_img = np.zeros((height, width, 3)) + # print(sample_img.shape) + sample_img[roi_y1 : roi_y2, roi_x1 : roi_x2] = \ + img[cross_y1: cross_y2, 
cross_x1: cross_x2] + sample_img = cv2.resize( + sample_img, (resize_width, resize_height), interpolation=cv2.INTER_AREA) + + resize_val = resize_width + sample_labels = transform_labels_sampling(bbox_labels, sample_bbox, + resize_val, min_face_size) + return sample_img, sample_labels + + +def transform_labels_sampling(bbox_labels, sample_bbox, resize_val, + min_face_size): + sample_labels = [] + for i in range(len(bbox_labels)): + sample_label = [] + object_bbox = bbox(bbox_labels[i][1], bbox_labels[i][2], + bbox_labels[i][3], bbox_labels[i][4]) + if not meet_emit_constraint(object_bbox, sample_bbox): + continue + proj_bbox = project_bbox(object_bbox, sample_bbox) + if proj_bbox: + real_width = float((proj_bbox.xmax - proj_bbox.xmin) * resize_val) + real_height = float((proj_bbox.ymax - proj_bbox.ymin) * resize_val) + if real_width * real_height < float(min_face_size * min_face_size): + continue + else: + sample_label.append(bbox_labels[i][0]) + sample_label.append(float(proj_bbox.xmin)) + sample_label.append(float(proj_bbox.ymin)) + sample_label.append(float(proj_bbox.xmax)) + sample_label.append(float(proj_bbox.ymax)) + sample_label = sample_label + bbox_labels[i][5:] + sample_labels.append(sample_label) + + return sample_labels + + +def generate_sample(sampler, image_width, image_height): + scale = np.random.uniform(sampler.min_scale, sampler.max_scale) + aspect_ratio = np.random.uniform(sampler.min_aspect_ratio, + sampler.max_aspect_ratio) + aspect_ratio = max(aspect_ratio, (scale**2.0)) + aspect_ratio = min(aspect_ratio, 1 / (scale**2.0)) + + bbox_width = scale * (aspect_ratio**0.5) + bbox_height = scale / (aspect_ratio**0.5) + + # guarantee a squared image patch after cropping + if sampler.use_square: + if image_height < image_width: + bbox_width = bbox_height * image_height / image_width + else: + bbox_height = bbox_width * image_width / image_height + + xmin_bound = 1 - bbox_width + ymin_bound = 1 - bbox_height + xmin = np.random.uniform(0, xmin_bound) + ymin = np.random.uniform(0, ymin_bound) + xmax = xmin + bbox_width + ymax = ymin + bbox_height + sampled_bbox = bbox(xmin, ymin, xmax, ymax) + return sampled_bbox + + +def generate_batch_samples(batch_sampler, bbox_labels, image_width, + image_height): + sampled_bbox = [] + for sampler in batch_sampler: + found = 0 + for i in range(sampler.max_trial): + if found >= sampler.max_sample: + break + sample_bbox = generate_sample(sampler, image_width, image_height) + if satisfy_sample_constraint(sampler, sample_bbox, bbox_labels): + sampled_bbox.append(sample_bbox) + found = found + 1 + return sampled_bbox + + +def crop_image(img, bbox_labels, sample_bbox, image_width, image_height, + resize_width, resize_height, min_face_size): + sample_bbox = clip_bbox(sample_bbox) + xmin = int(sample_bbox.xmin * image_width) + xmax = int(sample_bbox.xmax * image_width) + ymin = int(sample_bbox.ymin * image_height) + ymax = int(sample_bbox.ymax * image_height) + + sample_img = img[ymin:ymax, xmin:xmax] + resize_val = resize_width + sample_labels = transform_labels_sampling(bbox_labels, sample_bbox, + resize_val, min_face_size) + return sample_img, sample_labels + + +def to_chw_bgr(image): + """ + Transpose image from HWC to CHW and from RBG to BGR. + Args: + image (np.array): an image with HWC and RBG layout. 
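+ Returns:
+ np.array with CHW layout and BGR channel order.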
+ """ + # HWC to CHW + if len(image.shape) == 3: + image = np.swapaxes(image, 1, 2) + image = np.swapaxes(image, 1, 0) + # RBG to BGR + image = image[[2, 1, 0], :, :] + return image + + +def anchor_crop_image_sampling(img, + bbox_labels, + scale_array, + img_width, + img_height): + mean = np.array([104, 117, 123], dtype=np.float32) + maxSize = 12000 # max size + infDistance = 9999999 + bbox_labels = np.array(bbox_labels) + scale = np.array([img_width, img_height, img_width, img_height]) + + boxes = bbox_labels[:, 1:5] * scale + labels = bbox_labels[:, 0] + + boxArea = (boxes[:, 2] - boxes[:, 0] + 1) * (boxes[:, 3] - boxes[:, 1] + 1) + # argsort = np.argsort(boxArea) + # rand_idx = random.randint(min(len(argsort),6)) + # print('rand idx',rand_idx) + rand_idx = np.random.randint(len(boxArea)) + rand_Side = boxArea[rand_idx] ** 0.5 + # rand_Side = min(boxes[rand_idx,2] - boxes[rand_idx,0] + 1, + # boxes[rand_idx,3] - boxes[rand_idx,1] + 1) + + distance = infDistance + anchor_idx = 5 + for i, anchor in enumerate(scale_array): + if abs(anchor - rand_Side) < distance: + distance = abs(anchor - rand_Side) + anchor_idx = i + + target_anchor = random.choice(scale_array[0:min(anchor_idx + 1, 5) + 1]) + ratio = float(target_anchor) / rand_Side + ratio = ratio * (2**random.uniform(-1, 1)) + + if int(img_height * ratio * img_width * ratio) > maxSize * maxSize: + ratio = (maxSize * maxSize / (img_height * img_width))**0.5 + + interp_methods = [cv2.INTER_LINEAR, cv2.INTER_CUBIC, + cv2.INTER_AREA, cv2.INTER_NEAREST, cv2.INTER_LANCZOS4] + interp_method = random.choice(interp_methods) + image = cv2.resize(img, None, None, fx=ratio, + fy=ratio, interpolation=interp_method) + + boxes[:, 0] *= ratio + boxes[:, 1] *= ratio + boxes[:, 2] *= ratio + boxes[:, 3] *= ratio + + height, width, _ = image.shape + + sample_boxes = [] + + xmin = boxes[rand_idx, 0] + ymin = boxes[rand_idx, 1] + bw = (boxes[rand_idx, 2] - boxes[rand_idx, 0] + 1) + bh = (boxes[rand_idx, 3] - boxes[rand_idx, 1] + 1) + + w = h = cfg.INPUT_SIZE + + for _ in range(50): + if w < max(height, width): + if bw <= w: + w_off = random.uniform(xmin + bw - w, xmin) + else: + w_off = random.uniform(xmin, xmin + bw - w) + + if bh <= h: + h_off = random.uniform(ymin + bh - h, ymin) + else: + h_off = random.uniform(ymin, ymin + bh - h) + else: + w_off = random.uniform(width - w, 0) + h_off = random.uniform(height - h, 0) + + w_off = math.floor(w_off) + h_off = math.floor(h_off) + + # convert to integer rect x1,y1,x2,y2 + rect = np.array( + [int(w_off), int(h_off), int(w_off + w), int(h_off + h)]) + + # keep overlap with gt box IF center in sampled patch + centers = (boxes[:, :2] + boxes[:, 2:]) / 2.0 + # mask in all gt boxes that above and to the left of centers + m1 = (rect[0] <= boxes[:, 0]) * (rect[1] <= boxes[:, 1]) + # mask in all gt boxes that under and to the right of centers + m2 = (rect[2] >= boxes[:, 2]) * (rect[3] >= boxes[:, 3]) + # mask in that both m1 and m2 are true + mask = m1 * m2 + + overlap = jaccard_numpy(boxes, rect) + # have any valid boxes? 
try again if not + if not mask.any() and not overlap.max() > 0.7: + continue + else: + sample_boxes.append(rect) + + sampled_labels = [] + + if len(sample_boxes) > 0: + choice_idx = np.random.randint(len(sample_boxes)) + choice_box = sample_boxes[choice_idx] + # print('crop the box :',choice_box) + centers = (boxes[:, :2] + boxes[:, 2:]) / 2.0 + m1 = (choice_box[0] < centers[:, 0]) * \ + (choice_box[1] < centers[:, 1]) + m2 = (choice_box[2] > centers[:, 0]) * \ + (choice_box[3] > centers[:, 1]) + mask = m1 * m2 + current_boxes = boxes[mask, :].copy() + current_labels = labels[mask] + current_boxes[:, :2] -= choice_box[:2] + current_boxes[:, 2:] -= choice_box[:2] + + if choice_box[0] < 0 or choice_box[1] < 0: + new_img_width = width if choice_box[ + 0] >= 0 else width - choice_box[0] + new_img_height = height if choice_box[ + 1] >= 0 else height - choice_box[1] + image_pad = np.zeros( + (new_img_height, new_img_width, 3), dtype=float) + image_pad[:, :, :] = mean + start_left = 0 if choice_box[0] >= 0 else -choice_box[0] + start_top = 0 if choice_box[1] >= 0 else -choice_box[1] + image_pad[start_top:, start_left:, :] = image + + choice_box_w = choice_box[2] - choice_box[0] + choice_box_h = choice_box[3] - choice_box[1] + + start_left = choice_box[0] if choice_box[0] >= 0 else 0 + start_top = choice_box[1] if choice_box[1] >= 0 else 0 + end_right = start_left + choice_box_w + end_bottom = start_top + choice_box_h + current_image = image_pad[ + start_top:end_bottom, start_left:end_right, :].copy() + image_height, image_width, _ = current_image.shape + if cfg.filter_min_face: + bbox_w = current_boxes[:, 2] - current_boxes[:, 0] + bbox_h = current_boxes[:, 3] - current_boxes[:, 1] + bbox_area = bbox_w * bbox_h + mask = bbox_area > (cfg.min_face_size * cfg.min_face_size) + current_boxes = current_boxes[mask] + current_labels = current_labels[mask] + for i in range(len(current_boxes)): + sample_label = [] + sample_label.append(current_labels[i]) + sample_label.append(current_boxes[i][0] / image_width) + sample_label.append(current_boxes[i][1] / image_height) + sample_label.append(current_boxes[i][2] / image_width) + sample_label.append(current_boxes[i][3] / image_height) + sampled_labels += [sample_label] + sampled_labels = np.array(sampled_labels) + else: + current_boxes /= np.array([image_width, + image_height, image_width, image_height]) + sampled_labels = np.hstack( + (current_labels[:, np.newaxis], current_boxes)) + + return current_image, sampled_labels + + current_image = image[choice_box[1]:choice_box[ + 3], choice_box[0]:choice_box[2], :].copy() + image_height, image_width, _ = current_image.shape + + if cfg.filter_min_face: + bbox_w = current_boxes[:, 2] - current_boxes[:, 0] + bbox_h = current_boxes[:, 3] - current_boxes[:, 1] + bbox_area = bbox_w * bbox_h + mask = bbox_area > (cfg.min_face_size * cfg.min_face_size) + current_boxes = current_boxes[mask] + current_labels = current_labels[mask] + for i in range(len(current_boxes)): + sample_label = [] + sample_label.append(current_labels[i]) + sample_label.append(current_boxes[i][0] / image_width) + sample_label.append(current_boxes[i][1] / image_height) + sample_label.append(current_boxes[i][2] / image_width) + sample_label.append(current_boxes[i][3] / image_height) + sampled_labels += [sample_label] + sampled_labels = np.array(sampled_labels) + else: + current_boxes /= np.array([image_width, + image_height, image_width, image_height]) + sampled_labels = np.hstack( + (current_labels[:, np.newaxis], current_boxes)) + + return 
current_image, sampled_labels + else: + image_height, image_width, _ = image.shape + if cfg.filter_min_face: + bbox_w = boxes[:, 2] - boxes[:, 0] + bbox_h = boxes[:, 3] - boxes[:, 1] + bbox_area = bbox_w * bbox_h + mask = bbox_area > (cfg.min_face_size * cfg.min_face_size) + boxes = boxes[mask] + labels = labels[mask] + for i in range(len(boxes)): + sample_label = [] + sample_label.append(labels[i]) + sample_label.append(boxes[i][0] / image_width) + sample_label.append(boxes[i][1] / image_height) + sample_label.append(boxes[i][2] / image_width) + sample_label.append(boxes[i][3] / image_height) + sampled_labels += [sample_label] + sampled_labels = np.array(sampled_labels) + else: + boxes /= np.array([image_width, image_height, + image_width, image_height]) + sampled_labels = np.hstack( + (labels[:, np.newaxis], boxes)) + + return image, sampled_labels + + +def preprocess(img, bbox_labels, mode, image_path): + img_width, img_height = img.size + sampled_labels = bbox_labels + if mode == 'train': + if cfg.apply_distort: + img = distort_image(img) + if cfg.apply_expand: + img, bbox_labels, img_width, img_height = expand_image( + img, bbox_labels, img_width, img_height) + + batch_sampler = [] + prob = np.random.uniform(0., 1.) + if prob > cfg.data_anchor_sampling_prob and cfg.anchor_sampling: + scale_array = np.array(cfg.ANCHOR_SIZES)#[16, 32, 64, 128, 256, 512]) + ''' + batch_sampler.append( + sampler(1, 50, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.6, 0.0, True)) + sampled_bbox = generate_batch_random_samples( + batch_sampler, bbox_labels, img_width, img_height, scale_array, + cfg.resize_width, cfg.resize_height) + ''' + img = np.array(img) + img, sampled_labels = anchor_crop_image_sampling( + img, bbox_labels, scale_array, img_width, img_height) + ''' + if len(sampled_bbox) > 0: + idx = int(np.random.uniform(0, len(sampled_bbox))) + img, sampled_labels = crop_image_sampling( + img, bbox_labels, sampled_bbox[idx], img_width, img_height, + cfg.resize_width, cfg.resize_height, cfg.min_face_size) + ''' + img = img.astype('uint8') + img = Image.fromarray(img) + else: + batch_sampler.append(sampler(1, 50, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, + 0.0, True)) + batch_sampler.append(sampler(1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, + 0.0, True)) + batch_sampler.append(sampler(1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, + 0.0, True)) + batch_sampler.append(sampler(1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, + 0.0, True)) + batch_sampler.append(sampler(1, 50, 0.3, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, + 0.0, True)) + sampled_bbox = generate_batch_samples( + batch_sampler, bbox_labels, img_width, img_height) + + img = np.array(img) + if len(sampled_bbox) > 0: + idx = int(np.random.uniform(0, len(sampled_bbox))) + img, sampled_labels = crop_image( + img, bbox_labels, sampled_bbox[idx], img_width, img_height, + cfg.resize_width, cfg.resize_height, cfg.min_face_size) + + img = Image.fromarray(img) + + interp_mode = [ + Image.BILINEAR, Image.HAMMING, Image.NEAREST, Image.BICUBIC, + Image.LANCZOS + ] + interp_indx = np.random.randint(0, 5) + + img = img.resize((cfg.resize_width, cfg.resize_height), + resample=interp_mode[interp_indx]) + + img = np.array(img) + + if mode == 'train': + mirror = int(np.random.uniform(0, 2)) + if mirror == 1: + img = img[:, ::-1, :] + for i in six.moves.xrange(len(sampled_labels)): + tmp = sampled_labels[i][1] + sampled_labels[i][1] = 1 - sampled_labels[i][3] + sampled_labels[i][3] = 1 - tmp + + #img = Image.fromarray(img) + img = to_chw_bgr(img) + img = img.astype('float32') + img -= cfg.img_mean + img = 
img[[2, 1, 0], :, :] # to RGB + #img = img * cfg.scale + + return img, sampled_labels diff --git a/examples/pytorch/vision/Face_Detection/wider_test.py b/examples/pytorch/vision/Face_Detection/wider_test.py new file mode 100755 index 000000000..df585fe06 --- /dev/null +++ b/examples/pytorch/vision/Face_Detection/wider_test.py @@ -0,0 +1,272 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. + +import os +import argparse +import torch +import torch.nn as nn +import torch.utils.data as data +import torch.backends.cudnn as cudnn +import torchvision.transforms as transforms +import os.path as osp + +import cv2 +import time +import numpy as np +from PIL import Image +import scipy.io as sio + +from data.choose_config import cfg +cfg = cfg.cfg + +from torch.autograd import Variable +from utils.augmentations import to_chw_bgr + +from importlib import import_module + +import warnings +warnings.filterwarnings("ignore") + +parser = argparse.ArgumentParser(description='s3fd evaluatuon wider') +parser.add_argument('--model', type=str, + default='./weights/rpool_face_c.pth', help='trained model') +parser.add_argument('--thresh', default=0.05, type=float, + help='Final confidence threshold') +parser.add_argument('--model_arch', + default='RPool_Face_C', type=str, + choices=['RPool_Face_C', 'RPool_Face_Quant', 'RPool_Face_QVGA_monochrome'], + help='choose architecture among rpool variants') +parser.add_argument('--save_folder', type=str, + default='rpool_face_predictions', help='folder for saving predictions') +parser.add_argument('--subset', type=str, + default='val', + choices=['val', 'test'], + help='choose which set to run testing on') + +args = parser.parse_args() + + +use_cuda = torch.cuda.is_available() + +if use_cuda: + torch.set_default_tensor_type('torch.cuda.FloatTensor') +else: + torch.set_default_tensor_type('torch.FloatTensor') + + +def detect_face(net, img, shrink): + if shrink != 1: + img = cv2.resize(img, None, None, fx=shrink, fy=shrink, + interpolation=cv2.INTER_LINEAR) + + x = to_chw_bgr(img) + x = x.astype('float32') + x -= cfg.img_mean + x = x[[2, 1, 0], :, :] + + if cfg.IS_MONOCHROME == True: + x = 0.299 * x[0] + 0.587 * x[1] + 0.114 * x[2] + x = torch.from_numpy(x).unsqueeze(0).unsqueeze(0) + else: + x = torch.from_numpy(x).unsqueeze(0) + + if use_cuda: + x = x.cuda() + + y = net(x) + detections = y.data + detections = detections.cpu().numpy() + + det_conf = detections[0, 1, :, 0] + det_xmin = img.shape[1] * detections[0, 1, :, 1] / shrink + det_ymin = img.shape[0] * detections[0, 1, :, 2] / shrink + det_xmax = img.shape[1] * detections[0, 1, :, 3] / shrink + det_ymax = img.shape[0] * detections[0, 1, :, 4] / shrink + det = np.column_stack((det_xmin, det_ymin, det_xmax, det_ymax, det_conf)) + + keep_index = np.where(det[:, 4] >= args.thresh)[0] + det = det[keep_index, :] + + return det + + +def multi_scale_test(net, image, max_im_shrink): + # shrink detecting and shrink only detect big face + st = 0.5 if max_im_shrink >= 0.75 else 0.5 * max_im_shrink + det_s = detect_face(net, image, st) + index = np.where(np.maximum( + det_s[:, 2] - det_s[:, 0] + 1, det_s[:, 3] - det_s[:, 1] + 1) > 30)[0] + det_s = det_s[index, :] + + # enlarge one times + bt = min(2, max_im_shrink) if max_im_shrink > 1 else ( + st + max_im_shrink) / 2 + det_b = detect_face(net, image, bt) + + # enlarge small image x times for small face + if max_im_shrink > 2: + bt *= 2 + while bt < max_im_shrink: + det_b = np.row_stack((det_b, detect_face(net, image, bt))) + bt *= 
2 + det_b = np.row_stack((det_b, detect_face(net, image, max_im_shrink))) + + # enlarge only detect small face + if bt > 1: + index = np.where(np.minimum( + det_b[:, 2] - det_b[:, 0] + 1, det_b[:, 3] - det_b[:, 1] + 1) < 100)[0] + det_b = det_b[index, :] + else: + index = np.where(np.maximum( + det_b[:, 2] - det_b[:, 0] + 1, det_b[:, 3] - det_b[:, 1] + 1) > 30)[0] + det_b = det_b[index, :] + + return det_s, det_b + + +def flip_test(net, image, shrink): + image_f = cv2.flip(image, 1) + det_f = detect_face(net, image_f, shrink) + + det_t = np.zeros(det_f.shape) + det_t[:, 0] = image.shape[1] - det_f[:, 2] + det_t[:, 1] = det_f[:, 1] + det_t[:, 2] = image.shape[1] - det_f[:, 0] + det_t[:, 3] = det_f[:, 3] + det_t[:, 4] = det_f[:, 4] + return det_t + + +def bbox_vote(det): + order = det[:, 4].ravel().argsort()[::-1] + det = det[order, :] + while det.shape[0] > 0: + # IOU + area = (det[:, 2] - det[:, 0] + 1) * (det[:, 3] - det[:, 1] + 1) + xx1 = np.maximum(det[0, 0], det[:, 0]) + yy1 = np.maximum(det[0, 1], det[:, 1]) + xx2 = np.minimum(det[0, 2], det[:, 2]) + yy2 = np.minimum(det[0, 3], det[:, 3]) + w = np.maximum(0.0, xx2 - xx1 + 1) + h = np.maximum(0.0, yy2 - yy1 + 1) + inter = w * h + o = inter / (area[0] + area[:] - inter) + + # get needed merge det and delete these det + merge_index = np.where(o >= 0.3)[0] + det_accu = det[merge_index, :] + det = np.delete(det, merge_index, 0) + + if merge_index.shape[0] <= 1: + continue + det_accu[:, 0:4] = det_accu[:, 0:4] * np.tile(det_accu[:, -1:], (1, 4)) + max_score = np.max(det_accu[:, 4]) + det_accu_sum = np.zeros((1, 5)) + det_accu_sum[:, 0:4] = np.sum( + det_accu[:, 0:4], axis=0) / np.sum(det_accu[:, -1:]) + det_accu_sum[:, 4] = max_score + try: + dets = np.row_stack((dets, det_accu_sum)) + except: + dets = det_accu_sum + + dets = dets[0:750, :] + return dets + + +def get_data(): + subset = args.subset + + WIDER_ROOT = os.path.join(cfg.HOME, 'WIDER_FACE') + if subset == 'val': + wider_face = sio.loadmat( + os.path.join(WIDER_ROOT, 'wider_face_split', + 'wider_face_val.mat')) + else: + wider_face = sio.loadmat( + os.path.join(WIDER_ROOT, 'wider_face_split', + 'wider_face_test.mat')) + event_list = wider_face['event_list'] + file_list = wider_face['file_list'] + del wider_face + + imgs_path = os.path.join( + cfg.FACE.WIDER_DIR, 'WIDER_{}'.format(subset), 'images') + save_path = './{}'.format(args.save_folder) + + return event_list, file_list, imgs_path, save_path + +if __name__ == '__main__': + event_list, file_list, imgs_path, save_path = get_data() + cfg.USE_NMS = False + + module = import_module('models.' 
+ args.model_arch) + net = module.build_s3fd('test', cfg.NUM_CLASSES) + + net = torch.nn.DataParallel(net) + + + checkpoint_dict = torch.load(args.model) + + model_dict = net.state_dict() + + + model_dict.update(checkpoint_dict) + net.load_state_dict(model_dict) + + + net.eval() + + + if use_cuda: + net.cuda() + cudnn.benckmark = True + + + counter = 0 + + for index, event in enumerate(event_list): + filelist = file_list[index][0] + path = os.path.join(save_path, str(event[0][0]))#.encode('utf-8')) + if not os.path.exists(path): + os.makedirs(path) + + for num, file in enumerate(filelist): + im_name = str(file[0][0])#.encode('utf-8') + in_file = os.path.join(imgs_path, event[0][0], im_name[:] + '.jpg') + img = Image.open(in_file) + if img.mode == 'L': + img = img.convert('RGB') + img = np.array(img) + + + max_im_shrink = np.sqrt( + 1700 * 1200 / (img.shape[0] * img.shape[1])) + + shrink = max_im_shrink if max_im_shrink < 1 else 1 + counter += 1 + + t1 = time.time() + det0 = detect_face(net, img, shrink) + + det1 = flip_test(net, img, shrink) # flip test + [det2, det3] = multi_scale_test(net, img, max_im_shrink) + + det = np.row_stack((det0, det1, det2, det3)) + dets = bbox_vote(det) + + t2 = time.time() + print('Detect %04d th image costs %.4f' % (counter, t2 - t1)) + + fout = open(osp.join(save_path, str(event[0][ + 0]), im_name + '.txt'), 'w') + fout.write('{:s}\n'.format(str(event[0][0]) + '/' + im_name + '.jpg')) + fout.write('{:d}\n'.format(dets.shape[0])) + for i in range(dets.shape[0]): + xmin = dets[i][0] + ymin = dets[i][1] + xmax = dets[i][2] + ymax = dets[i][3] + score = dets[i][4] + fout.write('{:.1f} {:.1f} {:.1f} {:.1f} {:.3f}\n'. + format(xmin, ymin, (xmax - xmin + 1), (ymax - ymin + 1), score)) diff --git a/examples/pytorch/vision/Visual_Wakeword/README.md b/examples/pytorch/vision/Visual_Wakeword/README.md new file mode 100755 index 000000000..618fee202 --- /dev/null +++ b/examples/pytorch/vision/Visual_Wakeword/README.md @@ -0,0 +1,71 @@ +# Code for Visual Wake Words experiments with RNNPool + +The Visual Wake Word challenge is a binary classification problem of detecting whether a person is present in +an image or not, as introduced by [Chowdhery et. al](https://arxiv.org/abs/1906.05721). + +## Dataset +The Visual Wake Words Dataset is derived from the publicly available [COCO](cocodataset.org/#/home) dataset. The Visual Wake Words Challenge evaluates accuracy on the [minival image ids](https://raw.githubusercontent.com/tensorflow/models/master/research/object_detection/data/mscoco_minival_ids.txt), +and for training uses the remaining 115k images of the COCO training/validation dataset. The process of creating the Visual Wake Words dataset from COCO dataset is as follows. +Each image is assigned a label 1 or 0. +The label 1 is assigned as long as it has at least one bounding box corresponding +to the object of interest (e.g. person) with the box area greater than a certain threshold +(e.g. 0.5% of the image area). 
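The labeling rule above can be summarized in a few lines. The sketch below is illustrative only (it is not one of this PR's scripts); it assumes `pycocotools` (already listed in `requirements.txt`) and a COCO-format annotation file, and the helper name `vww_label` is made up for the example.

```python
from pycocotools.coco import COCO

def vww_label(coco, img_id, threshold=0.005):
    """Return 1 if any 'person' box covers more than `threshold` of the image area, else 0."""
    img = coco.imgs[img_id]
    img_area = img['height'] * img['width']
    person_cat = coco.getCatIds(catNms=['person'])
    for ann in coco.loadAnns(coco.getAnnIds(imgIds=img_id, catIds=person_cat)):
        if ann['area'] / img_area > threshold:
            return 1  # at least one sufficiently large person box
    return 0  # background / not-person

# Example usage (paths as in the commands below):
# coco = COCO('path-to-mscoco-dataset/annotations/instances_maxitrain.json')
# print(vww_label(coco, next(iter(coco.imgs))))
```

The actual conversion script added later in this diff (`create_visualwakewords_annotations.py`) applies the same per-annotation area test, but writes out COCO-style annotation files rather than bare labels.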
+
+To download the COCO dataset, use the script `scripts/download_mscoco.sh`:
+```bash
+bash scripts/download_mscoco.sh path-to-mscoco-dataset
+```
+
+To create the COCO annotation files for the maxitrain/minival split, use
+`scripts/create_coco_train_minival_split.py`:
+
+```bash
+TRAIN_ANNOTATIONS_FILE="path-to-mscoco-dataset/annotations/instances_train2014.json"
+VAL_ANNOTATIONS_FILE="path-to-mscoco-dataset/annotations/instances_val2014.json"
+DIR="path-to-mscoco-dataset/annotations/"
+python scripts/create_coco_train_minival_split.py \
+    --train_annotations_file="${TRAIN_ANNOTATIONS_FILE}" \
+    --val_annotations_file="${VAL_ANNOTATIONS_FILE}" \
+    --output_dir="${DIR}"
+```
+
+
+To generate the Visual Wake Words annotations, use the script `scripts/create_visualwakewords_annotations.py`:
+```bash
+MAXITRAIN_ANNOTATIONS_FILE="path-to-mscoco-dataset/annotations/instances_maxitrain.json"
+MINIVAL_ANNOTATIONS_FILE="path-to-mscoco-dataset/annotations/instances_minival.json"
+VWW_OUTPUT_DIR="new-path-to-visualwakewords-dataset/annotations/"
+python scripts/create_visualwakewords_annotations.py \
+    --train_annotations_file="${MAXITRAIN_ANNOTATIONS_FILE}" \
+    --val_annotations_file="${MINIVAL_ANNOTATIONS_FILE}" \
+    --output_dir="${VWW_OUTPUT_DIR}" \
+    --threshold=0.005 \
+    --foreground_class='person'
+```
+
+
+# Training
+
+```bash
+python train_visualwakewords.py \
+    --model_arch model_mobilenet_rnnpool \
+    --lr 0.05 \
+    --epochs 900 \
+    --data "path-to-mscoco-dataset" \
+    --ann "new-path-to-visualwakewords-dataset"
+```
+In `--data` and `--ann`, specify the paths used in the dataset-creation steps above for the MS COCO dataset and the Visual Wake Words annotations, respectively. This script should reach a validation accuracy of about 89.57% upon completion.
+
+# Evaluation
+
+```bash
+python eval.py \
+    --weights vww_rnnpool.pth \
+    --model_arch model_mobilenet_rnnpool \
+    --image_folder images
+```
+
+The `--weights` argument is a saved checkpoint of a model trained with the architecture passed in `--model_arch`. The folder containing the images to evaluate is passed via `--image_folder`. The script prints 'Person present' or 'No person present' for each image in that folder.
+
+
+Dataset creation code is from https://github.com/Mxbonn/visualwakewords/
diff --git a/examples/pytorch/vision/Visual_Wakeword/eval.py b/examples/pytorch/vision/Visual_Wakeword/eval.py
new file mode 100644
index 000000000..46200cb16
--- /dev/null
+++ b/examples/pytorch/vision/Visual_Wakeword/eval.py
@@ -0,0 +1,91 @@
+# Copyright (c) Microsoft Corporation. All rights reserved.
+# Licensed under the MIT license.
+ +import torch +import torch.nn as nn +import torch.optim as optim +import torch.nn.functional as F +import torch.backends.cudnn as cudnn +import torchvision +import torchvision.transforms as transforms +import os +import argparse +import random +from PIL import Image +import numpy as np +from importlib import import_module +import skimage +from skimage import filters + + + + + +device = torch.device("cuda" if torch.cuda.is_available() else "cpu") + +torch.backends.cudnn.benchmark = True +torch.backends.cudnn.enabled = True + + +#Arg parser +parser = argparse.ArgumentParser(description='PyTorch VisualWakeWords evaluation') +parser.add_argument('--weights', default=None, type=str, help='load from checkpoint') +parser.add_argument('--model_arch', + default='model_mobilenet_rnnpool', type=str, + choices=['model_mobilenet_rnnpool', 'model_mobilenet_2rnnpool'], + help='choose architecture among rpool variants') +parser.add_argument('--image_folder', default=None, type=str, help='folder containing images') + +args = parser.parse_args() + + +normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], + std=[0.229, 0.224, 0.225]) + +transform_test = transforms.Compose([ + transforms.Resize(256), + transforms.CenterCrop(224), + transforms.ToTensor(), + normalize +]) + + + + +if __name__ == '__main__': + + module = import_module(args.model_arch) + model = module.mobilenetv2_rnnpool(num_classes=2, width_mult=0.35, last_channel=320) + model = model.to(device) + model = torch.nn.DataParallel(model) + + + + checkpoint = torch.load(args.weights) + checkpoint_dict = checkpoint['model'] + model_dict = model.state_dict() + model_dict.update(checkpoint_dict) + model.load_state_dict(model_dict) + + model.eval() + img_path = args.image_folder + img_list = [os.path.join(img_path, x) + for x in os.listdir(img_path) if x.endswith('bmp')] + + for path in sorted(img_list): + img = Image.open(path).convert('RGB') + img = transform_test(img) + img = (img.cuda()) + img = img.unsqueeze(0) + + out = model(img) + + print(path) + print(out) + if out[0][0]>0.15: + print('No person present') + else: + print('Person present') + + + diff --git a/examples/pytorch/vision/Visual_Wakeword/model_mobilenet_2rnnpool.py b/examples/pytorch/vision/Visual_Wakeword/model_mobilenet_2rnnpool.py new file mode 100755 index 000000000..4de119951 --- /dev/null +++ b/examples/pytorch/vision/Visual_Wakeword/model_mobilenet_2rnnpool.py @@ -0,0 +1,208 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. + +import re +import torch +import torch.nn as nn +import torch.nn.functional as F +import numpy as np +import torch.utils.checkpoint as cp +from collections import OrderedDict +from torchvision.models.utils import load_state_dict_from_url +import sys; sys.path.append('..') +from rnnpool import * + +__all__ = ['MobileNetV2', 'mobilenetv2_rnnpool'] + + +model_urls = { + 'mobilenet_v2': 'https://download.pytorch.org/models/mobilenet_v2-b0353104.pth', +} + + +def _make_divisible(v, divisor, min_value=None): + """ + This function is taken from the original tf repo. + It ensures that all layers have a channel number that is divisible by 8 + It can be seen here: + https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/mobilenet.py + :param v: + :param divisor: + :param min_value: + :return: + """ + if min_value is None: + min_value = divisor + new_v = max(min_value, int(v + divisor / 2) // divisor * divisor) + # Make sure that round down does not go down by more than 10%. 
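    # Editor's note, worked example (assuming divisor=8): _make_divisible(10, 8)
    # first rounds 10 down to 8, but 8 < 0.9 * 10, so the value is bumped up to 16;
    # _make_divisible(22, 8) rounds to 24, which passes the 10% check and is kept.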
+ if new_v < 0.9 * v: + new_v += divisor + return new_v + + +class ConvBNReLU(nn.Sequential): + def __init__(self, in_planes, out_planes, kernel_size=3, stride=1, groups=1): + padding = (kernel_size - 1) // 2 + super(ConvBNReLU, self).__init__( + nn.Conv2d(in_planes, out_planes, kernel_size, stride, padding, groups=groups, bias=False), + nn.BatchNorm2d(out_planes, momentum=0.01), + nn.ReLU6(inplace=True) + ) + + +class InvertedResidual(nn.Module): + def __init__(self, inp, oup, stride, expand_ratio): + super(InvertedResidual, self).__init__() + self.stride = stride + assert stride in [1, 2] + + hidden_dim = int(round(inp * expand_ratio)) + self.use_res_connect = self.stride == 1 and inp == oup + + layers = [] + if expand_ratio != 1: + # pw + layers.append(ConvBNReLU(inp, hidden_dim, kernel_size=1)) + layers.extend([ + # dw + ConvBNReLU(hidden_dim, hidden_dim, stride=stride, groups=hidden_dim), + # pw-linear + nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False), + nn.BatchNorm2d(oup, momentum=0.01), + ]) + self.conv = nn.Sequential(*layers) + + def forward(self, x): + if self.use_res_connect: + return x + self.conv(x) + else: + return self.conv(x) + + +class MobileNetV2(nn.Module): + def __init__(self, + num_classes=1000, + width_mult=0.5, + inverted_residual_setting=None, + round_nearest=8, + block=None, + last_channel = 1280): + """ + MobileNet V2 main class + Args: + num_classes (int): Number of classes + width_mult (float): Width multiplier - adjusts number of channels in each layer by this amount + inverted_residual_setting: Network structure + round_nearest (int): Round the number of channels in each layer to be a multiple of this number + Set to 1 to turn off rounding + block: Module specifying inverted residual building block for mobilenet + """ + super(MobileNetV2, self).__init__() + + if block is None: + block = InvertedResidual + input_channel = 8 + #last_channel = 1280 + + if inverted_residual_setting is None: + inverted_residual_setting = [ + # t, c, n, s + [6, 64, 4, 2], + [6, 96, 3, 1], + [6, 160, 3, 2], + [6, 320, 1, 1], + ] + + # only check the first element, assuming user knows t,c,n,s are required + if len(inverted_residual_setting) == 0 or len(inverted_residual_setting[0]) != 4: + raise ValueError("inverted_residual_setting should be non-empty " + "or a 4-element list, got {}".format(inverted_residual_setting)) + + # building first layer + input_channel = _make_divisible(input_channel, round_nearest) + self.last_channel = _make_divisible(last_channel * max(1.0, width_mult), round_nearest) + self.features_init = ConvBNReLU(3, input_channel, stride=2) + + self.unfold = nn.Unfold(kernel_size=(6,6),stride=(4,4)) + + self.rnn_model = RNNPool(6, 6, 8, 8, input_channel)#num_init_features) + self.fold = nn.Fold(kernel_size=(1,1),output_size=(27,27)) + + self.rnn_model_end = RNNPool(7, 7, int(self.last_channel/4), int(self.last_channel/4), self.last_channel) + + features=[] + + input_channel = 32 + + # building inverted residual blocks + for t, c, n, s in inverted_residual_setting: + output_channel = _make_divisible(c * width_mult, round_nearest) + for i in range(n): + stride = s if i == 0 else 1 + features.append(block(input_channel, output_channel, stride, expand_ratio=t)) + input_channel = output_channel + # building last several layers + features.append(ConvBNReLU(input_channel, self.last_channel, kernel_size=1)) + # make it nn.Sequential + self.features = nn.Sequential(*features) + + # building classifier + self.classifier = nn.Sequential( + #nn.Dropout(0.2), + 
nn.Linear(self.last_channel, num_classes), + ) + + # weight initialization + for m in self.modules(): + if isinstance(m, nn.Conv2d): + nn.init.kaiming_normal_(m.weight, mode='fan_out') + if m.bias is not None: + nn.init.zeros_(m.bias) + elif isinstance(m, nn.BatchNorm2d): + nn.init.ones_(m.weight) + nn.init.zeros_(m.bias) + elif isinstance(m, nn.Linear): + nn.init.normal_(m.weight, 0, 0.01) + nn.init.zeros_(m.bias) + + def forward(self, x): + batch_size = x.shape[0] + + x = self.features_init(x) + + patches = self.unfold(x) + patches = torch.cat(torch.unbind(patches,dim=2),dim=0) + patches = torch.reshape(patches,(-1,8,6,6)) + + + output_x = int((x.shape[2]-6)/4 + 1) + output_y = int((x.shape[3]-6)/4 + 1) + + rnnX = self.rnn_model(patches, int(batch_size)*output_x*output_y) + + x = torch.stack(torch.split(rnnX, split_size_or_sections=int(batch_size), dim=0),dim=2) + + x = self.fold(x) + + x = F.pad(x, (0,1,0,1), mode='replicate') + + x = self.features(x) + x = self.rnn_model_end(x, batch_size) + x = self.classifier(x) + return x + + +def mobilenetv2_rnnpool(pretrained=False, progress=True, **kwargs): + """ + Constructs a MobileNetV2 architecture from + `"MobileNetV2: Inverted Residuals and Linear Bottlenecks" `_. + Args: + pretrained (bool): If True, returns a model pre-trained on ImageNet + progress (bool): If True, displays a progress bar of the download to stderr + """ + model = MobileNetV2(**kwargs) + if pretrained: + state_dict = load_state_dict_from_url(model_urls['mobilenet_v2'], + progress=progress) + model.load_state_dict(state_dict) + return model diff --git a/examples/pytorch/vision/Visual_Wakeword/model_mobilenet_rnnpool.py b/examples/pytorch/vision/Visual_Wakeword/model_mobilenet_rnnpool.py new file mode 100755 index 000000000..40f30ecfd --- /dev/null +++ b/examples/pytorch/vision/Visual_Wakeword/model_mobilenet_rnnpool.py @@ -0,0 +1,206 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. + +import re +import torch +import torch.nn as nn +import torch.nn.functional as F +import numpy as np +import torch.utils.checkpoint as cp +from collections import OrderedDict +from torchvision.models.utils import load_state_dict_from_url +from edgeml_pytorch.graph.rnnpool import * + +__all__ = ['MobileNetV2', 'mobilenetv2_rnnpool'] + + +model_urls = { + 'mobilenet_v2': 'https://download.pytorch.org/models/mobilenet_v2-b0353104.pth', +} + + +def _make_divisible(v, divisor, min_value=None): + """ + This function is taken from the original tf repo. + It ensures that all layers have a channel number that is divisible by 8 + It can be seen here: + https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet/mobilenet.py + :param v: + :param divisor: + :param min_value: + :return: + """ + if min_value is None: + min_value = divisor + new_v = max(min_value, int(v + divisor / 2) // divisor * divisor) + # Make sure that round down does not go down by more than 10%. 
+ if new_v < 0.9 * v: + new_v += divisor + return new_v + + +class ConvBNReLU(nn.Sequential): + def __init__(self, in_planes, out_planes, kernel_size=3, stride=1, groups=1): + padding = (kernel_size - 1) // 2 + super(ConvBNReLU, self).__init__( + nn.Conv2d(in_planes, out_planes, kernel_size, stride, padding, groups=groups, bias=False), + nn.BatchNorm2d(out_planes, momentum=0.01), + nn.ReLU6(inplace=True) + ) + + +class InvertedResidual(nn.Module): + def __init__(self, inp, oup, stride, expand_ratio): + super(InvertedResidual, self).__init__() + self.stride = stride + assert stride in [1, 2] + + hidden_dim = int(round(inp * expand_ratio)) + self.use_res_connect = self.stride == 1 and inp == oup + + layers = [] + if expand_ratio != 1: + # pw + layers.append(ConvBNReLU(inp, hidden_dim, kernel_size=1)) + layers.extend([ + # dw + ConvBNReLU(hidden_dim, hidden_dim, stride=stride, groups=hidden_dim), + # pw-linear + nn.Conv2d(hidden_dim, oup, 1, 1, 0, bias=False), + nn.BatchNorm2d(oup, momentum=0.01), + ]) + self.conv = nn.Sequential(*layers) + + def forward(self, x): + if self.use_res_connect: + return x + self.conv(x) + else: + return self.conv(x) + + +class MobileNetV2(nn.Module): + def __init__(self, + num_classes=1000, + width_mult=0.5, + inverted_residual_setting=None, + round_nearest=8, + block=None, + last_channel = 1280): + """ + MobileNet V2 main class + Args: + num_classes (int): Number of classes + width_mult (float): Width multiplier - adjusts number of channels in each layer by this amount + inverted_residual_setting: Network structure + round_nearest (int): Round the number of channels in each layer to be a multiple of this number + Set to 1 to turn off rounding + block: Module specifying inverted residual building block for mobilenet + """ + super(MobileNetV2, self).__init__() + + if block is None: + block = InvertedResidual + input_channel = 8 + #last_channel = 1280 + + if inverted_residual_setting is None: + inverted_residual_setting = [ + # t, c, n, s + [6, 64, 4, 2], + [6, 96, 3, 1], + [6, 160, 3, 2], + [6, 320, 1, 1], + ] + + # only check the first element, assuming user knows t,c,n,s are required + if len(inverted_residual_setting) == 0 or len(inverted_residual_setting[0]) != 4: + raise ValueError("inverted_residual_setting should be non-empty " + "or a 4-element list, got {}".format(inverted_residual_setting)) + + # building first layer + input_channel = _make_divisible(input_channel, round_nearest) + self.last_channel = _make_divisible(last_channel * max(1.0, width_mult), round_nearest) + self.features_init = ConvBNReLU(3, input_channel, stride=2) + + self.unfold = nn.Unfold(kernel_size=(6,6),stride=(4,4)) + + self.rnn_model = RNNPool(6, 6, 8, 8, input_channel)#num_init_features) + self.fold = nn.Fold(kernel_size=(1,1),output_size=(27,27)) + + + features=[] + + input_channel = 32 + + # building inverted residual blocks + for t, c, n, s in inverted_residual_setting: + output_channel = _make_divisible(c * width_mult, round_nearest) + for i in range(n): + stride = s if i == 0 else 1 + features.append(block(input_channel, output_channel, stride, expand_ratio=t)) + input_channel = output_channel + # building last several layers + features.append(ConvBNReLU(input_channel, self.last_channel, kernel_size=1)) + self.features = nn.Sequential(*features) + + # building classifier + self.classifier = nn.Sequential( + nn.Dropout(0.2), + nn.Linear(self.last_channel, num_classes), + ) + + # weight initialization + for m in self.modules(): + if isinstance(m, nn.Conv2d): + 
nn.init.kaiming_normal_(m.weight, mode='fan_out') + if m.bias is not None: + nn.init.zeros_(m.bias) + elif isinstance(m, nn.BatchNorm2d): + nn.init.ones_(m.weight) + nn.init.zeros_(m.bias) + elif isinstance(m, nn.Linear): + nn.init.normal_(m.weight, 0, 0.01) + nn.init.zeros_(m.bias) + + def forward(self, x): + batch_size = x.shape[0] + + + x = self.features_init(x) + + patches = self.unfold(x) + patches = torch.cat(torch.unbind(patches,dim=2),dim=0) + patches = torch.reshape(patches,(-1,8,6,6)) + + + output_x = int((x.shape[2]-6)/4 + 1) + output_y = int((x.shape[3]-6)/4 + 1) + + rnnX = self.rnn_model(patches, int(batch_size)*output_x*output_y) + + x = torch.stack(torch.split(rnnX, split_size_or_sections=int(batch_size), dim=0),dim=2) + + x = self.fold(x) + + x = F.pad(x, (0,1,0,1), mode='replicate') + + x = self.features(x) + x = x.mean([2, 3]) + x = self.classifier(x) + return x + + +def mobilenetv2_rnnpool(pretrained=False, progress=True, **kwargs): + """ + Constructs a MobileNetV2 architecture from + `"MobileNetV2: Inverted Residuals and Linear Bottlenecks" `_. + Args: + pretrained (bool): If True, returns a model pre-trained on ImageNet + progress (bool): If True, displays a progress bar of the download to stderr + """ + model = MobileNetV2(**kwargs) + if pretrained: + state_dict = load_state_dict_from_url(model_urls['mobilenet_v2'], + progress=progress) + model.load_state_dict(state_dict) + return model diff --git a/examples/pytorch/vision/Visual_Wakeword/requirements.txt b/examples/pytorch/vision/Visual_Wakeword/requirements.txt new file mode 100644 index 000000000..f05a38c8b --- /dev/null +++ b/examples/pytorch/vision/Visual_Wakeword/requirements.txt @@ -0,0 +1,9 @@ +pycocotools +pyvww +easydict==1.9 +importlib-metadata==1.5.0 +matplotlib==3.2.1 +opencv-python-headless==4.2.0.32 +scikit-image==0.15.0 +tensorboard==1.14.0 +tensorboardX==1.9 \ No newline at end of file diff --git a/examples/pytorch/vision/Visual_Wakeword/scripts/create_coco_train_minival_split.py b/examples/pytorch/vision/Visual_Wakeword/scripts/create_coco_train_minival_split.py new file mode 100755 index 000000000..967b8c4a9 --- /dev/null +++ b/examples/pytorch/vision/Visual_Wakeword/scripts/create_coco_train_minival_split.py @@ -0,0 +1,120 @@ +## Code from https://github.com/Mxbonn/visualwakewords + + +"""Create maxitrain and minival annotations. + This script generates a new train validation split with 115k training and 8k validation images. + Based on the split used by Google + (https://raw.githubusercontent.com/tensorflow/models/master/research/object_detection/data/mscoco_minival_ids.txt). + + Usage: + From this folder, run the following commands: (2014 can be replaced by 2017 if you downloaded the 2017 dataset) + TRAIN_ANNOTATIONS_FILE="path-to-mscoco-dataset/annotations/instances_train2014.json" + VAL_ANNOTATIONS_FILE="path-to-mscoco-dataset/annotations/instances_val2014.json" + OUTPUT_DIR="path-to-mscoco-dataset/annotations/" + python create_coco_train_minival_split.py \ + --train_annotations_file="${TRAIN_ANNOTATIONS_FILE}" \ + --val_annotations_file="${VAL_ANNOTATIONS_FILE}" \ + --output_dir="${OUTPUT_DIR}" +""" +import json +import os +from argparse import ArgumentParser + + +def create_maxitrain_minival(train_file, val_file, output_dir): + """ Generate maxitrain and minival annotations files. + Loads COCO 2014/2017 train and validation json files and creates a new split with + 115k training images and 8k validation images. 
+ Based on the split used by Google + (https://raw.githubusercontent.com/tensorflow/models/master/research/object_detection/data/mscoco_minival_ids.txt). + Args: + train_file: JSON file containing COCO 2014 or 2017 train annotations + val_file: JSON file containing COCO 2014 or 2017 validation annotations + output_dir: Directory where the new annotation files will be stored. + """ + maxitrain_path = os.path.join( + output_dir, 'instances_maxitrain.json') + minival_path = os.path.join( + output_dir, 'instances_minival.json') + train_json = json.load(open(train_file, 'r')) + val_json = json.load(open(val_file, 'r')) + + info = train_json['info'] + categories = train_json['categories'] + licenses = train_json['licenses'] + + dir_path = os.path.dirname(os.path.realpath(__file__)) + file_path = os.path.join(dir_path, 'mscoco_minival_ids.txt') + minival_ids_f = open(file_path, 'r') + minival_ids = minival_ids_f.readlines() + minival_ids = [int(i) for i in minival_ids] + + train_images = train_json['images'] + val_images = val_json['images'] + train_annotations = train_json['annotations'] + val_annotations = val_json['annotations'] + + maxitrain_images = [] + minival_images = [] + maxitrain_annotations = [] + minival_annotations = [] + + for _images in [train_images, val_images]: + for img in _images: + img_id = img['id'] + if img_id in minival_ids: + minival_images.append(img) + else: + maxitrain_images.append(img) + + for _annotations in [train_annotations, val_annotations]: + for ann in _annotations: + img_id = ann['image_id'] + if img_id in minival_ids: + minival_annotations.append(ann) + else: + maxitrain_annotations.append(ann) + + with open(maxitrain_path, 'w') as fp: + json.dump( + { + "info": info, + "licenses": licenses, + 'images': maxitrain_images, + 'annotations': maxitrain_annotations, + 'categories': categories, + }, fp) + + with open(minival_path, 'w') as fp: + json.dump( + { + "info": info, + "licenses": licenses, + 'images': minival_images, + 'annotations': minival_annotations, + 'categories': categories, + }, fp) + + +def main(args): + output_dir = os.path.realpath(os.path.expanduser(args.output_dir)) + train_annotations_file = os.path.realpath(os.path.expanduser(args.train_annotations_file)) + val_annotations_file = os.path.realpath(os.path.expanduser(args.val_annotations_file)) + + if not os.path.isdir(output_dir): + os.makedirs(output_dir) + create_maxitrain_minival(train_annotations_file, val_annotations_file, output_dir) + + +if __name__ == '__main__': + parser = ArgumentParser(description="Script that takes the 2014/2017 training and validation annotations and" + "creates a train split of 115k images and a minival of 8k.") + parser.add_argument('--train_annotations_file', type=str, required=True, + help='COCO2014/2017 Training annotations JSON file') + parser.add_argument('--val_annotations_file', type=str, required=True, + help='COCO2014/2017 Validation annotations JSON file') + parser.add_argument('--output_dir', type=str, required=True, + help='Output directory where the maxitrain and minival annotations files will be stored') + + args = parser.parse_args() + main(args) diff --git a/examples/pytorch/vision/Visual_Wakeword/scripts/create_visualwakewords_annotations.py b/examples/pytorch/vision/Visual_Wakeword/scripts/create_visualwakewords_annotations.py new file mode 100755 index 000000000..f0b97acc1 --- /dev/null +++ b/examples/pytorch/vision/Visual_Wakeword/scripts/create_visualwakewords_annotations.py @@ -0,0 +1,217 @@ +## Code from 
https://github.com/Mxbonn/visualwakewords + + +"""Create Visual Wakewords annotations. + This script generates the Visual WakeWords dataset annotations from the raw COCO dataset. + The resulting annotations can then be used with `pyvww.utils.VisualWakeWords` and + `pyvww.pytorch.VisualWakeWordsClassification`. + + Visual WakeWords Dataset is derived from the COCO dataset to design tiny models + classifying two classes, such as person/not-person. The COCO annotations + are filtered to two classes: foreground_class and background + (for e.g. person and not-person). Bounding boxes for small objects + with area less than 5% of the image area are filtered out. + The resulting annotations file follows the COCO data format. + { + "info" : info, + "images" : [image], + "annotations" : [annotation], + "licenses" : [license], + } + + info{ + "year" : int, + "version" : str, + "description" : str, + "url" : str, + } + + image{ + "id" : int, + "width" : int, + "height" : int, + "file_name" : str, + "license" : int, + "flickr_url" : str, + "coco_url" : str, + "date_captured" : datetime, + } + + license{ + "id" : int, + "name" : str, + "url" : str, + } + + annotation{ + "id" : int, + "image_id" : int, + "category_id" : int, + "area" : float, + "bbox" : [x,y,width,height], + "iscrowd" : 0 or 1, + } + + Example usage: + From this folder, run the following commands: + bash download_mscoco.sh path-to-mscoco-dataset + TRAIN_ANNOTATIONS_FILE="path-to-mscoco-dataset/annotations/instances_train2014.json" + VAL_ANNOTATIONS_FILE="path-to-mscoco-dataset/annotations/instances_val2014.json" + DIR="path-to-mscoco-dataset/annotations/" + python create_coco_train_minival_split.py \ + --train_annotations_file="${TRAIN_ANNOTATIONS_FILE}" \ + --val_annotations_file="${VAL_ANNOTATIONS_FILE}" \ + --output_dir="${DIR}" + MAXITRAIN_ANNOTATIONS_FILE="path-to-mscoco-dataset/annotations/instances_maxitrain.json" + MINIVAL_ANNOTATIONS_FILE="path-to-mscoco-dataset/annotations/instances_minival.json" + VWW_OUTPUT_DIR="new-path-to-visualwakewords-dataset/annotations/" + python create_visualwakewords_annotations.py \ + --train_annotations_file="${MAXITRAIN_ANNOTATIONS_FILE}" \ + --val_annotations_file="${MINIVAL_ANNOTATIONS_FILE}" \ + --output_dir="${VWW_OUTPUT_DIR}" \ + --threshold=0.005 \ + --foreground_class='person' +""" + +import json +import os +from argparse import ArgumentParser + +from pycocotools.coco import COCO + + +def create_visual_wakeword_annotations(annotations_file, + visualwakewords_annotations_path, + object_area_threshold, + foreground_class_name): + """Generate visual wake words annotations file. + Loads COCO annotation json files and filters to foreground_class_name/not-foreground_class_name + (by default it will be person/not-person) to generate visual wake words annotations file. + Each image is assigned a label 1 or 0. The label 1 is assigned as long + as it has at least one foreground_class_name (e.g. person) + bounding box greater than object_area_threshold (e.g. 5% of the image area). 
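    Example (editor's note): with object_area_threshold=0.005, a 640x480 image
    (307,200 px) containing a person box of area 3,000 px (about 1% of the image)
    receives label 1, while the same image with only a 1,000 px person box
    (about 0.3%) receives label 0.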
+ Args: + annotations_file: JSON file containing COCO bounding box annotations + visualwakewords_annotations_path: output path to annotations file + object_area_threshold: threshold on fraction of image area below which + small object bounding boxes are filtered + foreground_class_name: category from COCO dataset that is filtered by + the visual wakewords dataset + """ + print('Processing {}...'.format(annotations_file)) + coco = COCO(annotations_file) + + info = {"description": "Visual Wake Words Dataset", + "url": "https://arxiv.org/abs/1906.05721", + "version": "1.0", + "year": 2019, + } + + # default object of interest is person + foreground_class_id = 1 + dataset = coco.dataset + licenses = dataset['licenses'] + + images = dataset['images'] + # Create category index + foreground_category = None + background_category = {'supercategory': 'background', 'id': 0, 'name': 'background'} + for category in dataset['categories']: + if category['name'] == foreground_class_name: + foreground_class_id = category['id'] + foreground_category = category + foreground_category['id'] = 1 + background_category['name'] = "not-{}".format(foreground_category['name']) + categories = [background_category, foreground_category] + + if not 'annotations' in dataset: + raise KeyError('Need annotations in json file to build the dataset.') + new_ann_id = 0 + annotations = [] + positive_img_ids = set() + foreground_imgs_ids = coco.getImgIds(catIds=foreground_class_id) + for img_id in foreground_imgs_ids: + img = coco.imgs[img_id] + img_area = img['height'] * img['width'] + for ann_id in coco.getAnnIds(imgIds=img_id, catIds=foreground_class_id): + ann = coco.anns[ann_id] + if 'area' in ann: + normalized_ann_area = ann['area'] / img_area + if normalized_ann_area > object_area_threshold: + new_ann = { + "id": new_ann_id, + "image_id": img_id, + "category_id": 1, + "area": ann["area"], + "bbox": ann["bbox"], + "iscrowd": ann["iscrowd"], + } + annotations.append(new_ann) + positive_img_ids.add(img_id) + new_ann_id += 1 + print("There are {} images that now have label {}, of the {} images in total.".format(len(positive_img_ids), + foreground_class_name, + len(coco.imgs))) + negative_img_ids = list(set(coco.imgs.keys()) - positive_img_ids) + for img_id in negative_img_ids: + new_ann = { + "id": new_ann_id, + "image_id": img_id, + "category_id": 0, + "area": 0.0, + "bbox": [], + "iscrowd": 0, + } + annotations.append(new_ann) + new_ann_id += 1 + + # Output Visual WakeWords annotations and labels + with open(visualwakewords_annotations_path, 'w') as fp: + json.dump( + { + "info": info, + "licenses": licenses, + 'images': images, + 'annotations': annotations, + 'categories': categories, + }, fp) + + +def main(args): + output_dir = os.path.realpath(os.path.expanduser(args.output_dir)) + train_annotations_file = os.path.realpath(os.path.expanduser(args.train_annotations_file)) + val_annotations_file = os.path.realpath(os.path.expanduser(args.val_annotations_file)) + visualwakewords_annotations_train = os.path.join( + output_dir, 'instances_train.json') + visualwakewords_annotations_val = os.path.join( + output_dir, 'instances_val.json') + small_object_area_threshold = args.threshold + foreground_class_of_interest = args.foreground_class + + # Create the Visual WakeWords annotations from COCO annotations + if not os.path.isdir(output_dir): + os.makedirs(output_dir) + create_visual_wakeword_annotations( + train_annotations_file, visualwakewords_annotations_train, + small_object_area_threshold, foreground_class_of_interest) + 
create_visual_wakeword_annotations( + val_annotations_file, visualwakewords_annotations_val, + small_object_area_threshold, foreground_class_of_interest) + + +if __name__ == '__main__': + parser = ArgumentParser() + parser.add_argument('--train_annotations_file', type=str, required=True, + help='(COCO) Training annotations JSON file') + parser.add_argument('--val_annotations_file', type=str, required=True, + help='(COCO) Validation annotations JSON file') + parser.add_argument('--output_dir', type=str, default='/tmp/visualwakewords/', + help='Output directory where the Visual WakeWords annotations files be stored') + parser.add_argument('--threshold', type=float, default=0.005, + help='Threshold of fraction of image area below which small objects are filtered.') + parser.add_argument('--foreground_class', type=str, default='person', + help='Annotations will have a label indicating if this object is present or absent' + 'in the scene (default is person/not-person).') + + args = parser.parse_args() + main(args) diff --git a/examples/pytorch/vision/Visual_Wakeword/scripts/download_mscoco.sh b/examples/pytorch/vision/Visual_Wakeword/scripts/download_mscoco.sh new file mode 100755 index 000000000..763f1fcf0 --- /dev/null +++ b/examples/pytorch/vision/Visual_Wakeword/scripts/download_mscoco.sh @@ -0,0 +1,80 @@ +# Copyright 2020 Maxim Bonnaerens. All Rights Reserved. +# +# Copyright 2019 The TensorFlow Authors. All Rights Reserved. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# ============================================================================== +# File modified from tensorflow/models/research/slim/datasets/download_mscoco.sh + +# Script to download the COCO dataset. See +# http://cocodataset.org/#overview for an overview of the dataset. +# +# usage: +# bash scripts/download_mscoco.sh path-to-COCO-dataset +# +set -e + +YEAR=${2:-2014} +if [ -z "$1" ]; then + echo "usage download_mscoco.sh [data dir] (2014|2017)" + exit +fi + +if [ "$(uname)" == "Darwin" ]; then + UNZIP="tar -xf" +else + UNZIP="unzip -nq" +fi + +# Create the output directories. +OUTPUT_DIR="${1%/}" +mkdir -p "${OUTPUT_DIR}" + +# Helper function to download and unpack a .zip file. +function download_and_unzip() { + local BASE_URL=${1} + local FILENAME=${2} + + if [ ! -f "${FILENAME}" ]; then + echo "Downloading ${FILENAME} to $(pwd)" + wget -nd -c "${BASE_URL}/${FILENAME}" + else + echo "Skipping download of ${FILENAME}" + fi + echo "Unzipping ${FILENAME}" + ${UNZIP} "${FILENAME}" + rm "${FILENAME}" +} + +cd "${OUTPUT_DIR}" + +# Download the images. 
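# Editor's note: the remainder of this script downloads the train${YEAR} and
# val${YEAR} image archives, merges both image sets into all${YEAR}/ via
# symbolic links, and finally fetches the trainval annotations archive.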
+BASE_IMAGE_URL="http://images.cocodataset.org/zips" + +TRAIN_IMAGE_FILE="train${YEAR}.zip" +download_and_unzip ${BASE_IMAGE_URL} "${TRAIN_IMAGE_FILE}" +TRAIN_IMAGE_DIR="${OUTPUT_DIR}/train${YEAR}" + +VAL_IMAGE_FILE="val${YEAR}.zip" +download_and_unzip ${BASE_IMAGE_URL} "${VAL_IMAGE_FILE}" +VAL_IMAGE_DIR="${OUTPUT_DIR}/val${YEAR}" + +COMMON_DIR="all$YEAR" +mkdir -p "${COMMON_DIR}" +for i in ${TRAIN_IMAGE_DIR}/*; do cp --symbolic-link "$i" ${COMMON_DIR}/; done +for i in ${VAL_IMAGE_DIR}/*; do cp --symbolic-link "$i" ${COMMON_DIR}/; done + +# Download the annotations. +BASE_INSTANCES_URL="http://images.cocodataset.org/annotations" +INSTANCES_FILE="annotations_trainval${YEAR}.zip" +download_and_unzip ${BASE_INSTANCES_URL} "${INSTANCES_FILE}" diff --git a/examples/pytorch/vision/Visual_Wakeword/scripts/mscoco_minival_ids.txt b/examples/pytorch/vision/Visual_Wakeword/scripts/mscoco_minival_ids.txt new file mode 100644 index 000000000..5bbff3c18 --- /dev/null +++ b/examples/pytorch/vision/Visual_Wakeword/scripts/mscoco_minival_ids.txt @@ -0,0 +1,8059 @@ +25096 +251824 +35313 +546011 +524186 +205866 +511403 +313916 +47471 +258628 +233560 +576017 +404517 +410056 +178690 +248980 +511724 +429718 +163076 +244111 +126766 +313182 +191981 +139992 +325237 +248129 +214519 +175438 +493321 +174103 +563762 +536795 +289960 +473720 +515540 +292118 +360851 +267175 +532876 +171613 +581415 +259819 +441841 +381682 +58157 +4980 +473929 +70626 +93773 +283412 +36765 +495020 +278401 +329307 +192810 +491784 +506416 +225495 +553747 +86442 +242208 +132686 +385877 +290248 +525705 +5476 +486521 +332512 +138556 +348083 +284375 +40018 +296994 +38685 +432429 +183407 +434358 +472164 +530494 +570693 +193401 +392612 +98872 +445766 +532209 +98322 +285114 +267725 +51605 +314812 +91105 +535506 +540264 +375341 +449828 +277659 +68933 +76873 +217554 +213592 +190776 +516224 +474479 +343599 +578813 +128669 +546292 +475365 +377626 +128833 +427091 +547227 +11742 +80213 +462241 +374574 +121572 +29151 +13892 +262394 +303667 +198724 +7320 +448492 +419080 +460379 +483965 +556516 +139181 +1103 +308715 +207507 +213827 +216083 +445597 +240275 +379585 +116389 +138124 +559051 +326898 +419386 +503660 +519460 +23893 +24458 +518109 +462982 +151492 +514254 +2477 +147165 +570394 +548766 +250083 +364341 +351967 +386277 +328084 +511299 +499349 +315501 +234965 +428562 +219771 +288150 +136021 +168619 +298316 +75118 +189752 +243857 +296222 +554002 +533628 +384596 +202981 +498350 +391463 +183991 +528062 +451084 +7899 +408534 +329030 +318566 +22492 +361285 +226973 +213356 +417265 +105622 +161169 +261487 +167477 +233370 +142999 +256713 +305833 +103579 +352538 +135763 +392144 +61181 +200302 +456908 +286858 +179850 +488075 +174511 +194755 +317822 +2302 +304596 +172556 +548275 +341678 +55299 +134760 +352936 +545129 +377012 +141328 +103757 +552837 +28246 +125167 +328745 +278760 +337133 +403389 +146825 +502558 +265916 +428985 +492041 +113403 +372037 +306103 +287574 +187495 +479805 +336309 +162043 +95899 +43133 +464248 +149115 +247438 +74030 +130645 +282841 +127092 +101172 +536743 +179642 +58133 +49667 +170605 +11347 +365277 +201970 +292663 +217219 +463226 +41924 +281102 +357816 +490878 +100343 +525058 +133503 +416145 +29341 +415413 +125527 +507951 +262609 +240210 +581781 +345137 +526342 +268641 +328777 +32001 +137538 +39115 +415958 +6771 +421865 +64909 +383601 +206907 +420840 +370980 +28452 +571893 +153520 +185890 +392991 +547013 +257359 +279879 +478614 +131919 +40937 +22874 +173375 +106344 +44801 +205401 +312870 +400886 +351530 +344013 +173500 +470423 +396729 
+402499 +276585 +377097 +367619 +518908 +263866 +332292 +67805 +152211 +515025 +221350 +525247 +78490 +504342 +95908 +82668 +256199 +220270 +552065 +242379 +84866 +152281 +228464 +223122 +67537 +456968 +368349 +101985 +14681 +543551 +107558 +372009 +99054 +126540 +86877 +492785 +482585 +571564 +501116 +296871 +20395 +181518 +568041 +121154 +56187 +190018 +97156 +310325 +393274 +214574 +243222 +289949 +452121 +150508 +341752 +310757 +24040 +228551 +335589 +12020 +529597 +459884 +344888 +229713 +51948 +370929 +552061 +261072 +120070 +332067 +263014 +158993 +451714 +397327 +20965 +414340 +574946 +370266 +487534 +492246 +264771 +73702 +43997 +235124 +301093 +400048 +77681 +58472 +331386 +13783 +242513 +419158 +59325 +383033 +393258 +529041 +249276 +182775 +351793 +9727 +334069 +566771 +539355 +38662 +423617 +47559 +120592 +508303 +462565 +47916 +218208 +182362 +562101 +441442 +71239 +395378 +522637 +25603 +484450 +872 +171483 +527248 +323155 +240754 +15032 +419144 +313214 +250917 +333430 +242757 +221914 +283190 +194297 +228506 +550691 +172513 +312192 +530619 +113867 +323552 +374115 +35435 +160239 +62877 +441873 +196574 +62858 +557114 +427612 +242869 +356733 +304828 +24880 +490509 +407083 +457877 +402788 +536416 +385912 +544121 +500389 +451102 +12120 +483476 +70987 +482799 +542549 +49236 +424258 +435783 +182366 +438093 +501824 +232845 +53965 +223198 +288933 +450458 +285664 +196484 +408930 +519815 +290981 +398567 +315792 +490683 +257136 +75611 +302498 +332153 +82293 +416911 +558608 +564659 +536195 +370260 +57904 +527270 +6593 +145620 +551650 +470832 +515785 +251404 +287331 +150788 +334006 +266117 +10039 +579158 +328397 +468351 +550400 +31745 +405970 +16761 +323515 +459598 +558457 +570736 +476939 +472610 +72155 +112517 +13659 +530905 +458768 +43486 +560893 +493174 +31217 +262736 +412204 +142722 +151231 +480643 +197245 +398666 +444869 +110999 +191724 +479057 +492420 +170638 +277329 +301908 +395644 +537611 +141887 +47149 +403432 +34818 +372495 +67994 +337497 +478586 +249815 +533462 +281032 +289941 +151911 +271215 +407868 +360700 +508582 +103873 +353658 +369081 +406403 +331692 +26430 +105655 +572630 +37181 +91336 +484587 +318284 +113019 +33055 +25293 +229324 +374052 +384111 +213951 +315195 +319283 +539453 +17655 +308974 +326243 +539436 +417876 +526940 +356347 +221932 +73753 +292648 +262284 +304924 +558587 +374858 +253518 +311744 +539636 +40924 +136624 +334305 +365997 +63355 +191226 +526732 +367128 +575198 +500657 +50637 +17182 +424792 +565353 +563040 +383494 +74458 +155142 +197125 +223857 +428241 +440830 +371289 +437303 +330449 +93771 +82715 +499631 +381257 +563951 +192834 +528600 +404273 +270554 +208053 +188613 +484760 +432016 +129800 +91756 +523097 +317018 +487282 +444913 +159500 +126822 +540564 +105812 +560756 +306099 +471226 +123842 +513219 +154877 +497034 +283928 +564003 +238602 +194780 +462728 +558640 +524373 +455624 +3690 +560367 +316351 +455772 +223777 +161517 +243034 +250440 +239975 +441008 +324715 +152106 +246973 +462805 +296521 +412767 +530913 +370165 +292526 +107244 +217440 +330204 +220176 +577735 +197022 +127451 +518701 +212322 +204887 +27696 +348474 +119233 +282804 +230040 +425690 +409241 +296825 +296353 +375909 +123136 +573891 +338256 +198247 +373375 +151051 +500084 +557596 +120478 +44989 +283380 +149005 +522065 +626 +17198 +309633 +524245 +291589 +322714 +455847 +248468 +371948 +444928 +20438 +481670 +147195 +95022 +548159 +553165 +395324 +391371 +86884 +561121 +219737 +38875 +338159 +377881 +185472 +359277 +114861 +378048 +126226 +10217 +320246 +15827 +178236 +370279 +352978 +408101 
+77615 +337044 +223714 +20796 +352445 +263834 +156704 +377867 +119402 +399567 +1180 +257941 +560675 +390471 +209290 +258382 +466339 +56437 +195042 +384230 +203214 +36077 +283038 +38323 +158770 +532381 +395903 +375461 +397857 +326798 +371699 +369503 +495626 +464328 +462211 +397719 +434089 +424793 +476770 +531852 +303538 +525849 +480917 +419653 +265063 +48956 +5184 +279149 +396727 +374266 +124429 +36124 +240213 +147556 +339512 +577182 +288599 +257169 +178254 +393869 +122314 +28713 +48133 +540681 +100974 +368459 +500110 +73634 +460982 +203878 +578344 +443602 +502012 +399666 +103603 +22090 +257529 +176328 +536656 +408873 +116881 +460972 +33835 +460781 +51223 +46463 +89395 +407646 +337453 +461715 +16257 +426987 +234889 +3125 +165643 +517472 +451435 +206800 +112128 +331236 +163306 +94185 +498716 +532732 +146509 +458567 +153832 +105996 +353398 +546976 +283060 +247624 +110048 +243491 +154798 +543600 +149962 +355256 +352900 +203081 +372203 +284605 +516244 +190494 +150301 +326082 +64146 +402858 +413538 +399510 +460251 +94336 +458721 +57345 +424162 +423508 +69356 +567220 +509786 +37038 +111535 +341318 +372067 +358120 +244909 +180653 +39852 +438560 +357041 +67065 +51928 +171717 +520430 +552395 +431355 +528084 +20913 +309610 +262323 +573784 +449485 +154846 +283438 +430871 +199578 +516318 +563912 +348483 +485613 +143440 +94922 +168817 +74457 +45830 +66297 +514173 +99186 +296236 +230903 +452312 +476444 +568981 +100811 +237350 +194724 +453622 +49559 +270609 +113701 +415393 +92173 +137004 +188795 +148280 +448114 +575964 +163155 +518719 +219329 +214247 +363927 +65357 +87617 +552612 +457817 +124796 +47740 +560463 +513968 +273637 +354212 +95959 +261061 +307265 +316237 +191342 +463272 +169273 +396518 +93261 +572733 +407386 +202658 +446497 +420852 +229274 +432724 +34900 +352533 +49891 +66144 +146831 +467484 +97988 +561647 +301155 +507421 +173217 +577584 +451940 +99927 +350639 +178941 +485155 +175948 +360673 +92963 +361321 +48739 +577310 +517795 +93405 +506458 +394681 +167920 +16995 +519573 +270532 +527750 +563403 +494608 +557780 +178691 +8676 +186927 +550173 +361656 +575911 +281315 +534377 +57570 +340894 +37624 +143103 +538243 +425077 +376545 +108129 +170974 +7522 +408906 +264279 +79415 +344025 +186797 +234349 +226472 +123639 +225177 +237984 +38714 +223671 +358247 +152465 +521405 +453722 +361111 +557117 +235832 +309341 +268469 +108353 +532531 +357279 +537280 +437618 +122953 +7088 +36693 +127659 +431901 +57244 +567565 +568111 +202926 +504516 +555685 +322369 +347620 +110231 +568982 +295340 +529798 +300341 +158160 +73588 +119476 +387216 +154994 +259755 +211282 +433971 +263588 +299468 +570138 +123017 +355106 +540172 +406215 +8401 +548844 +161820 +396432 +495348 +222407 +53123 +491556 +108130 +440617 +448309 +22596 +346841 +213829 +135076 +56326 +233139 +487418 +227326 +137763 +383389 +47882 +207797 +167452 +112065 +150703 +421109 +171753 +158279 +240800 +66821 +152886 +163640 +475466 +301799 +106712 +470885 +536370 +420389 +396768 +281950 +18903 +357529 +33650 +168243 +201004 +389295 +557150 +185327 +181256 +557396 +182025 +61564 +301928 +332455 +199403 +18444 +177452 +204206 +38465 +215906 +153103 +445019 +324527 +299207 +429281 +574675 +157067 +241269 +100850 +502818 +576566 +296775 +873 +280363 +355240 +383445 +286182 +67327 +422778 +494855 +337246 +266853 +47516 +381991 +44081 +403862 +381430 +370798 +173383 +387173 +22396 +484066 +349414 +262235 +492814 +65238 +209420 +336276 +453328 +407286 +420490 +360328 +158440 +398534 +489475 +477389 +297108 +69750 +507833 +198992 +99736 +546444 +514914 +482574 +54355 
+63478 +191693 +61684 +412914 +267408 +424641 +56872 +318080 +30290 +33441 +199310 +337403 +26731 +453390 +506137 +188945 +185950 +239843 +357944 +290570 +523637 +551952 +513397 +357870 +523517 +277048 +259879 +186991 +521943 +21900 +281074 +187194 +526723 +568147 +513037 +177338 +243831 +203488 +208494 +188460 +289943 +399177 +404668 +160761 +271143 +76087 +478922 +440045 +449432 +61025 +331138 +227019 +147577 +548337 +444294 +458663 +236837 +6854 +444926 +484816 +516641 +397863 +188534 +64822 +213453 +66561 +43218 +514901 +322844 +498453 +488788 +391656 +298994 +64088 +464706 +193720 +199017 +186427 +15278 +350386 +342335 +372024 +550939 +35594 +381382 +235902 +26630 +213765 +550001 +129706 +577149 +353096 +376891 +28499 +427041 +314965 +231163 +5728 +347836 +184388 +27476 +284860 +476872 +301317 +99546 +147653 +529515 +311922 +20777 +2613 +59463 +430670 +560744 +60677 +332087 +296724 +353321 +103306 +363887 +76431 +423058 +120340 +119452 +6723 +462327 +163127 +402723 +489382 +183181 +107656 +375409 +355228 +430762 +512468 +409125 +270544 +559113 +495388 +529434 +38355 +422025 +379667 +131386 +183409 +573536 +581317 +425404 +350084 +472 +28532 +329717 +230220 +187196 +484166 +97434 +224595 +87483 +516998 +314876 +32610 +514586 +344816 +394418 +402330 +305993 +371497 +315790 +294908 +207431 +561014 +26584 +368671 +374990 +54747 +47571 +449424 +283761 +84735 +522127 +120473 +524656 +479659 +131627 +450959 +153300 +580908 +207785 +49115 +284991 +96505 +278306 +291655 +1404 +489304 +557459 +37740 +157465 +390475 +119166 +33871 +247428 +75905 +20779 +65035 +333556 +375415 +383676 +505243 +87327 +16451 +287235 +70190 +245067 +417520 +229234 +183786 +333018 +554156 +198915 +108021 +128262 +412443 +242543 +555050 +436511 +445233 +207886 +156397 +526257 +521357 +413043 +427189 +401614 +94823 +351130 +105945 +182314 +305879 +526197 +64409 +496800 +236461 +138175 +43816 +185904 +345711 +72536 +526737 +360400 +556537 +426053 +59044 +28290 +222548 +434915 +418623 +246454 +111801 +12448 +427133 +459117 +11262 +169045 +469996 +304390 +513096 +322822 +196371 +504977 +395364 +243950 +216218 +417217 +106736 +58194 +504101 +478522 +379314 +30432 +207027 +297146 +91844 +176031 +98287 +278095 +196053 +343692 +523137 +220224 +349485 +376193 +407067 +185781 +37871 +336464 +46331 +44244 +80274 +170147 +361106 +468499 +537864 +467457 +267343 +291528 +287828 +555648 +388284 +576085 +531973 +350122 +422253 +509811 +78093 +410019 +133090 +581205 +343976 +9007 +92478 +450674 +486306 +503978 +46378 +335578 +404071 +225558 +217923 +406217 +138054 +575815 +234990 +336257 +159240 +399516 +226408 +531126 +138599 +61693 +89861 +29504 +163296 +477906 +48419 +25595 +195594 +97592 +392555 +203849 +139248 +245651 +275755 +245426 +127279 +521359 +517623 +235747 +475906 +11198 +336101 +70134 +505447 +218996 +30080 +484457 +120441 +575643 +132703 +197915 +505576 +90956 +99741 +517819 +240918 +150834 +207306 +132682 +88250 +213599 +462584 +413321 +361521 +496081 +410583 +440027 +417284 +397069 +280498 +473171 +129739 +279774 +29370 +518899 +509867 +85556 +434930 +280710 +55077 +348793 +157756 +281111 +190689 +281447 +502854 +232894 +268742 +199553 +220808 +137330 +256903 +116017 +466416 +41635 +110906 +340934 +557501 +146767 +517617 +487159 +1561 +417281 +489014 +292463 +113533 +412247 +263973 +515444 +343561 +310200 +293804 +225867 +150320 +183914 +9707 +89999 +177842 +296524 +287829 +68300 +363654 +465986 +159969 +313948 +522779 +219820 +198352 +12959 +266727 +8016 +175804 +497867 +307892 +287527 +309638 +205854 +114119 +23023 
+322586 +383341 +134198 +553522 +70426 +329138 +105367 +175597 +187791 +17944 +366611 +93493 +242422 +41842 +558840 +32203 +19667 +124297 +383726 +252625 +234794 +498228 +102906 +287967 +69021 +51326 +243896 +509423 +440124 +122582 +344325 +34455 +442478 +23587 +236904 +185633 +349841 +44294 +112568 +186296 +71914 +3837 +135486 +223747 +557517 +385181 +265313 +404263 +26564 +516867 +497096 +332351 +345139 +444304 +510877 +356387 +561214 +311471 +408789 +561729 +291380 +174671 +45710 +435136 +388858 +361693 +50811 +531134 +573605 +340175 +534988 +382671 +327047 +348400 +547137 +401037 +490711 +499266 +236370 +449075 +334015 +107234 +232315 +462953 +252048 +186822 +410168 +28994 +45550 +453626 +417957 +468577 +106338 +391684 +375143 +217622 +357903 +347648 +142182 +213843 +299148 +352587 +436676 +161875 +144655 +304741 +235017 +181799 +211042 +335507 +553731 +412531 +229740 +437129 +423830 +561806 +337666 +52016 +138057 +70254 +494393 +73119 +262425 +565395 +305329 +489611 +377080 +569450 +549766 +332940 +235302 +53893 +203781 +38449 +114870 +18699 +396338 +449839 +423613 +379767 +369594 +375812 +359219 +229311 +291675 +224907 +416885 +32964 +573406 +17282 +103375 +81860 +576886 +461334 +35672 +243442 +217269 +445055 +211112 +455675 +412384 +88967 +550643 +24223 +504074 +9275 +155546 +329542 +172658 +331600 +315492 +194208 +162867 +324614 +432017 +140860 +157944 +406616 +486079 +361172 +258346 +494140 +315384 +451014 +242619 +413684 +386187 +408501 +121089 +343603 +232538 +558671 +551596 +32992 +406647 +435260 +11156 +40896 +175382 +110560 +252968 +189694 +63154 +564816 +72004 +164788 +434583 +453104 +111878 +268484 +290768 +473215 +450620 +32673 +277479 +529917 +315868 +562419 +378347 +398637 +84097 +120527 +134193 +431472 +400238 +86426 +208830 +524535 +22213 +516813 +526044 +386193 +246672 +386739 +559252 +153344 +236123 +246074 +323615 +92644 +408621 +323231 +499940 +296105 +578902 +150098 +145015 +131431 +318618 +68409 +497928 +362520 +467755 +112702 +163219 +277289 +192362 +497674 +525439 +56267 +465868 +407570 +551608 +345211 +179653 +55295 +97315 +534041 +505822 +411082 +132375 +25378 +272008 +536605 +123511 +148737 +577712 +493751 +29587 +468297 +528458 +491058 +558976 +181421 +209685 +147545 +486964 +570516 +168662 +19446 +395997 +242911 +232511 +317035 +354527 +5961 +513793 +124390 +370123 +113397 +195790 +252813 +326919 +432414 +409239 +458221 +115667 +212239 +279279 +375554 +546622 +317188 +260818 +286021 +377111 +209868 +243148 +132037 +560624 +459721 +193498 +22623 +254164 +112841 +383470 +62692 +227940 +471335 +44858 +213649 +179898 +102837 +474078 +44478 +256197 +309492 +182923 +421139 +275695 +104965 +480780 +449749 +76513 +578591 +336695 +247474 +320490 +246105 +53183 +485740 +575823 +510735 +290741 +37017 +348708 +279784 +453634 +567644 +434192 +482719 +435324 +544299 +106896 +569926 +301574 +492885 +103462 +487151 +513585 +219647 +303685 +459645 +76292 +188579 +154883 +207728 +425074 +310493 +27221 +371694 +119404 +399665 +273556 +454577 +580698 +267664 +295769 +423740 +22461 +22667 +508443 +390401 +369997 +524627 +193349 +132223 +576743 +130586 +487741 +107542 +501420 +520109 +308156 +540581 +231362 +86471 +472930 +351133 +463605 +575577 +159842 +39504 +223020 +63525 +298627 +139883 +375205 +303549 +16838 +495680 +408112 +394474 +188044 +472143 +463751 +31481 +378139 +190853 +442614 +172006 +140270 +133051 +178028 +495090 +88455 +13232 +46323 +346275 +425905 +487013 +433136 +514402 +521906 +4157 +61418 +567205 +213351 +304008 +296492 +506561 +408120 +415961 +323186 
+480379 +349199 +201918 +135023 +456483 +136173 +237917 +4972 +99081 +331569 +150007 +36450 +93400 +487461 +203629 +218093 +487181 +113935 +139512 +210981 +358883 +47419 +248382 +80357 +462663 +83097 +26159 +80429 +283055 +452676 +50159 +12326 +29430 +303264 +158122 +569070 +52925 +534876 +46975 +426376 +170293 +434417 +235517 +218476 +445008 +482774 +305632 +116848 +557252 +229270 +453485 +382214 +54759 +59171 +193328 +17152 +238071 +148531 +409725 +75434 +65358 +473057 +415408 +579415 +48636 +269606 +298784 +162799 +356400 +326854 +24601 +66499 +340247 +20992 +190218 +548464 +122203 +405306 +495376 +536028 +5713 +206831 +9395 +503939 +194440 +474253 +395849 +165141 +204935 +412621 +402922 +87141 +570664 +202622 +137362 +221737 +78947 +112129 +341957 +169562 +164780 +360216 +107641 +415015 +444955 +559102 +123070 +176592 +309366 +116461 +222075 +530470 +214363 +414487 +471567 +292123 +370210 +364243 +510254 +396350 +141524 +220310 +398604 +145436 +392476 +17482 +78032 +336171 +130812 +489743 +346638 +418854 +139072 +263860 +458240 +383443 +337533 +182334 +535608 +517946 +489924 +308117 +129945 +59973 +538364 +513458 +449433 +25165 +335851 +487688 +153834 +347612 +349689 +443688 +486008 +479149 +442286 +61108 +315338 +511546 +506444 +775 +121839 +291412 +497626 +387223 +367095 +557896 +196118 +530652 +447991 +215622 +232160 +296731 +272273 +473415 +364705 +235790 +479950 +141278 +547903 +66523 +353989 +121875 +237735 +100083 +348941 +288983 +390083 +168248 +120776 +489764 +219135 +551713 +256035 +309005 +112493 +579759 +114972 +458992 +295768 +158497 +309696 +363844 +507966 +313491 +280779 +327130 +292901 +127761 +183843 +456521 +164475 +224281 +443713 +72514 +567383 +476215 +565650 +17708 +474471 +248334 +196313 +164759 +212453 +319024 +332916 +35436 +113139 +172716 +7570 +161609 +144534 +137475 +561411 +45844 +332027 +36990 +190160 +421231 +283210 +365611 +511407 +400887 +485071 +481214 +347203 +153506 +397403 +229599 +357322 +76034 +101189 +567444 +92363 +526767 +218811 +362812 +339120 +579696 +399269 +10705 +549012 +410428 +105623 +535307 +419235 +119911 +236604 +515779 +188173 +66397 +549119 +478742 +256180 +128224 +440539 +112818 +315434 +97513 +171970 +433483 +226008 +83217 +424548 +343753 +350334 +479280 +208808 +43266 +399893 +444386 +47687 +499093 +565269 +465835 +167486 +433460 +169872 +299640 +158466 +241373 +50576 +161567 +73560 +349804 +181745 +352684 +450357 +532693 +88335 +256518 +94926 +541197 +14629 +276149 +539439 +498738 +25654 +291330 +146465 +160190 +513064 +75748 +499007 +164464 +134042 +422416 +543315 +34056 +303197 +394801 +293071 +44964 +529083 +414522 +331180 +227599 +581040 +382850 +159898 +176841 +205352 +540782 +406591 +184499 +14380 +350230 +458175 +528786 +314935 +111086 +2191 +20371 +337042 +558371 +296907 +539937 +511463 +574856 +87864 +403817 +152598 +169712 +533227 +173545 +478862 +19455 +258433 +373440 +460229 +525682 +176857 +525050 +277025 +156416 +206784 +415179 +183204 +210374 +312868 +514366 +65208 +376342 +515792 +383066 +85247 +119132 +338007 +88748 +206705 +495808 +532164 +150686 +35474 +207860 +111165 +391199 +346011 +537721 +11390 +487482 +360983 +400347 +92795 +347506 +324322 +371958 +101280 +222842 +563604 +210299 +150616 +96351 +330455 +273551 +228749 +248051 +495252 +372265 +52664 +191874 +157416 +446428 +136681 +1228 +321811 +93791 +477867 +192520 +157124 +40620 +200541 +103904 +329494 +60093 +112573 +489125 +513115 +322968 +561619 +74309 +572462 +248252 +375376 +217312 +243213 +79878 +452218 +349754 +554291 +434043 +460373 +452591 
+567787 +504711 +196007 +511153 +312416 +296056 +308849 +203667 +253223 +331230 +465545 +363048 +69392 +301506 +216198 +147979 +6005 +381870 +56983 +320972 +144122 +210855 +151480 +299288 +462486 +103931 +321079 +4134 +239861 +540006 +413805 +221222 +198943 +450790 +380597 +388298 +58737 +246197 +160726 +398554 +513946 +222235 +323851 +364703 +125643 +169800 +445662 +223764 +575372 +489207 +559474 +7155 +453819 +402720 +102355 +415076 +287436 +35705 +111076 +395865 +310862 +570834 +54728 +215778 +80053 +35148 +350488 +524140 +190097 +36661 +302110 +96884 +383397 +245462 +446958 +138937 +424712 +561814 +276964 +148034 +411068 +357824 +103257 +322149 +508899 +580294 +214386 +114419 +271429 +168260 +209835 +573072 +252269 +31980 +161308 +281508 +192714 +247599 +188948 +180563 +419601 +233660 +154804 +311846 +181499 +5535 +175082 +531018 +412338 +166995 +441411 +427820 +516846 +287366 +67959 +271266 +330845 +74209 +508167 +542699 +66485 +453756 +158412 +443784 +118097 +265050 +29074 +152623 +532493 +292988 +530384 +192660 +502336 +472648 +151657 +351626 +241010 +115070 +268356 +539557 +304698 +251140 +497158 +527445 +385428 +179200 +512394 +184978 +141910 +36311 +579457 +19129 +424960 +181714 +126216 +512911 +488360 +379533 +337551 +325410 +364587 +468885 +211107 +90062 +500446 +105960 +451951 +431431 +134178 +164548 +173826 +373988 +15157 +3091 +393557 +380011 +75372 +37403 +209995 +493610 +315899 +353299 +355040 +547000 +86133 +58174 +377326 +510230 +480583 +158588 +432529 +311206 +127626 +239980 +166340 +104185 +405174 +507211 +542782 +448078 +253477 +542694 +567308 +214853 +288824 +283268 +480757 +503200 +221089 +112388 +171539 +124452 +224200 +206362 +428754 +256192 +119414 +351620 +330050 +547504 +216398 +94261 +19916 +163242 +432588 +143824 +361103 +271138 +260150 +313627 +141086 +308263 +388453 +153217 +372794 +514787 +251910 +351335 +92683 +465836 +18442 +404128 +208476 +47873 +303219 +201622 +367489 +32760 +436174 +401926 +338419 +45248 +328464 +312216 +156282 +315702 +300701 +345401 +515350 +29094 +284296 +466449 +351057 +110672 +364853 +10014 +415828 +397522 +451412 +433124 +158277 +93476 +183387 +109889 +223326 +105547 +530061 +256301 +526778 +80974 +86650 +45835 +202154 +92678 +315991 +423919 +455044 +491168 +272253 +146627 +285349 +86001 +44171 +162332 +257328 +432820 +519275 +380639 +269436 +236016 +543215 +346752 +575970 +423498 +136926 +195648 +126634 +133078 +138656 +490012 +122388 +195165 +434900 +533625 +504167 +333697 +216576 +538775 +125072 +391154 +545007 +150292 +566717 +367362 +490991 +356623 +141271 +402795 +516786 +39499 +536716 +293324 +212853 +276381 +57124 +325992 +394659 +452178 +117674 +461172 +518586 +497021 +462345 +526570 +17328 +202928 +62566 +411277 +256983 +49473 +211206 +398031 +277955 +531178 +453959 +27946 +252844 +30273 +536933 +500298 +229111 +7977 +27642 +303726 +79927 +110313 +527691 +442205 +33345 +365851 +233236 +239157 +409221 +400803 +32947 +422516 +359727 +215872 +559454 +289716 +450247 +57827 +312298 +530383 +260048 +35857 +224222 +299533 +13296 +325907 +117869 +54088 +391011 +340478 +205344 +347823 +468604 +78701 +101414 +197499 +490871 +89273 +380343 +441974 +35974 +486114 +354398 +535536 +294030 +7276 +278742 +137028 +98721 +372764 +429802 +72105 +220307 +116845 +195406 +333000 +130401 +264382 +125458 +363036 +286994 +531070 +113801 +4108 +47603 +130118 +573924 +302990 +237566 +21470 +577926 +139436 +425925 +36844 +63602 +399791 +35894 +347228 +225617 +504813 +245320 +466007 +553931 +166731 +164885 +19090 +457262 +247806 +502895 +167593 
+352491 +520 +26386 +497348 +352000 +386164 +32901 +730 +30925 +333167 +150361 +231747 +462244 +504958 +260738 +313762 +346645 +486118 +202998 +541613 +183884 +230245 +83172 +126638 +51844 +421673 +118625 +377723 +229427 +371326 +104345 +361687 +114246 +397354 +104137 +120850 +260516 +389168 +234555 +26348 +78522 +409784 +303024 +377949 +69887 +546983 +113736 +298197 +476810 +137315 +376321 +410337 +492905 +119785 +158167 +185930 +354061 +106563 +328452 +506587 +536517 +480173 +570688 +376441 +252127 +247720 +132554 +41923 +400317 +170041 +151938 +198650 +6437 +49091 +221820 +455966 +309859 +300659 +15850 +388014 +253386 +65415 +238228 +548882 +302155 +93483 +371869 +397287 +315249 +360564 +448410 +21382 +477474 +144862 +517515 +230190 +322353 +231568 +14940 +132719 +498942 +182469 +113720 +168890 +94852 +246077 +117535 +52596 +419116 +522020 +255338 +125228 +564332 +106375 +249534 +220915 +177758 +293057 +222430 +196878 +554980 +375606 +173081 +84936 +418907 +562229 +457616 +125700 +66038 +239274 +574110 +305540 +98431 +167347 +53345 +438481 +286010 +5569 +343606 +168898 +191301 +236338 +291394 +715 +520237 +236954 +192212 +524002 +471625 +476029 +413124 +203455 +483328 +476417 +114389 +372428 +369221 +322654 +388157 +561314 +264540 +418680 +359540 +426182 +521613 +92248 +74478 +398905 +554273 +125909 +430583 +418959 +503522 +382999 +403145 +536375 +352618 +108193 +279696 +163253 +439007 +204536 +552186 +269926 +372147 +399921 +201418 +240565 +471483 +91619 +393971 +331648 +385856 +567440 +81922 +391722 +372894 +535997 +134096 +545958 +239943 +186929 +34222 +177714 +277812 +197111 +281878 +532003 +557172 +142890 +196116 +385454 +322845 +374987 +123137 +255112 +111207 +304819 +523526 +336046 +42893 +241273 +240049 +90659 +271364 +408008 +253282 +167067 +354278 +178317 +229653 +93333 +163666 +566920 +495199 +100329 +218119 +558864 +257382 +406152 +206587 +420339 +325919 +278853 +555763 +293200 +151000 +209664 +79380 +197177 +353953 +464522 +392260 +46144 +154202 +164366 +206025 +511236 +24921 +497907 +393226 +318138 +364125 +157321 +492395 +187857 +109939 +441500 +144251 +368581 +51403 +283498 +43555 +89356 +404601 +23272 +425762 +460682 +544629 +209829 +322029 +199247 +307262 +571242 +124236 +162393 +104829 +250766 +563938 +237399 +131516 +483001 +21994 +97958 +540187 +264497 +384808 +343187 +51277 +6712 +566103 +435384 +292082 +359039 +165157 +267972 +263796 +489313 +392722 +541924 +554433 +571034 +146112 +201934 +518716 +64116 +294992 +289586 +159970 +479617 +269006 +140465 +513260 +554805 +6579 +452696 +34445 +548296 +372983 +509656 +199339 +130030 +128372 +449454 +139306 +247914 +99024 +499134 +536653 +468917 +412813 +404338 +215303 +455414 +413497 +574988 +397117 +188631 +378701 +241867 +143129 +419884 +412749 +496954 +317732 +16977 +398309 +162363 +147576 +100016 +209018 +92660 +173302 +525732 +449198 +99734 +12733 +172946 +168032 +210988 +340697 +4795 +534887 +483553 +278323 +178175 +190095 +357542 +230432 +227460 +334609 +562121 +378126 +555357 +325666 +451859 +526837 +531710 +297249 +294839 +499785 +254976 +527220 +173057 +11760 +163012 +215998 +114420 +57812 +563712 +513887 +201859 +36333 +291990 +338375 +460621 +518889 +337502 +133050 +80172 +537007 +295270 +335644 +227852 +336044 +204137 +82259 +165675 +295713 +343937 +442567 +356002 +346932 +62985 +180925 +525381 +13081 +377406 +159774 +462643 +359105 +185821 +390201 +84168 +128059 +80340 +481159 +491902 +306619 +353807 +390569 +541562 +292616 +64621 +439224 +96288 +449798 +160927 +496324 +90778 +126145 +97230 +572767 +11570 
+539075 +350988 +3779 +208135 +551315 +216449 +169606 +502 +67765 +281414 +118594 +146127 +543985 +124927 +471394 +385508 +373783 +501315 +140974 +42757 +527054 +202387 +513056 +329931 +153973 +510152 +520812 +534601 +131282 +386638 +508538 +234779 +229329 +396568 +153568 +229478 +153574 +356299 +436694 +324139 +299409 +212462 +478155 +393266 +117836 +190760 +213605 +196 +444382 +445211 +363845 +433277 +521141 +464786 +169076 +301402 +4495 +177258 +328962 +183757 +452966 +416059 +113233 +559417 +280678 +481398 +328372 +234910 +30667 +343062 +383046 +370953 +258089 +404229 +456931 +535183 +300867 +60507 +262672 +7288 +81100 +575395 +539951 +347848 +437594 +352005 +14941 +196453 +528386 +466939 +482187 +293468 +494077 +217285 +362951 +435751 +411480 +517315 +480015 +60610 +353001 +376442 +430265 +478338 +303069 +525344 +437331 +389315 +8179 +31981 +313872 +330920 +515465 +258905 +142249 +323128 +389699 +565012 +124636 +488693 +376608 +309424 +370596 +261940 +39871 +226984 +152866 +515050 +116861 +412876 +120411 +550452 +565273 +273791 +181466 +183155 +293505 +336113 +569997 +303738 +331049 +147030 +74058 +198176 +23991 +198841 +79816 +85183 +261535 +566756 +386291 +318200 +569849 +57429 +36049 +420827 +519271 +24391 +172087 +158795 +133002 +522198 +133698 +499365 +79261 +258860 +457718 +179948 +421875 +558073 +206684 +529762 +456756 +65773 +425722 +53102 +294264 +416730 +38574 +176275 +404297 +127494 +242060 +272212 +189244 +510861 +421370 +208516 +206431 +248457 +39502 +375087 +130839 +308730 +572453 +263474 +544611 +255708 +412604 +390094 +578131 +234463 +493563 +9450 +381914 +148999 +32300 +423576 +569758 +347253 +92939 +112212 +13923 +39472 +363736 +289659 +269949 +88349 +188522 +488915 +129054 +573823 +316000 +440562 +408818 +539302 +199575 +122300 +340047 +322816 +472878 +313922 +228071 +265648 +400166 +169166 +10040 +125245 +148766 +31281 +172599 +431067 +208236 +441824 +175611 +15148 +431199 +521587 +50025 +443139 +349822 +515056 +27530 +571970 +82367 +7115 +424333 +157601 +537506 +447187 +115182 +547597 +5586 +143040 +31650 +196336 +279818 +206273 +403104 +514248 +243190 +558642 +548246 +16848 +391539 +89614 +284589 +191314 +259452 +208380 +209441 +465463 +385005 +321385 +223569 +11727 +87574 +566470 +210890 +323598 +427193 +425676 +401240 +94021 +259571 +447553 +456053 +84693 +14278 +119995 +234595 +408696 +136271 +143560 +357578 +28071 +36561 +157102 +293789 +392251 +356622 +180274 +48320 +475779 +301326 +100977 +413551 +574010 +404479 +80725 +552221 +575441 +197424 +124601 +215633 +359546 +25386 +73199 +334466 +156572 +124614 +34121 +460049 +327623 +441695 +292488 +476514 +464018 +348571 +113413 +125208 +129690 +446218 +493761 +383413 +460390 +343149 +374041 +525211 +451263 +333683 +385194 +107427 +102872 +517249 +475879 +575755 +147787 +297180 +343774 +112437 +142240 +384503 +511111 +51089 +145408 +143582 +408138 +162858 +71850 +126925 +222781 +314616 +425609 +203928 +337563 +223300 +52644 +272566 +232597 +374430 +469075 +267164 +265851 +28134 +308889 +465795 +47263 +233727 +42 +493117 +124621 +533378 +361259 +458750 +429033 +383289 +490927 +520964 +174420 +64425 +378859 +401850 +281475 +46508 +205300 +280736 +110961 +230679 +151956 +321497 +73665 +488736 +165353 +365983 +556230 +21465 +581226 +448861 +3793 +347335 +150726 +75319 +2521 +285894 +133876 +104589 +346013 +63516 +83656 +491515 +326256 +49942 +28508 +475413 +270222 +235839 +48554 +327777 +111179 +507171 +425973 +449490 +205239 +82375 +459575 +432300 +91885 +340922 +270239 +195894 +121417 +344831 +439651 +232148 
+391688 +480793 +534275 +260823 +469294 +8688 +255654 +191300 +383464 +81594 +21240 +478077 +517596 +555953 +294119 +402234 +459500 +564280 +106849 +167501 +98328 +267411 +145512 +272599 +50054 +414156 +161129 +418226 +11796 +502090 +390350 +440500 +240727 +104406 +163682 +437910 +143767 +358901 +527631 +500543 +28377 +231097 +227985 +556703 +421566 +73201 +478393 +280347 +15497 +131969 +515760 +295440 +462527 +42147 +120007 +212895 +425361 +454143 +5758 +366782 +213932 +229848 +458861 +132791 +476664 +150365 +343038 +529649 +180515 +499810 +329041 +15660 +419228 +396295 +502644 +321085 +245049 +34193 +217323 +446455 +528046 +375573 +15802 +147448 +407291 +84000 +280891 +150487 +510606 +163025 +249964 +126123 +233771 +118507 +97278 +357386 +23121 +10580 +2153 +176017 +371472 +373289 +173908 +296797 +334083 +301107 +577522 +125404 +278359 +575032 +273002 +266371 +108315 +255633 +503490 +250051 +143927 +117407 +198271 +447043 +329789 +399991 +458388 +87489 +228411 +494634 +260802 +454161 +446322 +231079 +438373 +395665 +244539 +212427 +356660 +347276 +183287 +498374 +21167 +544522 +418533 +288493 +245660 +406103 +406976 +367313 +455555 +117337 +384465 +185697 +160393 +463825 +276852 +181462 +176288 +452816 +102497 +54277 +225791 +361046 +197278 +9857 +227736 +398992 +55868 +170914 +181677 +467803 +560470 +264599 +540372 +559442 +201207 +137227 +267643 +355471 +245431 +555669 +344498 +84783 +193474 +102411 +401860 +119469 +448786 +449990 +568082 +340472 +307573 +231828 +307547 +82052 +15140 +493612 +503972 +386592 +473219 +495557 +159440 +355869 +311531 +209733 +240119 +415048 +296098 +249482 +15663 +151432 +263011 +488539 +463913 +502798 +174276 +495613 +407861 +229304 +146742 +545039 +161202 +295134 +162144 +453317 +52759 +335201 +222903 +20333 +559550 +336049 +346140 +491223 +306611 +102746 +455355 +449921 +477288 +77821 +289712 +452663 +147758 +129571 +490869 +345961 +94501 +160394 +432993 +178796 +372494 +316323 +383435 +194940 +74583 +148911 +518027 +431827 +32724 +158548 +227227 +500330 +54679 +321024 +471175 +252074 +476569 +573258 +337247 +294373 +558661 +148898 +563267 +163112 +411968 +193565 +455210 +349344 +337160 +160456 +255158 +553678 +123843 +549687 +381968 +579471 +100604 +379841 +357526 +197263 +14756 +412639 +210915 +47204 +539251 +166255 +490199 +260363 +91654 +170550 +187888 +97362 +285418 +176993 +292741 +361901 +296988 +223496 +493753 +114907 +151358 +316534 +472509 +499802 +348519 +347747 +58851 +104790 +396779 +130528 +2255 +19624 +526800 +233950 +505945 +131207 +290750 +114090 +196665 +8708 +134688 +394715 +115088 +492196 +530099 +518729 +291572 +421457 +445365 +78929 +415461 +551796 +210002 +207913 +344878 +303893 +149196 +353275 +122413 +553361 +519132 +467135 +431439 +17089 +322119 +228214 +35062 +105689 +366141 +285651 +60409 +472671 +401446 +492846 +21023 +421952 +374100 +265200 +506628 +62298 +243626 +212122 +350648 +409921 +428140 +399212 +388267 +198921 +429246 +202040 +570001 +261346 +61171 +131815 +455448 +82696 +554607 +102174 +386803 +188421 +191846 +209898 +380117 +321064 +119617 +188651 +132210 +244299 +174072 +542910 +378334 +118405 +543347 +183657 +581180 +395289 +64760 +265584 +29573 +493720 +94795 +315601 +416596 +260106 +244019 +463884 +579468 +112085 +300972 +238528 +382542 +57672 +165298 +46889 +289497 +337180 +481252 +7913 +432150 +288161 +403758 +257336 +565331 +346589 +270785 +205670 +231580 +508580 +98871 +239997 +554579 +160057 +404922 +78771 +380756 +171199 +148077 +22892 +145378 +26967 +235200 +176007 +90349 +554377 +189744 +257053 +270515 
+66508 +113890 +291983 +558927 +420916 +140908 +58384 +438226 +575776 +106935 +40602 +468993 +494810 +210408 +365685 +483722 +39430 +258793 +272615 +51476 +189919 +443887 +391648 +422670 +445135 +198959 +405529 +459757 +465489 +81827 +262576 +408289 +309237 +76249 +460091 +512630 +45959 +280320 +200492 +404652 +48475 +18480 +457097 +65889 +162256 +265950 +520752 +299082 +51500 +499313 +104906 +35438 +167647 +7274 +387824 +242139 +173166 +399830 +12014 +510642 +154053 +67785 +78170 +514118 +87998 +52703 +203539 +534533 +85926 +274438 +401653 +458790 +509262 +144481 +387515 +246649 +503207 +235131 +501531 +62025 +43286 +272323 +326128 +561889 +167529 +171067 +50778 +301282 +469719 +509388 +480317 +379055 +546428 +192763 +445602 +420882 +232790 +174332 +232865 +292822 +511145 +119502 +312591 +110330 +281353 +116244 +58778 +428079 +64902 +520840 +232054 +473214 +572574 +296684 +351590 +217997 +178761 +71618 +226496 +285212 +381195 +499903 +232849 +468997 +345559 +503097 +578570 +396404 +405223 +578752 +403500 +188958 +504498 +491623 +462929 +525762 +395550 +574227 +240751 +169356 +524694 +40886 +571635 +487774 +86220 +95677 +268987 +502599 +155270 +103855 +125100 +241355 +220214 +391774 +110618 +154587 +134483 +458781 +360877 +465963 +194595 +346934 +127153 +188078 +553869 +102665 +400547 +33759 +42779 +397587 +140295 +151807 +549136 +470288 +89738 +328368 +546934 +164255 +563683 +399988 +360951 +217303 +326781 +546133 +135399 +94666 +330037 +569839 +411070 +497466 +404805 +417854 +318442 +255036 +457230 +346863 +307438 +370448 +5124 +152582 +38118 +12179 +58462 +308420 +329456 +74920 +250368 +186428 +556073 +111806 +361244 +80273 +230964 +156754 +503101 +75173 +389404 +195538 +88848 +286018 +245481 +140929 +533721 +268378 +70048 +315467 +46269 +372807 +192403 +387328 +163033 +481314 +65306 +192529 +321107 +112232 +441216 +412399 +565391 +220670 +61471 +463290 +346707 +67587 +147624 +13031 +396754 +278601 +439426 +42834 +281829 +376209 +353148 +556562 +97579 +217989 +319530 +82551 +235319 +431799 +53892 +52853 +54533 +88897 +225093 +386777 +546742 +273684 +413900 +245447 +577995 +16249 +188414 +485142 +199602 +89258 +109679 +502397 +14494 +13632 +51674 +244999 +305050 +455956 +426795 +560700 +327306 +410301 +343803 +539422 +156740 +527845 +100582 +9941 +466585 +61515 +231895 +157052 +41271 +148128 +141172 +320232 +78565 +539883 +391300 +365182 +322194 +116517 +323496 +473783 +519874 +440706 +361587 +265153 +329946 +342814 +32258 +153510 +194555 +309317 +245006 +300303 +97767 +218224 +370170 +290477 +207178 +456730 +209480 +513775 +199516 +581542 +32524 +416337 +96241 +506279 +422893 +248911 +509855 +355183 +201220 +234914 +333436 +68198 +429074 +328430 +160531 +467854 +280688 +140661 +349525 +267315 +565543 +313162 +25751 +232574 +560358 +505213 +494427 +160308 +287335 +99182 +413260 +558808 +290839 +122954 +229221 +192007 +243189 +117645 +552824 +366111 +102056 +356949 +566298 +97899 +422545 +343769 +13127 +179273 +104486 +37660 +304099 +517570 +20207 +36484 +36492 +155974 +107257 +534019 +522371 +222825 +96183 +509227 +302260 +95078 +280918 +367582 +317033 +347982 +73209 +290521 +187243 +425151 +483723 +573796 +187249 +144114 +132992 +35887 +546067 +426532 +45626 +461805 +129989 +541478 +485489 +578498 +485483 +144784 +248224 +372362 +92050 +423519 +473118 +177207 +105455 +276434 +157767 +384335 +509497 +338191 +224010 +327388 +96988 +43376 +67867 +320743 +555197 +104453 +14439 +512194 +396387 +252559 +108953 +461262 +66320 +97946 +238065 +306139 +572408 +577864 +81004 +464526 +89378 +193389 
+259049 +85665 +381134 +412419 +308947 +557510 +502084 +288290 +254609 +188752 +439525 +13980 +140513 +240173 +305268 +38678 +394050 +402926 +364079 +159260 +293034 +55429 +289640 +291028 +211120 +48050 +93887 +361029 +486026 +388374 +207803 +540174 +530630 +430359 +36420 +120099 +199764 +492911 +84498 +200882 +139843 +4975 +421209 +259513 +520324 +211317 +236457 +419344 +3867 +287846 +50434 +26624 +507235 +16238 +103705 +497555 +440060 +175825 +245460 +308276 +178535 +391735 +206391 +201550 +400945 +194634 +262360 +554142 +407574 +225225 +246057 +498627 +486172 +226571 +461751 +459733 +345869 +503841 +286460 +45644 +22861 +285599 +580284 +569565 +286778 +150024 +542101 +484075 +538153 +20470 +128034 +544120 +357109 +450728 +550968 +326230 +558809 +76334 +555387 +47121 +523978 +11081 +378134 +116279 +364884 +488250 +551957 +322824 +545564 +255573 +286327 +355453 +361933 +434897 +32597 +226761 +166482 +557564 +208166 +232115 +283520 +137395 +555894 +103509 +174284 +458313 +316147 +344059 +370701 +548930 +89894 +373662 +572095 +19324 +574411 +45746 +480122 +63950 +92339 +201111 +157053 +401539 +427956 +339099 +274651 +159537 +556101 +323399 +564337 +514915 +556025 +66427 +322357 +173737 +369128 +420230 +45176 +509675 +374677 +272311 +109797 +384723 +383678 +453040 +91080 +301634 +533003 +40361 +221605 +216228 +104002 +161011 +146123 +214421 +496252 +264948 +9759 +138856 +316189 +145734 +50411 +325157 +259099 +516856 +529668 +135976 +467130 +367433 +385598 +520933 +102805 +30066 +436696 +216837 +380754 +350457 +126974 +565374 +73832 +214703 +110501 +380609 +135872 +140231 +251816 +133836 +398866 +230362 +426815 +2240 +51484 +546325 +224093 +221190 +525024 +238806 +99908 +165795 +109146 +537727 +496571 +183803 +211175 +433845 +168692 +526394 +368402 +256309 +468972 +139169 +398440 +171678 +547341 +64332 +533589 +483249 +406000 +330348 +439188 +572886 +252829 +242724 +139127 +404568 +45809 +52257 +458727 +334509 +559665 +60992 +290896 +503106 +27972 +536891 +410855 +31202 +457882 +403315 +87399 +395291 +322141 +226377 +202799 +420826 +553034 +212077 +97693 +266370 +101656 +504142 +342933 +87567 +342060 +268854 +437028 +20175 +198625 +405047 +382374 +338291 +403975 +527906 +322429 +545550 +140043 +107389 +74059 +315621 +110138 +78381 +295576 +494438 +106335 +472349 +15818 +162358 +366484 +44604 +66524 +118606 +366873 +270721 +556478 +350789 +298628 +163314 +262800 +459428 +491725 +285421 +406332 +498280 +34535 +524282 +315744 +226592 +218294 +459141 +242034 +114164 +293733 +248242 +452881 +441496 +54358 +177489 +372861 +349489 +483941 +572802 +356494 +193875 +146570 +58253 +21338 +6220 +341933 +533368 +1818 +428248 +293026 +227656 +193021 +326938 +512966 +226020 +343059 +249720 +540106 +375278 +300023 +126512 +517135 +472540 +361439 +132702 +503294 +109537 +540669 +332007 +245266 +313999 +10386 +225715 +311567 +103837 +302405 +248616 +102654 +155087 +124756 +379659 +569272 +160166 +428234 +422280 +174425 +133412 +174503 +216581 +345063 +52949 +69536 +216161 +272728 +200870 +120792 +193480 +493923 +445567 +558539 +51938 +422706 +416271 +244160 +437898 +327352 +305480 +349459 +522418 +485219 +225133 +361400 +546569 +190015 +348216 +421822 +457683 +178683 +40894 +234526 +465074 +518725 +168096 +210190 +139605 +35195 +463640 +286770 +141651 +112022 +532552 +325327 +227224 +17272 +84163 +331475 +126065 +289309 +8583 +52952 +189427 +579693 +437947 +187565 +215982 +356424 +453731 +463522 +372316 +251797 +70187 +280515 +556608 +341635 +391067 +469480 +476298 +57917 +146672 +122747 +394328 +12209 +80013 
+573291 +278449 +129659 +579560 +557190 +227468 +334782 +51157 +23774 +9426 +86582 +39211 +275751 +131597 +51250 +357255 +9041 +346482 +9647 +157019 +409016 +273416 +114414 +298172 +388854 +275025 +58079 +518034 +503518 +146710 +120632 +474680 +303713 +259097 +479630 +208318 +437298 +173704 +361831 +371638 +344279 +230175 +72507 +417980 +72621 +163057 +92894 +543525 +577364 +263696 +472732 +66027 +391584 +197745 +131019 +65604 +91318 +535934 +212646 +576354 +482071 +160556 +120129 +7260 +344881 +447548 +318193 +30383 +527002 +34904 +35677 +526222 +105261 +401897 +399452 +25660 +524595 +384512 +117543 +514600 +268944 +112664 +222340 +569058 +495332 +192153 +75591 +286711 +174888 +577065 +25508 +169972 +401820 +425475 +290700 +173091 +559101 +122418 +244124 +198645 +325519 +276437 +528276 +146614 +45574 +417804 +326420 +250594 +27353 +310407 +370103 +274957 +561160 +167598 +397166 +257458 +404546 +148392 +373396 +62230 +493522 +563665 +274240 +269815 +79024 +527427 +84674 +486788 +267690 +443347 +149304 +412285 +207041 +412916 +10764 +151338 +299000 +17882 +475510 +398188 +558213 +70493 +180779 +347210 +280211 +58146 +379022 +504125 +537604 +464858 +329573 +568623 +228309 +454444 +552775 +557884 +435671 +168706 +142257 +571437 +574845 +387773 +321008 +574208 +405811 +375426 +321887 +256852 +433554 +517029 +125870 +80395 +497139 +490008 +405279 +571857 +225738 +514913 +456239 +499402 +96440 +487607 +370999 +319617 +370233 +60760 +352703 +478575 +84170 +134112 +77689 +185036 +73738 +547502 +104782 +213276 +136908 +436273 +442149 +355000 +374061 +249884 +105711 +136464 +146997 +76351 +388487 +99115 +124135 +24721 +132931 +1149 +182403 +386089 +81691 +480657 +441522 +60989 +268000 +55840 +514321 +577959 +359638 +457986 +533596 +60332 +367082 +772 +535842 +473541 +270677 +409009 +259216 +302318 +117036 +331372 +231125 +384486 +405214 +20760 +579760 +172995 +359110 +83110 +410068 +109916 +328757 +299261 +19028 +515660 +40757 +10256 +442695 +553097 +185903 +74388 +425120 +241326 +299609 +29397 +328728 +283881 +344029 +367336 +27075 +163628 +127263 +488979 +460147 +473050 +405762 +221547 +131581 +561187 +406489 +140696 +452721 +530466 +118965 +398803 +218365 +298738 +19441 +521550 +120157 +498687 +4754 +365866 +70865 +235156 +133386 +142742 +221183 +262391 +567053 +520982 +121349 +448779 +440354 +3983 +578993 +519691 +160703 +103307 +300408 +137106 +488377 +523660 +318022 +132578 +302520 +153040 +408817 +145227 +311190 +159662 +202923 +256775 +359864 +384848 +336404 +185303 +421703 +362682 +464622 +246590 +422729 +165500 +42563 +219216 +520232 +95063 +265547 +532686 +290558 +112591 +448211 +315281 +545475 +225850 +232460 +82740 +272880 +347254 +122047 +352151 +541486 +97249 +200252 +544782 +499571 +379014 +303534 +479909 +305464 +323682 +181524 +273855 +190783 +567801 +119752 +241503 +536429 +327323 +128756 +349868 +500495 +372260 +315824 +484986 +364993 +124759 +300124 +329319 +68628 +14549 +121897 +506595 +115709 +199610 +230150 +31717 +139549 +222332 +534161 +360393 +541664 +507167 +286523 +158660 +66926 +195750 +80022 +589 +252220 +47255 +247014 +49881 +455005 +232453 +445722 +516805 +544122 +541917 +469356 +370042 +130522 +502163 +307866 +408894 +524247 +52233 +177861 +348881 +357943 +295303 +475389 +431691 +61316 +143998 +503483 +340155 +488785 +133636 +133567 +251627 +470095 +34873 +88815 +261178 +468612 +127477 +157960 +15687 +303089 +572331 +456708 +190515 +126131 +239194 +332074 +129765 +107167 +478184 +421833 +359715 +112440 +331317 +74492 +505386 +247839 +534210 +134503 +422700 +352111 
+98674 +546219 +520508 +503008 +461953 +101913 +362092 +22103 +359128 +316666 +335579 +414750 +297980 +365652 +53635 +547601 +97589 +570515 +7125 +99828 +321437 +80671 +426275 +294883 +212605 +424293 +338108 +25005 +6949 +234291 +428399 +7149 +343076 +575287 +431848 +307611 +293909 +542511 +564739 +573843 +356878 +472864 +336793 +121904 +161060 +254004 +269873 +216428 +77172 +346517 +498555 +203690 +348973 +117704 +552672 +275270 +208107 +314016 +427518 +278134 +53420 +318777 +238980 +350614 +467315 +61233 +272188 +550797 +125051 +553965 +187286 +282912 +102532 +156076 +467848 +130875 +531585 +523470 +507684 +332582 +438989 +489209 +125944 +127474 +371957 +570349 +283286 +541635 +547106 +253630 +388677 +572525 +542302 +554537 +367205 +228300 +443498 +356432 +123946 +490441 +211063 +224542 +116574 +434510 +33116 +353136 +134167 +128291 +542510 +433963 +147453 +365766 +374806 +336600 +38238 +165476 +535578 +127788 +157099 +173640 +114348 +496722 +58141 +467296 +235864 +5154 +22775 +422536 +136820 +453438 +446359 +41990 +422240 +39267 +391392 +233825 +308504 +478250 +87328 +4079 +127074 +267709 +377635 +353231 +185768 +487897 +124215 +249757 +341681 +557552 +280733 +374734 +281601 +456420 +222266 +491947 +432732 +467157 +94025 +410328 +428291 +397639 +163528 +234697 +557573 +208363 +515962 +358658 +373075 +438995 +425672 +450169 +216103 +254638 +288591 +53626 +43417 +372252 +5038 +218357 +120860 +399349 +485509 +530261 +477087 +352302 +96075 +495443 +133928 +197175 +134074 +212553 +448181 +152000 +254277 +105734 +75481 +343662 +479350 +554347 +71090 +297426 +22176 +277622 +469235 +163041 +221272 +154263 +89296 +68411 +192871 +183217 +258141 +53058 +540529 +566414 +560948 +254535 +246076 +135972 +420069 +431023 +343643 +32682 +515176 +222635 +377155 +547041 +513283 +26017 +366096 +252133 +138078 +25685 +321798 +549361 +14088 +423048 +570810 +374974 +447501 +492544 +554046 +575357 +420791 +6019 +340451 +66800 +565575 +148055 +330432 +483038 +455004 +288765 +11034 +86988 +347142 +450559 +543581 +293757 +556901 +533032 +333020 +260266 +22420 +13948 +512657 +214124 +231236 +177149 +560879 +491793 +35767 +312878 +118542 +450596 +423773 +48653 +224523 +509577 +462677 +75405 +350023 +452122 +42008 +302555 +382309 +468483 +368684 +372580 +31333 +153697 +124876 +330023 +315672 +53990 +136533 +82815 +356836 +414821 +268717 +7333 +77544 +525373 +371042 +227048 +576327 +419309 +239773 +8119 +424135 +297425 +222711 +489909 +393995 +31019 +539326 +517612 +102461 +199989 +483374 +44952 +103863 +528980 +441543 +85381 +247234 +50924 +483994 +87456 +424271 +356091 +534669 +378831 +560662 +298773 +257896 +498274 +305800 +40517 +183949 +276840 +84442 +297620 +298252 +119088 +233315 +283977 +345154 +287649 +427311 +63399 +4700 +463611 +224104 +209388 +431655 +364190 +28864 +412455 +283290 +228541 +422200 +985 +133596 +323853 +503081 +130732 +224675 +199688 +230862 +21396 +485390 +1532 +125778 +235541 +370478 +522478 +514292 +384338 +531707 +178746 +532747 +62915 +519491 +140691 +112093 +358024 +263687 +297595 +506085 +102446 +325768 +29558 +222054 +466965 +316254 +546500 +216785 +194184 +464390 +348371 +231582 +208995 +464339 +308856 +340946 +214604 +570586 +182227 +248441 +89078 +376310 +73450 +115924 +308235 +15994 +8749 +429679 +37751 +122040 +284286 +388707 +248163 +11320 +427997 +282062 +237600 +376751 +223314 +86215 +12443 +163255 +564940 +462640 +522713 +306303 +460675 +126833 +26201 +224757 +357899 +546782 +96427 +480944 +479556 +569273 +520528 +190690 +344832 +462466 +270354 +559776 +279259 +280909 
+227781 +163798 +491098 +439658 +416088 +107375 +74132 +379800 +511654 +346687 +226161 +578849 +544272 +146149 +570624 +178299 +126671 +356380 +530766 +175954 +158798 +422095 +55780 +512276 +560626 +187329 +513125 +347216 +306486 +161840 +180917 +188192 +421437 +93120 +324891 +252216 +488476 +578347 +101959 +10693 +170038 +213586 +210439 +469202 +381463 +343248 +127785 +287328 +538690 +16382 +293022 +112378 +435785 +56092 +381504 +284365 +406129 +233119 +53629 +188509 +191053 +81056 +82252 +538319 +38439 +181948 +439710 +529344 +434035 +342958 +563882 +37734 +364743 +330986 +546226 +463211 +62210 +442724 +232241 +293858 +119345 +61953 +577033 +522015 +381587 +350107 +4936 +511307 +228771 +177811 +231450 +176168 +84540 +259408 +264238 +539738 +255827 +459382 +221105 +431742 +204337 +227741 +336356 +37655 +167159 +59352 +165937 +53956 +378712 +88462 +495786 +542938 +566498 +367228 +157577 +442661 +62363 +390689 +480664 +521540 +414249 +20571 +160855 +451683 +156832 +570045 +326542 +568276 +568717 +563311 +113579 +218268 +546095 +160661 +341118 +150649 +462632 +198972 +220025 +61720 +430681 +524011 +457217 +40064 +285583 +314493 +78023 +470882 +298722 +555597 +489829 +314779 +367818 +138503 +243737 +580255 +444565 +386677 +190841 +493074 +234347 +466988 +227033 +519039 +351554 +390585 +443303 +140983 +81079 +538005 +169757 +368780 +457322 +341804 +409116 +181805 +284292 +551358 +344548 +503569 +336587 +417055 +522315 +58705 +148955 +375530 +474934 +577893 +28881 +360772 +445267 +244737 +355777 +72811 +190788 +54513 +243075 +518551 +487530 +292169 +69293 +397303 +129285 +429996 +109532 +53802 +340573 +91280 +535602 +270908 +381925 +549220 +488573 +47131 +32735 +117525 +279085 +43961 +188906 +394677 +395 +185201 +189365 +127596 +32712 +504810 +3703 +182874 +146981 +306755 +453093 +520503 +169808 +225670 +91063 +348584 +461802 +572555 +185922 +131497 +46736 +536006 +256505 +214975 +13445 +350736 +98115 +50304 +361180 +511333 +564820 +429717 +222500 +40083 +538230 +349438 +371250 +528578 +240418 +302380 +261758 +535809 +308388 +578878 +509451 +46919 +562592 +499950 +90374 +318146 +195353 +355325 +314515 +237277 +203024 +238911 +32039 +145591 +16030 +135411 +229350 +421757 +48034 +183704 +307292 +97974 +275999 +448256 +451915 +119113 +143503 +494141 +50124 +306553 +35526 +255279 +560908 +247264 +367599 +192782 +511324 +574350 +67569 +204360 +111907 +2839 +513971 +245201 +185240 +339468 +540101 +539673 +194425 +22168 +520150 +301595 +96006 +68286 +131280 +356662 +182441 +284749 +107108 +49761 +386718 +55244 +187990 +248678 +147721 +425727 +360350 +310797 +76765 +400489 +247639 +279864 +44699 +356145 +69138 +445041 +560598 +165464 +536343 +7818 +322831 +334760 +451463 +348730 +285967 +286353 +201887 +166165 +359 +465591 +519359 +550444 +402711 +3661 +132706 +534983 +306281 +150317 +15978 +580029 +496090 +267127 +210980 +384015 +222559 +2235 +255649 +278168 +440840 +27326 +202562 +230268 +362712 +1573 +107661 +464515 +373132 +447242 +547440 +43613 +200143 +260883 +250901 +64693 +408480 +204757 +319933 +147471 +381332 +518197 +27656 +260257 +434580 +159203 +568630 +497441 +499597 +60179 +574804 +343254 +501762 +220704 +524536 +86946 +456046 +62937 +49633 +144305 +475593 +478553 +574145 +63648 +3794 +303177 +1340 +82835 +371427 +156747 +448694 +219567 +75095 +242615 +492077 +132776 +199125 +349622 +195754 +455548 +181873 +138185 +338044 +362797 +180953 +505826 +69773 +304834 +162580 +154090 +519853 +319687 +132328 +27969 +52166 +100547 +568131 +415218 +348045 +478159 +402869 +10211 +26547 +551692 
+105432 +313340 +182348 +383419 +570947 +345353 +226883 +255784 +214199 +262262 +283261 +449708 +299970 +392391 +245997 +330410 +343571 +519542 +37470 +42144 +342521 +498537 +10935 +443860 +512648 +146099 +98599 +123932 +489861 +262895 +184700 +218587 +363581 +21001 +481404 +249356 +64240 +492349 +199236 +481064 +353405 +116479 +132024 +138768 +524665 +434511 +326970 +138784 +340368 +312081 +366615 +171942 +21232 +473850 +93686 +295574 +51054 +162692 +174091 +20070 +270066 +492816 +20904 +484500 +147140 +242972 +420081 +63563 +261712 +316396 +49413 +520787 +510955 +393840 +142487 +19817 +261180 +413736 +230619 +484614 +337011 +496575 +4338 +552545 +5601 +75426 +568863 +184227 +170629 +438567 +505132 +541353 +284674 +322567 +182423 +312051 +18896 +40471 +321725 +188850 +37119 +95569 +187362 +397133 +528972 +487131 +174989 +370325 +223554 +385633 +103485 +537574 +63240 +256566 +86467 +401092 +486968 +308441 +280017 +527464 +131965 +310479 +125556 +220160 +532963 +310052 +107963 +293841 +388534 +45603 +368949 +391825 +5107 +569705 +231549 +250108 +152933 +206433 +358817 +434006 +283904 +152808 +539975 +24629 +410231 +13465 +502318 +51961 +445594 +209062 +38726 +295420 +430079 +240147 +561512 +35795 +102589 +505619 +565469 +271772 +520561 +372300 +178807 +492805 +1083 +303704 +125635 +217521 +278032 +208688 +335325 +140435 +313990 +143822 +320857 +549230 +76844 +424219 +463876 +243199 +2988 +215170 +30012 +377738 +408568 +490624 +404839 +138316 +157206 +404461 +122934 +263346 +21327 +99913 +67975 +339676 +391891 +365305 +337055 +233834 +125524 +46869 +32577 +304744 +104176 +167356 +210404 +307989 +217223 +196046 +454414 +16356 +244487 +543660 +197461 +199681 +476787 +455085 +307074 +260547 +107468 +334769 +29437 +166837 +53838 +502979 +82678 +288860 +535523 +311950 +237723 +98656 +223123 +273930 +58057 +544334 +324857 +198043 +535326 +316505 +12991 +576820 +43611 +107839 +275749 +456695 +78188 +375786 +466239 +184830 +537128 +434513 +244344 +374576 +69140 +434247 +555009 +510857 +220819 +20598 +99416 +74967 +533129 +515577 +213361 +330974 +548848 +431557 +503278 +130043 +402570 +320554 +559884 +252629 +364596 +423484 +271230 +105552 +143143 +285751 +49994 +204162 +80646 +381393 +123415 +118417 +30932 +425412 +388130 +551243 +468337 +484893 +25014 +174390 +463781 +124647 +60823 +361964 +425702 +575110 +532390 +230881 +84592 +189997 +221307 +361472 +32364 +71918 +316365 +492378 +234251 +48504 +418070 +89884 +562045 +506552 +66360 +122962 +262605 +529939 +345229 +294853 +344397 +56091 +8599 +459823 +175785 +226128 +259983 +354515 +379144 +384995 +205253 +116786 +441432 +448810 +83452 +465129 +506906 +90616 +551959 +406404 +157891 +362090 +439630 +45099 +61960 +478430 +489605 +127050 +579872 +475798 +64510 +447733 +33066 +102848 +538819 +323760 +200401 +179765 +251317 +239376 +83836 +578092 +522452 +393056 +278848 +27787 +377239 +473427 +83065 +377005 +576539 +248019 +473370 +536369 +92648 +332461 +437609 +274800 +388846 +323048 +193407 +541898 +480140 +46526 +26432 +339738 +325991 +37705 +528033 +542922 +313420 +190463 +531000 +454907 +26448 +238199 +476652 +457147 +364256 +72632 +430380 +315448 +353320 +18158 +91527 +454252 +546987 +386370 +38064 +19763 +64152 +453216 +55223 +361860 +522566 +509531 +438432 +31164 +163290 +389197 +333440 +173464 +447842 +381615 +99961 +156126 +103134 +394940 +165638 +261706 +378311 +534081 +373848 +401642 +338019 +378096 +289610 +547421 +174672 +133343 +191360 +293751 +520892 +145214 +167668 +37456 +460962 +465267 +292804 +347529 +203661 +10766 +27371 +203845 
+155736 +136715 +463588 +26640 +547612 +131453 +184274 +442456 +265085 +223256 +129420 +23019 +536467 +194532 +127585 +392637 +330408 +524775 +31993 +433924 +502852 +553129 +559364 +297343 +71360 +225537 +271148 +345499 +475893 +237463 +5278 +501243 +413235 +444236 +541071 +380088 +468063 +94858 +225913 +295614 +210276 +170975 +205570 +422375 +550365 +308702 +484627 +565031 +98979 +480345 +579548 +272673 +436875 +287874 +16502 +274917 +281809 +442968 +289263 +347766 +160933 +84533 +266409 +122199 +396200 +30958 +504541 +1591 +89432 +387150 +306383 +15260 +154515 +50752 +166913 +102644 +100196 +160278 +349579 +442536 +17923 +310564 +62020 +152004 +578330 +126299 +527025 +83494 +226400 +268435 +445334 +310391 +505156 +19157 +44677 +318171 +447765 +354369 +527486 +329939 +184771 +134856 +467675 +517133 +89697 +447080 +70685 +144938 +519673 +485758 +454957 +564851 +189451 +408757 +192616 +280734 +305060 +243946 +99179 +303971 +170519 +48917 +549965 +300245 +384101 +576607 +186709 +516341 +241668 +133470 +134811 +500825 +464689 +29833 +343820 +213429 +387434 +279305 +444207 +210777 +372043 +189868 +572229 +8495 +370090 +450282 +277080 +199158 +109612 +567708 +245659 +485129 +268363 +23448 +5352 +235597 +6871 +348720 +94113 +314613 +63729 +114458 +215394 +460460 +240387 +398726 +135604 +571728 +415770 +286908 +138151 +146272 +344094 +345209 +241187 +282768 +113037 +545583 +219283 +145873 +285957 +489235 +157271 +197458 +502671 +499845 +334884 +79084 +505573 +115618 +561491 +354202 +279838 +190734 +134738 +269450 +482784 +144610 +52774 +290659 +440646 +25807 +442952 +159215 +318224 +73445 +211653 +527960 +401862 +431026 +488755 +292278 +400554 +272630 +382668 +470298 +166426 +129645 +28820 +161227 +417696 +560677 +283216 +28978 +310302 +154419 +230450 +328289 +73118 +104691 +15085 +405574 +510548 +470005 +102928 +569249 +413126 +77282 +96732 +359020 +42182 +250875 +106206 +354929 +320796 +453341 +237318 +254834 +137265 +399865 +292685 +152252 +319579 +81484 +16599 +162257 +351034 +396051 +502275 +308278 +34483 +13333 +320290 +321579 +349794 +99219 +200162 +369470 +487583 +62703 +251639 +138246 +157170 +477112 +283963 +74860 +307057 +364075 +295491 +34757 +400161 +170194 +120874 +492817 +3817 +183973 +135436 +512989 +114744 +379210 +201072 +293785 +578385 +237420 +7888 +18224 +155317 +522406 +441440 +110482 +173400 +183348 +552504 +475660 +166948 +147025 +443259 +578792 +245227 +546687 +474519 +393284 +249668 +87493 +151651 +100306 +540466 +546556 +212675 +282942 +21310 +385535 +7304 +303409 +386116 +574297 +514550 +217133 +533553 +447152 +578703 +45392 +166205 +180154 +25143 +338802 +330110 +261389 +343506 +442726 +285388 +554934 +421316 +479912 +85192 +34874 +487266 +226173 +20748 +360660 +574509 +543364 +1554 +125539 +566931 +312889 +466945 +444804 +257187 +568587 +427160 +71123 +563849 +138589 +162841 +129663 +107226 +140686 +321663 +437117 +179808 +321718 +62398 +16497 +468933 +219841 +355430 +293554 +293044 +109516 +485887 +490620 +579893 +427135 +31636 +217919 +432441 +314396 +119802 +393682 +201764 +146193 +116358 +84825 +208311 +419774 +177468 +72052 +142585 +519598 +464006 +556083 +412136 +169361 +442929 +84567 +549932 +75560 +74656 +93314 +393838 +383018 +372433 +431281 +556278 +5513 +108503 +500478 +148588 +138713 +368153 +22646 +303778 +270758 +276706 +275429 +492025 +169111 +494328 +35891 +70258 +400528 +165229 +460494 +269311 +307658 +98283 +369294 +319345 +414578 +541550 +425388 +129855 +99477 +383073 +387906 +293124 +155873 +549224 +266021 +52869 +1584 +421902 +498535 +277235 
+153013 +452013 +553561 +138040 +20820 +58483 +423506 +569001 +325153 +383039 +213421 +38825 +453283 +384661 +127702 +238147 +104893 +577826 +64974 +240655 +459153 +145665 +49810 +65008 +545385 +125070 +46433 +143329 +429174 +52947 +321314 +253341 +157365 +453162 +111910 +339019 +239575 +362219 +80652 +247317 +460286 +365724 +160875 +372220 +483389 +572181 +146190 +580975 +54761 +348488 +416104 +468778 +18833 +251537 +234366 +510078 +14723 +338595 +153797 +513098 +467138 +404618 +261982 +545730 +135846 +108244 +562557 +180524 +227370 +341856 +131743 +255691 +497878 +68878 +430640 +441473 +347664 +214369 +347018 +225238 +421762 +317024 +6180 +172004 +303101 +22488 +193494 +199346 +409627 +315350 +263463 +190722 +523292 +363902 +573778 +437290 +389812 +517082 +145073 +37907 +489763 +456261 +270386 +508917 +566823 +543897 +362482 +130966 +66632 +181962 +274613 +135708 +549746 +323766 +366714 +353295 +318813 +153307 +213693 +293378 +149446 +199927 +580543 +331727 +238488 +472833 +308645 +424225 +228746 +110435 +495377 +240646 +274491 +130921 +140006 +4688 +115241 +76962 +66650 +47718 +224991 +434187 +272048 +11169 +158222 +154000 +507436 +443499 +109937 +309692 +534018 +22797 +163339 +168683 +210098 +246069 +137954 +143320 +262587 +414795 +226938 +536831 +128791 +459590 +50514 +30067 +317479 +378655 +229968 +522702 +11122 +515266 +136600 +224509 +149912 +97656 +120747 +349480 +155199 +528731 +523807 +168544 +325664 +229981 +434410 +431208 +508996 +63791 +89225 +513690 +136740 +224364 +515424 +508302 +418175 +465552 +439907 +272097 +451087 +396304 +342273 +52507 +300066 +380089 +326248 +167906 +37846 +262993 +60090 +499249 +90432 +74456 +264660 +325598 +480985 +245411 +425644 +224724 +475439 +246478 +487438 +563731 +441854 +522665 +245915 +85747 +315162 +108761 +407521 +388528 +389453 +298331 +447791 +368820 +440034 +305677 +122208 +182369 +543531 +151820 +63650 +457580 +563381 +320899 +14869 +137260 +61925 +376307 +80367 +269089 +203705 +274835 +267321 +418106 +471273 +74037 +227855 +519758 +89045 +321217 +324203 +479129 +503431 +368528 +527718 +278579 +13525 +291582 +301837 +31667 +68120 +14007 +114158 +124262 +33626 +53949 +187585 +192247 +208844 +212766 +318671 +575012 +439339 +364073 +419624 +178078 +427783 +302159 +339368 +190680 +23807 +288579 +312720 +15778 +553558 +571834 +574376 +122161 +493815 +472376 +483432 +149123 +51628 +264628 +26609 +23696 +485081 +441323 +451679 +42055 +378795 +86439 +366493 +520996 +332869 +18014 +554523 +83476 +6040 +421834 +424392 +308160 +335233 +249809 +349098 +358090 +187349 +61782 +35498 +386514 +207108 +578418 +84447 +104108 +126107 +211674 +111909 +490708 +477025 +206757 +556205 +142484 +454296 +464366 +358254 +215482 +468548 +82680 +100909 +405432 +85764 +94651 +63973 +8131 +288592 +257470 +47597 +321557 +34520 +134066 +246701 +317797 +282365 +78176 +29577 +311075 +331937 +190395 +5802 +245112 +111032 +140556 +199127 +376491 +305253 +300375 +545903 +357782 +377911 +74963 +329336 +25057 +3244 +252020 +293474 +171050 +239306 +189772 +238090 +160031 +36761 +445675 +252716 +152214 +239466 +55155 +479829 +420281 +445812 +118106 +434576 +451104 +316708 +438535 +300322 +167952 +390072 +487220 +20247 +9400 +43944 +35770 +487351 +425462 +212203 +9668 +8981 +574241 +332096 +535563 +192944 +498733 +276151 +550645 +507037 +9769 +404249 +236747 +376416 +306415 +45966 +191296 +576875 +493932 +225075 +536444 +79920 +561681 +60700 +99874 +219437 +509819 +466665 +579326 +428739 +394611 +263083 +379554 +279391 +178516 +133690 +77396 +300137 +6861 +435359 +314108 
+444152 +500139 +92749 +89188 +300233 +414201 +443204 +211097 diff --git a/examples/pytorch/vision/Visual_Wakeword/train_visualwakewords.py b/examples/pytorch/vision/Visual_Wakeword/train_visualwakewords.py new file mode 100755 index 000000000..0b88fe9fc --- /dev/null +++ b/examples/pytorch/vision/Visual_Wakeword/train_visualwakewords.py @@ -0,0 +1,249 @@ +# Copyright (c) Microsoft Corporation. All rights reserved. +# Licensed under the MIT license. + +import torch +import torch.nn as nn +import torch.optim as optim +import torch.nn.functional as F +import torch.backends.cudnn as cudnn +import torchvision +import torchvision.models as models +import torchvision.transforms as transforms +import os +import argparse +import random +from PIL import Image +import numpy as np +from torchvision.datasets.vision import VisionDataset +from importlib import import_module +from pyvww.utils import VisualWakeWords + + + + + +device = torch.device("cuda" if torch.cuda.is_available() else "cpu") + +torch.backends.cudnn.benchmark = True +torch.backends.cudnn.enabled = True + +best_acc = 0 # best test accuracy +start_epoch = 0 + +#Arg parser +parser = argparse.ArgumentParser(description='PyTorch CIFAR10 Training') +parser.add_argument('--lr', default=0.05, type=float, help='learning rate') +parser.add_argument('--epochs', default=900, type=int, help='total epochs') +parser.add_argument('--resume', default=None, type=str, help='load from checkpoint') +parser.add_argument('--model_arch', + default='model_mobilenet_rnnpool', type=str, + choices=['model_mobilenet_rnnpool', 'model_mobilenet_2rnnpool'], + help='choose architecture among rpool variants') +parser.add_argument('--ann', default=None, type=str, + help='specify new-path-to-visualwakewords-dataset used in dataset creation step') +parser.add_argument('--data', default=None, type=str, + help='specify path-to-mscoco-dataset used in dataset creation step') +args = parser.parse_args() + + +# Data + +class VisualWakeWordsClassification(VisionDataset): + """`Visual Wake Words `_ Dataset. + Args: + root (string): Root directory where COCO images are downloaded to. + annFile (string): Path to json visual wake words annotation file. + transform (callable, optional): A function/transform that takes in an PIL image + and returns a transformed version. E.g, ``transforms.ToTensor`` + target_transform (callable, optional): A function/transform that takes in the + target and transforms it. + """ + def __init__(self, root, annFile, transform=None, target_transform=None, split='val'): + # super(VisualWakeWordsClassification, self).__init__(root, annFile, transform, target_transform, split) + self.vww = VisualWakeWords(annFile) + self.ids = list(sorted(self.vww.imgs.keys())) + self.split = split + + self.transform = transform + self.target_transform = target_transform + self.root = root + + def __getitem__(self, index): + """ + Args: + index (int): Index + Returns: + tuple: Tuple (image, target). target is the index of the target class. 
+ """ + vww = self.vww + img_id = self.ids[index] + ann_ids = vww.getAnnIds(imgIds=img_id) + target = vww.loadAnns(ann_ids)[0]['category_id'] + + path = vww.loadImgs(img_id)[0]['file_name'] + + img = Image.open(os.path.join(self.root, path)).convert('RGB') + + + if self.transform is not None: + img = self.transform(img) + + + if self.target_transform is not None: + target = self.target_transform(target) + + return img, target + + def __len__(self): + return len(self.ids) + + +normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], + std=[0.229, 0.224, 0.225]) + +transform_train = transforms.Compose([ + # transforms.RandomAffine(10, translate=None, shear=(5,5,5,5), resample=False, fillcolor=0), + transforms.RandomResizedCrop(size=(224,224), scale=(0.2,1.0)), + transforms.RandomHorizontalFlip(), + #transforms.RandomAffine(10, translate=None, shear=(5,5,5,5), resample=False, fillcolor=0), + # transforms.ColorJitter(brightness=(0.6,1.4), saturation=(0.9,1.1), hue=(-0.1,0.1)), + transforms.ToTensor(), + normalize +]) + +transform_test = transforms.Compose([ + transforms.Resize(256), + transforms.CenterCrop(224), + transforms.ToTensor(), + normalize +]) + + + + + +trainset = VisualWakeWordsClassification(root=os.path.join(args.data,'all2014'), + annFile=os.path.join(args.ann, 'annotations/instances_train.json'), + transform=transform_train, split='train') + +trainloader = torch.utils.data.DataLoader(trainset, batch_size=256, shuffle=True, + num_workers=32) + +testset = VisualWakeWordsClassification(root=os.path.join(args.data,'all2014'), + annFile=os.path.join(args.ann, 'annotations/instances_val.json'), + transform=transform_test, split='val') + +testloader = torch.utils.data.DataLoader(testset, batch_size=256, shuffle=False, + num_workers=32) + + +# Model + +module = import_module(args.model_arch) +model = module.mobilenetv2_rnnpool(num_classes=2, width_mult=0.35, last_channel=320) +model = model.to(device) +model = torch.nn.DataParallel(model) + + + +if args.resume: + # Load checkpoint. + print('==> Resuming from checkpoint..') + assert os.path.isdir('./checkpoints/'), 'Error: no checkpoint directory found!' 
+    checkpoint = torch.load('./checkpoints/' + args.resume)
+    model.load_state_dict(checkpoint['model'])  # restore the weights saved by test() below
+    best_acc = checkpoint['acc']
+    start_epoch = checkpoint['epoch']
+
+criterion = nn.CrossEntropyLoss().cuda()
+
+optimizer = optim.SGD(model.parameters(), lr=args.lr, momentum=0.9, weight_decay=4e-5)
+
+
+# Training
+def train(epoch):
+    print('\nEpoch: %d' % epoch)
+    model.train()
+    train_loss = 0
+    correct = 0
+    total = 0
+    train_loader_len = len(trainloader)
+    for batch_idx, (inputs, targets) in enumerate(trainloader):
+        adjust_learning_rate(optimizer, epoch, batch_idx, train_loader_len)
+
+        batch_size = inputs.shape[0]
+        inputs, targets = inputs.to(device), targets.to(device)
+        optimizer.zero_grad()
+        outputs = model(inputs)
+
+        loss = criterion(outputs, targets)
+        loss.backward()
+        optimizer.step()
+
+        train_loss += loss.item()
+        _, predicted = outputs.max(1)
+        total += targets.size(0)
+        correct += predicted.eq(targets).sum().item()
+
+    print('train_loss: ', train_loss/total, ' acc: ', correct/total)
+    print('->>lr:{:.6f}'.format(optimizer.param_groups[0]['lr']))
+
+def test(epoch):
+    global best_acc
+    model.eval()
+    test_loss = 0
+    correct = 0
+    total = 0
+    with torch.no_grad():
+        for batch_idx, (inputs, targets) in enumerate(testloader):
+            batch_size = inputs.shape[0]
+            inputs, targets = inputs.to(device), targets.to(device)
+            outputs = model(inputs)
+
+            loss = criterion(outputs, targets)
+
+            test_loss += loss.item()
+            _, predicted = outputs.max(1)
+            total += targets.size(0)
+            correct += predicted.eq(targets).sum().item()
+
+    print('test_loss: ', test_loss/total, ' test_acc: ', correct/total)
+
+    # Save checkpoint.
+    print('best acc: ', best_acc)
+    acc = 100.*correct/total
+    if acc > best_acc:
+        print('Saving..')
+        state = {
+            'model': model.state_dict(),
+            'acc': acc,
+            'epoch': epoch,
+        }
+        if not os.path.isdir('./checkpoints/'):
+            os.mkdir('./checkpoints/')
+        torch.save(state, './checkpoints/model_mobilenet_rnnpool.pth')
+        best_acc = acc
+
+
+from math import cos, pi
+def adjust_learning_rate(optimizer, epoch, iteration, num_iter):
+    lr = optimizer.param_groups[0]['lr']
+
+    warmup_epoch = 0
+    warmup_iter = warmup_epoch * num_iter
+    current_iter = iteration + epoch * num_iter
+    # Cosine annealing of the learning rate over a fixed 150-epoch period (independent of --epochs)
+    max_iter = 150 * num_iter
+
+    lr = args.lr * (1 + cos(pi * (current_iter - warmup_iter) / (max_iter - warmup_iter))) / 2
+
+    if epoch < warmup_epoch:
+        lr = args.lr * current_iter / warmup_iter
+
+    for param_group in optimizer.param_groups:
+        param_group['lr'] = lr
+
+
+for epoch in range(start_epoch, start_epoch+args.epochs):
+    train(epoch)
+    test(epoch)
diff --git a/examples/pytorch/vision/cpp/README.md b/examples/pytorch/vision/cpp/README.md
new file mode 100644
index 000000000..54c4e4e44
--- /dev/null
+++ b/examples/pytorch/vision/cpp/README.md
@@ -0,0 +1,18 @@
+# RNNPool quantized sample code
+
+The `rnnpool_quantized.cpp` code takes the activations preceding the RNNPool layer
+and produces the output of a quantized RNNPool layer. The input numpy file consists
+of all activation patches corresponding to a single image. In `trace_0_input.npy`,
+there are 6241 patches of dimensions 8x8 with 4 channels to which RNNPool is applied.
+The output is of size 6241*4*8. This can be compared to the floating point output stored in
+`trace_0_output.npy`.
+
+```shell
+g++ -o rnnpool_quantized rnnpool_quantized.cpp
+
+# Usage: ./rnnpool_quantized <#patches> <input npy file> <output npy file>
+./rnnpool_quantized 6241 trace_0_input.npy trace_0_output_quantized.npy
+```
+
+Copyright (c) Microsoft Corporation. All rights reserved.
+Licensed under the MIT license diff --git a/examples/pytorch/vision/cpp/data.h b/examples/pytorch/vision/cpp/data.h new file mode 100644 index 000000000..bd4e5633e --- /dev/null +++ b/examples/pytorch/vision/cpp/data.h @@ -0,0 +1,59 @@ +// Copyright (c) Microsoft Corporation. All rights reserved. +// // Licensed under the MIT license + +int16_t W1[4][8] = {{7069, 3262, 5513, 4733, -2708, -10109, 5233, 4489} +, {-10390, -38, -17036, -404, -1288, -138, 226, -1100} +, {1562, -1144, -14616, 4106, -18129, 2064, 831, 2845} +, {-1993, -996, -6637, -1105, -1833, 1207, -1910, -1262} +}; //16384 + +int16_t U1[8][8] = {{15238, -4081, -18973, -1468, 3401, 12650, -911, -1588} +, {-1372, -2625, 23200, 5474, 7390, -3379, 5065, 7849} +, {-931, -10160, -4142, -3773, 1400, 1952, -7027, -4937} +, {-311, 3353, 10395, -410, 2437, 426, 5921, 4664} +, {3195, -2369, -20748, -7006, 5303, -2544, -1009, -11564} +, {-4775, 5477, -4431, 2161, 829, 18282, 1428, 3197} +, {-435, 4946, 11025, 4571, 1986, -2559, -1213, 4943} +, {16, 3484, 10337, 5800, 2855, 549, 5397, 561} +}; //32768 + +int16_t Bg1[1][8] = {{-18778, -9519, 4055, -7310, 8584, -17258, -5281, -7934} +}; //16384 + +int16_t Bh1[1][8] = {{9658, 19740, -10058, 19114, 17227, 12226, 19080, 15855} +}; //32768 + +int16_t zeta1 = 32522; //32768 + +int16_t nu1 = 235; //32768 + +int16_t W2[8][8] = {{-850, 359, -9842, 5701, 7390, -4590, -3959, 2759} +, {-1536, -6107, -1978, -5420, -1215, -5065, 77, -4658} +, {10036, -340, 745, -3625, 1684, -1927, 2312, 2028} +, {-3593, -1295, -997, -1, 1441, 2806, -1718, -3687} +, {-287, -221, -1398, 439, -1651, 3409, -19972, -193} +, {-6120, -4338, -1679, -9576, 13070, -12784, -56, -5648} +, {-5623, -2853, -862, -3739, 2595, -285, -673, -5104} +, {-3761, -842, -713, 396, 1405, 3339, -1477, -3670} +}; //16384 + +int16_t U2[8][8] = {{8755, 2010, -3642, -913, 5998, -2312, -389, -1571} +, {-906, 9661, -1875, -328, 4034, -3910, -355, -5117} +, {-2433, 1688, 1328, -1493, 4122, 769, -177, 9988} +, {-2759, 2240, 1795, 6117, 6542, -6011, 710, 283} +, {-3163, 5634, 15468, -1189, 704, -1739, 483, 3409} +, {-4224, 5383, -15324, -2616, 19957, 2042, -579, -319} +, {181, -1085, 863, 1111, -4614, 4177, 3342, 4059} +, {312, 996, -3600, -867, 2397, -1214, -917, 8633} +}; //16384 + +int16_t Bg2[1][8] = {{-5411, -15415, -13003, -12122, -18931, -17923, -8693, -12151} +}; //16384 + +int16_t Bh2[1][8] = {{21417, 6457, 6421, 8970, 6601, 836, 3060, 8468} +}; //16384 + +int16_t zeta2 = 32520; //32768 + +int16_t nu2 = 256; //32768 + diff --git a/examples/pytorch/vision/cpp/rnnpool_quantized.cpp b/examples/pytorch/vision/cpp/rnnpool_quantized.cpp new file mode 100644 index 000000000..b3a41c54a --- /dev/null +++ b/examples/pytorch/vision/cpp/rnnpool_quantized.cpp @@ -0,0 +1,535 @@ +// Copyright (c) Microsoft Corporation. All rights reserved. +// Licensed under the MIT license. 
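+//
+// Note on the fixed-point format used in this sample: every tensor is an
+// int16_t array with an implicit power-of-two scale (real value = stored
+// value / 2^s). With SHIFT defined, the `scale` map below holds the exponents
+// s and rescaling is done with bit shifts; without SHIFT it holds the
+// denominators 2^s and rescaling is done with integer divisions. The trailing
+// //16384 and //32768 comments in data.h give the denominators of the
+// corresponding weights.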
+
+#include <iostream>
+#include <fstream>
+#include <string>
+#include <cstring>
+#include <cstdlib>
+#include <cstdint>
+#include <cmath>
+#include <vector>
+#include <unordered_map>
+
+using namespace std;
+
+#define MYINT int16_t
+#define MYITE int16_t
+
+#include "data.h"
+
+#define SHIFT
+
+#ifdef SHIFT
+#define MYSCL int16_t
+unordered_map<string, MYSCL> scale = {
+ {"X", 12},
+
+ {"one", 14},
+
+ {"W1",14},
+ {"H1",14},
+ {"U1",15},
+ {"Bg1",14},
+ {"Bh1",15},
+ {"zeta1",15},
+ {"nu1",15},
+
+ {"a1",11},
+ {"b1",13},
+ {"c1",11},
+ {"cBg1",11},
+ {"cBh1",11},
+ {"g1",14},
+ {"h1",14},
+ {"z1",14},
+ {"y1",14},
+ {"w1",14},
+ {"v1",14},
+ {"u1",14},
+
+ {"intermediate", 14},
+
+ {"W2",14},
+ {"H2",14},
+ {"U2",14},
+ {"Bg2",14},
+ {"Bh2",14},
+ {"zeta2",15},
+ {"nu2",15},
+
+ {"a2",14},
+ {"b2",13},
+ {"c2",13},
+ {"cBg2",11},
+ {"cBh2",11},
+ {"g2",14},
+ {"h2",14},
+ {"z2",15},
+ {"y2",14},
+ {"w2",14},
+ {"v2",14},
+ {"u2",14},
+
+ {"Y",14},
+};
+#else
+#define MYSCL int32_t
+unordered_map<string, MYSCL> scale = {
+ {"X", 4096},
+
+ {"one", 16384},
+
+ {"W1",16384},
+ {"H1",16384},
+ {"U1",32768},
+ {"Bg1",16384},
+ {"Bh1",32768},
+ {"zeta1",32768},
+ {"nu1",32768},
+
+ {"a1",2048},
+ {"b1",8192},
+ {"c1",2048},
+ {"cBg1",2048},
+ {"cBh1",2048},
+ {"g1",16384},
+ {"h1",16384},
+ {"z1",16384},
+ {"y1",16384},
+ {"w1",16384},
+ {"v1",16384},
+ {"u1",16384},
+
+ {"intermediate", 16384},
+
+ {"W2",16384},
+ {"H2",16384},
+ {"U2",16384},
+ {"Bg2",16384},
+ {"Bh2",16384},
+ {"zeta2",32768},
+ {"nu2",32768},
+
+ {"a2",16384},
+ {"b2",8192},
+ {"c2",8192},
+ {"cBg2",2048},
+ {"cBh2",2048},
+ {"g2",16384},
+ {"h2",16384},
+ {"z2",32768},
+ {"y2",16384},
+ {"w2",16384},
+ {"v2",16384},
+ {"u2",16384},
+
+ {"Y",16384},
+};
+#endif
+
+
+
+void MatMul(int16_t* A, int16_t* B, int16_t* C, MYINT I, MYINT J, MYINT K, MYSCL scA, MYSCL scB, MYSCL scC) {
+
+#ifdef SHIFT
+ MYSCL addshrP = 1, addshr = 0;
+ while (addshrP < J) {
+ addshrP *= 2;
+ addshr += 1;
+ }
+#else
+ MYSCL addshr = 1;
+ while (addshr < J)
+ addshr *= 2;
+#endif
+
+#ifdef SHIFT
+ MYSCL shr = scA + scB - scC - addshr;
+#else
+ MYSCL shr = (scA * scB) / (scC * addshr);
+#endif
+
+ for (int i = 0; i < I; i++) {
+ for (int k = 0; k < K; k++) {
+ int32_t s = 0;
+ for (int j = 0; j < J; j++) {
+#ifdef SHIFT
+ s += ((int32_t)A[i * J + j] * (int32_t)B[j * K + k]) >> addshr;
+#else
+ s += ((int32_t)A[i * J + j] * (int32_t)B[j * K + k]) / addshr;
+#endif
+ }
+#ifdef SHIFT
+ C[i * K + k] = s >> shr;
+#else
+ C[i * K + k] = s / shr;
+#endif
+ }
+ }
+}
+
+inline MYINT min(MYINT a, MYINT b) {
+ return a < b ? a : b;
+}
+
+inline MYINT max(MYINT a, MYINT b) {
+ return a > b ?
a : b; +} + +void MatAdd(int16_t* A, int16_t* B, int16_t* C, MYINT I, MYINT J, MYSCL scA, MYSCL scB, MYSCL scC) { + + MYSCL shrmin = min(scA, scB); +#ifdef SHIFT + MYSCL shra = scA - shrmin; + MYSCL shrb = scB - shrmin; + MYSCL shrc = shrmin - scC; +#else + MYSCL shra = scA / shrmin; + MYSCL shrb = scB / shrmin; + MYSCL shrc = shrmin / scC; +#endif + + for (int i = 0; i < I; i++) { + for (int j = 0; j < J; j++) { +#ifdef SHIFT + C[i * J + j] = ((A[i * J + j] >> (shra + shrc)) + (B[i * J + j] >> (shrb + shrc))); +#else + C[i * J + j] = ((A[i * J + j] / (shra * shrc)) + (B[i * J + j] / (shrb * shrc))); +#endif + } + } +} + +void ScalarMatSub(int16_t A, int16_t* B, int16_t* C, MYINT I, MYINT J, MYSCL scA, MYSCL scB, MYSCL scC) { + + MYSCL shrmin = min(scA, scB); +#ifdef SHIFT + MYSCL shra = scA - shrmin; + MYSCL shrb = scB - shrmin; + MYSCL shrc = shrmin - scC; +#else + MYSCL shra = scA / shrmin; + MYSCL shrb = scB / shrmin; + MYSCL shrc = shrmin / scC; +#endif + + for (int i = 0; i < I; i++) { + for (int j = 0; j < J; j++) { +#ifdef SHIFT + C[i * J + j] = ((A >> (shra + shrc)) - (B[i * J + j] >> (shrb + shrc))); +#else + C[i * J + j] = ((A / (shra * shrc)) - (B[i * J + j] / (shrb * shrc))); +#endif + } + } +} + +void ScalarMatAdd(int16_t A, int16_t* B, int16_t* C, MYINT I, MYINT J, MYSCL scA, MYSCL scB, MYSCL scC) { + + MYSCL shrmin = min(scA, scB); +#ifdef SHIFT + MYSCL shra = scA - shrmin; + MYSCL shrb = scB - shrmin; + MYSCL shrc = shrmin - scC; +#else + MYSCL shra = scA / shrmin; + MYSCL shrb = scB / shrmin; + MYSCL shrc = shrmin / scC; +#endif + + for (int i = 0; i < I; i++) { + for (int j = 0; j < J; j++) { +#ifdef SHIFT + C[i * J + j] = ((A >> (shra + shrc)) + (B[i * J + j] >> (shrb + shrc))); +#else + C[i * J + j] = ((A / (shra * shrc)) + (B[i * J + j] / (shrb * shrc))); +#endif + } + } +} + +void HadMul(int16_t* A, int16_t* B, int16_t* C, MYINT I, MYINT J, MYSCL scA, MYSCL scB, MYSCL scC) { + +#ifdef SHIFT + MYSCL shr = (scA + scB) - scC; +#else + MYSCL shr = (scA * scB) / scC; +#endif + + for (int i = 0; i < I; i++) { + for (int j = 0; j < J; j++) { +#ifdef SHIFT + C[i * J + j] = (((int32_t)A[i * J + j]) * ((int32_t)B[i * J + j])) >> shr; +#else + C[i * J + j] = (((int32_t)A[i * J + j]) * ((int32_t)B[i * J + j])) / shr; +#endif + } + } +} + +void ScalarMul(int16_t A, int16_t* B, int16_t* C, MYINT I, MYINT J, MYSCL scA, MYSCL scB, MYSCL scC) { + +#ifdef SHIFT + MYSCL shr = (scA + scB) - scC; +#else + MYSCL shr = (scA * scB) / scC; +#endif + + for (int i = 0; i < I; i++) { + for (int j = 0; j < J; j++) { +#ifdef SHIFT + C[i * J + j] = ((int32_t)(A) * (int32_t)(B[i * J + j])) >> shr; +#else + C[i * J + j] = ((int32_t)(A) * (int32_t)(B[i * J + j])) / shr; +#endif + } + } +} + +void SigmoidNew16(int16_t* A, MYINT I, MYINT J, int16_t* B) { + for (MYITE i = 0; i < I; i++) { + for (MYITE j = 0; j < J; j++) { + int16_t a = A[i * J + j]; + B[i * J + j] = 8 * max(min((a + 2048) / 2, 2048), 0); + } + } + return; +} + +void TanHNew16(int16_t* A, MYINT I, MYINT J, int16_t* B) { + for (MYITE i = 0; i < I; i++) { + for (MYITE j = 0; j < J; j++) { + int16_t a = A[i * J + j]; + B[i * J + j] = 8 * max(min(a, 2048), -2048); + } + } + return; +} + +void reverse(int16_t* A, int16_t* B, int I, int J) { + for (int i = 0; i < I; i++) { + for (int j = 0; j < J; j++) { + B[i * J + j] = A[(I - i - 1) * J + j]; + } + } +} + + +void print(int16_t* var, int I, int J, MYSCL scale) { + for (int i = 0; i < I; i++) { + for (int j = 0; j < J; j++) { + cout << ((float)var[i * J + j]) / scale << " "; + } + cout << 
endl; + } + //exit(1); +} + +void FastGRNN1(int16_t X[8][4], int16_t* H, int timestep) { + memset(&H[0], 0, 8 * 2); + + for (int i = 0; i < timestep; i++) { + int16_t a[1][8]; + MatMul(&X[i][0], &W1[0][0], &a[0][0], 1, 4, 8, scale["X"], scale["W1"], scale["a1"]); + int16_t b[1][8]; + MatMul(&H[0], &U1[0][0], &b[0][0], 1, 8, 8, scale["H1"], scale["U1"], scale["b1"]); + int16_t c[1][8]; + MatAdd(&a[0][0], &b[0][0], &c[0][0], 1, 8, scale["a1"], scale["b1"], scale["c1"]); + int16_t cBg[1][8]; + MatAdd(&c[0][0], &Bg1[0][0], &cBg[0][0], 1, 8, scale["c1"], scale["Bg1"], scale["cBg1"]); + int16_t g[1][8]; + SigmoidNew16(&cBg[0][0], 1, 8, &g[0][0]); + int16_t cBh[1][8]; + MatAdd(&c[0][0], &Bh1[0][0], &cBh[0][0], 1, 8, scale["c1"], scale["Bh1"], scale["cBh1"]); + int16_t h[1][8]; + TanHNew16(&cBh[0][0], 1, 8, &h[0][0]); + int16_t z[1][8]; + HadMul(&g[0][0], &H[0], &z[0][0], 1, 8, scale["g1"], scale["H1"], scale["z1"]); + int16_t y[1][8]; + ScalarMatSub(16384, &g[0][0], &y[0][0], 1, 8, scale["one"], scale["g1"], scale["y1"]); + int16_t w[1][8]; + ScalarMul(zeta1, &y[0][0], &w[0][0], 1, 8, scale["zeta1"], scale["y1"], scale["w1"]); + int16_t v[1][8]; + ScalarMatAdd(nu1, &w[0][0], &v[0][0], 1, 8, scale["nu1"], scale["w1"], scale["v1"]); + int16_t u[1][8]; + HadMul(&w[0][0], &h[0][0], &u[0][0], 1, 8, scale["w1"], scale["h1"], scale["u1"]); + + MatAdd(&z[0][0], &u[0][0], &H[0], 1, 8, scale["z1"], scale["u1"], scale["H1"]); + } +} + +void FastGRNN2(int16_t X[8][8], int16_t* H, int timestep) { + memset(&H[0], 0, 8 * 2); + + for (int i = 0; i < timestep; i++) { + int16_t a[1][8]; + MatMul(&X[i][0], &W2[0][0], &a[0][0], 1, 8, 8, scale["intermediate"], scale["W2"], scale["a2"]); + + int16_t b[1][8]; + MatMul(&H[0], &U2[0][0], &b[0][0], 1, 8, 8, scale["H2"], scale["U2"], scale["b2"]); + int16_t c[1][8]; + MatAdd(&a[0][0], &b[0][0], &c[0][0], 1, 8, scale["a2"], scale["b2"], scale["c2"]); + int16_t cBg[1][8]; + MatAdd(&c[0][0], &Bg2[0][0], &cBg[0][0], 1, 8, scale["c2"], scale["Bg2"], scale["cBg2"]); + int16_t g[1][8]; + SigmoidNew16(&cBg[0][0], 1, 8, &g[0][0]); + int16_t cBh[1][8]; + MatAdd(&c[0][0], &Bh2[0][0], &cBh[0][0], 1, 8, scale["c2"], scale["Bh2"], scale["cBh2"]); + int16_t h[1][8]; + TanHNew16(&cBh[0][0], 1, 8, &h[0][0]); + int16_t z[1][8]; + HadMul(&g[0][0], &H[0], &z[0][0], 1, 8, scale["g2"], scale["H2"], scale["z2"]); + int16_t y[1][8]; + ScalarMatSub(16384, &g[0][0], &y[0][0], 1, 8, scale["one"], scale["g2"], scale["y2"]); + int16_t w[1][8]; + ScalarMul(zeta2, &y[0][0], &w[0][0], 1, 8, scale["zeta2"], scale["y2"], scale["w2"]); + int16_t v[1][8]; + ScalarMatAdd(nu2, &w[0][0], &v[0][0], 1, 8, scale["nu2"], scale["w2"], scale["v2"]); + int16_t u[1][8]; + HadMul(&w[0][0], &h[0][0], &u[0][0], 1, 8, scale["w2"], scale["h2"], scale["u2"]); + + MatAdd(&z[0][0], &u[0][0], &H[0], 1, 8, scale["z2"], scale["u2"], scale["H2"]); + } +} + +void RNNPool(int16_t X[8][8][4], int16_t pred[1][32]) { + + int16_t biinput1[8][8], biinput1r[8][8]; + for (int i = 0; i < 8; i++) { + int16_t subX[8][4]; + for (int j = 0; j < 8; j++) { + for (int k = 0; k < 4; k++) { + subX[j][k] = X[i][j][k]; + } + } + int16_t H[1][8]; + FastGRNN1(subX, &H[0][0], 8); + + for (int j = 0; j < 8; j++) { + biinput1[i][j] = H[0][j]; + } + } + + int16_t res1[1][8], res2[1][8]; + FastGRNN2(biinput1, &res1[0][0], 8); + reverse(&biinput1[0][0], &biinput1r[0][0], 8, 8); + FastGRNN2(biinput1r, &res2[0][0], 8); + + int16_t biinput2[8][8], biinput2r[8][8]; + for (int i = 0; i < 8; i++) { + int16_t subX[8][4]; + for (int j = 0; j < 8; j++) { + for (int k 
= 0; k < 4; k++) { + subX[j][k] = X[j][i][k]; + } + } + int16_t H[1][8]; + FastGRNN1(subX, &H[0][0], 8); + + for (int j = 0; j < 8; j++) { + biinput2[i][j] = H[0][j]; + } + } + + + int16_t res3[1][8], res4[1][8]; + FastGRNN2(biinput2, &res3[0][0], 8); + reverse(&biinput2[0][0], &biinput2r[0][0], 8, 8); + FastGRNN2(biinput2r, &res4[0][0], 8); + + for (int i = 0; i < 8; i++) + pred[0][i] = res1[0][i]; + for (int i = 0; i < 8; i++) + pred[0][i + 8] = res2[0][i]; + for (int i = 0; i < 8; i++) + pred[0][i + 16] = res3[0][i]; + for (int i = 0; i < 8; i++) + pred[0][i + 24] = res4[0][i]; +} + +int main(int argc, char* argv[]) { + string inputfile, outputfile; + int patches; + if (argc != 4) { + cerr << "Improper number of arguments" << endl; + return -1; + } + else { + patches = atoi(argv[1]); + inputfile = string(argv[2]); + outputfile = string(argv[3]); + } + + fstream Xfile, Yfile; + + Xfile.open(inputfile, ios::in | ios::binary); + Yfile.open(outputfile, ios::out | ios::binary); + + + char line[8]; + Xfile.read(line, 8); + int headerSize; + Xfile.read((char*)&headerSize, 1 * 2); + + char* headerLine = new char[headerSize]; //Ignored + Xfile.read(headerLine, headerSize); + delete[] headerLine; + + char numpyMagix = 147; + char numpyVersionMajor = 1, numpyVersionMinor = 0; + string numpyMetaHeader = ""; + numpyMetaHeader += numpyMagix; + numpyMetaHeader += "NUMPY"; + numpyMetaHeader += numpyVersionMajor; + numpyMetaHeader += numpyVersionMinor; + + string numpyHeader = "{'descr': '