Hi,
My name is Alexander; we mostly work on high-quality, fast Speech Recognition. We mainly use strided 1D convolutions and Transformer modules, typically with progressive strides (i.e. 2 - 2 - 2). I have tried overall strides of 2, 4, and 8, and 8 was best (on a limited compute budget, of course). But I have not looked into other configurations, like 2 - 4 for example (very similar to an example you show in your blog post).
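For concreteness, here is a minimal sketch of the kind of progressive-stride frontend I mean (layer widths and kernel sizes are hypothetical, not our production values): three stride-2 `Conv1d` layers give an overall stride of 2 × 2 × 2 = 8 ahead of the Transformer blocks.

```python
import torch
import torch.nn as nn

# Hypothetical progressive-stride 1D conv frontend (2 - 2 - 2 = overall stride 8).
frontend = nn.Sequential(
    nn.Conv1d(80, 256, kernel_size=3, stride=2, padding=1),   # overall stride 2
    nn.ReLU(),
    nn.Conv1d(256, 256, kernel_size=3, stride=2, padding=1),  # overall stride 4
    nn.ReLU(),
    nn.Conv1d(256, 256, kernel_size=3, stride=2, padding=1),  # overall stride 8
    nn.ReLU(),
)

x = torch.randn(4, 80, 800)   # (batch, features, time)
y = frontend(x)
print(y.shape)                # time axis reduced 8x: (4, 256, 100)
```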
As you have noted in your blog post, having an expressive stride module is of utmost importance. Your post made me think about improving our models further and I took the liberty of looking through your code briefly.
Though the idea in itself is fairly simple (just apply an RNN to a batched input; in the 1D case it is even easier), a series of questions arose after I took a look at your code, because it departs rather heavily from standard PyTorch practices. So I would like to know whether each of these is a bug or a feature, so to say:
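To make sure I understand the idea correctly, here is how I would sketch the 1D case (my own illustration, not your code): replace a strided convolution with an RNN that summarizes each non-overlapping window, by folding the windows into the batch dimension and keeping the last hidden state.

```python
import torch
import torch.nn as nn

class RNNStride1d(nn.Module):
    """Illustrative RNN-as-stride module: one GRU summary per window of `stride` steps."""
    def __init__(self, in_dim, hidden_dim, stride):
        super().__init__()
        self.stride = stride
        self.rnn = nn.GRU(in_dim, hidden_dim, batch_first=True)

    def forward(self, x):                       # x: (batch, time, features)
        b, t, f = x.shape
        t = (t // self.stride) * self.stride    # drop the ragged tail
        # fold each window into the batch dimension: (b * windows, stride, f)
        windows = x[:, :t].reshape(b * (t // self.stride), self.stride, f)
        _, h = self.rnn(windows)                # h: (1, b * windows, hidden)
        return h.squeeze(0).reshape(b, t // self.stride, -1)

pool = RNNStride1d(in_dim=80, hidden_dim=128, stride=8)
y = pool(torch.randn(4, 803, 80))
print(y.shape)  # (4, 100, 128)
```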
(0)
The module contains `.to(torch.device("cuda"))`, but it does not inherit from `FastGRNNCUDA` (it inherits from `FastGRNN`). This seems a bit strange; why is that the case?
(1)
You use `.to(torch.device("cuda"))`, though it is standard practice in PyTorch to write device-agnostic code. Does this imply that:
- This code is NOT meant for multi-node (or multi-device) parallelization (e.g. DP, DDP)?
- This code is NOT meant to be run later on x86 (quantized or pruned) inference afterwards?
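For reference, the device-agnostic pattern I have in mind is the usual one: a module never calls `.to(...)` on itself; the caller moves the whole model once, and tensors created inside `forward()` follow the device of the input (a generic sketch, not your code).

```python
import torch
import torch.nn as nn

class Cell(nn.Module):
    """Toy cell illustrating device-agnostic state allocation."""
    def __init__(self, dim):
        super().__init__()
        self.w = nn.Parameter(torch.randn(dim, dim))

    def forward(self, x):
        # allocate state on whatever device the input already lives on
        h = torch.zeros(x.shape[0], self.w.shape[0], device=x.device)
        return torch.tanh(x @ self.w + h)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = Cell(16).to(device)                # single .to() at the call site
out = model(torch.randn(2, 16, device=device))
print(out.shape)                           # (2, 16), on whichever device is available
```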
(2)
I saw some pruning and low-rank snippets in your code.
Low-rank does not seem to be used in RNNPool.
(3)
I took a quick look at `RNNCell` and `FastGRNNCell`. Apart from some utilities for model size estimation, seemingly unused low-rank options and pruning, and parameter initialization, I cannot really see why you implemented these classes from scratch instead of just using the standard PyTorch ones. Is there some reasoning behind it?
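For comparison, the stock PyTorch cells already cover the plain recurrence out of the box; as far as I can tell, a custom cell would only be needed for the FastGRNN-specific gating or low-rank factorization:

```python
import torch
import torch.nn as nn

# Standard PyTorch cell: one recurrence step for a batch of 32.
cell = nn.GRUCell(input_size=80, hidden_size=128)
x = torch.randn(32, 80)          # one timestep
h = torch.zeros(32, 128)         # initial hidden state
h = cell(x, h)
print(h.shape)                   # (32, 128)
```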
(4)
You seem to apply RNNPool to a VGG encoder. VGG is usually slow and large; is there a reason you do not apply this scheme to a MobileNet?