# ResNet2RNN Example

A Residual Network (ResNet) can be regarded as a Recurrent Neural Network (RNN)
if its parameters are shared across the residual blocks. With such sharing,
the number of parameters of the ResNet is dramatically reduced, so it can be
used as a parameter-reduction technique. The computational cost, however, is not reduced.

In detail, an RNN can be written as $h_{t} = W^{xh}x_{t} + W^{hh}h_{t-1}$, where $x_{t}$ is the input at time $t$, $h_{t-1}$ is the hidden units at time $t-1$, $W^{xh}$ is the input-to-hidden weight, and $W^{hh}$ is the hidden-to-hidden weight. A ResNet, on the other hand, is written as $h^{d} = h^{d-1} + f(h^{d-1})$, where $h^{d}$ is the hidden units at depth $d$ and $f$ is a function, typically a pipeline of convolution, batch normalization, and a non-linearity.

If one lets $f$, shared across all depths, play the role of $W^{hh}$ and sets $W^{xh}$ to the zero matrix,
then the ResNet becomes an RNN unrolled over the depth without the input $x^{d}$. This is the underlying concept of ResNet2RNN.
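
The correspondence can be illustrated with a minimal pure-Python sketch. It is not the code in `classification.py`: it assumes that $f$ is a plain linear map $f(h) = Wh$ (no convolution, batch normalization, or non-linearity), and all names (`resnet_forward`, `rnn_forward`, `count_params`, `rand_matrix`) are hypothetical. Under that assumption the skip connection folds into the recurrent weight, so the shared-weight ResNet matches an input-free RNN whose hidden-to-hidden weight is $I + W$.

```python
import random

def matvec(W, h):
    """Multiply matrix W (a list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]

def resnet_forward(blocks, h):
    """Apply h^d = h^{d-1} + f(h^{d-1}) per block, with f(h) = W h (assumed linear)."""
    for W in blocks:
        h = [hi + fi for hi, fi in zip(h, matvec(W, h))]
    return h

def rnn_forward(W_hh, h, steps):
    """Apply h_t = W^{hh} h_{t-1} for `steps` steps (W^{xh} = 0, so no input term)."""
    for _ in range(steps):
        h = matvec(W_hh, h)
    return h

def count_params(blocks):
    """Count weights; a matrix shared by several blocks is counted once."""
    unique = {id(W): W for W in blocks}
    return sum(len(W) * len(W[0]) for W in unique.values())

def rand_matrix(dim, rng):
    """Small random square matrix (illustrative initialization only)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(dim)]

rng = random.Random(0)
dim, depth = 4, 3
h0 = [1.0, 0.0, -1.0, 0.5]

shared_W = rand_matrix(dim, rng)
shared_blocks = [shared_W] * depth                                # one matrix reused at every depth
unshared_blocks = [rand_matrix(dim, rng) for _ in range(depth)]  # one matrix per depth

# Fold the skip connection into the recurrent weight: W_hh = I + W.
W_hh = [[w + (1.0 if i == j else 0.0) for j, w in enumerate(row)]
        for i, row in enumerate(shared_W)]

out_resnet = resnet_forward(shared_blocks, h0)
out_rnn = rnn_forward(W_hh, h0, depth)

print("unshared params:", count_params(unshared_blocks))
print("shared params:  ", count_params(shared_blocks))
print("max |resnet - rnn|:", max(abs(a - b) for a, b in zip(out_resnet, out_rnn)))
```

With `dim = 4` and `depth = 3`, the unshared network holds 48 weights and the shared one 16, while both run the same number of matrix-vector products. The two outputs agree up to floating-point rounding.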


To run this example, use the following command:

```sh
python classification.py -c "cudnn" \
    --monitor-path "cifar10_resnet2rnn_3x3x4_prediction" \
    --model-save-path "cifar10_resnet2rnn_3x3x4_prediction" \
    --net "cifar10_resnet2rnn_3x3x4_prediction" \
    -d 0
```

## References
1. Qianli Liao and Tomaso Poggio, "Bridging the Gaps Between Residual Learning, Recurrent Neural Networks and Visual Cortex", arXiv:1604.03640