Spatio-temporal video autoencoder with convolutional LSTMs

ConvLSTM

Source code for the paper "Spatio-temporal video autoencoder with differentiable memory", published in the ICLR 2016 Workshop track.

This is a demo version, intended to be trained on a modified version of the moving MNIST dataset, available here. Some videos obtained on real test sequences are also available here (though they are not up to date).

The repository also contains a demo, main-demo-ConvLSTM.lua, which trains a simple model, model-demo-ConvLSTM.lua, that uses the ConvLSTM module to predict the next frame in a sequence. Unlike the model in the paper, this simple model does not explicitly estimate the optical flow to generate the next frame.
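To make the idea of a convolutional LSTM concrete, here is a minimal NumPy sketch of a single ConvLSTM step: the usual LSTM gates, but computed with 2D convolutions so that the hidden state and the memory keep their spatial layout. It is not a port of ConvLSTM.lua. The function names, the gate ordering, the kernel size, and the tensor shapes are all assumptions chosen for illustration.

```python
import numpy as np

def conv2d_same(x, w):
    """Zero-padded 'same' 2D cross-correlation.
    x: (C_in, H, W); w: (C_out, C_in, k, k) with odd k. Returns (C_out, H, W)."""
    c_out, c_in, k, _ = w.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    h, wd = x.shape[1:]
    out = np.zeros((c_out, h, wd))
    for i in range(k):
        for j in range(k):
            # Contribution of kernel tap (i, j) to every output pixel at once.
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], xp[:, i:i + h, j:j + wd])
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def conv_lstm_step(x, h, c, w_x, w_h, b):
    """One ConvLSTM step (illustrative): gates come from convolutions of the
    input frame x and the previous hidden state h, not from matrix products."""
    n = h.shape[0]
    gates = conv2d_same(x, w_x) + conv2d_same(h, w_h) + b[:, None, None]
    # Assumed channel layout: input, forget, output gates, then the candidate.
    i, f, o = (sigmoid(gates[k * n:(k + 1) * n]) for k in range(3))
    g = np.tanh(gates[3 * n:])
    c_next = f * c + i * g          # memory update, per pixel and channel
    h_next = o * np.tanh(c_next)    # new hidden state, same spatial size
    return h_next, c_next

# Toy usage: 5 random single-channel 8x8 frames, 4 hidden channels.
rng = np.random.default_rng(0)
frames = rng.random((5, 1, 8, 8))
n_hidden = 4
w_x = 0.1 * rng.standard_normal((4 * n_hidden, 1, 3, 3))
w_h = 0.1 * rng.standard_normal((4 * n_hidden, n_hidden, 3, 3))
b = np.zeros(4 * n_hidden)
h = np.zeros((n_hidden, 8, 8))
c = np.zeros_like(h)
for x in frames:
    h, c = conv_lstm_step(x, h, c, w_x, w_h, b)
print(h.shape)  # (4, 8, 8)
```

In a next-frame predictor, the final hidden state would typically be passed through a further convolution to produce the predicted frame; that readout is omitted here.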

The ConvLSTM module can be used as is. Alternatively, you can use the untied version, implemented in the UntiedConvLSTM class. The untied version uses a separate model for the first step in the sequence, which has no memory to draw on. This can help when training on shorter sequences, because it reduces the impact of the first (memoryless) step on training.
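The untied idea can be sketched independently of any particular cell: the first frame is processed by a step with its own weights, and every later frame goes through the shared recurrent step. The NumPy code below is an illustration under stated assumptions, not the UntiedConvLSTM implementation. The names run_untied, make_first_step, and make_step are hypothetical, and the gates use scalar per-pixel weights (in effect a 1x1 convolution) only to keep the example short.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def run_untied(frames, first_step, step):
    """Run a recurrent model whose first step has its own weights.

    first_step(x) -> (h, c): no previous state exists, so it gets separate parameters.
    step(x, h, c) -> (h, c): the usual tied update, shared by all later steps.
    """
    h, c = first_step(frames[0])
    states = [h]
    for x in frames[1:]:
        h, c = step(x, h, c)
        states.append(h)
    return states

def make_first_step(w_x):
    # Hypothetical memoryless step: only input weights, and no forget gate,
    # because there is no previous memory for it to act on.
    def first_step(x):
        i, o = sigmoid(w_x[0] * x), sigmoid(w_x[1] * x)
        c = i * np.tanh(w_x[2] * x)
        return o * np.tanh(c), c
    return first_step

def make_step(w_x, w_h):
    # Toy per-pixel LSTM update, used only to exercise run_untied.
    def step(x, h, c):
        i, f, o = (sigmoid(w_x[k] * x + w_h[k] * h) for k in range(3))
        c = f * c + i * np.tanh(w_x[3] * x + w_h[3] * h)
        return o * np.tanh(c), c
    return step

# Toy usage: 4 random 8x8 frames.
rng = np.random.default_rng(0)
frames = rng.random((4, 8, 8))
first_step = make_first_step(rng.standard_normal(3))
step = make_step(rng.standard_normal(4), rng.standard_normal(4))
states = run_untied(frames, first_step, step)
print(len(states), states[0].shape)  # 4 (8, 8)
```

Because the first step never has to share weights with steps that do see a memory, its errors do not pull the shared weights toward the memoryless case, which matters more when the sequence is short.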

Dependencies

  • rnn: our code extends rnn by providing a spatio-temporal convolutional version of LSTM cells.
  • extracunn: contains CUDA code for the SpatialConvolutionalNoBias layer and for the Huber gradient computation.
  • stn.

To cite our paper/code:

@inproceedings{PatrauceanHC16,
  author    = {Viorica P{\u a}tr{\u a}ucean and
               Ankur Handa and
               Roberto Cipolla},
  title     = {Spatio-temporal video autoencoder with differentiable memory},
  booktitle = {International Conference on Learning Representations (ICLR) Workshop},
  year      = {2016}
}