This is a PyTorch implementation of the AutoEncoder LSTM paper in the vision domain.
The original paper experiments with various datasets, including Moving MNIST. This project handles only the Moving MNIST dataset.
- Dataset described in "Unsupervised Learning of Video Representations using LSTMs", 2015
In the Moving MNIST dataset, each video is 20 frames long and consists of two digits moving inside a 64 × 64 pixel frame. The digits are chosen randomly from the training set and placed at random initial locations inside the image. Each digit is assigned a velocity whose direction is chosen uniformly at random on the unit circle and whose magnitude is chosen uniformly at random over a fixed range.
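The generation procedure above can be sketched in NumPy. This is an illustrative sketch, not the official generator: the function name, `speed_range`, and the bouncing rule are assumptions; only the random direction on the unit circle, random speed over a fixed range, and random initial placement follow the description above.

```python
import numpy as np

def make_moving_mnist_clip(digits, num_frames=20, canvas=64, digit_size=28,
                           speed_range=(2.0, 4.0), rng=None):
    """Render a Moving MNIST-style clip: each digit bounces inside the canvas.

    `digits` is a list of (digit_size x digit_size) grayscale arrays in [0, 1].
    Names and defaults here are illustrative, not the paper's exact generator.
    """
    rng = np.random.default_rng(rng)
    frames = np.zeros((num_frames, canvas, canvas), dtype=np.float32)
    lim = canvas - digit_size
    # Random initial position; direction uniform on the unit circle,
    # magnitude uniform over a fixed range (as described above).
    pos = rng.uniform(0, lim, size=(len(digits), 2))
    theta = rng.uniform(0, 2 * np.pi, size=len(digits))
    speed = rng.uniform(*speed_range, size=len(digits))
    vel = speed[:, None] * np.stack([np.cos(theta), np.sin(theta)], axis=1)
    for t in range(num_frames):
        for i, d in enumerate(digits):
            x, y = pos[i].astype(int)
            # Overlapping digits are combined with a pixel-wise max.
            frames[t, y:y + digit_size, x:x + digit_size] = np.maximum(
                frames[t, y:y + digit_size, x:x + digit_size], d)
        pos += vel
        # Reflect off the canvas edges (assumed bouncing behaviour).
        for axis in range(2):
            over = pos[:, axis] > lim
            under = pos[:, axis] < 0
            pos[over, axis] = 2 * lim - pos[over, axis]
            pos[under, axis] = -pos[under, axis]
            vel[over | under, axis] *= -1
    return frames
```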
A ready-made dataset is available on the official Moving MNIST homepage, but for easier handling this project uses a module that downloads the dataset and preprocesses the images. The module can be downloaded at this [LINK]
The images contain only one channel (not RGB), so no complex preprocessing is needed; only scaling from 0 to 1 is applied.
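The scaling step is just a division by the maximum pixel value. A minimal sketch, assuming the raw frames arrive as `uint8` in [0, 255] (the function name is illustrative):

```python
import numpy as np

def to_unit_range(frames):
    """Scale uint8 grayscale frames from [0, 255] to floats in [0.0, 1.0]."""
    return frames.astype(np.float32) / 255.0
```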
All code is written in Jupyter Notebooks. Both Korean and English versions are provided in this project; a detailed description in Korean is available here.
These are images reconstructed by the model when an image sequence is used as input. The model is trained to memorize the shapes of the digits in the sequence.
These are images predicted by the model when an image sequence is used as input. The model is trained to capture both the velocity and the shape of the digits in the sequence.
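The reconstruction and prediction tasks above correspond to the paper's composite model: one encoder LSTM summarizes the input sequence, then two decoder LSTMs share that summary state, one reconstructing the input and one predicting future frames. The sketch below shows the unconditioned variant (decoder inputs are zeros); the class name, layer sizes, and sigmoid output are illustrative assumptions, not this project's exact code.

```python
import torch
from torch import nn

class CompositeLSTMAutoencoder(nn.Module):
    """Sketch of a composite LSTM autoencoder (assumed architecture):
    an encoder LSTM produces a summary state, a reconstruction decoder
    reproduces the input frames, and a prediction decoder generates
    future frames from the same state.
    """
    def __init__(self, input_size=64 * 64, hidden_size=2048):
        super().__init__()
        self.encoder = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.recon_decoder = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.pred_decoder = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, input_size)

    def forward(self, x, pred_steps=10):
        # x: (batch, time, input_size) -- flattened frames scaled to [0, 1]
        batch, time, feat = x.shape
        _, state = self.encoder(x)  # summary of the whole input sequence
        # Unconditioned decoding: feed zeros, copy the encoder state.
        recon_h, _ = self.recon_decoder(x.new_zeros(batch, time, feat), state)
        pred_h, _ = self.pred_decoder(x.new_zeros(batch, pred_steps, feat), state)
        recon = torch.sigmoid(self.out(recon_h))    # reconstruct the input
        future = torch.sigmoid(self.out(pred_h))    # predict future frames
        return recon, future
```

Training would minimize a pixel-wise loss (e.g. binary cross-entropy) on both outputs: reconstruction against the input frames, prediction against the ground-truth future frames.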