The model architecture of this project is inspired by https://arxiv.org/abs/1612.01756 and http://www.ais.uni-bonn.de/WS2021/LabVision/Project/winner2019.pdf
We train with a combined DSSIM and L2 loss. The model takes three seed frames and then predicts the next three frames autoregressively:
- Input: GT0 -> output discarded
- Input: GT1 -> output discarded
- Input: GT2 -> output Pred0 (loss between Pred0 and GT3)
- Input: Pred0 -> output Pred1 (loss between Pred1 and GT4)
- Input: Pred1 -> output Pred2 (loss between Pred2 and GT5)
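The warm-up/rollout scheme above can be sketched as follows. This is a minimal illustration, not the repository's code: `dssim` uses a uniform-window SSIM (the exact DSSIM variant and window are assumptions), `rollout_loss` assumes the model carries its recurrent state internally between calls, and the equal weighting of the DSSIM and L2 terms is a placeholder.

```python
import torch
import torch.nn.functional as F

def dssim(x, y, win=7, c1=0.01 ** 2, c2=0.03 ** 2):
    """Structural dissimilarity, (1 - SSIM) / 2, with a uniform local window.
    Window size and constants are illustrative defaults."""
    pad = win // 2
    mu_x = F.avg_pool2d(x, win, 1, pad)
    mu_y = F.avg_pool2d(y, win, 1, pad)
    var_x = F.avg_pool2d(x * x, win, 1, pad) - mu_x ** 2
    var_y = F.avg_pool2d(y * y, win, 1, pad) - mu_y ** 2
    cov_xy = F.avg_pool2d(x * y, win, 1, pad) - mu_x * mu_y
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )
    return ((1 - ssim) / 2).mean()

def rollout_loss(model, frames, n_seed=3, n_pred=3):
    """frames: (B, T, C, H, W) with T >= n_seed + n_pred.
    Assumes a recurrent model whose hidden state persists across calls."""
    for t in range(n_seed - 1):
        model(frames[:, t])          # warm-up on GT0, GT1: outputs discarded
    inp = frames[:, n_seed - 1]      # GT2 produces the first prediction
    loss = 0.0
    for k in range(n_pred):
        pred = model(inp)
        gt = frames[:, n_seed + k]   # compare Pred_k against GT_{n_seed+k}
        loss = loss + dssim(pred, gt) + F.mse_loss(pred, gt)
        inp = pred                   # feed the prediction back in
    return loss / n_pred
```

A stateless model also runs through this loop (the warm-up calls are then no-ops), which is handy for smoke-testing the loss itself.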
The results of the model on the Moving MNIST test set (http://www.cs.toronto.edu/~nitish/unsupervised_video/) are shown below:
Results on a custom Robot dataset (built from scratch from YouTube videos) are shown below:
Deep learning algorithms implemented using PyTorch (on CUDA GPUs):
- Multi-Layer Neural Networks
- Convolutional Neural Networks
- Weights & Biases (WandB) hyperparameter sweeps
- CNNs on the custom Robot dataset
- Variational Convolutional Autoencoders
- LSTM and GRU
- DCGAN
- WGAN