R_Unet

This project applies a recurrent method on top of U-Net to perform pixel-level video frame prediction.
Part of our results was published at IEEE GCCE 2020 (pdf).

Brief introduction

Taking advantage of LSTM and the U-Net encoder-decoder, we aim to predict the next n frame(s).
We currently use a two-layer LSTM network (V1) or a convolutional LSTM (V2) as the recurrent network applied to the latent features of U-Net.
In our latest v4 model, we use a convolutional LSTM at each level and keep the shortcut connections introduced in v2.

We are also using the v4_mask model, which takes mask and image as input and predicts both mask and image.
This model has the same structure as v4, but its output layer is changed to emit a mask tensor.
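To make the recurrent part concrete, below is a minimal NumPy sketch of one convolutional-LSTM step applied to a latent feature map, as the V2/V4 models do on the U-Net features. This is an illustrative toy, not the repo's actual implementation (which uses the convLSTM framework listed under References); all names, shapes, and weight layouts here are assumptions.

```python
import numpy as np

def conv2d(x, w):
    """'Same' 2D convolution of a (C_in, H, W) map with a (C_out, C_in, k, k) kernel."""
    cout, cin, k, _ = w.shape
    _, h, wd = x.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((cout, h, wd))
    for o in range(cout):
        for i in range(h):
            for j in range(wd):
                out[o, i, j] = np.sum(xp[:, i:i + k, j:j + k] * w[o])
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convlstm_step(x, h, c, weights):
    """One ConvLSTM step on a latent map x (C, H, W).

    All gates are convolutions over [x; h] concatenated on the channel axis,
    so spatial structure is preserved (unlike a flattened two-layer LSTM).
    """
    xh = np.concatenate([x, h], axis=0)
    i = sigmoid(conv2d(xh, weights["Wi"]))   # input gate
    f = sigmoid(conv2d(xh, weights["Wf"]))   # forget gate
    o = sigmoid(conv2d(xh, weights["Wo"]))   # output gate
    g = np.tanh(conv2d(xh, weights["Wg"]))   # candidate cell state
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

# Toy rollout: feed 5 random "encoder latents" through the recurrent cell.
rng = np.random.default_rng(0)
C, H, W, k = 4, 8, 8, 3
weights = {n: 0.1 * rng.standard_normal((C, 2 * C, k, k))
           for n in ("Wi", "Wf", "Wo", "Wg")}
h = np.zeros((C, H, W))
c = np.zeros((C, H, W))
for t in range(5):
    x = rng.standard_normal((C, H, W))  # stand-in for a U-Net latent feature map
    h, c = convlstm_step(x, h, c, weights)
```

In the V4 setting, a cell like this would sit at every encoder level, with its hidden state `h` passed through the corresponding skip connection to the decoder.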

Usage

  • configuration: config.json
  • configuration parser: parse_arguement.py
  • training script: train.py
  • V1 model: R_Unet_v1.py
  • V2 model: R_Unet_ver_2.py
  • V4 model: R_Unet_ver_4.py

To train the v1 model: python3 train.py config
To train the other models: python3 train_v2.py config
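The training scripts read their settings from config.json through parse_arguement.py; the actual schema is defined in that file. As a purely hypothetical illustration of the pattern (none of these keys are confirmed by the repo), a config might look like this and be loaded with the standard `json` module:

```python
import json

# Hypothetical config.json contents -- the real keys used by
# parse_arguement.py may differ; these are illustrative placeholders only.
example_config = """
{
    "model": "v4",
    "epochs": 50,
    "batch_size": 4,
    "learning_rate": 0.0002,
    "input_frames": 4,
    "predict_frames": 1,
    "data_dir": "data/frames"
}
"""

cfg = json.loads(example_config)
# Downstream code would then read e.g. cfg["model"] to pick the network variant.
```

Check config.json in the repo for the real key names before training.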

Our Model Architecture

Currently we are working on a better model using convolutional LSTM, named runet_v2.

  • model v1: (architecture diagram)

  • model v2: (architecture diagram)

  • model v4: (architecture diagram)



Some results

Frame prediction: (predicted frame) Ground truth: (ground-truth frame)

Mask prediction: (predicted mask) Ground truth: (ground-truth mask)

References

[1] Stochastic Adversarial Video Prediction, CVPR 2018.
[2] High Fidelity Video Prediction with Large Stochastic Recurrent Neural Networks, NeurIPS 2019.
[3] convLSTM: the convolutional LSTM framework used in this project.

Hsu Mu Chien, Watanabe Lab, Department of Fundamental Science and Engineering, Waseda University.
