PLA-SM

This repository contains the implementation code for the paper: Pair-wise Layer Attention with Spatial Masking for Video Prediction

Introduction

Video prediction generates future frames from historical frames and has shown great potential in many applications, e.g., meteorological prediction and autonomous driving. Previous works often decode only the final high-level semantic features into future frames, which lack texture details and thus degrade the prediction quality. Motivated by this, we develop a Pair-wise Layer Attention (PLA) module that enhances the layer-wise semantic dependency of the feature maps derived from the U-shaped structure in the Translator by coupling low-level visual cues with high-level features. As a result, the texture details of the predicted frames are enriched. Moreover, most existing methods capture the spatiotemporal dynamics with the Translator but fail to sufficiently utilize the spatial features of the Encoder. This inspires us to design a Spatial Masking (SM) module that masks part of the encoding features during pretraining, which makes the remaining feature pixels more visible to the Decoder. Together, these form the Pair-wise Layer Attention with Spatial Masking (PLA-SM) framework for video prediction, which captures the spatiotemporal dynamics that reflect the motion trend.
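To make the Spatial Masking idea concrete, here is a minimal NumPy sketch of masking part of an encoder feature map. It is not the SM module from this repository: the function name `spatial_mask`, the patch size, the mask ratio, and the choice to zero out masked positions are all assumptions made for illustration.

```python
import numpy as np

def spatial_mask(features, mask_ratio=0.5, patch=4, seed=None):
    """Zero out a random subset of spatial patches in a feature map.

    features: array of shape (C, H, W); H and W must be divisible by `patch`.
    Returns (masked_features, keep_mask), where keep_mask has shape (H, W)
    and is True at positions that stay visible.
    All parameters are hypothetical, not taken from the PLA-SM code.
    """
    c, h, w = features.shape
    if h % patch or w % patch:
        raise ValueError("H and W must be divisible by patch")
    gh, gw = h // patch, w // patch
    n_patches = gh * gw
    n_masked = int(round(mask_ratio * n_patches))

    # Pick which patches to hide, without replacement.
    rng = np.random.default_rng(seed)
    masked_idx = rng.choice(n_patches, size=n_masked, replace=False)
    keep = np.ones(n_patches, dtype=bool)
    keep[masked_idx] = False

    # Expand the patch-level grid to pixel resolution.
    keep_mask = keep.reshape(gh, gw).repeat(patch, axis=0).repeat(patch, axis=1)

    # The same spatial mask is applied to every channel.
    return features * keep_mask[None], keep_mask
```

For example, `spatial_mask(np.ones((8, 16, 16)), mask_ratio=0.5, patch=4)` hides 8 of the 16 patches and leaves the other half of the spatial positions untouched in all 8 channels.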

Dependencies

torch=1.9.0
scikit-image=0.19.3
numpy=1.21.5
argparse
tqdm=4.64.1
addict=2.4.0
fvcore=0.1.5
hickle=5.0.2
opencv-python=4.6.0
pandas=1.3.5
pillow=9.2.0
MinkowskiEngine (see INSTALL.md in the facebookresearch/ConvNeXt-V2 repository on GitHub)
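A quick way to compare an environment against the pinned versions above is to query the installed package metadata. The sketch below uses only the standard library; the helper name `check_versions` is hypothetical and not part of this repository. It uses a prefix match, so local builds such as `1.9.0+cu111` are accepted. MinkowskiEngine and argparse are left out because no version is pinned for them.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the dependency list above.
PINNED = {
    "torch": "1.9.0",
    "scikit-image": "0.19.3",
    "numpy": "1.21.5",
    "tqdm": "4.64.1",
    "addict": "2.4.0",
    "fvcore": "0.1.5",
    "hickle": "5.0.2",
    "opencv-python": "4.6.0",
    "pandas": "1.3.5",
    "pillow": "9.2.0",
}

def check_versions(pinned):
    """Return {package: (wanted, found_or_None)} for every mismatch."""
    problems = {}
    for name, wanted in pinned.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None
        # Prefix match: "4.6.0.66" or "1.9.0+cu111" still count as a match.
        if found is None or not found.startswith(wanted):
            problems[name] = (wanted, found)
    return problems

if __name__ == "__main__":
    for name, (wanted, found) in check_versions(PINNED).items():
        print(f"{name}: wanted {wanted}, found {found or 'not installed'}")
```

An empty output means every pinned package was found with a matching version prefix.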

Overview

simvp/api contains an experiment runner.
simvp/core contains core training plugins and metrics.
simvp/datasets contains datasets and dataloaders.
simvp/methods/ contains training methods for various video prediction methods.
simvp/models/ contains the main network architectures of various video prediction methods.
simvp/modules/ contains network modules and layers.
tools/non_dist_train.py is the executable Python script, with optional arguments, for the training, validation, and testing pipelines.

Prepare Dataset

  cd ./data/moving_mnist
  bash download_mmnist.sh       # download the Moving MNIST dataset

Start Training

  python main_pretrain.py       # pretraining stage
  python main_train.py          # training stage

Quantitative results on Moving MNIST

Method	MSE	MAE	SSIM
PLA-SM	18.4	57.6	0.960
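For readers unfamiliar with how MSE and MAE reach values like those above, here is a minimal sketch assuming the convention commonly used in the SimVP code base that this project builds on: the error is summed over all pixels of a frame and then averaged over batch and time. Whether PLA-SM computes its metrics in exactly this way, and on which pixel scale, is an assumption here; SSIM is omitted.

```python
import numpy as np

def mse(pred, true):
    """Squared error summed over each frame's pixels, averaged over batch and time.

    pred, true: arrays of shape (B, T, C, H, W).
    """
    return np.mean((pred - true) ** 2, axis=(0, 1)).sum()

def mae(pred, true):
    """Absolute error summed over each frame's pixels, averaged over batch and time."""
    return np.mean(np.abs(pred - true), axis=(0, 1)).sum()
```

Because the error is summed rather than averaged over the pixels of a frame, the values scale with the frame resolution, which is why they are much larger than a per-pixel error would be.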

Qualitative results on Moving MNIST

(a) MAU; (b) PhyDNet; (c) SimVP; (d) Ours.

Citation

If you find this repo useful, please cite the following paper.

@article{li-PLA-SM,
  author    = {Ping Li and Chenhan Zhang and Zheng Yang and Xianghua Xu and Mingli Song},
  title     = {Pair-wise Layer Attention with Spatial Masking for Video Prediction},
  journal   = {arXiv preprint arXiv:2311.11289},
  year      = {2023},
  url       = {https://arxiv.org/abs/2311.11289}
}

Contact

If you have any questions, please feel free to contact Mr. Zhang Chenhan via email (zch2020@hdu.edu.cn).

Acknowledgements

We would like to thank the authors of SimVP for making their source code public, which significantly accelerated the development of PLA-SM.
