
Multi-Perspective Long Short Term Memory

Teaser image

Multi-Perspective LSTM for Joint Visual Representation Learning
Alireza Sepas-Moghaddam, Fernando Pereira, Paulo Lobato Correia, Ali Etemad

CVPR'21 Paper

Abstract: We present a novel LSTM cell architecture, Multi-Perspective LSTM (MP-LSTM), capable of learning both intra- and inter-perspective relationships available in visual sequences captured from multiple perspectives. Our architecture adopts a novel recurrent joint learning strategy that uses additional gates and memories at the cell level. We demonstrate that by using the proposed cell to create a network, more effective and richer visual representations are learned for recognition tasks. We validate the performance of our proposed architecture in the context of two multi-perspective visual recognition tasks, namely lip reading and face recognition. Three relevant datasets are considered and the results are compared against fusion strategies, other existing multi-input LSTM architectures, and alternative recognition solutions. The experiments show the superior performance of our solution over the considered benchmarks, both in terms of recognition accuracy and computational complexity.

Requirements

  • Both Linux and Windows are supported. Linux is recommended for performance and compatibility reasons.
  • 64-bit Python 3.6 installation. We recommend Anaconda3 with numpy 1.19.5 or newer.
  • We recommend TensorFlow 1.14, which we used for all experiments in the paper, but newer versions of TensorFlow might work as well.
  • Keras 2.1.5 is required.
  • The keras-vggface package is required to extract ResNet-50 spatial embeddings (see the sketch after this list).
  • One or more high-end NVIDIA GPUs, NVIDIA drivers, and CUDA 10.0 toolkit.
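
The snippet below is a minimal sketch of per-frame spatial embedding extraction with keras-vggface; the 224x224 input size, the global-average-pooling choice, and the file path are illustrative assumptions and are not taken from this repository.

import numpy as np
from keras.preprocessing import image
from keras_vggface.vggface import VGGFace
from keras_vggface.utils import preprocess_input

# ResNet-50 backbone without the classification head; global average pooling
# gives one 2048-D embedding per frame (pooling choice is an assumption).
extractor = VGGFace(model='resnet50', include_top=False,
                    input_shape=(224, 224, 3), pooling='avg')

def frame_embedding(frame_path):
    # Load a single video frame and return its 2048-D spatial embedding.
    img = image.load_img(frame_path, target_size=(224, 224))
    x = image.img_to_array(img)[None, ...]
    x = preprocess_input(x, version=2)  # version=2 selects ResNet-50 preprocessing
    return extractor.predict(x)[0]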

Preparing Datasets

The OuluVS2, Light Field Faces in the Wild (LFFW), and Light Field Face Constrained (LFFC) datasets are used to evaluate the performance of MP-LSTM. After downloading the datasets, split the data into training, validation, and testing sets as discussed in the OuluVS2 paper and the LFFW and LFFC paper. The files should be organized as follows:

OuluVS2 Dataset  
├  Test Test folder
├  Train Train folder
├  Validation Validation folder
   ├  CAM1 Camera 1 folder
   ├  CAM2 Camera 2 folder
   ├  CAM3 Camera 3 folder
      ├  01 Utterance 1 folder containing speech videos
      ├  02 Utterance 2 folder containing speech videos
      ├  . .
      ├  . .
      ├  . .
      ├  20 Utterance 20 folder containing speech videos
LFFW and LFFC Datasets  
├  Test Test folder
├  Train Train folder
├  Validation Validation folder
   ├  Hor Horizontal viewpoint sequences folder
   ├  Ver Vertical viewpoint sequences folder
      ├  01 Subject 1 folder containing horizontal/vertical videos
      ├  02 Subject 2 folder containing horizontal/vertical videos
      ├  . .
      ├  . .
      ├  . .
      ├  53 Subject 53 folder containing horizontal/vertical videos
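
As a quick sanity check of the layout above, the following sketch walks the assumed OuluVS2 directory tree and counts the utterance folders per camera; the root path and folder names simply mirror the structure shown and may need adjusting to your local copy.

import os

ROOT = 'OuluVS2'                      # illustrative root path
SPLITS = ['Train', 'Validation', 'Test']
CAMERAS = ['CAM1', 'CAM2', 'CAM3']

for split in SPLITS:
    for cam in CAMERAS:
        cam_dir = os.path.join(ROOT, split, cam)
        # Each camera folder is expected to hold utterance folders 01..20.
        utterances = sorted(os.listdir(cam_dir)) if os.path.isdir(cam_dir) else []
        print('%s/%s: %d utterance folders' % (split, cam, len(utterances)))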

Training and Testing

Demo code for training and testing with the 3-perspective combination is available in Training_3Views.py and Testing_3Views.py, respectively. The source code of MP-LSTM for 2 and 3 perspectives is available in Library\MPLSTM_2inputs and Library\MPLSTM_3inputs, respectively. An illustrative sketch of how three per-view embedding sequences feed a joint Keras model is given below.
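
The following is a generic three-input wiring in the spirit of Training_3Views.py, not the repository's actual code: the per-view branches use plain Keras LSTMs as stand-ins for the MP-LSTM cell in Library\MPLSTM_3inputs, and all shapes, layer sizes, and class counts are assumptions.

from keras.models import Model
from keras.layers import Input, LSTM, Dense, concatenate

SEQ_LEN, EMB_DIM, NUM_CLASSES = 30, 2048, 20   # assumed: 20 OuluVS2 utterance classes

view1 = Input(shape=(SEQ_LEN, EMB_DIM))   # CAM1 embedding sequence
view2 = Input(shape=(SEQ_LEN, EMB_DIM))   # CAM2 embedding sequence
view3 = Input(shape=(SEQ_LEN, EMB_DIM))   # CAM3 embedding sequence

# Stand-in recurrent branches: MP-LSTM processes the views jointly inside a
# single cell, whereas this sketch fuses three independent LSTM branches.
h1 = LSTM(256)(view1)
h2 = LSTM(256)(view2)
h3 = LSTM(256)(view3)

joint = concatenate([h1, h2, h3])
probs = Dense(NUM_CLASSES, activation='softmax')(joint)

model = Model(inputs=[view1, view2, view3], outputs=probs)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])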

Inquiries

For inquiries, please contact alireza.sepasmoghaddam@queensu.ca

Citation

@inproceedings{Sepas2021MPLSTM,
  title     = {Multi-Perspective {LSTM} for Joint Visual Representation Learning},
  author    = {Alireza Sepas-Moghaddam and Fernando Pereira and Paulo Lobato Correia and Ali Etemad},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2021}
}
