
LLCP: Learning Latent Causal Processes for Reasoning-based Video Question Answer

This repository contains the implementation for the ICLR 2024 paper LLCP: Learning Latent Causal Processes for Reasoning-based Video Question Answer [pdf].

LLCP is a causal framework designed to enhance video reasoning by focusing on the spatial-temporal dynamics of objects within events, without the need for extensive data annotations. By employing self-supervised learning and leveraging the modularity of causal mechanisms, LLCP learns a multivariate generative model of spatial-temporal dynamics, which enables effective accident attribution and counterfactual prediction for reasoning-based VideoQA.
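
For intuition only, here is a minimal PyTorch sketch of the idea: a conditional VAE is trained self-supervised to predict each object's next-step feature from its current one, and transitions with large reconstruction error are flagged as anomalous (e.g., accidents). All class names, layer sizes, and the loss weighting are illustrative assumptions, not the repository's models_cvae.py.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyCVAE(nn.Module):
    """Illustrative conditional VAE over per-object features.
    NOT the repo's models_cvae.py; dimensions are made up."""
    def __init__(self, feat_dim=512, latent_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(feat_dim * 2, 256), nn.ReLU())
        self.fc_mu = nn.Linear(256, latent_dim)
        self.fc_logvar = nn.Linear(256, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim + feat_dim, 256), nn.ReLU(),
            nn.Linear(256, feat_dim),
        )

    def forward(self, cur_feat, next_feat):
        # Encode the (current, next) transition into a latent distribution.
        h = self.encoder(torch.cat([cur_feat, next_feat], dim=-1))
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        # Decode the next-step feature conditioned on the current one.
        recon = self.decoder(torch.cat([z, cur_feat], dim=-1))
        return recon, mu, logvar

def cvae_loss(recon, target, mu, logvar):
    """Standard ELBO: reconstruction term plus KL regularizer."""
    recon_loss = F.mse_loss(recon, target)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_loss + kl

def anomaly_score(model, cur_feat, next_feat):
    """Transitions the learned dynamics cannot explain score high."""
    recon, _, _ = model(cur_feat, next_feat)
    return F.mse_loss(recon, next_feat, reduction="none").mean(dim=-1)
```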

Environment

First, install recent versions of PyTorch and Torchvision:

pip install torch torchvision

Then install the remaining dependencies:

pip install -r requirements.txt

Download Data

We provide the processed features used in our experiments. Please download the data and models from this link1 and this link2. Then decompress the archives as ./data/ and ./results/ and replace the original folders with the downloaded ones.

The directory structure should look like

LLCP_VQA/
|–– config.py
|–– configs/
|–– data/
|   |–– object_test_feat/
|   |–– object_train_feat/
|   |–– appearance_feat_rn50.h5
|   |–– test_questions.pt
|   |–– train_questions.pt
|   |–– video_noaccident_train.txt
|–– DataLoader.py
|–– models_cvae.py
|–– requirements.txt
|–– results/
|   |–– .../model_cvae49.pt
|–– README.md
|–– train.py
|–– validate.py
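
To sanity-check the downloaded files, you can inspect them before training; the file paths below come from the tree above, but no dataset key names inside the HDF5 file are assumed — the snippet just lists whatever is there.

```python
import h5py
import torch

# Print every dataset in the appearance-feature file with its shape/dtype.
def show(name, obj):
    if isinstance(obj, h5py.Dataset):
        print(name, obj.shape, obj.dtype)

with h5py.File("data/appearance_feat_rn50.h5", "r") as f:
    f.visititems(show)

# The question files are ordinary torch pickles; peek at their structure.
# (weights_only=False is needed on recent PyTorch to load non-tensor pickles.)
questions = torch.load("data/test_questions.pt", weights_only=False)
print(type(questions))
```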

Run Scripts

To train the CVAE model, run:

python train.py --cfg configs/sutd-traffic_transition.yml

To evaluate the trained model, run:

python validate.py --cfg configs/sutd-traffic_transition.yml
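
Both scripts take their hyperparameters from the YAML file passed via --cfg. If you want to see which options an experiment exposes before editing it, a generic load works (assuming PyYAML is installed; no specific key names are assumed here):

```python
import yaml

# Load the experiment config used by train.py / validate.py above.
with open("configs/sutd-traffic_transition.yml") as f:
    cfg = yaml.safe_load(f)

print(cfg.keys())  # inspect what options the config actually exposes
```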

Simulation Experiments of LLCP

See LLCP-Simulation.

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{chen2024llcp,
  title={LLCP: Learning Latent Causal Processes for Reasoning-based Video Question Answer},
  author={Chen, Guangyi and Li, Yuke and Liu, Xiao and Li, Zijian and Al Surad, Eman and Wei, Donglai and Zhang, Kun},
  booktitle={ICLR},
  year={2024}
}

Acknowledgement

Our implementation is mainly based on SUTD-TrafficQA and Tem-Adapter; we thank the authors for releasing their code.
