
Authors' code for "Variational Causal Inference Network for Explanatory Visual Question Answering" and "Integrating Neural-Symbolic Reasoning with Variational Causal Inference Network for Explanatory Visual Question Answering"


Integrating Neural-Symbolic Reasoning with Variational Causal Inference Network for Explanatory Visual Question Answering

Dizhan Xue, Shengsheng Qian, and Changsheng Xu.

MAIS, Institute of Automation, Chinese Academy of Sciences


Data

  1. Download the GQA Dataset.
  2. Download the GQA-OOD Dataset.
  3. Download the bottom-up features and unzip them.
  4. Extract features from the raw tsv files (important: you need to run the code on Linux):
python ./preprocessing/extract_tsv.py --input $TSV_FILE --output $FEATURE_DIR
  5. We provide the annotations of the GQA-REX Dataset in model/processed_data/converted_explanation_train_balanced.json and model/processed_data/converted_explanation_val_balanced.json.
  6. (Optional) You can construct the GQA-REX Dataset yourself by following the instructions of its authors.
  7. Download our generated programs for the GQA dataset.
  8. (Optional) You can generate the programs yourself by following this project.
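The $TSV_FILE and $FEATURE_DIR variables above (and the path variables used in later commands) are placeholders, not locations mandated by the repository. A minimal sketch of one possible setup, with entirely hypothetical paths, might look like:

```shell
# Hypothetical local paths -- adjust to wherever you keep the data.
export GQA_ROOT=./data/gqa            # GQA questions and scene graphs
export OOD_ROOT=./data/gqa-ood        # GQA-OOD dataset
export TSV_FILE=./data/trainval.tsv   # raw bottom-up feature tsv
export FEATURE_DIR=./data/features    # output directory for extracted features

mkdir -p "$FEATURE_DIR"

# Run the extraction only if the tsv is actually in place (Linux only, per step 4):
if [ -f "$TSV_FILE" ]; then
    python ./preprocessing/extract_tsv.py --input "$TSV_FILE" --output "$FEATURE_DIR"
fi
```

Quoting the variables keeps the commands safe if your paths contain spaces.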

Models

We provide four models in model/model/model.py.

Two baselines:

  1. REX-VisualBert is from this project.
  2. REX-LXMERT replaces the VisualBert backbone of REX-VisualBert with LXMERT.

Our two models (both using LXMERT as the backbone):

  1. VCIN is proposed in our ICCV 2023 paper "Variational Causal Inference Network for Explanatory Visual Question Answering".
  2. Pro-VCIN is proposed in "Integrating Neural-Symbolic Reasoning with Variational Causal Inference Network for Explanatory Visual Question Answering".

Training and Test

Before training, you need to first generate the dictionary for questions, answers, explanations, and program modules:

cd ./model
python generate_dictionary.py --question $GQA_ROOT/question --exp $EXP_DIR --pro $PRO_DIR --save ./processed_data

The training process can be called as:

python main.py --mode train --anno_dir $GQA_ROOT/question --ood_dir $OOD_ROOT/data --sg_dir $GQA_ROOT/scene_graph --lang_dir ./processed_data --img_dir $FEATURE_DIR/features --bbox_dir $FEATURE_DIR/box --checkpoint_dir $CHECKPOINT --explainable True

To evaluate on the GQA-testdev set, or to generate a submission file for online evaluation on the test-standard set, call:

python main.py --mode $MODE --anno_dir $GQA_ROOT/question --ood_dir $OOD_ROOT/data --lang_dir ./processed_data --img_dir $FEATURE_DIR/features --weights $CHECKPOINT/model_best.pth --explainable True

and set $MODE to eval or submission accordingly.
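As a sketch only (the run_model wrapper and all paths below are hypothetical, not part of the repository), the two modes could be driven as follows; the command is echoed rather than executed so the sketch runs without the dataset in place:

```shell
# Sketch: guard $MODE before invoking main.py. Only "eval" and
# "submission" are valid; anything else is rejected with an error.
run_model() {
    MODE="$1"
    case "$MODE" in
        eval|submission)
            # Drop "echo" to actually run the command.
            echo python main.py --mode "$MODE" \
                --anno_dir "$GQA_ROOT/question" --ood_dir "$OOD_ROOT/data" \
                --lang_dir ./processed_data --img_dir "$FEATURE_DIR/features" \
                --weights "$CHECKPOINT/model_best.pth" --explainable True
            ;;
        *)
            echo "MODE must be 'eval' or 'submission'" >&2
            return 1
            ;;
    esac
}

run_model eval          # evaluate on GQA-testdev
run_model submission    # build the file for test-standard online evaluation
```

The case guard simply catches typos in $MODE before a long run is launched.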

Reference

If you find our paper or code helpful, please cite it as below. Thanks!

@inproceedings{xue2023variational,
  title={Variational Causal Inference Network for Explanatory Visual Question Answering},
  author={Xue, Dizhan and Qian, Shengsheng and Xu, Changsheng},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={2515--2525},
  year={2023}
}
