
Authors' code for "Variational Causal Inference Network for Explanatory Visual Question Answering" and "Integrating Neural-Symbolic Reasoning with Variational Causal Inference Network for Explanatory Visual Question Answering"


Integrating Neural-Symbolic Reasoning with Variational Causal Inference Network for Explanatory Visual Question Answering

Dizhan Xue, Shengsheng Qian, and Changsheng Xu.

MAIS, Institute of Automation, Chinese Academy of Sciences


Data

  1. Download the GQA Dataset.
  2. Download the GQA-OOD Dataset.
  3. Download the bottom-up features and unzip them.
  4. Extract features from the raw tsv files (important: you need to run the code on Linux):
python ./preprocessing/extract_tsv.py --input $TSV_FILE --output $FEATURE_DIR
  5. We provide the annotations of the GQA-REX Dataset in model/processed_data/converted_explanation_train_balanced.json and model/processed_data/converted_explanation_val_balanced.json.
  6. (Optional) You can construct the GQA-REX Dataset yourself by following the instructions of its authors.
  7. Download our generated programs for the GQA dataset.
  8. (Optional) You can generate the programs yourself by following this project.
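The $TSV_FILE and $FEATURE_DIR variables above (and the path variables used in later commands) are placeholders, not locations mandated by the repository. A minimal sketch of one possible setup, with entirely hypothetical paths, might look like:

```shell
# Hypothetical local paths -- adjust to wherever you keep the data.
export GQA_ROOT=./data/gqa            # GQA questions and scene graphs
export OOD_ROOT=./data/gqa-ood        # GQA-OOD dataset
export TSV_FILE=./data/trainval.tsv   # raw bottom-up feature tsv
export FEATURE_DIR=./data/features    # output directory for extracted features

mkdir -p "$FEATURE_DIR"

# Run the extraction only if the tsv is actually in place (Linux only, per step 4):
if [ -f "$TSV_FILE" ]; then
    python ./preprocessing/extract_tsv.py --input "$TSV_FILE" --output "$FEATURE_DIR"
fi
```

Quoting the variables keeps the commands safe if your paths contain spaces.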

Models

We provide four models in model/model/model.py.

Two baselines:

  1. REX-VisualBert is from this project.
  2. REX-LXMERT replaces the VisualBert backbone of REX-VisualBert with LXMERT.

Our two models (both using LXMERT as the backbone):

  1. VCIN is proposed in our ICCV 2023 paper "Variational Causal Inference Network for Explanatory Visual Question Answering".
  2. Pro-VCIN is proposed in "Integrating Neural-Symbolic Reasoning with Variational Causal Inference Network for Explanatory Visual Question Answering".

Training and Test

Before training, you need to first generate the dictionary for questions, answers, explanations, and program modules:

cd ./model
python generate_dictionary.py --question $GQA_ROOT/question --exp $EXP_DIR --pro $PRO_DIR --save ./processed_data

The training process can be called as:

python main.py --mode train --anno_dir $GQA_ROOT/question --ood_dir $OOD_ROOT/data --sg_dir $GQA_ROOT/scene_graph --lang_dir ./processed_data --img_dir $FEATURE_DIR/features --bbox_dir $FEATURE_DIR/box --checkpoint_dir $CHECKPOINT --explainable True

To evaluate on the GQA-testdev set, or to generate a submission file for online evaluation on the test-standard set, call:

python main.py --mode $MODE --anno_dir $GQA_ROOT/question --ood_dir $OOD_ROOT/data --lang_dir ./processed_data --img_dir $FEATURE_DIR/features --weights $CHECKPOINT/model_best.pth --explainable True

and set $MODE to eval or submission accordingly.
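As a sketch only (the run_model wrapper and all paths below are hypothetical, not part of the repository), the two modes could be driven as follows; the command is echoed rather than executed so the sketch runs without the dataset in place:

```shell
# Sketch: guard $MODE before invoking main.py. Only "eval" and
# "submission" are valid; anything else is rejected with an error.
run_model() {
    MODE="$1"
    case "$MODE" in
        eval|submission)
            # Drop "echo" to actually run the command.
            echo python main.py --mode "$MODE" \
                --anno_dir "$GQA_ROOT/question" --ood_dir "$OOD_ROOT/data" \
                --lang_dir ./processed_data --img_dir "$FEATURE_DIR/features" \
                --weights "$CHECKPOINT/model_best.pth" --explainable True
            ;;
        *)
            echo "MODE must be 'eval' or 'submission'" >&2
            return 1
            ;;
    esac
}

run_model eval          # evaluate on GQA-testdev
run_model submission    # build the file for test-standard online evaluation
```

The case guard simply catches typos in $MODE before a long run is launched.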

Reference

If you find our paper or code helpful, please cite it as below. Thanks!

@inproceedings{xue2023variational,
  title={Variational Causal Inference Network for Explanatory Visual Question Answering},
  author={Xue, Dizhan and Qian, Shengsheng and Xu, Changsheng},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={2515--2525},
  year={2023}
}
