
EIB: Explanation Regeneration via Information Bottleneck

This is the repository for our paper Explanation Regeneration via Information Bottleneck, by Qintong Li, Zhiyong Wu, Lingpeng Kong, and Wei Bi.

Updates

  • [2022/12/19] We have released the preprint of our EIB paper on generating sufficient and concise free-form explanations using an information bottleneck.


Overview

Although large pretrained language models can generate explanations without task-specific training, explanations produced through single-pass prompting often lack sufficiency and conciseness. To address this problem, we develop EIB, an information bottleneck method that produces refined explanations that are sufficient and concise. Our approach regenerates the free-text explanation by polishing the single-pass output of the pretrained language model while retaining the information that supports the content being explained. Experiments on two out-of-domain tasks verify the effectiveness of EIB through automatic evaluation and thorough human evaluation.

Setup

Requirements and dependencies

The code is based on the following requirements/dependencies (we specify the version we used in our experiments in brackets):

  • Python (3.9.7)
  • pytorch (1.12.1)
  • accelerate (0.12.0)

See requirements.txt for more details.

Datasets

Construct MixExpl Corpus

We train the EIB model on MixExpl, a corpus we construct from a variety of existing explanation datasets, consisting of (sample, explanation candidate, qualified explanation) triples.
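For intuition, one MixExpl instance can be pictured as a triple like the following (the field names and text are hypothetical illustrations, not the actual CSV schema; see the released train/dev/test.csv for the real format):

# Hypothetical illustration of a single MixExpl triple (not the actual CSV schema).
mixexpl_example = {
    "sample": "Question: Where would you find a seafood restaurant? Answer: a coastal city",
    "explanation_candidate": "A seafood restaurant is a place that serves seafood ... (noisy single-pass PLM output)",
    "qualified_explanation": "Coastal cities have easy access to fresh seafood, so seafood restaurants are common there.",
}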

Option 1: Please use get_data.sh to download and preprocess the datasets.

bash get_data.sh

The above command: downloads the existing explanation datasets, including Explanation_for_ScienceQA, SenMaking, LiarPlus, pubhealth, E_delta_NLI, ECQA, and e-SNLI; and downloads the ConceptNet-related files (conceptnet_antonym, conceptnet_entity, and negation) and the GLM pretrained model used to construct the MixExpl corpus.

Option 2: You can skip the processing steps in Option 1 and use the off-the-shelf MixExpl corpus for training. Run:

bash download_mixexpl.sh

Prepare Out-of-domain Test tasks

We prompt OPT-13B to generate explanation candidates for arbitrary NLP tasks; the task samples and initial explanation candidates serve as the input of the EIB model during inference.

Option 1: Please use prompt.sh to prompt the PLM and prepare test data for EIB:

bash prompt.sh "prompting"
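Under the hood, the prompting stage roughly amounts to the following minimal sketch (the prompt template, decoding settings, and variable names are illustrative assumptions; the actual prompt formats and decoding strategies are defined in prompt.sh):

# Illustrative only: draft one explanation candidate with OPT.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-13b")
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-13b", torch_dtype=torch.float16, device_map="auto"
)

prompt = (
    "Question: Where would you find a seafood restaurant?\n"
    "Answer: a coastal city\n"
    "Explanation:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=60, do_sample=True, top_p=0.9)
candidate = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(candidate.strip())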

Then we use the pretrained preference classifier to select the explanation candidate most likely to be preferred by humans:

bash prompt.sh "filtering"

The above commands: download facebook/opt-13b and GPT-2-small; steer the PLM (OPT) to produce initial explanation candidates; and use the preference classifier to select one explanation candidate from the set of candidates produced by different decoding strategies and prompt formats.
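Conceptually, the filtering step scores every candidate with the preference classifier and keeps the highest-scoring one. The sketch below assumes a binary sequence classifier over (sample, candidate) pairs and a hypothetical checkpoint path; the released classifier's exact interface may differ.

# Conceptual sketch of candidate filtering; the released classifier's interface may differ.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

CLF_PATH = "path/to/preference_classifier"  # hypothetical checkpoint location
tokenizer = AutoTokenizer.from_pretrained(CLF_PATH)
classifier = AutoModelForSequenceClassification.from_pretrained(CLF_PATH)
classifier.eval()

def pick_best_candidate(sample, candidates):
    """Return the candidate the classifier scores as most likely to be preferred."""
    scores = []
    for cand in candidates:
        enc = tokenizer(sample, cand, return_tensors="pt", truncation=True)
        with torch.no_grad():
            logits = classifier(**enc).logits
        scores.append(logits.softmax(dim=-1)[0, 1].item())  # probability of "preferred"
    return candidates[scores.index(max(scores))]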

Option 2: You can also skip the processing steps in Option 1 and use the off-the-shelf test datasets for testing. Run:

bash download_test.sh 

Here is the (incomplete) directory structure:

|-- EIB
    |-- code
    |-- data
        |-- unify.py
        |-- retrieve.py
        |-- prompt.py
        |-- process.py
        |-- explanation_datasets
        |   |-- ecqa
        |   |-- esnli
        |   |-- senmaking
        |   |-- ...
        |   |-- unify_expl_dataset.json  # unifying result
        |-- utils
        |   |-- contriever_src  # help us construct MixExpl
        |   |-- infilling  # help us construct MixExpl
        |   |   |-- fill_blank_glm.py  
        |   |   |-- glm
        |   |   |   |-- blocklm-2b-512  # storing GLM pre-trained model
        |   |-- facebook
        |   |   |-- opt-13b
        |   |-- gpt2
        |-- MixExpl  # FILES UNDER THIS DIRECTORY ARE NEEDED FOR TRAINING EIB
        |   |-- train/dev/test.csv   # MixExpl for training EIB
        |   |-- prompt_ecqa_test.csv   # OoD downstream testing data
        |   |-- prompt_filter_ecqa_test.csv
        |   |-- prompt_esnli_test.csv
        |   |-- prompt_filter_esnli_test.csv
    |-- get_data.sh
    |-- download_mixexpl.sh
    |-- prompt.sh
    |-- download_test.sh

EIB

EIB is an explanation regeneration method that learns to inherit task-sample-relevant information from an explanation candidate and regenerate a sufficient and concise explanation. EIB treats explanation regeneration as sampling from an IB distribution (see the schematic sketch after this list). Specifically:

  • A refiner polishes an explanation hypothesis into a compact continuous representation T.
  • A T-information term optimizes the polishing process so that T retains usable information from the hypothesis for predicting the explained sample (e.g., question and answer).
  • Conditioned on the sample and the (noisy) explanation hypothesis, the refiner maps the continuous vectors T into a qualified explanation.
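At a high level, training trades off compressing the explanation hypothesis into T against keeping T predictive of the explained sample, then decodes a refined explanation from T. The following Python-style sketch is purely schematic: the refiner interface, loss names, and weighting are assumptions for exposition, not the released implementation.

# Schematic EIB-style training step (interfaces and loss names are illustrative only).
def eib_training_step(refiner, sample, candidate, gold_explanation, beta=1.0):
    # 1) Compress the noisy explanation hypothesis into a compact latent representation T.
    t, compression_loss = refiner.encode(candidate)              # bounds I(candidate; T)
    # 2) Keep T informative about the explained sample (question and answer).
    prediction_loss = refiner.sample_nll(t, sample)              # the T-information term
    # 3) Decode a qualified explanation conditioned on T, the sample, and the candidate.
    generation_loss = refiner.explanation_nll(t, sample, candidate, gold_explanation)
    return compression_loss + beta * prediction_loss + generation_loss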

Training

The code of the EIB model is stored in code/EIB_model.

cd code/EIB_model
mkdir EIB_train_cpt # the trained checkpoint will be stored here
bash train.sh 'EIB' '../../data/MixExpl' '../../data/utils/gpt2' '0,1,2,3,4,5,6,7'

Arguments:

  • $1: Task name (EIB)
  • $2: Path to the training data (MixExpl)
  • $3: Path to the pretrained backbone for initializing the EIB model (pretrained GPT-2)
  • $4: GPU device ids to use for training

We also provide a trained checkpoint that you can use directly. Run:

cd code/EIB_model
bash download.sh

Inference

In this project, we choose two tasks to test the zero-shot performance of EIB, i.e., ECQA and e-SNLI.

We have already processed the test sets in the section Prepare Out-of-domain Test tasks. We can now call the inference script to test the performance of EIB.

cd code/EIB_model
bash infer.sh 'EIB' '../../data/MixExpl/prompt_filter_ecqa_test.csv' 'EIB_train_cpt/' '../../data/utils/gpt2' 'ppl_bleu_dist' '0'

The predictions are saved in the EIB_train_cpt/ directory.

You can also use your own test sets for inference. Process your data into CSV format with three columns: task_ipt (e.g., questions), task_opt (e.g., answers), and expl_ipt (e.g., explanation candidates).
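A minimal sketch of producing such a file (the example row is a placeholder):

# Write a custom test set in the expected three-column CSV format.
import csv

rows = [
    {
        "task_ipt": "Question: Why do leaves change color in autumn?",
        "task_opt": "Because chlorophyll breaks down.",
        "expl_ipt": "Leaves change color because ... (a noisy candidate from a PLM)",
    },
]

with open("my_test.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["task_ipt", "task_opt", "expl_ipt"])
    writer.writeheader()
    writer.writerows(rows)

Then run inference on your file: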

cd code/EIB_model
bash infer.sh 'EIB' 'path_to_your_csv_file' 'EIB_train_cpt/' '../../data/utils/gpt2' 'generate' '0'

Baselines

Run the BottleSum baseline:

cd code/BottleSum
bash run.sh 'data/ecqa_b1.txt' 'data/ecqa_sample.txt' 'data/ecqa_bs_result.txt'

Predictions are stored as ecqa_bs_result.txt or esnli_bs_result.txt.

Run the Prompting-Filter baseline:

cd code/Prompting-Filter
bash run.sh '../../data/explanation_datasets/esnli/esnli_explanation_cands.csv' 'esnli_pf_result.json' 'esnli'
bash run.sh '../../data/explanation_datasets/ecqa/ecqa_explanation_cands.csv' 'ecqa_pf_result.json' 'ecqa'

Predictions are stored as ecqa_pf_result.json or esnli_pf_result.json.

Supervised

Code for training:

cd code/Supervised
bash train.sh 'ft_ecqa' '../../data/utils/gpt2' '0,1,2,3,4,5,6,7'

Code for inference:

cd code/Supervised
bash infer.sh 'ft_ecqa' '../../data/utils/gpt2' '0'

Predictions are stored in ft_ecqa and ft_esnli.

Contact

Please open a GitHub issue or contact Qintong Li (qtleo@outlook.com) with any questions.
