
PACIT

  • This repo releases our implementation of the PACIT model.
  • PACIT is a simple and effective in-context instruction tuning method inspired by the pedagogical concept of desirable difficulty. PACIT unlocks the power of examples by encouraging the model to actively learn the distinctions between positive and negative examples instead of merely reading them: the model first verifies the correctness of a provided example against the task description, and that verification is then used as the condition for generating a better response to the task instance (see the sketch below).
  • It is built on the pretrained T5 and LLaMA models and fine-tuned on the SuperNI dataset.
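
To make the two-step recipe concrete, here is a purely illustrative sketch of a PACIT-style instance. The field names and prompt wording are hypothetical and may differ from the exact format used in this repo:

    # Hypothetical illustration of a PACIT-style instance (not the repo's exact format).
    # PACIT first asks the model to verify an example, then conditions the answer
    # to the task instance on that verification.
    instance = {
        "task_description": "Classify the sentiment of the sentence as positive or negative.",
        "example_input": "I loved this movie!",
        "example_output": "negative",  # deliberately wrong, i.e. a negative example
        "instance_input": "The plot was dull and the acting was worse.",
    }

    # Step 1: verify the example against the task description.
    verification_prompt = (
        f"Task: {instance['task_description']}\n"
        f"Example input: {instance['example_input']}\n"
        f"Example output: {instance['example_output']}\n"
        "Is this example correct for the task? Answer yes or no."
    )

    # Step 2: the verification result becomes the condition for answering the instance.
    response_prompt = (
        verification_prompt
        + "\nVerification: no\n"
        + f"Now respond to the task instance: {instance['instance_input']}"
    )
    print(response_prompt)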


Model Checkpoint

Model checkpoints are published on the Hugging Face Hub: T5-770M, T5-3B, and LLaMA-2-7B.
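
As a minimal sketch, the T5 checkpoints can be loaded with the transformers library. The Hub repository ID below is a placeholder; substitute the ID of the actual published checkpoint (the LLaMA checkpoint would use AutoModelForCausalLM instead):

    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    # Placeholder Hub ID -- replace with the actual published checkpoint name.
    model_id = "XueTianci/PACIT-T5-770M"

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Generate a response for a toy instruction.
    inputs = tokenizer("Definition: answer the question. Question: What is 2+2?",
                       return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=32)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))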

Requirements

Our main experiments and analysis are conducted in the following environment:

  • CUDA (12.2)
  • PyTorch (2.1.2)
  • Transformers (4.36.2)

Install the remaining dependencies with:

    pip install -r requirements.txt
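
A quick sanity check (not part of the repo) to confirm the installed versions roughly match the ones listed above:

    import torch
    import transformers

    # Verify the environment matches the versions listed above.
    print("PyTorch:", torch.__version__)
    print("Transformers:", transformers.__version__)
    print("CUDA available:", torch.cuda.is_available())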

Quick Start

For a quick start, you can run the scripts below directly to process the data, then train and evaluate PACIT.

Dataset Processing

Our models are trained and evaluated on Super-NaturalInstructions, which can be cloned by running:

    git clone git@github.com:allenai/natural-instructions.git data

Put the data under the path "./Tk-Instruct-main/data".

We use Tk-Instruct to generate our base dataset and then further process it with our Python code. You can run the processing script directly:

    bash script/data_processing.sh
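
For orientation, each Super-NaturalInstructions task is a single JSON file containing a definition, positive and negative examples, and instances. A small inspection snippet, assuming the standard natural-instructions layout (adjust the task file name to one present in your clone):

    import json

    # Field names follow the natural-instructions task schema.
    path = "./Tk-Instruct-main/data/tasks/task001_quoref_question_generation.json"
    with open(path) as f:
        task = json.load(f)

    print("Definition:", task["Definition"][0])
    print("Positive examples:", len(task["Positive Examples"]))
    print("Negative examples:", len(task["Negative Examples"]))
    print("Instances:", len(task["Instances"]))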

Training for baseline (SuperNI)

A sample script for training the baseline models is provided. You can run it as follows:

    bash script/SuperNI_train

Training for PACIT

A sample script for training PACIT models is provided. You can run it as follows:

    bash script/PACIT_train

Evaluation

A sample script for evaluating the baseline and PACIT models is provided. You can run it as follows:

    bash script/SuperNI_evaluation_generation
    bash script/PACIT_GT_evaluation_generation
    bash script/PACIT_Random_evaluation_generation
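
Evaluation reports ROUGE-L. For reference (this is not the repo's evaluation code), the rouge_score package computes the metric as follows:

    from rouge_score import rouge_scorer

    # ROUGE-L F-measure between a reference output and a model prediction.
    scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=True)
    score = scorer.score("the cat sat on the mat", "a cat sat on a mat")
    print(score["rougeL"].fmeasure)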

Model Performance

Here are the performance numbers (in ROUGE-L) for our tested models:

[Performance figure: ROUGE-L scores of the baseline and PACIT models]

Citation

    @misc{xue2024pacit,
          title={PACIT: Unlocking the Power of Examples for Better In-Context Instruction Tuning},
          author={Tianci Xue and Ziqi Wang and Yixia Li and Yun Chen and Guanhua Chen},
          year={2024},
          eprint={2310.00901},
          archivePrefix={arXiv},
          primaryClass={cs.CL}
    }
