[ICLR 2024] This is the official implementation for the paper: "Beyond imitation: Leveraging fine-grained quality signals for alignment"

FIGA

This repository is the official implementation of the ICLR 2024 paper: Beyond Imitation: Leveraging Fine-grained Quality Signals for Alignment.

Quick Start

Because a modified version of transformers will be installed, we recommend creating a new conda environment:

conda create -n FIGA python=3.8
conda activate FIGA
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia

Next, clone the FIGA repository and install its requirements:

git clone https://github.com/RUCAIBox/FIGA.git && cd FIGA
pip install -r requirements.txt

After this, you need to replace the trainer_utils.py and modeling_llama.py files in the transformers library with the corresponding files from this repository. This is necessary for fine-tuning using the FIGA method.
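The replacement step above can be scripted. The following is a minimal sketch, not part of the repository: the helper name `replace_transformers_files` and the `.orig` backup suffix are hypothetical, and the caller must supply the folder that actually holds the FIGA versions of the two files (its location inside the repository is not stated here). The relative paths reflect the usual layout of the transformers package, where `trainer_utils.py` sits at the package root and `modeling_llama.py` under `models/llama/`.

```python
import shutil
from pathlib import Path

# Where each file normally lives inside an installed transformers package.
TARGETS = {
    "trainer_utils.py": Path("trainer_utils.py"),
    "modeling_llama.py": Path("models") / "llama" / "modeling_llama.py",
}


def replace_transformers_files(source_dir, package_dir, backup_suffix=".orig"):
    """Copy the FIGA versions of the files over the installed ones.

    source_dir:  folder holding the modified files (assumption: supplied by the caller).
    package_dir: root folder of the installed transformers package.
    Each original file is backed up once before it is overwritten.
    """
    source_dir, package_dir = Path(source_dir), Path(package_dir)
    replaced = []
    for name, relative_path in TARGETS.items():
        src = source_dir / name
        dst = package_dir / relative_path
        if not src.is_file():
            raise FileNotFoundError(f"modified file not found: {src}")
        if not dst.is_file():
            raise FileNotFoundError(f"installed file not found: {dst}")
        backup = dst.with_name(dst.name + backup_suffix)
        if not backup.exists():
            # Keep the first backup so repeated runs never overwrite the original.
            shutil.copy2(dst, backup)
        shutil.copy2(src, dst)
        replaced.append(dst)
    return replaced
```

Inside the FIGA environment, `package_dir` can be obtained with `Path(transformers.__file__).parent` after `import transformers`. Keeping a backup makes it easy to restore the stock files later.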

SPA Dataset

You can download the SPA dataset from: https://huggingface.co/datasets/RUCAIBox/SPA.

In our publicly available SPA dataset, the output field is the ground truth response, the original_output field contains the response generated by the alpaca-7b model, and the revised_output field contains that response after revision by a more powerful model (i.e., ChatGPT-3.5). For a detailed description of how the SPA dataset was constructed, please refer to our paper.
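To get a feel for how original_output and revised_output relate, the sketch below compares the two fields of a single record at the word level. It is only an illustration built on Python's standard `difflib`, not the procedure used in the paper or in this repository. The helper name `diff_tokens` and the sample record are made up for the example, and only the three fields described above are assumed.

```python
import difflib


def diff_tokens(original, revised):
    """Return (removed, added) whitespace-separated tokens between two responses."""
    a = original.split()
    b = revised.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    removed, added = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed.extend(a[i1:i2])  # tokens present only in the original
        if tag in ("replace", "insert"):
            added.extend(b[j1:j2])    # tokens present only in the revision
    return removed, added


# Hypothetical record; real records come from the SPA dataset.
record = {
    "output": "Paris is the capital of France.",
    "original_output": "The capital of France is Lyon.",
    "revised_output": "The capital of France is Paris.",
}

removed, added = diff_tokens(record["original_output"], record["revised_output"])
print("removed:", removed)  # removed: ['Lyon.']
print("added:", added)      # added: ['Paris.']
```

Splitting on whitespace keeps the example simple. A tokenizer-level comparison would be closer to what a model sees during training, but it is out of scope for this sketch.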

Instruction tuning

After setting up the environment, you can fine-tune the model with the FIGA method by running the following command:

bash bash/run_7b.sh > output.log 2>&1

Acknowledgment

Please cite the following paper if you find our code or data helpful.

@article{guo2023beyond,
  title={Beyond imitation: Leveraging fine-grained quality signals for alignment},
  author={Guo, Geyang and Zhao, Ranchi and Tang, Tianyi and Zhao, Wayne Xin and Wen, Ji-Rong},
  journal={arXiv preprint arXiv:2311.04072},
  year={2023}
}
