SaySelf

Public code repo for the paper "SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales"

Overview

We present SaySelf, a training framework that teaches LLMs to express more accurate, fine-grained confidence estimates. Beyond the confidence scores, SaySelf also directs LLMs to produce self-reflective rationales that clearly identify gaps in their parametric knowledge and explain their uncertainty.

An overview of SaySelf:

[Figure: the SaySelf approach]

Requirements

The Python packages required to run this repo are listed in requirements.txt. Create a new conda or venv environment, then run

pip install -r requirements.txt

to install them.

You may need to set environment variables OPENAI_API_KEY and OPENAI_API_VERSION to run some scripts.
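
For reference, here is a minimal sketch (assuming nothing beyond the two variable names above) of how you might verify they are set before launching those scripts; the repository's scripts read these variables on their own:

import os

# Illustrative check only: fail early if either variable is missing.
required = ["OPENAI_API_KEY", "OPENAI_API_VERSION"]
missing = [name for name in required if not os.environ.get(name)]
if missing:
    raise RuntimeError("Missing environment variables: " + ", ".join(missing))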

Training

Code for Stage-1 training (SFT) is in finetune.py. stage_1_finetune.sh provides a ready-to-run script for Stage-1 training; edit the variables at the beginning of the script to set the training hyperparameters and to run SFT on different datasets.

Note

./datasets/stage_1/ contains multiple datasets for Stage-1 training:
sft_reason_conf.jsonl is for standard SaySelf training;
sft_without_reason_conf.jsonl is a baseline for SaySelf without self-reflection and without confidence;
sft_without_reason_with_conf.jsonl is a baseline for SaySelf with confidence but without self-reflection;
sft_reason_rtuning.jsonl is a baseline for R-Tuning;
sft_reason_calimath.jsonl is a baseline for GCE, as described in the paper.
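
If you want to double-check which dataset a run will use, a quick way is to inspect the first record of the chosen JSONL file. This is only an illustrative snippet; the field names inside each record are whatever the files actually contain and are not assumed here:

import json

# Print the first record of a Stage-1 dataset to see its structure.
path = "./datasets/stage_1/sft_reason_conf.jsonl"
with open(path, encoding="utf-8") as f:
    first_record = json.loads(f.readline())
print(json.dumps(first_record, indent=2, ensure_ascii=False))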

Code for Stage-2 training is in rlhf_train.py; you can use stage_2_finetune.sh to run it. To reproduce the ablation study on the reward function, you will need to manually edit the calculate_reward function in the Stage-2 training code.
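
For orientation before editing, the sketch below shows one hypothetical shape such a reward could take. It is not the repository's calculate_reward, only an illustration of rewarding confidence that agrees with answer correctness; the 1-10 confidence scale is an assumption.

def calculate_reward(is_correct: bool, confidence: int) -> float:
    # Hypothetical reward: push confidence up on correct answers and down on wrong ones.
    # Assumes confidence is an integer on a 1-10 scale.
    scaled = confidence / 10.0
    return scaled if is_correct else -scaled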

Evaluation

Evaluation results in Tables 1, 2, and 3 are generated with the script evaluate.py. For PEFT models, run it via evaluate_dataset_peft_model.sh; for other models, run it via evaluate_dataset_model.sh.
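
As background on what these tables measure, calibration metrics such as the expected calibration error (ECE) can be computed from (confidence, correctness) pairs. The snippet below is a standard binned formulation for reference, not the code inside evaluate.py:

import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    # confidences: values in [0, 1]; correct: 1.0 if the answer is right, else 0.0.
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            # Weight each bin's |accuracy - avg confidence| gap by its sample share.
            ece += in_bin.mean() * abs(correct[in_bin].mean() - confidences[in_bin].mean())
    return ece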

Faithfulness evaluation results in Table 4 are generated in two steps:

  1. Generate self-reflections in batch with generate_reasons_for_evaluation.py.
  2. Once the self-reflections are generated, evaluate them with evaluate_reasons.py.

Caution

Due to limitations of vLLM, before generating self-reflections with a PEFT model (the format our code saves to by default), please merge the PEFT adapter into the original base model first (the repository provides a function for this).
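
Below is a minimal sketch of that merge step using the peft library; model names and paths are placeholders, and the repo's own helper function is the authoritative version:

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_name = "path/or/name-of-base-model"   # placeholder
adapter_dir = "path/to/saved-peft-adapter"       # placeholder: directory with the trained adapter
merged_dir = "path/to/merged-model"              # placeholder: output directory to load with vLLM

base = AutoModelForCausalLM.from_pretrained(base_model_name)
model = PeftModel.from_pretrained(base, adapter_dir)
merged = model.merge_and_unload()                # fold the PEFT/LoRA weights into the base model
merged.save_pretrained(merged_dir)
AutoTokenizer.from_pretrained(base_model_name).save_pretrained(merged_dir)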

Citation

If you use or extend our work, please consider citing our paper. Thank you for your support! 🥰

@article{xu2024sayself,
      title={SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales}, 
      author={Xu, Tianyang and Wu, Shujin and Diao, Shizhe and Liu, Xiaoze and Wang, Xingyao and Chen, Yangyi and Gao, Jing},
      journal={arXiv preprint arXiv:2405.20974},
      year={2024}
}
