RUCAIBox/LLM-Knowledge-Boundary

 
 

LLM-Knowledge-Boundary

See our paper: Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation.

🚀 Quick Start

  1. Preprocess data and install dependencies. Replace [nq/tq/hq] with one of the three dataset names.

    bash preparation.sh
    python data_preparation.py -d [nq/tq/hq]
  2. Generate supporting documents with ChatGPT (using the Natural Questions dataset as an example).

    OPENAI_API_KEY=[your api key] \
    python run_llm.py \
        --source=data/source/nq.json \
        --usechat \
        --type=generate \
        --ra=none \
        --outfile=data/source/nq-chat.json
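
The two Quick Start steps can also be assembled from a small script, for example to repeat them for each dataset. The following is only a sketch: the helper names are hypothetical, and the paths for tq and hq are assumed to follow the same pattern as the nq example above, which this README does not confirm. It only builds and prints the commands; it does not call the API.

```python
import shlex

# The three values that the README shows for the -d option.
DATASETS = ["nq", "tq", "hq"]


def preparation_commands(datasets=DATASETS):
    """Return one data_preparation.py command (as an argv list) per dataset."""
    return [["python", "data_preparation.py", "-d", name] for name in datasets]


def generate_command(dataset):
    """Build the document-generation command shown in step 2.

    Assumption: data/source/<dataset>.json and data/source/<dataset>-chat.json
    follow the naming of the nq example for the other datasets as well.
    """
    return [
        "python", "run_llm.py",
        f"--source=data/source/{dataset}.json",
        "--usechat",
        "--type=generate",
        "--ra=none",
        f"--outfile=data/source/{dataset}-chat.json",
    ]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before running them.
    for argv in preparation_commands():
        print(shlex.join(argv))
    print(shlex.join(generate_command("nq")))
```

To actually run a printed command, OPENAI_API_KEY must be set in the environment, as in the shell examples above.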

🔍 Conduct Experiments

  1. Question answering (here without retrieval augmentation, --ra=none).
    OPENAI_API_KEY=[your api key] \
    python run_llm.py \
        --source=data/source/nq-chat.json \
        --usechat \
        --type=qa \
        --ra=none \
        --outfile=data/qa/nq-none-qa.json
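  2. Priori judgement: the model judges whether it can answer the question before answering it (here with dense retrieval, --ra=dense).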
    OPENAI_API_KEY=[your api key] \
    python run_llm.py \
        --source=data/source/nq-chat.json \
        --usechat \
        --type=prior \
        --ra=dense \
        --outfile=data/prior/nq-dense-prior.json
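  3. Posteriori judgement: the model judges whether its own answer is correct, so the source is the output file of step 1 (here with sparse retrieval, --ra=sparse).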
    OPENAI_API_KEY=[your api key] \
    python run_llm.py \
        --source=data/qa/nq-none-qa.json \
        --usechat \
        --type=post \
        --ra=sparse \
        --outfile=data/post/nq-sparse-post.json
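
The three experiment commands share one shape and differ only in --source, --type, --ra, and --outfile. A minimal sketch of a wrapper is given below, assuming the output naming data/<type>/<dataset>-<ra>-<type>.json seen in the examples above. The function names are hypothetical, and the accepted values are limited to those shown in this README; run_llm.py may accept others.

```python
import os
import shlex
import subprocess

# Only the values that appear in the README examples; an assumption, not the full CLI.
TASKS = ("qa", "prior", "post")
RETRIEVAL = ("none", "dense", "sparse")


def experiment_command(dataset, task, ra, source):
    """Build a run_llm.py command following the README's naming pattern."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    if ra not in RETRIEVAL:
        raise ValueError(f"unknown retrieval setting: {ra!r}")
    return [
        "python", "run_llm.py",
        f"--source={source}",
        "--usechat",
        f"--type={task}",
        f"--ra={ra}",
        f"--outfile=data/{task}/{dataset}-{ra}-{task}.json",
    ]


def run(argv, dry_run=True):
    """Print the command, or execute it when dry_run is False."""
    if dry_run:
        print(shlex.join(argv))
        return None
    if not os.environ.get("OPENAI_API_KEY"):
        raise RuntimeError("OPENAI_API_KEY is not set")
    return subprocess.run(argv, check=True)


if __name__ == "__main__":
    qa = experiment_command("nq", "qa", "none", "data/source/nq-chat.json")
    prior = experiment_command("nq", "prior", "dense", "data/source/nq-chat.json")
    # Posteriori judgement reads the answers written by the QA step.
    post = experiment_command("nq", "post", "sparse", "data/qa/nq-none-qa.json")
    for argv in (qa, prior, post):
        run(argv)  # dry run: prints the commands only
```

The dry-run default keeps the sketch from spending API quota by accident; pass dry_run=False once the printed commands look right.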

🌟 Acknowledgement

Please cite the following paper if you find our code helpful.

@article{ren2023investigating,
  title={Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation},
  author={Ren, Ruiyang and Wang, Yuhao and Qu, Yingqi and Zhao, Wayne Xin and Liu, Jing and Tian, Hao and Wu, Hua and Wen, Ji-Rong and Wang, Haifeng},
  journal={arXiv preprint arXiv:2307.11019},
  year={2023}
}

About

Implementation of "Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation"
