./msmarco_jsons:
we filter out MSMARCO examples that lack a correct answer and randomly sample the same amount of data as TriviaQA; these are the dev and test sets of MSMARCO we use.
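The filtering-and-sampling step above can be sketched as follows (a minimal sketch, not the project's actual preprocessing script; the `answers` field and the "No Answer Present." marker follow the public MSMARCO QnA schema, which is an assumption here):

```python
import random

def filter_and_sample(examples, target_size, seed=0):
    """Keep only examples with a usable answer, then downsample."""
    # Treat empty answer lists and the MSMARCO "No Answer Present."
    # marker as missing answers (assumed field names).
    answerable = [
        ex for ex in examples
        if ex.get("answers") and ex["answers"] != ["No Answer Present."]
    ]
    random.seed(seed)
    # Sample down to the same amount of data as TriviaQA.
    return random.sample(answerable, min(target_size, len(answerable)))

examples = [
    {"query": "q1", "answers": ["a1"]},
    {"query": "q2", "answers": ["No Answer Present."]},
    {"query": "q3", "answers": []},
]
kept = filter_and_sample(examples, target_size=1)
```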
The JSON format we use for fine-tuning and inference is as follows:
[
{
"instruction": "Please extract the answer keyword",
"input": "Question: Kim Carnes' nine weeks at No 1 with Bette Davis Eyes was interrupted for one week by which song? Answer: Stars on 45 medley ",
"output": "Processed answer: Stars on 45"
}
]
./retriever
for retrieving knowledge from the knowledge base
to generate embeddings of the knowledge base, do
python generate_embedding.py
to retrieve knowledge by embedding, do
python retriever.py
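The retrieval step amounts to nearest-neighbor search over the precomputed embeddings; a minimal NumPy sketch of cosine-similarity top-k lookup (the actual logic lives in generate_embedding.py and retriever.py; the toy vectors below are illustrative only):

```python
import numpy as np

def top_k(query_emb, kb_embs, k=2):
    """Return indices of the k knowledge-base entries most similar
    to the query under cosine similarity."""
    q = query_emb / np.linalg.norm(query_emb)
    kb = kb_embs / np.linalg.norm(kb_embs, axis=1, keepdims=True)
    sims = kb @ q                  # cosine similarity of each row to q
    return np.argsort(-sims)[:k]  # indices sorted by descending similarity

# Toy 2-D "embeddings" standing in for real encoder output.
kb = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
query = np.array([1.0, 0.1])
idx = top_k(query, kb, k=2)
```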
./K_PP_glm6b
for the knowledge integration and post-processing parts of our framework, using GLM-6B
to process the JSON file, do
sh cover_jasonl.sh
to finetune models, do
sh finetune.sh
to infer, do
python infer.py --saveI 1
./K_PP_LLaMA
for the knowledge integration and post-processing parts of our framework, using LLaMA-65B
to finetune models, do
sh finetune.sh
to infer, do
sh generate.sh
./RM
for the reward model part of our framework
to finetune models, do
sh train_rm.sh
to infer, do
sh inference_rm.sh
./Consistency-calc
for calculating the consistency and fluency of generated answers
to calculate, do
sh test.sh