Official implementation of our proposed unified and flexible factuality evaluation pipeline.
For demonstration, we provide 20 samples for each dataset and source LLM in the directory generation/.
You can install all libraries and dependencies with:
pip install -r requirements.txt
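If you prefer to keep these dependencies separate from your system packages, one option is to install them into a virtual environment. The following is a minimal sketch, assuming python3 with the standard venv module is available; the directory name .venv is an arbitrary choice for illustration and is not prescribed by this repository.

```shell
# Create an isolated environment in ./.venv (hypothetical directory name).
python3 -m venv .venv

# Activate it for the current shell session.
. .venv/bin/activate

# Install the pinned dependencies if the requirements file is present.
if [ -f requirements.txt ]; then
    python -m pip install -r requirements.txt
else
    echo "requirements.txt not found; run this from the repository root" >&2
fi
```

Run the commands from the repository root so that requirements.txt is found. The environment stays active only in the current shell; run the activate line again in new sessions before calling run.sh.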
You can adjust the parameters and run:
bash run.sh