This repository contains the code and data for the paper "A Comprehensive Evaluation on Event Reasoning of Large Language Models".
All benchmark data is provided in the data directory.
RQ1: src/eval_EV2.sh
RQ3: src/eval_rq3.sh
RQ4: src/eval_memory.sh
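
The scripts listed above can presumably be launched from the repository root with bash. The following is a minimal sketch, not the authors' documented workflow: the helper name run_rq is hypothetical, and it assumes the scripts take no command-line arguments. They may in fact need model paths, API keys, or other settings that are not described here, so check each script before running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): run one evaluation
# script if it exists, otherwise report that it was skipped.
# Assumption: invoked from the repository root, scripts need no arguments.
run_rq() {
  local name="$1" script="$2"
  if [ -f "$script" ]; then
    echo "Running $name: $script"
    bash "$script"
  else
    echo "Skipping $name: $script not found"
  fi
}

run_rq RQ1 src/eval_EV2.sh
run_rq RQ3 src/eval_rq3.sh
run_rq RQ4 src/eval_memory.sh
```

Wrapping each call in a file check keeps the loop from aborting when one script is missing or has been renamed.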
We will continue to maintain this code and add more instructions.
If you use this code or data, please cite:
@misc{tao2024comprehensiveevaluationeventreasoning,
title={A Comprehensive Evaluation on Event Reasoning of Large Language Models},
author={Zhengwei Tao and Zhi Jin and Yifan Zhang and Xiancai Chen and Haiyan Zhao and Jia Li and Bing Liang and Chongyang Tao and Qun Liu and Kam-Fai Wong},
year={2024},
eprint={2404.17513},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2404.17513},
}