Preparation
tqdm
openai
vllm
torch
Run MADKE
- We used the GPT3.5 and GPT4 apis provided by Microsoft Azure to complete the experiment. You need to specify an endpoint, deployment id, and api key.
- We used VLLM to deploy Qwen1.5-32B-Chat and Qwen1.5-72B-Chat, and then called the API. Run the corresponding model with the following code.
python -m vllm.entrypoints.openai.api_server --model /home/llm_models/Qwen1.5-32B-Chat --served-model-name qwen1.5-32b-chat
python madke.py --test_data_path ./data/triviaqa/triviaqa_test_retrieval_all_evidence.json \
--test_output_path ./results/predictions/triviaqa/triviaqa_test_500_google_chatgpt_a2r3.json \
--endpoint YOUR_ENDPOINT \
--deployment_id YOUR_DEPLOYMENT_ID \
--api_key YOUR_API_KEY
Run Evaluation
python evaluation.py \
--endpoint YOUR_ENDPOINT \
--deployment_id YOUR_DEPLOYMENT_ID \
--api_key YOUR_API_KEY
If you find this work helpful, we kindly request that citations refer to the arXiv version:
@misc{wang2024learningbreak,
title={Learning to Break: Knowledge-Enhanced Reasoning in Multi-Agent Debate System},
author={Haotian Wang and Xiyuan Du and Weijiang Yu and Qianglong Chen and Kun Zhu and Zheng Chu and Lian Yan and Yi Guan},
year={2024},
eprint={2312.04854},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2312.04854},
}