Code for the paper "Defending LLM-based Multi-Agent Systems Against Cooperative Attacks with Sentence-Level Rectification".
We recommend using conda to manage the environment.
conda create -n star python=3.8
conda activate star
pip install -r requirements.txt
STAR relies on LLM APIs.
Please configure your API credentials in llm_api.py:
from openai import OpenAI

qwen_client = OpenAI(
    api_key="your_api_key",
    base_url="your_base_url",
)
gpt_client = OpenAI(
    api_key="your_api_key",
    base_url="your_base_url",
)
Replace api_key and base_url with your own credentials.
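If you prefer not to hard-code credentials in llm_api.py, one option is to read them from environment variables. The sketch below is only an illustration: the helper name client_kwargs and the variable names QWEN_API_KEY, QWEN_BASE_URL, GPT_API_KEY and GPT_BASE_URL are assumptions, not part of this repository.

```python
import os


def client_kwargs(prefix, env=os.environ):
    """Build keyword arguments for OpenAI(...) from environment variables.

    Hypothetical helper: reads <PREFIX>_API_KEY and <PREFIX>_BASE_URL.
    """
    api_key = env.get(f"{prefix}_API_KEY")
    base_url = env.get(f"{prefix}_BASE_URL")
    if not api_key:
        # Fail early instead of sending requests with an empty key.
        raise RuntimeError(f"{prefix}_API_KEY is not set")
    kwargs = {"api_key": api_key}
    if base_url:
        # Only pass base_url when it is set, so the client default applies otherwise.
        kwargs["base_url"] = base_url
    return kwargs


# Possible use inside llm_api.py (assumed variable names):
#   qwen_client = OpenAI(**client_kwargs("QWEN"))
#   gpt_client = OpenAI(**client_kwargs("GPT"))
```

With this approach, you would export the variables in your shell before running the scripts, and no key is stored in the source tree.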
You can reproduce STAR experiments by following the steps below.
- Preprocess datasets
python data_process.py
- Run STAR experiments
python run_star.py \
--dataset datasets/mmlu.json \
--n_tests 400 \
--n_agents 5 \
--bad_agent_idx 0 3 \
--rounds 3 \
--model qwen-plus
Argument Description:
--dataset: Path to the evaluation dataset.
--n_tests: Number of test samples.
--n_agents: Total number of agents in the system.
--bad_agent_idx: Indices of malicious agents.
--rounds: Number of interaction rounds.
--model: Backbone LLM used by agents.
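For readers who want to see how these flags fit together, here is a minimal argparse sketch that mirrors the arguments described above. It is not the parser from run_star.py: the defaults are simply copied from the example command, and the zero-based range check on --bad_agent_idx is an assumption.

```python
import argparse


def build_parser():
    """Illustrative parser mirroring the documented flags (not the repository's own)."""
    parser = argparse.ArgumentParser(description="Sketch of the STAR experiment arguments")
    parser.add_argument("--dataset", type=str, required=True,
                        help="Path to the evaluation dataset")
    parser.add_argument("--n_tests", type=int, default=400,
                        help="Number of test samples")
    parser.add_argument("--n_agents", type=int, default=5,
                        help="Total number of agents in the system")
    parser.add_argument("--bad_agent_idx", type=int, nargs="*", default=[],
                        help="Indices of malicious agents")
    parser.add_argument("--rounds", type=int, default=3,
                        help="Number of interaction rounds")
    parser.add_argument("--model", type=str, default="qwen-plus",
                        help="Backbone LLM used by agents")
    return parser


def parse_args(argv=None):
    """Parse arguments and check that malicious indices refer to existing agents."""
    args = build_parser().parse_args(argv)
    # Assumption: agent indices are zero-based, i.e. valid values are 0..n_agents-1.
    out_of_range = [i for i in args.bad_agent_idx if not 0 <= i < args.n_agents]
    if out_of_range:
        raise ValueError(f"bad_agent_idx out of range: {out_of_range}")
    return args
```

Under these assumptions, the example command above would parse to two malicious agents (indices 0 and 3) out of five, interacting for three rounds.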
- Evaluate results
python evaluate.py
After running evaluate.py, the key evaluation metrics (e.g., Task Success Rate and Attack Success Rate) will be printed directly to the console.