CollabBench: Benchmarking and Unleashing the Collaborative Ability of LLMs with Diverse Players via Proactive Engagement

ICML 2026

Hong Qian, Yuanhao Liu, Zihan Zhou, Zongbao Zhang, Hanjie Ge, Haotian Shi, Liang Dou, Xiangfeng Wang, Jingwen Yang*, and Aimin Zhou

East China Normal University
Shanghai Innovation Institute
Tencent Inc.

Overview

We propose CollabBench, a benchmark for systematically evaluating and training LLM-based agents that proactively collaborate with diverse players. CollabBench targets collaborative agent research, with the goal of supporting work on LLM-based agents whose interactions are both efficient and affective.

1️⃣ Diverse Player Profiles Simulation

cd Anthropomorphic

This section focuses on modeling diverse player profiles from trajectory data.

📄 Details: Anthropomorphic

2️⃣ Collaborative Agentic Training

This section describes how the collaborative agents are trained for the two multi-player game environments.

cd Training

🎮 CWAH-MultiPlayer

cd CWAH-MultiPlayer

📄 Details: CWAH-MultiPlayer

🎮 Cook-MultiPlayer

cd Cook-MultiPlayer

📄 Details: Cook-MultiPlayer

3️⃣ Evaluation

This section describes the trajectory data collection and the affective LLM judge that CollabBench uses for the two multi-player game environments.

cd Evaluation

Trajectory Data Collection

cd Running

🎮 CWAH-MultiPlayer

cd CWAH-MultiPlayer

📄 Details: CWAH-MultiPlayer

🎮 Cook-MultiPlayer

cd Cook-MultiPlayer

📄 Details: Cook-MultiPlayer

Affective LLM Judge

cd Judge

📄 Details: Evaluation

4️⃣ Player Trajectory Demonstration

We visualize representative trajectories for five typical player types (GIF format) to illustrate their collaboration behaviors.

❶ Efficient Collaboration Expert

❷ Hesitant Laggard

❸ Anxious Doubter

❹ Proactive Leader

❺ Independent Loner

💭 Citation

If you find this repository useful in your research, please cite:

@inproceedings{CollabBench2026,
  author = {Hong Qian and Yuanhao Liu and Zihan Zhou and Zongbao Zhang and Hanjie Ge and Haotian Shi and Liang Dou and Xiangfeng Wang and Jingwen Yang and Aimin Zhou},
  title = {CollabBench: Benchmarking and Unleashing the Collaborative Ability of LLMs with Diverse Players via Proactive Engagement},
  booktitle = {Proceedings of the 43rd International Conference on Machine Learning (ICML)},
  year = {2026},
  address = {Seoul, South Korea}
}

Reference:

Hong Qian, Yuanhao Liu, Zihan Zhou, Zongbao Zhang, Hanjie Ge, Haotian Shi, Liang Dou, Xiangfeng Wang, Jingwen Yang, and Aimin Zhou. CollabBench: Benchmarking and Unleashing the Collaborative Ability of LLMs with Diverse Players via Proactive Engagement. In Proceedings of the 43rd International Conference on Machine Learning (ICML), 2026.

