0nutation/SpeechAgents

SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems


Introduction

SpeechAgents is a multi-modal, LLM-based multi-agent system designed to simulate human communication. Unlike existing LLM-based multi-agent systems, SpeechAgents uses a multi-modal LLM as the control center of each individual agent and uses multi-modal signals as the medium for messages exchanged among agents. Additionally, we propose Multi-Agent Tuning to enhance the multi-agent capabilities of the LLM without compromising its general abilities. To strengthen and evaluate human communication simulation, we build the Human-Communication Simulation Benchmark.
SpeechAgents demos are shown on our project page. As the demos show, SpeechAgents can generate human-like communication dialogues with consistent content, authentic rhythm, and rich emotions, and can accomplish tasks such as drama creation and audio novel generation.


Illustration of the training and inference process of an individual agent in SpeechAgents.
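
To make the agent design described in the introduction more concrete, here is a minimal sketch of agents that exchange non-text messages in a turn-taking loop. It is not the SpeechAgents implementation, which has not been released. Every name in it (`Message`, `Agent`, `simulate`, `echo_model`) is hypothetical, the round-robin speaker order is an assumption, and a plain Python function stands in for the multi-modal LLM that would control each agent.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    sender: str
    modality: str   # hypothetical label, e.g. "speech"
    payload: bytes  # stand-in for an encoded multi-modal signal


# Stand-in for a multi-modal LLM: maps (agent name, dialogue history)
# to the payload of the next message. Purely illustrative.
Model = Callable[[str, List[Message]], bytes]


@dataclass
class Agent:
    name: str
    model: Model

    def act(self, history: List[Message]) -> Message:
        # The agent's "control center" decides what to say next.
        return Message(self.name, "speech", self.model(self.name, history))


def simulate(agents: List[Agent], turns: int) -> List[Message]:
    """Run a dialogue with round-robin turn order (an assumption)."""
    history: List[Message] = []
    for t in range(turns):
        speaker = agents[t % len(agents)]
        history.append(speaker.act(history))
    return history


def echo_model(name: str, history: List[Message]) -> bytes:
    # Toy model: emits the speaker name and the current turn index.
    return f"{name}:turn{len(history)}".encode()


dialogue = simulate(
    [Agent("alice", echo_model), Agent("bob", echo_model)], turns=4
)
for message in dialogue:
    print(message.sender, message.modality, message.payload)
```

In a real system, the toy `echo_model` would be replaced by a call to a multi-modal model, and `payload` would carry the actual signal exchanged between agents rather than a text stub.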

Code

We will open-source our code and models soon. Stay tuned!

Demo

bbq.mp4

Citation
