Audio Group Chat

A real-time audio group chat in which humans and AI agents communicate by voice and text. The project combines WebRTC, speech-to-text (STT), text-to-speech (TTS), and a large language model (LLM) to drive the conversation.

Features

Real-time audio communication using WebRTC
Multiple AI agents with distinct voices and personalities
Text-to-Speech (TTS) with customizable voice options
Speech-to-Text (STT) for human voice input
Round-robin speaker selection for balanced conversations
Gradio-based web interface for easy interaction
Support for both voice and text channels
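
To illustrate the round-robin speaker selection listed above, here is a minimal sketch. It is not the project's implementation: the function name `round_robin_speakers` is hypothetical, and treating `max_round` as the total number of turns is an assumption made for the example.

```python
from itertools import cycle


def round_robin_speakers(agents, max_round):
    """Return the speaking order for max_round turns, cycling through agents.

    Hypothetical helper for illustration; the real selection logic lives in
    the group chat class and may differ.
    """
    if not agents:
        return []
    order = cycle(agents)  # endless iterator: a, b, c, a, b, c, ...
    return [next(order) for _ in range(max_round)]


print(round_robin_speakers(["alice", "bob", "carol"], 5))
# ['alice', 'bob', 'carol', 'alice', 'bob']
```

Every participant gets a turn before anyone speaks twice, which is what keeps the conversation balanced.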

Prerequisites

Python 3.8+
Node.js (for frontend components)
Ollama (for local LLM support)

Installation

Clone the repository:

git clone <repository-url>
cd AudioGroupChat

Create and activate a virtual environment:

python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt

Configuration

Configure Ollama settings in main_app.py:

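config_list = [{
    "model": "gemma3:1b",  # or any other model available in your Ollama install
    "base_url": "http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    "price": [0.00, 0.00],  # local models incur no per-token cost
}]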
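
If you switch models or hosts often, the same structure can be generated by a small helper. This is only a sketch: `make_ollama_config` is a hypothetical function, not part of main_app.py, and it assumes the server exposes its API under the `/v1` path as in the snippet above.

```python
def make_ollama_config(model="gemma3:1b", host="http://localhost:11434"):
    """Build a one-entry config_list pointing at a local Ollama server.

    Hypothetical helper; it mirrors the keys used in the configuration above.
    """
    return [{
        "model": model,
        # Strip a trailing slash so the path is never doubled ("//v1").
        "base_url": host.rstrip("/") + "/v1",
        "price": [0.00, 0.00],
    }]


config_list = make_ollama_config("gemma3:1b")
```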

(Optional) Set up Twilio TURN server credentials for improved WebRTC connectivity:

export TWILIO_ACCOUNT_SID=your_account_sid
export TWILIO_AUTH_TOKEN=your_auth_token
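
Because the credentials are optional, code that reads them should cope with their absence. The sketch below shows one way to do that; the function name `twilio_credentials` and the returned dictionary keys are assumptions for illustration, and how the application actually consumes these variables is not shown here.

```python
import os


def twilio_credentials(env=None):
    """Return Twilio credentials from the environment, or None if unset.

    Hypothetical helper: both variables must be present and non-empty,
    otherwise the caller should fall back to a connection without TURN.
    """
    env = os.environ if env is None else env
    sid = env.get("TWILIO_ACCOUNT_SID")
    token = env.get("TWILIO_AUTH_TOKEN")
    if sid and token:
        return {"account_sid": sid, "auth_token": token}
    return None


if twilio_credentials() is None:
    print("Twilio credentials not set; continuing without a TURN server")
```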

Usage

Start the application:

python main_app.py

Open the Gradio interface URL printed in the terminal (typically http://localhost:7860) in your browser.
Start a conversation by:
- Speaking into your microphone
- Typing text messages
- Using the provided UI controls

Project Structure

main_app.py: Main application entry point
audio_groupchat.py: Core audio group chat implementation
gradio_ui.py: Gradio web interface components
test_group_chat.py: Test cases and examples

Voice Configuration

The system supports multiple voice options for AI agents:

Energetic (fast, US English)
Calm (slower, US English)
British (UK English)
Authoritative (moderate speed, US English)
Default (standard US English)
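
One way to picture these options is as a lookup table of presets. The sketch below is an assumption about structure, not the project's actual data model: `VoicePreset`, `VOICES`, and `voice_for` are hypothetical names, and only the accent and pace descriptions stated in the list above are encoded.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VoicePreset:
    accent: str
    pace: Optional[str] = None  # None where the list above states no speed


# Hypothetical table built only from the descriptions in this README.
VOICES = {
    "energetic": VoicePreset("US English", "fast"),
    "calm": VoicePreset("US English", "slower"),
    "british": VoicePreset("UK English"),
    "authoritative": VoicePreset("US English", "moderate"),
    "default": VoicePreset("US English"),
}


def voice_for(name):
    """Look up a preset case-insensitively, falling back to the default."""
    return VOICES.get(name.strip().lower(), VOICES["default"])


print(voice_for("British").accent)
```

Falling back to the default preset means an agent configured with an unknown voice name still gets a usable voice.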

API Documentation

AudioGroupChat Class

class AudioGroupChat(GroupChat):
    def __init__(self, agents=None, messages=None, max_round=10,
                 speaker_selection_method="round_robin",
                 allow_repeat_speaker=False):
        ...

Key methods:

initialize(): Set up audio processing components
add_human_participant(user_id): Add a human participant
start_audio_session(user_id): Start an audio session
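
The three methods above suggest a natural call order: initialize, add the participant, then start the session. The sketch below follows that order under stated assumptions. This README does not say whether the methods are coroutines, so the hypothetical helper `join_audio_session` awaits a result only when it is awaitable; the helper names and the call order itself are illustrative, not confirmed by the project.

```python
import asyncio
import inspect


async def _maybe_await(result):
    """Await the value if it is awaitable, otherwise return it unchanged."""
    if inspect.isawaitable(result):
        return await result
    return result


async def join_audio_session(chat, user_id):
    """Call the three documented methods in order for one human participant.

    `chat` is any object exposing initialize(), add_human_participant() and
    start_audio_session(); sync and async implementations both work.
    """
    await _maybe_await(chat.initialize())
    await _maybe_await(chat.add_human_participant(user_id))
    return await _maybe_await(chat.start_audio_session(user_id))


def join_audio_session_blocking(chat, user_id):
    """Convenience wrapper for callers that are not already in an event loop."""
    return asyncio.run(join_audio_session(chat, user_id))
```

From synchronous code, `join_audio_session_blocking(chat, "user-1")` runs the whole sequence; inside an existing event loop, await `join_audio_session` directly instead.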

GradioUI Class

import gradio as gr

class GradioUI:
    def __init__(self, audio_chat: AudioGroupChat): ...
    def create_interface(self) -> gr.Blocks: ...

Contributing

Fork the repository
Create a feature branch
Commit your changes
Push to the branch
Create a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Built with Autogen
Uses FastRTC for WebRTC functionality
Powered by Gradio for the web interface
