
AMPTALK: Multi-Agent AI Framework for Meeting Transcription

AMPTALK is a privacy-focused multi-agent system designed for meeting transcription and analysis. It enables real-time transcription, summarization, and insight extraction from audio with minimal data exposure.

Features

  • Privacy-First Design: Process all data locally without sending it to external servers
  • Multi-Agent Architecture: Specialized agents work together to handle different tasks
  • Real-Time Processing: Handles live audio streams with low latency
  • Optimized for Edge: Designed to run efficiently on local devices
  • Modular and Extensible: Easy to add new agents or functionality

Core Framework

  • Robust Agent Architecture: Flexible and extensible agent-based system
  • Asynchronous Execution: Efficient processing of concurrent tasks
  • Error Recovery: Sophisticated error handling with retry mechanisms (see the sketch after this list)
  • Memory Management: Smart caching and memory optimization
  • State Persistence: Persistent agent state across sessions
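As an illustration of the retry mechanism, a generic asyncio backoff wrapper might look like the following (a minimal sketch; the framework's actual error-recovery API may differ):

import asyncio

async def with_retries(operation, attempts=3, base_delay=0.5):
    # Retry a zero-argument async operation with exponential backoff.
    # Illustrative only; not the framework's actual implementation.
    for attempt in range(attempts):
        try:
            return await operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries; propagate the original error
            await asyncio.sleep(base_delay * 2 ** attempt)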

Whisper Integration

  • Transcription Agent: Specialized agent for audio transcription
  • Model Caching: Efficient model loading and unloading
  • Edge Optimization: Deployment optimizations for resource-constrained devices
    • ONNX conversion
    • Quantization (INT8, FP16, INT4)
    • Operator and layer fusion: combines consecutive operations for faster inference
    • Knowledge distillation
  • Mobile Framework Export: Deploy models to mobile platforms
    • TensorFlow Lite export for Android
    • Core ML export for iOS/macOS
    • Model size optimization
    • Mobile-specific optimizations

Monitoring and Performance

  • Real-time Metrics: Comprehensive performance monitoring
  • OpenTelemetry Integration: Industry-standard observability
  • Prometheus Exporting: Metrics collection for dashboards
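For example, exposing a transcription-latency histogram with the prometheus_client library could look like this (an illustrative sketch; the metric name is hypothetical):

from prometheus_client import Histogram, start_http_server

# Serve metrics at http://localhost:8000/metrics for Prometheus to scrape.
TRANSCRIBE_LATENCY = Histogram(
    "amptalk_transcription_seconds", "Time spent transcribing one audio chunk"
)
start_http_server(8000)

with TRANSCRIBE_LATENCY.time():
    pass  # run the transcription step here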

System Architecture

AMPTALK uses a multi-agent architecture in which specialized agents communicate through a message-passing system (see the sketch after this list):

  1. Audio Processing Agent: Handles audio input, preprocessing, and segmentation
  2. Transcription Agent: Converts audio to text using optimized Whisper models
  3. NLP Processing Agent: Performs language analysis, entity detection, and intent recognition
  4. Summarization Agent: Generates concise summaries of meeting content
  5. Orchestrator: Coordinates the agents and manages system resources
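The sketch below shows the pattern in miniature: each agent consumes messages from an inbox queue and publishes results to an outbox (names and message format are illustrative, not the framework's actual API):

import asyncio

async def transcription_agent(inbox, outbox):
    # Consume audio chunks and emit transcript messages.
    while True:
        chunk = await inbox.get()
        await outbox.put({"type": "transcript", "source": chunk, "text": "..."})

async def main():
    audio_q, text_q = asyncio.Queue(), asyncio.Queue()
    asyncio.create_task(transcription_agent(audio_q, text_q))
    await audio_q.put("chunk-0")
    print(await text_q.get())

asyncio.run(main())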

Installation

# Clone the repository
git clone https://github.com/yourusername/amptalk.git
cd amptalk

# Create a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Usage

Running the Demo

# Run the basic demo
python src/run_demo.py

# Run with a custom configuration
python src/run_demo.py --config path/to/config.json

Example Configuration

Create a JSON configuration file to customize the system:

{
  "audio_agent": {
    "audio": {
      "sample_rate": 16000,
      "chunk_duration_ms": 1000,
      "vad_threshold": 0.3
    }
  },
  "transcription_agent": {
    "whisper": {
      "model_size": "large-v3-turbo",
      "device": "cuda",
      "language": "en"
    }
  }
}
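The file is plain JSON, so it can be loaded and inspected with the standard library (a trivial sketch):

import json

with open("config.json") as f:
    config = json.load(f)

print(config["transcription_agent"]["whisper"]["model_size"])  # large-v3-turbo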

Model Pruning

AMPTALK includes a model pruning toolkit in the pruning/ directory to optimize models for edge deployment:

# Run the pruning script
cd pruning
./prune.sh --model large-v3 --target-sparsity 0.6
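Conceptually, the script applies magnitude pruning at the requested sparsity. With PyTorch's built-in pruning utilities, a 60% sparsity pass over the linear layers might look like this (a rough sketch, not the project's actual script; the Hugging Face loader is an assumption):

import torch
from torch.nn.utils import prune
from transformers import WhisperForConditionalGeneration

model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v3")

# Zero out the 60% smallest-magnitude weights in every linear layer.
for module in model.modules():
    if isinstance(module, torch.nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.6)
        prune.remove(module, "weight")  # bake the pruning mask into the weights

model.save_pretrained("models/whisper-large-v3-pruned")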

Development

Project Structure

amptalk/
├── src/
│   ├── agents/              # Specialized agent implementations
│   ├── core/
│   │   ├── framework/       # Core multi-agent framework
│   │   └── utils/           # Common utilities
│   └── run_demo.py          # Demo application
├── pruning/                 # Model pruning tools
│   ├── scripts/             # Pruning scripts
│   ├── configs/             # Pruning configurations
│   └── models/              # Model storage
├── requirements.txt         # Dependencies
└── README.md                # This file

Creating a New Agent

Agents inherit from the Agent base class in src/core/framework/agent.py:

from src.core.framework.agent import Agent

class MyCustomAgent(Agent):
    def __init__(self, agent_id=None, name="MyCustomAgent", config=None):
        super().__init__(agent_id, name)
        self.config = config or {}  # store any agent-specific configuration

    async def process_message(self, message):
        # Implement your message-handling logic here
        pass
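Once defined, the agent can be exercised directly. The message below is illustrative; real messages are whatever your process_message implementation expects:

import asyncio

async def main():
    agent = MyCustomAgent(config={"log_level": "debug"})
    await agent.process_message({"type": "ping"})

asyncio.run(main())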

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • OpenAI for the Whisper model
  • The open-source community for various audio processing tools
  • All contributors to this project

Getting Started

Edge Optimization Example

For optimizing and running Whisper models on edge devices:

# Run the edge optimization demo
python examples/edge_optimization_demo.py --model-size tiny --optimization-level MEDIUM --compare

This will:

  1. Optimize a Whisper model using ONNX and quantization
  2. Compare performance with the non-optimized model
  3. Run a transcription using both models
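Under the hood, the ONNX-plus-quantization step can be approximated with onnxruntime's dynamic quantizer (a sketch assuming the model has already been exported to ONNX; file names are hypothetical):

from onnxruntime.quantization import QuantType, quantize_dynamic

# Rewrite FP32 weights as INT8 in the exported graph.
quantize_dynamic(
    model_input="whisper-tiny-encoder.onnx",
    model_output="whisper-tiny-encoder-int8.onnx",
    weight_type=QuantType.QInt8,
)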

For more details, see the Edge Optimization Documentation.

INT4 Quantization Example

For ultra-low precision quantization with AWQ:

# Run the INT4 quantization demo
python examples/int4_quantization_demo.py --model-size tiny

This will:

  1. Quantize a Whisper model to INT4 precision using AWQ
  2. Quantize the same model to INT8 precision for comparison
  3. Compare transcription performance and accuracy between original, INT4, and INT8 models

INT4 quantization can reduce model size by up to 8x relative to FP32 while maintaining good transcription quality.
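The 8x figure is simple byte accounting: an FP32 weight takes 4 bytes and an INT4 weight takes half a byte, ignoring quantization metadata such as per-group scales. For Whisper tiny (roughly 39M parameters):

params = 39e6                    # approximate parameter count of Whisper tiny
fp32_mib = params * 4 / 2**20    # 4 bytes per FP32 weight
int4_mib = params * 0.5 / 2**20  # 0.5 bytes per INT4 weight
print(f"FP32: {fp32_mib:.0f} MiB, INT4: {int4_mib:.0f} MiB, {fp32_mib / int4_mib:.0f}x smaller")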

Mobile Framework Export Example

For exporting Whisper models to mobile frameworks:

# Run the mobile optimization demo
python examples/mobile_optimization_demo.py

This will:

  1. Export a Whisper model to TensorFlow Lite format (for Android)
  2. Export a Whisper model to Core ML format for iOS/macOS (this export runs only on macOS systems)
  3. Demonstrate transcription using the exported models
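For the Core ML path, a conversion typically traces a PyTorch module and hands the trace to coremltools (a minimal sketch; the encoder variable and the 80-bin, 3000-frame log-mel input shape are assumptions, not the demo's actual code):

import coremltools as ct
import torch

dummy_mel = torch.randn(1, 80, 3000)          # 30 s of 80-bin log-mel features
traced = torch.jit.trace(encoder, dummy_mel)  # `encoder` must be loaded beforehand
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="mel", shape=dummy_mel.shape)],
    convert_to="mlprogram",
)
mlmodel.save("WhisperEncoder.mlpackage")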

For detailed information on mobile deployment, see the Mobile Deployment Guide.
