AMPTALK is a privacy-focused multi-agent system designed for meeting transcription and analysis. It enables real-time transcription, summarization, and insight extraction from audio with minimal data exposure.
- Privacy-First Design: Process all data locally without sending it to external servers
- Multi-Agent Architecture: Specialized agents work together to handle different tasks
- Real-Time Processing: Process audio streams in real-time with low latency
- Optimized for Edge: Designed to run efficiently on local devices
- Modular and Extensible: Easy to add new agents or functionality
- Robust Agent Architecture: Flexible and extensible agent-based system
- Asynchronous Execution: Efficient processing of concurrent tasks
- Error Recovery: Sophisticated error handling with retry mechanisms
- Memory Management: Smart caching and memory optimization
- State Persistence: Persistent agent state across sessions
- Transcription Agent: Specialized agent for audio transcription
- Model Caching: Efficient model loading and unloading
- Edge Optimization: Deployment optimizations for resource-constrained devices
- ONNX conversion
- Quantization (INT8, FP16, INT4)
- Layer Fusion: Combines consecutive operations for faster inference
- Operator fusion
- Knowledge distillation
- Mobile Framework Export: Deploy models to mobile platforms
- TensorFlow Lite export for Android
- Core ML export for iOS/macOS
- Model size optimization
- Mobile-specific optimizations
- Real-time Metrics: Comprehensive performance monitoring
- OpenTelemetry Integration: Industry-standard observability
- Prometheus Exporting: Metrics collection for dashboards
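The last three bullets describe the monitoring stack. As a rough illustration of what Prometheus exporting can look like (the metric names below are hypothetical, not AMPTALK's actual instrumentation, and the OpenTelemetry side is omitted), here is a minimal sketch using the prometheus_client library:

```python
# Minimal sketch of Prometheus metrics for an agent pipeline.
# Metric names and labels are illustrative only.
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

MESSAGES_PROCESSED = Counter(
    "amptalk_messages_processed_total",
    "Number of messages handled by an agent",
    ["agent"],
)
PROCESSING_SECONDS = Histogram(
    "amptalk_message_processing_seconds",
    "Time spent processing a single message",
    ["agent"],
)

def handle_message(agent_name: str) -> None:
    """Simulate handling one message and record metrics."""
    with PROCESSING_SECONDS.labels(agent=agent_name).time():
        time.sleep(random.uniform(0.01, 0.05))  # stand-in for real work
    MESSAGES_PROCESSED.labels(agent=agent_name).inc()

if __name__ == "__main__":
    start_http_server(9090)  # metrics served at http://localhost:9090/metrics
    while True:
        handle_message("transcription_agent")
```

A Prometheus scrape job pointed at port 9090 can then feed these counters and histograms into a dashboard.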
AMPTALK uses a multi-agent architecture where specialized agents communicate through a message-passing system:
- Audio Processing Agent: Handles audio input, preprocessing, and segmentation
- Transcription Agent: Converts audio to text using optimized Whisper models
- NLP Processing Agent: Performs language analysis, entity detection, and intent recognition
- Summarization Agent: Generates concise summaries of meeting content
- Orchestrator: Coordinates the agents and manages system resources
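The snippet below is a minimal, self-contained sketch of this message-passing pattern using asyncio queues. It is illustrative only: AMPTALK's real agents are built on the Agent base class in src/core/framework/agent.py (see the extension example further down), and the message format here is hypothetical.

```python
# Illustrative message passing between two "agents" via asyncio queues.
# This is not AMPTALK's actual framework, just the general pattern.
import asyncio

async def audio_agent(out_queue: asyncio.Queue) -> None:
    """Pretend to segment audio and emit chunks as messages."""
    for chunk_id in range(3):
        await out_queue.put({"type": "audio_chunk", "id": chunk_id, "samples": b"..."})
    await out_queue.put({"type": "end_of_stream"})

async def transcription_agent(in_queue: asyncio.Queue) -> None:
    """Consume audio-chunk messages and emit (print) transcripts."""
    while True:
        message = await in_queue.get()
        if message["type"] == "end_of_stream":
            break
        print(f"transcribed chunk {message['id']}")

async def main() -> None:
    queue: asyncio.Queue = asyncio.Queue()
    # The orchestrator's job is essentially to wire queues like this one
    # between agents and to supervise their tasks.
    await asyncio.gather(audio_agent(queue), transcription_agent(queue))

asyncio.run(main())
```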
# Clone the repository
git clone https://github.com/yourusername/amptalk.git
cd amptalk
# Create a virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt

# Run the basic demo
python src/run_demo.py
# Run with a custom configuration
python src/run_demo.py --config path/to/config.json

Create a JSON configuration file to customize the system:
{
  "audio_agent": {
    "audio": {
      "sample_rate": 16000,
      "chunk_duration_ms": 1000,
      "vad_threshold": 0.3
    }
  },
  "transcription_agent": {
    "whisper": {
      "model_size": "large-v3-turbo",
      "device": "cuda",
      "language": "en"
    }
  }
}

AMPTALK includes a model pruning toolkit in the pruning/ directory to optimize models for edge deployment:
# Run the pruning script
cd pruning
./prune.sh --model large-v3 --target-sparsity 0.6
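The script drives the toolkit end to end. The sketch below only illustrates what a 0.6 target sparsity means in PyTorch terms, using torch.nn.utils.prune on a toy layer; it is not the toolkit's actual implementation.

```python
# Toy illustration of 60% unstructured magnitude pruning with PyTorch.
# The pruning/ toolkit applies this idea to Whisper; this is not its real code.
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(256, 256)

# Zero out the 60% of weights with the smallest absolute value.
prune.l1_unstructured(layer, name="weight", amount=0.6)
prune.remove(layer, "weight")  # make the pruning permanent

sparsity = (layer.weight == 0).float().mean().item()
print(f"weight sparsity: {sparsity:.2f}")  # ~0.60
```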
amptalk/
├── src/
│   ├── agents/              # Specialized agent implementations
│   ├── core/
│   │   ├── framework/       # Core multi-agent framework
│   │   └── utils/           # Common utilities
│   └── run_demo.py          # Demo application
├── pruning/                 # Model pruning tools
│   ├── scripts/             # Pruning scripts
│   ├── configs/             # Pruning configurations
│   └── models/              # Model storage
├── requirements.txt         # Dependencies
└── README.md                # This file
Agents inherit from the Agent base class in src/core/framework/agent.py:
from src.core.framework.agent import Agent

class MyCustomAgent(Agent):
    def __init__(self, agent_id=None, name="MyCustomAgent", config=None):
        super().__init__(agent_id, name)
        # Initialize your agent

    async def process_message(self, message):
        # Implement your message handling logic
        pass

This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI for the Whisper model
- The open-source community for various audio processing tools
- All contributors to this project
For optimizing and running Whisper models on edge devices:
# Run the edge optimization demo
python examples/edge_optimization_demo.py --model-size tiny --optimization-level MEDIUM --compare

This will:
- Optimize a Whisper model using ONNX and quantization
- Compare performance with the non-optimized model
- Run a transcription using both models
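To see the kind of INT8 step the demo applies, the sketch below dynamically quantizes an already-exported ONNX model with onnxruntime. The file names are placeholders and this is the generic onnxruntime call, not the demo's exact pipeline.

```python
# Generic INT8 dynamic quantization of an exported ONNX model.
# Paths are placeholders; export the model to ONNX first.
from onnxruntime.quantization import QuantType, quantize_dynamic

quantize_dynamic(
    model_input="whisper_tiny_encoder.onnx",    # existing FP32 ONNX model
    model_output="whisper_tiny_encoder.int8.onnx",
    weight_type=QuantType.QInt8,                # store weights as INT8
)
print("Wrote whisper_tiny_encoder.int8.onnx")
```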
For more details, see the Edge Optimization Documentation.
For ultra-low precision quantization with AWQ:
# Run the INT4 quantization demo
python examples/int4_quantization_demo.py --model-size tiny

This will:
- Quantize a Whisper model to INT4 precision using AWQ
- Quantize the same model to INT8 precision for comparison
- Compare transcription performance and accuracy between original, INT4, and INT8 models
INT4 quantization can reduce model size by up to 8x while maintaining good transcription quality.
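The 8x figure follows directly from the bit widths: FP32 stores 32 bits per weight and INT4 stores 4. As a back-of-the-envelope estimate for the tiny model (roughly 39M parameters), ignoring scale and zero-point overhead:

```python
# Back-of-the-envelope size estimate for Whisper tiny (~39M parameters).
params = 39_000_000
fp32_mb = params * 32 / 8 / 1e6   # 32 bits per weight
int4_mb = params * 4 / 8 / 1e6    # 4 bits per weight, ignoring scale/zero-point overhead
print(f"FP32: {fp32_mb:.0f} MB, INT4: {int4_mb:.0f} MB, ratio: {fp32_mb / int4_mb:.0f}x")
# FP32: 156 MB, INT4: 20 MB, ratio: 8x
```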
For exporting Whisper models to mobile frameworks:
# Run the mobile optimization demo
python examples/mobile_optimization_demo.py

This will:
- Export a Whisper model to TensorFlow Lite format (for Android)
- Export a Whisper model to Core ML format (for iOS/macOS, only on macOS systems)
- Demonstrate transcription using the exported models
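As a rough sketch of the TensorFlow Lite half of that export, assuming the model is already available as a TensorFlow SavedModel (the path is a placeholder and this is the generic tf.lite converter, not the demo's exact pipeline):

```python
# Generic SavedModel -> TensorFlow Lite conversion with default optimizations.
# The SavedModel path is a placeholder; the demo handles the Whisper export itself.
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model("whisper_tiny_saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # enables post-training quantization
tflite_model = converter.convert()

with open("whisper_tiny.tflite", "wb") as f:
    f.write(tflite_model)
```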
For detailed information on mobile deployment, see the Mobile Deployment Guide.