A hybrid voice and text-based assistant that integrates with a sophisticated n8n multi-agent workflow for intelligent query processing.
This project combines a local voice assistant frontend with a cloud-based multi-agent AI system. The assistant accepts both voice commands (triggered by the wake word "Anka") and text input, and routes each query through an agent orchestration system that dynamically assembles expert teams based on query complexity.
- Flask Web Server (app.py) - Serves the web interface and handles API requests
- Voice Recognition (trigger.py) - Handles wake word detection, speech-to-text, and text-to-speech
- Web UI - Modern chat interface with both voice and text input capabilities
The n8n workflow implements a multi-agent system with the following stages:
- Manager Agent - Analyzes incoming requests and determines complexity (1-5 scale)
- Role Selection - Dynamically selects solver and reviewer agents based on task requirements
- Parallel Processing - Multiple specialist agents work on the problem simultaneously
- Evaluation Layer - Reviewer agents critically assess solutions across 5 dimensions
- Synthesis - Chief Editor consolidates all feedback into a final response
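The five stages above can be sketched in plain Python to make the data flow easier to follow. This is only an illustration under stated assumptions: the real orchestration lives in the n8n workflow, and `run_pipeline`, the `Agent` type, and the "more agents for harder queries" selection policy are hypothetical stand-ins, not the workflow's actual logic.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical agent type: takes a prompt string, returns a text answer.
Agent = Callable[[str], str]


def run_pipeline(
    query: str,
    assess_complexity: Callable[[str], int],
    solvers: Dict[str, Agent],
    reviewers: Dict[str, Agent],
    synthesize: Callable[[str, List[str], List[str]], str],
) -> str:
    # Stage 1 (Manager Agent): score the request, clamped to the 1-5 scale.
    complexity = max(1, min(5, assess_complexity(query)))

    # Stage 2 (Role Selection): assumed policy, harder queries get larger teams.
    chosen_solvers = list(solvers.items())[:complexity]
    chosen_reviewers = list(reviewers.items())[: max(1, complexity // 2)]

    # Stage 3 (Parallel Processing): solvers run concurrently; map keeps input order.
    with ThreadPoolExecutor() as pool:
        drafts = list(pool.map(lambda item: item[1](query), chosen_solvers))

    # Stage 4 (Evaluation Layer): every reviewer sees all drafts.
    joined = "\n---\n".join(drafts)
    reviews = [review(joined) for _, review in chosen_reviewers]

    # Stage 5 (Synthesis): the Chief Editor merges drafts and reviews.
    return synthesize(query, drafts, reviews)
```

Passing the agents in as plain callables keeps the sketch independent of any particular LLM client; in the actual project each of these steps is an n8n node.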
- Wake word detection ("Anka")
- Continuous listening with adjustable sensitivity
- Speech-to-text using Google Speech Recognition
- Text-to-speech using OpenAI TTS (natural-sounding voice)
- Fallback to pyttsx3 if OpenAI is unavailable
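As a rough illustration of the wake word step listed above, the sketch below checks a speech-to-text transcript for "Anka" and returns whatever follows it. The function name `extract_query` and the whole-word, case-insensitive matching rule are assumptions made for this example; trigger.py may detect the wake word differently.

```python
import re
from typing import Optional


def extract_query(transcript: str, wake_word: str = "anka") -> Optional[str]:
    """Return the text spoken after the wake word, or None if it is absent."""
    # Match the wake word as a whole word, ignoring case, and swallow any
    # punctuation or whitespace that immediately follows it.
    pattern = rf"\b{re.escape(wake_word)}\b[\s,.:!?]*"
    match = re.search(pattern, transcript, flags=re.IGNORECASE)
    if match is None:
        return None
    return transcript[match.end():].strip()
```

An empty string means the wake word was heard with nothing after it, which fits a flow where the question is spoken as a separate utterance.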
- Direct text input through web interface
- Typing animations with contextual messages
- Real-time response streaming via Server-Sent Events
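For readers unfamiliar with Server-Sent Events, the helper below shows the wire format a server writes to the stream: an optional `event:` line, a `data:` line, and a blank line to end the message. The helper name `format_sse` and the JSON payload shape are assumptions for illustration; the actual messages sent by app.py are not documented here.

```python
import json
from typing import Any, Optional


def format_sse(data: Any, event: Optional[str] = None) -> str:
    """Serialize one Server-Sent Events message."""
    # json.dumps escapes newlines, so the payload always fits on one data line.
    payload = json.dumps(data)
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    lines.append(f"data: {payload}")
    # A blank line terminates the message.
    return "\n".join(lines) + "\n\n"
```

A streaming endpoint would typically yield strings like these from a generator and serve them with the `text/event-stream` content type.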
- Adaptive complexity assessment
- Dynamic team assembly from available roles
- Parallel solver execution
- Multi-perspective evaluation
- Consensus-based final answers
- Context persistence in Supabase
- RAG (Retrieval-Augmented Generation) support for document queries
- OpenAI API account (for GPT-4 and TTS)
- n8n Cloud account or self-hosted n8n instance
- Supabase account (for role prompts and context storage)
- Python 3.8+
- Microphone for voice input
- Audio output device
- Modern web browser
cd Hackathon25
pip install -r requirements.txt

Create a .env file in the project root:

WAKE_WORD=anka
N8N_WEBHOOK_URL=https://your-n8n-instance.app.n8n.cloud/webhook/anka-wake-word
OPENAI_API_KEY=sk-your-openai-api-key
LANGUAGE=en-US
RECOGNITION_TIMEOUT=10

- Import the provided n8n workflow JSON into your n8n instance
- Configure the following credentials in n8n:
- OpenAI API credentials
- Supabase API credentials
- Set up Supabase tables:
- role_prompts - Contains agent roles and their system prompts
- user_context - Stores conversation history

Insert agent roles into Supabase role_prompts table:
- Mode: "user" (or custom mode)
- Name: Role name (e.g., "Analyst", "Creative", "Skeptic")
- Prompt: System prompt defining the agent's behavior
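The three fields above can be modeled as a small record before inserting it into Supabase. This is a sketch only: the lowercase column names (`mode`, `name`, `prompt`) and the `RolePrompt` and `to_row` helpers are assumptions, so check them against your actual role_prompts table schema.

```python
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass
class RolePrompt:
    mode: str    # "user" or a custom mode
    name: str    # role name, e.g. "Analyst"
    prompt: str  # system prompt defining the agent's behavior


def to_row(role: RolePrompt) -> Dict[str, str]:
    """Validate a role and return it as a dict ready for insertion."""
    row = asdict(role)
    for field, value in row.items():
        if not value.strip():
            raise ValueError(f"{field} must not be empty")
    return row
```

The resulting dict can be pasted into the Supabase table editor or passed to whichever client you use for inserts.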
python app.py

The server will start on http://localhost:5000

- Click the "Voice" button in the web interface
- Wait for 5-second calibration (stay quiet)
- Say "Anka" to activate
- Speak your question clearly
- Wait for the assistant to process and respond
- Type your question in the text input field
- Press Enter or click "Send"
- The assistant will process your query and respond
If the assistant has trouble hearing you:
- Adjust energy_threshold in trigger.py (lower = more sensitive)
- Check Windows microphone boost settings
- Ensure correct microphone is selected in system settings
Hackathon25/
├── app.py # Flask server & API endpoints
├── trigger.py # Voice assistant core logic
├── requirements.txt # Python dependencies
├── .env # Environment configuration
├── templates/
│ └── index.html # Main web interface
├── static/
│ ├── style.css # UI styling
│ └── script.js # Frontend JavaScript
└── README.md # This file
- energy_threshold - Microphone sensitivity (default: 50, lower = more sensitive)
- pause_threshold - Silence detection (default: 0.8 seconds)
- phrase_time_limit - Max question length (default: 20 seconds)
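A minimal sketch of how these three settings map onto the speech_recognition library: `energy_threshold` and `pause_threshold` are attributes of a `Recognizer`, while `phrase_time_limit` is an argument to `Recognizer.listen()`. The `VoiceSettings` class and `apply_settings` function are hypothetical helpers for this example, not names taken from trigger.py.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class VoiceSettings:
    energy_threshold: float = 50     # lower = more sensitive
    pause_threshold: float = 0.8     # seconds of silence that end a phrase
    phrase_time_limit: float = 20    # maximum question length in seconds


def apply_settings(recognizer: Any, settings: VoiceSettings) -> Dict[str, float]:
    """Set recognizer attributes and return keyword arguments for listen()."""
    recognizer.energy_threshold = settings.energy_threshold
    recognizer.pause_threshold = settings.pause_threshold
    # phrase_time_limit is passed per call, not stored on the recognizer.
    return {"phrase_time_limit": settings.phrase_time_limit}
```

With a real recognizer this would be used roughly as `listen_kwargs = apply_settings(sr.Recognizer(), VoiceSettings())` followed by `recognizer.listen(source, **listen_kwargs)`.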
Available OpenAI voices (change in trigger.py line 89):
- nova - Pleasant female (default)
- alloy - Neutral
- echo - Male
- fable - Expressive
- onyx - Deep male
- shimmer - Soft female

- POST /api/start - Start voice assistant
- POST /api/stop - Stop voice assistant
- POST /api/ask - Send text question
- GET /events - SSE stream for real-time updates
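To try the text endpoint from a script, a request could be built with the standard library as sketched below. The JSON field name `question` is an assumption, since the request body format is not documented here; check app.py for the actual key. `build_ask_request` is a hypothetical helper name.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # default address from this README


def build_ask_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /api/ask request."""
    # Assumed payload shape: {"question": "..."}
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/ask",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the server running, `urllib.request.urlopen(build_ask_request("What is RAG?"))` sends the request; responses may also arrive on the /events stream rather than in the HTTP reply.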
Solution: Hard refresh browser (Ctrl+Shift+R) or open in incognito mode
Solution:
- Lower energy_threshold value
- Check microphone permissions
- Increase system microphone boost
Solution:
- Verify webhook URL in .env
- Check OpenAI API key validity
- Ensure Supabase credentials are configured
Solution: System will automatically fall back to pyttsx3
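The fallback behavior can be pictured as trying each speech engine in order until one succeeds. The sketch below uses plain callables so that it stays independent of the OpenAI and pyttsx3 APIs; `speak_with_fallback` and the engine list shape are hypothetical and may not match how trigger.py implements the fallback.

```python
from typing import Callable, List, Sequence, Tuple

# Each engine is a (name, speak_function) pair; speak_function raises on failure.
Engine = Tuple[str, Callable[[str], None]]


def speak_with_fallback(text: str, engines: Sequence[Engine]) -> str:
    """Try engines in order and return the name of the first that succeeds."""
    errors: List[str] = []
    for name, speak in engines:
        try:
            speak(text)
            return name
        except Exception as exc:  # any failure moves on to the next engine
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all TTS engines failed: " + "; ".join(errors))
```

In practice the first entry would wrap the OpenAI TTS call and the second would wrap pyttsx3, so a network or API key problem degrades to the local voice instead of silence.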
- OpenAI GPT-4 API calls: ~$0.01-0.10 per query (varies by complexity)
- OpenAI TTS: ~$0.015 per 1000 characters
- n8n Cloud: Free tier available, paid plans for production
- Supabase: Free tier supports up to 500MB database
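The approximate figures above can be turned into a rough budget estimate. This sketch uses only the numbers quoted in this list (about $0.01 to $0.10 per query and about $0.015 per 1000 TTS characters); the function names are hypothetical and actual pricing may differ.

```python
from typing import Tuple

TTS_USD_PER_1000_CHARS = 0.015    # approximate figure quoted above
QUERY_USD_RANGE = (0.01, 0.10)    # approximate per-query range quoted above


def estimate_tts_cost(response_text: str) -> float:
    """Approximate TTS cost in USD for speaking one response."""
    return len(response_text) / 1000 * TTS_USD_PER_1000_CHARS


def estimate_total(queries: int, avg_response_chars: int) -> Tuple[float, float]:
    """Return a (low, high) USD estimate for a number of spoken queries."""
    tts = queries * avg_response_chars / 1000 * TTS_USD_PER_1000_CHARS
    low, high = QUERY_USD_RANGE
    return queries * low + tts, queries * high + tts
```

For example, 100 queries with 500-character spoken answers come to roughly $1.75 to $10.75 under these assumptions.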
- Add new role to Supabase role_prompts table
- Define role name and comprehensive system prompt
- The Manager will automatically consider it for task assignment
Edit the n8n workflow to:
- Change complexity scoring criteria
- Adjust evaluation dimensions
- Add new tools or integrations
- Modify synthesis logic
This project is provided as-is for educational and development purposes.
Built with:
- Flask
- speech_recognition
- OpenAI API
- n8n workflow automation
- Supabase
- pyttsx3 / pygame