A flexible, conversational voice companion bot framework for Python. Easily build your own AI assistant by plugging in any LLM (OpenAI, Gemini, etc.) and using built-in voice tools. Great for personal productivity, home automation, or just having a friendly AI to talk to!
- Conversational AI: Integrate any LLM (OpenAI, Gemini, etc.) for smart, natural conversations.
- Speech Recognition: Uses Whisper and SpeechRecognition for accurate voice input.
- Text-to-Speech: Responds with high-quality speech using TTS APIs, with a local fallback.
- Extensible Tools: Add your own Python functions as tools (play music, check weather, control apps, etc.).
- Easy API: Just provide an LLM handler and start your bot!
- Python 3.8+
- System dependencies for audio:
  - Linux: sudo apt-get install portaudio19-dev ffmpeg
  - macOS: brew install portaudio ffmpeg
  - Windows: Install FFmpeg and ensure it's in your PATH.
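Before installing the package, it can help to confirm that FFmpeg is actually reachable from your shell. The following is a minimal sketch using only the standard library; the helper name `missing_tools` is made up for illustration and is not part of voice-agent-core. PortAudio is a shared library rather than a command, so this check cannot detect it.

```python
import shutil


def missing_tools(names=("ffmpeg",)):
    """Return the entries of `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required command-line tools were found on PATH.")
```

If `ffmpeg` is reported as missing, revisit the install command for your platform above and open a new terminal so that PATH changes take effect.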
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install voice-agent-core

Create a Python file (e.g., my_bot.py):
from voice_agent_core import VoiceCompanionBot, get_tools
import datetime

def my_llm_handler(text):
    # Answer date/time questions directly; echo everything else.
    if "date" in text or "time" in text:
        now = datetime.datetime.now()
        return {"type": "text_response", "content": f"The current date and time is: {now}"}
    else:
        return {"type": "text_response", "content": "I am your companion bot! You said: " + text}

# Pass the handler and the built-in tools to the bot, then start listening.
bot = VoiceCompanionBot(llm_handler=my_llm_handler, tools=get_tools())
bot.listen_and_respond()

Run it:
python my_bot.py

Speak to your bot! It will respond with the date/time or echo your message.
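Because the handler is a plain function that takes text and returns a dictionary, it can be exercised without a microphone. The sketch below repeats the handler from the example and checks the shape of its replies. `check_response` is a hypothetical helper, and the keys it accepts are inferred only from the examples in this README, not from the library's documentation.

```python
import datetime


def my_llm_handler(text):
    if "date" in text or "time" in text:
        now = datetime.datetime.now()
        return {"type": "text_response", "content": f"The current date and time is: {now}"}
    return {"type": "text_response", "content": "I am your companion bot! You said: " + text}


def check_response(response):
    """Raise ValueError unless `response` looks like the dictionaries used in this README."""
    kind = response.get("type")
    if kind == "text_response":
        if not isinstance(response.get("content"), str):
            raise ValueError("text_response needs a string 'content'")
    elif kind == "function_call":
        call = response.get("call") or {}
        if not isinstance(call.get("name"), str) or not isinstance(call.get("args"), dict):
            raise ValueError("function_call needs call['name'] (str) and call['args'] (dict)")
    else:
        raise ValueError(f"unknown response type: {kind!r}")
    return response


# Feed typed phrases to the handler instead of spoken ones.
for phrase in ["what time is it", "hello there"]:
    reply = check_response(my_llm_handler(phrase))
    print(phrase, "->", reply["content"])
```

Running a check like this before starting the bot separates handler bugs from audio or microphone problems.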
To let your LLM automatically call tools (like playing YouTube or checking weather), use OpenAI's function calling feature:
from dotenv import load_dotenv
load_dotenv()
from voice_agent_core import VoiceCompanionBot, get_tools
import openai
import os
openai.api_key = os.getenv("OPENAI_API_KEY")
functions = [
    {
        "name": "play_on_youtube",
        "description": "Plays a video or song on YouTube.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "The name of the song or video to play."}
            },
            "required": ["query"]
        }
    },
    {
        "name": "pause_or_resume",
        "description": "Pauses or resumes the currently playing media by simulating a spacebar press.",
        "parameters": {"type": "object", "properties": {}}
    },
    {
        "name": "stop_current_task",
        "description": "Stops the current task by closing the active tab in the browser (Ctrl+W).",
        "parameters": {"type": "object", "properties": {}}
    },
    {
        "name": "open_website",
        "description": "Opens a website in the default browser given a valid URL.",
        "parameters": {
            "type": "object",
            "properties": {
                "url": {"type": "string", "description": "The full URL of the website to open. Must start with http or https."}
            },
            "required": ["url"]
        }
    },
    {
        "name": "search_google",
        "description": "Searches for a query on Google.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "The topic or question to search for on Google."}
            },
            "required": ["query"]
        }
    },
    {
        "name": "open_vscode",
        "description": "Opens the Visual Studio Code application.",
        "parameters": {"type": "object", "properties": {}}
    },
    {
        "name": "get_weather",
        "description": "Fetches the current weather for a specified location using the OpenWeatherMap API.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "The city name to get the weather for. For example: 'London' or 'Tokyo'."}
            },
            "required": ["location"]
        }
    }
]

system_prompt = (
    "You are a helpful, friendly voice companion. "
    "If the user asks to play something on YouTube, call the function 'play_on_youtube' with the song or video name as the 'query' argument. "
    "You can also call other tools for media, weather, websites, and more."
)

def openai_llm_handler(text):
    import json

    # Note: openai.ChatCompletion is the legacy interface of the openai package
    # (versions before 1.0); it was removed in openai 1.0.
    response = openai.ChatCompletion.create(
        model="gpt-4o-mini-2024-07-18",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": text}
        ],
        functions=functions,
        function_call="auto"
    )
    message = response.choices[0].message
    function_call = getattr(message, "function_call", None)
    if function_call:
        # The model asked for a tool; its arguments arrive as a JSON string.
        name = function_call.name
        args = function_call.arguments
        args = json.loads(args) if isinstance(args, str) else args
        return {"type": "function_call", "call": {"name": name, "args": args}}
    # Otherwise return the model's plain-text reply.
    return {"type": "text_response", "content": message.content}

bot = VoiceCompanionBot(llm_handler=openai_llm_handler, tools=get_tools())
bot.listen_and_respond()

How it works:
- Each tool is a Python function (see actions.py).
- The LLM can call any tool by name and arguments using OpenAI function calling.
- Add your own tools by writing a function and adding its schema to the functions list.
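
As a rough illustration of the last point, the sketch below defines a made-up tool, `roll_dice`, writes its schema in the same format as the entries in the functions list, and routes a function_call result to it. The `dispatch` helper and the plain dictionary of tools are assumptions for illustration only; how VoiceCompanionBot and get_tools() register and invoke tools internally is not shown in this README.

```python
import random


def roll_dice(sides=6):
    """Hypothetical tool: roll one die and describe the result."""
    return f"You rolled a {random.randint(1, sides)}."


# Schema in the same shape as the entries in the functions list.
roll_dice_schema = {
    "name": "roll_dice",
    "description": "Rolls a single die with the given number of sides.",
    "parameters": {
        "type": "object",
        "properties": {
            "sides": {"type": "integer", "description": "Number of sides on the die."}
        }
    }
}

# Assumed registry: a plain dict mapping tool names to callables.
tools = {"roll_dice": roll_dice}


def dispatch(response, tools):
    """Run the tool named in a function_call response; pass text responses through."""
    if response["type"] != "function_call":
        return response["content"]
    call = response["call"]
    tool = tools.get(call["name"])
    if tool is None:
        return f"Unknown tool: {call['name']}"
    return tool(**call["args"])


print(dispatch({"type": "function_call", "call": {"name": "roll_dice", "args": {"sides": 20}}}, tools))
print(dispatch({"type": "text_response", "content": "Hello!"}, tools))
```

With a tool like this, `roll_dice_schema` would be appended to the functions list so the model knows the tool exists. The function itself would also need to be made available to the bot; check actions.py for how the built-in tools are exposed.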
License: MIT
For more details, see the API reference and examples above. Enjoy building your own AI companion!