Voice Agent Core

A flexible, conversational voice companion bot framework for Python. Easily build your own AI assistant by plugging in any LLM (OpenAI, Gemini, etc.) and using built-in voice tools. Great for personal productivity, home automation, or just having a friendly AI to talk to!

Features

Conversational AI: Integrate any LLM (OpenAI, Gemini, etc.) for smart, natural conversations.
Speech Recognition: Uses Whisper and SpeechRecognition for accurate voice input.
Text-to-Speech: Responds with high-quality voice using TTS APIs and local fallback.
Extensible Tools: Add your own Python functions as tools (play music, check weather, control apps, etc.).
Easy API: Just provide an LLM handler and start your bot!

Requirements

Python 3.8+
System dependencies for audio:
- Linux: sudo apt-get install portaudio19-dev ffmpeg
- macOS: brew install portaudio ffmpeg
- Windows: Install FFmpeg and ensure it's in your PATH.
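
Before installing, it can help to confirm that FFmpeg is actually discoverable on your PATH. The check below is a minimal sketch using only the standard library; the helper name find_missing_tools is illustrative and not part of voice-agent-core.

```python
import shutil


def find_missing_tools(names):
    """Return the executables from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools(["ffmpeg"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("ffmpeg found on PATH.")
```

This only checks for the ffmpeg executable; PortAudio is a library rather than a command, so it is not covered by this check.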

Installation

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install voice-agent-core

Quick Start: Your Own Companion Bot

Create a Python file (e.g., my_bot.py):

from voice_agent_core import VoiceCompanionBot, get_tools
import datetime

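def my_llm_handler(text):
    # Compare in lowercase so "Date" or "TIME" in the transcript still match.
    lowered = text.lower()
    if "date" in lowered or "time" in lowered:
        now = datetime.datetime.now()
        return {"type": "text_response", "content": f"The current date and time is: {now}"}
    # Anything else is echoed back to the user.
    return {"type": "text_response", "content": "I am your companion bot! You said: " + text}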

bot = VoiceCompanionBot(llm_handler=my_llm_handler, tools=get_tools())
bot.listen_and_respond()

Run it:

python my_bot.py

Speak to your bot! It will respond with the date/time or echo your message.
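
Because the handler is a plain function that takes the recognized text and returns a dictionary, it can be exercised without a microphone. The sketch below repeats a handler like the one above so that it runs standalone; it does not import voice_agent_core, and the sample phrases are only illustrative.

```python
import datetime


def my_llm_handler(text):
    # Same contract as the Quick Start handler: text in, response dict out.
    lowered = text.lower()
    if "date" in lowered or "time" in lowered:
        now = datetime.datetime.now()
        return {"type": "text_response", "content": f"The current date and time is: {now}"}
    return {"type": "text_response", "content": "I am your companion bot! You said: " + text}


# Call the handler directly with sample transcripts instead of speaking.
for sample in ["What time is it?", "Hello there"]:
    reply = my_llm_handler(sample)
    print(sample, "->", reply["content"])
```

Checking the handler this way separates problems in your own logic from problems with audio input or output.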

Advanced: Use Any LLM (OpenAI Function Calling Example)

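To let your LLM call tools automatically (for example, playing a YouTube video or checking the weather), use OpenAI's function calling feature. The example below also imports the openai and python-dotenv packages and reads OPENAI_API_KEY from the environment or a .env file, so install them if they are not already present in your environment: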

from dotenv import load_dotenv
load_dotenv()

from voice_agent_core import VoiceCompanionBot, get_tools
import openai
import os

openai.api_key = os.getenv("OPENAI_API_KEY")

functions = [
    {
        "name": "play_on_youtube",
        "description": "Plays a video or song on YouTube.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "The name of the song or video to play."}
            },
            "required": ["query"]
        }
    },
    {
        "name": "pause_or_resume",
        "description": "Pauses or resumes the currently playing media by simulating a spacebar press.",
        "parameters": {"type": "object", "properties": {}}
    },
    {
        "name": "stop_current_task",
        "description": "Stops the current task by closing the active tab in the browser (Ctrl+W).",
        "parameters": {"type": "object", "properties": {}}
    },
    {
        "name": "open_website",
        "description": "Opens a website in the default browser given a valid URL.",
        "parameters": {
            "type": "object",
            "properties": {
                "url": {"type": "string", "description": "The full URL of the website to open. Must start with http or https."}
            },
            "required": ["url"]
        }
    },
    {
        "name": "search_google",
        "description": "Searches for a query on Google.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "The topic or question to search for on Google."}
            },
            "required": ["query"]
        }
    },
    {
        "name": "open_vscode",
        "description": "Opens the Visual Studio Code application.",
        "parameters": {"type": "object", "properties": {}}
    },
    {
        "name": "get_weather",
        "description": "Fetches the current weather for a specified location using the OpenWeatherMap API.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "The city name to get the weather for. For example: 'London' or 'Tokyo'."}
            },
            "required": ["location"]
        }
    }
]

system_prompt = (
    "You are a helpful, friendly voice companion. "
    "If the user asks to play something on YouTube, call the function 'play_on_youtube' with the song or video name as the 'query' argument. "
    "You can also call other tools for media, weather, websites, and more."
)

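def openai_llm_handler(text):
    # Note: openai.ChatCompletion is the legacy interface of the openai
    # package (versions before 1.0); it is not available in 1.0 and later.
    response = openai.ChatCompletion.create(
        model="gpt-4o-mini-2024-07-18",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": text}
        ],
        functions=functions,
        function_call="auto"
    )
    message = response.choices[0].message
    if hasattr(message, "function_call") and message.function_call:
        # The model asked for a tool: forward its name and arguments to the bot.
        import json
        name = message.function_call.name
        args = message.function_call.arguments
        # Arguments normally arrive as a JSON string; decode them into a dict.
        args = json.loads(args) if isinstance(args, str) else args
        return {"type": "function_call", "call": {"name": name, "args": args}}
    else:
        # No tool requested: return the model's reply as plain text.
        return {"type": "text_response", "content": message["content"]}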

bot = VoiceCompanionBot(llm_handler=openai_llm_handler, tools=get_tools())
bot.listen_and_respond()
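
The handler above uses openai.ChatCompletion, which exists only in openai package versions before 1.0. If you are on openai 1.0 or later, a roughly equivalent handler could look like the sketch below. It assumes the tools / tool_calls form of the Chat Completions API; make_handler and to_bot_response are hypothetical helper names, and the returned dictionaries follow the shape used in the example above. The client is passed in as an argument so the snippet itself does not need the openai package to be imported.

```python
import json


def to_bot_response(message):
    """Convert a chat completion message into the dict shape the bot handler returns."""
    tool_calls = getattr(message, "tool_calls", None)
    if tool_calls:
        call = tool_calls[0].function  # only the first requested call is forwarded
        args = json.loads(call.arguments) if call.arguments else {}
        return {"type": "function_call", "call": {"name": call.name, "args": args}}
    return {"type": "text_response", "content": message.content or ""}


def make_handler(client, functions, system_prompt, model="gpt-4o-mini-2024-07-18"):
    """Build an llm_handler around an openai>=1.0 client, e.g. openai.OpenAI()."""
    # Wrap each function schema in the newer "tools" envelope.
    tools = [{"type": "function", "function": spec} for spec in functions]

    def handler(text):
        response = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": text},
            ],
            tools=tools,
            tool_choice="auto",
        )
        return to_bot_response(response.choices[0].message)

    return handler
```

With the real SDK this would be wired up as handler = make_handler(openai.OpenAI(), functions, system_prompt) and then passed to VoiceCompanionBot as llm_handler. Forwarding only the first tool call is a simplification made for this sketch.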

How it works:

Each tool is a Python function (see actions.py).
The LLM can call any tool by name and arguments using OpenAI function calling.
Add your own tools by writing a function and adding its schema to the functions list.
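
As an illustration of the last point, a new tool consists of a function plus a matching schema entry. The sketch below is self-contained and does not use voice_agent_core: roll_dice, its schema, and the dispatch helper are all hypothetical, and how VoiceCompanionBot registers and invokes tools internally is not shown in this README.

```python
import random


def roll_dice(sides=6):
    """Hypothetical tool: roll one die and describe the result."""
    value = random.randint(1, sides)
    return f"You rolled a {value} on a {sides}-sided die."


# Schema entry in the same format as the entries in the functions list.
roll_dice_schema = {
    "name": "roll_dice",
    "description": "Rolls a die with the given number of sides.",
    "parameters": {
        "type": "object",
        "properties": {
            "sides": {"type": "integer", "description": "Number of sides on the die."}
        },
    },
}


def dispatch(response, tools):
    """Illustrative dispatcher: run the tool named in a function_call response."""
    if response["type"] != "function_call":
        return response["content"]
    call = response["call"]
    tool = tools.get(call["name"])
    if tool is None:
        return f"Unknown tool: {call['name']}"
    return tool(**call["args"])


tools = {"roll_dice": roll_dice}
request = {"type": "function_call", "call": {"name": "roll_dice", "args": {"sides": 20}}}
print(dispatch(request, tools))
```

To use such a tool with the OpenAI example, you would append roll_dice_schema to the functions list. The function itself also has to be made available to the bot, which depends on what get_tools() returns.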

License

MIT

For more details, see the API reference and examples above. Enjoy building your own AI companion!
