An interactive AI assistant with real-time voice and text communication, persistent memory, Google Search, and support for camera/screen streaming.
- Real-time Voice — low-latency conversation via the Gemini Live API
- Google Search — AI can search the web during conversation
- Persistent Memory — AI remembers facts about you between sessions
- Modular Tools — add new AI capabilities by dropping a file into tools/
- Single Config — model, voice, speed, and AI persona in one YAML file
- Docker Support — run in an isolated container
```
voise/
├── ai_studio_code.py         # Main application entrypoint
├── config_utils.py           # Config loading, device selection
├── settings/
│   ├── config.yaml           # Main config (model, instructions, devices)
│   └── parameters_guide.md   # Parameter reference
├── tools/                    # Auto-loaded AI tools
│   ├── save_user_memory.py
│   ├── read_user_memory.py
│   └── update_user_memory.py
├── user_store/               # Runtime data (gitignored)
│   ├── memory.md             # Persistent AI memory
│   ├── transcript.md         # Session transcripts
│   └── debug_log.txt         # Debug output (--debug mode)
├── Dockerfile
└── docker-compose.yaml
```
- Python 3.12+
- A Gemini API key — get one free at aistudio.google.com
**Windows (uv)**

1. Install uv:

   ```powershell
   powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
   ```

2. Clone the repo and enter the folder:

   ```
   git clone <repo-url>
   cd voise
   ```

3. Create `.env` from the template and add your key:

   ```
   copy .env.example .env
   notepad .env
   ```

   Set `GEMINI_API_KEY=your_key_here`.

4. Install dependencies and run:

   ```
   uv sync
   uv run python ai_studio_code.py
   ```
**Windows (manual, pip)**

1. Install Python 3.12+ (check "Add to PATH").

2. Install the system dependency for PyAudio:

   ```
   pip install pipwin
   pipwin install pyaudio
   ```

   (Recent PyAudio releases ship Windows wheels, so a plain `pip install pyaudio` may also work.)

3. Install the rest:

   ```
   pip install google-genai opencv-python pillow mss pyyaml python-dotenv
   ```

4. Copy `.env.example` to `.env` and set your API key, then run:

   ```
   python ai_studio_code.py
   ```
**macOS (uv)**

1. Install uv:

   ```
   curl -LsSf https://astral.sh/uv/install.sh | sh
   ```

2. Clone the repo and enter the folder:

   ```
   git clone <repo-url>
   cd voise
   ```

3. Create `.env` from the template and add your key:

   ```
   cp .env.example .env
   open -e .env   # or: nano .env
   ```

   Set `GEMINI_API_KEY=your_key_here`.

4. Install PortAudio (required by PyAudio):

   ```
   brew install portaudio
   ```

5. Install dependencies and run:

   ```
   uv sync
   uv run python ai_studio_code.py
   ```
**macOS (manual, pip)**

1. Install Python 3.12+, e.g. via Homebrew:

   ```
   brew install python@3.12
   ```

2. Install PortAudio and the Python dependencies:

   ```
   brew install portaudio
   pip install google-genai opencv-python pyaudio pillow mss pyyaml python-dotenv
   ```

3. Copy `.env.example` to `.env` and set your API key, then run:

   ```
   python ai_studio_code.py
   ```
```
# Default — voice only
uv run python ai_studio_code.py

# Stream from camera
uv run python ai_studio_code.py --mode camera

# Stream from screen
uv run python ai_studio_code.py --mode screen

# Enable debug logging to user_store/debug_log.txt
uv run python ai_studio_code.py --debug
```

On first run, the app will ask you to select your microphone and speaker. The choices are saved to `.env` automatically.
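After that first run, `.env` holds both the key and the saved device choices. A rough sketch; the device variable names here are illustrative guesses, not confirmed from the code:

```
GEMINI_API_KEY=your_key_here
# written automatically on first run (variable names are hypothetical)
INPUT_DEVICE=USB Microphone
OUTPUT_DEVICE=Speakers
```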
All settings live in settings/config.yaml. Key options:
| Key | Description |
|---|---|
| `model.voice` | AI voice: Zephyr, Puck, Aoede, Charon, Fenrir, Kore |
| `model.speed` | Speech rate: normal, fast, slow |
| `instructions.personality` | Who the AI is and how it speaks |
| `instructions.greeting` | First message sent to the AI on startup |
| `devices.input` | Microphone name (auto-detected on first run) |
| `devices.output` | Speaker name (auto-detected on first run) |
See settings/parameters_guide.md for the full reference.
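Pieced together from the table above, `config.yaml` plausibly has a shape like the following; the nesting and example values are assumptions, not a verbatim copy of the shipped file:

```yaml
model:
  voice: Zephyr          # one of: Zephyr, Puck, Aoede, Charon, Fenrir, Kore
  speed: normal          # normal, fast, slow
instructions:
  personality: "You are a concise, friendly voice assistant."
  greeting: "Hello! Please introduce yourself briefly."
devices:
  input: ""              # microphone name, auto-detected on first run
  output: ""             # speaker name, auto-detected on first run
```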
Drop a new file into `tools/` — it will be auto-loaded on next start.
Each tool file must export:
- A function named after the file (e.g. `my_tool.py` → `def my_tool(...)`)
- A `declaration` dict with the JSON schema for the Gemini API
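A loader for this convention is typically a short `importlib` scan. A minimal sketch under the assumptions above; the actual discovery logic in `ai_studio_code.py` may differ:

```python
import importlib.util
from pathlib import Path


def load_tools(tools_dir: str = "tools"):
    """Collect (function, declaration) pairs from every .py file in tools_dir.

    Assumes the convention above: each file exports a function named after
    the file plus a `declaration` dict for the Gemini API.
    """
    functions, declarations = {}, []
    for path in sorted(Path(tools_dir).glob("*.py")):
        name = path.stem
        spec = importlib.util.spec_from_file_location(name, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)  # run the tool file
        func = getattr(module, name, None)
        decl = getattr(module, "declaration", None)
        if callable(func) and isinstance(decl, dict):
            functions[name] = func
            declarations.append(decl)
    return functions, declarations
```

Presumably the `declarations` list is what gets handed to the Live API session config, while `functions` dispatches incoming tool calls by name.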
Example — `tools/get_weather.py`:

```python
def get_weather(city: str):
    # your implementation
    return {"temperature": "22°C", "condition": "Sunny"}


declaration = {
    "name": "get_weather",
    "description": "Returns current weather for a given city.",
    "parameters": {
        "type": "OBJECT",
        "properties": {
            "city": {"type": "STRING", "description": "City name"}
        },
        "required": ["city"]
    }
}
```

Build and run with Docker:

```
# Build
docker build -t voise-app .

# Run with Docker Compose
docker-compose up
```

Note: Audio and video passthrough in Docker requires additional setup (PulseAudio or device permissions on Linux). Native run is recommended for development.
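On Linux, one common pattern for the audio passthrough mentioned above is to map the ALSA devices or the host PulseAudio socket into the container. The service name and paths below are assumptions about this project's compose file, not a tested configuration:

```yaml
# hypothetical docker-compose override for Linux audio passthrough
services:
  voise:                                # service name assumed
    devices:
      - /dev/snd                        # ALSA sound devices
    volumes:
      - ${XDG_RUNTIME_DIR}/pulse/native:/tmp/pulse/native
    environment:
      - PULSE_SERVER=unix:/tmp/pulse/native
```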