Boddi is a local-first, open-source, agentic AI companion built in Python. It runs entirely on your machine, uses open-source models, and supports real-time voice interaction, expressive visuals, and interruption-aware conversation — a serious foundation, not a demo.
Boddi is inspired by friendly companions like BMO, but engineered with clean architecture, privacy-first principles, and extensibility in mind.
Boddi is:
- 🧠 Agentic — it reasons, routes intents, and uses tools
- 🎧 Voice-enabled — speaks and listens locally (offline)
- ⛔ Interruption-aware — you can cut it off mid-sentence
- 🎭 Expressive — visual states like thinking, smiling, talking
- 🔒 Privacy-first — no cloud, no telemetry, no data leaves your machine
- 🧩 Extensible — designed to grow without rewrites
Boddi is not:
- a cloud chatbot wrapper
- a fake UI demo
- a prompt-only toy
At runtime, Boddi behaves like a real conversational system:
- Listens through the microphone
- Detects speech activity (VAD)
- Transcribes speech locally (STT)
- Detects the wake word
- Routes intent through an agentic loop
- Uses tools (LLM, web search, tasks)
- Responds with:
  - instant micro voice clips ("On it!", "Sorry", etc.)
  - real-time spoken responses (TTS)
- Updates visual expressions based on state
All of this happens locally.
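The steps above can be sketched as a chain of stages, where any stage can drop the turn (for example, silence or a missing wake word). This is a minimal pure-Python illustration; the function names and string-based stand-ins are hypothetical, not Boddi's actual API:

```python
from typing import Callable, List, Optional

# Each stage either transforms the data or drops the turn by returning None.
Stage = Callable[[str], Optional[str]]

def detect_speech(audio: str) -> Optional[str]:
    # Stand-in for Silero VAD: drop silent input.
    return audio if audio.strip() else None

def transcribe(audio: str) -> Optional[str]:
    # Stand-in for local Whisper STT.
    return audio.lower()

def check_wake_word(text: str) -> Optional[str]:
    # Only continue if the utterance addresses Boddi.
    return text if "boddi" in text else None

def run_pipeline(stages: List[Stage], audio: str) -> Optional[str]:
    data: Optional[str] = audio
    for stage in stages:
        if data is None:
            return None
        data = stage(data)
    return data

pipeline = [detect_speech, transcribe, check_wake_word]
print(run_pipeline(pipeline, "Hey Boddi, what time is it?"))
# → hey boddi, what time is it?
```

The real pipeline works on audio buffers rather than strings, but the control flow is the same: each stage can short-circuit the turn.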
```
[ Microphone ]
      ↓
[ Voice Activity Detection (Silero) ]
      ↓
[ Speech-to-Text (Whisper) ]
      ↓
[ Wake Word Detection ]
      ↓
[ Agentic Loop ]
  ├─► Intent Detection
  ├─► Tool Routing
  │     ├─► Ollama (LLM)
  │     ├─► Web Search (DDGS)
  │     └─► User Tasks
      ↓
[ Response Planner ]
  ├─► Micro Voice Clip (instant)
  └─► Piper TTS (real-time speech)
      ↓
[ Visual State Update ]
```
Boddi intentionally uses two voice paths.
**Micro voice clips (instant)**

Used for:
- greetings
- acknowledgements ("On it!", "Okay")
- apologies
- appreciation
- sign-off
These are:
- pre-generated WAV clips
- generated once using Piper TTS
- played instantly at runtime
- effectively zero latency
This makes Boddi feel responsive and alive.
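A minimal sketch of how such a clip registry might look — the paths, intent names, and `pick_clip` helper here are illustrative, not Boddi's actual asset layout:

```python
import random
from pathlib import Path
from typing import Dict, List, Optional

# Hypothetical layout: pre-generated Piper WAV clips keyed by intent,
# with several variants per intent so responses don't sound repetitive.
MICRO_CLIPS: Dict[str, List[Path]] = {
    "acknowledge": [Path("assets/clips/on_it.wav"), Path("assets/clips/okay.wav")],
    "apology": [Path("assets/clips/sorry.wav")],
    "greeting": [Path("assets/clips/hello.wav")],
}

def pick_clip(intent: str) -> Optional[Path]:
    """Pick a random pre-rendered clip for an intent, or None if none exists."""
    clips = MICRO_CLIPS.get(intent)
    return random.choice(clips) if clips else None

# At runtime the chosen clip would be handed to the audio backend, e.g.
# simpleaudio.WaveObject.from_wave_file(str(clip)).play()
clip = pick_clip("acknowledge")
```

Because the WAVs are rendered once ahead of time, playback is a file read rather than a synthesis call.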
**Real-time TTS (Piper)**

Used for:
- answering questions
- explanations
- summaries
- long-form responses
Powered by:
- Piper TTS
- Voice: en_GB / semaine / medium
- Fully offline
- Interruptible
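One common way to make streamed speech interruptible is to play it in small chunks and check a shared stop flag between chunks. This is a hedged sketch of that pattern, not Boddi's actual implementation; the class and method names are hypothetical:

```python
import threading

class InterruptibleSpeaker:
    """Plays speech chunk-by-chunk; another thread can cut it off."""

    def __init__(self) -> None:
        self.stop_flag = threading.Event()

    def interrupt(self) -> None:
        # Called from the listening thread when the user starts talking.
        self.stop_flag.set()

    def speak(self, chunks, play_chunk) -> int:
        """Play chunks until done or interrupted; returns chunks played."""
        self.stop_flag.clear()
        played = 0
        for chunk in chunks:
            if self.stop_flag.is_set():
                break
            play_chunk(chunk)  # in Boddi this would play Piper TTS audio
            played += 1
        return played

speaker = InterruptibleSpeaker()
out = []
speaker.speak(["Hello", " there"], out.append)
```

Checking the flag between chunks keeps interruption latency bounded by the chunk duration, without needing to kill the audio backend mid-write.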
Boddi includes a lightweight visual layer implemented with Python UI primitives.
Visuals are state-driven, not decorative.
| Agent State | Visual Expression |
|---|---|
| idle | blinking / neutral |
| listening | attentive |
| thinking | confused / thinking |
| speaking | talking |
| success | smiling |
| error | confused |
| sleep | eyes closed |
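A state-to-expression mapping like the table above is naturally a lookup table in code. A minimal sketch, assuming an `AgentState` enum and string expression names (both illustrative, not Boddi's actual identifiers):

```python
from enum import Enum

class AgentState(Enum):
    IDLE = "idle"
    LISTENING = "listening"
    THINKING = "thinking"
    SPEAKING = "speaking"
    SUCCESS = "success"
    ERROR = "error"
    SLEEP = "sleep"

# Mirrors the table above; expression names are illustrative.
EXPRESSIONS = {
    AgentState.IDLE: "neutral",
    AgentState.LISTENING: "attentive",
    AgentState.THINKING: "thinking",
    AgentState.SPEAKING: "talking",
    AgentState.SUCCESS: "smiling",
    AgentState.ERROR: "confused",
    AgentState.SLEEP: "eyes_closed",
}

def expression_for(state: AgentState) -> str:
    # Fall back to neutral so an unknown state never breaks the UI.
    return EXPRESSIONS.get(state, "neutral")
```

Keeping the mapping in one table means adding an expression touches a single place instead of scattered UI code.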
Users can configure soft background colors:
- yellow
- green
- blue
- red
- orange
- black (soft dark)
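Since the stack uses YAML for configuration, such a choice would plausibly live in the config file. A hypothetical fragment — the key names are illustrative, not Boddi's actual schema:

```yaml
# Hypothetical Boddi config sketch; key names are illustrative.
visual:
  background: green   # yellow | green | blue | red | orange | black
```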
Boddi is built around a custom event-driven agent loop.
Key responsibilities:
- turn awareness (who is speaking)
- interruption handling
- state transitions
- intent routing
- tool execution
No heavy frameworks are required in v0.1 — the core logic is transparent and hackable.
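The responsibilities above amount to a small state machine fed by events. A minimal sketch of that idea, with entirely hypothetical event and state names — including the interruption case, where user speech during playback sends the agent back to listening:

```python
import queue

# (state, event) -> next state; unknown pairs leave the state unchanged.
TRANSITIONS = {
    ("idle", "speech_start"): "listening",
    ("listening", "speech_end"): "thinking",
    ("thinking", "response_ready"): "speaking",
    ("speaking", "speech_start"): "listening",  # user interrupts Boddi
    ("speaking", "playback_done"): "idle",
}

def run_loop(events: "queue.Queue[str]", state: str = "idle") -> str:
    """Drain the event queue, applying transitions; returns final state."""
    while not events.empty():
        event = events.get()
        state = TRANSITIONS.get((state, event), state)
    return state

q = queue.Queue()
for e in ["speech_start", "speech_end", "response_ready", "speech_start"]:
    q.put(e)
print(run_loop(q))  # → listening (the user interrupted mid-response)
```

A transition table like this keeps turn awareness and interruption handling explicit and easy to audit, which is the point of avoiding heavy frameworks.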
The agentic loop routes to three tool backends:

- **Ollama (LLM)**
  - User-selected open-source LLMs
  - Fully local inference
- **Web Search (DDGS / DuckDuckGo)**
  - No API keys
  - Privacy-friendly
  - Used only when needed
- **User Tasks**
  - User-defined triggers
  - Configurable actions
  - Designed for automation and repetition
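A user task pairing a trigger with an action could take a shape like the following. This is a sketch only; the `UserTask` dataclass and its field names are hypothetical, not Boddi's actual task schema:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class UserTask:
    trigger: str                # phrase that fires the task
    action: Callable[[], str]   # configurable action to run when triggered

def match_tasks(utterance: str, tasks: List[UserTask]) -> List[str]:
    """Run every task whose trigger phrase appears in the utterance."""
    text = utterance.lower()
    return [t.action() for t in tasks if t.trigger in text]

tasks = [UserTask("good morning", lambda: "morning routine started")]
print(match_tasks("Boddi, good morning!", tasks))
# → ['morning routine started']
```

Substring matching is the simplest possible trigger; a real implementation could swap in intent classification without changing the task shape.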
| Layer | Technology |
|---|---|
| Language | Python 3.9+ |
| LLM Runtime | Ollama |
| STT | Whisper (offline) |
| VAD | Silero VAD |
| TTS | Piper (semaine voice) |
| Audio Playback | simpleaudio |
| UI | Tkinter (v0.1) |
| Web Search | DDGS |
| Config | YAML |
```
boddi/
├── core/       # agent loop, state, intent logic
├── audio/      # mic, VAD, STT, TTS, clips
├── llm/        # Ollama integration
├── tools/      # web search, tasks
├── visual/     # expressions & UI
├── wake/       # wake word logic
├── assets/     # audio clips & defaults
├── scripts/    # dev utilities
└── cli.py      # entry point
```
Most assistants today are:
- cloud-dependent
- opaque
- difficult to extend
- privacy-invasive
Boddi exists to be:
- local
- transparent
- hackable
- respectful of users
It is meant to grow with the community, not be rewritten every version.
Boddi is intentionally early and open.
Contributions are welcome in:
- voice improvements
- visual expressions
- task templates
- documentation
- performance & stability
This project is built to be understood, not just used.
- Version: v0.1
- Scope: foundational runtime
- Focus: correctness, architecture, extensibility
Boddi is not trying to compete with cloud assistants. It is building something different:
A local, expressive, agentic AI companion — engineered properly from day one.