Full guide & tutorial: xrchris.com/projects/openlexa
A real-time AI voice assistant running on a Raspberry Pi Zero 1.1, powered by the OpenAI Realtime API. Say a wake word to activate it, speak naturally, and it responds through a USB speaker. An animated robot face renders on an HDMI display.
```
┌──────────────────────────────────────┐
│                (eyes)                │
├──────────────────────────────────────┤
│   Hello! How can I help you today?   │
└──────────────────────────────────────┘
```
- Wake-word activation – say "Computer" to wake; sleeps automatically after inactivity
- Fully offline wake-word detection – Porcupine runs locally, ~1% CPU on Pi Zero
- Real-time conversation via OpenAI gpt-4o-realtime-preview
- Multilingual – responds in English or Korean depending on the speaker
- Retro robot face on HDMI display – eyes close when sleeping, open when active
- Echo prevention – the mic is muted while the AI speaks, and the echo buffer is flushed afterwards
- Auto-reconnect – transparently reconnects if the WebSocket drops
- Graceful degradation – runs headless (no display) without any code changes
| Component | Details | ~Price |
|---|---|---|
| Computer | Raspberry Pi Zero 2 W | ~15 € |
| Audio Option A | Soundcore Mini – Bluetooth speaker + mic | ~20 € |
| Audio Option B | USB speaker bar (plug-and-play) | ~15 € |
| Display (optional) | 3.5" HDMI display | ~20 € |
| MicroSD + Power | MicroSD 16 GB+ & USB PSU | ~13 € |
| Audio server | PipeWire | – |
| OS | Raspberry Pi OS (Bookworm) | – |
Affiliate links – buying through these supports the project at no extra cost to you.
```
SLEEPING
  Porcupine (offline) ◀── pacat --record 16kHz ◀── USB Mic
        │ "Computer" detected
        ▼
ACTIVE
  USB Mic ──▶ pacat --record 24kHz ──▶ AudioRecorder.queue
                                            │
                                       send_audio()
                                            │
                                    OpenAI Realtime API
                                            │
                                     receive_events()
                                            │
                                       AudioPlayer
                                            │
  USB Speaker ◀── pacat --playback 24kHz ◀──┘

  [15s inactivity] ──▶ back to SLEEPING
```
launcher.py – boot-time launcher: waits for the USB mic, shows a countdown, then exec's into main.py
main.py – wake-word loop, WebSocket session, audio I/O, event handling
display.py – Pygame rendering loop (daemon thread)
Running continuously connected to OpenAI is expensive and wasteful. A local wake-word detector lets the device sleep (no WebSocket, no API cost) until the user actually wants to speak.
Porcupine (by Picovoice) was chosen because:
- It ships a pre-compiled ARM binary that runs on ARMv6 (Pi Zero 1.1)
- CPU usage is ~1%, leaving the Pi's single core free for audio and WebSocket I/O
- It works fully offline – no network call is needed for wake-word detection
- The free tier includes built-in keywords: computer, jarvis, porcupine, bumblebee, and more
- Custom keywords (.ppn files) can be trained for free at console.picovoice.ai
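The detection loop is small enough to sketch in full. This is a minimal sketch, assuming the real pvporcupine package and pacat are installed; the helper names (`unpack_pcm`, `wake_word_loop`) are illustrative, not the project's actual function names:

```python
import struct
import subprocess


def unpack_pcm(chunk: bytes) -> tuple:
    """Convert raw little-endian 16-bit PCM bytes to a tuple of ints."""
    return struct.unpack("<%dh" % (len(chunk) // 2), chunk)


def wake_word_loop(access_key: str) -> None:
    """Block until the wake word is heard (hardware-dependent sketch)."""
    import pvporcupine  # imported here: needs an access key and ARM binary

    porcupine = pvporcupine.create(access_key=access_key, keywords=["computer"])
    frame_bytes = porcupine.frame_length * 2  # 16-bit mono samples
    mic = subprocess.Popen(
        ["pacat", "--record", "--rate=%d" % porcupine.sample_rate,
         "--format=s16le", "--channels=1"],
        stdout=subprocess.PIPE)
    try:
        while True:
            chunk = mic.stdout.read(frame_bytes)
            if len(chunk) < frame_bytes:
                break  # mic stream ended
            # process() returns the keyword index, or -1 for no detection
            if porcupine.process(unpack_pcm(chunk)) >= 0:
                print("wake word detected")
                break
    finally:
        mic.terminate()
        porcupine.delete()
```

Porcupine consumes fixed-size frames (`frame_length` samples at `sample_rate`, i.e. 16 kHz), which is why the sleeping state records at 16 kHz while the active pipeline uses 24 kHz.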
The OpenAI Realtime API provides speech-to-text, language model inference, and text-to-speech in a single persistent WebSocket connection. This eliminates the need to chain three separate services (Whisper β GPT β TTS) and dramatically reduces latency. It also handles voice activity detection (VAD) server-side, so no local VAD library is needed.
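Over that WebSocket, audio and configuration travel as JSON events. The sketch below shows the two most important frames, a hedged approximation based on the Realtime API's event names (`input_audio_buffer.append`, `session.update`, `server_vad`); the helper function names are illustrative:

```python
import base64
import json


def audio_append_event(pcm: bytes) -> str:
    """Frame a chunk of 16-bit PCM as an audio-append event."""
    return json.dumps({
        "type": "input_audio_buffer.append",
        "audio": base64.b64encode(pcm).decode("ascii"),
    })


def session_update_event(voice: str, instructions: str,
                         threshold: float = 0.5) -> str:
    """Configure voice, system prompt, and server-side VAD in one event."""
    return json.dumps({
        "type": "session.update",
        "session": {
            "voice": voice,
            "instructions": instructions,
            # server_vad means no local VAD library is needed
            "turn_detection": {"type": "server_vad", "threshold": threshold},
        },
    })
```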
Raspberry Pi OS Bookworm ships PipeWire as the default audio server. It handles resampling (USB devices run at 48 kHz natively; our pipeline uses 16/24 kHz) transparently. pacat (PulseAudio-compatible client) works directly against PipeWire via its PulseAudio compatibility layer.
Important: PipeWire's default source may be set to a .monitor (speaker loopback) rather than the real microphone input. The code explicitly queries pactl list sources short to find the first alsa_input.* device and passes it to pacat via --device=, bypassing this issue.
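The source-selection logic amounts to parsing the tab-separated output of pactl. A minimal sketch (`pick_mic_source` and `current_mic` are illustrative names, not the project's):

```python
import subprocess


def pick_mic_source(pactl_output: str):
    """Return the first real ALSA input, skipping .monitor loopbacks."""
    for line in pactl_output.splitlines():
        fields = line.split("\t")  # pactl short format: index, name, module, ...
        if len(fields) >= 2 and fields[1].startswith("alsa_input."):
            return fields[1]
    return None


def current_mic():
    """Query PipeWire (via the PulseAudio layer) for the real microphone."""
    out = subprocess.run(["pactl", "list", "sources", "short"],
                         capture_output=True, text=True).stdout
    return pick_mic_source(out)
```

The returned name is then passed to pacat as `--device=<name>`, so recording never falls back to the speaker's monitor stream.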
Python audio libraries (PyAudio, sounddevice) require compiled native extensions and often have dependency conflicts on Raspberry Pi OS. pacat is a standard system tool, always available where PipeWire/PulseAudio is installed. It communicates via subprocess stdin/stdout, which is reliable, portable, and adds no Python dependencies.
USB audio devices suspend themselves when idle to save power. The first ~200 ms of audio after a period of silence gets "eaten" by the device waking up. All sound effects (startup chime, wake acknowledgement) prepend 300 ms of silence before the actual audio, ensuring the device is active before the tone begins.
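The padding itself is just a run of zero samples in front of the clip. A sketch of the arithmetic (the function name is illustrative): at 24 kHz, 16-bit mono, 300 ms is 7,200 samples, i.e. 14,400 bytes of silence.

```python
def with_wakeup_padding(pcm: bytes, rate: int = 24000,
                        pad_ms: int = 300) -> bytes:
    """Prepend pad_ms of 16-bit mono silence so a suspended USB audio
    device has time to wake up before the audible part of the clip."""
    pad_samples = rate * pad_ms // 1000  # 24000 * 300 / 1000 = 7200 samples
    return b"\x00" * (pad_samples * 2) + pcm  # 2 bytes per 16-bit sample
```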
After the AI finishes speaking (response.done), a 15-second inactivity timer starts. If the user doesn't speak within that window, the WebSocket is closed and the device returns to the sleeping (wake-word) state. The timer is cancelled while the AI is speaking (so long responses don't trigger a premature sleep) and reset whenever the user starts talking.
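The arm/cancel/reset cycle described above maps naturally onto `threading.Timer`. A minimal sketch (the class name is illustrative, not the project's actual structure):

```python
import threading


class InactivityTimer:
    """Restartable sleep timer: armed after each response.done, cancelled
    while the AI is speaking, re-armed (reset) when the user talks."""

    def __init__(self, timeout: float, on_timeout):
        self._timeout = timeout
        self._on_timeout = on_timeout  # e.g. close WebSocket, go to sleep
        self._timer = None

    def arm(self):
        """(Re)start the countdown; any previous countdown is discarded."""
        self.cancel()
        self._timer = threading.Timer(self._timeout, self._on_timeout)
        self._timer.daemon = True
        self._timer.start()

    def cancel(self):
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None
```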
The USB Speaker Bar's microphone is physically close to its speaker, making acoustic echo a problem. When the AI starts speaking, the microphone is muted in software (recorder.muted = True). After the AI finishes:
- A 2.5-second silence allows the room echo to decay.
- The microphone queue is flushed to discard any residual echo already captured.
- The microphone is unmuted.
This project runs on a Raspberry Pi Zero 1.1 – single-core ARMv6 @ 1 GHz, 512 MB RAM, no GPU. Rendering at full 800×480 every frame would saturate the CPU and starve the audio pipeline. By rendering animated elements on a 200×120 surface and scaling 4× with pygame.transform.scale, the pixels touched per frame are reduced to 6.25% of full resolution. Combined with dirty-flag rendering, the display thread consumes a negligible fraction of CPU.
The 4× upscale creates a visible pixel grid that gives the robot face a retro LED-matrix aesthetic.
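The dirty-flag half of that optimisation is library-agnostic and can be sketched without pygame. A hypothetical renderer skeleton (the actual draw call, scale, and blit would go where the comment indicates):

```python
class FaceRenderer:
    """Redraw the small surface only when state changed; otherwise the
    frame is skipped entirely and the previous frame stays on screen."""

    LOW_W, LOW_H = 200, 120  # 200*120 / (800*480) = 6.25% of the pixels

    def __init__(self):
        self.state = "sleeping"
        self.dirty = True       # force an initial draw
        self.draw_calls = 0     # instrumentation for this sketch

    def set_state(self, state: str) -> None:
        self.state = state
        self.dirty = True       # something changed -> redraw next frame

    def tick(self) -> bool:
        """One pass of the render loop; returns True if it drew."""
        if not self.dirty:
            return False        # nothing changed: touch zero pixels
        self.draw_calls += 1    # draw_face() + 4x scale + blit go here
        self.dirty = False
        return True
```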
| State | Eyes | When |
|---|---|---|
| sleeping | Closed (horizontal lines) | Waiting for wake word |
| idle | Open, pupils wandering | Session active, waiting for user |
| listening | Open | User is speaking |
| speaking | Open | AI is speaking |
```
sudo apt update
sudo apt install -y pipewire pipewire-pulse fonts-nanum python3-pygame
pip3 install --break-system-packages -r requirements.txt
```
- Create a free account at console.picovoice.ai
- Copy your Access Key from the dashboard
- Paste it into main.py → PORCUPINE_ACCESS_KEY
Edit main.py and set:
```python
OPENAI_API_KEY = "sk-..."        # Your OpenAI API key
PORCUPINE_ACCESS_KEY = "..."     # Your Picovoice access key (free)
WAKE_WORD = "computer"           # Built-in keyword, or "custom"
WAKE_WORD_MODEL_PATH = ""        # Path to .ppn file if WAKE_WORD = "custom"
INACTIVITY_TIMEOUT = 15          # Seconds of silence before going back to sleep
VOICE = "verse"                  # AI voice (alloy, echo, nova, shimmer, verse, ...)
INSTRUCTIONS = "..."             # System prompt / personality
```
Built-in free keywords (no .ppn file needed):
computer, jarvis, porcupine, bumblebee, alexa, grasshopper, blueberry, grapefruit, terminator, hey barista, americano, picovoice
Custom keyword (e.g. "Hey Peter"):
Go to console.picovoice.ai → Wake Word → create your keyword → download the .ppn file for Raspberry Pi → set WAKE_WORD = "custom" and WAKE_WORD_MODEL_PATH = "/path/to/file.ppn".
```
sudo nano /etc/systemd/system/openlexa.service
```

```ini
[Unit]
Description=OpenLexa AI Voice Assistant
After=network-online.target bluetooth.target sound.target
Wants=network-online.target

[Service]
User=pi
WorkingDirectory=/home/pi/ElevenLexa
ExecStart=/usr/bin/python3 /home/pi/ElevenLexa/launcher.py
Restart=on-failure
RestartSec=5
Environment=XDG_RUNTIME_DIR=/run/user/1000
Environment=DISPLAY=:0

[Install]
WantedBy=multi-user.target
```

```
sudo systemctl enable openlexa.service
sudo systemctl start openlexa.service
```

Why launcher.py instead of main.py directly? PipeWire is a user-level service and may not be fully initialised when the system service starts. launcher.py polls pactl list sources until the USB microphone (alsa_input.*) appears, then shows a countdown on the display before handing off to main.py via os.execv. This eliminates the race condition where Porcupine starts with a non-functional audio source and only reacts to very loud sounds.
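The polling-then-exec pattern can be sketched like this; a minimal sketch with illustrative names (`wait_for_mic`, `hand_off`), not the project's actual launcher code. The `poll` callable is injectable so the wait loop can be exercised without hardware:

```python
import os
import subprocess
import sys
import time


def wait_for_mic(poll=None, interval: float = 1.0,
                 timeout: float = 60.0) -> bool:
    """Poll until a real alsa_input.* source appears, or time out."""
    if poll is None:
        def poll():
            out = subprocess.run(["pactl", "list", "sources", "short"],
                                 capture_output=True, text=True).stdout
            return "alsa_input." in out
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if poll():
            return True
        time.sleep(interval)
    return False


def hand_off():
    """Replace this process with main.py, so systemd keeps supervising
    a single PID instead of a launcher plus a child."""
    os.execv(sys.executable, [sys.executable, "main.py"])
```

Using `os.execv` (rather than `subprocess`) means `Restart=on-failure` in the unit file applies directly to the assistant itself.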
```
python3 main.py
```

```
OpenLexaPi/
├── launcher.py       # Boot launcher – waits for USB mic, countdown, then starts main.py
├── main.py           # Main application – wake-word loop, audio pipeline, OpenAI session
├── display.py        # Pygame HDMI display (optional, auto-detected)
├── requirements.txt  # Python dependencies
├── README.md         # This file
└── archive/          # Old debug scripts (not needed for running)
```
| What | Where | How |
|---|---|---|
| Wake word | main.py → WAKE_WORD | Built-in keyword name or "custom" |
| Custom wake word | main.py → WAKE_WORD_MODEL_PATH | Path to .ppn file from Picovoice Console |
| Inactivity timeout | main.py → INACTIVITY_TIMEOUT | Seconds before returning to sleep |
| AI personality | main.py → INSTRUCTIONS | Edit the system prompt string |
| Voice | main.py → VOICE | Any OpenAI Realtime voice name |
| Languages | main.py → INSTRUCTIONS | Change language instructions |
| VAD sensitivity | main.py → turn_detection.threshold | 0.0–1.0, lower = more sensitive |
| Eye colours | display.py → colour constants | RGB tuples at the top of the file |
| Display layout | display.py → EYE_AREA_H, TEXT_AREA_Y | Adjust the eye/text split point |
MIT – free to use, modify, and distribute.