🧊 Qube – Personal Voice Assistant

Python 3.10+ · License: MIT · ASR: Whisper-small · LLM: Qwen2.5-0.5B · TTS: Kokoro-ONNX

Qube is a privacy‑first, fully local voice assistant that runs on your own machine.
It listens, understands, answers questions, plays music, sets timers/alarms, searches the web, and much more – all without sending your data to the cloud.

🎤 “Hey Cube, what's the weather like in Rome?”
🔊 “In Rome it's 22°C with light rain, wind 12 km/h, humidity 68%.”

✨ Features

  • 🎙️ Voice Wake Word – say “Hey Cube” (customisable) to activate
  • 🧠 Local LLM – Qwen2.5‑0.5B‑Instruct for natural conversations and reasoning
  • 🔊 Speech recognition – Whisper‑small (multi‑language)
  • 🗣️ Text‑to‑speech – fast neural voice with Kokoro‑ONNX (fallback to espeak)
  • 🎵 Music from YouTube – yt‑dlp + mpv (streams audio only)
  • 🌍 Web search + RAG – DuckDuckGo search as context for the LLM
  • ⏱️ Timers & Alarms – set, cancel, list active timers, persistent alarms
  • 📝 Voice notes – save, read, delete (stored locally in notes.json)
  • 🌤️ Weather – real‑time conditions via Open‑Meteo (no API key)
  • 📊 Web Dashboard – beautiful Flask dashboard to monitor stats, manage skills, and configure Qube live
  • 🔌 Extensible skills – disable/enable any feature from the dashboard (music, web, notes, jokes, translations…)
  • 🎨 Custom sounds – beeps for timer, alarm, startup (auto‑generated)

🖥️ Dashboard Preview

When the dashboard is running (http://localhost:7860), you can:

  • See usage statistics (today/total queries, most used skills)
  • Toggle skills on/off in real‑time
  • Change wake word, assistant name, voice, language, volume
  • Review conversation history
  • Reset statistics or clear history

🚀 Quick Start

Requirements

  • OS: Linux (recommended – tested on Ubuntu 22.04/24.04).
    Windows/macOS may work but require adapting audio and TTS commands.
  • Python 3.10 or newer
  • System packages:
sudo apt update
sudo apt install -y espeak-ng mpv alsa-utils   # alsa-utils provides aplay

1. Clone the repository

git clone https://github.com/overcastlab/qube.git
cd qube

2. Install Python dependencies

pip install torch transformers numpy sounddevice soundfile flask \
    kokoro-onnx yt-dlp duckduckgo-search requests scipy
Kokoro‑ONNX also needs its model files in the working directory. If they are missing, download them with:

curl -L -o kokoro-v1.0.onnx "https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/kokoro-v1.0.onnx"
curl -L -o voices-v1.0.bin "https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/voices-v1.0.bin"

💡 GPU support (optional but recommended): install torch with CUDA from pytorch.org.
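As noted in the feature list, Qube falls back to espeak when Kokoro‑ONNX is unavailable. A minimal sketch of that fallback check, assuming the backend is chosen by looking for the two model files in the working directory (the helper name is illustrative, not the real implementation):

```python
from pathlib import Path

# Model files Kokoro-ONNX needs, as downloaded by the curl commands above.
KOKORO_FILES = ("kokoro-v1.0.onnx", "voices-v1.0.bin")

def pick_tts_backend(workdir: str = ".") -> str:
    """Return 'kokoro' if both model files exist in workdir, else 'espeak'."""
    base = Path(workdir)
    if all((base / name).is_file() for name in KOKORO_FILES):
        return "kokoro"
    return "espeak"

print(pick_tts_backend())  # 'espeak' unless the model files were downloaded
```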

3. Run the assistant and the dashboard

Open two terminals.

Terminal 1 – Qube core:

python qube.py
  • By default, it listens for the wake word "hey cube".
  • Speak into your microphone – when the wake word is detected, Qube will beep and await your command.

Terminal 2 – Dashboard:

python qube_dashboard.py

Visit http://localhost:7860 in your browser to configure and monitor Qube in real time.

🔧 You can run the assistant without wake word: python qube.py --no-wake

🎮 Command Line Options

| Argument | Description | Default |
|---|---|---|
| --wake | Custom wake word phrase | "hey cube" (from config) |
| --no-wake | Disable wake word; always listen for a command after a beep | False |
| --lang | Language for ASR (e.g., it, en, es, fr) | From config (or it) |
| --voice | TTS voice name (Kokoro‑ONNX) – e.g., if_sara, af_sky | Best match for language |
| --volume | Playback volume (0.0–2.0) | 1.0 |
| --history | Number of conversation turns to remember | 6 |
| --mic | Input device index (use python -m sounddevice to list) | System default |
| --sounds-dir | Directory for notification .wav files | ./sounds |
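The options above can be mirrored with a small argparse setup. This is an illustrative sketch, not the parser from qube.py (defaults shown here are the documented ones; the real script may load them from cube_config.json instead):

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(description="Qube voice assistant")
    p.add_argument("--wake", default="hey cube", help="wake word phrase")
    p.add_argument("--no-wake", action="store_true", help="skip wake word detection")
    p.add_argument("--lang", default="it", help="ASR language, e.g. it, en, es, fr")
    p.add_argument("--voice", default=None, help="Kokoro-ONNX voice name")
    p.add_argument("--volume", type=float, default=1.0, help="playback volume 0.0-2.0")
    p.add_argument("--history", type=int, default=6, help="conversation turns to keep")
    p.add_argument("--mic", type=int, default=None, help="input device index")
    p.add_argument("--sounds-dir", default="./sounds", help="notification sounds dir")
    return p

args = build_parser().parse_args(["--no-wake", "--volume", "1.5"])
print(args.no_wake, args.volume)  # True 1.5
```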

🧠 How it works

  1. Wake word detection – Qube records short audio chunks, transcribes them with Whisper, and looks for the wake phrase.
  2. Recording – After activation, it captures speech until silence (≥1.8s) or 15s max.
  3. Transcription – Whisper (small) converts the audio to text.
  4. Skill routing – The text is matched against built‑in patterns (timer, music, weather, etc.) using regex.
  5. LLM fallback – If no skill matches, the text is sent to Qwen2.5‑0.5B (optionally with web search context) to generate a reply.
  6. Text‑to‑speech – The response is spoken via Kokoro‑ONNX (or espeak‑ng).
  7. Dashboard integration – Every interaction is logged via HTTP to the Flask dashboard, which stores statistics and history.
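Steps 4–5 above (regex skill routing with an LLM fallback) can be sketched as follows. The patterns and skill names here are illustrative assumptions, not Qube's actual rules:

```python
import re

# Try each skill's pattern in order; if none matches, fall back to the LLM.
SKILL_PATTERNS = [
    ("timer",   re.compile(r"\bset (?:a )?timer\b", re.I)),
    ("weather", re.compile(r"\bweather in (?P<city>\w+)", re.I)),
    ("music",   re.compile(r"\bplay (?P<query>.+)", re.I)),
]

def route(text: str) -> str:
    for skill, pattern in SKILL_PATTERNS:
        if pattern.search(text):
            return skill
    return "llm"  # no skill matched: hand the text to Qwen2.5

print(route("weather in Rome"))      # weather
print(route("explain black holes"))  # llm
```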

🗂️ Configuration Files

  • cube_config.json – persistent settings (wake word, language, voice, skills, volume, etc.)
    Edit manually or through the dashboard.
  • cube_stats.json – dashboard statistics (queries, commands usage, conversation history).
  • notes.json – saved voice notes.
  • sounds/ – auto‑generated beep sounds (timer.wav, alarm.wav, …).
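A sketch of how cube_config.json might be read so that keys missing from the file keep sensible defaults (the key names here are assumptions, not the real schema):

```python
import json
from pathlib import Path

DEFAULTS = {"wake_word": "hey cube", "language": "it", "volume": 1.0}

def load_config(path: str = "cube_config.json") -> dict:
    """Merge the on-disk config over the defaults; missing file means defaults."""
    cfg = dict(DEFAULTS)
    p = Path(path)
    if p.is_file():
        cfg.update(json.loads(p.read_text()))
    return cfg
```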

🛠️ Skills & Example Commands

| Skill | Example |
|---|---|
| Timer | “set timer for 5 minutes 30 seconds” |
| Alarm | “wake me up at 7:15” |
| Weather | “meteo a Parigi” (or “weather in London”) |
| Music | “play lo-fi hip hop radio” (searches YouTube) |
| Notes | “note: buy milk” / “read my notes” |
| Web search | “search for latest AI news” (uses DuckDuckGo + LLM) |
| Wikipedia | “who is Marie Curie” |
| Calculations | “what is 234 * 17.5” / “square root of 49” |
| Unit conversion | “convert 10 km to miles” |
| Translation | “translate hello in Italian” |
| News | “give me today's headlines” |
| Joke | “tell me a joke” |
| Stop | “stop” (interrupts speech and music) |
| Exit | “goodbye” / “exit” |

All skills can be turned on/off live from the dashboard (Settings → Skills).
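As one concrete example, the timer phrase above (“set timer for 5 minutes 30 seconds”) has to be turned into a duration in seconds. A minimal English-only sketch, assuming a regex-based parse like the skill routing described earlier:

```python
import re

def parse_timer(text: str) -> int:
    """Sum every '<number> hour/minute/second' mention into total seconds."""
    units = {"hour": 3600, "minute": 60, "second": 1}
    total = 0
    for amount, unit in re.findall(r"(\d+)\s*(hour|minute|second)s?", text, re.I):
        total += int(amount) * units[unit.lower()]
    return total

print(parse_timer("set timer for 5 minutes 30 seconds"))  # 330
```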

🔧 Troubleshooting

No microphone input

  • Check your input device index:
    python -c "import sounddevice as sd; print(sd.query_devices())"
  • Pass it with --mic 1 (replace with your device number).

Kokoro‑ONNX not working

  • Ensure you have the model files kokoro-v1.0.onnx and voices-v1.0.bin in the working directory.
  • If missing, Qube automatically falls back to espeak-ng.

mpv fails to play audio

  • Install mpv and yt-dlp:
    sudo apt install mpv yt-dlp
  • Try running mpv --version to confirm.

CUDA out of memory

  • The LLM and Whisper together fit on a 4 GB GPU. If you have less, run with --device cpu.

Dashboard cannot connect

  • Ensure qube_dashboard.py is running on the same machine (port 7860).
  • Qube sends logs via urllib – if the dashboard is down, the assistant still works, but stats won't be recorded.
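The best-effort logging described above can be sketched like this, assuming a JSON POST to the dashboard (the /api/log endpoint and payload fields are illustrative, not the real route):

```python
import json
import urllib.request

def log_interaction(query: str, skill: str,
                    url: str = "http://localhost:7860/api/log") -> bool:
    """POST one interaction to the dashboard; swallow errors so the
    assistant keeps working even when the dashboard is down."""
    payload = json.dumps({"query": query, "skill": skill}).encode()
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"})
    try:
        urllib.request.urlopen(req, timeout=2)
        return True
    except OSError:  # connection refused, timeout, etc.: stats are lost
        return False
```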

📄 License

MIT License – free to use, modify, and distribute.

🙏 Acknowledgements

  • OpenAI Whisper – speech recognition
  • Qwen2.5‑0.5B‑Instruct – language model
  • Kokoro‑ONNX – text‑to‑speech
  • yt‑dlp + mpv – music playback
  • Open‑Meteo & DuckDuckGo – weather and web search

Built with ❤️ by OvercastLab for local, private, voice‑controlled AI.
Have fun with Qube! If you like it, ⭐ star the repository.
