Vi is an experimental, 100% offline, voice-activated artificial intelligence assistant built in Python.
- Complete Offline Functionality: No internet connection required. All transcription, processing, routing, and database searching happens entirely on your local machine.
- Retrieval-Augmented Generation (RAG): Ask questions about your personal files. Vi extracts answers from `.txt`, `.pdf`, and `.docx` files using a local vector database.
- Conversational Memory: Vi remembers the recent context of your general chats, allowing for more natural, multi-turn interactions.
- Local Computer Vision: Ask Vi to "look at this" or "describe what you see," and she will capture a webcam frame and generate an image caption.
- Semantic Intent Routing (Zero-Shot): Intelligently routes your prompt based on context and meaning, rather than relying on strict, hardcoded keywords.
- Local STT & TTS: Uses OpenAI's `whisper-tiny` model for local transcription, `pyttsx3` for vocalization, and `openwakeword` for continuous listening.
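The zero-shot routing described above can be sketched with the Hugging Face `transformers` pipeline. The intent labels below are illustrative assumptions, not the label set Vi actually uses, and the checkpoint name is the public `valhalla/distilbart-mnli-12-3` repository:

```python
from transformers import pipeline

# Zero-shot classifier built on the distilbart-mnli-12-3 checkpoint
router = pipeline("zero-shot-classification", model="valhalla/distilbart-mnli-12-3")

# Hypothetical intent labels -- the real label set lives in main.py
intents = ["describe the camera view", "search personal documents",
           "summarize text", "general conversation"]

result = router("what do you see right now", candidate_labels=intents)
print(result["labels"][0])  # highest-scoring intent
```

Because the classifier scores the prompt against label meanings rather than matching keywords, rephrasing the command still routes to the same expert.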
- The Ear: `openwakeword` and `SpeechRecognition` feed audio into a local `Whisper` pipeline.
- The Router: `distilbart-mnli-12-3` dynamically matches the command to an AI expert.
- The Experts:
  - Vision: Passes `cv2` webcam frames to a `Salesforce/BLIP` image-to-text pipeline.
  - RAG: Vectorizes `./local_docs` using `unstructured` and `FAISS`, extracting answers via `RoBERTa-squad2`.
  - NLP: Specialized pipelines handle Sentiment, Summarization, and Context-Aware General Chat.
- The Voice: `pyttsx3` converts the expert's text output into audible speech.
- Python 3.8+
- A working microphone and webcam
- C++ Build Tools (Required on Windows for compiling FAISS and certain dependencies)
- Install base dependencies:

      pip install SpeechRecognition pyaudio transformers torch pyttsx3 numpy openwakeword soundfile opencv-python pillow langchain langchain-community langchain-huggingface faiss-cpu sentence-transformers "unstructured[all-docs]"

  (Note: the `unstructured[all-docs]` extra pulls in dependencies like `pdfminer.six` and `python-docx` automatically; the quotes keep your shell from interpreting the brackets.)

- Add your documents: On first run, Vi automatically creates a `./local_docs` directory in the project root. Drop any `.txt`, `.pdf`, or `.docx` files you want Vi to learn from into this folder.
- "Hey Jarvis... What do you see right now?" -> (Triggers Vision Camera)
- "Hey Jarvis... Search my files to tell me where Vi was created." -> (Triggers RAG search)
- "Hey Jarvis... Summarize this sentence for me..." -> (Triggers Summarization Expert)
Use the official OpenWakeWord Google Colab training notebook to generate a custom `.tflite` model, then update the `wakeword_models` parameter in `main.py`.
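After training, loading the custom model is a small change in `main.py`. This is a configuration sketch; the file path and variable name below are hypothetical:

```python
from openwakeword.model import Model

# Hypothetical path to the .tflite file produced by the Colab notebook
oww_model = Model(wakeword_models=["models/hey_vi.tflite"])
```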