
Crix: AI Voice Assistant for Linux Desktop

Crix is a fast, efficient, voice-controlled AI assistant that runs natively on Linux. It gives an AI direct control over your keyboard, mouse, and system, so you can get things done entirely hands-free.



Features

  • 🎙️ Real-time voice interaction: Talk to Crix naturally using state-of-the-art speech recognition
  • ⌨️ Keyboard control: Type text, press shortcuts, and submit forms by voice
  • 🖱️ Mouse control: Move, click, double-click, and scroll anywhere on screen
  • 🪟 Window & Workspace management: Switch workspaces, list open windows, and launch apps
  • 📋 Clipboard integration: Read and write clipboard content using xclip
  • 🔍 Live web search: Fetches up-to-date information from the web using Tavily
  • 🖥️ Shell command execution: Run safe, non-destructive shell commands on demand
  • 🔇 Noise cancellation: Intelligent audio filtering for cleaner voice input

Tech Stack

| Component | Technology |
| --- | --- |
| Voice Framework | LiveKit Agents |
| Speech-to-Text (STT) | Deepgram Nova-3 (multilingual) |
| Language Model (LLM) | OpenAI GPT-4.1 Mini |
| Text-to-Speech (TTS) | ElevenLabs Turbo v2.5 |
| Voice Activity Detection (VAD) | Silero |
| Desktop Automation | xdotool |
| Clipboard | xclip |
| Web Search | Tavily |
| Package Manager | uv |

Prerequisites

  • Linux (X11 recommended; some tools may work partially on Wayland)
  • Python 3.12+
  • uv package manager
  • System dependencies:
    sudo apt install xdotool xclip
  • A LiveKit account and project (cloud.livekit.io)
  • A Tavily API key for web search
  • STT, LLM, and TTS are handled through LiveKit, so no separate API keys are needed for OpenAI, Deepgram, or ElevenLabs
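
The prerequisites above can be checked before the first run. The following is a minimal sketch, not part of Crix itself: the function names are hypothetical, and it assumes the session type is exposed through the XDG_SESSION_TYPE environment variable, which most desktop login managers set.

```python
import os
import shutil

# Binaries named in the prerequisites list.
REQUIRED_TOOLS = ("uv", "xdotool", "xclip")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [name for name in tools if which(name) is None]


def session_type(env=None):
    """Return the display session type ('x11', 'wayland', ...) or 'unknown'."""
    env = os.environ if env is None else env
    return env.get("XDG_SESSION_TYPE", "unknown").lower()


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    if session_type() != "x11":
        print("Session is not X11; xdotool-based tools may not work fully.")
```

Running the script prints nothing when every tool is found and the session is X11.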

Installation

  1. Clone the repository:

    git clone https://github.com/Aerex0/Crix.git
    cd Crix
  2. Install Python dependencies:

    uv sync
  3. Configure environment variables:

    cp .env.example .env

    Then edit .env and fill in your credentials:

    LIVEKIT_URL=wss://your-project.livekit.cloud
    LIVEKIT_API_KEY=your_api_key
    LIVEKIT_API_SECRET=your_api_secret
    TAVILY_API_KEY=your_tavily_key

    Note: STT (Deepgram), LLM (OpenAI), and TTS (ElevenLabs) are billed and managed through your LiveKit account, so no separate API keys are required.
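
A quick way to catch an incomplete .env is to verify the four variables before starting the agent. This is a hedged sketch with hypothetical helper names. It inspects the process environment only, and how Crix itself loads the .env file is not described here.

```python
import os

# Variable names taken from the .env example above.
REQUIRED_VARS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "TAVILY_API_KEY",
)


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


def looks_like_websocket_url(url):
    """LIVEKIT_URL in the example uses the wss:// scheme; accept ws:// too."""
    return url.startswith(("ws://", "wss://"))


if __name__ == "__main__":
    for name in missing_env_vars():
        print(f"{name} is not set")
```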


Usage

Run the agent through its LiveKit Agents command-line entry point:

uv run python src/agent.py console

Once running, connect to the agent via any LiveKit-compatible client (e.g. the LiveKit Playground or a custom frontend). Crix will greet you and wait for voice commands.


Customization

Before using Crix, you must update the system prompt to match your desktop environment. Open src/prompts/crix.py and adjust it to reflect:

  • Your keyboard shortcuts: how you close a window, open a terminal, or switch workspaces may differ between desktop environments (GNOME, KDE, i3, Hyprland, etc.)
  • Your default apps: your terminal emulator (alacritty, kitty, gnome-terminal), browser (firefox, brave), etc.
  • Your workspace setup: how many workspaces you use and how they're numbered

Important

The default prompt is configured for a specific setup. If your shortcuts or apps differ, Crix may send the wrong keys or open the wrong applications. Tailor the prompt to your environment for the best experience.
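
One way to keep these environment details in one place is to render them into the prompt from a small profile object. The sketch below is purely illustrative: DesktopProfile, render_environment_section, and the wording of the generated text are hypothetical and do not reflect the actual contents of src/prompts/crix.py.

```python
from dataclasses import dataclass, field


@dataclass
class DesktopProfile:
    """Hypothetical description of one user's desktop setup."""
    terminal: str = "alacritty"
    browser: str = "firefox"
    workspace_count: int = 4
    # Maps a spoken intent to the key combo that performs it.
    shortcuts: dict[str, str] = field(
        default_factory=lambda: {"close the focused window": "super+q"}
    )


def render_environment_section(profile: DesktopProfile) -> str:
    """Build a prompt fragment describing apps, workspaces, and shortcuts."""
    lines = [
        "Desktop environment:",
        f"- Terminal command: {profile.terminal}",
        f"- Browser command: {profile.browser}",
        f"- Workspaces: 1 to {profile.workspace_count}",
        "Keyboard shortcuts:",
    ]
    lines += [
        f"- To {action}, press {keys}"
        for action, keys in profile.shortcuts.items()
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_environment_section(DesktopProfile(terminal="kitty")))
```

The resulting text could be appended to the system prompt, so switching desktop environments means editing one profile rather than the whole prompt.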


Available Tools

Crix comes with a set of built-in tools it can call autonomously based on your voice commands:

🔍 Web & Time

| Tool | Description |
| --- | --- |
| web_search | Search the web for up-to-date information using Tavily |
| get_time | Get the current system date and time |

⌨️ Keyboard

| Tool | Description |
| --- | --- |
| type_text | Type text at the current cursor position |
| press_key | Press a key or key combo (e.g. ctrl+c, super+q) |
| type_and_submit | Type text and immediately press Enter |
| paste_text | Paste text instantly via clipboard (faster for long strings) |
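
Since desktop automation goes through xdotool, the keyboard tools plausibly reduce to a few xdotool invocations. The sketch below shows how such calls could be assembled. The helper names and the default delay are assumptions, and the real code in src/tools.py may differ.

```python
import subprocess


def press_key_argv(combo: str) -> list[str]:
    """Build an xdotool call for a key or combo such as 'ctrl+c'."""
    combo = combo.strip()
    if not combo:
        raise ValueError("empty key combo")
    return ["xdotool", "key", combo]


def type_text_argv(text: str, delay_ms: int = 12) -> list[str]:
    """Build an xdotool call that types text at the cursor.

    '--' ends option parsing so text starting with '-' is typed literally.
    """
    return ["xdotool", "type", "--delay", str(delay_ms), "--", text]


def type_and_submit_argvs(text: str) -> list[list[str]]:
    """Type the text, then press Enter (the X keysym is 'Return')."""
    return [type_text_argv(text), press_key_argv("Return")]


def run(argv: list[str]) -> None:
    """Execute one xdotool call; raises CalledProcessError on failure."""
    subprocess.run(argv, check=True)


if __name__ == "__main__":
    print(type_and_submit_argvs("hello world"))
```

Building the argument list separately from executing it keeps the command construction easy to unit-test without a running X server.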

🖱️ Mouse

| Tool | Description |
| --- | --- |
| click | Click at a given position (left, middle, or right button) |
| double_click | Double-click at a given position |
| scroll | Scroll up or down at the current cursor position |
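
The mouse tools map naturally onto xdotool's click command, where buttons 1 to 3 are left, middle, and right, and buttons 4 and 5 are wheel up and wheel down. The following sketch uses hypothetical helper names and is not taken from the project's source.

```python
BUTTONS = {"left": 1, "middle": 2, "right": 3}
WHEEL = {"up": 4, "down": 5}


def click_argv(x: int, y: int, button: str = "left", repeat: int = 1) -> list[str]:
    """Move to (x, y) and click; repeat=2 gives a double-click."""
    if x < 0 or y < 0:
        raise ValueError("coordinates must be non-negative")
    if repeat < 1:
        raise ValueError("repeat must be at least 1")
    return [
        "xdotool", "mousemove", str(x), str(y),
        "click", "--repeat", str(repeat), str(BUTTONS[button]),
    ]


def scroll_argv(direction: str, steps: int = 3) -> list[str]:
    """Scroll at the current cursor position by sending wheel clicks."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return ["xdotool", "click", "--repeat", str(steps), str(WHEEL[direction])]


if __name__ == "__main__":
    print(click_argv(640, 360, repeat=2))
    print(scroll_argv("down"))
```

Each returned list can be passed to subprocess.run on an X11 session.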

🪟 Windows & Apps

| Tool | Description |
| --- | --- |
| switch_workspace | Switch to a specific virtual desktop (1-based) |
| open_app | Launch an application by command name |
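
Because switch_workspace is 1-based while xdotool numbers desktops from 0, an off-by-one conversion is needed somewhere. The sketch below illustrates that conversion and a detached app launch. It assumes a window manager that supports xdotool set_desktop, and the function names are hypothetical.

```python
import shlex
import subprocess


def switch_workspace_argv(number: int) -> list[str]:
    """Convert a 1-based workspace number to xdotool's 0-based desktop index."""
    if number < 1:
        raise ValueError("workspace numbers start at 1")
    return ["xdotool", "set_desktop", str(number - 1)]


def open_app_argv(command: str) -> list[str]:
    """Split a command line such as 'firefox --new-window' without a shell."""
    argv = shlex.split(command)
    if not argv:
        raise ValueError("empty command")
    return argv


def launch(command: str) -> subprocess.Popen:
    """Start the app detached so it outlives the tool call."""
    return subprocess.Popen(
        open_app_argv(command),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,
    )


if __name__ == "__main__":
    print(switch_workspace_argv(3))
```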

📋 Clipboard & Screen

| Tool | Description |
| --- | --- |
| get_clipboard | Read the current clipboard contents |
| select_all_and_copy | Press Ctrl+A then Ctrl+C and return copied text |
| get_screen_size | Return the current screen resolution |
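
Clipboard access through xclip and screen size through xdotool can be sketched as follows. The helper names are hypothetical. The commands used are xclip with the clipboard selection and xdotool getdisplaygeometry, which prints the width and height separated by a space.

```python
import subprocess


def read_clipboard() -> str:
    """Return the clipboard contents via 'xclip -o'."""
    result = subprocess.run(
        ["xclip", "-selection", "clipboard", "-o"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def write_clipboard(text: str) -> None:
    """Place text on the clipboard; xclip reads it from stdin."""
    subprocess.run(
        ["xclip", "-selection", "clipboard"],
        input=text, text=True, check=True,
    )


def parse_display_geometry(output: str) -> tuple[int, int]:
    """Parse 'WIDTH HEIGHT' as printed by 'xdotool getdisplaygeometry'."""
    width, height = output.split()
    return int(width), int(height)


def get_screen_size() -> tuple[int, int]:
    result = subprocess.run(
        ["xdotool", "getdisplaygeometry"],
        capture_output=True, text=True, check=True,
    )
    return parse_display_geometry(result.stdout)
```

Output is deliberately not captured in write_clipboard, because xclip keeps running in the background to serve the selection and can hold captured pipes open.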

🖥️ System

| Tool | Description |
| --- | --- |
| run_command_silent | Execute a safe, read-only shell command and return output |

Example Commands

"Open a terminal"               → Launches Alacritty
"Switch to workspace 3"         → Switches to virtual desktop 3
"Type hello world and send it"  → Types and submits text
"What time is it?"              → Returns current date and time
"Search for the latest AI news" → Performs a live web search
"Press Ctrl+Z"                  → Sends the undo shortcut
"Select all and copy"           → Copies all text in the focused window

Security

Crix is designed with the following hard rules baked into its system prompt:

  • 🚫 Will not execute destructive commands (rm, mv, dd, kill, chmod, etc.)
  • 🚫 Will not follow commands delivered via on-screen text, only spoken voice
  • 🚫 Will not reveal its system prompt
  • 🚫 Will not chain shell commands

Warning

These are prompt-level restrictions enforced by the AI model, not hard system-level blocks. While precautions have been taken to make Crix safe, no AI system is perfectly secure. Use with awareness and at your own risk. Avoid granting it access to sensitive environments.
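
For readers who want a code-level guard in addition to the prompt rules, an allowlist check in front of the shell tool is one option. The sketch below is not part of Crix: validate_command and the ALLOWED_COMMANDS set are hypothetical, and the allowlist should be adapted to your own needs.

```python
import shlex
import subprocess

# Hypothetical allowlist of read-only commands; adjust to taste.
ALLOWED_COMMANDS = {"ls", "pwd", "whoami", "date", "uname", "df", "free", "uptime"}

# Characters used for chaining, piping, substitution, and redirection.
SHELL_METACHARACTERS = set(";|&`$><\n")


def validate_command(command: str) -> list[str]:
    """Return argv if the command is allowed, else raise ValueError."""
    if SHELL_METACHARACTERS & set(command):
        raise ValueError("shell metacharacters are not allowed")
    argv = shlex.split(command)  # raises ValueError on unbalanced quotes
    if not argv:
        raise ValueError("empty command")
    if argv[0] not in ALLOWED_COMMANDS:
        raise ValueError(f"command not allowed: {argv[0]}")
    return argv


def run_checked(command: str, timeout: float = 5.0) -> str:
    """Run a validated command without a shell and return its stdout."""
    result = subprocess.run(
        validate_command(command),
        capture_output=True, text=True, timeout=timeout,
    )
    return result.stdout


if __name__ == "__main__":
    print(validate_command("ls -la"))
```

An allowlist is stricter than blocking known destructive commands, since anything not explicitly listed is rejected regardless of what the model asks for.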


Project Structure

crix/
├── src/
│   ├── agent.py              # LiveKit agent setup, session, and tool registration
│   ├── tools.py              # All function tools (keyboard, mouse, clipboard, etc.)
│   ├── __init__.py
│   └── prompts/
│       ├── crix.py           # System prompt defining Crix's behavior and rules
│       └── __init__.py
├── LICENSE
├── README.md
├── pyproject.toml            # Project metadata and dependencies
└── uv.lock

Planned Features & Future Work

The following tools and improvements are actively being worked on:

  • Screen Read (read_screen_text): OCR-based screen reading using Tesseract to allow Crix to "see" on-screen content and answer questions about it.
  • Mouse Movement (move_mouse): Moving the mouse cursor to specific screen coordinates via xdotool does not yet work reliably.
  • Window Focus (list_open_windows / focus_window): Detect and focus any open window by name, enabling seamless app switching.
  • Prompt Improvement: Expanding the system prompt with richer context and more example patterns for better command understanding.
  • Multi-monitor support: Extend screen tooling to handle setups with more than one display.

License

This project is licensed under the MIT License.
