Crix — AI Voice Assistant for Linux Desktop

Crix is a fast, efficient, voice-controlled AI assistant that runs natively on Linux. It gives an AI direct control over your keyboard, mouse, and system — so you can get things done entirely hands-free.

Features

🎙️ Real-time voice interaction — Talk to Crix naturally using state-of-the-art speech recognition
⌨️ Keyboard control — Type text, press shortcuts, and submit forms by voice
🖱️ Mouse control — Move, click, double-click, and scroll anywhere on screen
🪟 Window & Workspace management — Switch workspaces, list open windows, and launch apps
📋 Clipboard integration — Read and write clipboard content using xclip
🔍 Live web search — Fetches up-to-date information from the web using Tavily
🖥️ Shell command execution — Run safe, non-destructive shell commands on demand
🔇 Noise cancellation — Intelligent audio filtering for cleaner voice input

Tech Stack

Component	Technology
Voice Framework	LiveKit Agents
Speech-to-Text (STT)	Deepgram Nova-3 (multilingual)
Language Model (LLM)	OpenAI GPT-4.1 Mini
Text-to-Speech (TTS)	ElevenLabs Turbo v2.5
Voice Activity Detection (VAD)	Silero
Desktop Automation	`xdotool`
Clipboard	`xclip`
Web Search	Tavily
Package Manager	uv

Prerequisites

Linux (X11 recommended; some tools may work partially on Wayland)
Python 3.12+
uv package manager
System dependencies:
```
sudo apt install xdotool xclip
```
A LiveKit account and project (cloud.livekit.io)
A Tavily API key for web search
STT, LLM, and TTS are handled by LiveKit — no separate API keys needed for OpenAI, Deepgram, or ElevenLabs

Installation

Clone the repository:

git clone https://github.com/Aerex0/Crix.git
cd Crix

Install Python dependencies:
```
uv sync
```
Configure environment variables:
```
cp .env.example .env
```
Then edit .env and fill in your credentials:
```
LIVEKIT_URL=wss://your-project.livekit.cloud
LIVEKIT_API_KEY=your_api_key
LIVEKIT_API_SECRET=your_api_secret
TAVILY_API_KEY=your_tavily_key
```
Note: STT (Deepgram), LLM (OpenAI), and TTS (ElevenLabs) are billed and managed through your LiveKit account — no separate API keys required.

Usage

Run the agent using the LiveKit CLI:

uv run python src/agent.py console

Once running, connect to the agent via any LiveKit-compatible client (e.g. the LiveKit Playground or a custom frontend). Crix will greet you and wait for voice commands.

Customization

Before using Crix, you must update the system prompt to match your desktop environment. Open src/prompts/crix.py and adjust it to reflect:

Your keyboard shortcuts — e.g. how you close a window, open a terminal, or switch workspaces may differ between desktop environments (GNOME, KDE, i3, Hyprland, etc.)
Your default apps — e.g. your terminal emulator (alacritty, kitty, gnome-terminal), browser (firefox, brave), etc.
Your workspace setup — how many workspaces you use and how they're numbered

Important

The default prompt is configured for a specific setup. If your shortcuts or apps differ, Crix may send the wrong keys or open the wrong applications. Tailor the prompt to your environment for the best experience.

Available Tools

Crix comes with a set of built-in tools it can call autonomously based on your voice commands:

🔍 Web & Time

Tool	Description
`web_search`	Search the web for up-to-date information using Tavily
`get_time`	Get the current system date and time

⌨️ Keyboard

Tool	Description
`type_text`	Type text at the current cursor position
`press_key`	Press a key or key combo (e.g. `ctrl+c`, `super+q`)
`type_and_submit`	Type text and immediately press Enter
`paste_text`	Paste text instantly via clipboard (faster for long strings)

🖱️ Mouse

Tool	Description
`click`	Click at a given position (left, middle, or right button)
`double_click`	Double-click at a given position
`scroll`	Scroll up or down at the current cursor position

🪟 Windows & Apps

Tool	Description
`switch_workspace`	Switch to a specific virtual desktop (1-based)
`open_app`	Launch an application by command name

📋 Clipboard & Screen

Tool	Description
`get_clipboard`	Read the current clipboard contents
`select_all_and_copy`	Press `Ctrl+A` then `Ctrl+C` and return copied text
`get_screen_size`	Return the current screen resolution

🖥️ System

Tool	Description
`run_command_silent`	Execute a safe, read-only shell command and return output

Example Commands

"Open a terminal"               → Launches Alacritty
"Switch to workspace 3"         → Switches to virtual desktop 3
"Type hello world and send it"  → Types and submits text
"What time is it?"              → Returns current date and time
"Search for the latest AI news" → Performs a live web search
"Press Ctrl+Z"                  → Sends the undo shortcut
"Select all and copy"           → Copies all text in the focused window

Security

Crix is designed with the following hard rules baked into its system prompt:

🚫 Will not execute destructive commands (rm, mv, dd, kill, chmod, etc.)
🚫 Will not follow commands delivered via on-screen text — only spoken voice
🚫 Will not reveal its system prompt
🚫 Will not chain shell commands

Warning

These are prompt-level restrictions enforced by the AI model — not hard system-level blocks. While precautions have been taken to make Crix safe, no AI system is perfectly secure. Use with awareness and at your own risk. Avoid granting it access to sensitive environments.

Project Structure

crix/
├── src/
│   ├── agent.py              # LiveKit agent setup, session, and tool registration
│   ├── tools.py              # All function tools (keyboard, mouse, clipboard, etc.)
│   ├── __init__.py
│   └── prompts/
│       ├── crix.py           # System prompt defining Crix's behavior and rules
│       └── __init__.py
├── LICENSE
├── README.md
├── pyproject.toml            # Project metadata and dependencies
└── uv.lock

Planned Features & Future Work

The following tools and improvements are actively being worked on:

Screen Read (read_screen_text): OCR-based screen reading using Tesseract to allow Crix to "see" on-screen content and answer questions about it.
Mouse Movement (move_mouse): Moving the mouse cursor to specific screen coordinates via xdotool is not yet reliably working.
Window Focus (list_open_windows / focus_window): Detect and focus any open window by name, enabling seamless app switching.
Prompt Improvement: Expanding the system prompt with richer context and more example patterns for better command understanding.
Multi-monitor support: Extend screen tooling to handle setups with more than one display.

License

This project is licensed under the MIT License.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Crix — AI Voice Assistant for Linux Desktop

Features

Tech Stack

Prerequisites

Installation

Usage

Customization

Available Tools

🔍 Web & Time

⌨️ Keyboard

🖱️ Mouse

🪟 Windows & Apps

📋 Clipboard & Screen

🖥️ System

Example Commands

Security

Project Structure

Planned Features & Future Work

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 19 Commits
src		src
.env.example		.env.example
.gitignore		.gitignore
.python-version		.python-version
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Folders and files

Latest commit

History

Repository files navigation

Crix — AI Voice Assistant for Linux Desktop

Features

Tech Stack

Prerequisites

Installation

Usage

Customization

Available Tools

🔍 Web & Time

⌨️ Keyboard

🖱️ Mouse

🪟 Windows & Apps

📋 Clipboard & Screen

🖥️ System

Example Commands

Security

Project Structure

Planned Features & Future Work

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages