A comprehensive computer-use CLI built on rustautogui that gives AI vision and control over your computer.
```sh
npm install -g reyes
```

Reyes provides AI agents with the "eyes" and "limbs" to interact with the desktop environment. Built on the high-performance rustautogui library, it offers:
- Screenshot & Vision: Capture screen regions, get pixel colors
- Mouse Control: Click, move, drag, scroll with natural animations
- Keyboard Control: Type text, press keys, execute hotkeys
- Image Recognition: Find and interact with UI elements using template matching
- System Information: Query screen size, mouse position
All commands return JSON-formatted results, making it perfect for AI automation workflows.
- Fast: Built on rustautogui's optimized template matching algorithms
- Cross-platform: Windows, macOS, and Linux (X11) support
- AI-friendly: JSON output for easy parsing
- Comprehensive: 30+ commands covering all automation needs
- Production-ready: Properly handles errors and edge cases
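Because every command emits JSON, results are easy to consume from a scripting language. A minimal Python sketch (the field names mirror the output shapes documented later in this README):

```python
import json

def parse_result(raw: str) -> dict:
    """Decode one line of reyes JSON output into a dict."""
    return json.loads(raw)

# Output shapes documented in this README:
status = parse_result('{"success": true, "message": "Clicked at (100, 200)"}')
position = parse_result('{"x": 500, "y": 300}')
match = parse_result('{"found": true, "locations": [[100, 200, 0.95]]}')

if match["found"]:
    # Each location is [x, y, confidence]; take the highest-confidence hit.
    x, y, conf = max(match["locations"], key=lambda loc: loc[2])
```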
```sh
cargo install reyes
```

Or build from source:

```sh
git clone https://github.com/Blankeos/reyes.git
cd reyes
cargo install --path .
```

Linux:

```sh
sudo apt-get update
sudo apt-get install libx11-dev libxtst-dev
```

macOS:
- Grant accessibility permissions when prompted
- System Preferences > Security & Privacy > Accessibility
Windows:
- No additional dependencies required
```sh
# Take a screenshot
reyes screenshot --output screen.png

# Get mouse position
reyes get-mouse-position

# Move and click
reyes move-mouse --x 500 --y 300 --duration 0.5
reyes click

# Type text
reyes type-text --text "Hello from Reyes!"

# Press a key
reyes press-key --key enter

# Find image on screen
reyes locate-on-screen --image button.png --confidence 0.9
```

Screenshot & Vision:

- `screenshot` - Capture screen to file
- `get-pixel-color` - Get RGB/Hex color at coordinates
- `find-color` - Search for a specific color on screen
Mouse:

- `click` - Click at position
- `double-click` - Double-click at position
- `move-mouse` - Move to absolute position
- `move-mouse-rel` - Move relative to current position
- `drag-mouse` - Drag to position
- `scroll` - Scroll wheel
- `get-mouse-position` - Get current coordinates
- `mouse-down` / `mouse-up` - Press/release buttons

Keyboard:

- `type-text` - Type a string
- `press-key` - Press a single key
- `hotkey` - Press a key combination
- `shortcut` - Common shortcuts (copy, paste, etc.)
- `key-down` / `key-up` - Press/release keys

Image Recognition:

- `locate-on-screen` - Find image location
- `locate-all-on-screen` - Find all instances
- `wait-for-image` - Wait for an image to appear
- `wait-for-image-to-vanish` - Wait for an image to disappear
- `click-on-image` - Click when image is found
- `store-template` / `find-stored-template` - Template management

System:

- `get-screen-size` - Get display resolution
- `sleep` - Pause execution
- `print-mouse-position` - Track mouse movement
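For longer automation scripts, a thin wrapper that builds the command line and checks exit codes keeps call sites tidy. The helper below is a hypothetical sketch, not part of reyes itself; it assumes the `reyes` binary is on PATH and that failures surface as non-zero exit codes:

```python
import json
import subprocess

def reyes_args(command: str, **flags) -> list[str]:
    """Build an argv list such as ["reyes", "click", "--x", "100"]."""
    args = ["reyes", command]
    for name, value in flags.items():
        # Map Python keyword names (snake_case) to CLI flags (kebab-case).
        args += [f"--{name.replace('_', '-')}", str(value)]
    return args

def reyes(command: str, **flags) -> dict:
    """Run a reyes subcommand and return its parsed JSON output.

    check=True raises CalledProcessError on a non-zero exit code.
    """
    proc = subprocess.run(reyes_args(command, **flags),
                          capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)

# Usage (requires reyes on PATH):
#   reyes("move-mouse", x=500, y=300, duration=0.5)
```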
Fill a form:

```sh
reyes click --x 100 --y 100
reyes type-text --text "John Doe"
reyes press-key --key tab
reyes type-text --text "john@example.com"
reyes shortcut --name submit
```

Click on an image:

```sh
# Wait for button and click
reyes click-on-image --image submit.png --confidence 0.9 --duration 0.5

# Or find the location first
location=$(reyes locate-on-screen --image icon.png --confidence 0.9)
echo "Found at: $location"
```

Drag and drop:

```sh
reyes move-mouse --x 100 --y 100 --duration 0.5
reyes mouse-down --button left
reyes move-mouse --x 400 --y 400 --duration 1.0
reyes mouse-up --button left
```

All commands return JSON:
```jsonc
// Success
{"success": true, "message": "Clicked at (100, 200)"}

// Position
{"x": 500, "y": 300}

// Image match
{
  "found": true,
  "locations": [[100, 200, 0.95]]
}
```

Reyes is designed for AI automation:
- Atomic Operations: Each command is independent
- JSON Output: Easy to parse programmatically
- Exit Codes: Non-zero on errors
- Computer Vision: Find UI elements by image
- Natural Interactions: Smooth mouse movements
See SKILL.md for detailed AI agent documentation.
- Use Regions: Limit search areas for faster image recognition
- Store Templates: Reuse prepared images for repeated searches
- Choose Match Mode: Segmented for small/simple images, FFT for large/complex
- Adjust Confidence: 0.9 for precise, 0.8 for fuzzy matching
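The confidence tip can be expressed as a small fallback loop: try a strict threshold first, then relax. The helper below is illustrative, not a reyes command; it takes the locator as a callable so any runner (for example, a wrapper around `reyes locate-on-screen`) can plug in:

```python
from typing import Callable, Optional, Tuple

Location = Tuple[int, int, float]  # (x, y, confidence)

def locate_with_fallback(locate: Callable[[float], Optional[Location]],
                         confidences=(0.9, 0.8)) -> Optional[Location]:
    """Try each confidence threshold in order; return the first hit."""
    for conf in confidences:
        hit = locate(conf)
        if hit is not None:
            return hit
    return None

# Stand-in locator that only matches once the threshold drops to 0.85:
fuzzy = lambda conf: (100, 200, 0.85) if conf <= 0.85 else None
result = locate_with_fallback(fuzzy)
```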
- macOS: Requires accessibility permissions; handles Retina displays automatically
- Linux: X11 only (Wayland not supported); can search all monitors
- Windows: Searches the main monitor only; no additional setup needed
Contributions welcome! Please ensure:
- Code follows Rust best practices
- All commands have proper error handling
- JSON output is consistent
- Documentation is updated
MIT License - See LICENSE file
Carlo Taleon (@Blankeos)
Built on rustautogui by DavorMar
Inspired by PyAutoGUI and the computer-use paradigm
Planned integration with models.dev for advanced AI-powered vision capabilities:
- `subagent-explain` - Connects to models.dev to return human-readable descriptions of images. Supports prompt-based queries for specific details.
- `subagent-point` - Uses a subagent to locate text or objects on screen and returns precise coordinate points.
- `subagent-detect` - Returns bounding boxes and center-point coordinates for detected objects/text.
- `subagent-segment` - Returns coordinate arrays representing segmentation boundaries.
A configuration command to customize the models.dev connection:
```sh
# Configure default connection
reyes configure-subagent --endpoint https://models.dev --api-key <key>

# Configure task-specific models
reyes configure-subagent --task explain --model claude-3-opus
reyes configure-subagent --task segment --model gpt-4-vision
reyes configure-subagent --task detect --model gemini-pro-vision
```

- Clipboard operations (copy/paste content)
- Window management (list, focus, resize windows)
- OCR support for text recognition
- Recording and playback of macro sequences
- Configuration file support for default settings
- Interactive REPL mode with command history
- Integration with other vision APIs (OpenAI, Anthropic, etc.)
