turinglabsorg/crusty

Crusty

Crusty is a desktop automation CLI for macOS. It controls the mouse and keyboard, takes screenshots, finds text via OCR, and automates Chrome via the Chrome DevTools Protocol (CDP).

Install

cargo build --release

Binary: target/release/crusty

Quick Start

# Take a screenshot
crusty screenshot --logical -o shot.png

# Find text on screen (OCR) and get click coordinates
crusty find-text "Submit"
# Output: Submit	450	312

# Click at those coordinates
crusty mouse move-to 450 312
crusty mouse click left

# Type text
crusty keyboard type "hello world"

# Open Chrome and navigate
crusty browser open "https://example.com"

Commands

Screenshot

crusty screenshot                          # Print base64 PNG to stdout
crusty screenshot -o shot.png              # Save to file
crusty screenshot --logical                # 1 pixel = 1 mouse coordinate
crusty screenshot --grid 50                # Overlay coordinate grid
crusty screenshot --max-dim 1280           # Resize

Find Text (OCR)

Uses the macOS Vision framework to find text on screen and return the coordinates to click: the center of the matched text, in logical pixels.

crusty find-text "About This Mac"
# Output: About This Mac	94	58    (text, center_x, center_y in logical px)
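When scripting around find-text, the output line has to be split back into a label and two numbers. The sketch below assumes the format shown above (tab-separated text, center_x, center_y, with integer coordinates); `TextMatch` and `parse_find_text` are hypothetical helper names, not part of Crusty.

```python
from typing import NamedTuple


class TextMatch(NamedTuple):
    text: str
    x: int
    y: int


def parse_find_text(line: str) -> TextMatch:
    """Parse one 'text<TAB>x<TAB>y' line (assumed output format).

    Splitting from the right keeps any tab inside the matched text intact.
    """
    text, x, y = line.rstrip("\n").rsplit("\t", 2)
    return TextMatch(text, int(x), int(y))


match = parse_find_text("About This Mac\t94\t58")
print(match.text, match.x, match.y)
```

The resulting `x` and `y` can be passed directly to `crusty mouse move-to`, since both use logical pixels.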

Mouse

crusty mouse move-to 500 300              # Absolute move (logical px)
crusty mouse click left                   # Click (left/right/middle)
crusty mouse position                     # Print current position
crusty mouse scroll -3                    # Scroll down

Keyboard

crusty keyboard type "hello world"        # Type text
crusty keyboard combo "meta+c"            # Key combination
crusty keyboard tap return                # Single key tap

Browser (Chrome via CDP)

crusty browser open                       # Launch Chrome with CDP
crusty browser open "https://x.com"       # Launch and navigate
crusty browser navigate "https://x.com"   # Navigate active tab
crusty browser tabs                       # List open tabs
crusty browser eval "document.title"      # Execute JavaScript
crusty browser find-text "Post"           # Find element by text
crusty browser find-selector ".btn"       # Find element by CSS

Other

crusty permissions                        # Check macOS permissions
crusty onboard                            # Interactive config wizard
crusty run "open browser and go to google"  # Agent mode (requires LLM config)
crusty mcp                                # Start MCP server on stdio

MCP Server

crusty mcp exposes desktop automation tools over stdio for use with Claude Code or other MCP clients.

{
  "mcpServers": {
    "crusty": {
      "command": "/path/to/crusty",
      "args": ["mcp"]
    }
  }
}
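If the binary path differs between machines, the configuration above can be generated instead of edited by hand. This is a minimal sketch; `mcp_config` is a hypothetical helper, and where the resulting JSON should be saved depends on the MCP client, which this README does not specify.

```python
import json


def mcp_config(binary_path: str) -> str:
    """Return the MCP client configuration shown above as a JSON string."""
    config = {
        "mcpServers": {
            "crusty": {
                "command": binary_path,  # absolute path to the built binary
                "args": ["mcp"],
            }
        }
    }
    return json.dumps(config, indent=2)


print(mcp_config("/path/to/crusty"))
```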

Claude Code Skill

Crusty includes a Claude Code skill at .claude/skills/crusty/SKILL.md for desktop automation directly from Claude Code. The workflow:

  1. Screenshot to see the screen
  2. find-text to measure exact coordinates via OCR
  3. mouse move-to + mouse click at the measured coordinates
  4. Screenshot to verify
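The four steps above can be sketched as a small Python wrapper around the CLI. It assumes `crusty` is on PATH, that find-text prints tab-separated `text, x, y` lines, and that it prints nothing when no match is found (the README does not document the not-found case). `run_crusty` and `click_text` are hypothetical names used for illustration only.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], str]


def run_crusty(args: Sequence[str]) -> str:
    """Run the crusty binary (assumed to be on PATH) and return its stdout."""
    result = subprocess.run(
        ["crusty", *args], capture_output=True, text=True, check=True
    )
    return result.stdout


def click_text(label: str, run: Runner = run_crusty) -> tuple[int, int]:
    """Screenshot, locate `label` via OCR, click it, then screenshot again."""
    run(["screenshot", "--logical", "-o", "before.png"])  # 1. see the screen
    lines = run(["find-text", label]).splitlines()         # 2. measure via OCR
    if not lines:
        raise LookupError(f"text not found on screen: {label!r}")
    _, x, y = lines[0].rsplit("\t", 2)                     # first match only
    run(["mouse", "move-to", x, y])                        # 3. move and click
    run(["mouse", "click", "left"])
    run(["screenshot", "--logical", "-o", "after.png"])   # 4. verify
    return int(x), int(y)
```

Passing the runner as a parameter keeps the workflow testable without touching the real desktop: a fake runner can return canned find-text output.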

Config

Config file: ~/.crusty/config.toml

[browser]
profile = "claude"
cdp_port = 9222

[desktop]
screenshot_max_dim = 1280
action_delay_ms = 100

Coordinates

macOS uses logical pixel coordinates. On Retina (2x) displays, there are 2 physical pixels per logical pixel along each axis. All mouse commands use logical pixels. Use the --logical flag on screenshots to get images where 1 pixel = 1 mouse coordinate.
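If a coordinate was measured on a physical-resolution image rather than a --logical screenshot, it has to be divided by the display scale before being passed to a mouse command. The sketch below assumes a uniform scale factor (2 on Retina, 1 otherwise) and an image that has not been resized; `physical_to_logical` is a hypothetical helper.

```python
def physical_to_logical(x: float, y: float, scale: float = 2.0) -> tuple[int, int]:
    """Convert physical pixel coordinates to logical ones by dividing by the scale."""
    return round(x / scale), round(y / scale)


# A point at (900, 624) physical pixels on a 2x display
print(physical_to_logical(900, 624))  # (450, 312)
```

Taking screenshots with --logical avoids this conversion entirely, which is why the Quick Start uses it.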

Requirements

  • macOS (Accessibility + Screen Recording permissions)
  • Rust toolchain
  • Chrome (for browser automation)

License

MIT
