Screen Vision MCP Server

Give Claude Code the ability to see your screen

Screen Vision lets Claude capture screenshots, watch your screen in real time with audio transcription, analyze video files, and read text via OCR. It runs locally as an MCP server, so Claude sees what you see, when you ask.

Quick Start

pip install screen-vision

Then add the server to your Claude Code MCP config (.mcp.json):

{
  "mcpServers": {
    "screen-vision": {
      "command": "python3",
      "args": ["-m", "screen_vision"]
    }
  }
}
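
If you would rather script this step, here is a minimal sketch that merges the entry above into an existing .mcp.json without overwriting other servers. The helper name add_screen_vision is hypothetical and not part of Screen Vision; it assumes the config file is either absent or already valid JSON.

```python
import json
from pathlib import Path

# The server entry from the config above, as a Python dict.
SCREEN_VISION_ENTRY = {"command": "python3", "args": ["-m", "screen_vision"]}


def add_screen_vision(config_path: Path) -> dict:
    """Merge the screen-vision entry into an MCP config file (hypothetical helper)."""
    # Start from the existing config, if any, so other servers are preserved.
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault("mcpServers", {})["screen-vision"] = SCREEN_VISION_ENTRY
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling add_screen_vision(Path(".mcp.json")) from the project root should produce the same structure as the JSON shown above.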

Optional system dependencies (the tools degrade gracefully without them):

brew install tesseract   # Enables OCR (read_screen_text)
brew install ffmpeg      # Enables video analysis (analyze_video)
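
To see which of these binaries are already on your PATH, a quick check along these lines may help. This is an illustrative sketch using the standard library, not how Screen Vision itself detects them; check_optional_deps is a hypothetical name.

```python
import shutil

# Binary name -> tool it enables, taken from the comments above.
OPTIONAL_BINARIES = {
    "tesseract": "read_screen_text",
    "ffmpeg": "analyze_video",
}


def check_optional_deps(which=shutil.which) -> dict:
    """Return {binary: bool} for whether each optional binary is on PATH."""
    return {name: which(name) is not None for name in OPTIONAL_BINARIES}


for binary, found in check_optional_deps().items():
    status = "found" if found else "missing"
    print(f"{binary}: {status} (needed for {OPTIONAL_BINARIES[binary]})")
```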

What You Can Say

"Take a screenshot of my screen"          → capture_screen
"Capture the Chrome window"               → capture_window
"Watch my screen for 1 minute"            → watch_screen (with audio transcription)
"Analyze the video at ~/Downloads/demo.mp4" → analyze_video
"Read the text on my screen"              → read_screen_text
"What window am I in?"                    → get_active_context (no screenshot)
"What's on my screen right now?"          → understand_screen (AI analysis)
"Analyze this photo I AirDropped"         → analyze_image

Tools (14)

Tool	What it does	Needs
`capture_screen`	Full screen capture with delay + multi-monitor	—
`capture_region`	Capture a specific rectangular area	—
`capture_window`	Capture a window by title	—
`list_monitors`	List displays with resolutions	—
`get_active_context`	Window/cursor/monitor info (no image)	—
`read_screen_text`	OCR text extraction from screen	tesseract
`understand_screen`	AI-powered screen analysis	Anthropic API key
`analyze_image`	Analyze a dropped/AirDropped image file	—
`watch_screen`	Watch screen with frame sampling + audio	ffmpeg (audio)
`analyze_video`	Extract keyframes from video files	ffmpeg
`capture_camera`	Grab latest frame from phone camera	—
`watch_camera`	Stream phone camera with scene detection + audio	—
`show_pairing_qr`	Show QR code to connect phone camera	—
`phone_status`	Check phone camera connection status	—

Security

Screen Vision includes security controls for corporate environments:

PII/PCI scanning — Detects credit card numbers, SSNs, phone numbers, email addresses in OCR text
App deny-list — Blocks captures of Slack, Teams, Zoom, banking apps, password managers
Call detection — Blocks captures during active audio calls
Rate limits — 200 captures/session, 2s minimum interval, 5min max watch duration
Audit logs — All captures logged to ~/.screen-vision/audit.log
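
To give a feel for what PII/PCI scanning of OCR text involves, here is a rough sketch. The patterns, the Luhn filter, and the scan_text helper are assumptions for illustration; Screen Vision's actual detection rules are not described in this README and may differ. Phone numbers are left out for brevity.

```python
import re

# Illustrative patterns only; the real scanner's rules may differ.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")  # 13 to 19 digits


def luhn_valid(number: str) -> bool:
    """Luhn checksum, used to filter out arbitrary digit runs."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    # Walk right to left, doubling every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def scan_text(text: str) -> dict:
    """Return sensitive matches found in OCR text, keyed by kind (hypothetical helper)."""
    findings = {
        "email": EMAIL_RE.findall(text),
        "ssn": SSN_RE.findall(text),
        "card": [m for m in CARD_RE.findall(text) if luhn_valid(m)],
    }
    # Drop kinds with no hits so an empty dict means "clean".
    return {kind: hits for kind, hits in findings.items() if hits}
```

A capture pipeline could call scan_text on the OCR output and block or redact the capture whenever the result is non-empty.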

Set SCREEN_VISION_MODE=work to enable all of the security controls above. The default mode is personal, which applies no restrictions.
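
One way to set the variable for the MCP server process is an "env" block in the server entry of .mcp.json. The sketch below builds that entry in Python and prints it as JSON; it assumes your Claude Code version passes "env" through to the server process, so check the Claude Code MCP documentation if the mode does not take effect.

```python
import json

# Same entry as in Quick Start, plus an "env" block for the mode variable.
config = {
    "mcpServers": {
        "screen-vision": {
            "command": "python3",
            "args": ["-m", "screen_vision"],
            "env": {"SCREEN_VISION_MODE": "work"},
        }
    }
}

print(json.dumps(config, indent=2))
```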

Dependencies

Core (always installed): mcp[cli], mss, Pillow, numpy, httpx

Optional (pip install screen-vision[full]):

pytesseract — OCR (needs brew install tesseract)
faster-whisper — Audio transcription
sounddevice — Audio recording
opencv-python — Video processing
paddleocr — Alternative OCR engine

Python 3.11+ required.
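
To check an environment against the requirements above, something like the following sketch can help. The helper names are hypothetical. Note that the import names differ from the pip names for faster-whisper (faster_whisper) and opencv-python (cv2).

```python
import sys
from importlib.util import find_spec

# pip package name -> import name (these differ for some packages).
OPTIONAL_PACKAGES = {
    "pytesseract": "pytesseract",
    "faster-whisper": "faster_whisper",
    "sounddevice": "sounddevice",
    "opencv-python": "cv2",
    "paddleocr": "paddleocr",
}


def missing_extras(finder=find_spec) -> list:
    """Return pip names of optional packages that cannot be found."""
    return [pip for pip, mod in OPTIONAL_PACKAGES.items() if finder(mod) is None]


def python_ok(version=sys.version_info) -> bool:
    """True when the interpreter meets the 3.11+ requirement."""
    return tuple(version[:2]) >= (3, 11)


print("Python OK:", python_ok())
print("Missing optional packages:", missing_extras() or "none")
```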

Development

pip install -e ".[full,test]"
pytest tests/ -v
ruff check src/

Author

Alex Vicuna — github.com/avicuna

Contributing

Issues and PRs welcome: https://github.com/avicuna/screen-vision
