Skip to content

ljch2018/screen-watcher

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Screen Watcher — AI-Powered Screen Memory

中文文档

macOS Python Swift MIT License

Screen Watcher is a local-first AI screen memory for macOS.
It watches your foreground window, extracts semantic content via a local VLM, and builds a searchable timeline of everything you see.
All models and data stay on your device. Nothing is ever uploaded.

No browser plugins. No cloud APIs. No accessibility hacks.
Just local screenshot + local AI understanding.


Privacy first

Screen Watcher is designed with absolute privacy as a core principle:

  • Local models only — AI inference runs entirely on your machine via Ollama. No API keys, no cloud calls, no OpenAI/Anthropic/Google endpoints.
  • Local data only — all captured content is stored as plain JSONL files under ~/screen-watcher/. No sync, no upload, no telemetry.
  • Screenshots are ephemeral — captured images are passed to the local VLM for understanding, then immediately deleted. They are never persisted to disk.
  • You own everything — your data is yours. Plain text files you can read, grep, delete, or move at any time.
  • No network access — the only network call is localhost:11434 (Ollama). Zero external connections.

Your screen content is the most sensitive data on your computer. Screen Watcher treats it accordingly.


How it works

How Screen Watcher Works

Any app, any window. If you can see it, Screen Watcher can capture and understand it.

Highlights

  • VLM-powered — uses vision language models to understand screen content semantically, not just OCR.
  • 100% local — model inference via Ollama, data stored as local files. Nothing leaves your machine.
  • Zero config — no browser extensions, no AppleScript injection, no app-specific integrations.
  • Smart dedup — MD5 image hash skips unchanged screens; text similarity skips redundant content.
  • Noise filtering — strips menu bars, UI controls, status indicators automatically.
  • Always-on — runs as a launchd daemon, survives reboots, near-zero resource usage.
  • Compact & merge — consolidate daily records by app/window into deduplicated summaries.
  • Work Insights skill — OpenClaw skill that reads your timeline, builds memory, and provides proactive work insights.

Requirements

  • macOS 14+ (ScreenCaptureKit + Vision framework)
  • Python 3.10+
  • Xcode Command Line Tools (xcode-select --install)
  • Ollama running locally

Install

git clone https://github.com/anthropics/screen-watcher.git
cd screen-watcher

# Pull the VLM model (runs 100% locally)
ollama pull qwen3.5:4b

Note

First run requires granting Screen Recording permission: System Settings → Privacy & Security → Screen Recording → enable your terminal app.

Quick start

# One-shot capture
./scripts/run.sh

# Foreground loop (every 30s)
./scripts/run.sh --daemon 30

# Install as background service (auto-start on boot)
./scripts/install.sh           # default 10s interval
./scripts/install.sh 30        # custom interval

# Uninstall background service
./scripts/uninstall.sh

# View today's summary
./scripts/run.sh --show-today

# Compact today's records (merge by app + window)
./scripts/run.sh --compact

Output format

One JSON record per line, written to ~/screen-watcher/YYYY-MM-DD.jsonl:

{
  "app": "Google Chrome",
  "window_title": "ChatGPT - Agent Memory Architecture",
  "content": "How to add long-term memory to agent products? The key is building a perception-storage-retrieval-forgetting loop...",
  "content_length": 6823,
  "content_hash": "a3f2b1c4",
  "timestamp": "2026-03-21T16:30:00",
  "date": "2026-03-21"
}

Work Insights skill (OpenClaw)

The skills/work-insights/ directory contains an OpenClaw skill that turns your screen timeline into actionable work insights:

  • Incremental reading — reads only new records since last run (transactional, with commit/rollback)
  • Deep understanding — AI reads every line, extracts projects, collaborators, decisions, problems
  • Memory building — accumulates daily notes and long-term memory about your work patterns
  • Proactive alerts — detects upcoming meetings, @mentions, stuck problems, deadlines
  • Work analysis — time allocation, efficiency suggestions, project tracking
# Install as OpenClaw skill
cp -r skills/work-insights ~/.openclaw/skills/

# Configure @mention detection (optional)
export SCREEN_WATCHER_USER="your_name,Your Full Name"

Testing

python3 -m pytest tests/ -v

Project structure

screen-watcher/
├── README.md                     # English documentation
├── README.zh-CN.md               # Chinese documentation
├── src/
│   ├── collect.py                # Main collector: scheduling, dedup, storage
│   └── screen_capture.swift      # Foreground window capture (ScreenCaptureKit + Vision)
├── scripts/
│   ├── run.sh                    # Run collector
│   ├── install.sh                # Install launchd daemon
│   └── uninstall.sh              # Uninstall launchd daemon
├── skills/
│   └── work-insights/            # OpenClaw skill: AI work insights
│       ├── SKILL.md              # Skill definition and instructions
│       └── scripts/
│           ├── read_increment.py     # Incremental JSONL reader (with commit/rollback)
│           └── detect_important.py   # Meeting, mention, deadline, problem detector
└── tests/
    ├── test_collect.py           # Unit tests for collect.py
    ├── test_detect_important.py  # Tests for alert detection
    └── test_read_increment.py    # Tests for incremental reader

License

MIT

About

AI-powered screen memory for macOS — local VLM captures and understands everything you see

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors