The Android implementation of Claw Use — a protocol for AI agents to control real devices.
One app. 37 API endpoints. Full phone control over HTTP. No ADB. No root. No PC.
# See the screen
curl http://phone:7333/screen
# Take a screenshot
curl http://phone:7333/screenshot
# Tap, type, swipe
curl -X POST http://phone:7333/tap -d '{"x":500,"y":1000}'
curl -X POST http://phone:7333/type -d '{"text":"Hello world"}'
curl -X POST http://phone:7333/swipe -d '{"direction":"up"}'
# Speak, record, snap
curl -X POST http://phone:7333/tts -d '{"text":"I can talk now"}'
curl -X POST http://phone:7333/audio/record -d '{"durationMs":5000}'
curl -X POST http://phone:7333/camera -d '{"facing":"back"}'
# Read sensors & system state
curl http://phone:7333/battery
curl http://phone:7333/wifi
curl http://phone:7333/location
curl http://phone:7333/clipboard
curl http://phone:7333/volume
# Read/write files, contacts, SMS
curl 'http://phone:7333/file/list?path=/sdcard/DCIM'
curl 'http://phone:7333/contacts?search=John'
curl 'http://phone:7333/sms?limit=5'
curl -X POST http://phone:7333/sms -d '{"to":"1234567","message":"Hi"}'
# Launch apps, fire intents, read notifications
curl -X POST http://phone:7333/launch -d '{"package":"com.whatsapp"}'
curl -X POST http://phone:7333/intent -d '{"action":"android.intent.action.CALL","uri":"tel:+1234567890"}'
curl http://phone:7333/notifications

- AI agent with a real phone: Your agent can send messages, check apps, take screenshots, and speak, all on a real device with real accounts
- Revive broken phones: USB port dead? Screen cracked? If WiFi works, Claw Use gives the phone a second life. No ADB needed, no screen needed: the AI sees through /screenshot
- Remote phone access: Add Tailscale and control your phone from anywhere in the world
- Spare phone automation: Turn that old phone in your drawer into a dedicated AI worker
- Testing & QA: Automate real-device testing without emulators
Most phone control solutions require a PC running ADB. This one doesn't.
Install the app → enable Accessibility Service → your phone is now an HTTP-controlled device. Connect from anywhere on the same network. Add Tailscale and control it from anywhere in the world.
Built for AI agents that need a real phone — not an emulator, not a cloud device, your actual phone with your actual apps, accounts, and data.
| Endpoint | Method | What it does |
|---|---|---|
| /screen | GET | UI tree: every element, its text, bounds, clickable/scrollable state |
| /screenshot | GET | Actual screenshot as base64 JPEG (configurable quality & resolution) |
| /notifications | GET | All notifications with title, text, actions |
| /screen/state | GET | Lock state, screen on/off |
| /info | GET | Device model, OS, screen size, permissions |
| /status | GET | Full health dashboard (uptime, request count, a11y latency) |

| Endpoint | Method | What it does |
|---|---|---|
| /tap | POST | Tap at coordinates |
| /click | POST | Tap by text or content description (semantic click) |
| /longpress | POST | Long press at coordinates |
| /swipe | POST | Swipe in any direction |
| /scroll | POST | Scroll up/down/left/right |
| /type | POST | Type text (supports CJK via clipboard) |
| /global | POST | Back, home, recents, notifications, power dialog |
| /launch | GET/POST | List installed apps / launch by package name |
| /intent | POST | Fire any Android Intent (call, SMS, URL, share, deep links) |

| Endpoint | Method | What it does |
|---|---|---|
| /tts | POST | Speak text through the phone speaker |
| /tts/voices | GET | List available TTS voices |
| /audio/record | POST | Record audio from microphone |

| Endpoint | Method | What it does |
|---|---|---|
| /clipboard | GET/POST | Read or write clipboard text |
| /camera | POST | Capture photo (front/back, quality, max width) |
| /volume | GET/POST | Read/set volume for all audio streams |
| /battery | GET | Battery level, charging status, temperature |
| /wifi | GET | WiFi connection info (SSID, IP, signal) |
| /location | GET | GPS/network location with fallback |
| /vibrate | POST | Vibrate the device (one-shot or pattern) |
| /contacts | GET | Search and list contacts |
| /sms | GET/POST | Read inbox/sent messages, send SMS |
| /file | GET/POST/DELETE | Read, write, delete files on device |
| /file/list | GET | List directory contents |

| Endpoint | Method | What it does |
|---|---|---|
| /batch | POST | Execute multiple operations in one request |
| /flow | POST | Multi-step automation with conditions |

| Endpoint | Method | What it does |
|---|---|---|
| /screen/wake | POST | Wake the screen |
| /screen/lock | POST | Lock the device |
| /screen/unlock | POST | Unlock with PIN (auto-unlock middleware handles this transparently) |
| /config | GET/POST/DELETE | Configure PIN for remote unlock |
| /ping | GET | Health check (no auth required) |

All endpoints (except /ping) require a token:
X-Bridge-Token: <your-token>
Token is generated on first launch and shown in the setup screen + notification bar.
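
A minimal Python sketch of an authenticated call, using only the standard library. The header name, port, and the /ping exemption come from the text above; the helper names `build_request` and `call` are illustrative, and sending bodies as JSON with a `Content-Type` header is an assumption (the curl examples omit it).

```python
import json
import urllib.request


def build_request(base_url, path, token=None, body=None):
    """Build (but do not send) a request for a Claw Use endpoint."""
    headers = {}
    if path != "/ping":
        # Every endpoint except /ping needs the token header.
        if not token:
            raise ValueError(f"{path} requires a token")
        headers["X-Bridge-Token"] = token
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"  # assumption, see lead-in
    method = "POST" if body is not None else "GET"
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, data=data, headers=headers, method=method)


def call(base_url, path, token=None, body=None, timeout=10):
    """Send the request and decode the JSON reply."""
    req = build_request(base_url, path, token=token, body=body)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With a reachable phone, `call("http://phone:7333", "/tap", token="<your-token>", body={"x": 500, "y": 1000})` would mirror the curl tap example.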
┌─────────────────────────────────────────┐
│  :http process                          │
│  ┌─────────────────────────────────┐    │
│  │  BridgeService (NanoHTTPD)      │    │
│  │  0.0.0.0:7333                   │    │
│  │  - Auth, CORS, rate tracking    │    │
│  │  - /ping, /info, /launch local  │    │
│  │  - Everything else → proxy      │────┼──┐
│  └─────────────────────────────────┘    │  │
│  WakeLock + WifiLock + Foreground Svc   │  │
└─────────────────────────────────────────┘  │
                                             │ HTTP proxy
┌─────────────────────────────────────────┐  │ (localhost:7334)
│  main process                           │  │
│  ┌─────────────────────────────────┐    │  │
│  │  AccessibilityBridge            │◄───┼──┘
│  │  A11yInternalServer :7334       │    │
│  │  - Screen reading               │    │
│  │  - Gesture dispatch             │    │
│  │  - Screenshots                  │    │
│  │  - TTS, Intents, Notifications  │    │
│  └─────────────────────────────────┘    │
│  Heartbeat Watchdog (30s)               │
└─────────────────────────────────────────┘
Why two processes? Android Accessibility Service can freeze (IPC deadlocks, unresponsive apps). If it hung in a single process, the HTTP server would die with it. The dual-process architecture keeps the external API responsive even when accessibility is stuck — the proxy returns a timeout error instead of hanging forever.
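
The "timeout error instead of hanging" behaviour can be illustrated in a few lines of Python. This is not the app's code (the app runs on Android with NanoHTTPD); `proxy_call` and the shape of the error payload are hypothetical, chosen only to show the idea of bounding the wait on a backend that may freeze.

```python
import concurrent.futures

# Worker threads stand in for the separate accessibility process.
_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def proxy_call(backend, timeout_s):
    """Run `backend()` elsewhere and give up waiting after `timeout_s` seconds."""
    future = _pool.submit(backend)
    try:
        return {"ok": True, "result": future.result(timeout=timeout_s)}
    except concurrent.futures.TimeoutError:
        # The backend may still be stuck, but the caller gets an answer now.
        return {"ok": False, "error": "backend timed out"}
    except Exception as exc:
        # The backend finished in time but raised an error.
        return {"ok": False, "error": str(exc)}
```

A thread pool only approximates the design: a truly frozen worker would keep occupying a pool slot, which is one reason a separate process is the more robust choice.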
Download the APK from Releases and install it.
Settings → Accessibility → Claw Use → Enable
Settings → Notifications → Notification Access → Claw Use
# Find your phone's IP in the app, then confirm the service is reachable via /ping
curl http://<phone-ip>:7333/ping
# → {"status":"ok","service":"claw-use-android","version":"1.2.0"}

# This remembers your existing lock screen PIN; it does NOT change it
curl -X POST http://<phone-ip>:7333/config \
  -H "X-Bridge-Token: <token>" \
  -d '{"pin":"your-existing-pin"}'

Install Tailscale on the phone. Your phone gets a stable 100.x.x.x address accessible from anywhere in your Tailscale network.

# From anywhere in the world
curl http://100.x.x.x:7333/screenshot -H "X-Bridge-Token: <token>"

No port forwarding. No dynamic DNS. No firewall rules. Just works.
The app includes an UpdateReceiver that listens for MY_PACKAGE_REPLACED. After installing a new version, the BridgeService automatically restarts — no manual app launch needed.
This enables fully autonomous OTA updates: an AI agent can build a new APK, send it to the phone (e.g. via Telegram), navigate to download it, tap through the installer, and regain control after the update completes. Zero human intervention.
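
On the agent side, "regaining control" after an update could look like polling /ping until the restarted service answers. The sketch below is one way to do that in Python; `wait_until_back` is an illustrative name, and `ping` is any caller-supplied function that returns the parsed /ping reply (for example a thin wrapper around an HTTP GET).

```python
import time


def wait_until_back(ping, attempts=30, delay_s=2.0, sleep=time.sleep):
    """Poll `ping()` until it reports status "ok"; return the reply or None."""
    for attempt in range(1, attempts + 1):
        try:
            reply = ping()
            if reply.get("status") == "ok":
                return reply
        except OSError:
            # Connection refused or timed out while the service restarts.
            pass
        if attempt < attempts:
            sleep(delay_s)
    return None
```

Comparing the `version` field of the returned reply with the value seen before the update is a simple way to confirm that the new APK is the one now running.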
Xiaomi's aggressive battery optimization will kill background services. To keep Claw Use alive:
- Battery saver: Set to "No restrictions" for Claw Use
- Autostart: Enable in Security → Autostart
- Lock in recents: Open Claw Use → long press in recent apps → tap the lock icon
- Battery optimization: The app auto-requests exemption on launch
/screen returns the accessibility tree as JSON.
{
  "package": "org.telegram.messenger",
  "timestamp": 1742108400000,
  "count": 26,
  "nodes": [
    {"text": "Search Chats", "bounds": "0,280,1220,422", "click": true},
    {"text": "John", "desc": "Last message preview", "bounds": "0,422,1220,600"}
  ]
}

Query params:
- compact=true: only nodes with text/desc/clickable/editable/scrollable
- timeout=5000: max milliseconds to wait for accessibility tree
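
A small Python sketch of turning a /screen reply into tap coordinates. It assumes `bounds` is `"left,top,right,bottom"` in pixels, which the sample values suggest but the text does not state; `node_center` and `find_clickable` are illustrative names.

```python
def node_center(node):
    """Return the (x, y) center of a node, assuming bounds is "left,top,right,bottom"."""
    left, top, right, bottom = (int(v) for v in node["bounds"].split(","))
    return ((left + right) // 2, (top + bottom) // 2)


def find_clickable(screen, text):
    """Return the first clickable node whose text or desc contains `text`, else None."""
    wanted = text.lower()
    for node in screen.get("nodes", []):
        label = node.get("text") or node.get("desc") or ""
        if node.get("click") and wanted in label.lower():
            return node
    return None
```

For the sample reply above, `find_clickable(reply, "search")` matches the "Search Chats" node, and `node_center` gives `(610, 351)`, which could be sent to /tap as `{"x": 610, "y": 351}`.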
/screenshot returns a base64-encoded JPEG screenshot.
{
  "screenshot": "/9j/4AAQ...",
  "format": "jpeg",
  "quality": 50,
  "sizeBytes": 42000,
  "timestamp": 1742108400000
}

Query params:
- quality=50: JPEG quality (10-100, default 50)
- maxWidth=720: max pixel width (100-2000, default 720)
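
A hedged Python sketch of the client side of /screenshot: building the query string and decoding the reply into raw JPEG bytes. The field names and ranges come from the text above; the function names are illustrative, and rejecting out-of-range values on the client is a choice made here, since the server's behaviour for such values is not documented.

```python
import base64
from urllib.parse import urlencode


def screenshot_path(quality=50, max_width=720):
    """Build the /screenshot path with validated query parameters."""
    if not 10 <= quality <= 100:
        raise ValueError("quality must be between 10 and 100")
    if not 100 <= max_width <= 2000:
        raise ValueError("maxWidth must be between 100 and 2000")
    return "/screenshot?" + urlencode({"quality": quality, "maxWidth": max_width})


def decode_screenshot(reply):
    """Decode the base64 payload and check that it looks like a JPEG."""
    if reply.get("format") != "jpeg":
        raise ValueError("unexpected format: %r" % reply.get("format"))
    data = base64.b64decode(reply["screenshot"])
    if not data.startswith(b"\xff\xd8"):  # JPEG start-of-image marker
        raise ValueError("payload is not a JPEG")
    return data
```

The returned bytes can be written straight to a `.jpg` file or handed to an image library or vision model.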
/click performs a semantic click: it finds an element by text, content description, or ID and taps its center.
{"text": "Send"}
// or
{"desc": "Search button"}
// or
{"id": "com.app:id/send_button"}

/type types text. It uses clipboard paste for reliable CJK support.
{"text": "你好世界"}
// → {"typed": true, "text": "你好世界", "method": "clipboard_paste"}

/intent fires any Android Intent.
// Open a URL
{"action": "android.intent.action.VIEW", "uri": "https://example.com"}
// Make a phone call
{"action": "android.intent.action.CALL", "uri": "tel:+1234567890"}
// Send SMS
{"action": "android.intent.action.SENDTO", "uri": "smsto:+1234567890", "extras": {"sms_body": "Hello"}}
// Share text
{"action": "android.intent.action.SEND", "type": "text/plain", "extras": {"android.intent.extra.TEXT": "Check this out"}}
// Deep link
{"action": "android.intent.action.VIEW", "uri": "tg://resolve?domain=username"}Speak text through the phone's speaker.
{"text": "Hello world", "language": "en-US", "rate": 1.0, "pitch": 1.0}- Android 7.0+ (API 24) for core features
- Android 11+ (API 30) for
/screenshot - No root required
- No ADB required
- No PC required
git clone https://github.com/4ier/claw-use-android.git
cd claw-use-android
./gradlew assembleDebug
# APK at app/build/outputs/apk/debug/app-debug.apk

Claw Use Android is the first implementation of the Claw Use protocol, a standard HTTP API for AI agents to control physical devices.
The protocol defines a common set of endpoints (/screen, /screenshot, /tap, /type, /tts, etc.) that any device can implement. The same cu CLI and agent skills work across all compliant devices:
cua add redmi 192.168.0.105 <token> # Android phone
cua add ipad 100.80.1.10 <token> # future: iOS
cua add laptop 100.80.1.20 <token> # future: desktop
cua -d redmi screenshot # same command, any device

Want to add Claw Use support for a new platform? Implement the HTTP endpoints documented above, return JSON, support token auth. The ecosystem comes free.
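
As a rough illustration of those three requirements (HTTP endpoints, JSON replies, token auth), here is a toy device written with Python's standard library. It is a sketch under assumptions, not a reference implementation: the `TOKEN` value, the `/info` payload, the 401 status, and the error body are all made up here, and only /ping and /info are handled.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

TOKEN = "example-token"  # hypothetical; a real device would generate its own


class Handler(BaseHTTPRequestHandler):
    def _send(self, code, payload):
        """Write `payload` as a JSON response with the given status code."""
        body = json.dumps(payload).encode("utf-8")
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        if self.path == "/ping":
            # Health check: the only endpoint that skips auth.
            self._send(200, {"status": "ok", "service": "example-device"})
        elif self.headers.get("X-Bridge-Token") != TOKEN:
            self._send(401, {"error": "invalid token"})  # assumed error shape
        elif self.path == "/info":
            self._send(200, {"model": "example"})
        else:
            self._send(404, {"error": "not found"})

    def log_message(self, format, *args):
        pass  # keep the example quiet


def make_server(port=0):
    """Create a server on localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), Handler)
```

Calling `make_server(7333).serve_forever()` would expose the toy device on the same port the Android app uses.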
MIT
Built for agents that need a real phone.