SECTION 1: THE AGENT RUNTIME (`src/agents/`, ~300 files)

This is where agents actually THINK AND ACT - the consciousness loop.


THE CORE LOOP
The agent loop is NOT a TRADITIONAL `while(true)` loop. It's EVENT-DRIVEN AND
REQUEST-RESPONSE BASED, managed by the `pi-agent-core` SDK.


ENTRY POINT: `runEmbeddedPiAgent()` in `src/agents/pi-embedded-runner/run.ts:137`

The flow for every single turn:



   User Message Arrives
   > runEmbeddedPiAgent - L137
      > Resolve workspace & session
      > Load/validate model config
      > Resolve auth profiles with fallback
      > Enter retry loop
         > runEmbeddedAttempt - L399
            > Load session file from disk
            > Create all tools
            > Build system prompt
            > Create agent session object
            > Apply system prompt override
            > Subscribe to streaming events
            > Call SDK streaming to LLM
               > Events stream back:
                  > message_start
                  > message_update
                  > message_end
                  > tool_execution_start
                  > tool_execution_update
                  > tool_execution_end
                  > agent_end
            > Accumulate response
      > Handle errors  (L 474-797)
      > Return or retry   

Response Payload(s) sent back





THE "THINKING" -- It's Delegated

The actual LLM thinking is NOT in OpenClaw code. It's delegated to the Anthropic
SDK (or equivalent provider). What OpenClaw does:
   1. Sends message to LLM with tools and system prompt
   2. LLM processes, decides what to do, generates response
   3. SDK streams back events
   4. OpenClaw subscribes to events and handles them




The event handlers live in `pi-embedded-subscribe.handlers.ts` (L 24-61), which
dispatches to:
   - `handleMessageStart/Update/End` (in `handlers.messages.ts`)
   - `handleToolExecutionStart/Update/End` (in `handlers.tools.ts`)
   - `handleAgentStart/End` (in `handlers.lifecycle.ts`)
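
The dispatch step can be pictured as a switch over the event type. This is a
minimal sketch, not the contents of `pi-embedded-subscribe.handlers.ts`: the
`AgentEvent` shape, the `dispatchEvent` function and the `trace` array are
assumptions made for illustration. Only the event names and handler file names
come from the notes above.

```ts
// Hypothetical event shape: only the `type` strings come from the flow above.
type AgentEvent =
  | { type: "message_start" | "message_update" | "message_end"; text?: string }
  | {
      type: "tool_execution_start" | "tool_execution_update" | "tool_execution_end";
      toolName: string;
    }
  | { type: "agent_start" | "agent_end" };

// Routes one event to its handler family and records which family handled it.
function dispatchEvent(event: AgentEvent, trace: string[]): void {
  switch (event.type) {
    case "message_start":
    case "message_update":
    case "message_end":
      trace.push(`messages:${event.type}`); // would call into handlers.messages.ts
      break;
    case "tool_execution_start":
    case "tool_execution_update":
    case "tool_execution_end":
      trace.push(`tools:${event.type}:${event.toolName}`); // handlers.tools.ts
      break;
    case "agent_start":
    case "agent_end":
      trace.push(`lifecycle:${event.type}`); // handlers.lifecycle.ts
      break;
  }
}
```

Grouping the cases by family mirrors the three handler files, so a new event
type only needs one extra `case` line in the matching group.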





MULTI-LEVEL FAILOVER (3 LEVELS DEEP)
   The retry loop at L392 handles failures gracefully:

   LEVEL 1 - Auth Profile Rotation:
   - If current API key/profile fails, try the next one for same model
   - Profiles in cooldown are skipped

   LEVEL 2 - Model Fallback:
   - When the model itself fails (not just auth), switch to fallback model
   - Configured via `config.agents.defaults.model.fallbacks[]`
   - Throws `FailoverError` to trigger model change

   LEVEL 3 - Context Reduction:
   - On context overflow: auto-compaction (summarise old messages)
   - Up to 3 compaction retries
   - If still too big: truncate oversized tool results

Error classification (`pi-embedded-helpers.ts:30`): detects "auth",
"rate_limit", "billing", "timeout", "unknown"

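The first two failover levels and the error classes can be sketched as follows.
This is an illustration under stated assumptions, not the real retry loop:
`classifyError` uses made-up string matching (the actual rules in the helpers
file may differ), and `planAttempts` is a hypothetical helper that only shows
the ordering "rotate profiles first, then fall back to the next model".
Level 3 (context reduction) is not covered here.

```ts
type FailureKind = "auth" | "rate_limit" | "billing" | "timeout" | "unknown";

// Hypothetical classifier: the matching rules are assumptions for illustration.
function classifyError(message: string): FailureKind {
  const m = message.toLowerCase();
  if (m.includes("401") || m.includes("unauthorized") || m.includes("invalid api key")) return "auth";
  if (m.includes("429") || m.includes("rate limit")) return "rate_limit";
  if (m.includes("billing") || m.includes("insufficient credit")) return "billing";
  if (m.includes("timeout") || m.includes("timed out")) return "timeout";
  return "unknown";
}

interface Attempt {
  model: string;
  profile: string;
}

// Lists attempts in failover order: every usable profile for the primary model
// (Level 1), then the same profiles for each fallback model (Level 2).
function planAttempts(models: string[], profiles: string[], coolingDown: Set<string>): Attempt[] {
  const plan: Attempt[] = [];
  for (const model of models) {
    for (const profile of profiles) {
      if (coolingDown.has(profile)) continue; // profiles in cooldown are skipped
      plan.push({ model, profile });
    }
  }
  return plan;
}
```

A caller would walk the plan in order and stop at the first attempt that
succeeds, using the failure kind to decide whether a profile enters cooldown.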



SYSTEM PROMPT CONSTRUCTION
`buildAgentSystemPrompt()` in `system-prompt.ts:164` builds the prompt with 
25+ SECTIONS in order:
   
   1. Identity - "You are a personal assistant running inside OpenClaw"
   2. Tooling - Tool listing and descriptions
   3. Tool Call Style - Narration Guidance
   4. Safety - Self-preservation, manipulation safeguards
   5. CLI Quick Reference
   6. Skills - Conditional skill injection
   7. Memory Recall - If memory tools available
   8. Self-Update - For gateway tool
   9. Model Aliases - If configured
   10. Workspace - Working directory declaration
   11. Documentation
   12. Sandbox Info - If sandboxed
   13. User Identity
   14. Time/Timezone
   15. Workspace Files
   16. Reply Tags
   17. Messaging context
   18. Voice/TTS
   19. Group Chat Context
   20. Reactions - If enabled
   21. Reasoning Format
   22. Project Context - SOUL.md + context files
   23. Silent Replies
   24. Heartbeats
   25. Runtime - Agent ID, host, OS, model, shell, channel, capabilities


Prompt Modes: "full" (main agent), "minimal" (subagents - fewer sections), 
"none" (identity only)

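The three prompt modes can be sketched as a filter over an ordered section
list. This is not `buildAgentSystemPrompt()` itself: the `PromptSection` type,
the `minimal` flag and the choice of which sections survive in "minimal" mode
are assumptions. Only the mode names and the identity line come from the notes.

```ts
type PromptMode = "full" | "minimal" | "none";

interface PromptSection {
  title: string;
  body: string;
  minimal?: boolean; // hypothetical flag: keep this section for subagents
}

const IDENTITY = "You are a personal assistant running inside OpenClaw";

// Assembles the prompt: identity first, then whichever sections the mode allows.
function buildPrompt(mode: PromptMode, sections: PromptSection[]): string {
  if (mode === "none") return IDENTITY; // identity only
  const kept = sections.filter((s) => mode === "full" || s.minimal === true);
  return [IDENTITY, ...kept.map((s) => `${s.title}\n${s.body}`)].join("\n\n");
}
```

Keeping the sections in one ordered array makes the "25+ sections in order"
property explicit: the output order is simply the array order.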

SUBAGENT SPAWNING

Via the `sessions_spawn` tool (`tools/sessions-spawn-tool.ts:83`):


> Main Agent calls sessions_spawn
   > Validate agent ID against allowlist 
   > Create child session key: "agent:{id}:subagent:{uuid}"
   > Resolve model (per-agent config > global default)
   > Build subagent system prompt (minimal mode)
   > Register & queue async execution 
> Main Agent continues without waiting
> Subagent runs independently via same `runEmbeddedPiAgent()` loop
> Main agent queries via: sessions_list, sessions_history, sessions_send

Key: subagents are NOT blocking. They run async. The parent can keep working.
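
The non-blocking behaviour can be sketched as "start the child run, return the
session key, do not await". The key format is taken from the flow above;
`spawnSubagent`, its parameters and the error text are hypothetical and are not
the real tool implementation.

```ts
import { randomUUID } from "node:crypto";

// Key format taken from the flow above: "agent:{id}:subagent:{uuid}".
function childSessionKey(agentId: string): string {
  return `agent:${agentId}:subagent:${randomUUID()}`;
}

// Hypothetical spawn helper: validates, starts the run, returns without awaiting.
function spawnSubagent(
  agentId: string,
  allowedAgentIds: readonly string[],
  run: (sessionKey: string) => Promise<void>,
): string {
  if (!allowedAgentIds.includes(agentId)) {
    throw new Error(`agent "${agentId}" is not in the spawn allowlist`);
  }
  const sessionKey = childSessionKey(agentId);
  // Fire and forget: the parent keeps working while the child runs.
  run(sessionKey).catch((err: unknown) => console.error(`subagent ${sessionKey} failed`, err));
  return sessionKey;
}
```

Because the promise is not awaited, the parent only learns about progress by
asking later, which is what `sessions_list` and `sessions_history` are for.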

SESSION STATE PERSISTENCE

- FORMAT: JSONL at `{workspace}/.openclaw/session.jsonl`
- CONTENT: Session header + user messages + assistant messages + tool calls/results
- REOPENED at each run - previous messages loaded into memory
- COMPACTION: When context window exceeded, old messages get summarised into a
  single "summary" message
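
Loading and compacting a JSONL session can be sketched like this. The entry
shape (`type`, `text`), the "header comes first" layout and the `keepLast`
policy are assumptions for illustration; the real file format and compaction
logic may carry more fields and different rules.

```ts
interface SessionEntry {
  type: string; // e.g. "header", "user", "assistant", "summary" (assumed names)
  text?: string;
}

// One JSON object per non-empty line.
function parseSessionJsonl(fileText: string): SessionEntry[] {
  return fileText
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as SessionEntry);
}

// Replaces all but the last `keepLast` messages with a single summary entry.
function compactSession(
  entries: SessionEntry[],
  keepLast: number,
  summarize: (old: SessionEntry[]) => string,
): SessionEntry[] {
  const [header, ...messages] = entries;
  if (header === undefined || messages.length <= keepLast) return entries;
  const cut = messages.length - keepLast;
  const summary: SessionEntry = { type: "summary", text: summarize(messages.slice(0, cut)) };
  return [header, summary, ...messages.slice(cut)];
}
```

In the real runtime the `summarize` step would itself be an LLM call; here it
is a plain callback so the shape of the transformation stays visible.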




AGENT CONFIGURATION LAYERING

4 levels, each cascading:
   1. Global defaults (`config.agents.defaults`)
   2. Per-agent overrides (`config.agents.list[].model`, `.workspace`, etc.)
   3. Runtime overrides (sessionKey agent prefix)
   4. Subagent overrides
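
The cascade can be sketched as a merge in which each later layer overrides the
one before it. That direction (later wins) is an assumption, as are the
`AgentSettings` field `promptMode` and the `resolveSettings` helper; `model`
and `workspace` are the fields named above.

```ts
interface AgentSettings {
  model?: string;
  workspace?: string;
  promptMode?: "full" | "minimal" | "none"; // hypothetical field
}

// Later layers win; an undefined value never erases an earlier setting.
function resolveSettings(...layers: AgentSettings[]): AgentSettings {
  const resolved: AgentSettings = {};
  for (const layer of layers) {
    if (layer.model !== undefined) resolved.model = layer.model;
    if (layer.workspace !== undefined) resolved.workspace = layer.workspace;
    if (layer.promptMode !== undefined) resolved.promptMode = layer.promptMode;
  }
  return resolved;
}
```

Called as `resolveSettings(globalDefaults, perAgent, runtime, subagent)`, a
subagent that only sets `promptMode` still inherits the model and workspace
chosen further up.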

SECTION 2: THE TOOL SYSTEM

This is how agents EXECUTE ACTIONS in the real world.


WHERE TOOLS ARE DEFINED
   > `src/agents/pi-tools.ts:15`
      > Main tool creation: `createOpenClawCodingTools()`
   > `src/agents/openclaw-tools.ts:22`
      > OpenClaw-specific tools: `createOpenClawTools()`
   > `src/agents/tools/`
      > Individual tool implementations (22+ files)
   > `src/agents/bash-tools.exec.ts:800`
      > Full bash exec logic: `createExecTool()`
   > `src/agents/bash-tools.process.ts:44`
      > Background process management
   > `src/plugins/tools.ts:43`
      > Plugin tool registration               

TOOL INTERFACE

Every tool implements:
```ts
interface AgentTool<ParametersType, DetailsType> {
    name: string;
    label: string;
    description: string;
    parameters: JSONSchema;             // JSON Schema describing ParametersType
    execute: (
        toolCallId: string,
        args: unknown,                  // arguments supplied by the LLM
        signal?: AbortSignal,
        onUpdate?: (partialResult: AgentToolResult<DetailsType>) => void
    ) => Promise<AgentToolResult<DetailsType>>;
}
```

Result structure:
```ts
interface AgentToolResult<DetailsType = unknown> {
    content: Array<{
        type: "text" | "image" | "audio";
        text?: string;
        data?: string;          // base64-encoded payload
        mimeType?: string;
    }>;
    details?: DetailsType;      // Tool-specific structured data
}
```

Created in `openclaw-tools.ts:22`:

1. `browser` - Web browser control (Playwright)
2. `canvas` - Present/eval/snapshot canvas UI
3. `nodes` - List/describe/notify/camera/screen on paired devices
4. `cron` - Schedule wake events and cron jobs
5. `message` - Send messages and channel actions (conditional)
6. `tts` - Text-to-speech
7. `gateway` - Restart/config/update OpenClaw itself
8. `agents_list` - List allowed agents for spawning
9. `sessions_list` - List other sessions
10. `sessions_history` - Fetch another session's history
11. `sessions_send` - Send message to another session
12. `sessions_spawn` - Spawn sub-agent session
13. `session_status` - Show status card
14. `web_search` - Search the web (Brave API)
15. `web_fetch` - Fetch & extract from URL
16. `image` - Analyse image (conditional)


Plus coding tools from `pi-tools.ts`: `read`, `write`, `edit`, `exec`, `process`,
`apply-patch`

Plus plugin tools loaded dynamically via `resolvePluginTools()`


THE BASH/EXEC TOOL (CRITICAL for Android App Building)

File: `bash-tools.exec.ts:800 - createExecTool()`

Parameters:
```ts
{
    command: string,                    // Shell command to run
    workdir?: string,                   // Working directory
    env?: Record<string, string>,       // Environment variables
    yieldMs?: number,                   // Ms before backgrounding (default 10000)
    background?: boolean,               // Run immediately in background
    timeout?: number,                   // Timeout in seconds
    pty?: boolean,                      // Pseudo-terminal (for interactive CLIs)
    elevated?: boolean,                 // Elevated permissions (gateway only)   
    host?: string,                      // "sandbox" | "gateway" | "node"
    security?: string,                  // "deny" | "allowList" | "full"
    ask?: string,                       // "off" | "on-miss" | "always"
    node?: string                       // Node ID for host=node
}
```

THREE EXECUTION MODES:
1. SANDBOX MODE (Default, safest) - Runs in a Docker container via `docker exec`
   Environment filtered through `buildSandboxEnv()`
2. GATEWAY MODE (semi-trusted) - Runs on host machine. Requires approval.
   Allowlist-based security.
3. NODE MODE (remote) - Runs on paired device via `system.run` gateway command.      


PROCESS HANDLING (`runExecProcess()`):
- Spawns process with output stream
- PTY support for terminal-based apps           - IMPORTANT FOR GRADLE INTERACTIVE MODE
- Output truncation to 200KB default
- Background execution via `yieldMs` or `background=true` 
                                                - CRITICAL SINCE ANDROID BUILDS TAKE 2-5 MINUTES


BACKGROUND PROCESS MANAGEMENT (`bash-tools.process.ts:44`)
- Actions: list, poll, log, write, send-keys, submit, paste, kill, clear, remove
- Long-running processes persist across tool calls
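
The `yieldMs` behaviour can be sketched as a race between the command and a
timer: if the command finishes first, its output is returned; otherwise the
caller is told the process is still running. `runWithYield` and `ExecOutcome`
are hypothetical names and this is not `runExecProcess()`; only the 10000 ms
default comes from the parameter list above.

```ts
type ExecOutcome =
  | { status: "completed"; output: string }
  | { status: "running"; note: string };

// Waits up to `yieldMs` for the work to finish, then hands control back.
async function runWithYield(work: Promise<string>, yieldMs = 10_000): Promise<ExecOutcome> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const yielded = new Promise<ExecOutcome>((resolve) => {
    timer = setTimeout(() => resolve({ status: "running", note: "moved to background" }), yieldMs);
  });
  const finished = work.then((output): ExecOutcome => ({ status: "completed", output }));
  try {
    return await Promise.race([finished, yielded]);
  } finally {
    clearTimeout(timer); // avoid a dangling timer once either side has won
  }
}
```

For a multi-minute Gradle build the timer wins, so the agent gets control back
quickly and can check on the build later through the process actions listed
above.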


THE APPROVAL/PERMISSION SYSTEM

FLOW (L 1120-1500 in `bash-tools.exec.ts`):
   
   1. Command analysed via `evaluateShellAllowlist()` against safe patterns
   2. `requiresExecApproval()` decides based on `ask` and `security` settings
   3. If approval needed: returns "approval-pending" with `approvalId`
   4. Gateway sends approval request with 120s timeout.
   5. User approves: "allow-once" or "always-allow" (saved to allowList)
   6. Command executes
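
Step 2 can be pictured as a small decision table over the `security` and `ask`
settings. The table below is a guess made for illustration, not the logic of
`requiresExecApproval()`: `decideExec`, the `ExecDecision` values and every
branch are assumptions and should be checked against the source before being
relied on.

```ts
type SecurityMode = "deny" | "allowList" | "full";
type AskMode = "off" | "on-miss" | "always";
type ExecDecision = "run" | "ask" | "block";

// Hypothetical decision table; the real rules may differ.
function decideExec(security: SecurityMode, ask: AskMode, allowlisted: boolean): ExecDecision {
  if (security === "deny") return "block"; // nothing runs
  if (ask === "always") return "ask"; // every command needs approval
  if (security === "full" || allowlisted) return "run";
  // security === "allowList" and the command missed the allowlist
  return ask === "on-miss" ? "ask" : "block";
}
```

Under these assumptions an "ask" result corresponds to step 3 (an
"approval-pending" reply with an `approvalId`), and an "always-allow" answer
from the user would turn the same command into an allowlist hit next time.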



APPROVALS CONFIG: `~/.openclaw/exec-approvals.json` with per-agent allowlists

SAFE BINS (auto-approved): `jq`, `grep`, `cut`, `sort`, `uniq`, `head`, `tail`,
   `tr`, `wc`

BLOCKED ENV VARS: `LD_PRELOAD`, `DYLD_*`, `NODE_OPTIONS`, `PYTHONPATH`, 
   `BASH_ENV`, etc.

ADDING CUSTOM TOOLS (Two Methods)

Method 1 - Plugin registration (recommended):
```ts
// In a plugin file
export function register(api) {
  api.registerTool({
    name: "android-build",
    description: "Build Android APK with Gradle",
    parameters: {
      type: "object",
      properties: {
        variant: { type: "string" },
        clean: { type: "boolean" }
      }
    },
    execute: async (toolCallId, args) => {
      return {
        content: [{ type: "text", text: "Build result..." }],
        details: { success: true }
      }
    }
  });
}
```


METHOD 2 -- DIRECT CREATION in `src/agents/tools/`:
```ts
function createCustomTool(): AgentTool<any, unknown> {
  return { name, description, parameters, execute };
}
```


SECTION 3: THE ANDROID APP (`apps/android`, 66 Kotlin files)

This is OpenClaw's existing Android companion app - relevant for understanding
how to programmatically build and package an Android app.




DIRECTORY STRUCTURE:
  apps/android/
  ├── app/src/main/
  │   ├── java/ai/openclaw/android/
  │   │   ├── NodeApp.kt               # Application entry point
  │   │   ├── MainActivity.kt          # Main Compose activity
  │   │   ├── NodeForegroundService.kt # Keeps app alive
  │   │   ├── gateway/                 # WebSocket communication
  │   │   │   ├── GatewaySession.kt    # WebSocket client
  │   │   │   ├── GatewayDiscovery.kt  # mDNS service discovery
  │   │   │   ├── GatewayTls.kt        # TLS pinning & TOFU
  │   │   ├── node/                    # Device capability managers
  │   │   │   ├── CameraCaptureManager.kt
  │   │   │   ├── ScreenRecordManager.kt
  │   │   │   ├── LocationCaptureManager.kt
  │   │   │   ├── SmsManager.kt
  │   │   ├── chat/                   # Chat functionality
  │   │   ├── protocol/               # Protocol definitions
  │   │   ├── ui/                     # Compose UI components
  │   │   │   ├── RootScreen.kt       # Main screen
  │   │   │   ├── CanvasController.kt # WebView for canvas
  │   │   │   ├── ChatSheet.kt
  │   │   │   ├── SettingsSheet.kt
  │   │   │   ├── StatusPill.kt
  │   │   │   ├── TalkOrbOverlay.kt
  │   │   │   ├── OpenClawTheme.kt
  │   │   └── voice/                  # Voice wake & talk mode
  │   ├── res/                        # Resources
  │   └── AndroidManifest.xml
  ├── build.gradle.kts                # Project-level
  ├── app/build.gradle.kts            # App-level
  ├── settings.gradle.kts
  ├── gradle/wrapper/
  ├── gradlew / gradlew.bat
  └── README.md


BUILD CONFIGURATION
   From `app/build.gradle.kts`
   - GRADLE: 9.2.1 (via wrapper)
   - KOTLIN: 2.2.21 with Compose
   - SDK: compileSdk 36, minSdk 31, targetSdk 36
   - JAVA: JVM 17
   - Version: 2026.2.6
   - Namespace: `ai.openclaw.android`
   - Output: `openclaw-[versionName]-{buildType}.apk`


KEY DEPENDENCIES:
   - Compose BOM: 2025.12.00
   - OkHttp: 5.3.2 (WebSocket)
   - Coroutines: 1.10.2
   - kotlinx-serialization: 1.9.0
   - CameraX: 1.5.2
   - DNS-SD: dnsjava:3.6.4
   - Security: androidx.security:security-crypto:1.1.0


HOW IT CONNECTS TO GATEWAY

DISCOVERY (`GatewayDiscovery.kt`)
   - Local Bonjour mDNS: `_openclaw-gw._tcp` service type
   - Wide-area DNS-SD via configurable domain
   - Returns `GatewayEndpoint` with host, port, TLS settings

WebSocket Protocol (`GatewaySession.kt`):
   - OkHttp3 WebSocket (`ws://` or `wss://`)
   - JSON-RPC style request/response
   - Connection flow:
     a. Establish WebSocket
     b. Receive connection nonce from gateway
     c. Send `connect` frame with device identity + token + device info
     d. Persistent connection with auto-reconnect

AUTH: Ed25519 key pair per device, SHA-256 hash for `deviceId`, tokens in
   `EncryptedSharedPreferences`
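
The identity scheme can be sketched with Node's built-in `crypto` module. The
app itself is Kotlin; TypeScript is used here only to stay consistent with the
other examples in these notes. Which bytes are hashed is an assumption: this
sketch hashes the DER-encoded public key, while the app may hash the raw key
bytes or use a different encoding for the ID.

```ts
import { createHash, generateKeyPairSync } from "node:crypto";

// Assumption: the ID is the hex SHA-256 digest of the encoded public key.
function deriveDeviceId(publicKeyBytes: Buffer): string {
  return createHash("sha256").update(publicKeyBytes).digest("hex");
}

// Generate a fresh Ed25519 key pair and derive a stable ID from its public half.
const { publicKey } = generateKeyPairSync("ed25519");
const deviceId = deriveDeviceId(publicKey.export({ type: "spki", format: "der" }));
console.log(deviceId.length); // 64 hex characters
```

Deriving the ID from the public key means the gateway can recompute it from
the key the device presents, rather than trusting a separately stored value.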



COMMANDS THE APP HANDLES
- `canvas.*` - Present, Hide, Navigate, Eval, Snapshot
- `camera.*` - Snap (photo), Clip (video)
- `screen.*` - Record MP4
- `location.*` - GPS location
- `sms.*` - Send SMS
- `chat.*` - Send/load messages


BUILD COMMANDS (For Autonomous Building)

```bash
cd apps/android
./gradlew :app:testDebugUnitTest       # Run tests
./gradlew :app:assembleDebug           # Build debug APK
./gradlew :app:installDebug            # Install to connected device
```

CI/CD (`.github/workflows/ci.yml`):
- Java 21 (Temurin)
- Android SDK setup (accept licenses, install platforms + build-tools)
- Gradle 8.11.1
- Debug builds only (no release signing in repo)

KEY PATTERNS FOR AUTONOMOUS APP BUILDING
   1. VERSION CONFIG is in `build.gradle.kts` - easily scriptable
   2. GRADLE WRAPPER ensures consistent builds - no local Gradle install needed
   3. SHARED ASSETS bundled from `../../shared/OpenClawKit/Sources/Resources`
   4. DEVICE IDENTITY auto-generated (ED25519 + UUID)
   5. NO SIGNING CONFIG FOR RELEASE - would need keystore setup for Play Store




---
SECTION 4: SKILLS SYSTEM & PLUGIN SDK   

Skills - Structure
   Skills are modular, self-contained packages:

  skill-name/
  ├── SKILL.md (required)           # YAML frontmatter + Markdown instructions
  │   ├── name: (required)
  │   ├── description: (required)   # This is the trigger mechanism
  │   └── Body: instructions
  └── Bundled Resources (optional)
      ├── scripts/                  # Executable code
      ├── references/               # Documentation for context
      └── assets/                   # Templates, icons, etc.


SKILLS - LOADING (4 SOURCES, IN PRECEDENCE ORDER)

From `src/agents/skills/workspace.ts:99`:
   1. BUNDLED SKILLS - Compiled into OpenClaw
   2. EXTRA SKILLS - From config paths or plugin skill directories
   3. MANAGED SKILLS - From `~/.openclaw/skills`
   4. WORKSPACE SKILLS - From `<workspace>/skills`

Skills are filtered by: config enablement, binary availability, platform
eligibility.
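
Merging the four sources can be sketched as a name-keyed map filled in source
order. Which side wins on a name collision is an assumption: this sketch lets
later sources (for example workspace skills) override earlier ones. If the real
loader resolves collisions the other way, the loop order flips. `SkillEntry`
and `mergeSkillSources` are hypothetical names.

```ts
interface SkillEntry {
  name: string;
  source: "bundled" | "extra" | "managed" | "workspace";
}

// Fills a map in source order, so a later source replaces a same-named skill.
function mergeSkillSources(sourcesInOrder: SkillEntry[][]): SkillEntry[] {
  const byName = new Map<string, SkillEntry>();
  for (const source of sourcesInOrder) {
    for (const skill of source) byName.set(skill.name, skill);
  }
  return Array.from(byName.values());
}
```

The filtering step (config enablement, binary availability, platform
eligibility) would run on the merged list afterwards.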


SKILLS - PROGRESSIVE DISCLOSURE (3 Levels)

1. Metadata (~100 words)     - Always loaded into agent context
2. SKILL.md body (<5k words) - Loaded when skill triggers
3. Bundled resources         - Loaded on-demand by agent


SKILLS - BUILT-IN LIST (Sample)

   Infrastructure: `github, discord, slack, tmux, coding-agent`
   Data/Content: `notion, obsidian, apple-notes ...`
   Communication: `imsg, video-call...`
   Utilities: `summarise, healthcheck, weather, gifgrep, model-usage`
   Workflow: `skill-creator, review-pr, prepare-pr, merge-pr`

CREATING A CUSTOM SKILL (e.g., "android-app-builder")

   From `skills/skill-creator/SKILL.md:201`:

```markdown
# Step 1: Initialise
scripts/init_skill.py android-app-builder --path skills/public \
  --resources scripts,references,assets --examples

# Step 2: Edit SKILL.md with frontmatter
---
name: android-app-builder
description: Build Android apps with Gradle and Kotlin. Use when creating 
  projects, adding features, configuring builds, or packaging APKs.
---
[Markdown instructions here]

# Step 3: Package
scripts/package_skill.py skills/public/android-app-builder ./dist
```




This snippet is the raw developer documentation for OpenClaw's Plugin SDK
(Software Development Kit).

If the OpenClaw Gateway is the "brain" and the CNS of your AI agent, the Plugin
SDK is the "USB port." It is the official way you plug completely new 
capabilities, custom tools, and safety guardrails into OpenClaw without ever
having to modify or hack its core source code.

Because your ultimate goal is to let Opus 4.6 autonomously engineer and QA test
Android apps with zero human intervention, THIS PLUGIN SYSTEM IS EXACTLY HOW
YOU WILL BUILD YOUR CUSTOM SCAFFOLDING.

Instead of writing a messy Python script from scratch, you will package your
ADB terminal commands, your vision-checking prompts, and your auto-connection
loops into a neat OpenClaw plugin.

... how this architecture works and how you will use it for your project.

---
PLUGIN SDK - Structure

Plugins defined via `openclaw.plugin.json` manifest:
```TS
{
    "id": "unique-plugin-id",
    "name": "Display Name",
    "description": "What it does",
    "version": "1.0.0",
    "configSchema": { /* JSON SCHEMA */ },
    "channels": ["discord"],
    "skills": ["skill-name"]
}
```

Plugin module:
```TS
type OpenClawPluginDefinition = {
    id?: string;
    name?: string;
    register?: (api: OpenClawPluginApi) => void;
}
```

1. STRUCTURE: The "ID Card" and the "Code"
   Before OpenClaw will load your custom code, it needs to know what it is.
   Every plugin requires two parts:
      - THE MANIFEST (`openclaw.plugin.json`): Think of this as the plugin's ID
        badge. It tells the Gateway its name, version, and what permissions it 
        needs. The `configSchema` is incredibly important--it enforces strict
        rules. For example, you can write a JSON schema that says, "This plugin
        requires the file path to the user's Android SDK." OpenClaw will
        automatically read this, generate a beautiful Web UI settings page for
        it, and refuse to boot until you type in a valid folder path.
      - THE MODULE (`register` function): This is your actual TS/JS code. The
        single most important part is the `register(api)` function. When the
        Gateway loads your plugin, it hands this function a set of "keys to the
        castle" (the `api: OpenClawPluginApi`), allowing your code to wire itself
        directly into the AI's core.                    <-- I'm not sure what this means...
                                                            but will push ahead for now...



2. LIFECYCLE: How OpenClaw Boots Your Code
   When you start the OpenClaw Gateway, it follows a strict 3-phase boot 
   sequence:
      1. PHASE 1 -- DISCOVERY: The Gateway scours its `plugins/` folders and
         your system workspace looking for any `openclaw.plugin.json` files.
      2. PHASE 2 -- LOAD && REGISTER: It securely loads your code, validates
         your config file, and executes your `register(api)` function. THIS IS
         THE EXACT MILLISECOND YOUR PLUGIN INSTALLS ITS CUSTOM ANDROID-BUILDING
         TOOLS INTO THE SYSTEM.
      3. PHASE 3 -- ACTIVATION: The Gateway finishes booting. The LLM comes
         online, and your newly registered tools are officially armed and ready 
         for the AI to use.


3. THE PLUGIN API: GIVING OPUS SUPERPOWERS
   When your plugin calls the `register(api)` function, it can choose to use any
   of the capabilities listed in the `OpenClawPluginApi`. This is your toolbox
   for building the autonomous Android engineer:
   - `registerTool()`: This is how you give Opus robotic hands. For your project,
     you would use this to register custom tools like `run_gradle_build`,
     `adb_capture_screenshot`, or `adb_tap_coordinates`. The next time Opus
     wakes up, it will see these exact tools in its system prompt and can 
     trigger them at will.
   - `registerService()`: This is for long-running background tasks. You would
     register a service that automatically spins up your headless Android
     Emulator in the background the moment OpenClaw boots, keeping it alive so
     Opus always has a virtual test device ready.
   - `registerProvider()`: OpenClaw is model-agnostic. You can use this to 
     explicitly register Anthropic's Opus 4.6 as the "brain" for this specific
     workflow.


4. PLUGIN HOOKS: THE "INTERCEPTION" SYSTEM

   HOOKS are arguably the most powerful feature in OpenClaw. They act like an
   assembly-line checkpoint, allowing your plugin to intercept, pause, modify,
   or block the AI's thoughts and actions in real-time.

   - `before_agent_start`: Right before the AI generates a response or writes
     code, this hook pauses the process. You can use this to silently inject 
     live data into the AI's hidden system prompt (e.g., "System update: The
     Android emulator currently has 15% CPU Usage and the last Gradle build
     failed.")
   -   


It is completely normal that this feels... abstract! If you are used to writing
straightforward scripts or configuring apps with simple files, a dynamic
"Plugin Architecture" can feel like unnecessary black magic.

Let...


1. DO I NEED TO PUBLISH THE PLUGIN? (Can I stay stealth?)
   ABSOLUTELY NOT. You can keep your code private...

   "Phase 1" in documentation...
      `Phase 1 -- Discovery: Scan bundled, global, WORKSPACE, and configured directories`

   That word "WORKSPACE" is your stealth... OpenClaw is designed to load...


2. HOW DO THE `register*()` functions actually work? DO THEY WRITE TO A SECRET
   FILE?

   No, they don't write to any `.jsonl` file or database on your hard drive. The
   registration happens entirely IN-MEMORY (in RAM) whilst the OpenClaw Gateway
   is actively running (more like booting up?).

   ...

   As long as the Gateway is running, it remembers this tool and passes it to
   Opus 4.6. The moment you shut down the Gateway, the whiteboard is wiped 
   completely clean. Nothing hidden is left behind on your disk.
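A minimal Python sketch of the "whiteboard" idea above. `ToolRegistry` and
`register_tool` are hypothetical names made up for illustration, not OpenClaw's
actual plugin API (which is TypeScript). The only point shown is that
registration fills a dictionary held in process memory, so a fresh process
starts with nothing registered.

```python
from typing import Callable, Dict, List


class ToolRegistry:
    """Hypothetical in-memory registry: lives only as long as the process."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., str]] = {}

    def register_tool(self, name: str, handler: Callable[..., str]) -> None:
        # Nothing is written to disk; the handler is just stored in a dict.
        self._tools[name] = handler

    def call(self, name: str, *args: str) -> str:
        return self._tools[name](*args)

    def names(self) -> List[str]:
        return sorted(self._tools)


def adb_screenshot(path: str) -> str:
    # Placeholder body; a real tool would shell out to adb here.
    return f"would save screenshot to {path}"


registry = ToolRegistry()            # "Gateway boots"
registry.register_tool("adb_screenshot", adb_screenshot)
print(registry.names())              # ['adb_screenshot']

registry = ToolRegistry()            # "Gateway restarts": whiteboard is clean
print(registry.names())              # []
```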


3. WHY USE PLUGINS INSTEAD OF JUST HARDCODING OR USING A JSON TEMPLATE?
   ... intuition... is right for CONFIGURATION, but it falls short for EXECUTION.

   WHERE YOU CAN USE JSON:
      You absolutely will use a JSON (or YAML) file to configure your workflow!
      You can feed OpenClaw a config file that says: "Create an agent, use Opus
      4.6, set the system prompt to `You are an Android developer`, and give it
      access to the `adb_screenshot` tool."

   WHY YOU STILL NEED A PLUGIN (Code):
      A JSON file is just dead text. It cannot physically interact with your
      computer. A JSON file cannot open a bash terminal, execute 
      `adb shell input 400 500`, wait for the emulator to respond, and parse a
      Java stack trace. To interact with your OS, you need executable code...
                                        <-- yes... but if anything here, gemini mistook
                                            my user intent and question here...


AS FOR HARDCODING:
   Because OpenClaw is open-source, you could absolutely just download the core
   source code and hack your ADB and Gradle scripts directly into the center of
   it. 

   However, if you do that, the next time OpenClaw releases a major update or a
   critical bug fix, downloading the update will overwrite and delete all your
   hardcoded changes. The Plugin-SDK exists so you can keep your top-secret 
   Android logic safely isolated in its own folder. OpenClaw can update itself
   independently, and every time it boots, it seamlessly "plugs in" your stealth
   tools back into its memory.


... looking for the literal, physical step-by-step of where these files live on
your hard drive and how the Gateway knows to execute them. 

You are right... you do NOT type code into CLI line-by-line. And your intuition
about using a "perfectly formatted JSON file detailing all config needed" is
actually exactly how the system works under the hood!

Here is the exact missing link: the physical folder structure you will create
on your computer, what goes inside the files, and how you hook it up to OpenClaw.



---
1. THE PHYSICAL FOLDER (Where you write the code)
   ... you don't even need to touch OpenClaw's core code. Anywhere on

Ah, I see exactly where the disconnect is! You are looking for the literal, physical step-by-step of where these files live on your hard drive and how the Gateway knows to execute them.

What you are experiencing right now is incredibly common among high-level developers. Let's call it the "Hacker’s Dilemma."
        https://gemini.google.com/app/9ee4a6d3967dc06b 

4. Plugin Hooks: The "Interception" System
        https://gemini.google.com/app/9ee4a6d3967dc06b


⏺ Both explorations are done. Here's everything compiled, focused on your question: what scaffolding exists for
  checking that things work, and what you'd need to build.
        Claude Code        

  -----                                                               -----
1 | H |                                                               |He |
  |---+----                                       --------------------+---|
2 |Li |Be |                                       | B | C | N | O | F |Ne |
  |---+---|                                       |---+---+---+---+---+---|
3 |Na |Mg |3B  4B  5B  6B  7B |    8B     |1B  2B |Al |Si | P | S |Cl |Ar |
  |---+---+---------------------------------------+---+---+---+---+---+---|
4 | K |Ca |Sc |Ti | V |Cr |Mn |Fe |Co |Ni |Cu |Zn |Ga |Ge |As |Se |Br |Kr |
  |---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---|
5 |Rb |Sr | Y |Zr |Nb |Mo |Tc |Ru |Rh |Pd |Ag |Cd |In |Sn |Sb |Te | I |Xe |
  |---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---|
6 |Cs |Ba |LAN|Hf |Ta | W |Re |Os |Ir |Pt |Au |Hg |Tl |Pb |Bi |Po |At |Rn |
  |---+---+---+------------------------------------------------------------
7 |Fr |Ra |ACT|
  -------------
              -------------------------------------------------------------
   Lanthanide |La |Ce |Pr |Nd |Pm |Sm |Eu |Gd |Tb |Dy |Ho |Er |Tm |Yb |Lu |
              |---+---+---+---+---+---+---+---+---+---+---+---+---+---+---|
   Actinide   |Ac |Th |Pa | U |Np |Pu |Am |Cm |Bk |Cf |Es |Fm |Md |No |Lr |
              -------------------------------------------------------------



---

- Hooks are instructions that can be inserted into a program by a compiler at
  compile time. Hooks can be placed at the entrances and exits of blocks, at
  statement boundaries, and at points in the program where program flow might
  change between statement boundaries (called path points).

- Hooks in agentic AI are lifecycle callbacks that allow developers to inject
  custom logic, security checks, or monitoring at specific points in an AI 
  agent's execution, such as before/after tool usage or during LLM invocation.
  ... hooks enable dynamic, situational control over agentic behavior, 
  enhancing observability and safety without modifying the core agent code.
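A small Python sketch of the lifecycle-callback idea, assuming a made-up
`HookedToolRunner` class. It is not OpenClaw's hook API; it only shows a
"before" hook that can veto a tool call and an "after" hook that observes the
result, without touching the tool's own code.

```python
from typing import Any, Callable, Dict, List

BeforeHook = Callable[[str, Dict[str, Any]], None]
AfterHook = Callable[[str, Any], None]


class HookedToolRunner:
    """Hypothetical runner that fires callbacks around every tool call."""

    def __init__(self) -> None:
        self.before: List[BeforeHook] = []
        self.after: List[AfterHook] = []

    def run(self, name: str, tool: Callable[..., Any], **kwargs: Any) -> Any:
        for hook in self.before:      # a hook may raise to block the call
            hook(name, kwargs)
        result = tool(**kwargs)
        for hook in self.after:       # observe the result, e.g. for logging
            hook(name, result)
        return result


def block_sms(name: str, kwargs: Dict[str, Any]) -> None:
    # Example security check: refuse one specific tool by name.
    if name == "sms.send":
        raise PermissionError("sms.send is blocked by policy")


audit_log: List[str] = []
runner = HookedToolRunner()
runner.before.append(block_sms)
runner.after.append(lambda name, result: audit_log.append(f"{name} -> {result}"))

print(runner.run("add", lambda a, b: a + b, a=2, b=3))  # 5
print(audit_log)                                        # ['add -> 5']
```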

These are REAL, EXISTING FUNCTIONALITIES in OpenClaw's codebase, not 
hypothetical. Let me clarify what it actually is:

OPENCLAW'S ANDROID APP IS A COMPANION APP - you install it on a real Android
phone, and it acts as a "node" that connects back to an OpenClaw gateway server
via WebSocket. Once connected, an AI agent (like Opus 4.6) can send commands to
the phone through the gateway and the app executes them.

So concretely:

// Command // What it actually does //
`camera.Snap` - Takes a real photo using the phone's camera
`camera.Clip` - Records a real video clip
`screen.Record` - Records the phone's screen as MP4
`location.Get` - Gets real GPS coordinates
`sms.Send` - Sends a real SMS from the phone
`canvas.Snapshot` - Screenshots a WebView the agent controls

The flow is: 
AI AGENT -> Gateway server (WebSocket) -> Android app on phone -> executes 
command -> returns result to agent

So yes, an AI model like Opus 4.6 could use this to take photos, screenshots,
send texts, etc. through a real Android device -- it's not simulating anything,
it's literally remote-controlling phone hardware via the app.

WHAT IT IS NOT: It's not a tool for building Android apps autonomously. It's a
pre-built app that gives agents access to phone capabilities. The build commands
listed (`./gradlew assembleDebug` etc.) are just how you'd compile and install 
the OpenClaw companion app itself.


- Serialisation is the process of converting the state of an object into a form
  that can be persisted or transported. The complement of serialisation is 
  deserialisation, which converts a stream into an object. Together, these 
  processes allow data to be stored and transferred.
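A quick round trip in Python makes the definition concrete. The `NodeCommand`
shape is invented for this example and is not a real OpenClaw message type;
JSON is used only because it is the simplest text format to show.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class NodeCommand:
    # Hypothetical message shape, for illustration only.
    command: str
    timeout_ms: int


original = NodeCommand(command="location.get", timeout_ms=5000)

wire = json.dumps(asdict(original))         # serialise: object -> text
restored = NodeCommand(**json.loads(wire))  # deserialise: text -> object

print(wire)
print(restored == original)  # True
```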

- mDNS: Multicast DNS (mDNS) is a zero-configuration network protocol that 
  resolves hostnames to IP addresses on local networks without a central DNS
  server. It enables devices, such as printers, smart TVs, and computers, to
  automatically discover each other using `.local` domain names. Commonly used
  in home/small networks, it operates on port 5353.  

- Compose for Kotlin refers primarily to Jetpack Compose, Android's modern, 
  native, declarative UI toolkit. It uses Kotlin functions to build UIs with 
  less code, powerful tools, and intuitive APIs, replacing the traditional
  XML-based view system. It also enables COMPOSE MULTIPLATFORM for sharing UI
  code across iOS, desktop, and web.

- Android Debug Bridge (`adb`) is a versatile command-line tool that lets you
  communicate with a device. The adb command facilitates a variety of device
  actions, such as installing and debugging apps. `adb` provides access to a 
  Unix shell that you can use to run a variety of commands on a device.  

- OkHttp is an open-source, efficient HTTP & HTTP/2 client for Android and Java
  applications, developed by Square. It accelerates networking by utilising
  connection pooling, transparent GZIP compression, and response caching, making
  API interactions faster and more reliable while reducing bandwidth usage.

- A coroutine is a programming component that allows functions to suspend their
  execution (pause) and resume later from the same point, enabling
  asynchronous, non-blocking tasks. Unlike threads, they are cooperative,
  lightweight, and managed by the application rather than the OS, making them
  ideal for tasks like network requests, animations, or managing timers.

  KEY CHARACTERISTICS OF COROUTINES:
  - SUSPENSION AND RESUMPTION: A coroutine can yield control ("suspend") before
    finishing, and then resume later. This is often done using keywords like
    `yield` (Unity) or `suspend`/`await` (Kotlin/Python).
  - COOPERATIVE MULTITASKING: Coroutines voluntarily yield control back to the
    scheduler or caller, rather than being pre-emptively interrupted by the OS.
  - LIGHTWEIGHT: Because they are not full-blown OS threads, you can run 
    thousands of coroutines simultaneously without overloading system memory.
  - STATE MANAGEMENT: Coroutines maintain their local state (variables, 
    instruction pointer) between pauses.

  COMMON USE CASES:
  - ASYNCHRONOUS PROGRAMMING: Handling I/O operations, API calls, or database
    queries without blocking the main thread, common in Kotlin/Python.
  - GAME DEVELOPMENT (e.g., `Unity`): Spreading long tasks (like loading assets
    or animations) across multiple frames using `IEnumerator` and `yield return`
  - STRUCTURED CONCURRENCY: Managing complex, nested background tasks in a 
    hierarchy to avoid memory leaks.

  DIFFERENCE between COROUTINES and THREADS:
     While threads can run in parallel, coroutines typically run on a single 
     thread (or a limited thread pool) by switching between tasks, making them 
     more efficient for I/O-bound tasks.
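The single-thread switching described above can be seen with Python's
`asyncio` (used here only as a convenient stand-in; Kotlin coroutines follow
the same idea with `suspend` functions). Two coroutines overlap in time, yet
every step reports the same thread id.

```python
import asyncio
import threading

events = []


async def worker(name: str, delay: float) -> None:
    events.append((name, "start", threading.get_ident()))
    await asyncio.sleep(delay)   # suspension point: control returns to the loop
    events.append((name, "end", threading.get_ident()))


async def main() -> None:
    # Both coroutines are in flight at once, yet no extra thread is created.
    await asyncio.gather(worker("slow", 0.2), worker("fast", 0.05))


asyncio.run(main())
print([(name, phase) for name, phase, _ in events])
print(len({tid for _, _, tid in events}))  # 1: everything ran on one thread
```

"fast" finishes before "slow" even though it was started second, because
"slow" gave up control at its `await` instead of blocking the thread.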

... bottleneck that separates a basic "code-generating script" from a true
"autonomous AI SWE."

A successful Gradle build (`BUILD SUCCESSFUL`) only proves that the Kotlin 
code compiles. It DOES NOT mean the app won't crash the second it opens (a 
runtime error), nor does it mean the UI doesn't look like a chaotic, overlapping
mess of unreadable text.

To achieve your zero-human-in-the-loop dream, you ... must build an AUTOMATED QA
(Quality Assurance) Loop. You need to give Opus 4.6 a virtual phone, a pair of
eyes, and robotic hands.

Because Opus 4.6 has native multimodal (vision) and Computer Use capabilities,
you can scaffold this entirely through the terminal using ADB (Android Debug 
Bridge). Here is the exact blueprint to implement it inside OpenClaw.

---

STEP 1: THE VIRTUAL LAB (Headless Emulator)
   Your Gateway server needs to spin up a virtual Android phone to run the app.
   You don't need a monitor or a desktop environment for this; the Android
   emulator can run "headlessly" in the background.

   Your OpenClaw scaffolding should automatically execute:
```bash
# Create a virtual device (assumes the system image is already installed via sdkmanager).
# `echo "no"` answers the interactive "custom hardware profile?" prompt.
echo "no" | avdmanager create avd -n AgentPhone -k "system-images;android-34;google_apis;x86_64"

# Boot the phone in the background (no UI window, no audio, no boot animation)
emulator -avd AgentPhone -no-window -no-audio -no-boot-anim -gpu swiftshader_indirect &
```


STEP 2: THE "EYES" (VISUAL GUI EVALUATION)
   Once Opus compiles the `.apk`, the Gateway installs it (`adb install app-debug.apk`)
   and launches it (`adb shell monkey -p com.your.app 1`).

   Now, Opus needs to see it to know if it's aesthetically pleasing.

   1. Take a Screenshot: Have OpenClaw run a bash command to capture the 
      device's screen:
```Bash
adb exec-out screencap -p > current_screen.png
```
   2. THE VISION PROMPT: OpenClaw takes `current_screen.png` and passes it to
      Opus 4.6's vision API with a system prompt like:
      "You are an expert UX/UI designer. This is a screenshot of the app you
      just built running on an emulator. Evaluate the design. Are the buttons 
      aligned? Is the text readable? Do elements overlap? Is there enough
      padding? If it looks terrible, output the Kotlin Jetpack Compose UI
      changes needed to fix it."


STEP 3: THE "HANDS" (COMPUTER USE & INTERACTION)
   To verify the app is usable, Opus needs to click buttons, type text, and
   navigate pages to ensure the logic actually works. 

   While Opus can guess coordinates from a picture, the most robust way to 
   scaffold this is the "DOUBLE-VISION" technique, using UI AUTOMATOR.
   
   1. EXTRACT THE UI TREE: Android can dump a text-based XML map of every
      button, text box, and image currently on the screen, including their
      exact bounding boxes (a `bounds="[x1,y1][x2,y2]"` attribute per node):
```bash
adb shell uiautomator dump /sdcard/window_dump.xml
adb pull /sdcard/window_dump.xml .
```

   2. COMBINE VISION + XML: Feed the screenshot AND the XML file to Opus. Opus
      can visually see a "Submit" button, check the XML to find its exact 
      coordinates (e.g., `[400, 800]`), and accurately know where to interact.
   3. EXECUTE THE ACTION: Opus outputs a JSON command to OpenClaw, which 
      translates it to an ADB command mimicking a human finger:
      - TAP: `adb shell input tap 400 800`
      - TYPE: `adb shell input text "testuser@email.com"`
      - SWIPE: `adb shell input swipe 500 1000 500 200`
      - BACK BUTTON: `adb shell input keyevent 4`
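A sketch of the XML half of the "double-vision" step in Python. It assumes the
dump has already been pulled to the host; the sample XML below is hand-written
in the shape of a `uiautomator dump` file, not captured from a device, and
`find_tap_point` is a name made up for this example. It turns a node's
`bounds` attribute into the centre point to tap.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional, Tuple

# bounds look like "[x1,y1][x2,y2]" (top-left and bottom-right corners)
BOUNDS_RE = re.compile(r"\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]")


def find_tap_point(xml_text: str, label: str) -> Optional[Tuple[int, int]]:
    """Return the centre of the first node whose text equals `label`."""
    root = ET.fromstring(xml_text)
    for node in root.iter("node"):
        if node.get("text") != label:
            continue
        match = BOUNDS_RE.fullmatch(node.get("bounds", ""))
        if match is None:
            continue
        x1, y1, x2, y2 = map(int, match.groups())
        return ((x1 + x2) // 2, (y1 + y2) // 2)
    return None


# Hand-written sample in the shape of a uiautomator dump (not real output).
SAMPLE = """<hierarchy rotation="0">
  <node text="" class="android.widget.FrameLayout" bounds="[0,0][1080,1920]">
    <node text="Submit" class="android.widget.Button" bounds="[300,760][500,840]" />
  </node>
</hierarchy>"""

point = find_tap_point(SAMPLE, "Submit")
print(point)  # (400, 800)
if point is not None:
    print(f"adb shell input tap {point[0]} {point[1]}")
```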


STEP 4: The Runtime Crash Detector (Logcat)
   Visual checks won't catch under-the-hood memory leaks or database crashes.
   While Opus is "clicking" around the app using ADB, OpenClaw needs to 
   constantly monitor the device's system logs for fatal errors (like a 
   `NullPointerException`).

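   1. DUMP THE ERROR LOG AFTER EACH ACTION (`-d` prints the current buffer and
      exits, so re-run it after every interaction rather than leaving it open):
```Bash
adb logcat -d '*:E' > error_log.txt
```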
   2. THE FEEDBACK LOOP: If Opus taps a button and the app suddenly closes 
      (returning to the home screen), OpenClaw intercepts the `error_log.txt`
      and searches for `FATAL EXCEPTION`.
   3. SELF-HEALING: OpenClaw feeds the crash stack trace back to Opus:
         "When you tapped coordinates [400,800], the app crashed with this
         error: [Logcat Output]. Fix the logic, rebuild the APK, and we will
         restart the test."
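A sketch of the "search for `FATAL EXCEPTION`" step in Python. The sample log
is hand-written in logcat's default line format rather than captured from a
device, and `extract_crash` is a hypothetical helper name. It assumes the
stack trace lines carry the same `AndroidRuntime` tag as the header line.

```python
from typing import List


def extract_crash(logcat_text: str) -> List[str]:
    """Return the lines of the first FATAL EXCEPTION block, or [] if none."""
    lines = logcat_text.splitlines()
    for i, line in enumerate(lines):
        if "FATAL EXCEPTION" in line:
            block = [line]
            # Keep following lines while they share the AndroidRuntime tag.
            for follow in lines[i + 1:]:
                if "AndroidRuntime" not in follow:
                    break
                block.append(follow)
            return block
    return []


# Hand-written sample (not captured from a device).
SAMPLE = """\
01-01 12:00:00.000  1234  1234 E SomeTag: unrelated error
01-01 12:00:01.000  4321  4321 E AndroidRuntime: FATAL EXCEPTION: main
01-01 12:00:01.000  4321  4321 E AndroidRuntime: Process: com.your.app, PID: 4321
01-01 12:00:01.000  4321  4321 E AndroidRuntime: java.lang.NullPointerException
01-01 12:00:01.000  4321  4321 E AndroidRuntime: \tat com.your.app.MainActivity.onClick(MainActivity.kt:42)
01-01 12:00:02.000  1234  1234 E SomeTag: another unrelated error
"""

crash = extract_crash(SAMPLE)
print(len(crash))  # 4
print("\n".join(crash))
```

The returned block is what would be pasted into the `[Logcat Output]` slot of
the self-healing prompt.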


---
THE ULTIMATE AUTONOMOUS LOOP
   If you implement these four steps as Python/Bash tools for OpenClaw, the AI's
   workflow transforms into a continuous, self-correcting loop:
      1. WRITE & BUILD: Opus writes Kotlin -> Gradle builds it. (Loops until no
         compiler errors).
      2. DEPLOY: OpenClaw installs the app on the background emulator.
      3. INSPECT (Eyes): OpenClaw pipes screenshots to Opus. (Loops until the
         UI is visually polished).
      4. INTERACT (Hands): Opus reads the XML tree and issues `adb shell input`
         commands to click through the app's user journey.
      5. DEBUG (Nervous System): OpenClaw feeds any runtime `logcat` crashes
         back to Opus. (Loops until no crashes occur during interaction).
      6. FINALISE: Once Opus successfully clicks through the entire app without
         a crash and is visually satisfied with the screenshots, it signs off
         the project.            

By wrapping Opus inside this specific ADB + Vision + Logcat scaffolding, you
transition it from a simple code generator into a full-stack, end-to-end QA
engineer!         

# Archive 3

This is a brilliant and highly ambitious project! Getting an LLM like Opus 4.6
to autonomously engineer, build, and package an Android app end-to-end is the
holy grail of agentic coding.

Because your goal is "zero human-in-the-loop," the AI will completely ignore 
Android Studio (the visual IDE). Instead, it must do everything headlessly via
the COMMAND LINE INTERFACE (CLI) and text files.

Since you already know Kotlin, the language syntax won't be an issue. The 
missing piece of the puzzle is understanding GRADLE (Android's official build
system) and how to orchestrate it programmatically.

Based on the OpenClaw directory structure you shared, here is a complete 
walkthrough of how an Android app is packaged under the hood, and exactly how
you need to scaffold OpenClaw to let Opus handle it autonomously.

---
PART 1: THE BLUEPRINT (What Opus needs to write)

Before packaging can happen, Opus needs to use its file-writing tools to
generate four main pillars of the project:

   1. THE BUILD SCRIPTS (`build.gradle.kts` and `settings.gradle.kts`): Written 
      in Kotlin Script, these tell the build system how to compile the app, what
      minimum Android version to target, and what third-party libraries to 
      download (e.g., Jetpack Compose for the UI, Ktor for WebSockets).
   2. THE MANIFEST (`app/src/main/AndroidManifest.xml`): The "rulebook" for the
      Android OS. It tells the phone the app's name, which screen to launch
      first, and what permissions it needs. 
      Crucial for AI: If Opus writes code that uses the internet or camera, it
      MUST declare `<uses-permission...>` here, or the app will instantly crash.
   3. THE SOURCE CODE (`app/src/main/java/...`): The actual Kotlin logic. 
      Because modern apps use Jetpack Compose, the UI is built entirely in 
      Kotlin rather than messy XML layout files--a massive advantage for an AI.
   4. THE RESOURCES (`app/src/main/res/`): Where static, non-code assets live, 
      like app icons and string translations.   


PART 2: THE ENGINE (The Gradle Wrapper)

   At the bottom of your screenshot, you see `gradlew` (for Linux/macOS) and
   `gradlew.bat` (for Windows).

   This is the GRADLE WRAPPER, and it is the most important file for your AI. 
   The machine running your gateway doesn't even need Gradle installed. When the
   AI executes this shell script, it automatically downloads the Gradle version
   pinned by the project and orchestrates the compilation. THIS IS THE
   PRIMARY EXECUTABLE THE AI WILL INTERACT WITH.


PART 3: THE 5-STEP PACKAGING PROCESS (UNDER THE HOOD)
   
   When Opus executes a build command via its bash tool, a massive pipeline runs
   automatically:
   
   1. RESOURCE COMPILATION (AAPT2): The Android Asset Packaging Tool compiles 
      everything in the `res/` folder and parses the Manifest into a binary 
      format. 
   2. CODE COMPILATION (`kotlinc`): The Kotlin compiler turns the `.kt` source
      code into standard Java bytecode (`.class` files).
   3. DEXING (D8/R8): Android devices cannot run standard Java bytecode. A
      compiler called D8 converts the `.class` files into `.dex` (Dalvik
      Executable) files, which are highly optimised for mobile processors.
   4. PACKAGING: Gradle zips the `.dex` files, the compiled resources, and the
      Manifest into a single archive.
   5. SIGNING: Android OS refuses to install any app that isn't digitally
      signed. For testing, Gradle automatically generates a fake "Debug"
      certificate and signs the archive with it.


---           
PART 4: HOW TO SCAFFOLD OPENCLAW FOR FULL AUTONOMY
   To achieve a true zero-human-in-the-loop system, Opus 4.6 needs access to a
   bash terminal and a strict, programmatic "Execution & Auto-Correction" loop.

   Here is the exact scaffolding you need to build into your modified OpenClaw:


STEP 1: HOST ENVIRONMENT SETUP (YOUR JOB)
   The AI cannot build an Android app if the machine hosting the OpenClaw 
   Gateway doesn't have the right underlying tools. You must install:
   - Java Development Kit (JDK): JDK 17 is currently standard for modern
     Android apps.
   - Android Command Line Tools (SDK): You don't need the massive Android
     Studio, just the CLI tools. You must expose the `ANDROID_HOME` environment 
     variable to the OpenClaw bash session.
   - Critical Scaffolding Tip: Fresh Android SDKs require you to accept Google's
     terms of service, which will freeze a headless terminal waiting for `y/n`
     input. Program your Gateway to run `yes | sdkmanager --licenses` 
     automatically before handing control to Opus.
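A possible preflight check for the host requirements listed above, sketched in
Python. `preflight` is a hypothetical helper, not part of OpenClaw. It only
checks that `ANDROID_HOME` points at a real directory and that a `java` binary
is on `PATH`; it does not verify the JDK version or the licence state.

```python
import os
import shutil
from typing import Dict, List, Optional


def preflight(env: Optional[Dict[str, str]] = None) -> List[str]:
    """Return a list of human-readable problems; empty means ready to build."""
    env = dict(os.environ) if env is None else env
    problems = []

    android_home = env.get("ANDROID_HOME", "")
    if not android_home:
        problems.append("ANDROID_HOME is not set")
    elif not os.path.isdir(android_home):
        problems.append(f"ANDROID_HOME does not exist: {android_home}")

    # shutil.which searches the given PATH string instead of the real one.
    if shutil.which("java", path=env.get("PATH", "")) is None:
        problems.append("java not found on PATH")

    return problems


# Check the current machine; prints [] when both requirements are met.
print(preflight())
```

Running this before handing control to the agent turns a confusing Gradle
failure into a short, explicit message.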


STEP 2: The Build Command
   Once Opus finishes writing the Kotlin files, it needs to navigate to the root
   directory and run the build command via bash:
```bash
./gradlew assembleDebug
```
   (Note: Instruct the AI's system prompt to only use `assembleDebug` to 
   generate an APK. Generating a production app for the Play Store requires
   cryptographic keystores, which introduces unnecessary friction and points of
   failure for an autonomous agent just trying to get a working app compiled).


STEP 3: The Auto-Correction Loop (The Magic)
   Code rarely compiles on the first try. Android development is notorious for 
   verbose dependency conflicts. This is where your scaffolding is vital:
   1. OpenClaw executes `./gradlew assembleDebug`.
   2. OpenClaw intercepts the terminal's `stdout` and `stderr`.
   3. If the build fails (exit code > 0), Gradle will output a massive stack
      trace (e.g., "Unresolved reference in MainActivity.kt on line 42").
   4. FEED THIS EXACT ERROR LOG BACK TO OPUS. Give it a background prompt like:
      "The build failed with these Gradle errors: [Log]. Edit the offending 
      files and try again." Frontier models are phenomenally good at reading
      compiler errors, pinpointing the missing import or syntax error, and 
      self-healing their code.
   5. Repeat until the terminal outputs `BUILD SUCCESSFUL`.      
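The five steps above can be sketched as a small Python loop. The names
`run_build`, `build_until_green`, and the `ask_model_to_fix` callback are
assumptions made for this example, not OpenClaw functions. The demo uses a
stand-in build so it runs without Gradle; in real use the build callable would
be `lambda: run_build(["./gradlew", "assembleDebug"])`.

```python
import subprocess
import sys
from typing import Callable, List, Tuple

BuildResult = Tuple[bool, str]


def run_build(cmd: List[str]) -> BuildResult:
    """Run a command; return (succeeded, combined stdout + stderr)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr


def build_until_green(
    build: Callable[[], BuildResult],
    ask_model_to_fix: Callable[[str], None],
    max_attempts: int = 5,
) -> bool:
    """Rebuild up to max_attempts times, handing each failure log to the model."""
    for _ in range(max_attempts):
        ok, log = build()
        if ok:
            return True
        ask_model_to_fix(log)   # hypothetical callback: the agent edits files
    return False


# Stand-in build that fails twice, then succeeds.
attempts = {"n": 0}


def fake_build() -> BuildResult:
    attempts["n"] += 1
    if attempts["n"] < 3:
        return False, "e: MainActivity.kt:42 Unresolved reference"
    return True, "BUILD SUCCESSFUL"


feedback: List[str] = []
print(build_until_green(fake_build, feedback.append))  # True
print(len(feedback))                                   # 2

# run_build works with any command; here a tiny Python one-liner.
print(run_build([sys.executable, "-c", "print('hello')"]))
```

The `max_attempts` cap is a design choice worth keeping: without it, a model
that cannot fix the error would loop (and spend tokens) forever.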


STEP 4: RETRIEVING THE OUTPUT
   Once successful, Opus needs to know where the finished file is. Instruct it 
   to locate the final installable package at: 
   `app/build/outputs/apk/debug/app-debug.apk`

   By combining Opus's coding abilities with a persistent bash terminal and a
   strict read-error-and-retry-loop, autonomous Android app generation is
   entirely achievable!



- PTY (pseudo-terminal) support provides a software emulation of a physical
  terminal, enabling bidirectional communication between a controller process
  (like an SSH session or terminal emulator) and a worker/slave process (like a
  shell). It allows programs requiring a terminal interface to run in virtual,
  non-physical, or remote environments.

  KEY ASPECT OF PTY SUPPORT
  - PAIR OF DEVICES: A PTY consists of a master (controller) and a slave (worker)
    device. The master controls the slave, and the slave provides a 
    terminal-like interface to the application.
  -  ...



- "PTY support" in the context of agentic AI refers to the ability of autonomous
  agents to interact with CLIs, terminals and legacy software via a 
  pseudo-terminal (PTY).

  Unlike simple API calls, PTY support allows an AI agent to operate as if it 
  were a human user typing commands into a terminal, enabling it to manage,
  navigate, and control complex, non-API-enabled systems to achieve goals
  autonomously.
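A short Python demonstration of why a PTY matters, using the standard `pty`
module (Unix-only, so this sketch assumes Linux or macOS). The same child
program is run twice: once with its output on a plain pipe, once on the worker
end of a pseudo-terminal. Only in the second case does it believe it is
talking to a terminal.

```python
import os
import pty
import subprocess
import sys

CHILD = [sys.executable, "-c", "import sys; print(sys.stdout.isatty())"]

# 1) Plain pipe: the child can tell it is not attached to a terminal.
piped = subprocess.run(CHILD, capture_output=True, text=True).stdout.strip()

# 2) Pseudo-terminal: the child's stdout is the worker end of a PTY pair.
controller_fd, worker_fd = pty.openpty()
subprocess.run(CHILD, stdout=worker_fd)
via_pty = os.read(controller_fd, 1024).decode().strip()
os.close(worker_fd)
os.close(controller_fd)

print(piped)    # False
print(via_pty)  # True
```

Many CLI tools change behaviour on this check (colours, prompts, progress
bars), which is why an agent driving interactive programs needs a PTY rather
than a pipe.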




- `wc` command counts the number of lines, words, and bytes in the files 
  specified by the File parameter. The command writes the number of newline 
  characters, words, and bytes to the standard output and keeps a total count for
  all named files.

- `grep`, short for "global regular expression print", is a command used for
  searching files for lines that match a text pattern given as a regular
  expression (regex). It comes preinstalled on virtually every Linux
  distribution.

- `cut` is a shell command that extracts sections from each line of input text
  -- usually from a file. Extraction of line segments can typically be done by 
  bytes (-b), characters (-c), or fields (-f) separated by a delimiter (-d
  -- the tab character by default).



- Dalvik Executable
   To run on Android devices, the Java code is compiled into Java bytecode, and 
   then translated into Dalvik bytecode. The Dalvik bytecode can be found in
   DEX (Dalvik Executable) files in the APK. An APK (Android Package Kit) is
   essentially a ZIP file that contains an application's code and needed
   resources.
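Because an APK is a ZIP archive, Python's standard `zipfile` module can look
inside one. The sketch below builds a tiny stand-in archive in memory (with
placeholder bytes, not a real APK) so it runs anywhere; `list_dex_files` is a
name made up for this example.

```python
import io
import zipfile

# Build a tiny stand-in archive in memory so the sketch needs no real APK.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as fake_apk:
    fake_apk.writestr("AndroidManifest.xml", b"<binary manifest placeholder>")
    fake_apk.writestr("classes.dex", b"dex placeholder")
    fake_apk.writestr("res/drawable/icon.png", b"png placeholder")


def list_dex_files(apk_bytes: bytes) -> list:
    """Return the names of all .dex entries inside an APK-style ZIP."""
    with zipfile.ZipFile(io.BytesIO(apk_bytes)) as apk:
        return [name for name in apk.namelist() if name.endswith(".dex")]


print(list_dex_files(buffer.getvalue()))  # ['classes.dex']
```

For a real build output, the same function can be fed the bytes of the `.apk`
file read from disk.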


- D8 is a dexer that converts Java bytecode to DEX code. R8 is a Java program
  shrinking and minification tool that converts Java bytecode to optimised
  DEX code.

# Archive 2

- A gateway in WebSocket applications acts as a managed "front door" for 
  real-time, two-way communication between clients (web/mobile) and backend
  services. It handles connection management, message routing, and security,
  allowing developers to build scalable, stateful applications like chat apps or
  dashboards without maintaining dedicated WebSocket servers.

                                        (TCP is used for gaming and live-streaming)
- A WebSocket is a computer communications protocol that provides a persistent,
  full-duplex (two-way) communication channel over a single TCP connection, 
  enabling real-time, low-latency data exchange between a client (like a web
  browser) and a server. Unlike traditional HTTP, which requires a new 
  request-response pair for every interaction, WebSockets remain open to allow
  instant, simultaneous messaging.

  KEY FEATURES and BENEFITS
  - PERSISTENT & DUPLEX: Once established, the connection stays open until 
    closed by either party, allowing server-to-client and client-to-server data
    transmission at any time.
  - REAL-TIME DATA: Ideal for applications needing instant updates, such as chat
    apps, live stock tickers, gaming or monitoring systems.
  - LOW OVERHEAD: After an initial HTTP handshake to upgrade the connection, 
    WebSockets avoid the overhead of sending HTTP headers for every subsequent
    message.
  - COMPATIBILITY: Designed to work over HTTP ports 443 and 80, making them
    compatible with HTTP proxies and firewalls.

  HOW WEBSOCKETS COMPARE TO HTTP
  - HTTP: Operates on a request-response model. The client sends a request, the
    server responds, and the connection closes.
  - WebSocket: Uses a handshake to open a long-lived, bidirectional connection.

  COMMON USE CASES:
  - Live Chat Applications
  - Real-time Dashboards: Financial, IoT or sports updates
  - Online Gaming: Multiplayer data synchronisation
  - Collaborative Tools: Shared document editing      



A daemon is a computer program that runs in the background, independent of 
direct user control, to handle tasks like system resource management, network
connections, or scheduled services. Common in Unix-like systems (Linux, macOS),
they often start at boot and typically have names ending in `d`, such as `httpd`
or `crond`.      

To understand why OpenClaw (the viral open-source AI agent formerly known as 
Clawdbot or Moltbot) uses WebSockets, it helps to first understand what the 
GATEWAY is, because the two concepts are deeply connected. 

Here is a plain-English breakdown of how OpenClaw's architecture works under the
hood.
---

1. WHAT IS THE "GATEWAY" IN OPENCLAW?
   If the AI model (like Claude, OpenAI, or a local DeepSeek model) is the 
   "brain" of your personal assistant, the GATEWAY is its central nervous system
   and its "hands".

   Technically, the Gateway is a core background application (a daemon) that
   runs continuously on your host machine (like your laptop, a Mac Mini, or a
   cloud VPS). Because OpenClaw is designed to be a fully autonomous agent 
   rather than just a chat window, it needs this central hub to orchestrate
   everything.

   Here is what the Gateway actually does:
      - THE MULTI-CHANNEL ROUTER: The AI model itself doesn't know how to talk 
        to Telegram or WhatsApp. The Gateway holds the connections to your 
        messaging apps. When someone messages you, the Gateway receives the 
        message, normalises it, and routes it to the AI.
      - THE EXECUTOR: The AI model just outputs text. When the AI decides, "I
        need to run a terminal command to search this codebase" or "I need
        to open a browser and click this button," the Gateway is the software
        that safely executes that action on your operating system and hands the
        results back to the AI.
      - THE AUTONOMY ENGINE: The Gateway runs on a scheduled "heartbeat" (e.g.,
        every 30 minutes). It silently wakes the AI up, hands it a checklist of
        pending tasks, and allows it to do background work without you having to
        prompt it first.
      - THE CONTROL PLANE: It hosts the Web UI dashboard (usually on port 
        `18789`), allowing you to pair devices, configure settings, view logs,
        and approve or deny high-risk actions.

      In short, you don't talk directly to the AI model. You talk to the
      Gateway, and the Gateway manages the AI and your computer.


   2. Why does OpenClaw use WebSocket instead of HTTP?

      Most traditional web applications use standard HTTP, which relies on a 
      strict "request-response" model: your browser asks the server for data, 
      the server replies, and the connection closes.

      For a static website, this is fine. But OpenClaw is an active, autonomous
      AI agent. It routes its communications through WebSockets--which creates a
      persistent, two-way open pipeline between the client and the server--for a
      few vital reasons:
         - REAL-TIME STREAMING: AI models generate answers word-by-word 
           (token-by-token). Furthermore, when the Gateway runs a shell script
           or a web scraper for the AI, it generates live terminal logs. Instead
           of waiting 30 seconds for a task to finish and sending it in one big 
           HTTP chunk, a WebSocket connection allows the Gateway to smoothly
           stream the AI's "thoughts" and live logs directly to your screen in
           real time.
         - INSTANT "PUSH" NOTIFICATIONS & AUTONOMY: Because OpenClaw runs in the
           background, the AI might wake up at 7:00 AM, read your emails, and 
           decide it needs to ask you a question. With HTTP, your browser or app
           would have to constantly "poll" the server every second asking, 
           "Anything new?" WebSockets allow the Gateway to instantly push live
           alerts or approval requests (e.g., "Can I execute this code?") to 
           your dashboard the exact millisecond they happen.
         - EFFICIENT TOOL EXECUTION: The connection between your user interface
           and the Gateway uses a protocol called JSON-RPC over WebSockets. This
           allows the client and the server to rapidly fire micro-commands and 
           state changes back and forth without the heavy delay of opening and
           closing a new HTTP connection every single time.
         - LIVE UI FEATURES: OpenClaw supports real-time typing indicators,
           presence tracking (knowing which of your devices are online), and 
           live "Canvas" rendering. These features require constant, low-latency
           communication that HTTP isn't built to handle efficiently.    
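To make "JSON-RPC over WebSockets" less abstract, here is a Python sketch of
the message framing only, following the public JSON-RPC 2.0 format. The
`echo` method and the helper names are invented for this example; OpenClaw's
real method names and frame layout are not shown in these notes and may
differ. Each string below is what would travel as one WebSocket text frame.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical method table; the method name is invented for this sketch.
METHODS: Dict[str, Callable[..., Any]] = {
    "echo": lambda text: text,
}


def make_request(request_id: int, method: str, params: Dict[str, Any]) -> str:
    """Build one JSON-RPC 2.0 request frame."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    )


def handle_frame(frame: str) -> str:
    """Dispatch one request frame and build the matching response frame."""
    request = json.loads(frame)
    handler = METHODS.get(request["method"])
    if handler is None:
        # -32601 is the "Method not found" code defined by JSON-RPC 2.0.
        body = {"error": {"code": -32601, "message": "Method not found"}}
    else:
        body = {"result": handler(**request["params"])}
    # The response echoes the request id so the caller can pair them up.
    return json.dumps({"jsonrpc": "2.0", "id": request["id"], **body})


print(handle_frame(make_request(1, "echo", {"text": "hi"})))
print(handle_frame(make_request(2, "missing", {})))
```

The `id` field is what lets many small requests be in flight on one open
connection at the same time.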

A QUICK SECURITY NOTE:
   Because WebSockets keep a persistent, bidirectional pipe open to a Gateway
   that has full access to your computer's terminal, it requires strict
   security. In early 2026, OpenClaw suffered a vulnerability (CVE-2026-25253)
   where attackers realised they could trick a user's browser into initiating a 
   malicious WebSocket connection to the local Gateway to steal tokens and
   execute code.

   If you are running an OpenClaw Gateway, always ensure you are on the latest
   patched version and avoid exposing your Gateway's port directly to the public
   internet without a secure tunnel (like Tailscale) or proper reverse-proxy
   authentication.           
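
One common defence against browser-initiated WebSocket attacks of this kind is
to check the `Origin` header on the handshake against an allowlist, since
browsers attach that header and page scripts cannot override it. The sketch
below shows only that check. It is not a description of the actual OpenClaw
patch, and the function name, the port `3000`, and the decision to reject a
missing `Origin` are assumptions made for illustration.

```typescript
// Returns true only when the handshake's Origin header exactly matches one of
// the allowlisted origins (scheme + host + port).
function isAllowedOrigin(
  originHeader: string | undefined,
  allowlist: readonly string[],
): boolean {
  // Sketch choice: treat a missing Origin as untrusted. Non-browser clients
  // would then need another credential, such as a token.
  if (!originHeader) return false;

  let parsed: URL;
  try {
    parsed = new URL(originHeader);
  } catch {
    return false; // malformed values, including the literal "null" origin
  }

  // Compare the normalised origin so lookalike hosts such as
  // "localhost.evil.example" never match "localhost".
  return allowlist.includes(parsed.origin);
}

// Hypothetical allowlist for a dashboard served from the same machine.
const trustedOrigins = ["http://localhost:3000"];
const sameMachine = isAllowedOrigin("http://localhost:3000", trustedOrigins);
const crossSite = isAllowedOrigin("https://evil.example", trustedOrigins);
```

Here `sameMachine` is `true` and `crossSite` is `false`. An origin check
complements token authentication on the socket and does not replace it.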

#### Archived Def

- Pi-mono (`badlogic/pi-mono`) is a FOSS TypeScript monorepo that serves as a
  toolkit for building, managing, and deploying AI agents. It is closely
  associated with Pi, a minimal terminal-based coding agent designed by Mario
  to be a lean, extensible alternative to more complex AI coding tools.

- In the context of pi-mono (a TS monorepo for an AI coding agent), "streaming
  agents" refers to the core functionality of the AI, which incrementally
  processes data, generates text, executes tools, and updates the Terminal UI
  (TUI) in real time. The project uses TypeScript and npm workspaces, and
  supports multiple LLM providers.

  1. `pi-agent-core` (The Core Engine)
     - AGENT LOOP: Handles the main loop of the AI agent, including tool 
       execution, validation, and event streaming.
     - AgentMessage: The agent operates with a flexible `AgentMessage` type
       that supports standard LLM messages (user, assistant, tool result) and
       custom app-specific types.
     - STREAMING BEHAVIOR: When the agent is streaming, it manages how messages
       are handled, requiring `streamingBehavior` (such as `steer` or `followUp`)
       if a command is sent during an active stream.

  2. `pi-tui` (STREAMING OUTPUT)
     - REAL-TIME INTERACTION: The Terminal UI (TUI) framework provides "almost"
       flicker-free updates, rendering streaming text, tool output, and
       markdown in real time.
     - DIFFERENTIAL RENDERING: The TUI updates only the necessary parts of the
       screen, providing a smooth user experience while the model streams
       output.
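
To make "updates only the necessary parts of the screen" concrete, here is a
minimal line-level diff between two frames. It is a generic illustration of
differential rendering, not `pi-tui`'s actual algorithm; `LinePatch` and
`diffFrames` are hypothetical names.

```typescript
// One patch per screen row whose text differs between two frames.
interface LinePatch {
  row: number;
  text: string;
}

// Compares the previous and next frame row by row and returns only the rows
// that need to be redrawn. Rows that vanished are patched with an empty string.
function diffFrames(
  previous: readonly string[],
  next: readonly string[],
): LinePatch[] {
  const patches: LinePatch[] = [];
  const rowCount = Math.max(previous.length, next.length);
  for (let row = 0; row < rowCount; row++) {
    const before = previous[row] ?? "";
    const after = next[row] ?? "";
    if (before !== after) patches.push({ row, text: after });
  }
  return patches;
}

// While a model streams tokens, usually only the last line changes.
const frameA = ["> summarise my inbox", "You have 3"];
const frameB = ["> summarise my inbox", "You have 3 unread"];
const streamingPatches = diffFrames(frameA, frameB);
```

`streamingPatches` contains a single entry for row 1, so a renderer built on
this idea would rewrite one line instead of clearing and repainting the whole
screen, which is what avoids flicker.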
                  
  


  - ... "subscribing to events" refers to an event-driven, rather than
    poll-based, architecture where the agent actively listens for specific,
    typed signals (WebSocket frames) from the Gateway.

    This mechanism allows the system to react instantly to changes rather than
    constantly asking "what's new?".

    Here is what "subscribing to events" specifically means for OpenClaw:
       - REAL-TIME RESPONSIVENESS: Instead of polling, the agent is notified
         immediately when specific events occur, such as incoming messages from
         connected apps (WhatsApp, Telegram, Slack), or when internal state 
         changes.
       - EVENT TYPES: Agents can subscribe to various event types handled by the
         Gateway, including `agent`, `presence`, `health` and `tick` (time-based
         heartbeats).
       - AUTOMATED TRIGGERS (Heartbeat): The "heartbeat" feature is a crucial
         subscribed event: the agent wakes to run a check, acts on what it
         finds (like a new email or scheduled task), and then returns to an
         idle state, functioning as a 24/7 personal assistant.
       - STRUCTURED DATA INTERACTION: These subscribed events are typed and 
         validated against JSON Schema (generated from TypeBox), ensuring that
         the agent only processes well-formed data.
       - PROACTIVE CAPABILITIES: Subscribing to events allows OpenClaw to move
         beyond simple chat, enabling proactive "push" actions, such as sending
         a daily briefing via a Telegram channel without being prompted.       
    
    Essentially, subscribing to events enables OpenClaw to operate as a 24/7
    automated agent that acts on external triggers (messages/alerts) rather
    than waiting for user input.
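
The subscribe-instead-of-poll idea can be sketched as a small typed event bus.
Only the event names (`agent`, `presence`, `health`, `tick`) come from the
notes above. The payload shapes and the `EventBus` class are assumptions for
illustration and do not reflect the Gateway's real schema.

```typescript
// Hypothetical payloads, keyed by the event names listed above.
interface GatewayEvents {
  agent: { sessionId: string; text: string };
  presence: { deviceId: string; online: boolean };
  health: { ok: boolean };
  tick: { at: number };
}

type AnyListener = (payload: unknown) => void;

class EventBus {
  private listeners = new Map<keyof GatewayEvents, AnyListener[]>();

  // Registers a listener for one event type and returns an unsubscribe function.
  subscribe<K extends keyof GatewayEvents>(
    type: K,
    listener: (payload: GatewayEvents[K]) => void,
  ): () => void {
    const list = this.listeners.get(type) ?? [];
    this.listeners.set(type, list);
    const entry = listener as AnyListener;
    list.push(entry);
    return () => {
      const index = list.indexOf(entry);
      if (index >= 0) list.splice(index, 1);
    };
  }

  // Pushes an event to every current subscriber of that type; nobody polls.
  publish<K extends keyof GatewayEvents>(type: K, payload: GatewayEvents[K]): void {
    const snapshot = [...(this.listeners.get(type) ?? [])];
    for (const listener of snapshot) listener(payload);
  }
}

// A heartbeat-style subscriber: it does nothing until a `tick` arrives.
const bus = new EventBus();
const heartbeats: number[] = [];
const stopHeartbeat = bus.subscribe("tick", (event) => {
  heartbeats.push(event.at);
});
bus.publish("tick", { at: 1 });
stopHeartbeat();
bus.publish("tick", { at: 2 });
```

After this runs, `heartbeats` holds only `1`: the second `tick` was published
after the listener unsubscribed. The type parameter `K` ties each event name to
its payload, so a handler for `presence` cannot be handed a `tick` payload by
mistake.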



- Handlers in the backend are functions or classes designed to process incoming
  requests (HTTP, events, etc.), execute business logic, and return a response.
  They act as intermediaries between raw system events or API endpoints and the
  core application logic, often managing data retrieval, validation and security
  tasks.

  Key aspects of handlers in backend development:
  - HTTP REQUEST HANDLING: Web handlers receive a request object (URLs, headers)
    and a response object, filling the latter with JSON, HTML or status codes.
  - EVENT HANDLING: In event-driven architectures, handlers (or consumers) 
    respond to specific events, such as processing data from a queue or a 
    database change.
  - PROCESS FLOW: A handler typically follows a structured path: 
     receive request --> validate data --> interact with db/models --> return response
  - TYPES OF HANDLERS:
      - PIPE HANDLERS: Common in monitoring systems (like Sensu) to process
        incoming events via scripts.
      - WEB SERVICE HANDLERS: Used to handle SOAP or REST API requests, often
        managing headers and endpoints.
      - DEFAULT HANDLERS: Built-in server functions that handle static content
        or default file requests.
  
  In short, a handler is a specialised function that "handles" a particular
  input to ensure the application reacts correctly to external requests or 
  system events.
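
The process flow described above (receive request, validate data, interact
with storage, return response) can be sketched as a single function. The
request and response shapes, the in-memory `notes` map standing in for a
database, and the name `createNoteHandler` are all hypothetical.

```typescript
interface HandlerRequest {
  body: unknown;
}

interface HandlerResponse {
  status: number;
  body: unknown;
}

// Stand-in for a database or model layer.
const notes = new Map<number, string>();

function createNoteHandler(request: HandlerRequest): HandlerResponse {
  // 1. Validate: never trust the shape of incoming data.
  const body = request.body;
  if (typeof body !== "object" || body === null) {
    return { status: 400, body: { error: "body must be an object" } };
  }
  const text = (body as { text?: unknown }).text;
  if (typeof text !== "string" || text.trim() === "") {
    return { status: 400, body: { error: "text must be a non-empty string" } };
  }

  // 2. Interact with storage.
  const id = notes.size + 1;
  notes.set(id, text.trim());

  // 3. Return a response describing what happened.
  return { status: 201, body: { id } };
}

const created = createNoteHandler({ body: { text: "  buy milk " } });
const rejected = createNoteHandler({ body: { text: 42 } });
```

`created.status` is 201 and `rejected.status` is 400. An event handler follows
the same steps, with the event payload in place of the request and a side
effect or acknowledgement in place of the response.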
  



- A cron job is a scheduled task on Unix-like operating systems that
  automatically runs commands or scripts at specified intervals, such as daily,
  hourly, or weekly. Cron jobs are managed by the `cron` daemon (a background
  process) via a `crontab` (configuration file) and are widely used for
  automating system maintenance, backups, and repetitive tasks.
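
To show how a schedule such as "every day at 7:00" is expressed, the sketch
below matches a five-field crontab expression (minute, hour, day of month,
month, day of week) against a date. It is a simplified illustration: it handles
`*`, single numbers, `a-b` ranges, `/n` steps and comma lists, but not month or
weekday names or `7` for Sunday. `fieldMatches` and `cronMatches` are
hypothetical helper names.

```typescript
// Checks one crontab field against a value. `min` is the field's lowest legal
// value, so that steps like "*/15" count from the start of the range.
function fieldMatches(field: string, value: number, min: number): boolean {
  return field.split(",").some((part) => {
    const [range = "", stepText] = part.split("/");
    const step = stepText === undefined ? 1 : Number(stepText);
    if (!Number.isInteger(step) || step < 1) return false;

    let low = min;
    let high = Number.POSITIVE_INFINITY;
    if (range !== "*") {
      const [lowText = "", highText] = range.split("-");
      if (lowText === "" || highText === "") return false;
      low = Number(lowText);
      high = highText === undefined ? low : Number(highText);
      if (!Number.isInteger(low) || !Number.isInteger(high)) return false;
    }
    return value >= low && value <= high && (value - low) % step === 0;
  });
}

// Returns true when `date` (in local time) falls on the given schedule.
function cronMatches(expression: string, date: Date): boolean {
  const fields = expression.trim().split(/\s+/);
  if (fields.length !== 5) throw new Error("expected 5 crontab fields");
  const [minute, hour, dayOfMonth, month, dayOfWeek] =
    fields as [string, string, string, string, string];

  const domOk = fieldMatches(dayOfMonth, date.getDate(), 1);
  const dowOk = fieldMatches(dayOfWeek, date.getDay(), 0); // 0 = Sunday
  // Crontab rule: when both day fields are restricted, either one may match.
  const dayOk =
    dayOfMonth !== "*" && dayOfWeek !== "*" ? domOk || dowOk : domOk && dowOk;

  return (
    fieldMatches(minute, date.getMinutes(), 0) &&
    fieldMatches(hour, date.getHours(), 0) &&
    fieldMatches(month, date.getMonth() + 1, 1) &&
    dayOk
  );
}

// Monday 5 January 2026, 07:00 local time.
const mondayMorning = new Date(2026, 0, 5, 7, 0);
const dailyAtSeven = cronMatches("0 7 * * *", mondayMorning);
const weekdaysOnly = cronMatches("0 7 * * 1-5", mondayMorning);
```

Both `dailyAtSeven` and `weekdaysOnly` are `true` for that date. A scheduler
built on this idea would evaluate the expression once per minute and run the
job whenever it matches.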


- Gateway: Because everything runs through one process, the Gateway is a single
           control surface. Which model to call, which tools to allow, how much
           context to include, how much autonomy to grant -- all configured in 
           one place.  
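
As an illustration of "all configured in one place", the sketch below puts the
four knobs named above (model, allowed tools, context size, autonomy) into one
object and derives a tool-call decision from it. Every field name, value, and
function here is hypothetical; this is not OpenClaw's configuration schema.

```typescript
// Entirely hypothetical shape: names are illustrative, not OpenClaw's schema.
interface ControlSurface {
  model: string;
  allowedTools: readonly string[];
  maxContextTokens: number;
  autonomy: "ask-first" | "autonomous";
}

type ToolDecision = "run" | "ask-user" | "deny";

// Every tool call is checked against the same single config object.
function decideToolCall(config: ControlSurface, tool: string): ToolDecision {
  if (!config.allowedTools.includes(tool)) return "deny";
  return config.autonomy === "autonomous" ? "run" : "ask-user";
}

const cautious: ControlSurface = {
  model: "example-model",
  allowedTools: ["read_file", "shell"],
  maxContextTokens: 32000,
  autonomy: "ask-first",
};

const shellDecision = decideToolCall(cautious, "shell");
const browserDecision = decideToolCall(cautious, "browser");
```

Here `shellDecision` is `"ask-user"` and `browserDecision` is `"deny"`.
Because one process consults one object, changing `autonomy` or `allowedTools`
changes behaviour everywhere at once.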