Android Autonomous Development Agent

Real-device autonomous Android delivery loop

A TypeScript agent runtime and field manual for autonomous Android development. The project combines planning, implementation, compile-fix recovery, real-device verification, memory, and messaging adapters into a repeatable Android delivery control plane.

English | 中文

Positioning

Android Autonomous Development Agent is the Android-specific implementation layer for the broader Autonomous AI Development Framework. It is not positioned as a chatbot or a code-completion toy. Its job is to turn Android development into an auditable loop:

Requirement
  -> plan
  -> implement with isolated agents
  -> build
  -> repair compiler failures
  -> install APK on a real device
  -> cold launch
  -> inspect logcat
  -> capture screenshot
  -> review specification compliance
  -> review code quality
  -> record reusable lessons

The key standard is simple: the existence of code does not mean the work is done, and neither does a successful build. Work is done only after the APK is built, installed, launched, checked through logcat, and visually verified when a device is available.

Why This Matters

Most AI coding tools improve the model-facing part of software development: larger context windows, stronger frontier models, faster code generation. This project takes a different position: reliability should come from the delivery system, not only from the model.

The real value is not in making Opus stronger; it is in making affordable models good enough. That is the path to scalable AI-assisted development.

The practical bet is architecture over model size. Smaller and cheaper models can become useful for real delivery when the system gives them fresh task contexts, bounded retry loops, independent review, build feedback, device verification, and reusable memory.

Executive Summary

Area	Current State
Autonomy level	L4 validated, with L5 components under development
Primary target	Android apps using Kotlin, Jetpack Compose, React Native or Flutter Android targets
Runtime	Bun and TypeScript
Core loop	Plan, implement, review, build, install, verify, learn
Verification	Gradle, ADB install, explicit Activity launch, logcat crash scan, screenshot evidence
Memory	Pitfalls, reusable patterns, decisions, environment details, error fixes
Adapters	CLI, Telegram, Feishu/Lark
Validated projects	TransLite, CyberDiviner, Voyager AI Mobile, Hermes Mobile, AntiScamAI, CustomCam

AI Development Autonomy Levels

This repository uses the same L1-L5 taxonomy as the upstream Autonomous AI Development Framework, adapted from SAE J3016 autonomous driving levels.

Level	Stage	Definition	Human Role
L1	Code Completion	AI assists with autocomplete and suggestions. The developer remains the writer and decides what code enters the project.	Writer
L2	Pair Development	AI generates code through dialogue. The human prompts, reviews, and assembles the output into a working change.	Reviewer
L3	Semi-autonomous Agent	AI leads implementation across a scoped task, while the human supervises key checkpoints, decisions, and fixes.	Supervisor
L4	Fully Autonomous Agent	AI executes the end-to-end delivery loop from plan to implementation, build, install, runtime verification, review, and repair. The human validates the final result.	Accepter
L5	AI Development Team	Multiple specialized agents collaborate in parallel with planning, implementation, verification, review, memory, and quality gates coordinated automatically.	None

Current position: this Android implementation targets L4 validated delivery and develops the Android-specific components needed for L5: planner, implementer, independent spec reviewer, independent quality reviewer, verifier, memory, research, compile-fix recovery, and real-device evidence capture.

What This Repository Contains

Path	Purpose
`src/framework/androidDevFramework.ts`	Main framework class: task execution, review loop, compile-fix-loop, phase execution, engine migration
`src/framework/androidPatterns.ts`	Field-tested Android patterns from TransLite, CyberDiviner, Voyager AI Mobile, and other projects
`src/agents/`	Planner, implementer, reviewer, verifier, and web researcher agents
`src/memory/`	Memory storage and retrieval for reusable development lessons
`src/services/llm/`	OpenAI, Anthropic, Ollama, vLLM, and custom endpoint support
`src/gateway/`	CLI, Telegram, and Feishu/Lark integration layer
`tests/`	Unit tests for memory, dialogue, GitHub research, and message bus components

Core Architecture

graph TB
    subgraph "Interfaces"
        CLI[CLI]
        TG[Telegram]
        LK[Feishu or Lark]
    end

    subgraph "Gateway"
        ROUTER[Message Router]
        SESSION[Session Manager]
    end

    subgraph "Agent Runtime"
        COORD[Coordinator]
        PLAN[Planner]
        IMPL[Implementer]
        SPEC[Spec Reviewer]
        QUAL[Quality Reviewer]
        VERIFY[Verifier]
        RESEARCH[Researcher]
    end

    subgraph "Android Control Plane"
        BUILD[Gradle Build]
        FIX[Compile-Fix Loop]
        ADB[ADB Install and Launch]
        LOGCAT[Logcat Analysis]
        SHOT[Screenshot Evidence]
    end

    subgraph "Memory"
        STORE[(Memory Store)]
        PATTERNS[Validated Patterns]
    end

    CLI --> ROUTER
    TG --> ROUTER
    LK --> ROUTER
    ROUTER --> SESSION
    SESSION --> COORD
    COORD --> PLAN
    COORD --> IMPL
    COORD --> SPEC
    COORD --> QUAL
    COORD --> VERIFY
    COORD --> RESEARCH
    VERIFY --> BUILD
    BUILD --> FIX
    FIX --> BUILD
    VERIFY --> ADB
    ADB --> LOGCAT
    ADB --> SHOT
    LOGCAT --> STORE
    SHOT --> STORE
    STORE --> PATTERNS
    PATTERNS --> IMPL
    PATTERNS --> FIX

Mandatory Android Delivery Gate

Every Android task follows this gate unless the user explicitly scopes it as documentation-only or planning-only.

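# Build the debug APK
./gradlew assembleDebug
# Install on the attached device (-t allows test-only APKs, -r replaces an existing install)
adb install -t -r app/build/outputs/apk/debug/app-debug.apk
# Cold start: stop any running instance and clear the log buffer before launching
adb shell am force-stop <package>
adb logcat -c
adb shell am start -n <package>/<activity>
sleep 5
# Dump the post-launch log and surface crash markers; "|| true" keeps a no-match grep from failing the script
adb logcat -d | grep -E "FATAL|AndroidRuntime|Exception|ANR" || true
# Capture a screenshot on the device, then pull it into ./artifacts
adb shell screencap -p /sdcard/android-agent-check.png
mkdir -p ./artifacts
adb pull /sdcard/android-agent-check.png ./artifacts/android-agent-check.png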
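The same gate can be expressed as data so that a runner can log each step as evidence. The sketch below is only an illustration: `GateTarget` and `buildGateCommands` are hypothetical names, not part of this repository's API, and the caller is assumed to handle the wait between launch and the log dump.

```typescript
// Hypothetical helper: names and shapes are illustrative assumptions.
interface GateTarget {
  packageName: string;    // e.g. "com.example.app"
  activity: string;       // e.g. ".MainActivity"
  apkPath: string;        // path to the debug APK
  screenshotName: string; // file name used on the device and in ./artifacts
}

type Command = readonly string[];

// Returns the gate as argument arrays, in execution order.
// Argument arrays avoid shell quoting problems when the commands are spawned.
function buildGateCommands(target: GateTarget): Command[] {
  const remoteShot = `/sdcard/${target.screenshotName}`;
  return [
    ["./gradlew", "assembleDebug"],
    ["adb", "install", "-t", "-r", target.apkPath],
    ["adb", "shell", "am", "force-stop", target.packageName],
    ["adb", "logcat", "-c"],
    ["adb", "shell", "am", "start", "-n", `${target.packageName}/${target.activity}`],
    // The caller is expected to wait a few seconds here before dumping the log.
    ["adb", "logcat", "-d"],
    ["adb", "shell", "screencap", "-p", remoteShot],
    ["adb", "pull", remoteShot, `./artifacts/${target.screenshotName}`],
  ];
}
```

Keeping the steps in one ordered list makes it harder to skip a gate silently, since each entry can be paired with its recorded output.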

Completion requires:

Build result recorded.
APK installed or install failure explained with device-specific evidence.
App launched through explicit Activity, not only a deep link.
Logcat checked in a clean window after launch.
Screenshot captured and inspected when a device is attached.
Specification compliance review completed.
Code quality review completed.
Findings repaired, then build and device verification repeated.
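The logcat check above can also run on a captured dump instead of a shell pipe, so that the matching lines can be stored as evidence. This is a minimal sketch that reuses the same markers as the `grep -E` expression in the gate; `scanLogcat` and `LogcatFinding` are hypothetical names.

```typescript
// Same markers as the grep expression in the delivery gate.
const CRASH_MARKERS = /FATAL|AndroidRuntime|Exception|ANR/;

interface LogcatFinding {
  lineNumber: number; // 1-based position in the dump
  text: string;       // the raw log line
}

// Returns every line of a `adb logcat -d` dump that contains a crash marker.
function scanLogcat(dump: string): LogcatFinding[] {
  const findings: LogcatFinding[] = [];
  dump.split(/\r?\n/).forEach((text, index) => {
    if (CRASH_MARKERS.test(text)) {
      findings.push({ lineNumber: index + 1, text });
    }
  });
  return findings;
}
```

An empty result is a necessary condition for passing the gate, not a sufficient one: the markers are broad, so a match still needs to be read in context before it is treated as a crash.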

Validation Stage

For competition evaluation, the important artifact is not a demo video alone. It is a one-pass delivery record: one continuous run from requirement to implementation, test, typecheck, build, install, launch, logcat inspection, and screenshot evidence.

A validated run should produce the following evidence bundle:

Stage	Evidence	Pass Criteria
Requirement intake	Task prompt and generated plan	Scope is explicit, bounded, and testable
Implementation	Git diff or patch set	Changes are traceable to the plan
Tests	Unit or integration test output	Test command exits successfully
Typecheck or lint	Static check output	No blocking type or lint errors
Android build	Gradle output and APK path	APK is produced successfully
Device install	ADB install output	Install returns success on the target device
Runtime launch	Explicit Activity launch command	App starts from a cold state
Logcat scan	Clean post-launch log window	No fatal crash, ANR, or AndroidRuntime exception
Visual verification	Device screenshot	UI is visible and corresponds to the requested feature
Review	Spec and quality review notes	No unresolved blocker before acceptance
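One way to make the evidence bundle checkable is to model each row of the table as a typed record and list the stages that have not passed yet. The sketch below is an assumption about shape, not the repository's data model; `Stage`, `StageEvidence`, and `unmetStages` are hypothetical names.

```typescript
// One entry per row of the evidence table, in delivery order.
const STAGES = [
  "requirement",
  "implementation",
  "tests",
  "typecheck",
  "build",
  "install",
  "launch",
  "logcat",
  "screenshot",
  "review",
] as const;

type Stage = (typeof STAGES)[number];

interface StageEvidence {
  passed: boolean;  // whether the pass criterion was met
  artifact: string; // path or reference to the recorded evidence
}

// Stages may be missing while a run is still in progress.
type EvidenceBundle = Partial<Record<Stage, StageEvidence>>;

// Returns, in delivery order, every stage that is missing or did not pass.
function unmetStages(bundle: EvidenceBundle): Stage[] {
  return STAGES.filter((stage) => !bundle[stage]?.passed);
}
```

A run would count as validated only when `unmetStages` returns an empty list, which treats missing evidence the same as a failed stage.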

One-Pass Delivery Criterion

A one-pass validation run means the system completes the full delivery loop without manual code patching after the run starts. Human involvement is limited to the initial requirement and final acceptance. If the build or runtime fails, the agent may use its bounded compile-fix or debug-fix loop, but the repair must be driven by the system rather than by a human editing source files.

This is the standard used to separate a coding assistant from a delivery agent. A coding assistant can generate plausible files. A delivery agent must produce a running artifact and the evidence that it ran.
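The criterion can be checked mechanically if the run keeps an event log. The following sketch assumes a hypothetical log format (`RunEvent`, `isOnePass`); the repository may record runs differently.

```typescript
type Actor = "human" | "agent";
type EventKind = "requirement" | "source-edit" | "build" | "fix-round" | "acceptance";

interface RunEvent {
  actor: Actor;
  kind: EventKind;
}

// A run is one-pass when humans only supply the requirement and the final
// acceptance; every other event, including source edits, must come from the agent.
function isOnePass(events: readonly RunEvent[]): boolean {
  return events.every(
    (event) =>
      event.actor === "agent" ||
      event.kind === "requirement" ||
      event.kind === "acceptance",
  );
}
```

Agent-driven fix rounds do not break the criterion; a single human `source-edit` event does.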

Compile-Fix Recovery Logic

The compile-fix loop is the main resilience mechanism. It parses Gradle output, groups errors by file, injects known fix patterns, asks the implementation agent for a targeted patch, and rebuilds. The loop is bounded by `maxRounds`, which defaults to 3.

import { AndroidDevFramework } from "android-autonomous-dev-agent";

const framework = new AndroidDevFramework({
  llm: {
    provider: "custom",
    model: "local-model",
    baseUrl: "http://localhost:8000",
  },
});

const result = await framework.compileFixLoop(
  "cd /path/to/android/project && ./gradlew assembleDebug",
  { maxRounds: 3 }
);

if (!result.success) {
  console.error(result.finalBuildOutput);
}
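To illustrate the "parse Gradle output, group errors by file" step, here is a minimal sketch that handles two line shapes commonly emitted by the Kotlin compiler. It is not the framework's parser: `parseKotlinErrors` and `groupByFile` are hypothetical names, and Java, KSP, or resource errors would need additional patterns.

```typescript
interface BuildError {
  file: string;
  line: number;
  column: number;
  message: string;
}

// Older Kotlin:  e: /abs/path/Foo.kt: (12, 5): Unresolved reference: bar
const OLD_STYLE = /^e: (?:file:\/\/)?(.+?): \((\d+), (\d+)\): (.*)$/;
// Newer Kotlin:  e: file:///abs/path/Foo.kt:12:5 Unresolved reference: bar
const NEW_STYLE = /^e: (?:file:\/\/)?(.+?):(\d+):(\d+):? (.*)$/;

// Extracts Kotlin compiler errors from raw build output, ignoring other lines.
function parseKotlinErrors(output: string): BuildError[] {
  const errors: BuildError[] = [];
  for (const raw of output.split(/\r?\n/)) {
    const text = raw.trim();
    const match = OLD_STYLE.exec(text) ?? NEW_STYLE.exec(text);
    if (!match) continue;
    errors.push({
      file: match[1],
      line: Number(match[2]),
      column: Number(match[3]),
      message: match[4],
    });
  }
  return errors;
}

// Groups errors by file so each patch request can target a single file.
function groupByFile(errors: readonly BuildError[]): Map<string, BuildError[]> {
  const grouped = new Map<string, BuildError[]>();
  for (const error of errors) {
    const bucket = grouped.get(error.file) ?? [];
    bucket.push(error);
    grouped.set(error.file, bucket);
  }
  return grouped;
}
```

Grouping by file keeps each repair prompt small, which fits the goal of giving cheaper models a narrow, fresh context.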

The pattern library is available directly:

import { ANDROID_FIELD_PATTERNS, formatPatternsForPrompt } from "android-autonomous-dev-agent";

console.log(formatPatternsForPrompt(["compile-fix", "runtime-verification"]));

Field Lessons Integrated From Recent Projects

Autonomous AI Development Framework

The upstream framework defines the operating model:

Use fresh subagents for isolated tasks.
Run independent specification and code-quality reviews.
Treat build, install, launch, logcat, and screenshot as quality gates.
Keep retry loops bounded.
Store reusable lessons as memory or structured patterns.
Split work so subagents write code while the parent session runs slow Gradle builds.
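The "keep retry loops bounded" rule can be sketched as a small control loop. This is a simplified, synchronous illustration with hypothetical names (`runBounded`, `Attempt`, `LoopResult`); real builds and repairs are asynchronous, and the framework's own loop may differ.

```typescript
interface Attempt {
  success: boolean;
  output: string;
}

interface LoopResult {
  success: boolean;
  rounds: number;     // number of repair rounds actually used
  lastOutput: string; // output of the final build attempt
}

// Builds once, then alternates repair and rebuild until the build passes
// or the repair budget is exhausted. It never loops indefinitely.
function runBounded(
  build: () => Attempt,
  repair: (failedOutput: string) => void,
  maxRounds = 3,
): LoopResult {
  let attempt = build();
  let rounds = 0;
  while (!attempt.success && rounds < maxRounds) {
    repair(attempt.output);
    rounds += 1;
    attempt = build();
  }
  return { success: attempt.success, rounds, lastOutput: attempt.output };
}
```

When the budget runs out, the loop returns the last failing output instead of retrying, so the failure can be escalated with evidence attached.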

TransLite

TransLite validated the L4 Android build loop on an offline translation application.

Reusable lessons:

Long Gradle or R8 builds should run in the parent session, not inside code-writing subagents.
Runtime engine migration should follow migrateEngine: scan old references, implement the new engine behind the same interface, update consumers, adjust dependencies and ProGuard, then build-fix until clean.
Large on-device model support requires runtime download state, a foreground service, notification channel, model file verification, and keep rules for inference libraries.
Model changes must account for APK size, device RAM, and co-installed apps.

Common fixes now represented in code:

ML Kit language enum mismatches: use stable string language codes where the API expects strings.
Missing coroutine Play Services dependency: add kotlinx-coroutines-play-services.
R8 stripping MediaPipe or LiteRT classes: add explicit keep rules.
Kotlin metadata mismatch: do not force-upgrade the entire Android toolchain; isolate newer libraries behind Java reflection when needed.
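A pattern library of this kind can be approximated as a table that maps error text to a fix hint. The sketch below is illustrative only: the hints paraphrase the list above, while the trigger expressions and the names `FixPattern`, `FIX_PATTERNS`, and `hintsFor` are assumptions, not the contents of `androidPatterns.ts`.

```typescript
interface FixPattern {
  id: string;
  matches: RegExp; // assumed trigger; real error text varies by toolchain version
  hint: string;    // guidance injected into the repair prompt
}

const FIX_PATTERNS: readonly FixPattern[] = [
  {
    id: "coroutines-play-services",
    matches: /Unresolved reference: await/,
    hint: "Add the kotlinx-coroutines-play-services dependency.",
  },
  {
    id: "r8-keep-rules",
    matches: /ClassNotFoundException.*(mediapipe|litert)/i,
    hint: "Add explicit keep rules for MediaPipe or LiteRT classes.",
  },
  {
    id: "kotlin-metadata",
    matches: /binary version of its metadata/,
    hint: "Do not force-upgrade the toolchain; isolate the newer library behind Java reflection if needed.",
  },
];

// Returns the hints whose trigger matches the given error message.
function hintsFor(message: string): string[] {
  return FIX_PATTERNS.filter((pattern) => pattern.matches.test(message)).map(
    (pattern) => pattern.hint,
  );
}
```

Keeping hints as data rather than prose makes it possible to attach only the relevant ones to each repair request.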

CyberDiviner

CyberDiviner contributed visually intensive Compose and Kotlin patterns.

Reusable lessons:

Visual apps should use incremental phase planning: plan Phase N, execute, inspect output, then plan Phase N+1.
Define the navigation skeleton early. Adding a screen means route constant, composable registration, callback parameter, menu entry, and callback wiring in one batch.
For Chinese UI, audit every Text composable and set the correct font family explicitly.
When adding enum or sealed-class values, update every when expression in the same batch.
Canvas is the production path for complex visual geometry. ASCII is only acceptable for debug output.
Secrets must be scanned before commit. Partially masked API keys are still secrets.
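The point about partially masked keys can be illustrated with a scanner that treats masking characters as part of the token. This is a rough sketch, not the project's scanner: `findSecretLikeTokens` is a hypothetical name, and the prefix list (`sk-`, `AKIA`, `AIza`) is a small illustrative sample of common key prefixes.

```typescript
// Known-prefix tokens followed by at least 8 key-like characters.
// "*" and "." are allowed in the tail so masked forms such as
// "sk-abcd****wxyz" are still reported.
const SECRET_LIKE = /\b(?:sk-|AKIA|AIza)[A-Za-z0-9_*.\-]{8,}/g;

// Returns every secret-like token found in the given text.
function findSecretLikeTokens(text: string): string[] {
  return text.match(SECRET_LIKE) ?? [];
}
```

A pre-commit hook could run this over staged files and block the commit on any match. False positives are possible, so matches should be reviewed rather than auto-deleted.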

Voyager AI Mobile

Voyager AI Mobile contributed mobile data-flow, Expo, React Native, and native-module failure lessons.

Reusable lessons:

React Native long-running operations should use async job polling, not SSE and not one long HTTP response.
Completion handlers must validate payloads strictly. Do not let polling catch blocks swallow processing errors.
Backend fields added for downstream UI must pass through every normalizer. Silent data loss often occurs outside the visible component.
Tool schema and system prompt must list the same required fields. If a field is critical, it cannot be optional.
NativeModules access must be guarded. Direct top-level access can crash the app before screen code runs.
Risky native integrations should use a separate package name or product flavor so the stable installed app is not overwritten.
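The polling and strict-validation lessons can be combined in one sketch. The payload shape, `parseJobStatus`, and `pollJob` are assumptions made for illustration; the important detail is that validation happens outside the `try` block, so a malformed payload surfaces as an error instead of being retried silently.

```typescript
type JobStatus =
  | { state: "pending" }
  | { state: "done"; result: string }
  | { state: "failed"; error: string };

// Strictly validates an untrusted payload; anything unexpected throws.
function parseJobStatus(payload: unknown): JobStatus {
  if (typeof payload !== "object" || payload === null) {
    throw new Error("Job status payload is not an object");
  }
  const p = payload as Record<string, unknown>;
  if (p.state === "pending") return { state: "pending" };
  if (p.state === "done" && typeof p.result === "string") {
    return { state: "done", result: p.result };
  }
  if (p.state === "failed" && typeof p.error === "string") {
    return { state: "failed", error: p.error };
  }
  throw new Error(`Invalid job status payload: ${JSON.stringify(payload)}`);
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls until the job finishes, fails, or the attempt budget runs out.
async function pollJob(
  fetchStatus: () => Promise<unknown>,
  intervalMs = 1000,
  maxAttempts = 30,
): Promise<string> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    let payload: unknown;
    try {
      payload = await fetchStatus();
    } catch {
      // Only transport errors are retried.
      await sleep(intervalMs);
      continue;
    }
    // Deliberately outside the try block: validation errors must propagate.
    const status = parseJobStatus(payload);
    if (status.state === "done") return status.result;
    if (status.state === "failed") throw new Error(status.error);
    await sleep(intervalMs);
  }
  throw new Error("Job did not finish within the polling budget");
}
```

Because `result` is required by the validator, a backend that drops the field fails loudly at the first completed poll instead of rendering an empty screen.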

Verified Projects

Project	Stack	What It Validated
TransLite	Kotlin, Compose, LiteRT or MediaPipe model runtime	L4 phase execution, engine migration, model download lifecycle, compile-fix recovery
CyberDiviner	Kotlin, Compose, Hilt, CameraX, MediaPipe	Incremental visual planning, navigation hardening, font rules, canvas rendering, secret hygiene
Voyager AI Mobile	Expo, React Native, Clerk, Mapbox, native Android bridge	Async polling, data-flow normalization, guarded NativeModules, parallel vSDK package isolation
Hermes Mobile	Flutter, Android, embedded Termux	Real-device install verification, Xiaomi device behavior, release discipline
AntiScamAI	Kotlin, Compose, foreground service	Recovery from corrupted source and Android service permission pitfalls
CustomCam	Kotlin, Camera2, NDK	Camera2 RAW capture, ImageReader pitfalls, native ISP path

Quick Start

bun install
bun test
bun run typecheck
bun run src/main.ts

Gateway mode:

bun run src/main.ts --gateway

Build:

bun run build

Development Principles

Real device evidence beats synthetic confidence.
Build success is a gate, not a finish line.
The implementation agent should not be the final reviewer.
Subagents should not run long Gradle builds when timeout risk is high.
Data-flow bugs require tracing source, normalizer, store, props, and component boundary.
Native module failures must be treated as launch blockers until verified through logcat and screenshot.
Public documentation should describe verified behavior only.

Known Limitations

The repository is an implementation scaffold and field-knowledge carrier, not a fully self-hosting Android IDE.
Most validation is Android-focused. Web, iOS, and backend expansion are architectural targets, not equally verified paths.
Automated Compose UI interaction through ADB remains fragile. Use log markers, screenshot inspection, or Compose test APIs where possible.
The current ADB service is partly represented through framework commands and patterns; a dedicated typed ADB service remains a future implementation area.

License

MIT
