A friend-on-the-phone car mechanic that runs entirely on your Android phone, making car diagnostics accessible regardless of income or internet. No cloud calls. No location services. No subscription. Powered by Gemma 4 running locally via Google AI Edge's LiteRT-LM.
Built for the Gemma 4 Good Hackathon. Impact Track: Digital Equity & Inclusivity. Also targets Special Technology Track: LiteRT.
Video (3 min) · Live demo / APK · Writeup · Engineering notes
Three barriers decide who gets a good answer when their check-engine light comes on. Money: a $200 dealer diagnostic, before any work is done. Internet: every consumer car-AI app is a cloud API behind a subscription paywall. Knowledge: even when an answer comes back, it comes back in mechanic's jargon to someone who just wants to know if it's safe to drive.
Those barriers stack hardest on the drivers with the fewest options. A teenager with a first used Corolla. A single parent an hour from the nearest dealer. A driver in a country where the median car is twenty years old. The most useful place to put a car-savvy friend is exactly where cloud LLMs don't work, on the phone the user already owns.
CAR·COPILOT is one APK and one ~3.4 GB model file. After install, it works anywhere, for anyone, forever.
Plug any OBD-II reader into your car. Open CAR·COPILOT. In airplane mode, a Pixel 9 reads your car's fault codes and live sensor data, classifies the problem, and streams a plain-English diagnosis in a friend-on-the-phone voice. Then it walks you through fixing it.
| Screen | What it streams |
|---|---|
| Home | Trip-readiness verdict plus the day's issues, severity-coded |
| Issue | Two-paragraph diagnosis, cost / time / DIY-difficulty, evidence drawer |
| Walkthrough | Per-step repair procedure with canonical torque, gap, and pressure values pinned beside the body |
| Mechanic draft | Codes-and-evidence handoff message you paste into a text to a shop |
| History | The pattern across past issues ("two coil failures in seven months, likely valve cover gasket") |
OBDSnapshot → RulesEngine → Classification + RagStore → PromptBuilder → GemmaService (5 surfaces) → JSON-aware extractors → UI (fail-soft fallback)
`data/RulesEngine.kt` is the deterministic classifier. It owns severity, route, confidence, likely cause, supporting signals, cost, and time: every safety-relevant field. `data/DTCTable.kt` holds 256 thin DTC entries plus 16 deeply curated procedures. `data/RagStore.kt` injects per-DTC context.
Only then does Gemma narrate. Five streaming surfaces, each with its own prompt template (`app/src/main/assets/*.md`), per-surface sampler, JSON envelope, and a stateful extractor that pulls partial content from the streaming buffer without ever letting a half-finished escape sequence reach the user. Every surface has a canned-text fallback path. If LiteRT-LM throws, the user still sees text.
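The "never show a half-finished escape sequence" rule can be illustrated with a minimal sketch. This is not the code in the `ui/*State.kt` extractors; `extractPartialString` and the field name `diagnosis` are hypothetical, and the sketch assumes the target field is a JSON string whose key does not also appear inside an earlier value.

```kotlin
/**
 * Returns the decoded text streamed so far for the string field [key],
 * or null if the field's opening quote has not arrived yet.
 * Hypothetical helper for illustration only.
 */
fun extractPartialString(buffer: String, key: String): String? {
    val marker = "\"$key\""
    val keyAt = buffer.indexOf(marker)
    if (keyAt < 0) return null
    val colonAt = buffer.indexOf(':', keyAt + marker.length)
    if (colonAt < 0) return null
    val openQuote = buffer.indexOf('"', colonAt + 1)
    if (openQuote < 0) return null

    val out = StringBuilder()
    var i = openQuote + 1
    while (i < buffer.length) {
        val c = buffer[i]
        when {
            // Closing quote: the value is complete.
            c == '"' -> return out.toString()
            // Ordinary character: safe to show.
            c != '\\' -> { out.append(c); i++ }
            // Lone backslash at the tail: hold it back until the next chunk.
            i + 1 >= buffer.length -> return out.toString()
            buffer[i + 1] == 'u' -> {
                // Partial \uXXXX: hold the whole escape back.
                if (i + 6 > buffer.length) return out.toString()
                val code = buffer.substring(i + 2, i + 6).toIntOrNull(16)
                    ?: return out.toString()
                out.append(code.toChar())
                i += 6
            }
            else -> {
                val decoded = when (buffer[i + 1]) {
                    'n' -> '\n'
                    't' -> '\t'
                    'r' -> '\r'
                    'b' -> '\b'
                    'f' -> '\u000C'
                    else -> buffer[i + 1] // covers \" \\ and \/
                }
                out.append(decoded)
                i += 2
            }
        }
    }
    return out.toString()
}
```

Called on every new chunk with the accumulated buffer, this yields text that only ever grows by whole characters, e.g. `extractPartialString(buffer, "diagnosis")`. Re-scanning from the start each time is the simplest form; a stateful extractor would remember its position instead.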
| Concern | Where it lives |
|---|---|
| LiteRT-LM Engine + Conversation multiplexing | `inference/GemmaService.kt` |
| Per-surface prompt assembly + RAG retrieval | `inference/PromptBuilder.kt` |
| Streaming JSON extractors (per surface) | `ui/SynthesisState.kt`, `ui/MechanicDraftState.kt`, `ui/HistoryPatternState.kt`, `ui/WalkthroughPlanState.kt`, `ui/WalkthroughStepState.kt` |
| Deterministic classifier (Gemma's guardrail) | `data/RulesEngine.kt` |
| UI contract (the Issue schema) | `model/Schema.kt` |
| Fallback text (per-DTC) | `model/Fallbacks.kt` |
| LiteRT manifest entries | `AndroidManifest.xml` (`<uses-native-library>` for OpenCL) |
Highlights:
- Capability-gated Multi-Token Prediction (MTP). `ExperimentalFlags.enableSpeculativeDecoding` flips on at init only if `Capabilities(modelPath).hasSpeculativeDecodingSupport()` returns true. +23% to 60% decode TPS depending on surface.
- SDK-authoritative benchmarking. `ExperimentalFlags.enableBenchmark = true` plus `Conversation.getBenchmarkInfo()` logged per call. Surfaces `init / ttft / prefill_tokens / decode_tokens / prefill_tps / decode_tps` to `adb logcat -s CarCopilot:V`. Made every subsequent perf change measurable, and revealed that prefill, not decode, dominates wall-clock cost on every surface.
- Per-step phase splitting. `PromptBuilder.splitProcedureIntoPhases` parses curated procedure markdown at `## Phase N` boundaries and ships intro plus indexed phase per step instead of the full document. About 21% prefill drop per step, about 30 seconds saved across a 6-step walkthrough.
- Per-surface samplers. Synthesis, draft, history, and plan use `topK = 40, topP = 0.95, temperature = 0.3` for friend-on-the-phone voice. Walkthrough-step drops to `topP = 0.5, temperature = 0.1` for numeric fidelity (torque and gap values must come through verbatim).
- Surface multiplexing under the single-Conversation-per-Engine constraint. `GemmaService.acquireConversationForSurfaceLocked` closes and recreates the Conversation on surface switch, with a 250 ms native-cleanup floor plus `convoMutex` to mitigate a SIGSEGV inside `liblitertlm_jni.so` we hit on rapid churn.
- Prewarm plus cancel-after-first-token. Parallel coroutine at process start sends a few-shot voice anchor and cancels on the first emitted token. Pays the system-prompt prefill while the user reads the Home screen. About 24% first-token latency win on the Issue page.
- JSON-aware tolerant parsing. `WalkthroughPlanState.balanceJsonTail` walks the streamed buffer with string-aware brace/bracket tracking and closes any unclosed structure at the tail, so cancelled-mid-stream or model-stopped-early output parses instead of bouncing to fallback. Pre-fix, this fallback was firing silently 100% of the time.
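The tail-balancing idea from the last highlight can be sketched in a few lines. This is an illustrative reimplementation under assumptions, not the body of `WalkthroughPlanState.balanceJsonTail`: it tracks whether the scanner is inside a string (so braces in text are ignored), keeps a stack of expected closers, and appends whatever is still open.

```kotlin
/**
 * Closes any string, object, or array left open at the end of [buffer].
 * Sketch only: it does not repair a dangling key or a trailing comma,
 * so a strict parser may still reject some truncation points.
 */
fun balanceJsonTail(buffer: String): String {
    val expectedClosers = ArrayDeque<Char>() // kotlin.collections.ArrayDeque
    var inString = false
    var escaped = false

    for (c in buffer) {
        if (inString) {
            when {
                escaped -> escaped = false      // character after a backslash
                c == '\\' -> escaped = true     // start of an escape sequence
                c == '"' -> inString = false    // unescaped quote ends the string
            }
            continue
        }
        when (c) {
            '"' -> inString = true
            '{' -> expectedClosers.addLast('}')
            '[' -> expectedClosers.addLast(']')
            '}', ']' -> if (expectedClosers.isNotEmpty() && expectedClosers.last() == c) {
                expectedClosers.removeLast()
            }
        }
    }

    val tail = StringBuilder(buffer)
    if (inString) {
        // Drop a dangling backslash so it cannot escape the quote we add.
        if (escaped) tail.setLength(tail.length - 1)
        tail.append('"')
    }
    // Close structures innermost-first.
    while (expectedClosers.isNotEmpty()) tail.append(expectedClosers.removeLast())
    return tail.toString()
}
```

For example, `{"steps": [{"title": "Remove co` becomes `{"steps": [{"title": "Remove co"}]}`, which parses. Output that is already balanced is returned unchanged.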
| Surface | Decode TPS | Prefill tokens | TTFT |
|---|---|---|---|
| Synthesis | 6.15 to 7.05 | 1,449 | 10.2s |
| Mechanic draft | 8.08 | ~1,200 | ~7s |
| Walkthrough plan | 9.02 to 11.02 | 2,247 | 15.9s |
| Walkthrough step | 6.84 to 10.22 | ~2,100 | ~11.7s |
Full perf writeup: reference/perf_notes_2026-05-18.md.
- 179 JVM unit tests (`./gradlew test`) covering the streaming JSON extractors, procedure phase splitter, OBDSnapshot schema, snapshot-to-Issue builder, DTC table contract, deterministic classifier, and ELM327 protocol layer.
- On-device Gemma smoke test at `app/src/androidTest/.../GemmaSmokeTest.kt`, gated on a model push to `/data/local/tmp/`.
- Python OBD emulator at `emulator/obd_emulator.py` with its own stdlib `unittest` suite. Run `python3 emulator/test_obd_emulator.py`.
- No cloud API calls. All inference runs on the device via LiteRT-LM. This is the central product claim.
- No location services. `ACCESS_FINE_LOCATION` and `ACCESS_COARSE_LOCATION` are never requested. The Bluetooth permission uses the `usesPermissionFlags="neverForLocation"` carve-out.
- No internet permission for AI features. No AI request ever leaves the device.
- No mode 04 (clear DTCs). The app does not expose this.
- Gemma never classifies. `DTCTable` and `RulesEngine` are the source of truth for severity, route, cost, time. Gemma only generates narrative text on top.
- Fail soft. If LiteRT-LM errors, throws, or returns unparseable text, every surface falls back to canned text. The app must never crash on the user.
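The fail-soft rule above can be sketched as a single wrapper. Everything here is hypothetical (`Narration`, `narrateOrFallback`, and the two lambdas are illustration names, not the app's API), and the sketch is non-streaming for brevity: the caller passes the canned text, a function that runs inference, and a function that extracts the user-facing text or returns null.

```kotlin
/** What the UI renders: model output, or canned text plus the reason it was used. */
sealed interface Narration { val text: String }
data class Generated(override val text: String) : Narration
data class Canned(override val text: String, val reason: String) : Narration

/**
 * Runs [generate] and [parse]; any exception or empty parse result
 * degrades to [fallbackText] instead of propagating to the UI.
 */
fun narrateOrFallback(
    fallbackText: String,
    generate: () -> String,
    parse: (String) -> String?,
): Narration = try {
    val parsed = parse(generate())
    if (parsed.isNullOrBlank()) {
        Canned(fallbackText, "unparseable output")
    } else {
        Generated(parsed)
    }
} catch (e: Exception) {
    Canned(fallbackText, "inference error: ${e.javaClass.simpleName}")
}
```

Keeping the reason on the `Canned` branch makes a silent fallback visible in logs. In coroutine code, a `CancellationException` would normally be rethrown rather than converted to fallback text.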
- Android Studio Iguana (2023.2.1) or newer. Anything that bundles a recent AGP 8.x and JDK 21+ works.
- JDK 21+. Android Studio ships its own JBR. Point Gradle at it (Settings → Build → Build Tools → Gradle → Gradle JDK), or set `JAVA_HOME` to that JBR path if you build from the command line.
- ADB on `PATH`. Standard install location is `%LOCALAPPDATA%\Android\Sdk\platform-tools\` on Windows or `~/Library/Android/sdk/platform-tools/` on macOS.
- A physical Android 8.0+ device with USB debugging authorized. The reference target is a Pixel 9. Any Tensor or Snapdragon-8-class device with an OpenCL-capable GPU should work. Emulators won't, because the GPU delegate needs a real driver.
git clone https://github.com/DigitalVeer/car-copilot.git CarCopilot
cd CarCopilot
Open the project root in Android Studio. Wait for the Gradle sync to finish. The first sync downloads Compose BOM, LiteRT-LM, and the Kotlin toolchain.
The app needs `gemma-4-E4B-it.litertlm` (~3.41 GB). It is not in git.
- Download from the litert-community Hugging Face repo. Get the `.litertlm` file, not the `.task` file. `.task` is the web build and will not load via the Android SDK.
- Push to the device: `adb push gemma-4-E4B-it.litertlm /data/local/tmp/`
- On first launch, `CarCopilotApp.onCreate` constructs `GemmaService`, which copies the model from `/data/local/tmp/` into the app's private `filesDir`. Subsequent launches reuse the `filesDir` copy.
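The copy-once behavior in the last step might look roughly like the sketch below. It is not the code in `GemmaService`: the function name, the `.partial` staging file, and the size check are all assumptions, added so that an interrupted first-launch copy is not mistaken for a complete model.

```kotlin
import java.io.File

/**
 * Returns the model file inside [privateDir], copying it from [staged]
 * only when no complete copy exists yet. Hypothetical helper.
 */
fun ensureModelInPrivateDir(staged: File, privateDir: File): File {
    val target = File(privateDir, staged.name)

    // Reuse the private copy if the staged file is gone or the sizes match.
    if (target.isFile && (!staged.isFile || target.length() == staged.length())) {
        return target
    }
    require(staged.isFile) { "Model not found at ${staged.path}; push it with adb first" }

    privateDir.mkdirs()
    // Copy under a temporary name, then rename, so a killed process
    // never leaves a truncated file at the final path.
    val partial = File(privateDir, staged.name + ".partial")
    staged.copyTo(partial, overwrite = true)
    if (target.exists()) target.delete()
    check(partial.renameTo(target)) { "Could not move ${partial.name} into place" }
    return target
}
```

On Android the two arguments would be something like `File("/data/local/tmp/gemma-4-E4B-it.litertlm")` and `context.filesDir`. A copy of this size should run off the main thread.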
# Build the debug APK (default: E4B)
./gradlew assembleDebug
# Build with the E2B variant for measurement
./gradlew assembleDebug -PmodelVariant=E2B
# Install on the connected device
adb install -r -d app/build/outputs/apk/debug/app-debug.apk
# Tail runtime logs (project-tagged messages only)
adb logcat -s CarCopilot:V
# Run JVM unit tests
./gradlew test
First launch after install copies the model (~10 to 20s) and warms the GPU shader cache (~30 to 60s). The engine starts warming in `CarCopilotApp.onCreate` while the user sits on Home, so by the time they tap into the Issue page the prefill is already paid.
- `CLAUDE.md`. Project guardrails, architecture, current state, what's in-scope and out-of-scope.
- `FUTURE_WORK.md`. Known issues (including the SIGSEGV mitigation) and the post-demo backlog.
- `reference/perf_notes_2026-05-18.md`. Gemma inference perf measurements and what's been tuned. Read this before changing anything in `GemmaService.kt`.
- `reference/prompts/*.md`. Voice rules (`system.md`) and per-surface prompt templates.
- `submission/WRITEUP.md`. The hackathon writeup.
Apache License 2.0. See LICENSE.