Run Gemma 4 on your phone. Roleplay offline. No API keys, no cloud, no cost.
Download APK · All Releases · Changelog · 简体中文
Most AI roleplay apps require cloud APIs — that means paying per token, sharing your conversations with a server, and losing access when the service goes down.
Aura takes a different path:
| | Cloud API Apps | Aura |
|---|---|---|
| LLM setup | Find a provider, get API keys, manage billing | Download once inside the app, done |
| Cost | Pay per token / monthly subscription | Free forever after download |
| Privacy | Your conversations travel through servers | Everything stays on your phone |
| Internet | Required for every message | Only needed for initial model download |
| Censorship | Provider decides what you can say | You own the model, no restrictions |
| Availability | Service can go down or change terms | Works offline, forever yours |
Aura runs Gemma 4 directly on your phone via Google's LiteRT-LM runtime, with GPU and NPU hardware acceleration. After a one-time model download (~2.5 GB for E2B, ~3.6 GB for E4B), the app never contacts any server again. Your stories, characters, and conversations never leave your device.
Screenshots: Story Library · Scene Chat · Model Setup · Card Import
- Gemma 4 On-Device — Google's latest open model runs natively on your phone via LiteRT-LM, with GPU/NPU acceleration
- No API Keys, No Cost — No accounts, no subscriptions, no tokens to buy. Download the model once and use it forever
- Truly Private — Zero network requests during use. Conversations never leave your device. No analytics, no telemetry
- Tavern Ecosystem — Import PNG (steganography) and JSON character cards, worldbooks, lorebooks from Tavern/SillyTavern
- Story-First UX — Scene continuation, whisper directives, emotion expressions, session branching
- Premium Dark Theme — OLED-optimized with ambient glow effects
- 4 Languages — English, 简体中文, 日本語, 한국어
- Accessible — Screen reader support, reduce-motion compliance
- Download the latest APK from GitHub Releases
- Open Aura and choose a story core (E2B for speed, E4B for quality)
- Wait for the one-time model download (~2.5 GB for E2B, ~3.6 GB for E4B)
- Start a built-in story or import your own Tavern card
- From now on, everything works offline
APK size: ~103 MB for the Android arm64 release — the model downloads separately on first launch, then you never need internet again.
git clone https://github.com/wimi321/aura.git
cd aura
flutter pub get
flutter run
See CONTRIBUTING.md for detailed build instructions, including iOS.
Aura ships with two curated Gemma 4 variants. Both run entirely on-device after download.
| Model | Download | RAM | Best For |
|---|---|---|---|
| Gemma 4 E2B | ~2.5 GB | 6 GB+ | Fast start, lighter devices |
| Gemma 4 E4B | ~3.6 GB | 8 GB+ | Richer vocabulary, longer scenes |
Models download from HuggingFace with SHA256 verification and resume support. You can delete and re-download models at any time from Settings.
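The checksum and resume steps can be sketched roughly as follows. This is a minimal Python illustration, not Aura's actual Dart code: the function names are hypothetical, and it assumes the server honors standard HTTP `Range` requests and that the expected SHA256 is published as a hex string.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so a multi-gigabyte model never sits in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def resume_headers(partial_path):
    """Build an HTTP Range header that continues from the bytes already on disk."""
    done = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    return {"Range": f"bytes={done}-"} if done else {}


def verify_download(path, expected_sha256):
    """Return True only when the finished file matches the published checksum."""
    return sha256_of_file(path) == expected_sha256.lower()
```

A downloader built this way would send `resume_headers(...)` with each retry, append the response body to the partial file, and call `verify_download(...)` once the transfer completes, discarding the file on a mismatch.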
Aura is designed so that your conversations are yours alone:
- No cloud: After model download, the app makes zero network requests
- No accounts: No sign-up, no login, no user tracking
- No telemetry: No analytics, no crash reporting, no usage data
- No data sync: Conversations are stored locally and never uploaded
- Local model: The AI runs on your phone's processor, not a remote server
- Open source: You can audit every line of code
This isn't just a privacy policy — it's an architectural guarantee. There is literally no server to send data to.
| Format | Status |
|---|---|
| Tavern PNG (steganography, tEXt/iTXt chunks) | Supported |
| Tavern / SillyTavern JSON cards | Supported |
| Embedded character_book | Supported |
| Standalone lorebook / worldbook JSON | Supported |
| Alternate greetings | Supported |
| {{char}} / {{user}} macros | Supported |
| Expression packs (ZIP) | Supported |
Aura automatically strips wrapper tags, removes hidden blocks, and normalizes formatting from imported cards.
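For readers curious how a PNG card carries its data, here is a rough Python sketch of reading one. It is not Aura's parser. It assumes the common Tavern convention of a `tEXt` chunk with the keyword `chara` holding base64-encoded JSON, it skips `iTXt` chunks and CRC validation for brevity, and the function names are hypothetical.

```python
import base64
import json
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_png_chunks(data):
    """Yield (type, payload) for each chunk in a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        yield ctype, data[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + payload + 4 CRC


def read_card(data, keyword=b"chara"):
    """Return the card JSON stored in a tEXt chunk, or None if absent."""
    for ctype, payload in iter_png_chunks(data):
        if ctype != b"tEXt":
            continue
        # A tEXt payload is "keyword NUL text".
        key, _, value = payload.partition(b"\x00")
        if key == keyword:
            return json.loads(base64.b64decode(value))
    return None
```

The image pixels are never touched, which is why a card still looks like an ordinary picture in any viewer.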
System overview
┌─────────────────────────────────────────────┐
│ Flutter UI │
│ (Pages, Widgets, Theme) │
├─────────────────────────────────────────────┤
│ AppStateProvider │
│ (Central ChangeNotifier + Provider) │
├─────────────────────────────────────────────┤
│ Backend Services │
│ (Bootstrap, Platform Channels, Stores) │
├─────────────────────────────────────────────┤
│ aura_core │
│ (Pure Dart: Domain → Orchestration) │
│ │
│ ┌──────────┐ ┌──────────────┐ ┌──────────┐ │
│ │ Domain │ │ Application │ │ Infra │ │
│ │ Models & │ │ AuraEngine │ │ Parsers │ │
│ │ Policy │ │ Orchestrator │ │ Persist │ │
│ └──────────┘ └──────────────┘ └──────────┘ │
├─────────────────────────────────────────────┤
│ LiteRT Native Bridge │
│ Android (LiteRT-LM) │ iOS (XCFramework) │
│ GPU / NNAPI │ CoreML / CPU │
└─────────────────────────────────────────────┘
Message flow: User input → ChatOrchestrator (prompt assembly + lorebook injection + whisper) → Native Bridge → Gemma 4 on-device → streamed text + emotion signals → UI
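The prompt-assembly step of that flow might look something like the Python sketch below. It is an illustration under assumptions, not the real ChatOrchestrator: the function names, the `keys`/`content` entry fields, the four-message scan window, and the `[Director note: ...]` whisper format are all hypothetical.

```python
def expand_macros(text, char_name, user_name):
    """Replace the {{char}} and {{user}} macros used by Tavern cards."""
    return text.replace("{{char}}", char_name).replace("{{user}}", user_name)


def assemble_prompt(card, lorebook, history, user_input, user_name, whisper=None):
    """Build one prompt string: persona, triggered lore, history, whisper, input."""
    char = card["name"]
    recent = " ".join(history[-4:] + [user_input]).lower()
    # Inject only lore entries whose keywords appear in the recent text.
    lore = [e["content"] for e in lorebook
            if any(k.lower() in recent for k in e["keys"])]
    parts = [expand_macros(card["description"], char, user_name)]
    parts += [expand_macros(entry, char, user_name) for entry in lore]
    parts += history
    if whisper:
        # Assumed format: an out-of-character steering note for the model.
        parts.append(f"[Director note: {whisper}]")
    parts.append(f"{user_name}: {user_input}")
    parts.append(f"{char}:")  # Leave the turn open for the model to continue.
    return "\n".join(parts)
```

Keyword-triggered injection keeps the prompt short, which matters on a phone where context length and memory are limited.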
- On-device Gemma 4 inference (E2B + E4B)
- Tavern PNG/JSON card import with worldbook
- Session history and branching
- Whisper directives and emotion system
- 4-language UI (EN/ZH/JA/KO)
- Premium OLED dark theme
- Message copy, timestamps, haptic feedback
- Accessibility (Semantics + reduce-motion)
- Model download recovery for flaky networks
- Wider Tavern card format compatibility
- More built-in story genres
- Tablet-optimized layouts
- Community card sharing
Do I need an API key or account?
No. Aura runs Gemma 4 directly on your phone. There's no API, no account, no subscription. Download the model once and use it forever, completely free.
Is my data really private?
Yes. After the one-time model download, Aura makes zero network requests, ever. Your conversations, characters, and all data stay on your device. There is no server, no cloud, no telemetry. This is an architectural guarantee, not just a promise.
What devices are supported?
Android devices with 6 GB+ RAM (for E2B) or 8 GB+ (for E4B). iOS is supported by building from source. Hardware acceleration uses GPU on Android and CoreML on iOS.
Can I import my existing Tavern cards?
Yes. Aura reads Tavern PNG cards (with embedded metadata via steganography), JSON cards, and standalone worldbook files. Embedded lorebooks are preserved automatically.
Does it work without internet?
Yes. After the initial model download, Aura works completely offline. You can use it in airplane mode, in areas with no signal, or with Wi-Fi turned off.
Why is the APK ~103 MB?
The arm64 APK contains the Flutter app and LiteRT-LM runtime but not the model weights. Models (~2.5–3.6 GB) download on first launch so the installer stays shareable.
We welcome contributions! See CONTRIBUTING.md for setup instructions, code style, and PR process.
- Bug reports: Bug report template
- Feature ideas: Feature request template
- Security issues: SECURITY.md
MIT — use it, fork it, build on it.
Built with Flutter. Powered by Gemma 4 running 100% on-device via Google LiteRT-LM.
No API keys. No cloud. No cost. Your stories stay yours.
If Aura is useful to you, consider giving it a star.






