Jarvis on your own hardware.
A private, self-improving AI orchestrator that runs entirely on Apple Silicon.
No cloud. No subscriptions. No data leaves your machine.
Install • Highlights • Architecture • The Brain • OpenChain • Getting Started • All Features • Vision • Model
Zora is a personal AI operating system. Not a chatbot. Not a wrapper around an API. A persistent, always-on intelligence that runs across your own Macs, learns your workflow, handles your communications, monitors your infrastructure, and works autonomously — day and night.
She drafts your emails in your voice. She preps you before meetings with talking points about the people you're meeting. She triages your inbox across WhatsApp, Teams, and email. She watches your screen, notices when you're stuck, and speaks up when something matters. She stays quiet when it doesn't.
Everything runs locally on Apple Silicon via MLX. The brain fits on a 16GB Mac with headroom to spare. Scale up by adding more nodes — or keep it on a single machine. Cloud providers (OpenAI, more coming) are available via /plan mode for complex planning, but Zora works fully offline by default.
```sh
git clone https://github.com/Azkabanned/zora.git
cd zora
./install.sh   # sets up everything: venv, models, voice, search engine
./start.sh     # starts Zora on http://localhost:8000
```

Requirements: macOS with Apple Silicon (M1+), 16GB+ RAM, Python 3.11+, Node.js, Redis.
The installer handles Homebrew dependencies, Python venv, the Zora brain from HuggingFace, Qwen3-TTS voice model, Whisper STT, WhatsApp bridge, SearXNG (private search), and data directories. One command. No Docker.
- **The Brain.** Fine-tuned Qwen3-4B quantised to 4-bit via MLX. Runs on the Metal GPU — first token in under 1s, ~45 tok/s. Trained on 1,186 synthetic orchestrator examples. Zero personal data in training.
- **TurboQuant.** Purpose-built Metal kernel implementing 3-bit PolarQuant KV-cache compression. 2.7x memory savings — a 32K context fits in 1.5GB instead of 4GB. This is what makes Zora practical on a 16GB Mac.
- **150+ Tools.** File ops, sandboxed shell, browser automation, web search, git, GitHub, email, calendar, Teams, WhatsApp, Telegram, Slack, Discord, finance, 29 macOS host actions, and more. All routed by Smart Handoff.
- **COG-X Cognition.** Background loop every 5 minutes. Reflection, morning briefings, personalised greetings, a 5-dimensional mood system, proactive messaging, decision memory, ambient awareness.
- **People Intelligence.** Extracts facts from conversations. Generates talking points before meetings. Tracks relationship health. Nudges you when someone's going cold. Birthday reminders. Helps you show up better.
- **Digital Delegate.** Drafts replies in your voice, learned per contact: tone, formality, emoji, greeting style, sign-off. Draft-and-approve with permission gates. Never auto-sends without explicit consent.
- **Voice.** Qwen3-TTS with instruct-driven emotional prosody — Zora doesn't just read text, she performs it: warmth when greeting you, urgency when something's broken, laughter when appropriate, a sigh when frustrated. Multiple voices (4 Qwen3 + 7 Kokoro speakers), Whisper STT for local speech-to-text, wake-word ("Zora") activation, and push-to-talk on dashboard and mobile.
- **Multi-Node.** Any Mac can join. A self-learning task router sends work to the best node: a 16GB Mini for the brain, a 128GB MBP for 70B coding. No limit on nodes.
- **And More.** 6 channels live (WhatsApp, Telegram, Teams, Email, Slack, Discord), message triage with priority scoring, a unified inbox, 4-layer progressive memory (index → timeline → detail → knowledge graph), 36 agent skills with auto-creation, ReasoningBank trajectory caching, overnight autonomous work, and a self-improvement training loop.
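The memory arithmetic behind the TurboQuant claim is easy to sanity-check. A back-of-the-envelope in Python, using illustrative layer/head dimensions for a 4B-class model (these numbers are assumptions, not Zora's actual configuration):

```python
# Hypothetical dims for a 4B-class model (illustrative, not Zora's exact config)
LAYERS, KV_HEADS, HEAD_DIM, SEQ = 36, 8, 128, 32_768

# fp16 KV cache: K and V tensors per layer, 2 bytes per element
fp16_bytes = 2 * LAYERS * KV_HEADS * HEAD_DIM * SEQ * 2
compressed = fp16_bytes / 2.7  # the 2.7x savings quoted for PolarQuant

print(f"fp16 KV cache at 32K: {fp16_bytes / 2**30:.1f} GiB")
print(f"3-bit PolarQuant:     {compressed / 2**30:.1f} GiB")
```

With these assumed dimensions the fp16 cache lands around 4.5 GiB and the compressed cache under 2 GiB, which is in the same ballpark as the 4GB-to-1.5GB figure above.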
┌─────────────────────────────────┐
│ Dashboard (8000) │
│ Alpine.js + Tailwind + Three.js │
└──────────────┬──────────────────┘
│
┌──────────────▼──────────────────┐
│ FastAPI Orchestrator │
│ │
│ ┌──────────┐ ┌──────────────┐ │
│ │ COG-X │ │ Agent Runner │ │
│ │ Cognition│ │ 150+ Tools │ │
│ │ Loop │ │ Smart Handoff │ │
│ └──────────┘ └──────────────┘ │
│ ┌──────────┐ ┌──────────────┐ │
│ │ Memory │ │ Delegate │ │
│ │ Vector + │ │ Engine │ │
│ │ FTS + KG │ │ (acts as you)│ │
│ └──────────┘ └──────────────┘ │
│ ┌──────────┐ ┌──────────────┐ │
│ │ Channels │ │ Plugins │ │
│ │ WA/TG/ │ │ Email/Reddit │ │
│ │ Teams/ │ │ Search/Kuma │ │
│ │ Slack │ │ UniFi/Notion │ │
│ └──────────┘ └──────────────┘ │
└──┬────────────────┬────────────────┬──┘
│ │ │
┌─────────────▼──┐ ┌──────────▼───┐ ┌────────▼─────────┐
│ Mac Mini M4 │ │ Mac Mini M4 │ │ MacBook Pro M5 │
│ 16GB · Brain │ │ 16GB · Node │ │ 128GB · Worker │
│ zora-4b (MLX) │ │ Light infer │ │ Qwen3-Coder- │
│ TurboQuant KV │ │ Qwen3-4B │ │ Next 70B (MLX) │
│ Speculative │ │ │ │ │
└────────────────┘ └──────────────┘ └──────────────────┘
Tested with: Mac Mini M4 16GB (orchestrator + brain), Mac Mini M4 16GB (light inference node), MacBook Pro M5 Max 128GB (Qwen3-Coder-Next 70B for heavy coding). Any Mac can join — the more nodes, the more powerful the cluster.
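The self-learning routing mentioned above is, at its core, a bandit-style choice over nodes: track per-node throughput, usually pick the best performer, occasionally explore. A minimal sketch (node names, the tok/s scoring rule, and the exploration rate are all illustrative assumptions, not Zora's actual router):

```python
import random

class TaskRouter:
    """Toy self-learning router: record observed tok/s per node and
    route new work to the best performer, with a small exploration rate."""

    def __init__(self, nodes, explore=0.1):
        self.stats = {n: {"toks": 0.0, "runs": 0} for n in nodes}
        self.explore = explore

    def record(self, node, tok_per_s):
        self.stats[node]["toks"] += tok_per_s
        self.stats[node]["runs"] += 1

    def pick(self):
        if random.random() < self.explore:
            return random.choice(list(self.stats))  # explore a random node
        # exploit: node with the highest mean observed throughput
        return max(self.stats, key=lambda n:
                   self.stats[n]["toks"] / max(self.stats[n]["runs"], 1))

router = TaskRouter(["mini-brain", "mini-node", "mbp-worker"], explore=0.0)
router.record("mini-brain", 45.0)
router.record("mbp-worker", 12.0)  # bigger model, slower per token
print(router.pick())
```

A real router would also weigh task type (a slow 70B node is still the right choice for heavy coding), which is why per-task-type stats matter more than a single global score.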
Stack: Python 3.11+ • FastAPI • SQLAlchemy • Redis • MLX • Alpine.js • Tailwind • Three.js • sentence-transformers • LightRAG • MCP
| Spec | Detail |
|---|---|
| Model | `project-zora/zora-4b` — fine-tuned Qwen3-4B |
| Quantisation | 4-bit affine (group size 64) via MLX |
| Size | ~2.1GB |
| Performance | ~45 tok/s on an M4 Mac Mini |
| KV cache | Custom 3-bit PolarQuant kernel — 2.7x compression |
| Training | 1,186 synthetic examples, zero personal data |
| Self-improvement | MLX LoRA fine-tuning → staged eval → promote or discard |
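The "staged eval → promote or discard" step amounts to a quality gate on the fine-tuned candidate. A hedged sketch of the idea (the margin value and mean-score comparison are illustrative, not the project's actual thresholds):

```python
def staged_eval_gate(candidate_scores, baseline_scores, margin=0.02):
    """Promote a LoRA-tuned candidate only if its mean staged-eval score
    beats the current brain's by at least `margin`; otherwise discard."""
    cand = sum(candidate_scores) / len(candidate_scores)
    base = sum(baseline_scores) / len(baseline_scores)
    return "promote" if cand >= base + margin else "discard"

print(staged_eval_gate([0.84, 0.80, 0.82], [0.78, 0.77, 0.79]))  # clear win
print(staged_eval_gate([0.79, 0.78], [0.78, 0.79]))              # too close
```

The margin matters: without it, eval noise alone can promote a candidate that is no better than the brain it replaces.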
Zora exposes an Anthropic-compatible API on port 4001. Point any tool at it for local inference:
```sh
export ANTHROPIC_BASE_URL=http://localhost:4001
export ANTHROPIC_API_KEY=local
claude   # runs on your Metal GPU — zero cloud latency
```

Works with Claude Code, Codex, and any tool that speaks the Anthropic API.
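You can also hit the endpoint from plain Python with no SDK. The payload shape follows the Anthropic Messages API; the model name here is an assumption, so use whatever your instance reports:

```python
import json
import urllib.request

def build_messages_request(prompt, model="zora-4b", max_tokens=512):
    """Anthropic Messages API-style payload for the local endpoint."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }

def ask_zora(prompt, base="http://localhost:4001"):
    req = urllib.request.Request(
        f"{base}/v1/messages",
        data=json.dumps(build_messages_request(prompt)).encode(),
        headers={"content-type": "application/json", "x-api-key": "local"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# ask_zora("Summarise my unread messages")  # requires a running instance
```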
| Page | Description |
|---|---|
| Chat | Talk to Zora, see tool calls live, stream responses, spawn agents |
| Office | 3D isometric workspace (Three.js). Zora picks her pet, decorates for seasons, plants grow over time. The room evolves with your relationship |
| Monitor | CRT-style terminal — radar sweep, network topology, live data packets, retro green-on-black |
| People | Relationship health, talking points, facts, reachout suggestions |
| Connectors | Configure Microsoft 365, Telegram, WhatsApp, Uptime Kuma, UniFi |
| Admin | 15-tab control plane: nodes, models, MCP, plugins, training, security |
```sh
# On any Mac on your network:
./worker/deploy-full.sh
```

- Auto-discovers the orchestrator via mDNS
- Self-learning task router — routes work to the best node automatically
- Menu bar app with live tok/s and the current task
- Screen observation with `--observe` — privacy blur on sensitive content
- HMAC-signed results — the orchestrator verifies every response
- No limit on nodes. Bigger cluster = more capable Zora
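HMAC signing of worker results needs nothing beyond the standard library. A sketch of the idea (the key handling and JSON canonicalisation here are assumptions, not Zora's actual wire format):

```python
import hashlib
import hmac
import json

SHARED_KEY = b"per-node-secret"  # hypothetical; a real key is provisioned per node

def sign_result(payload: dict) -> str:
    # Canonical JSON so worker and orchestrator hash identical bytes
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(SHARED_KEY, body, hashlib.sha256).hexdigest()

def verify_result(payload: dict, signature: str) -> bool:
    # Constant-time comparison prevents timing attacks on the tag
    return hmac.compare_digest(sign_result(payload), signature)

result = {"task_id": "t-42", "output": "done"}
tag = sign_result(result)
print(verify_result(result, tag))
print(verify_result({**result, "output": "tampered"}, tag))
```

Any tampering with the payload changes the digest, so the orchestrator can reject forged or modified worker responses outright.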
Mobile PWA — open on your phone, add to home screen. Chat, voice, camera. Always in your pocket.
**Cloudflare Tunnel** (free) — `brew install cloudflared` + configure in Admin. Zero-trust tunnel, no ports to forward, custom domain.
**Tailscale** — `brew install tailscale`. WireGuard mesh VPN. Access at `http://mac-mini:8000`. Auto-detected by Zora.
- All inference on your hardware. No API calls by default
- Data never leaves your machine. SQLite database, local files, local memory
- Cloud is opt-in. `/plan` mode uses OpenAI for planning; a data sanitiser strips sensitive content first
- Private search. SearXNG runs locally. No Google, no tracking
- Encrypted vault. macOS Keychain with encrypted fallback
- No telemetry. Zora doesn't phone home. Ever
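The sanitise-before-cloud step can be as simple as regex redaction. A toy sketch of the idea (these two patterns are illustrative; a real sanitiser would cover far more categories):

```python
import re

PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\+?\d[\d()\s.-]{7,}\d"),
}

def sanitise(text: str) -> str:
    """Replace likely-sensitive substrings with placeholders
    before anything is handed to a cloud planner."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

print(sanitise("Ping jane@example.com or +44 20 7946 0958 about the demo"))
```

The cloud model still sees enough structure to plan ("ping [email] about the demo") without ever seeing the contact details themselves.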
9 built-in plugins: Email, Reddit, SearXNG, Uptime Kuma, UniFi, Discord, Slack, Notion, Facebook Groups.
Plugin SDK — build your own with a `plugin.yaml` + `main.py`. Three types: connectors, skills, MCP. Templates in `plugins/_templates/`.
36 agent skills — markdown files that shape behaviour (Senior Dev, Incident Commander, Work Proxy, etc.). Auto-created after 15 successful runs of a pattern.
Four small machines are smarter than one big GPU.
A public, open-source compute network built on the same orchestration substrate:
- Contribute idle cycles from your homelab
- Earn credits for verified work completed by your nodes
- Spend credits on tasks bigger than your hardware can handle alone
- Jobs split and merge across multiple contributor nodes
- Privacy-preserving identity — keypair-based, no email or account required
Your personal Zora stays completely private. OpenChain only runs generic jobs, shared evals, and public skills. The orchestration substrate, worker protocol, and HMAC verification that make this possible are already built and running in private clusters today.
Shipped: Custom brain, TurboQuant kernel, speculative decoding, multi-node, COG-X cognition, People Intelligence, Digital Delegate, 6 channels, message triage, voice (TTS + STT + wake word), 3D Office, CRT Monitor, 150+ tools, plugin system, sandboxed execution, trust ledger, overnight autonomy, self-improvement loop, Cloudflare/Tailscale, mobile PWA, menu bar worker app.
In progress: Personal brain training from real usage, Playwright browser isolation, daytime autonomous work.
Planned: WebXR Office (Meta Quest), AR presence (iPhone ARKit + Apple Vision Pro), Looking Glass holographic display, multi-persona, OpenChain compute network, external repo learning, multi-model routing, voice calls and video presence.
Zora is a labour of love. Free and open source — always will be.
This project is still relatively new, so expect some sharp edges. See known issues for current bugs — for example, TTS can crash on 16GB Macs when the brain model is also loaded (Metal GPU memory). That's where you come in. We want to build Zora with the community, not just for it. Whether it's fixing bugs, adding plugins, improving the brain, building new integrations, or just telling us what's broken — contributions are welcome and valued.
```sh
git clone https://github.com/Azkabanned/zora.git
cd zora
./install.sh
python3 scripts/seed_db.py   # populate with fake data for development
./start.sh
```

The seed script creates realistic fake contacts, conversations, messages, and notifications so you can develop without real integrations.
A well-designed system with a mediocre model beats a genius model in a poor system.
System over intelligence. Solve once, reuse forever. Minimal context, maximum clarity. Humans over tech.
Apache 2.0 • Built with MLX on Apple Silicon
Made by @Azkabanned
