Zora

Apple Silicon · MLX · 150+ Tools · Local First · Apache 2.0 License

Jarvis on your own hardware.
A private, self-improving AI orchestrator that runs entirely on Apple Silicon.
No cloud. No subscriptions. No data leaves your machine.

Install · Highlights · Architecture · The Brain · OpenChain · Getting Started · All Features · Vision · Model


What is Zora?

Zora is a personal AI operating system. Not a chatbot. Not a wrapper around an API. A persistent, always-on intelligence that runs across your own Macs, learns your workflow, handles your communications, monitors your infrastructure, and works autonomously — day and night.

She drafts your emails in your voice. She preps you before meetings with talking points about the people you're meeting. She triages your inbox across WhatsApp, Teams, and email. She watches your screen, notices when you're stuck, and speaks up when something matters. She stays quiet when it doesn't.

Everything runs locally on Apple Silicon via MLX. The brain fits on a 16GB Mac with headroom to spare. Scale up by adding more nodes — or keep it on a single machine. Cloud providers (OpenAI, more coming) are available via /plan mode for complex planning, but Zora works fully offline by default.
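The local-by-default, cloud-on-request split can be pictured as a tiny dispatcher. This is only a sketch; the `/plan` prefix handling and the `cloud:`/`local:` labels are illustrative, not Zora's actual internals:

```python
def route(message: str) -> str:
    """Route a request: local MLX model by default, cloud only on explicit /plan."""
    if message.startswith("/plan "):
        # Opt-in cloud path: strip the command, hand the rest to the cloud planner.
        return "cloud:" + message[len("/plan "):]
    # Everything else never leaves the machine.
    return "local:" + message
```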


Install

git clone https://github.com/Azkabanned/zora.git
cd zora
./install.sh    # sets up everything: venv, models, voice, search engine
./start.sh      # starts Zora on http://localhost:8000

Requirements: macOS with Apple Silicon (M1+), 16GB+ RAM, Python 3.11+, Node.js, Redis.

The installer handles Homebrew dependencies, Python venv, the Zora brain from HuggingFace, Qwen3-TTS voice model, Whisper STT, WhatsApp bridge, SearXNG (private search), and data directories. One command. No Docker.


Highlights

Custom Trained Brain

Fine-tuned Qwen3-4B quantised to 4-bit via MLX. Runs on Metal GPU — first token in <1s, ~45 tok/s. Trained on 1,186 synthetic orchestrator examples. Zero personal data in training.

TurboQuant — Custom MLX Kernel

Purpose-built Metal kernel implementing 3-bit PolarQuant KV cache compression. 2.7x memory savings — fit 32K context in 1.5GB instead of 4GB. This is what makes Zora practical on a 16GB Mac.
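The memory claim is easy to sanity-check: a KV cache stores keys and values for every layer and token. The sketch below uses assumed Qwen3-4B-class dimensions (36 layers, 8 KV heads, head dim 128; illustrative, not published specs) together with the measured 2.7x end-to-end savings quoted above:

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   seq_len: int, bits_per_elem: int) -> int:
    # K and V each hold n_layers * n_kv_heads * head_dim values per token.
    elems = 2 * n_layers * n_kv_heads * head_dim * seq_len
    return elems * bits_per_elem // 8

# Assumed Qwen3-4B-class shape at 32K context.
fp16 = kv_cache_bytes(36, 8, 128, 32_768, 16)
compressed = fp16 / 2.7  # measured savings from the 3-bit PolarQuant kernel
print(f"fp16 KV cache: {fp16 / 2**30:.1f} GiB, compressed: {compressed / 2**30:.1f} GiB")
```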

150+ Tools

File ops, shell (sandboxed), browser automation, web search, git, GitHub, email, calendar, Teams, WhatsApp, Telegram, Slack, Discord, finance, 29 macOS host actions, and more. All routed by Smart Handoff.

Cognition Engine

Background loop every 5 minutes. COG-X reflection, morning briefings, personalised greetings, mood system (5-dimensional), proactive messaging, decision memory, ambient awareness.

People Intelligence

Extracts facts from conversations. Generates talking points before meetings. Tracks relationship health. Nudges you when someone's going cold. Birthday reminders. Helps you show up better.

Digital Delegate

Drafts replies in your voice — learned per contact. Tone, formality, emoji, greeting style, sign-off. Draft-and-approve with permission gates. Never auto-sends without explicit consent.
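The draft-and-approve gate reduces to one invariant: nothing reaches a transport without an explicit approval step. A minimal sketch (class and method names are illustrative):

```python
class DelegateOutbox:
    """Drafts live here until a human explicitly approves them."""

    def __init__(self, transport):
        self.transport = transport      # callable(to, body) that actually sends
        self.pending = []

    def draft(self, to: str, body: str) -> int:
        self.pending.append((to, body))
        return len(self.pending) - 1    # draft id

    def approve_and_send(self, draft_id: int):
        to, body = self.pending.pop(draft_id)
        self.transport(to, body)        # the only call site that can send
```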

Voice

Qwen3-TTS with instruct-driven emotional prosody — Zora doesn't just read text, she performs it. Warmth when greeting you, urgency when something's broken, laughter when appropriate, sighing when frustrated. Multiple voice choices (4 Qwen3 + 7 Kokoro speakers). Whisper STT for local speech-to-text. Wake word ("Zora") activation. Push-to-talk on dashboard and mobile.

Multi-Node Cluster

Any Mac can join. Self-learning task router sends work to the best node. 16GB Mini for the brain, 128GB MBP for 70B coding. No limit on nodes.
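A self-learning router can be as simple as tracking observed throughput per task type and node: explore untried nodes first, then exploit the best average. The policy below is an illustration, not Zora's actual router:

```python
from collections import defaultdict

class TaskRouter:
    def __init__(self, nodes):
        self.nodes = list(nodes)
        # task_type -> node -> list of observed tok/s
        self.stats = defaultdict(lambda: defaultdict(list))

    def record(self, task_type, node, tok_per_s):
        self.stats[task_type][node].append(tok_per_s)

    def route(self, task_type):
        seen = self.stats[task_type]
        untried = [n for n in self.nodes if n not in seen]
        if untried:
            return untried[0]           # explore nodes with no history yet
        return max(seen, key=lambda n: sum(seen[n]) / len(seen[n]))
```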

6 channels live: WhatsApp, Telegram, Teams, Email, Slack, Discord. Message triage with priority scoring across all channels. Unified inbox. 4-layer progressive memory (index → timeline → detail → knowledge graph). 36 agent skills with auto-creation. ReasoningBank trajectory caching. Overnight autonomous work. Self-improvement training loop.
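Cross-channel priority scoring can be sketched as a weighted sum of simple signals. The weights, keywords, and field names below are invented for illustration:

```python
def triage_score(msg: dict) -> float:
    score = {"vip": 3.0, "known": 1.0}.get(msg.get("sender_tier"), 0.0)
    text = msg.get("text", "").lower()
    if any(w in text for w in ("urgent", "asap", "down", "broken")):
        score += 2.0                    # urgency keywords
    if msg.get("direct"):
        score += 1.0                    # direct message beats group chatter
    return score

def triage(messages):
    """Highest-priority first, regardless of which channel it arrived on."""
    return sorted(messages, key=triage_score, reverse=True)
```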

See all features →


Architecture

            ┌────────────────────────────────────┐
            │          Dashboard (8000)          │
            │   Alpine.js + Tailwind + Three.js  │
            └──────────────────┬─────────────────┘
                               │
            ┌──────────────────▼─────────────────┐
            │        FastAPI Orchestrator        │
            │                                    │
            │ ┌──────────────┐  ┌──────────────┐ │
            │ │ COG-X        │  │ Agent Runner │ │
            │ │ Cognition    │  │ 150+ Tools   │ │
            │ │ Loop         │  │ Smart Handoff│ │
            │ └──────────────┘  └──────────────┘ │
            │ ┌──────────────┐  ┌──────────────┐ │
            │ │ Memory       │  │ Delegate     │ │
            │ │ Vector +     │  │ Engine       │ │
            │ │ FTS + KG     │  │ (acts as you)│ │
            │ └──────────────┘  └──────────────┘ │
            │ ┌──────────────┐  ┌──────────────┐ │
            │ │ Channels     │  │ Plugins      │ │
            │ │ WA/TG/Teams/ │  │ Email/Reddit │ │
            │ │ Slack        │  │ Search/Kuma  │ │
            │ │              │  │ UniFi/Notion │ │
            │ └──────────────┘  └──────────────┘ │
            └───┬─────────────┬─────────────┬────┘
                │             │             │
  ┌─────────────▼──┐  ┌───────▼────────┐  ┌─▼──────────────┐
  │  Mac Mini M4   │  │  Mac Mini M4   │  │ MacBook Pro M5 │
  │  16GB · Brain  │  │  16GB · Node   │  │ 128GB · Worker │
  │ zora-4b (MLX)  │  │  Light infer   │  │ Qwen3-Coder-   │
  │ TurboQuant KV  │  │   Qwen3-4B     │  │ Next 70B (MLX) │
  │ Speculative    │  │                │  │                │
  └────────────────┘  └────────────────┘  └────────────────┘

Tested with: Mac Mini M4 16GB (orchestrator + brain), Mac Mini M4 16GB (light inference node), MacBook Pro M5 Max 128GB (Qwen3-Coder-Next 70B for heavy coding). Any Mac can join — the more nodes, the more powerful the cluster.

Stack: Python 3.11+ • FastAPI • SQLAlchemy • Redis • MLX • Alpine.js • Tailwind • Three.js • sentence-transformers • LightRAG • MCP


The Brain

Model: project-zora/zora-4b — fine-tuned Qwen3-4B
Quantisation: 4-bit affine (group size 64) via MLX
Size: ~2.1GB
Performance: ~45 tok/s on M4 Mac Mini
KV cache: custom 3-bit PolarQuant kernel — 2.7x compression
Training: 1,186 synthetic examples, zero personal data
Self-improvement: MLX LoRA fine-tuning → staged eval → promote or discard
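The promote-or-discard step is a guard rail: a LoRA-tuned candidate only replaces the live brain if it clears the baseline on a held-out suite. A sketch, where the scoring and margin are illustrative rather than Zora's actual eval harness:

```python
def staged_eval(model_fn, suite) -> float:
    """Fraction of held-out tasks the model answers correctly."""
    return sum(model_fn(q) == a for q, a in suite) / len(suite)

def promote_or_discard(baseline_fn, candidate_fn, suite, margin: float = 0.02) -> str:
    base = staged_eval(baseline_fn, suite)
    cand = staged_eval(candidate_fn, suite)
    # Require a real improvement, not a tie, before swapping models.
    return "promote" if cand >= base + margin else "discard"
```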

Use with Claude Code

Zora exposes an Anthropic-compatible API on port 4001. Point any tool at it for local inference:

export ANTHROPIC_BASE_URL=http://localhost:4001
export ANTHROPIC_API_KEY=local
claude    # runs on your Metal GPU — zero cloud latency

Works with Claude Code, Codex, and any tool that speaks the Anthropic API.
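Any plain HTTP client works too, since the endpoint speaks the Anthropic Messages format. A stdlib-only sketch (the model name is illustrative; the request shape follows the Anthropic API):

```python
import json
from urllib.request import Request, urlopen

def build_request(prompt: str, base_url: str = "http://localhost:4001",
                  model: str = "zora-4b") -> Request:
    payload = {
        "model": model,                       # illustrative model name
        "max_tokens": 512,
        "messages": [{"role": "user", "content": prompt}],
    }
    return Request(
        f"{base_url}/v1/messages",
        data=json.dumps(payload).encode(),
        headers={"content-type": "application/json", "x-api-key": "local"},
    )

# With Zora running locally:
# reply = json.load(urlopen(build_request("Summarise my unread messages")))
```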


Dashboard

Chat: Talk to Zora, see tool calls live, stream responses, spawn agents
Office: 3D isometric workspace (Three.js). Zora picks her pet, decorates for seasons, plants grow over time. The room evolves with your relationship
Monitor: CRT-style terminal with radar sweep, network topology, live data packets, retro green-on-black
People: Relationship health, talking points, facts, reach-out suggestions
Connectors: Configure Microsoft 365, Telegram, WhatsApp, Uptime Kuma, UniFi
Admin: 15-tab control plane for nodes, models, MCP, plugins, training, security

Worker Nodes

# On any Mac on your network:
./worker/deploy-full.sh
  • Auto-discovers orchestrator via mDNS
  • Self-learning task router — routes work to the best node automatically
  • Menu bar app with live tok/s and current task
  • Screen observation with --observe — privacy blur on sensitive content
  • HMAC-signed results — orchestrator verifies every response
  • No limit on nodes. Bigger cluster = more capable Zora
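The HMAC verification step is standard: worker and orchestrator share a secret, the worker signs each result, and the orchestrator recomputes the signature before trusting it. A stdlib sketch (key handling is illustrative; real deployments provision per-node secrets):

```python
import hashlib, hmac, json

def sign_result(result: dict, key: bytes) -> dict:
    body = json.dumps(result, sort_keys=True).encode()
    return {"body": result, "sig": hmac.new(key, body, hashlib.sha256).hexdigest()}

def verify_result(signed: dict, key: bytes) -> bool:
    body = json.dumps(signed["body"], sort_keys=True).encode()
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed["sig"])   # constant-time compare
```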

Mobile & Remote Access

Mobile PWA — open on your phone, add to home screen. Chat, voice, camera. Always in your pocket.

Cloudflare Tunnel (free): brew install cloudflared, then configure in Admin. Zero-trust tunnel, no ports to forward, custom domain.

Tailscale: brew install tailscale. WireGuard mesh VPN. Access at http://mac-mini:8000. Auto-detected by Zora.


Privacy

  • All inference on your hardware. No API calls by default
  • Data never leaves your machine. SQLite database, local files, local memory
  • Cloud is opt-in. /plan mode for OpenAI planning. Data sanitiser strips sensitive content first
  • Private search. SearXNG runs locally. No Google, no tracking
  • Encrypted vault. macOS Keychain with encrypted fallback
  • No telemetry. Zora doesn't phone home. Ever

Plugins & Skills

9 built-in plugins: Email, Reddit, SearXNG, Uptime Kuma, UniFi, Discord, Slack, Notion, Facebook Groups.

Plugin SDK — build your own with a plugin.yaml + main.py. Three types: connectors, skills, MCP. Templates in plugins/_templates/.

36 agent skills — markdown files that shape behaviour (Senior Dev, Incident Commander, Work Proxy, etc.). Auto-created after 15 successful runs of a pattern.

Full plugin & skill docs →


OpenChain (Future)

Four small machines are smarter than one big GPU.

A public, open-source compute network built on the same orchestration substrate:

  1. Contribute idle cycles from your homelab
  2. Earn credits for verified work completed by your nodes
  3. Spend credits on tasks bigger than your hardware can handle alone
  4. Jobs split and merge across multiple contributor nodes
  5. Privacy-preserving identity — keypair-based, no email or account required

Your personal Zora stays completely private. OpenChain only runs generic jobs, shared evals, and public skills. The orchestration substrate, worker protocol, and HMAC verification that make this possible are already built and running in private clusters today.


Roadmap

Shipped: Custom brain, TurboQuant kernel, speculative decoding, multi-node, COG-X cognition, People Intelligence, Digital Delegate, 6 channels, message triage, voice (TTS + STT + wake word), 3D Office, CRT Monitor, 150+ tools, plugin system, sandboxed execution, trust ledger, overnight autonomy, self-improvement loop, Cloudflare/Tailscale, mobile PWA, menu bar worker app.

In progress: Personal brain training from real usage, Playwright browser isolation, daytime autonomous work.

Planned: WebXR Office (Meta Quest), AR presence (iPhone ARKit + Apple Vision Pro), Looking Glass holographic display, multi-persona, OpenChain compute network, external repo learning, multi-model routing, voice calls and video presence.

Full roadmap →


Contributing

Zora is a labour of love. Free and open source — always will be.

This project is still relatively new, so expect some sharp edges. See known issues for current bugs — for example, TTS can crash on 16GB Macs when the brain model is also loaded (Metal GPU memory). That's where you come in. We want to build Zora with the community, not just for it. Whether it's fixing bugs, adding plugins, improving the brain, building new integrations, or just telling us what's broken — contributions are welcome and valued.

git clone https://github.com/Azkabanned/zora.git
cd zora
./install.sh
python3 scripts/seed_db.py    # populate with fake data for development
./start.sh

The seed script creates realistic fake contacts, conversations, messages, and notifications so you can develop without real integrations.


Philosophy

A well-designed system with a mediocre model beats a genius model in a poor system.

System over intelligence. Solve once, reuse forever. Minimal context, maximum clarity. Humans over tech.

Read the full vision →


Apache 2.0 • Built with MLX on Apple Silicon
Made by @Azkabanned
