GenosOS (GOS) — Agent Operating System

A standalone, local-first runtime for autonomous AI agents. Memory, multi-channel messaging, real-time voice, phone calls, and on-device language models — all running on your machine, with cloud as the exception, not the default.



⚠️ Apple Silicon only. GenosOS uses MLX for the voice pipeline (Qwen3-TTS + Qwen3-ASR), which runs exclusively on M-series chips. Intel Macs are not supported and will fail at the pip install step.

💾 Disk space: budget ~35 GB free — ~33 GB for the local model weights (downloaded on first run from HuggingFace), ~1 GB for JS deps + Chromium, ~1 GB headroom.

⏱️ Total setup time: ~30 min (fast fiber) to ~2 hours (slow connection). The bulk is the model download in the first-run wizard.

Table of Contents

  • What is GenosOS
  • Highlights
  • Requirements
  • Installation
  • First Run
  • Troubleshooting
  • Architecture Overview
  • License
  • Author

What is GenosOS

GenosOS is a server runtime that gives an LLM-driven agent the operating-system layer it needs to actually live in your workflow:

  • Persistent semantic memory — every conversation contributes to a knowledge graph; relevant context is injected automatically on each turn, with zero prompt overhead for memory mechanics.
  • Multi-channel presence — the same agent handles WhatsApp, Telegram, Discord, Slack, iMessage and browser chat, with channel-aware tool restrictions.
  • Real-time voice — browser microphone and direct SIP phone calls, fully local pipeline (Qwen3-ASR + Qwen 3.6 35B-A3B + Qwen3-TTS), ~1-1.5s round-trip on cached turns.
  • Multi-agent coordination — agent-to-agent delegation, multi-participant rooms with autonomous echo conversations, isolated per-agent memory and workspace.
  • Capability-based runtime sandbox — every tool call validated against declarative policies; workspace confinement, allowlisted shell commands, secret redaction.
  • Local-first by design — 5 local models cover chat, vision, voice and embeddings (~37 GB RAM). Cloud (Anthropic/OpenAI/Gemini) is opt-in via ⌘L.

Highlights

  • Encrypted memory at rest — AES-256-GCM derived from your own mnemonic, keys never leave the machine.
  • Per-account isolation — each Ethereum address is an independent account with its own encrypted database, workspace, and channel sessions.
  • WebAuthn + mnemonic identity — no passwords, no shared secrets, no cloud auth.
  • Direct SIP trunk for phone calls — no Twilio, no Cloudflare tunnel, no vendor lock-in.
  • Zero frameworks on the client — vanilla JS + DOM API + CSS. No build step required to develop the UI.
  • Full-state backup engine — immutable manifests, per-account iCloud sync, standard tar.gz + SQLite (no proprietary formats).

Requirements

  • Hardware: Apple Silicon Mac (M1/M2/M3/M4). 64 GB RAM recommended for the full local stack (32 GB works with one model loaded at a time).
  • OS: macOS 14+ (Sonoma or newer).
  • Disk: ~35 GB free (see warning above).
  • Runtime: Bun >=1.2.0.
  • Python: 3.11+ (for the Qwen3 MLX server — TTS + ASR).
  • System packages via Homebrew:
    • llama.cpp — chat + embedding models
    • ffmpeg — audio conversion for messaging channels
    • cliclick — Computer Use tool (optional, only needed if you want the agent to control your desktop)
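The requirements above can be checked mechanically before installing. A hypothetical preflight script (not part of the repo; the arm64 and ~35 GB thresholds come from the requirements above, and the tool names from the Homebrew list):

```shell
# preflight.sh — hypothetical sanity check before installing GenosOS.
ARCH="$(uname -m)"                                          # arm64 = Apple Silicon
FREE_GB="$(df -k / | awk 'NR==2 {print int($4/1048576)}')"  # free space on / in GB
[ "$ARCH" = "arm64" ] && echo "arch: OK" || echo "arch: FAIL ($ARCH is not Apple Silicon)"
[ "$FREE_GB" -ge 35 ] && echo "disk: OK (${FREE_GB} GB free)" || echo "disk: LOW (${FREE_GB} GB free, ~35 GB needed)"
for tool in bun llama-server ffmpeg python3; do
  command -v "$tool" >/dev/null 2>&1 && echo "$tool: found" || echo "$tool: missing (brew install?)"
done
```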

Installation

# 1. Clone
git clone https://github.com/estebanrfp/gos.git
cd gos

# 2. Install system dependencies (skip the ones you already have)
brew install bun llama.cpp ffmpeg cliclick python@3.11

# 3. Install JavaScript runtime dependencies
bun install

# 4. Install Python dependencies for the MLX TTS/ASR server
python3 -m pip install -r dist/qwen3-requirements.txt

# 5. Start GenosOS
bun start

Verify the install

While the server is running (after step 5), in another terminal:

curl -s -o /dev/null -w '%{http_code}\n' http://localhost:4400
# Expected: 200

A 200 confirms the server is up (it serves HTTP and WebSocket on the same port). If you see Connection refused, the server did not start — check the boot log printed by bun start.
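If you script this check, poll until the server answers rather than racing the boot. A small helper function (hypothetical, not shipped with GenosOS):

```shell
# wait_for_http URL TIMEOUT_SECONDS — poll once per second until URL returns HTTP 200.
wait_for_http() {
  url="$1"; timeout="${2:-60}"; i=0
  while [ "$i" -lt "$timeout" ]; do
    code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 2 "$url" || true)"
    [ "$code" = "200" ] && return 0
    i=$((i + 1)); sleep 1
  done
  return 1
}

# Usage: wait_for_http http://localhost:4400 60 && echo "GenosOS is up"
```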

First Run

Open http://localhost:4400 in your browser. The setup wizard will:

  1. Ask for a mnemonic passphrase. This is a BIP39-style 12- or 24-word seed used to derive your encryption key. The key never leaves your machine. If you don't have one, the wizard can generate one — write it down, because losing it means losing all your encrypted data permanently.
  2. Pick an LLM provider for the (optional) cloud boost — Anthropic, OpenAI, or Gemini. You can skip this entirely and stay 100% local.
  3. Auto-download local GGUF models from HuggingFace (~33 GB total). Progress bars per model. Resumable if interrupted.
  4. Start llama-server for chat + embeddings and launch the Python MLX server for voice.
  5. Land you in your first agent's chat session. A short onboarding conversation will configure the agent's soul, identity, user context, and rules.

After the first run, every subsequent boot is fast (~5-10 seconds): models load once, channels reconnect automatically.

Troubleshooting

bun: command not found after brew install bun → Open a new terminal window so the shell picks up the new PATH entry.

Failed to start server. Is port 4400 in use? → Another process is bound to :4400. Find it with lsof -i :4400 and stop it, or start on a different port with PORT=4500 bun start.

pip install fails with error: externally-managed-environment on macOS 14+ → Use a virtual environment: python3 -m venv .venv && source .venv/bin/activate && python3 -m pip install -r dist/qwen3-requirements.txt. After this, always activate the venv before running bun start so the MLX server can find its packages.

mlx install fails with architecture errors → You are on an Intel Mac. GenosOS does not support Intel — only Apple Silicon (M1/M2/M3/M4).

HuggingFace download stalls or fails → Check your network. Downloads are resumable: stop GenosOS and restart it, the wizard picks up where it left off via HTTP Range requests and .part temp files.
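The resume mechanic is plain byte offsets: the wizard keeps the partial file, then requests bytes from its current size onward. The same idea, simulated locally (file names are illustrative, not the wizard's actual paths):

```shell
# Simulate an interrupted + resumed download using a .part file.
printf 'all thirty-three gigabytes of weights' > full.bin   # stand-in for the remote file
head -c 10 full.bin > model.gguf.part                       # "interrupted" after 10 bytes
OFFSET="$(wc -c < model.gguf.part | tr -d ' ')"             # -> would become Range: bytes=10-
tail -c "+$((OFFSET + 1))" full.bin >> model.gguf.part      # resume from the offset
cmp -s full.bin model.gguf.part && echo "resume: OK"        # -> resume: OK
```

With curl, the equivalent of a resumed fetch is curl -L -C - -o model.gguf.part <url>.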

Bundle boots but voice / channels don't work → Make sure llama-server is on your PATH (which llama-server should print a path under /opt/homebrew/). Check the boot log of bun start for [local] ... and [qwen3] ... lines confirming the model subprocesses started.

I forgot my mnemonic → There is no recovery. Your encrypted data is unrecoverable. Delete ~/.genos/ and start fresh, or restore from a backup if you made one (see the Backup feature in-app).

Architecture Overview

┌─────────────────────────────────────────────────────────────┐
│  Channels: WhatsApp · Telegram · Discord · Slack · iMessage │
│  Voice: Browser mic (Talk Local) · SIP phone calls          │
│  UI: Browser chat at http://localhost:4400                  │
└─────────────────────────────┬───────────────────────────────┘
                              │
                  ┌───────────▼───────────┐
                  │   GenosOS Server      │  Bun, single process,
                  │   :4400 (WS + HTTP)   │  encrypted SQLite store
                  └───────────┬───────────┘
                              │
        ┌─────────────────────┼─────────────────────┐
        │                     │                     │
  ┌─────▼─────┐         ┌─────▼─────┐         ┌─────▼─────┐
  │ llama-    │         │ qwen3-    │         │ node-     │
  │ server    │         │ server.py │         │ llama-cpp │
  │ :8081     │         │ :8890     │         │ (embed)   │
  │           │         │           │         │ in-proc   │
  │ Qwen 3.6  │         │ Qwen3-TTS │         │ Qwen3-Emb │
  │ 35B-A3B   │         │ Qwen3-ASR │         │           │
  └───────────┘         └───────────┘         └───────────┘

All intelligence, all encryption, all routing — runs locally. Cloud APIs (Anthropic, OpenAI, Gemini) are available as an opt-in boost (⌘L) but are not required for any feature.
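To see which subprocesses from the diagram are actually up, you can probe their ports. A hedged sketch — whether each subprocess answers plain GET / with a meaningful status is an assumption, so this only checks that the ports respond at all:

```shell
# Print the HTTP status a local port returns, or 000 if unreachable.
probe() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 2 "http://localhost:$1/" || true
}

# 4400 = GenosOS server, 8081 = llama-server, 8890 = qwen3-server.py
for port in 4400 8081 8890; do
  echo "port $port -> HTTP $(probe "$port")"
done
```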

License

The minified production build in dist/ is free for personal and commercial use — integrate, distribute, resell. The source code is proprietary; reverse-engineering, decompilation, or modification of the bundle is not permitted.

See LICENSE for the full terms.

Author

Esteban Fuster Pozzi (@estebanrfp) — Full Stack JavaScript Developer