Orbis

The AI agent that listens, speaks, and acts.
Open-source, multi-modal AI agent platform with voice, text, and 6+ channel integrations.
8 cognition layers. 7+ security layers. Your data stays on your machine.

Quick Start | Features | Architecture | Security | Configuration | Contributing

npm License: MIT Node.js 20+ TypeScript Docker PostgreSQL LLM Providers


What is Orbis?

Orbis is a self-hosted AI agent that goes beyond chat. It can browse the web, send emails, manage your CRM, schedule tasks, scrape data, handle files, and remember everything — across voice, text, and 6+ messaging channels. It runs on your machine with your choice of LLM provider, and all your data stays in your local PostgreSQL database.

Unlike cloud-based AI assistants, Orbis gives you full control: choose your LLM (Claude, GPT-4, Llama, Groq), enable cognitive layers for deeper reasoning, and extend capabilities through a modular plugin system. Security is built in from day one — not bolted on.


🧬 PGA — Prompts Genómicos Autoevolutivos

World-first innovation created by Luis Alfredo Velasquez Duran (Germany, 2025)

Orbis is the first AI agent in the world to use Genomic Self-Evolving Prompts (PGA), a revolutionary system of continuous self-improvement inspired by biological evolution.

What is PGA?

Unlike other AI agents that use static prompts, Orbis evolves its own instructions through a three-layer system:

  • Layer 0: Immutable DNA (Security & Ethics — never changes)
  • Layer 1: Operative Genes (Technical Skills — slow evolution)
  • Layer 2: Epigenomes (Preferences & Style — rapid evolution)

Each interaction trains the agent, making it smarter, more precise, and more adapted to YOUR way of working.

The Evolution Process

Transcription → Variation → Simulation (Sandbox) → Selection

Before adopting any change, Orbis tests it in a sandbox and keeps it only if it performs better than the current prompt. A candidate that scores worse is never adopted.
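
The selection step can be pictured roughly as follows. This is a minimal sketch and not the PGA implementation: `Scorer` and `selectPrompt` are hypothetical names, and the scoring callback stands in for whatever evaluation the sandbox actually runs.

```typescript
// Hypothetical sketch of "adopt a variant only if it scores better in the sandbox".
// None of these names come from the Orbis codebase.
type Scorer = (prompt: string) => number;

function selectPrompt(current: string, variants: string[], scoreInSandbox: Scorer): string {
    let best = current;
    let bestScore = scoreInSandbox(current);
    for (const variant of variants) {
        const score = scoreInSandbox(variant);
        // Strictly greater: on a tie the existing prompt is kept.
        if (score > bestScore) {
            best = variant;
            bestScore = score;
        }
    }
    return best;
}
```

Because the comparison is strict, the current prompt survives unless a variant is measurably better under the chosen score.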

Per-User Personalization

Your Orbis is not the same as another user's Orbis. The system builds a unique "User DNA" that captures:

  • Your preferred communication style
  • Your most-used tools
  • Success/failure patterns specific to your domain
  • Temporal preferences (morning/evening workflows)

Result: An agent that feels like it was designed specifically for you.

Learn more: PGA Technical Documentation


© 2025 Luis Alfredo Velasquez Duran — PGA is a unique innovation of Orbis.


Features

Channels & Interfaces

| Feature | Description |
|---|---|
| Web UI | Dashboard with chat, voice, vision, CRM, knowledge base, and built-in help |
| Voice | Full voice pipeline: STT, TTS, wake word detection, voice activity detection |
| Telegram | Bot integration via Grammy |
| Discord | Server/DM support, slash commands via discord.js |
| WhatsApp | QR-code pairing, no API key needed (Baileys) |
| Slack | Socket Mode, thread-aware sessions, slash commands |
| Desktop | Native Electron app for macOS, Windows, Linux with auto-updates |
| REST API | Full HTTP + WebSocket API for custom integrations |

Cognition

| Feature | Description |
|---|---|
| Chain-of-Thought | Agent reasons in <thinking> blocks before responding |
| Reflective Memory | Post-conversation analysis extracts insights and lessons |
| Self-Evaluation | Agent scores its own performance and tracks improvement |
| Knowledge Graphs | Entity-relationship memory (people, companies, projects) with pgvector |
| Autonomous Planning | Multi-step task decomposition with dependency tracking |
| Meta-Cognition | Self-improvement from historical performance patterns |
| Proactive Engine | Detects patterns and suggests actions before you ask |
| Workspace Learning | Per-workspace behavioral adaptation over time |

Security

| Feature | Description |
|---|---|
| JWT Auth | Access tokens in memory + refresh tokens in httpOnly cookies |
| PII Redaction | Auto-detects and masks SSNs, credit cards, phone numbers before storage |
| Prompt Injection Guard | 12-rule detection engine with boundary markers and tool output sanitization |
| Brute Force Protection | Progressive lockout with exponential backoff |
| Sandbox Mode | Docker-isolated tool execution with resource limits |
| Security Headers | CSP, HSTS, X-Frame-Options, X-Content-Type-Options |
| Audit Logging | Every tool call logged with input, output, risk level, duration |
| SSRF Prevention | Blocks requests to private IP ranges and internal networks |

Tools & Plugins

| Feature | Description |
|---|---|
| Browser | Navigate, click, type, extract, screenshot (Playwright) |
| Email | Send, read, list, delete with per-user encrypted SMTP config |
| Search | Google Custom Search integration |
| Files | Read, write, delete, CSV parse and generate |
| Scheduler | Cron jobs, one-time tasks, recurring reminders |
| CRM | Contacts, deals, invoices, activities (HubSpot/Salesforce) |
| Scraper | Advanced web scraping: tables, links, media extraction |

Advanced

| Feature | Description |
|---|---|
| Multi-Agent | Orchestrator + 5 specialist agents (researcher, writer, analyst, coder, reviewer) |
| Live Canvas | Real-time collaborative documents streamed via WebSocket |
| OrbisHub | Community skill registry with checksum verification |
| Skill Forge | Agent learns to create new skills from usage patterns |
| Wake Word | STT-based detection with fuzzy matching ("Hey Orbis") |
| Knowledge Base | Upload documents (PDF, TXT, MD, CSV) for RAG; the agent queries your docs |
| Context Compaction | Automatic conversation summarization to manage token limits |

Quick Start

Prerequisites: Node.js 20+ and Docker Desktop

Option A: One command (recommended)

npx create-orbis my-agent

The interactive wizard handles everything:

  1. Prerequisites check — verifies Node.js, Git, Docker
  2. Deployment — local machine or VPS
  3. LLM provider — Anthropic, OpenAI, Ollama, or Groq
  4. Channels — Web, Voice, WhatsApp, Discord, Slack, Telegram
  5. Cognition — enable thinking, reflection, knowledge graphs, planning
  6. Database — Docker (automatic) or external PostgreSQL
  7. Owner account — email + password (single-tenant, one owner per instance)
  8. Agent personalization — name, tone, language, custom instructions

After setup:

cd my-agent
npm run dev

Open http://localhost:3002, log in, and start chatting with your agent.

Option B: Clone and set up

Clone the repository and run the onboarding script:

git clone https://github.com/orbis-agent/orbis.git
cd orbis && npm run onboard

The onboarding script automatically:

  • ✅ Starts PostgreSQL database (Docker) if not running
  • ✅ Creates your owner account with secure password
  • ✅ Configures agent personalization (name, tone, language)
  • ✅ Provides next steps to start using Orbis

After onboarding, start Orbis:

npm run dev

Open http://localhost:3002 and log in with your credentials.


Alternative Setup Methods

| Method | Command | Best For |
|---|---|---|
| Automatic Onboarding (recommended) | npm run onboard | One command: starts DB, creates account, zero config |
| Interactive Setup | bash setup.sh | Full customization with wizard (channels, cognition, email) |
| Quick Setup | bash setup.sh --quick | Fast setup: defaults, only asks for an API key |
| Docker-only | bash setup.sh --docker | Production deployment, no local Node.js needed |
| Full Docker | docker compose up -d --build | When .env is already configured |

Manual Development Setup

If you prefer setting things up step by step:

# 1. Install dependencies
npm install

# 2. Configure environment
cp .env.example .env
# Edit .env with your API keys and preferences

# 3. Start database
npm run db:up

# 4. Build all packages
npm run build

# 5. Start development servers
npm run dev

Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                         User Interfaces                             │
│                                                                     │
│   Web UI      Voice     Telegram   Discord   WhatsApp   Slack  API  │
│             (STT/TTS)    (Bot)      (Bot)   (Baileys)  (Bolt)       │
│   :3002       :3001                                                 │
└──────────────────────────────┬──────────────────────────────────────┘
                               │ HTTP / WebSocket
┌──────────────────────────────▼──────────────────────────────────────┐
│                    @orbis/core — Agent Brain                        │
│                       localhost:3000                                 │
│                                                                     │
│  ┌──────────┐  ┌──────────────┐  ┌──────────┐  ┌───────────────┐  │
│  │  Agent    │  │  Cognition   │  │ Security │  │   Plugins     │  │
│  │  Loop     │  │  Layers (8)  │  │ Gate (7+)│  │   (8 built-in)│  │
│  │          │  │              │  │          │  │               │  │
│  │ Reason → │  │ Think →     │  │ Auth →   │  │ Browser →    │  │
│  │ Plan →   │  │ Reflect →   │  │ PII →    │  │ Email →      │  │
│  │ Execute →│  │ Evaluate →  │  │ Prompt → │  │ Search →     │  │
│  │ Remember │  │ Learn       │  │ Sandbox  │  │ Files →      │  │
│  └──────────┘  └──────────────┘  └──────────┘  │ Scheduler →  │  │
│                                                  │ CRM →        │  │
│  ┌──────────────────────────────────────────┐   │ Scraper      │  │
│  │  Multi-Agent Orchestrator                │   └───────────────┘  │
│  │  5 specialists: researcher, writer,      │                      │
│  │  analyst, coder, reviewer                │   ┌───────────────┐  │
│  └──────────────────────────────────────────┘   │  OrbisHub     │  │
│                                                  │  Skill        │  │
│  ┌──────────────────────────────────────────┐   │  Registry     │  │
│  │  Live Canvas — Real-time documents       │   └───────────────┘  │
│  └──────────────────────────────────────────┘                      │
└──────────────────────────────┬──────────────────────────────────────┘
                               │
┌──────────────────────────────▼──────────────────────────────────────┐
│              PostgreSQL 16 + pgvector                                │
│                                                                     │
│  Users │ Sessions │ Messages │ Memories (embeddings) │ Audit Log   │
│  Knowledge Graph │ Performance Log │ Scheduled Tasks │ Canvas Docs │
└─────────────────────────────────────────────────────────────────────┘

Monorepo Structure

Orbis uses a Turborepo monorepo with npm workspaces:

orbis/
├── packages/
│   ├── core/           # Agent engine, API server, auth, memory, plugins
│   ├── shared/         # Shared TypeScript types and utilities
│   ├── web/            # React frontend (Vite)
│   ├── voice/          # Voice pipeline server (STT + TTS + VAD)
│   ├── telegram/       # Telegram bot (Grammy)
│   ├── discord/        # Discord bot (discord.js v14)
│   ├── whatsapp/       # WhatsApp integration (Baileys)
│   ├── slack/          # Slack bot (@slack/bolt)
│   ├── desktop/        # Electron desktop app
│   └── create-orbis/   # Project scaffolding CLI
├── plugins/
│   ├── plugin-browser/     # Web automation (Playwright)
│   ├── plugin-email/       # Email send/receive (Nodemailer)
│   ├── plugin-search/      # Google Custom Search
│   ├── plugin-files/       # File I/O + CSV processing
│   ├── plugin-scheduler/   # Cron jobs + reminders
│   ├── plugin-crm/         # CRM integration
│   ├── plugin-scraper/     # Advanced web scraping
│   └── plugin-template/    # Boilerplate for custom plugins
├── setup.sh                # Interactive setup wizard
├── docker-compose.yml      # PostgreSQL + all services
├── turbo.json              # Turborepo pipeline config
└── package.json            # Monorepo root

Data Flow

User message → Auth middleware → Agent loop:
  1. Load context (memories, history, preferences)
  2. Cognition: thinking, planning (if enabled)
  3. LLM call (Claude/GPT-4/Llama/Groq)
  4. Tool execution (if needed) → security gate → plugin
  5. Cognition: reflection, evaluation (async, if enabled)
  6. Store message + update memories
  → Response to user

Channels & Interfaces

Web UI

The default interface at http://localhost:3002. Features text chat, voice chat with push-to-talk, vision (camera input), knowledge base (document upload + RAG), and a full CRM sidebar (dashboard, contacts, calendar, invoices, deals, settings, help).

npm run web    # Start web UI only

Voice

Full voice pipeline with configurable providers:

  • STT (Speech-to-Text): Whisper (local, free), Deepgram (cloud), Google Speech (cloud)
  • TTS (Text-to-Speech): Kokoro (local, free), ElevenLabs (cloud), OpenAI TTS (cloud)
  • Wake Word: Say "Orbis" (configurable) to activate — STT-based with fuzzy matching
  • VAD: Voice Activity Detection for natural conversation flow

npm run voice    # Start voice server only

Telegram

# 1. Create a bot with @BotFather, get the token
# 2. Add to .env:
TELEGRAM_BOT_TOKEN=your-token-here

# 3. Start:
npm run telegram

# Or with Docker:
docker compose --profile telegram up -d

Discord

# 1. Create a Discord app at https://discord.com/developers
# 2. Add to .env:
DISCORD_BOT_TOKEN=your-token-here
DISCORD_CLIENT_ID=your-client-id

# 3. Start:
npm run discord

Supports slash commands (/ask, /tools, /reset), DMs, and @mentions.

WhatsApp

# No API key needed — uses QR code pairing
npm run whatsapp
# Scan the QR code in your terminal with WhatsApp

Uses the Baileys library (open-source WhatsApp Web protocol). Supports text and audio messages.

Slack

# 1. Create a Slack app at https://api.slack.com/apps
# 2. Add to .env:
SLACK_BOT_TOKEN=xoxb-your-token
SLACK_APP_TOKEN=xapp-your-token    # For Socket Mode

# 3. Start:
npm run slack

Provides thread-aware sessions plus the /orbis and /orbis-tools slash commands.

Desktop

Native desktop app built with Electron:

npm run desktop

Features: system tray, Ctrl+Shift+O global shortcut, orbis:// protocol handler, minimize to tray. Builds for macOS (DMG), Windows (NSIS), and Linux (AppImage).

REST API

For custom integrations:

# Create a session
curl -X POST http://localhost:3000/api/sessions \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"channel": "api"}'

# Send a message
curl -X POST http://localhost:3000/api/sessions/$SESSION_ID/messages \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"content": "Search for the latest news about AI"}'

WebSocket streaming is also available at ws://localhost:3000/ws/$SESSION_ID.
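
The two curl calls above translate to TypeScript along these lines. Treat it as a sketch under assumptions: the endpoints and request bodies come from the examples above, but the response shape is not documented here, so the `id` field on the session response is a guess, and `FetchLike`, `postJson`, and `ask` are illustrative names.

```typescript
// Minimal fetch-compatible signature, so the sketch works with the global
// fetch (Node.js 18+) or with a stub in tests.
type FetchLike = (
    url: string,
    init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

async function postJson(fetchFn: FetchLike, url: string, token: string, body: unknown): Promise<unknown> {
    const res = await fetchFn(url, {
        method: "POST",
        headers: { Authorization: `Bearer ${token}`, "Content-Type": "application/json" },
        body: JSON.stringify(body),
    });
    if (!res.ok) throw new Error(`Request to ${url} failed with HTTP ${res.status}`);
    return res.json();
}

// ASSUMPTION: the session response carries its identifier in an `id` field.
async function ask(fetchFn: FetchLike, baseUrl: string, token: string, content: string): Promise<unknown> {
    const session = (await postJson(fetchFn, `${baseUrl}/api/sessions`, token, { channel: "api" })) as { id: string };
    return postJson(fetchFn, `${baseUrl}/api/sessions/${session.id}/messages`, token, { content });
}
```

A call would look like `ask(fetch, "http://localhost:3000", token, "Search for the latest news about AI")`. Passing the fetch function in keeps the helper easy to test without a running server.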


Cognition Layers

Orbis features 8 opt-in cognitive layers that enhance the agent's reasoning, memory, and self-improvement capabilities. Each layer is independently togglable via environment variables. When disabled, they add zero overhead.

Overview

| Layer | Env Variable | What It Does |
|---|---|---|
| Chain-of-Thought | COGNITION_THINKING=true | Agent reasons in <thinking> blocks before responding |
| Reflective Memory | COGNITION_REFLECTION=true | Post-conversation analysis extracts insights |
| Self-Evaluation | COGNITION_EVALUATION=true | Agent scores its performance (1-10) after tool use |
| Knowledge Graphs | COGNITION_KNOWLEDGE_GRAPH=true | Entity-relationship memory (people, companies, projects) |
| Autonomous Planning | COGNITION_PLANNING=true | Multi-step task decomposition with dependency tracking |
| Meta-Cognition | COGNITION_METACOGNITION=true | Self-improvement from historical performance patterns |
| Proactive Engine | COGNITION_PROACTIVE=true | Detects patterns and suggests actions proactively |
| Workspace Learning | COGNITION_WORKSPACE_LEARNING=true | Per-workspace adaptive behavior |

How They Work

Chain-of-Thought Thinking — The agent reasons inside <thinking> tags before generating a response. These blocks are extracted and stored in the database but never shown to the user. Improves performance on complex reasoning tasks.

Reflective Memory — After conversations with tool use or 8+ messages (configurable via COGNITION_REFLECTION_MIN_MESSAGES), the agent analyzes the conversation asynchronously. Insights are stored as searchable "reflection" memories with pgvector embeddings for semantic recall.

Self-Evaluation — After executing tools, the agent scores its own performance on a 1-10 scale. Tracks tool success rates, failure patterns, and lessons learned. Results are stored in the performance_log table.

Knowledge Graphs — Goes beyond text memory to capture entity-relationship structures. When the agent learns about people, companies, or projects, it stores them as graph nodes with typed relationships. Enables queries like "who works at company X?" or "what projects is person Y involved in?"

Autonomous Planning — For complex multi-step tasks, the agent creates a TaskPlan with goals, steps, and dependencies. It executes steps in order, tracks progress, and replans if a step fails.

Meta-Cognition — The agent learns from its own performance logs. It identifies patterns (which tools work best for which tasks, when search is better than browsing, etc.) and adjusts future behavior accordingly. Runs asynchronously with zero latency impact.

Proactive Engine — Detects recurring patterns from conversation history and suggests relevant actions before the user asks. Returns ProactiveSuggestion objects with confidence scores.

Workspace Learning — Adapts behavior per workspace over time. Tracks user preferences, tool effectiveness, and domain-specific strategies to improve personalization.
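
To make the dependency tracking in Autonomous Planning concrete, here is a small sketch of ordering plan steps so that each one runs after the steps it depends on. The `PlanStep` shape and `executionOrder` are hypothetical; the text names a TaskPlan with goals, steps, and dependencies but does not publish its fields.

```typescript
// Hypothetical step shape, for illustration only.
interface PlanStep {
    id: string;
    dependsOn: string[];
}

// Returns step ids in an order where every step comes after its dependencies
// (a simple round-by-round topological ordering). Throws on unknown or
// cyclic dependencies.
function executionOrder(steps: PlanStep[]): string[] {
    const known = steps.map((s) => s.id);
    for (const step of steps) {
        for (const dep of step.dependsOn) {
            if (known.indexOf(dep) === -1) throw new Error(`Unknown dependency: ${dep}`);
        }
    }
    const order: string[] = [];
    let pending = steps.slice();
    while (pending.length > 0) {
        // A step is ready once all of its dependencies are already in `order`.
        const ready = pending.filter((s) => s.dependsOn.every((d) => order.indexOf(d) !== -1));
        if (ready.length === 0) throw new Error("Cyclic dependencies in plan");
        for (const s of ready) order.push(s.id);
        pending = pending.filter((s) => ready.indexOf(s) === -1);
    }
    return order;
}
```

Steps that become ready in the same round have no dependency on each other, which is also the set a planner could run concurrently.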

Enable All Cognition Layers

# Add to .env:
COGNITION_THINKING=true
COGNITION_REFLECTION=true
COGNITION_EVALUATION=true
COGNITION_KNOWLEDGE_GRAPH=true
COGNITION_PLANNING=true
COGNITION_METACOGNITION=true
COGNITION_PROACTIVE=true
COGNITION_WORKSPACE_LEARNING=true
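
For readers wiring these flags into their own tooling, the sketch below shows one way to read them. It is an illustration, not Orbis's loader: `loadCognitionFlags` is a hypothetical name, and treating only the literal string "true" as enabled is an assumption consistent with the documented default of false.

```typescript
const COGNITION_FLAGS = [
    "COGNITION_THINKING",
    "COGNITION_REFLECTION",
    "COGNITION_EVALUATION",
    "COGNITION_KNOWLEDGE_GRAPH",
    "COGNITION_PLANNING",
    "COGNITION_METACOGNITION",
    "COGNITION_PROACTIVE",
    "COGNITION_WORKSPACE_LEARNING",
] as const;

type CognitionFlag = (typeof COGNITION_FLAGS)[number];

// ASSUMPTION: only the exact value "true" enables a layer; unset or any
// other value leaves it disabled.
function loadCognitionFlags(env: Record<string, string | undefined>): Record<CognitionFlag, boolean> {
    const flags = {} as Record<CognitionFlag, boolean>;
    for (const name of COGNITION_FLAGS) flags[name] = env[name] === "true";
    return flags;
}
```

In a Node.js process this would typically be called as `loadCognitionFlags(process.env)`.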

Security

Security in Orbis is on by default — not an afterthought. Multiple layers work together to protect your data, your users, and your infrastructure.

Authentication & Authorization

  • PBKDF2-SHA512 password hashing with 100,000 iterations
  • JWT dual-token pattern: short-lived access tokens (held in memory only, never written to browser storage, which limits what an XSS payload can steal) + long-lived refresh tokens (httpOnly cookies, not readable from JavaScript)
  • Single-tenant architecture: One owner per instance, registration closed after first user
  • WebSocket authentication: token verified via first message (never in URL)
  • Automatic token refresh: 5 minutes before expiry, with mutex to prevent race conditions
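
The refresh behavior in the last bullet can be sketched as below. This is an assumption-laden illustration rather than the Orbis client code: `singleFlight` and `refreshDelayMs` are hypothetical helpers, and the "mutex" is modeled as a shared in-flight promise.

```typescript
// Wraps a refresh function so that concurrent callers share one in-flight
// request instead of each starting their own (avoids refresh races).
function singleFlight<T>(refresh: () => Promise<T>): () => Promise<T> {
    let inFlight: Promise<T> | null = null;
    return () => {
        if (inFlight !== null) return inFlight;
        const pending = refresh().then(
            (value) => {
                inFlight = null; // allow the next refresh once this one settles
                return value;
            },
            (error) => {
                inFlight = null;
                throw error;
            },
        );
        inFlight = pending;
        return pending;
    };
}

// Milliseconds to wait before refreshing, given the 5-minute lead time
// described above. Returns 0 if the token is already inside that window.
function refreshDelayMs(expiresAtMs: number, nowMs: number): number {
    const leadMs = 5 * 60 * 1000;
    return Math.max(0, expiresAtMs - nowMs - leadMs);
}
```

Two components that notice an expiring token at the same moment both receive the same promise, so only one refresh request reaches the server.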

Input Protection

  • PII Detection & Redaction: Automatically detects and masks SSNs (with validation), credit cards (Luhn check), phone numbers (US + international), and emails before database storage. Enabled by default (SECURITY_PII_REDACTION=true).
  • Prompt Injection Guard: 12-rule pattern detection engine that identifies injection attempts (ignore_previous, system_role, jailbreak, XML injection, prompt reveal, tool override, etc.). User content is wrapped in boundary delimiters. Tool outputs are sanitized before re-injection. Mode: warn (log only) or strict (block high severity).
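
As an illustration of Luhn-checked card redaction, the sketch below masks digit runs only when they pass the checksum. The Luhn algorithm itself is standard; the regular expression, the `[REDACTED_CARD]` placeholder, and the function names are assumptions made for this example and are not taken from Orbis.

```typescript
// Standard Luhn checksum: double every second digit counting from the right,
// subtract 9 from results above 9, and require the total to end in 0.
function passesLuhn(digits: string): boolean {
    let sum = 0;
    let doubleNext = false;
    for (let i = digits.length - 1; i >= 0; i--) {
        let d = digits.charCodeAt(i) - 48;
        if (d < 0 || d > 9) return false;
        if (doubleNext) {
            d *= 2;
            if (d > 9) d -= 9;
        }
        sum += d;
        doubleNext = !doubleNext;
    }
    return digits.length > 0 && sum % 10 === 0;
}

// ASSUMPTION: 13 to 19 digits with optional space or hyphen separators.
function redactCards(text: string): string {
    return text.replace(/\b\d(?:[ -]?\d){12,18}\b/g, (match) =>
        passesLuhn(match.replace(/[ -]/g, "")) ? "[REDACTED_CARD]" : match,
    );
}
```

Validating with Luhn before masking keeps order numbers and other long digit strings from being redacted by accident.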

Network Security

  • HTTP Security Headers: X-Frame-Options (DENY), X-Content-Type-Options (nosniff), Strict-Transport-Security (1 year, production), Content-Security-Policy, Referrer-Policy, Permissions-Policy
  • Strict CORS: In production, cross-origin requests are blocked unless CORS_ORIGINS is explicitly configured. No wildcard defaults.
  • Rate Limiting: Configurable per-user/IP limit (default: 60 requests/minute)
  • SSRF Prevention: SecurityGate blocks requests to private IP ranges (10.x, 172.16-31.x, 192.168.x, localhost, IPv6 loopback) with numeric IP parsing to prevent bypass attempts
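
The idea behind numeric IP parsing is that an address such as 127.0.0.1 can also be written as the single integer 2130706433, which a naive string check would miss. The following sketch shows the principle only; it is not the SecurityGate code, the function names are hypothetical, and it deliberately leaves out hex and octal notations, DNS resolution, and most IPv6 forms.

```typescript
// Parses dotted-quad ("192.168.0.1") or single-integer ("3232235521") IPv4
// notation into a number. Returns null for anything else.
function parseIPv4(host: string): number | null {
    if (/^\d+$/.test(host)) {
        const n = Number(host);
        return n <= 0xffffffff ? n : null;
    }
    const parts = host.split(".");
    if (parts.length !== 4) return null;
    let value = 0;
    for (const part of parts) {
        if (!/^\d{1,3}$/.test(part)) return null;
        const octet = Number(part);
        if (octet > 255) return null;
        value = value * 256 + octet;
    }
    return value;
}

// True when `ip` falls inside the CIDR block base/prefixBits.
function inRange(ip: number, base: string, prefixBits: number): boolean {
    const baseValue = parseIPv4(base);
    if (baseValue === null) return false;
    const size = Math.pow(2, 32 - prefixBits);
    return Math.floor(ip / size) === Math.floor(baseValue / size);
}

function isBlockedHost(host: string): boolean {
    const name = host.toLowerCase();
    if (name === "localhost" || name === "::1" || name === "[::1]") return true;
    const ip = parseIPv4(name);
    if (ip === null) return false; // other hostnames would need DNS resolution first
    return (
        inRange(ip, "10.0.0.0", 8) ||
        inRange(ip, "172.16.0.0", 12) ||
        inRange(ip, "192.168.0.0", 16) ||
        inRange(ip, "127.0.0.0", 8)
    );
}
```

In practice it helps to normalize the hostname with a URL parser before a check like this, since that step already rewrites several alternative spellings into dotted-quad form.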

Execution Security

  • Sandbox Mode: Execute tools inside ephemeral Docker containers with resource limits (CPU, memory, PID count), read-only root filesystem, writable /tmp only, network isolation, no privilege escalation (--cap-drop ALL, --security-opt no-new-privileges), and automatic cleanup
  • Tool Approval Flow: High-risk tools require explicit user approval before execution
  • Audit Logging: Every tool call is logged with input, output, risk level, approval status, and duration

Brute Force Protection

  • Per-email lockout: After 5 failed attempts (configurable), progressive delay with exponential backoff (5min base, doubles per excess, caps at ~2.7 hours)
  • Per-IP rate limiting: 10 login attempts per minute
  • Fail-open design: If the guard itself errors, the login attempt is allowed (availability over security for the guard layer)
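
The lockout arithmetic can be worked through in a few lines. This is a sketch under stated assumptions: `lockoutSeconds` is a hypothetical function, and the cap of five doublings is inferred from the figures above, since 300 s × 2^5 = 9600 s, which is about 2.7 hours.

```typescript
// failedAttempts: consecutive failed logins for one email address.
// ASSUMPTION: the lockout starts at the 5th failure and the doubling stops
// after 5 excess attempts, matching the "~2.7 hours" cap.
function lockoutSeconds(
    failedAttempts: number,
    maxAttempts = 5,
    baseSeconds = 300,
    maxDoublings = 5,
): number {
    if (failedAttempts < maxAttempts) return 0;
    const excess = failedAttempts - maxAttempts;
    return baseSeconds * Math.pow(2, Math.min(excess, maxDoublings));
}
```

With the defaults, the 5th failure locks the account for 5 minutes, the 6th for 10, the 7th for 20, and every failure from the 10th onward for 9600 seconds.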

Error Handling

  • Production mode: Generic error messages with error codes — no stack traces, no database details leaked
  • Development mode: Full error messages with stack traces for debugging

Data Protection

  • Email credentials: Per-user SMTP passwords encrypted with pgcrypto before storage
  • Access tokens: Stored only in JavaScript closures (never localStorage, never cookies, never sessionStorage)
  • Parameterized SQL: All database queries use parameterized statements — no SQL injection vectors

Comparison

| Security Layer | Typical AI Agent | Orbis |
|---|---|---|
| Password Storage | Plain/MD5 | PBKDF2-SHA512 (100K iterations) |
| Token Security | localStorage | In-memory only + httpOnly cookies |
| PII Protection | None | Auto-detection + Luhn-validated redaction |
| Prompt Injection | None | 12-rule engine + boundary markers |
| SSRF | None | Private IP blocking + numeric parsing |
| Tool Execution | Direct | Sandboxed Docker containers |
| HTTP Headers | None | CSP + HSTS + X-Frame + X-Content-Type |
| Brute Force | None | Progressive lockout + IP rate limiting |
| Audit Trail | None | Full tool call logging with risk levels |

Plugins & Tools

Orbis uses a modular plugin system. Plugins are auto-discovered at startup — just add a package to the plugins/ directory.

Built-in Plugins

| Plugin | Tools | Description |
|---|---|---|
| Browser | browser_navigate, browser_click, browser_type, browser_extract, browser_scroll, browser_screenshot | Headless browser automation with Playwright. Session-isolated contexts, configurable viewport. |
| Email | email_send, email_list, email_read, email_delete | Email management with Nodemailer. Per-user encrypted SMTP config. Supports Gmail, Outlook, Yahoo, iCloud, Zoho, custom SMTP. |
| Search | google_search | Web search via Google Custom Search API. |
| Files | file_read, file_list, file_generate, file_delete, csv_parse, csv_generate | File I/O with path traversal prevention, MIME allowlisting, size limits, and per-user isolated storage. |
| Scheduler | schedule_task, schedule_list, schedule_cancel, reminder_set | Cron job scheduling, recurring tasks, one-time reminders. Full cron expression syntax. |
| CRM | crm_create_contact, crm_update_contact, crm_list_deals, crm_log_activity | Customer relationship management. Multi-provider support (HubSpot, Salesforce, custom). |
| Scraper | scrape_html, scrape_table, scrape_links, scrape_media | Advanced web scraping with CSS selectors, JavaScript evaluation, and element interaction. |
| Office | excel_create, excel_set_cells, excel_add_sheet, excel_auto_fit, excel_merge_cells, excel_read | Excel spreadsheet automation. Create workbooks, format cells, use formulas, manage multi-sheet files for data analysis and reporting. |

Advanced Features

Multi-Agent Orchestrator

Break complex tasks into subtasks handled by specialist agents:

  • Researcher — information gathering and web search
  • Writer — content creation and editing
  • Analyst — data analysis and interpretation
  • Coder — code generation and review
  • Reviewer — quality assurance and validation

The orchestrator creates a delegation plan, runs independent tasks concurrently, and synthesizes results into a final response. Uses a star topology (agents communicate through the orchestrator, not with each other).

Live Canvas

Real-time collaborative documents that the agent creates and edits while you watch:

  • Content types: Markdown, code (with syntax highlighting), JSON, HTML
  • Operations: replace all, append, insert at position, replace section
  • Streamed via WebSocket — updates render instantly in the browser
  • Multiple canvases per session, version history

Tools: canvas_create, canvas_edit, canvas_read, canvas_list, canvas_delete

OrbisHub — Skill Registry

A community-driven marketplace for extending Orbis capabilities:

  • Discover skills by name, tag, or category
  • Install from Git repos or tarballs with checksum verification
  • Publish your own skills to the registry
  • Local caching at ~/.orbis/hub with 15-minute TTL
  • Configurable registry URL (default: GitHub-hosted index)
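
Checksum verification before install might look like the sketch below. The hash algorithm is not stated above, so SHA-256 is an assumption, and `sha256Hex` and `verifyChecksum` are illustrative names rather than OrbisHub functions. Only Node's built-in crypto module is used.

```typescript
import { createHash } from "node:crypto";

// ASSUMPTION: the registry publishes a hex-encoded SHA-256 digest per skill.
function sha256Hex(data: string | Uint8Array): string {
    return createHash("sha256").update(data).digest("hex");
}

// Compares the digest of the downloaded bytes with the published one.
function verifyChecksum(data: string | Uint8Array, expectedHex: string): boolean {
    return sha256Hex(data) === expectedHex.trim().toLowerCase();
}
```

An installer would refuse to unpack a tarball when `verifyChecksum` returns false, since the bytes differ from what the registry entry describes.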

Skill Forge

The agent learns to create new skills from patterns in your conversations. When it detects recurring multi-step workflows, it abstracts them into reusable skill definitions stored in the forged_skills table.

Sandbox Mode

Execute tools inside ephemeral Docker containers for maximum isolation:

# Enable in .env:
SANDBOX_ENABLED=true
SANDBOX_IMAGE=node:20-alpine
SANDBOX_MEMORY_MB=256
SANDBOX_CPU=0.5
SANDBOX_TIMEOUT_MS=30000
SANDBOX_NETWORK=false

Security constraints:

  • Read-only root filesystem (--read-only)
  • Writable /tmp only (64MB max)
  • Drop all Linux capabilities (--cap-drop ALL)
  • No privilege escalation (--security-opt no-new-privileges)
  • Process count limit (--pids-limit 100)
  • Automatic container cleanup after execution
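
Put together, the settings and constraints above correspond to a `docker run` invocation roughly like the one this sketch assembles. The Docker flags are standard; `SandboxConfig` and `dockerRunArgs` are hypothetical names, and how Orbis actually launches its containers may differ.

```typescript
// Mirrors the SANDBOX_* settings above; field names are illustrative.
interface SandboxConfig {
    image: string;
    memoryMb: number;
    cpu: number;
    network: boolean;
}

function dockerRunArgs(config: SandboxConfig, command: string[]): string[] {
    const args = [
        "run",
        "--rm", // remove the container after it exits
        "--read-only", // read-only root filesystem
        "--tmpfs", "/tmp:rw,size=64m", // writable /tmp, capped at 64 MB
        "--cap-drop", "ALL",
        "--security-opt", "no-new-privileges",
        "--pids-limit", "100",
        "--memory", `${config.memoryMb}m`,
        "--cpus", String(config.cpu),
    ];
    if (!config.network) args.push("--network", "none");
    return args.concat(config.image, command);
}
```

The execution timeout (SANDBOX_TIMEOUT_MS) is not a `docker run` flag, so the caller would need to enforce it, for example by killing the process when the deadline passes.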

Wake Word

Activate Orbis with your voice — say "Orbis" (or a custom phrase) and the agent starts listening.

  • STT-based detection with Levenshtein distance fuzzy matching
  • Configurable sensitivity (0.0 - 1.0)
  • Cooldown period to prevent false triggers
  • Silence detection via RMS energy threshold

WAKE_WORD_PHRASE=orbis
WAKE_WORD_SENSITIVITY=0.6
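
Fuzzy matching on the transcript can be sketched with a plain Levenshtein distance. The distance function is the textbook algorithm; how Orbis maps WAKE_WORD_SENSITIVITY onto it is not documented here, so reading it as a minimum similarity ratio is an assumption, as are the function names. The sketch handles single-word phrases only.

```typescript
// Classic dynamic-programming edit distance, keeping two rows at a time.
function levenshtein(a: string, b: string): number {
    let prev: number[] = [];
    for (let j = 0; j <= b.length; j++) prev.push(j);
    for (let i = 1; i <= a.length; i++) {
        const curr = [i];
        for (let j = 1; j <= b.length; j++) {
            const cost = a[i - 1] === b[j - 1] ? 0 : 1;
            curr.push(Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost));
        }
        prev = curr;
    }
    return prev[b.length];
}

// ASSUMPTION: a word matches when 1 - distance / longerLength >= sensitivity.
function heardWakeWord(transcript: string, phrase = "orbis", sensitivity = 0.6): boolean {
    const words = transcript.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length > 0);
    return words.some((word) => {
        const longest = Math.max(word.length, phrase.length);
        return 1 - levenshtein(word, phrase) / longest >= sensitivity;
    });
}
```

Under this reading, a transcription slip such as "orbus" still triggers at 0.6, while unrelated words fall well below the threshold.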

Context Compaction

Long conversations don't run out of context. When messages exceed the threshold (default: 30), older messages are summarized while preserving key facts and semantic meaning. The last 10 messages are always kept uncompacted for continuity.
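
The compaction rule can be expressed compactly. This is a sketch, not the Orbis implementation: `ChatMessage` and `compact` are hypothetical, and `summarize` is a placeholder for the LLM call that would produce the summary (shown as synchronous here to keep the example short).

```typescript
interface ChatMessage {
    role: string;
    content: string;
}

// Above `threshold` messages, everything except the last `keepRecent`
// messages is replaced by a single summary message.
function compact(
    messages: ChatMessage[],
    summarize: (older: ChatMessage[]) => string,
    threshold = 30,
    keepRecent = 10,
): ChatMessage[] {
    if (messages.length <= threshold) return messages;
    const older = messages.slice(0, messages.length - keepRecent);
    const recent = messages.slice(messages.length - keepRecent);
    const summary: ChatMessage = { role: "system", content: summarize(older) };
    return [summary].concat(recent);
}
```

With the defaults, a 31-message conversation becomes one summary message followed by the 10 most recent messages.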


LLM & Voice Providers

LLM Providers

| Provider | Env Variable | Notes |
|---|---|---|
| Anthropic (Claude) | ANTHROPIC_API_KEY | Recommended for agent reasoning. Extended thinking support. |
| OpenAI (GPT-4o) | OPENAI_API_KEY | Also used for embeddings. |
| Ollama | OLLAMA_BASE_URL | Free, runs 100% locally. No API key needed. Requires ~16GB RAM. |
| Groq | GROQ_API_KEY | Ultra-fast inference. Good for voice pipeline. |

LLM_PROVIDER=anthropic    # anthropic | openai | ollama | groq

Voice Providers

Speech-to-Text (STT):

| Provider | Type | Setup |
|---|---|---|
| Whisper | Local, free | STT_PROVIDER=whisper (default) |
| Deepgram | Cloud | STT_PROVIDER=deepgram + DEEPGRAM_API_KEY |
| Google Speech | Cloud | STT_PROVIDER=google + GOOGLE_SPEECH_KEY |

Text-to-Speech (TTS):

| Provider | Type | Setup |
|---|---|---|
| Kokoro | Local, free | TTS_PROVIDER=kokoro (default) |
| ElevenLabs | Cloud, high quality | TTS_PROVIDER=elevenlabs + ELEVENLABS_API_KEY |
| OpenAI TTS | Cloud | TTS_PROVIDER=openai + OPENAI_TTS_KEY |

Configuration Reference

All configuration is done through environment variables in .env. See .env.example for the complete reference.

Core

| Variable | Default | Description |
|---|---|---|
| LLM_PROVIDER | anthropic | LLM provider: anthropic, openai, ollama, groq |
| ANTHROPIC_API_KEY | | Anthropic API key |
| OPENAI_API_KEY | | OpenAI API key (also used for embeddings) |
| OLLAMA_BASE_URL | http://localhost:11434 | Ollama server URL |
| GROQ_API_KEY | | Groq API key |
| DATABASE_URL | postgresql://orbis:orbis@localhost:5432/orbis | PostgreSQL connection string |
| JWT_SECRET | auto-generated | Secret for JWT signing (set a strong value in production) |

Voice

| Variable | Default | Description |
|---|---|---|
| STT_PROVIDER | whisper | Speech-to-text: whisper, deepgram, google |
| TTS_PROVIDER | kokoro | Text-to-speech: kokoro, elevenlabs, openai |
| WAKE_WORD_PHRASE | orbis | Wake word activation phrase |
| WAKE_WORD_SENSITIVITY | 0.6 | Wake word detection sensitivity (0.0-1.0) |

Security

| Variable | Default | Description |
|---|---|---|
| SECURITY_PII_REDACTION | true | Auto-redact PII before database storage |
| SECURITY_PROMPT_GUARD | warn | Prompt injection mode: warn (log) or strict (block) |
| SECURITY_MAX_LOGIN_ATTEMPTS | 5 | Failed attempts before lockout |
| SECURITY_LOCKOUT_SECONDS | 300 | Base lockout duration (doubles per excess) |
| CORS_ORIGINS | | Allowed origins (comma-separated). Required in production. |
| RATE_LIMIT_MAX | 60 | Requests per minute per user/IP |

Cognition

| Variable | Default | Description |
|---|---|---|
| COGNITION_THINKING | false | Enable chain-of-thought reasoning |
| COGNITION_REFLECTION | false | Enable post-conversation reflection |
| COGNITION_EVALUATION | false | Enable self-evaluation scoring |
| COGNITION_KNOWLEDGE_GRAPH | false | Enable entity-relationship memory |
| COGNITION_PLANNING | false | Enable autonomous task planning |
| COGNITION_METACOGNITION | false | Enable self-improvement learning |
| COGNITION_PROACTIVE | false | Enable proactive suggestions |
| COGNITION_WORKSPACE_LEARNING | false | Enable per-workspace adaptation |
| COGNITION_REFLECTION_MIN_MESSAGES | 8 | Min messages before reflection triggers |
| COGNITION_EVALUATION_THRESHOLD | 1 | Min tool calls before evaluation triggers |

Channels

| Variable | Default | Description |
|---|---|---|
| TELEGRAM_BOT_TOKEN | | Telegram bot token from @BotFather |
| DISCORD_BOT_TOKEN | | Discord bot token |
| DISCORD_CLIENT_ID | | Discord application client ID |
| WHATSAPP_AUTH_DIR | ./whatsapp-auth | WhatsApp session storage directory |
| SLACK_BOT_TOKEN | | Slack bot token (xoxb-...) |
| SLACK_APP_TOKEN | | Slack app token (xapp-...) for Socket Mode |

Server

| Variable | Default | Description |
|---|---|---|
| HOST | 127.0.0.1 | Network interface to bind (use 0.0.0.0 behind a reverse proxy) |
| PORT | 3000 | Core API server port |
| VOICE_PORT | 3001 | Voice server port |
| WEB_PORT | 3002 | Web UI port |

Sandbox

| Variable | Default | Description |
|---|---|---|
| SANDBOX_ENABLED | false | Enable Docker-isolated tool execution |
| SANDBOX_IMAGE | node:20-alpine | Docker image for sandbox containers |
| SANDBOX_MEMORY_MB | 256 | Memory limit per container |
| SANDBOX_CPU | 0.5 | CPU limit (fractional cores) |
| SANDBOX_TIMEOUT_MS | 30000 | Execution timeout |
| SANDBOX_NETWORK | false | Allow network access in sandbox |

Available Commands

Development

| Command | Description |
|---|---|
| npm run dev | Start all services (database + core + web + voice) |
| npm run core | Start core API server only |
| npm run web | Start web UI only |
| npm run voice | Start voice server only |
| npm run telegram | Start Telegram bot |
| npm run discord | Start Discord bot |
| npm run whatsapp | Start WhatsApp bot |
| npm run slack | Start Slack bot |
| npm run desktop | Start desktop app (Electron) |

Build & Quality

| Command | Description |
|---|---|
| npm run build | Build all packages |
| npm run typecheck | Type-check all packages |
| npm run lint | Lint all packages |
| npm run clean | Clean build artifacts |
| npm run test | Run all tests (Vitest) |
| npm run test:watch | Run tests in watch mode |

Database

| Command | Description |
|---|---|
| npm run db:up | Start PostgreSQL container |
| npm run db:migrate | Run database migrations |

Docker

| Command | Description |
|---|---|
| npm run docker:up | Start all services in Docker |
| npm run docker:down | Stop all Docker services |
| npm run docker:logs | Tail logs from all services |

Utilities

| Command | Description |
|---|---|
| npm run setup | Interactive setup wizard |
| npm run onboard | Add a new user or reconfigure agent |
| npm run doctor | Run diagnostics and check health |

Plugin Development

Create custom plugins to extend Orbis with new tools.

1. Create the plugin

cp -r plugins/plugin-template plugins/plugin-myname

2. Define your tools

// plugins/plugin-myname/src/index.ts
import type { OrbisPlugin, ToolDefinition, ToolResult, ToolContext } from "@orbis/shared";

const tools: ToolDefinition[] = [
    {
        name: "my_tool",
        description: "What this tool does — the agent reads this to decide when to use it",
        category: "utility",
        riskLevel: "low",        // "low" | "medium" | "high"
        cost: 0,                 // Estimated cost per call (for budgeting)
        parameters: [
            {
                name: "input",
                type: "string",
                description: "The input parameter",
                required: true,
            },
        ],
    },
];

// Placeholder for your tool's real logic (illustrative only).
async function doSomething(input: string): Promise<string> {
    return input.toUpperCase();
}

const plugin: OrbisPlugin = {
    name: "myname",
    version: "0.1.0",
    description: "What this plugin does",
    tools,

    async initialize() {
        // Called once at startup: set up connections, load resources
    },

    async execute(toolName: string, params: Record<string, unknown>, context: ToolContext): Promise<ToolResult> {
        switch (toolName) {
            case "my_tool": {
                // Braces scope `result` to this case
                const result = await doSomething(params.input as string);
                return { success: true, data: result };
            }

            default:
                return { success: false, error: `Unknown tool: ${toolName}` };
        }
    },

    async cleanup() {
        // Called on shutdown — close connections, release resources
    },
};

export default plugin;
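
Because execute receives params as Record<string, unknown>, it can help to check incoming values against the declared parameters before running any tool logic. The following is a minimal sketch, not part of the Orbis API: validateParams is a hypothetical helper, and the local ToolParameter and ToolResult types are stand-ins whose shape is assumed from the fields used in the example above (the real definitions live in @orbis/shared and may differ).

```typescript
// Local stand-ins for the @orbis/shared types. Their shape is an assumption
// inferred from the fields used in the plugin example, not the real definitions.
interface ToolParameter {
    name: string;
    type: "string" | "number" | "boolean"; // only "string" appears in the README
    description: string;
    required: boolean;
}

interface ToolResult {
    success: boolean;
    data?: unknown;
    error?: string;
}

// Hypothetical helper: returns an error result when params do not match the
// declared parameter list, or null when everything checks out.
function validateParams(
    declared: ToolParameter[],
    params: Record<string, unknown>,
): ToolResult | null {
    for (const p of declared) {
        const value = params[p.name];
        if (value === undefined || value === null) {
            if (p.required) {
                return { success: false, error: `Missing required parameter: ${p.name}` };
            }
            continue; // optional and absent: nothing to check
        }
        if (typeof value !== p.type) {
            return {
                success: false,
                error: `Parameter ${p.name} must be of type ${p.type}, got ${typeof value}`,
            };
        }
    }
    return null;
}

const declared: ToolParameter[] = [
    { name: "input", type: "string", description: "The input parameter", required: true },
];

const missing = validateParams(declared, {});
const wrongType = validateParams(declared, { input: 42 });
const ok = validateParams(declared, { input: "hello" });
```

Inside execute, a non-null return value from such a helper could be returned directly as the tool result, so the agent sees a readable error instead of an exception.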

3. Register the plugin

Make sure the workspaces list in the root package.json covers the plugins directory:

{
  "workspaces": [
    "packages/*",
    "plugins/*"
  ]
}

Any plugin under plugins/ is then auto-discovered at startup. No further registration is needed.

4. Build and test

npm install
npm run build
npm run dev

Ask the agent: "What tools do you have?" — your new tool should appear in the list.


Docker Deployment

For production or when you don't want to install Node.js locally:

git clone https://github.com/orbis-agent/orbis.git
cd orbis
bash setup.sh --docker

Or if you've already configured .env:

docker compose up -d --build

Services

Service     Port    Description
db          5432    PostgreSQL 16 + pgvector
core        3000    Agent API server
voice       3001    Voice pipeline
web         3002    Web UI
telegram    -       Telegram bot (optional profile)

Management

docker compose logs -f              # Tail all logs
docker compose logs -f core         # Tail core logs only
docker compose down                 # Stop all services
docker compose up -d --build        # Rebuild and restart
docker compose --profile telegram up -d   # Include Telegram bot

Custom Ports

Override ports via environment variables:

PORT=4000 VOICE_PORT=4001 WEB_PORT=4002 docker compose up -d
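
To illustrate how these overrides relate to the defaults in the Services table, here is a minimal sketch of resolving the three ports from an environment object. resolvePorts and parsePort are hypothetical names, and the validation rules (integer range, distinct ports) are assumptions for illustration; the actual services may read these variables differently.

```typescript
// Defaults (3000, 3001, 3002) come from the Services table above.
// The parsing and validation rules are assumptions for illustration.
interface Ports {
    core: number;
    voice: number;
    web: number;
}

function parsePort(raw: string | undefined, fallback: number): number {
    if (raw === undefined || raw.trim() === "") return fallback;
    const port = Number(raw);
    if (!Number.isInteger(port) || port < 1 || port > 65535) {
        throw new Error(`Invalid port: ${raw}`);
    }
    return port;
}

function resolvePorts(env: Record<string, string | undefined>): Ports {
    const ports: Ports = {
        core: parsePort(env["PORT"], 3000),
        voice: parsePort(env["VOICE_PORT"], 3001),
        web: parsePort(env["WEB_PORT"], 3002),
    };
    // Two services cannot bind to the same host port.
    const unique = new Set([ports.core, ports.voice, ports.web]);
    if (unique.size !== 3) {
        throw new Error("PORT, VOICE_PORT and WEB_PORT must be distinct");
    }
    return ports;
}

const defaults = resolvePorts({});
const custom = resolvePorts({ PORT: "4000", VOICE_PORT: "4001", WEB_PORT: "4002" });
```

Passing process.env to a function like this at startup would surface a mistyped port immediately rather than as a bind error later.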

Troubleshooting

npm run doctor           # Check what's wrong
npm run doctor -- --fix  # Auto-fix common issues

Common Issues

"Cannot connect to database"

  • Ensure Docker is running: docker ps
  • Start the database: npm run db:up
  • Check the connection string in .env
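
When checking the connection string, it helps to see which host, port, user, and database it actually points at. The sketch below assumes a URL-style string of the form postgresql://user:password@host:port/database; describeConnection is a hypothetical helper, and the name of the .env variable that holds the string is not assumed here. The password is deliberately left out of the result so it is safe to log. Passwords containing "@" or "/" would need to be percent-encoded for this simple pattern to match.

```typescript
interface DbTarget {
    host: string;
    port: number;
    user: string;
    database: string;
}

// Matches postgres:// or postgresql:// URLs with an optional password and port.
const CONNECTION_PATTERN =
    /^postgres(?:ql)?:\/\/([^:@\/]+)(?::[^@\/]*)?@([^:\/?]+)(?::(\d+))?\/([^?]+)/;

// Hypothetical helper: extracts the non-secret parts of a connection URL.
function describeConnection(connectionString: string): DbTarget {
    const m = CONNECTION_PATTERN.exec(connectionString);
    if (m === null) {
        throw new Error("Not a recognizable PostgreSQL connection URL");
    }
    const port = m[3];
    return {
        user: m[1] ?? "",
        host: m[2] ?? "",
        // 5432 is the PostgreSQL default, matching the db service port.
        port: port === undefined ? 5432 : Number(port),
        database: m[4] ?? "",
    };
}

const target = describeConnection("postgresql://orbis:secret@localhost:5432/orbis");
const noPort = describeConnection("postgres://orbis@db/orbis");
```

If the reported host or port differs from where the db service is listening (5432 by default), that mismatch is the likely cause of the connection error.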

"Voice WebSocket errors in console"

  • The voice server needs to be running: npm run voice
  • If you don't need voice, use text mode (voice errors are non-blocking)

"Error connecting to Orbis" in the web UI

  • Ensure the core server is running: npm run core
  • Check that PORT in .env matches the port the core server listens on (default: 3000)

"Extension 'vector' not available"

  • Ensure the database uses the pgvector/pgvector:pg16 Docker image
  • Run: docker compose exec db psql -U orbis -d orbis -c "CREATE EXTENSION IF NOT EXISTS vector;"

Contributing

Contributions are welcome! Here's how:

  1. Fork the repository
  2. Create a branch: git checkout -b feature/my-feature
  3. Make your changes — follow existing code style (TypeScript strict, ESLint)
  4. Test: npm run test && npm run typecheck && npm run lint
  5. Submit a PR with a clear description of what changed and why

Code Style

  • TypeScript with strict mode enabled
  • ESLint for linting
  • Vitest for testing
  • Conventional commits preferred

Project Structure Convention

  • Packages go in packages/
  • Plugins go in plugins/
  • SQL migrations go in packages/core/sql/ with sequential numbering (001, 002, ...)
  • Each package has its own package.json, tsconfig.json, and src/ directory
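
The sequential numbering rule for migrations can be checked mechanically before opening a PR. This is a minimal sketch under one assumption: file names start with a three-digit prefix such as 001_init.sql (the README only states the numbering, not the full naming pattern). findSequenceProblems and nextMigrationPrefix are hypothetical helpers, not part of the repository.

```typescript
// Assumes names like "001_init.sql"; only the three-digit prefix is inspected.
function migrationNumber(file: string): number | null {
    const m = /^(\d{3})/.exec(file);
    return m === null ? null : Number(m[1]);
}

// Left-pads a number to three digits, e.g. 7 -> "007".
function pad(n: number): string {
    const s = String(n);
    return s.length >= 3 ? s : "000".slice(s.length) + s;
}

// Reports files without a prefix, duplicate numbers, and gaps in 001..max.
function findSequenceProblems(files: string[]): string[] {
    const problems: string[] = [];
    const seen = new Set<number>();
    let max = 0;
    for (const file of files) {
        const n = migrationNumber(file);
        if (n === null) {
            problems.push(`No numeric prefix: ${file}`);
            continue;
        }
        if (seen.has(n)) problems.push(`Duplicate number: ${pad(n)}`);
        seen.add(n);
        if (n > max) max = n;
    }
    for (let n = 1; n <= max; n++) {
        if (!seen.has(n)) problems.push(`Missing number: ${pad(n)}`);
    }
    return problems;
}

// Returns the prefix the next migration file should use.
function nextMigrationPrefix(files: string[]): string {
    let max = 0;
    for (const file of files) {
        const n = migrationNumber(file);
        if (n !== null && n > max) max = n;
    }
    return pad(max + 1);
}

const existing = ["001_init.sql", "002_users.sql", "004_tools.sql"];
const problems = findSequenceProblems(existing);
const next = nextMigrationPrefix(existing);
```

Fed with the file names from packages/core/sql/, a check like this would flag a skipped or reused number before the migration runner does.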

License

MIT License

Copyright (c) 2024-2026 Orbis Contributors

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
