Universal AI agent runtime with framework-agnostic model inference and native MCP integration.
Status: Active Development — Core runtime is functional. Web dashboard and advanced features are in progress. APIs may change without notice.
Aether is a local-first AI agent runtime that lets you create, manage, and interact with AI agents through a unified system. It supports multiple inference backends (Ollama, OpenAI-compatible APIs, MLX, llama.cpp) and integrates with the Model Context Protocol (MCP) for extensible tool use.
The system runs as a background daemon with a CLI for direct interaction, an HTTP REST API (OpenAI-compatible), and a web dashboard for visual management.
- Agent lifecycle management — Create, start, pause, resume, and stop agents with a full state machine
- Multiple inference backends — Remote APIs (Ollama, OpenAI), MLX (Apple Silicon), llama.cpp (cross-platform)
- OpenAI-compatible API — Drop-in replacement at `/v1/chat/completions` with streaming support
- MCP integration — Connect to Model Context Protocol servers for tool use
- Model management — Discover, download, and convert between model formats (GGUF, MLX, SafeTensors)
- Web dashboard — Real-time system metrics, agent management, chat interface, and MCP configuration
- Event-driven architecture — Pub/sub event bus for system-wide communication
```
┌─────────────────────────────────────────────────┐
│                     Clients                     │
│   CLI (aether) │ Web Dashboard │ REST API       │
└────────┬───────┴────────┬──────┴────────┬───────┘
   │ Unix Socket   │ HTTP :3000    │ HTTP :8420
┌────────┴───────────────┴───────────────┴────────┐
│              aether-daemon (aetherd)             │
│ ┌──────────┐ ┌──────────┐ ┌──────────────────┐  │
│ │  Agent   │ │ Inference│ │   MCP Manager    │  │
│ │ Manager  │ │  Router  │ │                  │  │
│ └──────────┘ └──────────┘ └──────────────────┘  │
│ ┌──────────┐ ┌──────────┐ ┌──────────────────┐  │
│ │  Event   │ │  Model   │ │     Metrics      │  │
│ │   Bus    │ │ Manager  │ │    Collector     │  │
│ └──────────┘ └──────────┘ └──────────────────┘  │
└──────────────────────────────────────────────────┘
```
| Crate | Binary | Description |
|---|---|---|
| `aether-core` | — | Core library: agents, inference, config, events, MCP, models |
| `aether-daemon` | `aetherd` | Background daemon with IPC and HTTP servers |
| `aether-cli` | `aether` | Command-line interface for daemon interaction |
| Component | Path | Description |
|---|---|---|
| Web Dashboard | `apps/web/` | Next.js 16 + React 19 web interface |
| Tray App | `apps/tray/` | Tauri 2.0 desktop app (scaffolded) |
| TypeScript SDK | `packages/sdk/` | Client SDK (scaffolded) |
| Python API | `python/` | FastAPI backend (scaffolded) |
- Rust (1.75+) — rustup.rs
- Node.js (20+) — For the web dashboard
- Ollama (recommended) — Local LLM inference at `localhost:11434`
```sh
# Build all Rust crates
cargo build

# Build in release mode
cargo build --release
```

```sh
# Start daemon in foreground with HTTP API enabled
cargo run -p aether-daemon -- --foreground --http

# With custom HTTP port
cargo run -p aether-daemon -- --foreground --http --http-port 8420
```

```sh
# Check daemon connectivity
cargo run -p aether-cli -- ping

# System status
cargo run -p aether-cli -- status

# Create and manage agents
cargo run -p aether-cli -- agent create --name my-agent
cargo run -p aether-cli -- agent list
cargo run -p aether-cli -- agent start <agent-id>

# Quick one-liner (creates agent, starts, chats, cleans up)
cargo run -p aether-cli -- run "Explain what Rust ownership is"

# Interactive chat with an agent
cargo run -p aether-cli -- chat --agent <agent-id>

# One-shot inference (no persistence)
cargo run -p aether-cli -- quick --model llama3.2 "Hello"
```

```sh
cd apps/web
npm install
npm run dev
```

Open http://localhost:3000 to access the dashboard.
The daemon exposes an HTTP REST API on port 8420 (when started with `--http`):
| Method | Endpoint | Description |
|---|---|---|
| POST | `/v1/chat/completions` | Chat completions (streaming supported) |
| POST | `/v1/completions` | Legacy completions |
| GET | `/v1/models` | List available models |
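As a quick smoke test, the chat endpoint can be exercised with curl. This sketch assumes the daemon is running with `--http` on the default port 8420 and that a model named `llama3.2` is available through a configured backend:

```sh
# Write the request body to a file to keep shell quoting readable.
cat > /tmp/aether-chat.json <<'EOF'
{
  "model": "llama3.2",
  "messages": [
    {"role": "user", "content": "Explain Rust ownership in one sentence."}
  ],
  "stream": false
}
EOF

# Requires a running daemon: cargo run -p aether-daemon -- --foreground --http
curl -s http://localhost:8420/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d @/tmp/aether-chat.json
```

Because the endpoint is OpenAI-compatible, existing OpenAI client libraries should work by pointing their base URL at `http://localhost:8420/v1`.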
| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/v1/agents` | List agents |
| POST | `/api/v1/agents` | Create agent |
| GET | `/api/v1/agents/:id` | Get agent |
| PATCH | `/api/v1/agents/:id` | Update agent |
| DELETE | `/api/v1/agents/:id` | Delete agent |
| POST | `/api/v1/agents/:id/start` | Start agent |
| POST | `/api/v1/agents/:id/pause` | Pause agent |
| POST | `/api/v1/agents/:id/resume` | Resume agent |
| POST | `/api/v1/agents/:id/stop` | Stop agent |
| POST | `/api/v1/agents/:id/message` | Send message |
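The agent lifecycle can also be driven over REST. The create-body below is a guess modeled on the CLI's `--name` flag (the exact schema is in docs/api.md), and `$AGENT_ID` is a placeholder for the id returned by the create call:

```sh
# Create an agent (field names assumed; see docs/api.md for the real schema).
curl -s -X POST http://localhost:8420/api/v1/agents \
  -H "Content-Type: application/json" \
  -d '{"name": "my-agent"}'

# Start it, send a message, then stop it, using the id from the response above.
AGENT_ID="00000000-0000-0000-0000-000000000000"   # placeholder
curl -s -X POST "http://localhost:8420/api/v1/agents/$AGENT_ID/start"
curl -s -X POST "http://localhost:8420/api/v1/agents/$AGENT_ID/message" \
  -H "Content-Type: application/json" \
  -d '{"content": "Hello"}'
curl -s -X POST "http://localhost:8420/api/v1/agents/$AGENT_ID/stop"
```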
| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/v1/models` | List local models |
| GET | `/api/v1/models/:id` | Get model details |
| DELETE | `/api/v1/models/:id` | Delete model |
| POST | `/api/v1/models/download` | Download from HuggingFace |
| POST | `/api/v1/models/:id/convert` | Convert format |
| POST | `/api/v1/models/:id/load` | Load into memory |
| POST | `/api/v1/models/:id/unload` | Unload from memory |
| POST | `/api/v1/models/scan` | Scan for local models |
| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/v1/system/health` | Health check |
| GET | `/api/v1/system/status` | System status |
| GET | `/api/v1/system/metrics` | Resource metrics |
| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/v1/tools` | List all tools |
| POST | `/api/v1/tools/:name/test` | Test a tool |
| GET | `/api/v1/tools/mcp` | List MCP servers |
| POST | `/api/v1/tools/mcp` | Add MCP server |
| DELETE | `/api/v1/tools/mcp/:id` | Remove MCP server |
| POST | `/api/v1/tools/mcp/:id/restart` | Restart server |
| POST | `/api/v1/tools/mcp/:id/invoke` | Invoke tool |
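Registering an MCP server over REST might look like the following. The payload shape is an assumption modeled on typical stdio-transport MCP configs (a `command` plus `args`); check docs/api.md for the authoritative schema:

```sh
# Hypothetical payload: a filesystem MCP server launched over stdio.
cat > /tmp/aether-mcp.json <<'EOF'
{
  "name": "filesystem",
  "command": "npx",
  "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
}
EOF

curl -s -X POST http://localhost:8420/api/v1/tools/mcp \
  -H "Content-Type: application/json" \
  -d @/tmp/aether-mcp.json
```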
See `docs/api.md` for the full API reference.
Configuration is loaded from (in priority order):

- `config/default.toml` — Defaults shipped with the project
- `~/.aether/config.toml` — User configuration
- CLI flags — Runtime overrides

Key config sections: `[server]`, `[logging]`, `[models]`, `[backends]`. See `config/default.toml` for all options.
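A minimal `~/.aether/config.toml` override might look like this. The section names come from the list above, but the individual keys are illustrative — consult `config/default.toml` for the real ones:

```toml
# Illustrative keys only — the shipped config/default.toml is authoritative.
[server]
http_port = 8420

[logging]
level = "info"

[backends]
# e.g. point the remote backend at a local Ollama instance
ollama_url = "http://localhost:11434"
```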
- Agent state machine and lifecycle management
- Unix socket IPC with MessagePack framing
- HTTP REST API (axum) with OpenAI-compatible endpoints
- Remote inference backend (Ollama / OpenAI-compatible)
- MLX inference backend (Apple Silicon)
- llama.cpp inference backend
- Streaming responses (SSE)
- Event bus (tokio broadcast)
- MCP client integration (stdio transport)
- Model management (discovery, download, conversion types)
- System metrics collection (CPU, memory via sysinfo)
- CLI with all commands and interactive chat
- Web dashboard (agents, chat, tools, system metrics)
- HuggingFace download integration (types ready, download logic pending)
- Model format conversion (types ready, conversion logic pending)
- WebSocket support for real-time events
- Tray application (Tauri shell exists)
- Authentication and multi-user support
- Workflow automation pipelines
- TypeScript and Python SDKs
- macOS system integration (LaunchAgent, Keychain)
```sh
cargo test     # Run all tests
cargo clippy   # Lint
```

License: MIT