diff --git a/.claude/skills/gaia-release/SKILL.md b/.claude/skills/gaia-release/SKILL.md
index f161fe3f1..fb11b8b54 100644
--- a/.claude/skills/gaia-release/SKILL.md
+++ b/.claude/skills/gaia-release/SKILL.md
@@ -53,7 +53,7 @@ These map to [CLAUDE.md](CLAUDE.md). Re-read them whenever this skill runs.
 - **No Claude attribution anywhere** — not in PR titles, PR bodies, commit messages (no `Co-Authored-By: Claude ...` trailer), release notes, code comments, or the Discord announcement.
 - **No silent fallbacks** — if a validator fails, a step times out, or a workflow run isn't found, stop with an actionable error. Do not retry blindly. Do not "proceed anyway."
-- **Match house style for release notes** — see *Generation parameters* in Phase 1. In short: value-prop first, **local agents are the headline** (not the SDK), one agent/command per highlight, plain language, engaging but factual, and **no emoji, no fluff** ("finally", "silently", "no more crashes", "we're excited to announce", "blazing", "Here's the good stuff"). Read the **last 2–3 release notes** before drafting. Patch releases do **not** include a `pip install` block. Use the `Why upgrade:` framing with a short bullet list, then `## What's New`, then `## Bug Fixes`, then `## Full Changelog`.
+- **Match house style for release notes** — see *Generation parameters* in Phase 1. In short: value-prop first, **local agents are the headline** (not the SDK), one agent/command per highlight, plain language, engaging but factual, and **no emoji, no fluff** (full banned-phrase list under *Generation parameters*). Read the **last 2–3 release notes** before drafting. Patch releases do **not** include a `pip install` block. Use the `Why upgrade:` framing with a short bullet list, then `## What's New`, then `## Bug Fixes`, then `## Full Changelog`.
 - **Match the previous release PR body shape exactly** — read the most recent merged `Release vX.Y.Z` PR (e.g.
`gh pr list --repo amd/gaia --state merged --search "Release v in:title" --limit 3`). Open with `# GAIA vX.Y.Z Release Notes` (no MDX frontmatter in the PR body), end with a `Release checklist` section. Style drift here costs review cycles.
 - **Bulletproof commits only** — every change made by this skill must satisfy the four criteria in CLAUDE.md (validated, critiqued, scope-clean, no half-finished work) before being committed.
 - **Pushing tags is irreversible.** Always confirm the SHA the tag will point to and the green status of the pre-tag verification run before `git push origin v`.
@@ -90,12 +90,12 @@ These map to [CLAUDE.md](CLAUDE.md). Re-read them whenever this skill runs.
 If the requested version doesn't match the rubric, **stop and surface the mismatch**: *"You asked for `v` (patch). I see N feat commits since `` including `` — this looks minor-shaped. Continue as patch, or bump to `v`?"* Do not silently proceed.
-2. **Read the last 2–3 release notes** to match style and length.
+2. **Read the last 2–3 release notes** to match structure and length (not tone — see *Generation parameters*).
 - [docs/releases/v0.17.4.mdx](docs/releases/v0.17.4.mdx)
 - [docs/releases/v0.17.3.mdx](docs/releases/v0.17.3.mdx)
 - [docs/releases/v0.17.2.mdx](docs/releases/v0.17.2.mdx)
- Cross-check: same frontmatter shape, same section headings, same tone, same level of detail per entry. Patch releases are short; minor/major releases include `pip install` and may have a "Highlights" block.
+ Cross-check: same frontmatter shape, same section headings, same *structure* and length per entry — but **not** the prior tone. The last few releases predate the *Generation parameters* below; match their shape, not their dryness. Patch releases are short; minor/major releases include `pip install` and may have a "Highlights" block.
 3.
**Create [docs/releases/v.mdx](docs/releases/)** with this skeleton (adapt to whether it's patch / minor / major):
@@ -121,7 +121,9 @@ These map to [CLAUDE.md](CLAUDE.md). Re-read them whenever this skill runs.
+ command per entry — add another `### ` block for the next one. Not every highlight
+ is a command — for UI / SDK / perf items, use a plain title with no trailing
+ command.>
 ---
@@ -148,7 +150,7 @@ These map to [CLAUDE.md](CLAUDE.md). Re-read them whenever this skill runs.
 - **Value-prop first.** Open each entry with what the user can now do and why it matters — the outcome, not the implementation. "Triage your inbox in one command"
- before "added EmailAgent with IMAP polling".
+ before "added EmailAgent with Gmail polling".
 - **Local agents are the headline.** Lead with the agents that solve real problems (`gaia browse`, `gaia analyze`, email triage, …); SDK / infra / refactors are supporting detail. People come for the agents, not the SDK.
@@ -167,13 +169,13 @@ These map to [CLAUDE.md](CLAUDE.md). Re-read them whenever this skill runs.
 > **Bad** (dry, implementation-first, no reason to care):
 > ### EmailAgent
- > Adds an EmailAgent with IMAP polling and a rules engine for classification.
+ > Adds an EmailAgent with Gmail polling and a rules engine for classification.
 > **Good** (value-first, plain, makes you want to try it):
 > ### Triage your inbox from the terminal — `gaia email`
 > Point GAIA at your inbox and it sorts the noise from what needs you: drafts
 > replies to routine mail, flags what's urgent, leaves the rest. Runs locally, so
- > your mail never leaves your machine. Try it: `gaia email triage`.
+ > your mail never leaves your machine. Try it: `gaia email`.
 4. **Update [docs/docs.json](docs/docs.json):**
 - Add `releases/v` to the Releases tab.
diff --git a/CLAUDE.md b/CLAUDE.md
index a1cfc9d5c..d97e60413 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -307,7 +307,7 @@ gaia eval agent --category rag_quality --agent-type doc \
 ps aux | grep "gaia eval" | grep -v grep | wc -l # must print "0"
 ```
-This applies to **all** eval flavours: `gaia eval agent`, `gaia eval --use-claude`, `gaia eval fix-code`, batch experiments. The judge LLM (Claude) can run concurrently across scenarios — the bottleneck is the local Lemonade backend, which is single-tenant per model slot.
+This applies to every `gaia eval agent` run — including `--fix` auto-fix runs and any batch fix-loop that chains them. The judge LLM (Claude) can run concurrently across scenarios — the bottleneck is the local Lemonade backend, which is single-tenant per model slot.
 ## Development Workflow
@@ -367,13 +367,20 @@ gaia/
 │ │ ├── chat/ # ChatAgent with RAG (tools/rag_tools, tools/shell_tools)
 │ │ ├── code/ # CodeAgent with orchestration, validators, file_io tools
 │ │ ├── builder/ # BuilderAgent — scaffolds new agents from templates
-│ │ ├── summarize/ # SummarizeAgent — document/text summarization
+│ │ ├── summarize/ # SummarizerAgent — document/text summarization
 │ │ ├── blender/ # BlenderAgent for 3D automation
 │ │ ├── jira/ # JiraAgent for issue management
 │ │ ├── docker/ # DockerAgent for containerization
 │ │ ├── emr/ # MedicalIntakeAgent for healthcare (VLM)
 │ │ ├── routing/ # RoutingAgent for intelligent agent selection
 │ │ ├── sd/ # SDAgent for Stable Diffusion image generation
+│ │ ├── email/ # EmailTriageAgent for Gmail (local inference)
+│ │ ├── browser/ # BrowserAgent — web research (gaia browse)
+│ │ ├── analyst/ # AnalystAgent — structured data analysis (gaia analyze)
+│ │ ├── docqa/ # DocumentQAAgent — RAG document Q&A
+│ │ ├── fileio/ # FileIOAgent — file system + safe shell
+│ │ ├── code_index/ # Code-aware indexing/search agent + mixin
+│ │ ├── connectors_demo/ # ConnectorsDemoAgent — Google/GitHub connector demo
 │ │ └── registry.py #
Agent registry + KNOWN_TOOLS map
 │ ├── api/ # OpenAI-compatible REST API server
 │ ├── apps/ # Standalone applications
@@ -386,21 +393,28 @@ gaia/
 │ │ └── _shared/ # Shared assets for apps
 │ ├── audio/ # Audio processing (Whisper ASR, Kokoro TTS)
 │ ├── chat/ # Agent SDK (AgentSDK class, prompts, app entry)
+│ ├── code_index/ # Code indexing/search backend
+│ ├── connectors/ # Connector framework (Google/GitHub OAuth, MCP-server connectors, grants)
 │ ├── database/ # DatabaseMixin and DatabaseAgent
 │ ├── electron/ # Electron app integration
 │ ├── eval/ # Evaluation framework
+│ ├── filesystem/ # Filesystem service/utilities
+│ ├── governance/ # Governance / guardrails layer
 │ ├── img/ # Shared image assets
 │ ├── installer/ # Install/init commands (gaia init, lemonade installer)
 │ ├── llm/ # LLM backend clients (Lemonade, Claude, OpenAI) + providers/
 │ ├── mcp/ # Model Context Protocol servers/clients
+│ ├── messaging/ # Messaging adapters (Telegram, …)
 │ ├── rag/ # Document retrieval (RAG)
 │ ├── sd/ # Stable Diffusion tool mixin (SDToolsMixin)
+│ ├── scratchpad/ # Scratchpad tables backend
 │ ├── shell/ # Shell integration
 │ ├── talk/ # Voice interaction SDK
 │ ├── testing/ # Test utilities and fixtures
 │ ├── ui/ # Agent UI backend (FastAPI server, routers, SSE, database)
 │ ├── utils/ # Utility modules (FileWatcher, parsing)
 │ ├── vlm/ # Vision LLM tool mixin (VLMToolsMixin, structured extraction)
+│ ├── web/ # Web utilities (search/fetch backend)
 │ └── cli.py # Main CLI entry point (all `gaia ` subparsers)
 ├── tests/ # Test suite
 │ ├── unit/ # Unit tests
@@ -454,16 +468,21 @@ Defined in [`setup.py`](setup.py) under `console_scripts`:
 | Agent | Location | Description | Default Model |
 |-------|----------|-------------|---------------|
-| **ChatAgent** | `agents/chat/agent.py` | Document Q&A with RAG | Qwen3.5-35B |
-| **CodeAgent** | `agents/code/agent.py` | Code generation with orchestration | Qwen3.5-35B |
-| **BuilderAgent** | `agents/builder/agent.py` | Scaffolds
new agents from templates | Qwen3.5-35B |
-| **SummarizeAgent** | `agents/summarize/agent.py` | Document/text summarization | Qwen3.5-35B |
-| **JiraAgent** | `agents/jira/agent.py` | Jira issue management | Qwen3.5-35B |
-| **BlenderAgent** | `agents/blender/agent.py` | 3D scene automation | Qwen3.5-35B |
-| **DockerAgent** | `agents/docker/agent.py` | Container management | Qwen3.5-35B |
+| **ChatAgent** | `agents/chat/agent.py` | Document Q&A with RAG | Gemma-4-E4B |
+| **CodeAgent** | `agents/code/agent.py` | Code generation with orchestration | Qwen3.5-35B-A3B |
+| **BuilderAgent** | `agents/builder/agent.py` | Scaffolds new agents from templates | Qwen3.5-35B-A3B |
+| **SummarizerAgent** | `agents/summarize/agent.py` | Document/text summarization | Gemma-4-E4B |
+| **JiraAgent** | `agents/jira/agent.py` | Jira issue management | Qwen3.5-35B-A3B |
+| **BlenderAgent** | `agents/blender/agent.py` | 3D scene automation | Gemma-4-E4B |
+| **DockerAgent** | `agents/docker/agent.py` | Container management | Gemma-4-E4B |
 | **MedicalIntakeAgent** | `agents/emr/agent.py` | Medical form processing | Qwen3-VL-4B (VLM) |
-| **RoutingAgent** | `agents/routing/agent.py` | Intelligent agent selection | Qwen3.5-35B |
+| **RoutingAgent** | `agents/routing/agent.py` | Intelligent agent selection | Qwen3.5-35B-A3B (`AGENT_ROUTING_MODEL`) |
 | **SDAgent** | `agents/sd/agent.py` | Stable Diffusion image generation | SDXL-Turbo |
+| **EmailTriageAgent** | `agents/email/agent.py` | Email triage for Gmail — local inference, needs the Google connector | Gemma-4-E4B |
+| **BrowserAgent** | `agents/browser/agent.py` | Web research — search, page fetch, download (`gaia browse`) | Gemma-4-E4B |
+| **AnalystAgent** | `agents/analyst/agent.py` | Structured data analysis with scratchpad tables (`gaia analyze`) | Gemma-4-E4B |
+
+`gaia browse` and `gaia analyze` invoke BrowserAgent and AnalystAgent respectively (see [`src/gaia/cli.py`](src/gaia/cli.py)).
`gaia telegram` is a messaging adapter, not an agent. Internal building-block agents (DocumentQAAgent, FileIOAgent, ConnectorsDemoAgent) live under `src/gaia/agents/` but aren't standalone CLI commands.
 ### Agent Registry & Tool Mixins
@@ -472,19 +491,24 @@ New agents are Python classes inheriting from `Agent` (see [`src/gaia/agents/bas
 | Tool name | Mixin | Purpose |
 |-----------|-------|---------|
 | `rag` | `gaia.agents.chat.tools.rag_tools.RAGToolsMixin` | Document retrieval |
+| `code_index` | `gaia.agents.code_index.tools.mixin.CodeIndexToolsMixin` | Code-aware indexing/search |
 | `file_search` | `gaia.agents.tools.file_tools.FileSearchToolsMixin` | Fuzzy/glob file search |
 | `file_io` | `gaia.agents.code.tools.file_io.FileIOToolsMixin` | Read/write/edit files |
 | `shell` | `gaia.agents.chat.tools.shell_tools.ShellToolsMixin` | Sandboxed shell commands |
 | `screenshot` | `gaia.agents.tools.screenshot_tools.ScreenshotToolsMixin` | Screen capture |
+| `filesystem` | `gaia.agents.tools.filesystem_tools.FileSystemToolsMixin` | Filesystem operations |
+| `scratchpad` | `gaia.agents.tools.scratchpad_tools.ScratchpadToolsMixin` | Scratchpad tables/notes |
+| `browser` | `gaia.agents.tools.browser_tools.BrowserToolsMixin` | Web search / page fetch / download |
 | `sd` | `gaia.sd.mixin.SDToolsMixin` | Stable Diffusion image generation |
 | `vlm` | `gaia.vlm.mixin.VLMToolsMixin` | Vision LLM / structured extraction |
 When adding a new tool mixin, register it in `KNOWN_TOOLS` so other agents can compose it by name.
 ### Default Models
-- General tasks: `Qwen3-0.6B-GGUF`
-- Code/Agents: `Qwen3.5-35B-A3B-GGUF`
+- Default for most agents and `gaia llm`: `Gemma-4-E4B-it-GGUF` (`DEFAULT_MODEL_NAME` in [`src/gaia/llm/lemonade_client.py`](src/gaia/llm/lemonade_client.py))
+- Code-heavy agents (Code, Builder, Jira): `Qwen3.5-35B-A3B-GGUF` (hardcoded per agent)
 - Vision tasks: `Qwen3-VL-4B-Instruct-GGUF`
+- Image generation (SD): `SDXL-Turbo`
 ## CLI Commands
@@ -502,10 +526,15 @@ All commands are registered in [`src/gaia/cli.py`](src/gaia/cli.py). Run `gaia -
 - `gaia sd` - Stable Diffusion image generation
 - `gaia jira` - Jira integration
 - `gaia docker` - Docker management
+- `gaia browse` - Web research agent (search, page fetch, download)
+- `gaia analyze` - Structured data analysis agent (scratchpad tables)
+- `gaia email` - Email triage for Gmail (local inference; needs the Google connector)
 **Servers & infrastructure:**
 - `gaia api` - OpenAI-compatible API server
-- `gaia mcp {start|stop|status|test|agent|docker|serve|add|list|remove|tools|test-client}` - MCP bridge
+- `gaia mcp {start|stop|status|test|agent|docker|serve|list|tools|test-client}` - MCP bridge (add/remove moved to the connectors framework, #977)
+- `gaia telegram {start|stop|status}` - Telegram messaging adapter
+- `gaia connectors` - Manage connectors (Google/GitHub OAuth, MCP servers) and per-agent grants
 - `gaia cache {status|clear}` - Cache management
 **Setup & utilities:**
@@ -514,16 +543,16 @@ All commands are registered in [`src/gaia/cli.py`](src/gaia/cli.py).
Run `gaia -
 - `gaia download` - Download a model
 - `gaia kill` - Kill stray GAIA / Lemonade processes
 - `gaia test` - Smoke tests
-- `gaia yt` - YouTube transcript ingest
-- `gaia template` - Scaffold agent templates
+- `gaia youtube --download-transcript ` - YouTube utilities (transcript download)
+- `gaia stats` - Show statistics from the most recent run
+- `gaia memory` - Manage agent memory (onboarding bootstrap, status)
+- `gaia diagnostics` - Bundle logs + system info into a tarball for bug reports
+- `gaia agent {export|import}` - Manage custom agent bundles
 **Evaluation & analysis** (see [`docs/reference/eval.mdx`](docs/reference/eval.mdx)):
-- `gaia eval {fix-code|agent}` - Run evaluation harness
-- `gaia gt` - Generate ground truth
-- `gaia generate` - Dataset/response generation
-- `gaia batch-exp` - Batch experiments
+- `gaia eval agent` - Run the agent eval benchmark (`--fix` auto-fixes failures)
 - `gaia report` - Render eval reports
-- `gaia visualize` / `gaia perf-vis` - Visualize results
+- `gaia perf-vis` - Visualize performance results
 **Standalone binaries** (separate `console_scripts`, not subcommands):
 - `gaia-code` - CodeAgent entry (`src/gaia/agents/code/cli.py`)
@@ -537,6 +566,9 @@ All documentation uses `.mdx` format (Markdown + JSX for Mintlify).
 **User Guides:**
 - [`docs/guides/chat.mdx`](docs/guides/chat.mdx) - Chat with RAG
 - [`docs/guides/agent-ui.mdx`](docs/guides/agent-ui.mdx) - Agent UI (desktop chat)
+- [`docs/guides/browse.mdx`](docs/guides/browse.mdx) - Web research (`gaia browse`)
+- [`docs/guides/analyze.mdx`](docs/guides/analyze.mdx) - Structured data analysis (`gaia analyze`)
+- [`docs/guides/email.mdx`](docs/guides/email.mdx) - Email triage (`gaia email`)
 - [`docs/guides/talk.mdx`](docs/guides/talk.mdx) - Voice interaction
 - [`docs/guides/code.mdx`](docs/guides/code.mdx) - Code generation
 - [`docs/guides/blender.mdx`](docs/guides/blender.mdx) - 3D automation
@@ -544,6 +576,12 @@ All documentation uses `.mdx` format (Markdown + JSX for Mintlify).
 - [`docs/guides/docker.mdx`](docs/guides/docker.mdx) - Docker management
 - [`docs/guides/routing.mdx`](docs/guides/routing.mdx) - Agent routing
 - [`docs/guides/emr.mdx`](docs/guides/emr.mdx) - Medical intake
+- [`docs/guides/telegram-adapter.mdx`](docs/guides/telegram-adapter.mdx) - Telegram messaging adapter
+- [`docs/guides/memory.mdx`](docs/guides/memory.mdx) - Agent memory
+- [`docs/guides/install.mdx`](docs/guides/install.mdx) - Installation
+- [`docs/guides/custom-agent.mdx`](docs/guides/custom-agent.mdx) - Build a custom agent
+- [`docs/guides/hardware-advisor.mdx`](docs/guides/hardware-advisor.mdx) - Hardware advisor
+- [`docs/guides/npu.mdx`](docs/guides/npu.mdx) - NPU setup
 **SDK Reference:**
 - [`docs/sdk/core/agent-system.mdx`](docs/sdk/core/agent-system.mdx) - Agent framework
@@ -587,13 +625,13 @@ The roadmap is at [`docs/roadmap.mdx`](docs/roadmap.mdx) ([live site](https://am
 - [`docs/plans/oem-bundling.mdx`](docs/plans/oem-bundling.mdx) - OEM hardware pre-configuration
 **Infrastructure:**
-- [`docs/plans/installer.mdx`](docs/plans/installer.mdx) - Desktop installer
-- [`docs/plans/mcp-client.mdx`](docs/plans/mcp-client.mdx) - MCP client integration
+- [`docs/plans/desktop-installer.mdx`](docs/plans/desktop-installer.mdx) - Desktop
installer
+- [`docs/plans/mcp-docs.mdx`](docs/plans/mcp-docs.mdx) - MCP integration
 - [`docs/plans/cua.mdx`](docs/plans/cua.mdx) - Computer Use Agent
 - [`docs/plans/docker-containers.mdx`](docs/plans/docker-containers.mdx) - Docker deployment
 **Key architectural decisions (April 2026):**
-- ChatAgent renamed to **GaiaAgent** in v0.20.0 (#696)
+- **GaiaAgent** rename planned (#696) — not yet landed; the chat agent class is still `ChatAgent` (`src/gaia/agents/chat/agent.py`)
 - Voice-first is P0 enabling technology (#702)
 - No context compaction — memory + RAG handles long conversations
 - Configuration dashboard + Observability dashboard as separate Agent UI panels
@@ -614,7 +652,7 @@ When responding to GitHub issues and pull requests, follow these guidelines:
 The documentation is organized in [`docs/docs.json`](docs/docs.json) with the following structure:
 - **SDK**: `docs/sdk/` - Agent system, tools, core SDKs (chat, llm, rag, vlm, audio)
-- **User Guides** (`docs/guides/`): Feature-specific guides (chat, talk, code, blender, jira, docker, routing, emr)
+- **User Guides** (`docs/guides/`): Feature-specific guides (chat, browse, analyze, email, talk, code, blender, jira, docker, routing, emr, telegram-adapter, memory)
 - **Playbooks** (`docs/playbooks/`): Step-by-step tutorials for building agents
 - **SDK Reference** (`docs/sdk/`): Core concepts, SDKs, infrastructure, mixins, agents
 - **Specifications** (`docs/spec/`): Technical specs for all components
diff --git a/docs/guides/custom-agent.mdx b/docs/guides/custom-agent.mdx
index bb6c434ab..98a9cc9a9 100644
--- a/docs/guides/custom-agent.mdx
+++ b/docs/guides/custom-agent.mdx
@@ -138,7 +138,7 @@ If your agent needs a specific model, set `model_id` in `__init__()`:
 ```python
 def __init__(self, **kwargs):
-    kwargs.setdefault("model_id", "Qwen3-0.6B-GGUF")
+    kwargs.setdefault("model_id", "Gemma-4-E4B-it-GGUF")
     super().__init__(**kwargs)
 ```
diff --git a/docs/guides/mcp/client.mdx b/docs/guides/mcp/client.mdx
index
be627b246..b4922c83a 100644
--- a/docs/guides/mcp/client.mdx
+++ b/docs/guides/mcp/client.mdx
@@ -3,6 +3,16 @@ title: "MCP Client"
 description: "Connect GAIA agents to external MCP servers and use their tools"
 ---
+
+ **CLI update (#977):** `gaia mcp add` and `gaia mcp remove` shown in this guide
+ have been removed. MCP servers are now defined as **connectors** and configured
+ with `gaia connectors configure --set KEY=VALUE` (secrets are stored in the
+ OS keyring; the entry is written to `~/.gaia/mcp_servers.json`). `gaia mcp list`
+ still lists configured servers. The `gaia mcp add ...` examples below predate this
+ change and are pending a rewrite — run `gaia connectors --help` for current usage,
+ and see issue #1339.
+
+
 **Connect your GAIA agent to any tool ecosystem with a single line of code.**
 - 🔌 **Universal tool integration** - Filesystem, GitHub, databases, and hundreds more
diff --git a/docs/reference/cli.mdx b/docs/reference/cli.mdx
index 0f05d12bc..94440d054 100644
--- a/docs/reference/cli.mdx
+++ b/docs/reference/cli.mdx
@@ -880,6 +880,104 @@ gaia talk -i guide.pdf --no-tts
 ---
+### Jira Command
+
+
+ Natural-language interface for Jira, Confluence, and Compass via the MCP bridge.
+
+
+```bash
+gaia jira "Create a bug report for the login issue"
+gaia jira --interactive
+```
+
+**Options:**
+
+| Option | Type | Default | Description |
+|--------|------|---------|-------------|
+| `command` | string | – | Natural-language command to execute (positional). |
+| `--interactive` | flag | false | Continuous interactive mode. |
+| `--mcp-host` | string | localhost | MCP bridge host. |
+| `--mcp-port` | integer | 8765 | MCP bridge port. |
+| `--verbose` | flag | false | Verbose output. |
+| `--debug` | flag | false | Debug logging. |
+
+[→ Full Jira Guide](/guides/jira)
+
+---
+
+### Docker Command
+
+
+ Natural-language interface for Docker containerization.
+
+
+```bash
+gaia docker "Create a Dockerfile for my Flask app"
+gaia docker "Containerize this project" --directory ./myapp
+```
+
+**Options:**
+
+| Option | Type | Default | Description |
+|--------|------|---------|-------------|
+| `command` | string | – | Natural-language command to execute (positional). |
+| `--directory` | path | current dir | Directory to analyze/containerize. |
+| `--verbose` | flag | false | Verbose output. |
+| `--debug` | flag | false | Debug logging. |
+
+[→ Full Docker Guide](/guides/docker)
+
+---
+
+### Summarize Command
+
+Summarize meeting transcripts, emails, and PDFs.
+
+```bash
+gaia summarize --input ./meeting.vtt
+gaia summarize --input ./inbox/ --type email --format both
+```
+
+**Options:**
+
+| Option | Type | Default | Description |
+|--------|------|---------|-------------|
+| `--input` | path | – | Input file or directory (required unless `--list-configs`). |
+| `--output` | path | auto | Output file/directory (auto-adjusted to format). |
+| `--type` | `transcript`\|`email`\|`pdf`\|`auto` | auto | Input type. |
+| `--format` | `json`\|`pdf`\|`email`\|`both` | json | Output format (`both` = json + pdf). |
+| `--styles` | one or more styles | executive participants action_items | `brief`, `detailed`, `bullets`, `executive`, `participants`, `action_items`, `all`. |
+
+---
+
+### Telegram Command
+
+
+ Bridge a Telegram bot to GAIA so you can chat with your agents from Telegram.
+
+
+```bash
+gaia telegram start --token --background
+gaia telegram status
+gaia telegram stop
+```
+
+**Subcommands & options:**
+
+| Subcommand | Option | Default | Description |
+|-----------|--------|---------|-------------|
+| `start` | `--token` (required) | – | Telegram bot token. |
+| `start` | `--allowed-users` | allow all | Comma-separated user IDs permitted to interact. |
+| `start` | `--background` | false | Run as a daemon (writes PID + health endpoint). |
+| `stop` | `--force` | false | Force stop if graceful shutdown fails.
|
+| `status` | `--health-host` | 127.0.0.1 | Health server host. |
+| `status` | `--health-port` | 8765 | Health server port. |
+
+[→ Telegram Adapter Guide](/guides/telegram-adapter)
+
+---
+
 ## API Server
@@ -974,30 +1072,13 @@ Configure MCP servers that your agents can connect to. Servers are saved to `~/.
 ### Commands
-#### `gaia mcp add`
-
-Add an MCP server to configuration.
-
-```bash
-gaia mcp add "" [--config PATH]
-```
-
-**Arguments:**
-- `` - Unique identifier for the server (e.g., "time", "memory")
-- `""` - Shell command to start the MCP server (must be quoted)
-
-**Options:**
-- `--config PATH` - Custom config file path (default: `~/.gaia/mcp_servers.json`)
-
-**Examples:**
-```bash
-# Add to user config (default)
-gaia mcp add time "uvx mcp-server-time"
-gaia mcp add memory "npx -y @modelcontextprotocol/server-memory"
+#### Managing MCP servers (`add` / `remove`)
-
-# Add to project config (can be committed to git)
-gaia mcp add time "uvx mcp-server-time" --config ./mcp_servers.json
-```
+
+`gaia mcp add` and `gaia mcp remove` were removed in #977 — MCP servers are now
+configured through the connectors framework. Run `gaia connectors --help` for the
+current commands. `gaia mcp list` (below) still lists configured servers.
+
 #### `gaia mcp list`
@@ -1019,29 +1100,6 @@ gaia mcp list
 gaia mcp list --config ./mcp_servers.json
 ```
-#### `gaia mcp remove`
-
-Remove an MCP server from configuration.
-
-```bash
-gaia mcp remove [--config PATH]
-```
-
-**Arguments:**
-- `` - Name of the server to remove
-
-**Options:**
-- `--config PATH` - Custom config file path (default: `~/.gaia/mcp_servers.json`)
-
-**Example:**
-```bash
-# Remove from user config
-gaia mcp remove time
-
-# Remove from project config
-gaia mcp remove memory --config ./mcp_servers.json
-```
-
 #### `gaia mcp tools`
 List tools available from a configured MCP server.
@@ -1258,35 +1316,14 @@ Use `lemonade-server list` to see all available models and their download status
 **Tools for:**
-- Ground Truth Generation
-- Automated Evaluation
-- Batch Experimentation
-- Performance Analysis
-- Transcript Testing
+- Agent eval benchmark (scenario-based, end-to-end)
+- Auto-fixing failures with Claude Code
+- Report generation
+- Performance-log visualization
 **Quick Examples:**
-Generate evaluation data:
-```bash
-gaia groundtruth -f ./data/document.html
-```
-
-Create sample experiment configuration:
-```bash
-gaia batch-experiment --create-sample-config experiments.json
-```
-
-Run systematic experiments:
-```bash
-gaia batch-experiment -c experiments.json -i ./data -o ./results
-```
-
-Evaluate results:
-```bash
-gaia eval -f ./results/experiment.json
-```
-
-Run agent eval benchmark:
+Run the agent eval benchmark:
 ```bash
 gaia eval agent # Run all scenarios
 gaia eval agent --category mcp_reliability # Run MCP reliability scenarios only
@@ -1296,14 +1333,14 @@ gaia eval agent --fix # Auto-fix failures with Clau
 gaia eval agent --compare run1/scorecard.json run2/scorecard.json # Compare runs
 ```
-Generate report:
+Generate a report:
 ```bash
 gaia report -d ./eval_results
 ```
-Launch visualizer:
+Visualize llama.cpp performance logs:
 ```bash
-gaia visualize
+gaia perf-vis ./logs/llama-server.log
 ```
 [→ Full Evaluation Guide](/reference/eval)
@@ -1409,54 +1446,33 @@ gaia eval agent --scenario-dir ~/my-project/eval-scenarios
 ---
-### Visualize Command
+### Performance Visualization
-Launch interactive web-based visualizer for comparing evaluation results.
+Plot llama.cpp server performance metrics from one or more log files. Plots are saved as images; pass `--show` to also display them interactively.
 ```bash
-gaia visualize [OPTIONS]
+gaia perf-vis LOG_PATH [LOG_PATH ...]
 [--show]
 ```
 **Options:**
 | Option | Type | Default | Description |
 |--------|------|---------|-------------|
-| `--port` | integer | 3000 | Visualizer server port |
-| `--experiments-dir` | path | ./output/experiments | Experiments directory |
-| `--evaluations-dir` | path | ./output/evaluations | Evaluations directory |
-| `--workspace` | path | current directory | Base workspace directory |
-| `--no-browser` | flag | false | Don't auto-open browser |
-| `--host` | string | localhost | Host address |
+| `log_paths` | path(s) | - | One or more llama.cpp server log files to visualize |
+| `--show` | flag | false | Display plots interactively in addition to saving images |
 **Examples:**
-```bash Default
-gaia visualize
-```
-
-```bash Custom Directories
-gaia visualize \
-  --experiments-dir ./my_experiments \
-  --evaluations-dir ./my_evaluations
+```bash Single log
+gaia perf-vis ./logs/llama-server.log
 ```
-```bash Custom Port
-gaia visualize --port 8080 --no-browser
+```bash Multiple logs + show
+gaia perf-vis ./logs/run1.log ./logs/run2.log --show
 ```
-**Features:**
-- Interactive Comparison (side-by-side)
-- Key Metrics Dashboard
-- Quality Analysis
-- Real-time Updates
-- Responsive Design
-
-
-Node.js must be installed. Dependencies are automatically installed on first run.
-
-
 ---
 ## Memory
diff --git a/docs/spec/agent-base.mdx b/docs/spec/agent-base.mdx
index c6f98d979..d120118b1 100644
--- a/docs/spec/agent-base.mdx
+++ b/docs/spec/agent-base.mdx
@@ -119,7 +119,7 @@ def __init__(
     use_chatgpt: Use ChatGPT/OpenAI API instead of local LLM
     claude_model: Model to use when use_claude=True
     base_url: Local LLM server URL (default: from LEMONADE_BASE_URL or http://localhost:13305/api/v1)
-    model_id: Model ID for local LLM (default: Qwen3.5-35B-A3B-GGUF)
+    model_id: Model ID for local LLM (default: None — resolves to DEFAULT_MODEL_NAME, currently Gemma-4-E4B-it-GGUF)
     max_steps: Maximum reasoning iterations (default: 20)
     debug_prompts: Include prompts in conversation history (default: False)
     show_prompts: Display prompts sent to LLM (default: False)
diff --git a/docs/spec/autonomous-agent-mode.md b/docs/spec/autonomous-agent-mode.md
index e37cecc5a..a277932a4 100644
--- a/docs/spec/autonomous-agent-mode.md
+++ b/docs/spec/autonomous-agent-mode.md
@@ -952,7 +952,7 @@ async def _run_step(self, trigger: AgentTrigger, session: dict):
 **Settings:**
 - [ ] `agent_mode` in `SettingsResponse` + `SettingsUpdateRequest` (default `"autonomous"`)
 - [ ] GET/PUT `/api/settings` validates `agent_mode` against `"manual"|"goal_driven"|"autonomous"`
-- [ ] `GAIA_AUTO_OBSERVE_MODEL` env var wired into AgentLoop (default `"Qwen3-0.6B-GGUF"`)
+- [ ] `GAIA_AUTO_OBSERVE_MODEL` env var wired into AgentLoop (default `"Qwen3-4B-GGUF"`)
 **Audit Log:**
 - [ ] `GET /api/agent/activity` + `GET /api/agent/activity/{id}` endpoints
diff --git a/docs/spec/mcp-client.mdx b/docs/spec/mcp-client.mdx
index 00f214a30..4109659b1 100644
--- a/docs/spec/mcp-client.mdx
+++ b/docs/spec/mcp-client.mdx
@@ -4,6 +4,15 @@ description: "Complete technical specification for MCP client components"
 icon: "sitemap"
 ---
+
+ **CLI update (#977):** `gaia mcp add` / `gaia mcp remove` referenced in this spec
+ were removed.
MCP servers are now configured through the connectors framework
+ (`gaia connectors configure --set KEY=VALUE`; the entry is persisted to
+ `~/.gaia/mcp_servers.json` with secrets in the OS keyring). `gaia mcp list` still
+ lists configured servers. Sections below using `gaia mcp add/remove` are pending a
+ rewrite — see issue #1339.
+
+
 🔧 **You are viewing:** API Specification - Complete technical reference