From 995f305d232ee3d62864fe1cf2ab9c04a60e3468 Mon Sep 17 00:00:00 2001
From: Saurabh Arora
Date: Wed, 18 Mar 2026 13:17:13 -0700
Subject: [PATCH 01/13] docs: update README and docs for v0.4.2 launch

- Standardize tool count to 50+ across all docs (was 99+/55+)
- Update install command to unscoped `altimate-code` package
- Remove stale Python/uv auto-setup claims (all-native TypeScript now)
- Update docs badge URL to docs.altimate.sh
- Remove altimate-core npm badge from README
- Add --yolo flag to CLI reference and builder mode subtext
- Add new env vars (YOLO, MEMORY, TRAINING) to CLI docs
- Add prompt enhancement keybind (leader+i) to TUI and keybinds docs
- Add tool_lookup to tools index
- Add built-in skills table (sql-review, schema-migration, pii-audit, etc.)
- Add altimate-dbt CLI section to dbt-tools.md
- Add Oracle and SQLite to warehouse lists
- Update security FAQ: replace Python engine FAQ with native engine, add sensitive_write FAQ
- Update telemetry docs to remove Python engine references
- Add v0.4.2 to README "What's New" section
- Update llms.txt URLs to docs.altimate.sh and bump version to v0.4.2

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 README.md                                     | 18 ++--
 docs/docs/configure/keybinds.md               |  6 ++
 docs/docs/configure/skills.md                 | 21 +++++-
 docs/docs/configure/telemetry.md              | 16 ++---
 docs/docs/configure/tools.md                  |  2 +-
 docs/docs/data-engineering/agent-modes.md     | 44 ++++++------
 .../data-engineering/guides/ci-headless.md    |  2 +-
 docs/docs/data-engineering/tools/dbt-tools.md | 30 +++++++++
 docs/docs/data-engineering/tools/index.md     |  5 +-
 .../data-engineering/tools/warehouse-tools.md |  3 -
 docs/docs/getting-started.md                  | 46 ++++++-------
 docs/docs/index.md                            | 16 +++--
 docs/docs/llms.txt                            | 54 +++++++--------
 docs/docs/quickstart.md                       | 18 ++---
 docs/docs/security-faq.md                     | 67 ++++++++++---------
 docs/docs/usage/cli.md                        | 16 +++++
 docs/docs/usage/tui.md                        |  1 +
 docs/docs/windows-wsl.md                      |  2 +-
 docs/mkdocs.yml                               |  2 +-
 19 files changed, 224 insertions(+), 145
deletions(-) diff --git a/README.md b/README.md index 2350a34874..350f89bb1e 100644 --- a/README.md +++ b/README.md @@ -7,19 +7,18 @@ **The open-source data engineering harness.** -The intelligence layer for data engineering AI — 99+ deterministic tools for SQL analysis, +The intelligence layer for data engineering AI — 50+ deterministic tools for SQL analysis, column-level lineage, dbt, FinOps, and warehouse connectivity across every major cloud platform. Run standalone in your terminal, embed underneath Claude Code or Codex, or integrate into CI pipelines and orchestration DAGs. Precision data tooling for any LLM. -[![npm](https://img.shields.io/npm/v/@altimateai/altimate-code)](https://www.npmjs.com/package/@altimateai/altimate-code) -[![npm](https://img.shields.io/npm/v/@altimateai/altimate-core)](https://www.npmjs.com/package/@altimateai/altimate-core) -[![npm downloads](https://img.shields.io/npm/dm/@altimateai/altimate-code)](https://www.npmjs.com/package/@altimateai/altimate-code) +[![npm](https://img.shields.io/npm/v/altimate-code)](https://www.npmjs.com/package/altimate-code) +[![npm downloads](https://img.shields.io/npm/dm/altimate-code)](https://www.npmjs.com/package/altimate-code) [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](./LICENSE) [![CI](https://github.com/AltimateAI/altimate-code/actions/workflows/ci.yml/badge.svg)](https://github.com/AltimateAI/altimate-code/actions/workflows/ci.yml) [![Slack](https://img.shields.io/badge/Slack-Join%20Community-4A154B?logo=slack)](https://altimate.ai/slack) -[![Docs](https://img.shields.io/badge/docs-altimateai.github.io-blue)](https://altimateai.github.io/altimate-code) +[![Docs](https://img.shields.io/badge/docs-docs.altimate.sh-blue)](https://docs.altimate.sh) @@ -29,7 +28,7 @@ into CI pipelines and orchestration DAGs. Precision data tooling for any LLM. 
```bash # npm (recommended) -npm install -g @altimateai/altimate-code +npm install -g altimate-code # Homebrew brew install AltimateAI/tap/altimate-code @@ -58,7 +57,7 @@ altimate /discover `/discover` auto-detects dbt projects, warehouse connections (from `~/.dbt/profiles.yml`, Docker, environment variables), and installed tools (dbt, sqlfluff, airflow, dagster, and more). Skip this and start building — you can always run it later. -> **Zero Python setup required.** On first run, the CLI automatically downloads [`uv`](https://github.com/astral-sh/uv), creates an isolated Python environment, and installs the data engine with all warehouse drivers. No `pip install`, no virtualenv management. +> **No Python required.** All tools run natively in TypeScript via `@altimateai/altimate-core` napi-rs bindings. No pip, no virtualenv, no Python installation needed. ## Why a specialized harness? @@ -162,7 +161,7 @@ Each agent has scoped permissions and purpose-built tools for its role. ## Supported Warehouses -Snowflake · BigQuery · Databricks · PostgreSQL · Redshift · DuckDB · MySQL · SQL Server +Snowflake · BigQuery · Databricks · PostgreSQL · Redshift · DuckDB · MySQL · SQL Server · Oracle · SQLite First-class support with schema indexing, query execution, and metadata introspection. SSH tunneling available for secure connections. 
@@ -222,8 +221,9 @@ Contributions welcome — docs, SQL rules, warehouse connectors, and TUI improve ## What's New +- **v0.4.2** (March 2026) — yolo mode, Python engine elimination (all-native TypeScript), tool consolidation, path sandboxing hardening, altimate-dbt CLI, unscoped npm package - **v0.4.1** (March 2026) — env-based skill selection, session caching, tracing improvements -- **v0.4.0** (Feb 2026) — data visualization skill, 99+ tools, training system +- **v0.4.0** (Feb 2026) — data visualization skill, 50+ tools, training system - **v0.3.x** — [See full changelog →](CHANGELOG.md) ## License diff --git a/docs/docs/configure/keybinds.md b/docs/docs/configure/keybinds.md index 9ce0310028..281986e765 100644 --- a/docs/docs/configure/keybinds.md +++ b/docs/docs/configure/keybinds.md @@ -74,6 +74,12 @@ Override it in your config: | `Ctrl+Z` | Undo | | `Ctrl+Shift+Z` | Redo | +### Prompt + +| Keybind | Action | +|---------|--------| +| Leader + `i` | Enhance prompt (AI-powered rewrite for clarity) | + ### Other | Keybind | Action | diff --git a/docs/docs/configure/skills.md b/docs/docs/configure/skills.md index 66801fd2b1..f70b3218f8 100644 --- a/docs/docs/configure/skills.md +++ b/docs/docs/configure/skills.md @@ -67,7 +67,26 @@ Skills are loaded from these locations (in priority order): ## Built-in Data Engineering Skills -altimate ships with built-in skills for common data engineering tasks. Skills are loaded and surfaced dynamically at runtime — type `/` in the TUI to browse what's available and get autocomplete on skill names. +altimate ships with built-in skills for common data engineering tasks. Type `/` in the TUI to browse what's available and get autocomplete on skill names. 
+ +| Skill | Description | +|-------|-------------| +| `/sql-review` | SQL quality gate — lint 26 anti-patterns, validate syntax, check safety | +| `/sql-translate` | Cross-dialect SQL translation | +| `/schema-migration` | Schema migration planning and execution | +| `/pii-audit` | PII detection and compliance audits | +| `/cost-report` | Snowflake FinOps analysis | +| `/lineage-diff` | Column-level lineage comparison | +| `/query-optimize` | Query optimization suggestions | +| `/data-viz` | Interactive data visualization and dashboards | +| `/dbt-develop` | dbt model development and scaffolding | +| `/dbt-test` | dbt test generation | +| `/dbt-docs` | dbt documentation generation | +| `/dbt-analyze` | dbt project analysis | +| `/dbt-troubleshoot` | dbt issue diagnosis | +| `/teach` | Teach patterns from example files | +| `/train` | Learn standards from documents/style guides | +| `/training-status` | Dashboard of all learned knowledge | For custom skills, see [Adding Custom Skills](#adding-custom-skills) below. 
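The skills in the table above surface as slash commands in the TUI. A hypothetical `/sql-review` exchange might look like the following sketch (the file name, finding labels, and output layout are illustrative assumptions, not taken from the product's actual output):

```
> /sql-review models/staging/stg_orders.sql

  ✓ Syntax valid
  ✗ 2 anti-patterns found:
    1. SELECT_STAR (line 3): enumerate only the columns used downstream
    2. FUNCTION_IN_FILTER (line 18): filter on a pre-computed date column
```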
diff --git a/docs/docs/configure/telemetry.md b/docs/docs/configure/telemetry.md index 84b9acaa0c..bf27af30fb 100644 --- a/docs/docs/configure/telemetry.md +++ b/docs/docs/configure/telemetry.md @@ -11,17 +11,17 @@ We collect the following categories of events: | `session_start` | A new CLI session begins | | `session_end` | A CLI session ends (includes duration) | | `session_forked` | A session is forked from an existing one | -| `generation` | An AI model generation completes (model ID, token counts, duration — no prompt content) | -| `tool_call` | A tool is invoked (tool name and category — no arguments or output) | -| `bridge_call` | A Python engine RPC call completes (method name and duration — no arguments) | +| `generation` | An AI model generation completes (model ID, token counts, duration, but no prompt content) | +| `tool_call` | A tool is invoked (tool name and category, but no arguments or output) | +| `bridge_call` | A native tool call completes (method name and duration, but no arguments) | | `command` | A CLI command is executed (command name only) | | `error` | An unhandled error occurs (error type and truncated message — no stack traces) | | `auth_login` | Authentication succeeds or fails (provider and method — no credentials) | | `auth_logout` | A user logs out (provider only) | | `mcp_server_status` | An MCP server connects, disconnects, or errors (server name and transport) | | `provider_error` | An AI provider returns an error (error type and HTTP status — no request content) | -| `engine_started` | The Python engine starts or restarts (version and duration) | -| `engine_error` | The Python engine fails to start (phase and truncated error) | +| `engine_started` | The native tool engine initializes (version and duration) | +| `engine_error` | The native tool engine fails to start (phase and truncated error) | | `upgrade_attempted` | A CLI upgrade is attempted (version and method) | | `permission_denied` | A tool permission is denied (tool name 
and source) | | `doom_loop_detected` | A repeated tool call pattern is detected (tool name and count) | @@ -47,9 +47,9 @@ No events are ever written to disk — if the process is killed before the final Telemetry helps us: - **Detect errors** — identify crashes, provider failures, and engine issues before users report them -- **Improve reliability** — track MCP server stability, engine startup success rates, and upgrade outcomes +- **Improve reliability** — track MCP server stability, engine initialization, and upgrade outcomes - **Understand usage patterns** — know which tools and features are used so we can prioritize development -- **Measure performance** — track generation latency, engine startup time, and bridge call duration +- **Measure performance** — track generation latency, tool call duration, and startup time ## Disabling Telemetry @@ -103,7 +103,7 @@ Event type names use **snake_case** with a `domain_action` pattern: - `auth_login`, `auth_logout` — authentication events - `mcp_server_status`, `mcp_server_census` — MCP server lifecycle -- `engine_started`, `engine_error` — Python engine events +- `engine_started`, `engine_error` — native engine events - `provider_error` — AI provider errors - `session_forked` — session lifecycle - `environment_census` — environment snapshot events diff --git a/docs/docs/configure/tools.md b/docs/docs/configure/tools.md index 9069b54c21..cda9b2a321 100644 --- a/docs/docs/configure/tools.md +++ b/docs/docs/configure/tools.md @@ -24,7 +24,7 @@ altimate includes built-in tools that agents use to interact with your codebase ## Data Engineering Tools -In addition to built-in tools, altimate provides 55+ specialized data engineering tools. See the [Data Engineering Tools](../data-engineering/tools/index.md) section for details. +In addition to built-in tools, altimate provides 50+ specialized data engineering tools. See the [Data Engineering Tools](../data-engineering/tools/index.md) section for details. 
## Tool Permissions diff --git a/docs/docs/data-engineering/agent-modes.md b/docs/docs/data-engineering/agent-modes.md index 6290e16760..afbb9e5adb 100644 --- a/docs/docs/data-engineering/agent-modes.md +++ b/docs/docs/data-engineering/agent-modes.md @@ -20,11 +20,13 @@ altimate runs in one of seven specialized modes. Each mode has different permiss altimate --agent builder ``` +> Tip: `--yolo` auto-approves permission prompts for faster iteration (`altimate --yolo --agent builder`). Not recommended with live warehouse connections. Use on local/dev environments only. See [Permissions: Yolo Mode](../configure/permissions.md#yolo-mode). + Builder mode follows a strict pre-execution protocol for every SQL operation: -1. `sql_analyze` — Check for anti-patterns -2. `sql_validate` — Verify syntax and schema references -3. `sql_execute` — Run the query +1. `sql_analyze` to check for anti-patterns +2. `sql_validate` to verify syntax and schema references +3. `sql_execute` to run the query ### Example: Create a staging model @@ -87,10 +89,10 @@ altimate --agent analyst Analyst mode enforces strict guardrails: -- **SELECT only** — INSERT, UPDATE, DELETE, DROP are blocked by altimate-core -- **Cost-first** — Every query gets a cost prediction before execution -- **LIMIT enforced** — Large scans are automatically limited -- **No file writes** — Cannot modify your codebase +- **SELECT only.** INSERT, UPDATE, DELETE, DROP are blocked by altimate-core +- **Cost-first.** Every query gets a cost prediction before execution +- **LIMIT enforced.** Large scans are automatically limited +- **No file writes.** Cannot modify your codebase ### Example: Investigate expensive queries @@ -105,9 +107,9 @@ Analyst: Let me pull the query history and credit analysis. Daily trend: ↑ 15% vs last week > finops_expensive_queries prod-snowflake --days 7 --limit 5 - 1. SELECT * FROM events JOIN users... — 89.3 credits, 47 runs - 2. CREATE TABLE AS SELECT... — 45.1 credits, 1 run - 3. 
SELECT COUNT(DISTINCT...) — 23.7 credits, 312 runs + 1. SELECT * FROM events JOIN users... (89.3 credits, 47 runs) + 2. CREATE TABLE AS SELECT... (45.1 credits, 1 run) + 3. SELECT COUNT(DISTINCT...) (23.7 credits, 312 runs) The #1 cost driver is a query running 47 times that scans the full events table with a SELECT *. Let me analyze it: @@ -118,7 +120,7 @@ events table with a SELECT *. Let me analyze it: Recommendations: 1. Replace SELECT * with only the 6 columns actually used downstream -2. Add a date filter — the events table has 2.1B rows but most queries +2. Add a date filter, since the events table has 2.1B rows but most queries only need the last 7 days 3. Consider materializing this as a view with the filter baked in @@ -165,8 +167,8 @@ Validator: Severity: medium (type change may affect downstream) Issues found: -1. FUNCTION_IN_FILTER on line 23 — use a pre-computed date column instead -2. Type change from FLOAT to DECIMAL(18,2) — verify downstream consumers +1. FUNCTION_IN_FILTER on line 23: use a pre-computed date column instead +2. Type change from FLOAT to DECIMAL(18,2): verify downstream consumers 3. Missing test coverage for the new revenue_category column ``` @@ -207,7 +209,7 @@ Migrator: > lineage_check [source SQL] > lineage_check [translated SQL] - ✓ Lineage preserved — same columns, same sources + ✓ Lineage preserved, same columns, same sources ``` ### Supported translation paths @@ -233,10 +235,10 @@ altimate --agent researcher Researcher mode follows a 4-phase protocol for thorough investigation: -1. **Plan** — Outline questions, data sources, and tool sequence -2. **Gather** — Execute investigation steps, parallelize where possible -3. **Analyze** — Cross-reference findings, identify root causes -4. **Report** — Produce structured report with evidence and recommendations +1. **Plan.** Outline questions, data sources, and tool sequence +2. **Gather.** Execute investigation steps, parallelize where possible +3. 
**Analyze.** Cross-reference findings, identify root causes +4. **Report.** Produce structured report with evidence and recommendations ### Example: Investigate a cost spike @@ -271,8 +273,8 @@ Researcher: Let me plan the investigation. Costs increased 38% due to two changes deployed 3 days ago. ## Key Findings - 1. fct_attribution model (new) scans 2.1B events rows without partition filter — 45 credits/week - 2. Partition filter was accidentally removed from stg_events in commit def5678 — caused 47 full scans + 1. fct_attribution model (new) scans 2.1B events rows without partition filter, costing 45 credits/week + 2. Partition filter was accidentally removed from stg_events in commit def5678, causing 47 full scans ## Recommendations 1. [ ] Add date partition filter to fct_attribution (estimated savings: 40 credits/week) @@ -306,7 +308,7 @@ Researcher: Let me plan the investigation. altimate --agent trainer ``` -Trainer mode helps you systematically teach the agent about your project. It cannot modify files — it only reads, explores, and manages training. +Trainer mode helps you systematically teach the agent about your project. It cannot modify files. It only reads, explores, and manages training. 
### Example: Teach from a correction diff --git a/docs/docs/data-engineering/guides/ci-headless.md b/docs/docs/data-engineering/guides/ci-headless.md index 11d29da1af..772608577e 100644 --- a/docs/docs/data-engineering/guides/ci-headless.md +++ b/docs/docs/data-engineering/guides/ci-headless.md @@ -83,7 +83,7 @@ jobs: - uses: actions/checkout@v4 - name: Install altimate - run: npm install -g @altimateai/altimate-code + run: npm install -g altimate-code - name: Run cost report env: diff --git a/docs/docs/data-engineering/tools/dbt-tools.md b/docs/docs/data-engineering/tools/dbt-tools.md index 2b07099901..1e72b44508 100644 --- a/docs/docs/data-engineering/tools/dbt-tools.md +++ b/docs/docs/data-engineering/tools/dbt-tools.md @@ -72,6 +72,36 @@ Source Freshness: --- +## altimate-dbt CLI + +`altimate-dbt` is a standalone CLI for dbt workflows. It auto-detects your dbt project directory, Python environment, and adapter type (Snowflake, BigQuery, Databricks, Redshift, etc.). + +```bash +# Initialize dbt integration +altimate-dbt init + +# Diagnose issues +altimate-dbt doctor + +# Run dbt commands +altimate-dbt compile +altimate-dbt build +altimate-dbt run +altimate-dbt test + +# Utilities +altimate-dbt execute "SELECT 1" # Run a query via dbt adapter +altimate-dbt columns my_model # List model columns +altimate-dbt graph # View lineage/DAG +altimate-dbt deps # Manage dependencies +``` + +All commands provide friendly error diagnostics with actionable fix suggestions when something goes wrong. + +> **Tip:** In builder mode, the agent prefers `altimate-dbt` over the raw `dbt_run` tool for better error handling and auto-detection. 
+ +--- + ## dbt Skills ### /generate-tests diff --git a/docs/docs/data-engineering/tools/index.md b/docs/docs/data-engineering/tools/index.md index b8993ee5f8..ae944bfe4f 100644 --- a/docs/docs/data-engineering/tools/index.md +++ b/docs/docs/data-engineering/tools/index.md @@ -1,6 +1,6 @@ # Tools Reference -altimate has 99+ specialized tools organized by function. +altimate has 50+ specialized tools organized by function. | Category | Tools | Purpose | |---|---|---| @@ -8,9 +8,10 @@ altimate has 99+ specialized tools organized by function. | [Schema Tools](schema-tools.md) | 7 tools | Inspection, search, PII detection, tagging, diffing | | [FinOps Tools](finops-tools.md) | 8 tools | Cost analysis, warehouse sizing, unused resources, RBAC | | [Lineage Tools](lineage-tools.md) | 1 tool | Column-level lineage tracing with confidence scoring | -| [dbt Tools](dbt-tools.md) | 2 tools + 6 skills | Run, manifest parsing, test generation, scaffolding | +| [dbt Tools](dbt-tools.md) | 2 tools + 5 skills | Run, manifest parsing, test generation, scaffolding, `altimate-dbt` CLI | | [Warehouse Tools](warehouse-tools.md) | 6 tools | Environment scanning, connection management, discovery, testing | | [Altimate Memory](memory-tools.md) | 3 tools | Persistent cross-session memory for warehouse config, conventions, and preferences | | [Training](../training/index.md) | 3 tools + 3 skills | Correct the agent once, it remembers forever, your team inherits it | +| `tool_lookup` | 1 tool | Runtime introspection — discover tool schemas and parameters dynamically | All tools are available in the interactive TUI. The agent automatically selects the right tools based on your request. 
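The `tool_lookup` entry added to the tools index above enables runtime introspection of tool schemas. As a sketch of how that might surface in the TUI (the invocation syntax, parameter names, and output shape are illustrative assumptions, not documented behavior):

```
> tool_lookup sql_analyze

  sql_analyze (category: SQL Tools)
    Detects anti-patterns in a SQL query before execution.
    Parameters:
      sql      string, required
      dialect  string, optional (e.g. snowflake, bigquery)
```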
diff --git a/docs/docs/data-engineering/tools/warehouse-tools.md b/docs/docs/data-engineering/tools/warehouse-tools.md index adaa76daf7..2505f0f282 100644 --- a/docs/docs/data-engineering/tools/warehouse-tools.md +++ b/docs/docs/data-engineering/tools/warehouse-tools.md @@ -9,9 +9,6 @@ Scan the entire data engineering environment in one call. Detects dbt projects, # Environment Scan -## Python Engine -✓ Engine healthy - ## Git Repository ✓ Git repo on branch `main` (origin: github.com/org/analytics) diff --git a/docs/docs/getting-started.md b/docs/docs/getting-started.md index e4d74fcc1f..75ebab958c 100644 --- a/docs/docs/getting-started.md +++ b/docs/docs/getting-started.md @@ -4,7 +4,7 @@ ## Why altimate? -altimate is the open-source data engineering harness — 99+ deterministic tools for building, validating, optimizing, and shipping data products. Unlike general-purpose coding agents, every tool is purpose-built for data engineering: +altimate is the open-source data engineering harness with 50+ deterministic tools for building, validating, optimizing, and shipping data products. Unlike general-purpose coding agents, every tool is purpose-built for data engineering: | Capability | General coding agents | altimate | |---|---|---| @@ -42,7 +42,7 @@ Then in the TUI: This walks you through selecting and authenticating with an LLM provider (Anthropic, OpenAI, Bedrock, Codex, Ollama, etc.). You need a working LLM connection before the agent can do anything useful. -## Step 3: Configure Your Warehouse +## Step 3: Configure Your Warehouse _(Optional)_ Set up warehouse connections so altimate can query your data platform. You have two options: @@ -54,11 +54,11 @@ Set up warehouse connections so altimate can query your data platform. You have `/discover` scans your environment and sets up everything automatically: -1. **Detects your dbt project** — finds `dbt_project.yml`, parses the manifest, and reads profiles -2. 
**Discovers warehouse connections** — from `~/.dbt/profiles.yml`, running Docker containers, and environment variables (e.g. `SNOWFLAKE_ACCOUNT`, `PGHOST`, `DATABASE_URL`) -3. **Checks installed tools** — dbt, sqlfluff, airflow, dagster, prefect, soda, sqlmesh, great_expectations, sqlfmt -4. **Offers to configure connections** — walks you through adding and testing each discovered warehouse -5. **Indexes schemas** — populates the schema cache for autocomplete and context-aware analysis +1. **Detects your dbt project** by finding `dbt_project.yml`, parsing the manifest, and reading profiles +2. **Discovers warehouse connections** from `~/.dbt/profiles.yml`, running Docker containers, and environment variables (e.g. `SNOWFLAKE_ACCOUNT`, `PGHOST`, `DATABASE_URL`) +3. **Checks installed tools** including dbt, sqlfluff, airflow, dagster, prefect, soda, sqlmesh, great_expectations, sqlfmt +4. **Offers to configure connections** and walks you through adding and testing each discovered warehouse +5. **Indexes schemas** to populate the schema cache for autocomplete and context-aware analysis Once complete, altimate indexes your schemas and detects your tooling, enabling schema-aware autocomplete and context-rich analysis. @@ -136,13 +136,13 @@ altimate offers specialized agent modes for different workflows: | What do you want to do? 
| Use this agent mode | |---|---| -| Analyzing data without risk of changes | **Analyst** — read-only queries, cost analysis, data profiling | -| Building or generating dbt models | **Builder** — model scaffolding, SQL generation, ref() wiring | -| Validating data quality | **Validator** — test generation, anomaly detection, data contracts | -| Migrating across warehouses | **Migrator** — cross-dialect SQL translation, compatibility checks | -| Teaching team conventions | **Trainer** — learns corrections, enforces naming/style rules across team | -| Research and exploration | **Researcher** — deep-dive analysis, lineage tracing, impact assessment | -| Executive summaries and reports | **Executive** — high-level overviews, cost summaries, health dashboards | +| Analyzing data without risk of changes | **Analyst** for read-only queries, cost analysis, data profiling | +| Building or generating dbt models | **Builder** for model scaffolding, SQL generation, ref() wiring | +| Validating data quality | **Validator** for test generation, anomaly detection, data contracts | +| Migrating across warehouses | **Migrator** for cross-dialect SQL translation, compatibility checks | +| Teaching team conventions | **Trainer**, which learns corrections and enforces naming/style rules across team | +| Research and exploration | **Researcher** for deep-dive analysis, lineage tracing, impact assessment | +| Executive summaries and reports | **Executive** for high-level overviews, cost summaries, health dashboards | Switch modes in the TUI: @@ -215,7 +215,7 @@ altimate uses a JSON config file. Create `altimate-code.json` in your project ro } ``` -Or use Application Default Credentials (ADC) — just omit `service_account` and run `gcloud auth application-default login`. +Or use Application Default Credentials (ADC). Just omit `service_account` and run `gcloud auth application-default login`. 
### Databricks @@ -290,7 +290,7 @@ If you have a ChatGPT Plus/Pro subscription, you can use Codex as your LLM backe 1. Run `/connect` in the TUI 2. Select **Codex** as your provider 3. Authenticate via browser OAuth -4. Your subscription covers all usage — no API keys needed +4. Your subscription covers all usage, so no API keys are needed ## Verify your setup @@ -344,10 +344,10 @@ Generate data quality tests for all models in the marts/ directory. For each mod ## Next steps -- [Terminal UI](usage/tui.md) — Learn the terminal interface, keybinds, and slash commands -- [CLI](usage/cli.md) — Subcommands, flags, and environment variables -- [Config Files](configure/config.md) — Full config file reference -- [Providers](configure/providers.md) — Set up Anthropic, OpenAI, Bedrock, Ollama, and more -- [Agent Modes](data-engineering/agent-modes.md) — Builder, Analyst, Validator, Migrator, Researcher, Trainer -- [Training](data-engineering/training/index.md) — Correct the agent once, it remembers forever, your team inherits it -- [Tools](data-engineering/tools/sql-tools.md) — 99+ specialized tools for SQL, dbt, and warehouses +- [Terminal UI](usage/tui.md): Learn the terminal interface, keybinds, and slash commands +- [CLI](usage/cli.md): Subcommands, flags, and environment variables +- [Config Files](configure/config.md): Full config file reference +- [Providers](configure/providers.md): Set up Anthropic, OpenAI, Bedrock, Ollama, and more +- [Agent Modes](data-engineering/agent-modes.md): Builder, Analyst, Validator, Migrator, Researcher, Trainer +- [Training](data-engineering/training/index.md): Correct the agent once, it remembers forever, your team inherits it +- [Tools](data-engineering/tools/sql-tools.md): 50+ specialized tools for SQL, dbt, and warehouses diff --git a/docs/docs/index.md b/docs/docs/index.md index 3abd9c34cf..2ad5a8db45 100644 --- a/docs/docs/index.md +++ b/docs/docs/index.md @@ -17,7 +17,7 @@ hide:

The open-source data engineering harness.

-

99+ tools for building, validating, optimizing, and shipping data products. Use in your terminal, CI pipeline, orchestration DAGs, or as the harness for your data agents. Evaluate across any platform — independent of a single warehouse provider.

+

50+ tools for building, validating, optimizing, and shipping data products. Use in your terminal, CI pipeline, orchestration DAGs, or as the harness for your data agents. Evaluate across any platform — independent of a single warehouse provider.

@@ -39,7 +39,7 @@ npm install -g altimate-code ---

Purpose-built for the data product lifecycle

-

Every tool covers a specific stage — build, validate, optimize, or ship. Not general-purpose AI on top of SQL files.

+

Every tool covers a specific stage: build, validate, optimize, or ship. Not general-purpose AI on top of SQL files.

@@ -92,7 +92,7 @@ npm install -g altimate-code --- - Interactive TUI with 99+ tools, autocomplete for skills, and persistent memory across sessions. + Interactive TUI with 50+ tools, autocomplete for skills, and persistent memory across sessions. - :material-pipe-disconnected:{ .lg .middle } **CI Pipeline** @@ -168,7 +168,7 @@ npm install -g altimate-code ---

Works with any LLM

-

Model-agnostic — bring your own provider or run locally.

+

Model-agnostic. Bring your own provider or run locally.

@@ -185,7 +185,7 @@ npm install -g altimate-code ---

Evaluate across any platform

-

First-class support for 8 warehouses. Migrate, compare, and translate across platforms — not locked to one vendor.

+

First-class support for 10 databases. Migrate, compare, and translate across platforms, not locked to one vendor.

@@ -197,6 +197,8 @@ npm install -g altimate-code - :material-duck: **DuckDB** - :material-database: **MySQL** - :material-microsoft: **SQL Server** +- :material-database-outline: **Oracle** +- :material-database-search: **SQLite**
@@ -204,8 +206,8 @@ npm install -g altimate-code diff --git a/docs/docs/llms.txt b/docs/docs/llms.txt index 70eaa8733f..da7c6c4d55 100644 --- a/docs/docs/llms.txt +++ b/docs/docs/llms.txt @@ -1,42 +1,42 @@ # altimate-code llms.txt # AI-friendly documentation index for altimate-code -# Generated: 2026-03-17 | Version: v0.4.1 -# Source: https://altimateai.github.io/altimate-code +# Generated: 2026-03-18 | Version: v0.4.2 +# Source: https://docs.altimate.sh -> altimate-code is an open-source data engineering harness — 99+ tools for building, validating, optimizing, and shipping data products. Use in your terminal, CI pipeline, orchestration DAGs, or as the tool layer for your data agents. Includes a deterministic SQL Intelligence Engine (100% F1 across 1,077 queries), column-level lineage, FinOps analysis, PII detection, and dbt integration. Works with any LLM provider. Local-first, MIT-licensed. +> altimate-code is an open-source data engineering harness — 50+ tools for building, validating, optimizing, and shipping data products. Use in your terminal, CI pipeline, orchestration DAGs, or as the tool layer for your data agents. Includes a deterministic SQL Intelligence Engine (100% F1 across 1,077 queries), column-level lineage, FinOps analysis, PII detection, and dbt integration. Works with any LLM provider. Local-first, MIT-licensed. ## Get Started -- [Quickstart (5 min)](https://altimateai.github.io/altimate-code/quickstart/): Install altimate, configure your LLM provider, connect your warehouse, and run your first query in under 5 minutes. -- [Full Setup Guide](https://altimateai.github.io/altimate-code/getting-started/): Complete installation, warehouse configuration for all 8 supported warehouses, LLM provider setup, and first-run walkthrough. -- [Network & Proxy](https://altimateai.github.io/altimate-code/network/): Proxy configuration, CA certificate setup, firewall requirements. 
+- [Quickstart (5 min)](https://docs.altimate.sh/quickstart/): Install altimate, configure your LLM provider, connect your warehouse, and run your first query in under 5 minutes.
+- [Full Setup Guide](https://docs.altimate.sh/getting-started/): Complete installation, warehouse configuration for all 10 supported warehouses, LLM provider setup, and first-run walkthrough.
+- [Network & Proxy](https://docs.altimate.sh/network/): Proxy configuration, CA certificate setup, firewall requirements.

## Data Engineering

-- [Agent Modes](https://altimateai.github.io/altimate-code/data-engineering/agent-modes/): 7 specialized agents — Builder (full read/write), Analyst (read-only enforced), Validator, Migrator, Researcher, Trainer, Executive — each with scoped permissions and purpose-built tool access.
-- [Training Overview](https://altimateai.github.io/altimate-code/data-engineering/training/): How to teach altimate project-specific patterns, naming conventions, and corrections that persist across sessions and team members.
-- [Team Deployment](https://altimateai.github.io/altimate-code/data-engineering/training/team-deployment/): How to commit training to git so your entire team inherits SQL conventions automatically.
-- [SQL Tools](https://altimateai.github.io/altimate-code/data-engineering/tools/sql-tools/): 9 SQL analysis tools with 19 anti-pattern rules. 100% F1 accuracy on 1,077 benchmark queries.
-- [Schema Tools](https://altimateai.github.io/altimate-code/data-engineering/tools/schema-tools/): Warehouse schema introspection, metadata indexing, and column-level analysis tools.
-- [FinOps Tools](https://altimateai.github.io/altimate-code/data-engineering/tools/finops-tools/): Credit analysis, expensive query detection, warehouse right-sizing, unused resource cleanup, RBAC auditing.
-- [Lineage Tools](https://altimateai.github.io/altimate-code/data-engineering/tools/lineage-tools/): Column-level lineage extraction from SQL. 100% edge-match accuracy on 500 benchmark queries.
-- [dbt Tools](https://altimateai.github.io/altimate-code/data-engineering/tools/dbt-tools/): dbt manifest parsing, test generation, model scaffolding, incremental logic detection.
-- [Warehouse Tools](https://altimateai.github.io/altimate-code/data-engineering/tools/warehouse-tools/): Direct connectivity to Snowflake, BigQuery, Databricks, PostgreSQL, Redshift, DuckDB, MySQL, SQL Server.
-- [Memory Tools](https://altimateai.github.io/altimate-code/data-engineering/tools/memory-tools/): Session memory, persistent corrections, and team training storage.
-- [Cost Optimization Guide](https://altimateai.github.io/altimate-code/data-engineering/guides/cost-optimization/): Step-by-step warehouse cost reduction with before/after SQL examples and savings estimates.
-- [Migration Guide](https://altimateai.github.io/altimate-code/data-engineering/guides/migration/): Cross-warehouse SQL migration with side-by-side examples.
-- [CI & Headless Mode](https://altimateai.github.io/altimate-code/data-engineering/guides/ci-headless/): Non-interactive use in GitHub Actions, scheduled jobs, and pre-commit hooks.
+- [Agent Modes](https://docs.altimate.sh/data-engineering/agent-modes/): 7 specialized agents — Builder (full read/write), Analyst (read-only enforced), Validator, Migrator, Researcher, Trainer, Executive — each with scoped permissions and purpose-built tool access.
+- [Training Overview](https://docs.altimate.sh/data-engineering/training/): How to teach altimate project-specific patterns, naming conventions, and corrections that persist across sessions and team members.
+- [Team Deployment](https://docs.altimate.sh/data-engineering/training/team-deployment/): How to commit training to git so your entire team inherits SQL conventions automatically.
+- [SQL Tools](https://docs.altimate.sh/data-engineering/tools/sql-tools/): 9 SQL analysis tools with 19 anti-pattern rules. 100% F1 accuracy on 1,077 benchmark queries.
+- [Schema Tools](https://docs.altimate.sh/data-engineering/tools/schema-tools/): Warehouse schema introspection, metadata indexing, and column-level analysis tools.
+- [FinOps Tools](https://docs.altimate.sh/data-engineering/tools/finops-tools/): Credit analysis, expensive query detection, warehouse right-sizing, unused resource cleanup, RBAC auditing.
+- [Lineage Tools](https://docs.altimate.sh/data-engineering/tools/lineage-tools/): Column-level lineage extraction from SQL. 100% edge-match accuracy on 500 benchmark queries.
+- [dbt Tools](https://docs.altimate.sh/data-engineering/tools/dbt-tools/): dbt manifest parsing, test generation, model scaffolding, incremental logic detection.
+- [Warehouse Tools](https://docs.altimate.sh/data-engineering/tools/warehouse-tools/): Direct connectivity to Snowflake, BigQuery, Databricks, PostgreSQL, Redshift, DuckDB, MySQL, SQL Server, Oracle, SQLite.
+- [Memory Tools](https://docs.altimate.sh/data-engineering/tools/memory-tools/): Session memory, persistent corrections, and team training storage.
+- [Cost Optimization Guide](https://docs.altimate.sh/data-engineering/guides/cost-optimization/): Step-by-step warehouse cost reduction with before/after SQL examples and savings estimates.
+- [Migration Guide](https://docs.altimate.sh/data-engineering/guides/migration/): Cross-warehouse SQL migration with side-by-side examples.
+- [CI & Headless Mode](https://docs.altimate.sh/data-engineering/guides/ci-headless/): Non-interactive use in GitHub Actions, scheduled jobs, and pre-commit hooks.

## Configure

-- [Configuration Overview](https://altimateai.github.io/altimate-code/configure/config/): Full altimate-code.json schema, value substitution, project structure, experimental flags.
-- [Providers](https://altimateai.github.io/altimate-code/configure/providers/): 17 LLM provider configurations with JSON examples: Anthropic, OpenAI, Google Gemini, Vertex AI, Amazon Bedrock, Azure OpenAI, Mistral, Groq, Ollama, and more.
-- [Agent Skills](https://altimateai.github.io/altimate-code/configure/skills/): How to configure, discover, and add custom skills. -- [Permissions](https://altimateai.github.io/altimate-code/configure/permissions/): Permission levels, pattern matching, per-agent restrictions, deny rules for destructive SQL. -- [Tracing](https://altimateai.github.io/altimate-code/configure/tracing/): Local-first observability — trace schema, span types, live viewing, remote OTLP exporters, crash recovery. -- [Telemetry](https://altimateai.github.io/altimate-code/configure/telemetry/): 25 anonymized event types, privacy guarantees, opt-out instructions. +- [Configuration Overview](https://docs.altimate.sh/configure/config/): Full altimate-code.json schema, value substitution, project structure, experimental flags. +- [Providers](https://docs.altimate.sh/configure/providers/): 17 LLM provider configurations with JSON examples: Anthropic, OpenAI, Google Gemini, Vertex AI, Amazon Bedrock, Azure OpenAI, Mistral, Groq, Ollama, and more. +- [Agent Skills](https://docs.altimate.sh/configure/skills/): How to configure, discover, and add custom skills. +- [Permissions](https://docs.altimate.sh/configure/permissions/): Permission levels, pattern matching, per-agent restrictions, deny rules for destructive SQL. +- [Tracing](https://docs.altimate.sh/configure/tracing/): Local-first observability — trace schema, span types, live viewing, remote OTLP exporters, crash recovery. +- [Telemetry](https://docs.altimate.sh/configure/telemetry/): 25 anonymized event types, privacy guarantees, opt-out instructions. ## Reference -- [Security FAQ](https://altimateai.github.io/altimate-code/security-faq/): 12 Q&A pairs on data handling, credentials, permissions, network endpoints, and team hardening. -- [Troubleshooting](https://altimateai.github.io/altimate-code/troubleshooting/): 6 common issues with step-by-step fixes, including Python bridge failures and warehouse connection errors. 
+- [Security FAQ](https://docs.altimate.sh/security-faq/): 12 Q&A pairs on data handling, credentials, permissions, network endpoints, and team hardening. +- [Troubleshooting](https://docs.altimate.sh/troubleshooting/): 6 common issues with step-by-step fixes, including tool execution errors and warehouse connection setup. diff --git a/docs/docs/quickstart.md b/docs/docs/quickstart.md index a0292ffaad..056539560a 100644 --- a/docs/docs/quickstart.md +++ b/docs/docs/quickstart.md @@ -1,5 +1,5 @@ --- -description: "Install altimate-code and run your first SQL analysis. The open-source data engineering harness — 99+ tools for building, validating, optimizing, and shipping data products." +description: "Install altimate-code and run your first SQL analysis. The open-source data engineering harness with 50+ tools for building, validating, optimizing, and shipping data products." --- # Quickstart @@ -8,21 +8,21 @@ description: "Install altimate-code and run your first SQL analysis. The open-so --- -## Step 1 — Install +## Step 1: Install ```bash # npm (recommended) -npm install -g @altimateai/altimate-code +npm install -g altimate-code # Homebrew brew install AltimateAI/tap/altimate-code ``` -> **Zero Python setup required.** On first run, the CLI automatically downloads `uv`, creates an isolated Python environment, and installs the data engine. No `pip install`, no virtualenv management. +> **No Python required.** All tools run natively in TypeScript via `@altimateai/altimate-core` napi-rs bindings. No pip, no virtualenv, no Python installation needed. --- -## Step 2 — Configure Your LLM +## Step 2: Configure Your LLM ```bash altimate # Launch the TUI @@ -48,11 +48,11 @@ Minimal config file option (`altimate-code.json` in your project root): } ``` -> **No API key?** Select **Codex** in the `/connect` menu — it's a built-in provider with no setup required. +> **No API key?** Select **Codex** in the `/connect` menu. It's a built-in provider with no setup required. 
--- -## Step 3 — Connect Your Warehouse _(Optional)_ +## Step 3: Connect Your Warehouse _(Optional)_ > Skip this step if you want to work locally or don't need warehouse/orchestration connections. You can always run `/discover` later. @@ -60,7 +60,7 @@ Minimal config file option (`altimate-code.json` in your project root): altimate /discover ``` -`/discover` scans for dbt projects, warehouse credentials (from `~/.dbt/profiles.yml`, environment variables, and Docker), and installed tools. It **reads but never writes** — safe to run against production. +Auto-detects your dbt projects, warehouse credentials, and installed tools. See [Full Setup](getting-started.md#step-3-configure-your-warehouse) for details on what `/discover` finds and manual configuration options. **No cloud warehouse?** Use DuckDB with a local file: @@ -77,7 +77,7 @@ altimate /discover --- -## Step 4 — Build Your First Artifact +## Step 4: Build Your First Artifact In the TUI, try these prompts or describe your own use case: diff --git a/docs/docs/security-faq.md b/docs/docs/security-faq.md index 2abe309340..a94ba8a084 100644 --- a/docs/docs/security-faq.md +++ b/docs/docs/security-faq.md @@ -19,7 +19,7 @@ Altimate Code needs database credentials to connect to your warehouse. Credentia ## What can the agent actually execute? -Altimate Code can read files, write files, and run shell commands — but only with your permission. The [permission system](configure/permissions.md) lets you control every tool: +Altimate Code can read files, write files, and run shell commands, but only with your permission. The [permission system](configure/permissions.md) lets you control every tool: | Level | Behavior | |-------|----------| @@ -96,9 +96,9 @@ No other outbound connections are made. See the [Network reference](network.md) Yes, with constraints. You need: -1. **A locally accessible LLM** — self-hosted model or a provider reachable from your network -2. 
**Model catalog disabled** — set `ALTIMATE_CLI_DISABLE_MODELS_FETCH=true` or provide a local models file -3. **Telemetry disabled** — set `ALTIMATE_TELEMETRY_DISABLED=true` +1. **A locally accessible LLM**, either a self-hosted model or a provider reachable from your network +2. **Model catalog disabled** by setting `ALTIMATE_CLI_DISABLE_MODELS_FETCH=true` or providing a local models file +3. **Telemetry disabled** by setting `ALTIMATE_TELEMETRY_DISABLED=true` ```bash export ALTIMATE_CLI_DISABLE_MODELS_FETCH=true @@ -108,7 +108,7 @@ export ALTIMATE_CLI_MODELS_PATH=/path/to/models.json ## What telemetry is collected? -Anonymous usage telemetry — event names, token counts, timing, and error types. **Never** code, queries, credentials, file paths, or prompt content. See the full [Telemetry reference](configure/telemetry.md) for the complete event list. +Anonymous usage telemetry, including event names, token counts, timing, and error types. **Never** code, queries, credentials, file paths, or prompt content. See the full [Telemetry reference](configure/telemetry.md) for the complete event list. Disable telemetry entirely: @@ -130,8 +130,8 @@ export ALTIMATE_TELEMETRY_DISABLED=true When you run `altimate auth login `, the CLI fetches `/.well-known/altimate-code` to discover the server's auth command. Before executing anything: -1. **Validation** — The auth command must be an array of strings. Malformed or unexpected types are rejected. -2. **Confirmation prompt** — You are shown the exact command and must explicitly approve it before it runs. +1. **Validation.** The auth command must be an array of strings. Malformed or unexpected types are rejected. +2. **Confirmation prompt.** You are shown the exact command and must explicitly approve it before it runs. ``` $ altimate auth login https://mcp.example.com @@ -152,16 +152,21 @@ MCP (Model Context Protocol) servers extend Altimate Code with additional tools. !!! 
warning Third-party MCP servers are not reviewed or audited by Altimate. Treat them like any other third-party dependency — review the source, check for updates, and limit their access. -## How does the Python engine work? Is it safe? +## How does the SQL analysis engine work? -The Python engine (`altimate_engine`) runs as a local subprocess, communicating with the CLI over JSON-RPC via stdio. It: +As of v0.4.2, all 73 tool methods run natively in TypeScript via `@altimateai/altimate-core` (Rust napi-rs bindings). There is no Python dependency. The engine executes in-process with no subprocess, no network port, and no external service. -- Runs under your user account with your permissions -- Has no network access beyond what your warehouse connections require -- Restarts automatically if it crashes (max 2 restarts) -- Times out after 30 seconds per call +## What is `sensitive_write` protection? -The engine is not exposed on any network port — it only communicates through stdin/stdout pipes with the parent CLI process. +Altimate Code classifies writes to credential-adjacent files as `sensitive_write` operations. These always trigger a confirmation prompt, even if `write` is set to `"allow"` in your config. Protected patterns include: + +- **Environment files** such as `.env`, `.env.local`, `.env.production`, `.env.staging` +- **Credential files** such as `credentials.json`, `service-account.json`, `.npmrc`, `.pypirc`, `.netrc`, `.pgpass` +- **Secret key directories** such as `.ssh/`, `.aws/`, `.gnupg/`, `.gcloud/`, `.kube/`, `.docker/` +- **Private key extensions** such as `*.pem`, `*.key`, `*.p12`, `*.pfx` +- **Version control** files such as `.git/config`, `.git/hooks/*` + +You can approve per-file with "Allow always" to reduce prompt fatigue. The approval persists for your current session only. On macOS and Windows, matching is case-insensitive. ## Does Altimate Code store conversation history? @@ -170,22 +175,22 @@ Yes. 
Altimate Code persists session data locally on your machine: - **Session messages** are stored in a local SQLite database so you can resume, review, and revert conversations. - **Prompt history** (your recent inputs) is saved to `~/.state/prompt-history.jsonl` for command-line recall. -This data **never** leaves your machine — it is not sent to any service or included in telemetry. You can delete it at any time by removing the local database and history files. +This data **never** leaves your machine. It is not sent to any service or included in telemetry. You can delete it at any time by removing the local database and history files. !!! note Your LLM provider may have its own data retention policies. Check your provider's terms to understand how they handle API requests. ## How do I secure Altimate Code in a team environment? -1. **Use project-level config** — Place `altimate-code.json` in your project root with appropriate permission defaults. This ensures consistent security settings across the team. +1. **Use project-level config.** Place `altimate-code.json` in your project root with appropriate permission defaults. This ensures consistent security settings across the team. -2. **Restrict dangerous operations** — Deny destructive SQL and shell commands at the project level so individual users can't accidentally bypass them. +2. **Restrict dangerous operations.** Deny destructive SQL and shell commands at the project level so individual users can't accidentally bypass them. -3. **Use environment variables for secrets** — Never commit credentials. Use `ALTIMATE_CLI_PYTHON`, warehouse connection env vars, and your cloud provider's secret management. +3. **Use environment variables for secrets.** Never commit credentials. Use warehouse connection env vars and your cloud provider's secret management. -4. **Review MCP servers** — Maintain a list of approved MCP servers. 
Don't let individual developers add arbitrary servers to shared configurations. +4. **Review MCP servers.** Maintain a list of approved MCP servers. Don't let individual developers add arbitrary servers to shared configurations. -5. **Lock down agent permissions** — Give each agent only the permissions it needs. The `analyst` agent doesn't need `write` access. The `builder` agent doesn't need `DROP` permissions. +5. **Lock down agent permissions.** Give each agent only the permissions it needs. The `analyst` agent doesn't need `write` access. The `builder` agent doesn't need `DROP` permissions. ## Can AI-generated SQL damage my database? @@ -202,12 +207,12 @@ For additional safety: Altimate Code includes several layers of protection to keep the agent within your project: -- **Project boundary enforcement** — File operations check that paths stay within your project directory (or git worktree for monorepos). Attempts to read or write outside the project trigger an `external_directory` permission prompt. -- **Symlink-aware path resolution** — Symlinks inside the project that point outside are detected and blocked. This prevents an agent from reading or writing outside your project through symlinks. -- **Path traversal blocking** — Paths containing `../` sequences that would escape the project are rejected with an "Access denied" error. -- **Sensitive file protection** — Writing to credential files (`.env`, `.ssh/`, `.aws/`, private keys) triggers a confirmation prompt, even inside the project. See [below](#why-am-i-being-prompted-to-edit-env-files) for details. -- **Bash command analysis** — The bash tool parses commands with tree-sitter to detect file operations (`rm`, `cp`, `mv`, etc.) targeting paths outside your project, and prompts for permission. -- **Non-git project safety** — For projects outside a git repository, the boundary is strictly the working directory (not the entire filesystem). 
+- **Project boundary enforcement.** File operations check that paths stay within your project directory (or git worktree for monorepos). Attempts to read or write outside the project trigger an `external_directory` permission prompt. +- **Symlink-aware path resolution.** Symlinks inside the project that point outside are detected and blocked. This prevents an agent from reading or writing outside your project through symlinks. +- **Path traversal blocking.** Paths containing `../` sequences that would escape the project are rejected with an "Access denied" error. +- **Sensitive file protection.** Writing to credential files (`.env`, `.ssh/`, `.aws/`, private keys) triggers a confirmation prompt, even inside the project. See [below](#why-am-i-being-prompted-to-edit-env-files) for details. +- **Bash command analysis.** The bash tool parses commands with tree-sitter to detect file operations (`rm`, `cp`, `mv`, etc.) targeting paths outside your project, and prompts for permission. +- **Non-git project safety.** For projects outside a git repository, the boundary is strictly the working directory (not the entire filesystem). These protections operate at the application level. For additional isolation, you can run Altimate Code inside a Docker container or VM. @@ -225,13 +230,13 @@ Altimate Code prompts before modifying files that commonly contain credentials o When you see this prompt: -- **"Allow once"** — approves this single edit -- **"Allow always"** — approves edits to this specific file for the rest of the session (resets on restart) +- **"Allow once"** approves this single edit +- **"Allow always"** approves edits to this specific file for the rest of the session (resets on restart) If you frequently edit `.env` files and find the prompts disruptive, click "Allow always" on the first prompt for each file — you won't be asked again for that file during your session. !!! tip - This protection does **not** block reading these files — only writing. 
The agent can still read your `.env` to understand configuration without prompting. + This protection does **not** block reading these files, only writing. The agent can still read your `.env` to understand configuration without prompting. ## What commands are blocked or prompted by default? @@ -254,9 +259,9 @@ To override defaults, add rules in `altimate-code.json`. See [Permissions](confi ## Best practices for staying safe -1. **Review before approving.** The permission prompt shows you exactly what will happen — diffs for file edits, the full command for bash. Take a moment to read it. +1. **Review before approving.** The permission prompt shows you exactly what will happen, including diffs for file edits and the full command for bash. Take a moment to read it. -2. **Work on a branch.** Let the agent work on a feature branch so you can review changes before merging. Git gives you a full safety net — this is the single most effective protection. +2. **Work on a branch.** Let the agent work on a feature branch so you can review changes before merging. Git gives you a full safety net. This is the single most effective protection. 3. **Use per-agent permissions.** Give each agent only what it needs. The `analyst` agent doesn't need write access. See [Permissions](configure/permissions.md) for examples. 
diff --git a/docs/docs/usage/cli.md b/docs/docs/usage/cli.md index 45e2a50118..a3fb7b72fc 100644 --- a/docs/docs/usage/cli.md +++ b/docs/docs/usage/cli.md @@ -45,6 +45,7 @@ altimate --agent analyst |------|------------| | `--model ` | Override the default model | | `--agent ` | Start with a specific agent | +| `--yolo` | Auto-approve all permission prompts (explicit `deny` rules still enforced) | | `--print-logs` | Print logs to stderr | | `--log-level ` | Set log level: `DEBUG`, `INFO`, `WARN`, `ERROR` | | `--help`, `-h` | Show help | @@ -85,6 +86,21 @@ Configuration can be controlled via environment variables: | `ALTIMATE_CLI_SERVER_PASSWORD` | Server HTTP basic auth password | | `ALTIMATE_CLI_PERMISSION` | Permission config as JSON | +### Permissions & Safety + +| Variable | Description | +|----------|------------| +| `ALTIMATE_CLI_YOLO` | Auto-approve all permission prompts (`true`/`false`). Explicit `deny` rules still enforced. | +| `OPENCODE_YOLO` | Fallback for `ALTIMATE_CLI_YOLO`. When both are set, `ALTIMATE_CLI_YOLO` takes precedence. | + +### Memory & Training + +| Variable | Description | +|----------|------------| +| `ALTIMATE_DISABLE_MEMORY` | Disable the persistent memory system | +| `ALTIMATE_MEMORY_AUTO_EXTRACT` | Auto-extract memories at session end | +| `ALTIMATE_DISABLE_TRAINING` | Disable the AI teammate training system | + ### Experimental | Variable | Description | diff --git a/docs/docs/usage/tui.md b/docs/docs/usage/tui.md index a30be554a2..ef3fa44129 100644 --- a/docs/docs/usage/tui.md +++ b/docs/docs/usage/tui.md @@ -34,6 +34,7 @@ The leader key (default: `Ctrl+X`) gives access to all TUI keybindings. 
Press le | `s` | Toggle sidebar | | `t` | List themes | | `m` | List models | +| `i` | Enhance prompt (rewrite with AI for clarity) | | `a` | List agents | | `k` | List keybinds | | `q` | Quit | diff --git a/docs/docs/windows-wsl.md b/docs/docs/windows-wsl.md index 0367a64436..036201f822 100644 --- a/docs/docs/windows-wsl.md +++ b/docs/docs/windows-wsl.md @@ -8,7 +8,7 @@ You can install and run altimate directly in PowerShell or Command Prompt withou ```powershell # PowerShell or CMD — install globally -npm install -g @altimateai/altimate-code +npm install -g altimate-code # Launch altimate diff --git a/docs/mkdocs.yml b/docs/mkdocs.yml index cb96740091..bdb628c0e2 100644 --- a/docs/mkdocs.yml +++ b/docs/mkdocs.yml @@ -1,5 +1,5 @@ site_name: altimate-code -site_description: The open-source data engineering harness. 99+ tools for building, validating, optimizing, and shipping data products. +site_description: The open-source data engineering harness. 50+ tools for building, validating, optimizing, and shipping data products. site_url: https://docs.altimate.sh repo_url: https://github.com/AltimateAI/altimate-code repo_name: AltimateAI/altimate-code From d6e07c49f96d605047865d2fc26e44f71f083d3a Mon Sep 17 00:00:00 2001 From: Saurabh Arora Date: Wed, 18 Mar 2026 13:20:13 -0700 Subject: [PATCH 02/13] docs: simplify zero-setup messaging in README and quickstart Co-Authored-By: Claude Opus 4.6 (1M context) --- README.md | 2 +- docs/docs/quickstart.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 350f89bb1e..b0cb9dd6b9 100644 --- a/README.md +++ b/README.md @@ -57,7 +57,7 @@ altimate /discover `/discover` auto-detects dbt projects, warehouse connections (from `~/.dbt/profiles.yml`, Docker, environment variables), and installed tools (dbt, sqlfluff, airflow, dagster, and more). Skip this and start building — you can always run it later. 
-> **No Python required.** All tools run natively in TypeScript via `@altimateai/altimate-core` napi-rs bindings. No pip, no virtualenv, no Python installation needed. +> **Zero setup.** One `npm install` and you're ready. No Python, no pip, no virtualenv. ## Why a specialized harness? diff --git a/docs/docs/quickstart.md b/docs/docs/quickstart.md index 056539560a..1dcbd6e452 100644 --- a/docs/docs/quickstart.md +++ b/docs/docs/quickstart.md @@ -18,7 +18,7 @@ npm install -g altimate-code brew install AltimateAI/tap/altimate-code ``` -> **No Python required.** All tools run natively in TypeScript via `@altimateai/altimate-core` napi-rs bindings. No pip, no virtualenv, no Python installation needed. +> **Zero setup.** One `npm install` and you're ready. No Python, no pip, no virtualenv. --- From 22bacc109b77527a5f3d84f7a2b9a06816322604 Mon Sep 17 00:00:00 2001 From: Saurabh Arora Date: Wed, 18 Mar 2026 13:20:47 -0700 Subject: [PATCH 03/13] docs: simplify install callout to "zero additional setup" Co-Authored-By: Claude Opus 4.6 (1M context) --- README.md | 2 +- docs/docs/quickstart.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index b0cb9dd6b9..4324a1e0f1 100644 --- a/README.md +++ b/README.md @@ -57,7 +57,7 @@ altimate /discover `/discover` auto-detects dbt projects, warehouse connections (from `~/.dbt/profiles.yml`, Docker, environment variables), and installed tools (dbt, sqlfluff, airflow, dagster, and more). Skip this and start building — you can always run it later. -> **Zero setup.** One `npm install` and you're ready. No Python, no pip, no virtualenv. +> **Zero additional setup.** One command install — no Python, no pip, no virtualenv. ## Why a specialized harness? 
diff --git a/docs/docs/quickstart.md b/docs/docs/quickstart.md index 1dcbd6e452..dfa5bfd33f 100644 --- a/docs/docs/quickstart.md +++ b/docs/docs/quickstart.md @@ -18,7 +18,7 @@ npm install -g altimate-code brew install AltimateAI/tap/altimate-code ``` -> **Zero setup.** One `npm install` and you're ready. No Python, no pip, no virtualenv. +> **Zero additional setup.** One command install — no Python, no pip, no virtualenv. --- From b41ffb0f7f187af97023c32f521eef97708af989 Mon Sep 17 00:00:00 2001 From: Saurabh Arora Date: Wed, 18 Mar 2026 13:21:07 -0700 Subject: [PATCH 04/13] docs: trim install callout Co-Authored-By: Claude Opus 4.6 (1M context) --- README.md | 2 +- docs/docs/quickstart.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 4324a1e0f1..ee2330f1d7 100644 --- a/README.md +++ b/README.md @@ -57,7 +57,7 @@ altimate /discover `/discover` auto-detects dbt projects, warehouse connections (from `~/.dbt/profiles.yml`, Docker, environment variables), and installed tools (dbt, sqlfluff, airflow, dagster, and more). Skip this and start building — you can always run it later. -> **Zero additional setup.** One command install — no Python, no pip, no virtualenv. +> **Zero additional setup.** One command install. ## Why a specialized harness? diff --git a/docs/docs/quickstart.md b/docs/docs/quickstart.md index dfa5bfd33f..a47ef7d63c 100644 --- a/docs/docs/quickstart.md +++ b/docs/docs/quickstart.md @@ -18,7 +18,7 @@ npm install -g altimate-code brew install AltimateAI/tap/altimate-code ``` -> **Zero additional setup.** One command install — no Python, no pip, no virtualenv. +> **Zero additional setup.** One command install. 
--- From 3529cd51139ab019e251c55cd36d44ec75210fc7 Mon Sep 17 00:00:00 2001 From: Saurabh Arora Date: Wed, 18 Mar 2026 13:30:43 -0700 Subject: [PATCH 05/13] docs: remove em-dashes, fix pill spacing, simplify /discover duplication - Replace 304 em-dashes across 38 docs files with natural sentence structures (colons, commas, periods, split sentences) to avoid AI-generated content appearance - Fix pill-grid CSS: increase gap/padding, add responsive breakpoints at 768px and 480px for reliable scaling across viewport sizes - Simplify quickstart /discover step to brief description + link to Full Setup; add (Optional) marker to getting-started warehouse step Co-Authored-By: Claude Opus 4.6 (1M context) --- docs/docs/assets/css/extra.css | 41 ++++++++++++++--- docs/docs/configure/agents.md | 4 +- docs/docs/configure/commands.md | 10 ++-- docs/docs/configure/config.md | 2 +- docs/docs/configure/context-management.md | 22 ++++----- docs/docs/configure/permissions.md | 16 +++---- docs/docs/configure/providers.md | 4 +- docs/docs/configure/rules.md | 14 +++--- docs/docs/configure/skills.md | 2 +- docs/docs/configure/telemetry.md | 46 +++++++++---------- docs/docs/configure/tools.md | 6 +-- docs/docs/configure/tracing.md | 26 +++++------ .../data-engineering/guides/ci-headless.md | 8 ++-- .../guides/cost-optimization.md | 6 +-- .../docs/data-engineering/guides/migration.md | 10 ++-- .../guides/using-with-codex.md | 2 +- docs/docs/data-engineering/tools/dbt-tools.md | 8 ++-- .../data-engineering/tools/finops-tools.md | 28 +++++------ docs/docs/data-engineering/tools/index.md | 2 +- .../data-engineering/tools/memory-tools.md | 26 +++++------ .../data-engineering/tools/schema-tools.md | 22 ++++----- docs/docs/data-engineering/tools/sql-tools.md | 24 +++++----- docs/docs/data-engineering/training/index.md | 10 ++-- .../training/team-deployment.md | 16 +++---- docs/docs/develop/ecosystem.md | 12 ++--- docs/docs/develop/plugins.md | 10 ++-- docs/docs/develop/sdk.md | 10 ++-- 
docs/docs/develop/server.md | 12 ++--- docs/docs/drivers.md | 32 ++++++------- docs/docs/index.md | 6 +-- docs/docs/llms.txt | 6 +-- docs/docs/quickstart.md | 8 ++-- docs/docs/security-faq.md | 6 +-- docs/docs/troubleshooting.md | 8 ++-- docs/docs/usage/cli.md | 2 +- docs/docs/usage/tui.md | 12 ++--- docs/docs/usage/web.md | 2 +- docs/docs/windows-wsl.md | 4 +- 38 files changed, 256 insertions(+), 229 deletions(-) diff --git a/docs/docs/assets/css/extra.css b/docs/docs/assets/css/extra.css index 5c7c736fa8..4a62c26c0d 100644 --- a/docs/docs/assets/css/extra.css +++ b/docs/docs/assets/css/extra.css @@ -92,7 +92,7 @@ /* --- Feature cards --- */ .grid.cards > ul > li { border-radius: 8px; - padding: 0.8rem !important; + padding: 1rem !important; transition: box-shadow 0.2s ease, transform 0.2s ease; } @@ -112,15 +112,18 @@ /* --- Pill grid (LLM providers, warehouses) --- */ .pill-grid { - max-width: 600px; - margin: 0 auto; + width: 100%; + max-width: 640px; + margin: 1rem auto; + padding: 0 1rem; + box-sizing: border-box; } .pill-grid ul { display: flex; flex-wrap: wrap; justify-content: center; - gap: 0.45rem; + gap: 0.6rem 0.75rem; list-style: none; padding: 0; margin: 0; @@ -129,14 +132,14 @@ .pill-grid ul li { display: inline-flex; align-items: center; - gap: 0.3rem; - padding: 0.4rem 0.85rem; + gap: 0.4rem; + padding: 0.5rem 1.15rem; border-radius: 100px; font-size: 0.8rem; border: 1px solid var(--md-default-fg-color--lightest); color: var(--md-default-fg-color--light); white-space: nowrap; - margin: 0; + flex-shrink: 0; } .pill-grid ul li .twemoji { @@ -147,6 +150,30 @@ border-color: rgba(255, 255, 255, 0.12); } +/* Responsive pill sizing */ +@media (max-width: 768px) { + .pill-grid { + max-width: 100%; + padding: 0 0.75rem; + } + + .pill-grid ul { + gap: 0.5rem 0.6rem; + } + + .pill-grid ul li { + padding: 0.45rem 0.9rem; + font-size: 0.75rem; + } +} + +@media (max-width: 480px) { + .pill-grid ul li { + padding: 0.4rem 0.8rem; + font-size: 0.72rem; + } +} + /* 
--- Doc links footer --- */ .doc-links { text-align: center; diff --git a/docs/docs/configure/agents.md b/docs/docs/configure/agents.md index d111acb46e..1e476ce3af 100644 --- a/docs/docs/configure/agents.md +++ b/docs/docs/configure/agents.md @@ -9,8 +9,8 @@ Agents define different AI personas with specific models, prompts, permissions, | Agent | Description | |-------|------------| | `general` | Default general-purpose coding agent | -| `plan` | Planning agent — analyzes before acting | -| `build` | Build-focused agent — prioritizes code generation | +| `plan` | Planning agent that analyzes before acting | +| `build` | Build-focused agent that prioritizes code generation | | `explore` | Read-only exploration agent | ### Data Engineering diff --git a/docs/docs/configure/commands.md b/docs/docs/configure/commands.md index d7b75aced5..06af3d7e36 100644 --- a/docs/docs/configure/commands.md +++ b/docs/docs/configure/commands.md @@ -8,7 +8,7 @@ altimate ships with four built-in slash commands: |---------|-------------| | `/init` | Create or update an AGENTS.md file with build commands and code style guidelines. | | `/discover` | Scan your data stack and set up warehouse connections. Detects dbt projects, warehouse connections from profiles/Docker/env vars, installed tools, and config files. Walks you through adding and testing new connections, then indexes schemas. | -| `/review` | Review changes — accepts `commit`, `branch`, or `pr` as an argument (defaults to uncommitted changes). | +| `/review` | Review changes. Accepts `commit`, `branch`, or `pr` as an argument (defaults to uncommitted changes). | | `/feedback` | Submit product feedback as a GitHub issue. Guides you through title, category, description, and optional session context. | ### `/discover` @@ -35,10 +35,10 @@ The recommended way to set up a new data engineering project. Run `/discover` in Submit product feedback directly from the CLI. The agent walks you through: -1. 
**Title** — a short summary of your feedback -2. **Category** — bug, feature, improvement, or ux -3. **Description** — detailed explanation -4. **Session context** (opt-in) — includes working directory name and session ID for debugging +1. **Title**: a short summary of your feedback +2. **Category**: bug, feature, improvement, or ux +3. **Description**: a detailed explanation +4. **Session context** (opt-in): includes working directory name and session ID for debugging ``` /feedback # start the guided feedback flow ``` diff --git a/docs/docs/configure/config.md b/docs/docs/configure/config.md index a66b8a6633..2bbf97b174 100644 --- a/docs/docs/configure/config.md +++ b/docs/docs/configure/config.md @@ -149,7 +149,7 @@ Control how context is managed when conversations grow long: |-------|---------|-------------| | `auto` | `true` | Auto-compact when context is full | | `prune` | `true` | Prune old tool outputs | -| `reserved` | — | Token buffer to reserve | +| `reserved` | (none) | Token buffer to reserve | !!! info Compaction automatically summarizes older messages to free up context window space, allowing longer conversations without losing important context. See [Context Management](context-management.md) for full details. diff --git a/docs/docs/configure/context-management.md b/docs/docs/configure/context-management.md index 805da651eb..19cb216ab5 100644 --- a/docs/docs/configure/context-management.md +++ b/docs/docs/configure/context-management.md @@ -1,14 +1,14 @@ # Context Management -altimate automatically manages conversation context so you can work through long sessions without hitting model limits. When a conversation grows large, the CLI summarizes older messages, prunes stale tool outputs, and recovers from provider overflow errors — all without losing the important details of your work. +altimate automatically manages conversation context so you can work through long sessions without hitting model limits.
When a conversation grows large, the CLI summarizes older messages, prunes stale tool outputs, and recovers from provider overflow errors, all without losing the important details of your work. ## How It Works Every LLM has a finite context window. As you work, each message, tool call, and tool result adds tokens to the conversation. When the conversation approaches the model's limit, altimate takes action: -1. **Prune** — Old tool outputs (file reads, command results, query results) are replaced with compact summaries -2. **Compact** — The entire conversation history is summarized into a continuation prompt -3. **Continue** — The agent picks up where it left off using the summary +1. **Prune.** Old tool outputs (file reads, command results, query results) are replaced with compact summaries +2. **Compact.** The entire conversation history is summarized into a continuation prompt +3. **Continue.** The agent picks up where it left off using the summary This happens automatically by default. You do not need to manually manage context. @@ -38,7 +38,7 @@ When a tool output is pruned, it is replaced with a brief fingerprint: [Tool output cleared — read_file(file: src/main.ts) returned 42 lines, 1.2 KB — "import { App } from './app'"] ``` -This tells the model what tool was called, what arguments were used, how much output it produced, and the first line of the result — enough to maintain continuity without consuming tokens. +This tells the model what tool was called, what arguments were used, how much output it produced, and the first line of the result. That is enough to maintain continuity without consuming tokens. **Pruning rules:** @@ -51,12 +51,12 @@ This tells the model what tool was called, what arguments were used, how much ou Compaction is aware of data engineering workflows. 
When summarizing a conversation, the compaction prompt preserves: -- **Warehouse connections** — which databases or warehouses are connected -- **Schema context** — discovered tables, columns, and relationships -- **dbt project state** — models, sources, tests, and project structure -- **Lineage findings** — upstream and downstream dependencies -- **Query patterns** — SQL dialects, anti-patterns, and optimization opportunities -- **FinOps context** — cost findings and warehouse sizing recommendations +- **Warehouse connections**, including which databases or warehouses are connected +- **Schema context**, including discovered tables, columns, and relationships +- **dbt project state**, including models, sources, tests, and project structure +- **Lineage findings**, including upstream and downstream dependencies +- **Query patterns**, including SQL dialects, anti-patterns, and optimization opportunities +- **FinOps context**, including cost findings and warehouse sizing recommendations This means you can run a long data exploration session and compaction will not lose track of what schemas you discovered, what dbt models you were working with, or what cost optimizations you identified. diff --git a/docs/docs/configure/permissions.md b/docs/docs/configure/permissions.md index 93d8407821..4d577c91ad 100644 --- a/docs/docs/configure/permissions.md +++ b/docs/docs/configure/permissions.md @@ -49,7 +49,7 @@ For tools that accept arguments (like `bash`), use pattern matching: } ``` -Patterns are matched in order — **last matching rule wins**. Use `*` as a wildcard. Place your catch-all `"*"` rule first and more specific rules after it. +Patterns are matched in order, and the **last matching rule wins**. Use `*` as a wildcard. Place your catch-all `"*"` rule first and more specific rules after it. For example, with `"*": "ask"` first and `"rm *": "deny"` after it, all `rm` commands are denied while everything else prompts. 
If you put `"*": "ask"` last, it would override the deny rule. @@ -125,7 +125,7 @@ export ALTIMATE_CLI_YOLO=true altimate-code run "analyze my queries" ``` -The fallback `OPENCODE_YOLO` env var is also supported. When both are set, `ALTIMATE_CLI_YOLO` takes precedence — setting it to `false` disables yolo even if `OPENCODE_YOLO=true`. +The fallback `OPENCODE_YOLO` env var is also supported. When both are set, `ALTIMATE_CLI_YOLO` takes precedence. Setting it to `false` disables yolo even if `OPENCODE_YOLO=true`. **Safety:** Explicit `deny` rules in your config are still enforced. Deny rules throw an error *before* any permission prompt is created, so yolo mode never sees them. If you've denied `rm *` or `DROP *`, those remain blocked even with `--yolo`. @@ -133,7 +133,7 @@ When yolo mode is active in the TUI, a `△ YOLO` indicator appears in the foote ## Recommended Configurations -### Data Engineering (Default — Balanced) +### Data Engineering (Default, Balanced) A good starting point for most data engineering workflows. Allows safe read operations, prompts for writes and commands: @@ -230,11 +230,11 @@ Give each agent only the permissions it needs: When the agent wants to use a tool, the permission system evaluates your rules in order: -1. **Config rules** — from `altimate-code.json` -2. **Agent-level rules** — per-agent overrides -3. **Session approvals** — patterns you've approved with "Allow always" during the current session +1. **Config rules** from `altimate-code.json` +2. **Agent-level rules** for per-agent overrides +3. **Session approvals** for patterns you've approved with "Allow always" during the current session -If a rule matches, it applies. If no rule matches, the default is `"ask"` — you'll be prompted. +If a rule matches, it applies. If no rule matches, the default is `"ask"`, which means you'll be prompted. 
When prompted, you have three choices: @@ -251,4 +251,4 @@ When prompted, you have three choices: - **Start with `"ask"` and relax as you build confidence.** You can always approve patterns with "Allow always" during a session. - **Use `"deny"` for truly dangerous commands** like `rm *`, `DROP *`, `git push --force *`, and `git reset --hard *`. These are blocked even if other rules would allow them. - **Use per-agent permissions** to enforce least-privilege. An analyst doesn't need write access. A builder doesn't need `DROP`. -- **Review the prompt before approving.** The TUI shows you exactly what will run — including diffs for file edits and the full command for bash operations. +- **Review the prompt before approving.** The TUI shows you exactly what will run, including diffs for file edits and the full command for bash operations. diff --git a/docs/docs/configure/providers.md b/docs/docs/configure/providers.md index 0b73800508..a62e96d9e6 100644 --- a/docs/docs/configure/providers.md +++ b/docs/docs/configure/providers.md @@ -69,7 +69,7 @@ Available models: `claude-opus-4-6`, `claude-sonnet-4-6`, `claude-haiku-4-5-2025 Uses the standard AWS credential chain. Set `AWS_PROFILE` or provide credentials directly. !!! note - If you have AWS SSO or IAM roles configured, Bedrock will use your default credential chain automatically — no explicit keys needed. + If you have AWS SSO or IAM roles configured, Bedrock will use your default credential chain automatically, so no explicit keys are needed. ## Azure OpenAI @@ -143,7 +143,7 @@ If `location` is not set, it defaults to `us-central1`. } ``` -No API key needed — runs entirely on your local machine. +No API key needed. Runs entirely on your local machine. !!! info Make sure Ollama is running before starting altimate. Install it from [ollama.com](https://ollama.com) and pull your desired model with `ollama pull llama3.1`. 
diff --git a/docs/docs/configure/rules.md b/docs/docs/configure/rules.md index 892e916418..f2b5e6751e 100644 --- a/docs/docs/configure/rules.md +++ b/docs/docs/configure/rules.md @@ -6,9 +6,9 @@ Rules are instructions that guide agent behavior. They are loaded automatically altimate looks for instruction files in these locations: -- `AGENTS.md` — Primary instruction file (searched up directory tree) -- `CLAUDE.md` — Fallback instruction file -- `.altimate-code/AGENTS.md` — Project-specific instructions +- `AGENTS.md`: Primary instruction file (searched up directory tree) +- `CLAUDE.md`: Fallback instruction file +- `.altimate-code/AGENTS.md`: Project-specific instructions - Custom patterns via the `instructions` config field !!! tip @@ -31,9 +31,9 @@ Specify additional instruction sources in your config: Patterns support: -- **Glob patterns** — `*.md`, `docs/**/*.md` -- **URLs** — fetched at startup -- **Relative paths** — resolved from project root +- **Glob patterns** such as `*.md`, `docs/**/*.md` +- **URLs**, which are fetched at startup +- **Relative paths**, which are resolved from project root ## Writing Effective Rules @@ -57,7 +57,7 @@ This is a dbt project for our analytics warehouse on Snowflake. ``` !!! example "Tips for effective rules" - - Be specific and actionable — vague rules get ignored + - Be specific and actionable, since vague rules get ignored - Include project-specific terminology and conventions - Reference file paths and commands that agents should use - Keep rules concise; overly long instructions dilute focus diff --git a/docs/docs/configure/skills.md b/docs/docs/configure/skills.md index f70b3218f8..f12428c03d 100644 --- a/docs/docs/configure/skills.md +++ b/docs/docs/configure/skills.md @@ -71,7 +71,7 @@ altimate ships with built-in skills for common data engineering tasks. 
Type `/` | Skill | Description | |-------|-------------| -| `/sql-review` | SQL quality gate — lint 26 anti-patterns, validate syntax, check safety | +| `/sql-review` | SQL quality gate that lints 26 anti-patterns, validates syntax, and checks safety | | `/sql-translate` | Cross-dialect SQL translation | | `/schema-migration` | Schema migration planning and execution | | `/pii-audit` | PII detection and compliance audits | diff --git a/docs/docs/configure/telemetry.md b/docs/docs/configure/telemetry.md index bf27af30fb..0aefe92620 100644 --- a/docs/docs/configure/telemetry.md +++ b/docs/docs/configure/telemetry.md @@ -15,11 +15,11 @@ We collect the following categories of events: | `tool_call` | A tool is invoked (tool name and category, but no arguments or output) | | `bridge_call` | A native tool call completes (method name and duration, but no arguments) | | `command` | A CLI command is executed (command name only) | -| `error` | An unhandled error occurs (error type and truncated message — no stack traces) | -| `auth_login` | Authentication succeeds or fails (provider and method — no credentials) | +| `error` | An unhandled error occurs (error type and truncated message, but no stack traces) | +| `auth_login` | Authentication succeeds or fails (provider and method, but no credentials) | | `auth_logout` | A user logs out (provider only) | | `mcp_server_status` | An MCP server connects, disconnects, or errors (server name and transport) | -| `provider_error` | An AI provider returns an error (error type and HTTP status — no request content) | +| `provider_error` | An AI provider returns an error (error type and HTTP status, but no request content) | | `engine_started` | The native tool engine initializes (version and duration) | | `engine_error` | The native tool engine fails to start (phase and truncated error) | | `upgrade_attempted` | A CLI upgrade is attempted (version and method) | @@ -27,11 +27,11 @@ We collect the following categories of events: | 
`doom_loop_detected` | A repeated tool call pattern is detected (tool name and count) | | `compaction_triggered` | Context compaction runs (strategy and token counts) | | `tool_outputs_pruned` | Tool outputs are pruned during compaction (count) | -| `environment_census` | Environment snapshot on project scan (warehouse types, dbt presence, feature flags — no hostnames) | +| `environment_census` | Environment snapshot on project scan (warehouse types, dbt presence, feature flags, but no hostnames) | | `context_utilization` | Context window usage per generation (token counts, utilization percentage, cache hit ratio) | | `agent_outcome` | Agent session outcome (agent type, tool/generation counts, cost, outcome status) | | `error_recovered` | Successful recovery from a transient error (error type, strategy, attempt count) | -| `mcp_server_census` | MCP server capabilities after connect (tool and resource counts — no tool names) | +| `mcp_server_census` | MCP server capabilities after connect (tool and resource counts, but no tool names) | | `context_overflow_recovered` | Context overflow is handled (strategy) | Each event includes a timestamp, anonymous session ID, and the CLI version. @@ -40,16 +40,16 @@ Each event includes a timestamp, anonymous session ID, and the CLI version. Telemetry events are buffered in memory and flushed periodically. If a flush fails (e.g., due to a transient network error), events are re-added to the buffer for one retry. On process exit, the CLI performs a final flush to avoid losing events from the current session. -No events are ever written to disk — if the process is killed before the final flush, buffered events are lost. This is by design to minimize on-disk footprint. +No events are ever written to disk. If the process is killed before the final flush, buffered events are lost. This is by design to minimize on-disk footprint. 
## Why We Collect Telemetry Telemetry helps us: -- **Detect errors** — identify crashes, provider failures, and engine issues before users report them -- **Improve reliability** — track MCP server stability, engine initialization, and upgrade outcomes -- **Understand usage patterns** — know which tools and features are used so we can prioritize development -- **Measure performance** — track generation latency, tool call duration, and startup time +- **Detect errors** by identifying crashes, provider failures, and engine issues before users report them +- **Improve reliability** by tracking MCP server stability, engine initialization, and upgrade outcomes +- **Understand usage patterns** to know which tools and features are used so we can prioritize development +- **Measure performance** by tracking generation latency, tool call duration, and startup time ## Disabling Telemetry @@ -79,7 +79,7 @@ We take your privacy seriously. Altimate Code telemetry **never** collects: - Code content, file contents, or file paths - Credentials, API keys, or tokens - Database connection strings or hostnames -- Personally identifiable information (your email is SHA-256 hashed before sending — used only for anonymous user correlation) +- Personally identifiable information (your email is SHA-256 hashed before sending and is used only for anonymous user correlation) - Tool arguments or outputs - AI prompt content or responses @@ -101,21 +101,21 @@ For a complete list of network endpoints, see the [Network Reference](../network Event type names use **snake_case** with a `domain_action` pattern: -- `auth_login`, `auth_logout` — authentication events -- `mcp_server_status`, `mcp_server_census` — MCP server lifecycle -- `engine_started`, `engine_error` — native engine events -- `provider_error` — AI provider errors -- `session_forked` — session lifecycle -- `environment_census` — environment snapshot events -- `context_utilization`, `context_overflow_recovered` — context management events 
-- `agent_outcome` — agent session events -- `error_recovered` — error recovery events +- `auth_login`, `auth_logout` for authentication events +- `mcp_server_status`, `mcp_server_census` for MCP server lifecycle +- `engine_started`, `engine_error` for native engine events +- `provider_error` for AI provider errors +- `session_forked` for session lifecycle +- `environment_census` for environment snapshot events +- `context_utilization`, `context_overflow_recovered` for context management events +- `agent_outcome` for agent session events +- `error_recovered` for error recovery events ### Adding a New Event -1. **Define the type** — Add a new variant to the `Telemetry.Event` union in `packages/altimate-code/src/telemetry/index.ts` -2. **Emit the event** — Call `Telemetry.track()` at the appropriate location -3. **Update docs** — Add a row to the event table above +1. **Define the type.** Add a new variant to the `Telemetry.Event` union in `packages/altimate-code/src/telemetry/index.ts` +2. **Emit the event.** Call `Telemetry.track()` at the appropriate location +3. **Update docs.** Add a row to the event table above ### Privacy Checklist diff --git a/docs/docs/configure/tools.md b/docs/docs/configure/tools.md index cda9b2a321..89af267306 100644 --- a/docs/docs/configure/tools.md +++ b/docs/docs/configure/tools.md @@ -98,9 +98,9 @@ The `bash` tool executes shell commands in the project directory. 
Commands run i File tools respect the project boundaries and permission settings: -- **`read`** — Reads file contents, supports line ranges -- **`write`** — Creates or overwrites entire files -- **`edit`** — Surgical find-and-replace edits within files +- **`read`** reads file contents and supports line ranges +- **`write`** creates or overwrites entire files +- **`edit`** performs surgical find-and-replace edits within files ### LSP Tool diff --git a/docs/docs/configure/tracing.md b/docs/docs/configure/tracing.md index 2b09eb0969..fc8cf9fa36 100644 --- a/docs/docs/configure/tracing.md +++ b/docs/docs/configure/tracing.md @@ -1,13 +1,13 @@ # Tracing -Altimate Code captures detailed traces of every headless session — LLM generations, tool calls, token usage, cost, and timing — and saves them locally as JSON files. Traces are invaluable for debugging agent behavior, optimizing cost, and understanding how the agent solves problems. +Altimate Code captures detailed traces of every headless session, including LLM generations, tool calls, token usage, cost, and timing, and saves them locally as JSON files. Traces are invaluable for debugging agent behavior, optimizing cost, and understanding how the agent solves problems. Tracing is **enabled by default** and requires no configuration. Traces are stored locally and never leave your machine unless you configure a remote exporter. ## Quick Start ```bash -# Run a prompt — trace is saved automatically +# Run a prompt (trace is saved automatically) altimate-code run "optimize my most expensive queries" # → Trace saved: ~/.local/share/altimate-code/traces/abc123.json @@ -44,7 +44,7 @@ When using SQL and dbt tools, traces automatically capture domain-specific data: | **Data Quality** | Row counts, null percentages, freshness, anomaly detection | | **Cost Attribution** | LLM cost + warehouse compute cost + storage delta = total cost, per user/team/project | -These attributes are purely optional — traces are valid without them. 
They're populated automatically by tools that have access to warehouse metadata. +These attributes are purely optional. Traces are valid without them. They're populated automatically by tools that have access to warehouse metadata. ## Configuration @@ -115,9 +115,9 @@ altimate-code trace view Opens a local web server with an interactive trace viewer in your browser. The viewer shows: -- **Summary cards** — duration, token breakdown (input/output/reasoning/cache), cost, generations, tool calls, status -- **Timeline** — horizontal bars for each span, color-coded by type (generation, tool, error) -- **Detail panel** — click any span to see its model info, token counts, finish reason, input/output, and domain-specific attributes (warehouse metrics, dbt results, etc.) +- **Summary cards** showing duration, token breakdown (input/output/reasoning/cache), cost, generations, tool calls, status +- **Timeline** with horizontal bars for each span, color-coded by type (generation, tool, error) +- **Detail panel** where you click any span to see its model info, token counts, finish reason, input/output, and domain-specific attributes (warehouse metrics, dbt results, etc.) Options: @@ -126,11 +126,11 @@ Options: | `--port` | Port for the viewer server (default: random) | | `--live` | Auto-refresh every 2s for in-progress sessions | -Partial session ID matching is supported — `altimate-code trace view abc` matches `abc123def456`. +Partial session ID matching is supported. For example, `altimate-code trace view abc` matches `abc123def456`. ### Live Viewing (In-Progress Sessions) -Traces are written incrementally — after every tool call and generation, a snapshot is flushed to disk. This means you can view a trace while the session is still running: +Traces are written incrementally. After every tool call and generation, a snapshot is flushed to disk. 
This means you can view a trace while the session is still running: ```bash # In terminal 1: run a long task @@ -178,7 +178,7 @@ Traces can be sent to remote backends via HTTP POST. Each exporter receives the - A failing exporter never blocks local file storage or other exporters - If the server responds with `{ "url": "..." }`, the URL is displayed to the user - Exporters have a 10-second timeout -- All export operations are best-effort — they never crash the CLI +- All export operations are best-effort and never crash the CLI ## Trace File Format @@ -296,13 +296,13 @@ All domain-specific attributes use the `de.*` prefix and are stored in the `attr Traces are designed to survive process crashes: -1. **Immediate snapshot** — A trace file is written as soon as `startTrace()` is called, before any LLM interaction. Even if the process crashes immediately, a minimal trace file exists. +1. **Immediate snapshot.** A trace file is written as soon as `startTrace()` is called, before any LLM interaction. Even if the process crashes immediately, a minimal trace file exists. -2. **Incremental snapshots** — After every tool call and generation completion, the trace file is updated atomically (write to temp file, then rename). The file on disk always contains a valid, complete JSON document. +2. **Incremental snapshots.** After every tool call and generation completion, the trace file is updated atomically (write to temp file, then rename). The file on disk always contains a valid, complete JSON document. -3. **Crash handlers** — The `run` command registers `SIGINT`/`SIGTERM`/`beforeExit` handlers that flush the trace synchronously with a `"crashed"` status. +3. **Crash handlers.** The `run` command registers `SIGINT`/`SIGTERM`/`beforeExit` handlers that flush the trace synchronously with a `"crashed"` status. -4. **Status indicators** — Trace status tells you exactly what happened: +4. 
**Status indicators.** Trace status tells you exactly what happened: | Status | Meaning | |--------|---------| diff --git a/docs/docs/data-engineering/guides/ci-headless.md b/docs/docs/data-engineering/guides/ci-headless.md index 772608577e..b9972b2804 100644 --- a/docs/docs/data-engineering/guides/ci-headless.md +++ b/docs/docs/data-engineering/guides/ci-headless.md @@ -51,7 +51,7 @@ SNOWFLAKE_WAREHOUSE=compute_wh | Code | Meaning | |---|---| -| `0` | Success — task completed | +| `0` | Success (task completed) | | `1` | Task completed but result indicates issues (e.g., anti-patterns found) | | `2` | Configuration error (missing API key, bad connection) | | `3` | Tool execution error (warehouse unreachable, query failed) | @@ -66,7 +66,7 @@ altimate run "validate models in models/staging/ for anti-patterns" || exit 1 ## Worked Examples -### Example 1 — Nightly Cost Check (GitHub Actions) +### Example 1: Nightly Cost Check (GitHub Actions) ```yaml # .github/workflows/cost-check.yml @@ -105,7 +105,7 @@ jobs: path: cost-report.json ``` -### Example 2 — Post-Deploy SQL Validation +### Example 2: Post-Deploy SQL Validation Add to your dbt deployment workflow to catch anti-patterns before they reach production: @@ -120,7 +120,7 @@ Add to your dbt deployment workflow to catch anti-patterns before they reach pro --output json ``` -### Example 3 — Automated Test Generation (Pre-commit) +### Example 3: Automated Test Generation (Pre-commit) ```bash #!/bin/bash diff --git a/docs/docs/data-engineering/guides/cost-optimization.md b/docs/docs/data-engineering/guides/cost-optimization.md index 068d1c0c21..654651383b 100644 --- a/docs/docs/data-engineering/guides/cost-optimization.md +++ b/docs/docs/data-engineering/guides/cost-optimization.md @@ -70,9 +70,9 @@ You: Are our warehouses the right size? 
``` Common findings: -- **Over-provisioned warehouses** — Utilization below 30% means you're paying for idle compute -- **Missing auto-suspend** — Warehouses running 24/7 when only used during business hours -- **Wrong size for workload** — Small queries on XL warehouses waste credits +- **Over-provisioned warehouses.** Utilization below 30% means you're paying for idle compute. +- **Missing auto-suspend.** Warehouses run 24/7 when they're only used during business hours. +- **Wrong size for workload.** Small queries on XL warehouses waste credits. ## Step 4: Clean up unused resources diff --git a/docs/docs/data-engineering/guides/migration.md b/docs/docs/data-engineering/guides/migration.md index 1b62886ca5..08b1eb291c 100644 --- a/docs/docs/data-engineering/guides/migration.md +++ b/docs/docs/data-engineering/guides/migration.md @@ -127,8 +127,8 @@ WHERE RLIKE(email, '^[a-z]+@.*$'); ## Best practices -1. **Translate in batches** — Start with staging models, then intermediate, then marts -2. **Verify lineage** — Always check that column lineage is preserved after translation -3. **Test with LIMIT** — Run translated queries with `LIMIT 10` on the target warehouse first -4. **Check data types** — Type mappings may lose precision (e.g., `NUMBER(38,0)` → `INT64`) -5. **Handle NULL semantics** — Some warehouses handle NULLs differently in comparisons +1. **Translate in batches.** Start with staging models, then intermediate, then marts. +2. **Verify lineage.** Always check that column lineage is preserved after translation. +3. **Test with LIMIT.** Run translated queries with `LIMIT 10` on the target warehouse first. +4. **Check data types.** Type mappings may lose precision (e.g., `NUMBER(38,0)` to `INT64`). +5. **Handle NULL semantics.** Some warehouses handle NULLs differently in comparisons. 
diff --git a/docs/docs/data-engineering/guides/using-with-codex.md b/docs/docs/data-engineering/guides/using-with-codex.md index 68e82b2bae..713f565322 100644 --- a/docs/docs/data-engineering/guides/using-with-codex.md +++ b/docs/docs/data-engineering/guides/using-with-codex.md @@ -42,7 +42,7 @@ Once authenticated, all altimate tools work with Codex as the LLM backend. No AP - altimate authenticates via PKCE OAuth flow with ChatGPT - Requests route through `chatgpt.com/backend-api/codex/responses` -- Your subscription covers all token usage — no per-token billing +- Your subscription covers all token usage, so there is no per-token billing - Token is stored locally at `~/.altimate/data/auth.json` ## Cost diff --git a/docs/docs/data-engineering/tools/dbt-tools.md b/docs/docs/data-engineering/tools/dbt-tools.md index 1e72b44508..89f62e7c95 100644 --- a/docs/docs/data-engineering/tools/dbt-tools.md +++ b/docs/docs/data-engineering/tools/dbt-tools.md @@ -14,10 +14,10 @@ Running: dbt run --select stg_orders ``` **Parameters:** -- `command` (optional, default: "run") — dbt command: `run`, `test`, `build`, `compile`, `seed`, `snapshot` -- `select` (optional) — Model selection syntax (`stg_orders`, `+fct_revenue`, `tag:daily`) -- `args` (optional) — Additional CLI arguments -- `project_dir` (optional) — Path to dbt project root +- `command` (optional, default: "run"): The dbt command to execute (`run`, `test`, `build`, `compile`, `seed`, `snapshot`) +- `select` (optional): Model selection syntax (`stg_orders`, `+fct_revenue`, `tag:daily`) +- `args` (optional): Additional CLI arguments +- `project_dir` (optional): Path to dbt project root ### Examples diff --git a/docs/docs/data-engineering/tools/finops-tools.md b/docs/docs/data-engineering/tools/finops-tools.md index 1cd5bd8b15..b7ffc987a7 100644 --- a/docs/docs/data-engineering/tools/finops-tools.md +++ b/docs/docs/data-engineering/tools/finops-tools.md @@ -27,11 +27,11 @@ Summary: ``` **Parameters:** -- `warehouse` (required) —
Connection name -- `days` (optional, default: 7) — Lookback period -- `limit` (optional, default: 100) — Max queries returned -- `user` (optional) — Filter by username -- `warehouse_filter` (optional) — Filter by compute warehouse name +- `warehouse` (required): Connection name +- `days` (optional, default: 7): Lookback period +- `limit` (optional, default: 100): Max queries returned +- `user` (optional): Filter by username +- `warehouse_filter` (optional): Filter by compute warehouse name **Data sources by warehouse:** - Snowflake: `QUERY_HISTORY` function @@ -63,9 +63,9 @@ By Warehouse: DEV_WH (XS): 47.4 credits (6%) Recommendations: - 1. TRANSFORM_WH runs at 23% utilization — consider downsizing to L - 2. 340 queries on ANALYTICS_WH scan >1GB but return <100 rows — add filters - 3. DEV_WH has 0 queries between 2am-8am — enable auto-suspend + 1. TRANSFORM_WH runs at 23% utilization; consider downsizing to L + 2. 340 queries on ANALYTICS_WH scan >1GB but return <100 rows; add filters + 3. DEV_WH has 0 queries between 2am-8am; enable auto-suspend ``` --- @@ -92,7 +92,7 @@ Top 5 Expensive Queries: 3. 23.7 credits | 312 executions | ANALYTICS_WH SELECT COUNT(DISTINCT user_id) FROM events WHERE ... Anti-patterns: None - Suggestion: Pre-aggregate in a materialized view — saves ~23 credits/week + Suggestion: Pre-aggregate in a materialized view, which saves ~23 credits/week 4. 18.2 credits | 7 executions | TRANSFORM_WH INSERT INTO daily_agg SELECT ... FROM raw_events @@ -153,14 +153,14 @@ Find tables and warehouses that are costing money but not being used. > finops_unused_resources prod-snowflake --days 30 Unused Tables (no reads in 30 days): - 1. RAW.LEGACY_EVENTS — 450GB, last accessed 2025-11-03 - 2. STAGING.STG_OLD_USERS — 12GB, last accessed 2025-12-15 - 3. ANALYTICS.TMP_MIGRATION_2024 — 89GB, last accessed 2025-08-22 + 1. RAW.LEGACY_EVENTS (450GB, last accessed 2025-11-03) + 2. STAGING.STG_OLD_USERS (12GB, last accessed 2025-12-15) + 3.
ANALYTICS.TMP_MIGRATION_2024 (89GB, last accessed 2025-08-22) Total storage: 551GB → ~$23/month in storage costs Idle Warehouses (no queries in 7+ days): - 1. MIGRATION_WH (Medium) — last query 2026-02-10 - 2. TEST_WH (Small) — last query 2026-01-28 + 1. MIGRATION_WH (Medium), last query 2026-02-10 + 2. TEST_WH (Small), last query 2026-01-28 Recommendations: 1. Archive or drop the 3 unused tables → save $23/month diff --git a/docs/docs/data-engineering/tools/index.md b/docs/docs/data-engineering/tools/index.md index ae944bfe4f..30c4381491 100644 --- a/docs/docs/data-engineering/tools/index.md +++ b/docs/docs/data-engineering/tools/index.md @@ -12,6 +12,6 @@ altimate has 50+ specialized tools organized by function. | [Warehouse Tools](warehouse-tools.md) | 6 tools | Environment scanning, connection management, discovery, testing | | [Altimate Memory](memory-tools.md) | 3 tools | Persistent cross-session memory for warehouse config, conventions, and preferences | | [Training](../training/index.md) | 3 tools + 3 skills | Correct the agent once, it remembers forever, your team inherits it | -| `tool_lookup` | 1 tool | Runtime introspection — discover tool schemas and parameters dynamically | +| `tool_lookup` | 1 tool | Runtime introspection that discovers tool schemas and parameters dynamically | All tools are available in the interactive TUI. The agent automatically selects the right tools based on your request. diff --git a/docs/docs/data-engineering/tools/memory-tools.md b/docs/docs/data-engineering/tools/memory-tools.md index 47d4b43b40..03f837dd6c 100644 --- a/docs/docs/data-engineering/tools/memory-tools.md +++ b/docs/docs/data-engineering/tools/memory-tools.md @@ -2,16 +2,16 @@ Altimate Memory gives your data engineering agent **persistent, cross-session memory**. Instead of re-explaining your warehouse setup, naming conventions, or team preferences every session, the agent remembers what matters and picks up where you left off. 
-Memory blocks are plain Markdown files stored on disk — human-readable, version-controllable, and fully under your control. +Memory blocks are plain Markdown files stored on disk, making them human-readable, version-controllable, and fully under your control. ## Why memory matters for data engineering General-purpose coding agents treat every session as a blank slate. For data engineering, this is especially painful because: -- **Warehouse context is stable** — your Snowflake warehouse name, default database, and connection details rarely change, but you re-explain them every session. -- **Naming conventions are tribal knowledge** — `stg_` for staging, `int_` for intermediate, `fct_`/`dim_` for marts. The agent needs to learn these once, not every time. -- **Past analyses inform future work** — if the agent optimized a query or traced lineage for a table last week, recalling that context avoids redundant work. -- **User preferences accumulate** — SQL style, preferred dialects, dbt patterns, warehouse sizing decisions. +- **Warehouse context is stable.** Your Snowflake warehouse name, default database, and connection details rarely change, but you re-explain them every session. +- **Naming conventions are tribal knowledge.** `stg_` for staging, `int_` for intermediate, `fct_`/`dim_` for marts. The agent needs to learn these once, not every time. +- **Past analyses inform future work.** If the agent optimized a query or traced lineage for a table last week, recalling that context avoids redundant work. +- **User preferences accumulate.** SQL style, preferred dialects, dbt patterns, warehouse sizing decisions. Altimate Memory solves this with three tools that let the agent save, recall, and manage its own persistent knowledge. 
@@ -41,7 +41,7 @@ Memory: 1 block(s) |---|---|---|---| | `scope` | `"global" \| "project" \| "all"` | `"all"` | Filter by scope | | `tags` | `string[]` | `[]` | Filter to blocks containing all specified tags | -| `id` | `string` | — | Read a specific block by ID | +| `id` | `string` | (none) | Read a specific block by ID | --- @@ -55,7 +55,7 @@ Create or update a persistent memory block. Memory: Created "warehouse-config" ``` -The agent automatically calls this when it learns something worth persisting — you can also explicitly ask it to "remember" something. +The agent automatically calls this when it learns something worth persisting. You can also explicitly ask it to "remember" something. **Parameters:** @@ -116,7 +116,7 @@ tags: ["snowflake", "warehouse"] - **Default database**: ANALYTICS_DB ``` -Files are human-readable and editable. You can create, edit, or delete them manually — the agent will pick up changes on the next session. +Files are human-readable and editable. You can create, edit, or delete them manually. The agent will pick up changes on the next session. ## Limits and safety @@ -134,7 +134,7 @@ Blocks are written to a temporary file first, then atomically renamed. 
This prev ## Disabling memory -Set the environment variable to disable all memory functionality — tools and automatic injection: +Set the environment variable to disable all memory functionality, including tools and automatic injection: ```bash ALTIMATE_DISABLE_MEMORY=true @@ -149,7 +149,7 @@ Altimate Memory automatically injects relevant blocks into the system prompt at **What this means in practice:** - With a typical block size of 200-500 characters, the default budget comfortably fits 15-40 blocks -- Memory injection adds a one-time cost at session start — it does not grow during the session +- Memory injection adds a one-time cost at session start and does not grow during the session - If you notice context pressure, reduce the number of blocks or keep them concise - The agent's own tool calls and responses consume far more context than memory blocks - To disable injection entirely (e.g., for benchmarks), set `ALTIMATE_DISABLE_MEMORY=true` @@ -173,7 +173,7 @@ Memory blocks persist indefinitely. If your warehouse configuration changes or a **How to prevent:** -- Review memory blocks periodically — they're plain Markdown files you can inspect directly +- Review memory blocks periodically, since they're plain Markdown files you can inspect directly - Ask the agent to "forget" outdated information when things change - Keep blocks focused on stable facts rather than ephemeral details @@ -193,7 +193,7 @@ The agent decides what to save based on conversation context. It may occasionall **How to fix:** - Delete the bad block: ask the agent or run `rm .altimate-code/memory/bad-block.md` -- Edit the file directly — it's just Markdown +- Edit the file directly, since it's just Markdown - Ask the agent to rewrite it: "Update the warehouse-config memory with the correct warehouse name" ### Context bloat @@ -219,7 +219,7 @@ Memory blocks are stored as plaintext files on disk. 
Be mindful of what gets saved:

- **Do not** save credentials, API keys, or connection strings in memory blocks
- **Do** save structural information (warehouse names, naming conventions, schema patterns)
- If using project-scoped memory in a shared repo, add `.altimate-code/memory/` to `.gitignore` to avoid committing sensitive context
-- Memory blocks are scoped per-user (global) and per-project — there is no cross-user or cross-project leakage
+- Memory blocks are scoped per-user (global) and per-project, so there is no cross-user or cross-project leakage

!!! warning
    Memory blocks are not encrypted. Treat them like any other configuration file on your machine. Do not store secrets or PII in memory blocks.

diff --git a/docs/docs/data-engineering/tools/schema-tools.md b/docs/docs/data-engineering/tools/schema-tools.md
index 8de2ac6880..78726177aa 100644
--- a/docs/docs/data-engineering/tools/schema-tools.md
+++ b/docs/docs/data-engineering/tools/schema-tools.md
@@ -23,9 +23,9 @@ Table: ANALYTICS.PUBLIC.ORDERS
```

**Parameters:**
-- `table` (required) — Table name (schema-qualified: `schema.table` or just `table`)
-- `schema_name` (optional) — Schema to search in
-- `warehouse` (optional) — Connection name
+- `table` (required): Table name (schema-qualified: `schema.table` or just `table`)
+- `schema_name` (optional): Schema to search in
+- `warehouse` (optional): Connection name

---

@@ -50,13 +50,13 @@ Run this once per warehouse (or periodically to refresh). Enables `schema_search

## schema_search

-Search indexed metadata by keyword — finds tables, columns, and schemas.
+Search indexed metadata by keyword to find tables, columns, and schemas.

```
> schema_search "revenue" --warehouse prod-snowflake

Tables:
-  1. ANALYTICS.MARTS.FCT_REVENUE (42 columns) — "Monthly revenue fact table"
+  1. ANALYTICS.MARTS.FCT_REVENUE (42 columns), "Monthly revenue fact table"
   2.
ANALYTICS.STAGING.STG_REVENUE_EVENTS (18 columns) Columns: @@ -66,9 +66,9 @@ Columns: ``` **Parameters:** -- `query` (required) — Search term -- `warehouse` (optional) — Limit to one connection -- `limit` (optional) — Max results +- `query` (required): Search term +- `warehouse` (optional): Limit to one connection +- `limit` (optional): Max results --- @@ -84,7 +84,7 @@ Check cache freshness across all warehouses. ├─────────────────┼──────────┼────────┼─────────┼─────────────────────┤ │ prod-snowflake │ 12 │ 847 │ 15,293 │ 2026-02-26 14:30:00 │ │ dev-duckdb │ 2 │ 23 │ 156 │ 2026-02-25 09:15:00 │ -│ bigquery-prod │ — │ — │ — │ Never │ +│ bigquery-prod │ n/a │ n/a │ n/a │ Never │ └─────────────────┴──────────┴────────┴─────────┴─────────────────────┘ ``` @@ -149,8 +149,8 @@ Compare schema changes between two SQL versions to understand migration impact. --new_sql "CREATE TABLE orders (id INT, amount DECIMAL(12,2), status TEXT, created_at TIMESTAMP)" Schema Changes: - ~ Modified: amount (FLOAT → DECIMAL(12,2)) — severity: medium - + Added: created_at (TIMESTAMP) — severity: low + ~ Modified: amount (FLOAT → DECIMAL(12,2)), severity: medium + + Added: created_at (TIMESTAMP), severity: low Impact: Type change on 'amount' may affect downstream consumers expecting FLOAT ``` diff --git a/docs/docs/data-engineering/tools/sql-tools.md b/docs/docs/data-engineering/tools/sql-tools.md index a776fbd4f6..f953f2ee15 100644 --- a/docs/docs/data-engineering/tools/sql-tools.md +++ b/docs/docs/data-engineering/tools/sql-tools.md @@ -18,9 +18,9 @@ Run SQL queries against your connected warehouse. ``` **Parameters:** -- `query` (required) — SQL to execute -- `warehouse` (optional) — Connection name from config. Uses default if omitted -- `limit` (optional, default: 100) — Max rows returned +- `query` (required): SQL to execute +- `warehouse` (optional): Connection name from config. 
Uses default if omitted +- `limit` (optional, default: 100): Max rows returned --- @@ -167,7 +167,7 @@ Diagnose and auto-fix SQL errors. --error "SQL compilation error: Object 'ANALYTICS.PUBLIC.USERSS' does not exist" \ "SELECT * FROM analytics.public.userss" -Diagnosis: Typo in table name — 'userss' should be 'users' +Diagnosis: Typo in table name. 'userss' should be 'users' Fixed SQL: SELECT * FROM analytics.public.users @@ -222,12 +222,12 @@ Rewritten SQL: ### Rewrite strategies -1. **Predicate pushdown** — Move filters closer to data source -2. **SELECT pruning** — Replace `*` with explicit columns -3. **Function elimination** — Replace non-sargable functions with range predicates -4. **JOIN reordering** — Smaller tables first -5. **Subquery flattening** — Convert to JOINs where possible -6. **UNION ALL promotion** — Replace UNION with UNION ALL when safe +1. **Predicate pushdown.** Move filters closer to data source +2. **SELECT pruning.** Replace `*` with explicit columns +3. **Function elimination.** Replace non-sargable functions with range predicates +4. **JOIN reordering.** Smaller tables first +5. **Subquery flattening.** Convert to JOINs where possible +6. **UNION ALL promotion.** Replace UNION with UNION ALL when safe --- @@ -260,6 +260,6 @@ Schema-aware SQL completion. > sql_autocomplete --prefix "SELECT o.order_id, o.amo" --table_context ["orders"] Suggestions: - 1. o.amount (DECIMAL) — orders.amount - 2. o.amount_usd (DECIMAL) — orders.amount_usd + 1. o.amount (DECIMAL), from orders.amount + 2. o.amount_usd (DECIMAL), from orders.amount_usd ``` diff --git a/docs/docs/data-engineering/training/index.md b/docs/docs/data-engineering/training/index.md index 4e75b8791f..a13f9906c7 100644 --- a/docs/docs/data-engineering/training/index.md +++ b/docs/docs/data-engineering/training/index.md @@ -4,7 +4,7 @@ ## The Problem -AI coding assistants make the same mistakes over and over. 
You say "use DECIMAL not FLOAT," it fixes it — then does the same thing next session. You write instructions in CLAUDE.md, but nobody updates it after corrections. The knowledge from your day-to-day work never becomes permanent. +AI coding assistants make the same mistakes over and over. You say "use DECIMAL not FLOAT," it fixes it, then does the same thing next session. You write instructions in CLAUDE.md, but nobody updates it after corrections. The knowledge from your day-to-day work never becomes permanent. ## How Training Works @@ -23,7 +23,7 @@ Builder: Saved. I'll apply this in every future session. That's it. **2 seconds.** No editing files. No context switching. The correction becomes permanent knowledge that every agent mode (builder, analyst, validator) sees in every future session. -Research shows compact, focused context improves AI performance by 17 percentage points — while dumping comprehensive docs actually hurts by 3 points (SkillsBench, 7,308 test runs). Training delivers the right knowledge to the right agent at the right time, not everything to everyone. +Research shows compact, focused context improves AI performance by 17 percentage points, while dumping comprehensive docs actually hurts by 3 points (SkillsBench, 7,308 test runs). Training delivers the right knowledge to the right agent at the right time, not everything to everyone. ## Three Ways to Teach @@ -75,7 +75,7 @@ Agent: I found 8 actionable rules: | Kind | Purpose | Example | |---|---|---| -| **rule** | Hard constraint | "Never use FLOAT for money — use DECIMAL(18,2)" | +| **rule** | Hard constraint | "Never use FLOAT for money. Use DECIMAL(18,2)." 
| | **pattern** | How code should look | "Staging models: source CTE → filtered → final" | | **standard** | Team convention | "Every PR needs tests + schema YAML" | | **glossary** | Business term | "ARR = Annual Recurring Revenue = MRR * 12" | @@ -99,7 +99,7 @@ For systematic teaching (not just corrections), switch to trainer mode: altimate --agent trainer ``` -Trainer mode is read-only — it can't modify your code. It helps you: +Trainer mode is read-only and cannot modify your code. It helps you: - **Teach interactively**: "Let me teach you about our Databricks setup" - **Find gaps**: "What don't you know about my project?" @@ -148,7 +148,7 @@ Training doesn't replace CLAUDE.md. They complement each other: - **Advisory, not enforced.** Training guides the agent, but it's not a hard gate. For critical rules, also add dbt tests or sqlfluff rules that block CI. - **No approval workflow.** Anyone with repo access can save training to project scope. Use code review on `.altimate-code/memory/` changes for governance. -- **No audit trail** beyond git history. Training doesn't track who saved what — use `git blame` on the training files. +- **No audit trail** beyond git history. Training doesn't track who saved what, so use `git blame` on the training files. - **Context budget.** Training competes for context space. Under pressure, least-relevant entries are excluded. Run `/training-status` to see what's included. - **20 entries per kind.** Hard limit. Consolidate related rules into one entry rather than saving many small ones. - **SQL-focused file analysis.** The `/teach` skill works best with SQL/dbt files. Python, PySpark, and other patterns must be taught manually via conversation. 
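Each training entry ends up on disk as a Markdown file with YAML frontmatter under `.altimate-code/memory/`, so a saved rule can be inspected and code-reviewed like any other file. A hypothetical entry for the DECIMAL rule above (the path and frontmatter field names are illustrative; the exact schema altimate writes may differ):

```
# .altimate-code/memory/no-float-for-money.md  (hypothetical)
---
kind: rule
scope: project
tags: [sql, types]
---
Never use FLOAT for money. Use DECIMAL(18,2).
```

Because it is plain Markdown, consolidating or correcting an entry is an ordinary file edit plus a commit.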
diff --git a/docs/docs/data-engineering/training/team-deployment.md b/docs/docs/data-engineering/training/team-deployment.md index fec7848db0..a7ccfcd2f5 100644 --- a/docs/docs/data-engineering/training/team-deployment.md +++ b/docs/docs/data-engineering/training/team-deployment.md @@ -1,10 +1,10 @@ # Deploying Team Training -Get every teammate's AI automatically applying the same SQL conventions, naming standards, and anti-pattern rules. Achieved by committing `.altimate-code/memory/` to git — teammates inherit your training on `git pull`. +Get every teammate's AI automatically applying the same SQL conventions, naming standards, and anti-pattern rules. Achieved by committing `.altimate-code/memory/` to git so that teammates inherit your training on `git pull`. --- -## Step 1 — Create Your First Team Training Entries +## Step 1: Create Your First Team Training Entries Use the `/teach` or `/train` skills to save project-specific conventions: @@ -26,7 +26,7 @@ This shows all active training entries, their scope (global vs project), and whe --- -## Step 2 — Locate the Training Files +## Step 2: Locate the Training Files Training is stored in `.altimate-code/memory/` in your project root. Each entry is a markdown file with YAML frontmatter: @@ -40,11 +40,11 @@ Training is stored in `.altimate-code/memory/` in your project root. Each entry **Global vs. project scope:** - **Project scope** (`.altimate-code/memory/`): Applies when working in this project. Commit to git to share with team. -- **Global scope** (`~/.altimate-code/memory/`): Applies across all projects. Do not commit — this is personal. +- **Global scope** (`~/.altimate-code/memory/`): Applies across all projects. Do not commit, as this is personal. --- -## Step 3 — Commit to Git +## Step 3: Commit to Git ```bash git add .altimate-code/memory/ @@ -52,11 +52,11 @@ git commit -m "Add team SQL conventions and naming standards" git push ``` -Teammates who `git pull` automatically inherit all training entries. 
No additional setup required — the tool reads from `.altimate-code/memory/` on startup. +Teammates who `git pull` automatically inherit all training entries. No additional setup is required because the tool reads from `.altimate-code/memory/` on startup. --- -## Step 4 — Verify a Teammate Got the Training +## Step 4: Verify a Teammate Got the Training After a teammate pulls, they can run: @@ -85,4 +85,4 @@ Use project scope for team standards. Use global scope only for personal prefere ## Limitations -Training is as good as the corrections you save. The system doesn't infer conventions from your existing codebase — you teach it explicitly. For the full description of how training works, see [Training Overview](index.md). +Training is as good as the corrections you save. The system doesn't infer conventions from your existing codebase; you teach it explicitly. For the full description of how training works, see [Training Overview](index.md). diff --git a/docs/docs/develop/ecosystem.md b/docs/docs/develop/ecosystem.md index 66bfd9186b..3f847d5d41 100644 --- a/docs/docs/develop/ecosystem.md +++ b/docs/docs/develop/ecosystem.md @@ -12,15 +12,15 @@ altimate has a growing ecosystem of plugins, tools, and integrations. 
## Integrations -- **GitHub Actions** — Automated PR review and issue triage -- **GitLab CI** — Merge request analysis -- **VS Code / Cursor** — IDE integration -- **MCP** — Model Context Protocol servers -- **ACP** — Agent Communication Protocol for editors +- **GitHub Actions**: Automated PR review and issue triage +- **GitLab CI**: Merge request analysis +- **VS Code / Cursor**: IDE integration +- **MCP**: Model Context Protocol servers +- **ACP**: Agent Communication Protocol for editors ## Community -- [GitHub Repository](https://github.com/AltimateAI/altimate-code) — Source code, issues, discussions +- [GitHub Repository](https://github.com/AltimateAI/altimate-code): Source code, issues, discussions - Share your plugins and tools with the community ## Contributing diff --git a/docs/docs/develop/plugins.md b/docs/docs/develop/plugins.md index 237904ea80..a4bb14b979 100644 --- a/docs/docs/develop/plugins.md +++ b/docs/docs/develop/plugins.md @@ -56,9 +56,9 @@ Add plugins to your `altimate-code.json` config file: Plugins can be specified as: -- **npm package name** — installed from the registry (e.g., `"npm-published-plugin"`) -- **Relative path** — a local directory (e.g., `"./path/to/local-plugin"`) -- **Scoped package** — with an org prefix (e.g., `"@altimateai/altimate-code-plugin-example"`) +- **npm package name**: installed from the registry (e.g., `"npm-published-plugin"`) +- **Relative path**: a local directory (e.g., `"./path/to/local-plugin"`) +- **Scoped package**: with an org prefix (e.g., `"@altimateai/altimate-code-plugin-example"`) ## Plugin Hooks @@ -70,7 +70,7 @@ Plugins can listen to lifecycle events. 
Each hook receives a context object with | `onSessionEnd` | A session is closed or expires | `session.id`, `session.duration`, `session.messageCount` | | `onMessage` | User sends a message to the agent | `message.content`, `message.sessionId`, `message.agent` | | `onResponse` | Agent generates a response | `response.content`, `response.sessionId`, `response.toolCalls` | -| `onToolCall` | Before a tool is executed | `call.name`, `call.parameters`, `call.sessionId` — return `false` to cancel | +| `onToolCall` | Before a tool is executed | `call.name`, `call.parameters`, `call.sessionId` (return `false` to cancel) | | `onToolResult` | After a tool finishes executing | `result.toolName`, `result.output`, `result.duration`, `result.error` | | `onFileEdit` | A file is modified via the agent | `edit.filePath`, `edit.oldContent`, `edit.newContent`, `edit.sessionId` | | `onFileWrite` | A new file is created via the agent | `write.filePath`, `write.content`, `write.sessionId` | @@ -92,7 +92,7 @@ Hooks fire in this order during a typical interaction: ## Example: SQL Anti-Pattern Plugin -This example creates a data-engineering-specific plugin that checks for `CROSS JOIN` without a `WHERE` clause in Snowflake SQL — a common anti-pattern that can cause massive result sets and runaway costs. +This example creates a data-engineering-specific plugin that checks for `CROSS JOIN` without a `WHERE` clause in Snowflake SQL. This is a common anti-pattern that can cause massive result sets and runaway costs. 
### Plugin File diff --git a/docs/docs/develop/sdk.md b/docs/docs/develop/sdk.md index 5502660509..bdd30dfe1c 100644 --- a/docs/docs/develop/sdk.md +++ b/docs/docs/develop/sdk.md @@ -186,11 +186,11 @@ try { | Import | Description | |--------|------------| -| `@altimateai/altimate-code-sdk` | Core SDK — error types, constants, utilities | -| `@altimateai/altimate-code-sdk/client` | HTTP client — `createClient()` | -| `@altimateai/altimate-code-sdk/server` | Server utilities — for embedding altimate in your own server | -| `@altimateai/altimate-code-sdk/v2` | v2 API types — TypeScript type definitions | -| `@altimateai/altimate-code-sdk/v2/client` | v2 client — auto-generated typed client | +| `@altimateai/altimate-code-sdk` | Core SDK: error types, constants, utilities | +| `@altimateai/altimate-code-sdk/client` | HTTP client: `createClient()` | +| `@altimateai/altimate-code-sdk/server` | Server utilities for embedding altimate in your own server | +| `@altimateai/altimate-code-sdk/v2` | v2 API types: TypeScript type definitions | +| `@altimateai/altimate-code-sdk/v2/client` | v2 client: auto-generated typed client | ## OpenAPI diff --git a/docs/docs/develop/server.md b/docs/docs/develop/server.md index d99f9a8a0f..5bae917ed6 100644 --- a/docs/docs/develop/server.md +++ b/docs/docs/develop/server.md @@ -44,12 +44,12 @@ The server uses HTTP Basic Authentication when credentials are set. 
The server exposes REST endpoints for: -- **Sessions** — Create, list, delete sessions -- **Messages** — Send messages, stream responses -- **Models** — List available models -- **Agents** — List and switch agents -- **Tools** — Execute tools programmatically -- **Export/Import** — Session data management +- **Sessions**: Create, list, delete sessions +- **Messages**: Send messages, stream responses +- **Models**: List available models +- **Agents**: List and switch agents +- **Tools**: Execute tools programmatically +- **Export/Import**: Session data management Use the [SDK](sdk.md) for a typed client, or call the API directly. diff --git a/docs/docs/drivers.md b/docs/docs/drivers.md index 949a3e9f83..52c1d85cdf 100644 --- a/docs/docs/drivers.md +++ b/docs/docs/drivers.md @@ -2,7 +2,7 @@ ## Overview -Altimate Code connects to 10 databases natively via TypeScript drivers. No Python dependency required. Drivers are loaded lazily — only the driver you need is imported at runtime. +Altimate Code connects to 10 databases natively via TypeScript drivers. No Python dependency required. Drivers are loaded lazily, so only the driver you need is imported at runtime. ## Support Matrix @@ -21,7 +21,7 @@ Altimate Code connects to 10 databases natively via TypeScript drivers. No Pytho ## Installation -Drivers are `optionalDependencies` — install only what you need: +Drivers are `optionalDependencies`, so install only what you need: ```bash # Embedded databases (no external service needed) @@ -77,7 +77,7 @@ export ALTIMATE_CODE_CONN_MYDB='{"type":"postgres","host":"localhost","port":543 ### Via dbt Profiles (Recommended for dbt Users) -**dbt-first execution**: When working in a dbt project, `sql.execute` automatically uses dbt's own adapter to connect via `profiles.yml` — no separate connection configuration needed. If dbt is not configured or fails, it falls back to native drivers silently. 
+**dbt-first execution**: When working in a dbt project, `sql.execute` automatically uses dbt's own adapter to connect via `profiles.yml`, so no separate connection configuration is needed. If dbt is not configured or fails, it falls back to native drivers silently. Connections are also auto-discovered from `~/.dbt/profiles.yml` for the `warehouse.list` and `warehouse.discover` tools. Jinja `{{ env_var() }}` patterns are resolved automatically. Discovered connections are named `dbt_{profile}_{target}`. @@ -161,15 +161,15 @@ Connect through a bastion host by adding SSH config to any connection: SSH auth types: `"key"` (default) or `"password"` (set `ssh_password`). -> **Note:** SSH tunneling cannot be used with `connection_string` — use explicit `host`/`port` instead. +> **Note:** SSH tunneling cannot be used with `connection_string`. Use explicit `host`/`port` instead. ## Auto-Discovery The CLI auto-discovers connections from: -1. **Docker containers** — detects running PostgreSQL, MySQL, MariaDB, SQL Server, Oracle containers -2. **dbt profiles** — parses `~/.dbt/profiles.yml` for all supported adapters -3. **Environment variables** — detects `SNOWFLAKE_ACCOUNT`, `PGHOST`, `MYSQL_HOST`, `MSSQL_HOST`, `ORACLE_HOST`, `DUCKDB_PATH`, `SQLITE_PATH`, etc. +1. **Docker containers**: detects running PostgreSQL, MySQL, MariaDB, SQL Server, Oracle containers +2. **dbt profiles**: parses `~/.dbt/profiles.yml` for all supported adapters +3. **Environment variables**: detects `SNOWFLAKE_ACCOUNT`, `PGHOST`, `MYSQL_HOST`, `MSSQL_HOST`, `ORACLE_HOST`, `DUCKDB_PATH`, `SQLITE_PATH`, etc. Use the `warehouse_discover` tool or run project scan to find available connections. 
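The `{{ env_var() }}` resolution can be pictured as a simple substitution pass over values pulled from `profiles.yml`. This is an illustrative sketch of the behavior described above, not the actual discovery code:

```javascript
// Hypothetical sketch: resolve dbt-style {{ env_var("NAME", "default") }}
// placeholders, and derive the discovered connection name dbt_{profile}_{target}.
function resolveEnvVars(value, env) {
  return value.replace(
    /\{\{\s*env_var\(\s*['"]([^'"]+)['"]\s*(?:,\s*['"]([^'"]*)['"])?\s*\)\s*\}\}/g,
    // Prefer the environment value, then the inline default, else leave unresolved.
    (match, name, fallback) => env[name] ?? fallback ?? match
  );
}

const discoveredName = (profile, target) => `dbt_${profile}_${target}`;
```

So a profile named `analytics` with target `prod` would surface as the connection `dbt_analytics_prod`.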
@@ -177,7 +177,7 @@ Use the `warehouse_discover` tool or run project scan to find available connecti These features work based on SDK documentation but haven't been verified with automated E2E tests: -### Snowflake (partially tested — 37 E2E tests pass) +### Snowflake (partially tested, 37 E2E tests pass) - ✅ Password authentication - ✅ Key-pair with unencrypted PEM - ✅ Key-pair with encrypted PEM + passphrase @@ -188,7 +188,7 @@ These features work based on SDK documentation but haven't been verified with au - ❌ OAuth/external browser auth (requires interactive browser) - ❌ Multi-cluster warehouse auto-scaling -### BigQuery (partially tested — 25 E2E tests pass) +### BigQuery (partially tested, 25 E2E tests pass) - ✅ Service Account JSON key authentication - ✅ Schema introspection (datasets, tables, columns) - ✅ BigQuery types (UNNEST, STRUCT, DATE/DATETIME/TIMESTAMP, STRING_AGG) @@ -197,7 +197,7 @@ These features work based on SDK documentation but haven't been verified with au - ❌ Location-specific query execution - ❌ Dry run / cost estimation -### Databricks (partially tested — 24 E2E tests pass) +### Databricks (partially tested, 24 E2E tests pass) - ✅ Personal Access Token (PAT) authentication - ✅ Unity Catalog (SHOW CATALOGS, SHOW SCHEMAS) - ✅ Schema introspection (listSchemas, listTables, describeTable) @@ -246,11 +246,11 @@ User calls sql.execute("SELECT * FROM orders") ### Dispatcher Pattern -All 73 tool methods route through a central `Dispatcher` that maps method names to native TypeScript handlers. There is no Python bridge — every call executes in-process. +All 73 tool methods route through a central `Dispatcher` that maps method names to native TypeScript handlers. There is no Python bridge; every call executes in-process. ### Shared Driver Package -Database drivers live in `packages/drivers/` (`@altimateai/drivers`) — a workspace package shared across the monorepo. 
Each driver: +Database drivers live in `packages/drivers/` (`@altimateai/drivers`), a workspace package shared across the monorepo. Each driver: - Lazy-loads its npm package via dynamic `import()` (no startup cost) - Uses parameterized queries for schema introspection (SQL injection safe) - Implements a common `Connector` interface: `connect()`, `execute()`, `listSchemas()`, `listTables()`, `describeTable()`, `close()` @@ -259,9 +259,9 @@ Database drivers live in `packages/drivers/` (`@altimateai/drivers`) — a works Credentials are handled with a 3-tier fallback: -1. **OS Keychain** (via `keytar`) — preferred, secure. Credentials stored in macOS Keychain, Linux Secret Service, or Windows Credential Vault. -2. **Environment variables** (`ALTIMATE_CODE_CONN_*`) — for CI/headless environments. Pass full connection JSON. -3. **Refuse** — if keytar is unavailable and no env var set, credentials are NOT stored in plaintext. The CLI warns and tells you to use env vars. +1. **OS Keychain** (via `keytar`): preferred and secure. Credentials stored in macOS Keychain, Linux Secret Service, or Windows Credential Vault. +2. **Environment variables** (`ALTIMATE_CODE_CONN_*`): for CI/headless environments. Pass full connection JSON. +3. **Refuse**: if keytar is unavailable and no env var set, credentials are NOT stored in plaintext. The CLI warns and tells you to use env vars. Sensitive fields (`password`, `private_key_passphrase`, `access_token`, `ssh_password`, `connection_string`) are always stripped from `connections.json` on disk. @@ -289,4 +289,4 @@ Or in config: } ``` -Telemetry failures **never** affect functionality — every tracking call is wrapped in try/catch. +Telemetry failures **never** affect functionality because every tracking call is wrapped in try/catch. diff --git a/docs/docs/index.md b/docs/docs/index.md index 2ad5a8db45..6416185800 100644 --- a/docs/docs/index.md +++ b/docs/docs/index.md @@ -17,7 +17,7 @@ hide:

The open-source data engineering harness.

-50+ tools for building, validating, optimizing, and shipping data products. Use in your terminal, CI pipeline, orchestration DAGs, or as the harness for your data agents. Evaluate across any platform — independent of a single warehouse provider.
+50+ tools for building, validating, optimizing, and shipping data products. Use in your terminal, CI pipeline, orchestration DAGs, or as the harness for your data agents. Evaluate across any platform, independent of a single warehouse provider.

@@ -110,7 +110,7 @@ npm install -g altimate-code --- - Mount altimate as the tool layer underneath Claude Code, Codex, or any AI agent — giving it deterministic, warehouse-aware capabilities. + Mount altimate as the tool layer underneath Claude Code, Codex, or any AI agent, giving it deterministic, warehouse-aware capabilities.

@@ -161,7 +161,7 @@ npm install -g altimate-code --- - Business-friendly reporting. No SQL jargon — translates technical findings into impact and recommendations. + Business-friendly reporting. No SQL jargon. Translates technical findings into impact and recommendations.
diff --git a/docs/docs/llms.txt b/docs/docs/llms.txt index da7c6c4d55..65a6e3a1d0 100644 --- a/docs/docs/llms.txt +++ b/docs/docs/llms.txt @@ -3,7 +3,7 @@ # Generated: 2026-03-18 | Version: v0.4.2 # Source: https://docs.altimate.sh -> altimate-code is an open-source data engineering harness — 50+ tools for building, validating, optimizing, and shipping data products. Use in your terminal, CI pipeline, orchestration DAGs, or as the tool layer for your data agents. Includes a deterministic SQL Intelligence Engine (100% F1 across 1,077 queries), column-level lineage, FinOps analysis, PII detection, and dbt integration. Works with any LLM provider. Local-first, MIT-licensed. +> altimate-code is an open-source data engineering harness with 50+ tools for building, validating, optimizing, and shipping data products. Use in your terminal, CI pipeline, orchestration DAGs, or as the tool layer for your data agents. Includes a deterministic SQL Intelligence Engine (100% F1 across 1,077 queries), column-level lineage, FinOps analysis, PII detection, and dbt integration. Works with any LLM provider. Local-first, MIT-licensed. ## Get Started @@ -13,7 +13,7 @@ ## Data Engineering -- [Agent Modes](https://docs.altimate.sh/data-engineering/agent-modes/): 7 specialized agents — Builder (full read/write), Analyst (read-only enforced), Validator, Migrator, Researcher, Trainer, Executive — each with scoped permissions and purpose-built tool access. +- [Agent Modes](https://docs.altimate.sh/data-engineering/agent-modes/): 7 specialized agents (Builder, Analyst, Validator, Migrator, Researcher, Trainer, Executive), each with scoped permissions and purpose-built tool access. Builder has full read/write; Analyst has read-only enforced. - [Training Overview](https://docs.altimate.sh/data-engineering/training/): How to teach altimate project-specific patterns, naming conventions, and corrections that persist across sessions and team members. 
- [Team Deployment](https://docs.altimate.sh/data-engineering/training/team-deployment/): How to commit training to git so your entire team inherits SQL conventions automatically. - [SQL Tools](https://docs.altimate.sh/data-engineering/tools/sql-tools/): 9 SQL analysis tools with 19 anti-pattern rules. 100% F1 accuracy on 1,077 benchmark queries. @@ -33,7 +33,7 @@ - [Providers](https://docs.altimate.sh/configure/providers/): 17 LLM provider configurations with JSON examples: Anthropic, OpenAI, Google Gemini, Vertex AI, Amazon Bedrock, Azure OpenAI, Mistral, Groq, Ollama, and more. - [Agent Skills](https://docs.altimate.sh/configure/skills/): How to configure, discover, and add custom skills. - [Permissions](https://docs.altimate.sh/configure/permissions/): Permission levels, pattern matching, per-agent restrictions, deny rules for destructive SQL. -- [Tracing](https://docs.altimate.sh/configure/tracing/): Local-first observability — trace schema, span types, live viewing, remote OTLP exporters, crash recovery. +- [Tracing](https://docs.altimate.sh/configure/tracing/): Local-first observability covering trace schema, span types, live viewing, remote OTLP exporters, and crash recovery. - [Telemetry](https://docs.altimate.sh/configure/telemetry/): 25 anonymized event types, privacy guarantees, opt-out instructions. ## Reference diff --git a/docs/docs/quickstart.md b/docs/docs/quickstart.md index a47ef7d63c..863fb0d654 100644 --- a/docs/docs/quickstart.md +++ b/docs/docs/quickstart.md @@ -4,7 +4,7 @@ description: "Install altimate-code and run your first SQL analysis. The open-so # Quickstart -> **You need:** npm 8+ or Homebrew. An API key for any supported LLM provider — or use Codex (built-in, no key required). +> **You need:** npm 8+ or Homebrew. An API key for any supported LLM provider, or use Codex (built-in, no key required). 
--- @@ -97,6 +97,6 @@ Build me a real time, interactive dashboard for my macbook system metrics and he ## What's Next -- [Full Setup](getting-started.md) — All warehouse configs, LLM providers, advanced setup -- [Agent Modes](data-engineering/agent-modes.md) — Choose the right agent for your task -- [CI & Automation](data-engineering/guides/ci-headless.md) — Run altimate in automated pipelines +- [Full Setup](getting-started.md): All warehouse configs, LLM providers, advanced setup +- [Agent Modes](data-engineering/agent-modes.md): Choose the right agent for your task +- [CI & Automation](data-engineering/guides/ci-headless.md): Run altimate in automated pipelines diff --git a/docs/docs/security-faq.md b/docs/docs/security-faq.md index a94ba8a084..0c3669c42b 100644 --- a/docs/docs/security-faq.md +++ b/docs/docs/security-faq.md @@ -150,7 +150,7 @@ MCP (Model Context Protocol) servers extend Altimate Code with additional tools. - **MCP tool calls go through the permission system.** You can set MCP tools to `"ask"` or `"deny"` like any other tool. !!! warning - Third-party MCP servers are not reviewed or audited by Altimate. Treat them like any other third-party dependency — review the source, check for updates, and limit their access. + Third-party MCP servers are not reviewed or audited by Altimate. Treat them like any other third-party dependency: review the source, check for updates, and limit their access. ## How does the SQL analysis engine work? @@ -233,7 +233,7 @@ When you see this prompt: - **"Allow once"** approves this single edit - **"Allow always"** approves edits to this specific file for the rest of the session (resets on restart) -If you frequently edit `.env` files and find the prompts disruptive, click "Allow always" on the first prompt for each file — you won't be asked again for that file during your session. +If you frequently edit `.env` files and find the prompts disruptive, click "Allow always" on the first prompt for each file. 
You won't be asked again for that file during your session. !!! tip This protection does **not** block reading these files, only writing. The agent can still read your `.env` to understand configuration without prompting. @@ -253,7 +253,7 @@ Altimate Code applies safe defaults so you don't have to configure anything for | `TRUNCATE *` | **Blocked** | Irreversible data deletion. | | All other commands | **Prompted** | You approve each command before it runs. | -**"Prompted"** means you'll see the command and can approve or reject it. **"Blocked"** means the agent cannot run it at all — you must override in config. +**"Prompted"** means you'll see the command and can approve or reject it. **"Blocked"** means the agent cannot run it at all; you must override in config. To override defaults, add rules in `altimate-code.json`. See [Permissions](configure/permissions.md) for the full configuration reference. diff --git a/docs/docs/troubleshooting.md b/docs/docs/troubleshooting.md index 2e2c39f4a7..4429bbf7a4 100644 --- a/docs/docs/troubleshooting.md +++ b/docs/docs/troubleshooting.md @@ -47,7 +47,7 @@ altimate --print-logs --log-level DEBUG # Example for PostgreSQL: bun add pg ``` -3. No Python installation is required — all tools run natively in TypeScript. +3. No Python installation is required. All tools run natively in TypeScript. ### Warehouse Connection Failed @@ -55,7 +55,7 @@ altimate --print-logs --log-level DEBUG **Solutions:** -1. **If using dbt:** Run `altimate-dbt init` to set up the dbt integration. The CLI will use your `profiles.yml` automatically — no separate connection config needed. +1. **If using dbt:** Run `altimate-dbt init` to set up the dbt integration. The CLI will use your `profiles.yml` automatically, so no separate connection config is needed. 2. **If not using dbt:** Add a connection via the `warehouse_add` tool, `~/.altimate-code/connections.json`, or `ALTIMATE_CODE_CONN_*` env vars. 3. 
Test connectivity: use the `warehouse_test` tool with your connection name. 4. Check that the warehouse hostname and port are reachable @@ -69,7 +69,7 @@ altimate --print-logs --log-level DEBUG **Solutions:** -1. Check the log files — MCP initialization errors are now logged with the server name and error message: +1. Check the log files. MCP initialization errors are now logged with the server name and error message: ``` WARN failed to initialize MCP server { key: "my-tools", error: "..." } ``` @@ -139,5 +139,5 @@ Then share `debug.log` when reporting issues. ## Getting Help -- [GitHub Issues](https://github.com/AltimateAI/altimate-code/issues) — Report bugs and request features +- [GitHub Issues](https://github.com/AltimateAI/altimate-code/issues): Report bugs and request features - Check [existing issues](https://github.com/AltimateAI/altimate-code/issues) before filing new ones diff --git a/docs/docs/usage/cli.md b/docs/docs/usage/cli.md index a3fb7b72fc..221082ae7b 100644 --- a/docs/docs/usage/cli.md +++ b/docs/docs/usage/cli.md @@ -130,7 +130,7 @@ altimate run --no-trace "quick question" ## Tracing -Every `run` command automatically saves a trace file with the full session details — generations, tool calls, tokens, cost, and timing. See [Tracing](../configure/tracing.md) for configuration options. +Every `run` command automatically saves a trace file with the full session details, including generations, tool calls, tokens, cost, and timing. See [Tracing](../configure/tracing.md) for configuration options. 
```bash # List recent traces diff --git a/docs/docs/usage/tui.md b/docs/docs/usage/tui.md index ef3fa44129..7d8c187cc2 100644 --- a/docs/docs/usage/tui.md +++ b/docs/docs/usage/tui.md @@ -10,9 +10,9 @@ altimate The TUI has three main areas: -- **Message area** — shows the conversation with the AI assistant -- **Input area** — where you type messages and commands -- **Sidebar** — shows session info, tool calls, and file changes (toggle with leader key + `s`) +- **Message area**: shows the conversation with the AI assistant +- **Input area**: where you type messages and commands +- **Sidebar**: shows session info, tool calls, and file changes (toggle with leader key + `s`) ## Input Shortcuts @@ -41,9 +41,9 @@ The leader key (default: `Ctrl+X`) gives access to all TUI keybindings. Press le ## Scrolling -- **Page up/down** — scroll messages -- **Home/End** — jump to first/last message -- **Mouse scroll** — scroll with mouse wheel +- **Page up/down**: scroll messages +- **Home/End**: jump to first/last message +- **Mouse scroll**: scroll with mouse wheel Configure scroll speed: diff --git a/docs/docs/usage/web.md b/docs/docs/usage/web.md index 099ac9fa33..82ec166522 100644 --- a/docs/docs/usage/web.md +++ b/docs/docs/usage/web.md @@ -28,7 +28,7 @@ Configure the web server in `altimate-code.json`: | `hostname` | `localhost` | Bind address | | `cors` | `[]` | Allowed CORS origins | | `mdns` | `false` | Enable mDNS discovery | -| `mdnsDomain` | — | Custom mDNS domain | +| `mdnsDomain` | (none) | Custom mDNS domain | ## Authentication diff --git a/docs/docs/windows-wsl.md b/docs/docs/windows-wsl.md index 036201f822..a3285a03ac 100644 --- a/docs/docs/windows-wsl.md +++ b/docs/docs/windows-wsl.md @@ -18,7 +18,7 @@ This works with Node.js 18+ installed natively on Windows. 
All core features wor ## WSL Setup (Recommended) -For the best experience — especially with file watching, shell tools, and dbt — we recommend WSL 2: +For the best experience (especially with file watching, shell tools, and dbt), we recommend WSL 2: 1. Install WSL: ```powershell @@ -115,4 +115,4 @@ If you installed Node.js but `npm` or `node` is not recognized: - Use WSL 2 for better performance - Store your projects in the WSL filesystem (`~/projects/`) rather than `/mnt/c/` for faster file operations - Set up your warehouse connections in the WSL environment -- If using both WSL and native Windows, keep separate config files — the WSL and Windows file systems have different path conventions +- If using both WSL and native Windows, keep separate config files because the WSL and Windows file systems have different path conventions From e09341e6d4e9c86b68429be5513d1514d2c0c8bf Mon Sep 17 00:00:00 2001 From: Pradnesh Date: Wed, 18 Mar 2026 17:32:34 -0700 Subject: [PATCH 06/13] docs: overhaul getting-started pages with comprehensive setup guide Rewrite quickstart as a full Setup page covering warehouse connections, LLM provider switching, agent modes, skills, and permissions. Update overview page with ADE-Bench results (74.4%), fix install command, and change 70+ to 50+ tools. Replace query example with NYC taxi cab analytics prompt. Remove time blocks from step headings and trim redundant sections. 
Co-Authored-By: Claude Opus 4.6 (1M context) --- docs/docs/configure/agents.md | 28 +- docs/docs/configure/config.md | 2 +- docs/docs/configure/governance.md | 39 ++ docs/docs/configure/index.md | 57 +++ docs/docs/configure/tools/config.md | 54 +++ docs/docs/configure/tools/core-tools.md | 281 +++++++++++ docs/docs/configure/tools/custom.md | 94 ++++ docs/docs/configure/tools/index.md | 17 + docs/docs/configure/tracing.md | 2 +- docs/docs/configure/warehouses.md | 350 ++++++++++++++ .../data-engineering/guides/ci-headless.md | 2 +- docs/docs/examples/index.md | 49 ++ docs/docs/getting-started.md | 140 +----- docs/docs/getting-started/index.md | 182 +++++++ docs/docs/getting-started/quickstart-new.md | 229 +++++++++ docs/docs/getting-started/quickstart.md | 451 ++++++++++++++++++ docs/docs/reference/changelog.md | 22 + docs/docs/{ => reference}/network.md | 0 docs/docs/{ => reference}/security-faq.md | 65 +-- .../{configure => reference}/telemetry.md | 2 +- docs/docs/{ => reference}/troubleshooting.md | 0 docs/docs/{ => reference}/windows-wsl.md | 0 docs/mkdocs.yml | 117 ++--- 23 files changed, 1895 insertions(+), 288 deletions(-) create mode 100644 docs/docs/configure/governance.md create mode 100644 docs/docs/configure/index.md create mode 100644 docs/docs/configure/tools/config.md create mode 100644 docs/docs/configure/tools/core-tools.md create mode 100644 docs/docs/configure/tools/custom.md create mode 100644 docs/docs/configure/tools/index.md create mode 100644 docs/docs/configure/warehouses.md create mode 100644 docs/docs/examples/index.md create mode 100644 docs/docs/getting-started/index.md create mode 100644 docs/docs/getting-started/quickstart-new.md create mode 100644 docs/docs/getting-started/quickstart.md create mode 100644 docs/docs/reference/changelog.md rename docs/docs/{ => reference}/network.md (100%) rename docs/docs/{ => reference}/security-faq.md (90%) rename docs/docs/{configure => reference}/telemetry.md (99%) rename docs/docs/{ => 
reference}/troubleshooting.md (100%) rename docs/docs/{ => reference}/windows-wsl.md (100%) diff --git a/docs/docs/configure/agents.md b/docs/docs/configure/agents.md index 1e476ce3af..19a17b7bd6 100644 --- a/docs/docs/configure/agents.md +++ b/docs/docs/configure/agents.md @@ -21,6 +21,11 @@ Agents define different AI personas with specific models, prompts, permissions, | `analyst` | Explore data, run SELECT queries, generate insights | Read-only (enforced) | | `validator` | Data quality checks, schema validation, test coverage | Read + validate | | `migrator` | Cross-warehouse SQL translation and migration | Read/write for migration | +| `researcher` | Deep multi-step investigations, root cause analysis | Read-only + parallel | +| `trainer` | Teach conventions, manage training entries | Read-only + training | +| `executive` | Business-friendly reporting, health dashboards | Read-only | + +For detailed examples and usage guidance for each mode, see [Agent Modes](../data-engineering/agent-modes.md). !!! tip Use the `analyst` agent when exploring data to ensure no accidental writes. Switch to `builder` when you are ready to create or modify models. @@ -90,28 +95,7 @@ You are a Snowflake cost optimization expert. For every query: ## Agent Permissions -Each agent can have its own permission overrides that restrict or expand the default permissions: - -```json -{ - "agent": { - "analyst": { - "permission": { - "write": "deny", - "edit": "deny", - "bash": { - "dbt show *": "allow", - "dbt list *": "allow", - "*": "deny" - } - } - } - } -} -``` - -!!! warning - Agent-specific permissions override global permissions. A `"deny"` at the agent level cannot be overridden by a global `"allow"`. +Each agent can have its own permission overrides that restrict or expand the default permissions. For full details, examples, and recommended configurations, see the [Permissions reference](permissions.md#per-agent-permissions). 
## Switching Agents

diff --git a/docs/docs/configure/config.md b/docs/docs/configure/config.md
index 2bbf97b174..8c2ed93f7e 100644
--- a/docs/docs/configure/config.md
+++ b/docs/docs/configure/config.md
@@ -57,7 +57,7 @@ Configuration is loaded from multiple sources, with later sources overriding ear
 | `skills` | `object` | Skill paths and URLs |
 | `plugin` | `string[]` | Plugin specifiers |
 | `instructions` | `string[]` | Glob patterns for instruction files |
-| `telemetry` | `object` | Telemetry settings (see [Telemetry](telemetry.md)) |
+| `telemetry` | `object` | Telemetry settings (see [Telemetry](../reference/telemetry.md)) |
 | `compaction` | `object` | Context compaction settings (see [Context Management](context-management.md)) |
 | `experimental` | `object` | Experimental feature flags |

diff --git a/docs/docs/configure/governance.md b/docs/docs/configure/governance.md
new file mode 100644
index 0000000000..b04a30c44b
--- /dev/null
+++ b/docs/docs/configure/governance.md
@@ -0,0 +1,39 @@
+# Governance
+
+Most people think of governance as a cost: something you bolt on for compliance. In practice, governance makes agents produce **better results**, not just safer ones.
+
+LLMs have built-in randomization. Give them too much freedom and they explore dead ends, burn tokens, and produce inconsistent output. Constrain the solution space and they get to correct results faster, in fewer tokens, with more consistency.
+
+Task-scoped permissions aren't just about safety; they're about **focus**. When an Analyst agent knows it can only `SELECT`, it doesn't waste cycles considering whether to `CREATE` a temp table. When it has prescribed, deterministic tools for tracing lineage instead of trying to figure it out from scratch, the results are the same every time.
+
+There's an audit angle too. In regulated industries, prescribed tooling eliminates unnecessary audit cycles. When your tools generate SQL the same way every time, auditors can verify consistency.
Change the SQL, even if the results are conceptually identical, and you trigger an investigation to prove equivalence. Deterministic tooling removes that overhead entirely.
+
+Altimate Code enforces governance at the **harness level**, not via prompt instructions the model can ignore. Four mechanisms work together:
+
+## Rules
+
+Project rules via `AGENTS.md` files guide agent behavior: coding conventions, naming standards, warehouse policies, and workflow instructions. Rules are loaded automatically from well-known file patterns and merged into the agent's system prompt. Place them at your project root, in subdirectories for scoped guidance, or host them remotely for organization-wide standards.
+
+[:octicons-arrow-right-24: Rules reference](rules.md)
+
+## Permissions
+
+Every tool has a permission level (`allow`, `ask`, or `deny`), configurable globally or per agent. The Analyst agent can't `INSERT`, `UPDATE`, `DELETE`, or `DROP`. That's not a prompt instruction the model can choose to ignore. It's enforced at the tool level. Pattern-based permissions give you fine-grained control: allow `dbt build *` but deny `rm -rf *`.
+
+[:octicons-arrow-right-24: Permissions reference](permissions.md)
+
+## Context Management
+
+Long sessions produce large conversation histories that can exceed model context windows. Altimate Code automatically prunes old tool outputs, compacts conversations into summaries, and recovers from provider overflow errors, all while preserving critical data engineering context like warehouse connections, schema discoveries, lineage findings, and cost analysis results.
+
+[:octicons-arrow-right-24: Context Management reference](context-management.md)
+
+## Formatters
+
+Every file edit is auto-formatted before it's written. This isn't optional consistency; it's enforced consistency. Altimate Code detects file types and runs the appropriate formatter (prettier, ruff, gofmt, sqlfluff, and 20+ others) automatically.
The agent can't produce code that violates your formatting standards.
+
+[:octicons-arrow-right-24: Formatters reference](formatters.md)
+
+---
+
+Together, these four mechanisms mean governance is not an afterthought; it's built into every agent interaction. The harness enforces the rules so your team doesn't have to police the output.

diff --git a/docs/docs/configure/index.md b/docs/docs/configure/index.md
new file mode 100644
index 0000000000..d2df2d3ed8
--- /dev/null
+++ b/docs/docs/configure/index.md
@@ -0,0 +1,57 @@
+# Configure
+
+Set up your warehouses, LLM providers, and preferences. For agents, tools, skills, and commands, see the [Use](../data-engineering/agent-modes.md) section. For rules, permissions, and context management, see [Governance](rules.md).
+
+## What's in this section
+
+ +- :material-cog:{ .lg .middle } **Config File Reference** + + --- + + JSON configuration file locations, schema, value substitution, and project structure. + + [:octicons-arrow-right-24: Config File](config.md) + +- :material-database:{ .lg .middle } **Warehouses** + + --- + + Connect to Snowflake, BigQuery, Databricks, PostgreSQL, Redshift, DuckDB, MySQL, and SQL Server. Includes key-pair auth, IAM, ADC, and SSH tunneling. + + [:octicons-arrow-right-24: Warehouses](warehouses.md) + +- :material-cloud-outline:{ .lg .middle } **LLMs** + + --- + + Connect to 35+ LLM providers — Anthropic, OpenAI, Bedrock, Ollama, and more. Configure API keys and model selection. + + [:octicons-arrow-right-24: Providers](providers.md) · [:octicons-arrow-right-24: Models](models.md) + +- :material-puzzle:{ .lg .middle } **MCPs & ACPs** + + --- + + Extend Altimate Code with MCP servers (local and remote) and ACP-compatible editor integrations. + + [:octicons-arrow-right-24: MCP Servers](mcp-servers.md) · [:octicons-arrow-right-24: ACP Support](acp.md) + +- :material-palette:{ .lg .middle } **Appearance** + + --- + + Themes, keybinds, and visual customization for the TUI. + + [:octicons-arrow-right-24: Themes](themes.md) · [:octicons-arrow-right-24: Keybinds](keybinds.md) + +- :material-dots-horizontal:{ .lg .middle } **Additional Config** + + --- + + LSP servers, network/proxy settings, and Windows/WSL setup. + + [:octicons-arrow-right-24: LSP Servers](lsp.md) · [:octicons-arrow-right-24: Network](../reference/network.md) · [:octicons-arrow-right-24: Windows / WSL](../reference/windows-wsl.md) + +
diff --git a/docs/docs/configure/tools/config.md b/docs/docs/configure/tools/config.md
new file mode 100644
index 0000000000..4028561792
--- /dev/null
+++ b/docs/docs/configure/tools/config.md
@@ -0,0 +1,54 @@
+# Tools
+
+altimate includes built-in tools that agents use to interact with your codebase and environment.
+
+## Built-in Tools
+
+| Tool | Description |
+|------|------------|
+| `bash` | Execute shell commands |
+| `read` | Read file contents |
+| `edit` | Edit files with find-and-replace |
+| `write` | Create or overwrite files |
+| `glob` | Find files by pattern |
+| `grep` | Search file contents with regex |
+| `list` | List directory contents |
+| `patch` | Apply multi-file patches |
+| `lsp` | Language server operations (diagnostics, completions) |
+| `webfetch` | Fetch and process web pages |
+| `websearch` | Search the web |
+| `question` | Ask the user a question |
+| `todo_read` | Read task list |
+| `todo_write` | Create/update tasks |
+| `skill` | Execute a skill |
+
+## Data Engineering Tools
+
+In addition to built-in tools, altimate provides 70+ specialized data engineering tools. See the [Data Engineering Tools](index.md) section for details.
+
+## Tool Permissions
+
+Control which tools agents can use via the permission system. For full details, pattern-based rules, and recommended configurations, see the [Permissions reference](../permissions.md).
+
+## Tool Behavior
+
+### Bash Tool
+
+The `bash` tool executes shell commands in the project directory. Commands run in a non-interactive shell with the user's environment.
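To make the pattern rules concrete, here is a minimal sketch of scoping the `bash` tool in `altimate-code.json`; the `permission`/`bash` keys follow the shape documented in the Permissions reference, and the specific patterns are illustrative, not prescriptive:

```json
{
  "permission": {
    "bash": {
      "dbt build *": "allow",
      "dbt test *": "allow",
      "rm -rf *": "deny",
      "*": "ask"
    }
  }
}
```

With this in place, `dbt` builds and tests run without prompting, destructive deletes are blocked outright, and every other shell command falls back to an approval prompt.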
+
+### Read / Write / Edit Tools
+
+File tools respect the project boundaries and permission settings:
+
+- **`read`**: Reads file contents, supports line ranges
+- **`write`**: Creates or overwrites entire files
+- **`edit`**: Surgical find-and-replace edits within files
+
+### LSP Tool
+
+When [LSP servers](../lsp.md) are configured, the `lsp` tool provides:
+
+- Diagnostics (errors, warnings)
+- Go-to-definition
+- Hover information
+- Completions

diff --git a/docs/docs/configure/tools/core-tools.md b/docs/docs/configure/tools/core-tools.md
new file mode 100644
index 0000000000..ef0fc3716b
--- /dev/null
+++ b/docs/docs/configure/tools/core-tools.md
@@ -0,0 +1,281 @@
+# Core Tools
+
+The `altimate_core_*` tools are powered by a Rust-based SQL engine that provides fast, deterministic analysis without LLM calls. These tools handle validation, linting, safety scanning, lineage, formatting, and more.
+
+## Analysis & Validation
+
+### altimate_core_check
+
+Run the full analysis pipeline (validate + lint + safety scan + PII check) in a single call.
+
+**Parameters:** `sql` (required), `schema_path` (optional), `schema_context` (optional)
+
+---
+
+### altimate_core_validate
+
+Validate SQL syntax and schema references. Checks if tables and columns exist in the schema and if SQL is valid for the target dialect.
+
+**Parameters:** `sql` (required), `schema_path` (optional), `schema_context` (optional)
+
+---
+
+### altimate_core_lint
+
+Lint SQL for anti-patterns: NULL comparisons, implicit casts, unused CTEs, and dialect-specific problems.
+
+**Parameters:** `sql` (required), `schema_path` (optional), `schema_context` (optional)
+
+---
+
+### altimate_core_grade
+
+Grade SQL quality on an A–F scale. Evaluates readability, performance, correctness, and best practices.
+
+**Parameters:** `sql` (required), `schema_path` (optional), `schema_context` (optional)
+
+---
+
+## Safety & Governance
+
+### altimate_core_safety
+
+Scan SQL for injection patterns, dangerous statements (DROP, TRUNCATE), and security threats.
+
+**Parameters:** `sql` (required)
+
+---
+
+### altimate_core_is_safe
+
+Quick boolean safety check that returns true/false indicating whether SQL is safe to execute.
+
+**Parameters:** `sql` (required)
+
+---
+
+### altimate_core_policy
+
+Check SQL against YAML-based governance policy guardrails. Validates compliance with custom rules like allowed tables, forbidden operations, and data access restrictions.
+
+**Parameters:** `sql` (required), `policy_json` (required), `schema_path` (optional), `schema_context` (optional)
+
+---
+
+### altimate_core_classify_pii
+
+Classify PII columns in a schema by name patterns and data types. Identifies columns likely containing personally identifiable information.
+
+**Parameters:** `schema_path` (optional), `schema_context` (optional)
+
+---
+
+### altimate_core_query_pii
+
+Analyze query-level PII exposure. Checks if a SQL query accesses columns classified as PII and reports the exposure risk.
+
+**Parameters:** `sql` (required), `schema_path` (optional), `schema_context` (optional)
+
+---
+
+## SQL Transformation
+
+### altimate_core_fix
+
+Auto-fix SQL errors using fuzzy matching and iterative re-validation to correct syntax errors, typos, and schema reference issues.
+
+**Parameters:** `sql` (required), `schema_path` (optional), `schema_context` (optional), `max_iterations` (optional)
+
+---
+
+### altimate_core_correct
+
+Iteratively correct SQL using a propose-verify-refine loop. More thorough than `fix`; it applies multiple correction rounds.
+
+**Parameters:** `sql` (required), `schema_path` (optional), `schema_context` (optional)
+
+---
+
+### altimate_core_format
+
+Format SQL with dialect-aware keyword casing and indentation. Fast and deterministic.
+
+**Parameters:** `sql` (required), `dialect` (optional)
+
+---
+
+### altimate_core_rewrite
+
+Suggest query optimization rewrites. Analyzes SQL and proposes concrete rewrites for better performance.
+
+**Parameters:** `sql` (required), `schema_path` (optional), `schema_context` (optional)
+
+---
+
+### altimate_core_transpile
+
+Translate SQL between dialects using the Rust engine.
+
+**Parameters:** `sql` (required), `source_dialect` (required), `target_dialect` (required)
+
+---
+
+## Comparison & Equivalence
+
+### altimate_core_compare
+
+Structurally compare two SQL queries. Identifies differences in table references, join conditions, filters, projections, and aggregations.
+
+**Parameters:** `left_sql` (required), `right_sql` (required), `dialect` (optional)
+
+---
+
+### altimate_core_equivalence
+
+Check semantic equivalence of two SQL queries, determining if they produce the same result set regardless of syntactic differences.
+
+**Parameters:** `sql1` (required), `sql2` (required), `schema_path` (optional), `schema_context` (optional)
+
+---
+
+## Lineage & Metadata
+
+### altimate_core_column_lineage
+
+Trace schema-aware column lineage. Maps how columns flow through a query from source tables to output.
+
+**Parameters:** `sql` (required), `dialect` (optional), `schema_path` (optional), `schema_context` (optional)
+
+---
+
+### altimate_core_track_lineage
+
+Track lineage across multiple SQL statements.
+
+**Parameters:** `sql` (required), `schema_path` (optional), `schema_context` (optional)
+
+---
+
+### altimate_core_extract_metadata
+
+Extract metadata from SQL. Identifies tables, columns, functions, CTEs, and other structural elements referenced in a query.
+
+**Parameters:** `sql` (required), `dialect` (optional)
+
+---
+
+### altimate_core_resolve_term
+
+Resolve a business glossary term to schema elements using fuzzy matching. Maps human-readable terms like "revenue" or "customer" to actual table/column names.
+ +**Parameters:** `term` (required), `schema_path` (optional), `schema_context` (optional) + +--- + +### altimate_core_semantics + +Analyze semantic meaning of SQL elements. + +**Parameters:** `sql` (required), `schema_path` (optional), `schema_context` (optional) + +--- + +## Schema Operations + +### altimate_core_schema_diff + +Diff two schema versions to detect structural changes. + +**Parameters:** `old_ddl` (required), `new_ddl` (required), `dialect` (optional) + +--- + +### altimate_core_migration + +Analyze DDL migration safety. Detects potential data loss, type narrowing, missing defaults, and other risks in schema migration statements. + +**Parameters:** `old_ddl` (required), `new_ddl` (required), `dialect` (optional) + +--- + +### altimate_core_export_ddl + +Export a YAML/JSON schema as CREATE TABLE DDL statements. + +**Parameters:** `schema_path` (optional), `schema_context` (optional) + +--- + +### altimate_core_import_ddl + +Convert CREATE TABLE DDL into a structured YAML schema definition that other core tools can consume. + +**Parameters:** `ddl` (required), `dialect` (optional) + +--- + +### altimate_core_fingerprint + +Compute a SHA-256 fingerprint of a schema. Useful for cache invalidation and change detection. + +**Parameters:** `schema_path` (optional), `schema_context` (optional) + +--- + +### altimate_core_introspection_sql + +Generate INFORMATION_SCHEMA introspection queries for a given database type. Supports postgres, bigquery, snowflake, mysql, mssql, redshift. + +**Parameters:** `db_type` (required), `database` (required), `schema_name` (optional) + +--- + +## Context Optimization + +### altimate_core_optimize_context + +Optimize schema for LLM context window. Applies 5-level progressive disclosure to reduce schema size while preserving essential information. 
+ +**Parameters:** `schema_path` (optional), `schema_context` (optional) + +--- + +### altimate_core_optimize_for_query + +Prune schema to only tables and columns relevant to a specific query. Reduces context size for LLM prompts. + +**Parameters:** `sql` (required), `schema_path` (optional), `schema_context` (optional) + +--- + +### altimate_core_prune_schema + +Filter schema to only tables and columns referenced by a SQL query. + +**Parameters:** `sql` (required), `schema_path` (optional), `schema_context` (optional) + +--- + +## dbt & Autocomplete + +### altimate_core_parse_dbt + +Parse a dbt project directory. Extracts models, sources, tests, and project structure for analysis. + +**Parameters:** `project_dir` (required) + +--- + +### altimate_core_complete + +Get cursor-aware SQL completion suggestions. Returns table names, column names, functions, and keywords relevant to the cursor position. + +**Parameters:** `sql` (required), `cursor_pos` (required), `schema_path` (optional), `schema_context` (optional) + +--- + +### altimate_core_testgen + +Generate test cases for SQL queries. + +**Parameters:** `sql` (required), `schema_path` (optional), `schema_context` (optional) diff --git a/docs/docs/configure/tools/custom.md b/docs/docs/configure/tools/custom.md new file mode 100644 index 0000000000..18f121070c --- /dev/null +++ b/docs/docs/configure/tools/custom.md @@ -0,0 +1,94 @@ +# Custom Tools + +Create custom tools using TypeScript and the altimate plugin system. + +## Quick Start + +1. Create a tools directory: + +```bash +mkdir -p .altimate-code/tools +``` + +2. 
Create a tool file: + +```typescript +// .altimate-code/tools/my-tool.ts +import { defineTool } from "@altimateai/altimate-code-plugin/tool" +import { z } from "zod" + +export default defineTool({ + name: "my_custom_tool", + description: "Does something useful", + parameters: z.object({ + input: z.string().describe("The input to process"), + }), + async execute({ input }) { + // Your tool logic here + return { result: `Processed: ${input}` } + }, +}) +``` + +## Plugin Package + +For more complex tools, create a plugin package: + +```bash +npm init +npm install @altimateai/altimate-code-plugin zod +``` + +```typescript +// index.ts +import { definePlugin } from "@altimateai/altimate-code-plugin" +import { z } from "zod" + +export default definePlugin({ + name: "my-plugin", + tools: [ + { + name: "analyze_costs", + description: "Analyze warehouse costs", + parameters: z.object({ + warehouse: z.string(), + days: z.number().default(30), + }), + async execute({ warehouse, days }) { + // Implementation + return { costs: [] } + }, + }, + ], +}) +``` + +## Registering Plugins + +Add plugins to your config: + +```json +{ + "plugin": [ + "@altimateai/altimate-code-plugin-example", + "./my-local-plugin" + ] +} +``` + +## Plugin Hooks + +Plugins can hook into 30+ lifecycle events: + +- `onSessionStart` / `onSessionEnd` +- `onMessage` / `onResponse` +- `onToolCall` / `onToolResult` +- `onFileEdit` / `onFileWrite` +- `onError` +- And more... + +## Disabling Default Plugins + +```bash +export ALTIMATE_CLI_DISABLE_DEFAULT_PLUGINS=true +``` diff --git a/docs/docs/configure/tools/index.md b/docs/docs/configure/tools/index.md new file mode 100644 index 0000000000..4009d4f0c7 --- /dev/null +++ b/docs/docs/configure/tools/index.md @@ -0,0 +1,17 @@ +# Tools Reference + +Altimate Code has 70+ specialized tools organized by function. 
+ +| Category | Tools | Purpose | +|---|---|---| +| [Built-in Tools](config.md) | 14 tools | File operations, search, shell, subagents, and other core agent tools | +| [Core Tools](core-tools.md) | 28 tools | Rust-based SQL engine — validation, linting, safety, lineage, formatting, PII, governance | +| [SQL Tools](../../data-engineering/tools/sql-tools.md) | 10 tools | Analysis, optimization, translation, formatting, cost prediction | +| [Schema Tools](../../data-engineering/tools/schema-tools.md) | 7 tools | Inspection, search, PII detection, tagging, diffing | +| [FinOps Tools](../../data-engineering/tools/finops-tools.md) | 8 tools | Cost analysis, warehouse sizing, unused resources, RBAC | +| [Lineage Tools](../../data-engineering/tools/lineage-tools.md) | 1 tool | Column-level lineage tracing with confidence scoring | +| [dbt Tools](../../data-engineering/tools/dbt-tools.md) | 4 tools + 11 skills | Run, manifest, lineage, profiles, test generation, scaffolding | +| [Warehouse Tools](../../data-engineering/tools/warehouse-tools.md) | 6 tools | Environment scanning, connection management, discovery, testing | +| [Custom Tools](custom.md) | — | Build your own tools with TypeScript plugins | + +All tools are available in the interactive TUI. The agent automatically selects the right tools based on your request. diff --git a/docs/docs/configure/tracing.md b/docs/docs/configure/tracing.md index fc8cf9fa36..f23b914bf6 100644 --- a/docs/docs/configure/tracing.md +++ b/docs/docs/configure/tracing.md @@ -335,7 +335,7 @@ Traces are stored **locally only** by default. They contain: - Tool inputs and outputs (SQL queries, file contents, command results) - Model responses -If you configure remote exporters, trace data is sent to those endpoints. No trace data is included in the anonymous telemetry described in [Telemetry](telemetry.md). +If you configure remote exporters, trace data is sent to those endpoints. 
No trace data is included in the anonymous telemetry described in [Telemetry](../reference/telemetry.md). !!! warning "Sensitive Data" Traces may contain SQL queries, file paths, and command outputs from your session. If you share trace files or configure remote exporters, be aware that this data will be included. diff --git a/docs/docs/configure/warehouses.md b/docs/docs/configure/warehouses.md new file mode 100644 index 0000000000..4185576ba7 --- /dev/null +++ b/docs/docs/configure/warehouses.md @@ -0,0 +1,350 @@ +# Warehouses + +Altimate Code connects to 8 warehouse types. Configure them in the `warehouses` section of your config file or in `.altimate-code/connections.json`. + +## Configuration + +Each warehouse has a key (the connection name) and a config object: + +```json +{ + "warehouses": { + "my-connection-name": { + "type": "", + ... + } + } +} +``` + +!!! tip + Use `{env:...}` substitution for passwords and tokens so you never commit secrets to version control. + +## Snowflake + +```json +{ + "warehouses": { + "prod-snowflake": { + "type": "snowflake", + "account": "xy12345.us-east-1", + "user": "analytics_user", + "password": "{env:SNOWFLAKE_PASSWORD}", + "warehouse": "COMPUTE_WH", + "database": "ANALYTICS", + "role": "ANALYST_ROLE" + } + } +} +``` + +| Field | Required | Description | +|-------|----------|-------------| +| `account` | Yes | Snowflake account identifier (e.g. 
`xy12345.us-east-1`) | +| `user` | Yes | Username | +| `password` | Auth | Password (use one auth method) | +| `private_key_path` | Auth | Path to private key file (alternative to password) | +| `private_key_passphrase` | No | Passphrase for encrypted private key | +| `warehouse` | No | Warehouse name | +| `database` | No | Database name | +| `schema` | No | Schema name | +| `role` | No | User role | + +### Key-pair authentication + +```json +{ + "warehouses": { + "prod-snowflake": { + "type": "snowflake", + "account": "xy12345.us-east-1", + "user": "svc_altimate", + "private_key_path": "~/.ssh/snowflake_rsa_key.p8", + "private_key_passphrase": "{env:SNOWFLAKE_KEY_PASSPHRASE}", + "warehouse": "COMPUTE_WH", + "database": "ANALYTICS", + "role": "TRANSFORM_ROLE" + } + } +} +``` + +## BigQuery + +```json +{ + "warehouses": { + "bigquery-prod": { + "type": "bigquery", + "project": "my-gcp-project", + "credentials_path": "/path/to/service-account.json", + "location": "US" + } + } +} +``` + +| Field | Required | Description | +|-------|----------|-------------| +| `project` | Yes | Google Cloud project ID | +| `credentials_path` | No | Path to service account JSON file. 
Omit to use Application Default Credentials (ADC) | +| `location` | No | Default location (default: `US`) | + +### Using Application Default Credentials + +If you're already authenticated via `gcloud`, omit `credentials_path`: + +```json +{ + "warehouses": { + "bigquery-prod": { + "type": "bigquery", + "project": "my-gcp-project" + } + } +} +``` + +## Databricks + +```json +{ + "warehouses": { + "databricks-prod": { + "type": "databricks", + "server_hostname": "adb-1234567890.1.azuredatabricks.net", + "http_path": "/sql/1.0/warehouses/abcdef1234567890", + "access_token": "{env:DATABRICKS_TOKEN}", + "catalog": "main", + "schema": "default" + } + } +} +``` + +| Field | Required | Description | +|-------|----------|-------------| +| `server_hostname` | Yes | Databricks workspace hostname | +| `http_path` | Yes | HTTP path from compute resources | +| `access_token` | Yes | Personal Access Token (PAT) | +| `catalog` | No | Unity Catalog name | +| `schema` | No | Schema/database name | + +## PostgreSQL + +```json +{ + "warehouses": { + "my-postgres": { + "type": "postgres", + "host": "localhost", + "port": 5432, + "database": "analytics", + "user": "analyst", + "password": "{env:PG_PASSWORD}" + } + } +} +``` + +| Field | Required | Description | +|-------|----------|-------------| +| `connection_string` | No | Full connection string (alternative to individual fields) | +| `host` | No | Hostname (default: `localhost`) | +| `port` | No | Port (default: `5432`) | +| `database` | No | Database name (default: `postgres`) | +| `user` | No | Username | +| `password` | No | Password | + +### Using a connection string + +```json +{ + "warehouses": { + "my-postgres": { + "type": "postgres", + "connection_string": "postgresql://analyst:secret@localhost:5432/analytics" + } + } +} +``` + +## Redshift + +```json +{ + "warehouses": { + "redshift-prod": { + "type": "redshift", + "host": "my-cluster.abc123.us-east-1.redshift.amazonaws.com", + "port": 5439, + "database": "analytics", + 
"user": "admin", + "password": "{env:REDSHIFT_PASSWORD}" + } + } +} +``` + +| Field | Required | Description | +|-------|----------|-------------| +| `connection_string` | No | Full connection string (alternative to individual fields) | +| `host` | No | Hostname | +| `port` | No | Port (default: `5439`) | +| `database` | No | Database name (default: `dev`) | +| `user` | No | Username | +| `password` | No | Password | +| `iam_role` | No | IAM role ARN (alternative to password) | +| `region` | No | AWS region (default: `us-east-1`) | +| `cluster_identifier` | No | Cluster identifier (required for IAM auth) | + +### IAM authentication + +```json +{ + "warehouses": { + "redshift-prod": { + "type": "redshift", + "host": "my-cluster.abc123.us-east-1.redshift.amazonaws.com", + "database": "analytics", + "user": "admin", + "iam_role": "arn:aws:iam::123456789012:role/RedshiftReadOnly", + "cluster_identifier": "my-cluster", + "region": "us-east-1" + } + } +} +``` + +## DuckDB + +```json +{ + "warehouses": { + "dev-duckdb": { + "type": "duckdb", + "path": "./dev.duckdb" + } + } +} +``` + +| Field | Required | Description | +|-------|----------|-------------| +| `path` | No | Database file path. 
Omit or use `":memory:"` for in-memory | + +## MySQL + +```json +{ + "warehouses": { + "mysql-prod": { + "type": "mysql", + "host": "localhost", + "port": 3306, + "database": "analytics", + "user": "analyst", + "password": "{env:MYSQL_PASSWORD}" + } + } +} +``` + +| Field | Required | Description | +|-------|----------|-------------| +| `host` | No | Hostname (default: `localhost`) | +| `port` | No | Port (default: `3306`) | +| `database` | No | Database name | +| `user` | No | Username | +| `password` | No | Password | +| `ssl_ca` | No | Path to CA certificate file | +| `ssl_cert` | No | Path to client certificate file | +| `ssl_key` | No | Path to client key file | + +## SQL Server + +```json +{ + "warehouses": { + "sqlserver-prod": { + "type": "sqlserver", + "host": "localhost", + "port": 1433, + "database": "analytics", + "user": "sa", + "password": "{env:MSSQL_PASSWORD}" + } + } +} +``` + +| Field | Required | Description | +|-------|----------|-------------| +| `host` | No | Hostname (default: `localhost`) | +| `port` | No | Port (default: `1433`) | +| `database` | No | Database name | +| `user` | No | Username | +| `password` | No | Password | +| `driver` | No | ODBC driver name (default: `ODBC Driver 18 for SQL Server`) | +| `azure_auth` | No | Use Azure AD authentication (default: `false`) | +| `trust_server_certificate` | No | Trust server certificate without validation (default: `false`) | + +## SSH Tunneling + +All warehouse types support SSH tunneling for connections behind a bastion host: + +```json +{ + "warehouses": { + "prod-via-bastion": { + "type": "postgres", + "host": "10.0.1.50", + "database": "analytics", + "user": "analyst", + "password": "{env:PG_PASSWORD}", + "ssh_host": "bastion.example.com", + "ssh_port": 22, + "ssh_user": "ubuntu", + "ssh_auth_type": "key", + "ssh_key_path": "~/.ssh/id_rsa" + } + } +} +``` + +| Field | Required | Description | +|-------|----------|-------------| +| `ssh_host` | Yes | SSH bastion hostname | +| `ssh_port` 
| No | SSH port (default: `22`) | +| `ssh_user` | Yes | SSH username | +| `ssh_auth_type` | No | `"key"` or `"password"` | +| `ssh_key_path` | No | Path to SSH private key | +| `ssh_password` | No | SSH password | + +## Auto-Discovery + +The `/discover` command can automatically detect warehouse connections from: + +| Source | Detection | +|--------|-----------| +| dbt profiles | Parses `~/.dbt/profiles.yml` | +| Docker containers | Finds running PostgreSQL, MySQL, and SQL Server containers | +| Environment variables | Scans for `SNOWFLAKE_ACCOUNT`, `PGHOST`, `DATABRICKS_HOST`, etc. | + +See [Warehouse Tools](../data-engineering/tools/warehouse-tools.md) for the full list of environment variable signals. + +## Testing Connections + +After configuring a warehouse, verify it works: + +``` +> warehouse_test prod-snowflake + +Testing connection to prod-snowflake (snowflake)... + ✓ Connected successfully + Account: xy12345.us-east-1 + User: analytics_user + Role: ANALYST_ROLE + Warehouse: COMPUTE_WH + Database: ANALYTICS +``` diff --git a/docs/docs/data-engineering/guides/ci-headless.md b/docs/docs/data-engineering/guides/ci-headless.md index b9972b2804..bacc07d8a1 100644 --- a/docs/docs/data-engineering/guides/ci-headless.md +++ b/docs/docs/data-engineering/guides/ci-headless.md @@ -152,4 +152,4 @@ See [Tracing](../../configure/tracing.md) for the full trace reference. ## Security Recommendation -Use a **read-only warehouse user** for CI jobs that only need to read data. Reserve write-access credentials for jobs that explicitly need them (e.g., test generation that writes files). See [Security FAQ](../../security-faq.md) and [Permissions](../../configure/permissions.md). +Use a **read-only warehouse user** for CI jobs that only need to read data. Reserve write-access credentials for jobs that explicitly need them (e.g., test generation that writes files). See [Security FAQ](../../reference/security-faq.md) and [Permissions](../../configure/permissions.md). 
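The read-only recommendation can also be backstopped in the pipeline itself: before a read-only CI job runs anything, a small pre-flight check can reject SQL that contains write or DDL statements. A minimal TypeScript sketch (illustrative only — `isReadOnly` and the keyword list are hypothetical helpers, not part of altimate-code):

```typescript
// Illustrative pre-flight guard for read-only CI jobs (not an altimate-code API).
const WRITE_KEYWORDS = /\b(INSERT|UPDATE|DELETE|DROP|TRUNCATE|ALTER|CREATE|MERGE)\b/i;

function isReadOnly(sql: string): boolean {
  // Strip line comments so commented-out DDL does not trigger a false positive.
  const stripped = sql.replace(/--.*$/gm, "");
  return !WRITE_KEYWORDS.test(stripped);
}

console.log(isReadOnly("SELECT * FROM analytics.orders")); // true
console.log(isReadOnly("DROP TABLE analytics.orders")); // false
```

A keyword scan like this is deliberately coarse — it can only complement, never replace, the warehouse-side grants of a read-only user.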
diff --git a/docs/docs/examples/index.md b/docs/docs/examples/index.md new file mode 100644 index 0000000000..070aeb2142 --- /dev/null +++ b/docs/docs/examples/index.md @@ -0,0 +1,49 @@ +# Showcase + +Real-world examples showing what altimate can do across data engineering workflows. Each example demonstrates end-to-end automation — from discovery to implementation. + +
+ +- :material-pipe:{ .lg .middle } **Build, Test & Document dbt Models** + + --- + + Pull context from your Knowledge Hub, grab requirements from a Jira ticket, and build fully tested dbt models — all from your IDE. + + +- :material-snowflake:{ .lg .middle } **Find Broken Views in Snowflake** + + --- + + Create a "Sprint Work Agent" that queries Snowflake, finds empty views, traces root causes through dbt models, and files Jira tickets. + + +- :material-cash-multiple:{ .lg .middle } **Optimize Cost & Performance** + + --- + + Automate discovery and implementation of optimization opportunities across Snowflake, Databricks, and BigQuery. + + +- :material-swap-horizontal:{ .lg .middle } **Migrate PySpark to dbt** + + --- + + Convert a PySpark-based reporting project in Databricks to dbt with automated code conversion, testing, and validation. + + +- :material-bug:{ .lg .middle } **Debug an Airflow DAG** + + --- + + Use AI to debug Airflow DAGs by combining platform integrations, best-practice templates, and automated fix suggestions. + + +- :material-function:{ .lg .middle } **Write Snowflake UDFs** + + --- + + Use the Knowledge Hub to guide LLMs in building Snowflake UDFs with best practices, examples, and auto-generated documentation. + + +
diff --git a/docs/docs/getting-started.md b/docs/docs/getting-started.md index 75ebab958c..5895bfd23f 100644 --- a/docs/docs/getting-started.md +++ b/docs/docs/getting-started.md @@ -64,9 +64,7 @@ Once complete, altimate indexes your schemas and detects your tooling, enabling ### Option B: Manual configuration -Add a warehouse connection to your `altimate-code.json`. Here are minimal snippets for each warehouse type: - -#### Snowflake (quick-connect) +Add a warehouse connection to your `altimate-code.json`. Here's a quick example: ```json { @@ -83,52 +81,7 @@ Add a warehouse connection to your `altimate-code.json`. Here are minimal snippe } ``` -#### BigQuery (quick-connect) - -```json -{ - "warehouses": { - "bigquery": { - "type": "bigquery", - "project": "my-gcp-project", - "dataset": "analytics" - } - } -} -``` - -> Tip: Omit `service_account` to use Application Default Credentials (`gcloud auth application-default login`). - -#### Databricks (quick-connect) - -```json -{ - "warehouses": { - "databricks": { - "type": "databricks", - "host": "dbc-abc123.cloud.databricks.com", - "token": "${DATABRICKS_TOKEN}", - "warehouse_id": "abcdef1234567890", - "catalog": "main" - } - } -} -``` - -#### DuckDB (quick-connect) - -```json -{ - "warehouses": { - "duckdb": { - "type": "duckdb", - "database": "./dev.duckdb" - } - } -} -``` - -See [Warehouse connections](#warehouse-connections) below for full configuration options including key-pair auth, Redshift, and PostgreSQL. +For all warehouse types (Snowflake, BigQuery, Databricks, PostgreSQL, Redshift, DuckDB, MySQL, SQL Server) and advanced options (key-pair auth, ADC, SSH tunneling), see the [Warehouses reference](configure/warehouses.md). ## Step 4: Choose an Agent Mode @@ -162,94 +115,7 @@ altimate uses a JSON config file. 
Create `altimate-code.json` in your project ro ### Warehouse connections -```json -{ - "warehouses": { - "prod-snowflake": { - "type": "snowflake", - "account": "xy12345.us-east-1", - "user": "analytics_user", - "password": "${SNOWFLAKE_PASSWORD}", - "warehouse": "COMPUTE_WH", - "database": "ANALYTICS", - "role": "ANALYST_ROLE" - }, - "dev-duckdb": { - "type": "duckdb", - "database": "./dev.duckdb" - } - } -} -``` - -### Snowflake (key-pair auth) - -```json -{ - "warehouses": { - "snowflake-prod": { - "type": "snowflake", - "account": "xy12345.us-east-1", - "user": "svc_altimate", - "private_key_path": "~/.ssh/snowflake_rsa_key.p8", - "warehouse": "COMPUTE_WH", - "database": "ANALYTICS", - "role": "SYSADMIN" - } - } -} -``` - -### BigQuery - -```json -{ - "warehouses": { - "bigquery-prod": { - "type": "bigquery", - "project": "my-gcp-project", - "dataset": "analytics", - "service_account": "/path/to/service-account.json" - } - } -} -``` - -Or use Application Default Credentials (ADC). Just omit `service_account` and run `gcloud auth application-default login`. - -### Databricks - -```json -{ - "warehouses": { - "databricks-prod": { - "type": "databricks", - "host": "dbc-abc123.cloud.databricks.com", - "token": "${DATABRICKS_TOKEN}", - "warehouse_id": "abcdef1234567890", - "catalog": "main", - "schema": "default" - } - } -} -``` - -### PostgreSQL / Redshift - -```json -{ - "warehouses": { - "postgres-dev": { - "type": "postgres", - "host": "localhost", - "port": 5432, - "database": "analytics", - "user": "analyst", - "password": "${PG_PASSWORD}" - } - } -} -``` +For all warehouse types and configuration options, see the [Warehouses reference](configure/warehouses.md). ## Project-level config diff --git a/docs/docs/getting-started/index.md b/docs/docs/getting-started/index.md new file mode 100644 index 0000000000..44497fdbdd --- /dev/null +++ b/docs/docs/getting-started/index.md @@ -0,0 +1,182 @@ +--- +title: Altimate Code +hide: + - toc +--- + + + +
+ +

+ altimate-code +

+ +

The open-source data engineering harness.

+ +

50+ specialized data engineering tools for building, validating, optimizing, and shipping data products. Use in your terminal, CI pipeline, orchestration DAGs, or as the harness for your data agents. Evaluate across platforms, independent of any single warehouse provider.

+ +

+ +[Get Started](quickstart.md){ .md-button .md-button--primary } +[See Examples](../examples/index.md){ .md-button } +[View on GitHub :material-github:](https://github.com/AltimateAI/altimate-code){ .md-button } + +

+ +
+ +
+ +```bash +npm install -g altimate-code && altimate +``` + +
+ +--- + +

Why Altimate Code?

+

Every major data platform is building AI agents — but they're all locked to one ecosystem. Your data stack isn't.

+ +Your transformation logic is in dbt. Your orchestration is in Airflow or Dagster. Your warehouses span Snowflake and BigQuery (and maybe that Redshift cluster nobody wants to talk about). Your governance requirements cross every platform boundary. + +Altimate Code goes the other direction. It connects to your **entire** stack and lets you bring **any LLM** you want. No vendor lock-in. No platform tax. + +
+ +- :material-open-source-initiative:{ .lg .middle } **Open source & auditable** + + --- + + Every tool, every agent prompt, every analysis rule is inspectable, extensible, and auditable. For data teams in regulated industries, that's not a nice-to-have — it's a requirement. + +- :material-connection:{ .lg .middle } **Cross-platform, not single-vendor** + + --- + + Optimize a Snowflake query in the morning. Migrate a SQL Server pipeline to BigQuery in the afternoon. Same agent, same tools. No warehouse subscription required. First-class support for :material-snowflake: Snowflake, :material-google-cloud: BigQuery, :simple-databricks: Databricks, :material-elephant: PostgreSQL, :material-aws: Redshift, :material-duck: DuckDB, :material-database: MySQL, and :material-microsoft: SQL Server. + +- :material-cloud-outline:{ .lg .middle } **Works with any LLM** + + --- + + Model-agnostic — bring your own provider, use your existing subscription, or run locally. Swap models without swapping your harness. Supports :material-cloud: Anthropic, :material-creation: OpenAI, :material-google: Google Gemini, :material-google: Google Vertex AI, :material-aws: AWS Bedrock, :material-microsoft-azure: Azure OpenAI, :material-server: Ollama, :material-router-wireless: OpenRouter, :material-cog: Mistral, :material-lightning-bolt: Groq, :material-head-snowflake-outline: DeepInfra, :material-brain: Cerebras, :material-message-text: Cohere, :material-group: Together AI, :material-compass: Perplexity, :material-alpha-x-circle: xAI, and :material-github: GitHub Copilot. + +- :material-puzzle:{ .lg .middle } **Customizable to your workflow** + + --- + + Bring your own rules, agents, skills, and tools. Customize the framework to match your company's data conventions, naming standards, and testing patterns. 

- :material-shield-check:{ .lg .middle } **Governed by design — seven agent modes**

    ---

    Seven agent modes — Builder, Analyst, Validator, Migrator, Researcher, Trainer, and Executive — each with tool-level permissions you can `allow`, `ask`, or `deny` per agent. Layer on project rules via `AGENTS.md`, automatic context compaction for long sessions, and auto-formatting on every edit. Governance enforced by the harness.

+ +--- + +

50+ specialized tools

+

Unlike general-purpose coding agents, every tool is purpose-built for data engineering workflows.

+ +
+ +- :material-database-search:{ .lg .middle } **SQL Anti-Pattern Detection** + + --- + + 19 rules with confidence scoring. Catches SELECT *, missing filters, cartesian joins, non-sargable predicates, and more. 100% accuracy across 1,077 benchmark queries. + +- :material-graph-outline:{ .lg .middle } **Live Column-Level Lineage** + + --- + + Real-time lineage extraction from SQL. Trace any column back through joins, CTEs, and subqueries to its source. Not a cached graph — a living lineage that updates with every change. + +- :material-cash-multiple:{ .lg .middle } **FinOps & Cost Analysis** + + --- + + Credit analysis, expensive query detection, warehouse right-sizing, and unused resource cleanup. Specific optimization recommendations with estimated savings. + +- :material-translate:{ .lg .middle } **Cross-Dialect Translation** + + --- + + Deterministic engine translating SQL between Snowflake, BigQuery, Databricks, Redshift, PostgreSQL, MySQL, SQL Server, and DuckDB with lineage verification. + +- :material-shield-lock-outline:{ .lg .middle } **PII Detection & Safety** + + --- + + Automatic column scanning across 15+ PII categories. Safety checks and policy enforcement before every query touches production. + +- :material-pipe:{ .lg .middle } **dbt Native** + + --- + + Manifest parsing, test generation, model scaffolding, incremental model detection, and lineage-aware refactoring. Builds models that fit your project conventions. + +
+ +--- + +

See it in action

+

Build dbt models from Jira tickets, find broken Snowflake views, optimize warehouse costs, migrate PySpark to dbt, debug Airflow DAGs, and more — all from your terminal.


```bash

# Analyze a query for anti-patterns and optimization opportunities
> Analyze this query for issues: <query or file path>

# Translate SQL across dialects
> /sql-translate this Snowflake query to BigQuery: <query>

# Get a cost report for your Snowflake or Databricks account
> /cost-report

# Scaffold a new dbt model following your project patterns
> /model-scaffold fct_revenue from stg_orders and stg_payments

# Generate a column-level lineage report for sensitive columns
# from a particular table and identify owners
> Trace the lineage for email_id and name columns from
  customer_data.customer_info table and generate a report
  of where sensitive data is replicated with table owners info

# Migrate PySpark jobs to dbt models
> Migrate this PySpark ETL to a dbt model: <file path>

# Debug a failing Airflow DAG
> Debug this Airflow DAG failure: <error message>
```


[:octicons-arrow-right-24: Browse more examples](../examples/index.md)

+ +--- + +

Benchmarks

+

Precision matters. Here's where we stand.

+ +| Benchmark | Result | +|---|---| +| **ADE-Bench (DuckDB Local)** | **74.4%** pass rate (32/43 tasks) — 15.4 points ahead of dbt Fusion+MCP (59%). | +| **SQL Anti-Pattern Detection** | 100% accuracy across 1,077 queries, 19 categories. Zero false positives. | +| **Column-Level Lineage** | 100% edge match across 500 queries with complex joins, CTEs, and subqueries. | +| **Snowflake Query Optimization (TPC-H)** | 16.8% average execution speedup (3.6x vs baseline). | + +

[:octicons-arrow-right-24: Full benchmark details](https://www.altimate.sh/benchmarks)

+ +--- + + diff --git a/docs/docs/getting-started/quickstart-new.md b/docs/docs/getting-started/quickstart-new.md new file mode 100644 index 0000000000..3c5bb6e36c --- /dev/null +++ b/docs/docs/getting-started/quickstart-new.md @@ -0,0 +1,229 @@ +--- +description: "Get value from Altimate Code in 10 minutes. For data engineers who know dbt, Snowflake, and SQL — skip the basics, see what Altimate adds to your workflow." +--- + +# Quickstart + +--- + +## Step 1: Install + +```bash +npm install -g altimate-code +``` + +Or via Homebrew: `brew install AltimateAI/tap/altimate-code` + +--- + +## Step 2: Connect Your LLM + +```bash +altimate # Launch the TUI +/connect # Interactive setup +``` + +Or set an environment variable and skip the prompt: + +```bash +export ANTHROPIC_API_KEY=sk-ant-... +altimate +``` + +> **No API key?** Select **Codex** in `/connect` — it's built-in with no setup. + +--- + +## Step 3: Connect Your Warehouse + +### Option A: Auto-detect from dbt profiles + +If you have `~/.dbt/profiles.yml` configured: + +```bash +/discover +``` + +Altimate reads your dbt profiles and creates warehouse connections automatically. You'll see output like: + +``` +Found dbt project: jaffle_shop (dbt-snowflake) +Found profile: snowflake_prod → Added connection 'snowflake_prod' +Indexing schema... 
142 tables, 1,847 columns indexed
```

### Option B: Manual configuration

Add to `altimate-code.json` in your project root:

=== "Snowflake"

    ```json
    {
      "warehouses": {
        "snowflake": {
          "type": "snowflake",
          "account": "xy12345.us-east-1",
          "user": "dbt_user",
          "password": "{env:SNOWFLAKE_PASSWORD}",
          "warehouse": "TRANSFORM_WH",
          "database": "ANALYTICS",
          "schema": "PUBLIC",
          "role": "TRANSFORMER"
        }
      }
    }
    ```

=== "BigQuery"

    ```json
    {
      "warehouses": {
        "bigquery": {
          "type": "bigquery",
          "project": "my-project-id",
          "credentials_path": "~/.config/gcloud/application_default_credentials.json"
        }
      }
    }
    ```

=== "PostgreSQL"

    ```json
    {
      "warehouses": {
        "postgres": {
          "type": "postgres",
          "host": "localhost",
          "port": 5432,
          "database": "analytics",
          "user": "postgres",
          "password": "{env:POSTGRES_PASSWORD}"
        }
      }
    }
    ```

=== "DuckDB (local)"

    ```json
    {
      "warehouses": {
        "local": {
          "type": "duckdb",
          "path": "./data/analytics.duckdb"
        }
      }
    }
    ```

Then index the schema for autocomplete and analysis:

```bash
/schema-index snowflake
```

---

## Step 4: Your First Workflow — NYC Taxi Cab Analytics

Try this end-to-end example. Paste this prompt into the TUI:

```
Take the New York City taxi cab public dataset, bring up a DuckDB instance,
and build a dashboard showing areas of maximum coverage and lowest coverage.
Set up a complete dbt project with staging, intermediate, and mart layers,
and create an Airflow DAG to orchestrate the pipeline.
```

**What altimate does:**

1. **Downloads the NYC TLC trip data** into a local DuckDB instance
2. **Scaffolds a full dbt project** with proper directory structure:
   ```
   nyc_taxi/
     models/
       staging/
         stg_yellow_trips.sql
         stg_taxi_zones.sql
       intermediate/
         int_trips_by_zone.sql
         int_zone_coverage_stats.sql
       marts/
         fct_zone_coverage.sql
         dim_zones.sql
     seeds/
       taxi_zone_lookup.csv
     dbt_project.yml
     profiles.yml   # points to DuckDB
   ```
3. **Generates mart models** that aggregate pickup/dropoff counts per zone, rank zones by trip volume, and classify them as high-coverage or low-coverage
4. **Creates an Airflow DAG** (`dags/nyc_taxi_pipeline.py`) with tasks for data ingestion, `dbt run`, `dbt test`, and dashboard generation
5. **Builds an interactive dashboard** visualizing zone coverage across NYC — top zones, bottom zones, and geographic distribution

This single prompt exercises warehouse connections, dbt scaffolding, SQL generation, orchestration wiring, and visualization — the full altimate toolkit.

---

## Skill Discovery: What Can I Do?

Type `/` in the TUI to see all available skills. Here's a quick reference for common tasks:

| I want to... | Skill | Example |
| --- | --- | --- |
| Optimize a slow query | `/query-optimize` | `/query-optimize SELECT * FROM big_table` |
| Review SQL before merging | `/sql-review` | `/sql-review models/staging/stg_orders.sql` |
| Check Snowflake costs | `/cost-report` | `/cost-report` (last 30 days) |
| Scan for PII exposure | `/pii-audit` | `/pii-audit` (full schema) or `/pii-audit models/marts/` |
| Debug a dbt error | `/dbt-troubleshoot` | Paste the error message |
| Add tests to a model | `/dbt-test` | `/dbt-test models/staging/stg_orders.sql` |
| Document a model | `/dbt-docs` | `/dbt-docs models/marts/fct_revenue.sql` |
| Analyze downstream impact | `/dbt-analyze` | `/dbt-analyze stg_orders` (before refactoring) |
| Create a new dbt model | `/dbt-develop` | `Create a staging model for the raw_orders source` |
| Translate SQL dialects | `/sql-translate` | `/sql-translate snowflake bigquery SELECT DATEADD(...)` |
| Check migration safety | `/schema-migration` | `/schema-migration migrations/V003__alter_orders.sql` |
| Teach a pattern | `/teach` | `/teach @models/staging/stg_orders.sql` |

**Pro tip:** You don't need to memorize these. Just describe what you want in plain English — the agent routes to the right skill automatically.

---

## What's Next

+ +- :material-cog:{ .lg .middle } **Complete Setup** + + *** + + Advanced warehouse configs, all LLM providers, SSH tunneling, multi-environment setup. + + [:octicons-arrow-right-24: Complete Setup](quickstart.md) + +- :material-account-group:{ .lg .middle } **Agent Modes** + + *** + + Builder, Analyst, Validator, Migrator, Executive — choose the right permissions for your task. + + [:octicons-arrow-right-24: Agent Modes](../data-engineering/agent-modes.md) + +- :material-robot:{ .lg .middle } **CI & Automation** + + *** + + Run SQL review gates in GitHub Actions, block PRs with failing grades, automate cost reports. + + [:octicons-arrow-right-24: CI & Automation](../data-engineering/guides/ci-headless.md) + +- :material-school:{ .lg .middle } **Train Your Agent** + + *** + + Teach project-specific patterns, naming conventions, and SQL style rules. + + [:octicons-arrow-right-24: Training](../configure/skills.md) + +
diff --git a/docs/docs/getting-started/quickstart.md b/docs/docs/getting-started/quickstart.md new file mode 100644 index 0000000000..d60c6f337f --- /dev/null +++ b/docs/docs/getting-started/quickstart.md @@ -0,0 +1,451 @@ +--- +description: "Install altimate-code, connect your warehouse and LLM, configure agent modes, skills, and permissions." +--- + +# Setup + +> **You need:** npm 8+ or Homebrew. An API key for any supported LLM provider, or use Codex (built-in, no key required). + +--- + +## Step 1: Install + +```bash +# npm (recommended) +npm install -g altimate-code + +# Homebrew +brew install AltimateAI/tap/altimate-code +``` + +> **Zero additional setup.** One command install. + +--- + +## Step 2: Configure Your LLM + +```bash +altimate # Launch the TUI +/connect # Choose your provider and enter your API key +``` + +Or set an environment variable: + +```bash +export ANTHROPIC_API_KEY=your-key-here # Anthropic Claude (recommended) +export OPENAI_API_KEY=your-key-here # OpenAI +``` + +Minimal config file option (`altimate-code.json` in your project root): + +```json +{ + "provider": { + "anthropic": { + "apiKey": "{env:ANTHROPIC_API_KEY}" + } + }, + "model": "anthropic/claude-sonnet-4-6" +} +``` + +> **No API key?** Select **Codex** in the `/connect` menu. It's a built-in provider with no setup required. 
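The `{env:...}` syntax keeps secrets out of the config file: the value is read from the environment when the config is loaded. A rough TypeScript sketch of how that substitution could behave (illustrative only — `resolveEnvPlaceholders` is a hypothetical name, not the actual altimate-code implementation):

```typescript
// Illustrative sketch of {env:VAR} placeholder resolution (not the real implementation).
function resolveEnvPlaceholders(
  value: string,
  env: Record<string, string | undefined>,
): string {
  return value.replace(/\{env:([A-Za-z_][A-Za-z0-9_]*)\}/g, (_match, name) => {
    const resolved = env[name];
    if (resolved === undefined) {
      // Failing loudly here is safer than silently passing the literal placeholder on.
      throw new Error(`Missing environment variable: ${name}`);
    }
    return resolved;
  });
}

const apiKey = resolveEnvPlaceholders("{env:ANTHROPIC_API_KEY}", {
  ANTHROPIC_API_KEY: "sk-ant-example",
});
console.log(apiKey); // "sk-ant-example"
```

The practical takeaway: if a placeholder fails to resolve, check that the variable is exported in the shell that launches `altimate`.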
+ +### Changing your LLM provider + +Switch providers at any time by updating the `provider` and `model` fields in `altimate-code.json`: + +=== "Anthropic" + + ```json + { + "provider": { + "anthropic": { + "apiKey": "{env:ANTHROPIC_API_KEY}" + } + }, + "model": "anthropic/claude-sonnet-4-6" + } + ``` + +=== "OpenAI" + + ```json + { + "provider": { + "openai": { + "apiKey": "{env:OPENAI_API_KEY}" + } + }, + "model": "openai/gpt-4o" + } + ``` + +=== "AWS Bedrock" + + ```json + { + "provider": { + "bedrock": { + "region": "us-east-1", + "accessKeyId": "{env:AWS_ACCESS_KEY_ID}", + "secretAccessKey": "{env:AWS_SECRET_ACCESS_KEY}" + } + }, + "model": "bedrock/anthropic.claude-sonnet-4-6-v1" + } + ``` + +=== "Azure OpenAI" + + ```json + { + "provider": { + "azure": { + "apiKey": "{env:AZURE_OPENAI_API_KEY}", + "baseURL": "https://your-resource.openai.azure.com/openai/deployments/your-deployment" + } + }, + "model": "azure/gpt-4o" + } + ``` + +=== "Google Gemini" + + ```json + { + "provider": { + "google": { + "apiKey": "{env:GOOGLE_API_KEY}" + } + }, + "model": "google/gemini-2.5-pro" + } + ``` + +=== "Ollama (Local)" + + ```json + { + "provider": { + "ollama": { + "baseURL": "http://localhost:11434" + } + }, + "model": "ollama/llama3.1" + } + ``` + +=== "OpenRouter" + + ```json + { + "provider": { + "openrouter": { + "apiKey": "{env:OPENROUTER_API_KEY}" + } + }, + "model": "openrouter/anthropic/claude-sonnet-4-6" + } + ``` + +You can also set a smaller model for lightweight tasks like summarization: + +```json +{ + "model": "anthropic/claude-sonnet-4-6", + "small_model": "anthropic/claude-haiku-4-5-20251001" +} +``` + +--- + +## Step 3: Connect Your Warehouse + +### Auto-discover with `/discover` + +> Skip this step if you want to work locally. You can always run `/discover` later. 
+ +```bash +altimate /discover +``` + +Auto-detects your dbt projects, warehouse credentials from `~/.dbt/profiles.yml`, running Docker containers, and environment variables (`SNOWFLAKE_ACCOUNT`, `PGHOST`, `DATABASE_URL`, etc.). + +### Manual configuration + +Add a warehouse connection to `altimate-code.json`: + +=== "Snowflake" + + ```json + { + "warehouses": { + "snowflake": { + "type": "snowflake", + "account": "xy12345.us-east-1", + "user": "dbt_user", + "password": "{env:SNOWFLAKE_PASSWORD}", + "warehouse": "TRANSFORM_WH", + "database": "ANALYTICS", + "schema": "PUBLIC", + "role": "TRANSFORMER" + } + } + } + ``` + +=== "BigQuery" + + ```json + { + "warehouses": { + "bigquery": { + "type": "bigquery", + "project": "my-project-id", + "credentials_path": "~/.config/gcloud/application_default_credentials.json" + } + } + } + ``` + +=== "Databricks" + + ```json + { + "warehouses": { + "databricks": { + "type": "databricks", + "server_hostname": "dbc-abc123.cloud.databricks.com", + "http_path": "/sql/1.0/warehouses/abcdef", + "access_token": "{env:DATABRICKS_TOKEN}", + "catalog": "main", + "schema": "default" + } + } + } + ``` + +=== "PostgreSQL" + + ```json + { + "warehouses": { + "postgres": { + "type": "postgres", + "host": "localhost", + "port": 5432, + "database": "analytics", + "user": "postgres", + "password": "{env:POSTGRES_PASSWORD}" + } + } + } + ``` + +=== "DuckDB" + + ```json + { + "warehouses": { + "local": { + "type": "duckdb", + "path": "./data/analytics.duckdb" + } + } + } + ``` + +=== "Redshift" + + ```json + { + "warehouses": { + "redshift": { + "type": "redshift", + "host": "my-cluster.abc123.us-east-1.redshift.amazonaws.com", + "port": 5439, + "database": "analytics", + "user": "admin", + "password": "{env:REDSHIFT_PASSWORD}" + } + } + } + ``` + +All warehouse types support SSH tunneling for bastion hosts. See the [Warehouses reference](../configure/warehouses.md) for full options including key-pair auth, IAM roles, and ADC. 
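
As an illustration, a PostgreSQL connection routed through a bastion host might look like the sketch below. The `ssh_tunnel` block and its field names are assumptions for illustration only; check the Warehouses reference above for the exact schema your version supports.

```json
{
  "warehouses": {
    "postgres": {
      "type": "postgres",
      "host": "10.0.1.12",
      "port": 5432,
      "database": "analytics",
      "user": "postgres",
      "password": "{env:POSTGRES_PASSWORD}",
      "ssh_tunnel": {
        "host": "bastion.example.com",
        "user": "tunnel_user",
        "private_key_path": "~/.ssh/id_ed25519"
      }
    }
  }
}
```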
+ +Verify your connection: + +``` +> warehouse_test snowflake +✓ Connected successfully +``` + +--- + +## Step 4: Choose an Agent Mode + +altimate ships with specialized agent modes, each with its own tool permissions: + +| Mode | Access | Use when you want to... | +|---|---|---| +| **Builder** | Read/Write | Create and modify SQL, dbt models, pipelines | +| **Analyst** | Read-only | Explore production data safely, run cost analysis | +| **Validator** | Read + Validate | Check data quality, run anti-pattern detection | +| **Migrator** | Cross-warehouse | Translate SQL between dialects, plan migrations | +| **Researcher** | Read-only + Parallel | Deep-dive investigations, lineage tracing | +| **Trainer** | Read-only + Training | Teach the agent your project conventions | +| **Executive** | Read-only | Generate business-friendly reports and summaries | + +Switch modes in the TUI: + +``` +/agent analyst +``` + +Or from the CLI: + +```bash +altimate --agent analyst +``` + +The **Analyst** mode is production-safe — it blocks INSERT, UPDATE, DELETE, and DROP statements at the harness level. The **Builder** mode has full read/write access for creating and editing SQL and dbt files. + +--- + +## Step 5: Select Skills + +Skills are reusable prompt templates for common workflows. 
Type `/` in the TUI to browse all available skills: + +| Skill | Purpose | +|---|---| +| `/query-optimize` | Optimize slow queries with anti-pattern detection | +| `/sql-review` | SQL quality gate with grading | +| `/sql-translate` | Cross-dialect SQL translation | +| `/cost-report` | Snowflake/Databricks cost analysis | +| `/pii-audit` | Scan for PII exposure | +| `/dbt-develop` | Scaffold new dbt models | +| `/dbt-test` | Generate dbt tests | +| `/dbt-docs` | Generate dbt documentation | +| `/dbt-analyze` | Column-level lineage and impact analysis | +| `/dbt-troubleshoot` | Debug dbt errors | +| `/data-viz` | Interactive dashboards and visualizations | +| `/teach` | Teach patterns from example files | +| `/train` | Load standards from documents | + +You don't need to memorize these — describe what you want in plain English and the agent routes to the right skill automatically. + +### Custom skills + +Add your own skills as Markdown files in `.altimate-code/skill/`: + +```markdown +--- +name: cost-review +description: Review SQL queries for cost optimization +--- + +Analyze the SQL query for cost optimization opportunities. +Focus on: $ARGUMENTS +``` + +Skills are loaded from these paths (highest priority first): + +1. `.altimate-code/skill/` (project) +2. `~/.altimate-code/skills/` (global) +3. Custom paths via config: + +```json +{ + "skills": { + "paths": ["./my-skills", "~/shared-skills"] + } +} +``` + +--- + +## Step 6: Configure Permissions + +Governance is enforced at the harness level, not via prompts. Every tool has a permission level: `allow`, `ask`, or `deny`. 
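
A global baseline looks like this: prompt for shell commands by default, allow dbt commands and `git status`, and block destructive patterns outright:

```json
{
  "permission": {
    "bash": {
      "*": "ask",
      "dbt *": "allow",
      "git status": "allow",
      "DROP *": "deny",
      "rm *": "deny"
    }
  }
}
```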
+ +### Per-agent permissions + +Set tool permissions for each agent mode in `altimate-code.json`: + +```json +{ + "agent": { + "analyst": { + "permission": { + "write": "deny", + "edit": "deny", + "bash": { + "dbt docs generate": "allow", + "*": "deny" + } + } + }, + "builder": { + "permission": { + "write": "allow", + "edit": "allow", + "bash": { + "dbt *": "allow", + "rm -rf *": "deny" + } + } + } + } +} +``` + +### Project rules with AGENTS.md + +Define project-wide conventions in an `AGENTS.md` file at your project root. These rules are automatically loaded into every agent's system prompt: + +```markdown +# Project Rules + +- All staging models must be prefixed with `stg_` +- Never run queries without a WHERE clause on production tables +- Use `ref()` instead of hardcoded table names in dbt models +- All new models require at least one unique test and one not_null test +``` + +### Default permissions by agent mode + +| Agent | File writes | SQL writes | Bash | Training | +|---|---|---|---|---| +| Builder | allow | allow | ask | deny | +| Analyst | deny | deny (SELECT only) | deny | deny | +| Validator | deny | deny | ask | deny | +| Migrator | allow | allow | ask | deny | +| Researcher | deny | deny | allow | deny | +| Trainer | deny | deny | deny | allow | +| Executive | deny | deny | deny | deny | + +--- + +## Step 7: Build Your First Artifact + +In the TUI, paste this prompt: + +``` +Build a NYC taxi analytics dashboard using BigQuery public data and dbt +for transformations. Include geographic demand analysis with +pickup/dropoff hotspots, top routes, airport traffic, and borough +comparisons. Add revenue analytics with fare breakdowns, fare +distribution, tip analysis, payment trends, and revenue-per-mile +by route. 
+``` + +--- + +## What's Next + +- [Agent Modes](../data-engineering/agent-modes.md): Deep dive into each mode's capabilities +- [Warehouses Reference](../configure/warehouses.md): All warehouse types, auth methods, SSH tunneling +- [Config Reference](../configure/config.md): Full config file schema +- [CI & Automation](../data-engineering/guides/ci-headless.md): Run altimate in automated pipelines diff --git a/docs/docs/reference/changelog.md b/docs/docs/reference/changelog.md new file mode 100644 index 0000000000..063601ad13 --- /dev/null +++ b/docs/docs/reference/changelog.md @@ -0,0 +1,22 @@ +# What's New + +Release notes and changelog for Altimate Code. + +--- + +!!! note "Coming soon" + Detailed release notes will be published here with each version. For now, check the [GitHub releases](https://github.com/AltimateAI/altimate-code/releases) page for the latest updates. + +## How to check your version + +```bash +altimate --version +``` + +## How to upgrade + +```bash +npm update -g @altimateai/altimate-code +``` + +After upgrading, the TUI welcome banner shows what changed since your previous version. diff --git a/docs/docs/network.md b/docs/docs/reference/network.md similarity index 100% rename from docs/docs/network.md rename to docs/docs/reference/network.md diff --git a/docs/docs/security-faq.md b/docs/docs/reference/security-faq.md similarity index 90% rename from docs/docs/security-faq.md rename to docs/docs/reference/security-faq.md index 0c3669c42b..2491c9a165 100644 --- a/docs/docs/security-faq.md +++ b/docs/docs/reference/security-faq.md @@ -19,66 +19,11 @@ Altimate Code needs database credentials to connect to your warehouse. Credentia ## What can the agent actually execute? -Altimate Code can read files, write files, and run shell commands, but only with your permission. 
The [permission system](configure/permissions.md) lets you control every tool: - -| Level | Behavior | -|-------|----------| -| `"allow"` | Runs without confirmation | -| `"ask"` | Prompts you before each use | -| `"deny"` | Blocked entirely | - -By default, destructive operations like `bash`, `write`, and `edit` require confirmation. You can further restrict specific commands: - -```json -{ - "permission": { - "bash": { - "*": "ask", - "dbt *": "allow", - "git status": "allow", - "DROP *": "deny", - "rm *": "deny" - } - } -} -``` +Altimate Code can read files, write files, and run shell commands, but only with your permission. The [permission system](../configure/permissions.md) lets you set every tool to `"allow"`, `"ask"`, or `"deny"`, with pattern-based rules for fine-grained control. See the [Permissions reference](../configure/permissions.md) for the full configuration guide. ## Can I prevent the agent from modifying production databases? -Yes. Use pattern-based permissions to deny destructive SQL: - -```json -{ - "permission": { - "bash": { - "*": "ask", - "DROP *": "deny", - "DELETE *": "deny", - "TRUNCATE *": "deny", - "ALTER *": "deny" - } - } -} -``` - -You can also configure per-agent permissions. For example, restrict the `analyst` agent to read-only: - -```json -{ - "agent": { - "analyst": { - "permission": { - "write": "deny", - "edit": "deny", - "bash": { - "SELECT *": "allow", - "*": "deny" - } - } - } - } -} -``` +Yes. Use pattern-based permissions to deny destructive SQL (`DROP *`, `DELETE *`, `TRUNCATE *`), and per-agent permissions to restrict agents like `analyst` to read-only. See the [Permissions reference](../configure/permissions.md#pattern-based-permissions) for examples and recommended configurations. ## What network endpoints does Altimate Code contact? @@ -108,7 +53,7 @@ export ALTIMATE_CLI_MODELS_PATH=/path/to/models.json ## What telemetry is collected? 
-Anonymous usage telemetry, including event names, token counts, timing, and error types. **Never** code, queries, credentials, file paths, or prompt content. See the full [Telemetry reference](configure/telemetry.md) for the complete event list. +Anonymous usage telemetry, including event names, token counts, timing, and error types. **Never** code, queries, credentials, file paths, or prompt content. See the full [Telemetry reference](telemetry.md) for the complete event list. Disable telemetry entirely: @@ -255,7 +200,7 @@ Altimate Code applies safe defaults so you don't have to configure anything for **"Prompted"** means you'll see the command and can approve or reject it. **"Blocked"** means the agent cannot run it at all; you must override in config. -To override defaults, add rules in `altimate-code.json`. See [Permissions](configure/permissions.md) for the full configuration reference. +To override defaults, add rules in `altimate-code.json`. See [Permissions](../configure/permissions.md) for the full configuration reference. ## Best practices for staying safe @@ -263,7 +208,7 @@ To override defaults, add rules in `altimate-code.json`. See [Permissions](confi 2. **Work on a branch.** Let the agent work on a feature branch so you can review changes before merging. Git gives you a full safety net. This is the single most effective protection. -3. **Use per-agent permissions.** Give each agent only what it needs. The `analyst` agent doesn't need write access. See [Permissions](configure/permissions.md) for examples. +3. **Use per-agent permissions.** Give each agent only what it needs. The `analyst` agent doesn't need write access. See [Permissions](../configure/permissions.md) for examples. 4. **Use read-only database credentials for exploration.** When using the agent for analysis or ad-hoc queries, connect with a read-only database user. 
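
On Snowflake, for example, a dedicated read-only role can be provisioned like this (role, warehouse, database, and user names are illustrative; adapt them to your environment):

```sql
-- Illustrative names; grants cover read-only exploration only.
CREATE ROLE IF NOT EXISTS agent_readonly;
GRANT USAGE ON WAREHOUSE compute_wh TO ROLE agent_readonly;
GRANT USAGE ON DATABASE analytics TO ROLE agent_readonly;
GRANT USAGE ON ALL SCHEMAS IN DATABASE analytics TO ROLE agent_readonly;
GRANT SELECT ON ALL TABLES IN DATABASE analytics TO ROLE agent_readonly;
GRANT ROLE agent_readonly TO USER agent_user;
```

Point the agent's connection at a user holding only this role and writes fail at the warehouse itself, independent of harness permissions.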
diff --git a/docs/docs/configure/telemetry.md b/docs/docs/reference/telemetry.md similarity index 99% rename from docs/docs/configure/telemetry.md rename to docs/docs/reference/telemetry.md index 0aefe92620..5d499d8e4d 100644 --- a/docs/docs/configure/telemetry.md +++ b/docs/docs/reference/telemetry.md @@ -93,7 +93,7 @@ Telemetry data is sent to Azure Application Insights: |----------|---------| | `eastus-8.in.applicationinsights.azure.com` | Telemetry ingestion | -For a complete list of network endpoints, see the [Network Reference](../network.md). +For a complete list of network endpoints, see the [Network Reference](network.md). ## For Contributors diff --git a/docs/docs/troubleshooting.md b/docs/docs/reference/troubleshooting.md similarity index 100% rename from docs/docs/troubleshooting.md rename to docs/docs/reference/troubleshooting.md diff --git a/docs/docs/windows-wsl.md b/docs/docs/reference/windows-wsl.md similarity index 100% rename from docs/docs/windows-wsl.md rename to docs/docs/reference/windows-wsl.md diff --git a/docs/mkdocs.yml b/docs/mkdocs.yml index c19639c515..e281960f94 100644 --- a/docs/mkdocs.yml +++ b/docs/mkdocs.yml @@ -73,77 +73,64 @@ markdown_extensions: emoji_generator: !!python/name:material.extensions.emoji.to_svg nav: - - Home: index.md - - - Get Started: - - Quickstart: quickstart.md - - Full Setup: getting-started.md - - Agent Modes: data-engineering/agent-modes.md + - Getting Started: + - Overview: getting-started/index.md + - Quickstart: getting-started/quickstart-new.md + - Complete Setup: getting-started/quickstart.md + - Examples: + - Showcase: examples/index.md + - Guides: + - Cost Optimization: data-engineering/guides/cost-optimization.md + - Migration: data-engineering/guides/migration.md + - CI & Automation: data-engineering/guides/ci-headless.md + - Use: + - Agents: + - Agent Modes: data-engineering/agent-modes.md + - Agent Config: configure/agents.md + - Tools: + - Overview: configure/tools/index.md + - Built-in Tools: 
configure/tools/config.md + - Core Tools: configure/tools/core-tools.md + - SQL Tools: data-engineering/tools/sql-tools.md + - Schema Tools: data-engineering/tools/schema-tools.md + - FinOps Tools: data-engineering/tools/finops-tools.md + - Lineage Tools: data-engineering/tools/lineage-tools.md + - dbt Tools: data-engineering/tools/dbt-tools.md + - Warehouse Tools: data-engineering/tools/warehouse-tools.md + - Custom Tools: configure/tools/custom.md + - Skills: configure/skills.md + - Commands: configure/commands.md - Interfaces: - - Terminal UI: usage/tui.md + - TUI: usage/tui.md - CLI: usage/cli.md - - IDE / VS Code: usage/ide.md - - Web UI: usage/web.md - - - Guides: - - Cost Optimization: data-engineering/guides/cost-optimization.md - - SQL Migration: data-engineering/guides/migration.md - - CI & Automation: data-engineering/guides/ci-headless.md - - - Tools: - - SQL Analysis: data-engineering/tools/sql-tools.md - - Schema & Metadata: data-engineering/tools/schema-tools.md - - Column-Level Lineage: data-engineering/tools/lineage-tools.md - - dbt Integration: data-engineering/tools/dbt-tools.md - - Cost & FinOps: data-engineering/tools/finops-tools.md - - Warehouse Tools: data-engineering/tools/warehouse-tools.md - - - Integrations: - - GitHub Actions: usage/github.md - - GitLab CI: usage/gitlab.md - - Claude Code: data-engineering/guides/using-with-claude-code.md - - Codex: data-engineering/guides/using-with-codex.md - - MCP Servers: configure/mcp-servers.md - - LSP: configure/lsp.md - - ACP: configure/acp.md - + - IDE: usage/ide.md + - GitHub: usage/github.md + - GitLab: usage/gitlab.md - Configure: - - Config Files: configure/config.md - - AI Providers & Models: + - Overview: configure/index.md + - Warehouses: configure/warehouses.md + - LLMs: - Providers: configure/providers.md - Models: configure/models.md - - Agents & Skills: - - Agents: configure/agents.md - - Skills: configure/skills.md - - Tools & Access: - - Allowed Tools: configure/tools.md - - Custom 
Tools: configure/custom-tools.md - - Access Control: configure/permissions.md - - Behavior: - - Rules: configure/rules.md - - Commands: configure/commands.md - - Context Management: configure/context-management.md - - Memory: data-engineering/tools/memory-tools.md - - Training: - - Overview: data-engineering/training/index.md - - Team Deployment: data-engineering/training/team-deployment.md + - MCPs & ACPs: + - MCP Servers: configure/mcp-servers.md + - ACP Support: configure/acp.md - Appearance: - Themes: configure/themes.md - Keybinds: configure/keybinds.md - - Formatters: configure/formatters.md - - Observability: - - Tracing: configure/tracing.md - - Telemetry: configure/telemetry.md - - Network & Proxy: network.md - - Windows / WSL: windows-wsl.md - - - Extend: - - SDK: develop/sdk.md - - Server API: develop/server.md - - Plugins: develop/plugins.md - - Ecosystem: develop/ecosystem.md - + - Additional Config: + - LSP Servers: configure/lsp.md + - Network: reference/network.md + - Windows / WSL: reference/windows-wsl.md + - Config File Reference: configure/config.md + - Governance: + - Overview: configure/governance.md + - Rules: configure/rules.md + - Permissions: configure/permissions.md + - Context Management: configure/context-management.md + - Formatters: configure/formatters.md - Reference: - - Security FAQ: security-faq.md - - Troubleshooting: troubleshooting.md - - Changelog: https://github.com/AltimateAI/altimate-code/blob/main/CHANGELOG.md + - What's New: reference/changelog.md + - Security FAQ: reference/security-faq.md + - Troubleshooting: reference/troubleshooting.md + - Telemetry: reference/telemetry.md From 2f703a63ee94d45db732f7662e966940f0f45061 Mon Sep 17 00:00:00 2001 From: Pradnesh Date: Wed, 18 Mar 2026 17:39:05 -0700 Subject: [PATCH 07/13] docs: move CI & Headless under Interfaces, deduplicate from CLI Move CI page from data-engineering/guides to usage/. 
Remove duplicate non-interactive and tracing sections from CLI page, link to CI instead.

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 docs/docs/usage/ci-headless.md | 155 +++++++++++++++++++++++++++++++++
 docs/docs/usage/cli.md | 28 +----
 docs/mkdocs.yml | 2 +-
 3 files changed, 157 insertions(+), 28 deletions(-)
 create mode 100644 docs/docs/usage/ci-headless.md

diff --git a/docs/docs/usage/ci-headless.md b/docs/docs/usage/ci-headless.md
new file mode 100644
index 0000000000..471acdaada
--- /dev/null
+++ b/docs/docs/usage/ci-headless.md
@@ -0,0 +1,155 @@
+# CI & Headless Mode
+
+Run any altimate prompt non-interactively from scripts, CI pipelines, or scheduled jobs. No TUI. Output is plain text or JSON.
+
+---
+
+## Basic Usage
+
+```bash
+altimate run "your prompt here"
+```
+
+Key flags:
+
+| Flag | Description |
+|---|---|
+| `--output json` | Structured JSON output instead of plain text |
+| `--model <model>` | Override the configured model |
+| `--connection <name>` | Select a specific warehouse connection |
+| `--no-color` | Disable ANSI color codes (for CI logs) |
+
+See `altimate run --help` for the full flag list, or [CLI Reference](cli.md).
+ +--- + +## Environment Variables for CI + +Configure without committing an `altimate-code.json` file: + +```bash +# LLM provider +ALTIMATE_PROVIDER=anthropic +ALTIMATE_ANTHROPIC_API_KEY=your-key-here + +# Or OpenAI +ALTIMATE_PROVIDER=openai +ALTIMATE_OPENAI_API_KEY=your-key-here + +# Warehouse (Snowflake example) +SNOWFLAKE_ACCOUNT=myorg-myaccount +SNOWFLAKE_USER=ci_user +SNOWFLAKE_PASSWORD=${{ secrets.SNOWFLAKE_PASSWORD }} +SNOWFLAKE_DATABASE=analytics +SNOWFLAKE_SCHEMA=public +SNOWFLAKE_WAREHOUSE=compute_wh +``` + +--- + +## Exit Codes + +| Code | Meaning | +|---|---| +| `0` | Success (task completed) | +| `1` | Task completed but result indicates issues (e.g., anti-patterns found) | +| `2` | Configuration error (missing API key, bad connection) | +| `3` | Tool execution error (warehouse unreachable, query failed) | + +Use exit codes to fail CI on actionable findings: + +```bash +altimate run "validate models in models/staging/ for anti-patterns" || exit 1 +``` + +--- + +## Worked Examples + +### Example 1: Nightly Cost Check (GitHub Actions) + +```yaml +# .github/workflows/cost-check.yml +name: Nightly Cost Check + +on: + schedule: + - cron: '0 8 * * 1-5' # 8am UTC, weekdays + +jobs: + cost-check: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v4 + + - name: Install altimate + run: npm install -g altimate-code + + - name: Run cost report + env: + ALTIMATE_PROVIDER: anthropic + ALTIMATE_ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }} + SNOWFLAKE_ACCOUNT: ${{ secrets.SNOWFLAKE_ACCOUNT }} + SNOWFLAKE_USER: ${{ secrets.SNOWFLAKE_CI_USER }} + SNOWFLAKE_PASSWORD: ${{ secrets.SNOWFLAKE_CI_PASSWORD }} + SNOWFLAKE_DATABASE: analytics + SNOWFLAKE_WAREHOUSE: compute_wh + run: | + altimate run "/cost-report" --output json > cost-report.json + cat cost-report.json + + - name: Upload cost report + uses: actions/upload-artifact@v4 + with: + name: cost-report + path: cost-report.json +``` + +### Example 2: Post-Deploy SQL Validation + +Add to your dbt 
deployment workflow to catch anti-patterns before they reach production: + +```yaml + - name: SQL anti-pattern check + env: + ALTIMATE_PROVIDER: anthropic + ALTIMATE_ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }} + run: | + altimate run "validate all SQL files in models/staging/ for anti-patterns and fail if any are found" \ + --no-color \ + --output json +``` + +### Example 3: Automated Test Generation (Pre-commit) + +```bash +#!/bin/bash +# .git/hooks/pre-commit +# Generate tests for any staged SQL model files + +STAGED_MODELS=$(git diff --cached --name-only --diff-filter=A | grep "models/.*\.sql") + +if [ -n "$STAGED_MODELS" ]; then + echo "Generating tests for new models..." + altimate run "/generate-tests for: $STAGED_MODELS" --no-color +fi +``` + +--- + +## Tracing in Headless Mode + +Tracing works in headless mode. View traces after the run: + +```bash +altimate trace list +altimate trace view +``` + +See [Tracing](../configure/tracing.md) for the full trace reference. + +--- + +## Security Recommendation + +Use a **read-only warehouse user** for CI jobs that only need to read data. Reserve write-access credentials for jobs that explicitly need them (e.g., test generation that writes files). See [Security FAQ](../reference/security-faq.md) and [Permissions](../configure/permissions.md). 
diff --git a/docs/docs/usage/cli.md b/docs/docs/usage/cli.md index 221082ae7b..5893bd32f2 100644 --- a/docs/docs/usage/cli.md +++ b/docs/docs/usage/cli.md @@ -112,30 +112,4 @@ Configuration can be controlled via environment variables: | `ALTIMATE_CLI_EXPERIMENTAL_PLAN_MODE` | Enable plan mode | | `ALTIMATE_CLI_ENABLE_EXA` | Enable Exa web search | -## Non-interactive Usage - -```bash -# Pipe input -echo "explain this SQL" | altimate run - -# With a specific model -altimate run --model anthropic/claude-sonnet-4-6 "optimize my warehouse" - -# Print logs for debugging -altimate --print-logs --log-level DEBUG run "test query" - -# Disable tracing for a single run -altimate run --no-trace "quick question" -``` - -## Tracing - -Every `run` command automatically saves a trace file with the full session details, including generations, tool calls, tokens, cost, and timing. See [Tracing](../configure/tracing.md) for configuration options. - -```bash -# List recent traces -altimate trace list - -# View a trace in the browser -altimate trace view -``` +For non-interactive usage, CI pipelines, and headless automation, see [CI & Automation](ci-headless.md). 
diff --git a/docs/mkdocs.yml b/docs/mkdocs.yml index e281960f94..64efd7477d 100644 --- a/docs/mkdocs.yml +++ b/docs/mkdocs.yml @@ -82,7 +82,6 @@ nav: - Guides: - Cost Optimization: data-engineering/guides/cost-optimization.md - Migration: data-engineering/guides/migration.md - - CI & Automation: data-engineering/guides/ci-headless.md - Use: - Agents: - Agent Modes: data-engineering/agent-modes.md @@ -103,6 +102,7 @@ nav: - Interfaces: - TUI: usage/tui.md - CLI: usage/cli.md + - CI: usage/ci-headless.md - IDE: usage/ide.md - GitHub: usage/github.md - GitLab: usage/gitlab.md From 2b5e5b21d8b2c485a7ab7872d09aa7748c876aa2 Mon Sep 17 00:00:00 2001 From: Pradnesh Date: Wed, 18 Mar 2026 17:51:28 -0700 Subject: [PATCH 08/13] docs: simplify agents page, quickstart next-steps, and nav label MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Remove data-engineering-specific agent table from agents.md (now covered elsewhere), replace grid cards in quickstart with a compact link list, and rename "Complete Setup" → "Setup" in nav. 
Co-Authored-By: Claude Opus 4.6 (1M context) --- docs/docs/configure/agents.md | 17 --------- docs/docs/getting-started/quickstart-new.md | 38 ++------------------- docs/mkdocs.yml | 2 +- 3 files changed, 4 insertions(+), 53 deletions(-) diff --git a/docs/docs/configure/agents.md b/docs/docs/configure/agents.md index 19a17b7bd6..eecbd7d945 100644 --- a/docs/docs/configure/agents.md +++ b/docs/docs/configure/agents.md @@ -13,23 +13,6 @@ Agents define different AI personas with specific models, prompts, permissions, | `build` | Build-focused agent that prioritizes code generation | | `explore` | Read-only exploration agent | -### Data Engineering - -| Agent | Description | Permissions | -|-------|------------|------------| -| `builder` | Create dbt models, SQL pipelines, transformations | Full read/write | -| `analyst` | Explore data, run SELECT queries, generate insights | Read-only (enforced) | -| `validator` | Data quality checks, schema validation, test coverage | Read + validate | -| `migrator` | Cross-warehouse SQL translation and migration | Read/write for migration | -| `researcher` | Deep multi-step investigations, root cause analysis | Read-only + parallel | -| `trainer` | Teach conventions, manage training entries | Read-only + training | -| `executive` | Business-friendly reporting, health dashboards | Read-only | - -For detailed examples and usage guidance for each mode, see [Agent Modes](../data-engineering/agent-modes.md). - -!!! tip - Use the `analyst` agent when exploring data to ensure no accidental writes. Switch to `builder` when you are ready to create or modify models. - ## Custom Agents Define custom agents in `altimate-code.json`: diff --git a/docs/docs/getting-started/quickstart-new.md b/docs/docs/getting-started/quickstart-new.md index 3c5bb6e36c..44ecc3540c 100644 --- a/docs/docs/getting-started/quickstart-new.md +++ b/docs/docs/getting-started/quickstart-new.md @@ -192,38 +192,6 @@ Type `/` in the TUI to see all available skills. 
Here's a quick reference for co ## What's Next -
- -- :material-cog:{ .lg .middle } **Complete Setup** - - *** - - Advanced warehouse configs, all LLM providers, SSH tunneling, multi-environment setup. - - [:octicons-arrow-right-24: Complete Setup](quickstart.md) - -- :material-account-group:{ .lg .middle } **Agent Modes** - - *** - - Builder, Analyst, Validator, Migrator, Executive — choose the right permissions for your task. - - [:octicons-arrow-right-24: Agent Modes](../data-engineering/agent-modes.md) - -- :material-robot:{ .lg .middle } **CI & Automation** - - *** - - Run SQL review gates in GitHub Actions, block PRs with failing grades, automate cost reports. - - [:octicons-arrow-right-24: CI & Automation](../data-engineering/guides/ci-headless.md) - -- :material-school:{ .lg .middle } **Train Your Agent** - - *** - - Teach project-specific patterns, naming conventions, and SQL style rules. - - [:octicons-arrow-right-24: Training](../configure/skills.md) - -
+- **[Setup](quickstart.md)** — Warehouses, LLM providers, agent modes, skills, and permissions +- **[Examples](../examples/index.md)** — End-to-end walkthroughs for common data engineering tasks +- **[Interfaces](../usage/tui.md)** — TUI, CLI, CI, IDE, and GitHub/GitLab integrations diff --git a/docs/mkdocs.yml b/docs/mkdocs.yml index 64efd7477d..53995ffc3c 100644 --- a/docs/mkdocs.yml +++ b/docs/mkdocs.yml @@ -76,7 +76,7 @@ nav: - Getting Started: - Overview: getting-started/index.md - Quickstart: getting-started/quickstart-new.md - - Complete Setup: getting-started/quickstart.md + - Setup: getting-started/quickstart.md - Examples: - Showcase: examples/index.md - Guides: From ec708a8a0640f90022e4e51dc5c5f1af0c34e2d2 Mon Sep 17 00:00:00 2001 From: Pradnesh Date: Wed, 18 Mar 2026 18:11:17 -0700 Subject: [PATCH 09/13] docs: sync missing content from v2, rename What's New to Changelog - Rename "What's New" to "Changelog" across docs, nav, and README - Populate changelog with full release history (v0.1.0 through v0.4.9) - Add inline permission examples to security-faq (permission table, JSON configs) - Add Data Engineering agents table and Agent Permissions example to agents page - Add Non-interactive Usage and Tracing sections to CLI docs - Add missing nav entries: Web UI, Claude Code/Codex guides, Memory Tools, Observability (Tracing/Telemetry), Training, and Extend (SDK/Server/Plugins/Ecosystem) Co-Authored-By: Claude Opus 4.6 (1M context) --- README.md | 2 +- docs/docs/configure/agents.md | 35 ++- docs/docs/reference/changelog.md | 342 +++++++++++++++++++++++++++- docs/docs/reference/security-faq.md | 59 ++++- docs/docs/usage/cli.md | 30 ++- docs/mkdocs.yml | 18 +- 6 files changed, 473 insertions(+), 13 deletions(-) diff --git a/README.md b/README.md index ee2330f1d7..5e3aa99cce 100644 --- a/README.md +++ b/README.md @@ -219,7 +219,7 @@ Contributions welcome — docs, SQL rules, warehouse connectors, and TUI improve **[Read CONTRIBUTING.md →](./CONTRIBUTING.md)** 
-## What's New +## Changelog - **v0.4.2** (March 2026) — yolo mode, Python engine elimination (all-native TypeScript), tool consolidation, path sandboxing hardening, altimate-dbt CLI, unscoped npm package - **v0.4.1** (March 2026) — env-based skill selection, session caching, tracing improvements diff --git a/docs/docs/configure/agents.md b/docs/docs/configure/agents.md index eecbd7d945..1e476ce3af 100644 --- a/docs/docs/configure/agents.md +++ b/docs/docs/configure/agents.md @@ -13,6 +13,18 @@ Agents define different AI personas with specific models, prompts, permissions, | `build` | Build-focused agent that prioritizes code generation | | `explore` | Read-only exploration agent | +### Data Engineering + +| Agent | Description | Permissions | +|-------|------------|------------| +| `builder` | Create dbt models, SQL pipelines, transformations | Full read/write | +| `analyst` | Explore data, run SELECT queries, generate insights | Read-only (enforced) | +| `validator` | Data quality checks, schema validation, test coverage | Read + validate | +| `migrator` | Cross-warehouse SQL translation and migration | Read/write for migration | + +!!! tip + Use the `analyst` agent when exploring data to ensure no accidental writes. Switch to `builder` when you are ready to create or modify models. + ## Custom Agents Define custom agents in `altimate-code.json`: @@ -78,7 +90,28 @@ You are a Snowflake cost optimization expert. For every query: ## Agent Permissions -Each agent can have its own permission overrides that restrict or expand the default permissions. For full details, examples, and recommended configurations, see the [Permissions reference](permissions.md#per-agent-permissions). 
+Each agent can have its own permission overrides that restrict or expand the default permissions: + +```json +{ + "agent": { + "analyst": { + "permission": { + "write": "deny", + "edit": "deny", + "bash": { + "dbt show *": "allow", + "dbt list *": "allow", + "*": "deny" + } + } + } + } +} +``` + +!!! warning + Agent-specific permissions override global permissions. A `"deny"` at the agent level cannot be overridden by a global `"allow"`. ## Switching Agents diff --git a/docs/docs/reference/changelog.md b/docs/docs/reference/changelog.md index 063601ad13..32d7a00d93 100644 --- a/docs/docs/reference/changelog.md +++ b/docs/docs/reference/changelog.md @@ -1,11 +1,9 @@ -# What's New +# Changelog -Release notes and changelog for Altimate Code. +All notable changes to this project will be documented in this file. ---- - -!!! note "Coming soon" - Detailed release notes will be published here with each version. For now, check the [GitHub releases](https://github.com/AltimateAI/altimate-code/releases) page for the latest updates. +The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), +and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). ## How to check your version @@ -20,3 +18,335 @@ npm update -g @altimateai/altimate-code ``` After upgrading, the TUI welcome banner shows what changed since your previous version. 
+ +--- + +## [0.4.9] - 2026-03-18 + +### Added + +- Script to build and run compiled binary locally (#262) + +### Fixed + +- Snowflake auth — support all auth methods (`password`, `keypair`, `externalbrowser`, `oauth`), fix field name mismatches (#268) +- dbt tool regression — schema format mismatch, silent failures, wrong results (#263) +- `altimate-dbt compile`, `execute`, and children commands fail with runtime errors (#255) +- `Cannot find module @altimateai/altimate-core` on `npm install` (#259) +- Dispatcher tests fail in CI due to shared module state (#257) + +### Changed + +- CI: parallel per-target builds — 12 jobs, ~5 min wall clock instead of ~20 min (#254) +- CI: faster release — build parallel with test, lower compression, tighter timeouts (#251) +- Docker E2E tests skip in CI unless explicitly opted in (#253) + +## [0.4.1] - 2026-03-16 +## [0.4.2] - 2026-03-18 + +### Breaking Changes + +- **Python engine eliminated** — all 73 tool methods now run natively in TypeScript. No Python, pip, venv, or `altimate-engine` installation required. Fixes #210. 
+ +### Added + +- `@altimateai/drivers` shared workspace package with 10 database drivers (Snowflake, BigQuery, PostgreSQL, Databricks, Redshift, MySQL, SQL Server, Oracle, DuckDB, SQLite) +- Direct `@altimateai/altimate-core` napi-rs bindings — SQL analysis calls go straight to Rust (no Python intermediary) +- dbt-first SQL execution — automatically uses `profiles.yml` connection when in a dbt project +- Warehouse telemetry (5 event types: connect, query, introspection, discovery, census) +- 340+ new tests including E2E tests against live Snowflake, BigQuery, and Databricks accounts +- Encrypted key-pair auth support for Snowflake (PKCS8 PEM with passphrase) +- Comprehensive driver documentation at `docs/docs/drivers.md` + +### Fixed + +- Python bridge connection failures for UV, conda, and non-standard venv setups (#210) +- SQL injection in finops/schema queries (parameterized queries + escape utility) +- Credential store no longer saves plaintext passwords +- SSH tunnel cleanup on SIGINT/SIGTERM +- Race condition in connection registry for concurrent access +- Databricks DATE_SUB syntax +- Redshift describeTable column name +- SQL Server describeTable includes views +- Dispatcher telemetry wrapped in try/catch +- Flaky test timeouts + +### Removed + +- `packages/altimate-engine/` — entire Python package (~17,000 lines) +- `packages/opencode/src/altimate/bridge/` — JSON-RPC bridge +- `.github/workflows/publish-engine.yml` — PyPI publish workflow + +### Added + +- Local-first tracing system replacing Langfuse (#183) + +### Fixed + +- Engine not found when user's project has `.venv` in cwd — managed venv now takes priority (#199) +- Missing `[warehouses]` pip extra causing FinOps tools to fail with "snowflake-connector-python not installed" (#199) +- Engine install trusting stale manifest when venv/Python binary was deleted (#199) +- Extras changes not detected on upgrade — manifest now tracks installed extras (#199) +- Windows path handling for dev/cwd venv 
resolution (#199) +- Concurrent bridge startup race condition — added `pendingStart` mutex (#199) +- Unhandled spawn `error` event crashing host process on invalid Python path (#199) +- Bridge hung permanently after ping failure — child process now cleaned up (#199) +- `restartCount` incorrectly incremented on signal kills, prematurely disabling bridge (#199) +- TUI prompt corruption from engine bootstrap messages writing to stderr (#180) +- Tracing exporter timeout leaking timers (#191) +- Feedback submission failing when repo labels don't exist (#188) +- Pre-release security and resource cleanup fixes for tracing (#197) + +## [0.4.0] - 2026-03-15 + +### Added + +- Data-viz skill for data storytelling and visualizations (#170) +- AI Teammate training system with learn-by-example patterns (#148) + +### Fixed + +- Sidebar shows "OpenCode" instead of "Altimate Code" after upstream merge (#168) +- Prevent upstream tags from polluting origin (#165) +- Show welcome box on first CLI run, not during postinstall (#163) + +### Changed + +- Engine version bumped to 0.4.0 + +## [0.3.1] - 2026-03-15 + +### Fixed + +- Database migration crash when upgrading from v0.2.x — backfill NULL migration names for Drizzle beta.16 compatibility (#161) +- Install banner not visible during `npm install` — moved output from stdout to stderr (#161) +- Verbose changelog dump removed from CLI startup (#161) +- `altimate upgrade` detection broken — `method()` and `latest()` referenced upstream `opencode-ai` package names instead of `@altimateai/altimate-code` (#161) +- Brew formula detection and upgrade referencing `opencode` instead of `altimate-code` (#161) +- Homebrew tap updated to v0.3.0 (was stuck at 0.1.4 due to expired `HOMEBREW_TAP_TOKEN`) (#161) +- `.opencode/memory/` references in docs updated to `.altimate-code/memory/` (#161) +- Stale `@opencode-ai/plugin` reference in CONTRIBUTING.md (#161) + +### Changed + +- CI now uses path-based change detection to skip unaffected jobs (saves 
~100s on non-TS changes) (#161) +- Release workflow gated on test job passing (#157) +- Upstream merge restricted to published GitHub releases only (#150) + +## [0.3.0] - 2026-03-15 + +### Added + +- AI-powered prompt enhancement (#144) +- Altimate Memory — persistent cross-session memory with TTL, namespaces, citations, and audit logging (#136) +- Upstream merge with OpenCode v1.2.26 (#142) + +### Fixed + +- Sentry review findings from PR #144 (#147) +- OAuth token refresh retry and error handling for idle timeout (#133) +- Welcome banner on first CLI run after install/upgrade (#132) +- `@altimateai/altimate-code` npm package name restored after upstream rebase +- Replace `mock.module()` with `spyOn()` to fix 149 test failures (#153) + +### Changed + +- Rebrand user-facing references to Altimate Code (#134) +- Bump `@modelcontextprotocol/sdk` dependency (#139) +- Engine version bumped to 0.3.0 + +## [0.2.5] - 2026-03-13 + +### Added + +- `/feedback` command and `feedback_submit` tool for in-app user feedback (#89) +- Datamate manager — dynamic MCP server management (#99) +- Non-interactive mode for `mcp add` command with input validation +- `mcp remove` command +- Upstream merge with OpenCode v1.2.20 + +### Fixed + +- TUI crash after upstream merge (#98) +- `GitlabAuthPlugin` type incompatibility in plugin loader (#92) +- All test failures from fork restructure (#91) +- CI/CD workflow paths updated from `altimate-code` to `opencode` +- Fallback to global config when not in a git repo +- PR standards workflow `TEAM_MEMBERS` ref corrected from `dev` to `main` (#101) + +### Changed + +- Removed self-hosted runners from public repo CI (#110) +- Migrated CI/release to ARC runners (#93, #94) +- Reverted Windows tests to `windows-latest` (#95) +- Engine version bumped to 0.2.5 + +## [0.2.4] - 2026-03-04 + +### Added + +- E2E tests for npm install pipeline: postinstall script, bin wrapper, and publish output (#50) + +## [0.2.3] - 2026-03-04 + +### Added + +- Postinstall 
welcome banner and changelog display after upgrade (#48) + +### Fixed + +- Security: validate well-known auth command type before execution, add confirmation prompt (#45) +- CI/CD: SHA-pin all GitHub Actions, per-job least-privilege permissions (#45) +- MCP: fix copy-paste log messages, log init errors, prefix floating promises (#45) +- Session compaction: clean up compactionAttempts on abort to prevent memory leak (#45) +- Telemetry: retry failed flush events once with buffer-size cap (#45, #46) +- Telemetry: flush events before process exit (#46) +- TUI: resolve worker startup crash from circular dependency (#47) +- CLI: define ALTIMATE_CLI build-time constants for correct version reporting (#41) +- Address 4 issues found in post-v0.2.2 commits (#49) +- Address remaining code review issues from PR #39 (#43) + +### Changed + +- CI/CD: optimize pipeline with caching and parallel builds (#42) + +### Docs + +- Add security FAQ (#44) + +## [0.2.2] - 2026-03-05 + +### Fixed + +- Telemetry init: `Config.get()` failure outside Instance context no longer silently disables telemetry +- Telemetry init: called early in CLI middleware and worker thread so MCP/engine/auth events are captured +- Telemetry init: promise deduplication prevents concurrent init race conditions +- Telemetry: pre-init events are now buffered and flushed (previously silently dropped) +- Telemetry: user email is SHA-256 hashed before sending (privacy) +- Telemetry: error message truncation standardized to 500 chars across all event types +- Telemetry: `ALTIMATE_TELEMETRY_DISABLED` env var now actually checked in init +- Telemetry: MCP disconnect reports correct transport type instead of hardcoded `stdio` +- Telemetry: `agent_outcome` now correctly reports `"error"` outcome for failed sessions + +### Changed + +- Auth telemetry events use session context when available instead of hardcoded `"cli"` + +## [0.2.1] - 2026-03-05 + +### Added + +- Comprehensive telemetry instrumentation: 25 event types across 
auth, MCP servers, Python engine, provider errors, permissions, upgrades, context utilization, agent outcomes, workflow sequencing, and environment census +- Telemetry docs page with event table, privacy policy, opt-out instructions, and contributor guide +- AppInsights endpoint added to network firewall documentation +- `categorizeToolName()` helper for tool classification (sql, schema, dbt, finops, warehouse, lineage, file, mcp) +- `bucketCount()` helper for privacy-safe count bucketing + +### Fixed + +- Command loading made resilient to MCP/Skill initialization failures + +### Changed + +- CLI binary renamed from `altimate-code` to `altimate` + +## [0.2.0] - 2026-03-04 + +### Added + +- Context management: auto-compaction with overflow recovery, observation masking, and loop protection +- Context management: data-engineering-aware compaction template preserving warehouse, schema, dbt, and lineage context +- Context management: content-aware token estimation (code, JSON, SQL, text heuristics) +- Context management: observation masking replaces pruned tool outputs with fingerprinted summaries +- Context management: provider overflow detection for Azure OpenAI patterns +- CLI observability: telemetry module with session, generation, tool call, and error tracking +- `/discover` command for data stack setup with project_scan tool +- User documentation for context management configuration + +### Fixed + +- ContextOverflowError now triggers automatic compaction instead of a dead-end error +- `isOverflow()` correctly reserves headroom for models with separate input/output limits +- `NamedError.isInstance()` no longer crashes on null input +- Text part duration tracking now preserves original start timestamp +- Compaction loop protection: max 3 consecutive attempts per turn, counter resets between turns +- Negative usable context guard for models where headroom exceeds base capacity + +### Changed + +- Removed cost estimation and complexity scoring bindings +- Docs: 
redesigned homepage with hero, feature cards, and pill layouts +- Docs: reorganized sidebar navigation for better discoverability + +## [0.1.10] - 2026-03-03 + +### Fixed + +- Build: resolve @opentui/core parser.worker.js via import.meta.resolve for monorepo hoisting +- Build: output binary as `altimate-code` instead of `opencode` +- Publish: update Docker/AUR/Homebrew references from anomalyco/opencode to AltimateAI/altimate-code +- Publish: make Docker/AUR/Homebrew steps non-fatal +- Bin wrapper: look for `@altimateai/altimate-code-*` scoped platform packages +- Postinstall: resolve `@altimateai` scoped platform packages +- Dockerfile: update binary paths and names + +## [0.1.9] - 2026-03-02 + +### Fixed + +- Build: fix solid-plugin import to use bare specifier for monorepo hoisting +- CI: install warehouse extras for Python tests (duckdb, boto3, etc.) +- CI: restrict pytest collection to tests/ directory +- CI: fix all ruff lint errors in Python engine +- CI: fix remaining TypeScript test failures (agent rename, config URLs, Pydantic model) +- Update theme schema URLs and documentation references to altimate-code.dev + +## [0.1.8] - 2026-03-02 + +### Changed + +- Rename npm scope from `@altimate` to `@altimateai` for all packages +- Wrapper package is now `@altimateai/altimate-code` (no `-ai` suffix) + +### Fixed + +- CI: test fixture writes config to correct filename (`altimate-code.json`) +- CI: add `dev` optional dependency group to Python engine for pytest/ruff + +## [0.1.7] - 2026-03-02 + +### Changed + +- Improve TUI logo readability: redesign M, E, T, I letter shapes +- Add two-tone logo color: ALTIMATE in peach, CODE in purple + +### Fixed + +- Release: npm publish glob now finds scoped package directories +- Release: PyPI publish skips existing versions instead of failing + +## [0.1.5] - 2026-03-02 + +### Added + +- Anthropic OAuth plugin ported in-tree +- Docs site switched from Jekyll to Material for MkDocs + +### Fixed + +- Build script: restore 
`.trim()` on models API JSON to prevent syntax error in generated `models-snapshot.ts` +- Build script: fix archive path for scoped package names in release tarball/zip creation + +## [0.1.0] - 2025-06-01 + +### Added + +- Initial open-source release +- SQL analysis and formatting via Python engine +- Column-level lineage tracking +- dbt integration (profiles, lineage, `+` operator) +- Warehouse connectivity (Snowflake, BigQuery, Databricks, Postgres, DuckDB, MySQL) +- AI-powered SQL code review +- TUI interface with Solid.js +- MCP (Model Context Protocol) server support +- Auto-bootstrapping Python engine via uv diff --git a/docs/docs/reference/security-faq.md b/docs/docs/reference/security-faq.md index 2491c9a165..3206c435e9 100644 --- a/docs/docs/reference/security-faq.md +++ b/docs/docs/reference/security-faq.md @@ -19,11 +19,66 @@ Altimate Code needs database credentials to connect to your warehouse. Credentia ## What can the agent actually execute? -Altimate Code can read files, write files, and run shell commands, but only with your permission. The [permission system](../configure/permissions.md) lets you set every tool to `"allow"`, `"ask"`, or `"deny"`, with pattern-based rules for fine-grained control. See the [Permissions reference](../configure/permissions.md) for the full configuration guide. +Altimate Code can read files, write files, and run shell commands, but only with your permission. The [permission system](../configure/permissions.md) lets you control every tool: + +| Level | Behavior | +|-------|----------| +| `"allow"` | Runs without confirmation | +| `"ask"` | Prompts you before each use | +| `"deny"` | Blocked entirely | + +By default, destructive operations like `bash`, `write`, and `edit` require confirmation. 
You can further restrict specific commands: + +```json +{ + "permission": { + "bash": { + "*": "ask", + "dbt *": "allow", + "git status": "allow", + "DROP *": "deny", + "rm *": "deny" + } + } +} +``` ## Can I prevent the agent from modifying production databases? -Yes. Use pattern-based permissions to deny destructive SQL (`DROP *`, `DELETE *`, `TRUNCATE *`), and per-agent permissions to restrict agents like `analyst` to read-only. See the [Permissions reference](../configure/permissions.md#pattern-based-permissions) for examples and recommended configurations. +Yes. Use pattern-based permissions to deny destructive SQL: + +```json +{ + "permission": { + "bash": { + "*": "ask", + "DROP *": "deny", + "DELETE *": "deny", + "TRUNCATE *": "deny", + "ALTER *": "deny" + } + } +} +``` + +You can also configure per-agent permissions. For example, restrict the `analyst` agent to read-only: + +```json +{ + "agent": { + "analyst": { + "permission": { + "write": "deny", + "edit": "deny", + "bash": { + "SELECT *": "allow", + "*": "deny" + } + } + } + } +} +``` ## What network endpoints does Altimate Code contact? diff --git a/docs/docs/usage/cli.md b/docs/docs/usage/cli.md index 5893bd32f2..21ec08b2b7 100644 --- a/docs/docs/usage/cli.md +++ b/docs/docs/usage/cli.md @@ -112,4 +112,32 @@ Configuration can be controlled via environment variables: | `ALTIMATE_CLI_EXPERIMENTAL_PLAN_MODE` | Enable plan mode | | `ALTIMATE_CLI_ENABLE_EXA` | Enable Exa web search | -For non-interactive usage, CI pipelines, and headless automation, see [CI & Automation](ci-headless.md). 
+## Non-interactive Usage + +```bash +# Pipe input +echo "explain this SQL" | altimate run + +# With a specific model +altimate run --model anthropic/claude-sonnet-4-6 "optimize my warehouse" + +# Print logs for debugging +altimate --print-logs --log-level DEBUG run "test query" + +# Disable tracing for a single run +altimate run --no-trace "quick question" +``` + +For CI pipelines and headless automation, see [CI & Automation](ci-headless.md). + +## Tracing + +Every `run` command automatically saves a trace file with the full session details, including generations, tool calls, tokens, cost, and timing. See [Tracing](../configure/tracing.md) for configuration options. + +```bash +# List recent traces +altimate trace list + +# View a trace in the browser +altimate trace view +``` diff --git a/docs/mkdocs.yml b/docs/mkdocs.yml index 53995ffc3c..f1165280fe 100644 --- a/docs/mkdocs.yml +++ b/docs/mkdocs.yml @@ -82,6 +82,8 @@ nav: - Guides: - Cost Optimization: data-engineering/guides/cost-optimization.md - Migration: data-engineering/guides/migration.md + - Using with Claude Code: data-engineering/guides/using-with-claude-code.md + - Using with Codex: data-engineering/guides/using-with-codex.md - Use: - Agents: - Agent Modes: data-engineering/agent-modes.md @@ -96,12 +98,14 @@ nav: - Lineage Tools: data-engineering/tools/lineage-tools.md - dbt Tools: data-engineering/tools/dbt-tools.md - Warehouse Tools: data-engineering/tools/warehouse-tools.md + - Memory Tools: data-engineering/tools/memory-tools.md - Custom Tools: configure/tools/custom.md - Skills: configure/skills.md - Commands: configure/commands.md - Interfaces: - TUI: usage/tui.md - CLI: usage/cli.md + - Web UI: usage/web.md - CI: usage/ci-headless.md - IDE: usage/ide.md - GitHub: usage/github.md @@ -118,6 +122,12 @@ nav: - Appearance: - Themes: configure/themes.md - Keybinds: configure/keybinds.md + - Observability: + - Tracing: configure/tracing.md + - Telemetry: reference/telemetry.md + - Training: + - 
Overview: data-engineering/training/index.md + - Team Deployment: data-engineering/training/team-deployment.md - Additional Config: - LSP Servers: configure/lsp.md - Network: reference/network.md @@ -130,7 +140,11 @@ nav: - Context Management: configure/context-management.md - Formatters: configure/formatters.md - Reference: - - What's New: reference/changelog.md + - Changelog: reference/changelog.md - Security FAQ: reference/security-faq.md - Troubleshooting: reference/troubleshooting.md - - Telemetry: reference/telemetry.md + - Extend: + - SDK: develop/sdk.md + - Server API: develop/server.md + - Plugins: develop/plugins.md + - Ecosystem: develop/ecosystem.md From 0ae85c2c9a4ac5693be72029586358703c346647 Mon Sep 17 00:00:00 2001 From: Pradnesh Date: Wed, 18 Mar 2026 18:21:44 -0700 Subject: [PATCH 10/13] docs: update tool count from 50+ to 100+ across all docs Co-Authored-By: Claude Opus 4.6 (1M context) --- README.md | 4 ++-- docs/docs/configure/tools.md | 2 +- docs/docs/data-engineering/tools/index.md | 2 +- docs/docs/getting-started.md | 4 ++-- docs/docs/getting-started/index.md | 4 ++-- docs/docs/index.md | 4 ++-- docs/docs/llms.txt | 2 +- docs/docs/quickstart.md | 2 +- docs/mkdocs.yml | 2 +- 9 files changed, 13 insertions(+), 13 deletions(-) diff --git a/README.md b/README.md index 5e3aa99cce..7a9850dcb3 100644 --- a/README.md +++ b/README.md @@ -7,7 +7,7 @@ **The open-source data engineering harness.** -The intelligence layer for data engineering AI — 50+ deterministic tools for SQL analysis, +The intelligence layer for data engineering AI — 100+ deterministic tools for SQL analysis, column-level lineage, dbt, FinOps, and warehouse connectivity across every major cloud platform. 
Run standalone in your terminal, embed underneath Claude Code or Codex, or integrate @@ -223,7 +223,7 @@ Contributions welcome — docs, SQL rules, warehouse connectors, and TUI improve - **v0.4.2** (March 2026) — yolo mode, Python engine elimination (all-native TypeScript), tool consolidation, path sandboxing hardening, altimate-dbt CLI, unscoped npm package - **v0.4.1** (March 2026) — env-based skill selection, session caching, tracing improvements -- **v0.4.0** (Feb 2026) — data visualization skill, 50+ tools, training system +- **v0.4.0** (Feb 2026) — data visualization skill, 100+ tools, training system - **v0.3.x** — [See full changelog →](CHANGELOG.md) ## License diff --git a/docs/docs/configure/tools.md b/docs/docs/configure/tools.md index 89af267306..1149312866 100644 --- a/docs/docs/configure/tools.md +++ b/docs/docs/configure/tools.md @@ -24,7 +24,7 @@ altimate includes built-in tools that agents use to interact with your codebase ## Data Engineering Tools -In addition to built-in tools, altimate provides 50+ specialized data engineering tools. See the [Data Engineering Tools](../data-engineering/tools/index.md) section for details. +In addition to built-in tools, altimate provides 100+ specialized data engineering tools. See the [Data Engineering Tools](../data-engineering/tools/index.md) section for details. ## Tool Permissions diff --git a/docs/docs/data-engineering/tools/index.md b/docs/docs/data-engineering/tools/index.md index 30c4381491..5df590cc31 100644 --- a/docs/docs/data-engineering/tools/index.md +++ b/docs/docs/data-engineering/tools/index.md @@ -1,6 +1,6 @@ # Tools Reference -altimate has 50+ specialized tools organized by function. +altimate has 100+ specialized tools organized by function. 
| Category | Tools | Purpose | |---|---|---| diff --git a/docs/docs/getting-started.md b/docs/docs/getting-started.md index 5895bfd23f..06c11ca9fa 100644 --- a/docs/docs/getting-started.md +++ b/docs/docs/getting-started.md @@ -4,7 +4,7 @@ ## Why altimate? -altimate is the open-source data engineering harness with 50+ deterministic tools for building, validating, optimizing, and shipping data products. Unlike general-purpose coding agents, every tool is purpose-built for data engineering: +altimate is the open-source data engineering harness with 100+ deterministic tools for building, validating, optimizing, and shipping data products. Unlike general-purpose coding agents, every tool is purpose-built for data engineering: | Capability | General coding agents | altimate | |---|---|---| @@ -216,4 +216,4 @@ Generate data quality tests for all models in the marts/ directory. For each mod - [Providers](configure/providers.md): Set up Anthropic, OpenAI, Bedrock, Ollama, and more - [Agent Modes](data-engineering/agent-modes.md): Builder, Analyst, Validator, Migrator, Researcher, Trainer - [Training](data-engineering/training/index.md): Correct the agent once, it remembers forever, your team inherits it -- [Tools](data-engineering/tools/sql-tools.md): 50+ specialized tools for SQL, dbt, and warehouses +- [Tools](data-engineering/tools/sql-tools.md): 100+ specialized tools for SQL, dbt, and warehouses diff --git a/docs/docs/getting-started/index.md b/docs/docs/getting-started/index.md index 44497fdbdd..0a59254360 100644 --- a/docs/docs/getting-started/index.md +++ b/docs/docs/getting-started/index.md @@ -17,7 +17,7 @@ hide:

The open-source data engineering harness.

-50+ specialized data engineering tools for building, validating, optimizing, and shipping data products. Use in your terminal, CI pipeline, orchestration DAGs, or as the harness for your data agents. Evaluate across platforms, independent of any single warehouse provider.

+100+ specialized data engineering tools for building, validating, optimizing, and shipping data products. Use in your terminal, CI pipeline, orchestration DAGs, or as the harness for your data agents. Evaluate across platforms, independent of any single warehouse provider.

@@ -82,7 +82,7 @@ Altimate Code goes the other direction. It connects to your **entire** stack and

---

-50+ specialized tools

+100+ specialized tools

Unlike general-purpose coding agents, every tool is purpose-built for data engineering workflows.

diff --git a/docs/docs/index.md b/docs/docs/index.md index 6416185800..e654f0695b 100644 --- a/docs/docs/index.md +++ b/docs/docs/index.md @@ -17,7 +17,7 @@ hide:

The open-source data engineering harness.

-50+ tools for building, validating, optimizing, and shipping data products. Use in your terminal, CI pipeline, orchestration DAGs, or as the harness for your data agents. Evaluate across any platform, independent of a single warehouse provider.

+100+ tools for building, validating, optimizing, and shipping data products. Use in your terminal, CI pipeline, orchestration DAGs, or as the harness for your data agents. Evaluate across any platform, independent of a single warehouse provider.

@@ -92,7 +92,7 @@ npm install -g altimate-code --- - Interactive TUI with 50+ tools, autocomplete for skills, and persistent memory across sessions. + Interactive TUI with 100+ tools, autocomplete for skills, and persistent memory across sessions. - :material-pipe-disconnected:{ .lg .middle } **CI Pipeline** diff --git a/docs/docs/llms.txt b/docs/docs/llms.txt index 65a6e3a1d0..195bb03166 100644 --- a/docs/docs/llms.txt +++ b/docs/docs/llms.txt @@ -3,7 +3,7 @@ # Generated: 2026-03-18 | Version: v0.4.2 # Source: https://docs.altimate.sh -> altimate-code is an open-source data engineering harness with 50+ tools for building, validating, optimizing, and shipping data products. Use in your terminal, CI pipeline, orchestration DAGs, or as the tool layer for your data agents. Includes a deterministic SQL Intelligence Engine (100% F1 across 1,077 queries), column-level lineage, FinOps analysis, PII detection, and dbt integration. Works with any LLM provider. Local-first, MIT-licensed. +> altimate-code is an open-source data engineering harness with 100+ tools for building, validating, optimizing, and shipping data products. Use in your terminal, CI pipeline, orchestration DAGs, or as the tool layer for your data agents. Includes a deterministic SQL Intelligence Engine (100% F1 across 1,077 queries), column-level lineage, FinOps analysis, PII detection, and dbt integration. Works with any LLM provider. Local-first, MIT-licensed. ## Get Started diff --git a/docs/docs/quickstart.md b/docs/docs/quickstart.md index 863fb0d654..4fa5741848 100644 --- a/docs/docs/quickstart.md +++ b/docs/docs/quickstart.md @@ -1,5 +1,5 @@ --- -description: "Install altimate-code and run your first SQL analysis. The open-source data engineering harness with 50+ tools for building, validating, optimizing, and shipping data products." +description: "Install altimate-code and run your first SQL analysis. 
The open-source data engineering harness with 100+ tools for building, validating, optimizing, and shipping data products." --- # Quickstart diff --git a/docs/mkdocs.yml b/docs/mkdocs.yml index f1165280fe..33965d1f46 100644 --- a/docs/mkdocs.yml +++ b/docs/mkdocs.yml @@ -1,5 +1,5 @@ site_name: altimate-code -site_description: The open-source data engineering harness. 50+ tools for building, validating, optimizing, and shipping data products. +site_description: The open-source data engineering harness. 100+ tools for building, validating, optimizing, and shipping data products. site_url: https://docs.altimate.sh repo_url: https://github.com/AltimateAI/altimate-code repo_name: AltimateAI/altimate-code From ebdf13869f788e99c39f2ed4fdf877e3ade43d67 Mon Sep 17 00:00:00 2001 From: Saurabh Arora Date: Wed, 18 Mar 2026 19:32:19 -0700 Subject: [PATCH 11/13] docs: fix broken anchors, tool counts, changelog, nav, and hero copy - Fix broken anchor link to #step-3-configure-your-warehouse-optional - Add inline "Adding Custom Skills" section in skills.md - Fix changelog upgrade command to use unscoped package name - Split merged 0.4.1/0.4.2 changelog into separate sections - Update tool count from 70+ to 100+ in configure/tools pages - Move Guides to bottom of Use section in nav - Change hero tagline to "Open-source data engineering harness." 
- Simplify install command to just npm install Co-Authored-By: Claude Opus 4.6 (1M context) --- docs/docs/configure/skills.md | 42 ++++++++++++++++++++++++++++- docs/docs/configure/tools/config.md | 2 +- docs/docs/configure/tools/index.md | 2 +- docs/docs/getting-started/index.md | 4 +-- docs/docs/quickstart.md | 2 +- docs/docs/reference/changelog.md | 5 ++-- docs/mkdocs.yml | 10 +++---- 7 files changed, 54 insertions(+), 13 deletions(-) diff --git a/docs/docs/configure/skills.md b/docs/docs/configure/skills.md index f12428c03d..6a807cce3f 100644 --- a/docs/docs/configure/skills.md +++ b/docs/docs/configure/skills.md @@ -88,7 +88,47 @@ altimate ships with built-in skills for common data engineering tasks. Type `/` | `/train` | Learn standards from documents/style guides | | `/training-status` | Dashboard of all learned knowledge | -For custom skills, see [Adding Custom Skills](#adding-custom-skills) below. +## Adding Custom Skills + +Add your own skills as Markdown files in `.altimate-code/skill/`: + +```markdown +--- +name: cost-review +description: Review SQL queries for cost optimization +--- + +Analyze the SQL query for cost optimization opportunities. +Focus on: $ARGUMENTS +``` + +`$ARGUMENTS` is replaced with whatever the user types after the skill name (e.g., `/cost-review SELECT * FROM orders` passes `SELECT * FROM orders`). + +Skills are loaded from these paths (highest priority first): + +1. `.altimate-code/skill/` (project) +2. `~/.altimate-code/skills/` (global) +3. 
Custom paths via config: + +```json +{ + "skills": { + "paths": ["./my-skills", "~/shared-skills"] + } +} +``` + +### Remote Skills + +Host skills at a URL and load them at startup: + +```json +{ + "skills": { + "urls": ["https://example.com/skills-registry.json"] + } +} +``` ## Disabling External Skills diff --git a/docs/docs/configure/tools/config.md b/docs/docs/configure/tools/config.md index 4028561792..636a229ff8 100644 --- a/docs/docs/configure/tools/config.md +++ b/docs/docs/configure/tools/config.md @@ -24,7 +24,7 @@ altimate includes built-in tools that agents use to interact with your codebase ## Data Engineering Tools -In addition to built-in tools, altimate provides 70+ specialized data engineering tools. See the [Data Engineering Tools](index.md) section for details. +In addition to built-in tools, altimate provides 100+ specialized data engineering tools. See the [Data Engineering Tools](index.md) section for details. ## Tool Permissions diff --git a/docs/docs/configure/tools/index.md b/docs/docs/configure/tools/index.md index 4009d4f0c7..7c1a387606 100644 --- a/docs/docs/configure/tools/index.md +++ b/docs/docs/configure/tools/index.md @@ -1,6 +1,6 @@ # Tools Reference -Altimate Code has 70+ specialized tools organized by function. +Altimate Code has 100+ specialized tools organized by function. | Category | Tools | Purpose | |---|---|---| diff --git a/docs/docs/getting-started/index.md b/docs/docs/getting-started/index.md index 0a59254360..29344ad3c7 100644 --- a/docs/docs/getting-started/index.md +++ b/docs/docs/getting-started/index.md @@ -15,7 +15,7 @@ hide: altimate-code

-The open-source data engineering harness.

+Open-source data engineering harness.

100+ specialized data engineering tools for building, validating, optimizing, and shipping data products. Use in your terminal, CI pipeline, orchestration DAGs, or as the harness for your data agents. Evaluate across platforms, independent of any single warehouse provider.

@@ -32,7 +32,7 @@ hide:
```bash
-npm install -g altimate-code && altimate
+npm install -g altimate-code
```
diff --git a/docs/docs/quickstart.md b/docs/docs/quickstart.md index 4fa5741848..273c5bec2b 100644 --- a/docs/docs/quickstart.md +++ b/docs/docs/quickstart.md @@ -60,7 +60,7 @@ Minimal config file option (`altimate-code.json` in your project root): altimate /discover ``` -Auto-detects your dbt projects, warehouse credentials, and installed tools. See [Full Setup](getting-started.md#step-3-configure-your-warehouse) for details on what `/discover` finds and manual configuration options. +Auto-detects your dbt projects, warehouse credentials, and installed tools. See [Full Setup](getting-started.md#step-3-configure-your-warehouse-optional) for details on what `/discover` finds and manual configuration options. **No cloud warehouse?** Use DuckDB with a local file: diff --git a/docs/docs/reference/changelog.md b/docs/docs/reference/changelog.md index 32d7a00d93..fba360674b 100644 --- a/docs/docs/reference/changelog.md +++ b/docs/docs/reference/changelog.md @@ -14,7 +14,7 @@ altimate --version ## How to upgrade ```bash -npm update -g @altimateai/altimate-code +npm update -g altimate-code ``` After upgrading, the TUI welcome banner shows what changed since your previous version. 
@@ -41,7 +41,6 @@ After upgrading, the TUI welcome banner shows what changed since your previous v
 
 - CI: faster release — build parallel with test, lower compression, tighter timeouts (#251)
 - Docker E2E tests skip in CI unless explicitly opted in (#253)
 
-## [0.4.1] - 2026-03-16
 ## [0.4.2] - 2026-03-18
 
 ### Breaking Changes
 
@@ -77,6 +76,8 @@ After upgrading, the TUI welcome banner shows what changed since your previous v
 - `packages/opencode/src/altimate/bridge/` — JSON-RPC bridge
 - `.github/workflows/publish-engine.yml` — PyPI publish workflow
 
+## [0.4.1] - 2026-03-16
+
 ### Added
 
 - Local-first tracing system replacing Langfuse (#183)
 
diff --git a/docs/mkdocs.yml b/docs/mkdocs.yml
index 33965d1f46..9a4635260b 100644
--- a/docs/mkdocs.yml
+++ b/docs/mkdocs.yml
@@ -79,11 +79,6 @@ nav:
       - Setup: getting-started/quickstart.md
   - Examples:
       - Showcase: examples/index.md
-  - Guides:
-      - Cost Optimization: data-engineering/guides/cost-optimization.md
-      - Migration: data-engineering/guides/migration.md
-      - Using with Claude Code: data-engineering/guides/using-with-claude-code.md
-      - Using with Codex: data-engineering/guides/using-with-codex.md
   - Use:
       - Agents:
           - Agent Modes: data-engineering/agent-modes.md
@@ -110,6 +105,11 @@ nav:
       - IDE: usage/ide.md
       - GitHub: usage/github.md
      - GitLab: usage/gitlab.md
+  - Guides:
+      - Cost Optimization: data-engineering/guides/cost-optimization.md
+      - Migration: data-engineering/guides/migration.md
+      - Using with Claude Code: data-engineering/guides/using-with-claude-code.md
+      - Using with Codex: data-engineering/guides/using-with-codex.md
   - Configure:
       - Overview: configure/index.md
       - Warehouses: configure/warehouses.md
From 56b413a6224d2ebc4f1165ecf1e043cf9669778a Mon Sep 17 00:00:00 2001
From: Saurabh Arora
Date: Wed, 18 Mar 2026 19:54:01 -0700
Subject: [PATCH 12/13] docs: sync agent modes from PR #282, Claude/Codex
 commands from #235, stub web UI

- Reduce agent modes from 7 to 3 (builder, analyst, plan) per PR #282
- Add SQL Write Access Control section with query classification table
- Add sql_execute_write permission to permissions reference
- Update /data to /altimate in Claude Code guide, add /configure-claude setup
- Add Codex CLI skill integration and /configure-codex setup
- Add /configure-claude and /configure-codex to commands reference
- Stub web UI page with Coming Soon notice
- Update all cross-references (getting-started, quickstart, index, tui, training, migration)

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 docs/docs/configure/agents.md                 |  61 +++--
 docs/docs/configure/commands.md               |  40 ++-
 docs/docs/configure/permissions.md            |  13 +-
 docs/docs/data-engineering/agent-modes.md     | 235 ++----------------
 .../docs/data-engineering/guides/migration.md |  12 +-
 .../guides/using-with-claude-code.md          |  20 +-
 .../guides/using-with-codex.md                |  16 +-
 docs/docs/data-engineering/training/index.md  |  25 +-
 docs/docs/getting-started.md                  |  12 +-
 docs/docs/getting-started/index.md            |   2 +-
 docs/docs/getting-started/quickstart.md       |  20 +-
 docs/docs/index.md                            |  36 +--
 docs/docs/usage/tui.md                        |   2 +-
 docs/docs/usage/web.md                        |  54 +---
 14 files changed, 181 insertions(+), 367 deletions(-)

diff --git a/docs/docs/configure/agents.md b/docs/docs/configure/agents.md
index 1e476ce3af..2876080c96 100644
--- a/docs/docs/configure/agents.md
+++ b/docs/docs/configure/agents.md
@@ -4,26 +4,46 @@ Agents define different AI personas with specific models, prompts, permissions,
 
 ## Built-in Agents
 
-### General Purpose
+| Agent | Description | Access Level |
+|-------|------------|-------------|
+| `builder` | Create and modify dbt models, SQL pipelines, and data transformations | Full read/write. SQL mutations prompt for approval. |
+| `analyst` | Explore data, run SELECT queries, inspect schemas, generate insights | Read-only (enforced). SQL writes denied. Safe bash commands auto-allowed. |
+| `plan` | Plan before acting, restricted to planning files only | Minimal: no edits, no bash, no SQL |
 
-| Agent | Description |
-|-------|------------|
-| `general` | Default general-purpose coding agent |
-| `plan` | Planning agent that analyzes before acting |
-| `build` | Build-focused agent that prioritizes code generation |
-| `explore` | Read-only exploration agent |
+### Builder
 
-### Data Engineering
+Full access mode. Can read/write files, run any bash command (with approval), execute SQL, and modify dbt models. SQL write operations (`INSERT`, `UPDATE`, `DELETE`, `CREATE`, etc.) prompt for user approval. Destructive SQL (`DROP DATABASE`, `DROP SCHEMA`, `TRUNCATE`) is hard-blocked.
 
-| Agent | Description | Permissions |
-|-------|------------|------------|
-| `builder` | Create dbt models, SQL pipelines, transformations | Full read/write |
-| `analyst` | Explore data, run SELECT queries, generate insights | Read-only (enforced) |
-| `validator` | Data quality checks, schema validation, test coverage | Read + validate |
-| `migrator` | Cross-warehouse SQL translation and migration | Read/write for migration |
+### Analyst
+
+Truly read-only mode for safe data exploration:
+
+- **File access**: Read, grep, glob without prompts
+- **SQL**: SELECT queries execute freely. Write queries are denied (not prompted, blocked entirely)
+- **Bash**: Safe commands auto-allowed (`ls`, `grep`, `cat`, `head`, `tail`, `find`, `wc`). dbt read commands allowed (`dbt list`, `dbt ls`, `dbt debug`, `dbt deps`). Everything else denied.
+- **Web**: Fetch and search allowed without prompts
+- **Schema/warehouse/finops**: All inspection tools available
 
 !!! tip
-    Use the `analyst` agent when exploring data to ensure no accidental writes. Switch to `builder` when you are ready to create or modify models.
+    Use `analyst` when exploring data to ensure no accidental writes. Switch to `builder` when you're ready to create or modify models.
+
+### Plan
+
+Planning mode with minimal permissions. Can only read files and edit plan files. No SQL, no bash, no file modifications.
+
+## SQL Write Access Control
+
+All SQL queries are classified before execution:
+
+| Query Type | Builder | Analyst |
+|-----------|---------|---------|
+| `SELECT`, `SHOW`, `DESCRIBE`, `EXPLAIN` | Allowed | Allowed |
+| `INSERT`, `UPDATE`, `DELETE`, `CREATE`, `ALTER` | Prompts for approval | Denied |
+| `DROP DATABASE`, `DROP SCHEMA`, `TRUNCATE` | Blocked (cannot override) | Blocked |
+
+The classifier detects write operations including: `INSERT`, `UPDATE`, `DELETE`, `MERGE`, `CREATE`, `DROP`, `ALTER`, `TRUNCATE`, `GRANT`, `REVOKE`, `COPY INTO`, `CALL`, `EXEC`, `EXECUTE IMMEDIATE`, `BEGIN`, `DECLARE`, `REPLACE`, `UPSERT`, `RENAME`.
+
+Multi-statement queries (`SELECT 1; INSERT INTO ...`) are classified as write if any statement is a write.
 
 ## Custom Agents
 
@@ -86,11 +106,11 @@ You are a Snowflake cost optimization expert. For every query:
 ```
 
 !!! info
-    Markdown agent files use YAML frontmatter for configuration and the body as the system prompt. This is a convenient way to define agents without editing your main config file.
+    Markdown agent files use YAML frontmatter for configuration and the body as the system prompt.
 ## Agent Permissions
 
-Each agent can have its own permission overrides that restrict or expand the default permissions:
+Each agent can have its own permission overrides:
 
 ```json
 {
@@ -99,10 +119,11 @@ Each agent can have its own permission overrides that restrict or expand the def
       "permission": {
         "write": "deny",
         "edit": "deny",
+        "sql_execute_write": "deny",
         "bash": {
-          "dbt show *": "allow",
+          "*": "deny",
           "dbt list *": "allow",
-          "*": "deny"
+          "ls *": "allow"
         }
       }
     }
@@ -117,4 +138,4 @@ Each agent can have its own permission overrides that restrict or expand the def
 
 - **TUI**: Press leader + `a` or use `/agent `
 - **CLI**: `altimate --agent analyst`
-- **In conversation**: Type `/agent validator`
+- **In conversation**: Type `/agent analyst`
diff --git a/docs/docs/configure/commands.md b/docs/docs/configure/commands.md
index 06af3d7e36..d09d6a3e91 100644
--- a/docs/docs/configure/commands.md
+++ b/docs/docs/configure/commands.md
@@ -2,7 +2,7 @@
 
 ## Built-in Commands
 
-altimate ships with four built-in slash commands:
+altimate ships with six built-in slash commands:
 
 | Command | Description |
 |---------|-------------|
@@ -10,6 +10,8 @@ altimate ships with six built-in slash commands:
 | `/discover` | Scan your data stack and set up warehouse connections. Detects dbt projects, warehouse connections from profiles/Docker/env vars, installed tools, and config files. Walks you through adding and testing new connections, then indexes schemas. |
 | `/review` | Review changes. Accepts `commit`, `branch`, or `pr` as an argument (defaults to uncommitted changes). |
 | `/feedback` | Submit product feedback as a GitHub issue. Guides you through title, category, description, and optional session context. |
+| `/configure-claude` | Configure altimate as a `/altimate` slash command in [Claude Code](https://claude.com/claude-code). Writes `~/.claude/commands/altimate.md` so you can invoke altimate from within Claude Code sessions. |
+| `/configure-codex` | Configure altimate as a skill in [Codex CLI](https://developers.openai.com/codex). Creates `~/.codex/skills/altimate/SKILL.md` so Codex can delegate data engineering tasks to altimate. |
 
 ### `/discover`
 
@@ -47,6 +49,31 @@ Submit product feedback directly from the CLI. The agent walks you through:
 
 Requires the `gh` CLI to be installed and authenticated (`gh auth login`).
 
+### `/configure-claude`
+
+Set up altimate as a tool inside Claude Code:
+
+```
+/configure-claude
+```
+
+This creates `~/.claude/commands/altimate.md`, which registers a `/altimate` slash command in Claude Code. After running this, you can use `/altimate` in any Claude Code session to delegate data engineering tasks:
+
+```
+# In Claude Code
+/altimate analyze the cost of our top 10 most expensive queries
+```
+
+### `/configure-codex`
+
+Set up altimate as a skill inside Codex CLI:
+
+```
+/configure-codex
+```
+
+This creates `~/.codex/skills/altimate/SKILL.md`. Restart Codex after running this command. Codex will then automatically invoke altimate when you ask about data engineering tasks.
+
 ## Custom Commands
 
 Custom commands let you define reusable slash commands.
@@ -109,3 +136,14 @@ Commands are loaded from:
 2. `~/.config/altimate-code/commands/` globally
 
 Press leader + `/` to see all available commands.
+
+## External CLI Integration
+
+The `/configure-claude` and `/configure-codex` commands write integration files to external CLI tools:
+
+| Command | File created | Purpose |
+|---------|-------------|---------|
+| `/configure-claude` | `~/.claude/commands/altimate.md` | Registers `/altimate` slash command in Claude Code |
+| `/configure-codex` | `~/.codex/skills/altimate/SKILL.md` | Registers altimate as a Codex CLI skill |
+
+These files allow you to invoke altimate's data engineering capabilities from within other AI coding agents.
diff --git a/docs/docs/configure/permissions.md b/docs/docs/configure/permissions.md
index 4d577c91ad..3b4e7e7557 100644
--- a/docs/docs/configure/permissions.md
+++ b/docs/docs/configure/permissions.md
@@ -86,6 +86,7 @@ Override permissions for specific agents:
 | `grep` | Yes | Search files |
 | `list` | Yes | List directories |
 | `bash` | Yes | Shell commands |
+| `sql_execute_write` | Yes | SQL write operations (INSERT, UPDATE, DELETE, etc.) |
 | `task` | Yes | Spawn subagents |
 | `lsp` | Yes | LSP operations |
 | `skill` | Yes | Execute skills |
@@ -205,20 +206,22 @@ Give each agent only the permissions it needs:
       "permission": {
         "write": "deny",
         "edit": "deny",
+        "sql_execute_write": "deny",
         "bash": {
-          "SELECT *": "allow",
-          "dbt docs *": "allow",
-          "*": "deny"
+          "*": "deny",
+          "ls *": "allow",
+          "cat *": "allow",
+          "dbt list *": "allow"
         }
       }
     },
     "builder": {
       "permission": {
+        "sql_execute_write": "ask",
         "bash": {
           "*": "ask",
           "dbt *": "allow",
-          "git *": "ask",
-          "DROP *": "deny"
+          "rm -rf *": "deny"
         }
       }
     }
diff --git a/docs/docs/data-engineering/agent-modes.md b/docs/docs/data-engineering/agent-modes.md
index afbb9e5adb..97e612edcc 100644
--- a/docs/docs/data-engineering/agent-modes.md
+++ b/docs/docs/data-engineering/agent-modes.md
@@ -1,16 +1,12 @@
 # Agent Modes
 
-altimate runs in one of seven specialized modes. Each mode has different permissions, tool access, and behavioral guardrails.
+altimate runs in one of three specialized modes. Each mode has different permissions, tool access, and behavioral guardrails.
 
 | Mode | Access | Purpose |
 |---|---|---|
 | **Builder** | Read/Write | Create and modify data pipelines |
 | **Analyst** | Read-only | Safe exploration and cost analysis |
-| **Validator** | Read + Validate | Data quality and integrity checks |
-| **Migrator** | Cross-warehouse | Dialect translation and migration |
-| **Researcher** | Read-only + Parallel | Deep multi-step investigations |
-| **Trainer** | Read-only + Training | Teach your AI teammate |
-| **Executive** | Read-only | Business-friendly reporting (no SQL jargon) |
+| **Plan** | Minimal | Planning only, no edits or execution |
 
 ## Builder
 
@@ -22,11 +18,7 @@ altimate --agent builder
 
 > Tip: `--yolo` auto-approves permission prompts for faster iteration (`altimate --yolo --agent builder`). Not recommended with live warehouse connections. Use on local/dev environments only. See [Permissions: Yolo Mode](../configure/permissions.md#yolo-mode).
 
-Builder mode follows a strict pre-execution protocol for every SQL operation:
-
-1. `sql_analyze` to check for anti-patterns
-2. `sql_validate` to verify syntax and schema references
-3. `sql_execute` to run the query
+Builder mode classifies every SQL query before execution. Read queries run freely. Write queries (`INSERT`, `UPDATE`, `DELETE`, `CREATE`, `ALTER`) prompt for approval. Destructive SQL (`DROP DATABASE`, `DROP SCHEMA`, `TRUNCATE`) is hard-blocked and cannot be overridden.
 
 ### Example: Create a staging model
 
@@ -72,7 +64,7 @@ I'll create a staging model with proper typing, deduplication, and column naming
 ### What builder can do
 
 - Create and edit SQL files, dbt models, YAML configs
-- Execute SQL (DDL/DML/DQL)
+- Execute SQL (DDL/DML/DQL) with write approval prompts
 - Run dbt commands
 - Generate tests and documentation
 - Scaffold new models from templates
@@ -89,10 +81,11 @@ altimate --agent analyst
 
 Analyst mode enforces strict guardrails:
 
-- **SELECT only.** INSERT, UPDATE, DELETE, DROP are blocked by altimate-core
-- **Cost-first.** Every query gets a cost prediction before execution
-- **LIMIT enforced.** Large scans are automatically limited
+- **SELECT only.** INSERT, UPDATE, DELETE, DROP are denied (not prompted, blocked entirely)
 - **No file writes.** Cannot modify your codebase
+- **Safe bash only.** `ls`, `grep`, `cat`, `head`, `tail`, `find`, `wc` auto-allowed. dbt read commands allowed (`dbt list`, `dbt ls`, `dbt debug`, `dbt deps`). Everything else denied.
+- **Full web access.** Fetch and search allowed without prompts
+- **All inspection tools.** Schema, warehouse, finops, lineage, and core analysis tools available
 
 ### Example: Investigate expensive queries
 
@@ -135,218 +128,40 @@ Estimated savings: ~60 credits/week (71% reduction on this query alone)
 - Analyze costs and credit consumption
 - Search warehouse metadata
 - Detect PII columns
+- Save training entries
 
 ---
 
-## Validator
+## Plan
 
-**Read + validate. For data quality and integrity checks.**
+**Minimal permissions. For planning before acting.**
 
 ```bash
-altimate --agent validator
-```
-
-### Example: Audit a model before merging
-
-```
-You: Validate the changes in models/marts/fct_revenue.sql
-
-Validator:
-
-> sql_analyze [fct_revenue.sql contents]
-  ⚠ FUNCTION_IN_FILTER: DATE_TRUNC on order_date prevents index usage
-  ✓ No other anti-patterns
-
-> lineage_check [fct_revenue.sql]
-  Sources: stg_orders.order_amount, stg_orders.order_date, dim_products.category
-  Confidence: high (no SELECT *, schema-qualified, small graph)
-
-> schema_diff [old version] [new version]
-  + Added column: revenue_category (VARCHAR)
-  ~ Modified column: total_revenue FLOAT → DECIMAL(18,2)
-  Severity: medium (type change may affect downstream)
-
-Issues found:
-1. FUNCTION_IN_FILTER on line 23: use a pre-computed date column instead
-2. Type change from FLOAT to DECIMAL(18,2): verify downstream consumers
-3. Missing test coverage for the new revenue_category column
+altimate --agent plan
 ```
 
+Plan mode restricts the agent to reading files and editing plan files only. No SQL, no bash, no file modifications. Use this to outline an approach before switching to builder to execute it.
--- -## Migrator - -**Cross-warehouse migration specialist.** +## SQL Write Access Control -```bash -altimate --agent migrator -``` +All SQL queries are classified before execution using AST-based parsing: -### Example: Migrate from Snowflake to BigQuery - -``` -You: Translate our top 5 models from Snowflake to BigQuery - -Migrator: - -> sql_translate --source snowflake --target bigquery - Input: SELECT DATEADD(day, -7, CURRENT_TIMESTAMP()) AS week_ago, - IFF(status = 'active', 1, 0) AS is_active, - TRY_TO_NUMBER(raw_amount) AS amount - FROM analytics.public.orders - QUALIFY ROW_NUMBER() OVER (PARTITION BY order_id ORDER BY updated_at DESC) = 1 - - Output: SELECT DATE_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY) AS week_ago, - IF(status = 'active', 1, 0) AS is_active, - SAFE_CAST(raw_amount AS NUMERIC) AS amount - FROM `analytics.public.orders` - QUALIFY ROW_NUMBER() OVER (PARTITION BY order_id ORDER BY updated_at DESC) = 1 - - Warnings: - - QUALIFY is supported natively in BigQuery (no change needed) - - TRY_TO_NUMBER → SAFE_CAST (NULL on failure, same behavior) - - Table reference backtick-quoted for BigQuery - -> lineage_check [source SQL] -> lineage_check [translated SQL] - ✓ Lineage preserved, same columns, same sources -``` +| Query Type | Builder | Analyst | +|-----------|---------|---------| +| `SELECT`, `SHOW`, `DESCRIBE`, `EXPLAIN` | Allowed | Allowed | +| `INSERT`, `UPDATE`, `DELETE`, `CREATE`, `ALTER` | Prompts for approval | Denied | +| `DROP DATABASE`, `DROP SCHEMA`, `TRUNCATE` | Blocked (cannot override) | Blocked | -### Supported translation paths +The classifier detects write operations including: `INSERT`, `UPDATE`, `DELETE`, `MERGE`, `CREATE`, `DROP`, `ALTER`, `TRUNCATE`, `GRANT`, `REVOKE`, `COPY INTO`, `CALL`, `EXEC`, `EXECUTE IMMEDIATE`, `BEGIN`, `DECLARE`, `REPLACE`, `UPSERT`, `RENAME`. 
-| From | To | -|---|---| -| Snowflake | BigQuery, Databricks, Redshift, PostgreSQL | -| BigQuery | Snowflake, Databricks, Redshift, PostgreSQL | -| Redshift | Snowflake, BigQuery, PostgreSQL | -| PostgreSQL | Snowflake, BigQuery, Databricks | -| MySQL | PostgreSQL, Snowflake | -| SQL Server | PostgreSQL, Snowflake | +Multi-statement queries (`SELECT 1; INSERT INTO ...`) are classified as write if any statement is a write. --- -## Researcher - -**Read-only + parallel investigation. For complex analytical questions.** - -```bash -altimate --agent researcher -``` - -Researcher mode follows a 4-phase protocol for thorough investigation: - -1. **Plan.** Outline questions, data sources, and tool sequence -2. **Gather.** Execute investigation steps, parallelize where possible -3. **Analyze.** Cross-reference findings, identify root causes -4. **Report.** Produce structured report with evidence and recommendations - -### Example: Investigate a cost spike - -``` -You: Why did our Snowflake costs spike last week? - -Researcher: Let me plan the investigation. - - Investigation Plan: - 1. Pull credit consumption trends (finops_analyze_credits) - 2. Identify top expensive queries (finops_expensive_queries) - 3. Check for deployment changes (git log) - 4. Correlate query patterns with cost changes - - Proceeding... - -> finops_analyze_credits prod-snowflake --days 14 - Last 7 days: 847 credits (↑ 38% vs prior week) - Top warehouse: TRANSFORM_WH (523 credits, 62%) - -> finops_expensive_queries prod-snowflake --days 7 - 1. Full table scan on events (89 credits, 47 runs) - 2. New model fct_attribution (45 credits, added 3 days ago) - -> bash: git log --oneline --since="7 days ago" -- models/ - abc1234 feat: add fct_attribution model - def5678 fix: remove partition filter from stg_events - - # Investigation Report - - ## Summary - Costs increased 38% due to two changes deployed 3 days ago. - - ## Key Findings - 1. 
fct_attribution model (new) scans 2.1B events rows without partition filter, costing 45 credits/week - 2. Partition filter was accidentally removed from stg_events in commit def5678, causing 47 full scans - - ## Recommendations - 1. [ ] Add date partition filter to fct_attribution (estimated savings: 40 credits/week) - 2. [ ] Restore partition filter in stg_events (estimated savings: 60 credits/week) -``` - -### What researcher can do - -- Run all read-only SQL and schema tools -- Use `bash` for git history and code analysis -- Launch parallel sub-investigations with the `task` tool -- Access `websearch` and `webfetch` for external research -- Save discoveries as training for future sessions - -### When to use researcher vs analyst - -| Use Analyst for... | Use Researcher for... | -|---|---| -| Quick data exploration | Multi-step investigations | -| Single-query answers | Root cause analysis | -| Schema browsing | Cross-referencing multiple data sources | -| Ad-hoc cost checks | Comprehensive audits with reports | - ---- - -## Trainer - -**Read-only + training management. For teaching your AI teammate.** - -```bash -altimate --agent trainer -``` - -Trainer mode helps you systematically teach the agent about your project. It cannot modify files. It only reads, explores, and manages training. - -### Example: Teach from a correction - -``` -You: I want you to know that we never use FLOAT for money columns - -Trainer: Good rule. Let me save this properly: - - Kind: rule - Name: no-float-financial - Content: "Use DECIMAL(18,2) for financial columns (*_amount, *_price, *_cost). - FLOAT causes rounding errors that compound across aggregations." - - Scope: project (shared with team via git) - Save this? - -You: Yes - -Trainer: Saved. Every agent mode will apply this from now on. - Your teammates get it when they pull. 
-``` - -### What trainer can do - -- Guide users through teaching interactively -- Read codebase files to understand existing patterns -- Save, list, and remove training entries -- Analyze training gaps and suggest what to teach next -- Review and curate training quality (stale detection, consolidation) - -### When to use trainer mode +## Custom Agent Modes -| Scenario | Why trainer mode | -|---|---| -| New project setup | Teach conventions before anyone starts building | -| New hire onboarding | Walk through what the team has taught | -| Post-incident review | Save lessons learned as permanent rules | -| Loading a style guide | Extract rules and standards from documentation | -| Quarterly audit | Remove stale entries, consolidate, fill gaps | +You can create custom agents with tailored permissions for specialized workflows like validation, migration, research, or executive reporting. See [Agent Configuration](../configure/agents.md#custom-agents) for details. -For the full guide, see [Training: Corrections That Stick](training/index.md). +For training your AI teammate, see [Training](training/index.md). diff --git a/docs/docs/data-engineering/guides/migration.md b/docs/docs/data-engineering/guides/migration.md index 08b1eb291c..c7b594ccfa 100644 --- a/docs/docs/data-engineering/guides/migration.md +++ b/docs/docs/data-engineering/guides/migration.md @@ -1,11 +1,11 @@ # Migration Guide -Use migrator mode to translate SQL across warehouse dialects while preserving lineage and correctness. +Use altimate to translate SQL across warehouse dialects while preserving lineage and correctness. -## Start migrator mode +## Start a migration ```bash -altimate --agent migrator +altimate --agent builder ``` ## Translation workflow @@ -15,7 +15,7 @@ altimate --agent migrator ``` You: Migrate our Snowflake models to BigQuery -Migrator: I'll translate each model and verify lineage is preserved. +Builder: I'll translate each model and verify lineage is preserved. 
Let me start by listing your models. > dbt_manifest ./target/manifest.json @@ -24,7 +24,7 @@ Migrator: I'll translate each model and verify lineage is preserved. ### 2. Translate with verification -For each model, the migrator: +For each model, the agent: 1. **Reads** the source SQL 2. **Translates** to target dialect @@ -66,7 +66,7 @@ Some features don't have direct equivalents: ``` You: Run all translations and report issues -Migrator: +Builder: ✓ 38/47 models translated cleanly ⚠ 6 models need manual review (VARIANT columns) ✗ 3 models use Snowflake-specific features (STREAMS, TASKS) diff --git a/docs/docs/data-engineering/guides/using-with-claude-code.md b/docs/docs/data-engineering/guides/using-with-claude-code.md index 89edd7375f..4c584a5bb8 100644 --- a/docs/docs/data-engineering/guides/using-with-claude-code.md +++ b/docs/docs/data-engineering/guides/using-with-claude-code.md @@ -4,11 +4,11 @@ altimate can work alongside Claude Code. While Claude Code handles general codin ## How it works -Claude Code can invoke altimate as a tool when working on data projects. The `/data` skill routes data engineering tasks to altimate's specialized tools. +Claude Code can invoke altimate as a tool when working on data projects. The `/altimate` command routes data engineering tasks to altimate's specialized tools. ``` # In Claude Code -/data "analyze the cost of our Snowflake warehouse" +/altimate analyze the cost of our Snowflake warehouse ``` This gives Claude Code access to: @@ -29,7 +29,7 @@ When Claude Code reviews a PR that includes SQL changes: Claude Code: I see changes to models/marts/fct_revenue.sql. Let me check for data engineering concerns. 
-> /data "analyze fct_revenue.sql for anti-patterns and estimate cost" +> /altimate analyze fct_revenue.sql for anti-patterns and estimate cost altimate: sql_analyze: No anti-patterns detected ✓ @@ -45,7 +45,7 @@ You (in Claude Code): Set up a new dbt model for our customer metrics Claude Code: I'll create the model structure. Let me use altimate to inspect the source tables first. -> /data "inspect the raw.customers and raw.orders tables" +> /altimate inspect the raw.customers and raw.orders tables altimate: schema_inspect raw.customers → 12 columns @@ -62,7 +62,7 @@ You (in Claude Code): This query is slow, can you optimize it? Claude Code: Let me analyze it with altimate's SQL tools. -> /data "analyze and optimize: SELECT * FROM events WHERE YEAR(event_date) = 2026" +> /altimate analyze and optimize: SELECT * FROM events WHERE YEAR(event_date) = 2026 altimate: sql_analyze: @@ -77,7 +77,15 @@ altimate: 1. Install altimate globally: `npm install -g altimate-code` 2. Configure warehouse connections in your project -3. Claude Code automatically discovers altimate's tools when the `/data` skill is invoked +3. Run `/configure-claude` inside altimate to set up the integration: + +```bash +altimate +# then in the TUI: +/configure-claude +``` + +This creates `~/.claude/commands/altimate.md`. You can now use `/altimate` in any Claude Code session. ## When to use which diff --git a/docs/docs/data-engineering/guides/using-with-codex.md b/docs/docs/data-engineering/guides/using-with-codex.md index 713f565322..493d3987c2 100644 --- a/docs/docs/data-engineering/guides/using-with-codex.md +++ b/docs/docs/data-engineering/guides/using-with-codex.md @@ -1,4 +1,18 @@ -# Using altimate with Codex (ChatGPT Subscription) +# Using altimate with Codex + +altimate integrates with Codex in two ways: as an **LLM provider** (use your ChatGPT subscription to power altimate) and as a **Codex skill** (invoke altimate from within Codex CLI). 
+ +## Using altimate as a Codex CLI skill + +You can delegate data engineering tasks from Codex CLI to altimate. Run `/configure-codex` inside altimate to set up the integration: + +``` +/configure-codex +``` + +This creates `~/.codex/skills/altimate/SKILL.md`. Restart Codex to pick up the new skill. Codex will then automatically invoke altimate when you ask about data engineering tasks like SQL analysis, lineage, dbt, or FinOps. + +## Using Codex as an LLM provider If you have a ChatGPT Plus or Pro subscription, you can use Codex as your LLM backend in altimate at no additional API cost. Your subscription covers all usage. diff --git a/docs/docs/data-engineering/training/index.md b/docs/docs/data-engineering/training/index.md index a13f9906c7..39050a65f4 100644 --- a/docs/docs/data-engineering/training/index.md +++ b/docs/docs/data-engineering/training/index.md @@ -21,7 +21,7 @@ Builder: Saved. I'll apply this in every future session. Your team gets it too when they pull. ``` -That's it. **2 seconds.** No editing files. No context switching. The correction becomes permanent knowledge that every agent mode (builder, analyst, validator) sees in every future session. +That's it. **2 seconds.** No editing files. No context switching. The correction becomes permanent knowledge that every agent mode (builder, analyst) sees in every future session. Research shows compact, focused context improves AI performance by 17 percentage points, while dumping comprehensive docs actually hurts by 3 points (SkillsBench, 7,308 test runs). Training delivers the right knowledge to the right agent at the right time, not everything to everyone. @@ -48,7 +48,7 @@ Point the agent at code that demonstrates a convention: ``` You: /teach @models/staging/stg_orders.sql -Trainer: I see the pattern: +Agent: I see the pattern: - source CTE → filtered CTE → final - ROW_NUMBER dedup on _loaded_at Save as pattern "staging-cte-structure"? 
@@ -91,22 +91,15 @@ Agent: I found 8 actionable rules: No meetings. No Slack messages. No "hey everyone, remember to..." -## Trainer Mode +## Systematic Teaching -For systematic teaching (not just corrections), switch to trainer mode: +For systematic teaching (not just corrections), use the `/teach` and `/train` skills in any agent mode: -```bash -altimate --agent trainer -``` - -Trainer mode is read-only and cannot modify your code. It helps you: - -- **Teach interactively**: "Let me teach you about our Databricks setup" -- **Find gaps**: "What don't you know about my project?" -- **Review training**: "Show me what the team has taught you" -- **Curate**: "Which entries are stale? What should we consolidate?" +- `/teach @file` to learn patterns from example files +- `/train @file` to learn standards from documentation +- `/training-status` to see all learned knowledge -### When to Use Trainer Mode +### When to Teach | Scenario | Why | |---|---| @@ -121,8 +114,6 @@ Training doesn't dump everything into every session. It delivers what's relevant - **Builder** gets rules and patterns first (naming conventions, SQL constraints) - **Analyst** gets glossary and context first (business terms, background knowledge) -- **Validator** gets rules and standards first (quality gates, test requirements) -- **Executive** gets glossary and playbooks first (business terms, procedures) Research shows 2-3 focused modules per task is optimal. The scoring system ensures each agent gets its most relevant knowledge first. diff --git a/docs/docs/getting-started.md b/docs/docs/getting-started.md index 06c11ca9fa..729622a6d5 100644 --- a/docs/docs/getting-started.md +++ b/docs/docs/getting-started.md @@ -89,13 +89,9 @@ altimate offers specialized agent modes for different workflows: | What do you want to do? 
| Use this agent mode | |---|---| -| Analyzing data without risk of changes | **Analyst** for read-only queries, cost analysis, data profiling | -| Building or generating dbt models | **Builder** for model scaffolding, SQL generation, ref() wiring | -| Validating data quality | **Validator** for test generation, anomaly detection, data contracts | -| Migrating across warehouses | **Migrator** for cross-dialect SQL translation, compatibility checks | -| Teaching team conventions | **Trainer**, which learns corrections and enforces naming/style rules across team | -| Research and exploration | **Researcher** for deep-dive analysis, lineage tracing, impact assessment | -| Executive summaries and reports | **Executive** for high-level overviews, cost summaries, health dashboards | +| Analyzing data without risk of changes | **Analyst** for read-only queries, cost analysis, data profiling. SQL writes are blocked entirely. | +| Building or generating dbt models | **Builder** for model scaffolding, SQL generation, ref() wiring. SQL writes prompt for approval. | +| Planning before acting | **Plan** for outlining an approach before switching to builder to execute it | Switch modes in the TUI: @@ -214,6 +210,6 @@ Generate data quality tests for all models in the marts/ directory. 
For each mod - [CLI](usage/cli.md): Subcommands, flags, and environment variables - [Config Files](configure/config.md): Full config file reference - [Providers](configure/providers.md): Set up Anthropic, OpenAI, Bedrock, Ollama, and more -- [Agent Modes](data-engineering/agent-modes.md): Builder, Analyst, Validator, Migrator, Researcher, Trainer +- [Agent Modes](data-engineering/agent-modes.md): Builder, Analyst, Plan - [Training](data-engineering/training/index.md): Correct the agent once, it remembers forever, your team inherits it - [Tools](data-engineering/tools/sql-tools.md): 100+ specialized tools for SQL, dbt, and warehouses diff --git a/docs/docs/getting-started/index.md b/docs/docs/getting-started/index.md index 29344ad3c7..e7f5c373bb 100644 --- a/docs/docs/getting-started/index.md +++ b/docs/docs/getting-started/index.md @@ -76,7 +76,7 @@ Altimate Code goes the other direction. It connects to your **entire** stack and --- - Five agent modes — Builder, Analyst, Validator, Migrator, and Executive — each with tool-level permissions you can `allow`, `ask`, or `deny` per agent. Layer on project rules via `AGENTS.md`, automatic context compaction for long sessions, and auto-formatting on every edit. Governance enforced by the harness. + Three agent modes — Builder, Analyst, and Plan — each with tool-level permissions you can `allow`, `ask`, or `deny` per agent. Create custom agents for specialized workflows. Layer on project rules via `AGENTS.md`, automatic context compaction for long sessions, and auto-formatting on every edit. Governance enforced by the harness.
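Since the pages changed above now lean on mode selection, a short sketch of launching each mode non-interactively may help reviewers. The `--agent` flag appears elsewhere in these docs (the removed trainer section used `altimate --agent trainer`); the mode names follow the new three-mode lineup, and this is illustrative usage, not an authoritative reference:

```bash
# Analyst: read-only exploration; SQL writes are blocked entirely
altimate --agent analyst

# Builder: full read/write; SQL writes prompt for approval
altimate --agent builder

# Plan: outline an approach first, then switch with /agent builder in the TUI
altimate --agent plan
```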
diff --git a/docs/docs/getting-started/quickstart.md b/docs/docs/getting-started/quickstart.md index d60c6f337f..d4395cfc05 100644 --- a/docs/docs/getting-started/quickstart.md +++ b/docs/docs/getting-started/quickstart.md @@ -290,13 +290,9 @@ altimate ships with specialized agent modes, each with its own tool permissions: | Mode | Access | Use when you want to... | |---|---|---| -| **Builder** | Read/Write | Create and modify SQL, dbt models, pipelines | -| **Analyst** | Read-only | Explore production data safely, run cost analysis | -| **Validator** | Read + Validate | Check data quality, run anti-pattern detection | -| **Migrator** | Cross-warehouse | Translate SQL between dialects, plan migrations | -| **Researcher** | Read-only + Parallel | Deep-dive investigations, lineage tracing | -| **Trainer** | Read-only + Training | Teach the agent your project conventions | -| **Executive** | Read-only | Generate business-friendly reports and summaries | +| **Builder** | Read/Write | Create and modify SQL, dbt models, pipelines. SQL writes prompt for approval. | +| **Analyst** | Read-only | Explore production data safely, run cost analysis. SQL writes denied entirely. | +| **Plan** | Minimal | Plan an approach before switching to builder to execute it | Switch modes in the TUI: @@ -418,13 +414,9 @@ Define project-wide conventions in an `AGENTS.md` file at your project root. 
The | Agent | File writes | SQL writes | Bash | Training | |---|---|---|---|---| -| Builder | allow | allow | ask | deny | -| Analyst | deny | deny (SELECT only) | deny | deny | -| Validator | deny | deny | ask | deny | -| Migrator | allow | allow | ask | deny | -| Researcher | deny | deny | allow | deny | -| Trainer | deny | deny | deny | allow | -| Executive | deny | deny | deny | deny | +| Builder | allow | ask (prompts for approval) | ask | allow | +| Analyst | deny | deny (blocked entirely) | deny (safe commands auto-allowed) | allow | +| Plan | deny | deny | deny | deny | --- diff --git a/docs/docs/index.md b/docs/docs/index.md index e654f0695b..63085ca92b 100644 --- a/docs/docs/index.md +++ b/docs/docs/index.md @@ -116,7 +116,7 @@ npm install -g altimate-code --- -
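The defaults in the table above could plausibly be overridden per agent in `altimate-code.json`. This is a sketch only — the `agents` and `permissions` key names and the individual permission keys are assumptions for illustration; consult the Permissions reference for the actual schema:

```json
{
  "agents": {
    "builder": {
      "permissions": {
        "file_write": "allow",
        "sql_write": "ask",
        "bash": "ask",
        "training": "allow"
      }
    },
    "analyst": {
      "permissions": {
        "file_write": "deny",
        "sql_write": "deny",
        "bash": "deny",
        "training": "allow"
      }
    },
    "plan": {
      "permissions": {
        "file_write": "deny",
        "sql_write": "deny",
        "bash": "deny",
        "training": "deny"
      }
    }
  }
}
```

The values reuse the `allow`/`ask`/`deny` levels described in the table.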

-**Seven specialized agents**
+**Purpose-built agent modes**

Each agent has scoped permissions and purpose-built tools for its role.

@@ -125,46 +125,24 @@ npm install -g altimate-code --- - Create dbt models, SQL pipelines, and data transformations with full read/write access. + Create dbt models, SQL pipelines, and data transformations with full read/write access. SQL writes prompt for approval. Destructive SQL is hard-blocked. - :material-chart-bar:{ .lg .middle } **Analyst** --- - Explore data, run SELECT queries, and generate insights. Read-only access is enforced. + Explore data, run SELECT queries, and generate insights. Read-only access is enforced. SQL writes are denied, not prompted. Safe bash commands auto-allowed. -- :material-check-decagram:{ .lg .middle } **Validator** +- :material-clipboard-text:{ .lg .middle } **Plan** --- - Data quality checks, schema validation, test coverage analysis, and CI gating. - -- :material-swap-horizontal:{ .lg .middle } **Migrator** - - --- - - Cross-warehouse SQL translation, schema migration, and dialect conversion workflows. - -- :material-magnify:{ .lg .middle } **Researcher** - - --- - - Deep multi-step investigations with structured reports. Root cause analysis, cost audits, deprecation checks. - -- :material-school:{ .lg .middle } **Trainer** - - --- - - Correct the agent once, it remembers forever, your team inherits it. Teach patterns, rules, and domain knowledge. - -- :material-account-tie:{ .lg .middle } **Executive** - - --- - - Business-friendly reporting. No SQL jargon. Translates technical findings into impact and recommendations. + Plan before acting. Read-only with minimal permissions. No SQL, no bash, no file modifications.
+Create custom agents with tailored permissions for specialized workflows like validation, migration, research, or executive reporting. See [Agent Configuration](configure/agents.md#custom-agents). + ---
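Because the card grid now points readers at custom agents instead of the retired built-ins, a hypothetical custom-agent definition mirroring the old Validator mode may clarify what "tailored permissions" means here. Every key name below is an assumption; see the Agent Configuration page for the real format:

```json
{
  "agents": {
    "validator": {
      "description": "Data quality checks, schema validation, test coverage analysis",
      "permissions": {
        "file_write": "deny",
        "sql_write": "deny",
        "bash": "ask"
      }
    }
  }
}
```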

Works with any LLM

diff --git a/docs/docs/usage/tui.md b/docs/docs/usage/tui.md index 7d8c187cc2..7ca9177253 100644 --- a/docs/docs/usage/tui.md +++ b/docs/docs/usage/tui.md @@ -65,7 +65,7 @@ Switch between agents during a conversation: - Press leader key + `a` to see all agents - Use `/agent ` to switch directly - Built-in agents: `general`, `plan`, `build`, `explore` -- Data engineering agents: `builder`, `analyst`, `validator`, `migrator` +- Data engineering agents: `builder`, `analyst`, `plan` ## Diff Display diff --git a/docs/docs/usage/web.md b/docs/docs/usage/web.md index 82ec166522..bd3125e3a5 100644 --- a/docs/docs/usage/web.md +++ b/docs/docs/usage/web.md @@ -1,53 +1,11 @@ -# Web +# Web UI -altimate includes a web-based interface for browser access. - -```bash -altimate web -``` - -## Configuration - -Configure the web server in `altimate-code.json`: - -```json -{ - "server": { - "port": 3000, - "hostname": "localhost", - "cors": ["https://myapp.example.com"], - "mdns": true, - "mdnsDomain": "altimate-code.local" - } -} -``` - -| Option | Default | Description | -|--------|---------|------------| -| `port` | 3000 | HTTP port | -| `hostname` | `localhost` | Bind address | -| `cors` | `[]` | Allowed CORS origins | -| `mdns` | `false` | Enable mDNS discovery | -| `mdnsDomain` | (none) | Custom mDNS domain | - -## Authentication - -Set basic auth credentials: - -```bash -export ALTIMATE_CLI_SERVER_USERNAME=admin -export ALTIMATE_CLI_SERVER_PASSWORD=secret -altimate web -``` - -## Features - -The web UI provides the same conversational interface as the TUI: +Altimate Web is a browser-based interface for interacting with altimate's data engineering tools without the terminal. It provides the same conversational agent experience as the TUI, accessible from any browser. 
- Full chat interface with streaming responses +- Agent switching between builder, analyst, and plan modes - File references and tool call results -- Agent switching -- Session management +- Session management and history -!!! note - The web UI is the general-purpose agent interface. For data-engineering-specific UIs, see the [Data Engineering guides](../data-engineering/guides/index.md). +!!! info "Coming Soon" + The web UI is currently under development. For now, use the [TUI](tui.md) or [CLI](cli.md) to interact with altimate. From 04b51e56a0ef7212b68ba06060e7068b1c852f25 Mon Sep 17 00:00:00 2001 From: Saurabh Arora Date: Wed, 18 Mar 2026 20:24:07 -0700 Subject: [PATCH 13/13] docs: fix stale llms.txt URLs, add v0.5.0 changelog entry - Fix 4 broken URLs in llms.txt (network, telemetry, security-faq, troubleshooting) to match reference/ paths in mkdocs nav - Update llms.txt version from v0.4.2 to v0.5.0 - Add missing v0.5.0 changelog entry with features and fixes Co-Authored-By: Claude Opus 4.6 (1M context) --- docs/docs/llms.txt | 10 +++++----- docs/docs/reference/changelog.md | 14 ++++++++++++++ 2 files changed, 19 insertions(+), 5 deletions(-) diff --git a/docs/docs/llms.txt b/docs/docs/llms.txt index 195bb03166..f287812765 100644 --- a/docs/docs/llms.txt +++ b/docs/docs/llms.txt @@ -1,6 +1,6 @@ # altimate-code llms.txt # AI-friendly documentation index for altimate-code -# Generated: 2026-03-18 | Version: v0.4.2 +# Generated: 2026-03-18 | Version: v0.5.0 # Source: https://docs.altimate.sh > altimate-code is an open-source data engineering harness with 100+ tools for building, validating, optimizing, and shipping data products. Use in your terminal, CI pipeline, orchestration DAGs, or as the tool layer for your data agents. Includes a deterministic SQL Intelligence Engine (100% F1 across 1,077 queries), column-level lineage, FinOps analysis, PII detection, and dbt integration. Works with any LLM provider. Local-first, MIT-licensed. 
@@ -9,7 +9,7 @@ - [Quickstart (5 min)](https://docs.altimate.sh/quickstart/): Install altimate, configure your LLM provider, connect your warehouse, and run your first query in under 5 minutes. - [Full Setup Guide](https://docs.altimate.sh/getting-started/): Complete installation, warehouse configuration for all 8 supported warehouses, LLM provider setup, and first-run walkthrough. -- [Network & Proxy](https://docs.altimate.sh/network/): Proxy configuration, CA certificate setup, firewall requirements. +- [Network & Proxy](https://docs.altimate.sh/reference/network/): Proxy configuration, CA certificate setup, firewall requirements. ## Data Engineering @@ -34,9 +34,9 @@ - [Agent Skills](https://docs.altimate.sh/configure/skills/): How to configure, discover, and add custom skills. - [Permissions](https://docs.altimate.sh/configure/permissions/): Permission levels, pattern matching, per-agent restrictions, deny rules for destructive SQL. - [Tracing](https://docs.altimate.sh/configure/tracing/): Local-first observability covering trace schema, span types, live viewing, remote OTLP exporters, and crash recovery. -- [Telemetry](https://docs.altimate.sh/configure/telemetry/): 25 anonymized event types, privacy guarantees, opt-out instructions. +- [Telemetry](https://docs.altimate.sh/reference/telemetry/): 25 anonymized event types, privacy guarantees, opt-out instructions. ## Reference -- [Security FAQ](https://docs.altimate.sh/security-faq/): 12 Q&A pairs on data handling, credentials, permissions, network endpoints, and team hardening. -- [Troubleshooting](https://docs.altimate.sh/troubleshooting/): 6 common issues with step-by-step fixes, including tool execution errors and warehouse connection setup. +- [Security FAQ](https://docs.altimate.sh/reference/security-faq/): 12 Q&A pairs on data handling, credentials, permissions, network endpoints, and team hardening. 
+- [Troubleshooting](https://docs.altimate.sh/reference/troubleshooting/): 6 common issues with step-by-step fixes, including tool execution errors and warehouse connection setup. diff --git a/docs/docs/reference/changelog.md b/docs/docs/reference/changelog.md index fba360674b..e48494e67a 100644 --- a/docs/docs/reference/changelog.md +++ b/docs/docs/reference/changelog.md @@ -21,6 +21,20 @@ After upgrading, the TUI welcome banner shows what changed since your previous v --- +## [0.5.0] - 2026-03-18 + +### Added + +- Smooth streaming mode for TUI response rendering (#281) +- Ship builtin skills to customers via postinstall (#279) +- `/configure-claude` and `/configure-codex` built-in commands (#235) + +### Fixed + +- Brew formula stuck at v0.3.1, version normalization in publish pipeline (#286) +- Harden auth field handling for all warehouse drivers (#271) +- Suppress console logging that corrupts TUI display (#269) + ## [0.4.9] - 2026-03-18 ### Added