feat: builder prompt rewrite + dbt skills consolidation + altimate-dbt CLI #174

suryaiyer95 wants to merge 10000 commits into `main`
Conversation
Co-authored-by: Adam <2363879+adamdotdevin@users.noreply.github.com>
* fix: auto-bootstrap Python engine before starting bridge

  `Bridge.start()` now calls `ensureEngine()` to download uv, create an isolated venv, and install altimate-engine before spawning the Python subprocess. `resolvePython()` also checks the managed venv path so the correct interpreter is used after bootstrapping. Previously, `resolvePython()` would fall through to system python3, which doesn't have altimate_engine installed, causing a ModuleNotFoundError on first run.

* test: add bridge client tests for ensureEngine and resolvePython

  - Export `resolvePython()` from client.ts for direct unit testing
  - Test that the ALTIMATE_CLI_PYTHON env var takes highest priority
  - Test that the managed engine venv is used when present on disk
  - Test fallback to python3 when no venvs exist
  - Test that `ensureEngine()` is called before bridge spawn
  - Mock only the bridge/engine module to avoid leaking into other tests

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
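The interpreter-resolution priority described above can be sketched as follows. This is a hypothetical simplification — the real `resolvePython()` in client.ts takes no arguments and reads process state directly; the `env`/`engineDir` parameters here are injected only to make the priority order testable.

```typescript
import { existsSync } from "node:fs"
import { join } from "node:path"

// Priority: 1) ALTIMATE_CLI_PYTHON env var, 2) managed engine venv, 3) system python3.
export function resolvePython(env: Record<string, string | undefined>, engineDir: string): string {
  const override = env["ALTIMATE_CLI_PYTHON"]
  if (override) return override
  // The managed venv created by ensureEngine() — path layout is illustrative.
  const managed = join(engineDir, "venv", "bin", "python")
  if (existsSync(managed)) return managed
  // Last resort: system interpreter (the pre-fix failure mode, since it
  // lacks altimate_engine).
  return "python3"
}
```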
Move existing data engineering docs into data-engineering/ subdirectory and add 29 new pages covering platform features: TUI, CLI, web UI, IDE and CI/CD integration, configuration, providers, tools, agents, models, themes, keybinds, commands, formatters, permissions, LSP, MCP, ACP, skills, custom tools, SDK, server, plugins, ecosystem, network, troubleshooting, and Windows/WSL. All content adapted with altimate-code branding (env vars, config paths, package names). mkdocs builds with zero warnings. Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- Remove navigation.expand so sub-sections start collapsed (less overwhelming)
- Group Configure's 16 flat items into 5 logical sub-sections: Providers & Models, Agents & Tools, Behavior, Appearance, Integrations
- Group orphaned bottom pages (Network, Troubleshooting, Windows/WSL) under Reference
- All 44 pages preserved, zero information lost

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
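The nav grouping above might look roughly like this in mkdocs.yml. This is a hypothetical excerpt — the sub-section names come from the commit message, but the page filenames and exact layout are assumptions:

```yaml
# navigation.expand deliberately absent from theme.features,
# so sub-sections start collapsed.
nav:
  - Configure:
      - Providers & Models: []   # page lists omitted here
      - Agents & Tools: []
      - Behavior: []
      - Appearance: []
      - Integrations: []
  - Reference:
      - Network: network.md          # filenames illustrative
      - Troubleshooting: troubleshooting.md
      - Windows/WSL: windows-wsl.md
```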
…148)

* Add AI Teammate repositioning design document

  Comprehensive design for repositioning altimate from "AI tool" to "AI teammate" — including a trainable knowledge system (/teach, /train, /feedback), Deep Research mode for multi-step investigations, team memory that persists via git, and a UX reframing from "agent modes" to "teammate roles."

  https://claude.ai/code/session_01V17Kk3qCZFp9ZJiuNYucoq

* Enrich design doc with OpenClaw research and proactive behaviors

  Add detailed competitive analysis from OpenClaw (self-improving memory, heartbeat scheduler, meet-users-where-they-are), Devin ($10.2B valuation, "junior partner" framing), and Factory AI (workflow embedding). Add a proactive behaviors section with background monitors (cost alerts, freshness checks, schema drift, PII scanning) and auto-promotion of learned corrections.

* Implement AI Teammate training system and Deep Research mode

  Core training infrastructure built on top of the existing memory system.

  Training store & types:
  - TrainingStore wraps MemoryStore with training-specific conventions
  - Four knowledge kinds: pattern, rule, glossary, standard
  - Structured metadata (applied count, source, acceptance tracking)
  - Training blocks stored in .opencode/memory/training/ (git-committable)
  - One person teaches, the whole team benefits via git

  Training tools:
  - training_save: save learned patterns, rules, glossary, standards
  - training_list: list all learned knowledge with applied counts
  - training_remove: remove outdated training entries

  Training skills:
  - /teach: learn patterns from example files in the codebase
  - /train: learn standards from documents or style guides
  - /training-status: dashboard of all learned knowledge

  System prompt injection:
  - Training knowledge injected alongside memory at session start
  - Structured by kind: rules first, then patterns, standards, glossary
  - Budget-limited to 6000 chars to control prompt size
  - Zero LLM calls on startup — just reads files from disk

  Deep Research agent mode:
  - New "researcher" agent for multi-step investigations
  - 4-phase protocol: Plan → Gather → Analyze → Report
  - Read-only access to all warehouse, schema, FinOps tools
  - Structured reports with evidence, root causes, action items

  Agent awareness:
  - All agent prompts updated with a training awareness section
  - Agents offer to save corrections as rules when users correct behavior
  - Training tools permitted in all agent modes

  Tests:
  - 88 new tests across 5 test files (types, store, prompt, tools, integration)
  - All tests standalone (no Instance dependency)
  - Full lifecycle tests: save → list → format → inject → remove
  - Edge cases: budget limits, meta roundtrips, coexistence with memory

* Polish AI Teammate training UX: auto-lowercase names, update detection, budget visibility

  - Fix researcher agent permissions: add training_save/remove (was read-only)
  - Auto-lowercase + space-to-hyphen name transform in training_save (ARR → arr)
  - Detect update vs. new save; show "Updated" with preserved applied count
  - Show training budget usage (chars/percent) on save, list, and remove
  - Improve training_list: group by kind, show most-applied entries, budget %
  - Improve training_remove: show available entries on not-found, applied count
  - Show similar entry names in duplicate warnings (not just a count)
  - Raise content limit from 1800 to 2500 chars
  - Export TRAINING_BUDGET constant, add budgetUsage() to TrainingPrompt
  - Add 30 new tests: auto-lowercase, update detection, budget overflow, name collision, scale (80 entries), improved messaging
  - All 118 training tests + 305 memory tests pass

* Enhance training UX: attribution, correction detection, priority sorting

  - Builder prompt: add attribution instructions (cite training entries that influenced output), correction detection (explicit + implicit patterns), conflict flagging between contradictory training entries
  - Add /teach, /train, /training-status to the Available Skills list in the builder prompt
  - Sort training entries by applied count (descending) in prompt injection so the most-used entries get priority within the 6000-char budget
  - Restructure the Teammate Training section with clear subsections

* Fix experience gaps from user journey simulations

  Simulation findings and fixes:
  1. training_save now echoes back saved content so the user can verify what was captured (new saves show a content preview, updates show an old-vs-new diff)
  2. When the training limit is reached, the error now lists existing entries sorted by applied count and suggests the least-applied entry for removal
  3. The researcher prompt now documents training_save/remove permissions (it contradicted its own permissions by saying "read-only" while having write access to training)
  4. Added 10 new tests: content echo, update diff, limit suggestion, special-character preservation (SQL -->, Jinja, HTML comments, code blocks), priority-sorting verification

  Verified: --> in content does NOT corrupt the meta block (false positive). The non-greedy regex terminates at the meta block's own --> correctly. 128 training tests + 305 memory tests all pass.

* Add self-improvement loop: applied tracking, insights, staleness detection

  OpenClaw-inspired self-improvement mechanisms:
  1. Wire up incrementApplied at injection time — counters now actually increment once per session per entry (deduped via a session-scoped set), making the "Most Applied" dashboard and priority sorting meaningful
  2. TrainingInsights module analyzes training metadata and surfaces:
     - Stale entries (7+ days old, never applied) — suggests cleanup
     - High-value entries (5+ applications) — highlights the most impactful
     - Near-limit warnings (18-19 of 20 entries per kind)
     - Consolidation opportunities (3+ entries with a shared name prefix)
  3. Insights automatically shown in training_list output
  4. 24 new tests covering all insight types, boundary conditions, session-tracking dedup, and format output

  152 training tests + 305 memory tests all pass.

* fix: add dedicated training feature flag and remove unused insight type

  - Add `ALTIMATE_DISABLE_TRAINING` flag independent of memory's disable flag
  - Use the new flag in session prompt injection and the tool registry
  - Remove unused `budget-warning` insight type from `TrainingInsight`

* fix: reset training session tracking, add error logging, fix list truncation

  - Call `TrainingPrompt.resetSession()` at session start (step === 1) to prevent applied counters from growing unbounded across sessions
  - Add structured error logging to all three training tools
  - Add a truncation indicator (`...`) when the training list preview is cut off

* fix: use `.altimate-code/memory` as primary storage path with `.opencode` fallback

  The memory store was hardcoded to `.opencode/memory/`, but the config system already uses `.altimate-code` as primary with `.opencode` as fallback. Now checks for the `.altimate-code/` directory first, falls back to `.opencode/`, and defaults to `.altimate-code/` for new projects. The result is cached per process to avoid repeated filesystem checks.

* feat: add Trainer agent mode with pattern discovery and training validation

  Add a dedicated trainer mode — the 8th primary agent — for systematically building the AI teammate's knowledge base. Unlike inline corrections in other modes, trainer mode actively scans codebases, validates training against reality, and guides knowledge curation.

  Changes:
  - New `trainer` agent mode with read-only permissions (no write/edit/sql_execute)
  - New `training_scan` tool: auto-discover patterns in models, SQL, config, tests, docs
  - New `training_validate` tool: check training compliance against the actual codebase
  - Expand `TrainingKind` to 6 types: add `context` (background "why" knowledge) and `playbook` (multi-step procedures)
  - Update `count()` to derive from the enum (prevents drift when kinds change)
  - Add KIND_HEADERS for context and playbook in prompt injection
  - Update injection order: rules first, playbooks last (budget priority)
  - Update training-save and training-list descriptions for the new kinds

* docs: add comprehensive training guide with scenarios and limitations

  - New `data-engineering/training/index.md` (350+ lines):
    - Quick start with 3 entry points (trainer mode, inline corrections, /train skill)
    - Deep dive into all 4 trainer workflows (scan, validate, teach, gap analysis)
    - 5 comprehensive scenarios: new project onboarding, post-incident learning, quarterly review, business domain teaching, pre-migration documentation
    - Explicit limitations section (not a hard gate, budget limits, no auto-learning, heuristic validation, no conflict resolution, no version history)
    - Full reference tables for tools, skills, limits, and the feature flag
  - Updated `agent-modes.md`: add Researcher and Trainer mode sections with examples, capabilities, and "when to use" guidance
  - Updated `getting-started.md`: add training link to "Next steps"
  - Updated `mkdocs.yml`: add Training nav section under Data Engineering

* fix: increase training budget to 16K chars and rewrite docs as a harness customization guide

  Training is not a CLAUDE.md replacement — it's the mechanism by which users customize the data engineering harness for their specific project. The agent works WITH the user to discover what it needs to know, rather than requiring users to write perfect static instructions.

  Changes:
  - Increase TRAINING_BUDGET from 6000 to 16000 chars (removes the #1 criticism from user simulations — the budget was worse than unlimited CLAUDE.md)
  - Complete docs rewrite with correct positioning:
    - "Customizing Your AI Teammate" framing (not "Training Your AI Teammate")
    - Research-backed "why" section (40-70% knowledge omission, guided discovery)
    - Clear comparison table: training vs. CLAUDE.md (complementary, not competing)
    - 6 real-world scenarios including Databricks, Salesforce quirks, cost spikes
    - Honest limitations section (not a linter, not an audit trail, not automatic)

* feat: merge training into memory with context-aware relevance scoring

  Replace two parallel injection systems (memory 8KB + training 16KB) with a single unified injection that scores blocks by relevance to the current agent.

  How it works:
  - All blocks (memory + training) loaded in one pass
  - Each block scored: agent tag match (+10), training kind relevance per agent (+1-5), applied count bonus (+0-3), recency (+0-2), non-training base (+5)
  - Builder sees rules/patterns first; analyst sees glossary/context first
  - Budget is 20KB unified, filled greedily by score
  - Training blocks still tracked with applied counts (fire-and-forget)

  Architecture:
  - memory/prompt.ts: new scoreBlock(), unified inject() with InjectionContext
  - memory/types.ts: UNIFIED_INJECTION_BUDGET, AGENT_TRAINING_RELEVANCE weights
  - session/prompt.ts: single inject call with agent context (was 2 separate)
  - training/prompt.ts: deprecated, delegates to MemoryPrompt (backward compat)

  No changes to: MemoryStore, TrainingStore, training tools, memory tools.

* refactor: cut training_scan and training_validate, simplify docs

  Research from 8 independent evaluations + SkillsBench (7,308 test runs) found that compact, focused context beats comprehensive docs by 20pp. The training system's value is in correction capture (2-sec saves) and team propagation (git sync) — not in regex scanning or keyword grep.

  Removed:
  - training_scan (255 lines) — regex pattern counting, not discovery
  - training_validate (315 lines) — keyword grep, not validation

  Simplified:
  - trainer.txt: removed scan/validate workflows, focused on guided teaching and curation
  - agent-modes.md: updated trainer section with a correction-focused example
  - training docs: complete rewrite with the new pitch: "Correct the agent once. It remembers forever. Your team inherits it." Backed by SkillsBench research showing compact > comprehensive.

  Net: -753 lines. 152 tests pass.

* fix: remove dead accepted/rejected fields, add training tips, expand limitations

  Gaps found by the simulation team:
  1. Remove `accepted`/`rejected` counters from TrainingBlockMeta — they were never incremented anywhere in the codebase (dead code since inception)
  2. Add 5 training discoverability tips to TUI tips (was 0 mentions in 152 tips)
  3. Expand the limitations section in docs with an honest, complete list: context budget, 20/kind limit, no approval workflow, SQL-focused, git discipline required

* docs: update site-wide docs for training and new agent modes

  - Homepage: update from "Four agents" to "Seven agents" — add Researcher, Trainer, Executive cards with descriptions
  - Getting Started: update training link to match the new pitch "Corrections That Stick"
  - Tools index: add Training row (3 tools + 3 skills) with link
  - All references now consistent with the simplified training system

* fix: address Sentry review findings — 7 bugs fixed

  1. stripTrainingMeta/parseTrainingMeta regex: remove the multiline `m` flag that could match user content starting with `<!-- training` mid-string (types.ts, store.ts)
  2. training_save content limit: reduce from 2500 to 1800 chars to account for ~200 chars of metadata overhead against MemoryStore's 2048-char limit (training-save.ts)
  3. injectTrainingOnly: change `break` to `continue` so budget-exceeding section headers skip to the next kind instead of stopping all injection (memory/prompt.ts)
  4. injectTrainingOnly: track itemCount and return an empty string when no items are injected (was returning a header-only string, inflating budget reports) (memory/prompt.ts)
  5. projectDir cache: replace the module-level singleton with a Map keyed by Instance.directory to prevent stale paths when AsyncLocalStorage context changes across concurrent requests (memory/store.ts)
  6. budgetUsage side effect: already fixed — delegates to injectTrainingOnly, which is read-only (no applied-count increment). Sentry comments were against pre-refactor code.

* fix: CI failure + new Sentry finding — orphaned headers and agent test

  1. Agent test: add researcher + trainer to the "all disabled" test so it correctly expects "no primary visible agent" when ALL agents are off
  2. Orphaned section headers: add a pre-check that at least one entry fits before adding a section header in both injectTrainingOnly and the inject memory section (prevents header-only output inflating budget reports)

* fix: address multi-model code review findings

  Fixes from a 6-model consensus review (Claude + GPT + Gemini + Kimi + MiniMax + GLM-5):
  1. training_remove: add a name-validation regex matching training_save (Gemini finding — prevents path traversal via malformed names)
  2. training_save: improve the name transform to strip ALL non-alphanumeric chars, not just whitespace (Gemini finding — "don't-use-float!" now becomes "don-t-use-float" instead of failing the regex)
  3. incrementApplied: replace the silent `.catch(() => {})` with a warning log (Kimi + GLM-5 consensus — fire-and-forget is by design, but failures should be visible in logs for debugging)

* fix: address new Sentry findings — regex m flag and off-by-one budget check

  1. formatTrainingEntry regex: remove the multiline `m` flag that could match user content mid-string (memory/prompt.ts:82)
  2. Memory block budget check: change `<` to `<=` so blocks that fit exactly into the remaining budget are included (memory/prompt.ts:204)

  3 prior Sentry findings already fixed in earlier commits:
  - projectDir cache (Map keyed by Instance.directory)
  - injectTrainingOnly header-only return (itemCount guard)
  - orphaned section headers (first-entry pre-check)

* fix: address 6-model consensus review — 4 remaining bugs

  Fixes from consensus across Claude, GPT 5.2, Gemini 3.1, Kimi K2.5, MiniMax M2.5, and GLM-5:
  1. parseTrainingMeta: check safeParse().success before accessing .data (GLM-5 + MiniMax consensus — accessing .data on a failed parse returns undefined, which could cause downstream errors)
  2. Stale detection: use `e.updated`, not `e.created`, so entries updated recently aren't incorrectly flagged as stale (MiniMax finding)
  3. training_list: pass the scope/kind filter to count() so the summary table matches the filtered entries list (GPT finding)
  4. training_remove: show hint entries from the same scope only, not all scopes (GPT + MiniMax finding)

  Prior fixes already addressed: name validation on remove (Gemini), name-transform punctuation (Gemini), silent incrementApplied catch (Kimi + GLM-5), regex m flag (MiniMax + Sentry).

Co-authored-by: Claude <noreply@anthropic.com>
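The unified scoring and greedy budget fill described in the "merge training into memory" commit can be sketched as follows. This is a hypothetical illustration — the weights mirror the commit message, but the `Block` shape, function names, and cap values are assumptions, not the actual memory/prompt.ts API:

```typescript
// Illustrative block shape; the real InjectionContext differs.
type Block = {
  text: string
  training: boolean
  agentTags: string[]
  kindRelevance: number // per-agent weight for this training kind, 1-5
  applied: number       // times this entry was applied
  ageDays: number       // days since last update
}

const BUDGET = 20_000 // unified 20KB budget from the commit message

function scoreBlock(b: Block, agent: string): number {
  let score = b.training ? 0 : 5                      // non-training base +5
  if (b.agentTags.includes(agent)) score += 10        // agent tag match +10
  if (b.training) score += b.kindRelevance            // kind relevance +1-5
  score += Math.min(3, b.applied)                     // applied bonus +0-3
  score += b.ageDays < 7 ? 2 : b.ageDays < 30 ? 1 : 0 // recency +0-2
  return score
}

function inject(blocks: Block[], agent: string): string {
  const sorted = [...blocks].sort((a, b) => scoreBlock(b, agent) - scoreBlock(a, agent))
  const out: string[] = []
  let used = 0
  for (const b of sorted) {
    if (used + b.text.length > BUDGET) continue // greedy fill: skip what doesn't fit
    out.push(b.text)
    used += b.text.length
  }
  return out.join("\n\n")
}
```

The greedy fill is why the sort matters: within the fixed budget, a builder-tagged rule with a high applied count always displaces a stale, untagged glossary entry.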
npm v7+ suppresses ALL postinstall output (stdout AND stderr), so the welcome box was never visible after `npm install` — users only saw "added 2 packages" with no feedback.

Move the full welcome box into `showWelcomeBannerIfNeeded()`, which runs in the CLI middleware before the TUI starts. The postinstall script now only writes the marker file — no output.

Flow:
1. `npm install` → postinstall writes the `.installed-version` marker
2. First `altimate` run → CLI reads the marker, shows the welcome box, deletes the marker
3. Subsequent runs → no marker, no banner

Closes #160
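The marker-file handshake above can be sketched like this. A hypothetical minimal version — function names follow the description, but the real implementation renders a full banner box and lives in the CLI middleware:

```typescript
import { existsSync, rmSync, writeFileSync } from "node:fs"

// postinstall: only write the marker — npm v7+ swallows all output anyway.
export function postinstall(markerPath: string, version: string): void {
  writeFileSync(markerPath, version)
}

// CLI middleware: show the banner exactly once, then delete the marker.
export function showWelcomeBannerIfNeeded(markerPath: string): boolean {
  if (!existsSync(markerPath)) return false
  console.log("Welcome to altimate!") // stand-in for the real welcome box
  rmSync(markerPath)
  return true
}
```

Because the marker is deleted on first read, every subsequent run takes the `return false` path and stays silent.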
Multiple scripts and CI workflows were fetching/pushing tags in ways
that caused ~900 upstream OpenCode tags to leak into our origin remote:
- CI `git fetch upstream` auto-followed tags — added `--no-tags`
- Release scripts used `git push --tags` pushing ALL local tags to
origin — changed to push only the specific new tag
- Release scripts used `git fetch --force --tags` without explicit
remote — added explicit `origin`
- `script/publish.ts` used `--tags` flag — push only `v${version}`
- Docs referenced `git fetch upstream --tags` — fixed to `--no-tags`
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
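The fixes above boil down to three git habits. A condensed sketch — `upstream`, `origin`, and `VERSION` stand in for the actual remotes and release variable in the scripts:

```shell
# Fetch from the upstream fork without auto-following its ~900 tags:
git fetch upstream --no-tags

# Push only the one new release tag — never `git push --tags`,
# which would push every local tag (including leaked upstream ones):
git push origin "v${VERSION}"

# When force-refreshing tags, name the remote explicitly:
git fetch origin --force --tags
```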
…erge (#168)

* fix: sidebar shows `OpenCode` instead of `Altimate Code` after upstream merge

  - Replace `<b>Open</b><b>Code</b>` with `<b>Altimate</b><b> Code</b>` in the sidebar footer
  - Add `altimate_change` markers to protect the branding block from future upstream merges
  - Add TUI branding guard tests to `upstream-merge-guard.test.ts`

  Closes #167

* fix: remove stale `accepted`/`rejected` properties from the `TrainingBlockMeta` test

  These fields were removed from the type, but the test wasn't updated.
* feat: add a skill for data storytelling and visualizations / data products
* fix: rename skill to data-viz
* fix: reduce skills.md and references files by 60%

Co-authored-by: Saurabh Arora <saurabh@altimate.ai>
New `packages/dbt-tools/` TypeScript package wrapping `@altimateai/dbt-integration` to provide one-shot dbt CLI operations (compile, build, test, execute, introspect).

- 16 commands: init, doctor, info, compile, compile-query, build, run, test, build-project, execute, columns, columns-source, column-values, children, parents, deps, add-packages
- Config at `~/.altimate-code/dbt.json`, auto-detected via `altimate-dbt init`
- Prerequisite validation (`doctor`) checks Python, dbt-core, and project health
- Structured JSON output to stdout, logs to stderr, `--format text` for humans
- Graceful error handling with actionable `error` + `fix` fields
- Patch `python-bridge@1.1.0` to fix a `bluebird.promisifyAll` crash
- Build with `bun build --target node` for the Node.js runtime (Bun IPC bug workaround)
- 11 tests covering config round-trip, CLI dispatch, and error paths
- `/dbt-cli` skill teaching AI agents when and how to invoke each command
… bridge

`bun build` bundles all JS into a single `dist/index.js`, causing `import.meta.url` to resolve to the bundle location instead of the original `@altimateai/dbt-integration/dist/` directory. This meant `PYTHONPATH` pointed to a nonexistent `altimate_python_packages/` dir, breaking `dbt_core_integration` imports.

- Add `script/copy-python.ts` post-build step that copies `altimate_python_packages/` from the npm package into `dist/`
- Remove developer-only build instructions from the `/dbt-cli` skill

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…nces/` system

Restructure all dbt-related skills for better AI routing and progressive context disclosure:

**New skills (5):**
- `dbt-develop` — model creation hub (merges model-scaffold, yaml-config, medallion-patterns, incremental-logic, dbt-cli)
- `dbt-test` — schema tests, unit tests, custom tests (merges generate-tests + new content)
- `dbt-troubleshoot` — diagnostic workflow for compilation, runtime, and test errors
- `dbt-analyze` — downstream impact analysis using lineage (replaces impact-analysis)
- `dbt-docs` — enhanced with `altimate-dbt` integration

**Deleted skills (7):**
- dbt-cli, model-scaffold, generate-tests, yaml-config, incremental-logic, medallion-patterns, impact-analysis

**Architecture:**
- Lean SKILL.md files for AI routing (When to Use / Do NOT Use sections)
- Deep `references/` directories for on-demand knowledge (read only when needed)
- Shared `altimate-dbt-commands.md` reference in every skill
- Iron Rules, Common Mistakes tables, and Rationalizations to Resist patterns
- All skills use `altimate-dbt` commands instead of raw `dbt`

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ce `altimate_core` tools

- Restructure `builder.txt` from a flat tool list to a skills-first architecture
- Surface 6 `altimate_core` offline SQL analysis tools (validate, semantics, lint, column_lineage, correct, grade)
- Add a structured 5-phase workflow: Explore → Plan → Analyze → Execute → Validate
- Add a Common Pitfalls section from benchmark failure analysis
- Fix dbt-tools CLI: improve error handling in `columns`, `init`, and the main dispatch
- Update dbt-develop and dbt-troubleshoot skills with `altimate-dbt` references

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
```ts
      result = (await import("./commands/graph")).children(adapter, rest)
      break
    case "parents":
      result = (await import("./commands/graph")).parents(adapter, rest)
```
Bug: The children and parents commands are missing async/await, causing them to return a Promise instead of data, which results in empty {} output.
Severity: HIGH
Suggested Fix
Add the async keyword to the children and parents function declarations in graph.ts. Then, add await before the adapter method calls (getChildrenModels and getParentModels) within those functions. Finally, add await to the calls for these commands in index.ts.
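The failure mode the reviewer describes is easy to reproduce in miniature: `JSON.stringify` on a pending Promise yields `{}` because a Promise has no enumerable properties. A hypothetical minimal repro — the `Adapter` type and function names are illustrative stand-ins for the graph.ts code, not the actual implementation:

```typescript
type Adapter = { getChildrenModels(model: string): Promise<string[]> }

// Before the fix: the wrapper forwards the Promise itself,
// which stringifies to "{}".
function childrenBroken(adapter: Adapter, model: string) {
  return adapter.getChildrenModels(model)
}

// After the fix: async + await, so the resolved data is returned
// and rejections surface where they can be handled.
async function childrenFixed(adapter: Adapter, model: string): Promise<string[]> {
  return await adapter.getChildrenModels(model)
}
```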
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: packages/dbt-tools/src/index.ts#L175-L178
Potential issue: The `children` and `parents` commands are not defined as `async`
functions and do not `await` the results from their respective adapter methods
(`getChildrenModels` and `getParentModels`). All other commands and adapter interactions
in the codebase use `async`/`await`. This inconsistency will cause the commands to
return a Promise object instead of the resolved data. Consequently, when the result is
stringified to JSON for output, it will produce an empty object `{}` instead of the
expected model graph, and any errors during execution will lead to unhandled promise
rejections.
Force-pushed from a30e344 to ce928a6
```ts
    case "children":
      result = (await import("./commands/graph")).children(adapter, rest)
      break
    case "parents":
      result = (await import("./commands/graph")).parents(adapter, rest)
```
Bug: The children and parents commands are missing await, which can cause a race condition where the adapter is disposed before the async operation completes, leading to runtime errors.
Severity: HIGH
Suggested Fix
Mark the children and parents functions in graph.ts as async. Then, add await to the calls to these functions in index.ts for the children and parents cases to ensure the operations complete before the adapter is disposed.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: packages/dbt-tools/src/index.ts#L174-L178
Potential issue: The `children` and `parents` commands in the main switch statement do
not `await` the result of their respective functions. These functions call
`adapter.getChildrenModels` and `adapter.getParentModels`, which likely return Promises,
based on the usage pattern of other adapter methods. The `finally` block disposes of the
`adapter` immediately after the command is dispatched. This creates a race condition
where the adapter can be disposed of before the asynchronous operation completes,
leading to a runtime error when the pending Promise attempts to access the disposed
adapter's resources.
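The dispose-ordering hazard described here can be demonstrated with a fake adapter. A hypothetical sketch — `FakeAdapter` and `runAwaited` are illustrative, not the dbt-tools code: because `await` inside `try` forces the operation to complete before `finally` runs, the adapter is never disposed mid-flight.

```typescript
class FakeAdapter {
  disposed = false
  async getParentModels(): Promise<string[]> {
    // Simulate real async I/O against the adapter.
    await new Promise((resolve) => setTimeout(resolve, 10))
    if (this.disposed) throw new Error("adapter disposed mid-operation")
    return ["stg_payments"]
  }
  dispose(): void {
    this.disposed = true
  }
}

async function runAwaited(adapter: FakeAdapter): Promise<string[]> {
  try {
    // Awaited: completes fully before the finally block disposes the adapter.
    return await adapter.getParentModels()
  } finally {
    adapter.dispose()
  }
}
```

Without the `await`, `finally` would dispose the adapter as soon as the call was dispatched, and the pending operation would hit the disposed-adapter error path instead.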
…, improve builder prompt

- Remove `coalesce(a, 0)` guidance from dbt-develop and dbt-troubleshoot skills (caused NULL→0 conversion failures in salesforce001 and others)
- Remove `dbt_packages/` reading instructions from the dbt-develop skill (caused the agent to spend too many events reading packages, fewer building)
- Change dbt build guidance from individual `altimate-dbt build --model` to full-project `dbt build` to ensure all models, including package models, materialize
- Add explicit `dbt deps` guidance for package installation
- Add NULL preservation guidance (don't coalesce unless explicitly required)
- Add date spine boundary guidance (derive from source data, not `current_date`)

Spider2-DBT benchmark context:
- Run 1 (pre-changes): 29/68 = 42.65%
- Run 2 (added dbt_packages reading): 21/68 = 30.88% (regression)
- Run 3 (removed coalesce/packages reading): 23/68 = 33.82%
- This commit targets the remaining issues for Run 4
…ession)

Revert `dbt build` (full project) back to `altimate-dbt build --model <name>`. The full-project build caused the agent to waste event budget and miss models.

Kept from post-baseline changes:
- `dbt deps` guidance for package installation
- NULL vs. 0 preservation pitfall
- Removed coalesce guidance that caused the salesforce001 failure
- Add a final `dbt build` step after all models are created to build package models
- Add column-casing preservation guidance (e.g., keep `MINIMUM_NIGHTS`, not `minimum_nights`)
- Add a warning against extra columns not requested by the task
- Add date spine completeness guidance (derive boundaries from source data)
- Update the dbt-develop skill with a final full-project build step

Run 6 result: 27/68 = 39.7% (up from 33.8%)
Removed:
- Training tools/skills section (unused in benchmark)
- Teammate training section (unused in benchmark)
- Verbose 5-step workflow (replaced with 3-step)
- Verbose SQL analysis tools table (compressed to 4 bullets)
- `dbt build` final step (caused 20 timeouts in run 8)

Reverted the dbt-develop skill to original (no full-project build).

Run 6: 27/68 (with dual build) → Run 7: 19/68 → Run 8: 18/68 (20 timeouts!)

Hypothesis: leaner prompt = more event budget for model creation
Strip everything non-essential:
- Remove skills tables (the agent doesn't load skills in the benchmark)
- Remove SQL analysis tools (the agent rarely uses them)
- Remove redundant pitfalls
- Keep only: principles, dbt commands, workflow, key pitfalls

Hypothesis: less system prompt = more context for the task = better results
…pt and skills

- Add principle 4 to builder prompt: "Fix everything" — run full `dbt build` after changes
- Add full project build instruction after first-build note
- Add 4 new common pitfalls: column casing, stopping at compile, skipping full build, ignoring pre-existing failures
- Add iron rule 5 to dbt-develop: fix ALL errors including pre-existing
- Expand dbt-troubleshoot iron rule to include fixing all errors, not just reported ones

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
feat: improve builder prompt, skills, and add Langfuse tracing
```ts
const limit = raw ? parseInt(raw, 10) : undefined
if (limit) return adapter.immediatelyExecuteSQLWithLimit(sql, model, limit)
```
Bug: `parseInt` on the `--limit` option can return `NaN`. Because `NaN` is falsy, the subsequent truthiness check `if (limit)` silently ignores the user's invalid input instead of raising an error.
Severity: MEDIUM
Suggested Fix
After parsing the --limit value with parseInt, add a validation step using Number.isNaN() to check if the result is NaN. If it is, throw an error to inform the user that they have provided an invalid value for the limit. This aligns with the existing validation pattern in packages/opencode/src/session/retry.ts.
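As a sketch (not the final implementation), the validation could look like the following; the helper name `parseLimit` and the error message are illustrative, not taken from the PR:

```ts
// Illustrative helper for validating the --limit option value.
// Only the parseInt/Number.isNaN pattern comes from the suggested fix;
// the function name and error wording are hypothetical.
function parseLimit(raw: string | undefined): number | undefined {
  if (raw === undefined) return undefined
  const limit = parseInt(raw, 10)
  if (Number.isNaN(limit)) {
    // Surface the invalid input instead of silently dropping the limit
    throw new Error(`Invalid value for --limit: "${raw}" (expected an integer)`)
  }
  return limit
}
```

The caller can then keep its existing branch on `limit`, since a returned number is guaranteed not to be `NaN`.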
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid.
Location: packages/dbt-tools/src/commands/execute.ts#L8-L9
Potential issue: When a non-numeric value is passed to the `--limit` option, `parseInt` returns `NaN`. The code only performs a truthiness check on the result (`if (limit)`). Since `NaN` is falsy, the condition fails and the code silently falls back to executing the SQL without a limit via `immediatelyExecuteSQL`. This contradicts the user's intent and happens without any error or warning. The expected behavior is to validate the parsed number and inform the user if their input is invalid, a pattern already established elsewhere in the codebase.
```ts
function format(result?: CommandProcessResult) {
  if (result?.stderr) return { error: result.stderr, stdout: result.stdout }
  return { stdout: result?.stdout ?? "" }
}
```
Bug: The `format` function uses the presence of `stderr` content to detect errors, rather than checking the `exit_code`. Because dbt writes warnings and progress logs to stderr, this can cause successful commands to be reported as failures.
Severity: HIGH
Suggested Fix
Modify the format function to determine success or failure based on the result.exit_code property. An exit_code of 0 indicates success. The presence of stderr should not be treated as a definitive error condition, as shown in the implementation for dbt-run.ts.
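A minimal sketch of the exit-code-based variant; the `CommandProcessResult` shape here is inferred from the snippet and the review comment, not copied from the codebase:

```ts
// Inferred result shape; field names assumed from the review comment.
interface CommandProcessResult {
  exit_code?: number
  stdout?: string
  stderr?: string
}

// exit_code === 0 means success; stderr alone is not an error signal,
// since dbt writes warnings and progress logs there.
function format(result?: CommandProcessResult): { stdout: string; error?: string } {
  const stdout = result?.stdout ?? ""
  if (result && result.exit_code !== undefined && result.exit_code !== 0) {
    return { stdout, error: result.stderr || `dbt exited with code ${result.exit_code}` }
  }
  return { stdout }
}
```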
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: packages/dbt-tools/src/commands/build.ts#L39-L41
Potential issue: The `format` function in `build.ts` and `deps.ts` incorrectly determines whether a dbt command failed. It checks if `result.stderr` has content, but dbt often writes non-error information like warnings and progress logs to stderr. The function should instead check `result.exit_code === 0` to determine success, which is the reliable indicator. This incorrect logic will cause successful dbt operations (like `build`, `test`, or `deps`) to be reported as failures if dbt writes anything to stderr, leading to user confusion and unnecessary retries.
Pull request overview
This PR introduces a new altimate-dbt CLI package (Bun/TS) for dbt operations and adds a Spider2-DBT benchmarking/evaluation harness, while updating Altimate prompts/tools to prefer altimate-dbt over raw dbt execution and patching python-bridge for runtime compatibility.
Changes:
- Add `packages/dbt-tools` (CLI + adapter wrapper around `@altimateai/dbt-integration`) with Bun tests and workspace wiring.
- Add `experiments/spider2_dbt` benchmark runner, evaluator, report generator, and SQLite tracker for results.
- Update Altimate prompts/tool exports/registry to de-emphasize or remove the `dbt_run` tool, and patch `python-bridge@1.1.0`.
Reviewed changes
Copilot reviewed 64 out of 66 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| patches/python-bridge@1.1.0.patch | Patch python-bridge child_process import to avoid promisify behavior. |
| packages/opencode/src/tool/registry.ts | Removes DbtRunTool import/registration from tool registry. |
| packages/opencode/src/altimate/tools/dbt-run.ts | Updates tool description to prefer altimate-dbt (tool now appears orphaned). |
| packages/opencode/src/altimate/prompts/builder.txt | Rewrites builder prompt to emphasize altimate-dbt workflow and principles. |
| packages/opencode/src/altimate/index.ts | Removes barrel export for dbt-run. |
| packages/dbt-tools/tsconfig.json | Adds strict TS config for the new dbt-tools package. |
| packages/dbt-tools/test/config.test.ts | Adds basic tests around config handling/JSON structure. |
| packages/dbt-tools/test/cli.test.ts | Adds CLI behavior tests via bun spawnSync. |
| packages/dbt-tools/src/index.ts | Implements CLI routing, formatting, error diagnosis, adapter lifecycle. |
| packages/dbt-tools/src/config.ts | Implements config file read/write under ~/.altimate-code/dbt.json. |
| packages/dbt-tools/src/commands/init.ts | Adds init command to discover project + Python and write config. |
| packages/dbt-tools/src/commands/info.ts | Adds info command wrapper. |
| packages/dbt-tools/src/commands/graph.ts | Adds children/parents DAG wrappers. |
| packages/dbt-tools/src/commands/execute.ts | Adds SQL execution wrapper with optional limit. |
| packages/dbt-tools/src/commands/doctor.ts | Adds doctor command to report prerequisite status. |
| packages/dbt-tools/src/commands/deps.ts | Adds deps installation and package add wrappers. |
| packages/dbt-tools/src/commands/compile.ts | Adds model/query compile wrappers. |
| packages/dbt-tools/src/commands/columns.ts | Adds model/source/values column inspection wrappers. |
| packages/dbt-tools/src/commands/build.ts | Adds build/run/test/project wrappers. |
| packages/dbt-tools/src/check.ts | Adds prerequisite checking and validation messaging. |
| packages/dbt-tools/src/adapter.ts | Creates and initializes DBTProjectIntegrationAdapter. |
| packages/dbt-tools/script/copy-python.ts | Copies bundled Python packages from @altimateai/dbt-integration into dist. |
| packages/dbt-tools/package.json | Defines new workspace package, bin entry, build/test scripts. |
| packages/dbt-tools/bin/altimate-dbt | Node bin shim that imports the built CLI entry. |
| package.json | Adds packages/dbt-tools workspace and patches python-bridge@1.1.0. |
| experiments/spider2_dbt/tracker.py | Adds SQLite-backed run/task tracking and comparison commands. |
| experiments/spider2_dbt/setup_spider2.py | Adds one-time setup (clone/download/extract/verify) for Spider2-DBT. |
| experiments/spider2_dbt/schema_introspect.py | Adds DuckDB schema summarization for prompt context. |
| experiments/spider2_dbt/run_benchmark.py | Adds benchmark runner with retries, parallelism, caching, and result aggregation. |
| experiments/spider2_dbt/requirements.txt | Adds Python deps for benchmark tooling. |
| experiments/spider2_dbt/report.py | Adds single-file HTML report generator for evaluation results. |
| experiments/spider2_dbt/prompt_template.py | Builds task prompts including YAML model discovery and DuckDB schema summary. |
| experiments/spider2_dbt/evaluate_results.py | Evaluates benchmark outputs via Spider2 duckdb_match and writes summary JSON. |
| experiments/spider2_dbt/config.py | Centralizes benchmark config (paths, defaults, leaderboard data). |
| experiments/spider2_dbt/altimate-code-dev.sh | Adds local dev wrapper script (currently hardcoded path). |
| experiments/spider2_dbt/.gitignore | Ignores cloned repo/workspace/results artifacts for experiments. |
| bun.lock | Updates lockfile for new workspace/package deps and patched deps. |
| .opencode/skills/yaml-config/SKILL.md | Removes legacy skill doc. |
| .opencode/skills/model-scaffold/SKILL.md | Removes legacy skill doc. |
| .opencode/skills/medallion-patterns/SKILL.md | Removes legacy skill doc. |
| .opencode/skills/incremental-logic/SKILL.md | Removes legacy skill doc. |
| .opencode/skills/impact-analysis/SKILL.md | Removes legacy skill doc. |
| .opencode/skills/generate-tests/SKILL.md | Removes legacy skill doc. |
| .opencode/skills/dbt-troubleshoot/SKILL.md | Adds new dbt-troubleshoot skill doc using altimate-dbt. |
| .opencode/skills/dbt-troubleshoot/references/test-failures.md | Adds troubleshooting reference for test failures. |
| .opencode/skills/dbt-troubleshoot/references/runtime-errors.md | Adds troubleshooting reference for runtime errors. |
| .opencode/skills/dbt-troubleshoot/references/compilation-errors.md | Adds troubleshooting reference for compilation errors. |
| .opencode/skills/dbt-troubleshoot/references/altimate-dbt-commands.md | Adds altimate-dbt command reference (troubleshoot skill). |
| .opencode/skills/dbt-test/SKILL.md | Adds new dbt-test skill doc using altimate-dbt. |
| .opencode/skills/dbt-test/references/unit-test-guide.md | Adds dbt unit testing guide. |
| .opencode/skills/dbt-test/references/schema-test-patterns.md | Adds schema-test patterns reference. |
| .opencode/skills/dbt-test/references/custom-tests.md | Adds custom test patterns reference. |
| .opencode/skills/dbt-test/references/altimate-dbt-commands.md | Adds altimate-dbt command reference (test skill). |
| .opencode/skills/dbt-docs/SKILL.md | Updates dbt docs skill to be altimate-dbt-driven + adds references. |
| .opencode/skills/dbt-docs/references/documentation-standards.md | Adds documentation standards reference. |
| .opencode/skills/dbt-docs/references/altimate-dbt-commands.md | Adds altimate-dbt command reference (docs skill). |
| .opencode/skills/dbt-develop/SKILL.md | Adds new dbt-develop skill doc using altimate-dbt. |
| .opencode/skills/dbt-develop/references/yaml-generation.md | Adds YAML generation reference. |
| .opencode/skills/dbt-develop/references/medallion-architecture.md | Adds medallion architecture reference. |
| .opencode/skills/dbt-develop/references/layer-patterns.md | Adds dbt layering patterns reference. |
| .opencode/skills/dbt-develop/references/incremental-strategies.md | Adds incremental strategies reference. |
| .opencode/skills/dbt-develop/references/common-mistakes.md | Adds common mistakes reference. |
| .opencode/skills/dbt-develop/references/altimate-dbt-commands.md | Adds altimate-dbt command reference (develop skill). |
| .opencode/skills/dbt-analyze/SKILL.md | Adds new dbt-analyze impact analysis skill doc. |
| .opencode/skills/dbt-analyze/references/lineage-interpretation.md | Adds lineage interpretation reference. |
| .opencode/skills/dbt-analyze/references/altimate-dbt-commands.md | Adds altimate-dbt command reference (analyze skill). |
```python
    FROM task_results a
    FULL OUTER JOIN task_results b ON a.instance_id = b.instance_id AND b.run_id = ?
    WHERE a.run_id = ?
    ORDER BY instance_id
""", (run2, run1)).fetchall()
```
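For context, a self-contained sketch of this per-task run comparison pattern; the table and column names below are illustrative, not the tracker's actual schema. Note that `FULL OUTER JOIN` requires SQLite 3.39+, so this simplified variant uses a `LEFT JOIN`, which covers only tasks present in the first run:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_results (run_id TEXT, instance_id TEXT, passed INTEGER)")
conn.executemany(
    "INSERT INTO task_results VALUES (?, ?, ?)",
    [("run1", "a001", 1), ("run1", "a002", 0), ("run2", "a002", 1)],
)
# Compare run1 vs run2 per task; a NULL 'after' means the task is missing from run2.
rows = conn.execute(
    """
    SELECT a.instance_id, a.passed AS before, b.passed AS after
    FROM task_results a
    LEFT JOIN task_results b ON a.instance_id = b.instance_id AND b.run_id = ?
    WHERE a.run_id = ?
    ORDER BY a.instance_id
    """,
    ("run2", "run1"),
).fetchall()
```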
```ts
case "children":
  result = (await import("./commands/graph")).children(adapter, rest)
  break
case "parents":
  result = (await import("./commands/graph")).parents(adapter, rest)
  break
```
```ts
export function children(adapter: DBTProjectIntegrationAdapter, args: string[]) {
  const model = flag(args, "model")
  if (!model) return { error: "Missing --model" }
  return adapter.getChildrenModels({ table: model })
}

export function parents(adapter: DBTProjectIntegrationAdapter, args: string[]) {
  const model = flag(args, "model")
  if (!model) return { error: "Missing --model" }
  return adapter.getParentModels({ table: model })
}
```
```ts
const raw = flag(args, "limit")
const limit = raw ? parseInt(raw, 10) : undefined
if (limit) return adapter.immediatelyExecuteSQLWithLimit(sql, model, limit)
return adapter.immediatelyExecuteSQL(sql, model)
```
```ts
function python(): string {
  for (const cmd of ["python3", "python"]) {
    try {
      return execFileSync("which", [cmd], { encoding: "utf-8" }).trim()
    } catch {}
```
```python
SPIDER2_REPO_URL = "https://github.com/xlang-ai/Spider2.git"
# Pin to a known-good commit for reproducibility
SPIDER2_COMMIT = "main"
```
```python
total_elapsed = time.perf_counter() - total_start
skipped = sum(1 for r in results if load_incremental(r["instance_id"]) is not None and r.get("_cached", False))
```
```bash
#!/bin/bash
exec bun run --cwd /Users/surya/code/altimateai/altimate-code/packages/opencode --conditions=browser src/index.ts "$@"
```
```ts
import { WarehouseAddTool } from "../altimate/tools/warehouse-add"
import { WarehouseRemoveTool } from "../altimate/tools/warehouse-remove"
import { WarehouseDiscoverTool } from "../altimate/tools/warehouse-discover"
import { DbtRunTool } from "../altimate/tools/dbt-run"

import { DbtManifestTool } from "../altimate/tools/dbt-manifest"
import { DbtProfilesTool } from "../altimate/tools/dbt-profiles"
```
✅ Tests — All Passed

- TypeScript — passed
- Python — passed
Summary
- `builder.txt` from flat tool list to skills-first architecture with `altimate_core` offline SQL analysis tools surfaced
- `altimate-dbt` CLI (`packages/dbt-tools/`) — 16 commands wrapping `@altimateai/dbt-integration` for one-shot dbt operations
- `references/` system for lean AI routing
- `columns`, `init`, and main dispatch

Changes

Builder Prompt (`packages/opencode/src/altimate/prompts/builder.txt`)
- `altimate_core` tools (validate, semantics, lint, column_lineage, correct, grade) — previously invisible to the agent

`altimate-dbt` CLI (`packages/dbt-tools/`)
- `~/.altimate-code/dbt.json`, auto-detected via `altimate-dbt init`
- `altimate_python_packages` bundling

Skills Consolidation (`.opencode/skills/`)
- `references/` directory for progressive context

Test plan
- `altimate-dbt init` detects dbt project and creates config
- `altimate-dbt build --model <name>` compiles and runs models
- `altimate_core_*` tools appear in agent's tool list

🤖 Generated with Claude Code