DataRecce · iamcxa · Mar 4, 2026
diff --git a/.claude/skills/recce-mcp-dev/SKILL.md b/.claude/skills/recce-mcp-dev/SKILL.md
@@ -0,0 +1,82 @@
+---
+name: recce-mcp-dev
+description: Use when modifying recce/mcp_server.py, MCP tool handlers, error classification, or MCP-related tests. Also use when adding new MCP tools or changing tool response formats.
+---
+
+# Recce MCP Server Development
+
+## Architecture
+
+`RecceMCPServer` registers `list_tools`/`call_tool` handlers via MCP SDK `Server`. `call_tool` dispatches to `_tool_*` methods, classifies errors, logs/emits metrics, re-raises.
+
+Entry point `run_mcp_server()` pops `single_env` before passing kwargs to `load_context()`.
+
+## Key Patterns
+
+**Error classification** — Shared indicator lists defined in `recce/tasks/rowcount.py`. Priority order (`PERMISSION_DENIED` > `TABLE_NOT_FOUND` > `SYNTAX_ERROR`) enforced by `_classify_db_error()` in `mcp_server.py` and `_query_row_count()` in `rowcount.py`. Classified → `logger.warning()` + `sentry_metrics.count()` (when sentry_sdk available). Unclassified → `logger.error()` + traceback.
+
+**MCP SDK quirk** — Handler must **raise** for SDK to set `isError=True`.
+
+**Response contracts** — See CLAUDE.md. Additive `_meta` only. `summary.py`: guard with `is None`, not `dict.get(key, 0)`.
+
+**Single-env** — `_maybe_add_single_env_warning()` adds `_warning` to diff results. Descriptions get conditional note.
+
+## Testing (Three Layers)
+
+| Layer | File | Data Source | Runs In | Purpose |
+|-------|------|-------------|---------|---------|
+| Unit | `tests/test_mcp_server.py` | Mock `RecceContext` | CI (`pytest`) | Logic correctness — tool handlers, error classification, response format |
+| Integration | `tests/test_mcp_e2e.py` | `DbtTestHelper` + DuckDB (fixed data) | CI (`pytest`) | MCP protocol works end-to-end via anyio memory streams |
+| Smoke (E2E) | `/recce-mcp-e2e` skill | User's real dbt project + real database | Manual | All 8 tools return valid results against real data |
+
+Each new MCP feature or behavior change should be covered at all three layers.
+
+## Test Coverage Gap Analysis
+
+After completing a round of MCP changes (defined under "What counts as a round" in the E2E Verification Gate section below), proactively scan for missing test coverage across the three layers before asking about E2E verification.
+
+**How to check:**
+1. Identify what changed — new tool handler? new error path? new response field?
+2. For each change, verify coverage exists at each layer:
+   - **Unit**: Does `tests/test_mcp_server.py` have a test case for the new behavior? (happy path + error path)
+   - **Integration**: Does `tests/test_mcp_e2e.py` exercise the new tool/feature via MCP protocol?
+   - **Smoke**: Will `/recce-mcp-e2e` template cover the new tool? (If a new tool was added, the template may need updating)
+
+**If gaps are found**, report them to the user before the E2E gate prompt:
+
+> Test coverage gaps found:
+> - Unit: missing test for `_tool_foo` error path when table not found
+> - Integration: `test_mcp_e2e.py` does not exercise `foo` tool
+> - Smoke: `/recce-mcp-e2e` template does not include `foo` tool
+>
+> Want to fill these gaps before running E2E?
+
+**Do NOT scan** after: test-only changes, comment/doc edits, import reordering.
+
+## E2E Verification Gate
+
+After each meaningful round of MCP changes, you MUST ask the user:
+
+> MCP changes complete for this round. Run `/recce-mcp-e2e` to verify?
+
+If the user says yes, invoke `/recce-mcp-e2e`. If a dbt project path was used earlier in this session, reuse it automatically; otherwise ask.
+
+**What counts as "a round":**
+- A tool handler added or modified + its unit tests pass
+- Error classification logic changed + tests pass
+- Single-env or response format changed + tests pass
+
+**Do NOT ask** after: test-only changes, comment/doc edits, import reordering.
+
+**This is separate from `tests/test_mcp_e2e.py`** — that file tests with DbtTestHelper + DuckDB in CI. `/recce-mcp-e2e` verifies all 8 tools against a real dbt project with a real database.
+
+## Pitfalls
+
+- `sentry_sdk` import: `# pragma: no cover` on except (CI always has it)
+- Python 3.9: `Union[X, Y]` not `X | Y`
+- Pre-commit: black/isort may reformat — re-stage and commit
+- `run.py` `schema_diff_should_be_approved()` try/except is intentional (ensures check creation)
+
+## File Map
+
+`recce/mcp_server.py` (server + handlers), `recce/tasks/rowcount.py` (error indicators, RowCountStatus), `recce/run.py` (CLI preset), `recce/summary.py` (display logic), `recce/event/__init__.py` (Sentry)
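
Note on the **Error classification** pattern in the skill above: the priority order can be sketched as follows. This is a minimal illustration, not the code in `recce/mcp_server.py`. The indicator strings and the `classify_db_error` helper are hypothetical stand-ins for the shared lists in `recce/tasks/rowcount.py` and for `_classify_db_error()`; only the three class names and their priority come from the skill text.

```python
from typing import Optional

# Hypothetical indicator substrings; the real lists live in recce/tasks/rowcount.py.
PERMISSION_DENIED_INDICATORS = ["permission denied", "access denied"]
TABLE_NOT_FOUND_INDICATORS = ["does not exist", "not found"]
SYNTAX_ERROR_INDICATORS = ["syntax error"]

# List order encodes the priority: PERMISSION_DENIED > TABLE_NOT_FOUND > SYNTAX_ERROR.
PRIORITY = [
    ("PERMISSION_DENIED", PERMISSION_DENIED_INDICATORS),
    ("TABLE_NOT_FOUND", TABLE_NOT_FOUND_INDICATORS),
    ("SYNTAX_ERROR", SYNTAX_ERROR_INDICATORS),
]


def classify_db_error(message: str) -> Optional[str]:
    """Return the highest-priority class whose indicator appears in the message."""
    lowered = message.lower()
    for error_class, indicators in PRIORITY:
        if any(indicator in lowered for indicator in indicators):
            return error_class
    # Unclassified: per the skill, the caller logs at error level with a traceback.
    return None
```

Because the first match wins, a message that mentions both a permission problem and a missing table is reported as `PERMISSION_DENIED`. `Optional[str]` is used instead of `str | None` to stay compatible with Python 3.9, as the Pitfalls list requires.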
diff --git a/.claude/skills/recce-mcp-e2e/SKILL.md b/.claude/skills/recce-mcp-e2e/SKILL.md
@@ -0,0 +1,68 @@
+---
+name: recce-mcp-e2e
+description: Use when MCP server code is modified and needs full E2E verification against a real dbt project. Triggers after changes to recce/mcp_server.py, MCP tool handlers, single-env logic, or error classification. Also use before merging MCP PRs.
+---
+
+# MCP E2E Verification
+
+Full-stack verification of all 8 MCP tools against a real dbt project.
+
+## When to Use
+
+- After modifying `recce/mcp_server.py` or `_tool_*` handlers
+- After changing single-env logic or error classification
+- Before merging any MCP-related PR
+- **Not for**: unit test changes only, frontend-only changes, docs-only changes
+
+## Usage
+
+Invoke as `/recce-mcp-e2e` or `/recce-mcp-e2e <project_path>`.
+
+- **With argument**: use the given path as the dbt project directory
+- **Without argument**: ask the user for the dbt project path
+
+The project directory must contain `target/manifest.json` and `target-base/manifest.json`.
+
+## Process
+
+1. **Resolve project path** from argument or user input
+2. **Validate** `target/` and `target-base/` exist with `manifest.json`
+3. **Detect recce source** — find the repo root containing `recce/mcp_server.py`. If `recce-nightly` is also installed (`pip show recce recce-nightly`), set `PYTHONPATH=<RECCE_REPO_ROOT>:$PYTHONPATH`
+4. **Generate** `test_mcp_e2e.py` in the project directory from `test_mcp_e2e_template.py` (in this skill directory). Replace `PROJECT_DIR_PLACEHOLDER` with the resolved absolute path.
+5. **Execute** the generated script with Python from the project directory, adding the `PYTHONPATH` prefix from step 3 when it applies
+6. **Report** results — all 13 checks must show PASS. Expected output:
+   ```
+   === FULL MODE (8 tools) ===
+     PASS lineage_diff: PASS
+     ...
+   === SINGLE-ENV MODE ===
+     PASS row_count_diff (_warning): PASS
+     ...
+   ALL PASS
+   ```
+7. **Clean up** — delete `test_mcp_e2e.py`
+
+## Quick Reference
+
+| Test Suite | Checks | What's Verified |
+|-----------|--------|----------------|
+| Full mode (8 tools) | lineage_diff, schema_diff, row_count_diff, query, query_diff, profile_diff, list_checks, run_check | Non-empty results from each tool |
+| Single-env _warning (3) | row_count_diff, query_diff, profile_diff | `_warning` field present with `SINGLE_ENV_WARNING` |
+| Single-env no _warning (2) | lineage_diff, schema_diff | `_warning` field NOT present |
+
+**Additional manual checks** (not in script):
+
+| Check | Command/Action |
+|-------|---------------|
+| --help | `recce mcp-server --help` shows Prerequisites section |
+| Server modes | Non-server mode: `list_tools` returns only lineage_diff + schema_diff |
+
+## Common Mistakes
+
+| Problem | Fix |
+|---------|-----|
+| `ImportError: cannot import name 'SINGLE_ENV_WARNING'` | recce-nightly conflict — use `PYTHONPATH=<RECCE_REPO_ROOT>:$PYTHONPATH` |
+| lineage_diff returns empty | Use `view_mode="all"` (default `changed_models` filters out unchanged) |
+| list_checks returns empty | Preset checks from `recce.yml` must be loaded via `load_preset_checks()` — script handles this |
+| `portalocker` FileNotFoundError on exit | Cosmetic thread error in event collector — does not affect results |
+| Single-env test uses target-base | By design — `load_context` needs both, `single_env=True` flag simulates the mode |
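
Note on steps 2 and 4 of the Process list above: validation and script generation could look roughly like this. It is a sketch under stated assumptions. `validate_project` and `render_script` are hypothetical helper names, not functions in recce; the manifest paths and the `PROJECT_DIR_PLACEHOLDER` token are taken from the skill text.

```python
from pathlib import Path

PLACEHOLDER = "PROJECT_DIR_PLACEHOLDER"


def validate_project(project_dir: Path) -> list:
    """Return the required manifest paths that are missing (empty list means valid)."""
    required = [
        project_dir / "target" / "manifest.json",
        project_dir / "target-base" / "manifest.json",
    ]
    return [str(path) for path in required if not path.is_file()]


def render_script(template_text: str, project_dir: Path) -> str:
    """Substitute the resolved absolute project path into the template text."""
    # as_posix() keeps forward slashes, which avoids backslash escapes when the
    # path ends up inside a Python string literal.
    return template_text.replace(PLACEHOLDER, project_dir.resolve().as_posix())
```

Returning the list of missing paths, rather than a boolean, lets the caller tell the user exactly which `dbt docs generate` run is still needed.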
diff --git a/.claude/skills/recce-mcp-e2e/test_mcp_e2e_template.py b/.claude/skills/recce-mcp-e2e/test_mcp_e2e_template.py
@@ -0,0 +1,143 @@
+"""MCP E2E — temporary, delete after verification."""
+
+import asyncio
+import json
+import os
+import sys
+
+PROJECT_DIR = "PROJECT_DIR_PLACEHOLDER"
+os.chdir(PROJECT_DIR)
+
+TOOL_METHODS = {
+    "lineage_diff": "_tool_lineage_diff",
+    "schema_diff": "_tool_schema_diff",
+    "row_count_diff": "_tool_row_count_diff",
+    "query": "_tool_query",
+    "query_diff": "_tool_query_diff",
+    "profile_diff": "_tool_profile_diff",
+    "list_checks": "_tool_list_checks",
+    "run_check": "_tool_run_check",
+}
+
+
+def discover_model(manifest_path):
+    with open(manifest_path) as f:
+        manifest = json.load(f)
+    for node in manifest.get("nodes", {}).values():
+        if node.get("resource_type") == "model":
+            return node["name"]
+    return None
+
+
+MODEL = discover_model(os.path.join(PROJECT_DIR, "target", "manifest.json"))
+if not MODEL:
+    print("ERROR: No model found in manifest")
+    sys.exit(1)
+
+TOOL_ARGS = {
+    "lineage_diff": {"select": MODEL, "view_mode": "all"},
+    "schema_diff": {"select": MODEL},
+    "row_count_diff": {"node_names": [MODEL]},
+    "query": {"sql_template": f"SELECT count(*) as cnt FROM {{{{ ref('{MODEL}') }}}}"},
+    "query_diff": {"sql_template": f"SELECT count(*) as cnt FROM {{{{ ref('{MODEL}') }}}}"},
+    "profile_diff": {"model": MODEL},
+    "list_checks": {},
+    "run_check": None,  # resolved after list_checks
+}
+
+WARNING_TOOLS = {"row_count_diff", "query_diff", "profile_diff"}
+NO_WARNING_TOOLS = {"lineage_diff", "schema_diff"}
+
+
+async def call_tool(server, name, args):
+    return await getattr(server, TOOL_METHODS[name])(args)
+
+
+async def test_full_mode(ctx):
+    from recce.config import RecceConfig
+    from recce.mcp_server import RecceMCPServer
+    from recce.run import load_preset_checks
+
+    config_file = os.path.join(PROJECT_DIR, "recce.yml")
+    if os.path.exists(config_file):
+        config = RecceConfig(config_file=config_file)
+        preset_checks = config.get("checks", [])
+        if preset_checks:
+            load_preset_checks(preset_checks)
+
+    server = RecceMCPServer(ctx)
+    results = {}
+
+    for name, args in TOOL_ARGS.items():
+        if name == "run_check":
+            continue
+        try:
+            result = await call_tool(server, name, args)
+            ok = isinstance(result, (dict, list)) and len(result) > 0
+            results[name] = "PASS" if ok else "FAIL (empty)"
+            if name == "list_checks" and isinstance(result, dict):
+                checks = result.get("checks", [])
+                if checks:
+                    TOOL_ARGS["run_check"] = {"check_id": checks[0]["check_id"]}
+        except Exception as e:
+            results[name] = f"ERROR: {e}"
+
+    if TOOL_ARGS["run_check"]:
+        try:
+            result = await call_tool(server, "run_check", TOOL_ARGS["run_check"])
+            results["run_check"] = "PASS" if result else "FAIL"
+        except Exception as e:
+            results["run_check"] = f"ERROR: {e}"
+    else:
+        results["run_check"] = "SKIP (no checks in recce.yml)"
+
+    return results
+
+
+async def test_single_env(ctx):
+    from recce.mcp_server import SINGLE_ENV_WARNING, RecceMCPServer
+
+    server = RecceMCPServer(ctx, single_env=True)
+    results = {}
+
+    for name in WARNING_TOOLS:
+        try:
+            result = await call_tool(server, name, TOOL_ARGS[name])
+            has = "_warning" in result and result["_warning"] == SINGLE_ENV_WARNING
+            results[f"{name} (_warning)"] = "PASS" if has else "FAIL"
+        except Exception as e:
+            results[f"{name} (_warning)"] = f"ERROR: {e}"
+
+    for name in NO_WARNING_TOOLS:
+        try:
+            result = await call_tool(server, name, TOOL_ARGS[name])
+            has = "_warning" in result if isinstance(result, dict) else False
+            results[f"{name} (no _warning)"] = "PASS" if not has else "FAIL"
+        except Exception as e:
+            results[f"{name} (no _warning)"] = f"ERROR: {e}"
+
+    return results
+
+
+async def main():
+    from recce.core import load_context
+
+    ctx = load_context(target_path="target", target_base_path="target-base")
+
+    print("=== FULL MODE (8 tools) ===")
+    full = await test_full_mode(ctx)
+    for k, v in full.items():
+        print(f"  {'PASS' if v.startswith('PASS') else 'FAIL'} {k}: {v}")
+
+    print("\n=== SINGLE-ENV MODE ===")
+    single = await test_single_env(ctx)
+    for k, v in single.items():
+        print(f"  {'PASS' if v.startswith('PASS') else 'FAIL'} {k}: {v}")
+
+    all_pass = all(v.startswith("PASS") for v in {**full, **single}.values())
+    print(f"\n{'ALL PASS' if all_pass else 'SOME FAILED'}")
+    return 0 if all_pass else 1
+
+
+if __name__ == "__main__":
+    sys.exit(asyncio.run(main()))
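
Note on the result roll-up at the end of the template: the standalone sketch below makes two consequences of the status strings explicit. `summarize` is a hypothetical name used only for illustration. The sketch matches on the `PASS` prefix, so an error message that merely contains the letters `PASS` is not counted as a pass. As in `main()`, a `SKIP` status is printed as `FAIL` and makes the exit code non-zero, which is consistent with the skill's requirement that all 13 checks pass.

```python
def summarize(results: dict) -> int:
    """Print one line per check and return a process exit code (0 = all passed)."""
    for name, status in results.items():
        label = "PASS" if status.startswith("PASS") else "FAIL"
        print(f"  {label} {name}: {status}")
    all_pass = all(status.startswith("PASS") for status in results.values())
    print("ALL PASS" if all_pass else "SOME FAILED")
    return 0 if all_pass else 1
```

In practice this means a project without preset checks in `recce.yml` cannot produce `ALL PASS`, because `run_check` is skipped.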
diff --git a/AGENTS.md b/AGENTS.md
@@ -49,6 +49,7 @@ Recce is a data validation and review tool for dbt projects. It helps data teams
 | **Deps Check (Python)** | `make deps-check-python` |
 | **Deps Check (Frontend)** | `make deps-check-frontend` |
 | **Deps Check (All)** | `make deps-check` |
+| **Coverage (targeted)** | `python -m pytest tests/test_foo.py --cov=recce.module --cov-report=term-missing` |
 
 ---
 
@@ -64,6 +65,7 @@ Recce is a data validation and review tool for dbt projects. It helps data teams
 | `js/packages/storybook/` | Component stories and visual tests |
 | `tests/` | Python unit tests |
 | `integration_tests/` | dbt/SQLMesh integration tests |
+| `.claude/skills/` | Project-level Claude Code skills |
 
 ## Where to Add Code
 

diff --git a/pyproject.toml b/pyproject.toml
@@ -20,7 +20,7 @@ dependencies = [
     "pydantic",
     "jinja2",
     "rich>=12.0.0",
-    "sentry-sdk",
+    "sentry-sdk>=2.44.0",
     "watchdog",
     "websockets",
     "py-markdown-table",

diff --git a/recce/cli.py b/recce/cli.py
@@ -1753,6 +1753,17 @@ def mcp_server(state_file, sse, host, port, **kwargs):
 
     STATE_FILE is the path to the recce state file (optional).
 
+    \b
+    Prerequisites:
+        Development dbt artifacts (target/) must exist before starting.
+        Base artifacts (target-base/) are recommended for full diffing.
+        - Development: dbt docs generate            (creates target/)
+        - Base: dbt docs generate --target-path target-base
+                                                     (creates target-base/)
+        Without base artifacts, the server starts in single-environment
+        mode where diff tools compare the current environment against
+        itself (no changes expected).
+
     Examples:\n
 
     \b
@@ -1804,6 +1815,22 @@ def mcp_server(state_file, sse, host, port, **kwargs):
         state_loader = create_state_loader_by_args(state_file, **kwargs)
         kwargs["state_loader"] = state_loader
 
+    # Check Single Environment Onboarding Mode
+    # When target-base/ doesn't exist, fall back to single-env mode:
+    # set target_base_path = target_path so both envs load the same artifacts,
+    # making all diffs show no changes. The MCP server adds _warning to responses.
+    if not is_cloud:
+        project_dir_path = Path(kwargs.get("project_dir") or "./")
+        target_base_path = project_dir_path.joinpath(Path(kwargs.get("target_base_path", "target-base")))
+        if not target_base_path.is_dir():
+            kwargs["single_env"] = True
+            kwargs["target_base_path"] = kwargs.get("target_path")
+            console.print(
+                "[yellow]Base artifacts not found. "
+                "Starting in single-environment mode (diffs will show no changes).[/yellow]"
+            )
+            console.print("To enable diffing: dbt docs generate --target-path target-base")
+
     try:
         if sse:
             console.print(f"Starting Recce MCP Server in HTTP/SSE mode on {host}:{port}...")
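
Note on the single-environment fallback added in this hunk: the same decision can be isolated as a pure function, which makes it easy to unit test without starting the server. This is a sketch, not code from `recce/cli.py`. `apply_single_env_fallback` is a hypothetical name, and treating a `None` value for `target_base_path` like a missing key is an assumption of the sketch.

```python
from pathlib import Path


def apply_single_env_fallback(kwargs: dict) -> dict:
    """Return a copy of kwargs, switched to single-env mode if the base dir is missing."""
    resolved = dict(kwargs)
    project_dir = Path(resolved.get("project_dir") or "./")
    # Assumption: a None value falls back to the default directory name.
    base_dir = project_dir / (resolved.get("target_base_path") or "target-base")
    if not base_dir.is_dir():
        resolved["single_env"] = True
        # Point the base at the development artifacts so every diff shows no changes.
        resolved["target_base_path"] = resolved.get("target_path")
    return resolved
```

Working on a copy keeps the caller's `kwargs` untouched, so a test can compare the input and output dictionaries directly.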