[feedback] LLM agent experience, silent failures, false positives — v0.2.1

Tested sqlcg v0.2.1 on a Snowflake DWH repo (1457 SQL files, 3245 tables, 8084 queries, SqlColumn: 0). Ran the full session via an LLM agent (Claude Code) using the MCP server. Issues grouped by severity.

## 1. LLM onboarding (critical)

- Package is `sql-code-graph`, binary is `sqlcg`. An LLM invoking the MCP server name hits "executable not found, try `uvx --from sql-code-graph sqlcg`" and cannot self-recover without human intervention.
- `--help` lists commands but gives no workflow order. The required sequence (`db init` → `index <path>` → `git install-hooks`) is not implied anywhere. The LLM jumped straight to `find`/`analyze` and got empty results for the entire session.
- `mcp setup` prints JSON with no mention of prerequisites. An LLM will run it and consider setup complete.
- `watch` has no initialization guard — behavior is undefined if `db init` was never run.
- **Suggestion:** add an LLM-oriented quick-start block at the top of `--help`:
  ```
  QUICK START (run in order):
    1. sqlcg db init
    2. sqlcg index <repo-root> --dialect <dialect>
    3. sqlcg git install-hooks

  NOTE: binary is 'sqlcg', MCP package name is 'sql-code-graph'
  Check health: sqlcg db info
  ⚠ If SqlColumn: 0 — column lineage unavailable, only table-level tracing works
  ```

## 2. Silent failures & misleading state

- `db info` shows `SqlColumn: 0` with no warning that `trace_column_lineage` and `find column` are therefore completely non-functional.
- Index success rate reported as 100% despite 900KB of parse warnings and zero column nodes populated. "Success" should mean something.
- `find column` returns "No results" with no explanation that the column graph is empty. An LLM cannot distinguish "column not found" from "column graph not built".
- After `db init` with no index run, every query silently returns empty. There is no "database is empty, have you run `sqlcg index`?" guard.

## 3. `analyze unused` is all false positives

Returns ~100 `IA_ANALYTICS` / `IA_TABLEAU` / `IA_BUSINESSOBJECTS` views as unused. These are consumer-facing views used by Tableau and BI tools — external consumers that have no SQL references within the repo. The tool has no concept of external consumers.

Suggestions:
- Add a `--exclude-schema` flag to filter known consumer schemas
- Or warn: "results may include externally-consumed objects not referenced in indexed files"

## 4. `analyze impact` vs `find pattern` inconsistency

For the same table (`BA.WTFV_VOORRAAD_DAGSTAND_IGDC`):
- `analyze impact` returned only DDL files
- `find pattern` additionally found the ETL INSERT in `etl/sql/fact/`

The two commands should return consistent results.

## 5. Feedback loop has no false-negative path

`trace_column_lineage` was called 7 times via MCP, all returned empty, 0 feedback samples collected. The feedback prompt only fires when there are results to rate.

The FN case — empty result when the user expected one — is the most valuable signal at this stage and it's completely absent. Suggestion: prompt "Did you expect a result here? (y/n)" on empty returns.

As a proxy metric: `execute_cypher` was the #1 MCP tool with 15 calls vs `trace_column_lineage` at 7. That ratio means the LLM kept falling back to raw Cypher because the high-level tools failed. Worth surfacing in `sqlcg gain` as a signal that abstractions aren't working.

## 6. Parser: `ALTER WAREHOUSE` noise

`ALTER WAREHOUSE IDENTIFIER($var) SET WAREHOUSE_SIZE = 'X-Large'` is a standard pattern in Snowflake DDL rebuild files (every table rebuild uses it). The parser fails on it and falls back to `Command` type. Since it's not DML, stripping it before DML extraction would eliminate the noise entirely. Same applies to `CALL MA.MSSPR_*()` stored procedure calls.

## 7. Missing: persistent index config

No `.sqlcg.toml` to declare repo root and dialect. After a `db reset` or fresh clone there is no way to replay the index without remembering the original invocation. The git hook (which correctly resyncs on branch switch) also depends on the DB already existing — it cannot bootstrap from scratch.

Suggestion: `.sqlcg.toml` with `path` and `dialect` so `sqlcg index` with no arguments just works, and the git hook can call it safely.

## 8. No progress feedback during `index`

`sqlcg index` on 1457 files took ~17 seconds but produced zero stdout output while running — only the final line `Indexed 1457 files — 2231 tables, 0 edges, 148 errors` appeared at the end. With no progress indicator it's impossible to tell whether the command is working, hung, or waiting on something.

Also: the summary line reports `0 edges` — a direct signal that the graph has no relationships and lineage tracing will return nothing — but it appears only in the raw log, never surfaced in `db info`.

Suggestions:
- A simple progress indicator or periodic `INFO: indexed N files...` lines
- Surface `edges: 0` as a warning in `db info` — a graph with no edges cannot answer any lineage query

## 9. `uvx` cold-start is invisible

The recommended install and MCP server config both use `uvx`, which downloads and installs the package on first use. On a cold cache this took ~6.5 minutes with zero output — indistinguishable from a hang. The actual indexing of 1457 files took 17 seconds.

The MCP server config (`"command": "uvx"`) means Claude Code also pays a startup cost each session.

Suggestions:
- Docs should recommend `uv tool install sql-code-graph` for persistent installs, with `uvx` only for one-shot tries
- Or at minimum note in the quick-start that first run will be slow due to package download

---

Environment: sqlcg 0.2.1, Snowflake dialect, 1457 files indexed, WSL2 / Linux, Claude Code MCP session.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[feedback] LLM agent experience, silent failures, false positives — v0.2.1 #5

1. LLM onboarding (critical)

2. Silent failures & misleading state

3. `analyze unused` is all false positives

4. `analyze impact` vs `find pattern` inconsistency

5. Feedback loop has no false-negative path

6. Parser: `ALTER WAREHOUSE` noise

7. Missing: persistent index config

8. No progress feedback during `index`

9. `uvx` cold-start is invisible

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

[feedback] LLM agent experience, silent failures, false positives — v0.2.1 #5

Description

1. LLM onboarding (critical)

2. Silent failures & misleading state

3. analyze unused is all false positives

4. analyze impact vs find pattern inconsistency

5. Feedback loop has no false-negative path

6. Parser: ALTER WAREHOUSE noise

7. Missing: persistent index config

8. No progress feedback during index

9. uvx cold-start is invisible

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions

3. `analyze unused` is all false positives

4. `analyze impact` vs `find pattern` inconsistency

6. Parser: `ALTER WAREHOUSE` noise

8. No progress feedback during `index`

9. `uvx` cold-start is invisible