feat: add ztd query lint for structural SQL diagnostics#536
Conversation
📝 WalkthroughWalkthroughAdds a SQL linting feature to the CLI: new Changes
Sequence DiagramsequenceDiagram
participant User
participant CLI
participant LintHandler as Lint Command Handler
participant Parser as SQL Parser
participant Analyzer as Analysis Engine
participant Formatter as Report Formatter
participant Output
User->>CLI: run "ztd query lint file.sql --format json"
CLI->>LintHandler: invoke runQueryLintCommand(file, options)
LintHandler->>Parser: read & parse SQL file
Parser-->>LintHandler: AST
LintHandler->>Analyzer: buildQueryLintReport(file)
Analyzer->>Analyzer: detect unused CTEs, cycles, duplicates, risks
Analyzer-->>LintHandler: QueryLintReport
LintHandler->>Formatter: formatQueryLintReport(report, format)
Formatter-->>LintHandler: formattedOutput
LintHandler->>Output: write to stdout or file
Output-->>User: report
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (6)
packages/ztd-cli/src/query/lint.ts (4)
435-436: Remove trailing blank lines.The file ends with two blank lines. While not a functional issue, it's a minor style inconsistency.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/ztd-cli/src/query/lint.ts` around lines 435 - 436, Remove the two trailing blank lines at the end of packages/ztd-cli/src/query/lint.ts so the file ends immediately after the last statement; open the file, delete the extra empty lines at EOF (after the last token in the lint.ts module) and save so there is a single newline at file end per project style.
102-118: Synchronous file read may block the event loop for large files.
readFileSyncon line 104 blocks the thread. For a CLI tool this is generally acceptable, but if this module is ever used in a server context, consider an async variant.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/ztd-cli/src/query/lint.ts` around lines 102 - 118, The function buildQueryLintReport uses synchronous readFileSync which can block the event loop; change it to an async implementation that uses fs.promises.readFile (or util.promisify(fs.readFile)), make buildQueryLintReport return a Promise<QueryLintReport> (e.g., declare it async), await the file read and then proceed with SqlParser.parse/ assertSupportedStatement/ analyzeStatement as before, and update any callers to await the new async buildQueryLintReport; keep the same behavior and preserve symbols like buildQueryLintReport, SqlParser.parse, assertSupportedStatement, analyzeStatement, and collectReachableCtes.
133-149: Large-CTE heuristic uses character-based estimation that may produce inconsistent results.The
estimatedLineCountcalculation at line 139 usesMath.ceil(formattedSql.length / 50)as a fallback. This assumes ~50 characters per line, which may not reflect actual complexity. The approach is documented inline, so this is acceptable, but consider adding a note in user-facing messages that this is an approximation.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/ztd-cli/src/query/lint.ts` around lines 133 - 149, The large-CTE heuristic uses a character-based fallback (Math.ceil(formattedSql.length / 50)) which can be misleading; update the user-facing message emitted in the issues.push call for 'large-cte' (the object created using cte.aliasExpression.table.name, estimatedLineCount and message) to explicitly mark the line count as an approximation (e.g. append or include "(approximate)" or "approx.") and keep the existing estimatedLineCount calculation and LARGE_CTE_LINE_THRESHOLD logic unchanged so the heuristic behavior is preserved but the message clearly communicates uncertainty to users.
65-91: Consider anchoring regex patterns to reduce false positives.The
executepattern/\bexecute\b/icould match the word "execute" in comments, strings, or column names (e.g.,execute_at). Consider refining patterns or documenting that false positives are acceptable for risk detection.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/ztd-cli/src/query/lint.ts` around lines 65 - 91, The execute detection regex in ANALYSIS_RISK_PATTERNS is too broad; update the entry that uses the execute pattern so it only matches the standalone SQL keyword (i.e., ensure there are no adjacent alphanumeric/underscore characters) by replacing the current pattern with one that uses negative lookbehind/lookahead around "execute" to prevent matching identifiers like "execute_at", and add a short inline comment near ANALYSIS_RISK_PATTERNS documenting that some false positives (e.g., inside comments/strings) may still occur and are acceptable for risk detection.packages/ztd-cli/tests/queryLint.unit.test.ts (2)
9-24: Consider extracting shared test utilities.The
createTempDirandcreateSqlWorkspacehelpers duplicate similar functions incliCommands.test.ts. Consider extracting these to a shared test utilities module to reduce duplication, though this is a minor refactor that can be deferred.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/ztd-cli/tests/queryLint.unit.test.ts` around lines 9 - 24, Duplicate test helpers createTempDir and createSqlWorkspace should be extracted into a shared test utilities module: create a new module (e.g., tests/testUtils.ts) exporting functions createTempDir and createSqlWorkspace, move the implementations there, and update both queryLint.unit.test.ts and cliCommands.test.ts to import these functions instead of redefining them; ensure the exported function signatures and referenced symbols (createTempDir, createSqlWorkspace) remain unchanged and that any path-related imports (path, mkdirSync, mkdtempSync, existsSync) are present in the new module.
122-123: Remove trailing blank lines.Minor style issue: the file ends with two blank lines.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/ztd-cli/tests/queryLint.unit.test.ts` around lines 122 - 123, Remove the extra trailing blank line(s) at the end of the test file packages/ztd-cli/tests/queryLint.unit.test.ts so the file ends with a single newline only; open that file and delete the extra blank line(s) after the final test block or closing brace to conform to style guidelines.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@packages/ztd-cli/src/query/lint.ts`:
- Around line 435-436: Remove the two trailing blank lines at the end of
packages/ztd-cli/src/query/lint.ts so the file ends immediately after the last
statement; open the file, delete the extra empty lines at EOF (after the last
token in the lint.ts module) and save so there is a single newline at file end
per project style.
- Around line 102-118: The function buildQueryLintReport uses synchronous
readFileSync which can block the event loop; change it to an async
implementation that uses fs.promises.readFile (or util.promisify(fs.readFile)),
make buildQueryLintReport return a Promise<QueryLintReport> (e.g., declare it
async), await the file read and then proceed with SqlParser.parse/
assertSupportedStatement/ analyzeStatement as before, and update any callers to
await the new async buildQueryLintReport; keep the same behavior and preserve
symbols like buildQueryLintReport, SqlParser.parse, assertSupportedStatement,
analyzeStatement, and collectReachableCtes.
- Around line 133-149: The large-CTE heuristic uses a character-based fallback
(Math.ceil(formattedSql.length / 50)) which can be misleading; update the
user-facing message emitted in the issues.push call for 'large-cte' (the object
created using cte.aliasExpression.table.name, estimatedLineCount and message) to
explicitly mark the line count as an approximation (e.g. append or include
"(approximate)" or "approx.") and keep the existing estimatedLineCount
calculation and LARGE_CTE_LINE_THRESHOLD logic unchanged so the heuristic
behavior is preserved but the message clearly communicates uncertainty to users.
- Around line 65-91: The execute detection regex in ANALYSIS_RISK_PATTERNS is
too broad; update the entry that uses the execute pattern so it only matches the
standalone SQL keyword (i.e., ensure there are no adjacent
alphanumeric/underscore characters) by replacing the current pattern with one
that uses negative lookbehind/lookahead around "execute" to prevent matching
identifiers like "execute_at", and add a short inline comment near
ANALYSIS_RISK_PATTERNS documenting that some false positives (e.g., inside
comments/strings) may still occur and are acceptable for risk detection.
In `@packages/ztd-cli/tests/queryLint.unit.test.ts`:
- Around line 9-24: Duplicate test helpers createTempDir and createSqlWorkspace
should be extracted into a shared test utilities module: create a new module
(e.g., tests/testUtils.ts) exporting functions createTempDir and
createSqlWorkspace, move the implementations there, and update both
queryLint.unit.test.ts and cliCommands.test.ts to import these functions instead
of redefining them; ensure the exported function signatures and referenced
symbols (createTempDir, createSqlWorkspace) remain unchanged and that any
path-related imports (path, mkdirSync, mkdtempSync, existsSync) are present in
the new module.
- Around line 122-123: Remove the extra trailing blank line(s) at the end of the
test file packages/ztd-cli/tests/queryLint.unit.test.ts so the file ends with a
single newline only; open that file and delete the extra blank line(s) after the
final test block or closing brace to conform to style guidelines.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7ab6f874-fc28-475e-b2ca-4daaee96b5ff
📒 Files selected for processing (4)
packages/ztd-cli/src/commands/query.tspackages/ztd-cli/src/query/lint.tspackages/ztd-cli/tests/cliCommands.test.tspackages/ztd-cli/tests/queryLint.unit.test.ts
There was a problem hiding this comment.
🧹 Nitpick comments (2)
packages/ztd-cli/src/query/lint.ts (2)
291-294: Consider adding a type guard for safety.The type assertion to access
withClauseis pragmatic given the AST structure differences, though a type guard could make this more robust. Current implementation is acceptable for the controlled input types.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/ztd-cli/src/query/lint.ts` around lines 291 - 294, The getQueryWithClause function currently uses a blunt type assertion to access withClause; add a narrow type guard to safely detect objects that actually have a withClause property before accessing it. Modify getQueryWithClause to check (e.g. using a helper isWithClauseHolder or an inline typeof/property-in check) that statement has a withClause property of the expected shape, then return that withClause or null; keep the return type { recursive: boolean } | null and reference getQueryWithClause to locate the change.
438-444: Minor duplication in scope set computation.The same
Set(group.map((item) => item.scope))is computed twice (lines 442 and 443). Could be extracted to a variable, but this is a minor optimization.♻️ Optional: extract scope set computation
.map((group) => ({ type, severity, fragment: group[0].preview, - occurrences: Array.from(new Set(group.map((item) => item.scope))).sort(), - message: `${messagePrefix} across ${Array.from(new Set(group.map((item) => item.scope))).sort().join(', ')}` + occurrences: (() => { + const scopes = Array.from(new Set(group.map((item) => item.scope))).sort(); + return scopes; + })(), + message: `${messagePrefix} across ${Array.from(new Set(group.map((item) => item.scope))).sort().join(', ')}` }));A cleaner approach would be to compute scopes before the object literal:
.map((group) => { const scopes = Array.from(new Set(group.map((item) => item.scope))).sort(); return { type, severity, fragment: group[0].preview, occurrences: scopes, message: `${messagePrefix} across ${scopes.join(', ')}` }; });🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/ztd-cli/src/query/lint.ts` around lines 438 - 444, In the map callback that builds the lint objects (the arrow function using parameter "group" which returns properties type, severity, fragment, occurrences, and message), avoid recomputing Array.from(new Set(group.map((item) => item.scope))).sort() twice: compute it once (e.g., const scopes = Array.from(new Set(group.map(item => item.scope))).sort()) and then use scopes for occurrences and for constructing message via `${messagePrefix} across ${scopes.join(', ')}`; keep fragment as group[0].preview and preserve all other fields.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@packages/ztd-cli/src/query/lint.ts`:
- Around line 291-294: The getQueryWithClause function currently uses a blunt
type assertion to access withClause; add a narrow type guard to safely detect
objects that actually have a withClause property before accessing it. Modify
getQueryWithClause to check (e.g. using a helper isWithClauseHolder or an inline
typeof/property-in check) that statement has a withClause property of the
expected shape, then return that withClause or null; keep the return type {
recursive: boolean } | null and reference getQueryWithClause to locate the
change.
- Around line 438-444: In the map callback that builds the lint objects (the
arrow function using parameter "group" which returns properties type, severity,
fragment, occurrences, and message), avoid recomputing Array.from(new
Set(group.map((item) => item.scope))).sort() twice: compute it once (e.g., const
scopes = Array.from(new Set(group.map(item => item.scope))).sort()) and then use
scopes for occurrences and for constructing message via `${messagePrefix} across
${scopes.join(', ')}`; keep fragment as group[0].preview and preserve all other
fields.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d1021cd8-5dec-410c-951f-95d50cdeed54
📒 Files selected for processing (2)
packages/ztd-cli/src/query/lint.tspackages/ztd-cli/tests/queryLint.unit.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/ztd-cli/tests/queryLint.unit.test.ts
Summary
ztd query lint <sqlFile>for AI-first structural maintainability checksVerification
Notes
@rawsql-ts/ztd-clisuite and still hits an unrelated existing failure intests/init.command.test.ts(init local-source mode links direct rawsql-ts dependencies from the monorepo and emits a local shim), so the commit was created with--no-verifyafter the focused checks above passed.Summary by CodeRabbit
New Features
query lint <sqlFile>--formatand--outoptionsTests