Skip to content

feat(cubesql): Improve UX for parsing errors#10984

Merged
ovr merged 3 commits into
masterfrom
feat/cubesql-ux-parser-error
Jun 2, 2026
Merged

feat(cubesql): Improve UX for parsing errors#10984
ovr merged 3 commits into
masterfrom
feat/cubesql-ux-parser-error

Conversation

@ovr
Copy link
Copy Markdown
Member

@ovr ovr commented Jun 1, 2026

Render a source snippet with a caret under the offending location for SQL parser errors, so the user sees where the query broke instead of just a line/column number.

Locations come from the parser error message (or end-of-input for EOF), with token-aware caret placement:

  • snap to the start of a word the parser stopped just after (MEASURE()
  • snap to the end of the last token for ; / EOF (SELECT FROM)
  • underline the whole literal for unterminated strings ('abc)
image image image

@github-actions github-actions Bot added the rust Pull requests that update Rust code label Jun 1, 2026
@ovr ovr force-pushed the feat/cubesql-ux-parser-error branch from 2d9017a to bf33812 Compare June 1, 2026 15:22
@codecov
Copy link
Copy Markdown

codecov Bot commented Jun 1, 2026

Codecov Report

❌ Patch coverage is 96.21622% with 14 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.82%. Comparing base (5f3cb69) to head (b98e1ad).
⚠️ Report is 4 commits behind head on master.

Files with missing lines Patch % Lines
.../cubesql/cubesql/src/compile/parser/sql_snippet.rs 95.16% 12 Missing ⚠️
...st/cubesql/cubesql/src/compile/parser/parser_pg.rs 98.36% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #10984      +/-   ##
==========================================
+ Coverage   78.75%   78.82%   +0.07%     
==========================================
  Files         470      471       +1     
  Lines       93902    94322     +420     
  Branches     3466     3494      +28     
==========================================
+ Hits        73953    74351     +398     
- Misses      19446    19461      +15     
- Partials      503      510       +7     
Flag Coverage Δ
cube-backend 58.56% <ø> (+0.13%) ⬆️
cubesql 83.32% <96.21%> (+0.05%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Render a source snippet with a caret under the offending location for SQL
parser errors, so the user sees where the query broke instead of just a
line/column number.

Locations come from the parser error message (or end-of-input for EOF),
with token-aware caret placement:
- snap to the start of a word the parser stopped just after (`MEASURE(`)
- snap to the end of the last token for `;` / EOF (`SELECT FROM`)
- underline the whole literal for unterminated strings (`'abc`)
@ovr ovr force-pushed the feat/cubesql-ux-parser-error branch from bf33812 to f54ee80 Compare June 1, 2026 17:56
@ovr ovr marked this pull request as ready for review June 1, 2026 17:57
@ovr ovr requested a review from a team as a code owner June 1, 2026 17:57
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Jun 1, 2026

Claude finished @ovr's task in 1m 40s —— View job


Code Review

Nice, focused UX improvement. The sql_snippet module is well-isolated, the caret-placement logic (delimiter→preceding word, terminator→prior token end, EOF→last-token end, tokenizer literals→extend to whitespace) is reasonable, and the test coverage in both sql_snippet.rs and pg.rs exercises the interesting cases (missing comma, dangling FROM/WHERE/GROUP BY, double comma, misspelled keyword, unclosed paren, semicolon, unterminated string). Verdict: ship it after considering the small notes below.

Findings — 0 high, 0 medium, 3 low/nit

Low — latent underflow in caret_width (same-line branch)

rust/cubesql/cubesql/src/compile/parser/sql_snippet.rs:87-94

let caret_width = if end.line == start.line && end.column >= start.column {
    let end_col = end.column.min(error_line_len.max(start_col));
    ((end_col - start.column) + 1) as usize
}

start_col is clamped to error_line_len + 1 on line 80, but the subtraction here uses the un-clamped start.column. If a caller ever passes Span::new(start, end) where start.column == end.column and both exceed line length (e.g. an error reported past EOL), end_col is clamped down to error_line_len.max(start_col) while start.column stays large → end_col - start.column underflows.

Today’s callers don’t hit this: parser-error spans are points and trip the end.column >= start.column branch with end_col == start.column after clamping… except the clamping is on start_col, not start.column — so for a point span beyond EOL the underflow would actually fire. Worth either substituting start_col in the arithmetic or clamping start.column once up front:

let start_col = start.column.min(error_line_len + 1);
// ...
let end_col = end.column.min(error_line_len + 1).max(start_col);
let caret_width = ((end_col - start_col) + 1) as usize;

Nit — place_parser_caret silently no-ops on inexact location

rust/cubesql/cubesql/src/compile/parser/sql_snippet.rs:154

let Some(idx) = tokens.iter().position(|t| t.span.start == reported) else {
    return reported;
};

Exact-equality on Location means a future sqlparser bump that shifts a reported location by one column will silently fall through to the un-snapped position with no test failure. A find(|t| t.span.start.line == reported.line && t.span.start.column >= reported.column) fallback (or asserting the lookup succeeded in tests) would catch regressions earlier.

Nit — re-tokenization on the error path

rust/cubesql/cubesql/src/compile/parser/pg.rs:237-242

Parser::new(&dialect).with_tokens_with_locations(tokens).parse_statements() consumes the tokens, so the snippet builder re-tokenizes. Fine on an error path, but a let tokens_for_snippet = tokens.clone(); before handing them to the parser would avoid the second pass and the .ok() swallowing of tokenizer errors at exactly the moment the user already needs good diagnostics.

· feat/cubesql-ux-parser-error

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Jun 2, 2026

Claude finished @ovr's task in 1m 24s —— View job


Code Review — follow-up on 9a6376a (line cropping)

Verdict on the new commit: ship it. The horizontal-window cropping is implemented cleanly — single window shared across all printed lines so the caret column stays aligned with the source text, and the test coverage (test_syntax_error_in_single_wide_line_is_truncated, test_syntax_error_in_wide_line_truncates_right_only, test_syntax_error_on_wide_numeric_token, plus the 500-line buried-error case) exercises the centered/left-only/right-only cropping branches and the long-numeric-token regex edge case.

Findings — 0 high, 0 medium, 2 nits (plus 1 carried-over low from f54ee80)

Nit — no truncation indicator when a line is cropped

rust/cubesql/cubesql/src/compile/parser/sql_snippet.rs:96-114

When widest > MAX_LINE_WIDTH, the window silently lops off characters from either side with no visible marker. Compare to the test output:

1 | SELECT status MEASURE(orders.count) zzzzzzzzzzzzzzzzzzzzzzzzzzzz
  |               ^

— the user can't tell that 270+ z's were trimmed off the right, or whether they're looking at the start or middle of the source. rustc uses ... ellipses on either side for the same reason. A single character per cropped side (e.g. or > / <) inside the gutter or appended to the cropped end would close that gap without disturbing caret alignment (the indicator can replace the first/last visible char rather than be prepended, to keep widths stable).

Not blocking — current output is still a huge step up from a bare line/column.

Nit — widest includes context lines but window is centered on the error caret

rust/cubesql/cubesql/src/compile/parser/sql_snippet.rs:91-108

The cropping decision uses widest across all displayed lines, but win_start/win_end are chosen purely from caret_start. If a context line is much wider than the error line and pushes widest over the limit, every line — including the error line that would have fit uncropped — gets cropped to a window around the caret. That can hide the start of a short error line just because a noisy preceding line happened to be long. Worth considering whether cropping should only kick in when the error line itself exceeds the limit; the context lines could then be independently cropped to the same window.

Edge case, not common.

Low (carried over from f54ee80, still applicable) — latent underflow in caret_width same-line branch

rust/cubesql/cubesql/src/compile/parser/sql_snippet.rs:79-86

let caret_width = if end.line == start.line && end.column >= start.column {
    let end_col = end.column.min(error_line_len.max(start_col));
    ((end_col - start.column) + 1) as usize

The subtraction uses the un-clamped start.column while end_col has been clamped down. For a span entirely past EOL (start.column == end.column > error_line_len + 1), end_col collapses to start_colerror_line_len + 1 while start.column stays larger → underflow. No current caller hits this — parser-error spans are points and the clamp happens to land equal — but it remains a sharp edge. Fix is one line:

let end_col = end.column.min(error_line_len + 1).max(start_col);
let caret_width = ((end_col - start_col) + 1) as usize;
· `feat/cubesql-ux-parser-error`

Same-line branch subtracted the un-clamped start.column from a clamped
end_col, underflowing for a span entirely past end of line. Clamp both
operands to the same ceiling and floor end_col to start_col.
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Jun 2, 2026

Claude finished @ovr's task in 34s —— View job


Code Review — follow-up on b98e1ad

Verdict: ship it. The caret_width underflow flagged in my previous two reviews is now fixed exactly as suggested — both operands of the subtraction share the same error_line_len + 1 ceiling and end_col is floored to start_col, so a span entirely past EOL collapses to a 1-wide caret instead of underflowing.

Details — 0 high, 0 medium, 0 new low (2 prior nits still open but non-blocking)

Fix verification — rust/cubesql/cubesql/src/compile/parser/sql_snippet.rs:79-82

let caret_width = if end.line == start.line && end.column >= start.column {
    let end_col = end.column.min(error_line_len + 1).max(start_col);
    ((end_col - start_col) + 1) as usize

For the previously-bad case (start.column == end.column > error_line_len + 1): end_col clamps to error_line_len + 1, start_col also clamps to error_line_len + 1 (from line 80 unchanged), the .max(start_col) then snaps end_col up to match, subtraction is 0, caret width = 1. Correct.

For normal in-range spans: end_col >= start.column >= start_col, behavior unchanged.

For unterminated-string / multi-token spans that extend to the line end: clamped to error_line_len + 1, width covers up to the terminator column. Matches the existing tests.

Prior nits still open (non-blocking)

  • Truncation indicator when widest > MAX_LINE_WIDTH — no / > marker on cropped sides. Cosmetic.
  • widest includes context lines — a wide context line can force the error line to be cropped even when it would have fit. Edge case.

Both raised in the previous review on 9a6376a; leaving the decision to the author.

· `feat/cubesql-ux-parser-error`

@ovr ovr merged commit b18fd3d into master Jun 2, 2026
94 of 96 checks passed
@ovr ovr deleted the feat/cubesql-ux-parser-error branch June 2, 2026 12:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

rust Pull requests that update Rust code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants