[bugfix] fix(sql): strip trailing -- comments before OPTIONS regex match#18425
[bugfix] fix(sql): strip trailing -- comments before OPTIONS regex match#18425tarun11Mavani wants to merge 2 commits into
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #18425 +/- ##
============================================
- Coverage 64.33% 64.27% -0.07%
Complexity 1137 1137
============================================
Files 3314 3314
Lines 204100 204150 +50
Branches 31771 31785 +14
============================================
- Hits 131315 131221 -94
- Misses 62241 62396 +155
+ Partials 10544 10533 -11
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
689b698 to
118f78d
Compare
63d1600 to
1873fbb
Compare
|
@Jackie-Jiang @xiangfu0 Can you take a look? |
|
@Jackie-Jiang @xiangfu0 @shauryachats gentle bump on this. |
| pinotQuery = compileToPinotQuery( | ||
| "SELECT col1, count(*) FROM foo GROUP BY col1 /* option(skipUpsert=true) */"); | ||
| Assert.assertTrue(pinotQuery.getQueryOptions() == null || pinotQuery.getQueryOptions().isEmpty(), | ||
| "option(...) inside a /* */ comment must not be parsed as a query option"); |
There was a problem hiding this comment.
Can you add a test for double dashes within double quotes? Legally a column name "my--column--name" is valid under Pinot.
|
@tarun11Mavani can you please resolve the binary compatibility check? A rebase generally fixes it. |
The legacy OPTIONS regex (e.g. `option(skipUpsert=true)`) is applied to the raw SQL string before it is passed to the Calcite parser. Because `sanitizeSql` only stripped trailing whitespace, a query ending with a single-line comment such as SELECT col1 FROM foo -- option(skipUpsert=true) would be mistakenly treated as if `skipUpsert=true` were a real query option, since the regex anchors at end-of-string (\Z). Fix: scan only the last line of the sanitized SQL for an unquoted `--` sequence and remove it (plus any resulting trailing whitespace) before the OPTIONS regex is applied. Block comments (`/* ... */`) are not affected because they shift the `option(...)` text away from the end-of-string anchor and therefore never triggered the bug. Added two regression test cases to `CalciteSqlCompilerTest#testQueryOptions`: - `-- option(skipUpsert=true)` trailing comment must not set query options - `/* option(skipUpsert=true) */` block comment must not set query options
Verifies that "--" inside a double-quoted column name ("my--column--name")
is not mistakenly treated as a single-line comment by stripSingleLineComments.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
6e81989 to
c60421b
Compare
Done. |
The legacy OPTIONS regex (e.g.
option(skipUpsert=true)) is applied to the raw SQL string before it is passed to the Calcite parser. BecausesanitizeSqlonly stripped trailing whitespace, a query ending with a single-line comment such asSELECT col1 FROM foo -- option(skipUpsert=true)
would be mistakenly treated as if
skipUpsert=truewere a real query option, since the regex anchors at end-of-string (\Z).Fix: scan only the last line of the sanitized SQL for an unquoted
--sequence and remove it (plus any resulting trailing whitespace) before the OPTIONS regex is applied. Block comments (/* ... */) are not affected because they shift theoption(...)text away from the end-of-string anchor and therefore never triggered the bug.Added two regression test cases to
CalciteSqlCompilerTest#testQueryOptions:-- option(skipUpsert=true)trailing comment must not set query options/* option(skipUpsert=true) */block comment must not set query optionsBug reproduced in cluster:
Without skipUpsert=true:

With skipUpsert=true

With skipUpsert=true but in comments. -> This is incorrect.
