fix(tokenizer): unclosed delimited identifier emits TK_ERROR#44
Conversation
scan_double_quoted_identifier (PostgreSQL `"name"`) and scan_backtick_identifier (MySQL `` `name` ``) previously scanned to EOF on unclosed delimiters and silently produced a TK_IDENTIFIER with everything inside as the name. That let SQL like `SET search_path = "unclosed_quote, public` parse as identifier `unclosed_quote, public` (commas, spaces, and all) -- downstream the ProxySQL search_path validator happily accepted that as a 2-element comma-separated list and overwrote the stored value. The PG backend would have rejected the query outright; the parser was hiding the syntax error. Fix: if the scanner reaches end_ without finding the matching delimiter, return TK_ERROR positioned at the opening delimiter byte so the parser fails cleanly with ParseResult::ERROR and downstream consumers can fall back to "forward to backend, let the backend reject it" instead of fabricating an identifier from garbage. Closed-delimiter behaviour is unchanged. Tests: 4 new cases in tests/test_set.cpp covering unclosed `"`, unclosed backtick, and matching closed-delimiter cases as regression guards. End-to-end verified in ProxySQL: setparser_parsersql_test-t now produces an empty map for `SET search_path = "unclosed_quote, public` (was returning `unclosed_quote, public` as the value before).
|
Warning Review limit reached
Your plan currently allows 1 review/hour. Refill in 28 minutes and 9 seconds. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more review capacity refills, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than trial, open-source, and free plans. In all cases, review capacity refills continuously over time. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
ParserSQL 1.0.8 (PR ProxySQL/ParserSQL#44, tag v1.0.8) fixes scan_double_quoted_identifier and scan_backtick_identifier so an unclosed delimiter emits TK_ERROR instead of silently consuming everything to EOF as one identifier. Closes the case-#172 cascade in pgsql-set_parameter_validation_test-t under set_parser_algorithm=3: `SET search_path = "unclosed_quote, public` now returns an empty walker map (parse failed) so the session falls through to unable_to_parse_set_statement() and the SET is forwarded to PG as-is. PG rejects with a syntax error, search_path stays unchanged, and the test sees the expected unchanged value (which in turn unblocks the cascading case-#184). Tests: added TestEmptyOnParseFail in setparser_parsersql_test.cpp with byte-exact assertions that the walker now returns an empty map for both the PG unclosed-`"` and MySQL unclosed-`` ` `` shapes (294/294 local). SHA256 parsersql-1.0.8.tar.gz: 9bdaf58c207a556c117f2322c3f532e367e631459e158e30b5139604818ef8e1
Summary
scan_double_quoted_identifierandscan_backtick_identifiernow emitTK_ERRORwhen reaching EOF without a matching delimiter, instead of silently producing aTK_IDENTIFIERcontaining everything in between.Why
SET search_path = \"unclosed_quote, publicwas previously parsed as identifierunclosed_quote, public(commas, spaces, and all). The ProxySQL search_path validator accepted it as a 2-element list and overwrote the stored value with garbage — even though PG itself would reject the SQL outright. The parser was hiding the syntax error.Test plan
tests/test_set.cpp: unclosed\", unclosed backtick, closed-delim PG, closed-delim MySQLmake test— 1262 tests passsetparser_parsersql_test-t. Walker now returns empty map forSET search_path = \"unclosed_quote, public(was returningunclosed_quote, publicas the value before)