Skip to content

fix(tokenizer): unclosed delimited identifier emits TK_ERROR#44

Merged
renecannao merged 1 commit into
mainfrom
fix/unclosed-delimited-ident
May 23, 2026
Merged

fix(tokenizer): unclosed delimited identifier emits TK_ERROR#44
renecannao merged 1 commit into
mainfrom
fix/unclosed-delimited-ident

Conversation

@renecannao
Copy link
Copy Markdown
Collaborator

Summary

  • scan_double_quoted_identifier and scan_backtick_identifier now emit TK_ERROR when reaching EOF without a matching delimiter, instead of silently producing a TK_IDENTIFIER containing everything in between.
  • Properly-closed delimited identifiers are unchanged.

Why

SET search_path = \"unclosed_quote, public was previously parsed as identifier unclosed_quote, public (commas, spaces, and all). The ProxySQL search_path validator accepted it as a 2-element list and overwrote the stored value with garbage — even though PG itself would reject the SQL outright. The parser was hiding the syntax error.

Test plan

  • 4 new tests in tests/test_set.cpp: unclosed \", unclosed backtick, closed-delim PG, closed-delim MySQL
  • make test — 1262 tests pass
  • End-to-end verified in ProxySQL: spliced libsqlparser.a into deps and ran setparser_parsersql_test-t. Walker now returns empty map for SET search_path = \"unclosed_quote, public (was returning unclosed_quote, public as the value before)

scan_double_quoted_identifier (PostgreSQL `"name"`) and
scan_backtick_identifier (MySQL `` `name` ``) previously scanned to EOF
on unclosed delimiters and silently produced a TK_IDENTIFIER with
everything inside as the name. That let SQL like
`SET search_path = "unclosed_quote, public` parse as identifier
`unclosed_quote, public` (commas, spaces, and all) -- downstream the
ProxySQL search_path validator happily accepted that as a 2-element
comma-separated list and overwrote the stored value. The PG backend
would have rejected the query outright; the parser was hiding the
syntax error.

Fix: if the scanner reaches end_ without finding the matching delimiter,
return TK_ERROR positioned at the opening delimiter byte so the parser
fails cleanly with ParseResult::ERROR and downstream consumers can fall
back to "forward to backend, let the backend reject it" instead of
fabricating an identifier from garbage.

Closed-delimiter behaviour is unchanged.

Tests: 4 new cases in tests/test_set.cpp covering unclosed `"`,
unclosed backtick, and matching closed-delimiter cases as regression
guards. End-to-end verified in ProxySQL: setparser_parsersql_test-t now
produces an empty map for `SET search_path = "unclosed_quote, public`
(was returning `unclosed_quote, public` as the value before).
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 23, 2026

Warning

Review limit reached

@renecannao, we couldn't start this review because you've used your available PR reviews for now.

Your plan currently allows 1 review/hour. Refill in 28 minutes and 9 seconds.

Your organization has run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After more review capacity refills, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than trial, open-source, and free plans. In all cases, review capacity refills continuously over time.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4e08903c-60a1-41b7-b66e-2f66417c3e22

📥 Commits

Reviewing files that changed from the base of the PR and between 4f9fe62 and 0013fa7.

📒 Files selected for processing (2)
  • include/sql_parser/tokenizer.h
  • tests/test_set.cpp
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/unclosed-delimited-ident

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@renecannao renecannao merged commit faadb2e into main May 23, 2026
8 checks passed
renecannao added a commit to sysown/proxysql that referenced this pull request May 23, 2026
ParserSQL 1.0.8 (PR ProxySQL/ParserSQL#44, tag v1.0.8) fixes
scan_double_quoted_identifier and scan_backtick_identifier so an
unclosed delimiter emits TK_ERROR instead of silently consuming
everything to EOF as one identifier. Closes the case-#172 cascade
in pgsql-set_parameter_validation_test-t under set_parser_algorithm=3:
`SET search_path = "unclosed_quote, public` now returns an empty
walker map (parse failed) so the session falls through to
unable_to_parse_set_statement() and the SET is forwarded to PG as-is.
PG rejects with a syntax error, search_path stays unchanged, and the
test sees the expected unchanged value (which in turn unblocks the
cascading case-#184).

Tests: added TestEmptyOnParseFail in setparser_parsersql_test.cpp
with byte-exact assertions that the walker now returns an empty map
for both the PG unclosed-`"` and MySQL unclosed-`` ` `` shapes
(294/294 local).

SHA256 parsersql-1.0.8.tar.gz: 9bdaf58c207a556c117f2322c3f532e367e631459e158e30b5139604818ef8e1
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant