fix: tokenizer word_continuation case sensitivity for camelCase ident…#21

Merged
niclasko merged 1 commit into main from fix/tokenizer-camelcase-keyword
Feb 7, 2026

@niclasko niclasko commented Feb 7, 2026

…ifiers

The word_continuation method in StringWalker compared characters against word_valid_chars (lowercase only) without normalizing case. This caused camelCase identifiers starting with keywords (e.g., 'fromUser') to be incorrectly tokenized as keyword + identifier ('FROM' + 'User').

Fix: normalize the next character to lowercase before checking. Applied to both TypeScript and Python implementations.
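
The bug and the fix described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `StringWalker` class here is a stripped-down stand-in, and the exact contents of `WORD_VALID_CHARS`, the `position` attribute, the `word_continuation_buggy` method, and the `matches_keyword` helper are all assumptions made for the example. Only the names `StringWalker`, `word_continuation`, and `word_valid_chars` come from the description.

```python
import string

# Assumption: the valid-character set holds lowercase letters, digits and
# underscore only, mirroring the "lowercase only" note in the description.
WORD_VALID_CHARS = set(string.ascii_lowercase + string.digits + "_")


class StringWalker:
    """Hypothetical, stripped-down stand-in for the real StringWalker."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.position = 0  # assumed cursor into the text

    def word_continuation_buggy(self, word: str) -> bool:
        """Pre-fix behaviour: the next character is compared as-is."""
        end = self.position + len(word)
        if end >= len(self.text):
            return False
        return self.text[end] in WORD_VALID_CHARS

    def word_continuation(self, word: str) -> bool:
        """Post-fix behaviour: lowercase the next character before checking."""
        end = self.position + len(word)
        if end >= len(self.text):
            return False
        return self.text[end].lower() in WORD_VALID_CHARS

    def matches_keyword(self, keyword: str, fixed: bool = True) -> bool:
        """Hypothetical helper: a keyword matches only if the word stops there."""
        end = self.position + len(keyword)
        candidate = self.text[self.position:end]
        if candidate.lower() != keyword.lower():
            return False
        check = self.word_continuation if fixed else self.word_continuation_buggy
        return not check(keyword)
```

With the unfixed check, the uppercase `U` in `fromUser` is not found in the lowercase-only set, so the walker concludes that the word ends after `from` and emits a keyword. Lowercasing the character first makes `U` count as a continuation, so `fromUser` stays a single identifier, while `FROM users` still yields the keyword because a space is not a word character.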

Added tests for:

  • Lookup with 'from' keyword as property name
  • CamelCase alias starting with keyword ('fromUser')
  • FROM keyword property in CREATE VIRTUAL subquery

@niclasko niclasko merged commit 2933e01 into main Feb 7, 2026
@niclasko niclasko deleted the fix/tokenizer-camelcase-keyword branch February 7, 2026 13:04