It seems that in OCaml a char is 8 bits wide, so a Unicode character is
represented by its UTF-8 encoded string. This means we cannot represent
Unicode character sets in the ocamllex lexer definition. We can use string
literals in some circumstances.
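
The byte-level view can be illustrated with a small sketch. This is not the backend's actual code: @literal@ and @alts@ are hypothetical helper names, and the rendering rule (a char literal for a single byte, a string literal otherwise) is an assumption chosen to show why a multi-byte character does not fit in a char.

```ocaml
(* Hypothetical helper, not the backend's actual code: render one
   alternative as an ocamllex literal. A single byte becomes a char
   literal; anything longer (e.g. a UTF-8 encoded Unicode character)
   becomes a string literal. *)
let literal (s : string) : string =
  if String.length s = 1
  then Printf.sprintf "'%s'" (Char.escaped s.[0])
  else Printf.sprintf "\"%s\"" (String.escaped s)

(* Join the rendered alternatives with ocamllex's alternation operator. *)
let alts (xs : string list) : string =
  String.concat " | " (List.map literal xs)

(* U+03BB (Greek small letter lambda) is the two bytes 0xCE 0xBB in UTF-8,
   so it cannot be held in a single 8-bit char. *)
let lambda = "\xCE\xBB"

let () =
  assert (String.length lambda = 2);
  print_endline (alts ["a"; lambda])
```

Under these assumptions the ASCII letter is rendered as a char literal, while the lambda is rendered as a two-byte string literal with decimal escapes.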
For that reason, RAlts is now translated to a disjunction of char or
string literals (the latter for Unicode chars) rather than to a
Instead of @l@, use name @_letter@ for predefined character class
The included test makes sure the new names cannot clash with user-defined
token names. Previously, the user token type L would be translated to @l@,
clashing with the predefined letter character class.
In most backends' lexer generators,
isn't translated correctly to a regular expression.