Description
Firstly, thanks a lot for this tool. It has saved me a lot of time! I am using re2c to create a parser for an as-yet unpublished build tool. The input files are UTF-8 encoded. Everything works fine for the ASCII character set.
However, I'd like to expand my identifier space to allow Unicode letters in addition to [a-zA-Z]. As far as I can see, the only way to do this currently is to write a parser for UnicodeData.txt that grabs all code points in the letter categories and dumps them into a giant character class. That works, but now I have a generator for a generator for C++. It seems like this kind of Unicode character class functionality would be more naturally supported directly in re2c itself.
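For illustration, the workaround described above could be sketched roughly as follows. This is only a sketch under assumptions: the function names (`letter_ranges`, `escape`, `re2c_class`) and the definition name `letter` are made up for the example, "letter" is taken to mean any general category starting with `L`, and the output assumes that re2c character classes accept `\uXXXX` and `\UXXXXXXXX` escapes.

```python
def letter_ranges(lines):
    """Collect merged (first, last) code point ranges for letter categories.

    `lines` is an iterable of UnicodeData.txt lines: semicolon-separated
    fields, with the code point in field 0, the name in field 1 and the
    general category in field 2.
    """
    ranges = []
    range_start = None
    for line in lines:
        fields = line.strip().split(";")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        cp = int(fields[0], 16)
        name, category = fields[1], fields[2]
        # Large blocks are stored as a "<..., First>" / "<..., Last>" pair.
        if name.endswith(", First>"):
            range_start = cp
            continue
        if name.endswith(", Last>") and range_start is not None:
            first, range_start = range_start, None
        else:
            first = cp
        if not category.startswith("L"):  # Lu, Ll, Lt, Lm, Lo
            continue
        # Merge with the previous range when the code points are adjacent.
        if ranges and first <= ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], max(ranges[-1][1], cp))
        else:
            ranges.append((first, cp))
    return ranges


def escape(cp):
    """Format a code point as a \\uXXXX or \\UXXXXXXXX escape."""
    return f"\\u{cp:04X}" if cp <= 0xFFFF else f"\\U{cp:08X}"


def re2c_class(name, ranges):
    """Render the ranges as a named character class definition."""
    parts = [escape(a) if a == b else f"{escape(a)}-{escape(b)}"
             for a, b in ranges]
    return f"{name} = [{''.join(parts)}];"


if __name__ == "__main__":
    # Assumes UnicodeData.txt has been downloaded next to this script.
    with open("UnicodeData.txt", encoding="ascii") as f:
        print(re2c_class("letter", letter_ranges(f)))
```

The printed definition would then be pasted (or included) into the re2c source and used in the identifier rule, with the lexer generated in UTF-8 mode (the `-8` option). This is exactly the extra generator layer that built-in support would make unnecessary.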
I was somewhat surprised this was not already supported, so I went looking for these classes in re2c and could not find them. Apologies if this is already supported and my grep-powers were insufficient.
Thanks!