This repository has been archived by the owner on May 28, 2019. It is now read-only.
If Gherkin3 is going to use this project as a template, we have to make sure we can scan UTF-8 encoded input, since many Gherkin translations rely on Unicode characters outside ASCII.
A simple way to do this is to create a utf8 branch where we change && (AND) to øø everywhere, both in lexer definitions and in tests. If everything passes, we're fine; if not, we have a problem.
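To show why the øø substitution is a meaningful test, here is a minimal sketch in Python. It is not this project's lexer: `find_token` is a hypothetical stand-in for a byte-oriented scanner. It only shows that øø becomes a four-byte sequence in UTF-8, so a scanner that matches bytes has to match all four in order.

```python
# Hypothetical stand-in for a byte-oriented scanner; not the project's lexer.
AND_TOKEN = "øø".encode("utf-8")  # each ø encodes to two bytes in UTF-8


def find_token(source: bytes, token: bytes) -> int:
    """Return the byte offset of the first occurrence of token, or -1."""
    return source.find(token)


line = "a øø b".encode("utf-8")
offset = find_token(line, AND_TOKEN)

print(len(AND_TOKEN))  # number of bytes the two-character token occupies
print(offset)          # byte offset where the token starts
```

If the lexer definitions are written as UTF-8 and the generated scanner compares raw bytes, the token should be recognised the same way a two-byte && would be.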
…rors are off since flex counts bytes, not characters. We can live with that since line number error reporting will be good enough for gherkin. Ref #38
I have verified that with Ragel, multi-byte characters (such as å) work fine for recognition, but it puts the firstColumn and lastColumn values off, since they are based on ts and te, which seem to be counting bytes, not characters. This is not a huge problem since we're only likely to be using line numbers in error reporting anyway.
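If character-based columns are ever needed, the byte offsets could be converted after the fact. The sketch below is in Python rather than Ragel's host language, and `byte_to_char_column` is a hypothetical helper, not part of this project. It assumes ts is available as a 0-based byte offset into the current line and that the offset falls on a character boundary.

```python
def byte_to_char_column(line: bytes, byte_offset: int) -> int:
    """Count the characters that precede byte_offset in a UTF-8 encoded line.

    Assumes byte_offset lies on a character boundary; decoding raises
    UnicodeDecodeError if it cuts a multi-byte character in half.
    """
    return len(line[:byte_offset].decode("utf-8"))


line = "Når jeg".encode("utf-8")
byte_offset = line.find(b"jeg")  # what a byte-counting scanner would report
char_column = byte_to_char_column(line, byte_offset)

print(byte_offset, char_column)  # the two differ by one because å is two bytes
```

This costs one decode of the line prefix per lookup, which is probably acceptable if it only runs when an error is reported.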