Given the following code:
re2c generates:
{
    YYCTYPE yych;
    YYCTXMARKER = YYCURSOR + 1;
    if (YYLIMIT <= YYCURSOR) YYFILL(1);
    yych = *YYCURSOR;
    switch (yych) {
    case 'a': goto yy3;
    default: goto yy2;
    }
yy2:
yy3:
    ++YYCURSOR;
    YYCURSOR = YYCTXMARKER;
    {}
}
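On strings starting with "a", this lexer consumes a single code unit ('a') and leaves YYCURSOR pointing at the code unit after 'a': YYCTXMARKER is set to YYCURSOR + 1 before anything is matched, and YYCURSOR is later restored to that value. This is clearly not what was intended.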
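To make the behaviour easy to reproduce, here is a minimal sketch that wraps the generated code in a function and reports how many code units were consumed. The function name consumed_by_lexer, the pointer-based definitions of YYCURSOR, YYLIMIT and YYCTXMARKER, and the YYFILL stub that simply returns are all assumptions made for this harness. Only the inner block is taken from the generated code above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical harness: everything outside the inner block is an assumption. */
typedef unsigned char YYCTYPE;

/* Assumed stub: no refill is available, so give up and report 0 consumed. */
#define YYFILL(n) return 0

/* Runs the generated lexer once and returns the number of code units
   by which YYCURSOR advanced. */
size_t consumed_by_lexer(const char *input, size_t len)
{
    assert(input != NULL);
    const YYCTYPE *start = (const YYCTYPE *)input;
    const YYCTYPE *YYCURSOR = start;
    const YYCTYPE *YYLIMIT = start + len;
    const YYCTYPE *YYCTXMARKER;

    /* Generated code, unchanged. */
    {
        YYCTYPE yych;
        YYCTXMARKER = YYCURSOR + 1;
        if (YYLIMIT <= YYCURSOR) YYFILL(1);
        yych = *YYCURSOR;
        switch (yych) {
        case 'a': goto yy3;
        default: goto yy2;
        }
yy2:
yy3:
        ++YYCURSOR;
        YYCURSOR = YYCTXMARKER;
        {}
    }

    return (size_t)(YYCURSOR - start);
}

#undef YYFILL
```

With this harness, consumed_by_lexer("ab", 2) returns 1, which matches the behaviour described above: YYCURSOR ends up one code unit past the start of the input.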
Lookahead is a useful feature and this should be fixed.