empty string with non-empty trailing context consumes code units #116

@skvadrik

Given the following code:

/*!re2c
    "" / "a" {}
*/

re2c generates:

{
        YYCTYPE yych;
        YYCTXMARKER = YYCURSOR + 1;
        if (YYLIMIT <= YYCURSOR) YYFILL(1);
        yych = *YYCURSOR;
        switch (yych) {
        case 'a':       goto yy3;
        default:        goto yy2;
        }
yy2:
yy3:
        ++YYCURSOR;
        YYCURSOR = YYCTXMARKER;
        {}
}

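On strings starting with "a", this lexer consumes a single code unit ('a') and leaves YYCURSOR pointing at the code unit after it: YYCTXMARKER is set to YYCURSOR + 1 before any input is read, and YYCURSOR is later restored from it. The rule matches the empty string, so YYCURSOR should not advance at all. This is clearly not what was intended.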
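
The effect can be checked with a small harness. This is only a sketch under stated assumptions: the input is NUL-terminated, so YYFILL is left out, and the function names `lex_generated` and `lex_expected` are hypothetical. `lex_generated` copies the generated code above; `lex_expected` is a hand-written guess at the intended behaviour, not the output of any fixed re2c version. Both return the number of code units consumed.

```c
#include <stddef.h>

/* Mirrors the generated code from this report.
 * Assumption: input is NUL-terminated, so the YYFILL check is omitted. */
static size_t lex_generated(const char *input)
{
    const char *YYCURSOR = input;
    const char *YYCTXMARKER;
    char yych;

    YYCTXMARKER = YYCURSOR + 1;   /* marker already points one past the start */
    yych = *YYCURSOR;
    switch (yych) {
    case 'a':       goto yy3;
    default:        goto yy2;
    }
yy2:
yy3:
    ++YYCURSOR;
    YYCURSOR = YYCTXMARKER;       /* "restores" to start + 1 */
    return (size_t)(YYCURSOR - input);
}

/* Hypothetical sketch of the intended behaviour: the empty match ends at
 * the start, so the context marker is taken before reading 'a'. */
static size_t lex_expected(const char *input)
{
    const char *YYCURSOR = input;
    const char *YYCTXMARKER = YYCURSOR;

    if (*YYCURSOR == 'a') {
        ++YYCURSOR;               /* step over the trailing context */
        YYCURSOR = YYCTXMARKER;   /* then give it back */
    }
    return (size_t)(YYCURSOR - input);
}
```

Called on "abc", `lex_generated` reports one code unit consumed, while `lex_expected` reports none.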

Lookahead is a useful feature and this should be fixed.
