Get rid of (negative) lookahead (somehow) #14
eddyb added the help wanted label Aug 21, 2018
Found a solution while talking to @eternaleye, and it involves #27:
// Existing Rust grammar:
ForLoop = "for" Pat "in" Expr Block;
// Scannerless CFG:
ForLoop = "for" {
    Pat(/*allow_ident_left=*/"0", /*allow_ident_right=*/"0") |
    // NB: WS is *mandatory* whitespace here:
    WS Pat(/*allow_ident_left=*/"1", /*allow_ident_right=*/"0") |
    Pat(/*allow_ident_left=*/"0", /*allow_ident_right=*/"1") WS |
    WS Pat(/*allow_ident_left=*/"1", /*allow_ident_right=*/"1") WS
} "in" {
    Expr(/*allow_ident_left=*/"0", /*allow_ident_right=*/"1") |
    WS Expr(/*allow_ident_left=*/"1", /*allow_ident_right=*/"1")
} WS? Block;
Note that this is a first approximation, and we could perhaps find some sugar for it, especially to avoid having to write […]. One possibility could be […]. EDIT: We probably don't need that […]
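The boundary rule that the alternatives above encode can be sketched in Rust. This is only an illustration under one reading of the flags: that `allow_ident_left` / `allow_ident_right` say whether the nonterminal may begin or end with an identifier character, in which case mandatory whitespace must separate it from an adjacent keyword. The helper names `is_ident_char` and `boundary_ok` are hypothetical and do not come from the grammar tool discussed in this issue.

```rust
/// Assumed set of identifier characters (an assumption for this sketch).
fn is_ident_char(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_'
}

/// Hypothetical check: may `left` be followed by `right`, given whether
/// whitespace separates them? Two identifier characters may never touch,
/// otherwise `for` + `x` would read as the single identifier `forx`.
fn boundary_ok(left: &str, has_ws: bool, right: &str) -> bool {
    let left_ident = left.chars().next_back().map_or(false, is_ident_char);
    let right_ident = right.chars().next().map_or(false, is_ident_char);
    has_ws || !(left_ident && right_ident)
}
```

Under these assumptions, `boundary_ok("for", false, "x")` is rejected while `boundary_ok("for", false, "(a, b)")` is accepted, which mirrors why only the `allow_ident_left="1"` alternatives carry a leading `WS`.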
eddyb removed the help wanted label Aug 25, 2018
While we might be able to implement the machinery to allow writing the explicit version sooner, it's becoming increasingly clear how unergonomic it would be to write such a grammar. A good middle ground could be taking lookaround and propagating it up the grammar, requiring that it cleanly "intersects" with existing terminals and "dissolves" away, leaving a CFG behind.
eddyb commented Aug 21, 2018
[a-zA-Z][a-zA-Z0-9]* in regex) is not followed by more identifier characters ((?![a-zA-Z0-9]) in some regex dialects), forcing the rule to match the longest valid identifier (just like most regex semantics out there)
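To make the effect of that negative lookahead concrete, here is a minimal Rust sketch using only the standard library. It first enumerates every end offset at which `[a-zA-Z][a-zA-Z0-9]*` matches (what a scannerless grammar without lookahead would accept), then filters with the `(?![a-zA-Z0-9])` condition. The function names are hypothetical and are not part of any library mentioned in this issue.

```rust
/// Characters that may continue an identifier, per the regex above.
fn is_ident_continue(c: char) -> bool {
    c.is_ascii_alphanumeric()
}

/// All end offsets at which `[a-zA-Z][a-zA-Z0-9]*` matches a prefix of `input`.
/// Without lookahead, every one of these is a valid (ambiguous) parse.
fn ident_ends(input: &str) -> Vec<usize> {
    let mut ends = Vec::new();
    let mut chars = input.char_indices();
    match chars.next() {
        Some((_, c)) if c.is_ascii_alphabetic() => ends.push(c.len_utf8()),
        _ => return ends,
    }
    for (i, c) in chars {
        if !is_ident_continue(c) {
            break;
        }
        ends.push(i + c.len_utf8());
    }
    ends
}

/// Keep only the ends that pass the negative lookahead, i.e. the next
/// character (if any) is not an identifier character.
fn ident_ends_with_lookahead(input: &str) -> Vec<usize> {
    ident_ends(input)
        .into_iter()
        .filter(|&end| !input[end..].starts_with(is_ident_continue))
        .collect()
}
```

For the input `"foo1 bar"`, `ident_ends` yields four candidate ends (`f`, `fo`, `foo`, `foo1`), while `ident_ends_with_lookahead` keeps only the longest one, which is the ambiguity the lookahead is there to remove.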