Robustify - Improved/Extended Lexing Behavior & Error Handling
Pre-release
This patch was authored and released by @tdotclare.
Major Changes
- Converts LeafError from Enum to source-reporting Struct
- Adjusts Lexer state change table for human clarity
- Adds Lexer robustness for parameters with no whitespace boundaries
- Adds Lexer parameter handling for bin/oct/hex constant Ints and hex Doubles
- Adds Lexer internal methods for additional checking of peek/pop sequences
- Makes Parameter depth a state variable in Lexer instead of an enum property
- Adjusts Character extensions for additional granularity on token types
- Adjusts Character extensions for start/body validity of token types
- Adds Character extensions for bin/oct/hex numerics
- Adjusts Parser behavior to allow decaying tagBodyIndicators to raw when a tag is known to have no body
- Adjusts Parser to allow replacing Lexed tokens when necessary for the above
- Re-enables a number of TestCase functions from Leaf3 and adjusts them for Leaf4 syntax
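To illustrate the first item, here is a minimal sketch of what a "source-reporting" error struct can look like compared to a bare enum. The type name SourceError, its Reason cases, and its fields are hypothetical and chosen for illustration; they are not taken from the actual LeafError definition in this patch.

```swift
// Hypothetical sketch; not the actual LeafError definition.
struct SourceError: Error, CustomStringConvertible {
    // What went wrong. These cases are illustrative only.
    enum Reason {
        case unknownError(String)
        case unterminatedParameters
    }

    let reason: Reason
    // Where in the template source the problem was detected.
    let file: String
    let line: Int
    let column: Int

    var description: String {
        let what: String
        switch reason {
        case .unknownError(let message): what = message
        case .unterminatedParameters: what = "unterminated parameter list"
        }
        return "\(file):\(line):\(column): \(what)"
    }
}

let error = SourceError(reason: .unterminatedParameters,
                        file: "index.leaf", line: 3, column: 12)
print(error.description)
```

Wrapping the reason in a struct lets every error carry its location without repeating those fields on each enum case.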
Problems Solved
Better/clearer error handling properties
Much easier to follow state changes during lexing
In most cases, that is: Parameter processing is still complicated, but the other cases are clearly handled.
Better Parameter handling with whitespace
E.g., the varying inputs below now all lex to the correct interpretation. Previously, the first three would lex incorrectly, and only the fourth would correctly lex the parameters to
operator(not) variable(one) operator(||) operator(not) variable(two)
"#if(!one||!two)"
"#if(!one || !two)"
"#if(! one||! two)"
"#if(! one || ! two)"
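The idea behind whitespace-independent parameter lexing can be sketched as follows. This is a simplified stand-in, not the Lexer from this patch: ParamToken and lexParameters are hypothetical names, only the contents of the parentheses are lexed, and operators are kept as spelled rather than being mapped to names such as not.

```swift
// Hypothetical token type; not LeafKit's API.
enum ParamToken: Equatable {
    case op(String)        // operator such as "!" or "||"
    case variable(String)  // identifier such as "one"
}

// Decide token boundaries from the characters themselves, so whitespace
// is only skipped and never required between an operator and its operand.
func lexParameters(_ source: String) -> [ParamToken] {
    let chars = Array(source)
    var tokens: [ParamToken] = []
    var i = 0
    while i < chars.count {
        let c = chars[i]
        if c.isWhitespace {
            i += 1
        } else if c.isLetter || c == "_" {
            // Consume a whole identifier.
            var j = i
            while j < chars.count,
                  chars[j].isLetter || chars[j].isNumber || chars[j] == "_" {
                j += 1
            }
            tokens.append(.variable(String(chars[i..<j])))
            i = j
        } else if i + 1 < chars.count,
                  ["||", "&&"].contains(String(chars[i...(i + 1)])) {
            // Two-character operators are checked before single-character ones.
            tokens.append(.op(String(chars[i...(i + 1)])))
            i += 2
        } else {
            tokens.append(.op(String(c)))
            i += 1
        }
    }
    return tokens
}

let inputs = ["!one||!two", "!one || !two", "! one||! two", "! one || ! two"]
let results = inputs.map(lexParameters)
```

Under these assumptions, all four spellings produce the same five tokens.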
Better handling of tagBodyIndicator
Previously, syntax like #(index):#(value) would error because the colon was universally assumed to indicate the start of a body. Now, in cases where it's impossible for a tag to take a body (e.g., anonymous functions, for now), the tagBodyIndicator (tBI) is mutated back to a raw colon. The next step in improving this is to add observers on tags and built-in control structures so that parsing can inquire about the expected state (e.g., a function may take two parameters and no body, or one parameter and a body, and both are acceptable).
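The decay step described above can be sketched as a pass over already-lexed tokens. The LexedToken type, the canTakeBody rule, and decayBodyIndicators are hypothetical simplifications for illustration; the real Parser works on LeafKit's own token types and may decide differently.

```swift
// Hypothetical token model; not LeafKit's real types.
enum LexedToken: Equatable {
    case tag(String)        // "" stands for an anonymous tag such as #(index)
    case parameters(String)
    case tagBodyIndicator   // a ":" lexed directly after a tag
    case raw(String)
}

// Assumption for this sketch: only anonymous tags can never take a body.
func canTakeBody(_ tagName: String) -> Bool {
    return !tagName.isEmpty
}

// Replace a tagBodyIndicator with a raw ":" when the tag it follows
// is known to have no body.
func decayBodyIndicators(_ tokens: [LexedToken]) -> [LexedToken] {
    var result = tokens
    var currentTag: String? = nil
    for (index, token) in tokens.enumerated() {
        switch token {
        case .tag(let name):
            currentTag = name
        case .parameters:
            break  // parameters belong to the current tag
        case .tagBodyIndicator:
            if let name = currentTag, !canTakeBody(name) {
                result[index] = .raw(":")
            }
            currentTag = nil
        case .raw:
            currentTag = nil
        }
    }
    return result
}

// Token stream for: #(index):#(value)
let lexed: [LexedToken] = [
    .tag(""), .parameters("index"), .tagBodyIndicator,
    .tag(""), .parameters("value"),
]
let decayed = decayBodyIndicators(lexed)
```

In this sketch the colon between the two anonymous tags becomes raw text, while a colon after a named tag such as if is left as a body indicator.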
Improved syntax options for constant numerics
You can now specify bin/oct/hex Int and hex Double constants as Swift-style literals, e.g.:
0b1111 // Constant Int
1_000_000 // Constant Int
0x0.50 // Constant Double
Why? Why not?
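As a rough sketch of how literals like the ones above can be interpreted, the helpers below strip underscore separators, pick a radix from the prefix, and read hex fraction digits as base-16 places. The function names are hypothetical, and the reading of 0x0.50 as 5/16 is an assumption about the intended semantics, not something confirmed by the patch.

```swift
// Hypothetical helpers; not part of LeafKit.
func parseConstantInt(_ literal: String) -> Int? {
    // Underscores are purely visual separators.
    let cleaned = literal.filter { $0 != "_" }
    let prefixes: [(String, Int)] = [("0b", 2), ("0o", 8), ("0x", 16)]
    for (prefix, radix) in prefixes where cleaned.hasPrefix(prefix) {
        return Int(cleaned.dropFirst(prefix.count), radix: radix)
    }
    return Int(cleaned, radix: 10)
}

func parseHexDouble(_ literal: String) -> Double? {
    let cleaned = literal.filter { $0 != "_" }
    guard cleaned.hasPrefix("0x") else { return nil }
    let parts = cleaned.dropFirst(2)
        .split(separator: ".", omittingEmptySubsequences: false)
    guard parts.count == 2, !parts[1].isEmpty,
          let whole = Int(parts[0], radix: 16) else { return nil }
    var value = Double(whole)
    var scale = 1.0 / 16.0
    // Assumption: each fraction digit is a successive base-16 place.
    for digit in parts[1] {
        guard let d = digit.hexDigitValue else { return nil }
        value += Double(d) * scale
        scale /= 16.0
    }
    return value
}

let binary = parseConstantInt("0b1111")
let grouped = parseConstantInt("1_000_000")
let hexFraction = parseHexDouble("0x0.50")
```

Both helpers return nil for malformed input rather than trapping, which suits a lexer that wants to report the error with source location.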