syntax: refactor how escaped newlines are handled
We used to treat escaped newlines like any other escaped character in the lexer. The parser would then deal with them as needed.

This produced inconsistent syntax trees. For example, an escaped newline within double quotes should be skipped as per the POSIX Shell spec, but we would include it as part of a literal. The expand package was then in charge of getting rid of it.

Instead, treat escaped newlines specially in the lexer, conforming with the spec. Most code gets simpler, particularly in the parser, as it no longer has to skip and discard bytes. Two cases want to keep escaped newlines (single quotes and comments), but that's a simple append.

Double-quoted strings now represent an escaped newline by placing the parts on either side of it on different lines, even if the two parts are just *Lit. This is enough information for the printer to do the right thing.

Besides simplifying the lexer and parser code, this brings the syntax tree closer to what people would expect when expanding or interpreting it.

Finally, this requires adding another special rune to the lexer. We were already using RuneSelf as an EOF rune, so RuneSelf+1 now represents an escaped newline.

Fixes #321.
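For illustration only, here is a minimal sketch of the sentinel-rune idea, not the actual code from this change. The names lexer, next, collect, eofRune and escNewlineRune are hypothetical, and the toy reader assumes ASCII-only input so that every real byte stays below utf8.RuneSelf. The reader collapses a backslash followed by a newline into a single sentinel rune; the caller then either drops it (double quotes, unquoted words) or appends it back verbatim (single quotes, comments).

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// Hypothetical sentinel values, modelled on the description above.
// ASCII bytes are always below utf8.RuneSelf, so neither value can
// collide with real input in this toy reader.
const (
	eofRune        rune = utf8.RuneSelf     // end of input
	escNewlineRune rune = utf8.RuneSelf + 1 // backslash followed by newline
)

// lexer is a toy byte-oriented reader. Assumption: src is ASCII-only.
type lexer struct {
	src     string
	pos     int
	skipEsc bool // the next byte is escaped by a preceding backslash
}

// next returns the next rune, collapsing a backslash-newline pair into
// escNewlineRune so that callers never have to skip bytes themselves.
func (l *lexer) next() rune {
	if l.pos >= len(l.src) {
		return eofRune
	}
	b := l.src[l.pos]
	l.pos++
	if l.skipEsc {
		// This byte was escaped, e.g. the second backslash in `\\`.
		l.skipEsc = false
		return rune(b)
	}
	if b == '\\' && l.pos < len(l.src) {
		if l.src[l.pos] == '\n' {
			l.pos++
			return escNewlineRune
		}
		l.skipEsc = true
	}
	return rune(b)
}

// collect reads all of src. Escaped newlines are dropped when keep is
// false, and appended back verbatim when keep is true.
func collect(src string, keep bool) string {
	l := &lexer{src: src}
	var sb strings.Builder
	for {
		switch r := l.next(); r {
		case eofRune:
			return sb.String()
		case escNewlineRune:
			if keep {
				sb.WriteString("\\\n")
			}
		default:
			sb.WriteRune(r)
		}
	}
}

func main() {
	src := "foo\\\nbar" // foo, backslash, newline, bar
	fmt.Printf("%q\n", collect(src, false)) // "foobar"
	fmt.Printf("%q\n", collect(src, true))  // "foo\\\nbar"
}
```

Note that an escaped backslash followed by a newline is not an escaped newline; the skipEsc flag covers that case, so the newline survives as a real one.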
Showing with 44 additions and 52 deletions.