Commit
Add Tatsu to have a real PEG parser from the EBNF grammar (#194)
Add 竜 TatSu as a dependency. This gives us a real PEG parser instead of a combination of regexes and string splitting. Fix parsing of quoted values as well as escaped semicolons. This fixes #185 and fixes #193. Note: adding TatSu might have made the parser significantly slower in some cases.
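To illustrate the kind of input that plain string splitting mishandles, here is a minimal Python sketch. It is not the project's code: the sample line, `split_unquoted`, and the variable names are hypothetical, chosen only to show why a quoted parameter value containing a semicolon needs quote-aware handling.

```python
# Hypothetical content line: the CN parameter is quoted and contains ';'.
line = 'ATTENDEE;CN="Doe; John":mailto:john@example.com'

# Naive approach: cut at the first ':' and split the head on ';'.
head, _, value = line.partition(":")
naive_parts = head.split(";")  # the quoted value is wrongly cut in two


def split_unquoted(text, sep):
    """Split text on sep, ignoring separators inside double quotes."""
    parts, current, in_quotes = [], [], False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes
            current.append(ch)
        elif ch == sep and not in_quotes:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


print(naive_parts)
print(split_unquoted(head, ";"))
```

The naive split yields three pieces because it cuts inside the quotes, while the quote-aware scan keeps `CN="Doe; John"` together. A grammar-driven parser handles this case by construction rather than through special-casing.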
1 parent 61453e4, commit 6b71a49
Showing 7 changed files with 85 additions and 40 deletions.
@@ -0,0 +1,31 @@
@@grammar::contentline
@@whitespace :: //
start = contentline $ ;
ALPHA = ?"[a-zA-Z]" ;
DIGIT = ? "[0-9]" ;
CRLF = "\r\n" ;
WSP = " ";
DQUOTE = '"' ;
QSAFE_CHAR = WSP | ?"\x21" | ?"[\x23-\x7E]" | ?"[\u0080-\uffff]";
SAFE_CHAR = WSP | ?"\x21" | ?"[\x23-\x2B]" | ?"[\x2D-\x39]" | ?"[\x3C-\x7E]" | ?"[\u0080-\uffff]" ;
VALUE_CHAR = WSP | ?"[\x21-\x7E]" | ?"[\u0080-\uffff]";
name = iana_token | x_name ;
iana_token = {(ALPHA | DIGIT | "-")}+ ;
x_name = "X-" [vendorid "-"] {(ALPHA | DIGIT | "-")}+ ;
vendorid = (ALPHA | DIGIT) (ALPHA | DIGIT) {(ALPHA | DIGIT)}+ ;
contentline = name:name {(";" params+:param )}* ":" value:value CRLF ;
param = name:param_name "=" values+:param_value {("," values+:param_value)}* ;
param_name = iana_token | x_name ;
param_value = quoted_string | paramtext ;
paramtext = {SAFE_CHAR}* ;
value = {VALUE_CHAR}* ;
quoted_string = DQUOTE @:{QSAFE_CHAR}* DQUOTE ;
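The `param_value` rule above can be approximated with Python's standard `re` module. This is a hand transcription of the `QSAFE_CHAR` and `SAFE_CHAR` classes for illustration only; it is not TatSu output and not the project's implementation, and `match_param_value` is a hypothetical helper.

```python
import re

# Character classes transcribed from the grammar (WSP is a single space).
QSAFE_CHAR = r"[ \x21\x23-\x7E\u0080-\uffff]"
SAFE_CHAR = r"[ \x21\x23-\x2B\x2D-\x39\x3C-\x7E\u0080-\uffff]"

# quoted_string = DQUOTE @:{QSAFE_CHAR}* DQUOTE  (only the inner text is kept)
QUOTED_STRING = re.compile(rf'"({QSAFE_CHAR}*)"')
# paramtext = {SAFE_CHAR}*
PARAMTEXT = re.compile(rf"{SAFE_CHAR}*")


def match_param_value(text):
    """Return (value, end_offset), trying quoted_string before paramtext.

    The order mirrors the ordered choice in `quoted_string | paramtext`.
    """
    m = QUOTED_STRING.match(text)
    if m:
        return m.group(1), m.end()
    m = PARAMTEXT.match(text)
    return m.group(0), m.end()


print(match_param_value('"Doe; John":rest'))
print(match_param_value("plain;next"))
```

Because `SAFE_CHAR` leaves out `"`, `,`, `:` and `;`, unquoted parameter text stops at the first of those characters, whereas a quoted string may contain `,`, `:` and `;` freely.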
@@ -1,3 +1,4 @@
python-dateutil
arrow>=0.11,<0.12
six>1.5
tatsu>4.2