A DAIDE parser built on Parsimonious. Parsimonious is a Python package that uses "a simplified sort of EBNF notation" to define a grammar. The parser currently supports DAIDE press levels up to level 130.
- The original DAIDE specification is here
- The working Markdown document that will include DAIDE enhancements is here
- The machine-parsable grammar can be found in this `.py` file
Using the grammar and grammar utils in `grammar`, you can create a parse tree from a DAIDE press message or reply. The nodes of the parse tree can be visited to return something more useful. The visiting rules for each node of the parse tree are defined in `visitor.py`.
Example:
>>> from daidepp import create_daide_grammar, daide_visitor
>>> grammar = create_daide_grammar(level=130)
>>> message = 'PRP (AND (SLO (ENG)) (SLO (GER)) (SLO (RUS)))'
>>> parse_tree = grammar.parse(message)
>>> output = daide_visitor.visit(parse_tree) # object composed of dataclass objects in keywords.py
>>> str(output)
'PRP ( AND ( SLO ( ENG ) ) ( SLO ( GER ) ) ( SLO ( RUS ) ) )'
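To make the parse-then-visit idea concrete without installing anything, here is a minimal standard-library sketch. It is not how daidepp or Parsimonious work internally: it only splits on parentheses and does no grammar checking, and the names `tokenize`, `parse`, and `visit` are hypothetical helpers introduced for illustration.

```python
import re
from typing import List, Union

# A tree node is either a bare token (str) or a list of child nodes.
Node = Union[str, List["Node"]]


def tokenize(message: str) -> List[str]:
    """Split a DAIDE-like message into parentheses and uppercase tokens."""
    return re.findall(r"[()]|[A-Z0-9_]+", message)


def parse(tokens: List[str]) -> Node:
    """Build a nested-list tree; each parenthesized group becomes a sub-list."""
    stack: List[List[Node]] = [[]]
    for tok in tokens:
        if tok == "(":
            stack.append([])
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            group = stack.pop()
            stack[-1].append(group)
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]


def visit(node: Node, top: bool = True) -> str:
    """Walk the tree and render it with normalized spacing."""
    if isinstance(node, str):
        return node
    inner = " ".join(visit(child, top=False) for child in node)
    return inner if top else f"( {inner} )"


tree = parse(tokenize("PRP (AND (SLO (ENG)) (SLO (GER)) (SLO (RUS)))"))
rendered = visit(tree)
print(rendered)
```

The real visitor returns dataclass objects rather than a string, but the shape is the same: one pass builds a tree, and a second pass turns each node into something more useful.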
The daide_visitor outputs a dataclass that can be used to access useful information about the message.
>>> from daidepp import create_daide_grammar, daide_visitor
>>> grammar = create_daide_grammar(level=130, string_type="arrangement") # allows for messages without PRP
>>> parse_tree = grammar.parse('ALY (GER FRA) VSS (TUR ITA)')
>>> output = daide_visitor.visit(parse_tree)
>>> str(output)
'ALY ( FRA GER ) VSS ( ITA TUR )'
>>> output.aly_powers
('FRA', 'GER')
>>> output.vss_powers
('ITA', 'TUR')
If a DAIDE token is not in the grammar, or if the message is malformed, the parser raises an exception. We are currently working on returning a list of unrecognized tokens instead of raising.
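Until that lands, callers that handle untrusted press may want to wrap the parse call. The following is a minimal sketch, assuming only that the grammar object exposes a `parse(message)` method that raises on failure. `try_parse` and `_StubGrammar` are hypothetical names, and the stub stands in for a real grammar so the snippet runs without daidepp installed.

```python
from typing import Any, Optional, Tuple


def try_parse(grammar: Any, message: str) -> Tuple[Optional[Any], Optional[str]]:
    """Return (parse_tree, None) on success or (None, error_text) on failure."""
    try:
        return grammar.parse(message), None
    except Exception as exc:
        # Broad on purpose for the sketch; narrow this to the parser's own
        # exception type once you know which one your version raises.
        return None, f"{type(exc).__name__}: {exc}"


class _StubGrammar:
    """Hypothetical stand-in for a grammar object; not part of daidepp."""

    def parse(self, message: str) -> str:
        if not message.startswith("PRP"):
            raise ValueError(f"cannot parse {message!r}")
        return f"<tree for {message}>"


tree, error = try_parse(_StubGrammar(), "PRP (PCE (AUS ENG))")
bad_tree, bad_error = try_parse(_StubGrammar(), "XYZ (AUS)")
```

With a real grammar, pass the object returned by `create_daide_grammar` in place of the stub.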
In addition, DAIDE strings can be constructed using the classes in `base_keywords.py` and `press_keywords.py`. Each class has type hints that indicate the parameters that should be used.
Example:
>>> from daidepp import AND, PCE
>>> str(AND(PCE("AUS", "ENG"), PCE("AUS", "ENG"), PCE("AUS", "ENG", "FRA")))
'AND ( PCE ( AUS ENG ) ) ( PCE ( AUS ENG FRA ) )'
Each keyword class takes different parameters for instantiation, so it is recommended to follow the type hints carefully or check out `tests/keywords`, which provides examples for each class.
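For readers curious how a keyword class can turn typed fields into a DAIDE string, here is a rough sketch using plain dataclasses. `PeaceSketch` and `AndSketch` are hypothetical and are not daidepp's `PCE` and `AND`: they take a tuple rather than positional arguments, and the sorting of powers is an assumption based on the sorted output shown in the examples above.

```python
from dataclasses import dataclass
from typing import Tuple

# The seven standard Diplomacy powers as DAIDE tokens.
POWERS = frozenset({"AUS", "ENG", "FRA", "GER", "ITA", "RUS", "TUR"})


@dataclass(frozen=True)
class PeaceSketch:
    """Hypothetical stand-in for a peace keyword class; not daidepp's PCE."""

    powers: Tuple[str, ...]

    def __post_init__(self) -> None:
        unknown = [p for p in self.powers if p not in POWERS]
        if unknown:
            raise ValueError(f"unknown powers: {unknown}")
        # Assumption: powers are stored in sorted order.
        object.__setattr__(self, "powers", tuple(sorted(self.powers)))

    def __str__(self) -> str:
        return "PCE ( " + " ".join(self.powers) + " )"


@dataclass(frozen=True)
class AndSketch:
    """Hypothetical stand-in for a conjunction of arrangements."""

    arrangements: Tuple[PeaceSketch, ...]

    def __str__(self) -> str:
        return "AND " + " ".join(f"( {a} )" for a in self.arrangements)


message = str(
    AndSketch((PeaceSketch(("ENG", "AUS")), PeaceSketch(("AUS", "ENG", "FRA"))))
)
print(message)
```

Keeping validation in `__post_init__` means a malformed message fails at construction time rather than when it is sent.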
A grammar can also be created from a subset of press keywords. The list of press keywords can be found in `constants.py` under `PressKeywords`.
Example:
>>> from daidepp.grammar.grammar_utils import create_grammar_from_press_keywords
>>> grammar = create_grammar_from_press_keywords(["PRP", "XDO", "ALY_VSS"])
>>> grammar.parse("PRP (ALY (ITA TUR) VSS (ENG RUS))")
>>> grammar.parse("PRP(XDO((ENG FLT EDI) SUP (ENG AMY LVP) MTO CLY))")
>>> grammar.parse("PRP(PCE (AUS ENG))") # this would fail
Note: Because of the way Parsimonious simplifies grammars, there may be some edge cases where the given list of press keywords does not result in a grammar object with the correct order of keywords (i.e. it may fail to parse even when the message is valid). If this happens, try providing the function with a few more keywords.
Three files, plus the Markdown specification, should be updated whenever making a PR:
- `grammar.py`: the machine-readable grammar
- `visitor.py`: the visitor object to parse a message
- `press_keywords.py`: DAIDE press keyword objects
- The DAIDE Markdown specification: the human-readable specification