Parser Debugging and Diagnostics

Paul McGuire edited this page Aug 21, 2019 · 9 revisions

Diagnostic switches

Pyparsing contains the following __diag__ switches:

  • __diag__.warn_multiple_tokens_in_named_alternation
  • __diag__.warn_ungrouped_named_tokens_in_collection
  • __diag__.warn_name_set_on_empty_Forward
  • __diag__.warn_on_multiple_string_args_to_oneof
  • __diag__.enable_debug_on_named_expressions

[comment]: # ( They can be enabled and disabled using:

  • __diag__.enable()
  • __diag__.enable_all_warnings()
  • __diag__.disable() )

All are disabled by default. Enable them selectively to get warnings when your parser uses techniques that may not give the results you expect.

To enable a switch, add code similar to the following to your parser code:

import pyparsing as pp
pp.__diag__.warn_ungrouped_named_tokens_in_collection = True

__diag__.warn_multiple_tokens_in_named_alternation

Enables warnings when a results name is defined on a MatchFirst or Or expression with one or more And subexpressions (only warns if __compat__.collect_all_And_tokens is False).

TBD

__diag__.warn_ungrouped_named_tokens_in_collection

Here is an example of ungrouped named tokens in a collection:

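import pyparsing as pp
from pyparsing import Group, pyparsing_common as ppc

pp.__diag__.warn_ungrouped_named_tokens_in_collection = True

term = ppc.identifier | ppc.number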
# this expression has a results name, and the expressions it
# contains also have results names
eqn = (term("lhs") + '=' + term("rhs"))("eqn")

eqn.runTests("""\
    a = 1000
    """)

The resulting output is:

diag_examples.py:11: UserWarning: warn_ungrouped_named_tokens_in_collection: setting results name 'eqn' on And expression collides with 'rhs' on contained expression
  eqn = (term("lhs") + '=' + term("rhs"))("eqn")

a = 1000
['a', '=', 1000]
- eqn: ['a', '=', 1000]
- lhs: 'a'
- rhs: 1000

Note that all the results names are at the same level, with no hierarchy. If other expressions in this parser also defined 'lhs' or 'rhs' names without grouping, those names would clash, and by default only the last value parsed for each name would be reported.

The resolution for this warning is to Group eqn:

eqn = Group(term("lhs") + '=' + term("rhs"))("eqn")

Which gives this output:

a = 1000
[['a', '=', 1000]]
- eqn: ['a', '=', 1000]
  - lhs: 'a'
  - rhs: 1000

Now 'lhs' and 'rhs' are grouped under 'eqn', and would not be overwritten by other 'lhs' or 'rhs' names in other expressions.
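
As a minimal sketch of what the grouping buys you, the nested names can be reached through the 'eqn' name. The variable names `result`, `lhs`, and `rhs` are only illustrative.

```python
import pyparsing as pp
from pyparsing import pyparsing_common as ppc

term = ppc.identifier | ppc.number
eqn = pp.Group(term("lhs") + "=" + term("rhs"))("eqn")

result = eqn.parseString("a = 1000")

# 'lhs' and 'rhs' now live inside the 'eqn' group, not at the top level.
lhs = result.eqn.lhs
rhs = result.eqn.rhs
print(lhs, rhs)
```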

__diag__.warn_name_set_on_empty_Forward

Enables warnings when a Forward is defined with a results name, but has no contents defined.
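
A minimal sketch of a case that should trigger this warning, assuming a pyparsing release that includes this switch. The names `placeholder` and `value` are hypothetical, and the warning is captured with the standard `warnings` module only so that it can be printed.

```python
import warnings
import pyparsing as pp

pp.__diag__.warn_name_set_on_empty_Forward = True

# A Forward that has not yet been given any contents.
placeholder = pp.Forward()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Setting a results name on the still-empty Forward should emit the warning.
    named = placeholder("value")

messages = [str(w.message) for w in caught]
for message in messages:
    print(message)
```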

TBD

__diag__.warn_on_multiple_string_args_to_oneof

Enables warnings when oneOf is incorrectly called with multiple str arguments. A common mistake is to pass each alternative to oneOf as a separate str argument:

direction = oneOf("left", "right")

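oneOf takes additional optional arguments after the first, so Python will accept this call, but "right" is bound to the second parameter (caseless) instead of being treated as an alternative, and the call generates the wrong expression. The correct form is: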

direction = oneOf("left right")

or

direction = oneOf(["left", "right"])
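
The difference can be checked with a short sketch. It assumes the mistaken call only produces a warning at most and still returns an expression. The variable names are illustrative.

```python
import pyparsing as pp

direction = pp.oneOf("left right")
same_direction = pp.oneOf(["left", "right"])

matched = [direction.parseString(word)[0] for word in ("left", "right")]
matched_from_list = [same_direction.parseString(word)[0] for word in ("left", "right")]

# The mistaken form: "right" is taken as the second argument, not as an alternative.
mistaken = pp.oneOf("left", "right")
try:
    mistaken.parseString("right")
    mistaken_matches_right = True
except pp.ParseException:
    mistaken_matches_right = False

print(matched, matched_from_list, mistaken_matches_right)
```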

__diag__.enable_debug_on_named_expressions

After enabling this switch, all expressions that are defined with names using setName() are automatically enabled for parse-time debugging.
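
A minimal sketch of the effect, assuming the switch is set before setName() is called. The `integer` expression is only an example.

```python
import pyparsing as pp

pp.__diag__.enable_debug_on_named_expressions = True

# setName() is called after the switch is set, so debugging is turned on
# for this expression without an explicit setDebug() call.
integer = pp.Word(pp.nums).setName("integer")

# Parsing now prints match-attempt and match-result messages for "integer".
result = integer.parseString("42")

# Restore the default so expressions named later are not affected.
pp.__diag__.enable_debug_on_named_expressions = False
```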

Use setName(), setDebug(), and traceParseAction to monitor parsing behavior
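
Until this section is written, here is a minimal sketch of the three tools used together. The `to_int` parse action and the `integer` expression are illustrative names, not part of pyparsing.

```python
import pyparsing as pp

# traceParseAction reports each call to the decorated parse action,
# along with its arguments and its return value.
@pp.traceParseAction
def to_int(s, loc, toks):
    return int(toks[0])

# setName() gives the expression a readable name in debug and error messages;
# setDebug() reports each match attempt and its outcome.
integer = pp.Word(pp.nums).setName("integer").setParseAction(to_int).setDebug()

result = integer.parseString("42")
print(result[0])
```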

TBD

Use ParseException.explain() to get more details
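
Until this section is written, here is a minimal sketch, assuming a pyparsing release that provides explain(). It is called through the class, as `ParseException.explain(pe)`, on the assumption that this form works whether a given release defines explain() as a static or an instance method. The `assignment` parser is only an example.

```python
import pyparsing as pp

integer = pp.Word(pp.nums).setName("integer")
assignment = pp.Word(pp.alphas)("name") + "=" + integer("value")

explanation = ""
try:
    assignment.parseString("x = abc")
except pp.ParseException as pe:
    # explain() returns a multi-line string showing the input line,
    # a caret under the failing column, and the exception message.
    explanation = pp.ParseException.explain(pe)

print(explanation)
```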

TBD

Use runTests() to run multiple tests and see where parsers fail to parse
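
Until this section is written, here is a minimal sketch. It relies on runTests() returning a success flag together with a list of (test string, result or exception) pairs. The second test line is deliberately incomplete.

```python
import pyparsing as pp
from pyparsing import pyparsing_common as ppc

term = ppc.identifier | ppc.number
eqn = pp.Group(term("lhs") + "=" + term("rhs"))("eqn")

# Each non-comment line is parsed as a separate test; the report is printed.
success, results = eqn.runTests("""\
    a = 1000
    # missing right-hand side
    b =
    """)

# Collect the test strings whose outcome was an exception rather than results.
failed = [test for test, outcome in results if isinstance(outcome, Exception)]
print(success, failed)
```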

TBD

Use - operator instead of + in selected places in your parser to improve parse error locations
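
Until this section is written, here is a minimal sketch comparing the two operators on the same input. With '-', a failure after that point is raised as a ParseSyntaxException instead of being backtracked over, so the error is reported at the actual problem. The `error_loc` helper and the "item" grammar are hypothetical.

```python
import pyparsing as pp

integer = pp.Word(pp.nums).setName("integer")
LPAR, RPAR = map(pp.Suppress, "()")

# With '+', a bad argument makes the repetition stop quietly before the item.
with_plus = pp.ZeroOrMore(pp.Keyword("item") + (LPAR + integer + RPAR))
# With '-', once "item" has matched, the rest is required.
with_minus = pp.ZeroOrMore(pp.Keyword("item") - (LPAR + integer + RPAR))

def error_loc(parser, text):
    """Return the location reported by the parse error, or None on success."""
    try:
        parser.parseString(text, parseAll=True)
    except pp.ParseBaseException as err:
        return err.loc
    return None

text = "item(1) item(x)"
plus_loc = error_loc(with_plus, text)
minus_loc = error_loc(with_minus, text)
print(plus_loc, minus_loc)
```

In this sketch the '+' version reports the failure near the start of the second item, while the '-' version points at the offending 'x'.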

TBD