Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Improve our handling of pyparsing errors #219

Closed

Conversation

peternowee
Copy link
Member

From the commit message:

When parsing DOT strings or files, pydot's parser calls the third-party library pyparsing. If pyparsing meets a problem, it raises a ParseException, which pydot's parser then reports.

This commit makes some improvements to that reporting:

  • Use str(err) instead of just err to prevent errors such as:
    Python 2.7:

    TypeError: cannot concatenate 'str' and 'ParseException' objects
    

    Python 3.7:

    TypeError: can only concatenate str (not "ParseException") to str
    

    This closes Error in error handling hides the real error message #176 ("Error in error handling hides the real error message"), reported by Shish.

  • Add newlines where necessary.

  • When running under Python 3, use the ParseException.explain() method if it is available. This method was introduced in pyparsing 2.3.1 of January 2019 1 2. It returns the same information that we otherwise manually string together (line, column arrow and error message), and adds to that a list of the pyparsing expressions that caused the exception to be raised.

I verified the new behavior by manually testing under the four combinations of Python 2.7.16/3.7.3 and pyparsing 2.3.0/2.4.2.

Unit tests all pass (same four combinations, when run together with the test suite fix proposed in PR 211).

Additional remarks:

Example output with Python 2.7.16 and pyparsing 2.4.2:

>>> pydot.graph_from_dot_data("graph { n [label=<some complex problematic DOT string>>]; }")
graph { n [label=<some complex problematic DOT string>>]; }
          ^
Expected "}", found '['  (at char 10), (line:1, col:11)

Example output with Python 3.7.3 and pyparsing 2.4.2:

>>> pydot.graph_from_dot_data("graph { n [label=<some complex problematic DOT string>>]; }")
graph { n [label=<some complex problematic DOT string>>]; }
          ^
ParseException: Expected "}", found '['  (at char 10), (line:1, col:11)
parse_dot_data
pyparsing.OneOrMore - {{['strict'] Group:({'graph' | 'digraph'}) [ID] graph_stmt}}...
pyparsing.And - {['strict'] Group:({'graph' | 'digraph'}) [ID] graph_stmt}
pyparsing.Group - graph_stmt
pyparsing.And - {Suppress:("{") [Forward: {{stmt [Suppress:(";")]}}...] Suppress:("}") [Suppress:(";")]}
pyparsing.Suppress - Suppress:("}")
pyparsing._SingleCharLiteral - "}"

Let me know if I need to split this commit in two commits, the second commit then to introduce the use of explain().

@peternowee
Copy link
Member Author

For an alternative solution by @ankostis and discussion about bubbling-up and/or printing details of the ParseException, see #171 (comment) and later comments.

dot_parser.py Outdated Show resolved Hide resolved
Copy link

@ankostis ankostis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I distributed my comments on #219 in this review about enhancing pysot error-handling of Popen errors..

dot_parser.py Outdated Show resolved Hide resolved
dot_parser.py Outdated Show resolved Hide resolved
dot_parser.py Outdated Show resolved Hide resolved
dot_parser.py Outdated
" "*(err.column-1) + "^" +
err)
if PY3 and (hasattr(ParseException, 'explain') and
callable(getattr(ParseException, 'explain'))):
Copy link

@ankostis ankostis Apr 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does explain attribute contain from (stdout, stderr, cmd-line, cmd-args)?

Would it worth it to make it configurable with a package-level configuration flag, e.g popen_ex_contets with bit-field values:

  • POPEN_EX_INCLUDE_STDOUT = 1<<0
  • POPEN_EX_INCLUDE_STDERR = 1<<1
  • POPEN_EX_INCLUDE__CMD_LINE = 1<<2
  • POPEN_EX_INCLUDE__CMD_ARGS = 1<<3
  • POPEN_EX_INCLUDE_ALL = - POPEN_EX_INCLUDE_STDOUT | POPEN_EX_INCLUDE_STDERR | POPEN_EX_INCLUDE__CMD_LINE POPEN_EX_INCLUDE__CMD_ARGS

to allow for client code mitigate any privacy concerns iof the dot content
and make a new DEBUG flag translate to (see my comment at the bottom):

if DEBUG:
    popen_ex_contets = POPEN_EX_INCLUDE_ALL
    raise_errors = True

dot_parser.py Outdated
err.line + "\n" +
" "*(err.column-1) + "^\n" +
str(err)
)
return None

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't be better to add a new package-level config-option (e.g. pydot.raise_errors), not to ever return None,
but always let exceptions bubble-up?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, after some more reading, I certainly agree with you that None should not be returned.

Some sources:

In our discussion in issue #171, I worried that some users may depend on a possible return value None. In other words, removing that may be a breaking change. Some remarks about that:

  • I checked the current pydot 1.4.1 code and did not see anything directly or indirectly depending on a return value of None from this function. I also checked NetworkX 2.4 code as an example of a library that depends on pydot and also did not find any there. (There is of course a multitude of dependent code out there that I did not check.)
  • The current docstring of the function states that it returns Graphs and does not mention possibly returning None.
  • Still, I think changes in return values (and possible exceptions) are too far-reaching for a mere point release (e.g. 1.4.2). So, I will split off the bug fix for the TypeError to a separate PR (as soon as we agree on the solution for that). This PR (219) will keep the more substantial changes, including the removal of return None, for the next major or minor release (e.g. 1.5.x). I will add a Changelog-entry and any necessary function docstring changes to this PR in an effort to make users aware of the change.

Wouldn't be better to add a new package-level config-option (e.g. pydot.raise_errors), not to ever return None, but always let exceptions bubble-up?

If we agree on removing return None, then that config-option is not necessary anymore, right? I mean, if the function does not have a Graph object to return (substitution by an empty Graph is not good either) and it should not return None, then what else can we do in the end except raise an exception (whether it be our own or bubbling up the original), right?

Hope this gets some things out of the way already. I will still get back on several other issues we were discussing (error handling, logging, printing, flags, etc.), either here or in our parallel discussion in issue #171.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the resources.

  • The last one is outdated and fested with the ill-advice that troubles the newbies: if you have the code you are calling raise an exception, you should NOT catch it, unless you know what to do with it. Besides, it's PY2.
  • The 2nd from the end is a very nice discussion of a simple indexing operation: to raise or not (the inverse of the prebious). For me, offering both possibilities, like dict does is the optimum. That's simple. But you do not ave this problem.
  • The SO article confirms the pythonic consensus that state-return values are ugly.
  • For the video , i dont have time, but i hope it goes further deep into the problem, and discussrs Error-Bariers (against the proliferation of unknown exception classes) or other architectural concerns and laws.

Copy link

@ankostis ankostis Apr 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

f we agree on removing return None, then that config-option is not necessary anymore, right?

The suggested config-option was to retain BW-compatibilty. If BW is not needed/too important, then it's totally fine (personally i'm ok with braking broken designs ;-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About that config-option pydot.raise_errors: We could still use it to create a transition period, with its default switching from False to True later than described above (e.g. 2.x.x). It would be similar (but inverse) to the debug/verbose flag discussed for printing in #171 (comment). For example, this other project started a transition period like that: sendgrid/sendgrid-python#57.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I did not see your two comments before I sent out my last comment and went to sleep last night.

I will think about the BW-compatibility/transition period some more. Once we know where we want to go exactly, we might also have a better idea on how to get there. Also, I still hope to hear the opinion of the project maintainer, because he might know more about how pydot is used in practice.

For the video , i dont have time, but i hope it goes further deep into the problem, and discussrs Error-Bariers (against the proliferation of unknown exception classes) or other architectural concerns and laws.

Not very deep, it was just an intro course. I could not find many results on Error-Bariers (or Error Barriers), do you perhaps mean Fault Barriers?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link

@ankostis ankostis Apr 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not very deep, it was just an intro course. I could not find many results on Error-Bariers (or Error Barriers), do you perhaps mean Fault Barriers?

Bullseye 👍
[edit:] (i was looking for an article like that for years, thank you! Usefull discussion but not really apllying here, it's a petty little library :-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Above, I wrote:

I will split off the bug fix for the TypeError to a separate PR (as soon as we agree on the solution for that)

@ankostis: Though we did not agree on the details of it yet, I created PR #227 already, because the code of the bug fix there is no longer automatically part of the more extensive changes we are discussing here. I still welcome your comments there.

@peternowee peternowee changed the title Fix and improve printing of pyparsing errors Improve handling of pyparsing errors Apr 15, 2020
ankostis referenced this pull request in sytelus/pydotz Apr 17, 2020
When pydot calls Graphviz and Graphviz returns a non-zero exit code
(indicating an error), pydot currently reports the Graphviz stdout and
stderr output and raises an exception.

This commit adds to that error report the 'raw' DOT string generated by
pydot, as it was also sent to Graphviz as input.

Also, this commit separates the Graphviz stdout and stderr output more
clearly.

These changes should help in troubleshooting Graphviz errors.

For some examples where this would have helped see:
pydot#203
("Problem with graph.write.png...AssertionError: 1")

Tested with Python 2.7 and 3.7. All tests pass (when run together with
the test suite fix proposed in PR 211).
@peternowee
Copy link
Member Author

I just created a separate PR #227 with just the minimal bug fix for TypeError for a bug fix/point release.

This PR (#219) will now target a future major/minor release and assume support for Python <3.5 is dropped. I plan to force-push new changes over this branch later, so until further notice, please do not merge.

@peternowee peternowee changed the title Improve handling of pyparsing errors [WIP] Improve handling of pyparsing errors Apr 20, 2020
@peternowee peternowee closed this May 16, 2020
@peternowee peternowee force-pushed the fix/pyparsing-error-printing branch from 756d29e to 48ba231 Compare May 16, 2020 09:40
@peternowee
Copy link
Member Author

Following the discussion in #171, I have been working on better solutions for PRs #218 and #219. These solutions depend on dropping Python 2 support and having a logging facility. For this, I created PRs #229 and #231. Also, #211 needs to be merged to make the Travis CI test results meaningful again.

I will now suspend my work until at least those 3 PRs (#211, #229, #231) have been merged into the master branch. I remain available to rebase my PRs or advise about the order in which these and other PRs can best be merged.

@peternowee peternowee changed the title [WIP] Improve handling of pyparsing errors [WIP] Improve our handling of pyparsing errors May 16, 2020
@peternowee peternowee added this to the 1.5.0 or 2.0.0 milestone Aug 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error in error handling hides the real error message
2 participants