Silent errors are bad #349

Closed
PatrickMassot opened this Issue May 24, 2016 · 9 comments

Projects

None yet

3 participants

@PatrickMassot

Here I only want to share how I was confused (and almost discarded the idea of using invoke) after reading the documentation and how I think it could be improved.

In https://github.com/pyinvoke/invoke/blob/master/sites/docs/getting_started.rst#declaring-pre-tasks we read:

"Let’s expand our docs builder with a new cleanup task that runs before every build (but which, of course, can still be executed on its own):"

from invoke import task, run

@task
def clean():
    run("rm -rf docs/_build")

@task(clean)
def build():
    run("sphinx-build docs docs/_build")

I think one should emphasize the crucial role of the force flag in rm. Without it, running invoke clean followed by invoke build doesn't work and, to me, it wasn't immediately clear why. I even made the situation worse by using hide=true in the clean task (I wasn't interested in knowing that there was nothing to remove). I think it could be a could place to mention the option warn=True as an alternative to using rm -f. This option is mentioned in the FAQ but not in getting started.

@bitprophet
Member

Yea, having hide=True is the lynchpin here, that would have hid the stderr from your erroring rm, making it impossible to tell that it's what failed and then prevented build from running.

I'd argue this is less a docs problem and more of an outright UX bug; users shouldn't have to guess at why something didn't behave as expected.

Offhand reactions:

  • One way to solve this is a more Fabric-1-like mode of "explicitly chat about what is going on".
    • But that cannot be our default here (we want the default to be more like make and get out of users' way as much as possible) and if it's not the default, it won't really help much.
    • There are debug and echo options available; arguably, you "should have" found & enabled the --debug flag, which would have triggered printing of the failed rm stderr) which basically serve this function. But they're not on by default (obviously) and I still think this counts as surprising (bad) behavior.
  • Have any command failure that occurs when hide is set (perhaps just testing the stderr side of it) explicitly print or except, so that it is clear what failed when.
    • Also not ideal, as it's pretty noisy for intermediate+ users.
  • Explicitly print/except when pre-tasks fail and prevent execution of a main task.
    • This seems like the best option, there's far fewer downsides and it prevents most of what I perceive to be the core problem here.
    • Note to self, probably best done as capturing Failure in Executor, checking to see if the task being executed was a pre-task, then either printing something there, or emitting a different/wrapping exception for Program to capture & print about before exiting.
@bitprophet bitprophet changed the title from Improved doc of pre-tasks in Getting started to Pre-task failure (causing main-task skip) hard to troubleshoot when output hidden Jun 11, 2016
@enkore
enkore commented Jun 14, 2016 edited

This is something one really easily stumbles over.

I'm not 100 % sure how the error handling of run() works under the hood, I guess it just raises an exception and it bubbles up from there. Imho it should be so that any un-handled run() failure should print both the command line that failed, and it's output (ideally with some nice colors :)

I like to have my scripts quiet by default - but also have them print useful info with little extra effort when things go wrong. Shellscript/make/... make this hard, but Invoke is in a position to make it easy (and the default).

@bitprophet
Member

That's a good point, @enkore - thanks. Also I guess good Unix behavior, now that I think about it - be silent unless something bad happened. So I'll definitely consider going with the 2nd bullet point in my above comment, in addition to or instead of the 3rd :)

@bitprophet bitprophet modified the milestone: 0.14, 1.0 Jun 14, 2016
@bitprophet
Member
bitprophet commented Dec 1, 2016 edited

I think I'm running into a flavor of this, if not exactly this, in the general case - not just with pre-tasks. Seems to be straight up that any Program-driven use (i.e. inv vs library/repl) is exiting correctly but otherwise not making it clear that it's exiting due to UnexpectedExit instead of just falling off the end of the script.

I may have changed how this works in the last year or so, not entirely sure. Need to dig a bit. Think the solution is some hybrid of the 2nd/3rd options in my old comment - don't limit it to pre-tasks, probably, since I think the issue is the same either way (whether inv exits during a pre-task or a main task should be irrelevant). Just straight up handle UnexpectedExit better by inspecting its wrapped execution request to see if hide was set, and verbosely complain if so, on assumption that the user wasn't otherwise redirecting output etc.

EDIT: two related thoughts:

  • Relating this to Fabric 1's behavior and how Invoke wants to depart from it: the current situation stems from a desire to be a lot less verbose by default, as Fabric 1 spews a ton of info about what it's doing unless you lock a lot of it down.
    -IIRC it's possible to make Fabric 1 behave like this ticket by doing hide('aborts') but that requires some pretty specific user foot-shooting.
    • Either way, it certainly seems like this is one spot where we're being too clean/quiet...
  • I think the "best" way to handle this is probably to just allow the UnexpectedExit to raise naturally (the same way it does in library use, which is a normal traceback, whose "message" component is a multiline string). As opposed to some super-special string treatment custom to Program as I'd been sorta thinking. Will see.
@bitprophet
Member

The code in question is here: https://github.com/pyinvoke/invoke/blob/70055e8498bb0ab944acc7d337a73d32c1b85d56/invoke/program.py#L280-L286

It raises one other issue which is how to unify "raise traceback" with "exit with a very specific exit code". If we just raise or raise e, Python will exit with code 1. Now, granted, the "real" exit code is clearly displayed in the value of UnexpectedExit - see below example, where the outer Invoke-running Python process exits 1 but the inner one exits 15:

» python -c "from invoke import Context; Context().run('python -c \'import sys; sys.exit(15)\'', hide=True)"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "invoke/context.py", line 60, in run
    return runner_class(context=self).run(command, **kwargs)
  File "invoke/runners.py", line 256, in run
    return self._run_body(command, **kwargs)
  File "invoke/runners.py", line 395, in _run_body
    raise UnexpectedExit(result)
invoke.exceptions.UnexpectedExit: Encountered a bad command exit code!

Command: "python -c 'import sys; sys.exit(15)'"

Exit code: 15

Stderr:


Suppose I could get both worlds by doing eg traceback.print_stack() or whatever, instead of raising; then just exiting as we currently do.

@bitprophet
Member

MORE EDIT: actually I missed the bit 2-3 lines above the linked snippet, where we write out ParseError to sys.stderr. Ought to unify with that, I think.

On the fence re: whether we should even hinge this printing on the value of hide, but I think it's probably still worthwhile given the annoyances in Fabric 1 of "double output" on error - getting both the normally-printed stdout/err of a failing program, and then the same (captured) out/err in the traceback/abort message.

@bitprophet
Member

Bit more thought after poking this (& adding some more tests around the existing functionality of UnexpectedExit's __repr__):

  • I'd forgotten this, but UnexpectedExit already does some semi-conditional stream display:
    • when pty=False (default), it only includes stderr, not stdout. IIRC this was done on assumption of subprocesses being "good Unix citizens" and putting errors in stderr (as well as assuming "stdout is usually verbose and not usually hidden").
    • when pty=True, because PTYs only have stdout, it displays stdout only.
  • One problem with that existing behavior is that the attempt to be "nice" and hide presumed-verbose stdout when we have access to both streams, breaks down under a PTY where you have to just spit out everything. It's somewhat inconsistent.
  • Another issue is that we don't do any truncation, so anytime the printed stream is large/verbose (possible with stderr, and much more likely with stdout) users may be annoyed, or may not even notice it happened as part of an exception display (because the "hey shit happened" bits have scrolled off screen). Which leads to "why isn't my hide=True working?!?!" and similar things.

Putting all that on top of what I came into this wanting to add (the conditional display of hidden streams on error) makes me think we should unify the two and tweak it a bit:

  • Make the exception object display whichever stream(s) were hidden, in its repr/str, because it has access to that info, and this prevents any double-display.
  • Have it display only the last few lines of output by default...probably. (Or make this configurable, perhaps.) That solves the issue of super-verbose streams being annoying, while still working well by default for the common case of "a small bit of stderr happened explaining how you forgot to install unzip before trying to use it" etc.
  • Have Program simply print(e, file=sys.stderr), avoiding any break of code reuse (i.e. having Program crap out subtly different data on the same event as compared to a raw Python shell - feels bad to do that. Only difference ought to be lack of traceback.)
@bitprophet bitprophet changed the title from Pre-task failure (causing main-task skip) hard to troubleshoot when output hidden to Silent errors are bad Dec 4, 2016
@bitprophet bitprophet closed this in a1eac82 Dec 4, 2016
@bitprophet
Member

I double-checked the original case for this ticket (pre-task calling run('rm <nonexistent dir', hide=True)) and the just-merged solution does seem to make it better, i.e. instead of simply quietly exiting 1, you'll see this instead:

» inv build
Encountered a bad command exit code!

Command: 'rm wat'

Exit code: 1

Stdout:



Stderr:

rm: wat: No such file or directory


Thinking there could be more intelligence (cleverness? =/) added to the display - e.g. for empty stdout, it looks...a bit off. But this is still a big step in the right direction, and future changes would be additive/cosmetic instead of behavioral.

@bitprophet bitprophet modified the milestone: 0.14, 1.0 Dec 4, 2016
@bitprophet bitprophet added a commit that referenced this issue Dec 4, 2016
@bitprophet bitprophet Changelog re #349 3dcaa82
@bitprophet
Member
bitprophet commented Dec 5, 2016 edited

The next day: derp! I forgot another minor wrinkle, we still don't want UnexpectedExit.__repr__ to display when no hiding was happening. In that situation, it is annoyingly verbose and unwanted by most users. Fixing now.

EDIT: pushed

@bitprophet bitprophet modified the milestone: 0.16 Feb 14, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment