RFC: Should external command non-zero exit status always throw an exception? #497

krader1961 · 2017-10-06T23:26:27Z

I've been exploring Elvish for a few weeks and really like most of what I see. Including the fact it's written in Go rather than C++, the concept of namespaces, using exceptions to signal conditions like the loop should be terminated, etcetera. I can live with the fact it is still rapidly evolving and backward incompatible changes are still occurring with some regularity. There is, however, one behavior that will keep me from using Elvish: automatically converting a non-zero exit status by an external command into an exception. The reason is that there are way too many commands that use a non-zero status to indicate a non-fatal condition. The canonical example is grep which terminates with status one if no lines matched the regex. Having to put every such command in a try { } except { } block is not friendly, error prone, and ugly. My proposal is to map non-zero exit status to an exception only if the command is run in the context of a try { } block. Thoughts?

If this proposal were implemented I also think there should be a variable analogous to bash's $? or fish's $status that automatically captures the exit status. I've implemented the following proof of concept function to prove that it can be done using Elvish script but it really should be default behavior:

# Run a command without throwing an exception if it exits with a non-zero
# status. Store the exit status in `$_status` and the exception in `$_exc`.
_status = ''
_exc = ''
fn nx [cmd @rest]{
    try {
        $cmd $@rest
    } except e {
        up:_exc = $e
        try {
            use re
            local:matches = (re:find ' exited with (\d+)''' (echo $e))
            up:_status = $matches[groups][1][text]
        } except {
            up:_status = 1
        }
    }
}

The text was updated successfully, but these errors were encountered:

krader1961 · 2017-10-07T02:36:16Z

I'll also point out that the fish project has discussed implementing something akin to the current elvish behavior. Details about how it should work haven't been thoroughly worked out but the consensus is it should not be the default behavior. It needs to be opt-in. Probably by annotating block commands. For example,

begin --errexit
   grep non-matching-pattern /some/file
end

What I'm proposing for elvish is the equivalent of that idea.

zzamboni · 2017-10-07T17:22:44Z

@krader1961 I'm not sure I agree with your proposal. While the exception behavior takes some time to get used to, I think it's good default behavior.

However, as you say, non-zero status doesn't always indicate an error - many programs use it simply to indicate different types of result. What I would like is to have better exception handling - the ability to query the exit code, to compare exceptions, etc. Then it would be easy to code wrappers and extensions to the default rule.

krader1961 · 2017-10-08T00:32:35Z

nx.elv.txt

Then it would be easy to code wrappers and extensions to the default rule.

Yes, in theory something like my proof of concept nx function above could also accept a list of statuses that should be treated as non-fatal. Or perhaps a lambda which would return true if the status should be non-fatal. Then commands like grep could be wrapped in a function like this:

fn grep [@args]{
  nx [1] grep $@args
}

I actually implemented both and it works. If I run grep with a pattern that matches no lines I don't get an exception and $status is set to 1. But I'm not thrilled with the unbelievably verbose output I get when the command exits with a fatal status (see below). This approach has some other problems. For example, if we shipped the grep function above then if someone else writes a different grep function (perhaps they want to always include the --color flag) they won't get the "exit status one is non-fatal" behavior unless they know to invoke it with the hypothetical nx command as I did above.

This situation is analogous to whether a network firewall should be primarily based on blacklist or whitelist rules. The current Elvish behavior is in some sense "safer" just as a firewall based on whitelist rules is safer and more secure. But it is also much harder to use and requires more configuration (e.g., try blocks you wouldn't otherwise need).

$ grep pattern /no/such/file
grep: /no/such/file: No such file or directory
Exception: ?(fail 'grep exited with 2')
Traceback:
  /Users/krader/.config/elvish/lib/nx.elv, line 32:
                fail (echo $e)
  /Users/krader/.config/elvish/lib/nx.elv, line 24-33:
            } else {
                # put up:matches $up:matches
                # put 'up:matches[groups][1][text]' $up:matches[groups][1][text]
                up:status = $up:matches[groups][1][text]
                # put up:status $up:status
                for x $nonfatal {
                    if (== $x $up:status) { return }
                }
                fail (echo $e)
            }
  /Users/krader/.config/elvish/lib/nx.elv, line 15-34:
        } except e {
            up:exc = $e
            local:matches = $ok
            try {
                use re
                # put $e
                up:matches = (re:find ' exited with (\d+)''' (echo $e))
            } except {
                up:status = 1
            } else {
                # put up:matches $up:matches
                # put 'up:matches[groups][1][text]' $up:matches[groups][1][text]
                up:status = $up:matches[groups][1][text]
                # put up:status $up:status
                for x $nonfatal {
                    if (== $x $up:status) { return }
                }
                fail (echo $e)
            }
        }
  /Users/krader/.config/elvish/lib/grep.elv, line 2:
        nx [1] grep $@args
  [interactive], line 1:
    grep pattern /no/such/file
$ put $status
▶ 2
$ put $exc
▶ ?(fail 'grep exited with 2')

xiaq · 2017-10-10T13:24:21Z

My proposal is to map non-zero exit status to an exception only if the command is run in the context of a try { } block.

That entangles exit status with try and makes both features more complex. Also, should this effect be lexical or dynamic? For instance, if a function f calls grep, in try { f }, does the effect apply to grep or not?

If this proposal were implemented I also think there should be a variable analogous to bash's $? or fish's $status that automatically captures the exit status.

How would that interact with builtins and functions? Does $? only apply to externals? Are they lexical or dynamic? In other words, does every scope get its own $?, or is there just one $? that gets set and reset?

As you said, Elvish adopts the "better safe than sorry" approach towards errors. It is really hard to make sure that a certain external command behaves the way you want, so I believe that behaving defensively and treating non-zero exit statuses as exceptions is the only reasonable default, even if it does not make sense in every case. The mechanism to override the default behaviour should be orthogonal to other features, and explicit.

I am thinking of something that is a combination of technical solution and community convention:

A more lightweight syntax for exception handling, probably reusing the traditional || syntax. The syntax should also support matching for a particular exit code. It can probably look like this:
```
grep some-text ||[1] nop
```
Which means that only exit code 1 is ignored. A similar syntax for pattern matching should also be added to the except clause in try.
Encouraging users to wrap external commands. For instance, the most trivial wrapper for grep can simply tolerate exit code 1:
```
fn grep [@a]{
  e:grep $@a ||[1] nop
}
```
It seems that you now have to build such wrappers every time you use grep (or sed, or test, ...). But you can put them into a module (say w for "wrapper"), and remember to use w:grep from now on.

Building dedicated wrapper modules just for tolerating certain exit statuses sound like an overkill, but it is not as bad as it sounds. Anyone who has tried writing portable shell scripts knows that the basic Unix toolbox has many variations -- GNU, macOS, busybox, just to name a few. Sooner or later, it will make sense to build wrappers that hide the difference of the underlying tools, and rely on those wrappers in your scripts.

krader1961 · 2017-10-11T01:03:40Z

I could probably live with the default behavior being to raise an exception for a non-zero status if there were a succinct syntax for dealing with it. Such as your proposed e:command1 ||[status] command2. Obviously we would want the bracketed expression to allow a sequence of status codes; e.g., ||[1 2]. Should the bracketed list be mandatory? Should ||[] mean any exit status and possibly be equivalent to || (assuming we don't want to make the brackets mandatory)?

What about exit statuses that indicate the process terminated due to receipt of an unhandled signal? Should those always be translated to an exception? I'm inclined to say, yes. In fact, it would be useful to make that a distinct exception type. Possibly with $? set to the signal number rather than the raw status returned by wait(); e.g., 15 rather than 143 if the process exited as a result of receiving SIGTERM.

How would that interact with builtins and functions? Does $? only apply to externals?

It was my understanding that functions don't have an exit status. Certainly the return command does not allow specifying an exit status. And builtins don't seem to have an exit status either. They throw an exception if an error occurs. It seems to me the exit status of a builtin or function should be treated as zero on success or one if an exception was thrown. For functions that wrap an external command it is probably useful to allow return to take an exit status so that the status of the external command can be explicitly propagated.

Are they lexical or dynamic? In other words, does every scope get its own $?, or is there just one $? that gets set and reset?

I can't think of a good use for making $? lexically scoped. Consider the situation where someone wants to show the exit status (or info about the exception if any) in their prompt. The usual approach is to do saved_status = $? at the top of the prompt function which means it needs to be dynamically scoped.

Obviously there can be interesting situations outside the normal linear flow of a script or function. For example, what happens to $? if elvish supports registering a callback function for when some event has occurred? Is it the responsibility of the callback function to save and restore $? or does that happen automatically? I'm thinking here of callback functions like fish's function var_changed --on-variable var_name mechanism.

ilyash · 2017-10-27T08:40:48Z

My approach, which I also recommend for elvish, is implemented in NGS (alternative shell which is under development) and described in my blog post. Here are the main points:

Step 1
When external process finishes, the exit code is checked by finished_ok function (multi-method to be more precise). This function is defined as follows (later definitions have higher precedence):

doc Decide whether process finished normally: exit code must be 0.
doc p - any process
doc %RET - bool
F finished_ok(p:Process) p.exit_code == 0

doc Decide whether /bin/false process finished normally: exit code must be 1.
doc p - /bin/false process
doc %RET - bool
F finished_ok(p:Process) {
	guard p.executable.path in ['/bin/false', '/usr/bin/false']
	p.exit_code == 1
}

doc Decide whether specific process finished normally: exit code must be 0 or 1.
doc p - One of the processes: /usr/bin/test, /bin/fuser, /bin/ping
doc %EX - # Exit code 1 on the line below but no exception thanks to this finished_ok().
doc %EX - if $(test -f /) echo("/ is a file")  # Outputs nothing, / is a directory
doc %RET - bool
F finished_ok(p:Process) {
	guard p.executable.path in ['/usr/bin/test', '/bin/fuser', '/usr/bin/fuser', '/bin/ping', '/usr/bin/ping']
	p.exit_code in [0, 1]
}

finished_ok is easily extensible for any new commands with additional F finished_ok(p:Process) ... clauses. If finished_ok returns false, an exception is thrown. The exception instance has property process which references the failed process for convenience. The process includes the command that was running, redirections, outputs, etc.

I am sure that running false must not throw an exception and I think that throwing exception for any process that returns non-zero exit code is not a practical approach.

Step 2
Depending on usage (optionally), the exit code is converted to boolean value. For example if $(grep ...) .... In NGS, Bool function, which is also easily extensible, handles the conversion. The built-in implementation converts zero to true and non-zero to false. This behaviour matches most of the programs out there.

Additional syntax
Commands in NGS might be prefixed by ok: syntax which tells which exit codes are "good" for this run of the command. The handling of ok: syntax is also extensible:

F finished_ok(p:Process) {
	ok = p.command.options.get('ok')
	guard ok is not Null
	econd {
		ok is Bool { ok }
		ok is Int  { p.exit_code == ok }
		(ok is Arr) and len(ok) == 1 and ok[0] is NumRange { throw InvalidArgument("'ok' option is [some..range] but should be some..range") }
		(ok is Arr) or (ok is NumRange) { p.exit_code in ok }
		true throw InvalidArgument("'ok' option must be one of: Bool, Int, Arr, NumRange")
	}
}

TEST try $(ok:2 cat NO_SUCH_FILE 2>/dev/null) catch(pf:ProcessFail) true
TEST $(ok:1 cat NO_SUCH_FILE 2>/dev/null); true
TEST $(ok:[1,10] cat NO_SUCH_FILE 2>/dev/null); true
TEST $(ok:1..2 cat NO_SUCH_FILE 2>/dev/null); true

xiaq · 2017-10-27T09:39:16Z

@ilyash Thanks for the detailed explanation. Some comments:

You have demonstrated using finished_ok to customize the handling from exit status. Not sure if NGS has modules (or packages), but how would this work across multiple modules? Will finished_ok defined in one module affect the behavior of external commands in another module?

The ok: syntax is also interesting. It seems to be implementable as a function as well.

xiaq · 2017-10-27T09:59:07Z

In case you are not familiar with Elvish terminology, when I say "module", I mean reusable Elvish source files that are isolated by namespacing and potentially developed by different people. It is akin to modules or packages in most other modern languages.

I find a bit of dilemma to decide whether finished_ok from one module should affect the behavior of external commands to another module:

If it does, that means if I am writing a module that uses any external command (like grep), the behavior of that external command now depends on whether my user has imported any other module that defines finished_ok. This is very much undesirable as it introduces unnecessary coupling and makes it difficult to write modules that behave reliably. This is comparable to zsh's options that let you alter the behavior of the language; so now each zsh module has to start with a bunch of setopt nothis; setopt nothat to make sure that they are working with a "default" state of the language.
If it doesn't, that means it's now hard to reuse definitions for finished_ok. Everyone who writes a module that uses external commands now need to bring their own finished_ok. That makes the feature less convenient. In fact, if each module that needs to use external commands needs to declare their finished_ok, we might as well require the author to provide their wrappers for external commands, which is what I am proposing in RFC: Should external command non-zero exit status always throw an exception? #497 (comment).

ilyash · 2017-10-27T10:02:11Z

finished_ok is a global thing, the assumption was that in general external commands have specific exit codes for fatal conditions and other codes for results so determining whether to throw an exception can happen in one place, disregarding how the command was called.

Note that you can add your own finished_ok implementation that for example would only be applicable for specific arguments for the external program to further tailor the behaviour.

It seems to be implementable as a function as well.

I guess yes but in NGS, ok: syntax is part of the larger option:value prefix syntax which is supported when running external program and it was playing nicely with the finished_ok which was already in place. For elvish, it might make more sense to make it a function.

imported any other module that defines finished_ok

In NGS finished_ok is part of the standard library with the assumption that user defined finished_ok implementations would be either (1) temporary workarounds for commands which are not handled yet by stdlib (2) local site customizations for command nobody knows or cares about.

zzamboni · 2017-10-27T17:20:53Z

In fact, @ilyash's ok: syntax seems very much like @krader1961's nx function above (#497 (comment))

ilyash · 2017-10-27T20:45:38Z

Yes, @zzamboni , visually, functionally and use-wise they are very similar. The implementation is completely different.

krader1961 · 2020-03-17T01:18:09Z

I'm going to close this because I've gotten accustomed to the elvish behavior. I think the real problem here is that the exception object doesn't make it easy to extract relevant facts such as the exit status or signal number that terminated the command. That, however, is being discussed in #208 and #765. Furthermore, I now agree with almost all of @xiaq's concerns expressed in his first comment. I think making the exception object "richer" (e.g., adding status and signo attributes) and providing something like my nx function as part of the base installation would be preferable to the proposal I made in this issue.

xiaq added module:eval question labels Mar 20, 2018

krader1961 closed this as completed Mar 17, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC: Should external command non-zero exit status always throw an exception? #497

RFC: Should external command non-zero exit status always throw an exception? #497

krader1961 commented Oct 6, 2017

krader1961 commented Oct 7, 2017

zzamboni commented Oct 7, 2017

krader1961 commented Oct 8, 2017 •

edited

Loading

xiaq commented Oct 10, 2017

krader1961 commented Oct 11, 2017

ilyash commented Oct 27, 2017 •

edited

Loading

xiaq commented Oct 27, 2017

xiaq commented Oct 27, 2017

ilyash commented Oct 27, 2017

zzamboni commented Oct 27, 2017

ilyash commented Oct 27, 2017

krader1961 commented Mar 17, 2020

RFC: Should external command non-zero exit status always throw an exception? #497

RFC: Should external command non-zero exit status always throw an exception? #497

Comments

krader1961 commented Oct 6, 2017

krader1961 commented Oct 7, 2017

zzamboni commented Oct 7, 2017

krader1961 commented Oct 8, 2017 • edited Loading

xiaq commented Oct 10, 2017

krader1961 commented Oct 11, 2017

ilyash commented Oct 27, 2017 • edited Loading

xiaq commented Oct 27, 2017

xiaq commented Oct 27, 2017

ilyash commented Oct 27, 2017

zzamboni commented Oct 27, 2017

ilyash commented Oct 27, 2017

krader1961 commented Mar 17, 2020

krader1961 commented Oct 8, 2017 •

edited

Loading

ilyash commented Oct 27, 2017 •

edited

Loading