New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return different exit codes for different failures #956

Open
unclesamwk opened this Issue May 10, 2017 · 12 comments

Comments

Projects
None yet
4 participants
@unclesamwk
Copy link

unclesamwk commented May 10, 2017

Hey,

another request error/exit code handling.

Hey, nice feature is if restic abort answer with an error code more than 0 or 1.

rsync example:
0 Success
1 Syntax or usage error
2 Protocol incompatibility
3 Errors selecting input/output files, dirs
4 Requested action not supported: an attempt was made to manipulate 64-bit files on a platform
that cannot support them; or an option was specified that is supported by the client and not by the server.
5 Error starting client-server protocol
6 Daemon unable to append to log-file
10 Error in socket I/O
11 Error in file I/O
12 Error in rsync protocol data stream
13 Errors with program diagnostics
14 Error in IPC code
20 Received SIGUSR1 or SIGINT
21 Some error returned by waitpid()
22 Error allocating core memory buffers
23 Partial transfer due to error
24 Partial transfer due to vanished source files
25 The --max-delete limit stopped deletions
30 Timeout in data send/receive
35 Timeout waiting for daemon connection

For scripting restic its needed.

thanks

greetings
sam

@fd0

This comment has been minimized.

Copy link
Member

fd0 commented May 14, 2017

Hey, thanks for suggesting this. Can you please elaborate what your use case is? Why do you need to know exactly what the error code was? What are the cases you need to distinguish in your script and why?

@jwkohnen

This comment has been minimized.

Copy link
Contributor

jwkohnen commented Jun 25, 2017

Coming here because I've expected restic to exit with non-zero code when it received a signal. Otherwise it is hard to terminate a script after restic terminated on request. In my case the script just doesn't terminate, deletes data that was not backed up yet (restic said good bye, w/o backing up as requested) and the next backup job unintentionally starts up. I tried to fix this by using trap in my script, but that's brittle and still destroys data now and then.

If everything exits with 0 no matter what, automation is hard. 🙂

@fd0

This comment has been minimized.

Copy link
Member

fd0 commented Jun 25, 2017

@wjkohnen Huh, that doesn't sound right, but I'm not sure I understood. Could you elaborate on your use case a bit? I understood you're running restic in a script, then you send a signal to restic? Which signal, and why?

@jwkohnen

This comment has been minimized.

Copy link
Contributor

jwkohnen commented Jun 25, 2017

Well, it's quite simple. A bash script (with set -e) iterates in a for loop over some directories, in each loop there is a call to restic. I hit ^c, because $reason, restic clean's up due to SIGINT, restic exits with 0, for loop iterates because there's no one telling bash to stop. trap "{ exit 27; }" SIGINT doesn't help.

Restic's code says effectively os.Exit(0) in src/cms/restic/cleanup.go. I've patched that to exit with 1, but it still exits with 0. So I am clearly missing something.

@fd0

This comment has been minimized.

Copy link
Member

fd0 commented Jun 25, 2017

Thanks for describing your situation! It seems to me we have different expectations: I didn't think it was necessary to return an error code when the user requested termination via SIGINT, but I'm open to adding that.

Could you please open a new issue? This issue really is about returning different error codes for different situations.

@jwkohnen

This comment has been minimized.

Copy link
Contributor

jwkohnen commented Jun 25, 2017

New issue is #1064. Apparently I did not have my use case straight: actually I loop on the terminal over a restic wrapper script. Thanks!

@fd0 fd0 added feature request and removed suggestion labels Jun 25, 2017

@fd0 fd0 changed the title [FEATURE] exit code handling Return different exit codes for different failures Jun 25, 2017

@dsommers

This comment has been minimized.

Copy link

dsommers commented Sep 20, 2017

In my restic-runner script for automated backups, I'm depending on the exit code to highlight this in the backup report - if the backup was successful or not. I just noticed that disk-full scenarios also returns with exit code 0, so the backup logs are a bit misleading on the final result.

I'm not sure we need such a comprehensive list of failures as rsync have. But having all non-fatal and non-critical scenarios return 0 is usually fine. But to have a few other exit codes when the backup might or might not have happened correctly (if restic is in doubt, it is a failure) is really needed.

@fd0

This comment has been minimized.

Copy link
Member

fd0 commented Sep 20, 2017

I just noticed that disk-full scenarios also returns with exit code 0, so the backup logs are a bit misleading on the final result.

Hu, that shouldn't be the case, and I'm not able to reproduce this. When the disc is full, restic panics and returns exit code 2 (the Go runtime does that). What did I do differently?

having all non-fatal and non-critical scenarios return 0 is usually fine. But to have a few other exit codes when the backup might or might not have happened correctly (if restic is in doubt, it is a failure) is really needed.

That should be the case. If not, I consider this a bug.

@dsommers

This comment has been minimized.

Copy link

dsommers commented Sep 20, 2017

Ahh ... sorry, my mistake! In my script I do:

restic $args 2>&1 | log_stream

When checking $? it is reflecting the exit code of the log_stream function, not the restic call.

@dsommers

This comment has been minimized.

Copy link

dsommers commented Sep 20, 2017

Okay, found the solution. Seems I need to set set -o pipefail to get the restic exit code. I'll test that out, but consider my issue resolved for now.

@fd0

This comment has been minimized.

Copy link
Member

fd0 commented Sep 21, 2017

Nice, thanks for letting us know!

@fd0 fd0 removed the feedback needed label Jul 21, 2018

@fd0

This comment has been minimized.

Copy link
Member

fd0 commented Jul 21, 2018

I've thought about this today, and I think we should improve the error handling: Restic should continue if possible (e.g. when just a single file could not be read), but return an error code so the admin is notified something is odd.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment