Exit status 0 on backup and invocation errors; need to distinguish warnings from errors #2364

Open
HaleTom opened this issue Aug 8, 2019 · 2 comments

@HaleTom
commented Aug 8, 2019

Summary

Distinct exit codes for the different causes of a non-0 exit status are a feature request (#956), but I assert that returning exit code 0 in the situations below is a bug.

Below I demonstrate two cases of backup emitting errors and exiting 0.

In cases like this, data loss can occur as follows:

  1. All data is initially backed up
  2. A change occurs (e.g. to the permissions on the data being backed up)
  3. Restic emits errors but still exits 0 so these errors are not presented for human inspection
  4. Result: Data appears to be backed up when in fact it isn't.

Additionally, there are some cases where a 0 exit is confusing to a human, and may also confuse automation scripts:

  1. Running restic key backup / returns 0. I ran into this when editing a previous command line and wondered why my backup "succeeded" so quickly.

  2. Running restic mount -H Athena latest /mountpoint doesn't exit non-0 with an invocation complaint, but instead mounts at location latest and ignores the /mountpoint argument.

  3. Running restic snapshots -H athena returned 0 but didn't list any snapshots.
    Running restic snapshots -H Athena (note the capitalisation) produced a list of snapshots.
    I'd expect that like grep, if nothing matches, restic would exit non-0 to allow for automated checking if snapshots exist.


Related: "Should restic mount quit with Exit code 130?": #2015

Output of restic version

restic 0.9.5 compiled with go1.12.4 on linux/amd64

How did you run restic exactly?

Invocation:

restic backup -v --tag seed "/media/1TB-toshiba/01TB-master/d/Media/Music" "/media/1TB-toshiba/01TB-master/c/Music"

End of output:

error: lstat /media/1TB-toshiba/01TB-master/d/Media/Music/Media Monkey/Working/From Christine/Who Killed Amanda Palmer 320k MP3: no such file or directory
error: lstat /media/1TB-toshiba/01TB-master/d/Media/Music/Media Monkey/Working/From Christine/XTC: no such file or directory
error: lstat /media/1TB-toshiba/01TB-master/d/Media/Music/Media Monkey/Working/From Christine/ZZ Top: no such file or directory
error: lstat /media/1TB-toshiba/01TB-master/d/Media/Music/Media Monkey/Working/From Christine/__ Playlists __: no such file or directory
error: lstat /media/1TB-toshiba/01TB-master/d/Media/Music/Media Monkey/Working/From Christine/faithless - forever faithless: no such file or directory
error: lstat /media/1TB-toshiba/01TB-master/d/Media/Music/bltD555.tmp: input/output error
error: read /media/1TB-toshiba/01TB-master/d/Media/Music/Media Monkey/Working/From Christine/Oasis/(What's The Story) Morning Glory/Champagne Supernova.mp3: input/output error
error: read /media/1TB-toshiba/01TB-master/d/Media/Music/Media Monkey/Working/From Christine/Oasis/(What's The Story) Morning Glory/Don't Look Back In Anger.mp3: input/output error
error: read /media/1TB-toshiba/01TB-master/d/Media/Music/Media Monkey/Working/From Christine/Oasis/(What's The Story) Morning Glory/Hello.mp3: input/output error
error: read /media/1TB-toshiba/01TB-master/d/Media/Music/Media Monkey/Working/From Christine/Oasis/(What's The Story) Morning Glory/Roll With It.mp3: input/output error

Files:       24918 new,     0 changed,     0 unmodified
Dirs:           10 new,     0 changed,     0 unmodified
Data Blobs:  73868 new
Tree Blobs:     11 new
Added to the repo: 79.513 GiB

processed 24918 files, 148.553 GiB in 8:41:30
snapshot a98d7d41 saved

ravi@svelte:~/tmp% echo Exit status was $?
Exit status was 0
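
Until the exit status reflects these errors, one possible stopgap is to scan the captured output for error lines. The following is only a sketch under stated assumptions: it assumes per-file failures are printed as lines starting with `error: `, as in the output above, and `check_backup_log` is a hypothetical helper name, not part of restic.

```shell
#!/bin/sh
# Stopgap sketch: treat a backup run as failed if its log contains
# "error: " lines, even when the exit status was 0.
# Assumption: per-file failures are logged as lines starting with "error: ".
# check_backup_log is a hypothetical helper, not a restic command.

check_backup_log() {
    log_file=$1

    # A missing or unreadable log must never count as success.
    [ -r "$log_file" ] || return 2

    # grep -c prints the match count; it exits 1 when the count is 0,
    # hence the "|| true" guard.
    error_count=$(grep -c '^error: ' "$log_file" || true)

    if [ "$error_count" -gt 0 ]; then
        echo "backup log contains $error_count error line(s)" >&2
        return 1
    fi
    return 0
}

# Example wiring (commented out so the sketch runs without restic):
# restic backup -v /some/path >backup.log 2>&1
# check_backup_log backup.log || echo "needs human inspection"
```

This is exactly the kind of output parsing that a reliable exit status would make unnecessary, so it should be read as a workaround rather than a fix.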

Another example, from Windows restic 0.9.5:

C:\Users\Jnani\Documents>restic backup -v "C:\Users\Jnani\00 Travelling" "C:\Users\Jnani\Documents" --tag seed
open repository
repository 86b7fabe opened successfully, password is correct
lock repository
load index files
using parent snapshot 233715dd
start scan on [C:\Users\Jnani\00 Travelling C:\Users\Jnani\Documents]
start backup on [C:\Users\Jnani\00 Travelling C:\Users\Jnani\Documents]
scan finished in 50.370s: 17711 files, 63.002 GiB
error: NodeFromFileInfo: Readlink: readlink \\?\C:\Users\Jnani\Documents\My Music: Access is denied.
error: NodeFromFileInfo: Readlink: readlink \\?\C:\Users\Jnani\Documents\My Pictures: Access is denied.
error: NodeFromFileInfo: Readlink: readlink \\?\C:\Users\Jnani\Documents\My Videos: Access is denied.

Files:           0 new,     0 changed, 17711 unmodified
Dirs:            0 new,     1 changed,     2 unmodified
Data Blobs:      0 new
Tree Blobs:      1 new
Added to the repo: 295 B

processed 17711 files, 63.002 GiB in 0:56
snapshot 56495314 saved

C:\Users\Jnani\Documents>echo Exit Code is %errorlevel%
Exit Code is 0

C:\Users\Jnani\Documents>date /t
2019-08-04

What backend/server/service did you use to store the repository?

B2

Expected behavior

  • Any error or warning causes a non-0 exit.
    • "Warnings only" has a different exit code to "error occurred"
  • All invalid invocations exit with an error code

Actual behavior

As already described.

Concern with check

Since exiting 0 on less than 100% success is so prevalent, I wonder how this affects check.

The different expected behaviour of warning vs error becomes significant here:

  • If ANY blobs of my indexed data can't be found, I expect an exit status indicating that something is seriously wrong and immediate action needs to be taken.

    • The automation system can then ensure that the job output is presented to human eyeballs and flagged as high priority.
  • If I have some additional packs that aren't indexed, the exit status should indicate that not everything is perfect, but that there is no concern for data integrity.

    • The automation system can then present the job output to the operator in a way that doesn't demand immediate attention.
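
The distinction described above could be consumed by automation roughly as follows. This is a sketch only: the mapping of 0 to success, 1 to error and 2 to warnings-only is a hypothetical convention chosen for illustration, not restic's documented behaviour, and `classify_status` is an illustrative helper name.

```shell
#!/bin/sh
# Hypothetical exit-status convention, NOT restic's documented behaviour:
#   0 = full success
#   1 = error (data integrity at risk)
#   2 = warnings only (imperfect, but data is safe)
# classify_status is an illustrative helper name.

classify_status() {
    case "$1" in
        0) echo "ok" ;;
        2) echo "low-priority" ;;    # warnings only: review when convenient
        *) echo "high-priority" ;;   # errors and unknown codes: act immediately
    esac
}

# Example wiring (commented out so the sketch runs without restic):
# restic check
# classify_status "$?"
```

Unknown statuses deliberately fall into the high-priority branch, so an undocumented or unexpected exit code is never mistaken for success.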

Do you have any idea what may have caused this?

Hmm...

Design philosophy

Perhaps the root cause lies in design philosophy:

I expect that restic will mostly be run via automation tools, which (without horrible output parsing) have absolute reliance upon exit status to know how restic fared in its duties, and therefore what to do next.

I note that none of the restic manual pages have an Exit Status section.

Even the man page for man itself has the following:

  EXIT STATUS
         0      Successful program execution.

         1      Usage, syntax or configuration file error.

         2      Operational error.

         3      A child process returned a non-zero exit status.

         16     At least one of the pages/files/keywords didn't exist or wasn't matched.

I find it extremely surprising that restic doesn't document ANY information passed to its caller via exit status.

Given documented exit statuses and the situations which cause them, automation systems can make informed decisions on behalf of humans.

Even with the current implementation of sometimes returning a non-0 exit status, it would be extremely useful to know the conditions which trigger that behaviour.

Immediate cause

Missing tests for cases where 100% success does not occur?

Do you have an idea how to solve the issue?

Unfortunately the issue seems quite systemic from the observed behaviour. Test exit status on every non-success execution flow termination point?

Did restic help you or made you happy in any way?

I'm really glad you asked this question!

I imagine it would be very easy to just see issues on software that people generally use without issue every day. From the squeaky wheel effect, a skewed view could easily form.

I recently evaluated (all?) Linux backup programs. restic was the winner for technical reasons.

restic was also the winner because of the community - the positive and supportive forum and the exemplary way that @fd0 communicates.

restic saves my memories of my loved ones and experiences in the form of photos and videos.

restic also stores my mp3 library which I use to lift my mood.

Thanks to all involved for creating and maintaining this software <3

@HaleTom changed the title from "Exits 0 with backup errors and invocation errors" to "Exit status 0 on backup and invocation errors; need to distinguish warnings from errors" on Aug 8, 2019

@HaleTom

Author

commented Aug 16, 2019

Another example of errors and a 0 exit status:

C:\Windows\system32>restic backup -v "C:\00-Destiny"  "C:\Global Archive"  "C:\Users\Jenny\Documents" --tag restore-complete
open repository
repository 86b7fabe opened successfully, password is correct
lock repository
load index files
using parent snapshot cf1aa601
start scan on [C:\00-Destiny C:\Global Archive C:\Users\Jenny\Documents]
start backup on [C:\00-Destiny C:\Global Archive C:\Users\Jenny\Documents]
scan finished in 83.407s: 19458 files, 64.452 GiB
error: NodeFromFileInfo: Readlink: readlink \\?\C:\Users\Jenny\Documents\My Music: Access is denied.
error: NodeFromFileInfo: Readlink: readlink \\?\C:\Users\Jenny\Documents\My Pictures: Access is denied.
error: NodeFromFileInfo: Readlink: readlink \\?\C:\Users\Jenny\Documents\My Videos: Access is denied.

Files:           0 new,     0 changed, 19458 unmodified
Dirs:            0 new,     0 changed,     3 unmodified
Data Blobs:      0 new
Tree Blobs:      0 new
Added to the repo: 0 B

processed 19458 files, 64.452 GiB in 7:24
snapshot 2045eaa3 saved

C:\Windows\system32>echo Exited %errorlevel%.
Exited 0.
@cdhowie

Contributor

commented Aug 20, 2019

Running restic mount -H Athena latest /mountpoint doesn't exit non-0 with an invocation complaint, but instead mounts at location latest and ignores the /mountpoint argument.

This seems like an unrelated issue to the exit status conversation; this is about restic mount ignoring extra parameters instead of complaining about them. This might be better reported as a separate issue.

Running restic snapshots -H athena returned 0 but didn't list any snapshots.
Running restic snapshots -H Athena (note the capitalisation) produced a list of snapshots.
I'd expect that like grep, if nothing matches, restic would exit non-0 to allow for automated checking if snapshots exist.

I'm not sure I agree with this. To me a non-zero exit code means "something went wrong", whereas here nothing went wrong; we just enumerated an empty set.

It's easy enough to test with jq whether there are any snapshots.

$ #When there are snapshots.
$ restic snapshots --json | jq -se '.[0][0]' >/dev/null; echo $?
0

$ #When there are no snapshots.
$ restic snapshots --json | jq -se '.[0][0]' >/dev/null; echo $?
1