
Handle SIGPIPE gracefully #1466

Open
FiloSottile opened this issue Dec 3, 2017 · 8 comments

FiloSottile (Contributor) commented Dec 3, 2017

The SIGPIPE issue in #1457 will occur with all backends, it's just that some are... less reliable than others ;)

The problem is that registering a SIGPIPE handler removes the Go runtime logic that only crashes on SIGPIPE for fds 0/1/2, and not on SIGPIPE caused by e.g. TCP connection resets.

https://github.com/golang/go/blob/78074f6850c34a955d69f578e363d1d3f3e00e25/src/os/signal/doc.go#L91-L110

This was reverted in 2579fe6, but for a solution that also fixes #1413, you can check the file descriptor like the runtime does and ignore the signal for descriptors > 2.

Also, I think the fix in 2579fe6 should have its own CHANGELOG line and maybe even a minor version release, as I basically can't finish a backup to B2 without a TCP reset.

(Some decent background reading is also https://stackoverflow.com/questions/108183/how-to-prevent-sigpipes-or-handle-them-properly)
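
Editor's note: a minimal sketch (not restic's actual cleanup code) of the pattern described above. Once a program registers for SIGPIPE via signal.Notify, the runtime's "only die on SIGPIPE for fds 0/1/2" rule no longer applies, and SIGPIPE from any descriptor, including a reset TCP connection, reaches the program's own handler.

package main

import (
    "os"
    "os/signal"
    "syscall"
    "time"
)

func main() {
    c := make(chan os.Signal, 1)
    // Registering SIGPIPE here disables the runtime's default policy of only
    // terminating on SIGPIPE for stdin/stdout/stderr: the signal now reaches
    // this channel no matter which file descriptor caused it.
    signal.Notify(c, os.Interrupt, syscall.SIGPIPE)

    go func() {
        <-c
        // A cleanup handler that exits on any received signal aborts the
        // program on every broken connection, not just a broken stdout.
        os.Exit(1)
    }()

    // Placeholder for the rest of the program (connecting to a backend,
    // running a backup, ...). A TCP reset during any network write can now
    // terminate the whole run.
    time.Sleep(10 * time.Second)
}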

FiloSottile referenced this issue Dec 3, 2017: "Handling SIGPIPE made restic abort when a TCP connection was reset by a server. This happened on DigitalOcean Spaces, which uses the s3 backend."
FiloSottile (Contributor, Author) commented:

@fd0 you're right: the best thing to do is to always ignore the signal, and do complete error handling + cleanup on EPIPE (or any other non-Timeout) errors from stdin/stdout/stderr.

That's also what the runtime does.

https://github.com/golang/go/blob/415349575dec277fbadf08b9d690d07fe313b288/src/os/file_unix.go#L143-L150
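
Editor's note: a minimal sketch of this approach (not restic's actual code). With SIGPIPE ignored, a write to a broken stdout fails with an error wrapping EPIPE instead of terminating the process, and that error can be handled explicitly.

package main

import (
    "errors"
    "fmt"
    "os"
    "os/signal"
    "syscall"
)

func main() {
    // Ignore SIGPIPE entirely; broken-pipe writes now surface as EPIPE errors.
    signal.Ignore(syscall.SIGPIPE)

    if _, err := fmt.Fprintln(os.Stdout, "some output"); err != nil {
        if errors.Is(err, syscall.EPIPE) {
            // The consumer of our output is gone (e.g. the next command in a
            // pipeline exited early): run cleanup (remove locks, temp files)
            // and exit instead of continuing to write.
            os.Exit(1)
        }
        // handle other (non-Timeout) write errors here
    }
}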

fd0 (Member) commented Dec 3, 2017

Good points, thanks for the analysis. I'll add an entry to the CHANGELOG now. We'll leave this issue open until failures when writing to stdout/stderr are checked everywhere and handled correctly.

@armhold you mentioned in #1413 that sometimes your backups fail and leave a lock file. How do you typically run restic?

fd0 (Member) commented Dec 3, 2017

Done, https://github.com/restic/restic/blob/master/CHANGELOG.md#important-changes-in-0xy

fd0 added the "type: feature enhancement" label Dec 3, 2017
armhold (Contributor) commented Dec 3, 2017

I think most (perhaps all?) of the lock situations I encountered were due to me flubbing a command line where I was piping restic output to a filter of some sort. Like this:

restic -r ~/repo cat index 8a32887 |jq'.'

This results in a "command not found" error (the missing space makes the shell look for a command named "jq.") and also leaves behind a lock that is not cleaned up.

Prof-Bloodstone commented Jul 27, 2019

This hit me today. When trying to automate restic I run

restic forget --keep-within 1h --dry-run | grep -q '^remove [[:digit:]]* snapshots:$'

to see if there are any snapshots that would be cleaned up. This leaves the repository locked. A temporary workaround is to use

restic forget --keep-within 1h --dry-run | grep '^remove [[:digit:]]* snapshots:$' &>/dev/null

but it'd be nice not to have to scratch my head wondering what's happening.
Since this is an old report: are there any plans to fix it? Depending on how restic handles it, I wonder whether it's possible to corrupt the repository this way?
I'm afraid that one of my pipelines will exit early and leave the repo locked, and I'd prefer to avoid using something like:

restic() {
    command restic "${@}" | sponge
}

to make sure it won't happen.

Prof-Bloodstone commented:

As per #1814, this should probably be treated as a bug.

Are there any plans to fix it?

rawtaz (Contributor) commented Aug 13, 2022

Okay, so can we do what @FiloSottile suggested, i.e. reimplement the SIGPIPE handling but only act on it for file descriptors 1 and 2 (stdout and stderr), so that everything > 2 is ignored? Or won't that work for some reason I'm not seeing?

MichaelEischer (Member) commented Aug 18, 2022

Judging from the Go source code, the default behavior is currently to first ignore the SIGPIPE signal and then internally route it back to the signal handler if it is for fd 1 or 2 (look for epipecheck in the source code). The default signal handler behavior is then to terminate the process. Inside the signal handler there is no way to know which file descriptor is affected.

But what we could do is replace os.Stdout and os.Stderr with wrapped versions which, in their Write method, check whether the returned error wraps EPIPE. This wrapper should then call the Exit function from cmd/restic.
[Edit] Hmm, that could be a problem, as os.Stdout and os.Stderr are structs and not interfaces... So we should probably add wrappers and then offer them via some package. [/Edit]

I've also tried to come up with a solution that tries to route the error through restic, but that just ended up being quite horrible.
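
Editor's note: a rough sketch of the wrapper idea described above. The package name, Wrap helper, and exit callback are assumptions, not restic's actual API, and the sketch assumes SIGPIPE is ignored as suggested earlier in this thread, so a failed write to stdout/stderr surfaces as an EPIPE error rather than a fatal signal.

package pipewriter

import (
    "errors"
    "io"
    "syscall"
)

// Writer forwards writes to an underlying writer (e.g. os.Stdout) and calls
// an exit callback when a write fails with an error wrapping EPIPE.
type Writer struct {
    w    io.Writer
    exit func(code int) // stands in for the Exit function from cmd/restic
}

// Wrap returns a Writer around w.
func Wrap(w io.Writer, exit func(code int)) *Writer {
    return &Writer{w: w, exit: exit}
}

func (p *Writer) Write(buf []byte) (int, error) {
    n, err := p.w.Write(buf)
    if err != nil && errors.Is(err, syscall.EPIPE) {
        // The reader on the other side of the pipe is gone: trigger the
        // cleanup/exit path instead of silently continuing to write.
        p.exit(1) // the exit code here is a placeholder
    }
    return n, err
}

Command code would then write to something like Wrap(os.Stdout, Exit) instead of os.Stdout directly, which works around the fact that os.Stdout is a concrete *os.File rather than an interface.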
