Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

agent: refactor termination and stream management #316

Merged
merged 1 commit into from
Feb 4, 2022
Merged

Conversation

xenoscopic
Copy link
Member

What does this pull request do and why is it needed?

This request fixes a number of issues stemming from golang/go#23019, most notably #223 and mutagen-io/mutagen-compose#11.

We've already seen several manifestations of golang/go#23019, so this
commit refactors the way that agent input, output, and error streams are
managed, as well as the way that agent process termination is signaled.

Historically (in 6c1a47c), we avoided closing
standard input/output pipes because they were blocking, meaning that a
close wouldn't preempt a read/write and that the close itself could
potentially block. However, this hasn't been the case since Go 1.9, when
os.File was switched to use polling I/O (at least for pipes and other
pollable files).

As such, we can now use closure of standard input as a signal to agent
processes (via their intermediate transport processes) that they should
terminate. Failing that, we still fall back to signal-based termination,
but this standard input closure mechanism is particularly important on
Windows, where no "soft" signaling mechanism (like SIGTERM) exists and
transport process termination via TerminateProcess often triggers
golang/go#23019. This is especially problematic with Docker Desktop,
where an intermediate com.docker.cli process is used underneath the
primary Docker CLI, and standard input closure is the only reliable
termination signaling mechanism.

Just to clarify, there are mechanisms like WM_CLOSE and CTRL_CLOSE_EVENT
on Windows, which some runtimes (such as Go's) will convert to a
SIGTERM, but there's no telling how intermediate transport processes
will interpret these messages. They don't necessarily have the same
semantics as SIGTERM.

And, just in case none of our signaling mechanisms works, we now avoid
using os/exec.Cmd's forwarding Goroutines entirely, meaning that its
Wait method will close all pipes as soon as the child process is
terminated.

As part of this refactor, I've also looked at switching to a more
systematic approach to manage the process hierarchies that could
potentially be generated by transport processes. Things like killpg on
POSIX or job objects on Windows could theoretically facilitate such
management. However, the fact of the matter is that there's simply no
way to reliably enforce termination of such hierarchies and, more
importantly, no way for Mutagen to know what termination signaling
mechanism would be appropriate for any intermediate processes.
Essentially, we just have to rely on transport processes to either
correctly forwarded standard input closure (especially on Windows)
and/or correctly forward SIGTERM on POSIX. But, if they don't, we will
forcibly terminate them and any associated resources in the Mutagen
daemon. If their subprocesses linger on afterward, that's a bug in the
transport process, not Mutagen.

This commit also takes the opportunity to impose a size limit on the
in-memory error buffer used to capture transport errors after a
handshake failure.

Updates #223
Updates mutagen-io/mutagen-compose#11

Signed-off-by: Jacob Howard <jacob@mutagen.io>
@xenoscopic xenoscopic merged commit 31fa038 into master Feb 4, 2022
@xenoscopic xenoscopic deleted the procterm2 branch February 4, 2022 11:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant