fix gateway exec tty cleanup on context.Canceled #3658
Conversation
Force-pushed from 9eec9d1 to 1dca040.
looking at the test failures ...
Force-pushed from 1dca040 to 8d521a7.
executor/runcexecutor/executor.go (outdated)
@@ -310,6 +310,7 @@ func (w *runcExecutor) Run(ctx context.Context, id string, root executor.Mount,
 			timeout()
 			select {
 			case <-time.After(50 * time.Millisecond):
+				cancelRun()
This allows the tests to pass; I am a little confused about why this was necessary. The pid1 `sleep 10` was continuing to run even though `ctx.Done` was triggered almost immediately. Then `w.runc.Kill` ran and did not return an error, but the `w.run` call below didn't terminate. So this loop waited 50ms and then ran the `kill 9` again and again, until the pid1 ended after 10s. I am not sure how the `kill 9` is not actually terminating the runc process. With this change we cancel the ctx passed into `w.run` after the `kill 9` is ignored, and the process ends up exiting as expected.
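For context, here is a rough Go sketch of the kill/retry control flow being described (a sketch only; the `killUntilDone` name, the `done` channel, and the `kill` callback are stand-ins for illustration, not the actual executor code):

```go
import (
	"context"
	"time"
)

// Rough sketch of the kill/retry behaviour described above. kill stands in
// for w.runc.Kill, and done is assumed to be closed when w.run returns.
func killUntilDone(ctx context.Context, cancelRun context.CancelFunc, kill func() error, done <-chan struct{}) {
	select {
	case <-done: // w.run returned on its own
		return
	case <-ctx.Done(): // the request was cancelled
	}
	for {
		_ = kill() // roughly: SIGKILL the container's pid1
		select {
		case <-done:
			return
		case <-time.After(50 * time.Millisecond):
			// The change under discussion: also cancel the ctx passed into
			// w.run, so the runc invocation gives up even if pid1 never dies.
			cancelRun()
		}
	}
}
```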
This does not look quite right. I'm not sure why SIGKILL wouldn't work (can we replicate it outside of the test?), but even when we handle a misbehaving runc that doesn't react (properly) to SIGKILL, we should give it more than 50ms to shut down normally.
I have tried to reproduce the steps by calling runc directly and so far it works; I'm not sure what the difference is. I will keep looking, although I have very little time these days. The only thing I can think of is that the SIGKILL signal is not actually being sent to the runc or sleep process, though I'm not sure how or why.
If you can confirm that this works outside of the test, can you just add a counter before calling `cancelRun()` so that the case does not trigger after 50ms but only after a couple of seconds, for example? Also add some comments describing why we are doing this.
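For illustration, the suggested shape might look roughly like the following (a hedged sketch reusing the imports from the sketch above; the `killWithDelayedCancel` name, counter, and ~2s threshold are assumptions, not merged code):

```go
// Hypothetical sketch of the suggestion: keep retrying SIGKILL every 50ms,
// but only fall back to cancelRun() after a couple of seconds of retries.
func killWithDelayedCancel(cancelRun context.CancelFunc, kill func() error, done <-chan struct{}) {
	const interval = 50 * time.Millisecond
	attempts := 0
	for {
		_ = kill()
		select {
		case <-done:
			return
		case <-time.After(interval):
			attempts++
			if time.Duration(attempts)*interval >= 2*time.Second {
				// Only give up and cancel the run context after ~2s of
				// unsuccessful kill attempts.
				cancelRun()
			}
		}
	}
}
```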
I spent some time debugging this; I think the problem is with the test container, not with buildkit. The sleep process gets wedged in `zap_pid_ns_processes`, which seems to be related to parent/child reaping. The `cancelRun` was just giving up and shutting things down even though the sleep process persisted. I have updated the test container to run with `tini` as the entrypoint to better handle reaping during the Go tests, and so far it seems to work well. Hopefully it will make it through the GitHub workflows...
Looks like the tests all pass when run under `tini`, without the `cancelRun()` hack.
I have another idea why the zombie processes might be happening; let me test that out. In theory runc should be doing the waitpid, so there should be no zombies for `tini` to handle...
I have updated the PR again; the bug was in `(*runcExecutor).Exec`, and the `tini` hack put me on the right path. The runc pid1 (sleep) was getting wedged in `zap_pid_ns_processes`, which according to the docs: "The `zap_pid_ns_processes` function is used in Linux to terminate all processes within a specific namespace."

So the problem was not with the pid1 Kill directly; it was actually a zombie process from the `Exec` (the `sh` command) which was preventing `zap_pid_ns_processes` from finishing. It turns out we were using `context.Background()` for the `runc` call via `(*runcExecutor).Run` but the request `ctx` for the `runc` call via `(*runcExecutor).Exec`. So when `ctx.Done` fired, the parent `runc` command for the `Exec` was terminated immediately, before it could call waitpid on the child `sh` command.

I have moved the wait/kill logic into a common routine that is now called for both `Run` and `Exec`, and we use `context.Background()` for both `runc` calls now.
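To make that concrete, here is a minimal sketch of the shape of such a shared wait/kill helper (hypothetical names, signatures, and timeouts; `call` stands in for the runc Run/Exec invocation and `kill` for `w.runc.Kill`, not the actual buildkit code):

```go
import (
	"context"
	"time"
)

// Hypothetical sketch of a shared wait/kill helper used by both Run and Exec.
func runWithKill(ctx context.Context, call func(context.Context) error, kill func(context.Context) error) error {
	// runc gets a background-derived context so a cancelled request cannot
	// terminate it before it has had a chance to waitpid its child process.
	runCtx, cancelRun := context.WithCancel(context.Background())
	defer cancelRun()

	done := make(chan error, 1)
	go func() { done <- call(runCtx) }()

	for {
		select {
		case err := <-done:
			return err
		case <-ctx.Done():
			// The request was cancelled: SIGKILL the container, then keep
			// retrying until the runc call above actually returns.
			killCtx, cancelKill := context.WithTimeout(context.Background(), 10*time.Second)
			_ = kill(killCtx)
			cancelKill()
			select {
			case err := <-done:
				return err
			case <-time.After(50 * time.Millisecond):
				// loop and kill again
			}
		}
	}
}
```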
Force-pushed from f7eb53c to 38e37c5.
I got a build failure that seems unrelated, retrying:
Okay, tests passing, it looks like ...
The flakiness is likely from #3401.
@tonistiigi it would be great to get your review on this one again; I think all the issues are known/fixed at this point.
Looks good to me, the fix makes sense 🎉 I did spend some time trying to actually find the part of the PR that was relevant to the fix vs the runc issue. Could you split the busy loop fix into a separate commit from the runc fixes (and maybe from the tests as well if there's the potential to cherry-pick just the busy loop fix), to make it a bit easier to trace back in the history?
Am I right in guessing that we could just take the busy loop break for cherry-pick, and leave the tests and runc changes on master? Bit frustrating to not take the tests as well, but from my understanding we'd need the runc changes as well (which seem a bit riskier to cherry-pick to me).
Yeah, that is correct, we have two fixes here. One for the busy loop, one for zombie processes via runc.Exec. I can certainly split up the patch. One patch for the runc.Exec fix, one patch for the busy loop + test.
I suspect we really should cherry-pick both fixes. The zombie processes accumulating from context cancel for gateway exec processes are not great and lead to solves "getting stuck". In practice, I don't think gateway containers with a TTY are used much, so that is likely why we have not seen many problems with this, but both issues are pretty frustrating when trying to use gateway containers.
Force-pushed from 38e37c5 to 21a1d11.
I have split out the runc.Exec fix into #3722; we will need that merged first before the tests in this PR will pass.
This fixes an issue where the tty message handling loop will go into a tight loop and never exit upon context.Canceled. There is a select statement in `(*procMessageForwarder).Recv` that returns nil on ctx.Done, but the control loop in `(*container).Start` did not exit on this condition. I think the intent was to flush out any in-flight messages on cancel, but this is already done in `(*procMessageForwarder).Close`.

Signed-off-by: coryb <cbennett@netflix.com>
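For illustration, the shape of the loop being fixed looks roughly like this (a hypothetical sketch; `forwardMessages`, `recv`, and `handle` are stand-ins for `(*container).Start`, `(*procMessageForwarder).Recv`, and the real message handling):

```go
import "context"

type execMessage struct{} // stand-in for the real gateway message type

// Hypothetical sketch of the control-loop shape described above.
func forwardMessages(ctx context.Context, recv func(context.Context) *execMessage, handle func(*execMessage)) {
	for {
		msg := recv(ctx)
		if msg == nil {
			// recv returns nil once ctx is done (or the forwarder is closed).
			// Previously the loop kept iterating here and spun in a tight
			// loop on context.Canceled; returning is safe because Close()
			// already flushes any in-flight messages.
			return
		}
		handle(msg) // resize / input / exit handling in the real loop
	}
}
```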
Force-pushed from 21a1d11 to aa827f5.