
fix gateway exec tty cleanup on context.Canceled #3658

Merged 1 commit from gateway-exec-cancel into moby:master on Mar 22, 2023

Conversation

@coryb (Collaborator) commented Feb 21, 2023

This fixes an issue where the tty message handling loop goes into a tight loop and never exits upon context.Canceled. There is a select statement in `(*procMessageForwarder).Recv` that returns nil on ctx.Done, but the control loop in `(*container).Start` did not exit on this condition. I think the intent was to flush out any inflight messages on cancel, but this is already done in `(*procMessageForwarder).Close`.
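As a minimal, self-contained sketch of the bug pattern (the `forwarder` type and names below are hypothetical; this is not the actual buildkit code): `Recv` returns nil once the context is canceled, so the consuming loop has to treat a nil message as terminal instead of iterating again.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

type forwarder struct{ msgs chan string }

// Recv mirrors the behavior described for (*procMessageForwarder).Recv:
// it returns nil (no error) once the context is done.
func (f *forwarder) Recv(ctx context.Context) *string {
	select {
	case <-ctx.Done():
		return nil
	case m := <-f.msgs:
		return &m
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	f := &forwarder{msgs: make(chan string)}

	go func() {
		time.Sleep(10 * time.Millisecond)
		cancel()
	}()

	for {
		msg := f.Recv(ctx)
		if msg == nil {
			// Without this check the loop would spin forever on nil
			// messages after cancellation -- the tight loop this PR fixes.
			fmt.Println("context canceled, exiting loop:", ctx.Err())
			return
		}
		fmt.Println("got message:", *msg)
	}
}
```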

@coryb (Collaborator Author) commented Feb 21, 2023

looking at the test failures ...

```diff
@@ -310,6 +310,7 @@ func (w *runcExecutor) Run(ctx context.Context, id string, root executor.Mount,
 		timeout()
 		select {
 		case <-time.After(50 * time.Millisecond):
+			cancelRun()
```
coryb (Collaborator Author):

This allows the tests to pass; I am a little confused about why it was necessary. The pid1 `sleep 10` was continuing to run, even though `ctx.Done` was triggered almost immediately. Then `w.runc.Kill` ran and did not return an error, but the `w.run` call below didn't terminate. So this loop waited 50ms and then ran the kill 9 again and again, until the pid1 ended after 10s. I am not sure how the kill 9 is not actually terminating the runc process. With this change we cancel the ctx passed into `w.run` after the kill 9 is ignored, and the process ends up exiting as expected.
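For context, here is a rough, self-contained sketch of the pattern this loop implements (variable names like `cmd`, `done`, and `reqCtx` are made up; this is not buildkit's actual executor code): after the request context is canceled, SIGKILL is retried every 50ms, and the added `cancelRun()` cancels the context driving the run when the kill appears to be ignored.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"syscall"
	"time"
)

func main() {
	// The context that actually drives the process, analogous to the ctx
	// passed into w.run; cancelRun mirrors the cancelRun() added above.
	runCtx, cancelRun := context.WithCancel(context.Background())
	defer cancelRun()

	cmd := exec.CommandContext(runCtx, "sleep", "10")
	if err := cmd.Start(); err != nil {
		panic(err)
	}
	done := make(chan error, 1)
	go func() { done <- cmd.Wait() }()

	// Simulate the request context being canceled almost immediately.
	reqCtx, cancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancel()
	<-reqCtx.Done()

	for {
		// Ask the process to die (the "kill 9" in the discussion).
		_ = cmd.Process.Signal(syscall.SIGKILL)
		select {
		case err := <-done:
			fmt.Println("process exited:", err)
			return
		case <-time.After(50 * time.Millisecond):
			// Kill appears to have been ignored; fall back to canceling
			// the run context, as the added cancelRun() line does.
			cancelRun()
		}
	}
}
```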

Member:

This does not look quite right. I'm not sure why sigkill wouldn't work (can we replicate it outside of the test?), but even when we handle a misbehaving runc that doesn't react properly to sigkill, we should give it more than 50ms to shut down normally.

coryb (Collaborator Author):

I have tried to reproduce the steps by calling runc directly, and so far it works; I'm not sure what the difference is. I will keep looking, although I have very little time these days. The only thing I can think of is that the sigkill signal is not actually being sent to the runc or sleep process, though I'm not sure how or why.

Member:

If you can see that this works outside of the test, can you just add a counter before calling cancelRun() so that case does not get hit after 50ms but only after a couple of seconds, for example? Also add some comments describing why we are doing this.
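A small, hypothetical sketch of what that suggestion could look like (the helper name `killUntilExited`, the package name, and the constants are made up for illustration; this is not the code that was merged):

```go
package executorutil // hypothetical package, for illustration only

import (
	"context"
	"os"
	"syscall"
	"time"
)

// killUntilExited retries SIGKILL every 50ms, but only gives up and cancels
// the context driving the run after roughly two seconds rather than on the
// very first 50ms tick.
func killUntilExited(proc *os.Process, done <-chan error, cancelRun context.CancelFunc) error {
	const retryInterval = 50 * time.Millisecond
	const maxRetries = 40 // ~2s at 50ms per retry

	retries := 0
	for {
		_ = proc.Signal(syscall.SIGKILL)
		select {
		case err := <-done:
			return err
		case <-time.After(retryInterval):
			retries++
			if retries >= maxRetries {
				// runc is not reacting to SIGKILL; cancel the run context
				// so the executor does not hang forever.
				cancelRun()
			}
		}
	}
}
```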

coryb (Collaborator Author):

I spent some time debugging this; I think the problem is with the test container, not with buildkit. The sleep process gets wedged in zap_pid_ns_processes, which seems to be related to parent/child reaping. The cancelRun was just giving up and shutting things down even though the sleep process persisted. I have updated the test container to run with tini as the entrypoint to better handle reaping during the Go tests, and so far it seems to work well. Hopefully it will make it through the GitHub workflows...

coryb (Collaborator Author):

Looks like the tests all pass when run under tini without the cancelRun() hack.

coryb (Collaborator Author):

I have another idea about why the zombie processes might be happening; let me test that out. In theory runc should be doing the waitpid, so there should be no zombies for tini to handle...

coryb (Collaborator Author):

I have updated the PR again; the bug was in `(*runcExecutor).Exec`, and the tini hack put me on the right path. The runc pid1 (sleep) was getting wedged in `zap_pid_ns_processes`, which according to the docs:

> The zap_pid_ns_processes function is used in Linux to terminate all processes within a specific namespace

So the problem was not with the pid1 Kill directly; it was actually a zombie process from the Exec (the sh command) that was preventing zap_pid_ns_processes from finishing. It turns out we were using context.Background() for the runc call via `(*runcExecutor).Run` but were using the request ctx for the runc call via `(*runcExecutor).Exec`. So when ctx.Done happened, the parent runc command for the Exec was terminated immediately, before it could call waitpid on the child sh command.

I have moved the wait/kill logic into a common routine that is now called for both Run and Exec, and we use context.Background() for both runc calls.
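A rough sketch of the shape of that fix (names like `runAndWait` and `startFunc` are hypothetical; this is not the actual buildkit code): the runc invocation is driven by context.Background(), so request cancellation can never kill the parent runc before it has reaped its child, and a single shared helper handles the cancel/kill path for both Run and Exec. The request ctx only ever triggers the kill; it never terminates the runc process directly.

```go
package executor // illustrative sketch only

import (
	"context"
	"time"
)

// startFunc stands in for whatever actually invokes runc (run or exec).
type startFunc func(ctx context.Context) error

// runAndWait is a hypothetical shared helper used by both Run and Exec.
func runAndWait(ctx context.Context, kill func() error, start startFunc) error {
	// The runc process itself is driven by context.Background(), NOT the
	// request ctx: if the request is canceled, runc must stay alive long
	// enough to waitpid its child, otherwise the child is left as a zombie
	// and zap_pid_ns_processes never finishes.
	runCtx, cancelRun := context.WithCancel(context.Background())
	defer cancelRun()

	done := make(chan error, 1)
	go func() { done <- start(runCtx) }()

	select {
	case err := <-done:
		return err
	case <-ctx.Done():
		// The request was canceled: ask the container process to die,
		// but give runc itself time to reap it before giving up.
		_ = kill()
		select {
		case err := <-done:
			return err
		case <-time.After(10 * time.Second):
			// Last resort: tear down the runc invocation itself.
			cancelRun()
			return <-done
		}
	}
}
```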

@coryb force-pushed the gateway-exec-cancel branch 2 times, most recently from f7eb53c to 38e37c5 on March 3, 2023 16:46
@coryb (Collaborator Author) commented Mar 3, 2023

I got a build failure that seems unrelated, retrying:

```
Error Trace:	/src/client/client_test.go:6512
Error:      	content still exists
Test:       	TestIntegration/TestDiffCircularSymlinks/worker=containerd-1.6
```

@coryb (Collaborator Author) commented Mar 3, 2023

Okay, tests are passing; it looks like TestIntegration/TestDiffCircularSymlinks is flaky.

@jedevc (Member) commented Mar 6, 2023

The flakiness is likely from #3401.

@coryb (Collaborator Author) commented Mar 9, 2023

@tonistiigi it would be great to get your review on this one again; I think all the issues are known/fixed at this point.

@tonistiigi requested a review from jedevc on March 11, 2023 02:48
@jedevc (Member) left a review comment:

Looks good to me, the fix makes sense 🎉 I did spend some time trying to actually find the part of the PR that was relevant to the fix vs the runc issue. Could you split the busy loop fix into a separate commit from the runc fixes (and maybe from the tests as well if there's the potential to cherry-pick just the busy loop fix), to make it a bit easier to trace back in the history?

Am I right in guessing that we could just take the busy loop break for cherry-pick, and leave the tests and runc changes on master? Bit frustrating to not take the tests as well, but from my understanding we'd need the runc changes as well (which seem a bit riskier to cherry-pick to me).

@coryb (Collaborator Author) commented Mar 15, 2023

> Looks good to me, the fix makes sense 🎉 I did spend some time trying to actually find the part of the PR that was relevant to the fix vs the runc issue. Could you split the busy loop fix into a separate commit from the runc fixes (and maybe from the tests as well if there's the potential to cherry-pick just the busy loop fix), to make it a bit easier to trace back in the history?

Yeah, that is correct; we have two fixes here: one for the busy loop, one for zombie processes via runc.Exec. I can certainly split up the patch: one patch for the runc.Exec fix, one for the busy loop + test.

> Am I right in guessing that we could just take the busy loop break for cherry-pick, and leave the tests and runc changes on master? Bit frustrating to not take the tests as well, but from my understanding, we'd need the runc changes as well (which seem a bit riskier to cherry-pick to me).

I suspect we really should cherry-pick both fixes. The zombie processes accumulating from context cancel for gateway exec processes are not great and lead to solves "getting stuck". In practice, I don't think gateway containers with a TTY are used much, which is likely why we have not seen many problems with this, but both issues are pretty frustrating when trying to use gateway containers.

@coryb (Collaborator Author) commented Mar 16, 2023

I have split out the runc.Exec fix into #3722; we will need that merged first before the tests in this PR will pass.

This fixes an issue where the tty message handling loop will go into a
tight loop and never exit upon context.Canceled.  There is a select
statement in `(*procMessageForwarder).Recv` that returns nil on
ctx.Done, but the control loop in `(*container).Start` did not exit on
this condition.  I think the intent was to flush out any inflight
messages on cancel, but this is already done in
`(*procMessageForwarder).Close`.

Signed-off-by: coryb <cbennett@netflix.com>
@tonistiigi merged commit eb5a51e into moby:master on Mar 22, 2023
@crazy-max added this to the v0.11.5 milestone on Mar 22, 2023