
task: don't close() io before cancel() #8643

Merged
merged 1 commit into containerd:main on Jun 7, 2023

Conversation

laurazard
Member

@laurazard laurazard commented Jun 5, 2023

The contract for cio/io.go/IO states that a call to Close() will always be preceded by a call to Cancel() -

// Close cleans up all open io resources. Cancel() is always called before
which isn't being upheld here.

Furthermore, the call to Close() here makes the subsequent Wait() moot, and causes issues for consumers (see: moby/moby#45689) – basically, if we close before Wait()ing, there will be times when the pipes get closed while task IO (stdin/stdout/whatever) is still being piped, so we lose task output.

It seems from

// Only cleanup the IO after a successful Delete
that Close() should be called there, so the call removed in this commit is unnecessary/erroneous.
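
To spell out the ordering this contract implies for a consumer, here's a minimal sketch; cleanupTaskIO and the deleted flag are hypothetical names used only for illustration, and only the Cancel()/Wait()/Close() methods come from cio.IO:

package taskio

import "github.com/containerd/containerd/cio"

// cleanupTaskIO sketches the order the contract implies: Cancel() first,
// then Wait() for the copying goroutines to drain, and Close() only after
// the task has been successfully deleted.
func cleanupTaskIO(taskIO cio.IO, deleted bool) {
	taskIO.Cancel() // stop goroutines still waiting on fifo opens
	taskIO.Wait()   // let in-flight stdout/stderr copies finish
	if deleted {
		_ = taskIO.Close() // release IO resources only after a successful Delete
	}
}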

@k8s-ci-robot

Hi @laurazard. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@laurazard
Member Author

/cc @dmcgowan @thaJeztah @corhere

@k8s-ci-robot

@laurazard: GitHub didn't allow me to request PR reviews from the following users: corhere.

Note that only containerd members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @dmcgowan @thaJeztah @corhere

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@thaJeztah
Member

/ok-to-test

@thaJeztah
Member

Looks like this was introduced in 55faa5e (#5974), which was created to address #5621.

@laurazard
Member Author

laurazard commented Jun 5, 2023

OH! I should've noticed that. I guess we might need a slightly more complex solution here. Closing task io before waiting doesn't seem correct to me, but maybe we can change Wait() to ensure it doesn't lock in the situation described in #5621.

Other than that, it still seems incorrect (albeit maybe harmless) to always call t.io.Close() twice ("maybe" since the comment

// Only cleanup the IO after a successful Delete
seems to imply that Close() shouldn't be called before a successful task delete)

@laurazard
Member Author

laurazard commented Jun 5, 2023

Investigating #5621 (and the fix in 55faa5e), it seems to me that adding the Close() call there was a bad solution. If the call to Wait() hangs on Windows in certain states, the root cause for that should be fixed.


Warning:
This is a lot of speculation and I didn't get to test everything out, but hopefully it can point someone towards the root cause of the issue on Windows.

Since this happens only on restore, we can follow the thread and we find that:

The specific IO that the Wait() is blocking on is created in

containerIO, err = cio.NewContainerIO(id,
which looks like
func NewContainerIO(id string, opts ...ContainerIOOpts) (_ *ContainerIO, err error) {
	c := &ContainerIO{
		id:          id,
		stdoutGroup: cioutil.NewWriterGroup(),
		stderrGroup: cioutil.NewWriterGroup(),
	}
	for _, opt := range opts {
		if err := opt(c); err != nil {
			return nil, err
		}
	}
	if c.fifos == nil {
		return nil, errors.New("fifos are not set")
	}
	// Create actual fifos.
	stdio, closer, err := newStdioPipes(c.fifos)
	if err != nil {
		return nil, err
	}
	c.stdioPipes = stdio
	c.closer = closer
	return c, nil
}

and the Wait() being called is
c.closer.Wait()
where the closer there comes from
func newStdioPipes(fifos *cio.FIFOSet) (_ *stdioPipes, _ *wgCloser, err error) {
	var (
		f           io.ReadWriteCloser
		set         []io.Closer
		ctx, cancel = context.WithCancel(context.Background())
		p           = &stdioPipes{}
	)
	defer func() {
		if err != nil {
			for _, f := range set {
				f.Close()
			}
			cancel()
		}
	}()
	if fifos.Stdin != "" {
		if f, err = openPipe(ctx, fifos.Stdin, syscall.O_WRONLY|syscall.O_CREAT|syscall.O_NONBLOCK, 0700); err != nil {
			return nil, nil, err
		}
		p.stdin = f
		set = append(set, f)
	}
	if fifos.Stdout != "" {
		if f, err = openPipe(ctx, fifos.Stdout, syscall.O_RDONLY|syscall.O_CREAT|syscall.O_NONBLOCK, 0700); err != nil {
			return nil, nil, err
		}
		p.stdout = f
		set = append(set, f)
	}
	if fifos.Stderr != "" {
		if f, err = openPipe(ctx, fifos.Stderr, syscall.O_RDONLY|syscall.O_CREAT|syscall.O_NONBLOCK, 0700); err != nil {
			return nil, nil, err
		}
		p.stderr = f
		set = append(set, f)
	}
	return p, &wgCloser{
		wg:     &sync.WaitGroup{},
		set:    set,
		ctx:    ctx,
		cancel: cancel,
	}, nil
}
for which the Wait() is
g.wg.Wait()
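
wgCloser itself isn't quoted above, but from the fields it's constructed with, it presumably looks roughly like the sketch below; only Wait() is confirmed by the snippet, the Close() and Cancel() bodies are my assumption:

package sketch

import (
	"context"
	"io"
	"sync"
)

// Rough reconstruction of wgCloser based on how newStdioPipes constructs it
// above (the exact implementation may differ).
type wgCloser struct {
	ctx    context.Context
	wg     *sync.WaitGroup
	set    []io.Closer
	cancel context.CancelFunc
}

// Wait blocks until every goroutine registered on wg (the copiers started
// by Pipe(), shown below) has called wg.Done().
func (g *wgCloser) Wait() {
	g.wg.Wait()
}

// Close presumably closes every pipe collected in set.
func (g *wgCloser) Close() {
	for _, f := range g.set {
		f.Close()
	}
}

// Cancel presumably cancels the context the pipes were opened with.
func (g *wgCloser) Cancel() {
	g.cancel()
}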

The wg there is referenced in
func (c *ContainerIO) Pipe() {

which gets called in containerd/pkg/cri/server/restart.go – which fits in with our theory.

Pipe() doesn't look suspicious, and doesn't break on unix systems:

func (c *ContainerIO) Pipe() {
	wg := c.closer.wg
	if c.stdout != nil {
		wg.Add(1)
		go func() {
			if _, err := io.Copy(c.stdoutGroup, c.stdout); err != nil {
				log.L.WithError(err).Errorf("Failed to pipe stdout of container %q", c.id)
			}
			c.stdout.Close()
			c.stdoutGroup.Close()
			wg.Done()
			log.L.Debugf("Finish piping stdout of container %q", c.id)
		}()
	}
	if !c.fifos.Terminal && c.stderr != nil {
		wg.Add(1)
		go func() {
			if _, err := io.Copy(c.stderrGroup, c.stderr); err != nil {
				log.L.WithError(err).Errorf("Failed to pipe stderr of container %q", c.id)
			}
			c.stderr.Close()
			c.stderrGroup.Close()
			wg.Done()
			log.L.Debugf("Finish piping stderr of container %q", c.id)
		}()
	}
}

We can see then that the likely culprit here is some blocking in

if _, err := io.Copy(c.stdoutGroup, c.stdout); err != nil {

which prevents the subsequent wg.Done() from ever being reached, locking the Wait() out. This is also consistent with the issue only appearing on Windows, since there's a platform-specific implementation for the pipe's read/write:

func (p *pipe) Write(b []byte) (int, error) {
	p.conWg.Wait()
	if p.conErr != nil {
		return 0, fmt.Errorf("connection error: %w", p.conErr)
	}
	return p.con.Write(b)
}

func (p *pipe) Read(b []byte) (int, error) {
	p.conWg.Wait()
	if p.conErr != nil {
		return 0, fmt.Errorf("connection error: %w", p.conErr)
	}
	return p.con.Read(b)
}
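
To make the suspected failure mode concrete, here is a minimal, self-contained sketch (not containerd code; blockingReader is a stand-in for a restored pipe whose Read never returns) showing how a single stuck io.Copy keeps wg.Done() from running and leaves wg.Wait() hanging:

package main

import (
	"fmt"
	"io"
	"sync"
	"time"
)

// blockingReader models a pipe whose connection never completes, like
// p.conWg.Wait() above blocking forever on a broken restored pipe.
type blockingReader struct{}

func (blockingReader) Read(p []byte) (int, error) {
	<-make(chan struct{}) // block forever
	return 0, nil
}

func main() {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		// Mirrors the copy loop in Pipe(): it never finishes, so wg.Done()
		// is never reached.
		_, _ = io.Copy(io.Discard, blockingReader{})
	}()

	done := make(chan struct{})
	go func() {
		wg.Wait() // the Wait() that hangs in the restore path
		close(done)
	}()

	select {
	case <-done:
		fmt.Println("Wait() returned")
	case <-time.After(time.Second):
		fmt.Println("Wait() still blocked after 1s: the suspected deadlock")
	}
}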

It's likely that this is only an issue in this case since we're "re-attaching" existing IO for stopped tasks on Windows, and those pipes are somehow broken and blocking. This happens around

containerd/container.go

Lines 407 to 413 in f92e576

if ioAttach != nil && response.Process.Status != tasktypes.Status_UNKNOWN {
	// Do not attach IO for task in unknown state, because there
	// are no fifo paths anyway.
	if i, err = attachExistingIO(response, ioAttach); err != nil {
		return nil, err
	}
}


Calling Close() first "fixes" the issue, but breaks a bunch of other contracts and makes it impossible for a consumer to properly handle an exit.

@thaJeztah
Member

@cpuguy83 just brought up #8334, which is related to this area and was just merged (in case it's useful).

// io.Wait locks for restored tasks on Windows unless we call
// io.Close first (https://github.com/containerd/containerd/issues/5621)
// in other cases, preserve the contract and let IO finish before closing
if t.client.runtime == fmt.Sprintf("%s.%s", plugin.RuntimePlugin, "windows") {
Member Author

@laurazard laurazard Jun 6, 2023


In order not to reintroduce #5621, which was fixed by #5974, we leave the io.Close() call here on Windows only (for now, until the underlying issue re: io.Wait() locking for restored tasks on Windows is fixed).

Member


Looks good to me. The Delete should be called after the task has been stopped.
The kernel or shim will ensure that all the processes created by this task are killed. We just need to ensure that t.io.Cancel() won't close the fifo file (ref: #8334).

@laurazard
Member Author

/cc @AkihiroSuda @fuweid


// in other cases, preserve the contract and let IO finish before closing
if t.client.runtime == fmt.Sprintf("%s.%s", plugin.RuntimePlugin, "windows") {
	t.io.Close()
}
t.io.Cancel()
Member


maybe we should add a comment like "Cancel is used to cancel the goroutine which is still in the fifo-opening state (the container hasn't been started yet). It's not trying to stop the pipe, because the pipe should be stopped by the shim side. Otherwise we might lose data from the container"

Member Author


That sounds good! From this and the other PRs, it looks like there's a lot of confusion around what Cancel does/should do. I'll add the comment.

Member


Thanks @laurazard! The comment can be handled in a follow-up.

Member Author


Oop, I was a bit too fast, I've added it in this commit. Lmk if you think that looks good @fuweid

The contract for `cio/io.go/IO` states that a call to `Close()`
will always be preceded by a call to `Cancel()` -
https://github.com/containerd/containerd/blob/f3a07934b49bf142925a5913e3e19f3528eda0d2/cio/io.go#L59
which isn't being upheld here.

Furthermore, the call to `Close()` here makes the subsequent `Wait()`
moot, and causes issues to consumers (see: moby/moby#45689)

It seems from
https://github.com/containerd/containerd/blob/f3a07934b49bf142925a5913e3e19f3528eda0d2/task.go#L338
that `Close()` should be called there, so the call removed in this
commit is unnecessary/erroneous.

We leave the `Close()` call on Windows only since this was introduced
in containerd#5974 to address
containerd#5621.

Signed-off-by: Laura Brehm <laurabrehm@hey.com>
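
Putting the two snippets quoted in the review threads together with the suggested comment, the cleanup path presumably ends up looking roughly like this fragment (not the verbatim merged code; the comment wording is paraphrased and the trailing Wait() is inferred from the PR description):

// Rough reconstruction of the task cleanup after this change.
// io.Wait locks for restored tasks on Windows unless we call
// io.Close first (https://github.com/containerd/containerd/issues/5621);
// in other cases, preserve the contract and let IO finish before closing.
if t.client.runtime == fmt.Sprintf("%s.%s", plugin.RuntimePlugin, "windows") {
	t.io.Close()
}
// Cancel only stops the goroutines still waiting to open the fifos (the
// container may never have started); it does not stop the copying itself,
// which is the shim's responsibility, so no task output is lost.
t.io.Cancel()
t.io.Wait()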
Member

@fuweid fuweid left a comment


LGTM!

Member

@dcantah dcantah left a comment


Thanks LGTM

@dmcgowan dmcgowan merged commit 67349c1 into containerd:main Jun 7, 2023
45 checks passed
@fuweid fuweid added the cherry-pick/1.7.x Change to be cherry picked to release/1.7 branch label Jun 8, 2023
@mikebrow mikebrow added the cherry-pick/1.6.x Change to be cherry picked to release/1.6 branch label Jun 8, 2023
@thaJeztah thaJeztah added cherry-picked/1.6.x PR commits are cherry-picked into release/1.6 branch cherry-picked/1.7.x PR commits are cherry-picked into release/1.7 branch and removed cherry-pick/1.6.x Change to be cherry picked to release/1.6 branch cherry-pick/1.7.x Change to be cherry picked to release/1.7 branch labels Jul 14, 2023