Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement streaming CRI methods in dockershim #35661

Merged
merged 1 commit into from Oct 29, 2016

Conversation

@timstclair
Copy link

commented Oct 27, 2016

NOTE: Temporarily includes commit from #35330 - only review the second commit.

Builds on #35330, using the library to implement the streaming methods in various CRI shims.

This does not actually wire up the new streaming methods in the kubelet (that will be my next PR). Once the new methods are wired up, I will delete the Legacy{Exec,Attach,PortForward} methods.

For #29579

/cc @kubernetes/sig-node @feiskyer


This change is Reviewable

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented Oct 27, 2016

Jenkins GCI GKE smoke e2e failed for commit bacf050. Full PR test history.

The magic incantation to run this job again is @k8s-bot gci gke e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

// If command exits with a non-zero exit code, an error is returned.
ExecSync(containerID string, cmd []string, timeout time.Duration) (stdout []byte, stderr []byte, err error)
// Exec prepares a streaming endpoint to execute a command in the container, and returns the address.
Exec(*runtimeApi.ExecRequest) (*runtimeApi.ExecResponse, error)

This comment has been minimized.

Copy link
@feiskyer

feiskyer Oct 27, 2016

Member

just returns string (indicates the url) instead of ExecResponse here? (same with Attach and PortForward)

This comment has been minimized.

Copy link
@timstclair

timstclair Oct 27, 2016

Author

I see these methods delegating entirely to the server library (see implementation). Using the request / response here makes it easier to simply pass through to the library. I'm open to changing it, but I think we should change the library implementation to mirror this API in that case.

Exec(containerID string, cmd []string, tty bool, stdin io.Reader, stdout, stderr io.WriteCloser) error
// ExecSync executes a command in the container, and returns the stdout output.
// If command exits with a non-zero exit code, an error is returned.
ExecSync(containerID string, cmd []string, timeout time.Duration) (stdout []byte, stderr []byte, err error)

This comment has been minimized.

Copy link
@feiskyer

feiskyer Oct 27, 2016

Member

exit code is also needed here, some applications may check exit code instead of error messages.

This comment has been minimized.

Copy link
@feiskyer

feiskyer Oct 27, 2016

Member

Update: exit code is not required here because it is handled by utilexec.ExitError

This comment has been minimized.

Copy link
@timstclair
if err != nil {
return err
}

This comment has been minimized.

Copy link
@feiskyer

feiskyer Oct 27, 2016

Member

nit: check container status here? just returns an error if container is not running. (container may exit just after the url is returned)

This comment has been minimized.

Copy link
@timstclair
Timeout: &timeoutSeconds,
}
resp, err := r.runtimeClient.ExecSync(ctx, req)
if err != nil {

This comment has been minimized.

Copy link
@feiskyer

feiskyer Oct 27, 2016

Member

nit:
wrap the error to utilexec.ExitError, e.g.

        if exitErr, ok := err.(*exec.ExitError); ok {
        return &utilexec.ExitErrorWrapper{ExitError: exitErr}
    }

This comment has been minimized.

Copy link
@timstclair

timstclair Oct 27, 2016

Author

This is the error returned by the RPC client, so I'm pretty sure it would never be an instance of exec.ExitError. It will be the server implementation's responsibility to convert exit errors into an appropriate response.

This comment has been minimized.

Copy link
@feiskyer

feiskyer Oct 28, 2016

Member

I'm concerned how to extract the exit code correctly. It's easy to get the exit code from ExecSync, but not easy to add it to streaming Exec without extracting from error.

This comment has been minimized.

Copy link
@timstclair

timstclair Oct 28, 2016

Author

If an error is returned here, it means that the exec either did not start or terminate, so there is no exit code. What is the usecase for getting the exit code from the streaming exec?

This comment has been minimized.

Copy link
@feiskyer

feiskyer Oct 28, 2016

Member

What is the usecase for getting the exit code from the streaming exec?

Users also expect an exit code while the command exec failed. There is already an test about this, so let's check in next PR of wiring everything togather.


err = nil
if resp.GetExitCode() != 0 {
err = fmt.Errorf("command '%s' exited with %d: %s", strings.Join(cmd, " "), resp.GetExitCode(), resp.GetStderr())

This comment has been minimized.

Copy link
@feiskyer

feiskyer Oct 27, 2016

Member

also returns a utilexec.ExitError here

This comment has been minimized.

Copy link
@timstclair

timstclair Oct 27, 2016

Author

Done. (Using CodeExitError instead)


resp, err := r.runtimeClient.Attach(ctx, req)
if err != nil {
glog.Errorf("Attach %s from runtime service failed: %v", req.GetContainerId(), err)

This comment has been minimized.

Copy link
@feiskyer

feiskyer Oct 27, 2016

Member

maybe also wrap to utilexec.ExitError here

This comment has been minimized.

Copy link
@timstclair

@timstclair timstclair force-pushed the timstclair:dockershim branch from bacf050 to e176a32 Oct 27, 2016

@googlebot googlebot added the cla: yes label Oct 27, 2016

@timstclair
Copy link
Author

left a comment

Thanks for the review! Addressed comments and rebased with #35330.

Exec(containerID string, cmd []string, tty bool, stdin io.Reader, stdout, stderr io.WriteCloser) error
// ExecSync executes a command in the container, and returns the stdout output.
// If command exits with a non-zero exit code, an error is returned.
ExecSync(containerID string, cmd []string, timeout time.Duration) (stdout []byte, stderr []byte, err error)

This comment has been minimized.

Copy link
@timstclair
// If command exits with a non-zero exit code, an error is returned.
ExecSync(containerID string, cmd []string, timeout time.Duration) (stdout []byte, stderr []byte, err error)
// Exec prepares a streaming endpoint to execute a command in the container, and returns the address.
Exec(*runtimeApi.ExecRequest) (*runtimeApi.ExecResponse, error)

This comment has been minimized.

Copy link
@timstclair

timstclair Oct 27, 2016

Author

I see these methods delegating entirely to the server library (see implementation). Using the request / response here makes it easier to simply pass through to the library. I'm open to changing it, but I think we should change the library implementation to mirror this API in that case.

if err != nil {
return err
}

This comment has been minimized.

Copy link
@timstclair
Timeout: &timeoutSeconds,
}
resp, err := r.runtimeClient.ExecSync(ctx, req)
if err != nil {

This comment has been minimized.

Copy link
@timstclair

timstclair Oct 27, 2016

Author

This is the error returned by the RPC client, so I'm pretty sure it would never be an instance of exec.ExitError. It will be the server implementation's responsibility to convert exit errors into an appropriate response.


err = nil
if resp.GetExitCode() != 0 {
err = fmt.Errorf("command '%s' exited with %d: %s", strings.Join(cmd, " "), resp.GetExitCode(), resp.GetStderr())

This comment has been minimized.

Copy link
@timstclair

timstclair Oct 27, 2016

Author

Done. (Using CodeExitError instead)


resp, err := r.runtimeClient.Attach(ctx, req)
if err != nil {
glog.Errorf("Attach %s from runtime service failed: %v", req.GetContainerId(), err)

This comment has been minimized.

Copy link
@timstclair
@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented Oct 27, 2016

Jenkins verification failed for commit e176a32. Full PR test history.

The magic incantation to run this job again is @k8s-bot verify test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

}

var _ DockerLegacyService = &dockerService{}

This comment has been minimized.

Copy link
@yujuhong

yujuhong Oct 27, 2016

Member

Hm...I don't think this is necessary since we already embed DockerLegacyService in the DockerService interface.

This comment has been minimized.

Copy link
@timstclair

timstclair Oct 28, 2016

Author

Oh yeah. I forget why I thought I needed this...

seccompProfileRoot: seccompProfileRoot,
client: dockertools.NewInstrumentedDockerInterface(client),
os: kubecontainer.RealOS{},
podSandboxImage: podSandboxImage,
streamingRuntime: &streamingRuntime{
client: client,
execHandler: &dockertools.NativeExecHandler{},

This comment has been minimized.

Copy link
@yujuhong

yujuhong Oct 27, 2016

Member

Add a comment saying that we don't support non-native exec handlers for now, and a TODO to either deprecate or add support for nsenter.

This comment has been minimized.

Copy link
@timstclair

timstclair Oct 28, 2016

Author

Done (filed #35747)

return err
}

// TODO(timstclair): Clean this up once PR#33366 merges.

This comment has been minimized.

Copy link
@yujuhong

yujuhong Oct 27, 2016

Member

#33366 doesn't implement the timeout, are you going to move code below to ExecInContainer?

This comment has been minimized.

Copy link
@timstclair

timstclair Oct 28, 2016

Author

Oh yeah, I was thinking d82bc85 was part of #33366. I don't think this method should be responsible for underlying handling of the timeout, so I can implement it in the native exec handler in a separate PR. /cc @rhcarvalho

This comment has been minimized.

Copy link
@rhcarvalho

rhcarvalho Oct 31, 2016

Contributor

#27956 has the code implementing the timeout, though we are not able to implement proper timeouts using the Docker API.

To speed up the discussion of timeout in the func signature vs. what respects the timeout, we extracted #33366 to do the former, then we can bring the actual implementations.

FYI this is an effort to fix #26895.

return err
}

id := kubecontainer.BuildContainerID("docker", containerID)

This comment has been minimized.

Copy link
@yujuhong

yujuhong Oct 27, 2016

Member

nit: change dockertools.AttachContainer to accept a string as an ID, since eventually we'll use this as an internal function in dockershim. This also get rid of the need to import kubecontainer in this file.

This comment has been minimized.

Copy link
@timstclair
return nil, err
}
tty := container.Config.Tty
return ds.streamingServer.GetAttach(req, tty)

This comment has been minimized.

Copy link
@yujuhong

yujuhong Oct 27, 2016

Member

Hm...I checked the code again, GetAttach embeds the tty option in its URL, but serveAttach never uses. Seems redundant, no?

This comment has been minimized.

Copy link
@timstclair

timstclair Oct 28, 2016

Author

It's used to establish the resize stream, here:

if opts.tty && handler.supportsTerminalResizing() {

}

id := kubecontainer.BuildContainerID("docker", containerID)
tty := container.Config.Tty

This comment has been minimized.

Copy link
@yujuhong

yujuhong Oct 27, 2016

Member

Hmm.....You check tty configuration on the container twice -- in Attach (line 117) and here. Is that necessary? I guess the real question is why don't we just pipe the option downstream to avoid the extra check, or simply not check until we have to perform the attach operation (i.e., here)?

This comment has been minimized.

Copy link
@timstclair

timstclair Oct 28, 2016

Author

Yeah, we discussed that a bit here: #35330 (comment) /cc @euank

I'd like to just add the tty argument to the AttachRequest, and get the Kubelet to supply it. That would clean up a lot of the extra tty checking in here. Are you OK waiting for a follow up PR to clean up all the uses?

return ds.streamingServer.GetPortForward(req)
}

func (ds *dockerService) checkContainerStatus(containerID string) (*dockertypes.ContainerJSON, error) {

This comment has been minimized.

Copy link
@yujuhong

yujuhong Oct 27, 2016

Member

IIUC, the check is just an optimization to avoid unnecessary redirect+spinning up the server.

This comment has been minimized.

Copy link
@timstclair

timstclair Oct 28, 2016

Author

This is called twice now, once when constructing the URL (before redirect) and right before initiating the streams (after redirect). I assume the actual exec command would fail if the container isn't running, so you're correct that this is just an optimization. I think it also helps to fail earlier in the stack & highlight the requirements.

@timstclair
Copy link
Author

left a comment

Thanks for the review. All comments address. One question around following up on the TTY in a separate PR.

seccompProfileRoot: seccompProfileRoot,
client: dockertools.NewInstrumentedDockerInterface(client),
os: kubecontainer.RealOS{},
podSandboxImage: podSandboxImage,
streamingRuntime: &streamingRuntime{
client: client,
execHandler: &dockertools.NativeExecHandler{},

This comment has been minimized.

Copy link
@timstclair

timstclair Oct 28, 2016

Author

Done (filed #35747)

}

var _ DockerLegacyService = &dockerService{}

This comment has been minimized.

Copy link
@timstclair

timstclair Oct 28, 2016

Author

Oh yeah. I forget why I thought I needed this...

return err
}

// TODO(timstclair): Clean this up once PR#33366 merges.

This comment has been minimized.

Copy link
@timstclair

timstclair Oct 28, 2016

Author

Oh yeah, I was thinking d82bc85 was part of #33366. I don't think this method should be responsible for underlying handling of the timeout, so I can implement it in the native exec handler in a separate PR. /cc @rhcarvalho

return nil, err
}
tty := container.Config.Tty
return ds.streamingServer.GetAttach(req, tty)

This comment has been minimized.

Copy link
@timstclair

timstclair Oct 28, 2016

Author

It's used to establish the resize stream, here:

if opts.tty && handler.supportsTerminalResizing() {

}

id := kubecontainer.BuildContainerID("docker", containerID)
tty := container.Config.Tty

This comment has been minimized.

Copy link
@timstclair

timstclair Oct 28, 2016

Author

Yeah, we discussed that a bit here: #35330 (comment) /cc @euank

I'd like to just add the tty argument to the AttachRequest, and get the Kubelet to supply it. That would clean up a lot of the extra tty checking in here. Are you OK waiting for a follow up PR to clean up all the uses?

return err
}

id := kubecontainer.BuildContainerID("docker", containerID)

This comment has been minimized.

Copy link
@timstclair
return ds.streamingServer.GetPortForward(req)
}

func (ds *dockerService) checkContainerStatus(containerID string) (*dockertypes.ContainerJSON, error) {

This comment has been minimized.

Copy link
@timstclair

timstclair Oct 28, 2016

Author

This is called twice now, once when constructing the URL (before redirect) and right before initiating the streams (after redirect). I assume the actual exec command would fail if the container isn't running, so you're correct that this is just an optimization. I think it also helps to fail earlier in the stack & highlight the requirements.

@feiskyer

This comment has been minimized.

Copy link
Member

commented Oct 28, 2016

@k8s-bot cri test this.

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented Oct 28, 2016

Jenkins CRI GCE e2e failed for commit 3b681f7. Full PR test history.

The magic incantation to run this job again is @k8s-bot cri e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@yujuhong

This comment has been minimized.

Copy link
Member

commented Oct 28, 2016

LGTM. We can address the slightly complicated tty parameter passing problem in a followup PR.

@timstclair timstclair force-pushed the timstclair:dockershim branch from 3b681f7 to c60db99 Oct 28, 2016

@timstclair

This comment has been minimized.

Copy link
Author

commented Oct 28, 2016

Synced & squashed.

@timstclair timstclair added the lgtm label Oct 28, 2016

@feiskyer

This comment has been minimized.

Copy link
Member

commented Oct 28, 2016

LGTM

@k8s-github-robot

This comment has been minimized.

Copy link
Contributor

commented Oct 29, 2016

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit f099403 into kubernetes:master Oct 29, 2016

12 checks passed

Jenkins GCE Node e2e Build succeeded.
Details
Jenkins GCE e2e Build succeeded.
Details
Jenkins GCE etcd3 e2e Build succeeded.
Details
Jenkins GCI GCE e2e Build succeeded.
Details
Jenkins GCI GKE smoke e2e Build succeeded.
Details
Jenkins GKE smoke e2e Build succeeded.
Details
Jenkins Kubemark GCE e2e Build succeeded.
Details
Jenkins unit/integration Build succeeded.
Details
Jenkins verification Build succeeded.
Details
Submit Queue Queued to run github e2e tests a second time.
Details
cla/google All necessary CLAs are signed
cla/linuxfoundation timstclair authorized
Details
case err := <-errCh:
return err
case <-time.After(timeout):
return streaming.ErrorTimeout("exec", timeout)

This comment has been minimized.

Copy link
@rhcarvalho

rhcarvalho Oct 31, 2016

Contributor

@timstclair this code path leaks a goroutine, because the send operation in line 60 will never succeed without a receive. This is a common pattern where we make errCh buffered, with size 1.

errCh := make(chan error, 1)

And if we expect frequent calls (and I think we do, probes are executed often), then https://golang.org/pkg/time/#After is not very efficient because we don't cleanup until the timer fires (even though we might return before the timeout...)

This comment has been minimized.

Copy link
@feiskyer

feiskyer Oct 31, 2016

Member

@rhcarvalho File an issue for the bug?

This comment has been minimized.

Copy link
@timstclair

timstclair Oct 31, 2016

Author

Good catch, I'll send a pr to fix it.

deads2k pushed a commit to deads2k/kubernetes that referenced this pull request Dec 2, 2016

Kubernetes Submit Queue
Merge pull request kubernetes#36615 from timstclair/cri-attach-tty
Automatic merge from submit-queue

[CRI] Add TTY flag to AttachRequest

Follow up from kubernetes#35661
For kubernetes#29579

- Add TTY to the CRI AttachRequest
- Moves responsibility from the runtime shim to the Kubelet for populating the TTY bool in the request based on the container spec

/cc @euank @feiskyer @kubernetes/sig-node
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.