
Support 'mixed' container runtime mode #39685

Closed

Conversation

ivan4th
Contributor

@ivan4th ivan4th commented Jan 10, 2017

(The PR is here for the purpose of discussion first)

This PR adds mixed CRI mode to kubelet. For example, the following
kubelet args may be used with this patch:
--experimental-cri --container-runtime=mixed --container-runtime-endpoint=/run/criproxy.sock --image-service-endpoint=/run/criproxy.sock

In mixed mode, kubelet starts in-process docker-shim, but connects to the specified container runtime / image service endpoints instead. This makes it possible to have multiple CRI implementations on the same node using CRI proxy (here's an example of such a proxy).
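The routing idea behind such a proxy can be sketched as follows. This is a minimal illustration, not the actual criproxy code: it assumes a hypothetical scheme where container IDs carry a "<runtime>:" prefix that the proxy uses to pick a backend, with unprefixed IDs falling through to the default runtime.

```go
package main

import (
	"fmt"
	"strings"
)

// runtimeFor resolves which backend runtime should handle a container ID.
// IDs are assumed to carry a "<runtime>:" prefix (e.g. "virtlet:abc123");
// unprefixed IDs fall through to the default runtime. This is a sketch of
// the general idea, not the actual criproxy routing code.
func runtimeFor(containerID, defaultRuntime string) (runtime, rawID string) {
	if i := strings.Index(containerID, ":"); i >= 0 {
		return containerID[:i], containerID[i+1:]
	}
	return defaultRuntime, containerID
}

func main() {
	r, id := runtimeFor("virtlet:abc123", "docker")
	fmt.Println(r, id) // the virtlet backend handles "abc123"
	r, id = runtimeFor("def456", "docker")
	fmt.Println(r, id) // no prefix, so the default (docker) backend is used
}
```

With a scheme like this, kubelet only ever talks to the single proxy socket, while each CRI call is forwarded to whichever backend owns the container.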

Why this may be necessary: currently using an 'exotic' CRI implementation like virtlet or other VM-based one means that the node that supports the runtime must be deployed in a manner different from the other nodes. E.g. kubeadm runs kube-proxy as a DaemonSet and thus it can't be used to prepare nodes for alternative CRI impls. Also, depending on the cluster, it may be necessary to have housekeeping / hardware support / etc. DaemonSets that should be run by all/most nodes, and the 'exotic' nodes will have no way to run pods from these DaemonSets. Last but not least, deploying and updating these CRI implementations themselves may be extra burden, while with 'proxy' mechanism in place it's possible to just run them as DaemonSets themselves.

This mechanism makes it possible to solve #29913 using an external solution.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 10, 2017

@k8s-github-robot k8s-github-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. release-note-label-needed labels Jan 10, 2017
@ivan4th
Contributor Author

ivan4th commented Jan 11, 2017

Any comments on whether/why such a change is acceptable/not acceptable are welcome.

@yujuhong
Contributor

Supporting privileged pods and/or pods with host networking is important for hypervisor-based runtimes in Kubernetes, but running kubelet in the "mixed" mode seems clumsy to me. You could simply run a standalone CRI shim for Docker, or for any other container runtime of your choosing, and let them hide behind your proxy.

@kubernetes/sig-node-misc
@feiskyer (since hyper faces the same problem)

@ivan4th
Contributor Author

ivan4th commented Jan 11, 2017

@yujuhong is it possible to run docker-shim without kubelet? I didn't know that was possible. I recall seeing mentions of a plan to make docker-shim run out-of-process, but couldn't find them again recently. If I have to extract the docker shim code, I'll have to track changes in the original implementation, which is an extra burden. Or am I missing something and there's an existing external docker shim project?

As a side note the proxy I mentioned is not tied to other parts of virtlet, so if there's interest in it we can extract it so it can be used by hyper and other runtimes.

@yujuhong
Contributor

@yujuhong is it possible to run docker-shim without kubelet? I didn't know that was possible. I recall seeing mentions of a plan to make docker-shim run out-of-process, but couldn't find them again recently. If I have to extract the docker shim code, I'll have to track changes in the original implementation, which is an extra burden. Or am I missing something and there's an existing external docker shim project?

It's not possible right now, but that doesn't mean we should exclude this option, and it may happen in the future :)
Back to your question: you don't necessarily need the built-in docker shim; any non-hypervisor-based container runtime and shim will do. Personally, I think supporting this "mixed" mode is too complicated and is not aligned with the long-term goal of decoupling kubelet and the shim. As a temporary solution, I'd suggest cloning and adapting dockershim into a standalone binary/process, or just using another CRI runtime/shim.

If you don't really need to run all the DaemonSets right away, you could also run kube-proxy directly on the node as a daemon. I believe Hypernetes does the same.

@feiskyer
Member

I think supporting this "mixed" mode is too complicated and is not aligned with the long-term goal of decoupling kubelet and the shim.

Agreed.

If you don't really need to run all the DaemonSets right away, you could also run kube-proxy directly on the node as a daemon. I believe Hypernetes does the same.

Hypernetes also doesn't support host networking today, so it just runs kube-proxy on the host. But a full-featured CRI implementation could indeed help a lot with managing various applications (including daemons). Adding a runtime proxy in the shim is one of the easiest ways.

@yujuhong
Contributor

@ivan4th I think using a separate proxy to dispatch requests and aggregate results to/from multiple runtimes is completely fine (and can benefit other hypervisor-based runtimes). For the short term, you may need to fork and run dockershim as a separate process, or pick another shim that has similar features.
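The dispatch/aggregate pattern suggested here could look roughly like the sketch below: the proxy fans a list request out to every registered shim and merges the results, tagging each ID with its runtime so later calls can be routed back. The shim interface and all names are simplifications invented for illustration, not the real CRI RuntimeService.

```go
package main

import "fmt"

// Container is a minimal stand-in for the CRI Container message.
type Container struct {
	ID      string
	Runtime string
}

// shim is a drastically simplified stand-in for a per-runtime CRI shim.
type shim interface {
	List() []Container
}

// fakeShim simulates a backend that owns a set of container IDs.
type fakeShim struct {
	name string
	ids  []string
}

func (f fakeShim) List() []Container {
	var out []Container
	for _, id := range f.ids {
		// Tag each ID with the owning runtime so the proxy can
		// route subsequent per-container calls back to it.
		out = append(out, Container{ID: f.name + ":" + id, Runtime: f.name})
	}
	return out
}

// listAll fans the request out to every shim and merges the results.
func listAll(shims []shim) []Container {
	var merged []Container
	for _, s := range shims {
		merged = append(merged, s.List()...)
	}
	return merged
}

func main() {
	shims := []shim{
		fakeShim{name: "docker", ids: []string{"a1"}},
		fakeShim{name: "virtlet", ids: []string{"v1", "v2"}},
	}
	for _, c := range listAll(shims) {
		fmt.Println(c.ID)
	}
}
```

Since kubelet sees only the merged view, it keeps behaving as if there were a single runtime on the node, which is the property the proxy approach relies on.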

@euank
Contributor

euank commented Jan 13, 2017

There are other places that assume one runtime as well.

For example, I think the container ID and version prefixes of docker://..... visible in kubectl describe pod and kubectl describe node, respectively, would only reflect one runtime, even for pods created with the other runtime.

@ivan4th
Contributor Author

ivan4th commented Jan 16, 2017

@yujuhong thanks for the advice. I tried running docker-shim from the proxy itself and it mostly worked: https://github.com/Mirantis/virtlet/blob/b29eab94d733a697825db9d99fe477d432c88aac/cmd/criproxy/criproxy.go#L67
The problem is that, at the moment, attaching to a container only works if the node is accessible from the host where kubectl runs, because of this line: https://github.com/kubernetes/kubernetes/blob/v1.5.2/pkg/kubelet/server/server.go#L614
(it redirects to a URL that, under most circumstances, can only be opened from within the cluster). I haven't quite figured out how to work around this yet and will research it more (we haven't implemented exec/attach in virtlet yet).
Another problem is that I somehow need to pick up some settings from the kubelet config on proxy startup, perhaps by just altering the deployment procedure, but that's not too bad I think.

@ivan4th
Contributor Author

ivan4th commented Jan 16, 2017

@euank is this a serious problem? I can handle Version() in the proxy in such a manner that there will just be proxy://..... container IDs. Not very informative, unfortunately, but not too bad either, I think.
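The neutral-prefix idea mentioned here can be illustrated with a tiny sketch: if the proxy's Version() reply reports a single neutral runtime name, kubelet-generated IDs read "proxy://<id>" regardless of which backend actually runs the container, and the proxy strips the scheme again on the way back in. Everything below is illustrative; it does not mirror the real CRI message types.

```go
package main

import (
	"fmt"
	"strings"
)

// proxyRuntimeName is the neutral name the proxy would report from
// Version(), so kubelet shows "proxy://<id>" for every container.
const proxyRuntimeName = "proxy"

// displayID builds the ID as it would appear in kubectl describe pod.
func displayID(rawID string) string {
	return proxyRuntimeName + "://" + rawID
}

// backendID strips the scheme again when a per-container request
// arrives, recovering the raw ID to hand to the owning backend.
func backendID(display string) string {
	return strings.TrimPrefix(display, proxyRuntimeName+"://")
}

func main() {
	fmt.Println(displayID("abc123"))
	fmt.Println(backendID("proxy://abc123"))
}
```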

@euank
Contributor

euank commented Jan 17, 2017

@ivan4th that one's only cosmetic, but I worry there are other places where the kubelet has ingrained assumptions that there is a single runtime, where it will cause more serious issues.

@yujuhong
Contributor

The problem is that, at the moment, attaching to a container only works if the node is accessible from the host where kubectl runs, because of this line: https://github.com/kubernetes/kubernetes/blob/v1.5.2/pkg/kubelet/server/server.go#L614
(it redirects to a URL that, under most circumstances, can only be opened from within the cluster). I haven't quite figured out how to work around this yet and will research it more (we haven't implemented exec/attach in virtlet yet).

Not sure I understand your question completely, but the redirect URL needs to be accessible from the apiserver (not from the kubectl client). You can read the design doc in #29579 (comment) for more information.

@ivan4th that one's only cosmetic, but I worry there are other places where the kubelet has ingrained assumptions there is a single runtime where it will cause more serious issues.

It's true that working with multiple runtimes on a node has never been on the roadmap of kubelet, so @ivan4th, you may run into problems from time to time. On the other hand, since kubelet sees only a proxy (i.e., a single runtime), I think it's possible to solve some (if not most) of the issues by implementing better coalescing logic in the proxy. It does require more work though.

@ivan4th
Contributor Author

ivan4th commented Jan 17, 2017

@yujuhong actually the problem was that I needed to pass --feature-gates=StreamingProxyRedirects=true to the apiserver. As for the problems, yes, I'll try to fix them as I go. I think I need to start by trying to run conformance tests against a cluster with the proxy.

Anyway, closing this PR because there appears to be a better solution (docker-shim in the proxy).

@ivan4th ivan4th closed this Jan 17, 2017