Skip to content
This repository has been archived by the owner on Mar 9, 2022. It is now read-only.

kubectl exec does not work #417

Closed
ijc opened this issue Nov 15, 2017 · 9 comments
Closed

kubectl exec does not work #417

ijc opened this issue Nov 15, 2017 · 9 comments
Milestone

Comments

@ijc
Copy link
Contributor

ijc commented Nov 15, 2017

I've been investigating this for a while on LinuxKit but I have now reproduced on an Ubuntu 16.04 provisioned using the contrib/ansible stuff as well, with both the default releases installed and with self built up to date versions of containerd and cri-containerd added:

root@cri-containerd:~# ctr version
Client:
  Version:  v1.0.0-beta.3-42-g19168f98
  Revision: 19168f98fec8e2a0adb77d3d248e65e27faa056b

Server:
  Version:  v1.0.0-beta.3-42-g19168f98
  Revision: 19168f98fec8e2a0adb77d3d248e65e27faa056b
root@cri-containerd:~# cri-containerd version
1.0.0-alpha.1-62-g6d4e41e

The logs below are using the development versions.

I see two different failure modes depending on my use of the -i/--stdin flag. I think there are 3 issues here:

  1. Without --tty errors with code = InvalidArgument desc = one of stdin, stdout, or stderr must be set.
  2. Without --tty doesn't actually return the above error (just logs it).
  3. With --tty times out.

Without --tty

Without -i I see (with or without -t):

root@cri-containerd:~# DEBUG=10 kubectl -v 10 -n kube-system exec etcd-cri-containerd ls /
I1115 15:13:49.979882   18102 loader.go:357] Config loaded from file /home/ubuntu/.kube/config
I1115 15:13:49.981252   18102 round_trippers.go:417] curl -k -v -XGET  -H "Accept: application/json, */*" -H "User-Agent: kubectl/v1.8.3 (linux/amd64) kubernetes/f0efb3c" https://192.168.140.249:6443/api/v1/namespaces/kube-system/pods/etcd-cri-containerd
I1115 15:13:49.991500   18102 round_trippers.go:436] GET https://192.168.140.249:6443/api/v1/namespaces/kube-system/pods/etcd-cri-containerd 200 OK in 10 milliseconds
I1115 15:13:49.991516   18102 round_trippers.go:442] Response Headers:
I1115 15:13:49.991526   18102 round_trippers.go:445]     Content-Type: application/json
I1115 15:13:49.991532   18102 round_trippers.go:445]     Content-Length: 2681
I1115 15:13:49.991537   18102 round_trippers.go:445]     Date: Wed, 15 Nov 2017 15:13:49 GMT
I1115 15:13:49.991588   18102 request.go:836] Response Body: {"kind":"Pod","apiVersion":"v1","metadata":{"name":"etcd-cri-containerd","namespace":"kube-system","selfLink":"/api/v1/namespaces/kube-system/pods/etcd-cri-containerd","uid":"79115eb6-ca0f-11e7-9e7d-cabceaed32ef","resourceVersion":"2691","creationTimestamp":"2017-11-15T14:15:46Z","labels":{"component":"etcd","tier":"control-plane"},"annotations":{"kubernetes.io/config.hash":"d76e26fba3bf2bfd215eb29011d55250","kubernetes.io/config.mirror":"d76e26fba3bf2bfd215eb29011d55250","kubernetes.io/config.seen":"2017-11-15T14:14:18.180315887Z","kubernetes.io/config.source":"file","scheduler.alpha.kubernetes.io/critical-pod":""}},"spec":{"volumes":[{"name":"etcd","hostPath":{"path":"/var/lib/etcd","type":"DirectoryOrCreate"}}],"containers":[{"name":"etcd","image":"gcr.io/google_containers/etcd-amd64:3.0.17","command":["etcd","--listen-client-urls=http://127.0.0.1:2379","--advertise-client-urls=http://127.0.0.1:2379","--data-dir=/var/lib/etcd"],"resources":{},"volumeMounts":[{"name":"etcd","mountPath":"/var/lib/etcd"}],"livenessProbe":{"httpGet":{"path":"/health","port":2379,"host":"127.0.0.1","scheme":"HTTP"},"initialDelaySeconds":15,"timeoutSeconds":15,"periodSeconds":10,"successThreshold":1,"failureThreshold":8},"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"IfNotPresent"}],"restartPolicy":"Always","terminationGracePeriodSeconds":30,"dnsPolicy":"ClusterFirst","nodeName":"cri-containerd","hostNetwork":true,"securityContext":{},"schedulerName":"default-scheduler","tolerations":[{"operator":"Exists","effect":"NoExecute"}]},"status":{"phase":"Running","conditions":[{"type":"Initialized","status":"True","lastProbeTime":null,"lastTransitionTime":"2017-11-15T15:10:45Z"},{"type":"Ready","status":"True","lastProbeTime":null,"lastTransitionTime":"2017-11-15T15:10:46Z"},{"type":"PodScheduled","status":"True","lastProbeTime":null,"lastTransitionTime":"2017-11-15T15:10:45Z"}],"hostIP":"192.168.140.249","podIP":"192.168.140.249","startTime":"2017-11-15T15:10:45Z","containerStatuses":[{"name":"etcd","state":{"running":{"startedAt":"2017-11-15T15:10:46Z"}},"lastState":{"terminated":{"exitCode":255,"reason":"Unknown","startedAt":"2017-11-15T14:58:46Z","finishedAt":"2017-11-15T15:10:29Z","containerID":"cri-containerd://b3a3c89987d5b8debb210adb0597c38dca82bec4ad7a6cad176cf7d108e26152"}},"ready":true,"restartCount":3,"image":"gcr.io/google_containers/etcd-amd64:3.0.17","imageID":"gcr.io/google_containers/etcd-amd64@sha256:d83d3545e06fb035db8512e33bd44afb55dea007a3abd7b17742d3ac6d235940","containerID":"cri-containerd://816863612ff4b44f76309dad04d41e2186a182e9e8c829359b9b758ccdef81e1"}],"qosClass":"BestEffort"}}
I1115 15:13:50.041895   18102 round_trippers.go:417] curl -k -v -XPOST  -H "X-Stream-Protocol-Version: v4.channel.k8s.io" -H "X-Stream-Protocol-Version: v3.channel.k8s.io" -H "X-Stream-Protocol-Version: v2.channel.k8s.io" -H "X-Stream-Protocol-Version: channel.k8s.io" https://192.168.140.249:6443/api/v1/namespaces/kube-system/pods/etcd-cri-containerd/exec?command=ls&command=%2F&container=etcd&container=etcd&stderr=true&stdout=true
I1115 15:13:50.042049   18102 round_trippers.go:417] curl -k -v -XPOST  -H "X-Stream-Protocol-Version: v4.channel.k8s.io" -H "X-Stream-Protocol-Version: v3.channel.k8s.io" -H "X-Stream-Protocol-Version: v2.channel.k8s.io" -H "X-Stream-Protocol-Version: channel.k8s.io" -H "User-Agent: kubectl/v1.8.3 (linux/amd64) kubernetes/f0efb3c" https://192.168.140.249:6443/api/v1/namespaces/kube-system/pods/etcd-cri-containerd/exec?command=ls&command=%2F&container=etcd&container=etcd&stderr=true&stdout=true
I1115 15:13:50.130809   18102 round_trippers.go:436] POST https://192.168.140.249:6443/api/v1/namespaces/kube-system/pods/etcd-cri-containerd/exec?command=ls&command=%2F&container=etcd&container=etcd&stderr=true&stdout=true 500 Internal Server Error in 88 milliseconds
I1115 15:13:50.130828   18102 round_trippers.go:442] Response Headers:
I1115 15:13:50.130833   18102 round_trippers.go:445]     Date: Wed, 15 Nov 2017 15:13:50 GMT
I1115 15:13:50.130836   18102 round_trippers.go:445]     Content-Length: 84
I1115 15:13:50.130839   18102 round_trippers.go:445]     Content-Type: text/plain; charset=utf-8
I1115 15:13:50.130844   18102 round_trippers.go:436] POST https://192.168.140.249:6443/api/v1/namespaces/kube-system/pods/etcd-cri-containerd/exec?command=ls&command=%2F&container=etcd&container=etcd&stderr=true&stdout=true 500 Internal Server Error in 88 milliseconds
I1115 15:13:50.130849   18102 round_trippers.go:442] Response Headers:
I1115 15:13:50.130852   18102 round_trippers.go:445]     Content-Length: 84
I1115 15:13:50.130856   18102 round_trippers.go:445]     Content-Type: text/plain; charset=utf-8
I1115 15:13:50.130859   18102 round_trippers.go:445]     Date: Wed, 15 Nov 2017 15:13:50 GMT
I1115 15:13:50.131040   18102 helpers.go:207] server response object: [{
  "metadata": {}
}]
F1115 15:13:50.131050   18102 helpers.go:120] Error from server: 

From journalctl -u cri-containerd the error (which isn't logged above!) is:

Nov 15 15:13:50 cri-containerd cri-containerd[1073]: E1115 15:13:50.129498    1073 instrumented_service.go:203] Exec for "816863612ff4b44f76309dad04d41e2186a182e9e8c829359b9b758ccdef81e1" failed, error: rpc error: code = InvalidArgument desc = one of stdin, stdout, or stderr must be set

With --tty

# DEBUG=10 kubectl -v 10 -n kube-system exec -it etcd-cri-containerd ls /
I1115 15:15:35.067361   26914 loader.go:357] Config loaded from file /home/ubuntu/.kube/config
I1115 15:15:35.068694   26914 round_trippers.go:417] curl -k -v -XGET  -H "User-Agent: kubectl/v1.8.3 (linux/amd64) kubernetes/f0efb3c" -H "Accept: application/json, */*" https://192.168.140.249:6443/api/v1/namespaces/kube-system/pods/etcd-cri-containerd
I1115 15:15:35.084667   26914 round_trippers.go:436] GET https://192.168.140.249:6443/api/v1/namespaces/kube-system/pods/etcd-cri-containerd 200 OK in 15 milliseconds
I1115 15:15:35.084698   26914 round_trippers.go:442] Response Headers:
I1115 15:15:35.084702   26914 round_trippers.go:445]     Date: Wed, 15 Nov 2017 15:15:35 GMT
I1115 15:15:35.084713   26914 round_trippers.go:445]     Content-Type: application/json
I1115 15:15:35.084718   26914 round_trippers.go:445]     Content-Length: 2681
I1115 15:15:35.084782   26914 request.go:836] Response Body: {"kind":"Pod","apiVersion":"v1","metadata":{"name":"etcd-cri-containerd","namespace":"kube-system","selfLink":"/api/v1/namespaces/kube-system/pods/etcd-cri-containerd","uid":"79115eb6-ca0f-11e7-9e7d-cabceaed32ef","resourceVersion":"2691","creationTimestamp":"2017-11-15T14:15:46Z","labels":{"component":"etcd","tier":"control-plane"},"annotations":{"kubernetes.io/config.hash":"d76e26fba3bf2bfd215eb29011d55250","kubernetes.io/config.mirror":"d76e26fba3bf2bfd215eb29011d55250","kubernetes.io/config.seen":"2017-11-15T14:14:18.180315887Z","kubernetes.io/config.source":"file","scheduler.alpha.kubernetes.io/critical-pod":""}},"spec":{"volumes":[{"name":"etcd","hostPath":{"path":"/var/lib/etcd","type":"DirectoryOrCreate"}}],"containers":[{"name":"etcd","image":"gcr.io/google_containers/etcd-amd64:3.0.17","command":["etcd","--listen-client-urls=http://127.0.0.1:2379","--advertise-client-urls=http://127.0.0.1:2379","--data-dir=/var/lib/etcd"],"resources":{},"volumeMounts":[{"name":"etcd","mountPath":"/var/lib/etcd"}],"livenessProbe":{"httpGet":{"path":"/health","port":2379,"host":"127.0.0.1","scheme":"HTTP"},"initialDelaySeconds":15,"timeoutSeconds":15,"periodSeconds":10,"successThreshold":1,"failureThreshold":8},"terminationMessagePath":"/dev/termination-log","terminationMessagePolicy":"File","imagePullPolicy":"IfNotPresent"}],"restartPolicy":"Always","terminationGracePeriodSeconds":30,"dnsPolicy":"ClusterFirst","nodeName":"cri-containerd","hostNetwork":true,"securityContext":{},"schedulerName":"default-scheduler","tolerations":[{"operator":"Exists","effect":"NoExecute"}]},"status":{"phase":"Running","conditions":[{"type":"Initialized","status":"True","lastProbeTime":null,"lastTransitionTime":"2017-11-15T15:10:45Z"},{"type":"Ready","status":"True","lastProbeTime":null,"lastTransitionTime":"2017-11-15T15:10:46Z"},{"type":"PodScheduled","status":"True","lastProbeTime":null,"lastTransitionTime":"2017-11-15T15:10:45Z"}],"hostIP":"192.168.140.249","podIP":"192.168.140.249","startTime":"2017-11-15T15:10:45Z","containerStatuses":[{"name":"etcd","state":{"running":{"startedAt":"2017-11-15T15:10:46Z"}},"lastState":{"terminated":{"exitCode":255,"reason":"Unknown","startedAt":"2017-11-15T14:58:46Z","finishedAt":"2017-11-15T15:10:29Z","containerID":"cri-containerd://b3a3c89987d5b8debb210adb0597c38dca82bec4ad7a6cad176cf7d108e26152"}},"ready":true,"restartCount":3,"image":"gcr.io/google_containers/etcd-amd64:3.0.17","imageID":"gcr.io/google_containers/etcd-amd64@sha256:d83d3545e06fb035db8512e33bd44afb55dea007a3abd7b17742d3ac6d235940","containerID":"cri-containerd://816863612ff4b44f76309dad04d41e2186a182e9e8c829359b9b758ccdef81e1"}],"qosClass":"BestEffort"}}
I1115 15:15:35.146951   26914 round_trippers.go:417] curl -k -v -XPOST  -H "X-Stream-Protocol-Version: v4.channel.k8s.io" -H "X-Stream-Protocol-Version: v3.channel.k8s.io" -H "X-Stream-Protocol-Version: v2.channel.k8s.io" -H "X-Stream-Protocol-Version: channel.k8s.io" https://192.168.140.249:6443/api/v1/namespaces/kube-system/pods/etcd-cri-containerd/exec?command=ls&command=%2F&container=etcd&container=etcd&stdin=true&stdout=true&tty=true
                                                                                                                                                                                                                         I1115 15:15:35.146996   26914 round_trippers.go:417] curl -k -v -XPOST  -H "X-Stream-Protocol-Version: v4.channel.k8s.io" -H "X-Stream-Protocol-Version: v3.channel.k8s.io" -H "X-Stream-Protocol-Version: v2.channel.k8s.io" -H "X-Stream-Protocol-Version: channel.k8s.io" -H "User-Agent: kubectl/v1.8.3 (linux/amd64) kubernetes/f0efb3c" https://192.168.140.249:6443/api/v1/namespaces/kube-system/pods/etcd-cri-containerd/exec?command=ls&command=%2F&container=etcd&container=etcd&stdin=true&stdout=true&tty=true
I1115 15:15:35.239230   26914 round_trippers.go:436] POST https://192.168.140.249:6443/api/v1/namespaces/kube-system/pods/etcd-cri-containerd/exec?command=ls&command=%2F&container=etcd&container=etcd&stdin=true&stdout=true&tty=true 101 Switching Protocols in 92 milliseconds
I1115 15:15:35.239247   26914 round_trippers.go:442] Response Headers:
I1115 15:15:35.239252   26914 round_trippers.go:445]     X-Stream-Protocol-Version: v4.channel.k8s.io
I1115 15:15:35.239255   26914 round_trippers.go:445]     Date: Wed, 15 Nov 2017 15:15:35 GMT
I1115 15:15:35.239258   26914 round_trippers.go:445]     Connection: Upgrade
I1115 15:15:35.239261   26914 round_trippers.go:445]     Upgrade: SPDY/3.1
I1115 15:15:35.239265   26914 round_trippers.go:436] POST https://192.168.140.249:6443/api/v1/namespaces/kube-system/pods/etcd-cri-containerd/exec?command=ls&command=%2F&container=etcd&container=etcd&stdin=true&stdout=true&tty=true 101 Switching Protocols in 92 milliseconds
I1115 15:15:35.239269   26914 round_trippers.go:442] Response Headers:
I1115 15:15:35.239273   26914 round_trippers.go:445]     Date: Wed, 15 Nov 2017 15:15:35 GMT
I1115 15:15:35.239275   26914 round_trippers.go:445]     Connection: Upgrade
I1115 15:15:35.239278   26914 round_trippers.go:445]     Upgrade: SPDY/3.1
I1115 15:15:35.239280   26914 round_trippers.go:445]     X-Stream-Protocol-Version: v4.channel.k8s.io
I1115 15:15:35.239321   26914 logs.go:41] (0xc420087ef0) (0xc420727c20) Create stream
I1115 15:15:35.239329   26914 logs.go:41] (0xc420087ef0) (0xc420727c20) Stream added, broadcasting: 1
I1115 15:15:35.240239   26914 logs.go:41] (0xc420087ef0) Reply frame received for 1
I1115 15:15:35.240258   26914 logs.go:41] (0xc420087ef0) (0xc4206b4460) Create stream
I1115 15:15:35.240262   26914 logs.go:41] (0xc420087ef0) (0xc4206b4460) Stream added, broadcasting: 3
I1115 15:15:35.240591   26914 logs.go:41] (0xc420087ef0) Reply frame received for 3
I1115 15:15:35.240610   26914 logs.go:41] (0xc420087ef0) (0xc420727cc0) Create stream
I1115 15:15:35.240614   26914 logs.go:41] (0xc420087ef0) (0xc420727cc0) Stream added, broadcasting: 5
I1115 15:15:35.240953   26914 logs.go:41] (0xc420087ef0) Reply frame received for 5
I1115 15:15:35.240968   26914 logs.go:41] (0xc420087ef0) (0xc4206b4500) Create stream
I1115 15:15:35.240972   26914 logs.go:41] (0xc420087ef0) (0xc4206b4500) Stream added, broadcasting: 7
I1115 15:15:35.289997   26914 logs.go:41] (0xc420087ef0) Data frame received for 5
I1115 15:15:35.290011   26914 logs.go:41] (0xc420727cc0) (5) Data frame handling
I1115 15:15:35.302264   26914 logs.go:41] (0xc420087ef0) Data frame received for 1
I1115 15:15:35.302277   26914 logs.go:41] (0xc420727c20) (1) Data frame handling
I1115 15:15:35.302453   26914 logs.go:41] (0xc420087ef0) (0xc4206b4460) Stream removed, broadcasting: 3
«30s delay here»
I1115 15:16:05.241199   26914 logs.go:41] (0xc420087ef0) (0xc420727c20) Stream removed, broadcasting: 1
I1115 15:16:05.241233   26914 logs.go:41] (0xc420087ef0) (0xc4206b4460) Stream removed, broadcasting: 3
I1115 15:16:05.241240   26914 logs.go:41] (0xc420087ef0) (0xc420727cc0) Stream removed, broadcasting: 5
I1115 15:16:05.241292   26914 logs.go:41] (0xc420727cc0) (5) Data frame not sent (stream shut down)
F1115 15:16:05.241302   26914 helpers.go:120] error: Timeout occured

After I1115 15:15:35.146951 26914 round_trippers.go:417] things apparently go raw and the output is staircased (ie. \n is no longer interpreted as \r\n), I've cleaned up the indentation manually.

I've traced through a fair bit of this case but still haven't been able to track down the issue. From my tracing the process is being launched and is outputting things to its stdout, but it doesn't seem to go anywhere. I'm not 100% sure that the process's stdio is going to the FIFO rather than to /dev/null -- I had an strace which seemed to show runc dupping fd 9,10 and 11 onto 0,1,2 and observed that what I identified as the parent process appeared to be opening /dev/null onto fd 9, 10 and 11. It's possible that I've misidentified the parent though, or been confused by goroutines (which are a bit hard to follow in strace!).

About 1 time in 100 this case does actually work, so I suspect a race rather than an incorrect invocation.

Using the docker runtime all works fine.

@ijc
Copy link
Contributor Author

ijc commented Nov 15, 2017

I'm not 100% sure that the process's stdio is going to the FIFO rather than to /dev/null -- I had an strace which seemed to show runc dupping fd 9,10 and 11 onto 0,1,2 and observed that what I identified as the parent process appeared to be opening /dev/null onto fd 9, 10 and 11. It's possible that I've misidentified the parent though, or been confused by goroutines (which are a bit hard to follow in strace!).

As expected I was wrong here, I missed a sequence in the [2:CHILD] phase of runc which does:

openat(AT_FDCWD, "/dev/pts/0", O_RDWR)  = 8
dup3(8, 0, 0)                           = 0
dup3(8, 1, 0)                           = 1
dup3(8, 2, 0)                           = 2

Before the eventual execve.

@mikebrow
Copy link
Member

mikebrow commented Nov 15, 2017

which versions of runc and cri-containerd? @Random-Liu has been checking in prs to address io issues.

@ijc
Copy link
Contributor Author

ijc commented Nov 15, 2017

In the Ubuntu repro setup case:

$ runc --version
runc version 1.0.0-rc4+dev
commit: 74a17296470088de3805e138d3d87c62e613dfc4
spec: 1.0.0
$ cri-containerd version
1.0.0-alpha.1-62-g6d4e41e

In the LinuxKit case runc is 0351df1c5a66838d0c392b4ac4cf9450de844e2d and cri-containerd is the same as above.

That's the runc which containerd beta.3 suggests and the latest cri-containerd.

@Random-Liu
Copy link
Member

@ijc It's caused by this change kubernetes/kubernetes#52686.

It seems that with this change, Kubernetes <=1.8 won't work with new CRI container runtime.
@yujuhong Should we patch Kubernetes <=1.8, or should we update CRI version in 1.9 and behave differently in cri-containerd based on the version?

@Random-Liu
Copy link
Member

Random-Liu commented Nov 16, 2017

@ijc I discussed with @yujuhong about this today.

It is a little bit hard to fix the version skew problem because we don't have a complete api machinery for CRI now.

Are you only trying out cri-containerd or there is some user need this fix?

  • If it's the former, we could update Kubernetes version to a 1.9 tag version, such as https://github.com/kubernetes/kubernetes/releases/tag/v1.9.0-alpha.2. And we'll say newest cri-containerd only support 1.9+ in the newest release.
  • If it's the latter, we do have an idea in mind to hack this around. Just want to avoid hack as much as possible. :)

In the future their might be some other disruptive change in CRI. That's why we say CRI is still alpha. :)

@ijc
Copy link
Contributor Author

ijc commented Nov 16, 2017

A user mentioned it in #linuxkit (on the docker community slack). I understood it to just be something they noticed rather than a critical issue for them.

AIUI 1.9 alpha.0 is mid December and I suppose 1.9 proper will be early in the new year, so no need to hack around IMHO, it's not very long to wait (especially once various holidays are subtracted out).

Users who want a fully functioning system right now can always use the (non-alpha) Docker runtime. FWIW docker for mac with Kube integrated is using Docker right now, so we don't have a requirement from there yet.

@Random-Liu Random-Liu added this to the v1.0.0-beta.0 milestone Nov 16, 2017
@Random-Liu
Copy link
Member

@ijc Thanks! I'll keep this issue open until we release the beta version. In the release note, we should point that out.

@Random-Liu
Copy link
Member

@czm4514 The new cri-containerd will only work with Kubernetes 1.9 because of some CRI change kubernetes/kubernetes#52686.

What cri-containerd and kubernetes version are you using?

@ijc
Copy link
Contributor Author

ijc commented Dec 18, 2017

I updated Linuxkit to Kube v1.9.0 and cri-containerd beta.0 and can confirm that kubectl exec now works, thanks!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants