
[Kubemark] Failures in master kubelet trying to start pods #68190

Closed
shyamjvs opened this issue Sep 3, 2018 · 9 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. sig/scalability Categorizes an issue or PR as relevant to SIG Scalability.

Comments

@shyamjvs (Member) commented Sep 3, 2018

We recently started observing flaky failures in a couple of kubemark jobs.

E.g. a failed run: https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-kubemark-500-gce/16634

The reason seems to be that the kubelet was continuously failing to start the kube-apiserver pod with errors like:

E0902 22:43:47.915221    2169 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed to start sandbox container for pod "kube-apiserver-kubemark-500-kubemark-master": Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:245: running exec setns process for init caused \"exit status 16\""
E0902 22:43:47.915292    2169 kuberuntime_sandbox.go:54] CreatePodSandbox for pod "kube-apiserver-kubemark-500-kubemark-master_kube-system(bd6f956583b54bdf5a90b93b7e3a7e3d)" failed: rpc error: code = Unknown desc = failed to start sandbox container for pod "kube-apiserver-kubemark-500-kubemark-master": Error response from daemon: oci runtime error: container_linux.go:247: starting container process caused "process_linux.go:245: running exec setns process for init caused \"exit status 16\""

I'll try digging into it a bit, but @yujuhong @mtaufen - do you have any leads on why this might be happening?

cc @kubernetes/sig-scalability-bugs @wojtek-t
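
A minimal sketch of node-side checks that might narrow this down, assuming SSH access to the kubemark master VM and systemd-managed docker and kubelet there (unit names and log retention may differ):

# Runtime-side failures around the repeated sandbox starts
sudo journalctl -u docker --since "2 hours ago" | grep -i "exit status 16"
# Kubelet's view of the failing pod sandboxes
sudo journalctl -u kubelet --since "2 hours ago" | grep -i "RunPodSandbox"
# Record runtime and runc versions in case the failure is version-specific
sudo docker version
sudo docker info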

@k8s-ci-robot k8s-ci-robot added the sig/scalability Categorizes an issue or PR as relevant to SIG Scalability. label Sep 3, 2018
@shyamjvs shyamjvs changed the title [Kubemark] Failures in kubelet trying to start kube-apiserver [Kubemark] Failures in master kubelet trying to start pods Sep 3, 2018
@shyamjvs (Member, Author) commented Sep 3, 2018

Changed the title, as it seems like the kubelet is failing to start all master pods (not just the apiserver).

@dims (Member) commented Sep 4, 2018

@shyamjvs possibly related to moby/moby#31614 ?

@bclouser commented Dec 3, 2018

For what it's worth, I think I am seeing this as well, or at least something similar:

Error

Failed create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "jenkins-slave-j69pq": Error response from daemon: OCI runtime create failed: container_linux.go:348: starting container process caused "process_linux.go:301: running exec setns process for init caused \"signal: killed\"": unknown

I'm using jenkins/jnlp-slave configured manually from the Jenkins UI. The error appears when Kubernetes attempts to create the pod.

Versions

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.0", GitCommit:"0ed33881dc4355495f623c6f22e7dd0b7632b7c0", GitTreeState:"clean", BuildDate:"2018-09-27T16:55:41Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"12", GitVersion:"v1.12.0", GitCommit:"0ed33881dc4355495f623c6f22e7dd0b7632b7c0", GitTreeState:"clean", BuildDate:"2018-09-27T16:55:41Z", GoVersion:"go1.10.4", Compiler:"gc", Platform:"linux/amd64"}

$ docker --version
Docker version 18.06.1-ce, build e68fc7a

Jenkins ver. 2.138.3

Using the latest jenkins/jnlp-slave:alpine container (eb079fd09f8e)

Bonus

Interestingly, if I add the container to the cluster manually, configured for a static Jenkins node, it comes up all smiles:
kubectl create -f ./jnlp-slave.yaml

jnlp-slave.yaml:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    app: jenkins-slave
  name: jenkins-slave
spec:
  replicas: 1
  strategy:
    type: Recreate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: jenkins-slave
    spec:
      containers:
      - env:
        - name: JENKINS_URL
          value: http://10.12.1.72
        - name: JENKINS_SECRET
          # I am using a static node configuration 'benTheBuilder' as a test
          value: 8bdc272187ec3f9fc1fa89df540a10d3e1b77178e5cb9d3bda0db8b8acb9e7a7
        - name: JENKINS_AGENT_NAME
          value: benTheBuilder
        image: jenkins/jnlp-slave:alpine
        name: jenkins-slave
        ports:
        resources: {}

I am happy to provide more info if this is indeed related.
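
A small diagnostic sketch for this "signal: killed" variant, assuming shell access to the node that scheduled the pod (the pod name is taken from the error above; one plausible cause, consistent with the later comments in this thread, is the sandbox init process being OOM-killed):

# Pod events should show the repeated sandbox-creation failures
kubectl describe pod jenkins-slave-j69pq
# On the node, check whether the kernel OOM killer terminated the
# runtime's exec/setns init process
dmesg | grep -iE 'oom|killed process'
# Docker's own log of the failed OCI runtime create
sudo journalctl -u docker | grep -i 'OCI runtime create failed'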

@fejta-bot
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 3, 2019
@fejta-bot
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Apr 19, 2019
@wojtek-t (Member)
Seems obsolete now.

@sulunemre (Contributor)
@bclouser I have a similar issue; did you find any solution? Thanks!

@Resonance1584 commented Aug 27, 2019

I encountered this error when I misconfigured a deployment to have

          requests:
            memory: "200m"
            cpu: "500m"
          limits:
            memory: "300m"
            cpu: "500m"

Note the incorrect m instead of Mi on the memory quantities.
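
For reference, a hedged sketch of the fix: the m suffix on a memory quantity parses as milli-bytes (fractions of a byte), so the effective limit is essentially zero, which is why the container's init process gets killed; Mi (mebibytes) is almost certainly what was intended. The deployment name below is a placeholder:

# Reapply the same numbers as mebibytes instead of milli-bytes
kubectl set resources deployment my-deployment \
  --requests=cpu=500m,memory=200Mi \
  --limits=cpu=500m,memory=300Mi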

@betinro commented Mar 2, 2020 (quoting @Resonance1584's comment above):

I encountered this error when I mis-configured a deployment to have

          requests:
            memory: "200m"
            cpu: "500m"
          limits:
            memory: "300m"
            cpu: "500m"

Note the incorrect m instead of Mi

The same happens if you also specify memory limits that are too low. I tried starting with a 10Mi memory limit and got the same errors.
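
If in doubt, it can help to read back the resources the API server actually stored before raising the limit; the deployment and pod names below are placeholders:

# Show the parsed requests/limits on the pod template
kubectl get deployment my-deployment \
  -o jsonpath='{.spec.template.spec.containers[0].resources}'
# Events on the failing pod usually point at the sandbox failure directly
kubectl get events --field-selector involvedObject.name=<pod-name>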
