Latest docker version breaks kubelet in 1.5.0-alpha.3 #13281

Closed
0xmichalis opened this issue Mar 7, 2017 · 43 comments · Fixed by #14158
Labels: component/kubernetes, kind/bug, priority/P2

@0xmichalis
Contributor

20s       20s       1         router-1-deploy           Pod                 Warning   FailedSync         {kubelet 10.34.129.45}   Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "Failed to check docker api version: docker: failed to parse docker version \"17.03.0-ce\": illegal zero-prefixed version component \"03\" in \"17.03.0-ce\""

@csrwng @jimmidyson
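
For readers wondering why the parse fails: strict Semantic Versioning forbids leading zeroes in numeric components, and the calendar-style Docker version 17.03.0-ce has "03" as its minor component. The following is a minimal Python sketch of that rule only. It is not the kubelet's actual Go parser, and the names check_strict_version and is_strict are hypothetical.

```python
import re


def check_strict_version(version: str) -> None:
    """Raise ValueError if the MAJOR.MINOR.PATCH core has a zero-prefixed component."""
    # Drop any pre-release or build suffix such as "-ce" or "+abc".
    core = re.split(r"[-+]", version, maxsplit=1)[0]
    for component in core.split("."):
        if not component.isdigit():
            raise ValueError(
                f"non-numeric version component {component!r} in {version!r}"
            )
        # "0" alone is fine; "03" is not allowed by strict semver.
        if len(component) > 1 and component.startswith("0"):
            raise ValueError(
                f"illegal zero-prefixed version component {component!r} in {version!r}"
            )


def is_strict(version: str) -> bool:
    """Return True when the version passes the strict check above."""
    try:
        check_strict_version(version)
        return True
    except ValueError:
        return False
```

Under this sketch, a 1.12.x or 1.13.x version string passes, while 17.03.0-ce is rejected because of the "03" component.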

@0xmichalis
Contributor Author

Running on Fedora 25

@csrwng
Contributor

csrwng commented Mar 7, 2017

Follow up to #13219

@csrwng
Contributor

csrwng commented Mar 7, 2017

I tried with the latest origin master and I get the same error (on fedora 25). However, I don't see the error on Docker for Mac.

@justinclayton

justinclayton commented Mar 7, 2017

I am seeing the issue with Docker for Mac 17.03.0-ce, commit 60ccb22.

oc version oc v1.4.1+3f9807a (latest from Homebrew)

@csrwng
Contributor

csrwng commented Mar 7, 2017

@justinclayton are you not able to run pods at all?

@justinclayton

justinclayton commented Mar 7, 2017

Sorry, I'm in the wrong issue. I'm trying to run oc cluster up. Looks like #13284 is what I'm looking for.

@csrwng
Contributor

csrwng commented Mar 7, 2017

@justinclayton you'll need an oc binary from the origin master

@zak-hassan

On the latest Docker for Mac I'm experiencing issues running oc cluster up --metrics. The pods do not come up.
Any suggestions on what version of openshift cli I should use for this to work?

@csrwng
Contributor

csrwng commented Mar 8, 2017

@zmhassan what are your versions of docker and oc, and do you see any events in the openshift-infra namespace about the metrics pods?

@zak-hassan

zak-hassan commented Mar 8, 2017

@csrwng

zhassan:~ zhassan$ oc cluster up --metrics
-- Checking OpenShift client ... OK
-- Checking Docker client ... OK
-- Checking Docker version ... FAIL
   Error: Minor number must not contain leading zeroes "03"
zhassan:~ zhassan$ oc version
oc v1.5.0-alpha.3+cf7e336
kubernetes v1.5.2+43a9be4
features: Basic-Auth
Unable to connect to the server: EOF
zhassan:~ zhassan$ docker version
Client:
 Version:      17.03.0-ce
 API version:  1.26
 Go version:   go1.7.5
 Git commit:   60ccb22
 Built:        Thu Feb 23 10:40:59 2017
 OS/Arch:      darwin/amd64

Server:
 Version:      17.03.0-ce
 API version:  1.26 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   3a232c8
 Built:        Tue Feb 28 07:52:04 2017
 OS/Arch:      linux/amd64
 Experimental: true

@csrwng
Contributor

csrwng commented Mar 8, 2017

@zmhassan you need the latest origin master. If you somehow have access to a cluster, you can build your own binary using https://github.com/csrwng/build-origin

@zak-hassan

Hi @csrwng, I can build from source since I already have all the OpenShift source. Your suggestion would only produce a binary that I could just as easily download from GitHub or build myself.

@zak-hassan

So you're saying that if I build the latest, it should work?

@csrwng
Contributor

csrwng commented Mar 8, 2017

yes

@zak-hassan

zak-hassan commented Mar 8, 2017

Nope doesn't work.

zhassan:kubernetes zhassan$ oc get pods
NAME                            READY     STATUS    RESTARTS   AGE
docker-registry-1-deploy        1/1       Running   0          54s
docker-registry-1-dpcwn         0/1       Running   0          28s
persistent-volume-setup-vh3pf   1/1       Running   0          54s
router-1-deploy                 1/1       Running   0          54s
router-1-ndsh6                  0/1       Running   0          29s
zhassan:kubernetes zhassan$ oc project openshift-infra
Now using project "openshift-infra" on server "https://127.0.0.1:8443".
zhassan:kubernetes zhassan$ oc get pods
NAME                         READY     STATUS    RESTARTS   AGE
metrics-deployer-pod-5ctbd   1/1       Running   0          15s
metrics-deployer-pod-lctjk   0/1       Error     0          1m
zhassan:kubernetes zhassan$ oc describe pod metrics-deployer-pod-lctjk
Name:			metrics-deployer-pod-lctjk
Namespace:		openshift-infra
Security Policy:	restricted
Node:			192.168.65.2/192.168.65.2
Start Time:		Wed, 08 Mar 2017 14:31:06 -0500
Labels:			controller-uid=c4d30a9d-0435-11e7-b875-c68515d70515
			job-name=metrics-deployer-pod
Status:			Failed
IP:			172.17.0.5
Controllers:		Job/metrics-deployer-pod
Containers:
  deployer:
    Container ID:	docker://046254a8829ecd5580eaa8430507bd1ecec285b2d5a9d62377e856907962720b
    Image:		openshift/origin-metrics-deployer:v1.5.0-alpha.3
    Image ID:		docker-pullable://openshift/origin-metrics-deployer@sha256:e3ae4eab3002fa532a30c3dcdfc320dd0a588e4adbc59896139cc1bff4c4f06e
    Port:
    State:		Terminated
      Reason:		Error
      Exit Code:	255
      Started:		Wed, 08 Mar 2017 14:31:47 -0500
      Finished:		Wed, 08 Mar 2017 14:31:57 -0500
    Ready:		False
    Restart Count:	0
    Volume Mounts:
      /etc/deploy from empty (rw)
      /secret from secret (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from metrics-deployer-token-cpn75 (ro)
    Environment Variables:
      PROJECT:				openshift-infra (v1:metadata.namespace)
      POD_NAME:				metrics-deployer-pod-lctjk (v1:metadata.name)
      IMAGE_PREFIX:			openshift/origin-
      IMAGE_VERSION:			v1.5.0-alpha.3
      MASTER_URL:			https://kubernetes.default.svc:443
      HAWKULAR_METRICS_HOSTNAME:	metrics-openshift-infra.127.0.0.1.nip.io
      MODE:				deploy
      REDEPLOY:				false
      IGNORE_PREFLIGHT:			false
      USE_PERSISTENT_STORAGE:		true
      CASSANDRA_NODES:			1
      CASSANDRA_PV_SIZE:		10Gi
      METRIC_DURATION:			7
      HEAPSTER_NODE_ID:			nodename
      METRIC_RESOLUTION:		10s
Conditions:
  Type		Status
  Initialized 	True
  Ready 	False
  PodScheduled 	True
Volumes:
  empty:
    Type:	EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  secret:
    Type:	Secret (a volume populated by a Secret)
    SecretName:	metrics-deployer
  metrics-deployer-token-cpn75:
    Type:	Secret (a volume populated by a Secret)
    SecretName:	metrics-deployer-token-cpn75
QoS Class:	BestEffort
Tolerations:	<none>
Events:
  FirstSeen	LastSeen	Count	From			SubObjectPath	Type		Reason		Message
  ---------	--------	-----	----			-------------	--------	------		-------
  1m		1m		1	{default-scheduler }			Normal		Scheduled	Successfully assigned metrics-deployer-pod-lctjk to 192.168.65.2
  1m		1m		1	{kubelet 192.168.65.2}			Warning		FailedSync	Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "Failed to check docker api version: docker: failed to parse docker version \"17.03.0-ce\": illegal zero-prefixed version component \"03\" in \"17.03.0-ce\""

  1m	1m	1	{kubelet 192.168.65.2}	spec.containers{deployer}	Normal	Pulling		pulling image "openshift/origin-metrics-deployer:v1.5.0-alpha.3"
  49s	49s	1	{kubelet 192.168.65.2}	spec.containers{deployer}	Normal	Pulled		Successfully pulled image "openshift/origin-metrics-deployer:v1.5.0-alpha.3"
  48s	48s	1	{kubelet 192.168.65.2}	spec.containers{deployer}	Normal	Created		Created container with docker id 046254a8829e; Security:[seccomp=unconfined]
  47s	47s	1	{kubelet 192.168.65.2}	spec.containers{deployer}	Normal	Started		Started container with docker id 046254a8829e
  47s	47s	1	{kubelet 192.168.65.2}					Warning	FailedSync	Error syncing pod, skipping: failed to "StartContainer" for "deployer" with RunContainerError: "Failed to check docker api version: docker: failed to parse docker version \"17.03.0-ce\": illegal zero-prefixed version component \"03\" in \"17.03.0-ce\""

  35s	35s	1	{kubelet 192.168.65.2}	spec.containers{deployer}	Normal	Killing	Killing container with docker id 046254a8829e: Need to kill pod.

@csrwng
Contributor

csrwng commented Mar 8, 2017

@zmhassan what version of the images are you using? (Are you specifying --version=blah with oc cluster up?)

@csrwng
Contributor

csrwng commented Mar 8, 2017

@zmhassan so I tried locally and I did see the errors but eventually pods ran and metrics came up.

@csrwng
Contributor

csrwng commented Mar 13, 2017

Kube issue: kubernetes/kubernetes#42492

@csrwng
Contributor

csrwng commented Mar 13, 2017

It looks like something changed in the kubelet recently that broke things. Running cluster up with --version=v1.5.0-alpha.3 logs the message about the version parsing failing, but the pods eventually do start. With --version=v1.5.0-rc.0, the pods don't start anymore.

/cc @derekwaynecarr

@mbechauf

When I try oc cluster up --version=v1.5.0-alpha.3 and create a project, the build fails immediately with

error: cannot connect to the server: open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory

I then tried the v1.5.0-alpha.3 client tools. With them, oc cluster up --version=v1.5.0-alpha.3 fails immediately with

-- Checking OpenShift client ... OK
-- Checking Docker client ... OK
-- Checking Docker version ... FAIL
Error: Minor number must not contain leading zeroes "03"

Looks like I'm stuck. I was trying to use origin for an internal class in a week. Is there a way to install an older docker engine on CentOS?

@zak-hassan

I'm still having issues and I'm not seeing metrics show up at all.

@csrwng
Contributor

csrwng commented Mar 14, 2017

@mbechauf I believe the default centos yum repo has docker 1.12.x

@zmhassan are you getting any pods to start at all? Are you using --version=v1.5.0-alpha.3 to start your cluster?

@zak-hassan

I managed to get it all working. Thank you.

@shveik

shveik commented Mar 16, 2017

got a little further, but the issue is now during build:
Failed sync Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "runContainer: docker: failed to parse docker version "17.03.0-ce": illegal zero-prefixed version component "03" in "17.03.0-ce""

@csrwng
Contributor

csrwng commented Mar 16, 2017

@shveik is your registry up and running?

@shveik

shveik commented Mar 16, 2017

great question, please bear with me since it's my first day playing with this, and i am following the instructions here: https://github.com/openshift/origin/blob/master/docs/cluster_up_down.md#macos-with-docker-for-mac

so, looks like the registry is running:
➜ ~ oc get svc docker-registry -n default
NAME              CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
docker-registry   172.30.1.1                 5000/TCP   16m

however, the login is not working, since OPENSHIFT_TOKEN is null:
➜ ~ OPENSHIFT_TOKEN=$(oc whoami -t)
error: no token is currently in use for this session

i do have an insecure registry configured for 172.30.0.0/16

@csrwng
Contributor

csrwng commented Mar 16, 2017

@shveik did you specify --version=v1.5.0-alpha.3 with 'oc cluster up'? If not, try that first

@csrwng
Contributor

csrwng commented Mar 16, 2017

If you are logged in as system:admin, you will not have a token, since you are logged in via certificate. If you need a token, you need to be a regular user.

@mbechauf

mbechauf commented Mar 16, 2017 via email

@shveik

shveik commented Mar 16, 2017

@csrwng that was it, thank you. When running with --version, the client and server versions differ. Here is the output:
➜ ~ oc version
oc v1.5.0-rc.0+49a4a7a
kubernetes v1.5.2+43a9be4
features: Basic-Auth

Server https://127.0.0.1:8443
openshift v1.5.0-alpha.3+cf7e336
kubernetes v1.5.2+43a9be4

and here it is without --version:
➜ ~ oc version
oc v1.5.0-rc.0+49a4a7a
kubernetes v1.5.2+43a9be4
features: Basic-Auth

Server https://127.0.0.1:8443
openshift v1.5.0-rc.0+49a4a7a
kubernetes v1.5.2+43a9be4

@cars10

cars10 commented Mar 21, 2017

How can this be solved when using the server binary instead of oc cluster up?

I use ./openshift start and experience the same issue. Is there any way to specify the version there?

@stevekuznetsov
Contributor

@mbechauf on CentOS for installing Docker, use yum install docker -- you will have 1.12 available to you in normal streams and 1.13 in @epel-testing

@soul2zimate

I see a similar issue with current master:
openshift v3.6.0-alpha.0+611176d-121
kubernetes v1.5.2+43a9be4
etcd 3.1.0

The build stays pending when I try to create a new ruby-ex app. I see the following message in the Build -> Last Build -> Events tab:

Time Severity Reason Message
2:38:40 PM Warning Failed sync Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "runContainer: docker: failed to parse docker version "17.03.0-ce": illegal zero-prefixed version component "03" in "17.03.0-ce""
18 times in the last 3 minutes

@csrwng
Contributor

csrwng commented Mar 31, 2017

The kubelet is still not working with the new Docker version string. Easiest thing to do is to downgrade Docker.
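
To tell whether a given host is affected before deciding to downgrade, one option is to look at the server version reported by docker version. The sketch below is a rough illustration in Python: it parses text shaped like the docker version output pasted earlier in this thread (SAMPLE is abbreviated from that output) and flags zero-prefixed components. The helper names server_version and has_zero_prefixed_component are hypothetical, and the parsing assumes the plain-text layout shown in this thread.

```python
import re
from typing import Optional

# Abbreviated from the `docker version` output posted earlier in this thread.
SAMPLE = """\
Client:
 Version:      17.03.0-ce
 API version:  1.26

Server:
 Version:      17.03.0-ce
 API version:  1.26 (minimum version 1.12)
"""


def server_version(output: str) -> Optional[str]:
    """Return the first 'Version:' value found after the 'Server:' heading."""
    in_server = False
    for line in output.splitlines():
        stripped = line.strip()
        if stripped == "Server:":
            in_server = True
        elif in_server and stripped.startswith("Version:"):
            return stripped.split(":", 1)[1].strip()
    return None


def has_zero_prefixed_component(version: str) -> bool:
    """True if any dotted component before the suffix starts with a zero, like '03'."""
    core = re.split(r"[-+]", version, maxsplit=1)[0]
    return any(len(part) > 1 and part.startswith("0") for part in core.split("."))


version = server_version(SAMPLE)
affected = version is not None and has_zero_prefixed_component(version)
```

With the sample above, version is "17.03.0-ce" and affected is True, which matches the failures reported in this issue.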

@mbn18

mbn18 commented Apr 14, 2017

Got stuck on the first tutorial.

OS: Fedora25

Version:

oc v3.6.0-alpha.1+7044e57-29
kubernetes v1.5.2+43a9be4
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://192.168.1.100:8443
openshift v3.6.0-alpha.1+7044e57-29
kubernetes v1.5.2+43a9be4

The error:

Error syncing pod, skipping: failed to "StartContainer" for "POD" with RunContainerError: "runContainer: docker: failed to parse docker version "17.04.0-ce": illegal zero-prefixed version component "04" in "17.04.0-ce""

Tutorial worked on:

oc v1.4.1+3f9807a
kubernetes v1.4.0+776c994
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://192.168.1.100:8443
openshift v1.4.1+3f9807a
kubernetes v1.4.0+776c994

@marioluan

Which Docker version should I use to make it work until we have the [final] working solution?

@stevekuznetsov
Contributor

OpenShift Origin 1.5 should use Docker 1.12.

@simbo1905

simbo1905 commented Apr 20, 2017

On macOS with the latest Docker for Mac, version 17.03.1-ce-mac5 (16048), I was able to run oc cluster up with the oc from openshift-origin-client-tools-v1.5.0, but I was unable to deploy an image, seeing:

"runContainer: docker: failed to parse docker version" 17.03 illegal zero-prefixed version component "03" in "17.03.1-ce""

Changing the oc cluster up command to name the alpha version mentioned earlier in this thread solved it:

./oc cluster up --version=v1.5.0-alpha.3   \
                --use-existing-config   \
                --host-data-dir=/oc_data \
                --metrics=true

The deployment just automatically ran and succeeded on the older version.

@csrwng
Contributor

csrwng commented Apr 20, 2017

An alternative for now is also to downgrade to Docker 1.13.1 for Mac:
https://download.docker.com/mac/stable/1.13.1.15353/Docker.dmg

@soltysh
Member

soltysh commented Apr 27, 2017

We will probably need to cherry-pick kubernetes/kubernetes#44068 to solve the problem.
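
For illustration only, one conceivable way for a parser to tolerate version strings like 17.03.0-ce is to normalize zero-prefixed components before handing the string to a strict parser. The Python sketch below shows that idea. It is an assumption made for the sake of example and is not a description of what the upstream PR does; normalize_version is a hypothetical name.

```python
import re


def normalize_version(version: str) -> str:
    """Strip leading zeroes from MAJOR.MINOR.PATCH, keeping any suffix unchanged."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)(.*)$", version)
    if match is None:
        raise ValueError(f"unrecognized version string {version!r}")
    major, minor, patch, suffix = match.groups()
    # int() drops leading zeroes: "03" becomes 3.
    return f"{int(major)}.{int(minor)}.{int(patch)}{suffix}"
```

Under this sketch, "17.03.0-ce" becomes "17.3.0-ce", while a string such as "1.12.6" is returned unchanged.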

@soltysh
Member

soltysh commented Apr 27, 2017

I'll pick it up once the upstream PR merges and the rebase lands

@soltysh
Member

soltysh commented May 12, 2017

I've cherry-picked the upstream fix in the attached PR.
