oc cluster up is failing to deploy the router #19109

Closed
jmontleon opened this issue Mar 27, 2018 · 6 comments · Fixed by #19113
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/master

Comments

@jmontleon
Contributor

oc cluster up is failing to deploy the router

Version
# which oc
/root/bin/oc
# /root/bin/oc version
oc v3.10.0-alpha.0+d961d32-378
kubernetes v1.9.1+a0ce1bc657
features: Basic-Auth GSSAPI Kerberos SPNEGO
Unable to connect to the server: x509: certificate signed by unknown authority
Steps To Reproduce
/root/bin/oc cluster up --routing-suffix=172.17.0.1.nip.io --public-hostname=172.17.0.1 --tag=latest --service-catalog=true
Current Result
Using public hostname IP 172.17.0.1 as the host IP
Starting OpenShift using openshift/origin:latest ...
I0327 09:34:57.897618   14714 config.go:38] Running "create-master-config"
I0327 09:35:00.748740   14714 config.go:45] Running "create-node-config"
I0327 09:35:02.393813   14714 flags.go:31] Running "create-kubelet-flags"
I0327 09:35:03.231833   14714 run_kubelet.go:48] Running "start-kubelet"
I0327 09:35:03.536893   14714 run_self_hosted.go:157] Waiting for the kube-apiserver to be ready.
I0327 09:35:32.546495   14714 apply_template.go:77] Installing "kube-proxy"
I0327 09:35:32.546537   14714 apply_template.go:77] Installing "kube-dns"
I0327 09:35:32.546495   14714 apply_template.go:77] Installing "openshift-apiserver"
I0327 09:35:35.012445   14714 interface.go:41] Finished installing "kube-proxy" "kube-dns" "openshift-apiserver"
I0327 09:36:28.051612   14714 run_self_hosted.go:186] openshift-apiserver available
I0327 09:36:28.051884   14714 apply_template.go:77] Installing "openshift-controller-manager"
I0327 09:36:30.218867   14714 interface.go:41] Finished installing "openshift-controller-manager"
I0327 09:36:30.356474   14714 apply_list.go:48] Installing "openshift/cakephp quickstart"
I0327 09:36:30.356518   14714 apply_list.go:48] Installing "openshift/mariadb"
I0327 09:36:30.356599   14714 apply_list.go:48] Installing "openshift/nodejs quickstart"
I0327 09:36:30.356648   14714 apply_list.go:48] Installing "openshift/sample pipeline"
I0327 09:36:30.356688   14714 apply_list.go:48] Installing "openshift/mongodb"
I0327 09:36:30.356816   14714 apply_list.go:48] Installing "openshift/centos7"
I0327 09:36:30.356849   14714 apply_list.go:48] Installing "openshift/rails quickstart"
I0327 09:36:30.356854   14714 apply_list.go:48] Installing "openshift/jenkins pipeline persistent"
I0327 09:36:30.356778   14714 apply_list.go:48] Installing "kube-system/heapster standalone"
I0327 09:36:30.357566   14714 apply_list.go:48] Installing "openshift/django quickstart"
I0327 09:36:30.358705   14714 apply_list.go:48] Installing "openshift/mysql"
I0327 09:36:30.358752   14714 apply_list.go:48] Installing "openshift-infra/template service broker registration"
I0327 09:36:30.358776   14714 apply_list.go:48] Installing "openshift-infra/template service broker rbac"
I0327 09:36:30.358823   14714 apply_list.go:48] Installing "kube-system/prometheus"
I0327 09:36:30.358847   14714 apply_list.go:48] Installing "openshift/postgresql"
I0327 09:36:30.356825   14714 apply_list.go:48] Installing "openshift/dancer quickstart"
I0327 09:36:30.358759   14714 apply_list.go:48] Installing "openshift-infra/service catalog"
I0327 09:36:30.359030   14714 apply_list.go:48] Installing "openshift-infra/template service broker apiserver"
I0327 09:36:30.358830   14714 apply_list.go:48] Installing "openshift-infra/web console server template"
I0327 09:36:30.373837   14714 registry_install.go:56] Running "openshift-image-registry"
scc "privileged" added to: ["system:serviceaccount:default:registry"]
I0327 09:36:37.019699   14714 interface.go:41] Finished installing "openshift/centos7" "openshift/cakephp quickstart" "openshift/dancer quickstart" "openshift/django quickstart" "openshift/nodejs quickstart" "openshift/rails quickstart" "openshift/sample pipeline" "openshift/mongodb" "openshift/mariadb" "openshift/jenkins pipeline persistent" "openshift/mysql" "openshift/postgresql" "kube-system/prometheus" "kube-system/heapster standalone" "openshift-infra/service catalog" "openshift-infra/template service broker rbac" "openshift-infra/template service broker registration" "openshift-infra/web console server template" "openshift-infra/template service broker apiserver" "openshift-image-registry"
I0327 09:36:37.026471   14714 admin.go:51] Running "install-router"
Error: FAIL
   Error: could not run "install-router": Docker run error rc=255
   Caused By:
     Error: Docker run error rc=255
     Details:
       Image: openshift/origin:latest
       Entrypoint: [oc]
       Command: [adm router --host-ports=true --loglevel=8 --config=/var/lib/origin/openshift.local.config/master/admin.kubeconfig --host-network=true --images=openshift/origin-${component}:latest --default-cert=/var/lib/origin/openshift.local.config/master/router.pem]

Afterwards:

# oc cluster down && /root/bin/oc cluster up --routing-suffix=172.17.0.1.nip.io --public-hostname=172.17.0.1 --tag=latest --service-catalog=true
Using public hostname IP 172.17.0.1 as the host IP
Starting OpenShift using openshift/origin:latest ...
I0327 09:37:25.657561   22014 flags.go:31] Running "create-kubelet-flags"
I0327 09:37:26.495616   22014 run_kubelet.go:48] Running "start-kubelet"
I0327 09:37:26.805449   22014 run_self_hosted.go:157] Waiting for the kube-apiserver to be ready.
I0327 09:37:46.812898   22014 apply_template.go:77] Installing "kube-proxy"
I0327 09:37:46.812907   22014 apply_template.go:77] Installing "kube-dns"
I0327 09:37:46.812893   22014 apply_template.go:77] Installing "openshift-apiserver"
I0327 09:37:49.577354   22014 interface.go:41] Finished installing "kube-proxy" "kube-dns" "openshift-apiserver"
I0327 09:38:31.663479   22014 run_self_hosted.go:186] openshift-apiserver available
I0327 09:38:31.663721   22014 apply_template.go:77] Installing "openshift-controller-manager"
I0327 09:38:33.755320   22014 interface.go:41] Finished installing "openshift-controller-manager"
I0327 09:38:33.847585   22014 apply_list.go:48] Installing "openshift/dancer quickstart"
I0327 09:38:33.847630   22014 apply_list.go:48] Installing "openshift/mariadb"
I0327 09:38:33.847632   22014 apply_list.go:48] Installing "openshift/mongodb"
I0327 09:38:33.847676   22014 apply_list.go:48] Installing "openshift-infra/service catalog"
I0327 09:38:33.847595   22014 apply_list.go:48] Installing "openshift/centos7"
I0327 09:38:33.847791   22014 apply_list.go:48] Installing "openshift/rails quickstart"
I0327 09:38:33.847869   22014 apply_list.go:48] Installing "openshift/postgresql"
I0327 09:38:33.847895   22014 apply_list.go:48] Installing "openshift/mysql"
I0327 09:38:33.848843   22014 apply_list.go:48] Installing "openshift/cakephp quickstart"
I0327 09:38:33.849475   22014 apply_list.go:48] Installing "openshift/django quickstart"
I0327 09:38:33.849692   22014 apply_list.go:48] Installing "kube-system/prometheus"
I0327 09:38:33.849744   22014 apply_list.go:48] Installing "openshift/nodejs quickstart"
I0327 09:38:33.849769   22014 apply_list.go:48] Installing "openshift/sample pipeline"
I0327 09:38:33.849895   22014 apply_list.go:48] Installing "kube-system/heapster standalone"
I0327 09:38:33.849752   22014 apply_list.go:48] Installing "openshift/jenkins pipeline persistent"
I0327 09:38:33.850017   22014 apply_list.go:48] Installing "openshift-infra/template service broker rbac"
I0327 09:38:33.850062   22014 apply_list.go:48] Installing "openshift-infra/web console server template"
I0327 09:38:33.850189   22014 apply_list.go:48] Installing "openshift-infra/template service broker registration"
I0327 09:38:33.850287   22014 apply_list.go:48] Installing "openshift-infra/template service broker apiserver"
I0327 09:38:39.618316   22014 interface.go:41] Finished installing "openshift/centos7" "openshift/dancer quickstart" "openshift/rails quickstart" "openshift/jenkins pipeline persistent" "openshift/sample pipeline" "openshift/mariadb" "openshift/postgresql" "openshift/cakephp quickstart" "openshift/nodejs quickstart" "openshift/mongodb" "openshift/mysql" "openshift/django quickstart" "kube-system/prometheus" "kube-system/heapster standalone" "openshift-infra/service catalog" "openshift-infra/template service broker rbac" "openshift-infra/template service broker registration" "openshift-infra/web console server template" "openshift-infra/template service broker apiserver" "openshift-image-registry"
I0327 09:38:39.626226   22014 admin.go:51] Running "install-router"
Error: FAIL
   Error: cannot create router service account
   Details:
     Last 10 lines of "origin" container log:
     I0327 13:37:43.356455   22369 reconciler.go:209] operationExecutor.VerifyControllerAttachedVolume started for volume "openshift-controller-manager-token-vvbqc" (UniqueName: "kubernetes.io/secret/dbc2394a-31c3-11e8-915d-525400029068-openshift-controller-manager-token-vvbqc") pod "openshift-controller-manager-n6s6m" (UID: "dbc2394a-31c3-11e8-915d-525400029068") 
     I0327 13:37:43.356580   22369 reconciler.go:209] operationExecutor.VerifyControllerAttachedVolume started for volume "master-cloud-provider" (UniqueName: "kubernetes.io/host-path/bbd9f2fc-31c3-11e8-915d-525400029068-master-cloud-provider") pod "openshift-apiserver-qv95l" (UID: "bbd9f2fc-31c3-11e8-915d-525400029068") 
     I0327 13:37:43.356669   22369 reconciler.go:209] operationExecutor.VerifyControllerAttachedVolume started for volume "openshift-apiserver-token-bv7vg" (UniqueName: "kubernetes.io/secret/bbd9f2fc-31c3-11e8-915d-525400029068-openshift-apiserver-token-bv7vg") pod "openshift-apiserver-qv95l" (UID: "bbd9f2fc-31c3-11e8-915d-525400029068") 
     I0327 13:37:43.356834   22369 reconciler.go:209] operationExecutor.VerifyControllerAttachedVolume started for volume "node-config" (UniqueName: "kubernetes.io/host-path/bbd390cc-31c3-11e8-915d-525400029068-node-config") pod "kube-dns-xg62d" (UID: "bbd390cc-31c3-11e8-915d-525400029068") 
     I0327 13:37:43.357013   22369 reconciler.go:209] operationExecutor.VerifyControllerAttachedVolume started for volume "master-cloud-provider" (UniqueName: "kubernetes.io/host-path/dbc2394a-31c3-11e8-915d-525400029068-master-cloud-provider") pod "openshift-controller-manager-n6s6m" (UID: "dbc2394a-31c3-11e8-915d-525400029068") 
     I0327 13:37:43.457787   22369 reconciler.go:154] Reconciler: start to sync state
     E0327 13:38:03.663410   22369 event.go:200] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"openshift-controller-manager-n6s6m.151fca8317735f6c", GenerateName:"", Namespace:"openshift-controller-manager", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, InvolvedObject:v1.ObjectReference{Kind:"Pod", Namespace:"openshift-controller-manager", Name:"openshift-controller-manager-n6s6m", UID:"dbc2394a-31c3-11e8-915d-525400029068", APIVersion:"v1", ResourceVersion:"1243", FieldPath:""}, Reason:"SuccessfulMountVolume", Message:"MountVolume.SetUp succeeded for volume \"master-config\" ", Source:v1.EventSource{Component:"kubelet", Host:"localhost"}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xbea6b029db44996c, ext:16648615981, loc:(*time.Location)(0xb0dca00)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbea6b029db44996c, ext:16648615981, loc:(*time.Location)(0xb0dca00)}}, Count:1, Type:"Normal", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "openshift-controller-manager-n6s6m.151fca8317735f6c" is forbidden: caches not synchronized' (will not retry!)
     E0327 13:38:13.720184   22369 event.go:200] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"openshift-controller-manager-n6s6m.151fca83177d73fb", GenerateName:"", Namespace:"openshift-controller-manager", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Initializers:(*v1.Initializers)(nil), Finalizers:[]string(nil), ClusterName:""}, InvolvedObject:v1.ObjectReference{Kind:"Pod", Namespace:"openshift-controller-manager", Name:"openshift-controller-manager-n6s6m", UID:"dbc2394a-31c3-11e8-915d-525400029068", APIVersion:"v1", ResourceVersion:"1243", FieldPath:""}, Reason:"SuccessfulMountVolume", Message:"MountVolume.SetUp succeeded for volume \"master-cloud-provider\" ", Source:v1.EventSource{Component:"kubelet", Host:"localhost"}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xbea6b029db4eadfb, ext:16649276591, loc:(*time.Location)(0xb0dca00)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xbea6b029db4eadfb, ext:16649276591, loc:(*time.Location)(0xb0dca00)}}, Count:1, Type:"Normal", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events "openshift-controller-manager-n6s6m.151fca83177d73fb" is forbidden: caches not synchronized' (will not retry!)
     I0327 13:38:14.456002   22369 kuberuntime_manager.go:514] Container {Name:controllers Image:openshift/origin:latest Command:[hyperkube kube-controller-manager] Args:[--enable-dynamic-provisioning=true --use-service-account-credentials=true --leader-elect-retry-period=3s --leader-elect-resource-lock=configmaps --controllers=* --controllers=-ttl --controllers=-bootstrapsigner --controllers=-tokencleaner --controllers=-horizontalpodautoscaling --pod-eviction-timeout=5m --cluster-signing-key-file= --cluster-signing-cert-file= --experimental-cluster-signing-duration=720h --root-ca-file=/etc/origin/master/ca-bundle.crt --port=10252 --service-account-private-key-file=/etc/origin/master/serviceaccounts.private.key --kubeconfig=/etc/origin/master/openshift-master.kubeconfig --openshift-config=/etc/origin/master/master-config.yaml] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[] Requests:map[]} VolumeMounts:[{Name:master-config ReadOnly:false MountPath:/etc/origin/master/ SubPath: MountPropagation:<nil>} {Name:master-cloud-provider ReadOnly:false MountPath:/etc/origin/cloudprovider/ SubPath: MountPropagation:<nil>}] VolumeDevices:[] LivenessProbe:&Probe{Handler:Handler{Exec:nil,HTTPGet:&HTTPGetAction{Path:healthz,Port:10252,Host:,Scheme:HTTP,HTTPHeaders:[],},TCPSocket:nil,},InitialDelaySeconds:0,TimeoutSeconds:1,PeriodSeconds:10,SuccessThreshold:1,FailureThreshold:3,} ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:Always SecurityContext:&SecurityContext{Capabilities:nil,Privileged:*true,SELinuxOptions:nil,RunAsUser:nil,RunAsNonRoot:nil,ReadOnlyRootFilesystem:nil,AllowPrivilegeEscalation:nil,} Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.
     I0327 13:38:14.456258   22369 kuberuntime_manager.go:758] checking backoff for container "controllers" in pod "kube-controller-manager-localhost_kube-system(561479871d2f5db75ff2d6db13c0688d)"


   Caused By:
     Error: serviceaccounts "router" already exists

If I unmount the cluster volumes and remove openshift.local.clusterup before retrying, I get back to the original error:

/root/bin/oc cluster down && for i in $(df -h | grep cluster | awk '{ print $6 }'); do umount $i; done && rm -rf /root/openshift.local.clusterup && /root/bin/oc cluster up --routing-suffix=172.17.0.1.nip.io --public-hostname=172.17.0.1 --tag=latest --service-catalog=true
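The cleanup one-liner above can be hard to read and leaves `$i` unquoted. Below is a minimal sketch of the same steps as shell functions. The names `cluster_mounts` and `reset_cluster_up` and the `OC` variable are hypothetical, and the match on `openshift.local.clusterup` is narrower than the original `grep cluster`, so treat it as an assumption about which mounts matter.

```shell
#!/bin/sh
# Hypothetical helper: read `df -P` output on stdin and print the mount points
# whose path mentions openshift.local.clusterup (column 6 is "Mounted on").
cluster_mounts() {
  awk 'NR > 1 && $6 ~ /openshift\.local\.clusterup/ { print $6 }'
}

# Hypothetical helper: stop the cluster, unmount leftovers, remove the base dir.
# The default path is the one used in this report; pass another as $1.
reset_cluster_up() {
  base_dir=${1:-/root/openshift.local.clusterup}
  "${OC:-oc}" cluster down || return 1
  df -P | cluster_mounts | while IFS= read -r mnt; do
    umount "$mnt"
  done
  rm -rf -- "$base_dir"
}

# Usage (destructive, so not run here):
#   OC=/root/bin/oc reset_cluster_up /root/openshift.local.clusterup
```

Keeping the mount filter separate from the destructive steps makes it possible to check what would be unmounted with `df -P | cluster_mounts` before running anything.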

Expected Result

A working cluster.

Additional Information
docker images | grep latest | grep openshift/origin
docker.io/openshift/origin-web-console                             latest              114db6a065b9        8 hours ago         485 MB
docker.io/openshift/origin-docker-registry                         latest              caa992b3b104        8 hours ago         455 MB
docker.io/openshift/origin-haproxy-router                          latest              5c7d7055925b        9 hours ago         1.6 GB
docker.io/openshift/origin-docker-builder                          latest              1b3bc4138448        9 hours ago         1.58 GB
docker.io/openshift/origin-sti-builder                             latest              ccb58bf2c3b1        9 hours ago         1.58 GB
docker.io/openshift/origin-recycler                                latest              032f4c514cb3        9 hours ago         1.58 GB
docker.io/openshift/origin-deployer                                latest              e3d5268c9c7f        9 hours ago         1.58 GB
docker.io/openshift/origin                                         latest              1839c8bc70e2        9 hours ago         1.58 GB
docker.io/openshift/origin-template-service-broker                 latest              414668fc6531        9 hours ago         302 MB
docker.io/openshift/origin-service-catalog                         latest              86cc48a66124        9 hours ago         290 MB
docker.io/openshift/origin-pod                                     latest              aebf4798f2e4        9 hours ago         217 MB
docker.io/openshift/origin-release                                 latest              6e1501e40076        20 months ago       849 MB

# docker run -it --entrypoint /bin/oc 1839c8bc70e2 version
oc v3.10.0-alpha.0+d961d32-378
kubernetes v1.9.1+a0ce1bc657
features: Basic-Auth GSSAPI Kerberos SPNEGO

I also tried without --tag, which I believe uses the v3.10 tag. Same error:

I0327 09:49:47.584346   13520 admin.go:51] Running "install-router"
Error: FAIL
   Error: could not run "install-router": Docker run error rc=255
   Caused By:
     Error: Docker run error rc=255
     Details:
       Image: openshift/origin:v3.10
       Entrypoint: [oc]
       Command: [adm router --host-ports=true --loglevel=8 --config=/var/lib/origin/openshift.local.config/master/admin.kubeconfig --host-network=true --images=openshift/origin-${component}:v3.10 --default-cert=/var/lib/origin/openshift.local.config/master/router.pem]
@jwforres
Member

@openshift/sig-master

specifically @deads2k since he has been changing oc cluster up

@jwforres jwforres added the kind/bug Categorizes issue or PR as related to a bug. label Mar 27, 2018
@deads2k
Contributor

deads2k commented Mar 27, 2018

We just updated oc cluster up to put installation logs in a predictable spot: CWD/openshift.local.clusterup/logs. Can you create a gist with the stdout and stderr of the router installation container (it should be obvious which files those are) for us to look at?

cc @mfojtik
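For anyone collecting the same logs, here is a rough sketch of dumping the router installation output from that directory. The function name `show_router_logs` is hypothetical, and the `install-router-*` pattern is an assumption based on the file names quoted in the reply further down this thread.

```shell
#!/bin/sh
# Hypothetical helper: print every install-router log file found in a
# cluster-up logs directory, each preceded by a header naming the file.
show_router_logs() {
  # Default assumes the command is run from the directory where
  # `oc cluster up` was started (the "CWD" in the comment above).
  log_dir=${1:-./openshift.local.clusterup/logs}
  for f in "$log_dir"/install-router-*; do
    [ -e "$f" ] || continue   # the glob matched nothing
    printf '==> %s <==\n' "$f"
    cat "$f"
  done
}

# Usage:
#   show_router_logs                       # relative to the current directory
#   show_router_logs /path/to/logs         # explicit directory
```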

@mfojtik mfojtik self-assigned this Mar 27, 2018
@jmontleon
Contributor Author

There's just one line in stderr, and stdout is empty:

cat install-router-001.stderr install-router-001.stdout 
F0327 13:49:49.032471       1 helpers.go:119] error: error getting client: Error loading config file "/var/lib/origin/openshift.local.config/master/admin.kubeconfig": open /var/lib/origin/openshift.local.config/master/admin.kubeconfig: permission denied

@jmontleon
Contributor Author

Based on that, I tried again with setenforce 0 and see this in the audit log:

type=AVC msg=audit(1522172000.475:552156): avc:  denied  { open } for  pid=22299 comm="oc" path="/var/lib/origin/openshift.local.config/master/admin.kubeconfig" dev="dm-0" ino=479211 scontext=system_u:system_r:container_t:s0:c74,c997 tcontext=unconfined_u:object_r:admin_home_t:s0 tclass=file permissive=1
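The interesting parts of a denial like this are the source and target contexts. As a small sketch (the helper name `avc_field` is hypothetical), the fields can be pulled out of the audit line with sed. Here the source type is `container_t` and the target type is `admin_home_t`, which suggests the label on the file is what the policy objects to.

```shell
#!/bin/sh
# Hypothetical helper: extract a key=value field (scontext, tcontext, tclass, ...)
# from a single AVC audit line. $1 = field name, $2 = the full line.
avc_field() {
  printf '%s\n' "$2" | sed -n "s/.*[[:space:]]$1=\([^[:space:]]*\).*/\1/p"
}

# The denial quoted above, copied verbatim.
line='type=AVC msg=audit(1522172000.475:552156): avc:  denied  { open } for  pid=22299 comm="oc" path="/var/lib/origin/openshift.local.config/master/admin.kubeconfig" dev="dm-0" ino=479211 scontext=system_u:system_r:container_t:s0:c74,c997 tcontext=unconfined_u:object_r:admin_home_t:s0 tclass=file permissive=1'

avc_field scontext "$line"                    # context of the process (oc in the container)
avc_field tcontext "$line"                    # context of the file being opened
avc_field tcontext "$line" | cut -d: -f3      # just the SELinux type of the file
```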

@jmontleon
Contributor Author

If I run from /tmp it works.

I get a similar error if I run from /home:

type=AVC msg=audit(1522172375.881:553131): avc:  denied  { read } for  pid=12066 comm="oc" name="admin.kubeconfig" dev="dm-0" ino=586785998 scontext=system_u:system_r:container_t:s0:c296,c564 tcontext=unconfined_u:object_r:user_home_t:s0 tclass=file permissive=0
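Until a fix lands, the observation that /tmp works can be turned into a stopgap. This is only a sketch: the function name `cluster_up_in_tmp` and the `OC` override are hypothetical, and it assumes, as the comments above indicate, that oc cluster up writes openshift.local.clusterup under the current directory.

```shell
#!/bin/sh
# Hypothetical workaround: start the cluster from a scratch directory under /tmp
# so the generated config is not created under /root or /home.
cluster_up_in_tmp() {
  work_dir=$(mktemp -d /tmp/cluster-up.XXXXXX) || return 1
  echo "working directory: $work_dir"
  # Subshell keeps the caller's current directory unchanged.
  ( cd "$work_dir" && "${OC:-oc}" cluster up "$@" )
}

# Usage, with the flags from this report:
#   OC=/root/bin/oc cluster_up_in_tmp --routing-suffix=172.17.0.1.nip.io \
#     --public-hostname=172.17.0.1 --tag=latest --service-catalog=true
```

Note that anything under /tmp may be cleaned up by the system, so this is suited to throwaway clusters only.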

@deads2k
Contributor

deads2k commented Mar 27, 2018

Opened #19113 to make it match the registry.
