Failing Test : persistent-volume-upgrade [sig-storage] #68899

Closed
tpepper opened this issue Sep 20, 2018 · 61 comments · Fixed by #68984
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/storage Categorizes an issue or PR as relevant to SIG Storage.

Comments

@tpepper
Member

tpepper commented Sep 20, 2018

Failing Job


sig-release-master-upgrade#gce-new-master-upgrade-master

Gubernator Logs


https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-new-master-upgrade-master/1778

Error


Expected error:
    <*errors.errorString | 0xc420a4b980>: {
        s: "failed to get logs from pvc-tester-j4c4l for write-pod: the server does not allow this method on the requested resource (get pods pvc-tester-j4c4l)",
    }
    failed to get logs from pvc-tester-j4c4l for write-pod: the server does not allow this method on the requested resource (get pods pvc-tester-j4c4l)
not to have occurred

k8s.io/kubernetes/test/e2e/framework.(*Framework).testContainerOutputMatcher(0xc42097b340, 0x443f662, 0xf, 0xc42026a000, 0x0, 0xc420090e98, 0x1, 0x1, 0x45d0920)
	/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/util.go:2350 +0x216
k8s.io/kubernetes/test/e2e/framework.(*Framework).TestContainerOutput(0xc42097b340, 0x443f662, 0xf, 0xc42026a000, 0x0, 0xc420090e98, 0x1, 0x1)
	/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:475 +0x8a
k8s.io/kubernetes/test/e2e/upgrades/storage.(*PersistentVolumeUpgradeTest).testPod(0x6c17120, 0xc42097b340, 0x449c1c2, 0x20)
	/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/upgrades/storage/persistent_volumes.go:100 +0x10e
k8s.io/kubernetes/test/e2e/upgrades/storage.(*PersistentVolumeUpgradeTest).Test(0x6c17120, 0xc42097b340, 0xc42151eae0, 0x0)
	/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/upgrades/storage/persistent_volumes.go:82 +0x8f
k8s.io/kubernetes/test/e2e/lifecycle.(*chaosMonkeyAdapter).Test(0xc4217d7640, 0xc421382b80)
	/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/lifecycle/cluster_upgrade.go:447 +0x1da
k8s.io/kubernetes/test/e2e/lifecycle.(*chaosMonkeyAdapter).Test-fm(0xc421382b80)
	/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/lifecycle/cluster_upgrade.go:516 +0x34
k8s.io/kubernetes/test/e2e/chaosmonkey.(*chaosmonkey).Do.func1(0xc421382b80, 0xc42157c8a0)
	/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/chaosmonkey/chaosmonkey.go:89 +0x76
created by k8s.io/kubernetes/test/e2e/chaosmonkey.(*chaosmonkey).Do
	/go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/chaosmonkey/chaosmonkey.go:86 +0xa8

Triage


This test is a persistent failure in this one test bucket only; aside from one flake, it appears to consistently pass everywhere else in sig-release-master-upgrade and in sig-release-1.12-all.

/kind bug
/priority failing-test
/sig storage
@kubernetes/sig-storage-bugs
/assign @msau42
/cc @saad-ali @childsb
/milestone v1.12

@k8s-ci-robot k8s-ci-robot added this to the v1.12 milestone Sep 20, 2018
@k8s-ci-robot k8s-ci-robot added sig/storage Categorizes an issue or PR as relevant to SIG Storage. kind/bug Categorizes issue or PR as related to a bug. labels Sep 20, 2018
@msau42
Member

msau42 commented Sep 20, 2018

Looks like a permissions issue after the upgrade.
/assign @davidz627

@tpepper
Member Author

tpepper commented Sep 20, 2018

@davidz627 has triaged this: the failures happen because the pod is scheduled on the master node, and fetching logs via the method the test uses is disabled on master nodes.

This makes me think it is similar to #68881 (review), i.e. we have non-release-blocking issues where test cases are not quite robust enough in the face of node taints.

@tpepper
Member Author

tpepper commented Sep 20, 2018

It is interesting that in some jobs this test always fails, in others it always succeeds, and in one it flakes but with a high success rate: https://k8s-testgrid.appspot.com/sig-release-master-upgrade#gce-new-master-upgrade-cluster

@davidz627
Contributor

davidz627 commented Sep 20, 2018

Adding on to @tpepper's comment:
The pod that we are creating does not have any special "master schedulable" annotations or tolerations, so to my understanding it should not be scheduled to the master for any reason.

The Pod Spec in question:

&v1.Pod{
	TypeMeta: metav1.TypeMeta{
		Kind:       "Pod",
		APIVersion: "v1",
	},
	ObjectMeta: metav1.ObjectMeta{
		GenerateName: "pvc-tester-",
		Namespace:    ns,
	},
	Spec: v1.PodSpec{
		Containers: []v1.Container{
			{
				Name:    "write-pod",
				Image:   BusyBoxImage,
				Command: []string{"/bin/sh"},
				Args:    []string{"-c", command},
				SecurityContext: &v1.SecurityContext{
					Privileged: &isPrivileged,
				},
			},
		},
		RestartPolicy: v1.RestartPolicyOnFailure,
	},
}
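For illustration only (a hedged sketch, not part of the test): to be eligible for a master tainted with node-role.kubernetes.io/master:NoSchedule, the pod above would need a toleration roughly like the hypothetical helper below, using the same k8s.io/api/core/v1 types as the spec. Since the test pod carries nothing of the sort, a correctly tainted master should never be a candidate for it.

package storage

import (
	v1 "k8s.io/api/core/v1"
)

// withMasterToleration is a hypothetical helper, shown only to illustrate
// what the pod spec above lacks: without a toleration like this, the
// scheduler should filter out any node carrying a NoSchedule master taint.
func withMasterToleration(pod *v1.Pod) *v1.Pod {
	pod.Spec.Tolerations = append(pod.Spec.Tolerations, v1.Toleration{
		Key:      "node-role.kubernetes.io/master",
		Operator: v1.TolerationOpExists,
		Effect:   v1.TaintEffectNoSchedule,
	})
	return pod
}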

Potentially relevant scheduler logs.

I0920 05:11:58.570729       1 factory.go:1286] About to try and schedule pod e2e-tests-persistent-volume-upgrade-sig-storage-xsxmg/pvc-tester-j4c4l
I0920 05:11:58.570764       1 scheduler.go:403] Attempting to schedule pod: e2e-tests-persistent-volume-upgrade-sig-storage-xsxmg/pvc-tester-j4c4l
I0920 05:11:58.570918       1 scheduler_binder.go:509] All bound volumes for Pod "e2e-tests-persistent-volume-upgrade-sig-storage-xsxmg/pvc-tester-j4c4l" match with Node "bootstrap-e2e-master"
I0920 05:11:58.570954       1 scheduler_binder.go:509] All bound volumes for Pod "e2e-tests-persistent-volume-upgrade-sig-storage-xsxmg/pvc-tester-j4c4l" match with Node "bootstrap-e2e-minion-group-7hhn"
I0920 05:11:58.570984       1 scheduler_binder.go:509] All bound volumes for Pod "e2e-tests-persistent-volume-upgrade-sig-storage-xsxmg/pvc-tester-j4c4l" match with Node "bootstrap-e2e-minion-group-mt5f"
I0920 05:11:58.571008       1 scheduler_binder.go:509] All bound volumes for Pod "e2e-tests-persistent-volume-upgrade-sig-storage-xsxmg/pvc-tester-j4c4l" match with Node "bootstrap-e2e-minion-group-t0d1"
I0920 05:11:58.571116       1 scheduler_binder.go:200] AssumePodVolumes for pod "e2e-tests-persistent-volume-upgrade-sig-storage-xsxmg/pvc-tester-j4c4l", node "bootstrap-e2e-master"
I0920 05:11:58.571128       1 scheduler_binder.go:203] AssumePodVolumes for pod "e2e-tests-persistent-volume-upgrade-sig-storage-xsxmg/pvc-tester-j4c4l", node "bootstrap-e2e-master": all PVCs bound and nothing to do
I0920 05:11:58.571155       1 factory.go:1512] Attempting to bind pvc-tester-j4c4l to bootstrap-e2e-master

@msau42
Member

msau42 commented Sep 20, 2018

The fix mentioned in #68881 is only handling DaemonSets.

I don't think it will fix this issue.

@msau42
Member

msau42 commented Sep 20, 2018

Do we need to get sig-scheduling involved to see why the master is showing up as a choice for this pod?

@tpepper
Member Author

tpepper commented Sep 20, 2018

/sig scheduling
@bsalamat
Can you triage this for 1.12 release blocking?

@k8s-ci-robot k8s-ci-robot added the sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. label Sep 20, 2018
@janetkuo
Member

It seems that in this job the pvc-tester- pod is scheduled to the master in every run, which causes the test to fail consistently.

I0920 05:23:34.239] Sep 20 05:23:34.238: INFO: At 2018-09-20 05:11:58 +0000 UTC - event for pvc-tester-j4c4l: {default-scheduler } Scheduled: Successfully assigned e2e-tests-persistent-volume-upgrade-sig-storage-xsxmg/pvc-tester-j4c4l to bootstrap-e2e-master
I0920 05:23:34.239] Sep 20 05:23:34.238: INFO: At 2018-09-20 05:12:14 +0000 UTC - event for pvc-tester-j4c4l: {attachdetach-controller } SuccessfulAttachVolume: AttachVolume.Attach succeeded for volume "pv-upgrader7fm4" 
I0920 05:23:34.240] Sep 20 05:23:34.238: INFO: At 2018-09-20 05:12:24 +0000 UTC - event for pvc-tester-j4c4l: {kubelet bootstrap-e2e-master} Pulling: pulling image "docker.io/library/busybox:1.29"
I0920 05:23:34.240] Sep 20 05:23:34.238: INFO: At 2018-09-20 05:12:24 +0000 UTC - event for pvc-tester-j4c4l: {kubelet bootstrap-e2e-master} Pulled: Successfully pulled image "docker.io/library/busybox:1.29"
I0920 05:23:34.240] Sep 20 05:23:34.238: INFO: At 2018-09-20 05:12:24 +0000 UTC - event for pvc-tester-j4c4l: {kubelet bootstrap-e2e-master} Created: Created container
I0920 05:23:34.240] Sep 20 05:23:34.238: INFO: At 2018-09-20 05:12:28 +0000 UTC - event for pvc-tester-j4c4l: {kubelet bootstrap-e2e-master} Started: Started container

@davidz627
Contributor

@janetkuo
However, in this test suite the test seems to pass most of the time (pod scheduled to a worker node) and occasionally flakes (pod scheduled to the master): https://k8s-testgrid.appspot.com/sig-release-master-upgrade#gce-new-master-upgrade-cluster

Whereas this one doesn't seem to fail at all: https://k8s-testgrid.appspot.com/sig-release-master-upgrade#gke-gci-new-gci-master-upgrade-cluster-parallel

I talked to @krzyzacy about why the behavior might be different across different test suites but we were unable to come up with anything concrete.

@bsalamat
Member

@davidz627 Is the master node in your test environment tainted properly?

@msau42
Member

msau42 commented Sep 20, 2018

While doing unrelated development with DaemonSets, we noticed a similar issue where our DaemonSet pod would get scheduled onto the master in a GCE environment brought up with hack/cluster-up.sh, and we noticed that masters brought up using that script were not tainted.

@msau42
Member

msau42 commented Sep 20, 2018

In what release was the master-tainting logic added?

@krzyzacy
Member

cc @yguo0905: we had a similar issue where the GPU driver installer was being scheduled on the master at #68372; could that also be related?

@davidz627
Contributor

davidz627 commented Sep 20, 2018

Not sure if this is a source of truth on the cluster but the test artifacts here on a successful run show that the master node doesn't have a taint:
https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-new-master-upgrade-cluster/2480/artifacts/nodes.yaml

and on a failed run ALSO no taint:
https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-e2e-gce-new-master-upgrade-cluster/2477/artifacts/nodes.yaml

But I'm not really sure where this nodes.yaml file comes from, or whether it reflects the cluster before or after the upgrade.

@tpepper
Member Author

tpepper commented Sep 20, 2018

ie: #68492

@msau42
Member

msau42 commented Sep 20, 2018

I don't think #68492 will help; it looks specific to GPU labeling.

@tpepper
Member Author

tpepper commented Sep 21, 2018

Also, ^^ is GKE-specific, and the GKE docs say that no taints are inherited by a node pool from the cluster, so this shouldn't be coming from that direction.

@tpepper
Member Author

tpepper commented Sep 21, 2018

Some log snippets around the area of interest:

I0920 05:11:57.830] Found 4 node(s).
I0920 05:11:57.979] NAME                              STATUS                     ROLES    AGE   VERSION
I0920 05:11:57.980] bootstrap-e2e-master              Ready,SchedulingDisabled   <none>   16m   v1.13.0-alpha.0.1353+d1111a57d9243c
I0920 05:11:57.980] bootstrap-e2e-minion-group-7hhn   Ready                      <none>   15m   v1.11.4-beta.0.25+f818aa3443bc98
I0920 05:11:57.980] bootstrap-e2e-minion-group-mt5f   Ready                      <none>   15m   v1.11.4-beta.0.25+f818aa3443bc98
I0920 05:11:57.980] bootstrap-e2e-minion-group-t0d1   Ready                      <none>   16m   v1.11.4-beta.0.25+f818aa3443bc98
I0920 05:23:27.654] Logging node info for node bootstrap-e2e-master
I0920 05:23:27.657] Sep 20 05:23:27.651: INFO: Node Info: &Node{ObjectMeta:k8s_io_apimachinery_pkg_apis_meta_v1.ObjectMeta{Name:bootstrap-e2e-master,GenerateName:,Namespace:,SelfLink:/api/v1/nodes/bootstrap-e2e-master,UID:75766387-bc91-11e8-80e8-42010a800002,ResourceVersion:4262,Generation:0,CreationTimestamp:2018-09-20 04:55:56 +0000 UTC,DeletionTimestamp:<nil>,DeletionGracePeriodSeconds:nil,Labels:map[string]string{beta.kubernetes.io/arch: amd64,beta.kubernetes.io/fluentd-ds-ready: true,beta.kubernetes.io/instance-type: n1-standard-1,beta.kubernetes.io/metadata-proxy-ready: true,beta.kubernetes.io/os: linux,failure-domain.beta.kubernetes.io/region: us-central1,failure-domain.beta.kubernetes.io/zone: us-central1-f,kubernetes.io/hostname: bootstrap-e2e-master,},Annotations:map[string]string{node.alpha.kubernetes.io/ttl: 0,volumes.kubernetes.io/controller-managed-attach-detach: true,},OwnerReferences:[],Finalizers:[],ClusterName:,Initializers:nil,},Spec:NodeSpec{PodCIDR:10.64.1.0/24,DoNotUse_ExternalID:,ProviderID:gce://gce-cvm-upg-lat-1-3-ctl-skew/us-central1-f/bootstrap-e2e-master,Unschedulable:true,Taints:[{node.kubernetes.io/unschedulable  NoSchedule 2018-09-20 05:12:08 +0000 UTC}],ConfigSource:nil,},Status:NodeStatus{Capacity:ResourceList{cpu: {{1 0} {<nil>} 1 DecimalSI},ephemeral-storage: {{16684785664 0} {<nil>}  BinarySI},hugepages-2Mi: {{0 0} {<nil>} 0 DecimalSI},memory: {{3885408256 0} {<nil>}  BinarySI},pods: {{110 0} {<nil>} 110 DecimalSI},},Allocatable:ResourceList{cpu: {{1 0} {<nil>} 1 DecimalSI},ephemeral-storage: {{15016307073 0} {<nil>} 15016307073 DecimalSI},hugepages-2Mi: {{0 0} {<nil>} 0 DecimalSI},memory: {{3623264256 0} {<nil>}  BinarySI},pods: {{110 0} {<nil>} 110 DecimalSI},},Phase:,Conditions:[{NetworkUnavailable False 2018-09-20 04:56:13 +0000 UTC 2018-09-20 04:56:13 +0000 UTC RouteCreated RouteController created a route} {OutOfDisk False 2018-09-20 05:23:21 +0000 UTC 2018-09-20 04:55:56 +0000 UTC KubeletHasSufficientDisk kubelet has sufficient disk space available} {MemoryPressure False 2018-09-20 05:23:21 +0000 UTC 2018-09-20 04:55:56 +0000 UTC KubeletHasSufficientMemory kubelet has sufficient memory available} {DiskPressure False 2018-09-20 05:23:21 +0000 UTC 2018-09-20 04:55:56 +0000 UTC KubeletHasNoDiskPressure kubelet has no disk pressure} {PIDPressure False 2018-09-20 05:23:21 +0000 UTC 2018-09-20 04:55:56 +0000 UTC KubeletHasSufficientPID kubelet has sufficient PID available} {Ready True 2018-09-20 05:23:21 +0000 UTC 2018-09-20 04:56:16 +0000 UTC KubeletReady kubelet is posting ready status. 
AppArmor enabled}],Addresses:[{InternalIP 10.128.0.2} {ExternalIP 35.239.250.148} {InternalDNS bootstrap-e2e-master.c.gce-cvm-upg-lat-1-3-ctl-skew.internal} {Hostname bootstrap-e2e-master.c.gce-cvm-upg-lat-1-3-ctl-skew.internal}],DaemonEndpoints:NodeDaemonEndpoints{KubeletEndpoint:DaemonEndpoint{Port:10250,},},NodeInfo:NodeSystemInfo{MachineID:ecdd0ef5fc22c16c77ade90d525280e9,SystemUUID:ECDD0EF5-FC22-C16C-77AD-E90D525280E9,BootID:8a45557c-4637-4e10-8939-cfc8fe158476,KernelVersion:4.4.111+,OSImage:Container-Optimized OS from Google,ContainerRuntimeVersion:docker://17.3.2,KubeletVersion:v1.13.0-alpha.0.1353+d1111a57d9243c,KubeProxyVersion:v1.13.0-alpha.0.1353+d1111a57d9243c,OperatingSystem:linux,Architecture:amd64,},Images:[{[gcr.io/stackdriver-agents/stackdriver-logging-agent@sha256:c34bbdcd4585e3391a22f4cfca8ffc336eb0b9b8e0942a1da0f62ab4d0db0fc8 gcr.io/stackdriver-agents/stackdriver-logging-agent:0.2-1.5.30-1-k8s] 544936438} {[k8s.gcr.io/etcd@sha256:adf0009b3ea09d82fb15c7649a7436e0635cb51c74c5d58796db6c3c52713a1c k8s.gcr.io/etcd:3.2.24-0] 219655340} {[gcr.io/google_containers/kube-apiserver:v1.13.0-alpha.0.1353_d1111a57d9243c k8s.gcr.io/kube-apiserver:v1.13.0-alpha.0.1353_d1111a57d9243c] 193572744} {[gcr.io/google_containers/kube-controller-manager:v1.13.0-alpha.0.1353_d1111a57d9243c k8s.gcr.io/kube-controller-manager:v1.13.0-alpha.0.1353_d1111a57d9243c] 163792042} {[k8s.gcr.io/prometheus-to-sd@sha256:6c0c742475363d537ff059136e5d5e4ab1f512ee0fd9b7ca42ea48bc309d1662 k8s.gcr.io/prometheus-to-sd:v0.3.1] 88077694} {[k8s.gcr.io/ingress-gce-glbc-amd64@sha256:14f14351a03038b238232e60850a9cfa0dffbed0590321ef84216a432accc1ca k8s.gcr.io/ingress-gce-glbc-amd64:v1.2.3] 71797285} {[k8s.gcr.io/kube-addon-manager@sha256:d53486c3a0b49ebee019932878dc44232735d5622a51dbbdcec7124199020d09 k8s.gcr.io/kube-addon-manager:v8.7] 63322109} {[k8s.gcr.io/prometheus-to-sd@sha256:71b2389fc4931b1cc3bb27ba1873361c346950558dd2220beb02ab6b31d3a651 k8s.gcr.io/prometheus-to-sd:v0.2.4] 58688412} {[gcr.io/google_containers/kube-scheduler:v1.13.0-alpha.0.1353_d1111a57d9243c k8s.gcr.io/kube-scheduler:v1.13.0-alpha.0.1353_d1111a57d9243c] 58152498} {[k8s.gcr.io/etcd-empty-dir-cleanup@sha256:c9c0dea9dbdbdaca436e33947a7559dbe2a60ad2bfa4e77518cb7088e443e28a k8s.gcr.io/etcd-empty-dir-cleanup:3.2.24.0] 31724043} {[k8s.gcr.io/metadata-proxy@sha256:5be758058e67b578f7041498e2daca46ccd7426bc602d35ed0f50bd4a3659d50 k8s.gcr.io/metadata-proxy:v0.1.10] 8953717} {[gcr.io/kubernetes-e2e-test-images/serve-hostname@sha256:bab70473a6d8ef65a22625dc9a1b0f0452e811530fdbe77e4408523460177ff1 gcr.io/kubernetes-e2e-test-images/serve-hostname:1.1] 5851985} {[busybox@sha256:5e8e0509e829bb8f990249135a36e81a3ecbe94294e7a185cc14616e5fad96bd busybox:1.29] 1162769} {[k8s.gcr.io/pause@sha256:f78411e19d84a252e53bff71a4407a5686c46983a2c2eeed83929b888179acea k8s.gcr.io/pause:3.1] 742472}],VolumesInUse:[],VolumesAttached:[],Config:&NodeConfigStatus{Assigned:nil,Active:nil,LastKnownGood:nil,Error:,},},}
I0920 05:23:34.239] Sep 20 05:23:34.238: INFO: At 2018-09-20 05:12:14 +0000 UTC - event for pvc-tester-j4c4l: {attachdetach-controller } SuccessfulAttachVolume: AttachVolume.Attach succeeded for volume "pv-upgrader7fm4" 
I0920 05:23:34.240] Sep 20 05:23:34.238: INFO: At 2018-09-20 05:12:24 +0000 UTC - event for pvc-tester-j4c4l: {kubelet bootstrap-e2e-master} Pulling: pulling image "docker.io/library/busybox:1.29"
I0920 05:23:34.240] Sep 20 05:23:34.238: INFO: At 2018-09-20 05:12:24 +0000 UTC - event for pvc-tester-j4c4l: {kubelet bootstrap-e2e-master} Pulled: Successfully pulled image "docker.io/library/busybox:1.29"
I0920 05:23:34.240] Sep 20 05:23:34.238: INFO: At 2018-09-20 05:12:24 +0000 UTC - event for pvc-tester-j4c4l: {kubelet bootstrap-e2e-master} Created: Created container
I0920 05:23:34.240] Sep 20 05:23:34.238: INFO: At 2018-09-20 05:12:28 +0000 UTC - event for pvc-tester-j4c4l: {kubelet bootstrap-e2e-master} Started: Started container
I0920 05:23:34.268] Logging node info for node bootstrap-e2e-master
I0920 05:23:34.271] Sep 20 05:23:34.265: INFO: Node Info: &Node{ObjectMeta:k8s_io_apimachinery_pkg_apis_meta_v1.ObjectMeta{Name:bootstrap-e2e-master,GenerateName:,Namespace:,SelfLink:/api/v1/nodes/bootstrap-e2e-master,UID:75766387-bc91-11e8-80e8-42010a800002,ResourceVersion:4290,Generation:0,CreationTimestamp:2018-09-20 04:55:56 +0000 UTC,DeletionTimestamp:<nil>,DeletionGracePeriodSeconds:nil,Labels:map[string]string{beta.kubernetes.io/arch: amd64,beta.kubernetes.io/fluentd-ds-ready: true,beta.kubernetes.io/instance-type: n1-standard-1,beta.kubernetes.io/metadata-proxy-ready: true,beta.kubernetes.io/os: linux,failure-domain.beta.kubernetes.io/region: us-central1,failure-domain.beta.kubernetes.io/zone: us-central1-f,kubernetes.io/hostname: bootstrap-e2e-master,},Annotations:map[string]string{node.alpha.kubernetes.io/ttl: 0,volumes.kubernetes.io/controller-managed-attach-detach: true,},OwnerReferences:[],Finalizers:[],ClusterName:,Initializers:nil,},Spec:NodeSpec{PodCIDR:10.64.1.0/24,DoNotUse_ExternalID:,ProviderID:gce://gce-cvm-upg-lat-1-3-ctl-skew/us-central1-f/bootstrap-e2e-master,Unschedulable:true,Taints:[{node.kubernetes.io/unschedulable  NoSchedule 2018-09-20 05:12:08 +0000 UTC}],ConfigSource:nil,},Status:NodeStatus{Capacity:ResourceList{cpu: {{1 0} {<nil>} 1 DecimalSI},ephemeral-storage: {{16684785664 0} {<nil>}  BinarySI},hugepages-2Mi: {{0 0} {<nil>} 0 DecimalSI},memory: {{3885408256 0} {<nil>}  BinarySI},pods: {{110 0} {<nil>} 110 DecimalSI},},Allocatable:ResourceList{cpu: {{1 0} {<nil>} 1 DecimalSI},ephemeral-storage: {{15016307073 0} {<nil>} 15016307073 DecimalSI},hugepages-2Mi: {{0 0} {<nil>} 0 DecimalSI},memory: {{3623264256 0} {<nil>}  BinarySI},pods: {{110 0} {<nil>} 110 DecimalSI},},Phase:,Conditions:[{NetworkUnavailable False 2018-09-20 04:56:13 +0000 UTC 2018-09-20 04:56:13 +0000 UTC RouteCreated RouteController created a route} {OutOfDisk False 2018-09-20 05:23:31 +0000 UTC 2018-09-20 04:55:56 +0000 UTC KubeletHasSufficientDisk kubelet has sufficient disk space available} {MemoryPressure False 2018-09-20 05:23:31 +0000 UTC 2018-09-20 04:55:56 +0000 UTC KubeletHasSufficientMemory kubelet has sufficient memory available} {DiskPressure False 2018-09-20 05:23:31 +0000 UTC 2018-09-20 04:55:56 +0000 UTC KubeletHasNoDiskPressure kubelet has no disk pressure} {PIDPressure False 2018-09-20 05:23:31 +0000 UTC 2018-09-20 04:55:56 +0000 UTC KubeletHasSufficientPID kubelet has sufficient PID available} {Ready True 2018-09-20 05:23:31 +0000 UTC 2018-09-20 04:56:16 +0000 UTC KubeletReady kubelet is posting ready status. 
AppArmor enabled}],Addresses:[{InternalIP 10.128.0.2} {ExternalIP 35.239.250.148} {InternalDNS bootstrap-e2e-master.c.gce-cvm-upg-lat-1-3-ctl-skew.internal} {Hostname bootstrap-e2e-master.c.gce-cvm-upg-lat-1-3-ctl-skew.internal}],DaemonEndpoints:NodeDaemonEndpoints{KubeletEndpoint:DaemonEndpoint{Port:10250,},},NodeInfo:NodeSystemInfo{MachineID:ecdd0ef5fc22c16c77ade90d525280e9,SystemUUID:ECDD0EF5-FC22-C16C-77AD-E90D525280E9,BootID:8a45557c-4637-4e10-8939-cfc8fe158476,KernelVersion:4.4.111+,OSImage:Container-Optimized OS from Google,ContainerRuntimeVersion:docker://17.3.2,KubeletVersion:v1.13.0-alpha.0.1353+d1111a57d9243c,KubeProxyVersion:v1.13.0-alpha.0.1353+d1111a57d9243c,OperatingSystem:linux,Architecture:amd64,},Images:[{[gcr.io/stackdriver-agents/stackdriver-logging-agent@sha256:c34bbdcd4585e3391a22f4cfca8ffc336eb0b9b8e0942a1da0f62ab4d0db0fc8 gcr.io/stackdriver-agents/stackdriver-logging-agent:0.2-1.5.30-1-k8s] 544936438} {[k8s.gcr.io/etcd@sha256:adf0009b3ea09d82fb15c7649a7436e0635cb51c74c5d58796db6c3c52713a1c k8s.gcr.io/etcd:3.2.24-0] 219655340} {[gcr.io/google_containers/kube-apiserver:v1.13.0-alpha.0.1353_d1111a57d9243c k8s.gcr.io/kube-apiserver:v1.13.0-alpha.0.1353_d1111a57d9243c] 193572744} {[gcr.io/google_containers/kube-controller-manager:v1.13.0-alpha.0.1353_d1111a57d9243c k8s.gcr.io/kube-controller-manager:v1.13.0-alpha.0.1353_d1111a57d9243c] 163792042} {[k8s.gcr.io/prometheus-to-sd@sha256:6c0c742475363d537ff059136e5d5e4ab1f512ee0fd9b7ca42ea48bc309d1662 k8s.gcr.io/prometheus-to-sd:v0.3.1] 88077694} {[k8s.gcr.io/ingress-gce-glbc-amd64@sha256:14f14351a03038b238232e60850a9cfa0dffbed0590321ef84216a432accc1ca k8s.gcr.io/ingress-gce-glbc-amd64:v1.2.3] 71797285} {[k8s.gcr.io/kube-addon-manager@sha256:d53486c3a0b49ebee019932878dc44232735d5622a51dbbdcec7124199020d09 k8s.gcr.io/kube-addon-manager:v8.7] 63322109} {[k8s.gcr.io/prometheus-to-sd@sha256:71b2389fc4931b1cc3bb27ba1873361c346950558dd2220beb02ab6b31d3a651 k8s.gcr.io/prometheus-to-sd:v0.2.4] 58688412} {[gcr.io/google_containers/kube-scheduler:v1.13.0-alpha.0.1353_d1111a57d9243c k8s.gcr.io/kube-scheduler:v1.13.0-alpha.0.1353_d1111a57d9243c] 58152498} {[k8s.gcr.io/etcd-empty-dir-cleanup@sha256:c9c0dea9dbdbdaca436e33947a7559dbe2a60ad2bfa4e77518cb7088e443e28a k8s.gcr.io/etcd-empty-dir-cleanup:3.2.24.0] 31724043} {[k8s.gcr.io/metadata-proxy@sha256:5be758058e67b578f7041498e2daca46ccd7426bc602d35ed0f50bd4a3659d50 k8s.gcr.io/metadata-proxy:v0.1.10] 8953717} {[gcr.io/kubernetes-e2e-test-images/serve-hostname@sha256:bab70473a6d8ef65a22625dc9a1b0f0452e811530fdbe77e4408523460177ff1 gcr.io/kubernetes-e2e-test-images/serve-hostname:1.1] 5851985} {[busybox@sha256:5e8e0509e829bb8f990249135a36e81a3ecbe94294e7a185cc14616e5fad96bd busybox:1.29] 1162769} {[k8s.gcr.io/pause@sha256:f78411e19d84a252e53bff71a4407a5686c46983a2c2eeed83929b888179acea k8s.gcr.io/pause:3.1] 742472}],VolumesInUse:[],VolumesAttached:[],Config:&NodeConfigStatus{Assigned:nil,Active:nil,LastKnownGood:nil,Error:,},},}
I0920 05:25:20.039] • Failure [1515.350 seconds]
I0920 05:25:20.039] [sig-cluster-lifecycle] Upgrade [Feature:Upgrade]
I0920 05:25:20.039] /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/lifecycle/framework.go:22
I0920 05:25:20.039]   master upgrade
I0920 05:25:20.039]   /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/lifecycle/cluster_upgrade.go:90
I0920 05:25:20.039]     should maintain a functioning cluster [Feature:MasterUpgrade] [It]
I0920 05:25:20.040]     /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/lifecycle/cluster_upgrade.go:91
I0920 05:25:20.040] 
I0920 05:25:20.040]     Expected error:
I0920 05:25:20.040]         <*errors.errorString | 0xc420a4b980>: {
I0920 05:25:20.040]             s: "failed to get logs from pvc-tester-j4c4l for write-pod: the server does not allow this method on the requested resource (get pods pvc-tester-j4c4l)",
I0920 05:25:20.040]         }
I0920 05:25:20.040]         failed to get logs from pvc-tester-j4c4l for write-pod: the server does not allow this method on the requested resource (get pods pvc-tester-j4c4l)
I0920 05:25:20.040]     not to have occurred
I0920 05:25:20.041] 
I0920 05:25:20.041]     /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/util.go:2350
I0920 05:25:20.041] ------------------------------

@tpepper
Member Author

tpepper commented Sep 21, 2018

/priority critical-urgent

@k8s-ci-robot k8s-ci-robot added the priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. label Sep 21, 2018
@msau42
Member

msau42 commented Sep 21, 2018

I think the fix is obvious: we need to taint the master node in the GCE master setup in cluster/.

What I still don't understand is how we got this far without anything else failing, and why only this one test is affected.
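Not the actual cluster/ change, just a rough sketch (assuming a 1.12-era client-go and a kubeconfig on disk; the node name matches the logs above) of what ensuring that taint amounts to:

package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// taintMaster adds the standard master NoSchedule taint to the named node if
// it is missing. Purely illustrative; the real fix would live in the GCE
// setup scripts under cluster/, not in a one-off client like this.
func taintMaster(cs kubernetes.Interface, nodeName string) error {
	node, err := cs.CoreV1().Nodes().Get(nodeName, metav1.GetOptions{})
	if err != nil {
		return err
	}
	for _, t := range node.Spec.Taints {
		if t.Key == "node-role.kubernetes.io/master" {
			return nil // already tainted, nothing to do
		}
	}
	node.Spec.Taints = append(node.Spec.Taints, v1.Taint{
		Key:    "node-role.kubernetes.io/master",
		Effect: v1.TaintEffectNoSchedule,
	})
	_, err = cs.CoreV1().Nodes().Update(node)
	return err
}

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	cs := kubernetes.NewForConfigOrDie(config)
	if err := taintMaster(cs, "bootstrap-e2e-master"); err != nil {
		panic(err)
	}
	fmt.Println("master taint ensured")
}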

@msau42
Member

msau42 commented Sep 21, 2018

Do taints persist across master upgrades, or do they need to be re-applied?

@bsalamat
Member

cc @dashpole for what we can do on the node side.
cc @tallclair for the security/policy-related matters associated with adding taints for node conditions.

@tpepper
Member Author

tpepper commented Sep 21, 2018

@kubernetes/sig-node-bugs help please

@msau42
Member

msau42 commented Sep 21, 2018

@bsalamat regarding your example "a pod that tolerates certain node conditions might not be scheduled if the scheduler checked both node taints and node condition", could this be an API compatibility issue? Will a pod that tolerates a node condition in 1.11 fail to schedule in 1.12 because the scheduler no longer looks at node conditions and only node taints?

@bsalamat
Member

bsalamat commented Sep 21, 2018

@msau42 Pods were never able to tolerate node conditions by setting API fields. We had some annotations, such as the critical pod annotation, and other hard-coded logic in the DaemonSet controller, etc., that had the same effect as tolerating conditions, but they were not part of our API.

@msau42
Member

msau42 commented Sep 21, 2018

Thanks for the explanation. It looks like the critical pod annotation was alpha, so it does not carry a backwards-compatibility requirement, and we've fixed all the known users of it.

@tallclair
Member

xref: #63897 - cc @mikedanese @liggitt

Should this be handled by the node controller? I.e. if the Unschedulable bool is set on the node, add the unschedulable taint?

@dashpole
Contributor

I don't believe the kubelet has the ability to update its taints anymore. Assuming this is the case, this seems like a version skew problem between master components, i.e. the 1.11 controller manager isn't compatible with the 1.13 scheduler because it doesn't add taints that the scheduler requires. This means that if the scheduler is upgraded before the controller manager (or before it has had the chance to perform updates), we can run into issues.

I think making the scheduler respect conditions in 1.13, and thus making it compatible with the 1.11 controller manager, seems like the best remedy to this. Regardless of this issue, user pods would need to include both the critical annotation as well as the toleration in order to maintain the same behavior across releases that transition from conditions to taints.

@tallclair
Member

Oh, I see that the node controller is already doing that, but there's a race condition (#68899 (comment)), sorry I missed that earlier.

@liggitt
Member

liggitt commented Sep 21, 2018

Yes, translation of conditions and unschedulable field to taints is done by the controller
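Conceptually, that controller-side translation boils down to something like the following sketch (illustrative only, not the actual controller code):

package sketch

import v1 "k8s.io/api/core/v1"

// syncUnschedulableTaint keeps the node.kubernetes.io/unschedulable
// NoSchedule taint in step with node.Spec.Unschedulable. Illustrative only.
func syncUnschedulableTaint(node *v1.Node) {
	const key = "node.kubernetes.io/unschedulable"
	idx := -1
	for i, t := range node.Spec.Taints {
		if t.Key == key && t.Effect == v1.TaintEffectNoSchedule {
			idx = i
			break
		}
	}
	switch {
	case node.Spec.Unschedulable && idx < 0:
		// Field set but taint missing: add it so a taint-aware scheduler
		// also treats the node as unschedulable.
		node.Spec.Taints = append(node.Spec.Taints, v1.Taint{Key: key, Effect: v1.TaintEffectNoSchedule})
	case !node.Spec.Unschedulable && idx >= 0:
		// Field cleared but taint still present: drop it.
		node.Spec.Taints = append(node.Spec.Taints[:idx], node.Spec.Taints[idx+1:]...)
	}
}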

@bsalamat
Member

I wonder if we can change our upgrade flow to ensure that the controller manager is upgraded and becomes stable (adds the taints) before the scheduler is upgraded.

@liggitt
Member

liggitt commented Sep 21, 2018

I don't believe the kubelet has the ability to update its taints anymore.

The kubelet has never required the ability to update taints.

Assuming this is the case, this seems like a version skew problem between master components. I.E. the 1.11 controller manager isn't compatible with the 1.13 scheduler because it doesn't add taints that the scheduler requires.

Was that a typo, supposed to be 1.12? Have we ever supported two version skew between those components?

This means that if the scheduler is upgraded before the controller manager (or before it has the chance to perform updates), we can run into issues.

Under what circumstances do we upgrade to 1.13 scheduler while a 1.11 controller manager is still running?

I think making the scheduler respect conditions in 1.13, and thus making it compatible with the 1.11 controller manager, seems like the best remedy to this.

Same question about two version skew between these two components.

Regardless of this issue, user pods would need to include both the critical annotation as well as the toleration in order to maintain the same behavior across releases that transition from conditions to taints.

That's more concerning… is that something all user pods would be exposed to and required to change?

@yujuhong
Contributor

+1 to @dashpole's suggestion in #68899 (comment) to unblock the release

@msau42
Member

msau42 commented Sep 21, 2018

What about HA master upgrade?

The scheduler and controller-manager each have independent leader elections, so during an HA rolling upgrade the current leaders for the scheduler and controller manager could be on different releases, in either direction.

@liggitt
Member

liggitt commented Sep 21, 2018

during a HA rolling upgrade, the current leaders for scheduler and controller managers could be on different releases in either direction

I wouldn't expect an upgrade from 1.12 to 1.13 to begin until the upgrade from 1.11 to 1.12 was complete/successful

@dashpole
Contributor

dashpole commented Sep 21, 2018

Was that a typo, supposed to be 1.12? Have we ever supported two version skew between those components?

Thanks for correcting me. It wasn't a typo, but just a lack of knowledge about how multi-version upgrades are performed. In that case we need to support 1 version skew.

Under what circumstances do we upgrade to 1.13 scheduler while a 1.11 controller manager is still running?

Yeah, it should be 1.12, not 1.13. But generally, the newer scheduler should still be compatible with the state left by the previous version's controller manager, even if it isn't running anymore.

That's more concerning… is that something all user pods would be exposed to and required to change?

Actually, I agree. @bsalamat have we officially deprecated the critical pod annotation? We should ideally add the correct tolerations to pods with the critical pod annotation to ensure backwards compatibility.

@msau42
Member

msau42 commented Sep 21, 2018

Bobby commented here that the system pods have already been corrected. I also noted that the critical pod annotation is alpha.

@msau42
Member

msau42 commented Sep 21, 2018

during a HA rolling upgrade, the current leaders for scheduler and controller managers could be on different releases in either direction

I wouldn't expect an upgrade from 1.12 to 1.13 to begin until the upgrade from 1.11 to 1.12 was complete/successful

I mean doing an HA rolling upgrade from 1.11 to 1.12. Let's say we have 3 masters, A, B, C, all on 1.11.

  1. A is upgraded to 1.12, B and C are still on 1.11.
  2. scheduler leader election could pick scheduler A as the leader, controller-manager leader election could pick controller-manager B as the leader. Now we have version skew between master components.

@bsalamat
Member

bsalamat commented Sep 21, 2018

have we officially deprecated the feature? We should ideally add the correct tolerations to pods with the critical pod annotation to ensure backwards compatibility.

That feature (critical pods) was an alpha feature and we have already replaced it. The dilemma is that I am not so sure if critical pods were the only kind of pods with a "hack" for tolerating node conditions. Let me dig a bit more.

@liggitt
Member

liggitt commented Sep 21, 2018

Now we have version skew between master components.

One version of skew, absolutely. @dashpole's comment made me concerned that support for two versions of skew was being claimed/proposed.

@bsalamat
Member

bsalamat commented Sep 21, 2018

I looked at the scheduler code and the DaemonSet controller code to see whether adding "CheckNodeCondition" could cause issues in the presence of TaintNodeByCondition, and I didn't find anything particularly concerning. So I created #68953. Let's see if it breaks tests.

@k82cn
Member

k82cn commented Sep 21, 2018

That will break scheduling of DaemonSets by the scheduler, and the kubelet's registerWithTaints does not work in the upgrade case.

@bsalamat
Member

That will break scheduling of DaemonSets by the scheduler, and the kubelet's registerWithTaints does not work in the upgrade case.

I suspected that it would conflict with pods which now have tolerations and for which the scheduler should ignore node conditions, but I thought that would be limited to critical pods. Thanks for pointing it out. I closed the PR.
@k82cn As an alternative solution, I suggested changing the order of upgrading server components in our upgrade process, so that the controller manager is upgraded before the scheduler, but I am not sure how easy it is to implement. Do you have any alternative fix in mind?

@k82cn
Member

k82cn commented Sep 22, 2018

Yes, translation of conditions and unschedulable field to taints is done by the controller

The unschedulable taint is applied by both the controller and the kubelet; #63955 taints the node with unschedulable to avoid a race condition between the controller and the scheduler, and it depends on the kubelet's registerWithTaints to apply the taint at startup. But registerWithTaints does not work in the upgrade case, as mentioned above (we only update status when upgrading). IMO that should be fixed; it would also fix the upgrade e2e test, but it may not help in the real scenario where the master is upgraded first.

so that the controller manager is upgraded before the scheduler, but I am not sure how easy it is to implement. Do you have any alternative fix in mind?

Let me think about it; I'll propose one today.

@liggitt
Member

liggitt commented Sep 22, 2018

it depends on the kubelet's registerWithTaints to apply the taint at startup. But registerWithTaints does not work in the upgrade case, as mentioned above (we only update status when upgrading). IMO that should be fixed

The race that is important to resolve at creation time is that of a new node object appearing untainted.

That race does not apply in an upgrade scenario, where the node object already exists, and can already be observed and acted upon. Relying on the condition-to-taint controller in upgrade scenarios is reasonable.

@k82cn
Member

k82cn commented Sep 22, 2018

The race after kubelet startup is well known, but we did not handle the startup case well.

I think what we're doing is deprecating node.Spec.Unschedulable and replacing it with the unschedulable taint. We need to keep backward compatibility for it.
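As one hedged illustration of what keeping backward compatibility could look like while both representations exist, a consumer could treat the legacy field and the taint as equivalent:

package sketch

import v1 "k8s.io/api/core/v1"

// nodeAcceptsNewPods is illustrative only: honor either the legacy
// node.Spec.Unschedulable field or the unschedulable NoSchedule taint.
func nodeAcceptsNewPods(node *v1.Node) bool {
	if node.Spec.Unschedulable {
		return false
	}
	for _, t := range node.Spec.Taints {
		if t.Key == "node.kubernetes.io/unschedulable" && t.Effect == v1.TaintEffectNoSchedule {
			return false
		}
	}
	return true
}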
