This repository has been archived by the owner on Mar 28, 2020. It is now read-only.

e2e flake: TestSelfHosted, migrate boot member, failed to create 3 members etcd cluster #640

Closed
hongchaodeng opened this issue Jan 10, 2017 · 3 comments

@hongchaodeng
Member

23:30:40 --- FAIL: TestSelfHosted (131.13s)
23:30:40     --- FAIL: TestSelfHosted/self_hosted (0.00s)
23:30:40         --- FAIL: TestSelfHosted/self_hosted/migrate_boot_member_to_self_hosted_cluster (131.13s)
23:30:40         	self_hosted_test.go:93: etcdserver is ready
23:30:40         	util.go:369: 2017-01-10 23:28:29.846520848 +0000 UTC created etcd cluster: etcd-test-seed-8zlik
23:30:40         	util.go:369: 2017-01-10 23:28:29.855348139 +0000 UTC waiting size (3), etcd pods: []
23:30:40         	util.go:369: 2017-01-10 23:28:39.855591082 +0000 UTC waiting size (3), etcd pods: [etcd-test-seed-8zlik-0000]
23:30:40         	util.go:369: 2017-01-10 23:28:49.855929774 +0000 UTC waiting size (3), etcd pods: [etcd-test-seed-8zlik-0000]
23:30:40         	util.go:369: 2017-01-10 23:28:59.856797651 +0000 UTC waiting size (3), etcd pods: [etcd-test-seed-8zlik-0000]
23:30:40         	util.go:369: 2017-01-10 23:29:09.855790734 +0000 UTC waiting size (3), etcd pods: [etcd-test-seed-8zlik-0000]
23:30:40         	util.go:369: 2017-01-10 23:29:19.855706347 +0000 UTC waiting size (3), etcd pods: [etcd-test-seed-8zlik-0000]
23:30:40         	util.go:369: 2017-01-10 23:29:29.855725213 +0000 UTC waiting size (3), etcd pods: [etcd-test-seed-8zlik-0000]
23:30:40         	util.go:369: 2017-01-10 23:29:39.85573673 +0000 UTC waiting size (3), etcd pods: [etcd-test-seed-8zlik-0000]
23:30:40         	util.go:369: 2017-01-10 23:29:49.859968118 +0000 UTC waiting size (3), etcd pods: [etcd-test-seed-8zlik-0000]
23:30:40         	util.go:369: 2017-01-10 23:29:59.883416461 +0000 UTC waiting size (3), etcd pods: [etcd-test-seed-8zlik-0000]
23:30:40         	util.go:369: 2017-01-10 23:30:09.856611892 +0000 UTC waiting size (3), etcd pods: [etcd-test-seed-8zlik-0000]
23:30:40         	util.go:369: 2017-01-10 23:30:19.862667777 +0000 UTC waiting size (3), etcd pods: [etcd-test-seed-8zlik-0000]
23:30:40         	self_hosted_test.go:125: failed to create 3 members etcd cluster: still failing after 12 retries
23:30:40         	util.go:215: pod (etcd-test-seed-8zlik-0000): status (Running), cmd ([/bin/sh -c sleep 5; /usr/local/bin/etcd --data-dir=/var/etcd/data --name=etcd-test-seed-8zlik-0000 --initial-advertise-peer-urls=http://$(MY_POD_IP):2380 --listen-peer-urls=http://$(MY_POD_IP):2380 --listen-client-urls=http://$(MY_POD_IP):2379 --advertise-client-urls=http://$(MY_POD_IP):2379 --initial-cluster=default=http://10.240.0.10:12380,etcd-test-seed-8zlik-0000=http://$(MY_POD_IP):2380 --initial-cluster-state=existing])
23:30:40         	util.go:369: 2017-01-10 23:30:19.946465543 +0000 UTC waiting pod (etcd-test-seed-8zlik-0000) to be deleted.
23:30:40         	util.go:254: pod (etcd-test-seed-8zlik-0000) status.phase is (Running): init container status:
23:30:40         		add-member: Terminated: message () reason (Completed)
23:30:40         		container status:
23:30:40         	util.go:369: 2017-01-10 23:30:24.937108328 +0000 UTC waiting pod (etcd-test-seed-8zlik-0000) to be deleted.
23:30:40         	util.go:254: pod (etcd-test-seed-8zlik-0000) status.phase is (Running): init container status:
23:30:40         		add-member: Terminated: message () reason (Completed)
23:30:40         		container status:
23:30:40         	util.go:369: 2017-01-10 23:30:29.934424582 +0000 UTC waiting pod (etcd-test-seed-8zlik-0000) to be deleted.
23:30:40         	util.go:254: pod (etcd-test-seed-8zlik-0000) status.phase is (Running): init container status:
23:30:40         		add-member: Terminated: message () reason (Completed)
23:30:40         		container status:
23:30:40         	util.go:369: 2017-01-10 23:30:34.93466909 +0000 UTC waiting pod (etcd-test-seed-8zlik-0000) to be deleted.
23:30:40         	util.go:254: pod (etcd-test-seed-8zlik-0000) status.phase is (Running): init container status:
23:30:40         		add-member: Terminated: message () reason (Completed)
23:30:40         		container status:
23:30:40         	util.go:369: 2017-01-10 23:30:39.933312437 +0000 UTC waiting pod (etcd-test-seed-8zlik-0000) to be deleted.
23:30:40         	util.go:254: pod (etcd-test-seed-8zlik-0000) status.phase is (Running): init container status:
23:30:40         		add-member: Terminated: message () reason (Completed)
23:30:40         		container status:
23:30:40         	self_hosted_test.go:120: fail to wait pods deleted: still failing after 5 retries
23:30:40 FAIL
@hongchaodeng
Member Author

From the operator logs, this cluster failed to migrate:

time="2017-01-10T23:28:56Z" level=error msg="cluster failed to setup: etcdcli failed to remove boot member: grpc: timed out when dialing" cluster-name=etcd-test-seed-8zlik pkg=cluster

@hongchaodeng
Member Author

hongchaodeng commented Jan 10, 2017

From the operator logs, there is another self-hosted cluster that should also fail:

time="2017-01-10T23:29:00Z" level=error msg="member peerURL (http://10.240.0.6:2380): unexpected unready member for selfhosted cluster" cluster-name=test-etcd-2qlnv pkg=cluster

It doesn't show up in the test output though.

@hongchaodeng
Member Author

Not seeing this anymore.
We can revisit this if there is a real use case that needs more robustness in removing the boot member.
