
Proposal Apply Rate spikes #8096

Closed
gyuho opened this issue Jun 14, 2017 · 8 comments
@gyuho
Contributor

gyuho commented Jun 14, 2017

@xiang90 and I found the root cause and have a solution. Creating this issue for reference.

Our kubemark testing cluster showed spikes in sum(rate(etcd_server_proposals_applied_total[1m])).

[screenshot: proposal apply rate spikes]

Logs around the spike:

# node 1
2017-06-14 17:01:50.527744 I | etcdserver: start to snapshot (applied: 2900060, lastsnap: 2800057)
2017-06-14 17:01:50.536900 I | etcdserver: saved snapshot at index 2900060
2017-06-14 17:01:50.537312 I | etcdserver: compacted raft log at 2895060
2017-06-14 17:01:55.759008 I | pkg/fileutil: purged file /root/etcd/member/snap/0000000000000006-0000000000249f35.snap successfully
2017-06-14 17:02:29.528986 I | mvcc: store.index: compact 2228138
2017-06-14 17:02:29.649299 I | mvcc: finished scheduled compaction at 2228138 (took 117.021781ms)

# node 2
2017-06-14 17:01:50.527419 I | etcdserver: start to snapshot (applied: 2900060, lastsnap: 2800057)
2017-06-14 17:01:50.536373 I | etcdserver: saved snapshot at index 2900060
2017-06-14 17:01:50.536692 I | etcdserver: compacted raft log at 2895060
2017-06-14 17:02:17.497063 I | pkg/fileutil: purged file /root/etcd/member/snap/0000000000000006-0000000000249f35.snap successfully
2017-06-14 17:02:29.528974 I | mvcc: store.index: compact 2228138
2017-06-14 17:02:29.651662 I | mvcc: finished scheduled compaction at 2228138 (took 118.936079ms)
2017-06-14 17:03:12.529455 N | compactor: Starting auto-compaction at revision 2117484
2017-06-14 17:03:12.531058 N | compactor: Finished auto-compaction at revision 2117484
2017-06-14 17:04:59.707802 I | raft: 88087a036e7c60e1 [logterm: 9, index: 2943624, vote: 88087a036e7c60e1] ignored MsgVote from 69df290beb2b8a14 [logterm: 9, index: 2943623] at term 9: lease is not expired (remaining ticks: 7)

# node 3
2017-06-14 16:52:29.496868 I | mvcc: finished scheduled compaction at 2207578 (took 118.841479ms)
2017-06-14 16:57:29.451265 I | mvcc: store.index: compact 2217813
2017-06-14 16:57:29.574574 I | mvcc: finished scheduled compaction at 2217813 (took 119.858753ms)
2017-06-14 17:01:50.527964 I | etcdserver: start to snapshot (applied: 2900060, lastsnap: 2800056)
2017-06-14 17:01:50.536614 I | etcdserver: saved snapshot at index 2900060
2017-06-14 17:01:50.536890 I | etcdserver: compacted raft log at 2895060
2017-06-14 17:02:04.694354 I | pkg/fileutil: purged file /root/etcd/member/snap/0000000000000006-0000000000249f34.snap successfully
2017-06-14 17:02:29.529014 I | mvcc: store.index: compact 2228138
2017-06-14 17:02:29.649298 I | mvcc: finished scheduled compaction at 2228138 (took 117.061855ms)

# node 4
2017-06-14 16:57:29.574395 I | mvcc: finished scheduled compaction at 2217813 (took 120.044099ms)
2017-06-14 17:01:50.528166 I | etcdserver: start to snapshot (applied: 2900060, lastsnap: 2800056)
2017-06-14 17:01:50.546346 I | etcdserver: saved snapshot at index 2900060
2017-06-14 17:01:50.546639 I | etcdserver: compacted raft log at 2895060
2017-06-14 17:02:15.691987 I | pkg/fileutil: purged file /root/etcd/member/snap/0000000000000006-0000000000249f34.snap successfully
2017-06-14 17:02:29.529039 I | mvcc: store.index: compact 2228138
2017-06-14 17:02:29.659175 I | mvcc: finished scheduled compaction at 2228138 (took 126.962864ms)

# node 5
2017-06-14 17:01:50.528009 I | etcdserver: start to snapshot (applied: 2900060, lastsnap: 2800056)
2017-06-14 17:01:50.536932 I | etcdserver: saved snapshot at index 2900060
2017-06-14 17:01:50.537209 I | etcdserver: compacted raft log at 2895060
2017-06-14 17:02:15.967542 I | pkg/fileutil: purged file /root/etcd/member/snap/0000000000000006-0000000000249f34.snap successfully
2017-06-14 17:02:23.816398 I | wal: segmented wal file /root/etcd/member/wal/0000000000000031-00000000002cd00d.wal is created
2017-06-14 17:02:29.528944 I | mvcc: store.index: compact 2228138
2017-06-14 17:02:29.648869 I | mvcc: finished scheduled compaction at 2228138 (took 116.532844ms)
2017-06-14 17:02:46.152472 I | pkg/fileutil: purged file /root/etcd/member/wal/000000000000002c-000000000027ce1f.wal successfully
2017-06-14 17:04:59.707762 I | raft: 3213e0b0f8d70928 [term: 9] received a MsgVote message with higher term from 69df290beb2b8a14 [term: 10]
2017-06-14 17:04:59.707798 I | raft: 3213e0b0f8d70928 became follower at term 10

Nothing unusual stands out in the etcd server logs.

So we investigated the WAL entries and found that all of the spikes come from lease revokes:

etcd-dump-logs --data-dir ./etcd-1 > etcd-1.log
etcd-dump-logs --data-dir ./etcd-2 > etcd-2.log
etcd-dump-logs --data-dir ./etcd-3 > etcd-3.log
etcd-dump-logs --data-dir ./etcd-4 > etcd-4.log
etcd-dump-logs --data-dir ./etcd-5 > etcd-5.log

   9	   2899991	norm	header:<ID:6980962663844685081 > lease_revoke:<ID:659879111853478908 > 
   9	   2899992	norm	header:<ID:6980962663844685082 > lease_revoke:<ID:659879111853503344 > 
   9	   2899993	norm	header:<ID:6980962663844685058 > lease_revoke:<ID:659879111853427852 > 
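
To quantify this, one can count the revoke entries in each dump (a quick sketch, reusing the dump files generated above):

for f in etcd-*.log; do
  echo "$f: $(grep -c lease_revoke "$f") lease revokes out of $(grep -c norm "$f") normal entries"
done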

A dashboard of rate(etcd_debugging_server_lease_expired_total[1m]) also confirms this on the run-etcd-1-2 node:

[screenshot: lease-expired rate dashboard, 2017-06-14 10:37 AM]

Further investigation of the specific logs from the run-etcd-1-2 node (the log dump from the latest snapshot only showed lease revokes):

etcd-dump-logs --data-dir ./etcd-2 -start-snap 0000000000000006-0000000000293318.snap > etcd-2-previous.log

   6	   2730283	norm	header:<ID:659879111853503345 > lease_grant:<TTL:3600 ID:659879111853503344 > 
   6	   2730284	norm	header:<ID:6980962663844573262 > txn:<compare:<target:MOD key:"/registry/services/endpoints/kube-system/kube-controller-manager" mod_revision:2106700 > success:<request_put:<key:"/registry/services/endpoints/kube-system/kube-controller-manager" value:"k8s\000\n\017\n\002v1\022\tEndpoints\022\376\002\n\373\002\n\027kube-controller-manager\022\000\032\013kube-system\"@/api/v1/namespaces/kube-system/endpoints/kube-controller-manager*$48c6d4eb-508a-11e7-bdd4-42010af0000b2\0008\000B\014\010\203\327\201\312\005\020\311\224\253\344\001b\324\001\n(control-plane.alpha.kubernetes.io/leader\022\247\001{\"holderIdentity\":\"kubernetes-kubemark-master\",\"leaseDurationSeconds\":15,\"acquireTime\":\"2017-06-13T22:47:31Z\",\"renewTime\":\"2017-06-14T15:57:53Z\",\"leaderTransitions\":0}z\000\032\000\"\000" > > failure:<request_range:<key:"/registry/services/endpoints/kube-system/kube-controller-manager" > > > 

The TTL was 3600 seconds (1 hour), and there had been a leader election an hour earlier.
When a leader election happens, we renew all the leases, so the renewed leases are all revoked at the same moment an hour later, hence the spikes.

We plan to randomize leases on recovery, in addition to rate limiting lease revokes.
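
A minimal Go sketch of the randomization idea (hypothetical names, not the actual etcd patch): spread the first post-recovery expiry of each lease over an extra random window so the revokes no longer line up.

package lease

import (
	"math/rand"
	"time"
)

// Lease is a stripped-down stand-in for etcd's lease type.
type Lease struct {
	TTL    int64 // seconds
	expiry time.Time
}

// refreshRandomized sets the first post-recovery expiry to now+TTL plus a
// random extension of up to 10% of the TTL, so leases that were all renewed
// at the same leader election stop expiring in the same instant.
func (l *Lease) refreshRandomized() {
	extend := time.Duration(rand.Int63n(l.TTL/10+1)) * time.Second
	l.expiry = time.Now().Add(time.Duration(l.TTL)*time.Second + extend)
}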

@xiang90
Contributor

xiang90 commented Jun 14, 2017

xref: kubernetes/kubernetes#47532

@xinxiaogang

@gyuho @xiang90 So what's the solution?

@gyuho
Contributor Author

gyuho commented Jun 14, 2017

@xinxiaogang

We plan to randomize leases on recovery, in addition to rate limiting lease revokes.

gyuho added a commit to gyuho/etcd that referenced this issue Jun 14, 2017
Randomize the very first expiry on lease recovery
to prevent recovered leases from expiring all at
the same time.

Address etcd-io#8096.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>
@gyuho
Contributor Author

gyuho commented Jun 16, 2017

I manually triggered a leader election with master, and we don't see the spikes anymore.

[screenshot: proposal apply rate after the fix, 2017-06-15 5:54 PM]

@xiang90
Contributor

xiang90 commented Jun 16, 2017

You need to wait one or two hours to allow leases to accumulate before triggering an election, then wait another hour to check for spikes.

@xiang90
Contributor

xiang90 commented Jun 16, 2017

Also, k8s tests need to be running against etcd.

@xiang90 xiang90 added this to the v3.3.0 milestone Jun 16, 2017
@armstrongli

@gyuho The spike doesn't appear after every leader election. Actually, there are events generated every second in a Kubernetes cluster.
The proposed solution, "We plan to randomize leases on recovery, in addition to rate limiting lease revokes.", can reduce the symptoms, I believe.

@xiang90
Contributor

xiang90 commented Jun 16, 2017

I can confirm it is fixed with our new test results. @heyitsanthony is working on a better smoothing algorithm, though.

@xiang90 xiang90 closed this as completed Jun 16, 2017
gyuho added a commit that referenced this issue Jun 23, 2017
Instead of unconditionally randomizing, extend leases on promotion
if too many leases expire within the same time span. If the server
has few leases or spread out expires, there will be no extension.

Squashed previous commits for #8149.

Author: Anthony Romano <anthony.romano@coreos.com>

This is a combination of 4 commits below:

lease: randomize expiry on initial refresh call

Randomize the very first expiry on lease recovery
to prevent recovered leases from expiring all at
the same time.

Address #8096.

integration: remove lease exist checking on randomized expiry

Lease with TTL 5 should be renewed with randomization,
thus it's still possible to exist after 3 seconds.

lessor: extend leases on promote if expires will be rate limited

Instead of unconditionally randomizing, extend leases on promotion
if too many leases expire within the same time span. If the server
has few leases or spread out expires, there will be no extension.

Revert "integration: remove lease exist checking on randomized expiry"

This reverts commit 95bc33f. The new
lease extension algorithm should pass this test.
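
The "extend leases on promote if expires will be rate limited" idea from the commit above, as a rough Go sketch (assumed names and rate limit; the real logic lives in etcd's lessor Promote path):

package lease

import (
	"sort"
	"time"
)

const revokesPerSecond = 1000 // assumed rate limit, not etcd's actual constant

// smoothExpiries walks the expiry times in order; whenever a one-second
// window already holds revokesPerSecond expiries, the remaining leases in
// that window are pushed into the next second. Leases that are already
// spread out are left untouched, matching the commit message above.
func smoothExpiries(expiries []time.Time) []time.Time {
	sort.Slice(expiries, func(i, j int) bool { return expiries[i].Before(expiries[j]) })
	out := make([]time.Time, 0, len(expiries))
	var window time.Time
	n := 0
	for _, t := range expiries {
		if sec := t.Truncate(time.Second); sec.After(window) {
			window, n = sec, 0 // a new one-second window opens
		}
		if n >= revokesPerSecond {
			window = window.Add(time.Second) // window is full; spill over
			n = 0
		}
		if t.Before(window) {
			t = window // this lease was pushed out of its full window
		}
		out = append(out, t)
		n++
	}
	return out
}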
yudai pushed a commit to yudai/etcd that referenced this issue Oct 5, 2017
Randomize the very first expiry on lease recovery
to prevent recovered leases from expiring all at
the same time.

Address etcd-io#8096.

Signed-off-by: Gyu-Ho Lee <gyuhox@gmail.com>