
Give the “Compactor” work to etcd #46705

Closed
oikomi opened this issue May 31, 2017 · 35 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery.

Comments

@oikomi

oikomi commented May 31, 2017

Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see https://kubernetes.io/docs/tasks/debug-application-cluster/troubleshooting/.): no

What keywords did you search in Kubernetes issues before filing this one? (If you have found any duplicates, you should instead reply there.): etcd compact


Is this a BUG REPORT or FEATURE REQUEST? (choose one): BUG REPORT

Kubernetes version (use kubectl version):
[ ~]$ kubectl version
Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.0", GitCommit:"fff5156092b56e6bd60fff75aad4dc9de6b6ef37", GitTreeState:"clean", BuildDate:"2017-03-28T16:36:33Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.0", GitCommit:"fff5156092b56e6bd60fff75aad4dc9de6b6ef37", GitTreeState:"clean", BuildDate:"2017-03-28T16:24:30Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):
    [~]$ cat /etc/os-release
    NAME="CentOS Linux"
    VERSION="7 (Core)"
    ID="centos"
    ID_LIKE="rhel fedora"
    VERSION_ID="7"
    PRETTY_NAME="CentOS Linux 7 (Core)"
    ANSI_COLOR="0;31"
    CPE_NAME="cpe:/o:centos:centos:7"
    HOME_URL="https://www.centos.org/"
    BUG_REPORT_URL="https://bugs.centos.org/"
    CENTOS_MANTISBT_PROJECT="CentOS-7"
    CENTOS_MANTISBT_PROJECT_VERSION="7"
    REDHAT_SUPPORT_PRODUCT="centos"
    REDHAT_SUPPORT_PRODUCT_VERSION="7"

What happened:
[~]$ tail -f /apps/logs/kubernetes/kube-apiserver.log
E0531 11:49:20.139129 2949 watcher.go:188] watch chan error: etcdserver: mvcc: required revision has been compacted
E0531 12:09:22.391360 2949 watcher.go:188] watch chan error: etcdserver: mvcc: required revision has been compacted
E0531 12:29:24.651345 2949 watcher.go:188] watch chan error: etcdserver: mvcc: required revision has been compacted
E0531 12:39:25.717216 2949 watcher.go:188] watch chan error: etcdserver: mvcc: required revision has been compacted
E0531 13:39:32.418625 2949 watcher.go:188] watch chan error: etcdserver: mvcc: required revision has been compacted
E0531 13:59:04.624372 2949 watcher.go:188] watch chan error: etcdserver: mvcc: required revision has been compacted
E0531 16:49:23.451694 2949 watcher.go:188] watch chan error: etcdserver: mvcc: required revision has been compacted
E0531 17:39:29.085406 2949 watcher.go:188] watch chan error: etcdserver: mvcc: required revision has been compacted
E0531 18:09:32.358578 2949 watcher.go:188] watch chan error: etcdserver: mvcc: required revision has been compacted
E0531 18:59:07.953836 2949 watcher.go:188] watch chan error: etcdserver: mvcc: required revision has been compacted

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know:
We should not do the compaction in k8s; just let etcd do it itself via '--auto-compaction-retention'.
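
For reference, the etcd-side alternative being suggested looks something like this (in etcd 3.1/3.2 the retention window can only be given in whole hours; finer granularity comes later, see the discussion below):

    # keep roughly one hour of MVCC history; etcd compacts older revisions on its own
    etcd --auto-compaction-retention=1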

@k8s-github-robot k8s-github-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label May 31, 2017
@0xmichalis
Contributor

@kubernetes/sig-api-machinery-misc

@k8s-ci-robot k8s-ci-robot added the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Jun 2, 2017
@k8s-github-robot k8s-github-robot removed the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jun 2, 2017
@smarterclayton
Contributor

The etcd folks were the ones who set up this compaction in Kubernetes when we moved to v3. @xiang90 @hongchaodeng

@hongchaodeng
Contributor

Hi @oikomi
What's the etcd version? How did you find this? Is it reproducible?

@hongchaodeng
Contributor

hongchaodeng commented Jun 3, 2017

Some context for the problem, from the 1.6.0 codebase:

It first does a get and sets initialRev:

wc.initialRev = getResp.Header.Revision

Then it watches from initialRev + 1:

	opts := []clientv3.OpOption{clientv3.WithRev(wc.initialRev + 1), clientv3.WithPrevKV()}
	...
	wch := wc.watcher.client.Watch(wc.ctx, wc.key, opts...)
	for wres := range wch {
		if wres.Err() != nil {
			err := wres.Err()
			// If there is an error on server (e.g. compaction), the channel will return it before closed.
			glog.Errorf("watch chan error: %v", err)

From the watch API's point of view, it can return a "revision compacted" error.

Some questions:

  • If compaction happened, will a running watcher be cut off? (No, it won't.)
  • Is the watcher supposed to never stop? It seems to create a new watcher quite frequently.

@hongchaodeng
Contributor

Note: the etcd client dependency of v1.6.0 is "v3.0.17"

@hongchaodeng
Contributor

hongchaodeng commented Jun 3, 2017

So my guess is that, between sync() and Watch(), it took too long to process the events from sync().

We really need reproduction steps.

@oikomi
Author

oikomi commented Jun 5, 2017

@hongchaodeng Hi, my etcd version is:

[~]$ etcd --version
etcd Version: 3.1.4
Git SHA: 41e52eb
Go Version: go1.7.5
Go OS/Arch: linux/amd64

We found it in our production environment; we are still looking into the cause as well.

We suggest providing an option in k8s to enable compaction and to set the compaction interval.

@xiang90
Contributor

xiang90 commented Jun 5, 2017

@oikomi

I do not think this issue has anything to do with who controls the compaction interval. Even if you let etcd itself do the compaction, you will hit the same issue.

The warning means that your watcher tried to watch from a compacted revision for some reason; it probably should not be marked as an error. The controller should do a relist and resync from the current revision.

Also, looking through the log, everything seems fine. Probably the watcher just does not get any update for 10 minutes, and when it gets cut off it tries to watch from an old revision. We should probably enable progress notifications (a new feature in etcd3) to update the watcher's revision even when there are no new events for that watcher.

/cc @wojtek-t might know better than me about this.
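
For illustration, a minimal sketch of a client-side watch with progress notifications enabled (illustrative only, against the clientv3 API of that era; this is not the actual apiserver watcher code):

package etcdwatch

import (
	"context"
	"fmt"

	"github.com/coreos/etcd/clientv3"
)

// watchWithProgress watches key from rev+1 with progress notifications
// enabled, so the client learns the server's current revision even when
// no events arrive for its key, and a later restart does not begin from
// an already-compacted revision.
func watchWithProgress(ctx context.Context, c *clientv3.Client, key string, rev int64) {
	wch := c.Watch(ctx, key,
		clientv3.WithRev(rev+1),
		clientv3.WithProgressNotify(), // periodic empty responses carrying the latest revision
	)
	for wres := range wch {
		if wres.IsProgressNotify() {
			rev = wres.Header.Revision // advance even though nothing changed for this key
			continue
		}
		for _, ev := range wres.Events {
			fmt.Printf("%s %q -> %q\n", ev.Type, ev.Kv.Key, ev.Kv.Value)
			rev = ev.Kv.ModRevision
		}
	}
}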

@wojtek-t
Member

wojtek-t commented Jun 5, 2017

Watches are generally served from the apiserver, and we keep per-resource-type history cached in the apiserver (in cacher.go). That said, assuming you are not starting the watch from a very old resource version, this should only very rarely affect the end client (though those logs are visible in the apiserver).

That said, I think we can try improving it. Basically, when we do compaction today, we compact everything up until now:
https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/apiserver/pkg/storage/etcd3/compact.go#L132
@xiang90 - would it be reasonable to e.g. compact to revision "now - 1000" or something like that?

@xiang90
Contributor

xiang90 commented Jun 5, 2017

@wojtek-t

would it be reasonable to e.g. compact to revision "now - 1000" or something like that?

It is reasonable (this is what we do internally and what we suggest for other projects). We can do this in a time-based or a number-based manner.

/cc @hongchaodeng can you take a look? or i can take a look later this week.

@hongchaodeng
Contributor

Some tradeoffs:

  • Time-based (the current option) might have the issue that there are no changes at all (really?) within the 5 minutes.
  • Number-based might have the issue that there are too many changes (unlikely with 1000?) in a short period.

It depends on the use case. I'm not sure which is better, so it is up to you @wojtek-t.

I still wonder why the watcher connection gets cut off. Because of a resync? If it didn't cut off (kept watching), then it should be fine. Another solution is to use progress notifications.

/cc @hongchaodeng can you take a look? or i can take a look later this week.

@xiang90 I will let you take a look first. If you don't have time, I can take a look later.

@wojtek-t
Member

wojtek-t commented Jun 5, 2017

@hongchaodeng - I think either I misunderstood you or you misunderstood me.
The option that I'm suggesting is to:

  • still do compaction every X minutes (as we do now)
  • but instead of compacting to the "current resource version", compact to something like "current resource version - 1000".

@hongchaodeng
Contributor

but instead of compacting to "current resource version"

The current code compacts to the revision from X minutes ago, not to the current revision. Right?

@wojtek-t
Member

wojtek-t commented Jun 5, 2017

The current code compacts to the revision from X minutes ago, not to the current revision. Right?

Ohh, if so, then I guess I misunderstood the code. If that's the case, I'm fine with that.

@xiang90
Contributor

xiang90 commented Jun 5, 2017

@wojtek-t

Yeah. If we already use a time-based approach, then it should be fine. I feel time-based makes more sense, since the main use case is to handle temporary reconnections, which should happen within minutes; beyond that, a full recovery should be triggered anyway.
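
For illustration, the time-lagged scheme described above boils down to something like this (a simplified sketch; the real compact.go additionally coordinates multiple apiservers through a shared etcd key, and the probe key name here is made up):

package compactor

import (
	"context"
	"log"
	"time"

	"github.com/coreos/etcd/clientv3"
)

// compactLoop compacts etcd every interval, but only up to the revision
// observed at the previous tick, so watchers always have at least one
// interval's worth of history available when they reconnect.
func compactLoop(ctx context.Context, c *clientv3.Client, interval time.Duration) {
	var prevRev int64
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
		}
		// Any cheap read reports the cluster's current revision in its header.
		resp, err := c.Get(ctx, "compact-probe") // hypothetical key
		if err != nil {
			continue
		}
		if prevRev > 0 {
			// Compact to the previous tick's revision, not the current one.
			if _, err := c.Compact(ctx, prevRev); err != nil {
				log.Printf("compact to %d failed (possibly already compacted elsewhere): %v", prevRev, err)
			}
		}
		prevRev = resp.Header.Revision
	}
}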

@wojtek-t
Member

wojtek-t commented Jun 5, 2017

Yeah. If we already use a time-based approach, then it should be fine. I feel time-based makes more sense, since the main use case is to handle temporary reconnections, which should happen within minutes; beyond that, a full recovery should be triggered anyway.

+1

@javipolo

javipolo commented Jun 7, 2017

This happened to me when I was updating some flags on etcd. I restarted all the etcd servers with the new flags, the kube-apiservers all got stuck with this message in the error log, and requests to the API were timing out.

The etcd servers were fine (I checked with etcdctl cluster-health and member list), and just restarting the apiservers made the problem go away...

This has happened twice in a row, so I guess it should not be hard to reproduce.

versions:
etcd 3.1.8
kube-apiserver 1.6.2

@xiang90
Contributor

xiang90 commented Jun 7, 2017

@javipolo Can you create a new issue for what you saw? I feel it may be a different thing. Also, it would be helpful if you could write down how to reproduce it in detail. Thanks.

@xiang90
Contributor

xiang90 commented Jun 7, 2017

@wojtek-t @hongchaodeng I feel we should still keep compaction inside k8s for the time being. I cannot see a strong reason to move it out.

The actionable item in this issue is probably to change the Error-level logging to Warning (or even hide the message when it is expected).

@tmjd
Contributor

tmjd commented Jul 6, 2017

One reason to remove compaction from k8s is that it requires giving k8s the root role in etcd when using RBAC. I have been setting up etcd RBAC (with certificates) and was seeing a compaction error; when I asked on the etcd IRC, I was told that the root role is necessary for compaction to work.

I'm not sure I would advocate removing it altogether, but it would be nice to have a flag to disable compaction, with the caveat that compaction then needs to be done by some other means.
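
For context, the kind of scoped setup in question looks roughly like this (role and user names are hypothetical):

    # grant the apiserver's etcd user access only to the kubernetes prefix
    ETCDCTL_API=3 etcdctl role add k8s-apiserver
    ETCDCTL_API=3 etcdctl role grant-permission --prefix=true k8s-apiserver readwrite /registry/
    ETCDCTL_API=3 etcdctl user grant-role kube-apiserver k8s-apiserver

Compact, however, is not scoped to a key range, which appears to be why (per the above) the apiserver's user ends up needing the root role as long as it performs compaction itself.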

@smarterclayton
Contributor

If we enable paging in the apiserver, we may want to increase the compaction interval on large clusters. For instance, on a cluster with 1 million keys, a JSON -> protobuf storage migration took longer than 5 minutes when tested with the paging prototype. I also suspect people will want some ranges to be retained longer, so that other tooling can watch that data and record it to historical archives. We're seeing a fair number of people who want to control compaction according to when their archiving tools complete a record, not when k8s decides.

@mml
Contributor

mml commented Aug 7, 2017

cc @jpbetz

@waeljammal

waeljammal commented Aug 25, 2017

Seeing this issue as well with 1.7.3 & etcd 3.1.8.

It also looks like the revision is not updated after a failure: I see requests go out for a revision that fails because it has been compacted, but a short while later the same revision is requested again, with the same error, instead of using the latest revision.
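
For reference, the usual client-side recovery is to pick the compaction boundary out of the failed response and resume from there (or re-list), rather than retrying the stale revision. A minimal sketch against the clientv3 API (illustrative only; the apiserver's actual recovery path re-lists instead):

package etcdwatch

import (
	"context"

	"github.com/coreos/etcd/clientv3"
	"github.com/coreos/etcd/etcdserver/api/v3rpc/rpctypes"
)

// watchFrom watches key starting at rev; on a compaction error it resumes
// from the compacted revision instead of retrying the stale one.
func watchFrom(ctx context.Context, c *clientv3.Client, key string, rev int64, out chan<- *clientv3.Event) {
	defer close(out)
	for ctx.Err() == nil {
		wch := c.Watch(ctx, key, clientv3.WithRev(rev))
		for wres := range wch {
			if err := wres.Err(); err == rpctypes.ErrCompacted {
				// The requested revision is gone; jump to the compaction
				// boundary (a real client would also re-list to fill the gap).
				rev = wres.CompactRevision
				break
			} else if err != nil {
				return
			}
			for _, ev := range wres.Events {
				out <- ev
				rev = ev.Kv.ModRevision + 1 // resume after the last event seen
			}
		}
	}
}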

@jpbetz
Contributor

jpbetz commented Sep 27, 2017

Etcd is adding support for fine-grained compaction periods, using Go Durations, in etcd-io/etcd#8563. We need to wait until that is available, since without it only whole hours can be specified. It will likely be available in etcd 3.3.
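
Once that is in, finer-grained retention should be expressible along these lines (flag shapes as proposed for etcd 3.3; treat this as a sketch until the release):

    # keep roughly 10 minutes of MVCC history (proposed etcd 3.3 syntax)
    etcd --auto-compaction-mode=periodic --auto-compaction-retention=10m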

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 18, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 17, 2018
@redbaron
Contributor

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Feb 18, 2018
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 19, 2018
@redbaron
Contributor

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 19, 2018
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 17, 2018
@redbaron
Contributor

etcd 3.3 has been available for a while now

/remove-lifecycle stale

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 18, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 18, 2018
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
