Cilium-dedicated etcd is reaching space quota limit without AUTO_COMPACTION #10663

Closed
luanguimaraesla opened this issue Jan 26, 2021 · 2 comments · Fixed by #11961
@luanguimaraesla

1. What kops version are you running? The command kops version will display this information.

1.18.2 (git-84495481e4)

2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.

Client Version: version.Info{Major:"1", Minor:"19", GitVersion:"v1.19.4", GitCommit:"d360454c9bcd1634cf4cc52d1867af5491dc9c5f", GitTreeState:"archive", BuildDate:"2020-11-25T13:19:56Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.14", GitCommit:"89182bdd065fbcaffefec691908a739d161efc03", GitTreeState:"clean", BuildDate:"2020-12-18T12:02:35Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

3. What cloud provider are you using?

AWS

4. What commands did you run? What is the simplest way to reproduce this issue?

Create a new cluster with a Cilium-dedicated etcd cluster (managed by etcd-manager) and wait for a few days. A sketch of the relevant spec is shown below.
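For reference, the dedicated cluster is declared as an extra etcdClusters entry and switched on under the Cilium networking settings. A minimal sketch of the relevant cluster.yaml fragments (the instance group name is a placeholder, and the field names are taken from the kops Cilium docs):

# hypothetical cluster.yaml fragments enabling a cilium-dedicated etcd
spec:
  etcdClusters:
  - name: cilium
    version: 3.3.10
    etcdMembers:
    - instanceGroup: master-us-east-1a   # placeholder instance group
      name: a
  networking:
    cilium:
      etcdManaged: true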

5. What happened after the commands executed?

Without the AUTO_COMPACTION options configured, after a few days the etcd cluster reaches its space quota limit of 2 GB, and Cilium stops working with the following message:

# snippet of cilium pod log
level=fatal msg="Unable to connect to kvstore" error="etcdserver: mvcc: database space exceeded" module=etcd subsys=kvstore

The etcd cluster reports the following alert:

memberID:2157140721128943973 alarm:NOSPACE 
memberID:17459609605570688463 alarm:NOSPACE
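Until compaction is in place, recovery follows the standard etcd NOSPACE procedure: compact up to the current revision, defragment each member, then clear the alarm. A sketch with etcdctl (endpoint and TLS flags are omitted and would need to match the cilium etcd members):

# hypothetical manual recovery against the cilium etcd endpoints
export ETCDCTL_API=3
rev=$(etcdctl endpoint status --write-out=json | egrep -o '"revision":[0-9]*' | egrep -o '[0-9].*')
etcdctl compact "$rev"
etcdctl defrag        # run once per member/endpoint
etcdctl alarm disarm
etcdctl alarm list    # should print nothing once the alarm is cleared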

6. What did you expect to happen?

If I edit my cluster and add the following environment variables to the manager configuration, it starts working as expected:

# snippet of the cluster.yaml spec for the etcdClusters item
  - name: cilium
    version: 3.3.10
    manager:
      env:
      - name: ETCD_AUTO_COMPACTION_MODE
        value: revision
      - name: ETCD_AUTO_COMPACTION_RETENTION
        value: "1000"

Then I could see this cluster reporting:

2021-01-26 19:40:54.223524 I | pkg/flags: recognized and used environment variable ETCD_AUTO_COMPACTION_MODE=revision
2021-01-26 19:40:54.223530 I | pkg/flags: recognized and used environment variable ETCD_AUTO_COMPACTION_RETENTION=1000
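Applying this workaround is an ordinary spec change; assuming the usual kops workflow, something like the following (the control-plane nodes have to roll so etcd-manager picks up the new env vars):

# hypothetical workflow to apply the manager env change
kops edit cluster $CLUSTER_NAME          # add the manager.env entries above
kops update cluster $CLUSTER_NAME --yes
kops rolling-update cluster $CLUSTER_NAME --yes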

This should probably be the default for all etcd clusters; at the very least, kops should have a section about this in the documentation, especially for the Cilium configuration.

@olemarkus
Member

/kind office-hours

Should this be handled on the etcd-manager side?
Also worth considering whether etcd-manager should run defrag from time to time.
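Worth noting that compaction only reclaims logical space inside the keyspace; getting the disk space back requires a defrag. If etcd-manager did this periodically, it would amount to running the equivalent of the following on each member (a sketch, not an existing etcd-manager feature; the endpoint is an assumption based on the port kops uses for the cilium etcd):

# hypothetical periodic defrag, e.g. from a timer on each control-plane node
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:4003 defrag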

@justinsb justinsb self-assigned this Apr 9, 2021
@justinsb justinsb added this to the v1.21 milestone Apr 9, 2021
@johngmyers johngmyers removed this from the v1.21 milestone Jun 10, 2021
@olemarkus
Member

The k8s API server does compaction for the other etcd clusters. Will enable auto-compaction on the cilium etcd cluster for new kops clusters.

/assign
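For context, the API server's compaction of the main and events clusters is driven by its --etcd-compaction-interval flag (default 5m0s); nothing issues compactions against the cilium cluster, so etcd's own auto-compaction has to carry that load. The env vars above map directly onto etcd flags, so an equivalent fix at the etcd command line would be (a sketch):

# equivalent etcd flags to the ETCD_AUTO_COMPACTION_* env vars above
etcd --auto-compaction-mode=revision --auto-compaction-retention=1000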
