Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

An example is missed, related to eventRecordQPS #1104

Open
winkee01 opened this issue Mar 1, 2022 · 6 comments
Open

An example is missed, related to eventRecordQPS #1104

winkee01 opened this issue Mar 1, 2022 · 6 comments

Comments

@winkee01
Copy link

winkee01 commented Mar 1, 2022

in node.yaml (e.g. this), there is a missing example related to eventRecordQPS, which makes it not consistent with other remediations.

 remediation: |
          If using a Kubelet config file, edit the file to set eventRecordQPS: to an appropriate level.
          If using command line arguments, edit the kubelet service file
          $kubeletsvc on each worker node and
          set the below parameter in KUBELET_SYSTEM_PODS_ARGS variable.
          Based on your system, restart the kubelet service. For example:
          systemctl daemon-reload
          systemctl restart kubelet.service

@winkee01
Copy link
Author

winkee01 commented Mar 1, 2022

I don't know what is an appropriate level, after googling, I think eventRecordQPS should be set to 0.

@mozillazg
Copy link
Collaborator

@winkee01 Thanks for reporting.

The content of remediation was copied from the CIS Kubernetes Benchmark. And the blow parameter is the --event-qps flag.

I don't know what is an appropriate level, after googling, I think eventRecordQPS should be set to 0.

It depends on your environment and business.
image
-- 《CIS Kubernetes Benchmark v1.6.1》

@joebowbeer
Copy link
Contributor

joebowbeer commented May 7, 2022

@winkee01 You should avoid 0 unless you're using the deprecated --event-qps command line.

This is confusing because when using the deprecated command line option, 0 is documented to use the default value of 5, whereas in the config file, a value of 0 is documented to disable rate limiting, which is precisely what this rule is supposed to prevent.

https://github.com/kubernetes/kubernetes/blob/master/cmd/kubelet/app/options/options.go

https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/apis/config/v1beta1/defaults.go

@joebowbeer
Copy link
Contributor

joebowbeer commented Sep 16, 2022

@mozillazg Note that these --event-qps / eventRecordQPS tests are wrong.

AFAICT, the tests are looking for a 0 in a deprecated flag (--event-qps) or in a kubelet config field (eventRecordQPS), however, the default value when both of these are missing is in fact 5, but if these are present, an explicit 0 has a very different meaning for the flag than for the config field.

For the deprecated command line flag, 0 means use the default value of 5, whereas in the config file, the default value if the field is missing is 5, but an explicit value of 0 disables rate limiting, which is precisely what this rule is supposed to prevent.

@Pluies
Copy link
Contributor

Pluies commented Sep 17, 2022

Hey @joebowbeer !

an explicit value of 0 disables rate limiting, which is precisely what this rule is supposed to prevent.

I read the CIS benchmark the other way around - losing security events due to rate-limiting is the main issue the CIS benchmark wants to prevent here, as per their introductory line: "Security relevant information should be logged", and the rationale "it is important to not restrict event creation". Rate-limiting will indeed prevent events from being created (if there's too many of them), so rate-limiting should be disabled.

The discussion about kubelet denial of service is an important caveat, but ought to be addressed by scaling up the relevant logging infrastructure rather than by setting a rate-limit here.

Does that make sense to you as well?

@joebowbeer
Copy link
Contributor

joebowbeer commented Sep 17, 2022

@Pluies I wrote more at awslabs/amazon-eks-ami#391 where this comment aligns with my interpretation of the original intent.

The original recommendation was to set --event-qps to 0 or to tune it to a value appropriate for your cluster.

Setting --event-qps to 0 was equivalent to not setting it at all. Both cases defaulted to 5 qps, which according to some articles is appropriate for 30 pods, but YMMV.

Then the flag was deprecated and the CIS Benchmark recommendation (and kube-bench) became twisted to check that either the --event-qps flag is 0 or that eventRecordQPS in the kubelet-config is 0, but these have two completely different meanings.

eventRecordQPS=0 is an explicit override to disable rate limiting, whereas omitting this field defaults to 5, which is equivalent to setting --event-qps to 0.

I can't claim that disabling rate limiting is necessarily wrong, but it is obvious from the source code and history of changes that disabling all rate limiting is not the intent of the recommendation, and it should not become the new effective default for a cluster.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants