Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Operator/CES: Simplify rate limit configuration #32523

Merged
merged 1 commit into from
May 27, 2024
Merged

Conversation

thorn3r
Copy link
Contributor

@thorn3r thorn3r commented May 13, 2024

The CiliumEndpointSlice controller has 6 rate limit related config options. This PR attempts to simplify configuration by reducing that set to 2: one for supplying a custom, dynamic rate limit config, and another to supply a base multiplier for simple tuning of an existing config.

Note: The supplied default dynamic rate limit may need some adjustment. This will likely be updated after future scale testing and/or PR feedback.

See individual commits for more details

Simplify rate limit configuration options for the CiliumEndpointSlice controller.

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label May 13, 2024
@thorn3r thorn3r force-pushed the cesRateLimit branch 3 times, most recently from 18d7d1a to 39d7592 Compare May 13, 2024 21:32
@thorn3r
Copy link
Contributor Author

thorn3r commented May 13, 2024

/test

@thorn3r thorn3r added the release-note/minor This PR changes functionality that users may find relevant to operating Cilium. label May 14, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label May 14, 2024
@thorn3r
Copy link
Contributor Author

thorn3r commented May 14, 2024

/test

@thorn3r
Copy link
Contributor Author

thorn3r commented May 15, 2024

/test

@thorn3r
Copy link
Contributor Author

thorn3r commented May 15, 2024

/test

@thorn3r thorn3r marked this pull request as ready for review May 16, 2024 14:56
@thorn3r thorn3r requested review from a team as code owners May 16, 2024 14:56
@thorn3r thorn3r force-pushed the cesRateLimit branch 2 times, most recently from dda13df to cb4443c Compare May 21, 2024 21:05
@thorn3r
Copy link
Contributor Author

thorn3r commented May 21, 2024

oof would help if I remembered to run the make targets in the correct order 😅 fixed a comment in values.yaml to avoid flagging ciliumEndpointSlice.enabled as a spelling mistake.

@tklauser
Copy link
Member

@tklauser is that the case for beta features as well? The biggest impact here would probably be the helm change, since there are probably few users with custom rate limiting configured (the options are undocumented except for cmdref).

Ah, I see. I wasn't considering that this is only affecting a beta feature.

Agreed on deprecation vs removal for the helm change, and happy to do the same for the rate limit options as well if you think that's necessary

If the options are for a beta feature and not documented outside of the cmdref I'd say it's probably okay to just deprecate the Helm values and replace the options. But I'm not entirely sure of the policy in that regard, so might be good to confirm with maintainers and/or in the #development channel on slack.

@thorn3r
Copy link
Contributor Author

thorn3r commented May 23, 2024

/test

Copy link
Member

@giorio94 giorio94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, always in favor of simplifying the Cilium's configuration!

I've left a few suggestions inline, mainly about the possibility of making the default settings more explicit, further removing the multiplier setting, and clarifying the semantic around the deprecated wording.

install/kubernetes/cilium/templates/validate.yaml Outdated Show resolved Hide resolved
operator/pkg/ciliumendpointslice/endpointslice.go Outdated Show resolved Hide resolved
operator/pkg/ciliumendpointslice/rate_limit.go Outdated Show resolved Hide resolved
test/helpers/kubectl.go Outdated Show resolved Hide resolved
install/kubernetes/cilium/values.yaml.tmpl Outdated Show resolved Hide resolved
install/kubernetes/cilium/values.yaml.tmpl Outdated Show resolved Hide resolved
operator/pkg/ciliumendpointslice/rate_limit.go Outdated Show resolved Hide resolved
install/kubernetes/cilium/values.yaml.tmpl Outdated Show resolved Hide resolved
Documentation/operations/upgrade.rst Outdated Show resolved Hide resolved
@thorn3r thorn3r force-pushed the cesRateLimit branch 3 times, most recently from e5b8f11 to 2970589 Compare May 24, 2024 01:12
@thorn3r
Copy link
Contributor Author

thorn3r commented May 24, 2024

/test

@thorn3r thorn3r force-pushed the cesRateLimit branch 2 times, most recently from 2ccd753 to 520bf68 Compare May 24, 2024 02:22
Copy link
Member

@tklauser tklauser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @thorn3r!

@tklauser
Copy link
Member

/test

Copy link
Contributor

@lambdanis lambdanis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docs good

Copy link
Member

@giorio94 giorio94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks! Just a couple of final nits inline.

operator/pkg/ciliumendpointslice/rate_limit.go Outdated Show resolved Hide resolved
operator/pkg/ciliumendpointslice/rate_limit.go Outdated Show resolved Hide resolved
operator/pkg/ciliumendpointslice/cell.go Outdated Show resolved Hide resolved
@thorn3r
Copy link
Contributor Author

thorn3r commented May 24, 2024

Changed the config parsing to use a decoder and disallow unknown fields to avoid unexpected behavior if a user enters a typo and the intended field remains 0. Added a short test for this as well.

Cilium Operator has 6 flags related to configuring the client work queue
rate-limiter for the CiliumEndpointSlice controller. This commit
deprecates the existing flags and adds a single new flag for supplying a
custom dynamic rate limit configuration. The following flags are now deprecated
and will be removed in a future release:

```
--ces-write-qps-limit
--ces-write-qps-burst
--ces-enable-dynamic-rate-limit
--ces-dynamic-rate-limit-nodes
--ces-dynamic-rate-limit-qps-limit
--ces-dynamic-rate-limit-qps-burst
```

The `enableCiliumEndpointSlice` field in the helm values
has been deprecated and replaced by `ciliumEndpointSlice.enabled`.
`enableCiliumEndpointSlice` will be removed in a future release.

The static rate limit options have been removed in favor of always using
a dynamic rate limit. The default rate limits defined in
`ciliumEndpointSlice.rateLimits` should be sufficient
for most clusters, however a custom configuration can be supplied for
specific use-cases.

The new flag 'ces-rate-limits' accepts a string containing
a JSON list of 'rateLimit' objects, each containing a limit and burst
setting, and the number of nodes at which the rate limit should be
applied. Example:

"[{\"nodes\": 5, \"limit\": 15.0, \"burst\": 30}]"

Custom rate limits can also be supplied via the Helm
value `ciliumEndpointSlice.rateLimits`, which accepts a yaml
list of rate limit objects that will be formatted into a properly
escaped JSON string.

Signed-off-by: Tim Horner <timothy.horner@isovalent.com>
@thorn3r
Copy link
Contributor Author

thorn3r commented May 25, 2024

/test

@aanm aanm enabled auto-merge May 27, 2024 08:28
@aanm aanm added this pull request to the merge queue May 27, 2024
Merged via the queue into cilium:main with commit 04b4471 May 27, 2024
63 of 64 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note/minor This PR changes functionality that users may find relevant to operating Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants