Allow configuring nginx worker reload behaviour, to prevent multiple concurrent worker reloads which can lead to high resource usage and OOMKill #10884

Merged

Conversation

rsafonseca
Contributor

@rsafonseca rsafonseca commented Jan 19, 2024

What this PR does / why we need it:

This PR introduces the ability to configure the concurrency behaviour of chained configuration updates that require a backend reload.

It does so by requeuing jobs that require a backend reload, whenever there's already a reload in progress, and verifies if the reload is finished by counting the number of workers, as per the configuration.

The default behaviour remains unchanged for now, and this behaviour can be configured via setting an option in the configmap.

The main purpose is to prevent several worker reloads from happening before the previous reload has finished, in order to avoid steep, uncontrolled growth in resource usage, which can lead to OOMKill of the ingress-controller pods and/or CPU starvation. This is especially problematic in environments that use long-lived connections (since the old workers stay alive until the worker-shutdown-timeout is reached) and where a few configuration reloads might be triggered within that timeout.

There are a few issues regarding this problem, such as #8166, but the solutions presented introduce other problems. The sync throttling suggested in the linked issue, for example, prevents service upstream endpoints from being updated on time, even if they don't require a reload, since all configuration changes currently share the same queue and hence the same throttling. With the configuration proposed there, it could take up to ~2 minutes to remove terminating endpoints from a service, leading to HTTP 502 responses.

This method addresses the root cause of the problem: multiple worker reloads which can happen spontaneously causing a huge amount of resources to be used, leading to OOMKill.
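
A minimal sketch of the gating idea described above, for illustration only. The `reloadGate` type, its fields, and `observeWorkers` are hypothetical names, not the controller's actual code; the sketch assumes a reload counts as finished once the number of live workers is back to the configured count.

```go
package main

import (
	"errors"
	"fmt"
)

// reloadGate is a hypothetical stand-in for the controller state described
// above; it is not the actual ingress-nginx type.
type reloadGate struct {
	allowConcurrent  bool // assumed to mirror the configmap option
	workersReloading bool // set when a reload starts, cleared once old workers exit
	expectedWorkers  int  // configured number of nginx workers
}

var errReloadInProgress = errors.New("worker reload already in progress, requeuing reload")

// tryReload refuses to start a reload while a previous one is still
// draining, unless concurrent reloads are allowed.
func (g *reloadGate) tryReload() error {
	if !g.allowConcurrent && g.workersReloading {
		return errReloadInProgress
	}
	g.workersReloading = true
	return nil
}

// observeWorkers is called with the current number of live worker processes.
// Once the count is back to the configured number, the old workers have
// exited and the reload is treated as finished (an assumption of this sketch).
func (g *reloadGate) observeWorkers(live int) {
	if live <= g.expectedWorkers {
		g.workersReloading = false
	}
}

func main() {
	g := &reloadGate{expectedWorkers: 4}
	fmt.Println(g.tryReload()) // first reload starts
	g.observeWorkers(8)        // old and new workers coexist
	fmt.Println(g.tryReload()) // rejected while the old workers drain
	g.observeWorkers(4)        // old workers are gone
	fmt.Println(g.tryReload()) // accepted again
}
```

With `allowConcurrent` set to true the gate never rejects a reload, which corresponds to the unchanged default behaviour.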

A few other related issues: #8336 #8362 #8357

Types of changes

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

Running in a live cluster, with and without the option enabled.

Checklist:

  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I've read the CONTRIBUTION guide
  • I have added unit and/or e2e tests to cover my changes.
  • All new and existing tests passed.

…tiple concurrent worker reloads

Signed-off-by: Rafael da Fonseca <rafael.fonseca@wildlifestudios.com>
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. area/docs labels Jan 19, 2024
@k8s-ci-robot
Contributor

Welcome @rsafonseca!

It looks like this is your first PR to kubernetes/ingress-nginx 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/ingress-nginx has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 19, 2024
@k8s-ci-robot
Contributor

Hi @rsafonseca. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-priority size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 19, 2024

netlify bot commented Jan 19, 2024

Deploy Preview for kubernetes-ingress-nginx canceled.

Name Link
🔨 Latest commit afa51bf
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-ingress-nginx/deploys/663b5e37c31444000887bb56

Signed-off-by: Rafael da Fonseca <rafael.fonseca@wildlifestudios.com>
@rikatz
Contributor

rikatz commented Jan 21, 2024

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 21, 2024
@@ -668,6 +672,11 @@ Error: %v
//
//nolint:gocritic // the cfg shouldn't be changed, and shouldn't be mutated by other processes while being rendered.
func (n *NGINXController) OnUpdate(ingressCfg ingress.Configuration) error {
	concurrentlyReloadWorkers := n.store.GetBackendConfiguration().ConcurrentlyReloadWorkers
	if !concurrentlyReloadWorkers && n.workersReloading {
		return errors.New("worker reload already in progress, requeuing reload")
Contributor

should this be an error, or just a warning?

Contributor Author

The idea of returning an error is to reuse the existing behaviour of the queue, which will re-queue the item if an error is returned here:

if err := t.sync(key); err != nil {

@rikatz
Contributor

rikatz commented Jan 21, 2024

@rsafonseca first of all, thank you for taking care of this long standing problem, this is great!

As a suggestion: I'm not sure that using a single variable is the right approach here. You may still hit issues with multiple reloads, or with something else that causes a different behavior to overwrite the variable.

Instead, and just thinking out loud (I know it may not be an easy approach), what if, when this feature is enabled, we define a buffered channel for the reload that acts as a semaphore controlling the reloads?

I'm still not sure it makes sense, but I mean a channel that accepts just one item: a reload would add an item to the channel (and block), and the process that checks the workers would remove this item from the channel, unblocking any other attempts to reload.

You can then tell whether a reload was already requested by checking the length of this channel.
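
A rough sketch of what such a channel-based semaphore could look like. All names here (`reloadSemaphore`, `tryAcquire`, `release`, `inProgress`) are hypothetical, and the sketch uses a non-blocking acquire rather than a blocking send, on the assumption that a caller would prefer to re-queue instead of waiting.

```go
package main

import "fmt"

// reloadSemaphore is a channel with capacity 1: holding the single slot
// means a reload is in progress.
type reloadSemaphore chan struct{}

func newReloadSemaphore() reloadSemaphore { return make(reloadSemaphore, 1) }

// tryAcquire takes the slot if it is free and reports whether it succeeded.
// A blocking variant would simply be `s <- struct{}{}`.
func (s reloadSemaphore) tryAcquire() bool {
	select {
	case s <- struct{}{}:
		return true
	default:
		return false
	}
}

// release frees the slot; the process that checks the workers would call
// this once the reload has finished. Releasing an empty semaphore is a no-op.
func (s reloadSemaphore) release() {
	select {
	case <-s:
	default:
	}
}

// inProgress reports whether a reload currently holds the slot.
func (s reloadSemaphore) inProgress() bool { return len(s) == 1 }

func main() {
	sem := newReloadSemaphore()
	fmt.Println(sem.tryAcquire())                   // slot taken by the first reload
	fmt.Println(sem.tryAcquire(), sem.inProgress()) // second attempt is refused
	sem.release()                                   // worker check reports completion
	fmt.Println(sem.tryAcquire())                   // next reload may proceed
}
```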

@tao12345666333 you may have a better view than me here, as I honestly suck at concurrency problems :)

@rikatz
Contributor

rikatz commented Jan 21, 2024

/triage accepted
/priority important-soon

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority labels Jan 21, 2024
@rikatz
Contributor

rikatz commented Jan 21, 2024

Also, another suggestion: as this is feature flagged (thanks!!!), let's try using a positive behavior flag. Instead of a flag that defaults to true and disables the feature, maybe use something like "EnableSerialReloads = false" as the default, and then anyone who wants to enable it sets it to true.

Booleans can make behavior hard to define (which is why a bunch of APIs use pointers, so you have true, false and undefined), so at least it is clear that if someone wants to enable this feature, they use something like an "enable-something" flag.

This is also better for helm charts, etc.
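
To illustrate the point about booleans and pointers, a small sketch. The `settings` struct, the `SerialReloads` pointer field, and `resolve` are hypothetical and used only for this example; only the `EnableSerialReloads` name comes from the suggestion above.

```go
package main

import "fmt"

// settings is a hypothetical config struct used only for this illustration.
type settings struct {
	// Positive flag: the zero value (false) leaves the feature off, so an
	// unset option and an explicit "false" mean the same thing.
	EnableSerialReloads bool
	// Tri-state alternative: nil means "not set", which the caller must
	// then resolve to a default.
	SerialReloads *bool
}

// resolve returns the effective value of a tri-state option.
func resolve(opt *bool, def bool) bool {
	if opt == nil {
		return def
	}
	return *opt
}

func main() {
	var s settings // nothing configured
	fmt.Println(s.EnableSerialReloads, resolve(s.SerialReloads, false))

	on := true
	s.SerialReloads = &on // explicitly enabled
	fmt.Println(resolve(s.SerialReloads, false))
}
```

With a positive "enable" flag the plain `bool` is enough, because its zero value already matches the desired default.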

Thanks!

Signed-off-by: Rafael da Fonseca <rafael.fonseca@wildlifestudios.com>
@k8s-ci-robot k8s-ci-robot added the area/helm Issues or PRs related to helm charts label Jan 22, 2024
@rsafonseca
Contributor Author

Hi @rikatz, thanks for reviewing :)

Regarding concurrency and using a single variable, I don't think this would be a problem given the way things are designed overall, but let's see if anyone else has concerns here. I don't think it's worth the trouble to create a channel just for this, as the existing queue system already has a re-queue mechanism, which is being employed here.

Regarding using a positive behaviour flag, that is a good point! I've refactored it using your name suggestion and added a helm chart option for it as well :)

Signed-off-by: Rafael da Fonseca <rafael.fonseca@wildlifestudios.com>
@tao12345666333
Member

/assign

Thanks, I will take a look this week.

@rsafonseca
Contributor Author

Sorry for the long delay. Let me take a look

Any news @tao12345666333 ? I'm waiting to get this merged so I can move off my fork and upgrade ;)

@tao12345666333
Member

Thank you for your contributions!
I will finish the review this week.

BTW this PR needs a rebase

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 24, 2024
@rsafonseca
Contributor Author

rsafonseca commented Apr 24, 2024

BTW this PR needs a rebase

rebased @tao12345666333 :)

looks like pull-ingress-nginx-codegen job is failing, but it's unrelated to this PR

@tao12345666333
Member

/retest

internal/ingress/controller/nginx.go (outdated, resolved)
charts/ingress-nginx/values.yaml (outdated, resolved)
@rikatz
Contributor

rikatz commented Apr 29, 2024

@rsafonseca as we have bumped a bunch of libraries, please rebase over main to be sure codegen job will run using golang v1.22 and the latest k8s versions during tests :)

tks

@rsafonseca
Contributor Author

@rsafonseca as we have bumped a bunch of libraries, please rebase over main to be sure codegen job will run using golang v1.22 and the latest k8s versions during tests :)

tks

I did rebase 5 days ago @rikatz, and my branch is still up to date with main. Could that job be pulling from main in my fork instead? I hadn't synced that one, since it wasn't the branch I was merging, but I've synced it now.

…elm chart option

Signed-off-by: Rafael da Fonseca <rafael.fonseca@wildlifestudios.com>
Signed-off-by: Rafael da Fonseca <rafael.fonseca@wildlifestudios.com>
@tao12345666333
Member

/test pull-ingress-nginx-codegen

@tao12345666333
Member

I have created a PR #11344
It will unblock the failures of the codegen job.

@tao12345666333
Member

#11344 has been merged. Please rebase and the codegen job will pass.

@rsafonseca
Contributor Author

Looks good @tao12345666333 , thanks for fixing this in main :)

@rsafonseca
Contributor Author

Any blockers for merging this now? Can we get it shipped? :)

Member

@tao12345666333 tao12345666333 left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 14, 2024
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rsafonseca, tao12345666333

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 14, 2024
@tao12345666333
Member

Thank you for your contributions! @rsafonseca

@k8s-ci-robot k8s-ci-robot merged commit 4e11074 into kubernetes:main May 14, 2024
30 checks passed
@rsafonseca rsafonseca deleted the configure_worker_reload_concurrency branch May 15, 2024 09:27
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/docs area/helm Issues or PRs related to helm charts cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.

5 participants