
Re-calculate limits with k8s/vertical-pod-autoscaler #5494

Closed
denis-tingaikin opened this issue Apr 14, 2022 · 7 comments

@denis-tingaikin
Member

Our first limits were set in #727

That was good as a first step, but it no longer holds up: we still see limit-related issues on different systems.

Solution

Re-calculate limits with https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler
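
For reference, a minimal sketch of a VPA object in recommendation-only mode (`updateMode: "Off"`), which publishes suggested requests in its status without evicting pods. The target name, kind, and namespace (`nsmgr`, `DaemonSet`, `nsm`) are assumptions about a typical NSM deployment, not taken from this issue:

```yaml
# Recommendation-only VPA: collects usage and publishes suggested
# requests in its status without restarting the target pods.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nsmgr-vpa
  namespace: nsm           # assumed namespace
spec:
  targetRef:
    apiVersion: apps/v1
    kind: DaemonSet        # nsmgr is assumed to run as a DaemonSet
    name: nsmgr
  updatePolicy:
    updateMode: "Off"      # report recommendations only; do not evict pods
```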

@denis-tingaikin
Member Author

@edwarnicke Can we schedule this for v1.4.0?

@denis-tingaikin
Member Author

This is becoming quite pressing: many users are hitting limits that are set too low.

/cc @glazychev-art

@NikitaSkrynnik
Collaborator

Subtasks

  • Check NSMgr limits
  • Check Forwarder limits
  • Check Registry limits

@denis-tingaikin
Member Author

@edwarnicke

We have started working on this, because many customers report that the limits on some components are too low.

@denis-tingaikin
Member Author

An example of the vertical-pod-autoscaler at work:

[image: vertical-pod-autoscaler recommendation output]
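
For readers without the screenshot: VPA recommendations surface in the object's status (e.g. via `kubectl describe vpa`). The field shape below matches the VPA API; the container name and all quantities are hypothetical placeholders, not the measured values from the image:

```yaml
status:
  recommendation:
    containerRecommendations:
    - containerName: nsmgr   # hypothetical container name
      lowerBound:            # all quantities below are placeholders
        cpu: 25m
        memory: 64Mi
      target:                # the value VPA suggests as the request
        cpu: 100m
        memory: 128Mi
      upperBound:
        cpu: 500m
        memory: 256Mi
```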

@NikitaSkrynnik
Collaborator

NikitaSkrynnik commented Dec 22, 2022

Subtasks

  • Check find request performance in registry-k8s ~ 3h
  • Check NSM component limits on 10 NSC and NSE ~ 2h
  • Check NSM component limits on 30 NSC and NSE ~ 2h
  • Check NSM component limits on 60 NSC and NSE ~ 2h
  • Test ping time from NSC to NSE for 10, 30, 60 NSC and NSE ~ 2h
  • Test 10, 30, 60 NSC and NSE with disabled dataplane healing ~ 3h
  • Test NSCs and NSEs on one node ~ 2h

@NikitaSkrynnik
Collaborator

Recommendations from VPA for 10, 20, 30, 40 NSCs and NSEs (Kernel2Ethernet2Kernel example, dataplane healing disabled)
[image: VPA recommendations for 10, 20, 30, 40 NSCs and NSEs]
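
Since NSM ships static manifests, one way to consume such recommendations is to copy the VPA `target` values back into the component spec as fixed requests and limits. A hedged sketch only; the container name and quantities are placeholders, not the values from the table above:

```yaml
# In the component's pod spec: pin requests to the VPA "target"
# and leave headroom in the limit. Values are illustrative only.
containers:
- name: forwarder            # hypothetical container name
  resources:
    requests:
      cpu: 100m              # placeholder, not a measured value
      memory: 128Mi
    limits:
      cpu: 500m
      memory: 256Mi
```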

40 NSCs and NSEs is the maximum at which all 40 pings run normally. If we deploy 50 NSCs and NSEs, some pings stop working even though all requests completed successfully.

@edwarnicke Should we investigate the issue with 50 NSCs and NSEs now?

Here are logs from NSM with 50 NSCs and NSEs:
log.zip
