Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New K8s clients provider for Helm plugin #5358

Merged
merged 1 commit into from Sep 20, 2022

Conversation

castelblanque
Copy link
Collaborator

Signed-off-by: Rafa Castelblanque rcastelblanq@vmware.com

Description of the change

This PR adds a new logic for obtaining the clients for K8s APIs. It resides in the /pkg package, but it is only applied to the Helm plugin by now.
It should speed up the response times for Helm plugin: a lot when using Pinniped, not so much without Pinniped.

Problem solved here was found to be the root cause for #4919.

In the business logic of Helm plugin (at least), usually only one type of client is required per scenario (e.g. the typed client when retrieving available packages summary).
When using the old client getter, all types of clients were being retrieved due to the invocation of clientGetterHelper, even if only one was needed.

It turns out that when creating the dynamic client, many requests are done at the same time to the K8s APIs to get the resources, which probably is triggering the throttling mechanism. There is no throttling when pinniped-proxy is not in the picture, but when it is, all requests go through it, with throttling. List of URLs invoked:

/api/v1?timeout=32s
/apis?timeout=32s
/apis/admissionregistration.k8s.io/v1?timeout=32s
/apis/apiextensions.k8s.io/v1?timeout=32s
/apis/apiregistration.k8s.io/v1?timeout=32s
/apis/apps/v1?timeout=32s
/apis/authentication.concierge.pinniped.dev/v1alpha
/apis/authentication.k8s.io/v1?timeout=32s
/apis/authorization.k8s.io/v1?timeout=32s
/apis/autoscaling/v1?timeout=32s
/apis/autoscaling/v2?timeout=32s
/apis/autoscaling/v2beta2?timeout=32s
/apis/batch/v1?timeout=32s
/apis/certificates.k8s.io/v1?timeout=32s
/apis/config.concierge.pinniped.dev/v1alpha1?timeou
/apis/coordination.k8s.io/v1?timeout=32s
/apis/discovery.k8s.io/v1?timeout=32s
/apis/events.k8s.io/v1?timeout=32s
/apis/flowcontrol.apiserver.k8s.io/v1beta1?timeout=
/apis/flowcontrol.apiserver.k8s.io/v1beta2?timeout=
/apis/identity.concierge.pinniped.dev/v1alpha1?time
/apis/kubeapps.com/v1alpha1/apprepositories
/apis/login.concierge.pinniped.dev/v1alpha1?timeout
/apis/networking.k8s.io/v1?timeout=32s
/apis/node.k8s.io/v1?timeout=32s
/apis/policy/v1?timeout=32s
/apis/rbac.authorization.k8s.io/v1?timeout=32s
/apis/scheduling.k8s.io/v1?timeout=32s
/apis/storage.k8s.io/v1?timeout=32s
/apis/storage.k8s.io/v1beta1?timeout=32s

When creating all clients, even if only one was needed, the mentioned URLs are retrieved more than once, accumulating delays.

With this PR, the new clients provider allows to independently create the client according to the needs. All clients can still be retrieved together with ClientProviderInterface.GetClients, but there are dedicated methods for each type.
Difference with previous implementation is that the function building the clients does not return the instances, but returns functions for each one to be invoked if needed.

Benefits

Response time in many calls is improved from 5-9 seconds down to <1s when using Pinniped and Helm plugin.

Possible drawbacks

N/A

Applicable issues

Signed-off-by: Rafa Castelblanque <rcastelblanq@vmware.com>
@netlify
Copy link

netlify bot commented Sep 19, 2022

Deploy Preview for kubeapps-dev canceled.

Name Link
🔨 Latest commit 3a7e38f
🔍 Latest deploy log https://app.netlify.com/sites/kubeapps-dev/deploys/63288f181cf6710009c78202

@castelblanque castelblanque marked this pull request as ready for review September 19, 2022 15:55
Copy link
Contributor

@antgamdia antgamdia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Thanks for fixing this. Now Kubeapps is usable again in pinniped-based environments!!

Some general questions:

  1. Is this change also being used by the rest of the plugins ? If not, do you think it should be?
  2. If I'm getting this ok, the problem was the Dynamic interface being called always. You reduced the calls to the strictly necessary ones... but, in case we need the Dynamic client for a given operation: would the "get dynamic client" operation be throttled as it was?
  3. If you think so, have you taken a look at the dynamic client QPS/Burst options. It seems there are some .WithDiscoveryBurst(xxxx) and .WithDiscoveryQPS(yyy) and perhaps we are not setting these options (even if we allow tuning the Burst and QPS for the typed client operations, but not for the discovery API ones.

@castelblanque
Copy link
Collaborator Author

Is this change also being used by the rest of the plugins ? If not, do you think it should be?

Yes, that is the idea. Not in this PR, though. This is only for Helm. I'll create follow-up PRs.

If I'm getting this ok, the problem was the Dynamic interface being called always. You reduced the calls to the strictly necessary ones... but, in case we need the Dynamic client for a given operation: would the "get dynamic client" operation be throttled as it was?

I was expecting that, but after the change, load times have been pretty decent even when using dynamic client.
Let's have it tested for some time. If we detect that using dynamic clients + pinniped becomes a problem, we can dig into the problem further. It seems to me that the issue was more the accumulation of creations of dynamic clients for the same incoming request. Now that this is streamlined with this PR, we should be good.

If you think so, have you taken a look at the dynamic client QPS/Burst options. It seems there are some .WithDiscoveryBurst(xxxx) and .WithDiscoveryQPS(yyy) and perhaps we are not setting these options (even if we allow tuning the Burst and QPS for the typed client operations, but not for the discovery API ones.

Do you mean in Pinniped? In kubeapps-apis we are setting correctly those two params. We could take a look, indeed, it it shows to be a problem after this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Requests for catalog delayed
3 participants