Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce response times when using large repos in Carvel plugin #4099

Closed
antgamdia opened this issue Jan 17, 2022 · 5 comments
Closed

Reduce response times when using large repos in Carvel plugin #4099

antgamdia opened this issue Jan 17, 2022 · 5 comments
Assignees
Labels
component/apis-server Issue related to kubeapps api-server kind/enhancement An issue that reports an enhancement for an implemented feature
Projects

Comments

@antgamdia
Copy link
Contributor

antgamdia commented Jan 17, 2022

The current implementation is unusable (40-50 seconds) when it comes to large repositories. Since Carvel stores a Package for each version, we have #packages * #versions different packages.

For retrieving the package details, we need to query this object, so, for each package, we need to perform this O(n*m) search. Current implementation's response times are up to 10 seconds...

An initial general improvement should be tried and, if not improving, we might have to consider adding a cache...

More info at: #3784

Increasing the QPS (eg 100), does not seem to be working (still resp. times up to: 40-50 seconds)

@antgamdia antgamdia changed the title Reduce response times when using large (aka Bitnami) repos. Reduce response times when using large repos in Carvel plugin Jan 17, 2022
@antgamdia antgamdia added component/apis-server Issue related to kubeapps api-server kind/enhancement An issue that reports an enhancement for an implemented feature priority/high labels Jan 17, 2022
@antgamdia antgamdia added this to Next iteration discussion in Kubeapps Jan 17, 2022
@antgamdia
Copy link
Contributor Author

I'd bet we're gonna need a similar approach as the one implemented at #3636 😅

@absoludity absoludity moved this from Next iteration discussion to Committed in Kubeapps Mar 2, 2022
@absoludity absoludity moved this from Committed to Next iteration discussion in Kubeapps Mar 2, 2022
@absoludity absoludity moved this from Next iteration discussion to Committed in Kubeapps Mar 3, 2022
@absoludity absoludity self-assigned this Mar 3, 2022
@absoludity absoludity moved this from Committed to In progress in Kubeapps Mar 3, 2022
@absoludity
Copy link
Contributor

I took a look at the issue and possible solutions to this problem today, specifically at the GetAvailablePackageSummaries call.

I added the TCE repo (12 package metadatas with 17 package versions). Opening the catalog in Kubeapps three times resulted in:

I0303 05:14:17.418209       1 server.go:59] OK 5.691394379s /kubeappsapis.core.packages.v1alpha1.PackagesService/GetAvailablePackageSummaries
...
I0303 05:14:40.599514       1 server.go:59] OK 5.765965114s /kubeappsapis.core.packages.v1alpha1.PackagesService/GetAvailablePackageSummaries
...
I0303 05:14:48.814321       1 server.go:59] OK 5.128993047s /kubeappsapis.core.packages.v1alpha1.PackagesService/GetAvailablePackageSummaries

Similar to your thoughts above @antgamdia , it looks like the main issue is that currently we're doing a separate request for every package metadata to get the related package (version) resource. The data which is fetched with this extra request per item is simply the latest version.

Three options I can think of (ordered by worthiness :P ) are:

Option 1 - fetch all package (versions) with a single request

Using a single request for all package (version) resources and then filtering this for the data we need while iterating the results (rather than converting it to an in-memory slice) would be much more efficient than N queries in most situations.

It'd be nice if we could just select all relevant package (version) resources, but the k8s API server only allows a limited set of operators for field selectors, so it's only possible to fetch either all or for one specific metadata name (the latter being what the code does currently)

Option 2 - Don't display the version in the package summaries for carvel

I thought it wouldn't look too bad if the summaries view (for Carvel) just didn't display a version (ie. we send it back blank). This would solve the issue for the catalog view, but would look odd when mixing results from different plugins and would not help the similar situation in other carvel end-points (get installed package summaries).

Option 3 - Update the UX to get the version data separately

Plugin returns the summary data (basically, the carvel package metadata), then the dashboard requests version data for each package displayed to update dynamically.

Obviously a large change. We'd want to think through carefully whether this fit other plugins needs too (as we don't do it currently obviously, but perhaps that's because of expensive options we've chosen).

Result

I went ahead with the Option 1 since it was the most viable and most useful right now. Initial results (for the TCE repo) seem good, taking the average for over 5s for the call as above down to between 260-500ms:

I0303 05:36:29.301656       1 server.go:59] OK 280.707563ms /kubeappsapis.core.packages.v1alpha1.PackagesService/GetAvailablePackageSummaries
...
I0303 05:36:52.809490       1 server.go:59] OK 263.191122ms /kubeappsapis.core.packages.v1alpha1.PackagesService/GetAvailablePackageSummaries
...
I0303 05:37:27.609722       1 server.go:59] OK 490.541777ms /kubeappsapis.core.packages.v1alpha1.PackagesService/GetAvailablePackageSummaries

I'll push and link a PR with that where we can discuss the details if we want to land it (and we can update the GetInstalledPackageSummaries similarly).

@absoludity
Copy link
Contributor

Tested with both the TCE repository and @vrabbi 's bitnami repo at https://github.com/users/vrabbi/packages/container/package/bitnami-charts-repo so that I've got, in total, 112 packages on my local system:

k get packages | wc -l                                               
113

The change in the related PR brings the response time down from 40-50s to 4-5 seconds:

I0304 01:39:19.978673       1 server.go:59] OK 4.140540445s /kubeappsapis.core.packages.v1alpha1.PackagesService/GetAvailablePackageSummaries
...
I0304 01:41:18.568173       1 server.go:59] OK 3.899983355s /kubeappsapis.core.packages.v1alpha1.PackagesService/GetAvailablePackageSummaries
...
I0304 01:41:56.281673       1 server.go:59] OK 4.14603719s /kubeappsapis.core.packages.v1alpha1.PackagesService/GetAvailablePackageSummaries

@antgamdia
Copy link
Contributor Author

Wow, awesome Michael. So relieved we've been able to reduce the response times... I mean, it was unbearable; perhaps the cache would speed it up to ms, but at least now is usable. Thanks!

@vrabbi
Copy link

vrabbi commented Mar 13, 2022

This sounds awesome! Cant wait to try out the plugin with shorter loading times! Other then the timing i am loving the carvel support and it will be only better with this change making it a truly viable solution at scale!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
component/apis-server Issue related to kubeapps api-server kind/enhancement An issue that reports an enhancement for an implemented feature
Projects
No open projects
Kubeapps
  
Done
Development

No branches or pull requests

4 participants