Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce size of pods_selector by using replica prefixes #67

Closed

Conversation

lujiajing1126
Copy link

@lujiajing1126 lujiajing1126 commented Jun 10, 2023

Closes #60, #62, #66

This PR intends to reduce the size of pod_selector, by picking the <deployment>-<pod-template-hash> as the prefix.

This could significantly reduce the size of the querystring in the case that there are hundreds of pods for a single deployment.

300 may be a magic number here: pod-template-hash normally contains 8 bytes, with 5 chars pod hash. The default max http header that nginx can accept is ~4K, so with >300 pods it must exceed the threshold.

Signed-off-by: Megrez Lu <lujiajing1126@gmail.com>
@CLAassistant
Copy link

CLAassistant commented Jun 10, 2023

CLA assistant check
All committers have signed the CLA.

@LeaveMyYard
Copy link
Contributor

LeaveMyYard commented Jul 5, 2023

Hello and sorry for long time waiting on response :)

We've been working a lot with performance issues and also trying your solution.
One of the solutions we tried is to make pod gathering and metric gathering in one query:
https://github.com/robusta-dev/krr/blob/rework-promql-queries/robusta_krr/core/integrations/prometheus/metrics/cpu_metric.py

It resulted in using even more memory load on prometheus to execute that query, as well as longer loading time.
Currently we thing that the correct solution will be to use prometheus with POST except of GET to avoid the limitation of request URI being too large (#66)
You have mentioned that in #60, but still we can use POST by just subclassing the lib we are using for prometheus

Additionally, I've been thinging on fully rewriting regex to fully utilize the format for pod names, but there is a concern that there might be special cases where it fails.

In any case thank you so much for the PR, I will keep it for reference and will close it as soon as we have a solution that satisfies everyone

@LeaveMyYard
Copy link
Contributor

LeaveMyYard commented Jul 5, 2023

I am closing this PR as #97 should have fixed the issue

Still, reopen it if you feel like it improves anything else, but I will keep this possibility in mind in any case

Thank you very much for participating in a project ❤️

@LeaveMyYard LeaveMyYard closed this Jul 5, 2023
@lujiajing1126 lujiajing1126 deleted the reduce-query-size branch July 5, 2023 15:40
@lujiajing1126
Copy link
Author

lujiajing1126 commented Jul 5, 2023

Hello and sorry for long time waiting on response :)

We've been working a lot with performance issues and also trying your solution. One of the solutions we tried is to make pod gathering and metric gathering in one query: https://github.com/robusta-dev/krr/blob/rework-promql-queries/robusta_krr/core/integrations/prometheus/metrics/cpu_metric.py

It resulted in using even more memory load on prometheus to execute that query, as well as longer loading time.

I've tested this solution on VictoriaMetrics (a cluster w/ 4 vmstorage), but it failed to return data and throws timeout. Probably it is too complicated.

I don't know if anyone else can make another test.

Currently we thing that the correct solution will be to use prometheus with POST except of GET to avoid the limitation of request URI being too large (#66) You have mentioned that in #60, but still we can use POST by just subclassing the lib we are using for prometheus

Sure. I forgot to mention this. We've modified site-package to make use of POST. But besides that, I would like to remind (for VM users) that vmselect has an argument,

  -search.maxQueryLen size
     The maximum search query length in bytes
     Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 16384)

The maximum query length is 16KiB. So we still have a limit even for POST.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Too many historic pods causes querystring too long for Prometheus range_query API
3 participants