Reduce size of pods_selector by using replica prefixes #67

lujiajing1126 · 2023-06-10T14:13:12Z

This PR reduces the size of pods_selector by matching pods on the <deployment>-<pod-template-hash> prefix instead of listing every pod name.

This can significantly shrink the query string when a single deployment has hundreds of pods.

300 may look like a magic number here: the pod-template-hash is normally 8 bytes, followed by a 5-character pod hash. The default maximum HTTP header size that nginx accepts is ~4K, so with more than 300 pods the request is bound to exceed that threshold.
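To make the size argument concrete, here is a minimal sketch of the prefix idea. It is not taken from the PR diff: the function names are hypothetical, the pod names are synthetic, and it assumes the per-pod suffix is always the last dash-separated segment of the name.

```python
def replica_prefix(pod_name: str) -> str:
    """Drop the last dash-separated segment (assumed to be the per-pod suffix)."""
    prefix, _, _suffix = pod_name.rpartition("-")
    # Names without a dash have no recognizable suffix; keep them unchanged.
    return prefix or pod_name


def full_selector(pods: list[str]) -> str:
    """Selector that lists every pod name explicitly."""
    return 'pod=~"' + "|".join(pods) + '"'


def prefix_selector(pods: list[str]) -> str:
    """Selector with one regex alternative per distinct replica prefix."""
    prefixes = sorted({replica_prefix(p) for p in pods})
    return 'pod=~"' + "|".join(f"{p}-.*" for p in prefixes) + '"'


# Synthetic example: 300 pods of one deployment, 8-char template hash, 5-char suffix.
pods = [f"api-5d8c7b9f-{i:05d}" for i in range(300)]

print(len(full_selector(pods)))  # length grows linearly with the pod count
print(prefix_selector(pods))     # a single alternative for the whole replica set
```

Note that a prefix match like this could also select pods from another workload whose name happens to start with the same prefix, so it trades exactness for size.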

Signed-off-by: Megrez Lu <lujiajing1126@gmail.com>

CLAassistant · 2023-06-10T14:13:18Z

All committers have signed the CLA.

LeaveMyYard · 2023-07-05T11:13:15Z

Hello, and sorry for the long wait for a response :)

We've been working a lot on performance issues and have also been trying your solution.
One of the approaches we tried is to do pod gathering and metric gathering in a single query:
https://github.com/robusta-dev/krr/blob/rework-promql-queries/robusta_krr/core/integrations/prometheus/metrics/cpu_metric.py

It resulted in even more memory load on Prometheus to execute that query, as well as a longer loading time.
Currently we think the correct solution is to query Prometheus with POST instead of GET, to avoid the limit on request URI size (#66).
You mentioned that in #60, but we can still use POST by simply subclassing the library we use for Prometheus.
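For reference, the Prometheus HTTP API accepts range queries via POST with a form-encoded body, which keeps the query out of the URI. The sketch below only builds such a request with the standard library; it is not the subclassing approach mentioned above, and the function name and base URL are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_range_query_request(
    base_url: str, query: str, start: float, end: float, step: str
) -> Request:
    """Build a POST request for /api/v1/query_range with the query in the body."""
    body = urlencode(
        {"query": query, "start": start, "end": end, "step": step}
    ).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/api/v1/query_range",
        data=body,  # parameters travel in the body, not in the URI
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical endpoint and query, for illustration only.
req = build_range_query_request(
    "http://prometheus.example:9090",
    'container_cpu_usage_seconds_total{pod=~"api-5d8c7b9f-.*"}',
    start=1688500000,
    end=1688503600,
    step="60s",
)
print(req.get_method(), req.full_url, len(req.data))
```

Passing `req` to `urllib.request.urlopen` would send it; that call is left out so the sketch runs without a server.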

Additionally, I've been thinking about completely rewriting the regex to take full advantage of the pod name format, but there is a concern that it might fail in special cases.

In any case, thank you so much for the PR. I will keep it for reference and will close it as soon as we have a solution that satisfies everyone.

LeaveMyYard · 2023-07-05T14:36:58Z

I am closing this PR as #97 should have fixed the issue.

Still, feel free to reopen it if you think it improves anything else; I will keep this possibility in mind in any case.

Thank you very much for participating in the project ❤️

lujiajing1126 · 2023-07-05T15:49:26Z

> Hello, and sorry for the long wait for a response :)
>
> We've been working a lot on performance issues and have also been trying your solution. One of the approaches we tried is to do pod gathering and metric gathering in a single query: https://github.com/robusta-dev/krr/blob/rework-promql-queries/robusta_krr/core/integrations/prometheus/metrics/cpu_metric.py
>
> It resulted in even more memory load on Prometheus to execute that query, as well as a longer loading time.

I've tested this solution on VictoriaMetrics (a cluster with 4 vmstorage nodes), but it failed to return data and timed out. The query is probably too complicated.

I don't know if anyone else can run another test.

> Currently we think the correct solution is to query Prometheus with POST instead of GET, to avoid the limit on request URI size (#66). You mentioned that in #60, but we can still use POST by simply subclassing the library we use for Prometheus.

Sure. I forgot to mention this: we've modified the library in site-packages to make use of POST. Besides that, I would like to remind VM users that vmselect has this argument:

  -search.maxQueryLen size
     The maximum search query length in bytes
     Supports the following optional suffixes for size values: KB, MB, GB, TB, KiB, MiB, GiB, TiB (default 16384)

The maximum query length is 16KiB. So we still have a limit even for POST.
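Because the limit applies to the query text itself, one possible workaround (not proposed in this thread) is to split the pod list into batches whose selectors each fit a byte budget and to merge the results on the client side. A minimal sketch, with a hypothetical function name and a budget chosen by the caller to leave headroom below 16384 bytes for the rest of the query:

```python
def batch_pods(pods: list[str], max_selector_len: int) -> list[list[str]]:
    """Greedily group pod names so each '|'-joined group fits in max_selector_len bytes.

    A single name longer than the budget still gets its own (oversized) batch.
    """
    batches: list[list[str]] = []
    current: list[str] = []
    current_len = 0  # byte length of "|".join(current)
    for pod in pods:
        pod_len = len(pod.encode("utf-8"))
        extra = pod_len + (1 if current else 0)  # +1 for the '|' separator
        if current and current_len + extra > max_selector_len:
            batches.append(current)
            current, current_len = [], 0
            extra = pod_len
        current.append(pod)
        current_len += extra
    if current:
        batches.append(current)
    return batches


# Synthetic pod names; a small budget is used here just to show the splitting.
pods = [f"api-5d8c7b9f-{i:05d}" for i in range(300)]
batches = batch_pods(pods, max_selector_len=1000)
print(len(batches), [len(b) for b in batches])
```

Each batch would then be sent as its own query, at the cost of more round trips.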

reduce size of pods_selector

b375e80

Signed-off-by: Megrez Lu <lujiajing1126@gmail.com>

lujiajing1126 added 3 commits June 14, 2023 09:12

Merge branch 'main' into reduce-query-size

62c4e2b

Merge branch 'main' into reduce-query-size

cae486c

Merge branch 'main' into reduce-query-size

379ac87

lujiajing1126 mentioned this pull request Jun 21, 2023

Support metrics-based workload discovery #59

Open

lujiajing1126 added 2 commits June 29, 2023 12:04

Merge branch 'main' into reduce-query-size

09bcd92

Merge branch 'main' into reduce-query-size

52b0293

LeaveMyYard mentioned this pull request Jul 5, 2023

Too many historic pods causes querystring too long for Prometheus range_query API #60

Closed

LeaveMyYard closed this Jul 5, 2023

lujiajing1126 deleted the reduce-query-size branch July 5, 2023 15:40
