PromQL: Improve space complexity by streaming more; avoid series pre-lookup. #7326
Comments
This is actually quite a small change in removing the Expand from CheckAndExpand. It's really just cleanup from the big reworking of promql I did a few years back.
I don't see how this is related to what's being discussed here. Hashing is cheap, and we'll need to do it anyway. I'm not sure why you've marked this as a P1 feature. It's a small performance thing that's not really something user visible, and it's not urgent. It's a P3 cleanup or enhancement. |
Happy to hear it's easy =D let's try it out then. |
Hi there, first-time contributor looking to help with low-hanging-fruit issues. Could someone point me to the relevant code files so I can better understand the issue? |
Basically in Line 728 in 933aa36
|
Hi, sorry for the inactivity. I would still like to pick this issue up. I am currently looking through how engine.go works; is there any good documentation about the internals of the PromQL engine? My current understanding of the issue is that it is inefficient to gather all the series data before rangeEval. I am still unsure what the proposed fix might involve. Could someone please enlighten me? Thanks! |
I'm not sure there's much in general docs.
In essence eliminate ExpandSeriesSet, and then fix up anything that breaks due to that. The only place things will happen to be fully expanded is subqueries, as that's what the output is. |
Hey everyone, a first time contributor here. I would like to work on this issue if no one else is already working on it. Having gone through this issue and some of the related code, I have a few doubts I was hoping to get cleared. As far as I understand, we want to avoid using Unfortunately, @bwplotka's solution regarding Heap/MergeSort and streaming more made little sense to me, so it would be great if someone could elaborate on that. Thanks! |
We do it as needed one by one, rather than in one big batch. |
I think I understand the issue, but can't really seem to get a grasp on the solution. Where exactly do we determine that we need to move to the next Series? |
Anywhere we iterate over the slice today, instead we call next. |
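To make the "call next instead of iterating over the slice" suggestion concrete, here is a minimal sketch. It is not the engine's actual code: `Series`, `sliceSet`, `expandAll`, `evalBuffered` and `evalStreaming` are hypothetical names invented for illustration, and `SeriesSet` is only a reduced imitation of the Next/At/Err iterator shape used by Prometheus' `storage.SeriesSet`.

```go
package main

import "fmt"

// Series is a simplified stand-in for a stored series (hypothetical; a real
// series would also expose an iterator over its samples).
type Series struct {
	Labels string
}

// SeriesSet imitates the Next/At/Err iterator shape, reduced to what this
// sketch needs.
type SeriesSet interface {
	Next() bool
	At() Series
	Err() error
}

// sliceSet is a toy in-memory SeriesSet used only for the demonstration.
type sliceSet struct {
	series []Series
	idx    int
}

func newSliceSet(s ...Series) *sliceSet { return &sliceSet{series: s, idx: -1} }

func (s *sliceSet) Next() bool { s.idx++; return s.idx < len(s.series) }
func (s *sliceSet) At() Series { return s.series[s.idx] }
func (s *sliceSet) Err() error { return nil }

// expandAll buffers every series up front, as an eager expansion would.
func expandAll(set SeriesSet) ([]Series, error) {
	var out []Series
	for set.Next() {
		out = append(out, set.At())
	}
	return out, set.Err()
}

// evalBuffered ranges over a fully materialized slice, so memory grows with
// the number of series.
func evalBuffered(set SeriesSet, fn func(Series)) error {
	all, err := expandAll(set)
	if err != nil {
		return err
	}
	for _, s := range all {
		fn(s)
	}
	return nil
}

// evalStreaming calls Next wherever the buffered version ranged over the
// slice, so only the current series is held at any time.
func evalStreaming(set SeriesSet, fn func(Series)) error {
	for set.Next() {
		fn(set.At())
	}
	return set.Err()
}

func main() {
	var buffered, streamed []string
	_ = evalBuffered(newSliceSet(Series{Labels: "a"}, Series{Labels: "b"}),
		func(s Series) { buffered = append(buffered, s.Labels) })
	_ = evalStreaming(newSliceSet(Series{Labels: "a"}, Series{Labels: "b"}),
		func(s Series) { streamed = append(streamed, s.Labels) })
	fmt.Println(buffered, streamed)
}
```

Both variants visit the same series in the same order; the difference is only that the streaming one never holds the whole result at once.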
Sorry for the inactivity, I have been trying to work on a patch, but can't get a few tests to pass (almost all of them pass), specifically some in |
That's a bit weird, can you make a PR so we can have a look? |
@bwplotka @brian-brazil , assuming the issue still exists, I would love to take a crack at this if @aryan9600 isn't keen on progressing with #8009 |
Totally, go for it. It's still available, but benchmarks might be necessary to assess results |
Just to give a status update on this: it's still needed for Thanos (and potentially Cortex) purposes. See: thanos-io/thanos#4780. Looks like @darshanime did a good job on this with #9071, so we can close #8009 (thanks for your hard initial work @aryan9600!). |
happy to take a look or collab if no one else is on it |
Correct, not needed anymore, we can reopen if there will be a need! |
Hi 👋
While changing the SeriesSet interfaces, @brian-brazil and I found that our usage of SeriesSet in PromQL might cause extra unnecessary allocations for certain implementations and cases. Let me describe the problem in two diagrams:
Current PromQL logic for Range Query with Metric Selection:
Google Draw Link
All function names should match the current master: https://github.com/prometheus/prometheus/tree/18d9ebf0ffc26b8bd0e136f552c8e9886d29ade4
Issues:
Google Draw Link
Problem: We are accumulating all series first, to gather all iterators.
With the current interfaces (StoreSet, Remote Read, Thanos StoreAPI) “Series by Series” means that we stream Series labels together with their sample data.
Current Flow Issues:
eval PromQL stage: we have to buffer & receive all data. Especially for remote data & Thanos, this means literally all. (:
Realistic space complexity is now O(2 * series * chunks), for the biggest result, so ~ O(2 series * chunks) B
Solution: Stream more!
The idea would be to follow exactly the same algorithm as everywhere else (vertical dedup, merging series, etc.): implement a heap / merge sort that iterates over "sorted" series from two SeriesSets.
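As a rough illustration of the merge idea, the sketch below lazily merges two series sets that are each already sorted by labels, emitting one series per `Next` call. It assumes a lot: `Series`, `sliceSet`, `mergeSet` and `collectLabels` are hypothetical names, labels are reduced to a single string, error handling is left out, and series with equal labels are combined by naively concatenating samples (a real implementation would merge or deduplicate them). A heap would generalize the same approach to more than two inputs.

```go
package main

import "fmt"

// Series is a simplified stand-in: labels as one string plus raw samples.
type Series struct {
	Labels  string
	Samples []float64
}

// SeriesSet is a reduced iterator interface (no Err, to keep the sketch short).
type SeriesSet interface {
	Next() bool
	At() Series
}

// sliceSet is a toy in-memory SeriesSet; its series must be sorted by Labels.
type sliceSet struct {
	series []Series
	idx    int
}

func newSliceSet(s ...Series) *sliceSet { return &sliceSet{series: s, idx: -1} }

func (s *sliceSet) Next() bool { s.idx++; return s.idx < len(s.series) }
func (s *sliceSet) At() Series { return s.series[s.idx] }

// mergeSet lazily merges two sorted SeriesSets, holding at most one pending
// series per input instead of buffering either input fully.
type mergeSet struct {
	a, b     SeriesSet
	aOK, bOK bool
	started  bool
	cur      Series
}

func newMergeSet(a, b SeriesSet) *mergeSet { return &mergeSet{a: a, b: b} }

func (m *mergeSet) Next() bool {
	if !m.started {
		m.started = true
		m.aOK, m.bOK = m.a.Next(), m.b.Next()
	}
	switch {
	case !m.aOK && !m.bOK:
		return false // both inputs exhausted
	case !m.bOK || (m.aOK && m.a.At().Labels < m.b.At().Labels):
		m.cur = m.a.At()
		m.aOK = m.a.Next()
	case !m.aOK || m.b.At().Labels < m.a.At().Labels:
		m.cur = m.b.At()
		m.bOK = m.b.Next()
	default:
		// Same labels on both sides: combine into one series.
		// Concatenation is a placeholder for a real sample merge.
		sa, sb := m.a.At(), m.b.At()
		samples := append(append([]float64{}, sa.Samples...), sb.Samples...)
		m.cur = Series{Labels: sa.Labels, Samples: samples}
		m.aOK, m.bOK = m.a.Next(), m.b.Next()
	}
	return true
}

func (m *mergeSet) At() Series { return m.cur }

// collectLabels drains a set; used here only to show the merged order.
func collectLabels(set SeriesSet) []string {
	var out []string
	for set.Next() {
		out = append(out, set.At().Labels)
	}
	return out
}

func main() {
	left := newSliceSet(Series{Labels: "a"}, Series{Labels: "c", Samples: []float64{1}})
	right := newSliceSet(Series{Labels: "b"}, Series{Labels: "c", Samples: []float64{2}}, Series{Labels: "d"})
	fmt.Println(collectLabels(newMergeSet(left, right)))
}
```

Because `mergeSet` itself satisfies `SeriesSet`, a consumer can keep calling `Next` on it without knowing whether the data comes from one source or several.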
This sounds like a big task, but it is really needed, especially for the Thanos community. Even with the work we are doing to make PromQL more concurrent for the Prometheus ecosystem (#6878), this issue might significantly impact resource (memory) consumption, especially for concurrent runs. I am happy to do this at some point, but I won't have time for it in the next ~2 weeks, so.. help wanted (:
Any feedback that would make this easier to improve? (: cc @slrtbtfs @brian-brazil @brancz @kakkoyun @cstyan @tomwilkie @gouthamve