New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[tsdb] Improve mergeSeriesSet #5920
Conversation
6bef6a4
to
d7ac452
Compare
I used the PR to test the new automated bench testing here are the results: @brian-brazil should be able to comment about the PR changes. |
@krasi-georgiev I ran the benchmark on the prometheus machine just now. Actually I found the difference is under 5% for all cases in |
yes you are right, I completely forgot that the GH actions are run in shared VM env so the results will be inconsistent. Will refactor the testing to use dedicated machines. |
Signed-off-by: naivewong <867245430@qq.com>
d7ac452
to
f06d52b
Compare
This looks good. Do benchmarks show much of a difference for two series sets to justify the separate handling? |
The benchmarks results on the op time seem a bit better.
|
What's that comparing? It looks largely in the noise. |
The benchmark is still comparing the master and this pr for |
If there's no big difference, let's go with the simpler approach. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't seem to find a test that ensures correctness for the NewMergedSeriesSet
Can you verify that we have one or write it if missing.
tsdb/querier.go
Outdated
return s.err | ||
} | ||
|
||
// This function is to call Next() for all SeriesSet. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please follow Golang commentary guidelines.
https://golang.org/doc/effective_go.html#commentary
tsdb/querier.go
Outdated
s.all, s.buf = s.buf, s.all | ||
} | ||
|
||
// This function is to call Next() for the SeriesSet with the indices of s.ids. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't seem to describe the exact behavior of the function and the comment is an exact copy of nextAll
The correctness of prometheus/tsdb/querier_test.go Line 60 in f06d52b
The behaviors of nextAll and nextWithID are different. It's just like the comments that nextAll will calls Next() for all SeriesSet in all array in mergedSeriesSet and nextWithID only calls Next() for some SeriesSet in the all array.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with the small nit.
Signed-off-by: naivewong <867245430@qq.com>
@codesome @brian-brazil unless you have some comments will merge this in the next 1-2 days. |
I still think we should be removing binaryMergedSeriesSet. |
sure that is fine with me. |
@naivewong could you address this last comment so we can merge this one. |
Sorry, I just realized I misunderstood what Brian said before. I thought he just asked me to compare the original and the current method for two series. As a result, the previous result did not show the results without the binaryMergedSeriesSet for 2 series. The following is the actual results without binaryMergedSeriesSet for 2 series -
That's why I added binaryMergedSeriesSet. Sorry for the mistake again. |
Thanks, that looks largely irrelevant to me especially when we get to larger datasets where we care more. |
@brian-brazil so IIUC you are saying that |
Yes. |
Signed-off-by: naivewong <867245430@qq.com>
@brian-brazil the binary set has been removed, do you notice any other issues? |
👍 |
@naivewong thanks for the patience and the great work! |
Using the idea similar to tsdb/pr616 to improve both time and alloc.
Currently I use the original
mergeSeriesSet
if the len of the passed-in seriesset slice = 2, because I think the original one is the most efficient way for merging only 2 seriesset.Benchmark results of
BenchmarkMergedSeriesSet
.