Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

postings: use Loser Tree for merge #12878

Merged
merged 2 commits into from Dec 12, 2023
Merged

Conversation

bboreham
Copy link
Member

Loser Tree is an alternative to Heap for k-way merge. It's faster because it doesn't call Less() and Swap() through dynamic dispatch, and because it stores the value at the front of each sequence so saves calling At() over and over.

Full disclosure: I'm doing a talk about this algorithm at GopherCon next week.

I added a benchmark because none of the existing ones merge enough series to show the difference.
Loser tree does use more memory, but not a lot in the context of PromQL evaluation.

goos: darwin
goarch: arm64
pkg: github.com/prometheus/prometheus/tsdb/index
               │    before     │               after                │
               │    sec/op     │   sec/op     vs base               │
Merge/1-8         228.9n ±  1%   228.8n ± 1%        ~ (p=0.803 n=6)
Merge/10-8        39.84µ ±  1%   11.91µ ± 0%  -70.10% (p=0.002 n=6)
Merge/100-8       912.1µ ±  1%   151.1µ ± 0%  -83.44% (p=0.002 n=6)
Merge/1000-8     12.241m ± 10%   1.675m ± 1%  -86.32% (p=0.002 n=6)
Merge/10000-8    159.59m ±  2%   19.12m ± 1%  -88.02% (p=0.002 n=6)
Merge/100000-8   2781.5m ±  0%   590.4m ± 1%  -78.77% (p=0.002 n=6)

               │     before     │                 after                  │
               │      B/op      │     B/op      vs base                  │
Merge/1-8          0.000 ± 0%       0.000 ± 0%         ~ (p=1.000 n=6) 
Merge/10-8         224.0 ± 0%       720.0 ± 0%  +221.43% (p=0.002 n=6)
Merge/100-8      1.812Ki ± 0%     6.453Ki ± 0%  +256.03% (p=0.002 n=6)
Merge/1000-8     16.06Ki ± 0%     64.08Ki ± 0%  +298.93% (p=0.002 n=6)
Merge/10000-8    160.1Ki ± 0%     632.1Ki ± 0%  +294.89% (p=0.002 n=6)
Merge/100000-8   1.531Mi ± 0%     6.109Mi ± 0%  +298.97% (p=0.002 n=6)

Note change to test - instead of requiring that the data structure is identical to EmptyPostings(), check that calling Next() returns false, which implies it was empty.

Copy link
Contributor

@fpetkovski fpetkovski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Impressive, can't wait for the talk to be online.

tsdb/index/postings.go Outdated Show resolved Hide resolved
@gouthamve
Copy link
Member

Hey Bryan, is this ready to merge? There are a few conflicts ;)

We looked at this in our bug scrub.

@bboreham
Copy link
Member Author

Yes it can be merged. I had hoped to get the implementation closer between Loki and Prometheus, but currently there is too much of a performance loss to do that.

Also you might not want to take a dependency on something from my personal account. There are precedents - alecthomas, cespare, dennwc, klauspost.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
It's faster.

Note change to test - instead of requiring that the data structure is
identical to `EmptyPostings()`, check that calling `Next()` returns
false, which implies it was empty.

Also the check for context cancellation during initialization was
removed. Initialization should be a small portion of the work done
during merge, so it's not worth plumbing a context argument through.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Copy link
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks 💪🏽

@yeya24
Copy link
Contributor

yeya24 commented Dec 12, 2023

Is this change ready to merge? Looking forward to test it

@bboreham bboreham merged commit d0c2d9c into prometheus:main Dec 12, 2023
24 checks passed
@bboreham bboreham deleted the loser-tree branch December 12, 2023 21:38
@yeya24
Copy link
Contributor

yeya24 commented Jan 18, 2024

Hi @bboreham, may I know if you have any benchmark for mergedPostings Seek method? Do we expect to see similar improvements as the provided benchmark

@bboreham
Copy link
Member Author

I don't recall benchmarking Seek specifically in Prometheus.

Some of the queries in BenchmarkRangeQuery do call through Seek, and I seem to remember they are dominated by the behaviour of Seek on the underlying postings. E.g. ListPostings.Seek() always does sort.Search, which is expensive in the case that the value you need is list[1].

@jub0bs
Copy link

jub0bs commented Apr 1, 2024

For information, Bryan's talk is now available at https://www.youtube.com/watch?v=AmLtlXEo4UU

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants