Optimize postings offset table reading #11535

Merged
merged 7 commits into from Nov 14, 2022

Conversation

Contributor

@colega colega commented Nov 4, 2022

Instead of reading a generic-ish []string slice, we now always read two []byte slices: name and value. They point to the backing buffer slice until we store them as strings, where they're copied.

Moved the label-indices specific reading mechanism to that test only, since it's not being used anywhere else.

This avoids allocating a slice that escapes to the heap, making it both faster and more efficient in terms of memory management.
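
A minimal sketch of the idea, not the actual tsdb/index code: it assumes a hypothetical layout in which every name and value is a uvarint-length-prefixed field, and the helpers `readBytes`, `readTable`, `appendField` and `labelNames` are made up for illustration. The callback receives sub-slices of the backing buffer, and a string copy is made only when a name is actually stored.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// readBytes decodes one uvarint-length-prefixed field from buf. The returned
// field is a sub-slice of buf (no copy); rest is whatever follows it.
func readBytes(buf []byte) (field, rest []byte, err error) {
	n, w := binary.Uvarint(buf)
	if w <= 0 || uint64(len(buf)-w) < n {
		return nil, nil, errors.New("short buffer")
	}
	return buf[w : w+int(n)], buf[w+int(n):], nil
}

// readTable calls f once per (name, value) pair. Both slices alias buf, so f
// must copy them (for example with string(b)) if it wants to keep them.
func readTable(buf []byte, f func(name, value []byte) error) error {
	for len(buf) > 0 {
		var name, value []byte
		var err error
		if name, buf, err = readBytes(buf); err != nil {
			return err
		}
		if value, buf, err = readBytes(buf); err != nil {
			return err
		}
		if err := f(name, value); err != nil {
			return err
		}
	}
	return nil
}

// appendField writes s in the hypothetical length-prefixed layout.
func appendField(dst []byte, s string) []byte {
	var tmp [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(tmp[:], uint64(len(s)))
	dst = append(dst, tmp[:n]...)
	return append(dst, s...)
}

// labelNames returns the distinct names in order of first appearance.
func labelNames(buf []byte) ([]string, error) {
	seen := map[string]struct{}{}
	var names []string
	err := readTable(buf, func(name, _ []byte) error {
		if _, ok := seen[string(name)]; ok { // lookup only, nothing is stored
			return nil
		}
		s := string(name) // the copy happens only when we decide to store
		seen[s] = struct{}{}
		names = append(names, s)
		return nil
	})
	return names, err
}

func main() {
	var buf []byte
	for _, kv := range [][2]string{{"job", "a"}, {"job", "b"}, {"instance", "x"}} {
		buf = appendField(appendField(buf, kv[0]), kv[1])
	}
	names, err := labelNames(buf)
	fmt.Println(names, err)
}
```

Passing `[]byte` to the callback leaves the decision to copy with the caller, which is where the allocation savings in the benchmark below would come from under these assumptions.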

Also added a benchmark for a block with 1M series and 20 labels, which I think looks really promising:

name                 old time/op    new time/op    delta
OpenBlock/benchmark     130ms ± 3%      44ms ±10%  -66.47%  (p=0.008 n=5+5)

name                 old alloc/op   new alloc/op   delta
OpenBlock/benchmark    53.2MB ± 0%     5.7MB ± 0%  -89.21%  (p=0.008 n=5+5)

name                 old allocs/op  new allocs/op  delta
OpenBlock/benchmark     3.00M ± 0%     0.06M ± 0%  -97.91%  (p=0.008 n=5+5)
Old PR description (before applying feedback and optimizing even further):

Instead of reading a generic-ish `[]string` slice, we can read a generic type, which would be specifically `labels.Label` for the postings offset table.

This avoids allocating a slice that escapes to the heap, making it both faster and more efficient in terms of memory management.
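
For readers unfamiliar with the first iteration, here is a rough sketch of what reading "a generic type" could look like. It is an illustration under assumptions rather than the code from this PR: `Label` stands in for `labels.Label`, rows are plain `name=value` strings instead of the binary table, and `readAll` and `readLabel` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// Label is a stand-in for labels.Label in this sketch.
type Label struct{ Name, Value string }

// readAll decodes every row with the supplied read function and hands the
// typed result to f, so no intermediate []string is needed per row.
func readAll[T any](rows []string, read func(row string) (T, error), f func(T) error) error {
	for _, row := range rows {
		v, err := read(row)
		if err != nil {
			return err
		}
		if err := f(v); err != nil {
			return err
		}
	}
	return nil
}

// readLabel parses one hypothetical "name=value" row into a Label.
func readLabel(row string) (Label, error) {
	name, value, ok := strings.Cut(row, "=")
	if !ok {
		return Label{}, fmt.Errorf("malformed row %q", row)
	}
	return Label{Name: name, Value: value}, nil
}

func main() {
	var got []Label
	err := readAll([]string{"job=api", "instance=a"}, readLabel, func(l Label) error {
		got = append(got, l)
		return nil
	})
	fmt.Println(got, err)
}
```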

Also added a benchmark for a block with 1M series and 20 labels, which I think looks really promising:

name                 old time/op    new time/op    delta
OpenBlock/benchmark     130ms ± 3%      85ms ± 3%  -35.10%  (p=0.008 n=5+5)

name                 old alloc/op   new alloc/op   delta
OpenBlock/benchmark    53.2MB ± 0%    21.2MB ± 0%  -60.10%  (p=0.008 n=5+5)

name                 old allocs/op  new allocs/op  delta
OpenBlock/benchmark     3.00M ± 0%     2.00M ± 0%  -33.33%  (p=0.000 n=5+4)

P.S.: is this the first generics usage in Prometheus?

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Instead of reading a generic-ish []string, we can read a generic type
which would be specifically labels.Label.

This avoids allocating a slice that escapes to the heap, making it both
faster and more efficient in terms of memory management.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Contributor

@pracucci pracucci left a comment

Makes sense to me

colega and others added 3 commits November 8, 2022 10:08
Co-authored-by: Marco Pracucci <marco@pracucci.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Member

@bboreham bboreham left a comment

It's very good, but I find it a little too clever.

Only two functions are passed as the read parameter to ReadOffsetTable - PostingsTableReader and reader. Since reader is only used in a test, suggest changing the test to work on labels, and hard-coding ReadOffsetTable to read Labels.

@@ -1251,15 +1247,12 @@ type Range struct {
// for all postings lists.
func (r *Reader) PostingsRanges() (map[labels.Label]Range, error) {
Member

PostingsRanges is never called?

Contributor Author

Apparently it's not. I can see it called in one of Mimir's tests though.

I feel we can remove it, along with the Range type, which is only used here, but I'd rather do that in a separate PR and keep this one for optimization changes only.

Applied PR feedback: removed generics, moved the label indices reading
to that specific test as we're not using it in production anyway, we're
just testing what we've just built.

Also using two []byte variables for name and value that use the backing
buffer instead of using strings. This reduces allocations a lot, as we
only copy them when we store them (this is optimized by the compiler).

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
@colega colega changed the title from "Use specific types when reading offset table" to "Optimize postings offset table reading" Nov 11, 2022
Contributor Author

colega commented Nov 11, 2022

Thank you for your feedback, @bboreham. I applied it, and also changed the way name & value are passed to the func: they're []byte now. That reduces allocations of strings we don't want to store, since the compiler already optimizes a map lookup of the form m[string(b)], with b being a []byte, to avoid an allocation.

Updated the benchmark in the description, PTAL.
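
The lookup optimization mentioned above can be checked with a small sketch like the following. It is only an illustration: `lookupAllocs`, `convertAllocs` and `sink` are hypothetical names, and `testing.AllocsPerRun` is used outside of a test purely for measurement. The first function converts the key inside the index expression, the second materializes the string and stores it.

```go
package main

import (
	"fmt"
	"testing"
)

// lookupAllocs reports the average number of heap allocations per lookup when
// the []byte key is converted with string(key) inside the index expression.
func lookupAllocs(m map[string]int, key []byte) float64 {
	hits := 0
	allocs := testing.AllocsPerRun(1000, func() {
		if _, ok := m[string(key)]; ok {
			hits++
		}
	})
	_ = hits
	return allocs
}

// sink keeps the converted string alive so the conversion cannot be elided.
var sink string

// convertAllocs reports the allocations per run when the string is actually
// materialized and stored, which requires copying the bytes.
func convertAllocs(key []byte) float64 {
	return testing.AllocsPerRun(1000, func() {
		sink = string(key)
	})
}

func main() {
	m := map[string]int{"__name__": 1}
	key := []byte("__name__")
	fmt.Println("lookup allocs/op: ", lookupAllocs(m, key))
	fmt.Println("convert allocs/op:", convertAllocs(key))
}
```

With the standard gc toolchain the lookup is expected to report zero allocations, while storing the converted string costs one, which matches the "only copy when we store" approach.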

Member

@bboreham bboreham left a comment

LGTM.

I believe the code in TestIndexRW_Postings exists to check that Prometheus is writing LabelIndicesTable, so that it can be read by an older version of Prometheus.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
@codesome codesome merged commit 8553a98 into prometheus:main Nov 14, 2022
charleskorn added a commit to grafana/mimir that referenced this pull request Dec 16, 2022
…ng unnecessary strings.

This is inspired by prometheus/prometheus#11535.

Unfortunately, we can't adopt that change as-is, as byte slices
returned by our new Decbuf.UvarintBytes() implementation are not valid
after subsequent reads - we can't take advantage of the magic of mmap.

This means that we must decide whether or not to allocate a string
for a key or value before reading any further in the file. However,
we want to store the last value for each key, but won't know if the
value is the last one until we've read the next one.

The trick is to read the table in two passes. On the first pass, we
read every 1-in-postingOffsetsInMemSampling entries, and keep track of
the position of the last value for each key.

On the second pass, we go back and read the last values for each key.

(I've started with two passes to avoid seeking backwards and
discarding the entire file buffer every time we start reading a new
key - it may be interesting to see if discarding the buffer is as
expensive as I expect.)
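
The two-pass idea can be sketched roughly as follows. This is an illustration under assumptions, not the Mimir implementation: the table is modelled as an in-memory slice grouped by name, a "position" is simply a slice index (a real reader would track file offsets), and `entry`, `twoPass` and the `sampling` parameter are hypothetical stand-ins.

```go
package main

import "fmt"

// entry is a hypothetical decoded row of the postings offset table.
type entry struct {
	name, value string
}

// twoPass keeps every sampling-th value of each name, plus the last value of
// each name. Entries for one name must be contiguous in table.
func twoPass(table []entry, sampling int) map[string][]string {
	out := map[string][]string{}
	lastPos := map[string]int{}  // name -> position of its last entry
	lastKept := map[string]int{} // name -> position of the last entry kept in pass 1

	// Pass 1: keep 1-in-sampling entries and remember where each name ends.
	idxInName := 0
	for pos, e := range table {
		if pos == 0 || table[pos-1].name != e.name {
			idxInName = 0
		}
		if idxInName%sampling == 0 {
			out[e.name] = append(out[e.name], e.value)
			lastKept[e.name] = pos
		}
		lastPos[e.name] = pos
		idxInName++
	}

	// Pass 2: go back and read the last value of each name, unless the
	// sampling in pass 1 already kept it.
	for name, pos := range lastPos {
		if pos != lastKept[name] {
			out[name] = append(out[name], table[pos].value)
		}
	}
	return out
}

func main() {
	table := []entry{
		{"job", "v0"}, {"job", "v1"}, {"job", "v2"}, {"job", "v3"},
		{"zone", "w0"},
	}
	fmt.Println(twoPass(table, 2))
}
```

Because the first pass only records positions for the last values, nothing has to be allocated for an entry until it is known to be one that will be stored.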

This involves a trade-off: we'll scan the index-header file twice, but
gain massively reduced memory allocations. On my machine (an M1 MacBook
Pro with a fast SSD), the trade-off pays off.

Compared to the previous commit:

name                                         old time/op    new time/op    delta
NewStreamBinaryReader/1Names1Values-10          122µs ± 5%     129µs ±13%     ~     (p=0.151 n=5+5)
NewStreamBinaryReader/1Names10Values-10         131µs ± 9%     124µs ± 2%     ~     (p=0.056 n=5+5)
NewStreamBinaryReader/1Names100Values-10        135µs ± 3%     133µs ± 3%     ~     (p=0.548 n=5+5)
NewStreamBinaryReader/1Names500Values-10        177µs ± 2%     162µs ± 1%   -8.29%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names1000Values-10       229µs ± 2%     198µs ± 2%  -13.51%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names5000Values-10       689µs ± 1%     535µs ± 2%  -22.37%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1Values-10         169µs ± 3%     171µs ± 2%     ~     (p=0.310 n=5+5)
NewStreamBinaryReader/20Names10Values-10        194µs ± 2%     188µs ± 4%     ~     (p=0.056 n=5+5)
NewStreamBinaryReader/20Names100Values-10       438µs ± 7%     355µs ± 6%  -19.07%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names500Values-10      1.31ms ± 0%    0.94ms ± 3%  -28.29%  (p=0.016 n=4+5)
NewStreamBinaryReader/20Names1000Values-10     2.28ms ± 2%    1.62ms ± 3%  -29.16%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names5000Values-10     9.95ms ± 2%    6.71ms ± 1%  -32.57%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names1Values-10         280µs ± 1%     277µs ± 1%     ~     (p=0.095 n=5+5)
NewStreamBinaryReader/50Names10Values-10        347µs ± 2%     322µs ± 2%   -7.08%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names100Values-10       910µs ± 2%     701µs ± 1%  -22.94%  (p=0.016 n=5+4)
NewStreamBinaryReader/50Names500Values-10      2.97ms ± 2%    2.14ms ± 3%  -28.06%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names1000Values-10     5.29ms ± 2%    3.79ms ± 2%  -28.32%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names5000Values-10     24.9ms ± 1%    16.6ms ± 1%  -33.42%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1Values-10        543µs ± 1%     548µs ± 3%     ~     (p=0.548 n=5+5)
NewStreamBinaryReader/100Names10Values-10       678µs ± 3%     632µs ± 4%   -6.77%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names100Values-10     1.73ms ± 5%    1.37ms ± 5%  -21.00%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names500Values-10     5.63ms ± 2%    4.08ms ± 2%  -27.62%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1000Values-10    10.2ms ± 2%     7.3ms ± 1%  -29.01%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names5000Values-10    49.8ms ± 1%    33.4ms ± 0%  -33.04%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1Values-10       1.16ms ± 2%    1.16ms ± 2%     ~     (p=0.548 n=5+5)
NewStreamBinaryReader/200Names10Values-10      1.39ms ± 1%    1.29ms ± 2%   -6.95%  (p=0.016 n=4+5)
NewStreamBinaryReader/200Names100Values-10     3.35ms ± 3%    2.68ms ± 4%  -20.01%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names500Values-10     11.5ms ± 1%     8.0ms ± 0%  -30.78%  (p=0.016 n=5+4)
NewStreamBinaryReader/200Names1000Values-10    21.1ms ± 3%    14.5ms ± 1%  -31.39%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names5000Values-10     100ms ± 2%      67ms ± 0%  -32.81%  (p=0.008 n=5+5)

name                                         old alloc/op   new alloc/op   delta
NewStreamBinaryReader/1Names1Values-10         3.18MB ± 0%    3.18MB ± 0%   +0.00%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names10Values-10        3.18MB ± 0%    3.18MB ± 0%   -0.00%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names100Values-10       3.18MB ± 0%    3.18MB ± 0%   -0.05%  (p=0.016 n=5+4)
NewStreamBinaryReader/1Names500Values-10       3.19MB ± 0%    3.18MB ± 0%   -0.24%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names1000Values-10      3.20MB ± 0%    3.18MB ± 0%   -0.48%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names5000Values-10      3.28MB ± 0%    3.20MB ± 0%   -2.36%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1Values-10        3.18MB ± 0%    3.18MB ± 0%   +0.06%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names10Values-10       3.19MB ± 0%    3.18MB ± 0%   -0.02%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names100Values-10      3.22MB ± 0%    3.19MB ± 0%   -0.90%  (p=0.029 n=4+4)
NewStreamBinaryReader/20Names500Values-10      3.38MB ± 0%    3.23MB ± 0%   -4.54%  (p=0.016 n=4+5)
NewStreamBinaryReader/20Names1000Values-10     3.58MB ± 0%    3.27MB ± 0%   -8.61%  (p=0.029 n=4+4)
NewStreamBinaryReader/20Names5000Values-10     5.43MB ± 0%    3.56MB ± 0%  -34.41%  (p=0.016 n=4+5)
NewStreamBinaryReader/50Names1Values-10        3.19MB ± 0%    3.19MB ± 0%   +0.13%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names10Values-10       3.20MB ± 0%    3.19MB ± 0%   -0.09%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names100Values-10      3.29MB ± 0%    3.22MB ± 0%   -2.24%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names500Values-10      3.68MB ± 0%    3.30MB ± 0%  -10.44%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names1000Values-10     4.18MB ± 0%    3.41MB ± 0%  -18.44%  (p=0.016 n=4+5)
NewStreamBinaryReader/50Names5000Values-10     9.29MB ± 0%    4.13MB ± 0%  -55.48%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1Values-10       3.20MB ± 0%    3.21MB ± 0%   +0.25%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names10Values-10      3.22MB ± 0%    3.21MB ± 0%   -0.17%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names100Values-10     3.40MB ± 0%    3.26MB ± 0%   -4.32%  (p=0.016 n=5+4)
NewStreamBinaryReader/100Names500Values-10     4.19MB ± 0%    3.42MB ± 0%  -18.36%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1000Values-10    5.19MB ± 0%    3.65MB ± 0%  -29.73%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names5000Values-10    15.7MB ± 0%     5.1MB ± 0%  -67.61%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1Values-10       3.22MB ± 0%    3.23MB ± 0%   +0.51%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names10Values-10      3.26MB ± 0%    3.25MB ± 0%   -0.33%  (p=0.016 n=4+5)
NewStreamBinaryReader/200Names100Values-10     3.63MB ± 0%    3.33MB ± 0%   -8.11%  (p=0.029 n=4+4)
NewStreamBinaryReader/200Names500Values-10     5.52MB ± 0%    3.66MB ± 0%  -33.68%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1000Values-10    7.92MB ± 0%    4.12MB ± 0%  -48.02%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names5000Values-10    29.3MB ± 0%     7.0MB ± 0%  -76.10%  (p=0.016 n=5+4)

name                                         old allocs/op  new allocs/op  delta
NewStreamBinaryReader/1Names1Values-10           76.0 ± 0%      78.0 ± 0%   +2.63%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names10Values-10          95.0 ± 0%      80.0 ± 0%  -15.79%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names100Values-10          278 ± 0%        86 ± 0%  -69.06%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names500Values-10        1.08k ± 0%     0.10k ± 0%  -90.74%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names1000Values-10       2.08k ± 0%     0.12k ± 0%  -94.38%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names5000Values-10       10.1k ± 0%      0.2k ± 0%  -97.58%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1Values-10           154 ± 0%       160 ± 0%   +3.90%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names10Values-10          534 ± 0%       200 ± 0%  -62.55%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names100Values-10       4.19k ± 0%     0.32k ± 0%  -92.37%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names500Values-10       20.2k ± 0%      0.6k ± 0%  -97.03%  (p=0.029 n=4+4)
NewStreamBinaryReader/20Names1000Values-10      40.3k ± 0%      0.9k ± 0%  -97.66%  (p=0.000 n=5+4)
NewStreamBinaryReader/20Names5000Values-10       200k ± 0%        3k ± 0%  -98.26%  (p=0.029 n=4+4)
NewStreamBinaryReader/50Names1Values-10           278 ± 0%       285 ± 0%   +2.52%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names10Values-10        1.23k ± 0%     0.39k ± 0%  -68.65%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names100Values-10       10.4k ± 0%      0.7k ± 0%  -93.40%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names500Values-10       50.5k ± 0%      1.4k ± 0%  -97.26%  (p=0.000 n=5+4)
NewStreamBinaryReader/50Names1000Values-10       101k ± 0%        2k ± 0%  -97.78%  (p=0.029 n=4+4)
NewStreamBinaryReader/50Names5000Values-10       501k ± 0%        9k ± 0%  -98.28%  (p=0.016 n=4+5)
NewStreamBinaryReader/100Names1Values-10          481 ± 0%       489 ± 0%   +1.66%  (p=0.029 n=4+4)
NewStreamBinaryReader/100Names10Values-10       2.38k ± 0%     0.69k ± 0%  -71.05%  (p=0.000 n=4+5)
NewStreamBinaryReader/100Names100Values-10      20.7k ± 0%      1.3k ± 0%  -93.76%  (p=0.000 n=5+4)
NewStreamBinaryReader/100Names500Values-10       101k ± 0%        3k ± 0%  -97.33%  (p=0.016 n=5+4)
NewStreamBinaryReader/100Names1000Values-10      201k ± 0%        4k ± 0%  -97.82%  (p=0.029 n=4+4)
NewStreamBinaryReader/100Names5000Values-10     1.00M ± 0%     0.02M ± 0%  -98.29%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1Values-10          882 ± 0%       891 ± 0%   +1.02%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names10Values-10       4.68k ± 0%     1.29k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/200Names100Values-10      41.3k ± 0%      2.5k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/200Names500Values-10       202k ± 0%        5k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/200Names1000Values-10      402k ± 0%        9k ± 0%  -97.84%  (p=0.000 n=4+5)
NewStreamBinaryReader/200Names5000Values-10     2.00M ± 0%     0.03M ± 0%  -98.30%  (p=0.016 n=4+5)

...and compared to bada69c:

name                                         old time/op    new time/op    delta
NewStreamBinaryReader/1Names1Values-10          122µs ± 8%     129µs ±13%     ~     (p=0.151 n=5+5)
NewStreamBinaryReader/1Names10Values-10         124µs ± 4%     124µs ± 2%     ~     (p=0.421 n=5+5)
NewStreamBinaryReader/1Names100Values-10        138µs ± 2%     133µs ± 3%   -3.45%  (p=0.016 n=5+5)
NewStreamBinaryReader/1Names500Values-10        187µs ± 4%     162µs ± 1%  -13.16%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names1000Values-10       262µs ± 2%     198µs ± 2%  -24.35%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names5000Values-10       837µs ± 3%     535µs ± 2%  -36.05%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1Values-10         168µs ± 2%     171µs ± 2%     ~     (p=0.056 n=5+5)
NewStreamBinaryReader/20Names10Values-10        199µs ± 2%     188µs ± 4%   -5.37%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names100Values-10       505µs ± 1%     355µs ± 6%  -29.75%  (p=0.016 n=4+5)
NewStreamBinaryReader/20Names500Values-10      1.63ms ± 1%    0.94ms ± 3%  -42.27%  (p=0.016 n=4+5)
NewStreamBinaryReader/20Names1000Values-10     2.90ms ± 3%    1.62ms ± 3%  -44.24%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names5000Values-10     12.9ms ± 2%     6.7ms ± 1%  -47.87%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names1Values-10         276µs ± 0%     277µs ± 1%     ~     (p=0.286 n=4+5)
NewStreamBinaryReader/50Names10Values-10        368µs ± 1%     322µs ± 2%  -12.49%  (p=0.016 n=4+5)
NewStreamBinaryReader/50Names100Values-10      1.10ms ± 4%    0.70ms ± 1%  -36.16%  (p=0.016 n=5+4)
NewStreamBinaryReader/50Names500Values-10      3.73ms ± 3%    2.14ms ± 3%  -42.68%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names1000Values-10     6.74ms ± 2%    3.79ms ± 2%  -43.81%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names5000Values-10     32.0ms ± 0%    16.6ms ± 1%  -48.27%  (p=0.016 n=4+5)
NewStreamBinaryReader/100Names1Values-10        547µs ± 1%     548µs ± 3%     ~     (p=0.413 n=4+5)
NewStreamBinaryReader/100Names10Values-10       728µs ± 4%     632µs ± 4%  -13.19%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names100Values-10     2.08ms ± 5%    1.37ms ± 5%  -34.32%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names500Values-10     7.13ms ± 1%    4.08ms ± 2%  -42.86%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1000Values-10    13.3ms ± 2%     7.3ms ± 1%  -45.46%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names5000Values-10    64.9ms ± 2%    33.4ms ± 0%  -48.57%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1Values-10       1.17ms ± 0%    1.16ms ± 2%     ~     (p=0.190 n=4+5)
NewStreamBinaryReader/200Names10Values-10      1.49ms ± 0%    1.29ms ± 2%  -13.17%  (p=0.016 n=4+5)
NewStreamBinaryReader/200Names100Values-10     4.05ms ± 3%    2.68ms ± 4%  -33.87%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names500Values-10     14.4ms ± 2%     8.0ms ± 0%  -44.83%  (p=0.016 n=5+4)
NewStreamBinaryReader/200Names1000Values-10    27.3ms ± 2%    14.5ms ± 1%  -47.02%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names5000Values-10     131ms ± 2%      67ms ± 0%  -48.50%  (p=0.008 n=5+5)

name                                         old alloc/op   new alloc/op   delta
NewStreamBinaryReader/1Names1Values-10         3.18MB ± 0%    3.18MB ± 0%   +0.00%  (p=0.032 n=5+5)
NewStreamBinaryReader/1Names10Values-10        3.18MB ± 0%    3.18MB ± 0%   -0.01%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names100Values-10       3.18MB ± 0%    3.18MB ± 0%   -0.15%  (p=0.016 n=5+4)
NewStreamBinaryReader/1Names500Values-10       3.20MB ± 0%    3.18MB ± 0%   -0.74%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names1000Values-10      3.23MB ± 0%    3.18MB ± 0%   -1.47%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names5000Values-10      3.44MB ± 0%    3.20MB ± 0%   -6.91%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1Values-10        3.18MB ± 0%    3.18MB ± 0%   +0.04%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names10Values-10       3.19MB ± 0%    3.18MB ± 0%   -0.22%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names100Values-10      3.29MB ± 0%    3.19MB ± 0%   -2.83%  (p=0.016 n=5+4)
NewStreamBinaryReader/20Names500Values-10      3.70MB ± 0%    3.23MB ± 0%  -12.80%  (p=0.016 n=4+5)
NewStreamBinaryReader/20Names1000Values-10     4.22MB ± 0%    3.27MB ± 0%  -22.47%  (p=0.029 n=4+4)
NewStreamBinaryReader/20Names5000Values-10     8.63MB ± 0%    3.56MB ± 0%  -58.73%  (p=0.016 n=4+5)
NewStreamBinaryReader/50Names1Values-10        3.19MB ± 0%    3.19MB ± 0%   +0.08%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names10Values-10       3.21MB ± 0%    3.19MB ± 0%   -0.58%  (p=0.016 n=4+5)
NewStreamBinaryReader/50Names100Values-10      3.45MB ± 0%    3.22MB ± 0%   -6.77%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names500Values-10      4.48MB ± 0%    3.30MB ± 0%  -26.43%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names1000Values-10     5.78MB ± 0%    3.41MB ± 0%  -41.01%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names5000Values-10     17.3MB ± 0%     4.1MB ± 0%  -76.09%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1Values-10       3.20MB ± 0%    3.21MB ± 0%   +0.15%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names10Values-10      3.25MB ± 0%    3.21MB ± 0%   -1.15%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names100Values-10     3.72MB ± 0%    3.26MB ± 0%  -12.55%  (p=0.029 n=4+4)
NewStreamBinaryReader/100Names500Values-10     5.79MB ± 0%    3.42MB ± 0%  -40.94%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1000Values-10    8.39MB ± 0%    3.65MB ± 0%  -56.54%  (p=0.016 n=4+5)
NewStreamBinaryReader/100Names5000Values-10    31.7MB ± 0%     5.1MB ± 0%  -83.95%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1Values-10       3.22MB ± 0%    3.23MB ± 0%   +0.31%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names10Values-10      3.32MB ± 0%    3.25MB ± 0%   -2.26%  (p=0.016 n=4+5)
NewStreamBinaryReader/200Names100Values-10     4.27MB ± 0%    3.33MB ± 0%  -21.89%  (p=0.029 n=4+4)
NewStreamBinaryReader/200Names500Values-10     8.72MB ± 0%    3.66MB ± 0%  -58.03%  (p=0.016 n=4+5)
NewStreamBinaryReader/200Names1000Values-10    14.3MB ± 0%     4.1MB ± 0%  -71.26%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names5000Values-10    61.3MB ± 0%     7.0MB ± 0%  -88.58%  (p=0.016 n=5+4)

name                                         old allocs/op  new allocs/op  delta
NewStreamBinaryReader/1Names1Values-10           78.0 ± 0%      78.0 ± 0%     ~     (all equal)
NewStreamBinaryReader/1Names10Values-10           106 ± 0%        80 ± 0%  -24.53%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names100Values-10          379 ± 0%        86 ± 0%  -77.31%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names500Values-10        1.58k ± 0%     0.10k ± 0%  -93.67%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names1000Values-10       3.08k ± 0%     0.12k ± 0%  -96.20%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names5000Values-10       15.1k ± 0%      0.2k ± 0%  -98.38%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1Values-10           175 ± 0%       160 ± 0%   -8.57%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names10Values-10          735 ± 0%       200 ± 0%  -72.79%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names100Values-10       6.20k ± 0%     0.32k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/20Names500Values-10       30.2k ± 0%      0.6k ± 0%  -98.02%  (p=0.000 n=5+4)
NewStreamBinaryReader/20Names1000Values-10      60.3k ± 0%      0.9k ± 0%  -98.44%  (p=0.029 n=4+4)
NewStreamBinaryReader/20Names5000Values-10       300k ± 0%        3k ± 0%  -98.84%  (p=0.029 n=4+4)
NewStreamBinaryReader/50Names1Values-10           329 ± 0%       285 ± 0%  -13.37%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names10Values-10        1.73k ± 0%     0.39k ± 0%  -77.73%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names100Values-10       15.4k ± 0%      0.7k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/50Names500Values-10       75.5k ± 0%      1.4k ± 0%  -98.17%  (p=0.029 n=4+4)
NewStreamBinaryReader/50Names1000Values-10       151k ± 0%        2k ± 0%  -98.52%  (p=0.029 n=4+4)
NewStreamBinaryReader/50Names5000Values-10       751k ± 0%        9k ± 0%  -98.86%  (p=0.016 n=4+5)
NewStreamBinaryReader/100Names1Values-10          582 ± 0%       489 ± 0%  -15.98%  (p=0.029 n=4+4)
NewStreamBinaryReader/100Names10Values-10       3.38k ± 0%     0.69k ± 0%  -79.62%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names100Values-10      30.7k ± 0%      1.3k ± 0%  -95.80%  (p=0.029 n=4+4)
NewStreamBinaryReader/100Names500Values-10       151k ± 0%        3k ± 0%  -98.22%  (p=0.029 n=4+4)
NewStreamBinaryReader/100Names1000Values-10      301k ± 0%        4k ± 0%  -98.54%  (p=0.029 n=4+4)
NewStreamBinaryReader/100Names5000Values-10     1.50M ± 0%     0.02M ± 0%  -98.86%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1Values-10        1.08k ± 0%     0.89k ± 0%  -17.73%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names10Values-10       6.68k ± 0%     1.29k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/200Names100Values-10      61.3k ± 0%      2.5k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/200Names500Values-10       302k ± 0%        5k ± 0%  -98.25%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1000Values-10      602k ± 0%        9k ± 0%  -98.56%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names5000Values-10     3.00M ± 0%     0.03M ± 0%  -98.86%  (p=0.008 n=5+5)
charleskorn added a commit to grafana/mimir that referenced this pull request Dec 19, 2022
…ng unnecessary strings.

This is inspired by prometheus/prometheus#11535.

Unfortunately, we can't adopt that change as-is, as byte slices
returned by our new Decbuf.UvarintBytes() implementation are not valid
after subsequent reads - we can't take advantage of the magic of mmap.

This means that we must decide whether or not to allocate a string
for a key or value before reading any further in the file. However,
we want to store the last value for each key, but won't know if the
value is the last one until we've read the next one.

The trick is to read the table in two passes. On the first pass, we
read every 1-in-postingOffsetsInMemSampling entries, and keep track of
the position of the last value for each key.

On the second pass, we go back and read the last values for each key.

(I've started with two passes to avoid seeking backwards and
discarding the entire file buffer every time we start reading a new
key - it may be interesting to see if discarding the buffer is as
expensive as I expect.)

This involves a trade-off: we'll scan the index-header file twice, but
gain massively reduced memory allocations. On my machine (an M1 MacBook
Pro with a fast SSD), the trade-off pays off.

Compared to the previous commit:

name                                         old time/op    new time/op    delta
NewStreamBinaryReader/1Names1Values-10          122µs ± 5%     129µs ±13%     ~     (p=0.151 n=5+5)
NewStreamBinaryReader/1Names10Values-10         131µs ± 9%     124µs ± 2%     ~     (p=0.056 n=5+5)
NewStreamBinaryReader/1Names100Values-10        135µs ± 3%     133µs ± 3%     ~     (p=0.548 n=5+5)
NewStreamBinaryReader/1Names500Values-10        177µs ± 2%     162µs ± 1%   -8.29%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names1000Values-10       229µs ± 2%     198µs ± 2%  -13.51%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names5000Values-10       689µs ± 1%     535µs ± 2%  -22.37%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1Values-10         169µs ± 3%     171µs ± 2%     ~     (p=0.310 n=5+5)
NewStreamBinaryReader/20Names10Values-10        194µs ± 2%     188µs ± 4%     ~     (p=0.056 n=5+5)
NewStreamBinaryReader/20Names100Values-10       438µs ± 7%     355µs ± 6%  -19.07%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names500Values-10      1.31ms ± 0%    0.94ms ± 3%  -28.29%  (p=0.016 n=4+5)
NewStreamBinaryReader/20Names1000Values-10     2.28ms ± 2%    1.62ms ± 3%  -29.16%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names5000Values-10     9.95ms ± 2%    6.71ms ± 1%  -32.57%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names1Values-10         280µs ± 1%     277µs ± 1%     ~     (p=0.095 n=5+5)
NewStreamBinaryReader/50Names10Values-10        347µs ± 2%     322µs ± 2%   -7.08%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names100Values-10       910µs ± 2%     701µs ± 1%  -22.94%  (p=0.016 n=5+4)
NewStreamBinaryReader/50Names500Values-10      2.97ms ± 2%    2.14ms ± 3%  -28.06%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names1000Values-10     5.29ms ± 2%    3.79ms ± 2%  -28.32%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names5000Values-10     24.9ms ± 1%    16.6ms ± 1%  -33.42%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1Values-10        543µs ± 1%     548µs ± 3%     ~     (p=0.548 n=5+5)
NewStreamBinaryReader/100Names10Values-10       678µs ± 3%     632µs ± 4%   -6.77%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names100Values-10     1.73ms ± 5%    1.37ms ± 5%  -21.00%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names500Values-10     5.63ms ± 2%    4.08ms ± 2%  -27.62%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1000Values-10    10.2ms ± 2%     7.3ms ± 1%  -29.01%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names5000Values-10    49.8ms ± 1%    33.4ms ± 0%  -33.04%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1Values-10       1.16ms ± 2%    1.16ms ± 2%     ~     (p=0.548 n=5+5)
NewStreamBinaryReader/200Names10Values-10      1.39ms ± 1%    1.29ms ± 2%   -6.95%  (p=0.016 n=4+5)
NewStreamBinaryReader/200Names100Values-10     3.35ms ± 3%    2.68ms ± 4%  -20.01%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names500Values-10     11.5ms ± 1%     8.0ms ± 0%  -30.78%  (p=0.016 n=5+4)
NewStreamBinaryReader/200Names1000Values-10    21.1ms ± 3%    14.5ms ± 1%  -31.39%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names5000Values-10     100ms ± 2%      67ms ± 0%  -32.81%  (p=0.008 n=5+5)

name                                         old alloc/op   new alloc/op   delta
NewStreamBinaryReader/1Names1Values-10         3.18MB ± 0%    3.18MB ± 0%   +0.00%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names10Values-10        3.18MB ± 0%    3.18MB ± 0%   -0.00%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names100Values-10       3.18MB ± 0%    3.18MB ± 0%   -0.05%  (p=0.016 n=5+4)
NewStreamBinaryReader/1Names500Values-10       3.19MB ± 0%    3.18MB ± 0%   -0.24%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names1000Values-10      3.20MB ± 0%    3.18MB ± 0%   -0.48%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names5000Values-10      3.28MB ± 0%    3.20MB ± 0%   -2.36%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1Values-10        3.18MB ± 0%    3.18MB ± 0%   +0.06%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names10Values-10       3.19MB ± 0%    3.18MB ± 0%   -0.02%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names100Values-10      3.22MB ± 0%    3.19MB ± 0%   -0.90%  (p=0.029 n=4+4)
NewStreamBinaryReader/20Names500Values-10      3.38MB ± 0%    3.23MB ± 0%   -4.54%  (p=0.016 n=4+5)
NewStreamBinaryReader/20Names1000Values-10     3.58MB ± 0%    3.27MB ± 0%   -8.61%  (p=0.029 n=4+4)
NewStreamBinaryReader/20Names5000Values-10     5.43MB ± 0%    3.56MB ± 0%  -34.41%  (p=0.016 n=4+5)
NewStreamBinaryReader/50Names1Values-10        3.19MB ± 0%    3.19MB ± 0%   +0.13%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names10Values-10       3.20MB ± 0%    3.19MB ± 0%   -0.09%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names100Values-10      3.29MB ± 0%    3.22MB ± 0%   -2.24%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names500Values-10      3.68MB ± 0%    3.30MB ± 0%  -10.44%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names1000Values-10     4.18MB ± 0%    3.41MB ± 0%  -18.44%  (p=0.016 n=4+5)
NewStreamBinaryReader/50Names5000Values-10     9.29MB ± 0%    4.13MB ± 0%  -55.48%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1Values-10       3.20MB ± 0%    3.21MB ± 0%   +0.25%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names10Values-10      3.22MB ± 0%    3.21MB ± 0%   -0.17%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names100Values-10     3.40MB ± 0%    3.26MB ± 0%   -4.32%  (p=0.016 n=5+4)
NewStreamBinaryReader/100Names500Values-10     4.19MB ± 0%    3.42MB ± 0%  -18.36%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1000Values-10    5.19MB ± 0%    3.65MB ± 0%  -29.73%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names5000Values-10    15.7MB ± 0%     5.1MB ± 0%  -67.61%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1Values-10       3.22MB ± 0%    3.23MB ± 0%   +0.51%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names10Values-10      3.26MB ± 0%    3.25MB ± 0%   -0.33%  (p=0.016 n=4+5)
NewStreamBinaryReader/200Names100Values-10     3.63MB ± 0%    3.33MB ± 0%   -8.11%  (p=0.029 n=4+4)
NewStreamBinaryReader/200Names500Values-10     5.52MB ± 0%    3.66MB ± 0%  -33.68%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1000Values-10    7.92MB ± 0%    4.12MB ± 0%  -48.02%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names5000Values-10    29.3MB ± 0%     7.0MB ± 0%  -76.10%  (p=0.016 n=5+4)

name                                         old allocs/op  new allocs/op  delta
NewStreamBinaryReader/1Names1Values-10           76.0 ± 0%      78.0 ± 0%   +2.63%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names10Values-10          95.0 ± 0%      80.0 ± 0%  -15.79%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names100Values-10          278 ± 0%        86 ± 0%  -69.06%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names500Values-10        1.08k ± 0%     0.10k ± 0%  -90.74%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names1000Values-10       2.08k ± 0%     0.12k ± 0%  -94.38%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names5000Values-10       10.1k ± 0%      0.2k ± 0%  -97.58%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1Values-10           154 ± 0%       160 ± 0%   +3.90%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names10Values-10          534 ± 0%       200 ± 0%  -62.55%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names100Values-10       4.19k ± 0%     0.32k ± 0%  -92.37%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names500Values-10       20.2k ± 0%      0.6k ± 0%  -97.03%  (p=0.029 n=4+4)
NewStreamBinaryReader/20Names1000Values-10      40.3k ± 0%      0.9k ± 0%  -97.66%  (p=0.000 n=5+4)
NewStreamBinaryReader/20Names5000Values-10       200k ± 0%        3k ± 0%  -98.26%  (p=0.029 n=4+4)
NewStreamBinaryReader/50Names1Values-10           278 ± 0%       285 ± 0%   +2.52%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names10Values-10        1.23k ± 0%     0.39k ± 0%  -68.65%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names100Values-10       10.4k ± 0%      0.7k ± 0%  -93.40%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names500Values-10       50.5k ± 0%      1.4k ± 0%  -97.26%  (p=0.000 n=5+4)
NewStreamBinaryReader/50Names1000Values-10       101k ± 0%        2k ± 0%  -97.78%  (p=0.029 n=4+4)
NewStreamBinaryReader/50Names5000Values-10       501k ± 0%        9k ± 0%  -98.28%  (p=0.016 n=4+5)
NewStreamBinaryReader/100Names1Values-10          481 ± 0%       489 ± 0%   +1.66%  (p=0.029 n=4+4)
NewStreamBinaryReader/100Names10Values-10       2.38k ± 0%     0.69k ± 0%  -71.05%  (p=0.000 n=4+5)
NewStreamBinaryReader/100Names100Values-10      20.7k ± 0%      1.3k ± 0%  -93.76%  (p=0.000 n=5+4)
NewStreamBinaryReader/100Names500Values-10       101k ± 0%        3k ± 0%  -97.33%  (p=0.016 n=5+4)
NewStreamBinaryReader/100Names1000Values-10      201k ± 0%        4k ± 0%  -97.82%  (p=0.029 n=4+4)
NewStreamBinaryReader/100Names5000Values-10     1.00M ± 0%     0.02M ± 0%  -98.29%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1Values-10          882 ± 0%       891 ± 0%   +1.02%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names10Values-10       4.68k ± 0%     1.29k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/200Names100Values-10      41.3k ± 0%      2.5k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/200Names500Values-10       202k ± 0%        5k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/200Names1000Values-10      402k ± 0%        9k ± 0%  -97.84%  (p=0.000 n=4+5)
NewStreamBinaryReader/200Names5000Values-10     2.00M ± 0%     0.03M ± 0%  -98.30%  (p=0.016 n=4+5)

...and compared to bada69c:

name                                         old time/op    new time/op    delta
NewStreamBinaryReader/1Names1Values-10          122µs ± 8%     129µs ±13%     ~     (p=0.151 n=5+5)
NewStreamBinaryReader/1Names10Values-10         124µs ± 4%     124µs ± 2%     ~     (p=0.421 n=5+5)
NewStreamBinaryReader/1Names100Values-10        138µs ± 2%     133µs ± 3%   -3.45%  (p=0.016 n=5+5)
NewStreamBinaryReader/1Names500Values-10        187µs ± 4%     162µs ± 1%  -13.16%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names1000Values-10       262µs ± 2%     198µs ± 2%  -24.35%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names5000Values-10       837µs ± 3%     535µs ± 2%  -36.05%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1Values-10         168µs ± 2%     171µs ± 2%     ~     (p=0.056 n=5+5)
NewStreamBinaryReader/20Names10Values-10        199µs ± 2%     188µs ± 4%   -5.37%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names100Values-10       505µs ± 1%     355µs ± 6%  -29.75%  (p=0.016 n=4+5)
NewStreamBinaryReader/20Names500Values-10      1.63ms ± 1%    0.94ms ± 3%  -42.27%  (p=0.016 n=4+5)
NewStreamBinaryReader/20Names1000Values-10     2.90ms ± 3%    1.62ms ± 3%  -44.24%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names5000Values-10     12.9ms ± 2%     6.7ms ± 1%  -47.87%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names1Values-10         276µs ± 0%     277µs ± 1%     ~     (p=0.286 n=4+5)
NewStreamBinaryReader/50Names10Values-10        368µs ± 1%     322µs ± 2%  -12.49%  (p=0.016 n=4+5)
NewStreamBinaryReader/50Names100Values-10      1.10ms ± 4%    0.70ms ± 1%  -36.16%  (p=0.016 n=5+4)
NewStreamBinaryReader/50Names500Values-10      3.73ms ± 3%    2.14ms ± 3%  -42.68%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names1000Values-10     6.74ms ± 2%    3.79ms ± 2%  -43.81%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names5000Values-10     32.0ms ± 0%    16.6ms ± 1%  -48.27%  (p=0.016 n=4+5)
NewStreamBinaryReader/100Names1Values-10        547µs ± 1%     548µs ± 3%     ~     (p=0.413 n=4+5)
NewStreamBinaryReader/100Names10Values-10       728µs ± 4%     632µs ± 4%  -13.19%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names100Values-10     2.08ms ± 5%    1.37ms ± 5%  -34.32%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names500Values-10     7.13ms ± 1%    4.08ms ± 2%  -42.86%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1000Values-10    13.3ms ± 2%     7.3ms ± 1%  -45.46%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names5000Values-10    64.9ms ± 2%    33.4ms ± 0%  -48.57%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1Values-10       1.17ms ± 0%    1.16ms ± 2%     ~     (p=0.190 n=4+5)
NewStreamBinaryReader/200Names10Values-10      1.49ms ± 0%    1.29ms ± 2%  -13.17%  (p=0.016 n=4+5)
NewStreamBinaryReader/200Names100Values-10     4.05ms ± 3%    2.68ms ± 4%  -33.87%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names500Values-10     14.4ms ± 2%     8.0ms ± 0%  -44.83%  (p=0.016 n=5+4)
NewStreamBinaryReader/200Names1000Values-10    27.3ms ± 2%    14.5ms ± 1%  -47.02%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names5000Values-10     131ms ± 2%      67ms ± 0%  -48.50%  (p=0.008 n=5+5)

name                                         old alloc/op   new alloc/op   delta
NewStreamBinaryReader/1Names1Values-10         3.18MB ± 0%    3.18MB ± 0%   +0.00%  (p=0.032 n=5+5)
NewStreamBinaryReader/1Names10Values-10        3.18MB ± 0%    3.18MB ± 0%   -0.01%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names100Values-10       3.18MB ± 0%    3.18MB ± 0%   -0.15%  (p=0.016 n=5+4)
NewStreamBinaryReader/1Names500Values-10       3.20MB ± 0%    3.18MB ± 0%   -0.74%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names1000Values-10      3.23MB ± 0%    3.18MB ± 0%   -1.47%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names5000Values-10      3.44MB ± 0%    3.20MB ± 0%   -6.91%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1Values-10        3.18MB ± 0%    3.18MB ± 0%   +0.04%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names10Values-10       3.19MB ± 0%    3.18MB ± 0%   -0.22%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names100Values-10      3.29MB ± 0%    3.19MB ± 0%   -2.83%  (p=0.016 n=5+4)
NewStreamBinaryReader/20Names500Values-10      3.70MB ± 0%    3.23MB ± 0%  -12.80%  (p=0.016 n=4+5)
NewStreamBinaryReader/20Names1000Values-10     4.22MB ± 0%    3.27MB ± 0%  -22.47%  (p=0.029 n=4+4)
NewStreamBinaryReader/20Names5000Values-10     8.63MB ± 0%    3.56MB ± 0%  -58.73%  (p=0.016 n=4+5)
NewStreamBinaryReader/50Names1Values-10        3.19MB ± 0%    3.19MB ± 0%   +0.08%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names10Values-10       3.21MB ± 0%    3.19MB ± 0%   -0.58%  (p=0.016 n=4+5)
NewStreamBinaryReader/50Names100Values-10      3.45MB ± 0%    3.22MB ± 0%   -6.77%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names500Values-10      4.48MB ± 0%    3.30MB ± 0%  -26.43%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names1000Values-10     5.78MB ± 0%    3.41MB ± 0%  -41.01%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names5000Values-10     17.3MB ± 0%     4.1MB ± 0%  -76.09%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1Values-10       3.20MB ± 0%    3.21MB ± 0%   +0.15%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names10Values-10      3.25MB ± 0%    3.21MB ± 0%   -1.15%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names100Values-10     3.72MB ± 0%    3.26MB ± 0%  -12.55%  (p=0.029 n=4+4)
NewStreamBinaryReader/100Names500Values-10     5.79MB ± 0%    3.42MB ± 0%  -40.94%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1000Values-10    8.39MB ± 0%    3.65MB ± 0%  -56.54%  (p=0.016 n=4+5)
NewStreamBinaryReader/100Names5000Values-10    31.7MB ± 0%     5.1MB ± 0%  -83.95%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1Values-10       3.22MB ± 0%    3.23MB ± 0%   +0.31%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names10Values-10      3.32MB ± 0%    3.25MB ± 0%   -2.26%  (p=0.016 n=4+5)
NewStreamBinaryReader/200Names100Values-10     4.27MB ± 0%    3.33MB ± 0%  -21.89%  (p=0.029 n=4+4)
NewStreamBinaryReader/200Names500Values-10     8.72MB ± 0%    3.66MB ± 0%  -58.03%  (p=0.016 n=4+5)
NewStreamBinaryReader/200Names1000Values-10    14.3MB ± 0%     4.1MB ± 0%  -71.26%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names5000Values-10    61.3MB ± 0%     7.0MB ± 0%  -88.58%  (p=0.016 n=5+4)

name                                         old allocs/op  new allocs/op  delta
NewStreamBinaryReader/1Names1Values-10           78.0 ± 0%      78.0 ± 0%     ~     (all equal)
NewStreamBinaryReader/1Names10Values-10           106 ± 0%        80 ± 0%  -24.53%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names100Values-10          379 ± 0%        86 ± 0%  -77.31%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names500Values-10        1.58k ± 0%     0.10k ± 0%  -93.67%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names1000Values-10       3.08k ± 0%     0.12k ± 0%  -96.20%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names5000Values-10       15.1k ± 0%      0.2k ± 0%  -98.38%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1Values-10           175 ± 0%       160 ± 0%   -8.57%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names10Values-10          735 ± 0%       200 ± 0%  -72.79%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names100Values-10       6.20k ± 0%     0.32k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/20Names500Values-10       30.2k ± 0%      0.6k ± 0%  -98.02%  (p=0.000 n=5+4)
NewStreamBinaryReader/20Names1000Values-10      60.3k ± 0%      0.9k ± 0%  -98.44%  (p=0.029 n=4+4)
NewStreamBinaryReader/20Names5000Values-10       300k ± 0%        3k ± 0%  -98.84%  (p=0.029 n=4+4)
NewStreamBinaryReader/50Names1Values-10           329 ± 0%       285 ± 0%  -13.37%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names10Values-10        1.73k ± 0%     0.39k ± 0%  -77.73%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names100Values-10       15.4k ± 0%      0.7k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/50Names500Values-10       75.5k ± 0%      1.4k ± 0%  -98.17%  (p=0.029 n=4+4)
NewStreamBinaryReader/50Names1000Values-10       151k ± 0%        2k ± 0%  -98.52%  (p=0.029 n=4+4)
NewStreamBinaryReader/50Names5000Values-10       751k ± 0%        9k ± 0%  -98.86%  (p=0.016 n=4+5)
NewStreamBinaryReader/100Names1Values-10          582 ± 0%       489 ± 0%  -15.98%  (p=0.029 n=4+4)
NewStreamBinaryReader/100Names10Values-10       3.38k ± 0%     0.69k ± 0%  -79.62%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names100Values-10      30.7k ± 0%      1.3k ± 0%  -95.80%  (p=0.029 n=4+4)
NewStreamBinaryReader/100Names500Values-10       151k ± 0%        3k ± 0%  -98.22%  (p=0.029 n=4+4)
NewStreamBinaryReader/100Names1000Values-10      301k ± 0%        4k ± 0%  -98.54%  (p=0.029 n=4+4)
NewStreamBinaryReader/100Names5000Values-10     1.50M ± 0%     0.02M ± 0%  -98.86%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1Values-10        1.08k ± 0%     0.89k ± 0%  -17.73%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names10Values-10       6.68k ± 0%     1.29k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/200Names100Values-10      61.3k ± 0%      2.5k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/200Names500Values-10       302k ± 0%        5k ± 0%  -98.25%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1000Values-10      602k ± 0%        9k ± 0%  -98.56%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names5000Values-10     3.00M ± 0%     0.03M ± 0%  -98.86%  (p=0.008 n=5+5)

Read postings offset table in one pass by seeking back to previous values when required.

On my machine with an SSD, this produces mixed results, but seems to
improve things for index-headers with a high number of values and
relatively few names:

name                                         old time/op    new time/op    delta
NewStreamBinaryReader/1Names1Values-10          129µs ±13%     136µs ± 4%     ~     (p=0.151 n=5+5)
NewStreamBinaryReader/1Names10Values-10         124µs ± 2%     129µs ± 7%   +3.75%  (p=0.032 n=5+5)
NewStreamBinaryReader/1Names100Values-10        133µs ± 3%     135µs ± 1%     ~     (p=0.421 n=5+5)
NewStreamBinaryReader/1Names500Values-10        162µs ± 1%     164µs ± 2%     ~     (p=0.421 n=5+5)
NewStreamBinaryReader/1Names1000Values-10       198µs ± 2%     198µs ± 2%     ~     (p=1.000 n=5+5)
NewStreamBinaryReader/1Names5000Values-10       535µs ± 2%     518µs ± 2%   -3.13%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1Values-10         171µs ± 2%     171µs ± 1%     ~     (p=0.841 n=5+5)
NewStreamBinaryReader/20Names10Values-10        188µs ± 4%     208µs ± 2%  +10.17%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names100Values-10       355µs ± 6%     383µs ± 2%   +8.03%  (p=0.016 n=5+5)
NewStreamBinaryReader/20Names500Values-10       941µs ± 3%     932µs ± 3%     ~     (p=0.421 n=5+5)
NewStreamBinaryReader/20Names1000Values-10     1.62ms ± 3%    1.57ms ± 3%     ~     (p=0.095 n=5+5)
NewStreamBinaryReader/20Names5000Values-10     6.71ms ± 1%    6.33ms ± 1%   -5.57%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names1Values-10         277µs ± 1%     291µs ± 5%   +5.10%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names10Values-10        322µs ± 2%     394µs ± 5%  +22.15%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names100Values-10       701µs ± 1%     730µs ± 2%   +4.11%  (p=0.016 n=4+5)
NewStreamBinaryReader/50Names500Values-10      2.14ms ± 3%    2.08ms ± 3%     ~     (p=0.095 n=5+5)
NewStreamBinaryReader/50Names1000Values-10     3.79ms ± 2%    3.63ms ± 2%   -4.09%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names5000Values-10     16.6ms ± 1%    15.5ms ± 0%   -6.59%  (p=0.016 n=5+4)
NewStreamBinaryReader/100Names1Values-10        548µs ± 3%     542µs ± 3%     ~     (p=0.095 n=5+5)
NewStreamBinaryReader/100Names10Values-10       632µs ± 4%     741µs ± 3%  +17.27%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names100Values-10     1.37ms ± 5%    1.40ms ± 5%     ~     (p=0.222 n=5+5)
NewStreamBinaryReader/100Names500Values-10     4.08ms ± 2%    3.97ms ± 2%   -2.62%  (p=0.016 n=5+5)
NewStreamBinaryReader/100Names1000Values-10    7.27ms ± 1%    6.96ms ± 1%   -4.25%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names5000Values-10    33.4ms ± 0%    31.1ms ± 0%   -6.73%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1Values-10       1.16ms ± 2%    1.16ms ± 3%     ~     (p=1.000 n=5+5)
NewStreamBinaryReader/200Names10Values-10      1.29ms ± 2%    1.57ms ± 4%  +21.49%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names100Values-10     2.68ms ± 4%    2.73ms ± 3%     ~     (p=0.095 n=5+5)
NewStreamBinaryReader/200Names500Values-10     7.97ms ± 0%    7.64ms ± 1%   -4.09%  (p=0.016 n=4+5)
NewStreamBinaryReader/200Names1000Values-10    14.5ms ± 1%    13.7ms ± 1%   -5.30%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names5000Values-10    67.3ms ± 0%    62.5ms ± 1%   -7.15%  (p=0.008 n=5+5)
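
For illustration, the one-pass idea can be sketched as below. This is a minimal sketch and not the Mimir implementation: the table layout (length-prefixed name and value only), the in-memory `[]byte` standing in for the file, and the names `readOnePass`, `next`, `readBytes` and `encodeTable` are all assumptions made for this example. It keeps one in every `sampling` values per name, and when the name changes it goes back to the previous entry to pick up the last value if that value was not already kept.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeTable builds a toy table: each entry is a length-prefixed name
// followed by a length-prefixed value (layout assumed for this sketch).
func encodeTable(entries [][2]string) []byte {
	var b []byte
	var tmp [binary.MaxVarintLen64]byte
	for _, e := range entries {
		for _, s := range e {
			n := binary.PutUvarint(tmp[:], uint64(len(s)))
			b = append(append(b, tmp[:n]...), s...)
		}
	}
	return b
}

// readBytes returns the length-prefixed bytes at pos and the position after them.
func readBytes(table []byte, pos int) ([]byte, int, error) {
	l, n := binary.Uvarint(table[pos:])
	if n <= 0 || uint64(len(table)-pos-n) < l {
		return nil, 0, fmt.Errorf("short buffer at %d", pos)
	}
	start := pos + n
	return table[start : start+int(l)], start + int(l), nil
}

// next decodes one entry at pos without allocating strings.
func next(table []byte, pos int) (name, value []byte, end int, err error) {
	if name, pos, err = readBytes(table, pos); err != nil {
		return nil, nil, 0, err
	}
	if value, pos, err = readBytes(table, pos); err != nil {
		return nil, nil, 0, err
	}
	return name, value, pos, nil
}

// readOnePass keeps every sampling-th value per name plus the last value.
func readOnePass(table []byte, sampling int) (map[string][]string, error) {
	if sampling <= 0 {
		return nil, fmt.Errorf("invalid sampling %d", sampling)
	}
	out := map[string][]string{}
	cur, idx := "", 0
	prevPos, prevKept := -1, true
	// seekBack re-reads the previous entry when it turned out to be the
	// last value of its name and had not been kept already.
	seekBack := func() error {
		if prevPos < 0 || prevKept {
			return nil
		}
		_, value, _, err := next(table, prevPos)
		if err != nil {
			return err
		}
		out[cur] = append(out[cur], string(value))
		return nil
	}
	for pos := 0; pos < len(table); {
		name, value, end, err := next(table, pos)
		if err != nil {
			return nil, err
		}
		if pos == 0 || string(name) != cur {
			if err := seekBack(); err != nil {
				return nil, err
			}
			cur, idx = string(name), 0
		}
		kept := idx%sampling == 0
		if kept {
			out[cur] = append(out[cur], string(value))
		}
		prevPos, prevKept = pos, kept
		idx++
		pos = end
	}
	if err := seekBack(); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	table := encodeTable([][2]string{
		{"a", "v0"}, {"a", "v1"}, {"a", "v2"}, {"a", "v3"}, {"a", "v4"},
		{"b", "w0"}, {"b", "w1"}, {"b", "w2"}, {"b", "w3"},
	})
	fmt.Println(readOnePass(table, 2))
}
```

With a real buffered file reader, each `seekBack` would be a backwards seek that may discard the read buffer, which is presumably why the benchmark results above are mixed for tables with many names and few values.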
pracucci pushed a commit to grafana/mimir that referenced this pull request Dec 20, 2022
…ookups (#3742)

* Expand index-header reader benchmarks to cover a wider range of scenarios up to 1M series.

* Don't allocate a new slice for every entry in the postings offset table.

We always expect to read exactly one key and value, so there's no need
for the slice. This dramatically improves the performance of reading
an index-header with the mmap-less index-header reader:

name                                         old time/op    new time/op    delta
NewStreamBinaryReader/1Names1Values-10          122µs ± 8%     122µs ± 5%     ~     (p=0.690 n=5+5)
NewStreamBinaryReader/1Names10Values-10         124µs ± 4%     131µs ± 9%     ~     (p=0.056 n=5+5)
NewStreamBinaryReader/1Names100Values-10        138µs ± 2%     135µs ± 3%     ~     (p=0.056 n=5+5)
NewStreamBinaryReader/1Names500Values-10        187µs ± 4%     177µs ± 2%   -5.31%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names1000Values-10       262µs ± 2%     229µs ± 2%  -12.53%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names5000Values-10       837µs ± 3%     689µs ± 1%  -17.62%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1Values-10         168µs ± 2%     169µs ± 3%     ~     (p=1.000 n=5+5)
NewStreamBinaryReader/20Names10Values-10        199µs ± 2%     194µs ± 2%   -2.56%  (p=0.032 n=5+5)
NewStreamBinaryReader/20Names100Values-10       505µs ± 1%     438µs ± 7%  -13.20%  (p=0.016 n=4+5)
NewStreamBinaryReader/20Names500Values-10      1.63ms ± 1%    1.31ms ± 0%  -19.49%  (p=0.029 n=4+4)
NewStreamBinaryReader/20Names1000Values-10     2.90ms ± 3%    2.28ms ± 2%  -21.30%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names5000Values-10     12.9ms ± 2%     9.9ms ± 2%  -22.69%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names1Values-10         276µs ± 0%     280µs ± 1%   +1.64%  (p=0.016 n=4+5)
NewStreamBinaryReader/50Names10Values-10        368µs ± 1%     347µs ± 2%   -5.82%  (p=0.016 n=4+5)
NewStreamBinaryReader/50Names100Values-10      1.10ms ± 4%    0.91ms ± 2%  -17.15%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names500Values-10      3.73ms ± 3%    2.97ms ± 2%  -20.32%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names1000Values-10     6.74ms ± 2%    5.29ms ± 2%  -21.61%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names5000Values-10     32.0ms ± 0%    24.9ms ± 1%  -22.30%  (p=0.016 n=4+5)
NewStreamBinaryReader/100Names1Values-10        547µs ± 1%     543µs ± 1%     ~     (p=0.286 n=4+5)
NewStreamBinaryReader/100Names10Values-10       728µs ± 4%     678µs ± 3%   -6.89%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names100Values-10     2.08ms ± 5%    1.73ms ± 5%  -16.87%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names500Values-10     7.13ms ± 1%    5.63ms ± 2%  -21.06%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1000Values-10    13.3ms ± 2%    10.2ms ± 2%  -23.17%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names5000Values-10    64.9ms ± 2%    49.8ms ± 1%  -23.20%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1Values-10       1.17ms ± 0%    1.16ms ± 2%     ~     (p=0.190 n=4+5)
NewStreamBinaryReader/200Names10Values-10      1.49ms ± 0%    1.39ms ± 1%   -6.68%  (p=0.029 n=4+4)
NewStreamBinaryReader/200Names100Values-10     4.05ms ± 3%    3.35ms ± 3%  -17.32%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names500Values-10     14.4ms ± 2%    11.5ms ± 1%  -20.30%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1000Values-10    27.3ms ± 2%    21.1ms ± 3%  -22.79%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names5000Values-10     131ms ± 2%     100ms ± 2%  -23.35%  (p=0.008 n=5+5)

name                                         old alloc/op   new alloc/op   delta
NewStreamBinaryReader/1Names1Values-10         3.18MB ± 0%    3.18MB ± 0%   -0.00%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names10Values-10        3.18MB ± 0%    3.18MB ± 0%   -0.01%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names100Values-10       3.18MB ± 0%    3.18MB ± 0%   -0.10%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names500Values-10       3.20MB ± 0%    3.19MB ± 0%   -0.50%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names1000Values-10      3.23MB ± 0%    3.20MB ± 0%   -0.99%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names5000Values-10      3.44MB ± 0%    3.28MB ± 0%   -4.66%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1Values-10        3.18MB ± 0%    3.18MB ± 0%   -0.02%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names10Values-10       3.19MB ± 0%    3.19MB ± 0%   -0.20%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names100Values-10      3.29MB ± 0%    3.22MB ± 0%   -1.95%  (p=0.016 n=5+4)
NewStreamBinaryReader/20Names500Values-10      3.70MB ± 0%    3.38MB ± 0%   -8.65%  (p=0.029 n=4+4)
NewStreamBinaryReader/20Names1000Values-10     4.22MB ± 0%    3.58MB ± 0%  -15.17%  (p=0.029 n=4+4)
NewStreamBinaryReader/20Names5000Values-10     8.63MB ± 0%    5.43MB ± 0%  -37.09%  (p=0.029 n=4+4)
NewStreamBinaryReader/50Names1Values-10        3.19MB ± 0%    3.19MB ± 0%   -0.05%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names10Values-10       3.21MB ± 0%    3.20MB ± 0%   -0.50%  (p=0.016 n=4+5)
NewStreamBinaryReader/50Names100Values-10      3.45MB ± 0%    3.29MB ± 0%   -4.64%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names500Values-10      4.48MB ± 0%    3.68MB ± 0%  -17.85%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names1000Values-10     5.78MB ± 0%    4.18MB ± 0%  -27.67%  (p=0.016 n=5+4)
NewStreamBinaryReader/50Names5000Values-10     17.3MB ± 0%     9.3MB ± 0%  -46.28%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1Values-10       3.20MB ± 0%    3.20MB ± 0%   -0.10%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names10Values-10      3.25MB ± 0%    3.22MB ± 0%   -0.99%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names100Values-10     3.72MB ± 0%    3.40MB ± 0%   -8.60%  (p=0.016 n=4+5)
NewStreamBinaryReader/100Names500Values-10     5.79MB ± 0%    4.19MB ± 0%  -27.65%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1000Values-10    8.39MB ± 0%    5.19MB ± 0%  -38.15%  (p=0.016 n=4+5)
NewStreamBinaryReader/100Names5000Values-10    31.7MB ± 0%    15.7MB ± 0%  -50.46%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1Values-10       3.22MB ± 0%    3.22MB ± 0%   -0.20%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names10Values-10      3.32MB ± 0%    3.26MB ± 0%   -1.93%  (p=0.029 n=4+4)
NewStreamBinaryReader/200Names100Values-10     4.27MB ± 0%    3.63MB ± 0%  -15.00%  (p=0.029 n=4+4)
NewStreamBinaryReader/200Names500Values-10     8.72MB ± 0%    5.52MB ± 0%  -36.72%  (p=0.016 n=4+5)
NewStreamBinaryReader/200Names1000Values-10    14.3MB ± 0%     7.9MB ± 0%  -44.70%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names5000Values-10    61.3MB ± 0%    29.3MB ± 0%  -52.22%  (p=0.008 n=5+5)

name                                         old allocs/op  new allocs/op  delta
NewStreamBinaryReader/1Names1Values-10           78.0 ± 0%      76.0 ± 0%   -2.56%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names10Values-10           106 ± 0%        95 ± 0%  -10.38%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names100Values-10          379 ± 0%       278 ± 0%  -26.65%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names500Values-10        1.58k ± 0%     1.08k ± 0%  -31.69%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names1000Values-10       3.08k ± 0%     2.08k ± 0%  -32.48%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names5000Values-10       15.1k ± 0%     10.1k ± 0%  -33.15%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1Values-10           175 ± 0%       154 ± 0%  -12.00%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names10Values-10          735 ± 0%       534 ± 0%  -27.35%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names100Values-10       6.20k ± 0%     4.19k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/20Names500Values-10       30.2k ± 0%     20.2k ± 0%  -33.08%  (p=0.000 n=5+4)
NewStreamBinaryReader/20Names1000Values-10      60.3k ± 0%     40.3k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/20Names5000Values-10       300k ± 0%      200k ± 0%  -33.30%  (p=0.029 n=4+4)
NewStreamBinaryReader/50Names1Values-10           329 ± 0%       278 ± 0%  -15.50%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names10Values-10        1.73k ± 0%     1.23k ± 0%  -28.98%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names100Values-10       15.4k ± 0%     10.4k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/50Names500Values-10       75.5k ± 0%     50.5k ± 0%  -33.12%  (p=0.000 n=4+5)
NewStreamBinaryReader/50Names1000Values-10       151k ± 0%      101k ± 0%  -33.22%  (p=0.029 n=4+4)
NewStreamBinaryReader/50Names5000Values-10       751k ± 0%      501k ± 0%  -33.31%  (p=0.029 n=4+4)
NewStreamBinaryReader/100Names1Values-10          582 ± 0%       481 ± 0%  -17.35%  (p=0.029 n=4+4)
NewStreamBinaryReader/100Names10Values-10       3.38k ± 0%     2.38k ± 0%  -29.61%  (p=0.000 n=5+4)
NewStreamBinaryReader/100Names100Values-10      30.7k ± 0%     20.7k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/100Names500Values-10       151k ± 0%      101k ± 0%  -33.14%  (p=0.016 n=4+5)
NewStreamBinaryReader/100Names1000Values-10      301k ± 0%      201k ± 0%  -33.23%  (p=0.029 n=4+4)
NewStreamBinaryReader/100Names5000Values-10     1.50M ± 0%     1.00M ± 0%  -33.31%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1Values-10        1.08k ± 0%     0.88k ± 0%  -18.56%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names10Values-10       6.68k ± 0%     4.68k ± 0%  -29.94%  (p=0.029 n=4+4)
NewStreamBinaryReader/200Names100Values-10      61.3k ± 0%     41.3k ± 0%  -32.64%  (p=0.029 n=4+4)
NewStreamBinaryReader/200Names500Values-10       302k ± 0%      202k ± 0%  -33.15%  (p=0.000 n=5+4)
NewStreamBinaryReader/200Names1000Values-10      602k ± 0%      402k ± 0%  -33.23%  (p=0.000 n=5+4)
NewStreamBinaryReader/200Names5000Values-10     3.00M ± 0%     2.00M ± 0%  -33.31%  (p=0.000 n=5+4)
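
As a rough illustration of the "no slice per entry" change, the sketch below reads exactly one name and one value into plain return values. It assumes a simplified entry layout (a key count that is always 2, a length-prefixed name, a length-prefixed value, then a uvarint offset), and `readEntry`, `readString`, `encodeEntry` and `appendString` are hypothetical helpers for this example, not the Decbuf API.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// appendString writes a uvarint length prefix followed by the bytes of s.
func appendString(b []byte, s string) []byte {
	var tmp [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(tmp[:], uint64(len(s)))
	return append(append(b, tmp[:n]...), s...)
}

// encodeEntry builds one entry in the assumed layout:
// key count (always 2), name, value, postings offset.
func encodeEntry(name, value string, off uint64) []byte {
	var tmp [binary.MaxVarintLen64]byte
	b := []byte{2}
	b = appendString(b, name)
	b = appendString(b, value)
	n := binary.PutUvarint(tmp[:], off)
	return append(b, tmp[:n]...)
}

// readString decodes one length-prefixed string and returns the remaining bytes.
func readString(b []byte) (string, []byte, error) {
	l, n := binary.Uvarint(b)
	if n <= 0 || uint64(len(b)-n) < l {
		return "", nil, fmt.Errorf("short buffer")
	}
	return string(b[n : n+int(l)]), b[n+int(l):], nil
}

// readEntry reads exactly one name and one value into plain return values,
// so no per-entry []string has to be allocated.
func readEntry(b []byte) (name, value string, off uint64, err error) {
	cnt, n := binary.Uvarint(b)
	if n <= 0 || cnt != 2 {
		return "", "", 0, fmt.Errorf("unexpected key count %d", cnt)
	}
	if name, b, err = readString(b[n:]); err != nil {
		return "", "", 0, err
	}
	if value, b, err = readString(b); err != nil {
		return "", "", 0, err
	}
	off, n = binary.Uvarint(b)
	if n <= 0 {
		return "", "", 0, fmt.Errorf("short buffer")
	}
	return name, value, off, nil
}

func main() {
	name, value, off, err := readEntry(encodeEntry("job", "api", 42))
	fmt.Println(name, value, off, err)
}
```

The sketch still allocates the two strings themselves. The drop of roughly one third in allocs/op shown above is consistent with removing one of three allocations per entry, though that breakdown is an inference rather than something the commit message states.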

* Improve performance of LookupSymbol by skipping over unwanted symbols.

The exact performance improvement depends on the ratio of name lookups
to value lookups. In the best case (all value lookups, no name
lookups), we save a bit over 2%:

name                                        old time/op    new time/op    delta
LookupSymbol/NameLookups0%-Parallelism1-10    8.57µs ± 0%    8.37µs ± 2%  -2.37%  (p=0.008 n=5+5)
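
The commit message does not spell out how symbols are skipped, so the following is only one plausible reading: when symbols are stored as length-prefixed strings, a lookup can read just the length of each unwanted symbol and jump over its bytes instead of decoding it. `lookupSkipping` and `encodeSymbols` are hypothetical names for this sketch.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeSymbols stores each symbol as a uvarint length followed by its bytes.
func encodeSymbols(symbols []string) []byte {
	var b []byte
	var tmp [binary.MaxVarintLen64]byte
	for _, s := range symbols {
		n := binary.PutUvarint(tmp[:], uint64(len(s)))
		b = append(append(b, tmp[:n]...), s...)
	}
	return b
}

// lookupSkipping returns the symbol at index. For every earlier symbol it
// reads only the length prefix and jumps over the bytes, so the only string
// materialized is the one being returned.
func lookupSkipping(b []byte, index int) (string, error) {
	for i := 0; ; i++ {
		l, n := binary.Uvarint(b)
		if n <= 0 || uint64(len(b)-n) < l {
			return "", fmt.Errorf("symbol %d not found", index)
		}
		if i == index {
			return string(b[n : n+int(l)]), nil
		}
		b = b[n+int(l):]
	}
}

func main() {
	b := encodeSymbols([]string{"__name__", "job", "instance"})
	fmt.Println(lookupSkipping(b, 2))
}
```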

* Update outdated comment.

* Improve index-header reading performance further by avoiding allocating unnecessary strings.

This is inspired by prometheus/prometheus#11535.

Unfortunately, we can't adopt that change as-is, as byte slices
returned by our new Decbuf.UvarintBytes() implementation are not valid
after subsequent reads - we can't take advantage of the magic of mmap.

This means that we must decide whether or not to allocate a string
for a key or value before reading any further in the file. However,
we want to store the last value for each key, but won't know if the
value is the last one until we've read the next one.

The trick is to read the table in two passes. On the first pass, we
read one in every postingOffsetsInMemSampling entries, and keep track
of the position of the last value for each key.

On the second pass, we go back and read the last values for each key.

(I've started with two passes to avoid seeking backwards and
discarding the entire file buffer every time we start reading a new
key - it may be interesting to see if discarding the buffer is as
expensive as I expect.)

This involves a trade-off: we'll scan the index-header file twice, but
gain massively reduced memory allocations. On my machine (an M1 MacBook
Pro with a fast SSD), the trade-off pays off.
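
The two passes described above can be sketched roughly as follows. This is a simplified sketch under stated assumptions, not the actual reader: entries hold only a length-prefixed name and value, an in-memory `[]byte` stands in for the index-header file (so the "bytes are invalid after the next read" constraint is not reproduced), and `readTwoPass`, `next`, `readBytes` and `encodeTable` are names invented for this example.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeTable builds a toy table: each entry is a length-prefixed name
// followed by a length-prefixed value (layout assumed for this sketch).
func encodeTable(entries [][2]string) []byte {
	var b []byte
	var tmp [binary.MaxVarintLen64]byte
	for _, e := range entries {
		for _, s := range e {
			n := binary.PutUvarint(tmp[:], uint64(len(s)))
			b = append(append(b, tmp[:n]...), s...)
		}
	}
	return b
}

// readBytes returns the length-prefixed bytes at pos and the position after them.
func readBytes(table []byte, pos int) ([]byte, int, error) {
	l, n := binary.Uvarint(table[pos:])
	if n <= 0 || uint64(len(table)-pos-n) < l {
		return nil, 0, fmt.Errorf("short buffer at %d", pos)
	}
	start := pos + n
	return table[start : start+int(l)], start + int(l), nil
}

// next decodes one entry at pos without allocating strings.
func next(table []byte, pos int) (name, value []byte, end int, err error) {
	if name, pos, err = readBytes(table, pos); err != nil {
		return nil, nil, 0, err
	}
	if value, pos, err = readBytes(table, pos); err != nil {
		return nil, nil, 0, err
	}
	return name, value, pos, nil
}

// lastEntry remembers where the most recent value of a name starts and
// whether the first pass already kept it.
type lastEntry struct {
	pos  int
	kept bool
}

// readTwoPass keeps every sampling-th value per name on the first pass,
// then goes back for the last value of each name on the second pass.
func readTwoPass(table []byte, sampling int) (map[string][]string, error) {
	if sampling <= 0 {
		return nil, fmt.Errorf("invalid sampling %d", sampling)
	}
	out := map[string][]string{}
	last := map[string]lastEntry{}
	cur, idx := "", 0
	// Pass 1: allocate a value string only for sampled entries.
	for pos := 0; pos < len(table); {
		name, value, end, err := next(table, pos)
		if err != nil {
			return nil, err
		}
		if pos == 0 || string(name) != cur {
			cur, idx = string(name), 0
		}
		kept := idx%sampling == 0
		if kept {
			out[cur] = append(out[cur], string(value))
		}
		last[cur] = lastEntry{pos: pos, kept: kept}
		idx++
		pos = end
	}
	// Pass 2: read the last value of each name unless pass 1 kept it.
	for name, l := range last {
		if l.kept {
			continue
		}
		_, value, _, err := next(table, l.pos)
		if err != nil {
			return nil, err
		}
		out[name] = append(out[name], string(value))
	}
	return out, nil
}

func main() {
	table := encodeTable([][2]string{
		{"a", "v0"}, {"a", "v1"}, {"a", "v2"}, {"a", "v3"}, {"a", "v4"},
		{"b", "w0"}, {"b", "w1"}, {"b", "w2"}, {"b", "w3"},
	})
	fmt.Println(readTwoPass(table, 2))
}
```

With a sampling of 2, the sketch keeps v0, v2 and v4 for `a` on the first pass, and w0 and w2 for `b`, then adds w3 for `b` on the second pass because it is the last value and was not sampled.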

Compared to the previous commit:

name                                         old time/op    new time/op    delta
NewStreamBinaryReader/1Names1Values-10          122µs ± 5%     129µs ±13%     ~     (p=0.151 n=5+5)
NewStreamBinaryReader/1Names10Values-10         131µs ± 9%     124µs ± 2%     ~     (p=0.056 n=5+5)
NewStreamBinaryReader/1Names100Values-10        135µs ± 3%     133µs ± 3%     ~     (p=0.548 n=5+5)
NewStreamBinaryReader/1Names500Values-10        177µs ± 2%     162µs ± 1%   -8.29%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names1000Values-10       229µs ± 2%     198µs ± 2%  -13.51%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names5000Values-10       689µs ± 1%     535µs ± 2%  -22.37%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1Values-10         169µs ± 3%     171µs ± 2%     ~     (p=0.310 n=5+5)
NewStreamBinaryReader/20Names10Values-10        194µs ± 2%     188µs ± 4%     ~     (p=0.056 n=5+5)
NewStreamBinaryReader/20Names100Values-10       438µs ± 7%     355µs ± 6%  -19.07%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names500Values-10      1.31ms ± 0%    0.94ms ± 3%  -28.29%  (p=0.016 n=4+5)
NewStreamBinaryReader/20Names1000Values-10     2.28ms ± 2%    1.62ms ± 3%  -29.16%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names5000Values-10     9.95ms ± 2%    6.71ms ± 1%  -32.57%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names1Values-10         280µs ± 1%     277µs ± 1%     ~     (p=0.095 n=5+5)
NewStreamBinaryReader/50Names10Values-10        347µs ± 2%     322µs ± 2%   -7.08%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names100Values-10       910µs ± 2%     701µs ± 1%  -22.94%  (p=0.016 n=5+4)
NewStreamBinaryReader/50Names500Values-10      2.97ms ± 2%    2.14ms ± 3%  -28.06%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names1000Values-10     5.29ms ± 2%    3.79ms ± 2%  -28.32%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names5000Values-10     24.9ms ± 1%    16.6ms ± 1%  -33.42%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1Values-10        543µs ± 1%     548µs ± 3%     ~     (p=0.548 n=5+5)
NewStreamBinaryReader/100Names10Values-10       678µs ± 3%     632µs ± 4%   -6.77%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names100Values-10     1.73ms ± 5%    1.37ms ± 5%  -21.00%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names500Values-10     5.63ms ± 2%    4.08ms ± 2%  -27.62%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1000Values-10    10.2ms ± 2%     7.3ms ± 1%  -29.01%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names5000Values-10    49.8ms ± 1%    33.4ms ± 0%  -33.04%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1Values-10       1.16ms ± 2%    1.16ms ± 2%     ~     (p=0.548 n=5+5)
NewStreamBinaryReader/200Names10Values-10      1.39ms ± 1%    1.29ms ± 2%   -6.95%  (p=0.016 n=4+5)
NewStreamBinaryReader/200Names100Values-10     3.35ms ± 3%    2.68ms ± 4%  -20.01%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names500Values-10     11.5ms ± 1%     8.0ms ± 0%  -30.78%  (p=0.016 n=5+4)
NewStreamBinaryReader/200Names1000Values-10    21.1ms ± 3%    14.5ms ± 1%  -31.39%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names5000Values-10     100ms ± 2%      67ms ± 0%  -32.81%  (p=0.008 n=5+5)

name                                         old alloc/op   new alloc/op   delta
NewStreamBinaryReader/1Names1Values-10         3.18MB ± 0%    3.18MB ± 0%   +0.00%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names10Values-10        3.18MB ± 0%    3.18MB ± 0%   -0.00%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names100Values-10       3.18MB ± 0%    3.18MB ± 0%   -0.05%  (p=0.016 n=5+4)
NewStreamBinaryReader/1Names500Values-10       3.19MB ± 0%    3.18MB ± 0%   -0.24%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names1000Values-10      3.20MB ± 0%    3.18MB ± 0%   -0.48%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names5000Values-10      3.28MB ± 0%    3.20MB ± 0%   -2.36%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1Values-10        3.18MB ± 0%    3.18MB ± 0%   +0.06%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names10Values-10       3.19MB ± 0%    3.18MB ± 0%   -0.02%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names100Values-10      3.22MB ± 0%    3.19MB ± 0%   -0.90%  (p=0.029 n=4+4)
NewStreamBinaryReader/20Names500Values-10      3.38MB ± 0%    3.23MB ± 0%   -4.54%  (p=0.016 n=4+5)
NewStreamBinaryReader/20Names1000Values-10     3.58MB ± 0%    3.27MB ± 0%   -8.61%  (p=0.029 n=4+4)
NewStreamBinaryReader/20Names5000Values-10     5.43MB ± 0%    3.56MB ± 0%  -34.41%  (p=0.016 n=4+5)
NewStreamBinaryReader/50Names1Values-10        3.19MB ± 0%    3.19MB ± 0%   +0.13%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names10Values-10       3.20MB ± 0%    3.19MB ± 0%   -0.09%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names100Values-10      3.29MB ± 0%    3.22MB ± 0%   -2.24%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names500Values-10      3.68MB ± 0%    3.30MB ± 0%  -10.44%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names1000Values-10     4.18MB ± 0%    3.41MB ± 0%  -18.44%  (p=0.016 n=4+5)
NewStreamBinaryReader/50Names5000Values-10     9.29MB ± 0%    4.13MB ± 0%  -55.48%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1Values-10       3.20MB ± 0%    3.21MB ± 0%   +0.25%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names10Values-10      3.22MB ± 0%    3.21MB ± 0%   -0.17%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names100Values-10     3.40MB ± 0%    3.26MB ± 0%   -4.32%  (p=0.016 n=5+4)
NewStreamBinaryReader/100Names500Values-10     4.19MB ± 0%    3.42MB ± 0%  -18.36%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1000Values-10    5.19MB ± 0%    3.65MB ± 0%  -29.73%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names5000Values-10    15.7MB ± 0%     5.1MB ± 0%  -67.61%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1Values-10       3.22MB ± 0%    3.23MB ± 0%   +0.51%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names10Values-10      3.26MB ± 0%    3.25MB ± 0%   -0.33%  (p=0.016 n=4+5)
NewStreamBinaryReader/200Names100Values-10     3.63MB ± 0%    3.33MB ± 0%   -8.11%  (p=0.029 n=4+4)
NewStreamBinaryReader/200Names500Values-10     5.52MB ± 0%    3.66MB ± 0%  -33.68%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1000Values-10    7.92MB ± 0%    4.12MB ± 0%  -48.02%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names5000Values-10    29.3MB ± 0%     7.0MB ± 0%  -76.10%  (p=0.016 n=5+4)

name                                         old allocs/op  new allocs/op  delta
NewStreamBinaryReader/1Names1Values-10           76.0 ± 0%      78.0 ± 0%   +2.63%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names10Values-10          95.0 ± 0%      80.0 ± 0%  -15.79%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names100Values-10          278 ± 0%        86 ± 0%  -69.06%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names500Values-10        1.08k ± 0%     0.10k ± 0%  -90.74%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names1000Values-10       2.08k ± 0%     0.12k ± 0%  -94.38%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names5000Values-10       10.1k ± 0%      0.2k ± 0%  -97.58%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1Values-10           154 ± 0%       160 ± 0%   +3.90%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names10Values-10          534 ± 0%       200 ± 0%  -62.55%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names100Values-10       4.19k ± 0%     0.32k ± 0%  -92.37%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names500Values-10       20.2k ± 0%      0.6k ± 0%  -97.03%  (p=0.029 n=4+4)
NewStreamBinaryReader/20Names1000Values-10      40.3k ± 0%      0.9k ± 0%  -97.66%  (p=0.000 n=5+4)
NewStreamBinaryReader/20Names5000Values-10       200k ± 0%        3k ± 0%  -98.26%  (p=0.029 n=4+4)
NewStreamBinaryReader/50Names1Values-10           278 ± 0%       285 ± 0%   +2.52%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names10Values-10        1.23k ± 0%     0.39k ± 0%  -68.65%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names100Values-10       10.4k ± 0%      0.7k ± 0%  -93.40%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names500Values-10       50.5k ± 0%      1.4k ± 0%  -97.26%  (p=0.000 n=5+4)
NewStreamBinaryReader/50Names1000Values-10       101k ± 0%        2k ± 0%  -97.78%  (p=0.029 n=4+4)
NewStreamBinaryReader/50Names5000Values-10       501k ± 0%        9k ± 0%  -98.28%  (p=0.016 n=4+5)
NewStreamBinaryReader/100Names1Values-10          481 ± 0%       489 ± 0%   +1.66%  (p=0.029 n=4+4)
NewStreamBinaryReader/100Names10Values-10       2.38k ± 0%     0.69k ± 0%  -71.05%  (p=0.000 n=4+5)
NewStreamBinaryReader/100Names100Values-10      20.7k ± 0%      1.3k ± 0%  -93.76%  (p=0.000 n=5+4)
NewStreamBinaryReader/100Names500Values-10       101k ± 0%        3k ± 0%  -97.33%  (p=0.016 n=5+4)
NewStreamBinaryReader/100Names1000Values-10      201k ± 0%        4k ± 0%  -97.82%  (p=0.029 n=4+4)
NewStreamBinaryReader/100Names5000Values-10     1.00M ± 0%     0.02M ± 0%  -98.29%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1Values-10          882 ± 0%       891 ± 0%   +1.02%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names10Values-10       4.68k ± 0%     1.29k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/200Names100Values-10      41.3k ± 0%      2.5k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/200Names500Values-10       202k ± 0%        5k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/200Names1000Values-10      402k ± 0%        9k ± 0%  -97.84%  (p=0.000 n=4+5)
NewStreamBinaryReader/200Names5000Values-10     2.00M ± 0%     0.03M ± 0%  -98.30%  (p=0.016 n=4+5)

...and compared to bada69c:

name                                         old time/op    new time/op    delta
NewStreamBinaryReader/1Names1Values-10          122µs ± 8%     129µs ±13%     ~     (p=0.151 n=5+5)
NewStreamBinaryReader/1Names10Values-10         124µs ± 4%     124µs ± 2%     ~     (p=0.421 n=5+5)
NewStreamBinaryReader/1Names100Values-10        138µs ± 2%     133µs ± 3%   -3.45%  (p=0.016 n=5+5)
NewStreamBinaryReader/1Names500Values-10        187µs ± 4%     162µs ± 1%  -13.16%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names1000Values-10       262µs ± 2%     198µs ± 2%  -24.35%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names5000Values-10       837µs ± 3%     535µs ± 2%  -36.05%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1Values-10         168µs ± 2%     171µs ± 2%     ~     (p=0.056 n=5+5)
NewStreamBinaryReader/20Names10Values-10        199µs ± 2%     188µs ± 4%   -5.37%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names100Values-10       505µs ± 1%     355µs ± 6%  -29.75%  (p=0.016 n=4+5)
NewStreamBinaryReader/20Names500Values-10      1.63ms ± 1%    0.94ms ± 3%  -42.27%  (p=0.016 n=4+5)
NewStreamBinaryReader/20Names1000Values-10     2.90ms ± 3%    1.62ms ± 3%  -44.24%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names5000Values-10     12.9ms ± 2%     6.7ms ± 1%  -47.87%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names1Values-10         276µs ± 0%     277µs ± 1%     ~     (p=0.286 n=4+5)
NewStreamBinaryReader/50Names10Values-10        368µs ± 1%     322µs ± 2%  -12.49%  (p=0.016 n=4+5)
NewStreamBinaryReader/50Names100Values-10      1.10ms ± 4%    0.70ms ± 1%  -36.16%  (p=0.016 n=5+4)
NewStreamBinaryReader/50Names500Values-10      3.73ms ± 3%    2.14ms ± 3%  -42.68%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names1000Values-10     6.74ms ± 2%    3.79ms ± 2%  -43.81%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names5000Values-10     32.0ms ± 0%    16.6ms ± 1%  -48.27%  (p=0.016 n=4+5)
NewStreamBinaryReader/100Names1Values-10        547µs ± 1%     548µs ± 3%     ~     (p=0.413 n=4+5)
NewStreamBinaryReader/100Names10Values-10       728µs ± 4%     632µs ± 4%  -13.19%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names100Values-10     2.08ms ± 5%    1.37ms ± 5%  -34.32%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names500Values-10     7.13ms ± 1%    4.08ms ± 2%  -42.86%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1000Values-10    13.3ms ± 2%     7.3ms ± 1%  -45.46%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names5000Values-10    64.9ms ± 2%    33.4ms ± 0%  -48.57%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1Values-10       1.17ms ± 0%    1.16ms ± 2%     ~     (p=0.190 n=4+5)
NewStreamBinaryReader/200Names10Values-10      1.49ms ± 0%    1.29ms ± 2%  -13.17%  (p=0.016 n=4+5)
NewStreamBinaryReader/200Names100Values-10     4.05ms ± 3%    2.68ms ± 4%  -33.87%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names500Values-10     14.4ms ± 2%     8.0ms ± 0%  -44.83%  (p=0.016 n=5+4)
NewStreamBinaryReader/200Names1000Values-10    27.3ms ± 2%    14.5ms ± 1%  -47.02%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names5000Values-10     131ms ± 2%      67ms ± 0%  -48.50%  (p=0.008 n=5+5)

name                                         old alloc/op   new alloc/op   delta
NewStreamBinaryReader/1Names1Values-10         3.18MB ± 0%    3.18MB ± 0%   +0.00%  (p=0.032 n=5+5)
NewStreamBinaryReader/1Names10Values-10        3.18MB ± 0%    3.18MB ± 0%   -0.01%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names100Values-10       3.18MB ± 0%    3.18MB ± 0%   -0.15%  (p=0.016 n=5+4)
NewStreamBinaryReader/1Names500Values-10       3.20MB ± 0%    3.18MB ± 0%   -0.74%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names1000Values-10      3.23MB ± 0%    3.18MB ± 0%   -1.47%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names5000Values-10      3.44MB ± 0%    3.20MB ± 0%   -6.91%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1Values-10        3.18MB ± 0%    3.18MB ± 0%   +0.04%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names10Values-10       3.19MB ± 0%    3.18MB ± 0%   -0.22%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names100Values-10      3.29MB ± 0%    3.19MB ± 0%   -2.83%  (p=0.016 n=5+4)
NewStreamBinaryReader/20Names500Values-10      3.70MB ± 0%    3.23MB ± 0%  -12.80%  (p=0.016 n=4+5)
NewStreamBinaryReader/20Names1000Values-10     4.22MB ± 0%    3.27MB ± 0%  -22.47%  (p=0.029 n=4+4)
NewStreamBinaryReader/20Names5000Values-10     8.63MB ± 0%    3.56MB ± 0%  -58.73%  (p=0.016 n=4+5)
NewStreamBinaryReader/50Names1Values-10        3.19MB ± 0%    3.19MB ± 0%   +0.08%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names10Values-10       3.21MB ± 0%    3.19MB ± 0%   -0.58%  (p=0.016 n=4+5)
NewStreamBinaryReader/50Names100Values-10      3.45MB ± 0%    3.22MB ± 0%   -6.77%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names500Values-10      4.48MB ± 0%    3.30MB ± 0%  -26.43%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names1000Values-10     5.78MB ± 0%    3.41MB ± 0%  -41.01%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names5000Values-10     17.3MB ± 0%     4.1MB ± 0%  -76.09%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1Values-10       3.20MB ± 0%    3.21MB ± 0%   +0.15%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names10Values-10      3.25MB ± 0%    3.21MB ± 0%   -1.15%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names100Values-10     3.72MB ± 0%    3.26MB ± 0%  -12.55%  (p=0.029 n=4+4)
NewStreamBinaryReader/100Names500Values-10     5.79MB ± 0%    3.42MB ± 0%  -40.94%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1000Values-10    8.39MB ± 0%    3.65MB ± 0%  -56.54%  (p=0.016 n=4+5)
NewStreamBinaryReader/100Names5000Values-10    31.7MB ± 0%     5.1MB ± 0%  -83.95%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1Values-10       3.22MB ± 0%    3.23MB ± 0%   +0.31%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names10Values-10      3.32MB ± 0%    3.25MB ± 0%   -2.26%  (p=0.016 n=4+5)
NewStreamBinaryReader/200Names100Values-10     4.27MB ± 0%    3.33MB ± 0%  -21.89%  (p=0.029 n=4+4)
NewStreamBinaryReader/200Names500Values-10     8.72MB ± 0%    3.66MB ± 0%  -58.03%  (p=0.016 n=4+5)
NewStreamBinaryReader/200Names1000Values-10    14.3MB ± 0%     4.1MB ± 0%  -71.26%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names5000Values-10    61.3MB ± 0%     7.0MB ± 0%  -88.58%  (p=0.016 n=5+4)

name                                         old allocs/op  new allocs/op  delta
NewStreamBinaryReader/1Names1Values-10           78.0 ± 0%      78.0 ± 0%     ~     (all equal)
NewStreamBinaryReader/1Names10Values-10           106 ± 0%        80 ± 0%  -24.53%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names100Values-10          379 ± 0%        86 ± 0%  -77.31%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names500Values-10        1.58k ± 0%     0.10k ± 0%  -93.67%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names1000Values-10       3.08k ± 0%     0.12k ± 0%  -96.20%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names5000Values-10       15.1k ± 0%      0.2k ± 0%  -98.38%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1Values-10           175 ± 0%       160 ± 0%   -8.57%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names10Values-10          735 ± 0%       200 ± 0%  -72.79%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names100Values-10       6.20k ± 0%     0.32k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/20Names500Values-10       30.2k ± 0%      0.6k ± 0%  -98.02%  (p=0.000 n=5+4)
NewStreamBinaryReader/20Names1000Values-10      60.3k ± 0%      0.9k ± 0%  -98.44%  (p=0.029 n=4+4)
NewStreamBinaryReader/20Names5000Values-10       300k ± 0%        3k ± 0%  -98.84%  (p=0.029 n=4+4)
NewStreamBinaryReader/50Names1Values-10           329 ± 0%       285 ± 0%  -13.37%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names10Values-10        1.73k ± 0%     0.39k ± 0%  -77.73%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names100Values-10       15.4k ± 0%      0.7k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/50Names500Values-10       75.5k ± 0%      1.4k ± 0%  -98.17%  (p=0.029 n=4+4)
NewStreamBinaryReader/50Names1000Values-10       151k ± 0%        2k ± 0%  -98.52%  (p=0.029 n=4+4)
NewStreamBinaryReader/50Names5000Values-10       751k ± 0%        9k ± 0%  -98.86%  (p=0.016 n=4+5)
NewStreamBinaryReader/100Names1Values-10          582 ± 0%       489 ± 0%  -15.98%  (p=0.029 n=4+4)
NewStreamBinaryReader/100Names10Values-10       3.38k ± 0%     0.69k ± 0%  -79.62%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names100Values-10      30.7k ± 0%      1.3k ± 0%  -95.80%  (p=0.029 n=4+4)
NewStreamBinaryReader/100Names500Values-10       151k ± 0%        3k ± 0%  -98.22%  (p=0.029 n=4+4)
NewStreamBinaryReader/100Names1000Values-10      301k ± 0%        4k ± 0%  -98.54%  (p=0.029 n=4+4)
NewStreamBinaryReader/100Names5000Values-10     1.50M ± 0%     0.02M ± 0%  -98.86%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1Values-10        1.08k ± 0%     0.89k ± 0%  -17.73%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names10Values-10       6.68k ± 0%     1.29k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/200Names100Values-10      61.3k ± 0%      2.5k ± 0%     ~     (p=0.079 n=4+5)
NewStreamBinaryReader/200Names500Values-10       302k ± 0%        5k ± 0%  -98.25%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1000Values-10      602k ± 0%        9k ± 0%  -98.56%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names5000Values-10     3.00M ± 0%     0.03M ± 0%  -98.86%  (p=0.008 n=5+5)

Read postings offset table in one pass by seeking back to previous values when required.

On my machine with an SSD, this produces mixed results, but it seems to
improve things for index-headers with a high number of values and
relatively few names:

name                                         old time/op    new time/op    delta
NewStreamBinaryReader/1Names1Values-10          129µs ±13%     136µs ± 4%     ~     (p=0.151 n=5+5)
NewStreamBinaryReader/1Names10Values-10         124µs ± 2%     129µs ± 7%   +3.75%  (p=0.032 n=5+5)
NewStreamBinaryReader/1Names100Values-10        133µs ± 3%     135µs ± 1%     ~     (p=0.421 n=5+5)
NewStreamBinaryReader/1Names500Values-10        162µs ± 1%     164µs ± 2%     ~     (p=0.421 n=5+5)
NewStreamBinaryReader/1Names1000Values-10       198µs ± 2%     198µs ± 2%     ~     (p=1.000 n=5+5)
NewStreamBinaryReader/1Names5000Values-10       535µs ± 2%     518µs ± 2%   -3.13%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1Values-10         171µs ± 2%     171µs ± 1%     ~     (p=0.841 n=5+5)
NewStreamBinaryReader/20Names10Values-10        188µs ± 4%     208µs ± 2%  +10.17%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names100Values-10       355µs ± 6%     383µs ± 2%   +8.03%  (p=0.016 n=5+5)
NewStreamBinaryReader/20Names500Values-10       941µs ± 3%     932µs ± 3%     ~     (p=0.421 n=5+5)
NewStreamBinaryReader/20Names1000Values-10     1.62ms ± 3%    1.57ms ± 3%     ~     (p=0.095 n=5+5)
NewStreamBinaryReader/20Names5000Values-10     6.71ms ± 1%    6.33ms ± 1%   -5.57%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names1Values-10         277µs ± 1%     291µs ± 5%   +5.10%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names10Values-10        322µs ± 2%     394µs ± 5%  +22.15%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names100Values-10       701µs ± 1%     730µs ± 2%   +4.11%  (p=0.016 n=4+5)
NewStreamBinaryReader/50Names500Values-10      2.14ms ± 3%    2.08ms ± 3%     ~     (p=0.095 n=5+5)
NewStreamBinaryReader/50Names1000Values-10     3.79ms ± 2%    3.63ms ± 2%   -4.09%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names5000Values-10     16.6ms ± 1%    15.5ms ± 0%   -6.59%  (p=0.016 n=5+4)
NewStreamBinaryReader/100Names1Values-10        548µs ± 3%     542µs ± 3%     ~     (p=0.095 n=5+5)
NewStreamBinaryReader/100Names10Values-10       632µs ± 4%     741µs ± 3%  +17.27%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names100Values-10     1.37ms ± 5%    1.40ms ± 5%     ~     (p=0.222 n=5+5)
NewStreamBinaryReader/100Names500Values-10     4.08ms ± 2%    3.97ms ± 2%   -2.62%  (p=0.016 n=5+5)
NewStreamBinaryReader/100Names1000Values-10    7.27ms ± 1%    6.96ms ± 1%   -4.25%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names5000Values-10    33.4ms ± 0%    31.1ms ± 0%   -6.73%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names1Values-10       1.16ms ± 2%    1.16ms ± 3%     ~     (p=1.000 n=5+5)
NewStreamBinaryReader/200Names10Values-10      1.29ms ± 2%    1.57ms ± 4%  +21.49%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names100Values-10     2.68ms ± 4%    2.73ms ± 3%     ~     (p=0.095 n=5+5)
NewStreamBinaryReader/200Names500Values-10     7.97ms ± 0%    7.64ms ± 1%   -4.09%  (p=0.016 n=4+5)
NewStreamBinaryReader/200Names1000Values-10    14.5ms ± 1%    13.7ms ± 1%   -5.30%  (p=0.008 n=5+5)
NewStreamBinaryReader/200Names5000Values-10    67.3ms ± 0%    62.5ms ± 1%   -7.15%  (p=0.008 n=5+5)
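
For readers unfamiliar with the idea, here is a minimal sketch of one way a "single pass with seek back" can work. It is not the code from this change. It assumes a sorted table where every `rate`-th value of a name is kept in memory and the last value of each name must always be kept. The row layout (uvarint-prefixed name, uvarint-prefixed value, uvarint offset) and all identifiers (`entry`, `appendEntry`, `sampleOnePass`, and so on) are hypothetical. `binary.AppendUvarint` needs Go 1.19 or later.

```go
package main

import (
	"bytes"
	"encoding/binary"
)

// entry is a hypothetical decoded row: label name, label value, postings offset.
type entry struct {
	name, value string
	offset      uint64
}

// appendEntry encodes one row using the assumed layout described above.
func appendEntry(buf []byte, name, value string, off uint64) []byte {
	buf = binary.AppendUvarint(buf, uint64(len(name)))
	buf = append(buf, name...)
	buf = binary.AppendUvarint(buf, uint64(len(value)))
	buf = append(buf, value...)
	return binary.AppendUvarint(buf, off)
}

// readBytes returns a uvarint-prefixed field that aliases buf, plus the next position.
// The helpers in this sketch assume well-formed input and do no bounds checking.
func readBytes(buf []byte, pos int) ([]byte, int) {
	n, w := binary.Uvarint(buf[pos:])
	pos += w
	return buf[pos : pos+int(n)], pos + int(n)
}

// decodeEntry fully decodes the row starting at pos, copying name and value into strings.
func decodeEntry(buf []byte, pos int) (entry, int) {
	name, pos := readBytes(buf, pos)
	value, pos := readBytes(buf, pos)
	off, w := binary.Uvarint(buf[pos:])
	return entry{string(name), string(value), off}, pos + w
}

// skipEntryTail skips the value and offset of a row whose name was already read.
func skipEntryTail(buf []byte, pos int) int {
	n, w := binary.Uvarint(buf[pos:])
	pos += w + int(n)
	_, w = binary.Uvarint(buf[pos:])
	return pos + w
}

// sampleOnePass keeps every rate-th value of each name (rate must be > 0) and the
// last value of each name. The last value is only known once the next name shows
// up, so the start of the previous row is remembered and re-read when required.
func sampleOnePass(buf []byte, count, rate int) []entry {
	var (
		out       []entry
		curName   []byte // aliases buf; not retained after returning
		pos       int
		prevStart int
		prevKept  bool
		idx       int // index of the current value within its name
	)
	for i := 0; i < count; i++ {
		start := pos
		name, afterName := readBytes(buf, pos)
		if i > 0 && !bytes.Equal(name, curName) {
			if !prevKept {
				// Seek back: the previous row was the last value of the previous name.
				last, _ := decodeEntry(buf, prevStart)
				out = append(out, last)
			}
			idx = 0
		}
		curName = name
		if idx%rate == 0 {
			var e entry
			e, pos = decodeEntry(buf, start)
			out = append(out, e)
		} else {
			pos = skipEntryTail(buf, afterName)
		}
		prevKept = idx%rate == 0
		prevStart = start
		idx++
	}
	if count > 0 && !prevKept {
		last, _ := decodeEntry(buf, prevStart)
		out = append(out, last)
	}
	return out
}
```

Under these assumptions, a seek back happens at most once per name, and only when the last value of that name was not already sampled. That is one possible reason the trade-off could depend on the ratio of names to values, although the sketch says nothing about real disk behaviour.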

* Remove unnecessary assignment.

* Use correct terminology.

* Fix incorrect parameter name.

* Incorporate PR feedback: rename variable to reflect that it needs to be treated carefully.

* Incorporate PR feedback: rename Decbuf.UvarintBytes to UnsafeUvarintBytes()

* Incorporate PR feedback: add comment to explain logic.

#3742 (comment)

* Make use of SkipUvarintBytes in NewSymbols.

This reduces the time taken to load an index-header by roughly 2-4% for
realistic scenarios:

name                                         old time/op    new time/op    delta
NewStreamBinaryReader/1Names1Values-10          128µs ± 1%     123µs ± 2%  -4.25%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names10Values-10         128µs ± 1%     122µs ± 1%  -4.95%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names100Values-10        137µs ± 3%     133µs ± 1%  -2.84%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names500Values-10        169µs ± 4%     161µs ± 1%  -4.87%  (p=0.008 n=5+5)
NewStreamBinaryReader/1Names1000Values-10       201µs ± 3%     192µs ± 3%  -4.47%  (p=0.016 n=5+5)
NewStreamBinaryReader/1Names5000Values-10       541µs ± 5%     494µs ± 2%  -8.64%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1Values-10         178µs ± 4%     169µs ± 1%  -4.89%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names10Values-10        216µs ± 4%     200µs ± 0%  -7.14%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names100Values-10       374µs ± 5%     360µs ± 5%    ~     (p=0.095 n=5+5)
NewStreamBinaryReader/20Names500Values-10       955µs ± 1%     907µs ± 4%  -5.05%  (p=0.008 n=5+5)
NewStreamBinaryReader/20Names1000Values-10     1.60ms ± 0%    1.54ms ± 3%    ~     (p=0.063 n=4+5)
NewStreamBinaryReader/20Names5000Values-10     6.56ms ± 3%    6.26ms ± 1%  -4.64%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names1Values-10         297µs ± 4%     276µs ± 1%  -7.14%  (p=0.008 n=5+5)
NewStreamBinaryReader/50Names10Values-10        401µs ± 6%     384µs ± 4%  -4.25%  (p=0.032 n=5+5)
NewStreamBinaryReader/50Names100Values-10       738µs ± 3%     716µs ± 2%  -2.97%  (p=0.032 n=5+5)
NewStreamBinaryReader/50Names500Values-10      2.10ms ± 3%    2.06ms ± 3%    ~     (p=0.151 n=5+5)
NewStreamBinaryReader/50Names1000Values-10     3.68ms ± 2%    3.59ms ± 2%    ~     (p=0.056 n=5+5)
NewStreamBinaryReader/50Names5000Values-10     15.8ms ± 2%    15.4ms ± 1%  -2.77%  (p=0.008 n=5+5)
NewStreamBinaryReader/100Names1Values-10        556µs ± 6%     534µs ± 0%  -4.09%  (p=0.016 n=5+4)
NewStreamBinaryReader/100Names10Values-10       753µs ± 1%     736µs ± 2%  -2.18%  (p=0.032 n=5+5)
NewStreamBinaryReader/100Names100Values-10     1.40ms ± 4%    1.38ms ± 4%    ~     (p=0.222 n=5+5)
NewStreamBinaryReader/100Names500Values-10     3.95ms ± 3%    3.91ms ± 1%    ~     (p=0.310 n=5+5)
NewStreamBinaryReader/100Names1000Values-10    6.92ms ± 1%    6.92ms ± 1%    ~     (p=1.000 n=5+5)
NewStreamBinaryReader/100Names5000Values-10    30.9ms ± 0%    30.8ms ± 1%    ~     (p=0.095 n=5+5)
NewStreamBinaryReader/200Names1Values-10       1.17ms ± 6%    1.14ms ± 0%  -2.33%  (p=0.032 n=5+4)
NewStreamBinaryReader/200Names10Values-10      1.54ms ± 6%    1.84ms ±43%    ~     (p=0.548 n=5+5)
NewStreamBinaryReader/200Names100Values-10     2.73ms ± 3%    2.74ms ± 3%    ~     (p=0.421 n=5+5)
NewStreamBinaryReader/200Names500Values-10     7.62ms ± 1%    7.58ms ± 1%    ~     (p=0.151 n=5+5)
NewStreamBinaryReader/200Names1000Values-10    13.7ms ± 1%    13.7ms ± 1%    ~     (p=0.222 n=5+5)
NewStreamBinaryReader/200Names5000Values-10    62.4ms ± 1%    61.9ms ± 0%  -0.79%  (p=0.029 n=4+4)
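
To illustrate the difference between reading and skipping a uvarint-length-prefixed field, here is a minimal sketch over an in-memory slice. It is not the `Decbuf` from this change: the `decbuf` type, its lowercase method names, and `sampleSymbolOffsets` are hypothetical stand-ins, and the idea that a symbol table only records the offset of every `every`-th symbol is an assumption made for the example.

```go
package main

import (
	"encoding/binary"
	"errors"
)

var errInvalidSize = errors.New("invalid size")

// decbuf is a hypothetical, simplified decoding buffer with a sticky error.
type decbuf struct {
	b   []byte
	err error
}

// readLen consumes a uvarint length prefix and checks that the field fits in the buffer.
func (d *decbuf) readLen() (int, bool) {
	if d.err != nil {
		return 0, false
	}
	n, w := binary.Uvarint(d.b)
	if w <= 0 || n > uint64(len(d.b)-w) {
		d.err = errInvalidSize
		return 0, false
	}
	d.b = d.b[w:]
	return int(n), true
}

// unsafeUvarintBytes returns the next field without copying. The result aliases
// the backing buffer, so callers must copy it (for example with string(...))
// before the buffer is reused.
func (d *decbuf) unsafeUvarintBytes() []byte {
	n, ok := d.readLen()
	if !ok {
		return nil
	}
	out := d.b[:n]
	d.b = d.b[n:]
	return out
}

// skipUvarintBytes advances past the next field without exposing its contents.
func (d *decbuf) skipUvarintBytes() {
	if n, ok := d.readLen(); ok {
		d.b = d.b[n:]
	}
}

// sampleSymbolOffsets records the byte offset of every every-th symbol
// (every must be > 0) and skips over the contents of all symbols.
func sampleSymbolOffsets(table []byte, count, every int) ([]int, error) {
	d := &decbuf{b: table}
	offsets := make([]int, 0, count/every+1)
	for i := 0; i < count; i++ {
		if i%every == 0 {
			offsets = append(offsets, len(table)-len(d.b))
		}
		d.skipUvarintBytes()
	}
	return offsets, d.err
}
```

In this in-memory sketch, skipping and reading cost about the same. Any saving from a dedicated skip would come from a reader that otherwise has to materialise the bytes, which is an assumption about the real reader rather than something the sketch demonstrates.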

* Address PR feedback: reduce scope of variable

Co-authored-by: Oleg Zaytsev <mail@olegzaytsev.com>
