Skip to content
This repository has been archived by the owner on Aug 23, 2023. It is now read-only.

ROB: Fix underflow bug and reduce allocations #2006

Merged
merged 6 commits into from
Nov 26, 2021

Conversation

shanson7
Copy link
Collaborator

Underflow Bug

This was not my main goal, but was breaking the benchmarks. This bug is unlikely in production because it requires very old timestamps.

The line rob.buf[rob.newest].Ts-(uint32(cap(rob.buf))*rob.interval can cause an underflow when rob.buf[rob.newest].Ts is smaller than cap(rob.buf)*rob.interval and will wrap around to a very large number. This incorrectly returns and error (metric too old). The benchmarks all start at Ts=1 so they were never actually getting beyond this check. I added a test case for this as well.

Reduce Allocations

One of the top allocators in our Metrictank deploy is the ROB. It makes an allocation if any point is flushed during Add. This is due to how it returns the flushed values from Add in a slice. However, in the most common case data comes in order and will flush exactly 1 point. This change is to return a schema.Point along with the slice. If a single point needs to be flushed, no slice needs to be allocated. If more than one is flushed, then we still need to allocate the slice. However, I also pre-allocate the slice to the size needed, which speeds up even the out-of-order cases.

This change does make the code a little uglier, but it is quite a bit faster and reduces allocations.

name                                   old time/op    new time/op    delta
ROB10AddDisallowUpdate-12                49.8ns ± 2%    20.7ns ± 7%   -58.43%  (p=0.000 n=18+20)
ROB120AddDisallowUpdate-12               50.5ns ± 3%    21.0ns ± 7%   -58.39%  (p=0.000 n=20+20)
ROB600AddDisallowUpdate-12               50.0ns ± 2%    19.7ns ± 2%   -60.55%  (p=0.000 n=20+17)
ROB10AddShuffled5DisallowUpdate-12       46.5ns ± 1%    26.4ns ± 6%   -43.18%  (p=0.000 n=17+18)
ROB120AddShuffled5DisallowUpdate-12      46.7ns ± 1%    26.1ns ± 1%   -44.08%  (p=0.000 n=19+20)
ROB600AddShuffled5DisallowUpdate-12      46.6ns ± 1%    26.1ns ± 2%   -44.10%  (p=0.000 n=19+20)
ROB10AddShuffled50DisallowUpdate-12      12.6ns ± 2%    10.5ns ± 6%   -16.87%  (p=0.000 n=17+20)
ROB120AddShuffled50DisallowUpdate-12     31.5ns ± 3%    26.3ns ± 2%   -16.41%  (p=0.000 n=19+18)
ROB600AddShuffled50DisallowUpdate-12     31.9ns ± 3%    26.1ns ± 3%   -18.18%  (p=0.000 n=19+20)
ROB10AddShuffled500DisallowUpdate-12     6.43ns ± 9%    6.34ns ± 6%      ~     (p=0.262 n=20+20)
ROB120AddShuffled500DisallowUpdate-12    11.5ns ± 6%    10.4ns ± 3%    -9.87%  (p=0.000 n=20+19)
ROB600AddShuffled500DisallowUpdate-12    28.3ns ± 6%    25.5ns ± 1%   -10.15%  (p=0.000 n=20+17)
ROB10AddAllowUpdate-12                   50.2ns ± 2%    19.6ns ± 1%   -60.91%  (p=0.000 n=19+19)
ROB120AddAllowUpdate-12                  50.4ns ± 1%    19.8ns ± 3%   -60.73%  (p=0.000 n=20+19)
ROB600AddAllowUpdate-12                  50.4ns ± 2%    19.8ns ± 2%   -60.67%  (p=0.000 n=20+18)
ROB10AddShuffled5AllowUpdate-12          47.1ns ± 3%    26.4ns ± 3%   -44.00%  (p=0.000 n=19+18)
ROB120AddShuffled5AllowUpdate-12         46.7ns ± 3%    26.2ns ± 1%   -44.00%  (p=0.000 n=20+19)
ROB600AddShuffled5AllowUpdate-12         47.1ns ± 3%    26.9ns ± 6%   -43.02%  (p=0.000 n=18+20)
ROB10AddShuffled50AllowUpdate-12         13.1ns ± 5%    10.9ns ±10%   -17.14%  (p=0.000 n=20+20)
ROB120AddShuffled50AllowUpdate-12        33.6ns ±10%    26.5ns ± 3%   -21.16%  (p=0.000 n=17+19)
ROB600AddShuffled50AllowUpdate-12        32.4ns ± 6%    26.4ns ± 3%   -18.67%  (p=0.000 n=20+20)
ROB10AddShuffled500AllowUpdate-12        6.25ns ± 3%    6.18ns ± 5%      ~     (p=0.072 n=17+20)
ROB120AddShuffled500AllowUpdate-12       11.2ns ± 4%    10.5ns ± 2%    -6.97%  (p=0.000 n=20+18)
ROB600AddShuffled500AllowUpdate-12       28.0ns ± 6%    25.7ns ± 3%    -8.41%  (p=0.000 n=19+20)

name                                   old alloc/op   new alloc/op   delta
ROB10AddDisallowUpdate-12                 15.0B ± 0%      0.0B       -100.00%  (p=0.000 n=20+20)
ROB120AddDisallowUpdate-12                15.0B ± 0%      0.0B       -100.00%  (p=0.000 n=20+20)
ROB600AddDisallowUpdate-12                15.0B ± 0%      0.0B       -100.00%  (p=0.000 n=20+20)
ROB10AddShuffled5DisallowUpdate-12        47.0B ± 0%     12.0B ± 0%   -74.47%  (p=0.000 n=20+20)
ROB120AddShuffled5DisallowUpdate-12       47.0B ± 0%     12.0B ± 0%   -74.47%  (p=0.000 n=20+20)
ROB600AddShuffled5DisallowUpdate-12       47.0B ± 0%     12.0B ± 0%   -74.47%  (p=0.000 n=20+20)
ROB10AddShuffled50DisallowUpdate-12       9.00B ± 0%     2.00B ± 0%   -77.78%  (p=0.000 n=20+20)
ROB120AddShuffled50DisallowUpdate-12      40.0B ± 0%     17.0B ± 0%   -57.50%  (p=0.000 n=20+20)
ROB600AddShuffled50DisallowUpdate-12      40.0B ± 0%     17.0B ± 0%   -57.50%  (p=0.000 n=20+20)
ROB10AddShuffled500DisallowUpdate-12      0.00B          0.00B           ~     (all equal)
ROB120AddShuffled500DisallowUpdate-12     8.00B ± 0%     4.00B ± 0%   -50.00%  (p=0.000 n=20+20)
ROB600AddShuffled500DisallowUpdate-12     32.0B ± 0%     16.0B ± 0%   -50.00%  (p=0.000 n=20+20)
ROB10AddAllowUpdate-12                    15.0B ± 0%      0.0B       -100.00%  (p=0.000 n=20+20)
ROB120AddAllowUpdate-12                   15.0B ± 0%      0.0B       -100.00%  (p=0.000 n=20+20)
ROB600AddAllowUpdate-12                   15.0B ± 0%      0.0B       -100.00%  (p=0.000 n=20+20)
ROB10AddShuffled5AllowUpdate-12           47.0B ± 0%     12.0B ± 0%   -74.47%  (p=0.000 n=20+20)
ROB120AddShuffled5AllowUpdate-12          47.0B ± 0%     12.0B ± 0%   -74.47%  (p=0.000 n=20+20)
ROB600AddShuffled5AllowUpdate-12          47.0B ± 0%     12.0B ± 0%   -74.47%  (p=0.000 n=20+20)
ROB10AddShuffled50AllowUpdate-12          9.00B ± 0%     2.00B ± 0%   -77.78%  (p=0.000 n=20+20)
ROB120AddShuffled50AllowUpdate-12         40.0B ± 0%     17.0B ± 0%   -57.50%  (p=0.000 n=20+20)
ROB600AddShuffled50AllowUpdate-12         40.0B ± 0%     17.0B ± 0%   -57.50%  (p=0.000 n=20+20)
ROB10AddShuffled500AllowUpdate-12         0.00B          0.00B           ~     (all equal)
ROB120AddShuffled500AllowUpdate-12        8.00B ± 0%     4.00B ± 0%   -50.00%  (p=0.000 n=20+20)
ROB600AddShuffled500AllowUpdate-12        32.0B ± 0%     16.0B ± 0%   -50.00%  (p=0.000 n=20+20)

mdata/reorder_buffer.go Outdated Show resolved Hide resolved
Copy link
Contributor

@Dieterbe Dieterbe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also (in addition to your solution, or as an alternative to your solution) allocate the slice of points at ROB creation time. and reuse that same slice every time we need it to return the slice out of Add().

I suppose the issue with that is we're holding on to a bunch of memory we only use occasionally. That could in turn be solved by each ROB sharing a pool of slices, but at that point, I prefer your approach.

@shanson7
Copy link
Collaborator Author

We could also (in addition to your solution, or as an alternative to your solution) allocate the slice of points at ROB creation time. and reuse that same slice every time we need it to return the slice out of Add().

I suppose the issue with that is we're holding on to a bunch of memory we only use occasionally. That could in turn be solved by each ROB sharing a pool of slices, but at that point, I prefer your approach.

Yeah, that was my first step but the extra memory was too much (basically doubling the ROB size). It's also a slightly awkward interface from a multi-threaded context, though that isn't a big deal for this code.

Copy link
Contributor

@Dieterbe Dieterbe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice work Sean.

@Dieterbe Dieterbe merged commit e0375b5 into grafana:master Nov 26, 2021
@shanson7 shanson7 deleted the rob_no_alloc branch November 30, 2021 18:35
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants