remote_write: reduce blocking from garbage-collect of series #9109

Merged (5 commits) on Jul 27, 2021

Conversation

@bboreham (Member) commented on Jul 22, 2021

Fixes #9082

As suggested in #9082 (comment), add a method UpdateSeriesSegment() which is used together with SeriesReset() to garbage-collect old series.
This allows us to split the lock around queueManager series data and avoid blocking Append() while reading series from the last checkpoint.

Also, UpdateSeriesSegment() is far less expensive than StoreSeries() when called with existing series, unless something like #9084 is merged.
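The trio of operations described above can be sketched as follows. This is a minimal, hypothetical Go illustration of the StoreSeries() / UpdateSeriesSegment() / SeriesReset() pattern, not the actual Prometheus implementation; all types and fields here are made-up stand-ins.

```go
package main

import (
	"fmt"
	"sync"
)

// Illustrative stand-ins only, not the real Prometheus types.
type seriesEntry struct {
	labels  string // stand-in for a full label set
	segment int    // WAL segment the series was last seen in
}

type queueManager struct {
	mtx    sync.Mutex
	series map[uint64]seriesEntry
}

// StoreSeries records labels and the current segment for (possibly new)
// series; this is the expensive path.
func (q *queueManager) StoreSeries(ids []uint64, labels []string, segment int) {
	q.mtx.Lock()
	defer q.mtx.Unlock()
	for i, id := range ids {
		q.series[id] = seriesEntry{labels: labels[i], segment: segment}
	}
}

// UpdateSeriesSegment only bumps the segment number for series we already
// track, which is far cheaper than re-storing their labels.
func (q *queueManager) UpdateSeriesSegment(ids []uint64, segment int) {
	q.mtx.Lock()
	defer q.mtx.Unlock()
	for _, id := range ids {
		if e, ok := q.series[id]; ok {
			e.segment = segment
			q.series[id] = e
		}
	}
}

// SeriesReset drops every series last seen before the given segment,
// completing the garbage-collection pass.
func (q *queueManager) SeriesReset(segment int) {
	q.mtx.Lock()
	defer q.mtx.Unlock()
	for id, e := range q.series {
		if e.segment < segment {
			delete(q.series, id)
		}
	}
}

// gcDemo runs one GC cycle and returns how many series survive.
func gcDemo() int {
	q := &queueManager{series: map[uint64]seriesEntry{}}
	q.StoreSeries([]uint64{1, 2}, []string{"a", "b"}, 1)
	q.UpdateSeriesSegment([]uint64{2}, 2) // series 2 reappears in the checkpoint
	q.SeriesReset(2)                      // series 1 was not seen again: dropped
	return len(q.series)
}

func main() {
	fmt.Println(gcDemo()) // prints 1
}
```

In this sketch a series survives garbage collection only if UpdateSeriesSegment() saw it again before SeriesReset() ran, which is the behaviour the PR relies on.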

This also allows a different segment-reading implementation to be used when garbage-collecting.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
@cstyan (Member) left a comment

Awesome, thanks @bboreham

Overall this LGTM, I like the implementation via passing of a function.

Few comments that aren't really blockers:

  • There are a few code comments that aren't full sentences; it doesn't really bother me, but full sentences are the convention we've gone with.
  • Personally, I sometimes find it hard to track down implementations of function types; I'm curious what you think about noting implements segmentReadFn in the comments for readSegment and readSegmentForGC. In this case the naming is pretty self-explanatory, so I don't think a change is necessary here.
  • Given that we're making this and other changes to improve sample throughput while reading a checkpoint, I think it's worth adding a benchmark. If we just open an issue for now, with a note to compare main (or the most recent release at the time of the benchmark) against the release prior to your changes, then that's okay.
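The function-type pattern the review refers to can be sketched like this. It is a hypothetical illustration: the names segmentReadFn, readSegment, and readSegmentForGC echo the PR, but the bodies and the watcher type are made up for the demo.

```go
package main

import "fmt"

// segmentReadFn lets the caller choose how a WAL segment is read:
// the normal tailing path processes everything, while the GC path
// only needs series records.
type segmentReadFn func(w *watcher, segmentNum int) error

type watcher struct {
	lastRead string // records what the last read did, for the demo
}

// readSegment implements segmentReadFn: the full tailing path,
// processing both series records and samples.
func readSegment(w *watcher, n int) error {
	w.lastRead = fmt.Sprintf("full read of segment %d", n)
	return nil
}

// readSegmentForGC implements segmentReadFn: when garbage-collecting
// we only need series records, so samples can be skipped.
func readSegmentForGC(w *watcher, n int) error {
	w.lastRead = fmt.Sprintf("series-only read of segment %d", n)
	return nil
}

// process dispatches through whichever segmentReadFn it is given.
func (w *watcher) process(read segmentReadFn, n int) error {
	return read(w, n)
}

func main() {
	w := &watcher{}
	w.process(readSegmentForGC, 3)
	fmt.Println(w.lastRead) // prints "series-only read of segment 3"
}
```

Adding an "implements segmentReadFn" note to each implementation's comment, as the review suggests, makes these call sites easier to find with a plain text search.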

Review threads on tsdb/wal/watcher.go and storage/remote/queue_manager.go (outdated; resolved).
Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
Reduce the total number of samples per iteration from 5000*5000
(25 million), which is too big for my laptop, to 1*10000.

Extend `createTimeseries()` to add additional labels, so that the
queue manager is doing more realistic work.

Move the Append() call to a background goroutine; this works because
TestWriteClient uses a WaitGroup to signal completion.

Call `StoreSeries()` and `SeriesReset()` while adding samples, to
simulate the garbage-collection that wal.Watcher does.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
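The goroutine-plus-WaitGroup restructuring described in the commit message above can be sketched roughly as follows. Everything here is a made-up stand-in for the real benchmark code: delivery runs in a background goroutine while the caller could concurrently simulate the Watcher's garbage collection, and a WaitGroup signals when every sample has arrived.

```go
package main

import (
	"fmt"
	"sync"
)

// fakeClient is a stand-in for TestWriteClient: it counts delivered
// samples and uses a WaitGroup to signal completion.
type fakeClient struct {
	wg       sync.WaitGroup
	mtx      sync.Mutex
	received int
}

// deliverWithGC appends samples in a background goroutine, waits for
// all of them to be delivered, and returns the delivered count.
func deliverWithGC(samples int) int {
	c := &fakeClient{}
	c.wg.Add(samples)
	go func() { // background Append, as in the updated benchmark
		for i := 0; i < samples; i++ {
			c.mtx.Lock()
			c.received++
			c.mtx.Unlock()
			c.wg.Done()
		}
	}()
	// Meanwhile the foreground goroutine is free to simulate GC, e.g.
	// by calling StoreSeries()/UpdateSeriesSegment()/SeriesReset().
	c.wg.Wait() // the WaitGroup signals completion
	return c.received
}

func main() {
	fmt.Println(deliverWithGC(10000)) // prints 10000
}
```

Because every increment happens before its wg.Done() and wg.Wait() returns only after all Done() calls, reading the counter after Wait() is race-free.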
This matches what Watcher.garbageCollectSeries() is doing now.

Signed-off-by: Bryan Boreham <bjboreham@gmail.com>
@bboreham (Member, Author)

I fixed the comments, and updated the existing BenchmarkSampleDelivery to demonstrate the effect.

This benchmark crashed on my machine before I changed it, running out of memory. It attempted to enqueue 25 million samples in one go, which is not very realistic. I reduced the number of samples and extended the label set on each series to give StoreSeries() more work to do.

Results, comparing calling StoreSeries() against UpdateSeriesSegment():

name              old time/op    new time/op    delta
SampleDelivery-4    88.3ms ± 2%    61.8ms ± 2%  -29.94%  (p=0.008 n=5+5)

name              old alloc/op   new alloc/op   delta
SampleDelivery-4    52.0MB ± 0%    47.2MB ± 0%   -9.29%  (p=0.008 n=5+5)

name              old allocs/op  new allocs/op  delta
SampleDelivery-4      425k ± 0%      414k ± 0%   -2.56%  (p=0.008 n=5+5)

@cstyan (Member) left a comment

LGTM

Personally, I think the current benchmark should still call the StoreSeries function, and we should have a separate benchmark for checkpoint reading to differentiate between StoreSeries and UpdateSeriesSegment.

@csmarchbanks (Member) left a comment

👍

Separate benchmarks could be nice. I am curious how much of the current benchmark is spent processing samples vs. running the GC? If they are similar, then maybe one benchmark is OK.

@bboreham (Member, Author)

> the current benchmark should still call the StoreSeries function

It does, before the timing loop, just as it did before this PR.
Are you suggesting that StoreSeries should be called inside the timing loop?

Can you describe the real-life scenario you want the benchmark to imitate?

@bboreham (Member, Author)

> I am curious how much of the current benchmark is spent processing samples vs running the gc?

Removing the two lines calling UpdateSeriesSegment and SeriesReset makes negligible difference.

$ benchstat nogc.txt withgc.txt 
name              old time/op    new time/op    delta
SampleDelivery-4    61.1ms ± 4%    60.7ms ± 3%   ~     (p=0.841 n=5+5)

name              old alloc/op   new alloc/op   delta
SampleDelivery-4    47.1MB ± 0%    47.2MB ± 0%   ~     (p=0.286 n=4+5)

name              old allocs/op  new allocs/op  delta
SampleDelivery-4      414k ± 0%      414k ± 0%   ~     (p=0.333 n=4+5)

@csmarchbanks (Member)

Thanks, I am happy with this as is. I will let Callum respond and merge if he is happy as well.

@cstyan (Member) commented on Jul 27, 2021

> It does, before the timing loop, just as it did before this PR.

I must have misread something then; LGTM.

> Are you suggesting that StoreSeries should be called inside the timing loop?

I think it would be worth exploring a benchmark between the two, but that's not necessary for this PR.


Successfully merging this pull request may close these issues.

remote-write: don't stall on garbageCollectSeries()
3 participants