Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache overlapping blocks #2239

Merged
merged 5 commits into from
Jun 22, 2020
Merged

Conversation

cyriltovena
Copy link
Contributor

We (@slim-bean and I) realized that the batchIterator in Loki may re-process the same data over and over when more chunks are overlapping than the batch size. The side effect is that some users may process 30GIB of logs when in fact the real data is just 300MB. This affects a lot of queries.

This PR introduces a caches for block that are overlapping to avoid the costly decompression if we need to re-use a block when it overlaps with the next chunk. This is required for correctly deduping.

I've made a benchmark and run it before and after:

❯ benchcmp run1.txt run5.txt
benchmark                                old ns/op     new ns/op     delta
Benchmark_store_OverlappingChunks-16     26997116      23004418      -14.79%

benchmark                                old allocs     new allocs     delta
Benchmark_store_OverlappingChunks-16     352996         339839         -3.73%

benchmark                                old bytes     new bytes     delta
Benchmark_store_OverlappingChunks-16     31978684      9592088       -70.00%

I've also run this in our ops cluster and realize a 25% speed up for filter queries.

…ssing.

Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
Signed-off-by: Cyril Tovena <cyril.tovena@gmail.com>
@codecov-commenter
Copy link

Codecov Report

Merging #2239 into master will increase coverage by 0.19%.
The diff coverage is 87.67%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2239      +/-   ##
==========================================
+ Coverage   62.07%   62.27%   +0.19%     
==========================================
  Files         156      157       +1     
  Lines       12531    12650     +119     
==========================================
+ Hits         7779     7878      +99     
- Misses       4145     4161      +16     
- Partials      607      611       +4     
Impacted Files Coverage Δ
pkg/chunkenc/dumb_chunk.go 0.00% <0.00%> (ø)
pkg/chunkenc/interface.go 87.50% <ø> (ø)
pkg/chunkenc/memchunk.go 70.83% <27.27%> (-2.93%) ⬇️
pkg/storage/store.go 67.16% <71.42%> (-1.02%) ⬇️
pkg/storage/batch.go 84.31% <94.02%> (ø)
pkg/storage/lazy_chunk.go 97.29% <97.29%> (ø)
pkg/ingester/stream.go 77.50% <100.00%> (ø)
pkg/storage/cache.go 100.00% <100.00%> (ø)
pkg/promtail/targets/tailer.go 76.13% <0.00%> (-2.28%) ⬇️
pkg/logql/evaluator.go 92.30% <0.00%> (-0.42%) ⬇️
... and 2 more

for _, b := range blocks {
// if we have already processed and cache block let's use it.
if cache, ok := c.overlappingBlocks[b.Offset()]; ok {
clone := *cache
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

blocks = append(blocks, b)
}
}
return blocks
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we gain anything by slicing up the existing c.blocks instead of allocating a new slice? Also curious if c.blocks was a slice of pointers if we could save a copy of the block here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ll check if it does help. There’s other place where I don’t reslice intentionally because reslicing keep underlying references.

Copy link
Collaborator

@slim-bean slim-bean left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one thought/question, aside from that this looks great!

@cyriltovena cyriltovena merged commit 6ab832b into grafana:master Jun 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants