Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vertical query merging and compaction #370

Merged
merged 46 commits into from Feb 14, 2019

Conversation

Projects
None yet
4 participants
@codesome
Copy link
Member

codesome commented Sep 4, 2018

Fixes #90

Not tested in real world scenario yet.
Comments and suggestions are welcome.

codesome added some commits Sep 1, 2018

Vertical series iterator
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Select overlapped blocks first in compactor Plan()
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Added vertical compaction
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Code cleanup and comments
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Show resolved Hide resolved chunkenc/xor.go Outdated
Show resolved Hide resolved block.go Outdated
if len(chks) < 2 {
return chks, nil
}
var newChks []Meta // Will contain the merged chunks.

This comment has been minimized.

Copy link
@bwplotka

bwplotka Oct 3, 2018

Contributor

Can we allocate space for all of them?

newChks := make([]Meta, len(chks), len(chks)). It will reduce allocations. And it's fine if we will not use all of this. Statistically we will use all of those (when they not overlap).

This comment has been minimized.

Copy link
@codesome

codesome Oct 5, 2018

Author Member

Thanks for this. I actually observed significant increase in allocs when I tested with #240. I will test again with this change.

compact.go Outdated
@@ -611,7 +651,8 @@ func (c *LeveledCompactor) populateBlock(blocks []BlockReader, meta *BlockMeta,
chks[i].Chunk = newChunk
}
}
if err := chunkw.WriteChunks(chks...); err != nil {
var err error
if chks, err = chunkw.WriteChunks(chks...); err != nil {

This comment has been minimized.

Copy link
@bwplotka

bwplotka Oct 3, 2018

Contributor

let's not use if _ = :...; err != nil idiom then if we don't want to scope down those return variables.

@bwplotka
Copy link
Contributor

bwplotka left a comment

Thanks! It looks really really good, just some suggestions after initial review.

  1. Can we split those two things? vertical query and improved compaction?
  2. Make it more streamable (merging chunks)
  3. Do not change ChunkWriter API - no need to IMO
  4. Can we have the overlaps instrumented? So having logging and metrics if we spot and resolve overlap on compaction? This is extremely important, because it might happen that overlap is caused by wrong configuration / other bug in compaction (we had that before). As the result it is extremely hard to track down what happened. So compacting overlapped blocks should be clearly logged and reported (bassically the clear log line what was the SOURCE of blocks)
Show resolved Hide resolved querier.go
Show resolved Hide resolved compact.go Outdated
Show resolved Hide resolved compact.go Outdated
@codesome

This comment has been minimized.

Copy link
Member Author

codesome commented Oct 5, 2018

@bwplotka So are you suggesting to keep vertical query merge and compaction in separate PR? That is possible, but ideally you would like to vendor both of them together into prometheus.

@bwplotka

This comment has been minimized.

Copy link
Contributor

bwplotka commented Oct 5, 2018

Sure, just those are quite separate problems, so it would be easier to control those separatedly

@codesome

This comment has been minimized.

Copy link
Member Author

codesome commented Oct 5, 2018

On a second thought, we would expect vertical query merge when there are overlapping blocks. Hence we also need to take care of them during compaction, where vertical merging is required.

So taking into account vertical blocks, I think we should not split this PR.

PS: I will address the reviews soon.

@bwplotka

This comment has been minimized.

Copy link
Contributor

bwplotka commented Oct 5, 2018

Yup, can agree with that. It is separate in code, but you cannot allow overlapping blocks until having those two things working - my bad

codesome added some commits Oct 6, 2018

Fix review comments
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Merge remote-tracking branch 'upstream/master' into vertical-query-me…
…rge-and-compact

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Merge remote-tracking branch 'upstream/master' into vertical-query-me…
…rge-and-compact

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Fix tests
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Add benchmark for compaction
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Perform vertical compaction only when blocks are overlapping.
Actions for vertical compaction:
* Sorting chunk metas
* Calling chunks.MergeOverlappingChunks on the chunks

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Benchmark for vertical compaction
* BenchmarkNormalCompaction => BenchmarkCompaction
* Moved the benchmark from db_test.go to compact_test.go

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Merge remote-tracking branch 'upstream/master' into vertical-query-me…
…rge-and-compact

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@codesome

This comment has been minimized.

Copy link
Member Author

codesome commented Nov 12, 2018

I ran some benchmarks from this:

tsdb/compact_test.go

Lines 722 to 743 in 7ae4941

func BenchmarkCompaction(b *testing.B) {
cases := []struct {
startTimes []int64
numSamplesPerSeries int
compactionType string
}{
{
startTimes: []int64{0, 2000000000, 4000000000, 6000000000},
numSamplesPerSeries: 100,
compactionType: "normal",
},
{
startTimes: []int64{0, 2000000000, 4000000000, 6000000000},
numSamplesPerSeries: 1000,
compactionType: "normal",
},
{
startTimes: []int64{0, 2000000000, 4000000000, 6000000000},
numSamplesPerSeries: 10000,
compactionType: "normal",
},

Only normal compactions without overlapping blocks are benchmarked below for comparison.

This PR
Run #1
BenchmarkCompaction/type=normal,blocks=4,series=10000,samplesPerSeriesPerBlock=100-8         	2000000000	         0.16 ns/op	       0 B/op	       0 allocs/op
BenchmarkCompaction/type=normal,blocks=4,series=10000,samplesPerSeriesPerBlock=1000-8        	       2	 537675974 ns/op	18961224 B/op	  247209 allocs/op
BenchmarkCompaction/type=normal,blocks=4,series=10000,samplesPerSeriesPerBlock=10000-8       	       1	9465969487 ns/op	41204512 B/op	 3549935 allocs/op

Run #2
BenchmarkCompaction/type=normal,blocks=4,series=10000,samplesPerSeriesPerBlock=100-8         	2000000000	         0.16 ns/op	       0 B/op	       0 allocs/op
BenchmarkCompaction/type=normal,blocks=4,series=10000,samplesPerSeriesPerBlock=1000-8        	       1	1119701194 ns/op	37893632 B/op	  494247 allocs/op
BenchmarkCompaction/type=normal,blocks=4,series=10000,samplesPerSeriesPerBlock=10000-8       	       1	6397818257 ns/op	41212096 B/op	 3549846 allocs/op

Run #3
BenchmarkCompaction/type=normal,blocks=4,series=10000,samplesPerSeriesPerBlock=100-8         	2000000000	         0.15 ns/op	       0 B/op	       0 allocs/op
BenchmarkCompaction/type=normal,blocks=4,series=10000,samplesPerSeriesPerBlock=1000-8        	       1	1104762175 ns/op	37907664 B/op	  494263 allocs/op
BenchmarkCompaction/type=normal,blocks=4,series=10000,samplesPerSeriesPerBlock=10000-8       	       1	6654273137 ns/op	41196544 B/op	 3549730 allocs/op


Master 
Run #1
BenchmarkCompaction/type=normal,blocks=4,series=10000,samplesPerSeriesPerBlock=100-8         	2000000000	         0.16 ns/op	       0 B/op	       0 allocs/op
BenchmarkCompaction/type=normal,blocks=4,series=10000,samplesPerSeriesPerBlock=1000-8        	       1	1113853622 ns/op	37893152 B/op	  494149 allocs/op
BenchmarkCompaction/type=normal,blocks=4,series=10000,samplesPerSeriesPerBlock=10000-8       	       1	9133445837 ns/op	41223936 B/op	 3549891 allocs/op

Run #2
BenchmarkCompaction/type=normal,blocks=4,series=10000,samplesPerSeriesPerBlock=100-8         	1000000000	         0.33 ns/op	       0 B/op	       0 allocs/op
BenchmarkCompaction/type=normal,blocks=4,series=10000,samplesPerSeriesPerBlock=1000-8        	       1	1104740847 ns/op	37895792 B/op	  494170 allocs/op
BenchmarkCompaction/type=normal,blocks=4,series=10000,samplesPerSeriesPerBlock=10000-8       	       1	6670045726 ns/op	41201616 B/op	 3549779 allocs/op

Run #3
BenchmarkCompaction/type=normal,blocks=4,series=10000,samplesPerSeriesPerBlock=100-8         	2000000000	         0.15 ns/op	       0 B/op	       0 allocs/op
BenchmarkCompaction/type=normal,blocks=4,series=10000,samplesPerSeriesPerBlock=1000-8        	       1	1118837330 ns/op	37883280 B/op	  494189 allocs/op
BenchmarkCompaction/type=normal,blocks=4,series=10000,samplesPerSeriesPerBlock=10000-8       	       1	7429331297 ns/op	41204896 B/op	 3549765 allocs/op

As I fallback to normal compactions when blocks are not overlapping, similar benchmark results were expected.

PS: Am I benchmarking it right?

@codesome

This comment has been minimized.

Copy link
Member Author

codesome commented Nov 12, 2018

And this is for overlapping blocks, so, only vertical compaction.

tsdb/compact_test.go

Lines 749 to 763 in 7ae4941

{
startTimes: []int64{0, 20000, 40000, 60000},
numSamplesPerSeries: 100,
compactionType: "vertical",
},
{
startTimes: []int64{0, 200000, 400000, 600000},
numSamplesPerSeries: 1000,
compactionType: "vertical",
},
{
startTimes: []int64{0, 200000, 400000, 600000},
numSamplesPerSeries: 10000,
compactionType: "vertical",
},

time/op seems to get better for 10000 samples/series after 1st benchmark, even for previous benchmarks (system warms up?).

Run #1
BenchmarkCompaction/type=vertical,blocks=4,series=10000,samplesPerSeriesPerBlock=100-8         	       1	1148168277 ns/op	131909200 B/op	  625237 allocs/op
BenchmarkCompaction/type=vertical,blocks=4,series=10000,samplesPerSeriesPerBlock=1000-8        	       1	1114489523 ns/op	54561792 B/op	  537658 allocs/op
BenchmarkCompaction/type=vertical,blocks=4,series=10000,samplesPerSeriesPerBlock=10000-8       	       1	10125427341 ns/op	185875544 B/op	 3588591 allocs/op

Run #2
BenchmarkCompaction/type=vertical,blocks=4,series=10000,samplesPerSeriesPerBlock=100-8         	       1	1116357367 ns/op	131905232 B/op	  625348 allocs/op
BenchmarkCompaction/type=vertical,blocks=4,series=10000,samplesPerSeriesPerBlock=1000-8        	       1	1204296980 ns/op	54566352 B/op	  537675 allocs/op
BenchmarkCompaction/type=vertical,blocks=4,series=10000,samplesPerSeriesPerBlock=10000-8       	       1	7315432128 ns/op	185875576 B/op	 3588492 allocs/op

Run #3
BenchmarkCompaction/type=vertical,blocks=4,series=10000,samplesPerSeriesPerBlock=100-8         	       1	1121767347 ns/op	131904688 B/op	  625228 allocs/op
BenchmarkCompaction/type=vertical,blocks=4,series=10000,samplesPerSeriesPerBlock=1000-8        	       1	1192762274 ns/op	54566816 B/op	  537968 allocs/op
BenchmarkCompaction/type=vertical,blocks=4,series=10000,samplesPerSeriesPerBlock=10000-8       	       1	6638607223 ns/op	185867240 B/op	 3588570 allocs/op

Run #4
BenchmarkCompaction/type=vertical,blocks=4,series=10000,samplesPerSeriesPerBlock=100-8         	       1	1107938987 ns/op	131907600 B/op	  625266 allocs/op
BenchmarkCompaction/type=vertical,blocks=4,series=10000,samplesPerSeriesPerBlock=1000-8        	       1	1171487320 ns/op	54566152 B/op	  537728 allocs/op
BenchmarkCompaction/type=vertical,blocks=4,series=10000,samplesPerSeriesPerBlock=10000-8       	       1	7141373863 ns/op	186044512 B/op	 3588663 allocs/op
Merge remote-tracking branch 'upstream/master' into vertical-query-me…
…rge-and-compact

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@codesome

This comment has been minimized.

Copy link
Member Author

codesome commented Nov 20, 2018

http://prombench.prometheus.io/grafana/d/7gmLoNDmz/prombench?orgId=1&var-RuleGroup=All&var-pr-number=4861 (prometheus/prometheus#4861)

Benchmark looks fine for non-overlapping blocks, almost identical. So this PR is working fine with non-overlapping blocks 🎉

Benchmark for query iterator and seek for non overlapping blocks
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@codesome

This comment has been minimized.

Copy link
Member Author

codesome commented Nov 30, 2018

Iterator bench: 20 blocks, 1000 series, 2000 samples per series per block. All samples non overlapping.

Master:
BenchmarkQueryIteratorOverMultipleBlocks-8              1    3576308857 ns/op    82707848 B/op     1601934 allocs/op
This PR:
BenchmarkQueryIteratorOverMultipleBlocks-8              1    4381610051 ns/op    82885960 B/op     1601779 allocs/op

Seek bench: 20 blocks, 100 series, 200 samples per series per block. All samples non overlapping.

Master:
BenchmarkQuerySeekOverMultipleBlocks-8              1    2628171081 ns/op    1115735232 B/op    18537288 allocs/op
This PR:
BenchmarkQuerySeekOverMultipleBlocks-8              1    2720268670 ns/op    411117392 B/op     8482438 allocs/op

Benchmark for iterator doesn't look good. @gouthamve suggested adding new querier for overlapping blocks, and use it only when we detect overlapping blocks. This way non-overlapping blocks wont suffer from the performance loss.

Vertical query merge only for overlapping blocks
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
nits
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Show resolved Hide resolved CHANGELOG.md Outdated
Updated CHANGELOG
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@krasi-georgiev

This comment has been minimized.

Copy link
Member

krasi-georgiev commented Jan 28, 2019

I think the last bit is to add a benchmark to compare vertical querier vs a normal one.

@codesome

This comment has been minimized.

Copy link
Member Author

codesome commented Jan 29, 2019

I already have some benchmark

tsdb/compact_test.go

Lines 724 to 790 in 47436d0

func BenchmarkCompaction(b *testing.B) {
cases := []struct {
ranges [][2]int64
compactionType string
}{
{
ranges: [][2]int64{{0, 100}, {200, 300}, {400, 500}, {600, 700}},
compactionType: "normal",
},
{
ranges: [][2]int64{{0, 1000}, {2000, 3000}, {4000, 5000}, {6000, 7000}},
compactionType: "normal",
},
{
ranges: [][2]int64{{0, 10000}, {20000, 30000}, {40000, 50000}, {60000, 70000}},
compactionType: "normal",
},
{
ranges: [][2]int64{{0, 100000}, {200000, 300000}, {400000, 500000}, {600000, 700000}},
compactionType: "normal",
},
// 40% overlaps.
{
ranges: [][2]int64{{0, 100}, {60, 160}, {120, 220}, {180, 280}},
compactionType: "vertical",
},
{
ranges: [][2]int64{{0, 1000}, {600, 1600}, {1200, 2200}, {1800, 2800}},
compactionType: "vertical",
},
{
ranges: [][2]int64{{0, 10000}, {6000, 16000}, {12000, 22000}, {18000, 28000}},
compactionType: "vertical",
},
{
ranges: [][2]int64{{0, 100000}, {60000, 160000}, {120000, 220000}, {180000, 280000}},
compactionType: "vertical",
},
}
nSeries := 10000
for _, c := range cases {
nBlocks := len(c.ranges)
b.Run(fmt.Sprintf("type=%s,blocks=%d,series=%d,samplesPerSeriesPerBlock=%d", c.compactionType, nBlocks, nSeries, c.ranges[0][1]-c.ranges[0][0]+1), func(b *testing.B) {
dir, err := ioutil.TempDir("", "bench_compaction")
testutil.Ok(b, err)
defer os.RemoveAll(dir)
blockDirs := make([]string, 0, len(c.ranges))
var blocks []*Block
for _, r := range c.ranges {
block, err := OpenBlock(nil, createBlock(b, dir, genSeries(nSeries, 10, r[0], r[1])), nil)
testutil.Ok(b, err)
blocks = append(blocks, block)
defer block.Close()
blockDirs = append(blockDirs, block.Dir())
}
c, err := NewLeveledCompactor(nil, log.NewNopLogger(), []int64{0}, nil)
testutil.Ok(b, err)
b.ResetTimer()
b.ReportAllocs()
_, err = c.Compact(dir, blockDirs, blocks)
testutil.Ok(b, err)
})
}
}

@krasi-georgiev

This comment has been minimized.

Copy link
Member

krasi-georgiev commented Jan 29, 2019

I meant for the querier not the compaction.

@codesome

This comment has been minimized.

Copy link
Member Author

codesome commented Jan 30, 2019

Apparently, I have benchmarks for query iterator and seek, but only for non-overlapping blocks. I will modify it to also have overlapping blocks, and use Querier from DB instead.

@krasi-georgiev

This comment has been minimized.

Copy link
Member

krasi-georgiev commented Jan 30, 2019

I think we need something like this:

Create some non overlapping blocks
Bench the execution

Create identical blocks to the bench with the non overlap but this time add some overlap.
Bench the execution.

The amount of overlap probably affects the results so choose some average and if needed will modify later with more bench tests.

Refactor iterator and seek benchmarks for Querier.
Also has as overlapping blocks.

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@codesome

This comment has been minimized.

Copy link
Member Author

codesome commented Jan 30, 2019

I have updated the benchmarks. Seek() is not affected much. Next() is slower for overlapping blocks. I haven't run the entire benchmark, will post the results tomorrow.

@codesome

This comment has been minimized.

Copy link
Member Author

codesome commented Jan 31, 2019

Iterator (50% overlap got killed as entire benchmark ran too long):

BenchmarkQueryIterator/nBlocks=20,nSeries=1000,numSamplesPerSeriesPerBlock=20000,overlap=0%-8         	       1	29186357370 ns/op	603467672 B/op	10161757 allocs/op
BenchmarkQueryIterator/nBlocks=20,nSeries=1000,numSamplesPerSeriesPerBlock=20000,overlap=10%-8        	       1	28994143626 ns/op	603341768 B/op	10161601 allocs/op
BenchmarkQueryIterator/nBlocks=20,nSeries=1000,numSamplesPerSeriesPerBlock=20000,overlap=30%-8        	       1	28871116142 ns/op	603383736 B/op	10161653 allocs/op

Seek:

BenchmarkQuerySeek/nBlocks=20,nSeries=100,numSamplesPerSeriesPerBlock=2000,overlap=0%-8         	       1	21325032453 ns/op	4039194432 B/op	42088711 allocs/op
BenchmarkQuerySeek/nBlocks=20,nSeries=100,numSamplesPerSeriesPerBlock=2000,overlap=10%-8        	       1	20784196549 ns/op	3674383152 B/op	38288627 allocs/op
BenchmarkQuerySeek/nBlocks=20,nSeries=100,numSamplesPerSeriesPerBlock=2000,overlap=30%-8        	       1	19583454557 ns/op	2944741800 B/op	30688424 allocs/op
BenchmarkQuerySeek/nBlocks=20,nSeries=100,numSamplesPerSeriesPerBlock=2000,overlap=50%-8        	       1	18379243396 ns/op	2214991320 B/op	23088053 allocs/op

The reduction of time with overlap might also be a side affect of the reduced number of total Next()/Seek() calls.

@codesome

This comment has been minimized.

Copy link
Member Author

codesome commented Jan 31, 2019

@krasi-georgiev @gouthamve anything pending in this PR?

@codesome

This comment has been minimized.

Copy link
Member Author

codesome commented Feb 7, 2019

An update about the benchmark: When creating blocks for the benchmark, genSeries is called for every block, which means the label-value set for each series will be different for all blocks. So even if the blocks ranges are overlapping, no series will be actually overlapping. I will fix that to have same label-value set for all blocks and benchmark again.

Show resolved Hide resolved querier.go
Additional test case
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@codesome

This comment has been minimized.

Copy link
Member Author

codesome commented Feb 10, 2019

New Benchmark Results

Iterator At()

BenchmarkQueryIterator/nBlocks=20,nSeries=1000,numSamplesPerSeriesPerBlock=20000,overlap=0%-8         	       1	34298913566 ns/op	605490752 B/op	10218627 allocs/op
BenchmarkQueryIterator/nBlocks=20,nSeries=1000,numSamplesPerSeriesPerBlock=20000,overlap=10%-8        	       1	41726019807 ns/op	605794752 B/op	10218627 allocs/op
BenchmarkQueryIterator/nBlocks=20,nSeries=1000,numSamplesPerSeriesPerBlock=20000,overlap=30%-8        	       1	40826367658 ns/op	605773768 B/op	10218601 allocs/op

Looks like trade-off between overlap and the number of At() calls that it had to make. 0%->10% overlap was an increase in time, but 10%->30% overlap was a decrease in time.

Iterator Seek()

BenchmarkQuerySeek/nBlocks=20,nSeries=100,numSamplesPerSeriesPerBlock=2000,overlap=0%-8         	       1	27214158815 ns/op	11047536360 B/op	134093510 allocs/op
BenchmarkQuerySeek/nBlocks=20,nSeries=100,numSamplesPerSeriesPerBlock=2000,overlap=10%-8        	       1	24988221153 ns/op	3677849184 B/op		38325722 allocs/op
BenchmarkQuerySeek/nBlocks=20,nSeries=100,numSamplesPerSeriesPerBlock=2000,overlap=30%-8        	       1	23036797199 ns/op	2948248200 B/op		30725706 allocs/op
BenchmarkQuerySeek/nBlocks=20,nSeries=100,numSamplesPerSeriesPerBlock=2000,overlap=50%-8        	       1	21063503047 ns/op	2218647152 B/op		23125703 allocs/op

Not much affect on Seek() here, the decrease in time should be the decrease in Seek() calls.

genSeries takes optional labels. Updated BenchmarkQueryIterator and B…
…enchmarkQuerySeek.

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

@codesome codesome force-pushed the codesome:vertical-query-merge-and-compact branch from d985113 to 5f8d911 Feb 10, 2019

Show resolved Hide resolved block_test.go Outdated
Show resolved Hide resolved compact_test.go Outdated
Show resolved Hide resolved compact_test.go Outdated

codesome added some commits Feb 12, 2019

Split genSeries into genSeries and populateSeries
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Check error in benchmark
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Merge remote-tracking branch 'upstream/master' into vertical-query-me…
…rge-and-compact

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@krasi-georgiev

This comment has been minimized.

Copy link
Member

krasi-georgiev commented Feb 12, 2019

LGTM

@gouthamve
Copy link
Member

gouthamve left a comment

LGTM with a couple of nits.

Show resolved Hide resolved chunkenc/xor.go Outdated
Show resolved Hide resolved compact.go Outdated
Show resolved Hide resolved compact.go Outdated
compact.go Outdated
@@ -93,6 +95,10 @@ func newCompactorMetrics(r prometheus.Registerer) *compactorMetrics {
Name: "prometheus_tsdb_compactions_failed_total",
Help: "Total number of compactions that failed for the partition.",
})
m.overlappingBlocks = prometheus.NewCounter(prometheus.CounterOpts{
Name: "prometheus_tsdb_compactions_overlapping_total",

This comment has been minimized.

Copy link
@gouthamve

gouthamve Feb 13, 2019

Member

compactions_overlapping_total makes it feel like the number of concurrent compactions. Maybe compactions_overlapping_blocks_total?

This comment has been minimized.

Copy link
@codesome

codesome Feb 14, 2019

Author Member

compactions_overlapping_blocks_total sounds like number of blocks that were overlapping during compaction. How about vertical_compactions_total?

This comment has been minimized.

Copy link
@gouthamve

gouthamve Feb 14, 2019

Member

Yup, better

This comment has been minimized.

Copy link
@codesome

codesome Feb 14, 2019

Author Member

Already done :)

Show resolved Hide resolved compact.go Outdated
compact.go Outdated
@@ -438,7 +508,8 @@ func (w *instrumentedChunkWriter) WriteChunks(chunks ...chunks.Meta) error {

// write creates a new block that is the union of the provided blocks into dir.
// It cleans up all files of the old blocks after completing successfully.
func (c *LeveledCompactor) write(dest string, meta *BlockMeta, blocks ...BlockReader) (err error) {
// The returned bool 'overlapping' is true if the parent blocks were overlapping.

This comment has been minimized.

Copy link
@gouthamve

gouthamve Feb 13, 2019

Member

There is no bool returned.

compact.go Outdated
@@ -541,7 +612,8 @@ func (c *LeveledCompactor) write(dest string, meta *BlockMeta, blocks ...BlockRe

// populateBlock fills the index and chunk writers with new data gathered as the union
// of the provided blocks. It returns meta information for the new block.
func (c *LeveledCompactor) populateBlock(blocks []BlockReader, meta *BlockMeta, indexw IndexWriter, chunkw ChunkWriter) error {
// The returned bool is true if the parent blocks were overlapping.

This comment has been minimized.

Copy link
@gouthamve

gouthamve Feb 13, 2019

Member

Nothing is returned ;)

Show resolved Hide resolved compact.go Outdated
Show resolved Hide resolved chunks/chunks.go Outdated

codesome added some commits Feb 14, 2019

Fix review comments
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Warn about overlapping blocks in reload()
Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
@codesome

This comment has been minimized.

Copy link
Member Author

codesome commented Feb 14, 2019

@gouthamve I hope all your comments are answered?

@gouthamve gouthamve merged commit c59ed49 into prometheus:master Feb 14, 2019

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

naivewong added a commit to naivewong/tsdb that referenced this pull request Feb 15, 2019

Vertical query merging and compaction (prometheus#370)
* Vertical series iterator

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Select overlapped blocks first in compactor Plan()

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Added vertical compaction

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Code cleanup and comments

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix tests

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Add benchmark for compaction

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Perform vertical compaction only when blocks are overlapping.

Actions for vertical compaction:
* Sorting chunk metas
* Calling chunks.MergeOverlappingChunks on the chunks

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Benchmark for vertical compaction

* BenchmarkNormalCompaction => BenchmarkCompaction
* Moved the benchmark from db_test.go to compact_test.go

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Benchmark for query iterator and seek for non overlapping blocks

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Vertical query merge only for overlapping blocks

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Simplify logging in Compact(...)

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Updated CHANGELOG.md

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Calculate overlapping inside populateBlock

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* MinTime and MaxTime for BlockReader.

Using this to find overlapping blocks in populateBlock()

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Sort blocks w.r.t. MinTime in reload()

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Log about overlapping in LeveledCompactor.write() instead of returning bool

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Log about overlapping inside LeveledCompactor.populateBlock()

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Refactor createBlock to take optional []Series

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* review1

Signed-off-by: Krasi Georgiev <kgeorgie@redhat.com>

* Updated CHANGELOG and minor nits

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* nits

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Updated CHANGELOG

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Refactor iterator and seek benchmarks for Querier.

Also has as overlapping blocks.

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Additional test case

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* genSeries takes optional labels. Updated BenchmarkQueryIterator and BenchmarkQuerySeek.

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Split genSeries into genSeries and populateSeries

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Check error in benchmark

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Fix review comments

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>

* Warn about overlapping blocks in reload()

Signed-off-by: Ganesh Vernekar <cs15btech11018@iith.ac.in>
Signed-off-by: naivewong <867245430@qq.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.